HELLO AI WORLD: MEET JETSON NANO - NVIDIA

Transcription

HELLO AI WORLD — MEET JETSON NANO

WEBINAR AGENDA

Intro to Jetson Nano
- AI for Autonomous Machines
- Jetson Nano Developer Kit
- Jetson Nano Compute Module

Jetson Software
- JetPack 4.2
- ML/DL Framework Support
- NVIDIA TensorRT
- Inferencing Benchmarks

Application SDKs
- DeepStream SDK
- Isaac Robotics SDK

Getting Started
- Jetson Nano Resources
- Hello AI World
- JetBot
- System Setup
- Tips and Tricks

JETSON POWERS AUTONOMOUS AI

JETSON NANO DEVELOPER KIT — $99
CUDA-X AI Computer
- 128 CUDA Cores, 4-Core CPU
- 4GB LPDDR4 Memory
- 472 GFLOPS
- 5W | 10W
Accessible and easy to use

JETSON NANO DEVKIT SPECS

PROCESSOR
- CPU: 64-bit Quad-core ARM A57 @ 1.43GHz
- GPU: 128-core NVIDIA Maxwell @ 921MHz
- Memory: 4GB 64-bit LPDDR4 @ 1600MHz, 25.6GB/s
- Video Encoder: 4Kp30 | (4x) 1080p30 | (2x) 1080p60
- Video Decoder: 4Kp60 | (2x) 4Kp30 | (8x) 1080p30 | (4x) 1080p60

INTERFACES
- USB: (4x) USB 3.0 A (Host), USB 2.0 Micro-B (Device)
- Camera: MIPI CSI-2 x2 (15-position flex connector)
- Display: HDMI, DisplayPort
- Networking: Gigabit Ethernet (RJ45, PoE)
- Wireless: M.2 Key-E with PCIe x1
- Storage: MicroSD card (16GB UHS-1 recommended minimum)
- 40-Pin Header: UART, SPI, I2C, I2S, Audio Clock, GPIOs
- Power: 5V DC (µUSB, barrel jack, PoE), 5W | 10W
- Size: 80mm x 100mm

JETSON NANO
Compact AI Compute Module
- 128 CUDA Cores, 4-Core CPU
- 4GB LPDDR4 Memory
- 16GB eMMC 5.1
- 45mm x 70mm
- 5W | 10W
$129 (1Ku), available June 2019

JETSON NANO COMPUTE MODULE

PROCESSOR
- CPU: 64-bit Quad-core ARM A57 @ 1.43GHz
- GPU: 128-core NVIDIA Maxwell @ 921MHz
- Memory: 4GB 64-bit LPDDR4 @ 1600MHz, 25.6GB/s
- Video Encoder: 4Kp30 | (4x) 1080p30 | (2x) 1080p60
- Video Decoder: 4Kp60 | (2x) 4Kp30 | (8x) 1080p30 | (4x) 1080p60

INTERFACES
- USB: USB 3.0, (3x) USB 2.0
- Camera: 12 lanes MIPI CSI-2 (up to 4 cameras)
- Display: HDMI, DP, eDP, DSI
- Networking: Gigabit Ethernet
- PCIe: PCIe Gen2 x1/x2/x4
- Storage: 16GB eMMC 5.1
- Other I/O: (4x) I2C, (2x) SPI, (3x) UART, (2x) I2S, GPIO
- Power: 5V DC, 5W | 10W
- Size: 45mm x 70mm, 260-pin SODIMM connector

Production module available June 2019

THE JETSON FAMILY
From AI at the Edge to Autonomous Machines

- JETSON NANO: 5—10W, 0.5 TFLOPS (FP16), 45mm x 70mm, $129 / $99 (Devkit)
- JETSON TX1 / JETSON TX2 4GB: 7—15W, 1—1.3 TFLOPS (FP16), 50mm x 87mm, $299
- JETSON TX2 8GB / Industrial: 7—15W, 1.3 TFLOPS (FP16), 50mm x 87mm, $399—$749
- JETSON AGX XAVIER: 10—30W, 11 TFLOPS (FP16) | 32 TOPS (INT8), 100mm x 87mm, $1099

AI at the Edge → Fully Autonomous Machines
Multiple Devices — Same Software

JETSON SOFTWARE
JetPack SDK
- Deep Learning: TensorRT, cuDNN
- Computer Vision: VisionWorks, OpenCV
- Accelerated Computing: cuBLAS, cuFFT
- Multimedia: libargus, Video API
- Developer Tools: Nsight
- Modules and SDKs: DeepStream SDK, Isaac Robotics Engine, ROS
- Application examples: Depth Estimation, Object Detection, Pose Estimation, Gesture Recognition, Navigation
- CUDA-X on Linux for Tegra

Runs on Jetson Nano, Jetson TX1/TX2, and Jetson AGX Xavier
developer.nvidia.com/jetpack

JETPACK 4.2
Available now for Jetson: developer.nvidia.com/jetpack

Package versions:
- L4T BSP: 32.1
- CUDA: 10.0.166
- Linux V3.3.1
- GLX: 1.4
- NPP: 10.0
- X11 ABI: 24
- Wayland: 1.14
- L4T Multimedia API: 32.1
- Argus Camera API: 0.97
- GStreamer: 1.14.1
- Nsight Systems: 2019.3
- Nsight Graphics: 2018.7
- Nsight Compute: 1.0
- Jetson GPIO: 1.0
- Jetson OS: Ubuntu 18.04
- Host OS: Ubuntu 16.04 / 18.04

Install TensorFlow, PyTorch, Caffe, Caffe2, MXNet, ROS, and other GPU-accelerated libraries.

OPEN FRAMEWORK SUPPORT
[Logos: machine learning frameworks | robotics / IoT platforms | Jetson]

NVIDIA TENSORRT
- TensorRT Model Optimizer → TensorRT Runtime Engine
- Imports trained models (e.g. .caffemodel)
- Optimizations: Layer Fusion, Kernel Autotuning, GPU Optimizations, Mixed Precision, Tensor Layout, Batch Size Tuning
- C++ / Python APIs
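To make the optimizer → runtime flow concrete, here is a minimal C++ sketch of importing a Caffe model and building a TensorRT engine, roughly following the TensorRT 5.x API that ships with JetPack 4.2. The file names, the output blob name ("prob"), and the batch/workspace settings are placeholder assumptions, not values from the slides.

#include <NvInfer.h>
#include <NvCaffeParser.h>
#include <cstdio>

using namespace nvinfer1;
using namespace nvcaffeparser1;

// minimal logger required by the TensorRT builder
class Logger : public ILogger
{
    void log( Severity severity, const char* msg ) override
    {
        if( severity != Severity::kINFO )
            printf("[TRT] %s\n", msg);
    }
} gLogger;

int main()
{
    // create the builder and an empty network definition
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    // parse the Caffe deploy file and weights into the network (file names are placeholders)
    ICaffeParser* parser = createCaffeParser();
    const IBlobNameToTensor* blobs = parser->parse("deploy.prototxt", "model.caffemodel",
                                                   *network, DataType::kFLOAT);
    if( !blobs )
        return 1;

    // mark the output layer ("prob" is the usual softmax blob in classification networks)
    network->markOutput(*blobs->find("prob"));

    // build the optimized runtime engine (layer fusion, kernel autotuning, mixed precision)
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20);   // 16MB of scratch space
    builder->setFp16Mode(true);               // enable FP16 where the hardware supports it
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    if( engine )
        printf("built TensorRT engine with %i bindings\n", engine->getNbBindings());

    // release builder-side resources (the engine would normally be serialized or used here)
    parser->destroy();
    network->destroy();
    builder->destroy();
    if( engine )
        engine->destroy();

    return 0;
}

The engine produced this way is what tools like imagenet-console and detectnet-console, shown later in this deck, build and cache on their first run.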

JETSON NANO RUNS MODERN AI
[Benchmark chart: Jetson Nano inference performance for ResNet-50, Inception-v4, and VGG-19 (TensorFlow / PyTorch / MXNet), SSD MobileNet-v2 at 300x300, 480x272, and 960x544 (TensorFlow), Tiny YOLO (Darknet), U-Net, Super Resolution, and OpenPose; see jetson-nano-dl-inference-benchmarks for the full results.]

JETSON NANO RUNS MODERN AI
[Benchmark chart: Jetson Nano inference performance compared to Coral Dev Board (Edge TPU), Raspberry Pi 3, and Intel Neural Compute Stick 2 on the same models (VGG-19, SSD MobileNet-v2 at 300x300 / 480x272 / 960x544, Tiny YOLO, Super Resolution, OpenPose); models that are not supported or do not run on the other platforms are marked; see jetson-nano-dl-inference-benchmarks.]

DEEPSTREAM SDK

NETWORK VIDEO RECORDER

ISAAC SDK
- Isaac Sim and Isaac Gym
- Reference robots: KAYA (Nano), CARTER (Xavier), LINK (Multi-Xavier)
- Sensor and Actuator Drivers, Core Libraries, GEMS, Reference DNNs, Tools
- ISAAC OPEN TOOLBOX
- Built on CUDA-X; runs on Jetson Nano, Jetson TX2, and Jetson AGX Xavier
developer.nvidia.com/isaac-sdk

ISAAC ROBOTS
NVIDIA Carter and NVIDIA Kaya
[Navigation pipeline diagram: Lidar Driver and RGB Driver feeding Range Scan, Map, and Localization; Waypoint as Goal into the Global Planner; LQR control]
developer.nvidia.com/isaac-sdk

GETTING STARTED
- Resources
- Tutorials
- System Setup
- Tips and Tricks
- Accessories

JETSON NANO RESOURCES
- Tutorials
- Projects
- Jetson Developer Zone
- eLinux Wiki
- Developer Forums
- Accessories

HELLO AI WORLD
Getting Started with Deep Learning
Pretrained Networks → NVIDIA Jetson + JetPack → Realtime Inferencing

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
   git clone https://github.com/dusty-nv/jetson-inference
2. Classifying Images from Command Line
3. Coding Your Own Recognition Program
4. Realtime Recognition from Live Camera
5. Detecting Objects in Images from Disk
6. Object Detection from Live Camera
github.com/dusty-nv/jetson-inference

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
2. Classifying Images from Command Line
   ./imagenet-console bear_0.jpg output_0.jpg
3. Coding Your Own Recognition Program
4. Realtime Recognition from Live Camera
5. Detecting Objects in Images from Disk
6. Object Detection from Live Camera
github.com/dusty-nv/jetson-inference

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
2. Classifying Images from Command Line
3. Coding Your Own Recognition Program
   ./my-recognition test-image.jpg
4. Realtime Recognition from Live Camera
5. Detecting Objects in Images from Disk
6. Object Detection from Live Camera

#include <jetson-inference/imageNet.h>
#include <jetson-utils/loadImage.h>

int main( int argc, char** argv )
{
    // load the image recognition network with TensorRT
    imageNet* net = imageNet::Create(imageNet::GOOGLENET);

    // this variable will store the confidence of the classification (between 0 and 1)
    float confidence = 0.0;

    // note: the slide omits the image-loading step; imgCUDA, imgWidth, and imgHeight
    // come from loading test-image.jpg with the loadImage.h utilities included above

    // classify the image with TensorRT on the GPU (hence we use the CUDA pointer)
    // this will return the index of the object class that the image was recognized as
    const int classIndex = net->Classify(imgCUDA, imgWidth, imgHeight, &confidence);

    // make sure a valid classification result was returned
    if( classIndex >= 0 )
    {
        // retrieve the name/description of the object class index
        const char* classDescription = net->GetClassDesc(classIndex);

        // print out the classification results
        printf("image is recognized as '%s' (class #%i) with %f%% confidence\n",
               classDescription, classIndex, confidence * 100.0f);
    }

    // free the network's resources before shutting down
    delete net;
    return 0;
}

github.com/dusty-nv/jetson-inference

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
2. Classifying Images from Command Line
3. Coding Your Own Recognition Program
4. Realtime Recognition from Live Camera
   ./imagenet-camera googlenet
5. Detecting Objects in Images from Disk
6. Object Detection from Live Camera
github.com/dusty-nv/jetson-inference

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
2. Classifying Images from Command Line
3. Coding Your Own Recognition Program
4. Realtime Recognition from Live Camera
5. Detecting Objects in Images from Disk
   ./detectnet-console dogs.jpg output.jpg coco-dog
   ./detectnet-console peds.jpg output.jpg multiped
6. Object Detection from Live Camera
github.com/dusty-nv/jetson-inference

HELLO AI WORLD
Getting Started with Deep Learning
1. Download and Build the GitHub Repo
2. Classifying Images from Command Line
3. Coding Your Own Recognition Program
4. Realtime Recognition from Live Camera
5. Detecting Objects in Images from Disk
6. Object Detection from Live Camera
   ./detectnet-camera <model-name>
   (object detection example shown: airplanes; see the C++ sketch below)
github.com/dusty-nv/jetson-inference
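The detection tools above are command-line front ends over the repo's detectNet C++ class. The following is a rough, hedged sketch of calling it directly, based on a later revision of jetson-inference than the one shown in this webinar; the Detect() signature and the Detection struct have changed between releases, so check detectNet.h in your checkout. It assumes imgCUDA, imgWidth, and imgHeight were loaded the same way as in the classification example.

#include <jetson-inference/detectNet.h>
#include <cstdio>

// assumes imgCUDA/imgWidth/imgHeight were loaded as in the classification example
int runDetection( float* imgCUDA, int imgWidth, int imgHeight )
{
    // load a pretrained detection model with TensorRT (COCO_DOG matches the coco-dog example above)
    detectNet* net = detectNet::Create(detectNet::COCO_DOG);

    // run detection; the library fills in an array of Detection results
    detectNet::Detection* detections = NULL;
    const int numDetections = net->Detect(imgCUDA, imgWidth, imgHeight, &detections);

    // print each bounding box and its class
    for( int n = 0; n < numDetections; n++ )
    {
        printf("detected '%s' (confidence %.2f) at (%.0f, %.0f) - (%.0f, %.0f)\n",
               net->GetClassDesc(detections[n].ClassID), detections[n].Confidence,
               detections[n].Left, detections[n].Top,
               detections[n].Right, detections[n].Bottom);
    }

    // free the network's resources
    delete net;
    return numDetections;
}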

TWO DAYS TO A DEMO
Training → Inference
- AI WORKFLOW: train using DIGITS in the cloud or on a PC, deploy to the field with Jetson
- TRAINING GUIDES: all the steps required to train your own models, including the datasets
- DEEP VISION PRIMITIVES: image recognition, object detection, and segmentation
github.com/dusty-nv/jetson-inference

JETBOT
$250 DIY Autonomous Deep Learning Robotics Kit
- Programmable through Jupyter IPython notebooks
- Trainable DNNs for obstacle detection, object following, path planning, and navigation
- ROS support and Gazebo simulator available
- Join our upcoming JetBot webinar, May 16, 2019
github.com/NVIDIA-AI-IOT/JetBot


SYSTEM SETUP
- The device boots from a MicroSD card (16GB UHS-1 recommended minimum)
- Download the SD card image from NVIDIA.com
- Flash the SD card image with the Etcher program from a Windows/Mac/Linux PC
  (you can also flash JetPack with the NVIDIA SDK Manager)
- Insert the MicroSD card into the slot located on the underside of the Jetson Nano module
- Connect keyboard, mouse, display, and power supply
- The board boots automatically when power is applied; the green power LED will light
NVIDIA.com/JetsonNano-Start

POWER SUPPLIES
- 5V 2A Micro-USB charger (e.g. Adafruit #1995)
- 5V 4A DC barrel jack adapter (e.g. Adafruit #1466), 5.5mm OD x 2.1mm ID x 9.5mm length
  - Place a jumper on header J48 to use the barrel jack
- J41 Expansion Header, pins 2/4: up to 5V 3A per pin (5V 6A total)
- Power over Ethernet (PoE): standard PoE supply is 48V, so use a PoE hat or a 5V regulator
- J40 Button Header: disable auto power-on, manual power-on / reset, enter recovery mode

POWER MODES
- Two power mode presets: 5W and 10W; the default mode is 10W
- Users can create their own presets, specifying clocks and online cores, in /etc/nvpmodel.conf

Example preset from /etc/nvpmodel.conf:

< POWER_MODEL ID=1 NAME=5W >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 1
CPU_ONLINE CORE_2 0
CPU_ONLINE CORE_3 0
CPU_A57 MAX_FREQ 918000
GPU MAX_FREQ 640000000
EMC MAX_FREQ 1600000000

NVIDIA Power Model Tool:
Power Mode               10W†    5W
Mode ID                  0       1
Online CPU Cores         4       2
CPU Max Frequency (MHz)  1428    918*
GPU Max Frequency (MHz)  921     640*
Memory Max Freq. (MHz)   1600    1600
† Default mode is 10W (ID: 0)
* Rounded at runtime to the closest discrete frequency available

sudo nvpmodel -q          (check the active mode)
sudo nvpmodel -m 0        (change the mode; persists after reboot)
sudo jetson_clocks        (disable DVFS and lock clocks to max for the active mode)

PERFORMANCE MONITOR
Run sudo tegrastats to launch the performance/utilization monitor:

RAM 1216/3963MB (lfb 330x4MB) IRAM 0/252kB (lfb 252kB) CPU [27%@102,36%@307,6%@204,35%@518] EMC_FREQ 19%@204 GR3D_FREQ 0%@76 APE 25 PLL@25C CPU@29.5C PMIC@100C GPU@27C AO@34C thermal@28C POM_5V_IN 1532/1452 POM_5V_GPU 0/20 POM_5V_CPU 241/201

- RAM: memory used / total capacity
- CPU: per-core utilization % @ frequency (MHz)
- EMC_FREQ: memory bandwidth % @ frequency (MHz)
- GR3D_FREQ: GPU utilization % @ frequency (MHz)
- Thermal zones: zone @ temperature (°C)
- POM_5V rails: current consumption (mW) / average (mW)

Refer to the L4T Developer Guide for more options and documentation of the output.
docs.nvidia.com/jetson

USING GPIO
- Similar 40-pin header (J41 Expansion Header) to the Raspberry Pi, 3.3V logic levels
- Jetson.GPIO Python library: API-compatible with RPi.GPIO; docs & samples in /opt/nvidia/jetson-gpio/
- Adafruit Blinka and SeeedStudio Grove support
- sysfs GPIO: I/O access from /sys/class/gpio/
  - Map GPIO pin:   echo 38 > /sys/class/gpio/export
  - Set direction:  echo out > /sys/class/gpio/gpio38/direction
  - Bit-banging:    echo 1 > /sys/class/gpio/gpio38/value
  - Unmap GPIO:     echo 38 > /sys/class/gpio/unexport
  - Query status:   cat /sys/kernel/debug/gpio
- C/C++ programs (and other languages) can use the same sysfs files (see the sketch below)
- I2C: libi2c for C/C++
[40-pin J41 header pinout table: 3.3V/5V power, ground, I2C, SPI, UART, I2S, audio clock, and GPIO pins with their sysfs GPIO numbers; see the Jetson Nano page on eLinux.org]
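The sysfs interface above is language-agnostic, so the same export/direction/value files can be written from C++. A minimal sketch, assuming root permissions and reusing the slide's example GPIO (sysfs number 38):

#include <fstream>
#include <string>
#include <thread>
#include <chrono>

// helper: write a value to a sysfs file (assumes the process has permission, e.g. run as root)
static void sysfsWrite( const std::string& path, const std::string& value )
{
    std::ofstream f(path);
    f << value;
}

int main()
{
    const std::string gpio = "38";   // sysfs GPIO number from the slide's example

    // map the GPIO pin and set it as an output
    sysfsWrite("/sys/class/gpio/export", gpio);
    sysfsWrite("/sys/class/gpio/gpio" + gpio + "/direction", "out");

    // bit-bang the pin: blink 10 times at roughly 1Hz
    for( int i = 0; i < 10; i++ )
    {
        sysfsWrite("/sys/class/gpio/gpio" + gpio + "/value", "1");
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
        sysfsWrite("/sys/class/gpio/gpio" + gpio + "/value", "0");
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
    }

    // unmap the GPIO when done
    sysfsWrite("/sys/class/gpio/unexport", gpio);
    return 0;
}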

JETSON NANO ACCESSORIES
- Printable Enclosures
- Battery Packs
- 5V Fans
- Carriers
- GPIO Hats
- Sensors & Cameras
eLinux.org/Jetson_Nano

CAMERA CAPTURE
- NVIDIA Argus (libargus): low-overhead, offloaded ingest & ISP for MIPI CSI sensors
  - Docs & samples in /usr/src/tegra_multimedia_api/argus/
  - argus_camera: C++/Python wrapper library on GitHub
  [Block diagram: MIPI CSI x4/x2 lanes → VI → ISP A/B (statistics) → memory interface]
- GStreamer: the nvarguscamerasrc element uses Argus internally

  gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM), \
    width=(int)1920, height=(int)1080, format=(string)NV12, \
    framerate=(fraction)30/1' ! nvoverlaysink -e

  nvgstcapture: camera viewer application
- Up to three MIPI CSI-2 x4 cameras, or four cameras in x4/x2 configurations (12 MIPI CSI-2 lanes total)
- V4L2 interface for USB cameras and MIPI CSI YUV sensors (/dev/video*)
  - libv4l (C/C++), pip install v4l2 (Python), v4l2src (GStreamer)
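From application code, one common way to consume these camera pipelines is OpenCV's GStreamer backend, since OpenCV ships with JetPack. This is a minimal sketch under the assumption that the OpenCV build has GStreamer support enabled and a CSI sensor is attached; the resolution, framerate, and frame count are example values, not from the slides:

#include <opencv2/opencv.hpp>
#include <cstdio>

int main()
{
    // GStreamer pipeline: capture NV12 frames from the CSI camera via nvarguscamerasrc
    // and convert to BGR on the way into OpenCV
    const char* pipeline =
        "nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720, "
        "format=NV12, framerate=30/1 ! nvvidconv ! video/x-raw, format=BGRx ! "
        "videoconvert ! video/x-raw, format=BGR ! appsink";

    cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);
    if( !cap.isOpened() )
    {
        printf("failed to open camera pipeline\n");
        return 1;
    }

    cv::Mat frame;
    for( int i = 0; i < 300; i++ )   // grab roughly 10 seconds of frames at 30fps
    {
        if( !cap.read(frame) )
            break;

        // ... run processing/inferencing on 'frame' here ...
        printf("captured frame %i (%ix%i)\n", i, frame.cols, frame.rows);
    }

    return 0;
}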

VIDEO CODECS
- Multi-stream HW encoder and decoder engines
- GStreamer
  - NV encoder elements: omxh265enc, omxh264enc, etc.

    gst-launch-1.0 videotestsrc ! 'video/x-raw, format=(string)I420, \
      width=(int)1920, height=(int)1080' ! omxh265enc ! matroskamux ! \
      filesink location=test.mkv -e

  - NV decoder elements: omxh265dec, omxh264dec, etc.

    gst-launch-1.0 filesrc location=test.mkv ! matroskademux ! \
      h265parse ! omxh265dec ! nvoverlaysink -e

  - More pipelines in the L4T Accelerated GStreamer User Guide
- V4L2 extensions
  - NV encoder: /dev/nvhost-msenc (YUV in, H.264/H.265 out)
  - NV decoder: /dev/nvhost-nvdec (bitstream in, NV12/YUV out)
  - Documentation & samples included with the L4T Multimedia API

Encoder profiles:
- H.265 (Main, Main 10): 4Kp30 | (2x) 1080p60 | (4x) 1080p30
- H.264 (Base, Main, High): 4Kp30 | (2x) 1080p60 | (4x) 1080p30
- H.264 (MVC Stereo): 1440p30 | 1080p60 | (2x) 1080p30
- VP8: 4Kp30 | (2x) 1080p60 | (4x) 1080p30
- JPEG: 600 MP/s

Decoder profiles:
- H.265 (Main, Main 10): 4Kp60 | (2x) 4Kp30 | (4x) 1080p60 | (8x) 1080p30
- H.264 (Base, Main, High): 4Kp60 | (2x) 4Kp30 | (4x) 1080p60 | (8x) 1080p30
- H.264 (MVC Stereo): 4Kp30 | (2x) 1080p60 | (4x) 1080p30
- VP9 (Profile 0, 8-bit): 4Kp60 | (2x) 4Kp30 | (4x) 1080p60 | (8x) 1080p30
- VP8: 4Kp60 | (2x) 4Kp30 | (4x) 1080p60 | (8x) 1080p30
- VC-1 (Simple, Main, Adv.): (2x) 1080p60* | (4x) 1080p30*
- MPEG-2 (Main): 4Kp60 | (2x) 4Kp30 | (4x) 1080p60* | (8x) 1080p30*
- JPEG: 600 MP/s
* Supports progressive and interlaced formats

ZERO COPY
- A shared memory fabric lets the processor engines access the same memory, without needing to copy between them
- CUDA mapped memory APIs (sketch below):

  cudaHostAlloc(&cpuPtr, size, cudaHostAllocMapped);
  cudaHostGetDevicePointer(&gpuPtr, cpuPtr, 0);

  No cudaMemcpy() required
- CUDA Unified Memory: cudaMallocManaged()
  - Coherent synchronization and caching
  - Lets you disregard data movement on Jetson
- EGLStreams: graphics API interoperability
- Argus, the NV V4L2 extensions, and the DeepStream libraries are optimized for zero copy
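A minimal sketch of the mapped-memory pattern above, using only standard CUDA runtime calls; the 16MB buffer size and the device-side cudaMemset (standing in for a real kernel launch) are illustrative assumptions:

#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    const size_t size = 16 * 1024 * 1024;   // 16MB example buffer

    // allow mapping of host allocations into the device address space
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // allocate host memory that is mapped into the GPU's address space
    unsigned char* cpuPtr = NULL;
    cudaHostAlloc((void**)&cpuPtr, size, cudaHostAllocMapped);

    // retrieve the device-side pointer that aliases the same physical memory
    unsigned char* gpuPtr = NULL;
    cudaHostGetDevicePointer((void**)&gpuPtr, cpuPtr, 0);

    // the GPU writes through the device pointer (a kernel launch would go here instead)
    cudaMemset(gpuPtr, 0xAB, size);
    cudaDeviceSynchronize();

    // the CPU reads the same memory through the host pointer: no cudaMemcpy() required
    printf("first byte written by GPU: 0x%02X\n", cpuPtr[0]);

    cudaFreeHost(cpuPtr);
    return 0;
}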

Thank you!
- Developer Site
- Getting Started
- Hello AI World
- DevTalk Forums
- Visit the eLinux.org/Jetson_Nano wiki
- Dev Blog: Jetson Nano Brings AI Computing to Everyone
Q&A: What can I help you build?
