
# NVIDIA TAO

NVIDIA TAO is a framework for customizing vision foundation models for high accuracy and performance with fine-tuning microservices. TAO’s suite of modular microservices helps you easily adapt and optimize vision AI models for specific domains or tasks. This dramatically reduces the time and data you need to build high-performing AI solutions that are ready for deployment from the edge to the cloud.   
  
At the heart of TAO is a collection of vision foundation models, multimodal models, and pre-trained vision models built on vast, commercially relevant datasets. Applicable across various industries, TAO excels at delivering custom industrial AI models for visual inspection, quality control, and robotic guidance.

[Download Now](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/containers/tao-toolkit "Download from NGC") [Get Started](https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/index.html "Quick Start Guide") [Forum](https://forums.developer.nvidia.com/c/accelerated-computing/intelligent-video-analytics/tao-toolkit/17 "TAO Forums")

* * *

## How TAO Works

The NVIDIA TAO workflow shows how developers can go from model training to production deployment in a seamless pipeline. The process begins with selecting a pretrained foundation model from the TAO model zoo or bringing a third-party model with an architecture supported by TAO. Next, developers adapt the model to their domain by fine-tuning and optimizing it to be smaller and faster at runtime. Finally, the trained models can be exported into open formats for deployment across diverse environments—from edge to cloud—with NVIDIA DeepStream Inference Builder. This structured workflow ensures that high-performing AI models can be quickly customized, optimized, and deployed at scale.

![What is TAO Toolkit and how does it fit into AI model development workflow?](https://developer.download.nvidia.com/images/tao/tao-diagram.svg)

### Tech Blog  

Build real-time visual inspection pipelines with NVIDIA TAO 6.

[Read the Blog](https://developer.nvidia.com/blog/build-a-real-time-visual-inspection-pipeline-with-nvidia-tao-6-and-nvidia-deepstream-8/)

### TAO Documentation  

Browse documentation and learn how to get started on TAO.

[See the Quick Start Guide](https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/index.html)

### TAO 6 Release Notes

Learn about the new features in the latest TAO 6 release.

[Read the Release Notes](https://docs.nvidia.com/tao/tao-toolkit/text/release_notes.html#version-list)

* * *

## Key Features

### Scale Custom Model Development With New Vision Foundation Models 

Use high-performance vision foundation models as a general-purpose starting point for developing a variety of downstream vision tasks, like classification, detection, segmentation, and more. You can customize models for domain or task-specific vision applications across industries based on training data availability and performance requirements.

### Achieve High Accuracy With Advanced Training Techniques

Apply advanced training and fine-tuning capabilities, including self-supervised learning (SSL), to learn from unlabeled, unstructured data. This accelerates training time and reduces annotation costs. Plus, post-train third-party models with an architecture supported by TAO.
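The gain from SSL comes from objectives that need no labels. As a rough illustration (a generic contrastive objective, not TAO's implementation), an NT-Xent-style loss pulls two augmented views of the same image together in embedding space while pushing other images away:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(anchor, positive, negatives, temperature=0.1):
    """NT-Xent (contrastive) loss for one anchor: pull the positive
    (another augmented view of the same image) close, push negatives away."""
    logits = [cosine(anchor, positive) / temperature] + [
        cosine(anchor, n) / temperature for n in negatives
    ]
    # Softmax cross-entropy with the positive at index 0.
    m = max(logits)
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

# Toy embeddings: two views of the same image vs. unrelated images.
anchor = [1.0, 0.0, 0.1]
positive = [0.9, 0.1, 0.0]
negatives = [[0.0, 1.0, 0.0], [-1.0, 0.2, 0.0]]

loss = nt_xent(anchor, positive, negatives)
```

Because the objective is defined entirely by image augmentations, unlabeled data is enough, which is where the annotation savings come from.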

### Increase Inference Throughput With Knowledge Distillation 

Use knowledge distillation to compress large models into efficient, edge-ready versions with minimal reduction in accuracy.
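The core idea can be sketched with Hinton-style soft targets: train the small student model to match the large teacher's temperature-softened output distribution. A toy, framework-free version (illustrative only, not TAO's internal code):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions.
    The student is trained to match the teacher's soft targets."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # KL(p || q); the T^2 factor keeps gradient magnitude comparable.
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.2]          # large model's logits for one image
aligned_student = [3.9, 1.1, 0.1]  # student that mimics the teacher well
random_student = [0.0, 0.0, 4.0]   # student that disagrees
```

Minimizing this loss over a dataset transfers the teacher's learned behavior into the smaller, faster student.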

### Reduce Data Preparation Times With TAO Data Services 

Manage, process, and prepare datasets for AI model training with services that streamline the data pipeline process with tools for data ingestion, auto-labeling, and conversion to formats optimized for NVIDIA TAO.

### Deploy Anywhere, Run Efficiently

With fine-tuning microservices (FTMS) and DeepStream Inference Builder, TAO standardizes training and deployment for all supported models for inference on edge or cloud. It offers training job orchestration and status monitoring, and automatically searches for the best hyperparameters with AutoML.
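AutoML-style hyperparameter search can be pictured as a loop that samples configurations, scores each one, and keeps the best. The sketch below uses plain random search with a made-up objective function standing in for "train and evaluate"; TAO's AutoML service automates this orchestration for you:

```python
import random

def random_search(objective, space, trials=20, seed=0):
    """Minimal random-search sketch: sample hyperparameters from a space,
    score each configuration, keep the best. Illustrative only."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical stand-in for validation accuracy; peaks at lr=0.01, momentum=0.9.
def mock_validation_accuracy(cfg):
    return 1.0 - abs(cfg["lr"] - 0.01) - 0.1 * abs(cfg["momentum"] - 0.9)

space = {"lr": (0.0001, 0.1), "momentum": (0.5, 0.99)}
best, score = random_search(mock_validation_accuracy, space, trials=200)
```

In practice each trial is a full (or early-stopped) training run, which is why having a service schedule and monitor the jobs matters.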

* * *

## TAO Models

### Vision Foundation Models 

Use vision foundation models (VFMs) as pretrained starting points, making it easy to fine-tune models for domain-specific tasks and deploy them efficiently at scale.

- [C-RADIO v2](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/cradiov2)
- [NV-DINO v2: VFM With SSL](https://docs.nvidia.com/tao/tao-toolkit/text/cv_finetuning/pytorch/self_supervised_learning/nvdinov2.html)

### Pre-Trained Vision Models 

Easily combine pretrained vision models with a foundation model for tasks like detection, segmentation, classification, and change detection, streamlining domain-specific customization.

- Real-time detection ([RT-DETR](https://docs.nvidia.com/tao/tao-toolkit/text/cv_finetuning/pytorch/object_detection/rt_detr.html#))
- Semantic segmentation ([SegFormer](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_segformer_imagenet?version=fan_hybrid_tiny))
- Visual change detection ([Visual ChangeNet](https://docs.nvidia.com/tao/tao-toolkit/text/cv_finetuning/pytorch/visual_changenet/visual_changenet_classify.html#multiple-golden-data-format))

### Depth Estimation Models

Use mono and stereo depth estimation foundation models to achieve strong zero-shot generalization.

- [FoundationStereo](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/foundationstereo)
- [NVDepthAnythingv2](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/nvdepthanythingv2)
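Stereo models such as FoundationStereo predict a disparity map; recovering metric depth from it is the classic pinhole relation depth = f · B / d. A minimal sketch (the camera values below are made up for illustration):

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Classic stereo relation: depth = f * B / d.
    A stereo network predicts the disparity d (in pixels); depth then
    follows from the camera focal length f and stereo baseline B."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: 700 px focal length, 12 cm baseline, 20 px disparity -> 4.2 m.
depth = depth_from_disparity(20.0, 700.0, 0.12)
```

The relation also explains why nearby objects (large disparity) are measured more precisely than distant ones.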

### Multimodal Vision Models

Use multimodal vision models to combine vision (image and video) data with text to perform tasks like feature extraction, detection, or segmentation.

- [NVCLIP](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/nvclip_vit)
- [Grounding DINO](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/grounding_dino)
- [Mask Grounding DINO](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/mask_grounding_dino)
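The zero-shot pattern these models enable can be sketched in a few lines: embed the image and each text prompt into a shared space, then rank the prompts by cosine similarity. The embeddings below are toy vectors standing in for real model outputs (a model like NVCLIP produces the actual ones):

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def clip_style_scores(image_embedding, text_embeddings):
    """CLIP-style zero-shot scoring sketch: cosine similarity between
    a normalized image embedding and each normalized text embedding."""
    img = normalize(image_embedding)
    scores = []
    for txt in text_embeddings:
        t = normalize(txt)
        scores.append(sum(a * b for a, b in zip(img, t)))
    return scores

# Toy example: the image embedding is closest to the first "prompt".
image = [0.9, 0.1, 0.0]
prompts = [[1.0, 0.0, 0.0],   # e.g. "a photo of a defect"
           [0.0, 1.0, 0.0]]   # e.g. "a photo of a clean part"
scores = clip_style_scores(image, prompts)
```

The highest-scoring prompt is the zero-shot classification; no task-specific training is needed to add or change classes.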

* * *

## Get Started With TAO

 ![icon representing workflows](https://developer.download.nvidia.com/icons/m48-interapp-workflow-complex.svg)
### Set Up Your System

Check to see if your machine meets the system requirements and compatibility, then get started by installing TAO.

[Hardware Requirements](https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/index.html#hardware-requirements)

[Setup Options](https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/index.html#running-tao)

 ![icon representing TAO resources](https://developer.download.nvidia.com/icons/m48-demo-topics.svg)
### TAO GitHub Tutorials and Notebooks

Check out extended resources and Jupyter notebooks for TAO. 

[Learn More](https://github.com/NVIDIA/tao_tutorials/tree/main/notebooks/tao_api_starter_kit/tutorials)

 ![icon representing neural network](https://developer.download.nvidia.com/icons/m48-neural-network-3.svg)
### Download From NGC

Find the latest TAO containers and models on NGC.

[Download Now](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/containers/tao-toolkit)

* * *

## Performance

Unlock peak inference performance with NVIDIA pretrained models across platforms, from NVIDIA Jetson™ devices at the edge to data center GPUs in the cloud. For more details on batch size and other models, check the [detailed performance datasheet](https://docs.nvidia.com/tao/tao-toolkit/text/model_zoo/overview.html#performance-metrics).

| Model Arch | Model Variant | Inference Resolution | Precision | NVIDIA DGX Spark | NVIDIA Jetson AGX Thor™ | NVIDIA L40s | NVIDIA A100 | RTX PRO 6000 SE | NVIDIA H200 | NVIDIA B200 | NVIDIA HGX™ GB200 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C-RADIOv2 Classification | Large 322M | 3x224x224 | FP16 | 297 | 635 | 1453 | 1520 | 2443 | 3579 | 5781 | 6018 |
| NV-DINOv2 | Large 305M | 3x224x224 | FP16 | 207 | 413 | 1020 | 1048 | 1747 | 2542 | 5667 | 5957 |
| RT-DETR+C-RADIOv2 | Base 147M | 3x640x640 | FP16 | 204 | 248 | 670 | Awaiting Results | 1204 | 1934 | 3316 | Awaiting Results |
| SegFormer+C-RADIOv2 | Base 92M | 3x640x640 | FP16 | 254 | 264 | 1155 | 1330 | 1960 | 2746 | 3187 | 3386 |
| Multi-Golden ChangeNet Classification+C-RADIOv2 | Base | 3x224x224 | FP16 | Awaiting Results | Awaiting Results | 332 | 418 | 525 | 847 | 820 | 867 |
| NV-DepthAnythingv2 | Large 360M | 3x518x924 | FP32+FP16 | Awaiting Results | 25 | 66 | 70 | 108 | 176 | 320 | 320 |
| C-FoundationStereo | Small 221M | 2x3x320x736 | FP16 | 2.3 | 1.5 | 1.0 | Awaiting Results | 18 | 20 | 19 | Awaiting Results |

* * *

## Starter Kits

### Vision Foundation Models

Explore TAO's vision foundation models, with documentation and downloadable pretrained checkpoints to use as starting points for your own fine-tuning.

- [Foundation Models Documentation](https://docs.nvidia.com/tao/tao-toolkit/text/foundation_models/overview.html)
- Download [C-RADIO v2](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/cradiov2)
- Download [NV-DINO v2](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/nvdinov2_vitg)

### Fine-Tuning   

Take advantage of Supervised Fine-Tuning (SFT) with labeled data and Self-Supervised Learning (SSL) with unlabeled data.

- [Self-Supervised Learning](https://docs.nvidia.com/tao/tao-toolkit/text/cv_finetuning/pytorch/self_supervised_learning/index.html)
- [Computer Vision Fine-Tuning](https://docs.nvidia.com/tao/tao-toolkit/text/cv_finetuning/index.html)

### Model Distillation

Distill knowledge from a larger teacher model into a smaller student model sized for your target compute.

- [Knowledge Distillation](https://docs.nvidia.com/tao/tao-toolkit/text/qat_and_amp_for_training.html#knowledge-distillation)

### AI-Assisted Auto Labeling   

Use prompts and descriptors to auto-label object detection and segmentation masks.

- [Auto-Labeling in TAO](https://docs.nvidia.com/tao/tao-toolkit/text/data_services/auto-label.html)
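Conceptually, auto-labeling turns a prompt-based detector's output into training labels. The sketch below serializes hypothetical detections (e.g., boxes returned for a text prompt) into KITTI-style 2D label lines, a common object-detection label layout; the field layout, names, and threshold here are illustrative, not TAO's exact output:

```python
def to_kitti_lines(detections, score_threshold=0.5):
    """Sketch: convert detector output into KITTI-style 2D label lines.
    Fields beyond the class name and bounding box (truncation, occlusion,
    3D dimensions, etc.) are zero-filled for the 2D case."""
    lines = []
    for det in detections:
        if det["score"] < score_threshold:
            continue  # drop low-confidence pseudo-labels
        x1, y1, x2, y2 = det["bbox"]
        lines.append(
            f"{det['label']} 0.0 0 0.0 {x1:.2f} {y1:.2f} {x2:.2f} {y2:.2f} "
            "0.0 0.0 0.0 0.0 0.0 0.0 0.0"
        )
    return lines

# Hypothetical detections for the prompt "scratch . dent ."
detections = [
    {"label": "scratch", "bbox": (10, 20, 110, 80), "score": 0.91},
    {"label": "dent", "bbox": (200, 40, 260, 90), "score": 0.32},  # filtered out
]
labels = to_kitti_lines(detections)
```

Thresholding the pseudo-labels is the key design choice: a higher threshold trades label coverage for label quality.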

### Depth Estimation   

Access state-of-the-art depth estimation models.

- Download [FoundationStereo](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/foundationstereo)
- Download [NVDepthAnythingv2](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/nvdepthanythingv2)

### Model Deployment   

Optimize inference with the DeepStream SDK.

- [Deploy With DeepStream Inference Builder](https://docs.nvidia.com/tao/tao-toolkit/text/ds_tao/deepstream_tao_integration.html)
- [Watch the Video](https://www.youtube.com/watch?v=yj11L8rFC30)

* * *

## More Resources

 ![Decorative image representing Community](https://developer.download.nvidia.com/icons/m48-people-group.svg)
### Join the Community

 ![Decorative image representing Developer Newsletter](https://developer.download.nvidia.com/icons/m48-email-settings.svg)
### Sign Up for the Developer Newsletter

 ![Decorative image representing Developer Program](https://developer.download.nvidia.com/icons/m48-developer-1.svg)
### Join the NVIDIA Developer Program

* * *

## Ethical AI 

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.   
  
For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

Get Started Today.

[Start With NVIDIA TAO](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/containers/tao-toolkit)


