cherry pick doc fix #3238 from main to release2.5 (#3240)
Co-authored-by: Dheeraj Peri <peri.dheeraj@gmail.com>
lanluo-nvidia and peri044 authored Oct 16, 2024
1 parent edf63bb commit f2e1e6c
Showing 9 changed files with 75 additions and 47 deletions.
48 changes: 37 additions & 11 deletions docsrc/index.rst
@@ -48,10 +48,35 @@ User Guide
user_guide/saving_models
user_guide/runtime
user_guide/using_dla


Tutorials
------------

* :ref:`torch_compile_advanced_usage`
* :ref:`vgg16_ptq`
* :ref:`engine_caching_example`
* :ref:`engine_caching_bert_example`
* :ref:`refit_engine_example`
* :ref:`serving_torch_tensorrt_with_triton`
* :ref:`torch_export_cudagraphs`
* :ref:`custom_kernel_plugins`
* :ref:`mutable_torchtrt_module_example`

.. toctree::
:caption: Tutorials
:maxdepth: 1
:hidden:

tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage
tutorials/_rendered_examples/dynamo/vgg16_ptq
tutorials/_rendered_examples/dynamo/engine_caching_example
tutorials/_rendered_examples/dynamo/engine_caching_bert_example
tutorials/_rendered_examples/dynamo/refit_engine_example
tutorials/serving_torch_tensorrt_with_triton
tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
tutorials/_rendered_examples/dynamo/custom_kernel_plugins
tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example

Dynamo Frontend
----------------
@@ -97,27 +122,28 @@ FX Frontend

fx/getting_started_with_fx_path

Tutorials
Model Zoo
------------
* :ref:`torch_tensorrt_tutorials`
* :ref:`serving_torch_tensorrt_with_triton`
* :ref:`torch_compile_resnet`
* :ref:`torch_compile_transformer`
* :ref:`torch_compile_stable_diffusion`
* :ref:`torch_export_gpt2`
* :ref:`torch_export_llama2`
* :ref:`notebooks`

.. toctree::
:caption: Tutorials
:caption: Model Zoo
:maxdepth: 3
:hidden:

tutorials/serving_torch_tensorrt_with_triton
tutorials/notebooks

tutorials/_rendered_examples/dynamo/torch_compile_resnet_example
tutorials/_rendered_examples/dynamo/torch_compile_transformers_example
tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion
tutorials/_rendered_examples/dynamo/torch_export_cudagraphs
tutorials/_rendered_examples/dynamo/custom_kernel_plugins
tutorials/_rendered_examples/distributed_inference/data_parallel_gpt2
tutorials/_rendered_examples/distributed_inference/data_parallel_stable_diffusion
tutorials/_rendered_examples/dynamo/mutable_torchtrt_module_example
tutorials/_rendered_examples/dynamo/torch_export_gpt2
tutorials/_rendered_examples/dynamo/torch_export_llama2
tutorials/notebooks

Python API Documentation
------------------------
@@ -214,4 +240,4 @@ Legacy Further Information (TorchScript)
* `GTC 2021 Fall Talk <https://www.nvidia.com/en-us/on-demand/session/gtcfall21-a31107/>`_
* `PyTorch Ecosystem Day 2021 <https://assets.pytorch.org/pted2021/posters/I6.png>`_
* `PyTorch Developer Conference 2021 <https://s3.amazonaws.com/assets.pytorch.org/ptdd2021/posters/D2.png>`_
* `PyTorch Developer Conference 2022 <https://pytorch.s3.amazonaws.com/posters/ptc2022/C04.pdf>`_
5 changes: 2 additions & 3 deletions docsrc/tutorials/notebooks.rst
@@ -1,10 +1,9 @@
.. _notebooks:

Example notebooks
Legacy notebooks
===================

There exists a number of notebooks which cover specific using specific features and models
with Torch-TensorRT
There are a number of notebooks that demonstrate the model conversions, features, and frontends available within Torch-TensorRT.

Notebooks
------------
5 changes: 1 addition & 4 deletions examples/README.rst
@@ -1,7 +1,4 @@
.. _torch_tensorrt_tutorials:

Torch-TensorRT Tutorials
===========================

The user guide covers the basic concepts and usage of Torch-TensorRT.
We also provide a number of tutorials to explore specific usecases and advanced concepts
===========================
29 changes: 16 additions & 13 deletions examples/dynamo/README.rst
@@ -1,19 +1,22 @@
.. _torch_compile:
.. _torch_tensorrt_examples:

Dynamo / ``torch.compile``
----------------------------
Here we provide examples of Torch-TensorRT compilation of popular computer vision and language models.

Torch-TensorRT provides a backend for the new ``torch.compile`` API released in PyTorch 2.0. In the following examples we describe
a number of ways you can leverage this backend to accelerate inference.
Dependencies
------------------------------------

Install the following external dependencies. This assumes compatible versions of the ``torch``, ``torch_tensorrt`` and ``tensorrt`` libraries are already installed (see `dependencies <https://github.com/pytorch/TensorRT?tab=readme-ov-file#dependencies>`_).

.. code-block:: bash

    pip install -r requirements.txt

Model Zoo
------------------------------------
* :ref:`torch_compile_resnet`: Compiling a ResNet model using the Torch Compile Frontend for ``torch_tensorrt.compile``
* :ref:`torch_compile_transformer`: Compiling a Transformer model using ``torch.compile``
* :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
* :ref:`torch_compile_stable_diffusion`: Compiling a Stable Diffusion model using ``torch.compile``
* :ref:`torch_export_cudagraphs`: Using the Cudagraphs integration with ``ir="dynamo"``
* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines
* :ref:`refit_engine_example`: Refitting a compiled TensorRT Graph Module with updated weights
* :ref:`mutable_torchtrt_module_example`: Compile, use, and modify TensorRT Graph Module with MutableTorchTensorRTModule
* :ref:`vgg16_fp8_ptq`: Compiling a VGG16 model with FP8 and PTQ using ``torch.compile``
* :ref:`engine_caching_example`: Utilizing engine caching to speed up compilation times
* :ref:`engine_caching_bert_example`: Demonstrating engine caching on BERT
* :ref:`torch_export_gpt2`: Compiling a GPT2 model using the AOT workflow (``ir="dynamo"``)
* :ref:`torch_export_llama2`: Compiling a Llama2 model using the AOT workflow (``ir="dynamo"``)
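
Each entry in the list above resolves only if some source file defines a label of exactly the same name (for example ``.. _torch_export_gpt2:`` at the top of an example script). The following is a minimal sketch of how such cross-references could be checked before building the docs. It is not part of Torch-TensorRT or Sphinx; the helper names ``defined_labels``, ``referenced_labels`` and ``unresolved_refs`` are hypothetical, and the regular expressions only cover the simple ``:ref:`` forms used in this commit.

```python
import re

# Matches explicit RST targets such as ".. _torch_export_gpt2:" on their own line.
LABEL_RE = re.compile(r"^[ \t]*\.\. _([^:\n]+):[ \t]*$", re.MULTILINE)
# Matches :ref:`label` and :ref:`Title <label>`, capturing only the label.
REF_RE = re.compile(r":ref:`(?:[^`<]*<)?([^`>]+)>?`")


def defined_labels(text):
    """Return the set of label names defined in one source text."""
    return set(LABEL_RE.findall(text))


def referenced_labels(text):
    """Return the set of label names used by :ref: roles in one text."""
    return set(REF_RE.findall(text))


def unresolved_refs(index_text, source_texts):
    """Return the sorted :ref: targets in index_text that no source defines."""
    labels = set()
    for source in source_texts:
        labels |= defined_labels(source)
    return sorted(referenced_labels(index_text) - labels)


if __name__ == "__main__":
    # Illustrative inputs only: a reference written with a stray leading
    # underscore does not match the label it was meant to point at.
    index = "* :ref:`torch_compile_resnet`\n* :ref:`_torch_export_gpt2`\n"
    sources = [".. _torch_compile_resnet:\n", ".. _torch_export_gpt2:\n"]
    print(unresolved_refs(index, sources))
```

In a label definition the leading underscore is part of the RST target syntax, not of the name, so a ``:ref:`` role has to omit it.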

2 changes: 1 addition & 1 deletion examples/dynamo/torch_compile_resnet_example.py
@@ -1,7 +1,7 @@
"""
.. _torch_compile_resnet:
Compiling ResNet using the Torch-TensorRT `torch.compile` Backend
Compiling ResNet with dynamic shapes using the `torch.compile` backend
==========================================================
This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a ResNet model."""
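
For orientation, the ``torch.compile`` workflow this docstring refers to can be sketched roughly as below. This is an illustrative sketch, not the example script itself: the helper names ``choose_backend`` and ``compile_for_inference`` are hypothetical, and the fallback to PyTorch's default ``"inductor"`` backend when ``torch_tensorrt`` is not installed is an assumption made only so the snippet stays usable on machines without TensorRT.

```python
import importlib.util


def choose_backend() -> str:
    """Pick "torch_tensorrt" when that package is installed, else "inductor"."""
    if importlib.util.find_spec("torch_tensorrt") is not None:
        return "torch_tensorrt"
    return "inductor"  # assumption: fall back to PyTorch's default backend


def compile_for_inference(model):
    """Wrap an nn.Module with torch.compile using the chosen backend."""
    import torch  # imported lazily so choose_backend() works without PyTorch

    backend = choose_backend()
    if backend == "torch_tensorrt":
        # Importing the package registers the "torch_tensorrt" backend name.
        import torch_tensorrt  # noqa: F401
    return torch.compile(model.eval(), backend=backend)
```

``torch.compile`` is lazy, so the actual compilation happens the first time the returned module is called with real inputs, not when ``compile_for_inference`` returns.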
2 changes: 1 addition & 1 deletion examples/dynamo/torch_compile_stable_diffusion.py
@@ -1,7 +1,7 @@
"""
.. _torch_compile_stable_diffusion:
Torch Compile Stable Diffusion
Compiling Stable Diffusion model using the `torch.compile` backend
======================================================
This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a Stable Diffusion model. A sample output is featured below:
4 changes: 2 additions & 2 deletions examples/dynamo/torch_compile_transformers_example.py
@@ -1,10 +1,10 @@
"""
.. _torch_compile_transformer:
Compiling a Transformer using torch.compile and TensorRT
Compiling BERT using the `torch.compile` backend
==============================================================
This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a transformer-based model."""
This interactive script is intended as a sample of the Torch-TensorRT workflow with `torch.compile` on a BERT model."""

# %%
# Imports and Model Definition
13 changes: 7 additions & 6 deletions examples/dynamo/torch_export_gpt2.py
@@ -1,10 +1,10 @@
"""
.. _torch_export_gpt2:
Compiling GPT2 using the Torch-TensorRT with dynamo backend
Compiling GPT2 using the dynamo backend
==========================================================
This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a GPT2 model."""
This script illustrates the Torch-TensorRT workflow with the dynamo backend on the popular GPT2 model."""

# %%
# Imports and Model Definition
@@ -78,9 +78,10 @@
tokenizer.decode(trt_gen_tokens[0], skip_special_tokens=True),
)

# %%
# The output sentences should look like
# Prompt : What is parallel programming ?

# =============================
# Pytorch model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
# Pytorch model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that

# =============================
# TensorRT model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
# TensorRT model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
14 changes: 8 additions & 6 deletions examples/dynamo/torch_export_llama2.py
@@ -1,10 +1,10 @@
"""
.. _torch_export_llama2:
Compiling Llama2 using the Torch-TensorRT with dynamo backend
Compiling Llama2 using the dynamo backend
==========================================================
This interactive script is intended as a sample of the Torch-TensorRT workflow with dynamo backend on a Llama2 model."""
This script illustrates the Torch-TensorRT workflow with the dynamo backend on the popular Llama2 model."""

# %%
# Imports and Model Definition
@@ -82,9 +82,11 @@
)[0],
)

# %%
# The output sentences should look like

# Prompt : What is dynamic programming?

# =============================
# Pytorch model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
# Pytorch model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and

# =============================
# TensorRT model generated text: I enjoy walking with my cute dog, but I'm not sure if I'll ever be able to walk with my dog. I'm not sure if I'll ever be able to walk with my
# TensorRT model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
