From 3e085e3cee1e37f9a3254853522f1d33367d27cb Mon Sep 17 00:00:00 2001 From: Martial Michel Date: Thu, 30 Nov 2023 11:43:37 -0500 Subject: [PATCH] Content clarifications --- README.md | 127 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 76 insertions(+), 51 deletions(-) diff --git a/README.md b/README.md index 0755c5b..3031905 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # CTPO: CUDA + TensorFlow + PyTorch + OpenCV Docker containers -Latest revision: 20231120 +Latest release: 20231120 * 1. [Builds and Notes](#BuildsandNotes) @@ -22,54 +22,64 @@ Latest revision: 20231120 /vscode-markdown-toc-config --> -`Dockerfile`s to build containers with support for CPU and GPU (NVIDIA CUDA) containers with support for TensorFlow, PyTorch and OpenCV (or combinations of), based on Ubuntu 22.04 container images. +`Dockerfile`s to build containers with support for CPU and GPU (NVIDIA CUDA) containers with support for TensorFlow, PyTorch and OpenCV (or combinations of), based on `nvidia/cuda` and Ubuntu 22.04 container images. The tool's purpose is to enable developers, ML and CV enthusiasts to build and test solutions `FROM` a docker container, allowing fast prototyping and release of code to the community. -Building each container independently is made possible by the `Dockerfile` store in the `BuildDetails//` directories. -Building each container takes resources and time (counted in many cores, memory and hours). +Building each container independently is made possible by the `Dockerfile` available in the `BuildDetails//-` directories. +Building each container takes resources and time (counted in many cores, GB of memory and build hours). Pre-built containers are available from Infotrend Inc.'s Docker account at https://hub.docker.com/r/infotrend/ Details on the available container and build are discussed in this document. -A Jupyter Lab and Unraid version of this WebUI-enabled version are also available on our Docker Hub. +A Jupyter Lab and Unraid version of this WebUI-enabled version are also available on our Docker Hub, as well as able to be built from the `Makefile`. -Note: this tool was built earlier in 2023, iterations of its Jupyter Lab were made available to our data scientists, and we are releasing it to help the developer community. +Note: this tool was built earlier in 2023, iterations of its Jupyter Lab were made available to Infotrend's data scientists, and we are releasing it to help the developer community. ## 1. Builds and Notes -The base OS for those container images is pulled from Dockerhub's official `ubuntu:22.04` or `nvidia/cuda:[...]-devel-ubuntu22.04` images. -More details on the Nvidia base images are available at https://hub.docker.com/r/nvidia/cuda/ . +The base for those container images is pulled from Dockerhub's official `ubuntu:22.04` or `nvidia/cuda:[...]-devel-ubuntu22.04` images. + +More details on the Nvidia base images are available at https://hub.docker.com/r/nvidia/cuda/ In particular, please note that "By downloading these images, you agree to the terms of the license agreements for NVIDIA software included in the images"; with further details on DockerHub version from https://docs.nvidia.com/cuda/eula/index.html#attachment-a -For GPU-optimized versions, you will need to build the `cuda_` versions on a host with the final hardware. +For GPU-optimized versions, you will need to build the `cuda_` versions on a host with the supported hardware. 
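+For example, before attempting a `cuda_` build you can confirm what your host supports (a minimal check, assuming the NVIDIA driver is already installed on that host):
+```
+# the nvidia-smi header reports the installed driver version and the
+# highest CUDA version that this driver supports
+% nvidia-smi
+```
+If the reported CUDA version is lower than the one you intend to build, upgrade the driver first.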
When using GPU and building the container, you need to install the NVIDIA Container Toolkit found at https://github.com/NVIDIA/nvidia-container-toolkit
-We note that your NVIDIA video driver needs to support the version of CUDA that you are trying to build (you can see the supported CUDA version and driver version information when running the `nvidia-smi` command)
+Note that your NVIDIA video driver on your Linux host needs to support the version of CUDA that you are trying to build (you can see the supported CUDA version and driver version information when running the `nvidia-smi` command).
-For CPU builds, you can simply build the non-`cuda_` versions.
+For CPU builds, simply build the non-`cuda_` versions.
-Pre-built images are available for download on Infotrend's DockerHub. Those are built using the same method provided by the `Makefile`. The corresponding `Dockerfile` used is stored in the `BuildDetails` directory matching the container image.
+Pre-built images are available for download on Infotrend's DockerHub (at https://hub.docker.com/r/infotrend/).
+Those are built using the same method provided by the `Makefile`, and the corresponding `Dockerfile` used for those builds is stored in the matching `BuildDetails//-` directory.
### 1.1. Tag naming conventions
-The tag naming convention follows the `_`-components split after the base name of `infotrend/ctpo-` followed by the "release" tag.
-Any `infotrend/ctpo-cuda_` build is a `GPU` build while all non-`cuda_` ones are `CPU` only.
-Note: Docker tags are always lowercase.
+The tag naming convention follows a `_`-components split after the base name of `infotrend/ctpo-`, followed by the "release" tag (Docker container images are always lowercase).
+`-` is used as a feature separator, in particular for `jupyter`- or `unraid`-specific builds.
+Any `cuda_` build is a `GPU` build, while all non-`cuda_` ones are `CPU` only.
-For example, for `infotrend/ctpo-tensorflow_pytorch_opencv:2.12.0_2.0.1_4.7.0-20231120`, this means: `"base name"-"component1"_"compoment2"_"component3":"component1_version"_"component2_version"_"component3_version"-"release tag"` with:
-- `base name`=`infotrend/ctop-`
+For example, for `infotrend/ctpo-tensorflow_pytorch_opencv:2.12.0_2.0.1_4.7.0-20231120`, this means: `"base name"-"component1"_"component2"_"component3":"component1_version"_"component2_version"_"component3_version"-"release"` with:
+- `base name`=`infotrend/ctpo-`
- `component1` + `component1_version` = `tensorflow` `2.12.0`
- `component2` + `component2_version` = `pytorch` `2.0.1`
- `component3` + `component3_version` = `opencv` `4.7.0`
-As such, this was "Infotrend's CTPO release 20231120 with TensorFlow 2.12.0, PyTorch 2.0.1, and OpenCV 4.7.0 without any CUDA support." (Since no `cuda_` was part of the name, this is a `CPU` build)
+- `release`=`20231120`
+As such, this was "Infotrend's CTPO release 20231120 with TensorFlow 2.12.0, PyTorch 2.0.1, and OpenCV 4.7.0 without any CUDA support."
+(Since no `cuda_` was part of the name, this is a `CPU` build)
-Similarly, `infotrend/ctpo-cuda_pytorch_opencv:11.8.0_2.0.1_4.7.0-20231120` can be read as:
+Similarly, `infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv-unraid:11.8.0_2.12.0_2.0.1_4.7.0-20231120` can be read as:
+- `base name`=`infotrend/ctpo-`
+- `feature1` = `jupyter`
- `component1` + `component1_version` = `cuda` `11.8.0`
-- `component2` + `component2_version` = `pytorch` `2.0.1`
-- `component3` + `component3_version` = `opencv` `4.7.0`
-As such, this was "Infotrend's CTPO release 20231120 with PyTorch 2.0.1, OpenCV 4.7.0 and CUDA support."
+- `component2` + `component2_version` = `tensorflow` `2.12.0`
+- `component3` + `component3_version` = `pytorch` `2.0.1`
+- `component4` + `component4_version` = `opencv` `4.7.0`
+- `feature2` = `unraid`
+- `release`=`20231120`
+As such, this was "Infotrend's CTPO release 20231120 with Jupyter Lab and Unraid-specific components, TensorFlow 2.12.0, PyTorch 2.0.1, OpenCV 4.7.0 and GPU (CUDA) support."
-There can be more or less than three components per name (ex: `tensorflow_opencv` or `cuda_tensorflow_pytorch_opencv`). It is left to the end user to follow the naming convention.
+There will be a variable number of components or features in the full container name, as shown above.
+It is left to the end user to follow the naming convention.
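+For example, the CPU image described above can be obtained directly from Infotrend's Docker Hub by its full tag (a sketch; substitute the components and release you need, or use `latest` for the most recent release of a given image):
+```
+% docker pull infotrend/ctpo-tensorflow_pytorch_opencv:2.12.0_2.0.1_4.7.0-20231120
+```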
### 1.2. Building
@@ -100,15 +110,18 @@ Below you will see the result of this command for the `20231120` release:
jupyter-cuda_tensorflow_pytorch_opencv-11.8.0_2.12.0_2.0.1_4.7.0
```
-In this printout appears multiple sections:
+This usage printout contains multiple sections:
- The `Docker Image tag ending` matches the software release tag.
- The `Docker Runtime` explains the current default runtime. For `GPU` (CTPO) builds it is recommended to add `"default-runtime": "nvidia"` in the `/etc/docker/daemon.json` file and restart the docker daemon. Similarly, for `CPU` (TPO) builds, that `"default-runtime"` should be removed (or commented.) You can check the current status of your runtime by running: `docker info | grep "Default Runtime"`
- The `Available Docker images to be built` section allows you to select the possible build targets. For `GPU`, the `cuda_` variants. For `CPU` the non `cuda_` variants. Naming conventions and tags follow the guidelines specified in the "Tag naming conventions" section.
- The `Jupyter Labs ready containers` are based on the containers built in the "Available Docker images[...]" and adding a running "Jupyter Labs" following the specific `Dockerfile` in the `Jupyter_build` directory. The list of built containers is limited to the most components per `CPU` and `GPU` to simplify distribution.
+Local builds will not have the `infotrend/ctpo-` prefix added to their base name.
+Those prefixed images are only released to Docker Hub by the maintainers.
+
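+As mentioned in the `Docker Runtime` bullet above, GPU builds are easiest to use when `nvidia` is the default runtime. A minimal sketch of the relevant `/etc/docker/daemon.json` entries follows (the `runtimes` block mirrors the NVIDIA Container Toolkit documentation; adapt it to your existing configuration before restarting the docker daemon):
+```
+% cat /etc/docker/daemon.json
+{
+    "default-runtime": "nvidia",
+    "runtimes": {
+        "nvidia": {
+            "path": "nvidia-container-runtime",
+            "runtimeArgs": []
+        }
+    }
+}
+% sudo systemctl restart docker
+% docker info | grep "Default Runtime"
+```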
### 1.3. Dockerfile
-Each time you request a specific `make` target a dedicated `Dockerfile` is built in the `BuildDetails//` directory.
+Each time you request a specific `make` target, a dedicated `Dockerfile` is built in the `BuildDetails//` directory.
That `Dockerfile` contains `ARG` and `ENV` values that match the specific build parameters.
For example in release `20231120`, when building the `tensorflow_opencv-2.12.0_4.7.0` target, the `BuildDetails/20231120/tensorflow_opencv-2.12.0_4.7.0-20231120/Dockerfile` will be created and used to build the `tensorflow_opencv:2.12.0_4.7.0-20231120` container image.
@@ -138,23 +151,34 @@ That `Dockerfile` should enable developers to integrate their modifications to b
When the maintainers upload this image to Dockerhub, that image will be preceded by `infotrend/ctpo-`.
-If you choose to build the image for your hardware, please be patient, building any of those images might take a long time (counted in hours).
+If you choose to build a container image for your hardware, please be patient; building any of those images might take a long time (counted in hours).
To build it this way, find the corresponding `Dockerfile` and `docker build -f /Dockerfile .` from the location of this `README.md`.
-The build process will require some of the script in the `tools` directory to complete.
+The build process will require some of the scripts in the `tools` directory to complete.
For example, to build the `BuildDetails/20231120/tensorflow_opencv-2.12.0_4.7.0-20231120/Dockerfile` and tag it as `to:test` from the directory where this `README.md` is located, run:
```
% docker build -f ./BuildDetails/20231120/tensorflow_opencv-2.12.0_4.7.0-20231120/Dockerfile --tag to:test .
```
+> ℹ️ If you use an existing `Dockerfile`, please update the `ARG CTPO_NUMPROC=` line with the value of running the `nproc --all` command.
+> The value in the `Dockerfile` reflects the build as it was performed for release to Docker Hub and might not represent your build system.
+
+The `Makefile` contains most of the variables that define the versions of the different frameworks.
+The file has many comments that allow developers to tailor the build.
+
+For example, any release on our Docker Hub is made with "redistributable" packages; the `CTPO_ENABLE_NONFREE` variable in the `Makefile` controls that feature:
+> `The default is not to build OpenCV non-free or build FFmpeg with libnpp, as those would make the images unredistributable.`
+> `Replace "free" by "unredistributable" if you need to use those for a personal build`
+
### 1.4. Available builds on DockerHub
The `Dockerfile` used for a Dockerhub pushed built is shared in the `BuildDetails` directory (see the [Dockerfile](#Dockerfile) section above)
-We will publish releases into [Infotrend Inc](https://hub.docker.com/r/infotrend/)'s Docker Hub account, as well as other tools.
+We will publish releases into [Infotrend Inc](https://hub.docker.com/r/infotrend/)'s Docker Hub account.
+There you can find other releases from Infotrend.
The tag naming reflects the [Tag naming conventions](#Tagnamingconventions) section above.
-`latest` is used to point to the most recent release.
+`latest` is used to point to the most recent release for a given container image.
The different base container images that can be found there are:
- CPU builds:
@@ -175,9 +199,9 @@ The different base container images that can be found there are:
### 1.5. Build Details
-The [`README-BuildDetails.md`](README-BuildDetails.md) file is built automatically from the content of the `BuildDetails` directory and contains link to different files stored in each sub-directory.
+The [`README-BuildDetails.md`](README-BuildDetails.md) file is built automatically from the content of the `BuildDetails` directory and contains links to different files stored in each sub-directory.
-It reflects each build's detailed information, such as (where relevant), the Docker tag, version of CUDA, cuDNN, TensorFlow, PyTorch, OpenCV, FFmpeg and Ubuntu. Most content also links to sub-files that contain further insight into the system package, enabled build parameters, etc.
+It reflects each build's detailed information, such as (where relevant) the Docker tag, version of CUDA, cuDNN, TensorFlow, PyTorch, OpenCV, FFmpeg and Ubuntu. Most content also links to sub-files that contain further insight into the system packages, enabled build parameters, etc.
### 1.6. Jupyter build
@@ -187,19 +211,19 @@ A "user" version (current user's UID and GID are passed to the internal user) ca
The specific details of such builds are available in the `Jupyter_build` directory, in the `Dockerfile` and `Dockerfile-user` files.
-In particular, the Notebook default Jupyter lab password (`iti`) is stored in the `Dockerfile` and can be modified by the builder by replacing the `--IdentityProvider.token='iti'` command line option.
+The default Jupyter Lab password (`iti`) is stored in the `Dockerfile` and can be modified by the builder by replacing the `--IdentityProvider.token='iti'` command line option.
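+For example (a sketch only; the exact command line option may be formatted differently in your copy of `Jupyter_build/Dockerfile`), the token can be changed before rebuilding:
+```
+# replace the default 'iti' token with a value of your choosing, then rebuild the Jupyter image
+% sed -i "s/--IdentityProvider.token='iti'/--IdentityProvider.token='my_secret_token'/" Jupyter_build/Dockerfile
+```
+`my_secret_token` is only a placeholder; pick your own value.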
-When using the Jupyter-specific container, it is also important to remember to expose the port used by the tool (here 8888), as such in your `docker run` command, make sure to add something akin to `-p 8888:8888` to the command line.
+When using the Jupyter-specific container, it is important to remember to expose the port used by the tool (here: 8888); as such, in your `docker run` command, make sure to add `-p 8888:8888` to the command line.
Pre-built containers are available, see the [Available builds on DockerHub](#AvailablebuildsonDockerHub) section above.
### 1.7. Unraid build
-Those are specializations of the Jupyter Lab's builds, and container images with a `sudo`-capable `jupyter` user using unraid's specific `uid` and `gid` and the same default `iti` Jupyter lab's default password.
+Those are specializations of the Jupyter Lab builds: container images with a `sudo`-capable `jupyter` user using Unraid's specific `uid` and `gid`, and the same default Jupyter Lab password (`iti`).
-The unraid version can be built using `make JN_MODE="-unraid" jupyter_tpo jupyter_ctpo`.
+The Unraid version can be built using `make JN_MODE="-unraid" jupyter_tpo jupyter_ctpo`.
-The build details are available in the `Jupyter_build/Dockerfile-unraid` file.
+The build `Dockerfile` is `Jupyter_build/Dockerfile-unraid`.
Pre-built containers are available, see the [Available builds on DockerHub](#AvailablebuildsonDockerHub) section above.
@@ -207,54 +231,55 @@ Pre-built containers are available, see the [Available builds on DockerHub](#Ava
### 2.1. A note on supported GPU in the Docker Hub builds
-In some cases, a minimum Nvidia driver version is needed to run specific version of CUDA, [Table 1: CUDA Toolkit and Compatible Driver Versions](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver) and [Table 2: CUDA Toolkit and Minimum Compatible Driver Versions](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html) as well as the `nvidia-smi` command on your host will help you determine if a specific version of CUDA will be supported.
+A minimum Nvidia driver version is needed to run the CUDA builds.
+[Table 1: CUDA Toolkit and Compatible Driver Versions](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver) and [Table 2: CUDA Toolkit and Minimum Compatible Driver Versions](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html), as well as the `nvidia-smi` command on your host, will help you determine if a specific version of CUDA will be supported.
-It is important to note that not all GPUs are supported in the Docker Hub builds. The containers are built for "compute capability (version)" (as defined in the [GPU supported](https://en.wikipedia.org/wiki/CUDA#GPUs_supported) Wikipedia page) of 6.0 and above (ie Pascal and above).
-
-If you need a different GPU compute capability, please edit the `Makefile` and alter the various `DNN_ARCH_` matching the one that you need to build and add your architecture. Then type `make` to see the entire list of containers that the release you have obtained can build and use the exact tag that you want to build to build it locally (on Ubuntu, you will need `docker` and `build-essential` installed --at least-- to do this).
Building a container image takes a lot of CPU and can take multiple hours, so we recommend you build only the target you need.
+Not all GPUs are supported in the Docker Hub builds.
+The containers are built for "compute capability (version)" (as defined in the [GPU supported](https://en.wikipedia.org/wiki/CUDA#GPUs_supported) Wikipedia page) of 6.0 and above (i.e. Pascal and above).
+If you need a different GPU compute capability, please edit the `Makefile` and alter the various `DNN_ARCH_` entries to add the architecture that you need to build. Then type `make` to see the entire list of containers that this release can build, and use the exact tag of the one you want in order to build it locally (on Ubuntu, you will need at least `docker` and `build-essential` installed to do this).
+Building a container image takes a lot of CPU and can take multiple hours, so we recommend you build only the target you need.
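+For illustration, a typical flow is to list the targets and then build a single one (the target name below is only illustrative; copy the exact name printed by `make` for the release you obtained):
+```
+% make
+% make cuda_tensorflow_pytorch_opencv-11.8.0_2.12.0_2.0.1_4.7.0
+```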
### 2.2. Using the container images
Build or obtain the container image you require from DockerHub.
We understand the image names are verbose. This is to avoid confusion between the different builds.
-It is recommended to `tag` containers with short names for easy `docker run`.
+It is possible to `tag` containers with shorter names for easier `docker run` invocations.
The `WORKDIR` for the containers is set as `/iti`, as such, should you want to map the current working directory within your container and test functions, you can `-v` as `/iti`.
When using a GPU image, make sure to add `--gpus all` to the `docker run` command line.
-For example to run the GPU-Jupyter container and expose the WebUI to port 8765, one could:
+For example, to run the GPU-Jupyter container and expose the WebUI to port 8765, one would:
```
% docker run --rm -v `pwd`:/iti --gpus all -p 8765:8888 infotrend/ctpo-jupyter-cuda_tensorflow_pytorch_opencv:11.8.0_2.12.0_2.0.1_4.7.0-20231120
```
By going to http://localhost:8765 you will be shown the Jupyter `Log in` page. As a reminder, the default token is `iti`.
-When you login, you will see the Jupyter lab interface and the list of files mounted in `/iti` on the left.
-From that WebUI, when you `File->Shutdown`, the container will exit.
+When you log in, you will see the Jupyter Lab interface, with the list of files mounted in `/iti` shown in its file browser.
+From that WebUI, when you select `File -> Shutdown`, the container will exit.
-The non-Jupyter containers are set to provide the end users with a `bash`. If the `/iti` directory is mounted in a directory where the developer has some come for testing with one of the provided tools, this can be done. For example to run some of the content of the `test` directory on CPU (in the directory where this `README.md` is located):
+The non-Jupyter containers are set to provide the end users with a `bash` shell.
+Mounting the current working directory (`pwd`) that contains the developer's test code as `/iti` enables a quick container-based prototyping/testing environment.
+For example, to run some of the content of the `test` directory on a CPU, in the directory where this `README.md` is located:
```
% docker run --rm -it -v `pwd`:/iti infotrend/ctpo-tensorflow_opencv:2.12.0_4.7.0-20231120
-
+ [this starts the container in interactive mode and we can type commands in the provided shell]
root@b859b8aced9c:/iti# python3 ./test/tf_test.py
Tensorflow test: CPU only
-
-
On CPU: tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
-
-
Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images (batch x height x width x channel). Sum of ten runs.
CPU (s): 0.483618629979901
Tensorflow test: Done
```
-Note that the base container runs as root, if you want to run it as a non-root user, add `-u $(id -u):$(id -g)` to the `docker` command line but ensure that you have access to the directories you will work in.
+Note that the base container runs as `root`.
+If you want to run it as a non-root user, add `-u $(id -u):$(id -g)` to the `docker` command line and ensure that you have access to the directories you will work in.
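+For example, the same CPU container can be started as the current user (a sketch; the mounted directory must be writable by that user):
+```
+% docker run --rm -it -u $(id -u):$(id -g) -v `pwd`:/iti infotrend/ctpo-tensorflow_opencv:2.12.0_4.7.0-20231120
+```
+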
## 3. Version History