SolidWriting Documentation - CUDA / llama-cpp-python

1. Install NVIDIA CUDA v12.8 or Newer

Download and install NVIDIA CUDA v12.8 or a newer version from the official NVIDIA website.
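
Once the toolkit is installed, it can help to confirm that the `nvcc` on your PATH reports release 12.8 or newer. The following is a minimal sketch, not an official check: the helper names `parse_nvcc_release` and `installed_cuda_release` are made up for this example, and it assumes `nvcc --version` prints a line containing `release <major>.<minor>`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run the nvcc found on PATH and parse its release; None if nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found on PATH (or its output could not be parsed)")
    elif release >= (12, 8):
        print("CUDA release", release, "meets the v12.8 requirement")
    else:
        print("CUDA release", release, "is older than v12.8")
```

If `nvcc` is not found, open a new terminal first: the installer's PATH changes only apply to sessions started after installation.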

Download the cuDNN version compatible with CUDA v12 from the NVIDIA cuDNN download page. An NVIDIA account is required.

Extract the cuDNN zip archive and copy its bin, include, and lib folders into the CUDA installation directory, merging them with the existing folders of the same name:
```
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
```
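
The copy step above can also be scripted. This is a rough sketch under stated assumptions: `merge_cudnn_into_cuda` is a hypothetical helper, the extracted archive is assumed to contain `bin`, `include`, and `lib` at its top level, and the script must run from an elevated prompt because the destination is under `C:\Program Files`. It requires Python 3.8 or newer for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def merge_cudnn_into_cuda(cudnn_root, cuda_root, folders=("bin", "include", "lib")):
    """Copy each cuDNN folder into the CUDA directory, merging with existing content.

    Files with the same name are overwritten; unrelated files are left untouched.
    Returns the names of the folders that were actually copied.
    """
    copied = []
    for name in folders:
        src = Path(cudnn_root) / name
        if not src.is_dir():
            continue  # the archive does not contain this folder; skip it
        shutil.copytree(src, Path(cuda_root) / name, dirs_exist_ok=True)
        copied.append(name)
    return copied
```

A call might look like `merge_cudnn_into_cuda(r"C:\Temp\cudnn", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8")`, where `C:\Temp\cudnn` is only a placeholder for wherever you extracted the archive.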

Copy the files from the following directory:
```
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\extras\visual_studio_integration\MSBuildExtensions
```

To this directory:
```
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Microsoft\VC\v170\BuildCustomizations
```
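
To double-check that the Visual Studio integration files were copied, the two directories can be compared. The sketch below is illustrative only: `missing_build_customizations` is a hypothetical helper that lists files present in the CUDA `MSBuildExtensions` folder but absent from the Visual Studio `BuildCustomizations` folder.

```python
from pathlib import Path


def missing_build_customizations(cuda_msbuild_dir, vs_customizations_dir):
    """Return sorted names of files in the CUDA folder that the VS folder lacks."""
    source_names = {p.name for p in Path(cuda_msbuild_dir).iterdir() if p.is_file()}
    target = Path(vs_customizations_dir)
    if target.is_dir():
        target_names = {p.name for p in target.iterdir() if p.is_file()}
    else:
        target_names = set()  # target folder missing: everything still needs copying
    return sorted(source_names - target_names)
```

An empty list means every file from the CUDA folder is already present in the Visual Studio folder.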

For CUDA support in llama-cpp-python, refer to the official llama-cpp-python documentation.

Set the environment variable for the CUDA compiler (nvcc.exe) in PowerShell:
```
$env:CUDACXX="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvcc.exe"
```

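Set the CMake arguments for the build. Use the PowerShell `$env:` form like the other variables; the cmd.exe `set NAME=value` form does not set an environment variable in PowerShell:
```
$env:CMAKE_ARGS='-DGGML_CUDA=on -DCMAKE_GENERATOR_TOOLSET="cuda=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8"'
```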

Set the environment variable for the CUDA toolkit directory:
```
$env:CudaToolkitDir="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\"
```
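
The three variables above can also be assembled in Python and passed to a child process, which avoids depending on shell-specific syntax. This is a sketch, not part of llama-cpp-python: `cuda_build_env` is a hypothetical helper, and it only reproduces the values shown in this guide for a given CUDA root.

```python
import os


def cuda_build_env(cuda_root, base_env=None):
    """Return a copy of the environment with the CUDA build variables set.

    cuda_root is the CUDA install directory, with or without a trailing backslash.
    """
    env = dict(os.environ if base_env is None else base_env)
    root = cuda_root.rstrip("\\")
    env["CUDACXX"] = root + "\\bin\\nvcc.exe"
    env["CMAKE_ARGS"] = (
        '-DGGML_CUDA=on -DCMAKE_GENERATOR_TOOLSET="cuda=' + root + '"'
    )
    env["CudaToolkitDir"] = root + "\\"  # this guide sets it with a trailing backslash
    return env
```

The returned dictionary can be given to `subprocess.run(..., env=cuda_build_env(...))` when invoking pip, so the variables apply only to that one build and not to the rest of the session.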

Install or force-reinstall the llama-cpp-python package in the same PowerShell session, so that the variables set above are visible to the build (compiling may take 30-50 minutes):
```
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir --verbose
```
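
After the build finishes, a quick check can indicate whether the installed package was compiled with GPU support. The sketch below assumes the package exposes the low-level binding `llama_supports_gpu_offload` at the top level of `llama_cpp`; if your version does not, the hypothetical helper `llama_gpu_offload_supported` simply returns `None`.

```python
import importlib


def llama_gpu_offload_supported():
    """Return True/False as reported by llama_cpp, or None if it cannot be checked."""
    try:
        llama_cpp = importlib.import_module("llama_cpp")
    except Exception:
        # Not installed, or the native library failed to load.
        return None
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if not callable(check):
        return None  # assumed binding is not present in this version
    return bool(check())


if __name__ == "__main__":
    print("GPU offload supported:", llama_gpu_offload_supported())
```

A result of `False` usually means the package was built without CUDA, in which case recheck the environment variables and rerun the install command.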

Install PyTorch, torchvision, and torchaudio from the CUDA 12.6 wheel index:
```
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
```
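
To confirm that the PyTorch install can see the GPU, a short check along these lines may help. `torch_cuda_status` is a hypothetical helper for this example; it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and returns `None` when PyTorch cannot be imported.

```python
import importlib


def torch_cuda_status():
    """Return a small dict describing PyTorch's CUDA state, or None if torch is absent."""
    try:
        torch = importlib.import_module("torch")
    except Exception:
        return None
    return {
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_status())
```

If `built_for_cuda` is `None`, a CPU-only wheel was installed, which typically happens when the `--index-url` option is omitted.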