CUDA, CUTENSOR, CUQUANTUM #580

Open
pcchen opened this issue Mar 15, 2025 · 11 comments

@pcchen (Collaborator) commented Mar 15, 2025

It seems that if I turn the CUDA option on via -DUSE_CUDA=ON, then cuTENSOR and cuQuantum are automatically turned on as well. If this is the case, then maybe we don't need -DUSE_CUTENSOR=ON and -DUSE_CUQUANTUM=ON?

@IvanaGyro (Collaborator) commented:

Are you discussing the presets created in #579?

With the current settings (without #579), enabling CUDA will not automatically enable cuTENSOR and cuQuantum.

In #579, CMakeLists.txt follows the current settings: cuTENSOR and cuQuantum are not enabled when configuring with the command cmake -DUSE_CUDA. They are only enabled together with CUDA when using presets; for example, cmake --preset openblas-cuda enables both.

I updated the PR message of #579.

@pcchen (Collaborator, Author) commented Mar 16, 2025

Question 1:

Can one build GPU support without using cuTENSOR and cuQuantum?

@pcchen (Collaborator, Author) commented Mar 16, 2025

With commit 46dc390, which I assume reflects the current settings:

If I do cmake -DUSE_CUDA=ON, it seems to automatically turn on cuTENSOR/cuQuantum:

-- ------------------------------------------------------------------------
--   Project Cytnx, A Cross-section of Python & C++,Tensor network library 
-- ------------------------------------------------------------------------
-- 
-- /home/pcchen/github/Cytnx/cmake/Modules
--  Generator: Unix Makefiles
--  Build Target: -
--  Installation Prefix: 
--  Version: 1.0.0
-- The CXX compiler identification is GNU 13.3.0
-- The C compiler identification is GNU 13.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.83.0/BoostConfig.cmake (found version "1.83.0")  
-- backend = cytnx
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl  
-- LAPACK found: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.8.93") 
-- Looking for cuTENSOR in /home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive
--  cudaver: 12
-- ok
-- Build with CuTensor: YES
-- CuTensor: libdir:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12 incdir:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/include libs:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12/libcutensor.so;/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12/libcutensorMg.so
-- Looking for cuTENSOR in /home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive
--  cudaver: 12
-- ok
-- Build with CuQuantum: YES
-- CuQuantum: libdir:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib incdir:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/include libs:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib/libcutensornet.so;/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib/libcustatevec.so
--  Build CUDA Support: YES
--   - CUDA Version: 
--   - CUDA Toolkit Root: 
--   - Internal macro switch: GPU/CUDA
--   - Cudatoolkit include dir: /usr/local/cuda/targets/x86_64-linux/include
--   - Cudatoolkit lib dir: /usr/local/cuda/lib64
--   - CuSolver library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusolver.so
--   - Curand library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcurand.so
--   - CuBlas library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcublas.so
--   - Cuda rt library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudart_static.a -ldl
--   - Cuda devrt library:  -lrt -lcudadevrt
--   - Cuda cusparse library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusparse.so
--  Build OMP Support: NO
-- Found Python: /usr/bin/python3 (found version "3.12.3") found components: Interpreter Development Development.Module Development.Embed 
-- Found pybind11: /usr/include (found version "2.11.1")
-- pybind11 include dir: /home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12
-- pybind11 include dir: /home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib
--  Build Python Wrapper: YES
--   - Python Excutable  : 
--   - Python Headers    : 
--   - Python Library    : 
--  Build Documentation: NO
-- |= Final FLAGS infomation for install >>>>> 
--     CXX Compiler: /usr/bin/c++
--     CXX Flags: 
--     BLAS and LAPACK Libraries: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
--     Link libraries: 
-- 
-- 
-- 
-- Configuring done (2.9s)
-- Generating done (0.0s)
-- Build files have been written to: /home/pcchen/github/Cytnx/xxx

@pcchen (Collaborator, Author) commented Mar 16, 2025

I have to explicitly turn off cuTENSOR/cuQuantum: cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF ..

-- ------------------------------------------------------------------------
--   Project Cytnx, A Cross-section of Python & C++,Tensor network library 
-- ------------------------------------------------------------------------
-- 
-- /home/pcchen/github/Cytnx/cmake/Modules
--  Generator: Unix Makefiles
--  Build Target: -
--  Installation Prefix: 
--  Version: 1.0.0
-- The CXX compiler identification is GNU 13.3.0
-- The C compiler identification is GNU 13.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.83.0/BoostConfig.cmake (found version "1.83.0")  
-- backend = cytnx
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl  
-- LAPACK found: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.8.93") 
--  Build CUDA Support: YES
--   - CUDA Version: 
--   - CUDA Toolkit Root: 
--   - Internal macro switch: GPU/CUDA
--   - Cudatoolkit include dir: /usr/local/cuda/targets/x86_64-linux/include
--   - Cudatoolkit lib dir: /usr/local/cuda/lib64
--   - CuSolver library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusolver.so
--   - Curand library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcurand.so
--   - CuBlas library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcublas.so
--   - Cuda rt library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudart_static.a -ldl
--   - Cuda devrt library:  -lrt -lcudadevrt
--   - Cuda cusparse library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusparse.so
--  Build OMP Support: NO
-- Found Python: /usr/bin/python3 (found version "3.12.3") found components: Interpreter Development Development.Module Development.Embed 
-- Found pybind11: /usr/include (found version "2.11.1")
-- pybind11 include dir: 
-- pybind11 include dir: 
--  Build Python Wrapper: YES
--   - Python Excutable  : 
--   - Python Headers    : 
--   - Python Library    : 
--  Build Documentation: NO
-- |= Final FLAGS infomation for install >>>>> 
--     CXX Compiler: /usr/bin/c++
--     CXX Flags: 
--     BLAS and LAPACK Libraries: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
--     Link libraries: 
-- 
-- 
-- 
-- Configuring done (2.9s)
-- Generating done (0.0s)
-- Build files have been written to: /home/pcchen/github/Cytnx/xxx

@pcchen (Collaborator, Author) commented Mar 16, 2025

Hmm. I just found out that:

If I do cmake -DUSE_CUDA=ON .. then it compiles OK.

If I do cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF .., I get the following errors:


[  1%] Building CXX object CMakeFiles/cytnx.dir/src/Bond.cpp.o
In file included from /usr/include/c++/13/bits/stl_tempbuf.h:61,
                 from /usr/include/c++/13/bits/stl_algo.h:69,
                 from /usr/include/c++/13/algorithm:61,
                 from /home/pcchen/github/Cytnx/src/RegularNetwork.cpp:1:
/usr/include/c++/13/bits/stl_construct.h: In instantiation of ‘void std::_Construct(_Tp*, _Args&& ...) [with _Tp = cytnx::Node; _Args = {shared_ptr<cytnx::Node>&}]’:
/usr/include/c++/13/bits/alloc_traits.h:661:19:   required from ‘static void std::allocator_traits<std::allocator<void> >::construct(allocator_type&, _Up*, _Args&& ...) [with _Up = cytnx::Node; _Args = {std::shared_ptr<cytnx::Node>&}; allocator_type = std::allocator<void>]’
/usr/include/c++/13/bits/shared_ptr_base.h:604:39:   required from ‘std::_Sp_counted_ptr_inplace<_Tp, _Alloc, _Lp>::_Sp_counted_ptr_inplace(_Alloc, _Args&& ...) [with _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node; _Alloc = std::allocator<void>; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr_base.h:971:16:   required from ‘std::__shared_count<_Lp>::__shared_count(_Tp*&, std::_Sp_alloc_shared_tag<_Alloc>, _Args&& ...) [with _Tp = cytnx::Node; _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr_base.h:1712:14:   required from ‘std::__shared_ptr<_Tp, _Lp>::__shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr.h:464:59:   required from ‘std::shared_ptr<_Tp>::shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node]’
/usr/include/c++/13/bits/shared_ptr.h:1009:14:   required from ‘std::shared_ptr<typename std::enable_if<(! std::is_array< <template-parameter-1-1> >::value), _Tp>::type> std::make_shared(_Args&& ...) [with _Tp = cytnx::Node; _Args = {shared_ptr<cytnx::Node>&}; typename enable_if<(! is_array< <template-parameter-1-1> >::value), _Tp>::type = cytnx::Node]’
/home/pcchen/github/Cytnx/src/RegularNetwork.cpp:1099:58:   required from here
/usr/include/c++/13/bits/stl_construct.h:119:7: error: no matching function for call to ‘cytnx::Node::Node(std::shared_ptr<cytnx::Node>&)’
  119 |       ::new((void*)__p) _Tp(std::forward<_Args>(__args)...);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/pcchen/github/Cytnx/include/Network.hpp:13,
                 from /home/pcchen/github/Cytnx/src/RegularNetwork.cpp:7:
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:54:5: note: candidate: ‘cytnx::Node::Node(std::shared_ptr<cytnx::Node>, std::shared_ptr<cytnx::Node>, const cytnx::UniTensor&)’
   54 |     Node(std::shared_ptr<Node> in_left, std::shared_ptr<Node> in_right,
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:54:5: note:   candidate expects 3 arguments, 1 provided
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:28:5: note: candidate: ‘cytnx::Node::Node(const cytnx::Node&)’
   28 |     Node(const Node& rhs)
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:28:22: note:   no known conversion for argument 1 from ‘std::shared_ptr<cytnx::Node>’ to ‘const cytnx::Node&’
   28 |     Node(const Node& rhs)
      |          ~~~~~~~~~~~~^~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:26:5: note: candidate: ‘cytnx::Node::Node()’
   26 |     Node() : is_assigned(false) {}
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:26:5: note:   candidate expects 0 arguments, 1 provided

@IvanaGyro (Collaborator) commented:

> I have to explicitly turn off cuTENSOR/cuQuantum: cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF ..

I checked CMakeLists.txt. Yes, the current default settings in Install.sh and CMakeLists.txt are not aligned: USE_CUTENSOR and USE_CUQUANTUM default to ON in CMakeLists.txt but default to OFF in Install.sh. As you found, the user has to explicitly set USE_CUTENSOR and USE_CUQUANTUM to OFF if they want to enable CUDA without enabling cuTENSOR and cuQuantum.

With #579, the behavior in CMakeLists.txt is aligned with the behavior in the current Install.sh. cuTENSOR and cuQuantum are not enabled with CUDA.

@pcchen (Collaborator, Author) commented Mar 16, 2025

Because I fail to compile GPU support when I turn off cuTENSOR/cuQuantum, I am wondering, with the current code:

  • Can one compile GPU support using CUDA, but WITHOUT cuTENSOR/cuQuantum?
  • Can one turn on only one of cuTENSOR/cuQuantum, or do they have to be turned on together?

@IvanaGyro (Collaborator) commented:

No, I can't either. I think it's a bug.

Changing this line

std::shared_ptr<Node> root = std::make_shared<Node>(this->CtTree.nodes_container.back());

to

std::shared_ptr<Node> root = this->CtTree.nodes_container.back();

makes the build successful.
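
For context, a minimal standalone sketch of the mismatch, using a hypothetical stripped-down Node that mirrors only the constructors quoted in the compiler output above (not the actual Cytnx class):

// Hypothetical reduction of the situation at RegularNetwork.cpp line 1099;
// for illustration only, not the actual Cytnx code.
#include <memory>
#include <vector>

struct Node {
  Node() {}                 // Node()
  Node(const Node&) {}      // Node(const Node&)
  // There is no Node(std::shared_ptr<Node>) constructor, which is what the
  // "no matching function" error above is complaining about.
};

int main() {
  // Stand-in for this->CtTree.nodes_container (a container of shared_ptr<Node>).
  std::vector<std::shared_ptr<Node>> nodes_container;
  nodes_container.push_back(std::make_shared<Node>());

  // Original line: make_shared<Node>(x) forwards a shared_ptr<Node>& to a Node
  // constructor, and no such constructor exists -> compile error.
  // std::shared_ptr<Node> root = std::make_shared<Node>(nodes_container.back());

  // Fix: copy the existing shared_ptr; root shares ownership of the last node
  // instead of trying to construct a new Node from the pointer.
  std::shared_ptr<Node> root = nodes_container.back();
  (void)root;
  return 0;
}

One thing that may be worth double-checking: if the original intent was to deep-copy the last node, that would be std::make_shared<Node>(*this->CtTree.nodes_container.back()), dereferencing to hit the copy constructor; the one-line fix instead makes root share the existing node.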

@pcchen (Collaborator, Author) commented Mar 16, 2025

This is to confirm that, after changing line 1099 to

std::shared_ptr<Node> root = this->CtTree.nodes_container.back();

it compiles OK.

@yingjerkao (Collaborator) commented:

Following this thread, I propose using cuQuantum as the main dependency to avoid the complicated compile options.

@yingjerkao (Collaborator) commented:

Also, I wonder whether CUTT is already integrated into cuTensorNet in cuQuantum.
