CUDA, CUTENSOR, CUQUANTUM #580

Open
pcchen opened this issue Mar 15, 2025 · 11 comments

@pcchen (Collaborator) commented Mar 15, 2025

It seems that if I turn the CUDA option on via -DUSE_CUDA=ON, then cuTENSOR and cuQuantum are automatically turned on as well. If this is the case, then maybe we don't need -DUSE_CUTENSOR=ON and -DUSE_CUQUANTUM=ON?

@IvanaGyro (Collaborator) commented:

Are you discussing the presets created in #579?

With the current settings (without #579), enabling CUDA will not automatically enable cuTENSOR and cuQuantum.

In #579, CMakeLists.txt follows the current settings: cuTENSOR and cuQuantum are not enabled when configuring with the command cmake -DUSE_CUDA. They are only enabled together with CUDA when using presets; for example, cmake --preset openblas-cuda enables both.

I updated the PR message of #579.

@pcchen (Collaborator, Author) commented Mar 16, 2025

Question 1:

Can one build GPU support without using cuTENSOR and cuQuantum?

@pcchen (Collaborator, Author) commented Mar 16, 2025

With commit 46dc390, which I assume reflects the current settings:

If I do cmake -DUSE_CUDA=ON, it seems to automatically turn on cuTENSOR/cuQuantum:

-- ------------------------------------------------------------------------
--   Project Cytnx, A Cross-section of Python & C++,Tensor network library 
-- ------------------------------------------------------------------------
-- 
-- /home/pcchen/github/Cytnx/cmake/Modules
--  Generator: Unix Makefiles
--  Build Target: -
--  Installation Prefix: 
--  Version: 1.0.0
-- The CXX compiler identification is GNU 13.3.0
-- The C compiler identification is GNU 13.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.83.0/BoostConfig.cmake (found version "1.83.0")  
-- backend = cytnx
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl  
-- LAPACK found: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.8.93") 
-- Looking for cuTENSOR in /home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive
--  cudaver: 12
-- ok
-- Build with CuTensor: YES
-- CuTensor: libdir:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12 incdir:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/include libs:/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12/libcutensor.so;/home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12/libcutensorMg.so
-- Looking for cuTENSOR in /home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive
--  cudaver: 12
-- ok
-- Build with CuQuantum: YES
-- CuQuantum: libdir:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib incdir:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/include libs:/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib/libcutensornet.so;/home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib/libcustatevec.so
--  Build CUDA Support: YES
--   - CUDA Version: 
--   - CUDA Toolkit Root: 
--   - Internal macro switch: GPU/CUDA
--   - Cudatoolkit include dir: /usr/local/cuda/targets/x86_64-linux/include
--   - Cudatoolkit lib dir: /usr/local/cuda/lib64
--   - CuSolver library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusolver.so
--   - Curand library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcurand.so
--   - CuBlas library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcublas.so
--   - Cuda rt library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudart_static.a -ldl
--   - Cuda devrt library:  -lrt -lcudadevrt
--   - Cuda cusparse library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusparse.so
--  Build OMP Support: NO
-- Found Python: /usr/bin/python3 (found version "3.12.3") found components: Interpreter Development Development.Module Development.Embed 
-- Found pybind11: /usr/include (found version "2.11.1")
-- pybind11 include dir: /home/pcchen/src/libcutensor-linux-x86_64-2.2.0.0-archive/lib/12
-- pybind11 include dir: /home/pcchen/src/cuquantum-linux-x86_64-24.11.0.21_cuda12-archive/lib
--  Build Python Wrapper: YES
--   - Python Excutable  : 
--   - Python Headers    : 
--   - Python Library    : 
--  Build Documentation: NO
-- |= Final FLAGS infomation for install >>>>> 
--     CXX Compiler: /usr/bin/c++
--     CXX Flags: 
--     BLAS and LAPACK Libraries: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
--     Link libraries: 
-- 
-- 
-- 
-- Configuring done (2.9s)
-- Generating done (0.0s)
-- Build files have been written to: /home/pcchen/github/Cytnx/xxx

@pcchen (Collaborator, Author) commented Mar 16, 2025

I have to explicitly turn off cuTENSOR/cuQuantum: cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF ..

-- ------------------------------------------------------------------------
--   Project Cytnx, A Cross-section of Python & C++,Tensor network library 
-- ------------------------------------------------------------------------
-- 
-- /home/pcchen/github/Cytnx/cmake/Modules
--  Generator: Unix Makefiles
--  Build Target: -
--  Installation Prefix: 
--  Version: 1.0.0
-- The CXX compiler identification is GNU 13.3.0
-- The C compiler identification is GNU 13.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.83.0/BoostConfig.cmake (found version "1.83.0")  
-- backend = cytnx
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl  
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl  
-- LAPACK found: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
-- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.8.93") 
--  Build CUDA Support: YES
--   - CUDA Version: 
--   - CUDA Toolkit Root: 
--   - Internal macro switch: GPU/CUDA
--   - Cudatoolkit include dir: /usr/local/cuda/targets/x86_64-linux/include
--   - Cudatoolkit lib dir: /usr/local/cuda/lib64
--   - CuSolver library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusolver.so
--   - Curand library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcurand.so
--   - CuBlas library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcublas.so
--   - Cuda rt library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudart_static.a -ldl
--   - Cuda devrt library:  -lrt -lcudadevrt
--   - Cuda cusparse library: /usr/local/cuda-12.8/targets/x86_64-linux/lib/libcusparse.so
--  Build OMP Support: NO
-- Found Python: /usr/bin/python3 (found version "3.12.3") found components: Interpreter Development Development.Module Development.Embed 
-- Found pybind11: /usr/include (found version "2.11.1")
-- pybind11 include dir: 
-- pybind11 include dir: 
--  Build Python Wrapper: YES
--   - Python Excutable  : 
--   - Python Headers    : 
--   - Python Library    : 
--  Build Documentation: NO
-- |= Final FLAGS infomation for install >>>>> 
--     CXX Compiler: /usr/bin/c++
--     CXX Flags: 
--     BLAS and LAPACK Libraries: /opt/intel/oneapi/mkl/2025.0/lib/libmkl_rt.so;-lm;-ldl;-lm;-ldl
--     Link libraries: 
-- 
-- 
-- 
-- Configuring done (2.9s)
-- Generating done (0.0s)
-- Build files have been written to: /home/pcchen/github/Cytnx/xxx

@pcchen (Collaborator, Author) commented Mar 16, 2025

Hmm. I just found out that:

If I do cmake -DUSE_CUDA=ON .. then it compiles OK.

If I do cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF .., I get the following errors:


[  1%] Building CXX object CMakeFiles/cytnx.dir/src/Bond.cpp.o
In file included from /usr/include/c++/13/bits/stl_tempbuf.h:61,
                 from /usr/include/c++/13/bits/stl_algo.h:69,
                 from /usr/include/c++/13/algorithm:61,
                 from /home/pcchen/github/Cytnx/src/RegularNetwork.cpp:1:
/usr/include/c++/13/bits/stl_construct.h: In instantiation of ‘void std::_Construct(_Tp*, _Args&& ...) [with _Tp = cytnx::Node; _Args = {shared_ptr<cytnx::Node>&}]’:
/usr/include/c++/13/bits/alloc_traits.h:661:19:   required from ‘static void std::allocator_traits<std::allocator<void> >::construct(allocator_type&, _Up*, _Args&& ...) [with _Up = cytnx::Node; _Args = {std::shared_ptr<cytnx::Node>&}; allocator_type = std::allocator<void>]’
/usr/include/c++/13/bits/shared_ptr_base.h:604:39:   required from ‘std::_Sp_counted_ptr_inplace<_Tp, _Alloc, _Lp>::_Sp_counted_ptr_inplace(_Alloc, _Args&& ...) [with _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node; _Alloc = std::allocator<void>; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr_base.h:971:16:   required from ‘std::__shared_count<_Lp>::__shared_count(_Tp*&, std::_Sp_alloc_shared_tag<_Alloc>, _Args&& ...) [with _Tp = cytnx::Node; _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr_base.h:1712:14:   required from ‘std::__shared_ptr<_Tp, _Lp>::__shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node; __gnu_cxx::_Lock_policy _Lp = __gnu_cxx::_S_atomic]’
/usr/include/c++/13/bits/shared_ptr.h:464:59:   required from ‘std::shared_ptr<_Tp>::shared_ptr(std::_Sp_alloc_shared_tag<_Tp>, _Args&& ...) [with _Alloc = std::allocator<void>; _Args = {std::shared_ptr<cytnx::Node>&}; _Tp = cytnx::Node]’
/usr/include/c++/13/bits/shared_ptr.h:1009:14:   required from ‘std::shared_ptr<typename std::enable_if<(! std::is_array< <template-parameter-1-1> >::value), _Tp>::type> std::make_shared(_Args&& ...) [with _Tp = cytnx::Node; _Args = {shared_ptr<cytnx::Node>&}; typename enable_if<(! is_array< <template-parameter-1-1> >::value), _Tp>::type = cytnx::Node]’
/home/pcchen/github/Cytnx/src/RegularNetwork.cpp:1099:58:   required from here
/usr/include/c++/13/bits/stl_construct.h:119:7: error: no matching function for call to ‘cytnx::Node::Node(std::shared_ptr<cytnx::Node>&)’
  119 |       ::new((void*)__p) _Tp(std::forward<_Args>(__args)...);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/pcchen/github/Cytnx/include/Network.hpp:13,
                 from /home/pcchen/github/Cytnx/src/RegularNetwork.cpp:7:
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:54:5: note: candidate: ‘cytnx::Node::Node(std::shared_ptr<cytnx::Node>, std::shared_ptr<cytnx::Node>, const cytnx::UniTensor&)’
   54 |     Node(std::shared_ptr<Node> in_left, std::shared_ptr<Node> in_right,
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:54:5: note:   candidate expects 3 arguments, 1 provided
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:28:5: note: candidate: ‘cytnx::Node::Node(const cytnx::Node&)’
   28 |     Node(const Node& rhs)
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:28:22: note:   no known conversion for argument 1 from ‘std::shared_ptr<cytnx::Node>’ to ‘const cytnx::Node&’
   28 |     Node(const Node& rhs)
      |          ~~~~~~~~~~~~^~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:26:5: note: candidate: ‘cytnx::Node::Node()’
   26 |     Node() : is_assigned(false) {}
      |     ^~~~
/home/pcchen/github/Cytnx/include/contraction_tree.hpp:26:5: note:   candidate expects 0 arguments, 1 provided

@IvanaGyro (Collaborator) commented:

> I have to explicitly turn off cuTENSOR/cuQuantum: cmake -DUSE_CUDA=ON -DUSE_CUTENSOR=OFF -DUSE_CUQUANTUM=OFF ..

I checked CMakeLists.txt. Yes, the current default settings in Install.sh and CMakeLists.txt are not aligned: USE_CUTENSOR and USE_CUQUANTUM default to ON in CMakeLists.txt but default to OFF in Install.sh. As you found, the user has to explicitly set USE_CUTENSOR and USE_CUQUANTUM to OFF if they want to enable CUDA without enabling cuTENSOR and cuQuantum.

With #579, the behavior in CMakeLists.txt is aligned with the behavior in the current Install.sh. cuTENSOR and cuQuantum are not enabled with CUDA.

@pcchen (Collaborator, Author) commented Mar 16, 2025

Because I fail to compile GPU support when I turn off cuTENSOR/cuQuantum, I am wondering, with the current code:

  • Can one compile GPU support using CUDA, but WITHOUT cuTENSOR/cuQuantum?
  • Can one turn on only one of cuTENSOR/cuQuantum, or do they have to be turned on together?

@IvanaGyro (Collaborator) commented:

No, I can't either. I think it's a bug.

Changing this line

std::shared_ptr<Node> root = std::make_shared<Node>(this->CtTree.nodes_container.back());

to

std::shared_ptr<Node> root = this->CtTree.nodes_container.back();

makes the build successful.
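
For context, a minimal standalone sketch of the mismatch, using a hypothetical stripped-down Node that mirrors only the constructors quoted in the compiler output above (not the actual Cytnx class):

// Hypothetical reduction of the situation at RegularNetwork.cpp line 1099;
// for illustration only, not the actual Cytnx code.
#include <memory>
#include <vector>

struct Node {
  Node() {}                 // Node()
  Node(const Node&) {}      // Node(const Node&)
  // There is no Node(std::shared_ptr<Node>) constructor, which is what the
  // "no matching function" error above is complaining about.
};

int main() {
  // Stand-in for this->CtTree.nodes_container (a container of shared_ptr<Node>).
  std::vector<std::shared_ptr<Node>> nodes_container;
  nodes_container.push_back(std::make_shared<Node>());

  // Original line: make_shared<Node>(x) forwards a shared_ptr<Node>& to a Node
  // constructor, and no such constructor exists -> compile error.
  // std::shared_ptr<Node> root = std::make_shared<Node>(nodes_container.back());

  // Fix: copy the existing shared_ptr; root shares ownership of the last node
  // instead of trying to construct a new Node from the pointer.
  std::shared_ptr<Node> root = nodes_container.back();
  (void)root;
  return 0;
}

One thing that may be worth double-checking: if the original intent was to deep-copy the last node, that would be std::make_shared<Node>(*this->CtTree.nodes_container.back()), dereferencing to hit the copy constructor; the one-line fix instead makes root share the existing node.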

@pcchen (Collaborator, Author) commented Mar 16, 2025

This is to confirm that, after changing line 1099 to

std::shared_ptr<Node> root = this->CtTree.nodes_container.back();

it compiles OK.

@yingjerkao (Collaborator) commented:

Following this thread, I propose using cuQuantum as the main dependency to avoid the complicated compile options.

@yingjerkao (Collaborator) commented:

Also, I wonder whether CUTT is already integrated into cuTensorNet in cuQuantum.
