Fail train_full_pipeline after multiple hours #231

Open
jongeorge1999 opened this issue Nov 30, 2024 · 0 comments

I am unable to run train_full_pipeline to completion. After several hours it fails with the following error stack:

Loading Vanilla 3DGS model config output/vanilla_gs/lh_5fps/...
Found image extension .png
Vanilla 3DGS Loaded.
211 training images detected.
The model has been trained for 7000 steps.
0.854081 M gaussians detected.
Binding radiance cloud to surface mesh...
Building UV map done.
Traceback (most recent call last):
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/root/miniconda3/envs/sugar/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/c/users/george/desktop/sugar/train.py", line 197, in <module>
    refined_mesh_path = extract_mesh_and_texture_from_refined_sugar(refined_mesh_args)
  File "/mnt/c/users/george/desktop/sugar/sugar_extractors/refined_mesh.py", line 195, in extract_mesh_and_texture_from_refined_sugar
    textured_mesh = compute_textured_mesh_for_sugar_mesh(
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/c/users/george/desktop/sugar/sugar_extractors/texture.py", line 104, in compute_textured_mesh_for_sugar_mesh
    rasterizer = MeshRasterizer(
  File "/mnt/c/users/george/desktop/sugar/sugar_utils/mesh_rasterization.py", line 102, in __init__
    self.gl_context = dr.RasterizeGLContext()
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/nvdiffrast/torch/ops.py", line 228, in __init__
    self.cpp_wrapper = _get_plugin(gl=True).RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/nvdiffrast/torch/ops.py", line 125, in _get_plugin
    torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=common_opts+cc_opts, extra_cuda_cflags=common_opts+['-lineinfo'], extra_ldflags=ldflags, with_cuda=True, verbose=False)
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nvdiffrast_plugin_gl': [1/2] c++ -MMD -MF common.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin_gl -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/TH -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/envs/sugar/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -DNVDR_TORCH -c /root/miniconda3/envs/sugar/lib/python3.9/site-packages/nvdiffrast/common/common.cpp -o common.o
FAILED: common.o
c++ -MMD -MF common.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin_gl -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/TH -isystem /root/miniconda3/envs/sugar/lib/python3.9/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/envs/sugar/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -DNVDR_TORCH -c /root/miniconda3/envs/sugar/lib/python3.9/site-packages/nvdiffrast/common/common.cpp -o common.o
In file included from /usr/local/cuda/include/cuda_runtime.h:83,
                 from /root/miniconda3/envs/sugar/lib/python3.9/site-packages/nvdiffrast/common/common.cpp:9:
/usr/local/cuda/include/crt/host_config.h:1:1: error: stray ‘\’ in program
    1 | \/*
      | ^
ninja: build stopped: subcommand failed.

Can anybody make sense of this or advise me on how to proceed?

I'm running an RTX 4080 on Windows 11 through WSL2.
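
In case it helps with diagnosis: the compiler error points at line 1, column 1 of host_config.h, so a quick check of the first bytes of that file may show whether it really starts with a backslash. Below is a minimal sketch of such a check. The header path is copied from the log above; the function names are hypothetical and only for illustration. It only inspects the file and does not attempt any fix.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def leading_bytes(path, n=16):
    """Return the first n raw bytes of the file at path."""
    with open(path, "rb") as f:
        return f.read(n)


def starts_with_stray_backslash(head: bytes) -> bool:
    """True if the data begins with a backslash (ignoring a UTF-8 BOM).

    A C/C++ header normally opens with a comment or a preprocessor
    directive, so a leading backslash matches the compiler's complaint.
    """
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    return head.startswith(b"\\")


if __name__ == "__main__":
    # Path taken from the error message in the log; adjust if CUDA lives elsewhere.
    header = Path("/usr/local/cuda/include/crt/host_config.h")
    if header.exists():
        head = leading_bytes(header)
        print("first bytes:", repr(head))
        print("stray backslash at start:", starts_with_stray_backslash(head))
    else:
        print(f"{header} not found")
```

If this reports a stray backslash, the header on disk differs from what the compiler expects, which would suggest looking at the CUDA toolkit installation under WSL2 rather than at the SuGaR or nvdiffrast code. That is an assumption on my side, not something I have confirmed.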
