Getting error while training the model #30

Open
jitender-code opened this issue Jan 24, 2021 · 0 comments

I tried to train this model on the LayoutNet datasets with all the default parameters given in the README (https://github.com/sunset1995/HorizonNet).
I ran the following command:
(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn

and got the following error:

Epoch: 0%| | 0/300 [00:00<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 181, in <module>
iterator_train = iter(loader_train)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
w.start()
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000023358098158>: attribute lookup <lambda> on __main__ failed

(HorizonNet) D:\HorizonNet-master>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
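
Note on the error above: on Windows, DataLoader workers are started with the spawn method, so everything handed to a worker has to be pickled, and functions are pickled by name rather than by value. Below is a minimal standard-library sketch of that failure mode. It assumes the unpicklable object is an anonymous function passed to the loader, which the trace suggests but does not prove. The names `seed_worker`, `anonymous_init`, and `is_picklable` are hypothetical and not taken from train.py.

```python
import pickle


def seed_worker(worker_id):
    """Module-level function: pickle can refer to it by its qualified name."""
    return worker_id


# Anonymous function: its name is "<lambda>", which pickle cannot look up
# as an attribute of the defining module.
anonymous_init = lambda worker_id: worker_id


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    print("named function picklable:", is_picklable(seed_worker))
    print("lambda picklable:", is_picklable(anonymous_init))
```

If that assumption holds, replacing the lambda with a module-level function may be an alternative to disabling the workers entirely.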

I then changed line 114 of "train.py" to use "num_workers=0".
I am using Anaconda with a new environment named HorizonNet, created with Python 3.6.
Now I get the following error instead:

(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn --epochs 50
Train ep1: 0%| | 0/204 [00:01<?, ?it/s]
Epoch: 0%| | 0/50 [00:01<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 191, in <module>
losses = feed_forward(net, x, y_bon, y_cor)
File "train.py", line 26, in feed_forward
y_bon_, y_cor_ = net(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 242, in forward
feature = self.reduce_height_module(conv_list, x.shape[3]//self.step_cols)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 166, in forward
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "D:\HorizonNet-master\model.py", line 166, in <listcomp>
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 138, in forward
x = self.layer(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 124, in forward
return self.layers(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 31, in forward
return lr_pad(x, self.padding)
File "D:\HorizonNet-master\model.py", line 21, in lr_pad
return torch.cat([x[..., -padding:], x, x[..., :padding]], dim=3)
RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 6.00 GiB total capacity; 4.28 GiB already allocated; 4.91 MiB free; 4.34 GiB reserved in total by PyTorch)
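
To make the numbers in that message easier to read, here is a small standard-library sketch that extracts the quoted sizes and works out how far short the allocation fell. The helper name `parse_oom` and the derived quantities are illustrative assumptions, not part of PyTorch or HorizonNet.

```python
import re

# The message text, copied from the RuntimeError above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 6.00 GiB total "
    "capacity; 4.28 GiB already allocated; 4.91 MiB free; 4.34 GiB reserved "
    "in total by PyTorch)"
)

UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Return the sizes quoted in a CUDA out-of-memory message, in MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)",
        "total": r"([\d.]+) (KiB|MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (KiB|MiB|GiB) already allocated",
        "free": r"([\d.]+) (KiB|MiB|GiB) free",
        "reserved": r"([\d.]+) (KiB|MiB|GiB) reserved",
    }
    sizes = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[key] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
# How much more memory the failed allocation needed than was free.
shortfall = sizes["requested"] - sizes["free"]
# Memory that is neither free nor reserved by PyTorch; it may be held by
# the CUDA context, the display, or other processes (an interpretation).
outside = sizes["total"] - sizes["reserved"] - sizes["free"]

if __name__ == "__main__":
    print(f"shortfall: {shortfall:.2f} MiB, outside PyTorch: {outside:.2f} MiB")
```

Read this way, the 6 GiB card is essentially full during the first forward pass, so the usual things to try would be a smaller training batch or smaller inputs.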

Please help.
