
implement NxN conv (N>1) #25

Open · wants to merge 2 commits into master

Conversation

@capitaso capitaso commented Nov 25, 2020

Hello, I implemented the NxN (N>1) convolution case in AMC. You can run a test with the VGG16 model as follows:
bash ./scripts/search_vgg16_0.5flops.sh

@LiYunJamesPhD

@capitaso Thank you for implementing NxN conv. However, there is a serious error in your implementation: "least_square_sklearn" cannot run because of 4-dimensional inputs.

Thanks,

@capitaso
Author

@li-yun Thanks for reporting the error. Can you share the error message? And which model did you try to prune?

The 4-dimensional inputs are reshaped into a 2-dimensional matrix before least_square_sklearn is called, so that should not happen, but I may have done something wrong.
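As a sketch of what that reshape has to do (shapes are hypothetical, and least_square_sklearn is assumed to wrap sklearn's LinearRegression):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical 4-D calibration tensors: (N samples, channels, H, W).
X4 = np.random.randn(50, 16, 4, 4)
Y4 = np.random.randn(50, 64, 4, 4)

# Flatten so each (sample, spatial position) becomes one regression row;
# sklearn accepts at most 2-D inputs.
X = X4.transpose(0, 2, 3, 1).reshape(-1, X4.shape[1])  # (800, 16)
Y = Y4.transpose(0, 2, 3, 1).reshape(-1, Y4.shape[1])  # (800, 64)

reg = LinearRegression(fit_intercept=False)
reg.fit(X, Y)
print(reg.coef_.shape)  # (64, 16)
```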

@LiYunJamesPhD

@capitaso Sure. I tried to prune a pre-trained VGG16. I also added the following error message.

Traceback (most recent call last):
File "amc_search.py", line 233, in <module>
train(args.train_episode, agent, env, args.output)
File "amc_search.py", line 132, in train
observation2, reward, done, info = env.step(action)
File "/home/liyun/model_compression/amc/env/channel_pruning_env.py", line 99, in step
action, d_prime, preserve_idx = self.prune_kernel(self.prunable_idx[self.cur_ind], action, preserve_idx)
File "/home/liyun/model_compression/amc/env/channel_pruning_env.py", line 284, in prune_kernel
rec_weight = least_square_sklearn(X=masked_X, Y=Y)
File "/home/liyun/model_compression/amc/lib/utils.py", line 130, in least_square_sklearn
reg.fit(X, Y)
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/linear_model/_base.py", line 505, in fit
X, y = self._validate_data(X, y, accept_sparse=['csr', 'csc', 'coo'],
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/base.py", line 432, in _validate_data
X, y = check_X_y(X, y, **check_params)
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 72, in inner_f
return f(**kwargs)
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 795, in check_X_y
X = check_array(X, accept_sparse=accept_sparse,
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 72, in inner_f
return f(**kwargs)
File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 640, in check_array
raise ValueError("Found array with dim %d. %s expected <= 2."
ValueError: Found array with dim 4. Estimator expected <= 2.

Thank you for replying to my message.

@capitaso

@li-yun Thanks for the additional info. I quickly checked: I ran something like the command below and it did not cause the error.

python amc_search.py --job=train --model=vgg16 --ckpt_path=checkpoints/vgg16.pth --dataset=imagenet --data_root=../../datasets/ILSVRC2012 --preserve_ratio=0.5 --lbound=0.2 --rbound=1 --reward=acc_reward --n_calibration_batches 15 --seed 2018

Then, can you tell me in which phase you hit the error: "strategy search" (--job=train) or "export" (--job=export)?

@capitaso

@li-yun Although I am not sure this is the cause of your error, I found a bug related to the first FC layer that occurs in the export phase. I actually used AMC to prune only the convolution layers and did not check the FC layers carefully. If you only want to prune convolution layers, the following change (in "env/channel_pruning_env.py" at line 24) may solve the problem. The bug will be fixed in the next few weeks.

- self.prunable_layer_types = [torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear]
+ self.prunable_layer_types = [torch.nn.modules.conv.Conv2d]

@LiYunJamesPhD

LiYunJamesPhD commented Feb 15, 2021

@capitaso The error was in the search phase. I also plan to prune only the convolution layers.

I believe the error is at the line "rec_weight = least_square_sklearn(X=masked_X, Y=Y)" in "env/channel_pruning_env.py". masked_X and Y are 4-dimensional inputs, with shapes [3000, 16, 32, 32] and [3000, 64, 32, 32] in my case. Because they have more than 2 dimensions, sklearn's linear regression function fails. Do you have any thoughts?
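That diagnosis is easy to reproduce in isolation; a minimal sketch with the reported shapes (batch scaled down):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

masked_X = np.random.randn(30, 16, 32, 32)  # 4-D, like the reported [3000, 16, 32, 32]
Y = np.random.randn(30, 64, 32, 32)

try:
    LinearRegression().fit(masked_X, Y)
    msg = "no error"
except ValueError as e:
    msg = str(e)  # sklearn's input validation rejects anything above 2-D
print(msg)  # mentions "dim 4" and "expected <= 2"
```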

Another thing: I am planning to prune a VGG16 model trained on CIFAR10 rather than ImageNet, so I doubt the command you provided will work as-is for me.

Thanks

@capitaso

capitaso commented Feb 16, 2021

@li-yun And what are the kernel width and height in that layer? If it is 1 x 1, there might be a problem at line 229. I will fix it, but please clarify the kernel width/height.

@LiYunJamesPhD

I used a 3 by 3 kernel in that layer. Yeah. I agree with that.

@capitaso

@li-yun Then, the problem is something else... Can you share the whole network architecture? Pasting the output of "print(model)" will help.

@LiYunJamesPhD

@capitaso Sorry for the late reply. Sure, the following is the network architecture.

vgg(
(feature): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(29): ReLU(inplace=True)
)
(classifier): Linear(in_features=512, out_features=10, bias=True)
)

Thanks

@capitaso

@li-yun Sorry for the late reply. I made a small fix and committed it. Can you try the new version? I think it should work now. If it still does not, I will probably need your source code to debug further.

@LiYunJamesPhD

@capitaso Thank you so much!! I will try the new one.

@LiYunJamesPhD

@capitaso I tried the new one, but I got a different error: IndexError: boolean index did not match indexed array along dimension 1; dimension is 65536 but corresponding boolean dimension is 64.

I guess the problem is in these lines.

231 k_size = int(X.shape[1] / weight.shape[1])
232 XX = X.reshape((X.shape[0],-1,k_size))
233 masked_X = XX[:, mask, :]
234 masked_X = masked_X.reshape((masked_X.shape[0],-1))

The shapes of X and weight are [3000, 64, 32, 32] and [64, 64, 3, 3], respectively.
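For reference, the indexing those lines intend works when X has been unrolled per kernel position; a minimal sketch with hypothetical, mutually consistent shapes (mask is a boolean vector over input channels):

```python
import numpy as np

X = np.random.randn(8, 64 * 9)          # (N, c_in * k*k): unrolled conv input
weight = np.random.randn(64, 64, 3, 3)  # (c_out, c_in, kH, kW)
mask = np.zeros(64, dtype=bool)
mask[:16] = True                        # keep the first 16 input channels

k_size = X.shape[1] // weight.shape[1]  # 9 unrolled elements per input channel
XX = X.reshape(X.shape[0], -1, k_size)  # (N, c_in, k*k): mask length must match axis 1
masked_X = XX[:, mask, :]               # boolean-index the channel axis
masked_X = masked_X.reshape(masked_X.shape[0], -1)
print(masked_X.shape)  # (8, 144)
```

The IndexError above suggests the channel axis after the reshape did not have length 64, which is exactly the mismatch NumPy reports.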

@LiYunJamesPhD

@capitaso please skip the previous message. The code is working.

@Beeeam

Beeeam commented Mar 10, 2023

Thank you for your work on this. I ran the code and it works. But have you fixed the accuracy problem of the VGG model? I have the same problem you mentioned.

@capitaso

@Beeeam Do you mean this problem? No, I could not fix it. After some struggle, I gave up.

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso Thanks for your reply. Besides, I am also curious about the parameter 'n_points_eachlayer'. I tried a larger value (from 10 to 20), but got worse results.

@capitaso

@Beeeam I think I did that too, but changing the hyper-parameters did not help at all, and I have no idea what the remaining problem is...

That said, my implementation is probably fine: when I fixed the pruning rate by hand (without using AMC), pruning worked well.

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso The hyper-parameter I found really important is warmup...

Also, do you think using filter pruning would help?

@capitaso

@Beeeam What exactly do you mean by filter pruning? Something like Fig. 1 (a) of this? If so, I have not tried it, but to my understanding pruning the filters of one layer is equivalent to pruning the input channels of the next layer.
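That equivalence can be illustrated with a small sketch (hypothetical shapes; selecting filters by L1 norm is just one common criterion):

```python
import numpy as np

w1 = np.random.randn(64, 32, 3, 3)   # layer i weights:   (c_out, c_in, kH, kW)
w2 = np.random.randn(128, 64, 3, 3)  # layer i+1 weights: (c_out, c_in, kH, kW)

keep = int(w1.shape[0] * 0.5)  # keep half of layer i's filters

# Rank layer i's filters by L1 norm and keep the largest ones.
norms = np.abs(w1).reshape(w1.shape[0], -1).sum(axis=1)
idx = np.sort(np.argsort(norms)[-keep:])

w1_pruned = w1[idx]     # filter pruning in layer i
w2_pruned = w2[:, idx]  # the matching channel pruning in layer i+1
print(w1_pruned.shape, w2_pruned.shape)  # (32, 32, 3, 3) (128, 32, 3, 3)
```

Removing a filter of layer i deletes one channel of layer i's output, so the corresponding input channel of layer i+1 must be removed as well, which is why the two views describe the same pruning decision.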

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso filter_pruned_num = int(weight_torch.size()[0] * (1 - compress_rate)) selects weights along the first (output-filter) dimension. It seems finer-grained than channel pruning.

@capitaso

@Beeeam Sorry for my late reply. I could not find such code in env/channel_pruning_env.py. Can you please point me to the lines you are referring to?
