
Question about Experiment Settings #1

Open
soeun-22 opened this issue Jan 7, 2025 · 3 comments

Comments

soeun-22 commented Jan 7, 2025

Hello,
Thank you for providing such an excellent paper and code! I truly appreciate your contributions.

I’ve been running experiments using your code and encountered a question regarding the experimental settings. Specifically, my Wikitext2 PPL results seem to differ from the results reported in Table 1.

I conducted the experiment using the following settings:

  • Model: LLaMA2-7B
  • Rank: 256
  • BL, BR: 4-bit
  • Outer iterations: 15
  • Inner iterations: 10

With these settings, I obtained a PPL of 6.4685444831848145, which is higher than the results reported in Table 1.
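For reference, the number being compared here is standard token-level perplexity. A minimal sketch of the metric itself (not the repo's evaluation script):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood
    (in nats), the metric typically reported for Wikitext2 evaluations."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# If every token has probability 1/4, PPL is exactly 4:
print(perplexity([math.log(4.0)] * 100))
```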

I would like to ask:

  1. Could you provide more details on the exact configurations or hyperparameters used to achieve the results in Table 1?
  2. Regarding the Random Hadamard Matrix generation, it seems to be created randomly based on the seed value. Could you share the specific seed values used for each experiment?

For clarity, I have attached a screenshot of my experimental setup.

Thank you again for this remarkable project and for your support.
I look forward to your guidance and hope you have a great day!

NSagan271 (Collaborator) commented Jan 15, 2025

Hi, thank you for the question! Can you try setting Q_hessian_downdate to true and increasing to 20 outer iterations and 50 inner iterations? (Since the time bottleneck is the quantization of Q, increasing the number of inner iterations is reasonable.) Also, for the Hessian matrices, try using the ones from QuIP# if you are not already; the QuIP# Hessians were computed with a large calibration dataset.
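For anyone puzzling over what the outer/inner iteration counts control, here is a toy sketch of the W ≈ Q + L·R alternating structure under simplifying assumptions (round-to-nearest quantization, no Hessian weighting); the function and parameter names are hypothetical, not the repo's API:

```python
import numpy as np

def quantize(x, bits):
    """Uniform round-to-nearest quantizer on a symmetric grid
    (a toy stand-in for the actual quantizers)."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(x / scale) * scale

def toy_alternating_decomposition(W, rank=4, bits=4, outer=15, inner=10, seed=0):
    """Toy sketch of the alternating structure: each outer iteration
    re-quantizes the backbone Q against the current low-rank part, then
    `inner` least-squares steps refine the quantized low-rank factors
    L (m x rank) and R (rank x n). Hypothetical names, no Hessian weighting."""
    rng = np.random.default_rng(seed)
    m, n = W.shape
    L = rng.standard_normal((m, rank)) * 0.1
    R = rng.standard_normal((rank, n)) * 0.1
    Q = np.zeros_like(W)
    for _ in range(outer):
        Q = quantize(W - L @ R, bits)          # quantize the residual backbone
        residual = W - Q
        for _ in range(inner):                 # refine the low-rank factors
            L = quantize(residual @ np.linalg.pinv(R), bits)
            R = quantize(np.linalg.pinv(L) @ residual, bits)
    return Q, L, R
```

The sketch only illustrates why more inner iterations are cheap relative to re-quantizing Q: each inner step is a small least-squares refinement of the low-rank factors.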

Regarding the random seeds, we found that the impact of the random seed is minimal, due to the relatively high dimensions of the matrices and the iterative nature of the algorithm (especially if finetuning is performed over the diagonal matrices of the randomized Hadamard transform, though that is an optional step).
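To make the seed dependence concrete: in a randomized Hadamard transform, the seed typically only controls a random diagonal of ±1 signs. A minimal sketch (assuming a power-of-two dimension and a Sylvester-construction Hadamard matrix; not the repo's implementation):

```python
import numpy as np

def sylvester_hadamard(n):
    """Hadamard matrix via the Sylvester construction (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def random_hadamard(n, seed=0):
    """Randomized Hadamard transform matrix H @ diag(+/-1) / sqrt(n).
    The random sign diagonal is the only seed-dependent part, and the
    result is orthogonal for every seed."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=n)    # random +/-1 diagonal
    return sylvester_hadamard(n) * signs / np.sqrt(n)
```

Because the matrix is orthogonal regardless of which signs are drawn, different seeds give equally well-conditioned transforms, which is consistent with the seed having little effect in high dimensions.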

Also, in case you are interested in using the CALDERA-quantized version of LLaMa-2-7B that we computed, you can now find it on Hugging Face. This checkpoint was obtained with the above configuration and achieves the reported PPL.

Edit: 15 outer iterations should work, as long as the number of inner iterations is 50.

soeun-22 (Author) commented

I will proceed with the experiments based on the settings you kindly provided.
Thank you so much for your thoughtful and detailed response; I truly appreciate it.

rajarshisaha95 (Collaborator) commented

Hi @soeun-22, did the above configuration help? If so, please feel free to resolve the issue.
