Replies: 2 comments 3 replies
-
All data generated by dpgen is used for training. You need to prepare the test dataset yourself.
-
For the meaning of the parameters, see https://docs.deepmodeling.com/projects/dpgen/en/devel/run/index.html. Besides, here is a possible structure of a single dataset.
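As a rough sketch, assuming the usual DeePMD-kit layout for one system (a type.raw file plus one or more set.* directories holding coord.npy, box.npy, energy.npy and force.npy), a hand-prepared test dataset could be checked like this. The function name and the exact list of required files are illustrative assumptions, not something taken from dpgen itself; for example, box.npy may be absent for non-periodic systems.

```python
from pathlib import Path

# Files assumed to be present in a typical DeePMD-kit system directory.
REQUIRED_TOP = ("type.raw",)
REQUIRED_SET = ("coord.npy", "box.npy", "energy.npy", "force.npy")


def missing_files(system_dir):
    """Return the expected files that are absent from system_dir."""
    root = Path(system_dir)
    missing = [name for name in REQUIRED_TOP if not (root / name).is_file()]

    # Each set.* directory (set.000, set.001, ...) holds one chunk of frames.
    set_dirs = sorted(p for p in root.glob("set.*") if p.is_dir())
    if not set_dirs:
        missing.append("set.*/")
    for set_dir in set_dirs:
        for name in REQUIRED_SET:
            if not (set_dir / name).is_file():
                missing.append(f"{set_dir.name}/{name}")
    return missing
```

Calling `missing_files("path/to/test_system")` returns an empty list when every assumed file is in place.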
-
Hi All,
Since I started using dpgen, the lcurve.out from my dpgen training runs never contains a test_err.
I guess the model is tested every disp_freq steps, but no test results are written to lcurve.out.
So I wonder whether my results are right and how to obtain the test_err in lcurve.out.
My lcurve.out is shown below:
(base) [********]$ head -n 2 lcurve.out && tail -n 2 lcurve.out
step rmse_trn rmse_e_trn rmse_f_trn rmse_v_trn lr
0 6.12e+01 1.99e+00 1.93e+00 2.31e-01 1.0e-03
499000 5.46e-01 3.57e-02 3.31e-01 3.61e-02 3.7e-08
500000 4.35e-01 1.45e-02 9.97e-02 2.27e-02 3.5e-08
However, I recently found a normal lcurve.out at https://zhuanlan.zhihu.com/p/555628454, as shown below:
cd iter.000000/00.train/000/
head -n 2 lcurve.out && tail -n 2 lcurve.out
batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr
0 8.14e+00 8.00e+00 1.00e+01 1.00e+01 4.78e-02 3.41e-03 1.0e-03
398000 6.47e-03 7.17e-03 3.47e-06 1.81e-06 6.30e-03 6.98e-03 5.3e-08
400000 6.46e-03 7.74e-03 2.85e-06 1.36e-06 6.30e-03 7.55e-03 5.0e-08
The final value of batch is the value specified by stop_batch in param.json.
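The difference between the two files is visible in the header line alone. A minimal sketch, assuming test (or validation) columns can be recognized by a `_tst` or `_val` suffix and training columns by `_trn`; the `_val` suffix and the function name are assumptions added for illustration:

```python
def lcurve_columns(header_line):
    """Split an lcurve.out header into (test_columns, train_columns)."""
    # Some versions prefix the header with '#', so strip it if present.
    names = header_line.lstrip("#").split()
    test_cols = [n for n in names if n.endswith(("_tst", "_val"))]
    train_cols = [n for n in names if n.endswith("_trn")]
    return test_cols, train_cols


# The two header lines quoted in this post.
mine = "step rmse_trn rmse_e_trn rmse_f_trn rmse_v_trn lr"
reference = "batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr"

for label, header in (("mine", mine), ("reference", reference)):
    test_cols, _ = lcurve_columns(header)
    print(label, "test columns:", test_cols or "none")
```

For my header this reports no test columns, while the reference header yields l2_tst, l2_e_tst and l2_f_tst.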
So I wonder whether my results are right. I guess testing does run every disp_freq steps, because the training log reports a testing time:
DEEPMD INFO initialize model from scratch
DEEPMD INFO start training at lr 1.00e-03 (== 1.00e-03), decay_step 2500, decay_rate 0.950006, final lr will be 3.51e-08
DEEPMD INFO batch 100 training time 9.35 s, testing time 0.09 s
DEEPMD INFO batch 200 training time 9.14 s, testing time 0.07 s
DEEPMD INFO batch 300 training time 9.20 s, testing time 0.10 s
DEEPMD INFO batch 400 training time 8.95 s, testing time 0.06 s
DEEPMD INFO batch 500 training time 8.41 s, testing time 0.03 s
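The learning-rate line in this log can be cross-checked against the lr column of lcurve.out. A minimal sketch, assuming a staircase exponential decay, lr = start_lr * decay_rate ** (step // decay_step); the staircase form and the function name are assumptions, chosen because they reproduce the numbers quoted in this post:

```python
def staircase_lr(step, start_lr=1.0e-3, decay_steps=2500, decay_rate=0.950006):
    """Learning rate after `step` batches under staircase exponential decay."""
    # Integer division keeps the rate constant within each decay_steps window.
    return start_lr * decay_rate ** (step // decay_steps)


# Steps that appear in the head and tail of my lcurve.out.
for step in (0, 499000, 500000):
    print(step, f"{staircase_lr(step):.1e}")
```

With the values from the log, step 500000 gives about 3.51e-08, matching the "final lr" message, and step 499000 gives about 3.7e-08, matching the lr column in lcurve.out.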
This is my param.json, where I set "numb_test": 1:
"training": {
"stop_batch": 500000,
"disp_file": "lcurve.out",
"disp_freq": 100,
"numb_test": 1,
"save_freq": 10000,
"save_ckpt": "model.ckpt",
"disp_training": true,
"time_training": true,
"profiling": false,
"profiling_file": "timeline.json",
"_comment": "that's all",
best,
Thank you.