Description
Summary
When I run dpgen run param.json machine.json, training starts but then fails with:
RuntimeError: job:b2538db0a3289e46a65c44051b82c12759272ecd 65874 failed 3 times.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/snehanshu/miniconda3/bin/dpgen", line 8, in <module>
    sys.exit(main())
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpgen/main.py", line 255, in main
    args.func(args)
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpgen/generator/run.py", line 5474, in gen_run
    run_iter(args.PARAM, args.MACHINE)
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpgen/generator/run.py", line 4805, in run_iter
    run_train(ii, jdata, mdata)
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpgen/generator/run.py", line 724, in run_train
    return run_train_dp(iter_index, jdata, mdata)
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpgen/generator/run.py", line 927, in run_train_dp
    submission.run_submission()
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpdispatcher/submission.py", line 235, in run_submission
    self.handle_unexpected_submission_state()
  File "/home/snehanshu/miniconda3/lib/python3.10/site-packages/dpdispatcher/submission.py", line 360, in handle_unexpected_submission_state
    raise RuntimeError(
RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==/home/snehanshu/sankha2/laves/ML potential check/DP_GEN/tmp/6af1ebb125d8995076fc49039a4e584a5e7053e4.
Debug information: submission_hash==6af1ebb125d8995076fc49039a4e584a5e7053e4.
Please check error messages above and in remote_root. The submission information is saved in /home/snehanshu/.dpdispatcher/submission/6af1ebb125d8995076fc49039a4e584a5e7053e4.json.
For furthur actions, run the following command with proper flags: dpdisp submission 6af1ebb125d8995076fc49039a4e584a5e7053e4
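
The message above points at two places to look: the saved submission JSON and the remote_root directory. As a rough sketch of how one might inspect them, the script below loads the JSON as generic data and prints the tail of any log-like files it finds under remote_root. The helper names (load_submission_record, list_candidate_logs, tail) and the file suffixes (.log, .out, .err) are assumptions chosen for illustration, not part of dpgen or dpdispatcher, and nothing here relies on the internal layout of the JSON file.

```python
import json
from pathlib import Path


def load_submission_record(json_path):
    """Load the saved submission record as plain JSON (layout not assumed)."""
    with open(json_path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def list_candidate_logs(remote_root, suffixes=(".log", ".out", ".err")):
    """Return sorted files under remote_root whose names end with a log-like suffix.

    The suffixes are a guess; adjust them to whatever the training job writes.
    """
    root = Path(remote_root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.name.endswith(tuple(suffixes))
    )


def tail(path, n=20):
    """Return the last n lines of a text file, tolerating odd encodings."""
    lines = Path(path).read_text(encoding="utf-8", errors="replace").splitlines()
    return lines[-n:]


if __name__ == "__main__":
    # Hash and paths copied from the error message above.
    submission_hash = "6af1ebb125d8995076fc49039a4e584a5e7053e4"
    record_path = Path(
        "/home/snehanshu/.dpdispatcher/submission"
    ) / f"{submission_hash}.json"
    remote_root = Path(
        "/home/snehanshu/sankha2/laves/ML potential check/DP_GEN/tmp"
    ) / submission_hash

    if record_path.is_file():
        record = load_submission_record(record_path)
        if isinstance(record, dict):
            print("Top-level keys in submission record:", sorted(record))

    for log_file in list_candidate_logs(remote_root):
        print(f"--- {log_file} ---")
        print("\n".join(tail(log_file)))
```

Using pathlib.Path objects rather than shell strings keeps the space-containing directory name "ML potential check" intact without any quoting.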
DP-GEN Version
dpgen 2.1.2/3
Platform, Python Version, etc
Details