-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gaussian Troubleshoot Combinations Improvement #670
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #670 +/- ##
==========================================
+ Coverage 73.54% 73.67% +0.13%
==========================================
Files 99 99
Lines 26620 26866 +246
Branches 5545 5593 +48
==========================================
+ Hits 19578 19794 +216
- Misses 5672 5679 +7
- Partials 1370 1393 +23
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @calvinp0!
a quick comment, other than that it looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @calvinp0 for taking the time to fix and improve out troubleshooting!
I made some suggestions in the comments.
@Lilachn91 says that we have an issue with the nosym
keyword in Gaussian and that is is misplaced when creating input files. Perhaps she can contribute a commit to this PR to fix that?
arc/job/trsh.py
Outdated
elif 'SCF' in job_status['keywords'] and 'scf=NoDIIS' not in ess_trsh_methods: | ||
# Switching off Pulay's Direct Inversion | ||
logger.info(f'Troubleshooting {job_type} job in {software} for {label} using scf=NoDIIS') | ||
ess_trsh_methods.append('scf=NoDIIS') | ||
trsh_keyword = 'scf=NoDIIS' | ||
if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods: | ||
remove_checkfile = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can (A) put these lines somewhere where they'll be always called but just written once, and (B) generalize them for any previous trsh keyword and not just checkfile
Another thought: what about creating a small class of Trsh
that stores the used trsh keywords and specifies which keyword should be kept until explicitly removed, and which should only be tried/used once. This class could be an attribute of a Species
, and passed around to the Job. It'll will create a "memory" of trsh methods used. trsh here should manipulate this object (add/remove keywords). I think that ultimately this is more sustainable and the implementation is keyword/ess independent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think a class of Trsh would be more manageable long term, and much easier to handle when it comes to writing input files for the ESS. However, I think it would be a bit more work as it would be impacting a couple of parts of ARC. So maybe for another PR we could talk about doing that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to put all of these (basically identical) likes in one place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand. Do you mean the SCF stuff into one place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that the line if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods:
repeats itself many time throughout this block. We should probably have a small function that does this (with a test?), and call that function wherever needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have attempted to turn all these if statements (when possible) into functions
Line 1540 in 66b5d1e
def trsh_keyword_checkfile(job_status, ess_trsh_methods, couldnt_trsh) -> bool: |
arc/job/adapters/gaussian.py
Outdated
@@ -230,7 +230,13 @@ def write_input_file(self) -> None: | |||
input_dict['method'] = self.level.method | |||
input_dict['multiplicity'] = self.multiplicity | |||
input_dict['scan_trsh'] = self.args['keyword']['scan_trsh'] if 'scan_trsh' in self.args['keyword'] else '' | |||
input_dict['trsh'] = self.args['trsh']['trsh'] if 'trsh' in self.args['trsh'] else '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have many instances where we try to retrieve a value from self.args
, but we must first check that it contains the keys.
At some point we should implement a simple function that gets key_1
and key_2
and does all the checks for args
in one place, and returns the requested value if it exists. It'll make the rest of the code more elegant. You don't have to implement it as part of this PR, just a thought
arc/job/trsh.py
Outdated
@@ -851,66 +851,110 @@ def trsh_ess_job(label: str, | |||
couldnt_trsh = True | |||
|
|||
elif software == 'gaussian': | |||
trsh_keyword = [] # TODO: Current making this a list to allow for multiple keywords to be tried. Will need another approach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the Trsh
class approach could assist here?
BTW, I prefer not to commit TODO
comments (sorry for the hypocrisy, I know I have done so myself more than a few times)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i think this is a good example of trsh class
I can remove the TODO - it was more a note to you and kfir if you guys had another idea for this current PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I added some comments
arc/job/trsh.py
Outdated
elif 'SCF' in job_status['keywords'] and 'scf=NoDIIS' not in ess_trsh_methods: | ||
# Switching off Pulay's Direct Inversion | ||
logger.info(f'Troubleshooting {job_type} job in {software} for {label} using scf=NoDIIS') | ||
ess_trsh_methods.append('scf=NoDIIS') | ||
trsh_keyword = 'scf=NoDIIS' | ||
if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods: | ||
remove_checkfile = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to put all of these (basically identical) likes in one place?
e592590
to
37ad8b9
Compare
ac34aef
to
6b1be3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this addition! Please see some comments below
arc/job/adapter_test.py
Outdated
@@ -364,7 +364,7 @@ def test_add_to_args(self): | |||
'trsh': {}} | |||
self.assertEqual(self.job_1.args, expected_args) | |||
self.job_1.write_input_file() | |||
new_expected_args = {'keyword': {'general': 'val_tst_1 val_tst_2 val_tst_3 scf=xqc'}, # scf=xqc | |||
new_expected_args = {'keyword': {'general': 'val_tst_1 val_tst_2 val_tst_3'}, # scf=xqc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also remove the comment?
arc/job/adapters/common.py
Outdated
if key not in input_dict.keys(): | ||
input_dict[key] = '' | ||
input_dict[key] += f'{val} ' | ||
elif key == 'trsh': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the full expectation of the desired structure of arguments. This was just to rectifiy an issue that I came across month ago.
arc/job/adapters/gaussian.py
Outdated
if parameters: | ||
input_dict['trsh'] += f" scf=({','.join(parameters)})" | ||
|
||
input_dict = { key.strip(): (val.rstrip() if isinstance(val, str) and val.startswith('\n') and not val.endswith('\n') else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part:
(val.rstrip() if isinstance(val, str) and val.startswith('\n') and not val.endswith('\n') else
val.lstrip() if isinstance(val, str) and not val.startswith('\n') and val.endswith('\n') else
val.strip() if isinstance(val, str) and not val.startswith('\n') and not val.endswith('\n') else val)
has some level of complexity. I recommend to extract it into a function in job.common with some tests, and use ti here inside the loop/generator
arc/job/trsh.py
Outdated
@@ -849,60 +849,61 @@ def trsh_ess_job(label: str, | |||
or 'is not appropriate for the this chemistry' in job_status['error']): | |||
output_errors.append(f'Error: Could not recognize basis set {job_status["error"].split()[-1]} in {software}; ') | |||
couldnt_trsh = True | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove these empty lines above?
arc/job/trsh.py
Outdated
@@ -1536,3 +1537,72 @@ def scan_quality_check(label: str, | |||
return invalidate, invalidation_reason, message, actions | |||
|
|||
return invalidate, invalidation_reason, message, actions | |||
|
|||
def trsh_keyword_checkfile(job_status, ess_trsh_methods, couldnt_trsh) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return types here and below are tuples of bool and other things, something like: Tuple[bool, list, bool]
arc/job/adapters/common.py
Outdated
@@ -380,3 +451,39 @@ def which(command: Union[str, list], | |||
return bool(ans) | |||
else: | |||
return ans | |||
|
|||
def combine_parameters(input_dict: dict, terms: list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Please add the returned type: -> Tuple[dict, list]
pytest will always be in sys.modules even if debug is not run.
${trsh} has been added to the input template Added the integral algorithm change - this was not done previously Will change qc to xqc if it exists in the input_dict['trsh'] - this change will mean fundamental changes to the original input whereby not using xqc/qc to begin with. if we want this defaulted, we should just make it default changed the fine logic - if set to fine, then it will look for current scf in the input_dict['trsh'] and edit it
501011a
to
8dab219
Compare
Large updates including determining the trsh/keywords/block values Also created a combine parameters function Adjusted input_dict_strip and added hints
d8420a7
to
66dc921
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
The current version of ARC cannot fully troubleshoot Gaussian due to lack of combination troubleshooting. This PR attempts to rectify this issue.
For example, if a job receives a first error of L401/301, and so we remove the chkfile by changing
guess=read
toguess=mix
but then it its next failed run it gives another error where we need to addscf=(nosymm)
, then this modification will create aninput.gjf
file that uses BOTH troubleshooting methods.