Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gaussian Troubleshoot Combinations Improvement #670

Merged
merged 14 commits into from
Nov 19, 2023
Merged

Gaussian Troubleshoot Combinations Improvement #670

merged 14 commits into from
Nov 19, 2023

Conversation

calvinp0
Copy link
Member

@calvinp0 calvinp0 commented Jun 6, 2023

The current version of ARC cannot fully troubleshoot Gaussian due to lack of combination troubleshooting. This PR attempts to rectify this issue.

For example, if a job receives a first error of L401/301, and so we remove the chkfile by changing guess=read to guess=mix but then it its next failed run it gives another error where we need to add scf=(nosymm), then this modification will create an input.gjf file that uses BOTH troubleshooting methods.

@calvinp0 calvinp0 requested review from alongd and kfir4444 June 6, 2023 05:34
@calvinp0 calvinp0 changed the title Trsh scan 301 Gaussian Troubleshoot Combinations Improvement Jun 6, 2023
@codecov
Copy link

codecov bot commented Jun 6, 2023

Codecov Report

Attention: 71 lines in your changes are missing coverage. Please review.

Comparison is base (7f5190d) 73.54% compared to head (d8420a7) 73.67%.

❗ Current head d8420a7 differs from pull request most recent head 66dc921. Consider uploading reports for the commit 66dc921 to get more accurate results

Files Patch % Lines
arc/job/trsh.py 72.22% 10 Missing and 10 partials ⚠️
arc/job/adapters/gaussian.py 51.28% 17 Missing and 2 partials ⚠️
arc/job/adapters/common.py 70.58% 7 Missing and 8 partials ⚠️
arc/scheduler.py 0.00% 9 Missing ⚠️
arc/job/adapters/gaussian_test.py 93.33% 0 Missing and 5 partials ⚠️
arc/species/species.py 0.00% 2 Missing ⚠️
arc/imports.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #670      +/-   ##
==========================================
+ Coverage   73.54%   73.67%   +0.13%     
==========================================
  Files          99       99              
  Lines       26620    26866     +246     
  Branches     5545     5593      +48     
==========================================
+ Hits        19578    19794     +216     
- Misses       5672     5679       +7     
- Partials     1370     1393      +23     
Flag Coverage Δ
unittests 73.67% <76.94%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@kfir4444 kfir4444 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @calvinp0!
a quick comment, other than that it looks great!

arc/species/species.py Show resolved Hide resolved
arc/species/species.py Show resolved Hide resolved
Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @calvinp0 for taking the time to fix and improve out troubleshooting!
I made some suggestions in the comments.

@Lilachn91 says that we have an issue with the nosym keyword in Gaussian and that is is misplaced when creating input files. Perhaps she can contribute a commit to this PR to fix that?

arc/job/trsh.py Outdated
elif 'SCF' in job_status['keywords'] and 'scf=NoDIIS' not in ess_trsh_methods:
# Switching off Pulay's Direct Inversion
logger.info(f'Troubleshooting {job_type} job in {software} for {label} using scf=NoDIIS')
ess_trsh_methods.append('scf=NoDIIS')
trsh_keyword = 'scf=NoDIIS'
if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods:
remove_checkfile = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can (A) put these lines somewhere where they'll be always called but just written once, and (B) generalize them for any previous trsh keyword and not just checkfile

Another thought: what about creating a small class of Trsh that stores the used trsh keywords and specifies which keyword should be kept until explicitly removed, and which should only be tried/used once. This class could be an attribute of a Species, and passed around to the Job. It'll will create a "memory" of trsh methods used. trsh here should manipulate this object (add/remove keywords). I think that ultimately this is more sustainable and the implementation is keyword/ess independent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think a class of Trsh would be more manageable long term, and much easier to handle when it comes to writing input files for the ESS. However, I think it would be a bit more work as it would be impacting a couple of parts of ARC. So maybe for another PR we could talk about doing that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to put all of these (basically identical) likes in one place?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand. Do you mean the SCF stuff into one place?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that the line if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods: repeats itself many time throughout this block. We should probably have a small function that does this (with a test?), and call that function wherever needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have attempted to turn all these if statements (when possible) into functions

def trsh_keyword_checkfile(job_status, ess_trsh_methods, couldnt_trsh) -> bool:

arc/species/species.py Show resolved Hide resolved
arc/job/adapters/gaussian.py Show resolved Hide resolved
@@ -230,7 +230,13 @@ def write_input_file(self) -> None:
input_dict['method'] = self.level.method
input_dict['multiplicity'] = self.multiplicity
input_dict['scan_trsh'] = self.args['keyword']['scan_trsh'] if 'scan_trsh' in self.args['keyword'] else ''
input_dict['trsh'] = self.args['trsh']['trsh'] if 'trsh' in self.args['trsh'] else ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have many instances where we try to retrieve a value from self.args, but we must first check that it contains the keys.
At some point we should implement a simple function that gets key_1 and key_2 and does all the checks for args in one place, and returns the requested value if it exists. It'll make the rest of the code more elegant. You don't have to implement it as part of this PR, just a thought

arc/job/adapters/gaussian_test.py Outdated Show resolved Hide resolved
arc/job/trsh_test.py Outdated Show resolved Hide resolved
arc/job/trsh.py Outdated
@@ -851,66 +851,110 @@ def trsh_ess_job(label: str,
couldnt_trsh = True

elif software == 'gaussian':
trsh_keyword = [] # TODO: Current making this a list to allow for multiple keywords to be tried. Will need another approach
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the Trsh class approach could assist here?
BTW, I prefer not to commit TODO comments (sorry for the hypocrisy, I know I have done so myself more than a few times)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i think this is a good example of trsh class

I can remove the TODO - it was more a note to you and kfir if you guys had another idea for this current PR

arc/job/adapters/gaussian_test.py Outdated Show resolved Hide resolved
Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I added some comments

arc/job/adapters/gaussian.py Show resolved Hide resolved
arc/job/trsh.py Outdated
elif 'SCF' in job_status['keywords'] and 'scf=NoDIIS' not in ess_trsh_methods:
# Switching off Pulay's Direct Inversion
logger.info(f'Troubleshooting {job_type} job in {software} for {label} using scf=NoDIIS')
ess_trsh_methods.append('scf=NoDIIS')
trsh_keyword = 'scf=NoDIIS'
if 'CheckFile' in job_status['keywords'] or 'checkfile=None' in ess_trsh_methods:
remove_checkfile = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to put all of these (basically identical) likes in one place?

arc/job/adapters/common.py Fixed Show fixed Hide fixed
arc/job/trsh.py Fixed Show fixed Hide fixed
arc/job/trsh.py Fixed Show fixed Hide fixed
@calvinp0 calvinp0 force-pushed the trsh_scan_301 branch 2 times, most recently from ac34aef to 6b1be3a Compare November 14, 2023 10:00
Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this addition! Please see some comments below

@@ -364,7 +364,7 @@ def test_add_to_args(self):
'trsh': {}}
self.assertEqual(self.job_1.args, expected_args)
self.job_1.write_input_file()
new_expected_args = {'keyword': {'general': 'val_tst_1 val_tst_2 val_tst_3 scf=xqc'}, # scf=xqc
new_expected_args = {'keyword': {'general': 'val_tst_1 val_tst_2 val_tst_3'}, # scf=xqc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also remove the comment?

arc/job/adapters/common.py Show resolved Hide resolved
if key not in input_dict.keys():
input_dict[key] = ''
input_dict[key] += f'{val} '
elif key == 'trsh':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is an important addition. Could you find some place in a docstring where it makes sense, and document the desired structure of args, and secifically the bahavior of trsh in it to help maintain it?
See other instances for example, here and here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the full expectation of the desired structure of arguments. This was just to rectifiy an issue that I came across month ago.

arc/job/adapters/common.py Show resolved Hide resolved
arc/job/adapters/common.py Outdated Show resolved Hide resolved
if parameters:
input_dict['trsh'] += f" scf=({','.join(parameters)})"

input_dict = { key.strip(): (val.rstrip() if isinstance(val, str) and val.startswith('\n') and not val.endswith('\n') else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part:

(val.rstrip() if isinstance(val, str) and val.startswith('\n') and not val.endswith('\n') else
                  val.lstrip() if isinstance(val, str) and not val.startswith('\n') and val.endswith('\n') else
                  val.strip() if isinstance(val, str) and not val.startswith('\n') and not val.endswith('\n') else val)

has some level of complexity. I recommend to extract it into a function in job.common with some tests, and use ti here inside the loop/generator

arc/job/adapters/gaussian_test.py Show resolved Hide resolved
arc/job/trsh.py Outdated
@@ -849,60 +849,61 @@ def trsh_ess_job(label: str,
or 'is not appropriate for the this chemistry' in job_status['error']):
output_errors.append(f'Error: Could not recognize basis set {job_status["error"].split()[-1]} in {software}; ')
couldnt_trsh = True


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove these empty lines above?

arc/job/trsh.py Outdated
@@ -1536,3 +1537,72 @@ def scan_quality_check(label: str,
return invalidate, invalidation_reason, message, actions

return invalidate, invalidation_reason, message, actions

def trsh_keyword_checkfile(job_status, ess_trsh_methods, couldnt_trsh) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return types here and below are tuples of bool and other things, something like: Tuple[bool, list, bool]

arc/job/trsh.py Outdated Show resolved Hide resolved
@@ -380,3 +451,39 @@ def which(command: Union[str, list],
return bool(ans)
else:
return ans

def combine_parameters(input_dict: dict, terms: list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Please add the returned type: -> Tuple[dict, list]

pytest will always be in sys.modules even if debug is not run.
${trsh} has been added to the input template
Added the integral algorithm change - this was not done previously
Will change qc to xqc if it exists in the input_dict['trsh'] - this change will mean fundamental changes to the original input whereby not using xqc/qc to begin with. if we want this defaulted, we should just make it default
changed the fine logic - if set to fine, then it will look for current scf in the input_dict['trsh'] and edit it
Large updates including determining the trsh/keywords/block values

Also created a combine parameters function

Adjusted input_dict_strip and added hints
Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@alongd alongd merged commit 26b200b into main Nov 19, 2023
@alongd alongd deleted the trsh_scan_301 branch November 19, 2023 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants