-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Develop #83
Conversation
…sed, and avoid re-parsing
… no subfolder used
…, to reflect preferred usage as DOS object for expedited analysis. Also update docstrings, add `get_fermi_dos`, add some utility options to simplify workflow etc
…alse` more robust, and add test for this
…matgen#3879) so `pymatgen` version constraint not required
WalkthroughThis update focuses on refining and expanding the Changes
Poem
Warning Review ran into problemsProblems (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Outside diff range and nitpick comments (8)
tests/test_thermodynamics.py (2)
23-23
: Consider organizing imports according to PEP 8 standards: standard library imports, third-party imports, and local application/library specific imports should be grouped separately.
Line range hint
2099-2120
: The loop iterating overbulk_dos
variants should explicitly handle different input types (string vs. DOS object) to avoid runtime errors. Additionally, the assertion for the mean of quenched Fermi levels should have a clear comment explaining its purpose.for i, bulk_dos in enumerate([k10_dos_vr_path, get_vasprun(k10_dos_vr_path, parse_dos=True)]): print(f"Testing k10 DOS with thermo for {'str input' if i == 0 else 'DOS object input'}") quenched_fermi_levels = [] for anneal_temp in self.reduced_anneal_temperatures: gap_shift = belas_linear_fit(anneal_temp) - 1.5 try: ( fermi_level, e_conc, h_conc, conc_df, ) = self.defect_thermo.get_quenched_fermi_level_and_concentrations( bulk_dos=bulk_dos, limit="Te-rich", annealing_temperature=anneal_temp, delta_gap=gap_shift, ) quenched_fermi_levels.append(fermi_level) except Exception as e: print(f"Error processing bulk_dos: {str(e)}") continue # Ensure the Fermi levels are within expected bounds # This checks the consistency of Fermi level calculations across different DOS calculations. assert np.isclose(np.mean(quenched_fermi_levels[6:8]), 0.31825, atol=1e-3), "Fermi levels out of expected range"doped/core.py (1)
Line range hint
1-3000
: Optimize the handling of exceptions and error messages.Throughout the file, there are multiple instances where exceptions are caught broadly (e.g.,
except Exception
). This practice can obscure the source of errors and make debugging more difficult. Refine the exception handling by catching more specific exceptions. Additionally, enhance the error messages to provide more context about the error, helping in quicker resolution and better maintainability.tests/test_analysis.py (5)
228-230
: Consider adding docstrings to thesetUp
method to explain its purpose and the resources it sets up for the tests.
Line range hint
232-236
: ThetearDown
method should include error handling to ensure that file operations do not cause the test to fail unexpectedly. Use try-except blocks around file operations.- shutil.move( + try: + shutil.move( - f"{self.v_Cd_m2_path}/OUTCAR.gz", f"{self.v_Cd_m2_path}/hidden_otcr.gz") # use FNV + f"{self.v_Cd_m2_path}/OUTCAR.gz", f"{self.v_Cd_m2_path}/hidden_otcr.gz") # use FNV + except Exception as e: + print(f"Error during file operation: {e}")
Line range hint
238-242
: The test methodtest_parsing_cdte
lacks a docstring. Adding a brief description of what the test aims to verify would improve code readability and maintainability.
Line range hint
244-252
: The methodtest_kumagai_order
is testing the robustness of the Kumagai defect correction against different orderings of atoms in the input files. However, there's no assertion here to verify the outcome of the test. Consider adding assertions to ensure that the parsed energies or corrections meet expected values.+ # Example assertion to verify the energy correction + expected_correction = <expected_value> + assert np.isclose(parsed_v_cd_m2_orig.corrections['kumagai_charge_correction'], expected_correction, atol=1e-3)
Line range hint
254-258
: The methodtest_freysoldt_order
is testing the Freysoldt defect correction parser. Similar to the previous method, it lacks assertions to verify the correctness of the outcomes. It's crucial to add assertions to confirm that the corrections are as expected.+ # Example assertion to verify the energy correction + expected_correction = <expected_value> + assert np.isclose(parsed_v_cd_m2_orig.corrections['freysoldt_charge_correction'], expected_correction, atol=1e-3)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (2)
examples/CdTe/CdTe_prim_k181818_NKRED_2_vasprun.xml.gz
is excluded by!**/*.gz
tests/data/remote_baseline_plots/CdTe_LZ_Te_rich_concentrations.png
is excluded by!**/*.png
Files selected for processing (11)
- README.md (1 hunks)
- docs/index.rst (1 hunks)
- doped/analysis.py (16 hunks)
- doped/core.py (2 hunks)
- doped/generation.py (4 hunks)
- doped/thermodynamics.py (31 hunks)
- doped/utils/symmetry.py (3 hunks)
- tests/data/CsPbCl3_supercell_POSCAR (1 hunks)
- tests/test_analysis.py (8 hunks)
- tests/test_generation.py (5 hunks)
- tests/test_thermodynamics.py (6 hunks)
Files skipped from review due to trivial changes (1)
- docs/index.rst
Additional context used
LanguageTool
README.md
[style] ~8-~8: Although a hyphen is possible, it is not necessary in a compound modifier in which the first word is an adverb that ends in ‘ly’. (HYPHENATED_LY_ADVERB_ADJECTIVE)
Context: ...ducible, user-friendly yet powerful and fully-customisable manner. Tutorials showing the code fun...
[style] ~15-~15: Although a hyphen is possible, it is not necessary in a compound modifier in which the first word is an adverb that ends in ‘ly’. (HYPHENATED_LY_ADVERB_ADJECTIVE)
Context: ...ures All features and functionality are fully-customisable: - Supercell Generation: Generate a...
[style] ~23-~23: Although a hyphen is possible, it is not necessary in a compound modifier in which the first word is an adverb that ends in ‘ly’. (HYPHENATED_LY_ADVERB_ADJECTIVE)
Context: ...ections, etc. -Python
Interface: Fully-customisable, modularPython
API. Plug-and-play w/...
[uncategorized] ~54-~54: If this is a compound adjective that modifies the following noun, use a hyphen. (EN_COMPOUND_ADJECTIVE_INTERNAL)
Context: ...i.org/10.21105/joss.06433). Journal of Open Source Software 9 (96), 6433, 2024 ## `S...
Additional comments not posted (16)
tests/data/CsPbCl3_supercell_POSCAR (1)
1-143
: The POSCAR file format appears correct with proper lattice vectors and atomic positions specified.README.md (1)
71-71
: The updated title and publication venue for W. Dou et al.'s research paper are correctly formatted and match the provided summary.doped/utils/symmetry.py (1)
Line range hint
304-326
: Refine the lattice matrix sorting logic in_rotate_and_get_supercell_matrix
.
[REFACTOR_SUGGESTion]- possible_mappings, - key=lambda x: ( - _lattice_matrix_sorting_func(x[1].T @ x[2]), - _lattice_matrix_sorting_func(x[2]), - _lattice_matrix_sorting_func(x[1]), - ), + possible_mappings, key=lambda x: ( + _lattice_matrix_sorting_func(x[1].T @ x[2]), # R*S + _lattice_matrix_sorting_func(x[2]), # S + _lattice_matrix_sorting_func(x[1]), # R + ),The sorting function for lattice matrix mappings has been refined. The refactor makes the sorting criteria clearer by explicitly commenting on what each part of the tuple represents (
R*S
,S
,R
). This enhances readability and maintainability.doped/core.py (1)
Line range hint
1-3000
: Review the implementation of defect-specific classes.The
Vacancy
,Substitution
, andInterstitial
classes are well-implemented, providing specific functionalities for different types of defects. The use of inheritance and overriding of methods is correctly applied, ensuring that each defect type behaves as expected while still leveraging shared functionalities from theDefect
class.doped/generation.py (3)
1204-1212
: The default settings forsupercell_gen_kwargs
are well-defined and clear.However, consider documenting these default values in the class docstring for better visibility to users.
1268-1270
: The unpacking ofsupercell_gen_kwargs
using**
directly in the function call is efficient.This approach keeps the function call clean and leverages Python's unpacking feature effectively.
1277-1277
: Check the logical condition for structure size comparison.#!/bin/bash # Description: Verify the logic that checks if the input structure size meets the requirements. # Test: Search for the function usage and ensure the logic is correctly implemented across the codebase. rg --type python $'input_min_image_distance >= specified_min_image_distance and (primitive_structure * supercell_matrix).num_sites >= self.structure.num_sites'Ensure that the comparison logic aligns with the intended functionality and that there are no edge cases overlooked.
doped/analysis.py (3)
373-373
: Update parameter to accept both string andVasprun
object types for flexibility.This change enhances the flexibility of the
defect_entry_from_paths
function by allowing thebulk_band_gap_vr
parameter to accept either a string path or aVasprun
object directly. This should simplify the function's usage in different contexts where either the path or the object might already be available, reducing the need for redundant loading operations.Also applies to: 414-426
465-465
: Ensure consistent handling ofbulk_band_gap_vr
across different classes.The consistent update across different initializers in the
DefectsParser
class to support thebulk_band_gap_vr
parameter is a good practice. This ensures that wherever the class is instantiated, there is flexibility in specifying the bulk band gap data, which is crucial for accurate defect analysis.Also applies to: 533-551
1181-1182
: Ensure dynamic descriptions in progress bars during multiprocessing.The dynamic setting of descriptions in the progress bars during multiprocessing parsing provides better visibility into the current state of processing, which is particularly useful in debugging or monitoring long-running parsing operations.
doped/thermodynamics.py (1)
452-461
: Clarify the usage of thevbm
parameter in theDefectThermodynamics
class constructor.
[REFACTOR_Suggestion]VBM eigenvalue to use as Fermi level reference point for analysis. If None (default), will use ``"vbm"`` from the ``calculation_metadata`` dict attributes of the parsed ``DefectEntry`` objects, which by default is taken from the bulk supercell VBM (unless ``bulk_band_gap_vr`` is set during defect parsing). Note that ``vbm`` should only affect the reference for the Fermi level values output by ``doped`` (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.tests/test_generation.py (5)
694-695
: Ensure that the file path is correctly set and consider adding error handling for file loading.
1472-1472
: Addition of interstitial coordinates appears consistent and correct.
1587-1587
: Consistent addition of interstitial coordinates.
3273-3284
: The testing for defect generation is well-implemented. Consider clarifying the comment about adjusting differences to make outputs equal.
3332-3385
: Robust testing for handling unrecognized keyword arguments. Good use of specific error messages to ensure clarity in test failures.
[APROVED]
@@ -1048,7 +1048,7 @@ def __init__( | |||
interstitial_coords: Optional[list] = None, | |||
generate_supercell: bool = True, | |||
charge_state_gen_kwargs: Optional[dict] = None, | |||
supercell_gen_kwargs: Optional[dict] = None, | |||
supercell_gen_kwargs: Optional[dict[str, Union[int, float, bool]]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure proper type hinting for the supercell_gen_kwargs
parameter.
- supercell_gen_kwargs: Optional[dict[str, Union[int, float, bool]]] = None,
+ supercell_gen_kwargs: Optional[Dict[str, Union[int, float, bool]]] = None,
Please import Dict
from typing
to maintain consistency and clarity in type annotations.
Committable suggestion was skipped due to low confidence.
VBM eigenvalue to use as Fermi level reference point for calculating | ||
formation energy. If ``None`` (default), will use ``"vbm"`` from the | ||
``calculation_metadata`` dict attribute if present -- which corresponds | ||
to the VBM of the `bulk supercell` calculation by default, unless | ||
``bulk_band_gap_vr`` is set during defect parsing). | ||
fermi_level (float): | ||
Value corresponding to the electron chemical potential, | ||
referenced to the VBM. Default is 0 (i.e. the VBM). | ||
Value corresponding to the electron chemical potential, referenced | ||
to the VBM eigenvalue. Default is 0 (i.e. the VBM). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the documentation to clarify the new behavior of VBM and Fermi level calculation.
The documentation for the formation_energy
method has been updated to reflect changes in handling the VBM eigenvalue as the reference point for calculating formation energy. Ensure that the new behavior is clearly documented, especially the conditions under which the VBM eigenvalue is used or not used.
Ensure consistency in the handling of VBM across different methods.
The handling of the VBM eigenvalue as a reference point for calculations is crucial for accurate defect analysis. It appears in both the formation_energy
and equilibrium_concentration
methods. Consistency in its usage across these methods is essential for the integrity of the calculations. Consider refactoring to centralize the handling of VBM to ensure consistency and reduce potential errors.
Also applies to: 1113-1117
VBM eigenvalue to use as Fermi level reference point for calculating | ||
the formation energy. If ``None`` (default), will use ``"vbm"`` from the | ||
``calculation_metadata`` dict attribute if present -- which corresponds | ||
to the VBM of the `bulk supercell` calculation by default, unless | ||
``bulk_band_gap_vr`` is set during defect parsing). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify the VBM handling in the equilibrium_concentration
method.
Similar to the formation_energy
method, the equilibrium_concentration
method documentation has been updated to reflect changes in handling the VBM eigenvalue. It's crucial to ensure that these changes are accurately described and that the conditions under which the VBM eigenvalue is used or not used are clear.
doped/utils/symmetry.py
Outdated
transformation_matrix = target_struct.lattice.matrix @ np.linalg.inv(prim_struct.lattice.matrix) | ||
if not np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3): | ||
raise ValueError # if non-integer transformation matrix | ||
|
||
return prim_struct, transformation_matrix | ||
return prim_struct, np.rint(transformation_matrix) | ||
|
||
except ValueError: # if non-integer transformation matrix | ||
prim_struct, transformation_matrix = _rotate_and_get_supercell_matrix(prim_struct, target_struct) | ||
|
||
if np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3): | ||
return prim_struct, np.rint(transformation_matrix) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for non-integer transformation matrices in _get_supercell_matrix_and_possibly_rotate_prim
.
- transformation_matrix = target_struct.lattice.matrix @ np.linalg.inv(prim_struct.lattice.matrix)
- if not np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3):
- raise ValueError # if non-integer transformation matrix
+ transformation_matrix = np.dot(target_struct.lattice.matrix, np.linalg.inv(prim_struct.lattice.matrix))
+ if not np.all(np.isclose(transformation_matrix, np.rint(transformation_matrix), atol=1e-3)):
+ raise ValueError("Non-integer transformation matrix detected.")
The matrix multiplication has been changed from @
to np.dot
for consistency with other numpy operations in the codebase. Additionally, the condition to check for non-integer values now explicitly uses np.all
with np.isclose
to ensure that all elements meet the criterion, enhancing the robustness of the check. The error raised now includes a message to clarify the reason for the exception.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
transformation_matrix = target_struct.lattice.matrix @ np.linalg.inv(prim_struct.lattice.matrix) | |
if not np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3): | |
raise ValueError # if non-integer transformation matrix | |
return prim_struct, transformation_matrix | |
return prim_struct, np.rint(transformation_matrix) | |
except ValueError: # if non-integer transformation matrix | |
prim_struct, transformation_matrix = _rotate_and_get_supercell_matrix(prim_struct, target_struct) | |
if np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3): | |
return prim_struct, np.rint(transformation_matrix) | |
transformation_matrix = np.dot(target_struct.lattice.matrix, np.linalg.inv(prim_struct.lattice.matrix)) | |
if not np.all(np.isclose(transformation_matrix, np.rint(transformation_matrix), atol=1e-3)): | |
raise ValueError("Non-integer transformation matrix detected.") | |
return prim_struct, np.rint(transformation_matrix) | |
except ValueError: # if non-integer transformation matrix | |
prim_struct, transformation_matrix = _rotate_and_get_supercell_matrix(prim_struct, target_struct) | |
if np.allclose(np.rint(transformation_matrix), transformation_matrix, atol=1e-3): | |
return prim_struct, np.rint(transformation_matrix) | |
cls.fermi_dos = get_fermi_dos( | ||
os.path.join(cls.CdTe_EXAMPLE_DIR, "CdTe_prim_k181818_NKRED_2_vasprun.xml.gz") | ||
) | ||
cls.anneal_temperatures = np.arange(200, 1401, 50) | ||
cls.reduced_anneal_temperatures = np.arange(200, 1401, 100) # for quicker testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fermi_dos
attribute initialization should handle potential exceptions if the file path or file format is incorrect. Consider adding error handling around these operations.
- cls.fermi_dos = get_fermi_dos(
+ try:
+ cls.fermi_dos = get_fermi_dos(
os.path.join(cls.CdTe_EXAMPLE_DIR, "CdTe_prim_k181818_NKRED_2_vasprun.xml.gz")
+ )
+ except Exception as e:
+ cls.fermi_dos = None
+ print(f"Failed to load fermi DOS: {str(e)}")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
cls.fermi_dos = get_fermi_dos( | |
os.path.join(cls.CdTe_EXAMPLE_DIR, "CdTe_prim_k181818_NKRED_2_vasprun.xml.gz") | |
) | |
cls.anneal_temperatures = np.arange(200, 1401, 50) | |
cls.reduced_anneal_temperatures = np.arange(200, 1401, 100) # for quicker testing | |
try: | |
cls.fermi_dos = get_fermi_dos( | |
os.path.join(cls.CdTe_EXAMPLE_DIR, "CdTe_prim_k181818_NKRED_2_vasprun.xml.gz") | |
) | |
except Exception as e: | |
cls.fermi_dos = None | |
print(f"Failed to load fermi DOS: {str(e)}") | |
cls.anneal_temperatures = np.arange(200, 1401, 50) | |
cls.reduced_anneal_temperatures = np.arange(200, 1401, 100) # for quicker testing |
if bulk_band_gap_vr and not isinstance(bulk_band_gap_vr, Vasprun): | ||
self.bulk_band_gap_vr = get_vasprun(bulk_band_gap_vr, parse_projected_eigen=False) | ||
else: | ||
self.bulk_band_gap_vr = bulk_band_gap_vr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Properly handle the type of bulk_band_gap_vr
when setting the attribute.
- if bulk_band_gap_vr and not isinstance(bulk_band_gap_vr, Vasprun):
- self.bulk_band_gap_vr = get_vasprun(bulk_band_gap_vr, parse_projected_eigen=False)
- else:
- self.bulk_band_gap_vr = bulk_band_gap_vr
+ self.bulk_band_gap_vr = get_vasprun(bulk_band_gap_vr, parse_projected_eigen=False) if isinstance(bulk_band_gap_vr, str) else bulk_band_gap_vr
The handling of bulk_band_gap_vr
should be streamlined to avoid unnecessary checks and to ensure that the Vasprun
object is correctly handled regardless of its initial form. The suggested change simplifies the logic, making it more readable and less prone to errors.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if bulk_band_gap_vr and not isinstance(bulk_band_gap_vr, Vasprun): | |
self.bulk_band_gap_vr = get_vasprun(bulk_band_gap_vr, parse_projected_eigen=False) | |
else: | |
self.bulk_band_gap_vr = bulk_band_gap_vr | |
self.bulk_band_gap_vr = get_vasprun(bulk_band_gap_vr, parse_projected_eigen=False) if isinstance(bulk_band_gap_vr, str) else bulk_band_gap_vr |
doped/thermodynamics.py
Outdated
def _parse_fermi_dos( | ||
self, bulk_dos: Union[str, Vasprun, FermiDos], skip_check: bool = False | ||
) -> FermiDos: | ||
if isinstance(bulk_dos, FermiDos): | ||
fdos = bulk_dos | ||
# most similar settings to Vasprun.eigenvalue_band_properties: | ||
fdos_vbm = fdos.get_cbm_vbm(tol=1e-4, abs_tol=True)[1] # tol 1e-4 is lowest possible, as VASP | ||
fdos_band_gap = fdos.get_gap(tol=1e-4, abs_tol=True) # rounds the DOS outputs to 4 dp | ||
|
||
if isinstance(bulk_dos, str): | ||
bulk_dos = get_vasprun(bulk_dos, parse_dos=True) # converted to fdos in next block | ||
|
||
if isinstance(bulk_dos_vr, str): | ||
bulk_dos_vr = get_vasprun(bulk_dos_vr, parse_dos=True) | ||
if isinstance(bulk_dos, Vasprun): # either supplied Vasprun or parsed from string there | ||
fdos_band_gap, _cbm, fdos_vbm, _ = bulk_dos.eigenvalue_band_properties | ||
fdos = get_fermi_dos(bulk_dos) | ||
|
||
fermi_dos_band_gap, _cbm, fermi_dos_vbm, _ = bulk_dos_vr.eigenvalue_band_properties | ||
if abs(fermi_dos_vbm - self.vbm) > 0.1: | ||
if abs(fdos_vbm - self.vbm) > 0.05 and not skip_check: | ||
warnings.warn( | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fermi_dos_vbm:.2f} eV, with band " | ||
f"gap of {fermi_dos_band_gap:.2f} eV) differs from that of the bulk supercell " | ||
f"calculations ({self.vbm:.2f} eV, with band gap of {self.band_gap:.2f} eV) by more " | ||
f"than 0.1 eV. If this is only due to slight differences in kpoint sampling for the " | ||
f"bulk DOS vs defect supercell calculations, and consistent functional settings " | ||
f"(LHFCALC, AEXX etc) were used, then the eigenvalue references should be consistent " | ||
f"and this warning can be ignored. If not, then this could lead to inaccuracies in " | ||
f"the predicted Fermi level. Note that the Fermi level will be referenced to the VBM " | ||
f"of the bulk supercell (i.e. DefectThermodynamics.vbm)" | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fdos_vbm:.2f} eV, band gap = " | ||
f"{fdos_band_gap:.2f} eV) differs by >0.05 eV from `DefectThermodynamics.vbm/gap` " | ||
f"({self.vbm:.2f} eV, band gap = {self.band_gap:.2f} eV; which are taken from the bulk " | ||
f"supercell calculation by default, unless `bulk_band_gap_vr` is set during defect " | ||
f"parsing). If this is only due to differences in kpoint sampling for the bulk DOS vs " | ||
f"supercell calculations, then you should use the `bulk_band_gap_vr` option during " | ||
f"defect parsing to set the bulk band gap and VBM eigenvalue " | ||
f"(`DefectThermodynamics.gap/vbm`) to the correct values (though the absolute values of " | ||
f"predictions should not be affected as the eigenvalue references in the calculations " | ||
f"are consistent, just the reported Fermi levels will be referenced to " | ||
f"`DefectThermodynamics.vbm` which may not be the exact VBM position here).\n" | ||
f"Otherwise if this is due to changes in functional settings (LHFCALC, AEXX etc), then " | ||
f"the calculations should be redone with consistent settings to ensure accurate " | ||
f"predictions.\n" | ||
f"Note that the Fermi level will be always referenced to `DefectThermodynamics.vbm`!" | ||
) | ||
return FermiDos(bulk_dos_vr.complete_dos, nelecs=get_nelect_from_vasprun(bulk_dos_vr)) | ||
return fdos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider consolidating the conditions for checking the instance types of bulk_dos
.
def _parse_fermi_dos(
self, bulk_dos: Union[str, Vasprun, FermiDos], skip_check: bool = False
) -> FermiDos:
if isinstance(bulk_dos, FermiDos):
fdos = bulk_dos
# most similar settings to Vasprun.eigenvalue_band_properties:
fdos_vbm = fdos.get_cbm_vbm(tol=1e-4, abs_tol=True)[1] # tol 1e-4 is lowest possible, as VASP
fdos_band_gap = fdos.get_gap(tol=1e-4, abs_tol=True) # rounds the DOS outputs to 4 dp
if isinstance(bulk_dos, str):
bulk_dos = get_vasprun(bulk_dos, parse_dos=True) # converted to fdos in next block
if isinstance(bulk_dos, Vasprun): # either supplied Vasprun or parsed from string there
fdos_band_gap, _cbm, fdos_vbm, _ = bulk_dos.eigenvalue_band_properties
fdos = get_fermi_dos(bulk_dos)
if abs(fdos_vbm - self.vbm) > 0.05 and not skip_check:
warnings.warn(
f"The VBM eigenvalue of the bulk DOS calculation ({fdos_vbm:.2f} eV, band gap = "
f"{fdos_band_gap:.2f} eV) differs by >0.05 eV from `DefectThermodynamics.vbm/gap` "
f"({self.vbm:.2f} eV, band gap = {self.band_gap:.2f} eV; which are taken from the bulk "
f"supercell calculation by default, unless `bulk_band_gap_vr` is set during defect "
f"parsing). If this is only due to differences in kpoint sampling for the bulk DOS vs "
f"supercell calculations, then you should use the `bulk_band_gap_vr` option during "
f"defect parsing to set the bulk band gap and VBM eigenvalue "
f"(`DefectThermodynamics.gap/vbm`) to the correct values (though the absolute values of "
f"predictions should not be affected as the eigenvalue references in the calculations "
f"are consistent, just the reported Fermi levels will be referenced to "
f"`DefectThermodynamics.vbm` which may not be the exact VBM position here).\n"
f"Otherwise if this is due to changes in functional settings (LHFCALC, AEXX etc), then "
f"the calculations should be redone with consistent settings to ensure accurate "
f"predictions.\n"
f"Note that the Fermi level will be always referenced to `DefectThermodynamics.vbm`!"
)
return fdos
Committable suggestion was skipped due to low confidence.
doped/thermodynamics.py
Outdated
def calculate_md5(file_path): | ||
""" | ||
Calculate the MD5 hash of a file. | ||
|
||
Used for checking if the pimitive cell DOS ``vasprun.xml(.gz)`` | ||
file has already been parsed, to avoid wasting time re-parsing. | ||
|
||
https://www.geeksforgeeks.org/finding-md5-of-files-recursively-in-directory-in-python/ | ||
""" | ||
hasher = hashlib.md5() | ||
with open(file_path, "rb") as f: | ||
for chunk in iter(lambda: f.read(4096), b""): | ||
hasher.update(chunk) | ||
return hasher.hexdigest() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a more detailed docstring to calculate_md5
.
def calculate_md5(file_path):
"""
Calculate the MD5 hash of a file.
+ This function is useful for verifying the integrity of files by comparing their hashes.
+ It reads the file in chunks to handle large files efficiently.
Used for checking if the primitive cell DOS ``vasprun.xml(.gz)``
file has already been parsed, to avoid wasting time re-parsing.
https://www.geeksforgeeks.org/finding-md5-of-files-recursively-in-directory-in-python/
"""
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def calculate_md5(file_path): | |
""" | |
Calculate the MD5 hash of a file. | |
Used for checking if the pimitive cell DOS ``vasprun.xml(.gz)`` | |
file has already been parsed, to avoid wasting time re-parsing. | |
https://www.geeksforgeeks.org/finding-md5-of-files-recursively-in-directory-in-python/ | |
""" | |
hasher = hashlib.md5() | |
with open(file_path, "rb") as f: | |
for chunk in iter(lambda: f.read(4096), b""): | |
hasher.update(chunk) | |
return hasher.hexdigest() | |
def calculate_md5(file_path): | |
""" | |
Calculate the MD5 hash of a file. | |
This function is useful for verifying the integrity of files by comparing their hashes. | |
It reads the file in chunks to handle large files efficiently. | |
Used for checking if the primitive cell DOS ``vasprun.xml(.gz)`` | |
file has already been parsed, to avoid wasting time re-parsing. | |
https://www.geeksforgeeks.org/finding-md5-of-files-recursively-in-directory-in-python/ | |
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Outside diff range and nitpick comments (3)
doped/thermodynamics.py (3)
Line range hint
1371-1475
: Refactor theget_equilibrium_fermi_level
method to reduce complexity.The
get_equilibrium_fermi_level
method is quite complex and handles multiple scenarios and configurations. Consider breaking this method into smaller, more focused methods to improve readability and maintainability. For example, handling different types ofbulk_dos
inputs could be separated into distinct methods or functions.
Line range hint
1527-1868
: Enhance documentation for complex methods.The method
get_quenched_fermi_level_and_concentrations
is complex and multifunctional, which is great for flexibility but can be challenging to understand quickly. Enhancing the documentation with examples or more detailed explanations of the parameters and their interactions could greatly aid users in utilizing this method effectively.
3121-3130
: Standardize documentation format.The documentation for the
get_fermi_dos
function should follow the same formatting as other docstrings in the file for consistency. This includes a more detailed parameter description and potentially a small example or a more detailed return description.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (2)
tests/data/remote_baseline_plots/CdTe_LZ_Te_rich_Fermi_levels.png
is excluded by!**/*.png
tests/data/remote_baseline_plots/CdTe_LZ_Te_rich_concentrations.png
is excluded by!**/*.png
Files selected for processing (6)
- CHANGELOG.rst (1 hunks)
- docs/conf.py (1 hunks)
- doped/thermodynamics.py (29 hunks)
- doped/utils/symmetry.py (3 hunks)
- pyproject.toml (1 hunks)
- tests/test_generation.py (12 hunks)
Files skipped from review due to trivial changes (2)
- docs/conf.py
- pyproject.toml
Files skipped from review as they are similar to previous changes (2)
- doped/utils/symmetry.py
- tests/test_generation.py
Additional comments not posted (3)
CHANGELOG.rst (3)
8-8
: Ensure the description clearly explains the improvements made to the reverse-supercell-matrix determination.
6-7
: Ensure the description accurately reflects the changes discussed in the linked pull request.#!/bin/bash # Description: Verify the accuracy of the link and the description of the changes. # Test: Check if the link is reachable and contains relevant content. curl -o /dev/null --silent --head --fail https://github.com/materialsproject/pymatgen/pull/3879
9-11
: Confirm the refactor frombulk_band_gap_path
tobulk_band_gap_vr
is consistently applied across the codebase.#!/bin/bash # Description: Verify the consistency of the refactor across the codebase. # Test: Search for any remaining instances of the old parameter name. rg --type rst --type py 'bulk_band_gap_path'
VBM eigenvalue to use as Fermi level reference point for analysis. | ||
If None (default), will use ``"vbm"`` from the ``calculation_metadata`` | ||
dict attributes of the parsed ``DefectEntry`` objects, which by default | ||
is taken from the bulk supercell VBM (unless ``bulk_band_gap_vr`` is set | ||
during defect parsing). | ||
Note that ``vbm`` should only affect the reference for the Fermi level | ||
values output by ``doped`` (as this VBM eigenvalue is used as the zero | ||
reference), thus affecting the position of the band edges in the defect | ||
formation energy plots and doping window / dopability limit functions, | ||
and the reference of the reported Fermi levels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify the description of the vbm
parameter in the __init__
method.
The description for the vbm
parameter in the __init__
method is slightly misleading. It mentions that the vbm
is used as a reference for the Fermi level, which might confuse users into thinking it alters the Fermi level directly. It would be clearer to state that it adjusts the reference point for Fermi level calculations in the defect formation energy plots.
@@ -463,7 +469,7 @@ def __init__( | |||
el_refs (dict): | |||
Dictionary of elemental reference energies for the chemical potentials. | |||
vbm (float): | |||
VBM eigenvalue in the bulk supercell, to use as Fermi level reference. | |||
VBM energy to use as Fermi level reference point for analysis. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure consistency in the parameter description across the class.
The description of the vbm
parameter in the class attributes is slightly different from that in the __init__
method. It's crucial to maintain consistency to avoid confusion. Consider aligning the descriptions to ensure that they convey the same information.
if abs(fdos_vbm - self.vbm) > 0.05 and not skip_check: | ||
warnings.warn( | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fermi_dos_vbm:.2f} eV, with band " | ||
f"gap of {fermi_dos_band_gap:.2f} eV) differs from that of the bulk supercell " | ||
f"calculations ({self.vbm:.2f} eV, with band gap of {self.band_gap:.2f} eV) by more " | ||
f"than 0.1 eV. If this is only due to slight differences in kpoint sampling for the " | ||
f"bulk DOS vs defect supercell calculations, and consistent functional settings " | ||
f"(LHFCALC, AEXX etc) were used, then the eigenvalue references should be consistent " | ||
f"and this warning can be ignored. If not, then this could lead to inaccuracies in " | ||
f"the predicted Fermi level. Note that the Fermi level will be referenced to the VBM " | ||
f"of the bulk supercell (i.e. DefectThermodynamics.vbm)" | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fdos_vbm:.2f} eV, band gap = " | ||
f"{fdos_band_gap:.2f} eV) differs by >0.05 eV from `DefectThermodynamics.vbm/gap` " | ||
f"({self.vbm:.2f} eV, band gap = {self.band_gap:.2f} eV; which are taken from the bulk " | ||
f"supercell calculation by default, unless `bulk_band_gap_vr` is set during defect " | ||
f"parsing). If this is only due to differences in kpoint sampling for the bulk DOS vs " | ||
f"supercell calculations, then you should use the `bulk_band_gap_vr` option during " | ||
f"defect parsing to set the bulk band gap and VBM eigenvalue " | ||
f"(`DefectThermodynamics.gap/vbm`) to the correct values (though the absolute values of " | ||
f"predictions should not be affected as the eigenvalue references in the calculations " | ||
f"are consistent, just the reported Fermi levels will be referenced to " | ||
f"`DefectThermodynamics.vbm` which may not be the exact VBM position here).\n" | ||
f"Otherwise if this is due to changes in functional settings (LHFCALC, AEXX etc), then " | ||
f"the calculations should be redone with consistent settings to ensure accurate " | ||
f"predictions.\n" | ||
f"Note that the Fermi level will be always referenced to `DefectThermodynamics.vbm`!" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle potential discrepancies in VBM values gracefully.
The method _parse_fermi_dos
issues a warning if there is a significant difference between the VBM values from different sources. This is good, but it might be beneficial to provide a mechanism or recommendation on how to resolve such discrepancies, such as recalculating the values or adjusting parameters to ensure consistency.
to the VBM (using ``self.vbm``, which is the VBM of the `bulk supercell` | ||
calculation by default). If None (default), set to the mid-gap Fermi | ||
level (E_g/2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify the default behavior of the fermi_level
parameter.
The fermi_level
parameter in the get_equilibrium_concentrations
method defaults to the mid-gap Fermi level, but this isn't explicitly stated in the parameter's description. It's helpful to explicitly state default behaviors in the documentation to improve clarity and user understanding.
def _parse_fermi_dos( | ||
self, bulk_dos: Union[str, Vasprun, FermiDos], skip_check: bool = False | ||
) -> FermiDos: | ||
if isinstance(bulk_dos, FermiDos): | ||
fdos = bulk_dos | ||
# most similar settings to Vasprun.eigenvalue_band_properties: | ||
fdos_vbm = fdos.get_cbm_vbm(tol=1e-4, abs_tol=True)[1] # tol 1e-4 is lowest possible, as VASP | ||
fdos_band_gap = fdos.get_gap(tol=1e-4, abs_tol=True) # rounds the DOS outputs to 4 dp | ||
|
||
if isinstance(bulk_dos_vr, str): | ||
bulk_dos_vr = get_vasprun(bulk_dos_vr, parse_dos=True) | ||
if isinstance(bulk_dos, str): | ||
bulk_dos = get_vasprun(bulk_dos, parse_dos=True) # converted to fdos in next block | ||
|
||
fermi_dos_band_gap, _cbm, fermi_dos_vbm, _ = bulk_dos_vr.eigenvalue_band_properties | ||
if abs(fermi_dos_vbm - self.vbm) > 0.1: | ||
if isinstance(bulk_dos, Vasprun): # either supplied Vasprun or parsed from string there | ||
fdos_band_gap, _cbm, fdos_vbm, _ = bulk_dos.eigenvalue_band_properties | ||
fdos = get_fermi_dos(bulk_dos) | ||
|
||
if abs(fdos_vbm - self.vbm) > 0.05 and not skip_check: | ||
warnings.warn( | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fermi_dos_vbm:.2f} eV, with band " | ||
f"gap of {fermi_dos_band_gap:.2f} eV) differs from that of the bulk supercell " | ||
f"calculations ({self.vbm:.2f} eV, with band gap of {self.band_gap:.2f} eV) by more " | ||
f"than 0.1 eV. If this is only due to slight differences in kpoint sampling for the " | ||
f"bulk DOS vs defect supercell calculations, and consistent functional settings " | ||
f"(LHFCALC, AEXX etc) were used, then the eigenvalue references should be consistent " | ||
f"and this warning can be ignored. If not, then this could lead to inaccuracies in " | ||
f"the predicted Fermi level. Note that the Fermi level will be referenced to the VBM " | ||
f"of the bulk supercell (i.e. DefectThermodynamics.vbm)" | ||
f"The VBM eigenvalue of the bulk DOS calculation ({fdos_vbm:.2f} eV, band gap = " | ||
f"{fdos_band_gap:.2f} eV) differs by >0.05 eV from `DefectThermodynamics.vbm/gap` " | ||
f"({self.vbm:.2f} eV, band gap = {self.band_gap:.2f} eV; which are taken from the bulk " | ||
f"supercell calculation by default, unless `bulk_band_gap_vr` is set during defect " | ||
f"parsing). If this is only due to differences in kpoint sampling for the bulk DOS vs " | ||
f"supercell calculations, then you should use the `bulk_band_gap_vr` option during " | ||
f"defect parsing to set the bulk band gap and VBM eigenvalue " | ||
f"(`DefectThermodynamics.gap/vbm`) to the correct values (though the absolute values of " | ||
f"predictions should not be affected as the eigenvalue references in the calculations " | ||
f"are consistent, just the reported Fermi levels will be referenced to " | ||
f"`DefectThermodynamics.vbm` which may not be the exact VBM position here).\n" | ||
f"Otherwise if this is due to changes in functional settings (LHFCALC, AEXX etc), then " | ||
f"the calculations should be redone with consistent settings to ensure accurate " | ||
f"predictions.\n" | ||
f"Note that the Fermi level will be always referenced to `DefectThermodynamics.vbm`!" | ||
) | ||
return FermiDos(bulk_dos_vr.complete_dos, nelecs=get_nelect_from_vasprun(bulk_dos_vr)) | ||
return fdos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider enhancing error handling in the _parse_fermi_dos
method.
The method _parse_fermi_dos
could benefit from additional error handling. For instance, if the bulk_dos
input is neither a str
nor a Vasprun
object, the method might fail without a clear error message. Adding a type check and a clear error message would enhance the robustness and user-friendliness of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- doped/thermodynamics.py (31 hunks)
- tests/test_generation.py (12 hunks)
- tests/test_thermodynamics.py (8 hunks)
Files skipped from review as they are similar to previous changes (2)
- tests/test_generation.py
- tests/test_thermodynamics.py
Additional comments not posted (5)
doped/thermodynamics.py (5)
435-444
: Consider improving the clarity of thevbm
parameter description in the__init__
method.
472-472
: Ensure consistency in the description of thevbm
parameter across the class.
1230-1232
: Clarify the default behavior of thefermi_level
parameter.
1349-1365
: Handle potential discrepancies in VBM values gracefully in the_parse_fermi_dos
method.
1333-1367
: Consider enhancing error handling in the_parse_fermi_dos
method.
No description provided.