feat: switch to Pixi #133

Open · wants to merge 7 commits into base: main

Conversation

jjjermiah (Owner) commented Nov 24, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a new test for downloading a single series in the toolkit.
    • Added a configuration for managing coverage settings.
    • Implemented a new documentation build job in the CI/CD pipeline.
    • Added a new pixi.toml configuration file for project management.
    • Transitioned project configuration from Poetry to Hatch in pyproject.toml.
    • Enhanced the NBIAClient class with improved error handling and logging capabilities.
    • Introduced new data models for handling patient, study, and series information in the NBIA toolkit.
    • Added support for improved DICOM element string handling.
  • Bug Fixes

    • Improved error handling in the logout functionality to enhance security.
  • Documentation

    • Added new configuration files for Pixi and Ruff, detailing project dependencies and linting rules.
  • Tests

    • Updated assertions in existing tests for clarity and best practices.
    • Added a new test case for the downloadSingleSeries method.
  • Chores

    • Expanded the .gitignore file to include additional files related to Pixi and Python package metadata.


coderabbitai bot commented Nov 24, 2024

Walkthrough

The changes in this pull request involve multiple updates across configuration files and code, primarily aimed at enhancing project structure and functionality. Key modifications include the introduction of new configuration files for coverage, linting, and documentation, along with updates to the CI/CD workflow to accommodate broader branch triggers and new testing strategies. Additionally, there are adjustments to the handling of Python package dependencies and the addition of new tests to validate functionality within the nbiatoolkit package.

Changes

  • .gitattributes: Added entry for pixi.lock file to specify YAML syntax highlighting and mark it as generated.
  • .github/workflows/main.yml: Modified CI/CD pipeline to allow pushes to any branch, updated Unit-Tests job strategy, added Build-Documentation job, and commented out several job dependencies.
  • .gitignore: Added entries to ignore .pixi and *.egg-info files.
  • config/coverage.toml: Introduced new configuration file for coverage settings with sections for running and reporting coverage.
  • config/mypy.ini: Added section to exclude tests/ directory from type checking.
  • config/pytest.ini: Introduced section to ignore pytest.PytestWarning warnings during test execution.
  • config/ruff.toml: Added new configuration file for Ruff linter with rules, exclusions, and formatting preferences.
  • old_pyproject.toml: Introduced new pyproject.toml for nbiatoolkit, detailing package metadata, dependencies, and semantic versioning.
  • pixi.toml: Added new configuration file for the nbia-toolkit project, defining project metadata, dependencies, environments, and tasks.
  • pyproject.toml: Transitioned from Poetry to Hatch configuration, updating package name, dependencies, and build specifications.
  • src/nbiatoolkit/auth.py: Modified import statements and error handling in the logout method, ensuring sensitive information is cleared upon logout.
  • tests/test_auth.py: Simplified assertions related to token expiration and logout functionality for improved readability.
  • tests/test_downloadSeries.py: Added new test function test_downloadSingleSeries to validate the downloadSingleSeries method functionality.
  • src/nbiatoolkit/models/base.py: Introduced abstract base classes AbstractModel and AbstractListModel to enhance model functionality and data manipulation.
  • src/nbiatoolkit/models/nbia_responses.py: Added data models for Patient, Study, and Series, along with their respective list classes, and implemented custom validators for date fields.
  • src/nbiatoolkit/nbia.py: Enhanced NBIAClient class methods, improved error handling, and added logging capabilities.
  • devnotes/model_responses.ipynb: Introduced a Jupyter Notebook for asynchronous data fetching from an API using aiohttp, including logging and structured data processing.

Poem

In the garden where code does bloom,
New configs and tests dispel the gloom.
With Pixi and Hatch, we dance and play,
Ensuring our toolkit grows day by day.
So hop with joy, let’s celebrate,
For every change, we cultivate! 🐇✨



coderabbitai bot left a comment

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (16)
pyproject.toml (2)

10-10: Consider relaxing Python version constraint

The Python version requirement >=3.10,<4 is quite strict. Consider if supporting Python 3.9 would be beneficial for wider compatibility, especially for users on older LTS distributions.


Line range hint 53-54: Review branch pattern for semantic releases

The branch pattern (main|master|development) includes the development branch. Consider if you want to trigger releases from the development branch, as this is typically not recommended for semantic versioning.

old_pyproject.toml (2)

31-37: Review dependency version constraints

While setting minimum versions is good practice, some considerations:

  1. The comment on line 28 about requests "min 2.31 instead of hard lock" conflicts with the ^2.31.0 syntax which does create an upper bound
  2. Consider if strict upper bounds are necessary for all dependencies, especially for a library

1-102: Migration checklist for Poetry to Pixi transition

Since this file is being replaced as part of the migration to Pixi, please ensure:

  1. All dependencies are correctly transferred with appropriate version constraints
  2. CLI entry points are preserved or documented if changing
  3. Build system configuration is updated for the new tooling
  4. CI/CD pipelines are updated to use Pixi instead of Poetry
  5. Documentation is updated to reflect the new package name and installation process
tests/test_auth.py (1)

81-90: Consider reordering assertions for better test structure

While the new assertions improve test coverage, consider reordering them to check critical state first:

 def test_logout(oauth: OAuth2) -> None:
     oauth.logout()
     assert oauth.access_token is None
-    assert oauth.refresh_token == ""
-    assert oauth.refresh_expiry is None
-    assert oauth.expiry_time is None
-    assert oauth.api_headers == {
-        "Authorization": "Bearer None",
-        "Content-Type": "application/json",
-    }
-    assert oauth.token_expiration_time is None
-    assert oauth.refresh_expiration_time is None
-    assert oauth.token_scope is None
+    # Critical state checks first
+    assert oauth.token_expiration_time is None
+    assert oauth.refresh_expiration_time is None
+    assert oauth.token_scope is None
+    assert oauth.expiry_time is None
+    assert oauth.refresh_expiry is None
+    # Then check tokens
+    assert oauth.refresh_token == ""
+    # Finally check derived state
+    assert oauth.api_headers == {
+        "Authorization": "Bearer None",
+        "Content-Type": "application/json",
+    }

This reordering groups related assertions together and checks the most important state first, making the test more maintainable and easier to debug when failures occur.

config/ruff.toml (3)

1-23: Consider tracking the linting migration progress

The phased approach to implementing linting is good practice. However, to ensure the excluded directories don't remain permanently unlinted, consider:

  1. Adding TODO comments with target dates for including each excluded directory
  2. Creating GitHub issues to track the linting migration progress

Would you like me to help create GitHub issues to track the linting migration for each excluded directory?


95-105: Enhance documentation for ignored rules

While most ignore reasons are clear, consider expanding the documentation for formatting conflicts:

  • Add specific examples of the COM812 conflicts
  • Link to any relevant project discussions/decisions
  # Ignored because https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules 
- "COM812", # https://docs.astral.sh/ruff/rules/missing-trailing-comma/#missing-trailing-comma-com812
+ "COM812", # Trailing comma conflicts with formatter, see:
+           # 1. https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules
+           # 2. https://docs.astral.sh/ruff/rules/missing-trailing-comma/#missing-trailing-comma-com812

107-116: Clean up duplicate configuration and verify indentation preference

  1. Remove the commented-out duplicate configuration:
- # [lint.pydocstyle]
- # convention = "numpy"
  2. Consider using spaces instead of tabs for better PEP 8 compliance:
- indent-style = "tab"
+ indent-style = "space"
pixi.toml (3)

4-4: Update the project description

The current description "Add a short description here" is a placeholder. Please replace it with a meaningful description of the project.


12-12: Consider tightening Python version constraints

The current constraint python = "<3.13" allows any Python version below 3.13. Consider setting both lower and upper bounds (e.g., ">=3.10,<3.13") to ensure compatibility with the Python versions defined in your environments.


33-35: Consider adding version constraints for dependencies

Several dependencies use "*" as version constraint, which could lead to compatibility issues. Consider adding specific version constraints for:

  • ipython
  • ipykernel
  • jupyterlab
  • pytest
  • pytest-cov
  • pytest-xdist

This helps ensure reproducible environments across different installations.

Also applies to: 50-52

.github/workflows/main.yml (3)

7-7: Consider limiting branch triggers to reduce CI resource usage

While running CI on all branches ("*") helps catch issues early, it might consume significant CI minutes. Consider limiting to specific branch patterns (e.g., feature/*, bugfix/*) if CI resource usage becomes a concern.


26-32: Review Pixi setup configuration

The Pixi setup looks good with:

  • ✅ Pinned Pixi version for stability
  • ✅ Environment-specific caching
  • ✅ Matrix-based environment selection

However, setting locked: false might lead to inconsistent dependencies across different runs.

Consider setting locked: true once the dependency set is stable to ensure reproducible builds.


34-36: Consider adding test result artifacts

While the coverage report is being uploaded, consider also uploading test results for better debugging of failures.

Add these steps after the test run:

- name: Upload test results
  if: always()  # Upload even if tests fail
  uses: actions/upload-artifact@v4
  with:
    name: test-results-${{ matrix.os }}-${{ matrix.env }}
    path: |
      test-results/
      pytest-report.xml
src/nbiatoolkit/auth.py (2)

299-300: Improve error handling documentation

The comment should better explain why we're suppressing the error and what the expected behavior will be once TCIA implements the functionality.

Consider updating the comment to be more descriptive:

- pass # WAIT UNTIL TCIA IMPLEMENTS LOGOUT FUNCTIONALITY
+ pass  # Suppress HTTP errors until TCIA implements proper logout endpoint

Line range hint 301-315: Critical issues in logout implementation

Several issues need to be addressed:

  1. Setting self = None is incorrect and has no effect as it only modifies the local reference.
  2. The method can be more concise by using a dictionary to reset attributes.
  3. The explicit return None is redundant.

Here's a suggested implementation:

     def logout(self) -> None:
         if not self.access_token:
-            return None
+            return
 
         query_url = NBIA_BASE_URLS.LOGOUT_URL.value
         response = requests.get(query_url, headers=self.api_headers)
         try:
             response.raise_for_status()
         except requests.exceptions.HTTPError:
-            pass # WAIT UNTIL TCIA IMPLEMENTS LOGOUT FUNCTIONALITY
+            pass  # Suppress HTTP errors until TCIA implements proper logout endpoint
         finally:
-            # set the entire object to None
-            self.__dict__.clear()
-            self.username = ""
-            self.password = ""
-            self.client_id = ""
-            self.base_url = ""
-            self._access_token = ""
-            self.expiry_time = None
-            self.refresh_expiry = None
-            self.refresh_token = ""
-            self.scope = None
-            self._fernet_key = b""
-            self = None
-            return None
+            # Reset all attributes to their default values
+            defaults = {
+                'username': "",
+                'password': "",
+                'client_id': "",
+                'base_url': "",
+                '_access_token': "",
+                'expiry_time': None,
+                'refresh_expiry': None,
+                'refresh_token': "",
+                'scope': None,
+                '_fernet_key': b"",
+            }
+            for attr, value in defaults.items():
+                setattr(self, attr, value)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d8f4bb4 and eac1090.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • .gitattributes (1 hunks)
  • .github/workflows/main.yml (2 hunks)
  • .gitignore (1 hunks)
  • config/coverage.toml (1 hunks)
  • config/mypy.ini (1 hunks)
  • config/pytest.ini (1 hunks)
  • config/ruff.toml (1 hunks)
  • old_pyproject.toml (1 hunks)
  • pixi.toml (1 hunks)
  • pyproject.toml (2 hunks)
  • src/nbiatoolkit/auth.py (2 hunks)
  • tests/test_auth.py (3 hunks)
  • tests/test_downloadSeries.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/test_downloadSeries.py
✅ Files skipped from review due to trivial changes (5)
  • .gitattributes
  • .gitignore
  • config/coverage.toml
  • config/mypy.ini
  • config/pytest.ini
🔇 Additional comments (15)
pyproject.toml (1)

36-38: Verify the version path exists

The semantic release configuration references a version variable in src/nbiatoolkit/nbia.py. This path should be updated to match the new package name.

✅ Verification successful

Version path in semantic release configuration is correct

The version path src/nbiatoolkit/nbia.py:__version__ in the semantic release configuration is valid. The file exists and contains the __version__ variable as expected.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the version variable exists in the specified file
fd "nbia.py" src -x grep -l "__version__" {}

Length of output: 68


Script:

#!/bin/bash
# Check the actual content of the file to verify the __version__ variable
cat src/nbiatoolkit/nbia.py | grep -A 1 "__version__"

Length of output: 103

old_pyproject.toml (3)

1-15: Package name inconsistency with planned migration

The AI summary indicates this package is being renamed from nbiatoolkit to nbia-toolkit as part of the Poetry to Hatch migration. However, this file still uses the old name. Please ensure the new Hatch configuration uses the updated package name consistently.


56-102: Update semantic release paths for new structure

The semantic release configuration contains paths that need updating:

  1. src/nbiatoolkit/nbia.py should be updated to reflect the new package name
  2. pyproject.toml:tool.poetry.version needs updating for Hatch
  3. Consider if the commit parsing rules align with your team's new conventions

17-27: Verify CLI functionality preservation

The AI summary indicates these CLI scripts are being removed in the migration to Hatch. Please ensure:

  1. All these CLI functionalities are preserved in the new configuration
  2. Documentation is updated to reflect any changes in CLI usage
✅ Verification successful

Let me gather more information to verify if these CLI functionalities are preserved in the new configuration.


Let me check the actual CLI function implementations to verify if they are being maintained.


CLI functionality is preserved in the migration

The verification shows that:

  1. All CLI entry points listed in the old pyproject.toml are present in the new configuration with identical mappings
  2. The corresponding CLI function implementations exist in src/nbiatoolkit/nbia_cli.py and are maintained
  3. No functionality appears to be removed or changed in the migration
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for CLI entry points in new configuration files
fd -e toml -e yaml -x cat {} \; | rg -A 5 "scripts|entry_points|console_scripts"

Length of output: 371


Script:

#!/bin/bash
# Check for new CLI entry points in pyproject.toml
fd pyproject.toml -x cat {} \; | rg -A 10 "\[project.scripts\]|\[tool.poetry.scripts\]"

# Check for any new CLI-related files
fd -e py -x rg -l "def.*_cli" {}

Length of output: 869


Script:

#!/bin/bash
# Check the CLI implementations in the source files
rg -A 3 "def.*_cli" ./src/nbiatoolkit/nbia_cli.py

# Check if there's a new CLI module
fd -e py -x rg -l "entry_point|cli\.py" {}

Length of output: 1457

tests/test_auth.py (3)

30-32: LGTM! Improved assertion style

The changes follow Python best practices by using not and is True instead of explicit boolean comparisons. The test logic effectively covers both expired and non-expired token states.


54-54: LGTM! Consistent assertion style

The assertion style matches the earlier improvements and correctly verifies token validity after refresh.


75-75: LGTM! Maintains consistent assertion style

The assertion style remains consistent and properly verifies that newly requested tokens are valid.

config/ruff.toml (1)

26-26: LGTM!

The 100-character line length is a good choice, balancing code readability with modern screen sizes.

pixi.toml (4)

1-160: Overall configuration looks good with some improvements needed

The Pixi configuration provides a comprehensive setup for development, testing, documentation, and quality tools. However, please address the following:

  1. Update the project description
  2. Resolve the documentation system choice (Sphinx vs MkDocs)
  3. Ensure all referenced configuration files exist
  4. Consider adding version constraints for dependencies
  5. Verify Python version compatibility

Once these are addressed, this will provide a solid foundation for the project's development environment.


125-136: Verify existence of quality tool configuration files

The quality tools reference external configuration files that need to be verified.

✅ Verification successful

Configuration files for quality tools are present and correctly referenced

Both configuration files referenced in the quality tool tasks exist in the expected locations:

  • config/ruff.toml for Ruff linting and formatting
  • config/mypy.ini for MyPy type checking
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for existence of quality tool configuration files

echo "Checking for quality tool configuration files..."
for file in "config/ruff.toml" "config/mypy.ini"; do
  if [ -f "$file" ]; then
    echo "✓ Found: $file"
  else
    echo "✗ Missing: $file"
  fi
done

Length of output: 387


63-64: Verify existence of test configuration files

The test configuration references external files that need to be verified.

✅ Verification successful

Test configuration files are present and properly formatted

Both referenced configuration files exist and contain valid configuration settings:

  • config/coverage.toml has the expected coverage tool sections
  • config/pytest.ini contains valid pytest configuration with warning filters
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for existence of test configuration files

echo "Checking for test configuration files..."
for file in "config/coverage.toml" "config/pytest.ini"; do
  if [ -f "$file" ]; then
    echo "✓ Found: $file"
  else
    echo "✗ Missing: $file"
  fi
done

Length of output: 401


Script:

#!/bin/bash
# Let's verify the content of these configuration files to ensure they are properly formatted
echo "=== Content of config/coverage.toml ==="
cat config/coverage.toml

echo -e "\n=== Content of config/pytest.ini ==="
cat config/pytest.ini

Length of output: 330


151-160: Verify build configuration and dependencies

The build configuration references pyproject.toml and uses semantic-release. Let's verify the setup.

✅ Verification successful

Let me verify the semantic-release configuration details and hatch build settings to ensure they are properly configured.


Build configuration is properly set up and aligned

The build configuration in pixi.toml is correctly configured:

  • Hatch is properly set up as the build backend with hatchling.build
  • Build outputs are correctly directed to dist/* as configured in both pixi.toml and pyproject.toml (directory = "dist")
  • Semantic-release is properly configured with version tracking and dist path settings
  • All necessary build dependencies and configurations are present
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for existence of build configuration files and validate semantic-release setup

# Check for pyproject.toml
if [ -f "pyproject.toml" ]; then
  echo "✓ Found: pyproject.toml"
else
  echo "✗ Missing: pyproject.toml"
fi

# Check for semantic-release configuration
if grep -r "semantic_release" .; then
  echo "✓ Found semantic-release configuration"
else
  echo "✗ Missing semantic-release configuration"
fi

Length of output: 795


Script:

#!/bin/bash
# Check hatch configuration in pyproject.toml
echo "=== Hatch Configuration ==="
grep -A 10 "\[tool\.hatch" pyproject.toml

echo -e "\n=== Semantic Release Configuration ==="
grep -A 10 "\[tool\.semantic_release\]" pyproject.toml

echo -e "\n=== Project Build Configuration ==="
grep -A 5 "\[build-system\]" pyproject.toml

Length of output: 1025

.github/workflows/main.yml (2)

20-20: Verify Windows compatibility plans

The removal of windows-latest from the test matrix might impact Windows users. Please confirm if this is temporary and if there are plans to restore Windows support.

✅ Verification successful

Let me gather more specific information about Windows-related code and dependencies.


Let me check the commit message and PR description for any context about Windows support.


Windows support is temporarily removed with explicit comment

The workflow file explicitly indicates that Windows support is temporarily removed with the comment "windows-latest removed for now". Since there's no Windows-specific code that would break without Windows CI, and the comment clearly indicates this is a temporary measure, this change appears to be an intentional temporary modification.

  • The inline comment "# , windows-latest removed for now" in the workflow file indicates this is a known temporary change
  • No critical Windows-specific code patterns were found that would require immediate Windows testing
  • The change is properly documented in the workflow file
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any Windows-specific code or dependencies
rg -l 'win32|winreg|msvcrt|nt|_winapi'

Length of output: 1746


Script:

#!/bin/bash
# Check for Windows-specific code patterns in source files
rg -C 3 'if.*platform.*windows|if.*sys\.platform.*win|os\.name.*nt' src/

# Check for Windows-specific dependencies or configurations in project files
rg -C 3 'windows|win32|win64' pyproject.toml pixi.toml

Length of output: 131


Script:

#!/bin/bash
# Check recent commit messages and PR description for Windows-related changes
git log -n 5 --grep="windows\|Windows"

# Check if there are any open issues related to Windows
gh issue list --search "windows in:title,body"

# Check if there are any comments in the workflow file about Windows
rg -C 3 "windows" .github/workflows/

Length of output: 496


85-239: Clarify plans for deployment jobs

A significant portion of the CI/CD pipeline (PyPI publishing, Docker builds, installation tests) is commented out. Please clarify:

  1. Are these jobs temporarily disabled during the Pixi migration?
  2. Will they be updated to use Pixi?
  3. Is there a timeline for re-enabling these jobs?
src/nbiatoolkit/auth.py (1)

4-4: Verify removal of NBIA_ENDPOINTS import

The change from relative to absolute import is good. However, let's verify that NBIA_ENDPOINTS is not used elsewhere in the codebase.

✅ Verification successful

NBIA_ENDPOINTS is still actively used in the codebase

The removal of the import in auth.py appears to be correct as there are no direct usages of NBIA_ENDPOINTS in this file. However, NBIA_ENDPOINTS is extensively used throughout the codebase, particularly in:

  • src/nbiatoolkit/nbia.py for API endpoint management
  • src/nbiatoolkit/utils/nbia_endpoints.py where it's defined
  • Various other modules that import and use it
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining usage of NBIA_ENDPOINTS
rg "NBIA_ENDPOINTS" --type py

Length of output: 2423

pyproject.toml (Outdated)
Comment on lines 12 to 19
dependencies = [
"requests>=2.31.0,<3",
"pydicom>=3.0",
"tqdm>=4.66.1",
"beautifulsoup4>=4.12.3",
"cryptography>=42.0.2",
"pandas>=2.2.0",
]

💡 Codebase verification

Update cryptography package to version 42.0.4 or higher to address security vulnerability

A recent high-severity vulnerability (published Feb 21, 2024) affects cryptography versions >= 38.0.0, < 42.0.4, involving NULL pointer dereference in pkcs12.serialize_key_and_certificates. The current constraint "cryptography>=42.0.2" includes affected versions.

  • Update the dependency constraint in pyproject.toml to: "cryptography>=42.0.4"
🔗 Analysis chain

Dependencies look well-structured with appropriate version constraints

The dependencies section follows best practices with both lower and upper version bounds. Security-critical packages are pinned to recent versions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for known vulnerabilities in cryptography package
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "cryptography", severities: [CRITICAL, HIGH]) {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
    }
  }
}'

Length of output: 1327

Comment on lines +30 to +93
select = [
###########################################################################
# TYPE ANNOTATIONS
# Ensure all functions have type annotations
# https://docs.astral.sh/ruff/rules/#flake8-annotations-ann
"ANN",
# Use type hinting consistently
# https://docs.astral.sh/ruff/rules/#flake8-type-checking-tch
"TCH",

###########################################################################
# IMPORTS
# Sort imports naturally
# https://docs.astral.sh/ruff/rules/#isort-i
"I",
# Follow import conventions
# https://docs.astral.sh/ruff/rules/#flake8-import-conventions-icn
"ICN",
# Clean up and organize imports
# https://docs.astral.sh/ruff/rules/#flake8-tidy-imports-tid
"TID",

###########################################################################
# CODE QUALITY
# Detect possible bugs, like unused variables or exception handling issues
# https://docs.astral.sh/ruff/rules/#flake8-bugbear-b
"B",
# Avoid using Python builtins incorrectly
# https://docs.astral.sh/ruff/rules/#flake8-builtins-a
"A",
# Enforce correct usage of commas in lists, tuples, etc.
# https://docs.astral.sh/ruff/rules/#flake8-commas-com
"COM",
# Prevent use of debugging code, like breakpoints
# https://docs.astral.sh/ruff/rules/#flake8-debugger-t10
"T10",
# Disallow print statements
# https://docs.astral.sh/ruff/rules/#flake8-print-t20
"T20",
# Provide clear and explanatory error messages
# https://docs.astral.sh/ruff/rules/#flake8-errmsg-em
"EM",

###########################################################################
# STANDARDS & STYLE
# Prefer pathlib for path manipulation
# https://docs.astral.sh/ruff/rules/#flake8-use-pathlib-pth
"PTH",
# Adhere to Pylint conventions
# https://docs.astral.sh/ruff/rules/#pylint-pl
"PL",
# Simplify code to reduce complexity
# https://docs.astral.sh/ruff/rules/#flake8-simplify-sim
"SIM",
# errors like undefined names and unused imports without enforcing style rules.
# https://docs.astral.sh/ruff/rules/#pyflakes-f
"F",
#
# https://docs.astral.sh/ruff/rules/#pep8-naming-n
"N",
# Pydocstyle
# https://docs.astral.sh/ruff/rules/#pydocstyle-d
# "D",
]

💡 Codebase verification

Consider relaxing T20 rule for test files

The T20 rule (no print statements) will affect existing test files, particularly test_tags.py which uses print statements for test feedback. Consider either:

  • Excluding test files from T20 using extend-exclude in ruff.toml
  • Or replacing print statements with proper test assertions/logging (see the sketch after this list)

Example locations requiring attention:

  • tests/test_tags.py: Contains 5 instances of print statements used for test feedback
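
As a rough sketch of the logging option above (the test name, fixture, and assertion here are hypothetical, not taken from tests/test_tags.py), print output can be routed through the standard logging module:

import logging

logger = logging.getLogger(__name__)


def test_series_tags_feedback(series_tags_df) -> None:
    # Emit diagnostic output via logging instead of print(); the T20 rule then
    # passes, and pytest can still surface it with --log-cli-level=DEBUG.
    logger.debug("Series tags dataframe shape: %s", series_tags_df.shape)
    assert not series_tags_df.empty
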
🔗 Analysis chain

Verify compatibility with development workflow

The rule selection is comprehensive and well-documented. However, please verify:

  1. T20 (no print statements) might be too restrictive for development/debugging
  2. Ensure these rules align with Pixi's environment management practices

Run this script to check for print statements in test files that might be affected:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for print statements in test files that might be affected by T20 rule

# Search for print statements in test files
rg "print\(" "tests/"

Length of output: 296

Comment on lines +1 to +116
"src/imgtools/utils/**/*.py",
"src/imgtools/modules/**/*.py",
"src/imgtools/transforms/**/*.py",
"src/imgtools/autopipeline.py",
"src/imgtools/pipeline.py",
"src/imgtools/image.py",
]


line-length = 100

[lint]

select = [
###########################################################################
# TYPE ANNOTATIONS
# Ensure all functions have type annotations
# https://docs.astral.sh/ruff/rules/#flake8-annotations-ann
"ANN",
# Use type hinting consistently
# https://docs.astral.sh/ruff/rules/#flake8-type-checking-tch
"TCH",

###########################################################################
# IMPORTS
# Sort imports naturally
# https://docs.astral.sh/ruff/rules/#isort-i
"I",
# Follow import conventions
# https://docs.astral.sh/ruff/rules/#flake8-import-conventions-icn
"ICN",
# Clean up and organize imports
# https://docs.astral.sh/ruff/rules/#flake8-tidy-imports-tid
"TID",

###########################################################################
# CODE QUALITY
# Detect possible bugs, like unused variables or exception handling issues
# https://docs.astral.sh/ruff/rules/#flake8-bugbear-b
"B",
# Avoid using Python builtins incorrectly
# https://docs.astral.sh/ruff/rules/#flake8-builtins-a
"A",
# Enforce correct usage of commas in lists, tuples, etc.
# https://docs.astral.sh/ruff/rules/#flake8-commas-com
"COM",
# Prevent use of debugging code, like breakpoints
# https://docs.astral.sh/ruff/rules/#flake8-debugger-t10
"T10",
# Disallow print statements
# https://docs.astral.sh/ruff/rules/#flake8-print-t20
"T20",
# Provide clear and explanatory error messages
# https://docs.astral.sh/ruff/rules/#flake8-errmsg-em
"EM",

###########################################################################
# STANDARDS & STYLE
# Prefer pathlib for path manipulation
# https://docs.astral.sh/ruff/rules/#flake8-use-pathlib-pth
"PTH",
# Adhere to Pylint conventions
# https://docs.astral.sh/ruff/rules/#pylint-pl
"PL",
# Simplify code to reduce complexity
# https://docs.astral.sh/ruff/rules/#flake8-simplify-sim
"SIM",
# errors like undefined names and unused imports without enforcing style rules.
# https://docs.astral.sh/ruff/rules/#pyflakes-f
"F",
#
# https://docs.astral.sh/ruff/rules/#pep8-naming-n
"N",
# Pydocstyle
# https://docs.astral.sh/ruff/rules/#pydocstyle-d
# "D",
]

ignore = [
# allow self to not need type annotations
"ANN101",
# Allow too many arguments for functions
"PLR0913",
# Public Module Docstrings
"D100",
# Ignored because https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules
"COM812", # https://docs.astral.sh/ruff/rules/missing-trailing-comma/#missing-trailing-comma-com812
"D206"
]

# [lint.pydocstyle]
# convention = "numpy"
[lint.pydocstyle]
convention = "numpy"

[format]

quote-style = "single"
indent-style = "tab"
docstring-code-format = true

💡 Codebase verification

Add security and performance-focused rules to the configuration

Based on the codebase analysis, I recommend adding the following rule sets to the configuration:

  • Add S rules (flake8-bandit) for security checks, as the codebase uses os, subprocess, and other security-sensitive modules
  • Add PERF rules for performance checks
  • Add UP rules (pyupgrade) to ensure modern Python patterns
  • Add RUF rules for Ruff-specific checks

Specific locations to update in config/ruff.toml:

  • Add to the select section:
  # Security checks
  "S",
  # Performance
  "PERF",
  # Python Upgrades
  "UP",
  # Ruff-specific
  "RUF",
🔗 Analysis chain

Verify coverage of essential Python rules

Please verify that no critical rules are missing, particularly:

  • Security-related rules (e.g., bandit equivalents)
  • Performance rules
  • Additional type checking rules
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check available rules that might be beneficial to add

# List all available rules and compare with current selection
ruff linter | grep -vE "$(rg '^  "[A-Z0-9]+",' config/ruff.toml | tr -d ' ",' | paste -sd'|' -)"

Length of output: 292


Script:

#!/bin/bash
# Let's analyze the ruff configuration more systematically

# 1. Check if there are any security-related imports or patterns
rg -i "import.*os|import.*subprocess|import.*eval|exec\(|shell\s*=\s*True" --type py

# 2. Check for potential performance-critical patterns
rg "\.copy\(\)|\.deepcopy\(\)|while\s+True|^\s*time\.sleep" --type py

# 3. Look for type annotation patterns
ast-grep --pattern 'from typing import $_'

# 4. Check actual Python files structure to understand coverage
fd -e py -d 3

Length of output: 1740

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
src/nbiatoolkit/dicomtags/tags.py (3)

170-171: LGTM! Consider additional validation for element format.

The escaping of parentheses is correct and necessary to prevent regex pattern errors when searching for DICOM elements.

Consider adding format validation before escaping to ensure the element string follows the DICOM format (<group>,<element>):

 def subsetSeriesTags(series_tags_df: pd.DataFrame, element: str) -> pd.DataFrame:
+    if not element.strip().startswith("(") or not element.strip().endswith(")"):
+        raise ValueError(f"Invalid element format. Expected (<group>,<element>), got {element}")
     escaped_element = element.replace("(", r"\(").replace(")", r"\)")

245-247: LGTM! Consider extracting escaping logic to a utility function.

The escaping implementation is correct but duplicates logic from subsetSeriesTags. Consider extracting this into a reusable utility function.

Consider this refactor to improve code reuse:

+def escape_dicom_element(element: str) -> str:
+    """
+    Escapes special characters in DICOM element strings for safe regex operations.
+    
+    Args:
+        element (str): The DICOM element string to escape.
+        
+    Returns:
+        str: The escaped DICOM element string.
+    """
+    return element.replace("(", r"\(").replace(")", r"\)")
+
 def subsetSeriesTags(series_tags_df: pd.DataFrame, element: str) -> pd.DataFrame:
-    escaped_element = element.replace("(", r"\(").replace(")", r"\)")
+    escaped_element = escape_dicom_element(element)
     locs = series_tags_df[series_tags_df["element"].str.contains(escaped_element)]

 def getReferencedSeriesUIDS(series_tags_df: pd.DataFrame) -> List[str]:
-    escaped_element = SeriesInstanceUID_element.replace("(", r"\(").replace(")", r"\)")
+    escaped_element = escape_dicom_element(SeriesInstanceUID_element)
     value: pd.DataFrame = search_space[search_space["element"].str.contains(escaped_element)]

Remove or properly implement the commented-out getRTSTRUCT_ROI_info function

The commented-out function is incomplete and references a client object that should be an instance of the NBIA class from nbiatoolkit.nbia. This class has the required getDICOMTags method. Since other RTSTRUCT-related functions in the file are properly implemented and active, this commented-out function should either be:

  • Removed if it's no longer needed
  • Properly implemented by accepting an NBIA client instance as a parameter
🔗 Analysis chain

Line range hint 373-386: Address or remove the commented-out function.

The commented-out getRTSTRUCT_ROI_info function references an undefined client. If this is part of the Pixi migration, consider either implementing it or removing it to maintain code cleanliness.

Let's check if this client is defined elsewhere:


Let me gather more context about this client usage:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for client definition or imports
rg -g '!*.pyc' -g '!*{.json,.md}' "client\." "src/"

Length of output: 402


Script:

#!/bin/bash
# Search for getDICOMTags function and its usage
rg -g '!*.pyc' -g '!*{.json,.md}' "getDICOMTags" "src/" -A 3

# Look for any RTSTRUCT related functions
rg -g '!*.pyc' -g '!*{.json,.md}' "RTSTRUCT" "src/" -A 3

Length of output: 3356

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between eac1090 and 0b22aa5.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • pixi.toml (1 hunks)
  • src/nbiatoolkit/dicomtags/tags.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pixi.toml

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
.github/workflows/main.yml (4)

7-7: Consider limiting branch triggers to reduce CI usage

While running CI on all branches ("*") is helpful for development, it might consume more CI minutes than necessary. Consider limiting to specific patterns like ["main", "development", "feature/**", "bugfix/**"] if CI minutes become a concern.


20-20: Document Windows support status

The Windows runner has been removed from the matrix. Please document this limitation in the project's README or documentation, including any plans to restore Windows support in the future.


69-70: Maintain consistent action versions

The job uses actions/checkout@v3 while other jobs use v4. Consider updating to maintain consistency:

- uses: actions/checkout@v3
+ uses: actions/checkout@v4

85-239: Document deployment pipeline status

The entire deployment pipeline (PyPI, Docker, testing) has been commented out. Please:

  1. Add a comment explaining why these jobs are disabled
  2. Create an issue to track the re-enablement of these jobs
  3. Update the project documentation to reflect the current deployment status

This will help maintain transparency about the project's CI/CD capabilities during the Pixi migration.

Would you like me to help create an issue to track the re-enablement of these deployment jobs?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0b22aa5 and 310d089.

📒 Files selected for processing (2)
  • .github/workflows/main.yml (3 hunks)
  • pyproject.toml (2 hunks)
🧰 Additional context used
📓 Learnings (1)
pyproject.toml (1)
Learnt from: jjjermiah
PR: jjjermiah/nbia-toolkit#133
File: pyproject.toml:1-2
Timestamp: 2024-11-24T21:15:17.218Z
Learning: The source code directory `src/nbiatoolkit/` is intentionally named differently from the package name `nbia-toolkit`.
🔇 Additional comments (6)
pyproject.toml (5)

1-10: LGTM! Project metadata is well-structured

The project metadata follows best practices with clear name, version, description, and appropriate Python version constraints.


12-19: LGTM! Dependencies are well-defined with appropriate version constraints

Dependencies are properly structured with both lower and upper version bounds where needed. The cryptography package version (>=43.0.3) is secure and up-to-date.


21-31: LGTM! Build system configuration is correct

The build system configuration is properly set up with hatchling, and the package path correctly points to the source directory.


Line range hint 38-79: LGTM! Semantic release configuration is comprehensive

The semantic release configuration is well-structured with appropriate:

  • Commit type categorization
  • Changelog management
  • Branch matching patterns

35-37: 🛠️ Refactor suggestion

Update version variable path to match package name

The version variable path still references the old module path. Consider updating it to match the package structure:

version_variables = [
-    "src/nbiatoolkit/nbia.py:__version__",
+    "src/nbia_toolkit/nbia.py:__version__",
]
.github/workflows/main.yml (1)

29-32: Review Pixi configuration settings

The current Pixi setup has locked: false which might lead to inconsistent environments across different CI runs. Consider:

  1. Setting locked: true and committing the pixi.lock file
  2. Or document why locked mode is disabled (e.g., if it's due to platform-specific dependencies)

Let's check if the lock file exists and is being tracked:


codecov bot commented Nov 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.80%. Comparing base (af6d30c) to head (47722f2).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
- Coverage   85.84%   85.80%   -0.04%     
==========================================
  Files          27       27              
  Lines        1455     1451       -4     
==========================================
- Hits         1249     1245       -4     
  Misses        206      206              

☔ View full report in Codecov by Sentry.

coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (9)
src/nbiatoolkit/__init__.py (1)

Line range hint 19-28: Consider implementing a deprecation strategy

Since both NBIA_ENDPOINTS and NBIA_BASE_URLS are exposed in __all__, consider:

  1. Adding a deprecation warning when NBIA_ENDPOINTS is used
  2. Documenting the transition in the changelog
  3. Planning for eventual removal of NBIA_ENDPOINTS

Here's a suggested implementation:

import warnings

# At the top of utils/nbia_endpoints.py
def __getattr__(name):
    if name == 'NBIA_ENDPOINTS':
        warnings.warn(
            "NBIA_ENDPOINTS is deprecated and will be removed in a future version. "
            "Use NBIA_BASE_URLS instead.",
            DeprecationWarning,
            stacklevel=2
        )
        return NBIA_ENDPOINTS
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
pixi.toml (2)

4-4: Update project description

The current description is a placeholder. Please provide a meaningful description that explains the purpose and key features of the package.


151-160: Consider adding validation steps to build task

The build task looks good but could benefit from additional validation steps before building:

  • Running tests
  • Checking types
  • Verifying documentation builds

Consider adding these as dependencies to the build task:

 build = { cmd = [
   "hatch",
   "build",
   "--clean",
 ], inputs = [
   "src",
   "pyproject.toml",
-], outputs = [
+], depends_on = [
+  "test",
+  "type-check",
+  "doc-build"
+], outputs = [
   "dist/*",
 ], description = "Build the package" }
src/nbiatoolkit/models/base.py (2)

35-55: Consider using dateutil.parser.parse for flexible date parsing

The convert_date method manually tries multiple date formats, which may be limiting and less maintainable. Using dateutil.parser.parse can handle a wider range of date formats and simplify the code.
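
A minimal sketch of that approach, written as a standalone helper rather than the actual AbstractModel.convert_date classmethod (adding python-dateutil as a dependency is an assumption, not something this PR does):

from datetime import datetime
from typing import Optional, Union

from dateutil import parser


def convert_date(value: Union[str, datetime, None]) -> Optional[datetime]:
    """Parse a date string into a datetime, accepting a wide range of formats."""
    if value is None or isinstance(value, datetime):
        return value
    try:
        return parser.parse(value)
    except (ValueError, OverflowError) as exc:
        raise ValueError(f"Unrecognized date format: {value!r}") from exc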


102-120: Potential performance improvement for key-based item access

The __getitem__ method searches through self.items to find an item by key, resulting in O(n) time complexity. For larger datasets, consider indexing items by key using a dictionary to achieve O(1) lookup times.
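
For illustration, a sketch of a key-indexed variant (the key_field default and attribute names are assumptions and may not match the real AbstractListModel):

from typing import Any, Dict, List


class KeyIndexedListModel:
    """Keeps a dict index alongside the list for O(1) key-based access."""

    def __init__(self, items: List[Any], key_field: str = "SeriesInstanceUID") -> None:
        self.items = items
        # Build the index once; it must be rebuilt whenever self.items is mutated.
        self._index: Dict[str, Any] = {getattr(item, key_field): item for item in items}

    def __getitem__(self, key: str) -> Any:
        try:
            return self._index[key]  # O(1) lookup instead of scanning self.items
        except KeyError as exc:
            raise KeyError(f"No item with key {key!r}") from exc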

src/nbiatoolkit/models/nbia_responses.py (4)

121-177: Move execution logic to a separate script or module

The code under if __name__ == "__main__": includes execution logic and example usage, which is not typically placed within a models file. To maintain a clean separation of concerns, consider moving this code to a separate script or module, such as example_usage.py or main.py.
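
A rough sketch of that split (the module path and main() wrapper are illustrative, and importing NBIAClient from the package root is assumed based on the package's __init__ exports):

# devnotes/example_usage.py (hypothetical location)
from nbiatoolkit import NBIAClient


def main() -> None:
    # Driver/example code lives here instead of inside the models module.
    client = NBIAClient(log_level="DEBUG")
    collections = client.getCollections()
    for collection in collections[:5]:
        print(collection)


if __name__ == "__main__":
    main()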


27-31: Ensure 'PatientSex' comparisons are case-insensitive

In the is_male and is_female methods, consider making the comparisons case-insensitive to handle data where PatientSex might be in lowercase or mixed case. This will make your methods more robust to variations in the data.

Apply this diff to update the methods:

 def is_male(self) -> bool:
-    return self.PatientSex == 'M'
+    return self.PatientSex is not None and self.PatientSex.upper() == 'M'

 def is_female(self) -> bool:
-    return self.PatientSex == 'F'
+    return self.PatientSex is not None and self.PatientSex.upper() == 'F'

137-138: Avoid using 'type: ignore' by adding type annotations

The # type: ignore comments are used to bypass type checking, which can hide potential issues. Consider adding appropriate type annotations to the NBIAClient methods to eliminate the need for ignoring type checks.

If updating the NBIAClient methods is not possible, consider assigning the returned values to variables with explicit type annotations.

-        collections = client.getCollections()  # type: ignore
+        collections: List[Dict[str, Any]] = client.getCollections()

126-126: Handle exceptions when initializing 'NBIAClient'

When creating an instance of NBIAClient, consider wrapping it in a try-except block to handle any potential exceptions during initialization, such as connection errors or misconfigurations.

Apply this diff to add exception handling:

-    client = NBIAClient(log_level="DEBUG")
+    try:
+        client = NBIAClient(log_level="DEBUG")
+    except Exception as e:
+        print(f"Failed to initialize NBIAClient: {e}")
+        raise
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 310d089 and bc274d4.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • .gitignore (1 hunks)
  • pixi.toml (1 hunks)
  • pyproject.toml (2 hunks)
  • src/nbiatoolkit/__init__.py (2 hunks)
  • src/nbiatoolkit/models/base.py (1 hunks)
  • src/nbiatoolkit/models/nbia_responses.py (1 hunks)
  • src/nbiatoolkit/nbia.py (0 hunks)
💤 Files with no reviewable changes (1)
  • src/nbiatoolkit/nbia.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore
🧰 Additional context used
📓 Learnings (2)
pixi.toml (1)
Learnt from: jjjermiah
PR: jjjermiah/nbia-toolkit#133
File: pixi.toml:101-110
Timestamp: 2024-11-24T21:14:10.539Z
Learning: The nbia-toolkit project uses Sphinx for documentation builds, not MkDocs.
pyproject.toml (1)
Learnt from: jjjermiah
PR: jjjermiah/nbia-toolkit#133
File: pyproject.toml:1-2
Timestamp: 2024-11-24T21:15:17.218Z
Learning: The source code directory `src/nbiatoolkit/` is intentionally named differently from the package name `nbia-toolkit`.
🔇 Additional comments (9)
src/nbiatoolkit/__init__.py (1)

15-15: Verify if NBIA_ENDPOINTS is still needed

The code imports both NBIA_ENDPOINTS and NBIA_BASE_URLS, but the PR title and related changes suggest a transition to using NBIA_BASE_URLS. Consider removing NBIA_ENDPOINTS if it's no longer needed.

Let's check if NBIA_ENDPOINTS is still used in the codebase:

✅ Verification successful

Keep NBIA_ENDPOINTS import as it's actively used throughout the codebase

The import of both NBIA_ENDPOINTS and NBIA_BASE_URLS is correct. NBIA_ENDPOINTS is extensively used throughout the codebase, particularly in nbia.py where it's used in multiple method signatures and API endpoint construction. The import should be retained to maintain functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for NBIA_ENDPOINTS usage across the codebase
rg "NBIA_ENDPOINTS" --type py

Length of output: 2439

pyproject.toml (4)

1-10: LGTM! Well-structured project metadata

The project metadata is complete and follows best practices. The Python version constraint is appropriately set to ensure compatibility while preventing breaking changes.


12-23: Well-structured dependencies with appropriate version constraints

Good job on the dependency specifications:

  • Critical packages (requests, cryptography) have upper bounds to prevent breaking changes
  • Development dependencies have appropriate flexibility with >= constraints
  • All versions are recent and stable

25-35: LGTM! Build configuration is properly set up

The build system configuration is correct with appropriate backend and build target settings.


Line range hint 38-83: Consider reviewing the branch pattern for semantic releases

The semantic release configuration is well-structured, but the branch pattern "(main|master|development)" includes the development branch. This might trigger unintended releases from the development branch.

Consider if you want semantic releases to be triggered from the development branch, or if releases should only be created from main/master branches.

pixi.toml (2)

63-64: Verify configuration file paths

The test configuration references several external files and directories:

  • config/coverage.toml
  • config/pytest.ini
  • src and tests directories
  • coverage-report/coverage.xml

Please ensure these paths are consistent with the project structure.

Also applies to: 66-67

✅ Verification successful

All referenced configuration paths exist and are valid

All configuration files and directories referenced in the test configuration are present in the project structure:

  • config/coverage.toml and config/pytest.ini exist in the config directory
  • src/ and tests/ directories are present
  • coverage-report/ directory exists
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if referenced configuration files and directories exist
echo "Checking configuration files..."
fd -t f "coverage.toml|pytest.ini" config/

echo "Checking source and test directories..."
fd -t d "^src$|^tests$"

echo "Checking coverage report directory..."
fd -t d "^coverage-report$"

Length of output: 398


6-6: Consider expanding platform support

The platforms list currently includes only linux-64 and osx-arm64. Consider adding support for other common platforms:

  • windows-64
  • osx-64 (Intel-based macOS)
src/nbiatoolkit/models/nbia_responses.py (2)

85-85: Verify the data type of 'AnnotationsFlag'

The field AnnotationsFlag is defined as Optional[bool], but the incoming data might represent boolean values as strings (e.g., 'true', 'false'). Ensure that the parsing correctly converts these string representations to boolean values.

If necessary, add a validator to handle the conversion:

@validator("AnnotationsFlag", pre=True, always=True)
def validate_annotations_flag(cls, value):
    if isinstance(value, str):
        return value.lower() == 'true'
    return value

94-94: Handle potential large file sizes appropriately

The FileSize field is defined as Optional[int]. Python's int has arbitrary precision, so overflow inside the model is not a concern; the more realistic issues are the API returning the size as a string or a float, and downstream consumers (databases, serialized formats) that do impose integer limits. Consider normalizing the incoming value explicitly.
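
If normalization is desired, a minimal sketch using the same pydantic v1-style @validator seen elsewhere in this diff (the string/float handling is an assumption about what the API may return):

from typing import Optional
from pydantic import BaseModel, validator

class Series(BaseModel):
    FileSize: Optional[int] = None

    @validator("FileSize", pre=True, always=True)
    def normalize_file_size(cls, value):
        # Accept None, ints, numeric strings, or floats from the API
        if value is None or isinstance(value, int):
            return value
        if isinstance(value, str):
            value = value.strip()
            # Integer strings keep full precision; decimal strings go through float
            return int(value) if value.isdigit() else int(float(value))
        if isinstance(value, float):
            return int(value)
        raise ValueError(f"Unexpected FileSize value: {value!r}")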

Comment on lines +80 to +89
mkdocs = "*"
mkdocs-material = ">=9.5.44,<10"
mkdocstrings = ">=0.27.0,<0.28"
mkdocstrings-python = ">=1.12.2,<2"
mkdocs-git-authors-plugin = ">=0.9.0,<0.10"
mkdocs-git-revision-date-localized-plugin = ">=1.2.9,<2"
griffe-inherited-docstrings = ">=1.1.1,<2"
mike = ">=2.1.2,<3"
mkdocs-include-markdown-plugin = ">=7.1.1,<8"
jupyter = "*"

🛠️ Refactor suggestion

Remove unused MkDocs dependencies

According to previous discussions, this project uses Sphinx for documentation. However, there are several MkDocs-related dependencies that appear to be unused:

  • mkdocs
  • mkdocs-material
  • mkdocstrings
  • mkdocstrings-python
  • mkdocs-git-authors-plugin
  • mkdocs-git-revision-date-localized-plugin
  • mike

Consider removing these unused dependencies to maintain a cleaner dependency list and reduce potential confusion.

Comment on lines +118 to +119
types-pytz = ">=2024.2.0.20241003,<2025"
types-tqdm = ">=4.66.0.20240417,<5"

💡 Codebase verification

Let me analyze the imports and type ignores to generate a final response about type stubs coverage.

Add type stubs for additional dependencies

The codebase uses several third-party packages that would benefit from type stubs:

  • pydicom: Used for DICOM file handling
  • beautifulsoup4: Used for parsing via bs4
  • requests: Used for HTTP requests
  • pandas: Used extensively for data manipulation

The current type: ignore comments are mostly for internal type handling rather than missing stubs, but adding these type stubs would improve type checking coverage and help catch potential issues earlier.
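
As a reference point, the effect of a missing stub package is easiest to see on an untyped import; stub distributions such as types-requests, pandas-stubs, and types-beautifulsoup4 (the usual names, offered here as suggestions) remove the error and let return types flow through:

# Without types-requests installed, mypy reports something like:
#   error: Library stubs not installed for "requests"
import requests

def fetch_json(url: str) -> dict:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()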

🔗 Analysis chain

Verify type stubs coverage

The configuration includes type stubs for pytz and tqdm. Consider checking if other dependencies also require type stubs for complete type checking coverage.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for imports in the source code that might need type stubs
echo "Checking imports in source files..."
rg "^import |^from \w+ import" src/

echo "Checking for potential missing type stubs..."
rg "type: ignore" src/

Length of output: 3849

Comment on lines +17 to +25
@validator("PatientBirthDate", pre=True, always=True)
def validate_birth_date(cls, value):
"""
Custom validator to convert date strings to datetime using the convert_date method.
"""
if value is None:
return value
return AbstractModel.convert_date(value)


🛠️ Refactor suggestion

Refactor duplicate date validators into a shared method

The custom validators for date fields like PatientBirthDate are duplicated across multiple classes (Patient, Study, Series). Consider refactoring these validators into a shared method or using a class method in AbstractModel to reduce code duplication and improve maintainability.

You can create a general validator in AbstractModel and reference it in the subclasses:

# In AbstractModel
@classmethod
def validate_date(cls, value):
    if value is None:
        return value
    return cls.convert_date(value)

# In subclasses
@validator("PatientBirthDate", "StudyDate", "SeriesDate", pre=True, always=True)
def validate_dates(cls, value):
    return cls.validate_date(value)

        studies = json.loads(study_list_file.read_text())
    else:
        study_list_file.parent.mkdir(parents=True, exist_ok=True)
        studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore

⚠️ Potential issue

Add a check for empty 'collections' before accessing

At line 151, accessing collections[0]['Collection'] without verifying that collections is not empty can lead to an IndexError if collections is empty. Add a check to ensure that collections contains at least one element before accessing it.

Apply this diff to add the necessary check:

+        if not collections:
+            raise ValueError("No collections found.")
         studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore
+        if not collections:
+            raise ValueError("No collections found.")
+        studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore

🛠️ Refactor suggestion

Handle KeyError when accessing 'Collection' key

When accessing collections[0]['Collection'], there's a possibility that the 'Collection' key might not exist in the dictionary. Add error handling to manage potential KeyError exceptions.

Apply this diff to add a safer access method:

-        studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore
+        try:
+            collection_name = collections[0]['Collection']
+        except KeyError:
+            raise KeyError("The key 'Collection' was not found in the collections data.")
+        studies = client.getStudies(Collection=collection_name)  # type: ignore
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        studies = client.getStudies(Collection=collections[0]['Collection'])  # type: ignore
+        try:
+            collection_name = collections[0]['Collection']
+        except KeyError:
+            raise KeyError("The key 'Collection' was not found in the collections data.")
+        studies = client.getStudies(Collection=collection_name)  # type: ignore


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
devnotes/model_responses.ipynb (2)

18-18: Consider removing unused import 'requests'.
It appears that 'requests' is imported at line 18 but never used throughout the rest of the notebook. This can lead to confusion and clutter.

To address this, remove the unused import:

- import requests

21-22: Consolidate duplicate import statements from 'typing'.
Currently, there are two separate lines:
from typing import Optional, List, Dict, Any, Union, Tuple
from typing import Optional
This can be simplified to a single import line for clarity.

-from typing import Optional, List, Dict, Any, Union, Tuple
-from typing import Optional
+from typing import Optional, List, Dict, Any, Union, Tuple
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc274d4 and e3abae2.

📒 Files selected for processing (2)
  • devnotes/model_responses.ipynb (1 hunks)
  • src/nbiatoolkit/models/base.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/nbiatoolkit/models/base.py
🔇 Additional comments (2)
devnotes/model_responses.ipynb (2)

220-220: Double-check limiting the queries to the first 10.
When there’s no existing event loop, only the first 10 queries are processed (query_tuples[:10]). Verify that truncating the task list is intended. If it’s just for testing, ensure production code handles all queries or continues from where it left off.
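
If the slice is only an interactive convenience, one way to cover every query while still bounding concurrency is to gather fixed-size batches; fetch_and_write and query_tuples are the notebook's own names, but the call signature here is assumed:

import asyncio

async def run_all(query_tuples, batch_size: int = 10):
    # Walk the full list in batches instead of stopping after the first ten
    for start in range(0, len(query_tuples), batch_size):
        batch = query_tuples[start:start + batch_size]
        await asyncio.gather(*(fetch_and_write(*args) for args in batch))

# asyncio.run(run_all(query_tuples))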


205-211: Ensure concurrency-based file creation doesn't cause race conditions.
Each filename is derived from “data_path / 'metadata' / collection”, and directories are created in a loop. Although this approach likely avoids conflicts, it’s worth confirming that multiple tasks do not write to the same path concurrently.

✅ Verification successful

Let me search for more specific information about the file paths and concurrent operations.


Let me search for one more piece of information about how these operations are executed.


Based on the gathered information, I can now provide the final response:

The concurrent file operations are safely implemented
The code uses asyncio.gather() to run tasks concurrently, but the file paths are unique per collection and heading combination. The directory creation with mkdir(parents=True, exist_ok=True) is atomic and thread-safe in Python, preventing race conditions. Each task writes to a distinct path constructed from data_path/metadata/{collection}/{heading}List_{collection}.json, ensuring no path conflicts occur.
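
A condensed sketch of that pattern (the writer body is illustrative; the path template is the one quoted above):

import json
from pathlib import Path

async def write_metadata(data_path: Path, collection: str, heading: str, payload: list) -> None:
    # Each (collection, heading) pair maps to a distinct file, so concurrent
    # tasks scheduled with asyncio.gather never write to the same destination
    out_file = data_path / "metadata" / collection / f"{heading}List_{collection}.json"
    out_file.parent.mkdir(parents=True, exist_ok=True)  # safe to repeat across tasks
    out_file.write_text(json.dumps(payload, indent=2))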

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that no two tasks share the same file path by reporting duplicates.

rg '"metadata"' -A 3

Length of output: 1871


Script:

#!/bin/bash
# Search for the query_tuples construction and any async/concurrent operations
rg -A 5 "query_tuples = \[" 

# Look for async operations and concurrent file operations
ast-grep --pattern 'async def $FUNC($$$) { $$$ }'

Length of output: 488


Script:

#!/bin/bash
# Look for concurrent execution patterns and how query_tuples are processed
rg -B 2 -A 5 "asyncio\.gather|concurrent\.|ThreadPool|ProcessPool|fetch_and_write"

Length of output: 3360

Comment on lines +111 to +116
"\t\t\t# Ensure the response is JSON\n",
"\t\t\tif response.headers.get('Content-Type') == 'application/json':\n",
"\t\t\t\tcontent = await response.json()\n",
"\t\t\t\tthislogger.debug(\"Received JSON response\")\n",
"\t\t\telse:\n",
"\t\t\t\tthislogger.warning(\"Unexpected Content-Type: %s\", response.headers.get('Content-Type'))\n",

🛠️ Refactor suggestion

Make Content-Type checks more robust.
The code checks strictly for 'application/json'. However, some responses set headers like 'application/json; charset=utf-8'.

To handle both cases, consider:

-if response.headers.get('Content-Type') == 'application/json':
-    content = await response.json()
-    thislogger.debug("Received JSON response")
-else:
-    thislogger.warning("Unexpected Content-Type: %s", response.headers.get('Content-Type'))
-    content = None
+content_type = response.headers.get('Content-Type', '')
+if 'application/json' in content_type.lower():
+    content = await response.json()
+    thislogger.debug("Received JSON response")
+else:
+    thislogger.warning("Unexpected Content-Type: %s", content_type)
+    content = None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"\t\t\t# Ensure the response is JSON\n",
"\t\t\tif response.headers.get('Content-Type') == 'application/json':\n",
"\t\t\t\tcontent = await response.json()\n",
"\t\t\t\tthislogger.debug(\"Received JSON response\")\n",
"\t\t\telse:\n",
"\t\t\t\tthislogger.warning(\"Unexpected Content-Type: %s\", response.headers.get('Content-Type'))\n",
"\t\t\t# Ensure the response is JSON\n",
"\t\t\tcontent_type = response.headers.get('Content-Type', '')\n",
"\t\t\tif 'application/json' in content_type.lower():\n",
"\t\t\t\tcontent = await response.json()\n",
"\t\t\t\tthislogger.debug(\"Received JSON response\")\n",
"\t\t\telse:\n",
"\t\t\t\tthislogger.warning(\"Unexpected Content-Type: %s\", content_type)\n",
"\t\t\t\tcontent = None\n",

Comment on lines +252 to +254
" series_file = list(collection_path.glob(\"SeriesList*.json\"))[0]\n",
" with series_file.open(mode=\"r\") as f:\n",
" series_list = json.loads(f.read())\n",

⚠️ Potential issue

Prevent an IndexError when globbing for SeriesList files.
Using [0] on an empty list from glob can result in an exception if no matching file is found.

Consider a safer approach:

-series_file = list(collection_path.glob("SeriesList*.json"))[0]
+series_files = list(collection_path.glob("SeriesList*.json"))
+if not series_files:
+    logger.warning("No SeriesList files found for %s", collection)
+    continue
+series_file = series_files[0]

Committable suggestion skipped: line range outside the PR's diff.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/main.yml (1)

7-7: Consider the implications of running CI on all branches

While running CI on all branches helps catch issues early, it may increase CI costs and resource usage. Consider limiting this to specific branch patterns if needed.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3abae2 and 47722f2.

📒 Files selected for processing (4)
  • .github/workflows/main.yml (3 hunks)
  • devnotes/model_responses.ipynb (1 hunks)
  • src/nbiatoolkit/auth.py (4 hunks)
  • src/nbiatoolkit/models/base.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/nbiatoolkit/auth.py
  • src/nbiatoolkit/models/base.py
🔇 Additional comments (7)
.github/workflows/main.yml (3)

20-36: LGTM: Well-structured Pixi integration

The changes effectively:

  • Define clear environment matrix
  • Set up Pixi with proper versioning and caching
  • Update test commands to use Pixi

46-61: Consider enhancing the documentation build process

The basic documentation build with Pixi is set up correctly. However, as previously suggested, consider adding:

  1. Documentation artifact upload
  2. Basic validation
  3. GitHub Pages PR previews

86-240: Clarify the removal of deployment jobs

Multiple deployment jobs have been commented out without explanation:

  • Continuous-Deployment
  • Publish-To-PyPi
  • Build-Docker-Images
  • Test-PyPi-Install
  • Test-Docker-Image

Please clarify:

  1. Why these jobs are being removed
  2. How deployments will be handled going forward
  3. If this is temporary during the Pixi transition
devnotes/model_responses.ipynb (4)

1-64: LGTM: Well-structured setup

The notebook has:

  • Well-organized imports
  • Proper logging configuration
  • Clean client initialization

111-117: Make Content-Type checks more robust

The current Content-Type check is too strict. Some responses may include charset information.


119-123: Clarify the commented-out file writing logic

The file writing logic is commented out. Please clarify:

  1. Is this intentional?
  2. How is the data being persisted?
  3. If this is temporary, add a TODO comment explaining the plan

322-324: Prevent an IndexError when globbing for SeriesList files

The current implementation may raise an IndexError if no files are found.

}
],
"source": [
"sl.filter(Modality == \"CT\")"

⚠️ Potential issue

Fix undefined variable usage

The filter operation uses an undefined variable Modality. This will raise a NameError.

Consider importing or defining the necessary filter operators:

-sl.filter(Modality == "CT")
+from nbiatoolkit.models.filters import Modality  # or appropriate import
+sl.filter(Modality == "CT")

Committable suggestion skipped: line range outside the PR's diff.
