Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added pickles, auto-deployment #4

Merged
merged 1 commit into from
Dec 8, 2024
Merged

Conversation

AmazingAbhi
Copy link
Contributor

@AmazingAbhi AmazingAbhi commented Dec 8, 2024

…t-treasury

Summary by CodeRabbit

  • New Features
    • Introduced a new GitHub Actions workflow for automated build and deployment to Google Cloud Run upon merging pull requests.
    • Added functionality to save and update files on GitHub, supporting both CSV and pickle formats for daily treasury rates data.
  • Documentation
    • Updated the README.md for improved clarity and structure, including new sections on project purpose, functionalities, and sample data.
  • Chores
    • Updated .gitignore to include the .idea/ directory in the ignore list.

@AmazingAbhi AmazingAbhi linked an issue Dec 8, 2024 that may be closed by this pull request
Copy link

coderabbitai bot commented Dec 8, 2024

Walkthrough

This pull request introduces several changes across multiple files. A new GitHub Actions workflow, build.yml, is added to automate the build and deployment process to Google Cloud Run upon merging pull requests to the main branch. The .gitignore file is updated to include the .idea/ directory in the ignore list. The README.md file is restructured and expanded to provide clearer project documentation, including new sections on project functionality and sample data. Additionally, a new function save_and_update_github is introduced in main.py to streamline file updates on GitHub.

Changes

File Change Summary
.github/workflows/build.yml New workflow added for automating build and deployment to Google Cloud Run upon PR merge.
.gitignore Added .idea/ directory to the ignore list.
README.md Restructured with a new title, added sections on project functionality, and improved formatting.
main.py Introduced save_and_update_github function for saving and updating files on GitHub; updated main execution block.

Poem

🐇 In the meadow where code does play,
New workflows bloom, brightening the day.
With GitHub's magic, our builds take flight,
Ignoring the clutter, our paths are now bright.
Documentation shines, clear as the sun,
Together we code, and together we run! 🌼


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Outside diff range comments (2)
README.md (2)

Line range hint 17-42: Enhance implementation details section.

The "How it works" section needs more detailed information about:

  • Error handling mechanisms
  • Retry strategies for failed downloads
  • Specific scheduling details (timezone, retry policies)
  • Data validation processes

Also, convert bare URLs to proper markdown links:

-https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024
+[Treasury Yield Curve Data](https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024)
🧰 Tools
🪛 Markdownlint (0.35.0)

26-26: null
Bare URL used

(MD034, no-bare-urls)


30-30: null
Bare URL used

(MD034, no-bare-urls)


Line range hint 70-89: Enhance sample data documentation.

The sample data is helpful, but could be improved by:

  1. Adding column descriptions
  2. Explaining data types and formats
  3. Including validation rules or constraints
🧰 Tools
🪛 Markdownlint (0.35.0)

64-64: null
Multiple top-level headings in the same document

(MD025, single-title, single-h1)


46-46: null
Bare URL used

(MD034, no-bare-urls)


47-47: null
Bare URL used

(MD034, no-bare-urls)


48-48: null
Bare URL used

(MD034, no-bare-urls)


49-49: null
Bare URL used

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE

📥 Commits

Reviewing files that changed from the base of the PR and between e5c7797 and 9043fa1.

⛔ Files ignored due to path filters (6)
  • data/2021-daily-treasury-rates.pkl is excluded by !**/*.pkl
  • data/2022-daily-treasury-rates.pkl is excluded by !**/*.pkl
  • data/2023-daily-treasury-rates.pkl is excluded by !**/*.pkl
  • data/2024-daily-treasury-rates.csv is excluded by !**/*.csv
  • data/2024-daily-treasury-rates.pkl is excluded by !**/*.pkl
  • data/daily-treasury-rates.pkl is excluded by !**/*.pkl
📒 Files selected for processing (4)
  • .github/workflows/build.yml (1 hunks)
  • .gitignore (1 hunks)
  • README.md (3 hunks)
  • main.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
main.py

30-30: Missing return type annotation for public function save_and_update_github

Add return type annotation: None

(ANN201)


30-30: Missing type annotation for function argument file_path

(ANN001)


30-30: Missing type annotation for function argument repo

(ANN001)


30-30: Missing type annotation for function argument github_path

(ANN001)


35-35: Logging statement uses f-string

(G004)


44-44: Logging statement uses f-string

(G004)


51-51: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


51-51: Logging statement uses f-string

(G004)


51-51: Use explicit conversion flag

Replace with conversion flag

(RUF010)


52-52: Use raise without specifying exception name

Remove exception name

(TRY201)


83-83: sort_file_names may be undefined, or defined from star imports

(F405)


84-84: merge_files may be undefined, or defined from star imports

(F405)


106-106: f-string without any placeholders

Remove extraneous f prefix

(F541)


106-106: Logging statement uses f-string

(G004)

🪛 Markdownlint (0.35.0)
README.md

26-26: null
Bare URL used

(MD034, no-bare-urls)


30-30: null
Bare URL used

(MD034, no-bare-urls)


64-64: null
Multiple top-level headings in the same document

(MD025, single-title, single-h1)


46-46: null
Bare URL used

(MD034, no-bare-urls)


47-47: null
Bare URL used

(MD034, no-bare-urls)


48-48: null
Bare URL used

(MD034, no-bare-urls)


49-49: null
Bare URL used

(MD034, no-bare-urls)


93-93: null
Bare URL used

(MD034, no-bare-urls)

🪛 actionlint (1.7.4)
.github/workflows/build.yml

17-17: property "deploy" is not defined in object type {auth: {conclusion: string; outcome: string; outputs: {access_token: string; auth_token: string; credentials_file_path: string; id_token: string; project_id: string}}}

(expression)

🪛 yamllint (1.35.1)
.github/workflows/build.yml

[warning] 3-3: truthy value should be one of [false, true]

(truthy)


[error] 20-20: wrong indentation: expected 6 but found 4

(indentation)

🔇 Additional comments (3)
.gitignore (1)

164-164: LGTM!

Adding .idea/ to .gitignore is a good practice as it excludes IDE-specific files.

README.md (2)

3-15: LGTM! Well-structured project overview.

The project overview section effectively communicates the purpose and core functionalities of the project.


43-49: 🛠️ Refactor suggestion

Verify external repository references and improve formatting.

Please verify if it's appropriate to reference these external repositories. Also, convert bare URLs to properly formatted markdown links with descriptive text.

✅ Verification successful

Convert bare URLs to formatted markdown links and keep public repository references

The referenced QuantConnect repository is public and appears to be used only as a reference dataset source. Please update the links using proper markdown formatting:

## Reference dataset: QC

- [US Treasury Yield Curve Data](https://www.quantconnect.com/data/us-treasury-yield-curve)
- [Treasury Yield Curve Algorithm](https://github.com/QuantConnect/Lean.DataSource.USTreasury/blob/master/USTreasuryYieldCurveDataAlgorithm.cs)
- [Treasury Data Source Repository](https://github.com/QuantConnect/Lean.DataSource.USTreasury/)
- [Treasury Yield Curve Rate Model](https://github.com/QuantConnect/Lean.DataSource.USTreasury/blob/master/USTreasuryYieldCurveRate.cs)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the referenced repositories are public
for repo in "QuantConnect/Lean.DataSource.USTreasury"; do
  gh repo view "$repo" --json isPrivate | jq -r '.isPrivate'
done

Length of output: 152


Script:

#!/bin/bash
# Let's check if there are any sensitive tokens, credentials, or internal references in the codebase
rg -i "(api[_-]?key|secret|token|password|credential|internal)" -g "!{.git,node_modules,dist,build}/*"

# Also check for any references to QuantConnect in the codebase to understand the relationship
rg -i "quantconnect" -g "!{.git,node_modules,dist,build}/*"

Length of output: 607

🧰 Tools
🪛 Markdownlint (0.35.0)

46-46: null
Bare URL used

(MD034, no-bare-urls)


47-47: null
Bare URL used

(MD034, no-bare-urls)


48-48: null
Bare URL used

(MD034, no-bare-urls)


49-49: null
Bare URL used

(MD034, no-bare-urls)

Comment on lines +15 to +17
environment:
name: production
url: ${{ steps.deploy.outputs.url }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

⚠️ Potential issue

Add protection rules for production environment

The production environment lacks protection rules. Consider adding:

  • Required reviewers
  • Deployment approval gates
  • Wait timer

Would you like me to provide an example of environment protection configuration?

🧰 Tools
🪛 actionlint (1.7.4)

17-17: property "deploy" is not defined in object type {auth: {conclusion: string; outcome: string; outputs: {access_token: string; auth_token: string; credentials_file_path: string; id_token: string; project_id: string}}}

(expression)

Comment on lines +36 to +39
- name: Build Docker image
run: |
docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:latest .

Copy link

@coderabbitai coderabbitai bot Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

⚠️ Potential issue

Add .dockerignore file to prevent secrets leakage

The Docker build context includes all files. Create a .dockerignore file to exclude sensitive files (e.g., .env, credentials) from the build context.

Would you like me to generate a .dockerignore file with recommended exclusions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, generate a .dockerignore

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinostoll-dfg, here's a recommended .dockerignore file to prevent sensitive and unnecessary files from being included in your Docker build context:

# Python cache files
__pycache__/
*.py[cod]
*$py.class

# Virtual environment directories
env/
venv/
.venv/

# System files
.DS_Store

# Git files
.git
.gitignore

# IDE settings
.idea/
.vscode/
*.sublime-project
*.sublime-workspace

# Environment and secret files
.env
*.env
secrets/
credentials.json

# Data files
*.csv
*.pickle
data/

# Logs
logs/
*.log

# Other
*.bak
*.swp

This will help ensure that sensitive information and unnecessary files are excluded from the build context. Let me know if you'd like to customize it further.

.github/workflows/build.yml Show resolved Hide resolved

- name: Build Docker image
run: |
docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:latest .
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add explicit version tag for Docker image

Using latest tag is considered an anti-pattern as it makes deployments unpredictable and harder to rollback. Consider using a deterministic tag based on commit SHA or release version.

-docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:latest .
+docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:${{ github.sha }} .
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:latest .
docker build -t gcr.io/dfg-analytics-insights-prod/us-department-treasury:${{ github.sha }} .

main.py Show resolved Hide resolved
Comment on lines 83 to 84
file_names = sort_file_names(csv_dir)
merged = merge_files(file_names, csv_dir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Avoid star imports

The functions sort_file_names and merge_files are imported via star import. This makes it harder to track dependencies and can lead to naming conflicts.

-from udt.utils import *
+from udt.utils import sort_file_names, merge_files

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.8.0)

83-83: sort_file_names may be undefined, or defined from star imports

(F405)


84-84: merge_files may be undefined, or defined from star imports

(F405)

main.py Show resolved Hide resolved
main.py Show resolved Hide resolved
Comment on lines +93 to 95
For: https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024

![US Treasury-Daily Treasury Par Yield Curve Rates](https://github.com/deerfieldgreen/us-department-treasury/blob/main/images/Screenshot%202024-10-08%20at%2014.41.25.png)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Improve screenshot section formatting.

  1. Convert the bare URL to a proper markdown link
  2. Use relative path for the image instead of hardcoding the branch
-For: https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024
-
-![US Treasury-Daily Treasury Par Yield Curve Rates](https://github.com/deerfieldgreen/us-department-treasury/blob/main/images/Screenshot%202024-10-08%20at%2014.41.25.png)
+[Data Source](https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024)
+
+![US Treasury-Daily Treasury Par Yield Curve Rates](./images/Screenshot%202024-10-08%20at%2014.41.25.png)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
For: https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024
![US Treasury-Daily Treasury Par Yield Curve Rates](https://github.com/deerfieldgreen/us-department-treasury/blob/main/images/Screenshot%202024-10-08%20at%2014.41.25.png)
[Data Source](https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2024)
![US Treasury-Daily Treasury Par Yield Curve Rates](./images/Screenshot%202024-10-08%20at%2014.41.25.png)
🧰 Tools
🪛 Markdownlint (0.35.0)

93-93: null
Bare URL used

(MD034, no-bare-urls)


95-95: null
Files should end with a single newline character

(MD047, single-trailing-newline)

README.md Show resolved Hide resolved
@AmazingAbhi AmazingAbhi merged commit 9043fa1 into main Dec 8, 2024
1 check passed
@AmazingAbhi AmazingAbhi deleted the 3-pandasdataframe-pickle branch December 8, 2024 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pandas.Dataframe pickle
2 participants