Fix Errno2 on GTFS RT validation #3553

Merged
erikamov merged 17 commits into main from mov/2780-gtfs-rt-validation-errno2 on Nov 25, 2024

Conversation

erikamov
Contributor

Description

@ohrite and I refactored gtfs_rt_parser.py, relying on the tests added in the previous PR, and then fixed issue #2780 so that duplicate extracts are skipped instead of raising the error.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

Tested locally:

❯ poetry run pytest --gcs
================================================ test session starts ================================================
platform darwin -- Python 3.9.19, pytest-8.3.3, pluggy-1.5.0
rootdir: /Users/erikaypacheco/Documents/workspace/data-infra/jobs/gtfs-rt-parser-v2
configfile: pyproject.toml
plugins: env-1.1.5
collected 10 items

tests/test_gtfs_rt_parser.py ..........                                                                       [100%]

================================================= warnings summary ==================================================
...
========================================= 10 passed, 111 warnings in 44.72s =========================================

Post-merge follow-ups

  • No action required
  • Actions required (specified below)

After the merge, check that the results in cal-itp-data-infra.staging.int_gtfs_quality__rt_validation_outcomes for the following hours and days no longer contain the Errno 2 reported in the issue.

@erikamov erikamov marked this pull request as draft November 21, 2024 22:45
@erikamov erikamov force-pushed the mov/2780-gtfs-rt-validation-errno2 branch 2 times, most recently from 000764c to e7f3dd8 Compare November 21, 2024 23:04
@ohrite ohrite force-pushed the mov/2780-gtfs-rt-validation-errno2 branch from 15cb151 to 483433d Compare November 23, 2024 00:30
@ohrite ohrite force-pushed the mov/2780-gtfs-rt-validation-errno2 branch from 483433d to 8cb8db8 Compare November 23, 2024 00:32
@ohrite ohrite marked this pull request as ready for review November 23, 2024 21:48
Contributor

ohrite commented Nov 25, 2024

This PR lands two changes:

  1. The GTFS-RT validator wrapper has been refactored into classes
  2. A logic change: duplicate (by MD5) VP/TU/SA messages are marked as successful but do not load a result into BigQuery
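
For readers unfamiliar with the second change, here is a minimal sketch of the general idea, not the code in gtfs_rt_parser.py. It assumes duplicates are detected by hashing the raw message bytes with MD5; the names `Outcome`, `md5_hex`, and `process_extracts` are hypothetical and exist only for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class Outcome:
    """Hypothetical per-extract outcome record (not the project's real model)."""
    name: str
    success: bool
    loaded: bool  # True if a result would be passed on for loading


def md5_hex(payload: bytes) -> str:
    """Return the MD5 hex digest of the raw message bytes."""
    return hashlib.md5(payload).hexdigest()


def process_extracts(extracts: Iterable[Tuple[str, bytes]]) -> List[Outcome]:
    """Mark every extract successful, but only load the first copy of each payload."""
    seen: Set[str] = set()
    outcomes: List[Outcome] = []
    for name, payload in extracts:
        digest = md5_hex(payload)
        if digest in seen:
            # Duplicate content: report success, skip loading a result.
            outcomes.append(Outcome(name=name, success=True, loaded=False))
            continue
        seen.add(digest)
        outcomes.append(Outcome(name=name, success=True, loaded=True))
    return outcomes
```

In this sketch a duplicate is reported as a success rather than an error, which mirrors the behavior described above: the outcome row stays clean while no second result is produced for identical content.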

Member

@evansiroky evansiroky left a comment


Deferring review to @ohrite and @erikamov as they paired together.

@ohrite ohrite self-requested a review November 25, 2024 20:13
Contributor

@ohrite ohrite left a comment


🍐 w/ @erikamov

@erikamov erikamov merged commit 4f07ab5 into main Nov 25, 2024
4 checks passed
@erikamov erikamov deleted the mov/2780-gtfs-rt-validation-errno2 branch November 25, 2024 20:20
3 participants