feat: first-pass at handling "download/<label>:: <url>" as an array #45

Open
wants to merge 6 commits into
base: main

Conversation

bodom0015
Collaborator

@bodom0015 bodom0015 commented Feb 20, 2025

Problem

We would like to track multiple download links in the references field - ideally this would let us offer different data formats for download

Our current field supports a single key-value pair for the download URL, but we want to expand it to support a list of URLs + labels for tracking different formats

Fixes #44

Approach

  • feat: first-pass at supporting a list of download links in different formats
  • fix: tested loading/editing/saving different formats
  • fix: adjust existing downloadUrl list format to match new pattern (see sdoh-indices.json)

We now support a list of values under the key http://schema.org/downloadUrl

Format

Expected format: download/<label>:: <url>

Each entry that matches the above format and has a unique label becomes one url+label pair in the resulting list
For example, download/asdf:: 123456 becomes 'http://schema.org/downloadUrl': [{'label':'asdf', 'url':'123456'}]
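
As a rough illustration of the conversion described above, here is a minimal sketch. It is not the code in this PR: the regular expression, the function name `parse_download_entries`, and the choice that a repeated label keeps the last URL are all assumptions made for the example.

```python
import re

# Hypothetical pattern for the "download/<label>:: <url>" syntax.
DOWNLOAD_LINE = re.compile(r"^download/(?P<label>.+?)::\s*(?P<url>.+)$")


def parse_download_entries(text):
    """Collect download/<label>:: <url> lines into a list of label/url dicts."""
    entries = {}
    for line in text.splitlines():
        match = DOWNLOAD_LINE.match(line.strip())
        if match:
            label = match.group("label").strip()
            # Assumption: a repeated label overwrites the earlier URL.
            entries[label] = match.group("url").strip()
    return [{"label": label, "url": url} for label, url in entries.items()]


print({"http://schema.org/downloadUrl": parse_download_entries("download/asdf:: 123456")})
```

Lines that do not start with `download/` are simply skipped by this sketch, so it can be run over the whole References text.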

Precedence

This format will take precedence over a single URL - if the user specifies the following conflicting entries:

http://schema.org/downloadUrl:: 654321
download/asdf:: 123456

Only the special syntax is recognized, and the saved value is rewritten as JSON:

{'http://schema.org/downloadUrl': [{'label': 'asdf', 'url': '123456'}]}

If no other entries use this special format, then http://schema.org/downloadUrl can still be used for normal key-value pairs
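
The precedence rule can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the PR's implementation: `parse_references` is a hypothetical name, and the sketch assumes every entry is a single `key:: value` line.

```python
DOWNLOAD_KEY = "http://schema.org/downloadUrl"
DOWNLOAD_PREFIX = "download/"


def parse_references(text):
    """Parse 'key:: value' lines; download/<label> entries win over a plain downloadUrl."""
    refs = {}
    downloads = []
    for line in text.splitlines():
        if "::" not in line:
            continue  # not a key-value line
        key, value = (part.strip() for part in line.split("::", 1))
        if key.startswith(DOWNLOAD_PREFIX):
            downloads.append({"label": key[len(DOWNLOAD_PREFIX):], "url": value})
        else:
            refs[key] = value
    # Only overwrite the plain key-value form when the special syntax was used.
    if downloads:
        refs[DOWNLOAD_KEY] = downloads
    return refs


conflicting = "http://schema.org/downloadUrl:: 654321\ndownload/asdf:: 123456"
print(parse_references(conflicting))
```

Because the list is written last and only when at least one `download/` entry exists, a lone `http://schema.org/downloadUrl:: ...` line keeps working as a normal key-value pair.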

How to Test

  1. Check out and run this branch locally, then start up the MetadataManager
  2. Navigate to your local MetadataManager in the browser
    • You should see a list of metadata entries
  3. Locate a metadata entry and click on it
    • You should be brought to the "View" view
  4. At the top-right, click on the Edit button
    • You should be brought to the "Edit" view
  5. Scroll down to the References section and enter the following contents:
http://schema.org/url:: https://sdohatlas.github.io/
http://schema.org/downloadUrl:: http://example.com/download
download/ZIP:: http://example.com/download?format=zip
download/CSV:: http://example.com/download?format=csv
download/XML:: http://example.com/download?format=xml
  6. Scroll back up and click Save at the top-right
  7. Scroll back down to References to see the format that was saved
    • You should see that http://schema.org/url is a classic key-value that gets preserved
    • You should see that the three download/<label> special syntax entries were transformed into a JSON array of objects
    • You should see that the key-value form of http://schema.org/downloadUrl was ignored, because the special syntax entries overwrote it
    • You should see the following value/format was saved:
{'http://schema.org/url': 'https://sdohatlas.github.io/', 'http://schema.org/downloadUrl': [{'label': 'ZIP', 'url': 'http://example.com/download?format=zip'}, {'label': 'CSV', 'url': 'http://example.com/download?format=csv'}, {'label': 'XML', 'url': 'http://example.com/download?format=xml'}]}

@bodom0015 bodom0015 requested a review from mradamcox February 20, 2025 21:52
Collaborator

@mradamcox mradamcox left a comment

Just reading through, this looks great, thanks! download/ is a good way to handle it. The only changes I'd like to see are removing the commented sections, and then we need to have an idea of the "upgrade" path. Do we just re-save all records? Do we need a one-off script/command to fix this up? Let me know what you think.

@bodom0015
Collaborator Author

bodom0015 commented Feb 20, 2025

Thank you! I've removed the commented-out code blocks - some logging and logic bits that I was experimenting with 👍

For existing data, there are a few cases already present:

  • 1 case of downloadUrl already existing in the array format (for example, see sdoh-indices.json): "http://schema.org/downloadUrl": "[{\"label\": \"ZIP Archive\", \"url\": \"https://geodacenter.github.io/data-and-lab/data/us-sdoh-2014.zip\"}]" - this example has already been adjusted to use the correct syntax
  • 32 cases of downloadUrl mapping to a single string (for example, see edi.json): "http://schema.org/downloadUrl": "https://www.epa.gov/ejscreen/download-ejscreen-data" - this example uses a format that will be ignored / superseded by our new format
  • 0 cases of our new download/ format - this might slowly grow over time while we move to the new format

We could write a script to do this or I could simply adjust these manually - as mentioned above, the simple key-value approach where the key is http://schema.org/downloadUrl is still supported, but if we were to add a download/example to those cases, it would remove the previous syntax in favor of our new list format

So my hope is that this sort of self-migrates over time 👍
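
If a one-off script turns out to be preferable, the cases listed above could be normalized along these lines. This is only a sketch under assumptions: the helper name `normalize_download_url` and the placeholder label `"Data Download"` are hypothetical, and how records are loaded from and written back to the JSON files is left out because it depends on the project layout.

```python
import json


def normalize_download_url(value, default_label="Data Download"):
    """Return a list of {'label', 'url'} dicts for any existing downloadUrl value."""
    if isinstance(value, list):
        return value  # already a list of label/url pairs
    if isinstance(value, str):
        try:
            parsed = json.loads(value)  # array format stored as a JSON string
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return parsed
        # Single-string case: wrap the URL with a placeholder label.
        return [{"label": default_label, "url": value}]
    raise TypeError(f"unsupported downloadUrl value: {value!r}")


print(normalize_download_url("https://www.epa.gov/ejscreen/download-ejscreen-data"))
```

The placeholder label would still need a manual review pass, since the script cannot know which format each existing link points to.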

@bodom0015 bodom0015 requested a review from mradamcox February 24, 2025 17:43
# downloadUrl is a single string
lines += f"{x}:: {y}\n"
else:
lines += f"{x}:: {y}\n"
Collaborator

Hm, what if in the except block, we force a transformation to the new pattern? Something like

lines += f"download/Data Download:: {y}\n"

That way, when a form is opened for an old record, it will auto-populate the old download link into the new format, and then get saved appropriately. We can have students go through all the records and then check the download links, which will fix them.

Does that make sense?
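
To make the suggestion concrete, here is a rough sketch of how the form text could be built with that fallback. The surrounding function is a guess: only the variable names `x`, `y`, and `lines` come from the diff above, while `references_to_text`, the `try` body, and the caught exception types are assumptions.

```python
import json

DOWNLOAD_KEY = "http://schema.org/downloadUrl"


def references_to_text(refs):
    """Render a references dict as 'key:: value' lines for the edit form."""
    lines = ""
    for x, y in refs.items():
        if x == DOWNLOAD_KEY:
            try:
                # New format: a list of label/url pairs, possibly stored as a JSON string.
                entries = json.loads(y) if isinstance(y, str) else y
                rendered = "".join(
                    f"download/{entry['label']}:: {entry['url']}\n" for entry in entries
                )
            except (ValueError, TypeError, KeyError):
                # downloadUrl is a single string: upgrade it to the new pattern.
                rendered = f"download/Data Download:: {y}\n"
            lines += rendered
        else:
            lines += f"{x}:: {y}\n"
    return lines


old_record = {DOWNLOAD_KEY: "https://www.epa.gov/ejscreen/download-ejscreen-data"}
print(references_to_text(old_record))
```

Building the rendered text before appending it keeps a half-valid list from leaving partial `download/` lines in front of the fallback line.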

Development

Successfully merging this pull request may close these issues.

Expand download url capability within References field
2 participants