generate sqlite db from csv's in repo... could offer more config opti… #2

Merged
merged 3 commits into from
Oct 29, 2023


whatever
Contributor

Include Makefile and python script to generate db (#1)

  • generate sqlite db from csv's in repo... could offer more config options fwiw

  • ❌ remove trailing newline

Aside: I am using a sqlite db based on these CSV's to make a torch dataset:
https://github.com/whatever/nude-2.0/blob/matt/view-met-and-tag/src/nude2/data.py#L143-L152

* 🐺 include sqlite db

* update readme
@gregsadetsky
Owner

hey I'm finally able to look into this! thanks again for your contribution!

few notes:

  • I ran the Python code and was surprised that the resulting sqlite .db file did not have any rows in the chicago_images table. It turns out there's a small mistake on this line, i.e. it should be inserting into chicago_images, not met_images
  • the included 2.sqlify/open-access-is-great-but-where-are-the-images.db file has the same issue, i.e. there are no rows in the chicago_images table

if you don't mind fixing that & regenerating the sqlite3 database, that'd be great! I'll be happy to merge after that


also I just opened an issue (#6) for a sqlite-related side idea if you're curious/interested! :-)
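
For illustration, the empty-table problem described in the notes above is the kind of thing a quick row-count check can catch before a generated database is committed. This is only a minimal sketch, not the script from this PR: the `count_rows` helper, the two-column schema and the sample rows are assumptions made up for the example. Only the table names `met_images` and `chicago_images` come from the discussion.

```python
import sqlite3


def count_rows(conn, tables):
    """Return {table_name: row_count} for each named table (hypothetical helper)."""
    counts = {}
    for table in tables:
        # Table names cannot be bound as SQL parameters, so only pass trusted names.
        (n,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        counts[table] = n
    return counts


# In-memory database with an assumed two-column schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE met_images (id INTEGER, image_url TEXT)")
conn.execute("CREATE TABLE chicago_images (id TEXT, image_url TEXT)")

# Reproduce the reported symptom: every row lands in met_images.
conn.executemany(
    "INSERT INTO met_images VALUES (?, ?)",
    [
        (1, "https://example.org/met.jpg"),
        ("abc", "https://example.org/chicago.jpg"),
    ],
)

counts = count_rows(conn, ["met_images", "chicago_images"])
empty = [table for table, n in counts.items() if n == 0]
print(empty)  # → ['chicago_images']
```

A check like this could run at the end of the generation step and fail loudly if `empty` is not an empty list.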

@whatever
Contributor Author

@gregsadetsky - ah this only includes the met_images by design! It would need another minor, separate tweak to accommodate chicago, since the ID type is VARCHAR(*) not NUMBER
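
To make the ID-type point concrete, here is a small sketch of how SQLite treats the two declared types. The schema, the `VARCHAR(64)` length and the sample values are assumptions for illustration and are not taken from the PR. A column declared `NUMBER` gets NUMERIC affinity, so numeric-looking text is stored as an integer, while a `VARCHAR` column gets TEXT affinity and keeps string IDs as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schemas: only the ID column types mirror the discussion.
conn.execute("CREATE TABLE met_images (id NUMBER, image_url TEXT)")
conn.execute("CREATE TABLE chicago_images (id VARCHAR(64), image_url TEXT)")

# CSV readers yield strings, so both IDs arrive as text.
conn.execute(
    "INSERT INTO met_images VALUES (?, ?)",
    ("123", "https://example.org/a.jpg"),
)
conn.execute(
    "INSERT INTO chicago_images VALUES (?, ?)",
    ("abc-123", "https://example.org/b.jpg"),
)

# typeof() reports the storage class SQLite actually used for each value.
met_type = conn.execute("SELECT typeof(id) FROM met_images").fetchone()[0]
chi_type = conn.execute("SELECT typeof(id) FROM chicago_images").fetchone()[0]
print(met_type, chi_type)  # → integer text
```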

@gregsadetsky
Owner

> @gregsadetsky - ah this only includes the met_images by design! It would need another minor, separate tweak to accommodate chicago, since the ID type is VARCHAR(*) not NUMBER

Just to make sure I understand, you do read from chicago-images.csv though, right?

https://github.com/gregsadetsky/open-access-is-great-but-where-are-the-images/pull/2/files#diff-2878a1dcb938b7463f42f6e736e2512f59dead0a92da93ae133d13d6a6eeb3cfR68

That's what surprised me -- i.e. reading from chicago but inserting into the met table

@whatever
Contributor Author

whatever commented Oct 27, 2023

> That's what surprised me -- i.e. reading from chicago but inserting into the met table

Ah I see now - I made the corresponding change!
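
One way to avoid this kind of read-from-one, insert-into-another mix-up is to keep each CSV paired with its destination table in a single mapping. This is a hedged sketch, not the code merged here: `SOURCES`, `load_csv`, the `met-images.csv` file name, the column names and the sample rows are all hypothetical. Only `chicago-images.csv` and the two table names appear in the discussion.

```python
import csv
import io
import sqlite3

# Hypothetical pairing of each source CSV with its destination table and ID type.
SOURCES = {
    "met-images.csv": ("met_images", "INTEGER"),
    "chicago-images.csv": ("chicago_images", "TEXT"),
}


def load_csv(conn, csv_file, table, id_type):
    """Create the table if needed, insert every CSV row into it, return the row count."""
    # Identifiers cannot be bound as parameters; they come only from SOURCES above.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" (id {id_type}, image_url TEXT)'
    )
    rows = [(row["id"], row["image_url"]) for row in csv.DictReader(csv_file)]
    conn.executemany(f'INSERT INTO "{table}" (id, image_url) VALUES (?, ?)', rows)
    return len(rows)


# Stand-ins for the real files so the sketch runs without the repository.
fake_files = {
    "met-images.csv": (
        "id,image_url\n"
        "1,https://example.org/met-1.jpg\n"
        "2,https://example.org/met-2.jpg\n"
    ),
    "chicago-images.csv": "id,image_url\nabc-123,https://example.org/chicago-1.jpg\n",
}

conn = sqlite3.connect(":memory:")
loaded = {}
for name, (table, id_type) in SOURCES.items():
    loaded[table] = load_csv(conn, io.StringIO(fake_files[name]), table, id_type)
conn.commit()

print(loaded)  # → {'met_images': 2, 'chicago_images': 1}
```

Because the table name travels with the file name, there is no second place where the destination can drift out of sync with the source.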

@gregsadetsky gregsadetsky merged commit 64a56fa into gregsadetsky:main Oct 29, 2023
2 participants