Dataset a good fit for Foundry? #75

Open
sgbaird opened this issue Jul 2, 2022 · 3 comments

@sgbaird
Member

sgbaird commented Jul 2, 2022

@blaiszik

https://github.com/MLMI2-CSSI/foundry

Right now the dataset is on figshare, and I have thought about getting it onto matminer. Downloading from figshare works for now, but it seemed to take more code than should be necessary to download the file, make sure I'm not getting an incomplete download, etc.
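
For reference, here is a minimal sketch of the "make sure I'm not getting an incomplete download" part, using only the Python standard library. It is not the code used for this dataset. The names `file_digest` and `fetch_verified` are hypothetical, and the sketch assumes a checksum for the file is already known (for example, copied from the hosting page) so it can be passed in as `expected_digest`.

```python
import hashlib
import os
import tempfile
import urllib.request
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def fetch_verified(url, dest, expected_digest, algorithm="md5"):
    """Download `url` to `dest`, keeping the file only if its checksum matches.

    Hypothetical helper: `expected_digest` is assumed to be known in advance.
    """
    dest = Path(dest)
    # Reuse an existing copy if it is already complete.
    if dest.exists() and file_digest(dest, algorithm) == expected_digest:
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a partial download never sits at `dest`.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
            while chunk := response.read(1 << 20):
                tmp.write(chunk)
        actual = file_digest(tmp_name, algorithm)
        if actual != expected_digest:
            raise ValueError(
                f"checksum mismatch: expected {expected_digest}, got {actual}"
            )
        # Move the verified file into place in one step.
        os.replace(tmp_name, dest)
    finally:
        # Clean up the temporary file if the download or the check failed.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return dest
```

A truncated download then raises a `ValueError` instead of leaving a partial file in the cache, and a second call with the same arguments returns the cached copy without downloading again.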

@sgbaird
Member Author

sgbaird commented Aug 5, 2022

Internal reply from @blaiszik to my question about whether to use Forge or Foundry:

I’d suggest Foundry most likely. The difference is that there is more structured data there. Forge is helpful (and used in Foundry), but we created Foundry specifically because of how time consuming it is to make sense of the unstructured data in MDF that Forge queries return.
The downside is that Foundry is new and we have fewer datasets there. But it’s where the bulk of our ML effort is invested now and likely in the future.

MLMI2-CSSI/foundry#240, MLMI2-CSSI/foundry#239

@sgbaird sgbaird transferred this issue from sparks-baird/mp-time-split Jun 17, 2023
@kjappelbaum

kjappelbaum commented Jun 30, 2023

Another alternative to look into might be https://github.com/cthoyt/pystow (for mofdscribe I currently have Zenodo archives and pull from Zenodo using pystow, because it allows for nice versioning of the datasets).

@sgbaird
Member Author

sgbaird commented Jul 1, 2023

@kjappelbaum pystow is actually what I'm already using, thanks to your suggestion on xtal2png, I'm pretty sure :) For this dataset, I'm using figshare instead of Zenodo. Part of the reason for considering matminer and Foundry is to get the dataset onto materials-specific platforms for better visibility, though that introduces the problem of keeping the copies on different platforms in sync. For now, pystow + figshare is functional, though recently I've liked Zenodo better. I'm using pystow + Zenodo with an API for a more recent project: https://github.com/sparks-baird/matsci-opt-benchmarks
