Improve BED Classifier #60

Open
donaldcampbelljr opened this issue Apr 11, 2024 · 2 comments

@donaldcampbelljr
Member

This issue tracks Phase 2 of the BED Classifier system. Phase 1: #34

As we populate the database, we will find some BED files are not correctly classified.
A current example from the most recent upload: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM6754599

We can start with narrowPeak classification: use GEOFetch to pull narrowPeak files, run BedClassifier on them to determine where the false negatives occur, adjust the classification algorithm, and then re-insert the new (and hopefully more accurate) classifications.
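
As a rough illustration of the "find the false negatives" step, here is a minimal sketch. It is not BedClassifier's actual logic: `looks_like_narrowpeak` and `find_false_negatives` are hypothetical names, the check is a strict reading of the 10-column narrowPeak (BED6+4) layout, and the files are assumed to already be downloaded and read into lists of lines.

```python
from typing import Callable, Dict, Iterable, List


def looks_like_narrowpeak(rows: Iterable[str]) -> bool:
    """Hypothetical strict check: every data row fits the narrowPeak (BED6+4) layout."""
    seen_data = False
    for row in rows:
        row = row.strip()
        # Skip blank lines and common header lines.
        if not row or row.startswith(("#", "track", "browser")):
            continue
        fields = row.split("\t")
        if len(fields) != 10:
            return False
        try:
            start, end = int(fields[1]), int(fields[2])
            score = int(fields[4])
            # signalValue, pValue and qValue must parse as numbers.
            float(fields[6])
            float(fields[7])
            float(fields[8])
            peak = int(fields[9])
        except ValueError:
            return False
        if not (0 <= start <= end and 0 <= score <= 1000):
            return False
        if fields[5] not in {"+", "-", "."} or peak < -1:
            return False
        seen_data = True
    return seen_data


def find_false_negatives(
    files: Dict[str, List[str]],
    classify: Callable[[List[str]], str],
) -> List[str]:
    """Names of files known to be narrowPeak that `classify` labels as something else."""
    return [name for name, rows in files.items() if classify(rows) != "narrowpeak"]
```

Every file passed to `find_false_negatives` is assumed to be a true narrowPeak (for example, because it was pulled as one), so anything returned is a candidate false negative. A strict rule such as the 0-1000 score range is the kind of check that may need relaxing if real uploads violate it.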

@donaldcampbelljr donaldcampbelljr added this to the v0.3.0 milestone Apr 11, 2024
@donaldcampbelljr donaldcampbelljr self-assigned this Apr 11, 2024
@nsheff
Member

nsheff commented Aug 9, 2024

@donaldcampbelljr can you give an update on the status of this?

@donaldcampbelljr
Member Author

This phase has not begun yet. The initial strategy posted above is still a good place to start, I believe. We did discuss potentially adding a column to the database that notes the discrepancy between the bedboss classification and the user/file extension classification (giving us a list of files to check). I don't believe this functionality was added to bedboss, however.
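
For reference, the discrepancy check discussed above could look roughly like the sketch below. This is only an assumption about how such a flag might be computed, not existing bedboss functionality: `SUFFIX_LABELS`, `label_from_filename`, and `has_discrepancy` are hypothetical names, and the suffix-to-label mapping is illustrative.

```python
from typing import Optional

# Hypothetical mapping from filename suffix to the format the uploader implied.
SUFFIX_LABELS = {
    ".narrowpeak": "narrowpeak",
    ".broadpeak": "broadpeak",
    ".bed": "bed",
}
COMPRESSION_SUFFIXES = (".gz", ".bz2")


def label_from_filename(filename: str) -> Optional[str]:
    """Guess the intended format from the file extension, ignoring compression."""
    name = filename.lower()
    for comp in COMPRESSION_SUFFIXES:
        if name.endswith(comp):
            name = name[: -len(comp)]
            break
    for suffix, label in SUFFIX_LABELS.items():
        if name.endswith(suffix):
            return label
    return None  # Extension says nothing about the format.


def has_discrepancy(filename: str, classifier_label: str) -> bool:
    """True when the extension implies a format that differs from the classifier's label."""
    expected = label_from_filename(filename)
    return expected is not None and expected != classifier_label.lower()
```

Storing the result of `has_discrepancy` alongside each record would give the list of files to check by hand; files with an uninformative extension are deliberately not flagged.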
