Add command line option for sample overlap to mutation prediction scripts #43

Merged
jjc2718 merged 6 commits into greenelab:master from data_types_samples_option on Apr 22, 2021

@jjc2718 (Contributor) commented Apr 21, 2021

Closes #33.

In our paper, we want to compare data types for mutation prediction in 3 different experiments: one using gene expression data only, one using gene expression and methylation, and one using expression/methylation/RPPA/mutational signatures data. In each of these cases, we only want to use TCGA samples that have data for all of these data types.

Before, I was handling this in a super hacky way, by commenting out lines in mpmp/config.py. This change creates a command line option to select the data types to use for calculating the sample overlap, which makes this step much more reproducible and understandable.
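
For readers who have not seen the diff, here is a minimal sketch of the idea, not the actual code in this PR. The flag name `--overlap_data_types`, the data type names, and the `sample_overlap` helper are all hypothetical and chosen for illustration; the real names in the mpmp scripts may differ.

```python
import argparse

# Hypothetical data type names, for illustration only; the real list
# would come from the project's configuration.
KNOWN_DATA_TYPES = ["expression", "methylation", "rppa", "mut_sigs"]


def parse_args(argv=None):
    """Parse the (hypothetical) option that selects data types for the overlap."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--overlap_data_types",
        nargs="+",
        default=["expression"],
        choices=KNOWN_DATA_TYPES,
        help="data types used to compute the set of shared samples",
    )
    return parser.parse_args(argv)


def sample_overlap(samples_by_type, data_types):
    """Return the sorted sample IDs present in every selected data type."""
    sample_sets = [set(samples_by_type[dt]) for dt in data_types]
    return sorted(set.intersection(*sample_sets))


if __name__ == "__main__":
    # Toy sample IDs, not real TCGA identifiers.
    toy_samples = {
        "expression": ["s1", "s2", "s3"],
        "methylation": ["s2", "s3", "s4"],
        "rppa": ["s3", "s5"],
        "mut_sigs": ["s1", "s3"],
    }
    args = parse_args(["--overlap_data_types", "expression", "methylation"])
    print(sample_overlap(toy_samples, args.overlap_data_types))
```

Passing the data types on the command line means each of the three experiments can be rerun with a different flag value instead of a different edit to the config file.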

It's a fairly straightforward change to the code, not too much to review.

@jjc2718 requested a review from miltondp on April 21, 2021, 15:24

@miltondp (Member) left a comment

Looks good to me! I left a couple of minor comments.

@jjc2718 merged commit 337d2da into greenelab:master on Apr 22, 2021
@jjc2718 deleted the data_types_samples_option branch on April 22, 2021, 15:08
Successfully merging this pull request may close these issues.

Sample overlap improvement/refactoring
2 participants