Unable to retrieve the predictions (i.e low confidence Mosaic) #39

Open
ngs1810 opened this issue Sep 23, 2023 · 1 comment
ngs1810 commented Sep 23, 2023

Hello.

When I used MosaicForecast (MF) in 2021, the genotype prediction step produced the following categories:

cat $file.genotype.predictions.refined.300721.bed | cut -f35 | sort | uniq -c
23578 het
682 mosaic
1 mosaic;cautious:AF<0.01;low-confidence:extra-high-coverage
19 mosaic;cautious:only-1-altallele
39 mosaic;cautious:only-1-altallele;low-confidence:extra-high-coverage
1136 mosaic;low-confidence:extra-high-coverage
50 mosaic;low-confidence:extra-high-coverage;low-confidence:likelyCNV
1 prediction
6 refhom
1562 repeat

However, I re-downloaded the software in March 2022, and the same sample no longer gives the same mosaic predictions. Every mosaic call is now labelled simply "mosaic", without the "cautious" and "low-confidence" annotations it carried previously. I have been using the same commands since 2021, and the total number of mosaic calls is the same in both outputs (1927). I just need to filter out the low-confidence mosaic calls in my analysis.

cat 003P.genotype.predictions.refined.bed | cut -f35 | sort | uniq -c
23579 het
1927 mosaic
1 prediction
6 refhom
1562 repeat

Command Used:
singularity run -B /hpcfs /hpcfs/users/$USER/mosaicforecast_0.0.1.sif Prediction.R $DIR/${sample[$SLURM_ARRAY_TASK_ID]}.features.bed $MFORECAST/models_trained/50xRFmodel_addRMSK_Refine.rds Refined $DIR/${sample[$SLURM_ARRAY_TASK_ID]}.genotype.predictions.refined.bed

I am not sure how to get the original classification back, although I could reconstruct it manually in R. Please let me know if there are additional settings that I am not aware of.

Thank you.
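
For the filtering step itself, a minimal shell sketch follows. It assumes, as in the `cut -f35` commands above, that the prediction label is the 35th tab-separated column, and that every cautious or low-confidence call carries a `;`-separated suffix after `mosaic`. The function name `keep_plain_mosaic` is hypothetical and not part of MosaicForecast. It can only help on output that still carries the annotations, such as the 2021 file.

```shell
# Keep only rows whose label in the given column is exactly "mosaic",
# dropping annotated calls such as "mosaic;low-confidence:extra-high-coverage".
# The header row and all non-mosaic rows are dropped as well.
# Usage: keep_plain_mosaic FILE [COLUMN]   (COLUMN defaults to 35, an assumption
# taken from the cut -f35 commands in this thread)
keep_plain_mosaic() {
    awk -F'\t' -v col="${2:-35}" '$col == "mosaic"' "$1"
}
```

For example, `keep_plain_mosaic sample.genotype.predictions.refined.bed > sample.mosaic.confident.bed` would write only the unannotated mosaic rows to a new file (both file names here are placeholders).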

Collaborator

douym commented Sep 25, 2023

Hi @ngs1810 ,

Thanks for your message. I checked https://github.com/parklab/MosaicForecast/blob/master/Prediction.R and confirmed that the "low-confidence" predictions are still there. Is it possible that your input lines happen not to contain any low-confidence mutations?

best wishes,

Y.
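
To see exactly which calls lost their annotation between the two runs, the labels can be cross-tabulated row by row. This is a rough sketch, assuming both prediction files list the same sites in the same order and keep the label in column 35; `compare_labels` is a hypothetical helper name, not a MosaicForecast tool.

```shell
# Cross-tabulate the labels of two prediction files, row by row.
# Assumes both files contain the same rows in the same order.
# Usage: compare_labels OLD_FILE NEW_FILE [COLUMN]   (COLUMN defaults to 35)
compare_labels() {
    col="${3:-35}"
    old_labels=$(mktemp)
    new_labels=$(mktemp)
    cut -f"$col" "$1" > "$old_labels"
    cut -f"$col" "$2" > "$new_labels"
    # One output line per (old label, new label) pair, prefixed by its row count.
    paste "$old_labels" "$new_labels" | sort | uniq -c
    rm -f "$old_labels" "$new_labels"
}
```

A pair such as `mosaic;low-confidence:extra-high-coverage` next to plain `mosaic` in the result would indicate that the same site was annotated in the old run but not in the new one. If the row order differs between the files, the rows would need to be matched on their site columns first.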
