Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

develop dosage caller #7

Merged
merged 15 commits into from
Jan 22, 2024
Merged

develop dosage caller #7

merged 15 commits into from
Jan 22, 2024

Conversation

alkaZeltser
Copy link
Collaborator

@alkaZeltser alkaZeltser commented Jan 15, 2024

This PR contains one new function: the dosage caller. It converts a genotype input (in the form of alphabetical alleles) into 0/1/2 dosages relative to the specified risk allele.

Also unit tests as per usual.

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)-[brief_description_of_branch].

  • I have set up or verified the branch protection rule following the github standards before opening this pull request.

  • I have added the changes included in this pull request to NEWS under the next release version or unreleased, and updated the date.

  • I have updated the version number in metadata.yaml and DESCRIPTION.

  • Both R CMD build and R CMD check run successfully.

Testing Results

All tests pass.

[ FAIL 0 | WARN 6 | SKIP 0 | PASS 103 ]

@alkaZeltser alkaZeltser marked this pull request as ready for review January 15, 2024 22:05

dosage <- rep(NA, length(called.alleles));
for (i in 1:length(called.alleles)) {
if ((split.alleles$called.allele.a[i] == missing.label) & (split.alleles$called.allele.b[i] == missing.label)) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this dosage calling algorithm does not explicitly account for a case where one of the alleles is known and the other is missing e.g. A/. or ./A. I don't recall if this is a possible result from variant calling. I would expect that this is not possible, and if it was, I don't really know how I would interpret that result. As it is, the algorithm just counts the number of risk alleles so if A is a risk allele the above examples would result in dosage 1. @tyamaguchi-ucla would you mind chiming in briefly on whether you think a dosage calling algorithm should allow this type of input?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alkaZeltser There are so many variant callers, so it's possible we might encounter this case, although I'm not sure if I've seen one. I suggest we treat any edge cases as warnings (still skip them) and update the package if necessary.

@alkaZeltser
Copy link
Collaborator Author

@forbiddenpersimmon Sorry I realized that I really dropped the ball on formatting checks in this function. Added more checks and lots of new tests

Copy link

@forbiddenpersimmon forbiddenpersimmon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes to the function! It's clearer now what checks it is performing on the input vectors. Tests also are more comprehensive, LGTM on that front!

There's one more lintr error related to whitespace to fix. Do you have an extension installed to highlight trailing spaces? I would recommend installing one, if not.

@alkaZeltser
Copy link
Collaborator Author

Added just one more test, for using real data. I'll check with Taka about the ./A case on Friday during our 1:1 and then merge.

@alkaZeltser
Copy link
Collaborator Author

alkaZeltser commented Jan 18, 2024

There's one more lintr error related to whitespace to fix. Do you have an extension installed to highlight trailing spaces? I would recommend installing one, if not.

I do! But it's kinda subtle and I still miss it sometimes lol!

@alkaZeltser
Copy link
Collaborator Author

@forbiddenpersimmon I made one more adjustment to handle the half missing genotype case. Would you mind just tracing through the main allele calling logic one more time, to make sure I didn't miss something?

@forbiddenpersimmon
Copy link

@forbiddenpersimmon I made one more adjustment to handle the half missing genotype case. Would you mind just tracing through the main allele calling logic one more time, to make sure I didn't miss something?

Missing semi-colon here. Logic looks good with new case!

@alkaZeltser alkaZeltser merged commit 5f37463 into main Jan 22, 2024
3 checks passed
@alkaZeltser alkaZeltser deleted the nzeltser-develop-dosage-caller branch January 22, 2024 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants