Methods that take a position (`pos`) as an argument exhibit unexpected behaviour if the position is too large to be supported by the variantkey format (i.e. if the position does not fit in the available 28 bits). In this case, the high bits of the position can end up being shifted into the bits reserved for encoding the chromosome.
The methods in question are `encode_variantkey` and `variantkey_range`.
For example, if `encode_variantkey` is called with `chrom=2` and `pos=2^28`, then a variantkey on chromosome 3 is returned, because a `1` bit from the position has been shifted into the least significant bit of the chromosome encoding, which was previously a `0`.
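The overflow described above can be reproduced with a minimal sketch of the bit packing. The layout used here (chromosome bits directly above 28 position bits, which sit above 31 ref/alt bits in a `u64`) and the helper names `pack_unchecked` and `chrom_of` are assumptions made for illustration; they are not the library's actual code.

```rust
// Assumed layout for illustration only: [chrom | 28-bit pos | 31-bit ref/alt].
const POS_BITS: u32 = 28;
const REFALT_BITS: u32 = 31;
const POS_SHIFT: u32 = REFALT_BITS; // position starts at bit 31
const CHROM_SHIFT: u32 = POS_BITS + REFALT_BITS; // chromosome starts at bit 59

/// Hypothetical packer that, like the behaviour reported in this issue,
/// does not mask `pos` to 28 bits before shifting it into place.
/// `refalt` is assumed to already fit in 31 bits.
fn pack_unchecked(chrom: u8, pos: u32, refalt: u32) -> u64 {
    ((chrom as u64) << CHROM_SHIFT) | ((pos as u64) << POS_SHIFT) | (refalt as u64)
}

/// Hypothetical decoder: reads the chromosome code back from the top bits.
fn chrom_of(vk: u64) -> u8 {
    (vk >> CHROM_SHIFT) as u8
}
```

Under these assumptions, `chrom_of(pack_unchecked(2, (1 << 28) - 1, 0))` is 2, while `chrom_of(pack_unchecked(2, 1 << 28, 0))` is 3, matching the example in the report.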
@abowden1989 My understanding is that chromosome 1 is the largest human chromosome, with about 249 million nucleotide base pairs. 28 bits can hold a maximum position of 2^28 - 1 = 268,435,455, which exceeds 249 million.
The library is deliberately designed for performance; any value range check, if required, is the responsibility of the caller.
Yes, 28 bits is enough to encode any position on the human genome, so there is no issue in that regard. However, the signatures of the methods in question happily accept any 32-bit integer. If someone asks for the `variantkey_range` on chromosome 1 where the minimum position is 0 and the maximum position is `i32::MAX`, they might expect to receive all valid variantkeys on chromosome 1, but instead the returned range will overflow into other chromosomes.
We (Genomics plc) have no intention of addressing this in this repo at this time, and the workaround is never to pass positions >= 2^28.
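A caller-side guard for this workaround might look like the sketch below. The names `MAX_POS`, `checked_pos` and `clamp_pos` are hypothetical and are not part of the library; the only fact taken from this thread is the 28-bit limit on positions.

```rust
/// Largest position that fits in 28 bits (2^28 - 1 = 268,435,455).
const MAX_POS: u32 = (1 << 28) - 1;

/// Hypothetical helper: returns the position unchanged if it fits in
/// 28 bits, or `None` so the caller can reject it before encoding.
fn checked_pos(pos: u32) -> Option<u32> {
    if pos <= MAX_POS {
        Some(pos)
    } else {
        None
    }
}

/// Hypothetical helper: clamps an upper bound to the largest encodable
/// position, e.g. before asking for a range over a whole chromosome.
fn clamp_pos(pos: u32) -> u32 {
    pos.min(MAX_POS)
}
```

A caller could run `checked_pos` on any position before passing it to `encode_variantkey`, and `clamp_pos` on the maximum position before passing it to `variantkey_range`, so that a bound such as `i32::MAX` stays on the requested chromosome.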