bgen_reader.allele_expectation allocates memory based on unindexed genotype #40

Open
jordanero opened this issue Aug 24, 2021 · 3 comments

@jordanero

jordanero commented Aug 24, 2021

bgen_reader.allele_expectation allocates memory sized for the full, unindexed genotype matrix rather than for the requested index. This causes problems when indexing into a large BGEN file (for example, UK Biobank).

The following code attempts to allocate a 4.45 TiB array when computing the expectation for a single variant and a single sample:

from bgen_reader import open_bgen
bgen = open_bgen('ukb_imp_chr22_v3.bgen', samples_filepath='ukb1404_imp_chr1_v2_s487406.sample', verbose=True)
# Request one sample and one variant; c(1,1) is R syntax, a Python tuple is needed here.
bgen.allele_expectation(index=(1, 1))
Traceback (most recent call last):
  File "", line 1, in
  File "/n/home12/jrossen/.conda/envs/python3/lib/python3.8/site-packages/bgen_reader/_bgen2.py", line 1381, in allele_expectation
    ploidy0 = self.read(return_probabilities=False, return_ploidies=True)[
  File "/n/home12/jrossen/.conda/envs/python3/lib/python3.8/site-packages/bgen_reader/_bgen2.py", line 563, in read
    ploidy_val = np.full(
  File "/n/home12/jrossen/.conda/envs/python3/lib/python3.8/site-packages/numpy/core/numeric.py", line 343, in full
    a = empty(shape, dtype, order)
numpy.core._exceptions.MemoryError: Unable to allocate 4.45 TiB for an array with shape (487409, 1255683) and data type int64
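
For anyone checking the numbers, the 4.45 TiB figure follows directly from the shape and dtype in the error message. The sketch below reproduces that arithmetic and compares it with what an allocation sized by the requested index would need. The helper `axis_length` is hypothetical and only handles `None`, integers, and slices; it is an illustration of the sizing problem, not the library's code or the eventual fix.

```python
TIB = 2 ** 40          # bytes per tebibyte
INT64_BYTES = 8        # size of one int64 element

# Shape reported in the MemoryError: (samples, variants).
full_shape = (487409, 1255683)


def axis_length(n, idx):
    """Length of one axis after indexing (hypothetical helper).

    Supports None (whole axis), a single int, or a slice.
    """
    if idx is None:
        return n
    if isinstance(idx, int):
        return 1
    return len(range(*idx.indices(n)))


def array_bytes(shape, itemsize=INT64_BYTES):
    """Bytes needed for a dense array of the given 2-D shape."""
    rows, cols = shape
    return rows * cols * itemsize


# Allocation sized by the whole file, as in the traceback.
full_bytes = array_bytes(full_shape)

# Allocation sized by the requested index (1, 1): one sample, one variant.
index = (1, 1)
indexed_shape = tuple(axis_length(n, i) for n, i in zip(full_shape, index))
indexed_bytes = array_bytes(indexed_shape)

print(f"full:    {full_bytes / TIB:.2f} TiB")
print(f"indexed: {indexed_bytes} bytes")
```

Under these assumptions the full-shape allocation comes out at about 4.45 TiB, while an allocation sized by the index needs 8 bytes.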

@CarlKCarlK
Collaborator

CarlKCarlK commented Aug 25, 2021 via email

@jordanero
Author

That's helpful. Thanks for making the package!

@CarlKCarlK
Collaborator

This is fixed with branch "fixissue40".

@jordanero, you can install the fix early with
pip install git+https://github.com/limix/bgen-reader-py.git@fixissue40

@horta When you get a chance, can you publish the fix?

- Carl

@CarlKCarlK CarlKCarlK mentioned this issue Sep 2, 2021