On disk layers in H5AD file #63
Comments
I think I might know what's going on here but I need to look into it more. Do you have an example file you would be happy to share? A small subset of this data would be perfect.
After a lot of digging and working with my colleague, I figured out it's an issue with the way the H5AD was generated with pegasus. Instead of using the standard
Here is some code to reproduce the H5AD file with this issue.
import pegasus as pg
import pandas as pd
# wget https://storage.googleapis.com/terra-featured-workspaces/Cumulus/MantonBM_nonmix_subset.zarr.zip
data = pg.read_input('MantonBM_nonmix_subset.zarr.zip')
pg.identify_robust_genes(data)
# Transform counts, but retain original in backup_matrix
# The default is raw.X which saves to h5ad_file/raw like we expect
# When backup_matrix is set to a custom value, the result is saved in layers/customField
# This is what causes the problem, since h5ad_file/raw is no longer written
# https://pegasus.readthedocs.io/en/stable/api/pegasus.log_norm.html#pegasus.log_norm
pg.log_norm(data, backup_matrix='raw_new')
pg.write_output(data, "out.h5ad")
Based on this, it's not a
Agree?
Best,
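As a rough illustration of the layout difference described in the comments above, here is a minimal sketch that reports where expression matrices sit in an H5AD file, given a list of HDF5 object names. The function name `matrix_locations` and the example listing are hypothetical and not taken from pegasus, anndata or zellkonverter. The sketch only assumes the usual H5AD convention that sparse matrices are stored as groups containing `data`, `indices` and `indptr`.

```python
from typing import Iterable, List


def matrix_locations(paths: Iterable[str]) -> List[str]:
    """List the places in an H5AD file that hold an expression matrix.

    `paths` are HDF5 object names such as "X", "raw/X" or
    "layers/raw_new/data". Children of a sparse-matrix group are
    folded back into the matrix they belong to.
    """
    found = set()
    for path in paths:
        parts = path.strip("/").split("/")
        if parts[0] == "X":
            found.add("X")
        elif parts[:2] == ["raw", "X"]:
            found.add("raw/X")
        elif parts[0] == "layers" and len(parts) >= 2:
            # Every direct child of "layers" is one named layer.
            found.add("layers/" + parts[1])
    return sorted(found)


# Hypothetical listing for a file written with backup_matrix='raw_new':
# the counts appear under layers/raw_new and there is no raw/X entry.
example_paths = [
    "X", "X/data", "X/indices", "X/indptr",
    "layers", "layers/raw_new", "layers/raw_new/data",
    "layers/raw_new/indices", "layers/raw_new/indptr",
    "obs", "var",
]
locations = matrix_locations(example_paths)
print(locations)
```

For a real file, the object names could be collected with h5py, for example by opening the file read-only and passing a list's `append` method to `File.visit`. That step is left out here so the sketch runs without extra dependencies.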
Thanks! I still need to test things but I think that makes sense. We should actually be able to support
Hi! This is just a longer 👀 message ;)
@Nick-Eagles @abspangler13 and I are going to be using some of the same files Gabriel is using, so we will have the same issues Gabriel described. Thank you Gabriel et al for spearheading this and thanks Luke for your support! From our side, Nick is the one who has R and Python experience, and thus has been our in-house
Best,
The main thing that would be helpful would be an example
We have worked out the issue on our end, by writing to
Thanks!
Thanks Gabriel, Prashant, @Nick-Eagles et al for figuring this one out! Thanks again Luke for the support ^^.
I have an H5AD file that stores both normalized data and raw counts produced by pegasus. I can use zellkonverter to read the default normalized counts as a DelayedMatrix, but the raw counts are imported as a dgCMatrix. How can I use a DelayedMatrix instead?
This follows up on our conversation in #57, but applied to the new H5AD format.
As for the details, I have an H5AD file with the structure:
> h5ls example.h5ad
X        Group
layers   Group
obs      Group
obsm     Group
obsp     Group
raw      Group
uns      Group
var      Group
varm     Group
varp     Group
where X stores normalized data and layers/raw_new stores the raw counts.
I read the data in using:
The raw_new field is a 12 GB dgCMatrix.
I have zellkonverter v1.7.0, using anndata version 0.8.0.
Cheers,
Gabriel