Error in value[[3L]](cond) : DESeq2 analysis failed: every gene contains at least one zero, cannot compute log geometric means #130

Open
Liviacmg opened this issue Dec 5, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Liviacmg

Liviacmg commented Dec 5, 2024

Hi, I'm trying to use DESeq2 because ALDEx2 is not running due to problems already reported here, but this error appears when generating daa_results_df. Does anyone know what I could do to solve this? I already tried removing columns/rows that are entirely empty, but the pathway error bar plot that is then generated is missing the log fold change and relative abundance information: it shows only the pathway names, and everything else is blank.

daa_results_df <- pathway_daa(
  abundance = kegg_abundance,
  metadata = metadata,
  group = "Diagnosis",
  daa_method = "DESeq2"
)

Using column 'SampleID' as sample identifier
Running DESeq2 analysis...
converting counts to integer mode
converting counts to integer mode
Error in value[[3L]](cond) :
DESeq2 analysis failed: every gene contains at least one zero, cannot compute log geometric means

@Liviacmg added the bug label Dec 5, 2024
@cafferychen777
Owner

Dear Livia,

Thank you for reporting this issue with DESeq2 analysis in ggpicrust2. I apologize for the delayed response as I am currently preparing for finals.

I understand you're encountering the error "every gene contains at least one zero, cannot compute log geometric means" and having issues with the pathway visualization. This is an important technical issue that requires careful attention to resolve.

I will provide a detailed response after December 10th with:

  1. The potential causes of this DESeq2 error
  2. Recommended solutions and workarounds
  3. Guidelines for proper data preprocessing
  4. Alternative approaches if needed

Please expect a thorough technical solution from me next week.

Thank you for your patience and understanding.

Best regards,
Chen

@cafferychen777
Owner

Hi @Liviacmg,

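Thank you for reporting this issue. I've analyzed the error and found that it's related to DESeq2's handling of zero counts in your dataset. Specifically, the error is raised when every pathway has a zero count in at least one sample: the geometric mean of a row that contains a zero is itself zero, so DESeq2 has no pathway left from which to compute the log geometric means it uses for size-factor estimation.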
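To make the condition concrete, here is a minimal base R sketch on a made-up toy matrix (the pathway IDs and counts are illustrative only, not taken from the reported dataset). It assumes pathways are rows, samples are columns, and all entries are numeric.

```r
# Toy pathway-by-sample count matrix (hypothetical values for illustration).
counts <- matrix(
  c(0, 12,  7,
    5,  0,  9,
    3,  8,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ko00010", "ko00020", "ko00030"), c("S1", "S2", "S3"))
)

# Log geometric mean of each pathway across samples.
# log(0) is -Inf, so a single zero makes the whole row mean -Inf.
log_geo_means <- rowMeans(log(counts))

# TRUE for pathways that have no zero in any sample.
zero_free <- rowSums(counts == 0) == 0

# Number of pathways with a finite log geometric mean.
# When this is 0, the situation matches the error message.
n_usable <- sum(zero_free)
```

On real data, the same check would be `sum(rowSums(kegg_abundance == 0) == 0)`, assuming `kegg_abundance` contains only numeric columns. A result of 0 means every pathway has a zero somewhere.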

Here are several solutions you can try:

  1. Pre-filter your data:
# Remove pathways that have zero counts in all samples
kegg_abundance <- kegg_abundance[rowSums(kegg_abundance) > 0, ]

# Optional: Remove pathways with very low counts
min_count <- 10  # adjust this threshold as needed
kegg_abundance <- kegg_abundance[rowSums(kegg_abundance) >= min_count, ]
  2. Use alternative analysis methods:

    a. Using "ALDEx2" (default method, more robust to zero counts):

daa_results_df <- pathway_daa(
  abundance = kegg_abundance,
  metadata = metadata,
  group = "Diagnosis",
  daa_method = "ALDEx2"
)

    b. Using "LinDA" (specifically designed for microbiome data):

daa_results_df <- pathway_daa(
  abundance = kegg_abundance,
  metadata = metadata,
  group = "Diagnosis",
  daa_method = "LinDA"
)
  3. Add pseudocounts (if you must use DESeq2):
kegg_abundance <- kegg_abundance + 1
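As a rough illustration of why a pseudocount removes the error, the base R sketch below compares it with a geometric mean computed over positive counts only. The second idea is in the spirit of the `type = "poscounts"` option that DESeq2 documents for `estimateSizeFactors()`, but this is a hand-rolled approximation on made-up data, not DESeq2's own code, and nothing in this thread confirms that `pathway_daa()` exposes that option. The helper name `geo_mean_nz` is hypothetical.

```r
# Toy pathway-by-sample count matrix (hypothetical values for illustration).
counts <- matrix(
  c(0, 12,  7,
    5,  0,  9,
    3,  8,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ko00010", "ko00020", "ko00030"), c("S1", "S2", "S3"))
)

# Option A: pseudocount. Every entry becomes >= 1, so log() is finite
# everywhere and each pathway gets a finite log geometric mean.
log_gm_pseudo <- rowMeans(log(counts + 1))

# Option B: geometric mean over positive counts only (hypothetical helper).
# Logs of the positive entries are summed, then divided by the number of samples.
geo_mean_nz <- function(x) {
  if (all(x == 0)) return(0)
  exp(sum(log(x[x > 0])) / length(x))
}
gm_poscounts <- apply(counts, 1, geo_mean_nz)

# Median-of-ratios style size factor per sample, using only pathways
# with a positive reference mean and a positive count in that sample.
size_factors <- apply(counts, 2, function(s) {
  keep <- gm_poscounts > 0 & s > 0
  exp(median(log(s[keep]) - log(gm_poscounts[keep])))
})
```

Either route gives every sample a finite, positive size factor on this toy matrix. The pseudocount changes every count, including the non-zero ones, whereas the positive-counts variant leaves the data untouched and only changes the reference used for normalization.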

Regarding the blank pathway error bar plot: This is because the differential analysis did not produce valid results. After applying one of the solutions mentioned above, the visualization should work properly.

Please let me know if you need any clarification or encounter other issues!

Best regards,
Chen
