The 'method' column in the 'daa_results_df' data frame contains more than one method. #105

Open
freemutation opened this issue Apr 2, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@freemutation

Describe the Bug
When I use the ggpicrust2 function, it always generates an error. Any help would be greatly appreciated.

Here's my dataset, including the "pred_metagenome_unstrat.tsv" file and the "metadata.txt" file.
pred_metagenome_unstrat.tsv.gz
metadata.txt

Reproducible Example
Here is my code:

library(ggpicrust2)
library(readr)
library(tibble)
library(tidyverse)
library(ggprism)
library(patchwork)
library(ggh4x)

abundance_file <- paste(funcpath,"/pred_metagenome_unstrat.tsv", sep = "")
abundance_data <- read_delim(abundance_file, delim = "\t", col_names = TRUE, trim_ws = TRUE)

metadata = phyloseq::sample_data(ps) %>% as_tibble()
metadata$Group = factor(metadata$Group,levels =c("FHC","FCEP","FREP"))

metadata_com = metadata[metadata$Group %in% c("FHC", "FCEP"),]
abundance_data_com = abundance_data[,c("function", metadata_com$SampleID)]

results_data_input <- ggpicrust2(
  data = abundance_data_com,
  metadata = metadata_com,
  group = "Group",
  pathway = "KO",
  daa_method = "LinDA",
  p_values_bar = TRUE,
  p.adjust = "none",
  ko_to_kegg = TRUE,
  order = "pathway_class",
  select = NULL,
  reference = NULL
)

Actual Behavior
I got an error below:

Starting the ggpicrust2 analysis...

Converting KO to KEGG...

Processing provided data frame...
Loading KEGG reference data. This might take a while...
Performing KO to KEGG conversion. Please be patient, this might take a while...
|==============================================================================================================================| 100%
KO to KEGG conversion completed. Time elapsed: 38.24 seconds.
Removing KEGG pathways with zero abundance across all samples...
KEGG abundance calculation completed successfully.
Performing pathway differential abundance analysis...

Sample names extracted.
Identifying matching columns in metadata...
Matching columns identified: SampleID . This is important for ensuring data consistency.
Using all columns in abundance.
Converting abundance to a matrix...
Reordering metadata...
Converting metadata to a matrix and data frame...
Extracting group information...
Running LinDA analysis...
Performing LinDA analysis...
0 features are filtered!
The filtered data has 27 samples and 212 features will be tested!
Pseudo-count approach is used.
Fit linear models ...
Completed.
Processing LinDA results...
LinDA analysis is complete.
Success: Found 22 statistically significant biomarker(s) in the dataset.
Annotating pathways...

Starting pathway annotation...
DAA results data frame is not null. Proceeding...
KO to KEGG is set to TRUE. Proceeding with KEGG pathway annotations...
We are connecting to the KEGG database to get the latest results, please wait patiently.

Processing pathways in chunks...

|==============================================================================================================================| 100%
Finished processing chunks. Time taken: 1.91 seconds.

Finalizing pathway annotations...

|================= | 14%
Finished finalizing pathway annotations. Time taken: 0.02 seconds.

Returning DAA results filtered annotation data frame...
Creating pathway error bar plots...

The following pathways are missing annotations and have been excluded: ko00281
You can use the 'pathway_annotation' function to add annotations for these pathways.
The 'method' column in the 'daa_results_df' data frame contains more than one method. Please filter it to contain only one method.
The 'group1' or 'group2' column in the 'daa_results_df' data frame contains more than one group. Please filter each to contain only one group.
Error in pathway_errorbar(abundance = abundance, daa_results_df = daa_sub_method_results_df, :
Visualization with 'pathway_errorbar' cannot be performed because there are no features with statistical significance. For possible solutions, please check the FAQ section of the tutorial.
In addition: Warning message:
In MicrobiomeStat::linda(abundance, LinDA_metadata_df, formula = "~Group_group_nonsense_", :
Some features have less than 3 nonzero values!
They have virtually no statistical power. You may consider filtering them in the analysis!

Environment Information:

  • Operating System: Windows & RStudio
  • R Version: 4.3.2
  • Package Version: 1.7.3


freemutation added the bug label Apr 2, 2024
@uemechebe

I'm getting the same error. Any luck with this, @freemutation @cafferychen777?

@AbbiHern

AbbiHern commented Jun 11, 2024

Any update on this? I have code that used to work beautifully and now when I run the same code, I get all sorts of errors, including this one!

I've run the unique() function to see what is contained in the $group1 and $group2 columns, and they both clearly only have one thing.

Mine is also now saying that there is more than one thing in the method column and saying there are no significant features, but there are!

@GERMAN00VP

I have the same problem:
"
Maaslin2 analysis complete. You can view the full analysis results and logs in the current default file location: /home/german/Documents/German/LAST_MAFLD/Maaslin2_results_Overweight
Annotating pathways...

Starting pathway annotation...
DAA results data frame is not null. Proceeding...
KO to KEGG is set to FALSE. Proceeding with standard workflow...
Loading KO reference data...
Returning DAA results data frame...
Creating pathway error bar plots...

The 'method' column in the 'daa_results_df' data frame contains more than one method. Please filter it to contain only one method.
The 'group1' or 'group2' column in the 'daa_results_df' data frame contains more than one group. Please filter each to contain only one group.

"

@cafferychen777
Owner

Hi @GERMAN00VP,

Thank you for reporting this issue. The error occurs because the pathway_errorbar function expects a single method and comparison group in the results. Here's how to fix it:

  1. First, check your DAA results:
# Check unique values in method column
print(unique(daa_results_df$method))

# Check unique values in group columns
print(unique(daa_results_df$group1))
print(unique(daa_results_df$group2))
  2. Filter for a single method and comparison:
# For LinDA results
daa_sub_method_results_df <- daa_results_df %>%
  filter(method == "LinDA") %>%  # Or your specific method
  filter(group1 == "YourGroup1" & group2 == "YourGroup2")  # Your specific groups

# For ALDEx2 results
daa_sub_method_results_df <- daa_results_df %>%
  filter(method == "ALDEx2_Welch's t test") %>%  # Or other ALDEx2 method
  filter(group1 == "YourGroup1" & group2 == "YourGroup2")
  3. Complete working example:
# Load libraries
library(tidyverse)
library(ggpicrust2)

# Your existing code...
results_data_input <- ggpicrust2(
  data = abundance_data_com,
  metadata = metadata_com,
  group = "Group",
  pathway = "KO",
  daa_method = "LinDA",
  ko_to_kegg = TRUE,
  order = "pathway_class"
)

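# Filter results for visualization
# Note: depending on the package version, ggpicrust2() may return a list with one
# element per method; in that case the table may be at results_data_input[[1]]$results.
daa_results_filtered <- results_data_input$daa_results_df %>%
  filter(method == "LinDA") %>%
  filter(group1 == "FHC" & group2 == "FCEP")  # Your specific groups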

# Create visualization
pathway_errorbar(
  abundance = results_data_input$abundance,
  daa_results_df = daa_results_filtered,
  Group = metadata_com$Group,
  p_values_threshold = 0.05,
  order = "pathway_class",
  ko_to_kegg = TRUE,
  p_value_bar = TRUE
)

Important Notes:

  1. Make sure you're using the latest version of ggpicrust2
  2. For multiple group comparisons, you'll need to filter for specific group pairs
  3. The p-value threshold (default 0.05) affects which features are considered significant
  4. Check that your abundance data and metadata match exactly
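
For note 4, one way to confirm that the abundance columns and the metadata sample IDs line up is sketched below. It uses base R only and a toy table. The column names `function`, `SampleID`, and `Group` follow the code earlier in this thread; the sample names and counts are invented for illustration, so treat this as a sketch rather than the package's own check.

```r
# Toy stand-ins for the real tables (values are invented for illustration).
abundance_data <- data.frame(
  `function` = c("K00001", "K00002"),
  S1 = c(10, 0),
  S2 = c(3, 7),
  S3 = c(5, 1),
  check.names = FALSE
)
metadata <- data.frame(
  SampleID = c("S1", "S2", "S4"),
  Group = c("FHC", "FCEP", "FCEP")
)

# Sample columns are everything except the feature ID column.
abundance_samples <- setdiff(colnames(abundance_data), "function")

# IDs present on one side but not the other.
missing_in_metadata <- setdiff(abundance_samples, metadata$SampleID)
missing_in_abundance <- setdiff(metadata$SampleID, abundance_samples)
print(missing_in_metadata)
print(missing_in_abundance)

# Keep only the shared samples, in the same order on both sides.
shared <- intersect(metadata$SampleID, abundance_samples)
metadata_matched <- metadata[match(shared, metadata$SampleID), ]
abundance_matched <- abundance_data[, c("function", shared)]

# After matching, the sample order should be identical on both sides.
stopifnot(identical(colnames(abundance_matched)[-1], metadata_matched$SampleID))
```

If either `missing_in_*` vector is non-empty, the two tables disagree on at least one sample, which is worth resolving before running the differential abundance step.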

If you're still having issues, please share:

  1. The output of str(daa_results_df)
  2. Your specific group comparisons
  3. The p-values of your significant features
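
The three items above could be collected along the following lines. This is a minimal base R sketch on a toy table, assuming the results have `method`, `group1`, `group2`, and `p_adjust` columns; the feature IDs and p-values here are invented, and the 0.05 threshold simply mirrors the default mentioned in the notes.

```r
# Toy DAA results table; column names are assumed to match the real output.
daa_results_df <- data.frame(
  feature = c("ko00010", "ko00020", "ko00030"),
  method = "LinDA",
  group1 = "FHC",
  group2 = "FCEP",
  p_adjust = c(0.01, 0.20, 0.04)
)

# 1. Structure of the results table.
str(daa_results_df)

# 2. Number of distinct methods and group pairs (each should be 1 for plotting).
n_methods <- length(unique(daa_results_df$method))
n_pairs <- nrow(unique(daa_results_df[, c("group1", "group2")]))
print(c(methods = n_methods, group_pairs = n_pairs))

# 3. Features that pass the significance threshold (NA p-values are dropped).
threshold <- 0.05
is_significant <- !is.na(daa_results_df$p_adjust) &
  daa_results_df$p_adjust < threshold
significant <- daa_results_df[is_significant, ]
print(significant[, c("feature", "p_adjust")])
```

An empty `significant` table would be consistent with the "no features with statistical significance" error, independent of the messages about the `method` and `group1`/`group2` columns.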

Best regards,
Chen Yang

@GERMAN00VP

Thank you for your response!

Best regards,
Germán Vallejo


5 participants