Fungal reads - Which is the best database? #278

andressamv · 2024-02-05T19:00:51Z

Hi! I have been using Kaiju for a while, and now I am interested in filtering fungal reads. For this, I used the Kaiju app in KBase and compared the results from two different databases: NCBI BLAST nr+euk (protein sequences from nr: Bacteria, Archaea, Viruses, Fungi, and microbial eukaryotes) and fungi (protein sequences from a representative set of fungal genomes). For the same samples, I expected more fungal reads with the more comprehensive database (nr, since I thought RefSeq would be included in nr), but the fungi database gives far more hits. What could explain this?

pmenzel · 2024-02-14T11:03:32Z

Hi! Not all RefSeq genomes are necessarily contained in the BLAST nr database, so it may well be that more reads get classified with the RefSeq fungi database.

You can manually check some of the reads that are classified with the RefSeq database but not with the nr database, and use the NCBI BLAST website to see whether they have good matches in nr.
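To pick reads for that manual check, one option is to compare the two Kaiju output files directly. The following is a minimal sketch, assuming Kaiju's default tab-separated output (first column `C` or `U`, second column the read name, third column the taxon ID); output exported from the KBase app may be laid out differently. The function names and the file names in the usage note are hypothetical.

```python
import sys
from typing import Iterable, List, Set


def classified_reads(lines: Iterable[str]) -> Set[str]:
    """Collect names of reads flagged 'C' (classified) in Kaiju output lines."""
    names = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: column 1 is C/U, column 2 is the read name.
        if len(fields) >= 2 and fields[0] == "C":
            names.add(fields[1])
    return names


def only_in_first(first: Iterable[str], second: Iterable[str]) -> List[str]:
    """Reads classified in the first output but not in the second, sorted."""
    return sorted(classified_reads(first) - classified_reads(second))


# Hypothetical usage: python compare_kaiju.py fungi.out nr_euk.out
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fungi_out, open(sys.argv[2]) as nr_out:
        for name in only_in_first(fungi_out, nr_out):
            print(name)
```

The printed read names can then be used to pull a handful of sequences from the original FASTQ/FASTA file and search them against nr on the NCBI BLAST website.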
