one error occurred when I used the miniExample dir #20

Open
lirui-max opened this issue Jan 31, 2023 · 3 comments

Comments

@lirui-max

Hi, professor.
I got an error when running the example in the miniExample directory:

annotateGenesViaCESAR.pl POLR3K hg38_oryAfe1.bb twoGenes.gp.forCESAR hg38 oryAfe1 CESARoutput 2bitDir $profilePath -maxMemory 1
Processing gene 'POLR3K'
-speciesList is not a valid option
Error running '/bin/bash -c 'set -o pipefail; mafExtract -region=chr16:53475-53586 hg38_oryAfe1.bb stdout|mafSpeciesSubset stdin NULL /dev/shm/exon.maf.qaUzT18Za -speciesList=oryAfe1,hg38''

@MichaelHiller
Contributor

This error occurs in mafSpeciesSubset.
You are using UCSC's mafSpeciesSubset, which does not support the -speciesList option.
If you compile the kent binaries that we provide with CESAR2 and put them first in your PATH, it will work, because we added this parameter to the provided mafSpeciesSubset.

@lirui-max
Author

Thank you very much, professor.

I ran into trouble compiling the provided kent source, so I fetched the UCSC kent binaries with the following command instead, and it finally runs.
rsync -azvP rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ ./

Could the CESAR2 GitHub repository provide an "rsync" command like this to make it easier to use?

Thank you very much again, professor.

@MichaelHiller
Contributor

Sorry, but you got something wrong.

  1. If you rsync the UCSC binaries (I just did that), UCSC's mafSpeciesSubset does NOT have the -speciesList parameter.
  2. If you follow our documentation and compile it locally:
    git clone https://github.com/hillerlab/CESAR2.0/
    cd CESAR2.0/
    export PATH=$(pwd)/kent/bin:$(pwd)/tools:$PATH
    export profilePath=$(pwd)
    make
    cd kent/src
    make
    cd ../../

then
which mafSpeciesSubset
~/CESAR2.0/kent/bin/mafSpeciesSubset

~/CESAR2.0/kent/bin/mafSpeciesSubset
mafSpeciesSubset - Extract a maf that just has a subset of species.
usage:
mafSpeciesSubset in.maf species.lst out.maf
Where:
in.maf is a file where the sequence source are either simple species
names, or species.something. Usually actually it's a genome
database name rather than a species before the dot to tell the
truth.
species.lst is a file with a list of species to keep
out.maf is the output. It will have columns that are all - or . in
the reduced species set removed, as well as the lines representing
species not in species.lst removed.
options:
-speciesList=species1,species2,... - list of species to keep. Overrides 'species.lst' (set this to NULL)
-keepFirst - If set, keep the first 'a' line in a maf no matter what
Useful for mafFrag results where we use this for the gene name

--> This binary provides the parameter our workflow needs.
So no rsync is necessary.
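
The check above can be sketched as a small script that reports which mafSpeciesSubset comes first on PATH and whether its usage text mentions -speciesList. This is only a sketch: the function names `usage_has_species_list` and `check_maf_species_subset` are made up for illustration and are not part of CESAR2, and it assumes the binary prints its usage when run without arguments, as in the output shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the given usage text mentions -speciesList.
usage_has_species_list() {
    printf '%s\n' "$1" | grep -q -e '-speciesList'
}

# Hypothetical helper: report which mafSpeciesSubset is first on PATH
# and whether it looks like the CESAR2-provided build.
check_maf_species_subset() {
    local bin usage
    bin=$(command -v mafSpeciesSubset) || {
        echo "mafSpeciesSubset: not found on PATH"
        return 1
    }
    # Assumption: running the tool without arguments prints its usage text.
    # The exit status of that call is ignored.
    usage=$("$bin" 2>&1 || true)
    if usage_has_species_list "$usage"; then
        echo "$bin: usage mentions -speciesList"
    else
        echo "$bin: usage does not mention -speciesList (likely the stock UCSC build)"
        return 1
    fi
}

check_maf_species_subset || true
```

If the script reports a binary outside CESAR2.0/kent/bin, the PATH export from step 2 has probably not taken effect in the current shell.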
