Adding metabolic reaction annotations from the Gramene Plant Reactome project to plant genes as xrefs
A registry file with the locations of the core database server(s) and the production database (alternatively, specify the production database with -production_db $PROD_DB_URL).
Reactome flat files (see below).
init_pipeline.pl Bio::EnsEMBL::EGPipeline::PipeConfig::Xref_GPR_conf \
$($CMD details script) \
-hive_force_init 1 \
-registry $REG_FILE \
-xref_reac_file Ensembl2PlantReactomeReactions.txt \
-xref_path_file Ensembl2PlantReactome.txt \
${OTHER_OPTIONS} \
2> $OUT_DIR/init.stderr \
1> $OUT_DIR/init.stdout
SYNC_CMD=$(cat $OUT_DIR/init.stdout | grep -- -sync'$' | perl -pe 's/^\s*//; s/"//g')
# should get something like
# beekeeper.pl -url $url -sync
LOOP_CMD=$(cat $OUT_DIR/init.stdout | grep -- -loop | perl -pe 's/^\s*//; s/\s*#.*$//; s/"//g')
# should get something like
# beekeeper.pl -url $url -reg_file $REG_FILE -loop
$SYNC_CMD 2> $OUT_DIR/sync.stderr 1> $OUT_DIR/sync.stdout
$LOOP_CMD 2> $OUT_DIR/loop.stderr 1> $OUT_DIR/loop.stdout
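The extraction of the sync and loop commands above can be tried out without a database. The following is a minimal sketch, assuming `init.stdout` contains lines shaped like the "should get something like" comments; the sample text, the URL, and the helper function names are hypothetical, and `sed` is used in place of `perl` only to keep the sketch dependency-free (the substitutions mirror the ones above).

```shell
# Hypothetical sample of what init.stdout may contain; real output will differ.
sample_stdout='
    beekeeper.pl -url "mysql://user:pass@host:3306/xref_gpr" -sync
    beekeeper.pl -url "mysql://user:pass@host:3306/xref_gpr" -reg_file reg.pm -loop  # run all analyses
'

# Keep the line ending in -sync, trim leading blanks, drop double quotes.
extract_sync_cmd() {
  grep -- '-sync$' | sed -E 's/^[[:space:]]*//; s/"//g'
}

# Keep the line containing -loop, trim leading blanks,
# strip the trailing "# ..." comment, drop double quotes.
extract_loop_cmd() {
  grep -- '-loop' | sed -E 's/^[[:space:]]*//; s/[[:space:]]*#.*$//; s/"//g'
}

SYNC_CMD=$(printf '%s\n' "$sample_stdout" | extract_sync_cmd)
LOOP_CMD=$(printf '%s\n' "$sample_stdout" | extract_loop_cmd)

echo "$SYNC_CMD"
echo "$LOOP_CMD"
```

Printing both variables before running them is a cheap way to confirm that exactly one command was captured for each step.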
| option | default value | meaning |
|---|---|---|
| -species | | species to process, several -species options are possible |
| -pipeline_dir | | directory to store results to |
| -xref_reac_file | | flat file with Gramene Plant Reactome reactions data |
| -xref_path_file | | flat file with Gramene Plant Reactome pathways data |
| -uppercase_gene_id | 0 | use uppercased versions of stable IDs for mapping; 1 -- to uppercase |
| -production_lookup | 1 | Fetch analysis display name, description and web data from the production database; 0 -- to disable |
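Since -species may be given several times, the extra options passed as ${OTHER_OPTIONS} can be assembled in a loop. This is only a sketch: the species list and the output directory are placeholder values, not defaults of the pipeline.

```shell
# Placeholder values for illustration; adjust to your setup.
OUT_DIR="./xref_gpr_out"
species_list="oryza_sativa triticum_turgidum"

# Repeat the -species option once per species.
OTHER_OPTIONS=""
for sp in $species_list; do
  OTHER_OPTIONS="$OTHER_OPTIONS -species $sp"
done

# Append the directory for pipeline results.
OTHER_OPTIONS="$OTHER_OPTIONS -pipeline_dir $OUT_DIR/pipeline"

# Drop the leading space left over from the first concatenation.
OTHER_OPTIONS=${OTHER_OPTIONS# }

echo "$OTHER_OPTIONS"
```

The resulting string is meant to be expanded unquoted on the init_pipeline.pl command line so that each option becomes a separate argument.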
The pipeline is designed to be run using the add_gramene_reactome.pl script from Ensembl/plant_tools.
Please see Ensembl/plant_tools for setup instructions.
Here is an example usage scenario.
- Get reactome flat files:
wget -c https://plantreactome.gramene.org/download/current/Ensembl2PlantReactomeReactions.txt
wget -c https://plantreactome.gramene.org/download/current/Ensembl2PlantReactome.txt
- Load xrefs using the script itself:
ENS_ROOT_DIR=$(pwd) # path to the ensembl repos
ENS_VERSION=104 # or whatever
PIPELINE_DIR=$(pwd)/pipeline_out # whatever, dir to store intermediate results
mkdir -p ${PIPELINE_DIR}
${ENS_ROOT_DIR}/plant_tools/production/core/add_gramene_reactome.pl \
-v ${ENS_VERSION} \
-R ${REG_FILE} \
-P ${PIPELINE_DIR} \
-reactions Ensembl2PlantReactomeReactions.txt \
-pathways Ensembl2PlantReactome.txt \
-s triticum_turgidum \
-w
Sometimes you need to pass the -uppercase_gene_id 1
option to the underlying pipeline to
allow the use of uppercase gene stable IDs for mapping (e.g. for Oryza sativa (rice)).
- Make sure you have a proper analysis_description entry in your database(s).
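To illustrate why -uppercase_gene_id 1 can matter in the scenario above, here is a rough sketch of a case mismatch between a stable ID in a core database and the same ID in a flat file. Both IDs are illustrative values chosen for this example, and the comparison only mimics the idea; the actual mapping logic lives in the pipeline and may differ.

```shell
# Illustrative IDs only; not taken from a real database or flat file.
core_id="Os01g0100100"      # stable ID as it might be stored in a core database
reactome_id="OS01G0100100"  # the same gene as it might appear in a flat file

# Uppercase a string with tr.
to_upper() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

# A plain comparison is case-sensitive and fails here.
if [ "$core_id" = "$reactome_id" ]; then exact=match; else exact=mismatch; fi

# Comparing the uppercased stable ID succeeds.
if [ "$(to_upper "$core_id")" = "$reactome_id" ]; then upper=match; else upper=mismatch; fi

echo "exact: $exact, uppercased: $upper"
```

If a run maps far fewer genes than expected, comparing the case of a few IDs in the flat files against the stable IDs in the core database is a quick first check.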
A few generic from Common::RunnableDB.
A few from Xref.