% ep3util(1) irdmtools user manual | version 0.0.89 5850ad61
% R. S. Doiel and Tom Morrell
% 2024-10-03
ep3util
ep3util [OPTIONS] ACTION [ACTION_PARAMETERS ...]
ep3util provides a quick wrapper around the EPrints 3.3 REST API. By default ep3util looks for the following environment variables.
REPO_ID : the EPrints repository id (name of database and archive subdirectory).
EPRINT_HOST : the hostname for EPrints.
EPRINT_USER : the username having permissions to access the EPrints REST API.
EPRINT_PASSWORD : the password for the username with access to the EPrints REST API.
C_NAME : if harvesting, the name of the dataset collection to harvest the records into.
EPRINT_DB_HOST : the MySQL hostname holding the EPrints repository database.
EPRINT_DB_USER : the MySQL username used to access the EPrints repository database.
EPRINT_DB_PASSWORD : the MySQL password used to access the EPrints repository database.
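As a minimal sketch, the variables listed above could be exported in a shell session before running ep3util. Every value here (repository id, hostname, user names, passwords, collection name) is a hypothetical placeholder, not a default taken from ep3util.

```shell
# All values below are hypothetical placeholders; substitute your own.
export REPO_ID="example_repo"
export EPRINT_HOST="eprints.example.edu"
export EPRINT_USER="api_user"
export EPRINT_PASSWORD="changeme"
# Only needed when harvesting: the dataset collection to harvest into.
export C_NAME="example_repo.ds"

# Optional: MySQL access settings for the EPrints repository database.
export EPRINT_DB_HOST="localhost"
export EPRINT_DB_USER="eprints_reader"
export EPRINT_DB_PASSWORD="changeme"
```

Keeping real passwords out of shell history and shared scripts is advisable; a file readable only by your own account is one option.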
The environment variables provide the default configuration values. They may be overridden by using a JSON configuration file. The corresponding attributes are "repo_id", "eprint_host", "c_name", "eprint_db_host", "eprint_db_user", and "eprint_db_password".
If the environment variables for MySQL access are set, the results reflect direct access to the database instead of the EPrints REST API.
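A configuration file using the attributes named above might be written as in the following sketch. The flat JSON object layout and all values are assumptions made for illustration; the output of the setup action is the authoritative template.

```shell
# Write a hypothetical configuration file; every value is a placeholder
# and the flat object layout is an assumption, not confirmed by ep3util.
cat > irdmtools.json <<'EOF'
{
  "repo_id": "example_repo",
  "eprint_host": "eprints.example.edu",
  "c_name": "example_repo.ds",
  "eprint_db_host": "localhost",
  "eprint_db_user": "eprints_reader",
  "eprint_db_password": "changeme"
}
EOF
# The file holds a password, so restrict it to the current user.
chmod 600 irdmtools.json
```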
help : display help
license : display license
version : display version
config : provide a path to an alternate configuration file (e.g. "irdmtools.json")
ep3util supports the following actions.
setup : Display an example JSON setup configuration file. If a configuration file already exists, the current configuration is displayed instead. No optional or required parameters. When displaying the JSON configuration a placeholder will be used for the token value.
get_all_ids : Returns a list of all repository record ids. The method uses OAI-PMH for id retrieval. It is rate limited and will take some time to return all record ids. A test instance took 11 minutes to retrieve 24000 record ids.
get_modified_ids START [END] : Return a list of records created or modified within the START to END date range. If END is not provided it is assumed to be today.
get_record RECORD_ID : Returns a specific simplified record indicated by RECORD_ID, e.g. 23808. The RECORD_ID is a required parameter.
harvest [HARVEST_OPTIONS] [KEY_LIST_JSON]
: harvest takes a JSON file containing a list of keys and harvests each record into a dataset collection. If combined with one of the options, e.g. -all, you can skip providing the KEY_LIST_JSON file.
-all : Harvest all records.
-modified START [END] : Harvest records modified between the START and END dates.
-as-citations : Harvest each record into a minimal citation form similar to citeproc.
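The two-step form of harvesting, where an id list is saved first and then passed as KEY_LIST_JSON, can be sketched as below. This assumes that the output of get_modified_ids is a JSON list that harvest accepts as its key list, which this manual does not state explicitly; the file name "modified_ids.json" is hypothetical.

```shell
# Hypothetical file name for the saved id list.
KEY_LIST="modified_ids.json"
START="2023-09-01"
END="2023-09-30"

if command -v ep3util >/dev/null 2>&1; then
    # Step 1: save the ids of records created or modified in the range.
    # Assumption: the output is JSON usable as KEY_LIST_JSON.
    ep3util get_modified_ids "$START" "$END" > "$KEY_LIST"
    # Step 2: harvest only those records into the dataset collection.
    ep3util harvest "$KEY_LIST"
else
    echo "ep3util not found on PATH; skipping harvest" >&2
fi
```

Saving the id list to a file makes it possible to inspect or trim the list before harvesting.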
Action parameters are the specific optional or required parameters needed to complete an action.
Set up ep3util by writing an example JSON configuration file, then edit the sample configuration to match your repository. "nano" is only an example text editor; any editor will do.
ep3util setup >eprinttools.json
nano eprinttools.json
Get a list of all EPrint record ids.
ep3util get_all_ids
Get a specific EPrint record. The record is validated against the irdmtools EPrints data model.
ep3util get_record 23808
Harvest all records
ep3util harvest -all
Harvest records created or modified in the month of September, 2023.
ep3util harvest -modified 2023-09-01 2023-09-30