Linked Data Reconciliation Services Breakdown
Once we have a proper environment set up, we can clone and call a desired service when needed, then let OpenRefine know where to listen in on our local computer to use it.
I have a forked version of the FAST reconciliation script called fast-reconcile.
- Clone the repo.
- In your shell, with the refine3 environment active, cd into the cloned directory and type:
$ python reconcile.py
- The shell should report that the service is running and on which port, something like: Running on http://0.0.0.0:5000/
- Open OpenRefine, select the column you would like to reconcile, click on the arrow at the top, and choose Reconcile > Start Reconciling...
- Click on the Add Standard Service button in the bottom left corner.
- Now enter the URL that the local service is running on. It should be http://localhost:5000/reconcile
- You should now be greeted by a list of possible reconciliation types for the service. Choose your desired options and then click Start Reconciling.
- Whenever you are finished and wish to close the service down, hit Ctrl+C in the shell to stop it.
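To see what OpenRefine sends once the service is added, the request can be reproduced by hand. The following is a minimal sketch, assuming the local service follows the usual OpenRefine reconciliation API convention of a form field named queries holding a JSON object keyed q0, q1, and so on. The helper names build_queries, encode_queries, and reconcile are hypothetical and not part of fast-reconcile.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the URL entered in OpenRefine in the steps above.
ENDPOINT = "http://localhost:5000/reconcile"


def build_queries(names):
    """Map each name to a query keyed q0, q1, ... as in a batch request."""
    return {f"q{i}": {"query": name} for i, name in enumerate(names)}


def encode_queries(queries):
    """Form-encode the batch under a single 'queries' field holding JSON."""
    return urllib.parse.urlencode({"queries": json.dumps(queries)}).encode("utf-8")


def reconcile(names, endpoint=ENDPOINT):
    """POST the batch to the running service and return the parsed JSON reply.

    Assumption: the service accepts POST requests with a 'queries' field.
    """
    request = urllib.request.Request(endpoint, data=encode_queries(build_queries(names)))
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, calling reconcile(["Cats"]) should return a dictionary keyed by q0, with candidate matches under that key if the service follows the convention. Nothing is sent until reconcile is called, so the two helper functions can be tried without the service running.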
Note that this is the first-time setup. OpenRefine will save the service, but each time you want to use it you must first activate the conda environment and run the script.
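Because OpenRefine remembers the service even when the script is not running, it can help to confirm that something is listening before starting a reconciliation. This is a small sketch using only the standard library. The function name is_service_listening is hypothetical, and port 5000 is assumed from the example output above.

```python
import socket


def is_service_listening(host="localhost", port=5000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

A result of False suggests the conda environment has not been activated or reconcile.py has not been started yet. A result of True only shows that the port is open, not that the process behind it is the reconciliation service.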
We will use Christina Harlow's service, geonames-reconcile.
The instructions are the same as above for FAST, except for one crucial bit: the service relies on a GeoNames API username. So first:
- Go to the login page and register. After your account is activated, enable it for free web services.
- Once you have your GeoNames username, create an environment variable on your computer that holds it, like so:
  - Open your shell
  - Type in (replacing username with your username):
    $ export GEONAMES_USERNAME="username"
- Proceed as above for the FAST reconciliation.
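A variable set with export only exists in the shell session where it was typed, so it is worth checking that it is visible before launching the service from that same shell. The sketch below is a pre-flight check, not the lookup code used by geonames-reconcile itself. The function name get_geonames_username is hypothetical.

```python
import os


def get_geonames_username(env=None):
    """Return the GeoNames username from the environment, or raise if unset."""
    # Passing a plain dict as env makes the check easy to try without
    # touching the real environment.
    env = os.environ if env is None else env
    username = env.get("GEONAMES_USERNAME", "").strip()
    if not username:
        raise RuntimeError(
            "GEONAMES_USERNAME is not set; run the export command in this shell first"
        )
    return username
```

Calling get_geonames_username() with no arguments reads the real environment, so it reflects whatever was exported in the shell that started Python.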
We will use Christina Harlow's service, lc-reconcile.
The instructions are the same as above for FAST, except that the local URL to use when adding the Standard Service in OpenRefine is http://localhost:5000/.
Optionally, you could skip the local version and use the hosted version by entering the URL http://lc-reconcile.cmh2166.webfactional.com/ instead.
Although there used to be a Python-based VIAF reconciliation service, it has since moved to a much bigger Java-based framework called conciliator.
Conciliator has grown to provide reconciliation for far more than just VIAF, including ORCID and any Solr data source.
Since Java setup is beyond the scope of this repo, read the manual, and use the hosted version of conciliator if you don't wish to install it locally.
Wikidata reconciliation is quickly becoming a highly desired ability, and there is a Wikidata Hosted Reconciliation Service. To use it, choose to reconcile a column in OpenRefine, then add the API endpoint as a "Standard Service": https://tools.wmflabs.org/openrefine-wikidata/en/api.
Update: In the latest releases of OpenRefine, 2.7+, you can now reconcile to Wikidata right out of the box. Neat!