A Simple runner for RDF test cases & validations.
RDF Validator runs a collection of test cases against a SPARQL endpoint. The endpoint can be either a HTTP(s) SPARQL endpoint or a file or directory of RDF files on disk. Test cases can be specified as either a single SPARQL query file containing either
an ASK
or a SELECT
query, or a suite of such files with a suite manifest.
Main features:
- 👍 SPARQL
SELECT
orASK
queries as validations - 👌🏾 Package suites as git dependencies with a simple manifest format
- 🏃 Run 3rd party validations as dependencies via git or maven dependencies (thanks to the Clojure CLI tools)
- 🏃🏾 Run validations against SPARQL endpoints or files of RDF
- 🚴 Optionally dynamically generate queries with handlebars-like selmer templates
The quickest way to get started is to use the Swirrl's PMD RDF data validations project which builds upon this application.
This readme explains how to customise and develop your own validation suite.
The recommended way to use the RDF Validator is as a Clojure application (although you could compile a jar instead) which will allow you to include suites of validations from git that will be automatically fetched and installed on first usage, and cached thereafter.
You'll first need to install the Clojure command line tools.
Once clojure is installed you can create a new directory and add a deps.edn
file declaring a dependency on the swirrl/rdf-validator
application:
{:aliases
{:rdf-validator
{:extra-deps
{ swirrl/rdf-validator {:git/url "https://github.com/Swirrl/rdf-validator.git"
:sha "4a81411e713adbaad12b32bc800e8bef7175d93e"}}
:main-opts ["-m" "rdf-validator.core"]}}}
The clojure cli tool will fetch the application (so you won't need to git clone
this repository) when you run it with the above :rdf-validator
alias.
For example, to run a sparql test against a remote endpoint you can do:
$ clojure -M:rdf-validator --suite mytest.sparql --endpoint http://my/sparql/endpoint
You can also have the validator load-up an in-memory sparql endpoint from a RDF file:
$ clojure -M:rdf-validator --suite mytest.sparql --endpoint mycube.ttl
Or by recursing through a directory tree of RDF files:
$ clojure -M:rdf-validator --suite mytest.sparql --endpoint /path/to/rdf
You can see more examples in the docs on command-line usage.
Validations are written as SPARQL queries. We recommend that you write SELECT
queries that will identify and describe the causes of validation failures. The docs explain more about writing test cases.
You can pass your .sparql
files to the validator with a command-line option (here validating a file of RDF data):
$ clojure -M:rdf-validator --suite test1.sparql --endpoint data.ttl
To provide more structure you may want to collate your tests into suites.
To do this you can put the files into a directory ("src"
) with a manifest file rdf-validator-suite.edn
at the root:
myvalidator
├── deps.edn
└── src
├── myorg
│ └── mysuite
│ ├── test1.sparql
│ └── test2.sparql
└── rdf-validator-suite.edn
The manifest should specify the suite name and the relative paths to the SPARQL files to include:
{:suite-name ["myorg/mysuite/test1.sparql"
"myorg/mysuite/test2.sparql"]}
You can also use the manifest to add labels and descriptions or to modularise and re-use tests. See the docs on defining test suites for more.
You can pass this suite as a command-line option:
$ clojure -M:rdf-validator --suite src --endpoint data.ttl
Or record it in your deps.edn
file:
{:aliases ;; as above
:paths ["src"]}
This will mean your suite is included by default so you can omit that option when running the validator:
$ clojure -M:rdf-validator --endpoint data.ttl
You can include third-party validation suites (and indeed share your own for others to build upon) using Clojure's deps tool by adding them as dependencies to the deps.edn
file. Clojure deps supports various ways of fetching and resolving dependencies and putting them on the classpath, such as via git, maven packaged jars, or just dependencies at a :local/root
.
For example, we can include a specific version of the qb
suite from pmd-rdf-validations by extending your deps.edn
to add a :deps
key alongside the :aliases
:
{:aliases ;; as above
:deps
{ swirrl/validations.qb {:git/url "git@github.com:Swirrl/pmd-rdf-validations.git"
:sha "b8c6f8fcee9ed7e00f6b5aad4d691b441cd5428b"
:deps/manifest :deps
:deps/root "qb"}}}
The Swirrl/pmd-rdf-validations.git
repository contains multiple suites, each defined as their own dep within the same repo. The :deps/root
key essentially lets us point to a specific sub-directory, here for the "qb"
(data cube) validations.
Once these are specified we can run them against a repository containing data cubes, e.g.
$ clojure -M:rdf-validator --endpoint http://some.domain/sparql/query
Note that this command will first fetch the validation suite dependency, cache it locally for future use, and run all the validation suites we put on the classpath (here just the data cube validations).
Copyright © 2018 Swirrl IT Ltd.
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.