<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>README.html</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
</head>
<body>
<h1>ckanext-syndicate</h1>
<p>CKAN plugin to syndicate datasets to another CKAN instance.</p>
<p>This plugin provides a mechanism to syndicate datasets to another instance of
CKAN. If a dataset has the <code>syndicate</code> flag set to <code>True</code> in its custom
metadata, any updates to the dataset will be reflected in the syndicated
version. Resources in the syndicated dataset are stored as the URLs of the
resources in the original. You must have the API key of a user on the target
instance of CKAN. See the Config Settings section below.</p>
<p>Plugins can modify data sent for syndication or react to before/after
syndication events by implementing the <code>ISyndicate</code> interface and subscribing
to the corresponding signals. This is useful if the schemas differ between
CKAN instances.</p>
<h2>Requirements</h2>
<ul>
<li>Tested with the CKAN 2.9.x branch on Python 3.7+</li>
<li>To work over SSL, requires <code>pyOpenSSL</code></li>
</ul>
<h2>Installation</h2>
<p>To install ckanext-syndicate:</p>
<ol>
<li>
<p>Activate your CKAN virtual environment, for example:</p>
<pre><code>. /usr/lib/ckan/default/bin/activate
</code></pre>
</li>
<li>
<p>Install the ckanext-syndicate Python package into your virtual environment:</p>
<pre><code>git clone https://github.com/aptivate/ckanext-syndicate.git
pip install -e ckanext-syndicate
</code></pre>
</li>
<li>
<p>Add <code>syndicate</code> to the <code>ckan.plugins</code> setting in your CKAN config file.</p>
</li>
</ol>
<h2>Config Settings in the .ini file</h2>
<p>If you are syndicating to a single endpoint:</p>
<pre><code># The URL of the site to be syndicated to
ckan.syndicate.ckan_url = https://data.humdata.org/
# The API key of the user on the syndicated site
ckan.syndicate.api_key = 9efdd954-c643-444a-97a1-c9c374cef861
# The custom metadata flag used for syndication
# (optional, default: syndicate).
ckan.syndicate.flag = syndicate_to_hdx
# The custom metadata field to store the syndicated dataset ID
# on the original dataset
# (optional, default: syndicated_id)
ckan.syndicate.id = hdx_id
# A prefix to apply to the name of the syndicated dataset
# (optional, default: )
ckan.syndicate.name_prefix = my-prefix
# The name of the organization on the target CKAN to use when creating
# the syndicated datasets
# (optional, default: None)
ckan.syndicate.organization = my-org-name
# Try to preserve dataset's organization
# (optional, default: false)
ckan.syndicate.replicate_organization = boolean
# The username whose api_key is used.
# If the dataset already exists on the target CKAN instance, the dataset will be updated
# only if this option is set and its creator matches this user name
# (optional, default: None)
ckan.syndicate.author = some_user_name
</code></pre>
<p>If you are syndicating to multiple endpoints, specify multiple
values for each option, separated by either spaces or
newlines. The only additional option is the <code>ckan.syndicate.predicate</code> directive,
which names a predicate that decides whether a dataset needs to be
syndicated for the current profile. This option uses the
<code>import.path:function_name</code> format, and the predicate function is
called with the syndicated package object as its single argument. If the function
returns a falsy value, no syndication happens:</p>
<pre><code>ckan.syndicate.api_key = 4c38ad33-0d77-4213-a6da-b394f66146e7 c203782c-2c5e-410e-b47e-001818b9a674
ckan.syndicate.author =
    sergey
    sergey
ckan.syndicate.ckan_url = http://127.0.0.1:8000
                          http://127.0.0.1:7000
ckan.syndicate.replicate_organization = true false
ckan.syndicate.organization = default pdp
ckan.syndicate.predicate = __builtin__:bool ckanext.anzlic.helpers:is_pdp_dataset
ckan.syndicate.field_id = syndicate_seed_id syndicate_pdp_id
</code></pre>
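<p>The predicate contract described above can be illustrated with a minimal sketch.
The function name <code>is_public_dataset</code> and the module path
<code>my_extension.helpers</code> are hypothetical, and the sketch assumes the package
object exposes a boolean <code>private</code> attribute; adapt the check to whatever
your package objects actually provide.</p>

```python
from types import SimpleNamespace


def is_public_dataset(package):
    """Return a truthy value when the dataset should be syndicated."""
    # Assumption: the package object carries a boolean `private` attribute.
    # A missing attribute is treated as "not private".
    return not getattr(package, "private", False)


# Stand-in objects used only to exercise the predicate outside CKAN.
public_pkg = SimpleNamespace(name="rainfall-2020", private=False)
private_pkg = SimpleNamespace(name="draft-survey", private=True)

for pkg in (public_pkg, private_pkg):
    decision = "syndicate" if is_public_dataset(pkg) else "skip"
    print(pkg.name, decision)
```

<p>Under these assumptions, the profile would reference the function as
<code>ckan.syndicate.predicate = my_extension.helpers:is_public_dataset</code>.</p>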
<hr />
<h2>Config Settings in CKAN UI</h2>
<p>The admin page is available at <code>/syndicate-config</code>; only sysadmins are allowed to access it.
(The .ini file config will be used if no configs are set, or some are missing, in the UI.)</p>
<p>New features:</p>
<ul>
<li>Using the Syndicate CKAN UI, you can add multiple CKAN instances.</li>
<li>The UI provides a syndication logs page that shows all failed syndications. You can manually run syndication for each of these logs.</li>
</ul>
<hr />
<h2>API</h2>
<ul>
<li>
<p><code>syndicate_individual_dataset</code>: trigger syndication for an individual dataset. Example:</p>
<pre><code>curl -X POST &lt;CKAN_URL&gt; -H "Authorization: &lt;USER_API_KEY&gt;" -d '{"id": "&lt;DATASET_ID&gt;", "api_key": "&lt;REMOTE_INSTANCE_API_KEY&gt;"}'
</code></pre>
<p>Restrictions:</p>
<ul>
<li>The user must have <code>package_update</code> access.</li>
<li><code>&lt;REMOTE_INSTANCE_API_KEY&gt;</code> must be added as a syndication endpoint to the updated dataset.</li>
</ul>
</li>
<li>
<p><code>syndicate_datasets_by_endpoint</code>: trigger syndication for all datasets that have the specified endpoint among their <code>syndication_endpoints</code>. Example:</p>
<pre><code>curl -X POST &lt;CKAN_URL&gt; -H "Authorization: &lt;USER_API_KEY&gt;" -d '{"api_key": "&lt;REMOTE_INSTANCE_API_KEY&gt;"}'
</code></pre>
<p>Restrictions:</p>
<ul>
<li>The user must have <code>sysadmin</code> access.</li>
</ul>
</li>
</ul>
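<p>The same two calls can be sketched in Python using only the standard library.
This is an illustration under assumptions, not part of the plugin: the helper names
are hypothetical, the actions are assumed to be reachable under CKAN's usual
<code>/api/action/</code> path (the examples above show only a placeholder URL), and the body is sent
as JSON.</p>

```python
import json
import urllib.request


def build_syndicate_request(ckan_url, user_api_key, remote_api_key, dataset_id=None):
    """Build a POST request for one of the two syndication actions.

    With a dataset_id the request targets syndicate_individual_dataset,
    otherwise syndicate_datasets_by_endpoint.
    """
    payload = {"api_key": remote_api_key}
    if dataset_id is None:
        action = "syndicate_datasets_by_endpoint"
    else:
        action = "syndicate_individual_dataset"
        payload["id"] = dataset_id
    # Assumption: the action is exposed at the standard CKAN action API path.
    url = f"{ckan_url.rstrip('/')}/api/action/{action}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": user_api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_syndicate_action(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the request does not touch the network, so it can be inspected first.
example = build_syndicate_request(
    "http://ckan.example/", "user-key", "remote-key", dataset_id="my-dataset"
)
print(example.get_method(), example.full_url)
```

<p>Passing the prepared request to <code>call_syndicate_action</code> performs the actual call;
the host name and keys shown here are placeholders.</p>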
<hr />
<h2>CLI</h2>
<p>Mass or individual syndication can also be triggered from the command line:</p>
<pre><code>ckan syndicate sync [ID]
</code></pre>
<hr />
<h2>Running the Tests</h2>
<p>To run the tests:</p>
<pre><code>pytest --test-ini ckan.ini
</code></pre>
</body>
</html>