Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for CISG-Online #3169

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

thebluepotato
Copy link

CISG-Online is (chiefly) a database of cases related to the UN Convention on United Nations Convention on Contracts for the International Sale of Goods. This translator allows Zotero to save cases found on there.

Instead of filling the abstract, info is now saved as formatted notes.
Now supports downloading multiple PDF attachments.
Parsing logic cleaned up by avoiding trailing array functions.
CISG-Online.js Outdated Show resolved Hide resolved
CISG-Online.js Outdated


function detectWeb(doc, url) {
// TODO: adjust the logic here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove TODOs

if (url.includes('caseId=')) {
return 'case';
}
else if (url.includes('search-for-cases')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always true due to target regex

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's better? A target regex that matches the entire website with a detectWeb that may return false, or a specific target regex where detectWeb always returns an itemType?

CISG-Online.js Outdated
function getSearchResults(doc, checkOnly) {
var items = {};
var found = false;
let selector = 'div.search-result div.col-md-10 a.search-result-link';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove div specifiers (not important) and .col-md-10

for (let block of blocks.values()) {
result += `<h2>${label}</h2><ul>`;
// TODO: check fragility
let rows = block.querySelectorAll("div.row > div > div.row");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does look pretty fragile! Anything else we can match on?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tragically no... Example:

<div class="is-content">
    <div class="row">
        <div class="col-xs-10">
            <div class="row">
                <div class="col-xs-12 col-lg-4">
                    <b>Name</b>
                </div>
                <div class="col-xs-12 col-lg-8">
                    Brands International Corporation
                </div>
            </div>

            <div class="row">
                <div class="col-xs-12 col-lg-4">
                    <b>Place of business</b>
                </div>
                <div class="col-xs-12 col-lg-8">
                    Canada
                </div>
            </div>

            <div class="row">
                <div class="col-xs-12 col-lg-4">
                    <b>Role in transaction</b>
                </div>
                <div class="col-xs-12 col-lg-8">
                    Seller
                </div>
            </div>
        </div>

        <div class="col-xs-2 text-right">
            <img src="/css/flags/blank.gif" class="flag flag-ca align-top" />
        </div>
    </div>
</div>

Let's just say the website does not really adhere to semantic web...

CISG-Online.js Outdated
result += `<h2>${label}</h2><ul>`;
// TODO: check fragility
let rows = block.querySelectorAll("div.row > div > div.row");
for (let row of rows.values()) result += `<li>${labellize(row)}</li>`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be no need for .values() (here and elsewhere)

CISG-Online.js Outdated
for (let row of rows.values()) result += `<li>${labellize(row)}</li>`;
result += "</ul>\n";
}
Z.debug(result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove debugs before merge

CISG-Online.js Outdated

let result = "<h2>Case History</h2><ul>";
for (let cisgCase of cases.values()) {
//result += `<li>${labellize(row)}</li>`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented-out lines

"url": "https://cisg-online.org/search-for-cases?caseId=14425",
"attachments": [
{
"title": "Full text of decision Original language: English",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full Text PDF (can match on common title patterns like "Full text of decision"). Language should go in language field on item.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This database provides both the original language PDF and translations (most of the time). The name of the PDF files comes from the database itself. How should the files be "canonically" named in Zotero when there are multiple?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants