Releases: wcmc-its/ReCiter

ReCiter 2.1.5

04 Apr 02:57
dd525c3

Added ORCID iD to the ReCiter Identity Model (wcmc-its/ReCiter-Identity-Model#7)
Fixed issue #527

ReCiter 2.1.4

01 Sep 11:22
6630751

Outputs the "Equal Contribution" attribute (equalContrib) at the author level. When set to "yes", this attribute indicates that the authors carrying that designation should share credit. We intend to use it to identify co-senior and co-first authors in publication reporting.
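
As a rough illustration of how a reporting tool might consume this attribute, the sketch below collects the authors flagged with equalContrib = "yes". The Author class, its fields, and the method name are hypothetical simplifications, not ReCiter's actual article model.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified view of an author entry; not ReCiter's real model.
final class EqualContribSketch {

    static final class Author {
        final String name;
        final int rank;            // position in the author list, 1 = first
        final String equalContrib; // "yes" when the author shares credit, otherwise null

        Author(String name, int rank, String equalContrib) {
            this.name = name;
            this.rank = rank;
            this.equalContrib = equalContrib;
        }
    }

    // Returns the names of authors who share credit, in author-list order.
    static List<String> sharedCreditAuthors(List<Author> authors) {
        return authors.stream()
                .filter(a -> "yes".equalsIgnoreCase(a.equalContrib))
                .sorted(Comparator.comparingInt((Author a) -> a.rank))
                .map(a -> a.name)
                .collect(Collectors.toList());
    }
}
```

A report could then treat the returned authors as co-first or co-senior depending on where they sit in the author list.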

ReCiter 2.1.3

06 Apr 19:23
133c502

ReCiter 2.1.2

15 Dec 20:43
02f0476
  • #485 Fix log4j vulnerability
  • #486 Fix squiggly filters
  • #484 Bug fixes for feature generator by group. The feature generator by group API now accepts a list of unique IDs (uids) as a parameter. When this parameter is supplied, all other filtering parameters are ignored. A new property in application.properties sets the maximum number of uids allowed, so that the performance of the API is not impacted.
  • Suppress antlr runtime warnings
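
The uid limit described in #484 could be enforced along these lines. This is a minimal sketch, not the project's code: the class and method names are hypothetical, and the cap is passed in directly because the notes do not name the property that holds it.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; the real property name and validation logic are not given in the notes.
final class UidLimitSketch {

    // De-duplicates the requested uids and rejects requests above the configured cap.
    static Set<String> validateUids(List<String> uids, int maxAllowed) {
        Set<String> unique = new LinkedHashSet<>(uids); // keeps request order, drops repeats
        if (unique.size() > maxAllowed) {
            throw new IllegalArgumentException(
                    "Requested " + unique.size() + " uids; the limit is " + maxAllowed);
        }
        return unique;
    }
}
```

Rejecting oversized requests up front keeps one large group query from degrading the API for other callers.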

ReCiter 2.1.1

23 Aug 13:57
1e1c871

This release includes a number of bug fixes and enhancements, especially improvements to the nameScoring strategy.

  • #474 Name scoring strategy bug fix for mismatched names
  • #473 Addition of more meshMajor Terms
  • #455 Capture lookup_type in esearchresults
  • #370 Fix nameScoring bugs
  • #322 Output email even if it's not a match
  • #454 Candidate article count is wrong
  • #444 Update Feature Generator API so it returns count of pending publications for a scholar

ReCiter 2.1.0

04 Feb 17:29
e2b5966
  • Esearchresults table now includes lookupType. This allows us to more reliably identify the count of candidate articles for the articleCountStrategy in cases where ONLY_NEWLY_ADDED_PUBLICATIONS is used. #455
  • For articleCountStrategy, candidate article count now relies on distinct count of all retrieved publications except those from the gold standard retrieval strategy. #454
  • Time-based lookups against PubMed were only looking for articles based on the date added to Entrez, which caused some publications to be missed. Lookups now match on either that date or the date added to PubMed. #450
  • Update Swagger from 2.0 → 3.0. #447
  • Update Java 8 → 11. #446
  • The JAVA_OPTS environment variable was added to the Docker image to specify the Java heap size. It is set in the Kubernetes deployment (https://github.com/wcmc-its/ReCiter/blob/a3d5d4665e8692853ca69f2db0caba0eb56f557d/kubernetes/k8-deployment.yaml#L81-L82) and in the Dockerfile (https://github.com/wcmc-its/ReCiter/blob/a3d5d4665e8692853ca69f2db0caba0eb56f557d/Dockerfile#L8).
  • Output the top keywords and their counts for accepted publications. This will be used in Publication Manager. #442
  • Output the count of publications where userAssertion = NULL as an attribute. This will be used in Publication Manager. #399
  • ReCiter Identity data model was updated to v2.0.8 wcmc-its/ReCiter-Identity-Model#3 to include primaryOrganizationalUnit, primaryInstitution, startDate, and endDate
  • ReCiter Article data model was updated to v2.0.16. This adds the ORCID identifier, affiliations, and emails for authors, as well as countOfPendingPubs and topArticleKeywords.
  • Fixed error running DynamoDb locally in Docker. #452
  • Added a health check path for the application: <protocol>://<host>:<port>/reciter/ping
  • Upgraded all dependencies to the latest stable releases
  • AWS Codebuild images were also updated to use Java 11 and latest release
  • Docker image was updated to use adoptopenjdk/openjdk11:alpine-jre for security
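
The candidate article count rule from #454 above (a distinct count across all retrieval strategies except the gold standard one) can be sketched as follows. The map layout, the class name, and the GOLD_STANDARD label are assumptions made for illustration; ReCiter's own data structures may differ.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the counting rule; not taken from the ReCiter code base.
final class CandidateCountSketch {

    // Assumed label for the gold standard retrieval strategy.
    static final String GOLD_STANDARD = "GoldStandardRetrievalStrategy";

    // Counts distinct PMIDs retrieved by every strategy except the gold standard one.
    static int candidateArticleCount(Map<String, List<Long>> pmidsByStrategy) {
        Set<Long> distinct = new HashSet<>();
        for (Map.Entry<String, List<Long>> entry : pmidsByStrategy.entrySet()) {
            if (!GOLD_STANDARD.equals(entry.getKey())) {
                distinct.addAll(entry.getValue()); // a PMID found by several strategies counts once
            }
        }
        return distinct.size();
    }
}
```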

ReCiter 2.0.0

21 Jul 19:24
0ec26df
  • Create a Multi-User Feature Generator API, which outputs pending articles for groups of scholars. This can be used in Publication Manager to quickly review pending publications for large groups of people. #330
  • Feature Generator API now outputs:
    • ORCID identifiers associated with authors #336
    • an identifier associated with each cluster #365
    • MeSH terms #402
  • More powerful use of the year when scholars received their degree. #391
  • Identity API returns list of scholars via S3-based cache, significantly improving performance of Publication Manager. #400
  • Support for Kubernetes, an open-source system for automating deployment, scaling, and management of containerized applications
  • Bug fix: Analysis objects were stored in both the DynamoDB Analysis table and S3, and should only be in S3 #392
  • Bug fix: incremental lookup
  • Updated timeout settings
  • Add performance metrics for S3 caching
  • Updated article and identity models in Maven Central

ReCiter 1.2

01 Sep 14:35
e4e339b
  • Evidence weights in application.properties are now optimized according to a support vector machine analysis
  • Created a userFeedback service for feedback from Publication Manager
  • Added an API controller in Swagger for ReCiter Publication Manager
  • Fixed a bug in common affiliation strategy
  • Bucket names in S3 are dynamically created
  • Fixed affiliation count of non-target authors. #361

ReCiter 1.1

12 Jun 20:51
f9e2c33

Release notes for ReCiter 1.1

  • Use name to infer gender of targetAuthor and identity. Downweight cases where there's a difference in inferred gender. #357
  • Tracks a person’s original name as recorded in a source system and outputs it in the feature generator as opposed to using the sanitized/standardized version of that name. #317
  • Tracks an organization’s original name as recorded in a source system and outputs it in the feature generator as opposed to using the standardized version and/or synonym of that name. #356
  • Single matching departmental affiliation, no matter the synonyms, should only count once. #326
  • Update articleCountScoringStrategy so it better accounts for retrieval counts in strict mode. This way, people with more common names get lower scores for articleCountScoringStrategy, even though their lookups are done in strict mode. #278
  • Penalize relationship scores for each non-match. This will address cases where there are a lot of co-authors and, just by sheer chance, some of them have a known relationship match. #341
  • Added ScienceMetrix journalDepartmentCategory scores. This covers the 250+ most common organizational affiliations in PubMed and their scores for all 180 subfields. #352
  • The number of organizational unit synonyms has been expanded. In many cases, it includes common translations, e.g., Cirugia (Surgery). This expands the coverage of journalDepartmentCategory scoring. #354
  • journalDepartmentCategory scoring should pick most favorable match. This is useful in cases where a person has multiple organizational affiliations, one of which scores highly. #355
  • Improved method for identifying the target author. It turns out that an author’s email is often not assigned to the person behind that email. #185
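
The "most favorable match" rule for journalDepartmentCategory scoring (#355 above) amounts to taking the maximum score across a person's organizational affiliations. The following is a minimal sketch under that reading; the class name, the map of scores, and the example values in the usage note are hypothetical.

```java
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical sketch; the actual scoring code and score values are not shown in the notes.
final class FavorableMatchSketch {

    // Returns the highest score among a person's organizational units, or empty if there are none.
    static OptionalDouble mostFavorableScore(Map<String, Double> scoreByOrgUnit) {
        return scoreByOrgUnit.values().stream()
                .mapToDouble(Double::doubleValue)
                .max();
    }
}
```

For a person affiliated with both Surgery (say, 0.2) and Pediatrics (say, 1.5) for a given journal subfield, this would select the Pediatrics score rather than averaging the two.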

ReCiter 1.0

02 Apr 21:22

ReCiter is a highly accurate system for guessing which publications in PubMed a given person has authored. ReCiter includes a Java application, a DynamoDB-hosted database, and a set of RESTful microservices which collectively allow institutions to maintain accurate and up-to-date author publication lists for thousands of people. This software is optimized for disambiguating authorship in PubMed and, optionally, Scopus.

ReCiter accurately identifies articles, including those at previous affiliations, by a given person. It does this by leveraging institutionally maintained identity data (e.g., departments, relationships, email addresses, and year of degree). With the more complete and efficient searches that result from combining these types of data, you can save time and your institution can be more productive. If you run ReCiter daily, you can ensure that the desired users are the first to learn when a new publication has appeared in PubMed.

ReCiter is fast. It uses an advanced multi-threading strategy known as a work stealing pool to make up to 10 retrieval requests at a time.
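
For readers unfamiliar with the technique, here is a minimal sketch of a work stealing pool with a parallelism level of 10, built with the standard java.util.concurrent API. It is not ReCiter's retrieval code: the class name and the fetch method are placeholders standing in for a real PubMed request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only; fetch() is a placeholder, not a real retrieval call.
final class WorkStealingSketch {

    // Runs one retrieval task per query on a work stealing pool targeting 10 parallel workers.
    static List<String> retrieveAll(List<String> queries) {
        ExecutorService pool = Executors.newWorkStealingPool(10);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String query : queries) {
                tasks.add(() -> fetch(query));
            }
            List<String> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Retrieval interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Retrieval failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Placeholder for a real retrieval request.
    static String fetch(String query) {
        return "result:" + query;
    }
}
```

In a work stealing pool, idle worker threads take queued tasks from busy ones, which helps when some requests take much longer than others.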

ReCiter is freely available and open source under the Apache 2.0 license.