"undefined" string in synonyms scientific name - 038C0022FFEDAA3669A9CBC826386A13 #387

Open
camiplata opened this issue Feb 10, 2025 · 6 comments
Labels
fix request A fix requested for a specific paper or treatment

Comments

@camiplata

CLB
https://tb.plazi.org/GgServer/summary/FFB5785AFFE7AA3A6921CA7D25136848

Image

I checked, and the issue comes from the Plazi DwC-A, so it may be an export problem, as I don't see it reflected on the Plazi website. I also could not find a reference to a synonym inside the treatments.

This dataset is also missing the authorship + year
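
A check like the one described above could be scripted roughly as follows. This is only a sketch: it assumes the archive's taxon file is tab-separated with a `scientificName` column, and the placeholder pattern (`undefined`) is taken from this issue rather than from any official list. The function name `find_placeholder_names` is made up for illustration.

```python
import csv
import io
import re

# Placeholder token seen in this issue; an assumption, not an official list.
PLACEHOLDER = re.compile(r"\bundefined\b", re.IGNORECASE)


def find_placeholder_names(tsv_text, column="scientificName"):
    """Return (line_number, name) pairs whose name contains the placeholder."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    # The header is line 1, so data rows start at line 2
    # (assuming no field spans multiple lines).
    for line_number, row in enumerate(reader, start=2):
        name = row.get(column) or ""
        if PLACEHOLDER.search(name):
            hits.append((line_number, name))
    return hits
```

Informal names such as `Pheidole sp. eg-111` are deliberately not flagged here; only the literal placeholder string is.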

@flsimoes flsimoes self-assigned this Feb 11, 2025
@flsimoes flsimoes added the fix request A fix requested for a specific paper or treatment label Feb 11, 2025
@flsimoes

The article has some citations of undefined species

Image

It seems that all the names you listed are such cases.

@camiplata
Author

I see, but even in that case it is not good practice to put "sp." or "undefined" in dwc:scientificName; you could use dwc:verbatimIdentification and/or dwc:verbatimTaxonRank instead.

@flsimoes

@myrmoteras and Guido, let me know your thoughts on the matter.

@flsimoes

flsimoes commented Feb 16, 2025

> I see, but even in that case it is not good practice to put "sp." or "undefined" in dwc:scientificName; you could use dwc:verbatimIdentification and/or dwc:verbatimTaxonRank instead.

We implemented the isUncertain = true attribute for such cases a while ago.
Unfortunately, this particular article was processed before we implemented it. I've just added the attribute.

@camiplata
Author

@mdoering, are we using the isUncertain = true attribute on our side? Would it be enough to filter out this kind of name?

@mdoering

isUncertain is probably a Plazi-internal treatment attribute.
Pheidole sp. eg-111 is a valid dwc:scientificName. We would not have a problem parsing it, and it would result in an "informal" name category. Please don't use any of the verbatim dwc terms such as verbatimIdentification. ChecklistBank doesn't use any of them, as we are not dealing with occurrences.

We can improve things on our side, but it wouldn't be too bad:
https://api.checklistbank.org/parser/name?q=Pheidole%20sp.%20eg-111

The problem is the eg prefix, which makes the parser believe it's a real epithet. Pure numbers work better:
https://api.checklistbank.org/parser/name?q=Pheidole%20sp.%20111

But that's something we can improve.
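
For anyone who wants to compare more names against the parser, the two requests above can be reproduced from a script. This is a minimal sketch: the endpoint and the `q` parameter come from the URLs in this comment, while the helper names `build_parser_url` and `parse_name` are made up for illustration, and nothing is assumed about the fields in the JSON response.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as it appears in the URLs quoted in this thread.
PARSER_ENDPOINT = "https://api.checklistbank.org/parser/name"


def build_parser_url(name):
    """Build the parser URL for a name, encoding spaces as %20."""
    query = urllib.parse.urlencode({"q": name}, quote_via=urllib.parse.quote)
    return f"{PARSER_ENDPOINT}?{query}"


def parse_name(name, timeout=10):
    """Fetch the parser result for a name and return the decoded JSON."""
    with urllib.request.urlopen(build_parser_url(name), timeout=timeout) as response:
        return json.load(response)
```

Calling `parse_name("Pheidole sp. eg-111")` and `parse_name("Pheidole sp. 111")` side by side should make the difference described above visible in the returned data.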

3 participants