Refactor PMC XML processing to improve handling of <aff> elements and expand test coverage #677
base: master
Changes from 3 commits
9bcb65b
df5ccaa
16152ec
3b70df7
1b8a4a6
@@ -43,14 +43,29 @@ def xml_pmc_aff(xml_tree):
     """
     affs = xml_tree.findall(".//aff")
     for aff in affs:
-        aff_institution = aff.find("./institution[@content-type='original']").text
+        original_institution = aff.find("./institution[@content-type='original']")
+        if original_institution is not None:
+            aff_institution = original_institution.text
+        else:
+            aff_with_address = []
+            aff_with_address.append(aff.find("./institution[@content-type='orgname']").text)
+
+            addr_line = aff.find("./addr-line")
+            if addr_line is not None:
+                named_contents = addr_line.findall(".//named-content")
+                aff_with_address.extend([named_content.text for named_content in named_contents])
Review comment: @samuelveigarangel, some versions have city and state in place of named-content, so you need to take them into account to complete aff.
Reply: Ok.
+
+            country = aff.find("./country")
Review comment: @samuelveigarangel, you can add country to the xpath.
+            if country is not None:
+                aff_with_address.append(country.text)
+            aff_institution = ", ".join(aff_with_address)

         for institution in aff.findall(".//institution"):
             aff.remove(institution)

-        aff.remove(aff.find("./addr-line"))
-        aff.remove(aff.find("./country"))
+        for element in [aff.find("./addr-line"), aff.find("./country")]:
+            aff.remove(element)

         node_label = aff.find("./label")
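One point that may be worth checking in the hunk above: `aff.find()` returns `None` when `addr-line` or `country` is absent, and passing `None` to `aff.remove()` raises an exception. A minimal guarded sketch follows. It uses the standard library's `xml.etree.ElementTree` only to stay self-contained, and the helper name `remove_children` and the sample XML are hypothetical, not part of this PR.

```python
import xml.etree.ElementTree as ET


def remove_children(parent, paths):
    """Remove the first child matching each path, skipping paths with no match."""
    for path in paths:
        child = parent.find(path)
        if child is not None:
            parent.remove(child)


# Hypothetical <aff> without a <country> child.
aff = ET.fromstring(
    "<aff><label>1</label><addr-line><city>Curitiba</city></addr-line></aff>"
)

# An unguarded aff.remove(aff.find("./country")) would fail here,
# because find() returns None for the missing element.
remove_children(aff, ["./addr-line", "./country"])
remaining_tags = [child.tag for child in aff]
```

With the guard in place, a missing element is skipped and the remaining children are left untouched.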
@@ -154,3 +169,4 @@ def xml_pmc_ref(xml_tree):
     refs = xml_tree.findall(".//ref")
     for ref in refs:
         ref.remove(ref.find("./mixed-citation"))
+
Review comment: Instead of
named_contents = addr_line.findall(".//named-content")
use
named_contents = addr_line.xpath(".//named-content | .//state | .//city")
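The suggestion above relies on `.xpath()`, which lxml elements provide but the standard library's `xml.etree.ElementTree` does not. As a rough stdlib-only sketch of the same idea, the address parts could be collected in document order by filtering descendants by tag. The helper name `collect_address_parts` and the sample XML are hypothetical, not part of this PR.

```python
import xml.etree.ElementTree as ET

# Tags that may carry address parts inside <addr-line>, per the review comment.
ADDRESS_TAGS = {"named-content", "city", "state"}


def collect_address_parts(addr_line):
    """Return the text of named-content, city and state descendants, in document order."""
    if addr_line is None:
        return []
    return [
        el.text.strip()
        for el in addr_line.iter()
        if el.tag in ADDRESS_TAGS and el.text and el.text.strip()
    ]


# Hypothetical sample: one <aff> uses named-content, the other uses city/state.
SAMPLE = """<article>
  <aff id="aff1">
    <addr-line>
      <named-content content-type="city">Campinas</named-content>
      <named-content content-type="state">SP</named-content>
    </addr-line>
  </aff>
  <aff id="aff2">
    <addr-line><city>Curitiba</city><state>PR</state></addr-line>
  </aff>
</article>"""

tree = ET.fromstring(SAMPLE)
parts_by_aff = {
    aff.get("id"): collect_address_parts(aff.find("./addr-line"))
    for aff in tree.findall(".//aff")
}
```

Both markup variants yield the same kind of list, so the code that joins the parts into `aff_institution` would not need to know which variant the XML used. With lxml, the union expression from the review comment should achieve the same result in a single call.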