The repository contains data and scripts to analyze the ACM conference MobileHCI.
To reproduce the results, run the scripts in sequence, starting with the lowest number. Step_10_Get-Proceedings.ipynb can be skipped because the data is already included in the repository. This step can take multiple days because the raw data from the ACM DL has to be crawled without risking IP blocking.
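The run order can be derived from the file names. The following is a minimal sketch, assuming that every step notebook follows the Step_<number>_<name>.ipynb pattern of Step_10_Get-Proceedings.ipynb; the helper name ordered_steps and its skip parameter are hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from Step_10_Get-Proceedings.ipynb.
STEP_PATTERN = re.compile(r"^Step_(\d+)_.*\.ipynb$")


def ordered_steps(filenames, skip=("Step_10_Get-Proceedings.ipynb",)):
    """Return step notebooks sorted by their numeric prefix, minus skipped ones."""
    numbered = []
    for name in filenames:
        match = STEP_PATTERN.match(name)
        if match and name not in skip:
            # Sort on the integer so that Step_100 comes after Step_20.
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


if __name__ == "__main__":
    # List the notebooks in the current directory in the order to run them.
    names = [path.name for path in Path(".").glob("Step_*.ipynb")]
    for name in ordered_steps(names):
        print(name)
```

Pass skip=() to include the crawling step in the list.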
All country names follow ISO 3166, using the list from github.com/lukes/ISO-3166-Countries-with-Regional-Codes; regions follow UN M49. However, ACM does not always use the official names; thus, to map ACM names to the Alpha-3 country code, we use uitls/mapCountryNamesACM.csv to clean the data coming from ACM.
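As an illustration of this cleaning step, the sketch below maps ACM country names to Alpha-3 codes with a lookup table read from CSV. The column names acm_name and alpha3, the function names, and the sample rows are assumptions for the example; check the header of uitls/mapCountryNamesACM.csv and adjust them accordingly.

```python
import csv
import io


def load_country_map(handle, name_col="acm_name", code_col="alpha3"):
    """Read a name-to-code table from an open CSV file (column names are assumed)."""
    reader = csv.DictReader(handle)
    # Normalize keys so that lookups ignore case and surrounding whitespace.
    return {row[name_col].strip().casefold(): row[code_col].strip() for row in reader}


def to_alpha3(acm_name, mapping):
    """Return the Alpha-3 code for an ACM country name, or None if it is unmapped."""
    return mapping.get(acm_name.strip().casefold())


# Illustrative rows only; they are not copied from the repository file.
sample = io.StringIO("acm_name,alpha3\nUK,GBR\nSouth Korea,KOR\n")
country_map = load_country_map(sample)
print(to_alpha3("South Korea", country_map))
```

With the real file, the StringIO object would be replaced by open("uitls/mapCountryNamesACM.csv", newline="", encoding="utf-8"). Returning None for unknown names makes it easy to collect the entries that still need a mapping.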
Affiliation names are generally linked to their https://www.wikidata.org/ entry. Affiliation names typically match their English names; where they differ, the affiliation has a different official name. uitls/mapAffiliationDefault.csv provides the mapping between the affiliation, its Wikidata entry, and the default country.
For drawing the maps, we use the data from github.com/nvkelso/natural-earth-vector.
Since various affiliations have multiple names, one big cleaning task is to consolidate them. All current fixes are stored in uitls/mapAffiliation.csv. If a fix is missing, it can simply be added to that file.
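To find names that still need a fix, something like the following sketch could be used. It assumes that the fixes file has one column with the name variant and one with the consolidated name; the column names variant and canonical, the function names, and the sample rows are hypothetical.

```python
import csv
import io


def load_affiliation_fixes(handle, variant_col="variant", canonical_col="canonical"):
    """Read variant-to-canonical affiliation names from an open CSV file."""
    reader = csv.DictReader(handle)
    return {row[variant_col].strip(): row[canonical_col].strip() for row in reader}


def consolidate(names, fixes):
    """Apply known fixes and report names that have no entry yet."""
    canonical_names = set(fixes.values())
    resolved, missing = [], []
    for name in names:
        key = name.strip()
        if key in fixes:
            resolved.append(fixes[key])
            continue
        resolved.append(key)
        # Names that are already canonical need no fix; everything else is reported once.
        if key not in canonical_names and key not in missing:
            missing.append(key)
    return resolved, missing


# Illustrative rows only; they are not copied from the repository file.
fixes = load_affiliation_fixes(
    io.StringIO("variant,canonical\nMIT,Massachusetts Institute of Technology\n")
)
resolved, missing = consolidate(["MIT", "Unknown Lab"], fixes)
print(missing)
```

Each name in missing is a candidate for a new row in uitls/mapAffiliation.csv.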
The folder data holds general information about the conferences, e.g., the locations and acceptance rates.