Dendi Wijaya, Gede Primahadi Wijaya Rajeg, Engga Zakaria Sangian
This work is part of the AHRC-funded project on the lexical resources for Enggano, led by the Faculty of Linguistics, Philology and Phonetics at the University of Oxford, UK. Visit the central webpage of the Enggano project.
Enggano Flora and Fauna Lexicon by Dendi Wijaya, Gede Primahadi W. Rajeg, and Engga Zakaria Sangian is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.
If you use the data from this repository (Wijaya et al., 2024), please cite as follows:
Wijaya, D., Rajeg, G. P. W., & Sangian, E. Z. (2024). Enggano Flora and Fauna Lexicon (Version 1). University of Oxford. Dataset. https://doi.org/10.25446/oxford.28091270.v1
This repository holds the annotated databases for the Enggano Flora and Fauna Lexicon. The databases were originally stored as Google Spreadsheets for collaboration; the R code in this repository accesses and processes them using several R packages (Bryan, 2023; Cysouw, 2018; D’Agostino McGowan & Bryan, 2023; Moran & Cysouw, 2018; Ooms, 2023; Wickham et al., 2019; Wickham & Bryan, 2023).
The processing includes creating an orthography profile, tokenising (segmenting) the phonemic transcription, and, most importantly, linking the Enggano forms to their corresponding pictures by ID for use in the Contemporary Enggano Dictionary, which is processed using R here.
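The two processing steps described above can be illustrated with a minimal sketch in base R. It is not the repository's actual code: the repository relies on the qlcData package for tokenisation, whereas this sketch uses a hand-written greedy longest-match function and does not reproduce that package's interface. The function name `tokenise_form`, the graphemes in `profile`, the IDs, the forms, and the photo paths are all hypothetical and chosen only for illustration.

```r
# Hypothetical orthography profile: each element is one grapheme that should
# be kept together as a single segment. "a\u0303" is "a" plus a combining tilde.
profile <- c("a\u0303", "\u0294", "a", "e", "i", "o", "u",
             "k", "p", "h", "b", "d", "n", "m", "r")

# Segment one transcription by greedy longest match against the profile.
# Characters not covered by the profile are replaced with `missing`.
tokenise_form <- function(form, profile, sep = " ", missing = "?") {
  # Try longer graphemes first so multi-character units win over their parts
  graphemes <- profile[order(nchar(profile), decreasing = TRUE)]
  tokens <- character(0)
  pos <- 1
  n <- nchar(form)
  while (pos <= n) {
    matched <- FALSE
    for (g in graphemes) {
      len <- nchar(g)
      if (pos + len - 1 <= n && substr(form, pos, pos + len - 1) == g) {
        tokens <- c(tokens, g)
        pos <- pos + len
        matched <- TRUE
        break
      }
    }
    if (!matched) {
      tokens <- c(tokens, missing)
      pos <- pos + 1
    }
  }
  paste(tokens, collapse = sep)
}

# Hypothetical lexicon entries (IDs and transcriptions are made up)
forms <- data.frame(
  id  = c("ff_001", "ff_002", "ff_003"),
  ipa = c("kaha\u0303", "pako", "\u0294ubi"),
  stringsAsFactors = FALSE
)

# Add a segmented version of each transcription
forms$ipa_tokenised <- vapply(
  forms$ipa, tokenise_form, character(1),
  profile = profile, USE.NAMES = FALSE
)

# Hypothetical photo files, named after the ID of the form they depict
photo_files <- c("photos/ff_001.jpg", "photos/ff_003.png")

# Link each form to its photo by comparing the ID with the bare file name;
# forms without a matching photo get NA
photo_ids <- tools::file_path_sans_ext(basename(photo_files))
forms$photo <- photo_files[match(forms$id, photo_ids)]
```

Sorting the profile by grapheme length before matching is what keeps a base vowel and its combining diacritic together as one segment. Matching on the file name without its extension means the link does not depend on the image format.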
The databases are provided in four file formats: .rds (R data file), .csv, .tsv, and .xlsx.
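As a rough illustration of how one table can be written to these four formats, here is a short sketch. The data frame `lexicon`, its columns, and the output file names are hypothetical, and the files go to a temporary directory rather than to the repository's real paths. The .xlsx step is attempted only if the writexl package happens to be installed.

```r
# Hypothetical two-row table standing in for a lexicon database
lexicon <- data.frame(
  id   = c("ff_001", "ff_002"),
  form = c("kaha", "pako"),
  stringsAsFactors = FALSE
)

# Write everything to a fresh temporary directory
out_dir <- tempfile("lexicon_")
dir.create(out_dir)

# .rds keeps R-specific information such as column types
saveRDS(lexicon, file.path(out_dir, "lexicon.rds"))

# .csv and .tsv are plain-text formats readable outside R
write.csv(lexicon, file.path(out_dir, "lexicon.csv"),
          row.names = FALSE, fileEncoding = "UTF-8")
write.table(lexicon, file.path(out_dir, "lexicon.tsv"),
            sep = "\t", quote = FALSE, row.names = FALSE,
            fileEncoding = "UTF-8")

# .xlsx needs the writexl package; skip this step if it is unavailable
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(lexicon, file.path(out_dir, "lexicon.xlsx"))
}

# Reading the .rds file back returns the original data frame
restored <- readRDS(file.path(out_dir, "lexicon.rds"))
```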
Dendi Wijaya gathered the primary data in October 2023 and November 2024; transcribed the forms; translated them into Indonesian and English; provided the IPA transcription; and renamed the photos according to the ID of the forms.
Gede Primahadi W. Rajeg (GPWR) checked which items were already in the contemporary Enggano FLEx databases and which ones to exclude from the main dictionary databases (e.g., due to duplication), in consultation with Engga Zakaria Sangian. GPWR also manually annotated the main entry variable of the forms so that complex forms can be subsumed under, and linked to, their main/root entry in the dictionary; annotated the crossref. column; performed the segmentation of the IPA transcription (and fixed errors); linked the form IDs with the photos by filename; and managed this GitHub repository for archiving.
Engga Zakaria Sangian was consulted in a number of meetings to verify the orthography and the inclusion of the forms.
Bryan, J. (2023). googlesheets4: Access Google Sheets using the Sheets API V4 (Version 1.1.1) [Computer software]. https://CRAN.R-project.org/package=googlesheets4
Cysouw, M. (2018). qlcData: Processing data for quantitative language comparison (Version 0.2.1) [Computer software]. https://cran.r-project.org/web/packages/qlcData/index.html
D’Agostino McGowan, L., & Bryan, J. (2023). googledrive: An interface to Google Drive (Version 2.1.1) [Computer software]. https://CRAN.R-project.org/package=googledrive
Moran, S., & Cysouw, M. (2018). The Unicode cookbook for linguists: Managing writing systems using orthography profiles. Language Science Press. https://doi.org/10.5281/zenodo.1296780
Ooms, J. (2023). writexl: Export data frames to Excel 'xlsx' format (Version 1.4.2) [Computer software]. https://CRAN.R-project.org/package=writexl
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T., Miller, E., Bache, S., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H., & Bryan, J. (2023). readxl: Read Excel files (Version 1.4.3) [Computer software]. https://CRAN.R-project.org/package=readxl
Wijaya, D., Rajeg, G. P. W., & Sangian, E. Z. (2024). Enggano Flora and Fauna Lexicon (Version 1) [Dataset]. University of Oxford. https://doi.org/10.25446/oxford.28091270.v1