LIB410-Spring2023/wilkinsonsandbox

Healers Project: Interview Collection DMP

Roles & Responsibilities

The DMP should clearly articulate how sharing of primary data is to be implemented. It should outline the rights and obligations of all parties with respect to their roles and responsibilities in the management and retention of research data. It should also consider changes to roles and responsibilities that will occur if a project director or co-project director leaves the institution or project. Any costs stemming from the management of data should be explained in the budget notes.

The Healers interview collection consists of audio clips of interviews, as well as accompanying transcripts and translations, conducted through years of fieldwork throughout the Caribbean and the Pacific Northwest. These interviews broadly address the question: What does healing look like in the context of Afro-indigenous traditions? The interview collection will be shared with the general public using the University of Oregon’s cultural heritage digital library, Oregon Digital. It will also be deposited into the Digital Library of the Caribbean at the University of Florida. We are taking a multi-institutional approach and placing the collection in both digital repositories because we believe in LOCKSS (Lots of Copies Keep Stuff Safe). However, the authoritative copies of the interviews will be held by the University of Oregon, and the Digital Library of the Caribbean will aggregate them.

  • Alai Reyes-Santos, Healers Project Principal Investigator, is the director and producer of the interview collection. She is responsible for interview collection development and for complying with data management guidelines set by the Healers Project data managers. Because Alai captures the interviews, she has the authority to determine what is and is not included in the interview collection through the UO Libraries and the Digital Library of the Caribbean. It is important to note that all interviews in this collection are clips, not full interviews. This is because full explanations of the healers' customs and practices are restricted to their communities. Alai is responsible for ensuring that any interview clips made public have been approved for public sharing before archiving with UO and dLOC.

  • Rachael Lee, Digital Humanities Research Assistant GE, is a Healers Project data manager. She is responsible for tracking all incoming interviews, communicating about them with the other Healers Project data managers, and organizing them within a content inventory before the interviews are treated for archival access and preservation. Rachael's time on the project is grant funded and limited to the grant's duration. Her responsibilities will be passed on to a new GE or to Kate Thornhill.

  • Elizabeth Peterson, UO Libraries, is a Healers Project data manager. She is responsible for managing the interview collection's metadata, setting up and troubleshooting streaming media platforms, converting interview file formats, and aiding with other data curation activities.

  • Kate Thornhill, UO Libraries, oversees all data management work for the Healers Project and the data management team. If any data managers leave the project, she becomes responsible for their contributions and work on the interview project. She supports GEs and interns brought onto the project by training the data management team and by helping them understand the requirements and specifications that data must meet to be formatted and interoperable for platform upload and access. She is also the administrative contact for any transcription and translation needs handled through vendor services.

Expected Data

The DMP should describe the types of data, samples, physical collections, software, curriculum materials, or other materials to be produced during the project. It should then describe the expected types of data to be retained. Project directors should address matters such as these in the DMP:

  • the types of data that their project might generate and eventually share with others, and under what conditions;
  • how data will be managed and maintained until shared with others;
  • factors that might impinge on their ability to manage data, for example, legal and ethical restrictions on access to non-aggregated data;
  • the lowest level of aggregated data that project directors might share with others in the scholarly or scientific community, given that community's norms on data;
  • the mechanism for sharing data and/or making it accessible to others; and
  • other types of information that should be maintained and shared regarding data, for example, the way it was generated, analytical and procedural information, and the metadata.

Data types

There will be a minimum of 20 interviews in this digital collection. The interviews that will be published on the web include audio (.mp3) and transcription and translation documents (.pdf). Data quantity: As of spring 2023, there are 20 interviews. These interviews are clips from longer ones. The collection will grow as more interviews are added. For every interview in the collection, data managers must also count its transcription and translation. So, if there are 20 interview clips, then there are 20 transcriptions and 20 translations, which makes the collection contain 60 files. Here is a breakdown of collection size by work type, file format, total number of assets, average file size, and total size for each file type.

| Work Type | File Format | Total Assets | Average file size per file type | Total size of all files of this type |
|-----------|-------------|--------------|---------------------------------|--------------------------------------|
| audio     | mp3         | 16           | 3.46 MB                         | 55.4 MB                              |
| text      | srt         | 16           | 3.9 KB                          | 62.4 KB                              |
| text      | pdf         | 32           | 2.48 MB                         | 6.44 MB                              |
| text      | docx        | 32           | 43.38 KB                        | 694 KB                               |

In total, the collection’s data quantity is 62.59 MB as of March 2023.
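As a rough check on the figures above, the following sketch sums the per-type totals listed in the breakdown and applies the three-files-per-interview rule. The function names are hypothetical, and the conversion of 1000 KB per MB is an assumption; with it, the sum comes out close to the stated collection total.

```python
# Per-type totals as listed in the breakdown, with their units.
TOTALS = {
    "mp3": (55.4, "MB"),
    "srt": (62.4, "KB"),
    "pdf": (6.44, "MB"),
    "docx": (694.0, "KB"),
}

KB_PER_MB = 1000  # assumption: decimal units; the DMP does not say which convention is used


def total_size_mb(totals):
    """Sum the per-type totals, converting KB values to MB."""
    total = 0.0
    for size, unit in totals.values():
        total += size / KB_PER_MB if unit == "KB" else size
    return total


def expected_file_count(interview_clips):
    """Each clip is counted with one transcription and one translation."""
    return interview_clips * 3


if __name__ == "__main__":
    print(f"Collection size: about {total_size_mb(TOTALS):.2f} MB")
    print(f"Files for 20 clips: {expected_file_count(20)}")
```

Rerunning a check like this whenever interviews are added is one way to keep the stated totals in step with the inventory.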

Data Handling: Interviews will be captured by Alai and her research field team. They are following their research project’s human-subject research protocol to ensure data privacy and protections are in place before passing interviews on for processing and archiving. Interview clips will be made by people on Alai’s field team before being handed over to the data management team. Once interviews are ready for archival processing, the data managers will apply file naming standardization, convert files, add resource descriptions, create transcriptions and translations as web-accessible PDFs, and create closed caption files. All data shared with the data management team will be hosted and actively handled in a University of Oregon Dropbox Team folder. Once the interviews have been processed, they will be made ready for upload to Oregon Digital and dLOC.

Folder structure: There will be one folder inside Dropbox, and it will be used to keep all interview materials that have been shared with data managers. Each level indicates the folder's place in the hierarchy.

  1. Interview_collection
     a. Originals: The field team deposits interviews for processing in this folder.
     b. Processing_clips: This folder is for data managers to prepare clips for conversion, manipulation, and transcription and translation.
     c. Cataloging: This folder is for data managers to apply resource descriptions to interviews.
     d. Ready_share: Data managers should add objects and metadata that are ready to share through Oregon Digital and the Digital Library of the Caribbean.
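The folder layout above could be scaffolded locally with a short script. This is a minimal sketch, not the project's actual tooling: the function name and the idea of creating the folders on a local path (rather than through the Dropbox interface) are assumptions.

```python
from pathlib import Path

# Subfolder names taken from the folder structure described above.
SUBFOLDERS = ["Originals", "Processing_clips", "Cataloging", "Ready_share"]


def create_collection_folders(root):
    """Create Interview_collection and its four subfolders under root.

    Existing folders are left untouched, so the function is safe to rerun.
    """
    top = Path(root) / "Interview_collection"
    created = []
    for name in SUBFOLDERS:
        folder = top / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Hypothetical location; point this at a synced Dropbox path if desired.
    for folder in create_collection_folders("."):
        print(folder)
```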

File naming standard: Filenames should follow one of these formats: name_topic_serialnumber or name_topic_sourcetype_serialnumber. Examples: daniela_nature_001; daniela_nature_transcript_001
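The naming standard can be applied and checked programmatically. The sketch below is illustrative only: the helper names are hypothetical, and the three-digit zero-padded serial number and lowercase-letters-only segments are assumptions inferred from the two examples rather than rules stated in this DMP.

```python
import re

# Assumed pattern: lowercase name and topic, optional source type, 3-digit serial.
FILENAME_PATTERN = re.compile(
    r"^(?P<name>[a-z]+)_(?P<topic>[a-z]+)"
    r"(?:_(?P<sourcetype>[a-z]+))?"
    r"_(?P<serial>\d{3})$"
)


def build_filename(name, topic, serial, sourcetype=None):
    """Join the parts with underscores, e.g. daniela_nature_transcript_001."""
    parts = [name.lower(), topic.lower()]
    if sourcetype:
        parts.append(sourcetype.lower())
    parts.append(f"{serial:03d}")  # zero-pad the serial number to three digits
    return "_".join(parts)


def parse_filename(stem):
    """Return the filename parts as a dict, or None if the stem does not conform."""
    match = FILENAME_PATTERN.match(stem)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(build_filename("Daniela", "nature", 1))
    print(build_filename("Daniela", "nature", 1, sourcetype="transcript"))
    print(parse_filename("daniela_nature_transcript_001"))
```

A check like `parse_filename` could be run over the Ready_share folder before upload to flag files that do not match the standard.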

Resource Descriptions: See Appendix A

Legal and ethical restrictions on access to non-aggregated data: Non-aggregated data receive the same treatment as aggregated data. Kate Thornhill and Rachael Lee are the managers of the Dropbox folder where the Healers interview collection data are stored. If a collaborator leaves the Healers Project, Kate and Rachael will ensure that any permissions are reassigned (if applicable) and that the departing collaborator's access is revoked.

Aggregated and Shared data: All interviews and metadata will be available for download and reuse through the University of Oregon and the Digital Library of the Caribbean.

All Healers interview collection digital assets are original content created by Dr. Alai Reyes-Santos and community partners. Rights and reuse are governed by several frameworks in consultation with the healers themselves.

Traditional Knowledge Label: Each interview, transcript, and translated transcript has a Traditional Knowledge Label to indicate attribution, access, and use rights. The Traditional Knowledge Labels are provided by LocalContexts.org, which “supports Indigenous communities to manage their intellectual and cultural property, cultural heritage, environmental data and genetic resources within digital environments.” Traditional Knowledge Labels in use for the interview collection include the following:

  • TK Verified
  • TK Outreach

Rights: Each interview, transcript, and translated transcript is covered by the same rights, which were selected from the options in RightsStatements.org. This organization provides simple and standardized rights statements that “are designed to be used by cultural heritage institutions to communicate the copyright and re-use status of digital objects to their users. These statements provide a best practice for use by both international, national and regional aggregators of cultural heritage data, and the individual institutions and organisations that contribute data to them.” The rights statement in use for the interview collection is the following:

  • In Copyright -- Educational Use Permitted

Rights Holder: The healer who provides the interview is the rights holder.

Period of Data Retention

NEH is committed to timely and rapid data distribution. However, it recognizes that types of data can vary widely and that acceptable norms also vary by discipline. It is strongly committed, however, to the underlying principle of timely access. In their DMP applicants should address how timely access will be assured.

Interviews will be made available to the public within 12 months after grant funding ends.

Data Formats and Dissemination

The DMP should describe data formats, media, and dissemination approaches that will be used to make data and metadata available to others. Policies for public access and sharing should be described, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements. Research centers and major partnerships with industry or other user communities must also address how data are to be shared and managed with partners, center members, and other major stakeholders. Work types and their appropriate file formats will be compliant with these standards.

| Object Type | Work Type                   | File Format |
|-------------|-----------------------------|-------------|
| audio       | Interview recording         | .mp3        |
| text        | Transcription; translations | .pdf        |

Metadata will be shared through Oregon Digital and dLOC. Privacy and security will be considered and applied to the data based on a research protocol created by the field team. This includes not using healers' last names and not disclosing their locations in detail. The field team will also work with data managers to erase GPS data from recordings. Intellectual property information is described in the Expected Data section of this DMP.

Data Storage & Preservation of Access

The DMP should describe physical and cyber resources and facilities that will be used to effectively preserve and store research data. These can include third-party facilities and repositories. Data preservation will be ensured by Oregon Digital and dLOC.

Appendix A

Metadata Application Profile

Sound objects

Sound objects are items representing audio-based materials. In this collection, the sound objects are audio interviews. They can be digitally represented through the following file format: mp3.

collectionbuilder-gh

A project to generate a free and simple digital collection site using GitHub Pages given:

  • a CSV of collection metadata
  • a folder of JPEG images or PDF documents

Gather your digital objects together and create your metadata using the CollectionBuilder-GH Metadata Template. Then click the green "use this template" button above to create your repository, add your metadata and configure the repository to fit your collection and settings.
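To make the "CSV of collection metadata" concrete, here is a minimal sketch that writes a one-row metadata CSV with Python's standard csv module. The column names (objectid, title, format, filename) and the sample row are assumptions for illustration; check them against the CollectionBuilder-GH Metadata Template before use.

```python
import csv
import io

# Assumed column names; verify against the CollectionBuilder-GH Metadata Template.
FIELDNAMES = ["objectid", "title", "format", "filename"]

# Hypothetical sample record, not an actual item in any collection.
SAMPLE_ROWS = [
    {
        "objectid": "daniela_nature_transcript_001",
        "title": "Example transcript",
        "format": "application/pdf",
        "filename": "daniela_nature_transcript_001.pdf",
    },
]


def metadata_csv(rows):
    """Return the rows serialized as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(metadata_csv(SAMPLE_ROWS))
```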

See Getting Started Docs for detailed information.

View the demo site.

Note: Since collectionbuilder-gh uses GitHub Pages, it is only suitable for small collections, with lower resolution images. GitHub repositories are limited to 1GB.

Demo CollectionBuilder with our Workshop Tutorial

If you'd like to demo CollectionBuilder, we've made a step-through tutorial using the following spreadsheet and zipped directory. (The tutorial uses items from our Psychiana Digital Collection, which is worth a visit!)

Metadata is drawn from the following Google Sheet:

Objects are collected in this zip file:

These files are stored in this CollectionBuilder-gh Google Drive Folder, along with some other metadata sheets and zipped object directories that can be used for other workshops and demonstrations.

More on CollectionBuilder

collectionbuilder-gh is intended as a simple template for hands-on teaching about digital libraries. It can be used in a workshop setting to take participants through digitization and metadata creation, to having a live collection site hosted on GitHub.

collectionbuilder-gh aims to be well documented and easy to configure by following the example, with the potential to scaffold learning of a multitude of transferable digital and data skills. A project in "minimal computing", it provides a depth of learning opportunities, allowing users to take complete ownership over the project and make their work open to the world.

Learn about:

  • Git and GitHub basics
  • Markdown, plaintext writing and content creation
  • HTML, CSS, and JS literacy
  • commandline literacy
  • GitHub collaboration and project management
  • Jekyll basics
  • working in the Open, open source and open data
  • digital libraries concepts such as "collections as data", minimal computing, data-driven design

We prefer commonly understood formats (such as CSV spreadsheets over YAML), and convention over configuration (follow the example over learn all the options).

Features

Build a Digital Collection!

Check out the CollectionBuilder docs for how to get started, or visit the CollectionBuilder home for more information.

If you are interested in using CollectionBuilder, or are already using it, please drop us a line (libstatic.uidaho@gmail.com) since we would love to learn more about its use in the wild. There are also currently opportunities to collaborate on CollectionBuilder.

License

CollectionBuilder documentation and general web content is licensed Creative Commons Attribution-ShareAlike 4.0 International. This license does NOT include any objects or images used in digital collections, which may have individually applied licenses described by a "rights" field. CollectionBuilder code is licensed MIT. This license does not include external dependencies included in the assets/lib directory, which are covered by their individual licenses.