Lectures.org

All of everything

;; Dispatch on the value of the ORG_LMS_SECTION keyword.
(pcase (org-lms-get-keyword "ORG_LMS_SECTION")
  ((pred (string= "lecture")) (message "lecture"))
  (_ (message "what?")))  ; catch-all for any other value

Test code export

<msDesc>
  <msIdentifier>
    <settlement>Oxford
    </settlement>
    <repository>Bodleian Library
    </repository>
    <idno>MS. Add. A. 61
    </idno>
  </msIdentifier>
  <msContents>
  <p>
      <quote>Hic incipit Bruitus Anglie,
      </quote> the
      <title>De origine et gestis  Regum Angliae
      </title> of Geoffrey of Monmouth (Galfridus Monumetensis): beg.
      <quote>Cum mecum multa &amp; de multis.
      </quote> In Latin.
    </p>
  </msContents>
  <physDesc>
    <p>
      <material>Parchment
      </material>: written in more than one hand: 7¼ x 5⅜ in., I + 55 leaves, in double columns: with a few coloured capitals.
    </p>
  </physDesc>
</msDesc>
(setq org-huveal-root "../../vendor/reveal")

Lecture test – dull

testing this

hello

1 Digital Humanities – An Introduction

Today

  • Intro
  • What is DH?
  • DH: A partial history
  • Course syllabus
  • A Little Game, a little thinking (offline)

Sign up for Slack!

I’d like to keep all chat in Slack outside of class. So please click on this link in your copy of the lecture notes.

Introductions

Me
Historian of Science, Emphasis on “Engaged Teaching”
You
Tell us about yourself

Visit the #introductions channel in Slack. Write us a little message:

  • What’s your name, what are your majors/minors?
  • why are you interested in this class? If the answer is “it’s a requirement” then tell us why you want to do DH
  • what skills are you hoping to develop here/elsewhere/in future?
  • tell us one interesting fact about yourself that you’re willing to share with everyone in the class (nothing truly private, please)
  • please also add a profile picture to your slack profile: ideally, this should be the same as or similar to your Quercus profile picture. It’s even better if the picture looks enough like you for me to identify you! I am very slow with names, so I need all the help I can get.

DH Minor

  • awareness of what the humanities are
  • exposure to established tools (but these change!)
  • thoughtful attention to form, content, and their relationship.
  • willingness to explore technical skills

The Humanities

(we’ll do more next time)

“The humanities—including the study of languages, literature, history, jurisprudence, philosophy, comparative religion, ethics, and the arts—are disciplines of memory and imagination, telling us where we have been and helping us envision where we are going.”

/The Heart of the Matter/ (Report of the American Academy of Arts & Sciences’ Commission on the Humanities and Social Sciences to the U.S. Congress in June 2013)

The Humanities

  • questions of knowledge:
  • do humanists care about facts?
  • historical role has been to create narratives that connect various parts of the human experience
  • goal was generally not accumulation of fact but understanding of relationships
  • we live in an age of facts. How should the humanities engage?

The Problem of Quantitative Reasoning

Humanists have traditionally disdained mathematical reasoning. Why?

  • it’s hard and they don’t want to try?
  • the “two cultures”? [cite:@Snowtwocultures1993a]
  • quantitative data distorts the questions they want to ask?

The Spectre of Reductionism

  • most empirical disciplines are atomistic and, at least much of the time, reductionist
  • humanists are largely interested in emergent properties and experiences
  • How can humanists make use of “data” without losing a “dialectical” relationship to their material (their texts, the pasts, their artifacts)?
  • and so, a dilemma! and one that will not go away.

DH: at the fluctuating intersections of the humanities with computing

./images/hum-cs-interface.svg

Origins of DH

./images/dh-timeline.png

Father Roberto Busa, S. J. (1913-2011)

Index Thomisticus (1950s – 1980s; 2005 online)

  • 11 million words of medieval Latin
  • 30+ years of editing and analysis
  • 8000+ hours of computer processing stacks of punch cards
  • 1500+ km of magnetic tape

@@html:</div><div class=”paired fragment appear”>@@ https://web.archive.org/web/20160910202350if_/http://blog.gale.com/wp-content/uploads/2016/05/busa1.jpg

Father Busa’s Female Punch Card Operators

From Father Busa’s archive. CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milan, Italy. In [cite:@TerrasAdaLovelaceDay2013]. Top left: Livia Canestraro.

https://melissaterras.files.wordpress.com/2013/10/728a0-0028.jpg

Father Busa and the Index Thomisticus

./images/index-thom-online.png

DH Today

./images/dh-flower.png

./images/dh-flower-plus.png

DH: Project Types

  • Digital editing & narratives: making texts and narratives available digitally
  • Data visualization: giving visual forms to data
  • Digital archives: digital (or digitized) collections of primary documents
  • Digital mapping: plotting historical or literary data onto a modern, historical, or imaginary map
  • Augmented/virtual reality: using computing to overlay virtual elements onto real landscapes (AR)
  • 3D printing: turning a digital model into a real object
  • Storytelling & performance: video games, coding as art practice

From the early Information Age to its Decadent Period

  • Busa: at the optimistic invention of “information” [cite:@Shannonmathematicaltheorycommunication1948a]
    • Cold War: hope; triumph; fear; reason and unreason; “human use of human beings” [cite:@wiener_human_1989]
    • Busa sought to use informatic tools to supplement traditional approaches to traditional questions
  • Compare with today
    • ubiquitous informatic devices; surveillance; disinformation; systems manipulation (markets; ballot measures; elections; mass opinions)
    • all work is now “digital”, and we take “supplementation” for granted
    • our task in DH has to be: in an era in which subtle thinking appears to be endangered, can we harness the tremendous computational power to continue to think complex thoughts, not against the humanistic tradition, but within it?

Your DH Minor, again

  • The digital part is important: you have to be willing to learn technical skills
  • But it’s not nearly as important as the humanist part! So you will have to always think hard about texts, about cultural artifacts, about what it means to be human

Course Syllabus

4 main blocks!

Assignments!

Let’s Play a Game!

2: More on Humanities; and Anatomy of DH Projects

Today

  • First, let’s think about Humanities
  • Some DH examples
  • Very Basic Design Parameters
    • Data and Data Models
    • How do we Think about Technology?
    • Users
  • In Class Activities

Humanities, or, Why Are we here?

  • What are the Humanities?
  • Why do we care about them?
  • What are the methods?

A Good DH Project is also a Good Humanities Project

  • Bring to this field your intuitions and interests from other Humanities Courses
  • If you don’t have those… time to acquire them!

Defining the Humanities (redux)

History of “The Humanities”

  • the objects of humanistic inquiry have been studied in diverse times and places, using many different methods (e.g., systematic study of language as an abstract human capacity begins sometime between 700 and 400 BCE near the Afghanistan-Pakistan border)
  • the term “Humanist” arises in 14th Century Italy (cf. Bod, pp 145 ff), as a contrast to “Divines” whose most exalted objects of study were theological.
  • Development of intellectual disciplines over the next several centuries slowly separated humanities from the natural and social sciences.
  • Formal argument in late c.19 about Geisteswissenschaften (“Sciences of the Spirit” == Humanities) and their distinction from the “Social” and “Natural” sciences
    • “Understanding” vs. “explaining”
    • This remains a live debate for us!

“Verstehen” and “Erklären”

“Verstehen” and “Erklären”

  • We study the works of human culture in an effort to come to grips, not with “what humans are”, but with what it means to be human
  • The question is: what are the best methods for answering that question?
  • In addition to “data” we need
    • synthetic understanding
    • empathy
    • creativity

Qualitative and Quantitative Analysis

  • Humanities disciplines are not all the same, but for the most part they have always employed some kind of qualitative analysis
  • Computation has tremendously powerful applications for quantitative analysis
  • So… what exactly does DH bring to the table?
    • treat text as data
    • compare large numbers of texts/objects
    • allow large-scale collaboration
    • provide a media infrastructure for disciplinary conversation.

Boundaries of Humanities (from the “Verstehen” perspective)

The Question Concerning DH

  • Do computers actually help us to answer Humanities questions?
  • Alternatively, do computers just make humanities questions obsolete?
  • You decide!

DH Design: Project Examples

Digital Editions, Archives, Narratives

Maps, Visualizations, Interpretations

Communication

Why “Anatomy of DH”?

DH pipelines

./images/dh-arrow.svg

Data

  • abstract representation of things or processes
  • choice of virtual entities, relationships, properties/aspects → “toy universe”
  • top row of your spreadsheet
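To make the "top row of your spreadsheet" idea concrete, here is a minimal JavaScript sketch. The variable names (`outbreaks`, `headerRow`) and the choice of two properties are assumptions made for illustration; the values are borrowed from the pandemic list later in these notes.

```javascript
// A toy data model: each object is one spreadsheet row, and the property
// names play the part of the top row (the column headings).
const outbreaks = [
  { disease: "Syphilis", period: "~1500-early c.20" },
  { disease: "Cholera", period: "early c.19-present" },
  { disease: "AIDS", period: "late c.20-present" },
];

// The "top row" is simply the list of properties we chose to record.
const headerRow = Object.keys(outbreaks[0]);
console.log(headerRow);

// Anything outside the chosen properties does not exist in this toy universe.
console.log("place" in outbreaks[0]); // false
```

The point of the sketch is the omission: because no `place` property was chosen, no question about geography can be asked of this data until the model changes.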

Metadata

Vocabulary and Ontology

Controlled Vocabulary

./images/black-death.png

Research Data: Life Cycle

Preservation-Ready Data

Data: accessible, usable, readable, preservable
  • Human-readable and human-editable
  • Separable from technical platforms
  • Described via metadata standards used in your discipline
  • Contextualized in clear documentation
  • Housed in non-proprietary, open source standards and technologies
  • Saved in, or reducible to, simple formats: .txt, TEI P5, .csv, JSON, .pdf, .jpg, .tiff
  • Embedded in your disciplinary community
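Two of the simple formats in the list, .csv and JSON, can be produced from the same records in a few lines. This is only a sketch under assumptions: `toCsv` is a hypothetical helper, the records are assumed to be flat (no nested values), and the sample record reuses details from the Busa slides.

```javascript
// Hypothetical exporter: write the same flat records as CSV text.
// Cells containing a comma, a double quote, or a newline are wrapped in
// double quotes, with inner quotes doubled (the usual CSV convention).
function toCsv(records) {
  const headers = Object.keys(records[0]);
  const escapeCell = (value) => {
    const text = String(value ?? "");
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const rows = [headers, ...records.map((r) => headers.map((h) => r[h]))];
  return rows.map((cells) => cells.map(escapeCell).join(",")).join("\n");
}

const records = [
  { title: "Index Thomisticus", creator: "Busa, Roberto", language: "Latin" },
];

console.log(toCsv(records));
console.log(JSON.stringify(records, null, 2)); // the same data as JSON
```

Both outputs are plain text that a person can read and edit without any particular platform, which is what "separable from technical platforms" asks for.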

Tools: Choices

Affordances

Users and Intended Audiences

Back to Our Examples

In class exercises

The Humanities (1)

Discuss these questions:

  1. What are “the Humanities”? What kinds of objects do humanists study, and why?
  2. In a Digital Age, are the humanities relevant anymore?

The Humanities (2)

Pick a random object from the picker.

  • what humanities questions could be asked about it?
  • what questions about it really aren’t humanities questions?

DH Project

Now look through the list. What humanities questions are they asking/answering? How successful does their work appear to be?

3: Deformance. Tokenization. TEI. HTML. CSS. JS. Your Assignment. Will that be enough for one day?

Today

  • Deformance
  • Tokenization
  • TEI
  • LONG Detour: HTML, CSS, JavaScript
  • Data Structures
  • Nazik Al-Malaika, Free Verse, and “Cholera”

Deformance

  • how did you approach this text?
  • what do you need to know in order to understand the arguments?
  • what arguments is it making?

History of Criticism

  • from Dante: the idea of reading against the grain of the work as an act of devotion:

    a poem has a determinate conceptual intelligibility, and while one may mistake it, or grasp it partially or inadequately, it nonetheless subsists, just as a transcendentally intelligible Word subsists behind or within all creation.

  • from Dickinson: the idea that a work of art has the possibility to mean many things:

    In this perspective, the critical and interpretive question is not “what does the poem mean?” but “how do we release or expose the poem’s possibilities of meaning?”

Deformance: expose new possibilities of meaning

  • “portmanteau” of deformation and performance
  • it creates new possibilities:

    Far more important is the stochastic process it entails. Reading Backward is a highly regulated method for disordering the senses of a text. It turns off the controls that organize the poetic system at some of its most general levels. When we run the deformative program through a particular work we cannot predict the results. As Dickinson elegantly puts it, “A Something overtakes the Mind,” and we are brought to a critical position in which we can imagine things about the text that we did not and perhaps could not otherwise know.

    There is one other important result. A deformative procedure puts the reader in a highly idiosyncratic relation to the work. This consequence could scarcely be avoided, since deformance sends both reader and work through the textual looking glass. On that other side customary rules are not completely short-circuited, but they are held in abeyance, to be chosen among (there are many systems of rules), to be followed or not as one decides. Deformative moves reinvestigate the terms in which critical commentary will be undertaken.

Some notes

  • this process is not mere randomness. It is informed by
    • prior knowledge of the poem and its context, and
    • familiarity with critical literature

Tokenization and algorithmic deformation

  • Note that deformation, as described, consists mostly in the application of some set of rules to the poem
  • in order to do this, we need some concept of the parts of the poem, their relationships, and the rules of transformation
  • formal division of a text into parts is called “tokenization”
    • find individual “tokens” of particular “types”
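The division into tokens can be sketched in JavaScript. This is a minimal sketch, not a definitive tokenizer: the function name `tokenizeLine` and the choice of just two token types (`word` and `punctuation`) are assumptions made for illustration.

```javascript
// Hypothetical helper: split one line of verse into typed tokens.
// Assumption: a "word" is a run of letters or digits, optionally joined by
// internal apostrophes or hyphens; any other non-space character counts as
// "punctuation".
function tokenizeLine(line) {
  const pattern = /[\p{L}\p{N}]+(?:['’-][\p{L}\p{N}]+)*|[^\s\p{L}\p{N}]/gu;
  const tokens = [];
  for (const match of line.matchAll(pattern)) {
    const text = match[0];
    const type = /^[\p{L}\p{N}]/u.test(text) ? "word" : "punctuation";
    tokens.push({ type, content: text });
  }
  return tokens;
}

const tokens = tokenizeLine("Twinkle, twinkle, little star,");
console.log(tokens.map((t) => `${t.type}:${t.content}`).join(" "));
```

Every decision in the pattern (is "don't" one token or three? does a hyphen split a word?) is already an interpretive choice about the poem.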

Tokenizing “Twinkle Twinkle”

Tokenizing “Twinkle Twinkle”

Exercise 1: Tokenize one stanza (or at least 4 lines) of your poem

Encoding Text – TEI

Encodings

  • HTML is a text encoding system
  • but of course there are many others, often with different purposes!
  • HTML is a web standard but there are other standards
  • TEI is an influential standard designed by and for humanists for encoding textual complexity

TEI is XML

  • the eXtensible Markup Language looks a bit like HTML but is much more complex
  • remains a key technology in many domains (e.g., your Word docs and PowerPoint slides are written in XML)
  • built with similar syntax but much, much more powerful
  • also way harder to read
<book category="children">
  <title>Harry Potter</title>
  <author>J. K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

(trivial example)

Real TEI

Using TEI

  • we won’t!
  • but you might want to if:
    • you want to record textual complexities in a rigorous format
    • you hope to collaborate with others/make your work useful to others
    • you want to join a scholarly community around precise textual analysis
  • our work in the class is thematically similar even if we don’t use the same technology

Understanding The Web – HTML, CSS, JS

HTML at work, and some consequences

Three Levels

  • HTML for structure/content;
  • CSS for presentation
  • JS for dynamic changes and rendering remote data

Tag Anatomy

./images/tagname.png

  • opening and closing tags around content
  • “attributes” can affect how the browser displays and interprets the tag
  • the only attribute that really matters for us in this exercise is class="list of classnames"
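The parts of a tag can be seen by assembling one as a string. A minimal sketch: `makeTag` is a hypothetical helper written for this illustration, and it does no escaping of special characters, so it is not suitable for untrusted content.

```javascript
// Hypothetical helper: assemble an HTML element as a string to show the parts
// of a tag: opening tag, class attribute, content, closing tag.
function makeTag(tagName, classNames, content) {
  const classAttr =
    classNames.length > 0 ? ` class="${classNames.join(" ")}"` : "";
  return `<${tagName}${classAttr}>${content}</${tagName}>`;
}

console.log(makeTag("span", ["word", "death"], "corpse"));
// <span class="word death">corpse</span>
```

Note that the class attribute holds a space-separated list, so one element can carry several class names at once.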

CSS Basics

http://bollig.co/assets/page-images/understanding-the-cascade-part-1/css-cascade.jpg

CSS Anatomy

With and without CSS

JavaScript

Learning More

The Mozilla Developer Network (MDN) is the best place to learn about web technologies, though there are lots of other resources too!

For the assignment, you may want to learn about some specific CSS rules. Every current and proposed CSS rule is listed here with links to a very full explanation of how to use it.

Javascript Data Structures

Arrays

let myArray = ["some",
               "collection",
               "of", 
               "things",
               4,
               7,
               3458,
               ["a", "sub", "array"],
               "pretty",
               "random"
               ]
  • an array is a list of elements separated by commas, and delimited by []
  • note 3 kinds of elements in this particular array:
    • string: collection of characters delimited by ""
    • number: numerals only, no quotes
    • array: one of the array elements is itself an array!
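A short sketch of how you get things back out of an array. The names `sample` and `kindOf` are made up for this example.

```javascript
const sample = ["some", "collection", 4, ["a", "sub", "array"]];

// Positions are counted from 0, so the first element is sample[0].
console.log(sample[0]);     // "some"
console.log(sample.length); // 4
console.log(sample[3][1]);  // "sub" (second element of the nested array)

// Classify each element by the three kinds listed above.
function kindOf(element) {
  if (Array.isArray(element)) return "array";
  return typeof element; // "string" or "number" for the elements used here
}

const kinds = sample.map(kindOf);
console.log(kinds);
```

`Array.isArray` is needed because `typeof` reports arrays as "object", which would hide the distinction we care about.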

Objects

{
  title: "I am a collection of key-value pairs",
  separation: "always separated by commas",
  possibleValues: ["value", "can", "be", "any", "javascript", "data structure"],
  evenOtherObjects: {
    subvalue1: "Even other objects",
    subvalue2: 234098,
    subvalue3: "This is getting silly"
  },
  "propertyNamesCanBeQuoted": "Not really a good idea but we do it anyway"
}
  • key-value pairs
  • separated by commas (remember: CSS rule separator is semi-colon, JS object property separator is comma)
  • value can itself be a complex data structure
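A short sketch of reading and changing object properties. The object below is invented for illustration, and the nested `meta` object is an assumption used only to show nesting.

```javascript
const token = {
  type: "word",
  content: "eulogy",
  class: "death",
  meta: { note: "How do we think of a eulogy?", checked: false },
};

console.log(token.content);   // dot notation: "eulogy"
console.log(token["class"]);  // bracket notation: "death"
console.log(token.meta.note); // a value inside a nested object

// Properties can be changed or added after the object is created.
token.class = "death highlight";
token.language = "en";

const keys = Object.keys(token);
console.log(keys);
```

Bracket notation is the one to reach for when the property name is stored in a variable or contains characters that dot notation cannot handle.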

In our Assignment

{
    "type": "stanza",
    "class": "",
    "content": [
      {
        "type": "line",
        "class": "",
        "content": [
          {"type": "word", "content": "Everywhere", "class": "", "meta": ""},
          {"type": "word", "content": "lies", "class": "", "meta": ""},
          {"type": "word", "content": "a", "class": "", "meta": ""},
          {"type": "word", "content": "corpse", "class": "", "meta": ""},
          {"type": "punctuation", "content": ",", "class": "", "meta": ""},
          {"type": "word", "content": "mourned", "class": "", "meta": ""}
        ]
      },
      {
        "type": "line",
        "class": "",
        "content": [
          {"type": "word", "content": "without", "class": "", "meta": ""},
          {"type": "word", "content": "a", "class": "", "meta": ""},
          {"type": "word", "content": "eulogy", "class": "", "meta": "How do we think of a eulogy?"},
          {"type": "word", "content": "or", "class": "", "meta": ""},
          {"type": "word", "content": "a", "class": "", "meta": ""},
          {"type": "word", "content": "moment", "class": "", "meta": ""},
          {"type": "word", "content": "of", "class": "", "meta": ""},
          {"type": "word", "content": "silence", "class": "", "meta": ""},
          {"type": "punctuation", "content": ".", "class": "", "meta": ""}
        ]
      }
    ]
  }
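One way to see why the structure is nested like this is to turn it into HTML. The renderer below is a sketch under stated assumptions, not the assignment's actual code: `renderNode` is a hypothetical function, stanzas and lines are assumed to become `<div>` elements and everything else `<span>` elements, and both `type` and `class` are assumed to end up as CSS class names.

```javascript
// Hypothetical renderer: turn a nested {type, class, content} node into HTML.
// Words are separated by spaces; punctuation attaches to the preceding token.
function renderNode(node) {
  const classes = [node.type, node.class].filter(Boolean).join(" ");
  if (typeof node.content === "string") {
    return `<span class="${classes}">${node.content}</span>`;
  }
  const tag = node.type === "stanza" || node.type === "line" ? "div" : "span";
  let inner = "";
  node.content.forEach((child, index) => {
    if (index > 0 && child.type !== "punctuation") inner += " ";
    inner += renderNode(child); // recursion handles any depth of nesting
  });
  return `<${tag} class="${classes}">${inner}</${tag}>`;
}

const line = {
  type: "line",
  class: "",
  content: [
    { type: "word", content: "a", class: "", meta: "" },
    { type: "word", content: "corpse", class: "death", meta: "" },
    { type: "punctuation", content: ",", class: "", meta: "" },
  ],
};

console.log(renderNode(line));
```

Because each class name lands on its own element, a CSS rule such as `.death { ... }` can then pick out exactly the words you tagged.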

Your Task: JS part

  • add classes to relevant components
  • consider adding meta, if you want to explain something
  • also possible to add new elements!
{
    "type": "stanza",
    "class": "",
    "content": [
      {
        "type": "line",
        "class": "",
        "content": [
          {"type": "word", "content": "without", "class": "", "meta": ""},
          {"type": "word", "content": "a", "class": "", "meta": ""},
          {"type": "word", "content": "eulogy", "class": "death", "meta": ""},
          {"type": "word", "content": "or", "class": "", "meta": ""},
          {"type": "word", "content": "a", "class": "", "meta": ""},
          {"type": "phrase", "class": "",
           "meta": "To be without a moment of silence, amidst the grieving; the agony is awful",
           "content": [
             {"type": "word", "content": "moment", "class": "time", "meta": ""},
             {"type": "word", "content": "of", "class": "", "meta": ""},
             {"type": "word", "content": "silence", "class": "", "meta": ""}
           ]},
          {"type": "punctuation", "content": ".", "class": "", "meta": ""}
        ]
      }
    ]
  }
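Adding classes by hand works, but the same change can be sketched as a small function that walks the nested structure. `tagWords` is a hypothetical helper and the sample stanza is shortened for illustration.

```javascript
// Hypothetical helper: add a class to every word whose text is in `words`.
// It walks the nested structure recursively and returns how many nodes changed.
function tagWords(node, words, className) {
  let changed = 0;
  if (Array.isArray(node.content)) {
    for (const child of node.content) {
      changed += tagWords(child, words, className);
    }
  } else if (node.type === "word" && words.includes(node.content.toLowerCase())) {
    const existing = node.class ? node.class.split(" ") : [];
    if (!existing.includes(className)) {
      node.class = [...existing, className].join(" ");
      changed += 1;
    }
  }
  return changed;
}

const stanza = {
  type: "stanza",
  class: "",
  content: [
    {
      type: "line",
      class: "",
      content: [
        { type: "word", content: "without", class: "", meta: "" },
        { type: "word", content: "a", class: "", meta: "" },
        { type: "word", content: "eulogy", class: "", meta: "" },
      ],
    },
  ],
};

const changedCount = tagWords(stanza, ["eulogy", "corpse"], "death");
console.log(changedCount);                       // 1
console.log(stanza.content[0].content[2].class); // "death"
```

Deciding which words belong in the list is the interpretive part; the code only applies that decision consistently.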

Exercise 2

Some CSS Rules

Display and Visibility

  • display: tells the browser how to manage the selected element in the document flow:
    • block means “give this element its own line,”
    • inline means “use the line (or box) that the parent element has already defined”
    • none means “just don’t display this element at all” – it will be removed from the screen
  • visibility: when set to hidden, tells the browser to hide the element but still reserve the space that it would normally take up
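The difference between removing an element and hiding it is easier to see in a toy model. The sketch below is not how a browser works internally; `renderText` and the word objects are assumptions made to imitate the visible effect of `display: none` and `visibility: hidden` on a line of text.

```javascript
// Toy model: words with display "none" vanish entirely, while words with
// visibility "hidden" are replaced by blank space of the same width.
function renderText(words) {
  return words
    .filter((w) => w.display !== "none")
    .map((w) => (w.visibility === "hidden" ? " ".repeat(w.text.length) : w.text))
    .join(" ");
}

const words = [
  { text: "a" },
  { text: "moment", display: "none" },
  { text: "of", visibility: "hidden" },
  { text: "silence" },
];

const rendered = renderText(words);
console.log(JSON.stringify(rendered));
```

In the output, "moment" leaves no trace, while "of" leaves a gap exactly as wide as the word it replaced.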

color and background-color

  • color sets the text content color
  • background-color sets the background color
  • you can use the pre-defined named colors like white, black, magenta, fuchsia, etc
  • or you can use rgb values (red, green, blue values between 0 and 255): rgb(127,127,127) is grey, for instance
  • or you can use hex values: #888888 is also grey
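The rgb and hex notations describe the same three numbers, written in base 10 and base 16 respectively. A small sketch of the arithmetic; `rgbToHex` and `hexToRgb` are hypothetical helpers, and `hexToRgb` assumes a six-digit hex value.

```javascript
// Convert an rgb() triple to the equivalent hex notation.
function rgbToHex(red, green, blue) {
  const toHex = (value) => value.toString(16).padStart(2, "0");
  return `#${toHex(red)}${toHex(green)}${toHex(blue)}`;
}

// Convert a six-digit hex colour back to rgb() notation.
function hexToRgb(hex) {
  const digits = hex.replace("#", "");
  const channel = (start) => parseInt(digits.slice(start, start + 2), 16);
  return `rgb(${channel(0)},${channel(2)},${channel(4)})`;
}

console.log(rgbToHex(127, 127, 127)); // #7f7f7f
console.log(hexToRgb("#888888"));     // rgb(136,136,136)
```

So the two greys on this slide are close neighbours rather than the same colour: 127 is 7f in hex, while 88 in hex is 136.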

font-style and font-weight

  • font-style can be set to italic
  • font-weight can be set to bold or normal

Other Rules/values

  • think about what effect you want to achieve
  • then google around/ask in Slack about how to do it!

Nazik Al-Malaika, Free Verse, and “Cholera”

Nazik Al-Malaika

Al-Kolera (“Cholera”), 1947 – in context

  • September 1947: first reports of cholera in Egypt
  • Reports of the dead piling up in downtown Cairo broadcast by radio throughout Middle East
  • At age 24, Al-Malaika hears these reports in her hometown Baghdad
  • and writes the poem, her first in Free Verse
    • Free Verse: modernist movement in poetry, move away from formal requirements of meter, number of lines, rhyme, etc.
      • arises as a formal movement in 19th C France (“vers libre”)
      • becomes popular in the 1920s in English
      • very divergent from classical Arabic poetry, which has strict meter and rhyme requirements
      • “Al-Kolera” is perhaps the first free verse poem written in Arabic; heavily influenced by European Romantic & modernist poetry movements

The Poem

Deformance – What would it look like here?

  • an interpretative re-arrangement of the poem
  • highlighting? deleting? reversing? etc.

3.1: class activities

  • working with files and editors
  • editing css
  • seeing stuff happen
  • figuring out what to deform

ACTION 4: TEI 2 (<2020-07-09 Thu>)

CF the Romeo & Juliet Repo & associated announcements.

4: CSS day – Assignment, Deformance, etc

Modifying Al-Kolera

  • stanzas reversed
  • stanzas randomized
  • render as prose
  • tag “sound” words and remove them.
    • replace some words with antonyms (this is tricky: could do by hand, or add some tricky CSS to replace with the content of a data attribute)
  • a couple of other things too.
  • hide last 3 lines;
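A few of these deformations can be sketched in JavaScript. To keep the example short, the poem is assumed here to be a plain array of stanzas, each an array of line strings, rather than the nested objects used in the assignment; the placeholder lines and all function names are hypothetical.

```javascript
// Placeholder poem: three stanzas of two lines each.
const poem = [
  ["line 1a", "line 1b"],
  ["line 2a", "line 2b"],
  ["line 3a", "line 3b"],
];

// Reverse stanza order without changing the original array.
const reverseStanzas = (stanzas) => [...stanzas].reverse();

// Shuffle stanza order (Fisher-Yates); `random` can be replaced for testing.
function shuffleStanzas(stanzas, random = Math.random) {
  const result = [...stanzas];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Render as prose: drop the line and stanza breaks.
const asProse = (stanzas) => stanzas.flat().join(" ");

// Hide the last n lines of the poem, keeping the stanza grouping.
function hideLastLines(stanzas, n) {
  let remaining = stanzas.flat().length - n;
  return stanzas
    .map((lines) => {
      const kept = lines.slice(0, Math.max(0, remaining));
      remaining -= lines.length;
      return kept;
    })
    .filter((lines) => lines.length > 0);
}

console.log(reverseStanzas(poem)[0]);
console.log(asProse(poem));
console.log(hideLastLines(poem, 3));
```

Each function returns a new arrangement and leaves `poem` untouched, so several deformations can be compared side by side against the original.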

Your Poems

  • what things do you want to change? What is interesting about the poem? What do you want to emphasize?

ACTION Elizabethan Drama, Thomas Nashe, etc. Morality Plays. Interludes. Drama.

  • need to write this out so I can easily do a lecture!

5.1: Contagion, Pandemic History, and Literature

Today

  • discuss disease
  • thoughts about assignment
  • intro to Omeka (final assignment framework)

The Long Arc of Disease

  • in 2019, we lived in a relatively epidemic-free moment in human history
  • not anymore!
  • what can we learn from the broad sweep of human encounters with widespread infectious disease?

Some moments in Pandemic History

  • actual “plague”: 3 pandemics: “Justinian” (6th C CE), “Black Death” (1346-~1700), “Modern Plague” (mid-c.19-~1959)
  • Syphilis (~1500-early c.20)
  • Cholera: early c.19-present
  • AIDS: late c.20-present
  • Ebola: 2014-present

(many others)

Disease in and as culture

  • diseases have many cultural effects
  • also, we understand disease in culturally-specific ways
  • “dialectical” (or “feedback”) relationships
  • literature as reflection and generator of understanding

Breakout groups: how to do this

  • quickly assign someone as record keeper. This person will report back in real time in Slack, where everyone can see the conversation
  • meanwhile, I will float around and listen in – not to eavesdrop, but to be helpful

Omeka

Getting Started with the DH Omeka installation

Omeka!

Items and Collections

Exhibits


Omeka Building Blocks

Items
individual pieces of content
Metadata
information about an item – “Dublin Core” Standard
Collections
groups of items; each item is in exactly one collection
Exhibits
narratives woven around items.

Heroes and Villains: Silver Age Comics

Items

Metadata

Collections

Exhibits

Exhibits

./images/silver-age.png

Add an Item

Add an Item: Dublin Core Metadata

  • In the Item’s fields, enter the metadata: Title, Subject, Description, etc. Before you add items to your collection, you will have figured out how the Dublin Core metadata schema applies to your particular data: consistency across your collection is key.
  • For example, for medieval manuscripts, you can list authors of the texts under Author: but what of known scribes, who may have also intervened in the text? Are they Authors or Publishers or Contributors? Pick what makes sense and apply it consistently.
  • For details on Dublin Core: http://dublincore.org/documents/dces/
  • If you wish the Item to be visible on the public view of the site, check “Public” (under “Add Item”).
  • Click “Add Item” (green, right).

Add an item: Dublin Core Metadata

  • In this assignment, you are responsible for the following fields (and some may have to stay blank):
  • Title;
  • Description (a paragraph recording your description of the object, in your own words: 100-200 words)
  • Creator;
  • Source (can be e.g. manuscript or book or collection);
  • Publisher;
  • Date;
  • Rights (i.e. who owns copyright – language here will vary strongly depending on institution the items come from, and that is not an error on the students’ part);
  • Format (material, e.g. bronze, parchment, etc.);
  • Language;
  • Coverage (place where the object is from/was made).
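Before typing records into Omeka, it can help to draft them as plain data and check which fields are still blank. The sketch below is a hypothetical helper, not part of Omeka: the field list copies the fields named above, the word-count check reflects the 100-200 word guideline for Description, and the sample values come from the manuscript description near the top of these notes.

```javascript
// Fields the assignment asks for (some may legitimately stay blank).
const REQUIRED_FIELDS = [
  "Title", "Description", "Creator", "Source", "Publisher",
  "Date", "Rights", "Format", "Language", "Coverage",
];

// List the fields an item record leaves empty or missing.
function blankFields(item) {
  return REQUIRED_FIELDS.filter((field) => !String(item[field] || "").trim());
}

// Count words in the Description, to compare against the 100-200 guideline.
function descriptionWordCount(item) {
  return String(item.Description || "").split(/\s+/).filter(Boolean).length;
}

const item = {
  Title: "De origine et gestis Regum Angliae",
  Creator: "Geoffrey of Monmouth",
  Source: "Bodleian Library, MS. Add. A. 61",
  Format: "Parchment",
  Language: "Latin",
};

console.log(blankFields(item));
console.log(descriptionWordCount(item));
```

Running the same check over every draft record is a cheap way to enforce the consistency across a collection that the previous slide asks for.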

Add an Item: Success

Add an Image to the Item

Add an Image to the Item

Collections

Collections

Add an Exhibit: Plan

  • The Exhibit is a narrative pathway through the collection; or, if you will, a digital essay based on the items in your collection.
  • First write the prose for your exhibit and decide what items need to appear in it.
  • To look pretty in the Exhibit, the items need to have picture or video files attached.
  • Save your prose in a separate file.
  • Put your immortal prose somewhere safe. Omeka Exhibits cannot be exported automatically: and if you ever delete yours in error, there is no getting it back.

Add an Exhibit: Structure

Add an Exhibit: Creation

Add an Exhibit: Metadata

Add an Exhibit: Structure

  • The Exhibit consists of Pages.
  • Each Page is a section of your Exhibit.
  • To build your Exhibit, start adding Pages by clicking the Add Page green button.

Add an Exhibit: Pages

Add an Exhibit: Blocks

Add an Exhibit: Overall Design

  • In your finished Exhibit, each Page will be a Section of the Narrative, illustrated with Items from your digital collection.
  • You can use and reuse items from your collection in as many Exhibits as you wish.

Building an Omeka Site: Conclusions

  • Make a project plan with deadlines. But allow for disasters.
  • Plan your exhibits (possibly on paper) well ahead of time
  • Draft and save your Item descriptions and your Exhibit content in word processing software; save this draft;
  • Paste your content from Word into Omeka.

Class Exercise: An Omeka Project

Notes for <2021-05-25 Tue>: Plague Literature

Group Work 1: Masque of the Red Death

  • Published May 1842, by Edgar Allan Poe
  • First, read the story (20 mins)
  • Then, answer questions:
    • What is the story about?
    • If you were to choose this story as your “Plague Lit” artifact, what historical contexts would matter?
    • What literary traditions would you need to investigate?
    • What actually interests you about the story? Note: You don’t have to enjoy the story to be interested in how it operates, just like you don’t have to enjoy cancer to be an oncologist, or enjoy fascism to be a historian of the Nazi period.
      • Are there themes or stylistic elements to which you might want to call attention?
      • What quotes would you work with?
      • what kinds of items would you add to Omeka?

Group Work 2: “Meaning of contagion”

Here we have a rather more difficult text, and if you have not read cultural history of science before, then it may feel quite difficult/opaque. So let’s try to understand what the essay is trying to do without even really reading it.

  • Pelling is a social and cultural historian of medicine, with an interest in contagious disease esp. in the c.19.
  • Let’s skim the text by reading pp. 1-4 and 19 (15-18 & 33 in the original numbering).
    • what main points do you see?
    • Do you feel confused? Disoriented? How might you go about orienting yourself?

Digital Archives

Today

  • Knowledge, Danger, Digital
  • Omeka: Your Questions and Concerns

Making & Sustaining Digital Archives

./images/archive-defn.png

Toronto Public Library Digital Archive: Print Media, Digitized

./images/tpl-arc.png

Digital archives can originate with print media, like the Toronto Public Library Digital Archive.

Digital Archives: Egyptian Revolution (Born Digital)

./images/egypt-rev.png

Other archives are born digital.

https://archive-it.org/collections/2358

http://guides.library.cornell.edu/c.php?g=31688&p=200748

https://storify.com/acarvin/new-story-2

./images/born-dig-collections.png

Archives consist of records: digitized manuscripts, books, newspapers, legal documents, video footage, oral histories, sounds, tweets.

Digitizing Parchment & Paper

Internet Archive

./images/ia-screenshot.png

Materials

./images/boxes.jpg

Internet Archive: Materials for digitization.

Digitizing Manuscripts

Digitisation of a Dunhuang manuscript (Pictured: De Vere 480 camera. Wikipedia: International Dunhuang Project, 2006.)

./images/manuscript-scan.jpg

Digitizing Printed Books

Digitizing Books at the Fisher Library (Photo: Paul Armstrong, 2016)

./images/scanner.png

Digitizing Printed Books

./images/darkroom-scan.jpg

Digitizing Printed Books

./images/darkroom-person.png

Google Data Centre: Server Farm

Image: Google, Data Centre Gallery (https://www.google.com/about/datacenters/gallery)

./images/datacenter.png

“The Mirror of the World & Its Memory”

“Documentary heritage reflects the diversity of languages, peoples and cultures. It is the mirror of the world and its memory. But this memory is fragile. Every day, irreplaceable parts of this memory disappear for ever.”

– UNESCO Memory of the World Programme

British Library, Endangered Archives Programme

./images/endangered-archives.png

http://eap.bl.uk/database/map.a4d

Digital Imaging of Endangered Books

./images/hmml.png

./images/hmml-2.png

The Declaration of Independence (draft)

Declaration of Independence: Patriots? Residents?

Risks to Digital Archives

Domesday Book

Data Loss and Digital Archives: Storage, Loss, Preservation

Data Storage

  • Risky
    • Own machine: YIKES
    • Dropbox or Google Drive: better, but not perfect
  • Safer
    • Multiple backups off-site
    • GitHub (version control)
    • Institutional repository (i.e. the library) with technical safeguards against data degradation
    • LOCKSS: distributed network of institutional repositories
    • Dark archives: secret, inaccessible archives (disaster recovery)

Hardware and Software Platforms

  • Risky
    • Dedicated, unique software platforms made by commercial providers to fit your data beautifully.
  • Safer
    • Open-source, community-supported software

Preservation Approaches

  • Migration
    • Moving your data from an obsolete, less stable platform and format into a newer, more stable platform and format.
    • Great for simpler digital objects (text, images)
    • Example: DOE data.
  • Emulation
    • In a new, stable software platform, recreating the—now obsolete—original environment of a digital object
    • Good for complicated digital objects (software)
    • Example: old arcade video games.

Marginalized Communities: Archival Absence & Misrepresentation

Disasters

Fires

Freezers

Flim-Flam Men

./images/toxic-agenda.jpg

EDGI & The Archiving Movement

./images/g-a.png

Data Rescue

./images/dr.png

Website Monitoring

./images/wm-changes.png

Website Monitoring

./images/wm-flow.svg

Infrastructure

./images/edgi-gh.png

What’s the Real Nature of the problem?

  • single point of failure
  • who owns knowledge
  • who can/should we trust?

5: Endangered Knowledge 2: Fisher Library (<2020-07-14 Tue>)

6: Endangered Knowledge 3: Guest Lecture! (<2020-07-16 Thu>)

Prof B!

cool-looking talk!

file:///home/matt/wdw235/images/mattu-data-driven-society-oct-30-2019.jpg

“Endangerment” and Censorship

Understanding my comments

| Word/symbol | meaning |
| | paragraph; by itself it means “new paragraph here” |
| | “good job; this is what you need to do to move your argument forward” It doesn’t mean “I agree” |
| awk/awkward | “This sentence or phrase is not clear, and that unclarity potentially affects the success of your argument” |
| grammar | grammatical error; should be obvious to you |
| agreement | subject & verb do not agree |
| 😄 | :-) |

  • overall: you guys did pretty well. congrats.
  • First Person: when to avoid
  • close reading: it’s hard

Assignment 2

  • marking is coming along, but won’t be done tomorrow
  • so far so good!

Proposal Details!

  • Due next Friday
  • Assignment says 100-200 words but you may need more
    • identify your book
    • explain its significance
    • Describe the ban or challenge
    • Briefly describe your plans for the project

Project Planning!

  • what main themes will you discuss?
  • what images do you need?
  • how do you imagine the layout (wireframe)?

More Omeka!

Data 1: Data Models for the Humanities

What is Data?

Data pass themselves off as mere descriptions of a priori conditions. Rendering observation (the act of creating a statistical, empirical, or subjective account or image) as if it were the same as the phenomena observed collapses the critical distance between the phenomenal world and its interpretation, undoing the basis of interpretation on which humanistic knowledge production is based. We know this. But we seem ready and eager to suspend critical judgment in a rush to visualization.

Data and Capta

To overturn the assumptions that structure conventions acquired from other domains requires that we re-examine the intellectual foundations of digital humanities, putting techniques of graphical display on a foundation that is humanistic at its base. This requires first and foremost that we reconceive all data as capta. Differences in the etymological roots of the terms data and capta make the distinction between constructivist and realist approaches clear. Capta is “taken” actively while data is assumed to be a “given” able to be recorded and observed. From this distinction, a world of differences arises. Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact.

Preserving Humanistic Inquiry

The polemic I set forth here outlines several basic principles on which to proceed differently by suggesting that what is needed is not a set of applications to display humanities “data” but a new approach that uses humanities principles to constitute capta and its display. At stake, as I have said before and in many contexts, is the authority of humanistic knowledge in a culture increasingly beset by quantitative approaches that operate on claims of certainty. Bureaucracies process human activity through statistical means and when the methods grounded in empirical sciences are put at the service of the social sciences or humanities in a crudely reductive manner, basic principles of critical thought are violated, or at the very least, put too far to the side. To intervene in this ideological system, humanists, and the values they embrace and enact, must counter with conceptual tools that demonstrate humanities principles in their operation, execution, and display.

Ambiguity as merely irreducible or constitutive

But an important distinction needs to be clear from the outset: the task of representing ambiguity and uncertainty has to be distinguished from a second task – that of using interpretations that arise in observer-codependence, characterized by ambiguity and uncertainty, as the basis on which a representation is constructed. This is the difference between putting many kinds of points on a map to show degrees of certainty by shades of color, degrees of crispness, transparency etc., and creating a map whose basic coordinate grid is constructed as an effect of these ambiguities. In the first instance, we have a standard map with a nuanced symbol set. In the second, we create a non-standard map that expresses the constructed-ness of space.

Compare Graphs

Spatiality

Drucker: Temporality

http://www.digitalhumanities.org/dhq/vol/5/1/000091/…000091/resources/images/figure07.jpg

What is data visualization?

= the presentation of data, information, knowledge, or insight in a pictorial or graphical format

pretend data 1

https://imgs.xkcd.com/comics/fuck_grapefruit.png

pretend data 2

http://thisisindexed.com/wp-content/uploads/2018/11/card6019.jpg

pretend data 3

http://thisisindexed.com/wp-content/uploads/2017/11/card5371.jpg

Lying with Dataviz

Ignoring conventions

Parik, “How to Lie with Data” https://i.kinja-img.com/gawker-media/image/upload/s–SKWrO6sh–/c_fit,f_auto,fl_progressive,q_80,w_636/uqs2i9txqkdyc5jkpfut.jpg

Cumulative Graphs

Parik, “How to Lie with Data”

Non-Zero baseline

https://i.kinja-img.com/gawker-media/image/upload/ksd0huhaczb6xsxhrszp.png
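A little arithmetic shows why a non-zero baseline misleads. The sketch below uses made-up values (50 and 52) and hypothetical function names: it compares how tall two bars are drawn when the axis starts at zero and when it starts just below the smaller value.

```python
def drawn_height(value: float, baseline: float) -> float:
    """Height of a bar when the axis starts at `baseline` instead of zero."""
    return value - baseline


def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """How many times taller bar `b` is drawn than bar `a`."""
    return drawn_height(b, baseline) / drawn_height(a, baseline)


# Hypothetical values: 52 is only 4% larger than 50.
print(apparent_ratio(50, 52))               # axis starts at 0  -> 1.04
print(apparent_ratio(50, 52, baseline=49))  # axis starts at 49 -> 3.0
```

The data have not changed, but on the truncated axis the second bar is drawn three times as tall as the first.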

DH-ing with Dataviz

  • Unstructured Data
    • A corpus of literary texts
  • Semi-structured Data
    • TEI-encoded text
  • Structured Data
    • Spreadsheet of catalogue entries
    • collection of geocoded points in a GIS system

Humanities Data: Unstructured

./images/mandeville-cover-page.png

Humanities Data: Semi-structured

<msDesc>
  <msIdentifier>
    <settlement>Oxford
    </settlement>
    <repository>Bodleian Library
    </repository>
    <idno>MS. Add. A. 61
    </idno>
  </msIdentifier>
  <msContents>
    <p>
      <quote>Hic incipit Bruitus Anglie,
      </quote> the
      <title>De origine et gestis  Regum Angliae
      </title> of Geoffrey of Monmouth (Galfridus Monumetensis): beg.
      <quote>Cum mecum multa &amp; de multis.
      </quote> In Latin.
    </p>
  </msContents>
  <physDesc>
    <p>
      <material>Parchment
      </material>: written in more than one hand: 7¼ x 5⅜ in., I + 55 leaves, in double columns: with a few coloured capitals.
    </p>
  </physDesc>
</msDesc>
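Because the markup labels each piece of the description, a short script can pull those pieces out into a structured record. This is only a sketch using Python's standard library on a shortened copy of the example above; it assumes the XML has no namespace, which real TEI files usually do have.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the manuscript description above.
MS_DESC = """
<msDesc>
  <msIdentifier>
    <settlement>Oxford
    </settlement>
    <repository>Bodleian Library
    </repository>
    <idno>MS. Add. A. 61
    </idno>
  </msIdentifier>
  <physDesc>
    <p><material>Parchment
    </material>: written in more than one hand.</p>
  </physDesc>
</msDesc>
"""


def text_of(root: ET.Element, path: str) -> str:
    """Return the whitespace-trimmed text of the first element matching `path`."""
    element = root.find(path)
    return (element.text or "").strip() if element is not None else ""


root = ET.fromstring(MS_DESC.strip())
record = {
    "settlement": text_of(root, "msIdentifier/settlement"),
    "repository": text_of(root, "msIdentifier/repository"),
    "idno": text_of(root, "msIdentifier/idno"),
    "material": text_of(root, ".//material"),
}
print(record)
```

The result is one row of structured data; running the same extraction over many manuscript descriptions would give you a spreadsheet-like table.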

Humanities Data: Structured

./images/dc-ss-list.png

Humanities Data:

  • What “counts” cannot necessarily be counted
  • Data representation = interpretation:
  • The process of modelling and collecting our data is an interpretive process that is shaped by our choices re. what aspects of the data we model; by our research question, argument, perspective, discipline, social context, institutional context, tools available etc.

Is Our Data “fungible”?

“When you call something data, you imply that it exists in discrete, fungible units; that it is computationally tractable; that its meaningful qualities can be enumerated in a finite list; that someone else performing the same operations on the same data will come up with the same results. This is not how humanists think of the material they work with.” (Miriam Posner, http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/)

Reminder: Data and Capta

“[DH visualization tools borrowed from the sciences] carry with them assumptions of knowledge as observer-independent and certain, rather than observer co-dependent and interpretative. […] To begin, the concept of data as a given has to be rethought through a humanistic lens and characterized as capta, taken and constructed.” Johanna Drucker, “Humanities Approaches to Graphical Display.”

Display as Argument: Visual Knowledge Creation

  • Data vs. Capta
  • Display as argument:

“Graphic artifacts present knowledge through the combination of symbolic codes and structured relations of these elements in a flat field. […T]he forms that are generally used for the presentation of information can be understood and read as culturally coded expressions of knowledge with their own epistemological assumptions and historical lineage” (Drucker, “Graphesis: Visual Knowledge Production and Representation,” 2011).

Drucker’s Graphesis

  • Johanna Drucker: graphesis = “the field of knowledge production embodied in visual expressions … a visual epistemology” (Drucker, “Graphesis” 2011)
  • Visual forms carry the assumptions and values of their fields of origin, and impose these assumptions and values on the data they present, whether these assumptions and values are appropriate to that data or not.
  • As humanists, we ask ourselves: What arguments, values, and perspectives do visualizations encode and embody? What kind of knowledge do they produce? What field’s assumptions do they draw from?

Data vs. Capta – an Illustration

  • Data: “given”, objective, observed
  • Quantitative approaches: from concordances to corpora, from measuring word frequencies and stylometric patterns to thematic discovery through topic modelling
    • Visual representations of quantities, trajectories, measurable relationships
    • Wordle, Gephi, Cytoscape; pie charts, bar charts, and bubble graphs
  • Qualitative approaches: visual and performative, enacting poetics, making subjectivity and interpretation visible
    • Maps and timelines of literary narratives; digital collections; interpretive visualizations

Data vs. Capta: Two Maps

https://carto.com/img/layout/gallery/bbva-geo-risk/big.6b6fed37.gif

Data vs. Capta: Two Maps

DH Dataviz: Some Less and More Creative Examples

Network Graph

  • Things: nodes (vertices)
  • Relationships: edges

./images/network-graph.png
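Behind the picture, a network is usually stored as a plain list of edges. The sketch below uses made-up names to show one common way to turn an edge list into a lookup of each node's neighbours and to count each node's connections (its degree).

```python
from collections import defaultdict

# Hypothetical edge list: each pair is one relationship between two things.
edges = [("Ana", "Ben"), ("Ben", "Caro"), ("Caro", "Ana"), ("Caro", "Dev")]

# Build an undirected adjacency map: node -> set of neighbours.
neighbours = defaultdict(set)
for left, right in edges:
    neighbours[left].add(right)
    neighbours[right].add(left)

# Degree = number of edges touching a node.
degree = {node: len(linked) for node, linked in neighbours.items()}
print(sorted(degree.items()))
```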

Les Miserables: Network Graph of Character Interactions

./images/les-mis-network.svg

Word Cloud

./images/rj-wordle.svg

  • Visualizes word frequencies in a text
  • The larger the word, the more often it appears
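The counting behind a word cloud is simple enough to sketch in a few lines. This is an illustration only, with a made-up function name; real tools usually also remove very common "stop words", and this version splits contractions such as "don't" into two words.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lower-cased runs of letters (accented letters included)."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)


sample = "It was the best of times, it was the worst of times."
frequencies = word_frequencies(sample)
for word, count in frequencies.most_common(3):
    print(word, count)
```

A word cloud then simply scales each word's font size by its count.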

Mapping Imaginary Spaces

./images/mandeville-map.jpg

Non-Linear Timelines

./images/knotted-line.png

Rewriting John Snow

“Who are those dots? Each individual had a profile, age, size, health, economic potential, family and social roles. […] But what if we take the rate of deaths, their frequency, and chart that on a temporal axis inflected by increasing panic. Then give a graphical expression to the shape of the terrain, that urban streetscape, as it is redrawn to express the emotional landscape. Then imagine drawing this same streetscape from the point of view of a mother of six young children, a recent widow, a small child, or an elderly man whose son has just died” (Drucker, “Humanities Approaches”).

Some Rules of Thumb

  • in general, creative relationships to data/capta are more work than rigorous but straightforward quantitative analysis
  • they require familiarity with both humanities concepts and the underlying technologies
  • humanities tools, though, try to lower the barrier to entry and so often hide the underlying technology.
  • this lets you play with visualization but is rarely sufficient to bring real insights and creative accomplishments

Data Visualization: Play Time

  • we have 3 texts: Lady Susan, Frankenstein, and Les Misérables
  • for each of these there is also a processed data file and in 2 cases a project file
  • we will deal with them in 3 tools: Cytoscape (demo only), Palladio, and Voyant Tools

Cytoscape

  • original data in lesmis.txt
  • project file (if you want to replicate) in lesmis.cys

Maps & Networks with Palladio

Text Visualization with Voyant Tools

  • full-text visualization tool
  • try with frankenstein.txt

Tableau

Today

  • Cautionary Tales re: Visualization
  • Peculiarities of Humanities Data
  • Some Viz Examples
  • Play Time

Tableau recap

  • what was most interesting?
  • what was most difficult?
  • what are the take away lessons?

Data 2: OpenRefine Workshop! (Intro)

Working with Messy Data

  • As we have discussed, the world is messy, but most tools need data to be clean (or simply assume that it is)
  • This introduces some problems for humanists! We will largely set them aside for the moment (but keep thinking)
  • OpenRefine is one of many tools that can be used to clean data
  • for more complex operations, it can be augmented by a scripting language (Python, R, Julia, etc.)

Objectives for This Week

  • Introduce OpenRefine & the Interface
  • Use sample datasets to understand how cleaning works
  • Learn where to go next for more complex operations
  • think about how we could use this in the humanities, and what additional issues we might face as humanists

OpenRefine

According to its creator, it is:

  • more powerful than a spreadsheet
  • more interactive and visual than scripting
  • more provisional / exploratory / experimental / playful than a database

Also:

  • runs locally, but in a browser

Official Download instructions here, if you missed them in the course instructions

Cleaning

  • we use OpenRefine to clean data
  • this is a boring, but important, and sometimes difficult, step
  • there are many ways to clean data, but OpenRefine strikes a very nice balance between power and usability
  • Data cleaning is not epistemologically neutral: it reorganizes the world when you do it.

Your Tasks

  • Complete the Data Cleaning course as discussed in announcements and modules
  • Complete a short response to the course and hand it in (link will be in modules)

temporarily splitting this lecture and adding a conclusion above – only for convenience.

Household Expenditures

Starting OpenRefine

Follow the instructions here, but basically:

  • windows: double click openrefine.exe
  • mac: find openrefine in your Applications folder
  • linux: if installed via your package manager, type openrefine at the command line
    • if downloaded from the website, navigate to the installation folder, open a terminal there, and type ./refine
  • now open http://localhost:3333

Create a new Project with Household Expenses

  • browse for the Survey_of_household_expenses.xlsx file
  • remove first 5 lines, then
  • import and create the project

Some basic cleaning

  • remove empty column
  • set view to 50
  • fill down geography
  • Trim whitespace in household expenditures (Edit Cells → Common transforms → trim whitespace)
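To see what "fill down" and "trim whitespace" actually do to the cells, here is a rough Python equivalent. It is a sketch with made-up rows and hypothetical function names, not OpenRefine's own code: fill down copies the last non-empty value into the blank cells beneath it, and trimming removes stray spaces at either end of a value.

```python
def fill_down(rows, column):
    """Copy the last non-empty value of `column` into the blank cells below it."""
    filled, last_seen = [], None
    for row in rows:
        value = row.get(column)
        if value is None or str(value).strip() == "":
            row = {**row, column: last_seen}  # blank cell: reuse the value above
        else:
            last_seen = value
        filled.append(row)
    return filled


def trim_whitespace(rows, column):
    """Strip leading and trailing whitespace from every string in `column`."""
    return [
        {**row, column: row[column].strip() if isinstance(row[column], str) else row[column]}
        for row in rows
    ]


# Hypothetical rows shaped like the spreadsheet: geography appears once per group.
rows = [
    {"Geography": "Canada", "Household expenditures": "  Food "},
    {"Geography": "", "Household expenditures": "Shelter  "},
    {"Geography": "British Columbia", "Household expenditures": " Food"},
    {"Geography": None, "Household expenditures": "Pet food "},
]
cleaned = trim_whitespace(fill_down(rows, "Geography"), "Household expenditures")
for row in cleaned:
    print(row)
```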

Viewing

  • sort, remove sort
  • filter, facet

Transposing Data

  • switch from “wide” to “long” data
  • Transpose → Transpose Cells across columns into Rows
    • choose from 2010 to 2016
    • Transpose into Year (key) and AvgExp (value)
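The wide-to-long switch is easier to picture with a tiny example. In this sketch (made-up numbers, hypothetical function name) each year column of a wide row becomes its own row with a Year key and an AvgExp value, which mirrors what the transpose step produces.

```python
def wide_to_long(rows, id_columns, key_name, value_name):
    """Turn every non-id column into its own (key, value) row."""
    long_rows = []
    for row in rows:
        ids = {name: row[name] for name in id_columns}
        for column, value in row.items():
            if column not in id_columns:
                long_rows.append({**ids, key_name: column, value_name: value})
    return long_rows


# Hypothetical wide row: one column per year (values are invented).
wide = [
    {"Geography": "British Columbia", "Category": "Pet food", "2010": 100, "2011": 110},
]
long = wide_to_long(wide, ["Geography", "Category"], "Year", "AvgExp")
for row in long:
    print(row)
```

Long data has more rows but only one column per kind of thing, which is what most filtering, faceting, and charting tools expect.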

Modify Data in Place Using Facets

Explore:

  • what did the average BC household spend on pet food?

GREL

  • upper/lowercase of column names

New Project with iNaturalist

Clustering

  • find the species_guess column, and make a text facet
  • choose the “cluster” button, and see what it does
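One way clustering can work is "key collision": reduce every value to a simplified key (a fingerprint), and treat values that share a key as probable variants of the same thing. The sketch below illustrates that general idea with made-up values; it is an approximation for teaching purposes, and OpenRefine's own fingerprint method may differ in its details.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a value to a key that ignores case, accents, punctuation and word order."""
    text = value.strip().lower()
    # Strip accents: decompose characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort the unique words so that word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values whose fingerprints collide; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical species guesses with typical inconsistencies.
guesses = ["Mallard", "mallard ", "Mallard.", "Duck, Mallard", "mallard duck", "Robin"]
for group in cluster(guesses):
    print(group)
```

Note that the method only proposes clusters; deciding whether two values really mean the same thing is still your interpretive call.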

Split

  • split species into 2 columns

Explore

  • split species_guess into genus_guess and epithet_guess; which is the most popular value of genus_guess?
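If you want to check your answer outside OpenRefine, the same split-and-count can be sketched in Python. The values below are invented, and the sketch assumes each guess is a two-part name separated by a space, which will not be true of every row in the real dataset.

```python
from collections import Counter


def split_species(guess: str) -> tuple[str, str]:
    """Split at the first space: the first word is the genus, the rest is the epithet."""
    genus, _, epithet = guess.strip().partition(" ")
    return genus, epithet


# Hypothetical values for the species_guess column.
guesses = ["Anas platyrhynchos", "Anas acuta", "Turdus migratorius"]

genus_counts = Counter(split_species(guess)[0] for guess in guesses)
print(genus_counts.most_common(1))  # -> [('Anas', 2)]
```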

Concatenate/Join

  • a: join columns (Edit Column → Join)
  • b: concatenate (Edit Column → Add Column based on this Column → GREL Expression)

Explore

  • Combine the scientific_name and common_name columns using both of these methods

Removing/Reordering

  • remove some unused licences
  • move licence, quality grade, species info to the front

Flagging

  • facet by license/true, quality/casual
  • flag these rows
  • unfacet
  • refacet on flags (in “all”)
  • remove matching (also in “all”)

Undo/Redo/Extract/Apply

  • long list of undo
  • you can navigate through this list
  • be careful not to lose work while exploring!

Now: Some Humanities Work: Cleaning Text

We had some options here and I am trying to decide which is easiest! Let’s see where we get!

9: Data Visualization – Workshop! (<2020-07-28 Tue>)

<2020-07-30 Thu> 10

Notes on Kelly’s work

“Network density”?
percentage of possible connections between nodes that are actually present
Cleveland and McGill Ranking System
Graphical Perception from 1984. The principal finding is the ranking of elementary perceptual tasks given in Kelly’s slide 36.
small multiples
dividing a plot into multiple individual graphs to avoid overplotting. cf. this example
treemaps
rectangles w/ sub-rectangles. cf. Google Charts API for treemaps
Bertin’s variables
(1967) what are the categories “selective, associative, ordered, quantitative”?
selective
is a change enough to allow us to select it from a group?
associative
is a change enough to allow us to perceive it as a group?
quantitative
is there a numerical reading obtainable from changes in this variable?

mckinlay
what are quantitative, ordinal, and nominal categories?
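The “network density” definition in these notes translates directly into a calculation: divide the number of connections that exist by the number that could exist. For an undirected network with n nodes, the number of possible connections is n(n-1)/2. The sketch below uses a made-up edge list and a hypothetical function name; it only knows about nodes that appear in at least one edge.

```python
def network_density(edges) -> float:
    """Share of possible undirected connections that are actually present."""
    nodes = {node for edge in edges for node in edge}
    # Ignore self-loops and count each pair once, whatever its direction.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    possible = len(nodes) * (len(nodes) - 1) / 2
    return len(unique_edges) / possible if possible else 0.0


# Hypothetical network: 4 nodes, so 6 possible connections, of which 4 exist.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(network_density(edges))
```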

12: Finale

Introduction to Digital Humanities

Learning Goals

  • Digital Humanities (DH) is a discipline at the intersections of the humanities with computing.
  • Digital humanists analyze languages through digital text collections; build digital archives of forbidden books; resurrect historical cities through digital maps; or construct video games to study literature.
  • This year the course focuses on plague literature: the .
  • By the end of the course, you will have mastered concepts and technologies you can use in future courses and workplaces: data visualization, data analysis, and digital exhibit platforms. And you will learn how our stories and cultural conversations work and shapeshift through digital environments.

By the end of the course:

  • You will be able to describe the history and intellectual landscape of the digital humanities, including the central concepts, debates, projects, and digital tools current in the discipline.
  • You will have developed a set of best practices around datasets, project design and management, and data curation.
  • You will have analyzed data and digital artifacts as complex cultural objects, shaped by, and shaping, how we live, think, and know.

Job Listing

./images/deloite-job.png

Requirements vs. Achievements

Continuing in DH

  • Statistical reasoning
  • programming skills
  • close reading
  • creativity

Diagrams

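;; Enable the Babel languages used for the diagrams below.
(org-babel-do-load-languages
 'org-babel-load-languages
 '((ditaa . t)
   (latex . t)
   (plantuml . t)))
;; Paths to the ditaa and PlantUML jars on this machine; adjust for your own setup.
(setq org-ditaa-jar-path "/home/matt/src/org-mode/contrib/scripts/ditaa.jar")
(setq org-plantuml-jar-path "/usr/share/java/plantuml/plantuml.jar")
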
+--------------+  +----------+
| cBLUE        |  |          |
| Humanities   | -|          |
|              |  |          |
+--------------+  +----------+
                             
                  

Simple DOT diagram

./images/test-dog.svg

Simple PlantUML diagram

skinparam ArrowColor red
skinparam backgroundColor #EEEBDC
skinparam handwritten true
skinparam defaultFontSize 30
skinparam sequenceArrowThickness 20
skinparam defaultArrowThickness 20


skinparam sequence {
ArrowColor Magenta
ActorBorderColor DeepSkyBlue
LifeLineBorderColor blue
LifeLineBackgroundColor #A9DCDF
ParticipantBorderColor DeepSkyBlue
ParticipantBackgroundColor DodgerBlue
ParticipantFontName Impact
ParticipantFontSize 25
ParticipantFontColor #A9DCDF
ActorBackgroundColor aqua
ActorFontColor DeepSkyBlue
ActorFontSize 17
ActorFontName Aapex
}


node "Humanities" as H #DeepSkyBlue
node "Computing\n Tools and\n Methodologies" as N #DeepSkyBlue

N =l=> H
H =r=> N

LaTeX diagrams

latex-dh.svg

\begin{tikzpicture}
\draw[red] (0,0) circle (1cm);
\end{tikzpicture}

Simple network diagram

graph graphname {
      node [style="filled", color="blue"]
      a -- b;
      b -- c;
      c -- a;
      c -- d;
      c -- g;
      b -- d;
      d -- a;
      e -- d;
      e -- c;
}

Technology-in-practice (Wanda Orlikowski)

./images/tech-in-practice.svg

Final Class

Alec

  • Daston 2020 (not 2021)
  • can you talk in more detail and with more nuance about “true” and “false” claims? Like, can you discuss how people decide what information sources are “authoritative”?
  • re: vaccine efficacy more specifically: this doesn’t feel to me like “misinformation” exactly. So there’s something more complex that has to be conveyed here.

Mia

  • pilgrimage
  • collectivism & individualism (why use this division)
  • “Catholic Christians” –> Catholics

Appendix: More Complex XML Example (where is this from?)

Intro

It may help you to see a few more tags. Here is a slightly more complex example, with a more completely marked-up selection of the poem. I have not annotated this example, but it showcases a few more features of the systems we’re learning. In particular, this example introduces:

Code