(pcase (org-lms-get-keyword "ORG_LMS_SECTION")
((pred (string= "lecture")) (message "lecture"))
(otherwise (message "what?")))
<msDesc>
<msIdentifier>
<settlement>Oxford
</settlement>
<repository>Bodleian Library
</repository>
<idno>MS. Add. A. 61
</idno>
</msIdentifier>
<msContents>
<p>
<quote>Hic incipit Bruitus Anglie,
</quote> the
<title>De origine et gestis Regum Angliae
</title> of Geoffrey of Monmouth (Galfridus Monumetensis): beg.
<quote>Cum mecum multa &amp; de multis.
</quote> In Latin.
</p>
</msContents>
<physDesc>
<p>
<material>Parchment
</material>: written in more than one hand: 7¼ x 5⅜ in., I + 55 leaves, in double columns: with a few coloured capitals.
</p>
</physDesc>
</msDesc>
(setq org-huveal-root "../../vendor/reveal")
hello
- Intro
- What is DH?
- DH: A partial history
- Course syllabus
- A Little Game, a little thinking (offline)
I’d like to keep all chat in Slack outside of class. So please click on this link in your copy of the lecture notes.
Me | Historian of Science, Emphasis on “Engaged Teaching” |
You | Tell us about yourself |
Visit the #introductions channel in Slack. Write us a little message:
- What’s your name, what are your majors/minors?
- why are you interested in this class? If the answer is “it’s a requirement” then tell us why you want to do DH
- what skills are you hoping to develop here/elsewhere/in future?
- tell us one interesting fact about yourself that you’re willing to share with everyone in the class (nothing truly private, please)
- please also add a profile picture to your Slack profile: ideally, this should be the same as or similar to your Quercus profile picture. It’s even better if the picture looks enough like you for me to identify you! I am very slow with names, so I need all the help I can get.
- awareness of what the humanities are
- exposure to established tools (but these change!)
- thoughtful attention to form, content, and their relationship.
- willingness to explore technical skills
(we’ll do more next time)
“The humanities—including the study of languages, literature, history, jurisprudence, philosophy, comparative religion, ethics, and the arts—are disciplines of memory and imagination, telling us where we have been and helping us envision where we are going.”
– /The Heart of the Matter/ (Report of the American Academy of Arts & Sciences’ Commission on the Humanities and Social Sciences to the U.S. Congress in June 2013)
- questions of knowledge:
- do humanists care about facts?
- historical role has been to create narratives that connect various parts of the human experience
- goal was generally not accumulation of fact but understanding of relationships
- we live in an age of facts. How should the humanities engage?
Humanists have traditionally disdained mathematical reasoning. Why?
- it’s hard and they don’t want to try?
- the “two cultures”? [cite:@Snowtwocultures1993a]
- quantitative data distorts the questions they want to ask?
- most empirical disciplines are atomistic and, at least much of the time, reductionist
- humanists are largely interested in emergent properties and experiences
- How can humanists make use of “data” without losing a “dialectical” relationship to their material (their texts, the pasts, their artifacts)?
- and so, a dilemma! and one that will not go away.
- 11 million words of medieval Latin
- 30+ years of editing and analysis
- 8000+ hours of computer processing stacks of punch cards
- 1500+ km of magnetic tape
https://web.archive.org/web/20160910202350if_/http://blog.gale.com/wp-content/uploads/2016/05/busa1.jpg
From Father Busa’s archive. CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milan, Italy, in [cite:@TerrasAdaLovelaceDay2013]. Top left: Livia Canestraro.
https://melissaterras.files.wordpress.com/2013/10/728a0-0028.jpg
- Digital editing & narratives: making texts and narratives available digitally
- Data visualization: giving visual forms to data
- Digital archives: digital (or digitized) collections of primary documents
- Digital mapping: plotting historical or literary data onto a modern, historical, or imaginary map
- Augmented/virtual reality: using computing to overlay virtual elements onto real landscapes (AR)
- 3D printing: turning a digital model into a real object
- Storytelling & performance: video games, coding as art practice
- Busa: at the optimistic invention of “information” [cite:@Shannonmathematicaltheorycommunication1948a]
- Cold War: hope; triumph; fear; reason and unreason; “human use of human beings” [cite:@wiener_human_1989]
- Busa sought to use informatic tools to supplement traditional approaches to traditional questions
- Compare with today
- ubiquitous informatic devices; surveillance; disinformation; systems manipulation (markets; ballot measures; elections; mass opinions)
- all work is now “digital”, and we take “supplementation” for granted
- our task in DH has to be: in an era in which subtle thinking appears to be endangered, can we harness the tremendous computational power to continue to think complex thoughts, not against the humanistic tradition, but within it?
- The digital part is important: you have to be willing to learn technical skills
- But it’s not nearly as important as the humanist part! So you will have to always think hard about texts, about cultural artifacts, about what it means to be human
- Navigate here
- Let’s Look at the “code”
- Now Build Your Own Stories
- Finally, Hand them in
- First, let’s think about Humanities
- Some DH examples
- Very Basic Design Parameters
- Data and Data Models
- How do we Think about Technology?
- Users
- In Class Activities
- What are the Humanities?
- Why do we care about them?
- What are the methods?
- Bring to this field your intuitions and interests from other Humanities Courses
- If you don’t have those… time to acquire them!
- the objects of humanistic inquiry have been studied in diverse times and places, using many different methods (e.g., systematic study of language as an abstract human capacity begins sometime between 700 and 400 BCE near the Afghanistan-Pakistan border)
- the term “Humanist” arises in 14th Century Italy (cf. Bod, pp 145 ff), as a contrast to “Divines” whose most exalted objects of study were theological.
- Development of intellectual disciplines over the next several centuries slowly separated humanities from the natural and social sciences.
- Formal argument in late c.19 about Geisteswissenschaften (“Sciences of the Spirit” == Humanities) and its distinction from the “social” and “natural” sciences
- “Understanding” vs. “explaining”
- This remains a live debate for us!
- We study the works of human culture in an effort to come to grips, not with “what humans are”, but with what it means to be human
- The question is: what are the best methods for answering that question?
- In addition to “data” we need
- synthetic understanding
- empathy
- creativity
- Humanities disciplines are not all the same, but for the most part they have always employed some kind of qualitative analysis
- Computation has tremendously powerful applications for quantitative analysis
- So… what exactly does DH bring to the table?
- treat text as data
- compare large numbers of texts/objects
- allow large-scale collaboration
- provide a media infrastructure for disciplinary conversation.
- Do computers actually help us to answer Humanities questions?
- Alternatively, do computers just make humanities questions obsolete?
- You decide!
- abstract representation of things or processes
- choice of virtual entities, relationships, properties/aspects → “toy universe”
- top row of your spreadsheet
- Human-readable and human-editable
- Separable from technical platforms
- Described via metadata standards used in your discipline
- Contextualized in clear documentation
- Housed in non-proprietary, open source standards and technologies
- Saved in, or reducible to, simple formats: .txt, TEI P5, .csv, JSON, .pdf, .jpg, tiff
- Embedded in your disciplinary community
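To make the “top row of your spreadsheet” idea concrete, here is a rough sketch that reads a tiny CSV string and uses its first row as the data model for a list of JavaScript objects. The sample rows and the function name `csvToObjects` are invented for illustration, and the parser assumes that no field contains a comma or a quotation mark (real CSV files often do).

```javascript
// Hypothetical sample data: the first row names the properties (the data model).
const sampleCsv = "title,creator,date\nItem one,Creator A,1901\nItem two,Creator B,1947";

// Turn CSV text into an array of objects keyed by the header row.
// Simplification: fields are split on every comma, so quoted fields are not supported.
function csvToObjects(csvText) {
  const [headerLine, ...rowLines] = csvText.trim().split("\n");
  const keys = headerLine.split(",");
  return rowLines.map((line) => {
    const values = line.split(",");
    const record = {};
    keys.forEach((key, i) => {
      record[key] = values[i];
    });
    return record;
  });
}

const records = csvToObjects(sampleCsv);
console.log(JSON.stringify(records, null, 2));
```

Because the result is plain JSON, it stays human-readable and is not tied to any one platform.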
Discuss these questions:
- What are “the Humanities”? What kinds of objects do humanists study, and why?
- In a Digital Age, are the humanities relevant anymore?
Pick a random object from the picker.
- what humanities questions could be asked about it?
- what questions about it really aren’t humanities questions?
Now look through the list. What humanities questions are they asking/answering? How successful does their work appear to be?
- Deformance
- Tokenization
- TEI
- LONG Detour: HTML, CSS, JavaScript
- Data Structures
- Nazik Al-Malaika, Free Verse, and “Cholera”
- how did you approach this text?
- what do you need to know in order to understand the arguments?
- what arguments is it making?
- from Dante: the idea of reading against the grain of the work as an act of devotion:
a poem has a determinate conceptual intelligibility, and while one may mistake it, or grasp it partially or inadequately, it nonetheless subsists, just as a transcendentally intelligible Word subsists behind or within all creation.
- from Dickinson: the idea that a work of art has the possibility to mean many things:
In this perspective, the critical and interpretive question is not “what does the poem mean?” but “how do we release or expose the poem’s possibilities of meaning?”
- “portmanteau” of deformation and performance
- it creates new possibilities:
Far more important is the stochastic process it entails. Reading Backward is a highly regulated method for disordering the senses of a text. It turns off the controls that organize the poetic system at some of its most general levels. When we run the deformative program through a particular work we cannot predict the results. As Dickinson elegantly puts it, “A Something overtakes the Mind,” and we are brought to a critical position in which we can imagine things about the text that we did not and perhaps could not otherwise know.
There is one other important result. A deformative procedure puts the reader in a highly idiosyncratic relation to the work. This consequence could scarcely be avoided, since deformance sends both reader and work through the textual looking glass. On that other side customary rules are not completely short-circuited, but they are held in abeyance, to be chosen among (there are many systems of rules), to be followed or not as one decides. Deformative moves reinvestigate the terms in which critical commentary will be undertaken.
- this process is not mere randomness. It is informed by
- prior knowledge of the poem and its context, and
- familiarity with critical literature
- Note that deformation, as described, consists mostly in the application of some set of rules to the poem
- in order to do this, we need some concept of the parts of the poem, their relationships, and the rules of transformation
- formal division of a text into parts is called “tokenization”
- find individual “tokens” of particular “types”
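A minimal sketch of tokenization, assuming we only care about two token types, “word” and “punctuation”. The function name `tokenizeLine` and the regular expression are illustrative choices, not the method used by any particular tool; the output shape simply mirrors the type/content/class/meta objects used elsewhere in these notes.

```javascript
// Split a line of text into "word" and "punctuation" tokens.
function tokenizeLine(text) {
  // First alternative: runs of letters, digits, apostrophes and hyphens.
  // Second alternative: any single character that is not whitespace, a letter or a digit.
  const pattern = /[\p{L}\p{N}'’-]+|[^\s\p{L}\p{N}]/gu;
  const pieces = text.match(pattern) || [];
  return pieces.map((piece) => ({
    // A piece counts as a word only if it contains at least one letter or digit.
    type: /[\p{L}\p{N}]/u.test(piece) ? "word" : "punctuation",
    content: piece,
    class: "",
    meta: ""
  }));
}

const tokens = tokenizeLine("Everywhere lies a corpse, mourned");
console.log(tokens.map((token) => `${token.type}: ${token.content}`));
```

Once a text is split into tokens like these, a rule of transformation can be written as a function that takes the list of tokens and returns a new list.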
- HTML is a text encoding system
- but of course there are many others, often with different purposes!
- HTML is a web standard but there are other standards
- TEI is an influential standard designed by and for humanists for encoding textual complexity
- the eXtensible Markup Language looks a bit like HTML but is much more complex
- remains a key technology in many domains (e.g., your Word docs and PowerPoint slides are written in XML)
- built with similar syntax but much, much more powerful
- also way harder to read
<book category="children">
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
(trivial example)
- we won’t!
- but you might want to if:
- you want to record textual complexities in a rigorous format
- you hope to collaborate with others/make your work useful to others
- you want to join a scholarly community around precise textual analysis
- our work in the class is thematically similar even if we don’t use the same technology
- HTML for structure/content;
- CSS for presentation
- JS for dynamic changes and rendering remote data
- opening and closing tags around content
- “attributes” can affect how the browser displays and interprets the tag
- the only attribute that really matters for us in this exercise is
class="list of classnames"
- “Cascading Style Sheets”
- Style sheet
- that “cascades” = overrides prior values
- add dynamic transformations
- as in e.g. this silly example: https://developer.mozilla.org/en-US/docs/Learn/JavaScript/First_steps/What_is_JavaScript#A_high-level_definition
- we are not really learning to program in this class
- but it helps to be aware of JS and we will fool around with it during class
The Mozilla Developer Network (MDN) is the best place to learn about web technologies, though there are lots of other resources too!
- Check out the “Complete Beginners” Sequence here
- Teach yourself HTML here
- Get started with CSS here
- The JavaScript page of the “Complete Beginners” course is a really good place to start learning JS
- Afterwards you can check out the JS sequence
For the assignment, you may want to learn about some specific CSS rules. Every current and proposed CSS rule is listed here with links to a very full explanation of how to use it.
let myArray = ["some",
"collection",
"of",
"things",
4,
7,
3458,
["a", "sub", "array"],
"pretty",
"random"
]
- an array is a list of elements separated by commas, and delimited by
[]
- note 3 kinds of elements in this particular array:
- string: collection of characters delimited by
""
- number: numerals only, no quotes
- array: one of the array elements is itself an array!
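If you want to check the kinds of elements for yourself, the short sketch below labels each element of a small array. The names `mixedArray` and `describeElement` are made up for this example. Note that `typeof` reports arrays as "object", so arrays are checked separately with `Array.isArray`.

```javascript
// A shortened version of the array above: one string, one number, one sub-array.
const mixedArray = ["some", 4, ["a", "sub", "array"]];

// Return a label for the kind of value we were given.
function describeElement(element) {
  if (Array.isArray(element)) {
    return "array";
  }
  return typeof element; // "string" or "number" for the other elements here
}

const kinds = mixedArray.map(describeElement);
console.log(kinds); // [ 'string', 'number', 'array' ]

// Elements are read by position, counting from 0; sub-arrays need a second index.
console.log(mixedArray[2][1]); // sub
```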
let myObject = {
title: "I am a collection of key-value pairs",
separation: "always separated by commas",
possibleValues: ["value", "can", "be", "any", "javascript", "data structure"],
evenOtherObjects: {
subvalue1: "Even other objects",
subvalue2: 234098,
subvalue3: "This is getting silly"
},
"propertyNamesCanBeQuoted": "Not really a good idea but we do it anyway"
}
- key-value pairs
- separated by commas (remember: CSS rule separator is semi-colon, JS object property separator is comma)
- value can itself be a complex data structure
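Here is a small sketch of how values are read back out of an object. The object `sampleObject` is a cut-down, hypothetical version of the one above; the key `"quoted-key"` is added only to show when quotes and bracket notation are actually required.

```javascript
const sampleObject = {
  title: "I am a collection of key-value pairs",
  possibleValues: ["value", "can", "be", "any"],
  evenOtherObjects: { subvalue2: 234098 },
  "quoted-key": "quotes are required when a key is not a plain identifier"
};

// Dot notation works for keys that are plain identifiers.
const title = sampleObject.title;

// Values can be arrays or other objects, so lookups can be chained.
const secondValue = sampleObject.possibleValues[1];
const nestedNumber = sampleObject.evenOtherObjects.subvalue2;

// Bracket notation works for any key, including ones with hyphens or spaces.
const quoted = sampleObject["quoted-key"];

// Object.keys lists the property names in the order they were defined.
const propertyNames = Object.keys(sampleObject);
console.log(title, secondValue, nestedNumber, quoted, propertyNames);
```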
{
"type": "stanza",
"class": "",
"content": [
{
"type": "line",
"class": "",
"content": [
{"type": "word", "content": "Everywhere", "class": "", "meta": ""},
{"type": "word", "content": "lies", "class": "", "meta": ""},
{"type": "word", "content": "a", "class": "", "meta": ""},
{"type": "word", "content": "corpse", "class": "", "meta": ""},
{"type": "punctuation", "content": ",", "class": "", "meta": ""},
{"type": "word", "content": "mourned", "class": "", "meta": ""}
]
},
{
"type": "line",
"class": "",
"content": [
{"type": "word", "content": "without", "class": "", "meta": ""},
{"type": "word", "content": "a", "class": "", "meta": ""},
{"type": "word", "content": "eulogy", "class": "", "meta": "How do we think of a eulogy?"},
{"type": "word", "content": "or", "class": "", "meta": ""},
{"type": "word", "content": "a", "class": "", "meta": ""},
{"type": "word", "content": "moment", "class": "", "meta": ""},
{"type": "word", "content": "of", "class": "", "meta": ""},
{"type": "word", "content": "silence", "class": "", "meta": ""},
{"type": "punctuation", "content": ".", "class": "", "meta": ""}
]
}
]
}
- add classes to relevant components
- consider adding meta, if you want to explain something
- also possible to add new elements!
{
"type": "stanza",
"class": "",
"content": [
{
"type": "line",
"class": "",
"content": [
{"type": "word", "content": "without", "class": "", "meta": ""},
{"type": "word", "content": "a", "class": "", "meta": ""},
{"type": "word", "content": "eulogy", "class": "death", "meta": ""},
{"type": "word", "content": "or", "class": "", "meta": ""},
{"type": "word", "content": "a", "class": "", "meta": ""},
{"type": "phrase", "class": "",
"meta": "To be without a moment of silence, amidst the grieving; the agony is awful",
"content": [
{"type": "word", "content": "moment", "class": "time", "meta": ""},
{"type": "word", "content": "of", "class": "", "meta": ""},
{"type": "word", "content": "silence", "class": "", "meta": ""}
]},
{"type": "punctuation", "content": ".", "class": "", "meta": ""}
]
}
]
}
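To see why the classes matter, here is a rough sketch of how a structure like this could be turned into HTML that CSS can then style. This is an assumption about one possible approach, not the code the project site actually runs: the function names `renderNode` and `escapeHtml`, the choice of `div` and `span` tags, and the use of a `title` attribute for `meta` are all invented for illustration. It also puts a space between every token, so punctuation ends up with a space in front of it.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Recursively turn one node of the poem structure into an HTML string.
function renderNode(node) {
  // The node type always becomes a class; any extra classes are appended.
  const classNames = [node.type, node.class].filter(Boolean).join(" ");
  const titleAttr = node.meta ? ` title="${escapeHtml(node.meta)}"` : "";
  // "content" is either a list of child nodes or a plain string.
  const inner = Array.isArray(node.content)
    ? node.content.map(renderNode).join(" ")
    : escapeHtml(node.content);
  const tag = node.type === "stanza" || node.type === "line" ? "div" : "span";
  return `<${tag} class="${escapeHtml(classNames)}"${titleAttr}>${inner}</${tag}>`;
}

const sampleLine = {
  type: "line",
  class: "",
  content: [
    { type: "word", content: "a", class: "", meta: "" },
    { type: "word", content: "eulogy", class: "death", meta: "" }
  ]
};

const html = renderNode(sampleLine);
console.log(html);
```

With output like this, a CSS rule for `.death` would affect only the words you tagged with that class.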
- go to: https://digitalhistory.github.io/poem-json-project/
- follow the instructions
- copy the result
- discuss
- display: tells the browser how to manage the selected element in the document flow:
- block means “give this element its own line,”
- inline means “use the line (or box) that the parent element has already defined”
- none means “just don’t display this element at all” – it will be removed from the screen
- visibility: tells the browser to hide the text, but still reserve the space that it would normally take up
- color sets the text content color
- background-color sets the background color
- you can use the pre-defined named colors like white, black, magenta, fuchsia, etc.
- or you can use rgb values (red, green, blue values between 0 and 255): rgb(127,127,127) is grey, for instance
- or you can use hex values: #888888 is also grey
- font-style can be set to italic
- font-weight can be set to bold or normal
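The rgb and hex notations describe the same three numbers, just written in different bases. The sketch below converts between them; the function names `rgbToHex` and `hexToRgb` are illustrative, and the code assumes whole numbers between 0 and 255 and six-digit hex values.

```javascript
// Convert three 0-255 values into a six-digit hex colour string.
function rgbToHex(red, green, blue) {
  return "#" + [red, green, blue]
    .map((value) => value.toString(16).padStart(2, "0"))
    .join("");
}

// Convert a six-digit hex colour string back into [red, green, blue].
function hexToRgb(hex) {
  const digits = hex.replace("#", "");
  return [0, 2, 4].map((start) => parseInt(digits.slice(start, start + 2), 16));
}

console.log(rgbToHex(127, 127, 127)); // #7f7f7f
console.log(hexToRgb("#888888"));     // [ 136, 136, 136 ]
```

So the two greys mentioned above are close but not identical: whenever all three values are equal you get some shade of grey.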
- think about what effect you want to achieve
- then google around/ask in Slack about how to do it!
- September 1947: first reports of cholera in Egypt
- Reports of the dead piling up in downtown Cairo broadcast by radio throughout Middle East
- At age 24, Al-Malaika hears these reports in her hometown Baghdad
- and writes the poem, her first in Free Verse
- Free Verse: modernist movement in poetry, move away from formal requirements of meter, number of lines, rhyme, etc.
- arises as a formal movement in 19th C France (“vers libre”)
- becomes popular in the 1920s in English
- very divergent from classical Arabic poetry, which has strict meter and rhyme requirements
- “Al-Kolera” is perhaps the first free verse poem written in Arabic; heavily influenced by European Romantic & modernist poetry movements
- an interpretative re-arrangement of the poem
- highlighting? deleting? reversing? etc.
- working with files and editors
- editing css
- seeing stuff happen
- figuring out what to deform
CF the Romeo & Juliet Repo & associated announcements.
- stanzas reversed
- stanzas randomized
- render as prose
- tag “sound” words and remove them.
- replace some words with antonyms (this is tricky. Could do by hand, or add some tricky CSS to replace with content of a data attr)
- a couple of other things too.
- hide last 3 lines;
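Several of these deformations are simple rules applied to a list of stanzas. A rough sketch, assuming the poem is stored as an array of stanzas, each of which is an array of line strings; the placeholder lines and all function names here are invented for illustration.

```javascript
// Placeholder poem: three stanzas of two lines each.
const poem = [
  ["stanza 1, line 1", "stanza 1, line 2"],
  ["stanza 2, line 1", "stanza 2, line 2"],
  ["stanza 3, line 1", "stanza 3, line 2"]
];

// Reverse the order of the stanzas without changing the original array.
function reverseStanzas(stanzas) {
  return [...stanzas].reverse();
}

// Randomize the order of the stanzas (Fisher-Yates shuffle on a copy).
function shuffleStanzas(stanzas, random = Math.random) {
  const copy = [...stanzas];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Drop the last n lines of the whole poem, returning a flat list of lines.
function hideLastLines(stanzas, n) {
  const lines = stanzas.flat();
  return lines.slice(0, Math.max(0, lines.length - n));
}

// Run all the lines together as a single paragraph.
function renderAsProse(stanzas) {
  return stanzas.flat().join(" ");
}

console.log(reverseStanzas(poem));
console.log(shuffleStanzas(poem));
console.log(hideLastLines(poem, 3));
console.log(renderAsProse(poem));
```

Each function returns a new value and leaves `poem` untouched, so you can try several deformations on the same starting text.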
- what things do you want to change? What is interesting about the poem? What do you want to emphasize?
- need to write this out so I can easily do a lecture!
- discuss disease
- thoughts about assignment
- intro to Omeka (final assignment framework)
- in 2019, we lived in a relatively epidemic-free moment in human history
- not anymore!
- what can we learn from the broad sweep of human encounters with widespread infectious disease?
- actual “plague”: 3 pandemics: “Justinian” (6th C CE), “Black Death” (1346-~1700), “Modern Plague” (mid-c.19-~1959)
- Syphilis (~1500-early c.20)
- Cholera: early c.19-present
- AIDS: late c.20-present
- Ebola: 2014-present
(many others)
- diseases have many cultural effects
- also, we understand disease in culturally-specific ways
- “dialectical” (or “feedback”) relationships
- literature as reflection and generator of understanding
- quickly assign someone as record keeper. This person will report back in real time in Slack, where everyone can see the conversation
- meanwhile, I will float around and listen in – not to eavesdrop, but to be helpful
- Go to the login page for the DH Omeka site, and
- sign in with this auth:
- user: WDW235-daytime
- pass: PlagueLitProject
- Items
- individual pieces of content
- Metadata
- information about an item – “Dublin Core” standard
- Collections
- groups of items; each item is in exactly one collection
- Exhibits
- narratives woven around items.
Heroes and Villains: Silver Age Comics
- In the Item’s fields, enter the metadata: Title, Subject, Description, etc. Before you add items to your collection, you will have figured out how the Dublin Core metadata schema applies to your particular data: consistency across your collection is key.
- For example, for medieval manuscripts, you can list authors of the texts under Author: but what of known scribes, who may have also intervened in the text? Are they Authors or Publishers or Contributors? Pick what makes sense and apply it consistently.
- For details on Dublin Core:
- http://dublincore.org/documents/dces/
- If you wish the Item to be visible on the public view of the site, check “Public” (under “Add Item”).
- Click “Add Item” (green, right).
- In this assignment, you are responsible for the following fields (and some may have to stay blank):
- Title;
- Description (a paragraph recording your description of the object, in your own words: 100-200 words)
- Creator;
- Source (can be e.g. manuscript or book or collection);
- Publisher;
- Date;
- Rights (i.e. who owns copyright – language here will vary strongly depending on institution the items come from, and that is not an error on the students’ part);
- Format (material, e.g. bronze, parchment, etc.);
- Language;
- Coverage (place where the object is from/was made).
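One way to keep your metadata consistent is to draft each item as a small record before typing it into Omeka. The sketch below is only a planning aid and does not talk to Omeka at all: `requiredFields` simply repeats the list above, while `blankFields` and the sample values in `draftItem` are hypothetical.

```javascript
// The fields this assignment asks for, in the order listed above.
const requiredFields = [
  "Title", "Description", "Creator", "Source", "Publisher",
  "Date", "Rights", "Format", "Language", "Coverage"
];

// Return the names of fields that are missing or empty in a draft item.
// Assumes every value that is present is a string.
function blankFields(item) {
  return requiredFields.filter(
    (field) => !item[field] || item[field].trim() === ""
  );
}

// Hypothetical draft: some fields filled in, others still to do.
const draftItem = {
  Title: "Sample manuscript",
  Creator: "",
  Format: "parchment",
  Language: "Latin"
};

console.log(blankFields(draftItem));
```

Some fields may legitimately stay blank, so treat the result as a reminder list, not an error report.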
- The Exhibit is a narrative pathway through the collection; or, if you will, a digital essay based on the items in your collection.
- First write the prose for your exhibit and decide what items need to appear in it.
- To look pretty in the Exhibit, the items need to have picture or video files attached.
- Save your prose in a separate file.
- Put your immortal prose somewhere safe. Omeka Exhibits cannot be exported automatically: and if you ever delete yours in error, there is no getting it back.
- The Exhibit consists of Pages.
- Each Page is a section of your Exhibit.
- To build your Exhibit, start adding Pages by clicking the Add Page green button.
- In your finished Exhibit, each Page will be a Section of the Narrative, illustrated with Items from your digital collection.
- You can use and reuse items from your collection in as many Exhibits as you wish.
- Make a project plan with deadlines. But allow for disasters.
- Plan your exhibits (possibly on paper) well ahead of time
- Draft and save your Item descriptions and your Exhibit content in word processing software; save this draft;
- Paste your content from Word into Omeka.
- Published May 1842, by Edgar Allan Poe
- First, read the story (20 mins)
- Then, answer questions:
- What is the story about?
- If you were to choose this story as your “Plague Lit” artifact, what historical contexts would matter?
- What literary traditions would you need to investigate?
- What actually interests you about the story? Note: You don’t have to enjoy the story to be interested in how it operates, just like you don’t have to enjoy cancer to be an oncologist, or enjoy fascism to be a historian of the Nazi period.
- Are there themes or stylistic elements to which you might want to call attention?
- What quotes would you work with?
- what kinds of items would you add to Omeka?
Here we have a rather more difficult text, and if you have not read cultural history of science before, then it may feel quite difficult/opaque. So let’s try to understand what the essay is trying to do without even really reading it.
- Pelling is a social and cultural historian of medicine, with an interest in contagious disease esp. in the c.19.
- Let’s skim the text by reading pp. 1-4 and 19 (15-18 & 33 in the original numbering).
- what main points do you see?
- Do you feel confused? Disoriented? How might you go about orienting yourself?
- Knowledge, Danger, Digital
- Omeka: Your Questions and Concerns
Digital archives can originate with print media, like the Toronto Public Library Digital Archive.
Other archives are born digital.
https://archive-it.org/collections/2358
http://guides.library.cornell.edu/c.php?g=31688&p=200748
https://storify.com/acarvin/new-story-2
Archives consist of records: digitized manuscripts, books, newspapers, legal documents, video footage, oral histories, sounds, tweets.
Digitisation of a Dunhuang manuscript (Pictured: De Vere 480 camera. Wikipedia: International Dunhuang Project, 2006.)
Digitizing Books at the Fisher Library (Photo: Paul Armstrong, 2016)
Image: Google, Data Centre Gallery (https://www.google.com/about/datacenters/gallery)
“Documentary heritage reflects the diversity of languages, peoples and cultures. It is the mirror of the world and its memory. But this memory is fragile. Every day, irreplaceable parts of this memory disappear for ever.”
– UNESCO Memory of the World Programme
http://eap.bl.uk/database/map.a4d
- Risky
- Own machine: YIKES
- Dropbox or Google Drive: better, but not perfect
- Safer
- Multiple backups off-site
- GitHub (version control)
- Institutional repository (i.e. the library) with technical safeguards against data degradation
- LOCKSS: distributed network of institutional repositories
- Dark archives: secret, inaccessible archives (disaster recovery)
- Risky
- Dedicated, unique software platforms made by commercial providers to fit your data beautifully.
- Safer
- Open-source, community-supported software
- Migration
- Moving your data from an obsolete, less stable platform and format into a newer, more stable platform and format.
- Great for simpler digital objects (text, images)
- Example: DOE data.
- Emulation
- In a new, stable software platform, recreating the—now obsolete—original environment of a digital object
- Good for complicated digital objects (software)
- Example: old arcade video games.
- single point of failure
- who owns knowledge
- who can/should we trust?
file:///home/matt/wdw235/images/mattu-data-driven-society-oct-30-2019.jpg
Word/symbol | meaning |
---|---|
¶ | paragraph; by itself it means “new paragraph here” |
✓ | “good job; this is what you need to do to move your argument forward” It doesn’t mean “I agree” |
awk/awkward | “This sentence or phrase is not clear, and that unclarity potentially affects the success of your argument” |
grammar | grammatical error; should be obvious to you |
agreement | subject & verb do not agree |
😄 | :-) |
- overall: you guys did pretty well. congrats.
- First Person: when to avoid
- close reading: it’s hard
- marking is coming along, but won’t be done tomorrow
- so far so good!
- Due next Friday
- Assignment says 100-200 words but you may need more
- identify your book
- explain its significance
- Describe the ban or challenge
- Briefly describe your plans for the project
- what main themes will you discuss?
- what images do you need?
- how will you imagine the layout (wireframe)?
Data pass themselves off as mere descriptions of a priori conditions. Rendering observation (the act of creating a statistical, empirical, or subjective account or image) as if it were the same as the phenomena observed collapses the critical distance between the phenomenal world and its interpretation, undoing the basis of interpretation on which humanistic knowledge production is based. We know this. But we seem ready and eager to suspend critical judgment in a rush to visualization.
To overturn the assumptions that structure conventions acquired from other domains requires that we re-examine the intellectual foundations of digital humanities, putting techniques of graphical display on a foundation that is humanistic at its base. This requires first and foremost that we reconceive all data as capta. Differences in the etymological roots of the terms data and capta make the distinction between constructivist and realist approaches clear. Capta is “taken” actively while data is assumed to be a “given” able to be recorded and observed. From this distinction, a world of differences arises. Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact.
The polemic I set forth here outlines several basic principles on which to proceed differently by suggesting that what is needed is not a set of applications to display humanities “data” but a new approach that uses humanities principles to constitute capta and its display. At stake, as I have said before and in many contexts, is the authority of humanistic knowledge in a culture increasingly beset by quantitative approaches that operate on claims of certainty. Bureaucracies process human activity through statistical means and when the methods grounded in empirical sciences are put at the service of the social sciences or humanities in a crudely reductive manner, basic principles of critical thought are violated, or at the very least, put too far to the side. To intervene in this ideological system, humanists, and the values they embrace and enact, must counter with conceptual tools that demonstrate humanities principles in their operation, execution, and display.
But an important distinction needs to be clear from the outset: the task of representing ambiguity and uncertainty has to be distinguished from a second task – that of using interpretations that arise in observer-codependence, characterized by ambiguity and uncertainty, as the basis on which a representation is constructed. This is the difference between putting many kinds of points on a map to show degrees of certainty by shades of color, degrees of crispness, transparency etc., and creating a map whose basic coordinate grid is constructed as an effect of these ambiguities. In the first instance, we have a standard map with a nuanced symbol set. In the second, we create a non-standard map that expresses the constructed-ness of space.
http://www.digitalhumanities.org/dhq/vol/5/1/000091/…000091/resources/images/figure07.jpg
= the presentation of data, information, knowledge, or insight in a pictorial or graphical format
http://thisisindexed.com/wp-content/uploads/2018/11/card6019.jpg
http://thisisindexed.com/wp-content/uploads/2017/11/card5371.jpg
Parik, “How to Lie with Data”
https://i.kinja-img.com/gawker-media/image/upload/s–SKWrO6sh–/c_fit,f_auto,fl_progressive,q_80,w_636/uqs2i9txqkdyc5jkpfut.jpg
Parik, “How to Lie with Data”
https://i.kinja-img.com/gawker-media/image/upload/ksd0huhaczb6xsxhrszp.png
- Unstructured Data
- A corpus of literary texts
- Semi-structured Data
- TEI-encoded text
- Structured Data
- Spreadsheet of catalogue entries
- collection of geocoded points in a GIS system
<msDesc>
<msIdentifier>
<settlement>Oxford
</settlement>
<repository>Bodleian Library
</repository>
<idno>MS. Add. A. 61
</idno>
</msIdentifier>
<msContents>
<p>
<quote>Hic incipit Bruitus Anglie,
</quote> the
<title>De origine et gestis Regum Angliae
</title> of Geoffrey of Monmouth (Galfridus Monumetensis): beg.
<quote>Cum mecum multa &amp; de multis.
</quote> In Latin.
</p>
</msContents>
<physDesc>
<p>
<material>Parchment
</material>: written in more than one hand: 7¼ x 5⅜ in., I + 55 leaves, in double columns: with a few coloured capitals.
</p>
</physDesc>
</msDesc>
- What “counts” cannot necessarily be counted
- Data representation = interpretation:
- The process of modelling and collecting our data is an interpretive process that is shaped by our choices re. what aspects of the data we model; by our research question, argument, perspective, discipline, social context, institutional context, tools available etc.
“When you call something data, you imply that it exists in discrete, fungible units; that it is computationally tractable; that its meaningful qualities can be enumerated in a finite list; that someone else performing the same operations on the same data will come up with the same results. This is not how humanists think of the material they work with.” (Miriam Posner, http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/)
“[DH visualization tools borrowed from the sciences] carry with them assumptions of knowledge as observer-independent and certain, rather than observer co-dependent and interpretative. […] To begin, the concept of data as a given has to be rethought through a humanistic lens and characterized as capta, taken and constructed.” Johanna Drucker, “Humanities Approaches to Graphical Display.”
- Data vs. Capta
- Display as argument:
“Graphic artifacts present knowledge through the combination of symbolic codes and structured relations of these elements in a flat field. […T]he forms that are generally used for the presentation of information can be understood and read as culturally coded expressions of knowledge with their own epistemological assumptions and historical lineage” (Drucker, “Graphesis: Visual Knowledge Production and Representation,” 2011).
- Johanna Drucker: graphesis = “the field of knowledge production embodied in visual expressions … a visual epistemology” (Drucker, “Graphesis” 2011)
- Visual forms carry the assumptions and values of their fields of origin, and impose these assumptions and values on the data they present, whether these assumptions and values are appropriate to that data or not.
- As humanists, we ask ourselves: What arguments, values, and perspectives do visualizations encode and embody? What kind of knowledge do they produce? What field’s assumptions do they draw from?
- Data: “given”, objective, observed
- Quantitative approaches: from concordances to corpora, from measuring word frequencies and stylometric patterns to thematic discovery through topic modelling
- Visual representations of quantities, trajectories, measurable relationships
- Wordle, Gephi, Cytoscape; pie charts, bar charts, and bubble graphs
- Qualitative approaches: visual and performative, enacting poetics, making subjectivity and interpretation visible
- Maps and timelines of literary narratives; digital collections; interpretive visualizations
https://carto.com/img/layout/gallery/bbva-geo-risk/big.6b6fed37.gif
- Things: nodes (vertices)
- Relationships: edges
- Visualizes word frequencies in a text
- The larger the word, the more often it appears
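To make the idea concrete, here is a minimal sketch of the counting step behind a word cloud, using only Python's standard library. The sample sentence is made up for illustration, and the function name word_frequencies is an assumption. Real tools typically also filter out very common words such as "the".

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercased word occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# A made-up sentence, not a quotation from any of the course texts.
sample = "The monster saw the cottage. The cottage was warm."
frequencies = word_frequencies(sample)

# A word cloud would scale each word by its count; here we only print counts.
for word, count in frequencies.most_common(3):
    print(word, count)
```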
“Who are those dots? Each individual had a profile, age, size, health, economic potential, family and social roles. […] But what if we take the rate of deaths, their frequency, and chart that on a temporal axis inflected by increasing panic. Then give a graphical expression to the shape of the terrain, that urban streetscape, as it is redrawn to express the emotional landscape. Then imagine drawing this same streetscape from the point of view of a mother of six young children, a recent widow, a small child, or an elderly man whose son has just died” (Drucker, “Humanities Approaches”).
- in general, creative relationships to data/capta are more work than rigorous but straightforward quantitative analysis
- they require familiarity with both humanities concepts and the underlying technologies
- humanities tools, though, try to lower the barrier to entry and so often hide the underlying technology.
- this lets you play with visualization but is rarely sufficient to bring real insights and creative accomplishments
- we have 3 texts: Lady Susan, Frankenstein, and Les Misérables
- for each of these there is also a processed data file and in 2 cases a project file
- we will deal with them in 3 tools: Cytoscape (demo only), Palladio, and Voyant Tools
- original data in lesmis.txt
- project file (if you want to replicate) in lesmis.cys
- full-text visualization tool
- try with frankenstein.txt
- Cautionary Tales re: Visualization
- Peculiarities of Humanities Data
- Some Viz Examples
- Play Time
- what was most interesting?
- what was most difficult?
- what are the take away lessons?
- As we have discussed, the world is messy, but data needs to be clean (or at least assumes cleanliness)
- This introduces some problems for humanists! We will largely set them aside for the moment (but keep thinking)
- OpenRefine is one of many tools that can be used to clean data
- for more complex operations, it can be augmented by a scripting language (Python, R, Julia, etc.)
- Introduce OpenRefine & the Interface
- Use sample datasets to understand how cleaning works
- Learn where to go next for more complex operations
- think about how we could use this in the humanities, and what additional issues we might face as humanists
According to its creator, it is:
- more powerful than a spreadsheet
- more interactive and visual than scripting
- more provisional / exploratory / experimental / playful than a database
Also:
- runs locally, but in a browser
Official Download instructions here, if you missed them in the course instructions
- we use OpenRefine to clean data
- this is a boring, but important, and sometimes difficult, step
- there are many ways to clean data, but OpenRefine strikes a very nice balance between power and usability
- Data cleaning is not epistemologically neutral: it reorganizes the world when you do it.
- Complete the Data Cleaning course as discussed in announcements and modules
- Complete a short response to the course and hand it in (link will be in modules)
- First, open this spreadsheet in Excel (or whatever spreadsheet program you use) and let’s look at it.
- Note the format & category choices
- This link goes to a file on my local computer – won’t work for you!
Follow the instructions here, but basically:
- windows: double click openrefine.exe
- mac: find openrefine in your Applications folder
- linux: if installed via your package manager, type openrefine at the command line
- if downloaded from the website, navigate to the installation folder, open a terminal there, and type ./refine
- now open http://localhost:3333
- browse for the Survey_of_household_expenses.xlsx file
- remove first 5 lines, then
- import (create project)
- remove empty column
- set view to 50
- fill down geography
- Trim whitespace in household expenditures (Edit Cells → Common transforms → trim whitespace)
- sort, remove sort
- filter, facet
- switch from “wide” to “long” data: Transpose → Transpose Cells across columns into Rows
- choose from 2010 to 2016
- Transpose into Year (key) and AvgExp (value)
- what did the average BC household spend on pet food?
- upper/lowercase of column names
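The switch from “wide” to “long” data can be sketched in plain Python. This is only an illustration of the idea, not what OpenRefine does internally: the numbers are invented, the Category column name is an assumption, and wide_to_long is a hypothetical helper. Year and AvgExp follow the key and value names used in the exercise.

```python
# Made-up numbers for illustration; these are not the survey's real values.
wide_rows = [
    {"Geography": "British Columbia", "Category": "Pet food", "2010": 100, "2011": 110},
    {"Geography": "Ontario", "Category": "Pet food", "2010": 90, "2011": 95},
]
year_columns = ["2010", "2011"]


def wide_to_long(rows, value_columns, key_name="Year", value_name="AvgExp"):
    """Turn one row with many year columns into one row per year."""
    long_rows = []
    for row in rows:
        # Columns that are not being transposed are copied onto every new row.
        fixed = {k: v for k, v in row.items() if k not in value_columns}
        for column in value_columns:
            long_rows.append({**fixed, key_name: column, value_name: row[column]})
    return long_rows


long_rows = wide_to_long(wide_rows, year_columns)
for row in long_rows:
    print(row)
```

In the long form, each row is one observation (one place, one category, one year), which is the shape that filters and facets work with most easily.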
- navigate to citizenscience.csv (again, only for me)
- this time just create project (no need for more complex manipulation)
- this is crowdsourced data. What might some of the issues be?
- find the species_guess column, and make a text facet
- choose the “cluster” button, and see what it does
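To get a feel for what clustering does, here is a simplified sketch of the idea behind “key collision” clustering with a fingerprint key, which, as far as I understand it, is the default method OpenRefine offers. Values that reduce to the same normalized key are proposed as one cluster. The real implementation does more (for example, it also normalizes accented characters). The sample entries and the function names are assumptions for illustration.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key: lowercase, no punctuation, sorted unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values whose fingerprints collide; keep groups with 2+ spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


# Hypothetical crowdsourced entries, not taken from citizenscience.csv.
guesses = ["American Robin", "american robin", "Robin, American",
           "Mallard", "mallard ", "Bald Eagle"]
clusters = cluster(guesses)
print(clusters)
```

The tool only proposes the groups; deciding whether two spellings really mean the same thing remains an interpretive choice.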
- split species into 2 columns
- split species_guess into genus_guess and epithet_guess; which is the most popular value of genus_guess?
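The same split, and the follow-up question about the most popular value, can be sketched in Python. The sample values below are hypothetical stand-ins for the species_guess column, and split_guess is an illustrative helper name. Splitting at the first space is an assumption about how the values are formatted.

```python
from collections import Counter

# Hypothetical values standing in for the species_guess column.
species_guesses = ["Turdus migratorius", "Anas platyrhynchos", "Turdus merula", "Turdus", ""]


def split_guess(value):
    """Split at the first space into (genus_guess, epithet_guess)."""
    genus, _, epithet = value.strip().partition(" ")
    return genus, epithet


pairs = [split_guess(value) for value in species_guesses]

# Count only non-empty genus values, then ask for the most frequent one.
genus_counts = Counter(genus for genus, _ in pairs if genus)
print(genus_counts.most_common(1))
```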
- a: join columns (Edit Column → Join)
- b: concatenate (Edit Column → Add Column based on this Column → GREL Expression)
- Combine the scientific_name and common_name columns using both of these methods
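The difference between the two methods can be sketched in Python (this is an analogy, not GREL). Joining glues the chosen cells together with a separator, while an expression lets you decide exactly how the new value is built. The rows are hypothetical, and skipping empty cells and the "unknown" placeholder are design choices made for this sketch.

```python
# Hypothetical rows; the column names follow the exercise above.
rows = [
    {"scientific_name": "Turdus migratorius", "common_name": "American Robin"},
    {"scientific_name": "Anas platyrhynchos", "common_name": None},
]


def join_columns(row, columns, separator=" / "):
    """Method a: join the chosen columns, skipping empty cells."""
    return separator.join(row[c] for c in columns if row[c])


def concatenate(row):
    """Method b: build the value with an explicit expression."""
    common = row["common_name"] or "unknown"
    return row["scientific_name"] + " (" + common + ")"


for row in rows:
    row["joined"] = join_columns(row, ["scientific_name", "common_name"])
    row["labelled"] = concatenate(row)
    print(row["joined"], "|", row["labelled"])
```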
- remove some unused licences
- move licence, quality grade, species info to the front
- facet by license/true, quality/casual
- flag these rows
- unfacet
- refacet on flags (in “all”)
- remove matching (also in “all”)
- long list of undo
- you can navigate through this list
- be careful not to lose work while exploring!
We had some options here and I am trying to decide which is easiest! Let’s see where we get!
<2020-07-30 Thu> 10
- “Network density”?
- percentage of possible connections between nodes that are actually present
- Cleveland and McGill Ranging System
- “Graphical Perception”, from 1984. Its principal finding is the ranking of elementary perceptual tasks given in Kelly’s slide 36.
- small multiples
- dividing a plot into multiple individual graphs to avoid overplotting. cf. this example
- treemaps
- rectangles w/ sub-rectangles. cf. google charts api for treemaps
- Bertin’s variables
- (1967) what are the categories “selective, associative, ordered, quantitative”?
- selective
- is a change enough to allow us to select it from a group?
- associative
- is a change enough to allow us to perceive it as a group?
- quantitative
- is there a numerical reading obtainable from changes in this variable?
- Mackinlay
- what are quantitative, ordinal, and nominal categories?
- Digital Humanities (DH) is a discipline at the intersections of the humanities with computing.
- Digital humanists analyze languages through digital text collections; build digital archives of forbidden books; resurrect historical cities through digital maps; or construct video games to study literature.
- This year the course focuses on plague literature: the .
- By the end of the course, you will have mastered concepts and technologies you can use in future courses and workplaces: data visualization, data analysis, and digital exhibit platforms. And you will learn how our stories and cultural conversations work and shapeshift through digital environments.
- You will be able to describe the history and intellectual landscape of the digital humanities, including the central concepts, debates, projects, and digital tools current in the discipline.
- You will have developed a set of best practices around datasets, project design and management, and data curation.
- You will have analyzed data and digital artifacts as complex cultural objects, shaped by, and shaping, how we live, think, and know.
- Statistical reasoning
- programming skills
- close reading
- creativity
(org-babel-do-load-languages
'org-babel-load-languages
'((ditaa . t)
(latex . t)
(plantuml . t)))
(setq org-ditaa-jar-path "/home/matt/src/org-mode/contrib/scripts/ditaa.jar" )
(setq org-plantuml-jar-path "/usr/share/java/plantuml/plantuml.jar")
+--------------+ +----------+
| cBLUE | | |
| Humanities | -| |
| | | |
+--------------+ +----------+
skinparam ArrowColor red
skinparam backgroundColor #EEEBDC
skinparam handwritten true
skinparam defaultFontSize 30
skinparam sequenceArrowThickness 20
skinparam defaultArrowThickness 20
skinparam sequence {
ArrowColor Magenta
ActorBorderColor DeepSkyBlue
LifeLineBorderColor blue
LifeLineBackgroundColor #A9DCDF
ParticipantBorderColor DeepSkyBlue
ParticipantBackgroundColor DodgerBlue
ParticipantFontName Impact
ParticipantFontSize 25
ParticipantFontColor #A9DCDF
ActorBackgroundColor aqua
ActorFontColor DeepSkyBlue
ActorFontSize 17
ActorFontName Aapex
}
node "Humanities" as H #DeepSkyBlue
node "Computing\n Tools and\n Methodologies" as N #DeepSkyBlue
N =l=> H
H =r=> N
\begin{tikzpicture}
\draw[red] (0,0) circle (1cm);
\end{tikzpicture}
graph graphname {
node [style="filled", color="blue"]
a -- b;
b -- c;
c -- a;
c -- d;
c -- g;
b -- d;
d -- a;
e -- d;
e -- c;
}
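As a worked example of “network density” (the percentage of possible connections between nodes that are actually present), the sample graph above can be measured with a few lines of Python. This is a sketch for an undirected graph without self-loops. The edge list is copied from the graph, and the function name network_density is an assumption.

```python
# The edges of the sample graph above, written as pairs of node names.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("c", "g"),
         ("b", "d"), ("d", "a"), ("e", "d"), ("e", "c")]


def network_density(edge_list):
    """Share of possible undirected connections that are actually present."""
    nodes = {node for edge in edge_list for node in edge}
    # Treat a--b and b--a as the same connection (assumes no self-loops).
    unique_edges = {frozenset(edge) for edge in edge_list}
    possible = len(nodes) * (len(nodes) - 1) / 2
    return len(unique_edges) / possible


density = network_density(edges)
print(round(density, 2))
```

With 6 nodes there are 15 possible connections, and 9 of them are present, so the density is 9/15, or 60%.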
- Daston 2020 (not 2021)
- can you talk in more detail and with more nuance about “true” and “false” claims? Like, can you discuss how people decide what information sources are “authoritative”?
- re: vaccine efficacy more specifically: this doesn’t feel to me like “misinformation” exactly. So there’s something more complex that has to be conveyed here.
- pilgrimage
- collectivism & individualism (why use this division)
- “Catholic Christians” –> Catholics
It may help you to see a few more tags. Here is a slightly more complex example, with a more completely marked-up selection of the poem. I have not annotated this example, but it showcases a few more features of the systems we’re learning. In particular, this example introduces:
- more complex colors
- the “a” or hyperlink tag
- curly braces as a shortcut for xpaths
- the somewhat confusing xpath test “node”, which sometimes makes sense to use when “current()” doesn’t produce the effect you want.