Coding Queen Elizabeth’s Court

This is an edited version of a lightning talk I gave at the Shakespeare Association of America Folger Digital Tools (Pre-Conference) Workshop.

[Image: poster created to advertise the encode-a-thon]

In March of 2019, Meaghan Brown of the Folger Shakespeare Library came to George Mason University’s campus to run an encode-a-thon with the upper-level undergraduates who were students in my Early Modern England course.

For this event, we were working on the Elizabethan Court Day By Day dataset, compiled by Marion E. Colthorpe. While the origin of this project was an attempt to track Queen Elizabeth’s royal progresses through the countryside over the course of her reign, it morphed into a dataset tracking the events of her court for every day of her 44-year reign. It includes not just a list of events but also select quotations from a wide variety of primary sources.

[Image: Folgerpedia page explaining the Elizabethan Court Day By Day dataset]

This dataset was donated to the Folger Shakespeare Library as a massive PDF. At over 2000 pages long, it’s a treasure trove of information about the peregrinations and events of Elizabeth’s reign. The Digital Media and Publications team at the Folger have extracted the information from the dataset into plain text and have been working to encode it, to facilitate future analyses of the data.

This encoding is done in a modified version of the Folger’s Dromio TEI transcription/collation tool, developed by Mike Poston, for Early Modern Manuscripts Online transcribathons. Students begin by entering their name or the alias by which they would like to receive credit – I hear rumors William Shakespeare is quite active on the platform! Then they choose a month of Elizabeth’s reign to begin encoding and land in the interface pictured below.

[Image: HTML encoding view of the dataset]

The first line of each day is metadata (colored red in the image) where participants record the day, the type of event that has been captured, and the location of the event (where known). This particular month is February 1601, chosen because it contains the dates of the Earl of Essex’s rebellion – there are two events on the 7th and one on the 8th. While I wasn’t certain where Essex was when he refused to obey the Privy Council’s summons (politics), his followers were definitely at the Globe Theatre later that day, when the Lord Chamberlain’s Men performed Richard II (performance), presumably to get them in the right headspace for their attempted rebellion! The events of the 8th were somewhere in London, per the text, but at multiple locations within the city, hence the location is recorded at a larger scale than the single building of the Globe.
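To give a sense of what that metadata step amounts to, here is a minimal sketch of how the two events of the 7th might be recorded. The element and attribute names are generic TEI-style placeholders of my own, not necessarily what the Folger’s customized Dromio schema actually uses:

    <!-- hypothetical sketch; the real Dromio tag names may differ -->
    <div type="day" n="1601-02-07">
      <event type="politics">Essex refuses the Privy Council's summons (location unknown)</event>
      <event type="performance" where="Globe Theatre">The Lord Chamberlain's Men perform Richard II for Essex's followers</event>
    </div>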

After adding (or editing some of the automatically generated) metadata about each day, students then marked up each day’s text by highlighting important words/phrases and clicking the buttons at the top of the interface to designate them as people (individual people, groups of people, or countries acting as people such as “Spain invaded”), places, dates, quotes, books, or general sources. While straightforward enough for a novice human encoder, this isn’t a task that can be done automatically for a lot of reasons. For example, Essex is a person, a county, and also (in conjunction with the word “House”) a building in London.
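For a rough idea of what those button clicks produce under the hood, a sentence from the 7th might come out looking something like the following. I’m using standard TEI element names (persName, placeName, date, title) as stand-ins; the modified Dromio tool may emit different tags:

    <!-- hypothetical markup using standard TEI elements for illustration -->
    <p>On <date when="1601-02-07">7 February 1601</date> the
      <persName>Lord Chamberlain's Men</persName> performed
      <title>Richard II</title> at the <placeName>Globe Theatre</placeName>
      for the followers of the <persName>Earl of Essex</persName>.</p>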

[Image: XML encoding view of the dataset]

Most of my students worked solely in the interface, but a significant minority also flipped over into the XML markup view and dealt with manual tags, especially when they were dealing with days that needed to be split into multiple events or other complicated encodings. As someone personally familiar with XML, I found this view in many ways easier to work with, as I could see very clearly when I’d accidentally gotten a tag in the wrong spot or included a stray space in a place tag, or other “messiness” that I know will have to be processed out in the final analysis of the dataset.

While my students weren’t entirely certain, going into the event, how exactly this “encode-a-thon” thing was going to relate to their other classroom experiences, when I debriefed them afterward they identified a number of positive outcomes from the activity:

  • an up-close and personal look at daily life among the Elizabethan elite and their servants
  • humanized historical actors
  • a look at the “raw data” of facts and events that forms historical narratives
  • the variety of sources used to reconstruct a narrative of events
  • coverage of major events that didn’t make it into the classroom narrative due to in-class time limits
  • a sense of the work behind building historical datasets
  • thinking critically about the categorization of events/activities
  • an introduction to XML (advanced students only)

And, perhaps most importantly, in between them regaling me with the stories they’d uncovered – and linking those stories back to important points I’d made in class lectures and their assignments – they reported having fun.

Quantitative, Computational, Digital: Musing on Definitions and History

I recently ran across a trifecta of adjectives: “quantitative, computational, and digital” history. It intrigued me enough that I did an internet search which gave me precisely 4 hits, 3 of which were for the same job posting. Clearly this isn’t mainstream yet.

That said, the phrasing really resonated with me on a number of levels and continued to haunt me to the point where I finally decided it was worth writing about at a bit of length.

I am, at the end of the day, a quantitative historian – numbers are integral to both my sources and many of my methods. When I first encountered demographic history in grad school, I instinctively called it “history by numbers” and critiqued sample sizes while interrogating authors’ calculations. My dissertation and first book project analyze early modern British numeracy and quantitative thinking, while my current DH project involves quantification at a massive scale (Death by Numbers: building a database out of the London Bills of Mortality so that I can examine, among other things, early modern people’s addition skills).

As needed, I am also a computational historian – methodologically I use statistics and computer programming on a semi-regular basis. My work on the Six Degrees of Francis Bacon project involved statistical work in R, as well as less quantitative programming in PostgreSQL, Ruby/Rails, JavaScript, HTML, and a sprinkling of Python for good measure. The bibliometric work I’ve done on Identifying Early Modern Books was also fundamentally computational, as is much of the work I’m doing on Death by Numbers (I’m not calculating with nearly a million numbers by hand!). And my newest project, the Bridges of Pittsburgh, will involve a variety of pre-existing software packages as well as probably some bespoke programming for the graph theory aspects. Some of these computational methods are clearly also quantitative, but not all of them.

Lastly, by my actual title and job description, I am a digital historian – for whatever contested definition we give for DH. Increasingly, I and my colleagues in the Pittsburgh area have been scoping DH and digital scholarship projects using the criterion of being web-facing, which plays out interestingly against the other two terms I use above. By these definitions, the digital is often but not always computational. An Omeka exhibit or WordPress site is digital but not particularly computational (in either the quantitative or programmatic sense). And if we define digital as web-facing, then the computational is not always digital. An example of this disjunction could be found in any computational project that ends with a traditional article or monograph publication rather than a sustained digital project.

Cue Venn diagram to visualize the way I’ve been thinking about these similarities and differences… c’mon, you knew this was coming, didn’t you?

[Image: Venn diagram of quantitative, computational, and digital]

So where does this leave DH (and Humanities Computing, Quantitative History, and the like)? Not a clue, which is why I called this a “musings” post. This will certainly not be the last (virtual) ink spilled on this much-contested and interesting subject of definitions. In the meantime, I will continue to enjoy my liminality and try on adjectives to suit my research objectives of the moment – be they qualitative, quantitative, computational, digital, or something else entirely.