Teaching With Wikipedia

I have been privileged to work with several awesome graduate students this semester, including Cordelia “Cory” Brazile and Lauren Churilla. A few weeks ago, we had a thought-provoking conversation about using Wikipedia in the classroom that inspired this blog post.

I’m sure I’m not the only one who despairs when my students cut and paste from Wikipedia into their papers. But the ship of “don’t use Wikipedia” sailed at least a decade ago, if not more, and honestly it’s not so different from the age-old problem of “don’t copy from encyclopedias.” Tertiary sources have always (for definitions of “always” covering everyone alive today) been a part of our academic landscape and have clear uses for both research and pedagogy, so banning the use of encyclopedias (crowdsourced or not) makes no sense. We need to use them responsibly and teach our students to do the same.

Enter Wiki Edu, a set of online resources dedicated to supporting teachers who want to incorporate Wikipedia assignments into their syllabi. If you have the desire and inclination to structure a major classroom assignment (equivalent to a whole-semester or half-semester research paper) into your syllabus, I highly recommend checking the site out. It includes a host of video resources that teach students how to edit Wikipedia – including videos on copyright and attribution – and teachers who register their classes on the site can get technological and staff support during the assignment.

That said, not everyone has the time or inclination to engage that intensely with Wikipedia. Our syllabi are jam-packed with everything else we need to convey to students and many teachers don’t want to reinforce an encyclopedia-like “history is facts” mentality over the argumentative, thesis-driven model of historical writing. So what are some alternatives? This is the list Lauren, Cory, and I brainstormed.

  • Explore an article edit history to see how the “facts” emerge over time. 

Using the example of the World history article, students can use the “view history” link to see the original version of this article in which a user snarkily defined world history as:

First the earth cooled.

Then the dinosaurs got too big and fat so they all died.


Johnny, “Airplane – the Movie”

(Among other things, this provides students with a window into why some teachers view the site with serious skepticism…) In the following years, the page was redirected to “History” more generally, then redirected again to “History of the World,” before finally becoming a brief definition of the field of World History and expanding from there. Guiding students through a journey like this helps them understand the evolution of the “facts” that they often take for granted when reading a Wikipedia article.

  • Explore an article’s talk page to see how the “facts” of controversial subjects are negotiated among interested parties.

The Wiki Edu resources deliberately steer students away from editing controversial articles (which is smart, given their assumption that students are editing/writing articles), but exploring these controversies can give students a peek behind the curtain of Wikipedia’s vaunted “neutrality.” Each article has a talk page, where editors can discuss changes before making them (or debate changes that someone else has reverted). Exploring an article’s talk page in addition to its edit history adds another dimension to the goal of exposing students to how historical consensus emerges around controversial subjects. As a bonus, the talk pages themselves have an edit history! Reading the edit history of, say, the Black Lives Matter Wikipedia talk page is a fascinating exercise in the construction of a Wikipedia article, in addition to educating students about the emergence of the movement. Wikipedia maintains a page with a list of controversial articles, which range from politics to sports, as well as a page on Wikipedia controversies.

  • Examine the bibliographies of articles and assess the chosen sources for bias.

Despite Wikipedia’s attempts to create “neutral” articles that don’t favor any particular point of view, it doesn’t always succeed. Sometimes the bias is overt and comes under review or is tagged as biased. In other cases, however, the bias emerges only from carefully reviewing the sources chosen for citation. My favorite biased article (yes, I have a favorite) is the Long Parliament article, which is almost entirely derived from two nineteenth-century publications and is very Whiggish. I’ve been using it as an object lesson for my students for over four years now; eventually someone will fix it, but it hasn’t happened yet! With the exception of an InternetArchiveBot, no one has even touched the article since December 2015. Exploring the bias in this particular article also leads to a greater discussion of the nature of copyright, what sources people seeking to edit Wikipedia have – or don’t have – access to, and how lack of access to cutting-edge academic research might create barriers to public understanding. Insert discussion of the Open Access movement here…

  • Find articles that have been flagged for lack of citation and see if students can find reliable citations for them.

Wikipedia flags articles that need additional verification and maintains a page listing all of these articles. (It even includes an “I can help! Give me a random citation to find!” button to get people started.) Asking students to work on a particularly under-verified article requires them to discover research sources outside of Wikipedia. Students can then practice providing sources for assertions of fact, or determine that alleged facts lack citation because there is no evidence for them and modify the articles accordingly.

  • For multilingual students, examine the difference between the English language version of an article and the version in a different language.

We know that Wikipedia exists across multiple languages, but not everyone realizes that articles can differ radically between languages. The English Wikipedia article for King Sancho IV of Castile, for example, is a mere 5 sections long, most of which are genealogical, while the Spanish version is over twice as long, with 11 sections. Comparing the two helps introduce students to the idea that “facts” and “common knowledge” can actually be culturally – in this case linguistically – determined.

  • Last, but not least, conduct targeted article edits.

Just because you don’t want to spend a huge chunk of your semester on a Wikipedia assignment doesn’t mean that it can’t be productive to do some group live-editing of an article during class, or as a single homework assignment. Showing students how to make simple tweaks to the site gives them a better understanding of its crowdsourced nature and how it’s constantly being updated – and may reinforce both a sense of contributing publicly to knowledge, instead of writing solely for an audience of the teacher, and the idea that perhaps they should question the veracity of things they read online instead of taking it all for granted.

So those were our ideas! If you have any other great things you’ve tried with Wikipedia in the classroom, I’d be interested to hear about them.


Between Two Cultures

At the RSA annual meeting, I attended a roundtable on DH pedagogy. While I was originally hoping to learn about my fellow early modern DHers’ pedagogical strategies, a conversation that took place before the panel quickly clued me in to the fact that I was going to be getting something very different out of the discussion. I was taking an anthropological journey into the world of the DH-curious (people who are intrigued by the potential of DH but inexperienced and often intimidated by it).

When we got to the conversational part of the roundtable, an audience member quickly steered us into a familiar framework from 1959 – C.P. Snow’s “Two Cultures.” The TL;DR of that Wikipedia link is that there are two cultures in modern intellectual society: scientists and humanists. This is an idea that a lot of people have bought into, from administrators trying to defund the humanities to force people into more “productive” career tracks in the sciences, to students who claim they are only “good at” one or the other. When fellow historians profess amazement that I would study something as “difficult” as the history of mathematics, I understand anew why some “History of Science” departments decided to calve off from “History” departments and consider themselves a whole different discipline. And when an entire discussion at the RSA gets bogged down in implicit assumptions that humanists cannot or should not be required to learn to think programmatically, I can’t help but sympathize with DHers who argue that DH is a discipline distinct from the rest of the humanities.

That said, the idea that humanists and scientists form two separate cultures – and that humanists shouldn’t be expected to understand scientists, and vice versa – undermines the entire point of a liberal arts education. It erases undergraduate students who move freely between humanistic and scientific classes, often to the point of double-majoring or major-minoring across this perceived divide. It essentializes graduate students to a single narrow disciplinary viewpoint, as if they are unable to have any interests outside their dissertation topic. It ignores anyone who’s worked in the history of science (seriously: science has a history, just like anything else) as well as social sciences, quantitative humanities, computational/digital humanities, and any other field that doesn’t fit neatly into these two predetermined cultural categories.

The notion of this divide was so embedded into the roundtable discussion – and people were so convinced that those of us who span this divide don’t exist – that an idea was pitched of a DH workshop for RSA 2019 that required collaboration across the divide.

Now don’t get me wrong, collaboration is great. But for collaboration to work, you need to speak enough of the same language to communicate with one another either directly or through a translator. The former requires the humanist and the computer scientist to learn the other’s language, at least a little bit – for example, understanding the difference between computer scientists, data scientists, statisticians, and digital humanists, rather than lumping them all together under the label “IT people.” The latter requires working with someone who already exists across that two cultures divide, making it hard to argue that scholars can’t really be expected to do what some scholars are already doing.

What message, then, does the two cultures debate send to those of us who work across and between these allegedly separate cultures? Why must we be forced to choose sides, to repudiate half of our work/skills/knowledge/selves in order to force ourselves into a binary construction that doesn’t reflect reality? Why are we teaching future generations of scholars that you can only be “good at” one or the other, instead of celebrating the scholarly role models who show that there are more than just two paths?

And so I end with the tweet that was my first gut reaction when the two cultures digression began to take over the panel:


Quantitative, Computational, Digital: Musing on Definitions and History

I recently ran across a trifecta of adjectives: “quantitative, computational, and digital” history. It intrigued me enough that I did an internet search, which gave me precisely 4 hits, 3 of which were for the same job posting. Clearly this isn’t mainstream yet.

That said, the phrasing really resonated with me on a number of levels and continued to haunt me to the point where I finally decided it was worth writing about at a bit of length.

I am, at the end of the day, a quantitative historian – numbers are integral to both my sources and many of my methods. When I first encountered demographic history in grad school, I instinctively called it “history by numbers” and critiqued sample sizes while interrogating authors’ calculations. My dissertation and first book project analyze early modern British numeracy and quantitative thinking, while my current DH project involves quantification at a massive scale (Death by Numbers: building a database out of the London Bills of Mortality so that I can examine, among other things, early modern people’s addition skills).

As needed, I am also a computational historian – methodologically, I use statistics and computer programming on a semi-regular basis. My work on the Six Degrees of Francis Bacon project involved statistical work in R, as well as less quantitative programming in PostgreSQL, Ruby/Rails, JavaScript, HTML, and a sprinkling of Python for good measure. The bibliometric work I’ve done on Identifying Early Modern Books was also fundamentally computational, as is much of the work I’m doing on Death by Numbers (I’m not calculating with nearly a million numbers by hand!). And my newest project, the Bridges of Pittsburgh, will involve a variety of pre-existing software packages as well as probably some bespoke programming for the graph theory aspects. Some of these computational methods are clearly also quantitative, but not all of them.

Lastly, by my actual title and job description, I am a digital historian – for whatever contested definition we give for DH. Increasingly, I and my colleagues in the Pittsburgh area have been scoping DH and digital scholarship projects using the criteria of web-facing, which plays out interestingly against the other two terms I use above. By these definitions, the digital is often but not always computational. An Omeka exhibit or WordPress site is digital but not particularly computational (in either the quantitative or programmatic sense). And if we define digital as web-facing, then the computational is not always digital. An example of this disjunction could be found in any computational project that ends with a traditional article or monograph publication rather than a sustained digital project.

Cue the Venn diagram to visualize the way I’ve been thinking about these similarities and differences… c’mon, you knew this was coming, didn’t you?

So where does this leave DH (and Humanities Computing, Quantitative History, and the like)? Not a clue, hence the reason I called this a “musings” post. This will certainly not be the last (virtual) ink spilled on this very-contested and interesting subject of definitions. In the meantime, I will continue to enjoy my liminality and try on adjectives to suit my research objectives of the moment – be they qualitative, quantitative, computational, digital, or something else entirely.

Never Use White Text on a Black Background: Astigmatism and Conference Slides

TL;DR – never use white text on a black background in your slides.

This post has been a long time coming. Every conference I go to, there will be at least one (and more often ten or twenty) presentations that use white text on a black background. These slides range from hard-to-read to outright illegible, and particularly bad set-ups are so visually painful that I have to close my eyes or turn away from the projection screen. Even conferences that provide advice on designing accessible presentations nod at “make slides high contrast” but are silent on the white-text issue. So! Here it is.

The facts:

  • approximately half the population has some degree of astigmatism
  • white text on black backgrounds creates a visual fuzzing effect called “halation”
  • halation is known to reduce the readability of text and is particularly bad for people with astigmatism

The visual aids:



So please, everyone, strike white text on black backgrounds from your color repertoire, the same way you’ve removed color combinations that are illegible to color-blind people. It’s not a question of preference; it’s an accessibility issue.

Touring the Tepper Quad

Thanks to the Carnegie Mellon Women’s Association, I and a group of other CMU women were able to tour the new Tepper Quad construction site last week. While I’m still sad about the gutting of Morewood parking lot (so cheap! so convenient!), the new building is going to be fantastic.


The building will have eight floors but – in true CMU building-into-hillsides fashion – it will technically have five floors and three basements. In cool trivia, it’s being built with poured concrete instead of steel-and-rebar because that enabled them to fit more floors in while keeping the building under our self-imposed height restriction of 75′ (to keep the building from overshadowing Hamburg Hall, across the street). Anyone else notice those giant bubbles they had on the site a few months back? They used those as “filler” in the concrete to keep the slabs from getting too heavy.


While we call it new Tepper and its main recipients will, of course, be the Tepper folks, the building is intended to be useful for the entire campus community. There will be a dining hall and a gym, for starters! There will also be a 600-person lecture hall with an adjacent greenroom for housing visiting speakers before their big events. The hall can also be divided in half to create two halls for 300. It looks more impressive in person, honest!


Per our guides, the Tepper Quad will be done by May 2018! Not “around May” or “sometime over the summer” or “before the start of the 2018-19 academic year” but May 2018. Period. All the steps that might cause major delays are apparently over so barring the apocalypse or alien invasion, the work will be done in good time for folks to move offices over the summer months.


Overall, it was a great tour – two thumbs up. I highly recommend you take one if you’re given the opportunity before May comes around and all you’ll have left to tour is a bright, shiny new building.

Conference Strategies for the Shy and Introverted

So, a comment on Twitter last night made me realize how many strategies I’ve developed over the past few years to deal with being shy and introverted in a conference environment. To anyone who has met me and is laughing at the thought of me being either shy or introverted? I rest my case as to the effectiveness of some of my strategies! Caveats that these are still very much a work in progress, they function best at small-to-midsize conferences, and I don’t always practice what I preach 🙂

So, without further ado, some strategies that will hopefully be of help to other folks as well:

1) Remember that self-care is more important than “networking.” If you start to hit the end of your ability to cope with people and bounce off a reception/event/invitation, don’t beat yourself up over it. It takes me much less time to recover from oversocialization if I realize I’m at my limit and manage to step back before crashing.

2) Find a conference buddy who is willing to be your social “home base” for some of the breaks, meals, and group events. Ideally said conference buddy will be less shy/introverted than you are, and/or know a few people you don’t and can introduce you to them, but that’s not necessary. You just need someone you can hang out with so you don’t feel socially isolated.

2a) If you have a really good friend, you can also room with your conference buddy to save money and improve the ease of schedule coordination.

3) If possible, make plans to eat with folks in advance. I find that figuring out group meals is one of the most difficult parts of conference networking because it’s far harder to casually fall into a meal group than to have a quick chat during a coffee break.

4) If you’re alone during breaks/receptions, you can hover for a bit (a minute or so is usually my max) near groups of people that are having interesting conversations and/or include someone you sort of know. Sometimes the circle will organically open to include you.

4a) If you’re in one of those groups and see someone hovering alone, physically move a bit to open the circle and give them space to join you. Introverts helping out other introverts for the win!

5) If starting up an in-person conversation with strangers is too hard, try chatting to folks on Twitter during panels then going up to meet them afterwards. “Hi, we were just talking on Twitter earlier and I wanted to introduce myself in person” makes meeting new people a lot less stressful, especially if you can then continue a conversation you started online.

6) Acquaintance chaining works. You know one person who introduces you to a person who then introduces you to another person and suddenly you hit the point where you start to know a lot of people.

7) Reassure yourself that communities build over time. The first several conferences can be hard, but eventually you’ll hit a tipping point where you’ve met enough people in the community that things get easier. They never get EASY but if conferences were easy, then we wouldn’t be introverted, now would we?

Twitter at the Big Three: Global Network Stats

Every year in the break between Fall and Spring academic semesters, tens of thousands of scholars from across the world descend on an American city for several caffeine-fueled days of panels, receptions, job interviews, and social networking. Actually, this happens more than once, as members of the American Historical Association, the Modern Language Association, and the American Library Association all meet in January. And while most of their social networking happens face-to-face, some of it happens on Twitter where enterprising digital humanists armed with Martin Hawksey’s TAGS can collect conference tweets and analyze them for fun and profit.

Posts in this (intended) series include (and will be linked as they are published):

  1. Global Networks Stats
  2. Bipartite Network Analysis
  3. Directed Network Analysis
  4. Preliminary Conclusions (TL;DR)
  5. The Methods Post

So without further ado, here are some initial stats about the networks I constructed from the three official conference Twitter hashtags: #aha17, #mla17, and #alamw17.

The AHA network is the smallest at 2,826 nodes (people who either tweeted or whose Twitter handle showed up in another person’s tweets) and 6,945 edges (connections generated by said tweets). These edges have been weighted so that if Person A mentions Person B 14 times in tweets, the edge from Person A to Person B has weight 14. If Person A mentions Person C only once, the edge from Person A to Person C has weight 1. The average degree is 2.5 (the number of edges divided by the number of nodes); when weight is factored in (edge weights are summed, then divided by the number of nodes), the average weighted degree is 3.9.

There are 74 connected components (subnetworks with no connection to the rest of the network), with the largest connected component containing 90% of the nodes and 96% of the edges in the overall network. This component has a diameter of 10 (the distance between the two people furthest away from each other) and an average path length of 4.3 (the average of the shortest distances between every pair of people in the network).
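These stats are straightforward to reproduce outside of Gephi. Here is a minimal sketch in plain Python (a toy mention network stands in for the actual TAGS-derived conference data; the node names and weights are made up for illustration, and average degree follows the edges-divided-by-nodes definition used above):

```python
from collections import defaultdict, deque

# Toy directed, weighted mention network standing in for the real
# TAGS-derived data: (source, target, weight) means the source
# mentioned the target `weight` times.
edges = [
    ("A", "B", 14),  # A mentioned B 14 times -> edge weight 14
    ("A", "C", 1),
    ("B", "C", 2),
    ("D", "E", 3),   # a small disconnected component
]

nodes = {u for u, v, w in edges} | {v for u, v, w in edges}
n, m = len(nodes), len(edges)

avg_degree = m / n                                   # edges / nodes
avg_weighted_degree = sum(w for _, _, w in edges) / n

# Undirected adjacency for component and distance calculations.
adj = defaultdict(set)
for u, v, _ in edges:
    adj[u].add(v)
    adj[v].add(u)

def bfs_distances(start):
    """Shortest hop-counts from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist

# Connected components: repeatedly BFS from an unvisited node.
components, unseen = [], set(nodes)
while unseen:
    comp = set(bfs_distances(next(iter(unseen))))
    components.append(comp)
    unseen -= comp

largest = max(components, key=len)

# Diameter and average path length within the largest component.
all_dists = [d for v in largest for d in bfs_distances(v).values() if d > 0]
diameter = max(all_dists)
avg_path_length = sum(all_dists) / len(all_dists)
```

On the toy data this yields 5 nodes, 4 edges, 2 connected components, and an average weighted degree of 4.0; swapping in a real edge list exported from TAGS gives the kinds of numbers reported above.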

The MLA network is slightly bigger and slightly more connected:

  • nodes: 3,538
  • edges: 10,178
  • average degree: 2.9
  • average weighted degree: 5.2
  • connected components: 70
  • largest connected component contains
    • nodes: 94.2%
    • edges: 97.8%
  • diameter 12
  • average path length 4.4

The ALA network is the largest and most connected:

  • nodes: 7,851
  • edges: 20,505
  • average degree: 2.6
  • average weighted degree: 3.9
  • connected components: 99
  • largest connected component:
    • nodes: 96.1%
    • edges: 98.9%
  • diameter: 14
  • average path length: 5.4

So what happens when we put it all together?


Green edges = #aha17 hashtag. Red edges = #mla17 hashtag. Blue edges = #alamw17 hashtag.

Merging the three networks together creates some overlap of nodes (people on Twitter during more than one conference) and edges (people tweeting to the same people at more than one conference), but the three networks remain largely discrete. The Force Atlas 2 layout I employed in Gephi created more overlap between the AHA and MLA conferences than with the ALA conference, but in general disciplinarity is the rule of the day.

While some of this is likely an artifact of most scholars’ inability to physically attend multiple conferences (the AHA and MLA, in particular, occurred at the same time, in Colorado and Pennsylvania respectively), scholars can interact via Twitter with conferences they aren’t attending. The co-occurrence of the AHA and MLA could have – theoretically – increased connectivity between the two conferences if similar themes and conversations arose at both and then connected via social media. Alas, I don’t have the 2015 metrics (the last time these conferences didn’t co-occur) to do a comparison, but if anyone has them and wants to share, I’d love to see them!

In general, the merged “Big Three” network stats clearly derive from their constituent conferences’ stats:

  • nodes: 13,489
  • edges: 37,308
  • average degree: 2.8
  • average weighted degree: 4.5
  • connected components: 203
  • largest connected component:
    • nodes: 95%
    • edges: 98.3%
  • diameter: 16
  • average path length: 5.9

One of these numbers, however, immediately jumped out at me as not like the others: the number of connected components. If only the largest connected component of each conference network had connected in the Big Three network, there should have been 74+(70-1)+(99-1)=241 connected components. Instead, 38 of the small components in the conference networks appear to have merged with another component (either the largest connected component or another small component).
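The component bookkeeping above can be sketched as a quick back-of-the-envelope calculation (the counts are the ones reported earlier in this post):

```python
# Component counts reported for #aha17, #mla17, and #alamw17.
aha, mla, ala = 74, 70, 99

# If only the three giant components fused into one, the merged
# network would keep every other component separate: the AHA's 74,
# plus the MLA's and ALA's components minus their giants.
expected = aha + (mla - 1) + (ala - 1)

observed = 203                      # Big Three connected components
merged_small = expected - observed  # small components that fused
```

With these numbers, `expected` comes out to 241 and `merged_small` to 38, matching the counts discussed above.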

This is encouraging to me, as it implies that there is an interdisciplinary scholarly community that emerges on Twitter, not just in the dense “center” of the network but also in the disconnected “margins.” It is not (yet?) clear whether this interdisciplinary community is generated by digital humanists, librarians, geographical proximity, common interests, or – most likely – some combination of factors, including some I haven’t considered.

Regardless of the cause, something is going on. In the interests of exploring it, next time I’m going to restructure my data as a bipartite network to see if anything else interesting emerges.