Hello World! Twenty Years Later...
Posted on: 28 May '10

I wrote the following bit of code for the first time about twenty years ago:

#include <stdio.h>

int main ()
{
   printf ("Hello World!\n");
   return 0;
}

I had encountered elementary programming in middle school in the form of Logo (a rather quaint variation of Lisp), and GW-BASIC (a rather awkward variation of BASIC). But this was C. This was the real deal! I couldn't believe my luck when my parents somehow agreed to spend a small fortune on a state-of-the-art Intel 386 machine with 1 MB of RAM and a giant 40 MB (yes, megabytes!) hard drive. It ran the latest MS-DOS 4.0 as well - none of that clunky old DOS 3.2. So, with bated breath I typed in this bit of code from Kernighan and Ritchie's fabled The C Programming Language and fired up the Borland Turbo C compiler. Sure enough, after a brief flutter of the hard drive, my terminal said "Hello World!" - welcome to programming.

It was around the same time that the Text Encoding Initiative was established to develop standards for digitizing texts, although I wasn't aware of it then. But there was no lack of excitement for someone discovering the brave new worlds of programming in the nineties. The emergence of the world wide web opened new frontiers, the GNU project was finding its voice in Richard Stallman, and some kid named Linus Torvalds had put together a free Unix kernel! If C was exciting, C++ and object-oriented programming were mind-boggling. Just as we were dipping into Linux, and dabbling in Assembler, Sun came out with a language that ran on a virtual machine - Java!

A Sea of Data

It was not until much later that I encountered digitized texts and the possibilities and challenges presented by them. When I had made the decision to go to grad school to study early modern English literature, it was with great excitement but also with a sense of loss, because I felt I would eventually have no time to pursue my love of technology. So, when I first started exploring the so-called "digital humanities" it was with a sense of anticipation but also with skepticism. I was spending a year at the University of Warwick in the Department of History to study under Steve Hindle, a social historian who had worked on poverty in early modern England. While I had hoped that this minor shift in institutional perspective - from a literature department to a history department - would broaden the horizons of my dissertation on the representation of crime and poverty in early modern literature, I was quite unprepared for the magnitude of the task before me. While fumbling around in paleography and the subtleties of early modern "secretary hand," I was overwhelmed by the massive archive of records that was suddenly available to me. Between the records at the British Library, the National Archives at Kew and sources like the Bridewell Courtbooks, I faced a sea of data that my literary training, with its emphasis on "slow reading," had not quite prepared me for. Moreover, any semblance of a distinction between "text" and "context," literary works and their socio-cultural background - already under challenge as literary scholars undermined the notion of a canon - completely dissolved under this deluge of data and new scholarship on areas I had never looked into before.

The EEBO database had already digitized most of the titles in the English Short Title Catalogue and EEBO-TCP was on its way to encoding a large chunk of them with the evolving TEI standard. To step beyond the cosy bounds of the reified canon, to read an early modern pamphlet, jestbook or ballad was to fall into a sea of data - to encounter a text among 150,000 texts, and beyond that the archival data and socio-economic and literary scholarship. Why am I reading a particular text among thousands of possible others? Are there ways in which this corpus of texts can be harnessed? Can some sort of coherence, some sort of meaning, be extracted from this chaos that would shed light on my particular interest in early modern literature and society?

Digital Humanities: Sailing the Seas

I returned to Madison with more questions than answers, and with a frighteningly broadened horizon that could amount to intellectual paralysis when it came to the concrete work of hammering out chapters of my dissertation. Around the same time I heard of a project Michael Witmore was involved in that was putting chunks of Shakespeare's plays through a language analysis program. Strange though it seemed, it was intriguing, and the buzz around digital humanities had grown into a palpable hum, so I decided to delve deeper into the problems presented by the digitization, encoding and curation of texts. I soon discovered that there was a plethora of exciting tools and projects grappling with that sea of data. The promise of digital humanities seemed immense, but so were the challenges it faced. In the meantime I was fortunate enough to work within a radically interdisciplinary environment at the Institute for Research in the Humanities at UW-Madison, which drew scholars from a wide variety of disciplines. As I tried to use technology to facilitate this collaborative environment, I got the chance to brush up on some of my old skills and to learn many new ones.

This blog, therefore, starts at the beginning of a new journey that is also a return to an old one. As I document the challenges of using technology to tackle the problems posed by humanities scholarship and the discoveries that technology can lead us to, I am awed and excited by the way that my two passions have converged in this newly emerging field. So, twenty years later, let me once again say - "Hello World!"