Brandon Walsh

Introduction to Text Analysis: A Coursebook

[Crossposted on the WLUDH blog]

I am happy to share publicly the initial release of a project that I have been shopping around in various talks and presentations for a while now. This semester, I co-taught a course on “Scandal, Crime, and Spectacle in the 19th Century” with Professor Sarah Horowitz in the history department here at Washington and Lee University. The course counted as digital humanities credit for our students, who were given a quick and dirty introduction to text analysis over the course of the term. In preparing for the class, I knew that I wanted my teaching materials on text analysis to be publicly available for others to use and learn from. One option might be to blog aggressively during the semester, but I worried that I would let the project slide, particularly once teaching got underway. Early conversations with Professor Horowitz suggested, instead, that we take advantage of time that we both had over the summer and experiment. By assembling our lesson plans far in advance, we could collaboratively author them and share them in a format that would be legible for publication to our students, our colleagues, and a wider audience. I would learn from her, she from me, and the product would be a set of resources useful to others.

At a later date I will write more on the collaboration, particularly on how the co-writing process was a way for both of us to build our digital skill sets. For now, though, I want to share the results of our work - Introduction to Text Analysis: A Coursebook. The materials here served as the backbone to roughly a one-credit introduction in text analysis, but we aimed to make them as modular as possible so that they could be reworked into other contexts. By compartmentalizing text analysis concepts, tool discussions, and exercises that integrate both, we hopefully made it a little easier for an interested instructor to pull out pieces for their own needs. All our materials are on GitHub, so use them to your heart’s content. If you are a really ambitious instructor, you can take a look at our section on Adapting this Book for information on how to clone and spin up your own copy of the text materials. While the current platform complicates this process, as I’ll mention in a moment, I’m working to mitigate those issues. Most importantly to me, the book focuses on concepts and tools without actually introducing a programming language or (hopefully) getting too technical. While there were costs to these decisions, they were meant to make any part of the book accessible for complete newcomers, even if they haven’t read the preceding chapters. The book is really written with a student audience in mind, and we have the cute animal photos to prove it. Check out the Preface and Introduction to the book for more information about the thinking that went into it.

The work is, by necessity, schematic and incomplete. Rather than suggesting that this be the definitive book on the subject (how could anything ever be?), we want to suggest that we always benefit from iteration. More teaching materials always help. Any resource can be a good one - bad examples can be productive failures. So we encourage you to build upon these materials in your courses, workshops, or otherwise. We also welcome feedback on these resources. If you see something that you want to discuss, question, or contest, please drop us a line on our GitHub issues page. This work has already benefited from the kind feedback of others, either explicit or implicit, and we are happy to receive any suggestions that can improve the materials for others.

One last thing - this project was an experiment in open and collaborative publishing. In the process of writing the book, it became clear that the platform we used for producing it - GitBook - was becoming a problem. The platform was fantastic for spinning up a quick collaboration, and it really paid dividends in its ease of use for writers new to Markdown and version control. But the service is new and under heavy development. Ultimately, the code is out of our control, and I want something more stable and more fully in my hands for long-term sustainability. I am in the process of transferring the materials to a Jekyll installation that would run off GitHub pages. Rather than wait for this final, archive version of the site to be complete, it seemed better to release this current working version out into the world. I will update all the links here once I migrate things over. If the current hosting site is down, you can download a PDF copy of the most recent version of the book here.

Text Analysis Workshop: Four Ways to Read a Text

[Crossposted on the WLUDH blog.]

On Monday I visited Mackenzie Brooks’s course on “Data in the Humanities” to introduce digital text analysis to her students. I faced a few challenges when planning for the visit:

  • Scope - I had two hours for the workshop and a lot of material to cover. I was meant to introduce anything and everything, as much as I wanted in a general overview of text analysis.
  • Background - This course is an introductory digital humanities course that counts as a science credit at W&L, so I assumed no prior knowledge of programming. Mackenzie will be covering some things with them later in the course, but at this stage I needed to avoid anything really technical.
  • Length - Two hours was both a lot of time and no time at all. It was certainly not enough time to teach anyone to program for the first time. As an aside, I often find it hard to gauge how much material is appropriate for anything longer than 75 minutes.
  • Content - Since this was meant to be a general overview of the field, I did not want to lean too heavily on analysis by tools. I worried that if I did so the takeaway for the students would be how to use the tools, not the underlying concepts that the tools aided them in exploring.

I wound up developing a workshop I called “Introduction to Text Analysis: Four Ways to Read a Text.” Focusing on four ways meant that I felt comfortable cutting a section if things started to go long. It also meant that I was developing a workshop model that could easily fit varying lengths in the future. For example, I’ll be using portions of this workshop throughout my introduction to text analysis lectures in my own course this fall. The approach would necessarily be pretty distant - I couldn’t go into much detail for any one method in this time. Finally, I wanted the students to think about text analysis concepts first and then come to tools that would help them to do so, so I tried to displace the tools and projects from the conversation slightly. The hope was that, by enacting or intuiting the methods by hand first, the concepts would stick more easily than they might otherwise.

The basic structure of the workshop was this:

  1. I introduce a basic methodology for reading.
  2. Students are presented with a handout asking them to read in a particular way with a prompt from me. They complete the exercise.
  3. We talk about the process. We clarify the concept a little more together, and the students infer some of the basic difficulties and affordances of the approach.
  4. Then I show a couple of tools and projects that use that method for real results.

The four ways of reading I covered were close reading, bags of words, topic modeling, and sentiment analysis. So, to use the topic modeling portion as an example, any one of those units looked something like this:

  1. I note how, until now, we have been discussing how counting words gives us a sense of the overall topic or scope of the text. Over time and in close proximity, individual words combine to give us a sense of what a text is about.
  2. I give the students three paragraphs with the words scrambled and out of order (done pretty quickly in Python). I ask the students to get in groups and tell me what the underlying topics or themes are for each excerpt. They had to produce three single-word topics for each paragraph, and paragraphs could share topics.
  3. We talk about how we were able to determine the topics of the texts even with the paragraphs virtually unreadable. Even out of order, certain words in proximity together suggest the underlying theme of a text. We can think of texts as made up of a series of topics like these, clusters of words that occur in noticeable patterns near one another. We have human limits as to how much we can comprehend, but computers can help us run similar, mathematical versions of the same process to find out what words occur near each other in statistically significant patterns. The results can be thought of as the underlying topics or discourses that make up a series of documents. A lot of hand waving, I know, but I am assuming here that students will examine topic modeling in more detail at a later date. Better, I think, to introduce the broad strokes than lose students in the details.
  4. I then share Mining the Dispatch as an example of topic modeling in action to show the students the kinds of research questions that can be explored using this method.
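For anyone who wants to prepare a similar handout, the scrambling in step 2 might look something like the sketch below. This is not the script used for the workshop: the function name, the `seed` parameter, and the whitespace tokenization are all assumptions made for illustration.

```python
import random


def scramble_paragraph(paragraph, seed=None):
    """Return the words of a paragraph in random order.

    Hypothetical helper: `seed` is only here so that a handout can be
    regenerated with the same word order later.
    """
    rng = random.Random(seed)
    words = paragraph.split()  # naive tokenization on whitespace
    rng.shuffle(words)
    return " ".join(words)


excerpt = "Mrs Dalloway said she would buy the flowers herself."
print(scramble_paragraph(excerpt, seed=1))
```

Punctuation stays attached to its word here, which is probably fine for a paper handout but would need more careful tokenization for anything computational.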

So, in essence, what I tried to do is create a hands-on approach to teaching text analysis concepts that is flexible enough to fit a variety of needs and contexts. My handouts and slides are all up on a GitHub repository. Feel free to share, reuse, and remix them in any way you would like.

Reading Speech: Virginia Woolf, Machine Learning, and the Quotation Mark

[Crossposted on the Scholars’ Lab blog as well as the WLUDH blog. What follows is a slightly more fleshed out version of what I presented this past week at HASTAC 2016 (complete with my memory-inflected transcript of the Q&A). I gave a bit more context for the project at the event than I do here, so it might be helpful to read my past two posts on the project here and here before going forward. This talk continues that conversation.]

This year in the Scholars’ Lab I have been working with Eric on a machine learning project that studies speech in Virginia Woolf’s fiction. I have written elsewhere about the background for the project and initial thoughts towards its implications. For the purposes of this blog post, I will just present a single example to provide context. Consider the famous first line of Mrs. Dalloway:

Mrs Dalloway said, “I will buy the flowers myself.”

Nothing to remark on here, except for the fact that this is not how the sentence actually comes down to us. I have modified it from the original:

Mrs Dalloway said she would buy the flowers herself.

My project concerns moments like these, where Woolf implies the presence of speech without marking it as such with punctuation. I have been working with Eric to lift such moments to the surface using computational methods so that I can study them more closely.

I came to the project by first tagging such moments myself as I read through the text, but I quickly found myself approaching upwards of a hundred instances in a single novel - far too many for me to keep track of in any systematic way. What’s more, the practice made me aware of just how subjective my interpretation could be. Some moments, like this one, parse fairly well as speech. Others complicate distinctions between speech, narrative, and thought and are more difficult to identify. I became interested in the features of such moments. What is it about speech in a text that helps us to recognize it as such, if not for the quotation marks themselves? What could we learn about sound in a text from the ways in which it structures such sound moments?

These interests led me towards a particular kind of machine learning, supervised classification, as an alternate means of discovering similar moments. For those unfamiliar with the concept, an analogy might be helpful. As I am writing this post on a flight to HASTAC and just finished watching a romantic comedy, these are the tools that I will work with. Think about the genre of the romantic comedy. I only know what this genre is by virtue of having seen my fair share of them over the course of my life. Over time I picked up a sense of the features associated with these films: a serendipitous meeting leads to infatuation, things often seem resolved before they really are, and the films often focus on romantic entanglements more than any other details. You might have other features in mind, and not all romantic comedies will conform to this list. That’s fine: no one’s assumptions about genre hold all of the time. But we can reasonably say that, the more romantic comedies I watch, the better my sense of what a romantic comedy is. My chances of being able to watch a movie and successfully identify it as conforming to this genre will improve with further viewing. Over time, I might also be able to develop a sense of how little or how much a film departs from these conventions.

Supervised classification works on a similar principle. By using the proper tools, we can feed a computer program examples of something in order to have it later identify similar objects. For this project, this process means training the computer to recognize and read for speech by giving it examples to work from. By providing examples of speech occurring within quotation marks, we can teach the program when quotation marks are likely to occur. By giving it examples of what I am calling ‘implied speech,’ it can learn how to identify those as well.

For this project, I analyzed Woolf texts downloaded from Project Gutenberg. Eric and I put together scripts in Python 3 that used a package known as the Natural Language Toolkit for classifying. All of this work can be found at the project’s GitHub repository.

The project is still ongoing, and we are still working out some difficulties in our Python scripts. But I find the complications of the process to be compelling in their own right. For one, when working in this way we have to tell the computer what features we want it to pay attention to: a computer does not intuitively know how to make sense of the examples that we want to train it on. In the example of romantic comedies, I might say something along the lines of “while watching these films, watch out for the scenes and dialogue that use the word ‘love.’” We break down the larger genre into concrete features that can be pulled out so that the program knows what to watch out for.
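To make the idea of features and training examples concrete, here is a minimal sketch of supervised classification in plain Python. It is a hand-rolled naive Bayes stand-in, not the project's NLTK scripts, and the features, labels, and training sentences are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

SPEECH_VERBS = {"said", "asked", "cried"}   # hypothetical keyword list
FIRST_PERSON = {"i", "me", "my"}            # hypothetical keyword list


def extract_features(sentence):
    """Reduce a sentence to a few yes/no features the program can count."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    return {
        "has_speech_verb": any(w in SPEECH_VERBS for w in words),
        "has_first_person": any(w in FIRST_PERSON for w in words),
    }


def train(labelled_sentences):
    """Count how often each feature value appears under each label."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    for sentence, label in labelled_sentences:
        label_counts[label] += 1
        for item in extract_features(sentence).items():
            feature_counts[label][item] += 1
    return label_counts, feature_counts


def classify(sentence, model):
    """Return the label with the highest naive Bayes log-probability."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)
        for item in extract_features(sentence).items():
            # Add-one smoothing; every feature here has two possible values.
            score += math.log((feature_counts[label][item] + 1) / (count + 2))
        scores[label] = score
    return max(scores, key=scores.get)


training = [
    ("She said she would come tomorrow.", "speech"),
    ("He asked whether I had seen the garden.", "speech"),
    ("The clock struck eleven.", "narration"),
    ("The street was crowded with omnibuses.", "narration"),
]
model = train(training)
print(classify("Peter said he had been in India.", model))
```

The interesting decisions all live in `extract_features`: the classifier can only ever notice what that function tells it to look for.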

To return to Woolf, punctuation marks are an obvious feature of interest: the author suggests that we have shifted into the realm of speech by inserting these grammatical markings. Find a quotation mark - you are likely to be looking at speech. But I am interested in just those moments where we lose those marks, so it helps to develop a sense of how they might work. We can then begin to extrapolate those same features to places where the punctuation marks might be missing. We have developed two models for understanding speech in this way: an external and an internal model. To illustrate, I have taken a single sentence and bolded what the model takes to be meaningful features according to each model. Each represents a different way of thinking about how we recognize something as speech.

External Model for Speech:

“I love walking in London,” said Mrs. Dalloway. “Really it’s better than walking in the country.”

The external model was our initial attempt to model speech. In it, we take an interest in the narrative context around quotation marks. In any text, we can say that there exist a certain range of keywords that signal a shift into speech: said, recalled, exclaimed, shouted, whispered, etc. Words like these help the narrative attribute speech to a character and are good indicators that speech is taking place. Given a list of words like this, we could reasonably build a sense of the locations around which speech is likely to be happening. So when training the program on this model, we had the classifier first identify locations of quotation marks. Around each quotation mark, the program took note of the diction and parts of speech that occurred within a given distance from the marking. We build up a sense of the context around speech.
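A rough sketch of the external model's bookkeeping, assuming a simple regular-expression tokenizer and a window of three tokens (both choices are illustrative, not the project's). It records only the words around each quotation mark and leaves out the part-of-speech tagging described above.

```python
import re

QUOTE_MARKS = {"“", "”", '"'}
# Match either a quotation mark or a word (allowing an internal apostrophe).
TOKEN_PATTERN = r'[“”"]' + r"|\w+(?:[’']\w+)?"


def external_features(text, window=3):
    """List the words within `window` tokens of each quotation mark."""
    tokens = re.findall(TOKEN_PATTERN, text)
    contexts = []
    for i, token in enumerate(tokens):
        if token in QUOTE_MARKS:
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            # Keep the neighboring words, skipping any other quotation marks.
            contexts.append(
                [t.lower() for t in before + after if t not in QUOTE_MARKS]
            )
    return contexts


sentence = "“I love walking in London,” said Mrs. Dalloway. “Really it’s better than walking in the country.”"
for context in external_features(sentence):
    print(context)
```

Note that a window like this reaches into the quotation as well as out of it, so words such as "said" show up alongside words from inside the speech itself.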

Internal Model for Speech:

“I love walking in London,” said Mrs. Dalloway. “Really it’s better than walking in the country.”

The second model we have been working with works in an inverse direction: instead of taking an interest in the surrounding context of speech, an internal model assumes that there are meaningful characteristics within the quotation itself. In this example, we might notice that the shift to the first-person ‘I’ is a notable feature in a text that is otherwise largely written in the third person. This word suggests a shift in register. Each time this model encounters a quotation mark it continues until it finds a second quotation mark. The model then records the diction and parts of speech inside the pair of markings.
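The pairing logic of the internal model can be sketched as follows. Again, this is only an illustration under assumed names and a simple regular-expression tokenizer; it collects the words inside each pair of marks and omits part-of-speech tagging.

```python
import re

QUOTE_MARKS = {"“", "”", '"'}
# Match either a quotation mark or a word (allowing an internal apostrophe).
TOKEN_PATTERN = r'[“”"]' + r"|\w+(?:[’']\w+)?"


def internal_features(text):
    """Return the lowercased words inside each pair of quotation marks."""
    quotations = []
    current = []
    inside = False
    for token in re.findall(TOKEN_PATTERN, text):
        if token in QUOTE_MARKS:
            if inside:
                # A second mark closes the quotation we were collecting.
                quotations.append(current)
                current = []
            inside = not inside
        elif inside:
            current.append(token.lower())
    return quotations


sentence = "“I love walking in London,” said Mrs. Dalloway. “Really it’s better than walking in the country.”"
for words in internal_features(sentence):
    print(words)
```

Because the function simply toggles on every mark, a missing closing quotation mark would flip narration and speech for the rest of the text, so a sketch like this depends entirely on the marks coming in pairs.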

Each model suggests a distinct but related understanding for how sound works in the text. When I set out on this project, I had aimed to use the scripts to give me quantifiable evidence for moments of implied speech in Woolf’s work. The final step in this process, after all, is to actually use these models to identify speech: looking at texts they haven’t seen before, the scripts insert a caret marker every time they believe that a quotation mark should occur. But it quickly became apparent that the construction of the algorithms to describe such moments would be at least as interesting as any results that the project could produce. In the course of constructing them, I have had to think about the relationships among sound, text, and narrative in new ways.

The algorithms are each interpretative in the sense that they reflect my own assumptions about my object of study. The models also reflect assumptions about the process of reading, how it takes place, and about how a reader converts graphic markers into representations of sound. In this sense, the process of preparing for and executing text analysis reflects a certain phenomenology of reading as much as it does a methodology of digital study. The scripting itself is an object of inquiry in its own right and reflects my own interpretation of what speech can be. These assumptions are worked and reworked as I craft algorithms and Python scripts, all of which are as shot through with humanistic inquiry and interpretive assumptions as any close readings.

For me, such revelations are the real reasons for pursuing digital study: attempting to describe complex humanities concepts computationally helps me to rethink basic assumptions about them that I had taken for granted. In the end, the pursuit of an algorithm to describe textual speech is nothing more or less than the pursuit of deeper and enriched theories of text and speech themselves.

Postscript

I managed to take note of the questions I got when I presented this work at HASTAC, so what follows are paraphrases of my memory of them as well as some brief remarks that roughly reflect what I said in the moment. There may have been one other that I cannot quite recall, but alas such is the fallibility of the human condition.

Q: You distinguish between speech and implied speech, but do you account at all for the other types of speech in Woolf’s novels? What about speech that is remembered speech that happened in earlier timelines not reflected in the present tense of the narrative’s events?

A: I definitely encountered this during my first pass at tagging speech and implied speech in the text by hand. Instead of binaries like quoted speech/implied speech, I found myself wanting to mark for a range of speech types: present, actual; remembered, might not have happened; remembered incorrectly; remembered, implied; etc. I decided that a binary was more feasible for the machine learning problems that I was interested in, but the whole process just reinforced how subjective any reading process is: another reader might mark things differently. If these processes shape the construction of the theories that inform the project, then they necessarily also affect the algorithms themselves as well as the results they can produce. And it quickly becomes apparent that these decisions reflect a kind of phenomenology of reading as much as anything: they illustrate my understanding of how a complicated set of markers and linguistic phenomena contribute to our understanding that a passage is speech or not.

Q: Did you encounter any variations in the particular markings that Woolf was using to punctuate speech? Single quotes, etc., and how did you account for them?

A: Yes - the version of Orlando that I am working with used single quotes to notate speech. So I was forced to account for such edge cases. But the question points at two larger issues: one authorial and one bibliographical. As I worked on Woolf I was drawn to the idea of being able to run such a script against a wider corpus. Since the project seemed to be impinging on how we also understand psychologized speech, it would be fascinating to be able to search for implied speech in other authors. But, if you are familiar with, say, Joyce, you might remember that he hated quotation marks and used dashes to denote speech. The question is how far you can account for such edge cases; if you cannot, the study becomes only one of a single author’s idiosyncrasies (which still has value). But from there the question spirals outwards. At least one of my models (the internal one) relies on quotation marks themselves as boundary markers. The model assumes that quotation marks will come in pairs, and this is not always the case. Sometimes authors, intentionally or accidentally, omit a closing quotation mark. I had to massage the data in at least half a dozen places where there was no quotation mark in the text and where its lack was causing my program to fail entirely. As textual criticism has taught us, punctuation marks are the single most likely things to be modified over time during the process of textual transmission by scribes, typesetters, editors, and authors. So in that sense, I am not doing a study of Woolf’s punctuation so much as a study of Woolf’s punctuation in these particular versions of the texts. One can imagine an exhaustive study that works on all versions of all Woolf’s texts as a study that might approach some semblance of a correct and thorough reading. For this project, however, I elected to take the lesser of two evils that would still allow me to work through the material. I worked with the texts that I had.
I take all of this as proof that you have to know your corpus and your own shortcomings in order to responsibly work on the materials - such knowledge helps you to validate your responses, question your results, and reframe your approaches.
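One way to find the unpaired marks mentioned above before they break a script is to flag paragraphs whose quotation marks do not balance. The sketch below is an assumption about how that check might be done, not the approach taken in the project; it treats paragraphs as blocks separated by blank lines and handles curly and straight double quotes only.

```python
def unbalanced_paragraphs(text):
    """Yield (index, paragraph) for paragraphs with unpaired quotation marks."""
    for index, paragraph in enumerate(text.split("\n\n")):
        opening = paragraph.count("“")
        closing = paragraph.count("”")
        straight = paragraph.count('"')
        # Curly marks should match in number; straight marks should be even.
        if opening != closing or straight % 2:
            yield index, paragraph


sample = '"Yes," she said.\n\n"No, he replied.\n\nThe clock struck.'
for index, paragraph in unbalanced_paragraphs(sample):
    print(index, paragraph)
```

Anything flagged still needs a human reader: a quotation that runs over several paragraphs conventionally leaves off the closing mark until the final paragraph, so not every imbalance is an error.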

Q: You talked a lot about text approaching sound, but what about the other way around - how do things like implied speech get reflected in audiobooks, for example? Is there anything in recordings of Woolf that imply a kind of punctuation that you can hear?

A: I wrote about this extensively in my dissertation, but for here I will just say that I think the textual phenomenon the questioner is referencing occurs on a continuum. Some graphic markings, like pictures, shapes, punctuation marks, do not clearly translate to sound. And the reverse is true: the sounded quality of a recording can only ever be remediated by a print text. There are no perfect analogues between different media forms. Audiobook performers might attempt to convey things like punctuation or implied speech (in the audiobook of Ulysses, for example, Jim Norton throws his voice and lowers his volume to suggest free indirect discourse). In the end, I think such moments are playing with an idea of what my dissertation calls audiotextuality, the idea that all texts and recordings of texts, to varying degrees, contain both sound and print elements. The two spheres may work in harmony or against each other as a kind of productive friction. The idea is a slippery one, but I think it speaks to moments like the implied punctuation mark that come through in a particularly powerful audiobook recording.

Apps, Maps, & Models: A New View

[Crossposted on the Washington and Lee University Digital Humanities Blog]

Last Monday several of us here at WLUDH traveled down to Duke University for their symposium on Apps, Maps & Models: Digital Pedagogy in Art History, Archaeology & Visual Studies. I found the trip to be enlightening and invigorating. If you are interested in the event, you can find videos of the talks here and here as well as a storify of the Twitter action here. That the event was so well documented is a testimony to how well organized it was by the Wired! Lab.

Many speakers at the event considered how the tools they were using might relate to more “traditional” modes for carrying out their research. They considered and responded to tough questions with and about their work. Are digital methods for tracing the topography of a surface, for example, fundamentally different in kind from analog means of doing so? If so, are they meant to displace those old tools? Why should we spend the time to learn the new technologies? A related question that comes up at almost every digital humanities presentation (though not at any of these): can digital humanities methods show us anything that we do not already know?

Such questions can be particularly troubling when we are investing such time and energy on the work they directly critique, but we nonetheless need to have answers for them that demonstrate the value of digital humanities work, in and out of the classroom. Numerous well-known scholars have offered justifications of digital work in a variety of venues, and, to my mind, the symposium offered many answers of its own, in part by showcasing amazing work that spanned a variety of fields related to preservation, public humanities, and academic scholarship. Presenters were using digital technology to rebuild the past, using digital modeling to piece together the fragments of a ruined church that have since been incorporated into other structures. They were using these tools to engage the present, to draw the attention of museum patrons to overlooked artifacts. The work on display at the symposium struck me, at its core, as engaging with questions and values that cut across disciplines, digital or otherwise.

Most compelling to me, the symposium drew attention to how the tools we use to examine the objects of our study change our relationship to them. The presenters acknowledged that such an idea does hold dangers – after all, we want museum-goers to consider the objects in a collection, not just spend time perusing an iPad application meant to enrich them. But just as new tools offer new complications, changes in medium also offer changes in perspective. As was illustrated repeatedly at the symposium, drone photography, for all its deeply problematic political and personal valences, can offer you a new way of seeing the world, a new way of looking that is more comprehensive than the one we see from the ground. Even as we hold new methodologies and tools up to critique we can still consider how they might cause us to consider an object, a project, or a classroom differently.

Seeing from a different angle allows us to ask new questions and re-evaluate old ones, an idea that speaks directly to my experience at the symposium. I work at the intersections of digital humanities, literary studies, and sound studies. So my participation in the symposium was as something of an outsider, someone ready to learn about an adjacent and overlapping field but, ultimately, not a home discipline. Thinking through my work from an outsider perspective made me want to ask many questions of my own work. The presenters here were deeply engaged in preserving and increasing access to the cultural record. How might I do the same through text analysis or through my work with audio artifacts? What questions and goals are common to all academic disciplines? How might I more thoroughly engage students in public humanities work?

Obviously, the event left me with more questions than answers, but I think that is ultimately the sign of a successful symposium. I would encourage you to check out the videos of the conference, as this short note is necessarily reductive of such a productive event. The talks will offer you new thoughts on old questions and new ways of thinking about digital scholarship no matter your discipline.

Embedding COinS Metadata on a Page Using the Zotero API

[Crossposted on the Washington and Lee University Digital Humanities Blog]

This year I am working with Mackenzie, Steve McCormick, and his students on the Huon d’Auvergne project, a digital edition of a Franco-Italian romance epic. Last term we finished TEI-encoding of two of the manuscripts and put them online, and there is still much left to do. Putting the digital editions of each manuscript online is a valuable scholarly endeavor in its own right, but we’ve also been spending a lot of time considering other ways in which we can enrich this scholarly production using the digital environment.

All of which brings me to the bibliography for our site. At first, our bibliography page was just a transcription of a text file that Steve would send along with regular updates. This collection of materials is great to have in its own right, but a better solution would be to leverage the many digital humanities approaches to citation management to produce something a bit more dynamic.

Steve already had everything in Zotero, so my first step was to integrate the site’s bibliography with the Zotero collection that Steve was using to populate the list. I found a Python 2 library called zot_bib_web that could do all this quite nicely with a bit of modification. Now, by running the script from my computer, the site’s bibliography will automatically pull in an updated Zotero collection for the project. Not only is it now easier to update our site (no more copying and pasting from a Word document), but now others can contribute new resources to the same bibliography on Zotero by requesting to join the group and uploading citations. The project’s bibliography can continue to grow beyond us, and we will capture these additions as well.

Mackenzie suggested that we take things a bit further by including COinS metadata in the bibliography so that someone coming to our bibliography could export our information into the citation manager of their choosing. Zotero’s API can also do this, and I used a piece of the pyzotero Python library to do so. The first step was to add this piece to the zot_bib_web code:

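from pyzotero import zotero

# Connect to the Zotero library and request COinS markup for the collection.
zot = zotero.Zotero(library_id, library_type, api_key)
coins = zot.collection_items(collection_id, content='coins')

# Append each returned COinS span to the bibliography HTML.
coin_strings = [str(coin) for coin in coins]
for coin in coin_strings:
    fullhtml += coin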

Now, before the program outputs html for the bibliography, it goes out to the Zotero API and gets COinS metadata for all the citations, converts them into a format that will work for the embedding, and then attaches each returned span to the HTML for the bibliography.
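For readers who have not seen COinS before, each returned item is an empty HTML span whose title attribute carries the citation as an encoded string. Zotero generates these for you, so the sketch below is only meant to show roughly what is inside one; the function name and the choice of three fields are assumptions for illustration, and real Zotero output carries more fields.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, year):
    """Build a simplified COinS span for a book (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.date", str(year)),
    ]
    # urlencode percent-encodes the values; escape makes "&" safe inside HTML.
    encoded = escape(urlencode(fields))
    return '<span class="Z3988" title="{}"></span>'.format(encoded)


print(coins_span("Mrs Dalloway", "Virginia Woolf", 1925))
```

The span has no visible content, which is why it can be appended to the bibliography HTML without changing how the page looks.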

Now that I had the data that I needed, I wanted to make it work a bit more cleanly in our workflow. Initially, the program returned each bibliographic entry in its own page and meant for the whole bibliography to also be a stand-alone page on the website. I got rid of all that and, instead, wanted to embed them within the website as I already had it. I have the Python program exporting the bibliography and COinS data into a small HTML file that I then attach to a div with an id of “includedContent” inserted in the bibliography page. I use some jQuery to do so:

$(function(){
  $("#includedContent").load("/zotero-bib.html");
});

Instead of distributing content across several different pages, I mark a placeholder area on the main site where all the bibliographic data and metadata will be dumped. All of the relevant data gets saved in a file ‘zotero-bib.html’ that gets automatically included inside the shell of the bibliography.html page. From there, I just modified the style so that it would fit into the aesthetic of the site.

Now anyone going to our bibliography page with a Zotero extension will see a folder icon on the right of the address bar.

Clicking on the folder icon will bring up the Zotero interface for downloading any of the items in our collection.

And to update this information we only need to run a single Python script from the terminal to regenerate everything.

The code is not live on the Huon site just yet, but you can download and manipulate these pieces from an example file I uploaded to the Huon GitHub repository. You’ll probably want to start by installing zot_bib_web first to familiarize yourself with the configuration, and you’ll have a few settings to update before it will work for you: the library id, library type, api key, and collection ID will all need to be updated for your particular case, and the jQuery excerpt above will need to point to wherever you output the bibliography file.

These steps have strengthened the way in which we handle bibliographic metadata so that it can be more useful for everyone, and we were really only able to do it because of the many great open source libraries that allow others to build on them. It’s a great thing - not having to reinvent the wheel.

Reflections on a Year of DH Mentoring

[Crossposted on the Scholars’ Lab blog and the Digital Humanities at Washington and Lee blog]

This year I am working with Eric Rochester in the Scholars’ Lab on a fellowship project that has me learning natural language processing (NLP), the application of computational methods to human languages. We’re adapting these techniques to study quotation marks in the novels of Virginia Woolf (read more about the project here). We actually started several months before this academic year began, and, as we close out another semester, I have been spending time thinking about just what has made it such an effective learning experience for me. I already had a technical background from my time in the Scholars’ Lab at the beginning of the process, but I had no experience with Python or NLP. Now I feel more comfortable with the former than with any other programming language and familiar enough with the latter to experiment with it in my own work.

The general mode of proceeding has been this: depending on schedules and deadlines, we meet once or twice every two weeks. Between our meetings I would work as far and as much as I could, and the sessions would offer a space for Eric and me to talk about what I had done. The following are a handful of things we have done that, I think, have helped to create such an effective environment for learning new technical skills. Though they are particular to this study, I think they can be usefully extrapolated to apply to many other project-based courses of study in digital humanities. They are primarily written from the perspective of a student but with an eye to how and why the methods Eric used proved so effective for me.

Let the Wheel Be Reinvented Before Sharing Shortcuts

I came to Eric with a very small program adapted from Matt Jockers’s book on Text Analysis with R for Students of Literature that did little beyond count quotation marks and give some basic statistics. I was learning as I built the thing, so I was unaware that I was reinventing the wheel in many cases, rebuilding many protocols for dealing with commonly recognized problems that come from working with natural language. After I had worked on my program and my approach to a degree of satisfaction, Eric pulled back the curtain to reveal that a commonly used Python module, the Natural Language Toolkit (NLTK), could address many of my issues and more. NLTK came as something of a revelation, and working inductively in this way gave me a great sense of the underlying problems the tools could address. By inventing my own way to read in a text, clean it to make its text uniformly readable by the computer, and break the whole piece into a series of words that could be analyzed, I understood the magic behind a couple lines of NLTK code that could do all that for me. The experience also helped me to recognize ways in which we would have to adapt NLTK for our own purposes as I worked through the book.
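To give a rough sense of what that hand-rolled stage involves, here is a minimal sketch in Python using only the standard library. It is an illustration rather than the original program: the function name and the cleaning choices (lowercasing, straightening curly quotes, keeping double quotation marks as their own tokens) are all assumptions.

```python
import re


def read_clean_tokenize(raw_text):
    """Lowercase the text, straighten curly quotes, and split it into tokens."""
    text = raw_text.lower()
    # Make typographic quotes uniform so the computer sees one kind of mark.
    for curly, straight in (("“", '"'), ("”", '"'), ("‘", "'"), ("’", "'")):
        text = text.replace(curly, straight)
    # A token is a word (internal apostrophes allowed) or a double quotation mark.
    return re.findall(r"[a-z]+(?:'[a-z]+)*|\"", text)


tokens = read_clean_tokenize("“Hello,” she said. It’s fine.")
print(tokens)
```

NLTK's tokenizers handle the last step far more carefully, which is why writing a version like this first makes the library's convenience easier to appreciate.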

Have a Plan, but Be Flexible

After discussing NLTK and how it offered an easier way of doing the things that I wanted, Eric had me systematically work through the NLTK book for a few months. Our meetings took on the character of an independent study: the book set the syllabus, and I went through the first seven chapters at my own pace. Working from a book gave our meetings structure, but we were careful not to hew too closely to the material. Not all chapters were relevant to the project, and we cut sections of the book accordingly. We shaped the course of study to the intellectual questions rather than the other way around.

Move from Theory to Practice / Textbook to Project

As I worked through the book, I was able to recognize certain sections that felt most relevant to the Woolf work. Once I felt as though I had reached a critical mass, we switched from the book to the project itself and started working. I tend to learn best by doing, so the shift from theory to execution was a natural one. The quick and satisfying transition helped the work to feel productive right away: I was applying my new skills as I was still learning to feel comfortable with them. Where the initial months had more the feel of a traditional student-teacher interaction, the project-based approach we took up at this point felt more like a real and true collaboration. Eric and I would develop to-do items together, we would work alongside each other, and we would talk over the project together.

Document Everything

Between our meetings I would work as far and as much as I could, carefully noting places at which I encountered problems. In some cases, these were conceptual problems that needed clarifying, and these larger questions frequently found their way into separate notes. But my questions were frequently about what a particular line of code, a particular command or function, might be doing. In that case, I made comments directly in the code describing my confusion. I quickly found that these notes were as much for me as for Eric–I needed to get back in the frame of mind that led to the confusion in the first place, and copious notes helped remind me what the problem was. These notes offered a point of departure for our meetings: we always had a place to start, and we did so based on the work that I had done.

Communicate in as Many Ways as Possible

We met in person as much as possible, but we also used a variety of other platforms to keep things moving. Eric and I had all of our code on GitHub so that we could share everything that we had each been working on and discuss things from a distance if necessary. Email, obviously, can do a lot, but I found the chat capabilities of the Scholars’ Lab’s IRC channel to be far better for this sort of work. If I hit a particular snag that would only require a couple minutes for Eric to answer, we could quickly work things out through a web chat. With Skype and Google Hangouts we could even view the code on the other person’s computer from hundreds of miles away. All of these things meant that we could keep working around whatever life events happened to call us away.

Recognize Spinning Wheels

These multiple avenues of communication are especially important when teaching technical skills. Not all questions or problems are the same: students can work through some on their own, but others can take them days to troubleshoot. Some amount of frustration is a necessary part of learning, and I do think it’s necessary that students learn to confront technical problems on their own. But not all frustration is pedagogically productive. There comes a point when you have tried a dozen potential solutions and you feel as though you have hit a wall. An extra set of eyes can (and should) help. Eric and I talked constantly about how to recognize when it was time for me to ask for help, and low-impact channels of communication like IRC could allow him to give me quick fixes to what, to me at least, seemed like impossible problems. Software development is a collaborative process, and asking for help is an important skill for humanists to develop.

In-person Meetings Can Take Many Forms

When we met, Eric and I did a lot of different things. First, we would talk through my questions from the previous week. If I felt a particular section of code was clunky or poorly done, he would talk and walk me through rewriting the same piece in a more elegant form. We would often pair program, where Eric would write code while I watched, carefully stopping him each time I had a question about something he was doing. And we often took time to reflect on where the collaboration was going - what my end goal was as well as what my tasks before the next meeting would be. Any project has many pieces that could be dealt with at any time, and Eric was careful to give me solo tasks that he felt I could handle on my own, reserving more difficult tasks for times in which we would be able to work together. All of this is to say that any single hour we spent together was very different from the last. We constantly reinvented what the meetings looked like, which kept them fresh and pedagogically effective.

This is my best attempt to recreate my experience of working in such a close mentoring relationship with Eric. Obviously, the collaboration relies on an extremely low student-to-teacher ratio: I can imagine this same approach working very well for a handful of students, but this work required a lot of individual attention that would be hard to sustain for larger classes. One idea for scaling the process up might be to divide a course into groups, begin training one, and then have students later in the process begin to mentor those who are just beginning. Doing so would preserve what I see as the main advantage of this approach: it helps to collapse the hierarchy between student and teacher and engage both in a common project. Learning takes place, but it does so in the context of common effort. I’d have to think more about how this mentorship model could be adapted to fit different scenarios. The work with Eric is ongoing, but it’s already been one of the most valuable learning experiences I have had.

Music Genre and Spotify Metadata

Crossposted on the Scholars’ Lab blog

For the last couple weeks, I have been exploring APIs useful to sound studies for a sound recording and poetry project I am working on with former Scholars’ Lab fellow Annie Swafford. I was especially drawn to playing around with Spotify, which has an API that allows you to access metadata for the large catalog of music available through their service. The experiment described below focuses on genre: a notoriously messy category that we nonetheless rely on to tell us how to process the materials we read, view, or hear. Genre tells us what to expect from the art we take in, and our construction and reception of generic categories can tell us a lot about ourselves. In music, especially, genres and subgenres can activate fierce debates about authenticity and belonging. Does your favorite group qualify as “authentic” jazz? What composers do you have to know in order to think of yourself as a real classical music aficionado? Playing with an artist’s metadata can expose a lot of the assumptions that were made in its collection, and I was especially interested in the ways in which Spotify models relations among artists.

I wanted to explore Spotify’s metadata in a way that would model the interpretive messiness of generic categories. To do so, I built a program that bounces through Spotify’s metadata to produce multiple readings of the idea of genre in relation to a particular artist. Spotify offers a fairly robust API, and there are a number of handy wrappers that make it easier to work with. I used a Python module called Spotipy for the material below, and you can find the code for my little genre experiment over on my GitHub page. If you do try to run this on your own machine, note that you will need to clone Spotipy’s repository and manually install it from the terminal with the following command from within the downloaded repository:

$ python setup.py install

Pip will install an older distribution of the code that will only run in Python 2, but Spotipy’s GitHub page has a more recent release that is compatible with Python 3.

When run, the program outputs what I like to think of as the equivalent of music nerds arguing over musical genres. You provide an artist name and a number, and the terminal will work through Spotify’s API to produce the specified number of individual “mappings” of that artist’s genre as well as an aggregate list of all their associated genres. The program starts by pulling out all the genre categories associated with the given artist as well as those given to artists that Spotify flags as related. Once finished, the program picks one of those related artists at random and continues to do the same until the process returns no new genre categories, building up a list of associated genres over time.

So, in short, you give the program an artist and it offers you a few attempts at describing that artist generically using Spotify’s catalog, the computational equivalent of instigating an argument about genre in your local record store. Here are the results for running the program three times for the band New Order:

Individual genre maps

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'new romantic', 'new wave pop', 'hi nrg', 'europop', 'power pop', 'album rock']

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'gothic metal', 'j-metal', 'visual kei', 'intelligent dance music', 'uk post-punk', 'metropopolis', 'ambient', 'big beat', 'electronic', 'illbient', 'piano rock', 'trance', 'progressive house', 'progressive trance', 'uplifting trance', 'quebecois', 'deep uplifting trance', 'garage rock', 'neo-psychedelic', 'space rock', 'japanese psychedelic']

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'uk post-punk', 'gothic rock', 'discofox', 'madchester', 'britpop', 'latin', 'latin pop', 'teen pop', 'classic colombian pop', 'rai', 'pop rap', 'southern hip hop', 'trap music', 'deep rai']

Aggregate genre map for New Order:

['dance rock', 'new wave', 'permanent wave', 'new romantic', 'new wave pop', 'hi nrg', 'europop', 'power pop', 'album rock', 'gothic metal', 'j-metal', 'visual kei', 'intelligent dance music', 'uk post-punk', 'metropopolis', 'ambient', 'big beat', 'electronic', 'illbient', 'piano rock', 'trance', 'progressive house', 'progressive trance', 'uplifting trance', 'quebecois', 'deep uplifting trance', 'garage rock', 'neo-psychedelic', 'space rock', 'japanese psychedelic', 'gothic rock', 'discofox', 'madchester', 'britpop', 'latin', 'latin pop', 'teen pop', 'classic colombian pop', 'rai', 'pop rap', 'southern hip hop', 'trap music', 'deep rai']

In each case, the genre maps all begin the same, with the categories directly assigned to the source artist. Because the process is slightly random, the program eventually maps the same artist’s genre differently each time. For each iteration, the program runs until twenty randomly selected related artists return no new genre categories, which I take to be a kind of threshold of completion for one understanding of an artist’s genre.
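The walk described above can be sketched in a few lines of Python. This is a simplified illustration, not the code from my repository: instead of calling Spotify through Spotipy, it reads from a small in-memory dictionary, and the placeholder artists, their genre assignments, the function names, and the choice to count consecutive misses are all assumptions made for the example.

```python
import random

# Hypothetical stand-in for Spotify's metadata: artist -> (genres, related artists).
# The related artists and their genres are invented for illustration.
TOY_CATALOG = {
    "New Order": (["dance rock", "new wave", "permanent wave"], ["Artist A", "Artist B"]),
    "Artist A": (["new wave", "uk post-punk"], ["New Order", "Artist C"]),
    "Artist B": (["new romantic", "europop"], ["Artist A", "Deleted Artist"]),
    "Artist C": (["gothic rock"], []),  # a "musical dead end": no related artists
}


def genre_map(artist, catalog, patience=20, rng=random):
    """One nerd's opinion: hop between related artists at random, collecting genres.

    Stops once `patience` consecutive picks contribute no new genre.
    """
    genres, related = catalog[artist]
    found = list(genres)
    misses = 0
    while misses < patience and related:
        current = rng.choice(related)
        # Artists missing from the catalog contribute nothing.
        new_genres, next_related = catalog.get(current, ([], []))
        fresh = [g for g in new_genres if g not in found]
        if fresh:
            found.extend(fresh)
            misses = 0
        else:
            misses += 1
        if next_related:  # on a dead end, keep choosing from the previous list
            related = next_related
    return found


def aggregate(maps):
    """Merge several genre maps into one list, keeping first-seen order."""
    merged = []
    for genre_list in maps:
        for genre in genre_list:
            if genre not in merged:
                merged.append(genre)
    return merged


maps = [genre_map("New Order", TOY_CATALOG) for _ in range(3)]
for one_map in maps:
    print("Just one nerd's opinions on New Order:", one_map)
print("Aggregate genre map for New Order:", aggregate(maps))
```

Every map starts with the source artist's own genres, and the random hops account for the differences between runs. A real version would replace the dictionary lookups with requests to Spotify's API.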

The results suggest an amalgam of generic influence, shared characteristics, common lineages, and overlapping angles of approach. The decisions I made in how the program interacts with Spotify’s metadata suggest a definition of genre like the one offered by Alastair Fowler: “Representatives of a genre may then be regarded as making up a family whose septs and individual members are related in various ways, without necessarily having any single feature shared in common by all” (41). Genre is fluid and a matter of interpretive opinion - it is not necessarily based on objective links. The program reflects this in its results: sometimes a particular generic mapping feels very coherent, while at other times the script finds its way to very bizarre tangents. The connections do exist in the metadata if you drill down deeply enough, and it is possible to reproduce the links that brought about such output. But the more leaps the program takes from the original artist the more tenuous the connections appear to be. As I wrote this sentence, the program suggested a connection between garage rock revivalists The Strokes and big band jazz music: such output looks less like a conversation among music nerds and more like the material for a Ph.D. dissertation. As the program illustrates, generic description is the beginning of interpretation - not the ending.

Of course, the program does not actually search all music ever: it only has access to the metadata for artists listed in Spotify, and some artists like Prince or the Beatles are notoriously missing from the catalog. Major figures like these have artist pages that serve as stubs for content drawn largely from compilation CDs, and the program can successfully crawl through these results. But this wrinkle points to a larger fact: the results the program produces are as skewed as the collection of musicians in the service’s catalog. Many of the errors I had to troubleshoot were related to the uneven nature of the catalog: early versions of the script were thrown into disarray when Spotify listed no related artists for a musician. On occasion, the API suggested a related artist who did not actually have an artist page in the system (often the case with new or less-established musicians). I massaged these gaps to make this particular exercise work (you’ll now get a tongue in cheek “Musical dead end” or “Artist deleted from Spotify” output for them), but the silences in the archive offer significant reminders of the commercial politics that go into generic and archival formation, particularly when an archive is proprietary. I can imagine tweaking things slightly to create a script that produces only those archival gaps, but that is work for another day. In the meantime, I’ll be trying to figure out how Kanye West might be considered Christmas music.

Works Cited:

Fowler, Alastair David Shaw. Kinds of Literature: An Introduction to the Theory of Genres and Modes. Repr. Oxford: Clarendon Press, 1997. Print.

Virginia Woolf, Natural Language Processing, and the Quotation Mark

Crossposted on the Scholars’ Lab blog

For my fellowship in the Scholars’ Lab this year I’ll be working with Eric to expand a project we began last year on Virginia Woolf and natural language processing. My dissertation focuses on sound recordings and modernism, and this year I will focus on how Woolf’s quotation marks offer evidence of her engagement with sound as a textual device. In my reading, the quotation mark is the most obvious point at which sound meets text, the sound recording technology most heavily used by writers. Patterns in quotation mark usage across large corpora can tell us a lot about the role that sound plays in literature, but, as you might expect, there are lots of quotation marks - hundreds or thousands in any given text. Computational methods can help us make sense of the vast number and turn them into reasonable objects of study.

You can find more information in this post about my thinking on quotation marks and some preliminary results from thinking about them in relation to Woolf. As I discuss there, finding quotation marks in a text is not especially challenging, but this year Eric and I will be focusing on a particular wrinkle in Woolf’s use of the marks, best conveyed in The Hours, Michael Cunningham’s late-century riff on Virginia Woolf. In The Hours, Cunningham offers a fictionalized version of Woolf meditating on her composition process:

She passes a couple, a man and woman younger than herself, walking together, leisurely, bent towards each other in the soft lemon-colored glow of a streetlamp, talking (she hears the man, “told me something something something in this establishment, something something, harrumph, indeed”) (166).

The repeated “somethings” of the passage suggest the character’s imperfect experience of the conversation as well as the limits of her senses. As the moment is conveyed through the character’s perspective, the conversation will always be incomplete. Recording technology was largely unreliable during the early days of the twentieth century, and, similarly, the sound record of this conversation as given by the text is already degraded before we hear it. Cunningham points to how the sounded voice is given character in the ears of the listener, and, in a print context, in the pen of the writer. A printed voice can speak in a variety of ways and in a variety of modes.

Cunningham’s passage contains echoes of what will eventually be the famous first sentence of Woolf’s Mrs. Dalloway: “Mrs. Dalloway said she would buy the flowers herself.” The text implies that Mrs. Dalloway speaks, but it does not mark it as such: the same conversational tone in Cunningham remains here, but the narrator does not differentiate sound event from narrative by using quotation marks. We see moments of indirect speech like this all the time, when discourse becomes submerged in the texture of the narrative, but it doesn’t disappear entirely. Speech implies a lot: social relations, the thoughts of a speaking body, among others. Things get muddy when the line between narrative voice and speech becomes unclear. If quotation marks imply a different level of speech than unquoted speech, might they also imply changes in the social relations they represent?

Mrs. Dalloway is filled with moments like these, and this year I’ll be working to find ways to float them to the surface of the text. Examining these moments can tell us how conversation changes during the period, what people are talking about and to what end, how we conceive of the limits of print and sound, and about changing priorities in literary aesthetics. The goal this year is to train the computer to identify moments like this, moments that a human reader would be able to parse as spoken but that are not marked as such. Our first pass will be to work with the quoted material, which we can easily identify, to build a lexicon of trigger words that Woolf uses to flag speech as sound (said, asked, called, etc.). With this lexicon, we can then look for instances in her corpus where those words pop up without punctuation. Teaching the computer to classify these passages correctly will be a big task, and this process alone will offer me lots of new material to work with as I untangle the relationship between modernist print and sound. In upcoming posts I’ll talk more about the process of learning natural language processing and about some preliminary results and problems. Stay tuned!

Works Cited:

Cunningham, Michael. The Hours. New York: Picador USA : Distributed by Holtzbrinck Publishers, 2002. Print.

Woolf, Virginia. Mrs. Dalloway. 1st Harvest/HBJ ed. San Diego: Harcourt Brace Jovanovich, 1990. Print.

Hearing Silent Woolf

[This week I presented at the 2015 Huskey Research Exhibition at UVA. The talk was delivered from very schematic notes, but below is a rough recreation of what I discussed. The talk I gave is a crash course in a new project I’ve started working on with the generous help of the Scholars’ Lab that thinks about sound in Virginia Woolf’s career using computational methods. Eric Rochester, especially, has been endlessly giving of his time and expertise, helping me think through and prototype work on this material. The talk wound up receiving first prize for the digital humanities panel of which I was a part. The project is still very much inchoate, and I’d welcome thoughts on it.]

When I talk to you, you make certain assumptions about me as a person based on what you’re hearing. You decide whether or not I might be worth paying attention to, and you develop a sense of our social relations based around the sound of my voice. The voice conveys and generates assumptions about the body and about power: am I making myself heard? Am I registering as a speaking voice? Am I worth listening to?

The human microphone, made famous by Occupy Wall Street, nicely encapsulates the social dimensions of sound that interest me: one person speaks, and the people around her repeat what she says more loudly, again and again, amplifying the human voice without technology. Sound literally moves through multiple bodies and structures the social relations between people, and the whole movement is an attempt to make a group of people heard by those who would rather not listen.

As a literary scholar, I am interested in how texts can speak in similar ways. The texts we read frequently contain large amounts of speech within them: conversations, monologues, poetic voice, etc. We talk about sound in texts all the time, and the same social and political dimensions of sound still remain even if a text appears silent on the page. If who can be heard and who gets to speak are both contested questions in the real world, they continue to structure our experiences of printed universes.

All of this brings me to the quotation mark. The humble piece of punctuation does a lot of work for us every day, and I want to think more closely about how it can help us understand how texts speak. The quotation mark is the most obvious point at which sound meets text. Computational methods tend to focus on the vocabulary of a text as the building blocks of meaning, but they can also help us turn quotation marks into objects of inquiry. Quotation marks can tell us a lot about how texts engage with the human voice, but there are lots of them in texts. Digital methods can help us make sense of the scale.

I examine Virginia Woolf’s quotation marks, in particular, for a number of reasons. Aesthetically, we can see her bridging the Victorian and modernist literary periods, though she tends to fall in with the latter of the two. Politically, she lived through periods of intense social and political upheaval at the beginning of the twentieth century. Very few recordings of Woolf remain, but she nonetheless thought deeply about sound recording. The worldwide market for gramophones exploded during her lifetime, and her texts frequently featured technologies of sound reproduction. Woolf’s gramophones frequently malfunction in her novels, and I’m interested in seeing how her quotation marks might analogously be irregular or broken intentionally. Woolf is especially good for thinking about punctuation marks in this way: she owned a printing press, and she often set type herself.

The following series of histograms gives a rough estimation of how Woolf’s use of quotation changes over the course of her career. On GitHub you can find the script I’ve been working on with Eric to generate these results. The number of quotations is plotted on the y-axis against their position in the novel on the x-axis, so each histogram represents more quoted speech with higher bars and more concentrated darknesses. If you have an especially good understanding of a particular novel, Mrs. Dalloway, say, you could pick out moments of intense conversation based on sudden spikes in the number of quotations. The histograms are organized in such a way that to read chronologically through Woolf’s career you would read left to right line by line, as you would the text of a book. The top-left histogram is Woolf’s earliest novel, the bottom-right corner her last.

To my eye, the output suggests high concentrations of conversation in the novels at the beginning and ending of Woolf’s career. We can see that her middle period, especially, appears to have a significant decrease in the amount of quoted speech. In one respect, this might make sense to someone familiar with Woolf’s career. Her first two novels feel more typically Victorian in their aesthetics, and she really gets into the thick of modernist experiment with her third novel. One way we often describe the shift from the Victorian to the modernist period is as a shift inward, away from society and towards the psychology of the self. So it makes sense that we might see the amount of conversation between multiple speaking bodies significantly fall away over the course of those novels. The seventh histogram is especially interesting, because it suggests the least speech of anything in her corpus. But if we visualize things a different way, we see that this novel, The Waves, actually shows a huge spike in punctuated speech. This graph represents the percentage of each text that is contained within quotation marks, the amount of text represented as punctuated speech.

This might look like a problem with the data: how could the text with the fewest quotations also have the highest percentage of quoted speech? But the script is actually giving me exactly what I asked for: The Waves is a series of monologues by six disembodied voices, and the amount of non-speech text is extremely small. More generally, charting the percentage of quoted speech in the corpus appears to support my general readings of the original nine histograms: roughly three times as much punctuated speech in the early novels as in the middle period, with a slight leveling off in the end of her career.
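The two measurements can pull apart because they count different things, as a small sketch makes clear. The code below is not the script on GitHub; it is a simplified illustration that assumes straight double quotation marks that are always properly paired, and the function name is hypothetical.

```python
import re

# Assumes straight, properly paired double quotes; real editions with curly
# quotes, single-quote dialogue, or unclosed marks need more careful handling.
QUOTE_RE = re.compile(r'"[^"]*"')


def quotation_stats(text):
    """Return (relative start positions of quotations, share of characters quoted)."""
    if not text:
        return [], 0.0
    matches = list(QUOTE_RE.finditer(text))
    # Position of each quotation as a fraction of the way through the text.
    positions = [m.start() / len(text) for m in matches]
    quoted_chars = sum(m.end() - m.start() for m in matches)
    return positions, quoted_chars / len(text)


# One long quotation versus several short ones.
few_positions, few_share = quotation_stats('"aaaaaaaa" x')
many_positions, many_share = quotation_stats('"a" xxxx "b" xxxx "c" xxxx')
print(len(few_positions), round(few_share, 2))
print(len(many_positions), round(many_share, 2))
```

The list of positions is the kind of data a histogram would bin, while the share is the percentage measure. A text made of a few very long quotations scores low on the first and high on the second, which is the pattern The Waves shows.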

We could think of The Waves as an anomaly, but I think it more clearly calls for a revision of such a reading of speech in Woolf’s career. The spike in quoted speech is a hint that there is something else going on in Woolf’s work. Perhaps we can use the example of The Waves to propose that there might be a range of discourses, of types of speech in Woolf’s corpus. Before I suggested that speech diminished in the middle of Woolf’s career, but that’s not exactly true. My suspicion is that it just enters a different mode. Consider these two passages, both quoted from Mrs. Dalloway:

Mrs. Dalloway said she would buy the flowers herself.

Times without number Clarissa had visited Evelyn Whitbread in a nursing home. Was Evelyn ill again? Evelyn was a good deal out of sorts, said Hugh, intimating by a kind of pout or swell of his very well-covered, manly, extremely handsome, perfectly upholstered body (he was almost too well dressed always, but presumably had to be, with his little job at Court) that his wife had some internal ailment, nothing serious, which, as an old friend, Clarissa Dalloway would quite understand without requiring him to specify.

In each case, the text implies speech by Mrs. Dalloway and by Hugh without marking it as such with punctuation marks. Discourse becomes submerged in the texture of the narrative, but it doesn’t disappear entirely. Moments like these suggest a range of discourses in Woolf’s corpus: dialogue, monologue, conversation, punctuated, implied, etc. All of these speech types have different implications, but it’s difficult to get a handle on them because of their scale. I began the project by simply trying to mark down moments of implied speech in Mrs. Dalloway by hand. Once I got to about two hundred, it seemed like it was time to ask the computer for help.

The current plan moving forward is to build a corpus of test passages containing both quoted speech and implied speech, train a Python script against this set of passages, and then use this same script to search for instances of implied speech throughout Woolf’s corpus. Theoretically, at least, the script will search for a series of words that flag text as implied speech to a human reader - said, recalled, exclaimed, etc. Using this lexicon as a basis, the script would then pull out the context surrounding these words to produce a database of sentences meant to serve as speech. At Eric’s suggestion, I’m currently exploring the Natural Language Toolkit to take a stab at all of this. My own hypothesis is that there will be an inverse relationship between quoted speech and implied speech in her corpus, that the amount of speech left unflagged by quotation marks will increase in the middle of Woolf’s career. Once I have all this material, I’ll be able to subject the results to further analysis and to think more deeply about speech in Woolf’s work. Who speaks? What about? What counts as a voice, and what is left in an ambiguous, unsounded state?
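As a first approximation of the lexicon search, here is a minimal sketch in plain Python. It covers only the lookup step, not the training, and nearly everything in it is an assumption for illustration: the short trigger list, the function names, and the naive sentence splitter (which skips only a few titles like "Mrs.") are stand-ins for what a fuller NLTK-based version would do.

```python
import re

# Hypothetical trigger lexicon; the project would derive it from quoted passages.
SPEECH_VERBS = {"said", "asked", "called", "recalled", "exclaimed"}

# Split after ., ! or ? followed by whitespace, but not after a few common titles.
SENTENCE_BREAK = re.compile(r'(?<!Mrs\.)(?<!Mr\.)(?<!Dr\.)(?<=[.!?])\s+')


def split_sentences(text):
    """Naive sentence splitter; a real tokenizer would be more robust."""
    return [s.strip() for s in SENTENCE_BREAK.split(text) if s.strip()]


def implied_speech_candidates(text, triggers=SPEECH_VERBS):
    """Return sentences that use a speech verb but contain no quotation marks."""
    candidates = []
    for sentence in split_sentences(text):
        if any(mark in sentence for mark in ('"', '“', '”')):
            continue  # already punctuated as speech
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & triggers:
            candidates.append(sentence)
    return candidates


sample = ('Mrs. Dalloway said she would buy the flowers herself. '
          '"I will go," she said. The clock struck.')
print(implied_speech_candidates(sample))
```

A lookup like this will over-collect, since speech verbs also appear in sentences that report no speech at all, which is part of why the plan calls for training against hand-marked passages.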

The project is very much in its beginning stages, but it’s already opening up the way that I think about speech in Woolf’s text. It tries to untangle the relationship between our print record and our sonic record, and further work will help show how discourse is unfolding over time in the modernist period.

Moving People, Linking Lives DH Symposium

[I am very excited to be working with Alison Booth, Jenny Strauss Clay, and Amy Ogden to plan a digital humanities symposium this March. What follows is our general announcement of the event, cross-posted on the Scholars’ Lab blog.]

I am pleased to announce that “Moving People, Linking Lives: An Interdisciplinary Symposium” will take place March 20-21, 2015 at the University of Virginia. Presentations and workshops will open dialogue across different fields, periods, and methods, from textual interpretation to digital research. Invited participants include specialists on narrative theory and life writing, prosopography or comparative studies of life narratives in groups, and the diverse field of digital humanities or computer-assisted research on cultural materials, from ancient texts to Colonial archives, from printed books to social media.

Invited participants include: Elton Barker, Jason Boyd, James Phelan, Susan Brown, Margaret Cormack, Courtney Evans, Will Hanley, Ben Jasnow, Ruth Page, Sue Perdue, Sidonie Smith. We hope to have lots of locals involved with digital work participate as well, and we particularly encourage graduate students to join in for the weekend!

Our symposium will bridge the gaps among our fields; share the innovations of several digital projects; and welcome the skeptical or the uninitiated, whether in our historical fields or in the applications of technology in the humanities. Booth, Clay, and Ogden have each led digital projects with some common themes and aims: locating, identifying, and interpreting the narratives (or, very often, the lack of discursive records) about individuals in groups or documents, in Homer or other ancient texts, medieval French hagiography, and nineteenth-century printed collections of biographies in English. We want to open discussion of many potential methods, including our own (data mining and digital editions of texts; relational databases and historical timelines and maps), for research on groups of interlinked persons, narratives or data about their lives, and documents or other records, and for synthesizing and visualizing this research in accessible ways that reach students and the public. Digital innovation, however, should be informed by traditions of scholarly interpretation and advanced theoretical insights and commitments. Narrative theory and Theory generally, ideological critique including studies of gender and race, textual and book history studies, transnational and social historiography, philology and language studies, archeology, cultural geography and critical cartography are all gaining influence on digital projects.

Invited participants will be posting about their research to our blog in the weeks leading up to the symposium, and anyone is free to comment on the posts. Over the same period, our participants will also be building a Zotero-powered bibliography full of rich materials related to the event’s discussion.

The symposium is organized and hosted by Alison Booth, Jenny Strauss Clay, and Amy Ogden and sponsored by the Page Barbour Committee, the departments of English, French, and Art, the Institute for Humanities and Global Cultures, the Scholars’ Lab and Institute for Advanced Technology in the Humanities, and other entities at UVa. All events are free and open to the public. More information can be found on the blog as planning progresses, and you can follow us on Twitter at @livesdh.

Join in the conversation on the blog at movingpeoplelinkinglives.org, and we hope to see many come out for fruitful interchange in March!
