Brandon Walsh

Music Genre and Spotify Metadata

Cross-posted on the Scholars’ Lab blog

For the last couple weeks, I have been exploring APIs useful to sound studies for a sound recording and poetry project I am working on with former Scholars’ Lab fellow Annie Swafford. I was especially drawn to playing around with Spotify, which has an API that allows you to access metadata for the large catalog of music available through their service. The experiment described below focuses on genre: a notoriously messy category that we nonetheless rely on to tell us how to process the materials we read, view, or hear. Genre tells us what to expect from the art we take in, and our construction and reception of generic categories can tell us a lot about ourselves. In music, especially, genres and subgenres can activate fierce debates about authenticity and belonging. Does your favorite group qualify as “authentic” jazz? What composers do you have to know in order to think of yourself as a real classical music aficionado? Playing with an artist’s metadata can expose a lot of the assumptions that were made in its collection, and I was especially interested in the ways in which Spotify models relations among artists.

I wanted to explore Spotify’s metadata in a way that would model the interpretive messiness of generic categories. To do so, I built a program that bounces through Spotify’s metadata to produce multiple readings of the idea of genre in relation to a particular artist. Spotify offers a fairly robust API, and there are a number of handy wrappers that make it easier to work with. I used a Python module called Spotipy for the material below, and you can find the code for my little genre experiment over on my GitHub page. If you do try to run this on your own machine, note that you will need to clone Spotipy’s repository and install it manually by running the following command from within the downloaded repository:

$ python setup.py install

Pip will install an older distribution of the code that will only run in Python 2, but Spotipy’s GitHub page has a more recent release that is compatible with Python 3.

When run, the program outputs what I like to think of as the equivalent of music nerds arguing over musical genres. You provide an artist name and a number, and the terminal will work through Spotify’s API to produce the specified number of individual “mappings” of that artist’s genre as well as an aggregate list of all their associated genres. The program starts by pulling out all the genre categories associated with the given artist as well as those given to artists that Spotify flags as related. Once finished, the program picks one of those related artists at random and continues to do the same until the process returns no new genre categories, building up a list of associated genres over time.
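
For a sense of the mechanics, here is a minimal sketch of that loop. It is not the actual script: the function name and the bookkeeping details are my own, though the Spotipy calls (search and artist_related_artists) are real endpoints, and newer versions of the Spotify API require authentication where older Spotipy releases did not.

import random
import spotipy

sp = spotipy.Spotify()  # older Spotipy releases allowed anonymous access

def random_genre_walk(artist_name, patience=20):
    """Hop through related artists, collecting genres, until `patience`
    consecutive hops turn up nothing new. A sketch, not the real script."""
    result = sp.search(q='artist:' + artist_name, type='artist')
    artist = result['artists']['items'][0]
    genres = list(artist['genres'])  # genres assigned to the source artist
    dead_hops = 0
    while dead_hops < patience:
        related = sp.artist_related_artists(artist['id'])['artists']
        if not related:
            break  # a "musical dead end": Spotify lists no related artists
        new = [g for a in related for g in a['genres'] if g not in genres]
        if new:
            genres.extend(dict.fromkeys(new))  # dedupe, preserving order
            dead_hops = 0
        else:
            dead_hops += 1
        artist = random.choice(related)  # wander to a random related artist
    return genres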

So, in short, you give the program an artist and it offers you a few attempts at describing that artist generically using Spotify’s catalog, the computational equivalent of instigating an argument about genre in your local record store. Here are the results for running the program three times for the band New Order:

Individual genre maps

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'new romantic', 'new wave pop', 'hi nrg', 'europop', 'power pop', 'album rock']

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'gothic metal', 'j-metal', 'visual kei', 'intelligent dance music', 'uk post-punk', 'metropopolis', 'ambient', 'big beat', 'electronic', 'illbient', 'piano rock', 'trance', 'progressive house', 'progressive trance', 'uplifting trance', 'quebecois', 'deep uplifting trance', 'garage rock', 'neo-psychedelic', 'space rock', 'japanese psychedelic']

Just one nerd's opinions on New Order:

['dance rock', 'new wave', 'permanent wave', 'uk post-punk', 'gothic rock', 'discofox', 'madchester', 'britpop', 'latin', 'latin pop', 'teen pop', 'classic colombian pop', 'rai', 'pop rap', 'southern hip hop', 'trap music', 'deep rai']

Aggregate genre map for New Order:

['dance rock', 'new wave', 'permanent wave', 'new romantic', 'new wave pop', 'hi nrg', 'europop', 'power pop', 'album rock', 'gothic metal', 'j-metal', 'visual kei', 'intelligent dance music', 'uk post-punk', 'metropopolis', 'ambient', 'big beat', 'electronic', 'illbient', 'piano rock', 'trance', 'progressive house', 'progressive trance', 'uplifting trance', 'quebecois', 'deep uplifting trance', 'garage rock', 'neo-psychedelic', 'space rock', 'japanese psychedelic', 'gothic rock', 'discofox', 'madchester', 'britpop', 'latin', 'latin pop', 'teen pop', 'classic colombian pop', 'rai', 'pop rap', 'southern hip hop', 'trap music', 'deep rai']

In each case, the genre maps all begin the same, with the categories directly assigned to the source artist. Because the process is slightly random, the program eventually maps the same artist’s genre differently each time. For each iteration, the program runs until twenty randomly selected related artists return no new genre categories, which I take to be a kind of threshold of completion for one understanding of an artist’s genre.

The results suggest an amalgam of generic influence, shared characteristics, common lineages, and overlapping angles of approach. The decisions I made in how the program interacts with Spotify’s metadata suggest a definition of genre like the one offered by Alastair Fowler: “Representatives of a genre may then be regarded as making up a family whose septs and individual members are related in various ways, without necessarily having any single feature shared in common by all” (41). Genre is fluid and a matter of interpretive opinion - it is not necessarily based on objective links. The program reflects this in its results: sometimes a particular generic mapping feels very coherent, while at other times the script finds its way to very bizarre tangents. The connections do exist in the metadata if you drill down deeply enough, and it is possible to reproduce the links that brought about such output. But the more leaps the program takes from the original artist the more tenuous the connections appear to be. As I wrote this sentence, the program suggested a connection between garage rock revivalists The Strokes and big band jazz music: such output looks less like a conversation among music nerds and more like the material for a Ph.D. dissertation. As the program illustrates, generic description is the beginning of interpretation - not the ending.

Of course, the program does not actually search all music ever: it only has access to the metadata for artists listed in Spotify, and some artists like Prince or the Beatles are notoriously missing from the catalog. Major figures like these have artist pages that serve as stubs for content drawn largely from compilation CDs, and the program can successfully crawl through these results. But this wrinkle points to a larger fact: the results the program produces are as skewed as the collection of musicians in the service’s catalog. Many of the errors I had to troubleshoot were related to the uneven nature of the catalog: early versions of the script were thrown into disarray when Spotify listed no related artists for a musician. On occasion, the API suggested a related artist who did not actually have an artist page in the system (often the case with new or less-established musicians). I massaged these gaps to make this particular exercise work (you’ll now get a tongue-in-cheek “Musical dead end” or “Artist deleted from Spotify” output for them), but the silences in the archive offer significant reminders of the commercial politics that go into generic and archival formation, particularly when an archive is proprietary. I can imagine tweaking things slightly to create a script that produces only those archival gaps, but that is work for another day. In the meantime, I’ll be trying to figure out how Kanye West might be considered Christmas music.

Works Cited:

Fowler, Alastair David Shaw. Kinds of Literature: An Introduction to the Theory of Genres and Modes. Repr. Oxford: Clarendon Press, 1997. Print.

Virginia Woolf, Natural Language Processing, and the Quotation Mark

Cross-posted on the Scholars’ Lab blog

For my fellowship in the Scholars’ Lab this year I’ll be working with Eric to expand a project we began last year on Virginia Woolf and natural language processing. My dissertation focuses on sound recordings and modernism, and this year I will focus on how Woolf’s quotation marks offer evidence of her engagement with sound as a textual device. In my reading, the quotation mark is the most obvious point at which sound meets text, the sound recording technology most heavily used by writers. Patterns in quotation mark usage across large corpora can tell us a lot about the role that sound plays in literature, but, as you might expect, there are lots of quotation marks - hundreds or thousands in any given text. Computational methods can help us make sense of that vast number and turn the marks into reasonable objects of study.

You can find more information in this post about my thinking on quotation marks and some preliminary results from thinking about them in relation to Woolf. As I discuss there, finding quotation marks in a text is not especially challenging, but this year Eric and I will be focusing on a particular wrinkle in Woolf’s use of the marks, best conveyed in The Hours, Michael Cunningham’s late-century riff on Virginia Woolf. In The Hours, Cunningham offers a fictionalized version of Woolf meditating on her composition process:

She passes a couple, a man and woman younger than herself, walking together, leisurely, bent towards each other in the soft lemon-colored glow of a streetlamp, talking (she hears the man, “told me something something something in this establishment, something something, harrumph, indeed”) (166).

The repeated “somethings” of the passage suggest the character’s imperfect experience of the conversation as well as the limits of her senses. As the moment is conveyed through the character’s perspective, the conversation will always be incomplete. Recording technology was largely unreliable during the early days of the twentieth century, and, similarly, the sound record of this conversation as given by the text is already degraded before we hear it. Cunningham points to how the sounded voice is given character in the ears of the listener, and, in a print context, in the pen of the writer. A printed voice can speak in a variety of ways and in a variety of modes.

Cunningham’s passage contains echoes of what will eventually be the famous first sentence of Woolf’s Mrs. Dalloway: “Mrs. Dalloway said she would buy the flowers herself.” The text implies that Mrs. Dalloway speaks, but it does not mark it as such: the same conversational tone in Cunningham remains here, but the narrator does not differentiate sound event from narrative by using quotation marks. We see moments of indirect speech like this all the time, when discourse becomes submerged in the texture of the narrative, but it doesn’t disappear entirely. Speech implies a lot: social relations, the thoughts of a speaking body, among others. Things get muddy when the line between narrative voice and speech becomes unclear. If quotation marks imply a different level of speech than unquoted speech, might they also imply changes in the social relations they represent?

Mrs. Dalloway is filled with moments like these, and this year I’ll be working to find ways to float them to the surface of the text. Examining these moments can tell us how conversation changes during the period, what people are talking about and for, how we conceive of the limits of print and sound, and about changing priorities in literary aesthetics. The goal this year is to train the computer to identify moments like this, moments that a human reader would be able to parse as spoken but that are not marked as such. Our first pass will be to work with the quoted material, which we can easily identify, in order to build a series of trigger words that Woolf uses to flag speech as sound (said, asked, called, etc.). With this lexicon, we can then look for instances in her corpus where those words pop up without punctuation. Teaching the computer to classify these passages correctly will be a big task, and this process alone will offer me lots of new material to work with as I untangle the relationship between modernist print and sound. In upcoming posts I’ll talk more about the process of learning natural language processing and about some preliminary results and problems. Stay tuned!
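
To make the plan concrete, here is a rough sketch of what that first pass might look like. This is an illustration rather than our actual code; the regular expression and the NLTK calls are my own assumptions about one way to do it.

import re
import nltk  # requires the 'punkt' tokenizer data: nltk.download('punkt')

def speech_verbs(text):
    """Harvest the words that immediately follow a closing quotation
    mark, e.g. the 'said' in: "...," said Clarissa."""
    return set(m.group(1).lower()
               for m in re.finditer(r'[”"]\s+(\w+)', text))

def implied_speech_candidates(text, lexicon):
    """Return unquoted sentences that contain a trigger word."""
    candidates = []
    for sent in nltk.sent_tokenize(text):
        if any(q in sent for q in '“”"'):
            continue  # already punctuated as speech
        words = set(w.lower() for w in nltk.word_tokenize(sent))
        if words & lexicon:
            candidates.append(sent)
    return candidates

# e.g. lexicon = speech_verbs(corpus) | {'said', 'asked', 'called'}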

Works Cited:

Cunningham, Michael. The Hours. New York: Picador USA, 2002. Print.

Woolf, Virginia. Mrs. Dalloway. 1st Harvest/HBJ ed. San Diego: Harcourt Brace Jovanovich, 1990. Print.

Hearing Silent Woolf

[This week I presented at the 2015 Huskey Research Exhibition at UVA. The talk was delivered from very schematic notes, but below is a rough recreation of what I discussed. The talk I gave is a crash course in a new project I’ve started working on with the generous help of the Scholars’ Lab that thinks about sound in Virginia Woolf’s career using computational methods. Eric Rochester, especially, has been endlessly giving of his time and expertise, helping me think through and prototype work on this material. The talk wound up receiving first prize for the digital humanities panel of which I was a part. The project is still very much inchoate, and I’d welcome thoughts on it.]

When I talk to you, you make certain assumptions about me as a person based on what you’re hearing. You decide whether or not I might be worth paying attention to, and you develop a sense of our social relations based around the sound of my voice. The voice conveys and generates assumptions about the body and about power: am I making myself heard? Am I registering as a speaking voice? Am I worth listening to?

The human microphone, made famous by Occupy Wall Street, nicely encapsulates the social dimensions of sound that interest me: one person speaks, and the people around her repeat what she says more loudly, again and again, amplifying the human voice without technology. Sound literally moves through multiple bodies and structures the social relations between people, and the whole movement is an attempt to make a group of people heard by those who would rather not listen.

As a literary scholar, I am interested in how texts can speak in similar ways. The texts we read frequently contain large amounts of speech within them: conversations, monologues, poetic voice, etc. We talk about sound in texts all the time, and the same social and political dimensions of sound still remain even if a text appears silent on the page. If who can be heard and who gets to speak are both contested questions in the real world, they continue to structure our experiences of printed universes.

All of this brings me to the quotation mark. The humble piece of punctuation does a lot of work for us every day, and I want to think more closely about how it can help us understand how texts speak. The quotation mark is the most obvious point at which sound meets text. Computational methods tend to focus on the vocabulary of a text as the building blocks of meaning, but they can also help us turn quotation marks into objects of inquiry. Quotation marks can tell us a lot about how texts engage with the human voice, but there are lots of them in texts. Digital methods can help us make sense of the scale.

I examine Virginia Woolf’s quotation marks, in particular, for a number of reasons. Aesthetically, we can see her bridging the Victorian and modernist literary periods, though she tends to fall in with the latter of the two. Politically, she lived through periods of intense social and political upheaval at the beginning of the twentieth century. Very few recordings of Woolf remain, but she nonetheless thought deeply about sound recording. The worldwide market for gramophones exploded during her lifetime, and her texts frequently featured technologies of sound reproduction. Woolf’s gramophones frequently malfunction in her novels, and I’m interested in seeing how her quotation marks might analogously be irregular or broken intentionally. Woolf is especially good for thinking about punctuation marks in this way: she owned a printing press, and she often set type herself.

The following series of histograms gives a rough estimate of how Woolf’s use of quotation changes over the course of her career. On GitHub you can find the script I’ve been working on with Eric to generate these results. The number of quotations is plotted on the y-axis against their position in the novel on the x-axis, so higher bars and denser bands of shading indicate more quoted speech. If you have an especially good understanding of a particular novel, Mrs. Dalloway, say, you could pick out moments of intense conversation based on sudden spikes in the number of quotations. The histograms are organized so that to read chronologically through Woolf’s career you would read left to right, line by line, as you would the text of a book. The top-left histogram is Woolf’s earliest novel, the bottom-right corner her last.
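
The core of such a chart is fairly compact. Here is a minimal sketch of the idea, assuming one plain-text file per novel; the filename and the choice of one hundred bins are placeholders rather than the actual script.

import re
import matplotlib.pyplot as plt

def quote_positions(path):
    """Relative position (0.0 to 1.0) of every quotation mark in a text."""
    text = open(path, encoding='utf-8').read()
    return [m.start() / len(text) for m in re.finditer(r'[“”"]', text)]

plt.hist(quote_positions('mrs_dalloway.txt'), bins=100)  # hypothetical file
plt.xlabel('Position in novel')
plt.ylabel('Number of quotation marks')
plt.show()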

To my eye, the output suggests high concentrations of conversation in the novels at the beginning and ending of Woolf’s career. We can see that her middle period, especially, appears to have a significant decrease in the amount of quoted speech. This fits what someone familiar with Woolf’s career might expect. Her first two novels feel more typically Victorian in their aesthetics, and she really gets into the thick of modernist experiment with her third novel. One way we often describe the shift from the Victorian to the modernist period is as a shift inward, away from society and towards the psychology of the self. So it makes sense that we might see the amount of conversation between multiple speaking bodies significantly fall away over the course of those novels. The seventh histogram is especially interesting because it suggests the least amount of speech of anything in her corpus. But if we visualize things a different way, we see that this novel, The Waves, actually shows a huge spike in punctuated speech. This graph represents the percentage of each text that is contained within quotation marks, the amount of text represented as punctuated speech.

This might look like a problem with the data: how could the text with the fewest number of quotations also have the highest percentage of quoted speech? But the script is actually giving me exactly what I asked for: The Waves is a series of monologues by six disembodied voices, and the amount of non-speech text is extremely small. More generally, charting the percentage of quoted speech in the corpus appears to support my general readings of the original nine histograms: roughly three times as much punctuated speech in the early novels as in the middle period, with a slight leveling off in the end of her career.
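
Computing that percentage is simple in principle, though messy in practice. A naive sketch:

def quoted_percentage(text):
    """Percentage of characters falling inside quotation marks. Naively
    assumes the marks are balanced, which real texts do not guarantee."""
    inside, quoted = False, 0
    for ch in text:
        if ch in '“”"':
            inside = not inside
        elif inside:
            quoted += 1
    return 100.0 * quoted / len(text)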

We could think of The Waves as an anomaly, but I think it more clearly calls for a revision of such a reading of speech in Woolf’s career. The spike in quoted speech is a hint that there is something else going on in Woolf’s work. Perhaps we can use the example of The Waves to propose that there might be a range of discourses, of types of speech in Woolf’s corpus. Before, I suggested that speech diminished in the middle of Woolf’s career, but that’s not exactly true. My suspicion is that it just enters a different mode. Consider these two passages, both quoted from Mrs. Dalloway:

Mrs. Dalloway said she would buy the flowers herself.

Times without number Clarissa had visited Evelyn Whitbread in a nursing home. Was Evelyn ill again? Evelyn was a good deal out of sorts, said Hugh, intimating by a kind of pout or swell of his very well-covered, manly, extremely handsome, perfectly upholstered body (he was almost too well dressed always, but presumably had to be, with his little job at Court) that his wife had some internal ailment, nothing serious, which, as an old friend, Clarissa Dalloway would quite understand without requiring him to specify.

In each case, the text implies speech by Mrs. Dalloway and by Hugh without marking it as such with punctuation marks. Discourse becomes submerged in the texture of the narrative, but it doesn’t disappear entirely. Moments like these suggest a range of discourses in Woolf’s corpus: dialogue, monologue, conversation, punctuated, implied, etc. All of these speech types have different implications, but it’s difficult to get a handle on them because of their scale. I began the project by simply trying to mark down moments of implied speech in Mrs. Dalloway by hand. Once I got to about two hundred, it seemed like it was time to ask the computer for help.

The current plan moving forward is to build a corpus of test passages containing both quoted speech and implied speech, train a Python script against this set of passages, and then use this same script to search for instances of implied speech throughout Woolf’s corpus. Theoretically, at least, the script will search for a series of words that flag text as implied speech to a human reader - said, recalled, exclaimed, etc. Using this lexicon as a basis, the script would then pull out the context surrounding these words to produce a database of sentences meant to serve as speech. At Eric’s suggestion, I’m currently exploring the Natural Language Toolkit to take a stab at all of this. My own hypothesis is that there will be an inverse relationship between quoted speech and implied speech in her corpus, that the amount of speech left unflagged by quotation marks will increase in the middle of Woolf’s career. Once I have all this material, I’ll be able to subject the results to further analysis and to think more deeply about speech in Woolf’s work. Who speaks? What about? What counts as a voice, and what is left in an ambiguous, unsounded state?
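
Speculatively, the classification step might take a shape like the following; the features are invented for illustration, and the training data would be the hand-labeled passages described above.

import nltk

def features(sentence):
    """A toy feature set for spotting implied speech."""
    words = set(w.lower() for w in nltk.word_tokenize(sentence))
    return {
        'has_speech_verb': bool(words & {'said', 'recalled', 'exclaimed'}),
        'has_question': '?' in sentence,
        'second_person': 'you' in words,
    }

# labeled = [('Mrs. Dalloway said she would buy the flowers herself.',
#             'implied_speech'), ...]  # hand-built training passages
# train = [(features(s), tag) for s, tag in labeled]
# classifier = nltk.NaiveBayesClassifier.train(train)
# classifier.classify(features(new_sentence))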

The project is very much in its beginning stages, but it’s already opening up the way that I think about speech in Woolf’s text. It tries to untangle the relationship between our print record and our sonic record, and further work will help show how discourse is unfolding over time in the modernist period.

Moving People, Linking Lives DH Symposium

[I am very excited to be working with Alison Booth, Jenny Strauss Clay, and Amy Ogden to plan a digital humanities symposium this March. What follows is our general announcement of the event, cross-posted on the Scholars’ Lab blog.]

I am pleased to announce that “Moving People, Linking Lives: An Interdisciplinary Symposium” will take place March 20-21, 2015 at the University of Virginia. Presentations and workshops will open dialogue across different fields, periods, and methods, from textual interpretation to digital research. Invited participants include specialists on narrative theory and life writing, prosopography or comparative studies of life narratives in groups, and the diverse field of digital humanities or computer-assisted research on cultural materials, from ancient texts to Colonial archives, from printed books to social media.

Invited participants include: Elton Barker, Jason Boyd, James Phelan, Susan Brown, Margaret Cormack, Courtney Evans, Will Hanley, Ben Jasnow, Ruth Page, Sue Perdue, Sidonie Smith. We hope to have lots of locals involved with digital work participate as well, and we particularly encourage graduate students to join in for the weekend!

Our symposium will bridge the gaps among our fields; share the innovations of several digital projects; and welcome the skeptical or the uninitiated, whether in our historical fields or in the applications of technology in the humanities. Booth, Clay, and Ogden have each led digital projects with some common themes and aims: locating, identifying, and interpreting the narratives—or very often, the lack of discursive records—about individuals in groups or documents, in Homer or other ancient texts, Medieval French hagiography, and nineteenth-century printed collections of biographies in English. We want to open discussion of many potential methods, including our own—data mining and digital editions of texts; relational databases and historical timelines and maps—for research on groups of interlinked persons, narratives or data about their lives, and documents or other records, and for synthesizing and visualizing this research in accessible ways that reach students and the public. Digital innovation, however, should be informed by traditions of scholarly interpretation and advanced theoretical insights and commitments. Narrative theory and Theory generally, ideological critique including studies of gender and race, textual and book history studies, transnational and social historiography, philology and language studies, archeology, and cultural geography and critical cartography are all gaining influence on digital projects.

Invited participants will be posting about their research to our blog in the weeks leading up to the symposium, and anyone is free to comment on the posts. In addition, our participants will be building a Zotero-powered bibliography full of rich materials related to the event’s discussion.

Organized and hosted by Alison Booth, Jenny Strauss Clay, and Amy Ogden and sponsored by the Page Barbour Committee, the departments of English, French, and Art, the Institute for Humanities and Global Cultures, the Scholars’ Lab and Institute for Advanced Technology in the Humanities, and other entities at UVa, all events are free and open to the public. More information can be found on the blog as planning progresses, and you can follow us on Twitter at @livesdh.

Join in the conversation on the blog, and we hope to see many come out for fruitful interchange in March!

Collation and Writing Pedagogy

[The following is the talk that I gave at the 2015 MLA Conference on a panel on “Pedagogy and Digital Editions.” The Google Docs section is a slight reworking and recontextualization of a previous post on the subject. I’m especially grateful to Sarah Storti and Andrew Stauffer for their suggestions and comments on how to use Juxta Commons to teach writing.]

Collation and Writing Pedagogy with Juxta Commons and Google Docs

We typically think about using digital collation to compare those documents that already exist. The usual model gathers multiple copies, multiple witnesses, of the same text, and juxtaposes them to gather a sense of the very small, micro changes that have been made to a document. We use these changes to deconstruct our sense of a complete and unified final whole. Instead, we get a sense of a series of related texts, of a work manifesting in different forms, and of stages of a revision process for which we were not present. I am especially interested in the last of these categories: collation tools allow us to uncover revision histories that might be otherwise obscured. They help us to uncover past stages of the writing process, breaking apart a text that might seem concrete and fixed and make it appear fluid and subject to change.

The illusion of a final unified text is a problem for textual editing, and collation tools have helped us to solve it for decades. This same problem, the tendency to think of texts as final objects with no prior histories, is at its core one of the key difficulties facing student writers. There is a danger that students will think of writing as carving a marble structure – you chip away at it piece by piece until it assumes the perfect, fixed form it was meant to possess all along. Instead, I want to argue that collation tools can be used by teachers to help students conceive of writing as a kind of assemblage, a piecing together that instantiates one possible combination among many of a set of textual components. This mode of writing is, by contrast, characterized by play, transformation, and fluidity.

I will talk about two tools today in this context – Juxta Commons and Google Docs – and the exercises I use with each. The former is a collation tool proper, and, while the latter is more typically used for collaborative writing, it lends itself quite readily to the practice. So my focus here is on the practical – how to think about and use these tools to teach writing and revision. I hope to tease out more of their implications in the discussion.

Juxta Commons

Juxta Commons, probably familiar to many, is the latest iteration of Juxta, a piece of software that allows a user to upload multiple textual witnesses and, at a glance, discern the differences. The tool’s digital nature means that the process is quantified and streamlined – no laboring over the collator. It also has the benefit of offering a number of visualizations for graphically understanding the differences between two witnesses, a fact that I find helpful for talking about student writing.

A potential writing exercise for use with Juxta is simple: a student writes a paragraph, and they then rewrite that paragraph several times. Finally, the student uploads each version to Juxta before writing a brief reflection on the differences between the drafts. What remains constant? Where do changes cluster? Do these edits indicate any special anxiety or concern with any one particular element of the writing process – transitional sentences, thematic chaining, logic, etc.? Do the ideas themselves stay the same? Fixating on these details can allow students to conceive of writing as an assemblage of various components that result in the illusion of a coherent whole.

In the example above, the student is writing a grant proposal for his tennis team. In an early draft, the class noted that the writing held the team at too much of a remove when the author wanted to stress its importance to him as a second family. Such a charge can seem like a big task, but processing the paragraph through the Juxta assignment throws into sharp relief the minute edits a revision makes to achieve such systemic change. Comparing the two revisions in Juxta, we can see that, by and large, the student revised the subjects of this first paragraph. “I” becomes “we,” and “friends” become a “family.” He works to increase the sense of unity among the group of people he describes, a unity that will later become essential in his argument that the organization provides more to the community than just a place to play sports. The Juxta assignment allows a student better insight into how each of these component pieces can easily be sent into motion and radically change the character of the whole document. A large, sweeping suggestion like “adjust your tone” becomes revision by way of a thousand moving pieces. Much more doable.

Juxta Commons has the added bonus of being envisioned as a commons - an online community of textual scholars. It is quite easy to share sets with others, and it would take little effort to set up a repository of shared collation sets among a classroom. To encourage objective reflection as a component of writing, I would ask each student to write a short reflection on a different student’s collation set, observing the differences and reflecting on the minute changes that got them there.

Juxta’s strength as a collation tool is also its limitation for the sort of teaching exercise that I am describing. Juxta has the benefit of being quantitative: its visualizations can offer users quick and accurate depictions of things that might otherwise go unrecognized – a missing comma, or a single different word. Juxta works best with large documents that are largely the same. But if the corresponding passages become too different, Juxta will be thrown into disarray.

While it is very good at processing texts to find small differences, the software does not quite work if the documents are too different from one another. Its system allows for either exact similarity or difference at the level of the character. It cannot tell, for example, if you have reworded a particular phrase or removed it entirely. The paragraph in this example was heavily rewritten, with only a few words in common between the two drafts. While this sort of at-a-glance collation could be useful to identify revised sections in longer documents, it does little to unsettle the idea of writing as a search for a fully realized whole. Juxta Commons works best for helping students to see the massive change that can be wrought by a collection of small changes.

Google Docs

One of the difficulties with using Juxta to collate is that it relies on a student’s already extant drafts – revision must already have taken place, which seems to defeat the whole purpose of an exercise designed to unsettle the writing process. Google Docs is not a tool made for collation, but I do think that it can helpfully generate just those many witnesses that could be collated. By using Google Docs as a collaborative writing space, classmates can help one another generate different textual possibilities for a single sentence. My use of Google Docs in conjunction with a discussion of writing first came about in an advanced course on Academic and Professional Writing. We talked a lot about editing in the class, and many of the conversations about style took this shape:

Student A: “Something about this word feels strange, but I don’t know what it is.”
Student B: “What if we moved the phrase to the beginning of the sentence?”
Student C: “We could get rid of that word and use this phrase instead.”

Those statements are hard to wrap your head around on the page; imagine how much harder they are when only spoken aloud. Talking about writing can only get you so far: writing is graphic, after all. As I write and edit, I try out different options on the page. I model possibilities, but I do so in writing. Discussing the editing process without visual representations of suggested changes can make things too abstract to be meaningful for students. They need to see the different possibilities, the different potential witnesses. I developed an exercise that I call “Writing Out Loud” that more closely mirrors my actual editing process. Using a Google Doc as a collaborative writing space, students are able to model alternate revisions visually and in real time for discussion.

The setup requires a projector-equipped classroom and that students bring their laptops to class. Circulate the link to the Google Doc ahead of time, taking care that anyone with the link can edit the document. The template of the Google Doc consists of a blank space at the top for displaying the sentence under question and a series of workspaces for each student consisting of their name and a few blank lines. Separate workspaces prevent overlapping revisions, and they also minimize the disorienting effects of having multiple people writing on the same document.

We usually turn to the exercise when a student feels a particular sentence is not working but cannot articulate why. When this happens, I put the Writing Out Loud template on the projector with the original version of the sentence at the top. Using their own laptops, students sign onto the Doc and type out alternative versions of the sentence, and the multiple possible revisions show up on the overhead for everyone to see and discuss. After each student rewrites the sentence to be something that they feel works better, ask for volunteers to explain how the changes affect meaning. The whole process only takes a few minutes, and it allows you to abstract writing principles from the actual process of revision rather than the other way around. How does the structure of a sentence matter? How can word choice change everything? What pieces of a sentence are repetitive?

I especially like this exercise because it asks multiple students to engage in the revision process. It is always easier to revise when you have critical distance on a piece of writing, and outside editors with no attachment to a particular word or phrase can offer just that. In the above example, the sentence under discussion contained the colloquial phrase “get the word out.” The class offered a range of alternatives that vary in their formality. Instead of receiving an edict to professionalize their tone, the student gets a glimpse of many possibilities from which he can choose. The exercise also allows the choices to exist side by side, making collation possible in a way that the usual revision process makes difficult. Most students, I would wager, work with one or, at most, two drafts open at a single time. Google Docs can allow a number of possibilities to emerge.

The Google Docs exercise works better on micro-edits, revisions at the level of the sentence. The standard process of the exercise—write, collate, and discuss—would take far too long with anything longer than a few lines. The exercise can be particularly useful for those sentences that carry a lot of importance for entire arguments: thesis statements, topic sentences, the first sentences of the document, etc. Where Juxta is entirely quantitative and offers handy graphic visualizations of textual difference, this Google Docs exercise relies on you and the students to collate the materials yourselves. You can recognize subtle differences – a reworded idea vs. a dropped idea, for example. It trains students to internalize the practice of collation and reflect on the interpretive possibilities offered by such differences.

Closing Analysis

I find that students often think of editing as an intense, sweeping process that involves wholesale transformation from the ground up. Modeling multiple, slightly different versions of the same sentence can allow for a more concrete discussion of the sweeping rhetorical changes that even the smallest edits can make. In this sense, I think using these tools in the classroom allows students to conceive of a single composition as one instantiation among many. Forcing them to compose several different models means that the writing process will be looser. Collation as composition offers students a subjunctive space wherein they dwell in possibilities. It is a vision of composition as de- and reconstruction, as a process that is constantly unfolding.

Digital tools uncover how writing is really always already such a fluid process, and they can allow students to see their own composition process in this way while they are still in the thick of it. Digital collation can offer students the chance to think of their own works as messy, subjunctive spaces, as things in flux. By allowing multiple possible versions of the same text to exist alongside and in relation to one another, they can allow students to slip between different textual realities. Most importantly, the process severs the link between the quality of an idea and the manner of its presentation. Instead of one right answer, students can see that there are many possible solutions to any writing difficulty.

I have touched on how exercises like these can also encourage students to distill writing principles from the process rather than the other way around. They can also help students to discover editorial principles through their own writing. I am imagining here a praxis-oriented approach to teaching textual editing where practice leads to principle, one where scholarly readings might come after a student has written an essay, revised it, and, in effect, produced their own edition of their text. An exercise with Juxta might lead to a discussion of eclectic editions, while Google Docs could lead to a fruitful discussion of accidentals and substantives. I am not suggesting that these sorts of exercises replace the good work performed by studying classic editions, reading about editorial practices, or producing one’s own edition by carrying out the steps of the editorial process. But in a class that has an explicit focus on composition, exercises with tools like Juxta Commons and Google Docs can help connect textual criticism with writing pedagogy.

The Devil in the Recording

[The following is an only slightly modified version of the talk I gave at the ACH’s panel on “Digital Deformance and Scholarly Forms” at the 2015 MLA conference. For more details on how to reverse audio recordings, see my previous post on the subject.]

The Devil in the Recording: Deformative Listening and Sound Reproduction

It’s a well-known fact that you can find the devil in popular music. Simply take Led Zeppelin’s “Stairway to Heaven,” play it backwards, and voila. You’ll get messages for, if not by, the Lord of the Flies. Obviously I’m being facetious. Few, if any, take this claim seriously, but it does offer serious ways to think about deformance in the context of sound recordings, particularly those with linguistic or literary content. The digital method of deformance I’ll speak about today, then, is a simple one. Using open source tools like Audacity, it’s easier than ever to play recordings backwards, to reverse a sound clip with the flip of a switch. I’ll touch just a bit on the history of such methods as they pertain to music and then speculate as to what they can tell us about approaches to thinking about literary sound recordings. I’m a modernist, and my examples will reflect this bias. My ultimate conclusions are as follows. First: reading backwards, juxtaposed against audio reversal, reveals the unique character of literary sound recordings to be simultaneously sounded and print, to be audiotextual objects, as I call them. Second: deformance can offer us new modes for thinking about media failures and malfunctions that actually surround us constantly. In particular, audio deformance is something that the modernists were keenly interested in, and deformance as a practice can get us closer to the relationships they had with media.

So here is part of “Stairway to Heaven” backwards.


Can’t you hear the devil? The “Stairway to Satan,” as I will call it, suggests that we can find new linguistic content in an already extant sound message. Detractors of the “Stairway to Satan” narrative (numerous on YouTube if you care to check them out) suggest that this is just a function of our minds wanting to make sense of chaos. Is this gibberish? Or is it a collection of scattered sound components that can be reconstituted into a whole? In Lisa Samuels and Jerome McGann’s essay on deformance, from which this panel takes its cue, they discuss reading Emily Dickinson’s poetry backwards in a mode not too far removed from this discussion. Reading backwards can throw into sharp relief the linguistic components, the very pieces that make up a poem, and at the end of the day, you still have the lines, the words, or even the component letters. It’s possible to reassemble these into semantic meanings.

But sound recordings are something different. They are bound in time in a different way. Daniel Albright in Untwisting the Serpent describes music by way of “Lessing’s famous distinction between the spatially juxtapositive arts of nebeneinander, such as painting, sculpture, and architecture, and the temporally progressive arts of nacheinander, such as poetry and music” (9). Our experience of music and poetry depends upon their ability to move forward in time. To put the distinction in the context of deformance: you can move around a sculpture and view it from different angles, but it remains the same sculpture. Deform a musical recording by reversing its waveform, however, and you end up with a different musical artifact entirely, one with different component parts. Hence, it can sound like gibberish.

Here is the waveform for the Zeppelin clip. The waveform here is a charting of intensity over time, and the reversal literally changes the original artifact. It’s a mirror image, but our ears are hard-pressed to be able to reconnect the new object to its original. Many kinds of deformance you can do on an audio recording would work in the same way – alter the pitches, smash them tighter, stretch them out, etc. You alter that wave, and you get something else entirely. At what point does it become something new?

But some reversed audio still sounds like a recognizable tune. Behind the “Stairway to Satan” claim is a long history of musical reversal and mirroring. Musicians and listeners have been fascinated with the vectored nature of sound for centuries, and composers have experimented with reversal as a spur to creativity for ages. Take this melody.


The melody of the first ten measures is followed by a retrograde repetition of itself, meaning that it is a musical palindrome. All of the intervals of this first section become reversed and, if you were to fold the melody in upon itself, it would perfectly line up. Playing backwards is itself built into the creative process. The playback reflects this as the bouncing ball literally moves backwards on the page, but, if you were to write it out, it would look quite different. The kind of deformance that I am describing, that Zeppelin conspiracy theorists lament, and that Samuels and McGann suggest – it’s built into the music itself.

The melodic reversal of music like this works because, as Walter Pater taught us, music can be thought of as a “perfect identification of matter and form.” Flip the melody and you do not lose information, you get a new melody. The new object is still discernible as music because it is new music. The addition of linguistic content complicates the question - reversed phonemes do not map neatly onto other phonemes. A recorded object with linguistic content has two distinct characters, each of which overlaps with the other. It’s an obvious point, but one that I think has profound implications.

In Langston Hughes’s 1958 recording of “Motto” in collaboration with Charles Mingus and Leonard Feather, we can start to approach some useful conclusions about what this might all mean. The excerpt starts with an instrumental section and then Hughes comes in. So keep in mind that, in the reversal, we’ll hear the poetry first and then the instrumental part.


“Motto” Reversed

The recording is a useful analogy for vocal sound recordings more generally in that it has two distinct pieces – a musical (non-verbal sound) component, and a recorded voice (with linguistic content). The two elements often intertwine and are not easily separated (this example notwithstanding). You can hear, I think, the stark difference between the reversed poetic content by Hughes and the reversed instrumental content. Hughes reversed sounds like nonsense, while the saxophone in particular still sounds like something of a melody. The digital reversal of sound recordings treats them both as waveforms with no semantic content – it reverses them just as easily and happily as it would any other sound recording.

We might expect the practice of deformance to throw into sharp relief the status of these recordings as sound objects. The pops, silences, and phonetic meanings of a reading suddenly become especially salient, and we might expect this reversal to make us hyper-aware of their sounded nature. In theory the deformance of these recordings more easily allows us to practice what Charles Bernstein has called “close listening,” examining the sounded nature of these objects. But, as the sounds themselves become distorted almost beyond recognition, the method can only provide clues towards such a practice. We might gain a general sense, as with the Hughes recording, of the prominence of certain registers or frequencies, of silences and gaps, or of sections that are particularly filled with sonic activity. All of these might provide hints of content that could bear out fruitful analysis when played forwards again.

Deformance of poetic recordings forces us to consider the nature of literary recordings anew. We might extrapolate from the character of this recording that all recorded voices contain a linguistic element as well as an audible one. Not fully audio nor fully textual artifacts, I want to say that they are, instead, something we might call audiotextual, a term that Jason Camlot has recently used in relation to the classification of Victorian literary recordings as an expansion of McGann’s own historicist approach to textual criticism.

I want to use the term as a play on audiovisual to describe the state of such sound recordings. Like the Hughes recording, with both an instrumental sound component and a linguistic one, audiotextual recordings exist in sound as well as in print. It’s a fairly simple idea, but I think it is one that often gets lost as we discuss sound recordings. Literary sound recordings are not reducible to their relationship with a print text: they have both sounded and print components. Audiobooks, in particular, not being “poetic,” often seem to get left out of close listenings and treated as mere reproductions of print texts. If you read the reviews of any Amazon audiobook or LibriVox recording, you will see hundreds of people who expect an audiobook to be an unmediated, honest representation of the print original. Audiotextual might be used equally to describe both Hughes’s literary sound recordings and Hughes’s poetry itself, saturated as it is with traces of the live performance techniques of jazz and blues musicians.

More profoundly, I think sound recording during the modernist period is an especially good candidate for deformative acts of listening and interpretation. It is well-known and often-noted that modernist authors were obsessed with the gramophone, but consider the nature of such representations. The gramophones are most often marked by the materiality of their failures. In the “Hades” episode of Ulysses, the machine disintegrates into parody at the very moment at which it is meant to revive the voice of a dead relative: “After dinner on a Sunday. Put on poor old greatgrandfather Kraahraark! Hellohellohello amawfullyglad kraark awfullygladaseeagain hellohello amawf krpthsth” (114). Sound reproduction during the period was not marked by high fidelity, by the ability to authentically reproduce a deceptively “real” recording of life. It is a flawed act marked, as in Joyce’s case, by skipping needles, locked grooves, and hissing machines. For Joyce, this means uncovering a renewed sense that sound recordings were imperfect things, themselves subject to deformation by their own young technology. Joyce thinks of the gramophone recording as an object that can reach back into the past. He does not play it backwards as such, but the very act of playing the voice reverses time itself. And it does so in a manner that deforms the recording, altering its shape and transmission as a natural and comical part of the playback process.

Woolf’s failing gramophone in Between the Acts draws the elements of my short talk together nicely and can act as a closing image. During the pageant play at the heart of the text, a malfunctioning gramophone provides musical and narrative accompaniment: “The gramophone gurgled Unity-Dispersity. It gurgled Un…dis…And ceased” (201). The words themselves break apart into component syllables; semantic meaning evaporates as the grain of language pushes to the surface, and the heard word gives way to the gurgling materiality of the record itself. Woolf makes us hear the sound of the words as bound with their meanings. Her gramophone falls into locked grooves throughout the novel, transfixing its listeners and forming a community out of its audience by expanding the time in which they engage with each other. Not reversing time, certainly, but she does meditate on the ability of a malfunctioning gramophone to create anew through performance and deformance.

For Joyce and Woolf, the machines fail as often as they succeed. Deformance is thoroughly entwined with such performances. We may even go so far as to say that sound reproduction of this sort is always a kind of deformance, that no media form provides a pure, unaltered transmission of its content. As a critical practice, deformance, a systematic and intentionally disruptive form of engagement with materials, actually gets us closer to the kinds of media relationships that these authors would have known. The practice can offer us new perspectives on literary and sonic materials, sure, but it can also provide us with something older. Deformative listening, then, might be a practice of recovery, of attempting to recreate the phenomenological experience of 1920s gramophone listening. The devil in the recording proves to be not the sort that conspiracy theorists would have you believe. The darkness lurking beneath sound recordings, be they musical or literary in nature, is the shadow of the materials, their very real failures, and the deformance that has always been present anytime we put needle to disc.

Works Cited

  • Albright, Daniel. Untwisting the Serpent: Modernism in Music, Literature, and Other Arts. Chicago, Ill: University of Chicago Press, 2000. Print.
  • Bernstein, Charles, ed. Close Listening: Poetry and the Performed Word. Oxford University Press, USA, 1998. Print.
  • Hughes, Langston, Leonard Feather, and Charles Mingus. Weary Blues. MGM, 1959. CD.
  • Joyce, James. Ulysses. Vintage, 1990. Print.
  • Pater, Walter. “The School of Giorgione.” The Renaissance: Studies in Art and Poetry. 1873. Web.
  • Woolf, Virginia. Between the Acts. New York: Harcourt Brace Jovanovich, 1969. Print.


[The following post is cross-listed on the ACH’s blog. The post details the methods used in putting together a talk for MLA 15 that takes place in 212 VCC West at 12:00 PM on Friday, January 9th.]

My talk for the ACH’s panel at MLA 15 is entitled “The Devil in the Recording: Deformative Listening and Poetry.” I will be talking about the problems and affordances of deformance in the context of audio recordings, specifically those that have literary content. The particular method I will focus on is the reversal of audio recordings, taking my cue from the infamous claim that you can hear Satanic lyrics in Led Zeppelin’s “Stairway to Heaven” if you play the recording backwards. In the example below I will show how to reverse an audio file, and I will be working with Langston Hughes’s reading of “Motto” with Charles Mingus and Leonard Feather on his 1958 recording Weary Blues.

Professional-grade sound-editing programs like Pro Tools or Logic give you the capacity to do a lot in the way of sound mixing and musical work, but I often find myself drowning in their limitless options. They are also quite expensive. Audacity is my audio-editing software of choice: the tool is open source and, most importantly, fairly intuitive and easy to use. Audacity is somewhat more limited than other options, but what it does, it does cleanly and simply.

To reverse an audio file, begin by opening that clip in Audacity in the same way that you would open a file in any other piece of software. You will get something that looks like this:

What you see here is a waveform, a graphical representation of the audio file that allows you to manipulate it. The y-axis of the waveform corresponds to volume – the taller the waveform, the louder the sound file’s contents are at that particular moment in time. This can be a quick and easy way to identify chunks of activity by looking for spikes in the volume.

The x-axis represents time – the Hughes file I have sliced out is 46” long, and the program gives you a timeline along the top of the segment to situate you in the file. Clicking anywhere on the waveform will set the file playback to begin at that point, and you can click and drag to highlight a selection of the clip for processing.
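
As an aside, this view is easy to reproduce outside of Audacity. Here is a minimal Python sketch, assuming a mono, uncompressed, 16-bit WAV copy of the clip; the filename is a placeholder.

import wave
import numpy as np
import matplotlib.pyplot as plt

with wave.open('motto.wav', 'rb') as f:  # hypothetical mono 16-bit file
    samples = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
    rate = f.getframerate()

time = np.arange(len(samples)) / rate  # x-axis: seconds into the clip
plt.plot(time, samples)                # y-axis: amplitude (volume)
plt.xlabel('Time (seconds)')
plt.ylabel('Amplitude')
plt.show()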

To process the file, highlight the section that you want reversed. In this case, since we are working with the entire file, we will just select everything. Under the “effect” menu, Audacity gives you a range of options for remixing your sound data, but we want the “reverse” function.

Now you have a reversed file at your disposal. Sound tends to work in attack and decay, and much of the strangeness of a reversed recording comes from sounds increasing rather than fading in intensity over time. And, as I will discuss in my talk, the process throws into sharp relief the distinct character of recorded linguistic content.

Audacity saves files in its own project format by default, so you will need to export your file to a different file format if you want to play it in a media player. I tend to use both .ogg and .mp3 files for browser compatibility. Audacity will also give you the opportunity to input light metadata for your file before it exports in case you want to curate your file for inclusion in an archive or home library.
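
If you find yourself reversing and exporting many clips, the same steps can be scripted. Here is a minimal sketch using the pydub library, which wraps ffmpeg; the filenames are placeholders, and this is a convenience rather than the method I used for the talk.

from pydub import AudioSegment

clip = AudioSegment.from_file('motto.mp3')  # hypothetical input file
reversed_clip = clip.reverse()
reversed_clip.export('motto_reversed.ogg', format='ogg')
reversed_clip.export('motto_reversed.mp3', format='mp3')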

Audacity offers many other options for experimenting with sound remixing, distortion, and deformance that I would encourage you to explore, as well as tools for working with sound files more generally. I have written elsewhere about using Audacity to prepare sound files for research and presentation. Check out my other post if you want to learn more about how to slice out clips, mix together two sound files, or process DRM files.

Diss Talk Abstract

On Friday, November 7th, I will be giving a talk to the UVA English Department entitled “The Joycean Record: Listening Patterns and Sound Coteries.” The talk reworks material from one of my dissertation chapters that I originally presented at last year’s MLA meeting in Chicago.


Modernist authors famously gathered in a series of small coteries, intellectual clusters centered on the production and reception of their creations. Modernists frequently took to the microphone to record readings of their works as well, and the lives of such sound objects can offer us new networks of modernist reception and distribution as well as a new conception of modernism’s engagement with sound technology based on lived practices. This talk places James Joyce alongside sociologies of record collecting and reception as a means of rethinking Ulysses’s engagement with sound recording technology as an ongoing, lived, and social practice. Doing so uncovers a new history of Ulysses as both participant in and subject of sound communities emerging during the twentieth century, as an object that coordinates networked sound production and reception. From Joyce’s network of friends and collaborators to the coterie that gathers around the production of the 2007 LibriVox recording of Ulysses, I suggest that the group listening enabled by sound recording has always been vital to the life of Joyce’s text.

Prism in the Classroom: Questions to Frame Discussion

Cross-posted on the Scholars’ Lab blog.

I have been touting the use of Prism in the university classroom for some time now, but a recent exchange with Annie Swafford suggested to me that it might be worth explicitly outlining how I would go about doing so. With that in mind, I’ve composed the following set of questions for framing discussion of Prism in the classroom. I’ve admittedly had only very brief chances to implement the tool in the classroom myself, so these thoughts come largely out of speculation and conversation. Note as well that I assume below that you have already chosen a text and the categories along which it should be marked (I may write on ways to approach such choices at a later date). In what follows, I move from general questions that I think would be helpful in framing any discussion of the tool to a particular use case: James Joyce’s A Portrait of the Artist as a Young Man. The general questions inform and frame the use case that follows.

I prepare for class discussion by assembling a list of questions to be explored, and I would organize a Prism discussion around two lines of inquiry: tool-specific and visualization-specific. Some of these questions are helpful for framing your own thoughts; others could usefully be posed to the class as a whole as a way of opening discussion.

Tool-Specific Questions

  • How do the tool and our framing of it affect how we read the text?
  • How is Prism’s mode of reading different from what we normally do? Is it the same thing we’ve always been doing – close reading in a different form?
  • What are the problems with the form? Can we really boil interpretation down to a series of numbers, visualize it, and move forward? Or is there more to interpretation than that?
  • How do individual interpretations join with the group reading?
  • How much of the interpretive process is encapsulated in the marking of a text? The visualization? The conversation that follows?
  • How do the terms you choose for highlighting (the facets) guide the experience of reading the text? How do the explanations you provide for those terms affect the marking experience?
  • When do the terms break down? If the terms propose a binary, what happens to that opposition over the course of the experience?

Visualization-Specific Questions

  • Which passages were marked the least for a particular category? The most? Why in either case?
  • Which passages were particularly contentious, marked in many different ways?
  • Where do particular categories cluster? How does the visualization show a relationship between the categories?
  • How does your own interpretation link up with the collective visualization produced by the tool? Do the two tell us anything meaningful? Would we be able to find these meanings on our own?
  • How does the visualization reflect the interpretive process?
  • Why might we care more about a particular visualization for a particular reading?
  • How is the quantified version of interpretation that Prism generates distinct from what we might learn from a discussion on our own? Can we imagine limits to this approach?

The primary job of an instructor using Prism is to help the students connect the results of the tool to the larger discussions encapsulated by the marking categories. Look at the results with a skeptical eye and ask how they can be meaningfully related to the ideas and provocations of the marking categories. My favorite early use of Prism asked users to mark James Joyce’s A Portrait of the Artist as a Young Man along the categories of “modernism” and “realism.” In a class, I would intersperse observations based on the visualizations with a discussion of the passage and the two marking categories. What do we mean by modernism? By realism? How is each expressed at the level of the text? What do we mean by literary experiment? By fragment? By realist details? What different genres does the text move through? Does the text construct a coherent narrative?

Putting realism and modernism alongside one another in Prism forces students to reconsider the binary, which quickly breaks down in practice. We can talk about whole novels or poems as belonging to one or another category, but can we do the same for individual sentences? For words? At the time of this writing, 80% of users believe that the first word of the excerpt, “once,” is modernist. But why? If you look at the winning facet visualization, people seem primarily to be marking whole passages as one category – they are interpreting realism and modernism in chunks, not in terms of individual words. Readers tend to mark as modernist those generic shifts where the excerpt suddenly adopts the form of a nursery rhyme or a fairy tale, suggesting that it is not any one genre but the movement among several in rapid succession that readers find to be modernist. The font size visualization suggests that passages referencing physical actions by people are more likely to be marked as realist: “His father told him that story” and “When you wet the bed first it is warm then it gets cold” are marked as being especially realist. With this observation in hand, we can ask: why these details? Why are the body and bodily detail markers of realism? Why might an association with the family suggest realism? How do they come under pressure in the face of aesthetic experiment?

Obviously these suggestions are just starting points for approaching Prism in the classroom. Many other fascinating uses have already surfaced, particularly ones that employ the tool to teach basic reading and foreign language skills. Get in touch if you have used the tool in your classroom! I would love to hear how you did so.

Prism News - Heroku and LLC

Cross-posted on the Scholars’ Lab blog

This past year the Scholars’ Lab has implemented many performance upgrades and bug fixes for Prism. The most recent upgrade is particularly exciting: users can now deploy their own personal Prism installations to Heroku with the click of a button. Well – the click of a button and a few other commands. I’ve added a section detailing just how to do so under the “Deploy to Heroku” section of the Prism GitHub readme.

It was already possible to implement private user communities by marking uploaded Prism games as “unlisted” and then distributing the links to your group of participants. The Heroku deploy function makes this process a bit easier by allowing users to host all of their games in one place. The process also sets you up well to tinker with the Prism codebase using a live app, as Heroku provides instructions for cloning the app to your desktop.

All of this comes on the heels of another exciting announcement: the Praxis Program has a short article on Prism appearing in the Digital Humanities 2013 special conference issue of Literary and Linguistic Computing. In the piece, we summarize Prism’s interventions into conversations on crowdsourcing, with special reference to its user interface.

It’s a good day to e-highlight!