CEECing new directions with Digital Humanities

This past week I was talking about the relationships between corpus linguistics and digital humanities as a visiting scholar at VARIENG, a very well known historical sociolinguistics and corpus linguistics working group. Corpus linguistics is a very text-oriented approach to language data, with much interest in curation, collection, annotation, and analysis – all things of much concern to digital humanists. If corpus linguistics is primarily concerned with text, digital humanities can be argued to be primarily concerned with images: how to visualize textual information in a way that helps the user understand and interact with large data sets.

VARIENG has been compiling the Corpus of Early English Correspondence (CEEC) for a number of years, and one of their primary concerns is ‘what else can we do with all this metadata we’ve created?’ Together, we discussed three main themes of corpus linguistics and digital humanities: access, ability, and the role of supplementary vs created knowledge. Digital humanities runs on a form of knowledge exchange, but this raises questions of who knows what, how they know it, and how that knowledge can be accessed.

Approaching a computer scientist with a bunch of historical letters may raise some “so what” eyebrows, but likewise, a computer scientist approaching a linguist with a software package to pull out lexical relationships might raise similar “so what” eyebrows: why should we care about your work and what can we do with it? Because both groups walk in with very different kinds of expertise, one of the very big challenges of digital work is to be able to reach a common language between the disciplines: both have very established, very theoretically-embedded systems of working.

All of this is to say that the takeaway factor for corpus linguistics research, and indeed any kind of digitally-inflected project, is very high. As Matti Rissanen says, and rightly so, “research begins when counting ends”. The so-what factor of counting requires heavy contextualization, human brainpower, time, funding, systems and communication – and none of these features are unique to corpus linguistics. Digitally-inflected scholarship requires complementary expertise in techniques, working and interacting with data; we need humanistic questions which can be pushed further with digital methods, not digital methods which (we hope) will push humanistic questions further. While it is nice to show what we already understand by condensing lots of information into a pretty picture, there are deeper questions to ask. If digital humanities currently serves mostly to supplement knowledge, rather than create new knowledge, we need to start thinking forward to ask “What else can we do with this data we’ve been curating?”

One thing we can do with this data is view it in new tools and learn to ask different questions, as we did with Docuscope, a rhetorical analysis software developed at Carnegie Mellon University.

Digital tools and techniques are question-making machines, not answer-providing packages. Here we may ask ourselves why F_1720-39.txt has a low count of Personal Pronouns in Docuscope, and the answer may be that what we consider to be personal pronouns (grammatically) are categorized otherwise by Docuscope and that other constructions are used instead. This isn’t magic and this can’t be quiet handwaving: we should be pushing ourselves towards asking questions which were previously impossible at the scale of sentence-level or lexical-level of detail, because suddenly we can.
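To make the point about categories concrete, here is a minimal sketch of what a purely grammatical count of personal pronouns might look like. It is not how Docuscope works: the word list, the tokenizer and the function names are all my own assumptions for illustration, and the sample sentence is invented rather than taken from the CEEC. A count like this could then be set beside a Docuscope figure for the same file to see where the two categorizations part ways.

```python
import re
from collections import Counter

# Hypothetical closed list of grammatical personal pronouns (an assumption
# made for this sketch). Docuscope's own categories differ and are not
# reproduced here. Note that "her" is ambiguous between object and
# possessive uses, which a plain word list cannot resolve.
PERSONAL_PRONOUNS = {
    "i", "me", "we", "us", "you", "thou", "thee", "ye",
    "he", "him", "she", "her", "it", "they", "them",
}


def tokenize(text):
    """Lowercase the text and split it into rough word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pronoun_counts(text):
    """Count each personal pronoun from the list above."""
    return Counter(t for t in tokenize(text) if t in PERSONAL_PRONOUNS)


def pronoun_rate(text):
    """Return personal pronouns per 1,000 tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in PERSONAL_PRONOUNS)
    return 1000 * hits / len(tokens)


# Invented sample sentence, not corpus data.
sample = "I thank thee for thy letter; we hope you are well."
print(pronoun_counts(sample))
print(round(pronoun_rate(sample), 1))
```

Even in this toy version the design choices are visible: "thy" is left out because it is a possessive, and a different decision there would change the count. That is exactly the kind of difference to look for when a tool reports a surprisingly low number.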



Every ethnographer is a Borat but is every Borat an ethnographer? (part 3)

3.0 Where is ethnography heading to?

A fundamental starting point of ethnographic research, and hopefully this shows already from my argument here, is that its research objects are human subjects (‘the ethnographees’). In ethnographically oriented fieldwork, therefore, theory does not emerge behind a desk, but when engaged within the field. Ethnographic theorising is dialectically constructed in interaction with the material world and in encounters with human subjects. Ethnographers approach the world as an incredibly complex place of social action and communicative practice, and theorise on the basis of a description of that complexity, rather than on the basis of existing theories (see Juffermans 2008 for a position paper that touches on the difference between ethnographers and tourists). Ethnographers, in other words, arrive at theory from below, that is, from the messy everyday life of a given sociocultural space, and try to reconstruct its cultural ecology. Ethnography thus works its way up from small, micro-events and lived experiences to explain, or at least try to explain, the societal forces that are at stake within the cultural ecology of a given space at a given time. As ethnographers we arrive at new forms of knowledge in collaboration and negotiation with agentive ethnographees, or not at all (cf. Cameron et al. 1992; Collins 1998). These ethnographees, our research subjects, are not passive researchees, but agentive human beings with attitudes towards the research object and, as I have hopefully already pointed out to you, with a voice of their own (Fabian 1995).

Ethnographic studies of language in society have recently taken a ‘human turn’. That is, they have moved away from languages as linguistic systems that are merely used by people, towards language as a sociolinguistic system that is constructed and inhabited by people. Prominent scholars of language in society (e.g., Rampton 1995; Stroud 2003; Blommaert 2005; Makoni & Pennycook 2007; Jørgensen 2008; Juffermans 2010) no longer define the field of sociolinguistics as the study of ‘who speaks (or writes) what language (or what language variety) to whom, and when and to what end’ (see Fishman 1972; Extra & Gorter 2008), but as the study of ‘who uses what linguistic features under particular circumstances in a particular place and time’. The central question that ethnography, or better linguistic ethnography, has to cater for thus shifts from ‘what languages?’ (with language in the plural) to ‘who languages which bit of a language and how does he or she do that with what purpose’ (using language as a verb). In such a sociolinguistics of languaging (as opposed to a sociolinguistics of languages), the analysis revolves around human beings (languagers) engaged in particular communicative activities and situated in particular social, historical and geographical environments. The task for such a linguistic ethnography lies not only in describing and understanding language, but ultimately in describing and understanding society. Will ethnography survive this challenge and the challenge brought by the languaging that takes place among people through social media? It is early days still, but it seems that linguistic ethnography is getting there.



Every ethnographer is a Borat but is every Borat an ethnographer? (part 2)

2.0 When is ethnography actually ethnographic?

Ethnography is like the new girl in town. All sorts of disciplines try to have a date with her. Take for instance the fact that ethnography has recently had a hit in a handbook of criminology, which shows that at times this dating leads to a mating. Yet, ethnography finds its most suitable bedfellow in anthropology. As you know, ethnography means ’writing about the nations’, with ’graphy’ from the Greek ’to write’ and ’ethno’ from the Greek noun ethnos that can be translated as either ’nation’ or ’tribe’ or ’people’. What this implies is that the unit of analysis of the ethnographer, that is the ethnos, need not be a nation, a region, a village or a speech community – no matter how difficult this concept may be to define – rather, to the ethnographer this unit of analysis may be ’any social network forming a corporate entity in which social relations are regulated by customs’ (Erickson 1984:52). What makes a study ethnographic, then, is not the fact that this discipline takes a socio-cultural space of any size at any given time as a whole. Rather, ethnography portrays incidents through an emic perspective. This means that ethnography and the ethnographer, although we will tackle this matter at a later stage, portray events from the point(s) of view of the actors involved. This focus on meaning constructed by the actors involved in the observed incident is at the heart of Malinowski’s definition of ethnography in Argonauts of the Western Pacific. His attempt, in fact, although not always successful, tried to crystallize meaning in words from the actors’ perspective. So where does the difference between Borat and the ethnographer lie? Let’s say that what Borat does is not ethnography, or better, it could be described as a pre-scientific ethnography. Unlike Borat, the trained ethnographer brings to the field a specific concern for meaning making at the level of actors.
Borat may well be an excellent reporter of his own experience in the United States. He may also highlight several metonymic features of the socio-cultural spaces he has been getting involved with during his search for the meaning of living in America, Pamela Anderson being one of them. Yet again, the ethnographer combines on the ground experience with an awareness, gathered through meticulous fieldwork, of those local meanings of behaviour that differ from those of his own space of socialization. This is the ethnographer’s magnum opus and yet his cross. He has to make sense of a ’strange’ behaviour, make it familiar to himself and then report on it and, in doing so, he has the task of making it interesting again. Thus, ethnography differs from a journalistic report, a tourist diary or an episode of Borat in that it is painstaking in its data collection, rigid in its data analysis and controlled with regard to the ethnographer’s own subjectivity. In the end, it is the ethnographer who gives rise to a reality through his text and it is in this very text that the ethnographer produces a caricature, i.e., a systematic distortion of the features of a certain socio-cultural space. And yet this is a distortion that does not call for subjectivity and intuition at the expense of objectivity. Rather, such distortion is a representation where, as in a caricature, the ethnographer selectively reports on certain aspects rather than others and where he renders local meaning from a chosen point of view. And it is in so doing that an ethnographic study becomes ethnographic, in that it shows the decisions made during the data collection process (Bezemer 2003; Jie Dong 2009; Spotti 2007), it describes the kinds and amounts of data that were and were not (made) available, the negotiations and related frustrations in gaining entrance within a certain socio-cultural space, and the process of rendering actors’ meanings in his own text.

Every ethnographer is a Borat but is every Borat an ethnographer? (part 1)

1.0  Introduction

I am an academic, or better, my identity is often ascribed by others as that of an academic. More specifically, depending on which conference I attend, my academic identity is ascribed as that of a sociolinguist, of an anthropologist, of an anthropologist interested in linguistics or, in the best of cases, as a linguistic ethnographer. If I were to provide a description of my inhabited (academic) identity, I am a primary school teacher who has bumped into anthropology and sociolinguistics and who has tried to bake something out of these two disciplines. In so doing, I have become an ethnographer interested in how people construct their identities in verbal interaction within a sociocultural space, like a multicultural primary school classroom (Spotti 2007, 2008; Spotti & Kroon 2009). Setting this aside, this ascribed homo academicus identity of mine is in sharp contrast with my inhabited identity, which is that of someone who is well into ’low key’ cultural activities like going to the pub, cheering for a football team and, although it may not suit an ascribed academic identity and many would be shy to admit it, watching low key TV programs like Borat. It is while watching one episode of this series that I came to think that Borat has quite a bit to do with ethnography. How so? Let me first bring some enlightenment to those of you who do not know Borat. Borat is a foreigner from Kazakhstan. More precisely, he is a foreigner who has lawfully entered the United States of America and who tries to understand what living in America is all about. Like Borat, the ethnographer’s goal involves:

’analyzing a cultural phenomenon from the perspective of the outsider (to whom it is strange) while seeking to understand it from the perspective of an insider (to whom it is familiar)’ (Gal 2005:349)

In short, it is the ethnographer’s task to shed light on phenomena that members of a culture overlook because they take them as a given and often explain with the metapragmatic rationale that ’that thing happens here because it is normal for it to happen like that’. Borat, like an ethnographer, is trying to grasp the cultural ecology of the sociocultural space that he is inhabiting. In what follows, I try to tease apart what makes ethnographic research ethnographic and to draw a line between Borat’s own experience and the job of being an ethnographer engaged in field work. I then conclude with some considerations about where ethnography is heading.

Easy French: For our men abroad, And how to pronounce it.

“The need for supplementing the average Briton’s extremely fragmentary knowledge of French and German has led to the formation of language classes for recruits of the new army, and many pocket dictionaries and conversation manuals have been published from time to time for the use of the men already in the field. Amongst the latter it would be difficult to find anything better than Captain Keyworth’s Easy French and Easy German. Small enough to be carried inside an ordinary pocket-book, these leaflets contain phrases and words most likely to be required by the soldier…” – The British Medical Journal, February 13, 1915.

This little 12-page pamphlet has probably seen the world. It caught my grandmother’s eye as she sifted through mementos, working on yet another scrapbook rewriting history, and she passed it along to me on a recent trip to Canada. It belonged to my great-grandfather who, in his later years, was a composer and band leader for the 1st British Columbia Regiment Band in Vancouver (Lieut C.J. Cornfield) and who spent 12 years in India with the British military, likely as part of a regimental band. And that is all I know about the origin of this pamphlet, or my great-grandfather, to be honest.

I place this here as an object of interest, or for those who like to solve mysteries, something with an undetermined provenance. It is also an interesting artefact in what it can reveal about how the military (or at least the publishers) of the day expected language to be learned and used in the context of war and in the context of the times. What follows are a few examples.

The pamphlet employs three categories: English, French, and Pronunciation (and in case one doesn’t know what pronunciation means, it is annotated with ”(say it like this)”!). The above is exemplary of the kind of military terms included, along with phrases such as ”Where is the enemy?” and ”Where is the cavalry?” – a stark contrast to military communication today. In fact, the placement of phrases can almost read like an imagined narrative; take the following for example:

Straight out of a movie, pre-GPS. And yet the reality of war was quite stark, thus the inclusion of a great deal of ”In Hospital” language. Morbid, realistic, hopeful to indeed believe that one might be able to speak at all (”I am wounded in the neck/nose/head/mouth”), considering that battlefield medical treatments at the time were rather rudimentary, particularly with regard to pain management.

This is quite a different medical vocabulary from that of the present day. Unable to figure out what ”lint” or ”la charpie” could be other than the stuff that comes off my clothing and is collected in the lint trap of a dryer, I looked it up in the dictionary, which revealed that scrapings of linen were used to dress wounds at the time. And despite all of this practical language there are a few glimpses of the poetic, or at least I like to think that naming the sun, moon, stars and earth in another language might be considered frivolous or romantic.

The twelve-page pamphlet doesn’t give any terminology for speaking on an interpersonal level; there are no words for expressing feelings. Would one really expect to find the expression for ”I am scared!”? What about the language of love? What would such a guide contain today? What does this say?

”Of any class”- Yes, these were different times indeed. And finally, this last example shows the only words the printer chose to have underlined in this text. Countless children are taught that these are the two most important words in English: ”please and thank you”.

The final page of the pamphlet advertises other ”Books for our Soldiers, Sailors and Red Cross workers” and leaves me pondering the easy use of ”our” both here and in the title… The use of ”our” expresses the patriotism and unity characteristic of wartime, completely unfamiliar and out of keeping with today’s world, where allegiances, membership, and belonging seem much more complicated. It is amazing how a single word can leap off a page and express quite a lot.

It is July, it is hot, and the corridors of this university are dark and echoing. But if you are reading this post and feel compelled to jump in and talk about it, please do. I welcome comments, ideas, discussion, anything that comes to mind.

You are our relative!!!

This is a sentence that almost always pops up in the flow of a conversation whenever it turns out that I am a foreigner, more precisely, a Hungarian in Finland. That I am from another country is very easy to presume since the common language is English. I have to admit that no conversation takes place without the following exchange:

-So where are you from then?


-Hungary?????!!!! Oh, then you are our relative!!!

And as the talk progresses, we get more and more into language matters. There are two options as to how our conversation will continue:


Option A – when the conversation partner is a Finn

… we really try to find something in common. We think hard, maybe spend long minutes brainstorming, and finally we end up with the following four words[1]:

kala – hal      veri – vér       käsi – kéz       vesi – víz

Clearly, that’s not much at all, especially if we are really relatives … (Nowadays some time is saved while brainstorming since I already have these four words in mind — just in case … ). Since we do not want to give up, we continue thinking. Suddenly, something comes to my mind, a memory that happened during my very first stay in Finland. I was sitting on a bus, travelling somewhere and looking out of the window. All at once, my attention was distracted from the beauties of the landscape to a conversation somewhere on the other side of the bus. To a conversation that made me smile although I could not figure out what the people were saying – but it sounded as if they were Hungarians, or at least talking in Hungarian. Because I wanted to find out, I moved closer and closer; when I was close enough (i.e. within hearing distance) I was “shocked”. No, they were neither Hungarians (or who knows?) nor were they talking in Hungarian. The language they used was Finnish, but it really sounded like Hungarian from a distance. So that was the moment when I realised that there is one more thing in common between Finnish and Hungarian and it is their rhythm.

When I tell this story, everybody involved in the conversation is relieved since we have a little bit more in common than just four words. (Something more on “related words” can be read under http://homepage.univie.ac.at/johanna.laakso/Hki/f-h-ety.html)


Option B – when the conversation partner is not a Finn (because the Finnish partner has already departed)

… it is always taken for granted by the other party that if Finnish and Hungarian are related languages, then it must be really  easy for me to understand Finnish, and it might be also very easy to learn the language. Unfortunately, I have to free them of that misconception. No, I do not understand Finnish just because my mother tongue is Hungarian, and it is also very difficult for me to learn it. It might well be that these two languages are the only related languages whose learning is not facilitated by being a speaker of the other one, as is the case with, for instance, Germanic and Romance languages.

And how is this question seen in Hungary? Frankly, I have no personal experience, but I have met somebody who had an interesting story, which had happened a few years before the Iron Curtain was lifted. At that time there were all sorts of restrictions on foreigners who wanted to visit the country, especially those from ”Capitalist” countries. But a big exception was always made for Finns. The Hungarians welcomed these far-flung “relatives” with open arms and without lots of red tape. For this reason one sometimes saw Volvos with Finnish number plates sharing the streets with the Hungarian Trabants and Wartburgs. There is, of course, doubt as to how much they actually understood of each other’s language.

Interestingly, a couple of weeks ago I started a conversation with a Hungarian friend of mine on the topic. He remembered that there is  a sentence that is the same (i.e. mutually understandable) not only in Finnish and Hungarian but in Estonian as well. The sentence reads:

Finnish:                                   Elävä kala ui veden alla.

Hungarian:                             Eleven hal úszik a víz alatt.

Estonian:                                Elav kala ujub vee all.

(Source: http://www.mari.ee/eng/articles/soc/2005/12/01.htm)

So thanks to my friend and to the website he gave me to consult, I have managed to extend my “Finnish-Hungarian repertoire” in preparation  for the next time  I am involved in a conversation about the relationship between the two languages.



This text is by no means a scientific one; it only shows some experiences as a language user of both languages.

Here I also wish to express my gratitude to Diana Metzner for her valuable contribution.


[1] I exclude those words here that have the same orthography but carry different meanings, like isi = daddy (Fin) vs. school (Hun, coll.), hinta = price (Fin) vs. a swing (Hun) or kuka = who? (Fin) vs. rubbish/dust bin (Hun).

Broadcasting a nexus point: late night radio.

“Good evening everyone, you’re in the right place at the right time. This is Coast to Coast AM. Coming at you, blasting out of the Mojave Desert like a sirocco, blazing across the land, into your town, into your home, slamming into your radio like a supercharged nano particle of dark energy. You’ve arrived at a nexus point, a crossroads of shadow and light, a phantasmic oracle  market place of ideas and blasphemies, grand melting pot of cultures and subcultures, from the benign to the bizarre, all on the same path searching for breadcrumbs of cosmic understanding and hoping we’ll be able to follow the trail back to where we started. Greetings from the boldest, bawdiest most outrageous city in the world, the planetary capital of sun, fun, sin, sex and secrets, my not so humble hometown, Las Vegas, Nevada. My name is George Knapp, your occasional host, your designated driver of the airwaves, and moderator of tonight’s upcoming cacophony of conversation. Glad to be with you once again.” (Sunday, August 23rd, 2009)

This is a live, spoken introduction to a radio program I listen to when sleep eludes me and I lie awake late at night. Back in Canada, I would tune in to a local channel on my clock radio and allow myself to be carried away by words. This live American late night talk radio program, Coast to Coast AM, is picked up by affiliates in the US, Canada, Guam and the Virgin Islands, airing nightly. It covers a vast array of topics including the paranormal, unusual science and technology, unexplained phenomena, and conspiracy theories. The format features bizarre news and interviews with guests, followed by calls from listeners with questions and stories related to the featured topic.

I never cease to be amazed by the way in which the nightly hosts navigate the fine line between belief and disbelief, and the way in which they manage interactions constructed around producing credibility. Looking at one of these interactions will be the topic of my next blog entry. But for now, I just wanted to share this introduction and think about the way this language creates a safe space for disclosure of what most people would find unbelievable. The show positions itself outside of what it refers to as “mainstream media” and, in doing so, needs to create a separate space. This introduction also creates a very vivid image of what radio is, but particularly, of what radio once was.

The language, imagery and delivery of this introduction push this bit of speech into the realm of performance. Each time George Knapp delivers this introduction it varies: in pitch, in rhythm, some parts are left out, some parts are altered, and the last line is always personalized. While addressing a vast and varied audience spread across a continent, the broadcaster explicitly and directly singles out “you” the listener in a familiar manner. The audience is personified as one being – united by a desire to seek “cosmic understanding”. There are two authors here, as marked by an early reference to the institutional host (Coast to Coast AM), but there is also the later and more personal introduction to the broadcaster and the place of broadcast (this program is broadcast with several hosts who reside in different parts of the U.S. and the original host occasionally broadcasts from Manila, Philippines). The program is described as “blasting out of the Mojave desert” (of Nevada) “like a sirocco” (a wind of great speed originating in the Sahara desert of Northern Africa). While this program enters homes across a continent, the introduction both personalizes it and localizes it.

There is so much I could say about this example; however, classroom interaction is my area of research, not broadcast talk, so I will leave this for the experts. I would like to draw one comparison, however, to a better known program which also employed a similar kind of introduction: The Twilight Zone. While the opening to this classic television series changed from season to season, according to Jeffrey Sconce, it “evoke[d] a sense of suspension, a ‘betwixt and between’ liminality that cast the program (and its viewers) as occupying an ‘elsewhere’ or even a ‘nowhere’” (Sconce 2000, 133). This objective is echoed in this introduction to Coast to Coast AM. Instead of “moving into a land of shadow and substance” one is arriving “at a crossroads of shadow and light,” an “oracle market place of ideas and blasphemies”. The language used here indexes the topic matter of the program, from “dark energy,” a hypothetical kind of energy, to “phantasmic oracles”. What will follow in the hours to come will be distinct from other talk radio programs and will employ unique terminology and discursive strategies for constructing convincing and realistic accounts. While The Twilight Zone took over the senses in a way that only early television purported to, this present day radio program is an updated attempt at this: a live program reaching those who are nocturnal, a voice in the darkness where the realm of the possible expands.

In fact, the introduction to the Twilight Zone was so effective, that it hasn’t been forgotten although the original series ended in 1964. To this day, “twilight zone” is a term used to describe a state of being where one is lost or not present in reality or a particular place or situation which is considered bizarre.

“You unlock this door with the key of imagination, beyond it is another dimension, a dimension of sound, a dimension of sight, a dimension of mind. You are moving into a land of both shadow and substance, of things and ideas. You’ve just crossed over into The Twilight Zone”. (Rod Serling, 1964, 4th season)


Sconce, J. 2000. Haunted media: Electronic presence from telegraphy to television. London: Duke University Press.

