February 13, 2017

The Interpretive Poverty Of Data

Stanley Fish

This brief essay was delivered as a response to a paper co-written by Justice Thomas R. Lee and James Phillips, Data Driven Originalism, at the Originalism Works-in-Progress Conference held at the University of San Diego on Feb 16-17, 2018. The desire to generate human meaning by eliminating from the patterns that convey it all traces of the human is at once perennial and doomed to be forever unfulfilled.

Back in the seventies there was a fast-growing sub-discipline that promised to conjoin linguistics and literary criticism in a way that would provide an objective basis for the interpretation of texts. It was called stylistics and I am pleased to say that I pretty much killed it by writing two essays, “What Is Stylistics and Why Are They Saying Such Terrible Things About It?” and “What Is Stylistics and Why Are They Saying Such Terrible Things About It? Part Two”. In those essays I said that stylistics failed in two respects. More often than not, the mountainous machinery it typically cranks up labors to produce something less even than a mouse; you wade through a whole lot of charts, distribution patterns, selection patterns, contiguity patterns and find waiting for you at the other end something that would have been obvious from the get-go to a ten-year-old. And, on the other hand, if at the end of the whole business there is an interpretive insight that seems new and arresting, its relationship to the operations of the analytical machine is entirely arbitrary. Either you do all that counting and sorting and come up with a pompously inflated version of what you had at the beginning, or you do all that counting and sorting and then attach to the resulting data an interpretation it does not generate; to be sure, the interpretation may be one the data can be made to support, but only because the data, just sitting there in all its empty bulk, can be made to support anything.

Consider, for example, the case of Louis Milic, a computational stylistician, who studied the relative frequency of initial determiners and initial connectives in the sentences of Swift, Macaulay, Addison, Gibbon and Samuel Johnson. He discovered that Swift, to a much greater degree than the other authors, began his sentences with connectives and he concluded that “The low frequency of initial determiners, taken together with the high frequency of initial connectives, makes [Swift] a writer who likes transitions and makes much of connectives”. That’s the banal fruit of the analysis. The arbitrary fruit comes a bit later when Milic notes that it was Swift’s habit (as it is in fact mine) to extend the length of his sentences by producing a series of appositional phrases that threatens never to end. His conclusion? Swift’s “use of series argues a fertile and well stocked mind”. Why not say that Swift’s use of series argues an empirical rather than an abstract cast of mind, or that Swift’s use of series argues an anal-retentive personality, or that Swift’s use of series argues an unwillingness to let go of a thought? These and a thousand other interpretive conjectures will fit the data Milic assembles, but there is no genuinely motivated route from the data to any one of them. As a statistical output, Swift’s use of series has the status of an ink-blot in a Rorschach test. The researcher/analyst asks, what might that mean, do you think?, and the answers are immediate and infinite.

Behind Milic’s efforts is the assumption, hard to dislodge no matter how many times it has failed to cash out, that banks of data, especially data compiled disinterestedly, can yield interpretive conclusions; and the further assumption is that the conclusions thus yielded will be more objective, because less impressionistic, than the conclusions reached by a single interpreter who, because he or she is a finite, limited creature, can only survey a finite and limited number of texts and will therefore always be working with an inadequate and distorting sample. This is an important claim, for after all why go to all the bother of getting the machine going if there’s no payoff? The payoff has to be some improvement in interpretive methodology, an improvement that responds to the common fears of interpretation’s being biased, partial, manipulative, and willful. Stefan Th. Gries, a leader in the corpus linguistics movement, puts the matter forthrightly when he says that the “assumption underlying most corpus-based analysis is that formal differences reflect or correspond to functional differences” and adds that “functional difference is understood … as anything -- be it semantic or discourse pragmatic -- that is intended to perform a particular communicative function”. In short, it should be possible -- and this is the only reason someone other than a descriptive linguist would be interested in corpus linguistics at all -- to map formal differences onto content differences, to correlate features of a text described independently of any interpretive hypothesis with a particular interpretation.

It isn’t possible and it never will be.

What this means is that while Thomas Lee and James Phillips’s paper might be interesting for several reasons, one of those reasons won’t be that its methodology will help us to determine the meaning of legal texts. But before I explain and defend that large assertion, let me briefly note some of the problems I have with the paper in general. First, Lee and Phillips fail to take the measure of the intentionalism they casually dismiss and their failure takes the form of a misrepresentation of the intentionalist thesis. Lee and Phillips rehearse the now standard story that intentionalism had been discredited in favor of the doctrine of ordinary meaning or Original Public Meaning. They explain that “Original Public Meaning originalists turn the focus away from framers intentions and toward the public’s understanding of the text.” But this statement assumes what intentionalists would dispute: that there is a text to be “focused on” independently of the assignment or assumption of intention. The entire Original Public Meaning project, and therefore the corpus linguistics project, depends on the possibility of identifying the text as a text (rather than some abstract scribbles) before any inquiry into the intention informing it, and that is what intentionalists say cannot be done. That is, for example, what Larry Alexander says in many places and especially in the essay Lee and Phillips cite, Is that English You’re Speaking: Why Intention-free Interpretation is An Impossibility.
The subtitle of the paper might be rewritten to read, Why Intention-Free Identification of Something as a Text Isn’t Possible; for in this and other essays, Alexander, along with other intentionalists, argues not simply that you can’t interpret the text without recourse to a governing intention, but that there is no text until a specification of an intention is in place.

In a subsection entitled, Texts Cannot Declare that they are Texts, Alexander and his co-author, Emily Sherwin, declare that “one cannot look at the marks on a page and understand those marks to be a text (i.e. a meaningful writing) without assuming that an author made those marks intending to convey a meaning by them.” That is, absent an intention, either assumed or specified, what you will have are black marks and white spaces. It is only when one thinks that someone meant something by these marks that one searches them for patterns conveying a message. So it is when smoke appears on a distant horizon: its forms can become the object of interpretation only when it is assumed that the sequence of puffs is designed. Once that assumption is in place, then you can ask, what does the sequence of puffs mean (it will only be a sequence rather than something random if that condition is met); but without that assumption in place, the question of meaning does not and cannot arise. The same argument holds for individual words. If I come across these markings – HELP – on a sidewalk and wonder what they mean, my wonder and my attempt at interpretation will both cease if I discover that the marks were made by dirty rain water dripping from the gutters of an adjacent building (unless you think that buildings have intentions and can signal distress at having been allowed to deteriorate). It is not simply that seeking intention is the best way to interpret a text; rather it is that if you don’t seek intention (or assume intention) you will not have a text.
Now, one might find Alexander’s argument persuasive or unpersuasive, but it cannot be characterized as Lee and Phillips characterize it when they say that it questions “our ability to resolve ambiguities in the meaning of a given text without resort to authorial intention”. But a “given text” – the text just sitting there detached from authorial intention or institutional history, the text that might or might not be ambiguous – is what Alexander denies the availability of. It is the intention-free existence of the text, not its being ambiguous or unambiguous, that Alexander calls into question.

Why do Lee and Phillips misrepresent the intentionalist position? Because that position, taken seriously, means that textualism in general and corpus linguistics in particular are nonstarters if the game being played is the game of interpretation. Lee and Phillips report, quite accurately, that the majority of citizens in the originalist’s universe are now Original Public Meaning textualists. This undoubted fact is a statistical measure of what the majority of those people in the field believe; it has nothing to do with whether that majority is right or whether its methodology is coherent or whether it is the methodology interpreters are actually following. Indeed, intentionalists characteristically say (of Justice Scalia and others) that textualism is a method that cannot be followed and that self-identified anti-intentionalists enact intentionalism even when they rhetorically repudiate it. Lee and Phillips present an example when they describe Justice Thomas’s dissent in Kelo v. City of New London as an instance of a judge relying on original communicative content -- that is, a judge being a textualist. However, they are honest enough to note that Thomas cites “early state practices” that he claims “shed light on the ... meaning of the ... words contained in the Public Use Clause.” What kind of light is that? It’s certainly not light shed by an examination of the words; it is the light shed by an understanding of what state legislatures typically had in mind when they exercised their eminent domain power. The sequence of reasoning is as follows: here are some words in the Public Use Clause; in order to determine what they mean let’s look at the way legislatures of the period typically thought about these issues, what they were concerned to promote or prevent.
That’s intentionalism, pure, simple, and unavoidable. Thomas is not slipping here from Original Public Meaning textualism to a suspicious intentionalism; he is doing what everyone must do, whether the text is a clause in the Constitution, or a grocery list, or a lyric poem, or a word that appears on a sidewalk – trying to figure out what someone meant by these marks.

This is all to say that except in narrow and carefully defined circumstances (which I will describe later) Original Public Meaning or ordinary meaning doesn’t get you anywhere. I am not denying either that Original Public Meaning is a thing or that specifying it, by whatever means, might be occasionally and limitedly helpful. I am only saying that it is not a magic key to interpretation and indeed is not a key at all.

What then is it? It is in fact a construct, a put-together set of mark-meaning correlations that are statistically predominant in a particular population of speakers at a particular time. But these correlations, in and of themselves, are of no more interpretive help than the raw data of frequencies and distributions. You can know the shape of public meaning, just as you can know that Swift often begins his sentences with connectives, and still be unable to proceed to any legitimate interpretive conclusion. Original Public Meaning is interpretively inert; the mere fact of it doesn’t get you anywhere, just as the mere fact that Swift makes extensive use of appositional series doesn’t get you anywhere. In both contexts what is missing and required is a determination of what an agent intended to signify by those patterns. Let’s say that in the dictionary of Original Public Meaning a certain word is associated with a certain meaning; but having specified that, nothing interpretive follows unless an audience knows that Original Public Meaning is the code the author is deploying. The phrase “Original Public Meaning” is deceptive because the word “meaning” suggests that meaning is what it delivers. But what it delivers is data; what it captures are some observed regularities, but those regularities are of no more interpretive significance than randomly occurring puffs of smoke. It is only if you know that the author has tied himself to the code of Ordinary Public Meaning that knowledge of that code would be interpretively relevant.

But how do you know that? The question is urgent because, as Alexander often says, the text won’t tell you what language it is written in. That knowledge, in the absence of which interpretation cannot begin, must come from the outside, from some persuasive indication -- non-textual -- of what the author had in mind, of what he or she or they intended. Data analysis might be able to tell you whether x is or was the ordinary meaning of y, but you’d still be faced with the question, was it ordinary meaning the author was deploying? The bottom line is that while corpus linguistics can help specify what the Ordinary Public Meaning is or was, it cannot help you take the next step, the step of interpreting; for that you need the intentionalism corpus linguistics dismisses. If it is somehow determined that the speaker/writer intended to deploy the code of Original Public Meaning, Original Public Meaning can help guide interpretation. If that intention is not in place, the fact that there is something called Original Public Meaning -- again, I stipulate to that fact -- is no more interpretively interesting than the fact that the speaker/writer is six feet tall or the fact that he or she likes coffee ice cream. Lee and Phillips say that “by tabulating the relative frequencies of different senses of a word or phrase within a corpus, a linguist can . . . discern the more common sense of a given term in a given linguistic context.” But there is nothing a linguist can do with what he has discerned unless the intention of the author to tie himself to the common sense of a word has been established by nonlinguistic means.
Original Public Meaning is a resource, not a constraint; it is an option speakers and writers are not required to choose; the constraints come in, if they do, when the intentionalist question -- Is that Original Public Meaning you’re speaking? -- has been answered. In and of itself corpus linguistics is an interpretive dead-end.

Let’s test this out by considering Lee and Phillips’s examples.
                               
Lee and Phillips’s first example is the phrase “domestic violence” as it appears in Article IV of the Constitution where the government is assigned the task and duty of protecting the democracy from foreign invasion and “domestic violence.” The corpus linguistics question is how do you know the “domestic violence” here means violence committed by militia men like Timothy McVeigh rather than the violence committed by one spouse or partner against the other. The obvious answer is that anyone who has gone beyond the sixth grade knows that; but that answer will not be accepted by corpus linguists because it does not have any statistical or numerical backup. So, Lee and Phillips proceed to their data mining operation and, after a lot of work, say triumphantly that their method has “confirmed the intuition” that they and we had in the first place. “Our data demonstrate that domestic violence today is almost always used in reference to an attack on a member of a person’s household, but was a reference to an insurrection or rebellion in the late 18th century.” Shades of saying that Swift’s preference for connective transitions over determiners means that he is an author who likes transitions.

In a second example utilizing mind-boggling charts and tables, Lee and Phillips perform the same operation, in which something obvious is put through a machine that huffs and puffs and delivers exactly what it began with. It’s like the transporter device in Star Trek that breaks down your molecules and then reassembles them in exactly the same form on the other side. (The protagonist of the movie “The Fly” was not so fortunate.) In this second case, the object of analytical attention is the word “commerce” as it appears in Article I of the Constitution, where Congress is given the power to “regulate commerce.” The question is, what does “commerce” mean? You might think that commerce means trade, although, of course, “trade” itself can be understood in both narrow and expansive ways. Lee and Phillips also think that commerce in this context means “trade” but they want to support that intuition by lexical analysis performed by a powerful search engine on a massive database. They proceed by thinking up some of the other things that “trade” might have meant or could possibly mean, and then they search the frequency with which the word “trade” is collocated with some of those other things.
So they pair off or face off “trade” and “manufacturing” as two potential meanings of “commerce” and find that “trade” “shares six of thirty collocates with commerce” (don’t ask!), while “manufacturing” “shares just two.” Their conclusion: “the fact that commerce and trade have more overlap in their collocate network than “commerce” and “manufacturing” do is evidence that the “trade” sense is likely more common than the “manufacturing” sense.” Louis Milic lives. Lee and Phillips then say, in a truly terrifying statement, that “further research could be done,” for, “the value of a corpus is the ability to slice and dice context to get to the most relevant semantic context.” I would ask, relevant to what? As far as I can tell, there is no semantic context in their analysis -- all there is is the tabulation of frequencies -- and no way of getting from their analysis to a semantic context.

It is important to understand what Lee and Phillips’s analyses prove and do not prove. They do prove (or strongly suggest) that in the code of ordinary meaning as it existed in the late eighteenth century “commerce” in the phrase “regulate commerce” meant “trade.” They do not prove that this is the meaning the framers chose, for it remains possible that they chose a meaning for “commerce” that departed from the statistical regularities Lee and Phillips uncover; chose a meaning, that is, which departs from Ordinary Public Meaning. They were certainly free to do so – it is always the speaker who nominates mark-meaning correlations – and evidence that they did or did not will not be provided by the data Lee and Phillips have so laboriously assembled. In short, the fact that there is an Original Public Meaning doesn’t compel a speaker/writer to use it, and the determination of whether he did or not cannot be made by consulting the text.

The third example cited by Lee and Phillips is the phrase “public use” as found in the Constitution’s Takings Clause. The question is, does “public use” mean use related to public government purposes (military, economic, education, etc.) or can it be interpreted, as the majority in Kelo does, to mean private uses of which the public government approves? Lee and Phillips are on the side of Justice Thomas, who believes that the state errs in adopting the more expansive definition of public use, and I agree with them. Lee and Phillips dutifully fire up their machine once again and find that “the direct sense that Justice Thomas argued for is much more common than the broader, indirect sense that the Kelo majority adopted.” But again, the fact that the direct sense is statistically predominant in the period says absolutely nothing about what sense the founders chose. Information about that must come from elsewhere, perhaps from an argument that foregrounds the founders’ concern to divide the public and the private, a concern in the light of which you might conclude that what they had in mind -- what they intended -- when they wrote the Takings Clause was the maintenance of the public/private distinction.

At one point in their analysis of the word “commerce”, Lee and Phillips wonder if perhaps “commerce” was the wrong linguistic unit to begin with. “Maybe we need to search for instances of the regulation of commerce by the government.” They withdraw from their own suggestion because it “demands too much semantic context.” They realize that in order for their methodology to work, it must begin (as Chomsky tried years ago to begin) with no hostages to semantics at all; it must be purely formal, a matter of assembling the data independently of semantic contexts or suppositions and then proceeding from there to a genuinely motivated interpretation. In short, they know at some level that beginning with semantics will render their countings worthless because the interpretive conclusions those countings supposedly generate have already been put in at the beginning. Here, then, is the corpus linguist’s choice. Either stick to your method and have nowhere legitimately to go after the data has been assembled, or compromise the method by grounding it in a semantic hypothesis and end up with a product that is nothing more than your original input tricked up in numbers.

Toward the halt of their paper, Lee together with Phillips display an attractive modesty. Although, they stay confident that “the work of corpus analysis” tin live on “a cardinal chemical constituent of the foremost phase of whatsoever originalist inquiry,” they do admit that it is non quite clear what happens after this foremost stage. “We are less sure of the precise role [corpus analysis] should play.” In a recent paper, Stefan Th. Gries together with Brian G. Slocum (two leaders inwards the field) display an fifty-fifty to a greater extent than robust modesty: “corpus analysis cannot yesteryear itself render conclusive meanings to legal text” together with in i lawsuit again “corpus analysis tin render valuable insights virtually linguistic communication usage but cannot yesteryear itself resolve normative issues.” That is it cannot straight us to what words together with phrases mean. But I idea that was why nosotros were going to all that problem inwards the foremost place.

Can corpus analysis do anything? Yes it can. It can reveal verbal patterns the naked eye, or the merely intuitive reader, would never see. And those patterns, often deeply embedded in an author’s stylistic presentation, can serve as a means of identifying him, of fingerprinting as it were. Larry Alexander and I have been talking about collaborating on an op-ed about current campus protests, a subject on which we have similarly Neanderthal views. Now let’s suppose Larry and I do manage to write this piece, and for some unfathomable reason an analyst wants to figure out which of the sentences in it were written by me and which were written by Larry. No doubt both Larry and I have verbal tics of which we are largely unaware, tics that have no particular meaning except the meaning that it is we who perform them. Corpus analysis would disclose those tics and therefore would be able to distinguish Larry sentences from Stanley sentences. But that’s all it could do; as for contributing to a determination of what Larry or Stanley means by those sentences, forget about it.

To be sure, there are more serious uses to which a fingerprinting analysis might be put. It has been speculated that Shakespeare had a collaborator on some of his later plays. Did he, and if so who was it? There are poems whose authorship is in dispute between John Donne and Ben Jonson. Who actually wrote them? This is a serious question in literary history and it is one corpus linguistics could help answer, and if it did there would be an interpretive payoff, for you would now approach the poem and receive its words and phrases within the knowledge that it was Donne or Jonson, each with his distinctive concerns, obsessions, and visions, who had designed it. Real interpretation could then begin and corpus linguistics would have brought you to that moment, but it would itself have done no interpretive work; it would have just cleared the ground in a way that allowed interpretive work to proceed. That’s not nothing, but it isn’t a contribution to interpretation, just as a modern tennis racquet will help you to hit a winner -- it puts you in a position to do the job -- but doesn’t itself do the job. So there’s a use, and for some an important one, of corpus linguistic analysis. And as I have already acknowledged, corpus linguistic analysis can also help us to determine what the contours of Original Public Meaning or ordinary meaning were at a particular time. But beyond that, I don’t see its utility and therefore I don’t see why the enormous expense and Casaubon-like erudition involved in learning how to do it is worth the candle.

Lee and Phillips and Gries and Slocum might respond by saying two things: (1) they might say that although the way we humans produce and receive meanings is quite different from the digitizing and retrieval processes of corpus linguistics, there is no reason why corpus linguistic methods cannot elucidate meanings produced by other means. But that’s like saying one could discover and transcribe the physical patterns of movement enacted by players in a football game without knowing that the moves you are describing are moves within that game, and have something interesting to say. If all you have is the catalog of physical actions performed in a sequence by 22 persons you can never get from that catalog to any statement of what those actions mean. Once you detach patterns from the intentional context in which they have significance, you can’t get the significance back; and (2) they might say, as the stylisticians of the 1970s said and the digital humanists of the present day are saying, our subject is new; don’t expect us to have all the answers at the beginning: but in time, trust us, this methodology will deliver what we seek, a legitimate relationship between the amassing of data and interpretive conclusions. To which I would respond by invoking one of my favorite lines from Shakespeare’s King Lear: “Never, never, never, never, never.”

Stanley Fish is the Davidson-Kahn Distinguished University Professor of Humanities and Law at Florida International University. You can reach him by email at fishs at fiu.edu



