All fogged up

Fun with numbers! Steven Berlin Johnson, author of The Ghost Map, has discovered the Fog Index. Or actually, he's discovered Amazon's Text Stats feature, part of the 'Zone's "Inside this Book" search engine. For select books, you can check out numerical aspects of an author's writing, such as the average length of his sentences or the average complexity of the words he uses (a "complex word," in this case, is defined as anything with three or more syllables).

Many newspaper journalists will instantly recognize these as the factors used in the infamous "Fog Index," developed by Robert Gunning as a scientific-seeming gauge of "readability," but in reality, a dread instrument of torture used by editors and writing coaches to intimidate reporters and columnists into writing very ... very ... simple ... prose. The simplest prose is the clearest prose, you understand. And the simplest prose is the most readable. And the more readable your prose, the more readers can read it. Maybe even will read it. And teenagers will flock to the paper and advertisers will return and all manner of things will be well once more.

In fact, the first things listed on the 'Zone's Text Stats charts -- before all the breakdowns of a book's total number of words and sentences -- are the three "readability" indices (the Fog Index, the Flesch Index and the Flesch-Kincaid Index). The figures in these three relate to the supposed grade school level or percentage grade which a reader would need to have attained in order to keep up with the author's shimmering prose. Book/daddy's Fog Index while working at The Dallas Morning News was generally around 9th grade, sometimes dipping to 8th grade -- which was considered acceptable. They liked to keep things accessible to people without a high school education. No wonder more serious readers have given up and fled to the internet.

Indeed, although editors and writing coaches will deny it, one possible result of a wholesale and heavy-handed use of the Fog Index, I would argue, is a dumbing down of a coerced journalist's writing. Coaches and editors insist that even the most sophisticated and abstruse of topics can be written about at, say, an 8th-grade level. And they probably can. Except it requires more space to do so. Breaking down a multi-tiered business merger or a breakthrough in quantum physics into simple sentences with simple words that even partial literates might understand will generally require more sentences to cover the topic. Or less of the topic will get covered -- because space, like management brainpower, is tight these days at daily papers.

But back to Mr. Johnson and his discoveries and The Calibrations of Literary Style.

Naturally enough for the author of five books, Mr. Johnson was fascinated by the data and its implications, and he promptly crunched the numbers on his own books as well as those of writers he admires, such as Malcolm Gladwell and Christopher Hitchens.

"I've always thought that sentence length is a hugely determining factor in a reader's perception of a given work's complexity, and I spent quite a bit of time in my twenties actively teaching myself to write shorter sentences. So this kind of material is fascinating to me, partially because it lets me see something statistically that I've thought a great deal about intuitively as a writer, and partially because I can compare my own stats to other writers' and see how I fare."

In these ironically lengthy sentences, Mr. Johnson makes an important qualification concerning "readability" and "complexity" -- that is, he doesn't declare that sentence length is a determining factor in weighing a writer's complexity. It's an influence on the reader's perception of it. To put things bluntly: I think "readability" is like "intelligence quotient." I don't think a writer's style can be counted and subtracted and summed up in a single number in any meaningful way. I think it's ridiculous. But as a crude approximation of what many readers may feel -- "This book is giving me a headache" -- it has its uses, I suppose.

Consider: Sentence length is not necessarily evidence of a work's "complexity." Lots of dense verbiage can disguise peanut-sized ideas. A convoluted sentence may convey not a complex thought but an inarticulate or confused one. And, of course, the opposite is true: Simplicity can be profound. The ripeness, as Edgar tells us in Lear, is all.

Mr. Johnson sticks to graphing non-fiction writers. But the inevitable direction -- taken in the comments to his post -- is to apply this to fiction writers. And here's where it's easier to see how the Index gets into trouble. What, after all, is "complexity" in a novel? Is it just the use of Latinate terms and subordinate clauses? What if a novelist conveys a complex sensibility in the most ordinary prose, over the long haul and through multiple characters, a sensibility that develops and changes as his narrative does?

And ta-dah! It turns out that, if we use sentence length and word length measurements and the Fog Index to determine literary complexity, we discover that Harold Robbins (the stats on Never Enough) and Jacqueline Susann (Once is Not Enough) are more "complex," less "readable" writers than either Ernest Hemingway (The Complete Short Stories) or Cormac McCarthy (The Border Trilogy). Both of the trash pioneers, Robbins and Susann, use more complex words than either, while Susann almost matches them in sentence length (10 words vs. 10.3 for Hemingway and 11.1 for McCarthy).

OK, so maybe such a comparison is a little rigged. Hemingway and McCarthy are masters of some of the leanest, tightest prose around; inevitably, they'd skew the results and make the dreckmeisters look high-falutin'. But then there's this stunner: Susann even beats Donald Barthelme in sentence length, while Robbins beats him in complex words (The Dead Father).

I can go on like this all day. Who has perhaps the most lapidary, the most involved prose style in English this side of Henry James? Joseph Conrad. And in fact, the stats on Nostromo beat everyone else here all to hell -- a 12.2 grade level on the Fog Index, 12 percent complex words and a whopping 18.3 words per sentence. As one might expect, Victorians and early modernists generally score extremely high in these rankings (although I'm still looking for an English or American novelist to beat Tristram Shandy's inspiring record of 36.8 words per sentence).
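For anyone who wants to check the arithmetic behind a figure like that: Gunning's formula, as it is usually published, is 0.4 times the sum of the average sentence length and the percentage of complex words. Whether Amazon computes its number in exactly this way is an assumption here, and the function name below is made up purely for illustration. A minimal Python sketch:

```python
def fog_from_stats(words_per_sentence: float, percent_complex: float) -> float:
    """Gunning Fog grade level from two summary statistics.

    Uses the commonly published formula:
    0.4 * (average words per sentence + percent of complex words).
    """
    return 0.4 * (words_per_sentence + percent_complex)


# Nostromo figures as quoted in the text:
# 18.3 words per sentence, 12 percent complex words.
nostromo = fog_from_stats(18.3, 12)
print(round(nostromo, 2))
```

That works out to roughly 12.1, a shade under the 12.2 quoted for Nostromo. The gap is presumably down to rounding in the published percentages, though that is a guess.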

So guess who beats Conrad when it comes to sentence length -- 19.1 to 18.3? Huckleberry Finn. And he's a 14-year-old American hick. (It's his vocabulary that is much simpler.)

My point remains: Yes, in general, simpler words, fewer words per sentence will make for clearer prose. In general. Whether that prose is therefore more "readable" or less "complex" is another matter. And knowing that an author uses especially long sentences may be illuminating in some analysis. But to use these faulty and limited tools to measure much beyond that -- for judging literary style, especially for anything as vague as "readability" or "complexity" -- is an exercise to be approached with great wariness and many hedging and qualifying remarks.

Samuel Beckett, for example, wrote prose that is so honed and tight and pared away, it makes Hemingway sound like Chatty Cathy. He often wrote sentences like these -- "The old sunlit face. Tableau vivant if you will. In its way. All is silent from now on... All shadow here. Slow fade of afterglow. Night without moon or stars." But then he also linked together run-ons like these: "There's a way out there, there's a way out somewhere, the rest would come, the other words, sooner or later, and the power to get there, and the way to get there, and pass out, and see the beauties of the skies, and see the stars again."

In the end, bone-dry Beckett is positively loquacious: His Stories and Texts for Nothing beat even Joseph Conrad when it comes to sentence length -- 20.3 words.

But now for a practical application of all this newfound numerical wisdom. As noted, book/daddy's Fog Index grade level was a low 9/high 8. But if any journalists are reading this, any reporters who are working for publications that enthusiastically employ the Fog Index, there is a very easy way to shade things. The Official Book/Daddy Method for Gaming the Index (patent pending) is guaranteed to sink and/or keep your numbers down! Impress your style coach!

It's simple. To calculate the index, editors rarely use entire newspaper articles, especially if the articles are lengthy features -- counting all those words and syllables is a painstaking chore. Editors generally pick two or three samples and count the first few paragraphs, at most, because one claim of the Index is that an author's prose style, over time and over the length of a decent-sized article, is more or less consistent. No need, therefore, to count every word in every paragraph. Just a sampling is required.

So: If you could access my archive of articles at the DaMN's website, you would find that in many instances, my lede paragraphs will feature long, complicated sentences, such as this one, but only as needed. And then somewhere else in those opening graphs, a sentence fragment or two will appear. For emphasis. Really. No, really. And the fragment will use short, blunt words, to boot.

And there you have it. You've just Gamed the Index. Having snuck past the Prose Guardians, you can go on and write freely and merrily for the rest of your article.
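To see roughly why the trick works, here is a small Python sketch that scores a sample of text and compares a fragment-padded lede against the whole piece. Everything in it is an assumption for illustration: the function names, the two made-up passages, and especially the syllable counter, which simply counts runs of vowels and will miscount plenty of real words. It is not how any newsroom tool or Amazon actually does the counting.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of vowels (a heuristic only)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fog_index(text: str) -> float:
    """Gunning Fog: 0.4 * (words per sentence + percent of 3+ syllable words)."""
    # Sentence fragments count as sentences, which is the whole point.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (words_per_sentence + percent_complex)


# Hypothetical lede: one ordinary sentence padded with fragments.
lede = ("The merger was complicated and the lawyers were expensive. "
        "For emphasis. Really. No, really.")

# Hypothetical body: the kind of sentence that follows once the sample ends.
body = ("Institutional shareholders subsequently interrogated the "
        "extraordinarily complicated restructuring of the subsidiary's "
        "considerable liabilities throughout the afternoon.")

print(f"lede only:     {fog_index(lede):.1f}")
print(f"whole article: {fog_index(lede + ' ' + body):.1f}")
```

The lede sample should score well below the whole text, because each fragment counts as a full sentence and drags the average sentence length down. An editor who samples only the opening paragraphs sees only the lower number.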

But just remember. Sentence fragments. Upfront.

UPDATE: A friend has pointed out that Book/Daddy's Official Method for Gaming the Index is already out of date (my experiences with the dread Fog Index go back a few years). Nowadays, editors don't have to count your syllables and sentence lengths. Many newspapers have software that has built-in programs for just such chores. Or the editor can simply copy the text of an article and transfer it to Word 2000 to do all the counting for him.

So skewing the sample in only your first few paragraphs may not let you slide past the overseers any longer. Check how they operate in your office before trying the book/daddy method.

Void where prohibited by law. Not responsible for any goods left in car. Park and lock it.

October 23, 2007 4:22 PM




About this Entry

This page contains a single entry by book/daddy published on October 23, 2007 4:22 PM.


Creative Commons License
This weblog is licensed under a Creative Commons License.