Shop Talk | June 28, 2012

The ghost in the (writing) machine

by Celeste Ng

Not long ago, we talked about the phenomenon of robots writing books. But those computer-authored tomes (with scintillating subject matter like Saltine Cracker) were mish-mashes of text culled from Wikipedia and other websites. Computers can’t write actual stories.

Or can they?

Enter Narrative Science, a Chicago-based software company teaching computers to do just that—well, news stories, at least. Wired explains that articles written by the company’s computer algorithm are already out there:

The computer-written product could be a pennant-waving second-half update of a Big Ten basketball contest, a sober preview of a corporate earnings statement, or a blithe summary of the presidential horse race drawn from Twitter posts. The articles run on the websites of respected publishers like Forbes, as well as other Internet media powers (many of which are keeping their identities private). Niche news services hire Narrative Science to write updates for their subscribers, be they sports fans, small-cap investors, or fast-food franchise owners.

What do they sound like? The company’s website offers a snippet, transformed from a mundane spreadsheet of numbers into this:

Ryan Evans scored 22 points and grabbed six rebounds to lift No. 11 Wisconsin to a 64-40 win over Nebraska on Tuesday at Bob Devaney Sports Center in Lincoln. Evans and Jordan Taylor both had solid performances for Wisconsin (12-2).

Okay, so it’s not Jhumpa Lahiri—but it soon will be, at least according to the company’s CTO and cofounder, Kristian Hammond:

Hammond was recently asked for his reaction to a prediction that a computer would win a Pulitzer Prize within 20 years. He disagreed. It would happen, he said, in five.

In light of this no-Pulitzer year—ouch.

Fast Company provides some fascinating insights into how the program works:

First, the algorithm takes a huge amount of data and determines what information is significant and what is noise. This is done based on predetermined parameters so, for example, when writing a baseball recap, the algorithm knows that a strikeout in the eighth inning stranding a runner on third-base is important to the narrative, while a strikeout in the second inning with no one on base hardly bears mentioning. Next, the algorithm figures out what kind of narrative it wants to tell. Is it a come-from-behind victory? A shocking upset? And finally, it converts that information into language. This part gets pretty complicated, but once you have the right data and the right story structure, the rest is just MadLibs.
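For the curious, here is a toy sketch of that three-step pipeline in Python: filter the data, pick a story angle, then fill in a template. It is emphatically not Narrative Science's code. Every name in it (Play, is_significant, choose_angle, TEMPLATES, write_recap), the thresholds, and the placeholder teams are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Play:
    inning: int
    runners_on: int  # runners left on base when the play ended
    text: str        # ready-made sentence to drop into the story


def is_significant(play: Play, late_inning: int = 7) -> bool:
    """Step 1: separate signal from noise (hypothetical rule:
    keep only late-inning plays that stranded a runner)."""
    return play.inning >= late_inning and play.runners_on > 0


def choose_angle(winner_trailed_by: int, winner_was_underdog: bool) -> str:
    """Step 2: decide what kind of story this is (made-up thresholds)."""
    if winner_trailed_by >= 3:
        return "comeback"
    if winner_was_underdog:
        return "upset"
    return "routine"


# Step 3: the "MadLibs" part, one fill-in-the-blank lead per angle.
TEMPLATES = {
    "comeback": "{winner} rallied past {loser} {score}.",
    "upset": "{winner} stunned {loser} {score}.",
    "routine": "{winner} beat {loser} {score}.",
}


def write_recap(winner, loser, score, plays,
                winner_trailed_by=0, winner_was_underdog=False):
    angle = choose_angle(winner_trailed_by, winner_was_underdog)
    lead = TEMPLATES[angle].format(winner=winner, loser=loser, score=score)
    details = [p.text for p in plays if is_significant(p)]
    return " ".join([lead] + details)


# Placeholder data, not a real game.
plays = [
    Play(2, 0, "A second-inning strikeout came with the bases empty."),
    Play(8, 1, "An eighth-inning strikeout stranded a runner on third."),
]
recap = write_recap("Team A", "Team B", "5-4", plays, winner_trailed_by=3)
print(recap)
```

The second-inning strikeout is dropped as noise, while the eighth-inning one makes it into the story, just as the excerpt describes. A real system would presumably need far richer rules and much more varied language than three canned sentences.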

Fellow writers: are you afraid yet?

Further reading: