A Q&A with the Creator of "I Write Like": "The Algorithm is Not a Rocket Science"

by Katjusa Cisar

AND WHO DO YOU WRITE LIKE, DMITRY?

This week’s meme is I Write Like, a new website that uses an algorithm of mysterious methodology to tell you which author’s work your writing most resembles. You enter some text (“your latest blog post, journal entry, comment, chapter of your unfinished book”) and a split second later, it spits out the HTML code for a blog-ready badge: “I Write Like H.P. Lovecraft,” or any of the 49 other authors in its database. It’s hard science and great literature, together at last! Well, kind of.

I Write Like’s science has already been strung up and dissected: Gawker’s Max Read inputted Mel Gibson’s latest phone rant, got Margaret Atwood and came to an unfavorable opinion; Paste magazine got an “I Write Like Stephen King” badge after entering a few Big Boi rhymes; Margaret Atwood herself pasted in a sample of her own writing and got … Stephen King.

So take the site’s web address, iwl.me, as an indication of how seriously we should be taking its diagnoses.

Dmitry Chestnykh is the creator of I Write Like. He’s a 27-year-old Russian software developer living in Montenegro. His company, Coding Robots, also offers a blog-writing program and an application to keep diaries.

He answered a few of my questions via e-mail Thursday night, explaining how his algorithm is like a spam detector, how he plans to sustain the site beyond a short-lived meme, and why he’s totally unqualified to analyze writing but still thinks I Write Like is useful.

[A note: since English is not his first language, he asked me to fix any grammatical or style errors in his answers. He barely made any mistakes, predictably putting the typically pitiful American foreign language skills to shame. I just fixed an awkward construction here and there. Based on I Write Like’s calculations, by the way, Chestnykh’s writing style here is most like David Foster Wallace.]

How and why did you get into software development as a career?
I think I got my first computer at 13, and after I used it for a few months, I knew I wanted to write programs for it. It’s a lot of fun to have something made by you do something for you. While at university I launched my tiny software business and have been working on it full-time since then.

Where did you first get the idea for I Write Like? Was it an idea you discussed/developed with friends, or did you go it alone?
Late at night I was looking for ways to promote my software. I had tried a few marketing things before and was going through a checklist to find what I had missed. Then the idea of making a fun badge came to me. Since most of our (Coding Robots’) programs were about writing, I immediately thought of comparing people’s writing, and began coding. I hadn’t discussed it with anyone before putting it online.

What makes you qualified to analyze literature like this?
Nothing, really. I’m the kind of person who is not qualified in a subject before jumping into it. (Good thing I didn’t try to become a medical doctor or a rocket scientist!) This is my way of learning: when I want to do something, I do it, learning along the way.

Who are your favorite authors? Do you read more literature in English or Russian (or other languages)?
I think I read more literature in English. It’s hard to name my favorite writers because there are so many of them. To name a few: Gabriel García Márquez (unfortunately, I don’t know Spanish yet, so I read his works in Russian translation), Agatha Christie, Stephen King, Ernest Hemingway. But there are many authors whose works I haven’t had time to read yet.

How many authors are currently in the database? How did you decide which authors to include?
The current version includes 50 writers. The first versions included authors from the list of best-selling authors on Wikipedia, the most downloaded books from Project Gutenberg (a public library of out-of-copyright books), and the ones I could remember. Later versions included authors suggested by users.

When are you going to add explanations for the algorithm for each author? Why haven’t you included this already — why keep it secret?
I wanted to write a blog post about it, and to open-source the code, but haven’t had time for it yet, because I’ve been busy updating the program and handling all the traffic, emails and comments I received. Also, it’s really interesting to read how people try to explain the results they got.

Actually, the algorithm is not a rocket science, and you can find it on every computer today. It’s a Bayesian classifier, which is widely used to fight spam on the Internet. Take for example the “Mark as spam” button in Gmail or Outlook. When you receive a message that you think is spam, you click this button, and the internal database gets trained to recognize future messages similar to this one as spam. This is basically how “I Write Like” works on my side: I feed it with “Frankenstein” and tell it, “This is Mary Shelley. Recognize works similar to this as Mary Shelley.” Of course, the algorithm is slightly different from the one used to detect spam, because it takes into account more stylistic features of the text, such as the number of words in sentences, the number of commas and semicolons, and whether the sentence is direct speech or a quotation.

There are a lot of works in academia dealing with writing analysis, but I used none of them. I have been contacted by people who research this topic, and received a lot of pointers to interesting works. I’m sure I wouldn’t have been able to integrate and figure them out in the three days I had to write this thing, but I will definitely learn more about the subject to improve the program.
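
[A note: for readers curious what a Bayesian classifier of the kind Chestnykh describes might look like, here is a minimal sketch in Python. It is not his code, which he had not published at the time of this interview. It assumes a plain naive Bayes model with add-one smoothing, and it stands in for his "stylistic features" with two made-up ones: bucketed sentence length (LEN_n) and bucketed comma count (COMMAS_n). The class name, the function names, the labels and the training sentences are all invented for illustration.]

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Turn text into a bag of tokens: lowercase words plus coarse style markers."""
    feats = re.findall(r"[a-z']+", text.lower())
    for sentence in re.split(r"[.!?]+", text):
        words = sentence.split()
        if not words:
            continue
        # Hypothetical style features (assumptions, not the site's real ones):
        # sentence length in buckets of five words, and commas per sentence.
        feats.append("LEN_%d" % min(len(words) // 5, 4))
        feats.append("COMMAS_%d" % min(sentence.count(","), 3))
    return feats


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()               # label -> number of training texts
        self.vocab = set()                        # every token seen in training

    def train(self, label, text):
        """The 'This is Mary Shelley' step: count the tokens under a label."""
        tokens = features(text)
        self.token_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        """Return the label with the highest log-probability for the text."""
        tokens = features(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch.
clf = NaiveBayes()
clf.train("author_a", "The sea was dark. The boat was small. He rowed on.")
clf.train(
    "author_b",
    "It was, I confess, a melancholy evening; the rain, which had fallen "
    "steadily, showed no sign of ceasing.",
)
print(clf.classify("The fish was big. He held the line."))
```

[With only one short sample per label, the sketch leans heavily on the style markers: short, comma-free sentences pull toward the first label, and long, comma-heavy ones toward the second. A real system would train on whole books, which is also why a short or unusual input can land on a surprising author.]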

Really, what’s the point of “I Write Like”? Does it have a useful application or is it just for fun?
I didn’t think that there was a big point in it before launching. However, I’ve been proved wrong: it helps people discover and re-discover writers. There are so many comments like “I write like Ernest Hemingway. I have to read more of his books,” or “I write like Chuck Palahniuk. Who? Never heard of him, will read,” or “I write like Edgar Allan Poe. Never read anything by him, but now I think I will.” It is amazing that this tool can be used for education, so I plan to add information about writers and their books in one of the next versions.

I Write Like is going viral very quickly. Ultimately, what’s your goal with the site? How will you sustain it beyond a quick-flash meme?
I’m trying to expand the website to make it the destination for people to learn more about how to be a better writer. I will also add more information about writers, and maybe I’ll add features to help people discover interesting authors and books.

Will you be tweaking the software to read any other language besides English? You were immediately called out for having more male than female authors, but no one picked up on the apparent overwhelming majority of English-writing authors.
I planned to launch a Russian version, but postponed it because of the lack of time. Also, I’ve been offered help to make a Portuguese version.

I just finished reading Sam Lipsyte’s “The Ask.” He’s a master of simple, powerful, unusual sentences. I can see how he’s doing what he’s doing with language, but mostly I just want to bask in the magic and not analyze it too much in the moment. Has developing software like this changed how you read or lessened some of the magic in fine writing?
It has been only four days since I launched the website, and I haven’t read anything since the launch, so it’s a bit early to say if it changed how I read.

You’ve promised subscribers an “awesome” newsletter of writing tips and a free download of the 1898 how-to book “A Practical Treatise on the Art of the Short Story” by Charles Raymond Barrett. Why that book? What’s one of your awesome writing tips?
I’ve chosen this particular book because it had so many details in it and a good analysis of short story writing. Also because it’s out-of-copyright, so I can redistribute it freely, kudos to Project Gutenberg. 🙂 The newsletter is a part of the plan to convert “I Write Like” from a quick-flash meme to something sustainable and useful. I’m not a published writer myself, so I’m not qualified to give people tips (especially since English is not my first language). I will be the editor, and other more knowledgeable people will share their advice on writing. I hope the first issue will come out in August.

Katjusa Cisar is a freelance writer living in Atlanta.