Writing with the machine

I made something new: a plugin that provides inline text completions powered by an AI language model.

[Image: rnn-writer example]

Building this felt like playing with Lego, except instead of plastic bricks, I was snapping together conveniently packaged blocks of human intellect and effort.

One block: a recurrent neural network, fruit of the deep learning boom, able to model and generate sequences of characters with spooky verisimilitude. Snap!

Another block: a powerfully extensible text editor. Snap!

Together: responsive, inline “autocomplete” powered by an RNN trained on a corpus of old sci-fi stories.

If I had to offer an extravagant analogy (and I do) I’d say it’s like writing with a deranged but very well-read parrot on your shoulder. Anytime you feel brave enough to ask for a suggestion, you press tab, and …

[Image: rnn-writer example]
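If you’re curious what that tab press amounts to, here is a rough sketch of the editor-side half of the exchange, written in Python rather than the plugin’s actual language: take the text behind the cursor, send it to a local completion server, and show whatever comes back. The endpoint, port, and parameter names here are placeholders of mine, not the plugin’s real API.

    # Rough sketch of the editor-side half of the tab-press exchange.
    # Not the plugin's actual code; the endpoint, port, and parameter
    # names ("/generate", 8080, "start_text", "n") are placeholders.

    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    def request_suggestions(text_before_cursor, n=4):
        # Send the last stretch of text as the seed and ask for n continuations.
        params = urlencode({"start_text": text_before_cursor[-200:], "n": n})
        with urlopen("http://localhost:8080/generate?" + params) as resp:
            return json.loads(resp.read().decode("utf-8"))["completions"]

    if __name__ == "__main__":
        for suggestion in request_suggestions("The ship drifted toward the "):
            print("->", suggestion)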

If you’d like to try it yourself, the code is now available, in two parts:

- rnn-writer, the text editor package that provides the inline suggestions;
- torch-rnn-server, the companion server, built on torch-rnn, that runs the model and sends completions back to the editor.

You’ll find instructions for both tools on their respective GitHub pages, and if you have difficulties with either, feel free to open an issue or drop me a line.

Mainly, I wanted to share those links, but as long as I’m here I’ll add a few more things: first a note on motivations, then an observation about the deep learning scene, and finally a link to the sci-fi corpus.

The vision

From my first tinkerings with the torch-rnn project, generating goofy/spooky text mimicry on the command line, I was struck — almost overwhelmed — by a vision of typing normally in a text editor and then summoning the help of the RNN with a keystroke. (When I say “help,” I mean: less Clippy, more séance.)

After fumbling around for a few weeks and learning five percent of two new programming languages, I had the blocks snapped together; the RNN trained; the vision realized. And then my first hour playing with it was totally deflating. Huh. Not as cool as I imagined it would be.

This is an unavoidable emotional waystation in any project, and possibly a crucial one.

As I’ve spent more time with rnn-writer, my opinion has — er — reinflated somewhat. I am just so compelled by the notion of a text editor that possesses a deep, nuanced model of … what? Everything you’ve ever written? Everything written by all your favorite authors? By your nemesis? By everyone on the internet? It’s provocative any way you slice it.

I should say clearly: I am absolutely 100% not talking about an editor that “writes for you,” whatever that means. The world doesn’t need any more dead-eyed robo-text.

The animating ideas here are augmentation; partnership; call and response.

The goal is not to make writing “easier”; it’s to make it harder.

The goal is not to make the resulting text “better”; it’s to make it different—weirder, with effects maybe not available by other means.

The tools I’m sharing here don’t achieve that goal; their effects are not yet sufficient compensation for the effort required to use them. But! I think they could get there! And if this project has any contribution to make beyond weird fun, I think it might be the simple trick of getting an RNN off the command line and into a text editor, where its output becomes something you can really work with.
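To make that “simple trick” concrete, here is a minimal illustration of the arrangement: a tiny local HTTP server standing in front of a text generator, which the editor queries with a keystroke. This is a sketch in Python, not the actual server behind rnn-writer; generate_continuations is a hypothetical stand-in for sampling from a trained RNN, and the endpoint and parameter names match the placeholders in the client sketch above.

    # Minimal illustration (not the actual server behind rnn-writer) of
    # wrapping a text generator in a tiny local HTTP endpoint so an editor
    # can request completions with a keystroke. generate_continuations is
    # a hypothetical stand-in for sampling from a trained RNN.

    import json
    import random
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import parse_qs, urlparse

    def generate_continuations(seed_text, n=4):
        # A real implementation would feed seed_text through the trained
        # character-level model and sample a stretch of characters per suggestion.
        canned = [
            "the ship drifted toward the outer beacon",
            "a voice crackled over the intercom",
            "nobody remembered the old charts",
            "the stars went out one by one",
        ]
        return random.sample(canned, k=min(n, len(canned)))

    class CompletionHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            seed = query.get("start_text", [""])[0]
            n = int(query.get("n", ["4"])[0])
            body = json.dumps({"completions": generate_continuations(seed, n)})
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body.encode("utf-8"))

    if __name__ == "__main__":
        # The editor-side sketch above asks http://localhost:8080/generate?start_text=...
        HTTPServer(("localhost", 8080), CompletionHandler).serve_forever()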

Deep scenius

Like any tech-adjacent person, I’d been reading about deep learning for a couple of years, but it wasn’t until a long conversation earlier this year with an old friend (who is eye-poppingly excited about these techniques) that I felt motivated to dig in myself. And, I have to report: it really is a remarkable community at a remarkable moment. Tracking papers on arXiv, projects on GitHub, and threads on Twitter, you get the sense of a group of people nearly tripping over themselves to do the next thing — to push the state of the art forward.

That’s all buoyed by a strong (recent?) culture of clear explanation. My excited friend claims this has been as crucial to deep learning’s rise as the (more commonly discussed) availability of fast GPUs and large datasets. Having benefited from that culture myself, it seems to me like a reasonable argument, and an important thing to recognize.

Here are a couple of resources I found especially useful:

149,326,361 characters

Most of the energy in the deep learning scene is focused on what I’d call “generic” problems, the solutions to which are very broadly useful to a lot of people: image recognition, speech recognition, sentence translation … you get the idea. Many of these problems have associated benchmark challenges, and if your model gets a better score than the reigning champ, you know you’ve done something worthwhile. These challenges all depend on standard datasets. And these — datasets — are — extremely boring.

So, a large part of the work (and fun) of applying the deep learning scenesters’ hard-won technical triumphs to weird/fun objectives is tracking down non-standard, non-boring datasets. For me, decisions about the collection and processing of the text corpus have been more consequential than decisions about the RNN’s design and subsequent training.

The corpus I’ve used most is derived from the Internet Archive’s Pulp Magazine Archive: 150MB of Galaxy and IF Magazine. It’s very noisy, with tons of OCR errors and plenty of advertisements mixed in with the sci-fi stories, but wow there is a lot of text, and the RNN seems to thrive on that. I lightly processed and normalized it all, and the combined corpus — now just a huge text file without a single solitary line break — is available on the Internet Archive.
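For what it’s worth, the “light processing” amounts to roughly this kind of operation, sketched here in Python with hypothetical file and directory names: fold every run of whitespace, line breaks included, into a single space and concatenate everything into one continuous file.

    # Rough sketch of the light normalization described above, with
    # hypothetical file and directory names. The point is simply to fold
    # every line break into a space so the corpus becomes one continuous
    # stream of characters.

    import glob
    import re

    def normalize(text):
        # Collapse runs of whitespace (newlines included) into single spaces.
        return re.sub(r"\s+", " ", text).strip()

    with open("scifi_corpus.txt", "w", encoding="utf-8") as out:
        for path in sorted(glob.glob("pulp_magazines/*.txt")):
            with open(path, encoding="utf-8", errors="ignore") as f:
                out.write(normalize(f.read()))
            out.write(" ")  # keep adjacent issues from fusing mid-word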

So, in conclusion:

Snap. Snap. Snap!

May 2016, Oakland