Recently NPR ran a rather interesting competition of man versus machine. This 21st-century version of the John Henry legend pitted NPR reporter Scott Horsley against Automated Insights’ Wordsmith, an algorithm designed to assemble original news stories by accessing data residing in the cloud. What this little experiment demonstrated is not what you might expect. It did not show that the machine is somehow better than or even just as good as the human reporter. Instead, it proved that these programs are just good enough to begin seriously displacing this kind of labor.
Although journalists are correct to be concerned about the future of their careers, I believe this technology will also have a significant impact on the university and academic publishing. This will be especially true in the natural and social sciences, where data-driven research is the name of the game.
First, for many researchers, it is the experiment and the data it generates that are of primary importance. The journal article or research report is often a secondary effort necessary to communicate findings to the scientific community. These reports generally follow a standard compositional format and are often produced by graduate assistants and junior members of the research team. It therefore seems a “no-brainer” to move some, if not all, of this activity to natural language generation algorithms, especially if doing so means more time in the laboratory and less time spent with Microsoft Word.
Second, the style and structure of the journal article make it a perfect candidate for computational processing. Like sports journalism, which was one of the first areas to capitulate to this machine incursion, academic writing follows a rather strict and predictable formula: Introduction, Review of Literature, Methodology, Results, Discussion, and References. In fact, it could be argued that human authors of scientific articles already follow the instructions of a compositional algorithm and that a machine not only could do the same but might even be better suited to adhering to the stipulations of the formula.
Third, and because of this, machine-written reports could have a slight edge in the peer-review process. As the competition run by NPR illustrates, the computer-generated story is clear and concise, without a lot of poetic embellishment such as metaphor, allusion, and word play. And a recent study conducted by Christer Clerwall of Karlstad University in Sweden found that software-generated content was perceived to be descriptive and boring but (and perhaps because of this) was judged to be slightly more factual, credible, and objective.
Finally, there are some important consequences here for professional academics and the university. First, the culminating pièce de résistance of pursuing an advanced degree is the doctoral dissertation. No matter the field of study, the process of preparing, writing, and publishing this document has been the universal rite of passage. What will become of this experience and expectation when some, and perhaps even a significant portion, of the dissertation can be outsourced to a machine? Will we be able to accommodate this technology to the dissertation writing process, as we have done with other tools such as the word processor, automatic footnoting, search engines, and spell check? Or will it be perceived as a fundamental alteration in the very process itself?
Second, faculty promotion and tenure are often tied to research productivity. How will natural language generation algorithms affect these evaluations? What will be the impact of this technology on journal impact factors and the other metrics of faculty performance? If a conference paper or journal article produced (in whole or in part) by a machine wins an award, who (or what) should receive the accolade? And will the use of these systems eventually create a “digital divide” separating those who have access to and can use this technology from those who do not or cannot?
No matter how you look at it, natural language generation algorithms are here, and they will have a significant impact on our work and professional lives. This is no longer science fiction. It is, or will soon be, science.