Speed Reading (scientific) Literature

Related to some of our earlier thoughts on speed reading and literature, Nature has a fascinating article on the decidedly different nature of speed reading within the scientific community: “Literature Mining: Speed reading.”

(for context, see also: “text mining”)

In terms of scientific literature, there’s a lot out there. Too much to read and keep on top of, unless there’s some automation of that reading process …

“The use of computers to help researchers drink from the literature firehose dates back to the early 1960s and the first experiments with techniques such as keyword searching. More recent efforts include the striking ‘maps of science’ that cluster papers together on the basis of how often they cite one another, or by similarities in the frequencies of certain keywords.”

Keyword parsing may well be a good start, but that isn’t the same thing as reading and extracting actual meaning, is it? — “As fascinating as these maps can be, however, they don’t get at the semantics of the papers … The extraction of this kind of information is much harder to automate, because computers are notoriously poor at understanding what they are reading.”

Not being a scientist, I won’t pretend to understand all of the nuts and bolts involved. But the general concept, and what it might imply for what “reading” is — not to mention how such ‘reading’ is being done: by man? by machine? — raises interesting questions. For example,

“Semantic Web Applications in Neuromedicine (SWAN), one of a new generation of online tools designed to help researchers zero in on the papers most relevant to their interests, uncover connections and gaps that might not otherwise be obvious, and test and generate new hypotheses …

For each hypothesis in the system, SWAN shows the factual claims that support it, plus links to the papers supporting each claim. Because claims from the various hypotheses are linked together in a network, a user can browse from one to the next and see the connections between them. The visualization tool uses a red icon to show when two claims conflict and a green icon to show when they’re consistent, allowing the user to see at a glance which hypotheses are controversial and which are well supported by the literature.”

In a strictly non-scientific sense, it’s fun to imagine Italo Calvino-esque scenarios in which the same sorts of automation might be brought to bear on literary reading. What would a computer reading a novel for meaning come up with?

Most interesting of all is the end of that Nature article, which speculates about changing the very nature of the journal article and, more provocative still, asks: why bother writing scientific journal articles at all …?

“Analysing articles in new ways leads to the larger question of whether the articles themselves should change in structure. If an article is to be boiled down into machine-readable bits, why bother writing whole articles in the first place? Why don’t researchers just deal with statements and facts and distribute and mash them up to generate hypotheses and knowledge?

“Human commentary and insight are still extraordinarily valuable,” says Martone. “Those insights don’t immediately fall out of data without human ingenuity. So you need to be able to communicate that and that generally means building an argument and a set of supporting claims. These things are not going to go away any time soon.”


I run the ThinkLab at the University of Cambridge, and research digital habits, productivity, and wellbeing.
