A Digital Collaborator: The Future of Search

In 1941, Jorge Luis Borges published a short story about an unending library composed of hexagonal rooms. The library contains every book ever written, every book that will be written, and every book that could be written, in all languages. One book contains a detailed history of the future. Another describes the true story of your death. There’s commentary on the gospel of Basilides, and commentary on the commentary.

The library contains books filled with every possible combination of letters—one book has the letters M C V repeated from start to finish—rendering most books nonsense. Some residents diligently search for a perfect index of the library, but it’s a quixotic search. How could they distinguish the faithful catalogue of the library from the innumerable false ones?

Borges’ short story, The Library of Babel, is an eerie illustration of a problem we encounter every day: information overload. We live in an era where information, once scarce and expensive, has become a commodity. And while access to more information is a good thing, it often comes at the expense of having to sort through heaps of gibberish. In a way, we’re all living in the Library of Babel.

Or are we?

If you trace the history of information, from the first spoken languages to the Internet, you’ll notice that each time we invent something that spews more information into the world, we ingeniously respond by creating a system that organizes the new information. Contemporary critics rightfully complain about information overload—we’re suffocating from “Data Smog,” as author David Shenk puts it—but it’s simultaneously true that we’re living in an era of extreme organization. It’s never been easier to store, retrieve, and share information. Not even close.

Yet the ability to access the world’s knowledge with just a swipe and a click might come at a cost. What John Stuart Mill said of happiness—that it “was only to be attained by not making it the direct end”—also describes the nature of discovery. We tend to descend on good ideas obliquely, as Financial Times writer John Kay puts it. That is, scientists and artists make discoveries when they’re contemplating something that is only vaguely related to their original question. It’s an overlooked aspect of the creative process that repeats itself—Archimedes in the bathtub, Darwin reading Malthus, Fleming experimenting with bacteria.

At this point, you might suspect that this essay is about the fundamental tradeoff between structure and serendipity. If we generate good ideas by welcoming a dose of unexpected encounters, then each time we organize information we risk impeding intellectual progress. Being productive and creative is about injecting the right dose of disorder and chaos into your daily routine, right?

The problem with this view is that it involves debating two abstractions. What’s at stake is not balancing the Apollonian with the Dionysian but answering a more concrete question: How do search interfaces influence search behavior?

This is where things get interesting. The field of information retrieval is based on a search model that we’ve inherited from the early days of computer science. That model assumes that retrieving information from a database involves going to a computer, searching the database, finding the document, and leaving. “It just wasn’t intuitive to imagine a cohesive information environment where people could search many databases at the same time,” Marcia Bates, Professor Emerita of Information Studies at UCLA, says.

Google was such a significant breakthrough because it indexed the World Wide Web and not just one database. It used an algorithm that ranked websites by the number and quality of inbound links instead of simply counting keywords. The logic of the algorithm, which Google co-founder Larry Page wryly called PageRank, is similar to the logic of academic citations: the quality of a paper is judged by how often it has been cited, and by whom.
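To make the citation analogy concrete, here is a minimal sketch of link-based ranking in Python. It is a textbook-style simplification, not Google's production algorithm: the function name, the damping value of 0.85, the fixed iteration count, and the toy four-page web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by the number and quality of their inbound links.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "a", "b", and "d" all link to "c"; "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, score in sorted(pagerank(toy_web).items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, "c" ranks highest because it has the most inbound links, and "a" outranks "b" and "d" because its single inbound link comes from the highly ranked "c". That second effect is the "quality" half of the idea.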

Google has since improved search by incorporating slick new features such as Autocomplete. It has also become better at distinguishing the meaning of a query from the words within it. Yet the difference between Google and Gerard Salton’s “SMART,” an early information retrieval system developed in the 1960s, is a difference of degree, not kind. In terms of organizing information online, we don’t need to worry about data smog. We need to replace an interface that’s over 50 years old.

When search experts and information scientists talk about the future of search, they talk about having “a space to explore” and the opportunity “to go in various directions,” as Anabel Quan-Haase, an Associate Professor of Information Science at the University of Western Ontario in London, put it to me. This is not the simple idea that Google will get better at answering your questions. It’s the more groundbreaking hypothesis that in the future, Google (or a competitor) might help by inspiring a few, too.

To understand the difference, I spoke with Tuukka Ruotsalo, who leads a team of researchers at the Helsinki Institute for Information Technology. Ruotsalo and his team completed a new search engine called SciNet a few years ago. “The project started from the idea that search has developed dramatically in the last few years but search interfaces have not,” Ruotsalo says. “We still type in keywords and get a list of documents. We are trying to help users recognize topics they’re interested in, and a big part of that is visualization.”

SciNet’s interface looks like a Copernican solar system. The searched word or phrase appears in the middle (“machine vision”) and related keywords and topics dot the periphery (“nano-technology,” “neural networks,” “artificial technology”). Users drag new keywords toward the middle across concentric circles; the closer the keywords are to the center, the more they influence the results, which are listed in an adjacent column. A simple color-coding system makes it easy for users to spot useful articles. Taken together, it’s a wonderful experience.
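The principle that keywords nearer the center carry more influence can be sketched in a few lines of Python. This is a hypothetical illustration, not SciNet's actual method: the linear falloff, the substring matching, the function names, and the sample documents are all assumptions.

```python
def keyword_weight(radius, max_radius=1.0):
    """Return 1.0 for a keyword at the center, falling linearly to 0.0 at the outer ring.

    The linear falloff is an assumption; the real interface may weight differently.
    """
    clamped = min(max(radius, 0.0), max_radius)
    return 1.0 - clamped / max_radius


def rank_documents(documents, keyword_radii):
    """Order document titles by the summed weights of the keywords they mention.

    `documents` maps a title to its text; `keyword_radii` maps a keyword
    to its distance from the center of the display.
    """
    weights = {kw: keyword_weight(r) for kw, r in keyword_radii.items()}
    scored = []
    for title, text in documents.items():
        lowered = text.lower()
        score = sum(w for kw, w in weights.items() if kw in lowered)
        scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


# Hypothetical documents and keyword positions.
docs = {
    "Paper A": "Neural networks for machine vision",
    "Paper B": "Nano-technology sensors",
    "Paper C": "Machine vision with nano-technology",
}
radii = {"machine vision": 0.0, "neural networks": 0.3, "nano-technology": 0.9}
print(rank_documents(docs, radii))

# Dragging "nano-technology" toward the center changes the ordering.
radii["nano-technology"] = 0.1
print(rank_documents(docs, radii))
```

Dragging a keyword inward is modeled here as nothing more than shrinking its radius, which raises its weight and reshuffles the list. The user steers the results without typing a new query.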

Ruotsalo said that SciNet is “not trying to beat Google,” emphasizing that his team “designed it to help scientific people find useful scientific information,” but they have since established Etsimo, a company that will explore a commercial version of SciNet. If conducting research is about finding material at the periphery, this is a promising development. SciNet may bring us closer to the next generation of search by making it more visual and dynamic.

As I spoke with Ruotsalo, it became clear that the question of structure versus serendipity is misleading. If early IR systems were like hitting every red light down a long road, Google was the engineer who reprogrammed the lights to make the system work better. The future of search, breaking from this approach entirely, would play the role of an erudite driving buddy, stimulating the conversation at just the right moments. You’ll still take the shortest route, but now you’ll have a digital collaborator to help you think through your hunch. In this view, the history of search is best seen not as an ongoing attempt to organize information, but as an ongoing attempt to simulate a real conversation, where critical feedback and original ideas are exchanged, not just facts.

And yet, when we contemplate the future of search, we tend to imagine ourselves caught in Borges’ library, frantically searching for a way to escape. This fear emerges each time information becomes easier and cheaper to produce and share. And while worrying about overload is not completely misguided (imagine living through the 18th century, when the number of books in print nearly doubled from about 331,000,000 to 628,000,000), it ignores a much broader and more important trend.

Right now, search is a completely unimaginative experience, akin to hanging out with a dull accountant. If the creative mind is fundamentally dialectical, constantly questioning and scrutinizing itself, thriving on exchange and dialogue, then search must become a collaborator, a process by which a singular idea emerges out of interaction.

We extract two kinds of information when we collaborate with other people: the explicit stuff and the nonverbal cues, the smiles and subtle gestures that “envelop nearly all human action,” as Nietzsche said. If those cues are essential to human communication (psychologists insist that speaking and listening are fundamentally nonverbal), then our latest innovations in search are impressive but relatively primitive.

When you reflect on the inevitable rise of voice recognition software, virtual reality, and artificial intelligence, it’s easy to see what the future of search will look like: more human.

Sam McNerney