The Science Fiction Behind Search
What used to be science fiction is now everyday reality in the world of search. Here, Google Fellow Amit Singhal takes a moment to marvel at the incredible capabilities search has brought us, and what we can expect in the future.
As a boy in India, I dreamed of space. Watching episodes of Star Trek, I was transported to the final frontier. I remember watching in wonder as Spock and company beamed themselves to a mythical planet, used scanners to identify their surroundings, and struck up conversations with aliens.
Now it’s my job to help make that science fiction come true – and we’re not doing too badly. When search first captured my imagination two decades ago, the web was sparsely populated with text, and data was achingly slow. But despite these limitations, ‘surfing the web’ still felt like exploring space; discovering vast, uncharted territories. Today we can voyage to the surface of Mars or the depths of the ocean without ever leaving our couches, identify landmarks around the world just by snapping a photo on our mobile phones, and have conversations with people on the other side of the world, in dozens of languages we’ve never even studied. And this is all thanks to incredible – you might even call them science fiction-like – advancements in web and search technologies.
So where is search going? Let us first consider how our early science fiction search dreams came to fruition.
At Google, when we talk about organizing the world’s information, we don’t mean only text; images and videos contain a wealth of information. In the early days, this type of content simply didn’t exist online. Now, through efforts like Google Earth and Street View, we can provide something incredibly valuable: images of your physical world.
However, in many ways, getting visual information online is the easy part. What’s hard is understanding that information. Unlike text, we cannot simply read an image or video. We have to look inside them, dig out the pixels and translate them into something meaningful. For a long time, we considered this a pipe dream, but by combining search methodology and technological breakthroughs in computer vision, today we can match pictures at a visual level. Search for ‘Mount Rushmore’ on Google and our algorithms will analyze many factors, such as the shape and texture that produce a good image of Mount Rushmore, then return those images to you in striking full color. Better yet, take a picture of Mount Rushmore and Google Goggles will recognize it and show relevant search results – no need to type at all. (And if you don’t want to head to South Dakota, you can always recreate the monument yourself.)
Breaking down language barriers can unlock whole new worlds. Unfortunately, engineers working on translation technologies quickly discovered that teaching a computer to translate language is even more difficult than teaching a person. Humans learn language by combining vocabulary with grammatical rules. But as we all know, languages are complicated. There are exceptions to the rules, exceptions to the exceptions, and exceptions to those exceptions. These exceptions, though beautiful to humans, seem ‘illogical’ to computers, and the resulting translations are often too poor to use. Plus, trying to teach these exceptions to computers doesn’t scale well. To translate between every possible language pair, whether it’s Japanese to Chinese, Hindi to Korean, or Urdu to Swahili, your computer would have to learn a separate set of exceptions for each pair.
So rather than trying to code lots of rules, we fed our translation engine thousands of professionally translated documents and used statistical models to identify patterns across them. These patterns helped us identify countless correlations, and from those correlations, we can start predicting the best translation for a given word, phrase or document. Today, Google Translate can help you read search results, web pages, emails, YouTube video captions and more, in over 50 languages. And that’s just for starters. Thanks to emerging voice technology on mobile, you can even have a multilingual conversation with someone face-to-face, in real-time, using speak-to-translate.
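To make the idea of learning from translated documents concrete, here is a minimal sketch in Python. It is not Google Translate's actual method: the tiny English/Spanish corpus, the function names, and the choice of the Dice coefficient as the co-occurrence score are all assumptions made for illustration. It simply counts which target-language words tend to appear in the translations of sentences containing a given source word.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of professionally translated sentence pairs.
PARALLEL = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("a house", "una casa"),
    ("a book", "un libro"),
]

def train(pairs):
    """Count how often each source word and target word share a sentence pair."""
    cooc = defaultdict(Counter)   # cooc[src_word][tgt_word] = shared pairs
    src_freq = Counter()          # pairs containing each source word
    tgt_freq = Counter()          # pairs containing each target word
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        for s in src_words:
            for t in tgt_words:
                cooc[s][t] += 1
    return cooc, src_freq, tgt_freq

def best_translation(word, cooc, src_freq, tgt_freq):
    """Pick the target word with the highest Dice score, or None if unseen."""
    if word not in cooc:
        return None
    def dice(t):
        return 2 * cooc[word][t] / (src_freq[word] + tgt_freq[t])
    return max(cooc[word], key=dice)

cooc, src_freq, tgt_freq = train(PARALLEL)
print(best_translation("house", cooc, src_freq, tgt_freq))
print(best_translation("book", cooc, src_freq, tgt_freq))
```

With only four sentence pairs the correlations are crude, and a function word such as ‘the’ stays ambiguous between ‘la’ and ‘el’. That is the point of the passage above: the patterns only become reliable when there are many documents to learn from.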
One of the most iconic science fiction images is the robotic butler who brings your slippers, knows what temperature you like your tea, and anticipates your needs. We’re certainly not there yet, but providing more personalized experiences is a first step.
Everyone today has his or her very own version of Google. Your Google is different from my Google, which is different from my neighbor’s version, and so on. This makes a lot of sense, because we all have unique, distinct interests.
But building a tailored search engine for millions of users is no simple task, and many factors influence which results will be most relevant to you at a given time. For example, Google is localized across more than 150 geographical domains, so when you search for ‘pizza’ in Tokyo, you’ll see pizza restaurants in your area. Sounds simple, right? But things get exponentially more complicated with more sophisticated user models.
Take the search query ‘lords’ for instance. This simple word means different things to everyone – parliamentary houses, castles and swords, even a multiplayer online game. However, as a fan of Indian cricket, I search for and click on cricket-related things all the time. So when I search for ‘lords’ on Google, I see results about the Lord’s Cricket Ground, the most famous cricket field in London.
Results have also gotten a lot more personal and relevant thanks to Social Search, which incorporates signals from people I’m connected to online. So, for example, I might see a tweet from my friend about a recent game.
Just a short time ago, the vast majority of electronic information was locked away in highly specialized databases with limited, often for-pay, access for research purposes. From the time an article was written, it took months to index that information in these specialized databases so that researchers could search for it. The power of accessing data within seconds of when it was produced has transformed us all, but for early search scientists, the concept of real-time search seemed truly impossible.
Google launched Realtime Search – one of the most complicated projects I have ever worked on – in December 2009. We developed a dozen new technologies to determine, almost instantly, the relevance of real-time updates such as tweets, from extracting information from shortened URLs, to drawing meaning from shorthand conventions like ‘#obama,’ to evaluating changes in query volume to identify hot topics. The result: when AT&T announced on March 20th, 2011, that it was interested in buying T-Mobile, Google’s Realtime Search was displaying tweets about the news several minutes before major news organizations began reporting on the story.
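One of the signals mentioned above, changes in query volume, can be illustrated with a short Python sketch. This is a simplified assumption about how such a signal might work, not a description of Realtime Search itself: the function name, the thresholds, and the sample counts are all hypothetical. A term is flagged when its current volume is both large enough to matter and several times its recent average.

```python
from statistics import mean

def hot_topics(history, current, min_ratio=3.0, min_count=20):
    """Return (term, ratio) pairs whose current volume spikes above baseline.

    history: term -> list of query counts from earlier time windows
    current: term -> query count in the latest time window
    Thresholds are illustrative assumptions, not real production values.
    """
    hot = []
    for term, count in current.items():
        past = history.get(term) or [0]
        baseline = max(mean(past), 1.0)  # avoid dividing by zero for new terms
        ratio = count / baseline
        if count >= min_count and ratio >= min_ratio:
            hot.append((term, ratio))
    # Largest spike first.
    return sorted(hot, key=lambda pair: pair[1], reverse=True)

# Hypothetical counts per time window.
history = {"weather": [100, 110, 90], "t-mobile": [5, 7, 6]}
current = {"weather": 105, "t-mobile": 240, "obscure": 3}

spikes = hot_topics(history, current)
print(spikes)
```

The minimum count keeps rare queries from registering as spikes just because their baseline is near zero.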
Search with real-time results gets people information faster, and it’s not a stretch to say that this can save lives. Take Flu Trends – we use aggregated search data to estimate flu activity, providing the information two weeks faster than CDC [Centers for Disease Control] data. The implications of this are enormous.
We’ve started teaching computers how to translate languages, but teaching a computer to actually understand language remains one of our biggest challenges. Google knows that ‘GM’ refers to General Motors in the context of cars, but ‘genetically modified’ in the context of food, for example. But what about words with multiple meanings? How does Google know that when you’re looking to change the brightness of your laptop screen, you actually want to ‘adjust’ it? By contrast, if you want to change a PDF file into a Word document, Google can help you learn how to ‘convert’ that file.
These may sound like straightforward substitutions, but remember: computers don’t think like humans. Programming a computer to derive meaning from words and context was barely imaginable some 20 years ago. And back then, what if we’d said that we wanted to do this across all the world’s languages? We would have been called crazy.
Every day, billions of documents get added to the web. People’s expectations are changing. We want information delivered in all formats, in every language, tailored to our personal preferences, and we want it NOW. Clearly, there is plenty of work to be done to take search into the future, but in truth, we’ve come a long way in a short time.
Just last year, we made over 500 improvements to search. But when you’re chasing perfection, no matter how far you’ve come, no matter how many seemingly impossible problems you solve, there is always more work to be done. In my mind, the holy grail of search is to understand what the user wants: not just to match words, but to match meaning. Doing this before the user ever types in a search query would be even better.
Google Instant takes us down this road. Instant takes what you have typed already, predicts the most likely completion and provides search results as you type – yielding a smarter, faster search that is interactive, predictive and powerful. Just ask Clay Shirky what we can do with all that extra time.
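The ‘predict the most likely completion’ step can be sketched in a few lines of Python. This is an illustration under simple assumptions, not how Google Instant is implemented: the query counts are made up, the function name is hypothetical, and ‘most likely’ is taken to mean ‘most frequently searched query that starts with what has been typed’.

```python
def predict_completion(prefix, query_counts):
    """Return the most popular past query starting with prefix, or None."""
    candidates = {q: n for q, n in query_counts.items() if q.startswith(prefix)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Hypothetical popularity counts for past queries.
QUERY_COUNTS = {
    "weather": 900,
    "web search": 400,
    "wedding dresses": 650,
    "star trek": 300,
}

# The prediction is refreshed after every keystroke.
for typed in ["w", "we", "wed"]:
    print(typed, "->", predict_completion(typed, QUERY_COUNTS))
```

A search engine following this idea would then fetch results for the predicted query rather than for the partial text, which is what lets results appear before the user finishes typing.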
My dream search engine of the future guides me throughout the day. It knows my next meeting is downtown, but the streets are closed, so I should take the subway. It reminds me that my wife’s birthday is in two weeks, tells me she wants an iPad and suggests I talk to my friend, Matt, who has done research on its Wi-Fi capabilities. Then it sends me directions to the closest store. It could even suggest a romantic restaurant nearby, search our schedules, and book a candlelit table for two.
If we can learn anything from history, it’s that science fiction doesn’t have to stay that way. We haven’t quite figured out how to beam you into space yet, but then again, it’s still only 2011.