“Is the U.S. flag still on the moon?” I asked ChatGPT, the viral AI chatbot.
“The United States flag is no longer on the moon,” responded ChatGPT. It went on to explain why the flag couldn’t survive the harsh conditions of the lunar surface, and confidently claimed that the last time an American flag was planted on the moon was during the Apollo 11 mission in 1969, after which no humans returned to the moon.
In reality, though the United States’ lunar flags have been gradually disintegrating, five of the six still stand today. The exception is the one from Apollo 11, which was knocked over by the engine exhaust as the spacecraft lifted off.
ChatGPT, a chatbot from OpenAI, the Microsoft-backed research lab and startup cofounded by Sam Altman and Elon Musk, has captivated the world like no other over the last few weeks. Punch practically any question into its fairly rudimentary-looking interface, and it will spew out lengthy, authoritative answers.
Since OpenAI released it to the public, over a million have prompted it to write haikus, construct Harry Potter-themed text adventure games, build a website from scratch, and negotiate their internet bill down for them — and the chatbot has obliged.
So it’s no surprise ChatGPT’s excellent conversation skills got people wondering whether it could soon replace the service they usually turn to for answers to their wide-ranging queries: Google.
But my exchange with ChatGPT on the US lunar flags is emblematic of how it actually forms responses and why — however fascinating it may be — it isn’t fit yet for an information retrieval system like a search engine.
ChatGPT is engineered to sift through huge reams of information scraped off the web (more than 500 GB in this case) and spot patterns in the text. Short for “Chat Generative Pre-Trained Transformer,” it uses those patterns to predict, or “generate,” the next words in any given sentence out of thin air.
It can’t tell whether its answer is factually correct — and given machine-learning models’ black box-like behavior, its creators cannot easily trace what sources it relied on to string together an answer. Often, you’ll even find ChatGPT responding with different answers to the same question since it’s making it up from scratch each time.
When I asked ChatGPT the same lunar question again, for example, it told me all the Apollo missions’ flags are still there. Inquire about the GDP growth of the United States, and it will come up with a different figure every time you hit the “Regenerate Response” button.
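To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not how ChatGPT is built: it swaps the transformer for a toy word-pair counter, and the three-sentence corpus, the function names, and the use of random seeds to stand in for the “Regenerate Response” button are all assumptions made for illustration. It only shows why a system that samples likely next words can give a different answer each time and has no notion of which answer is true.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration); a real model
# learns from hundreds of gigabytes of text, not three sentences.
CORPUS = (
    "the flag is still on the moon . "
    "the flag is no longer on the moon . "
    "the flag is still standing ."
)

def build_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, max_words=12, rng=None):
    """Build a sentence word by word, sampling each next word in
    proportion to how often it followed the previous one."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words and words[-1] in counts:
        options = counts[words[-1]]
        next_word = rng.choices(list(options), weights=list(options.values()))[0]
        words.append(next_word)
        if next_word == ".":
            break
    return " ".join(words)

counts = build_bigram_counts(CORPUS)

# Each seed stands in for one press of "Regenerate Response".
for seed in range(3):
    print(generate(counts, "the", rng=random.Random(seed)))
```

The sampler picks “still” or “no” after “is” purely by frequency, so contradictory sentences about the flag can both come out of the same model.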
More importantly, a question-answer system doesn’t suit what people need when they turn to search engines, Dr. Emily Bender, a linguistics professor and the director of the Computational Linguistics Laboratory at the University of Washington, told Freethink.
With Google, Dr. Bender argues in her research, we’re not simply hunting for answers. When a search goes well, we also learn about how our questions fit into a broad information landscape, where we can situate and verify the information we find with respect to the sources it is coming from.
Having a language model like ChatGPT “synthesizing an answer as a likely string of words, based on its training data, cuts off that ability to understand where the answer comes from and dig into it more deeply,” she adds.
Further, search engines rely on a network of traceability — PageRank, in Google’s case — which is designed to promote and reward pages that are linked the most by other websites. Since information on such better-attested pages generally is found to be more trustworthy, they’re pushed up in the results, automatically sorting the list of links by their quality.
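The link-counting idea behind PageRank can be sketched in a few lines of Python. This is the simplified textbook version of the algorithm, not Google’s production ranking, which combines many other signals. The four-site web, the `.example` domain names, the damping factor of 0.85, and the fixed 50 iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page repeatedly passes its score to
    the pages it links to, so well-linked pages accumulate rank."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical mini-web: each key links to the sites in its list.
WEB = {
    "nasa.example": ["museum.example"],
    "museum.example": ["nasa.example"],
    "blog.example": ["nasa.example", "museum.example"],
    "forum.example": ["nasa.example"],
}

ranks = pagerank(WEB)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web the site with the most inbound links ends up at the top of the list, while the two sites nobody links to tie for last. A chatbot that generates text directly has no equivalent step.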
ChatGPT doesn’t have such a filter, and it inherits (and amplifies) the biases of the datasets it’s based on. While OpenAI has installed guardrails to prevent it from putting out problematic content, people have been able to circumvent them. When one user asked the system to write Python code that judges, based on gender and race, who would make a good scientist, ChatGPT responded with “male” and “white.”
In fact, when OpenAI researchers plugged the tech behind ChatGPT into a search engine, they discovered that the model often quoted from “highly unreliable sources,” and the biases it incorporated influenced the way in which it chose to search and synthesize information on the web.
In its current form, ChatGPT is also vastly limited in its data, and it can’t search the web for the latest information or work in languages other than English. It’s stuck in time, in a sense: its knowledge base has a cut-off point, which is currently the end of 2021. (It still thinks Twitter is a public company, for instance.)
OpenAI didn’t respond to our requests for comment. In a tweet, its CEO, Sam Altman, said, “ChatGPT is incredibly limited but good enough at some things to create a misleading impression of greatness. It’s a mistake to be relying on it for anything important right now.”
Yet the idea of a chatbot that can search the web and save you the trouble of scanning and clicking through an endless list of links holds promise, and OpenAI isn’t the only one pursuing it. Several companies, including Google, are taking advantage of language models like ChatGPT not to replace search engines but to complement them, by better understanding natural language.
In its quest for the “One True Answer,” Google’s AI, for example, often pinpoints text from a website that best matches your question and pulls it up in a box at the top of the results (with mixed success). Google is also experimenting with replacing the traditional interface entirely with a chatbot; when its sister company, DeepMind, plugged a language model into a commercial search engine, it saw performance improve across the board.
Neeva, however, is set to go a step further. The privacy-focused search engine, founded by Sridhar Ramaswamy, who previously ran Google’s ad business, will soon allow users to read synthesized single answers summarizing information from the sites most relevant to their query.
Ramaswamy says Neeva overcomes the issues plaguing ChatGPT by embedding references and citations from its search engine AI directly into the answer, performing these operations in real time so that the information is the latest.
“Soon enough the traditional 10 blue links on a search engine results page will seem as archaic as dial-up internet and rotary phones,” Ramaswamy told Freethink.
However, all of these efforts have run into the same hurdles as ChatGPT. And human feedback on responses, like the kind ChatGPT relies on, can only get them so far.
When EleutherAI, an open-source AI research firm, created a text dataset containing hundreds of gigabytes of web pages, they had to perform “extensive bias analysis” and make “tough editorial decisions” to exclude data they felt were “unacceptably negatively biased” toward certain groups or views.
Until these issues are solved, it’s unlikely we’ll see major search players like Google switch to chatbots in any form. In a recent all-hands meeting, Sundar Pichai said Google’s AI language models are just as capable as OpenAI’s, but it has to move “more conservatively than a small startup” because of the “reputational risk” posed by the technology.
In addition, a large chunk of most search engines’ revenue comes from the ads you see in results. For Google in particular, these ads accounted for 81% of Alphabet’s overall revenue in 2021, or about $208 billion. Changing the fundamental interface for search without compromising that revenue stream will be tricky.
Gaurav Nemade, an ex-Google product manager who was one of the first to lead its internal LaMDA chatbot, agrees. Nemade told Freethink that the company’s tech is comparable to OpenAI’s, but it will be hard for it to pivot anytime soon.
At the same time, Nemade believes there’s a huge opportunity for Google to capitalize on hyper-personalized search results without compromising its core ad-based business model.
In just a few dialogues, chatbots like ChatGPT develop a near-precise understanding of the person on the other end, and companies like Google can use that data to deliver more targeted results — and ads. Nemade believes businesses would be willing to pay much more for such precise leads, since they won’t be competing with the rest of the search results.
However, Dr. Bender plans to stick to traditional search engines: “If I were given a choice between a search engine and a search engine hidden behind a chatbot, I would pick the search engine every time.”