Companies are deploying artificial intelligence systems but don’t know if they’ll measure up
The Animal-AI Olympics, which will begin this June, aims to “benchmark the current level of various AIs against different animal species using a range of established animal cognition tasks.” At stake are bragging rights and US $10,000 in prizes. The project, a partnership between the University of Cambridge’s Leverhulme Centre for the Future of Intelligence and GoodAI, a research institution based in Prague, is a new way to evaluate the progress of AI systems toward what researchers call artificial general intelligence.
Such an assessment is necessary, the organizers say, because recent benchmarks are somewhat deceiving. While AI systems have bested human grandmasters in a host of challenging competitions, including the board game Go and the video game StarCraft, these matchups only proved that the AIs were astoundingly good at those particular games. AI systems have yet to demonstrate the kind of flexible intelligence that enables humans to reason, plan, and act in many different domains. If you asked the StarCraft-playing AI to devise an investment strategy for your retirement, for example, it would give you the digital equivalent of a blank stare.
Animals may not be planning for retirement, but they can generalize, transfer lessons learned to new circumstances, and engage in creative problem solving, as anyone who has ever watched a squirrel stage an attack on a bird feeder can testify. Matthew Crosby, one of the contest’s organizers and a postdoctoral researcher at the Leverhulme Centre and at Imperial College London, says he’s eager to see if any of the AI agents entered in the contest can display similar abilities when confronted with tests typically used in animal cognition research. “An AI can be great at one task,” Crosby says, “but can it solve similar tasks that it hasn’t seen before?”
Just about everyone thinks that, eventually, AI will transform society from top to bottom, yet no one knows when AI agents will be smart enough to really shake things up. Recent AI triumphs in the world of games have created inflated expectations, as hyped-up media reports fast-forward to a world run by machines.
One societal domain that could be transformed by AI is health care. Health data is increasing exponentially: IBM has estimated that the average person will generate 1 million gigabytes of health-related data in his or her lifetime, with the amount of health data in existence doubling every two to five years. With all that data now stored in electronic health records, the industry seems perfect terrain for an AI that can mine vast quarries of data and discover gold nuggets therein. Many medical experts are anticipating AI systems that can find hitherto unseen patterns that improve patient outcomes and make health care more efficient.
In this issue, “How IBM Watson Overpromised and Underdelivered on AI Health Care” tells the story of IBM’s effort to turn its AI technology into a variety of tools for intelligent medicine. IBM was one of the first companies to make a serious push to bring AI into hospitals and clinics, and the challenge has proved harder than anyone expected. Medical data is messy, as messy as human biology—but somehow physicians manage to cut through the mess to make diagnoses and treatment plans for their patients. IBM knew going in that its Watson AI wasn’t as smart as an expert doctor—but as our story makes clear, the company is still trying to figure out exactly how smart it is, and what it’s good for. Watson still has a lot of learning to do, at least when it comes to practicing medicine.
The editorial content of IEEE Spectrum does not represent official positions of IEEE or its organizational units. This article is based in part on our Tech Talk blog post “Animal-AI Olympics Will Test AI on Intelligence Tasks Designed for Crows and Chimps.”
This article appears in the April 2019 print issue as “How Smart Is Artificial Intelligence?”