Devi Parikh, an assistant professor in the Bradley Department of Electrical and Computer Engineering at Virginia Tech, has received an Allen Distinguished Investigator Award for close to $1 million from the Paul G. Allen Family Foundation to teach machines to use “common sense” in image analysis.
Parikh uses cartoon scenes crafted from clip art to help computers “read” complex images.
Humans interpreting visual scenes can take advantage of basic knowledge about how objects typically interact, but computers, Parikh said, don’t have the same skill.
“The visual world around us is bound by common sense laws depicting birds flying and balls moving once they’ve been kicked, but much of this knowledge is hidden from the eyes of a computer,” she said. Computers, in other words, might have a lot of information about avian wing structure, but they don’t necessarily know that birds fly.
“Simply labeling images with this information does not address the underlying problem of how it all fits together,” said Parikh. “We need a dense sampling of the visual world to understand how subtle changes in the scene can change its overall meaning.”
Parikh proposes to use crowdsourcing, leveraging hundreds of thousands of Amazon Mechanical Turk workers (or “Turkers”) online to illustrate the visual world using clip art.
The Turkers will use clip art to create scenes and pair them with basic written descriptions of what is going on. By learning to associate certain visual elements with the information in the text, the computer may eventually accumulate a lexicon of common sense that will help it understand the visual world as humans do.
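As a rough illustration of associating visual elements with text, the sketch below counts how often a word appears in the descriptions of scenes that contain a given clip-art object. It is a toy co-occurrence model under stated assumptions, not the method used in Parikh's project: the scenes, the stopword list, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each scene lists its clip-art objects and a
# worker-written description. None of this comes from the actual project.
SCENES = [
    ({"boy", "ball"}, "the boy kicks the ball"),
    ({"girl", "ball"}, "the girl kicks the ball"),
    ({"bird", "tree"}, "the bird flies over the tree"),
    ({"bird", "cloud"}, "the bird flies near the cloud"),
]

# Assumed stopword list, chosen only to keep the toy example readable.
STOPWORDS = {"the", "over", "near"}


def build_associations(scenes):
    """Count scenes per object and (object, word) co-occurrences."""
    object_counts = Counter()
    pair_counts = defaultdict(Counter)
    for objects, description in scenes:
        # Use a set so each word counts at most once per scene.
        words = set(description.split()) - STOPWORDS
        for obj in objects:
            object_counts[obj] += 1
            for word in words:
                pair_counts[obj][word] += 1
    return object_counts, pair_counts


def word_given_object(obj, word, object_counts, pair_counts):
    """Fraction of scenes containing obj whose description mentions word."""
    if object_counts[obj] == 0:
        return 0.0
    return pair_counts[obj][word] / object_counts[obj]


object_counts, pair_counts = build_associations(SCENES)
print(word_given_object("bird", "flies", object_counts, pair_counts))  # 1.0
print(word_given_object("ball", "kicks", object_counts, pair_counts))  # 1.0
print(word_given_object("ball", "flies", object_counts, pair_counts))  # 0.0
```

In this toy data, "flies" accompanies every scene with a bird and no scene with a ball, which is the kind of regularity a much larger collection of scenes might let a system pick up.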
“These clip art scenes will serve as a completely new and rich test bed for computer vision researchers interested in solving high-level AI problems,” said Parikh, who will be collaborating with Larry Zitnick and Margaret Mitchell at Microsoft Research. Zitnick is in the Interactive Visual Media group, and Mitchell specializes in natural language processing.
“Learning common sense will make our machines more accurate, reasonable and interpretable — all imperative towards integrating artificial intelligence into our lives and society at large,” said Parikh.
So while machines today can play chess, vacuum floors, and win at Jeopardy!, Parikh’s research could take them a step closer to being intelligent entities. That is critical for a variety of artificial intelligence applications, including personal assistants, health care, autonomous driving, and security uses such as law enforcement and disaster recovery.
The award is part of the Allen Distinguished Investigators Program, which was established to advance ambitious, breakthrough research in key areas of science. Parikh is also a recipient of the Army Research Office Young Investigator Award and two Google Faculty Research Awards.
Parikh leads the Computer Vision Lab at Virginia Tech. She is a member of the Discovery Analytics Center, which is housed in the Department of Computer Science within the College of Engineering and has operations on the Blacksburg campus and at the Virginia Tech Research Center in Arlington. She is also a member of the Virginia Center for Autonomous Systems at Virginia Tech. Both centers receive support for their interdisciplinary research from the Institute for Critical Technology and Applied Science.