An intelligence analyst hunting for answers in a sea of data faces steep challenges: She must choose the right search terms, identify useful results, and organize them in a way that reveals new connections.
Making that process quicker and more intuitive could yield faster answers to key national security questions, which is why a research group at Virginia Tech is collaborating with Fairfax-based defense company General Dynamics Mission Systems on intelligence software that allows analysts to interact more closely with their data.
According to Chris North, a professor of computer science in the College of Engineering and the associate director of the Discovery Analytics Center, analysts currently have to approach huge data sets with independent, consecutive searches.
“You search, and then you read. And you read and you read and you read. And then you might figure out from all that reading something else you might search for, and then you do that, and it’s a slow, painful iterative process,” North said.
North’s research group is developing software that uses a visual interface and machine learning algorithms to let the analyst’s interactions with the data guide future searches.
Demonstrating the system, computer science doctoral student Michelle Dowling, from Grand Rapids, Michigan, enters the name of a person of interest in the search field, and a constellation of nodes pops up on the screen. Each node represents a document containing the name Dowling searched for; the documents belong to a data set the government uses to train intelligence analysts.
The size of each node represents the algorithm’s assessment of that document’s relevance, and the distance between any two nodes reflects the similarity of those two results to each other.
Because of the system’s graphical interface, the analyst can move two nodes closer together to indicate any important similarities; the results will rearrange themselves to show which documents are most relevant based on that analysis.
For example, bringing together nodes for a restaurant receipt and a plane ticket may suggest that a particular trip is important; the algorithm can then pull in other results related to that date or location.
Past search terms are also weighted more heavily when they appear in the results of future searches.
“So as you interact with these documents and you do searches, it keeps track of what’s important to you,” Dowling said. “As you do subsequent searches and interact with it more, it’s trying to pull in more documents that it thinks are important to you and will be useful.”
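The feedback loop described above can be illustrated with a minimal sketch. It is not the research group's software: the class name `InteractiveSearch`, the bag-of-words scoring, the boost values, and the four sample documents are all assumptions invented for illustration. It shows only the general idea that searching for a term, or dragging two documents together, raises the weight of the terms involved, which in turn changes each document's relevance (node size) and the similarity between documents (node distance).

```python
from collections import Counter
from math import sqrt

# Hypothetical sample documents, invented for this sketch.
DOCS = {
    "receipt": "dinner receipt paris march 3",
    "ticket": "plane ticket paris march 3",
    "memo": "budget memo paris office",
    "report": "weather report london",
}


class InteractiveSearch:
    """Toy model of interaction-driven term weighting (illustrative only)."""

    def __init__(self, documents):
        # Each document becomes a bag of lowercase words.
        self.docs = {name: Counter(text.lower().split())
                     for name, text in documents.items()}
        # Learned importance per term; unseen terms count as 0.
        self.weights = Counter()

    def _weight(self, term):
        # Every term has a base weight of 1 plus whatever was learned.
        return 1.0 + self.weights[term]

    def search(self, term, boost=1.0):
        """Return matching documents and remember the term as important."""
        term = term.lower()
        self.weights[term] += boost
        return [name for name, bag in self.docs.items() if term in bag]

    def drag_together(self, a, b, boost=0.5):
        """Moving two nodes closer boosts the terms the documents share."""
        for term in self.docs[a].keys() & self.docs[b].keys():
            self.weights[term] += boost

    def relevance(self, name):
        """Node size: how much learned weight the document's terms carry."""
        return sum(count * self.weights[term]
                   for term, count in self.docs[name].items())

    def similarity(self, a, b):
        """Node closeness: weighted cosine similarity of two documents."""
        x, y = self.docs[a], self.docs[b]
        dot = sum(self._weight(t) * x[t] * y[t] for t in x.keys() & y.keys())
        norm_x = sqrt(sum(self._weight(t) * c * c for t, c in x.items()))
        norm_y = sqrt(sum(self._weight(t) * c * c for t, c in y.items()))
        return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0
```

In this sketch, calling `drag_together("receipt", "ticket")` raises the weight of the date and location terms the two documents share, so the memo that also mentions the same city gains relevance while the unrelated report does not. A real system would need a far richer text model and layout algorithm than this.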
This collaboration has been remarkably productive, according to Afroze Mohammed, the associate director of strategic alliances for Virginia Tech’s Office of Economic Development. Based in the National Capital Region, she works to find good matches between faculty research and corporate interests and has facilitated the relationship between General Dynamics and the university.
“What makes this partnership exciting is their level of involvement. They want to be very involved,” Mohammed said. “They’re trying to increase innovation in their company, and we’re their partners.”
Lorien Riead leads the research and development team at General Dynamics Mission Systems that collaborates with North and his students on the project. Members of her team call into weekly meetings with North’s group and visit campus about once a month.
Both teams credit each other with critical contributions to the project.
According to North, General Dynamics Mission Systems’ knowledge of how analysts work, and of the capabilities that might be useful to them, has brought a valuable perspective to the project.
For example, a key aspect of the current software is that it incorporates pieces of two different interfaces, one for text and one for numerical data; the General Dynamics Mission Systems team suggested combining the interfaces.
“We were thinking of those as two different research directions,” North said. “So the initial inspiration there was their idea — which was cool.”
Meanwhile, the ability to interact with the data, and use those interactions to guide future searches, has given the university and General Dynamics Mission Systems a new way to look at intelligence software.
“An intrinsic problem that people have is trying to make sense out of their data,” Riead said. “The way that you interact with it was the part that was really intriguing to us.”
General Dynamics Mission Systems’ relationship with Virginia Tech extends beyond the research collaboration.
The company is also working with the Ted and Karyn Hume Center for National Security and Technology, and Jay Mork, an executive at the company, provided feedback on national security during the development of the university’s destination area initiative.
General Dynamics Mission Systems also sponsors a “Datafest” on campus, a 48-hour competition where students race to answer questions by combing through gigabytes of data.
This year, the company sent representatives to serve as mentors and judges — and a recruiter.
“The students who want to get involved in these sorts of things are the ones that we really want to get to know, because they’re the ones who have the passion and the drive to go out and really make a difference in the problems we’re facing,” Riead said.
Meanwhile, the Discovery Analytics Center team has a list of improvements they’d like to make to the search software, inspired in part by General Dynamics Mission Systems’ suggestions of real-world problems an analyst might encounter.
They’re working to expand the number of databases that the software can search and to allow the algorithms to incorporate different types of text data, such as tweets and news stories; eventually, they hope to incorporate multimedia and streaming data.
The team also plans to set up a web-based interface that will make the program more widely accessible.
“We’re not there yet, but we’re moving in that direction,” Dowling said. “It’s on our massive checklist.”
The Discovery Analytics Center, which is based in the computer science department and is partially supported by the Institute for Critical Technology and Applied Science, brings together researchers from computer science, statistics, mathematics, and electrical and computer engineering to tackle “big data” problems in critical areas like intelligence analysis, sustainability, and public health.