"There seems to be some kind of magic," says Professor Ellie Pavlick of Brown CS, "that happens in humans' heads that allows us to conjure up a staggering amount of information in order to make inferences: not just information about language, but assumptions and intents. The main questions I'm interested in in this area are about representation and learning. What is 'common sense' and what does it 'look like'? How do humans represent our knowledge about the world, and how do we build representations through the experience of living rather than supervised training? And how can we make the answers to these questions play well within a computational system?"
Two upcoming projects will allow Ellie and her collaborators to explore those answers in depth. They're being funded by two grants, one from the Defense Advanced Research Projects Agency (DARPA) and one from the Intelligence Advanced Research Projects Activity (IARPA). The latter of these, at six million dollars, is the largest Brown CS grant to date.
One of Ellie's partners, Professor Carsten Eickhoff of Brown CS, who leads the AI Lab at Brown's Center for Biomedical Informatics, explains that he looks at both grants from an Information Retrieval perspective. "Information Retrieval engineers and researchers," he says, "have spent decades devising formal models describing the connection between search queries and the best possible results to answer them. In particular, we care about the true underlying intention behind the ill-formed strings of keywords that all of us hammer into Google and Co. on a daily basis. For many queries this interpretation will be straightforward; for a significant number of others, however, it becomes veritable detective work. While modern natural language processing techniques have become indispensable in this pursuit of meaning, intentions, and goals, there still is a gap between the way in which people and machines read and generate language. These projects try to close this gap and truly enable search engines to understand what you are looking for."
Better Extraction from Text Towards Enhanced Retrieval
In the IARPA grant project ("Better Extraction from Text Towards Enhanced Retrieval"), the researchers will study search methods that work across languages and account for fine-grained differences in users' interests.
"Probably the most familiar of all natural language processing applications," Ellie says, "is information retrieval. We can think of Google search: products like it seem to perform extremely well, but they're actually only good at answering queries that a huge number of people have already asked. It seems surprising, but when you search for 'what was that movie with the box of chocolates', you're probably one of thousands of people who have searched for the same thing."
However, when people need to run very specific searches over very specialized sets of documents, search quality is far worse: think of the daily frustration of trying to find something in a massive pile of old email. In this project, Ellie and her team will attempt to build better search methods that quickly recognize a fine-grained topic by asking a small number of disambiguating questions and reorganizing the entire representation of language on the basis of the answers. The goal is for users to get better results from searches on very specialized topics, such as finding evidence for and against a little-known scientific theory. Collaborators include Carsten as well as colleagues from Ohio State University and the University of Pennsylvania.
Grounded Artificial Intelligence Language Acquisition
To situate the project funded by the DARPA grant ("Grounded Artificial Intelligence Language Acquisition"), Ellie explains that current approaches to teaching language to computers work by having them read large volumes of text over and over.
"This means that they can produce normal-sounding sentences," she says, "but have zero knowledge of what they actually mean. Instead, we'll be teaching computers the meaning of words by emulating the way we talk to toddlers. They learn language by interacting with the world and humans, hearing something like this: 'Let's get ready for lunch. We should get a plate and glass to put on the table so we can eat.' We'll be doing that in virtual reality, letting you talk to a computer just like you would with a child."
Over time, the computer will learn to connect the words the person says to the objects and actions it observes. Later, it will generate the same words when it sees similar objects and actions.
"Asking questions and dialog is essential to robotics and language grounding," says Professor Stefanie Tellex of Brown CS, one of Ellie's collaborators. "Moreover, as a robot gains long-term memory, the information retrieval questions and applications for this grant are important, too. Think about asking a robot in your home when Grandma last took her medicine, or where she left her keys."
"We have an amazingly multidisciplinary group of collaborators right here at Brown," says Ellie, "including Carsten Eickhoff in Natural Language Processing, Roman Feiman in Developmental Psychology, Stefanie Tellex in Robotics, and Daniel Ritchie in Graphics. It's a dream team."
"Computer scientists," says Roman, "are trying to engineer systems that understand and use language. Cognitive scientists are trying to reverse-engineer how humans do the same thing. The two fields can learn a lot from each other, because engineering solutions can inspire scientific hypotheses and scientific findings can be the basis for new approaches in engineering. What I think is really special about our group is that we all have deep interests in the other's field, so that we can make meaningful connections from our own areas of expertise. That's rare, and it's an outstanding opportunity to make progress together."
For more information, contact Brown CS Communication Outreach Specialist Jesse C. Polhemus.