IARPA MATERIAL research project to help intelligence analysts glean information from obscure languages
WASHINGTON – U.S. intelligence experts are reaching out to industry for new ways of distilling important information from relatively obscure languages for which no automated translation or analysis tools exist.
Officials of the Intelligence Advanced Research Projects Agency (IARPA) in Washington issued a broad agency announcement Thursday (IARPA-BAA-16-11) for the Machine Translation for English Retrieval of Information in Any Language (MATERIAL) program.
MATERIAL seeks ways to find speech and text content in low-resource languages by asking relevant questions in English. Low-resource languages are those for which no automated human language translation and analysis capability exists.
IARPA officials want intelligence tools that require minimal training to use, that can adapt quickly to new languages, and that produce answers and summaries in English. IARPA is the research arm of the Office of the U.S. Director of National Intelligence.
Gleaning important intelligence information from voice and text often requires deep skills in many different languages, yet for most languages there are few or no automated tools available for information retrieval or machine translation, IARPA experts say.
Existing tools for commonly spoken languages are not easily transferable between domains and are difficult to adapt to low-resource languages. The MATERIAL program aims to investigate how computer experts can develop machine translation and information retrieval for multilingual speech and text data.
MATERIAL contractors will develop and integrate automatic speech recognition, machine translation, cross-language information retrieval, and summarization technologies into end-to-end systems.
These systems will take English queries and produce relevant cross-language English summaries, with search capability to enable English-speaking domain experts to process massive amounts of real-world multilingual speech and textual data efficiently.
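As a rough illustration only, the end-to-end flow described above (speech recognition, then translation, then retrieval against an English query, then summarization) might be wired together as in the Python sketch below. Nothing here comes from the IARPA announcement: the function names, the invented toy lexicon, the term-overlap score, and the truncation-based summary are all hypothetical stand-ins for the real components a contractor would build.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    doc_id: str
    content: str     # foreign-language text, or a stand-in for audio
    is_speech: bool  # True if the content must pass through speech recognition


@dataclass
class Result:
    doc_id: str
    score: float
    summary: str


def run_pipeline(
    query: str,
    docs: List[Document],
    transcribe: Callable[[str], str],
    translate: Callable[[str], str],
    score: Callable[[str, str], float],
    summarize: Callable[[str, str], str],
    threshold: float = 0.0,
) -> List[Result]:
    """Chain the four stages and return relevant documents, best first."""
    results = []
    for doc in docs:
        # Stage 1: speech recognition, applied only to speech documents.
        source_text = transcribe(doc.content) if doc.is_speech else doc.content
        # Stage 2: machine translation into English.
        english = translate(source_text)
        # Stage 3: retrieval, scoring the English query against the translation.
        relevance = score(query, english)
        # Stage 4: summarize only the documents judged relevant.
        if relevance > threshold:
            results.append(Result(doc.doc_id, relevance, summarize(query, english)))
    return sorted(results, key=lambda r: r.score, reverse=True)


# Hypothetical toy components. The "language" below is invented tokens,
# not a real low-resource language.
TOY_LEXICON = {"zab": "water", "kif": "shortage", "lom": "city", "rud": "festival"}


def toy_transcribe(audio: str) -> str:
    # Pretend audio is a string tagged "AUDIO:"; strip the tag to "recognize" it.
    return audio[len("AUDIO:"):] if audio.startswith("AUDIO:") else audio


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown tokens pass through unchanged.
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in text.split())


def toy_score(query: str, english: str) -> float:
    # Fraction of query terms that appear in the translated text.
    terms = set(query.lower().split())
    words = set(english.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def toy_summarize(query: str, english: str, max_words: int = 10) -> str:
    # Placeholder summary: the first few words of the translation.
    return " ".join(english.split()[:max_words])


docs = [
    Document("d1", "zab kif lom", is_speech=False),
    Document("d2", "AUDIO:rud lom", is_speech=True),
]
for r in run_pipeline("water shortage", docs, toy_transcribe,
                      toy_translate, toy_score, toy_summarize):
    print(r.doc_id, r.score, r.summary)
```

Passing each stage in as a plain callable keeps the sketch neutral about how any one component works, which mirrors the program's stated interest in systems that can be rebuilt quickly for a new language.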
The MATERIAL project will seek to develop methods that are effective across languages; that can adapt to different domains and genres of text and speech; that can cope with limited amounts of translated data; that keep system build time short; and that have effective search algorithms.
Companies interested must have experience in automatic speech recognition, machine translation, summarization, and integration. Automatic speech recognition technologies built under the IARPA Babel program for addressing keyword search may not be sufficient for MATERIAL.
The program will center on sample information of 75 percent text and 25 percent audible speech. Evaluations will be on a language-by-language basis. IARPA officials say they expect to award several contracts.
Companies interested should submit responses no later than 18 May 2017 online to the IARPA Distribution and Evaluation System (IDEAS) Website at http://iarpa-ideas.gov/Client/signin.aspx.
Email questions or concerns to IARPA at firstname.lastname@example.org. The program Website is at www.iarpa.gov/index.php/research-programs/material.
The MATERIAL program is expected to begin in October 2017 and extend through August 2021. More information is online at https://www.fbo.gov/notices/5f3c15319aeb29341b578396b92c5aee.