For the first time, researchers have shown that information retrieval is possible using EEG signals interpreted with machine learning.
In a study conducted by the Helsinki Institute for Information Technology (HIIT) and the Centre of Excellence in Computational Inference (COIN), laboratory test subjects read the introductions of Wikipedia articles of their own choice. During the reading session, the test subjects’ EEG was recorded, and the readings were then used to model which key words the subjects found interesting.
‘The aim was to study if EEG can be used to identify the words relevant to a test subject, to predict a subject’s search intentions and to use this information to recommend new relevant and interesting documents to the subject. There are millions of documents in the English Wikipedia, so the recommendation accuracy was studied against this vast but controllable corpus’, says HIIT researcher Tuukka Ruotsalo.
Because brain signals are noisy, machine learning was used for the modelling: the system learned to identify relevance and interest from the EEG responses. The machine learning methods were able to identify informative words, which could then be used in the information retrieval application.
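As an illustration, relevance detection of this kind is typically framed as binary classification over per-word EEG features. Below is a minimal, hypothetical sketch (the study's actual features and classifier are not described here); the arrays are synthetic placeholders.

```python
# Minimal sketch: classify per-word EEG responses as relevant vs. not.
# Assumes each word's EEG epoch has already been reduced to a fixed-length
# feature vector; the arrays here are synthetic placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))       # 500 word epochs, 64 EEG features each
y = rng.integers(0, 2, size=500)     # 1 = word judged relevant, 0 = not

clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
# Cross-validated accuracy estimates how well relevance can be decoded.
print(cross_val_score(clf, X, y, cv=5).mean())
```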
‘Information overload is a part of everyday life, and it is impossible to react to all the information we see. And according to this study, we don’t need to; EEG responses measured from brain signals can be used to predict a user’s reactions and intent’, says HIIT researcher Manuel Eugster.
Based on the study, brain signals could be used to successfully predict other Wikipedia content that would interest the user.
‘Applying the method in real information retrieval situations seems promising based on the research findings. Nowadays, we spend a lot of our working time searching for information, and there is much room for making knowledge work more effective, but practical applications still need more work. The main goal of this study was to show that this is possible in the first place’, says Samuel Kaski, Professor at the Department of Computer Science and Director of COIN.
‘It is possible that, in the future, EEG sensors can be worn comfortably. Machines could then assist humans by automatically observing, marking and gathering relevant information based on a person’s EEG responses’, adds Ruotsalo.
The study was carried out in cooperation by the Helsinki Institute for Information Technology (HIIT), which is jointly run by Aalto University and the University of Helsinki, and the Centre of Excellence in Computational Inference (COIN). The study has been funded by the EU, the Academy of Finland as a part of the COIN study on machine learning and advanced interfaces, and the Revolution of Knowledge Work project by Tekes.
Machine-learning system doesn’t require costly hand-annotated data.
In recent years, computers have gotten remarkably good at recognizing speech and images: Think of the dictation software on most cellphones, or the algorithms that automatically identify people in photos posted to Facebook.
But recognition of natural sounds — such as crowds cheering or waves crashing — has lagged behind. That’s because most automated recognition systems, whether they process audio or visual information, are the result of machine learning, in which computers search for patterns in huge compendia of training data. Usually, the training data first has to be annotated by hand, which is prohibitively expensive for all but the highest-demand applications.
Sound recognition may be catching up, however, thanks to researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). At the Neural Information Processing Systems conference next week, they will present a sound-recognition system that outperforms its predecessors but didn’t require hand-annotated data during training.
Instead, the researchers trained the system on video. First, existing computer vision systems that recognize scenes and objects categorized the images in the video. The new system then found correlations between those visual categories and natural sounds.
“Computer vision has gotten so good that we can transfer it to other domains,” says Carl Vondrick, an MIT graduate student in electrical engineering and computer science and one of the paper’s two first authors. “We’re capitalizing on the natural synchronization between vision and sound. We scale up with tons of unlabeled video to learn to understand sound.”
The researchers tested their system on two standard databases of annotated sound recordings, and it was between 13 and 15 percent more accurate than the best-performing previous system. On a data set with 10 different sound categories, it could categorize sounds with 92 percent accuracy, and on a data set with 50 categories it performed with 74 percent accuracy. On those same data sets, humans are 96 percent and 81 percent accurate, respectively.
“Even humans are ambiguous,” says Yusuf Aytar, the paper’s other first author and a postdoc in the lab of MIT professor of electrical engineering and computer science Antonio Torralba. Torralba is the final co-author on the paper.
“We did an experiment with Carl,” Aytar says. “Carl was looking at the computer monitor, and I couldn’t see it. He would play a recording and I would try to guess what it was. It turns out this is really, really hard. I could tell indoor from outdoor, basic guesses, but when it comes to the details — ‘Is it a restaurant?’ — those details are missing. Even for annotation purposes, the task is really hard.”
Because it takes far less power to collect and process audio data than it does to collect and process visual data, the researchers envision that a sound-recognition system could be used to improve the context sensitivity of mobile devices.
When coupled with GPS data, for instance, a sound-recognition system could determine that a cellphone user is in a movie theater and that the movie has started, and the phone could automatically route calls to a prerecorded outgoing message. Similarly, sound recognition could improve the situational awareness of autonomous robots.
“For instance, think of a self-driving car,” Aytar says. “There’s an ambulance coming, and the car doesn’t see it. If it hears it, it can make future predictions for the ambulance — which path it’s going to take — just purely based on sound.”
The researchers’ machine-learning system is a neural network, so called because its architecture loosely resembles that of the human brain. A neural net consists of processing nodes that, like individual neurons, can perform only rudimentary computations but are densely interconnected. Information — say, the pixel values of a digital image — is fed to the bottom layer of nodes, which processes it and feeds it to the next layer, which processes it and feeds it to the next layer, and so on. The training process continually modifies the settings of the individual nodes, until the output of the final layer reliably performs some classification of the data — say, identifying the objects in the image.
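As a generic illustration of that layered flow (a toy sketch, not the researchers' actual network), the forward pass below sends a flattened image through two layers of weighted sums and returns a probability for each of ten classes:

```python
# Toy two-layer network: information flows from the input through a hidden
# layer to an output layer; training would adjust W1 and W2 until the
# final layer reliably classifies the input.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.1, size=(784, 128))   # input layer -> hidden layer
W2 = rng.normal(scale=0.1, size=(128, 10))    # hidden layer -> 10 classes

def forward(pixels):
    hidden = np.maximum(0, pixels @ W1)       # each node: weighted sum + ReLU
    logits = hidden @ W2
    exp = np.exp(logits - logits.max())       # softmax: scores -> probabilities
    return exp / exp.sum()

x = rng.random(784)                           # e.g., a flattened 28x28 image
print(forward(x).round(3))                    # ten class probabilities
```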
Vondrick, Aytar, and Torralba first trained a neural net on two large, annotated sets of images: one, the ImageNet data set, contains labeled examples of images of 1,000 different objects; the other, the Places data set created by Torralba’s group, contains labeled images of 401 different scene types, such as a playground, bedroom, or conference room.
Once the network was trained, the researchers fed it 26 terabytes of video data downloaded from the photo-sharing site Flickr. “It’s about 2 million unique videos,” Vondrick says. “If you were to watch all of them back to back, it would take you about two years.” Then they trained a second neural network on the audio from the same videos. The second network’s goal was to correctly predict the object and scene tags produced by the first network.
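A rough sketch of one training step in this setup follows, assuming a frozen, pretrained vision network (`teacher`) and a trainable audio network (`student`) that both output scores over the same object and scene categories; the names and the KL-divergence objective are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of one teacher-student training step. Assumptions: `teacher` is a
# frozen, pretrained vision network scoring video frames over object/scene
# categories, and `student` is a trainable audio network scoring the same
# categories; the KL-divergence objective is illustrative.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, frames, audio, optimizer):
    with torch.no_grad():
        target = F.softmax(teacher(frames), dim=1)    # visual tag distribution
    log_pred = F.log_softmax(student(audio), dim=1)   # audio net's prediction
    loss = F.kl_div(log_pred, target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```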
The result was a network that could interpret natural sounds in terms of image categories. For instance, it might determine that the sound of birdsong tends to be associated with forest scenes and pictures of trees, birds, birdhouses, and bird feeders.
To compare the sound-recognition network’s performance to that of its predecessors, however, the researchers needed a way to translate its language of images into the familiar language of sound names. So they trained a simple machine-learning system to associate the outputs of the sound-recognition network with a set of standard sound labels.
For that, the researchers did use a database of annotated audio — one with 50 categories of sound and about 2,000 examples. Those annotations had been supplied by humans. But it’s much easier to label 2,000 examples than to label 2 million. And the MIT researchers’ network, trained first on unlabeled video, significantly outperformed all previous networks trained solely on the 2,000 labeled examples.
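That translation step can be sketched as a simple classifier fit on the small labeled set. The feature width below mirrors the 1,000 object and 401 scene categories mentioned earlier, but both arrays are synthetic stand-ins:

```python
# Sketch of the translation step: a simple classifier maps the sound
# network's image-category outputs to the 50 standard sound labels. The
# 1,401 feature columns mirror the 1,000 object + 401 scene categories;
# both arrays are synthetic stand-ins for the ~2,000 annotated clips.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
features = rng.random((2000, 1401))      # per-clip category activations
labels = rng.integers(0, 50, size=2000)  # one of 50 standard sound labels

probe = LogisticRegression(max_iter=1000).fit(features, labels)
print(probe.predict(features[:5]))       # predicted sound labels
```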
“With the modern machine-learning approaches, like deep learning, you have many, many trainable parameters in many layers in your neural-network system,” says Mark Plumbley, a professor of signal processing at the University of Surrey. “That normally means that you have to have many, many examples to train that on. And we have seen that sometimes there’s not enough data to be able to use a deep-learning system without some other help. Here the advantage is that they are using large amounts of other video information to train the network and then doing an additional step where they specialize the network for this particular task. That approach is very promising because it leverages this existing information from another field.”
Plumbley says that both he and colleagues at other institutions have been involved in efforts to commercialize sound recognition software for applications such as home security, where it might, for instance, respond to the sound of breaking glass. Other uses might include eldercare, to identify potentially alarming deviations from ordinary sound patterns, or to control sound pollution in urban areas. “I really think that there’s a lot of potential in the sound-recognition area,” he says.
Physicians have long used visual judgment of medical images to determine the course of cancer treatment. A new program package from Fraunhofer researchers reveals changes in images and facilitates this task using deep learning.
The experts will demonstrate this software in Chicago from November 27 to December 2 at RSNA, the world’s largest radiology meeting.
Has a tumor shrunk during the course of treatment over several months, or have new tumors developed? To answer questions like these, physicians often perform CT and MRI scans. Tumors are usually evaluated only visually, and new tumors are often overlooked. “Our program package increases confidence during tumor measurement and follow-up,” explains Mark Schenk from the Fraunhofer Institute for Medical Image Computing MEVIS in Bremen, Germany. “The software can, for example, determine how the volume of a tumor changes over time and supports the detection of new tumors.” The package consists of modular processing components and can help medical technology manufacturers automate progress monitoring.
The computer learns on its own
The package is unique in its use of deep learning, a new type of machine learning that reaches far beyond existing approaches. This method is helpful for image segmentation, during which experts designate exact organ outlines. Existing computer segmentation programs seek clearly defined image features such as certain gray values. “However, this can often lead to errors,” according to Fraunhofer researcher Markus Harz. “The software assigns areas to the liver that do not belong to the organ.” These errors must be corrected by physicians, a process which can often be quite time-consuming.
The new deep learning approaches promise improved results and should save physicians valuable time. To demonstrate their self-learning methods, Fraunhofer scientists trained the software with CT liver images from 149 patients. Results showed that the more data the program analyzed, the better it could automatically identify liver contours.
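As a rough illustration of what such a segmentation model looks like (a minimal sketch, not Fraunhofer's architecture), a small fully convolutional network maps each pixel of a CT slice to a liver probability:

```python
# Minimal fully convolutional sketch: map a CT slice to a per-pixel liver
# probability. Real systems (e.g., U-Net variants) are deeper and use skip
# connections; this only illustrates the idea.
import torch
import torch.nn as nn

seg_net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),    # one logit per pixel
    nn.Sigmoid(),                       # probability that the pixel is liver
)

ct_slice = torch.randn(1, 1, 256, 256)  # one grayscale CT slice (batch of 1)
mask = seg_net(ct_slice)                # same spatial size, values in (0, 1)
print(mask.shape)                       # torch.Size([1, 1, 256, 256])
```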
Finding hidden metastases
A further application of the approach is image registration, in which software aligns images from different patient visits so that physicians can easily compare them. Machine learning can aid the particularly difficult task of locating bone metastases in torso images in which hip bones, ribs, and spine are visible. Currently, these metastases are often overlooked due to time constraints in clinical practice. Deep learning methods can help reliably discover metastases and thus improve treatment outcomes.
Researchers focus on a combination of classical approaches and machine learning: “We wish to harness existing expertise to implement deep learning as effectively and reliably as possible,” stresses Harz. Fraunhofer MEVIS builds upon years of experience in practical application: for example, the algorithms for highly precise lung image registration have been integrated into several commercial medical software applications.
“Information extraction” system helps turn plain text into data for statistical analysis.
Of the vast wealth of information unlocked by the Internet, most is plain text. The data necessary to answer myriad questions — about, say, the correlations between the industrial use of certain chemicals and incidents of disease, or between patterns of news coverage and voter-poll results — may all be online. But extracting it from plain text and organizing it for quantitative analysis may be prohibitively time consuming.
Information extraction — or automatically classifying data items stored as plain text — is thus a major topic of artificial-intelligence research. Last week, at the Association for Computational Linguistics’ Conference on Empirical Methods in Natural Language Processing, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory won a best-paper award for a new approach to information extraction that turns conventional machine learning on its head.
Most machine-learning systems work by combing through training examples and looking for patterns that correspond to classifications provided by human annotators. For instance, humans might label parts of speech in a set of texts, and the machine-learning system will try to identify patterns that resolve ambiguities — for instance, when “her” is a direct object and when it’s an adjective.
Typically, computer scientists will try to feed their machine-learning systems as much training data as possible. That generally increases the chances that a system will be able to handle difficult problems.
In their new paper, by contrast, the MIT researchers train their system on scanty data — because in the scenario they’re investigating, that’s usually all that’s available. The system then compensates by seeking out additional texts from which the same information is easier to extract.
“In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article,” says Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and senior author on the new paper. “That’s very different from what you or I would do. When you’re reading an article that you can’t understand, you’re going to go on the web and find one that you can understand.”
Essentially, the researchers’ new system does the same thing. A machine-learning system will generally assign each of its classifications a confidence score, which is a measure of the statistical likelihood that the classification is correct, given the patterns discerned in the training data. With the researchers’ new system, if the confidence score is too low, the system automatically generates a web search query designed to pull up texts likely to contain the data it’s trying to extract.
It then attempts to extract the relevant data from one of the new texts and reconciles the results with those of its initial extraction. If the confidence score remains too low, it moves on to the next text pulled up by the search string, and so on.
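In outline, the loop might look like the sketch below, where `extract`, `web_search`, and `merge` are hypothetical stand-ins for the paper's learned components (the base extractor, the query generator, and the reconciler); the stubs only make the sketch self-contained.

```python
# Sketch of the confidence-driven loop. `extract`, `web_search`, and `merge`
# are hypothetical stand-ins for the paper's learned components (base
# extractor, query generator, reconciler); the stubs below only make the
# sketch self-contained and runnable.
import random

def extract(article):                     # stub: extracted values + confidence
    return {"shooter": None}, random.random()

def web_search(article, limit):           # stub: alternative articles to try
    return [f"alternative-article-{i}" for i in range(limit)]

def merge(a, b):                          # stub: keep the more confident result
    return a if a[1] >= b[1] else b

def extract_with_search(article, threshold=0.9, max_queries=5):
    values, confidence = extract(article)             # base extraction
    for alt in web_search(article, limit=max_queries):
        if confidence >= threshold:
            break                                     # confident enough; stop
        values, confidence = merge((values, confidence), extract(alt))
    return values

print(extract_with_search("a hard-to-read news article"))
```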
“The base extractor isn’t changing,” says Adam Yala, a graduate student in the MIT Department of Electrical Engineering and Computer Science (EECS) and one of the coauthors on the new paper. “You’re going to find articles that are easier for that extractor to understand. So you have something that’s a very weak extractor, and you just find data that fits it automatically from the web.” Joining Yala and Barzilay on the paper is first author Karthik Narasimhan, also a graduate student in EECS.
Remarkably, every decision the system makes is the result of machine learning. The system learns how to generate search queries, gauge the likelihood that a new text is relevant to its extraction task, and determine the best strategy for fusing the results of multiple attempts at extraction.
Just the facts
In experiments, the researchers applied their system to two extraction tasks. One was the collection of data on mass shootings in the U.S., which is an essential resource for any epidemiological study of the effects of gun-control measures. The other was the collection of similar data on instances of food contamination. The system was trained separately for each task.
In the first case — the database of mass shootings — the system was asked to extract the name of the shooter, the location of the shooting, the number of people wounded, and the number of people killed. In the food-contamination case, it extracted food type, type of contaminant, and location. In each case, the system was trained on about 300 documents.
From those documents, it learned clusters of search terms that tended to be associated with the data items it was trying to extract. For instance, the names of mass shooters were correlated with terms like “police,” “identified,” “arrested,” and “charged.” During training, for each article the system was asked to analyze, it pulled up, on average, another nine or 10 news articles from the web.
The researchers compared their system’s performance to that of several extractors trained using more conventional machine-learning techniques. For every data item extracted in both tasks, the new system outperformed its predecessors, usually by about 10 percent.
“One of the difficulties of natural language is that you can express the same information in many, many different ways, and capturing all that variation is one of the challenges of building a comprehensive model,” says Chris Callison-Burch, an assistant professor of computer and information science at the University of Pennsylvania. “[Barzilay and her colleagues] have this super-clever part of the model that goes out and queries for more information that might result in something that’s simpler for it to process. It’s clever and well-executed.”
Callison-Burch’s group is using a combination of natural-language processing and human review to build a database of information on gun violence, much like the one that the MIT researchers’ system was trained to produce. “We’ve crawled millions and millions of news articles, and then we pick out ones that the text classifier thinks are related to gun violence, and then we have humans start doing information extraction manually,” he says. “Having a model like Regina’s that would allow us to predict whether or not this article corresponded to one that we’ve already annotated would be a huge time savings. It’s something that I’d be very excited to do in the future.”
Learning algorithm rewarded for building confidence over time
Researchers at Disney Research and Boston University have found that a machine learning program can be trained to detect human activity in a video sooner and more accurately than other methods by rewarding the program for gaining confidence in its prediction the longer it observes the activity.
It seems intuitive that the program would grow more confident that it is detecting, say, a person changing a tire, the longer it observes the person loosening lug nuts, jacking up the car and subsequently removing the wheel, but that’s not the way most computer models have been trained to detect activity, said Leonid Sigal, senior research scientist at Disney Research.
“Most training techniques are happy if the computer model gets 60 percent of the video frames correct, even if the errors occur late in the process, when the activity should actually be more apparent,” Sigal said. “That doesn’t make much sense. If the model predicts a person is making coffee even after it sees the person put pasta into boiling water, it should be penalized more than if it made the same incorrect prediction when the person was still just boiling water.”
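One simple way to encode that intuition (an illustrative assumption, not necessarily the exact loss used at Disney Research) is to weight each frame's classification loss by how far into the video the frame occurs:

```python
# Illustrative time-weighted loss: late mistakes cost more than early ones.
# This is an assumption for exposition, not necessarily Disney's exact loss.
import torch
import torch.nn.functional as F

def time_weighted_loss(frame_logits, label):
    # frame_logits: (T, num_classes), one prediction per observed frame
    T = frame_logits.shape[0]
    weights = torch.linspace(0.1, 1.0, T)        # later frames weigh more
    labels = torch.full((T,), label)             # same activity throughout
    per_frame = F.cross_entropy(frame_logits, labels, reduction="none")
    return (weights * per_frame).sum() / weights.sum()

# 30 frames, 20 possible activities, true activity index 5
print(time_weighted_loss(torch.randn(30, 20), 5).item())
```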
Shugao Ma, a Ph.D. student in computer science at Boston University and a former intern at Disney Research, found that this change in training methods resulted in more accurate predictions of activities. The computer also was often able to accurately predict the activity early in the process, even after seeing only 20 to 30 percent of the video. Likewise, the program can detect that an activity is finished if its confidence that it is observing that activity begins to drop.
Crowd-sourced data yields system that determines where mobile-device users are looking.
For the past 40 years, eye-tracking technology — which can determine where in a visual scene people are directing their gaze — has been widely used in psychological experiments and marketing research, but it’s required pricey hardware that has kept it from finding consumer applications.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory and the University of Georgia hope to change that, with software that can turn any smartphone into an eye-tracking device. They describe their new system in a paper they’re presenting on June 28 at the Computer Vision and Pattern Recognition conference.
In addition to making existing applications of eye-tracking technology more accessible, the system could enable new computer interfaces or help detect signs of incipient neurological disease or mental illness.
Researchers apply adaptive-design strategy to reveal targeted properties in shape-memory alloy
Researchers have demonstrated how an informatics-based adaptive design strategy, tightly coupled to experiments, can accelerate the discovery of new materials with targeted properties, according to a recent paper published in Nature Communications.
“What we’ve done is show that, starting with a relatively small data set of well-controlled experiments, it is possible to iteratively guide subsequent experiments toward finding the material with the desired target,” said Turab Lookman, a physicist and materials scientist in the Physics of Condensed Matter and Complex Systems group at Los Alamos National Laboratory. Lookman is the principal investigator of the research project.
“Finding new materials has traditionally been guided by intuition and trial and error,” said Lookman. “But with increasing chemical complexity, the combination possibilities become too large for trial-and-error approaches to be practical.”
To address this, Lookman, along with his colleagues at Los Alamos and the State Key Laboratory for Mechanical Behavior of Materials in China, employed machine learning to speed up the process. It worked. They developed a framework that uses uncertainties to iteratively guide the next experiments to be performed in search of a shape-memory alloy with very low thermal hysteresis (or dissipation). Such alloys are critical for improving fatigue life in engineering applications.
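In outline, such an uncertainty-guided loop can be sketched with a generic surrogate model (a Gaussian process here; the paper's exact framework may differ) that repeatedly picks the candidate alloy balancing a low predicted hysteresis against high model uncertainty. All data below is synthetic.

```python
# Uncertainty-guided experiment selection with a generic surrogate model
# (Gaussian process); compositions and measurements below are synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(3)
candidates = rng.random((200, 4))        # candidate alloy compositions
measured = list(range(10))               # indices of alloys already tested
hysteresis = rng.random(10)              # their measured thermal hysteresis

for _ in range(5):                       # five rounds of guided experiments
    gp = GaussianProcessRegressor().fit(candidates[measured], hysteresis)
    mean, std = gp.predict(candidates, return_std=True)
    mean[measured] = np.inf              # never re-test a measured alloy
    pick = int(np.argmin(mean - std))    # low prediction, high uncertainty
    measured.append(pick)
    hysteresis = np.append(hysteresis, rng.random())  # stand-in measurement
print(f"suggested alloys, in order: {measured[10:]}")
```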
“The goal is to cut in half the time and cost of bringing materials to market,” said Lookman. “What we have demonstrated is a data-driven framework built on the foundations of machine learning and design that can lead to discovering new materials with targeted properties much faster than before.” The work made use of Los Alamos’ high-performance supercomputing resources.
Researchers from Carnegie Mellon University (CMU) have created the first robotically driven experimentation system to determine the effects of a large number of drugs on many proteins, reducing the number of necessary experiments by 70%.
The model, presented in the journal eLife, uses an approach that could lead to accurate predictions of the interactions between novel drugs and their targets, helping reduce the cost of drug discovery.
“Biomedical scientists have invested a lot of effort in making it easier to perform numerous experiments quickly and cheaply,” says lead author Armaghan Naik, a Lane Fellow in CMU’s Computational Biology Department.
“However, we simply cannot perform an experiment for every possible combination of biological conditions, such as genetic mutation and cell type. Researchers have therefore had to choose a few conditions or targets to test exhaustively, or pick experiments themselves. The question is which experiments do you pick?”
Naik says that striking a careful balance between performing experiments whose outcomes can be predicted confidently and those whose outcomes cannot is a challenge for humans, as it requires reasoning about an enormous number of hypothetical outcomes at the same time.
To address this problem, the research team has previously described the application of a machine learning approach called “active learning”. This involves a computer repeatedly choosing which experiments to do, in order to learn efficiently from the patterns it observes in the data. The team is led by senior author Robert F. Murphy, Professor at the Ray and Stephanie Lane Center for Computational Biology, and Head of CMU’s Computational Biology Department.
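A minimal sketch of such an active-learning loop, with synthetic stand-in data and uncertainty sampling as the selection rule, might look like this:

```python
# Minimal active-learning loop with uncertainty sampling; all data synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
pool_X = rng.random((1000, 8))              # all possible experiments
pool_y = (pool_X[:, 0] > 0.5).astype(int)   # outcomes, revealed when "run"
done = list(rng.choice(1000, 20, replace=False))  # experiments run so far

for _ in range(10):                         # ten rounds of choosing experiments
    model = RandomForestClassifier(random_state=0)
    model.fit(pool_X[done], pool_y[done])
    proba = model.predict_proba(pool_X)
    uncertainty = 1.0 - proba.max(axis=1)   # low top-class confidence
    uncertainty[done] = -np.inf             # never repeat an experiment
    done.append(int(np.argmax(uncertainty)))  # "perform" the least certain one
```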