June 4, 2006. Organized by the IBM Research Lab in Haifa, Israel
Recent Advances in Transductive Learning
Ran El-Yaniv, Technion - Israel Institute of Technology
In transductive learning, our goal is to transduce information from a given labeled set to a given unlabeled test set so as to label the test points as accurately as possible. Despite the conceptual simplicity of this learning model, and the growing attention it receives, our understanding of transduction is still quite limited. This talk will review the current state of transductive learning with an emphasis on recent performance guarantees, learning principles and open questions.
Machine Learning, the Mind Behind the Third Eye
Gaby Hayun, Mobileye
Machine learning plays a major role in solving vision problems. At Mobileye we build vision applications for the automotive market, including vehicle and pedestrian detection. To solve these problems in a real-world environment, we had to integrate many machine learning algorithms into our off-line analysis and on-line applications. In this short talk, I will present a few examples of how we use supervised and unsupervised learning algorithms. I will conclude by introducing a hardware solution we developed for the efficient implementation of vision and machine learning algorithms.
Efficient Discriminative Learning of Energy Functions Using Conditional Random Fields
Yair Weiss, Hebrew University, Jerusalem
The problem of learning energy functions comes up in a wide range of applications, including computer vision and computational biology. This problem shares some aspects of standard supervised learning. We are given a labeled training set and wish to learn a function mapping inputs to outputs that will perform well on new data. However, unlike classical supervised learning, the label is not a single label but rather a label field and the output is computed by finding the label field that minimizes an input-dependent energy function. I will show how the conditional random field framework of Lafferty et al. can be used to learn energy functions and the computational challenges that arise in applying this framework to many applications. I will then show how tools of approximate inference in probabilistic models, including belief propagation and its variants, can be used to address these challenges.
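As a rough illustration of "finding the label field that minimizes an input-dependent energy function", the sketch below handles the simplest case, a one-dimensional chain of sites, where min-sum dynamic programming (equivalent to min-sum belief propagation on a chain) gives the exact minimizer. It is not the method presented in the talk: the function name `minimize_chain_energy`, the unary/pairwise decomposition of the energy, and the toy costs are all assumptions made for this example. On grids and other loopy graphs, exact minimization is generally intractable, which is where the approximate inference tools mentioned above come in.

```python
def minimize_chain_energy(unary, pairwise):
    """Exact energy minimization on a chain (illustrative sketch).

    unary[i][k]    : cost of giving site i the label k (input-dependent term)
    pairwise[j][k] : cost of adjacent sites taking labels j and k
    Returns (labels, energy) for the minimum-energy label field.
    """
    n_sites, n_labels = len(unary), len(unary[0])
    cost = list(unary[0])   # best energy of any prefix ending in each label
    back = []               # back-pointers used to recover the minimizer

    for i in range(1, n_sites):
        new_cost, ptr = [], []
        for k in range(n_labels):
            # Best previous label j given that site i takes label k.
            best_j = min(range(n_labels), key=lambda j: cost[j] + pairwise[j][k])
            new_cost.append(cost[best_j] + pairwise[best_j][k] + unary[i][k])
            ptr.append(best_j)
        cost = new_cost
        back.append(ptr)

    # Trace the back-pointers from the best final label.
    last = min(range(n_labels), key=lambda k: cost[k])
    labels = [last]
    for ptr in reversed(back):
        labels.append(ptr[labels[-1]])
    labels.reverse()
    return labels, cost[last]


# Toy example: three sites, two labels, and a smoothness penalty of 2
# whenever neighbouring labels differ (all numbers are made up).
unary = [[0, 4], [3, 2], [0, 4]]
pairwise = [[0, 2], [2, 0]]
labels, energy = minimize_chain_energy(unary, pairwise)
```

In this toy example the middle site on its own would prefer label 1 (cost 2 versus 3), but the smoothness term makes the all-zero field cheaper overall. Learning an energy function amounts to choosing the unary and pairwise costs so that minimizers like this one match the training labels.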
The EuResist Initiative: Integration of Viral Genomics with Clinical Data to Predict Response of HIV Patients to Treatment
Michal Rosen-Zvi, IBM Haifa Research Lab
EuResist is a collaboration with seven European partners, aimed at predicting the response to HIV treatment. The project uses the Arevir, ARCA, and Karolinska databases, which contain information on thousands of patients. This clinical and genomic information consists of categorical data, numerical data, and time-varying data. Only in a few of the examples is it clear whether a therapy succeeded or failed. We address the problem of using this massive high-dimensional data for supervised learning from a few examples. We model the input data by conditional distributions and compare the performance of various approaches, such as the kernel method and a novel geometrical information-based dimensionality reduction method, on a testbed we generated. This research is carried out in collaboration with Udi Aharoni, Hani Neuvirth-Telem, and Tali Tishby.
Keynote: Machine Learning for Market Microstructure
Michael Kearns, University of Pennsylvania
Many modern equity markets use a standard limit order mechanism in which buyers and sellers specify desired prices, and are matched with each other temporally whenever orders cross. The recent and widespread revelation of full order book (or "microstructure") data provides a complete snapshot of the liquidity and depth of the market at each moment. It also presents a number of interesting opportunities and challenges for machine learning.
In this talk, I will survey some recent machine learning approaches to market microstructure. These include on-line algorithms with competitive ratios for optimal trade execution and a large-scale empirical application of reinforcement learning to the same problem. I will also discuss theoretical results on the stability of limit order dynamics and their implications for the possibility of measuring the market impact of a trading strategy, a necessity for any learning-based approach.
The talk will be self-contained and assumes no background in finance or microstructure. It is based on joint works with Eyal Even-Dar, Yi Feng, Sham Kakade, Yishay Mansour, Yuriy Nevmyvaka, and Luis Ortiz.
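For readers new to the limit order mechanism described in this abstract, here is a minimal sketch of a toy order book. It is not taken from the talk: the class name `LimitOrderBook`, the `submit` method, and the conventions that trades execute at the resting order's price and that ties are broken by arrival time are assumptions chosen for illustration. Real exchanges add many rules that are omitted here.

```python
import heapq
from itertools import count


class LimitOrderBook:
    """Toy limit order book with price-time priority (illustrative only)."""

    def __init__(self):
        self.bids = []       # heap of (-price, arrival, qty): highest bid first
        self.asks = []       # heap of (price, arrival, qty): lowest ask first
        self._arrival = count()

    def submit(self, side, price, qty):
        """Place a limit order and return the (price, qty) trades it triggers."""
        if side not in ("buy", "sell"):
            raise ValueError("side must be 'buy' or 'sell'")
        buying = side == "buy"
        opposite = self.asks if buying else self.bids
        own = self.bids if buying else self.asks
        trades = []

        while qty > 0 and opposite:
            key, arrival, resting_qty = opposite[0]
            resting_price = key if buying else -key
            # Orders cross when the buy price is at least the sell price.
            crosses = resting_price <= price if buying else resting_price >= price
            if not crosses:
                break
            heapq.heappop(opposite)
            filled = min(qty, resting_qty)
            trades.append((resting_price, filled))
            qty -= filled
            if resting_qty > filled:
                # Partially filled resting order keeps its place in the queue.
                heapq.heappush(opposite, (key, arrival, resting_qty - filled))

        if qty > 0:
            # Any unfilled remainder rests in the book at its limit price.
            own_key = -price if buying else price
            heapq.heappush(own, (own_key, next(self._arrival), qty))
        return trades


book = LimitOrderBook()
book.submit("sell", 101, 5)
book.submit("sell", 100, 5)
book.submit("buy", 99, 3)
first = book.submit("buy", 100, 7)    # crosses the ask at 100
second = book.submit("sell", 99, 4)   # crosses the bids at 100 and 99
```

The contents of `book.bids` and `book.asks` at any moment are what the abstract calls the full order book: the resting liquidity on each side at each price.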
Active Sampling for Multiple Output Identification
Yishay Mansour, Tel Aviv University
In this talk, we will discuss the problem of using active sampling to identify at least one example for each possible output value, given functions with multiple output values. The motivation for this setting is to efficiently hit pre-specified low probability events. We will give an overview of our results for this setting, which include:
- Efficient active sampling algorithms for simple geometric concepts, such as intervals on a line and axis parallel boxes
- Characterization for the case of binary output value in a transductive setting
- Analysis of active sampling with uniform distribution in the plane
- Efficient algorithm for the Boolean hypercube when each output value is a monomial
This is joint work with Shai Fine from IBM Haifa.
Early Assessment and Prediction of Dementia and Other Cognitive-based Disorders
Vered Aharonson, NexSig - Neurological Examination Technologies Ltd.
Medical applications increasingly use machine learning methods for diagnosis and prediction of response to treatment. NexSig develops reactive systems for early detection of Alzheimer's disease and other neurodegenerative diseases. These diseases are primarily manifested in cognitive and behavioral changes. The system tracks the dynamic interaction patterns of subjects and, using feature extraction and machine learning methods, predicts the patient's prognosis. In this talk I will outline our methodology, discuss the inherent problems and pitfalls in the design and implementation of such a system, and present results from clinical trials.
Improving Self-Supervised Relation Extraction from the Web
Ronen Feldman, Bar-Ilan University
Web extraction systems attempt to use the immense amount of unlabeled text on the Web to create large lists of entities and relations. Unlike traditional IE methods, Web extraction systems do not label every mention of the target entity or relation. Instead, they focus on extracting as many different instances as possible while keeping the precision of the resulting list reasonably high. SRES is a self-supervised Web relation extraction system that learns powerful extraction patterns from unlabeled text, using short descriptions of the target relations and their attributes. SRES automatically generates the training data needed for its pattern-learning component. The performance of SRES is further enhanced by classifying its output instances using the properties of the extracted patterns. The features we use for classification and the trained classification model are independent of the target relation, which we demonstrate in a series of experiments. We also compare the performance of SRES to that of the state-of-the-art KnowItAll system, and to that of KnowItAll's pattern-learning component, which uses a simpler and less powerful pattern language than SRES.
Some Business Applications of Reinforcement Learning
Naoki Abe, IBM T.J. Watson Research Center
Today, machine learning and data mining techniques are routinely applied in analyzing business data and assisting with business decision making. The range of techniques that have been applied to business analytics, however, has been mostly limited to basic techniques, such as classification and regression modeling.
In this talk, I will review some of our recent work on applying reinforcement learning techniques to problems in business analytics. Specifically, I will describe some concrete applications in the area of targeted marketing, and discuss the technical challenges we faced and the solutions we devised. This talk is based on a body of work conducted jointly with many colleagues, including Edwin Pednault, Naval Verma, Bianca Zadrozny, Cezar Pendus, and Chid Apte.