Driving Problem: Build a “description machine” to annotate natural scenes.
Background: Image interpretation, which is effortless and instantaneous for human beings, is the grand challenge of computer vision. The objective is to build a machine which produces a rich semantic description of the underlying scene, including the names and poses of the objects that are present, as well as “recognizing” actions, interactions and other semantic events. Mathematical frameworks are advanced from time to time, but none is yet widely accepted or clearly points the way to closing the performance gap with natural vision.
Research Strategy: Efficient search and evidence integration appear indispensable for scene annotation. My approach is inspired by two facets of human search: divide-and-conquer querying, as in games like “twenty questions”, and selective attention in natural vision. The enabling assumption is that interpretations can be grouped into natural subsets with shared features. We then design algorithms that switch naturally between monitoring the scene as a whole and scrutinizing local regions for fine discrimination, depending on the accumulated evidence and its impact on the likelihoods of competing interpretations.
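To make the “twenty questions” idea concrete, here is a minimal Python sketch, offered as an illustration and not as the algorithms developed in this research. It assumes a small finite set of interpretations, a pool of yes/no queries that each correspond to a subset of interpretations with a shared feature, and answers that are wrong with a known probability. All names (`pick_query`, `update`, `annotate`, the labels and the error rate) are hypothetical. At each step it asks the query whose posterior mass is closest to one half, a standard greedy heuristic, and then reweights the competing interpretations by Bayes' rule.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Query = FrozenSet[str]  # a subset of interpretations sharing some feature


def pick_query(posterior: Dict[str, float], queries: List[Query]) -> Query:
    """Greedy choice: the subset whose posterior mass is closest to 1/2."""
    return min(queries, key=lambda q: abs(sum(posterior[h] for h in q) - 0.5))


def update(posterior: Dict[str, float], query: Query, answer: bool,
           error_rate: float) -> Dict[str, float]:
    """Bayes update, assuming each answer is wrong with probability error_rate."""
    weighted = {}
    for h, p in posterior.items():
        consistent = (h in query) == answer
        weighted[h] = p * ((1.0 - error_rate) if consistent else error_rate)
    total = sum(weighted.values())
    return {h: w / total for h, w in weighted.items()}


def annotate(hypotheses: List[str], queries: List[Query],
             oracle: Callable[[Query], bool], error_rate: float = 0.1,
             threshold: float = 0.9, max_steps: int = 20
             ) -> Tuple[str, Dict[str, float], List[Query]]:
    """Ask queries until one interpretation dominates or the budget runs out."""
    posterior = {h: 1.0 / len(hypotheses) for h in hypotheses}  # uniform prior
    asked: List[Query] = []
    for _ in range(max_steps):
        if max(posterior.values()) >= threshold:
            break
        q = pick_query(posterior, queries)
        posterior = update(posterior, q, oracle(q), error_rate)
        asked.append(q)
    best = max(posterior, key=posterior.get)
    return best, posterior, asked


# Hypothetical toy scene: coarse queries (animal, vehicle) and fine ones (single labels).
labels = ["cat", "dog", "car", "bus"]
queries = [frozenset({"cat", "dog"}), frozenset({"car", "bus"})]
queries += [frozenset({name}) for name in labels]
truth = "dog"  # assumed ground truth, used only to simulate the answers
best, posterior, asked = annotate(labels, queries, lambda q: truth in q)
print(best, round(posterior[best], 3), len(asked))
```

In this toy run the first query is a coarse one and later queries single out individual labels, which loosely mirrors the switch from global monitoring to local scrutiny described above; a real system would replace the simulated oracle with image-based tests.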
Driving Problem: Tailor cancer treatment to an individual molecular profile.
Background: Sequencing and microarray experiments are generating vast amounts of data about abnormal perturbations in biological networks, enabling the inference of systems-level disease signatures. Cancer is perhaps the prototypical systems disease and the focus of extensive study in quantitative biology, including the development of algorithms to predict disease phenotypes, progression and treatment response for individuals. However, the translation of these efforts into personalized clinical care remains elusive; in particular, mature applications of statistical learning to inference at the patient level remain limited.
Research Strategy: Realizing this agenda requires recognizing and overcoming two major limitations of current methodology: (i) “black box” decision rules, which lack the biological meaning and algorithmic simplicity necessary for transition into the clinic; and (ii) the absence of biological mechanism, which is necessary for stable model selection, especially given the unfavorable ratio of samples to variables in current and any foreseeable learning scenarios. Embedding mechanism in the learning framework is therefore a “win-win” strategy, but it requires close, sustained collaboration between mathematicians and biologists.