#+title: Ullman Literature Review
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Review of some of the AI works of Professor Shimon Ullman.
#+keywords: Shimon, Ullman, computer vision, artificial intelligence, literature review
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Ullman

Actual code reuse!

precision = fraction of retrieved instances that are relevant
  = true-positives / (true-positives + false-positives)

recall = fraction of relevant instances that are retrieved
  = true-positives / total-in-class
  (where total-in-class = true-positives + false-negatives)

cross-validation = train and evaluate the model on different splits
of the data, to guard against overfitting and to confirm that you
have enough training samples.

Nifty, relevant, realistic ideas: he doesn't rely on implausible
assumptions.

** Our Reading

*** 2002 Visual features of intermediate complexity and their use in classification

Viola's PhD thesis has a good introduction to entropy and mutual
information.

** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

Brings in the most informative features of a class, based on the
mutual information between each feature and the class, estimated over
all the examples encountered so far. To bound the running time, he
uses only a fixed number of the most recent examples. He uses a
replacement strategy to tell whether a new feature is better than one
of the current features.

*** 2009 Learning model complexity in an online environment

Somewhat like the hierarchical Bayesian models of Tenenbaum, this
system makes the model more and more complicated as it gets more and
more training data. It does this by running two models of different
complexity in parallel; whenever the data seem to call for the more
complex one, the less complex one is thrown out and an even more
complex model is initialized in its place.

He uses an SVM with polynomial kernels of varying complexity. He gets
good performance on a handwriting-classification task across a large
range of training-set sizes, since his model changes complexity
depending on the number of training samples. The simpler models do
better with few training points, and the more complex ones do better
with many training points.

The final model had intermediate complexity between published
extremes.

The more complex models must be able to be initialized efficiently
from the less complex models they replace!
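
A minimal sketch of that complexity schedule, assuming scikit-learn
(not code from the paper): refit on everything seen so far after each
batch, and only step up to a higher-degree polynomial kernel when the
more complex SVM actually wins on held-out data. The batch size and
the cross-validation switching rule are my own stand-ins for whatever
criterion the paper actually uses.

#+begin_src python
# Sketch of the "grow model complexity online" idea above. The
# switching rule (promote to a higher-degree polynomial kernel only
# when it beats the current model on held-out data) is an assumed
# stand-in for the paper's actual criterion. Assumes each batch
# already contains several examples of every class.
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def grow_model_online(X, y, batch_size=50, max_degree=5):
    """Refit on all data seen so far after each new batch, stepping up
    the polynomial degree only when the more complex SVM wins."""
    degree, model = 1, None
    for end in range(batch_size, len(X) + 1, batch_size):
        X_seen, y_seen = X[:end], y[:end]
        current = SVC(kernel='poly', degree=degree)
        best_score = cross_val_score(current, X_seen, y_seen, cv=3).mean()
        if degree < max_degree:
            candidate = SVC(kernel='poly', degree=degree + 1)
            if cross_val_score(candidate, X_seen, y_seen, cv=3).mean() > best_score:
                degree, current = degree + 1, candidate  # drop the simpler model
        model = current.fit(X_seen, y_seen)
    return model, degree
#+end_src

A real implementation would also initialize the more complex model
from the simpler one it replaces, per the point above; this sketch
just refits from scratch.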
** Non-Parametric Models

[[../images/viola-parzen-1.png]]
[[../images/viola-parzen-2.png]]

*** 2010 The chains model for detecting parts by their context

Like the constellation method for rigid objects, but extended to
non-rigid objects as well.

Allows you to build a hand detector from a face detector. This is
useful because hands might be only a few pixels, and very ambiguous
in an image, but if you are expecting them at the end of an arm, then
they become easier to find.

They make chains by using spatial proximity of features. That way, a
hand can be identified by chaining back from the head. If there is a
good chain to the head, then it is more likely that there is a hand
than if there isn't. Since there is some give in the proximity
detection, the system can accommodate new poses that it has never
seen before.

Does not use any motion information.

*** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

(relative dynamic programming [RDP])

The goal is to match images, as in SIFT, but this time the images can
be subject to non-rigid transformations. They do this by finding
small patches that look the same, then building up bigger patches.
They get a tree of patches that describes each image, and find the
edit distance between each tree. Editing operations involve a
coherent shift of features, so they can accommodate local shifts of
patches in any direction. They get some cool results over just
straight correlation. Basically, they made an image comparator that
is resistant to multiple independent deformations.

!important small regions are treated the same as unimportant small
regions

!no conception of shape

quote:
The dynamic programming procedure looks for an optimal transformation
that aligns the patches of both images. This transformation is not a
global transformation, but a composition of many local
transformations of sub-patches at various sizes, performed one on top
of the other.

*** 2006 Satellite Features for the Classification of Visually Similar Classes

Finds features that can distinguish subclasses of a class by first
finding a rigid set of anchor features that are common to both
subclasses, then finding distinguishing satellite features relative
to those anchors. They keep things rigid because the satellite
features don't have much information in and of themselves, and are
only informative relative to other features.

*** 2005 Learning a novel class from a single example by cross-generalization

Lets you use a vast visual experience to generate a classifier for a
novel class, by generating synthetic examples that replace features
from the single example with features from similar classes.

quote: feature F is likely to be useful for class C if a similar
feature F' proved effective for a similar class C' in the past.

Allows you to transfer the "gestalt" of a similar class to a new
class, by adapting all the features of the learned class that have a
correspondence in the new class.

*** 2007 Semantic Hierarchies for Recognizing Objects and Parts

Better learning of complex objects like faces by learning each piece
(nose, mouth, eyes, etc.) separately, then making sure that the
features are in plausible positions.
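
As a concrete (and entirely made-up) illustration of the "plausible
positions" check, the sketch below runs a separate detector per part
and accepts the whole face only when each detected part lands in an
allowed region relative to the face center. The part names, offset
ranges, and hard accept/reject rule are placeholders; the paper
learns both the part detectors and the geometry.

#+begin_src python
# Toy check of "features in plausible positions", not the paper's
# method: the offset ranges below are invented placeholders.
# Offsets are (min_dx, max_dx, min_dy, max_dy), measured from the
# face center in units of face size (x grows right, y grows down).
PLAUSIBLE_OFFSETS = {
    'left-eye':  (-0.40, -0.10, -0.40, -0.10),
    'right-eye': ( 0.10,  0.40, -0.40, -0.10),
    'nose':      (-0.10,  0.10, -0.10,  0.20),
    'mouth':     (-0.20,  0.20,  0.20,  0.50),
}

def plausible_face(part_detections, face_center, face_size):
    """part_detections maps part name -> detected (x, y) position.
    Accept the face only if every detected part falls inside its
    plausible region relative to the face center."""
    cx, cy = face_center
    for part, (x, y) in part_detections.items():
        min_dx, max_dx, min_dy, max_dy = PLAUSIBLE_OFFSETS[part]
        dx, dy = (x - cx) / face_size, (y - cy) / face_size
        if not (min_dx <= dx <= max_dx and min_dy <= dy <= max_dy):
            return False
    return True

# Example: parts detected roughly where a face would put them.
print(plausible_face({'left-eye': (80, 90), 'right-eye': (120, 90),
                      'nose': (100, 105), 'mouth': (100, 130)},
                     face_center=(100, 100), face_size=100))  # => True
#+end_src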