
     1 #+title: Ullman Literature Review

     2 #+author: Robert McIntyre

     3 #+email: rlm@mit.edu

     4 #+description: Review of some of the AI works of Professor Shimon Ullman.

     5 #+keywords: Shimon, Ullman, computer vision, artificial intelligence, literature review

     6 #+SETUPFILE: ../../aurellem/org/setup.org

     7 #+INCLUDE: ../../aurellem/org/level-0.org

     8 #+babel: :mkdirp yes :noweb yes :exports both

     9 

    10 

    11 * Ullman 

    12 

    13 Actual code reuse!

    14 

    15 precision = fraction of retrieved instances that are relevant

    16   (true-positives/(true-positives+false-positives))

    17 

    18 recall    =  fraction of relevant instances that are retrieved

    19   (true-positives/total-in-class)
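
A minimal sketch of the two definitions above, assuming predictions and ground truth arrive as equal-length lists of booleans (True = "in class"). The function name `precision_recall` is made up for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for equal-length boolean sequences."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    total_in_class = sum(1 for a in actual if a)
    # Guard against empty denominators (nothing retrieved / nothing relevant).
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / total_in_class if total_in_class else 0.0
    return precision, recall
```

Returning 0.0 when a denominator is empty is a convention chosen here, not something the definitions dictate.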

    20 

    21 cross-validation = train on part of the data and test on the held-out

    22 rest, to detect overfitting and confirm you have enough training samples.
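
One common form of cross-validation is k-fold: split the data into k parts, hold each part out in turn, train on the rest, and average the held-out scores. The sketch below assumes hypothetical `fit` and `score` callables supplied by the caller; neither name comes from the papers reviewed here.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    # Spread the remainder over the first n % k folds.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Average held-out score over k folds.

    fit(train_xs, train_ys) -> model
    score(model, test_xs, test_ys) -> float
    """
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)
```

A large gap between training score and the averaged held-out score suggests overfitting. If the held-out score is still climbing as more data is added, more training samples would probably help.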

    23 

    24 nifty, relevant, realistic ideas

    25 He doesn't confine himself to implausible assumptions

    26 

    27 ** Our Reading

    28 

    29 *** 2002 Visual features of intermediate complexity and their use in classification

    30 

    31     

    32 

    33 

    34     Viola's PhD thesis has a good introduction to entropy and mutual

    35     information 

    36 

    37 ** Getting around the dumb "fixed training set" methods

    38 

    39 *** 2006 Learning to classify by ongoing feature selection

    40     

    41     Brings in the most informative features of a class, based on

    42     mutual information between that feature and all the examples

    43     encountered so far. To bound the running time, he uses only a

    44     fixed number of the most recent examples. He uses a replacement

    45     strategy to tell whether a new feature is better than one of the

    46     current features.
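
A rough sketch of the idea in this summary, not the paper's actual algorithm: features are scored by mutual information with the class label over a fixed-size window of recent examples, and a candidate replaces the weakest current feature only if it scores higher. The class name, the `propose` method, and the "feature = function from example to discrete value" representation are all assumptions made for illustration.

```python
import math
from collections import Counter, deque


def mutual_information(values, labels):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(labels)
    if n == 0:
        return 0.0
    joint = Counter(zip(values, labels))
    count_v = Counter(values)
    count_l = Counter(labels)
    mi = 0.0
    for (v, l), c in joint.items():
        # p(v,l) * log2( p(v,l) / (p(v) p(l)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_v[v] * count_l[l]))
    return mi


class OnlineFeatureSelector:
    def __init__(self, n_features, window):
        self.n_features = n_features
        self.selected = {}                  # name -> detector function
        self.recent = deque(maxlen=window)  # only the most recent examples

    def observe(self, example, label):
        self.recent.append((example, label))

    def score(self, detector):
        values = [detector(x) for x, _ in self.recent]
        labels = [l for _, l in self.recent]
        return mutual_information(values, labels)

    def propose(self, name, detector):
        """Adopt the candidate if there is room or it beats the weakest feature."""
        if len(self.selected) < self.n_features:
            self.selected[name] = detector
            return True
        weakest = min(self.selected, key=lambda k: self.score(self.selected[k]))
        if self.score(detector) > self.score(self.selected[weakest]):
            del self.selected[weakest]
            self.selected[name] = detector
            return True
        return False
```

The `deque(maxlen=window)` is what bounds the running time: scoring a feature costs one pass over the window no matter how many examples have been seen in total.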

    47 

    48 *** 2009 Learning model complexity in an online environment

    49     

    50     Sort of like the hierarchical Bayesian models of Tenenbaum, this

    51     system makes the model more and more complicated as it gets more

    52     and more training data. It does this by using two systems in

    53     parallel and then whenever the more complex one seems to be

    54     needed by the data, the less complex one is thrown out, and an

    55     even more complex model is initialized in its place.

    56 

    57     He uses an SVM with polynomial kernels of varying complexity. He

    58     gets good performance on a handwriting classification task across a

    59     wide range of training-set sizes, since his model changes complexity

    60     depending on the number of training samples. The simpler models do

    61     better with few training points, and the more complex ones do

    62     better with many training points.

    63 

    64     The final model had intermediate complexity between published

    65     extremes. 

    66 

    67     The more complex models must be able to be initialized efficiently

    68     from the less complex models which they replace!
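
The swap logic described above can be sketched as follows. This is only an illustration of the "two models in parallel" bookkeeping, not the paper's method: the model interface (`predict`, `update`), the `make_model(level, previous)` factory, and the windowed error comparison used as the promotion test are all assumptions. Passing `previous` to the factory stands in for initializing the more complex model from the one it replaces.

```python
from collections import deque


class ComplexityLadder:
    """Run a simpler and a more complex model side by side on a data stream."""

    def __init__(self, make_model, window=20, start_level=1):
        self.make_model = make_model  # hypothetical: (level, previous_model) -> model
        self.level = start_level
        self.simple = make_model(start_level, None)
        self.complex = make_model(start_level + 1, self.simple)
        self.err_simple = deque(maxlen=window)
        self.err_complex = deque(maxlen=window)

    def step(self, x, y):
        # Test-then-train: record each model's error before it sees the label.
        self.err_simple.append(self.simple.predict(x) != y)
        self.err_complex.append(self.complex.predict(x) != y)
        self.simple.update(x, y)
        self.complex.update(x, y)

        window_full = len(self.err_simple) == self.err_simple.maxlen
        if window_full and sum(self.err_complex) < sum(self.err_simple):
            # The data seems to need the complex model: drop the simple one
            # and start an even more complex model, seeded from the survivor.
            self.level += 1
            self.simple = self.complex
            self.complex = self.make_model(self.level + 1, self.simple)
            self.err_simple.clear()
            self.err_complex.clear()
        return self.level
```

With few samples the simpler model tends to win the comparison and nothing changes. As data accumulates, the more complex model starts to win and the ladder moves up one rung at a time.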

    69 

    70 

    71 ** Non-Parametric Models

    72 

    73 [[../images/viola-parzen-1.png]]

    74 [[../images/viola-parzen-2.png]]

    75 

    76 *** 2010 The chains model for detecting parts by their context

    77 

    78     Like the constellation method for rigid objects, but extended to

    79     non-rigid objects as well.

    80 

    81     Allows you to build a hand detector from a face detector. This is

    82     useful because hands might be only a few pixels, and very

    83     ambiguous in an image, but if you are expecting them at the end of

    84     an arm, then they become easier to find.

    85 

    86     They make chains by using spatial proximity of features. That way,

    87     a hand can be identified by chaining back from the head. If there

    88     is a good chain to the head, then it is more likely that there is

    89     a hand than if there isn't. Since there is some give in the

    90     proximity detection, the system can accommodate new poses that it

    91     has never seen before.
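
A toy sketch of the chaining idea, using spatial proximity only. The real model also uses feature appearance and is learned from training images; none of that is attempted here. Assumptions: features are 2D points, a hop between two features is weighted by a Gaussian of their distance, hops longer than `max_hop` are disallowed, and a chain's score is the product of its hop weights. The best chain is then a shortest path under negative log weights.

```python
import heapq
import math


def best_chain(points, start, goal, max_hop, sigma=1.0):
    """Return (score, chain) for the best chain from points[start] to points[goal].

    score is the product of exp(-d^2 / (2 sigma^2)) over the hops of the chain;
    (0.0, []) is returned when no chain of short hops connects the two.
    """
    cost = {start: 0.0}  # negative log of the best chain score found so far
    prev = {}
    done = set()
    heap = [(0.0, start)]
    while heap:
        c, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == goal:
            break
        for j, q in enumerate(points):
            if j in done:
                continue
            d = math.dist(points[i], q)
            if d > max_hop:
                continue  # too far apart to be neighbours in a chain
            new_cost = c + d * d / (2 * sigma * sigma)
            if new_cost < cost.get(j, math.inf):
                cost[j] = new_cost
                prev[j] = i
                heapq.heappush(heap, (new_cost, j))
    if goal not in cost:
        return 0.0, []
    chain = [goal]
    while chain[-1] != start:
        chain.append(prev[chain[-1]])
    chain.reverse()
    return math.exp(-cost[goal]), chain
```

Usage: take `start` as a feature on the detected face and `goal` as a candidate hand location. A high score means a good chain back to the head exists. Because each hop only has to be short, not in a fixed place, an arm pose never seen before can still produce a good chain.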

    92 

    93     Does not use any motion information.

    94 

    95 *** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

    96     

    97     (relative dynamic programming [RDP])

    98 

    99     Goal is to match images, as in SIFT, but this time the images can

   100     be subject to non-rigid transformations. They do this by finding

   101     small patches that look the same, then building up bigger

   102     patches. They get a tree of patches that describes each image, and

   103     find the edit distance between each tree. Editing operations

   104     involve a coherent shift of features, so they can accommodate local

   105     shifts of patches in any direction. They get some cool results

   106     over just straight correlation. Basically, they made an image

   107     comparator that is resistant to multiple independent deformations.

   108     

   109     !important small regions are treated the same as unimportant

   110      small regions

   111      

   112     !no conception of shape

   113     

   114     quote:

   115     The dynamic programming procedure looks for an optimal

   116     transformation that aligns the patches of both images. This

   117     transformation is not a global transformation, but a composition

   118     of many local transformations of sub-patches at various sizes,

   119     performed one on top of the other.
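
To make "a composition of many local transformations of sub-patches at various sizes, performed one on top of the other" concrete, here is a simplified recursive sketch. It is not the paper's RDP algorithm. Assumptions: images are square lists of lists of numbers with power-of-two side length, each patch splits into four quadrants, each quadrant may shift by at most one pixel relative to its parent, and shifts carry a small penalty. All function and parameter names are made up.

```python
def patch_cost(a, b, top, left, size, dy, dx):
    """Sum of squared differences between a patch of a and the shifted patch of b."""
    h, w = len(b), len(b[0])
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            yy = min(max(y + dy, 0), h - 1)  # clamp at the image border
            xx = min(max(x + dx, 0), w - 1)
            total += (a[y][x] - b[yy][xx]) ** 2
    return total


def match_cost(a, b, top, left, size, dy=0, dx=0, min_size=2, shift_penalty=1):
    """Minimum cost of aligning a square patch of a to b.

    Each quadrant picks its own shift relative to the parent's accumulated
    shift (dy, dx), so the total deformation is a stack of local shifts.
    """
    if size <= min_size:
        return patch_cost(a, b, top, left, size, dy, dx)
    half = size // 2
    total = 0
    for qy in (top, top + half):
        for qx in (left, left + half):
            # Quadrants are independent given the parent's shift,
            # so each one can be minimized on its own.
            total += min(
                match_cost(a, b, qy, qx, half, dy + sy, dx + sx,
                           min_size, shift_penalty)
                + shift_penalty * (abs(sy) + abs(sx))
                for sy in (-1, 0, 1)
                for sx in (-1, 0, 1)
            )
    return total
```

Comparing `match_cost(a, b, 0, 0, n)` with the rigid `patch_cost(a, b, 0, 0, n, 0, 0)` shows the gain over straight correlation when different regions have moved independently. Note that this sketch shares the weakness listed above: every small region counts the same, important or not.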

   120 

   121 *** 2006 Satellite Features for the Classification of Visually Similar Classes

   122     

   123     Finds features that can distinguish subclasses of a class, by

   124     first finding a rigid set of anchor features that are common to

   125     both subclasses, then finding distinguishing features relative to

   126     those subfeatures. They keep things rigid because the satellite

   127     features don't have much information in and of themselves, and are

   128     only informative relative to other features.

   129 

   130 *** 2005 Learning a novel class from a single example by cross-generalization.

   131 

   132     Lets you use vast prior visual experience to build a classifier

   133     for a novel class: synthetic examples are generated by replacing

   134     features of the single example with features from similar

   135     classes.

   136 

   137     quote: feature F is likely to be useful for class C if a similar

   138     feature F proved effective for a similar class C in the past.

   139 

   140     Allows you to transfer the "gestalt" of a similar class to a new

   141     class, by adapting all the features of the learned class that have

   142     correspondence to the new class.

   143 

   144 *** 2007 Semantic Hierarchies for Recognizing Objects and Parts

   145 

   146     Better learning of complex objects like faces by learning each

   147     piece (like nose, mouth, eye, etc) separately, then making sure

   148     that the features are in plausible positions.
author	Robert McIntyre <rlm@mit.edu>
date	Wed, 29 May 2013 17:17:08 -0400