author Robert McIntyre <rlm@mit.edu>
date Fri, 28 Mar 2014 21:48:53 -0400 (2014-03-29)
#+title: Ullman Literature Review
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Review of some of the AI works of Professor Shimon Ullman.
#+keywords: Shimon, Ullman, computer vision, artificial intelligence, literature review
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Ullman

Actual code reuse!

precision = fraction of retrieved instances that are relevant
(true-positives/(true-positives+false-positives))

recall = fraction of relevant instances that are retrieved
(true-positives/total-in-class)
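
As a quick check on the two definitions above, here is a minimal
sketch that computes both from parallel lists of predictions and
ground truth. The function name =precision_recall= and the sample
data are made up for illustration; total-in-class is counted as
true-positives plus false-negatives.

#+begin_src python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans.

    predicted[i] is True when item i was retrieved;
    actual[i] is True when item i is really in the class.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)       # retrieved and relevant
    fp = sum(1 for p, a in pairs if p and not a)   # retrieved but irrelevant
    fn = sum(1 for p, a in pairs if not p and a)   # relevant but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data: 3 items retrieved, 2 of them relevant;
# 4 items in the class, 2 of them retrieved.
predicted = [True, True, True, False, False]
actual = [True, False, True, True, True]
precision, recall = precision_recall(predicted, actual)
print(precision, recall)
#+end_src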

cross-validation = train the model on one part of the data and test
it on a held-out part, rotating which part is held out, to detect
overfitting and to confirm that you have enough training samples.
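
A minimal k-fold sketch of that idea, using only the standard
library. The names =k_fold_splits=, =cross_validate=, =fit_majority=
and =accuracy= are hypothetical, and the majority-class "model" is
only there to make the example runnable.

#+begin_src python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(samples, labels, k, fit, score):
    """Average the held-out score over k folds.

    fit(xs, ys) returns a model; score(model, xs, ys) returns a float.
    """
    scores = []
    for train, test in k_fold_splits(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        scores.append(score(model,
                            [samples[i] for i in test],
                            [labels[i] for i in test]))
    return sum(scores) / len(scores)


# Toy model (an assumption for the demo): always predict the majority label.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)


def accuracy(model, xs, ys):
    return sum(1 for y in ys if y == model) / len(ys)


samples = [0, 1, 2, 3, 4, 5]
labels = [1, 1, 1, 1, 0, 1]
print(cross_validate(samples, labels, 3, fit_majority, accuracy))
#+end_src

A large gap between the training score and this held-out score is
the usual sign of overfitting; a held-out score that keeps improving
as samples are added suggests the training set is still too small.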

nifty, relevant, realistic ideas
He doesn't confine himself to implausible assumptions

** Our Reading

*** 2002 Visual features of intermediate complexity and their use in classification

Viola's PhD thesis has a good introduction to entropy and mutual
information
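
For reference, the mutual information between a discrete feature F
and a class label C is the sum over all (f, c) of
p(f,c) * log2(p(f,c) / (p(f) * p(c))). A minimal sketch that
estimates it from co-occurrence counts follows; the function name and
the toy data are assumptions, not taken from the paper or the thesis.

#+begin_src python
from collections import Counter
from math import log2


def mutual_information(feature, label):
    """Estimate I(F;C) in bits from two parallel sequences of discrete values."""
    n = len(feature)
    joint = Counter(zip(feature, label))
    p_f = Counter(feature)
    p_c = Counter(label)
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        mi += p_fc * log2(p_fc / ((p_f[f] / n) * (p_c[c] / n)))
    return mi


# A feature that always agrees with a balanced binary label carries 1 bit;
# a feature that is independent of the label carries 0 bits.
print(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
#+end_src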

** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

Brings in the most informative features of a class, based on the
mutual information between each feature and all the examples
encountered so far. To bound the running time, he uses only a
fixed number of the most recent examples. He uses a replacement
strategy to tell whether a new feature is better than one of the
current features.
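
A rough sketch of how such a scheme could look, assuming binary
features and a simple "replace the weakest feature" rule. The class
=OngoingSelector=, its method names, and the replacement rule are
hypothetical stand-ins; the paper's actual criterion may differ.

#+begin_src python
from collections import Counter, deque
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two parallel sequences of discrete values."""
    n = len(xs)
    joint, p_x, p_y = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((p_x[x] / n) * (p_y[y] / n)))
               for (x, y), c in joint.items())


class OngoingSelector:
    def __init__(self, features, window_size):
        self.features = dict(features)            # name -> function(example) -> 0/1
        self.window = deque(maxlen=window_size)   # only the most recent examples

    def observe(self, example, label):
        self.window.append((example, label))

    def score(self, fn):
        """Mutual information between a feature's responses and the labels."""
        responses = [fn(ex) for ex, _ in self.window]
        labels = [label for _, label in self.window]
        return mutual_information(responses, labels)

    def consider(self, name, fn):
        """Swap the candidate in if it beats the weakest current feature."""
        weakest = min(self.features, key=lambda k: self.score(self.features[k]))
        if self.score(fn) > self.score(self.features[weakest]):
            del self.features[weakest]
            self.features[name] = fn
            return weakest
        return None


# Toy data: each example is a pair (a, b) and the label equals a.
selector = OngoingSelector({"const": lambda ex: 0, "b": lambda ex: ex[1]},
                           window_size=4)
for ex in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    selector.observe(ex, ex[0])
replaced = selector.consider("a", lambda ex: ex[0])
print(replaced, sorted(selector.features))
#+end_src

The =deque= with =maxlen= is what bounds the running time: scoring a
candidate never touches more than =window_size= examples.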

*** 2009 Learning model complexity in an online environment

Sort of like the hierarchical Bayesian models of Tenenbaum, this
system makes the model more and more complicated as it gets more
and more training data. It does this by running two models in
parallel: whenever the data seem to call for the more complex one,
the less complex one is thrown out, and an even more complex model
is initialized in its place.

He uses an SVM with polynomial kernels of varying complexity. He
gets good performance on a handwriting classification task across a
large range of training-set sizes, since his model changes
complexity depending on the number of training samples. The simpler
models do better with few training points, and the more complex
ones do better with many training points.

The final model had intermediate complexity between published
extremes.

The more complex models must be able to be initialized efficiently
from the less complex models which they replace!
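
The two-models-in-parallel idea can be sketched with a toy model
family. Everything here is an assumption made for illustration: the
paper uses SVMs with polynomial kernels, whereas =BinModel= is a
piecewise-constant regressor whose complexity is its number of bins,
and the promotion rule (compare summed absolute error over a recent
window) is invented. The point is only the control flow, including
initializing the new complex model from the one it replaces.

#+begin_src python
from collections import deque


class BinModel:
    """Toy regressor on [0, 1) with 2**depth bins; predicts the bin mean."""

    def __init__(self, depth, parent=None):
        self.depth = depth
        n = 2 ** depth
        if parent is None:
            self.sums = [0.0] * n
            self.counts = [0.0] * n
        else:
            # Cheap initialization: each child bin inherits half of the
            # evidence of the parent bin that covers it (parent has n/2 bins).
            self.sums = [parent.sums[i // 2] / 2 for i in range(n)]
            self.counts = [parent.counts[i // 2] / 2 for i in range(n)]

    def _bin(self, x):
        return min(int(x * len(self.sums)), len(self.sums) - 1)

    def predict(self, x):
        i = self._bin(x)
        return self.sums[i] / self.counts[i] if self.counts[i] else 0.0

    def update(self, x, y):
        i = self._bin(x)
        self.sums[i] += y
        self.counts[i] += 1


class GrowingLearner:
    """Keeps a simple and a complex model; grows when the complex one wins."""

    def __init__(self, window=20):
        self.window = window
        self.simple, self.complex = BinModel(0), BinModel(1)
        self.err_simple = deque(maxlen=window)
        self.err_complex = deque(maxlen=window)

    def observe(self, x, y):
        # Test-then-train: score both models on the new point before updating.
        self.err_simple.append(abs(self.simple.predict(x) - y))
        self.err_complex.append(abs(self.complex.predict(x) - y))
        self.simple.update(x, y)
        self.complex.update(x, y)
        full = len(self.err_complex) == self.window
        if full and sum(self.err_complex) < sum(self.err_simple):
            # Throw out the simple model and start an even more complex one.
            self.simple, self.err_simple = self.complex, self.err_complex
            self.complex = BinModel(self.simple.depth + 1, parent=self.simple)
            self.err_complex = deque(maxlen=self.window)


# Toy target: a step at x = 0.5, which two bins capture exactly.
learner = GrowingLearner(window=20)
for i in range(200):
    x = (i * 0.37) % 1.0
    learner.observe(x, 1.0 if x >= 0.5 else 0.0)
print(learner.simple.depth, learner.complex.depth)
#+end_src

On this toy target the learner grows once, from one bin to two, and
then stops, because the four-bin model never beats the two-bin one.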

** Non-Parametric Models

[[../images/viola-parzen-1.png]]
[[../images/viola-parzen-2.png]]

*** 2010 The chains model for detecting parts by their context

Like the constellation method for rigid objects, but extended to
non-rigid objects as well.

Allows you to build a hand detector from a face detector. This is
useful because hands might be only a few pixels, and very
ambiguous in an image, but if you are expecting them at the end of
an arm, then they become easier to find.

They make chains by using spatial proximity of features. That way,
a hand can be identified by chaining back from the head. If there
is a good chain to the head, then it is more likely that there is
a hand than if there isn't. Since there is some give in the
proximity detection, the system can accommodate new poses that it
has never seen before.

Does not use any motion information.
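
A toy illustration of the chaining idea, not the paper's
probabilistic model. It assumes features are 2D points, that the
strength of a link falls off as a Gaussian of distance (the "give"),
and that a chain's score is the product of its link strengths. The
names =link_score=, =best_chain= and the =reach= parameter are
hypothetical.

#+begin_src python
import heapq
from math import exp, hypot, log


def link_score(a, b, reach):
    """Soft proximity in (0, 1]: nearby features link strongly."""
    d = hypot(a[0] - b[0], a[1] - b[1])
    return exp(-(d / reach) ** 2)


def best_chain(points, start, goal, reach):
    """Return (score, path) of the strongest chain from start to goal.

    Maximizing a product of link scores is the same as minimizing the
    sum of -log(score), which is non-negative, so Dijkstra applies.
    """
    cost = {start: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, start)]
    while heap:
        c, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == goal:
            break
        for j in range(len(points)):
            if j in done:
                continue
            s = link_score(points[i], points[j], reach)
            if s <= 0.0:          # link too weak to represent (underflow)
                continue
            new_cost = c - log(s)
            if new_cost < cost.get(j, float("inf")):
                cost[j] = new_cost
                prev[j] = i
                heapq.heappush(heap, (new_cost, j))
    if goal not in cost:
        return 0.0, []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return exp(-cost[goal]), path


# Head, shoulder, elbow, candidate hand at the end of the arm,
# and a stray hand-like blob far from any chain.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 5)]
arm_score, arm_path = best_chain(points, start=0, goal=3, reach=1.0)
stray_score, _ = best_chain(points, start=0, goal=4, reach=1.0)
print(arm_path, arm_score > stray_score)
#+end_src

The candidate at the end of the arm gets a much stronger chain back
to the head than the stray blob, which is the sense in which context
makes an ambiguous part easier to find.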

*** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

(relative dynamic programming [RDP])

Goal is to match images, as in SIFT, but this time the images can
be subject to non-rigid transformations. They do this by finding
small patches that look the same, then building up bigger
patches. They get a tree of patches that describes each image, and
find the edit distance between the two trees. Editing operations
involve a coherent shift of features, so they can accommodate local
shifts of patches in any direction. They get some cool results
over just straight correlation. Basically, they made an image
comparator that is resistant to multiple independent deformations.

!important small regions are treated the same as unimportant
small regions

!no conception of shape

quote:
The dynamic programming procedure looks for an optimal
transformation that aligns the patches of both images. This
transformation is not a global transformation, but a composition
of many local transformations of sub-patches at various sizes,
performed one on top of the other.

*** 2006 Satellite Features for the Classification of Visually Similar Classes

Finds features that can distinguish subclasses of a class, by
first finding a rigid set of anchor features that are common to
both subclasses, then finding distinguishing features relative to
those anchor features. They keep things rigid because the satellite
features don't have much information in and of themselves, and are
only informative relative to other features.

*** 2005 Learning a novel class from a single example by cross-generalization

Lets you use a vast visual experience to generate a classifier
for a novel class by generating synthetic examples by replacing
features from the single example with features from similar
classes.

quote: feature F is likely to be useful for class C if a similar
feature F proved effective for a similar class C in the past.

Allows you to transfer the "gestalt" of a similar class to a new
class, by adapting all the features of the learned class that have
correspondence to the new class.
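
A bare-bones sketch of the feature-replacement step as described in
these notes, assuming features are small numeric vectors and
"similar" means nearest in squared Euclidean distance. The function
names and the =bank= of features from familiar classes are
hypothetical; the paper's similarity measure and adaptation step are
not reproduced here.

#+begin_src python
def nearest(feature, bank, k):
    """The k features in the bank closest to `feature`."""
    def dist(other):
        return sum((a - b) ** 2 for a, b in zip(feature, other))
    return sorted(bank, key=dist)[:k]


def synthetic_examples(example, bank, k):
    """Make variants of a single example by swapping one feature at a time.

    Each feature of the example is replaced, in turn, by each of its
    k nearest neighbors among features learned from familiar classes.
    """
    variants = []
    for i, feature in enumerate(example):
        for substitute in nearest(feature, bank, k):
            variant = list(example)
            variant[i] = substitute
            variants.append(variant)
    return variants


# One example of the novel class, described by two features.
example = [(0.0, 0.0), (1.0, 1.0)]
# Features that proved useful for already-learned classes.
bank = [(0.1, 0.0), (0.9, 1.0), (5.0, 5.0)]
for variant in synthetic_examples(example, bank, k=1):
    print(variant)
#+end_src

The synthetic variants would then serve as extra training examples
for the novel class's classifier.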

*** 2007 Semantic Hierarchies for Recognizing Objects and Parts

Better learning of complex objects like faces by learning each
piece (like nose, mouth, eye, etc) separately, then making sure
that the features are in plausible positions.