#+title: Ullman Literature Review
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Review of some of the AI works of Professor Shimon Ullman.
#+keywords: Shimon, Ullman, computer vision, artificial intelligence, literature review
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Ullman

Actual code reuse!

precision = fraction of retrieved instances that are relevant
(true-positives / (true-positives + false-positives))

recall = fraction of relevant instances that are retrieved
(true-positives / total-in-class, where total-in-class is
true-positives + false-negatives)

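A minimal sketch of the two definitions above, assuming predictions
and ground truth arrive as parallel lists of booleans. The name
=precision_recall= and the toy data are illustrative and do not come
from any of the papers.

#+begin_src python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # retrieved and relevant
    fp = sum(1 for p, a in pairs if p and not a)    # retrieved, not relevant
    fn = sum(1 for p, a in pairs if not p and a)    # relevant, not retrieved
    # Convention assumed here: report 0.0 when a denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical detector output on six images; True means "object present".
predicted = [True, True, True, False, False, False]
actual = [True, True, False, True, False, False]
print(precision_recall(predicted, actual))
#+end_src

With two true positives, one false positive and one false negative,
both numbers come out to 2/3.
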
cross-validation = repeatedly split the data into a training set and
a held-out test set, train on one and score on the other, to detect
overfitting and to confirm that you have enough training samples.

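One common form is k-fold cross-validation. The sketch below assumes
that form; the helpers =k_fold_splits=, =cross_validate=,
=fit_threshold= and =score_accuracy= are hypothetical names made up
for this example, and the classifier is a deliberately trivial
one-dimensional threshold.

#+begin_src python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def cross_validate(xs, ys, k, fit, score):
    """Average held-out score over k folds."""
    scores = []
    for train, test in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)

def fit_threshold(xs, ys):
    """Toy model: the midpoint between the two class means."""
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def score_accuracy(threshold, xs, ys):
    """Fraction of held-out points on the correct side of the threshold."""
    return sum((x > threshold) == bool(y) for x, y in zip(xs, ys)) / len(xs)

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
print(cross_validate(xs, ys, 4, fit_threshold, score_accuracy))
#+end_src

A large gap between training score and the held-out average suggests
overfitting. Real data would normally be shuffled before splitting;
the folds here are contiguous only to keep the sketch short.
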
Nifty, relevant, realistic ideas.
He doesn't confine himself to implausible assumptions.

** Our Reading

*** 2002 Visual features of intermediate complexity and their use in classification

Viola's PhD thesis has a good introduction to entropy and mutual
information.

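For reference, these are the standard textbook definitions for
discrete random variables $X$ and $Y$ (logarithms base 2 give bits).
The notation is generic and is not taken from the thesis or the paper.

\begin{align*}
H(X) &= -\sum_{x} p(x) \log_2 p(x) \\
H(X \mid Y) &= -\sum_{x,y} p(x,y) \log_2 p(x \mid y) \\
I(X;Y) &= H(X) - H(X \mid Y)
        = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}
\end{align*}

Mutual information is zero when $X$ and $Y$ are independent, and
equals $H(X)$ when $Y$ determines $X$ completely.
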
** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

Brings in the most informative features of a class, based on the
mutual information between that feature and the class, estimated
over the examples encountered so far. To bound the running time, he
uses only a fixed number of the most recent examples. A replacement
strategy decides whether a new feature is better than one of the
current features.

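A rough sketch of that loop, under simplifying assumptions that are
mine and not the paper's: features and labels are binary, the
"recent examples" are a fixed-size =deque=, and a candidate replaces
the weakest current feature only if its mutual information is
strictly higher. =OngoingSelector= and its methods are hypothetical
names.

#+begin_src python
from collections import deque
from math import log2

def mutual_information(pairs):
    """Mutual information in bits between two discrete variables,
    estimated from a list of (feature_value, class_label) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint, f_count, c_count = {}, {}, {}
    for f, c in pairs:
        joint[(f, c)] = joint.get((f, c), 0) + 1
        f_count[f] = f_count.get(f, 0) + 1
        c_count[c] = c_count.get(c, 0) + 1
    return sum((v / n) * log2(v * n / (f_count[f] * c_count[c]))
               for (f, c), v in joint.items())

class OngoingSelector:
    def __init__(self, n_slots, window):
        self.n_slots = n_slots
        self.recent = deque(maxlen=window)  # keeps only the newest examples
        self.selected = []                  # names of the current features

    def observe(self, features, label):
        """features: dict mapping feature name to 0/1; label: 0/1."""
        self.recent.append((features, label))

    def score(self, name):
        return mutual_information(
            [(f.get(name, 0), c) for f, c in self.recent])

    def propose(self, name):
        """Offer a candidate feature; return True if it is now selected."""
        if name in self.selected:
            return False
        if len(self.selected) < self.n_slots:
            self.selected.append(name)
            return True
        weakest = min(self.selected, key=self.score)
        if self.score(name) > self.score(weakest):
            self.selected[self.selected.index(weakest)] = name
            return True
        return False

sel = OngoingSelector(n_slots=1, window=4)
for feats, label in [({"wheel": 1, "sky": 1}, 1), ({"wheel": 1, "sky": 0}, 1),
                     ({"wheel": 0, "sky": 1}, 0), ({"wheel": 0, "sky": 0}, 0)]:
    sel.observe(feats, label)
sel.propose("sky")    # takes the free slot
sel.propose("wheel")  # more informative, so it replaces "sky"
print(sel.selected)
#+end_src

In this toy stream "sky" is independent of the label (0 bits) while
"wheel" predicts it perfectly (1 bit), so "wheel" ends up selected.
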
*** 2009 Learning model complexity in an online environment

Sort of like the hierarchical Bayesian models of Tenenbaum, this
system makes the model more and more complicated as it gets more
and more training data. It does this by running two models in
parallel: whenever the data seem to need the more complex one, the
less complex one is thrown out, and an even more complex model is
initialized in its place.

He uses an SVM with polynomial kernels of varying complexity. He
gets good performance on a handwriting classification task across a
large range of training-set sizes, since his model changes complexity
depending on the number of training samples. The simpler models do
better with few training points, and the more complex ones do
better with many training points.

The final model had intermediate complexity between published
extremes.

The more complex models must be able to be initialized efficiently
from the less complex models which they replace!

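The two-models-in-parallel idea can be sketched without SVMs. The toy
below swaps in piecewise-constant regressors on [0, 1) (complexity =
number of bins) purely because they are easy to write and easy to
initialize from a simpler model; it is not the polynomial-kernel
setup from the paper. =BinModel=, =GrowingLearner= and the
windowed-error promotion rule are all assumptions made for
illustration.

#+begin_src python
from collections import deque

class BinModel:
    """Piecewise-constant regressor on [0, 1) with 2**level equal bins."""
    def __init__(self, level, sums=None, counts=None):
        self.level = level
        n = 2 ** level
        self.sums = sums if sums is not None else [0.0] * n
        self.counts = counts if counts is not None else [0] * n

    def _bin(self, x):
        return min(int(x * len(self.sums)), len(self.sums) - 1)

    def predict(self, x):
        b = self._bin(x)
        return self.sums[b] / self.counts[b] if self.counts[b] else 0.0

    def update(self, x, y):
        b = self._bin(x)
        self.sums[b] += y
        self.counts[b] += 1

    def refine(self):
        """Cheaply initialize the next model: each bin splits in two and
        both halves inherit the parent's mean as one pseudo-observation."""
        sums, counts = [], []
        for s, c in zip(self.sums, self.counts):
            mean = s / c if c else 0.0
            sums += [mean, mean]
            counts += [1 if c else 0] * 2
        return BinModel(self.level + 1, sums, counts)

class GrowingLearner:
    def __init__(self, window=20):
        self.simple = BinModel(0)
        self.complex = self.simple.refine()
        self.errors = deque(maxlen=window)  # recent (simple, complex) errors

    def observe(self, x, y):
        # Score both models on the new point before training on it.
        self.errors.append(((self.simple.predict(x) - y) ** 2,
                            (self.complex.predict(x) - y) ** 2))
        self.simple.update(x, y)
        self.complex.update(x, y)
        if len(self.errors) == self.errors.maxlen:
            simple_err = sum(e[0] for e in self.errors)
            complex_err = sum(e[1] for e in self.errors)
            if complex_err < simple_err:
                # The data call for more complexity: drop the simpler model
                # and start an even more complex one in its place.
                self.simple = self.complex
                self.complex = self.simple.refine()
                self.errors.clear()

learner = GrowingLearner(window=20)
for i in range(200):
    x = (i * 0.37) % 1.0
    learner.observe(x, 1.0 if x >= 0.5 else 0.0)  # a step function
print(learner.simple.level, learner.complex.level)
#+end_src

On this step-function stream the one-bin model is replaced once by
the two-bin model, after which the four-bin candidate never does
strictly better, so growth stops at an intermediate complexity.
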
** Non-Parametric Models

[[../images/viola-parzen-1.png]]
[[../images/viola-parzen-2.png]]

*** 2010 The chains model for detecting parts by their context

Like the constellation method for rigid objects, but extended to
non-rigid objects as well.

Allows you to build a hand detector from a face detector. This is
useful because hands might be only a few pixels across and very
ambiguous in an image, but if you are expecting them at the end of
an arm, then they become easier to find.

They make chains by using spatial proximity of features. That way,
a hand can be identified by chaining back from the head. If there
is a good chain to the head, then it is more likely that there is
a hand than if there isn't. Since there is some give in the
proximity detection, the system can accommodate new poses that it
has never seen before.

Does not use any motion information.

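To make the chaining intuition concrete, here is a deliberately
simplified sketch. It replaces whatever inference the paper actually
performs with a plain shortest-path search: features are named 2-D
points, two features may be linked only if they are within
=max_link= of each other, and a candidate hand is judged by the
cheapest chain back to the face. The function =best_chain=, the
feature names and the coordinates are all made up for illustration.

#+begin_src python
import heapq
from math import dist

def best_chain(points, start, goal, max_link):
    """Cheapest chain of features from start to goal, linking only
    features at most max_link apart. Returns (cost, path), or
    (inf, []) when no chain exists."""
    frontier = [(0.0, start, [start])]
    done = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for other in points:
            if other in done:
                continue
            d = dist(points[node], points[other])
            if d <= max_link:  # the "give" in the proximity test
                heapq.heappush(frontier, (cost + d, other, path + [other]))
    return float("inf"), []

# Hypothetical feature locations in image coordinates.
points = {
    "face": (0, 0),
    "shoulder": (1, 1),
    "elbow": (2, 2),
    "hand_a": (3, 3),   # sits at the end of the arm
    "hand_b": (9, 0),   # hand-like blob with no arm leading to it
}
print(best_chain(points, "face", "hand_a", max_link=2.0))
print(best_chain(points, "face", "hand_b", max_link=2.0))
#+end_src

Here =hand_a= is reachable through shoulder and elbow, while
=hand_b= has no chain at all and would be rejected. Because links
only require features to be near each other, the same code accepts
arm poses it has never seen.
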
*** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

(relative dynamic programming [RDP])

Goal is to match images, as in SIFT, but this time the images can
be subject to non-rigid transformations. They do this by finding
small patches that look the same, then building up bigger
patches. They get a tree of patches that describes each image, and
find the edit distance between the two trees. Editing operations
involve a coherent shift of features, so they can accommodate local
shifts of patches in any direction. They get some cool results
over just straight correlation. Basically, they made an image
comparator that is resistant to multiple independent deformations.

!important small regions are treated the same as unimportant
small regions

!no conception of shape

quote:
The dynamic programming procedure looks for an optimal
transformation that aligns the patches of both images. This
transformation is not a global transformation, but a composition
of many local transformations of sub-patches at various sizes,
performed one on top of the other.

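The "local transformations performed one on top of the other" idea
can be illustrated with a toy that is much simpler than RDP itself:
1-D signals instead of images, and no tree edit distance. Each patch
may shift by up to =radius= relative to its parent's shift, then
splits in half and lets each half shift again. The function
=hierarchical_cost= and its parameters are assumptions for this
sketch only.

#+begin_src python
from functools import lru_cache

def ssd(p, q):
    """Sum of squared differences: the straight, shift-free comparison."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def hierarchical_cost(a, b, radius=1, min_size=2):
    """Cost of matching a to b when every patch may shift by up to
    radius relative to its parent patch's shift."""
    @lru_cache(maxsize=None)
    def cost(start, size, offset):
        best = float("inf")
        for shift in range(-radius, radius + 1):
            o = offset + shift               # shift composed with the parent's
            lo, hi = start + o, start + o + size
            if lo < 0 or hi > len(b):
                continue                     # shifted patch falls outside b
            if size <= min_size:
                c = ssd(a[start:start + size], b[lo:hi])
            else:
                half = size // 2
                c = (cost(start, half, o)
                     + cost(start + half, size - half, o))
            best = min(best, c)
        return best
    return cost(0, len(a), 0)

# Two independent deformations: the left bump moves right by one
# sample, the right bump moves left by one sample.
a = [0, 5, 0, 0, 0, 0, 7, 0]
b = [0, 0, 5, 0, 0, 7, 0, 0]
print(ssd(a, b), hierarchical_cost(a, b))
#+end_src

Straight comparison gives a cost of 148, while the hierarchical
matcher finds a zero-cost alignment because each half is allowed its
own shift. Like the method in the notes, this toy weighs every
region equally and has no notion of shape.
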
*** 2006 Satellite Features for the Classification of Visually Similar Classes

Finds features that can distinguish subclasses of a class, by
first finding a rigid set of anchor features that are common to
both subclasses, then finding distinguishing features relative to
those anchors. They keep things rigid because the satellite
features don't have much information in and of themselves, and are
only informative relative to other features.

*** 2005 Learning a novel class from a single example by cross-generalization

Lets you use a vast visual experience to generate a classifier
for a novel class: synthetic examples are generated by replacing
features from the single example with features from similar
classes.

quote: feature F is likely to be useful for class C if a similar
feature F proved effective for a similar class C in the past.

Allows you to transfer the "gestalt" of a similar class to a new
class, by adapting all the features of the learned class that have
a correspondence in the new class.

*** 2007 Semantic Hierarchies for Recognizing Objects and Parts

Better learning of complex objects like faces by learning each
piece (nose, mouth, eye, etc.) separately, then making sure
that the features are in plausible positions.