#+title: Ullman Literature Review
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Review of some of the AI works of Professor Shimon Ullman.
#+keywords: Shimon, Ullman, computer vision, artificial intelligence, literature review
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Ullman

Actual code reuse!

precision = fraction of retrieved instances that are relevant
  = true-positives / (true-positives + false-positives)

recall = fraction of relevant instances that are retrieved
  = true-positives / total-in-class
  (where total-in-class = true-positives + false-negatives)

cross-validation = train and evaluate the model on different splits
of the data, to guard against overfitting and to confirm that you
have enough training samples.

Nifty, relevant, realistic ideas: he doesn't rely on implausible
assumptions.

** Our Reading

*** 2002 Visual features of intermediate complexity and their use in classification

Viola's PhD thesis has a good introduction to entropy and mutual
information.

** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

Brings in the most informative features of a class, based on the
mutual information between each feature and the class, estimated over
all the examples encountered so far. To bound the running time, he
uses only a fixed number of the most recent examples. He uses a
replacement strategy to tell whether a new feature is better than one
of the current features.

*** 2009 Learning model complexity in an online environment

Somewhat like the hierarchical Bayesian models of Tenenbaum, this
system makes the model more and more complicated as it gets more and
more training data. It does this by running two models of different
complexity in parallel; whenever the data seem to call for the more
complex one, the less complex one is thrown out and an even more
complex model is initialized in its place.

He uses an SVM with polynomial kernels of varying complexity. He gets
good performance on a handwriting-classification task across a large
range of training-set sizes, since his model changes complexity
depending on the number of training samples. The simpler models do
better with few training points, and the more complex ones do better
with many training points.

The final model had intermediate complexity between published
extremes.

The more complex models must be able to be initialized efficiently
from the less complex models they replace!
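
A minimal sketch of that complexity schedule, assuming scikit-learn
(not code from the paper): refit on everything seen so far after each
batch, and only step up to a higher-degree polynomial kernel when the
more complex SVM actually wins on held-out data. The batch size and
the cross-validation switching rule are my own stand-ins for whatever
criterion the paper actually uses.

#+begin_src python
# Sketch of the "grow model complexity online" idea above. The
# switching rule (promote to a higher-degree polynomial kernel only
# when it beats the current model on held-out data) is an assumed
# stand-in for the paper's actual criterion. Assumes each batch
# already contains several examples of every class.
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def grow_model_online(X, y, batch_size=50, max_degree=5):
    """Refit on all data seen so far after each new batch, stepping up
    the polynomial degree only when the more complex SVM wins."""
    degree, model = 1, None
    for end in range(batch_size, len(X) + 1, batch_size):
        X_seen, y_seen = X[:end], y[:end]
        current = SVC(kernel='poly', degree=degree)
        best_score = cross_val_score(current, X_seen, y_seen, cv=3).mean()
        if degree < max_degree:
            candidate = SVC(kernel='poly', degree=degree + 1)
            if cross_val_score(candidate, X_seen, y_seen, cv=3).mean() > best_score:
                degree, current = degree + 1, candidate  # drop the simpler model
        model = current.fit(X_seen, y_seen)
    return model, degree
#+end_src

A real implementation would also initialize the more complex model
from the simpler one it replaces, per the point above; this sketch
just refits from scratch.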
** Non-Parametric Models

[[../images/viola-parzen-1.png]]
[[../images/viola-parzen-2.png]]

*** 2010 The chains model for detecting parts by their context

Like the constellation method for rigid objects, but extended to
non-rigid objects as well.

Allows you to build a hand detector from a face detector. This is
useful because hands might be only a few pixels, and very ambiguous
in an image, but if you are expecting them at the end of an arm, then
they become easier to find.

They make chains by using spatial proximity of features. That way, a
hand can be identified by chaining back from the head. If there is a
good chain to the head, then it is more likely that there is a hand
than if there isn't. Since there is some give in the proximity
detection, the system can accommodate new poses that it has never
seen before.

Does not use any motion information.

*** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

(relative dynamic programming [RDP])

The goal is to match images, as in SIFT, but this time the images can
be subject to non-rigid transformations. They do this by finding
small patches that look the same, then building up bigger patches.
They get a tree of patches that describes each image, and find the
edit distance between each tree. Editing operations involve a
coherent shift of features, so they can accommodate local shifts of
patches in any direction. They get some cool results over just
straight correlation. Basically, they made an image comparator that
is resistant to multiple independent deformations.

!important small regions are treated the same as unimportant small
regions

!no conception of shape

quote:
The dynamic programming procedure looks for an optimal transformation
that aligns the patches of both images. This transformation is not a
global transformation, but a composition of many local
transformations of sub-patches at various sizes, performed one on top
of the other.

*** 2006 Satellite Features for the Classification of Visually Similar Classes

Finds features that can distinguish subclasses of a class by first
finding a rigid set of anchor features that are common to both
subclasses, then finding distinguishing satellite features relative
to those anchors. They keep things rigid because the satellite
features don't have much information in and of themselves, and are
only informative relative to other features.

*** 2005 Learning a novel class from a single example by cross-generalization

Lets you use a vast visual experience to generate a classifier for a
novel class, by generating synthetic examples that replace features
from the single example with features from similar classes.

quote: feature F is likely to be useful for class C if a similar
feature F' proved effective for a similar class C' in the past.

Allows you to transfer the "gestalt" of a similar class to a new
class, by adapting all the features of the learned class that have a
correspondence in the new class.

*** 2007 Semantic Hierarchies for Recognizing Objects and Parts

Better learning of complex objects like faces by learning each piece
(nose, mouth, eyes, etc.) separately, then making sure that the
features are in plausible positions.
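
As a concrete (and entirely made-up) illustration of the "plausible
positions" check, the sketch below runs a separate detector per part
and accepts the whole face only when each detected part lands in an
allowed region relative to the face center. The part names, offset
ranges, and hard accept/reject rule are placeholders; the paper
learns both the part detectors and the geometry.

#+begin_src python
# Toy check of "features in plausible positions", not the paper's
# method: the offset ranges below are invented placeholders.
# Offsets are (min_dx, max_dx, min_dy, max_dy), measured from the
# face center in units of face size (x grows right, y grows down).
PLAUSIBLE_OFFSETS = {
    'left-eye':  (-0.40, -0.10, -0.40, -0.10),
    'right-eye': ( 0.10,  0.40, -0.40, -0.10),
    'nose':      (-0.10,  0.10, -0.10,  0.20),
    'mouth':     (-0.20,  0.20,  0.20,  0.50),
}

def plausible_face(part_detections, face_center, face_size):
    """part_detections maps part name -> detected (x, y) position.
    Accept the face only if every detected part falls inside its
    plausible region relative to the face center."""
    cx, cy = face_center
    for part, (x, y) in part_detections.items():
        min_dx, max_dx, min_dy, max_dy = PLAUSIBLE_OFFSETS[part]
        dx, dy = (x - cx) / face_size, (y - cy) / face_size
        if not (min_dx <= dx <= max_dx and min_dy <= dy <= max_dy):
            return False
    return True

# Example: parts detected roughly where a face would put them.
print(plausible_face({'left-eye': (80, 90), 'right-eye': (120, 90),
                      'nose': (100, 105), 'mouth': (100, 130)},
                     face_center=(100, 100), face_size=100))  # => True
#+end_src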