When I write my thesis, I want it to have links to every

* Object Recognition from Local Scale-Invariant Features, David G. Lowe

  This is the famous SIFT paper that is mentioned everywhere.

  This is a way to find objects in images given an image of that
  object. It is moderately resistant to variations between the sample
  image and the target image. Basically, this is a fancy way of
  picking out a test pattern embedded in a larger pattern. It would
  fail to learn anything resembling object categories, for instance.
  A useful concept is the idea of storing the local scale and rotation
  of each feature as it is extracted from the image, then checking
  that proposed matches all more-or-less agree on shift, rotation,
  scale, etc. Another good idea is to use points instead of edges,
  since they seem more robust.

** References
   - Basri, Ronen, and David W. Jacobs, "Recognition using region
     correspondences," International Journal of Computer Vision, 25, 2
     (1996), pp. 141-162.
   - Edelman, Shimon, Nathan Intrator, and Tomaso Poggio, "Complex
     cells and object recognition," unpublished manuscript, preprint at
     http://www.ai.mit.edu/edelman/mirror/nips97.ps.Z
   - Lindeberg, Tony, "Detecting salient blob-like image structures
     and their scales with a scale-space primal sketch: a method for
     focus-of-attention," International Journal of Computer Vision,
     11, 3 (1993), pp. 283-318.
   - Murase, Hiroshi, and Shree K. Nayar, "Visual learning and
     recognition of 3-D objects from appearance," International
     Journal of Computer Vision, 14, 1 (1995), pp. 5-24.
   - Ohba, Kohtaro, and Katsushi Ikeuchi, "Detectability, uniqueness,
     and reliability of eigen windows for stable verification of
     partially occluded objects," IEEE Trans. on Pattern Analysis and
     Machine Intelligence, 19, 9 (1997), pp. 1043-1048.
   - Zhang, Z., R. Deriche, O. Faugeras, and Q. T. Luong, "A robust
     technique for matching two uncalibrated images through the
     recovery of the unknown epipolar geometry," Artificial
     Intelligence, 78 (1995), pp. 87-119.

* Alignment by Maximization of Mutual Information, Paul A. Viola

  PhD thesis recommended by Winston. Describes a system that is able
  to align a 3D computer model of an object with an image of that
  object.

  - Pages 9-19 are a very adequate intro to the algorithm.

  - Has a useful section on entropy and probability at the beginning
    which is worth reading, especially the part about entropy.

  - Differential entropy seems a bit odd -- you would think that it
    should be the same as normal entropy for a discrete distribution
    embedded in continuous space. How do you measure the entropy of a
    half-continuous, half-discrete random variable? Perhaps the
    problem is related to the delta function, and not the definition
    of differential entropy.

  - Expectation Maximization (mixture-of-Gaussians fitting)
    (Dempster 1977).

  - Good introduction to Parzen window density estimation. Parzen
    density functions trade construction time for evaluation time
    (pg. 41). They are a way to transform a sample into a
    distribution. They don't work very well in higher dimensions due
    to the thinning of sample points. (See the sketch after this
    list.)

  - Calculating the entropy of a Markov model (or state machine,
    program, etc.) seems like it would be very hard, since each trial
    would not be independent of the other trials. Yet there are many
    common-sense models that do need to have state to accurately
    model the world.

  - "... there is no direct procedure for evaluating entropy from a
    sample. A common approach is to model the density from the
    sample, and then estimate the entropy from the density."

  - pg. 55: he says that infinity minus infinity is zero, which is
    hard to take seriously.

  - Great idea on pg. 62 about using random samples from images to
    speed up computation.

  - Practical way of terminating a random search: "A better idea is
    to reduce the learning rate until the parameters have a
    reasonable variance and then take the average parameters."

  - p. 65: a dubious hack to make his Parzen window estimates work.

  - This alignment only works if the initial pose is not very far
    off.

  Occlusion? Seems a bit holistic.
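  Since the Parzen-window idea and the quoted "model the density from
  the sample, then estimate the entropy from the density" trick come up
  repeatedly, here is a minimal sketch of that technique in one
  dimension. This is my own toy illustration (Gaussian kernels,
  made-up data and bandwidth =sigma=), not Viola's EMMA implementation:

#+begin_src python
import numpy as np

def parzen_density(sample, sigma):
    """Density estimate built from `sample` with Gaussian kernels of
    width `sigma`.  Cheap to construct, expensive to evaluate: every
    query touches every sample point."""
    sample = np.asarray(sample, dtype=float)
    def p(x):
        return np.mean(np.exp(-0.5 * ((x - sample) / sigma) ** 2)
                       / (sigma * np.sqrt(2.0 * np.pi)))
    return p

def entropy_estimate(sample, sigma):
    """Estimate differential entropy H = -E[log p(X)] by building a
    Parzen density from half the sample and averaging -log p over
    the other half."""
    sample = np.asarray(sample, dtype=float)
    half = len(sample) // 2
    p = parzen_density(sample[:half], sigma)
    return -np.mean([np.log(p(x)) for x in sample[half:]])

rng = np.random.default_rng(0)
data = rng.normal(size=400)
print("estimated entropy:", entropy_estimate(data, sigma=0.25))
print("exact for N(0,1): ", 0.5 * np.log(2 * np.pi * np.e))
#+end_src

  The curse-of-dimensionality complaint in the notes shows up here
  directly: in higher dimensions the kernels around a fixed number of
  sample points cover almost none of the space, so p(x) is badly
  underestimated almost everywhere.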
** References
   - "Excellent" book on entropy: Cover, T. and Thomas, J. (1991).
     Elements of Information Theory.
   - Canny, J. (1986). A Computational Approach to Edge Detection.
     IEEE Transactions on PAMI, PAMI-8(6):679-698.
   - Chin, R. and Dyer, C. (1986). Model-Based Recognition in Robot
     Vision. Computing Surveys, 18:67-108.
   - Grimson, W., Lozano-Perez, T., Wells, W., et al. (1994). An
     Automatic Registration Method for Frameless Stereotaxy, Image
     Guided Surgery, and Enhanced Reality Visualization. In
     Proceedings of the Computer Society Conference on Computer
     Vision and Pattern Recognition, Seattle, WA. IEEE.
   - Hill, D. L., Studholme, C., and Hawkes, D. J. (1994). Voxel
     Similarity Measures for Automated Image Registration. In
     Proceedings of the Third Conference on Visualization in
     Biomedical Computing, pages 205-216. SPIE.
   - Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization
     by Simulated Annealing. Science, 220(4598):671-680.
   - Jones, M. and Poggio, T. (1995). Model-based matching of line
     drawings by linear combinations of prototypes. In Proceedings of
     the International Conference on Computer Vision.
   - Ljung, L. and Soderstrom, T. (1983). Theory and Practice of
     Recursive Identification. MIT Press.
   - Shannon, C. E. (1948). A mathematical theory of communication.
     Bell System Technical Journal, 27:379-423 and 623-656.
   - Shashua, A. (1992). Geometry and Photometry in 3D Visual
     Recognition. PhD thesis, M.I.T. Artificial Intelligence
     Laboratory, AI-TR-1401.
   - Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,
     W. T. (1992). Numerical Recipes in C: The Art of Scientific
     Computing. Cambridge University Press, Cambridge, England,
     second edition.

* Semi-Automated Dialogue Act Classification for Situated Social Agents in Games, Deb Roy

  Interesting attempt to learn "social scripts" related to restaurant
  behaviour. The authors do this by creating a game which implements
  a virtual restaurant and recording actual human players as they
  interact with the game. They learn scripts from annotated
  interactions and then use those scripts to label other
  interactions. They don't get very good results, but their
  methodology of creating a virtual world and recording
  low-dimensional actions is interesting.

  - Torque 2D/3D looks like an interesting game engine.

* Face Recognition by Humans: Nineteen Results all Computer Vision Researchers should know, Sinha

  This is a summary of a lot of bio experiments on human face
  recognition.

  - They assert again that the internal gradients/structures of a
    face are more important than the edges.

  - It's amazing to me that it takes about 10 years after birth for a
    human to reach advanced, adult-like face recognition. They move
    from feature-based processing to a holistic approach during this
    time.

  - Finally, color is a very important cue for identifying faces.

** References
   - A. Freire, K. Lee, and L. A. Symons, "The face-inversion effect
     as a deficit in the encoding of configural information: Direct
     evidence," Perception, vol. 29, no. 2, pp. 159-170, 2000.
   - M. B. Lewis, "Thatcher's children: Development and the Thatcher
     illusion," Perception, vol. 32, pp. 1415-1421, 2003.
   - E. McKone and N. Kanwisher, "Does the human brain process
     objects of expertise like faces? A review of the evidence," in
     From Monkey Brain to Human Brain, S. Dehaene, J. R. Duhamel,
     M. Hauser, and G. Rizzolatti, Eds. Cambridge, MA: MIT Press,
     2005.

* Ullman

  Actual code reuse!

  precision = fraction of retrieved instances that are relevant
  (true-positives / (true-positives + false-positives))

  recall = fraction of relevant instances that are retrieved
  (true-positives / total-in-class)

  cross-validation = train the model on one subset of the data and
  test it on another, rotating the split, to guard against
  overfitting.

  Nifty, relevant, realistic ideas; he doesn't confine himself to
  implausible assumptions.
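  Spelling out the precision/recall arithmetic defined above as a tiny
  helper (the counts in the example are invented):

#+begin_src python
def precision(tp, fp):
    """Fraction of retrieved instances that are relevant."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of relevant instances that are retrieved
    (fn = relevant instances the detector missed)."""
    return tp / (tp + fn)

# Invented example: 60 true instances in the data, the detector
# fires 50 times, 40 of those firings are correct.
tp, fp, fn = 40, 10, 20
print("precision =", precision(tp, fp))   # 0.8
print("recall    =", recall(tp, fn))      # ~0.67
#+end_src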
** Our Reading

*** 2002 Visual features of intermediate complexity and their use in classification

** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

    Brings in the most informative features of a class, based on
    mutual information between that feature and all the examples
    encountered so far. To bound the running time, he uses only a
    fixed number of the most recent examples. He uses a replacement
    strategy to tell whether a new feature is better than one of the
    current features. (A rough sketch of this bookkeeping follows.)
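    As I read it, the bookkeeping amounts to something like the sketch
    below: keep a bounded buffer of recent labelled examples, score
    each feature by its mutual information with the class label over
    that buffer, and let a candidate replace the weakest current
    feature when it scores higher. This is my own reconstruction for
    illustration, not Ullman's code; features are treated as simple
    presence/absence flags.

#+begin_src python
import math
from collections import deque

def mutual_information(xs, ys):
    """Plug-in discrete mutual information (in nats) between two
    equal-length sequences: I(X;Y) = sum p(x,y) log p(x,y)/(p(x)p(y))."""
    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

class OngoingFeatureSelector:
    """Keep the `n_features` most informative features, judged only
    on the last `buffer_size` labelled examples seen so far."""
    def __init__(self, n_features=10, buffer_size=200):
        self.n_features = n_features
        self.examples = deque(maxlen=buffer_size)  # (features, label)
        self.features = []                         # current working set

    def observe(self, present_features, label, candidates):
        self.examples.append((set(present_features), label))
        labels = [lab for _, lab in self.examples]

        def score(f):
            fired = [1 if f in feats else 0 for feats, _ in self.examples]
            return mutual_information(fired, labels)

        # Replacement strategy: a candidate enters only if it beats
        # the weakest feature currently in the working set.
        for f in candidates:
            if f in self.features:
                continue
            if len(self.features) < self.n_features:
                self.features.append(f)
            else:
                worst = min(self.features, key=score)
                if score(f) > score(worst):
                    self.features[self.features.index(worst)] = f
#+end_src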
*** 2009 Learning model complexity in an online environment

    Somewhat like the hierarchical Bayesian models of Tenenbaum, this
    system makes the model more and more complicated as it gets more
    and more training data. It does this by running two systems in
    parallel; whenever the more complex one seems to be needed by the
    data, the less complex one is thrown out and an even more complex
    model is initialized in its place.

    He uses an SVM with polynomial kernels of varying complexity. He
    gets good performance on a handwriting classification task across
    a large range of training-set sizes, since his model changes
    complexity depending on the number of training samples. The
    simpler models do better with few training points, and the more
    complex ones do better with many training points.

    The final model had intermediate complexity between published
    extremes.

    The more complex models must be able to be initialized efficiently
    from the less complex models which they replace!
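    A rough sketch of how the parallel-model scheme could be wired up,
    with scikit-learn polynomial-kernel SVMs standing in for the
    paper's classifiers. The promotion test (the challenger must beat
    the current model on a window of recent data by some margin) and
    all the thresholds are my own invention for illustration; note
    that refitting the challenger from scratch ignores the
    efficient-initialization point above.

#+begin_src python
from sklearn.svm import SVC

class GrowingComplexityClassifier:
    """Train a 'current' model and a more complex challenger side by
    side; when the challenger clearly wins on recent data, promote it
    and start an even more complex challenger."""
    def __init__(self, start_degree=1, margin=0.02):
        self.degree = start_degree   # polynomial kernel degree
        self.margin = margin
        self.X, self.y = [], []
        self.model = None

    def _fit(self, degree):
        m = SVC(kernel="poly", degree=degree)
        m.fit(self.X, self.y)
        return m

    def update(self, X_new, y_new, X_recent, y_recent):
        """Absorb new training data, refit both models, and promote
        the challenger if it beats the current model on the recent
        window by more than `margin`."""
        self.X.extend(X_new)
        self.y.extend(y_new)
        current = self._fit(self.degree)
        challenger = self._fit(self.degree + 1)
        if (challenger.score(X_recent, y_recent)
                > current.score(X_recent, y_recent) + self.margin):
            self.degree += 1        # discard the simpler model...
            current = challenger    # ...the challenger takes over
        self.model = current
        return self.model
#+end_src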
** Non-Parametric Models

*** 2010 The chains model for detecting parts by their context

    Like the constellation method for rigid objects, but extended to
    non-rigid objects as well.

    Allows you to build a hand detector from a face detector. This is
    useful because hands might be only a few pixels and very ambiguous
    in an image, but if you are expecting them at the end of an arm,
    then they become easier to find.

    They make chains by using spatial proximity of features. That way,
    a hand can be identified by chaining back from the head. If there
    is a good chain to the head, then it is more likely that there is
    a hand than if there isn't. Since there is some give in the
    proximity detection, the system can accommodate new poses that it
    has never seen before.

    Does not use any motion information.

*** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

    (relative dynamic programming [RDP])

    The goal is to match images, as in SIFT, but this time the images
    can be subject to non-rigid transformations. They do this by
    finding small patches that look the same, then building up bigger
    patches. They get a tree of patches that describes each image, and
    find the edit distance between each tree. Editing operations
    involve a coherent shift of features, so they can accommodate
    local shifts of patches in any direction. They get some cool
    results over just straight correlation. Basically, they made an
    image comparator that is resistant to multiple independent
    deformations.

    !important small regions are treated the same as unimportant
    small regions

    !no conception of shape

    quote:
    The dynamic programming procedure looks for an optimal
    transformation that aligns the patches of both images. This
    transformation is not a global transformation, but a composition
    of many local transformations of sub-patches at various sizes,
    performed one on top of the other.

*** 2006 Satellite Features for the Classification of Visually Similar Classes

    Finds features that can distinguish subclasses of a class, by
    first finding a rigid set of anchor features that are common to
    both subclasses, then finding distinguishing features relative to
    those anchors. They keep things rigid because the satellite
    features don't carry much information in and of themselves, and
    are only informative relative to other features.

*** 2005 Learning a novel class from a single example by cross-generalization

    Lets you use a vast visual experience to generate a classifier for
    a novel class, by generating synthetic examples in which features
    from the single example are replaced with features from similar
    classes.

    quote: feature F is likely to be useful for class C if a similar
    feature F' proved effective for a similar class C' in the past.

    Allows you to transfer the "gestalt" of a similar class to a new
    class, by adapting all the features of the learned class that have
    correspondence to the new class.

*** 2007 Semantic Hierarchies for Recognizing Objects and Parts

    Better learning of complex objects like faces by learning each
    piece (like nose, mouth, eye, etc.) separately, then making sure
    that the features are in plausible positions.
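    The "learn the pieces separately, then check that they sit in
    plausible positions" step can be made concrete with a toy
    star-model check. The part names, offsets, and tolerances below
    are invented for illustration; a real system would learn them from
    data.

#+begin_src python
# Toy plausibility check for a part-based face model: every part must
# be detected, and its offset from the face center must fall inside a
# learned (here: invented) range.  Coordinates are (x, y) pixels.
EXPECTED_OFFSETS = {            # part -> (mean_dx, mean_dy, tolerance)
    "left_eye":  (-15, -20, 8),
    "right_eye": ( 15, -20, 8),
    "nose":      (  0,   0, 6),
    "mouth":     (  0,  22, 9),
}

def plausible_face(face_center, part_detections):
    """`part_detections` maps part name -> detected (x, y), or None if
    that part detector did not fire.  True only if every part was
    found roughly where the face model expects it."""
    cx, cy = face_center
    for part, (dx, dy, tol) in EXPECTED_OFFSETS.items():
        hit = part_detections.get(part)
        if hit is None:
            return False
        x, y = hit
        if abs((x - cx) - dx) > tol or abs((y - cy) - dy) > tol:
            return False
    return True

# Parts found close to their expected offsets -> plausible face.
print(plausible_face((100, 100),
                     {"left_eye": (86, 79), "right_eye": (116, 81),
                      "nose": (101, 102), "mouth": (99, 121)}))
#+end_src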