
     1 When I write my thesis, I want it to have links to every 

     2 

     3 

     4 

     5 * Object Recognition from Local Scale-Invariant Features, David G. Lowe

     6   

     7   This is the famous SIFT paper that is mentioned everywhere.

     8 

     9   This is a way to find objects in images given an image of that

    10   object. It is moderately resistant to variations in the sample image

    11   and the target image. Basically, this is a fancy way of picking out

    12   a test pattern embedded in a larger pattern. It would fail to learn

    13   anything resembling object categories, for instance. A useful concept

    14   is the idea of storing the local scale and rotation of each feature

    15   as it is extracted from the image, then checking to make sure that

    16   proposed matches all more-or-less agree on shift, rotation, scale,

    17   etc.  Another good idea is to use points instead of edges, since

    18   they seem more robust.
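
The consistency check described above can be sketched roughly as follows. This is a simplified illustration in the spirit of Hough-style voting, not Lowe's implementation: each proposed match is assumed to carry a (dx, dy, rotation, scale ratio) pose estimate, matches vote into coarse bins, and only the matches in the most popular bin are kept. The function name `consistent_matches` and the bin widths are assumptions chosen for the example.

```python
import math
from collections import defaultdict

def consistent_matches(matches, shift_bin=20.0,
                       angle_bin=math.radians(30), scale_bin=math.log(2)):
    """Keep the matches whose implied pose falls in the most popular bin.

    Each match is a tuple (dx, dy, rotation_radians, scale_ratio).
    Bin widths are illustrative assumptions, not values from the paper.
    """
    bins = defaultdict(list)
    for match in matches:
        dx, dy, rot, scale = match
        key = (
            math.floor(dx / shift_bin),
            math.floor(dy / shift_bin),
            math.floor((rot % (2 * math.pi)) / angle_bin),
            math.floor(math.log(scale) / scale_bin),
        )
        bins[key].append(match)
    # The largest bin is the pose that most matches agree on.
    return max(bins.values(), key=len) if bins else []

proposed = [
    (100, 50, 0.10, 1.10),
    (105, 55, 0.12, 1.20),
    (102, 52, 0.05, 1.05),
    (-300, 10, 2.00, 0.30),  # disagrees with the others
]
kept = consistent_matches(proposed)
```

Hard bin edges can split a true cluster in two, so a real system would likely vote into neighbouring bins as well.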

    19 

    20 ** References:

    21  - Basri, Ronen, and David W. Jacobs, “Recognition using region

    22   correspondences,” International Journal of Computer Vision, 25, 2

    23   (1996), pp. 141–162.

    24   

    25  - Edelman, Shimon, Nathan Intrator, and Tomaso Poggio, “Complex

    26   cells and object recognition,” Unpublished Manuscript, preprint at

    27   http://www.ai.mit.edu/edelman/mirror/nips97.ps.Z

    28   

    29  - Lindeberg, Tony, “Detecting salient blob-like image structures

    30   and their scales with a scale-space primal sketch: a method for

    31   focus-of-attention,” International Journal of Computer Vision, 11, 3

    32   (1993), pp. 283–318.

    33   

    34  - Murase, Hiroshi, and Shree K. Nayar, “Visual learning and

    35   recognition of 3-D objects from appearance,” International Journal

    36   of Computer Vision, 14, 1 (1995), pp. 5–24.

    37 

    38  - Ohba, Kohtaro, and Katsushi Ikeuchi, “Detectability, uniqueness,

    39   and reliability of eigen windows for stable verification of

    40   partially occluded objects,” IEEE Trans. on Pattern Analysis and

    41   Machine Intelligence, 19, 9 (1997), pp. 1043–48.

    42 

    43  - Zhang, Z., R. Deriche, O. Faugeras, Q.T. Luong, “A robust

    44   technique for matching two uncalibrated images through the recovery

    45   of the unknown epipolar geometry,” Artificial Intelligence, 78,

    46   (1995), pp. 87–119.

    47 

    48 

    49 

    50 

    51    

    52 * Alignment by Maximization of Mutual Information, Paul A. Viola

    53 

    54   PhD Thesis recommended by Winston. Describes a system that is able

    55   to align a 3D computer model of an object with an image of that

    56   object. 

    57   

    58   - Pages 9-19 are a very adequate intro to the algorithm.

    59   

    60   - Has a useful section on entropy and probability at the beginning

    61     which is worth reading, especially the part about entropy.

    62 

    63   - Differential entropy seems a bit odd -- you would think that it

    64     should be the same as normal entropy for a discrete distribution

    65     embedded in continuous space. How do you measure the entropy of a

    66     half continuous, half discrete random variable? Perhaps the

    67     problem is related to the delta function, and not the definition

    68     of differential entropy?

    69 

    70   - Expectation Maximization (Mixture of Gaussians cool stuff)

    71     (Dempster 1977)

    72 

    73   - Good introduction to Parzen Window Density Estimation. Parzen

    74     density functions trade construction time for evaluation

    75     time (pg. 41). They are a way to transform a sample into a

    76     distribution. They don't work very well in higher dimensions due

    77     to the thinning of sample points.

    78 

    79   - Calculating the entropy of a Markov Model (or state machine,

    80     program, etc) seems like it would be very hard, since each trial

    81     would not be independent of the other trials. Yet, there are many

    82     common sense models that do need to have state to accurately model

    83     the world.

    84 

    85   - "... there is no direct procedure for evaluating entropy from a

    86     sample. A common approach is to model the density from the sample,

    87     and then estimate the entropy from the density."

    88 

    89   - pg. 55 he says that infinity minus infinity is zero lol.

    90 

    91   - great idea on pg 62 about using random samples from images to

    92     speed up computation.

    93 

    94   - practical way of terminating a random search: "A better idea is to

    95     reduce the learning rate until the parameters have a reasonable

    96     variance and then take the average parameters."

    97 

    98   - p. 65 bullshit hack to make his parzen window estimates work.

    99 

   100   - this alignment only works if the initial pose is not very far

   101     off. 
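
The Parzen window and "entropy from the density" notes above can be sketched in a few lines. This is a minimal 1-D illustration, not the thesis's code: it assumes a Gaussian kernel with a hand-picked width `sigma`, and the function names are invented here. Construction is free (the sample is simply stored), while each evaluation costs one pass over the whole sample, which is the trade-off mentioned above.

```python
import math

def parzen_density(x, sample, sigma=0.5):
    """Gaussian Parzen window estimate of the density at x (1-D).

    sigma is an assumed kernel width; choosing it well is its own problem.
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in sample)
    return norm * total / len(sample)

def entropy_estimate(eval_sample, kernel_sample, sigma=0.5):
    """Estimate entropy (in nats) as the mean negative log-density.

    The density is built from kernel_sample and evaluated at eval_sample.
    Points very far from every kernel centre would underflow to log(0).
    """
    logs = [math.log(parzen_density(x, kernel_sample, sigma))
            for x in eval_sample]
    return -sum(logs) / len(logs)

tight = [0.0, 0.1, -0.1, 0.05]
spread = [0.0, 5.0, -5.0, 10.0]
h_tight = entropy_estimate(tight, tight)
h_spread = entropy_estimate(spread, spread)
```

A tightly clustered sample should come out with lower estimated entropy than a widely spread one, which is a quick sanity check on the estimator.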

   102 

   103 

   104   Occlusion? Seems a bit holistic.
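
Returning to the termination rule quoted above (reduce the learning rate, then average the parameters), here is a toy sketch under stated assumptions: a 1-D parameter, a gradient corrupted by unit Gaussian noise, and a hypothetical two-stage rate schedule. None of these numbers come from the thesis.

```python
import random

def noisy_descent(grad, x0, rates=(0.1, 0.01), steps_per_rate=2000,
                  average_last=1000, seed=0):
    """Stochastic descent that lowers the learning rate in stages and
    returns the average of the final parameter values.

    The schedule and the noise model are assumptions for illustration.
    """
    rng = random.Random(seed)
    x = x0
    tail = []
    for rate in rates:
        tail = []  # only the last (smallest) rate contributes to the average
        for step in range(steps_per_rate):
            x -= rate * (grad(x) + rng.gauss(0.0, 1.0))
            if step >= steps_per_rate - average_last:
                tail.append(x)
    return sum(tail) / len(tail)

# Minimise (x - 3)^2 / 2, whose gradient is x - 3.
estimate = noisy_descent(lambda x: x - 3.0, x0=0.0)
```

With the smaller rate the parameter jitters less around the optimum, and averaging the tail removes most of the remaining jitter.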

   105 

   106 ** References

   107  - "excellent" book on entropy (Cover & Thomas, 1991) [Elements of

   108    Information Theory.] 

   109 

   110  - Canny, J. (1986). A Computational Approach to Edge Detection. IEEE

   111    Transactions PAMI, PAMI-8(6):679–698.

   112 

   113  - Chin, R. and Dyer, C. (1986). Model-Based Recognition in Robot

   114    Vision. Computing Surveys, 18:67-108.

   115 

   116  - Grimson, W., Lozano-Perez, T., Wells, W., et al. (1994). An

   117    Automatic Registration Method for Frameless Stereotaxy, Image

   118    Guided Surgery, and Enhanced Reality Visualization. In Proceedings

   119    of the Computer Society Conference on Computer Vision and Pattern

   120    Recognition, Seattle, WA. IEEE.

   121 

   122  - Hill, D. L., Studholme, C., and Hawkes, D. J. (1994). Voxel

   123    Similarity Measures for Automated Image Registration. In

   124    Proceedings of the Third Conference on Visualization in Biomedical

   125    Computing, pages 205–216. SPIE.

   126 

   127  - Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by Simulated

   128    Annealing. Science, 220(4598):671-680.

   129 

   130  - Jones, M. and Poggio, T. (1995). Model-based matching of line

   131    drawings by linear combinations of prototypes. Proceedings of the

   132    International Conference on Computer Vision.

   133 

   134  - Ljung, L. and Soderstrom, T. (1983). Theory and Practice of

   135    Recursive Identification. MIT Press.

   136 

   137  - Shannon, C. E. (1948). A mathematical theory of communication. Bell

   138    System Technical Journal, 27:379-423 and 623-656.

   139 

   140  - Shashua, A. (1992). Geometry and Photometry in 3D Visual

   141    Recognition. PhD thesis, M.I.T Artificial Intelligence Laboratory,

   142    AI-TR-1401.

   143 

   144  - Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,

   145    W. T. (1992). Numerical Recipes in C: The Art of Scientific

   146    Computing. Cambridge University Press, Cambridge, England, second

   147    edition.

   148 

   149 * Semi-Automated Dialogue Act Classification for Situated Social Agents in Games, Deb Roy 

   150   

   151   Interesting attempt to learn "social scripts" related to restaurant

   152   behaviour. The authors do this by creating a game which implements a

   153   virtual restaurant, and recording actual human players as they

   154   interact with the game. They learn scripts from annotated

   155   interactions and then use those scripts to label other

   156   interactions. They don't get very good results, but their

   157   methodology of creating a virtual world and recording

   158   low-dimensional actions is interesting.

   159 

   160   - Torque 2D/3D looks like an interesting game engine.

   161 

   162 

   163 * Face Recognition by Humans: Nineteen Results all Computer Vision Researchers should know, Sinha

   164   

   165   This is a summary of a lot of bio experiments on human face

   166   recognition.

   167   

   168   - They assert again that the internal gradients/structures of a face

   169     are more important than the edges.

   170 

   171   - It's amazing to me that it takes about 10 years after birth for a

   172     human to get advanced adult-like face detection. They progress from

   173     feature-based processing to a holistic approach during this

   174     time.

   175 

   176   - Finally, color is a very important cue for identifying faces.

   177 

   178 ** References

   179   - A. Freire, K. Lee, and L. A. Symons, “The face-inversion effect as

   180     a deficit in the encoding of configural information: Direct

   181     evidence,” Perception, vol. 29, no. 2, pp. 159–170, 2000.

   182   - M. B. Lewis, “Thatcher’s children: Development and the Thatcher

   183     illusion,” Perception, vol. 32, pp. 1415–21, 2003.

   184   - E. McKone and N. Kanwisher, “Does the human brain process objects

   185     of expertise like faces? A review of the evidence,” in From Monkey

   186     Brain to Human Brain, S. Dehaene, J. R. Duhamel, M. Hauser, and

   187     G. Rizzolatti, Eds. Cambridge, MA: MIT Press, 2005.

   188 

   189 

   190 

   191 

   192 heee~eeyyyy kids, time to get eagle'd!!!!

   193 

   194 

   195 

   196 

   197 

   198 * Ullman 

   199 

   200 Actual code reuse!

   201 

   202 precision = fraction of retrieved instances that are relevant

   203   (true-positives/(true-positives+false-positives))

   204 

   205 recall    =  fraction of relevant instances that are retrieved

   206   (true-positives/total-in-class)
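
The two definitions above translate directly into code. A minimal sketch, assuming predictions and ground truth arrive as parallel lists of booleans; the function name `precision_recall` is made up for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for parallel sequences of booleans.

    predicted[i] is True if instance i was retrieved;
    actual[i] is True if instance i is really in the class.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # missed instances
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0     # tp + fn = total-in-class
    return precision, recall

predicted = [True, True, False, True, False]
actual = [True, False, True, True, False]
precision, recall = precision_recall(predicted, actual)
```

Here two of the three retrieved instances are relevant, and two of the three relevant instances are retrieved, so both values are 2/3.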

   207 

   208 cross-validation = repeatedly train the model on one part of the data

   209 and test it on the held-out part, to detect overfitting.
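
One common form of cross-validation is the k-fold split, sketched below. This only generates the index splits; the model and the scoring function are left out, and the name `k_fold_splits` is an assumption of this example.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Every item lands in exactly one test fold; fold sizes differ by at most 1.
    """
    indices = list(range(n_items))
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

splits = list(k_fold_splits(10, 3))
```

Training on each `train` part and scoring on the matching `test` part gives k scores; a model that does much better on its training data than on these held-out folds is probably overfitting.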

   210 

   211 nifty, relevant, realistic ideas

   212 He doesn't confine himself to implausible assumptions

   213 

   214 ** Our Reading

   215 *** 2002 Visual features of intermediate complexity and their use in classification

   216 

   217     

   218 

   219 

   220 ** Getting around the dumb "fixed training set" methods

   221 

   222 *** 2006 Learning to classify by ongoing feature selection

   223     

   224     Brings in the most informative features of a class, based on

   225     mutual information between that feature and all the examples

   226     encountered so far. To bound the running time, he uses only a

   227     fixed number of the most recent examples. He uses a replacement

   228     strategy to tell whether a new feature is better than one of the

   229     current features.
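
A rough sketch of the replacement idea, under stated assumptions: features are binary detectors, `responses` holds each feature's 0/1 outputs over the window of recent examples, and a candidate replaces the weakest pooled feature only if its mutual information with the class labels is higher. The names `mutual_information` and `maybe_replace` are hypothetical, and the paper's actual criterion may differ.

```python
import math
from collections import Counter

def mutual_information(feature_values, labels):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    f_counts = Counter(feature_values)
    c_counts = Counter(labels)
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        # p(f,c) / (p(f) p(c)) simplifies to count * n / (f_count * c_count)
        mi += p_fc * math.log2(p_fc * n * n / (f_counts[f] * c_counts[c]))
    return mi

def maybe_replace(pool, candidate, responses, labels):
    """Swap the candidate in for the weakest pooled feature if it is
    more informative about the labels over the recent-example window."""
    def score(name):
        return mutual_information(responses[name], labels)
    weakest = min(pool, key=score)
    if score(candidate) > score(weakest):
        return [candidate if name == weakest else name for name in pool]
    return pool

labels = [1, 1, 0, 0]
responses = {"a": [1, 1, 0, 0], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]}
new_pool = maybe_replace(["a", "b"], "c", responses, labels)
```

Feature "b" carries no information about the labels, so the partly informative candidate "c" takes its place while the perfect feature "a" stays.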

   230 

   231 *** 2009 Learning model complexity in an online environment

   232     

   233     Sort of like the hierarchical Bayesian models of Tenenbaum, this

   234     system makes the model more and more complicated as it gets more

   235     and more training data. It does this by using two systems in

   236     parallel and then whenever the more complex one seems to be

   237     needed by the data, the less complex one is thrown out, and an

   238     even more complex model is initialized in its place.

   239 

   240     He uses an SVM with polynomial kernels of varying complexity. He

   241     gets good performance on a handwriting classification task using a large

   242     range of training samples, since his model changes complexity

   243     depending on the number of training samples. The simpler models do

   244     better with few training points, and the more complex ones do

   245     better with many training points.

   246 

   247     The final model had intermediate complexity between published

   248     extremes. 

   249 

   250     The more complex models must be able to be initialized efficiently

   251     from the less complex models which they replace!

   252 

   253 

   254 ** Non Parametric Models

   255 

   256 *** 2010 The chains model for detecting parts by their context

   257 

   258     Like the constellation method for rigid objects, but extended to

   259     non-rigid objects as well.

   260 

   261     Allows you to build a hand detector from a face detector. This is

   262     useful because hands might be only a few pixels, and very

   263     ambiguous in an image, but if you are expecting them at the end of

   264     an arm, then they become easier to find.

   265 

   266     They make chains by using spatial proximity of features. That way,

   267     a hand can be identified by chaining back from the head. If there

   268     is a good chain to the head, then it is more likely that there is

   269     a hand than if there isn't. Since there is some give in the

   270     proximity detection, the system can accommodate new poses that it

   271     has never seen before.

   272 

   273     Does not use any motion information.

   274 

   275 *** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

   276     

   277     (relative dynamic programming [RDP])

   278 

   279     Goal is to match images, as in SIFT, but this time the images can

   280     be subject to non rigid transformations. They do this by finding

   281     small patches that look the same, then building up bigger

   282     patches. They get a tree of patches that describes each image, and

   283     find the edit distance between each tree. Editing operations

   284     involve a coherent shift of features, so they can accommodate local

   285     shifts of patches in any direction. They get some cool results

   286     over just straight correlation. Basically, they made an image

   287     comparator that is resistant to multiple independent deformations.

   288     

   289     !important small regions are treated the same as nonimportant

   290      small regions

   291      

   292     !no conception of shape

   293     

   294     quote:

   295     The dynamic programming procedure looks for an optimal

   296     transformation that aligns the patches of both images. This

   297     transformation is not a global transformation, but a composition

   298     of many local transformations of sub-patches at various sizes,

   299     performed one on top of the other.

   300 

   301 *** 2006 Satellite Features for the Classification of Visually Similar Classes

   302     

   303     Finds features that can distinguish subclasses of a class, by

   304     first finding a rigid set of anchor features that are common to

   305     both subclasses, then finding distinguishing features relative to

   306     those anchor features. They keep things rigid because the satellite

   307     features don't have much information in and of themselves, and are

   308     only informative relative to other features.

   309 

   310 *** 2005 Learning a novel class from a single example by cross-generalization.

   311 

   312     Lets you use a vast visual experience to generate a classifier

   313     for a novel class by generating synthetic examples by replacing

   314     features from the single example with features from similar

   315     classes.

   316 

   317     quote: feature F is likely to be useful for class C if a similar

   318     feature F proved effective for a similar class C in the past.

   319 

   320     Allows you to transfer the "gestalt" of a similar class to a new

   321     class, by adapting all the features of the learned class that have

   322     a correspondence in the new class.

   323 

   324 *** 2007 Semantic Hierarchies for Recognizing Objects and Parts

   325 

   326     Better learning of complex objects like faces by learning each

   327     piece (like nose, mouth, eye, etc) separately, then making sure

   328     that the features are in plausible positions.
author	Robert McIntyre <rlm@mit.edu>
date	Mon, 12 May 2014 12:49:18 -0400