
     1 When I write my thesis, I want it to have links to every 

     2 

     3 

     4 

     5 * Object Recognition from Local Scale-Invariant Features, David G. Lowe

     6   

     7   This is the famous SIFT paper that is mentioned everywhere.

     8 

     9   This is a way to find objects in images given an image of that

    10   object. It is moderately resistant to variations in the sample image

    11   and the target image. Basically, this is a fancy way of picking out

    12   a test pattern embedded in a larger pattern. It would fail to learn

    13   anything resembling object categories, for instance. A useful concept

    14   is the idea of storing the local scale and rotation of each feature

    15   as it is extracted from the image, then checking to make sure that

    16   proposed matches all more-or-less agree on shift, rotation, scale,

    17   etc.  Another good idea is to use points instead of edges, since

    18   they seem more robust.
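The "do the matches agree on shift, rotation, scale" check can be sketched as a coarse vote: put each proposed match into a bin by its pose difference and keep the fullest bin. This is only a sketch under assumptions. The match fields (`drot`, `scale`, `dx`, `dy`) and the bin widths are made up for illustration, not taken from the paper.

```python
import math
from collections import defaultdict

def consistent_matches(matches, rot_bin=30.0, scale_base=2.0, shift_bin=32.0):
    """Keep the largest group of matches that agree on a coarse pose.

    Each match is a dict with hypothetical keys:
      'drot'     - rotation difference in degrees (target minus model)
      'scale'    - scale ratio (target / model), must be > 0
      'dx', 'dy' - shift in pixels
    Bin widths are illustrative assumptions.
    """
    bins = defaultdict(list)
    for m in matches:
        key = (
            int((m['drot'] % 360.0) // rot_bin),                # rotation bin
            int(math.floor(math.log(m['scale'], scale_base))),  # log-scale bin
            int(math.floor(m['dx'] / shift_bin)),               # x-shift bin
            int(math.floor(m['dy'] / shift_bin)),               # y-shift bin
        )
        bins[key].append(m)
    if not bins:
        return []
    # The fullest bin holds the matches that more-or-less agree with each other.
    return max(bins.values(), key=len)
```

Hard bin edges can split a group that straddles a boundary, so a real system would probably also vote into neighbouring bins.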

    19 

    20 ** References:

    21  - Basri, Ronen, and David W. Jacobs, “Recognition using region

    22   correspondences,” International Journal of Computer Vision, 25, 2

    23   (1996), pp. 141–162.

    24   

    25  - Edelman, Shimon, Nathan Intrator, and Tomaso Poggio, “Complex

    26   cells and object recognition,” Unpublished Manuscript, preprint at

    27   http://www.ai.mit.edu/edelman/mirror/nips97.ps.Z

    28   

    29  - Lindeberg, Tony, “Detecting salient blob-like image structures

    30   and their scales with a scale-space primal sketch: a method for

    31   focus-of-attention,” International Journal of Computer Vision, 11, 3

    32   (1993), pp. 283–318.

    33   

    34  - Murase, Hiroshi, and Shree K. Nayar, “Visual learning and

    35   recognition of 3-D objects from appearance,” International Journal

    36   of Computer Vision, 14, 1 (1995), pp. 5–24.

    37 

    38  - Ohba, Kohtaro, and Katsushi Ikeuchi, “Detectability, uniqueness,

    39   and reliability of eigen windows for stable verification of

    40   partially occluded objects,” IEEE Trans. on Pattern Analysis and

    41   Machine Intelligence, 19, 9 (1997), pp. 1043–48.

    42 

    43  - Zhang, Z., R. Deriche, O. Faugeras, Q.T. Luong, “A robust

    44   technique for matching two uncalibrated images through the recovery

    45   of the unknown epipolar geometry,” Artificial Intelligence, 78

    46   (1995), pp. 87–119.

    47 

    48 

    49 

    50 

    51    

    52 * Alignment by Maximization of Mutual Information, Paul A. Viola

    53 

    54   PhD Thesis recommended by Winston. Describes a system that is able

    55   to align a 3D computer model of an object with an image of that

    56   object. 
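To make the quantity being maximized concrete, here is a minimal sketch of mutual information between two equal-length sequences of quantized intensities (say, rendered model pixels and image pixels at the same locations). It swaps in a plain joint histogram, which is simpler than the density estimates the thesis uses; the function name and the quantization are assumptions.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # joint histogram of (x, y) pairs
    px = Counter(xs)              # marginal histogram of x
    py = Counter(ys)              # marginal histogram of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi
```

An alignment loop would change the pose, resample the model intensities, and keep the pose that gives the larger value.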

    57   

    58   - Pages 9-19 are a very adequate intro to the algorithm.

    59   

    60   - Has a useful section on entropy and probability at the beginning

    61     which is worth reading, especially the part about entropy.

    62 

    63   - Differential entropy seems a bit odd -- you would think that it

    64     should be the same as normal entropy for a discrete distribution

    65     embedded in continuous space. How do you measure the entropy of a

    66     half continuous, half discrete random variable? Perhaps the

    67     problem is related to the delta function, and not the definition

    68     of differential entropy?

    69 

    70   - Expectation Maximization (Mixture of Gaussians cool stuff)

    71     (Dempster 1977)

    72 

    73   - Good introduction to Parzen Window Density Estimation. Parzen

    74     density functions trade construction time for evaluation

    75     time (pg. 41). They are a way to transform a sample into a

    76     distribution. They don't work very well in higher dimensions due

    77     to the thinning of sample points.

    78 

    79   - Calculating the entropy of a Markov Model (or state machine,

    80     program, etc) seems like it would be very hard, since each trial

    81     would not be independent of the other trials. Yet, there are many

    82     common sense models that do need to have state to accurately model

    83     the world.

    84 

    85   - "... there is no direct procedure for evaluating entropy from a

    86     sample. A common approach is to model the density from the sample,

    87     and then estimate the entropy from the density."

    88 

    89   - pg. 55 he says that infinity minus infinity is zero lol.

    90 

    91   - great idea on pg 62 about using random samples from images to

    92     speed up computation.

    93 

    94   - practical way of terminating a random search: "A better idea is to

    95     reduce the learning rate until the parameters have a reasonable

    96     variance and then take the average parameters."

    97 

    98   - p. 65 bullshit hack to make his parzen window estimates work.

    99 

   100   - this alignment only works if the initial pose is not very far

   101     off. 
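On the differential entropy puzzle in the list above: a standard textbook relation (not taken from the thesis) suggests the delta function really is the culprit. Quantize a continuous variable X with density f into bins of width Δ, with one representative point x_i per bin, and compare the ordinary entropy of the quantized variable with the differential entropy:

```latex
\begin{align*}
h(X) &= -\int f(x)\,\log f(x)\,dx \\
H(X^{\Delta}) &= -\sum_i f(x_i)\,\Delta\,\log\bigl(f(x_i)\,\Delta\bigr)
  \;\approx\; h(X) - \log\Delta
\end{align*}
```

As Δ goes to 0 the discrete entropy grows without bound, so the two notions differ by a term that diverges. Read the other way, a point mass (a delta function in the density) drives the differential entropy to minus infinity, so a discrete distribution embedded in continuous space has no finite differential entropy.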

   102 

   103 

   104   Occlusion? Seems a bit holistic.
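A minimal 1-D sketch of the Parzen window idea from the notes above, and of the "model the density from the sample, then estimate the entropy from the density" approach. Gaussian windows, the fixed width `sigma`, and the use of a second sample for evaluation are assumptions made for illustration.

```python
import math

def parzen_density(sample, sigma):
    """Return p(x), a Gaussian Parzen window estimate built from a 1-D sample.

    Construction is essentially free (the sample is just stored); each
    evaluation costs one pass over the whole sample.
    """
    if not sample or sigma <= 0:
        raise ValueError("need a non-empty sample and sigma > 0")
    norm = 1.0 / (len(sample) * sigma * math.sqrt(2.0 * math.pi))

    def p(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in sample)

    return p

def entropy_estimate(density_sample, eval_sample, sigma):
    """Estimate entropy in nats as the mean of -log p(x) over eval_sample."""
    p = parzen_density(density_sample, sigma)
    return -sum(math.log(p(x)) for x in eval_sample) / len(eval_sample)
```

Evaluating on a random subset instead of every point is where the speed-up from random sampling would come in.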

   105 

   106 ** References

   107  - "excellent" book on entropy (Cover & Thomas, 1991) [Elements of

   108    Information Theory.] 

   109 

   110  - Canny, J. (1986). A Computational Approach to Edge Detection. IEEE

   111    Transactions PAMI, PAMI-8(6):679-698.

   112 

   113  - Chin, R. and Dyer, C. (1986). Model-Based Recognition in Robot

   114    Vision. Computing Surveys, 18:67-108.

   115 

   116  - Grimson, W., Lozano-Perez, T., Wells, W., et al. (1994). An

   117    Automatic Registration Method for Frameless Stereotaxy, Image

   118    Guided Surgery, and Enhanced Reality Visualization. In Proceedings

   119    of the Computer Society Conference on Computer Vision and Pattern

   120    Recognition, Seattle, WA. IEEE.

   121 

   122  - Hill, D. L., Studholme, C., and Hawkes, D. J. (1994). Voxel

   123    Similarity Measures for Automated Image Registration. In

   124    Proceedings of the Third Conference on Visualization in Biomedical

   125    Computing, pages 205-216. SPIE.

   126 

   127  - Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by Simulated

   128    Annealing. Science, 220(4598):671-680.

   129 

   130  - Jones, M. and Poggio, T. (1995). Model-based matching of line

   131    drawings by linear combinations of prototypes. Proceedings of the

   132    International Conference on Computer Vision

   133 

   134  - Ljung, L. and Soderstrom, T. (1983). Theory and Practice of

   135    Recursive Identification. MIT Press.

   136 

   137  - Shannon, C. E. (1948). A mathematical theory of communication. Bell

   138    Systems Technical Journal, 27:379-423 and 623-656.

   139 

   140  - Shashua, A. (1992). Geometry and Photometry in 3D Visual

   141    Recognition. PhD thesis, M.I.T Artificial Intelligence Laboratory,

   142    AI-TR-1401.

   143 

   144  - Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,

   145    W. T. (1992). Numerical Recipes in C: The Art of Scientific

   146    Computing. Cambridge University Press, Cambridge, England, second

   147    edition.

   148 

   149 * Semi-Automated Dialogue Act Classification for Situated Social Agents in Games, Deb Roy 

   150   

   151   Interesting attempt to learn "social scripts" related to restaurant

   152   behaviour. The authors do this by creating a game which implements a

   153   virtual restaurant, and recording actual human players as they

   154   interact with the game. They learn scripts from annotated

   155   interactions and then use those scripts to label other

   156   interactions. They don't get very good results, but their

   157   methodology of creating a virtual world and recording

   158   low-dimensional actions is interesting.

   159 

   160   - Torque 2D/3D looks like an interesting game engine.

   161 

   162 

   163 * Face Recognition by Humans: Nineteen Results all Computer Vision Researchers should know, Sinha

   164   

   165   This is a summary of a lot of bio experiments on human face

   166   recognition.

   167   

   168   - They assert again that the internal gradients/structures of a face

   169     are more important than the edges.

   170 

   171   - It's amazing to me that it takes about 10 years after birth for a

   172     human to get advanced adult-like face detection. They go through

   173     feature based processing to a holistic based approach during this

   174     time.

   175 

   176   - Finally, color is a very important cue for identifying faces.

   177 

   178 ** References

   179   - A. Freire, K. Lee, and L. A. Symons, “The face-inversion effect as

   180     a deficit in the encoding of configural information: Direct

   181     evidence,” Perception, vol. 29, no. 2, pp. 159–170, 2000.

   182   - M. B. Lewis, “Thatcher’s children: Development and the Thatcher

   183     illusion,” Perception, vol. 32, pp. 1415–21, 2003.

   184   - E. McKone and N. Kanwisher, “Does the human brain process objects

   185     of expertise like faces? A review of the evidence,” in From Monkey

   186     Brain to Human Brain, S. Dehaene, J. R. Duhamel, M. Hauser, and

   187     G. Rizzolatti, Eds. Cambridge, MA: MIT Press, 2005.

   188 

   189 

   190 

   191 

   192 heee~eeyyyy kids, time to get eagle'd!!!!

   193 

   194 

   195 

   196 

   197 

   198 * Ullman 

   199 

   200 Actual code reuse!

   201 

   202 precision = fraction of retrieved instances that are relevant

   203   (true-positives/(true-positives+false-positives))

   204 

   205 recall    =  fraction of relevant instances that are retrieved

   206   (true-positives/total-in-class)
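The two definitions above as a short sketch, treating the retrieved and relevant instances as sets of ids. The function name and the choice to return 0.0 for an empty denominator are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of instance ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # precision: fraction of retrieved instances that are relevant
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # recall: fraction of relevant instances that are retrieved
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, retrieving ids 1, 2, 3, 4 when ids 2, 4, 6 are relevant gives two true positives out of four retrieved and out of three relevant.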

   207 

   208 cross-validation = train the model on one part of the data and test

   209 it on the held-out part, rotating the parts, to detect overfitting.
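A minimal sketch of the k-fold flavour of cross-validation: the data is cut into k parts, and each part takes one turn as the held-out test set while the rest is used for training. The function name and the unshuffled, contiguous folds are assumptions.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for k roughly equal contiguous folds."""
    if not 2 <= k <= n_items:
        raise ValueError("need 2 <= k <= n_items")
    indices = list(range(n_items))
    # The first (n_items % k) folds get one extra item.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

A model that scores much better on its training folds than on the held-out folds is probably overfitting.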

   210 

   211 nifty, relevant, realistic ideas

   212 He doesn't confine himself to implausible assumptions

   213 

   214 

   215 

   216 

   217 ** Getting around the dumb "fixed training set" methods

   218 

   219 *** 2006 Learning to classify by ongoing feature selection

   220     

   221     Brings in the most informative features of a class, based on

   222     mutual information between that feature and all the examples

   223     encountered so far. To bound the running time, he uses only a

   224     fixed number of the most recent examples. He uses a replacement

   225     strategy to tell whether a new feature is better than one of the

   226     current features.
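A sketch of how such an ongoing selector might look, assuming binary features and discrete labels. The class name, the sliding window held in a `deque`, and the "swap out the weakest feature if the newcomer scores higher" rule are illustrative assumptions; the notes do not spell out the paper's actual replacement strategy.

```python
import math
from collections import Counter, deque

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

class OngoingSelector:
    """Hypothetical sketch: keep the `capacity` most informative features."""

    def __init__(self, capacity, window):
        self.capacity = capacity
        self.examples = deque(maxlen=window)  # only the most recent examples
        self.features = []                    # callables: example -> 0 or 1

    def observe(self, example, label):
        self.examples.append((example, label))

    def score(self, feature):
        """Mutual information between the feature and the labels seen so far."""
        if not self.examples:
            return 0.0
        xs = [feature(e) for e, _ in self.examples]
        ys = [label for _, label in self.examples]
        return mutual_information(xs, ys)

    def consider(self, candidate):
        """Add the candidate, or swap it in for the weakest current feature."""
        if len(self.features) < self.capacity:
            self.features.append(candidate)
            return True
        weakest = min(self.features, key=self.score)
        if self.score(candidate) > self.score(weakest):
            self.features[self.features.index(weakest)] = candidate
            return True
        return False
```

The bounded window is what keeps the cost of each `score` call fixed no matter how many examples have gone by.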

   227 

   228 *** 2009 Learning model complexity in an online environment

   229     

   230     Sort of like the hierarchical Bayesian models of Tenenbaum, this

   231     system makes the model more and more complicated as it gets more

   232     and more training data. It does this by using two systems in

   233     parallel and then whenever the more complex one seems to be

   234     needed by the data, the less complex one is thrown out, and an

   235     even more complex model is initialized in its place.

   236 

   237     He uses an SVM with polynomial kernels of varying complexity. He

   238     gets good performance on a handwriting classification task using a large

   239     range of training samples, since his model changes complexity

   240     depending on the number of training samples. The simpler models do

   241     better with few training points, and the more complex ones do

   242     better with many training points.

   243 

   244     The final model had intermediate complexity between published

   245     extremes. 

   246 

   247     The more complex models must be able to be initialized efficiently

   248     from the less complex models which they replace!
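The two-models-in-parallel scheme might be sketched like this. Everything here is an assumption for illustration: the model interface (`error`, `update`), the `grow` factory that builds a more complex model from an existing one, and the promotion rule, which just compares errors on a single batch.

```python
class ComplexityLadder:
    """Hypothetical sketch: run a simple and a more complex model side by side."""

    def __init__(self, simple, complex_model, grow):
        self.simple = simple
        self.complex = complex_model
        # grow(model) returns a more complex model initialized from `model`
        self.grow = grow

    def step(self, batch):
        """Score both models on a new batch, train both, maybe climb one rung.

        Returns True if the simpler model was discarded.
        """
        err_simple = self.simple.error(batch)    # measured before training
        err_complex = self.complex.error(batch)
        self.simple.update(batch)
        self.complex.update(batch)
        if err_complex < err_simple:
            # The data seems to need the extra complexity: drop the simpler
            # model and start an even more complex one from the survivor.
            self.simple = self.complex
            self.complex = self.grow(self.complex)
            return True
        return False
```

The `grow` step is where the efficient-initialization requirement bites: if the new model had to be trained from scratch on all past data, the online setting would fall apart.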

   249 

   250 

   251 ** Non Parametric Models

   252 

   253 *** 2002 Visual features of intermediate complexity and their use in classification

   254 

   255     

   256 

   257 *** 2010 The chains model for detecting parts by their context

   258 

   259     Like the constellation method for rigid objects, but extended to

   260     non-rigid objects as well.

   261 

   262     Allows you to build a hand detector from a face detector. This is

   263     useful because hands might be only a few pixels, and very

   264     ambiguous in an image, but if you are expecting them at the end of

   265     an arm, then they become easier to find.

   266 

   267     They make chains by using spatial proximity of features. That way,

   268     a hand can be identified by chaining back from the head. If there

   269     is a good chain to the head, then it is more likely that there is

   270     a hand than if there isn't. Since there is some give in the

   271     proximity detection, the system can accommodate new poses that it

   272     has never seen before.
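The proximity chaining can be illustrated with a toy search: treat each detected feature as a point, allow a hop between two features when they are close enough, and look for a chain from the head to a candidate hand. This is only a sketch of the chaining idea. The actual model is presumably probabilistic and uses appearance as well as position; the function name and the `max_hop` threshold are assumptions.

```python
import math
from collections import deque

def find_chain(points, start, goal, max_hop):
    """Shortest chain of features from start to goal with hops <= max_hop.

    points: list of (x, y) feature locations; start, goal: indices into it.
    Returns the list of indices along the chain, or None if no chain exists.
    """
    prev = {start: None}      # also serves as the visited set
    queue = deque([start])
    while queue:
        i = queue.popleft()
        if i == goal:
            chain = []
            while i is not None:
                chain.append(i)
                i = prev[i]
            return chain[::-1]
        for j, q in enumerate(points):
            if j not in prev and math.dist(points[i], q) <= max_hop:
                prev[j] = i
                queue.append(j)
    return None
```

The slack in `max_hop` plays the role of the "give" in the proximity detection: an arm bent in a new way still yields a chain as long as neighbouring features stay close.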

   273 

   274     Does not use any motion information.

   275 

   276 *** 2005 A Hierarchical Non-Parametric Method for Capturing Non-Rigid Deformations

   277     

   278     (relative dynamic programming [RDP])

   279 

   280     Goal is to match images, as in SIFT, but this time the images can

   281     be subject to non rigid transformations. They do this by finding

   282     small patches that look the same, then building up bigger

   283     patches. They get a tree of patches that describes each image, and

   284     find the edit distance between each tree. Editing operations

   285     involve a coherent shift of features, so they can accommodate local

   286     shifts of patches in any direction. They get some cool results

   287     over just straight correlation. Basically, they made an image

   288     comparator that is resistant to multiple independent deformations.

   289     

   290     !important small regions are treated the same as nonimportant

   291      small regions

   292      

   293     !no conception of shape

   294     

   295     quote:

   296     The dynamic programming procedure looks for an optimal

   297     transformation that aligns the patches of both images. This

   298     transformation is not a global transformation, but a composition

   299     of many local transformations of sub-patches at various sizes,

   300     performed one on top of the other.

   301 

   302 *** 2006 Satellite Features for the Classification of Visually Similar Classes

   303     

   304     Finds features that can distinguish subclasses of a class, by

   305     first finding a rigid set of anchor features that are common to

   306     both subclasses, then finding distinguishing features relative to

   307     those anchors. They keep things rigid because the satellite

   308     features don't have much information in and of themselves, and are

   309     only informative relative to other features.

   310 

   311 *** 2005 Learning a novel class from a single example by cross-generalization.

   312 

   313     Lets you use a vast visual experience to generate a classifier

   314     for a novel class by generating synthetic examples by replacing

   315     features from the single example with features from similar

   316     classes.

   317 

   318     quote: feature F is likely to be useful for class C if a similar

   319     feature F proved effective for a similar class C in the past.

   320 

   321     Allows you to transfer the "gestalt" of a similar class to a new

   322     class, by adapting all the features of the learned class that have

   323     correspondence to the new class.
author	Robert McIntyre <rlm@mit.edu>
date	Thu, 11 Apr 2013 06:19:59 +0000
parents	057d47fc4789
children	8e62bf52be59