
     1 When I write my thesis, I want it to have links to every 

     2 

     3 

     4 

     5 * Object Recognition from Local Scale-Invariant Features, David G. Lowe

     6   

     7   This is the famous SIFT paper that is mentioned everywhere.

     8 

     9   This is a way to find objects in images given an image of that

    10   object. It is moderately resistant to variations in the sample image

    11   and the target image. Basically, this is a fancy way of picking out

    12   a test pattern embedded in a larger pattern. It would fail to learn

    13   anything resembling object categories, for instance. A useful concept

    14   is the idea of storing the local scale and rotation of each feature

    15   as it is extracted from the image, then checking to make sure that

    16   proposed matches all more-or-less agree on shift, rotation, scale,

    17   etc.  Another good idea is to use points instead of edges, since

    18   they seem more robust.
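
A minimal sketch of the "proposed matches must agree on shift, rotation, and scale" idea: each match votes for a coarse pose bin, and only the matches in the most popular bin are kept. The feature layout `(x, y, scale, angle_deg)`, the bin widths, and the function names are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

def apply_pose(feature, angle_deg, scale, shift):
    """Map a model feature (x, y, scale, angle_deg) into the image under a similarity pose."""
    x, y, s, a = feature
    theta = math.radians(angle_deg)
    return (
        scale * (x * math.cos(theta) - y * math.sin(theta)) + shift[0],
        scale * (x * math.sin(theta) + y * math.cos(theta)) + shift[1],
        s * scale,
        (a + angle_deg) % 360.0,
    )

def pose_bin(model_f, image_f, angle_bin=30.0, scale_base=2.0, shift_bin=32.0):
    """Coarse (rotation, scale, shift_x, shift_y) bin implied by a single match."""
    mx, my, ms, ma = model_f
    ix, iy, i_s, ia = image_f
    d_angle = (ia - ma) % 360.0          # rotation implied by the stored orientations
    ratio = i_s / ms                     # scale change implied by the stored scales
    theta = math.radians(d_angle)
    # Where the model point would land under that rotation and scale alone.
    rx = ratio * (mx * math.cos(theta) - my * math.sin(theta))
    ry = ratio * (mx * math.sin(theta) + my * math.cos(theta))
    return (
        int(d_angle // angle_bin),
        round(math.log(ratio, scale_base)),
        math.floor((ix - rx) / shift_bin),   # remaining offset is the shift
        math.floor((iy - ry) / shift_bin),
    )

def consistent_matches(matches, **bins):
    """Keep only the matches that fall in the most popular pose bin."""
    if not matches:
        return []
    keyed = [(pose_bin(m, i, **bins), (m, i)) for m, i in matches]
    best, _ = Counter(k for k, _ in keyed).most_common(1)[0]
    return [pair for k, pair in keyed if k == best]
```

Hard bin edges can split a true cluster across neighbouring bins, so a real system would likely vote into adjacent bins as well; this sketch skips that.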

    19 

    20 ** References:

    21  - Basri, Ronen, and David W. Jacobs, “Recognition using region

    22   correspondences,” International Journal of Computer Vision, 25, 2

    23   (1996), pp. 141–162.

    24   

    25  - Edelman, Shimon, Nathan Intrator, and Tomaso Poggio, “Complex

    26   cells and object recognition,” Unpublished Manuscript, preprint at

    27   http://www.ai.mit.edu/edelman/mirror/nips97.ps.Z

    28   

    29  - Lindeberg, Tony, “Detecting salient blob-like image structures

    30   and their scales with a scale-space primal sketch: a method for

    31   focus-of-attention,” International Journal of Computer Vision, 11, 3

    32   (1993), pp. 283–318.

    33   

    34  - Murase, Hiroshi, and Shree K. Nayar, “Visual learning and

    35   recognition of 3-D objects from appearance,” International Journal

    36   of Computer Vision, 14, 1 (1995), pp. 5–24.

    37 

    38  - Ohba, Kohtaro, and Katsushi Ikeuchi, “Detectability, uniqueness,

    39   and reliability of eigen windows for stable verification of

    40   partially occluded objects,” IEEE Trans. on Pattern Analysis and

    41   Machine Intelligence, 19, 9 (1997), pp. 1043–48.

    42 

    43  - Zhang, Z., R. Deriche, O. Faugeras, Q.T. Luong, “A robust

    44   technique for matching two uncalibrated images through the recovery

    45   of the unknown epipolar geometry,” Artificial Intelligence, 78,

    46   (1995), pp. 87-119.

    47 

    48 

    49 

    50 

    51    

    52 * Alignment by Maximization of Mutual Information, Paul A. Viola

    53 

    54   PhD Thesis recommended by Winston. Describes a system that is able

    55   to align a 3D computer model of an object with an image of that

    56   object. 

    57   

    58   - Pages 9-19 are a very adequate intro to the algorithm.

    59   

    60   - Has a useful section on entropy and probability at the beginning

    61     which is worth reading, especially the part about entropy.

    62 

    63   - Differential entropy seems a bit odd -- you would think that it

    64     should be the same as normal entropy for a discrete distribution

    65     embedded in continuous space. How do you measure the entropy of a

    66     half continuous, half discrete random variable? Perhaps the

    67     problem is related to the delta function, and not the definition

    68     of differential entropy?

    69 

    70   - Expectation Maximization (Mixture of Gaussians cool stuff)

    71     (Dempster 1977)

    72 

    73   - Good introduction to Parzen Window Density Estimation. Parzen

    74     density functions trade construction time for evaluation

    75     time (pg. 41). They are a way to transform a sample into a

    76     distribution. They don't work very well in higher dimensions due

    77     to the thinning of sample points.

    78 

    79   - Calculating the entropy of a Markov Model (or state machine,

    80     program, etc) seems like it would be very hard, since each trial

    81     would not be independent of the other trials. Yet, there are many

    82     common sense models that do need to have state to accurately model

    83     the world.

    84 

    85   - "... there is no direct procedure for evaluating entropy from a

    86     sample. A common approach is to model the density from the sample,

    87     and then estimate the entropy from the density."

    88 

    89   - pg. 55 he says that infinity minus infinity is zero lol.

    90 

    91   - great idea on pg 62 about using random samples from images to

    92     speed up computation.

    93 

    94   - practical way of terminating a random search: "A better idea is to

    95     reduce the learning rate until the parameters have a reasonable

    96     variance and then take the average parameters."

    97 

    98   - p. 65 bullshit hack to make his Parzen window estimates work.

    99 

   100   - this alignment only works if the initial pose is not very far

   101     off. 
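
A minimal sketch of the two Parzen-related notes in the list above: turning a sample into a density, and then estimating entropy from that density instead of from the sample directly. It assumes a 1-D Gaussian window of fixed width `sigma` and uses one sample to build the density and a second one to average the log density; the function names and that two-sample split are illustrative choices here, not a claim about the exact procedure in the thesis.

```python
import math
import random

def parzen_density(x, sample, sigma):
    """Parzen estimate p(x) = (1/N) * sum_i Gaussian(x - x_i; sigma)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - xi) / sigma) ** 2) for xi in sample)
    return norm * total / len(sample)

def entropy_estimate(sample_a, sample_b, sigma):
    """Entropy in nats: average of -log p_A(x) over the points x in sample_b."""
    total = 0.0
    for x in sample_b:
        # Guard against log(0) for points very far from every window centre.
        total -= math.log(max(parzen_density(x, sample_a, sigma), 1e-300))
    return total / len(sample_b)

if __name__ == "__main__":
    random.seed(0)
    a = [random.gauss(0.0, 1.0) for _ in range(500)]
    b = [random.gauss(0.0, 1.0) for _ in range(500)]
    print("estimate:", entropy_estimate(a, b, 0.3))
    print("exact   :", 0.5 * math.log(2 * math.pi * math.e))
```

Construction is free (the sample is the model), but every evaluation touches all N points, which is the construction-versus-evaluation trade noted above.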

   102 

   103 

   104   Occlusion? Seems a bit holistic.
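
On the differential entropy question in the list above, the standard relations (as I understand them from Cover & Thomas) may help. The box density f_epsilon is my own illustrative stand-in for a delta function, and it assumes the atoms x_i are more than epsilon apart.

```latex
\begin{align}
h(X) &= -\int f(x)\,\log f(x)\,dx, \\
H(X^{\Delta}) &\approx h(X) - \log \Delta
  \quad \text{($X$ quantized into bins of width $\Delta$)}, \\
f_{\epsilon}(x) &= \sum_i \frac{p_i}{\epsilon}\,
  \mathbf{1}\!\left[\,|x - x_i| < \tfrac{\epsilon}{2}\,\right]
  \;\Longrightarrow\;
  h(f_{\epsilon}) = H(p) + \log \epsilon \;\to\; -\infty
  \quad \text{as } \epsilon \to 0 .
\end{align}
```

So differential entropy is discrete entropy with a diverging log-bin-width term removed, and a point mass has differential entropy of minus infinity. That is consistent with the guess that the delta function, not the definition, is where things go wrong.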

   105 

   106 ** References

   107  - "excellent" book on entropy (Cover & Thomas, 1991) [Elements of

   108    Information Theory.] 

   109 

   110  - Canny, J. (1986). A Computational Approach to Edge Detection. IEEE

   111    Transactions PAMI, PAMI-8(6):679-698.

   112 

   113  - Chin, R. and Dyer, C. (1986). Model-Based Recognition in Robot

   114    Vision. Computing Surveys, 18:67-108.

   115 

   116  - Grimson, W., Lozano-Perez, T., Wells, W., et al. (1994). An

   117    Automatic Registration Method for Frameless Stereotaxy, Image

   118    Guided Surgery, and Enhanced Reality Visualization. In Proceedings

   119    of the Computer Society Conference on Computer Vision and Pattern

   120    Recognition, Seattle, WA. IEEE.

   121 

   122  - Hill, D. L., Studholme, C., and Hawkes, D. J. (1994). Voxel

   123    Similarity Measures for Automated Image Registration. In

   124    Proceedings of the Third Conference on Visualization in Biomedical

   125    Computing, pages 205-216. SPIE.

   126 

   127  - Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by Simulated

   128    Annealing. Science, 220(4598):671-680.

   129 

   130  - Jones, M. and Poggio, T. (1995). Model-based matching of line

   131    drawings by linear combinations of prototypes. Proceedings of the

   132    International Conference on Computer Vision.

   133 

   134  - Ljung, L. and Soderstrom, T. (1983). Theory and Practice of

   135    Recursive Identification. MIT Press.

   136 

   137  - Shannon, C. E. (1948). A mathematical theory of communication. Bell

   138    System Technical Journal, 27:379-423 and 623-656.

   139 

   140  - Shashua, A. (1992). Geometry and Photometry in 3D Visual

   141    Recognition. PhD thesis, M.I.T Artificial Intelligence Laboratory,

   142    AI-TR-1401.

   143 

   144  - Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,

   145    W. T. (1992). Numerical Recipes in C: The Art of Scientific

   146    Computing. Cambridge University Press, Cambridge, England, second

   147    edition.

   148 

   149 * Semi-Automated Dialogue Act Classification for Situated Social Agents in Games, Deb Roy 

   150   

   151   Interesting attempt to learn "social scripts" related to restaurant

   152   behaviour. The authors do this by creating a game which implements a

   153   virtual restaurant, and recording actual human players as they

   154   interact with the game. They learn scripts from annotated

   155   interactions and then use those scripts to label other

   156   interactions. They don't get very good results, but their

   157   methodology of creating a virtual world and recording

   158   low-dimensional actions is interesting.

   159 

   160   - Torque 2D/3D looks like an interesting game engine.

   161 

   162 

   163 * Face Recognition by Humans: Nineteen Results all Computer Vision Researchers should know, Sinha

   164   

   165   This is a summary of a lot of bio experiments on human face

   166   recognition.

   167   

   168   - They assert again that the internal gradients/structures of a face

   169     are more important than the edges.

   170 

   171   - It's amazing to me that it takes about 10 years after birth for a

   172     human to get advanced adult-like face detection. They go from

   173     feature-based processing to a holistic approach during this

   174     time.

   175 

   176   - Finally, color is a very important cue for identifying faces.

   177 

   178 ** References

   179   - A. Freire, K. Lee, and L. A. Symons, “The face-inversion effect as

   180     a deficit in the encoding of configural information: Direct

   181     evidence,” Perception, vol. 29, no. 2, pp. 159–170, 2000.

   182   - M. B. Lewis, “Thatcher’s children: Development and the Thatcher

   183     illusion,” Perception, vol. 32, pp. 1415–21, 2003.

   184   - E. McKone and N. Kanwisher, “Does the human brain process objects

   185     of expertise like faces? A review of the evidence,” in From Monkey

   186     Brain to Human Brain, S. Dehaene, J. R. Duhamel, M. Hauser, and

   187     G. Rizzolatti, Eds. Cambridge, MA: MIT Press, 2005.

   188 

   189 

   190 

   191 

   192 heee~eeyyyy kids, time to get eagle'd!!!!

   193 

   194 

   195 

   196 

   197 

   198 * Ullman 

   199 

   200 Actual code reuse!

   201 

   202 precision = fraction of retrieved instances that are relevant

   203   (true-positives/(true-positives+false-positives))

   204 

   205 recall    =  fraction of relevant instances that are retrieved

   206   (true-positives/total-in-class)
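
A small worked example of the two definitions above, treating the retrieved and relevant instances as sets of ids. The function name and the convention of returning 0.0 for an empty set are my own choices.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of instance ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # precision: how much of what we returned was right
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # recall: how much of what was right we returned
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

if __name__ == "__main__":
    # 4 retrieved, 5 relevant, 2 in common
    print(precision_recall([1, 2, 3, 4], [2, 4, 6, 8, 10]))
```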

   207 

   208 cross-validation = hold out part of the data, train on the rest, and

   209 test on the held-out part, rotating the split, to detect overfitting.
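
A minimal sketch of k-fold cross-validation as index bookkeeping: the data is cut into k disjoint folds and each fold takes one turn as the held-out test set. `k_fold_splits` is a hypothetical helper name; shuffling and the actual model training are left out.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) once for each of k folds."""
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

if __name__ == "__main__":
    for train, test in k_fold_splits(10, 3):
        print("train", train, "test", test)
```

A model that scores much better on its training indices than on the held-out ones is probably overfitting.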

   210 

   211 

   212 

   213 

   214 

   215 ** Getting around the dumb "fixed training set" methods

   216 

   217 *** 2006 Learning to classify by ongoing feature selection

   218     

   219     Brings in the most informative features of a class, based on

   220     mutual information between that feature and all the examples

   221     encountered so far. To bound the running time, he uses only a

   222     fixed number of the most recent examples. He uses a replacement

   223     strategy to tell whether a new feature is better than one of the

   224     current features.
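
A rough sketch of the loop described above, assuming binary features and binary labels: features are scored by mutual information over a fixed-size window of recent examples, and a proposed feature replaces the weakest current one only if it scores higher. The class name, the plain "beat the worst" replacement rule, and the window handling are simplifications for illustration, not the paper's actual criterion.

```python
import math
from collections import deque

def mutual_information(pairs):
    """Mutual information in bits between feature value f and label c, from (f, c) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint, pf, pc = {}, {}, {}
    for f, c in pairs:
        joint[(f, c)] = joint.get((f, c), 0) + 1
        pf[f] = pf.get(f, 0) + 1
        pc[c] = pc.get(c, 0) + 1
    return sum(
        (cnt / n) * math.log2(cnt * n / (pf[f] * pc[c]))
        for (f, c), cnt in joint.items()
    )

class OngoingSelector:
    """Keeps at most max_features features, scored on the most recent examples."""

    def __init__(self, max_features, window):
        self.max_features = max_features
        self.recent = deque(maxlen=window)  # old examples fall off automatically
        self.features = []                  # callables: example -> feature value

    def observe(self, example, label):
        self.recent.append((example, label))

    def score(self, feature):
        return mutual_information([(feature(x), y) for x, y in self.recent])

    def propose(self, feature):
        """Add the feature if there is room or if it beats the weakest current one."""
        if len(self.features) < self.max_features:
            self.features.append(feature)
            return True
        worst = min(self.features, key=self.score)
        if self.score(feature) > self.score(worst):
            self.features[self.features.index(worst)] = feature
            return True
        return False
```

The `deque(maxlen=window)` is what bounds the running time: scoring a feature never costs more than `window` evaluations.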

   225 

   226 *** 2009 Learning model complexity in an online environment

   227     

   228     Sort of like the hierarchical Bayesian models of Tenenbaum, this

   229     system makes the model more and more complicated as it gets more

   230     and more training data. It does this by using two systems in

   231     parallel and then whenever the more complex one seems to be

   232     needed by the data, the less complex one is thrown out, and an

   233     even more complex model is initialized in its place.

   234 

   235     He uses an SVM with polynomial kernels of varying complexity. He

   236     gets good performance on a handwriting classification task using a large

   237     range of training samples, since his model changes complexity

   238     depending on the number of training samples. The simpler models do

   239     better with few training points, and the more complex ones do

   240     better with many training points.

   241 

   242     The more complex models must be able to be initialized efficiently

   243     from the less complex models which they replace!
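
A toy sketch of the two-models-in-parallel scheme, to make the "initialize the complex model from the simple one" point concrete. It deliberately swaps the SVM with polynomial kernels for a piecewise-constant regressor on [0, 1] with 2**depth bins, because refining that model is trivial (split every bin). All names here, the fixed comparison window, and the test-then-train error are my assumptions, not the paper's method.

```python
class BinModel:
    """Piecewise-constant regressor on [0, 1] with 2**depth equal bins."""

    def __init__(self, depth, means=None, counts=None):
        self.depth = depth
        n = 2 ** depth
        self.means = means if means is not None else [0.0] * n
        self.counts = counts if counts is not None else [0] * n

    def _bin(self, x):
        return min(int(x * len(self.means)), len(self.means) - 1)

    def predict(self, x):
        return self.means[self._bin(x)]

    def update(self, x, y):
        b = self._bin(x)
        self.counts[b] += 1
        self.means[b] += (y - self.means[b]) / self.counts[b]  # running mean

    def refine(self):
        """Cheap start for the next model: split each bin, inherit its mean.

        An inherited mean counts as one pseudo-observation so it adapts quickly.
        """
        means = [m for m in self.means for _ in range(2)]
        counts = [min(c, 1) for c in self.counts for _ in range(2)]
        return BinModel(self.depth + 1, means, counts)


class GrowingLearner:
    """Runs a simple and a richer model side by side and promotes when the richer wins."""

    def __init__(self, window=200):
        self.simple = BinModel(0)
        self.richer = self.simple.refine()
        self.window = window
        self.err_simple = self.err_richer = 0.0
        self.seen = 0

    def observe(self, x, y):
        # Score each model on the point before it trains on it.
        self.err_simple += (self.simple.predict(x) - y) ** 2
        self.err_richer += (self.richer.predict(x) - y) ** 2
        self.simple.update(x, y)
        self.richer.update(x, y)
        self.seen += 1
        if self.seen == self.window:
            if self.err_richer < self.err_simple:
                # Drop the simple model; start an even richer one from the winner.
                self.simple, self.richer = self.richer, self.richer.refine()
            self.err_simple = self.err_richer = 0.0
            self.seen = 0
```

Feeding it a ramp (y = x) should make the depth climb as data arrives, while a constant target should leave it at depth 0, since the richer model never earns its promotion.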

   244 

   245 

   246 ** Non Parametric Models

   247 

   248 *** Visual features of intermediate complexity and their use in classification

   249 

   250 *** The chains model for detecting parts by their context

   251 

   252     Like the constellation method for rigid objects, but extended to

   253     non-rigid objects as well.

   254 

   255     Allows you to build a hand detector from a face detector. This is

   256     useful because hands might be only a few pixels, and very

   257     ambiguous in an image, but if you are expecting them at the end of

   258     an arm, then they become easier to find.

   259 

   260
author	Robert McIntyre <rlm@mit.edu>
date	Thu, 11 Apr 2013 05:40:23 +0000
parents	9c37a55e1cd2
children	80cd096682b2