view org/literature-review.org @ 376:057d47fc4789
reviewing ullman's stuff.
author | Robert McIntyre <rlm@mit.edu> |
---|---|
date | Thu, 11 Apr 2013 05:40:23 +0000 |
parents | 9c37a55e1cd2 |
children | 80cd096682b2 |
When I write my thesis, I want it to have links to every

* Object Recognition from Local Scale-Invariant Features, David G. Lowe

This is the famous SIFT paper that is mentioned everywhere.

This is a way to find objects in images given an image of that
object. It is moderately resistant to variations in the sample image
and the target image. Basically, this is a fancy way of picking out
a test pattern embedded in a larger pattern. It would fail to learn
anything resembling object categories, for instance. A useful concept
is the idea of storing the local scale and rotation of each feature
as it is extracted from the image, then checking to make sure that
proposed matches all more-or-less agree on shift, rotation, scale,
etc. Another good idea is to use points instead of edges, since
they seem more robust.

** References:
- Basri, Ronen, and David W. Jacobs, “Recognition using region
  correspondences,” International Journal of Computer Vision, 25, 2
  (1996), pp. 141–162.

- Edelman, Shimon, Nathan Intrator, and Tomaso Poggio, “Complex
  cells and object recognition,” Unpublished Manuscript, preprint at
  http://www.ai.mit.edu/edelman/mirror/nips97.ps.Z

- Lindeberg, Tony, “Detecting salient blob-like image structures
  and their scales with a scale-space primal sketch: a method for
  focus-of-attention,” International Journal of Computer Vision, 11, 3
  (1993), pp. 283–318.

- Murase, Hiroshi, and Shree K. Nayar, “Visual learning and
  recognition of 3-D objects from appearance,” International Journal
  of Computer Vision, 14, 1 (1995), pp. 5–24.

- Ohba, Kohtaro, and Katsushi Ikeuchi, “Detectability, uniqueness,
  and reliability of eigen windows for stable verification of
  partially occluded objects,” IEEE Trans. on Pattern Analysis and
  Machine Intelligence, 19, 9 (1997), pp. 1043–48.

- Zhang, Z., R. Deriche, O. Faugeras, Q.T.
  Luong, “A robust technique for matching two uncalibrated images
  through the recovery of the unknown epipolar geometry,” Artificial
  Intelligence, 78 (1995), pp. 87–119.

* Alignment by Maximization of Mutual Information, Paul A. Viola

PhD Thesis recommended by Winston. Describes a system that is able
to align a 3D computer model of an object with an image of that
object.

- Pages 9-19 are a very adequate intro to the algorithm.

- Has a useful section on entropy and probability at the beginning
  which is worth reading, especially the part about entropy.

- Differential entropy seems a bit odd -- you would think that it
  should be the same as normal entropy for a discrete distribution
  embedded in continuous space. How do you measure the entropy of a
  half continuous, half discrete random variable? Perhaps the
  problem is related to the delta function, and not the definition
  of differential entropy?

- Expectation Maximization (Mixture of Gaussians cool stuff)
  (Dempster 1977)

- Good introduction to Parzen Window Density Estimation. Parzen
  density functions trade construction time for evaluation
  time (pg. 41). They are a way to transform a sample into a
  distribution. They don't work very well in higher dimensions due
  to the thinning of sample points.

- Calculating the entropy of a Markov Model (or state machine,
  program, etc.) seems like it would be very hard, since each trial
  would not be independent of the other trials. Yet, there are many
  common sense models that do need to have state to accurately model
  the world.

- "... there is no direct procedure for evaluating entropy from a
  sample. A common approach is to model the density from the sample,
  and then estimate the entropy from the density."

- pg.
  55 he says that infinity minus infinity is zero lol.

- great idea on pg 62 about using random samples from images to
  speed up computation.

- practical way of terminating a random search: "A better idea is to
  reduce the learning rate until the parameters have a reasonable
  variance and then take the average parameters."

- p. 65 bullshit hack to make his Parzen window estimates work.

- this alignment only works if the initial pose is not very far
  off.

Occlusion? Seems a bit holistic.

** References
- "excellent" book on entropy (Cover & Thomas, 1991) [Elements of
  Information Theory.]

- Canny, J. (1986). A Computational Approach to Edge Detection. IEEE
  Transactions PAMI, PAMI-8(6):679–698.

- Chin, R. and Dyer, C. (1986). Model-Based Recognition in Robot
  Vision. Computing Surveys, 18:67–108.

- Grimson, W., Lozano-Perez, T., Wells, W., et al. (1994). An
  Automatic Registration Method for Frameless Stereotaxy, Image
  Guided Surgery, and Enhanced Reality Visualization. In Proceedings
  of the Computer Society Conference on Computer Vision and Pattern
  Recognition, Seattle, WA. IEEE.

- Hill, D. L., Studholme, C., and Hawkes, D. J. (1994). Voxel
  Similarity Measures for Automated Image Registration. In
  Proceedings of the Third Conference on Visualization in Biomedical
  Computing, pages 205–216. SPIE.

- Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by
  Simulated Annealing. Science, 220(4598):671–680.

- Jones, M. and Poggio, T. (1995). Model-based matching of line
  drawings by linear combinations of prototypes. Proceedings of the
  International Conference on Computer Vision.

- Ljung, L. and Soderstrom, T. (1983). Theory and Practice of
  Recursive Identification. MIT Press.

- Shannon, C. E. (1948). A mathematical theory of communication. Bell
  System Technical Journal, 27:379–423 and 623–656.

- Shashua, A. (1992). Geometry and Photometry in 3D Visual
  Recognition.
  PhD thesis, M.I.T. Artificial Intelligence Laboratory,
  AI-TR-1401.

- Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,
  W. T. (1992). Numerical Recipes in C: The Art of Scientific
  Computing. Cambridge University Press, Cambridge, England, second
  edition.

* Semi-Automated Dialogue Act Classification for Situated Social Agents in Games, Deb Roy

Interesting attempt to learn "social scripts" related to restaurant
behaviour. The authors do this by creating a game which implements a
virtual restaurant, and recording actual human players as they
interact with the game. They learn scripts from annotated
interactions and then use those scripts to label other
interactions. They don't get very good results, but their
methodology of creating a virtual world and recording
low-dimensional actions is interesting.

- Torque 2D/3D looks like an interesting game engine.

* Face Recognition by Humans: Nineteen Results all Computer Vision Researchers should know, Sinha

This is a summary of a lot of bio experiments on human face
recognition.

- They assert again that the internal gradients/structures of a face
  are more important than the edges.

- It's amazing to me that it takes about 10 years after birth for a
  human to get advanced adult-like face detection. They progress from
  feature-based processing to a holistic approach during this
  time.

- Finally, color is a very important cue for identifying faces.

** References
- A. Freire, K. Lee, and L. A. Symons, “The face-inversion effect as
  a deficit in the encoding of configural information: Direct
  evidence,” Perception, vol. 29, no. 2, pp. 159–170, 2000.
- M. B. Lewis, “Thatcher’s children: Development and the Thatcher
  illusion,” Perception, vol. 32, pp. 1415–21, 2003.
- E. McKone and N. Kanwisher, “Does the human brain process objects
  of expertise like faces? A review of the evidence,” in From Monkey
  Brain to Human Brain, S.
  Dehaene, J. R. Duhamel, M. Hauser, and
  G. Rizzolatti, Eds. Cambridge, MA: MIT Press, 2005.

heee~eeyyyy kids, time to get eagle'd!!!!

* Ullman

Actual code reuse!

precision = fraction of retrieved instances that are relevant
(true-positives/(true-positives+false-positives))

recall = fraction of relevant instances that are retrieved
(true-positives/total-in-class)

cross-validation = repeatedly split the data, train the model on one
part and test it on the held-out part, to detect overfitting.

** Getting around the dumb "fixed training set" methods

*** 2006 Learning to classify by ongoing feature selection

Brings in the most informative features of a class, based on
mutual information between that feature and all the examples
encountered so far. To bound the running time, he uses only a
fixed number of the most recent examples. He uses a replacement
strategy to tell whether a new feature is better than one of the
current features.

*** 2009 Learning model complexity in an online environment

Sort of like the hierarchical Bayesian models of Tenenbaum, this
system makes the model more and more complicated as it gets more
and more training data. It does this by using two systems in
parallel and then whenever the more complex one seems to be
needed by the data, the less complex one is thrown out, and an
even more complex model is initialized in its place.

He uses an SVM with polynomial kernels of varying complexity. He
gets good performance on a handwriting classification task using a
large range of training samples, since his model changes complexity
depending on the number of training samples. The simpler models do
The simpler models do239 better with few training points, and the more complex ones do240 better with many training points.242 The more complex models must be able to be initialized efficiently243 from the less complex models which they replace!246 ** Non Parametric Models248 *** Visual features of intermediate complexity and their use in classification250 *** The chains model for detecting parts by their context252 Like the constelation method for rigid objects, but extended to253 non-rigid objects as well.255 Allows you to build a hand detector from a face detector. This is256 usefull because hands might be only a few pixels, and very257 ambiguous in an image, but if you are expecting them at the end of258 an arm, then they become easier to find.