diff thesis/cortex.org @ 551:d304b2ea7c58
some changes from winston.
author:   Robert McIntyre <rlm@mit.edu>
date:     Fri, 02 May 2014 13:40:47 -0400
parents:  b1d8d9b4b569
children: 20f64a70f8c5
--- a/thesis/cortex.org	Fri May 02 03:39:19 2014 -0400
+++ b/thesis/cortex.org	Fri May 02 13:40:47 2014 -0400
@@ -43,15 +43,15 @@

 * Empathy \& Embodiment: problem solving strategies

- By the end of this thesis, you will have a novel approach to
- representing an recognizing physical actions using embodiment and
- empathy. You will also see one way to efficiently implement physical
- empathy for embodied creatures. Finally, you will become familiar
- with =CORTEX=, a system for designing and simulating creatures with
- rich senses, which I have designed as a library that you can use in
- your own research. Note that I /do not/ process video directly --- I
- start with knowledge of the positions of a creature's body parts and
- works from there.
+ By the time you have read this thesis, you will understand a novel
+ approach to representing and recognizing physical actions using
+ embodiment and empathy. You will also see one way to efficiently
+ implement physical empathy for embodied creatures. Finally, you will
+ become familiar with =CORTEX=, a system for designing and simulating
+ creatures with rich senses, which I have designed as a library that
+ you can use in your own research. Note that I /do not/ process video
+ directly --- I start with knowledge of the positions of a creature's
+ body parts and work from there.

 This is the core vision of my thesis: That one of the important ways
 in which we understand others is by imagining ourselves in their
@@ -65,12 +65,13 @@

 ** The problem: recognizing actions is hard!

- Examine the following image. What is happening? As you, and indeed
- very young children, can easily determine, this is an image of
- drinking.
+ Examine figure \ref{cat-drink}. What is happening? As you, and
+ indeed very young children, can easily determine, this is an image
+ of drinking.

 #+caption: A cat drinking some water. Identifying this action is
 #+caption: beyond the capabilities of existing computer vision systems.
+ #+name: cat-drink
 #+ATTR_LaTeX: :width 7cm
 [[./images/cat-drinking.jpg]]

@@ -94,11 +95,13 @@
 [[./images/fat-person-sitting-at-desk.jpg]]

 Finally, how is it that you can easily tell the difference between
- how the girls /muscles/ are working in figure \ref{girl}?
+ how the girl's /muscles/ are working in figure \ref{girl}?

 #+caption: The mysterious ``common sense'' appears here as you are able
 #+caption: to discern the difference in how the girl's arm muscles
- #+caption: are activated between the two images.
+ #+caption: are activated between the two images. When you compare
+ #+caption: these two images, do you feel something in your own arm
+ #+caption: muscles?
 #+name: girl
 #+ATTR_LaTeX: :width 7cm
 [[./images/wall-push.png]]
@@ -138,7 +141,7 @@
 image, but is found clearly in a simulation / recollection inspired
 by those pixels. An imaginative system, having been trained on
 drinking and non-drinking examples and learning that the most
- important component of drinking is the feeling of water sliding
+ important component of drinking is the feeling of water flowing
 down one's throat, would analyze a video of a cat drinking in the
 following manner:

@@ -146,7 +149,7 @@
 model of its own body in place of the cat. Possibly also create
 a simulation of the stream of water.

- 2. ``Play out'' this simulated scene and generate imagined sensory
+ 2. Play out this simulated scene and generate imagined sensory
 experience. This will include relevant muscle contractions, a
 close up view of the stream from the cat's perspective, and most
 importantly, the imagined feeling of water entering the mouth.
@@ -233,19 +236,19 @@
 Exploring these ideas further demands a concrete implementation, so
 first, I built a system for constructing virtual creatures with
 physiologically plausible sensorimotor systems and detailed
- environments. The result is =CORTEX=, which is described in section
+ environments. The result is =CORTEX=, which I describe in chapter
 \ref{sec-2}.

 Next, I wrote routines which enabled a simple worm-like creature to
 infer the actions of a second worm-like creature, using only its
 own prior sensorimotor experiences and knowledge of the second
 worm's joint positions. This program, =EMPATH=, is described in
- section \ref{sec-3}. It's main components are:
+ chapter \ref{sec-3}. It's main components are:

 - Embodied Action Definitions :: Many otherwise complicated actions
 are easily described in the language of a full suite of
 body-centered, rich senses and experiences. For example,
- drinking is the feeling of water sliding down your throat, and
+ drinking is the feeling of water flowing down your throat, and
 cooling your insides. It's often accompanied by bringing your
 hand close to your face, or bringing your face close to water.
 Sitting down is the feeling of bending your knees, activating
@@ -316,10 +319,10 @@
 that such representations are very powerful, and often
 indispensable for the types of recognition tasks considered here.

- - Although for expediency's sake, I relied on direct knowledge of
- joint positions in this proof of concept, it would be
- straightforward to extend =EMPATH= so that it (more
- realistically) infers joint positions from its visual data.
+ - For expediency's sake, I relied on direct knowledge of joint
+ positions in this proof of concept. However, I believe that the
+ structure of =EMPATH= and =CORTEX= will make future work to
+ enable video analysis much easier than it would otherwise be.

 ** =EMPATH= is built on =CORTEX=, a creature builder.

@@ -343,19 +346,19 @@
 for three reasons:

 - You can design new creatures using Blender (\cite{blender}), a
- popular 3D modeling program. Each sense can be specified using
- special blender nodes with biologically inspired parameters. You
- need not write any code to create a creature, and can use a wide
- library of pre-existing blender models as a base for your own
- creatures.
+ popular, free 3D modeling program. Each sense can be specified
+ using special blender nodes with biologically inspired
+ parameters. You need not write any code to create a creature, and
+ can use a wide library of pre-existing blender models as a base
+ for your own creatures.

 - =CORTEX= implements a wide variety of senses: touch,
 proprioception, vision, hearing, and muscle tension. Complicated
 senses like touch and vision involve multiple sensory elements
 embedded in a 2D surface. You have complete control over the
 distribution of these sensor elements through the use of simple
- png image files. =CORTEX= implements more comprehensive hearing
- than any other creature simulation system available.
+ image files. =CORTEX= implements more comprehensive hearing than
+ any other creature simulation system available.

 - =CORTEX= supports any number of creatures and any number of
 senses. Time in =CORTEX= dilates so that the simulated creatures
@@ -365,8 +368,8 @@
 =CORTEX= is built on top of =jMonkeyEngine3=
 (\cite{jmonkeyengine}), which is a video game engine designed to
 create cross-platform 3D desktop games. =CORTEX= is mainly written
- in clojure, a dialect of =LISP= that runs on the java virtual
- machine (JVM). The API for creating and simulating creatures and
+ in clojure, a dialect of =LISP= that runs on the Java Virtual
+ Machine (JVM). The API for creating and simulating creatures and
 senses is entirely expressed in clojure, though many senses are
 implemented at the layer of jMonkeyEngine or below. For example,
 for the sense of hearing I use a layer of clojure code on top of a
@@ -396,8 +399,8 @@
 - imagination using subworlds

 During one test with =CORTEX=, I created 3,000 creatures each with
- their own independent senses and ran them all at only 1/80 real
- time. In another test, I created a detailed model of my own hand,
+ its own independent senses and ran them all at only 1/80 real time.
+ In another test, I created a detailed model of my own hand,
 equipped with a realistic distribution of touch (more sensitive at
 the fingertips), as well as eyes and ears, and it ran at around 1/4
 real time.
@@ -416,9 +419,9 @@
 \end{sidewaysfigure}
 #+END_LaTeX

-* Designing =CORTEX=
-
- In this section, I outline the design decisions that went into
+* COMMENT Designing =CORTEX=
+
+ In this chapter, I outline the design decisions that went into
 making =CORTEX=, along with some details about its implementation.
 (A practical guide to getting started with =CORTEX=, which skips
 over the history and implementation details presented here, is
@@ -1317,8 +1320,8 @@

 ** ...but hearing must be built from scratch

- At the end of this section I will have simulated ears that work the
- same way as the simulated eyes in the last section. I will be able to
+ At the end of this chapter I will have simulated ears that work the
+ same way as the simulated eyes in the last chapter. I will be able to
 place any number of ear-nodes in a blender file, and they will bind to
 the closest physical object and follow it as it moves around. Each ear
 will provide access to the sound data it picks up between every frame.
@@ -1332,7 +1335,7 @@
 =CORTEX='s hearing is unique because it does not have any
 limitations compared to other simulation environments. As far as I
 know, there is no other system that supports multiple listeners,
- and the sound demo at the end of this section is the first time
+ and the sound demo at the end of this chapter is the first time
 it's been done in a video game environment.

 *** Brief Description of jMonkeyEngine's Sound System
@@ -2146,7 +2149,7 @@
 joint from the rest position defined in the blender file. This
 simulates the muscle-spindles and joint capsules. I will deal with
 Golgi tendon organs, which calculate muscle strain, in the next
- section.
+ chapter.

 *** Helper functions

@@ -2392,7 +2395,7 @@
 also a sense function: it returns the percent of the total muscle
 strength that is currently being employed. This is analogous to
 muscle tension in humans and completes the sense of proprioception
- begun in the last section.
+ begun in the last chapter.

 ** =CORTEX= brings complex creatures to life!

@@ -2499,7 +2502,7 @@

 \newpage

-* =EMPATH=: action recognition in a simulated worm
+* COMMENT =EMPATH=: action recognition in a simulated worm

 Here I develop a computational model of empathy, using =CORTEX= as a
 base. Empathy in this context is the ability to observe another
@@ -3220,7 +3223,7 @@

 ** Digression: Learning touch sensor layout through free play

- In the previous section I showed how to compute actions in terms of
+ In the previous chapter I showed how to compute actions in terms of
 body-centered predicates, but some of those predicates relied on
 the average touch activation of pre-defined regions of the worm's
 skin. What if, instead of receiving touch pre-grouped into the six
@@ -3233,7 +3236,7 @@
 of skin that are close together in the animal end up far apart in
 the nerve bundle.

- In this section I show how to automatically learn the skin-topology of
+ In this chapter I show how to automatically learn the skin-topology of
 a worm segment by free exploration. As the worm rolls around on the
 floor, large sections of its surface get activated. If the worm has
 stopped moving, then whatever region of skin that is touching the
@@ -3484,7 +3487,7 @@
 \clearpage
 #+END_LaTeX

-* Contributions
+* COMMENT Contributions

 The big idea behind this thesis is a new way to represent and
 recognize physical actions, which I call /empathic representation/.
@@ -3544,7 +3547,7 @@
 \appendix
 #+END_LaTeX

-* Appendix: =CORTEX= User Guide
+* COMMENT Appendix: =CORTEX= User Guide

 Those who write a thesis should endeavor to make their code not only
 accessible, but actually usable, as a way to pay back the community
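The changeset above keeps refining the description of =EMPATH='s empathic-recognition loop: put a model of your own body in the observed creature's place, play out the scene to generate imagined sensory experience, then test embodied action definitions (e.g. "drinking is the feeling of water flowing down your throat") against that experience. As a rough sketch of that loop only: this is Python rather than the Clojure that =CORTEX= and =EMPATH= are actually written in, and every name, field, and threshold here is invented for illustration, not taken from the thesis code.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the body-centered sensory experience that
# the empathic loop reasons over. Real EMPATH works with much richer
# touch/proprioception/muscle data; these fields are illustrative.
@dataclass
class Experience:
    muscle_contractions: dict = field(default_factory=dict)  # name -> [0, 1]
    mouth_touch: float = 0.0    # touch activation around the mouth
    throat_flow: float = 0.0    # imagined feeling of liquid in the throat

def is_drinking(exp: Experience) -> bool:
    """Embodied action definition: drinking is mostly the feeling of
    water flowing down one's throat, with something at the mouth.
    The 0.5 / 0.1 thresholds are arbitrary for this sketch."""
    return exp.throat_flow > 0.5 and exp.mouth_touch > 0.1

def recognize(observed_joint_positions, simulate):
    """Empathic recognition loop: (1) place a model of one's own body
    into the observed pose, (2) play the scene out to generate imagined
    sensory experience (both folded into `simulate` here), (3) test
    embodied action predicates against that imagined experience."""
    imagined = simulate(observed_joint_positions)          # steps 1-2
    actions = [("drinking", is_drinking)]                  # action library
    return [name for name, pred in actions if pred(imagined)]  # step 3
```

A user would supply `simulate` from their own physics/sensor model; the point of the sketch is only the shape of the loop, in which raw observations are never classified directly, but first translated into first-person sensory terms.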