diff thesis/cortex.org @ 548:0b891e0dd809
version 0.2 of thesis complete.
author   | Robert McIntyre <rlm@mit.edu>
date     | Thu, 01 May 2014 23:41:41 -0400
parents  | 5d89879fc894
children | c14545acdfba
--- a/thesis/cortex.org	Mon Apr 28 15:10:59 2014 -0400
+++ b/thesis/cortex.org	Thu May 01 23:41:41 2014 -0400
@@ -513,7 +513,7 @@
 appears to flow at a constant rate, regardless of how complicated
 the environment becomes or how many creatures are in the
 simulation. The cost is that =CORTEX= can sometimes run slower than
- real time. Time dialation works both ways, however --- simulations
+ real time. Time dilation works both ways, however --- simulations
 of very simple creatures in =CORTEX= generally run at 40x real-time
 on my machine!

@@ -565,7 +565,7 @@
 each sense.

 Fortunately this idea is already a well known computer graphics
- technique called /UV-mapping/. In UV-maping, the three-dimensional
+ technique called /UV-mapping/. In UV-mapping, the three-dimensional
 surface of a model is cut and smooshed until it fits on a
 two-dimensional image. You paint whatever you want on that image,
 and when the three-dimensional shape is rendered in a game the
@@ -2814,7 +2814,7 @@

 The worm's total life experience is a long looping path through
 \Phi-space. I will now introduce a simple way of taking that
- experiece path and building a function that can infer complete
+ experience path and building a function that can infer complete
 sensory experience given only a stream of proprioceptive data. This
 /empathy/ function will provide a bridge to use the body centered
 action predicates on video-like streams of information.
@@ -2972,7 +2972,7 @@
 entries in a proprioceptive bin, because for each element in the
 starting bin it performs a series of set lookups in the preceding
 bins. If the total history is limited, then this takes time
- proprotional to a only a constant multiple of the number of entries
+ proportional to only a constant multiple of the number of entries
 in the starting bin. This analysis also applies, even if the action
 requires multiple longest chains -- it's still the average number
 of entries in a proprioceptive bin times the desired chain length.
@@ -3125,7 +3125,7 @@
 the testing environment for the action-predicates, with one major
 difference: the only sensory information available to the system
 is proprioception. From just the proprioception data and
- \Phi-space, =longest-thread= synthesises a complete record the last
+ \Phi-space, =longest-thread= synthesizes a complete record of the last
 300 sensory experiences of the worm. These synthesized experiences
 are fed directly into the action predicates =grand-circle?=,
 =curled?=, =wiggling?=, and =resting?= from before and their output
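The set-lookup argument in the hunk above is easier to see with a concrete sketch. Assuming that each experience map carries a =:proprioception= entry which flattens to a sequence of joint angles, a coarse binning hash and a \Phi-space index could be written roughly as follows. This is only an illustrative sketch: =bin= here is a plausible rounding hash in the spirit of the =(bin 2)= call that appears later in =draped?=, and =index-phi-space= and =candidates= are hypothetical helper names, not definitions taken from the thesis source.

#+begin_src clojure
;; Illustrative sketch (not from the changeset): a coarse rounding hash in
;; the spirit of (bin 2), plus a Phi-space index from binned posture to the
;; set of times at which that posture occurred.

(defn bin
  "Return a hash function that rounds each joint angle to `digits`
   decimal places, so that nearby postures land in the same bin."
  [digits]
  (let [scale (Math/pow 10 digits)]
    (fn [angles]
      (mapv #(/ (Math/round (* scale (double %))) scale) angles))))

(defn index-phi-space
  "Map each binned proprioceptive posture to the set of indices (times)
   at which it appears in the experience history."
  [bin-fn phi-space]
  (reduce (fn [index [t experience]]
            (update index
                    (bin-fn (flatten (:proprioception experience)))
                    (fnil conj #{})
                    t))
          {}
          (map-indexed vector phi-space)))

(defn candidates
  "All past times whose binned posture matches the current experience:
   a single lookup in the starting bin."
  [index bin-fn experience]
  (get index (bin-fn (flatten (:proprioception experience))) #{}))
#+end_src

With such an index, finding every candidate past moment for the current posture is a single =get=, and extending a chain backwards is a series of set-membership tests in the preceding bins, which is why the cost scales with the number of entries in the starting bin rather than with the length of the whole experience history.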
@@ -3365,13 +3365,11 @@
 [[./images/worm-roll.png]]

 #+caption: After completing its adventures, the worm now knows
- #+caption: how its touch sensors are arranged along its skin. These
- #+caption: are the regions that were deemed important by
+ #+caption: how its touch sensors are arranged along its skin. Each of these six rectangles is a touch sensory pattern that was
+ #+caption: deemed important by
 #+caption: =learn-touch-regions=. Each white square in the rectangles
 #+caption: above is a cluster of ``related" touch nodes as determined
- #+caption: by the system. Since each square in the ``cross" corresponds
- #+caption: to a face, the worm has correctly discovered that it has
- #+caption: six faces.
+ #+caption: by the system. The worm has correctly discovered that it has six faces, and has partitioned its sensory map into these six faces.
 #+name: worm-touch-map
 #+ATTR_LaTeX: :width 12cm
 [[./images/touch-learn.png]]
@@ -3383,29 +3381,133 @@
 completely scrambled. The cross shape is just for convenience. This
 example justifies the use of pre-defined touch regions in =EMPATH=.

+** Recognizing an object using embodied representation
+
+ At the beginning of the thesis, I suggested that we might recognize
+ the chair in Figure \ref{hidden-chair} by imagining ourselves in
+ the position of the man and realizing that he must be sitting on
+ something in order to maintain that position. Here, I present a
+ brief elaboration on how this might be done.
+
+ First, I need the feeling of leaning or resting /on/ some other
+ object that is not the floor. This feeling is easy to describe
+ using an embodied representation.
+
+ #+caption: Program describing the sense of leaning or resting on something.
+ #+caption: This involves a relaxed posture, the feeling of touching something,
+ #+caption: and a period of stability where the worm does not move.
+ #+name: draped
+ #+begin_listing clojure
+ #+begin_src clojure
+(defn draped?
+  "Is the worm:
+   -- not flat (the floor is not a 'chair')
+   -- supported (not using its muscles to hold its position)
+   -- stable (not changing its position)
+   -- touching something (must register contact)"
+  [experiences]
+  (let [b2-hash (bin 2)
+        touch (:touch (peek experiences))
+        total-contact
+        (reduce
+         +
+         (map #(contact all-touch-coordinates %)
+              (rest touch)))]
+    (println total-contact)
+    (and (not (resting? experiences))
+         (every?
+          zero?
+          (-> experiences
+              (vector:last-n 25)
+              (#(map :muscle %))
+              (flatten)))
+         (-> experiences
+             (vector:last-n 20)
+             (#(map (comp b2-hash flatten :proprioception) %))
+             (set)
+             (count) (= 1))
+         (< 0.03 total-contact))))
+ #+end_src
+ #+end_listing
+
+ #+caption: The =draped?= predicate detects the presence of the
+ #+caption: cube whenever the worm interacts with it. The details of the
+ #+caption: cube are irrelevant; only the way it influences the worm's
+ #+caption: body matters.
+ #+name: draped-video
+ #+ATTR_LaTeX: :width 13cm
+ [[./images/draped.png]]
+
+ Though this is a simple example, using the =draped?= predicate to
+ detect the cube has interesting advantages. The =draped?= predicate
+ describes the cube not in terms of properties that the cube has,
+ but instead in terms of how the worm interacts with it physically.
+ This means that the cube can still be detected even if it is not
+ visible, as long as its influence on the worm's body is visible.
+
+ This system will also see the virtual cube created by a
+ ``mimeworm", which uses its muscles in a very controlled way to
+ mimic the appearance of leaning on a cube. The system will
+ anticipate that there is an actual invisible cube that provides
+ support!
+
+ #+caption: Can you see the thing that this person is leaning on?
+ #+caption: What properties does it have, other than how it makes the man's
+ #+caption: elbow and shoulder feel? I wonder if people who can actually
+ #+caption: maintain this pose easily still see the support?
+ #+name: mime
+ #+ATTR_LaTeX: :width 6cm
+ [[./images/pablo-the-mime.png]]
+
+ This makes me wonder about the psychology of actual mimes. Suppose
+ for a moment that people have something analogous to \Phi-space and
+ that one of the ways that they find objects in a scene is by their
+ relation to other people's bodies. Suppose that a person watches a
+ person miming an invisible wall. For a person with no experience
+ with miming, their \Phi-space will only have entries that describe
+ the scene with the sensation of their hands touching a wall. This
+ sensation of touch will create a strong impression of a wall, even
+ though the wall would have to be invisible. A person with
+ experience in miming, however, will have entries in their \Phi-space
+ that describe the wall-miming position without a sense of touch. It
+ will not seem to such a person that an invisible wall is present,
+ but merely that the mime is holding out their hands in a special
+ way. Thus, the theory that humans use something like \Phi-space
+ weakly predicts that learning how to mime should break the power of
+ miming illusions. Most optical illusions still work no matter how
+ much you know about them, so this proposal would be quite
+ interesting to test, as it predicts a non-standard result!
+
+
+#+BEGIN_LaTeX
+\clearpage
+#+END_LaTeX

 * Contributions
+
+ The big idea behind this thesis is a new way to represent and
+ recognize physical actions, which I call /empathic representation/.
+ Actions are represented as predicates which have access to the
+ totality of a creature's sensory abilities. To recognize the
+ physical actions of another creature similar to yourself, you
+ imagine what they would feel by examining the position of their body
+ and relating it to your own previous experience.

- The big idea behind this thesis is a new way to represent and
- recognize physical actions -- empathic representation. Actions are
- represented as predicates which have available the totality of a
- creature's sensory abilities. To recognize the physical actions of
- another creature similar to yourself, you imagine what they would
- feel by examining the position of their body and relating it to your
- own previous experience.
-
- Empathic description of physical actions is very robust and general.
- Because the representation is body-centered, it avoids the fragility
- of learning from example videos. Because it relies on all of a
+ Empathic representation of physical actions is robust and general.
+ Because the representation is body-centered, it avoids baking in a
+ particular viewpoint like you might get from learning from example
+ videos. Because empathic representation relies on all of a
 creature's senses, it can describe exactly what an action /feels
 like/ without getting caught up in irrelevant details such as visual
 appearance. I think it is important that a correct description of
- jumping (for example) should not waste even a single bit on the
- color of a person's clothes or skin; empathic representation can
- avoid this waste by describing jumping in terms of touch, muscle
- contractions, and the brief feeling of weightlessness. Empathic
- representation is very low-level in that it describes actions using
- concrete sensory data with little abstraction, but it has the
- generality of much more abstract representations!
+ jumping (for example) should not include irrelevant details such as
+ the color of a person's clothes or skin; empathic representation can
+ get right to the heart of what jumping is by describing it in terms
+ of touch, muscle contractions, and a brief feeling of
+ weightlessness. Empathic representation is very low-level in that it
+ describes actions using concrete sensory data with little
+ abstraction, but it has the generality of much more abstract
+ representations!

 Another important contribution of this thesis is the development of
 the =CORTEX= system, a complete environment for creating simulated
@@ -3413,29 +3515,31 @@
 proprioception, hearing, vision, and muscle tension. You have seen
 how to create new creatures using blender, a 3D modeling tool.

- I hope that =CORTEX= will be useful in further research projects. To
- this end I have included the full source to =CORTEX= along with a
- large suite of tests and examples. I have also created a user guide
- for =CORTEX= which is included in an appendix to this thesis.
-
 As a minor digression, you also saw how I used =CORTEX= to enable a
 tiny worm to discover the topology of its skin simply by rolling on
- the ground.
-
- In conclusion, the main contributions of this thesis are:
-
- - =CORTEX=, a comprehensive platform for embodied AI experiments.
-   =CORTEX= supports many features lacking in other systems, such
-   proper simulation of hearing. It is easy to create new =CORTEX=
-   creatures using Blender, a free 3D modeling program.
-
- - =EMPATH=, which uses =CORTEX= to identify the actions of a
-   worm-like creature using a computational model of empathy. This
-   empathic representation of actions is an important new kind of
-   representation for physical actions.
+ the ground. You also saw how to detect objects using only embodied
+ predicates.
+
+ In conclusion, for this thesis I:
+
+ - Developed the idea of embodied representation, which describes
+   actions that a creature can do in terms of first-person sensory
+   data.
+
+ - Developed a method of empathic action recognition which uses
+   previous embodied experience and embodied representation of
+   actions to greatly constrain the possible interpretations of an
+   action.
+
+ - Created =EMPATH=, a program which uses empathic action
+   recognition to recognize physical actions in a simple model
+   involving segmented worm-like creatures.
+
+ - Created =CORTEX=, a comprehensive platform for embodied AI
+   experiments. It is the base on which =EMPATH= is built.

 #+BEGIN_LaTeX
-\newpage
+\clearpage
 \appendix
 #+END_LaTeX
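The stability clause inside the =draped?= listing added above (the requirement that the worm is not changing its position) is also useful on its own. A standalone version might look like the following sketch, which assumes only that each experience map has a =:proprioception= entry and that a binning hash like the =(bin 2)= used in =draped?= is passed in; =stable?= is an illustrative name, and ordinary =take-last= stands in for the thesis's =vector:last-n=.

#+begin_src clojure
;; Illustrative sketch (not from the changeset): the "stable" test used
;; inside draped?, pulled out as a standalone helper. The worm counts as
;; stable when the binned posture over the last n experiences collapses
;; to a single value.

(defn stable?
  "True when the binned proprioceptive posture has not changed over
   the last n experiences."
  [bin-fn n experiences]
  (->> experiences
       (take-last n)
       (map (comp bin-fn flatten :proprioception))
       (distinct)
       (count)
       (= 1)))

;; Usage mirrors the check inside draped?:
;;   (stable? (bin 2) 20 experiences)
#+end_src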