diff thesis/cortex.org @ 517:68665d2c32a7
spellcheck; almost done with first draft!
author | Robert McIntyre <rlm@mit.edu> |
---|---|
date | Mon, 31 Mar 2014 00:18:26 -0400 |
parents | ced955c3c84f |
children | d78f5102d693 |
--- a/thesis/cortex.org Sun Mar 30 22:48:19 2014 -0400
+++ b/thesis/cortex.org Mon Mar 31 00:18:26 2014 -0400
@@ -59,7 +59,6 @@
  constraint can be the difference between easily understanding what
  is happening in a video and being completely lost in a sea of
  incomprehensible color and movement.
-

 ** The problem: recognizing actions in video is hard!

@@ -77,7 +76,7 @@
  the problem is that many computer vision systems focus on
  pixel-level details or comparisons to example images (such as
  \cite{volume-action-recognition}), but the 3D world is so variable
- that it is hard to descrive the world in terms of possible images.
+ that it is hard to describe the world in terms of possible images.

  In fact, the contents of scene may have much less to do with pixel
  probabilities than with recognizing various affordances: things you
@@ -102,7 +101,7 @@
  [[./images/wall-push.png]]

  Each of these examples tells us something about what might be going
- on in our minds as we easily solve these recognition problems.
+ on in our minds as we easily solve these recognition problems:

  The hidden chair shows us that we are strongly triggered by cues
  relating to the position of human bodies, and that we can determine
@@ -115,6 +114,11 @@
  most positions, and we can easily project this self-knowledge to
  imagined positions triggered by images of the human body.

+ The cat tells us that imagination of some kind plays an important
+ role in understanding actions. The question is: Can we be more
+ precise about what sort of imagination is required to understand
+ these actions?
+
 ** A step forward: the sensorimotor-centered approach

  In this thesis, I explore the idea that our knowledge of our own
@@ -139,13 +143,13 @@
  model of its own body in place of the cat. Possibly also create
  a simulation of the stream of water.

- 2. Play out this simulated scene and generate imagined sensory
+ 2. ``Play out'' this simulated scene and generate imagined sensory
  experience. This will include relevant muscle contractions, a
  close up view of the stream from the cat's perspective, and most
- importantly, the imagined feeling of water entering the
- mouth. The imagined sensory experience can come from a
- simulation of the event, but can also be pattern-matched from
- previous, similar embodied experience.
+ importantly, the imagined feeling of water entering the mouth.
+ The imagined sensory experience can come from a simulation of
+ the event, but can also be pattern-matched from previous,
+ similar embodied experience.

  3. The action is now easily identified as drinking by the sense of
  taste alone. The other senses (such as the tongue moving in and
@@ -160,7 +164,7 @@
  2. Generate proprioceptive sensory data from this alignment.

  3. Use the imagined proprioceptive data as a key to lookup related
- sensory experience associated with that particular proproceptive
+ sensory experience associated with that particular proprioceptive
  feeling.

  4. Retrieve the feeling of your bottom resting on a surface, your
@@ -194,14 +198,14 @@
  viewpoint.

  Another powerful advantage is that using the language of multiple
- body-centered rich senses to describe body-centerd actions offers a
+ body-centered rich senses to describe body-centered actions offers a
  massive boost in descriptive capability. Consider how difficult it
  would be to compose a set of HOG filters to describe the action of
  a simple worm-creature ``curling'' so that its head touches its
  tail, and then behold the simplicity of describing thus action in a
  language designed for the task (listing \ref{grand-circle-intro}):

- #+caption: Body-centerd actions are best expressed in a body-centered
+ #+caption: Body-centered actions are best expressed in a body-centered
  #+caption: language. This code detects when the worm has curled into a
  #+caption: full circle. Imagine how you would replicate this functionality
  #+caption: using low-level pixel features such as HOG filters!
@@ -220,30 +224,23 @@
  #+end_src
  #+end_listing

-** =EMPATH= regognizes actions using empathy
-
- First, I built a system for constructing virtual creatures with
+** =EMPATH= recognizes actions using empathy
+
+ Exploring these ideas further demands a concrete implementation, so
+ first, I built a system for constructing virtual creatures with
  physiologically plausible sensorimotor systems and detailed
  environments. The result is =CORTEX=, which is described in section
- \ref{sec-2}. (=CORTEX= was built to be flexible and useful to other
- AI researchers; it is provided in full with detailed instructions
- on the web [here].)
+ \ref{sec-2}.

  Next, I wrote routines which enabled a simple worm-like creature to
  infer the actions of a second worm-like creature, using only its
  own prior sensorimotor experiences and knowledge of the second
  worm's joint positions. This program, =EMPATH=, is described in
- section \ref{sec-3}, and the key results of this experiment are
- summarized below.
-
- I have built a system that can express the types of recognition
- problems in a form amenable to computation. It is split into
- four parts:
-
- - Free/Guided Play :: The creature moves around and experiences the
- world through its unique perspective. Many otherwise
- complicated actions are easily described in the language of a
- full suite of body-centered, rich senses. For example,
+ section \ref{sec-3}. It's main components are:
+
+ - Embodied Action Definitions :: Many otherwise complicated actions
+ are easily described in the language of a full suite of
+ body-centered, rich senses and experiences. For example,
  drinking is the feeling of water sliding down your throat, and
  cooling your insides. It's often accompanied by bringing your
  hand close to your face, or bringing your face close to water.
@@ -251,26 +248,35 @@
  your quadriceps, then feeling a surface with your bottom and
  relaxing your legs. These body-centered action descriptions
  can be either learned or hard coded.
- - Posture Imitation :: When trying to interpret a video or image,
+
+ - Guided Play :: The creature moves around and experiences the
+ world through its unique perspective. As the creature moves,
+ it gathers experiences that satisfy the embodied action
+ definitions.
+
+ - Posture imitation :: When trying to interpret a video or image,
  the creature takes a model of itself and aligns it with
- whatever it sees. This alignment can even cross species, as
+ whatever it sees. This alignment might even cross species, as
  when humans try to align themselves with things like ponies,
  dogs, or other humans with a different body type.
- - Empathy :: The alignment triggers associations with
+
+ - Empathy :: The alignment triggers associations with
  sensory data from prior experiences. For example, the
  alignment itself easily maps to proprioceptive data. Any
  sounds or obvious skin contact in the video can to a lesser
- extent trigger previous experience. Segments of previous
- experiences are stitched together to form a coherent and
- complete sensory portrait of the scene.
- - Recognition :: With the scene described in terms of first
- person sensory events, the creature can now run its
- action-identification programs on this synthesized sensory
- data, just as it would if it were actually experiencing the
- scene first-hand. If previous experience has been accurately
+ extent trigger previous experience keyed to hearing or touch.
+ Segments of previous experiences gained from play are stitched
+ together to form a coherent and complete sensory portrait of
+ the scene.
+
+ - Recognition :: With the scene described in terms of
+ remembered first person sensory events, the creature can now
+ run its action-identified programs (such as the one in listing
+ \ref{grand-circle-intro} on this synthesized sensory data,
+ just as it would if it were actually experiencing the scene
+ first-hand. If previous experience has been accurately
  retrieved, and if it is analogous enough to the scene, then
  the creature will correctly identify the action in the scene.
-

  My program, =EMPATH= uses this empathic problem solving technique
  to interpret the actions of a simple, worm-like creature.
@@ -287,28 +293,31 @@
  #+name: worm-recognition-intro
  #+ATTR_LaTeX: :width 15cm
  [[./images/worm-poses.png]]
-
- #+caption: From only \emph{proprioceptive} data, =EMPATH= was able to infer
- #+caption: the complete sensory experience and classify these four poses.
- #+caption: The last image is a composite, depicting the intermediate stages
- #+caption: of \emph{wriggling}.
- #+name: worm-recognition-intro-2
- #+ATTR_LaTeX: :width 15cm
- [[./images/empathy-1.png]]

- Next, I developed an experiment to test the power of =CORTEX='s
- sensorimotor-centered language for solving recognition problems. As
- a proof of concept, I wrote routines which enabled a simple
- worm-like creature to infer the actions of a second worm-like
- creature, using only its own previous sensorimotor experiences and
- knowledge of the second worm's joints (figure
- \ref{worm-recognition-intro-2}). The result of this proof of
- concept was the program =EMPATH=, described in section \ref{sec-3}.
-
-** =EMPATH= is built on =CORTEX=, en environment for making creatures.
-
- # =CORTEX= provides a language for describing the sensorimotor
- # experiences of various creatures.
+*** Main Results
+
+ - After one-shot supervised training, =EMPATH= was able recognize a
+ wide variety of static poses and dynamic actions---ranging from
+ curling in a circle to wiggling with a particular frequency ---
+ with 95\% accuracy.
+
+ - These results were completely independent of viewing angle
+ because the underlying body-centered language fundamentally is
+ independent; once an action is learned, it can be recognized
+ equally well from any viewing angle.
+
+ - =EMPATH= is surprisingly short; the sensorimotor-centered
+ language provided by =CORTEX= resulted in extremely economical
+ recognition routines --- about 500 lines in all --- suggesting
+ that such representations are very powerful, and often
+ indispensable for the types of recognition tasks considered here.
+
+ - Although for expediency's sake, I relied on direct knowledge of
+ joint positions in this proof of concept, it would be
+ straightforward to extend =EMPATH= so that it (more
+ realistically) infers joint positions from its visual data.
+
+** =EMPATH= is built on =CORTEX=, a creature builder.

  I built =CORTEX= to be a general AI research platform for doing
  experiments involving multiple rich senses and a wide variety and
@@ -319,19 +328,21 @@
  language of creatures and senses, but in order to explore those
  ideas they must first build a platform in which they can create
  simulated creatures with rich senses! There are many ideas that
- would be simple to execute (such as =EMPATH=), but attached to them
- is the multi-month effort to make a good creature simulator. Often,
- that initial investment of time proves to be too much, and the
- project must make do with a lesser environment.
+ would be simple to execute (such as =EMPATH= or
+ \cite{larson-symbols}), but attached to them is the multi-month
+ effort to make a good creature simulator. Often, that initial
+ investment of time proves to be too much, and the project must make
+ do with a lesser environment.

  =CORTEX= is well suited as an environment for embodied AI research
  for three reasons:

- - You can create new creatures using Blender, a popular 3D modeling
- program. Each sense can be specified using special blender nodes
- with biologically inspired paramaters. You need not write any
- code to create a creature, and can use a wide library of
- pre-existing blender models as a base for your own creatures.
+ - You can create new creatures using Blender (\cite{blender}), a
+ popular 3D modeling program. Each sense can be specified using
+ special blender nodes with biologically inspired parameters. You
+ need not write any code to create a creature, and can use a wide
+ library of pre-existing blender models as a base for your own
+ creatures.

  - =CORTEX= implements a wide variety of senses: touch,
  proprioception, vision, hearing, and muscle tension. Complicated
@@ -343,24 +354,25 @@
  available.

  - =CORTEX= supports any number of creatures and any number of
- senses. Time in =CORTEX= dialates so that the simulated creatures
- always precieve a perfectly smooth flow of time, regardless of
+ senses. Time in =CORTEX= dilates so that the simulated creatures
+ always perceive a perfectly smooth flow of time, regardless of
  the actual computational load.

- =CORTEX= is built on top of =jMonkeyEngine3=, which is a video game
- engine designed to create cross-platform 3D desktop games. =CORTEX=
- is mainly written in clojure, a dialect of =LISP= that runs on the
- java virtual machine (JVM). The API for creating and simulating
- creatures and senses is entirely expressed in clojure, though many
- senses are implemented at the layer of jMonkeyEngine or below. For
- example, for the sense of hearing I use a layer of clojure code on
- top of a layer of java JNI bindings that drive a layer of =C++=
- code which implements a modified version of =OpenAL= to support
- multiple listeners. =CORTEX= is the only simulation environment
- that I know of that can support multiple entities that can each
- hear the world from their own perspective. Other senses also
- require a small layer of Java code. =CORTEX= also uses =bullet=, a
- physics simulator written in =C=.
+ =CORTEX= is built on top of =jMonkeyEngine3=
+ (\cite{jmonkeyengine}), which is a video game engine designed to
+ create cross-platform 3D desktop games. =CORTEX= is mainly written
+ in clojure, a dialect of =LISP= that runs on the java virtual
+ machine (JVM). The API for creating and simulating creatures and
+ senses is entirely expressed in clojure, though many senses are
+ implemented at the layer of jMonkeyEngine or below. For example,
+ for the sense of hearing I use a layer of clojure code on top of a
+ layer of java JNI bindings that drive a layer of =C++= code which
+ implements a modified version of =OpenAL= to support multiple
+ listeners. =CORTEX= is the only simulation environment that I know
+ of that can support multiple entities that can each hear the world
+ from their own perspective. Other senses also require a small layer
+ of Java code. =CORTEX= also uses =bullet=, a physics simulator
+ written in =C=.

  #+caption: Here is the worm from figure \ref{worm-intro} modeled
  #+caption: in Blender, a free 3D-modeling program. Senses and
@@ -375,8 +387,8 @@
  - distributed communication among swarm creatures
  - self-learning using free exploration,
  - evolutionary algorithms involving creature construction
- exploration of exoitic senses and effectors that are not possible
- in the real world (such as telekenisis or a semantic sense)
+ - exploration of exotic senses and effectors that are not possible
+ in the real world (such as telekinesis or a semantic sense)
  - imagination using subworlds

  During one test with =CORTEX=, I created 3,000 creatures each with
@@ -400,37 +412,6 @@
  \end{sidewaysfigure}
  #+END_LaTeX

-** Contributions
-
- - I built =CORTEX=, a comprehensive platform for embodied AI
- experiments. =CORTEX= supports many features lacking in other
- systems, such proper simulation of hearing. It is easy to create
- new =CORTEX= creatures using Blender, a free 3D modeling program.
-
- - I built =EMPATH=, which uses =CORTEX= to identify the actions of
- a worm-like creature using a computational model of empathy.
-
- - After one-shot supervised training, =EMPATH= was able recognize a
- wide variety of static poses and dynamic actions---ranging from
- curling in a circle to wriggling with a particular frequency ---
- with 95\% accuracy.
-
- - These results were completely independent of viewing angle
- because the underlying body-centered language fundamentally is
- independent; once an action is learned, it can be recognized
- equally well from any viewing angle.
-
- - =EMPATH= is surprisingly short; the sensorimotor-centered
- language provided by =CORTEX= resulted in extremely economical
- recognition routines --- about 500 lines in all --- suggesting
- that such representations are very powerful, and often
- indispensible for the types of recognition tasks considered here.
-
- - Although for expediency's sake, I relied on direct knowledge of
- joint positions in this proof of concept, it would be
- straightforward to extend =EMPATH= so that it (more
- realistically) infers joint positions from its visual data.
-
 * Designing =CORTEX=

  In this section, I outline the design decisions that went into
@@ -441,18 +422,18 @@

  Throughout this project, I intended for =CORTEX= to be flexible and
  extensible enough to be useful for other researchers who want to
- test out ideas of their own. To this end, wherver I have had to make
- archetictural choices about =CORTEX=, I have chosen to give as much
+ test out ideas of their own. To this end, wherever I have had to make
+ architectural choices about =CORTEX=, I have chosen to give as much
  freedom to the user as possible, so that =CORTEX= may be used for
- things I have not forseen.
+ things I have not foreseen.

 ** Building in simulation versus reality
- The most important archetictural decision of all is the choice to
- use a computer-simulated environemnt in the first place! The world
+ The most important architectural decision of all is the choice to
+ use a computer-simulated environment in the first place! The world
  is a vast and rich place, and for now simulations are a very poor
  reflection of its complexity. It may be that there is a significant
- qualatative difference between dealing with senses in the real
- world and dealing with pale facilimilies of them in a simulation
+ qualitative difference between dealing with senses in the real
+ world and dealing with pale facsimiles of them in a simulation
  \cite{brooks-representation}. What are the advantages and
  disadvantages of a simulation vs. reality?

@@ -519,13 +500,13 @@
  The need for real time processing only increases if multiple senses
  are involved. In the extreme case, even simple algorithms will have
  to be accelerated by ASIC chips or FPGAs, turning what would
- otherwise be a few lines of code and a 10x speed penality into a
+ otherwise be a few lines of code and a 10x speed penalty into a
  multi-month ordeal. For this reason, =CORTEX= supports
- /time-dialiation/, which scales back the framerate of the
+ /time-dilation/, which scales back the framerate of the
  simulation in proportion to the amount of processing each frame.
  From the perspective of the creatures inside the simulation, time
  always appears to flow at a constant rate, regardless of how
- complicated the envorimnent becomes or how many creatures are in
+ complicated the environment becomes or how many creatures are in
  the simulation. The cost is that =CORTEX= can sometimes run slower
  than real time. This can also be an advantage, however ---
  simulations of very simple creatures in =CORTEX= generally run at
@@ -536,7 +517,7 @@
  If =CORTEX= is to support a wide variety of senses, it would help
  to have a better understanding of what a ``sense'' actually is!
  While vision, touch, and hearing all seem like they are quite
- different things, I was supprised to learn during the course of
+ different things, I was surprised to learn during the course of
  this thesis that they (and all physical senses) can be expressed as
  exactly the same mathematical object due to a dimensional argument!

@@ -561,13 +542,13 @@
  Most human senses consist of many discrete sensors of various
  properties distributed along a surface at various densities. For
  skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's
- disks, and Ruffini's endings, which detect pressure and vibration
- of various intensities. For ears, it is the stereocilia distributed
- along the basilar membrane inside the cochlea; each one is
- sensitive to a slightly different frequency of sound. For eyes, it
- is rods and cones distributed along the surface of the retina. In
- each case, we can describe the sense with a surface and a
- distribution of sensors along that surface.
+ disks, and Ruffini's endings (\cite{9.01-textbook), which detect
+ pressure and vibration of various intensities. For ears, it is the
+ stereocilia distributed along the basilar membrane inside the
+ cochlea; each one is sensitive to a slightly different frequency of
+ sound. For eyes, it is rods and cones distributed along the surface
+ of the retina. In each case, we can describe the sense with a
+ surface and a distribution of sensors along that surface.

  The neat idea is that every human sense can be effectively
  described in terms of a surface containing embedded sensors. If the
@@ -614,7 +595,7 @@
  I did not need to write my own physics simulation code or shader to
  build =CORTEX=. Doing so would lead to a system that is impossible
  for anyone but myself to use anyway. Instead, I use a video game
- engine as a base and modify it to accomodate the additional needs
+ engine as a base and modify it to accommodate the additional needs
  of =CORTEX=. Video game engines are an ideal starting point to
  build =CORTEX=, because they are not far from being creature
  building systems themselves.
@@ -684,7 +665,7 @@
  for other projects, it needs a way to construct complicated
  creatures. If possible, it would be nice to leverage work that has
  already been done by the community of 3D modelers, or at least
- enable people who are talented at moedling but not programming to
+ enable people who are talented at modeling but not programming to
  design =CORTEX= creatures.

  Therefore, I use Blender, a free 3D modeling program, as the main
@@ -704,7 +685,7 @@
  sensors if applicable.
  - Make each empty-node the child of the top-level node.

- #+caption: An example of annoting a creature model with empty
+ #+caption: An example of annotating a creature model with empty
  #+caption: nodes to describe the layout of senses. There are
  #+caption: multiple empty nodes which each describe the position
  #+caption: of muscles, ears, eyes, or joints.
@@ -717,7 +698,7 @@
  Blender is a general purpose animation tool, which has been used in
  the past to create high quality movies such as Sintel
  \cite{blender}. Though Blender can model and render even complicated
- things like water, it is crucual to keep models that are meant to
+ things like water, it is crucial to keep models that are meant to
  be simulated as creatures simple. =Bullet=, which =CORTEX= uses
  though jMonkeyEngine3, is a rigid-body physics system. This offers
  a compromise between the expressiveness of a game level and the
@@ -725,9 +706,9 @@
  should be naturally expressed as rigid components held together by
  joint constraints.

- But humans are more like a squishy bag with wrapped around some
- hard bones which define the overall shape. When we move, our skin
- bends and stretches to accomodate the new positions of our bones.
+ But humans are more like a squishy bag wrapped around some hard
+ bones which define the overall shape. When we move, our skin bends
+ and stretches to accommodate the new positions of our bones.

  One way to make bodies composed of rigid pieces connected by joints
  /seem/ more human-like is to use an /armature/, (or /rigging/)
@@ -735,17 +716,16 @@
  mesh deforms as a function of the position of each ``bone'' which
  is a standard rigid body. This technique is used extensively to
  model humans and create realistic animations. It is not a good
- technique for physical simulation, however because it creates a lie
- -- the skin is not a physical part of the simulation and does not
- interact with any objects in the world or itself. Objects will pass
- right though the skin until they come in contact with the
- underlying bone, which is a physical object. Whithout simulating
- the skin, the sense of touch has little meaning, and the creature's
- own vision will lie to it about the true extent of its body.
- Simulating the skin as a physical object requires some way to
- continuously update the physical model of the skin along with the
- movement of the bones, which is unacceptably slow compared to rigid
- body simulation.
+ technique for physical simulation because it is a lie -- the skin
+ is not a physical part of the simulation and does not interact with
+ any objects in the world or itself. Objects will pass right though
+ the skin until they come in contact with the underlying bone, which
+ is a physical object. Without simulating the skin, the sense of
+ touch has little meaning, and the creature's own vision will lie to
+ it about the true extent of its body. Simulating the skin as a
+ physical object requires some way to continuously update the
+ physical model of the skin along with the movement of the bones,
+ which is unacceptably slow compared to rigid body simulation.

  Therefore, instead of using the human-like ``deformable bag of
  bones'' approach, I decided to base my body plans on multiple solid
@@ -762,7 +742,7 @@
  together by invisible joint constraints. This is what I mean by
  ``eve-like''. The main reason that I use eve-style bodies is for
  efficiency, and so that there will be correspondence between the
- AI's semses and the physical presence of its body. Each individual
+ AI's senses and the physical presence of its body. Each individual
  section is simulated by a separate rigid body that corresponds
  exactly with its visual representation and does not change.
  Sections are connected by invisible joints that are well supported
@@ -870,7 +850,7 @@
  must be called /after/ =physical!= is called.

  #+caption: Program to find the targets of a joint node by
- #+caption: exponentiallly growth of a search cube.
+ #+caption: exponentially growth of a search cube.
  #+name: joint-targets
  #+begin_listing clojure
  #+begin_src clojure
@@ -905,7 +885,7 @@
  a dispatch on the metadata of each joint node.

  #+caption: Program to dispatch on blender metadata and create joints
- #+caption: sutiable for physical simulation.
+ #+caption: suitable for physical simulation.
  #+name: joint-dispatch
  #+begin_listing clojure
  #+begin_src clojure
@@ -985,8 +965,8 @@
  In general, whenever =CORTEX= exposes a sense (or in this case
  physicality), it provides a function of the type =sense!=, which
  takes in a collection of nodes and augments it to support that
- sense. The function returns any controlls necessary to use that
- sense. In this case =body!= cerates a physical body and returns no
+ sense. The function returns any controls necessary to use that
+ sense. In this case =body!= creates a physical body and returns no
  control functions.

  #+caption: Program to give joints to a creature.
@@ -1022,7 +1002,7 @@
  creature.

  #+caption: With the ability to create physical creatures from blender,
- #+caption: =CORTEX= gets one step closer to becomming a full creature
+ #+caption: =CORTEX= gets one step closer to becoming a full creature
  #+caption: simulation environment.
  #+name: name
  #+ATTR_LaTeX: :width 15cm
@@ -1085,7 +1065,7 @@
  hold the data. It does not do any copying from the GPU to the CPU
  itself because it is a slow operation.

- #+caption: Function to make the rendered secne in jMonkeyEngine
+ #+caption: Function to make the rendered scene in jMonkeyEngine
  #+caption: available for further processing.
  #+name: pipeline-1
  #+begin_listing clojure
@@ -1160,7 +1140,7 @@
  (let [target (closest-node creature eye)
  [cam-width cam-height]
  ;;[640 480] ;; graphics card on laptop doesn't support
- ;; arbitray dimensions.
+ ;; arbitrary dimensions.
  (eye-dimensions eye)
  cam (Camera. cam-width cam-height)
  rot (.getWorldRotation eye)]
@@ -1345,7 +1325,7 @@

  =CORTEX='s hearing is unique because it does not have any
  limitations compared to other simulation environments. As far as I
- know, there is no other system that supports multiple listerers,
+ know, there is no other system that supports multiple listeners,
  and the sound demo at the end of this section is the first time
  it's been done in a video game environment.

@@ -1384,7 +1364,7 @@
  Extending =OpenAL= to support multiple listeners requires 500
  lines of =C= code and is too hairy to mention here. Instead, I
  will show a small amount of extension code and go over the high
- level stragety. Full source is of course available with the
+ level strategy. Full source is of course available with the
  =CORTEX= distribution if you're interested.

  =OpenAL= goes to great lengths to support many different systems,
@@ -1406,7 +1386,7 @@
  sound it receives to a file, if everything has been set up
  correctly when configuring =OpenAL=.

- Actual mixing (doppler shift and distance.environment-based
+ Actual mixing (Doppler shift and distance.environment-based
  attenuation) of the sound data happens in the Devices, and they
  are the only point in the sound rendering process where this data
  is available.
@@ -1623,10 +1603,10 @@
  #+END_SRC
  #+end_listing

- #+caption: First ever simulation of multiple listerners in =CORTEX=.
+ #+caption: First ever simulation of multiple listeners in =CORTEX=.
  #+caption: Each cube is a creature which processes sound data with
  #+caption: the =process= function from listing \ref{sound-test}.
- #+caption: the ball is constantally emiting a pure tone of
+ #+caption: the ball is constantly emitting a pure tone of
  #+caption: constant volume. As it approaches the cubes, they each
  #+caption: change color in response to the sound.
  #+name: sound-cubes.
@@ -1756,7 +1736,7 @@
  fit the height and width of the UV image).

  #+caption: Programs to extract triangles from a geometry and get
- #+caption: their verticies in both world and UV-coordinates.
+ #+caption: their vertices in both world and UV-coordinates.
  #+name: get-triangles
  #+begin_listing clojure
  #+BEGIN_SRC clojure
@@ -1851,7 +1831,7 @@
  jMonkeyEngine's =Matrix4f= objects, which can describe any affine
  transformation.

- #+caption: Program to interpert triangles as affine transforms.
+ #+caption: Program to interpret triangles as affine transforms.
  #+name: triangle-affine
  #+begin_listing clojure
  #+BEGIN_SRC clojure
@@ -1894,7 +1874,7 @@
  =inside-triangle?= determines whether a point is inside a triangle
  in 2D pixel-space.

- #+caption: Program to efficiently determine point includion
+ #+caption: Program to efficiently determine point inclusion
  #+caption: in a triangle.
  #+name: in-triangle
  #+begin_listing clojure
@@ -2089,7 +2069,7 @@

  Armed with the =touch!= function, =CORTEX= becomes capable of
  giving creatures a sense of touch. A simple test is to create a
- cube that is outfitted with a uniform distrubition of touch
+ cube that is outfitted with a uniform distribution of touch
  sensors. It can feel the ground and any balls that it touches.

  #+caption: =CORTEX= interface for creating touch in a simulated
@@ -2111,7 +2091,7 @@
  #+end_listing

  The tactile-sensor-profile image for the touch cube is a simple
- cross with a unifom distribution of touch sensors:
+ cross with a uniform distribution of touch sensors:

  #+caption: The touch profile for the touch-cube. Each pure white
  #+caption: pixel defines a touch sensitive feeler.
@@ -2119,7 +2099,7 @@
  #+ATTR_LaTeX: :width 7cm
  [[./images/touch-profile.png]]

- #+caption: The touch cube reacts to canonballs. The black, red,
+ #+caption: The touch cube reacts to cannonballs. The black, red,
  #+caption: and white cross on the right is a visual display of
  #+caption: the creature's touch. White means that it is feeling
  #+caption: something strongly, black is not feeling anything,
@@ -2171,7 +2151,7 @@
  like a normal dot-product angle is.

  The purpose of these functions is to build a system of angle
- measurement that is biologically plausable.
+ measurement that is biologically plausible.

  #+caption: Program to measure angles along a vector
  #+name: helpers
@@ -2201,7 +2181,7 @@
  connects. The only tricky part here is making the angles relative
  to the joint's initial ``straightness''.

- #+caption: Program to return biologially reasonable proprioceptive
+ #+caption: Program to return biologically reasonable proprioceptive
  #+caption: data for each joint.
1.675 #+name: proprioception 1.676 #+begin_listing clojure 1.677 @@ -2359,7 +2339,7 @@ 1.678 1.679 *** Creating muscles 1.680 1.681 - #+caption: This is the core movement functoion in =CORTEX=, which 1.682 + #+caption: This is the core movement function in =CORTEX=, which 1.683 #+caption: implements muscles that report on their activation. 1.684 #+name: muscle-kernel 1.685 #+begin_listing clojure 1.686 @@ -2417,7 +2397,7 @@ 1.687 intricate marionette hand with several strings for each finger: 1.688 1.689 #+caption: View of the hand model with all sense nodes. You can see 1.690 - #+caption: the joint, muscle, ear, and eye nodess here. 1.691 + #+caption: the joint, muscle, ear, and eye nodes here. 1.692 #+name: hand-nodes-1 1.693 #+ATTR_LaTeX: :width 11cm 1.694 [[./images/hand-with-all-senses2.png]] 1.695 @@ -2430,7 +2410,7 @@ 1.696 With the hand fully rigged with senses, I can run it though a test 1.697 that will test everything. 1.698 1.699 - #+caption: A full test of the hand with all senses. Note expecially 1.700 + #+caption: A full test of the hand with all senses. Note especially 1.701 #+caption: the interactions the hand has with itself: it feels 1.702 #+caption: its own palm and fingers, and when it curls its fingers, 1.703 #+caption: it sees them with its eye (which is located in the center 1.704 @@ -2440,7 +2420,7 @@ 1.705 #+ATTR_LaTeX: :width 16cm 1.706 [[./images/integration.png]] 1.707 1.708 -** =CORTEX= enables many possiblities for further research 1.709 +** =CORTEX= enables many possibilities for further research 1.710 1.711 Often times, the hardest part of building a system involving 1.712 creatures is dealing with physics and graphics. 
=CORTEX= removes 1.713 @@ -2561,14 +2541,14 @@ 1.714 #+end_src 1.715 #+end_listing 1.716 1.717 -** Embodiment factors action recognition into managable parts 1.718 +** Embodiment factors action recognition into manageable parts 1.719 1.720 Using empathy, I divide the problem of action recognition into a 1.721 recognition process expressed in the language of a full compliment 1.722 - of senses, and an imaganitive process that generates full sensory 1.723 + of senses, and an imaginative process that generates full sensory 1.724 data from partial sensory data. Splitting the action recognition 1.725 problem in this manner greatly reduces the total amount of work to 1.726 - recognize actions: The imaganitive process is mostly just matching 1.727 + recognize actions: The imaginative process is mostly just matching 1.728 previous experience, and the recognition process gets to use all 1.729 the senses to directly describe any action. 1.730 1.731 @@ -2586,8 +2566,8 @@ 1.732 experience, observe however much of it they desire, and decide 1.733 whether the worm is doing the action they describe. =curled?= 1.734 relies on proprioception, =resting?= relies on touch, =wiggling?= 1.735 - relies on a fourier analysis of muscle contraction, and 1.736 - =grand-circle?= relies on touch and reuses =curled?= as a gaurd. 1.737 + relies on a Fourier analysis of muscle contraction, and 1.738 + =grand-circle?= relies on touch and reuses =curled?= as a guard. 1.739 1.740 #+caption: Program for detecting whether the worm is curled. This is the 1.741 #+caption: simplest action predicate, because it only uses the last frame 1.742 @@ -2634,7 +2614,7 @@ 1.743 #+caption: uses a summary of the tactile information from the underbelly 1.744 #+caption: of the worm, and is only true if every segment is touching the 1.745 #+caption: floor. Note that this function contains no references to 1.746 - #+caption: proprioction at all. 1.747 + #+caption: proprioception at all. 
1.748 #+name: resting 1.749 #+begin_listing clojure 1.750 #+begin_src clojure 1.751 @@ -2675,9 +2655,9 @@ 1.752 1.753 1.754 #+caption: Program for detecting whether the worm has been wiggling for 1.755 - #+caption: the last few frames. It uses a fourier analysis of the muscle 1.756 + #+caption: the last few frames. It uses a Fourier analysis of the muscle 1.757 #+caption: contractions of the worm's tail to determine wiggling. This is 1.758 - #+caption: signigicant because there is no particular frame that clearly 1.759 + #+caption: significant because there is no particular frame that clearly 1.760 #+caption: indicates that the worm is wiggling --- only when multiple frames 1.761 #+caption: are analyzed together is the wiggling revealed. Defining 1.762 #+caption: wiggling this way also gives the worm an opportunity to learn 1.763 @@ -2738,7 +2718,7 @@ 1.764 #+end_listing 1.765 1.766 #+caption: Using =debug-experience=, the body-centered predicates 1.767 - #+caption: work together to classify the behaviour of the worm. 1.768 + #+caption: work together to classify the behavior of the worm. 1.769 #+caption: the predicates are operating with access to the worm's 1.770 #+caption: full sensory data. 1.771 #+name: basic-worm-view 1.772 @@ -2749,10 +2729,10 @@ 1.773 empathic recognition system. There is power in the simplicity of 1.774 the action predicates. They describe their actions without getting 1.775 confused in visual details of the worm. Each one is frame 1.776 - independent, but more than that, they are each indepent of 1.777 + independent, but more than that, they are each independent of 1.778 irrelevant visual details of the worm and the environment. They 1.779 will work regardless of whether the worm is a different color or 1.780 - hevaily textured, or if the environment has strange lighting. 1.781 + heavily textured, or if the environment has strange lighting. 
1.782 1.783 The trick now is to make the action predicates work even when the 1.784 sensory data on which they depend is absent. If I can do that, then 1.785 @@ -2776,7 +2756,7 @@ 1.786 1.787 As the worm moves around during free play and its experience vector 1.788 grows larger, the vector begins to define a subspace which is all 1.789 - the sensations the worm can practicaly experience during normal 1.790 + the sensations the worm can practically experience during normal 1.791 operation. I call this subspace \Phi-space, short for 1.792 physical-space. The experience vector defines a path through 1.793 \Phi-space. This path has interesting properties that all derive 1.794 @@ -2801,7 +2781,7 @@ 1.795 body along a specific path through \Phi-space. 1.796 1.797 There is a simple way of taking \Phi-space and the total ordering 1.798 - provided by an experience vector and reliably infering the rest of 1.799 + provided by an experience vector and reliably inferring the rest of 1.800 the senses. 1.801 1.802 ** Empathy is the process of tracing though \Phi-space 1.803 @@ -2817,8 +2797,8 @@ 1.804 matching experience records for each input, using the tiered 1.805 proprioceptive bins. 1.806 1.807 - Finally, to infer sensory data, select the longest consective chain 1.808 - of experiences. Conecutive experience means that the experiences 1.809 + Finally, to infer sensory data, select the longest consecutive chain 1.810 + of experiences. Consecutive experience means that the experiences 1.811 appear next to each other in the experience vector. 1.812 1.813 This algorithm has three advantages: 1.814 @@ -2833,8 +2813,8 @@ 1.815 1.816 2. It protects from wrong interpretations of transient ambiguous 1.817 proprioceptive data. 
For example, if the worm is flat for just 1.818 - an instant, this flattness will not be interpreted as implying 1.819 - that the worm has its muscles relaxed, since the flattness is 1.820 + an instant, this flatness will not be interpreted as implying 1.821 + that the worm has its muscles relaxed, since the flatness is 1.822 part of a longer chain which includes a distinct pattern of 1.823 muscle activation. Markov chains or other memoryless statistical 1.824 models that operate on individual frames may very well make this 1.825 @@ -2855,7 +2835,7 @@ 1.826 1.827 (defn gen-phi-scan 1.828 "Nearest-neighbors with binning. Only returns a result if 1.829 - the propriceptive data is within 10% of a previously recorded 1.830 + the proprioceptive data is within 10% of a previously recorded 1.831 result in all dimensions." 1.832 [phi-space] 1.833 (let [bin-keys (map bin [3 2 1]) 1.834 @@ -2882,13 +2862,13 @@ 1.835 from previous experience. It prefers longer chains of previous 1.836 experience to shorter ones. For example, during training the worm 1.837 might rest on the ground for one second before it performs its 1.838 - excercises. If during recognition the worm rests on the ground for 1.839 - five seconds, =longest-thread= will accomodate this five second 1.840 + exercises. If during recognition the worm rests on the ground for 1.841 + five seconds, =longest-thread= will accommodate this five second 1.842 rest period by looping the one second rest chain five times. 1.843 1.844 - =longest-thread= takes time proportinal to the average number of 1.845 + =longest-thread= takes time proportional to the average number of 1.846 entries in a proprioceptive bin, because for each element in the 1.847 - starting bin it performes a series of set lookups in the preceeding 1.848 + starting bin it performs a series of set lookups in the preceding 1.849 bins. If the total history is limited, then this is only a constant 1.850 multiple times the number of entries in the starting bin. 
This 1.851 analysis also applies even if the action requires multiple longest 1.852 @@ -2966,7 +2946,7 @@ 1.853 experiences from the worm that includes the actions I want to 1.854 recognize. The =generate-phi-space= program (listing 1.855 \ref{generate-phi-space} runs the worm through a series of 1.856 - exercices and gatheres those experiences into a vector. The 1.857 + exercises and gatherers those experiences into a vector. The 1.858 =do-all-the-things= program is a routine expressed in a simple 1.859 muscle contraction script language for automated worm control. It 1.860 causes the worm to rest, curl, and wiggle over about 700 frames 1.861 @@ -2975,7 +2955,7 @@ 1.862 #+caption: Program to gather the worm's experiences into a vector for 1.863 #+caption: further processing. The =motor-control-program= line uses 1.864 #+caption: a motor control script that causes the worm to execute a series 1.865 - #+caption: of ``exercices'' that include all the action predicates. 1.866 + #+caption: of ``exercises'' that include all the action predicates. 1.867 #+name: generate-phi-space 1.868 #+begin_listing clojure 1.869 #+begin_src clojure 1.870 @@ -3039,14 +3019,14 @@ 1.871 1.872 #+caption: From only proprioceptive data, =EMPATH= was able to infer 1.873 #+caption: the complete sensory experience and classify four poses 1.874 - #+caption: (The last panel shows a composite image of \emph{wriggling}, 1.875 + #+caption: (The last panel shows a composite image of /wiggling/, 1.876 #+caption: a dynamic pose.) 1.877 #+name: empathy-debug-image 1.878 #+ATTR_LaTeX: :width 10cm :placement [H] 1.879 [[./images/empathy-1.png]] 1.880 1.881 One way to measure the performance of =EMPATH= is to compare the 1.882 - sutiability of the imagined sense experience to trigger the same 1.883 + suitability of the imagined sense experience to trigger the same 1.884 action predicates as the real sensory experience. 
1.885 1.886 #+caption: Determine how closely empathy approximates actual 1.887 @@ -3086,7 +3066,7 @@ 1.888 1.889 Running =test-empathy-accuracy= using the very short exercise 1.890 program defined in listing \ref{generate-phi-space}, and then doing 1.891 - a similar pattern of activity manually yeilds an accuracy of around 1.892 + a similar pattern of activity manually yields an accuracy of around 1.893 73%. This is based on very limited worm experience. By training the 1.894 worm for longer, the accuracy dramatically improves. 1.895 1.896 @@ -3113,21 +3093,21 @@ 1.897 =test-empathy-accuracy=. The majority of errors are near the 1.898 boundaries of transitioning from one type of action to another. 1.899 During these transitions the exact label for the action is more open 1.900 - to interpretation, and dissaggrement between empathy and experience 1.901 + to interpretation, and disagreement between empathy and experience 1.902 is more excusable. 1.903 1.904 ** Digression: Learn touch sensor layout through free play 1.905 1.906 In the previous section I showed how to compute actions in terms of 1.907 - body-centered predicates which relied averate touch activation of 1.908 - pre-defined regions of the worm's skin. What if, instead of 1.909 - recieving touch pre-grouped into the six faces of each worm 1.910 - segment, the true topology of the worm's skin was unknown? This is 1.911 - more similiar to how a nerve fiber bundle might be arranged. While 1.912 - two fibers that are close in a nerve bundle /might/ correspond to 1.913 - two touch sensors that are close together on the skin, the process 1.914 - of taking a complicated surface and forcing it into essentially a 1.915 - circle requires some cuts and rerragenments. 1.916 + body-centered predicates which relied on the average touch 1.917 + activation of pre-defined regions of the worm's skin. 
What if, 1.918 + instead of receiving touch pre-grouped into the six faces of each 1.919 + worm segment, the true topology of the worm's skin was unknown? 1.920 + This is more similar to how a nerve fiber bundle might be 1.921 + arranged. While two fibers that are close in a nerve bundle /might/ 1.922 + correspond to two touch sensors that are close together on the 1.923 + skin, the process of taking a complicated surface and forcing it 1.924 + into essentially a circle requires some cuts and rearrangements. 1.925 1.926 In this section I show how to automatically learn the skin-topology of 1.927 a worm segment by free exploration. As the worm rolls around on the 1.928 @@ -3151,15 +3131,15 @@ 1.929 #+end_listing 1.930 1.931 After collecting these important regions, there will many nearly 1.932 - similiar touch regions. While for some purposes the subtle 1.933 + similar touch regions. While for some purposes the subtle 1.934 differences between these regions will be important, for my 1.935 - purposes I colapse them into mostly non-overlapping sets using 1.936 - =remove-similiar= in listing \ref{remove-similiar} 1.937 - 1.938 - #+caption: Program to take a lits of set of points and ``collapse them'' 1.939 - #+caption: so that the remaining sets in the list are siginificantly 1.940 + purposes I collapse them into mostly non-overlapping sets using 1.941 + =remove-similar= in listing \ref{remove-similar} 1.942 + 1.943 + #+caption: Program to take a list of sets of points and ``collapse them'' 1.944 + #+caption: so that the remaining sets in the list are significantly 1.945 #+caption: different from each other. Prefer smaller sets to larger ones. 1.946 - #+name: remove-similiar 1.947 + #+name: remove-similar 1.948 #+begin_listing clojure 1.949 #+begin_src clojure 1.950 (defn remove-similar 1.951 @@ -3181,7 +3161,7 @@ 1.952 Actually running this simulation is easy given =CORTEX='s facilities. 1.953 1.954 #+caption: Collect experiences while the worm moves around. 
Filter the touch 1.955 - #+caption: sensations by stable ones, collapse similiar ones together, 1.956 + #+caption: sensations by stable ones, collapse similar ones together, 1.957 #+caption: and report the regions learned. 1.958 #+name: learn-touch 1.959 #+begin_listing clojure 1.960 @@ -3216,7 +3196,7 @@ 1.961 #+end_src 1.962 #+end_listing 1.963 1.964 - The only thing remining to define is the particular motion the worm 1.965 + The only thing remaining to define is the particular motion the worm 1.966 must take. I accomplish this with a simple motor control program. 1.967 1.968 #+caption: Motor control program for making the worm roll on the ground. 1.969 @@ -3275,7 +3255,7 @@ 1.970 the worm's physiology and the worm's environment to correctly 1.971 deduce that the worm has six sides. Note that =learn-touch-regions= 1.972 would work just as well even if the worm's touch sense data were 1.973 - completely scrambled. The cross shape is just for convienence. This 1.974 + completely scrambled. The cross shape is just for convenience. This 1.975 example justifies the use of pre-defined touch regions in =EMPATH=. 1.976 1.977 * Contributions 1.978 @@ -3283,19 +3263,18 @@ 1.979 In this thesis you have seen the =CORTEX= system, a complete 1.980 environment for creating simulated creatures. You have seen how to 1.981 implement five senses: touch, proprioception, hearing, vision, and 1.982 - muscle tension. You have seen how to create new creatues using 1.983 + muscle tension. You have seen how to create new creatures using 1.984 blender, a 3D modeling tool. I hope that =CORTEX= will be useful in 1.985 further research projects. To this end I have included the full 1.986 source to =CORTEX= along with a large suite of tests and examples. I 1.987 - have also created a user guide for =CORTEX= which is inculded in an 1.988 - appendix to this thesis \ref{}. 
1.989 -# dxh: todo reference appendix 1.990 + have also created a user guide for =CORTEX= which is included in an 1.991 + appendix to this thesis. 1.992 1.993 You have also seen how I used =CORTEX= as a platform to attach the 1.994 /action recognition/ problem, which is the problem of recognizing 1.995 actions in video. You saw a simple system called =EMPATH= which 1.996 - ientifies actions by first describing actions in a body-centerd, 1.997 - rich sense language, then infering a full range of sensory 1.998 + identifies actions by first describing actions in a body-centered, 1.999 + rich sense language, then inferring a full range of sensory 1.1000 experience from limited data using previous experience gained from 1.1001 free play. 1.1002 1.1003 @@ -3305,23 +3284,22 @@ 1.1004 1.1005 In conclusion, the main contributions of this thesis are: 1.1006 1.1007 - - =CORTEX=, a system for creating simulated creatures with rich 1.1008 - senses. 1.1009 - - =EMPATH=, a program for recognizing actions by imagining sensory 1.1010 - experience. 1.1011 - 1.1012 -# An anatomical joke: 1.1013 -# - Training 1.1014 -# - Skeletal imitation 1.1015 -# - Sensory fleshing-out 1.1016 -# - Classification 1.1017 + - =CORTEX=, a comprehensive platform for embodied AI experiments. 1.1018 + =CORTEX= supports many features lacking in other systems, such 1.1019 + proper simulation of hearing. It is easy to create new =CORTEX= 1.1020 + creatures using Blender, a free 3D modeling program. 1.1021 + 1.1022 + - =EMPATH=, which uses =CORTEX= to identify the actions of a 1.1023 + worm-like creature using a computational model of empathy. 
1.1024 + 1.1025 #+BEGIN_LaTeX 1.1026 \appendix 1.1027 #+END_LaTeX 1.1028 + 1.1029 * Appendix: =CORTEX= User Guide 1.1030 1.1031 Those who write a thesis should endeavor to make their code not only 1.1032 - accessable, but actually useable, as a way to pay back the community 1.1033 + accessible, but actually usable, as a way to pay back the community 1.1034 that made the thesis possible in the first place. This thesis would 1.1035 not be possible without Free Software such as jMonkeyEngine3, 1.1036 Blender, clojure, emacs, ffmpeg, and many other tools. That is why I 1.1037 @@ -3349,7 +3327,7 @@ 1.1038 1.1039 Creatures are created using /Blender/, a free 3D modeling program. 1.1040 You will need Blender version 2.6 when using the =CORTEX= included 1.1041 - in this thesis. You create a =CORTEX= creature in a similiar manner 1.1042 + in this thesis. You create a =CORTEX= creature in a similar manner 1.1043 to modeling anything in Blender, except that you also create 1.1044 several trees of empty nodes which define the creature's senses. 1.1045 1.1046 @@ -3417,7 +3395,7 @@ 1.1047 to set the empty node's display mode to ``Arrows'' so that you can 1.1048 clearly see the direction of the axes. 1.1049 1.1050 - Each retina file should contain white pixels whever you want to be 1.1051 + Each retina file should contain white pixels wherever you want to be 1.1052 sensitive to your chosen color. If you want the entire field of 1.1053 view, specify :all of 0xFFFFFF and a retinal map that is entirely 1.1054 white. 1.1055 @@ -3453,7 +3431,7 @@ 1.1056 #+END_EXAMPLE 1.1057 1.1058 You may also include an optional ``scale'' metadata number to 1.1059 - specifiy the length of the touch feelers. The default is $0.1$, 1.1060 + specify the length of the touch feelers. The default is $0.1$, 1.1061 and this is generally sufficient. 1.1062 1.1063 The touch UV should contain white pixels for each touch sensor. 
1.1064 @@ -3475,7 +3453,7 @@ 1.1065 #+ATTR_LaTeX: :width 9cm :placement [H] 1.1066 [[./images/finger-2.png]] 1.1067 1.1068 -*** Propriocepotion 1.1069 +*** Proprioception 1.1070 1.1071 Proprioception is tied to each joint node -- nothing special must 1.1072 be done in a blender model to enable proprioception other than 1.1073 @@ -3582,10 +3560,10 @@ 1.1074 representing that described in a blender file. 1.1075 1.1076 - =(light-up-everything world)= :: distribute a standard compliment 1.1077 - of lights throught the simulation. Should be adequate for most 1.1078 + of lights throughout the simulation. Should be adequate for most 1.1079 purposes. 1.1080 1.1081 - - =(node-seq node)= :: return a recursuve list of the node's 1.1082 + - =(node-seq node)= :: return a recursive list of the node's 1.1083 children. 1.1084 1.1085 - =(nodify name children)= :: construct a node given a node-name and 1.1086 @@ -3638,7 +3616,7 @@ 1.1087 - =(proprioception! creature)= :: give the creature the sense of 1.1088 proprioception. Returns a list of functions, one for each 1.1089 joint, that when called during a running simulation will 1.1090 - report the =[headnig, pitch, roll]= of the joint. 1.1091 + report the =[heading, pitch, roll]= of the joint. 1.1092 1.1093 - =(movement! creature)= :: give the creature the power of movement. 1.1094 Creates a list of functions, one for each muscle, that when 1.1095 @@ -3677,7 +3655,7 @@ 1.1096 function will import all jMonkeyEngine3 classes for immediate 1.1097 use. 1.1098 1.1099 - - =(display-dialated-time world timer)= :: Shows the time as it is 1.1100 + - =(display-dilated-time world timer)= :: Shows the time as it is 1.1101 flowing in the simulation on a HUD display. 1.1102 1.1103
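The hunk touching the "Empathy is the process of tracing though \Phi-space" section describes the final inference step as selecting the longest consecutive chain of experiences, where consecutive means adjacent in the experience vector. As a reading aid, here is a minimal sketch of that step. It is not the thesis's =longest-thread= listing, which this diff does not show: the names `chain-from` and `longest-chain` are hypothetical, and the input is assumed to be a sequence of sets of experience-vector indices, ordered from the most recent frame to the oldest.

```clojure
(defn chain-from
  "Hypothetical helper. Starting from `start` (an index that matches the
   most recent frame), walk backwards through `older-sets` for as long as
   each older set contains the immediately preceding index.
   Returns the chain ordered oldest-first."
  [start older-sets]
  (loop [chain (list start)
         sets  older-sets]
    (let [prev (dec (first chain))]
      (if (and (seq sets) (contains? (first sets) prev))
        (recur (cons prev chain) (rest sets))
        chain))))

(defn longest-chain
  "Hypothetical sketch. `index-sets` is ordered most recent first; each
   element is the set of experience-vector indices that match that frame.
   Returns the longest run of consecutive indices that ends at the most
   recent frame, or nil when the most recent frame has no matches."
  [index-sets]
  (let [[newest & older] index-sets]
    (when (seq newest)
      (apply max-key count
             (map #(chain-from % older) newest)))))

;; The most recent frame matches experiences 7 and 20. Only the chain
;; ending at 7 can be extended back through two earlier frames.
(longest-chain [#{7 20} #{6 19} #{5} #{30}])
;; => (5 6 7)
```

Preferring the longest chain is what gives the behavior the diffed text describes: a momentary ambiguous pose is interpreted through the longer run of experience it belongs to rather than in isolation. This sketch only finds a single chain ending at the newest frame. Covering a whole recording, including the looping of a short rest chain mentioned in the text, would require repeating the search on the frames that remain.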