diff thesis/cortex.org @ 547:5d89879fc894
couple hours worth of edits.
author   | Robert McIntyre <rlm@mit.edu>
date     | Mon, 28 Apr 2014 15:10:59 -0400
parents  | b2c66ea58c39
children | 0b891e0dd809
--- a/thesis/cortex.org  Mon Apr 28 13:14:52 2014 -0400
+++ b/thesis/cortex.org  Mon Apr 28 15:10:59 2014 -0400
@@ -43,15 +43,15 @@

  * Empathy \& Embodiment: problem solving strategies

- By the end of this thesis, you will have seen a novel approach to
- interpreting video using embodiment and empathy. You will also see
- one way to efficiently implement physical empathy for embodied
- creatures. Finally, you will become familiar with =CORTEX=, a system
- for designing and simulating creatures with rich senses, which I
- have designed as a library that you can use in your own research.
- Note that I /do not/ process video directly --- I start with
- knowledge of the positions of a creature's body parts and works from
- there.
+ By the end of this thesis, you will have a novel approach to
+ representing and recognizing physical actions using embodiment and
+ empathy. You will also see one way to efficiently implement physical
+ empathy for embodied creatures. Finally, you will become familiar
+ with =CORTEX=, a system for designing and simulating creatures with
+ rich senses, which I have designed as a library that you can use in
+ your own research. Note that I /do not/ process video directly --- I
+ start with knowledge of the positions of a creature's body parts and
+ work from there.

  This is the core vision of my thesis: That one of the important ways
  in which we understand others is by imagining ourselves in their
@@ -81,11 +81,11 @@
  \cite{volume-action-recognition}), but the 3D world is so variable
  that it is hard to describe the world in terms of possible images.

- In fact, the contents of scene may have much less to do with pixel
- probabilities than with recognizing various affordances: things you
- can move, objects you can grasp, spaces that can be filled . For
- example, what processes might enable you to see the chair in figure
- \ref{hidden-chair}?
+ In fact, the contents of a scene may have much less to do with
+ pixel probabilities than with recognizing various affordances:
+ things you can move, objects you can grasp, spaces that can be
+ filled. For example, what processes might enable you to see the
+ chair in figure \ref{hidden-chair}?

  #+caption: The chair in this image is quite obvious to humans, but
  #+caption: it can't be found by any modern computer vision program.
@@ -106,21 +106,21 @@
  Each of these examples tells us something about what might be going
  on in our minds as we easily solve these recognition problems:

- The hidden chair shows us that we are strongly triggered by cues
- relating to the position of human bodies, and that we can determine
- the overall physical configuration of a human body even if much of
- that body is occluded.
-
- The picture of the girl pushing against the wall tells us that we
- have common sense knowledge about the kinetics of our own bodies.
- We know well how our muscles would have to work to maintain us in
- most positions, and we can easily project this self-knowledge to
- imagined positions triggered by images of the human body.
-
- The cat tells us that imagination of some kind plays an important
- role in understanding actions. The question is: Can we be more
- precise about what sort of imagination is required to understand
- these actions?
+ - The hidden chair shows us that we are strongly triggered by cues
+ relating to the position of human bodies, and that we can
+ determine the overall physical configuration of a human body even
+ if much of that body is occluded.
+
+ - The picture of the girl pushing against the wall tells us that we
+ have common sense knowledge about the kinetics of our own bodies.
+ We know well how our muscles would have to work to maintain us in
+ most positions, and we can easily project this self-knowledge to
+ imagined positions triggered by images of the human body.
+
+ - The cat tells us that imagination of some kind plays an important
+ role in understanding actions. The question is: Can we be more
+ precise about what sort of imagination is required to understand
+ these actions?

  ** A step forward: the sensorimotor-centered approach

@@ -135,12 +135,12 @@
  the cool water hitting their tongue, and feel the water entering
  their body, and are able to recognize that /feeling/ as drinking.
  So, the label of the action is not really in the pixels of the
- image, but is found clearly in a simulation inspired by those
- pixels. An imaginative system, having been trained on drinking and
- non-drinking examples and learning that the most important
- component of drinking is the feeling of water sliding down one's
- throat, would analyze a video of a cat drinking in the following
- manner:
+ image, but is found clearly in a simulation / recollection inspired
+ by those pixels. An imaginative system, having been trained on
+ drinking and non-drinking examples and learning that the most
+ important component of drinking is the feeling of water sliding
+ down one's throat, would analyze a video of a cat drinking in the
+ following manner:

  1. Create a physical model of the video by putting a ``fuzzy''
     model of its own body in place of the cat. Possibly also create
@@ -193,7 +193,7 @@
  the particulars of any visual representation of the actions. If you
  teach the system what ``running'' is, and you have a good enough
  aligner, the system will from then on be able to recognize running
- from any point of view, even strange points of view like above or
+ from any point of view -- even strange points of view like above or
  underneath the runner. This is in contrast to action recognition
  schemes that try to identify actions using a non-embodied approach.
  If these systems learn about running as viewed from the side, they
@@ -201,12 +201,13 @@
  viewpoint.

  Another powerful advantage is that using the language of multiple
- body-centered rich senses to describe body-centered actions offers a
- massive boost in descriptive capability. Consider how difficult it
- would be to compose a set of HOG filters to describe the action of
- a simple worm-creature ``curling'' so that its head touches its
- tail, and then behold the simplicity of describing thus action in a
- language designed for the task (listing \ref{grand-circle-intro}):
+ body-centered rich senses to describe body-centered actions offers
+ a massive boost in descriptive capability. Consider how difficult
+ it would be to compose a set of HOG (Histogram of Oriented
+ Gradients) filters to describe the action of a simple worm-creature
+ ``curling'' so that its head touches its tail, and then behold the
+ simplicity of describing this action in a language designed for the
+ task (listing \ref{grand-circle-intro}):

  #+caption: Body-centered actions are best expressed in a body-centered
  #+caption: language. This code detects when the worm has curled into a
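The listing itself (=grand-circle-intro=) is not shown in this hunk. As a rough, self-contained illustration of what such a body-centered predicate can look like in Clojure, a minimal sketch follows; the frame representation (maps carrying summarized =:head-touch= and =:tail-touch= values) is an assumption made only for this example and is not the thesis's actual listing.

#+BEGIN_SRC clojure
;; Sketch only -- NOT the thesis's listing grand-circle-intro.
;; Assume each frame of experience is a map with touch strengths in [0,1]
;; already summarized for the worm's head tip and tail tip.
(defn grand-circle?
  "True when the most recent frame shows the worm's head and tail
   pressed against each other, i.e. the worm has curled into a circle."
  [experiences]
  (let [{:keys [head-touch tail-touch]} (peek experiences)]
    (and head-touch tail-touch
         (< 0.2 head-touch)
         (< 0.2 tail-touch))))

;; Example: the last frame reports firm contact at both tips.
(grand-circle? [{:head-touch 0.0 :tail-touch 0.0}
                {:head-touch 0.7 :tail-touch 0.9}])   ; => true
#+END_SRC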
@@ -272,10 +273,10 @@
  together to form a coherent and complete sensory portrait of
  the scene.

- - Recognition :: With the scene described in terms of
- remembered first person sensory events, the creature can now
- run its action-identified programs (such as the one in listing
- \ref{grand-circle-intro} on this synthesized sensory data,
+ - Recognition :: With the scene described in terms of remembered
+ first person sensory events, the creature can now run its
+ action-definition programs (such as the one in listing
+ \ref{grand-circle-intro}) on this synthesized sensory data,
  just as it would if it were actually experiencing the scene
  first-hand. If previous experience has been accurately
  retrieved, and if it is analogous enough to the scene, then

@@ -327,20 +328,21 @@
  number of creatures. I intend it to be useful as a library for many
  more projects than just this thesis. =CORTEX= was necessary to meet
  a need among AI researchers at CSAIL and beyond, which is that
- people often will invent neat ideas that are best expressed in the
- language of creatures and senses, but in order to explore those
+ people often will invent wonderful ideas that are best expressed in
+ the language of creatures and senses, but in order to explore those
  ideas they must first build a platform in which they can create
  simulated creatures with rich senses! There are many ideas that
- would be simple to execute (such as =EMPATH= or
- \cite{larson-symbols}), but attached to them is the multi-month
- effort to make a good creature simulator. Often, that initial
- investment of time proves to be too much, and the project must make
- do with a lesser environment.
+ would be simple to execute (such as =EMPATH= or Larson's
+ self-organizing maps (\cite{larson-symbols})), but attached to them
+ is the multi-month effort to make a good creature simulator. Often,
+ that initial investment of time proves to be too much, and the
+ project must make do with a lesser environment or be abandoned
+ entirely.

  =CORTEX= is well suited as an environment for embodied AI research
  for three reasons:

- - You can create new creatures using Blender (\cite{blender}), a
+ - You can design new creatures using Blender (\cite{blender}), a
  popular 3D modeling program. Each sense can be specified using
  special blender nodes with biologically inspired parameters. You
  need not write any code to create a creature, and can use a wide
@@ -352,9 +354,8 @@
  senses like touch and vision involve multiple sensory elements
  embedded in a 2D surface. You have complete control over the
  distribution of these sensor elements through the use of simple
- png image files. In particular, =CORTEX= implements more
- comprehensive hearing than any other creature simulation system
- available.
+ png image files. =CORTEX= implements more comprehensive hearing
+ than any other creature simulation system available.

  - =CORTEX= supports any number of creatures and any number of
  senses. Time in =CORTEX= dilates so that the simulated creatures

@@ -425,7 +426,7 @@

  Throughout this project, I intended for =CORTEX= to be flexible and
  extensible enough to be useful for other researchers who want to
- test out ideas of their own. To this end, wherever I have had to make
+ test ideas of their own. To this end, wherever I have had to make
  architectural choices about =CORTEX=, I have chosen to give as much
  freedom to the user as possible, so that =CORTEX= may be used for
  things I have not foreseen.

@@ -437,25 +438,26 @@
  reflection of its complexity. It may be that there is a significant
  qualitative difference between dealing with senses in the real
  world and dealing with pale facsimiles of them in a simulation
- \cite{brooks-representation}. What are the advantages and
+ (\cite{brooks-representation}). What are the advantages and
  disadvantages of a simulation vs. reality?

  *** Simulation

  The advantages of virtual reality are that when everything is a
  simulation, experiments in that simulation are absolutely
- reproducible. It's also easier to change the character and world
- to explore new situations and different sensory combinations.
+ reproducible. It's also easier to change the creature and
+ environment to explore new situations and different sensory
+ combinations.

  If the world is to be simulated on a computer, then not only do
- you have to worry about whether the character's senses are rich
+ you have to worry about whether the creature's senses are rich
  enough to learn from the world, but whether the world itself is
  rendered with enough detail and realism to give enough working
- material to the character's senses. To name just a few
+ material to the creature's senses. To name just a few
  difficulties facing modern physics simulators: destructibility of
  the environment, simulation of water/other fluids, large areas,
  nonrigid bodies, lots of objects, smoke. I don't know of any
- computer simulation that would allow a character to take a rock
+ computer simulation that would allow a creature to take a rock
  and grind it into fine dust, then use that dust to make a clay
  sculpture, at least not without spending years calculating the
  interactions of every single small grain of dust. Maybe a
@@ -471,14 +473,14 @@
  the complexity of implementing the senses. Instead of just
  grabbing the current rendered frame for processing, you have to
  use an actual camera with real lenses and interact with photons to
- get an image. It is much harder to change the character, which is
+ get an image. It is much harder to change the creature, which is
  now partly a physical robot of some sort, since doing so involves
  changing things around in the real world instead of modifying
  lines of code. While the real world is very rich and definitely
- provides enough stimulation for intelligence to develop as
- evidenced by our own existence, it is also uncontrollable in the
+ provides enough stimulation for intelligence to develop (as
+ evidenced by our own existence), it is also uncontrollable in the
  sense that a particular situation cannot be recreated perfectly or
- saved for later use. It is harder to conduct science because it is
+ saved for later use. It is harder to conduct Science because it is
  harder to repeat an experiment. The worst thing about using the
  real world instead of a simulation is the matter of time. Instead
  of simulated time you get the constant and unstoppable flow of

@@ -488,8 +490,8 @@
  may simply be impossible given the current speed of our
  processors. Contrast this with a simulation, in which the flow of
  time in the simulated world can be slowed down to accommodate the
- limitations of the character's programming. In terms of cost,
- doing everything in software is far cheaper than building custom
+ limitations of the creature's programming. In terms of cost, doing
+ everything in software is far cheaper than building custom
  real-time hardware. All you need is a laptop and some patience.

  ** Simulated time enables rapid prototyping \& simple programs

@@ -505,24 +507,24 @@
  to be accelerated by ASIC chips or FPGAs, turning what would
  otherwise be a few lines of code and a 10x speed penalty into a
  multi-month ordeal. For this reason, =CORTEX= supports
- /time-dilation/, which scales back the framerate of the
- simulation in proportion to the amount of processing each frame.
- From the perspective of the creatures inside the simulation, time
- always appears to flow at a constant rate, regardless of how
- complicated the environment becomes or how many creatures are in
- the simulation. The cost is that =CORTEX= can sometimes run slower
- than real time. This can also be an advantage, however ---
- simulations of very simple creatures in =CORTEX= generally run at
- 40x on my machine!
+ /time-dilation/, which scales back the framerate of the simulation
+ in proportion to the amount of processing each frame. From the
+ perspective of the creatures inside the simulation, time always
+ appears to flow at a constant rate, regardless of how complicated
+ the environment becomes or how many creatures are in the
+ simulation. The cost is that =CORTEX= can sometimes run slower than
+ real time. Time dilation works both ways, however --- simulations
+ of very simple creatures in =CORTEX= generally run at 40x real-time
+ on my machine!

  ** All sense organs are two-dimensional surfaces

  If =CORTEX= is to support a wide variety of senses, it would help
- to have a better understanding of what a ``sense'' actually is!
- While vision, touch, and hearing all seem like they are quite
- different things, I was surprised to learn during the course of
- this thesis that they (and all physical senses) can be expressed as
- exactly the same mathematical object due to a dimensional argument!
+ to have a better understanding of what a sense actually is! While
+ vision, touch, and hearing all seem like they are quite different
+ things, I was surprised to learn during the course of this thesis
+ that they (and all physical senses) can be expressed as exactly the
+ same mathematical object!

  Human beings are three-dimensional objects, and the nerves that
  transmit data from our various sense organs to our brain are
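To make the /time-dilation/ idea changed in the hunk above concrete, here is a minimal, self-contained Clojure sketch of the general technique (not =CORTEX='s actual implementation); =step-physics= and =process-senses= are stand-in names assumed only for this illustration.

#+BEGIN_SRC clojure
;; Sketch of time dilation: the simulated clock always advances by the
;; same dt per frame, no matter how long sense processing takes in
;; wall-clock time, so time appears constant to the simulated creature.
(defn run-dilated
  [world step-physics process-senses n-frames]
  (let [dt (/ 1.0 60.0)]                      ; constant *simulated* timestep
    (reduce (fn [w _frame]
              (let [w' (step-physics w dt)]   ; simulated time: always +dt
                (process-senses w')           ; may be slow; wall-clock stretches
                w'))
            world
            (range n-frames))))

;; Example with stand-in functions:
(run-dilated {:t 0.0}
             (fn [w dt] (update w :t + dt))
             (fn [_w] (Thread/sleep 5))       ; pretend sensing is expensive
             10)
;; => {:t 0.1666...}  ; ten frames of simulated time, however long they took
#+END_SRC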
@@ -545,7 +547,7 @@
  Most human senses consist of many discrete sensors of various
  properties distributed along a surface at various densities. For
  skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's
- disks, and Ruffini's endings \cite{textbook901}, which detect
+ disks, and Ruffini's endings (\cite{textbook901}), which detect
  pressure and vibration of various intensities. For ears, it is the
  stereocilia distributed along the basilar membrane inside the
  cochlea; each one is sensitive to a slightly different frequency of

@@ -556,19 +558,19 @@
  In fact, almost every human sense can be effectively described in
  terms of a surface containing embedded sensors. If the sense had
  any more dimensions, then there wouldn't be enough room in the
- spinal chord to transmit the information!
+ spinal cord to transmit the information!

  Therefore, =CORTEX= must support the ability to create objects and
  then be able to ``paint'' points along their surfaces to describe
  each sense.

  Fortunately this idea is already a well known computer graphics
- technique called /UV-mapping/. The three-dimensional surface of a
- model is cut and smooshed until it fits on a two-dimensional
- image. You paint whatever you want on that image, and when the
- three-dimensional shape is rendered in a game the smooshing and
- cutting is reversed and the image appears on the three-dimensional
- object.
+ technique called /UV-mapping/. In UV-mapping, the three-dimensional
+ surface of a model is cut and smooshed until it fits on a
+ two-dimensional image. You paint whatever you want on that image,
+ and when the three-dimensional shape is rendered in a game the
+ smooshing and cutting is reversed and the image appears on the
+ three-dimensional object.

  To make a sense, interpret the UV-image as describing the
  distribution of that senses sensors. To get different types of
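As a hedged illustration of ``painting'' a sense onto a UV image, the sketch below reads a png and treats every pure-white pixel as one sensor location; the white-pixel convention and the example file name are assumptions made for this illustration and are not the thesis's actual scheme.

#+BEGIN_SRC clojure
;; Sketch only: interpret a UV image as a sensor distribution by
;; collecting the (x, y) coordinates of every pure-white pixel.
(import '(javax.imageio ImageIO)
        '(java.io File))

(defn sensor-coordinates
  "Return the UV coordinates of every white pixel in the given png."
  [png-path]
  (let [image (ImageIO/read (File. png-path))]
    (for [y (range (.getHeight image))
          x (range (.getWidth image))
          :when (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image x y)))]
      [x y])))

;; Hypothetical usage:
;; (sensor-coordinates "worm-touch-profile.png") ; => ([3 0] [4 0] ...)
#+END_SRC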
@@ -610,12 +612,12 @@
  game engine will allow you to efficiently create multiple cameras
  in the simulated world that can be used as eyes. Video game systems
  offer integrated asset management for things like textures and
- creatures models, providing an avenue for defining creatures. They
+ creature models, providing an avenue for defining creatures. They
  also understand UV-mapping, since this technique is used to apply a
  texture to a model. Finally, because video game engines support a
- large number of users, as long as =CORTEX= doesn't stray too far
- from the base system, other researchers can turn to this community
- for help when doing their research.
+ large number of developers, as long as =CORTEX= doesn't stray too
+ far from the base system, other researchers can turn to this
+ community for help when doing their research.

  ** =CORTEX= is based on jMonkeyEngine3

@@ -623,14 +625,14 @@

  engines to see which would best serve as a base. The top contenders
  were:

- - [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]] :: The Quake II engine was designed by ID
- software in 1997. All the source code was released by ID
- software into the Public Domain several years ago, and as a
- result it has been ported to many different languages. This
- engine was famous for its advanced use of realistic shading
- and had decent and fast physics simulation. The main advantage
- of the Quake II engine is its simplicity, but I ultimately
- rejected it because the engine is too tied to the concept of a
+ - [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]] :: The Quake II engine was designed by ID software
+ in 1997. All the source code was released by ID software into
+ the Public Domain several years ago, and as a result it has
+ been ported to many different languages. This engine was
+ famous for its advanced use of realistic shading and it had
+ decent and fast physics simulation. The main advantage of the
+ Quake II engine is its simplicity, but I ultimately rejected
+ it because the engine is too tied to the concept of a
  first-person shooter game. One of the problems I had was that
  there does not seem to be any easy way to attach multiple
  cameras to a single character. There are also several physics
@@ -670,11 +672,11 @@
  enable people who are talented at modeling but not programming to
  design =CORTEX= creatures.

- Therefore, I use Blender, a free 3D modeling program, as the main
+ Therefore I use Blender, a free 3D modeling program, as the main
  way to create creatures in =CORTEX=. However, the creatures modeled
  in Blender must also be simple to simulate in jMonkeyEngine3's game
  engine, and must also be easy to rig with =CORTEX='s senses. I
- accomplish this with extensive use of Blender's ``empty nodes.''
+ accomplish this with extensive use of Blender's ``empty nodes.''

  Empty nodes have no mass, physical presence, or appearance, but
  they can hold metadata and have names. I use a tree structure of

@@ -699,14 +701,14 @@

  Blender is a general purpose animation tool, which has been used in
  the past to create high quality movies such as Sintel
- \cite{blender}. Though Blender can model and render even complicated
- things like water, it is crucial to keep models that are meant to
- be simulated as creatures simple. =Bullet=, which =CORTEX= uses
- though jMonkeyEngine3, is a rigid-body physics system. This offers
- a compromise between the expressiveness of a game level and the
- speed at which it can be simulated, and it means that creatures
- should be naturally expressed as rigid components held together by
- joint constraints.
+ (\cite{blender}). Though Blender can model and render even
+ complicated things like water, it is crucial to keep models that
+ are meant to be simulated as creatures simple. =Bullet=, which
+ =CORTEX= uses through jMonkeyEngine3, is a rigid-body physics
+ system. This offers a compromise between the expressiveness of a
+ game level and the speed at which it can be simulated, and it means
+ that creatures should be naturally expressed as rigid components
+ held together by joint constraints.

  But humans are more like a squishy bag wrapped around some hard
  bones which define the overall shape. When we move, our skin bends

@@ -729,10 +731,10 @@
  physical model of the skin along with the movement of the bones,
  which is unacceptably slow compared to rigid body simulation.

- Therefore, instead of using the human-like ``deformable bag of
- bones'' approach, I decided to base my body plans on multiple solid
- objects that are connected by joints, inspired by the robot =EVE=
- from the movie WALL-E.
+ Therefore, instead of using the human-like ``bony meatbag''
+ approach, I decided to base my body plans on multiple solid objects
+ that are connected by joints, inspired by the robot =EVE= from the
+ movie WALL-E.

  #+caption: =EVE= from the movie WALL-E. This body plan turns
  #+caption: out to be much better suited to my purposes than a more

@@ -742,19 +744,19 @@

  =EVE='s body is composed of several rigid components that are held
  together by invisible joint constraints. This is what I mean by
- ``eve-like''. The main reason that I use eve-style bodies is for
- efficiency, and so that there will be correspondence between the
- AI's senses and the physical presence of its body. Each individual
- section is simulated by a separate rigid body that corresponds
- exactly with its visual representation and does not change.
- Sections are connected by invisible joints that are well supported
- in jMonkeyEngine3. Bullet, the physics backend for jMonkeyEngine3,
- can efficiently simulate hundreds of rigid bodies connected by
- joints. Just because sections are rigid does not mean they have to
- stay as one piece forever; they can be dynamically replaced with
- multiple sections to simulate splitting in two. This could be used
- to simulate retractable claws or =EVE='s hands, which are able to
- coalesce into one object in the movie.
+ /eve-like/. The main reason that I use eve-like bodies is for
+ simulation efficiency, and so that there will be correspondence
+ between the AI's senses and the physical presence of its body. Each
+ individual section is simulated by a separate rigid body that
+ corresponds exactly with its visual representation and does not
+ change. Sections are connected by invisible joints that are well
+ supported in jMonkeyEngine3. Bullet, the physics backend for
+ jMonkeyEngine3, can efficiently simulate hundreds of rigid bodies
+ connected by joints. Just because sections are rigid does not mean
+ they have to stay as one piece forever; they can be dynamically
+ replaced with multiple sections to simulate splitting in two. This
+ could be used to simulate retractable claws or =EVE='s hands, which
+ are able to coalesce into one object in the movie.

  *** Solidifying/Connecting a body

@@ -2443,10 +2445,10 @@
  improvement, among which are using vision to infer
  proprioception and looking up sensory experience with imagined
  vision, touch, and sound.
- - Evolution :: Karl Sims created a rich environment for
- simulating the evolution of creatures on a connection
- machine. Today, this can be redone and expanded with =CORTEX=
- on an ordinary computer.
+ - Evolution :: Karl Sims created a rich environment for simulating
+ the evolution of creatures on a Connection Machine
+ (\cite{sims-evolving-creatures}). Today, this can be redone
+ and expanded with =CORTEX= on an ordinary computer.
  - Exotic senses :: Cortex enables many fascinating senses that are
  not possible to build in the real world. For example,
  telekinesis is an interesting avenue to explore. You can also

@@ -2457,7 +2459,7 @@
  an effector which creates an entire new sub-simulation where
  the creature has direct control over placement/creation of
  objects via simulated telekinesis. The creature observes this
- sub-world through it's normal senses and uses its observations
+ sub-world through its normal senses and uses its observations
  to make predictions about its top level world.
  - Simulated prescience :: step the simulation forward a few ticks,
  gather sensory data, then supply this data for the creature as

@@ -2470,25 +2472,24 @@
  with each other. Because the creatures would be simulated, you
  could investigate computationally complex rules of behavior
  which still, from the group's point of view, would happen in
- ``real time''. Interactions could be as simple as cellular
+ real time. Interactions could be as simple as cellular
  organisms communicating via flashing lights, or as complex as
  humanoids completing social tasks, etc.
- - =HACKER= for writing muscle-control programs :: Presented with
- low-level muscle control/ sense API, generate higher level
+ - =HACKER= for writing muscle-control programs :: Presented with a
+ low-level muscle control / sense API, generate higher level
  programs for accomplishing various stated goals. Example goals
  might be "extend all your fingers" or "move your hand into the
  area with blue light" or "decrease the angle of this joint".
  It would be like Sussman's HACKER, except it would operate
  with much more data in a more realistic world. Start off with
  "calisthenics" to develop subroutines over the motor control
- API. This would be the "spinal chord" of a more intelligent
- creature. The low level programming code might be a turning
- machine that could develop programs to iterate over a "tape"
- where each entry in the tape could control recruitment of the
- fibers in a muscle.
- - Sense fusion :: There is much work to be done on sense
+ API. The low level programming code might be a Turing machine
+ that could develop programs to iterate over a "tape" where
+ each entry in the tape could control recruitment of the fibers
+ in a muscle.
+ - Sense fusion :: There is much work to be done on sense
  integration -- building up a coherent picture of the world and
- the things in it with =CORTEX= as a base, you can explore
+ the things in it. With =CORTEX= as a base, you can explore
  concepts like self-organizing maps or cross modal clustering
  in ways that have never before been tried.
  - Inverse kinematics :: experiments in sense guided motor control
@@ -2761,7 +2762,7 @@
  jumping actually /is/.

  Of course, the action predicates are not directly applicable to
- video data which lacks the advanced sensory information which they
+ video data, which lacks the advanced sensory information which they
  require!

  The trick now is to make the action predicates work even when the

@@ -2858,7 +2859,8 @@
  #+END_EXAMPLE

  The worm's previous experience of lying on the ground and lifting
- its head generates possible interpretations for each frame:
+ its head generates possible interpretations for each frame (the
+ numbers are experience-indices):

  #+BEGIN_EXAMPLE
  [ flat, flat, flat, flat, flat, flat, flat, lift-head ]

@@ -2878,9 +2880,9 @@
  #+END_EXAMPLE

  The new path through \Phi-space is synthesized from two actual
- paths that the creature actually experiences, the "1-2-3-4" chain
- and the "6-7-8-9" chain. The "1-2-3-4" chain is necessary because
- it ends with the worm lifting its head. It originated from a short
+ paths that the creature has experienced: the "1-2-3-4" chain and
+ the "6-7-8-9" chain. The "1-2-3-4" chain is necessary because it
+ ends with the worm lifting its head. It originated from a short
  training session where the worm rested on the floor for a brief
  while and then raised its head. The "6-7-8-9" chain is part of a
  longer chain of inactivity where the worm simply rested on the

@@ -3800,3 +3802,4 @@


+TODO -- add a paper about detecting biological motion from only a few dots.