diff thesis/cortex.org @ 547:5d89879fc894

couple hours worth of edits.
author Robert McIntyre <rlm@mit.edu>
date Mon, 28 Apr 2014 15:10:59 -0400
parents b2c66ea58c39
children 0b891e0dd809
line wrap: on
line diff
     1.1 --- a/thesis/cortex.org	Mon Apr 28 13:14:52 2014 -0400
     1.2 +++ b/thesis/cortex.org	Mon Apr 28 15:10:59 2014 -0400
     1.3 @@ -43,15 +43,15 @@
     1.4  
     1.5  * Empathy \& Embodiment: problem solving strategies
     1.6  
     1.7 -  By the end of this thesis, you will have seen a novel approach to
     1.8 -  interpreting video using embodiment and empathy. You will also see
     1.9 -  one way to efficiently implement physical empathy for embodied
    1.10 -  creatures. Finally, you will become familiar with =CORTEX=, a system
    1.11 -  for designing and simulating creatures with rich senses, which I
    1.12 -  have designed as a library that you can use in your own research.
    1.13 -  Note that I /do not/ process video directly --- I start with
    1.14 -  knowledge of the positions of a creature's body parts and works from
    1.15 -  there.
    1.16 +  By the end of this thesis, you will have a novel approach to
     1.17 +  representing and recognizing physical actions using embodiment and
    1.18 +  empathy. You will also see one way to efficiently implement physical
    1.19 +  empathy for embodied creatures. Finally, you will become familiar
    1.20 +  with =CORTEX=, a system for designing and simulating creatures with
    1.21 +  rich senses, which I have designed as a library that you can use in
    1.22 +  your own research. Note that I /do not/ process video directly --- I
    1.23 +  start with knowledge of the positions of a creature's body parts and
     1.24 +  work from there.
    1.25    
    1.26    This is the core vision of my thesis: That one of the important ways
    1.27    in which we understand others is by imagining ourselves in their
    1.28 @@ -81,11 +81,11 @@
    1.29     \cite{volume-action-recognition}), but the 3D world is so variable
    1.30     that it is hard to describe the world in terms of possible images.
    1.31  
    1.32 -   In fact, the contents of scene may have much less to do with pixel
    1.33 -   probabilities than with recognizing various affordances: things you
    1.34 -   can move, objects you can grasp, spaces that can be filled . For
    1.35 -   example, what processes might enable you to see the chair in figure
    1.36 -   \ref{hidden-chair}?
    1.37 +   In fact, the contents of a scene may have much less to do with
    1.38 +   pixel probabilities than with recognizing various affordances:
    1.39 +   things you can move, objects you can grasp, spaces that can be
     1.40 +   filled. For example, what processes might enable you to see the
    1.41 +   chair in figure \ref{hidden-chair}?
    1.42  
    1.43     #+caption: The chair in this image is quite obvious to humans, but 
    1.44     #+caption: it can't be found by any modern computer vision program.
    1.45 @@ -106,21 +106,21 @@
    1.46     Each of these examples tells us something about what might be going
    1.47     on in our minds as we easily solve these recognition problems:
    1.48     
    1.49 -   The hidden chair shows us that we are strongly triggered by cues
    1.50 -   relating to the position of human bodies, and that we can determine
    1.51 -   the overall physical configuration of a human body even if much of
    1.52 -   that body is occluded.
    1.53 -
    1.54 -   The picture of the girl pushing against the wall tells us that we
    1.55 -   have common sense knowledge about the kinetics of our own bodies.
    1.56 -   We know well how our muscles would have to work to maintain us in
    1.57 -   most positions, and we can easily project this self-knowledge to
    1.58 -   imagined positions triggered by images of the human body.
    1.59 -
    1.60 -   The cat tells us that imagination of some kind plays an important
    1.61 -   role in understanding actions. The question is: Can we be more
    1.62 -   precise about what sort of imagination is required to understand
    1.63 -   these actions?
    1.64 +   - The hidden chair shows us that we are strongly triggered by cues
    1.65 +     relating to the position of human bodies, and that we can
    1.66 +     determine the overall physical configuration of a human body even
    1.67 +     if much of that body is occluded.
    1.68 +
    1.69 +   - The picture of the girl pushing against the wall tells us that we
    1.70 +     have common sense knowledge about the kinetics of our own bodies.
    1.71 +     We know well how our muscles would have to work to maintain us in
    1.72 +     most positions, and we can easily project this self-knowledge to
    1.73 +     imagined positions triggered by images of the human body.
    1.74 +
    1.75 +   - The cat tells us that imagination of some kind plays an important
    1.76 +     role in understanding actions. The question is: Can we be more
    1.77 +     precise about what sort of imagination is required to understand
    1.78 +     these actions?
    1.79  
    1.80  ** A step forward: the sensorimotor-centered approach
    1.81  
    1.82 @@ -135,12 +135,12 @@
    1.83     the cool water hitting their tongue, and feel the water entering
    1.84     their body, and are able to recognize that /feeling/ as drinking.
    1.85     So, the label of the action is not really in the pixels of the
    1.86 -   image, but is found clearly in a simulation inspired by those
    1.87 -   pixels. An imaginative system, having been trained on drinking and
    1.88 -   non-drinking examples and learning that the most important
    1.89 -   component of drinking is the feeling of water sliding down one's
    1.90 -   throat, would analyze a video of a cat drinking in the following
    1.91 -   manner:
    1.92 +   image, but is found clearly in a simulation / recollection inspired
    1.93 +   by those pixels. An imaginative system, having been trained on
    1.94 +   drinking and non-drinking examples and learning that the most
    1.95 +   important component of drinking is the feeling of water sliding
    1.96 +   down one's throat, would analyze a video of a cat drinking in the
    1.97 +   following manner:
    1.98     
    1.99     1. Create a physical model of the video by putting a ``fuzzy''
   1.100        model of its own body in place of the cat. Possibly also create
   1.101 @@ -193,7 +193,7 @@
   1.102     the particulars of any visual representation of the actions. If you
   1.103     teach the system what ``running'' is, and you have a good enough
   1.104     aligner, the system will from then on be able to recognize running
   1.105 -   from any point of view, even strange points of view like above or
   1.106 +   from any point of view -- even strange points of view like above or
   1.107     underneath the runner. This is in contrast to action recognition
   1.108     schemes that try to identify actions using a non-embodied approach.
   1.109     If these systems learn about running as viewed from the side, they
   1.110 @@ -201,12 +201,13 @@
   1.111     viewpoint.
   1.112  
   1.113     Another powerful advantage is that using the language of multiple
   1.114 -   body-centered rich senses to describe body-centered actions offers a
   1.115 -   massive boost in descriptive capability. Consider how difficult it
   1.116 -   would be to compose a set of HOG filters to describe the action of
   1.117 -   a simple worm-creature ``curling'' so that its head touches its
   1.118 -   tail, and then behold the simplicity of describing thus action in a
   1.119 -   language designed for the task (listing \ref{grand-circle-intro}):
   1.120 +   body-centered rich senses to describe body-centered actions offers
   1.121 +   a massive boost in descriptive capability. Consider how difficult
   1.122 +   it would be to compose a set of HOG (Histogram of Oriented
   1.123 +   Gradients) filters to describe the action of a simple worm-creature
   1.124 +   ``curling'' so that its head touches its tail, and then behold the
    1.125 +   simplicity of describing this action in a language designed for the
   1.126 +   task (listing \ref{grand-circle-intro}):
   1.127  
   1.128     #+caption: Body-centered actions are best expressed in a body-centered 
   1.129     #+caption: language. This code detects when the worm has curled into a 
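   As a rough illustration of the flavor of such a predicate (the
   actual definition appears in listing \ref{grand-circle-intro}), a
   body-centered ``curling'' test might look something like the
   following sketch. The worm state, the =distance= helper, and
   =contact-threshold= are hypothetical stand-ins, not =CORTEX='s real
   sense functions.

   #+begin_src clojure
   ;; Sketch only: assumes the worm's proprioceptive state has already
   ;; been reduced to world-space [x y z] positions for head and tail.
   (def contact-threshold 0.2)

   (defn distance
     "Euclidean distance between two [x y z] points."
     [p q]
     (Math/sqrt (reduce + (map #(* % %) (map - p q)))))

   (defn curled?
     "True when the head is close enough to the tail to count as a curl."
     [{:keys [head tail]}]
     (< (distance head tail) contact-threshold))

   ;; A worm whose head has come around to meet its tail:
   (curled? {:head [0.0 0.0 0.0] :tail [0.1 0.0 0.0]})  ;; => true
   #+end_src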
   1.130 @@ -272,10 +273,10 @@
   1.131          together to form a coherent and complete sensory portrait of
   1.132          the scene.
   1.133  
   1.134 -   - Recognition      :: With the scene described in terms of
   1.135 -        remembered first person sensory events, the creature can now
   1.136 -        run its action-identified programs (such as the one in listing
   1.137 -        \ref{grand-circle-intro} on this synthesized sensory data,
   1.138 +   - Recognition :: With the scene described in terms of remembered
   1.139 +        first person sensory events, the creature can now run its
   1.140 +        action-definition programs (such as the one in listing
   1.141 +        \ref{grand-circle-intro}) on this synthesized sensory data,
   1.142          just as it would if it were actually experiencing the scene
   1.143          first-hand. If previous experience has been accurately
   1.144          retrieved, and if it is analogous enough to the scene, then
   1.145 @@ -327,20 +328,21 @@
   1.146     number of creatures. I intend it to be useful as a library for many
   1.147     more projects than just this thesis. =CORTEX= was necessary to meet
   1.148     a need among AI researchers at CSAIL and beyond, which is that
   1.149 -   people often will invent neat ideas that are best expressed in the
   1.150 -   language of creatures and senses, but in order to explore those
   1.151 +   people often will invent wonderful ideas that are best expressed in
   1.152 +   the language of creatures and senses, but in order to explore those
   1.153     ideas they must first build a platform in which they can create
   1.154     simulated creatures with rich senses! There are many ideas that
   1.155 -   would be simple to execute (such as =EMPATH= or
   1.156 -   \cite{larson-symbols}), but attached to them is the multi-month
   1.157 -   effort to make a good creature simulator. Often, that initial
   1.158 -   investment of time proves to be too much, and the project must make
   1.159 -   do with a lesser environment.
   1.160 +   would be simple to execute (such as =EMPATH= or Larson's
   1.161 +   self-organizing maps (\cite{larson-symbols})), but attached to them
   1.162 +   is the multi-month effort to make a good creature simulator. Often,
   1.163 +   that initial investment of time proves to be too much, and the
   1.164 +   project must make do with a lesser environment or be abandoned
   1.165 +   entirely.
   1.166  
   1.167     =CORTEX= is well suited as an environment for embodied AI research
   1.168     for three reasons:
   1.169  
   1.170 -   - You can create new creatures using Blender (\cite{blender}), a
   1.171 +   - You can design new creatures using Blender (\cite{blender}), a
   1.172       popular 3D modeling program. Each sense can be specified using
   1.173       special blender nodes with biologically inspired parameters. You
   1.174       need not write any code to create a creature, and can use a wide
   1.175 @@ -352,9 +354,8 @@
   1.176       senses like touch and vision involve multiple sensory elements
   1.177       embedded in a 2D surface. You have complete control over the
   1.178       distribution of these sensor elements through the use of simple
   1.179 -     png image files. In particular, =CORTEX= implements more
   1.180 -     comprehensive hearing than any other creature simulation system
   1.181 -     available.
   1.182 +     png image files. =CORTEX= implements more comprehensive hearing
   1.183 +     than any other creature simulation system available.
   1.184  
   1.185     - =CORTEX= supports any number of creatures and any number of
   1.186       senses. Time in =CORTEX= dilates so that the simulated creatures
   1.187 @@ -425,7 +426,7 @@
   1.188  
   1.189    Throughout this project, I intended for =CORTEX= to be flexible and
   1.190    extensible enough to be useful for other researchers who want to
   1.191 -  test out ideas of their own. To this end, wherever I have had to make
   1.192 +  test ideas of their own. To this end, wherever I have had to make
   1.193    architectural choices about =CORTEX=, I have chosen to give as much
   1.194    freedom to the user as possible, so that =CORTEX= may be used for
   1.195    things I have not foreseen.
   1.196 @@ -437,25 +438,26 @@
   1.197     reflection of its complexity. It may be that there is a significant
   1.198     qualitative difference between dealing with senses in the real
   1.199     world and dealing with pale facsimiles of them in a simulation
   1.200 -   \cite{brooks-representation}. What are the advantages and
   1.201 +   (\cite{brooks-representation}). What are the advantages and
   1.202     disadvantages of a simulation vs. reality?
   1.203     
   1.204  *** Simulation
   1.205  
   1.206      The advantages of virtual reality are that when everything is a
   1.207      simulation, experiments in that simulation are absolutely
   1.208 -    reproducible. It's also easier to change the character and world
   1.209 -    to explore new situations and different sensory combinations.
   1.210 +    reproducible. It's also easier to change the creature and
   1.211 +    environment to explore new situations and different sensory
   1.212 +    combinations.
   1.213  
   1.214      If the world is to be simulated on a computer, then not only do
   1.215 -    you have to worry about whether the character's senses are rich
   1.216 +    you have to worry about whether the creature's senses are rich
   1.217      enough to learn from the world, but whether the world itself is
   1.218      rendered with enough detail and realism to give enough working
   1.219 -    material to the character's senses. To name just a few
   1.220 +    material to the creature's senses. To name just a few
   1.221      difficulties facing modern physics simulators: destructibility of
   1.222      the environment, simulation of water/other fluids, large areas,
   1.223      nonrigid bodies, lots of objects, smoke. I don't know of any
   1.224 -    computer simulation that would allow a character to take a rock
   1.225 +    computer simulation that would allow a creature to take a rock
   1.226      and grind it into fine dust, then use that dust to make a clay
   1.227      sculpture, at least not without spending years calculating the
   1.228      interactions of every single small grain of dust. Maybe a
   1.229 @@ -471,14 +473,14 @@
   1.230      the complexity of implementing the senses. Instead of just
   1.231      grabbing the current rendered frame for processing, you have to
   1.232      use an actual camera with real lenses and interact with photons to
   1.233 -    get an image. It is much harder to change the character, which is
   1.234 +    get an image. It is much harder to change the creature, which is
   1.235      now partly a physical robot of some sort, since doing so involves
   1.236      changing things around in the real world instead of modifying
   1.237      lines of code. While the real world is very rich and definitely
   1.238 -    provides enough stimulation for intelligence to develop as
   1.239 -    evidenced by our own existence, it is also uncontrollable in the
   1.240 +    provides enough stimulation for intelligence to develop (as
   1.241 +    evidenced by our own existence), it is also uncontrollable in the
   1.242      sense that a particular situation cannot be recreated perfectly or
   1.243 -    saved for later use. It is harder to conduct science because it is
   1.244 +    saved for later use. It is harder to conduct Science because it is
   1.245      harder to repeat an experiment. The worst thing about using the
   1.246      real world instead of a simulation is the matter of time. Instead
   1.247      of simulated time you get the constant and unstoppable flow of
   1.248 @@ -488,8 +490,8 @@
   1.249      may simply be impossible given the current speed of our
   1.250      processors. Contrast this with a simulation, in which the flow of
   1.251      time in the simulated world can be slowed down to accommodate the
   1.252 -    limitations of the character's programming. In terms of cost,
   1.253 -    doing everything in software is far cheaper than building custom
   1.254 +    limitations of the creature's programming. In terms of cost, doing
   1.255 +    everything in software is far cheaper than building custom
   1.256      real-time hardware. All you need is a laptop and some patience.
   1.257      
   1.258  ** Simulated time enables rapid prototyping \& simple programs
   1.259 @@ -505,24 +507,24 @@
   1.260     to be accelerated by ASIC chips or FPGAs, turning what would
   1.261     otherwise be a few lines of code and a 10x speed penalty into a
   1.262     multi-month ordeal. For this reason, =CORTEX= supports
   1.263 -   /time-dilation/, which scales back the framerate of the
   1.264 -   simulation in proportion to the amount of processing each frame.
   1.265 -   From the perspective of the creatures inside the simulation, time
   1.266 -   always appears to flow at a constant rate, regardless of how
   1.267 -   complicated the environment becomes or how many creatures are in
   1.268 -   the simulation. The cost is that =CORTEX= can sometimes run slower
   1.269 -   than real time. This can also be an advantage, however ---
   1.270 -   simulations of very simple creatures in =CORTEX= generally run at
   1.271 -   40x on my machine!
   1.272 +   /time-dilation/, which scales back the framerate of the simulation
    1.273 +   in proportion to the amount of processing required for each frame.
   1.274 +   perspective of the creatures inside the simulation, time always
   1.275 +   appears to flow at a constant rate, regardless of how complicated
   1.276 +   the environment becomes or how many creatures are in the
   1.277 +   simulation. The cost is that =CORTEX= can sometimes run slower than
    1.278 +   From the real time. Time dilation works both ways, however --- simulations
   1.279 +   of very simple creatures in =CORTEX= generally run at 40x real-time
   1.280 +   on my machine!
   1.281  
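   A minimal sketch of the time-dilation idea (the function names here
   are hypothetical, not =CORTEX='s actual API): every frame advances
   the simulated clock by the same fixed amount, no matter how long
   the per-frame sense processing takes in wall-clock time.

   #+begin_src clojure
   (defn run-dilated
     "Advance `world` by `n` frames. Each frame applies `step-world`
      (physics) with a fixed simulated timestep `dt`, then
      `process-senses` (rendering sense data, running the AI), which
      may take arbitrarily long in real time."
     [step-world process-senses dt world n]
     (reduce (fn [w _] (-> w (step-world dt) process-senses))
             world
             (range n)))

   ;; Toy usage: the "world" is just a map with a simulated clock.
   (run-dilated (fn [w dt] (update w :sim-time + dt)) ; physics: add dt
                identity                              ; stand-in for sense processing
                1/60                                  ; 60 simulated frames per second
                {:sim-time 0}
                120)
   ;; => {:sim-time 2N}  -- two simulated seconds, however long it took
   #+end_src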
   1.282  ** All sense organs are two-dimensional surfaces
   1.283  
   1.284     If =CORTEX= is to support a wide variety of senses, it would help
   1.285 -   to have a better understanding of what a ``sense'' actually is!
   1.286 -   While vision, touch, and hearing all seem like they are quite
   1.287 -   different things, I was surprised to learn during the course of
   1.288 -   this thesis that they (and all physical senses) can be expressed as
   1.289 -   exactly the same mathematical object due to a dimensional argument!
   1.290 +   to have a better understanding of what a sense actually is! While
   1.291 +   vision, touch, and hearing all seem like they are quite different
   1.292 +   things, I was surprised to learn during the course of this thesis
   1.293 +   that they (and all physical senses) can be expressed as exactly the
   1.294 +   same mathematical object!
   1.295  
   1.296     Human beings are three-dimensional objects, and the nerves that
   1.297     transmit data from our various sense organs to our brain are
   1.298 @@ -545,7 +547,7 @@
   1.299     Most human senses consist of many discrete sensors of various
   1.300     properties distributed along a surface at various densities. For
   1.301     skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's
   1.302 -   disks, and Ruffini's endings \cite{textbook901}, which detect
   1.303 +   disks, and Ruffini's endings (\cite{textbook901}), which detect
   1.304     pressure and vibration of various intensities. For ears, it is the
   1.305     stereocilia distributed along the basilar membrane inside the
   1.306     cochlea; each one is sensitive to a slightly different frequency of
   1.307 @@ -556,19 +558,19 @@
   1.308     In fact, almost every human sense can be effectively described in
   1.309     terms of a surface containing embedded sensors. If the sense had
   1.310     any more dimensions, then there wouldn't be enough room in the
   1.311 -   spinal chord to transmit the information!
   1.312 +   spinal cord to transmit the information!
   1.313  
   1.314     Therefore, =CORTEX= must support the ability to create objects and
   1.315     then be able to ``paint'' points along their surfaces to describe
   1.316     each sense. 
   1.317  
   1.318     Fortunately this idea is already a well known computer graphics
   1.319 -   technique called /UV-mapping/. The three-dimensional surface of a
   1.320 -   model is cut and smooshed until it fits on a two-dimensional
   1.321 -   image. You paint whatever you want on that image, and when the
   1.322 -   three-dimensional shape is rendered in a game the smooshing and
   1.323 -   cutting is reversed and the image appears on the three-dimensional
   1.324 -   object.
    1.325 +   technique called /UV-mapping/. In UV-mapping, the three-dimensional
   1.326 +   surface of a model is cut and smooshed until it fits on a
   1.327 +   two-dimensional image. You paint whatever you want on that image,
   1.328 +   and when the three-dimensional shape is rendered in a game the
   1.329 +   smooshing and cutting is reversed and the image appears on the
   1.330 +   three-dimensional object.
   1.331  
   1.332     To make a sense, interpret the UV-image as describing the
    1.333     distribution of that sense's sensors. To get different types of
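   One plausible reading of such a UV-image (a sketch, not =CORTEX='s
   actual implementation) is: every marked pixel becomes one sensor,
   and the pixel's (u, v) coordinates locate that sensor on the
   model's surface. A real version would load the png with
   =javax.imageio= and threshold the pixel values; here the image is
   just a grid of 0s and 1s.

   #+begin_src clojure
   (defn sensor-coordinates
     "Return the [u v] coordinates of every marked pixel in `image`,
      where `image` is a vector of rows and a positive entry marks a
      sensor."
     [image]
     (for [[v row] (map-indexed vector image)
           [u pix] (map-indexed vector row)
           :when (pos? pix)]
       [u v]))

   ;; A tiny plus-shaped patch of five touch sensors:
   (sensor-coordinates [[0 1 0]
                        [1 1 1]
                        [0 1 0]])
   ;; => ([1 0] [0 1] [1 1] [2 1] [1 2])
   #+end_src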
   1.334 @@ -610,12 +612,12 @@
   1.335     game engine will allow you to efficiently create multiple cameras
   1.336     in the simulated world that can be used as eyes. Video game systems
   1.337     offer integrated asset management for things like textures and
   1.338 -   creatures models, providing an avenue for defining creatures. They
   1.339 +   creature models, providing an avenue for defining creatures. They
   1.340     also understand UV-mapping, since this technique is used to apply a
   1.341     texture to a model. Finally, because video game engines support a
   1.342 -   large number of users, as long as =CORTEX= doesn't stray too far
   1.343 -   from the base system, other researchers can turn to this community
   1.344 -   for help when doing their research.
   1.345 +   large number of developers, as long as =CORTEX= doesn't stray too
   1.346 +   far from the base system, other researchers can turn to this
   1.347 +   community for help when doing their research.
   1.348     
   1.349  ** =CORTEX= is based on jMonkeyEngine3
   1.350  
   1.351 @@ -623,14 +625,14 @@
   1.352     engines to see which would best serve as a base. The top contenders
   1.353     were:
   1.354  
   1.355 -   - [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]    :: The Quake II engine was designed by ID
   1.356 -        software in 1997.  All the source code was released by ID
   1.357 -        software into the Public Domain several years ago, and as a
   1.358 -        result it has been ported to many different languages. This
   1.359 -        engine was famous for its advanced use of realistic shading
   1.360 -        and had decent and fast physics simulation. The main advantage
   1.361 -        of the Quake II engine is its simplicity, but I ultimately
   1.362 -        rejected it because the engine is too tied to the concept of a
   1.363 +   - [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]] :: The Quake II engine was designed by ID software
   1.364 +        in 1997. All the source code was released by ID software into
   1.365 +        the Public Domain several years ago, and as a result it has
   1.366 +        been ported to many different languages. This engine was
   1.367 +        famous for its advanced use of realistic shading and it had
   1.368 +        decent and fast physics simulation. The main advantage of the
   1.369 +        Quake II engine is its simplicity, but I ultimately rejected
   1.370 +        it because the engine is too tied to the concept of a
   1.371          first-person shooter game. One of the problems I had was that
   1.372          there does not seem to be any easy way to attach multiple
   1.373          cameras to a single character. There are also several physics
   1.374 @@ -670,11 +672,11 @@
   1.375     enable people who are talented at modeling but not programming to
   1.376     design =CORTEX= creatures.
   1.377  
   1.378 -   Therefore, I use Blender, a free 3D modeling program, as the main
   1.379 +   Therefore I use Blender, a free 3D modeling program, as the main
   1.380     way to create creatures in =CORTEX=. However, the creatures modeled
   1.381     in Blender must also be simple to simulate in jMonkeyEngine3's game
   1.382     engine, and must also be easy to rig with =CORTEX='s senses. I
   1.383 -   accomplish this with extensive use of Blender's ``empty nodes.'' 
   1.384 +   accomplish this with extensive use of Blender's ``empty nodes.''
   1.385  
   1.386     Empty nodes have no mass, physical presence, or appearance, but
   1.387     they can hold metadata and have names. I use a tree structure of
   1.388 @@ -699,14 +701,14 @@
   1.389  
   1.390     Blender is a general purpose animation tool, which has been used in
   1.391     the past to create high quality movies such as Sintel
   1.392 -   \cite{blender}. Though Blender can model and render even complicated
   1.393 -   things like water, it is crucial to keep models that are meant to
   1.394 -   be simulated as creatures simple. =Bullet=, which =CORTEX= uses
   1.395 -   though jMonkeyEngine3, is a rigid-body physics system. This offers
   1.396 -   a compromise between the expressiveness of a game level and the
   1.397 -   speed at which it can be simulated, and it means that creatures
   1.398 -   should be naturally expressed as rigid components held together by
   1.399 -   joint constraints.
   1.400 +   (\cite{blender}). Though Blender can model and render even
   1.401 +   complicated things like water, it is crucial to keep models that
   1.402 +   are meant to be simulated as creatures simple. =Bullet=, which
    1.403 +   =CORTEX= uses through jMonkeyEngine3, is a rigid-body physics
   1.404 +   system. This offers a compromise between the expressiveness of a
   1.405 +   game level and the speed at which it can be simulated, and it means
   1.406 +   that creatures should be naturally expressed as rigid components
   1.407 +   held together by joint constraints.
   1.408  
   1.409     But humans are more like a squishy bag wrapped around some hard
   1.410     bones which define the overall shape. When we move, our skin bends
   1.411 @@ -729,10 +731,10 @@
   1.412     physical model of the skin along with the movement of the bones,
   1.413     which is unacceptably slow compared to rigid body simulation.
   1.414  
   1.415 -   Therefore, instead of using the human-like ``deformable bag of
   1.416 -   bones'' approach, I decided to base my body plans on multiple solid
   1.417 -   objects that are connected by joints, inspired by the robot =EVE=
   1.418 -   from the movie WALL-E.
   1.419 +   Therefore, instead of using the human-like ``bony meatbag''
   1.420 +   approach, I decided to base my body plans on multiple solid objects
   1.421 +   that are connected by joints, inspired by the robot =EVE= from the
   1.422 +   movie WALL-E.
   1.423     
   1.424     #+caption: =EVE= from the movie WALL-E.  This body plan turns 
   1.425     #+caption: out to be much better suited to my purposes than a more 
   1.426 @@ -742,19 +744,19 @@
   1.427  
   1.428     =EVE='s body is composed of several rigid components that are held
   1.429     together by invisible joint constraints. This is what I mean by
   1.430 -   ``eve-like''. The main reason that I use eve-style bodies is for
   1.431 -   efficiency, and so that there will be correspondence between the
   1.432 -   AI's senses and the physical presence of its body. Each individual
   1.433 -   section is simulated by a separate rigid body that corresponds
   1.434 -   exactly with its visual representation and does not change.
   1.435 -   Sections are connected by invisible joints that are well supported
   1.436 -   in jMonkeyEngine3. Bullet, the physics backend for jMonkeyEngine3,
   1.437 -   can efficiently simulate hundreds of rigid bodies connected by
   1.438 -   joints. Just because sections are rigid does not mean they have to
   1.439 -   stay as one piece forever; they can be dynamically replaced with
   1.440 -   multiple sections to simulate splitting in two. This could be used
   1.441 -   to simulate retractable claws or =EVE='s hands, which are able to
   1.442 -   coalesce into one object in the movie.
   1.443 +   /eve-like/. The main reason that I use eve-like bodies is for
   1.444 +   simulation efficiency, and so that there will be correspondence
   1.445 +   between the AI's senses and the physical presence of its body. Each
   1.446 +   individual section is simulated by a separate rigid body that
   1.447 +   corresponds exactly with its visual representation and does not
   1.448 +   change. Sections are connected by invisible joints that are well
   1.449 +   supported in jMonkeyEngine3. Bullet, the physics backend for
   1.450 +   jMonkeyEngine3, can efficiently simulate hundreds of rigid bodies
   1.451 +   connected by joints. Just because sections are rigid does not mean
   1.452 +   they have to stay as one piece forever; they can be dynamically
   1.453 +   replaced with multiple sections to simulate splitting in two. This
   1.454 +   could be used to simulate retractable claws or =EVE='s hands, which
   1.455 +   are able to coalesce into one object in the movie.
   1.456  
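   In jMonkeyEngine3 terms, the idea reduces to something like the
   following sketch (not =CORTEX='s actual rigging code): each section
   gets its own =RigidBodyControl=, and neighboring sections are
   pinned together with a Bullet joint such as =Point2PointJoint=. In
   a running simulation these controls would also be attached to
   spatials and added to the =PhysicsSpace=.

   #+begin_src clojure
   (import '(com.jme3.bullet.control RigidBodyControl)
           '(com.jme3.bullet.collision.shapes BoxCollisionShape)
           '(com.jme3.bullet.joints Point2PointJoint)
           '(com.jme3.math Vector3f))

   (defn rigid-section
     "A box-shaped rigid body section of the given mass and half-extent."
     [mass half-extent]
     (RigidBodyControl.
      (BoxCollisionShape. (Vector3f. half-extent half-extent half-extent))
      (float mass)))

   (defn connect-sections
     "Pin two sections together at their local attachment points."
     [body-a body-b pivot-a pivot-b]
     (Point2PointJoint. body-a body-b pivot-a pivot-b))

   ;; Two unit-mass sections joined where the bottom of A meets the top of B:
   (let [a (rigid-section 1.0 0.5)
         b (rigid-section 1.0 0.5)]
     (connect-sections a b (Vector3f. 0 -0.5 0) (Vector3f. 0 0.5 0)))
   #+end_src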
   1.457  *** Solidifying/Connecting a body
   1.458  
   1.459 @@ -2443,10 +2445,10 @@
   1.460          improvement, among which are using vision to infer
   1.461          proprioception and looking up sensory experience with imagined
   1.462          vision, touch, and sound.
   1.463 -   - Evolution       :: Karl Sims created a rich environment for
   1.464 -        simulating the evolution of creatures on a connection
   1.465 -        machine. Today, this can be redone and expanded with =CORTEX=
   1.466 -        on an ordinary computer.
   1.467 +   - Evolution :: Karl Sims created a rich environment for simulating
   1.468 +        the evolution of creatures on a Connection Machine
   1.469 +        (\cite{sims-evolving-creatures}). Today, this can be redone
   1.470 +        and expanded with =CORTEX= on an ordinary computer.
   1.471     - Exotic senses  :: Cortex enables many fascinating senses that are
   1.472          not possible to build in the real world. For example,
   1.473          telekinesis is an interesting avenue to explore. You can also
   1.474 @@ -2457,7 +2459,7 @@
   1.475          an effector which creates an entire new sub-simulation where
   1.476          the creature has direct control over placement/creation of
   1.477          objects via simulated telekinesis. The creature observes this
   1.478 -        sub-world through it's normal senses and uses its observations
   1.479 +        sub-world through its normal senses and uses its observations
   1.480          to make predictions about its top level world.
   1.481     - Simulated prescience :: step the simulation forward a few ticks,
   1.482          gather sensory data, then supply this data for the creature as
   1.483 @@ -2470,25 +2472,24 @@
   1.484          with each other. Because the creatures would be simulated, you
   1.485          could investigate computationally complex rules of behavior
   1.486          which still, from the group's point of view, would happen in
   1.487 -        ``real time''. Interactions could be as simple as cellular
   1.488 +        real time. Interactions could be as simple as cellular
   1.489          organisms communicating via flashing lights, or as complex as
   1.490          humanoids completing social tasks, etc.
   1.491 -   - =HACKER= for writing muscle-control programs :: Presented with
   1.492 -        low-level muscle control/ sense API, generate higher level
   1.493 +   - =HACKER= for writing muscle-control programs :: Presented with a
   1.494 +        low-level muscle control / sense API, generate higher level
   1.495          programs for accomplishing various stated goals. Example goals
   1.496          might be "extend all your fingers" or "move your hand into the
   1.497          area with blue light" or "decrease the angle of this joint".
   1.498          It would be like Sussman's HACKER, except it would operate
   1.499          with much more data in a more realistic world. Start off with
   1.500          "calisthenics" to develop subroutines over the motor control
   1.501 -        API. This would be the "spinal chord" of a more intelligent
   1.502 -        creature. The low level programming code might be a turning
   1.503 -        machine that could develop programs to iterate over a "tape"
   1.504 -        where each entry in the tape could control recruitment of the
   1.505 -        fibers in a muscle.
   1.506 -   - Sense fusion    :: There is much work to be done on sense
    1.507 +        API. The low-level programming code might be a Turing machine
   1.508 +        that could develop programs to iterate over a "tape" where
   1.509 +        each entry in the tape could control recruitment of the fibers
   1.510 +        in a muscle.
   1.511 +   - Sense fusion :: There is much work to be done on sense
   1.512          integration -- building up a coherent picture of the world and
   1.513 -        the things in it with =CORTEX= as a base, you can explore
   1.514 +        the things in it. With =CORTEX= as a base, you can explore
   1.515          concepts like self-organizing maps or cross modal clustering
   1.516          in ways that have never before been tried.
   1.517     - Inverse kinematics :: experiments in sense guided motor control
   1.518 @@ -2761,7 +2762,7 @@
   1.519     jumping actually /is/.
   1.520  
   1.521     Of course, the action predicates are not directly applicable to
   1.522 -   video data which lacks the advanced sensory information which they
   1.523 +   video data, which lacks the advanced sensory information which they
   1.524     require!
   1.525  
   1.526     The trick now is to make the action predicates work even when the
   1.527 @@ -2858,7 +2859,8 @@
   1.528     #+END_EXAMPLE
   1.529  
   1.530     The worm's previous experience of lying on the ground and lifting
   1.531 -   its head generates possible interpretations for each frame: 
   1.532 +   its head generates possible interpretations for each frame (the
   1.533 +   numbers are experience-indices):
   1.534  
   1.535     #+BEGIN_EXAMPLE
   1.536     [ flat, flat, flat, flat, flat, flat, flat, lift-head ]
   1.537 @@ -2878,9 +2880,9 @@
   1.538     #+END_EXAMPLE
   1.539  
   1.540     The new path through \Phi-space is synthesized from two actual
   1.541 -   paths that the creature actually experiences, the "1-2-3-4" chain
   1.542 -   and the "6-7-8-9" chain. The "1-2-3-4" chain is necessary because
   1.543 -   it ends with the worm lifting its head. It originated from a short
   1.544 +   paths that the creature has experienced: the "1-2-3-4" chain and
   1.545 +   the "6-7-8-9" chain. The "1-2-3-4" chain is necessary because it
   1.546 +   ends with the worm lifting its head. It originated from a short
   1.547     training session where the worm rested on the floor for a brief
   1.548     while and then raised its head. The "6-7-8-9" chain is part of a
   1.549     longer chain of inactivity where the worm simply rested on the
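   One way to make this chain-stitching concrete (a greedy sketch, not
   the alignment procedure the thesis actually uses) is to walk the
   frames backwards from the end, at each step preferring whichever
   candidate experience-index extends a consecutive run:

   #+begin_src clojure
   (defn stitch-path
     "Pick one experience-index per frame, favoring consecutive chains.
      `frames` is a sequence of sets of candidate indices."
     [frames]
     (vec (reduce (fn [path candidates]
                    (let [succ (first path)
                          pred (when succ (dec succ))]
                      (cons (if (and pred (contains? candidates pred))
                              pred
                              (apply max candidates))
                            path)))
                  '()
                  (reverse frames))))

   ;; Seven "flat" frames (any resting experience fits) followed by one
   ;; "lift-head" frame (only experience 4 fits):
   (stitch-path (conj (vec (repeat 7 #{1 2 3 6 7 8 9})) #{4}))
   ;; => [6 7 8 9 1 2 3 4]
   #+end_src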
   1.550 @@ -3800,3 +3802,4 @@
   1.551  
   1.552  
   1.553  
   1.554 +TODO -- add a paper about detecting biological motion from only a few dots.