diff thesis/cortex.org @ 548:0b891e0dd809

version 0.2 of thesis complete.
author Robert McIntyre <rlm@mit.edu>
date Thu, 01 May 2014 23:41:41 -0400
parents 5d89879fc894
children c14545acdfba
     1.1 --- a/thesis/cortex.org	Mon Apr 28 15:10:59 2014 -0400
     1.2 +++ b/thesis/cortex.org	Thu May 01 23:41:41 2014 -0400
     1.3 @@ -513,7 +513,7 @@
     1.4     appears to flow at a constant rate, regardless of how complicated
     1.5     the environment becomes or how many creatures are in the
     1.6     simulation. The cost is that =CORTEX= can sometimes run slower than
     1.7 -   real time. Time dialation works both ways, however --- simulations
     1.8 +   real time. Time dilation works both ways, however --- simulations
     1.9     of very simple creatures in =CORTEX= generally run at 40x real-time
    1.10     on my machine!
    1.11  
    1.12 @@ -565,7 +565,7 @@
    1.13     each sense. 
    1.14  
    1.15     Fortunately this idea is already a well known computer graphics
    1.16 -   technique called /UV-mapping/. In UV-maping, the three-dimensional
    1.17 +   technique called /UV-mapping/. In UV-mapping, the three-dimensional
    1.18     surface of a model is cut and smooshed until it fits on a
    1.19     two-dimensional image. You paint whatever you want on that image,
    1.20     and when the three-dimensional shape is rendered in a game the
    1.21 @@ -2814,7 +2814,7 @@
    1.22  
    1.23     The worm's total life experience is a long looping path through
    1.24     \Phi-space. I will now introduce simple way of taking that
    1.25 -   experiece path and building a function that can infer complete
    1.26 +   experience path and building a function that can infer complete
    1.27     sensory experience given only a stream of proprioceptive data. This
    1.28     /empathy/ function will provide a bridge to use the body centered
    1.29     action predicates on video-like streams of information.
    1.30 @@ -2972,7 +2972,7 @@
    1.31     entries in a proprioceptive bin, because for each element in the
    1.32     starting bin it performs a series of set lookups in the preceding
    1.33     bins. If the total history is limited, then this takes time
    1.34 -   proprotional to a only a constant multiple of the number of entries
    1.35 +   proportional to only a constant multiple of the number of entries
    1.36     in the starting bin. This analysis also applies, even if the action
    1.37     requires multiple longest chains -- it's still the average number
    1.38     of entries in a proprioceptive bin times the desired chain length.
    1.39 @@ -3125,7 +3125,7 @@
    1.40     the testing environment for the action-predicates, with one major
    1.41     difference : the only sensory information available to the system
    1.42     is proprioception. From just the proprioception data and
    1.43 -   \Phi-space, =longest-thread= synthesises a complete record the last
    1.44 +   \Phi-space, =longest-thread= synthesizes a complete record of the last
    1.45     300 sensory experiences of the worm. These synthesized experiences
    1.46     are fed directly into the action predicates =grand-circle?=,
    1.47     =curled?=, =wiggling?=, and =resting?= from before and their output
    1.48 @@ -3365,13 +3365,11 @@
    1.49     [[./images/worm-roll.png]]
    1.50  
    1.51     #+caption: After completing its adventures, the worm now knows 
    1.52 -   #+caption: how its touch sensors are arranged along its skin. These 
    1.53 -   #+caption: are the regions that were deemed important by 
    1.54 +   #+caption: how its touch sensors are arranged along its skin. Each of these six rectangles is a touch sensory pattern that was 
    1.55 +   #+caption: deemed important by 
    1.56     #+caption: =learn-touch-regions=. Each white square in the rectangles 
    1.57     #+caption: above is a cluster of ``related" touch nodes as determined 
    1.58 -   #+caption: by the system. Since each square in the ``cross" corresponds 
    1.59 -   #+caption: to a face, the worm has correctly discovered that it has
    1.60 -   #+caption: six faces.
    1.61 +   #+caption: by the system. The worm has correctly discovered that it has six faces and has partitioned its sensory map accordingly. 
    1.62     #+name: worm-touch-map
    1.63     #+ATTR_LaTeX: :width 12cm
    1.64     [[./images/touch-learn.png]]
    1.65 @@ -3383,29 +3381,133 @@
    1.66     completely scrambled. The cross shape is just for convenience. This
    1.67     example justifies the use of pre-defined touch regions in =EMPATH=.
    1.68  
    1.69 +** Recognizing an object using embodied representation
    1.70 +
    1.71 +   At the beginning of the thesis, I suggested that we might recognize
    1.72 +   the chair in Figure \ref{hidden-chair} by imagining ourselves in
    1.73 +   the position of the man and realizing that he must be sitting on
    1.74 +   something in order to maintain that position. Here, I present a
    1.75 +   brief elaboration on how this might be done.
    1.76 +
    1.77 +   First, I need the feeling of leaning or resting /on/ some other
    1.78 +   object that is not the floor. This feeling is easy to describe
    1.79 +   using an embodied representation. 
    1.80 +
    1.81 +  #+caption: Program describing the sense of leaning or resting on something.
    1.82 +  #+caption: This involves a relaxed posture, the feeling of touching something, 
    1.83 +  #+caption: and a period of stability where the worm does not move.
    1.84 +  #+name: draped
    1.85 +  #+begin_listing clojure
    1.86 +  #+begin_src clojure
    1.87 +(defn draped?
    1.88 +  "Is the worm:
    1.89 +    -- not flat (the floor is not a 'chair')
    1.90 +    -- supported (not using its muscles to hold its position)
    1.91 +    -- stable (not changing its position)
    1.92 +    -- touching something (must register contact)"
    1.93 +  [experiences]
    1.94 +  (let [b2-hash (bin 2)
    1.95 +        touch (:touch (peek experiences))
    1.96 +        total-contact
    1.97 +        (reduce
    1.98 +         +
    1.99 +         (map #(contact all-touch-coordinates %)
   1.100 +              (rest touch)))]
   1.101 +    (println total-contact)
   1.102 +    (and (not (resting? experiences))
   1.103 +         (every?
   1.104 +          zero?
   1.105 +          (-> experiences
   1.106 +              (vector:last-n 25)
   1.107 +              (#(map :muscle %))
   1.108 +              (flatten)))
   1.109 +         (-> experiences
   1.110 +             (vector:last-n 20)
   1.111 +             (#(map (comp b2-hash flatten :proprioception) %))
   1.112 +             (set)
   1.113 +             (count) (= 1))
   1.114 +         (< 0.03 total-contact))))
   1.115 +  #+end_src
   1.116 +  #+end_listing
   1.117 +   
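+   To make the role of =draped?= concrete, here is a rough sketch of
+   how it could be polled alongside the earlier action predicates over
+   the same experience vector. The names =action-predicates= and
+   =recognized-actions= are just for illustration; the sketch assumes
+   =grand-circle?=, =curled?=, =wiggling?=, =resting?=, and =draped?=
+   are all in scope as defined above.
+
+  #+begin_src clojure
+;; Sketch only: pair a keyword with each embodied action predicate
+;; defined in the text (each one takes the full experience vector).
+(def action-predicates
+  {:grand-circle grand-circle?
+   :curled       curled?
+   :wiggling     wiggling?
+   :resting      resting?
+   :draped       draped?})
+
+;; Return the set of action keywords whose predicates currently hold.
+(defn recognized-actions
+  [experiences]
+  (set (for [[action holds?] action-predicates
+             :when (holds? experiences)]
+         action)))
+  #+end_src
+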
   1.118 +   #+caption: The =draped?= predicate detects the presence of the
   1.119 +   #+caption: cube whenever the worm interacts with it. The details of the 
   1.120 +   #+caption: cube are irrelevant; only the way it influences the worm's
   1.121 +   #+caption: body matters.
   1.122 +   #+name: draped-video
   1.123 +   #+ATTR_LaTeX: :width 13cm
   1.124 +   [[./images/draped.png]]
   1.125 +
   1.126 +   Though this is a simple example, using the =draped?= predicate to
   1.127 +   detect the cube has interesting advantages. The =draped?= predicate
   1.128 +   describes the cube not in terms of properties that the cube has,
   1.129 +   but instead in terms of how the worm interacts with it physically.
   1.130 +   This means that the cube can still be detected even if it is not
   1.131 +   visible, as long as its influence on the worm's body is visible.
   1.132 +   
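+   As a rough sketch of what detecting the invisible could look like
+   in code, one could scan the worm's experience history and report
+   the frames during which =draped?= holds, that is, the intervals
+   during which /something/ is supporting the worm, whether or not it
+   is visible. The name =support-frames= is just for illustration.
+
+  #+begin_src clojure
+;; Sketch only: indices of the frames at which the worm appears to be
+;; draped over some supporting object. Each prefix of the history is
+;; passed to =draped?=, which looks only at the most recent frames.
+(defn support-frames
+  [experiences]
+  (filter
+   (fn [frame]
+     (draped? (vec (take (inc frame) experiences))))
+   ;; skip the earliest frames, which are shorter than the look-back
+   ;; windows used inside =draped?=
+   (range 30 (count experiences))))
+  #+end_src
+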
   1.133 +   This system will also see the virtual cube created by a
   1.134 +   ``mimeworm", which uses its muscles in a very controlled way to
   1.135 +   mimic the appearance of leaning on a cube. The system will
   1.136 +   anticipate that there is an actual invisible cube that provides
   1.137 +   support!
   1.138 +
   1.139 +   #+caption: Can you see the thing that this person is leaning on?
   1.140 +   #+caption: What properties does it have, other than how it makes the man's 
   1.141 +   #+caption: elbow and shoulder feel? I wonder if people who can actually 
   1.142 +   #+caption: maintain this pose easily still see the support?
   1.143 +   #+name: mime
   1.144 +   #+ATTR_LaTeX: :width 6cm
   1.145 +   [[./images/pablo-the-mime.png]]
   1.146 +
   1.147 +   This makes me wonder about the psychology of actual mimes. Suppose
   1.148 +   for a moment that people have something analogous to \Phi-space and
   1.149 +   that one of the ways that they find objects in a scene is by their
   1.150 +   relation to other people's bodies. Suppose that someone watches a
   1.151 +   person miming an invisible wall. For a person with no experience
   1.152 +   with miming, their \Phi-space will only have entries that pair the
   1.153 +   wall-miming position with the sensation of touching a wall. This
   1.154 +   sensation of touch will create a strong impression of a wall, even
   1.155 +   though the wall would have to be invisible. A person with
   1.156 +   experience in miming, however, will have entries in their \Phi-space
   1.157 +   that describe the wall-miming position without a sense of touch. It
   1.158 +   will not seem to such a person that an invisible wall is present,
   1.159 +   but merely that the mime is holding out their hands in a special
   1.160 +   way. Thus, the theory that humans use something like \Phi-space
   1.161 +   weakly predicts that learning how to mime should break the power of
   1.162 +   miming illusions. Most optical illusions still work no matter how
   1.163 +   much you know about them, so this proposal would be quite
   1.164 +   interesting to test, as it predicts a non-standard result!
   1.165 +
   1.166 +
   1.167 +#+BEGIN_LaTeX
   1.168 +\clearpage
   1.169 +#+END_LaTeX
   1.170 +
   1.171  * Contributions
   1.172 +
   1.173 +  The big idea behind this thesis is a new way to represent and
   1.174 +  recognize physical actions, which I call /empathic representation/.
   1.175 +  Actions are represented as predicates which have access to the
   1.176 +  totality of a creature's sensory abilities. To recognize the
   1.177 +  physical actions of another creature similar to yourself, you
   1.178 +  imagine what they would feel by examining the position of their body
   1.179 +  and relating it to your own previous experience.
   1.180    
   1.181 -  The big idea behind this thesis is a new way to represent and
   1.182 -  recognize physical actions -- empathic representation. Actions are
   1.183 -  represented as predicates which have available the totality of a
   1.184 -  creature's sensory abilities. To recognize the physical actions of
   1.185 -  another creature similar to yourself, you imagine what they would
   1.186 -  feel by examining the position of their body and relating it to your
   1.187 -  own previous experience.
   1.188 -  
   1.189 -  Empathic description of physical actions is very robust and general.
   1.190 -  Because the representation is body-centered, it avoids the fragility
   1.191 -  of learning from example videos. Because it relies on all of a
   1.192 +  Empathic representation of physical actions is robust and general.
   1.193 +  Because the representation is body-centered, it avoids baking in a
   1.194 +  particular viewpoint, as learning from example videos tends to do.
   1.195 +  Because empathic representation relies on all of a
   1.196    creature's senses, it can describe exactly what an action /feels
   1.197    like/ without getting caught up in irrelevant details such as visual
   1.198    appearance. I think it is important that a correct description of
   1.199 -  jumping (for example) should not waste even a single bit on the
   1.200 -  color of a person's clothes or skin; empathic representation can
   1.201 -  avoid this waste by describing jumping in terms of touch, muscle
   1.202 -  contractions, and the brief feeling of weightlessness. Empathic
   1.203 -  representation is very low-level in that it describes actions using
   1.204 -  concrete sensory data with little abstraction, but it has the
   1.205 -  generality of much more abstract representations!
   1.206 +  jumping (for example) should not include irrelevant details such as
   1.207 +  the color of a person's clothes or skin; empathic representation can
   1.208 +  get right to the heart of what jumping is by describing it in terms
   1.209 +  of touch, muscle contractions, and a brief feeling of
   1.210 +  weightlessness. Empathic representation is very low-level in that it
   1.211 +  describes actions using concrete sensory data with little
   1.212 +  abstraction, but it has the generality of much more abstract
   1.213 +  representations!
   1.214  
   1.215    Another important contribution of this thesis is the development of
   1.216    the =CORTEX= system, a complete environment for creating simulated
   1.217 @@ -3413,29 +3515,31 @@
   1.218    proprioception, hearing, vision, and muscle tension. You have seen
   1.219    how to create new creatures using blender, a 3D modeling tool.
   1.220  
   1.221 -  I hope that =CORTEX= will be useful in further research projects. To
   1.222 -  this end I have included the full source to =CORTEX= along with a
   1.223 -  large suite of tests and examples. I have also created a user guide
   1.224 -  for =CORTEX= which is included in an appendix to this thesis.
   1.225 -
   1.226    As a minor digression, you also saw how I used =CORTEX= to enable a
   1.227    tiny worm to discover the topology of its skin simply by rolling on
   1.228 -  the ground. 
   1.229 -
   1.230 -  In conclusion, the main contributions of this thesis are:
   1.231 -
   1.232 -   - =CORTEX=, a comprehensive platform for embodied AI experiments.
   1.233 -     =CORTEX= supports many features lacking in other systems, such
   1.234 -     proper simulation of hearing. It is easy to create new =CORTEX=
   1.235 -     creatures using Blender, a free 3D modeling program.
   1.236 -
   1.237 -   - =EMPATH=, which uses =CORTEX= to identify the actions of a
   1.238 -     worm-like creature using a computational model of empathy. This
   1.239 -     empathic representation of actions is an important new kind of
   1.240 -     representation for physical actions.
   1.241 +  the ground.  You also saw how to detect objects using only embodied
   1.242 +  predicates. 
   1.243 +
   1.244 +  In conclusion, for this thesis I:
   1.245 +
   1.246 +   - Developed the idea of embodied representation, which describes
   1.247 +     actions that a creature can do in terms of first-person sensory
   1.248 +     data. 
   1.249 +
   1.250 +   - Developed a method of empathic action recognition which uses
   1.251 +     previous embodied experience and embodied representation of
   1.252 +     actions to greatly constrain the possible interpretations of an
   1.253 +     action.
   1.254 +
   1.255 +   - Created =EMPATH=, a program which uses empathic action
   1.256 +     recognition to recognize physical actions in a simple model
   1.257 +     involving segmented worm-like creatures.
   1.258 +
   1.259 +   - Created =CORTEX=, a comprehensive platform for embodied AI
   1.260 +     experiments. It is the base on which =EMPATH= is built.
   1.261  
   1.262  #+BEGIN_LaTeX
   1.263 -\newpage
   1.264 +\clearpage
   1.265  \appendix
   1.266  #+END_LaTeX
   1.267