comparison thesis/cortex.org @ 451:0a4362d1f138
finishing up chapter 3.
author | Robert McIntyre <rlm@mit.edu> |
---|---|
date | Wed, 26 Mar 2014 20:38:17 -0400 |
parents | 432f2c4646cb |
children | f339e3d5cc8c |
450:432f2c4646cb | 451:0a4362d1f138 |
---|---|
1 #+title: =CORTEX= | 1 #+title: =CORTEX= |
2 #+author: Robert McIntyre | 2 #+author: Robert McIntyre |
3 #+email: rlm@mit.edu | 3 #+email: rlm@mit.edu |
4 #+description: Using embodied AI to facilitate Artificial Imagination. | 4 #+description: Using embodied AI to facilitate Artificial Imagination. |
5 #+keywords: AI, clojure, embodiment | 5 #+keywords: AI, clojure, embodiment |
6 | 6 #+LaTeX_CLASS_OPTIONS: [nofloat] |
7 | 7 |
8 * Empathy and Embodiment as problem solving strategies | 8 * Empathy and Embodiment as problem solving strategies |
9 | 9 |
10 By the end of this thesis, you will have seen a novel approach to | 10 By the end of this thesis, you will have seen a novel approach to |
11 interpreting video using embodiment and empathy. You will have also | 11 interpreting video using embodiment and empathy. You will have also |
12 seen one way to efficiently implement empathy for embodied | 12 seen one way to efficiently implement empathy for embodied |
13 creatures. Finally, you will become familiar with =CORTEX=, a system | 13 creatures. Finally, you will become familiar with =CORTEX=, a system |
295 - evolutionary algorithms involving creature construction | 295 - evolutionary algorithms involving creature construction |
296 - exploration of exotic senses and effectors that are not possible | 296 - exploration of exotic senses and effectors that are not possible |
297 in the real world (such as telekinesis or a semantic sense) | 297 in the real world (such as telekinesis or a semantic sense) |
298 - imagination using subworlds | 298 - imagination using subworlds |
299 | 299 |
300 During one test with =CORTEX=, I created 3,000 entities each with | 300 During one test with =CORTEX=, I created 3,000 creatures each with |
301 their own independent senses and ran them all at only 1/80 real | 301 their own independent senses and ran them all at only 1/80 real |
302 time. In another test, I created a detailed model of my own hand, | 302 time. In another test, I created a detailed model of my own hand, |
303 equipped with a realistic distribution of touch (more sensitive at | 303 equipped with a realistic distribution of touch (more sensitive at |
304 the fingertips), as well as eyes and ears, and it ran at around 1/4 | 304 the fingertips), as well as eyes and ears, and it ran at around 1/4 |
305 real time. | 305 real time. |
306 | 306 |
307 #+BEGIN_LaTeX | 307 #+BEGIN_LaTeX |
308 \begin{sidewaysfigure} | 308 \begin{sidewaysfigure} |
309 \includegraphics[width=9.5in]{images/full-hand.png} | 309 \includegraphics[width=9.5in]{images/full-hand.png} |
310 \caption{Here is the worm from above modeled in Blender, | 310 \caption{ |
311 a free 3D-modeling program. Senses and joints are described | 311 I modeled my own right hand in Blender and rigged it with all the |
312 using special nodes in Blender. The senses are displayed on | 312 senses that {\tt CORTEX} supports. My simulated hand has a |
313 the right, and the simulation is displayed on the left. Notice | 313 biologically inspired distribution of touch sensors. The senses are |
314 that the hand is curling its fingers, that it can see its own | 314 displayed on the right, and the simulation is displayed on the |
315 finger from the eye in its palm, and thta it can feel its own | 315 left. Notice that my hand is curling its fingers, that it can see |
316 thumb touching its palm.} | 316 its own finger from the eye in its palm, and that it can feel its |
317 own thumb touching its palm.} | |
317 \end{sidewaysfigure} | 318 \end{sidewaysfigure} |
318 #+END_LaTeX | 319 #+END_LaTeX |
319 | 320 |
320 ** Contributions | 321 ** Contributions |
321 | 322 |
322 I built =CORTEX=, a comprehensive platform for embodied AI | 323 - I built =CORTEX=, a comprehensive platform for embodied AI |
323 experiments. =CORTEX= many new features lacking in other systems, | 324 experiments. =CORTEX= supports many features lacking in other |
324 such as sound. It is easy to create new creatures using Blender, a | 325 systems, such as proper simulation of hearing. It is easy to create |
325 free 3D modeling program. | 326 new =CORTEX= creatures using Blender, a free 3D modeling program. |
326 | 327 |
327 I built =EMPATH=, which uses =CORTEX= to identify the actions of a | 328 - I built =EMPATH=, which uses =CORTEX= to identify the actions of |
328 worm-like creature using a computational model of empathy. | 329 a worm-like creature using a computational model of empathy. |
329 | 330 |
330 * Building =CORTEX= | 331 * Building =CORTEX= |
331 | 332 |
332 ** To explore embodiment, we need a world, body, and senses | 333 ** To explore embodiment, we need a world, body, and senses |
333 | 334 |
370 | 371 |
371 #+caption: Here is the worm with which we will be working. | 372 #+caption: Here is the worm with which we will be working. |
372 #+caption: It is composed of 5 segments. Each segment has a | 373 #+caption: It is composed of 5 segments. Each segment has a |
373 #+caption: pair of extensor and flexor muscles. Each of the | 374 #+caption: pair of extensor and flexor muscles. Each of the |
374 #+caption: worm's four joints is a hinge joint which allows | 375 #+caption: worm's four joints is a hinge joint which allows |
375 #+caption: 30 degrees of rotation to either side. Each segment | 376 #+caption: about 30 degrees of rotation to either side. Each segment |
376 #+caption: of the worm is touch-capable and has a uniform | 377 #+caption: of the worm is touch-capable and has a uniform |
377 #+caption: distribution of touch sensors on each of its faces. | 378 #+caption: distribution of touch sensors on each of its faces. |
378 #+caption: Each joint has a proprioceptive sense to detect | 379 #+caption: Each joint has a proprioceptive sense to detect |
379 #+caption: relative positions. The worm segments are all the | 380 #+caption: relative positions. The worm segments are all the |
380 #+caption: same except for the first one, which has a much | 381 #+caption: same except for the first one, which has a much |
416 Embodied representations using multiple senses such as touch, | 417 Embodied representations using multiple senses such as touch, |
417 proprioception, and muscle tension turn out to be exceedingly | 418 proprioception, and muscle tension turn out to be exceedingly |
418 efficient at describing body-centered actions. It is the ``right | 419 efficient at describing body-centered actions. It is the ``right |
419 language for the job''. For example, it takes only around 5 lines | 420 language for the job''. For example, it takes only around 5 lines |
420 of LISP code to describe the action of ``curling'' using embodied | 421 of LISP code to describe the action of ``curling'' using embodied |
421 primitives. It takes about 8 lines to describe the seemingly | 422 primitives. It takes about 10 lines to describe the seemingly |
422 complicated action of wiggling. | 423 complicated action of wiggling. |
423 | 424 |
424 The following action predicates each take a stream of sensory | 425 The following action predicates each take a stream of sensory |
425 experience, observe however much of it they desire, and decide | 426 experience, observe however much of it they desire, and decide |
426 whether the worm is doing the action they describe. =curled?= | 427 whether the worm is doing the action they describe. =curled?= |
576 #+end_src | 577 #+end_src |
577 #+end_listing | 578 #+end_listing |
578 | 579 |
579 #+caption: Using =debug-experience=, the body-centered predicates | 580 #+caption: Using =debug-experience=, the body-centered predicates |
580 #+caption: work together to classify the behaviour of the worm. | 581 #+caption: work together to classify the behaviour of the worm. |
581 #+caption: while under manual motor control. | 582 #+caption: The predicates are operating with access to the worm's |
583 #+caption: full sensory data. | |
582 #+name: basic-worm-view | 584 #+name: basic-worm-view |
583 #+ATTR_LaTeX: :width 10cm | 585 #+ATTR_LaTeX: :width 10cm |
584 [[./images/worm-identify-init.png]] | 586 [[./images/worm-identify-init.png]] |
585 | 587 |
586 These action predicates satisfy the recognition requirement of an | 588 These action predicates satisfy the recognition requirement of an |
587 empathic recognition system. There is a lot of power in the | 589 empathic recognition system. There is power in the simplicity of |
588 simplicity of the action predicates. They describe their actions | 590 the action predicates. They describe their actions without getting |
589 without getting confused in visual details of the worm. Each one is | 591 confused in visual details of the worm. Each one is frame |
590 frame independent, but more than that, they are each independent of | 592 independent, but more than that, they are each independent of |
591 irrelevant visual details of the worm and the environment. They | 593 irrelevant visual details of the worm and the environment. They |
592 will work regardless of whether the worm is a different color or | 594 will work regardless of whether the worm is a different color or |
593 heavily textured, or of the environment has strange lighting. | 595 heavily textured, or if the environment has strange lighting. |
594 | 596 |
595 The trick now is to make the action predicates work even when the | 597 The trick now is to make the action predicates work even when the |
596 sensory data on which they depend is absent. If I can do that, then | 598 sensory data on which they depend is absent. If I can do that, then |
597 I will have gained much. | 599 I will have gained much. |
598 | 600 |
599 ** \Phi-space describes the worm's experiences | 601 ** \Phi-space describes the worm's experiences |
600 | 602 |
601 As a first step towards building empathy, I need to gather all of | 603 As a first step towards building empathy, I need to gather all of |
602 the worm's experiences during free play. I use a simple vector to | 604 the worm's experiences during free play. I use a simple vector to |
603 store all the experiences. | 605 store all the experiences. |
604 | |
605 #+caption: Program to gather the worm's experiences into a vector for | |
606 #+caption: further processing. The =motor-control-program= line uses | |
607 #+caption: a motor control script that causes the worm to execute a series | |
608 #+caption: of ``exercices'' that include all the action predicates. | |
609 #+name: generate-phi-space | |
610 #+begin_listing clojure | |
611 #+begin_src clojure | |
612 (defn generate-phi-space [] | |
613 (let [experiences (atom [])] | |
614 (run-world | |
615 (apply-map | |
616 worm-world | |
617 (merge | |
618 (worm-world-defaults) | |
619 {:end-frame 700 | |
620 :motor-control | |
621 (motor-control-program worm-muscle-labels do-all-the-things) | |
622 :experiences experiences}))) | |
623 @experiences)) | |
624 #+end_src | |
625 #+end_listing | |
626 | 606 |
627 Each element of the experience vector exists in the vast space of | 607 Each element of the experience vector exists in the vast space of |
628 all possible worm-experiences. Most of this vast space is actually | 608 all possible worm-experiences. Most of this vast space is actually |
629 unreachable due to physical constraints of the worm's body. For | 609 unreachable due to physical constraints of the worm's body. For |
630 example, the worm's segments are connected by hinge joints that put | 610 example, the worm's segments are connected by hinge joints that put |
631 a practical limit on the worm's degrees of freedom. Also, the worm | 611 a practical limit on the worm's range of motions without limiting |
632 can not be bent into a circle so that its ends are touching and at | 612 its degrees of freedom. Some groupings of senses are impossible; |
633 the same time not also experience the sensation of touching itself. | 613 the worm can not be bent into a circle so that its ends are |
634 | 614 touching and at the same time not also experience the sensation of |
635 As the worm moves around during free play and the vector grows | 615 touching itself. |
636 larger, the vector begins to define a subspace which is all the | 616 |
637 practical experiences the worm can experience during normal | 617 As the worm moves around during free play and its experience vector |
638 operation, which I call \Phi-space, short for physical-space. The | 618 grows larger, the vector begins to define a subspace which is all |
639 vector defines a path through \Phi-space. This path has interesting | 619 the sensations the worm can practically experience during normal |
640 properties that all derive from embodiment. The proprioceptive | 620 operation. I call this subspace \Phi-space, short for |
641 components are completely smooth, because in order for the worm to | 621 physical-space. The experience vector defines a path through |
642 move from one position to another, it must pass through the | 622 \Phi-space. This path has interesting properties that all derive |
643 intermediate positions. The path invariably forms loops as actions | 623 from physical embodiment. The proprioceptive components are |
644 are repeated. Finally and most importantly, proprioception actually | 624 completely smooth, because in order for the worm to move from one |
645 gives very strong inference about the other senses. For example, | 625 position to another, it must pass through the intermediate |
646 when the worm is flat, you can infer that it is touching the ground | 626 positions. The path invariably forms loops as actions are repeated. |
647 and that its muscles are not active, because if the muscles were | 627 Finally and most importantly, proprioception actually gives very |
648 active, the worm would be moving and would not be perfectly flat. | 628 strong inference about the other senses. For example, when the worm |
649 In order to stay flat, the worm has to be touching the ground, or | 629 is flat, you can infer that it is touching the ground and that its |
650 it would again be moving out of the flat position due to gravity. | 630 muscles are not active, because if the muscles were active, the |
651 If the worm is positioned in such a way that it interacts with | 631 worm would be moving and would not be perfectly flat. In order to |
652 itself, then it is very likely to be feeling the same tactile | 632 stay flat, the worm has to be touching the ground, or it would |
653 feelings as the last time it was in that position, because it has | 633 again be moving out of the flat position due to gravity. If the |
654 the same body as then. If you observe multiple frames of | 634 worm is positioned in such a way that it interacts with itself, |
655 proprioceptive data, then you can become increasingly confident | 635 then it is very likely to be feeling the same tactile feelings as |
656 about the exact activations of the worm's muscles, because it | 636 the last time it was in that position, because it has the same body |
657 generally takes a unique combination of muscle contractions to | 637 as then. If you observe multiple frames of proprioceptive data, |
658 transform the worm's body along a specific path through \Phi-space. | 638 then you can become increasingly confident about the exact |
639 activations of the worm's muscles, because it generally takes a | |
640 unique combination of muscle contractions to transform the worm's | |
641 body along a specific path through \Phi-space. | |
659 | 642 |
660 There is a simple way of taking \Phi-space and the total ordering | 643 There is a simple way of taking \Phi-space and the total ordering |
661 provided by an experience vector and reliably inferring the rest of | 644 provided by an experience vector and reliably inferring the rest of |
662 the senses. | 645 the senses. |
663 | 646 |
664 ** Empathy is the process of tracing through \Phi-space | 647 ** Empathy is the process of tracing through \Phi-space |
665 | 648 |
666 Here is the core of a basic empathy algorithm, starting with an | 649 Here is the core of a basic empathy algorithm, starting with an |
667 experience vector: First, group the experiences into tiered | 650 experience vector: |
668 proprioceptive bins. I use powers of 10 and 3 bins, and the | 651 |
669 smallest bin has and approximate size of 0.001 radians in all | 652 First, group the experiences into tiered proprioceptive bins. I use |
670 proprioceptive dimensions. | 653 powers of 10 and 3 bins, and the smallest bin has an approximate |
654 size of 0.001 radians in all proprioceptive dimensions. | |
671 | 655 |
672 Then, given a sequence of proprioceptive input, generate a set of | 656 Then, given a sequence of proprioceptive input, generate a set of |
673 matching experience records for each input. | 657 matching experience records for each input, using the tiered |
658 proprioceptive bins. | |
674 | 659 |
675 Finally, to infer sensory data, select the longest consecutive chain | 660 Finally, to infer sensory data, select the longest consecutive chain |
676 of experiences as determined by the indexes into the experience | 661 of experiences. Consecutive means that the experiences |
677 vector. | 662 appear next to each other in the experience vector. |
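To make the binning step concrete, here is a hypothetical sketch of tiered bin keys (the function names, and any tier sizes beyond those stated above, are illustrative assumptions, not the thesis code):

#+begin_src clojure
;; Hypothetical sketch: map a proprioceptive snapshot (a seq of joint
;; angles in radians) to a bin key by rounding every angle to a given
;; granularity.  Three tiers, powers of ten down to 0.001 radians.
(def bin-sizes [0.1 0.01 0.001])

(defn bin-key [size angles]
  (mapv #(Math/round (/ % size)) angles))

;; Snapshots that differ by less than the tier size share a key, so a
;; noisy proprioceptive reading still retrieves nearby experiences
;; from the coarser tiers.
(map #(bin-key % [0.52 -0.13 0.77]) bin-sizes)
#+end_src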
678 | 663 |
679 This algorithm has three advantages: | 664 This algorithm has three advantages: |
680 | 665 |
681 1. It's simple | 666 1. It's simple |
682 | 667 |
683 2. It's very fast -- both tracing through possibilities and | 668 2. It's very fast -- retrieving possible interpretations takes |
684 retrieving possible interpretations take essentially constant | 669 constant time. Tracing through chains of interpretations takes |
685 time. | 670 time proportional to the average number of experiences in a |
671 proprioceptive bin. Redundant experiences in \Phi-space can be | |
672 merged to save computation. | |
686 | 673 |
687 3. It protects from wrong interpretations of transient ambiguous | 674 3. It protects from wrong interpretations of transient ambiguous |
688 proprioceptive data : for example, if the worm is flat for just | 675 proprioceptive data. For example, if the worm is flat for just |
689 an instant, this flatness will not be interpreted as implying | 676 an instant, this flatness will not be interpreted as implying |
690 that the worm has its muscles relaxed, since the flatness is | 677 that the worm has its muscles relaxed, since the flatness is |
691 part of a longer chain which includes a distinct pattern of | 678 part of a longer chain which includes a distinct pattern of |
692 muscle activation. A memoryless statistical model such as a | 679 muscle activation. Markov chains or other memoryless statistical |
693 markov model that operates on individual frames may very well | 680 models that operate on individual frames may very well make this |
694 make this mistake. | 681 mistake. |
695 | 682 |
696 #+caption: Program to convert an experience vector into a | 683 #+caption: Program to convert an experience vector into a |
697 #+caption: proprioceptively binned lookup function. | 684 #+caption: proprioceptively binned lookup function. |
698 #+name: bin | 685 #+name: bin |
699 #+begin_listing clojure | 686 #+begin_listing clojure |
723 (fn lookup [proprio-data] | 710 (fn lookup [proprio-data] |
724 (set (some #(% proprio-data) lookups))))) | 711 (set (some #(% proprio-data) lookups))))) |
725 #+end_src | 712 #+end_src |
726 #+end_listing | 713 #+end_listing |
727 | 714 |
715 #+caption: =longest-thread= finds the longest path of consecutive | |
716 #+caption: experiences to explain proprioceptive worm data. | |
717 #+name: phi-space-history-scan | |
718 #+ATTR_LaTeX: :width 10cm | |
719 [[./images/aurellem-gray.png]] | |
720 | |
721 =longest-thread= infers sensory data by stitching together pieces | |
722 from previous experience. It prefers longer chains of previous | |
723 experience to shorter ones. For example, during training the worm | |
724 might rest on the ground for one second before it performs its | |
725 exercises. If during recognition the worm rests on the ground for |
726 five seconds, =longest-thread= will accommodate this five second |
727 rest period by looping the one second rest chain five times. | |
728 | |
729 =longest-thread= takes time proportional to the average number of |
730 entries in a proprioceptive bin, because for each element in the |
731 starting bin it performs a series of set lookups in the preceding |
732 bins. If the total history is limited, then this is only a constant | |
733 multiple times the number of entries in the starting bin. This | |
734 analysis also applies even if the action requires multiple longest | |
735 chains -- it's still the average number of entries in a | |
736 proprioceptive bin times the desired chain length. Because | |
737 =longest-thread= is so efficient and simple, I can interpret | |
738 worm-actions in real time. | |
728 | 739 |
729 #+caption: Program to calculate empathy by tracing through \Phi-space | 740 #+caption: Program to calculate empathy by tracing through \Phi-space |
730 #+caption: and finding the longest (i.e. most coherent) interpretation | 741 #+caption: and finding the longest (i.e. most coherent) interpretation |
731 #+caption: of the data. | 742 #+caption: of the data. |
732 #+name: longest-thread | 743 #+name: longest-thread |
759 (recur (concat longest-thread result) | 770 (recur (concat longest-thread result) |
760 (drop (count longest-thread) phi-index-sets)))))) | 771 (drop (count longest-thread) phi-index-sets)))))) |
761 #+end_src | 772 #+end_src |
762 #+end_listing | 773 #+end_listing |
763 | 774 |
764 | 775 There is one final piece, which is to replace missing sensory data |
765 There is one final piece, which is to replace missing sensory data | 776 with a best-guess estimate. While I could fill in missing data by |
766 with a best-guess estimate. While I could fill in missing data by | 777 using a gradient over the closest known sensory data points, |
767 using a gradient over the closest known sensory data points, averages | 778 averages can be misleading. It is certainly possible to create an |
768 can be misleading. It is certainly possible to create an impossible | 779 impossible sensory state by averaging two possible sensory states. |
769 sensory state by averaging two possible sensory states. Therefore, I | 780 Therefore, I simply replicate the most recent sensory experience to |
770 simply replicate the most recent sensory experience to fill in the | 781 fill in the gaps. |
771 gaps. | |
772 | 782 |
773 #+caption: Fill in blanks in sensory experience by replicating the most | 783 #+caption: Fill in blanks in sensory experience by replicating the most |
774 #+caption: recent experience. | 784 #+caption: recent experience. |
775 #+name: infer-nils | 785 #+name: infer-nils |
776 #+begin_listing clojure | 786 #+begin_listing clojure |
787 (recur (dec i) v) | 797 (recur (dec i) v) |
788 (recur (dec i) (assoc! v (dec i) cur))) | 798 (recur (dec i) (assoc! v (dec i) cur))) |
789 (recur i (assoc! v i 0)))))) | 799 (recur i (assoc! v i 0)))))) |
790 #+end_src | 800 #+end_src |
791 #+end_listing | 801 #+end_listing |
792 | |
793 | 802 |
794 ** Efficient action recognition with =EMPATH= | 803 ** Efficient action recognition with =EMPATH= |
795 | 804 |
796 In my exploration with the worm, I can generally infer actions from | 805 To use =EMPATH= with the worm, I first need to gather a set of |
797 proprioceptive data exactly as well as when I have the complete | 806 experiences from the worm that includes the actions I want to |
798 sensory data. To reach this level, I have to train the worm with | 807 recognize. The =generate-phi-space= program (listing |
799 various exercises for about 1 minute. | 808 \ref{generate-phi-space}) runs the worm through a series of |
809 exercises and gathers those experiences into a vector. The |
810 =do-all-the-things= program is a routine expressed in a simple | |
811 muscle contraction script language for automated worm control. | |
812 | |
813 #+caption: Program to gather the worm's experiences into a vector for | |
814 #+caption: further processing. The =motor-control-program= line uses | |
815 #+caption: a motor control script that causes the worm to execute a series | |
816 #+caption: of ``exercises'' that include all the action predicates. |
817 #+name: generate-phi-space | |
818 #+ATTR_LaTeX: :placement [!H] |
819 #+begin_listing clojure | |
820 #+begin_src clojure | |
821 (def do-all-the-things | |
822 (concat | |
823 curl-script | |
824 [[300 :d-ex 40] | |
825 [320 :d-ex 0]] | |
826 (shift-script 280 (take 16 wiggle-script)))) | |
827 | |
828 (defn generate-phi-space [] | |
829 (let [experiences (atom [])] | |
830 (run-world | |
831 (apply-map | |
832 worm-world | |
833 (merge | |
834 (worm-world-defaults) | |
835 {:end-frame 700 | |
836 :motor-control | |
837 (motor-control-program worm-muscle-labels do-all-the-things) | |
838 :experiences experiences}))) | |
839 @experiences)) | |
840 #+end_src | |
841 #+end_listing | |
842 | |
843 #+caption: Use longest thread and a phi-space generated from a short | |
844 #+caption: exercise routine to interpret actions during free play. | |
845 #+name: empathy-debug | |
846 #+begin_listing clojure | |
847 #+begin_src clojure | |
848 (defn init [] | |
849 (def phi-space (generate-phi-space)) | |
850 (def phi-scan (gen-phi-scan phi-space))) | |
851 | |
852 (defn empathy-demonstration [] | |
853 (let [proprio (atom ())] | |
854 (fn | |
855 [experiences text] | |
856 (let [phi-indices (phi-scan (:proprioception (peek experiences)))] | |
857 (swap! proprio (partial cons phi-indices)) | |
858 (let [exp-thread (longest-thread (take 300 @proprio)) | |
859 empathy (mapv phi-space (infer-nils exp-thread))] | |
860 (println-repl (vector:last-n exp-thread 22)) | |
861 (cond | |
862 (grand-circle? empathy) (.setText text "Grand Circle") | |
863 (curled? empathy) (.setText text "Curled") | |
864 (wiggling? empathy) (.setText text "Wiggling") | |
865 (resting? empathy) (.setText text "Resting") | |
866 :else (.setText text "Unknown"))))))) | |
867 | |
868 (defn empathy-experiment [record] | |
869 (.start (worm-world :experience-watch (debug-experience-phi) | |
870 :record record :worm worm*))) | |
871 #+end_src | |
872 #+end_listing | |
873 | |
874 The result of running =empathy-experiment= is that the system is | |
875 generally able to interpret worm actions using the action-predicates | |
876 on simulated sensory data just as well as with actual data. Figure | |
877 \ref{empathy-debug-image} was generated using =empathy-experiment=: | |
878 | |
879 #+caption: From only proprioceptive data, =EMPATH= was able to infer | |
880 #+caption: the complete sensory experience and classify four poses. |
881 #+caption: (The last panel shows a composite image of \emph{wiggling}, |
882 #+caption: a dynamic pose.) | |
883 #+name: empathy-debug-image | |
884 #+ATTR_LaTeX: :width 10cm :placement [H] | |
885 [[./images/empathy-1.png]] | |
886 | |
887 One way to measure the performance of =EMPATH= is to compare the | |
888 suitability of the imagined sense experience to trigger the same |
889 action predicates as the real sensory experience. | |
890 | |
891 #+caption: Determine how closely empathy approximates actual | |
892 #+caption: sensory data. | |
893 #+name: test-empathy-accuracy | |
894 #+begin_listing clojure | |
895 #+begin_src clojure | |
896 (def worm-action-label | |
897 (juxt grand-circle? curled? wiggling?)) | |
898 | |
899 (defn compare-empathy-with-baseline [matches] | |
900 (let [proprio (atom ())] | |
901 (fn | |
902 [experiences text] | |
903 (let [phi-indices (phi-scan (:proprioception (peek experiences)))] | |
904 (swap! proprio (partial cons phi-indices)) | |
905 (let [exp-thread (longest-thread (take 300 @proprio)) | |
906 empathy (mapv phi-space (infer-nils exp-thread)) | |
907 experience-matches-empathy | |
908 (= (worm-action-label experiences) | |
909 (worm-action-label empathy))] | |
910 (println-repl experience-matches-empathy) | |
911 (swap! matches #(conj % experience-matches-empathy))))))) | |
912 | |
913 (defn accuracy [v] | |
914 (float (/ (count (filter true? v)) (count v)))) | |
915 | |
916 (defn test-empathy-accuracy [] | |
917 (let [res (atom [])] | |
918 (run-world | |
919 (worm-world :experience-watch | |
920 (compare-empathy-with-baseline res) | |
921 :worm worm*)) | |
922 (accuracy @res))) | |
923 #+end_src | |
924 #+end_listing | |
925 | |
926 Running =test-empathy-accuracy= using the very short exercise | |
927 program defined in listing \ref{generate-phi-space}, and then doing | |
928 a similar pattern of activity manually yields an accuracy of around |
929 73%. This is based on very limited worm experience. By training the | |
930 worm for longer, the accuracy dramatically improves. | |
931 | |
932 #+caption: Program to generate \Phi-space using manual training. | |
933 #+name: manual-phi-space | |
934 #+begin_listing clojure | |
935 #+begin_src clojure | |
936 (defn init-interactive [] | |
937 (def phi-space | |
938 (let [experiences (atom [])] | |
939 (run-world | |
940 (apply-map | |
941 worm-world | |
942 (merge | |
943 (worm-world-defaults) | |
944 {:experiences experiences}))) | |
945 @experiences)) | |
946 (def phi-scan (gen-phi-scan phi-space))) | |
947 #+end_src | |
948 #+end_listing | |
949 | |
950 After about 1 minute of manual training, I was able to achieve 95% | |
951 accuracy on manual testing of the worm using =init-interactive= and | |
952 =test-empathy-accuracy=. The ability of the system to infer sensory | |
953 states is truly impressive. | |
800 | 954 |
801 ** Digression: bootstrapping touch using free exploration | 955 ** Digression: bootstrapping touch using free exploration |
802 | 956 |
803 * Contributions | 957 * Contributions |
804 | 958 |