changeset 432:1e5ea711857d

abstract first draft.
author Robert McIntyre <rlm@mit.edu>
date Sun, 23 Mar 2014 16:33:01 -0400 (2014-03-23)
parents 7410f0d8011c
children 0b27c0c9c188
files thesis/abstract.org thesis/abstract.tex thesis/cortex.org thesis/cortex.tex thesis/garbage_cortex.org thesis/rlm-cortex-meng.tex thesis/weave-thesis.sh
diffstat 7 files changed, 224 insertions(+), 201 deletions(-)
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/thesis/abstract.org	Sun Mar 23 16:33:01 2014 -0400
     1.3 @@ -0,0 +1,31 @@
     1.4 +Here I explore the design and capabilities of my system (called
     1.5 +=CORTEX=) which enables experiments in /embodied artificial
     1.6 +intelligence/ -- that is, AI which uses a physical simulation of
     1.7 +reality accompanied by a simulated body to solve problems.
     1.8 +
     1.9 +In the first half of the thesis I describe the construction of
    1.10 +=CORTEX= and the rationale behind my architecture choices. =CORTEX= is
    1.11 +a complete platform for embodied AI research. It provides multiple
    1.12 +senses for simulated creatures, including vision, touch,
    1.13 +proprioception, muscle tension, and hearing. Each of these senses
    1.14 +provides a wealth of parameters that are biologically
    1.15 +inspired. =CORTEX= is able to simulate any number of creatures and
    1.16 +senses, and provides facilities for easily modeling and creating new
    1.17 +creatures. As a research platform it is more complete than any other
    1.18 +system currently available.
    1.19 +
    1.20 +In the second half of the thesis I develop a computational model of
    1.21 +empathy, using =CORTEX= as a base. Empathy in this context is the
    1.22 +ability to observe another creature and infer what sorts of sensations
    1.23 +that creature is feeling. My empathy algorithm involves multiple
    1.24 +phases. First is free-play, where the creature moves around and gains
    1.25 +sensory experience. From this experience I construct a representation
    1.26 +of the creature's sensory state space, which I call \phi-space. Using
    1.27 +\phi-space, I construct an efficient function for enriching the
    1.28 +limited data that comes from observing another creature with a full
    1.29 +complement of imagined sensory data based on previous experience. I
    1.30 +can then use the imagined sensory data to recognize what the observed
    1.31 +creature is doing and feeling, using straightforward embodied action
    1.32 +predicates. This is all demonstrated using a simple worm-like
    1.33 +creature, recognizing worm-actions in video.
    1.34 +
     2.1 --- a/thesis/abstract.tex	Sat Mar 22 23:31:07 2014 -0400
     2.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     2.3 @@ -1,22 +0,0 @@
     2.4 -% $Log: abstract.tex,v $
     2.5 -% Revision 1.1  93/05/14  14:56:25  starflt
     2.6 -% Initial revision
     2.7 -% 
     2.8 -% Revision 1.1  90/05/04  10:41:01  lwvanels
     2.9 -% Initial revision
    2.10 -% 
    2.11 -%
    2.12 -%% The text of your abstract and nothing else (other than comments) goes here.
    2.13 -%% It will be single-spaced and the rest of the text that is supposed to go on
    2.14 -%% the abstract page will be generated by the abstractpage environment.  This
    2.15 -%% file should be \input (not \include 'd) from cover.tex.
    2.16 -In this thesis, I designed and implemented a compiler which performs
    2.17 -optimizations that reduce the number of low-level floating point operations
    2.18 -necessary for a specific task; this involves the optimization of chains of
    2.19 -floating point operations as well as the implementation of a ``fixed'' point
    2.20 -data type that allows some floating point operations to simulated with integer
    2.21 -arithmetic.  The source language of the compiler is a subset of C, and the
    2.22 -destination language is assembly language for a micro-floating point CPU.  An
    2.23 -instruction-level simulator of the CPU was written to allow testing of the
    2.24 -code.  A series of test pieces of codes was compiled, both with and without
    2.25 -optimization, to determine how effective these optimizations were.
     3.1 --- a/thesis/cortex.org	Sat Mar 22 23:31:07 2014 -0400
     3.2 +++ b/thesis/cortex.org	Sun Mar 23 16:33:01 2014 -0400
     3.3 @@ -4,97 +4,52 @@
     3.4  #+description: Using embodied AI to facilitate Artificial Imagination.
     3.5  #+keywords: AI, clojure, embodiment
     3.6  
     3.7 -* Artificial Imagination
     3.8 +* Vision 
     3.9  
    3.10 -  Imagine watching a video of someone skateboarding. When you watch
    3.11 -  the video, you can imagine yourself skateboarding, and your
    3.12 -  knowledge of the human body and its dynamics guides your
    3.13 -  interpretation of the scene. For example, even if the skateboarder
    3.14 -  is partially occluded, you can infer the positions of his arms and
    3.15 -  body from your own knowledge of how your body would be positioned if
    3.16 -  you were skateboarding. If the skateboarder suffers an accident, you
    3.17 -  wince in sympathy, imagining the pain your own body would experience
    3.18 -  if it were in the same situation. This empathy with other people
    3.19 -  guides our understanding of whatever they are doing because it is a
    3.20 -  powerful constraint on what is probable and possible. In order to
    3.21 -  make use of this powerful empathy constraint, I need a system that
    3.22 -  can generate and make sense of sensory data from the many different
    3.23 -  senses that humans possess. The two key proprieties of such a system
    3.24 -  are /embodiment/ and /imagination/.
    3.25 +  System for understanding what the actors in a video are doing --
    3.26 +  Action Recognition.
    3.27 +  
    3.28 +  Separate action recognition into four components:
    3.29  
    3.30 -** What is imagination?
    3.31 +  - free play
    3.32 +  - embodied action predicates
    3.33 +  - model alignment 
    3.34 +  - sensory imagination
    3.35  
    3.36 -   One kind of imagination is /sympathetic/ imagination: you imagine
    3.37 -   yourself in the position of something/someone you are
    3.38 -   observing. This type of imagination comes into play when you follow
    3.39 -   along visually when watching someone perform actions, or when you
    3.40 -   sympathetically grimace when someone hurts themselves. This type of
    3.41 -   imagination uses the constraints you have learned about your own
    3.42 -   body to highly constrain the possibilities in whatever you are
    3.43 -   seeing. It uses all your senses to including your senses of touch,
    3.44 -   proprioception, etc. Humans are flexible when it comes to "putting
    3.45 -   themselves in another's shoes," and can sympathetically understand
    3.46 -   not only other humans, but entities ranging from animals to cartoon
    3.47 -   characters to [[http://www.youtube.com/watch?v=0jz4HcwTQmU][single dots]] on a screen!
    3.48 +* Steps 
    3.49 +  
    3.50 + - Build cortex, a simulated environment for sensate AI
    3.51 +   - solid bodies w/ joints
    3.52 +   - vision
    3.53 +   - touch
    3.54 +   - vision
    3.55 +   - hearing
    3.56 +   - proprioception
    3.57 +   - muscle contraction
    3.58  
    3.59 + - Build experimental framework for worm-actions
    3.60 +  - embodied stream predicates
    3.61 +  - \phi-space 
    3.62 +  - \phi-scan
    3.63  
    3.64 -   #+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
    3.65 -   #+ATTR_LaTeX: :width 5cm
    3.66 -   [[./images/cat-drinking.jpg]]
    3.67 +* News
    3.68 +  
    3.69 +  Experimental results:
    3.70  
    3.71 +  - \phi-space actually works very well for the worm!
    3.72 +  - self organizing touch map
    3.73  
    3.74 -#+begin_listing
    3.75 -\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
    3.76 -#+name: test-1
    3.77 -#+begin_src clojure
    3.78 -(defn test-pipeline
    3.79 -  "Testing vision:
    3.80 -   Tests the vision system by creating two views of the same rotating
    3.81 -   object from different angles and displaying both of those views in
    3.82 -   JFrames.
    3.83  
    3.84 -   You should see a rotating cube, and two windows,
    3.85 -   each displaying a different view of the cube."
    3.86 -  ([] (test-pipeline false))
    3.87 -  ([record?]
    3.88 -     (let [candy
    3.89 -           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
    3.90 -       (world
    3.91 -        (doto (Node.)
    3.92 -          (.attachChild candy))
    3.93 -        {}
    3.94 -        (fn [world]
    3.95 -          (let [cam (.clone (.getCamera world))
    3.96 -                width (.getWidth cam)
    3.97 -                height (.getHeight cam)]
    3.98 -            (add-camera! world cam 
    3.99 -                         (comp
   3.100 -                          (view-image
   3.101 -                           (if record?
   3.102 -                             (File. "/home/r/proj/cortex/render/vision/1")))
   3.103 -                          BufferedImage!))
   3.104 -            (add-camera! world
   3.105 -                         (doto (.clone cam)
   3.106 -                           (.setLocation (Vector3f. -10 0 0))
   3.107 -                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
   3.108 -                         (comp
   3.109 -                          (view-image
   3.110 -                           (if record?
   3.111 -                             (File. "/home/r/proj/cortex/render/vision/2")))
   3.112 -                          BufferedImage!))
   3.113 -            (let [timer (IsoTimer. 60)]
   3.114 -              (.setTimer world timer)
   3.115 -              (display-dilated-time world timer))
   3.116 -            ;; This is here to restore the main view
   3.117 -            ;; after the other views have completed processing
   3.118 -            (add-camera! world (.getCamera world) no-op)))
   3.119 -        (fn [world tpf]
   3.120 -          (.rotate candy (* tpf 0.2) 0 0))))))
   3.121 -#+end_src
   3.122 -#+end_listing
   3.123 +* Contributions
   3.124 +  - Built =CORTEX=, a comprehensive platform for embodied AI
   3.125 +    experiments. Has many new features lacking in other systems, such
   3.126 +    as sound. Easy to model/create new creatures.
   3.127 +  - created a novel concept for action recognition by using artificial
   3.128 +    imagination. 
   3.129  
   3.130 -- This is test1 \cite{Tappert77}.
   3.131  
   3.132 -\cite{Tappert77}
   3.133 -lol
   3.134 -\cite{Tappert77}
   3.135 \ No newline at end of file
   3.136 +
   3.137 +
   3.138 +
   3.139 +
   3.140 +
     4.1 --- a/thesis/cortex.tex	Sat Mar 22 23:31:07 2014 -0400
     4.2 +++ b/thesis/cortex.tex	Sun Mar 23 16:33:01 2014 -0400
     4.3 @@ -1,100 +1,56 @@
     4.4  
     4.5 -\section{Artificial Imagination}
     4.6 +\section{Vision}
     4.7  \label{sec-1}
     4.8  
     4.9 -Imagine watching a video of someone skateboarding. When you watch
    4.10 -the video, you can imagine yourself skateboarding, and your
    4.11 -knowledge of the human body and its dynamics guides your
    4.12 -interpretation of the scene. For example, even if the skateboarder
    4.13 -is partially occluded, you can infer the positions of his arms and
    4.14 -body from your own knowledge of how your body would be positioned if
    4.15 -you were skateboarding. If the skateboarder suffers an accident, you
    4.16 -wince in sympathy, imagining the pain your own body would experience
    4.17 -if it were in the same situation. This empathy with other people
    4.18 -guides our understanding of whatever they are doing because it is a
    4.19 -powerful constraint on what is probable and possible. In order to
    4.20 -make use of this powerful empathy constraint, I need a system that
    4.21 -can generate and make sense of sensory data from the many different
    4.22 -senses that humans possess. The two key proprieties of such a system
    4.23 -are \emph{embodiment} and \emph{imagination}.
    4.24 +System for understanding what the actors in a video are doing --
    4.25 +Action Recognition.
    4.26  
    4.27 -\subsection{What is imagination?}
    4.28 -\label{sec-1-1}
    4.29 -
    4.30 -One kind of imagination is \emph{sympathetic} imagination: you imagine
    4.31 -yourself in the position of something/someone you are
    4.32 -observing. This type of imagination comes into play when you follow
    4.33 -along visually when watching someone perform actions, or when you
    4.34 -sympathetically grimace when someone hurts themselves. This type of
    4.35 -imagination uses the constraints you have learned about your own
    4.36 -body to highly constrain the possibilities in whatever you are
    4.37 -seeing. It uses all your senses to including your senses of touch,
    4.38 -proprioception, etc. Humans are flexible when it comes to "putting
    4.39 -themselves in another's shoes," and can sympathetically understand
    4.40 -not only other humans, but entities ranging from animals to cartoon
    4.41 -characters to \href{http://www.youtube.com/watch?v=0jz4HcwTQmU}{single dots} on a screen!
    4.42 -
    4.43 -
    4.44 -\begin{figure}[htb]
    4.45 -\centering
    4.46 -\includegraphics[width=5cm]{./images/cat-drinking.jpg}
    4.47 -\caption{A cat drinking some water. Identifying this action is beyond the state of the art for computers.}
    4.48 -\end{figure}
    4.49 -
    4.50 -
    4.51 -\begin{listing}
    4.52 -\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
    4.53 -\begin{clojurecode}
    4.54 -(defn test-pipeline
    4.55 -  "Testing vision:
    4.56 -   Tests the vision system by creating two views of the same rotating
    4.57 -   object from different angles and displaying both of those views in
    4.58 -   JFrames.
    4.59 -
    4.60 -   You should see a rotating cube, and two windows,
    4.61 -   each displaying a different view of the cube."
    4.62 -  ([] (test-pipeline false))
    4.63 -  ([record?]
    4.64 -     (let [candy
    4.65 -           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
    4.66 -       (world
    4.67 -        (doto (Node.)
    4.68 -          (.attachChild candy))
    4.69 -        {}
    4.70 -        (fn [world]
    4.71 -          (let [cam (.clone (.getCamera world))
    4.72 -                width (.getWidth cam)
    4.73 -                height (.getHeight cam)]
    4.74 -            (add-camera! world cam 
    4.75 -                         (comp
    4.76 -                          (view-image
    4.77 -                           (if record?
    4.78 -                             (File. "/home/r/proj/cortex/render/vision/1")))
    4.79 -                          BufferedImage!))
    4.80 -            (add-camera! world
    4.81 -                         (doto (.clone cam)
    4.82 -                           (.setLocation (Vector3f. -10 0 0))
    4.83 -                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
    4.84 -                         (comp
    4.85 -                          (view-image
    4.86 -                           (if record?
    4.87 -                             (File. "/home/r/proj/cortex/render/vision/2")))
    4.88 -                          BufferedImage!))
    4.89 -            (let [timer (IsoTimer. 60)]
    4.90 -              (.setTimer world timer)
    4.91 -              (display-dilated-time world timer))
    4.92 -            ;; This is here to restore the main view
    4.93 -            ;; after the other views have completed processing
    4.94 -            (add-camera! world (.getCamera world) no-op)))
    4.95 -        (fn [world tpf]
    4.96 -          (.rotate candy (* tpf 0.2) 0 0))))))
    4.97 -\end{clojurecode}
    4.98 -\end{listing}
    4.99 +Separate action recognition into four components:
   4.100  
   4.101  \begin{itemize}
   4.102 -\item This is test1 \cite{Tappert77}.
   4.103 +\item free play
   4.104 +\item embodied action predicates
   4.105 +\item model alignment
   4.106 +\item sensory imagination
   4.107 +\end{itemize}
   4.108 +\section{Steps}
   4.109 +\label{sec-2}
   4.110 +
   4.111 +\begin{itemize}
   4.112 +\item Build cortex, a simulated environment for sensate AI
   4.113 +\begin{itemize}
   4.114 +\item solid bodies w/ joints
   4.115 +\item vision
   4.116 +\item touch
   4.117 +\item vision
   4.118 +\item hearing
   4.119 +\item proprioception
   4.120 +\item muscle contraction
   4.121  \end{itemize}
   4.122  
   4.123 -\cite{Tappert77}
   4.124 -lol
   4.125 -\cite{Tappert77}
   4.126 +\item Build experimental framework for worm-actions
   4.127 +\begin{itemize}
   4.128 +\item embodied stream predicates
   4.129 +\item \(\phi\)-space
   4.130 +\item \(\phi\)-scan
   4.131 +\end{itemize}
   4.132 +\end{itemize}
   4.133 +\section{News}
   4.134 +\label{sec-3}
   4.135 +
   4.136 +Experimental results:
   4.137 +
   4.138 +\begin{itemize}
   4.139 +\item \(\phi\)-space actually works very well for the worm!
   4.140 +\item self organizing touch map
   4.141 +\end{itemize}
   4.142 +
   4.143 +\section{Contributions}
   4.144 +\label{sec-4}
   4.145 +\begin{itemize}
   4.146 +\item Built \texttt{CORTEX}, a comprehensive platform for embodied AI
   4.147 +experiments. Has many new features lacking in other systems, such
   4.148 +as sound. Easy to model/create new creatures.
   4.149 +\item created a novel concept for action recognition by using artificial
   4.150 +imagination.
   4.151 +\end{itemize}
     5.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     5.2 +++ b/thesis/garbage_cortex.org	Sun Mar 23 16:33:01 2014 -0400
     5.3 @@ -0,0 +1,100 @@
     5.4 +#+title: =CORTEX=
     5.5 +#+author: Robert McIntyre
     5.6 +#+email: rlm@mit.edu
     5.7 +#+description: Using embodied AI to facilitate Artificial Imagination.
     5.8 +#+keywords: AI, clojure, embodiment
     5.9 +
    5.10 +* Artificial Imagination
    5.11 +
    5.12 +  Imagine watching a video of someone skateboarding. When you watch
    5.13 +  the video, you can imagine yourself skateboarding, and your
    5.14 +  knowledge of the human body and its dynamics guides your
    5.15 +  interpretation of the scene. For example, even if the skateboarder
    5.16 +  is partially occluded, you can infer the positions of his arms and
    5.17 +  body from your own knowledge of how your body would be positioned if
    5.18 +  you were skateboarding. If the skateboarder suffers an accident, you
    5.19 +  wince in sympathy, imagining the pain your own body would experience
    5.20 +  if it were in the same situation. This empathy with other people
    5.21 +  guides our understanding of whatever they are doing because it is a
    5.22 +  powerful constraint on what is probable and possible. In order to
    5.23 +  make use of this powerful empathy constraint, I need a system that
    5.24 +  can generate and make sense of sensory data from the many different
    5.25 +  senses that humans possess. The two key properties of such a system
    5.26 +  are /embodiment/ and /imagination/.
    5.27 +
    5.28 +** What is imagination?
    5.29 +
    5.30 +   One kind of imagination is /sympathetic/ imagination: you imagine
    5.31 +   yourself in the position of something/someone you are
    5.32 +   observing. This type of imagination comes into play when you follow
    5.33 +   along visually when watching someone perform actions, or when you
    5.34 +   sympathetically grimace when someone hurts themselves. This type of
    5.35 +   imagination uses the constraints you have learned about your own
    5.36 +   body to highly constrain the possibilities in whatever you are
    5.37 +   seeing. It uses all your senses, including your senses of touch,
    5.38 +   proprioception, etc. Humans are flexible when it comes to "putting
    5.39 +   themselves in another's shoes," and can sympathetically understand
    5.40 +   not only other humans, but entities ranging from animals to cartoon
    5.41 +   characters to [[http://www.youtube.com/watch?v=0jz4HcwTQmU][single dots]] on a screen!
    5.42 +
    5.43 +
    5.44 +   #+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
    5.45 +   #+ATTR_LaTeX: :width 5cm
    5.46 +   [[./images/cat-drinking.jpg]]
    5.47 +
    5.48 +
    5.49 +#+begin_listing clojure
    5.50 +\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
    5.51 +#+name: test-1
    5.52 +#+begin_src clojure
    5.53 +(defn test-pipeline
    5.54 +  "Testing vision:
    5.55 +   Tests the vision system by creating two views of the same rotating
    5.56 +   object from different angles and displaying both of those views in
    5.57 +   JFrames.
    5.58 +
    5.59 +   You should see a rotating cube, and two windows,
    5.60 +   each displaying a different view of the cube."
    5.61 +  ([] (test-pipeline false))
    5.62 +  ([record?]
    5.63 +     (let [candy
    5.64 +           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
    5.65 +       (world
    5.66 +        (doto (Node.)
    5.67 +          (.attachChild candy))
    5.68 +        {}
    5.69 +        (fn [world]
    5.70 +          (let [cam (.clone (.getCamera world))
    5.71 +                width (.getWidth cam)
    5.72 +                height (.getHeight cam)]
    5.73 +            (add-camera! world cam 
    5.74 +                         (comp
    5.75 +                          (view-image
    5.76 +                           (if record?
    5.77 +                             (File. "/home/r/proj/cortex/render/vision/1")))
    5.78 +                          BufferedImage!))
    5.79 +            (add-camera! world
    5.80 +                         (doto (.clone cam)
    5.81 +                           (.setLocation (Vector3f. -10 0 0))
    5.82 +                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
    5.83 +                         (comp
    5.84 +                          (view-image
    5.85 +                           (if record?
    5.86 +                             (File. "/home/r/proj/cortex/render/vision/2")))
    5.87 +                          BufferedImage!))
    5.88 +            (let [timer (IsoTimer. 60)]
    5.89 +              (.setTimer world timer)
    5.90 +              (display-dilated-time world timer))
    5.91 +            ;; This is here to restore the main view
    5.92 +            ;; after the other views have completed processing
    5.93 +            (add-camera! world (.getCamera world) no-op)))
    5.94 +        (fn [world tpf]
    5.95 +          (.rotate candy (* tpf 0.2) 0 0))))))
    5.96 +#+end_src
    5.97 +#+end_listing
    5.98 +
    5.99 +- This is test1 \cite{Tappert77}.
   5.100 +
   5.101 +\cite{Tappert77}
   5.102 +lol
   5.103 +\cite{Tappert77}
   5.104 \ No newline at end of file
     6.1 --- a/thesis/rlm-cortex-meng.tex	Sat Mar 22 23:31:07 2014 -0400
     6.2 +++ b/thesis/rlm-cortex-meng.tex	Sun Mar 23 16:33:01 2014 -0400
     6.3 @@ -41,7 +41,6 @@
     6.4  \usepackage{wasysym}
     6.5  \usepackage{amssymb}
     6.6  \usepackage{hyperref}
     6.7 -%\usepackage{natbib}
     6.8  \usepackage{libertine}
     6.9  \usepackage{inconsolata}
    6.10  
     7.1 --- a/thesis/weave-thesis.sh	Sat Mar 22 23:31:07 2014 -0400
     7.2 +++ b/thesis/weave-thesis.sh	Sun Mar 23 16:33:01 2014 -0400
     7.3 @@ -7,9 +7,13 @@
     7.4  --batch \
     7.5  --eval "
     7.6  (progn
     7.7 -  (find-file \"$1.org\")
     7.8 +  (find-file \"cortex.org\")
     7.9 +  (org-latex-export-to-latex nil nil nil t nil)) \
    7.10 +(progn
    7.11 +  (find-file \"abstract.org\")
    7.12    (org-latex-export-to-latex nil nil nil t nil))" \
    7.13  \
    7.14  2>&1 
    7.15  
    7.16 -rm $1.tex~
    7.17 \ No newline at end of file
    7.18 +rm cortex.tex~
    7.19 +rm abstract.tex~