changeset 432:1e5ea711857d
abstract first draft.
author | Robert McIntyre <rlm@mit.edu> |
---|---|
date | Sun, 23 Mar 2014 16:33:01 -0400 (2014-03-23) |
parents | 7410f0d8011c |
children | 0b27c0c9c188 |
files | thesis/abstract.org thesis/abstract.tex thesis/cortex.org thesis/cortex.tex thesis/garbage_cortex.org thesis/rlm-cortex-meng.tex thesis/weave-thesis.sh |
diffstat | 7 files changed, 224 insertions(+), 201 deletions(-) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/abstract.org Sun Mar 23 16:33:01 2014 -0400
@@ -0,0 +1,31 @@
+Here I explore the design and capabilities of my system (called
+=CORTEX=) which enables experiments in /embodied artificial
+intelligence/ -- that is, AI which uses a physical simulation of
+reality accompanied by a simulated body to solve problems.
+
+In the first half of the thesis I describe the construction of
+=CORTEX= and the rationale behind my architecture choices. =CORTEX= is
+a complete platform for embodied AI research. It provides multiple
+senses for simulated creatures, including vision, touch,
+proprioception, muscle tension, and hearing. Each of these senses
+provides a wealth of parameters that are biologically
+inspired. =CORTEX= is able to simulate any number of creatures and
+senses, and provides facilities for easily modeling and creating new
+creatures. As a research platform it is more complete than any other
+system currently available.
+
+In the second half of the thesis I develop a computational model of
+empathy, using =CORTEX= as a base. Empathy in this context is the
+ability to observe another creature and infer what sorts of sensations
+that creature is feeling. My empathy algorithm involves multiple
+phases. First is free-play, where the creature moves around and gains
+sensory experience. From this experience I construct a representation
+of the creature's sensory state space, which I call \phi-space. Using
+\phi-space, I construct an efficient function for enriching the
+limited data that comes from observing another creature with a full
+compliment of imagined sensory data based on previous experience. I
+can then use the imagined sensory data to recognize what the observed
+creature is doing and feeling, using straightforward embodied action
+predicates. This is all demonstrated with using a simple worm-like
+creature, recognizing worm-actions in video.
+
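Reviewer's sketch of the pipeline described in the new abstract (free play, \phi-space, imagined sensory data, embodied action predicates). This is not the thesis implementation, which this changeset does not show. It assumes that \phi-space is a plain list of sensory snapshots, that observing another creature yields only joint angles, and that enrichment is a nearest-neighbor lookup on those angles. The names `Experience`, `free_play`, `imagine`, and `is_curled`, and the toy touch model, are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Experience:
    """One sensory snapshot recorded during free play (hypothetical layout)."""
    joints: Tuple[float, ...]  # proprioception: joint angles in radians
    touch: Tuple[float, ...]   # touch activation per segment, 0.0 or 1.0


def free_play(steps: int, segments: int = 3) -> List[Experience]:
    """Toy free play: sweep all joints from 0 to pi and record what is felt.

    Assumption for illustration only: a segment's touch sensor fires
    once its joint is bent past 1 radian. Requires steps >= 2.
    """
    phi_space = []
    for i in range(steps):
        angle = math.pi * i / (steps - 1)
        joints = tuple(angle for _ in range(segments))
        touch = tuple(1.0 if a > 1.0 else 0.0 for a in joints)
        phi_space.append(Experience(joints, touch))
    return phi_space


def imagine(phi_space: Sequence[Experience],
            observed_joints: Sequence[float]) -> Experience:
    """Enrich observed joint angles with the full sensory state of the
    closest remembered experience (nearest neighbor, an assumption)."""
    return min(phi_space, key=lambda e: math.dist(e.joints, observed_joints))


def is_curled(e: Experience) -> bool:
    """Hypothetical embodied action predicate: every segment feels contact."""
    return all(t > 0.5 for t in e.touch)


if __name__ == "__main__":
    phi_space = free_play(steps=50)
    # Only joint angles are "observed"; touch is filled in from memory.
    print(is_curled(imagine(phi_space, (2.9, 3.0, 3.1))))
    print(is_curled(imagine(phi_space, (0.1, 0.0, 0.2))))
```

The point of the sketch is the division of labor: the predicate is written against the creature's own senses (touch), while the observer supplies only posture, and the lookup bridges the two. A linear scan is used here for brevity; the abstract's "efficient function" presumably does something better.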
--- a/thesis/abstract.tex Sat Mar 22 23:31:07 2014 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,22 +0,0 @@
-% $Log: abstract.tex,v $
-% Revision 1.1 93/05/14 14:56:25 starflt
-% Initial revision
-%
-% Revision 1.1 90/05/04 10:41:01 lwvanels
-% Initial revision
-%
-%
-%% The text of your abstract and nothing else (other than comments) goes here.
-%% It will be single-spaced and the rest of the text that is supposed to go on
-%% the abstract page will be generated by the abstractpage environment. This
-%% file should be \input (not \include 'd) from cover.tex.
-In this thesis, I designed and implemented a compiler which performs
-optimizations that reduce the number of low-level floating point operations
-necessary for a specific task; this involves the optimization of chains of
-floating point operations as well as the implementation of a ``fixed'' point
-data type that allows some floating point operations to simulated with integer
-arithmetic. The source language of the compiler is a subset of C, and the
-destination language is assembly language for a micro-floating point CPU. An
-instruction-level simulator of the CPU was written to allow testing of the
-code. A series of test pieces of codes was compiled, both with and without
-optimization, to determine how effective these optimizations were.
--- a/thesis/cortex.org Sat Mar 22 23:31:07 2014 -0400
+++ b/thesis/cortex.org Sun Mar 23 16:33:01 2014 -0400
@@ -4,97 +4,52 @@
 #+description: Using embodied AI to facilitate Artificial Imagination.
 #+keywords: AI, clojure, embodiment
 
-* Artificial Imagination
+* Vision
 
- Imagine watching a video of someone skateboarding. When you watch
- the video, you can imagine yourself skateboarding, and your
- knowledge of the human body and its dynamics guides your
- interpretation of the scene. For example, even if the skateboarder
- is partially occluded, you can infer the positions of his arms and
- body from your own knowledge of how your body would be positioned if
- you were skateboarding. If the skateboarder suffers an accident, you
- wince in sympathy, imagining the pain your own body would experience
- if it were in the same situation. This empathy with other people
- guides our understanding of whatever they are doing because it is a
- powerful constraint on what is probable and possible. In order to
- make use of this powerful empathy constraint, I need a system that
- can generate and make sense of sensory data from the many different
- senses that humans possess. The two key proprieties of such a system
- are /embodiment/ and /imagination/.
+ System for understanding what the actors in a video are doing --
+ Action Recognition.
+
+ Separate action recognition into three components:
 
-** What is imagination?
+ - free play
+ - embodied action predicates
+ - model alignment
+ - sensory imagination
 
- One kind of imagination is /sympathetic/ imagination: you imagine
- yourself in the position of something/someone you are
- observing. This type of imagination comes into play when you follow
- along visually when watching someone perform actions, or when you
- sympathetically grimace when someone hurts themselves. This type of
- imagination uses the constraints you have learned about your own
- body to highly constrain the possibilities in whatever you are
- seeing. It uses all your senses to including your senses of touch,
- proprioception, etc. Humans are flexible when it comes to "putting
- themselves in another's shoes," and can sympathetically understand
- not only other humans, but entities ranging from animals to cartoon
- characters to [[http://www.youtube.com/watch?v=0jz4HcwTQmU][single dots]] on a screen!
+* Steps
+
+ - Build cortex, a simulated environment for sensate AI
+   - solid bodies w/ joints
+   - vision
+   - touch
+   - vision
+   - hearing
+   - proprioception
+   - muscle contraction
 
+ - Build experimental framework for worm-actions
+   - embodied stream predicates
+   - \phi-space
+   - \phi-scan
 
- #+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
- #+ATTR_LaTeX: :width 5cm
- [[./images/cat-drinking.jpg]]
+* News
+
+ Experimental results:
 
+ - \phi-space actually works very well for the worm!
+ - self organizing touch map
 
-#+begin_listing
-\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
-#+name: test-1
-#+begin_src clojure
-(defn test-pipeline
-  "Testing vision:
-   Tests the vision system by creating two views of the same rotating
-   object from different angles and displaying both of those views in
-   JFrames.
 
-   You should see a rotating cube, and two windows,
-   each displaying a different view of the cube."
-  ([] (test-pipeline false))
-  ([record?]
-     (let [candy
-           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
-       (world
-        (doto (Node.)
-          (.attachChild candy))
-        {}
-        (fn [world]
-          (let [cam (.clone (.getCamera world))
-                width (.getWidth cam)
-                height (.getHeight cam)]
-            (add-camera! world cam
-                         (comp
-                          (view-image
-                           (if record?
-                             (File. "/home/r/proj/cortex/render/vision/1")))
-                          BufferedImage!))
-            (add-camera! world
-                         (doto (.clone cam)
-                           (.setLocation (Vector3f. -10 0 0))
-                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
-                         (comp
-                          (view-image
-                           (if record?
-                             (File. "/home/r/proj/cortex/render/vision/2")))
-                          BufferedImage!))
-            (let [timer (IsoTimer. 60)]
-              (.setTimer world timer)
-              (display-dilated-time world timer))
-            ;; This is here to restore the main view
-            ;; after the other views have completed processing
-            (add-camera! world (.getCamera world) no-op)))
-        (fn [world tpf]
-          (.rotate candy (* tpf 0.2) 0 0))))))
-#+end_src
-#+end_listing
+* Contributions
+ - Built =CORTEX=, a comprehensive platform for embodied AI
+ experiments. Has many new features lacking in other systems, such
+ as sound. Easy to model/create new creatures.
+ - created a novel concept for action recognition by using artificial
+ imagination.
 
-- This is test1 \cite{Tappert77}.
 
-\cite{Tappert77}
-lol
-\cite{Tappert77}
\ No newline at end of file
+
+
+
+
+
--- a/thesis/cortex.tex Sat Mar 22 23:31:07 2014 -0400
+++ b/thesis/cortex.tex Sun Mar 23 16:33:01 2014 -0400
@@ -1,100 +1,56 @@
 
-\section{Artificial Imagination}
+\section{Vision}
 \label{sec-1}
 
-Imagine watching a video of someone skateboarding. When you watch
-the video, you can imagine yourself skateboarding, and your
-knowledge of the human body and its dynamics guides your
-interpretation of the scene. For example, even if the skateboarder
-is partially occluded, you can infer the positions of his arms and
-body from your own knowledge of how your body would be positioned if
-you were skateboarding. If the skateboarder suffers an accident, you
-wince in sympathy, imagining the pain your own body would experience
-if it were in the same situation. This empathy with other people
-guides our understanding of whatever they are doing because it is a
-powerful constraint on what is probable and possible. In order to
-make use of this powerful empathy constraint, I need a system that
-can generate and make sense of sensory data from the many different
-senses that humans possess. The two key proprieties of such a system
-are \emph{embodiment} and \emph{imagination}.
+System for understanding what the actors in a video are doing --
+Action Recognition.
 
-\subsection{What is imagination?}
-\label{sec-1-1}
-
-One kind of imagination is \emph{sympathetic} imagination: you imagine
-yourself in the position of something/someone you are
-observing. This type of imagination comes into play when you follow
-along visually when watching someone perform actions, or when you
-sympathetically grimace when someone hurts themselves. This type of
-imagination uses the constraints you have learned about your own
-body to highly constrain the possibilities in whatever you are
-seeing. It uses all your senses to including your senses of touch,
-proprioception, etc. Humans are flexible when it comes to "putting
-themselves in another's shoes," and can sympathetically understand
-not only other humans, but entities ranging from animals to cartoon
-characters to \href{http://www.youtube.com/watch?v=0jz4HcwTQmU}{single dots} on a screen!
-
-
-\begin{figure}[htb]
-\centering
-\includegraphics[width=5cm]{./images/cat-drinking.jpg}
-\caption{A cat drinking some water. Identifying this action is beyond the state of the art for computers.}
-\end{figure}
-
-
-\begin{listing}
-\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
-\begin{clojurecode}
-(defn test-pipeline
-  "Testing vision:
-   Tests the vision system by creating two views of the same rotating
-   object from different angles and displaying both of those views in
-   JFrames.
-
-   You should see a rotating cube, and two windows,
-   each displaying a different view of the cube."
-  ([] (test-pipeline false))
-  ([record?]
-     (let [candy
-           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
-       (world
-        (doto (Node.)
-          (.attachChild candy))
-        {}
-        (fn [world]
-          (let [cam (.clone (.getCamera world))
-                width (.getWidth cam)
-                height (.getHeight cam)]
-            (add-camera! world cam
-                         (comp
-                          (view-image
-                           (if record?
-                             (File. "/home/r/proj/cortex/render/vision/1")))
-                          BufferedImage!))
-            (add-camera! world
-                         (doto (.clone cam)
-                           (.setLocation (Vector3f. -10 0 0))
-                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
-                         (comp
-                          (view-image
-                           (if record?
-                             (File. "/home/r/proj/cortex/render/vision/2")))
-                          BufferedImage!))
-            (let [timer (IsoTimer. 60)]
-              (.setTimer world timer)
-              (display-dilated-time world timer))
-            ;; This is here to restore the main view
-            ;; after the other views have completed processing
-            (add-camera! world (.getCamera world) no-op)))
-        (fn [world tpf]
-          (.rotate candy (* tpf 0.2) 0 0))))))
-\end{clojurecode}
-\end{listing}
+Separate action recognition into three components:
 
 \begin{itemize}
-\item This is test1 \cite{Tappert77}.
+\item free play
+\item embodied action predicates
+\item model alignment
+\item sensory imagination
+\end{itemize}
+\section{Steps}
+\label{sec-2}
+
+\begin{itemize}
+\item Build cortex, a simulated environment for sensate AI
+\begin{itemize}
+\item solid bodies w/ joints
+\item vision
+\item touch
+\item vision
+\item hearing
+\item proprioception
+\item muscle contraction
 \end{itemize}
 
-\cite{Tappert77}
-lol
-\cite{Tappert77}
+\item Build experimental framework for worm-actions
+\begin{itemize}
+\item embodied stream predicates
+\item \(\phi\)-space
+\item \(\phi\)-scan
+\end{itemize}
+\end{itemize}
+\section{News}
+\label{sec-3}
+
+Experimental results:
+
+\begin{itemize}
+\item \(\phi\)-space actually works very well for the worm!
+\item self organizing touch map
+\end{itemize}
+
+\section{Contributions}
+\label{sec-4}
+\begin{itemize}
+\item Built \texttt{CORTEX}, a comprehensive platform for embodied AI
+experiments. Has many new features lacking in other systems, such
+as sound. Easy to model/create new creatures.
+\item created a novel concept for action recognition by using artificial
+imagination.
+\end{itemize}
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/garbage_cortex.org Sun Mar 23 16:33:01 2014 -0400
@@ -0,0 +1,100 @@
+#+title: =CORTEX=
+#+author: Robert McIntyre
+#+email: rlm@mit.edu
+#+description: Using embodied AI to facilitate Artificial Imagination.
+#+keywords: AI, clojure, embodiment
+
+* Artificial Imagination
+
+ Imagine watching a video of someone skateboarding. When you watch
+ the video, you can imagine yourself skateboarding, and your
+ knowledge of the human body and its dynamics guides your
+ interpretation of the scene. For example, even if the skateboarder
+ is partially occluded, you can infer the positions of his arms and
+ body from your own knowledge of how your body would be positioned if
+ you were skateboarding. If the skateboarder suffers an accident, you
+ wince in sympathy, imagining the pain your own body would experience
+ if it were in the same situation. This empathy with other people
+ guides our understanding of whatever they are doing because it is a
+ powerful constraint on what is probable and possible. In order to
+ make use of this powerful empathy constraint, I need a system that
+ can generate and make sense of sensory data from the many different
+ senses that humans possess. The two key proprieties of such a system
+ are /embodiment/ and /imagination/.
+
+** What is imagination?
+
+ One kind of imagination is /sympathetic/ imagination: you imagine
+ yourself in the position of something/someone you are
+ observing. This type of imagination comes into play when you follow
+ along visually when watching someone perform actions, or when you
+ sympathetically grimace when someone hurts themselves. This type of
+ imagination uses the constraints you have learned about your own
+ body to highly constrain the possibilities in whatever you are
+ seeing. It uses all your senses to including your senses of touch,
+ proprioception, etc. Humans are flexible when it comes to "putting
+ themselves in another's shoes," and can sympathetically understand
+ not only other humans, but entities ranging from animals to cartoon
+ characters to [[http://www.youtube.com/watch?v=0jz4HcwTQmU][single dots]] on a screen!
+
+
+ #+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
+ #+ATTR_LaTeX: :width 5cm
+ [[./images/cat-drinking.jpg]]
+
+
+#+begin_listing clojure
+\caption{This is a basic test for the vision system. It only tests the vision-pipeline and does not deal with loading eyes from a blender file. The code creates two videos of the same rotating cube from different angles.}
+#+name: test-1
+#+begin_src clojure
+(defn test-pipeline
+  "Testing vision:
+   Tests the vision system by creating two views of the same rotating
+   object from different angles and displaying both of those views in
+   JFrames.
+
+   You should see a rotating cube, and two windows,
+   each displaying a different view of the cube."
+  ([] (test-pipeline false))
+  ([record?]
+     (let [candy
+           (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
+       (world
+        (doto (Node.)
+          (.attachChild candy))
+        {}
+        (fn [world]
+          (let [cam (.clone (.getCamera world))
+                width (.getWidth cam)
+                height (.getHeight cam)]
+            (add-camera! world cam
+                         (comp
+                          (view-image
+                           (if record?
+                             (File. "/home/r/proj/cortex/render/vision/1")))
+                          BufferedImage!))
+            (add-camera! world
+                         (doto (.clone cam)
+                           (.setLocation (Vector3f. -10 0 0))
+                           (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
+                         (comp
+                          (view-image
+                           (if record?
+                             (File. "/home/r/proj/cortex/render/vision/2")))
+                          BufferedImage!))
+            (let [timer (IsoTimer. 60)]
+              (.setTimer world timer)
+              (display-dilated-time world timer))
+            ;; This is here to restore the main view
+            ;; after the other views have completed processing
+            (add-camera! world (.getCamera world) no-op)))
+        (fn [world tpf]
+          (.rotate candy (* tpf 0.2) 0 0))))))
+#+end_src
+#+end_listing
+
+- This is test1 \cite{Tappert77}.
+
+\cite{Tappert77}
+lol
+\cite{Tappert77}
\ No newline at end of file
--- a/thesis/rlm-cortex-meng.tex Sat Mar 22 23:31:07 2014 -0400
+++ b/thesis/rlm-cortex-meng.tex Sun Mar 23 16:33:01 2014 -0400
@@ -41,7 +41,6 @@
 \usepackage{wasysym}
 \usepackage{amssymb}
 \usepackage{hyperref}
-%\usepackage{natbib}
 \usepackage{libertine}
 \usepackage{inconsolata}
 
--- a/thesis/weave-thesis.sh Sat Mar 22 23:31:07 2014 -0400
+++ b/thesis/weave-thesis.sh Sun Mar 23 16:33:01 2014 -0400
@@ -7,9 +7,13 @@
 --batch \
 --eval "
 (progn
- (find-file \"$1.org\")
+ (find-file \"cortex.org\")
+ (org-latex-export-to-latex nil nil nil t nil)) \
+(progn
+ (find-file \"abstract.org\")
 (org-latex-export-to-latex nil nil nil t nil))" \
 \
 2>&1
 
-rm $1.tex~
\ No newline at end of file
+rm cortex.tex~
+rm abstract.tex~