cortex: changeset 528:fd74479db5cb
merge some winston updates.
| author | Robert McIntyre <rlm@mit.edu> |
|---|---|
| date | Mon, 21 Apr 2014 02:12:51 -0400 |
| parents | ac747fa0a678 (current diff) 25f23cfd56ce (diff) |
| children | 96c189d4d15e |
| diffstat | 8 files changed, 48 insertions(+), 36 deletions(-) |
Binary file images/finger-1.png has changed
--- a/thesis/abstract.org	Mon Apr 21 02:11:59 2014 -0400
+++ b/thesis/abstract.org	Mon Apr 21 02:12:51 2014 -0400
@@ -3,20 +3,22 @@
 recognizing actions performed by a creature given limited data about
 the creature's actions, such as a video recording. I solve this
 problem in the case of a worm-like creature performing actions such as
-curling and wiggling.
+curling and wiggling.
 
 To attack the action recognition problem, I developed a computational
 model of empathy (=EMPATH=) which allows me to recognize actions using
 simple, embodied representations of actions (which require rich
-sensory data), even when that sensory data is not actually
-available. The missing sense data is ``imagined'' by the system by
-combining previous experiences gained from unsupervised free play.
+sensory data), even when that sensory data is not actually available.
+The missing sense data is ``imagined'' by the system by combining
+previous experiences gained from unsupervised free play. The worm is a
+five-segment creature equipped with touch, proprioception, and muscle
+tension senses. It recognizes actions using only proprioception data.
 
 In order to build this empathic, action-recognizing system, I created
 a program called =CORTEX=, which is a complete platform for embodied
 AI research. It provides multiple senses for simulated creatures,
-including vision, touch, proprioception, muscle tension, and
-hearing. Each of these senses provides a wealth of parameters that are
+including vision, touch, proprioception, muscle tension, and hearing.
+Each of these senses provides a wealth of parameters that are
 biologically inspired. =CORTEX= is able to simulate any number of
 creatures and senses, and provides facilities for easily modeling and
 creating new creatures. As a research platform it is more complete
--- a/thesis/cortex.org	Mon Apr 21 02:11:59 2014 -0400
+++ b/thesis/cortex.org	Mon Apr 21 02:12:51 2014 -0400
@@ -44,11 +44,14 @@
 * Empathy \& Embodiment: problem solving strategies
 
 By the end of this thesis, you will have seen a novel approach to
- interpreting video using embodiment and empathy. You will have also
- seen one way to efficiently implement empathy for embodied
+ interpreting video using embodiment and empathy. You will also see
+ one way to efficiently implement physical empathy for embodied
 creatures. Finally, you will become familiar with =CORTEX=, a system
- for designing and simulating creatures with rich senses, which you
- may choose to use in your own research.
+ for designing and simulating creatures with rich senses, which I
+ have designed as a library that you can use in your own research.
+ Note that I /do not/ process video directly --- I start with
+ knowledge of the positions of a creature's body parts and works from
+ there.
 
 This is the core vision of my thesis: That one of the important ways
 in which we understand others is by imagining ourselves in their
@@ -60,7 +63,7 @@
 is happening in a video and being completely lost in a sea of
 incomprehensible color and movement.
 
-** The problem: recognizing actions in video is hard!
+** The problem: recognizing actions is hard!
 
 Examine the following image. What is happening? As you, and indeed
 very young children, can easily determine, this is an image of
@@ -84,8 +87,8 @@
 example, what processes might enable you to see the chair in figure
 \ref{hidden-chair}?
 
- #+caption: The chair in this image is quite obvious to humans, but I
- #+caption: doubt that any modern computer vision program can find it.
+ #+caption: The chair in this image is quite obvious to humans, but
+ #+caption: it can't be found by any modern computer vision program.
 #+name: hidden-chair
 #+ATTR_LaTeX: :width 10cm
 [[./images/fat-person-sitting-at-desk.jpg]]
@@ -480,7 +483,7 @@
 real world instead of a simulation is the matter of time. Instead
 of simulated time you get the constant and unstoppable flow of
 real time. This severely limits the sorts of software you can use
- to program the AI because all sense inputs must be handled in real
+ to program an AI, because all sense inputs must be handled in real
 time. Complicated ideas may have to be implemented in hardware or
 may simply be impossible given the current speed of our
 processors. Contrast this with a simulation, in which the flow of
@@ -550,10 +553,10 @@
 of the retina. In each case, we can describe the sense with a
 surface and a distribution of sensors along that surface.
 
- The neat idea is that every human sense can be effectively
- described in terms of a surface containing embedded sensors. If the
- sense had any more dimensions, then there wouldn't be enough room
- in the spinal chord to transmit the information!
+ In fact, almost every human sense can be effectively described in
+ terms of a surface containing embedded sensors. If the sense had
+ any more dimensions, then there wouldn't be enough room in the
+ spinal chord to transmit the information!
 
 Therefore, =CORTEX= must support the ability to create objects and
 then be able to ``paint'' points along their surfaces to describe
@@ -2378,14 +2381,14 @@
 #+end_listing
 
 
- =movement-kernel= creates a function that will move the nearest
- physical object to the muscle node. The muscle exerts a rotational
- force dependent on it's orientation to the object in the blender
- file. The function returned by =movement-kernel= is also a sense
- function: it returns the percent of the total muscle strength that
- is currently being employed. This is analogous to muscle tension
- in humans and completes the sense of proprioception begun in the
- last section.
+ =movement-kernel= creates a function that controlls the movement
+ of the nearest physical node to the muscle node. The muscle exerts
+ a rotational force dependent on it's orientation to the object in
+ the blender file. The function returned by =movement-kernel= is
+ also a sense function: it returns the percent of the total muscle
+ strength that is currently being employed. This is analogous to
+ muscle tension in humans and completes the sense of proprioception
+ begun in the last section.
 
 ** =CORTEX= brings complex creatures to life!
 
@@ -2491,6 +2494,8 @@
 hard control problems without worrying about physics or
 senses.
 
+\newpage
+
 * =EMPATH=: action recognition in a simulated worm
 
 Here I develop a computational model of empathy, using =CORTEX= as a
@@ -2502,8 +2507,8 @@
 creature's sensory state space, which I call \Phi-space. Using
 \Phi-space, I construct an efficient function which takes the
 limited data that comes from observing another creature and enriches
- it full compliment of imagined sensory data. I can then use the
- imagined sensory data to recognize what the observed creature is
+ it with a full compliment of imagined sensory data. I can then use
+ the imagined sensory data to recognize what the observed creature is
 doing and feeling, using straightforward embodied action predicates.
 This is all demonstrated with using a simple worm-like creature, and
 recognizing worm-actions based on limited data.
@@ -2555,9 +2560,9 @@
 
 Embodied representations using multiple senses such as touch,
 proprioception, and muscle tension turns out be be exceedingly
- efficient at describing body-centered actions. It is the ``right
- language for the job''. For example, it takes only around 5 lines
- of LISP code to describe the action of ``curling'' using embodied
+ efficient at describing body-centered actions. It is the right
+ language for the job. For example, it takes only around 5 lines of
+ LISP code to describe the action of curling using embodied
 primitives. It takes about 10 lines to describe the seemingly
 complicated action of wiggling.
 
@@ -2566,14 +2571,16 @@
 whether the worm is doing the action they describe. =curled?=
 relies on proprioception, =resting?= relies on touch, =wiggling?=
 relies on a Fourier analysis of muscle contraction, and
- =grand-circle?= relies on touch and reuses =curled?= as a guard.
+ =grand-circle?= relies on touch and reuses =curled?= in its
+ definition, showing how embodied predicates can be composed.
 
 #+caption: Program for detecting whether the worm is curled. This is the
 #+caption: simplest action predicate, because it only uses the last frame
 #+caption: of sensory experience, and only uses proprioceptive data. Even
 #+caption: this simple predicate, however, is automatically frame
- #+caption: independent and ignores vermopomorphic differences such as
- #+caption: worm textures and colors.
+ #+caption: independent and ignores vermopomorphic \footnote{Like
+ #+caption: \emph{anthropomorphic}, except for worms instead of humans.}
+ #+caption: differences such as worm textures and colors.
 #+name: curled
 #+begin_listing clojure
 #+begin_src clojure
@@ -2735,7 +2742,7 @@
 
 The trick now is to make the action predicates work even when the
 sensory data on which they depend is absent. If I can do that, then
- I will have gained much,
+ I will have gained much.
 
 ** \Phi-space describes the worm's experiences
 
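The =curled= listing referenced by the caption above is not reproduced in this diff. As a rough illustration of how compact such an embodied predicate can be, here is a minimal Clojure sketch; it assumes that =(:proprioception frame)= yields a sequence of per-joint bend angles in radians, which is an assumed format for illustration rather than the exact data =CORTEX= produces:

#+begin_src clojure
;; Hypothetical sketch of a proprioception-only action predicate.
;; Assumes (:proprioception frame) is a seq of per-joint bend angles
;; in radians -- an assumption made for illustration only.
(defn curled?
  "True when every joint is bent past a right angle, judged from the
   most recent frame of experience only."
  [experiences]
  (every? (fn [bend] (> (Math/abs bend) (/ Math/PI 2)))
          (:proprioception (last experiences))))
#+end_src

A composite predicate such as =grand-circle?= could then call a predicate like this as a guard before examining touch data, which is the kind of composition the revised paragraph above describes.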
--- a/thesis/cover.tex	Mon Apr 21 02:11:59 2014 -0400
+++ b/thesis/cover.tex	Mon Apr 21 02:12:51 2014 -0400
@@ -45,7 +45,7 @@
 % however the specifications can change. We recommend that you verify the
 % layout of your title page with your thesis advisor and/or the MIT
 % Libraries before printing your final copy.
-\title{Solving Problems using Embodiment \& Empathy}
+\title{Recognizing Actions using Embodiment \& Empathy}
 \author{Robert Louis M\raisebox{\depth}{\small \underline{\underline{c}}}Intyre}
 %\author{Robert McIntyre}
Binary file thesis/images/empty-sense-nodes.png has changed
Binary file thesis/images/finger-UV.png has changed
Binary file thesis/images/hand-screenshot1.png has changed
--- a/thesis/rlm-cortex-meng.tex	Mon Apr 21 02:11:59 2014 -0400
+++ b/thesis/rlm-cortex-meng.tex	Mon Apr 21 02:12:51 2014 -0400
@@ -62,6 +62,9 @@
 % Configure minted source code listings.
 \usemintedstyle{default}
 \newminted{clojure}{fontsize=\footnotesize}
+\newminted{java}{fontsize=\footnotesize}
+\newminted{c}{fontsize=\footnotesize}
+
 
 % Allow colored source code listing to break across pages.
 \newenvironment{anchoredListing}{\captionsetup{type=listing}}{}
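The two added =\newminted= lines define footnote-sized Java and C listing environments alongside the existing Clojure one; minted names each such environment after the language, so =\newminted{java}{...}= yields a =javacode= environment and =\newminted{c}{...}= a =ccode= one. A small usage sketch, not taken from the thesis source, would look like this:

#+begin_src latex
% Hypothetical usage of the environment created by
% \newminted{java}{fontsize=\footnotesize} above.
\begin{javacode}
public class Hello {
    public static void main(String[] args) {
        System.out.println("hello, minted");
    }
}
\end{javacode}
#+end_src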