# HG changeset patch
# User Robert McIntyre
# Date 1396190478 14400
# Node ID 4c4d45f6f30b605314fb51ec02e593dbc7f071dd
# Parent 8b962ab418c8cf399d06aa98759e45ce34d132a5
accept/reject changes

diff -r 8b962ab418c8 -r 4c4d45f6f30b thesis/dxh-cortex-diff.diff
--- a/thesis/dxh-cortex-diff.diff	Sun Mar 30 10:39:19 2014 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,428 +0,0 @@
diff -r 8b962ab418c8 -r 4c4d45f6f30b thesis/dylan-accept.diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/dylan-accept.diff	Sun Mar 30 10:41:18 2014 -0400
@@ -0,0 +1,22 @@
+@@ -3210,13 +3329,14 @@
+ 
+   In this thesis you have seen the =CORTEX= system, a complete
+   environment for creating simulated creatures. You have seen how to
+-  implement five senses including touch, proprioception, hearing,
+-  vision, and muscle tension. You have seen how to create new creatues
+-  using blender, a 3D modeling tool. I hope that =CORTEX= will be
+-  useful in further research projects. To this end I have included the
+-  full source to =CORTEX= along with a large suite of tests and
+-  examples. I have also created a user guide for =CORTEX= which is
+-  inculded in an appendix to this thesis.
++  implement five senses: touch, proprioception, hearing, vision, and
++  muscle tension. You have seen how to create new creatures using
++  blender, a 3D modeling tool. I hope that =CORTEX= will be useful in
++  further research projects. To this end I have included the full
++  source to =CORTEX= along with a large suite of tests and examples. I
++  have also created a user guide for =CORTEX= which is included in an
++  appendix to this thesis \ref{}.
++# dxh: todo reference appendix
+ 
+   You have also seen how I used =CORTEX= as a platform to attack the
+   /action recognition/ problem, which is the problem of recognizing
diff -r 8b962ab418c8 -r 4c4d45f6f30b thesis/dylan-cortex-diff.diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/dylan-cortex-diff.diff	Sun Mar 30 10:41:18 2014 -0400
@@ -0,0 +1,395 @@
+diff -r f639e2139ce2 thesis/cortex.org
+--- a/thesis/cortex.org	Sun Mar 30 01:34:43 2014 -0400
++++ b/thesis/cortex.org	Sun Mar 30 10:07:17 2014 -0400
+@@ -41,49 +41,46 @@
+  [[./images/aurellem-gray.png]]
+ 
+ 
+-* Empathy and Embodiment as problem solving strategies
++* Empathy \& Embodiment: problem solving strategies
+ 
+-  By the end of this thesis, you will have seen a novel approach to
+-  interpreting video using embodiment and empathy. You will have also
+-  seen one way to efficiently implement empathy for embodied
+-  creatures. Finally, you will become familiar with =CORTEX=, a system
+-  for designing and simulating creatures with rich senses, which you
+-  may choose to use in your own research.
+-  
+-  This is the core vision of my thesis: That one of the important ways
+-  in which we understand others is by imagining ourselves in their
+-  position and emphatically feeling experiences relative to our own
+-  bodies. By understanding events in terms of our own previous
+-  corporeal experience, we greatly constrain the possibilities of what
+-  would otherwise be an unwieldy exponential search. This extra
+-  constraint can be the difference between easily understanding what
+-  is happening in a video and being completely lost in a sea of
+-  incomprehensible color and movement.
+- 
+-** Recognizing actions in video is extremely difficult
+-
+-   Consider for example the problem of determining what is happening
+-   in a video of which this is one frame:
+-
++** The problem: recognizing actions in video is extremely difficult
++# developing / requires useful representations
++   
++   Examine the following collection of images. As you, and indeed very
++   young children, can easily determine, each one is a picture of
++   someone drinking.
++
++   # dxh: cat, cup, drinking fountain, rain, straw, coconut
+    #+caption: A cat drinking some water. Identifying this action is 
+-   #+caption: beyond the state of the art for computers.
++   #+caption: beyond the capabilities of existing computer vision systems.
+    #+ATTR_LaTeX: :width 7cm
+    [[./images/cat-drinking.jpg]]
++
++   Nevertheless, it is beyond the state of the art for a computer
++   vision program to describe what's happening in each of these
++   images, or what's common to them. Part of the problem is that many
++   computer vision systems focus on pixel-level details or probability
++   distributions of pixels, with little focus on [...]
++
++
++   In fact, the contents of a scene may have much less to do with pixel
++   probabilities than with recognizing various affordances: things you
++   can move, objects you can grasp, spaces that can be filled
++   (Gibson). For example, what processes might enable you to see the
++   chair in figure \ref{hidden-chair}?
++   # Or suppose that you are building a program that recognizes chairs.
++   # How could you ``see'' the chair ?
+    
+-   It is currently impossible for any computer program to reliably
+-   label such a video as ``drinking''. And rightly so -- it is a very
+-   hard problem! What features can you describe in terms of low level
+-   functions of pixels that can even begin to describe at a high level
+-   what is happening here?
+-
+-   Or suppose that you are building a program that recognizes chairs.
+-   How could you ``see'' the chair in figure \ref{hidden-chair}?
+-
++   # dxh: blur chair
+    #+caption: The chair in this image is quite obvious to humans, but I 
+    #+caption: doubt that any modern computer vision program can find it.
+    #+name: hidden-chair
+    #+ATTR_LaTeX: :width 10cm
+    [[./images/fat-person-sitting-at-desk.jpg]]
++
++
++
++
+    
+    Finally, how is it that you can easily tell the difference between
+    how the girl's /muscles/ are working in figure \ref{girl}?
+@@ -95,10 +92,13 @@
+    #+ATTR_LaTeX: :width 7cm
+    [[./images/wall-push.png]]
+   
++
++
++
+    Each of these examples tells us something about what might be going
+    on in our minds as we easily solve these recognition problems.
+    
+-   The hidden chairs show us that we are strongly triggered by cues
++   The hidden chair shows us that we are strongly triggered by cues
+    relating to the position of human bodies, and that we can determine
+    the overall physical configuration of a human body even if much of
+    that body is occluded.
+@@ -109,10 +109,107 @@
+    most positions, and we can easily project this self-knowledge to
+    imagined positions triggered by images of the human body.
+ 
+-** =EMPATH= neatly solves recognition problems  
++** A step forward: the sensorimotor-centered approach
++# ** =EMPATH= recognizes what creatures are doing
++# neatly solves recognition problems
++   In this thesis, I explore the idea that our knowledge of our own
++   bodies enables us to recognize the actions of others.
++
++   First, I built a system for constructing virtual creatures with
++   physiologically plausible sensorimotor systems and detailed
++   environments. The result is =CORTEX=, which is described in section
++   \ref{sec-2}. (=CORTEX= was built to be flexible and useful to other
++   AI researchers; it is provided in full with detailed instructions
++   on the web [here].)
++
++   Next, I wrote routines which enabled a simple worm-like creature to
++   infer the actions of a second worm-like creature, using only its
++   own prior sensorimotor experiences and knowledge of the second
++   worm's joint positions. This program, =EMPATH=, is described in
++   section \ref{sec-3}, and the key results of this experiment are
++   summarized below.
++
++   #+caption: From only \emph{proprioceptive} data, =EMPATH= was able to infer 
++   #+caption: the complete sensory experience and classify these four poses.
++   #+caption: The last image is a composite, depicting the intermediate stages of \emph{wriggling}.
++   #+name: worm-recognition-intro-2
++   #+ATTR_LaTeX: :width 15cm
++   [[./images/empathy-1.png]]
++
++   # =CORTEX= provides a language for describing the sensorimotor
++   # experiences of various creatures. 
++
++   # Next, I developed an experiment to test the power of =CORTEX='s
++   # sensorimotor-centered language for solving recognition problems. As
++   # a proof of concept, I wrote routines which enabled a simple
++   # worm-like creature to infer the actions of a second worm-like
++   # creature, using only its own previous sensorimotor experiences and
++   # knowledge of the second worm's joints (figure
++   # \ref{worm-recognition-intro-2}). The result of this proof of
++   # concept was the program =EMPATH=, described in section
++   # \ref{sec-3}. The key results of this
++
++   # Using only first-person sensorimotor experiences and third-person
++   # proprioceptive data, 
++
++*** Key results
++   - After one-shot supervised training, =EMPATH= was able to recognize a
++     wide variety of static poses and dynamic actions --- ranging from
++     curling in a circle to wriggling with a particular frequency ---
++     with 95\% accuracy.
++   - These results were completely independent of viewing angle
++     because the underlying body-centered language is itself
++     independent of viewing angle; once an action is learned, it can
++     be recognized equally well from any viewing angle.
++   - =EMPATH= is surprisingly short; the sensorimotor-centered
++     language provided by =CORTEX= resulted in extremely economical
++     recognition routines --- about 0000 lines in all --- suggesting
++     that such representations are very powerful, and often
++     indispensable for the types of recognition tasks considered here.
++   - Although, for expediency's sake, I relied on direct knowledge of
++     joint positions in this proof of concept, it would be
++     straightforward to extend =EMPATH= so that it (more
++     realistically) infers joint positions from its visual data.
++
++# because the underlying language is fundamentally orientation-independent
++
++# recognize the actions of a worm with 95\% accuracy. The
++# recognition tasks 
+ 
+-   I propose a system that can express the types of recognition
+-   problems above in a form amenable to computation. It is split into
++
++
++
++   [Talk about these results and what you find promising about them]
++
++** Roadmap
++   [I'm going to explain how =CORTEX= works, then break down how
++   =EMPATH= does its thing. Because the details reveal such-and-such
++   about the approach.]
++
++   # The success of this simple proof-of-concept offers a tantalizing
++
++
++   # explore the idea 
++   # The key contribution of this thesis is the idea that body-centered
++   # representations (which express
++
++
++   # the
++   # body-centered approach --- in which I try to determine what's
++   # happening in a scene by bringing it into registration with my own
++   # bodily experiences --- are indispensable for recognizing what
++   # creatures are doing in a scene.
++
++* COMMENT
++# body-centered language
++   
++   In this thesis, I'll describe =EMPATH=, which solves a certain
++   class of recognition problems.
++
++   The key idea is to use self-centered (or first-person) language.
++
++   I have built a system that can express the types of recognition
++   problems in a form amenable to computation. It is split into
+    four parts:
+ 
+    - Free/Guided Play :: The creature moves around and experiences the
+@@ -286,14 +383,14 @@
+      code to create a creature, and can use a wide library of
+      pre-existing blender models as a base for your own creatures.
+ 
+-   - =CORTEX= implements a wide variety of senses, including touch,
++   - =CORTEX= implements a wide variety of senses: touch,
+      proprioception, vision, hearing, and muscle tension. Complicated
+      senses like touch and vision involve multiple sensory elements
+      embedded in a 2D surface. You have complete control over the
+      distribution of these sensor elements through the use of simple
+      png image files. In particular, =CORTEX= implements more
+      comprehensive hearing than any other creature simulation system
+-     available.
++     available. 
+ 
+    - =CORTEX= supports any number of creatures and any number of
+      senses. Time in =CORTEX= dilates so that the simulated creatures
+@@ -353,7 +450,24 @@
+  \end{sidewaysfigure}
+  #+END_LaTeX
+ 
+-** Contributions
++** Road map
++
++   By the end of this thesis, you will have seen a novel approach to
++   interpreting video using embodiment and empathy. You will have also
++   seen one way to efficiently implement empathy for embodied
++   creatures. Finally, you will become familiar with =CORTEX=, a system
++   for designing and simulating creatures with rich senses, which you
++   may choose to use in your own research.
++
++   This is the core vision of my thesis: That one of the important ways
++   in which we understand others is by imagining ourselves in their
++   position and empathically feeling experiences relative to our own
++   bodies. By understanding events in terms of our own previous
++   corporeal experience, we greatly constrain the possibilities of what
++   would otherwise be an unwieldy exponential search. This extra
++   constraint can be the difference between easily understanding what
++   is happening in a video and being completely lost in a sea of
++   incomprehensible color and movement.
+ 
+    - I built =CORTEX=, a comprehensive platform for embodied AI
+      experiments. =CORTEX= supports many features lacking in other
+@@ -363,18 +477,22 @@
+    - I built =EMPATH=, which uses =CORTEX= to identify the actions of
+      a worm-like creature using a computational model of empathy.
+ 
+-* Building =CORTEX=
+-
+-  I intend for =CORTEX= to be used as a general-purpose library for
+-  building creatures and outfitting them with senses, so that it will
+-  be useful for other researchers who want to test out ideas of their
+-  own. To this end, wherver I have had to make archetictural choices
+-  about =CORTEX=, I have chosen to give as much freedom to the user as
+-  possible, so that =CORTEX= may be used for things I have not
+-  forseen.
+-
+-** Simulation or Reality?
+-
++
++* Designing =CORTEX=
++  In this section, I outline the design decisions that went into
++  making =CORTEX=, along with some details about its
++  implementation. (A practical guide to getting started with =CORTEX=,
++  which skips over the history and implementation details presented
++  here, is provided in an appendix \ref{} at the end of this paper.)
++
++  Throughout this project, I intended for =CORTEX= to be flexible and
++  extensible enough to be useful for other researchers who want to
++  test out ideas of their own. To this end, wherever I have had to make
++  architectural choices about =CORTEX=, I have chosen to give as much
++  freedom to the user as possible, so that =CORTEX= may be used for
++  things I have not foreseen.
++
++** Building in simulation versus reality
+   The most important architectural decision of all is the choice to
+   use a computer-simulated environment in the first place! The world
+   is a vast and rich place, and for now simulations are a very poor
+@@ -436,7 +554,7 @@
+   doing everything in software is far cheaper than building custom
+   real-time hardware. All you need is a laptop and some patience.
+ 
+-** Because of Time, simulation is perferable to reality
++** Simulated time enables rapid prototyping and complex scenes
+ 
+   I envision =CORTEX= being used to support rapid prototyping and
+   iteration of ideas. Even if I could put together a well constructed
+@@ -459,8 +577,8 @@
+   simulations of very simple creatures in =CORTEX= generally run at
+   40x on my machine!
+ 
+-** What is a sense?
+-
++** All sense organs are two-dimensional surfaces
++# What is a sense?
+   If =CORTEX= is to support a wide variety of senses, it would help
+   to have a better understanding of what a ``sense'' actually is!
+   While vision, touch, and hearing all seem like they are quite
+@@ -956,7 +1074,7 @@
+   #+ATTR_LaTeX: :width 15cm
+   [[./images/physical-hand.png]]
+ 
+-** Eyes reuse standard video game components
++** Sight reuses standard video game components...
+ 
+   Vision is one of the most important senses for humans, so I need to
+   build a simulated sense of vision for my AI. I will do this with
+@@ -1257,8 +1375,8 @@
+   community and is now (in modified form) part of a system for
+   capturing in-game video to a file.
+ 
+-** Hearing is hard; =CORTEX= does it right
+-
++** ...but hearing must be built from scratch
++# is hard; =CORTEX= does it right
+   At the end of this section I will have simulated ears that work the
+   same way as the simulated eyes in the last section. I will be able to
+   place any number of ear-nodes in a blender file, and they will bind to
+@@ -1565,7 +1683,7 @@
+   jMonkeyEngine3 community and is used to record audio for demo
+   videos.
+ 
+-** Touch uses hundreds of hair-like elements
++** Hundreds of hair-like elements provide a sense of touch
+ 
+   Touch is critical to navigation and spatial reasoning and as such I
+   need a simulated version of it to give to my AI creatures.
+@@ -2059,7 +2177,7 @@
+   #+ATTR_LaTeX: :width 15cm
+   [[./images/touch-cube.png]]
+ 
+-** Proprioception is the sense that makes everything ``real''
++** Proprioception provides knowledge of your own body's position
+ 
+   Close your eyes, and touch your nose with your right index finger.
+   How did you do it? You could not see your hand, and neither your
+@@ -2193,7 +2311,7 @@
+   #+ATTR_LaTeX: :width 11cm
+   [[./images/proprio.png]]
+ 
+-** Muscles are both effectors and sensors
++** Muscles contain both sensors and effectors
+ 
+   Surprisingly enough, terrestrial creatures only move by using
+   torque applied about their joints. There's not a single straight
+@@ -2440,7 +2558,8 @@
+      hard control problems without worrying about physics or
+      senses.
+ 
+-* Empathy in a simulated worm
++* =EMPATH=: the simulated worm experiment
++# Empathy in a simulated worm
+ 
+   Here I develop a computational model of empathy, using =CORTEX= as a
+   base. Empathy in this context is the ability to observe another
+@@ -2732,7 +2851,7 @@
+   provided by an experience vector and reliably inferring the rest of
+   the senses.
+ 
+-** Empathy is the process of tracing though \Phi-space
++** ``Empathy'' requires retracing steps through \Phi-space
+ 
+   Here is the core of a basic empathy algorithm, starting with an
+   experience vector:
+@@ -2888,7 +3007,7 @@
+   #+end_src
+   #+end_listing
+ 
+-** Efficient action recognition with =EMPATH=
++** =EMPATH= recognizes actions efficiently
+ 
+   To use =EMPATH= with the worm, I first need to gather a set of
+   experiences from the worm that includes the actions I want to
+@@ -3044,9 +3163,9 @@
+   to interpretation, and disagreement between empathy and experience
+   is more excusable.
+ 
+-** Digression: bootstrapping touch using free exploration
+-
+-  In the previous section I showed how to compute actions in terms of
++** Digression: Learn touch sensor layout through haptic experimentation, instead
++# Bootstrapping touch using free exploration
++  In the previous section I showed how to compute actions in terms of
+   body-centered predicates which relied on average touch activation of
+   pre-defined regions of the worm's skin. What if, instead of receiving
+   touch pre-grouped into the six faces of each worm segment, the true
diff -r 8b962ab418c8 -r 4c4d45f6f30b thesis/dylan-reject.diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/dylan-reject.diff	Sun Mar 30 10:41:18 2014 -0400
@@ -0,0 +1,11 @@
+@@ -3234,8 +3354,8 @@
+ 
+    - =CORTEX=, a system for creating simulated creatures with rich
+      senses.
+-   - =EMPATH=, a program for recognizing actions by imagining sensory
+-     experience.
++   - =EMPATH=, a program for recognizing actions by aligning them with
++     personal sensory experiences.
+ 
+  # An anatomical joke:
+  #  - Training
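The bullet above notes that the distribution of sensor elements is controlled ``through the use of simple png image files.'' That mechanism is easy to sketch. The following Clojure fragment is a minimal illustration of the idea, not =CORTEX='s actual loader: it treats every non-white pixel of an image as the coordinate of one sensor element painted onto a creature's surface. The function name =sensor-coords= is hypothetical.

#+begin_src clojure
(import '(javax.imageio ImageIO)
        '(java.io File))

(defn sensor-coords
  "Return the [x y] coordinates of every non-white pixel in a png,
   i.e. every place where the artist painted a sensor element.
   A sketch only; CORTEX's real sensor loader is more involved."
  [png-path]
  (let [img (ImageIO/read (File. png-path))]
    (for [x (range (.getWidth img))
          y (range (.getHeight img))
          ;; getRGB packs ARGB into one int; opaque white is 0xFFFFFFFF,
          ;; which is -1 as a signed int.
          :when (not= (.getRGB img x y) -1)]
      [x y])))
#+end_src

Because the image is authored in an ordinary paint program, sensor density can be varied per region (dense on a fingertip, sparse on a back) without touching any code.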
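The renamed section ``Empathy'' requires retracing steps through \Phi-space describes the core of =EMPATH=: record first-person experiences into a \Phi-space, then interpret another creature's pose by finding the closest recorded experience and imputing the senses that were never directly observed. Here is a self-contained, nearest-neighbor reading of that idea; the record and function names (=Experience=, =empathize=, =phi-space=) are invented for illustration, and the real =EMPATH= implementation differs in its details.

#+begin_src clojure
;; One moment of first-person experience: joint angles plus other senses.
(defrecord Experience [proprioception touch muscle])

(defn proprio-distance
  "Euclidean distance between two equal-length vectors of joint angles."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn empathize
  "Return the recorded first-person experience whose joint angles best
   match the observed creature's, carrying along the touch and muscle
   data that were never observed directly."
  [phi-space observed-proprio]
  (apply min-key
         #(proprio-distance (:proprioception %) observed-proprio)
         phi-space))

;; Toy usage: two remembered poses, then inference from proprioception alone.
(def phi-space
  [(->Experience [0.1 0.2] {:belly 0.9} [0.2])   ; remembered curled pose
   (->Experience [1.5 1.4] {:belly 0.0} [0.7])]) ; remembered stretched pose

(:touch (empathize phi-space [0.2 0.3]))
;; => {:belly 0.9} -- the curled memory is the closest match
#+end_src

Note that nothing in the lookup mentions cameras or pixels, which is the structural reason the results reported above are independent of viewing angle.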
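Finally, the digression section refers to computing actions ``in terms of body-centered predicates'' over average touch activation of skin regions. To make the flavor of such predicates concrete, here is a hedged sketch: both functions below are hypothetical stand-ins with made-up thresholds, not the thesis's actual predicates, but they show the key property that the definitions mention only joints, touch, and time, never camera position.

#+begin_src clojure
(defn curled?
  "Pose-level predicate: true when every joint in the most recent
   experience is flexed past a fixed threshold. Assumes a non-empty
   sequence of experience maps with a :proprioception vector."
  [experiences]
  (every? #(> % 1.0) (:proprioception (last experiences))))

(defn resting?
  "Touch-level predicate: true when the average belly-touch activation
   over the recent window exceeds a threshold."
  [experiences]
  (let [activations (map #(get-in % [:touch :belly] 0.0) experiences)]
    (< 0.5 (/ (reduce + activations) (count activations)))))
#+end_src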