Mercurial > cortex

comparison: thesis/aux/org/first-chapter.html @ 422:6b0f77df0e53
building latex scaffolding for thesis.

author    Robert McIntyre <rlm@mit.edu>
date      Fri, 21 Mar 2014 01:17:41 -0400
parents   thesis/org/first-chapter.html@7ee735a836da
children  (none)
comparing revisions 421:c2c28c3e27c4 and 422:6b0f77df0e53

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title><code>CORTEX</code></title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name="title" content="<code>CORTEX</code>"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2013-11-07 04:21:29 EST"/>
<meta name="author" content="Robert McIntyre"/>
<meta name="description" content="Using embodied AI to facilitate Artificial Imagination."/>
<meta name="keywords" content="AI, clojure, embodiment"/>
<style type="text/css">
<!--/*--><![CDATA[/*><!--*/
html { font-family: Times, serif; font-size: 12pt; }
.title { text-align: center; }
.todo { color: red; }
.done { color: green; }
.tag { background-color: #add8e6; font-weight:normal }
.target { }
.timestamp { color: #bebebe; }
.timestamp-kwd { color: #5f9ea0; }
.right {margin-left:auto; margin-right:0px; text-align:right;}
.left {margin-left:0px; margin-right:auto; text-align:left;}
.center {margin-left:auto; margin-right:auto; text-align:center;}
p.verse { margin-left: 3% }
pre {
border: 1pt solid #AEBDCC;
background-color: #F3F5F7;
padding: 5pt;
font-family: courier, monospace;
font-size: 90%;
overflow:auto;
}
table { border-collapse: collapse; }
td, th { vertical-align: top; }
th.right { text-align:center; }
th.left { text-align:center; }
th.center { text-align:center; }
td.right { text-align:right; }
td.left { text-align:left; }
td.center { text-align:center; }
dt { font-weight: bold; }
div.figure { padding: 0.5em; }
div.figure p { text-align: center; }
div.inlinetask {
padding:10px;
border:2px solid gray;
margin:10px;
background: #ffffcc;
}
textarea { overflow-x: auto; }
.linenr { font-size:smaller }
.code-highlighted {background-color:#ffff00;}
.org-info-js_info-navigation { border-style:none; }
#org-info-js_console-label { font-size:10px; font-weight:bold;
white-space:nowrap; }
.org-info-js_search-highlight {background-color:#ffff00; color:#000000;
font-weight:bold; }
/*]]>*/-->
</style>
<script type="text/javascript">var _gaq = _gaq || [];_gaq.push(['_setAccount', 'UA-31261312-1']);_gaq.push(['_trackPageview']);(function() {var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);})();</script><link rel="stylesheet" type="text/css" href="../../aurellem/css/argentum.css" />
<script type="text/javascript">
<!--/*--><![CDATA[/*><!--*/
function CodeHighlightOn(elem, id)
{
var target = document.getElementById(id);
if(null != target) {
elem.cacheClassElem = elem.className;
elem.cacheClassTarget = target.className;
target.className = "code-highlighted";
elem.className = "code-highlighted";
}
}
function CodeHighlightOff(elem, id)
{
var target = document.getElementById(id);
if(elem.cacheClassElem)
elem.className = elem.cacheClassElem;
if(elem.cacheClassTarget)
target.className = elem.cacheClassTarget;
}
/*]]>*///-->
</script>

</head>
<body>


<div id="content">
<h1 class="title"><code>CORTEX</code></h1>


<div class="header">
<div class="float-right">
<!--
<form>
<input type="text"/><input type="submit" value="search the blog &#187;"/>
</form>
-->
</div>

<h1>aurellem <em>☉</em></h1>
<ul class="nav">
<li><a href="/">read the blog &#187;</a></li>
<!-- li><a href="#">learn about us &#187;</a></li-->
</ul>
</div>

<div class="author">Written by <author>Robert McIntyre</author></div>







<div id="outline-container-1" class="outline-2">
<h2 id="sec-1">Artificial Imagination</h2>
<div class="outline-text-2" id="text-1">


<p>
Imagine watching a video of someone skateboarding. When you watch
the video, you can imagine yourself skateboarding, and your
knowledge of the human body and its dynamics guides your
interpretation of the scene. For example, even if the skateboarder
is partially occluded, you can infer the positions of his arms and
body from your own knowledge of how your body would be positioned if
you were skateboarding. If the skateboarder suffers an accident, you
wince in sympathy, imagining the pain your own body would experience
if it were in the same situation. This empathy with other people
guides our understanding of whatever they are doing because it is a
powerful constraint on what is probable and possible. In order to
make use of this powerful empathy constraint, I need a system that
can generate and make sense of sensory data from the many different
senses that humans possess. The two key properties of such a system
are <i>embodiment</i> and <i>imagination</i>.
</p>

</div>

<div id="outline-container-1-1" class="outline-3">
<h3 id="sec-1-1">What is imagination?</h3>
<div class="outline-text-3" id="text-1-1">


<p>
One kind of imagination is <i>sympathetic</i> imagination: you imagine
yourself in the position of something/someone you are
observing. This type of imagination comes into play when you follow
along visually when watching someone perform actions, or when you
sympathetically grimace when someone hurts themselves. This type of
imagination uses the constraints you have learned about your own
body to highly constrain the possibilities in whatever you are
seeing. It uses all your senses, including your senses of touch,
proprioception, etc. Humans are flexible when it comes to "putting
themselves in another's shoes," and can sympathetically understand
not only other humans, but entities ranging from animals to cartoon
characters to <a href="http://www.youtube.com/watch?v=0jz4HcwTQmU">single dots</a> on a screen!
</p>
<p>
Another kind of imagination is <i>predictive</i> imagination: you
construct scenes in your mind that are not entirely related to
whatever you are observing, but instead are predictions of the
future or simply flights of fancy. You use this type of imagination
to plan out multi-step actions, or play out dangerous situations in
your mind so as to avoid messing them up in reality.
</p>
<p>
Of course, sympathetic and predictive imagination blend into each
other and are not completely separate concepts. One dimension along
which you can distinguish types of imagination is dependence on raw
sense data. Sympathetic imagination is highly constrained by your
senses, while predictive imagination can be more or less dependent
on your senses depending on how far ahead you imagine. Daydreaming
is an extreme form of predictive imagination that wanders through
different possibilities without concern for whether they are
related to whatever is happening in reality.
</p>
<p>
For this thesis, I will mostly focus on sympathetic imagination and
the constraint it provides for understanding sensory data.
</p>
</div>

</div>

<div id="outline-container-1-2" class="outline-3">
<h3 id="sec-1-2">What problems can imagination solve?</h3>
<div class="outline-text-3" id="text-1-2">


<p>
Consider a video of a cat drinking some water.
</p>

<div class="figure">
<p><img src="../images/cat-drinking.jpg" alt="../images/cat-drinking.jpg" /></p>
<p>A cat drinking some water. Identifying this action is beyond the state of the art for computers.</p>
</div>

<p>
It is currently impossible for any computer program to reliably
label such a video as "drinking". I think humans are able to label
such a video as "drinking" because they imagine <i>themselves</i> as the
cat, and imagine putting their face up against a stream of water
and sticking out their tongue. In that imagined world, they can
feel the cool water hitting their tongue, and feel the water
entering their body, and are able to recognize that <i>feeling</i> as
drinking. So, the label of the action is not really in the pixels
of the image, but is found clearly in a simulation inspired by
those pixels. An imaginative system, having been trained on
drinking and non-drinking examples and learning that the most
important component of drinking is the feeling of water sliding
down one's throat, would analyze a video of a cat drinking in the
following manner:
</p>
<ul>
<li>Create a physical model of the video by putting a "fuzzy" model
of its own body in place of the cat. Also, create a simulation of
the stream of water.

</li>
<li>Play out this simulated scene and generate imagined sensory
experience. This will include relevant muscle contractions, a
close-up view of the stream from the cat's perspective, and most
importantly, the imagined feeling of water entering the mouth.

</li>
<li>The action is now easily identified as drinking by the sense of
taste alone. The other senses (such as the tongue moving in and
out) help to give plausibility to the simulated action. Note that
the sense of vision, while critical in creating the simulation,
is not critical for identifying the action from the simulation.
</li>
</ul>


<p>
More generally, I expect imaginative systems to be particularly
good at identifying embodied actions in videos.
</p>
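<p>
Purely as an illustration of how these three steps could be organized
in code, here is a minimal sketch in Clojure. The function names
(<code>fit-body-to-video</code>, <code>simulate-senses</code>,
<code>classify-feeling</code>) are hypothetical placeholders for the
steps above, not parts of any existing library.
</p>
<pre>
;; Hypothetical sketch of the three-step analysis described above.
;; None of these functions exist yet; they only name the steps.
(defn label-action
  "Guess the action in a video by simulating it from the inside."
  [video known-feelings]
  (let [model  (fit-body-to-video video)   ; 1. fuzzy self-model plus simulated water
        senses (simulate-senses model)]    ; 2. imagined touch, taste, muscle data
    (classify-feeling senses known-feelings)))  ; 3. match the imagined feeling
</pre>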
</div>
</div>

</div>

<div id="outline-container-2" class="outline-2">
<h2 id="sec-2">Cortex</h2>
<div class="outline-text-2" id="text-2">


<p>
The previous example involves liquids, the sense of taste, and
imagining oneself as a cat. For this thesis I constrain myself to
simpler, more easily digitizable senses and situations.
</p>
<p>
My system, <code>Cortex</code>, performs imagination in two different simplified
worlds: <i>worm world</i> and <i>stick figure world</i>. In each of these
worlds, entities capable of imagination recognize actions by
simulating the experience from their own perspective, and then
recognizing the action from a database of examples.
</p>
<p>
In order to serve as a framework for experiments in imagination,
<code>Cortex</code> requires simulated bodies, worlds, and senses like vision,
hearing, touch, proprioception, etc.
</p>

</div>

<div id="outline-container-2-1" class="outline-3">
<h3 id="sec-2-1">A Video Game Engine takes care of some of the groundwork</h3>
<div class="outline-text-3" id="text-2-1">


<p>
When it comes to simulation environments, the engines used to
create the worlds in video games offer top-notch physics and
graphics support. These engines also have limited support for
creating cameras and rendering 3D sound, which can be repurposed
for vision and hearing respectively. Physics collision detection
can be expanded to create a sense of touch.
</p>
<p>
jMonkeyEngine3 is one such engine for creating video games in
Java. It uses OpenGL to render to the screen and uses scene graphs
to avoid drawing things that do not appear on the screen. It has an
active community and several games in the pipeline. The engine was
not built to serve any particular game but is instead meant to be
used for any 3D game. I chose jMonkeyEngine3 because it had the
most features out of all the open projects I looked at, and because
I could then write my code in Clojure, a dialect of Lisp
that runs on the JVM.
</p>
</div>

</div>

<div id="outline-container-2-2" class="outline-3">
<h3 id="sec-2-2"><code>CORTEX</code> Extends jMonkeyEngine3 to implement rich senses</h3>
<div class="outline-text-3" id="text-2-2">


<p>
Using the game-making primitives provided by jMonkeyEngine3, I have
constructed every major human sense except for smell and
taste. <code>Cortex</code> also provides an interface for creating creatures
in Blender, a 3D modeling environment, and then "rigging" the
creatures with senses using 3D annotations in Blender. A creature
can have any number of senses, and there can be any number of
creatures in a simulation.
</p>
<p>
The senses available in <code>Cortex</code> are:
</p>
<ul>
<li><a href="../../cortex/html/vision.html">Vision</a>
</li>
<li><a href="../../cortex/html/hearing.html">Hearing</a>
</li>
<li><a href="../../cortex/html/touch.html">Touch</a>
</li>
<li><a href="../../cortex/html/proprioception.html">Proprioception</a>
</li>
<li><a href="../../cortex/html/movement.html">Muscle Tension</a>
</li>
</ul>
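<p>
As a rough sketch only (these names are illustrative placeholders, not
<code>Cortex</code>'s actual API), rigging a Blender creature and polling
its senses from Clojure could look something like this:
</p>
<pre>
;; Illustrative only: hypothetical names for loading a rigged creature
;; and reading its senses inside the simulation loop.
(def worm (load-blender-creature "creatures/worm.blend"))

(def worm-senses
  {:vision         (attach-vision worm)
   :touch          (attach-touch worm)
   :proprioception (attach-proprioception worm)
   :muscles        (attach-muscles worm)})

;; each sense is modeled here as a no-argument function returning the
;; current sensory data for this simulation step
(defn sense-snapshot [senses]
  (into {} (for [[sense-name sense-fn] senses] [sense-name (sense-fn)])))
</pre>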


</div>
</div>

</div>

<div id="outline-container-3" class="outline-2">
<h2 id="sec-3">A roadmap for <code>Cortex</code> experiments</h2>
<div class="outline-text-2" id="text-3">



</div>

<div id="outline-container-3-1" class="outline-3">
<h3 id="sec-3-1">Worm World</h3>
<div class="outline-text-3" id="text-3-1">


<p>
Worms in <code>Cortex</code> are segmented creatures which vary in length and
number of segments, and have the senses of vision, proprioception,
touch, and muscle tension.
</p>

<div class="figure">
<p><img src="../images/finger-UV.png" width="755" alt="../images/finger-UV.png" /></p>
<p>This is the tactile-sensor-profile for the upper segment of a worm. It defines regions of high touch sensitivity (where there are many white pixels) and regions of low sensitivity (where white pixels are sparse).</p>
</div>




<div class="figure">
<center>
<video controls="controls" width="550">
<source src="../video/worm-touch.ogg" type="video/ogg"
preload="none" />
</video>
<br/> <a href="http://youtu.be/RHx2wqzNVcU"> YouTube </a>
</center>
<p>The worm responds to touch.</p>
</div>

<div class="figure">
<center>
<video controls="controls" width="550">
<source src="../video/test-proprioception.ogg" type="video/ogg"
preload="none" />
</video>
<br/> <a href="http://youtu.be/JjdDmyM8b0w"> YouTube </a>
</center>
<p>Proprioception in a worm. The proprioceptive readout is
in the upper left corner of the screen.</p>
</div>

<p>
A worm is trained in various actions such as sinusoidal movement,
curling, flailing, and spinning by directly playing motor
contractions while the worm "feels" the experience. These actions
are recorded not only as vectors of muscle tension, touch, and
proprioceptive data, but also in higher level forms such as
frequencies of the various contractions and a symbolic name for the
action.
</p>
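<p>
Purely to illustrate the kind of data involved (the exact representation
may differ), one recorded training example might be stored as a map with
the raw sensory vectors, the higher-level summary, and the symbolic name:
</p>
<pre>
;; Hypothetical shape of one recorded training example.  The numbers
;; are invented; only the structure is meant to be suggestive.
(def example-curl
  {:name            :curl                         ; symbolic name of the action
   :muscle          [[0.0 0.8 0.2] [0.1 0.9 0.1]] ; per-frame tension vectors
   :touch           [[0 0 1] [0 1 1]]             ; per-frame touch activations
   :proprioception  [[0.1 1.2] [0.2 1.4]]         ; per-frame joint angles
   :frequencies     {:contraction 2.5}})          ; summary of contraction rates
</pre>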
<p>
Then, the worm watches a video of another worm performing one of
the actions, and must judge which action was performed. Normally
this would be an extremely difficult problem, but the worm is able
to greatly diminish the search space through sympathetic
imagination. First, it creates an imagined copy of its body which
it observes from a third-person point of view. Then for each frame
of the video, it maneuvers its simulated body to be in registration
with the worm depicted in the video. The physical constraints
imposed by the physics simulation greatly decrease the number of
poses that have to be tried, making the search feasible. As the
imaginary worm moves, it generates imaginary muscle tension and
proprioceptive sensations. The worm determines the action not by
vision, but by matching the imagined proprioceptive data with
previous examples.
</p>
<p>
By using non-visual sensory data such as touch, the worms can also
answer body-related questions such as "did your head touch your
tail?" and "did worm A touch worm B?"
</p>
<p>
The proprioceptive information used for action identification is
body-centric, so only the registration step is dependent on point
of view, not the identification step. Registration is not specific
to any particular action. Thus, action identification can be
divided into a point-of-view dependent generic registration step,
and an action-specific step that is body-centered and invariant to
point of view.
</p>
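<p>
To make the body-centered identification step concrete, here is a minimal
sketch (with a hypothetical helper, <code>proprioceptive-distance</code>,
standing in for whatever similarity measure is actually used): the imagined
proprioceptive signature is compared against each stored example, and the
closest example names the action.
</p>
<pre>
;; Sketch of body-centered action identification by nearest example.
;; `proprioceptive-distance` is a hypothetical similarity measure
;; between two sequences of proprioceptive readings.
(defn identify-action
  "Return the symbolic name of the stored example whose proprioceptive
  signature is closest to the imagined one."
  [imagined-signature examples]        ; examples is a map: name -> signature
  (key (apply min-key
              (fn [[_ signature]]
                (proprioceptive-distance imagined-signature signature))
              examples)))
</pre>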
</div>

</div>

<div id="outline-container-3-2" class="outline-3">
<h3 id="sec-3-2">Stick Figure World</h3>
<div class="outline-text-3" id="text-3-2">


<p>
This environment is similar to Worm World, except the creatures are
more complicated and the actions and questions more varied. It is
an experiment to see how far imagination can go in interpreting
actions.
</p></div>
</div>
</div>
</div>

<div id="postamble">
<p class="date">Date: 2013-11-07 04:21:29 EST</p>
<p class="author">Author: Robert McIntyre</p>
<p class="creator">Org version 7.7 with Emacs version 24</p>
<a href="http://validator.w3.org/check?uri=referer">Validate XHTML 1.0</a>

</div>
</body>
</html>