# HG changeset patch
# User Robert McIntyre
# Date 1396041043 14400
# Node ID 3401053124b0a867d1598f36904df062d260a061
# Parent  ae10f35022ba7548f49fe02390053f64bcb27149
integrating vision into thesis.

diff -r ae10f35022ba -r 3401053124b0 org/vision.org
--- a/org/vision.org	Fri Mar 28 16:34:35 2014 -0400
+++ b/org/vision.org	Fri Mar 28 17:10:43 2014 -0400
@@ -174,21 +174,18 @@
     (bind-sense target cam) cam))
 #+end_src
-#+results: add-eye
-: #'cortex.vision/add-eye!
-
 Here, the camera is created based on metadata on the eye-node and
 attached to the nearest physical object with =bind-sense=
 
 ** The Retina
 
 An eye is a surface (the retina) which contains many discrete sensors
-to detect light. These sensors have can have different light-sensing
-properties. In humans, each discrete sensor is sensitive to red,
-blue, green, or gray. These different types of sensors can have
-different spatial distributions along the retina. In humans, there is
-a fovea in the center of the retina which has a very high density of
-color sensors, and a blind spot which has no sensors at all. Sensor
-density decreases in proportion to distance from the fovea.
+to detect light. These sensors can have different light-sensing
+properties. In humans, each discrete sensor is sensitive to red, blue,
+green, or gray. These different types of sensors can have different
+spatial distributions along the retina. In humans, there is a fovea in
+the center of the retina which has a very high density of color
+sensors, and a blind spot which has no sensors at all. Sensor density
+decreases in proportion to distance from the fovea.
 
 I want to be able to model any retinal configuration, so my eye-nodes
 in blender contain metadata pointing to images that describe the
diff -r ae10f35022ba -r 3401053124b0 thesis/cortex.org
--- a/thesis/cortex.org	Fri Mar 28 16:34:35 2014 -0400
+++ b/thesis/cortex.org	Fri Mar 28 17:10:43 2014 -0400
@@ -6,22 +6,36 @@
 #+LaTeX_CLASS_OPTIONS: [nofloat]
 
 * COMMENT templates
-  #+caption: 
-  #+caption: 
-  #+caption: 
-  #+caption: 
-  #+name: name
-  #+begin_listing clojure
-  #+begin_src clojure
-  #+end_src
-  #+end_listing
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+name: name
+  #+begin_listing clojure
+  #+end_listing
 
-  #+caption: 
-  #+caption: 
-  #+caption: 
-  #+name: name
-  #+ATTR_LaTeX: :width 10cm
-  [[./images/aurellem-gray.png]]
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+name: name
+  #+ATTR_LaTeX: :width 10cm
+  [[./images/aurellem-gray.png]]
+
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+name: name
+  #+begin_listing clojure
+  #+end_listing
+
+  #+caption: 
+  #+caption: 
+  #+caption: 
+  #+name: name
+  #+ATTR_LaTeX: :width 10cm
+  [[./images/aurellem-gray.png]]
+
 
 * COMMENT Empathy and Embodiment as problem solving strategies
 
@@ -942,6 +956,285 @@
 
 ** Eyes reuse standard video game components
 
+   Vision is one of the most important senses for humans, so I need to
+   build a simulated sense of vision for my AI. I will do this with
+   simulated eyes. Each eye can be independently moved and should see
+   its own version of the world depending on where it is.
+
+   Making these simulated eyes a reality is simple because
+   jMonkeyEngine already contains extensive support for multiple views
+   of the same 3D simulated world. jMonkeyEngine has this support
+   because it is necessary for creating games with split-screen views.
+   Multiple views are also used to create efficient pseudo-reflections
+   by rendering the scene from a certain perspective and then
+   projecting it back onto a surface in the 3D world.
+
+   #+caption: jMonkeyEngine supports multiple views to enable
+   #+caption: split-screen games, like GoldenEye, which was one of
+   #+caption: the first games to use split-screen views.
+   #+name: goldeneye
+   #+ATTR_LaTeX: :width 10cm
+   [[./images/goldeneye-4-player.png]]
+
+*** A Brief Description of jMonkeyEngine's Rendering Pipeline
+
+    jMonkeyEngine allows you to create a =ViewPort=, which represents
+    a view of the simulated world. You can create as many of these as
+    you want. Every frame, the =RenderManager= iterates through each
+    =ViewPort=, rendering the scene on the GPU. Each =ViewPort= has a
+    =FrameBuffer= which represents the rendered image on the GPU.
+
+    #+caption: =ViewPorts= are cameras in the world. During each frame,
+    #+caption: the =RenderManager= records a snapshot of what each view
+    #+caption: is currently seeing; these snapshots are =FrameBuffer= objects.
+    #+name: rendermanager
+    #+ATTR_LaTeX: :width 10cm
+    [[./images/diagram_rendermanager2.png]]
+
+    Each =ViewPort= can have any number of attached =SceneProcessor=
+    objects, which are called every time a new frame is rendered. A
+    =SceneProcessor= receives its =ViewPort='s =FrameBuffer= and can
+    do whatever it wants to the data. Often this consists of invoking
+    GPU-specific operations on the rendered image. The
+    =SceneProcessor= can also copy the GPU image data to RAM and
+    process it with the CPU.
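+
+    For concreteness, here is a minimal sketch of how a new
+    =ViewPort= with an attached =SceneProcessor= might be created
+    using the standard jMonkeyEngine API. The helper name
+    =attach-processor!= is illustrative rather than part of =CORTEX=,
+    and it assumes the world object exposes =getRenderManager= and
+    =getRootNode= as jMonkeyEngine's =SimpleApplication= does:
+
+    #+begin_src clojure
+;; sketch; assumes (import '(com.jme3.math ColorRGBA))
+(defn attach-processor!
+  "Create a new ViewPort that renders 'camera's perspective and
+   attach 'processor to it, so that the processor is handed the
+   resulting FrameBuffer every frame."
+  [world camera processor]
+  (let [render-manager (.getRenderManager world)
+        view (.createPostView render-manager "eye-view" camera)]
+    (doto view
+      (.setClearFlags true true true)       ; clear color, depth, stencil
+      (.setBackgroundColor ColorRGBA/Black)
+      (.addProcessor processor)             ; e.g. the vision-pipeline below
+      (.attachScene (.getRootNode world)))))
+    #+end_src
+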
+*** Appropriating Views for Vision
+
+    Each eye in the simulated creature needs its own =ViewPort= so
+    that it can see the world from its own perspective. To this
+    =ViewPort=, I add a =SceneProcessor= that feeds the visual data to
+    any arbitrary continuation function for further processing. That
+    continuation function may perform both CPU and GPU operations on
+    the data. To make this easy for the continuation function, the
+    =SceneProcessor= maintains appropriately sized buffers in RAM to
+    hold the data. It does not do any copying from the GPU to the CPU
+    itself, because that is a slow operation.
+
+    #+caption: Function to make the rendered scene in jMonkeyEngine
+    #+caption: available for further processing.
+    #+name: pipeline-1
+    #+begin_listing clojure
+    #+begin_src clojure
+(defn vision-pipeline
+  "Create a SceneProcessor object which wraps a vision processing
+  continuation function. The continuation is a function that takes
+  [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
+  each of which has already been appropriately sized."
+  [continuation]
+  (let [byte-buffer (atom nil)
+        renderer (atom nil)
+        image (atom nil)]
+    (proxy [SceneProcessor] []
+      (initialize
+       [renderManager viewPort]
+       (let [cam (.getCamera viewPort)
+             width (.getWidth cam)
+             height (.getHeight cam)]
+         (reset! renderer (.getRenderer renderManager))
+         (reset! byte-buffer
+                 (BufferUtils/createByteBuffer
+                  (* width height 4)))
+         (reset! image (BufferedImage.
+                        width height
+                        BufferedImage/TYPE_4BYTE_ABGR))))
+      (isInitialized [] (not (nil? @byte-buffer)))
+      (reshape [_ _ _])
+      (preFrame [_])
+      (postQueue [_])
+      (postFrame
+       [#^FrameBuffer fb]
+       (.clear @byte-buffer)
+       (continuation @renderer fb @byte-buffer @image))
+      (cleanup []))))
+    #+end_src
+    #+end_listing
+
+    The continuation function given to =vision-pipeline= above will be
+    given a =Renderer= and three containers for image data. The
+    =FrameBuffer= references the GPU image data, but the pixel data
+    cannot be used directly on the CPU. The =ByteBuffer= and
+    =BufferedImage= are initially "empty" but are sized to hold the
+    data in the =FrameBuffer=. I call transferring the GPU image data
+    to the CPU structures "mixing" the image data.
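+
+    The mixing step itself is left to the continuation function. A
+    minimal sketch of such a continuation, assuming jMonkeyEngine's
+    =Screenshots= utility class (the name =mix-image-data!= is
+    illustrative, not part of =CORTEX=), might look like this:
+
+    #+begin_src clojure
+;; sketch; assumes (import '(com.jme3.util Screenshots))
+(defn mix-image-data!
+  "'Mix' the image data by reading the GPU FrameBuffer into the
+   ByteBuffer (the slow GPU->RAM copy), then decoding those raw
+   bytes into the BufferedImage for CPU-side processing. The
+   ByteBuffer has already been cleared by vision-pipeline."
+  [r fb bb bi]
+  (.readFrameBuffer r fb bb)             ; GPU -> RAM
+  (Screenshots/convertScreenShot bb bi)  ; raw bytes -> image
+  bi)
+    #+end_src
+
+    Passing =mix-image-data!=, or any function like it, as the
+    continuation argument of =vision-pipeline= completes the path
+    from the GPU to the CPU.
+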
+*** Optical sensor arrays are described with images and referenced with metadata
+
+    The vision pipeline described above handles the flow of rendered
+    images. Now, =CORTEX= needs simulated eyes to serve as the source
+    of these images.
+
+    An eye is described in blender in the same way as a joint: it is
+    a zero-dimensional empty object with no geometry whose local
+    coordinate system determines the orientation of the resulting
+    eye. All eyes are children of a parent node named "eyes", just as
+    all joints have a parent named "joints". An eye binds to the
+    nearest physical object with =bind-sense=.
+
+    #+caption: Here, the camera is created based on metadata on the
+    #+caption: eye-node and attached to the nearest physical object
+    #+caption: with =bind-sense=.
+    #+name: add-eye
+    #+begin_listing clojure
+    #+begin_src clojure
+(defn add-eye!
+  "Create a Camera centered on the current position of 'eye which
+   follows the closest physical node in 'creature. The camera will
+   point in the X direction and use the Z vector as up as determined
+   by the rotation of these vectors in blender coordinate space. Use
+   XZY rotation for the node in blender."
+  [#^Node creature #^Spatial eye]
+  (let [target (closest-node creature eye)
+        [cam-width cam-height]
+        ;;[640 480] ;; graphics card on laptop doesn't support
+                    ;; arbitrary dimensions.
+        (eye-dimensions eye)
+        cam (Camera. cam-width cam-height)
+        rot (.getWorldRotation eye)]
+    (.setLocation cam (.getWorldTranslation eye))
+    (.lookAtDirection
+     cam                          ; this part is not a mistake and
+     (.mult rot Vector3f/UNIT_X)  ; is consistent with using Z in
+     (.mult rot Vector3f/UNIT_Y)) ; blender as the UP vector.
+    (.setFrustumPerspective
+     cam (float 45)
+     (float (/ (.getWidth cam) (.getHeight cam)))
+     (float 1)
+     (float 1000))
+    (bind-sense target cam) cam))
+    #+end_src
+    #+end_listing
+
+*** Simulated Retina
+
+    An eye is a surface (the retina) which contains many discrete
+    sensors to detect light. These sensors can have different
+    light-sensing properties. In humans, each discrete sensor is
+    sensitive to red, blue, green, or gray. These different types of
+    sensors can have different spatial distributions along the
+    retina. In humans, there is a fovea in the center of the retina
+    which has a very high density of color sensors, and a blind spot
+    which has no sensors at all. Sensor density decreases in
+    proportion to distance from the fovea.
+
+    I want to be able to model any retinal configuration, so my
+    eye-nodes in blender contain metadata pointing to images that
+    describe the precise position of the individual sensors using
+    white pixels. The metadata also describes the precise light
+    sensitivity of the sensors described in the image. An eye can
+    contain any number of these images. For example, the metadata for
+    an eye might look like this:
+
+    #+begin_src clojure
+{0xFF0000 "Models/test-creature/retina-small.png"}
+    #+end_src
+
+    #+caption: An example retinal profile image. White pixels are
+    #+caption: photo-sensitive elements. The distribution of white
+    #+caption: pixels is denser in the middle and falls off at the
+    #+caption: edges, a pattern inspired by the human retina.
+    #+name: retina
+    #+ATTR_LaTeX: :width 10cm
+    [[./images/retina-small.png]]
+
+    Together, the number 0xFF0000 and the image above describe the
+    placement of red-sensitive sensory elements.
+
+    Metadata to very crudely approximate a human eye might be
+    something like this:
+
+    #+begin_src clojure
+(let [retinal-profile "Models/test-creature/retina-small.png"]
+  {0xFF0000 retinal-profile
+   0x00FF00 retinal-profile
+   0x0000FF retinal-profile
+   0xFFFFFF retinal-profile})
+    #+end_src
+
+    The numbers that serve as keys in the map determine a sensor's
+    relative sensitivity to the channels red, green, and blue. These
+    sensitivity values are packed into an integer in the order
+    =|_|R|G|B|= in 8-bit fields. The RGB values of a pixel in the
+    image are added together with these sensitivities as linear
+    weights. Therefore, 0xFF0000 means sensitive to red only, while
+    0xFFFFFF means sensitive to all colors equally (gray).
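+
+    As a concrete sketch of this weighting scheme, the function below
+    unpacks the 8-bit fields and computes the weighted sum,
+    normalized to the range [0, 1]. The actual =pixel-sense= used by
+    =vision-kernel= below may differ in details such as
+    normalization:
+
+    #+begin_src clojure
+(defn pixel-sense
+  "Sketch: weight a pixel's R, G, and B components by the 8-bit
+   sensitivity fields packed as |_|R|G|B|, and return the weighted
+   sum normalized to [0, 1]."
+  [sensitivity pixel]
+  (let [field (fn [x shift] (bit-and 0xFF (bit-shift-right x shift)))
+        weights [(field sensitivity 16)  ; R sensitivity
+                 (field sensitivity 8)   ; G sensitivity
+                 (field sensitivity 0)]  ; B sensitivity
+        values  [(field pixel 16)        ; R value
+                 (field pixel 8)         ; G value
+                 (field pixel 0)]        ; B value
+        ;; the largest possible response is 255 * (sum of weights)
+        max-response (* 255 (apply + weights))]
+    (if (zero? max-response)
+      0.0
+      (float (/ (reduce + (map * weights values)) max-response)))))
+    #+end_src
+
+    With this definition, =(pixel-sense 0xFF0000 0xFF0000)= is 1.0 (a
+    pure-red sensor seeing a pure-red pixel), while
+    =(pixel-sense 0xFFFFFF 0x808080)= is about 0.5.
+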
+    #+caption: This is the core of vision in =CORTEX=. A given eye node
+    #+caption: is converted into a function that returns visual
+    #+caption: information from the simulation.
+    #+name: vision-kernel
+    #+begin_listing clojure
+    #+begin_src clojure
+(defn vision-kernel
+  "Returns a list of functions, each of which will return a color
+   channel's worth of visual information when called inside a running
+   simulation."
+  [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
+  (let [retinal-map (retina-sensor-profile eye)
+        camera (add-eye! creature eye)
+        vision-image
+        (atom
+         (BufferedImage. (.getWidth camera)
+                         (.getHeight camera)
+                         BufferedImage/TYPE_BYTE_BINARY))
+        register-eye!
+        (runonce
+         (fn [world]
+           (add-camera!
+            world camera
+            (let [counter (atom 0)]
+              (fn [r fb bb bi]
+                (if (zero? (rem (swap! counter inc) (inc skip)))
+                  (reset! vision-image
+                          (BufferedImage! r fb bb bi))))))))]
+    (vec
+     (map
+      (fn [[key image]]
+        (let [whites (white-coordinates image)
+              topology (vec (collapse whites))
+              sensitivity (sensitivity-presets key key)]
+          (attached-viewport.
+           (fn [world]
+             (register-eye! world)
+             (vector
+              topology
+              (vec
+               (for [[x y] whites]
+                 (pixel-sense
+                  sensitivity
+                  (.getRGB @vision-image x y))))))
+           register-eye!)))
+      retinal-map))))
+    #+end_src
+    #+end_listing
+
+    Note that since each of the functions generated by
+    =vision-kernel= shares the same =register-eye!= function, the eye
+    will be registered only once, the first time any of the functions
+    from the list returned by =vision-kernel= is called. Each of the
+    functions returned by =vision-kernel= also allows access to the
+    =ViewPort= through which it receives images.
+
+    All the hard work has been done; all that remains is to apply
+    =vision-kernel= to each eye in the creature and gather the
+    results into one list of functions.
+
+    #+caption: With =vision!=, =CORTEX= is already a fine simulation
+    #+caption: environment for experimenting with different types of
+    #+caption: eyes.
+    #+name: vision!
+    #+begin_listing clojure
+    #+begin_src clojure
+(defn vision!
+  "Returns a list of functions, each of which returns visual sensory
+   data when called inside a running simulation."
+  [#^Node creature & {skip :skip :or {skip 0}}]
+  (reduce
+   concat
+   (for [eye (eyes creature)]
+     (vision-kernel creature eye))))
+    #+end_src
+    #+end_listing
+
 ** Hearing is hard; =CORTEX= does it right
 
 ** Touch uses hundreds of hair-like elements
 
diff -r ae10f35022ba -r 3401053124b0 thesis/images/retina-small.png
Binary file thesis/images/retina-small.png has changed