rlm@34: #+title: Simulated Sense of Sight
rlm@23: #+author: Robert McIntyre
rlm@23: #+email: rlm@mit.edu
#+description: Simulated sight for AI research using jMonkeyEngine3 and Clojure
rlm@34: #+keywords: computer vision, jMonkeyEngine3, clojure
rlm@23: #+SETUPFILE: ../../aurellem/org/setup.org
rlm@23: #+INCLUDE: ../../aurellem/org/level-0.org
rlm@23: #+babel: :mkdirp yes :noweb yes :exports both
rlm@23: 
* jMonkeyEngine natively supports multiple views of the same world.
ocsenave@264:  
rlm@212: Vision is one of the most important senses for humans, so I need to
rlm@212: build a simulated sense of vision for my AI. I will do this with
rlm@306: simulated eyes. Each eye can be independently moved and should see its
rlm@212: own version of the world depending on where it is.
rlm@212: 
rlm@306: Making these simulated eyes a reality is simple because jMonkeyEngine
rlm@306: already contains extensive support for multiple views of the same 3D
simulated world. jMonkeyEngine has this support because it is
necessary for creating games with split-screen views. Multiple views
are also used to create efficient
rlm@212: pseudo-reflections by rendering the scene from a certain perspective
rlm@212: and then projecting it back onto a surface in the 3D world.
rlm@212: 
rlm@218: #+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye, which was one of the first games to use split-screen views.
rlm@212: [[../images/goldeneye-4-player.png]]
rlm@212: 
ocsenave@264: ** =ViewPorts=, =SceneProcessors=, and the =RenderManager=. 
# =ViewPorts= are cameras; =RenderManager= takes snapshots each frame. 
ocsenave@264: #* A Brief Description of jMonkeyEngine's Rendering Pipeline
rlm@212: 
rlm@213: jMonkeyEngine allows you to create a =ViewPort=, which represents a
rlm@213: view of the simulated world. You can create as many of these as you
want. Every frame, the =RenderManager= iterates through each
=ViewPort=, rendering the scene on the GPU. For each =ViewPort= there
is a =FrameBuffer= which holds the rendered image on the GPU.
rlm@151: 
rlm@306: #+caption: =ViewPorts= are cameras in the world. During each frame, the =RenderManager= records a snapshot of what each view is currently seeing; these snapshots are =FrameBuffer= objects.
ocsenave@265: #+ATTR_HTML: width="400"
ocsenave@272: [[../images/diagram_rendermanager2.png]]
ocsenave@262: 
rlm@213: Each =ViewPort= can have any number of attached =SceneProcessor=
rlm@213: objects, which are called every time a new frame is rendered. A
=SceneProcessor= receives its =ViewPort='s =FrameBuffer= and can do
whatever it wants with the data.  Often this consists of invoking
GPU-specific operations on the rendered image.  The =SceneProcessor= can
rlm@219: also copy the GPU image data to RAM and process it with the CPU.
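
Although =add-camera!= and =vision-pipeline= below wrap this pattern,
the raw jMonkeyEngine calls look roughly like this (a sketch only;
=app=, =cam=, and =some-scene-processor= stand in for a running
application, a camera, and any =SceneProcessor=):

#+begin_src clojure
(comment
  ;; create an extra view of the scene and attach a processor to it
  (let [vp (.createMainView (.getRenderManager app) "extra-view" cam)]
    (.attachScene vp (.getRootNode app))
    (.addProcessor vp some-scene-processor)))
#+end_src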
rlm@151: 
ocsenave@264: ** From Views to Vision
ocsenave@264: # Appropriating Views for Vision.
rlm@151: 
ocsenave@264: Each eye in the simulated creature needs its own =ViewPort= so that
rlm@213: it can see the world from its own perspective. To this =ViewPort=, I
add a =SceneProcessor= that feeds the visual data to an arbitrary
rlm@213: continuation function for further processing.  That continuation
rlm@213: function may perform both CPU and GPU operations on the data. To make
rlm@213: this easy for the continuation function, the =SceneProcessor=
maintains appropriately sized buffers in RAM to hold the data.  It does
not itself copy the data from the GPU to the CPU, because that is a
slow operation.
rlm@214: 
rlm@213: #+name: pipeline-1
rlm@213: #+begin_src clojure
rlm@113: (defn vision-pipeline
rlm@34:   "Create a SceneProcessor object which wraps a vision processing
rlm@113:   continuation function. The continuation is a function that takes 
rlm@113:   [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
rlm@306:   each of which has already been appropriately sized."
rlm@23:   [continuation]
rlm@23:   (let [byte-buffer (atom nil)
rlm@113: 	renderer (atom nil)
rlm@113:         image (atom nil)]
rlm@23:   (proxy [SceneProcessor] []
rlm@23:     (initialize
rlm@23:      [renderManager viewPort]
rlm@23:      (let [cam (.getCamera viewPort)
rlm@23: 	   width (.getWidth cam)
rlm@23: 	   height (.getHeight cam)]
rlm@23:        (reset! renderer (.getRenderer renderManager))
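       ;; 4 bytes per pixel: the GPU image is read back as BGRA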
rlm@23:        (reset! byte-buffer
rlm@23: 	     (BufferUtils/createByteBuffer
rlm@113: 	      (* width height 4)))
rlm@113:         (reset! image (BufferedImage.
rlm@113:                       width height
rlm@113:                       BufferedImage/TYPE_4BYTE_ABGR))))
rlm@23:     (isInitialized [] (not (nil? @byte-buffer)))
rlm@23:     (reshape [_ _ _])
rlm@23:     (preFrame [_])
rlm@23:     (postQueue [_])
rlm@23:     (postFrame
rlm@23:      [#^FrameBuffer fb]
rlm@23:      (.clear @byte-buffer)
rlm@113:      (continuation @renderer fb @byte-buffer @image))
rlm@23:     (cleanup []))))
rlm@213: #+end_src
rlm@213: 
The continuation function passed to =vision-pipeline= above will be
given a =Renderer= and three containers for image data. The
=FrameBuffer= references the GPU image data, but the pixel data cannot
be used directly on the CPU.  The =ByteBuffer= and =BufferedImage=
rlm@219: are initially "empty" but are sized to hold the data in the
rlm@306: =FrameBuffer=. I call transferring the GPU image data to the CPU
rlm@213: structures "mixing" the image data. I have provided three functions to
rlm@213: do this mixing.
rlm@213: 
rlm@213: #+name: pipeline-2
rlm@213: #+begin_src clojure
rlm@113: (defn frameBuffer->byteBuffer!
rlm@113:   "Transfer the data in the graphics card (Renderer, FrameBuffer) to
rlm@113:    the CPU (ByteBuffer)."  
rlm@113:   [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
rlm@113:   (.readFrameBuffer r fb bb) bb)
rlm@113: 
rlm@113: (defn byteBuffer->bufferedImage!
rlm@113:   "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
rlm@113:    style ABGR image data and place it in BufferedImage bi."
rlm@113:   [#^ByteBuffer bb #^BufferedImage bi]
rlm@113:   (Screenshots/convertScreenShot bb bi) bi)
rlm@113: 
rlm@113: (defn BufferedImage!
rlm@113:   "Continuation which will grab the buffered image from the materials
rlm@113:    provided by (vision-pipeline)."
rlm@113:   [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
rlm@113:   (byteBuffer->bufferedImage!
rlm@113:    (frameBuffer->byteBuffer! r fb bb) bi))
rlm@213: #+end_src
rlm@112: 
rlm@213: Note that it is possible to write vision processing algorithms
rlm@213: entirely in terms of =BufferedImage= inputs. Just compose that
rlm@273: =BufferedImage= algorithm with =BufferedImage!=. However, a vision
rlm@213: processing algorithm that is entirely hosted on the GPU does not have
rlm@306: to pay for this convenience.
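
For example (a sketch; =average-blue= is a hypothetical helper, not
part of cortex), any function of a =BufferedImage= becomes a valid
continuation once composed with =BufferedImage!=:

#+begin_src clojure
(defn average-blue
  "Hypothetical example algorithm: the mean blue-channel value of an
   image, scaled to lie in [0,1]."
  [#^BufferedImage bi]
  (let [w (.getWidth bi) h (.getHeight bi)]
    (/ (reduce + (for [x (range w) y (range h)]
                   (bit-and 0xFF (.getRGB bi x y))))
       (float (* 255 w h)))))

;; used as: (vision-pipeline (comp println average-blue BufferedImage!))
#+end_src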
rlm@213: 
ocsenave@265: * Optical sensor arrays are described with images and referenced with metadata
rlm@214: The vision pipeline described above handles the flow of rendered
rlm@214: images. Now, we need simulated eyes to serve as the source of these
rlm@214: images. 
rlm@214: 
Eyes are described in blender in the same way as joints: they are
zero-dimensional empty objects with no geometry whose local coordinate
systems determine the orientations of the resulting eyes. All eyes are
children of a parent node named "eyes", just as all joints have a
parent named "joints". An eye binds to the nearest physical object
rlm@273: with =bind-sense=.
rlm@214: 
rlm@214: #+name: add-eye
rlm@214: #+begin_src clojure
rlm@215: (in-ns 'cortex.vision)
rlm@215: 
rlm@214: (defn add-eye!
rlm@214:   "Create a Camera centered on the current position of 'eye which
rlm@338:    follows the closest physical node in 'creature. The camera will
rlm@338:    point in the X direction and use the Z vector as up as determined
rlm@338:    by the rotation of these vectors in blender coordinate space. Use
rlm@338:    XZY rotation for the node in blender."
rlm@214:   [#^Node creature #^Spatial eye]
rlm@214:   (let [target (closest-node creature eye)
rlm@338:         [cam-width cam-height] 
rlm@338:         ;;[640 480] ;; graphics card on laptop doesn't support
                    ;; arbitrary dimensions.
rlm@338:         (eye-dimensions eye)
rlm@215:         cam (Camera. cam-width cam-height)
rlm@215:         rot (.getWorldRotation eye)]
rlm@214:     (.setLocation cam (.getWorldTranslation eye))
rlm@218:     (.lookAtDirection
rlm@338:      cam                           ; this part is not a mistake and
rlm@338:      (.mult rot Vector3f/UNIT_X)   ; is consistent with using Z in
rlm@338:      (.mult rot Vector3f/UNIT_Y))  ; blender as the UP vector.
rlm@214:     (.setFrustumPerspective
rlm@338:      cam (float 45)
rlm@338:      (float (/ (.getWidth cam) (.getHeight cam)))
rlm@338:      (float 1)
rlm@338:      (float 1000))
rlm@215:     (bind-sense target cam) cam))
rlm@214: #+end_src
rlm@214: 
Here, the camera is created based on metadata on the eye-node and
attached to the nearest physical object with =bind-sense=.
rlm@214: ** The Retina
rlm@214: 
rlm@214: An eye is a surface (the retina) which contains many discrete sensors
rlm@470: to detect light. These sensors can have different light-sensing
rlm@470: properties. In humans, each discrete sensor is sensitive to red, blue,
rlm@470: green, or gray. These different types of sensors can have different
rlm@470: spatial distributions along the retina. In humans, there is a fovea in
rlm@470: the center of the retina which has a very high density of color
rlm@470: sensors, and a blind spot which has no sensors at all. Sensor density
rlm@470: decreases in proportion to distance from the fovea.
rlm@214: 
rlm@214: I want to be able to model any retinal configuration, so my eye-nodes
rlm@214: in blender contain metadata pointing to images that describe the
precise positions of the individual sensors using white pixels. The
metadata also describes the precise light sensitivity of the sensors
described in the image.  An eye can contain any number of
rlm@214: these images. For example, the metadata for an eye might look like
rlm@214: this:
rlm@214: 
rlm@214: #+begin_src clojure
rlm@214: {0xFF0000 "Models/test-creature/retina-small.png"}
rlm@214: #+end_src
rlm@214: 
#+caption: The retinal profile image "Models/test-creature/retina-small.png". White pixels are photo-sensitive elements. The distribution of white pixels is denser in the middle and falls off at the edges, a pattern inspired by the human retina.
rlm@214: [[../assets/Models/test-creature/retina-small.png]]
rlm@214: 
Together, the number 0xFF0000 and the image above describe the
rlm@214: placement of red-sensitive sensory elements.
rlm@214: 
Metadata that very crudely approximates a human eye might look
something like this:
rlm@214: 
rlm@214: #+begin_src clojure
rlm@214: (let [retinal-profile "Models/test-creature/retina-small.png"]
rlm@214:   {0xFF0000 retinal-profile
rlm@214:    0x00FF00 retinal-profile
rlm@214:    0x0000FF retinal-profile
rlm@214:    0xFFFFFF retinal-profile})
rlm@214: #+end_src
rlm@214: 
rlm@214: The numbers that serve as keys in the map determine a sensor's
rlm@214: relative sensitivity to the channels red, green, and blue. These
rlm@218: sensitivity values are packed into an integer in the order =|_|R|G|B|=
in 8-bit fields. The RGB values of a pixel in the image are combined
using these sensitivities as linear weights. Therefore,
rlm@214: 0xFF0000 means sensitive to red only while 0xFFFFFF means sensitive to
rlm@214: all colors equally (gray).
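
For instance, here is the arithmetic spelled out, using the
=pixel-sense= function defined later in this file: a pure-red sensor
reading a mid-gray pixel responds at about half strength.

#+begin_src clojure
(comment
  ;; sensitivity 0xFF0000 -> weights [255 0 0]
  ;; pixel       0x808080 -> values  [128 128 128]
  ;; response = (* 255 128) / (* 255 255) ~= 0.502
  (pixel-sense 0xFF0000 0x808080) ;=> ~0.502
  ;; an equal-weight (gray) sensor averages all three channels:
  (pixel-sense 0xFFFFFF 0x808080)) ;=> ~0.502
#+end_src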
rlm@214: 
rlm@306: For convenience I've defined a few symbols for the more common
rlm@214: sensitivity values.
rlm@214: 
rlm@214: #+name: sensitivity
rlm@214: #+begin_src clojure
rlm@317: (def sensitivity-presets
rlm@317:   "Retinal sensitivity presets for sensors that extract one channel
rlm@317:    (:red :blue :green) or average all channels (:all)"
rlm@214:   {:all    0xFFFFFF
rlm@214:    :red    0xFF0000
rlm@214:    :blue   0x0000FF
rlm@317:    :green  0x00FF00})
rlm@214: #+end_src
rlm@214: 
rlm@214: ** Metadata Processing
rlm@214: 
rlm@273: =retina-sensor-profile= extracts a map from the eye-node in the same
rlm@273: format as the example maps above.  =eye-dimensions= finds the
rlm@219: dimensions of the smallest image required to contain all the retinal
rlm@214: sensor maps.
rlm@214: 
rlm@216: #+name: retina
rlm@214: #+begin_src clojure
rlm@214: (defn retina-sensor-profile
rlm@214:   "Return a map of pixel sensitivity numbers to BufferedImages
rlm@214:    describing the distribution of light-sensitive components of this
   eye. :red, :green, :blue, :all are already defined as extracting
rlm@214:    the red, green, blue, and average components respectively."
rlm@214:    [#^Spatial eye]
rlm@214:    (if-let [eye-map (meta-data eye "eye")]
rlm@214:      (map-vals
rlm@214:       load-image
rlm@214:       (eval (read-string eye-map)))))
rlm@214: 
rlm@218: (defn eye-dimensions 
rlm@218:   "Returns [width, height] determined by the metadata of the eye."
rlm@214:   [#^Spatial eye]
rlm@214:   (let [dimensions
rlm@214:           (map #(vector (.getWidth %) (.getHeight %))
rlm@214:                (vals (retina-sensor-profile eye)))]
rlm@214:     [(apply max (map first dimensions))
rlm@214:      (apply max (map second dimensions))]))
rlm@214: #+end_src
rlm@214: 
ocsenave@265: * Importing and parsing descriptions of eyes.
rlm@214: First off, get the children of the "eyes" empty node to find all the
rlm@214: eyes the creature has.
rlm@216: #+name: eye-node
rlm@214: #+begin_src clojure
rlm@317: (def
rlm@317:   ^{:doc "Return the children of the creature's \"eyes\" node."
rlm@317:     :arglists '([creature])}
rlm@214:   eyes
rlm@317:   (sense-nodes "eyes"))
rlm@214: #+end_src
rlm@214: 
rlm@273: Then, add the camera created by =add-eye!= to the simulation by
rlm@215: creating a new viewport.
rlm@214: 
rlm@216: #+name: add-camera
rlm@213: #+begin_src clojure
rlm@338: (in-ns 'cortex.vision)
rlm@169: (defn add-camera!
rlm@169:   "Add a camera to the world, calling continuation on every frame
rlm@34:   produced." 
rlm@167:   [#^Application world camera continuation]
rlm@23:   (let [width (.getWidth camera)
rlm@23: 	height (.getHeight camera)
rlm@23: 	render-manager (.getRenderManager world)
rlm@23: 	viewport (.createMainView render-manager "eye-view" camera)]
rlm@23:     (doto viewport
rlm@23:       (.setClearFlags true true true)
rlm@112:       (.setBackgroundColor ColorRGBA/Black)
rlm@113:       (.addProcessor (vision-pipeline continuation))
rlm@23:       (.attachScene (.getRootNode world)))))
rlm@215: #+end_src
rlm@151: 
rlm@338: #+results: add-camera
rlm@338: : #'cortex.vision/add-camera!
rlm@338: 
rlm@151: 
The eye's continuation function should register the viewport with the
simulation the first time it is called, then use the CPU to extract the
appropriate pixels from the rendered image and weight them by each
sensor's sensitivity. I have the option to do this processing in
native code for a slight gain in speed. I could also do it on the GPU
for a massive gain in speed. =vision-kernel= generates a list of
such continuation functions, one for each channel of the eye.
rlm@151: 
rlm@216: #+name: kernel
rlm@215: #+begin_src clojure 
rlm@215: (in-ns 'cortex.vision)
rlm@151: 
rlm@215: (defrecord attached-viewport [vision-fn viewport-fn]
rlm@215:   clojure.lang.IFn
rlm@215:   (invoke [this world] (vision-fn world))
rlm@215:   (applyTo [this args] (apply vision-fn args)))
rlm@151: 
(defn pixel-sense
  "Compute a sensor's response in [0,1]: the pixel's RGB values
   weighted by the sensor's packed RGB sensitivities."
  [sensitivity pixel]
rlm@216:   (let [s-r (bit-shift-right (bit-and 0xFF0000 sensitivity) 16)
rlm@216:         s-g (bit-shift-right (bit-and 0x00FF00 sensitivity)  8)
rlm@216:         s-b (bit-and 0x0000FF sensitivity)
rlm@216: 
rlm@216:         p-r (bit-shift-right (bit-and 0xFF0000 pixel)       16)
rlm@216:         p-g (bit-shift-right (bit-and 0x00FF00 pixel)        8)
rlm@216:         p-b (bit-and 0x0000FF pixel)
rlm@216: 
rlm@216:         total-sensitivity (* 255 (+ s-r s-g s-b))]
rlm@216:         (float (/ (+ (* s-r p-r)
rlm@216:                      (* s-g p-g)
rlm@216:                      (* s-b p-b))
rlm@216:                   total-sensitivity))))
rlm@216: 
rlm@215: (defn vision-kernel
rlm@171:   "Returns a list of functions, each of which will return a color
rlm@171:    channel's worth of visual information when called inside a running
rlm@171:    simulation."
rlm@151:   [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
rlm@169:   (let [retinal-map (retina-sensor-profile eye)
rlm@169:         camera (add-eye! creature eye)
rlm@151:         vision-image
rlm@151:         (atom
rlm@151:          (BufferedImage. (.getWidth camera)
rlm@151:                          (.getHeight camera)
rlm@170:                          BufferedImage/TYPE_BYTE_BINARY))
rlm@170:         register-eye!
rlm@170:         (runonce
rlm@170:          (fn [world]
rlm@170:            (add-camera!
rlm@170:             world camera
rlm@170:             (let [counter  (atom 0)]
rlm@170:               (fn [r fb bb bi]
rlm@170:                 (if (zero? (rem (swap! counter inc) (inc skip)))
rlm@170:                   (reset! vision-image
rlm@170:                           (BufferedImage! r fb bb bi))))))))]
rlm@151:      (vec
rlm@151:       (map
rlm@151:        (fn [[key image]]
rlm@151:          (let [whites (white-coordinates image)
rlm@151:                topology (vec (collapse whites))
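               ;; map lookup with a default: keyword presets such as
               ;; :red resolve to their masks, while raw integer
               ;; sensitivity values pass through unchanged.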
rlm@216:                sensitivity (sensitivity-presets key key)]
rlm@215:            (attached-viewport.
rlm@215:             (fn [world]
rlm@215:               (register-eye! world)
rlm@215:               (vector
rlm@215:                topology
rlm@215:                (vec 
rlm@215:                 (for [[x y] whites]
rlm@216:                   (pixel-sense 
rlm@216:                    sensitivity
rlm@216:                    (.getRGB @vision-image x y))))))
rlm@215:             register-eye!)))
rlm@215:          retinal-map))))
rlm@151: 
rlm@215: (defn gen-fix-display
rlm@215:   "Create a function to call to restore a simulation's display when it
   is disrupted by a ViewPort.
rlm@215:   []
rlm@215:   (runonce
rlm@215:    (fn [world]
rlm@215:      (add-camera! world (.getCamera world) no-op))))
rlm@215: #+end_src
rlm@170: 
Note that since each of the functions generated by =vision-kernel=
shares the same =register-eye!= function, the eye will be registered
only once, the first time any of the functions from the list returned
by =vision-kernel= is called.  Each of the functions returned by
=vision-kernel= also allows access to the =ViewPort= through which
it receives images.
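
The once-only behavior comes from =runonce= in =cortex.util=; the idea
is roughly the following (a sketch, not cortex.util's actual
definition):

#+begin_src clojure
(comment
  (defn runonce-sketch
    "Wrap f so that the wrapped function invokes f at most once,
     no matter how many callers share it."
    [f]
    (let [ran? (atom false)]
      (fn [& args]
        (when (compare-and-set! ran? false true)
          (apply f args))))))
#+end_src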
rlm@215: 
The in-game display can be disrupted by all the =ViewPorts= that the
functions generated by =vision-kernel= add. This doesn't affect the
rlm@215: simulation or the simulated senses, but can be annoying.
rlm@273: =gen-fix-display= restores the in-simulation display.
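
In use, the pattern looks like this (a sketch; =test-worm-vision=
below does it for real):

#+begin_src clojure
(comment
  (let [fix-display (gen-fix-display)]
    ;; inside the per-frame update function:
    (fn [world tpf] (fix-display world))))
#+end_src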
rlm@215: 
ocsenave@265: ** The =vision!= function creates sensory probes.
rlm@215: 
rlm@218: All the hard work has been done; all that remains is to apply
rlm@273: =vision-kernel= to each eye in the creature and gather the results
rlm@215: into one list of functions.
rlm@215: 
rlm@216: #+name: main
rlm@215: #+begin_src clojure
rlm@170: (defn vision!
rlm@348:   "Returns a list of functions, each of which returns visual sensory
rlm@348:    data when called inside a running simulation."
rlm@151:   [#^Node creature & {skip :skip :or {skip 0}}]
rlm@151:   (reduce
rlm@170:    concat 
rlm@167:    (for [eye (eyes creature)]
rlm@215:      (vision-kernel creature eye))))
rlm@215: #+end_src
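
A minimal sketch of consuming these probes inside a running simulation
(=the-worm= and =world= are stand-ins for illustration;
=test-worm-vision= below shows the real thing):

#+begin_src clojure
(comment
  (let [probes (vision! the-worm)]
    ;; each probe, called with the world, returns [topology values]:
    ;; topology -- vec of [x y] sensor coordinates
    ;; values   -- vec of floats in [0,1], one per sensor
    (map (fn [probe] (probe world)) probes)))
#+end_src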
rlm@151: 
ocsenave@265: ** Displaying visual data for debugging.
ocsenave@265: # Visualization of Vision. Maybe less alliteration would be better.
rlm@215: It's vital to have a visual representation for each sense. Here I use
rlm@273: =view-sense= to construct a function that will create a display for
rlm@215: visual data.
rlm@215: 
rlm@216: #+name: display
rlm@215: #+begin_src clojure 
rlm@216: (in-ns 'cortex.vision)
rlm@216: 
rlm@189: (defn view-vision
rlm@189:   "Creates a function which accepts a list of visual sensor-data and
rlm@189:   displays each element of the list to the screen." 
rlm@189:   []
rlm@188:   (view-sense
rlm@188:    (fn 
rlm@188:      [[coords sensor-data]]
rlm@188:      (let [image (points->image coords)]
rlm@188:        (dorun
rlm@188:         (for [i (range (count coords))]
rlm@188:           (.setRGB image ((coords i) 0) ((coords i) 1)
rlm@216:                    (gray (int (* 255 (sensor-data i)))))))
rlm@189:        image))))
rlm@34: #+end_src
rlm@23: 
ocsenave@264: * Demonstrations
ocsenave@264: ** Demonstrating the vision pipeline.
rlm@23: 
This is a basic test for the vision system.  It only tests the
=vision-pipeline= and does not deal with loading eyes from a blender
rlm@215: file. The code creates two videos of the same rotating cube from
rlm@215: different angles. 
rlm@23: 
rlm@215: #+name: test-1
rlm@23: #+begin_src clojure
rlm@215: (in-ns 'cortex.test.vision)
rlm@23: 
rlm@219: (defn test-pipeline
rlm@69:   "Testing vision:
rlm@69:    Tests the vision system by creating two views of the same rotating
rlm@69:    object from different angles and displaying both of those views in
rlm@69:    JFrames.
rlm@69: 
rlm@69:    You should see a rotating cube, and two windows,
rlm@69:    each displaying a different view of the cube."
rlm@283:   ([] (test-pipeline false))
rlm@283:   ([record?]
rlm@283:      (let [candy
rlm@283:            (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
rlm@283:        (world
rlm@283:         (doto (Node.)
rlm@283:           (.attachChild candy))
rlm@283:         {}
rlm@283:         (fn [world]
rlm@283:           (let [cam (.clone (.getCamera world))
rlm@283:                 width (.getWidth cam)
rlm@283:                 height (.getHeight cam)]
rlm@283:             (add-camera! world cam 
rlm@283:                          (comp
rlm@283:                           (view-image
rlm@283:                            (if record?
rlm@283:                              (File. "/home/r/proj/cortex/render/vision/1")))
rlm@283:                           BufferedImage!))
rlm@283:             (add-camera! world
rlm@283:                          (doto (.clone cam)
rlm@283:                            (.setLocation (Vector3f. -10 0 0))
rlm@283:                            (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
rlm@283:                          (comp
rlm@283:                           (view-image
rlm@283:                            (if record?
rlm@283:                              (File. "/home/r/proj/cortex/render/vision/2")))
rlm@283:                           BufferedImage!))
rlm@341:             (let [timer (IsoTimer. 60)]
rlm@340:               (.setTimer world timer)
rlm@340:               (display-dilated-time world timer))
rlm@283:             ;; This is here to restore the main view
rlm@340:             ;; after the other views have completed processing
rlm@283:             (add-camera! world (.getCamera world) no-op)))
rlm@283:         (fn [world tpf]
rlm@283:           (.rotate candy (* tpf 0.2) 0 0))))))
rlm@23: #+end_src
rlm@23: 
rlm@340: #+results: test-1
rlm@340: : #'cortex.test.vision/test-pipeline
rlm@340: 
rlm@215: #+begin_html
rlm@215: <div class="figure">
rlm@215: <video controls="controls" width="755">
rlm@215:   <source src="../video/spinning-cube.ogg" type="video/ogg"
rlm@215: 	  preload="none" poster="../images/aurellem-1280x480.png" />
rlm@215: </video>
rlm@309:  <br> <a href="http://youtu.be/r5Bn2aG7MO0"> YouTube </a>
rlm@215: <p>A rotating cube viewed from two different perspectives.</p>
rlm@215: </div>
rlm@215: #+end_html
rlm@215: 
Creating multiple eyes like this makes it possible to simulate
stereoscopic vision in a single creature, or to simulate multiple
creatures, each with its own sense of vision.
ocsenave@264: ** Demonstrating eye import and parsing.
rlm@215: 
rlm@218: To the worm from the last post, I add a new node that describes its
rlm@215: eyes.
rlm@215: 
rlm@215: #+attr_html: width=755
rlm@215: #+caption: The worm with newly added empty nodes describing a single eye.
rlm@215: [[../images/worm-with-eye.png]]
rlm@215: 
The node highlighted in yellow is the root-level "eyes" node.  It has
rlm@218: a single child, highlighted in orange, which describes a single
rlm@218: eye. This is the "eye" node. It is placed so that the worm will have
rlm@218: an eye located in the center of the flat portion of its lower
rlm@218: hemispherical section.
rlm@218: 
rlm@218: The two nodes which are not highlighted describe the single joint of
rlm@218: the worm.
rlm@215: 
rlm@215: The metadata of the eye-node is:
rlm@215: 
rlm@215: #+begin_src clojure :results verbatim :exports both
rlm@215: (cortex.sense/meta-data
rlm@218:  (.getChild (.getChild (cortex.test.body/worm) "eyes") "eye") "eye")
rlm@215: #+end_src
rlm@215: 
rlm@215: #+results:
rlm@215: : "(let [retina \"Models/test-creature/retina-small.png\"]
rlm@215: :     {:all retina :red retina :green retina :blue retina})"
rlm@215: 
rlm@215: This is the approximation to the human eye described earlier.
rlm@215: 
rlm@216: #+name: test-2
rlm@215: #+begin_src clojure
rlm@215: (in-ns 'cortex.test.vision)
rlm@215: 
(defn change-color [obj color]
  ;;(println-repl obj)
  (when obj
    (.setColor (.getMaterial obj) "Color" color)))
rlm@216: 
rlm@216: (defn colored-cannon-ball [color]
rlm@216:   (comp #(change-color % color)
rlm@216:          (fire-cannon-ball)))
rlm@215: 
rlm@338: (defn gen-worm
rlm@338:   "create a creature acceptable for testing as a replacement for the
rlm@338:    worm."
rlm@338:   []
rlm@338:   (nodify
rlm@338:    "worm"
rlm@338:    [(nodify
rlm@338:      "eyes"
rlm@338:      [(doto
rlm@338:           (Node. "eye1")
rlm@338:         (.setLocalTranslation (Vector3f. 0 -1.1 0))
rlm@338:         (.setUserData
rlm@338:          
rlm@338:          "eye" 
rlm@338:          "(let [retina
rlm@338:                 \"Models/test-creature/retina-small.png\"]
rlm@338:                 {:all retina :red retina
rlm@338:                  :green retina :blue retina})"))])
rlm@338:     (box
rlm@338:      0.2 0.2 0.2
rlm@338:      :name "worm-segment"
rlm@338:      :position (Vector3f. 0 0 0)
rlm@338:      :color ColorRGBA/Orange)]))
rlm@338: 
rlm@338: 
rlm@338: 
rlm@283: (defn test-worm-vision 
rlm@321:   "Testing vision:
rlm@321:    You should see the worm suspended in mid-air, looking down at a
   table. There are four small displays, one each for red, green, blue,
rlm@321:    and gray channels. You can fire balls of various colors, and the
rlm@321:    four channels should react accordingly.
rlm@321: 
rlm@321:    Keys:
rlm@321:      r  : fire red-ball
rlm@321:      b  : fire blue-ball
rlm@321:      g  : fire green-ball
rlm@321:      <space> : fire white ball"
rlm@338:   
rlm@283:   ([] (test-worm-vision false))
rlm@283:   ([record?] 
     (let [the-worm (doto (worm) (body!))
rlm@340:            vision (vision! the-worm)
rlm@340:            vision-display (view-vision)
rlm@340:            fix-display (gen-fix-display)
rlm@283:            me (sphere 0.5 :color ColorRGBA/Blue :physical? false)
rlm@283:            x-axis
rlm@283:            (box 1 0.01 0.01 :physical? false :color ColorRGBA/Red
rlm@283:                 :position (Vector3f. 0 -5 0))
rlm@283:            y-axis
rlm@283:            (box 0.01 1 0.01 :physical? false :color ColorRGBA/Green
rlm@283:                 :position (Vector3f. 0 -5 0))
rlm@283:            z-axis
rlm@283:            (box 0.01 0.01 1 :physical? false :color ColorRGBA/Blue
                :position (Vector3f. 0 -5 0))]
rlm@215: 
rlm@335:        (world
rlm@335:         (nodify [(floor) the-worm x-axis y-axis z-axis me])
rlm@340:         (merge standard-debug-controls
rlm@340:                {"key-r" (colored-cannon-ball ColorRGBA/Red)
rlm@340:                 "key-b" (colored-cannon-ball ColorRGBA/Blue)
rlm@340:                 "key-g" (colored-cannon-ball ColorRGBA/Green)})
rlm@338:         
rlm@335:         (fn [world]
rlm@340:           (light-up-everything world)
rlm@340:           (speed-up world)
          (let [timer (IsoTimer. 60)]
            (.setTimer world timer)
            (display-dilated-time world timer))
rlm@340:           ;; add a view from the worm's perspective
rlm@340:           (if record?
rlm@340:             (Capture/captureVideo
rlm@340:              world
rlm@340:              (File.
rlm@340:               "/home/r/proj/cortex/render/worm-vision/main-view")))
rlm@340:             
rlm@340:           (add-camera!
rlm@340:            world
rlm@340:            (add-eye! the-worm (first (eyes the-worm)))
rlm@340:            (comp
rlm@340:             (view-image
rlm@340:              (if record?
rlm@340:                (File.
rlm@340:                 "/home/r/proj/cortex/render/worm-vision/worm-view")))
rlm@340:             BufferedImage!))
rlm@340:             
rlm@340:           (set-gravity world Vector3f/ZERO)
rlm@340:           (add-camera! world (.getCamera world) no-op))
rlm@340:         
        (fn [world _]
          (.setLocalTranslation me (.getLocation (.getCamera world)))
          (vision-display
           (map #(% world) vision)
           (if record?
             (File. "/home/r/proj/cortex/render/worm-vision")))
          (fix-display world))))))
rlm@215: #+end_src
rlm@215: 
rlm@335: #+RESULTS: test-2
rlm@337: : #'cortex.test.vision/test-worm-vision
rlm@335: 
rlm@335: 
rlm@218: The world consists of the worm and a flat gray floor. I can shoot red,
rlm@218: green, blue and white cannonballs at the worm. The worm is initially
rlm@218: looking down at the floor, and there is no gravity. My perspective
rlm@218: (the Main View), the worm's perspective (Worm View) and the 4 sensor
rlm@218: channels that comprise the worm's eye are all saved frame-by-frame to
rlm@218: disk.
rlm@218: 
** Demonstration of Vision
rlm@218: #+begin_html
rlm@218: <div class="figure">
rlm@218: <video controls="controls" width="755">
rlm@218:   <source src="../video/worm-vision.ogg" type="video/ogg"
rlm@218: 	  preload="none" poster="../images/aurellem-1280x480.png" />
rlm@218: </video>
rlm@309: <br> <a href="http://youtu.be/J3H3iB_2NPQ"> YouTube </a>
rlm@218: <p>Simulated Vision in a Virtual Environment</p>
rlm@218: </div>
rlm@218: #+end_html
rlm@218: 
rlm@218: ** Generate the Worm Video from Frames
rlm@216: #+name: magick2
rlm@216: #+begin_src clojure
rlm@216: (ns cortex.video.magick2
rlm@216:   (:import java.io.File)
rlm@316:   (:use clojure.java.shell))
rlm@216: 
rlm@216: (defn images [path]
rlm@216:   (sort (rest (file-seq (File. path)))))
rlm@216: 
rlm@216: (def base "/home/r/proj/cortex/render/worm-vision/")
rlm@216: 
rlm@216: (defn pics [file]
rlm@216:   (images (str base file)))
rlm@216: 
rlm@216: (defn combine-images []
rlm@216:   (let [main-view (pics "main-view")
rlm@216:         worm-view (pics "worm-view")
rlm@216:         blue   (pics "0")
rlm@216:         green  (pics "1")
rlm@216:         red    (pics "2")
rlm@216:         gray   (pics "3")
rlm@216:         blender (let [b-pics (pics "blender")]
rlm@216:                   (concat b-pics (repeat 9001 (last b-pics))))
rlm@216:         background (repeat 9001 (File. (str base "background.png")))
rlm@216:         targets (map
rlm@216:                  #(File. (str base "out/" (format "%07d.png" %)))
rlm@216:                  (range 0 (count main-view)))]
rlm@216:     (dorun
rlm@216:      (pmap
rlm@216:       (comp
rlm@216:        (fn [[background main-view worm-view red green blue gray blender target]]
rlm@216:          (println target)
rlm@216:          (sh "convert"
rlm@216:              background
rlm@216:              main-view "-geometry" "+18+17"    "-composite"
rlm@216:              worm-view "-geometry" "+677+17"   "-composite"
rlm@216:              green     "-geometry" "+685+430"  "-composite"
rlm@216:              red       "-geometry" "+788+430"  "-composite"
rlm@216:              blue      "-geometry" "+894+430"  "-composite"
rlm@216:              gray      "-geometry" "+1000+430" "-composite"
rlm@216:              blender   "-geometry" "+0+0"      "-composite"
rlm@216:              target))
rlm@216:        (fn [& args] (map #(.getCanonicalPath %) args)))
rlm@216:       background main-view worm-view red green blue gray blender targets))))
rlm@216: #+end_src
rlm@216: 
rlm@216: #+begin_src sh :results silent
rlm@216: cd /home/r/proj/cortex/render/worm-vision
rlm@216: ffmpeg -r 25 -b 9001k -i out/%07d.png -vcodec libtheora worm-vision.ogg 
rlm@216: #+end_src
rlm@236:    
ocsenave@265: * Onward!
  - As a neat bonus, the machinery behind simulated vision also
    enables one to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].
ocsenave@265:   - Now that we have vision, it's time to tackle [[./hearing.org][hearing]].
ocsenave@265: #+appendix
ocsenave@265: 
rlm@215: * Headers
rlm@215: 
rlm@213: #+name: vision-header
rlm@213: #+begin_src clojure 
rlm@213: (ns cortex.vision
rlm@213:   "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
rlm@213:   eyes from different positions to observe the same world, and pass
  the observed data to an arbitrary function. Automatically reads
rlm@216:   eye-nodes from specially prepared blender files and instantiates
rlm@213:   them in the world as actual eyes."
rlm@213:   {:author "Robert McIntyre"}
rlm@213:   (:use (cortex world sense util))
rlm@213:   (:import com.jme3.post.SceneProcessor)
rlm@237:   (:import (com.jme3.util BufferUtils Screenshots))
rlm@213:   (:import java.nio.ByteBuffer)
rlm@213:   (:import java.awt.image.BufferedImage)
rlm@213:   (:import (com.jme3.renderer ViewPort Camera))
rlm@216:   (:import (com.jme3.math ColorRGBA Vector3f Matrix3f))
rlm@213:   (:import com.jme3.renderer.Renderer)
rlm@213:   (:import com.jme3.app.Application)
rlm@213:   (:import com.jme3.texture.FrameBuffer)
rlm@213:   (:import (com.jme3.scene Node Spatial)))
rlm@213: #+end_src
rlm@112: 
rlm@215: #+name: test-header
rlm@215: #+begin_src clojure
rlm@215: (ns cortex.test.vision
rlm@215:   (:use (cortex world sense util body vision))
rlm@215:   (:use cortex.test.body)
rlm@215:   (:import java.awt.image.BufferedImage)
rlm@215:   (:import javax.swing.JPanel)
rlm@215:   (:import javax.swing.SwingUtilities)
rlm@215:   (:import java.awt.Dimension)
rlm@215:   (:import javax.swing.JFrame)
rlm@215:   (:import com.jme3.math.ColorRGBA)
rlm@215:   (:import com.jme3.scene.Node)
rlm@215:   (:import com.jme3.math.Vector3f)
rlm@216:   (:import java.io.File)
rlm@341:   (:import (com.aurellem.capture Capture RatchetTimer IsoTimer)))
rlm@215: #+end_src
rlm@341: 
rlm@341: #+results: test-header
rlm@341: : com.aurellem.capture.IsoTimer
rlm@341: 
rlm@216: * Source Listing
rlm@216:   - [[../src/cortex/vision.clj][cortex.vision]]
rlm@216:   - [[../src/cortex/test/vision.clj][cortex.test.vision]]
rlm@216:   - [[../src/cortex/video/magick2.clj][cortex.video.magick2]]
rlm@216:   - [[../assets/Models/subtitles/worm-vision-subtitles.blend][worm-vision-subtitles.blend]]
#+html: <ul> <li> <a href="../org/vision.org">This org file</a> </li> </ul>
rlm@216:   - [[http://hg.bortreb.com ][source-repository]]
rlm@216:  
rlm@35: 
rlm@273: * Next 
rlm@273: I find some [[./hearing.org][ears]] for the creature while exploring the guts of
rlm@273: jMonkeyEngine's sound system.
rlm@24: 
rlm@212: * COMMENT Generate Source
rlm@34: #+begin_src clojure :tangle ../src/cortex/vision.clj
rlm@216: <<vision-header>>
rlm@216: <<pipeline-1>>
rlm@216: <<pipeline-2>>
rlm@216: <<retina>>
rlm@216: <<add-eye>>
rlm@216: <<sensitivity>>
rlm@216: <<eye-node>>
rlm@216: <<add-camera>>
rlm@216: <<kernel>>
rlm@216: <<main>>
rlm@216: <<display>>
rlm@24: #+end_src
rlm@24: 
rlm@68: #+begin_src clojure :tangle ../src/cortex/test/vision.clj
rlm@215: <<test-header>>
rlm@215: <<test-1>>
rlm@216: <<test-2>>
rlm@24: #+end_src
rlm@216: 
rlm@216: #+begin_src clojure :tangle ../src/cortex/video/magick2.clj
rlm@216: <<magick2>>
rlm@216: #+end_src