annotate org/vision.org @ 212:8e9825c38941

writing intro for vision.org
author Robert McIntyre <rlm@mit.edu>
date Thu, 09 Feb 2012 07:39:21 -0700
parents ac158a976443
children 319963720179
rev   line source
rlm@34 1 #+title: Simulated Sense of Sight
rlm@23 2 #+author: Robert McIntyre
rlm@23 3 #+email: rlm@mit.edu
rlm@38 4 #+description: Simulated sight for AI research using jMonkeyEngine3 and Clojure
rlm@34 5 #+keywords: computer vision, jMonkeyEngine3, clojure
rlm@23 6 #+SETUPFILE: ../../aurellem/org/setup.org
rlm@23 7 #+INCLUDE: ../../aurellem/org/level-0.org
rlm@23 8 #+babel: :mkdirp yes :noweb yes :exports both
rlm@23 9
rlm@194 10 * Vision
rlm@23 11
rlm@151 12
rlm@212 13 Vision is one of the most important senses for humans, so I need to
rlm@212 14 build a simulated sense of vision for my AI. I will do this with
rlm@212 15 simulated eyes. Each eye can be moved independently and should see its
rlm@212 16 own version of the world depending on where it is.
rlm@212 17
rlm@212 18 Making these simulated eyes a reality is fairly simple because
rlm@212 19 jMonkeyEngine already contains extensive support for multiple views
rlm@212 20 of the same 3D simulated world. jMonkeyEngine has this support
rlm@212 21 because it is needed to create games with
rlm@212 22 split-screen views. Multiple views are also used to create efficient
rlm@212 23 pseudo-reflections by rendering the scene from a certain perspective
rlm@212 24 and then projecting it back onto a surface in the 3D world.
rlm@212 25
rlm@212 26 #+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye
rlm@212 27 [[../images/goldeneye-4-player.png]]
rlm@212 28
rlm@212 29
rlm@151 30
rlm@151 31 Make the continuation in scene-processor take FrameBuffer,
rlm@151 32 byte-buffer, and BufferedImage arguments already sized to the correct
rlm@151 33 dimensions. The continuation will decide whether to "mix" them
rlm@151 34 into the BufferedImage, lazily ignore them, or mix them halfway
rlm@151 35 and call C/graphics card routines.
rlm@151 36
rlm@151 37 (vision creature) will take an optional :skip argument which will
rlm@151 38 inform the continuations in scene processor to skip the given
rlm@151 39 number of cycles; 0 means that no cycles will be skipped.
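
The skipping behavior described above can be sketched in plain Clojure.
This is only an illustration under stated assumptions: =every-nth= is a
hypothetical helper that does not appear in the source, and its counter
test mirrors the =(rem (swap! counter inc) (inc skip))= check used in
=vision-fn=.

#+begin_src clojure
(defn every-nth
  "Hypothetical helper: wrap f so that it only runs on every
  (skip + 1)-th call and returns nil on the calls it skips."
  [skip f]
  (let [counter (atom 0)]
    (fn [& args]
      (when (zero? (rem (swap! counter inc) (inc skip)))
        (apply f args)))))

;; With skip = 2, only every third frame reaches the wrapped function.
(def seen (atom []))
(def process-frame (every-nth 2 #(swap! seen conj %)))
(doseq [frame (range 1 10)]
  (process-frame frame))
@seen ;; frames 3, 6 and 9 were processed
#+end_src

With a skip of 0 the divisor is 1, so every call passes through, which
matches the "no cycles will be skipped" case.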
rlm@151 40
rlm@151 41 (vision creature) will return [init-functions sensor-functions].
rlm@151 42 The init-functions are each single-arg functions that take the
rlm@151 43 world and register the cameras and must each be called before the
rlm@151 44 corresponding sensor-functions. Each init-function returns the
rlm@151 45 viewport for that eye which can be manipulated, saved, etc. Each
rlm@151 46 sensor-function is a thunk and will return data in the same
rlm@151 47 format as the tactile-sensor functions; the structure is
rlm@151 48 [topology, sensor-data]. Internally, these sensor-functions
rlm@151 49 maintain a reference to sensor-data which is periodically updated
rlm@151 50 by the continuation function established by its init-function.
rlm@151 51 They can be queried every cycle, but their information may not
rlm@151 52 necessarily be different every cycle.
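
The relationship between a sensor-function and the continuation that
feeds it can be sketched without any rendering code. The names below
(=make-sensor=, =update!=, =sense=) are hypothetical and stand in for
the real init-function and sensor-function; the point is only that the
thunk reads from an atom that something else updates.

#+begin_src clojure
(defn make-sensor
  "Hypothetical sketch: returns [update! sense]. update! plays the role
  of the continuation and stores the latest data; sense is a thunk that
  returns [topology latest-data]."
  [topology]
  (let [sensor-data (atom (vec (repeat (count topology) 0)))]
    [(fn update! [new-data] (reset! sensor-data (vec new-data)))
     (fn sense [] [topology @sensor-data])]))

(let [[update! sense] (make-sensor [[0 0] [1 0]])]
  (sense)             ;; initial data, before any frame has arrived
  (update! [255 128]) ;; the continuation delivers a new frame
  (sense)             ;; now reflects the new frame
  (sense))            ;; unchanged until update! runs again
#+end_src

Because =sense= only dereferences the atom, it can be called every
cycle at little cost, even when the data behind it changes less often.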
rlm@151 53
rlm@151 54 Each eye in the creature in blender will work the same way as
rlm@151 55 joints -- a zero dimensional object with no geometry whose local
rlm@151 56 coordinate system determines the orientation of the resulting
rlm@151 57 eye. All eyes will have a parent named "eyes" just as all joints
rlm@151 58 have a parent named "joints". The resulting camera will be a
rlm@151 59 ChaseCamera or a CameraNode bound to the geo that is closest to
rlm@151 60 the eye marker. The eye marker will contain the metadata for the
rlm@151 61 eye, and will be moved by its bound geometry. The dimensions of
rlm@151 62 the eye's camera are equal to the dimensions of the eye's "UV"
rlm@151 63 map.
rlm@151 64
rlm@66 65 #+name: eyes
rlm@23 66 #+begin_src clojure
rlm@34 67 (ns cortex.vision
rlm@34 68 "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
rlm@34 69 eyes from different positions to observe the same world, and pass
rlm@172 70 the observed data to any arbitrary function. Automatically reads
rlm@172 71 eye-nodes from specially prepared blender files and instantiates
rlm@172 72 them in the world as actual eyes."
rlm@34 73 {:author "Robert McIntyre"}
rlm@151 74 (:use (cortex world sense util))
rlm@167 75 (:use clojure.contrib.def)
rlm@34 76 (:import com.jme3.post.SceneProcessor)
rlm@113 77 (:import (com.jme3.util BufferUtils Screenshots))
rlm@34 78 (:import java.nio.ByteBuffer)
rlm@34 79 (:import java.awt.image.BufferedImage)
rlm@172 80 (:import (com.jme3.renderer ViewPort Camera))
rlm@113 81 (:import com.jme3.math.ColorRGBA)
rlm@151 82 (:import com.jme3.renderer.Renderer)
rlm@172 83 (:import com.jme3.app.Application)
rlm@172 84 (:import com.jme3.texture.FrameBuffer)
rlm@172 85 (:import (com.jme3.scene Node Spatial)))
rlm@113 86
rlm@113 87 (defn vision-pipeline
rlm@34 88 "Create a SceneProcessor object which wraps a vision processing
rlm@113 89 continuation function. The continuation is a function that takes
rlm@113 90 [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
rlm@113 91 each of which has already been appropriately sized."
rlm@23 92 [continuation]
rlm@23 93 (let [byte-buffer (atom nil)
rlm@113 94 renderer (atom nil)
rlm@113 95 image (atom nil)]
rlm@23 96 (proxy [SceneProcessor] []
rlm@23 97 (initialize
rlm@23 98 [renderManager viewPort]
rlm@23 99 (let [cam (.getCamera viewPort)
rlm@23 100 width (.getWidth cam)
rlm@23 101 height (.getHeight cam)]
rlm@23 102 (reset! renderer (.getRenderer renderManager))
rlm@23 103 (reset! byte-buffer
rlm@23 104 (BufferUtils/createByteBuffer
rlm@113 105 (* width height 4)))
rlm@113 106 (reset! image (BufferedImage.
rlm@113 107 width height
rlm@113 108 BufferedImage/TYPE_4BYTE_ABGR))))
rlm@23 109 (isInitialized [] (not (nil? @byte-buffer)))
rlm@23 110 (reshape [_ _ _])
rlm@23 111 (preFrame [_])
rlm@23 112 (postQueue [_])
rlm@23 113 (postFrame
rlm@23 114 [#^FrameBuffer fb]
rlm@23 115 (.clear @byte-buffer)
rlm@113 116 (continuation @renderer fb @byte-buffer @image))
rlm@23 117 (cleanup []))))
rlm@23 118
rlm@113 119 (defn frameBuffer->byteBuffer!
rlm@113 120 "Transfer the data in the graphics card (Renderer, FrameBuffer) to
rlm@113 121 the CPU (ByteBuffer)."
rlm@113 122 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
rlm@113 123 (.readFrameBuffer r fb bb) bb)
rlm@113 124
rlm@113 125 (defn byteBuffer->bufferedImage!
rlm@113 126 "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
rlm@113 127 style ABGR image data and place it in BufferedImage bi."
rlm@113 128 [#^ByteBuffer bb #^BufferedImage bi]
rlm@113 129 (Screenshots/convertScreenShot bb bi) bi)
rlm@113 130
rlm@113 131 (defn BufferedImage!
rlm@113 132 "Continuation which will grab the buffered image from the materials
rlm@113 133 provided by (vision-pipeline)."
rlm@113 134 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
rlm@113 135 (byteBuffer->bufferedImage!
rlm@113 136 (frameBuffer->byteBuffer! r fb bb) bi))
rlm@112 137
rlm@169 138 (defn add-camera!
rlm@169 139 "Add a camera to the world, calling continuation on every frame
rlm@34 140 produced."
rlm@167 141 [#^Application world camera continuation]
rlm@23 142 (let [width (.getWidth camera)
rlm@23 143 height (.getHeight camera)
rlm@23 144 render-manager (.getRenderManager world)
rlm@23 145 viewport (.createMainView render-manager "eye-view" camera)]
rlm@23 146 (doto viewport
rlm@23 147 (.setClearFlags true true true)
rlm@112 148 (.setBackgroundColor ColorRGBA/Black)
rlm@113 149 (.addProcessor (vision-pipeline continuation))
rlm@23 150 (.attachScene (.getRootNode world)))))
rlm@151 151
rlm@169 152 (defn retina-sensor-profile
rlm@151 153 "Return a map of pixel selection functions to BufferedImages
rlm@169 154 describing the distribution of light-sensitive components of this
rlm@169 155 eye. Each function creates an integer from the rgb values found in
rlm@169 156 the pixel. :red, :green, :blue, :gray are already defined as
rlm@169 157 extracting the red, green, blue, and average components
rlm@151 158 respectively."
rlm@151 159 [#^Spatial eye]
rlm@151 160 (if-let [eye-map (meta-data eye "eye")]
rlm@151 161 (map-vals
rlm@167 162 load-image
rlm@151 163 (eval (read-string eye-map)))))
rlm@151 164
rlm@151 165 (defn eye-dimensions
rlm@169 166 "Returns [width, height] specified in the metadata of the eye"
rlm@151 167 [#^Spatial eye]
rlm@151 168 (let [dimensions
rlm@151 169 (map #(vector (.getWidth %) (.getHeight %))
rlm@169 170 (vals (retina-sensor-profile eye)))]
rlm@151 171 [(apply max (map first dimensions))
rlm@151 172 (apply max (map second dimensions))]))
rlm@151 173
rlm@167 174 (defvar
rlm@167 175 ^{:arglists '([creature])}
rlm@167 176 eyes
rlm@167 177 (sense-nodes "eyes")
rlm@167 178 "Return the children of the creature's \"eyes\" node.")
rlm@151 179
rlm@169 180 (defn add-eye!
rlm@169 181 "Create a Camera centered on the current position of 'eye which
rlm@169 182 follows the closest physical node in 'creature. Returns the
rlm@169 183 Camera."
rlm@151 184 [#^Node creature #^Spatial eye]
rlm@151 185 (let [target (closest-node creature eye)
rlm@151 186 [cam-width cam-height] (eye-dimensions eye)
rlm@151 187 cam (Camera. cam-width cam-height)]
rlm@151 188 (.setLocation cam (.getWorldTranslation eye))
rlm@151 189 (.setRotation cam (.getWorldRotation eye))
rlm@151 190 (.setFrustumPerspective
rlm@151 191 cam 45 (/ (.getWidth cam) (.getHeight cam))
rlm@151 192 1 1000)
rlm@151 193 (bind-sense target cam)
rlm@151 194 cam))
rlm@151 195
rlm@172 196 (defvar color-channel-presets
rlm@151 197 {:all 0xFFFFFF
rlm@151 198 :red 0xFF0000
rlm@151 199 :blue 0x0000FF
rlm@172 200 :green 0x00FF00}
rlm@172 201 "Bitmasks for common RGB color channels")
rlm@151 202
rlm@169 203 (defn vision-fn
rlm@171 204 "Returns a list of functions, each of which will return a color
rlm@171 205 channel's worth of visual information when called inside a running
rlm@171 206 simulation."
rlm@151 207 [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
rlm@169 208 (let [retinal-map (retina-sensor-profile eye)
rlm@169 209 camera (add-eye! creature eye)
rlm@151 210 vision-image
rlm@151 211 (atom
rlm@151 212 (BufferedImage. (.getWidth camera)
rlm@151 213 (.getHeight camera)
rlm@170 214 BufferedImage/TYPE_BYTE_BINARY))
rlm@170 215 register-eye!
rlm@170 216 (runonce
rlm@170 217 (fn [world]
rlm@170 218 (add-camera!
rlm@170 219 world camera
rlm@170 220 (let [counter (atom 0)]
rlm@170 221 (fn [r fb bb bi]
rlm@170 222 (if (zero? (rem (swap! counter inc) (inc skip)))
rlm@170 223 (reset! vision-image
rlm@170 224 (BufferedImage! r fb bb bi))))))))]
rlm@151 225 (vec
rlm@151 226 (map
rlm@151 227 (fn [[key image]]
rlm@151 228 (let [whites (white-coordinates image)
rlm@151 229 topology (vec (collapse whites))
rlm@172 230 mask (color-channel-presets key)]
rlm@170 231 (fn [world]
rlm@170 232 (register-eye! world)
rlm@151 233 (vector
rlm@151 234 topology
rlm@151 235 (vec
rlm@151 236 (for [[x y] whites]
rlm@151 237 (bit-and
rlm@151 238 mask (.getRGB @vision-image x y))))))))
rlm@170 239 retinal-map))))
rlm@151 240
rlm@170 241
rlm@170 242 ;; TODO maybe should add a viewport-manipulation function to
rlm@170 243 ;; automatically change viewport settings, attach shadow filters, etc.
rlm@170 244
rlm@170 245 (defn vision!
rlm@170 246 "Returns a list of functions, each of which returns visual sensory
rlm@170 247 data when called inside a running simulation."
rlm@151 248 [#^Node creature & {skip :skip :or {skip 0}}]
rlm@151 249 (reduce
rlm@170 250 concat
rlm@167 251 (for [eye (eyes creature)]
rlm@169 252 (vision-fn creature eye :skip skip))))
rlm@151 253
rlm@189 254 (defn view-vision
rlm@189 255 "Creates a function which accepts a list of visual sensor-data and
rlm@189 256 displays each element of the list to the screen."
rlm@189 257 []
rlm@188 258 (view-sense
rlm@188 259 (fn
rlm@188 260 [[coords sensor-data]]
rlm@188 261 (let [image (points->image coords)]
rlm@188 262 (dorun
rlm@188 263 (for [i (range (count coords))]
rlm@188 264 (.setRGB image ((coords i) 0) ((coords i) 1)
rlm@188 265 (sensor-data i))))
rlm@189 266 image))))
rlm@188 267
rlm@34 268 #+end_src
rlm@23 269
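
The channel masking performed inside =vision-fn= can be tried on its
own using only the JDK. In this sketch =channel-masks= copies the
values of =color-channel-presets=, while =channel-value= and the
one-pixel test image are hypothetical names introduced for
illustration.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(def channel-masks
  "Same values as color-channel-presets in the listing above."
  {:all 0xFFFFFF :red 0xFF0000 :green 0x00FF00 :blue 0x0000FF})

(defn channel-value
  "Keep only the bits of the pixel at (x, y) selected by the named mask."
  [^BufferedImage image channel x y]
  (bit-and (channel-masks channel) (.getRGB image x y)))

;; A one-pixel image is enough to see the effect of each mask.
(def img (BufferedImage. 1 1 BufferedImage/TYPE_INT_RGB))
(.setRGB img 0 0 0x336699)

(channel-value img :red 0 0)   ;; 0x330000
(channel-value img :green 0 0) ;; 0x006600
(channel-value img :blue 0 0)  ;; 0x000099
(channel-value img :all 0 0)   ;; 0x336699
#+end_src

Note that the mask does not shift the result: the red component stays
in bits 16 to 23, so a sensor on the =:red= channel reports =0x330000=
rather than =0x33=. The alpha bits returned by =getRGB= are dropped by
every mask.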
rlm@112 270
rlm@34 271 Note the use of continuation-passing style for connecting the eye to a
rlm@34 272 function to process the output. You can create any number of eyes, and
rlm@34 273 each of them will see the world from its own =Camera=. Once every
rlm@34 274 frame, the rendered image is copied to a =BufferedImage=, and that
rlm@34 275 data is sent off to the continuation function. Moving the =Camera=
rlm@34 276 which was used to create the eye will change what the eye sees.
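
As a sketch of what a processing function at the end of such a
continuation might look like, the following computes the average
brightness of a frame. =mean-gray= is a hypothetical function that is
not part of =cortex.vision=; it relies only on
=java.awt.image.BufferedImage= from the JDK.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn mean-gray
  "Average of the red, green and blue components over every pixel of
  image, as a double between 0 and 255."
  [^BufferedImage image]
  (let [w (.getWidth image)
        h (.getHeight image)
        total (reduce +
                      (for [x (range w) y (range h)]
                        (let [rgb (.getRGB image x y)]
                          (+ (bit-and 0xFF (bit-shift-right rgb 16))
                             (bit-and 0xFF (bit-shift-right rgb 8))
                             (bit-and 0xFF rgb)))))]
    (/ total (* 3.0 w h))))

;; Assumed usage, following the (comp (view-image) BufferedImage!)
;; pattern from the test code; it needs a running simulation:
;; (add-camera! world cam (comp println mean-gray BufferedImage!))
#+end_src

Composing with =BufferedImage!= means the processing function receives
an ordinary =BufferedImage= and never has to handle the =Renderer=,
=FrameBuffer=, or =ByteBuffer= directly.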
rlm@23 277
rlm@34 278 * Example
rlm@23 279
rlm@66 280 #+name: test-vision
rlm@23 281 #+begin_src clojure
rlm@68 282 (ns cortex.test.vision
rlm@34 283 (:use (cortex world util vision))
rlm@34 284 (:import java.awt.image.BufferedImage)
rlm@34 285 (:import javax.swing.JPanel)
rlm@34 286 (:import javax.swing.SwingUtilities)
rlm@34 287 (:import java.awt.Dimension)
rlm@34 288 (:import javax.swing.JFrame)
rlm@34 289 (:import com.jme3.math.ColorRGBA)
rlm@45 290 (:import com.jme3.scene.Node)
rlm@113 291 (:import com.jme3.math.Vector3f))
rlm@23 292
rlm@36 293 (defn test-two-eyes
rlm@69 294 "Testing vision:
rlm@69 295 Tests the vision system by creating two views of the same rotating
rlm@69 296 object from different angles and displaying both of those views in
rlm@69 297 JFrames.
rlm@69 298
rlm@69 299 You should see a rotating cube, and two windows,
rlm@69 300 each displaying a different view of the cube."
rlm@36 301 []
rlm@58 302 (let [candy
rlm@58 303 (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
rlm@112 304 (world
rlm@112 305 (doto (Node.)
rlm@112 306 (.attachChild candy))
rlm@112 307 {}
rlm@112 308 (fn [world]
rlm@112 309 (let [cam (.clone (.getCamera world))
rlm@112 310 width (.getWidth cam)
rlm@112 311 height (.getHeight cam)]
rlm@169 312 (add-camera! world cam
rlm@113 313 ;;no-op
rlm@113 314 (comp (view-image) BufferedImage!)
rlm@112 315 )
rlm@169 316 (add-camera! world
rlm@112 317 (doto (.clone cam)
rlm@112 318 (.setLocation (Vector3f. -10 0 0))
rlm@112 319 (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
rlm@113 320 ;;no-op
rlm@113 321 (comp (view-image) BufferedImage!))
rlm@112 322 ;; This is here to restore the main view
rlm@112 323 ;; after the other views have completed processing
rlm@169 324 (add-camera! world (.getCamera world) no-op)))
rlm@112 325 (fn [world tpf]
rlm@112 326 (.rotate candy (* tpf 0.2) 0 0)))))
rlm@23 327 #+end_src
rlm@23 328
rlm@112 329 #+results: test-vision
rlm@112 330 : #'cortex.test.vision/test-two-eyes
rlm@112 331
rlm@34 332 The example code will create two videos of the same rotating object
rlm@34 333 from different angles. It can be used either for stereoscopic vision
rlm@34 334 simulation or for simulating multiple creatures, each with their own
rlm@34 335 sense of vision.
rlm@24 336
rlm@35 337 - As a neat bonus, the idea behind simulated vision also enables one
rlm@35 338 to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].
rlm@35 339
rlm@24 340
rlm@212 341 * COMMENT Generate Source
rlm@34 342 #+begin_src clojure :tangle ../src/cortex/vision.clj
rlm@24 343 <<eyes>>
rlm@24 344 #+end_src
rlm@24 345
rlm@68 346 #+begin_src clojure :tangle ../src/cortex/test/vision.clj
rlm@24 347 <<test-vision>>
rlm@24 348 #+end_src