annotate org/vision.org @ 326:e5636b1740f8

add GPL v2 license
author Robert McIntyre <rlm@mit.edu>
date Tue, 26 Feb 2013 16:29:04 +0000
parents 702b5c78c2de
children 5dcd44576cbc
rev   line source
rlm@34 1 #+title: Simulated Sense of Sight
rlm@23 2 #+author: Robert McIntyre
rlm@23 3 #+email: rlm@mit.edu
rlm@38 4 #+description: Simulated sight for AI research using jMonkeyEngine3 and clojure
rlm@34 5 #+keywords: computer vision, jMonkeyEngine3, clojure
rlm@23 6 #+SETUPFILE: ../../aurellem/org/setup.org
rlm@23 7 #+INCLUDE: ../../aurellem/org/level-0.org
rlm@23 8 #+babel: :mkdirp yes :noweb yes :exports both
rlm@23 9
ocsenave@264 10 * jMonkeyEngine natively supports multiple views of the same world.
ocsenave@264 11
rlm@212 12 Vision is one of the most important senses for humans, so I need to
rlm@212 13 build a simulated sense of vision for my AI. I will do this with
rlm@306 14 simulated eyes. Each eye can be independently moved and should see its
rlm@212 15 own version of the world depending on where it is.
rlm@212 16
rlm@306 17 Making these simulated eyes a reality is simple because jMonkeyEngine
rlm@306 18 already contains extensive support for multiple views of the same 3D
rlm@218 19 simulated world. jMonkeyEngine has this support because it is
rlm@218 20 necessary for creating games with split-screen
rlm@218 21 views. Multiple views are also used to create efficient
rlm@212 22 pseudo-reflections by rendering the scene from a certain perspective
rlm@212 23 and then projecting it back onto a surface in the 3D world.
rlm@212 24
rlm@218 25 #+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye, which was one of the first games to use split-screen views.
rlm@212 26 [[../images/goldeneye-4-player.png]]
rlm@212 27
ocsenave@264 28 ** =ViewPorts=, =SceneProcessors=, and the =RenderManager=.
rlm@306 29 # =ViewPorts= are cameras; =RenderManager= takes snapshots each frame.
ocsenave@264 30 #* A Brief Description of jMonkeyEngine's Rendering Pipeline
rlm@212 31
rlm@213 32 jMonkeyEngine allows you to create a =ViewPort=, which represents a
rlm@213 33 view of the simulated world. You can create as many of these as you
rlm@213 34 want. Every frame, the =RenderManager= iterates through each
rlm@213 35 =ViewPort=, rendering the scene in the GPU. For each =ViewPort= there
rlm@213 36 is a =FrameBuffer= which represents the rendered image in the GPU.
rlm@151 37
rlm@306 38 #+caption: =ViewPorts= are cameras in the world. During each frame, the =RenderManager= records a snapshot of what each view is currently seeing; these snapshots are =FrameBuffer= objects.
ocsenave@265 39 #+ATTR_HTML: width="400"
ocsenave@272 40 [[../images/diagram_rendermanager2.png]]
ocsenave@262 41
rlm@213 42 Each =ViewPort= can have any number of attached =SceneProcessor=
rlm@213 43 objects, which are called every time a new frame is rendered. A
rlm@306 44 =SceneProcessor= receives its =ViewPort's= =FrameBuffer= and can do
rlm@219 45 whatever it wants to the data. Often this consists of invoking
rlm@219 46 GPU-specific operations on the rendered image. The =SceneProcessor= can
rlm@219 47 also copy the GPU image data to RAM and process it with the CPU.
rlm@151 48
ocsenave@264 49 ** From Views to Vision
ocsenave@264 50 # Appropriating Views for Vision.
rlm@151 51
ocsenave@264 52 Each eye in the simulated creature needs its own =ViewPort= so that
rlm@213 53 it can see the world from its own perspective. To this =ViewPort=, I
rlm@306 54 add a =SceneProcessor= that feeds the visual data to any arbitrary
rlm@213 55 continuation function for further processing. That continuation
rlm@213 56 function may perform both CPU and GPU operations on the data. To make
rlm@213 57 this easy for the continuation function, the =SceneProcessor=
rlm@306 58 maintains appropriately sized buffers in RAM to hold the data. It does
rlm@218 59 not copy anything from the GPU to the CPU itself, because that copy is
rlm@218 60 slow; the continuation decides whether to pay for it.
rlm@214 61
rlm@213 62 #+name: pipeline-1
rlm@213 63 #+begin_src clojure
rlm@113 64 (defn vision-pipeline
rlm@34 65 "Create a SceneProcessor object which wraps a vision processing
rlm@113 66 continuation function. The continuation is a function that takes
rlm@113 67 [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
rlm@306 68 each of which has already been appropriately sized."
rlm@23 69 [continuation]
rlm@23 70 (let [byte-buffer (atom nil)
rlm@113 71 renderer (atom nil)
rlm@113 72 image (atom nil)]
rlm@23 73 (proxy [SceneProcessor] []
rlm@23 74 (initialize
rlm@23 75 [renderManager viewPort]
rlm@23 76 (let [cam (.getCamera viewPort)
rlm@23 77 width (.getWidth cam)
rlm@23 78 height (.getHeight cam)]
rlm@23 79 (reset! renderer (.getRenderer renderManager))
rlm@23 80 (reset! byte-buffer
rlm@23 81 (BufferUtils/createByteBuffer
rlm@113 82 (* width height 4)))
rlm@113 83 (reset! image (BufferedImage.
rlm@113 84 width height
rlm@113 85 BufferedImage/TYPE_4BYTE_ABGR))))
rlm@23 86 (isInitialized [] (not (nil? @byte-buffer)))
rlm@23 87 (reshape [_ _ _])
rlm@23 88 (preFrame [_])
rlm@23 89 (postQueue [_])
rlm@23 90 (postFrame
rlm@23 91 [#^FrameBuffer fb]
rlm@23 92 (.clear @byte-buffer)
rlm@113 93 (continuation @renderer fb @byte-buffer @image))
rlm@23 94 (cleanup []))))
rlm@213 95 #+end_src
rlm@213 96
rlm@273 97 The continuation function given to =vision-pipeline= above will be
rlm@213 98 given a =Renderer= and three containers for image data. The
rlm@218 99 =FrameBuffer= references the GPU image data, but the pixel data
rlm@218 100 cannot be used directly on the CPU. The =ByteBuffer= and =BufferedImage=
rlm@219 101 are initially "empty" but are sized to hold the data in the
rlm@306 102 =FrameBuffer=. I call transferring the GPU image data to the CPU
rlm@213 103 structures "mixing" the image data. I have provided three functions to
rlm@213 104 do this mixing.
rlm@213 105
rlm@213 106 #+name: pipeline-2
rlm@213 107 #+begin_src clojure
rlm@113 108 (defn frameBuffer->byteBuffer!
rlm@113 109 "Transfer the data in the graphics card (Renderer, FrameBuffer) to
rlm@113 110 the CPU (ByteBuffer)."
rlm@113 111 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
rlm@113 112 (.readFrameBuffer r fb bb) bb)
rlm@113 113
rlm@113 114 (defn byteBuffer->bufferedImage!
rlm@113 115 "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
rlm@113 116 style ABGR image data and place it in BufferedImage bi."
rlm@113 117 [#^ByteBuffer bb #^BufferedImage bi]
rlm@113 118 (Screenshots/convertScreenShot bb bi) bi)
rlm@113 119
rlm@113 120 (defn BufferedImage!
rlm@113 121 "Continuation which will grab the buffered image from the materials
rlm@113 122 provided by (vision-pipeline)."
rlm@113 123 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
rlm@113 124 (byteBuffer->bufferedImage!
rlm@113 125 (frameBuffer->byteBuffer! r fb bb) bi))
rlm@213 126 #+end_src
rlm@112 127
rlm@213 128 Note that it is possible to write vision processing algorithms
rlm@213 129 entirely in terms of =BufferedImage= inputs. Just compose that
rlm@273 130 =BufferedImage= algorithm with =BufferedImage!=. However, a vision
rlm@213 131 processing algorithm that is entirely hosted on the GPU does not have
rlm@306 132 to pay for this convenience.
rlm@213 133
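As a minimal sketch of that composition, assume a hypothetical
=mean-brightness= function (it is not part of =cortex.vision=) that
works purely on a =BufferedImage=. Because =BufferedImage!= returns
the filled =BufferedImage=, the two could be chained with =comp= and
handed to =vision-pipeline=. The composition itself appears only in a
comment, since it needs a running jMonkeyEngine renderer.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn mean-brightness
  "Hypothetical example algorithm: average of the R, G and B channels
   over every pixel of bi, scaled to the range [0, 1]."
  [^BufferedImage bi]
  (let [w (.getWidth bi)
        h (.getHeight bi)
        total (reduce +
                      (for [x (range w) y (range h)]
                        (let [rgb (.getRGB bi x y)]
                          (+ (bit-and 0xFF (bit-shift-right rgb 16))
                             (bit-and 0xFF (bit-shift-right rgb 8))
                             (bit-and 0xFF rgb)))))]
    (double (/ total (* 3 255 w h)))))

;; Inside a simulation, the continuation would be the composition:
;;   (vision-pipeline (comp mean-brightness BufferedImage!))

;; Stand-alone check on a 2x2 opaque white image:
(let [bi (BufferedImage. 2 2 BufferedImage/TYPE_4BYTE_ABGR)]
  (doseq [x (range 2) y (range 2)]
    (.setRGB bi x y (unchecked-int 0xFFFFFFFF)))
  (mean-brightness bi)) ; => 1.0
#+end_src
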
ocsenave@265 134 * Optical sensor arrays are described with images and referenced with metadata
rlm@214 135 The vision pipeline described above handles the flow of rendered
rlm@214 136 images. Now, we need simulated eyes to serve as the source of these
rlm@214 137 images.
rlm@214 138
rlm@214 139 Eyes are described in blender in the same way as joints. They are
rlm@214 140 zero-dimensional empty objects with no geometry whose local coordinate
rlm@214 141 system determines the orientation of the resulting eye. All eyes are
rlm@306 142 children of a parent node named "eyes" just as all joints have a
rlm@214 143 parent named "joints". An eye binds to the nearest physical object
rlm@273 144 with =bind-sense=.
rlm@214 145
rlm@214 146 #+name: add-eye
rlm@214 147 #+begin_src clojure
rlm@215 148 (in-ns 'cortex.vision)
rlm@215 149
rlm@214 150 (defn add-eye!
rlm@214 151 "Create a Camera centered on the current position of 'eye which
rlm@214 152 follows the closest physical node in 'creature. Returns the
rlm@215 153 Camera. The camera will point in the X direction and
rlm@215 154 use the Z vector as up as determined by the rotation of these
rlm@215 155 vectors in blender coordinate space. Use XZY rotation for the node
rlm@215 156 in blender."
rlm@214 157 [#^Node creature #^Spatial eye]
rlm@214 158 (let [target (closest-node creature eye)
rlm@214 159 [cam-width cam-height] (eye-dimensions eye)
rlm@215 160 cam (Camera. cam-width cam-height)
rlm@215 161 rot (.getWorldRotation eye)]
rlm@214 162 (.setLocation cam (.getWorldTranslation eye))
rlm@218 163 (.lookAtDirection
rlm@218 164 cam ; this part is not a mistake and
rlm@218 165 (.mult rot Vector3f/UNIT_X) ; is consistent with using Z in
rlm@218 166 (.mult rot Vector3f/UNIT_Y)) ; blender as the UP vector.
rlm@214 167 (.setFrustumPerspective
rlm@215 168 cam 45 (/ (.getWidth cam) (.getHeight cam)) 1 1000)
rlm@215 169 (bind-sense target cam) cam))
rlm@214 170 #+end_src
rlm@214 171
rlm@214 172 Here, the camera is created based on metadata on the eye-node and
rlm@273 173 attached to the nearest physical object with =bind-sense=.
rlm@214 174 ** The Retina
rlm@214 175
rlm@214 176 An eye is a surface (the retina) which contains many discrete sensors
rlm@218 177 to detect light. These sensors can have different light-sensing
rlm@214 178 properties. In humans, each discrete sensor is sensitive to red,
rlm@214 179 blue, green, or gray. These different types of sensors can have
rlm@214 180 different spatial distributions along the retina. In humans, there is
rlm@214 181 a fovea in the center of the retina which has a very high density of
rlm@214 182 color sensors, and a blind spot which has no sensors at all. Sensor
rlm@219 183 density decreases in proportion to distance from the fovea.
rlm@214 184
rlm@214 185 I want to be able to model any retinal configuration, so my eye-nodes
rlm@214 186 in blender contain metadata pointing to images that describe the
rlm@306 187 precise position of the individual sensors using white pixels. The
rlm@306 188 metadata also describes the precise sensitivity to light that the
rlm@214 189 sensors described in the image have. An eye can contain any number of
rlm@214 190 these images. For example, the metadata for an eye might look like
rlm@214 191 this:
rlm@214 192
rlm@214 193 #+begin_src clojure
rlm@214 194 {0xFF0000 "Models/test-creature/retina-small.png"}
rlm@214 195 #+end_src
rlm@214 196
rlm@214 197 #+caption: The retinal profile image "Models/test-creature/retina-small.png". White pixels are photo-sensitive elements. The distribution of white pixels is denser in the middle and falls off at the edges and is inspired by the human retina.
rlm@214 198 [[../assets/Models/test-creature/retina-small.png]]
rlm@214 199
rlm@214 200 Together, the number 0xFF0000 and the image above describe the
rlm@214 201 placement of red-sensitive sensory elements.
rlm@214 202
rlm@214 203 Metadata that very crudely approximates a human eye might look something
rlm@214 204 like this:
rlm@214 205
rlm@214 206 #+begin_src clojure
rlm@214 207 (let [retinal-profile "Models/test-creature/retina-small.png"]
rlm@214 208 {0xFF0000 retinal-profile
rlm@214 209 0x00FF00 retinal-profile
rlm@214 210 0x0000FF retinal-profile
rlm@214 211 0xFFFFFF retinal-profile})
rlm@214 212 #+end_src
rlm@214 213
rlm@214 214 The numbers that serve as keys in the map determine a sensor's
rlm@214 215 relative sensitivity to the channels red, green, and blue. These
rlm@218 216 sensitivity values are packed into an integer in the order =|_|R|G|B|=
rlm@218 217 in 8-bit fields. The RGB values of a pixel in the image are added
rlm@306 218 together with these sensitivities as linear weights. Therefore,
rlm@214 219 0xFF0000 means sensitive to red only while 0xFFFFFF means sensitive to
rlm@214 220 all colors equally (gray).
rlm@214 221
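To make the packing concrete, here is a small stand-alone sketch. The
helper names =unpack-rgb= and =weighted-sum= are illustrative and do
not appear in =cortex.vision=. The sketch shows only the unnormalized
weighted sum described above.

#+begin_src clojure
(defn unpack-rgb
  "Split a packed |_|R|G|B| integer into a vector [r g b] of 8-bit fields."
  [n]
  [(bit-and 0xFF (bit-shift-right n 16))
   (bit-and 0xFF (bit-shift-right n 8))
   (bit-and 0xFF n)])

(defn weighted-sum
  "Sum the pixel's channels, each weighted by the matching sensitivity field."
  [sensitivity pixel]
  (reduce + (map * (unpack-rgb sensitivity) (unpack-rgb pixel))))

(unpack-rgb 0xFF0000)            ; => [255 0 0]
(weighted-sum 0xFF0000 0x102030) ; => 4080, only red (0x10 = 16) counts: 255 * 16
(weighted-sum 0xFFFFFF 0x102030) ; => 24480, that is 255 * (16 + 32 + 48)
#+end_src
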
rlm@306 222 For convenience I've defined a few symbols for the more common
rlm@214 223 sensitivity values.
rlm@214 224
rlm@214 225 #+name: sensitivity
rlm@214 226 #+begin_src clojure
rlm@317 227 (def sensitivity-presets
rlm@317 228 "Retinal sensitivity presets for sensors that extract one channel
rlm@317 229 (:red :blue :green) or average all channels (:all)"
rlm@214 230 {:all 0xFFFFFF
rlm@214 231 :red 0xFF0000
rlm@214 232 :blue 0x0000FF
rlm@317 233 :green 0x00FF00})
rlm@214 234 #+end_src
rlm@214 235
rlm@214 236 ** Metadata Processing
rlm@214 237
rlm@273 238 =retina-sensor-profile= extracts a map from the eye-node in the same
rlm@273 239 format as the example maps above. =eye-dimensions= finds the
rlm@219 240 dimensions of the smallest image required to contain all the retinal
rlm@214 241 sensor maps.
rlm@214 242
rlm@216 243 #+name: retina
rlm@214 244 #+begin_src clojure
rlm@214 245 (defn retina-sensor-profile
rlm@214 246 "Return a map of pixel sensitivity numbers to BufferedImages
rlm@214 247 describing the distribution of light-sensitive components of this
rlm@214 248 eye. :red, :green, :blue, and :all are defined in sensitivity-presets
rlm@214 249 as extracting the red, green, blue, and average components respectively."
rlm@214 250 [#^Spatial eye]
rlm@214 251 (if-let [eye-map (meta-data eye "eye")]
rlm@214 252 (map-vals
rlm@214 253 load-image
rlm@214 254 (eval (read-string eye-map)))))
rlm@214 255
rlm@218 256 (defn eye-dimensions
rlm@218 257 "Returns [width, height] determined by the metadata of the eye."
rlm@214 258 [#^Spatial eye]
rlm@214 259 (let [dimensions
rlm@214 260 (map #(vector (.getWidth %) (.getHeight %))
rlm@214 261 (vals (retina-sensor-profile eye)))]
rlm@214 262 [(apply max (map first dimensions))
rlm@214 263 (apply max (map second dimensions))]))
rlm@214 264 #+end_src
rlm@214 265
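The size calculation in =eye-dimensions= can be tried without loading
any images. In this sketch, plain =[width height]= vectors stand in
for the =BufferedImage= objects, and =bounding-dimensions= is a
hypothetical name used only here.

#+begin_src clojure
(defn bounding-dimensions
  "Given a collection of [width height] pairs, return the smallest
   [width height] that can contain every one of them."
  [dimensions]
  [(apply max (map first dimensions))
   (apply max (map second dimensions))])

;; Three imaginary retinal profiles of different shapes. The widest
;; and the tallest need not be the same image.
(bounding-dimensions [[64 48] [32 96] [50 50]]) ; => [64 96]
#+end_src

With an empty collection, =(apply max ...)= throws, so this sketch
assumes at least one profile image.
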
ocsenave@265 266 * Importing and parsing descriptions of eyes.
rlm@214 267 First off, get the children of the "eyes" empty node to find all the
rlm@214 268 eyes the creature has.
rlm@216 269 #+name: eye-node
rlm@214 270 #+begin_src clojure
rlm@317 271 (def
rlm@317 272 ^{:doc "Return the children of the creature's \"eyes\" node."
rlm@317 273 :arglists '([creature])}
rlm@214 274 eyes
rlm@317 275 (sense-nodes "eyes"))
rlm@214 276 #+end_src
rlm@214 277
rlm@273 278 Then, add the camera created by =add-eye!= to the simulation by
rlm@215 279 creating a new viewport.
rlm@214 280
rlm@216 281 #+name: add-camera
rlm@213 282 #+begin_src clojure
rlm@169 283 (defn add-camera!
rlm@169 284 "Add a camera to the world, calling continuation on every frame
rlm@34 285 produced."
rlm@167 286 [#^Application world camera continuation]
rlm@23 287 (let [width (.getWidth camera)
rlm@23 288 height (.getHeight camera)
rlm@23 289 render-manager (.getRenderManager world)
rlm@23 290 viewport (.createMainView render-manager "eye-view" camera)]
rlm@23 291 (doto viewport
rlm@23 292 (.setClearFlags true true true)
rlm@112 293 (.setBackgroundColor ColorRGBA/Black)
rlm@113 294 (.addProcessor (vision-pipeline continuation))
rlm@23 295 (.attachScene (.getRootNode world)))))
rlm@215 296 #+end_src
rlm@151 297
rlm@151 298
rlm@218 299 The eye's continuation function should register the viewport with the
rlm@218 300 simulation the first time it is called, then use the CPU to extract the
rlm@215 301 appropriate pixels from the rendered image and weight them by each
rlm@218 302 sensor's sensitivity. I have the option to do this processing in
rlm@218 303 native code for a slight gain in speed. I could also do it on the GPU
rlm@273 304 for a massive gain in speed. =vision-kernel= generates a list of
rlm@218 305 such continuation functions, one for each channel of the eye.
rlm@151 306
rlm@216 307 #+name: kernel
rlm@215 308 #+begin_src clojure
rlm@215 309 (in-ns 'cortex.vision)
rlm@151 310
rlm@215 311 (defrecord attached-viewport [vision-fn viewport-fn]
rlm@215 312 clojure.lang.IFn
rlm@215 313 (invoke [this world] (vision-fn world))
rlm@215 314 (applyTo [this args] (apply vision-fn args)))
rlm@151 315
rlm@216 316 (defn pixel-sense [sensitivity pixel]
rlm@216 317 (let [s-r (bit-shift-right (bit-and 0xFF0000 sensitivity) 16)
rlm@216 318 s-g (bit-shift-right (bit-and 0x00FF00 sensitivity) 8)
rlm@216 319 s-b (bit-and 0x0000FF sensitivity)
rlm@216 320
rlm@216 321 p-r (bit-shift-right (bit-and 0xFF0000 pixel) 16)
rlm@216 322 p-g (bit-shift-right (bit-and 0x00FF00 pixel) 8)
rlm@216 323 p-b (bit-and 0x0000FF pixel)
rlm@216 324
rlm@216 325 total-sensitivity (* 255 (+ s-r s-g s-b))]
rlm@216 326 (float (/ (+ (* s-r p-r)
rlm@216 327 (* s-g p-g)
rlm@216 328 (* s-b p-b))
rlm@216 329 total-sensitivity))))
rlm@216 330
rlm@215 331 (defn vision-kernel
rlm@171 332 "Returns a list of functions, each of which will return a color
rlm@171 333 channel's worth of visual information when called inside a running
rlm@171 334 simulation."
rlm@151 335 [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
rlm@169 336 (let [retinal-map (retina-sensor-profile eye)
rlm@169 337 camera (add-eye! creature eye)
rlm@151 338 vision-image
rlm@151 339 (atom
rlm@151 340 (BufferedImage. (.getWidth camera)
rlm@151 341 (.getHeight camera)
rlm@170 342 BufferedImage/TYPE_BYTE_BINARY))
rlm@170 343 register-eye!
rlm@170 344 (runonce
rlm@170 345 (fn [world]
rlm@170 346 (add-camera!
rlm@170 347 world camera
rlm@170 348 (let [counter (atom 0)]
rlm@170 349 (fn [r fb bb bi]
rlm@170 350 (if (zero? (rem (swap! counter inc) (inc skip)))
rlm@170 351 (reset! vision-image
rlm@170 352 (BufferedImage! r fb bb bi))))))))]
rlm@151 353 (vec
rlm@151 354 (map
rlm@151 355 (fn [[key image]]
rlm@151 356 (let [whites (white-coordinates image)
rlm@151 357 topology (vec (collapse whites))
rlm@216 358 sensitivity (sensitivity-presets key key)]
rlm@215 359 (attached-viewport.
rlm@215 360 (fn [world]
rlm@215 361 (register-eye! world)
rlm@215 362 (vector
rlm@215 363 topology
rlm@215 364 (vec
rlm@215 365 (for [[x y] whites]
rlm@216 366 (pixel-sense
rlm@216 367 sensitivity
rlm@216 368 (.getRGB @vision-image x y))))))
rlm@215 369 register-eye!)))
rlm@215 370 retinal-map))))
rlm@151 371
rlm@215 372 (defn gen-fix-display
rlm@215 373 "Create a function to call to restore a simulation's display when it
rlm@215 374 is disrupted by a Viewport."
rlm@215 375 []
rlm@215 376 (runonce
rlm@215 377 (fn [world]
rlm@215 378 (add-camera! world (.getCamera world) no-op))))
rlm@215 379 #+end_src
rlm@170 380
rlm@273 381 Note that since each of the functions generated by =vision-kernel=
rlm@273 382 shares the same =register-eye!= function, the eye will be registered
rlm@215 383 only once the first time any of the functions from the list returned
rlm@273 384 by =vision-kernel= is called. Each of the functions returned by
rlm@273 385 =vision-kernel= also allows access to the =ViewPort= through which
rlm@306 386 it receives images.
rlm@215 387
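=runonce= comes from the =cortex= utility code and is not shown in
this file. As a rough sketch of the behavior relied on here, and not
the actual implementation, a run-once wrapper could look like the
hypothetical =run-once= below: the wrapped function runs on the first
call, and later calls do nothing.

#+begin_src clojure
(defn run-once
  "Hypothetical stand-in for runonce: wrap f so that it is executed only
   on the first call. Later calls return nil without calling f."
  [f]
  (let [called? (atom false)]
    (fn [& args]
      ;; compare-and-set! flips the flag atomically, so exactly one
      ;; caller gets to run f even if several threads race.
      (when (compare-and-set! called? false true)
        (apply f args)))))

;; Two "channel" functions sharing one registration step, in the way
;; the functions from vision-kernel share register-eye!:
(def registrations (atom 0))
(def register! (run-once (fn [_world] (swap! registrations inc))))
(defn red-channel  [world] (register! world) :red-data)
(defn blue-channel [world] (register! world) :blue-data)

(red-channel :world)
(blue-channel :world)
@registrations ; => 1
#+end_src
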
rlm@306 388 The in-game display can be disrupted by all the ViewPorts that the
rlm@306 389 functions generated by =vision-kernel= add. This doesn't affect the
rlm@215 390 simulation or the simulated senses, but can be annoying.
rlm@273 391 =gen-fix-display= restores the in-simulation display.
rlm@215 392
ocsenave@265 393 ** The =vision!= function creates sensory probes.
rlm@215 394
rlm@218 395 All the hard work has been done; all that remains is to apply
rlm@273 396 =vision-kernel= to each eye in the creature and gather the results
rlm@215 397 into one list of functions.
rlm@215 398
rlm@216 399 #+name: main
rlm@215 400 #+begin_src clojure
rlm@170 401 (defn vision!
rlm@170 402 "Returns a list of functions, each of which returns visual sensory data
rlm@218 403 when called inside a running simulation."
rlm@151 404 [#^Node creature & {skip :skip :or {skip 0}}]
rlm@151 405 (reduce
rlm@170 406 concat
rlm@167 407 (for [eye (eyes creature)]
rlm@215 408 (vision-kernel creature eye :skip skip))))
rlm@215 409 #+end_src
rlm@151 410
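Both =vision-kernel= and =vision!= accept a =:skip= option. Judging
from the counter test in =vision-kernel=, a new image is captured
whenever the running frame count is a multiple of =(inc skip)=. The
helper =captured-frames= below is hypothetical and only replays that
test on plain frame numbers.

#+begin_src clojure
(defn captured-frames
  "Return the 1-based frame numbers, out of the first n frames, on which
   the counter test (zero? (rem count (inc skip))) would succeed."
  [skip n]
  (filter #(zero? (rem % (inc skip))) (range 1 (inc n))))

(captured-frames 0 6) ; => (1 2 3 4 5 6), every frame is captured
(captured-frames 2 9) ; => (3 6 9), two frames skipped between captures
#+end_src
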
ocsenave@265 411 ** Displaying visual data for debugging.
ocsenave@265 412 # Visualization of Vision. Maybe less alliteration would be better.
rlm@215 413 It's vital to have a visual representation for each sense. Here I use
rlm@273 414 =view-sense= to construct a function that will create a display for
rlm@215 415 visual data.
rlm@215 416
rlm@216 417 #+name: display
rlm@215 418 #+begin_src clojure
rlm@216 419 (in-ns 'cortex.vision)
rlm@216 420
rlm@189 421 (defn view-vision
rlm@189 422 "Creates a function which accepts a list of visual sensor-data and
rlm@189 423 displays each element of the list to the screen."
rlm@189 424 []
rlm@188 425 (view-sense
rlm@188 426 (fn
rlm@188 427 [[coords sensor-data]]
rlm@188 428 (let [image (points->image coords)]
rlm@188 429 (dorun
rlm@188 430 (for [i (range (count coords))]
rlm@188 431 (.setRGB image ((coords i) 0) ((coords i) 1)
rlm@216 432 (gray (int (* 255 (sensor-data i)))))))
rlm@189 433 image))))
rlm@34 434 #+end_src
rlm@23 435
ocsenave@264 436 * Demonstrations
ocsenave@264 437 ** Demonstrating the vision pipeline.
rlm@23 438
rlm@215 439 This is a basic test for the vision system. It only tests the
ocsenave@264 440 vision-pipeline and does not deal with loading eyes from a blender
rlm@215 441 file. The code creates two videos of the same rotating cube from
rlm@215 442 different angles.
rlm@23 443
rlm@215 444 #+name: test-1
rlm@23 445 #+begin_src clojure
rlm@215 446 (in-ns 'cortex.test.vision)
rlm@23 447
rlm@219 448 (defn test-pipeline
rlm@69 449 "Testing vision:
rlm@69 450 Tests the vision system by creating two views of the same rotating
rlm@69 451 object from different angles and displaying both of those views in
rlm@69 452 JFrames.
rlm@69 453
rlm@69 454 You should see a rotating cube, and two windows,
rlm@69 455 each displaying a different view of the cube."
rlm@283 456 ([] (test-pipeline false))
rlm@283 457 ([record?]
rlm@283 458 (let [candy
rlm@283 459 (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
rlm@283 460 (world
rlm@283 461 (doto (Node.)
rlm@283 462 (.attachChild candy))
rlm@283 463 {}
rlm@283 464 (fn [world]
rlm@283 465 (let [cam (.clone (.getCamera world))
rlm@283 466 width (.getWidth cam)
rlm@283 467 height (.getHeight cam)]
rlm@283 468 (add-camera! world cam
rlm@283 469 (comp
rlm@283 470 (view-image
rlm@283 471 (if record?
rlm@283 472 (File. "/home/r/proj/cortex/render/vision/1")))
rlm@283 473 BufferedImage!))
rlm@283 474 (add-camera! world
rlm@283 475 (doto (.clone cam)
rlm@283 476 (.setLocation (Vector3f. -10 0 0))
rlm@283 477 (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
rlm@283 478 (comp
rlm@283 479 (view-image
rlm@283 480 (if record?
rlm@283 481 (File. "/home/r/proj/cortex/render/vision/2")))
rlm@283 482 BufferedImage!))
rlm@283 483 ;; This is here to restore the main view
rlm@112 484 ;; after the other views have completed processing
rlm@283 485 (add-camera! world (.getCamera world) no-op)))
rlm@283 486 (fn [world tpf]
rlm@283 487 (.rotate candy (* tpf 0.2) 0 0))))))
rlm@23 488 #+end_src
rlm@23 489
rlm@215 490 #+begin_html
rlm@215 491 <div class="figure">
rlm@215 492 <video controls="controls" width="755">
rlm@215 493 <source src="../video/spinning-cube.ogg" type="video/ogg"
rlm@215 494 preload="none" poster="../images/aurellem-1280x480.png" />
rlm@215 495 </video>
rlm@309 496 <br> <a href="http://youtu.be/r5Bn2aG7MO0"> YouTube </a>
rlm@215 497 <p>A rotating cube viewed from two different perspectives.</p>
rlm@215 498 </div>
rlm@215 499 #+end_html
rlm@215 500
rlm@215 501 Creating multiple eyes like this can be used for stereoscopic vision
rlm@215 502 simulation in a single creature or for simulating multiple creatures,
rlm@215 503 each with their own sense of vision.
ocsenave@264 504 ** Demonstrating eye import and parsing.
rlm@215 505
rlm@218 506 To the worm from the last post, I add a new node that describes its
rlm@215 507 eyes.
rlm@215 508
rlm@215 509 #+attr_html: width=755
rlm@215 510 #+caption: The worm with newly added empty nodes describing a single eye.
rlm@215 511 [[../images/worm-with-eye.png]]
rlm@215 512
rlm@215 513 The node highlighted in yellow is the root level "eyes" node. It has
rlm@218 514 a single child, highlighted in orange, which describes a single
rlm@218 515 eye. This is the "eye" node. It is placed so that the worm will have
rlm@218 516 an eye located in the center of the flat portion of its lower
rlm@218 517 hemispherical section.
rlm@218 518
rlm@218 519 The two nodes which are not highlighted describe the single joint of
rlm@218 520 the worm.
rlm@215 521
rlm@215 522 The metadata of the eye-node is:
rlm@215 523
rlm@215 524 #+begin_src clojure :results verbatim :exports both
rlm@215 525 (cortex.sense/meta-data
rlm@218 526 (.getChild (.getChild (cortex.test.body/worm) "eyes") "eye") "eye")
rlm@215 527 #+end_src
rlm@215 528
rlm@215 529 #+results:
rlm@215 530 : "(let [retina \"Models/test-creature/retina-small.png\"]
rlm@215 531 : {:all retina :red retina :green retina :blue retina})"
rlm@215 532
rlm@215 533 This is the approximation to the human eye described earlier.
rlm@215 534
rlm@216 535 #+name: test-2
rlm@215 536 #+begin_src clojure
rlm@215 537 (in-ns 'cortex.test.vision)
rlm@215 538
rlm@216 539 (defn change-color [obj color]
rlm@321 540 ;;(println-repl obj)
rlm@216 541 (if obj
rlm@216 542 (.setColor (.getMaterial obj) "Color" color)))
rlm@216 543
rlm@216 544 (defn colored-cannon-ball [color]
rlm@216 545 (comp #(change-color % color)
rlm@216 546 (fire-cannon-ball)))
rlm@215 547
rlm@283 548 (defn test-worm-vision
rlm@321 549 "Testing vision:
rlm@321 550 You should see the worm suspended in mid-air, looking down at a
rlm@321 551 table. There are four small displays, one each for red, green, blue,
rlm@321 552 and gray channels. You can fire balls of various colors, and the
rlm@321 553 four channels should react accordingly.
rlm@321 554
rlm@321 555 Keys:
rlm@321 556 r : fire red-ball
rlm@321 557 b : fire blue-ball
rlm@321 558 g : fire green-ball
rlm@321 559 <space> : fire white ball"
rlm@321 560
rlm@283 561 ([] (test-worm-vision false))
rlm@283 562 ([record?]
rlm@283 563 (let [the-worm (doto (worm)(body!))
rlm@283 564 vision (vision! the-worm)
rlm@283 565 vision-display (view-vision)
rlm@283 566 fix-display (gen-fix-display)
rlm@283 567 me (sphere 0.5 :color ColorRGBA/Blue :physical? false)
rlm@283 568 x-axis
rlm@283 569 (box 1 0.01 0.01 :physical? false :color ColorRGBA/Red
rlm@283 570 :position (Vector3f. 0 -5 0))
rlm@283 571 y-axis
rlm@283 572 (box 0.01 1 0.01 :physical? false :color ColorRGBA/Green
rlm@283 573 :position (Vector3f. 0 -5 0))
rlm@283 574 z-axis
rlm@283 575 (box 0.01 0.01 1 :physical? false :color ColorRGBA/Blue
rlm@283 576 :position (Vector3f. 0 -5 0))
rlm@283 577 timer (RatchetTimer. 60)]
rlm@215 578
rlm@283 579 (world (nodify [(floor) the-worm x-axis y-axis z-axis me])
rlm@283 580 (assoc standard-debug-controls
rlm@283 581 "key-r" (colored-cannon-ball ColorRGBA/Red)
rlm@283 582 "key-b" (colored-cannon-ball ColorRGBA/Blue)
rlm@283 583 "key-g" (colored-cannon-ball ColorRGBA/Green))
rlm@283 584 (fn [world]
rlm@283 585 (light-up-everything world)
rlm@283 586 (speed-up world)
rlm@283 587 (.setTimer world timer)
rlm@306 588 (display-dilated-time world timer)
rlm@283 589 ;; add a view from the worm's perspective
rlm@283 590 (if record?
rlm@283 591 (Capture/captureVideo
rlm@283 592 world
rlm@283 593 (File.
rlm@283 594 "/home/r/proj/cortex/render/worm-vision/main-view")))
rlm@283 595
rlm@283 596 (add-camera!
rlm@283 597 world
rlm@283 598 (add-eye! the-worm
rlm@283 599 (.getChild
rlm@283 600 (.getChild the-worm "eyes") "eye"))
rlm@283 601 (comp
rlm@283 602 (view-image
rlm@283 603 (if record?
rlm@283 604 (File.
rlm@283 605 "/home/r/proj/cortex/render/worm-vision/worm-view")))
rlm@283 606 BufferedImage!))
rlm@283 607
rlm@283 608 (set-gravity world Vector3f/ZERO))
rlm@283 609
rlm@283 610 (fn [world _ ]
rlm@283 611 (.setLocalTranslation me (.getLocation (.getCamera world)))
rlm@283 612 (vision-display
rlm@283 613 (map #(% world) vision)
rlm@283 614 (if record? (File. "/home/r/proj/cortex/render/worm-vision")))
rlm@283 615 (fix-display world))))))
rlm@215 616 #+end_src
rlm@215 617
rlm@218 618 The world consists of the worm and a flat gray floor. I can shoot red,
rlm@218 619 green, blue and white cannonballs at the worm. The worm is initially
rlm@218 620 looking down at the floor, and there is no gravity. My perspective
rlm@218 621 (the Main View), the worm's perspective (Worm View) and the 4 sensor
rlm@218 622 channels that comprise the worm's eye are all saved frame-by-frame to
rlm@218 623 disk.
rlm@218 624
rlm@218 625 * Demonstration of Vision
rlm@218 626 #+begin_html
rlm@218 627 <div class="figure">
rlm@218 628 <video controls="controls" width="755">
rlm@218 629 <source src="../video/worm-vision.ogg" type="video/ogg"
rlm@218 630 preload="none" poster="../images/aurellem-1280x480.png" />
rlm@218 631 </video>
rlm@309 632 <br> <a href="http://youtu.be/J3H3iB_2NPQ"> YouTube </a>
rlm@218 633 <p>Simulated Vision in a Virtual Environment</p>
rlm@218 634 </div>
rlm@218 635 #+end_html
rlm@218 636
rlm@218 637 ** Generate the Worm Video from Frames
rlm@216 638 #+name: magick2
rlm@216 639 #+begin_src clojure
rlm@216 640 (ns cortex.video.magick2
rlm@216 641 (:import java.io.File)
rlm@316 642 (:use clojure.java.shell))
rlm@216 643
rlm@216 644 (defn images [path]
rlm@216 645 (sort (rest (file-seq (File. path)))))
rlm@216 646
rlm@216 647 (def base "/home/r/proj/cortex/render/worm-vision/")
rlm@216 648
rlm@216 649 (defn pics [file]
rlm@216 650 (images (str base file)))
rlm@216 651
rlm@216 652 (defn combine-images []
rlm@216 653 (let [main-view (pics "main-view")
rlm@216 654 worm-view (pics "worm-view")
rlm@216 655 blue (pics "0")
rlm@216 656 green (pics "1")
rlm@216 657 red (pics "2")
rlm@216 658 gray (pics "3")
rlm@216 659 blender (let [b-pics (pics "blender")]
rlm@216 660 (concat b-pics (repeat 9001 (last b-pics))))
rlm@216 661 background (repeat 9001 (File. (str base "background.png")))
rlm@216 662 targets (map
rlm@216 663 #(File. (str base "out/" (format "%07d.png" %)))
rlm@216 664 (range 0 (count main-view)))]
rlm@216 665 (dorun
rlm@216 666 (pmap
rlm@216 667 (comp
rlm@216 668 (fn [[background main-view worm-view red green blue gray blender target]]
rlm@216 669 (println target)
rlm@216 670 (sh "convert"
rlm@216 671 background
rlm@216 672 main-view "-geometry" "+18+17" "-composite"
rlm@216 673 worm-view "-geometry" "+677+17" "-composite"
rlm@216 674 green "-geometry" "+685+430" "-composite"
rlm@216 675 red "-geometry" "+788+430" "-composite"
rlm@216 676 blue "-geometry" "+894+430" "-composite"
rlm@216 677 gray "-geometry" "+1000+430" "-composite"
rlm@216 678 blender "-geometry" "+0+0" "-composite"
rlm@216 679 target))
rlm@216 680 (fn [& args] (map #(.getCanonicalPath %) args)))
rlm@216 681 background main-view worm-view red green blue gray blender targets))))
rlm@216 682 #+end_src
rlm@216 683
rlm@216 684 #+begin_src sh :results silent
rlm@216 685 cd /home/r/proj/cortex/render/worm-vision
rlm@216 686 ffmpeg -r 25 -b 9001k -i out/%07d.png -vcodec libtheora worm-vision.ogg
rlm@216 687 #+end_src
rlm@236 688
ocsenave@265 689 * Onward!
ocsenave@265 690 - As a neat bonus, this idea behind simulated vision also enables one
ocsenave@265 691 to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].
ocsenave@265 692 - Now that we have vision, it's time to tackle [[./hearing.org][hearing]].
ocsenave@265 693 #+appendix
ocsenave@265 694
rlm@215 695 * Headers
rlm@215 696
rlm@213 697 #+name: vision-header
rlm@213 698 #+begin_src clojure
rlm@213 699 (ns cortex.vision
rlm@213 700 "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
rlm@213 701 eyes from different positions to observe the same world, and pass
rlm@306 702 the observed data to any arbitrary function. Automatically reads
rlm@216 703 eye-nodes from specially prepared blender files and instantiates
rlm@213 704 them in the world as actual eyes."
rlm@213 705 {:author "Robert McIntyre"}
rlm@213 706 (:use (cortex world sense util))
rlm@213 707 (:import com.jme3.post.SceneProcessor)
rlm@237 708 (:import (com.jme3.util BufferUtils Screenshots))
rlm@213 709 (:import java.nio.ByteBuffer)
rlm@213 710 (:import java.awt.image.BufferedImage)
rlm@213 711 (:import (com.jme3.renderer ViewPort Camera))
rlm@216 712 (:import (com.jme3.math ColorRGBA Vector3f Matrix3f))
rlm@213 713 (:import com.jme3.renderer.Renderer)
rlm@213 714 (:import com.jme3.app.Application)
rlm@213 715 (:import com.jme3.texture.FrameBuffer)
rlm@213 716 (:import (com.jme3.scene Node Spatial)))
rlm@213 717 #+end_src
rlm@112 718
rlm@215 719 #+name: test-header
rlm@215 720 #+begin_src clojure
rlm@215 721 (ns cortex.test.vision
rlm@215 722 (:use (cortex world sense util body vision))
rlm@215 723 (:use cortex.test.body)
rlm@215 724 (:import java.awt.image.BufferedImage)
rlm@215 725 (:import javax.swing.JPanel)
rlm@215 726 (:import javax.swing.SwingUtilities)
rlm@215 727 (:import java.awt.Dimension)
rlm@215 728 (:import javax.swing.JFrame)
rlm@215 729 (:import com.jme3.math.ColorRGBA)
rlm@215 730 (:import com.jme3.scene.Node)
rlm@215 731 (:import com.jme3.math.Vector3f)
rlm@216 732 (:import java.io.File)
rlm@216 733 (:import (com.aurellem.capture Capture RatchetTimer)))
rlm@215 734 #+end_src
rlm@216 735 * Source Listing
rlm@216 736 - [[../src/cortex/vision.clj][cortex.vision]]
rlm@216 737 - [[../src/cortex/test/vision.clj][cortex.test.vision]]
rlm@216 738 - [[../src/cortex/video/magick2.clj][cortex.video.magick2]]
rlm@216 739 - [[../assets/Models/subtitles/worm-vision-subtitles.blend][worm-vision-subtitles.blend]]
rlm@216 740 #+html: <ul> <li> <a href="../org/vision.org">This org file</a> </li> </ul>
rlm@216 741 - [[http://hg.bortreb.com][source-repository]]
rlm@216 742
rlm@35 743
rlm@273 744 * Next
rlm@273 745 I find some [[./hearing.org][ears]] for the creature while exploring the guts of
rlm@273 746 jMonkeyEngine's sound system.
rlm@24 747
rlm@212 748 * COMMENT Generate Source
rlm@34 749 #+begin_src clojure :tangle ../src/cortex/vision.clj
rlm@216 750 <<vision-header>>
rlm@216 751 <<pipeline-1>>
rlm@216 752 <<pipeline-2>>
rlm@216 753 <<retina>>
rlm@216 754 <<add-eye>>
rlm@216 755 <<sensitivity>>
rlm@216 756 <<eye-node>>
rlm@216 757 <<add-camera>>
rlm@216 758 <<kernel>>
rlm@216 759 <<main>>
rlm@216 760 <<display>>
rlm@24 761 #+end_src
rlm@24 762
rlm@68 763 #+begin_src clojure :tangle ../src/cortex/test/vision.clj
rlm@215 764 <<test-header>>
rlm@215 765 <<test-1>>
rlm@216 766 <<test-2>>
rlm@24 767 #+end_src
rlm@216 768
rlm@216 769 #+begin_src clojure :tangle ../src/cortex/video/magick2.clj
rlm@216 770 <<magick2>>
rlm@216 771 #+end_src