view org/vision.org @ 213:319963720179
fleshing out vision
author | Robert McIntyre <rlm@mit.edu> |
---|---|
date | Thu, 09 Feb 2012 08:11:10 -0700 |
parents | 8e9825c38941 |
children | 01d3e9855ef9 |
#+title: Simulated Sense of Sight
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulated sight for AI research using JMonkeyEngine3 and clojure
#+keywords: computer vision, jMonkeyEngine3, clojure
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Vision

Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see its
own version of the world depending on where it is.

Making these simulated eyes a reality is fairly simple because
jMonkeyEngine already contains extensive support for multiple views
of the same 3D simulated world. jMonkeyEngine has this support because
it is necessary for games with split-screen views. Multiple views are
also used to create efficient pseudo-reflections by rendering the
scene from a certain perspective and then projecting it back onto a
surface in the 3D world.

#+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye
[[../images/goldeneye-4-player.png]]

* Brief Description of jMonkeyEngine's Rendering Pipeline

jMonkeyEngine allows you to create a =ViewPort=, which represents a
view of the simulated world. You can create as many of these as you
want. Every frame, the =RenderManager= iterates through each
=ViewPort=, rendering the scene on the GPU. For each =ViewPort= there
is a =FrameBuffer= which represents the rendered image on the GPU.

Each =ViewPort= can have any number of attached =SceneProcessor=
objects, which are called every time a new frame is rendered. A
=SceneProcessor= receives a =FrameBuffer= and can do whatever it wants
to the data. Often this consists of invoking GPU-specific operations
on the rendered image.
The =SceneProcessor= can also copy the GPU image data to RAM and
process it with the CPU.

* The Vision Pipeline

Each eye in the simulated creature needs its own =ViewPort= so that
it can see the world from its own perspective. To this =ViewPort=, I
add a =SceneProcessor= that feeds the visual data to any arbitrary
continuation function for further processing. That continuation
function may perform both CPU and GPU operations on the data. To make
this easy for the continuation function, the =SceneProcessor=
maintains appropriately sized buffers in RAM to hold the data. It does
not do any copying from the GPU to the CPU itself.
#+name: pipeline-1
#+begin_src clojure
(defn vision-pipeline
  "Create a SceneProcessor object which wraps a vision processing
  continuation function. The continuation is a function that takes
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
  each of which has already been appropriately sized."
  [continuation]
  (let [byte-buffer (atom nil)
        renderer (atom nil)
        image (atom nil)]
    (proxy [SceneProcessor] []
      (initialize
        [renderManager viewPort]
        (let [cam (.getCamera viewPort)
              width (.getWidth cam)
              height (.getHeight cam)]
          (reset! renderer (.getRenderer renderManager))
          ;; 4 bytes per pixel
          (reset! byte-buffer
                  (BufferUtils/createByteBuffer
                   (* width height 4)))
          (reset! image (BufferedImage.
                         width height
                         BufferedImage/TYPE_4BYTE_ABGR))))
      (isInitialized [] (not (nil? @byte-buffer)))
      (reshape [_ _ _])
      (preFrame [_])
      (postQueue [_])
      (postFrame
        [#^FrameBuffer fb]
        (.clear @byte-buffer)
        (continuation @renderer fb @byte-buffer @image))
      (cleanup []))))
#+end_src

The continuation function given to =(vision-pipeline)= above will be
given a =Renderer= and three containers for image data. The
=FrameBuffer= references the GPU image data, but it cannot be used
directly on the CPU.
The =ByteBuffer= and =BufferedImage= are
initially "empty" but are sized to hold the data in the
=FrameBuffer=. I call transferring the GPU image data to the CPU
structures "mixing" the image data. I have provided three functions to
do this mixing.

#+name: pipeline-2
#+begin_src clojure
(defn frameBuffer->byteBuffer!
  "Transfer the data in the graphics card (Renderer, FrameBuffer) to
   the CPU (ByteBuffer)."
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
  (.readFrameBuffer r fb bb) bb)

(defn byteBuffer->bufferedImage!
  "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
   style ABGR image data and place it in BufferedImage bi."
  [#^ByteBuffer bb #^BufferedImage bi]
  (Screenshots/convertScreenShot bb bi) bi)

(defn BufferedImage!
  "Continuation which will grab the buffered image from the materials
   provided by (vision-pipeline)."
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
  (byteBuffer->bufferedImage!
   (frameBuffer->byteBuffer! r fb bb) bi))
#+end_src

Note that it is possible to write vision processing algorithms
entirely in terms of =BufferedImage= inputs. Just compose that
=BufferedImage= algorithm with =(BufferedImage!)=. However, a vision
processing algorithm that is entirely hosted on the GPU does not have
to pay for this convenience.


* Physical Eyes

The vision pipeline described above only deals with

Each eye in the creature in blender will work the same way as
joints: it is a zero-dimensional object with no geometry whose local
coordinate system determines the orientation of the resulting
eye. All eyes will have a parent named "eyes" just as all joints
have a parent named "joints". The resulting camera will be a
ChaseCamera or a CameraNode bound to the geometry that is closest to
the eye marker. The eye marker will contain the metadata for the
eye, and will be moved by its bound geometry.
The dimensions of
the eye's camera are equal to the dimensions of the eye's "UV"
map.

=(vision creature)= will take an optional =:skip= argument which will
inform the continuations in the scene processor to skip the given
number of cycles; 0 means that no cycles will be skipped.

=(vision creature)= will return =[init-functions sensor-functions]=.
The init-functions are each single-arg functions that take the
world and register the cameras, and each must be called before the
corresponding sensor-functions. Each init-function returns the
viewport for that eye, which can be manipulated, saved, etc. Each
sensor-function is a thunk and will return data in the same
format as the tactile-sensor functions; the structure is
=[topology, sensor-data]=. Internally, these sensor-functions
maintain a reference to sensor-data which is periodically updated
by the continuation function established by its init-function.
They can be queried every cycle, but their information may not
necessarily be different every cycle.


#+begin_src clojure
(defn add-camera!
  "Add a camera to the world, calling continuation on every frame
  produced."
  [#^Application world camera continuation]
  (let [width (.getWidth camera)
        height (.getHeight camera)
        render-manager (.getRenderManager world)
        viewport (.createMainView render-manager "eye-view" camera)]
    (doto viewport
      (.setClearFlags true true true)
      (.setBackgroundColor ColorRGBA/Black)
      (.addProcessor (vision-pipeline continuation))
      (.attachScene (.getRootNode world)))))

(defn retina-sensor-profile
  "Return a map of color-channel keys to BufferedImages
   describing the distribution of light-sensitive components of this
   eye. The keys are looked up in color-channel-presets, which
   defines :all, :red, :green, and :blue as bitmasks over the rgb
   value of a pixel."
  [#^Spatial eye]
  (if-let [eye-map (meta-data eye "eye")]
    (map-vals
     load-image
     (eval (read-string eye-map)))))

(defn eye-dimensions
  "Returns [width, height] specified in the metadata of the eye"
  [#^Spatial eye]
  (let [dimensions
        (map #(vector (.getWidth %) (.getHeight %))
             (vals (retina-sensor-profile eye)))]
    [(apply max (map first dimensions))
     (apply max (map second dimensions))]))

(defvar
  ^{:arglists '([creature])}
  eyes
  (sense-nodes "eyes")
  "Return the children of the creature's \"eyes\" node.")

(defn add-eye!
  "Create a Camera centered on the current position of 'eye which
   follows the closest physical node in 'creature. Returns the
   Camera."
  [#^Node creature #^Spatial eye]
  (let [target (closest-node creature eye)
        [cam-width cam-height] (eye-dimensions eye)
        cam (Camera. cam-width cam-height)]
    (.setLocation cam (.getWorldTranslation eye))
    (.setRotation cam (.getWorldRotation eye))
    (.setFrustumPerspective
     cam 45 (/ (.getWidth cam) (.getHeight cam))
     1 1000)
    (bind-sense target cam)
    cam))

(defvar color-channel-presets
  {:all   0xFFFFFF
   :red   0xFF0000
   :blue  0x0000FF
   :green 0x00FF00}
  "Bitmasks for common RGB color channels")

(defn vision-fn
  "Returns a list of functions, each of which will return a color
   channel's worth of visual information when called inside a running
   simulation."
  [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
  (let [retinal-map (retina-sensor-profile eye)
        camera (add-eye! creature eye)
        vision-image
        (atom
         (BufferedImage. (.getWidth camera)
                         (.getHeight camera)
                         BufferedImage/TYPE_BYTE_BINARY))
        register-eye!
        (runonce
         (fn [world]
           (add-camera!
            world camera
            (let [counter (atom 0)]
              (fn [r fb bb bi]
                ;; only update the image every (skip + 1) frames
                (when (zero? (rem (swap! counter inc) (inc skip)))
                  (reset! vision-image
                          (BufferedImage! r fb bb bi))))))))]
    (vec
     (map
      (fn [[key image]]
        (let [whites (white-coordinates image)
              topology (vec (collapse whites))
              mask (color-channel-presets key)]
          (fn [world]
            (register-eye! world)
            (vector
             topology
             (vec
              (for [[x y] whites]
                (bit-and
                 mask (.getRGB @vision-image x y))))))))
      retinal-map))))


;; TODO maybe should add a viewport-manipulation function to
;; automatically change viewport settings, attach shadow filters, etc.

(defn vision!
  "Returns a list of functions, each of which returns visual sensory
   data when called inside a running simulation."
  [#^Node creature & {skip :skip :or {skip 0}}]
  (reduce
   concat
   (for [eye (eyes creature)]
     ;; pass :skip through so that it actually takes effect
     (vision-fn creature eye :skip skip))))

(defn view-vision
  "Creates a function which accepts a list of visual sensor-data and
  displays each element of the list to the screen."
  []
  (view-sense
   (fn
     [[coords sensor-data]]
     (let [image (points->image coords)]
       (dorun
        (for [i (range (count coords))]
          (.setRGB image ((coords i) 0) ((coords i) 1)
                   (sensor-data i))))
       image))))

#+end_src


Note the use of continuation passing style for connecting the eye to a
function to process the output. You can create any number of eyes, and
each of them will see the world from its own =Camera=. Once every
frame, the rendered image is copied to a =BufferedImage=, and that
data is sent off to the continuation function.
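
The frame-skipping continuation and the channel bitmasks used in
=vision-fn= can be tried outside of jMonkeyEngine. The following is a
minimal sketch and not part of cortex: =every-nth=, =channel=,
=latest=, and =store-every-third= are hypothetical names, and a plain
packed RGB integer stands in for the rendered image.

#+begin_src clojure
(defn every-nth
  "Wrap continuation f so that it only runs on every (skip + 1)-th
   call, mirroring the counter inside vision-fn. Hypothetical helper."
  [skip f]
  (let [counter (atom 0)]
    (fn [& args]
      (when (zero? (rem (swap! counter inc) (inc skip)))
        (apply f args)))))

(defn channel
  "Keep only the bits of a packed RGB integer selected by mask."
  [mask rgb]
  (bit-and mask rgb))

;; Stand-in for the vision-image atom; here it holds one pixel value.
(def latest (atom nil))

(def store-every-third
  (every-nth 2 (fn [pixel] (reset! latest pixel))))

;; Feed six "frames"; with skip = 2 only the 3rd and 6th are stored.
(doseq [pixel [0x010101 0x020202 0x030303 0x040404 0x050505 0x0A0B0C]]
  (store-every-third pixel))

;; Red bits of the last stored frame.
(channel 0xFF0000 @latest)
#+end_src

With a skip of 0 the wrapped function runs on every call, because
=(rem n 1)= is always zero.
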
Moving the =Camera=
which was used to create the eye will change what the eye sees.

* Example

#+name: test-vision
#+begin_src clojure
(ns cortex.test.vision
  (:use (cortex world util vision))
  (:import java.awt.image.BufferedImage)
  (:import javax.swing.JPanel)
  (:import javax.swing.SwingUtilities)
  (:import java.awt.Dimension)
  (:import javax.swing.JFrame)
  (:import com.jme3.math.ColorRGBA)
  (:import com.jme3.scene.Node)
  (:import com.jme3.math.Vector3f))

(defn test-two-eyes
  "Testing vision:
   Tests the vision system by creating two views of the same rotating
   object from different angles and displaying both of those views in
   JFrames.

   You should see a rotating cube, and two windows,
   each displaying a different view of the cube."
  []
  (let [candy
        (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
    (world
     (doto (Node.)
       (.attachChild candy))
     {}
     (fn [world]
       (let [cam (.clone (.getCamera world))
             width (.getWidth cam)
             height (.getHeight cam)]
         ;; first eye: same position as the main camera
         (add-camera! world cam
                      (comp (view-image) BufferedImage!))
         ;; second eye: looks at the origin from the side
         (add-camera! world
                      (doto (.clone cam)
                        (.setLocation (Vector3f. -10 0 0))
                        (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
                      (comp (view-image) BufferedImage!))
         ;; This is here to restore the main view
         ;; after the other views have completed processing
         (add-camera! world (.getCamera world) no-op)))
     (fn [world tpf]
       (.rotate candy (* tpf 0.2) 0 0)))))
#+end_src

#+name: vision-header
#+begin_src clojure
(ns cortex.vision
  "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
  eyes from different positions to observe the same world, and pass
  the observed data to any arbitrary function. Automatically reads
  eye-nodes from specially prepared blender files and instantiates
  them in the world as actual eyes."
  {:author "Robert McIntyre"}
  (:use (cortex world sense util))
  (:use clojure.contrib.def)
  (:import com.jme3.post.SceneProcessor)
  (:import (com.jme3.util BufferUtils Screenshots))
  (:import java.nio.ByteBuffer)
  (:import java.awt.image.BufferedImage)
  (:import (com.jme3.renderer ViewPort Camera))
  (:import com.jme3.math.ColorRGBA)
  (:import com.jme3.renderer.Renderer)
  (:import com.jme3.app.Application)
  (:import com.jme3.texture.FrameBuffer)
  (:import (com.jme3.scene Node Spatial)))
#+end_src

The example code will create two videos of the same rotating object
from different angles. It can be used either for stereoscopic vision
simulation or for simulating multiple creatures, each with their own
sense of vision.

- As a neat bonus, this idea behind simulated vision also enables one
  to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].


* COMMENT Generate Source
#+begin_src clojure :tangle ../src/cortex/vision.clj
<<eyes>>
#+end_src

#+begin_src clojure :tangle ../src/cortex/test/vision.clj
<<test-vision>>
#+end_src