cortex: org/vision.org annotate

annotate org/vision.org @ 270:aa3641042958

minor formatting changes.

author	Robert McIntyre <rlm@mit.edu>
date	Tue, 14 Feb 2012 05:30:55 -0700
parents	e57d8c52f12f
children	12e6231eae8e c39b8b29a79e

rev	line source
rlm@34	1 #+title: Simulated Sense of Sight
rlm@23	2 #+author: Robert McIntyre
rlm@23	3 #+email: rlm@mit.edu
rlm@38	4 #+description: Simulated sight for AI research using JMonkeyEngine3 and clojure
rlm@34	5 #+keywords: computer vision, jMonkeyEngine3, clojure
rlm@23	6 #+SETUPFILE: ../../aurellem/org/setup.org
rlm@23	7 #+INCLUDE: ../../aurellem/org/level-0.org
rlm@23	8 #+babel: :mkdirp yes :noweb yes :exports both
rlm@23	9
ocsenave@265	10 # SUGGEST: Call functions by their name, without
ocsenave@265	11 # parentheses. e.g. =add-eye!=, not =(add-eye!)=. The reason for this
ocsenave@265	12 # is that it is potentially easy to confuse the /function/ =f= with its
ocsenave@265	13 # /value/ at a particular point =(f x)=. Mathematicians have this
ocsenave@265	14 # problem with their notation; we don't need it in ours.
ocsenave@265	15
ocsenave@264	16 * jMonkeyEngine natively supports multiple views of the same world.
ocsenave@264	17
rlm@212	18 Vision is one of the most important senses for humans, so I need to
rlm@212	19 build a simulated sense of vision for my AI. I will do this with
rlm@212	20 simulated eyes. Each eye can be independently moved and should see its
rlm@212	21 own version of the world depending on where it is.
rlm@212	22
rlm@218	23 Making these simulated eyes a reality is simple because jMonkeyEngine
rlm@218	24 already contains extensive support for multiple views of the same 3D
rlm@218	25 simulated world. jMonkeyEngine has this support because it is
rlm@218	26 necessary to create games with split-screen
rlm@218	27 views. Multiple views are also used to create efficient
rlm@212	28 pseudo-reflections by rendering the scene from a certain perspective
rlm@212	29 and then projecting it back onto a surface in the 3D world.
rlm@212	30
rlm@218	31 #+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye, which was one of the first games to use split-screen views.
rlm@212	32 [[../images/goldeneye-4-player.png]]
rlm@212	33
ocsenave@264	34 ** =ViewPorts=, =SceneProcessors=, and the =RenderManager=.
ocsenave@264	35 # =ViewPorts= are cameras; =RenderManager= takes snapshots each frame.
ocsenave@264	36 #* A Brief Description of jMonkeyEngine's Rendering Pipeline
rlm@212	37
rlm@213	38 jMonkeyEngine allows you to create a =ViewPort=, which represents a
rlm@213	39 view of the simulated world. You can create as many of these as you
rlm@213	40 want. Every frame, the =RenderManager= iterates through each
rlm@213	41 =ViewPort=, rendering the scene in the GPU. For each =ViewPort= there
rlm@213	42 is a =FrameBuffer= which represents the rendered image in the GPU.
rlm@151	43
ocsenave@262	44 #+caption: =ViewPorts= are cameras in the world. During each frame, the =RenderManager= records a snapshot of what each view is currently seeing.
ocsenave@265	45 #+ATTR_HTML: width="400"
ocsenave@262	46 [[../images/diagram_rendermanager.png]]
ocsenave@262	47
rlm@213	48 Each =ViewPort= can have any number of attached =SceneProcessor=
rlm@213	49 objects, which are called every time a new frame is rendered. A
rlm@219	50 =SceneProcessor= receives its =ViewPort's= =FrameBuffer= and can do
rlm@219	51 whatever it wants to the data. Often this consists of invoking GPU
rlm@219	52 specific operations on the rendered image. The =SceneProcessor= can
rlm@219	53 also copy the GPU image data to RAM and process it with the CPU.
rlm@151	54
ocsenave@264	55 ** From Views to Vision
ocsenave@264	56 # Appropriating Views for Vision.
rlm@151	57
ocsenave@264	58 Each eye in the simulated creature needs its own =ViewPort= so that
rlm@213	59 it can see the world from its own perspective. To this =ViewPort=, I
rlm@214	60 add a =SceneProcessor= that feeds the visual data to any arbitrary
rlm@213	61 continuation function for further processing. That continuation
rlm@213	62 function may perform both CPU and GPU operations on the data. To make
rlm@213	63 this easy for the continuation function, the =SceneProcessor=
rlm@213	64 maintains appropriately sized buffers in RAM to hold the data. It does
rlm@218	65 not do any copying from the GPU to the CPU itself because that copy is a
rlm@218	66 slow operation.
rlm@214	67
rlm@213	68 #+name: pipeline-1
rlm@213	69 #+begin_src clojure
rlm@113	70 (defn vision-pipeline
rlm@34	71 "Create a SceneProcessor object which wraps a vision processing
rlm@113	72 continuation function. The continuation is a function that takes
rlm@113	73 [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
rlm@113	74 each of which has already been appropriately sized."
rlm@23	75 [continuation]
rlm@23	76 (let [byte-buffer (atom nil)
rlm@113	77 renderer (atom nil)
rlm@113	78 image (atom nil)]
rlm@23	79 (proxy [SceneProcessor] []
rlm@23	80 (initialize
rlm@23	81 [renderManager viewPort]
rlm@23	82 (let [cam (.getCamera viewPort)
rlm@23	83 width (.getWidth cam)
rlm@23	84 height (.getHeight cam)]
rlm@23	85 (reset! renderer (.getRenderer renderManager))
rlm@23	86 (reset! byte-buffer
rlm@23	87 (BufferUtils/createByteBuffer
rlm@113	88 (* width height 4)))
rlm@113	89 (reset! image (BufferedImage.
rlm@113	90 width height
rlm@113	91 BufferedImage/TYPE_4BYTE_ABGR))))
rlm@23	92 (isInitialized [] (not (nil? @byte-buffer)))
rlm@23	93 (reshape [_ _ _])
rlm@23	94 (preFrame [_])
rlm@23	95 (postQueue [_])
rlm@23	96 (postFrame
rlm@23	97 [#^FrameBuffer fb]
rlm@23	98 (.clear @byte-buffer)
rlm@113	99 (continuation @renderer fb @byte-buffer @image))
rlm@23	100 (cleanup []))))
rlm@213	101 #+end_src
rlm@213	102
rlm@213	103 The continuation function given to =(vision-pipeline)= above will be
rlm@213	104 given a =Renderer= and three containers for image data. The
rlm@218	105 =FrameBuffer= references the GPU image data, but the pixel data cannot
rlm@218	106 be used directly on the CPU. The =ByteBuffer= and =BufferedImage=
rlm@219	107 are initially "empty" but are sized to hold the data in the
rlm@213	108 =FrameBuffer=. I call transferring the GPU image data to the CPU
rlm@213	109 structures "mixing" the image data. I have provided three functions to
rlm@213	110 do this mixing.
rlm@213	111
rlm@213	112 #+name: pipeline-2
rlm@213	113 #+begin_src clojure
rlm@113	114 (defn frameBuffer->byteBuffer!
rlm@113	115 "Transfer the data in the graphics card (Renderer, FrameBuffer) to
rlm@113	116 the CPU (ByteBuffer)."
rlm@113	117 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
rlm@113	118 (.readFrameBuffer r fb bb) bb)
rlm@113	119
rlm@113	120 (defn byteBuffer->bufferedImage!
rlm@113	121 "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
rlm@113	122 style ABGR image data and place it in BufferedImage bi."
rlm@113	123 [#^ByteBuffer bb #^BufferedImage bi]
rlm@113	124 (Screenshots/convertScreenShot bb bi) bi)
rlm@113	125
rlm@113	126 (defn BufferedImage!
rlm@113	127 "Continuation which will grab the buffered image from the materials
rlm@113	128 provided by (vision-pipeline)."
rlm@113	129 [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
rlm@113	130 (byteBuffer->bufferedImage!
rlm@113	131 (frameBuffer->byteBuffer! r fb bb) bi))
rlm@213	132 #+end_src
rlm@112	133
rlm@213	134 Note that it is possible to write vision processing algorithms
rlm@213	135 entirely in terms of =BufferedImage= inputs. Just compose that
rlm@213	136 =BufferedImage= algorithm with =(BufferedImage!)=. However, a vision
rlm@213	137 processing algorithm that is entirely hosted on the GPU does not have
rlm@213	138 to pay for this convenience.
rlm@213	139
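A minimal sketch of that composition, written so that it runs without
jMonkeyEngine. =mean-brightness= and =stub-buffered-image!= are
hypothetical names used only for illustration; the stub stands in for
=(BufferedImage!)= by accepting the same four arguments and returning
the =BufferedImage= unchanged.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn mean-brightness
  "Average of the red, green, and blue channels over every pixel of
   the image, scaled to the range [0, 1]."
  [^BufferedImage image]
  (let [w (.getWidth image)
        h (.getHeight image)
        total (reduce +
                      (for [x (range w) y (range h)]
                        (let [rgb (.getRGB image x y)]
                          (+ (bit-and 0xFF (bit-shift-right rgb 16))
                             (bit-and 0xFF (bit-shift-right rgb 8))
                             (bit-and 0xFF rgb)))))]
    (double (/ total (* 3 255 w h)))))

(defn stub-buffered-image!
  "Hypothetical stand-in for BufferedImage!: same four arguments,
   returns the BufferedImage, and skips the GPU transfer."
  [r fb bb bi]
  bi)

;; In the real pipeline the stub would be replaced by BufferedImage!.
(def brightness-continuation
  (comp mean-brightness stub-buffered-image!))

;; A freshly created image is entirely black.
(brightness-continuation
 nil nil nil (BufferedImage. 2 2 BufferedImage/TYPE_INT_RGB))
#+end_src

Because the algorithm only ever sees a =BufferedImage=, it can be
tested on ordinary images before it is attached to a running
simulation.
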
ocsenave@265	140 * Optical sensor arrays are described with images and referenced with metadata
rlm@214	141 The vision pipeline described above handles the flow of rendered
rlm@214	142 images. Now, we need simulated eyes to serve as the source of these
rlm@214	143 images.
rlm@214	144
rlm@214	145 An eye is described in blender in the same way as a joint. Eyes are
rlm@214	146 zero-dimensional empty objects with no geometry whose local coordinate
rlm@214	147 system determines the orientation of the resulting eye. All eyes are
rlm@214	148 children of a parent node named "eyes" just as all joints have a
rlm@214	149 parent named "joints". An eye binds to the nearest physical object
rlm@214	150 with =(bind-sense)=.
rlm@214	151
rlm@214	152 #+name: add-eye
rlm@214	153 #+begin_src clojure
rlm@215	154 (in-ns 'cortex.vision)
rlm@215	155
rlm@214	156 (defn add-eye!
rlm@214	157 "Create a Camera centered on the current position of 'eye which
rlm@214	158 follows the closest physical node in 'creature, and return that
rlm@215	159 Camera. The camera will point in the X direction and
rlm@215	160 use the Z vector as up as determined by the rotation of these
rlm@215	161 vectors in blender coordinate space. Use XZY rotation for the node
rlm@215	162 in blender."
rlm@214	163 [#^Node creature #^Spatial eye]
rlm@214	164 (let [target (closest-node creature eye)
rlm@214	165 [cam-width cam-height] (eye-dimensions eye)
rlm@215	166 cam (Camera. cam-width cam-height)
rlm@215	167 rot (.getWorldRotation eye)]
rlm@214	168 (.setLocation cam (.getWorldTranslation eye))
rlm@218	169 (.lookAtDirection
rlm@218	170 cam ; this part is not a mistake and
rlm@218	171 (.mult rot Vector3f/UNIT_X) ; is consistent with using Z in
rlm@218	172 (.mult rot Vector3f/UNIT_Y)) ; blender as the UP vector.
rlm@214	173 (.setFrustumPerspective
rlm@215	174 cam 45 (/ (.getWidth cam) (.getHeight cam)) 1 1000)
rlm@215	175 (bind-sense target cam) cam))
rlm@214	176 #+end_src
rlm@214	177
rlm@214	178 Here, the camera is created based on metadata on the eye-node and
rlm@214	179 attached to the nearest physical object with =(bind-sense)=.
rlm@214	180 ** The Retina
rlm@214	181
rlm@214	182 An eye is a surface (the retina) which contains many discrete sensors
rlm@218	183 to detect light. These sensors can have different light-sensing
rlm@214	184 properties. In humans, each discrete sensor is sensitive to red,
rlm@214	185 blue, green, or gray. These different types of sensors can have
rlm@214	186 different spatial distributions along the retina. In humans, there is
rlm@214	187 a fovea in the center of the retina which has a very high density of
rlm@214	188 color sensors, and a blind spot which has no sensors at all. Sensor
rlm@219	189 density decreases in proportion to distance from the fovea.
rlm@214	190
rlm@214	191 I want to be able to model any retinal configuration, so my eye-nodes
rlm@214	192 in blender contain metadata pointing to images that describe the
rlm@214	193 precise position of the individual sensors using white pixels. The
rlm@214	194 metadata also describes the precise sensitivity to light that the
rlm@214	195 sensors described in the image have. An eye can contain any number of
rlm@214	196 these images. For example, the metadata for an eye might look like
rlm@214	197 this:
rlm@214	198
rlm@214	199 #+begin_src clojure
rlm@214	200 {0xFF0000 "Models/test-creature/retina-small.png"}
rlm@214	201 #+end_src
rlm@214	202
rlm@214	203 #+caption: The retinal profile image "Models/test-creature/retina-small.png". White pixels are photo-sensitive elements. The distribution of white pixels is denser in the middle and falls off at the edges and is inspired by the human retina.
rlm@214	204 [[../assets/Models/test-creature/retina-small.png]]
rlm@214	205
rlm@214	206 Together, the number 0xFF0000 and the image above describe the
rlm@214	207 placement of red-sensitive sensory elements.
rlm@214	208
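To make the role of the white pixels concrete, here is a small sketch
that lists the sensor positions in a retinal profile image. The
function name =white-pixel-coordinates= and the tiny 3x3 retina are
hypothetical and exist only for this illustration; they are not taken
from the cortex sources.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn white-pixel-coordinates
  "Return a vector of [x y] pairs, one per pure-white pixel of image.
   Hypothetical helper for illustration."
  [^BufferedImage image]
  (vec
   (for [x (range (.getWidth image))
         y (range (.getHeight image))
         ;; mask off the alpha byte before comparing against white
         :when (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image x y)))]
     [x y])))

;; A 3x3 retina with one sensor in the center and one in a corner.
(def tiny-retina
  (doto (BufferedImage. 3 3 BufferedImage/TYPE_INT_RGB)
    (.setRGB 1 1 0xFFFFFF)
    (.setRGB 0 2 0xFFFFFF)))

(white-pixel-coordinates tiny-retina)
;; => [[0 2] [1 1]]
#+end_src

Every pixel that is not white is ignored, so the density of white
pixels in the image directly sets the density of sensors in the eye.
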
rlm@214	209 Metadata to very crudely approximate a human eye might be something
rlm@214	210 like this:
rlm@214	211
rlm@214	212 #+begin_src clojure
rlm@214	213 (let [retinal-profile "Models/test-creature/retina-small.png"]
rlm@214	214 {0xFF0000 retinal-profile
rlm@214	215 0x00FF00 retinal-profile
rlm@214	216 0x0000FF retinal-profile
rlm@214	217 0xFFFFFF retinal-profile})
rlm@214	218 #+end_src
rlm@214	219
rlm@214	220 The numbers that serve as keys in the map determine a sensor's
rlm@214	221 relative sensitivity to the channels red, green, and blue. These
rlm@218	222 sensitivity values are packed into an integer in the order =\|_\|R\|G\|B\|=
rlm@218	223 in 8-bit fields. The RGB values of a pixel in the image are added
rlm@214	224 together with these sensitivities as linear weights. Therefore,
rlm@214	225 0xFF0000 means sensitive to red only while 0xFFFFFF means sensitive to
rlm@214	226 all colors equally (gray).
rlm@214	227
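The weighting can be worked through by hand. The sketch below assumes
that the weighted sum is normalized by the largest value it could
take, so that every sensor reports a number between 0 and 1; the
helper names =channels= and =weighted-response= are hypothetical.

#+begin_src clojure
(defn channels
  "Split a packed 0xRRGGBB integer into [r g b], each in 0..255."
  [packed]
  [(bit-and 0xFF (bit-shift-right packed 16))
   (bit-and 0xFF (bit-shift-right packed 8))
   (bit-and 0xFF packed)])

(defn weighted-response
  "Weight the pixel's channels by the sensitivity's channels and
   normalize so the result lies in [0, 1]. Assumes at least one
   sensitivity channel is non-zero."
  [sensitivity pixel]
  (let [s (channels sensitivity)
        p (channels pixel)]
    (double (/ (reduce + (map * s p))
               (* 255 (reduce + s))))))

;; An orange pixel, 0xFF8000, seen by two different sensors.
(weighted-response 0xFF0000 0xFF8000) ; red-only sensor => 1.0
(weighted-response 0xFFFFFF 0xFF8000) ; gray sensor => 383/765, about 0.5
#+end_src

The red-only sensor saturates because the pixel's red channel is at
its maximum, while the gray sensor averages the three channels.
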
rlm@214	228 For convenience I've defined a few symbols for the more common
rlm@214	229 sensitivity values.
rlm@214	230
rlm@214	231 #+name: sensitivity
rlm@214	232 #+begin_src clojure
rlm@214	233 (defvar sensitivity-presets
rlm@214	234 {:all 0xFFFFFF
rlm@214	235 :red 0xFF0000
rlm@214	236 :blue 0x0000FF
rlm@214	237 :green 0x00FF00}
rlm@214	238 "Retinal sensitivity presets for sensors that extract one channel
rlm@219	239 (:red :blue :green) or average all channels (:all)")
rlm@214	240 #+end_src
rlm@214	241
rlm@214	242 ** Metadata Processing
rlm@214	243
rlm@214	244 =(retina-sensor-profile)= extracts a map from the eye-node in the same
rlm@214	245 format as the example maps above. =(eye-dimensions)= finds the
rlm@219	246 dimensions of the smallest image required to contain all the retinal
rlm@214	247 sensor maps.
rlm@214	248
rlm@216	249 #+name: retina
rlm@214	250 #+begin_src clojure
rlm@214	251 (defn retina-sensor-profile
rlm@214	252 "Return a map of pixel sensitivity numbers to BufferedImages
rlm@214	253 describing the distribution of light-sensitive components of this
rlm@214	254 eye. :red, :green, :blue, :all are already defined as extracting
rlm@214	255 the red, green, blue, and average components respectively."
rlm@214	256 [#^Spatial eye]
rlm@214	257 (if-let [eye-map (meta-data eye "eye")]
rlm@214	258 (map-vals
rlm@214	259 load-image
rlm@214	260 (eval (read-string eye-map)))))
rlm@214	261
rlm@218	262 (defn eye-dimensions
rlm@218	263 "Returns [width, height] determined by the metadata of the eye."
rlm@214	264 [#^Spatial eye]
rlm@214	265 (let [dimensions
rlm@214	266 (map #(vector (.getWidth %) (.getHeight %))
rlm@214	267 (vals (retina-sensor-profile eye)))]
rlm@214	268 [(apply max (map first dimensions))
rlm@214	269 (apply max (map second dimensions))]))
rlm@214	270 #+end_src
rlm@214	271
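The dimension calculation takes the maximum width and the maximum
height independently, so the result may be larger than any single
retinal image. A sketch on plain =[width height]= pairs, with the
hypothetical name =bounding-dimensions=, shows this:

#+begin_src clojure
(defn bounding-dimensions
  "Given a collection of [width height] pairs, return the smallest
   [width height] that can contain every one of them."
  [sizes]
  [(apply max (map first sizes))
   (apply max (map second sizes))])

(bounding-dimensions [[64 48] [32 100] [80 20]])
;; => [80 100]
#+end_src
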
ocsenave@265	272 * Importing and parsing descriptions of eyes.
rlm@214	273 First off, get the children of the "eyes" empty node to find all the
rlm@214	274 eyes the creature has.
rlm@216	275 #+name: eye-node
rlm@214	276 #+begin_src clojure
rlm@214	277 (defvar
rlm@214	278 ^{:arglists '([creature])}
rlm@214	279 eyes
rlm@214	280 (sense-nodes "eyes")
rlm@214	281 "Return the children of the creature's \"eyes\" node.")
rlm@214	282 #+end_src
rlm@214	283
rlm@215	284 Then, add the camera created by =(add-eye!)= to the simulation by
rlm@215	285 creating a new viewport.
rlm@214	286
rlm@216	287 #+name: add-camera
rlm@213	288 #+begin_src clojure
rlm@169	289 (defn add-camera!
rlm@169	290 "Add a camera to the world, calling continuation on every frame
rlm@34	291 produced."
rlm@167	292 [#^Application world camera continuation]
rlm@23	293 (let [width (.getWidth camera)
rlm@23	294 height (.getHeight camera)
rlm@23	295 render-manager (.getRenderManager world)
rlm@23	296 viewport (.createMainView render-manager "eye-view" camera)]
rlm@23	297 (doto viewport
rlm@23	298 (.setClearFlags true true true)
rlm@112	299 (.setBackgroundColor ColorRGBA/Black)
rlm@113	300 (.addProcessor (vision-pipeline continuation))
rlm@23	301 (.attachScene (.getRootNode world)))))
rlm@215	302 #+end_src
rlm@151	303
rlm@151	304
rlm@218	305 The eye's continuation function should register the viewport with the
rlm@218	306 simulation the first time it is called, then use the CPU to extract the
rlm@215	307 appropriate pixels from the rendered image and weight them by each
rlm@218	308 sensor's sensitivity. I have the option to do this processing in
rlm@218	309 native code for a slight gain in speed. I could also do it in the GPU
rlm@218	310 for a massive gain in speed. =(vision-kernel)= generates a list of
rlm@218	311 such continuation functions, one for each channel of the eye.
rlm@151	312
rlm@216	313 #+name: kernel
rlm@215	314 #+begin_src clojure
rlm@215	315 (in-ns 'cortex.vision)
rlm@151	316
rlm@215	317 (defrecord attached-viewport [vision-fn viewport-fn]
rlm@215	318 clojure.lang.IFn
rlm@215	319 (invoke [this world] (vision-fn world))
rlm@215	320 (applyTo [this args] (apply vision-fn args)))
rlm@151	321
rlm@216	322 (defn pixel-sense [sensitivity pixel]
rlm@216	323 (let [s-r (bit-shift-right (bit-and 0xFF0000 sensitivity) 16)
rlm@216	324 s-g (bit-shift-right (bit-and 0x00FF00 sensitivity) 8)
rlm@216	325 s-b (bit-and 0x0000FF sensitivity)
rlm@216	326
rlm@216	327 p-r (bit-shift-right (bit-and 0xFF0000 pixel) 16)
rlm@216	328 p-g (bit-shift-right (bit-and 0x00FF00 pixel) 8)
rlm@216	329 p-b (bit-and 0x0000FF pixel)
rlm@216	330
rlm@216	331 total-sensitivity (* 255 (+ s-r s-g s-b))]
rlm@216	332 (float (/ (+ (* s-r p-r)
rlm@216	333 (* s-g p-g)
rlm@216	334 (* s-b p-b))
rlm@216	335 total-sensitivity))))
rlm@216	336
rlm@215	337 (defn vision-kernel
rlm@171	338 "Returns a list of functions, each of which will return a color
rlm@171	339 channel's worth of visual information when called inside a running
rlm@171	340 simulation."
rlm@151	341 [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
rlm@169	342 (let [retinal-map (retina-sensor-profile eye)
rlm@169	343 camera (add-eye! creature eye)
rlm@151	344 vision-image
rlm@151	345 (atom
rlm@151	346 (BufferedImage. (.getWidth camera)
rlm@151	347 (.getHeight camera)
rlm@170	348 BufferedImage/TYPE_BYTE_BINARY))
rlm@170	349 register-eye!
rlm@170	350 (runonce
rlm@170	351 (fn [world]
rlm@170	352 (add-camera!
rlm@170	353 world camera
rlm@170	354 (let [counter (atom 0)]
rlm@170	355 (fn [r fb bb bi]
rlm@170	356 (if (zero? (rem (swap! counter inc) (inc skip)))
rlm@170	357 (reset! vision-image
rlm@170	358 (BufferedImage! r fb bb bi))))))))]
rlm@151	359 (vec
rlm@151	360 (map
rlm@151	361 (fn [[key image]]
rlm@151	362 (let [whites (white-coordinates image)
rlm@151	363 topology (vec (collapse whites))
rlm@216	364 sensitivity (sensitivity-presets key key)]
rlm@215	365 (attached-viewport.
rlm@215	366 (fn [world]
rlm@215	367 (register-eye! world)
rlm@215	368 (vector
rlm@215	369 topology
rlm@215	370 (vec
rlm@215	371 (for [[x y] whites]
rlm@216	372 (pixel-sense
rlm@216	373 sensitivity
rlm@216	374 (.getRGB @vision-image x y))))))
rlm@215	375 register-eye!)))
rlm@215	376 retinal-map))))
rlm@151	377
rlm@215	378 (defn gen-fix-display
rlm@215	379 "Create a function to call to restore a simulation's display when it
rlm@215	380 is disrupted by a Viewport."
rlm@215	381 []
rlm@215	382 (runonce
rlm@215	383 (fn [world]
rlm@215	384 (add-camera! world (.getCamera world) no-op))))
rlm@215	385 #+end_src
rlm@170	386
rlm@215	387 Note that since each of the functions generated by =(vision-kernel)=
rlm@215	388 shares the same =(register-eye!)= function, the eye will be registered
rlm@215	389 only once, the first time any of the functions from the list returned
rlm@215	390 by =(vision-kernel)= is called. Each of the functions returned by
rlm@215	391 =(vision-kernel)= also allows access to the =ViewPort= through which
rlm@215	392 it receives images.
rlm@215	393
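The shared-registration behavior can be sketched in isolation. The
wrapper below is a hypothetical stand-in for =runonce=, whose actual
implementation is not shown in this file; it assumes only that the
wrapped function runs on the first call and is skipped afterwards.

#+begin_src clojure
(defn run-once-sketch
  "Wrap f so that only the first call invokes it; later calls do
   nothing and return nil. Hypothetical stand-in for runonce."
  [f]
  (let [called? (atom false)]
    (fn [& args]
      (when (compare-and-set! called? false true)
        (apply f args)))))

;; Count how many times the eye is registered.
(def registrations (atom 0))
(def register!
  (run-once-sketch (fn [_world] (swap! registrations inc))))

;; Two channel functions that share one registration function.
(def red-channel   (fn [world] (register! world) :red-data))
(def green-channel (fn [world] (register! world) :green-data))

(red-channel :world)
(green-channel :world)
@registrations
;; => 1
#+end_src

Whichever channel function happens to be called first performs the
registration, and the remaining channels reuse it.
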
rlm@215	394 The in-game display can be disrupted by all the viewports that the
rlm@215	395 functions created by =(vision-kernel)= add. This doesn't affect the
rlm@215	396 simulation or the simulated senses, but can be annoying.
rlm@215	397 =(gen-fix-display)= restores the in-simulation display.
rlm@215	398
ocsenave@265	399 ** The =vision!= function creates sensory probes.
rlm@215	400
rlm@218	401 All the hard work has been done; all that remains is to apply
rlm@215	402 =(vision-kernel)= to each eye in the creature and gather the results
rlm@215	403 into one list of functions.
rlm@215	404
rlm@216	405 #+name: main
rlm@215	406 #+begin_src clojure
rlm@170	407 (defn vision!
rlm@170	408 "Returns a list of functions, each of which returns visual sensory
rlm@218	409 data when called inside a running simulation."
rlm@151	410 [#^Node creature & {skip :skip :or {skip 0}}]
rlm@151	411 (reduce
rlm@170	412 concat
rlm@167	413 (for [eye (eyes creature)]
rlm@215	414 (vision-kernel creature eye :skip skip))))
rlm@215	415 #+end_src
rlm@151	416
ocsenave@265	417 ** Displaying visual data for debugging.
ocsenave@265	418 # Visualization of Vision. Maybe less alliteration would be better.
rlm@215	419 It's vital to have a visual representation for each sense. Here I use
rlm@215	420 =(view-sense)= to construct a function that will create a display for
rlm@215	421 visual data.
rlm@215	422
rlm@216	423 #+name: display
rlm@215	424 #+begin_src clojure
rlm@216	425 (in-ns 'cortex.vision)
rlm@216	426
rlm@189	427 (defn view-vision
rlm@189	428 "Creates a function which accepts a list of visual sensor-data and
rlm@189	429 displays each element of the list to the screen."
rlm@189	430 []
rlm@188	431 (view-sense
rlm@188	432 (fn
rlm@188	433 [[coords sensor-data]]
rlm@188	434 (let [image (points->image coords)]
rlm@188	435 (dorun
rlm@188	436 (for [i (range (count coords))]
rlm@188	437 (.setRGB image ((coords i) 0) ((coords i) 1)
rlm@216	438 (gray (int (* 255 (sensor-data i)))))))
rlm@189	439 image))))
rlm@34	440 #+end_src
rlm@23	441
ocsenave@264	442 * Demonstrations
ocsenave@264	443 ** Demonstrating the vision pipeline.
rlm@23	444
rlm@215	445 This is a basic test for the vision system. It only tests the
ocsenave@264	446 vision-pipeline and does not deal with loading eyes from a blender
rlm@215	447 file. The code creates two videos of the same rotating cube from
rlm@215	448 different angles.
rlm@23	449
rlm@215	450 #+name: test-1
rlm@23	451 #+begin_src clojure
rlm@215	452 (in-ns 'cortex.test.vision)
rlm@23	453
rlm@219	454 (defn test-pipeline
rlm@69	455 "Testing vision:
rlm@69	456 Tests the vision system by creating two views of the same rotating
rlm@69	457 object from different angles and displaying both of those views in
rlm@69	458 JFrames.
rlm@69	459
rlm@69	460 You should see a rotating cube, and two windows,
rlm@69	461 each displaying a different view of the cube."
rlm@36	462 []
rlm@58	463 (let [candy
rlm@58	464 (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
rlm@112	465 (world
rlm@112	466 (doto (Node.)
rlm@112	467 (.attachChild candy))
rlm@112	468 {}
rlm@112	469 (fn [world]
rlm@112	470 (let [cam (.clone (.getCamera world))
rlm@112	471 width (.getWidth cam)
rlm@112	472 height (.getHeight cam)]
rlm@169	473 (add-camera! world cam
rlm@215	474 (comp
rlm@215	475 (view-image
rlm@215	476 (File. "/home/r/proj/cortex/render/vision/1"))
rlm@215	477 BufferedImage!))
rlm@169	478 (add-camera! world
rlm@112	479 (doto (.clone cam)
rlm@112	480 (.setLocation (Vector3f. -10 0 0))
rlm@112	481 (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
rlm@215	482 (comp
rlm@215	483 (view-image
rlm@215	484 (File. "/home/r/proj/cortex/render/vision/2"))
rlm@215	485 BufferedImage!))
rlm@112	486 ;; This is here to restore the main view
rlm@112	487 ;; after the other views have completed processing
rlm@169	488 (add-camera! world (.getCamera world) no-op)))
rlm@112	489 (fn [world tpf]
rlm@112	490 (.rotate candy (* tpf 0.2) 0 0)))))
rlm@23	491 #+end_src
rlm@23	492
rlm@215	493 #+begin_html
rlm@215	494 <div class="figure">
rlm@215	495 <video controls="controls" width="755">
rlm@215	496 <source src="../video/spinning-cube.ogg" type="video/ogg"
rlm@215	497 preload="none" poster="../images/aurellem-1280x480.png" />
rlm@215	498 </video>
rlm@215	499 <p>A rotating cube viewed from two different perspectives.</p>
rlm@215	500 </div>
rlm@215	501 #+end_html
rlm@215	502
rlm@215	503 Creating multiple eyes like this can be used for stereoscopic vision
rlm@215	504 simulation in a single creature or for simulating multiple creatures,
rlm@215	505 each with their own sense of vision.
ocsenave@264	506 ** Demonstrating eye import and parsing.
rlm@215	507
rlm@218	508 To the worm from the last post, I add a new node that describes its
rlm@215	509 eyes.
rlm@215	510
rlm@215	511 #+attr_html: width=755
rlm@215	512 #+caption: The worm with newly added empty nodes describing a single eye.
rlm@215	513 [[../images/worm-with-eye.png]]
rlm@215	514
rlm@215	515 The node highlighted in yellow is the root level "eyes" node. It has
rlm@218	516 a single child, highlighted in orange, which describes a single
rlm@218	517 eye. This is the "eye" node. It is placed so that the worm will have
rlm@218	518 an eye located in the center of the flat portion of its lower
rlm@218	519 hemispherical section.
rlm@218	520
rlm@218	521 The two nodes which are not highlighted describe the single joint of
rlm@218	522 the worm.
rlm@215	523
rlm@215	524 The metadata of the eye-node is:
rlm@215	525
rlm@215	526 #+begin_src clojure :results verbatim :exports both
rlm@215	527 (cortex.sense/meta-data
rlm@218	528 (.getChild (.getChild (cortex.test.body/worm) "eyes") "eye") "eye")
rlm@215	529 #+end_src
rlm@215	530
rlm@215	531 #+results:
rlm@215	532 : "(let [retina \"Models/test-creature/retina-small.png\"]
rlm@215	533 : {:all retina :red retina :green retina :blue retina})"
rlm@215	534
rlm@215	535 This is the approximation to the human eye described earlier.
rlm@215	536
rlm@216	537 #+name: test-2
rlm@215	538 #+begin_src clojure
rlm@215	539 (in-ns 'cortex.test.vision)
rlm@215	540
rlm@216	541 (defn change-color [obj color]
rlm@216	542 (println-repl obj)
rlm@216	543 (if obj
rlm@216	544 (.setColor (.getMaterial obj) "Color" color)))
rlm@216	545
rlm@216	546 (defn colored-cannon-ball [color]
rlm@216	547 (comp #(change-color % color)
rlm@216	548 (fire-cannon-ball)))
rlm@215	549
rlm@236	550 (defn test-worm-vision [record]
rlm@215	551 (let [the-worm (doto (worm)(body!))
rlm@215	552 vision (vision! the-worm)
rlm@215	553 vision-display (view-vision)
rlm@215	554 fix-display (gen-fix-display)
rlm@215	555 me (sphere 0.5 :color ColorRGBA/Blue :physical? false)
rlm@215	556 x-axis
rlm@215	557 (box 1 0.01 0.01 :physical? false :color ColorRGBA/Red
rlm@215	558 :position (Vector3f. 0 -5 0))
rlm@215	559 y-axis
rlm@215	560 (box 0.01 1 0.01 :physical? false :color ColorRGBA/Green
rlm@215	561 :position (Vector3f. 0 -5 0))
rlm@215	562 z-axis
rlm@215	563 (box 0.01 0.01 1 :physical? false :color ColorRGBA/Blue
rlm@216	564 :position (Vector3f. 0 -5 0))
rlm@216	565 timer (RatchetTimer. 60)]
rlm@215	566
rlm@215	567 (world (nodify [(floor) the-worm x-axis y-axis z-axis me])
rlm@216	568 (assoc standard-debug-controls
rlm@216	569 "key-r" (colored-cannon-ball ColorRGBA/Red)
rlm@216	570 "key-b" (colored-cannon-ball ColorRGBA/Blue)
rlm@216	571 "key-g" (colored-cannon-ball ColorRGBA/Green))
rlm@215	572 (fn [world]
rlm@215	573 (light-up-everything world)
rlm@216	574 (speed-up world)
rlm@216	575 (.setTimer world timer)
rlm@216	576 (display-dialated-time world timer)
rlm@215	577 ;; add a view from the worm's perspective
rlm@236	578 (if record
rlm@236	579 (Capture/captureVideo
rlm@236	580 world
rlm@236	581 (File.
rlm@236	582 "/home/r/proj/cortex/render/worm-vision/main-view")))
rlm@236	583
rlm@215	584 (add-camera!
rlm@215	585 world
rlm@215	586 (add-eye! the-worm
rlm@215	587 (.getChild
rlm@215	588 (.getChild the-worm "eyes") "eye"))
rlm@215	589 (comp
rlm@215	590 (view-image
rlm@236	591 (if record
rlm@236	592 (File.
rlm@236	593 "/home/r/proj/cortex/render/worm-vision/worm-view")))
rlm@215	594 BufferedImage!))
rlm@236	595
rlm@236	596 (set-gravity world Vector3f/ZERO))
rlm@216	597
rlm@215	598 (fn [world _ ]
rlm@215	599 (.setLocalTranslation me (.getLocation (.getCamera world)))
rlm@215	600 (vision-display
rlm@215	601 (map #(% world) vision)
rlm@236	602 (if record (File. "/home/r/proj/cortex/render/worm-vision")))
rlm@215	603 (fix-display world)))))
rlm@215	604 #+end_src
rlm@215	605
rlm@218	606 The world consists of the worm and a flat gray floor. I can shoot red,
rlm@218	607 green, blue and white cannonballs at the worm. The worm is initially
rlm@218	608 looking down at the floor, and there is no gravity. My perspective
rlm@218	609 (the Main View), the worm's perspective (Worm View) and the 4 sensor
rlm@218	610 channels that comprise the worm's eye are all saved frame-by-frame to
rlm@218	611 disk.
rlm@218	612
rlm@218	613 * Demonstration of Vision
rlm@218	614 #+begin_html
rlm@218	615 <div class="figure">
rlm@218	616 <video controls="controls" width="755">
rlm@218	617 <source src="../video/worm-vision.ogg" type="video/ogg"
rlm@218	618 preload="none" poster="../images/aurellem-1280x480.png" />
rlm@218	619 </video>
rlm@218	620 <p>Simulated Vision in a Virtual Environment</p>
rlm@218	621 </div>
rlm@218	622 #+end_html
rlm@218	623
rlm@218	624 ** Generate the Worm Video from Frames
rlm@216	625 #+name: magick2
rlm@216	626 #+begin_src clojure
rlm@216	627 (ns cortex.video.magick2
rlm@216	628 (:import java.io.File)
rlm@216	629 (:use clojure.contrib.shell-out))
rlm@216	630
rlm@216	631 (defn images [path]
rlm@216	632 (sort (rest (file-seq (File. path)))))
rlm@216	633
rlm@216	634 (def base "/home/r/proj/cortex/render/worm-vision/")
rlm@216	635
rlm@216	636 (defn pics [file]
rlm@216	637 (images (str base file)))
rlm@216	638
rlm@216	639 (defn combine-images []
rlm@216	640 (let [main-view (pics "main-view")
rlm@216	641 worm-view (pics "worm-view")
rlm@216	642 blue (pics "0")
rlm@216	643 green (pics "1")
rlm@216	644 red (pics "2")
rlm@216	645 gray (pics "3")
rlm@216	646 blender (let [b-pics (pics "blender")]
rlm@216	647 (concat b-pics (repeat 9001 (last b-pics))))
rlm@216	648 background (repeat 9001 (File. (str base "background.png")))
rlm@216	649 targets (map
rlm@216	650 #(File. (str base "out/" (format "%07d.png" %)))
rlm@216	651 (range 0 (count main-view)))]
rlm@216	652 (dorun
rlm@216	653 (pmap
rlm@216	654 (comp
rlm@216	655 (fn [[background main-view worm-view red green blue gray blender target]]
rlm@216	656 (println target)
rlm@216	657 (sh "convert"
rlm@216	658 background
rlm@216	659 main-view "-geometry" "+18+17" "-composite"
rlm@216	660 worm-view "-geometry" "+677+17" "-composite"
rlm@216	661 green "-geometry" "+685+430" "-composite"
rlm@216	662 red "-geometry" "+788+430" "-composite"
rlm@216	663 blue "-geometry" "+894+430" "-composite"
rlm@216	664 gray "-geometry" "+1000+430" "-composite"
rlm@216	665 blender "-geometry" "+0+0" "-composite"
rlm@216	666 target))
rlm@216	667 (fn [& args] (map #(.getCanonicalPath %) args)))
rlm@216	668 background main-view worm-view red green blue gray blender targets))))
rlm@216	669 #+end_src
rlm@216	670
rlm@216	671 #+begin_src sh :results silent
rlm@216	672 cd /home/r/proj/cortex/render/worm-vision
rlm@216	673 ffmpeg -r 25 -b 9001k -i out/%07d.png -vcodec libtheora worm-vision.ogg
rlm@216	674 #+end_src
rlm@236	675
ocsenave@265	676 * Onward!
ocsenave@265	677 - As a neat bonus, the idea behind simulated vision also enables one
ocsenave@265	678 to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].
ocsenave@265	679 - Now that we have vision, it's time to tackle [[./hearing.org][hearing]].
ocsenave@265	680
ocsenave@265	681
ocsenave@265	682 #+appendix
ocsenave@265	683
rlm@215	684 * Headers
rlm@215	685
rlm@213	686 #+name: vision-header
rlm@213	687 #+begin_src clojure
rlm@213	688 (ns cortex.vision
rlm@213	689 "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
rlm@213	690 eyes from different positions to observe the same world, and pass
rlm@213	691 the observed data to any arbitrary function. Automatically reads
rlm@216	692 eye-nodes from specially prepared blender files and instantiates
rlm@213	693 them in the world as actual eyes."
rlm@213	694 {:author "Robert McIntyre"}
rlm@213	695 (:use (cortex world sense util))
rlm@213	696 (:use clojure.contrib.def)
rlm@213	697 (:import com.jme3.post.SceneProcessor)
rlm@237	698 (:import (com.jme3.util BufferUtils Screenshots))
rlm@213	699 (:import java.nio.ByteBuffer)
rlm@213	700 (:import java.awt.image.BufferedImage)
rlm@213	701 (:import (com.jme3.renderer ViewPort Camera))
rlm@216	702 (:import (com.jme3.math ColorRGBA Vector3f Matrix3f))
rlm@213	703 (:import com.jme3.renderer.Renderer)
rlm@213	704 (:import com.jme3.app.Application)
rlm@213	705 (:import com.jme3.texture.FrameBuffer)
rlm@213	706 (:import (com.jme3.scene Node Spatial)))
rlm@213	707 #+end_src
rlm@112	708
rlm@215	709 #+name: test-header
rlm@215	710 #+begin_src clojure
rlm@215	711 (ns cortex.test.vision
rlm@215	712 (:use (cortex world sense util body vision))
rlm@215	713 (:use cortex.test.body)
rlm@215	714 (:import java.awt.image.BufferedImage)
rlm@215	715 (:import javax.swing.JPanel)
rlm@215	716 (:import javax.swing.SwingUtilities)
rlm@215	717 (:import java.awt.Dimension)
rlm@215	718 (:import javax.swing.JFrame)
rlm@215	719 (:import com.jme3.math.ColorRGBA)
rlm@215	720 (:import com.jme3.scene.Node)
rlm@215	721 (:import com.jme3.math.Vector3f)
rlm@216	722 (:import java.io.File)
rlm@216	723 (:import (com.aurellem.capture Capture RatchetTimer)))
rlm@215	724 #+end_src
rlm@216	725 * Source Listing
rlm@216	726 - [[../src/cortex/vision.clj][cortex.vision]]
rlm@216	727 - [[../src/cortex/test/vision.clj][cortex.test.vision]]
rlm@216	728 - [[../src/cortex/video/magick2.clj][cortex.video.magick2]]
rlm@216	729 - [[../assets/Models/subtitles/worm-vision-subtitles.blend][worm-vision-subtitles.blend]]
rlm@216	730 #+html: <ul> <li> <a href="../org/vision.org">This org file</a> </li> </ul>
rlm@216	731 - [[http://hg.bortreb.com][source-repository]]
rlm@216	732
rlm@35	733
rlm@24	734
rlm@212	735 * COMMENT Generate Source
rlm@34	736 #+begin_src clojure :tangle ../src/cortex/vision.clj
rlm@216	737 <<vision-header>>
rlm@216	738 <<pipeline-1>>
rlm@216	739 <<pipeline-2>>
rlm@216	740 <<retina>>
rlm@216	741 <<add-eye>>
rlm@216	742 <<sensitivity>>
rlm@216	743 <<eye-node>>
rlm@216	744 <<add-camera>>
rlm@216	745 <<kernel>>
rlm@216	746 <<main>>
rlm@216	747 <<display>>
rlm@24	748 #+end_src
rlm@24	749
rlm@68	750 #+begin_src clojure :tangle ../src/cortex/test/vision.clj
rlm@215	751 <<test-header>>
rlm@215	752 <<test-1>>
rlm@216	753 <<test-2>>
rlm@24	754 #+end_src
rlm@216	755
rlm@216	756 #+begin_src clojure :tangle ../src/cortex/video/magick2.clj
rlm@216	757 <<magick2>>
rlm@216	758 #+end_src
