comparison org/vision.org @ 218:ac46ee4e574a

edits to vision.org
author Robert McIntyre <rlm@mit.edu>
date Fri, 10 Feb 2012 12:06:41 -0700
parents f5ea63245b3b
children 5f14fd7b1288
comparison of 217:7bf3e3d8fb26 with 218:ac46ee4e574a
Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see its
own version of the world depending on where it is.

Making these simulated eyes a reality is simple because jMonkeyEngine
already contains extensive support for multiple views of the same 3D
simulated world. jMonkeyEngine has this support because it is
necessary for creating games with split-screen views. Multiple views
are also used to create efficient pseudo-reflections by rendering the
scene from a certain perspective and then projecting it back onto a
surface in the 3D world.

#+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye, which was one of the first games to use split-screen views.
[[../images/goldeneye-4-player.png]]

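As a quick illustration of that multiple-view support, here is a
sketch (not part of the cortex code) of how an extra view of the same
scene can be created through jMonkeyEngine's =RenderManager=; the
render manager, scene node, and camera are assumed to already exist.

#+begin_src clojure
(import '(com.jme3.renderer Camera))

(defn add-view
  "Sketch: attach an additional view of `scene`, rendered through `cam`."
  [render-manager scene #^Camera cam]
  (doto (.createMainView render-manager "extra-view" cam)
    (.setClearFlags true true true) ; clear color, depth, and stencil
    (.attachScene scene)))
#+end_src
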
* Brief Description of jMonkeyEngine's Rendering Pipeline

jMonkeyEngine allows you to create a =ViewPort=, which represents a

[...]

add a =SceneProcessor= that feeds the visual data to any arbitrary
continuation function for further processing. That continuation
function may perform both CPU and GPU operations on the data. To make
this easy for the continuation function, the =SceneProcessor=
maintains appropriately sized buffers in RAM to hold the data. It does
not do any copying from the GPU to the CPU itself, because that is a
slow operation.

#+name: pipeline-1
#+begin_src clojure
(defn vision-pipeline
  "Create a SceneProcessor object which wraps a vision processing
  ;; ...
    (cleanup []))))
#+end_src

The continuation function given to =(vision-pipeline)= above will be
given a =Renderer= and three containers for image data. The
=FrameBuffer= references the GPU image data, but the pixel data cannot
be used directly on the CPU. The =ByteBuffer= and =BufferedImage= are
initially "empty" but are sized to hold the data in the
=FrameBuffer=. I call transferring the GPU image data to the CPU
structures "mixing" the image data. I have provided three functions to
do this mixing.

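To make the hand-off concrete, here is a rough sketch of a
continuation function. The two helpers named in it are stand-ins for
the mixing functions provided below, not their actual names.

#+begin_src clojure
(defn example-continuation
  "Sketch: copy the GPU frame into the CPU-side containers and hand
   the finished BufferedImage to a callback."
  [callback]
  (fn [renderer frame-buffer byte-buffer buffered-image]
    (gpu->cpu! renderer frame-buffer byte-buffer) ; stand-in mixing fn
    (bytes->image! byte-buffer buffered-image)    ; stand-in mixing fn
    (callback buffered-image)))
#+end_src
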
#+name: pipeline-2

[...]

  (let [target (closest-node creature eye)
        [cam-width cam-height] (eye-dimensions eye)
        cam (Camera. cam-width cam-height)
        rot (.getWorldRotation eye)]
    (.setLocation cam (.getWorldTranslation eye))
    (.lookAtDirection
     cam                          ; this part is not a mistake and
     (.mult rot Vector3f/UNIT_X)  ; is consistent with using Z in
     (.mult rot Vector3f/UNIT_Y)) ; blender as the UP vector.
    (.setFrustumPerspective
     cam 45 (/ (.getWidth cam) (.getHeight cam)) 1 1000)
    (bind-sense target cam) cam))
#+end_src
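
The =setFrustumPerspective= call above takes the vertical field of
view in degrees, the aspect ratio, and the near and far clipping
planes, so the eye sees a 45 degree cone from 1 to 1000 world units
away. As a standalone illustration:

#+begin_src clojure
(let [cam (Camera. 640 480)]            ; any width/height works
  (.setFrustumPerspective
   cam
   45                                   ; vertical field of view (degrees)
   (/ (.getWidth cam) (.getHeight cam)) ; aspect ratio
   1                                    ; near clipping plane
   1000))                               ; far clipping plane
#+end_src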

[...]

** The Retina

An eye is a surface (the retina) which contains many discrete sensors
to detect light. These sensors can have different light-sensing
properties. In humans, each discrete sensor is sensitive to red,
blue, green, or gray. These different types of sensors can have
different spatial distributions along the retina. In humans, there is
a fovea in the center of the retina which has a very high density of
color sensors, and a blind spot which has no sensors at all. Sensor
[...]

   0xFFFFFF retinal-profile})
#+end_src

The numbers that serve as keys in the map determine a sensor's
relative sensitivity to the channels red, green, and blue. These
sensitivity values are packed into an integer in the order =|_|R|G|B|=
in 8-bit fields. The RGB values of a pixel in the image are added
together with these sensitivities as linear weights. Therefore,
0xFF0000 means sensitive to red only, while 0xFFFFFF means sensitive
to all colors equally (gray).
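
For instance, the weighting can be written out directly. This helper
is only an illustration, not part of the vision code:

#+begin_src clojure
(defn sensor-response
  "Illustration: unpack an 0xRRGGBB sensitivity value and apply it to
   a pixel's [r g b] components as linear weights."
  [sensitivity [r g b]]
  (let [weight-r (bit-and 0xFF (bit-shift-right sensitivity 16))
        weight-g (bit-and 0xFF (bit-shift-right sensitivity 8))
        weight-b (bit-and 0xFF sensitivity)]
    (+ (* weight-r r) (* weight-g g) (* weight-b b))))

;; (sensor-response 0xFF0000 [10 20 30]) => 2550  ; red only
;; (sensor-response 0xFFFFFF [10 20 30]) => 15300 ; all channels (gray)
#+end_src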

For convenience I've defined a few symbols for the more common

[...]

  (if-let [eye-map (meta-data eye "eye")]
    (map-vals
     load-image
     (eval (read-string eye-map)))))

(defn eye-dimensions
  "Returns [width, height] determined by the metadata of the eye."
  [#^Spatial eye]
  (let [dimensions
        (map #(vector (.getWidth %) (.getHeight %))
             (vals (retina-sensor-profile eye)))]
    [(apply max (map first dimensions))
     (apply max (map second dimensions))]))
#+end_src
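
As a usage sketch, assuming the worm creature from later in this post
is loaded and =(eyes)= (defined in the next section) returns its eye
nodes:

#+begin_src clojure
(let [eye (first (eyes (cortex.test.body/worm)))]
  [(keys (retina-sensor-profile eye)) ; channel keys from the eye's metadata
   (eye-dimensions eye)])             ; [width height] of the largest channel
#+end_src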

* Eye Creation

First off, get the children of the "eyes" empty node to find all the
eyes the creature has.

#+name: eye-node
#+begin_src clojure
(defvar

[...]

    (.addProcessor (vision-pipeline continuation))
    (.attachScene (.getRootNode world)))))
#+end_src

The eye's continuation function should register the viewport with the
simulation the first time it is called, use the CPU to extract the
appropriate pixels from the rendered image, and weight them by each
sensor's sensitivity. I have the option to do this processing in
native code for a slight gain in speed. I could also do it in the GPU
for a massive gain in speed. =(vision-kernel)= generates a list of
such continuation functions, one for each channel of the eye.

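Before the actual definition, here is a simplified sketch of the
shape of one such continuation: register its =ViewPort= the first
time it is called, then return the weighted sensor values on every
call. The helpers =register-eye!= and =read-sensors= are placeholders
for illustration only, not the real machinery in =(vision-kernel)=.

#+begin_src clojure
(defn sketch-vision-fn
  [creature eye]
  (let [registered? (atom false)]
    (fn [world]
      (when-not @registered?
        (register-eye! world creature eye) ; attach this eye's ViewPort once
        (reset! registered? true))
      (read-sensors eye))))                ; weighted pixel values for one channel
#+end_src
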
#+name: kernel
#+begin_src clojure
(in-ns 'cortex.vision)
;; ...
#+end_src

[...]

simulation or the simulated senses, but can be annoying.
=(gen-fix-display)= restores the in-simulation display.

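A usage sketch, assuming =(gen-fix-display)= returns a function of
the world, as the =(fix-display world)= call in the test code below
suggests:

#+begin_src clojure
(def fix-display (gen-fix-display))
;; then, inside a simulation callback:
;; (fix-display world)
#+end_src
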
** Vision!

All the hard work has been done; all that remains is to apply
=(vision-kernel)= to each eye in the creature and gather the results
into one list of functions.

#+name: main
#+begin_src clojure
(defn vision!
  "Returns a function which returns visual sensory data when called
   inside a running simulation."
  [#^Node creature & {skip :skip :or {skip 0}}]
  (reduce
   concat
   (for [eye (eyes creature)]
     (vision-kernel creature eye))))
#+end_src

[...]

simulation in a single creature or for simulating multiple creatures,
each with their own sense of vision.

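A usage sketch: call =(vision!)= once on a loaded creature, then
apply each returned function to the world every frame, exactly as the
worm test below does with =(map #(% world) vision)=. Here =creature=
and =world= are assumed to come from a running simulation.

#+begin_src clojure
(def vision (vision! creature)) ; one function per eye channel

(defn current-percepts
  "Query every vision channel for the current frame."
  [world]
  (map #(% world) vision))
#+end_src
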
** Adding Vision to the Worm

To the worm from the last post, I add a new node that describes its
eyes.

#+attr_html: width=755
#+caption: The worm with newly added empty nodes describing a single eye.
[[../images/worm-with-eye.png]]

The node highlighted in yellow is the root level "eyes" node. It has
a single child, highlighted in orange, which describes a single
eye. This is the "eye" node. It is placed so that the worm will have
an eye located in the center of the flat portion of its lower
hemispherical section.

The two nodes which are not highlighted describe the single joint of
the worm.

The metadata of the eye-node is:

#+begin_src clojure :results verbatim :exports both
(cortex.sense/meta-data
 (.getChild (.getChild (cortex.test.body/worm) "eyes") "eye") "eye")
#+end_src

#+results:
: "(let [retina \"Models/test-creature/retina-small.png\"]
:    {:all retina :red retina :green retina :blue retina})"
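
Because the metadata is stored as a string of Clojure code,
=(retina-sensor-profile)= just reads and evaluates it; evaluating the
string above by hand yields the channel map that then gets turned
into loaded images:

#+begin_src clojure
(eval (read-string
       "(let [retina \"Models/test-creature/retina-small.png\"]
          {:all retina :red retina :green retina :blue retina})"))
;; => a map from :all, :red, :green, and :blue to the retina image path
#+end_src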

[...]

   (map #(% world) vision)
   (File. "/home/r/proj/cortex/render/worm-vision"))
  (fix-display world)))))
#+end_src

The world consists of the worm and a flat gray floor. I can shoot
red, green, blue, and white cannonballs at the worm. The worm is
initially looking down at the floor, and there is no gravity. My
perspective (the Main View), the worm's perspective (Worm View), and
the 4 sensor channels that comprise the worm's eye are all saved
frame-by-frame to disk.

* Demonstration of Vision

#+begin_html
<div class="figure">
<video controls="controls" width="755">
  <source src="../video/worm-vision.ogg" type="video/ogg"
          preload="none" poster="../images/aurellem-1280x480.png" />
</video>
<p>Simulated Vision in a Virtual Environment</p>
</div>
#+end_html

** Generate the Worm Video from Frames
#+name: magick2
#+begin_src clojure
(ns cortex.video.magick2
  (:import java.io.File)
  (:use clojure.contrib.shell-out))
;; ...
#+end_src

[...]

#+begin_src sh :results silent
cd /home/r/proj/cortex/render/worm-vision
ffmpeg -r 25 -b 9001k -i out/%07d.png -vcodec libtheora worm-vision.ogg
#+end_src

* Headers

#+name: vision-header
#+begin_src clojure
(ns cortex.vision