comparison org/vision.org @ 218:ac46ee4e574a

edits to vision.org

author:   Robert McIntyre <rlm@mit.edu>
date:     Fri, 10 Feb 2012 12:06:41 -0700
parents:  f5ea63245b3b
children: 5f14fd7b1288
comparing 217:7bf3e3d8fb26 with 218:ac46ee4e574a
Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see its
own version of the world depending on where it is.

Making these simulated eyes a reality is simple because jMonkeyEngine
already contains extensive support for multiple views of the same 3D
simulated world. jMonkeyEngine has this support because it is
necessary for creating games with split-screen views. Multiple views
are also used to create efficient pseudo-reflections by rendering the
scene from a certain perspective and then projecting it back onto a
surface in the 3D world.

#+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye, which was one of the first games to use split-screen views.
[[../images/goldeneye-4-player.png]]

* Brief Description of jMonkeyEngine's Rendering Pipeline

jMonkeyEngine allows you to create a =ViewPort=, which represents a
add a =SceneProcessor= that feeds the visual data to any arbitrary
continuation function for further processing. That continuation
function may perform both CPU and GPU operations on the data. To make
this easy for the continuation function, the =SceneProcessor=
maintains appropriately sized buffers in RAM to hold the data. It does
not do any copying from the GPU to the CPU itself, because that is a
slow operation.

#+name: pipeline-1
#+begin_src clojure
(defn vision-pipeline
  "Create a SceneProcessor object which wraps a vision processing
    (cleanup []))))
#+end_src

The continuation function given to =(vision-pipeline)= above will be
given a =Renderer= and three containers for image data. The
=FrameBuffer= references the GPU image data, but the pixel data
cannot be used directly on the CPU. The =ByteBuffer= and
=BufferedImage= are initially "empty" but are sized to hold the data
in the =FrameBuffer=. I call transferring the GPU image data to the
CPU structures "mixing" the image data. I have provided three
functions to do this mixing.

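The three mixing functions themselves fall outside this excerpt. As a
rough sketch of what one combined mixing step might look like,
assuming jME3's =Renderer.readFrameBuffer= and
=Screenshots/convertScreenShot= (the name =mix-image-data!= is mine,
not from the original code):

```clojure
(import '(com.jme3.util Screenshots))

;; Illustrative only: move GPU pixel data into the CPU-side buffers
;; that the SceneProcessor maintains.
(defn mix-image-data!
  [renderer frame-buffer byte-buffer buffered-image]
  (.clear byte-buffer)
  ;; copy the raw pixels from the GPU into the pre-sized ByteBuffer
  (.readFrameBuffer renderer frame-buffer byte-buffer)
  ;; unpack the ByteBuffer into a BufferedImage for Java2D processing
  (Screenshots/convertScreenShot byte-buffer buffered-image)
  buffered-image)
```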
#+name: pipeline-2
  (let [target (closest-node creature eye)
        [cam-width cam-height] (eye-dimensions eye)
        cam (Camera. cam-width cam-height)
        rot (.getWorldRotation eye)]
    (.setLocation cam (.getWorldTranslation eye))
    (.lookAtDirection
     cam                           ; this part is not a mistake and
     (.mult rot Vector3f/UNIT_X)   ; is consistent with using Z in
     (.mult rot Vector3f/UNIT_Y))  ; blender as the UP vector.
    (.setFrustumPerspective
     cam 45 (/ (.getWidth cam) (.getHeight cam)) 1 1000)
    (bind-sense target cam) cam))
#+end_src


** The Retina

An eye is a surface (the retina) which contains many discrete sensors
to detect light. These sensors can have different light-sensing
properties. In humans, each discrete sensor is sensitive to red,
blue, green, or gray. These different types of sensors can have
different spatial distributions along the retina. In humans, there is
a fovea in the center of the retina which has a very high density of
color sensors, and a blind spot which has no sensors at all. Sensor
     0xFFFFFF retinal-profile})
#+end_src

The numbers that serve as keys in the map determine a sensor's
relative sensitivity to the channels red, green, and blue. These
sensitivity values are packed into an integer in the order =|_|R|G|B|=
in 8-bit fields. The RGB values of a pixel in the image are added
together with these sensitivities as linear weights. Therefore,
0xFF0000 means sensitive to red only while 0xFFFFFF means sensitive to
all colors equally (gray).

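To make the weighting concrete, here is a small illustrative sketch;
the names =sensitivity-weights= and =sensor-response= are my own, not
part of the original code, and the normalization by 255 is one
reasonable choice among several:

```clojure
(defn sensitivity-weights
  "Unpack an 0xRRGGBB sensitivity integer into [r g b] weights."
  [sensitivity]
  [(bit-and 0xFF (bit-shift-right sensitivity 16))
   (bit-and 0xFF (bit-shift-right sensitivity 8))
   (bit-and 0xFF sensitivity)])

(defn sensor-response
  "Sum a pixel's RGB channels weighted by a sensor's sensitivity,
   normalized so a pure-red sensor (0xFF0000) reports exactly the
   red channel of the pixel."
  [sensitivity pixel]
  (quot (reduce + (map * (sensitivity-weights sensitivity)
                         (sensitivity-weights pixel)))
        255))

;; a red-only sensor sees only the red field of the pixel:
(sensor-response 0xFF0000 0x112233) ; => 17 (0x11)
;; a gray sensor (0xFFFFFF) sums all three channels:
(sensor-response 0xFFFFFF 0x010203) ; => 6
```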
For convenience I've defined a few symbols for the more common
  (if-let [eye-map (meta-data eye "eye")]
    (map-vals
     load-image
     (eval (read-string eye-map)))))

(defn eye-dimensions
  "Returns [width, height] determined by the metadata of the eye."
  [#^Spatial eye]
  (let [dimensions
        (map #(vector (.getWidth %) (.getHeight %))
             (vals (retina-sensor-profile eye)))]
    [(apply max (map first dimensions))
     (apply max (map second dimensions))]))
#+end_src

* Eye Creation

First off, get the children of the "eyes" empty node to find all the
eyes the creature has.
#+name: eye-node
#+begin_src clojure
(defvar
      (.addProcessor (vision-pipeline continuation))
      (.attachScene (.getRootNode world)))))
#+end_src

The eye's continuation function should register the viewport with the
simulation the first time it is called, use the CPU to extract the
appropriate pixels from the rendered image, and weight them by each
sensor's sensitivity. I have the option to do this processing in
native code for a slight gain in speed. I could also do it on the GPU
for a massive gain in speed. =(vision-kernel)= generates a list of
such continuation functions, one for each channel of the eye.

#+name: kernel
#+begin_src clojure
(in-ns 'cortex.vision)

simulation or the simulated senses, but can be annoying.
=(gen-fix-display)= restores the in-simulation display.

** Vision!

All the hard work has been done; all that remains is to apply
=(vision-kernel)= to each eye in the creature and gather the results
into one list of functions.

#+name: main
#+begin_src clojure
(defn vision!
  "Returns a function which returns visual sensory data when called
  inside a running simulation."
  [#^Node creature & {skip :skip :or {skip 0}}]
  (reduce
   concat
   (for [eye (eyes creature)]
     (vision-kernel creature eye))))
simulation in a single creature or for simulating multiple creatures,
each with their own sense of vision.

** Adding Vision to the Worm

To the worm from the last post, I add a new node that describes its
eyes.

#+attr_html: width=755
#+caption: The worm with newly added empty nodes describing a single eye.
[[../images/worm-with-eye.png]]

The node highlighted in yellow is the root-level "eyes" node. It has
a single child, highlighted in orange, which describes a single
eye. This is the "eye" node. It is placed so that the worm will have
an eye located in the center of the flat portion of its lower
hemispherical section.

The two nodes which are not highlighted describe the single joint of
the worm.

The metadata of the eye-node is:

#+begin_src clojure :results verbatim :exports both
(cortex.sense/meta-data
 (.getChild (.getChild (cortex.test.body/worm) "eyes") "eye") "eye")
#+end_src

#+results:
: "(let [retina \"Models/test-creature/retina-small.png\"]
:   {:all retina :red retina :green retina :blue retina})"
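
The string above is itself Clojure source; as in
=retina-sensor-profile=, reading and then evaluating it recovers the
channel map. A quick REPL check of that round trip:

```clojure
;; the eye metadata string is Clojure code: read-string parses it and
;; eval produces the channel -> image-path map used by the retina.
(eval (read-string
       "(let [retina \"Models/test-creature/retina-small.png\"]
          {:all retina :red retina :green retina :blue retina})"))
;; => {:all   "Models/test-creature/retina-small.png"
;;     :red   "Models/test-creature/retina-small.png"
;;     :green "Models/test-creature/retina-small.png"
;;     :blue  "Models/test-creature/retina-small.png"}
```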
   (map #(% world) vision)
   (File. "/home/r/proj/cortex/render/worm-vision"))
  (fix-display world)))))
#+end_src

The world consists of the worm and a flat gray floor. I can shoot
red, green, blue, and white cannonballs at the worm. The worm is
initially looking down at the floor, and there is no gravity. My
perspective (the Main View), the worm's perspective (Worm View), and
the four sensor channels that comprise the worm's eye are all saved
frame-by-frame to disk.

* Demonstration of Vision
#+begin_html
<div class="figure">
<video controls="controls" width="755">
  <source src="../video/worm-vision.ogg" type="video/ogg"
          preload="none" poster="../images/aurellem-1280x480.png" />
</video>
<p>Simulated Vision in a Virtual Environment</p>
</div>
#+end_html

** Generate the Worm Video from Frames
#+name: magick2
#+begin_src clojure
(ns cortex.video.magick2
  (:import java.io.File)
  (:use clojure.contrib.shell-out))
#+begin_src sh :results silent
cd /home/r/proj/cortex/render/worm-vision
ffmpeg -r 25 -b 9001k -i out/%07d.png -vcodec libtheora worm-vision.ogg
#+end_src

* Headers

#+name: vision-header
#+begin_src clojure
(ns cortex.vision