#+title: Simulated Sense of Sight
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulated sight for AI research using JMonkeyEngine3 and clojure
#+keywords: computer vision, jMonkeyEngine3, clojure
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes :exports both

* Vision

Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see its
own version of the world depending on where it is.

Making these simulated eyes a reality is fairly simple because
jMonkeyEngine already contains extensive support for multiple views
of the same 3D simulated world. jMonkeyEngine has this support
because it is necessary to create games with split-screen
views. Multiple views are also used to create efficient
pseudo-reflections by rendering the scene from a certain perspective
and then projecting it back onto a surface in the 3D world.

#+caption: jMonkeyEngine supports multiple views to enable split-screen games, like GoldenEye
[[../images/goldeneye-4-player.png]]

* Brief Description of jMonkeyEngine's Rendering Pipeline

jMonkeyEngine allows you to create a =ViewPort=, which represents a
view of the simulated world. You can create as many of these as you
want. Every frame, the =RenderManager= iterates through each
=ViewPort=, rendering the scene on the GPU. For each =ViewPort= there
is a =FrameBuffer= which represents the rendered image on the GPU.

Each =ViewPort= can have any number of attached =SceneProcessor=
objects, which are called every time a new frame is rendered. A
=SceneProcessor= receives a =FrameBuffer= and can do whatever it wants
to the data. Often this consists of invoking GPU-specific operations
on the rendered image. The =SceneProcessor= can also copy the GPU
image data to RAM and process it with the CPU.

* The Vision Pipeline

Each eye in the simulated creature needs its own =ViewPort= so that
it can see the world from its own perspective. To this =ViewPort=, I
add a =SceneProcessor= that feeds the visual data to any arbitrary
continuation function for further processing. That continuation
function may perform both CPU and GPU operations on the data. To make
this easy for the continuation function, the =SceneProcessor=
maintains appropriately sized buffers in RAM to hold the data. It does
not do any copying from the GPU to the CPU itself.

#+name: pipeline-1
#+begin_src clojure
(defn vision-pipeline
  "Create a SceneProcessor object which wraps a vision processing
   continuation function. The continuation is a function that takes
   [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
   each of which has already been appropriately sized."
  [continuation]
  (let [byte-buffer (atom nil)
        renderer (atom nil)
        image (atom nil)]
    (proxy [SceneProcessor] []
      (initialize
        [renderManager viewPort]
        (let [cam (.getCamera viewPort)
              width (.getWidth cam)
              height (.getHeight cam)]
          (reset! renderer (.getRenderer renderManager))
          ;; four bytes per pixel
          (reset! byte-buffer
                  (BufferUtils/createByteBuffer
                   (* width height 4)))
          (reset! image (BufferedImage.
                         width height
                         BufferedImage/TYPE_4BYTE_ABGR))))
      (isInitialized [] (not (nil? @byte-buffer)))
      (reshape [_ _ _])
      (preFrame [_])
      (postQueue [_])
      (postFrame
        [#^FrameBuffer fb]
        (.clear @byte-buffer)
        (continuation @renderer fb @byte-buffer @image))
      (cleanup []))))
#+end_src

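As a rough illustration of the buffer sizing in =(vision-pipeline)=,
the CPU-side containers can be allocated without jMonkeyEngine at
all. This is only a sketch under stated assumptions:
=make-cpu-buffers= is a hypothetical name, and
=ByteBuffer/allocateDirect= stands in for jMonkeyEngine's
=BufferUtils/createByteBuffer=.

#+begin_src clojure
(import 'java.nio.ByteBuffer
        'java.awt.image.BufferedImage)

(defn make-cpu-buffers
  "Hypothetical helper (not part of cortex): allocate the CPU-side
   containers for a width x height view, four bytes per pixel."
  [width height]
  {:byte-buffer (ByteBuffer/allocateDirect (* width height 4))
   :image (BufferedImage. width height BufferedImage/TYPE_4BYTE_ABGR)})

(def buffers (make-cpu-buffers 640 480))

;; 640 * 480 pixels at 4 bytes each
(.capacity ^ByteBuffer (:byte-buffer buffers)) ; => 1228800
#+end_src

Allocating once and reusing the same containers every frame is what
lets the continuation avoid per-frame allocation.
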
The continuation function given to =(vision-pipeline)= above will be
given a =Renderer= and three containers for image data. The
=FrameBuffer= references the GPU image data, but it cannot be used
directly on the CPU. The =ByteBuffer= and =BufferedImage= are
initially "empty" but are sized to hold the data in the
=FrameBuffer=. I call transferring the GPU image data to the CPU
structures "mixing" the image data. I have provided three functions to
do this mixing.

#+name: pipeline-2
#+begin_src clojure
(defn frameBuffer->byteBuffer!
  "Transfer the data in the graphics card (Renderer, FrameBuffer) to
   the CPU (ByteBuffer)."
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb]
  (.readFrameBuffer r fb bb) bb)

(defn byteBuffer->bufferedImage!
  "Convert the C-style BGRA image data in the ByteBuffer bb to the AWT
   style ABGR image data and place it in BufferedImage bi."
  [#^ByteBuffer bb #^BufferedImage bi]
  (Screenshots/convertScreenShot bb bi) bi)

(defn BufferedImage!
  "Continuation which will grab the buffered image from the materials
   provided by (vision-pipeline)."
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer bb #^BufferedImage bi]
  (byteBuffer->bufferedImage!
   (frameBuffer->byteBuffer! r fb bb) bi))
#+end_src

Note that it is possible to write vision processing algorithms
entirely in terms of =BufferedImage= inputs. Just compose that
=BufferedImage= algorithm with =(BufferedImage!)=. However, a vision
processing algorithm that is entirely hosted on the GPU does not have
to pay for this convenience.

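The composition mentioned above can be tried without a running
simulation. In this sketch, =stub-buffered-image!= is a hypothetical
stand-in for =(BufferedImage!)= that skips the GPU transfer and just
returns the image it is handed, and =count-red-pixels= is a toy
algorithm invented for the example.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn stub-buffered-image!
  "Hypothetical stand-in for (BufferedImage!): ignores the renderer,
   frame buffer and byte buffer, and returns the BufferedImage."
  [r fb bb bi]
  bi)

(defn count-red-pixels
  "Toy vision algorithm written purely in terms of a BufferedImage:
   count the pixels whose red channel is fully saturated."
  [^BufferedImage image]
  (count
   (for [x (range (.getWidth image))
         y (range (.getHeight image))
         :when (= 0xFF (bit-and 0xFF
                                (bit-shift-right (.getRGB image x y) 16)))]
     [x y])))

;; The composed function has the four-argument shape that
;; (vision-pipeline) expects of a continuation.
(def red-counter
  (comp count-red-pixels stub-buffered-image!))

(def test-image
  (doto (BufferedImage. 4 4 BufferedImage/TYPE_INT_RGB)
    (.setRGB 1 2 0xFF0000)
    (.setRGB 3 3 0xFF00FF)))

(red-counter nil nil nil test-image) ; => 2
#+end_src
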
* COMMENT asdasd

(vision creature) will take an optional :skip argument which will
inform the continuations in the scene processor to skip the given
number of cycles. 0 means that no cycles will be skipped.

(vision creature) will return [init-functions sensor-functions].
The init-functions are each single-arg functions that take the
world and register the cameras, and each must be called before the
corresponding sensor-function. Each init-function returns the
viewport for that eye, which can be manipulated, saved, etc. Each
sensor-function is a thunk and will return data in the same
format as the tactile-sensor functions; the structure is
[topology, sensor-data]. Internally, these sensor-functions
maintain a reference to sensor-data which is periodically updated
by the continuation function established by its init-function.
They can be queried every cycle, but their information may not
necessarily be different every cycle.

* Physical Eyes

The vision pipeline described above handles the flow of rendered
images. Now, we need simulated eyes to serve as the source of these
images.

An eye is described in Blender in the same way as a joint. Eyes are
zero-dimensional empty objects with no geometry whose local coordinate
system determines the orientation of the resulting eye. All eyes are
children of a parent node named "eyes" just as all joints have a
parent named "joints". An eye binds to the nearest physical object
with =(bind-sense)=.

#+name: add-eye
#+begin_src clojure
(in-ns 'cortex.vision)

;; Quaternion and Matrix3f are needed by blender-rotation-correction.
(import '(com.jme3.math Vector3f Quaternion Matrix3f))

(def blender-rotation-correction
  (doto (Quaternion.)
    (.fromRotationMatrix
     (doto (Matrix3f.)
       (.setColumn 0
                   (Vector3f. 1 0 0))
       (.setColumn 1
                   (Vector3f. 0 -1 0))
       (.setColumn 2
                   (Vector3f. 0 0 -1))))))

;; The source continues here with a second, unfinished rotation
;; matrix. It is kept as a comment so that the file still loads:
;;
;;   (doto (Matrix3f.)
;;     (.setColumn 0
;;                 (Vector3f.

(defn add-eye!
  "Create a Camera centered on the current position of 'eye which
   follows the closest physical node in 'creature. The camera will
   point in the X direction and use the Z vector as up as determined
   by the rotation of these vectors in blender coordinate space. Use
   XZY rotation for the node in blender."
  [#^Node creature #^Spatial eye]
  (let [target (closest-node creature eye)
        [cam-width cam-height] (eye-dimensions eye)
        cam (Camera. cam-width cam-height)
        rot (.getWorldRotation eye)]
    (.setLocation cam (.getWorldTranslation eye))
    (.lookAtDirection cam (.mult rot Vector3f/UNIT_X)
                      ;; this part is consistent with using Z in
                      ;; blender as the UP vector.
                      (.mult rot Vector3f/UNIT_Y))

    (println-repl "eye unit-z ->" (.mult rot Vector3f/UNIT_Z))
    (println-repl "eye unit-y ->" (.mult rot Vector3f/UNIT_Y))
    (println-repl "eye unit-x ->" (.mult rot Vector3f/UNIT_X))
    (.setFrustumPerspective
     cam 45 (/ (.getWidth cam) (.getHeight cam)) 1 1000)
    (bind-sense target cam) cam))
#+end_src

Here, the camera is created based on metadata on the eye-node and
attached to the nearest physical object with =(bind-sense)=.

** The Retina

An eye is a surface (the retina) which contains many discrete sensors
to detect light. These sensors can have different light-sensing
properties. In humans, each discrete sensor is sensitive to red,
blue, green, or gray. These different types of sensors can have
different spatial distributions along the retina. In humans, there is
a fovea in the center of the retina which has a very high density of
color sensors, and a blind spot which has no sensors at all. Sensor
density decreases in proportion to distance from the fovea.

I want to be able to model any retinal configuration, so my eye-nodes
in Blender contain metadata pointing to images that describe the
precise position of the individual sensors using white pixels. The
metadata also describes the precise sensitivity to light that the
sensors described in the image have. An eye can contain any number of
these images. For example, the metadata for an eye might look like
this:

#+begin_src clojure
{0xFF0000 "Models/test-creature/retina-small.png"}
#+end_src

#+caption: The retinal profile image "Models/test-creature/retina-small.png". White pixels are photo-sensitive elements. The distribution of white pixels is denser in the middle and falls off at the edges and is inspired by the human retina.
[[../assets/Models/test-creature/retina-small.png]]

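To make the "white pixels mark sensors" convention concrete, here is
a minimal sketch that recovers sensor positions from an image. The
helper name =white-pixel-coordinates= is hypothetical and the
three-by-three image is invented for the example; cortex's own
=(white-coordinates)= may work differently.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn white-pixel-coordinates
  "Hypothetical helper: return [x y] for every pure-white pixel,
   scanning column by column."
  [^BufferedImage image]
  (vec
   (for [x (range (.getWidth image))
         y (range (.getHeight image))
         ;; mask off the alpha byte before comparing
         :when (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image x y)))]
     [x y])))

;; A tiny stand-in for a retinal profile image with two sensors.
(def tiny-retina
  (doto (BufferedImage. 3 3 BufferedImage/TYPE_INT_RGB)
    (.setRGB 1 1 0xFFFFFF)
    (.setRGB 2 0 0xFFFFFF)))

(white-pixel-coordinates tiny-retina) ; => [[1 1] [2 0]]
#+end_src
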
Together, the number 0xFF0000 and the image above describe the
placement of red-sensitive sensory elements.

Metadata to very crudely approximate a human eye might be something
like this:

#+begin_src clojure
(let [retinal-profile "Models/test-creature/retina-small.png"]
  {0xFF0000 retinal-profile
   0x00FF00 retinal-profile
   0x0000FF retinal-profile
   0xFFFFFF retinal-profile})
#+end_src

The numbers that serve as keys in the map determine a sensor's
relative sensitivity to the channels red, green, and blue. These
sensitivity values are packed into an integer in the order _RGB in
8-bit fields. The RGB values of a pixel in the image are added
together with these sensitivities as linear weights. Therefore,
0xFF0000 means sensitive to red only while 0xFFFFFF means sensitive to
all colors equally (gray).

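The linear weighting described above can be sketched as follows. The
names =unpack-rgb= and =weighted-response= are hypothetical, and
dividing by the total weight is an assumption made here to keep the
result in the range 0 to 255. Note that =(vision-kernel)= as listed
later in this document applies the sensitivity with =bit-and= instead,
so this follows the prose description rather than that code.

#+begin_src clojure
(defn unpack-rgb
  "Split a packed _RGB integer into its [r g b] 8-bit fields."
  [packed]
  [(bit-and 0xFF (bit-shift-right packed 16))
   (bit-and 0xFF (bit-shift-right packed 8))
   (bit-and 0xFF packed)])

(defn weighted-response
  "Weight a pixel's channels by a sensor's sensitivity and sum them.
   Normalizing by the total weight is an assumption of this sketch."
  [sensitivity pixel]
  (let [weights (unpack-rgb sensitivity)
        total (reduce + weights)]
    (if (zero? total)
      0
      (quot (reduce + (map * weights (unpack-rgb pixel))) total))))

(weighted-response 0xFF0000 0x336699) ; => 51, the red field 0x33
(weighted-response 0xFFFFFF 0x336699) ; => 102, the channel average
#+end_src
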
For convenience I've defined a few symbols for the more common
sensitivity values.

#+name: sensitivity
#+begin_src clojure
(defvar sensitivity-presets
  {:all    0xFFFFFF
   :red    0xFF0000
   :blue   0x0000FF
   :green  0x00FF00}
  "Retinal sensitivity presets for sensors that extract one channel
   (:red :blue :green) or average all channels (:all)")
#+end_src

** Metadata Processing

=(retina-sensor-profile)= extracts a map from the eye-node in the same
format as the example maps above. =(eye-dimensions)= finds the
dimensions of the smallest image required to contain all the retinal
sensor maps.

#+begin_src clojure
(defn retina-sensor-profile
  "Return a map of pixel sensitivity numbers to BufferedImages
   describing the distribution of light-sensitive components of this
   eye. :red, :green, :blue, :all are already defined as extracting
   the red, green, blue, and average components respectively."
  [#^Spatial eye]
  (if-let [eye-map (meta-data eye "eye")]
    (map-vals
     load-image
     (eval (read-string eye-map)))))

(defn eye-dimensions
  "Returns [width, height] specified in the metadata of the eye"
  [#^Spatial eye]
  (let [dimensions
        (map #(vector (.getWidth %) (.getHeight %))
             (vals (retina-sensor-profile eye)))]
    [(apply max (map first dimensions))
     (apply max (map second dimensions))]))
#+end_src

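The "smallest image that contains all the sensor maps" rule can be
checked in isolation with plain =BufferedImage= objects. This is a
sketch only: =bounding-dimensions= is a hypothetical helper, and the
image sizes are made up.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn bounding-dimensions
  "Hypothetical helper: [width height] of the smallest image that can
   contain every image in images."
  [images]
  [(apply max (map #(.getWidth ^BufferedImage %) images))
   (apply max (map #(.getHeight ^BufferedImage %) images))])

;; The widest image sets the width, the tallest sets the height.
(bounding-dimensions
 [(BufferedImage. 64 32 BufferedImage/TYPE_INT_RGB)
  (BufferedImage. 48 48 BufferedImage/TYPE_INT_RGB)]) ; => [64 48]
#+end_src

The width and height are maximized independently, so the result need
not match the size of any single retinal image.
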
* Eye Creation

First off, get the children of the "eyes" empty node to find all the
eyes the creature has.

#+begin_src clojure
(defvar
  ^{:arglists '([creature])}
  eyes
  (sense-nodes "eyes")
  "Return the children of the creature's \"eyes\" node.")
#+end_src

Then, add the camera created by =(add-eye!)= to the simulation by
creating a new viewport.

#+begin_src clojure
(defn add-camera!
  "Add a camera to the world, calling continuation on every frame
   produced."
  [#^Application world camera continuation]
  (let [width (.getWidth camera)
        height (.getHeight camera)
        render-manager (.getRenderManager world)
        viewport (.createMainView render-manager "eye-view" camera)]
    (doto viewport
      (.setClearFlags true true true)
      (.setBackgroundColor ColorRGBA/Black)
      (.addProcessor (vision-pipeline continuation))
      (.attachScene (.getRootNode world)))))
#+end_src

The continuation function registers the viewport with the simulation
the first time it is called, and uses the CPU to extract the
appropriate pixels from the rendered image and weight them by each
sensor's sensitivity. I have the option to do this filtering in native
code for a slight gain in speed. I could also do it on the GPU for a
massive gain in speed. =(vision-kernel)= generates a list of such
continuation functions, one for each channel of the eye.

#+begin_src clojure
(in-ns 'cortex.vision)

(defrecord attached-viewport [vision-fn viewport-fn]
  clojure.lang.IFn
  (invoke [this world] (vision-fn world))
  (applyTo [this args] (apply vision-fn args)))

(defn vision-kernel
  "Returns a list of functions, each of which will return a color
   channel's worth of visual information when called inside a running
   simulation."
  [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]
  (let [retinal-map (retina-sensor-profile eye)
        camera (add-eye! creature eye)
        vision-image
        (atom
         (BufferedImage. (.getWidth camera)
                         (.getHeight camera)
                         BufferedImage/TYPE_BYTE_BINARY))
        register-eye!
        (runonce
         (fn [world]
           (add-camera!
            world camera
            (let [counter (atom 0)]
              (fn [r fb bb bi]
                ;; only refresh the image every (inc skip) frames
                (if (zero? (rem (swap! counter inc) (inc skip)))
                  (reset! vision-image
                          (BufferedImage! r fb bb bi))))))))]
    (vec
     (map
      (fn [[key image]]
        (let [whites (white-coordinates image)
              topology (vec (collapse whites))
              ;; a keyword key is looked up in the presets; a numeric
              ;; key is used directly as the mask
              mask (sensitivity-presets key key)]
          (attached-viewport.
           (fn [world]
             (register-eye! world)
             (vector
              topology
              (vec
               (for [[x y] whites]
                 (bit-and
                  mask (.getRGB @vision-image x y))))))
           register-eye!)))
      retinal-map))))

(defn gen-fix-display
  "Create a function to call to restore a simulation's display when it
   is disrupted by a Viewport."
  []
  (runonce
   (fn [world]
     (add-camera! world (.getCamera world) no-op))))

#+end_src

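The =:skip= handling inside =(vision-kernel)= can be looked at on its
own. The wrapper below is a hypothetical helper, not part of cortex;
it reuses the same counter test, =(zero? (rem (swap! counter inc) (inc skip)))=,
to show which calls actually do work.

#+begin_src clojure
(defn every-nth-frame
  "Hypothetical helper: wrap f so that it only runs on calls whose
   1-based index is a multiple of (inc skip). Other calls return nil."
  [skip f]
  (let [counter (atom 0)]
    (fn [& args]
      (when (zero? (rem (swap! counter inc) (inc skip)))
        (apply f args)))))

;; With skip = 2, two calls are skipped for every call that runs.
(let [process (every-nth-frame 2 identity)]
  (vec (map process (range 1 7)))) ; => [nil nil 3 nil nil 6]
#+end_src

With =skip= set to 0 the test passes on every call, which matches the
statement that 0 means no cycles are skipped.
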
Note that since each of the functions generated by =(vision-kernel)=
shares the same =(register-eye!)= function, the eye will be registered
only once, the first time any of the functions from the list returned
by =(vision-kernel)= is called. Each of the functions returned by
=(vision-kernel)= also allows access to the =ViewPort= through which
it receives images.

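The register-only-once behavior can be sketched with a small
decorator. =run-once= below is a hypothetical stand-in for cortex's
=(runonce)=, whose source is not shown in this document; caching and
returning the first result on later calls is an assumption of the
sketch.

#+begin_src clojure
(defn run-once
  "Hypothetical stand-in for runonce: call f on the first invocation
   only, and return that first result on every later invocation."
  [f]
  (let [sentinel (Object.)
        result (atom sentinel)]
    (fn [& args]
      (locking sentinel
        (when (identical? @result sentinel)
          (reset! result (apply f args)))
        @result))))

(def registrations (atom 0))

(def register!
  (run-once (fn [world] (swap! registrations inc) :viewport)))

;; Two channel functions that share one registration function.
(defn red-channel [world] (register! world) :red-data)
(defn blue-channel [world] (register! world) :blue-data)

(red-channel :world)
(blue-channel :world)
(red-channel :world)

@registrations ; => 1
#+end_src
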
The in-game display can be disrupted by all the viewports that the
functions created by =(vision-kernel)= add. This doesn't affect the
simulation or the simulated senses, but can be annoying.
=(gen-fix-display)= restores the in-simulation display.

** Vision!

All the hard work has been done; all that remains is to apply
=(vision-kernel)= to each eye in the creature and gather the results
into one list of functions.

#+begin_src clojure
(defn vision!
  "Returns a list of functions, each of which returns visual sensory
   data when called inside a running simulation."
  [#^Node creature & {skip :skip :or {skip 0}}]
  (reduce
   concat
   (for [eye (eyes creature)]
     ;; pass :skip through so that the option actually takes effect
     (vision-kernel creature eye :skip skip))))
#+end_src

** Visualization of Vision

It's vital to have a visual representation for each sense. Here I use
=(view-sense)= to construct a function that will create a display for
visual data.

#+begin_src clojure
(defn view-vision
  "Creates a function which accepts a list of visual sensor-data and
   displays each element of the list to the screen."
  []
  (view-sense
   (fn
     [[coords sensor-data]]
     (let [image (points->image coords)]
       (dorun
        (for [i (range (count coords))]
          (.setRGB image ((coords i) 0) ((coords i) 1)
                   (sensor-data i))))
       image))))
#+end_src

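The drawing step inside =(view-vision)= can be exercised without
cortex. In this sketch, =paint-sensor-data= is a hypothetical helper
that sizes a fresh image to fit the coordinates (an assumption about
what =(points->image)= provides) and then writes each sensor value at
its coordinate.

#+begin_src clojure
(import 'java.awt.image.BufferedImage)

(defn paint-sensor-data
  "Hypothetical helper: draw each sensor value at its [x y] coordinate
   on a new image just large enough to hold every coordinate."
  [coords sensor-data]
  (let [width (inc (apply max (map first coords)))
        height (inc (apply max (map second coords)))
        image (BufferedImage. width height BufferedImage/TYPE_INT_RGB)]
    (doseq [[[x y] value] (map vector coords sensor-data)]
      (.setRGB image x y value))
    image))

;; Two sensors: a red reading at [0 0] and a blue reading at [2 1].
(def painted
  (paint-sensor-data [[0 0] [2 1]] [0xFF0000 0x0000FF]))

;; Mask off the alpha byte that getRGB adds.
(bit-and 0xFFFFFF (.getRGB ^BufferedImage painted 2 1)) ; => 255
#+end_src
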
* Tests

** Basic Test

This is a basic test for the vision system. It only tests the
vision-pipeline and does not deal with loading eyes from a Blender
file. The code creates two videos of the same rotating cube from
different angles.

#+name: test-1
#+begin_src clojure
(in-ns 'cortex.test.vision)

(defn test-two-eyes
  "Testing vision:
   Tests the vision system by creating two views of the same rotating
   object from different angles and displaying both of those views in
   JFrames.

   You should see a rotating cube, and two windows,
   each displaying a different view of the cube."
  []
  (let [candy
        (box 1 1 1 :physical? false :color ColorRGBA/Blue)]
    (world
     (doto (Node.)
       (.attachChild candy))
     {}
     (fn [world]
       (let [cam (.clone (.getCamera world))
             width (.getWidth cam)
             height (.getHeight cam)]
         (add-camera! world cam
                      (comp
                       (view-image
                        (File. "/home/r/proj/cortex/render/vision/1"))
                       BufferedImage!))
         (add-camera! world
                      (doto (.clone cam)
                        (.setLocation (Vector3f. -10 0 0))
                        (.lookAt Vector3f/ZERO Vector3f/UNIT_Y))
                      (comp
                       (view-image
                        (File. "/home/r/proj/cortex/render/vision/2"))
                       BufferedImage!))
         ;; This is here to restore the main view
         ;; after the other views have completed processing
         (add-camera! world (.getCamera world) no-op)))
     (fn [world tpf]
       (.rotate candy (* tpf 0.2) 0 0)))))
#+end_src

#+begin_html
<div class="figure">
<video controls="controls" width="755">
  <source src="../video/spinning-cube.ogg" type="video/ogg"
          preload="none" poster="../images/aurellem-1280x480.png" />
</video>
<p>A rotating cube viewed from two different perspectives.</p>
</div>
#+end_html

Creating multiple eyes like this can be used for stereoscopic vision
simulation in a single creature or for simulating multiple creatures,
each with their own sense of vision.

** Adding Vision to the Worm

To the worm from the last post, we add a new node that describes its
eyes.

#+attr_html: width=755
#+caption: The worm with newly added empty nodes describing a single eye.
[[../images/worm-with-eye.png]]

The node highlighted in yellow is the root level "eyes" node. It has
a single node, highlighted in orange, which describes a single
eye. This is the "eye" node. The two nodes which are not highlighted
describe the single joint of the worm.

The metadata of the eye-node is:

#+begin_src clojure :results verbatim :exports both
(cortex.sense/meta-data
 (.getChild
  (.getChild (cortex.test.body/worm)
             "eyes") "eye") "eye")
#+end_src

#+results:
: "(let [retina \"Models/test-creature/retina-small.png\"]
:     {:all retina :red retina :green retina :blue retina})"

This is the approximation to the human eye described earlier.

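Because the metadata is stored as a string of Clojure code, it has to
be read and evaluated before use, as =(retina-sensor-profile)= does
with =(eval (read-string eye-map))=. The sketch below repeats that
step on the string shown above and then resolves the keyword keys to
numeric sensitivities. =presets= is a local copy of
=sensitivity-presets= and =resolve-sensitivity= is a hypothetical
helper; image loading is left out.

#+begin_src clojure
(def eye-metadata
  "(let [retina \"Models/test-creature/retina-small.png\"]
     {:all retina :red retina :green retina :blue retina})")

;; Local copy of the presets, so the sketch runs on its own.
(def presets
  {:all 0xFFFFFF :red 0xFF0000 :blue 0x0000FF :green 0x00FF00})

;; Read the string as code, then evaluate it to get the map.
(def eye-map (eval (read-string eye-metadata)))

(defn resolve-sensitivity
  "Hypothetical helper: map a preset keyword to its number, and pass
   anything else (such as a raw 0xRRGGBB value) through unchanged."
  [k]
  (presets k k))

(def numeric-eye-map
  (into {} (for [[k path] eye-map]
             [(resolve-sensitivity k) path])))

(numeric-eye-map 0xFF0000) ; => "Models/test-creature/retina-small.png"
#+end_src

Evaluating metadata runs arbitrary code, so this approach only makes
sense for model files that you trust.
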
#+begin_src clojure
(in-ns 'cortex.test.vision)

(import com.aurellem.capture.Capture)

(defn test-worm-vision []
  (let [the-worm (doto (worm) (body!))
        vision (vision! the-worm)
        vision-display (view-vision)
        fix-display (gen-fix-display)
        me (sphere 0.5 :color ColorRGBA/Blue :physical? false)
        x-axis
        (box 1 0.01 0.01 :physical? false :color ColorRGBA/Red
             :position (Vector3f. 0 -5 0))
        y-axis
        (box 0.01 1 0.01 :physical? false :color ColorRGBA/Green
             :position (Vector3f. 0 -5 0))
        z-axis
        (box 0.01 0.01 1 :physical? false :color ColorRGBA/Blue
             :position (Vector3f. 0 -5 0))]

    (world (nodify [(floor) the-worm x-axis y-axis z-axis me])
           standard-debug-controls
           (fn [world]
             (light-up-everything world)
             ;; add a view from the worm's perspective
             (add-camera!
              world
              (add-eye! the-worm
                        (.getChild
                         (.getChild the-worm "eyes") "eye"))
              (comp
               (view-image
                (File. "/home/r/proj/cortex/render/worm-vision/worm-view"))
               BufferedImage!))
             (set-gravity world Vector3f/ZERO)
             (Capture/captureVideo
              world
              (File. "/home/r/proj/cortex/render/worm-vision/main-view")))
           (fn [world _]
             (.setLocalTranslation me (.getLocation (.getCamera world)))
             (vision-display
              (map #(% world) vision)
              (File. "/home/r/proj/cortex/render/worm-vision"))
             (fix-display world)))))
#+end_src

* Headers

#+name: vision-header
#+begin_src clojure
(ns cortex.vision
  "Simulate the sense of vision in jMonkeyEngine3. Enables multiple
  eyes from different positions to observe the same world, and pass
  the observed data to any arbitrary function. Automatically reads
  eye-nodes from specially prepared blender files and instantiates
  them in the world as actual eyes."
  {:author "Robert McIntyre"}
  (:use (cortex world sense util))
  (:use clojure.contrib.def)
  (:import com.jme3.post.SceneProcessor)
  (:import (com.jme3.util BufferUtils Screenshots))
  (:import java.nio.ByteBuffer)
  (:import java.awt.image.BufferedImage)
  (:import (com.jme3.renderer ViewPort Camera))
  (:import com.jme3.math.ColorRGBA)
  (:import com.jme3.renderer.Renderer)
  (:import com.jme3.app.Application)
  (:import com.jme3.texture.FrameBuffer)
  (:import (com.jme3.scene Node Spatial)))
#+end_src

#+name: test-header
#+begin_src clojure
(ns cortex.test.vision
  (:use (cortex world sense util body vision))
  (:use cortex.test.body)
  (:import java.awt.image.BufferedImage)
  (:import javax.swing.JPanel)
  (:import javax.swing.SwingUtilities)
  (:import java.awt.Dimension)
  (:import javax.swing.JFrame)
  (:import com.jme3.math.ColorRGBA)
  (:import com.jme3.scene.Node)
  (:import com.jme3.math.Vector3f)
  (:import java.io.File))
#+end_src

- As a neat bonus, this idea behind simulated vision also enables one
  to [[../../cortex/html/capture-video.html][capture live video feeds from jMonkeyEngine]].

* COMMENT Generate Source
#+begin_src clojure :tangle ../src/cortex/vision.clj
<<eyes>>
#+end_src

#+begin_src clojure :tangle ../src/cortex/test/vision.clj
<<test-header>>
<<test-1>>
#+end_src