comparison thesis/cortex.org @ 517:68665d2c32a7

spellcheck; almost done with first draft!
author Robert McIntyre <rlm@mit.edu>
date Mon, 31 Mar 2014 00:18:26 -0400
parents ced955c3c84f
children d78f5102d693
516:ced955c3c84f 517:68665d2c32a7
57 corporeal experience, we greatly constrain the possibilities of what 57 corporeal experience, we greatly constrain the possibilities of what
58 would otherwise be an unwieldy exponential search. This extra 58 would otherwise be an unwieldy exponential search. This extra
59 constraint can be the difference between easily understanding what 59 constraint can be the difference between easily understanding what
60 is happening in a video and being completely lost in a sea of 60 is happening in a video and being completely lost in a sea of
61 incomprehensible color and movement. 61 incomprehensible color and movement.
62
63 62
64 ** The problem: recognizing actions in video is hard! 63 ** The problem: recognizing actions in video is hard!
65 64
66 Examine the following image. What is happening? As you, and indeed 65 Examine the following image. What is happening? As you, and indeed
67 very young children, can easily determine, this is an image of 66 very young children, can easily determine, this is an image of
75 Nevertheless, it is beyond the state of the art for a computer 74 Nevertheless, it is beyond the state of the art for a computer
76 vision program to describe what's happening in this image. Part of 75 vision program to describe what's happening in this image. Part of
77 the problem is that many computer vision systems focus on 76 the problem is that many computer vision systems focus on
78 pixel-level details or comparisons to example images (such as 77 pixel-level details or comparisons to example images (such as
79 \cite{volume-action-recognition}), but the 3D world is so variable 78 \cite{volume-action-recognition}), but the 3D world is so variable
80 that it is hard to descrive the world in terms of possible images. 79 that it is hard to describe the world in terms of possible images.
81 80
82 In fact, the contents of a scene may have much less to do with pixel 81 In fact, the contents of a scene may have much less to do with pixel
83 probabilities than with recognizing various affordances: things you 82 probabilities than with recognizing various affordances: things you
84 can move, objects you can grasp, spaces that can be filled. For 83 can move, objects you can grasp, spaces that can be filled. For
85 example, what processes might enable you to see the chair in figure 84 example, what processes might enable you to see the chair in figure
100 #+name: girl 99 #+name: girl
101 #+ATTR_LaTeX: :width 7cm 100 #+ATTR_LaTeX: :width 7cm
102 [[./images/wall-push.png]] 101 [[./images/wall-push.png]]
103 102
104 Each of these examples tells us something about what might be going 103 Each of these examples tells us something about what might be going
105 on in our minds as we easily solve these recognition problems. 104 on in our minds as we easily solve these recognition problems:
106 105
107 The hidden chair shows us that we are strongly triggered by cues 106 The hidden chair shows us that we are strongly triggered by cues
108 relating to the position of human bodies, and that we can determine 107 relating to the position of human bodies, and that we can determine
109 the overall physical configuration of a human body even if much of 108 the overall physical configuration of a human body even if much of
110 that body is occluded. 109 that body is occluded.
112 The picture of the girl pushing against the wall tells us that we 111 The picture of the girl pushing against the wall tells us that we
113 have common sense knowledge about the kinetics of our own bodies. 112 have common sense knowledge about the kinetics of our own bodies.
114 We know well how our muscles would have to work to maintain us in 113 We know well how our muscles would have to work to maintain us in
115 most positions, and we can easily project this self-knowledge to 114 most positions, and we can easily project this self-knowledge to
116 imagined positions triggered by images of the human body. 115 imagined positions triggered by images of the human body.
116
117 The cat tells us that imagination of some kind plays an important
118 role in understanding actions. The question is: Can we be more
119 precise about what sort of imagination is required to understand
120 these actions?
117 121
118 ** A step forward: the sensorimotor-centered approach 122 ** A step forward: the sensorimotor-centered approach
119 123
120 In this thesis, I explore the idea that our knowledge of our own 124 In this thesis, I explore the idea that our knowledge of our own
121 bodies, combined with our own rich senses, enables us to recognize 125 bodies, combined with our own rich senses, enables us to recognize
137 141
138 1. Create a physical model of the video by putting a ``fuzzy'' 142 1. Create a physical model of the video by putting a ``fuzzy''
139 model of its own body in place of the cat. Possibly also create 143 model of its own body in place of the cat. Possibly also create
140 a simulation of the stream of water. 144 a simulation of the stream of water.
141 145
142 2. Play out this simulated scene and generate imagined sensory 146 2. ``Play out'' this simulated scene and generate imagined sensory
143 experience. This will include relevant muscle contractions, a 147 experience. This will include relevant muscle contractions, a
144 close up view of the stream from the cat's perspective, and most 148 close up view of the stream from the cat's perspective, and most
145 importantly, the imagined feeling of water entering the 149 importantly, the imagined feeling of water entering the mouth.
146 mouth. The imagined sensory experience can come from a 150 The imagined sensory experience can come from a simulation of
147 simulation of the event, but can also be pattern-matched from 151 the event, but can also be pattern-matched from previous,
148 previous, similar embodied experience. 152 similar embodied experience.
149 153
150 3. The action is now easily identified as drinking by the sense of 154 3. The action is now easily identified as drinking by the sense of
151 taste alone. The other senses (such as the tongue moving in and 155 taste alone. The other senses (such as the tongue moving in and
152 out) help to give plausibility to the simulated action. Note that 156 out) help to give plausibility to the simulated action. Note that
153 the sense of vision, while critical in creating the simulation, 157 the sense of vision, while critical in creating the simulation,
158 1. Align a model of your body to the person in the image. 162 1. Align a model of your body to the person in the image.
159 163
160 2. Generate proprioceptive sensory data from this alignment. 164 2. Generate proprioceptive sensory data from this alignment.
161 165
162 3. Use the imagined proprioceptive data as a key to look up related 166 3. Use the imagined proprioceptive data as a key to look up related
163 sensory experience associated with that particular proproceptive 167 sensory experience associated with that particular proprioceptive
164 feeling (a minimal sketch of this lookup appears after this list). 168 feeling (a minimal sketch of this lookup appears after this list).
165 169
166 4. Retrieve the feeling of your bottom resting on a surface, your 170 4. Retrieve the feeling of your bottom resting on a surface, your
167 knees bent, and your leg muscles relaxed. 171 knees bent, and your leg muscles relaxed.
168 172
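A minimal sketch of the lookup in step 3 might look like the
following. Purely for illustration, proprioceptive data is reduced to
a vector of joint angles and prior experience is stored as pairs of
such vectors and the other senses felt at the same moment; this is
not =EMPATH='s actual implementation, only the idea in miniature.

#+begin_src clojure
(defn proprioceptive-distance
  "Sum of squared differences between two vectors of joint angles."
  [angles-1 angles-2]
  (reduce + (map (fn [a b] (let [d (- a b)] (* d d)))
                 angles-1 angles-2)))

(defn recall-experience
  "Return the remembered experience whose proprioceptive signature is
   closest to the imagined joint angles derived from the image."
  [experience-library imagined-angles]
  (apply min-key
         #(proprioceptive-distance (:proprioception %) imagined-angles)
         experience-library))

;; Toy experience library: each entry pairs joint angles with the
;; other senses that accompanied them during play.
(def experiences
  [{:proprioception [0.0 0.1 0.0]
    :touch :nothing :muscles :relaxed}
   {:proprioception [1.5 1.4 0.2]
    :touch :bottom-on-surface :muscles :quadriceps-then-relaxed}])

;; Aligning a model to the seated person might yield joint angles
;; like [1.6 1.3 0.1]; the lookup then retrieves the entry that was
;; gathered while sitting.
(recall-experience experiences [1.6 1.3 0.1])
#+end_src
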
192 If these systems learn about running as viewed from the side, they 196 If these systems learn about running as viewed from the side, they
193 will not automatically be able to recognize running from any other 197 will not automatically be able to recognize running from any other
194 viewpoint. 198 viewpoint.
195 199
196 Another powerful advantage is that using the language of multiple 200 Another powerful advantage is that using the language of multiple
197 body-centered rich senses to describe body-centerd actions offers a 201 body-centered rich senses to describe body-centered actions offers a
198 massive boost in descriptive capability. Consider how difficult it 202 massive boost in descriptive capability. Consider how difficult it
199 would be to compose a set of HOG filters to describe the action of 203 would be to compose a set of HOG filters to describe the action of
200 a simple worm-creature ``curling'' so that its head touches its 204 a simple worm-creature ``curling'' so that its head touches its
201 tail, and then behold the simplicity of describing this action in a 205 tail, and then behold the simplicity of describing this action in a
202 language designed for the task (listing \ref{grand-circle-intro}): 206 language designed for the task (listing \ref{grand-circle-intro}):
203 207
204 #+caption: Body-centerd actions are best expressed in a body-centered 208 #+caption: Body-centered actions are best expressed in a body-centered
205 #+caption: language. This code detects when the worm has curled into a 209 #+caption: language. This code detects when the worm has curled into a
206 #+caption: full circle. Imagine how you would replicate this functionality 210 #+caption: full circle. Imagine how you would replicate this functionality
207 #+caption: using low-level pixel features such as HOG filters! 211 #+caption: using low-level pixel features such as HOG filters!
208 #+name: grand-circle-intro 212 #+name: grand-circle-intro
209 #+begin_listing clojure 213 #+begin_listing clojure
218 (and (< 0.2 (contact worm-segment-bottom-tip tail-touch)) 222 (and (< 0.2 (contact worm-segment-bottom-tip tail-touch))
219 (< 0.2 (contact worm-segment-top-tip head-touch)))))) 223 (< 0.2 (contact worm-segment-top-tip head-touch))))))
220 #+end_src 224 #+end_src
221 #+end_listing 225 #+end_listing
222 226
223 ** =EMPATH= regognizes actions using empathy 227 ** =EMPATH= recognizes actions using empathy
224 228
225 First, I built a system for constructing virtual creatures with 229 Exploring these ideas further demands a concrete implementation, so
230 first, I built a system for constructing virtual creatures with
226 physiologically plausible sensorimotor systems and detailed 231 physiologically plausible sensorimotor systems and detailed
227 environments. The result is =CORTEX=, which is described in section 232 environments. The result is =CORTEX=, which is described in section
228 \ref{sec-2}. (=CORTEX= was built to be flexible and useful to other 233 \ref{sec-2}.
229 AI researchers; it is provided in full with detailed instructions
230 on the web [here].)
231 234
232 Next, I wrote routines which enabled a simple worm-like creature to 235 Next, I wrote routines which enabled a simple worm-like creature to
233 infer the actions of a second worm-like creature, using only its 236 infer the actions of a second worm-like creature, using only its
234 own prior sensorimotor experiences and knowledge of the second 237 own prior sensorimotor experiences and knowledge of the second
235 worm's joint positions. This program, =EMPATH=, is described in 238 worm's joint positions. This program, =EMPATH=, is described in
236 section \ref{sec-3}, and the key results of this experiment are 239 section \ref{sec-3}. Its main components are:
237 summarized below. 240
238 241 - Embodied Action Definitions :: Many otherwise complicated actions
239 I have built a system that can express the types of recognition 242 are easily described in the language of a full suite of
240 problems in a form amenable to computation. It is split into 243 body-centered, rich senses and experiences. For example,
241 four parts:
242
243 - Free/Guided Play :: The creature moves around and experiences the
244 world through its unique perspective. Many otherwise
245 complicated actions are easily described in the language of a
246 full suite of body-centered, rich senses. For example,
247 drinking is the feeling of water sliding down your throat, and 244 drinking is the feeling of water sliding down your throat, and
248 cooling your insides. It's often accompanied by bringing your 245 cooling your insides. It's often accompanied by bringing your
249 hand close to your face, or bringing your face close to water. 246 hand close to your face, or bringing your face close to water.
250 Sitting down is the feeling of bending your knees, activating 247 Sitting down is the feeling of bending your knees, activating
251 your quadriceps, then feeling a surface with your bottom and 248 your quadriceps, then feeling a surface with your bottom and
252 relaxing your legs. These body-centered action descriptions 249 relaxing your legs. These body-centered action descriptions
253 can be either learned or hard coded. 250 can be either learned or hard coded.
254 - Posture Imitation :: When trying to interpret a video or image, 251
252 - Guided Play :: The creature moves around and experiences the
253 world through its unique perspective. As the creature moves,
254 it gathers experiences that satisfy the embodied action
255 definitions.
256
257 - Posture Imitation :: When trying to interpret a video or image,
255 the creature takes a model of itself and aligns it with 258 the creature takes a model of itself and aligns it with
256 whatever it sees. This alignment can even cross species, as 259 whatever it sees. This alignment might even cross species, as
257 when humans try to align themselves with things like ponies, 260 when humans try to align themselves with things like ponies,
258 dogs, or other humans with a different body type. 261 dogs, or other humans with a different body type.
259 - Empathy :: The alignment triggers associations with 262
263 - Empathy :: The alignment triggers associations with
260 sensory data from prior experiences. For example, the 264 sensory data from prior experiences. For example, the
261 alignment itself easily maps to proprioceptive data. Any 265 alignment itself easily maps to proprioceptive data. Any
262 sounds or obvious skin contact in the video can to a lesser 266 sounds or obvious skin contact in the video can to a lesser
263 extent trigger previous experience. Segments of previous 267 extent trigger previous experience keyed to hearing or touch.
264 experiences are stitched together to form a coherent and 268 Segments of previous experiences gained from play are stitched
265 complete sensory portrait of the scene. 269 together to form a coherent and complete sensory portrait of
266 - Recognition :: With the scene described in terms of first 270 the scene.
267 person sensory events, the creature can now run its 271
268 action-identification programs on this synthesized sensory 272 - Recognition :: With the scene described in terms of
269 data, just as it would if it were actually experiencing the 273 remembered first person sensory events, the creature can now
270 scene first-hand. If previous experience has been accurately 274 run its action-identification programs (such as the one in listing
275 \ref{grand-circle-intro}) on this synthesized sensory data,
276 just as it would if it were actually experiencing the scene
277 first-hand. If previous experience has been accurately
271 retrieved, and if it is analogous enough to the scene, then 278 retrieved, and if it is analogous enough to the scene, then
272 the creature will correctly identify the action in the scene (a toy sketch of this dispatch follows below). 279 the creature will correctly identify the action in the scene (a toy sketch of this dispatch follows below).
273
274 280
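To make the Recognition step concrete, here is a toy sketch of how a
set of action predicates (such as the one in listing
\ref{grand-circle-intro}) might be run over synthesized sensory
experience. The predicates =curled?= and =resting?= below are
placeholders invented for illustration, not =EMPATH='s actual
definitions.

#+begin_src clojure
;; Hypothetical action predicates: each takes a sequence of
;; synthesized sensory moments and decides whether the action holds.
(defn curled? [experience]
  (every? #(= :head-touches-tail (:touch %)) experience))

(defn resting? [experience]
  (every? #(= :relaxed (:muscles %)) experience))

(def action-predicates
  {:curled  curled?
   :resting resting?})

(defn recognize
  "Return the name of every action whose predicate is satisfied by
   the synthesized sensory experience."
  [experience]
  (for [[action predicate?] action-predicates
        :when (predicate? experience)]
    action))

;; A toy run: two synthesized moments in which the worm's head
;; touches its tail while its muscles are relaxed.
(recognize [{:touch :head-touches-tail :muscles :relaxed}
            {:touch :head-touches-tail :muscles :relaxed}])
;; => (:curled :resting)
#+end_src
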
275 My program, =EMPATH=, uses this empathic problem-solving technique 281 My program, =EMPATH=, uses this empathic problem-solving technique
276 to interpret the actions of a simple, worm-like creature. 282 to interpret the actions of a simple, worm-like creature.
277 283
278 #+caption: The worm performs many actions during free play such as 284 #+caption: The worm performs many actions during free play such as
285 #+caption: poses by inferring the complete sensory experience 291 #+caption: poses by inferring the complete sensory experience
286 #+caption: from proprioceptive data. 292 #+caption: from proprioceptive data.
287 #+name: worm-recognition-intro 293 #+name: worm-recognition-intro
288 #+ATTR_LaTeX: :width 15cm 294 #+ATTR_LaTeX: :width 15cm
289 [[./images/worm-poses.png]] 295 [[./images/worm-poses.png]]
290
291 #+caption: From only \emph{proprioceptive} data, =EMPATH= was able to infer
292 #+caption: the complete sensory experience and classify these four poses.
293 #+caption: The last image is a composite, depicting the intermediate stages
294 #+caption: of \emph{wriggling}.
295 #+name: worm-recognition-intro-2
296 #+ATTR_LaTeX: :width 15cm
297 [[./images/empathy-1.png]]
298 296
299 Next, I developed an experiment to test the power of =CORTEX='s 297 *** Main Results
300 sensorimotor-centered language for solving recognition problems. As 298
301 a proof of concept, I wrote routines which enabled a simple 299 - After one-shot supervised training, =EMPATH= was able to recognize a
302 worm-like creature to infer the actions of a second worm-like 300 wide variety of static poses and dynamic actions---ranging from
303 creature, using only its own previous sensorimotor experiences and 301 curling in a circle to wiggling with a particular frequency ---
304 knowledge of the second worm's joints (figure 302 with 95\% accuracy.
305 \ref{worm-recognition-intro-2}). The result of this proof of 303
306 concept was the program =EMPATH=, described in section \ref{sec-3}. 304 - These results were completely independent of viewing angle
307 305 because the underlying body-centered language is fundamentally
308 ** =EMPATH= is built on =CORTEX=, en environment for making creatures. 306 viewpoint-independent; once an action is learned, it can be recognized
309 307 equally well from any viewing angle.
310 # =CORTEX= provides a language for describing the sensorimotor 308
311 # experiences of various creatures. 309 - =EMPATH= is surprisingly short; the sensorimotor-centered
310 language provided by =CORTEX= resulted in extremely economical
311 recognition routines --- about 500 lines in all --- suggesting
312 that such representations are very powerful, and often
313 indispensable for the types of recognition tasks considered here.
314
315 - Although for expediency's sake, I relied on direct knowledge of
316 joint positions in this proof of concept, it would be
317 straightforward to extend =EMPATH= so that it (more
318 realistically) infers joint positions from its visual data.
319
320 ** =EMPATH= is built on =CORTEX=, a creature builder.
312 321
313 I built =CORTEX= to be a general AI research platform for doing 322 I built =CORTEX= to be a general AI research platform for doing
314 experiments involving multiple rich senses and a wide variety and 323 experiments involving multiple rich senses and a wide variety and
315 number of creatures. I intend it to be useful as a library for many 324 number of creatures. I intend it to be useful as a library for many
316 more projects than just this thesis. =CORTEX= was necessary to meet 325 more projects than just this thesis. =CORTEX= was necessary to meet
317 a need among AI researchers at CSAIL and beyond, which is that 326 a need among AI researchers at CSAIL and beyond, which is that
318 people often will invent neat ideas that are best expressed in the 327 people often will invent neat ideas that are best expressed in the
319 language of creatures and senses, but in order to explore those 328 language of creatures and senses, but in order to explore those
320 ideas they must first build a platform in which they can create 329 ideas they must first build a platform in which they can create
321 simulated creatures with rich senses! There are many ideas that 330 simulated creatures with rich senses! There are many ideas that
322 would be simple to execute (such as =EMPATH=), but attached to them 331 would be simple to execute (such as =EMPATH= or
323 is the multi-month effort to make a good creature simulator. Often, 332 \cite{larson-symbols}), but attached to them is the multi-month
324 that initial investment of time proves to be too much, and the 333 effort to make a good creature simulator. Often, that initial
325 project must make do with a lesser environment. 334 investment of time proves to be too much, and the project must make
335 do with a lesser environment.
326 336
327 =CORTEX= is well suited as an environment for embodied AI research 337 =CORTEX= is well suited as an environment for embodied AI research
328 for three reasons: 338 for three reasons:
329 339
330 - You can create new creatures using Blender, a popular 3D modeling 340 - You can create new creatures using Blender (\cite{blender}), a
331 program. Each sense can be specified using special blender nodes 341 popular 3D modeling program. Each sense can be specified using
332 with biologically inspired paramaters. You need not write any 342 special blender nodes with biologically inspired parameters. You
333 code to create a creature, and can use a wide library of 343 need not write any code to create a creature, and can use a wide
334 pre-existing blender models as a base for your own creatures. 344 library of pre-existing blender models as a base for your own
345 creatures.
335 346
336 - =CORTEX= implements a wide variety of senses: touch, 347 - =CORTEX= implements a wide variety of senses: touch,
337 proprioception, vision, hearing, and muscle tension. Complicated 348 proprioception, vision, hearing, and muscle tension. Complicated
338 senses like touch and vision involve multiple sensory elements 349 senses like touch and vision involve multiple sensory elements
339 embedded in a 2D surface. You have complete control over the 350 embedded in a 2D surface. You have complete control over the
341 png image files. In particular, =CORTEX= implements more 352 png image files. In particular, =CORTEX= implements more
342 comprehensive hearing than any other creature simulation system 353 comprehensive hearing than any other creature simulation system
343 available. 354 available.
344 355
345 - =CORTEX= supports any number of creatures and any number of 356 - =CORTEX= supports any number of creatures and any number of
346 senses. Time in =CORTEX= dialates so that the simulated creatures 357 senses. Time in =CORTEX= dilates so that the simulated creatures
347 always precieve a perfectly smooth flow of time, regardless of 358 always perceive a perfectly smooth flow of time, regardless of
348 the actual computational load. 359 the actual computational load.
349 360
350 =CORTEX= is built on top of =jMonkeyEngine3=, which is a video game 361 =CORTEX= is built on top of =jMonkeyEngine3=
351 engine designed to create cross-platform 3D desktop games. =CORTEX= 362 (\cite{jmonkeyengine}), which is a video game engine designed to
352 is mainly written in clojure, a dialect of =LISP= that runs on the 363 create cross-platform 3D desktop games. =CORTEX= is mainly written
353 java virtual machine (JVM). The API for creating and simulating 364 in clojure, a dialect of =LISP= that runs on the java virtual
354 creatures and senses is entirely expressed in clojure, though many 365 machine (JVM). The API for creating and simulating creatures and
355 senses are implemented at the layer of jMonkeyEngine or below. For 366 senses is entirely expressed in clojure, though many senses are
356 example, for the sense of hearing I use a layer of clojure code on 367 implemented at the layer of jMonkeyEngine or below. For example,
357 top of a layer of java JNI bindings that drive a layer of =C++= 368 for the sense of hearing I use a layer of clojure code on top of a
358 code which implements a modified version of =OpenAL= to support 369 layer of java JNI bindings that drive a layer of =C++= code which
359 multiple listeners. =CORTEX= is the only simulation environment 370 implements a modified version of =OpenAL= to support multiple
360 that I know of that can support multiple entities that can each 371 listeners. =CORTEX= is the only simulation environment that I know
361 hear the world from their own perspective. Other senses also 372 of that can support multiple entities that can each hear the world
362 require a small layer of Java code. =CORTEX= also uses =bullet=, a 373 from their own perspective. Other senses also require a small layer
363 physics simulator written in =C=. 374 of Java code. =CORTEX= also uses =bullet=, a physics simulator
375 written in =C=.
364 376
365 #+caption: Here is the worm from figure \ref{worm-intro} modeled 377 #+caption: Here is the worm from figure \ref{worm-intro} modeled
366 #+caption: in Blender, a free 3D-modeling program. Senses and 378 #+caption: in Blender, a free 3D-modeling program. Senses and
367 #+caption: joints are described using special nodes in Blender. 379 #+caption: joints are described using special nodes in Blender.
368 #+name: worm-recognition-intro 380 #+name: worm-recognition-intro
373 385
374 - exploring new ideas about sensory integration 386 - exploring new ideas about sensory integration
375 - distributed communication among swarm creatures 387 - distributed communication among swarm creatures
376 - self-learning using free exploration, 388 - self-learning using free exploration,
377 - evolutionary algorithms involving creature construction 389 - evolutionary algorithms involving creature construction
378 - exploration of exoitic senses and effectors that are not possible 390 - exploration of exotic senses and effectors that are not possible
379 in the real world (such as telekenisis or a semantic sense) 391 in the real world (such as telekinesis or a semantic sense)
380 - imagination using subworlds 392 - imagination using subworlds
381 393
382 During one test with =CORTEX=, I created 3,000 creatures each with 394 During one test with =CORTEX=, I created 3,000 creatures each with
383 their own independent senses and ran them all at only 1/80 real 395 their own independent senses and ran them all at only 1/80 real
384 time. In another test, I created a detailed model of my own hand, 396 time. In another test, I created a detailed model of my own hand,
398 its own finger from the eye in its palm, and that it can feel its 410 its own finger from the eye in its palm, and that it can feel its
399 own thumb touching its palm.} 411 own thumb touching its palm.}
400 \end{sidewaysfigure} 412 \end{sidewaysfigure}
401 #+END_LaTeX 413 #+END_LaTeX
402 414
403 ** Contributions
404
405 - I built =CORTEX=, a comprehensive platform for embodied AI
406 experiments. =CORTEX= supports many features lacking in other
407 systems, such proper simulation of hearing. It is easy to create
408 new =CORTEX= creatures using Blender, a free 3D modeling program.
409
410 - I built =EMPATH=, which uses =CORTEX= to identify the actions of
411 a worm-like creature using a computational model of empathy.
412
413 - After one-shot supervised training, =EMPATH= was able recognize a
414 wide variety of static poses and dynamic actions---ranging from
415 curling in a circle to wriggling with a particular frequency ---
416 with 95\% accuracy.
417
418 - These results were completely independent of viewing angle
419 because the underlying body-centered language fundamentally is
420 independent; once an action is learned, it can be recognized
421 equally well from any viewing angle.
422
423 - =EMPATH= is surprisingly short; the sensorimotor-centered
424 language provided by =CORTEX= resulted in extremely economical
425 recognition routines --- about 500 lines in all --- suggesting
426 that such representations are very powerful, and often
427 indispensible for the types of recognition tasks considered here.
428
429 - Although for expediency's sake, I relied on direct knowledge of
430 joint positions in this proof of concept, it would be
431 straightforward to extend =EMPATH= so that it (more
432 realistically) infers joint positions from its visual data.
433
434 * Designing =CORTEX= 415 * Designing =CORTEX=
435 416
436 In this section, I outline the design decisions that went into 417 In this section, I outline the design decisions that went into
437 making =CORTEX=, along with some details about its implementation. 418 making =CORTEX=, along with some details about its implementation.
438 (A practical guide to getting started with =CORTEX=, which skips 419 (A practical guide to getting started with =CORTEX=, which skips
439 over the history and implementation details presented here, is 420 over the history and implementation details presented here, is
440 provided in an appendix at the end of this thesis.) 421 provided in an appendix at the end of this thesis.)
441 422
442 Throughout this project, I intended for =CORTEX= to be flexible and 423 Throughout this project, I intended for =CORTEX= to be flexible and
443 extensible enough to be useful for other researchers who want to 424 extensible enough to be useful for other researchers who want to
444 test out ideas of their own. To this end, wherver I have had to make 425 test out ideas of their own. To this end, wherever I have had to make
445 archetictural choices about =CORTEX=, I have chosen to give as much 426 architectural choices about =CORTEX=, I have chosen to give as much
446 freedom to the user as possible, so that =CORTEX= may be used for 427 freedom to the user as possible, so that =CORTEX= may be used for
447 things I have not forseen. 428 things I have not foreseen.
448 429
449 ** Building in simulation versus reality 430 ** Building in simulation versus reality
450 The most important archetictural decision of all is the choice to 431 The most important architectural decision of all is the choice to
451 use a computer-simulated environemnt in the first place! The world 432 use a computer-simulated environment in the first place! The world
452 is a vast and rich place, and for now simulations are a very poor 433 is a vast and rich place, and for now simulations are a very poor
453 reflection of its complexity. It may be that there is a significant 434 reflection of its complexity. It may be that there is a significant
454 qualatative difference between dealing with senses in the real 435 qualitative difference between dealing with senses in the real
455 world and dealing with pale facilimilies of them in a simulation 436 world and dealing with pale facsimiles of them in a simulation
456 \cite{brooks-representation}. What are the advantages and 437 \cite{brooks-representation}. What are the advantages and
457 disadvantages of a simulation vs. reality? 438 disadvantages of a simulation vs. reality?
458 439
459 *** Simulation 440 *** Simulation
460 441
517 ideas in the real world must always worry about getting their 498 ideas in the real world must always worry about getting their
518 algorithms to run fast enough to process information in real time. 499 algorithms to run fast enough to process information in real time.
519 The need for real time processing only increases if multiple senses 500 The need for real time processing only increases if multiple senses
520 are involved. In the extreme case, even simple algorithms will have 501 are involved. In the extreme case, even simple algorithms will have
521 to be accelerated by ASIC chips or FPGAs, turning what would 502 to be accelerated by ASIC chips or FPGAs, turning what would
522 otherwise be a few lines of code and a 10x speed penality into a 503 otherwise be a few lines of code and a 10x speed penalty into a
523 multi-month ordeal. For this reason, =CORTEX= supports 504 multi-month ordeal. For this reason, =CORTEX= supports
524 /time-dialiation/, which scales back the framerate of the 505 /time-dilation/, which scales back the framerate of the
525 simulation in proportion to the amount of processing each frame requires. 506 simulation in proportion to the amount of processing each frame requires.
526 From the perspective of the creatures inside the simulation, time 507 From the perspective of the creatures inside the simulation, time
527 always appears to flow at a constant rate, regardless of how 508 always appears to flow at a constant rate, regardless of how
528 complicated the envorimnent becomes or how many creatures are in 509 complicated the environment becomes or how many creatures are in
529 the simulation. The cost is that =CORTEX= can sometimes run slower 510 the simulation. The cost is that =CORTEX= can sometimes run slower
530 than real time. This can also be an advantage, however --- 511 than real time. This can also be an advantage, however ---
531 simulations of very simple creatures in =CORTEX= generally run at 512 simulations of very simple creatures in =CORTEX= generally run at
532 40x on my machine! 513 40x on my machine!
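
The idea behind time-dilation can be sketched in a few lines: hold
the simulated time-step constant and let wall-clock time stretch to
cover whatever processing each frame requires. The names and numbers
below are illustrative only, not =CORTEX='s actual implementation.

#+begin_src clojure
(def simulated-seconds-per-frame
  "Each frame always advances the simulated world by this much time,
   no matter how long the frame takes to compute."
  (/ 1.0 60.0))

(defn dilation-factor
  "How much slower (>1) or faster (<1) than real time the simulation
   is currently running, given the wall-clock seconds the last frame
   actually took to compute."
  [wall-clock-seconds-per-frame]
  (/ wall-clock-seconds-per-frame simulated-seconds-per-frame))

;; A heavy frame that takes 0.2 wall-clock seconds still advances the
;; world by only 1/60 of a simulated second, so the creatures perceive
;; smooth time while the simulation runs at roughly 1/12 of real time.
(dilation-factor 0.2)
#+end_src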
533 514
534 ** All sense organs are two-dimensional surfaces 515 ** All sense organs are two-dimensional surfaces
535 516
536 If =CORTEX= is to support a wide variety of senses, it would help 517 If =CORTEX= is to support a wide variety of senses, it would help
537 to have a better understanding of what a ``sense'' actually is! 518 to have a better understanding of what a ``sense'' actually is!
538 While vision, touch, and hearing all seem like they are quite 519 While vision, touch, and hearing all seem like they are quite
539 different things, I was supprised to learn during the course of 520 different things, I was surprised to learn during the course of
540 this thesis that they (and all physical senses) can be expressed as 521 this thesis that they (and all physical senses) can be expressed as
541 exactly the same mathematical object due to a dimensional argument! 522 exactly the same mathematical object due to a dimensional argument!
542 523
543 Human beings are three-dimensional objects, and the nerves that 524 Human beings are three-dimensional objects, and the nerves that
544 transmit data from our various sense organs to our brain are 525 transmit data from our various sense organs to our brain are
559 complicated surface of the skin onto a two dimensional image. 540 complicated surface of the skin onto a two dimensional image.
560 541
561 Most human senses consist of many discrete sensors of various 542 Most human senses consist of many discrete sensors of various
562 properties distributed along a surface at various densities. For 543 properties distributed along a surface at various densities. For
563 skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's 544 skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's
564 disks, and Ruffini's endings, which detect pressure and vibration 545 disks, and Ruffini's endings (\cite{9.01-textbook}), which detect
565 of various intensities. For ears, it is the stereocilia distributed 546 pressure and vibration of various intensities. For ears, it is the
566 along the basilar membrane inside the cochlea; each one is 547 stereocilia distributed along the basilar membrane inside the
567 sensitive to a slightly different frequency of sound. For eyes, it 548 cochlea; each one is sensitive to a slightly different frequency of
568 is rods and cones distributed along the surface of the retina. In 549 sound. For eyes, it is rods and cones distributed along the surface
569 each case, we can describe the sense with a surface and a 550 of the retina. In each case, we can describe the sense with a
570 distribution of sensors along that surface. 551 surface and a distribution of sensors along that surface.
571 552
572 The neat idea is that every human sense can be effectively 553 The neat idea is that every human sense can be effectively
573 described in terms of a surface containing embedded sensors. If the 554 described in terms of a surface containing embedded sensors. If the
574 sense had any more dimensions, then there wouldn't be enough room 555 sense had any more dimensions, then there wouldn't be enough room
575 in the spinal cord to transmit the information! 556 in the spinal cord to transmit the information!
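
This suggests a uniform representation for senses: a surface (stored
as a UV image) plus a distribution of sensors on that surface. As a
minimal sketch of the idea, the following reads an image and treats
every pure-white pixel as one sensor location, in the spirit of the
tactile-sensor-profile images used later in this thesis; the file
name is a placeholder and this is not =CORTEX='s actual code.

#+begin_src clojure
(import '(javax.imageio ImageIO)
        '(java.io File))

(defn sensor-positions
  "Return the [x y] coordinates of every pure-white pixel in the
   image, interpreting the image as a 2D sense-surface on which each
   white pixel marks one sensor."
  [image-file]
  (let [image (ImageIO/read (File. image-file))]
    (for [x (range (.getWidth image))
          y (range (.getHeight image))
          :when (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image x y)))]
      [x y])))

;; Hypothetical usage with a sensor-profile image such as the touch
;; profile in figure \ref{touch-cube-uv-map}:
;; (sensor-positions "touch-profile.png")
#+end_src
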
612 ** Video game engines provide ready-made physics and shading 593 ** Video game engines provide ready-made physics and shading
613 594
614 I did not need to write my own physics simulation code or shader to 595 I did not need to write my own physics simulation code or shader to
615 build =CORTEX=. Doing so would lead to a system that is impossible 596 build =CORTEX=. Doing so would lead to a system that is impossible
616 for anyone but myself to use anyway. Instead, I use a video game 597 for anyone but myself to use anyway. Instead, I use a video game
617 engine as a base and modify it to accomodate the additional needs 598 engine as a base and modify it to accommodate the additional needs
618 of =CORTEX=. Video game engines are an ideal starting point to 599 of =CORTEX=. Video game engines are an ideal starting point to
619 build =CORTEX=, because they are not far from being creature 600 build =CORTEX=, because they are not far from being creature
620 building systems themselves. 601 building systems themselves.
621 602
622 First off, general purpose video game engines come with a physics 603 First off, general purpose video game engines come with a physics
682 one to create boxes, spheres, etc., and leave that API as the sole 663 one to create boxes, spheres, etc., and leave that API as the sole
683 way to create creatures. However, for =CORTEX= to truly be useful 664 way to create creatures. However, for =CORTEX= to truly be useful
684 for other projects, it needs a way to construct complicated 665 for other projects, it needs a way to construct complicated
685 creatures. If possible, it would be nice to leverage work that has 666 creatures. If possible, it would be nice to leverage work that has
686 already been done by the community of 3D modelers, or at least 667 already been done by the community of 3D modelers, or at least
687 enable people who are talented at moedling but not programming to 668 enable people who are talented at modeling but not programming to
688 design =CORTEX= creatures. 669 design =CORTEX= creatures.
689 670
690 Therefore, I use Blender, a free 3D modeling program, as the main 671 Therefore, I use Blender, a free 3D modeling program, as the main
691 way to create creatures in =CORTEX=. However, the creatures modeled 672 way to create creatures in =CORTEX=. However, the creatures modeled
692 in Blender must also be simple to simulate in jMonkeyEngine3's game 673 in Blender must also be simple to simulate in jMonkeyEngine3's game
702 - Add empty nodes which each contain meta-data relevant to the 683 - Add empty nodes which each contain meta-data relevant to the
703 sense, including a UV-map describing the number/distribution of 684 sense, including a UV-map describing the number/distribution of
704 sensors if applicable. 685 sensors if applicable.
705 - Make each empty-node the child of the top-level node (a sketch of locating such nodes follows below). 686 - Make each empty-node the child of the top-level node (a sketch of locating such nodes follows below).
706 687
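Once a model is annotated this way, finding the annotation nodes for
a particular sense reduces to walking the scene graph and filtering
on node names. The following is a minimal sketch of that idea using
jMonkeyEngine's scene-graph API; the naming convention shown
(``eyes'', ``joints'', and so on) is illustrative, and this is not
=CORTEX='s actual implementation.

#+begin_src clojure
(import '(com.jme3.scene Node Spatial))

(defn all-nodes
  "Walk the scene graph rooted at the given node, depth-first."
  [#^Node root]
  (tree-seq (fn [node] (instance? Node node))
            (fn [#^Node node] (.getChildren node))
            root))

(defn sense-annotation-nodes
  "Return every node in the creature whose name begins with the given
   sense prefix (for example \"eyes\", \"ears\", or \"joints\")."
  [#^Node creature prefix]
  (filter
   (fn [#^Spatial node]
     (and (.getName node)
          (.startsWith (.getName node) prefix)))
   (all-nodes creature)))

;; Hypothetical usage, once an annotated model has been loaded:
;; (sense-annotation-nodes creature "joints")
#+end_src
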
707 #+caption: An example of annoting a creature model with empty 688 #+caption: An example of annotating a creature model with empty
708 #+caption: nodes to describe the layout of senses. There are 689 #+caption: nodes to describe the layout of senses. There are
709 #+caption: multiple empty nodes which each describe the position 690 #+caption: multiple empty nodes which each describe the position
710 #+caption: of muscles, ears, eyes, or joints. 691 #+caption: of muscles, ears, eyes, or joints.
711 #+name: sense-nodes 692 #+name: sense-nodes
712 #+ATTR_LaTeX: :width 10cm 693 #+ATTR_LaTeX: :width 10cm
715 ** Bodies are composed of segments connected by joints 696 ** Bodies are composed of segments connected by joints
716 697
717 Blender is a general purpose animation tool, which has been used in 698 Blender is a general purpose animation tool, which has been used in
718 the past to create high quality movies such as Sintel 699 the past to create high quality movies such as Sintel
719 \cite{blender}. Though Blender can model and render even complicated 700 \cite{blender}. Though Blender can model and render even complicated
720 things like water, it is crucual to keep models that are meant to 701 things like water, it is crucial to keep models that are meant to
721 be simulated as creatures simple. =Bullet=, which =CORTEX= uses 702 be simulated as creatures simple. =Bullet=, which =CORTEX= uses
722 through jMonkeyEngine3, is a rigid-body physics system. This offers 703 through jMonkeyEngine3, is a rigid-body physics system. This offers
723 a compromise between the expressiveness of a game level and the 704 a compromise between the expressiveness of a game level and the
724 speed at which it can be simulated, and it means that creatures 705 speed at which it can be simulated, and it means that creatures
725 should be naturally expressed as rigid components held together by 706 should be naturally expressed as rigid components held together by
726 joint constraints. 707 joint constraints.
727 708
728 But humans are more like a squishy bag with wrapped around some 709 But humans are more like a squishy bag wrapped around some hard
729 hard bones which define the overall shape. When we move, our skin 710 bones which define the overall shape. When we move, our skin bends
730 bends and stretches to accomodate the new positions of our bones. 711 and stretches to accommodate the new positions of our bones.
731 712
732 One way to make bodies composed of rigid pieces connected by joints 713 One way to make bodies composed of rigid pieces connected by joints
733 /seem/ more human-like is to use an /armature/ (or /rigging/) 714 /seem/ more human-like is to use an /armature/ (or /rigging/)
734 system, which defines an overall ``body mesh'' and specifies how the 715 system, which defines an overall ``body mesh'' and specifies how the
735 mesh deforms as a function of the position of each ``bone'', which 716 mesh deforms as a function of the position of each ``bone'', which
736 is a standard rigid body. This technique is used extensively to 717 is a standard rigid body. This technique is used extensively to
737 model humans and create realistic animations. It is not a good 718 model humans and create realistic animations. It is not a good
738 technique for physical simulation, however because it creates a lie 719 technique for physical simulation because it is a lie -- the skin
739 -- the skin is not a physical part of the simulation and does not 720 is not a physical part of the simulation and does not interact with
740 interact with any objects in the world or itself. Objects will pass 721 any objects in the world or itself. Objects will pass right though
741 right though the skin until they come in contact with the 722 the skin until they come in contact with the underlying bone, which
742 underlying bone, which is a physical object. Whithout simulating 723 is a physical object. Without simulating the skin, the sense of
743 the skin, the sense of touch has little meaning, and the creature's 724 touch has little meaning, and the creature's own vision will lie to
744 own vision will lie to it about the true extent of its body. 725 it about the true extent of its body. Simulating the skin as a
745 Simulating the skin as a physical object requires some way to 726 physical object requires some way to continuously update the
746 continuously update the physical model of the skin along with the 727 physical model of the skin along with the movement of the bones,
747 movement of the bones, which is unacceptably slow compared to rigid 728 which is unacceptably slow compared to rigid body simulation.
748 body simulation.
749 729
750 Therefore, instead of using the human-like ``deformable bag of 730 Therefore, instead of using the human-like ``deformable bag of
751 bones'' approach, I decided to base my body plans on multiple solid 731 bones'' approach, I decided to base my body plans on multiple solid
752 objects that are connected by joints, inspired by the robot =EVE= 732 objects that are connected by joints, inspired by the robot =EVE=
753 from the movie WALL-E. 733 from the movie WALL-E.
760 740
761 =EVE='s body is composed of several rigid components that are held 741 =EVE='s body is composed of several rigid components that are held
762 together by invisible joint constraints. This is what I mean by 742 together by invisible joint constraints. This is what I mean by
763 ``eve-like''. The main reason that I use eve-style bodies is for 743 ``eve-like''. The main reason that I use eve-style bodies is for
764 efficiency, and so that there will be correspondence between the 744 efficiency, and so that there will be correspondence between the
765 AI's semses and the physical presence of its body. Each individual 745 AI's senses and the physical presence of its body. Each individual
766 section is simulated by a separate rigid body that corresponds 746 section is simulated by a separate rigid body that corresponds
767 exactly with its visual representation and does not change. 747 exactly with its visual representation and does not change.
768 Sections are connected by invisible joints that are well supported 748 Sections are connected by invisible joints that are well supported
769 in jMonkeyEngine3. Bullet, the physics backend for jMonkeyEngine3, 749 in jMonkeyEngine3. Bullet, the physics backend for jMonkeyEngine3,
770 can efficiently simulate hundreds of rigid bodies connected by 750 can efficiently simulate hundreds of rigid bodies connected by
868 Since the objects must be physical, the empty-node itself escapes 848 Since the objects must be physical, the empty-node itself escapes
869 detection. Because the objects must be physical, =joint-targets= 849 detection. Because the objects must be physical, =joint-targets=
870 must be called /after/ =physical!= is called. 850 must be called /after/ =physical!= is called.
871 851
872 #+caption: Program to find the targets of a joint node by 852 #+caption: Program to find the targets of a joint node by
873 #+caption: exponentiallly growth of a search cube. 853 #+caption: exponential growth of a search cube.
874 #+name: joint-targets 854 #+name: joint-targets
875 #+begin_listing clojure 855 #+begin_listing clojure
876 #+begin_src clojure 856 #+begin_src clojure
877 (defn joint-targets 857 (defn joint-targets
878 "Return the two closest two objects to the joint object, ordered 858 "Return the two closest two objects to the joint object, ordered
903 883
904 Once =CORTEX= finds all joints and targets, it creates them using 884 Once =CORTEX= finds all joints and targets, it creates them using
905 a dispatch on the metadata of each joint node. 885 a dispatch on the metadata of each joint node.
906 886
907 #+caption: Program to dispatch on blender metadata and create joints 887 #+caption: Program to dispatch on blender metadata and create joints
908 #+caption: sutiable for physical simulation. 888 #+caption: suitable for physical simulation.
909 #+name: joint-dispatch 889 #+name: joint-dispatch
910 #+begin_listing clojure 890 #+begin_listing clojure
911 #+begin_src clojure 891 #+begin_src clojure
912 (defmulti joint-dispatch 892 (defmulti joint-dispatch
913 "Translate blender pseudo-joints into real JME joints." 893 "Translate blender pseudo-joints into real JME joints."
983 #+end_listing 963 #+end_listing
984 964
985 In general, whenever =CORTEX= exposes a sense (or in this case 965 In general, whenever =CORTEX= exposes a sense (or in this case
986 physicality), it provides a function of the type =sense!=, which 966 physicality), it provides a function of the type =sense!=, which
987 takes in a collection of nodes and augments it to support that 967 takes in a collection of nodes and augments it to support that
988 sense. The function returns any controlls necessary to use that 968 sense. The function returns any controls necessary to use that
989 sense. In this case =body!= cerates a physical body and returns no 969 sense. In this case =body!= creates a physical body and returns no
990 control functions. 970 control functions.
991 971
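As a toy illustration of this =sense!= convention (and emphatically
not =CORTEX='s actual code), the hypothetical sense constructor below
takes a collection of nodes, attaches whatever per-node state the
sense needs, and returns one control function per node; calling a
control function each simulation step reads that sensor, just as with
=touch!= later in this thesis.

#+begin_src clojure
(defn thermometer!
  "Toy example of the =sense!= convention: attach a hypothetical
   temperature sensor to each node and return one control function
   per node. Calling a control function reads that node's current
   temperature."
  [nodes]
  (doall
   (for [node nodes]
     (let [temperature (atom 20.0)]   ;; per-node sensor state
       (fn read-temperature [] @temperature)))))

;; Usage: create the sense once, then call the returned controls
;; every simulation step.
(def controls (thermometer! [:head-node :tail-node]))
(map #(%) controls)   ;; => (20.0 20.0)
#+end_src
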
992 #+caption: Program to give joints to a creature. 972 #+caption: Program to give joints to a creature.
993 #+name: name 973 #+name: name
994 #+begin_listing clojure 974 #+begin_listing clojure
1020 The hand from figure \ref{blender-hand}, which was modeled after 1000 The hand from figure \ref{blender-hand}, which was modeled after
1021 my own right hand, can now be given joints and simulated as a 1001 my own right hand, can now be given joints and simulated as a
1022 creature. 1002 creature.
1023 1003
1024 #+caption: With the ability to create physical creatures from blender, 1004 #+caption: With the ability to create physical creatures from blender,
1025 #+caption: =CORTEX= gets one step closer to becomming a full creature 1005 #+caption: =CORTEX= gets one step closer to becoming a full creature
1026 #+caption: simulation environment. 1006 #+caption: simulation environment.
1027 #+name: name 1007 #+name: name
1028 #+ATTR_LaTeX: :width 15cm 1008 #+ATTR_LaTeX: :width 15cm
1029 [[./images/physical-hand.png]] 1009 [[./images/physical-hand.png]]
1030 1010
1083 the data. To make this easy for the continuation function, the 1063 the data. To make this easy for the continuation function, the
1084 =SceneProcessor= maintains appropriately sized buffers in RAM to 1064 =SceneProcessor= maintains appropriately sized buffers in RAM to
1085 hold the data. It does not do any copying from the GPU to the CPU 1065 hold the data. It does not do any copying from the GPU to the CPU
1086 itself, because that is a slow operation. 1066 itself, because that is a slow operation.
1087 1067
1088 #+caption: Function to make the rendered secne in jMonkeyEngine 1068 #+caption: Function to make the rendered scene in jMonkeyEngine
1089 #+caption: available for further processing. 1069 #+caption: available for further processing.
1090 #+name: pipeline-1 1070 #+name: pipeline-1
1091 #+begin_listing clojure 1071 #+begin_listing clojure
1092 #+begin_src clojure 1072 #+begin_src clojure
1093 (defn vision-pipeline 1073 (defn vision-pipeline
1158 XZY rotation for the node in blender." 1138 XZY rotation for the node in blender."
1159 [#^Node creature #^Spatial eye] 1139 [#^Node creature #^Spatial eye]
1160 (let [target (closest-node creature eye) 1140 (let [target (closest-node creature eye)
1161 [cam-width cam-height] 1141 [cam-width cam-height]
1162 ;;[640 480] ;; graphics card on laptop doesn't support 1142 ;;[640 480] ;; graphics card on laptop doesn't support
1163 ;; arbitray dimensions. 1143 ;; arbitrary dimensions.
1164 (eye-dimensions eye) 1144 (eye-dimensions eye)
1165 cam (Camera. cam-width cam-height) 1145 cam (Camera. cam-width cam-height)
1166 rot (.getWorldRotation eye)] 1146 rot (.getWorldRotation eye)]
1167 (.setLocation cam (.getWorldTranslation eye)) 1147 (.setLocation cam (.getWorldTranslation eye))
1168 (.lookAtDirection 1148 (.lookAtDirection
1343 sound from different points of view, and there is no way to directly 1323 sound from different points of view, and there is no way to directly
1344 access the rendered sound data. 1324 access the rendered sound data.
1345 1325
1346 =CORTEX='s hearing is unique because it suffers from none of the 1326 =CORTEX='s hearing is unique because it suffers from none of the
1347 limitations of other simulation environments. As far as I 1327 limitations of other simulation environments. As far as I
1348 know, there is no other system that supports multiple listerers, 1328 know, there is no other system that supports multiple listeners,
1349 and the sound demo at the end of this section is the first time 1329 and the sound demo at the end of this section is the first time
1350 it's been done in a video game environment. 1330 it's been done in a video game environment.
1351 1331
1352 *** Brief Description of jMonkeyEngine's Sound System 1332 *** Brief Description of jMonkeyEngine's Sound System
1353 1333
1382 *** Extending =OpenAl= 1362 *** Extending =OpenAl=
1383 1363
1384 Extending =OpenAL= to support multiple listeners requires 500 1364 Extending =OpenAL= to support multiple listeners requires 500
1385 lines of =C= code and is too hairy to mention here. Instead, I 1365 lines of =C= code and is too hairy to mention here. Instead, I
1386 will show a small amount of extension code and go over the high 1366 will show a small amount of extension code and go over the high
1387 level stragety. Full source is of course available with the 1367 level strategy. Full source is of course available with the
1388 =CORTEX= distribution if you're interested. 1368 =CORTEX= distribution if you're interested.
1389 1369
1390 =OpenAL= goes to great lengths to support many different systems, 1370 =OpenAL= goes to great lengths to support many different systems,
1391 all with different sound capabilities and interfaces. It 1371 all with different sound capabilities and interfaces. It
1392 accomplishes this difficult task by providing code for many 1372 accomplishes this difficult task by providing code for many
1404 any particular system. These include the Null Device, which 1384 any particular system. These include the Null Device, which
1405 doesn't do anything, and the Wave Device, which writes whatever 1385 doesn't do anything, and the Wave Device, which writes whatever
1406 sound it receives to a file, if everything has been set up 1386 sound it receives to a file, if everything has been set up
1407 correctly when configuring =OpenAL=. 1387 correctly when configuring =OpenAL=.
1408 1388
1409 Actual mixing (doppler shift and distance.environment-based 1389 Actual mixing (Doppler shift and distance- and environment-based
1410 attenuation) of the sound data happens in the Devices, and they 1390 attenuation) of the sound data happens in the Devices, and they
1411 are the only point in the sound rendering process where this data 1391 are the only point in the sound rendering process where this data
1412 is available. 1392 is available.
1413 1393
1414 Therefore, in order to support multiple listeners, and get the 1394 Therefore, in order to support multiple listeners, and get the
1621 entity.getMaterial().setColor("Color", ColorRGBA.Gray); 1601 entity.getMaterial().setColor("Color", ColorRGBA.Gray);
1622 } 1602 }
1623 #+END_SRC 1603 #+END_SRC
1624 #+end_listing 1604 #+end_listing
1625 1605
1626 #+caption: First ever simulation of multiple listerners in =CORTEX=. 1606 #+caption: First ever simulation of multiple listeners in =CORTEX=.
1627 #+caption: Each cube is a creature which processes sound data with 1607 #+caption: Each cube is a creature which processes sound data with
1628 #+caption: the =process= function from listing \ref{sound-test}. 1608 #+caption: the =process= function from listing \ref{sound-test}.
1629 #+caption: the ball is constantally emiting a pure tone of 1609 #+caption: The ball is constantly emitting a pure tone of
1630 #+caption: constant volume. As it approaches the cubes, they each 1610 #+caption: constant volume. As it approaches the cubes, they each
1631 #+caption: change color in response to the sound. 1611 #+caption: change color in response to the sound.
1632 #+name: sound-cubes. 1612 #+name: sound-cubes.
1633 #+ATTR_LaTeX: :width 10cm 1613 #+ATTR_LaTeX: :width 10cm
1634 [[./images/java-hearing-test.png]] 1614 [[./images/java-hearing-test.png]]
1754 comprise a mesh, while =pixel-triangles= gets those same triangles 1734 comprise a mesh, while =pixel-triangles= gets those same triangles
1755 expressed in pixel coordinates (which are UV coordinates scaled to 1735 expressed in pixel coordinates (which are UV coordinates scaled to
1756 fit the height and width of the UV image). 1736 fit the height and width of the UV image).
1757 1737
1758 #+caption: Programs to extract triangles from a geometry and get 1738 #+caption: Programs to extract triangles from a geometry and get
1759 #+caption: their verticies in both world and UV-coordinates. 1739 #+caption: their vertices in both world and UV-coordinates.
1760 #+name: get-triangles 1740 #+name: get-triangles
1761 #+begin_listing clojure 1741 #+begin_listing clojure
1762 #+BEGIN_SRC clojure 1742 #+BEGIN_SRC clojure
1763 (defn triangle 1743 (defn triangle
1764 "Get the triangle specified by triangle-index from the mesh." 1744 "Get the triangle specified by triangle-index from the mesh."
1849 1829
1850 The clojure code below recapitulates the formulas above, using 1830 The clojure code below recapitulates the formulas above, using
1851 jMonkeyEngine's =Matrix4f= objects, which can describe any affine 1831 jMonkeyEngine's =Matrix4f= objects, which can describe any affine
1852 transformation. 1832 transformation.
1853 1833
1854 #+caption: Program to interpert triangles as affine transforms. 1834 #+caption: Program to interpret triangles as affine transforms.
1855 #+name: triangle-affine 1835 #+name: triangle-affine
1856 #+begin_listing clojure 1836 #+begin_listing clojure
1857 #+BEGIN_SRC clojure 1837 #+BEGIN_SRC clojure
1858 (defn triangle->matrix4f 1838 (defn triangle->matrix4f
1859 "Converts the triangle into a 4x4 matrix: The first three columns 1839 "Converts the triangle into a 4x4 matrix: The first three columns
1892 triangle. 1872 triangle.
1893 1873
1894 =inside-triangle?= determines whether a point is inside a triangle 1874 =inside-triangle?= determines whether a point is inside a triangle
1895 in 2D pixel-space. 1875 in 2D pixel-space.
1896 1876
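For intuition, one standard way to implement such a test is the ``same side'' sign check sketched below. This operates on plain =[x y]= pairs, the names are invented, and it is not necessarily the method used by =inside-triangle?= in the listing that follows:

#+begin_src clojure
;; Hedged sketch: point-in-triangle via the sign of each edge's 2D cross
;; product. inside-triangle-2d? is a made-up name, not the CORTEX function
;; shown in the listing below.
(defn- edge-sign [[ax ay] [bx by] [px py]]
  (- (* (- bx ax) (- py ay))
     (* (- by ay) (- px ax))))

(defn inside-triangle-2d?
  "True if point p lies inside (or on an edge of) triangle a-b-c."
  [a b c p]
  (let [signs (map #(apply edge-sign %) [[a b p] [b c p] [c a p]])]
    (or (every? #(>= % 0) signs)
        (every? #(<= % 0) signs))))
#+end_src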
1897 #+caption: Program to efficiently determine point includion 1877 #+caption: Program to efficiently determine point inclusion
1898 #+caption: in a triangle. 1878 #+caption: in a triangle.
1899 #+name: in-triangle 1879 #+name: in-triangle
1900 #+begin_listing clojure 1880 #+begin_listing clojure
1901 #+BEGIN_SRC clojure 1881 #+BEGIN_SRC clojure
1902 (defn convex-bounds 1882 (defn convex-bounds
2087 #+END_SRC 2067 #+END_SRC
2088 #+end_listing 2068 #+end_listing
2089 2069
2090 Armed with the =touch!= function, =CORTEX= becomes capable of 2070 Armed with the =touch!= function, =CORTEX= becomes capable of
2091 giving creatures a sense of touch. A simple test is to create a 2071 giving creatures a sense of touch. A simple test is to create a
2092 cube that is outfitted with a uniform distrubition of touch 2072 cube that is outfitted with a uniform distribution of touch
2093 sensors. It can feel the ground and any balls that it touches. 2073 sensors. It can feel the ground and any balls that it touches.
2094 2074
2095 #+caption: =CORTEX= interface for creating touch in a simulated 2075 #+caption: =CORTEX= interface for creating touch in a simulated
2096 #+caption: creature. 2076 #+caption: creature.
2097 #+name: touch 2077 #+name: touch
2109 (node-seq creature))))) 2089 (node-seq creature)))))
2110 #+END_SRC 2090 #+END_SRC
2111 #+end_listing 2091 #+end_listing
2112 2092
2113 The tactile-sensor-profile image for the touch cube is a simple 2093 The tactile-sensor-profile image for the touch cube is a simple
2114 cross with a unifom distribution of touch sensors: 2094 cross with a uniform distribution of touch sensors:
2115 2095
2116 #+caption: The touch profile for the touch-cube. Each pure white 2096 #+caption: The touch profile for the touch-cube. Each pure white
2117 #+caption: pixel defines a touch sensitive feeler. 2097 #+caption: pixel defines a touch sensitive feeler.
2118 #+name: touch-cube-uv-map 2098 #+name: touch-cube-uv-map
2119 #+ATTR_LaTeX: :width 7cm 2099 #+ATTR_LaTeX: :width 7cm
2120 [[./images/touch-profile.png]] 2100 [[./images/touch-profile.png]]
2121 2101
2122 #+caption: The touch cube reacts to canonballs. The black, red, 2102 #+caption: The touch cube reacts to cannonballs. The black, red,
2123 #+caption: and white cross on the right is a visual display of 2103 #+caption: and white cross on the right is a visual display of
2124 #+caption: the creature's touch. White means that it is feeling 2104 #+caption: the creature's touch. White means that it is feeling
2125 #+caption: something strongly, black is not feeling anything, 2105 #+caption: something strongly, black is not feeling anything,
2126 #+caption: and gray is in-between. The cube can feel both the 2106 #+caption: and gray is in-between. The cube can feel both the
2127 #+caption: floor and the ball. Notice that when the ball causes 2107 #+caption: floor and the ball. Notice that when the ball causes
2169 radians you have to move counterclockwise around the axis vector 2149 radians you have to move counterclockwise around the axis vector
2170 to get from the first to the second vector. It is not commutative 2150 to get from the first to the second vector. It is not commutative
2171 like a normal dot-product angle is. 2151 like a normal dot-product angle is.
2172 2152
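To illustrate the idea with plain Clojure vectors (this is only a sketch with invented names; the actual =CORTEX= code in the listing below uses jMonkeyEngine's vector types and may differ in detail), such a non-commutative angle around an axis can be computed by combining the ordinary dot-product angle with the direction of the cross product:

#+begin_src clojure
;; Hedged sketch -- signed-angle and its helpers are illustrative names,
;; not CORTEX functions. Vectors are plain [x y z] triples.
(defn- dot* [a b] (reduce + (map * a b)))

(defn- cross [[ax ay az] [bx by bz]]
  [(- (* ay bz) (* az by))
   (- (* az bx) (* ax bz))
   (- (* ax by) (* ay bx))])

(defn- norm [v] (Math/sqrt (dot* v v)))

(defn signed-angle
  "Radians (0 to 2*pi) of counterclockwise rotation about axis needed to
   move from vector a to vector b. Swapping a and b changes the answer,
   unlike the plain dot-product angle."
  [axis a b]
  (let [cosine (max -1.0 (min 1.0 (/ (dot* a b) (* (norm a) (norm b)))))
        base   (Math/acos cosine)]
    (if (neg? (dot* axis (cross a b)))
      (- (* 2 Math/PI) base)
      base)))
#+end_src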
2173 The purpose of these functions is to build a system of angle 2153 The purpose of these functions is to build a system of angle
2174 measurement that is biologically plausable. 2154 measurement that is biologically plausible.
2175 2155
2176 #+caption: Program to measure angles along a vector 2156 #+caption: Program to measure angles along a vector
2177 #+name: helpers 2157 #+name: helpers
2178 #+begin_listing clojure 2158 #+begin_listing clojure
2179 #+BEGIN_SRC clojure 2159 #+BEGIN_SRC clojure
2199 Given a joint, =proprioception-kernel= produces a function that 2179 Given a joint, =proprioception-kernel= produces a function that
2200 calculates the Euler angles between the the objects the joint 2180 calculates the Euler angles between the two objects the joint
2201 connects. The only tricky part here is making the angles relative 2181 connects. The only tricky part here is making the angles relative
2202 to the joint's initial ``straightness''. 2182 to the joint's initial ``straightness''.
2203 2183
2204 #+caption: Program to return biologially reasonable proprioceptive 2184 #+caption: Program to return biologically reasonable proprioceptive
2205 #+caption: data for each joint. 2185 #+caption: data for each joint.
2206 #+name: proprioception 2186 #+name: proprioception
2207 #+begin_listing clojure 2187 #+begin_listing clojure
2208 #+BEGIN_SRC clojure 2188 #+BEGIN_SRC clojure
2209 (defn proprioception-kernel 2189 (defn proprioception-kernel
2357 red, instead of shades of gray as I've been using for all the 2337 red, instead of shades of gray as I've been using for all the
2358 other senses. This is purely an aesthetic touch. 2338 other senses. This is purely an aesthetic touch.
2359 2339
2360 *** Creating muscles 2340 *** Creating muscles
2361 2341
2362 #+caption: This is the core movement functoion in =CORTEX=, which 2342 #+caption: This is the core movement function in =CORTEX=, which
2363 #+caption: implements muscles that report on their activation. 2343 #+caption: implements muscles that report on their activation.
2364 #+name: muscle-kernel 2344 #+name: muscle-kernel
2365 #+begin_listing clojure 2345 #+begin_listing clojure
2366 #+BEGIN_SRC clojure 2346 #+BEGIN_SRC clojure
2367 (defn movement-kernel 2347 (defn movement-kernel
2415 2395
2416 With all senses enabled, my right hand model looks like an 2396 With all senses enabled, my right hand model looks like an
2417 intricate marionette hand with several strings for each finger: 2397 intricate marionette hand with several strings for each finger:
2418 2398
2419 #+caption: View of the hand model with all sense nodes. You can see 2399 #+caption: View of the hand model with all sense nodes. You can see
2420 #+caption: the joint, muscle, ear, and eye nodess here. 2400 #+caption: the joint, muscle, ear, and eye nodes here.
2421 #+name: hand-nodes-1 2401 #+name: hand-nodes-1
2422 #+ATTR_LaTeX: :width 11cm 2402 #+ATTR_LaTeX: :width 11cm
2423 [[./images/hand-with-all-senses2.png]] 2403 [[./images/hand-with-all-senses2.png]]
2424 2404
2425 #+caption: An alternate view of the hand. 2405 #+caption: An alternate view of the hand.
2428 [[./images/hand-with-all-senses3.png]] 2408 [[./images/hand-with-all-senses3.png]]
2429 2409
2430 With the hand fully rigged with senses, I can run it though a test 2410 With the hand fully rigged with senses, I can run it through a test
2431 that will test everything. 2411 that exercises everything at once.
2432 2412
2433 #+caption: A full test of the hand with all senses. Note expecially 2413 #+caption: A full test of the hand with all senses. Note especially
2434 #+caption: the interactions the hand has with itself: it feels 2414 #+caption: the interactions the hand has with itself: it feels
2435 #+caption: its own palm and fingers, and when it curls its fingers, 2415 #+caption: its own palm and fingers, and when it curls its fingers,
2436 #+caption: it sees them with its eye (which is located in the center 2416 #+caption: it sees them with its eye (which is located in the center
2437 #+caption: of the palm. The red block appears with a pure tone sound. 2417 #+caption: of the palm). The red block appears with a pure tone sound.
2438 #+caption: The hand then uses its muscles to launch the cube! 2418 #+caption: The hand then uses its muscles to launch the cube!
2439 #+name: integration 2419 #+name: integration
2440 #+ATTR_LaTeX: :width 16cm 2420 #+ATTR_LaTeX: :width 16cm
2441 [[./images/integration.png]] 2421 [[./images/integration.png]]
2442 2422
2443 ** =CORTEX= enables many possiblities for further research 2423 ** =CORTEX= enables many possibilities for further research
2444 2424
2445 Often times, the hardest part of building a system involving 2425 Oftentimes, the hardest part of building a system involving
2446 creatures is dealing with physics and graphics. =CORTEX= removes 2426 creatures is dealing with physics and graphics. =CORTEX= removes
2447 much of this initial difficulty and leaves researchers free to 2427 much of this initial difficulty and leaves researchers free to
2448 directly pursue their ideas. I hope that even undergrads with a 2428 directly pursue their ideas. I hope that even undergrads with a
2559 :proprioception (proprioception! model) 2539 :proprioception (proprioception! model)
2560 :muscles (movement! model)})) 2540 :muscles (movement! model)}))
2561 #+end_src 2541 #+end_src
2562 #+end_listing 2542 #+end_listing
2563 2543
2564 ** Embodiment factors action recognition into managable parts 2544 ** Embodiment factors action recognition into manageable parts
2565 2545
2566 Using empathy, I divide the problem of action recognition into a 2546 Using empathy, I divide the problem of action recognition into a
2567 recognition process expressed in the language of a full compliment 2547 recognition process expressed in the language of a full complement
2568 of senses, and an imaganitive process that generates full sensory 2548 of senses, and an imaginative process that generates full sensory
2569 data from partial sensory data. Splitting the action recognition 2549 data from partial sensory data. Splitting the action recognition
2570 problem in this manner greatly reduces the total amount of work to 2550 problem in this manner greatly reduces the total amount of work to
2571 recognize actions: The imaganitive process is mostly just matching 2551 recognize actions: The imaginative process is mostly just matching
2572 previous experience, and the recognition process gets to use all 2552 previous experience, and the recognition process gets to use all
2573 the senses to directly describe any action. 2553 the senses to directly describe any action.
2574 2554
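The shape of this factorization is simple enough to write down directly. The following is only an illustrative sketch, not =CORTEX= code; the function and argument names are invented, and both stages are passed in as functions:

#+begin_src clojure
;; Hedged sketch of the empathy factorization.
;; infer-experience : proprioceptive stream -> full (imagined) sensory stream
;; action-predicate : full sensory stream -> true/false
(defn empathic-recognition
  "Recognize an action by first imagining full sensory experience from
   partial data, then applying an ordinary body-centered action predicate."
  [infer-experience action-predicate proprioceptive-stream]
  (action-predicate (infer-experience proprioceptive-stream)))
#+end_src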
2575 ** Action recognition is easy with a full gamut of senses 2555 ** Action recognition is easy with a full gamut of senses
2576 2556
2584 2564
2585 The following action predicates each take a stream of sensory 2565 The following action predicates each take a stream of sensory
2586 experience, observe however much of it they desire, and decide 2566 experience, observe however much of it they desire, and decide
2587 whether the worm is doing the action they describe. =curled?= 2567 whether the worm is doing the action they describe. =curled?=
2588 relies on proprioception, =resting?= relies on touch, =wiggling?= 2568 relies on proprioception, =resting?= relies on touch, =wiggling?=
2589 relies on a fourier analysis of muscle contraction, and 2569 relies on a Fourier analysis of muscle contraction, and
2590 =grand-circle?= relies on touch and reuses =curled?= as a gaurd. 2570 =grand-circle?= relies on touch and reuses =curled?= as a guard.
2591 2571
2592 #+caption: Program for detecting whether the worm is curled. This is the 2572 #+caption: Program for detecting whether the worm is curled. This is the
2593 #+caption: simplest action predicate, because it only uses the last frame 2573 #+caption: simplest action predicate, because it only uses the last frame
2594 #+caption: of sensory experience, and only uses proprioceptive data. Even 2574 #+caption: of sensory experience, and only uses proprioceptive data. Even
2595 #+caption: this simple predicate, however, is automatically frame 2575 #+caption: this simple predicate, however, is automatically frame
2632 2612
2633 #+caption: Program for detecting whether the worm is at rest. This program 2613 #+caption: Program for detecting whether the worm is at rest. This program
2634 #+caption: uses a summary of the tactile information from the underbelly 2614 #+caption: uses a summary of the tactile information from the underbelly
2635 #+caption: of the worm, and is only true if every segment is touching the 2615 #+caption: of the worm, and is only true if every segment is touching the
2636 #+caption: floor. Note that this function contains no references to 2616 #+caption: floor. Note that this function contains no references to
2637 #+caption: proprioction at all. 2617 #+caption: proprioception at all.
2638 #+name: resting 2618 #+name: resting
2639 #+begin_listing clojure 2619 #+begin_listing clojure
2640 #+begin_src clojure 2620 #+begin_src clojure
2641 (def worm-segment-bottom (rect-region [8 15] [14 22])) 2621 (def worm-segment-bottom (rect-region [8 15] [14 22]))
2642 2622
2673 #+end_src 2653 #+end_src
2674 #+end_listing 2654 #+end_listing
2675 2655
2676 2656
2677 #+caption: Program for detecting whether the worm has been wiggling for 2657 #+caption: Program for detecting whether the worm has been wiggling for
2678 #+caption: the last few frames. It uses a fourier analysis of the muscle 2658 #+caption: the last few frames. It uses a Fourier analysis of the muscle
2679 #+caption: contractions of the worm's tail to determine wiggling. This is 2659 #+caption: contractions of the worm's tail to determine wiggling. This is
2680 #+caption: signigicant because there is no particular frame that clearly 2660 #+caption: significant because there is no particular frame that clearly
2681 #+caption: indicates that the worm is wiggling --- only when multiple frames 2661 #+caption: indicates that the worm is wiggling --- only when multiple frames
2682 #+caption: are analyzed together is the wiggling revealed. Defining 2662 #+caption: are analyzed together is the wiggling revealed. Defining
2683 #+caption: wiggling this way also gives the worm an opportunity to learn 2663 #+caption: wiggling this way also gives the worm an opportunity to learn
2684 #+caption: and recognize ``frustrated wiggling'', where the worm tries to 2664 #+caption: and recognize ``frustrated wiggling'', where the worm tries to
2685 #+caption: wiggle but can't. Frustrated wiggling is very visually different 2665 #+caption: wiggle but can't. Frustrated wiggling is very visually different
2736 (resting? experiences) (.setText text "Resting"))) 2716 (resting? experiences) (.setText text "Resting")))
2737 #+end_src 2717 #+end_src
2738 #+end_listing 2718 #+end_listing
2739 2719
2740 #+caption: Using =debug-experience=, the body-centered predicates 2720 #+caption: Using =debug-experience=, the body-centered predicates
2741 #+caption: work together to classify the behaviour of the worm. 2721 #+caption: work together to classify the behavior of the worm.
2742 #+caption: the predicates are operating with access to the worm's 2722 #+caption: The predicates are operating with access to the worm's
2743 #+caption: full sensory data. 2723 #+caption: full sensory data.
2744 #+name: basic-worm-view 2724 #+name: basic-worm-view
2745 #+ATTR_LaTeX: :width 10cm 2725 #+ATTR_LaTeX: :width 10cm
2746 [[./images/worm-identify-init.png]] 2726 [[./images/worm-identify-init.png]]
2747 2727
2748 These action predicates satisfy the recognition requirement of an 2728 These action predicates satisfy the recognition requirement of an
2749 empathic recognition system. There is power in the simplicity of 2729 empathic recognition system. There is power in the simplicity of
2750 the action predicates. They describe their actions without getting 2730 the action predicates. They describe their actions without getting
2751 confused in visual details of the worm. Each one is frame 2731 confused by visual details of the worm. Each one is frame
2752 independent, but more than that, they are each indepent of 2732 independent, but more than that, they are each independent of
2753 irrelevant visual details of the worm and the environment. They 2733 irrelevant visual details of the worm and the environment. They
2754 will work regardless of whether the worm is a different color or 2734 will work regardless of whether the worm is a different color or
2755 hevaily textured, or if the environment has strange lighting. 2735 heavily textured, or if the environment has strange lighting.
2756 2736
2757 The trick now is to make the action predicates work even when the 2737 The trick now is to make the action predicates work even when the
2758 sensory data on which they depend is absent. If I can do that, then 2738 sensory data on which they depend is absent. If I can do that, then
2759 I will have gained much, 2739 I will have gained much.
2760 2740
2774 touching and at the same time not also experience the sensation of 2754 touching and at the same time not also experience the sensation of
2775 touching itself. 2755 touching itself.
2776 2756
2777 As the worm moves around during free play and its experience vector 2757 As the worm moves around during free play and its experience vector
2778 grows larger, the vector begins to define a subspace which is all 2758 grows larger, the vector begins to define a subspace which is all
2779 the sensations the worm can practicaly experience during normal 2759 the sensations the worm can practically experience during normal
2780 operation. I call this subspace \Phi-space, short for 2760 operation. I call this subspace \Phi-space, short for
2781 physical-space. The experience vector defines a path through 2761 physical-space. The experience vector defines a path through
2782 \Phi-space. This path has interesting properties that all derive 2762 \Phi-space. This path has interesting properties that all derive
2783 from physical embodiment. The proprioceptive components are 2763 from physical embodiment. The proprioceptive components are
2784 completely smooth, because in order for the worm to move from one 2764 completely smooth, because in order for the worm to move from one
2799 activations of the worm's muscles, because it generally takes a 2779 activations of the worm's muscles, because it generally takes a
2800 unique combination of muscle contractions to transform the worm's 2780 unique combination of muscle contractions to transform the worm's
2801 body along a specific path through \Phi-space. 2781 body along a specific path through \Phi-space.
2802 2782
2803 There is a simple way of taking \Phi-space and the total ordering 2783 There is a simple way of taking \Phi-space and the total ordering
2804 provided by an experience vector and reliably infering the rest of 2784 provided by an experience vector and reliably inferring the rest of
2805 the senses. 2785 the senses.
2806 2786
2807 ** Empathy is the process of tracing though \Phi-space 2787 ** Empathy is the process of tracing through \Phi-space
2808 2788
2809 Here is the core of a basic empathy algorithm, starting with an 2789 Here is the core of a basic empathy algorithm, starting with an
2815 2795
2816 Then, given a sequence of proprioceptive input, generate a set of 2796 Then, given a sequence of proprioceptive input, generate a set of
2817 matching experience records for each input, using the tiered 2797 matching experience records for each input, using the tiered
2818 proprioceptive bins. 2798 proprioceptive bins.
2819 2799
2820 Finally, to infer sensory data, select the longest consective chain 2800 Finally, to infer sensory data, select the longest consecutive chain
2821 of experiences. Conecutive experience means that the experiences 2801 of experiences. Consecutive experience means that the experiences
2822 appear next to each other in the experience vector. 2802 appear next to each other in the experience vector.
2823 2803
2824 This algorithm has three advantages: 2804 This algorithm has three advantages:
2825 2805
2826 1. It's simple 2806 1. It's simple
2831 proprioceptive bin. Redundant experiences in \Phi-space can be 2811 proprioceptive bin. Redundant experiences in \Phi-space can be
2832 merged to save computation. 2812 merged to save computation.
2833 2813
2834 2. It protects from wrong interpretations of transient ambiguous 2814 2. It protects from wrong interpretations of transient ambiguous
2835 proprioceptive data. For example, if the worm is flat for just 2815 proprioceptive data. For example, if the worm is flat for just
2836 an instant, this flattness will not be interpreted as implying 2816 an instant, this flatness will not be interpreted as implying
2837 that the worm has its muscles relaxed, since the flattness is 2817 that the worm has its muscles relaxed, since the flatness is
2838 part of a longer chain which includes a distinct pattern of 2818 part of a longer chain which includes a distinct pattern of
2839 muscle activation. Markov chains or other memoryless statistical 2819 muscle activation. Markov chains or other memoryless statistical
2840 models that operate on individual frames may very well make this 2820 models that operate on individual frames may very well make this
2841 mistake. 2821 mistake.
2842 2822
2853 (flatten) 2833 (flatten)
2854 (mapv #(Math/round (* % (Math/pow 10 (dec digits)))))))) 2834 (mapv #(Math/round (* % (Math/pow 10 (dec digits))))))))
2855 2835
2856 (defn gen-phi-scan 2836 (defn gen-phi-scan
2857 "Nearest-neighbors with binning. Only returns a result if 2837 "Nearest-neighbors with binning. Only returns a result if
2858 the propriceptive data is within 10% of a previously recorded 2838 the proprioceptive data is within 10% of a previously recorded
2859 result in all dimensions." 2839 result in all dimensions."
2860 [phi-space] 2840 [phi-space]
2861 (let [bin-keys (map bin [3 2 1]) 2841 (let [bin-keys (map bin [3 2 1])
2862 bin-maps 2842 bin-maps
2863 (map (fn [bin-key] 2843 (map (fn [bin-key]
2880 2860
2881 =longest-thread= infers sensory data by stitching together pieces 2861 =longest-thread= infers sensory data by stitching together pieces
2882 from previous experience. It prefers longer chains of previous 2862 from previous experience. It prefers longer chains of previous
2883 experience to shorter ones. For example, during training the worm 2863 experience to shorter ones. For example, during training the worm
2884 might rest on the ground for one second before it performs its 2864 might rest on the ground for one second before it performs its
2885 excercises. If during recognition the worm rests on the ground for 2865 exercises. If during recognition the worm rests on the ground for
2886 five seconds, =longest-thread= will accomodate this five second 2866 five seconds, =longest-thread= will accommodate this five second
2887 rest period by looping the one second rest chain five times. 2867 rest period by looping the one second rest chain five times.
2888 2868
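To make the chain-stitching idea concrete, here is a deliberately simplified, =CORTEX=-independent sketch. It only finds the longest run of consecutive \Phi-space indices ending at the most recent frame, it does not attempt the looping behavior described above, and its names are invented for illustration:

#+begin_src clojure
;; Hedged sketch, not the thesis' longest-thread. `matches` is a sequence
;; (oldest frame first) where each element is the set of phi-space indices
;; that could explain that frame's proprioceptive data.
(defn longest-chain-sketch
  "Longest run of consecutive phi-space indices (newest index first) that
   explains the most recent frames of `matches`."
  [matches]
  (let [newest (last matches)
        older  (reverse (butlast matches))]
    (if (empty? newest)
      []
      (apply max-key count
             (for [start newest]
               (loop [chain [start] candidates older]
                 (let [wanted (dec (peek chain))
                       frame  (first candidates)]
                   (if (and frame (contains? frame wanted))
                     (recur (conj chain wanted) (rest candidates))
                     chain))))))))
#+end_src

For example, =(longest-chain-sketch [#{1 7} #{2 8} #{9}])= returns =[9 8 7]=, stitching the three newest frames into a single consecutive chain.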
2889 =longest-thread= takes time proportinal to the average number of 2869 =longest-thread= takes time proportional to the average number of
2890 entries in a proprioceptive bin, because for each element in the 2870 entries in a proprioceptive bin, because for each element in the
2891 starting bin it performes a series of set lookups in the preceeding 2871 starting bin it performs a series of set lookups in the preceding
2892 bins. If the total history is limited, then this is only a constant 2872 bins. If the total history is limited, then this is only a constant
2893 multiple times the number of entries in the starting bin. This 2873 multiple times the number of entries in the starting bin. This
2894 analysis also applies even if the action requires multiple longest 2874 analysis also applies even if the action requires multiple longest
2895 chains -- it's still the average number of entries in a 2875 chains -- it's still the average number of entries in a
2896 proprioceptive bin times the desired chain length. Because 2876 proprioceptive bin times the desired chain length. Because
2964 2944
2965 To use =EMPATH= with the worm, I first need to gather a set of 2945 To use =EMPATH= with the worm, I first need to gather a set of
2966 experiences from the worm that includes the actions I want to 2946 experiences from the worm that includes the actions I want to
2967 recognize. The =generate-phi-space= program (listing 2947 recognize. The =generate-phi-space= program (listing
2968 \ref{generate-phi-space} runs the worm through a series of 2948 \ref{generate-phi-space}) runs the worm through a series of
2969 exercices and gatheres those experiences into a vector. The 2949 exercises and gathers those experiences into a vector. The
2970 =do-all-the-things= program is a routine expressed in a simple 2950 =do-all-the-things= program is a routine expressed in a simple
2971 muscle contraction script language for automated worm control. It 2951 muscle contraction script language for automated worm control. It
2972 causes the worm to rest, curl, and wiggle over about 700 frames 2952 causes the worm to rest, curl, and wiggle over about 700 frames
2973 (approx. 11 seconds). 2953 (approx. 11 seconds).
2974 2954
2975 #+caption: Program to gather the worm's experiences into a vector for 2955 #+caption: Program to gather the worm's experiences into a vector for
2976 #+caption: further processing. The =motor-control-program= line uses 2956 #+caption: further processing. The =motor-control-program= line uses
2977 #+caption: a motor control script that causes the worm to execute a series 2957 #+caption: a motor control script that causes the worm to execute a series
2978 #+caption: of ``exercices'' that include all the action predicates. 2958 #+caption: of ``exercises'' that include all the action predicates.
2979 #+name: generate-phi-space 2959 #+name: generate-phi-space
2980 #+begin_listing clojure 2960 #+begin_listing clojure
2981 #+begin_src clojure 2961 #+begin_src clojure
2982 (def do-all-the-things 2962 (def do-all-the-things
2983 (concat 2963 (concat
3037 on simulated sensory data just as well as with actual data. Figure 3017 on simulated sensory data just as well as with actual data. Figure
3038 \ref{empathy-debug-image} was generated using =empathy-experiment=: 3018 \ref{empathy-debug-image} was generated using =empathy-experiment=:
3039 3019
3040 #+caption: From only proprioceptive data, =EMPATH= was able to infer 3020 #+caption: From only proprioceptive data, =EMPATH= was able to infer
3041 #+caption: the complete sensory experience and classify four poses 3021 #+caption: the complete sensory experience and classify four poses
3042 #+caption: (The last panel shows a composite image of \emph{wriggling}, 3022 #+caption: (The last panel shows a composite image of /wiggling/,
3043 #+caption: a dynamic pose.) 3023 #+caption: a dynamic pose.)
3044 #+name: empathy-debug-image 3024 #+name: empathy-debug-image
3045 #+ATTR_LaTeX: :width 10cm :placement [H] 3025 #+ATTR_LaTeX: :width 10cm :placement [H]
3046 [[./images/empathy-1.png]] 3026 [[./images/empathy-1.png]]
3047 3027
3048 One way to measure the performance of =EMPATH= is to compare the 3028 One way to measure the performance of =EMPATH= is to compare the
3049 sutiability of the imagined sense experience to trigger the same 3029 suitability of the imagined sense experience to trigger the same
3050 action predicates as the real sensory experience. 3030 action predicates as the real sensory experience.
3051 3031
3052 #+caption: Determine how closely empathy approximates actual 3032 #+caption: Determine how closely empathy approximates actual
3053 #+caption: sensory data. 3033 #+caption: sensory data.
3054 #+name: test-empathy-accuracy 3034 #+name: test-empathy-accuracy
3084 #+end_src 3064 #+end_src
3085 #+end_listing 3065 #+end_listing
3086 3066
3087 Running =test-empathy-accuracy= using the very short exercise 3067 Running =test-empathy-accuracy= using the very short exercise
3088 program defined in listing \ref{generate-phi-space}, and then doing 3068 program defined in listing \ref{generate-phi-space}, and then doing
3089 a similar pattern of activity manually yeilds an accuracy of around 3069 a similar pattern of activity manually yields an accuracy of around
3090 73%. This is based on very limited worm experience. By training the 3070 73%. This is based on very limited worm experience. By training the
3091 worm for longer, the accuracy dramatically improves. 3071 worm for longer, the accuracy dramatically improves.
3092 3072
3093 #+caption: Program to generate \Phi-space using manual training. 3073 #+caption: Program to generate \Phi-space using manual training.
3094 #+name: manual-phi-space 3074 #+name: manual-phi-space
3111 After about 1 minute of manual training, I was able to achieve 95% 3091 After about 1 minute of manual training, I was able to achieve 95%
3112 accuracy on manual testing of the worm using =init-interactive= and 3092 accuracy on manual testing of the worm using =init-interactive= and
3113 =test-empathy-accuracy=. The majority of errors are near the 3093 =test-empathy-accuracy=. The majority of errors are near the
3114 boundaries of transitioning from one type of action to another. 3094 boundaries of transitioning from one type of action to another.
3115 During these transitions the exact label for the action is more open 3095 During these transitions the exact label for the action is more open
3116 to interpretation, and dissaggrement between empathy and experience 3096 to interpretation, and disagreement between empathy and experience
3117 is more excusable. 3097 is more excusable.
3118 3098
3119 ** Digression: Learn touch sensor layout through free play 3099 ** Digression: Learn touch sensor layout through free play
3120 3100
3121 In the previous section I showed how to compute actions in terms of 3101 In the previous section I showed how to compute actions in terms of
3122 body-centered predicates which relied averate touch activation of 3102 body-centered predicates which relied on the average touch
3123 pre-defined regions of the worm's skin. What if, instead of 3103 activation of pre-defined regions of the worm's skin. What if,
3124 recieving touch pre-grouped into the six faces of each worm 3104 instead of receiving touch pre-grouped into the six faces of each
3125 segment, the true topology of the worm's skin was unknown? This is 3105 worm segment, the true topology of the worm's skin was unknown?
3126 more similiar to how a nerve fiber bundle might be arranged. While 3106 This is more similar to how a nerve fiber bundle might be
3127 two fibers that are close in a nerve bundle /might/ correspond to 3107 arranged. While two fibers that are close in a nerve bundle /might/
3128 two touch sensors that are close together on the skin, the process 3108 correspond to two touch sensors that are close together on the
3129 of taking a complicated surface and forcing it into essentially a 3109 skin, the process of taking a complicated surface and forcing it
3130 circle requires some cuts and rerragenments. 3110 into essentially a circle requires some cuts and rearrangements.
3131 3111
3132 In this section I show how to automatically learn the skin-topology of 3112 In this section I show how to automatically learn the skin-topology of
3133 a worm segment by free exploration. As the worm rolls around on the 3113 a worm segment by free exploration. As the worm rolls around on the
3134 floor, large sections of its surface get activated. If the worm has 3114 floor, large sections of its surface get activated. If the worm has
3135 stopped moving, then whatever region of skin that is touching the 3115 stopped moving, then whatever region of skin that is touching the
3149 (= (set (map first touch)) (set full-contact))) 3129 (= (set (map first touch)) (set full-contact)))
3150 #+end_src 3130 #+end_src
3151 #+end_listing 3131 #+end_listing
3152 3132
3153 After collecting these important regions, there will many nearly 3133 After collecting these important regions, there will be many nearly
3154 similiar touch regions. While for some purposes the subtle 3134 similar touch regions. While for some purposes the subtle
3155 differences between these regions will be important, for my 3135 differences between these regions will be important, for my
3156 purposes I colapse them into mostly non-overlapping sets using 3136 purposes I collapse them into mostly non-overlapping sets using
3157 =remove-similiar= in listing \ref{remove-similiar} 3137 =remove-similar= in listing \ref{remove-similar}
3158 3138
3159 #+caption: Program to take a lits of set of points and ``collapse them'' 3139 #+caption: Program to take a list of sets of points and ``collapse them''
3160 #+caption: so that the remaining sets in the list are siginificantly 3140 #+caption: so that the remaining sets in the list are significantly
3161 #+caption: different from each other. Prefer smaller sets to larger ones. 3141 #+caption: different from each other. Prefer smaller sets to larger ones.
3162 #+name: remove-similiar 3142 #+name: remove-similar
3163 #+begin_listing clojure 3143 #+begin_listing clojure
3164 #+begin_src clojure 3144 #+begin_src clojure
3165 (defn remove-similar 3145 (defn remove-similar
3166 [coll] 3146 [coll]
3167 (loop [result () coll (sort-by (comp - count) coll)] 3147 (loop [result () coll (sort-by (comp - count) coll)]
3179 #+end_listing 3159 #+end_listing
3180 3160
3181 Actually running this simulation is easy given =CORTEX='s facilities. 3161 Actually running this simulation is easy given =CORTEX='s facilities.
3182 3162
3183 #+caption: Collect experiences while the worm moves around. Filter the touch 3163 #+caption: Collect experiences while the worm moves around. Filter the touch
3184 #+caption: sensations by stable ones, collapse similiar ones together, 3164 #+caption: sensations by stable ones, collapse similar ones together,
3185 #+caption: and report the regions learned. 3165 #+caption: and report the regions learned.
3186 #+name: learn-touch 3166 #+name: learn-touch
3187 #+begin_listing clojure 3167 #+begin_listing clojure
3188 #+begin_src clojure 3168 #+begin_src clojure
3189 (defn learn-touch-regions [] 3169 (defn learn-touch-regions []
3214 (map view-touch-region 3194 (map view-touch-region
3215 (learn-touch-regions))) 3195 (learn-touch-regions)))
3216 #+end_src 3196 #+end_src
3217 #+end_listing 3197 #+end_listing
3218 3198
3219 The only thing remining to define is the particular motion the worm 3199 The only thing remaining to define is the particular motion the worm
3220 must take. I accomplish this with a simple motor control program. 3200 must take. I accomplish this with a simple motor control program.
3221 3201
3222 #+caption: Motor control program for making the worm roll on the ground. 3202 #+caption: Motor control program for making the worm roll on the ground.
3223 #+caption: This could also be replaced with random motion. 3203 #+caption: This could also be replaced with random motion.
3224 #+name: worm-roll 3204 #+name: worm-roll
3273 3253
3274 While simple, =learn-touch-regions= exploits regularities in both 3254 While simple, =learn-touch-regions= exploits regularities in both
3275 the worm's physiology and the worm's environment to correctly 3255 the worm's physiology and the worm's environment to correctly
3276 deduce that the worm has six sides. Note that =learn-touch-regions= 3256 deduce that the worm has six sides. Note that =learn-touch-regions=
3277 would work just as well even if the worm's touch sense data were 3257 would work just as well even if the worm's touch sense data were
3278 completely scrambled. The cross shape is just for convienence. This 3258 completely scrambled. The cross shape is just for convenience. This
3279 example justifies the use of pre-defined touch regions in =EMPATH=. 3259 example justifies the use of pre-defined touch regions in =EMPATH=.
3280 3260
3281 * Contributions 3261 * Contributions
3282 3262
3283 In this thesis you have seen the =CORTEX= system, a complete 3263 In this thesis you have seen the =CORTEX= system, a complete
3284 environment for creating simulated creatures. You have seen how to 3264 environment for creating simulated creatures. You have seen how to
3285 implement five senses: touch, proprioception, hearing, vision, and 3265 implement five senses: touch, proprioception, hearing, vision, and
3286 muscle tension. You have seen how to create new creatues using 3266 muscle tension. You have seen how to create new creatures using
3287 blender, a 3D modeling tool. I hope that =CORTEX= will be useful in 3267 blender, a 3D modeling tool. I hope that =CORTEX= will be useful in
3288 further research projects. To this end I have included the full 3268 further research projects. To this end I have included the full
3289 source to =CORTEX= along with a large suite of tests and examples. I 3269 source to =CORTEX= along with a large suite of tests and examples. I
3290 have also created a user guide for =CORTEX= which is inculded in an 3270 have also created a user guide for =CORTEX= which is included in an
3291 appendix to this thesis \ref{}. 3271 appendix to this thesis.
3292 # dxh: todo reference appendix
3293 3272
3294 You have also seen how I used =CORTEX= as a platform to attach the 3273 You have also seen how I used =CORTEX= as a platform to attack the
3295 /action recognition/ problem, which is the problem of recognizing 3274 /action recognition/ problem, which is the problem of recognizing
3296 actions in video. You saw a simple system called =EMPATH= which 3275 actions in video. You saw a simple system called =EMPATH= which
3297 ientifies actions by first describing actions in a body-centerd, 3276 identifies actions by first describing actions in a body-centered,
3298 rich sense language, then infering a full range of sensory 3277 rich sense language, then inferring a full range of sensory
3299 experience from limited data using previous experience gained from 3278 experience from limited data using previous experience gained from
3300 free play. 3279 free play.
3301 3280
3302 As a minor digression, you also saw how I used =CORTEX= to enable a 3281 As a minor digression, you also saw how I used =CORTEX= to enable a
3303 tiny worm to discover the topology of its skin simply by rolling on 3282 tiny worm to discover the topology of its skin simply by rolling on
3304 the ground. 3283 the ground.
3305 3284
3306 In conclusion, the main contributions of this thesis are: 3285 In conclusion, the main contributions of this thesis are:
3307 3286
3308 - =CORTEX=, a system for creating simulated creatures with rich 3287 - =CORTEX=, a comprehensive platform for embodied AI experiments.
3309 senses. 3288 =CORTEX= supports many features lacking in other systems, such as
3310 - =EMPATH=, a program for recognizing actions by imagining sensory 3289 proper simulation of hearing. It is easy to create new =CORTEX=
3311 experience. 3290 creatures using Blender, a free 3D modeling program.
3312 3291
3313 # An anatomical joke: 3292 - =EMPATH=, which uses =CORTEX= to identify the actions of a
3314 # - Training 3293 worm-like creature using a computational model of empathy.
3315 # - Skeletal imitation 3294
3316 # - Sensory fleshing-out
3317 # - Classification
3318 #+BEGIN_LaTeX 3295 #+BEGIN_LaTeX
3319 \appendix 3296 \appendix
3320 #+END_LaTeX 3297 #+END_LaTeX
3298
3321 * Appendix: =CORTEX= User Guide 3299 * Appendix: =CORTEX= User Guide
3322 3300
3323 Those who write a thesis should endeavor to make their code not only 3301 Those who write a thesis should endeavor to make their code not only
3324 accessable, but actually useable, as a way to pay back the community 3302 accessible, but actually usable, as a way to pay back the community
3325 that made the thesis possible in the first place. This thesis would 3303 that made the thesis possible in the first place. This thesis would
3326 not be possible without Free Software such as jMonkeyEngine3, 3304 not be possible without Free Software such as jMonkeyEngine3,
3327 Blender, clojure, emacs, ffmpeg, and many other tools. That is why I 3305 Blender, clojure, emacs, ffmpeg, and many other tools. That is why I
3328 have included this user guide, in the hope that someone else might 3306 have included this user guide, in the hope that someone else might
3329 find =CORTEX= useful. 3307 find =CORTEX= useful.
3347 3325
3348 ** Creating creatures 3326 ** Creating creatures
3349 3327
3350 Creatures are created using /Blender/, a free 3D modeling program. 3328 Creatures are created using /Blender/, a free 3D modeling program.
3351 You will need Blender version 2.6 when using the =CORTEX= included 3329 You will need Blender version 2.6 when using the =CORTEX= included
3352 in this thesis. You create a =CORTEX= creature in a similiar manner 3330 in this thesis. You create a =CORTEX= creature in a similar manner
3353 to modeling anything in Blender, except that you also create 3331 to modeling anything in Blender, except that you also create
3354 several trees of empty nodes which define the creature's senses. 3332 several trees of empty nodes which define the creature's senses.
3355 3333
3356 *** Mass 3334 *** Mass
3357 3335
3415 The eye will point outward from the X-axis of the node, and ``up'' 3393 The eye will point outward from the X-axis of the node, and ``up''
3416 will be in the direction of the X-axis of the node. It will help 3394 will be in the direction of the X-axis of the node. It will help
3417 to set the empty node's display mode to ``Arrows'' so that you can 3395 to set the empty node's display mode to ``Arrows'' so that you can
3418 clearly see the direction of the axes. 3396 clearly see the direction of the axes.
3419 3397
3420 Each retina file should contain white pixels whever you want to be 3398 Each retina file should contain white pixels wherever you want to be
3421 sensitive to your chosen color. If you want the entire field of 3399 sensitive to your chosen color. If you want the entire field of
3422 view, specify :all of 0xFFFFFF and a retinal map that is entirely 3400 view, specify :all of 0xFFFFFF and a retinal map that is entirely
3423 white. 3401 white.
3424 3402
3425 Here is a sample retinal map: 3403 Here is a sample retinal map:
3451 #+BEGIN_EXAMPLE 3429 #+BEGIN_EXAMPLE
3452 <touch-UV-map-file-name> 3430 <touch-UV-map-file-name>
3453 #+END_EXAMPLE 3431 #+END_EXAMPLE
3454 3432
3455 You may also include an optional ``scale'' metadata number to 3433 You may also include an optional ``scale'' metadata number to
3456 specifiy the length of the touch feelers. The default is $0.1$, 3434 specify the length of the touch feelers. The default is $0.1$,
3457 and this is generally sufficient. 3435 and this is generally sufficient.
3458 3436
3459 The touch UV should contain white pixels for each touch sensor. 3437 The touch UV should contain white pixels for each touch sensor.
3460 3438
3461 Here is an example touch-uv map that approximates a human finger, 3439 Here is an example touch-uv map that approximates a human finger,
3473 #+caption: model of a fingertip. 3451 #+caption: model of a fingertip.
3474 #+name: guide-fingertip 3452 #+name: guide-fingertip
3475 #+ATTR_LaTeX: :width 9cm :placement [H] 3453 #+ATTR_LaTeX: :width 9cm :placement [H]
3476 [[./images/finger-2.png]] 3454 [[./images/finger-2.png]]
3477 3455
3478 *** Propriocepotion 3456 *** Proprioception
3479 3457
3480 Proprioception is tied to each joint node -- nothing special must 3458 Proprioception is tied to each joint node -- nothing special must
3481 be done in a blender model to enable proprioception other than 3459 be done in a blender model to enable proprioception other than
3482 creating joint nodes. 3460 creating joint nodes.
3483 3461
3580 3558
3581 - =(load-blender-model file-name)= :: create a node structure 3559 - =(load-blender-model file-name)= :: create a node structure
3582 representing that described in a blender file. 3560 representing that described in a blender file.
3583 3561
3584 - =(light-up-everything world)= :: distribute a standard compliment 3562 - =(light-up-everything world)= :: distribute a standard complement
3585 of lights throught the simulation. Should be adequate for most 3563 of lights throughout the simulation. Should be adequate for most
3586 purposes. 3564 purposes.
3587 3565
3588 - =(node-seq node)= :: return a recursuve list of the node's 3566 - =(node-seq node)= :: return a recursive list of the node's
3589 children. 3567 children.
3590 3568
3591 - =(nodify name children)= :: construct a node given a node-name and 3569 - =(nodify name children)= :: construct a node given a node-name and
3592 desired children. 3570 desired children.
3593 3571
3636 =[activation, length]= pairs for each touch hair. 3614 =[activation, length]= pairs for each touch hair.
3637 3615
3638 - =(proprioception! creature)= :: give the creature the sense of 3616 - =(proprioception! creature)= :: give the creature the sense of
3639 proprioception. Returns a list of functions, one for each 3617 proprioception. Returns a list of functions, one for each
3640 joint, that when called during a running simulation will 3618 joint, that when called during a running simulation will
3641 report the =[headnig, pitch, roll]= of the joint. 3619 report the =[heading, pitch, roll]= of the joint.
3642 3620
3643 - =(movement! creature)= :: give the creature the power of movement. 3621 - =(movement! creature)= :: give the creature the power of movement.
3644 Creates a list of functions, one for each muscle, that when 3622 Creates a list of functions, one for each muscle, that when
3645 called with an integer, will set the recruitment of that 3623 called with an integer, will set the recruitment of that
3646 muscle to that integer, and will report the current power 3624 muscle to that integer, and will report the current power
3675 3653
3676 - =(mega-import-jme3)= :: for experimenting at the REPL. This 3654 - =(mega-import-jme3)= :: for experimenting at the REPL. This
3677 function will import all jMonkeyEngine3 classes for immediate 3655 function will import all jMonkeyEngine3 classes for immediate
3678 use. 3656 use.
3679 3657
3680 - =(display-dialated-time world timer)= :: Shows the time as it is 3658 - =(display-dilated-time world timer)= :: Shows the time as it is
3681 flowing in the simulation on a HUD display. 3659 flowing in the simulation on a HUD display.
3682 3660
3683 3661
3684 3662
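As a closing example, here is a hedged sketch of how the functions above might be combined at the REPL. The model path is a placeholder, and the sketch assumes the relevant =CORTEX= namespaces are already loaded:

#+begin_src clojure
;; Illustrative only -- the model file name is a stand-in and error
;; handling is omitted. Each sense function returns a sequence of
;; per-sensor functions to be polled while the simulation is running.
(def model (load-blender-model "Models/creature/my-creature.blend"))

(def senses
  {:touch          (touch! model)           ; [activation, length] pairs
   :proprioception (proprioception! model)  ; [heading pitch roll] per joint
   :muscles        (movement! model)})      ; set/report muscle recruitment

;; During a running simulation, poll the first joint like this:
;; ((first (:proprioception senses)))
#+end_src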