     1 #+title: =CORTEX=

     2 #+author: Robert McIntyre

     3 #+email: rlm@mit.edu

     4 #+description: Using embodied AI to facilitate Artificial Imagination.

     5 #+keywords: AI, clojure, embodiment

     6 #+LaTeX_CLASS_OPTIONS: [nofloat]

     7 

     8 * COMMENT templates

     9    #+caption: 

    10    #+caption: 

    11    #+caption: 

    12    #+caption: 

    13    #+name: name

    14    #+begin_listing clojure

    15    #+BEGIN_SRC clojure

    16    #+END_SRC

    17    #+end_listing

    18 

    19    #+caption: 

    20    #+caption: 

    21    #+caption: 

    22    #+name: name

    23    #+ATTR_LaTeX: :width 10cm

    24    [[./images/aurellem-gray.png]]

    25 

    26     #+caption: 

    27     #+caption: 

    28     #+caption: 

    29     #+caption: 

    30     #+name: name

    31     #+begin_listing clojure

    32     #+BEGIN_SRC clojure

    33     #+END_SRC

    34     #+end_listing

    35 

    36     #+caption: 

    37     #+caption: 

    38     #+caption: 

    39     #+name: name

    40     #+ATTR_LaTeX: :width 10cm

    41     [[./images/aurellem-gray.png]]

    42 

    43 

    44 * Empathy \& Embodiment: problem solving strategies

    45 

    46   By the end of this thesis, you will have seen a novel approach to

    47   interpreting video using embodiment and empathy. You will have also

    48   seen one way to efficiently implement empathy for embodied

    49   creatures. Finally, you will become familiar with =CORTEX=, a system

    50   for designing and simulating creatures with rich senses, which you

    51   may choose to use in your own research.

    52   

    53   This is the core vision of my thesis: That one of the important ways

    54   in which we understand others is by imagining ourselves in their

    55   position and empathically feeling experiences relative to our own

    56   bodies. By understanding events in terms of our own previous

    57   corporeal experience, we greatly constrain the possibilities of what

    58   would otherwise be an unwieldy exponential search. This extra

    59   constraint can be the difference between easily understanding what

    60   is happening in a video and being completely lost in a sea of

    61   incomprehensible color and movement.

    62   

    63 ** The problem: recognizing actions in video is hard!

    64    

    65    Examine the following image. What is happening? As you, and indeed

    66    very young children, can easily determine, this is an image of

    67    drinking. 

    68 

    69    #+caption: A cat drinking some water. Identifying this action is 

    70    #+caption: beyond the capabilities of existing computer vision systems.

    71    #+ATTR_LaTeX: :width 7cm

    72    [[./images/cat-drinking.jpg]]

    73      

    74    Nevertheless, it is beyond the state of the art for a computer

    75    vision program to describe what's happening in this image. Part of

    76    the problem is that many computer vision systems focus on

    77    pixel-level details or comparisons to example images (such as

    78    \cite{volume-action-recognition}), but the 3D world is so variable

    79    that it is hard to describe the world in terms of possible images.

    80 

    81    In fact, the contents of a scene may have much less to do with pixel

    82    probabilities than with recognizing various affordances: things you

    83    can move, objects you can grasp, spaces that can be filled. For

    84    example, what processes might enable you to see the chair in figure

    85    \ref{hidden-chair}?

    86 

    87    #+caption: The chair in this image is quite obvious to humans, but I 

    88    #+caption: doubt that any modern computer vision program can find it.

    89    #+name: hidden-chair

    90    #+ATTR_LaTeX: :width 10cm

    91    [[./images/fat-person-sitting-at-desk.jpg]]

    92 

    93    Finally, how is it that you can easily tell the difference between

    94    how the girl's /muscles/ are working in figure \ref{girl}?

    95    

    96    #+caption: The mysterious ``common sense'' appears here as you are able 

    97    #+caption: to discern the difference in how the girl's arm muscles

    98    #+caption: are activated between the two images.

    99    #+name: girl

   100    #+ATTR_LaTeX: :width 7cm

   101    [[./images/wall-push.png]]

   102   

   103    Each of these examples tells us something about what might be going

   104    on in our minds as we easily solve these recognition problems:

   105    

   106    The hidden chair shows us that we are strongly triggered by cues

   107    relating to the position of human bodies, and that we can determine

   108    the overall physical configuration of a human body even if much of

   109    that body is occluded.

   110 

   111    The picture of the girl pushing against the wall tells us that we

   112    have common sense knowledge about the kinetics of our own bodies.

   113    We know well how our muscles would have to work to maintain us in

   114    most positions, and we can easily project this self-knowledge to

   115    imagined positions triggered by images of the human body.

   116 

   117    The cat tells us that imagination of some kind plays an important

   118    role in understanding actions. The question is: Can we be more

   119    precise about what sort of imagination is required to understand

   120    these actions?

   121 

   122 ** A step forward: the sensorimotor-centered approach

   123 

   124    In this thesis, I explore the idea that our knowledge of our own

   125    bodies, combined with our own rich senses, enables us to recognize

   126    the actions of others.

   127 

   128    For example, I think humans are able to label the cat video as

   129    ``drinking'' because they imagine /themselves/ as the cat, and

   130    imagine putting their face up against a stream of water and

   131    sticking out their tongue. In that imagined world, they can feel

   132    the cool water hitting their tongue, and feel the water entering

   133    their body, and are able to recognize that /feeling/ as drinking.

   134    So, the label of the action is not really in the pixels of the

   135    image, but is found clearly in a simulation inspired by those

   136    pixels. An imaginative system, having been trained on drinking and

   137    non-drinking examples and learning that the most important

   138    component of drinking is the feeling of water sliding down one's

   139    throat, would analyze a video of a cat drinking in the following

   140    manner:

   141    

   142    1. Create a physical model of the video by putting a ``fuzzy''

   143       model of its own body in place of the cat. Possibly also create

   144       a simulation of the stream of water.

   145 

   146    2. ``Play out'' this simulated scene and generate imagined sensory

   147       experience. This will include relevant muscle contractions, a

   148       close up view of the stream from the cat's perspective, and most

   149       importantly, the imagined feeling of water entering the mouth.

   150       The imagined sensory experience can come from a simulation of

   151       the event, but can also be pattern-matched from previous,

   152       similar embodied experience.

   153 

   154    3. The action is now easily identified as drinking by the sense of

   155       taste alone. The other senses (such as the tongue moving in and

   156       out) help to give plausibility to the simulated action. Note that

   157       the sense of vision, while critical in creating the simulation,

   158       is not critical for identifying the action from the simulation.

   159 

   160    For the chair example, the process is even easier:

   161 

   162     1. Align a model of your body to the person in the image.

   163 

   164     2. Generate proprioceptive sensory data from this alignment.

   165   

   166     3. Use the imagined proprioceptive data as a key to look up related

   167        sensory experience associated with that particular proprioceptive

   168        feeling.

   169 

   170     4. Retrieve the feeling of your bottom resting on a surface, your

   171        knees bent, and your leg muscles relaxed.

   172 

   173     5. This sensory information is consistent with your =sitting?=

   174        sensory predicate, so you (and the entity in the image) must be

   175        sitting.

   176 

   177     6. There must be a chair-like object since you are sitting.

   178 

   179    Empathy offers yet another alternative to the age-old AI

   180    representation question: ``What is a chair?'' --- A chair is the

   181    feeling of sitting!
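
   As a minimal illustration of steps 5 and 6 above, a =sitting?=
   sensory predicate might be sketched as follows. The map keys
   (=:bottom-contact=, =:knee-bend=, =:leg-tension=) and the thresholds
   are hypothetical, chosen only for this example; they are not part
   of =CORTEX=.

   #+begin_src clojure
;; Hypothetical sketch: keys and thresholds are illustrative assumptions.
(defn sitting?
  "Is this (imagined) sensory experience consistent with sitting?"
  [{:keys [bottom-contact knee-bend leg-tension]}]
  (and (< 0.5 bottom-contact)   ; bottom resting on a surface
       (< 1.0 knee-bend)        ; knees bent (radians)
       (< leg-tension 0.2)))    ; leg muscles relaxed

(defn chair-like-object?
  "If the imagined body is sitting, infer that something chair-like
   must be supporting it."
  [experience]
  (sitting? experience))

;; Experience retrieved by empathy for the person at the desk.
(def imagined-experience
  {:bottom-contact 0.9 :knee-bend 1.5 :leg-tension 0.05})

(chair-like-object? imagined-experience)
   #+end_src

   Note that the predicate never inspects pixels; it only consults
   the body-centered experience retrieved through alignment.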

   182 

   183    One powerful advantage of empathic problem solving is that it

   184    factors the action recognition problem into two easier problems. To

   185    use empathy, you need an /aligner/, which takes the video and a

   186    model of your body, and aligns the model with the video. Then, you

   187    need a /recognizer/, which uses the aligned model to interpret the

   188    action. The power in this method lies in the fact that you describe

   189    all actions from a body-centered viewpoint. You are less tied to

   190    the particulars of any visual representation of the actions. If you

   191    teach the system what ``running'' is, and you have a good enough

   192    aligner, the system will from then on be able to recognize running

   193    from any point of view, even strange points of view like above or

   194    underneath the runner. This is in contrast to action recognition

   195    schemes that try to identify actions using a non-embodied approach.

   196    If these systems learn about running as viewed from the side, they

   197    will not automatically be able to recognize running from any other

   198    viewpoint.
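
   The factoring into an /aligner/ and a /recognizer/ can be sketched
   as ordinary function composition. Everything in this example
   (=recognize-action=, =toy-aligner=, the =:viewpoint= and
   =:joint-angles= keys) is a hypothetical stand-in rather than the
   =EMPATH= implementation; the toy aligner simply copies joint angles
   out of the ``video'', which is the hard part in practice.

   #+begin_src clojure
;; Hypothetical sketch of the aligner/recognizer factoring.
(defn recognize-action
  "Align `body-model` with `video` using `aligner`, then return the set
   of labels whose body-centered predicate accepts the aligned model."
  [aligner recognizers body-model video]
  (let [aligned (aligner video body-model)]
    (set (for [[label pred] recognizers
               :when (pred aligned)]
           label))))

;; Toy aligner: assumes the video already exposes joint angles and
;; discards everything viewpoint-specific.
(defn toy-aligner [video body-model]
  (assoc body-model :joint-angles (:joint-angles video)))

;; Recognizers are written once, purely in body-centered terms.
(def toy-recognizers
  {:curled   (fn [model] (every? #(< 1.0 %) (:joint-angles model)))
   :straight (fn [model]
               (every? #(< (Math/abs (double %)) 0.1)
                       (:joint-angles model)))})

(def from-side  {:viewpoint :side  :joint-angles [1.2 1.3 1.1 1.4]})
(def from-above {:viewpoint :above :joint-angles [1.2 1.3 1.1 1.4]})

;; The same label results regardless of :viewpoint.
(recognize-action toy-aligner toy-recognizers {:segments 5} from-side)
(recognize-action toy-aligner toy-recognizers {:segments 5} from-above)
   #+end_src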

   199 

   200    Another powerful advantage is that using the language of multiple

   201    body-centered rich senses to describe body-centered actions offers a

   202    massive boost in descriptive capability. Consider how difficult it

   203    would be to compose a set of HOG filters to describe the action of

   204    a simple worm-creature ``curling'' so that its head touches its

   205    tail, and then behold the simplicity of describing this action in a

   206    language designed for the task (listing \ref{grand-circle-intro}):

   207 

   208    #+caption: Body-centered actions are best expressed in a body-centered 

   209    #+caption: language. This code detects when the worm has curled into a 

   210    #+caption: full circle. Imagine how you would replicate this functionality

   211    #+caption: using low-level pixel features such as HOG filters!

   212    #+name: grand-circle-intro

   213    #+begin_listing clojure

   214    #+begin_src clojure

   215 (defn grand-circle?

   216   "Does the worm form a majestic circle (one end touching the other)?"

   217   [experiences]

   218   (and (curled? experiences)

   219        (let [worm-touch (:touch (peek experiences))

   220              tail-touch (worm-touch 0)

   221              head-touch (worm-touch 4)]

   222          (and (< 0.2 (contact worm-segment-bottom-tip tail-touch))

   223               (< 0.2 (contact worm-segment-top-tip    head-touch))))))

   224    #+end_src

   225    #+end_listing

   226 

   227 ** =EMPATH= recognizes actions using empathy

   228 

   229    Exploring these ideas further demands a concrete implementation, so

   230    first, I built a system for constructing virtual creatures with

   231    physiologically plausible sensorimotor systems and detailed

   232    environments. The result is =CORTEX=, which is described in section

   233    \ref{sec-2}.

   234 

   235    Next, I wrote routines which enabled a simple worm-like creature to

   236    infer the actions of a second worm-like creature, using only its

   237    own prior sensorimotor experiences and knowledge of the second

   238    worm's joint positions. This program, =EMPATH=, is described in

   239    section \ref{sec-3}. Its main components are:

   240 

   241    - Embodied Action Definitions :: Many otherwise complicated actions

   242         are easily described in the language of a full suite of

   243         body-centered, rich senses and experiences. For example,

   244         drinking is the feeling of water sliding down your throat, and

   245         cooling your insides. It's often accompanied by bringing your

   246         hand close to your face, or bringing your face close to water.

   247         Sitting down is the feeling of bending your knees, activating

   248         your quadriceps, then feeling a surface with your bottom and

   249         relaxing your legs. These body-centered action descriptions

   250         can be either learned or hard coded.

   251 

   252    - Guided Play      :: The creature moves around and experiences the

   253         world through its unique perspective. As the creature moves,

   254         it gathers experiences that satisfy the embodied action

   255         definitions. 

   256 

   257    - Posture imitation :: When trying to interpret a video or image,

   258         the creature takes a model of itself and aligns it with

   259         whatever it sees. This alignment might even cross species, as

   260         when humans try to align themselves with things like ponies,

   261         dogs, or other humans with a different body type.

   262 

   263    - Empathy          :: The alignment triggers associations with

   264         sensory data from prior experiences. For example, the

   265         alignment itself easily maps to proprioceptive data. Any

   266         sounds or obvious skin contact in the video can to a lesser

   267         extent trigger previous experience keyed to hearing or touch.

   268         Segments of previous experiences gained from play are stitched

   269         together to form a coherent and complete sensory portrait of

   270         the scene.

   271 

   272    - Recognition      :: With the scene described in terms of

   273         remembered first person sensory events, the creature can now

   274         run its action-identifying programs (such as the one in listing

   275         \ref{grand-circle-intro}) on this synthesized sensory data,

   276         just as it would if it were actually experiencing the scene

   277         first-hand. If previous experience has been accurately

   278         retrieved, and if it is analogous enough to the scene, then

   279         the creature will correctly identify the action in the scene.
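
   One simple way to picture the empathy step is as a lookup into
   remembered experience, keyed by proprioception. The sketch below
   uses a plain nearest-neighbor search, which is an assumption made
   for illustration and not necessarily how =EMPATH= retrieves
   experience; the names =distance= and =imagine-experience= and the
   shape of the memory entries are likewise hypothetical.

   #+begin_src clojure
;; Hypothetical sketch: nearest-neighbor recall of past experience.
(defn distance
  "Sum of absolute differences between two joint-angle vectors."
  [a b]
  (reduce + (map (fn [x y] (Math/abs (double (- x y)))) a b)))

(defn imagine-experience
  "Return the remembered experience whose proprioception is closest
   to `observed-angles`. `memory` must be non-empty."
  [memory observed-angles]
  (apply min-key
         #(distance (:proprioception %) observed-angles)
         memory))

;; Experiences gathered during play, each pairing proprioception
;; with the other senses felt at that moment.
(def memory
  [{:proprioception [0.0 0.0 0.0] :touch :belly-on-floor}
   {:proprioception [1.5 1.5 1.5] :touch :head-on-tail}])

;; Joint angles observed in another worm recall the touch that
;; accompanied a similar posture in the past.
(:touch (imagine-experience memory [1.4 1.6 1.5]))
   #+end_src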

   280 

   281    My program, =EMPATH=, uses this empathic problem solving technique

   282    to interpret the actions of a simple, worm-like creature. 

   283    

   284    #+caption: The worm performs many actions during free play such as 

   285    #+caption: curling, wiggling, and resting.

   286    #+name: worm-intro

   287    #+ATTR_LaTeX: :width 15cm

   288    [[./images/worm-intro-white.png]]

   289 

   290    #+caption: =EMPATH= recognized and classified each of these 

   291    #+caption: poses by inferring the complete sensory experience 

   292    #+caption: from proprioceptive data.

   293    #+name: worm-recognition-intro

   294    #+ATTR_LaTeX: :width 15cm

   295    [[./images/worm-poses.png]]

   296    

   297 *** Main Results 

   298 

   299    - After one-shot supervised training, =EMPATH= was able to

   300      recognize a wide variety of static poses and dynamic

   301      actions---ranging from curling in a circle to wiggling with a

   302      particular frequency --- with 95\% accuracy.

   303 

   304    - These results were completely independent of viewing angle

   305      because the underlying body-centered language is fundamentally

   306      viewpoint-independent; once an action is learned, it can be recognized

   307      equally well from any viewing angle.

   308 

   309    - =EMPATH= is surprisingly short; the sensorimotor-centered

   310      language provided by =CORTEX= resulted in extremely economical

   311      recognition routines --- about 500 lines in all --- suggesting

   312      that such representations are very powerful, and often

   313      indispensable for the types of recognition tasks considered here.

   314 

   315    - Although for expediency's sake, I relied on direct knowledge of

   316      joint positions in this proof of concept, it would be

   317      straightforward to extend =EMPATH= so that it (more

   318      realistically) infers joint positions from its visual data.

   319 

   320 ** =EMPATH= is built on =CORTEX=, a creature builder.

   321 

   322    I built =CORTEX= to be a general AI research platform for doing

   323    experiments involving multiple rich senses and a wide variety and

   324    number of creatures. I intend it to be useful as a library for many

   325    more projects than just this thesis. =CORTEX= was necessary to meet

   326    a need among AI researchers at CSAIL and beyond, which is that

   327    people often will invent neat ideas that are best expressed in the

   328    language of creatures and senses, but in order to explore those

   329    ideas they must first build a platform in which they can create

   330    simulated creatures with rich senses! There are many ideas that

   331    would be simple to execute (such as =EMPATH= or

   332    \cite{larson-symbols}), but attached to them is the multi-month

   333    effort to make a good creature simulator. Often, that initial

   334    investment of time proves to be too much, and the project must make

   335    do with a lesser environment.

   336 

   337    =CORTEX= is well suited as an environment for embodied AI research

   338    for three reasons:

   339 

   340    - You can create new creatures using Blender (\cite{blender}), a

   341      popular 3D modeling program. Each sense can be specified using

   342      special Blender nodes with biologically inspired parameters. You

   343      need not write any code to create a creature, and can use a wide

   344      library of pre-existing Blender models as a base for your own

   345      creatures.

   346 

   347    - =CORTEX= implements a wide variety of senses: touch,

   348      proprioception, vision, hearing, and muscle tension. Complicated

   349      senses like touch and vision involve multiple sensory elements

   350      embedded in a 2D surface. You have complete control over the

   351      distribution of these sensor elements through the use of simple

   352      png image files. In particular, =CORTEX= implements more

   353      comprehensive hearing than any other creature simulation system

   354      available.

   355 

   356    - =CORTEX= supports any number of creatures and any number of

   357      senses. Time in =CORTEX= dilates so that the simulated creatures

   358      always perceive a perfectly smooth flow of time, regardless of

   359      the actual computational load.

   360 

   361    =CORTEX= is built on top of =jMonkeyEngine3=

   362    (\cite{jmonkeyengine}), which is a video game engine designed to

   363    create cross-platform 3D desktop games. =CORTEX= is mainly written

   364    in Clojure, a dialect of =LISP= that runs on the Java virtual

   365    machine (JVM). The API for creating and simulating creatures and

   366    senses is entirely expressed in Clojure, though many senses are

   367    implemented at the layer of jMonkeyEngine or below. For example,

   368    for the sense of hearing I use a layer of Clojure code on top of a

   369    layer of Java JNI bindings that drive a layer of =C++= code which

   370    implements a modified version of =OpenAL= to support multiple

   371    listeners. =CORTEX= is the only simulation environment that I know

   372    of that can support multiple entities that can each hear the world

   373    from their own perspective. Other senses also require a small layer

   374    of Java code. =CORTEX= also uses =bullet=, a physics simulator

   375    written in =C++=.

   376 

   377    #+caption: Here is the worm from figure \ref{worm-intro} modeled 

   378    #+caption: in Blender, a free 3D-modeling program. Senses and 

   379    #+caption: joints are described using special nodes in Blender.

   380    #+name: worm-recognition-intro-2

   381    #+ATTR_LaTeX: :width 12cm

   382    [[./images/blender-worm.png]]

   383 

   384    Here are some things I anticipate that =CORTEX= might be used for:

   385 

   386    - exploring new ideas about sensory integration

   387    - distributed communication among swarm creatures

   388    - self-learning using free exploration

   389    - evolutionary algorithms involving creature construction

   390    - exploration of exotic senses and effectors that are not possible

   391      in the real world (such as telekinesis or a semantic sense)

   392    - imagination using subworlds

   393 

   394    During one test with =CORTEX=, I created 3,000 creatures each with

   395    their own independent senses and ran them all at only 1/80 real

   396    time. In another test, I created a detailed model of my own hand,

   397    equipped with a realistic distribution of touch (more sensitive at

   398    the fingertips), as well as eyes and ears, and it ran at around 1/4

   399    real time.

   400 

   401 #+BEGIN_LaTeX

   402    \begin{sidewaysfigure}

   403    \includegraphics[width=9.5in]{images/full-hand.png}

   404    \caption{

   405    I modeled my own right hand in Blender and rigged it with all the

   406    senses that {\tt CORTEX} supports. My simulated hand has a

   407    biologically inspired distribution of touch sensors. The senses are

   408    displayed on the right, and the simulation is displayed on the

   409    left. Notice that my hand is curling its fingers, that it can see

   410    its own finger from the eye in its palm, and that it can feel its

   411    own thumb touching its palm.}

   412    \end{sidewaysfigure}

   413 #+END_LaTeX

   414 

   415 * Designing =CORTEX=

   416 

   417   In this section, I outline the design decisions that went into

   418   making =CORTEX=, along with some details about its implementation.

   419   (A practical guide to getting started with =CORTEX=, which skips

   420   over the history and implementation details presented here, is

   421   provided in an appendix at the end of this thesis.)

   422 

   423   Throughout this project, I intended for =CORTEX= to be flexible and

   424   extensible enough to be useful for other researchers who want to

   425   test out ideas of their own. To this end, wherever I have had to make

   426   architectural choices about =CORTEX=, I have chosen to give as much

   427   freedom to the user as possible, so that =CORTEX= may be used for

   428   things I have not foreseen.

   429 

   430 ** Building in simulation versus reality

   431    The most important architectural decision of all is the choice to

   432    use a computer-simulated environment in the first place! The world

   433    is a vast and rich place, and for now simulations are a very poor

   434    reflection of its complexity. It may be that there is a significant

   435    qualitative difference between dealing with senses in the real

   436    world and dealing with pale facsimiles of them in a simulation

   437    \cite{brooks-representation}. What are the advantages and

   438    disadvantages of a simulation vs. reality?

   439    

   440 *** Simulation

   441 

   442     The advantages of virtual reality are that when everything is a

   443     simulation, experiments in that simulation are absolutely

   444     reproducible. It's also easier to change the character and world

   445     to explore new situations and different sensory combinations.

   446 

   447     If the world is to be simulated on a computer, then not only do

   448     you have to worry about whether the character's senses are rich

   449     enough to learn from the world, but whether the world itself is

   450     rendered with enough detail and realism to give enough working

   451     material to the character's senses. To name just a few

   452     difficulties facing modern physics simulators: destructibility of

   453     the environment, simulation of water/other fluids, large areas,

   454     nonrigid bodies, lots of objects, smoke. I don't know of any

   455     computer simulation that would allow a character to take a rock

   456     and grind it into fine dust, then use that dust to make a clay

   457     sculpture, at least not without spending years calculating the

   458     interactions of every single small grain of dust. Maybe a

   459     simulated world with today's limitations doesn't provide enough

   460     richness for real intelligence to evolve.

   461 

   462 *** Reality

   463 

   464     The other approach for playing with senses is to hook your

   465     software up to real cameras, microphones, robots, etc., and let it

   466     loose in the real world. This has the advantage of eliminating

   467     concerns about simulating the world at the expense of increasing

   468     the complexity of implementing the senses. Instead of just

   469     grabbing the current rendered frame for processing, you have to

   470     use an actual camera with real lenses and interact with photons to

   471     get an image. It is much harder to change the character, which is

   472     now partly a physical robot of some sort, since doing so involves

   473     changing things around in the real world instead of modifying

   474     lines of code. While the real world is very rich and definitely

   475     provides enough stimulation for intelligence to develop as

   476     evidenced by our own existence, it is also uncontrollable in the

   477     sense that a particular situation cannot be recreated perfectly or

   478     saved for later use. It is harder to conduct science because it is

   479     harder to repeat an experiment. The worst thing about using the

   480     real world instead of a simulation is the matter of time. Instead

   481     of simulated time you get the constant and unstoppable flow of

   482     real time. This severely limits the sorts of software you can use

   483     to program the AI because all sense inputs must be handled in real

   484     time. Complicated ideas may have to be implemented in hardware or

   485     may simply be impossible given the current speed of our

   486     processors. Contrast this with a simulation, in which the flow of

   487     time in the simulated world can be slowed down to accommodate the

   488     limitations of the character's programming. In terms of cost,

   489     doing everything in software is far cheaper than building custom

   490     real-time hardware. All you need is a laptop and some patience.

   491     

   492 ** Simulated time enables rapid prototyping \& simple programs

   493 

   494    I envision =CORTEX= being used to support rapid prototyping and

   495    iteration of ideas. Even if I could put together a well constructed

   496    kit for creating robots, it would still not be enough because of

   497    the scourge of real-time processing. Anyone who wants to test their

   498    ideas in the real world must always worry about getting their

   499    algorithms to run fast enough to process information in real time.

   500    The need for real time processing only increases if multiple senses

   501    are involved. In the extreme case, even simple algorithms will have

   502    to be accelerated by ASIC chips or FPGAs, turning what would

   503    otherwise be a few lines of code and a 10x speed penalty into a

   504    multi-month ordeal. For this reason, =CORTEX= supports

   505    /time-dilation/, which scales back the framerate of the

   506    simulation in proportion to the amount of processing each frame requires.

   507    From the perspective of the creatures inside the simulation, time

   508    always appears to flow at a constant rate, regardless of how

   509    complicated the environment becomes or how many creatures are in

   510    the simulation. The cost is that =CORTEX= can sometimes run slower

   511    than real time. This can also be an advantage, however ---

   512    simulations of very simple creatures in =CORTEX= generally run at

   513    40x real time on my machine!
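
   The essence of time dilation can be shown with a fixed simulated
   time step: each frame advances the world clock by the same amount,
   however long the creature's processing takes in wall-clock time.
   The sketch below is a toy model under that assumption; =dt=,
   =step=, =run=, and the two =think= functions are hypothetical names
   and not part of the =CORTEX= API.

   #+begin_src clojure
;; Hypothetical sketch: simulated time is decoupled from wall-clock time.
(def dt (/ 1.0 60))  ; simulated seconds per frame (assumed value)

(defn step
  "Advance the world by one frame. However long `think` takes to run,
   simulated time moves forward by exactly `dt`."
  [world think]
  (-> world
      (update :frame inc)
      (update :sim-time + dt)
      (update :log conj (think world))))

(defn run
  "Run `n` frames with the given `think` function."
  [think n]
  (nth (iterate #(step % think) {:frame 0 :sim-time 0.0 :log []}) n))

(defn fast-think [world] :done)
(defn slow-think [world] (Thread/sleep 2) :done)  ; expensive processing

;; Both creatures perceive the same elapsed simulated time, although
;; the second run takes longer in real time.
(= (:sim-time (run fast-think 10))
   (:sim-time (run slow-think 10)))
   #+end_src

   The slower run costs more real time, which is exactly the trade
   described above: the creature's view of time stays smooth while
   the observer waits longer.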

   514 

   515 ** All sense organs are two-dimensional surfaces

   516 

   517    If =CORTEX= is to support a wide variety of senses, it would help

   518    to have a better understanding of what a ``sense'' actually is!

   519    While vision, touch, and hearing all seem like they are quite

   520    different things, I was surprised to learn during the course of

   521    this thesis that they (and all physical senses) can be expressed as

   522    exactly the same mathematical object due to a dimensional argument!

   523 

   524    Human beings are three-dimensional objects, and the nerves that

   525    transmit data from our various sense organs to our brain are

   526    essentially one-dimensional. This leaves up to two dimensions in

   527    which our sensory information may flow. For example, imagine your

   528    skin: it is a two-dimensional surface around a three-dimensional

   529    object (your body). It has discrete touch sensors embedded at

   530    various points, and the density of these sensors corresponds to the

   531    sensitivity of that region of skin. Each touch sensor connects to a

   532    nerve, all of which eventually are bundled together as they travel

   533    up the spinal cord to the brain. Intersect the spinal nerves with a

   534    guillotining plane and you will see all of the sensory data of the

   535    skin revealed in a roughly circular two-dimensional image which is

   536    the cross section of the spinal cord. Points on this image that are

   537    close together in this circle represent touch sensors that are

   538    /probably/ close together on the skin, although there is of course

   539    some cutting and rearrangement that has to be done to transfer the

   540    complicated surface of the skin onto a two-dimensional image.

   541 

   542    Most human senses consist of many discrete sensors of various

   543    properties distributed along a surface at various densities. For

   544    skin, it is Pacinian corpuscles, Meissner's corpuscles, Merkel's

   545    disks, and Ruffini's endings \cite{textbook901}, which detect

   546    pressure and vibration of various intensities. For ears, it is the

   547    stereocilia distributed along the basilar membrane inside the

   548    cochlea; each one is sensitive to a slightly different frequency of

   549    sound. For eyes, it is rods and cones distributed along the surface

   550    of the retina. In each case, we can describe the sense with a

   551    surface and a distribution of sensors along that surface.

   552 

   553    The neat idea is that every human sense can be effectively

   554    described in terms of a surface containing embedded sensors. If the

   555    sense had any more dimensions, then there wouldn't be enough room

   556    in the spinal cord to transmit the information!

   557 

   558    Therefore, =CORTEX= must support the ability to create objects and

   559    then be able to ``paint'' points along their surfaces to describe

   560    each sense. 

   561 

   562    Fortunately this idea is already a well known computer graphics

   563    technique called /UV-mapping/. The three-dimensional surface of a

   564    model is cut and smooshed until it fits on a two-dimensional

   565    image. You paint whatever you want on that image, and when the

   566    three-dimensional shape is rendered in a game the smooshing and

   567    cutting is reversed and the image appears on the three-dimensional

   568    object.

   569 

   570    To make a sense, interpret the UV-image as describing the

   571    distribution of that sense's sensors. To get different types of

   572    sensors, you can either use a different color for each type of

   573    sensor, or use multiple UV-maps, each labeled with that sensor

   574    type. I generally use a white pixel to mean the presence of a

   575    sensor and a black pixel to mean the absence of a sensor, and use

   576    one UV-map for each sensor-type within a given sense. 
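
   As a minimal sketch of this convention (not the code =CORTEX=
   itself uses), the sensor positions in a UV-image can be recovered
   by collecting the coordinates of its white pixels. The names
   =white-pixel?=, =sensor-coordinates=, and =example-uv-image= are
   hypothetical, and a tiny in-memory image stands in for a real
   UV-map file.

   #+begin_src clojure
(import '(java.awt.image BufferedImage))

(defn white-pixel?
  "True when the pixel at [x y] is pure white, ignoring alpha."
  [^BufferedImage image x y]
  (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image (int x) (int y)))))

(defn sensor-coordinates
  "Return the [x y] coordinates of every white pixel in image,
   scanning row by row."
  [^BufferedImage image]
  (vec (for [y (range (.getHeight image))
             x (range (.getWidth image))
             :when (white-pixel? image x y)]
         [x y])))

;; A 4x4 stand-in for a UV-image. New images start out black, so
;; only the two pixels set here count as sensors.
(def example-uv-image
  (doto (BufferedImage. 4 4 BufferedImage/TYPE_INT_RGB)
    (.setRGB 1 0 0xFFFFFF)
    (.setRGB 3 2 0xFFFFFF)))

(sensor-coordinates example-uv-image)
;; => [[1 0] [3 2]]
   #+end_src

   With one such image per sensor type, each image yields its own
   list of sensor coordinates.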

   577 

   578    #+CAPTION: The UV-map for an elongated icosphere. The white

   579    #+caption: dots each represent a touch sensor. They are dense 

   580    #+caption: in the regions that describe the tip of the finger, 

   581    #+caption: and less dense along the dorsal side of the finger 

   582    #+caption: opposite the tip.

   583    #+name: finger-UV

   584    #+ATTR_latex: :width 10cm

   585    [[./images/finger-UV.png]]

   586 

   587    #+caption: Ventral side of the UV-mapped finger. Notice the 

   588    #+caption: density of touch sensors at the tip.

   589    #+name: finger-side-view

   590    #+ATTR_LaTeX: :width 10cm

   591    [[./images/finger-1.png]]

   592 

   593 ** Video game engines provide ready-made physics and shading

   594    

   595    I did not need to write my own physics simulation code or shader to

   596    build =CORTEX=. Doing so would lead to a system that is impossible

   597    for anyone but myself to use anyway. Instead, I use a video game

   598    engine as a base and modify it to accommodate the additional needs

   599    of =CORTEX=. Video game engines are an ideal starting point to

   600    build =CORTEX=, because they are not far from being creature

   601    building systems themselves.

   602    

   603    First off, general purpose video game engines come with a physics

   604    engine and lighting / sound system. The physics system provides

   605    tools that can be co-opted to serve as touch, proprioception, and

   606    muscles. Since some games support split screen views, a good video

   607    game engine will allow you to efficiently create multiple cameras

   608    in the simulated world that can be used as eyes. Video game systems

   609    offer integrated asset management for things like textures and

   610    creature models, providing an avenue for defining creatures. They

   611    also understand UV-mapping, since this technique is used to apply a

   612    texture to a model. Finally, because video game engines support a

   613    large number of users, as long as =CORTEX= doesn't stray too far

   614    from the base system, other researchers can turn to this community

   615    for help when doing their research.

   616    

   617 ** =CORTEX= is based on jMonkeyEngine3

   618 

   619    While preparing to build =CORTEX= I studied several video game

   620    engines to see which would best serve as a base. The top contenders

   621    were:

   622 

   623    - [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]    :: The Quake II engine was designed by id

   624         Software in 1997.  All the source code was released by id

   625         Software under the GPL several years ago, and as a

   626         result it has been ported to many different languages. This

   627         engine was famous for its advanced use of realistic shading

   628         and had decent and fast physics simulation. The main advantage

   629         of the Quake II engine is its simplicity, but I ultimately

   630         rejected it because the engine is too tied to the concept of a

   631         first-person shooter game. One of the problems I had was that

   632         there does not seem to be any easy way to attach multiple

   633         cameras to a single character. There are also several physics

   634         clipping issues that are corrected in a way that only applies

   635         to the main character and do not apply to arbitrary objects.

   636 

   637    - [[http://source.valvesoftware.com/][Source Engine]]     :: The Source Engine evolved from the Quake II

   638         and Quake I engines and is used by Valve in the Half-Life

   639         series of games. The physics simulation in the Source Engine

   640         is quite accurate and probably the best out of all the engines

   641         I investigated. There is also an extensive community actively

   642         working with the engine. However, applications that use the

   643         Source Engine must be written in C++, the code is not open, it

   644         only runs on Windows, and the tools that come with the SDK to

   645         handle models and textures are complicated and awkward to use.

   646 

   647    -  [[http://jmonkeyengine.com/][jMonkeyEngine3]] :: jMonkeyEngine3 is a new library for creating

   648         games in Java. It uses OpenGL to render to the screen and uses

   649         scene graphs to avoid drawing things that do not appear on the

   650         screen. It has an active community and several games in the

   651         pipeline. The engine was not built to serve any particular

   652         game but is instead meant to be used for any 3D game. 

   653 

   654    I chose jMonkeyEngine3 because it had the most features out of all

   655    the free projects I looked at, and because I could then write my

   656    code in Clojure, a dialect of =LISP= that runs on the JVM.

   657 

   658 ** =CORTEX= uses Blender to create creature models

   659 

   660    For the simple worm-like creatures I will use later on in this

   661    thesis, I could define a simple API in =CORTEX= that would allow

   662    one to create boxes, spheres, etc., and leave that API as the sole

   663    way to create creatures. However, for =CORTEX= to truly be useful

   664    for other projects, it needs a way to construct complicated

   665    creatures. If possible, it would be nice to leverage work that has

   666    already been done by the community of 3D modelers, or at least

   667    enable people who are talented at modeling but not programming to

   668    design =CORTEX= creatures.

   669 

   670    Therefore, I use Blender, a free 3D modeling program, as the main

   671    way to create creatures in =CORTEX=. However, the creatures modeled

   672    in Blender must also be simple to simulate in jMonkeyEngine3's game

   673    engine, and must also be easy to rig with =CORTEX='s senses. I

   674    accomplish this with extensive use of Blender's ``empty nodes.'' 

   675 

   676    Empty nodes have no mass, physical presence, or appearance, but

   677    they can hold metadata and have names. I use a tree structure of

   678    empty nodes to specify senses in the following manner:

   679 

   680    - Create a single top-level empty node whose name is the name of

   681      the sense.

   682    - Add empty nodes which each contain meta-data relevant to the

   683      sense, including a UV-map describing the number/distribution of

   684      sensors if applicable.

   685    - Make each empty-node the child of the top-level node.
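
   The tree structure above can be sketched with plain Clojure maps.
   This is only an illustration: =example-creature=, =marker-finder=,
   and the =:meta= keys are hypothetical stand-ins, not Blender or
   jMonkeyEngine3 objects.

   #+begin_src clojure
;; Hypothetical stand-in for a creature loaded from a blender file.
(def example-creature
  {:name "creature"
   :children
   [{:name "body"}
    {:name "eyes"
     :children [{:name "eye.left"  :meta {:uv-map "retina-left.png"}}
                {:name "eye.right" :meta {:uv-map "retina-right.png"}}]}
    {:name "joints"
     :children [{:name "joint.001" :meta {:type :point}}]}]})

(defn marker-finder
  "Return a function that finds the children of the top-level node
   named parent-name, or [] when the creature lacks that sense."
  [parent-name]
  (fn [creature]
    (let [parent (first (filter #(= parent-name (:name %))
                                (:children creature)))]
      (vec (:children parent)))))

(def eye-markers (marker-finder "eyes"))
(def ear-markers (marker-finder "ears"))

(map :name (eye-markers example-creature))
;; => ("eye.left" "eye.right")
(ear-markers example-creature)
;; => []
   #+end_src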

   686      

   687    #+caption: An example of annotating a creature model with empty

   688    #+caption: nodes to describe the layout of senses. There are 

   689    #+caption: multiple empty nodes which each describe the position

   690    #+caption: of muscles, ears, eyes, or joints.

   691    #+name: sense-nodes

   692    #+ATTR_LaTeX: :width 10cm

   693    [[./images/empty-sense-nodes.png]]

   694 

   695 ** Bodies are composed of segments connected by joints

   696 

   697    Blender is a general purpose animation tool, which has been used in

   698    the past to create high quality movies such as Sintel

   699    \cite{blender}. Though Blender can model and render even complicated

   700    things like water, it is crucial to keep models that are meant to

   701    be simulated as creatures simple. =Bullet=, which =CORTEX= uses

   702    through jMonkeyEngine3, is a rigid-body physics system. This offers

   703    a compromise between the expressiveness of a game level and the

   704    speed at which it can be simulated, and it means that creatures

   705    should be naturally expressed as rigid components held together by

   706    joint constraints.

   707 

   708    But humans are more like a squishy bag wrapped around some hard

   709    bones which define the overall shape. When we move, our skin bends

   710    and stretches to accommodate the new positions of our bones.

   711 

   712    One way to make bodies composed of rigid pieces connected by joints

   713    /seem/ more human-like is to use an /armature/ (or /rigging/)

   714    system, which defines an overall ``body mesh'' and defines how the

   715    mesh deforms as a function of the position of each ``bone'' which

   716    is a standard rigid body. This technique is used extensively to

   717    model humans and create realistic animations. It is not a good

   718    technique for physical simulation because it is a lie -- the skin

   719    is not a physical part of the simulation and does not interact with

   720    any objects in the world or itself. Objects will pass right through

   721    the skin until they come in contact with the underlying bone, which

   722    is a physical object. Without simulating the skin, the sense of

   723    touch has little meaning, and the creature's own vision will lie to

   724    it about the true extent of its body. Simulating the skin as a

   725    physical object requires some way to continuously update the

   726    physical model of the skin along with the movement of the bones,

   727    which is unacceptably slow compared to rigid body simulation.

   728 

   729    Therefore, instead of using the human-like ``deformable bag of

   730    bones'' approach, I decided to base my body plans on multiple solid

   731    objects that are connected by joints, inspired by the robot =EVE=

   732    from the movie WALL-E.

   733    

   734    #+caption: =EVE= from the movie WALL-E.  This body plan turns 

   735    #+caption: out to be much better suited to my purposes than a more 

   736    #+caption: human-like one.

   737    #+ATTR_LaTeX: :width 10cm

   738    [[./images/Eve.jpg]]

   739 

   740    =EVE='s body is composed of several rigid components that are held

   741    together by invisible joint constraints. This is what I mean by

   742    ``eve-like''. The main reason that I use eve-style bodies is for

   743    efficiency, and so that there will be correspondence between the

   744    AI's senses and the physical presence of its body. Each individual

   745    section is simulated by a separate rigid body that corresponds

   746    exactly with its visual representation and does not change.

   747    Sections are connected by invisible joints that are well supported

   748    in jMonkeyEngine3. Bullet, the physics backend for jMonkeyEngine3,

   749    can efficiently simulate hundreds of rigid bodies connected by

   750    joints. Just because sections are rigid does not mean they have to

   751    stay as one piece forever; they can be dynamically replaced with

   752    multiple sections to simulate splitting in two. This could be used

   753    to simulate retractable claws or =EVE='s hands, which are able to

   754    coalesce into one object in the movie.

   755 

   756 *** Solidifying/Connecting a body

   757 

   758     =CORTEX= creates a creature in two steps: first, it traverses the

   759     nodes in the blender file and creates physical representations for

   760     any of them that have mass defined in their blender meta-data.

   761 

   762    #+caption: Program for iterating through the nodes in a blender file

   763    #+caption: and generating physical jMonkeyEngine3 objects with mass

   764    #+caption: and a matching physics shape.

   765    #+name: physical

   766    #+begin_listing clojure

   767    #+begin_src clojure

(defn physical!
  "Iterate through the nodes in creature and make them real physical
   objects in the simulation."
  [#^Node creature]
  (dorun
   (map
    (fn [geom]
      (let [physics-control
            (RigidBodyControl.
             (HullCollisionShape.
              (.getMesh geom))
             ;; Use the mass from the blender metadata, or 1 if absent.
             (if-let [mass (meta-data geom "mass")]
               (float mass) (float 1)))]
        (.addControl geom physics-control)))
    (filter #(isa? (class %) Geometry)
            (node-seq creature)))))

   784    #+end_src

   785    #+end_listing

   786    

   787     The next step to making a proper body is to connect those pieces

   788     together with joints. jMonkeyEngine has a large array of joints

   789     available via =bullet=, such as Point2Point, Cone, Hinge, and a

   790     generic Six Degree of Freedom joint, with or without spring

   791     restitution. 

   792 

   793     Joints are treated a lot like proper senses, in that there is a

   794     top-level empty node named ``joints'' whose children each

   795     represent a joint.

   796 

   797     #+caption: View of the hand model in Blender showing the main ``joints''

   798     #+caption: node (highlighted in yellow) and its children which each

   799     #+caption: represent a joint in the hand. Each joint node has metadata

   800     #+caption: specifying what sort of joint it is.

   801     #+name: blender-hand

   802     #+ATTR_LaTeX: :width 10cm

   803     [[./images/hand-screenshot1.png]]

   804 

   805 

   806     =CORTEX='s procedure for binding the creature together with joints

   807     is as follows:

   808     

   809     - Find the children of the ``joints'' node.

   810     - Determine the two spatials the joint is meant to connect.

   811     - Create the joint based on the meta-data of the empty node.

   812 

   813     The higher order function =sense-nodes= from =cortex.sense=

   814     simplifies finding the joints based on their parent ``joints''

   815     node.

   816 

   817    #+caption: Retrieving the children empty nodes from a single 

   818    #+caption: named empty node is a common pattern in =CORTEX=;

   819    #+caption: further instances of this technique for the senses 

   820    #+caption: will be omitted.

   821    #+name: get-empty-nodes

   822    #+begin_listing clojure

   823    #+begin_src clojure

(defn sense-nodes
  "For some senses there is a special empty blender node whose
   children are considered markers for an instance of that sense. This
   function generates functions to find those children, given the name
   of the special parent node."
  [parent-name]
  (fn [#^Node creature]
    (if-let [sense-node (.getChild creature parent-name)]
      (seq (.getChildren sense-node)) [])))

(def
  ^{:doc "Return the children of the creature's \"joints\" node."
    :arglists '([creature])}
  joints
  (sense-nodes "joints"))

   839    #+end_src

   840    #+end_listing

   841 

   842     To find a joint's targets, =CORTEX= creates a small cube, centered

   843     around the empty-node, and grows the cube exponentially until it

   844     intersects two physical objects. The objects are ordered according

   845     to the joint's rotation, with the first one being the object that

   846     has more negative coordinates in the joint's reference frame.

   847     Since the objects must be physical, the empty-node itself escapes

   848     detection. Because the objects must be physical, =joint-targets=

   849     must be called /after/ =physical!= is called.

   850    

   851     #+caption: Program to find the targets of a joint node by 

   852     #+caption: exponential growth of a search cube.

   853     #+name: joint-targets

   854     #+begin_listing clojure

   855     #+begin_src clojure

(defn joint-targets
  "Return the two closest objects to the joint object, ordered
  from bottom to top according to the joint's rotation."
  [#^Node parts #^Node joint]
  (loop [radius (float 0.01)]
    (let [results (CollisionResults.)]
      ;; Collect everything that intersects a cube of the current size.
      (.collideWith
       parts
       (BoundingBox. (.getWorldTranslation joint)
                     radius radius radius) results)
      (let [targets
            (distinct
             (map  #(.getGeometry %) results))]
        (if (>= (count targets) 2)
          (sort-by
           #(let [joint-ref-frame-position
                  (jme-to-blender
                   (.mult
                    (.inverse (.getWorldRotation joint))
                    (.subtract (.getWorldTranslation %)
                               (.getWorldTranslation joint))))]
              (.dot (Vector3f. 1 1 1) joint-ref-frame-position))
           (take 2 targets))
          ;; Too few hits: double the cube's size and search again.
          (recur (float (* radius 2))))))))

   880     #+end_src

   881     #+end_listing
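
    The growing-cube search can also be tried without a physics
    engine. The following is a minimal sketch under simplifying
    assumptions: body parts are axis-aligned boxes stored as plain
    maps, the joint is unrotated so its reference frame matches the
    world frame, and every name (=example-parts=,
    =cube-touches-box?=, =box-center=, =joint-targets-sketch=) is
    hypothetical.

    #+begin_src clojure
;; Hypothetical stand-ins for physical objects: axis-aligned boxes
;; given by their lowest (:lo) and highest (:hi) corners.
(def example-parts
  [{:name "upper" :lo [-0.5 -0.5 0.1]  :hi [0.5 0.5 2.0]}
   {:name "lower" :lo [-0.5 -0.5 -2.0] :hi [0.5 0.5 -0.1]}
   {:name "far"   :lo [5.0 5.0 5.0]    :hi [6.0 6.0 6.0]}])

(defn cube-touches-box?
  "True when the cube with the given center and half-width overlaps
   the box on every axis."
  [center half-width {:keys [lo hi]}]
  (every? identity
          (map (fn [c l h]
                 (and (<= l (+ c half-width))
                      (>= h (- c half-width))))
               center lo hi)))

(defn box-center
  "Midpoint of a box."
  [{:keys [lo hi]}]
  (mapv #(/ (+ %1 %2) 2.0) lo hi))

(defn joint-targets-sketch
  "Double the search cube until it touches two parts, then order
   those two parts by the sum of their coordinates relative to the
   joint (the dot product with [1 1 1])."
  [joint-center parts]
  {:pre [(>= (count parts) 2)]} ; with fewer parts the loop never ends
  (loop [half-width 0.01]
    (let [hits (filter #(cube-touches-box? joint-center half-width %)
                       parts)]
      (if (>= (count hits) 2)
        (sort-by #(reduce + (map - (box-center %) joint-center))
                 (take 2 hits))
        (recur (* half-width 2))))))

(map :name (joint-targets-sketch [0 0 0] example-parts))
;; => ("lower" "upper")
    #+end_src

    Doubling the cube keeps the number of collision queries
    logarithmic in the distance to the second-nearest part.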

   882    

   883     Once =CORTEX= finds all joints and targets, it creates them using

   884     a dispatch on the metadata of each joint node.

   885 

   886     #+caption: Program to dispatch on blender metadata and create joints

   887     #+caption: suitable for physical simulation.

   888     #+name: joint-dispatch

   889     #+begin_listing clojure

   890     #+begin_src clojure

(defmulti joint-dispatch
  "Translate blender pseudo-joints into real JME joints."
  (fn [constraints & _] 
    (:type constraints)))

(defmethod joint-dispatch :point
  [constraints control-a control-b pivot-a pivot-b rotation]
  (doto (SixDofJoint. control-a control-b pivot-a pivot-b false)
    (.setLinearLowerLimit Vector3f/ZERO)
    (.setLinearUpperLimit Vector3f/ZERO)))

(defmethod joint-dispatch :hinge
  [constraints control-a control-b pivot-a pivot-b rotation]
  (let [axis (if-let [axis (:axis constraints)] axis Vector3f/UNIT_X)
        [limit-1 limit-2] (:limit constraints)
        hinge-axis (.mult rotation (blender-to-jme axis))]
    (doto (HingeJoint. control-a control-b pivot-a pivot-b 
                       hinge-axis hinge-axis)
      (.setLimit limit-1 limit-2))))

(defmethod joint-dispatch :cone
  [constraints control-a control-b pivot-a pivot-b rotation]
  (let [limit-xz (:limit-xz constraints)
        limit-xy (:limit-xy constraints)
        twist    (:twist constraints)]
    (doto (ConeJoint. control-a control-b pivot-a pivot-b
                      rotation rotation)
      (.setLimit (float limit-xz) (float limit-xy)
                 (float twist)))))

   920     #+end_src

   921     #+end_listing
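
    The dispatch pattern itself does not depend on jMonkeyEngine3.
    As a small illustration using only plain data, the hypothetical
    multimethod =rotational-freedom= below dispatches on the same
    =:type= key and reports how many rotational degrees of freedom
    each joint type leaves. The =:default= method is an addition for
    this sketch and is not part of the listing above.

    #+begin_src clojure
(defmulti rotational-freedom
  "Hypothetical example: number of rotational degrees of freedom
   left by a joint, chosen by the :type key of its constraints."
  (fn [constraints & _]
    (:type constraints)))

;; A point joint pins position but leaves all three rotations free.
(defmethod rotational-freedom :point [_] 3)

;; A hinge rotates about a single axis.
(defmethod rotational-freedom :hinge [_] 1)

;; A cone joint swings in two directions and twists, within limits.
(defmethod rotational-freedom :cone [_] 3)

;; Report unknown joint types with a readable message.
(defmethod rotational-freedom :default [constraints]
  (throw (IllegalArgumentException.
          (str "Unknown joint type: " (pr-str (:type constraints))))))

(rotational-freedom {:type :hinge :limit [0 1]})
;; => 1
    #+end_src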

   922 

   923     All that is left for joints is to combine the above pieces into

   924     something that can operate on the collection of nodes that a

   925     blender file represents.

   926 

   927     #+caption: Program to completely create a joint given information 

   928     #+caption: from a blender file.

   929     #+name: connect

   930     #+begin_listing clojure

   931    #+begin_src clojure

(defn connect
  "Create a joint between 'obj-a and 'obj-b at the location of
  'joint. The type of joint is determined by the metadata on 'joint.

   Here are some examples:
   {:type :point}
   {:type :hinge  :limit [0 (/ Math/PI 2)] :axis (Vector3f. 0 1 0)}
   (:axis defaults to (Vector3f. 1 0 0) if not provided for hinge joints)

   {:type :cone :limit-xz 0
                :limit-xy 0
                :twist 0}   (use XZY rotation mode in blender!)"
  [#^Node obj-a #^Node obj-b #^Node joint]
  (let [control-a (.getControl obj-a RigidBodyControl)
        control-b (.getControl obj-b RigidBodyControl)
        joint-center (.getWorldTranslation joint)
        joint-rotation (.toRotationMatrix (.getWorldRotation joint))
        pivot-a (world-to-local obj-a joint-center)
        pivot-b (world-to-local obj-b joint-center)]
    (if-let
        [constraints (map-vals eval (read-string (meta-data joint "joint")))]
      ;; A side-effect of creating a joint registers
      ;; it with both physics objects which in turn
      ;; will register the joint with the physics system
      ;; when the simulation is started.
        (joint-dispatch constraints
                        control-a control-b
                        pivot-a pivot-b
                        joint-rotation))))

   961     #+end_src

   962     #+end_listing
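
    The metadata handling in =connect= can be tried in isolation. In
    this sketch, =map-vals= is a stand-in written for the example
    (the definition that =CORTEX= uses is not shown here) and
    =parse-joint-metadata= is a hypothetical name. The metadata
    string is read as a Clojure map, and each value is then
    evaluated so that expressions such as =(/ Math/PI 2)= become
    numbers.

    #+begin_src clojure
(defn map-vals
  "Apply f to every value of the map m, keeping the keys."
  [f m]
  (into {} (for [[k v] m] [k (f v)])))

(defn parse-joint-metadata
  "Read a metadata string as a map and evaluate each of its values."
  [metadata-string]
  (map-vals eval (read-string metadata-string)))

(def example-constraints
  (parse-joint-metadata "{:type :hinge :limit [0 (/ Math/PI 2)]}"))

(:type example-constraints)
;; => :hinge
(:limit example-constraints)
;; a vector of 0 and half of pi, now plain numbers
    #+end_src

    Because =eval= runs whatever code the string contains, this
    approach is only appropriate for model files that you trust.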

   963 

   964     In general, whenever =CORTEX= exposes a sense (or in this case

   965     physicality), it provides a function of the type =sense!=, which

   966     takes in a collection of nodes and augments it to support that

   967     sense. The function returns any controls necessary to use that

   968     sense. In this case =body!= creates a physical body and returns no

   969     control functions.

   970 

   971     #+caption: Program to give joints to a creature.

   972     #+name: joints

   973     #+begin_listing clojure

   974     #+begin_src clojure

(defn joints!
  "Connect the solid parts of the creature with physical joints. The
   joints are taken from the \"joints\" node in the creature."
  [#^Node creature]
  (dorun
   (map
    (fn [joint]
      (let [[obj-a obj-b] (joint-targets creature joint)]
        (connect obj-a obj-b joint)))
    (joints creature))))

(defn body!
  "Endow the creature with a physical body connected with joints.  The
   particulars of the joints and the masses of each body part are
   determined in blender."
  [#^Node creature]
  (physical! creature)
  (joints! creature))

   992     #+end_src

   993     #+end_listing

   994 

   995     All of the code you have just seen amounts to only 130 lines, yet

   996     because it builds on top of Blender and jMonkeyEngine3, those few

   997     lines pack quite a punch!

   998 

   999     The hand from figure \ref{blender-hand}, which was modeled after

  1000     my own right hand, can now be given joints and simulated as a

  1001     creature.

  1002    

  1003     #+caption: With the ability to create physical creatures from blender,

  1004     #+caption: =CORTEX= gets one step closer to becoming a full creature

  1005     #+caption: simulation environment.

  1006     #+name: physical-hand

  1007     #+ATTR_LaTeX: :width 15cm

  1008     [[./images/physical-hand.png]]

  1009 

  1010 ** Sight reuses standard video game components...

  1011 

  1012    Vision is one of the most important senses for humans, so I need to

  1013    build a simulated sense of vision for my AI. I will do this with

  1014    simulated eyes. Each eye can be independently moved and should see

  1015    its own version of the world depending on where it is.

  1016 

  1017    Making these simulated eyes a reality is simple because

  1018    jMonkeyEngine already contains extensive support for multiple views

  1019    of the same 3D simulated world. jMonkeyEngine has this

  1020    support because it is necessary to create games with

  1021    split-screen views. Multiple views are also used to create

  1022    efficient pseudo-reflections by rendering the scene from a certain

  1023    perspective and then projecting it back onto a surface in the 3D

  1024    world.

  1025 

  1026    #+caption: jMonkeyEngine supports multiple views to enable 

  1027    #+caption: split-screen games, like GoldenEye, which was one of 

  1028    #+caption: the first games to use split-screen views.

  1029    #+name: goldeneye

  1030    #+ATTR_LaTeX: :width 10cm

  1031    [[./images/goldeneye-4-player.png]]

  1032 

  1033 *** A Brief Description of jMonkeyEngine's Rendering Pipeline

  1034 

  1035     jMonkeyEngine allows you to create a =ViewPort=, which represents a

  1036     view of the simulated world. You can create as many of these as you

  1037     want. Every frame, the =RenderManager= iterates through each

  1038     =ViewPort=, rendering the scene in the GPU. For each =ViewPort= there

  1039     is a =FrameBuffer= which represents the rendered image in the GPU.

  1040   

  1041     #+caption: =ViewPorts= are cameras in the world. During each frame, 

  1042     #+caption: the =RenderManager= records a snapshot of what each view 

  1043     #+caption: is currently seeing; these snapshots are =FrameBuffer= objects.

  1044     #+name: rendermanagers

  1045     #+ATTR_LaTeX: :width 10cm

  1046     [[./images/diagram_rendermanager2.png]]

  1047 

  1048     Each =ViewPort= can have any number of attached =SceneProcessor=

  1049     objects, which are called every time a new frame is rendered. A

  1050     =SceneProcessor= receives its =ViewPort's= =FrameBuffer= and can do

  1051     whatever it wants to the data.  Often this consists of invoking GPU

  1052     specific operations on the rendered image.  The =SceneProcessor= can

  1053     also copy the GPU image data to RAM and process it with the CPU.

  1054 

  1055 *** Appropriating Views for Vision

  1056 

  1057     Each eye in the simulated creature needs its own =ViewPort= so

  1058     that it can see the world from its own perspective. To this

  1059     =ViewPort=, I add a =SceneProcessor= that feeds the visual data to

  1060     any arbitrary continuation function for further processing. That

  1061     continuation function may perform both CPU and GPU operations on

  1062     the data. To make this easy for the continuation function, the

  1063     =SceneProcessor= maintains appropriately sized buffers in RAM to

  1064     hold the data. It does not do any copying from the GPU to the CPU

  1065     itself because it is a slow operation.

  1066 

  1067     #+caption: Function to make the rendered scene in jMonkeyEngine 

  1068     #+caption: available for further processing.

  1069     #+name: pipeline-1 

  1070     #+begin_listing clojure

  1071     #+begin_src clojure

(defn vision-pipeline
  "Create a SceneProcessor object which wraps a vision processing
  continuation function. The continuation is a function that takes 
  [#^Renderer r #^FrameBuffer fb #^ByteBuffer b #^BufferedImage bi],
  each of which has already been appropriately sized."
  [continuation]
  (let [byte-buffer (atom nil)
        renderer (atom nil)
        image (atom nil)]
    (proxy [SceneProcessor] []
      (initialize
        [renderManager viewPort]
        (let [cam (.getCamera viewPort)
              width (.getWidth cam)
              height (.getHeight cam)]
          (reset! renderer (.getRenderer renderManager))
          (reset! byte-buffer
                  (BufferUtils/createByteBuffer
                   (* width height 4)))
          (reset! image (BufferedImage.
                         width height
                         BufferedImage/TYPE_4BYTE_ABGR))))
      (isInitialized [] (not (nil? @byte-buffer)))
      (reshape [_ _ _])
      (preFrame [_])
      (postQueue [_])
      (postFrame
        [#^FrameBuffer fb]
        (.clear @byte-buffer)
        (continuation @renderer fb @byte-buffer @image))
      (cleanup []))))

  1103     #+end_src

  1104     #+end_listing

  1105 

  1106     The continuation function given to =vision-pipeline= above will be

  1107     given a =Renderer= and three containers for image data. The

  1108     =FrameBuffer= references the GPU image data, but the pixel data

  1109     cannot be used directly on the CPU. The =ByteBuffer= and

  1110     =BufferedImage= are initially "empty" but are sized to hold the

  1111     data in the =FrameBuffer=. I call transferring the GPU image data

  1112     to the CPU structures "mixing" the image data.
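
    The CPU half of that transfer can be sketched with the JDK
    alone. The function below assumes that the pixel bytes have
    already been read from the GPU into the =ByteBuffer=, that they
    are laid out as four bytes per pixel in blue, green, red, alpha
    order, and that rows need no vertical flip. These are
    assumptions made for illustration, and =bgra-buffer->image!= is
    a hypothetical name rather than a function from =CORTEX= or
    jMonkeyEngine.

    #+begin_src clojure
(import '(java.awt.image BufferedImage)
        '(java.nio ByteBuffer))

(defn bgra-buffer->image!
  "Copy width*height pixels from byte-buffer into image and return
   the image. Assumes BGRA byte order (an assumption of this sketch)."
  [^ByteBuffer byte-buffer ^BufferedImage image]
  (let [width  (.getWidth image)
        height (.getHeight image)]
    (doseq [y (range height)
            x (range width)]
      (let [i (* 4 (+ x (* y width)))
            ;; Bytes are signed in Java, so mask each one to 0..255.
            channel (fn [offset]
                      (bit-and 0xFF
                               (long (.get byte-buffer (int (+ i offset))))))
            blue  (channel 0)
            green (channel 1)
            red   (channel 2)
            alpha (channel 3)
            ;; setRGB expects a packed ARGB int.
            argb  (unchecked-int
                   (bit-or (bit-shift-left alpha 24)
                           (bit-shift-left red 16)
                           (bit-shift-left green 8)
                           blue))]
        (.setRGB image (int x) (int y) argb)))
    image))
    #+end_src

    A continuation passed to =vision-pipeline= could call a function
    of this kind once the byte buffer has been filled.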

  1113 

  1114 *** Optical sensor arrays are described with images and referenced with metadata

  1115 

  1116     The vision pipeline described above handles the flow of rendered

  1117     images. Now, =CORTEX= needs simulated eyes to serve as the source

  1118     of these images.

  1119 

  1120     Eyes are described in blender in the same way as joints. They

  1121     are zero-dimensional empty objects with no geometry whose local

  1122     coordinate system determines the orientation of the resulting eye.

  1123     All eyes are children of a parent node named "eyes" just as all

  1124     joints have a parent named "joints". An eye binds to the nearest

  1125     physical object with =bind-sense=.

  1126 

  1127     #+caption: Here, the camera is created based on metadata on the

  1128     #+caption: eye-node and attached to the nearest physical object 

  1129     #+caption: with =bind-sense=

  1130     #+name: add-eye

  1131     #+begin_listing clojure

    #+begin_src clojure
(defn add-eye!
  "Create a Camera centered on the current position of 'eye which
   follows the closest physical node in 'creature. The camera will
   point in the X direction and use the Z vector as up as determined
   by the rotation of these vectors in blender coordinate space. Use
   XZY rotation for the node in blender."
  [#^Node creature #^Spatial eye]
  (let [target (closest-node creature eye)
        [cam-width cam-height] 
        ;;[640 480] ;; graphics card on laptop doesn't support
                    ;; arbitrary dimensions.
        (eye-dimensions eye)
        cam (Camera. cam-width cam-height)
        rot (.getWorldRotation eye)]
    (.setLocation cam (.getWorldTranslation eye))
    (.lookAtDirection
     cam                           ; this part is not a mistake and
     (.mult rot Vector3f/UNIT_X)   ; is consistent with using Z in
     (.mult rot Vector3f/UNIT_Y))  ; blender as the UP vector.
    (.setFrustumPerspective
     cam (float 45)
     (float (/ (.getWidth cam) (.getHeight cam)))
     (float 1)
     (float 1000))
    (bind-sense target cam) cam))
    #+end_src

  1157     #+end_listing

  1158 

  1159 *** Simulated Retina 

  1160 

  1161     An eye is a surface (the retina) which contains many discrete

  1162     sensors to detect light. These sensors can have different

  1163     light-sensing properties. In humans, each discrete sensor is

  1164     sensitive to red, blue, green, or gray. These different types of

  1165     sensors can have different spatial distributions along the retina.

  1166     In humans, there is a fovea in the center of the retina which has

  1167     a very high density of color sensors, and a blind spot which has

  1168     no sensors at all. Sensor density decreases in proportion to

  1169     distance from the fovea.

  1170 

    I want to be able to model any retinal configuration, so my
    eye-nodes in blender contain metadata pointing to images that
    describe the precise position of the individual sensors using
    white pixels. The metadata also specifies how sensitive to light
    the sensors described in each image are. An eye can contain any
    number of these images. For example, the metadata for an eye
    might look like this:

  1178 

  1179     #+begin_src clojure

  1180 {0xFF0000 "Models/test-creature/retina-small.png"}

  1181     #+end_src

  1182 

  1183     #+caption: An example retinal profile image. White pixels are 

  1184     #+caption: photo-sensitive elements. The distribution of white 

  1185     #+caption: pixels is denser in the middle and falls off at the 

  1186     #+caption: edges and is inspired by the human retina.

  1187     #+name: retina

  1188     #+ATTR_LaTeX: :width 7cm

  1189     [[./images/retina-small.png]]

  1190 

    Together, the number 0xFF0000 and the image above describe the
    placement of red-sensitive sensory elements.

  1193 
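
    To make the role of the white pixels concrete, here is a minimal
    sketch of how sensor positions could be read out of such a profile
    image. The name =white-pixel-coordinates= and the tiny in-memory
    test image are assumptions made for illustration and are not taken
    from =CORTEX=; only the standard =java.awt.image.BufferedImage=
    class is used.

    #+BEGIN_SRC clojure
(import 'java.awt.image.BufferedImage)

(defn white-pixel-coordinates
  "Hypothetical helper: return [x y] for every pure-white pixel in
   image, scanning row by row."
  [^BufferedImage image]
  (vec
   (for [y (range (.getHeight image))
         x (range (.getWidth image))
         ;; mask off the alpha byte and compare the RGB part to white
         :when (= 0xFFFFFF (bit-and 0xFFFFFF (.getRGB image x y)))]
     [x y])))

;; A 3x2 image that starts out black, with two sensors painted in.
(def tiny-profile
  (doto (BufferedImage. 3 2 BufferedImage/TYPE_INT_RGB)
    (.setRGB 0 0 0xFFFFFF)
    (.setRGB 2 1 0xFFFFFF)))

(white-pixel-coordinates tiny-profile)
    #+END_SRC

    Each coordinate pair returned this way would correspond to one
    sensor on the simulated retina.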

  1194     Meta-data to very crudely approximate a human eye might be

  1195     something like this:

  1196 

  1197     #+begin_src clojure

  1198 (let [retinal-profile "Models/test-creature/retina-small.png"]

  1199   {0xFF0000 retinal-profile

  1200    0x00FF00 retinal-profile

  1201    0x0000FF retinal-profile

  1202    0xFFFFFF retinal-profile})

  1203     #+end_src

  1204 

  1205     The numbers that serve as keys in the map determine a sensor's

  1206     relative sensitivity to the channels red, green, and blue. These

  1207     sensitivity values are packed into an integer in the order

  1208     =|_|R|G|B|= in 8-bit fields. The RGB values of a pixel in the

  1209     image are added together with these sensitivities as linear

  1210     weights. Therefore, 0xFF0000 means sensitive to red only while

  1211     0xFFFFFF means sensitive to all colors equally (gray).

  1212 
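
    The weighting can be sketched in a few lines. The function names
    below are hypothetical, and dividing by the largest possible sum
    (so that the response lies between 0 and 1) is an assumption made
    for this example rather than something stated above.

    #+BEGIN_SRC clojure
(defn channels
  "Split a packed 0xRRGGBB integer into [r g b], each in 0..255."
  [packed]
  [(bit-and 0xFF (bit-shift-right packed 16))
   (bit-and 0xFF (bit-shift-right packed 8))
   (bit-and 0xFF packed)])

(defn weighted-response
  "Hypothetical sketch: add the pixel's channels together using the
   sensor's channel sensitivities as linear weights, then divide by
   the largest possible sum (an assumed normalization)."
  [sensitivity pixel]
  (let [weights (channels sensitivity)
        values  (channels pixel)
        total   (reduce + (map * weights values))
        maximum (* 255 (reduce + weights))]
    (if (zero? maximum)
      0.0
      (double (/ total maximum)))))

(weighted-response 0xFF0000 0xFF0000) ; pure red seen by a red sensor
(weighted-response 0xFF0000 0x00FF00) ; pure green seen by a red sensor
(weighted-response 0xFFFFFF 0xFF0000) ; pure red seen by a gray sensor
    #+END_SRC

    Under these assumptions a red-only sensor responds fully to pure
    red and not at all to pure green, while a gray sensor gives pure
    red one third of its full response.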

  1213     #+caption: This is the core of vision in =CORTEX=. A given eye node 

  1214     #+caption: is converted into a function that returns visual

  1215     #+caption: information from the simulation.

  1216     #+name: vision-kernel

  1217     #+begin_listing clojure

  1218     #+BEGIN_SRC clojure

  1219 (defn vision-kernel

  1220   "Returns a list of functions, each of which will return a color

  1221    channel's worth of visual information when called inside a running

  1222    simulation."

  1223   [#^Node creature #^Spatial eye & {skip :skip :or {skip 0}}]

  1224   (let [retinal-map (retina-sensor-profile eye)

  1225         camera (add-eye! creature eye)

  1226         vision-image

  1227         (atom

  1228          (BufferedImage. (.getWidth camera)

  1229                          (.getHeight camera)

  1230                          BufferedImage/TYPE_BYTE_BINARY))

  1231         register-eye!

  1232         (runonce

  1233          (fn [world]

  1234            (add-camera!

  1235             world camera

  1236             (let [counter  (atom 0)]

  1237               (fn [r fb bb bi]

  1238                 (if (zero? (rem (swap! counter inc) (inc skip)))

  1239                   (reset! vision-image

  1240                           (BufferedImage! r fb bb bi))))))))]

  1241      (vec

  1242       (map

  1243        (fn [[key image]]

  1244          (let [whites (white-coordinates image)

  1245                topology (vec (collapse whites))

  1246                sensitivity (sensitivity-presets key key)]

  1247            (attached-viewport.

  1248             (fn [world]

  1249               (register-eye! world)

  1250               (vector

  1251                topology

  1252                (vec 

  1253                 (for [[x y] whites]

  1254                   (pixel-sense 

  1255                    sensitivity

  1256                    (.getRGB @vision-image x y))))))

  1257             register-eye!)))

  1258          retinal-map))))

  1259     #+END_SRC

  1260     #+end_listing

  1261 

    Note that since each of the functions generated by =vision-kernel=
    shares the same =register-eye!= function, the eye will be
    registered only once, the first time any of the functions from the
    list returned by =vision-kernel= is called. Each of the functions
    returned by =vision-kernel= also allows access to the =Viewport=
    through which it receives images.

  1268 
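
    This sharing depends on a wrapper that lets a function body run a
    single time no matter how many callers reach it. The sketch below
    shows one way such a wrapper could be written; =run-once= is a
    hypothetical stand-in and is not necessarily how =CORTEX= defines
    =runonce=.

    #+BEGIN_SRC clojure
(defn run-once
  "Hypothetical stand-in: returns a function that calls f on its first
   invocation, caches the result, and returns the cached result on
   every later call without calling f again."
  [f]
  (let [lock   (Object.)
        result (atom ::not-run)]
    (fn [& args]
      (locking lock
        (when (= ::not-run @result)
          (reset! result (apply f args)))
        @result))))

;; Count how many times registration really happens.
(def registrations (atom 0))

(def register!
  (run-once (fn [world] (swap! registrations inc) world)))

;; Two channel functions sharing the same wrapper.
(defn red-channel   [world] (register! world) :red-data)
(defn green-channel [world] (register! world) :green-data)

(red-channel :world)
(green-channel :world)
(red-channel :world)
@registrations
    #+END_SRC

    However many channel functions are called, and in whatever order,
    the registration body runs exactly one time.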

  1269     All the hard work has been done; all that remains is to apply

  1270     =vision-kernel= to each eye in the creature and gather the results

  1271     into one list of functions.

  1272 

  1273 

  1274     #+caption: With =vision!=, =CORTEX= is already a fine simulation 

  1275     #+caption: environment for experimenting with different types of 

  1276     #+caption: eyes.

  1277     #+name: vision!

  1278     #+begin_listing clojure

  1279     #+BEGIN_SRC clojure

  1280 (defn vision!

  1281   "Returns a list of functions, each of which returns visual sensory

  1282    data when called inside a running simulation."

  1283   [#^Node creature & {skip :skip :or {skip 0}}]

  1284   (reduce

  1285    concat 

  1286    (for [eye (eyes creature)]

     (vision-kernel creature eye :skip skip))))

  1288     #+END_SRC

  1289     #+end_listing

  1290 

  1291     #+caption: Simulated vision with a test creature and the 

  1292     #+caption: human-like eye approximation. Notice how each channel

  1293     #+caption: of the eye responds differently to the differently 

  1294     #+caption: colored balls.

  1295     #+name: worm-vision-test.

  1296     #+ATTR_LaTeX: :width 13cm

  1297     [[./images/worm-vision.png]]

  1298 

    The vision code is not much more complicated than the body code,
    and enables multiple further paths for simulated vision. For
    example, it is quite easy to create binocular vision -- you just
    make two eyes next to each other in blender! It is also possible
    to encode vision transforms in the retinal files. For example, the
    human-like retina file in figure \ref{retina} approximates a
    log-polar transform.

  1306 
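
    As an illustration of what a log-polar layout means, the following
    sketch places sensors on concentric rings whose radii grow
    geometrically. It is not how the retina image in figure
    \ref{retina} was produced; the function name, its parameters, and
    the numbers used are all assumptions chosen for the example.

    #+BEGIN_SRC clojure
(defn log-polar-sensors
  "Hypothetical sketch: place `spokes` sensors on each of `rings`
   concentric rings around [cx cy]. Ring radii start at r0 and grow
   geometrically by `growth`, so sensors crowd together near the
   center and thin out toward the edge. Returns distinct [x y]
   coordinates rounded to whole pixels."
  [cx cy r0 growth rings spokes]
  (vec
   (distinct
    (for [ring  (range rings)
          spoke (range spokes)]
      (let [r     (* r0 (Math/pow growth ring))
            theta (/ (* 2.0 Math/PI spoke) spokes)]
        [(Math/round (double (+ cx (* r (Math/cos theta)))))
         (Math/round (double (+ cy (* r (Math/sin theta)))))])))))

;; Sensor positions for a 100x100 profile image centered at [50 50].
(def sensors (log-polar-sensors 50 50 2.0 1.5 8 16))
    #+END_SRC

    Painting these coordinates white on a black image would give a
    profile with the same qualitative shape: dense in the middle and
    sparse at the edges. Rounding to whole pixels merges some of the
    innermost positions, which is why duplicates are removed.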

  1307     This vision code has already been absorbed by the jMonkeyEngine

  1308     community and is now (in modified form) part of a system for

  1309     capturing in-game video to a file.

  1310 

  1311 ** ...but hearing must be built from scratch

  1312 

  1313    At the end of this section I will have simulated ears that work the

  1314    same way as the simulated eyes in the last section. I will be able to

  1315    place any number of ear-nodes in a blender file, and they will bind to

  1316    the closest physical object and follow it as it moves around. Each ear

  1317    will provide access to the sound data it picks up between every frame.

  1318 

  1319    Hearing is one of the more difficult senses to simulate, because there

  1320    is less support for obtaining the actual sound data that is processed

  1321    by jMonkeyEngine3. There is no "split-screen" support for rendering

  1322    sound from different points of view, and there is no way to directly

  1323    access the rendered sound data.

  1324 

   =CORTEX='s hearing is unusual among simulation environments
   because it supports any number of listeners. As far as I know, no
   other system supports multiple listeners, and the sound demo at
   the end of this section is the first time this has been done in a
   video game environment.

  1330 

  1331 *** Brief Description of jMonkeyEngine's Sound System

  1332 

  1333    jMonkeyEngine's sound system works as follows:

  1334 

  1335    - jMonkeyEngine uses the =AppSettings= for the particular

  1336      application to determine what sort of =AudioRenderer= should be

  1337      used.

  1338    - Although some support is provided for multiple AudioRendering

  1339      backends, jMonkeyEngine at the time of this writing will either

  1340      pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.

  1341    - jMonkeyEngine tries to figure out what sort of system you're

  1342      running and extracts the appropriate native libraries.

   - The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game
     Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]].

  1345    - =OpenAL= renders the 3D sound and feeds the rendered sound

  1346      directly to any of various sound output devices with which it

  1347      knows how to communicate.

  1348   

  1349    A consequence of this is that there's no way to access the actual

  1350    sound data produced by =OpenAL=. Even worse, =OpenAL= only supports

  1351    one /listener/ (it renders sound data from only one perspective),

  1352    which normally isn't a problem for games, but becomes a problem

  1353    when trying to make multiple AI creatures that can each hear the

  1354    world from a different perspective.

  1355 

  1356    To make many AI creatures in jMonkeyEngine that can each hear the

  1357    world from their own perspective, or to make a single creature with

  1358    many ears, it is necessary to go all the way back to =OpenAL= and

  1359    implement support for simulated hearing there.

  1360 

*** Extending =OpenAL=

  1362 

    Extending =OpenAL= to support multiple listeners requires 500
    lines of =C= code and is too hairy to show in full here. Instead,
    I will show a small amount of extension code and go over the
    high-level strategy. Full source is of course available with the
    =CORTEX= distribution if you're interested.

  1368 

  1369     =OpenAL= goes to great lengths to support many different systems,

  1370     all with different sound capabilities and interfaces. It

  1371     accomplishes this difficult task by providing code for many

  1372     different sound backends in pseudo-objects called /Devices/.

  1373     There's a device for the Linux Open Sound System and the Advanced

  1374     Linux Sound Architecture, there's one for Direct Sound on Windows,

  1375     and there's even one for Solaris. =OpenAL= solves the problem of

  1376     platform independence by providing all these Devices.

  1377 

  1378     Wrapper libraries such as LWJGL are free to examine the system on

  1379     which they are running and then select an appropriate device for

  1380     that system.

  1381 

  1382     There are also a few "special" devices that don't interface with

  1383     any particular system. These include the Null Device, which

  1384     doesn't do anything, and the Wave Device, which writes whatever

  1385     sound it receives to a file, if everything has been set up

  1386     correctly when configuring =OpenAL=.

  1387 

    Actual mixing (Doppler shift and distance/environment-based
    attenuation) of the sound data happens in the Devices, and they
    are the only point in the sound rendering process where this data
    is available.

  1392 

  1393     Therefore, in order to support multiple listeners, and get the

  1394     sound data in a form that the AIs can use, it is necessary to

  1395     create a new Device which supports this feature.

  1396 

  1397     Adding a device to OpenAL is rather tricky -- there are five

  1398     separate files in the =OpenAL= source tree that must be modified

  1399     to do so. I named my device the "Multiple Audio Send" Device, or

  1400     =Send= Device for short, since it sends audio data back to the

  1401     calling application like an Aux-Send cable on a mixing board.

  1402 

  1403     The main idea behind the Send device is to take advantage of the

  1404     fact that LWJGL only manages one /context/ when using OpenAL. A

  1405     /context/ is like a container that holds samples and keeps track

  1406     of where the listener is. In order to support multiple listeners,

  1407     the Send device identifies the LWJGL context as the master

  1408     context, and creates any number of slave contexts to represent

  1409     additional listeners. Every time the device renders sound, it

  1410     synchronizes every source from the master LWJGL context to the

  1411     slave contexts. Then, it renders each context separately, using a

  1412     different listener for each one. The rendered sound is made

  1413     available via JNI to jMonkeyEngine.

  1414 
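
    The strategy can be modeled in a few lines of Clojure. This is
    only a toy: plain maps stand in for =OpenAL= contexts, the names
    =sync-sources=, =render-context=, and =render-step= are
    hypothetical, and a made-up $1/(1+d)$ loudness falloff stands in
    for the real mixing, which the =Send= device performs in =C=.

    #+BEGIN_SRC clojure
(defn distance
  "Euclidean distance between two [x y z] points."
  [[x1 y1 z1] [x2 y2 z2]]
  (Math/sqrt (+ (Math/pow (- x1 x2) 2)
                (Math/pow (- y1 y2) 2)
                (Math/pow (- z1 z2) 2))))

(defn sync-sources
  "Copy every source from the master context into a slave context,
   leaving the slave's own listener untouched."
  [master slave]
  (assoc slave :sources (:sources master)))

(defn render-context
  "Toy renderer: total loudness heard at this context's listener,
   with each source attenuated by 1/(1 + distance)."
  [{:keys [listener sources]}]
  (reduce + (for [{:keys [position gain]} sources]
              (/ gain (+ 1.0 (distance listener position))))))

(defn render-step
  "One step of the modeled device: synchronize, then render each
   context separately using its own listener."
  [master slaves]
  (mapv render-context
        (cons master (map (partial sync-sources master) slaves))))

(def master {:listener [0 0 0]
             :sources  [{:position [0 0 0] :gain 1.0}]})
(def slaves [{:listener [3 4 0] :sources []}])

(render-step master slaves)
    #+END_SRC

    The result has one entry per listener. The listener sitting on the
    source hears it at full loudness, while the listener five units
    away hears the same source more quietly.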

    Switching between contexts is not the normal operation of a
    Device, and one of the problems with doing so is that a Device
    normally keeps around a few pieces of state, such as the
    =ClickRemoval= array, which will become corrupted if several
    contexts take turns rendering through the same copy of that
    state. The solution is to create a copy of this normally global
    device state for each context, and copy it back and forth into
    and out of the actual device state whenever a context is rendered.

  1423 
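
    The copy-in, render, copy-out pattern can be sketched with two
    atoms. Everything here is a hypothetical model: =device= stands
    for the single shared state slot, =saved= holds one private copy
    per context, and =fake-render= merely increments a counter where
    the real device would update its click-removal data.

    #+BEGIN_SRC clojure
(defn render-with-private-state
  "Hypothetical model: load the context's private copy into the shared
   device state, run render (a function from state to state), then
   store the updated state back under the same context."
  [device saved ctx-id render]
  (reset! device (get @saved ctx-id))   ; copy the context's state in
  (swap! device render)                 ; rendering changes device state
  (swap! saved assoc ctx-id @device)    ; copy the updated state out
  @device)

(def device (atom nil))
(def saved  (atom {:master {:click-removal 0}
                   :slave  {:click-removal 100}}))

(defn fake-render [state]
  (update-in state [:click-removal] inc))

;; Interleave rendering of the two contexts.
(doseq [ctx [:master :slave :master :slave]]
  (render-with-private-state device saved ctx fake-render))

@saved
    #+END_SRC

    Even though the two contexts take turns on one device, each
    context's state advances independently of the other.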

  1424     The core of the =Send= device is the =syncSources= function, which

  1425     does the job of copying all relevant data from one context to

  1426     another. 

  1427 

  1428     #+caption: Program for extending =OpenAL= to support multiple

  1429     #+caption: listeners via context copying/switching.

  1430     #+name: sync-openal-sources

  1431     #+begin_listing c

  1432     #+BEGIN_SRC c

  1433 void syncSources(ALsource *masterSource, ALsource *slaveSource, 

  1434 		 ALCcontext *masterCtx, ALCcontext *slaveCtx){

  1435   ALuint master = masterSource->source;

  1436   ALuint slave = slaveSource->source;

  1437   ALCcontext *current = alcGetCurrentContext();

  1438 

  1439   syncSourcef(master,slave,masterCtx,slaveCtx,AL_PITCH);

  1440   syncSourcef(master,slave,masterCtx,slaveCtx,AL_GAIN);

  1441   syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_DISTANCE);

  1442   syncSourcef(master,slave,masterCtx,slaveCtx,AL_ROLLOFF_FACTOR);

  1443   syncSourcef(master,slave,masterCtx,slaveCtx,AL_REFERENCE_DISTANCE);

  1444   syncSourcef(master,slave,masterCtx,slaveCtx,AL_MIN_GAIN);

  1445   syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_GAIN);

  1446   syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_GAIN);

  1447   syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_INNER_ANGLE);

  1448   syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_ANGLE);

  1449   syncSourcef(master,slave,masterCtx,slaveCtx,AL_SEC_OFFSET);

  1450   syncSourcef(master,slave,masterCtx,slaveCtx,AL_SAMPLE_OFFSET);

  1451   syncSourcef(master,slave,masterCtx,slaveCtx,AL_BYTE_OFFSET);

  1452     

  1453   syncSource3f(master,slave,masterCtx,slaveCtx,AL_POSITION);

  1454   syncSource3f(master,slave,masterCtx,slaveCtx,AL_VELOCITY);

  1455   syncSource3f(master,slave,masterCtx,slaveCtx,AL_DIRECTION);

  1456   

  1457   syncSourcei(master,slave,masterCtx,slaveCtx,AL_SOURCE_RELATIVE);

  1458   syncSourcei(master,slave,masterCtx,slaveCtx,AL_LOOPING);

  1459 

  1460   alcMakeContextCurrent(masterCtx);

  1461   ALint source_type;

  1462   alGetSourcei(master, AL_SOURCE_TYPE, &source_type);

  1463 

  1464   // Only static sources are currently synchronized! 

  1465   if (AL_STATIC == source_type){

  1466     ALint master_buffer;

  1467     ALint slave_buffer;

  1468     alGetSourcei(master, AL_BUFFER, &master_buffer);

  1469     alcMakeContextCurrent(slaveCtx);

  1470     alGetSourcei(slave, AL_BUFFER, &slave_buffer);

  1471     if (master_buffer != slave_buffer){

  1472       alSourcei(slave, AL_BUFFER, master_buffer);

  1473     }

  1474   }

  1475   

  1476   // Synchronize the state of the two sources.

  1477   alcMakeContextCurrent(masterCtx);

  1478   ALint masterState;

  1479   ALint slaveState;

  1480 

  1481   alGetSourcei(master, AL_SOURCE_STATE, &masterState);

  1482   alcMakeContextCurrent(slaveCtx);

  1483   alGetSourcei(slave, AL_SOURCE_STATE, &slaveState);

  1484 

  1485   if (masterState != slaveState){

  1486     switch (masterState){

  1487     case AL_INITIAL : alSourceRewind(slave); break;

  1488     case AL_PLAYING : alSourcePlay(slave);   break;

  1489     case AL_PAUSED  : alSourcePause(slave);  break;

  1490     case AL_STOPPED : alSourceStop(slave);   break;

  1491     }

  1492   }

  1493   // Restore whatever context was previously active.

  1494   alcMakeContextCurrent(current);

  1495 }

  1496     #+END_SRC

  1497     #+end_listing

  1498 

  1499     With this special context-switching device, and some ugly JNI

  1500     bindings that are not worth mentioning, =CORTEX= gains the ability

  1501     to access multiple sound streams from =OpenAL=. 

  1502 

  1503     #+caption: Program to create an ear from a blender empty node. The ear

  1504     #+caption: follows around the nearest physical object and passes 

  1505     #+caption: all sensory data to a continuation function.

  1506     #+name: add-ear

  1507     #+begin_listing clojure

  1508     #+BEGIN_SRC clojure

  1509 (defn add-ear!  

  1510   "Create a Listener centered on the current position of 'ear 

  1511    which follows the closest physical node in 'creature and 

  1512    sends sound data to 'continuation."

  1513   [#^Application world #^Node creature #^Spatial ear continuation]

  1514   (let [target (closest-node creature ear)

  1515         lis (Listener.)

  1516         audio-renderer (.getAudioRenderer world)

  1517         sp (hearing-pipeline continuation)]

  1518     (.setLocation lis (.getWorldTranslation ear))

  1519     (.setRotation lis (.getWorldRotation ear))

  1520     (bind-sense target lis)

  1521     (update-listener-velocity! target lis)

  1522     (.addListener audio-renderer lis)

  1523     (.registerSoundProcessor audio-renderer lis sp)))

  1524     #+END_SRC

  1525     #+end_listing

  1526     

  1527     The =Send= device, unlike most of the other devices in =OpenAL=,

  1528     does not render sound unless asked. This enables the system to

  1529     slow down or speed up depending on the needs of the AIs who are

  1530     using it to listen. If the device tried to render samples in

  1531     real-time, a complicated AI whose mind takes 100 seconds of

  1532     computer time to simulate 1 second of AI-time would miss almost

  1533     all of the sound in its environment!

  1534 

  1535     #+caption: Program to enable arbitrary hearing in =CORTEX=

  1536     #+name: hearing

  1537     #+begin_listing clojure

  1538 #+BEGIN_SRC clojure

  1539 (defn hearing-kernel

  1540   "Returns a function which returns auditory sensory data when called

  1541    inside a running simulation."

  1542   [#^Node creature #^Spatial ear]

  1543   (let [hearing-data (atom [])

  1544         register-listener!

  1545         (runonce 

  1546          (fn [#^Application world]

  1547            (add-ear!

  1548             world creature ear

  1549             (comp #(reset! hearing-data %)

  1550                   byteBuffer->pulse-vector))))]

  1551     (fn [#^Application world]

  1552       (register-listener! world)

  1553       (let [data @hearing-data

  1554             topology              

  1555             (vec (map #(vector % 0) (range 0 (count data))))]

  1556         [topology data]))))

  1557     

  1558 (defn hearing!

  1559   "Endow the creature in a particular world with the sense of

  1560    hearing. Will return a sequence of functions, one for each ear,

  1561    which when called will return the auditory data from that ear."

  1562   [#^Node creature]

  1563   (for [ear (ears creature)]

  1564     (hearing-kernel creature ear)))

  1565     #+END_SRC

  1566     #+end_listing

  1567 

  1568     Armed with these functions, =CORTEX= is able to test possibly the

  1569     first ever instance of multiple listeners in a video game engine

  1570     based simulation!

  1571 

  1572     #+caption: Here a simple creature responds to sound by changing

  1573     #+caption: its color from gray to green when the total volume

  1574     #+caption: goes over a threshold.

  1575     #+name: sound-test

  1576     #+begin_listing java

  1577     #+BEGIN_SRC java

  1578 /**

  1579  * Respond to sound!  This is the brain of an AI entity that 

  1580  * hears its surroundings and reacts to them.

  1581  */

  1582 public void process(ByteBuffer audioSamples, 

  1583 		    int numSamples, AudioFormat format) {

  1584     audioSamples.clear();

  1585     byte[] data = new byte[numSamples];

  1586     float[] out = new float[numSamples];

  1587     audioSamples.get(data);

  1588     FloatSampleTools.

  1589 	byte2floatInterleaved

  1590 	(data, 0, out, 0, numSamples/format.getFrameSize(), format);

  1591 

  1592     float max = Float.NEGATIVE_INFINITY;

  1593     for (float f : out){if (f > max) max = f;}

  1594     audioSamples.clear();

  1595 

  1596     if (max > 0.1){

  1597 	entity.getMaterial().setColor("Color", ColorRGBA.Green);

  1598     }

  1599     else {

  1600 	entity.getMaterial().setColor("Color", ColorRGBA.Gray);

    }
}

  1602     #+END_SRC

  1603     #+end_listing

  1604 

  1605     #+caption: First ever simulation of multiple listeners in =CORTEX=.

  1606     #+caption: Each cube is a creature which processes sound data with

  1607     #+caption: the =process= function from listing \ref{sound-test}. 

    #+caption: The ball is constantly emitting a pure tone of

  1609     #+caption: constant volume. As it approaches the cubes, they each

  1610     #+caption: change color in response to the sound.

  1611     #+name: sound-cubes.

  1612     #+ATTR_LaTeX: :width 10cm

  1613     [[./images/java-hearing-test.png]]

  1614 

  1615     This system of hearing has also been co-opted by the

  1616     jMonkeyEngine3 community and is used to record audio for demo

  1617     videos.

  1618 

  1619 ** Hundreds of hair-like elements provide a sense of touch

  1620 

  1621    Touch is critical to navigation and spatial reasoning and as such I

  1622    need a simulated version of it to give to my AI creatures.

  1623    

   Human skin has a wide array of touch sensors, each of which
   specializes in detecting different vibrational modes and pressures.
   These sensors can integrate a vast expanse of skin (e.g. your
   entire palm), or a tiny patch of skin at the tip of your finger.

  1628    The hairs of the skin help detect objects before they even come

  1629    into contact with the skin proper.

  1630    

   However, touch in my simulated world cannot exactly correspond to

  1632    human touch because my creatures are made out of completely rigid

  1633    segments that don't deform like human skin.

  1634    

  1635    Instead of measuring deformation or vibration, I surround each

  1636    rigid part with a plenitude of hair-like objects (/feelers/) which

  1637    do not interact with the physical world. Physical objects can pass

  1638    through them with no effect. The feelers are able to tell when

  1639    other objects pass through them, and they constantly report how

  1640    much of their extent is covered. So even though the creature's body

  1641    parts do not deform, the feelers create a margin around those body

  1642    parts which achieves a sense of touch which is a hybrid between a

  1643    human's sense of deformation and sense from hairs.

  1644    
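
   The idea of a feeler reporting how much of its extent is covered
   can be made concrete with a small sketch. The function
   =feeler-coverage= and its arguments are hypothetical; this is not
   the =CORTEX= touch code, which relies on jMonkeyEngine collision
   detection to find the intersecting object.

   #+BEGIN_SRC clojure
(defn feeler-coverage
  "Hypothetical sketch: a feeler of the given length starts at the
   skin. hit-distance is the distance from the skin to the nearest
   object along the feeler, or nil when nothing intersects it.
   Returns the fraction of the feeler that is covered, 0.0 to 1.0."
  [length hit-distance]
  (if (or (nil? hit-distance) (>= hit-distance length))
    0.0
    (double (/ (- length (max 0.0 hit-distance)) length))))

(feeler-coverage 0.1 nil)   ; nothing nearby
(feeler-coverage 0.1 0.05)  ; object halfway along the feeler
(feeler-coverage 0.1 0.0)   ; object resting on the skin
   #+END_SRC

   A feeler that nothing touches reports zero, and the report rises
   toward one as an object pushes further in toward the skin.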

  1645    Implementing touch in jMonkeyEngine follows a different technical

  1646    route than vision and hearing. Those two senses piggybacked off

  1647    jMonkeyEngine's 3D audio and video rendering subsystems. To

  1648    simulate touch, I use jMonkeyEngine's physics system to execute

  1649    many small collision detections, one for each feeler. The placement

  1650    of the feelers is determined by a UV-mapped image which shows where

  1651    each feeler should be on the 3D surface of the body.

  1652 

  1653 *** Defining Touch Meta-Data in Blender

  1654 

  1655     Each geometry can have a single UV map which describes the

  1656     position of the feelers which will constitute its sense of touch.

    This image path is stored under the "touch" key. The image itself

  1658     is black and white, with black meaning a feeler length of 0 (no

  1659     feeler is present) and white meaning a feeler length of =scale=,

  1660     which is a float stored under the key "scale".

  1661 

    #+caption: Touch does not use empty nodes to store metadata,

  1663     #+caption: because the metadata of each solid part of a 

  1664     #+caption: creature's body is sufficient.

  1665     #+name: touch-meta-data

  1666     #+begin_listing clojure

  1667     #+BEGIN_SRC  clojure

  1668 (defn tactile-sensor-profile

  1669   "Return the touch-sensor distribution image in BufferedImage format,

  1670    or nil if it does not exist."

  1671   [#^Geometry obj]

  1672   (if-let [image-path (meta-data obj "touch")]

  1673     (load-image image-path)))

  1674 

  1675 (defn tactile-scale

  "Return the length of each feeler. Default scale is 0.1
  jMonkeyEngine units."

  1678   [#^Geometry obj]

  1679   (if-let [scale (meta-data obj "scale")]

  1680     scale 0.1))

  1681     #+END_SRC

  1682     #+end_listing

  1683 

  1684     Here is an example of a UV-map which specifies the position of

  1685     touch sensors along the surface of the upper segment of a fingertip.

  1686 

  1687     #+caption: This is the tactile-sensor-profile for the upper segment 

  1688     #+caption: of a fingertip. It defines regions of high touch sensitivity 

  1689     #+caption: (where there are many white pixels) and regions of low 

  1690     #+caption: sensitivity (where white pixels are sparse).

  1691     #+name: fingertip-UV

  1692     #+ATTR_LaTeX: :width 13cm

  1693     [[./images/finger-UV.png]]

  1694 

  1695 *** Implementation Summary

  1696   

    To simulate touch there are three conceptual steps. For each solid
    object in the creature, you first have to get the UV image and
    scale parameter which define the position and length of the
    feelers. Then, you use the triangles which comprise the mesh and
    the UV data stored in the mesh to determine the world-space
    position and orientation of each feeler. Finally, once every
    frame, you update these positions and orientations to match the
    current position and orientation of the object, and use physics
    collision detection to gather tactile data.

  1706     

  1707     Extracting the meta-data has already been described. The third

  1708     step, physics collision detection, is handled in =touch-kernel=.

  1709     Translating the positions and orientations of the feelers from the

  1710     UV-map to world-space is itself a three-step process.

  1711 

    - Find the triangles which make up the mesh in pixel-space and in
      world-space. (=triangles=, =pixel-triangles=).

  1714 

  1715     - Find the coordinates of each feeler in world-space. These are

  1716       the origins of the feelers. (=feeler-origins=).

  1717     

  1718     - Calculate the normals of the triangles in world space, and add

  1719       them to each of the origins of the feelers. These are the

  1720       normalized coordinates of the tips of the feelers.

  1721       (=feeler-tips=).

  1722 
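
    The last of these steps can be sketched with plain Clojure vectors
    in place of jMonkeyEngine's vector classes. All of the names below
    are hypothetical, and scaling the normal by the feeler length at
    this stage is an assumption made to keep the example short.

    #+BEGIN_SRC clojure
(defn v+ [a b] (mapv + a b))
(defn v- [a b] (mapv - a b))
(defn v* [s v] (mapv (partial * s) v))

(defn cross [[ax ay az] [bx by bz]]
  [(- (* ay bz) (* az by))
   (- (* az bx) (* ax bz))
   (- (* ax by) (* ay bx))])

(defn unit
  "Scale v to length 1."
  [v]
  (let [len (Math/sqrt (reduce + (map * v v)))]
    (v* (/ 1.0 len) v)))

(defn triangle-normal
  "Right-handed unit normal of a triangle given as three [x y z]
   vertices."
  [[a b c]]
  (unit (cross (v- b a) (v- c a))))

(defn feeler-tip
  "Hypothetical sketch: the tip sits one feeler-length away from the
   origin along the triangle's unit normal."
  [origin triangle length]
  (v+ origin (v* length (triangle-normal triangle))))

;; A triangle lying flat in the XY plane, so its normal is +Z.
(def floor-triangle [[0 0 0] [1 0 0] [0 1 0]])

(feeler-tip [0.25 0.25 0.0] floor-triangle 0.1)
    #+END_SRC

    For this flat triangle the feeler points straight up, so the tip
    ends up directly above its origin at a height equal to the feeler
    length.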

  1723 *** Triangle Math

  1724 

  1725     The rigid objects which make up a creature have an underlying

  1726     =Geometry=, which is a =Mesh= plus a =Material= and other

  1727     important data involved with displaying the object.

  1728     

  1729     A =Mesh= is composed of =Triangles=, and each =Triangle= has three

  1730     vertices which have coordinates in world space and UV space.

  1731     

  1732     Here, =triangles= gets all the world-space triangles which

  1733     comprise a mesh, while =pixel-triangles= gets those same triangles

  1734     expressed in pixel coordinates (which are UV coordinates scaled to

  1735     fit the height and width of the UV image).

  1736 

  1737     #+caption: Programs to extract triangles from a geometry and get 

  1738     #+caption: their vertices in both world and UV-coordinates.

  1739     #+name: get-triangles

  1740     #+begin_listing clojure

  1741     #+BEGIN_SRC clojure

  1742 (defn triangle

  1743   "Get the triangle specified by triangle-index from the mesh."

  1744   [#^Geometry geo triangle-index]

  1745   (triangle-seq

  1746    (let [scratch (Triangle.)]

  1747      (.getTriangle (.getMesh geo) triangle-index scratch) scratch)))

  1748 

  1749 (defn triangles

  1750   "Return a sequence of all the Triangles which comprise a given

  1751    Geometry." 

  1752   [#^Geometry geo]

  1753   (map (partial triangle geo) (range (.getTriangleCount (.getMesh geo)))))

  1754 

  1755 (defn triangle-vertex-indices

  1756   "Get the triangle vertex indices of a given triangle from a given

  1757    mesh."

  1758   [#^Mesh mesh triangle-index]

  1759   (let [indices (int-array 3)]

  1760     (.getTriangle mesh triangle-index indices)

  1761     (vec indices)))

  1762 

(defn vertex-UV-coord

  1764   "Get the UV-coordinates of the vertex named by vertex-index"

  1765   [#^Mesh mesh vertex-index]

  1766   (let [UV-buffer

  1767         (.getData

  1768          (.getBuffer

  1769           mesh

  1770           VertexBuffer$Type/TexCoord))]

  1771     [(.get UV-buffer (* vertex-index 2))

  1772      (.get UV-buffer (+ 1 (* vertex-index 2)))]))

  1773 

  1774 (defn pixel-triangle [#^Geometry geo image index]

  1775   (let [mesh (.getMesh geo)

  1776         width (.getWidth image)

  1777         height (.getHeight image)]

  1778     (vec (map (fn [[u v]] (vector (* width u) (* height v)))

  1779               (map (partial vertex-UV-coord mesh)

  1780                    (triangle-vertex-indices mesh index))))))

  1781 

  1782 (defn pixel-triangles 

  1783   "The pixel-space triangles of the Geometry, in the same order as

  1784    (triangles geo)"

  1785   [#^Geometry geo image]

  1786   (let [height (.getHeight image)

  1787         width (.getWidth image)]

  1788     (map (partial pixel-triangle geo image)

  1789          (range (.getTriangleCount (.getMesh geo))))))

  1790     #+END_SRC

  1791     #+end_listing

  1792     
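
    As a quick check of the scaling from UV coordinates to pixel
    coordinates, here is a self-contained sketch that works on plain
    vectors instead of a jMonkeyEngine =Mesh=. The function names and
    the 256 by 128 image size are assumptions for illustration.

    #+BEGIN_SRC clojure
(defn uv->pixel
  "Scale one [u v] pair (each in 0..1) to pixel coordinates for an
   image of the given width and height."
  [width height [u v]]
  [(* width u) (* height v)])

(defn uv-triangle->pixel-triangle
  "Scale all three UV vertices of a triangle to pixel coordinates."
  [width height uv-triangle]
  (mapv (partial uv->pixel width height) uv-triangle))

(uv-triangle->pixel-triangle 256 128 [[0.0 0.0] [0.5 0.0] [0.0 1.0]])
    #+END_SRC

    The vertex at $u = 0.5$ lands halfway across the image, and the
    vertex at $v = 1.0$ lands at its full height.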

  1793 *** The Affine Transform from one Triangle to Another

  1794 

  1795     =pixel-triangles= gives us the mesh triangles expressed in pixel

  1796     coordinates and =triangles= gives us the mesh triangles expressed

  1797     in world coordinates. The tactile-sensor-profile gives the

  1798     position of each feeler in pixel-space. In order to convert

  1799     pixel-space coordinates into world-space coordinates we need

  1800     something that takes coordinates on the surface of one triangle

  1801     and gives the corresponding coordinates on the surface of another

  1802     triangle.

  1803     

  1804     Triangles are [[http://mathworld.wolfram.com/AffineTransformation.html ][affine]], which means any triangle can be transformed

  1805     into any other by a combination of translation, scaling, and

  1806     rotation. The affine transformation from one triangle to another

  1807     is readily computable if the triangle is expressed in terms of a

  1808     $4x4$ matrix.

  1809 

  1810     #+BEGIN_LaTeX

  1811     $$

  1812     \begin{bmatrix}

  1813     x_1 & x_2 & x_3 & n_x \\

  1814     y_1 & y_2 & y_3 & n_y \\ 

  1815     z_1 & z_2 & z_3 & n_z \\

  1816     1 & 1 & 1 & 1 

  1817     \end{bmatrix}

  1818     $$

  1819     #+END_LaTeX

  1820     

  1821     Here, the first three columns of the matrix are the vertices of

  1822     the triangle. The last column is the right-handed unit normal of

  1823     the triangle.

  1824     

  1825     With two triangles $T_{1}$ and $T_{2}$ each expressed as a

  1826     matrix like above, the affine transform from $T_{1}$ to $T_{2}$

  1827     is $T_{2}T_{1}^{-1}$.
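
    To see why this product works, let $e_i$ be the $i$-th standard
    basis vector, so that the $i$-th column of $T_{1}$ is $T_{1}e_i$.
    Then $T_{1}^{-1}$ sends each column of $T_{1}$ back to $e_i$, and
    $T_{2}$ sends $e_i$ on to the matching column of $T_{2}$. Writing
    the vertices of the second triangle with primes:

    #+BEGIN_LaTeX
    $$
    T_{2}T_{1}^{-1}
    \begin{bmatrix} x_i \\ y_i \\ z_i \\ 1 \end{bmatrix}
    = T_{2}e_i
    = \begin{bmatrix} x'_i \\ y'_i \\ z'_i \\ 1 \end{bmatrix},
    \qquad i = 1, 2, 3
    $$
    #+END_LaTeX

    The fourth column of $T_{1}$ is sent to the fourth column of
    $T_{2}$ in the same way. Because the bottom row of both matrices
    is all ones, the product has bottom row $(0, 0, 0, 1)$ and is
    therefore an affine transformation. The construction requires
    $T_{1}$ to be invertible, that is, its four columns must be
    linearly independent.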

    The Clojure code below recapitulates the formulas above, using
    jMonkeyEngine's =Matrix4f= objects, which can describe any affine
    transformation.

    #+caption: Program to interpret triangles as affine transforms.
    #+name: triangle-affine
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn triangle->matrix4f
  "Converts the triangle into a 4x4 matrix: The first three columns
   contain the vertices of the triangle; the last contains the unit
   normal of the triangle. The bottom row is filled with 1s."
  [#^Triangle t]
  (let [mat (Matrix4f.)
        [vert-1 vert-2 vert-3]
        (mapv #(.get t %) (range 3))
        unit-normal (do (.calculateNormal t) (.getNormal t))
        vertices [vert-1 vert-2 vert-3 unit-normal]]
    ;; (.set mat i j value) sets the entry in row i, column j.
    (doseq [col (range 4) row (range 3)]
      (.set mat row col (.get (vertices col) row))
      (.set mat 3 col 1))
    mat))

(defn triangles->affine-transform
  "Returns the affine transformation that converts each vertex in the
   first triangle into the corresponding vertex in the second
   triangle."
  [#^Triangle tri-1 #^Triangle tri-2]
  (.mult
   (triangle->matrix4f tri-2)
   (.invert (triangle->matrix4f tri-1))))
    #+END_SRC
    #+end_listing
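
    The same mapping can be tried out on plain vectors, without
    jMonkeyEngine. The sketch below is not the matrix method used
    above: it swaps in barycentric coordinates, which give the same
    result for points in the plane of the source triangle. The names
    =barycentric= and =pixel->world= are hypothetical.

    #+BEGIN_SRC clojure
(defn barycentric
  "Barycentric weights [wa wb wc] of the 2D point p with respect to
   the 2D triangle [a b c]. The weights sum to 1."
  [[[ax ay] [bx by] [cx cy]] [px py]]
  (let [det (- (* (- bx ax) (- cy ay))
               (* (- cx ax) (- by ay)))
        wb  (/ (- (* (- px ax) (- cy ay))
                  (* (- cx ax) (- py ay)))
               det)
        wc  (/ (- (* (- bx ax) (- py ay))
                  (* (- px ax) (- by ay)))
               det)]
    [(- 1 wb wc) wb wc]))

(defn pixel->world
  "Map the 2D point p on from-tri to the corresponding 3D point on
   to-tri by reusing p's barycentric weights."
  [from-tri to-tri p]
  (let [weights (barycentric from-tri p)]
    (apply mapv +
           (map (fn [w vertex] (mapv #(* w %) vertex))
                weights to-tri))))

;; A 4x4 pixel triangle mapped onto a triangle in the world's xz-plane.
(pixel->world [[0.0 0.0] [4.0 0.0] [0.0 4.0]]
              [[0.0 0.0 0.0] [2.0 0.0 0.0] [0.0 0.0 2.0]]
              [1.0 1.0])
;; => [0.5 0.0 0.5]
    #+END_SRC

    Each vertex of the source triangle lands exactly on the matching
    vertex of the target triangle, which is the defining property of
    the transform computed by =triangles->affine-transform=.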

*** Triangle Boundaries

    For efficiency's sake I will divide the tactile-profile image into
    small rectangles which bound each pixel-triangle, then extract the
    points which lie inside the triangle and map them to 3D-space using
    =triangles->affine-transform= above. To do this I need a function,
    =convex-bounds=, which finds the smallest axis-aligned box that
    contains a 2D triangle.

    =inside-triangle?= determines whether a point is inside a triangle
    in 2D pixel-space.

    #+caption: Program to efficiently determine point inclusion
    #+caption: in a triangle.
    #+name: in-triangle
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn convex-bounds
  "Returns the smallest axis-aligned rectangle containing the given
   vertices, as a vector of whole numbers [left top width height]."
  [verts]
  (let [xs (map first verts)
        ys (map second verts)
        x0 (Math/floor (apply min xs))
        y0 (Math/floor (apply min ys))
        x1 (Math/ceil (apply max xs))
        y1 (Math/ceil (apply max ys))]
    [x0 y0 (- x1 x0) (- y1 y0)]))

(defn same-side?
  "Given the points p1 and p2 and the reference point ref, is point p
  on the same side of the line that goes through p1 and p2 as ref is?"
  [p1 p2 ref p]
  (<=
   0
   (.dot
    (.cross (.subtract p2 p1) (.subtract p p1))
    (.cross (.subtract p2 p1) (.subtract ref p1)))))

(defn inside-triangle?
  "Is the point inside the triangle?"
  {:author "Dylan Holmes"}
  [#^Triangle tri #^Vector3f p]
  (let [[vert-1 vert-2 vert-3] [(.get1 tri) (.get2 tri) (.get3 tri)]]
    (and
     (same-side? vert-1 vert-2 vert-3 p)
     (same-side? vert-2 vert-3 vert-1 p)
     (same-side? vert-3 vert-1 vert-2 p))))
    #+END_SRC
    #+end_listing
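
    A minimal 2D sketch of the same idea on plain vectors: enumerate
    the pixels of a bounding box, then keep those that pass the
    same-side test against all three edges. It assumes every pixel in
    the box is a candidate feeler, whereas =CORTEX= only considers the
    white pixels of the profile image. All names here are hypothetical.

    #+BEGIN_SRC clojure
(defn cross-z
  "z-component of the cross product (b - a) x (p - a) for 2D points."
  [[ax ay] [bx by] [px py]]
  (- (* (- bx ax) (- py ay))
     (* (- by ay) (- px ax))))

(defn same-side-2d?
  "Are p and ref on the same side of (or on) the line through a and b?"
  [a b ref p]
  (<= 0 (* (cross-z a b p) (cross-z a b ref))))

(defn inside-2d?
  "Is the 2D point p inside, or on the boundary of, triangle [a b c]?"
  [[a b c] p]
  (and (same-side-2d? a b c p)
       (same-side-2d? b c a p)
       (same-side-2d? c a b p)))

(defn candidate-pixels
  "All integer pixel coordinates in the box [left top width height]."
  [[left top width height]]
  (for [x (range left (+ left width))
        y (range top (+ top height))]
    [x y]))

;; Pixels of a 3x3 bounding box that fall inside the triangle.
(filter (partial inside-2d? [[0 0] [3 0] [0 3]])
        (candidate-pixels [0 0 3 3]))
;; => ([0 0] [0 1] [0 2] [1 0] [1 1] [1 2] [2 0] [2 1])
    #+END_SRC

    Only =[2 2]= is rejected: it lies inside the bounding box but
    beyond the triangle's hypotenuse.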

*** Feeler Coordinates

    The triangle-related functions above make short work of
    calculating the positions and orientations of each feeler in
    world-space.

    #+caption: Program to get the coordinates of ``feelers'' in
    #+caption: both pixel-space and world-space.
    #+name: feeler-coordinates
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn feeler-pixel-coords
 "Returns the coordinates of the feelers in pixel space in lists, one
  list for each triangle, ordered in the same way as (triangles) and
  (pixel-triangles)."
 [#^Geometry geo image]
 (map
  (fn [pixel-triangle]
    (filter
     (fn [coord]
       (inside-triangle? (->triangle pixel-triangle)
                         (->vector3f coord)))
     (white-coordinates image (convex-bounds pixel-triangle))))
  (pixel-triangles geo image)))

(defn feeler-world-coords
 "Returns the coordinates of the feelers in world space in lists, one
  list for each triangle, ordered in the same way as (triangles) and
  (pixel-triangles)."
 [#^Geometry geo image]
 (let [transforms
       (map #(triangles->affine-transform
              (->triangle %1) (->triangle %2))
            (pixel-triangles geo image)
            (triangles geo))]
   (map (fn [transform coords]
          (map #(.mult transform (->vector3f %)) coords))
        transforms (feeler-pixel-coords geo image))))
    #+END_SRC
    #+end_listing

    #+caption: Program to get the position of the base and tip of
    #+caption: each ``feeler''
    #+name: feeler-tips
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn feeler-origins
  "The world space coordinates of the root of each feeler."
  [#^Geometry geo image]
  (apply concat (feeler-world-coords geo image)))

(defn feeler-tips
  "The world space coordinates of the tip of each feeler."
  [#^Geometry geo image]
  (let [world-coords (feeler-world-coords geo image)
        normals
        (map
         (fn [triangle]
           (.calculateNormal triangle)
           (.clone (.getNormal triangle)))
         (map ->triangle (triangles geo)))]

    (mapcat (fn [origins normal]
              (map #(.add % normal) origins))
            world-coords normals)))

(defn touch-topology
  [#^Geometry geo image]
  (collapse (apply concat (feeler-pixel-coords geo image))))
    #+END_SRC
    #+end_listing

*** Simulated Touch

    Now that the functions to construct feelers are complete,
    =touch-kernel= generates functions to be called from within a
    simulation that perform the necessary physics collisions to
    collect tactile data, and =touch!= recursively applies it to every
    node in the creature.

    #+caption: Efficient program to transform a ray from
    #+caption: one position to another.
    #+name: set-ray
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn set-ray [#^Ray ray #^Matrix4f transform
               #^Vector3f origin #^Vector3f tip]
  ;; Doing everything locally reduces garbage collection by enough to
  ;; be worth it.
  (.mult transform origin (.getOrigin ray))
  (.mult transform tip (.getDirection ray))
  (.subtractLocal (.getDirection ray) (.getOrigin ray))
  (.normalizeLocal (.getDirection ray)))
    #+END_SRC
    #+end_listing

    #+caption: This is the core of touch in =CORTEX=: each feeler
    #+caption: follows the object it is bound to, reporting any
    #+caption: collisions that may happen.
    #+name: touch-kernel
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn touch-kernel
  "Constructs a function which will return tactile sensory data from
   'geo when called from inside a running simulation"
  [#^Geometry geo]
  (if-let
      [profile (tactile-sensor-profile geo)]
    (let [ray-reference-origins (feeler-origins geo profile)
          ray-reference-tips (feeler-tips geo profile)
          ray-length (tactile-scale geo)
          current-rays (map (fn [_] (Ray.)) ray-reference-origins)
          topology (touch-topology geo profile)
          correction (float (* ray-length -0.2))]
      ;; slight tolerance for very close collisions.
      (dorun
       (map (fn [origin tip]
              (.addLocal origin (.mult (.subtract tip origin)
                                       correction)))
            ray-reference-origins ray-reference-tips))
      (dorun (map #(.setLimit % ray-length) current-rays))
      (fn [node]
        (let [transform (.getWorldMatrix geo)]
          (dorun
           (map (fn [ray ref-origin ref-tip]
                  (set-ray ray transform ref-origin ref-tip))
                current-rays ray-reference-origins
                ray-reference-tips))
          (vector
           topology
           (vec
            (for [ray current-rays]
              (do
                (let [results (CollisionResults.)]
                  (.collideWith node ray results)
                  (let [touch-objects
                        (filter #(not (= geo (.getGeometry %)))
                                results)
                        limit (.getLimit ray)]
                    [(if (empty? touch-objects)
                       limit
                       (let [response
                             (apply min (map #(.getDistance %)
                                             touch-objects))]
                         (FastMath/clamp
                          (float
                           (if (> response limit) (float 0.0)
                               (+ response correction)))
                          (float 0.0)
                          limit)))
                     limit])))))))))))
    #+END_SRC
    #+end_listing
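
    The arithmetic that turns collision distances into a feeler
    reading can be isolated as a pure function. The sketch below
    mirrors the branches of =touch-kernel= on plain numbers, assuming
    the same correction of -0.2 times the ray length. The name
    =feeler-reading= is hypothetical, and the real kernel returns the
    pair =[reading limit]= rather than the reading alone.

    #+BEGIN_SRC clojure
(defn feeler-reading
  "Reading for one feeler, given the ray limit and the distances of
   all collisions along the ray (the feeler's own geometry excluded)."
  [limit distances]
  (let [correction (* limit -0.2)]
    (if (empty? distances)
      limit                                  ; nothing was touched
      (let [response (apply min distances)]  ; nearest collision
        (if (> response limit)
          0.0
          ;; clamp the corrected distance to [0, limit]
          (-> (+ response correction)
              (max 0.0)
              (min limit)))))))

(feeler-reading 1.0 [])        ; => 1.0
(feeler-reading 1.0 [0.7 0.5]) ; about 0.3: nearest hit at 0.5, minus 0.2
(feeler-reading 1.0 [0.1])     ; => 0.0
    #+END_SRC

    A reading equal to the limit means the feeler touched nothing,
    and smaller readings mean closer contact.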

    Armed with the =touch!= function, =CORTEX= becomes capable of
    giving creatures a sense of touch. A simple test is to create a
    cube that is outfitted with a uniform distribution of touch
    sensors. It can feel the ground and any balls that it touches.

    #+caption: =CORTEX= interface for creating touch in a simulated
    #+caption: creature.
    #+name: touch
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn touch!
  "Endow the creature with the sense of touch. Returns a sequence of
   functions, one for each body part with a tactile-sensor-profile,
   each of which when called returns sensory data for that body part."
  [#^Node creature]
  (filter
   (comp not nil?)
   (map touch-kernel
        (filter #(isa? (class %) Geometry)
                (node-seq creature)))))
    #+END_SRC
    #+end_listing

    The tactile-sensor-profile image for the touch cube is a simple
    cross with a uniform distribution of touch sensors:

    #+caption: The touch profile for the touch-cube. Each pure white
    #+caption: pixel defines a touch sensitive feeler.
    #+name: touch-cube-uv-map
    #+ATTR_LaTeX: :width 7cm
    [[./images/touch-profile.png]]

    #+caption: The touch cube reacts to cannonballs. The black, red,
    #+caption: and white cross on the right is a visual display of
    #+caption: the creature's touch. White means that it is feeling
    #+caption: something strongly, black is not feeling anything,
    #+caption: and gray is in-between. The cube can feel both the
    #+caption: floor and the ball. Notice that when the ball causes
    #+caption: the cube to tip, the bottom face can still feel
    #+caption: part of the ground.
    #+name: touch-cube-uv-map-2
    #+ATTR_LaTeX: :width 15cm
    [[./images/touch-cube.png]]

** Proprioception provides knowledge of your own body's position

   Close your eyes, and touch your nose with your right index finger.
   How did you do it? You could not see your hand, and neither your
   hand nor your nose could use the sense of touch to guide the path
   of your hand. There are no sound cues, and taste and smell
   certainly don't provide any help. You know where your hand is
   without your other senses because of proprioception.

   Humans can sometimes lose this sense through viral infections or
   damage to the spinal cord or brain, and when they do, they lose
   the ability to control their own bodies without looking directly at
   the parts they want to move. In [[http://en.wikipedia.org/wiki/The_Man_Who_Mistook_His_Wife_for_a_Hat][The Man Who Mistook His Wife for a
   Hat]] (\cite{man-wife-hat}), a woman named Christina loses this
   sense and has to learn how to move by carefully watching her arms
   and legs. She describes proprioception as the "eyes of the body,
   the way the body sees itself".

   Proprioception in humans is mediated by [[http://en.wikipedia.org/wiki/Articular_capsule][joint capsules]], [[http://en.wikipedia.org/wiki/Muscle_spindle][muscle
   spindles]], and the [[http://en.wikipedia.org/wiki/Golgi_tendon_organ][Golgi tendon organs]]. These measure the relative
   positions of each body part by monitoring muscle strain and length.

   It's clear that this is a vital sense for fluid, graceful movement.
   It's also particularly easy to implement in jMonkeyEngine.

   My simulated proprioception calculates the relative angles of each
   joint from the rest position defined in the blender file. This
   simulates the muscle spindles and joint capsules. I will deal with
   Golgi tendon organs, which calculate muscle strain, in the next
   section.

*** Helper functions

    =absolute-angle= calculates the angle between two vectors,
    relative to a third axis vector. This angle is the number of
    radians you have to move counterclockwise around the axis vector
    to get from the first to the second vector. Unlike a normal
    dot-product angle, it is not commutative: swapping the two vectors
    changes the result.

    The purpose of these functions is to build a system of angle
    measurement that is biologically plausible.

    #+caption: Program to measure angles around an axis vector
    #+name: helpers
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn right-handed?
  "true iff the three vectors form a right handed coordinate
   system. The three vectors do not have to be normalized or
   orthogonal."
  [vec1 vec2 vec3]
  (pos? (.dot (.cross vec1 vec2) vec3)))

(defn absolute-angle
  "The angle between 'vec1 and 'vec2 around 'axis. In the range
   [0 (* 2 Math/PI)]."
  [vec1 vec2 axis]
  (let [angle (.angleBetween vec1 vec2)]
    (if (right-handed? vec1 vec2 axis)
      angle (- (* 2 Math/PI) angle))))
    #+END_SRC
    #+end_listing
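
    The non-commutativity is easy to check on plain vectors. The
    sketch below re-implements the same idea without jMonkeyEngine,
    representing vectors as =[x y z]=. The names =dot=, =cross=,
    =angle-between= and =absolute-angle*= are hypothetical, and the
    sketch normalizes its inputs when computing the unsigned angle.

    #+BEGIN_SRC clojure
(defn dot [a b] (reduce + (map * a b)))

(defn cross [[ax ay az] [bx by bz]]
  [(- (* ay bz) (* az by))
   (- (* az bx) (* ax bz))
   (- (* ax by) (* ay bx))])

(defn angle-between
  "Unsigned angle between two non-zero vectors, in [0, pi]."
  [a b]
  (let [cosine (/ (dot a b)
                  (Math/sqrt (* (dot a a) (dot b b))))]
    ;; guard against rounding pushing the cosine outside [-1, 1]
    (Math/acos (max -1.0 (min 1.0 cosine)))))

(defn absolute-angle*
  "Counterclockwise angle from a to b around axis, in [0, 2*pi]."
  [a b axis]
  (let [angle (angle-between a b)]
    (if (pos? (dot (cross a b) axis))
      angle
      (- (* 2 Math/PI) angle))))

(absolute-angle* [1 0 0] [0 1 0] [0 0 1]) ; pi/2
(absolute-angle* [0 1 0] [1 0 0] [0 0 1]) ; 3*pi/2
    #+END_SRC

    Going from the x-axis to the y-axis around z is a quarter turn,
    while the reverse trip in the same rotational direction is three
    quarters of a turn.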

*** Proprioception Kernel

    Given a joint, =proprioception-kernel= produces a function that
    calculates the Euler angles between the objects the joint
    connects. The only tricky part here is making the angles relative
    to the joint's initial ``straightness''.

    #+caption: Program to return biologically reasonable proprioceptive
    #+caption: data for each joint.
    #+name: proprioception
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn proprioception-kernel
  "Returns a function which returns proprioceptive sensory data when
  called inside a running simulation."
  [#^Node parts #^Node joint]
  (let [[obj-a obj-b] (joint-targets parts joint)
        joint-rot (.getWorldRotation joint)
        x0 (.mult joint-rot Vector3f/UNIT_X)
        y0 (.mult joint-rot Vector3f/UNIT_Y)
        z0 (.mult joint-rot Vector3f/UNIT_Z)]
    (fn []
      (let [rot-a (.clone (.getWorldRotation obj-a))
            rot-b (.clone (.getWorldRotation obj-b))
            x (.mult rot-a x0)
            y (.mult rot-a y0)
            z (.mult rot-a z0)

            X (.mult rot-b x0)
            Y (.mult rot-b y0)
            Z (.mult rot-b z0)
            heading  (Math/atan2 (.dot X z) (.dot X x))
            pitch  (Math/atan2 (.dot X y) (.dot X x))

            ;; rotate x-vector back to origin
            reverse
            (doto (Quaternion.)
              (.fromAngleAxis
               (.angleBetween X x)
               (let [cross (.normalize (.cross X x))]
                 ;; X and x are parallel: fall back to the y axis
                 (if (zero? (.length cross)) y cross))))
            roll (absolute-angle (.mult reverse Y) y x)]
        [heading pitch roll]))))

(defn proprioception!
  "Endow the creature with the sense of proprioception. Returns a
   function which, when called inside a running simulation, returns a
   sequence of proprioceptive data, one entry for each child of the
   \"joints\" node in the creature."
  [#^Node creature]
  ;; extract the body's joints
  (let [senses (map (partial proprioception-kernel creature)
                    (joints creature))]
    (fn []
      (map #(%) senses))))
    #+END_SRC
    #+end_listing

    =proprioception!= maps =proprioception-kernel= across all the
    joints of the creature. It uses the same list of joints that
    =joints= uses. Proprioception is the easiest sense to implement in
    =CORTEX=, and it will play a crucial role when efficiently
    implementing empathy.
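
    The heading and pitch computation reduces to two =atan2= calls on
    dot products, which can be tried on plain vectors. This is a
    sketch under the assumption that the frame axes are orthonormal.
    =dot= and =heading-and-pitch= are hypothetical names, and roll is
    left out because it needs the quaternion step shown above.

    #+BEGIN_SRC clojure
(defn dot [a b] (reduce + (map * a b)))

(defn heading-and-pitch
  "Heading and pitch of the direction big-x, measured in the frame
   whose axes are x, y and z."
  [[x y z] big-x]
  [(Math/atan2 (dot big-x z) (dot big-x x))
   (Math/atan2 (dot big-x y) (dot big-x x))])

(let [frame [[1.0 0.0 0.0] [0.0 1.0 0.0] [0.0 0.0 1.0]]]
  [(heading-and-pitch frame [1.0 0.0 0.0])   ; joint at rest
   (heading-and-pitch frame [0.0 0.0 1.0])   ; x-axis swung onto z
   (heading-and-pitch frame [0.0 1.0 0.0])]) ; x-axis swung onto y
;; => [[0.0 0.0] [pi/2 0.0] [0.0 pi/2]], with pi/2 as a double
    #+END_SRC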

    #+caption: In the upper right corner, the three proprioceptive
    #+caption: angle measurements are displayed. Red is yaw, Green is
    #+caption: pitch, and White is roll.
    #+name: proprio
    #+ATTR_LaTeX: :width 11cm
    [[./images/proprio.png]]

** Muscles contain both sensors and effectors

   Surprisingly enough, terrestrial creatures only move by using
   torque applied about their joints. There's not a single straight
   line of force in the human body at all! (A straight line of force
   would correspond to some sort of jet or rocket propulsion.)

   In humans, muscles are composed of muscle fibers which can contract
   to exert force. The muscle fibers which compose a muscle are
   partitioned into discrete groups which are each controlled by a
   single alpha motor neuron. A single alpha motor neuron might
   control as few as three or as many as one thousand muscle
   fibers. When the alpha motor neuron is engaged by the spinal cord,
   it activates all of the muscle fibers to which it is attached. The
   spinal cord generally engages the alpha motor neurons which control
   few muscle fibers before the motor neurons which control many
   muscle fibers. This recruitment strategy allows for precise
   movements at low strength. The collection of all motor neurons that
   control a muscle is called the motor pool. The brain essentially
   says "activate 30% of the motor pool" and the spinal cord recruits
   motor neurons until 30% are activated. Since the distribution of
   power among motor neurons is unequal and recruitment goes from
   weakest to strongest, the first 30% of the motor pool might be 5%
   of the strength of the muscle.

   My simulated muscles follow a similar design: Each muscle is
   defined by a 1-D array of numbers (the "motor pool"). Each entry in
   the array represents a motor neuron which controls a number of
   muscle fibers equal to the value of the entry. Each muscle has a
   scalar strength factor which determines the total force the muscle
   can exert when all motor neurons are activated. The effector
   function for a muscle takes a number to index into the motor pool,
   and then "activates" all the motor neurons whose index is less than
   or equal to the number. Each motor neuron will apply force in
   proportion to its value in the array. Lower values cause less
   force. The lower values can be put at the "beginning" of the 1-D
   array to simulate the layout of actual human muscles, which are
   capable of more precise movements when exerting less force. Or, the
   motor pool can simulate more exotic recruitment strategies which do
   not correspond to human muscles.

   This 1-D array is defined in an image file for ease of
   creation/visualization. Here is an example muscle profile image.

   #+caption: A muscle profile image that describes the strengths
   #+caption: of each motor neuron in a muscle. White is weakest
   #+caption: and dark red is strongest. This particular pattern
   #+caption: has weaker motor neurons at the beginning, just
   #+caption: like human muscle.
   #+name: muscle-recruit
   #+ATTR_LaTeX: :width 7cm
   [[./images/basic-muscle.png]]

*** Muscle meta-data

    #+caption: Program to deal with loading muscle data from a blender
    #+caption: file's metadata.
    #+name: motor-pool
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn muscle-profile-image
  "Get the muscle-profile image from the node's blender meta-data."
  [#^Node muscle]
  (if-let [image (meta-data muscle "muscle")]
    (load-image image)))

(defn muscle-strength
  "Return the strength of this muscle, or 1 if it is not defined."
  [#^Node muscle]
  (if-let [strength (meta-data muscle "strength")]
    strength 1))

(defn motor-pool
  "Return a vector where each entry is the strength of the \"motor
   neuron\" at that part in the muscle."
  [#^Node muscle]
  (let [profile (muscle-profile-image muscle)]
    (vec
     (let [width (.getWidth profile)]
       (for [x (range width)]
         (- 255
            (bit-and
             0x0000FF
             (.getRGB profile x 0))))))))
    #+END_SRC
    #+end_listing

    Of note here is =motor-pool= which interprets the muscle-profile
    image in a way that allows me to use gradients between white and
    red, instead of shades of gray as I've been using for all the
    other senses. This is purely an aesthetic touch.
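
    The per-pixel arithmetic in =motor-pool= can be checked in
    isolation. This sketch assumes each pixel is a packed ARGB
    integer of the kind =.getRGB= returns. The name =pixel->strength=
    is hypothetical.

    #+BEGIN_SRC clojure
(defn pixel->strength
  "Motor neuron strength encoded by one ARGB pixel: 255 minus the
   blue channel."
  [argb]
  (- 255 (bit-and 0x0000FF argb)))

;; white, light red, pure red, dark red
(mapv pixel->strength [0xFFFFFFFF 0xFFFF8080 0xFFFF0000 0xFF800000])
;; => [0 127 255 255]
    #+END_SRC

    Only the blue channel is read, so white maps to 0 and any red
    that contains no blue maps to the maximum of 255.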

*** Creating muscles

    #+caption: This is the core movement function in =CORTEX=, which
    #+caption: implements muscles that report on their activation.
    #+name: muscle-kernel
    #+begin_listing clojure
    #+BEGIN_SRC clojure
(defn movement-kernel
  "Returns a function which when called with an integer value inside a
   running simulation will cause movement in the creature according
   to the muscle's position and strength profile. Each function
   returns the amount of force applied / max force."
  [#^Node creature #^Node muscle]
  (let [target (closest-node creature muscle)
        axis
        (.mult (.getWorldRotation muscle) Vector3f/UNIT_Y)
        strength (muscle-strength muscle)

        pool (motor-pool muscle)
        pool-integral (reductions + pool)
        forces
        (vec (map #(float (* strength (/ % (last pool-integral))))
                  pool-integral))
        control (.getControl target RigidBodyControl)]
    (fn [n]
      (let [pool-index (max 0 (min n (dec (count pool))))
            force (forces pool-index)]
        (.applyTorque control (.mult axis force))
        (float (/ force strength))))))

(defn movement!
  "Endow the creature with the power of movement. Returns a sequence
   of functions, each of which accepts an integer value and will
   activate its corresponding muscle."
  [#^Node creature]
  (for [muscle (muscles creature)]
    (movement-kernel creature muscle)))
    #+END_SRC
    #+end_listing

    =movement-kernel= creates a function that will move the nearest
    physical object to the muscle node. The muscle exerts a rotational
    force dependent on its orientation to the object in the blender
    file. The function returned by =movement-kernel= is also a sense
    function: it returns the fraction of the total muscle strength
    that is currently being employed. This is analogous to muscle
    tension in humans and completes the sense of proprioception begun
    in the last section.
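
    The recruitment arithmetic can be sketched on plain numbers,
    without a physics engine. The example pool and strength below are
    made up for illustration, and =force-table= and =activate= are
    hypothetical names that mirror the =forces= vector and the
    returned function of =movement-kernel=.

    #+BEGIN_SRC clojure
(defn force-table
  "Cumulative force available at each motor-pool index, scaled so
   that recruiting the whole pool yields `strength`."
  [strength pool]
  (let [pool-integral (reductions + pool)
        total (last pool-integral)]
    (mapv #(double (* strength (/ % total))) pool-integral)))

(defn activate
  "Force and fraction of maximum force when recruiting motor neurons
   0 through n, with n clamped to the bounds of the pool."
  [strength pool n]
  (let [forces (force-table strength pool)
        index (max 0 (min n (dec (count pool))))
        force (forces index)]
    {:force force :fraction (/ force strength)}))

(force-table 10 [1 1 2 4 8]) ; => [0.625 1.25 2.5 5.0 10.0]
(activate 10 [1 1 2 4 8] 2)  ; => {:force 2.5, :fraction 0.25}
(activate 10 [1 1 2 4 8] 99) ; => {:force 10.0, :fraction 1.0}
    #+END_SRC

    With the weak motor neurons placed first, recruiting three of the
    five neurons yields only a quarter of the muscle's strength,
    which is the fine control at low force described earlier.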

** =CORTEX= brings complex creatures to life!

   The ultimate test of =CORTEX= is to create a creature with the full
   gamut of senses and put it through its paces.

   With all senses enabled, my right hand model looks like an
   intricate marionette hand with several strings for each finger:

   #+caption: View of the hand model with all sense nodes. You can see
   #+caption: the joint, muscle, ear, and eye nodes here.
   #+name: hand-nodes-1
   #+ATTR_LaTeX: :width 11cm
   [[./images/hand-with-all-senses2.png]]

   #+caption: An alternate view of the hand.
   #+name: hand-nodes-2
   #+ATTR_LaTeX: :width 15cm
   [[./images/hand-with-all-senses3.png]]

   With the hand fully rigged with senses, I can run it through a test
   that exercises everything.

   #+caption: A full test of the hand with all senses. Note especially
   #+caption: the interactions the hand has with itself: it feels
   #+caption: its own palm and fingers, and when it curls its fingers,
   #+caption: it sees them with its eye (which is located in the center
   #+caption: of the palm). The red block appears with a pure tone sound.
   #+caption: The hand then uses its muscles to launch the cube!
   #+name: integration
   #+ATTR_LaTeX: :width 16cm
   [[./images/integration.png]]

  2421 

  2422 ** =CORTEX= enables many possibilities for further research

  2423 

  2424    Often times, the hardest part of building a system involving

  2425    creatures is dealing with physics and graphics. =CORTEX= removes

  2426    much of this initial difficulty and leaves researchers free to

  2427    directly pursue their ideas. I hope that even undergrads with a

  2428    passing curiosity about simulated touch or creature evolution will

  2429    be able to use cortex for experimentation. =CORTEX= is a completely

  2430    simulated world, and far from being a disadvantage, its simulated

  2431    nature enables you to create senses and creatures that would be

  2432    impossible to make in the real world.

  2433 

  2434    While not by any means a complete list, here are some paths

  2435    =CORTEX= is well suited to help you explore:

  2436 

  2437    - Empathy         :: my empathy program leaves many areas for

  2438         improvement, among which are using vision to infer

  2439         proprioception and looking up sensory experience with imagined

  2440         vision, touch, and sound.

  2441    - Evolution       :: Karl Sims created a rich environment for

  2442         simulating the evolution of creatures on a connection

  2443         machine. Today, this can be redone and expanded with =CORTEX=

  2444         on an ordinary computer.

  2445    - Exotic senses  :: Cortex enables many fascinating senses that are

  2446         not possible to build in the real world. For example,

  2447         telekinesis is an interesting avenue to explore. You can also

  2448         make a ``semantic'' sense which looks up metadata tags on

  2449         objects in the environment the metadata tags might contain

  2450         other sensory information.

  2451    - Imagination via subworlds :: this would involve a creature with

  2452         an effector which creates an entire new sub-simulation where

  2453         the creature has direct control over placement/creation of

  2454         objects via simulated telekinesis. The creature observes this

  2455         sub-world through it's normal senses and uses its observations

  2456         to make predictions about its top level world.

  2457    - Simulated prescience :: step the simulation forward a few ticks,

  2458         gather sensory data, then supply this data for the creature as

  2459         one of its actual senses. The cost of prescience is slowing

  2460         the simulation down by a factor proportional to however far

  2461         you want the entities to see into the future. What happens

  2462         when two evolved creatures that can each see into the future

  2463         fight each other?

  2464    - Swarm creatures :: Program a group of creatures that cooperate

  2465         with each other. Because the creatures would be simulated, you

  2466         could investigate computationally complex rules of behavior

  2467         which still, from the group's point of view, would happen in

  2468         ``real time''. Interactions could be as simple as cellular

  2469         organisms communicating via flashing lights, or as complex as

  2470         humanoids completing social tasks, etc.

  2471    - =HACKER= for writing muscle-control programs :: Presented with

  2472         low-level muscle control/ sense API, generate higher level

  2473         programs for accomplishing various stated goals. Example goals

  2474         might be "extend all your fingers" or "move your hand into the

  2475         area with blue light" or "decrease the angle of this joint".

  2476         It would be like Sussman's HACKER, except it would operate

  2477         with much more data in a more realistic world. Start off with

  2478         "calisthenics" to develop subroutines over the motor control

  2479         API. This would be the "spinal cord" of a more intelligent

  2480         creature. The low level programming code might be a Turing

  2481         machine that could develop programs to iterate over a "tape"

  2482         where each entry in the tape could control recruitment of the

  2483         fibers in a muscle.

  2484    - Sense fusion    :: There is much work to be done on sense

  2485         integration -- building up a coherent picture of the world and

  2486         the things in it. With =CORTEX= as a base, you can explore

  2487         concepts like self-organizing maps or cross modal clustering

  2488         in ways that have never before been tried.

  2489    - Inverse kinematics :: experiments in sense guided motor control

  2490         are easy given =CORTEX='s support -- you can get right to the

  2491         hard control problems without worrying about physics or

  2492         senses.

  2493 

  2494 * =EMPATH=: action recognition in a simulated worm

  2495 

  2496   Here I develop a computational model of empathy, using =CORTEX= as a

  2497   base. Empathy in this context is the ability to observe another

  2498   creature and infer what sorts of sensations that creature is

  2499   feeling. My empathy algorithm involves multiple phases. First is

  2500   free-play, where the creature moves around and gains sensory

  2501   experience. From this experience I construct a representation of the

  2502   creature's sensory state space, which I call \Phi-space. Using

  2503   \Phi-space, I construct an efficient function which takes the

  2504   limited data that comes from observing another creature and enriches

  2505   it with a full complement of imagined sensory data. I can then use the

  2506   imagined sensory data to recognize what the observed creature is

  2507   doing and feeling, using straightforward embodied action predicates.

  2508   This is all demonstrated using a simple worm-like creature, by

  2509   recognizing worm-actions based on limited data.

  2510 

  2511   #+caption: Here is the worm with which we will be working. 

  2512   #+caption: It is composed of 5 segments. Each segment has a 

  2513   #+caption: pair of extensor and flexor muscles. Each of the 

  2514   #+caption: worm's four joints is a hinge joint which allows 

  2515   #+caption: about 30 degrees of rotation to either side. Each segment

  2516   #+caption: of the worm is touch-capable and has a uniform 

  2517   #+caption: distribution of touch sensors on each of its faces.

  2518   #+caption: Each joint has a proprioceptive sense to detect 

  2519   #+caption: relative positions. The worm segments are all the 

  2520   #+caption: same except for the first one, which has a much

  2521   #+caption: higher weight than the others to allow for easy 

  2522   #+caption: manual motor control.

  2523   #+name: basic-worm-view

  2524   #+ATTR_LaTeX: :width 10cm

  2525   [[./images/basic-worm-view.png]]

  2526 

  2527   #+caption: Program for reading a worm from a blender file and 

  2528   #+caption: outfitting it with the senses of proprioception, 

  2529   #+caption: touch, and the ability to move, as specified in the 

  2530   #+caption: blender file.

  2531   #+name: get-worm

  2532   #+begin_listing clojure

  2533   #+begin_src clojure

  2534 (defn worm []

  2535   (let [model (load-blender-model "Models/worm/worm.blend")]

  2536     {:body (doto model (body!))

  2537      :touch (touch! model)

  2538      :proprioception (proprioception! model)

  2539      :muscles (movement! model)}))

  2540   #+end_src

  2541   #+end_listing

  2542 

  2543 ** Embodiment factors action recognition into manageable parts

  2544 

  2545    Using empathy, I divide the problem of action recognition into a

  2546    recognition process expressed in the language of a full complement

  2547    of senses, and an imaginative process that generates full sensory

  2548    data from partial sensory data. Splitting the action recognition

  2549    problem in this manner greatly reduces the total amount of work to

  2550    recognize actions: The imaginative process is mostly just matching

  2551    previous experience, and the recognition process gets to use all

  2552    the senses to directly describe any action.

  2553 

  2554 ** Action recognition is easy with a full gamut of senses

  2555 

  2556    Embodied representations using multiple senses such as touch,

  2557    proprioception, and muscle tension turn out to be exceedingly

  2558    efficient at describing body-centered actions. It is the ``right

  2559    language for the job''. For example, it takes only around 5 lines

  2560    of LISP code to describe the action of ``curling'' using embodied

  2561    primitives. It takes about 10 lines to describe the seemingly

  2562    complicated action of wiggling.

  2563 

  2564    The following action predicates each take a stream of sensory

  2565    experience, observe however much of it they desire, and decide

  2566    whether the worm is doing the action they describe. =curled?=

  2567    relies on proprioception, =resting?= relies on touch, =wiggling?=

  2568    relies on a Fourier analysis of muscle contraction, and

  2569    =grand-circle?= relies on touch and reuses =curled?= as a guard.

  2570    

  2571    #+caption: Program for detecting whether the worm is curled. This is the 

  2572    #+caption: simplest action predicate, because it only uses the last frame 

  2573    #+caption: of sensory experience, and only uses proprioceptive data. Even 

  2574    #+caption: this simple predicate, however, is automatically frame 

  2575    #+caption: independent and ignores vermopomorphic differences such as 

  2576    #+caption: worm textures and colors.

  2577    #+name: curled

  2578    #+begin_listing clojure

  2579    #+begin_src clojure

  2580 (defn curled?

  2581   "Is the worm curled up?"

  2582   [experiences]

  2583   (every?

  2584    (fn [[_ _ bend]]

  2585      (> (Math/sin bend) 0.64))

  2586    (:proprioception (peek experiences))))

  2587    #+end_src

  2588    #+end_listing
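
   As a hedged usage sketch, here is a predicate of the same shape
   applied to a hand-made experience vector. The name
   =all-joints-bent?= and the sample data are hypothetical; I assume
   each proprioceptive reading is a triple whose third element is the
   bend angle in radians, as the destructuring in =curled?= suggests.

   #+begin_src clojure
(defn all-joints-bent?
  "Same test as the curled? listing: every joint in the most recent
   frame must have sin(bend) above 0.64."
  [experiences]
  (every? (fn [[_ _ bend]] (> (Math/sin bend) 0.64))
          (:proprioception (peek experiences))))

;; Hypothetical data: two frames, four joints each.
(def sample-experiences
  [{:proprioception [[0 0 0.1] [0 0 0.1] [0 0 0.1] [0 0 0.1]]}
   {:proprioception [[0 0 1.0] [0 0 0.9] [0 0 0.8] [0 0 0.75]]}])

(all-joints-bent? sample-experiences)       ; => true  (last frame is bent)
(all-joints-bent? (pop sample-experiences)) ; => false (last frame is flat)
   #+end_src

   Only the final frame matters here: =peek= on a vector returns its
   last element, so older frames never influence the answer.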

  2589 

  2590    #+caption: Program for summarizing the touch information in a patch 

  2591    #+caption: of skin.

  2592    #+name: touch-summary

  2593    #+begin_listing clojure

  2594    #+begin_src clojure

  2595 (defn contact

  2596   "Determine how much contact a particular worm segment has with

  2597    other objects. Returns a value between 0 and 1, where 1 is full

  2598    contact and 0 is no contact."

  2599   [touch-region [coords contact :as touch]]

  2600   (-> (zipmap coords contact)

  2601       (select-keys touch-region)

  2602       (vals)

  2603       (#(map first %))

  2604       (average)

  2605       (* 10)

  2606       (- 1)

  2607       (Math/abs)))

  2608    #+end_src

  2609    #+end_listing
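
   The arithmetic in =contact= is easier to follow with concrete
   numbers. The sketch below assumes that each touch sensor reports a
   pair whose first element runs from 0.0 (pressed against something)
   to 0.1 (nothing in range), which is consistent with the scaling by
   10 in the listing. The helpers =rect-region*= and =average*=, the
   function =contact*=, and the sample data are hypothetical stand-ins,
   not the definitions used in =CORTEX=.

   #+begin_src clojure
(defn rect-region*
  "All integer [x y] coordinates in the inclusive rectangle."
  [[x0 y0] [x1 y1]]
  (vec (for [x (range x0 (inc x1))
             y (range y0 (inc y1))]
         [x y])))

(defn average* [xs] (/ (reduce + xs) (count xs)))

(defn contact*
  "Average the first element of every sensor in the region, then map
   0.0 -> 1 (full contact) and 0.1 -> 0 (no contact)."
  [touch-region [coords readings]]
  (-> (zipmap coords readings)
      (select-keys touch-region)
      (vals)
      (#(map first %))
      (average*)
      (* 10)
      (- 1)
      (Math/abs)))

;; Hypothetical 2x2 patch: the x=0 column is touching, the x=1 column is not.
(def patch
  [[[0 0] [0 1] [1 0] [1 1]]
   [[0.0 0.1] [0.0 0.1] [0.1 0.1] [0.1 0.1]]])

(contact* (rect-region* [0 0] [0 1]) patch) ; => 1.0
(contact* (rect-region* [1 0] [1 1]) patch) ; => 0.0
(contact* (rect-region* [0 0] [1 1]) patch) ; about 0.5
   #+end_src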

  2610 

  2611 

  2612    #+caption: Program for detecting whether the worm is at rest. This program

  2613    #+caption: uses a summary of the tactile information from the underbelly 

  2614    #+caption: of the worm, and is only true if every segment is touching the 

  2615    #+caption: floor. Note that this function contains no references to 

  2616    #+caption: proprioception at all.

  2617    #+name: resting

  2618 #+begin_listing clojure

  2619    #+begin_src clojure

  2620 (def worm-segment-bottom (rect-region [8 15] [14 22]))

  2621 

  2622 (defn resting?

  2623   "Is the worm resting on the ground?"

  2624   [experiences]

  2625   (every?

  2626    (fn [touch-data]

  2627      (< 0.9 (contact worm-segment-bottom touch-data)))

  2628    (:touch (peek experiences))))

  2629    #+end_src

  2630    #+end_listing

  2631 

  2632    #+caption: Program for detecting whether the worm is curled up into a 

  2633    #+caption: full circle. Here the embodied approach begins to shine, as

  2634    #+caption: I am able to both use a previous action predicate (=curled?=)

  2635    #+caption: as well as the direct tactile experience of the head and tail.

  2636    #+name: grand-circle

  2637 #+begin_listing clojure

  2638    #+begin_src clojure

  2639 (def worm-segment-bottom-tip (rect-region [15 15] [22 22]))

  2640 

  2641 (def worm-segment-top-tip (rect-region [0 15] [7 22]))

  2642 

  2643 (defn grand-circle?

  2644   "Does the worm form a majestic circle (one end touching the other)?"

  2645   [experiences]

  2646   (and (curled? experiences)

  2647        (let [worm-touch (:touch (peek experiences))

  2648              tail-touch (worm-touch 0)

  2649              head-touch (worm-touch 4)]

  2650          (and (< 0.55 (contact worm-segment-bottom-tip tail-touch))

  2651               (< 0.55 (contact worm-segment-top-tip    head-touch))))))

  2652    #+end_src

  2653    #+end_listing

  2654 

  2655 

  2656    #+caption: Program for detecting whether the worm has been wiggling for 

  2657    #+caption: the last few frames. It uses a Fourier analysis of the muscle 

  2658    #+caption: contractions of the worm's tail to determine wiggling. This is 

  2659    #+caption: significant because there is no particular frame that clearly 

  2660    #+caption: indicates that the worm is wiggling --- only when multiple frames 

  2661    #+caption: are analyzed together is the wiggling revealed. Defining 

  2662    #+caption: wiggling this way also gives the worm an opportunity to learn 

  2663    #+caption: and recognize ``frustrated wiggling'', where the worm tries to 

  2664    #+caption: wiggle but can't. Frustrated wiggling is very visually different 

  2665    #+caption: from actual wiggling, but this definition gives it to us for free.

  2666    #+name: wiggling

  2667 #+begin_listing clojure

  2668    #+begin_src clojure

  2669 (defn fft [nums]

  2670   (map

  2671    #(.getReal %)

  2672    (.transform

  2673     (FastFourierTransformer. DftNormalization/STANDARD)

  2674     (double-array nums) TransformType/FORWARD)))

  2675 

  2676 (def indexed (partial map-indexed vector))

  2677 

  2678 (defn max-indexed [s]

  2679   (first (sort-by (comp - second) (indexed s))))

  2680 

  2681 (defn wiggling?

  2682   "Is the worm wiggling?"

  2683   [experiences]

  2684   (let [analysis-interval 0x40]

  2685     (when (> (count experiences) analysis-interval)

  2686       (let [a-flex 3

  2687             a-ex   2

  2688             muscle-activity

  2689             (map :muscle (vector:last-n experiences analysis-interval))

  2690             base-activity

  2691             (map #(- (% a-flex) (% a-ex)) muscle-activity)]

  2692         (= 2

  2693            (first

  2694             (max-indexed

  2695              (map #(Math/abs %)

  2696                   (take 20 (fft base-activity))))))))))

  2697    #+end_src

  2698    #+end_listing
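
   To make the frequency test concrete without the external transform
   library, here is a minimal sketch that uses a naive discrete
   Fourier transform written in plain Clojure. The names =dft-real=,
   =dominant-bin=, and =two-cycles= are hypothetical. Like the listing
   above, it keeps only the real part and inspects the first 20 bins;
   a peak at index 2 means two full cycles inside the 64-frame window.

   #+begin_src clojure
(defn dft-real
  "Real part of a naive O(n^2) discrete Fourier transform."
  [xs]
  (let [n (count xs)]
    (for [k (range n)]
      (reduce +
              (map-indexed
               (fn [i x] (* x (Math/cos (/ (* 2 Math/PI k i) n))))
               xs)))))

(defn dominant-bin
  "Index of the largest-magnitude component among the first 20 bins."
  [xs]
  (->> (dft-real xs)
       (take 20)
       (map #(Math/abs (double %)))
       (map-indexed vector)
       (apply max-key second)
       (first)))

;; Hypothetical signal: 64 samples containing exactly two full cycles.
(def two-cycles
  (map #(Math/cos (/ (* 2 Math/PI 2 %) 64)) (range 64)))

(dominant-bin two-cycles) ; => 2
   #+end_src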

  2699 

  2700    With these action predicates, I can now recognize the actions of

  2701    the worm while it is moving under my control and I have access to

  2702    all the worm's senses.

  2703 

  2704    #+caption: Use the action predicates defined earlier to report on 

  2705    #+caption: what the worm is doing while in simulation.

  2706    #+name: report-worm-activity

  2707 #+begin_listing clojure

  2708    #+begin_src clojure

  2709 (defn debug-experience

  2710   [experiences text]

  2711   (cond

  2712    (grand-circle? experiences) (.setText text "Grand Circle")

  2713    (curled? experiences)       (.setText text "Curled")

  2714    (wiggling? experiences)     (.setText text "Wiggling")

  2715    (resting? experiences)      (.setText text "Resting")))

  2716    #+end_src

  2717    #+end_listing

  2718 

  2719    #+caption: Using =debug-experience=, the body-centered predicates

  2720    #+caption: work together to classify the behavior of the worm. 

  2721    #+caption: The predicates are operating with access to the worm's

  2722    #+caption: full sensory data.

  2723    #+name: worm-identify-init

  2724    #+ATTR_LaTeX: :width 10cm

  2725    [[./images/worm-identify-init.png]]

  2726 

  2727    These action predicates satisfy the recognition requirement of an

  2728    empathic recognition system. There is power in the simplicity of

  2729    the action predicates. They describe their actions without getting

  2730    confused in visual details of the worm. Each one is frame

  2731    independent, but more than that, they are each independent of

  2732    irrelevant visual details of the worm and the environment. They

  2733    will work regardless of whether the worm is a different color or

  2734    heavily textured, or if the environment has strange lighting.

  2735 

  2736    The trick now is to make the action predicates work even when the

  2737    sensory data on which they depend is absent. If I can do that, then

  2738    I will have gained much.

  2739 

  2740 ** \Phi-space describes the worm's experiences

  2741    

  2742    As a first step towards building empathy, I need to gather all of

  2743    the worm's experiences during free play. I use a simple vector to

  2744    store all the experiences. 

  2745 

  2746    Each element of the experience vector exists in the vast space of

  2747    all possible worm-experiences. Most of this vast space is actually

  2748    unreachable due to physical constraints of the worm's body. For

  2749    example, the worm's segments are connected by hinge joints that put

  2750    a practical limit on the worm's range of motions without limiting

  2751    its degrees of freedom. Some groupings of senses are impossible;

  2752    the worm cannot be bent into a circle so that its ends are

  2753    touching and at the same time not also experience the sensation of

  2754    touching itself.

  2755 

  2756    As the worm moves around during free play and its experience vector

  2757    grows larger, the vector begins to define a subspace which is all

  2758    the sensations the worm can practically experience during normal

  2759    operation. I call this subspace \Phi-space, short for

  2760    physical-space. The experience vector defines a path through

  2761    \Phi-space. This path has interesting properties that all derive

  2762    from physical embodiment. The proprioceptive components are

  2763    completely smooth, because in order for the worm to move from one

  2764    position to another, it must pass through the intermediate

  2765    positions. The path invariably forms loops as actions are repeated.

  2766    Finally and most importantly, proprioception actually gives very

  2767    strong inference about the other senses. For example, when the worm

  2768    is flat, you can infer that it is touching the ground and that its

  2769    muscles are not active, because if the muscles were active, the

  2770    worm would be moving and would not be perfectly flat. In order to

  2771    stay flat, the worm has to be touching the ground, or it would

  2772    again be moving out of the flat position due to gravity. If the

  2773    worm is positioned in such a way that it interacts with itself,

  2774    then it is very likely to be feeling the same tactile feelings as

  2775    the last time it was in that position, because it has the same body

  2776    as then. If you observe multiple frames of proprioceptive data,

  2777    then you can become increasingly confident about the exact

  2778    activations of the worm's muscles, because it generally takes a

  2779    unique combination of muscle contractions to transform the worm's

  2780    body along a specific path through \Phi-space.

  2781 

  2782    There is a simple way of taking \Phi-space and the total ordering

  2783    provided by an experience vector and reliably inferring the rest of

  2784    the senses.

  2785 

  2786 ** Empathy is the process of tracing through \Phi-space 

  2787 

  2788    Here is the core of a basic empathy algorithm, starting with an

  2789    experience vector:

  2790 

  2791    First, group the experiences into tiered proprioceptive bins. I use

  2792    three tiers of bins whose sizes are powers of 10, and the smallest bin

  2793    has an approximate size of 0.001 radians in all proprioceptive dimensions.

  2794    

  2795    Then, given a sequence of proprioceptive input, generate a set of

  2796    matching experience records for each input, using the tiered

  2797    proprioceptive bins. 

  2798 

  2799    Finally, to infer sensory data, select the longest consecutive chain

  2800    of experiences. Consecutive experience means that the experiences

  2801    appear next to each other in the experience vector.

  2802 

  2803    This algorithm has three advantages: 

  2804 

  2805    1. It's simple

  2806 

  2807    2. It's very fast -- retrieving possible interpretations takes

  2808       constant time. Tracing through chains of interpretations takes

  2809       time proportional to the average number of experiences in a

  2810       proprioceptive bin. Redundant experiences in \Phi-space can be

  2811       merged to save computation.

  2812 

  2813    3. It protects against wrong interpretations of transient ambiguous

  2814       proprioceptive data. For example, if the worm is flat for just

  2815       an instant, this flatness will not be interpreted as implying

  2816       that the worm has its muscles relaxed, since the flatness is

  2817       part of a longer chain which includes a distinct pattern of

  2818       muscle activation. Markov chains or other memoryless statistical

  2819       models that operate on individual frames may very well make this

  2820       mistake.

  2821 

  2822    #+caption: Program to convert an experience vector into a 

  2823    #+caption: proprioceptively binned lookup function.

  2824    #+name: bin

  2825 #+begin_listing clojure

  2826    #+begin_src clojure

  2827 (defn bin [digits]

  2828   (fn [angles]

  2829     (->> angles

  2830          (flatten)

  2831          (map (juxt #(Math/sin %) #(Math/cos %)))

  2832          (flatten)

  2833          (mapv #(Math/round (* % (Math/pow 10 (dec digits))))))))

  2834 

  2835 (defn gen-phi-scan 

  2836   "Nearest-neighbors with binning. Only returns a result if

  2837    the proprioceptive data is within 10% of a previously recorded

  2838    result in all dimensions."

  2839   [phi-space]

  2840   (let [bin-keys (map bin [3 2 1])

  2841         bin-maps

  2842         (map (fn [bin-key]

  2843                (group-by

  2844                 (comp bin-key :proprioception phi-space)

  2845                 (range (count phi-space)))) bin-keys)

  2846         lookups (map (fn [bin-key bin-map]

  2847                        (fn [proprio] (bin-map (bin-key proprio))))

  2848                      bin-keys bin-maps)]

  2849     (fn lookup [proprio-data]

  2850       (set (some #(% proprio-data) lookups)))))

  2851    #+end_src

  2852    #+end_listing
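
   The tiered lookup can be tried on toy data. The following is a
   minimal sketch under stated assumptions: =bin-key=, =build-tiers=,
   and =lookup= are hypothetical names, and the poses are invented.
   It rounds the sine and cosine of each angle the same way as =bin=
   above and falls back from the finest tier to the coarsest,
   reporting which tier produced the match.

   #+begin_src clojure
(defn bin-key
  "Round sin and cos of every angle to (dec digits) decimal places."
  [digits angles]
  (let [scale (Math/pow 10 (dec digits))]
    (->> (flatten angles)
         (mapcat (juxt #(Math/sin %) #(Math/cos %)))
         (mapv #(Math/round (* % scale))))))

(defn build-tiers
  "One index map per resolution, finest first."
  [poses]
  (for [digits [3 2 1]]
    [digits (group-by #(bin-key digits (poses %))
                      (range (count poses)))]))

(defn lookup
  "Indices of stored poses in the finest tier that matches, or nil."
  [tiers pose]
  (some (fn [[digits index-map]]
          (when-let [hits (index-map (bin-key digits pose))]
            {:digits digits :indices (set hits)}))
        tiers))

;; Hypothetical stored poses: one joint each, bend of 0.5 and 1.2 radians.
(def tiers (build-tiers [[[0.0 0.0 0.5]] [[0.0 0.0 1.2]]]))

(lookup tiers [[0.0 0.0 0.501]]) ; => {:digits 3, :indices #{0}}
(lookup tiers [[0.0 0.0 0.55]])  ; => {:digits 2, :indices #{0}}
(lookup tiers [[0.0 0.0 2.5]])   ; => nil
   #+end_src

   A nearby pose matches in the finest tier, a slightly different pose
   matches only after falling back to a coarser tier, and a pose unlike
   anything stored matches nothing.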

  2853 

  2854    #+caption: =longest-thread= finds the longest path of consecutive 

  2855    #+caption: experiences to explain proprioceptive worm data from 

  2856    #+caption: previous data. Here, the film strip represents the  

  2857    #+caption: creature's previous experience. Short sequences of

  2858    #+caption: memories are spliced together to match the

  2859    #+caption: proprioceptive data. They carry the other senses 

  2860    #+caption: along with them.

  2861    #+name: phi-space-history-scan

  2862    #+ATTR_LaTeX: :width 10cm

  2863    [[./images/film-of-imagination.png]]

  2864 

  2865    =longest-thread= infers sensory data by stitching together pieces

  2866    from previous experience. It prefers longer chains of previous

  2867    experience to shorter ones. For example, during training the worm

  2868    might rest on the ground for one second before it performs its

  2869    exercises. If during recognition the worm rests on the ground for

  2870    five seconds, =longest-thread= will accommodate this five second

  2871    rest period by looping the one second rest chain five times.

  2872 

  2873    =longest-thread= takes time proportional to the average number of

  2874    entries in a proprioceptive bin, because for each element in the

  2875    starting bin it performs a series of set lookups in the preceding

  2876    bins. If the total history is limited, then this is only a constant

  2877    multiple times the number of entries in the starting bin. This

  2878    analysis also applies even if the action requires multiple longest

  2879    chains -- it's still the average number of entries in a

  2880    proprioceptive bin times the desired chain length. Because

  2881    =longest-thread= is so efficient and simple, I can interpret

  2882    worm-actions in real time.

  2883 

  2884    #+caption: Program to calculate empathy by tracing through \Phi-space

  2885    #+caption: and finding the longest (i.e. most coherent) interpretation

  2886    #+caption: of the data.

  2887    #+name: longest-thread

  2888 #+begin_listing clojure

  2889    #+begin_src clojure

  2890 (defn longest-thread

  2891   "Find the longest thread from phi-index-sets. The index sets should

  2892    be ordered from most recent to least recent."

  2893   [phi-index-sets]

  2894   (loop [result '()

  2895          [thread-bases & remaining :as phi-index-sets] phi-index-sets]

  2896     (if (empty? phi-index-sets)

  2897       (vec result)

  2898       (let [threads

  2899             (for [thread-base thread-bases]

  2900               (loop [thread (list thread-base)

  2901                      remaining remaining]

  2902                 (let [next-index (dec (first thread))]

  2903                   (cond (empty? remaining) thread

  2904                         (contains? (first remaining) next-index)

  2905                         (recur

  2906                          (cons next-index thread) (rest remaining))

  2907                         :else thread))))

  2908             longest-thread

  2909             (reduce (fn [thread-a thread-b]

  2910                       (if (> (count thread-a) (count thread-b))

  2911                         thread-a thread-b))

  2912                     '(nil)

  2913                     threads)]

  2914         (recur (concat longest-thread result)

  2915                (drop (count longest-thread) phi-index-sets))))))

  2916    #+end_src

  2917    #+end_listing
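
   The inner step of this search, extending one thread backwards
   through consecutive indices, can be illustrated in isolation. This
   is a simplified sketch, not a replacement for the listing:
   =trace-back=, =best-thread=, and the index sets are hypothetical,
   and it finds only the single longest thread ending at the most
   recent frame instead of stitching several threads together.

   #+begin_src clojure
(defn trace-back
  "Extend a thread backwards from start while each older index set
   contains the immediately preceding index."
  [start older-sets]
  (loop [thread (list start)
         sets   older-sets]
    (let [prev (dec (first thread))]
      (if (and (seq sets) (contains? (first sets) prev))
        (recur (cons prev thread) (rest sets))
        thread))))

(defn best-thread
  "Longest thread ending in one of the most recent candidates.
   Assumes the newest candidate set is not empty."
  [[newest & older]]
  (apply max-key count (map #(trace-back % older) newest)))

;; Hypothetical candidate sets, most recent first.
(def candidate-sets [#{5 9} #{4 8} #{3} #{7}])

(trace-back 9 (rest candidate-sets)) ; => (8 9)
(best-thread candidate-sets)         ; => (3 4 5)
   #+end_src

   Starting from index 9 the chain breaks after two frames, while
   starting from index 5 it runs for three, so the second
   interpretation wins.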

  2918 

  2919    There is one final piece, which is to replace missing sensory data

  2920    with a best-guess estimate. While I could fill in missing data by

  2921    using a gradient over the closest known sensory data points,

  2922    averages can be misleading. It is certainly possible to create an

  2923    impossible sensory state by averaging two possible sensory states.

  2924    Therefore, I simply replicate the most recent sensory experience to

  2925    fill in the gaps.

  2926 

  2927    #+caption: Fill in blanks in sensory experience by replicating the most 

  2928    #+caption: recent experience.

  2929    #+name: infer-nils

  2930 #+begin_listing clojure

  2931    #+begin_src clojure

  2932 (defn infer-nils

  2933   "Replace nils with the next available non-nil element in the

  2934    sequence, or barring that, 0."

  2935   [s]

  2936   (loop [i (dec (count s))

  2937          v (transient s)]

  2938     (if (zero? i) (persistent! v)

  2939         (if-let [cur (v i)]

  2940           (if (get v (dec i) 0)

  2941             (recur (dec i) v)

  2942             (recur (dec i) (assoc! v (dec i) cur)))

  2943           (recur i (assoc! v i 0))))))

  2944    #+end_src

  2945    #+end_listing
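
   The same filling rule can be written without transients, which may
   make the behavior easier to check by hand. This is an alternative
   sketch, not the code used above; =fill-nils-backward= is a
   hypothetical name. Each =nil= takes the value of the next non-nil
   element, and trailing =nil= values become 0.

   #+begin_src clojure
(defn fill-nils-backward
  "Replace each nil with the next non-nil element, or 0 if none follows."
  [s]
  (->> (reverse s)
       (reductions (fn [nxt x] (if (nil? x) nxt x)) 0)
       (rest)      ; drop the initial 0 seed
       (reverse)
       (vec)))

(fill-nils-backward [nil 3 nil nil 7 nil]) ; => [3 3 7 7 7 0]
   #+end_src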

  2946   

  2947 ** =EMPATH= recognizes actions efficiently

  2948    

  2949    To use =EMPATH= with the worm, I first need to gather a set of

  2950    experiences from the worm that includes the actions I want to

  2951    recognize. The =generate-phi-space= program (listing

  2952    \ref{generate-phi-space}) runs the worm through a series of

  2953    exercises and gathers those experiences into a vector. The

  2954    =do-all-the-things= program is a routine expressed in a simple

  2955    muscle contraction script language for automated worm control. It

  2956    causes the worm to rest, curl, and wiggle over about 700 frames

  2957    (approx. 11 seconds).

  2958 

  2959    #+caption: Program to gather the worm's experiences into a vector for 

  2960    #+caption: further processing. The =motor-control-program= line uses

  2961    #+caption: a motor control script that causes the worm to execute a series

  2962    #+caption: of ``exercises'' that include all the action predicates.

  2963    #+name: generate-phi-space

  2964 #+begin_listing clojure 

  2965    #+begin_src clojure

  2966 (def do-all-the-things 

  2967   (concat

  2968    curl-script

  2969    [[300 :d-ex 40]

  2970     [320 :d-ex 0]]

  2971    (shift-script 280 (take 16 wiggle-script))))

  2972 

  2973 (defn generate-phi-space []

  2974   (let [experiences (atom [])]

  2975     (run-world

  2976      (apply-map 

  2977       worm-world

  2978       (merge

  2979        (worm-world-defaults)

  2980        {:end-frame 700

  2981         :motor-control

  2982         (motor-control-program worm-muscle-labels do-all-the-things)

  2983         :experiences experiences})))

  2984     @experiences))

  2985    #+end_src

  2986    #+end_listing
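
   The script entries above look like triples of frame number, muscle
   label, and activation strength. Under that assumption, a helper
   such as =shift-script= only needs to delay each entry by a fixed
   number of frames. The sketch below is a guess at that behavior;
   =shift-script*= is a hypothetical name and is not taken from the
   =CORTEX= source.

   #+begin_src clojure
(defn shift-script*
  "Delay every entry of a motor script by offset frames.
   Assumes entries have the form [frame muscle-label strength]."
  [offset script]
  (map (fn [[frame muscle strength]]
         [(+ frame offset) muscle strength])
       script))

(shift-script* 280 [[0 :d-ex 40] [20 :d-ex 0]])
;; => ([280 :d-ex 40] [300 :d-ex 0])
   #+end_src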

  2987 

  2988    #+caption: Use longest thread and a phi-space generated from a short

  2989    #+caption: exercise routine to interpret actions during free play.

  2990    #+name: empathy-debug

  2991 #+begin_listing clojure

  2992    #+begin_src clojure

  2993 (defn init []

  2994   (def phi-space (generate-phi-space))

  2995   (def phi-scan (gen-phi-scan phi-space)))

  2996 

  2997 (defn empathy-demonstration []

  2998   (let [proprio (atom ())]

  2999     (fn

  3000       [experiences text]

  3001       (let [phi-indices (phi-scan (:proprioception (peek experiences)))]

  3002         (swap! proprio (partial cons phi-indices))

  3003         (let [exp-thread (longest-thread (take 300 @proprio))

  3004               empathy (mapv phi-space (infer-nils exp-thread))]

  3005           (println-repl (vector:last-n exp-thread 22))

  3006           (cond

  3007            (grand-circle? empathy) (.setText text "Grand Circle")

  3008            (curled? empathy)       (.setText text "Curled")

  3009            (wiggling? empathy)     (.setText text "Wiggling")

  3010            (resting? empathy)      (.setText text "Resting")

  3011            :else                       (.setText text "Unknown")))))))

  3012 

  3013 (defn empathy-experiment [record]

  3014   (.start (worm-world :experience-watch (empathy-demonstration)

  3015                       :record record :worm worm*)))

  3016    #+end_src

  3017    #+end_listing

  3018    

  3019    The result of running =empathy-experiment= is that the system is

  3020    generally able to interpret worm actions using the action-predicates

  3021    on simulated sensory data just as well as with actual data. Figure

  3022    \ref{empathy-debug-image} was generated using =empathy-experiment=:

  3023 

  3024   #+caption: From only proprioceptive data, =EMPATH= was able to infer 

  3025   #+caption: the complete sensory experience and classify four poses

  3026   #+caption: (The last panel shows a composite image of /wiggling/, 

  3027   #+caption: a dynamic pose.)

  3028   #+name: empathy-debug-image

  3029   #+ATTR_LaTeX: :width 10cm :placement [H]

  3030   [[./images/empathy-1.png]]

  3031 

  3032   One way to measure the performance of =EMPATH= is to check how often

  3033   the imagined sensory experience triggers the same action predicates

  3034   as the real sensory experience. 

  3035   

  3036    #+caption: Determine how closely empathy approximates actual 

  3037    #+caption: sensory data.

  3038    #+name: test-empathy-accuracy

  3039 #+begin_listing clojure

  3040    #+begin_src clojure

  3041 (def worm-action-label

  3042   (juxt grand-circle? curled? wiggling?))

  3043 

  3044 (defn compare-empathy-with-baseline [matches]

  3045   (let [proprio (atom ())]

  3046     (fn

  3047       [experiences text]

  3048       (let [phi-indices (phi-scan (:proprioception (peek experiences)))]

  3049         (swap! proprio (partial cons phi-indices))

  3050         (let [exp-thread (longest-thread (take 300 @proprio))

  3051               empathy (mapv phi-space (infer-nils exp-thread))

  3052               experience-matches-empathy

  3053               (= (worm-action-label experiences)

  3054                  (worm-action-label empathy))]

  3055           (println-repl experience-matches-empathy)

  3056           (swap! matches #(conj % experience-matches-empathy)))))))

  3057               

  3058 (defn accuracy [v]

  3059   (float (/ (count (filter true? v)) (count v))))

  3060 

  3061 (defn test-empathy-accuracy []

  3062   (let [res (atom [])]

  3063     (run-world

  3064      (worm-world :experience-watch

  3065                  (compare-empathy-with-baseline res)

  3066                  :worm worm*))

  3067     (accuracy @res)))

  3068    #+end_src

  3069    #+end_listing
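
   The bookkeeping behind this measurement is small enough to work
   through by hand. In the sketch below, =match-rate= mirrors
   =accuracy=, and the label vectors are invented for illustration;
   each label is the kind of boolean triple that =worm-action-label=
   would produce for one frame.

   #+begin_src clojure
(defn match-rate
  "Fraction of frames on which real and imagined labels agreed."
  [matches]
  (float (/ (count (filter true? matches)) (count matches))))

;; Hypothetical per-frame labels: [grand-circle? curled? wiggling?]
(def real-labels
  [[false true false] [false true false] [false false true] [false false false]])
(def imagined-labels
  [[false true false] [false false false] [false false true] [false false false]])

(def frame-matches (mapv = real-labels imagined-labels))
;; => [true false true true]

(match-rate frame-matches) ; => 0.75
   #+end_src

   A frame counts as a match only when all three predicates agree, so
   a single disagreement on =curled?= in the second frame costs that
   whole frame.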

  3070 

  3071   Running =test-empathy-accuracy= using the very short exercise

  3072   program defined in listing \ref{generate-phi-space}, and then doing

  3073   a similar pattern of activity manually yields an accuracy of around

  3074   73%. This is based on very limited worm experience. Training the

  3075   worm for longer dramatically improves the accuracy.

  3076 

  3077    #+caption: Program to generate \Phi-space using manual training.

  3078    #+name: manual-phi-space

  3079    #+begin_listing clojure

  3080    #+begin_src clojure

  3081 (defn init-interactive []

  3082   (def phi-space

  3083     (let [experiences (atom [])]

  3084       (run-world

  3085        (apply-map 

  3086         worm-world

  3087         (merge

  3088          (worm-world-defaults)

  3089          {:experiences experiences})))

  3090       @experiences))

  3091   (def phi-scan (gen-phi-scan phi-space)))

  3092    #+end_src

  3093    #+end_listing

  3094 

  3095   After about 1 minute of manual training, I was able to achieve 95%

  3096   accuracy on manual testing of the worm using =init-interactive= and

  3097   =test-empathy-accuracy=. The majority of errors are near the

  3098   boundaries of transitioning from one type of action to another.

  3099   During these transitions the exact label for the action is more open

  3100   to interpretation, and disagreement between empathy and experience

  3101   is more excusable.

  3102 

  3103 ** Digression: Learn touch sensor layout through free play

  3104 

  3105    In the previous section I showed how to compute actions in terms of

  3106    body-centered predicates which relied on the average touch

  3107    activation of pre-defined regions of the worm's skin. What if,

  3108    instead of receiving touch pre-grouped into the six faces of each

  3109    worm segment, the true topology of the worm's skin were unknown?

  3110    This is more similar to how a nerve fiber bundle might be

  3111    arranged. While two fibers that are close in a nerve bundle /might/

  3112    correspond to two touch sensors that are close together on the

  3113    skin, the process of taking a complicated surface and forcing it

  3114    into essentially a circle requires some cuts and rearrangements.

  3115    

  3116    In this section I show how to automatically learn the skin-topology of

  3117    a worm segment by free exploration. As the worm rolls around on the

  3118    floor, large sections of its surface get activated. If the worm has

  3119    stopped moving, then whatever region of skin is touching the

  3120    floor is probably an important region, and should be recorded.

  3121    

  3122    #+caption: Program to detect whether the worm is in a resting state 

  3123    #+caption: with one face touching the floor.

  3124    #+name: pure-touch

  3125    #+begin_listing clojure

  3126    #+begin_src clojure

  3127 (def full-contact [(float 0.0) (float 0.1)])

  3128 

  3129 (defn pure-touch?

  3130   "This is worm specific code to determine if a large region of touch

  3131    sensors is either all on or all off."

  3132   [[coords touch :as touch-data]]

  3133   (= (set (map first touch)) (set full-contact)))

  3134    #+end_src

  3135    #+end_listing

  3136 
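   As a rough illustration, the sketch below runs =pure-touch?= on two
   hand-made readings. I assume here that a reading is a pair of
   =[coords touch]= and that each touch entry is a pair of floats, as
   described for =touch!= in the user guide. The names =resting= and
   =in-motion= and all of the values are invented for this example.

   #+begin_src clojure
(def full-contact [(float 0.0) (float 0.1)])

(defn pure-touch?
  [[coords touch :as touch-data]]
  (= (set (map first touch)) (set full-contact)))

;; Synthetic sensor coordinates (not simulator output).
(def coords [[0 0] [0 1] [1 0]])

;; Every sensor reports one of the two extreme values, and both occur.
(def resting
  [coords [[(float 0.0) (float 0.1)]
           [(float 0.1) (float 0.1)]
           [(float 0.0) (float 0.1)]]])

;; One sensor reports an intermediate value.
(def in-motion
  [coords [[(float 0.0)  (float 0.1)]
           [(float 0.05) (float 0.1)]
           [(float 0.1)  (float 0.1)]]])

(pure-touch? resting)   ; accepted
(pure-touch? in-motion) ; rejected because of the 0.05 reading
   #+end_src

   Note that the values are built with =float= throughout, since a
   float and a double that print alike are not necessarily equal.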

  3137    After collecting these important regions, there will be many nearly

  3138    identical touch regions. While for some purposes the subtle

  3139    differences between these regions will be important, for my

  3140    purposes I collapse them into mostly non-overlapping sets using

  3141    =remove-similar= in listing \ref{remove-similar}.

  3142 

  3143    #+caption: Program to take a list of sets of points and ``collapse them''

  3144    #+caption: so that the remaining sets in the list are significantly 

  3145    #+caption: different from each other. Prefer smaller sets to larger ones.

  3146    #+name: remove-similar

  3147    #+begin_listing clojure

  3148    #+begin_src clojure

  3149 (defn remove-similar

  3150   [coll]

  3151   (loop [result () coll (sort-by (comp - count) coll)]

  3152     (if (empty? coll) result

  3153         (let  [[x & xs] coll

  3154                c (count x)]

  3155           (if (some

  3156                (fn [other-set]

  3157                  (let [oc (count other-set)]

  3158                    (< (- (count (union other-set x)) c) (* oc 0.1))))

  3159                xs)

  3160             (recur result xs)

  3161             (recur (cons x result) xs))))))

  3162    #+end_src

  3163    #+end_listing

  3164 
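   The following sketch shows =remove-similar= on three small,
   hypothetical regions, with integers standing in for sensor
   coordinates. The =require= line reflects my assumption that =union=
   is the one from =clojure.set=.

   #+begin_src clojure
(require '[clojure.set :refer [union]])

(defn remove-similar
  [coll]
  (loop [result () coll (sort-by (comp - count) coll)]
    (if (empty? coll) result
        (let  [[x & xs] coll
               c (count x)]
          (if (some
               (fn [other-set]
                 (let [oc (count other-set)]
                   (< (- (count (union other-set x)) c) (* oc 0.1))))
               xs)
            (recur result xs)
            (recur (cons x result) xs))))))

;; Hypothetical regions (integers replace sensor coordinates).
(def regions
  [(set (range 0 11))   ; 11 points
   (set (range 0 10))   ; 10 points, all contained in the set above
   #{20 21}])           ; an unrelated region

;; The 11-point set is dropped in favor of the smaller set it covers.
(remove-similar regions)
   #+end_src

   The larger set is discarded because the 10-point set adds fewer
   than 10% of its own points to it, which is the sense in which the
   function prefers smaller sets.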

  3165    Actually running this simulation is easy given =CORTEX='s facilities.

  3166 

  3167    #+caption: Collect experiences while the worm moves around. Filter the touch 

  3168    #+caption: sensations by stable ones, collapse similar ones together, 

  3169    #+caption: and report the regions learned.

  3170    #+name: learn-touch

  3171    #+begin_listing clojure

  3172    #+begin_src clojure

  3173 (defn learn-touch-regions []

  3174   (let [experiences (atom [])

  3175         world (apply-map

  3176                worm-world

  3177                (assoc (worm-segment-defaults)

  3178                  :experiences experiences))]

  3179     (run-world world)

  3180     (->>

  3181      @experiences

  3182      (drop 175)

  3183      ;; access the single segment's touch data

  3184      (map (comp first :touch))

  3185      ;; only deal with "pure" touch data to determine surfaces

  3186      (filter pure-touch?)

  3187      ;; associate coordinates with touch values

  3188      (map (partial apply zipmap))

  3189      ;; select those regions where contact is being made

  3190      (map (partial group-by second))

  3191      (map #(get % full-contact))

  3192      (map (partial map first))

  3193      ;; remove redundant/subset regions

  3194      (map set)

  3195      remove-similar)))

  3196 

  3197 (defn learn-and-view-touch-regions []

  3198   (map view-touch-region

  3199        (learn-touch-regions)))

  3200    #+end_src

  3201    #+end_listing

  3202 
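   The middle steps of =learn-touch-regions= can be tried without a
   running simulation. The sketch below applies them to a single
   hand-made reading. The helper name =contact-region= and the data
   are hypothetical, and I again assume a reading has the shape
   =[coords touch]=.

   #+begin_src clojure
(def full-contact [(float 0.0) (float 0.1)])

;; One synthetic reading: three coordinates and their touch values.
(def touch-data
  [[[0 0] [0 1] [1 0]]
   [[(float 0.0) (float 0.1)]
    [(float 0.1) (float 0.1)]
    [(float 0.0) (float 0.1)]]])

(defn contact-region
  "Set of coordinates whose touch value equals full-contact."
  [touch-data]
  (->> (apply zipmap touch-data) ; map from coordinate to touch value
       (group-by second)         ; group the entries by touch value
       (#(get % full-contact))   ; keep the full-contact group
       (map first)               ; keep only the coordinates
       set))

(contact-region touch-data)
   #+end_src

   In the full program each such set is one candidate region, and
   =remove-similar= then merges the near-duplicates.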

  3203    The only thing remaining to define is the particular motion the worm

  3204    must take. I accomplish this with a simple motor control program.

  3205 

  3206    #+caption: Motor control program for making the worm roll on the ground.

  3207    #+caption: This could also be replaced with random motion.

  3208    #+name: worm-roll

  3209    #+begin_listing clojure

  3210    #+begin_src clojure

  3211 (defn touch-kinesthetics []

  3212   [[170 :lift-1 40]

  3213    [190 :lift-1 19]

  3214    [206 :lift-1  0]

  3215 

  3216    [400 :lift-2 40]

  3217    [410 :lift-2  0]

  3218 

  3219    [570 :lift-2 40]

  3220    [590 :lift-2 21]

  3221    [606 :lift-2  0]

  3222 

  3223    [800 :lift-1 30]

  3224    [809 :lift-1 0]

  3225 

  3226    [900 :roll-2 40]

  3227    [905 :roll-2 20]

  3228    [910 :roll-2  0]

  3229 

  3230    [1000 :roll-2 40]

  3231    [1005 :roll-2 20]

  3232    [1010 :roll-2  0]

  3233    

  3234    [1100 :roll-2 40]

  3235    [1105 :roll-2 20]

  3236    [1110 :roll-2  0]

  3237    ])

  3238    #+end_src

  3239    #+end_listing

  3240 
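   For readers unfamiliar with the script format, here is a small
   sketch of how such a script can be queried. I assume each triple is
   =[frame muscle-label activation]=. The helper =commands-at-frame=
   is hypothetical and is not part of =CORTEX=.

   #+begin_src clojure
;; A short excerpt of the script above.
(def script
  [[170 :lift-1 40]
   [190 :lift-1 19]
   [206 :lift-1  0]
   [400 :lift-2 40]])

(defn commands-at-frame
  "Hypothetical helper: map from muscle label to activation for the
   entries scheduled exactly at the given frame."
  [script frame]
  (into {}
        (for [[t muscle activation] script
              :when (= t frame)]
          [muscle activation])))

(commands-at-frame script 190) ; one command for :lift-1
(commands-at-frame script 191) ; nothing scheduled at this frame
   #+end_src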

  3241 

  3242    #+caption: The small worm rolls around on the floor, driven

  3243    #+caption: by the motor control program in listing \ref{worm-roll}.

  3244    #+name: worm-roll-figure

  3245    #+ATTR_LaTeX: :width 12cm

  3246    [[./images/worm-roll.png]]

  3247 

  3248 

  3249    #+caption: After completing its adventures, the worm now knows 

  3250    #+caption: how its touch sensors are arranged along its skin. These 

  3251    #+caption: are the regions that were deemed important by 

  3252    #+caption: =learn-touch-regions=. Note that the worm has discovered

  3253    #+caption: that it has six sides.

  3254    #+name: worm-touch-map

  3255    #+ATTR_LaTeX: :width 12cm

  3256    [[./images/touch-learn.png]]

  3257 

  3258    While simple, =learn-touch-regions= exploits regularities in both

  3259    the worm's physiology and the worm's environment to correctly

  3260    deduce that the worm has six sides. Note that =learn-touch-regions=

  3261    would work just as well even if the worm's touch sense data were

  3262    completely scrambled. The cross shape is just for convenience. This

  3263    example justifies the use of pre-defined touch regions in =EMPATH=.

  3264 

  3265 * Contributions

  3266   

  3267   In this thesis you have seen the =CORTEX= system, a complete

  3268   environment for creating simulated creatures. You have seen how to

  3269   implement five senses: touch, proprioception, hearing, vision, and

  3270   muscle tension. You have seen how to create new creatures using

  3271   Blender, a 3D modeling tool. I hope that =CORTEX= will be useful in

  3272   further research projects. To this end I have included the full

  3273   source to =CORTEX= along with a large suite of tests and examples. I

  3274   have also created a user guide for =CORTEX= which is included in an

  3275   appendix to this thesis.

  3276 

  3277   You have also seen how I used =CORTEX= as a platform to attack the

  3278   /action recognition/ problem, which is the problem of recognizing

  3279   actions in video. You saw a simple system called =EMPATH= which

  3280   identifies actions by first describing actions in a body-centered,

  3281   rich sense language, then inferring a full range of sensory

  3282   experience from limited data using previous experience gained from

  3283   free play.

  3284 

  3285   As a minor digression, you also saw how I used =CORTEX= to enable a

  3286   tiny worm to discover the topology of its skin simply by rolling on

  3287   the ground. 

  3288 

  3289   In conclusion, the main contributions of this thesis are:

  3290 

  3291    - =CORTEX=, a comprehensive platform for embodied AI experiments.

  3292      =CORTEX= supports many features lacking in other systems, such as

  3293      proper simulation of hearing. It is easy to create new =CORTEX=

  3294      creatures using Blender, a free 3D modeling program.

  3295 

  3296    - =EMPATH=, which uses =CORTEX= to identify the actions of a

  3297      worm-like creature using a computational model of empathy.

  3298 

  3299 #+BEGIN_LaTeX

  3300 \appendix

  3301 #+END_LaTeX

  3302 

  3303 * Appendix: =CORTEX= User Guide

  3304 

  3305   Those who write a thesis should endeavor to make their code not only

  3306   accessible, but actually usable, as a way to pay back the community

  3307   that made the thesis possible in the first place. This thesis would

  3308   not be possible without Free Software such as jMonkeyEngine3,

  3309   Blender, Clojure, Emacs, FFmpeg, and many other tools. That is why I

  3310   have included this user guide, in the hope that someone else might

  3311   find =CORTEX= useful.

  3312 

  3313 ** Obtaining =CORTEX= 

  3314 

  3315    You can get =CORTEX= from its Mercurial repository at

  3316    http://hg.bortreb.com/cortex. You may also download =CORTEX=

  3317    releases at http://aurellem.org/cortex/releases/. As a condition of

  3318    making this thesis, I have also provided Professor Winston the

  3319    =CORTEX= source, and he knows how to run the demos and get started.

  3320    You may also email me at =cortex@aurellem.org= and I may help where

  3321    I can.

  3322 

  3323 ** Running =CORTEX= 

  3324    

  3325    =CORTEX= comes with README and INSTALL files that will guide you

  3326    through installation and running the test suite. In particular you

  3327    should look at =cortex.test=, which contains test suites that

  3328    run through all senses and multiple creatures.

  3329 

  3330 ** Creating creatures

  3331 

  3332    Creatures are created using /Blender/, a free 3D modeling program.

  3333    You will need Blender version 2.6 when using the =CORTEX= included

  3334    in this thesis. You create a =CORTEX= creature in a similar manner

  3335    to modeling anything in Blender, except that you also create

  3336    several trees of empty nodes which define the creature's senses.

  3337 

  3338 *** Mass 

  3339     

  3340     To give an object mass in =CORTEX=, add a ``mass'' metadata label

  3341     to the object with the mass in jMonkeyEngine units. Note that

  3342     setting the mass to 0 causes the object to be immovable.

  3343 

  3344 *** Joints

  3345 

  3346     Joints are created by creating an empty node named =joints= and

  3347     then creating any number of empty child nodes to represent your

  3348     creature's joints. The joint will automatically connect the

  3349     closest two physical objects. It will help to set the empty node's

  3350     display mode to ``Arrows'' so that you can clearly see the

  3351     direction of the axes.

  3352    

  3353     Joint nodes should have the following metadata under the ``joint''

  3354     label:

  3355 

  3356     #+BEGIN_SRC clojure

  3357 ;; ONE OF the following, under the label "joint":

  3358 {:type :point}

  3359 

  3360 ;; OR

  3361 

  3362 {:type :hinge

  3363  :limit [<limit-low> <limit-high>]

  3364  :axis (Vector3f. <x> <y> <z>)}

  3365 ;;(:axis defaults to (Vector3f. 1 0 0) if not provided for hinge joints)

  3366 

  3367 ;; OR

  3368 

  3369 {:type :cone

  3370  :limit-xz <lim-xz>

  3371  :limit-xy <lim-xy>

  3372  :twist    <lim-twist>}   ;(use XZY rotation mode in blender!)

  3373     #+END_SRC

  3374 

  3375 *** Eyes

  3376 

  3377     Eyes are created by creating an empty node named =eyes= and then

  3378     creating any number of empty child nodes to represent your

  3379     creature's eyes.

  3380 

  3381     Eye nodes should have the following metadata under the ``eye''

  3382     label:

  3383 

  3384 #+BEGIN_SRC clojure

  3385 {:red    <red-retina-definition>

  3386  :blue   <blue-retina-definition>

  3387  :green  <green-retina-definition>

  3388  :all    <all-retina-definition>

  3389  (<0xrrggbb> <custom-retina-image>)...

  3390 }

  3391 #+END_SRC

  3392 

  3393     Any of the color channels may be omitted. You may also include

  3394     your own color selectors, and in fact :red is equivalent to

  3395     0xFF0000 and so forth. The eye will be placed at the same position

  3396     as the empty node and will bind to the nearest physical object.

  3397     The eye will point outward from the X-axis of the node, and ``up''

  3398     will be in the direction of the X-axis of the node. It will help

  3399     to set the empty node's display mode to ``Arrows'' so that you can

  3400     clearly see the direction of the axes.

  3401 

  3402     Each retina file should contain white pixels wherever you want the eye to be

  3403     sensitive to your chosen color. If you want the entire field of

  3404     view, specify :all or 0xFFFFFF and a retinal map that is entirely

  3405     white. 

  3406 

  3407     Here is a sample retinal map:

  3408 

  3409     #+caption: An example retinal profile image. White pixels are 

  3410     #+caption: photo-sensitive elements. The distribution of white 

  3411     #+caption: pixels is denser in the middle and falls off at the 

  3412     #+caption: edges and is inspired by the human retina.

  3413     #+name: retina

  3414     #+ATTR_LaTeX: :width 7cm :placement [H]

  3415     [[./images/retina-small.png]]

  3416 

  3417 *** Hearing

  3418 

  3419     Ears are created by creating an empty node named =ears= and then

  3420     creating any number of empty child nodes to represent your

  3421     creature's ears. 

  3422 

  3423     Ear nodes do not require any metadata.

  3424 

  3425     The ear will bind to and follow the closest physical node.

  3426 

  3427 *** Touch

  3428 

  3429     Touch is handled similarly to mass. To make a particular object

  3430     touch sensitive, add metadata of the following form under the

  3431     object's ``touch'' metadata field:

  3432     

  3433     #+BEGIN_EXAMPLE

  3434     <touch-UV-map-file-name>    

  3435     #+END_EXAMPLE

  3436 

  3437     You may also include an optional ``scale'' metadata number to

  3438     specify the length of the touch feelers. The default is $0.1$,

  3439     and this is generally sufficient.

  3440 

  3441     The touch UV should contain white pixels for each touch sensor.

  3442 

  3443     Here is an example touch-uv map that approximates a human finger,

  3444     and its corresponding model.

  3445 

  3446     #+caption: This is the tactile-sensor-profile for the upper segment 

  3447     #+caption: of a fingertip. It defines regions of high touch sensitivity 

  3448     #+caption: (where there are many white pixels) and regions of low 

  3449     #+caption: sensitivity (where white pixels are sparse).

  3450     #+name: guide-fingertip-UV

  3451     #+ATTR_LaTeX: :width 9cm :placement [H]

  3452     [[./images/finger-UV.png]]

  3453 

  3454     #+caption: The fingertip UV-image from above applied to a simple

  3455     #+caption: model of a fingertip.

  3456     #+name: guide-fingertip

  3457     #+ATTR_LaTeX: :width 9cm :placement [H]

  3458     [[./images/finger-2.png]]

  3459 

  3460 *** Proprioception

  3461 

  3462     Proprioception is tied to each joint node -- nothing special must

  3463     be done in a Blender model to enable proprioception other than

  3464     creating joint nodes.

  3465 

  3466 *** Muscles

  3467 

  3468     Muscles are created by creating an empty node named =muscles= and

  3469     then creating any number of empty child nodes to represent your

  3470     creature's muscles.

  3471 

  3472     

  3473     Muscle nodes should have the following metadata under the

  3474     ``muscle'' label:

  3475 

  3476     #+BEGIN_EXAMPLE

  3477     <muscle-profile-file-name>

  3478     #+END_EXAMPLE

  3479 

  3480     Muscles should also have a ``strength'' metadata entry describing

  3481     the muscle's total strength at full activation. 

  3482 

  3483     Muscle profiles are simple images that contain the relative amount

  3484     of muscle power in each simulated alpha motor neuron. The width of

  3485     the image is the total size of the motor pool, and the redness of

  3486     each pixel is the relative power of the corresponding motor neuron.

  3487 

  3488     While the profile image can have any dimensions, only the first

  3489     line of pixels is used to define the muscle. Here is a sample

  3490     muscle profile image that defines a human-like muscle.

  3491 

  3492     #+caption: A muscle profile image that describes the strengths

  3493     #+caption: of each motor neuron in a muscle. White is weakest 

  3494     #+caption: and dark red is strongest. This particular pattern 

  3495     #+caption: has weaker motor neurons at the beginning, just 

  3496     #+caption: like human muscle.

  3497     #+name: muscle-recruit

  3498     #+ATTR_LaTeX: :width 7cm :placement [H]

  3499     [[./images/basic-muscle.png]]

  3500     

  3501     Muscles twist the nearest physical object about the muscle node's

  3502     Z-axis. I recommend using the ``Single Arrow'' display mode for

  3503     muscles and using the right hand rule to determine which way the

  3504     muscle will twist. To make a segment that can twist in multiple

  3505     directions, create multiple, differently aligned muscles.

  3506 

  3507 ** =CORTEX= API

  3508 

  3509    These are some of the functions exposed by =CORTEX= for creating

  3510    worlds and simulating creatures. These are in addition to

  3511    jMonkeyEngine3's extensive library, which is documented elsewhere.

  3512 

  3513 *** Simulation

  3514    - =(world root-node key-map setup-fn update-fn)= :: create

  3515         a simulation.

  3516      - /root-node/     :: a =com.jme3.scene.Node= object which

  3517           contains all of the objects that should be in the

  3518           simulation.

  3519 

  3520      - /key-map/       :: a map from strings describing keys to

  3521           functions that should be executed whenever that key is

  3522           pressed. The functions should take a SimpleApplication

  3523           object and a boolean value. The SimpleApplication is the

  3524           current simulation that is running, and the boolean is true

  3525           if the key is being pressed, and false if it is being

  3526           released. As an example,

  3527           #+BEGIN_SRC clojure

  3528           {"key-j" (fn [game value] (when value (println "key j pressed")))}

  3529           #+END_SRC

  3530           is a valid key-map which will cause the simulation to print

  3531           a message whenever the 'j' key on the keyboard is pressed.

  3532 

  3533      - /setup-fn/      :: a function that takes a =SimpleApplication=

  3534           object. It is called once when initializing the simulation.

  3535           Use it to create things like lights, change the gravity,

  3536           initialize debug nodes, etc.

  3537 

  3538      - /update-fn/     :: this function takes a =SimpleApplication=

  3539           object and a float and is called every frame of the

  3540           simulation. The float tells how many seconds it has been

  3541           since the last frame was rendered, according to whatever

  3542           clock jMonkeyEngine is currently using. The default is to use IsoTimer

  3543           which will result in this value always being the same.

  3544 

  3545    - =(position-camera world position rotation)= :: set the position

  3546         of the simulation's main camera.

  3547 

  3548    - =(enable-debug world)= :: turn on debug wireframes for each

  3549         simulated object.

  3550 

  3551    - =(set-gravity world gravity)= :: set the gravity of a running

  3552         simulation.

  3553 

  3554    - =(box length width height & {options})= :: create a box in the

  3555         simulation. Options is a hash map specifying texture, mass,

  3556         etc. Possible options are =:name=, =:color=, =:mass=,

  3557         =:friction=, =:texture=, =:material=, =:position=,

  3558         =:rotation=, =:shape=, and =:physical?=.

  3559 

  3560    - =(sphere radius & {options})= :: create a sphere in the simulation.

  3561         Options are the same as in =box=.

  3562 

  3563    - =(load-blender-model file-name)= :: create a node structure

  3564         representing that described in a blender file.

  3565 

  3566    - =(light-up-everything world)= :: distribute a standard complement

  3567         of lights throughout the simulation. Should be adequate for most

  3568         purposes.

  3569 

  3570    - =(node-seq node)= :: return a recursive list of the node's

  3571         children.

  3572 

  3573    - =(nodify name children)= :: construct a node given a node-name and

  3574         desired children.

  3575 

  3576    - =(add-element world element)= :: add an object to a running world

  3577         simulation.

  3578 

  3579    - =(set-accuracy world accuracy)= :: change the accuracy of the

  3580         world's physics simulator.

  3581 

  3582    - =(asset-manager)= :: get an /AssetManager/, a jMonkeyEngine

  3583         construct that is useful for loading textures and is required

  3584         for smooth interaction with jMonkeyEngine library functions.

  3585 

  3586    - =(load-bullet)=   :: unpack native libraries and initialize

  3587         Bullet. This function must be called before any other world-building

  3588         functions.

  3589 	
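   Since a key-map for =world= is an ordinary Clojure map from strings
   to two-argument functions, its handlers can be exercised without
   starting a simulation. The sketch below is a minimal example under
   that reading. The helper =track-key=, the atom =held-keys=, and the
   string "key-k" (by analogy with "key-j") are my own assumptions,
   and =nil= stands in for the =SimpleApplication= object.

   #+begin_src clojure
;; Records which keys are currently held down.
(def held-keys (atom #{}))

(defn track-key
  "Hypothetical helper: build a handler that takes the application
   object and a boolean that is true on press and false on release."
  [key-name]
  (fn [game pressed?]
    (if pressed?
      (swap! held-keys conj key-name)
      (swap! held-keys disj key-name))))

(def key-map
  {"key-j" (track-key "key-j")
   "key-k" (track-key "key-k")})

;; Simulate two presses and one release.
((key-map "key-j") nil true)
((key-map "key-k") nil true)
((key-map "key-j") nil false)

@held-keys ; only "key-k" is still held
   #+end_src

   Handling the release event explicitly is what distinguishes this
   from the one-shot =println= handler shown for =world=.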

  3590 *** Creature Manipulation / Import

  3591 

  3592    - =(body! creature)= :: give the creature a physical body.

  3593 

  3594    - =(vision! creature)= :: give the creature a sense of vision.

  3595         Returns a list of functions which will each, when called

  3596         during a simulation, return the vision data for the channel of

  3597         one of the eyes. The functions are ordered depending on the

  3598         alphabetical order of the names of the eye nodes in the

  3599         blender file. The data returned by the functions is a vector

  3600         containing the eye's /topology/, a vector of coordinates, and

  3601         the eye's /data/, a vector of RGB values filtered by the eye's

  3602         sensitivity. 

  3603 

  3604    - =(hearing! creature)= :: give the creature a sense of hearing.

  3605         Returns a list of functions, one for each ear, that when

  3606         called will return a frame's worth of hearing data for that

  3607         ear. The functions are ordered depending on the alphabetical

  3608         order of the names of the ear nodes in the blender file. The

  3609         data returned by the functions is an array of PCM-encoded wav

  3610         data. 

  3611 

  3612    - =(touch! creature)= :: give the creature a sense of touch. Returns

  3613         a single function that must be called with the /root node/ of

  3614         the world, and which will return a vector of /touch-data/,

  3615         one entry for each touch sensitive component, each entry of

  3616         which contains a /topology/ that specifies the distribution of

  3617         touch sensors, and the /data/, which is a vector of

  3618         =[activation, length]= pairs for each touch hair.

  3619 

  3620    - =(proprioception! creature)= :: give the creature the sense of

  3621         proprioception. Returns a list of functions, one for each

  3622         joint, that when called during a running simulation will

  3623         report the =[heading, pitch, roll]= of the joint.

  3624 

  3625    - =(movement! creature)= :: give the creature the power of movement.

  3626         Creates a list of functions, one for each muscle, that when

  3627         called with an integer, will set the recruitment of that

  3628         muscle to that integer, and will report the current power

  3629         being exerted by the muscle. Order of muscles is determined by

  3630         the alphabetical sort order of the names of the muscle nodes.

  3631 	
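   To show the shape of the data described for =touch!=, here is a
   small sketch that summarizes a hand-made reading. The value
   =touch-reading= is a synthetic stand-in for what the touch function
   returns, and =mean-activation= is a hypothetical helper that is not
   part of =CORTEX=.

   #+begin_src clojure
;; One [topology data] entry per touch-sensitive component.
(def touch-reading
  [[[[0 0] [0 1]]            ; topology: sensor coordinates
    [[0.0 0.1] [0.1 0.1]]]   ; data: [activation length] pairs
   [[[0 0]]
    [[0.1 0.1]]]])

(defn mean-activation
  "Hypothetical helper: average activation over one component's sensors."
  [[topology data]]
  (/ (reduce + (map first data)) (count data)))

;; One summary number per touch-sensitive component.
(map mean-activation touch-reading)
   #+end_src

   The destructuring =[[topology data]]= mirrors the pairing of
   coordinates with values that the sense functions use.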

  3632 *** Visualization/Debug

  3633 

  3634    - =(view-vision)= :: create a function that when called with a list

  3635         of visual data returned from the functions made by =vision!=, 

  3636         will display that visual data on the screen.

  3637 

  3638    - =(view-hearing)= :: same as =view-vision= but for hearing.

  3639 

  3640    - =(view-touch)= :: same as =view-vision= but for touch.

  3641 

  3642    - =(view-proprioception)= :: same as =view-vision= but for

  3643         proprioception.

  3644 

  3645    - =(view-movement)= :: same as =view-vision= but for

  3646         movement.

  3647 

  3648    - =(view anything)= :: =view= is a polymorphic function that allows

  3649         you to inspect almost anything you could reasonably expect to

  3650         be able to ``see'' in =CORTEX=.

  3651 

  3652    - =(text anything)= :: =text= is a polymorphic function that allows

  3653         you to convert practically anything into a text string.	

  3654 

  3655    - =(println-repl anything)= :: print messages to clojure's repl

  3656         instead of the simulation's terminal window.

  3657 

  3658    - =(mega-import-jme3)= :: for experimenting at the REPL. This

  3659         function will import all jMonkeyEngine3 classes for immediate

  3660         use.

  3661 

  3662    - =(display-dilated-time world timer)= :: show the time as it

  3663         flows in the simulation on a heads-up display.

  3664 

  3665 

  3666
author	rlm
date	Mon, 31 Mar 2014 08:29:50 -0400
parents	1803144ec9ae
children	1e51263afdc0