`CORTEX`

+ + + + +

Written by Robert McIntyre

+ + + + + + + +

Artificial Imagination

+ + +

+ Imagine watching a video of someone skateboarding. When you watch + the video, you can imagine yourself skateboarding, and your + knowledge of the human body and its dynamics guides your + interpretation of the scene. For example, even if the skateboarder + is partially occluded, you can infer the positions of his arms and + body from your own knowledge of how your body would be positioned if + you were skateboarding. If the skateboarder suffers an accident, you + wince in sympathy, imagining the pain your own body would experience + if it were in the same situation. This empathy with other people + guides our understanding of whatever they are doing because it is a + powerful constraint on what is probable and possible. In order to + make use of this powerful empathy constraint, I need a system that + can generate and make sense of sensory data from the many different + senses that humans possess. The two key proprieties of such a system + are embodiment and imagination. +

+ +

What is imagination?

+ + +

+ One kind of imagination is sympathetic imagination: you imagine + yourself in the position of something/someone you are + observing. This type of imagination comes into play when you follow + along visually when watching someone perform actions, or when you + sympathetically grimace when someone hurts themselves. This type of + imagination uses the constraints you have learned about your own + body to highly constrain the possibilities in whatever you are + seeing. It uses all your senses to including your senses of touch, + proprioception, etc. Humans are flexible when it comes to "putting + themselves in another's shoes," and can sympathetically understand + not only other humans, but entities ranging animals to cartoon + characters to single dots on a screen! +

+ Another kind of imagination is predictive imagination: you + construct scenes in your mind that are not entirely related to + whatever you are observing, but instead are predictions of the + future or simply flights of fancy. You use this type of imagination + to plan out multi-step actions, or play out dangerous situations in + your mind so as to avoid messing them up in reality. +

+ Of course, sympathetic and predictive imagination blend into each + other and are not completely separate concepts. One dimension along + which you can distinguish types of imagination is dependence on raw + sense data. Sympathetic imagination is highly constrained by your + senses, while predictive imagination can be more or less dependent + on your senses depending on how far ahead you imagine. Daydreaming + is an extreme form of predictive imagination that wanders through + different possibilities without concern for whether they are + related to whatever is happening in reality. +

+ For this thesis, I will mostly focus on sympathetic imagination and + the constraint it provides for understanding sensory data. +

+ +

What problems can imagination solve?

+ + +

+ Consider a video of a cat drinking some water. +

+ +

../images/cat-drinking.jpg

A cat drinking some water. Identifying this action is beyond the state of the art for computers.

+ +

+ It is currently impossible for any computer program to reliably + label such an video as "drinking". I think humans are able to label + such video as "drinking" because they imagine themselves as the + cat, and imagine putting their face up against a stream of water + and sticking out their tongue. In that imagined world, they can + feel the cool water hitting their tongue, and feel the water + entering their body, and are able to recognize that feeling as + drinking. So, the label of the action is not really in the pixels + of the image, but is found clearly in a simulation inspired by + those pixels. An imaginative system, having been trained on + drinking and non-drinking examples and learning that the most + important component of drinking is the feeling of water sliding + down one's throat, would analyze a video of a cat drinking in the + following manner: +

Create a physical model of the video by putting a "fuzzy" model + of its own body in place of the cat. Also, create a simulation of + the stream of water. + +
Play out this simulated scene and generate imagined sensory + experience. This will include relevant muscle contractions, a + close up view of the stream from the cat's perspective, and most + importantly, the imagined feeling of water entering the mouth. + +
The action is now easily identified as drinking by the sense of + taste alone. The other senses (such as the tongue moving in and + out) help to give plausibility to the simulated action. Note that + the sense of vision, while critical in creating the simulation, + is not critical for identifying the action from the simulation. +

+ + +

+ More generally, I expect imaginative systems to be particularly + good at identifying embodied actions in videos. +

+ +

Cortex

+ + +

+ The previous example involves liquids, the sense of taste, and + imagining oneself as a cat. For this thesis I constrain myself to + simpler, more easily digitizable senses and situations. +

+ My system, Cortex performs imagination in two different simplified + worlds: worm world and stick figure world. In each of these + worlds, entities capable of imagination recognize actions by + simulating the experience from their own perspective, and then + recognizing the action from a database of examples. +

+ In order to serve as a framework for experiments in imagination, + Cortex requires simulated bodies, worlds, and senses like vision, + hearing, touch, proprioception, etc. +

+ +

A Video Game Engine takes care of some of the groundwork

+ + +

+ When it comes to simulation environments, the engines used to + create the worlds in video games offer top-notch physics and + graphics support. These engines also have limited support for + creating cameras and rendering 3D sound, which can be repurposed + for vision and hearing respectively. Physics collision detection + can be expanded to create a sense of touch. +

+ jMonkeyEngine3 is one such engine for creating video games in + Java. It uses OpenGL to render to the screen and uses screengraphs + to avoid drawing things that do not appear on the screen. It has an + active community and several games in the pipeline. The engine was + not built to serve any particular game but is instead meant to be + used for any 3D game. I chose jMonkeyEngine3 it because it had the + most features out of all the open projects I looked at, and because + I could then write my code in Clojure, an implementation of LISP + that runs on the JVM. +

+ +

`CORTEX` Extends jMonkeyEngine3 to implement rich senses

+ + +

+ Using the game-making primitives provided by jMonkeyEngine3, I have + constructed every major human sense except for smell and + taste. Cortex also provides an interface for creating creatures + in Blender, a 3D modeling environment, and then "rigging" the + creatures with senses using 3D annotations in Blender. A creature + can have any number of senses, and there can be any number of + creatures in a simulation. +

+ The senses available in Cortex are: +

+ + +

+ +

A roadmap for `Cortex` experiments

+ + + +

+ +

Worm World

+ + +

+ Worms in Cortex are segmented creatures which vary in length and + number of segments, and have the senses of vision, proprioception, + touch, and muscle tension. +

+ +

../images/finger-UV.png

This is the tactile-sensor-profile for the upper segment of a worm. It defines regions of high touch sensitivity (where there are many white pixels) and regions of low sensitivity (where white pixels are sparse).

+ + + + +

+ +

+
YouTube + +

The worm responds to touch.

+ +

+
YouTube + +

Proprioception in a worm. The proprioceptive readout is + in the upper left corner of the screen.

+ +

+ A worm is trained in various actions such as sinusoidal movement, + curling, flailing, and spinning by directly playing motor + contractions while the worm "feels" the experience. These actions + are recorded both as vectors of muscle tension, touch, and + proprioceptive data, but also in higher level forms such as + frequencies of the various contractions and a symbolic name for the + action. +

+ Then, the worm watches a video of another worm performing one of + the actions, and must judge which action was performed. Normally + this would be an extremely difficult problem, but the worm is able + to greatly diminish the search space through sympathetic + imagination. First, it creates an imagined copy of its body which + it observes from a third person point of view. Then for each frame + of the video, it maneuvers its simulated body to be in registration + with the worm depicted in the video. The physical constraints + imposed by the physics simulation greatly decrease the number of + poses that have to be tried, making the search feasible. As the + imaginary worm moves, it generates imaginary muscle tension and + proprioceptive sensations. The worm determines the action not by + vision, but by matching the imagined proprioceptive data with + previous examples. +

+ By using non-visual sensory data such as touch, the worms can also + answer body related questions such as "did your head touch your + tail?" and "did worm A touch worm B?" +

+ The proprioceptive information used for action identification is + body-centric, so only the registration step is dependent on point + of view, not the identification step. Registration is not specific + to any particular action. Thus, action identification can be + divided into a point-of-view dependent generic registration step, + and a action-specific step that is body-centered and invariant to + point of view. +

+ +

Stick Figure World

+ + +

+ This environment is similar to Worm World, except the creatures are + more complicated and the actions and questions more varied. It is + an experiment to see how far imagination can go in interpreting + actions. +

CORTEX

aurellem ☉

Artificial Imagination

What is imagination?

What problems can imagination solve?

Cortex

A Video Game Engine takes care of some of the groundwork

CORTEX Extends jMonkeyEngine3 to implement rich senses

A roadmap for Cortex experiments

Worm World

Stick Figure World

`CORTEX`

`CORTEX` Extends jMonkeyEngine3 to implement rich senses

A roadmap for `Cortex` experiments