#+title: Simulated Senses
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulating senses for AI research using JMonkeyEngine3
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes

* Background

Artificial Intelligence has tried and failed for more than half a
century to produce programs as flexible, creative, and "intelligent"
as the human mind itself. Clearly, we are still missing some important
ideas concerning intelligent programs, or we would have strong AI
already. What idea could be missing?

When Turing first proposed his famous "Turing Test" in the
groundbreaking paper [[./sources/turing.pdf][/Computing Machinery and Intelligence/]], he gave
little importance to how a computer program might interact with the
world:

#+BEGIN_QUOTE
\ldquo{}We need not be too concerned about the legs, eyes, etc. The example of
Miss Helen Keller shows that education can take place provided that
communication in both directions between teacher and pupil can take
place by some means or other.\rdquo{}
#+END_QUOTE

From the example of Helen Keller, Turing went on to assume that the
only thing a fledgling AI program needs by way of communication is a
teletypewriter. But Helen Keller did possess vision and hearing for
the first few months of her life, and her tactile sense was far richer
than any text stream could hope to achieve. She possessed a body she
could move freely, and had continual access to the real world to learn
from her actions.

I believe that our programs are suffering from too little sensory
input to become really intelligent. Imagine for a moment that you
lived in a world completely cut off from all sensory stimulation. You
have no eyes to see, no ears to hear, no mouth to speak. No body, no
taste, no feeling whatsoever. The only sense you get at all is a
single point of light, flickering on and off in the void. If this were
your life from birth, you would never learn anything, and could never
become intelligent. Actual humans placed in sensory deprivation
chambers experience hallucinations and can begin to lose their sense
of reality in as little as 15 minutes [sensory-deprivation]. Most of
the time, the programs we write are in exactly this situation. They do
not interface with cameras and microphones, and they do not control a
real or simulated body or interact with any sort of world.

* Simulation vs. Reality

I want to demonstrate that multiple senses are what enable
intelligence. There are two ways of playing around with senses and
computer programs:

The first is to go entirely with simulation: virtual world, virtual
character, virtual senses. The advantage is that when everything is a
simulation, experiments in that simulation are absolutely
reproducible. It's also easier to change the character and world to
explore new situations and different sensory combinations.
** Issues with Simulation

If the world is to be simulated on a computer, then not only do you
have to worry about whether the character's senses are rich enough to
learn from the world, but also whether the world itself is rendered
with enough detail and realism to give those senses enough working
material. To name just a few difficulties facing modern physics
simulators: destructibility of the environment, simulation of water
and other fluids, large areas, non-rigid bodies, large numbers of
objects, and smoke. I don't know of any computer simulation that would
allow a character to take a rock and grind it into fine dust, then use
that dust to make a clay sculpture, at least not without spending
years calculating the interactions of every single grain of
dust. Maybe a simulated world with today's limitations doesn't provide
enough richness for real intelligence to evolve.

** Issues with Reality

The other approach for playing with senses is to hook your software up
to real cameras, microphones, robots, etc., and let it loose in the
real world. This eliminates concerns about simulating the world, at
the expense of increasing the complexity of implementing the senses.
Instead of just grabbing the current rendered frame for processing,
you have to use an actual camera with real lenses and interact with
photons to get an image. It is much harder to change the character,
which is now partly a physical robot of some sort, since doing so
involves changing things around in the real world instead of modifying
lines of code. While the real world is certainly rich enough for
intelligence to develop, as evidenced by our own existence, it is also
uncontrollable, in the sense that a particular situation cannot be
recreated perfectly or saved for later use. It is harder to conduct
science because it is harder to repeat an experiment. The worst thing
about using the real world instead of a simulation is the matter of
time. Instead of simulated time you get the constant and unstoppable
flow of real time. This severely limits the sorts of software you can
use to program the AI, because all sense inputs must be handled in
real time. Complicated ideas may have to be implemented in hardware,
or may simply be impossible given the current speed of our
processors. Contrast this with a simulation, in which the flow of time
in the simulated world can be slowed down to accommodate the
limitations of the character's programming (see the sketch at the end
of this section). In terms of cost, doing everything in software is
far cheaper than building custom real-time hardware. All you need is a
laptop and some patience.
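To make the point about time concrete, here is a minimal sketch of a
decoupled simulation loop. The names =step-world= and
=process-senses!= are hypothetical placeholders, not part of any
library; the point is only that simulated time advances by a fixed
=dt= per tick, so the sense-processing code may take as long as it
likes without the world racing ahead of it.

#+begin_src clojure
(defn run-simulation
  "Advance `world` through `n-ticks` fixed timesteps of length `dt`,
   handing each new state to the (possibly slow) sense-processing
   function. Simulated time depends only on the tick count, never on
   how much wall-clock time `process-senses!` consumes."
  [world step-world process-senses! dt n-ticks]
  (reduce (fn [w tick]
            (let [w' (step-world w dt)]        ; physics: one fixed step
              (process-senses! w' (* tick dt)) ; AI: may run slower than real time
              w'))
          world
          (range 1 (inc n-ticks))))
#+end_src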
* Choose a Simulation Engine

Mainly because of issues with controlling the flow of time, I chose to
simulate both the world and the character. I set out to make a minimal
world in which I could embed a character with multiple senses. My main
goal is to make an environment where I can perform further experiments
in simulated senses.

As Carl Sagan once said, "If you wish to make an apple pie from
scratch, you must first invent the universe." I examined many
different 3D environments to try to find something I could use as the
base for my simulation; eventually the choice came down to three
engines: the Quake II engine, the Source Engine, and jMonkeyEngine.

** [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]

I spent a bit more than a month working with the Quake II engine from
id Software to see if I could use it for my purposes. All the source
code was released by id Software into the public domain several years
ago, and as a result it has been ported and modified for many
different purposes. The engine was famous for its advanced use of
realistic shading, and it had decent, fast physics simulation.
Researchers at Princeton [[http://www.nature.com/nature/journal/v461/n7266/pdf/nature08499.pdf][used this code]] to study spatial information
encoding in the hippocampal cells of mice. Those researchers created a
special Quake II level that simulated a maze, and added an interface
where a mouse running on top of a ball could move the character
through the simulated maze. They measured hippocampal activity during
this exercise to try to tease out how spatial data was stored in that
area of the brain. I find this promising because if a real living
mouse can interact with a computer simulation of a maze in the same
way it interacts with a real-world maze, then maybe that simulation is
close enough to reality that a simulated sense of vision and motor
control interacting with that simulation could reveal useful
information about the real thing.

There is a Java port of the original C source code called Jake2. The
port demonstrates Java's OpenGL bindings and runs anywhere from 90% to
105% as fast as the C version. After reviewing much of the source of
Jake2, I eventually rejected it because the engine is too tied to the
concept of a first-person shooter game. One of the problems I had was
that there did not seem to be any easy way to attach multiple cameras
to a single character. There are also several physics clipping issues
that are corrected in a way that only applies to the main character
and does not apply to arbitrary objects. While there is a large
community of level modders, I couldn't find a community interested in
using the engine to make new things.

** [[http://source.valvesoftware.com/][Source Engine]]

The Source Engine evolved from the Quake I and Quake II engines and is
used by Valve in the Half-Life series of games. The physics simulation
in the Source Engine is quite accurate and probably the best out of
all the engines I investigated. There is also an extensive community
actively working with the engine. However, applications that use the
Source Engine must be written in C++, the code is not open, it only
runs on Windows, and the tools that come with the SDK for handling
models and textures are complicated and awkward to use.

** [[http://jmonkeyengine.com/][jMonkeyEngine3]]

jMonkeyEngine is a new library for creating games in Java.
It uses OpenGL to render to the screen and uses a scene graph to avoid
drawing things that do not appear on the screen. It has an active
community and several games in the pipeline. The engine was not built
to serve any particular game, but is instead meant to be used for any
3D game. After experimenting with each of these three engines, and a
few others, for about two months, I settled on jMonkeyEngine. I chose
it because it had the most features out of all the open projects I
looked at, and because I could then write my code in Clojure, an
implementation of Lisp that runs on the JVM.
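As a taste of that combination, here is a minimal sketch, not code
from this project, of driving jMonkeyEngine3 from Clojure: a
=SimpleApplication= displaying a single box, with a second camera
attached to the same scene graph, the kind of multiple-viewpoint
setup that proved awkward in Quake II. The class and asset names are
standard jMonkeyEngine3; the namespace and everything else are
illustrative.

#+begin_src clojure
(ns hello.jme3
  (:import (com.jme3.app SimpleApplication)
           (com.jme3.material Material)
           (com.jme3.math ColorRGBA)
           (com.jme3.scene Geometry)
           (com.jme3.scene.shape Box)))

(defn make-app
  "A SimpleApplication with one blue box and two cameras viewing it."
  []
  (proxy [SimpleApplication] []
    (simpleInitApp []
      (let [geom (Geometry. "box" (Box. 1 1 1))
            mat  (Material. (.getAssetManager this)
                            "Common/MatDefs/Misc/Unshaded.j3md")]
        (.setColor mat "Color" ColorRGBA/Blue)
        (.setMaterial geom mat)
        (.attachChild (.getRootNode this) geom))
      ;; A second camera rendering the same scene graph into the
      ;; upper-right quarter of the window.
      (let [cam2 (.clone (.getCamera this))
            view (.createMainView (.getRenderManager this)
                                  "second-view" cam2)]
        (.setViewPort cam2 0.5 1.0 0.5 1.0)
        (.setClearFlags view true true true)
        (.attachScene view (.getRootNode this))))))

(defn -main [& args]
  (.start (make-app)))
#+end_src

Calling =(-main)= from a REPL opens the window; =start= runs the
render loop on its own thread, so the REPL stays live for interactive
experimentation.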