#+title: Simulated Senses
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulating senses for AI research using jMonkeyEngine3
#+keywords: Alan Turing, AI, simulated senses, jMonkeyEngine3, virtual world
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes

* Background

Artificial Intelligence has tried and failed for more than half
a century to produce programs as flexible, creative, and
"intelligent" as the human mind itself. Clearly, we are still
missing some important ideas concerning intelligent programs, or
we would have strong AI already. What idea could be missing?

When Turing first proposed his famous "Turing Test" in the
groundbreaking paper [[../sources/turing.pdf][/Computing Machinery and Intelligence/]], he gave
little importance to how a computer program might interact with
the world:

#+BEGIN_QUOTE
\ldquo{}We need not be too concerned about the legs, eyes,
etc. The example of Miss Helen Keller shows that education can
take place provided that communication in both directions
between teacher and pupil can take place by some means or
other.\rdquo{}
#+END_QUOTE

From the example of Helen Keller he went on to assume that the
only thing a fledgling AI program could need by way of
communication is a teletypewriter. But Helen Keller did possess
vision and hearing for the first few months of her life, and her
tactile sense was far richer than any text stream could hope to
convey. She possessed a body she could move freely, and had
continual access to the real world to learn from her actions.

I believe that our programs suffer from too little sensory input
to become really intelligent. Imagine for a moment that you
lived in a world completely cut off from all sensory
stimulation. You have no eyes to see, no ears to hear, no mouth
to speak. No body, no taste, no feeling whatsoever. The only
sense you get at all is a single point of light, flickering on
and off in the void. If this were your life from birth, you
would never learn anything, and could never become intelligent.
Actual humans placed in sensory deprivation chambers experience
hallucinations and can begin to lose their sense of reality.
Most of the time, the programs we write are in exactly this
situation. They do not interface with cameras and microphones,
and they do not control a real or simulated body or interact
with any sort of world.

* Simulation vs. Reality

I want to demonstrate that multiple senses are what enable
intelligence. There are two ways of playing around with senses
and computer programs:

** Simulation

The first is to go entirely with simulation: virtual world,
virtual character, virtual senses. The advantage is that when
everything is a simulation, experiments in that simulation are
absolutely reproducible. It's also easier to change the
character and the world to explore new situations and different
sensory combinations.
If the world is to be simulated on a computer, then not only do
you have to worry about whether the character's senses are rich
enough to learn from the world, but also about whether the world
itself is rendered with enough detail and realism to give those
senses enough material to work with. To name just a few
difficulties facing modern physics simulators: destructibility
of the environment, simulation of water and other fluids, large
areas, non-rigid bodies, many objects, smoke. I don't know of
any computer simulation that would allow a character to take a
rock and grind it into fine dust, then use that dust to make a
clay sculpture, at least not without spending years calculating
the interactions of every single grain of dust. Maybe a
simulated world with today's limitations doesn't provide enough
richness for real intelligence to evolve.

** Reality

The other approach for playing with senses is to hook your
software up to real cameras, microphones, robots, etc., and let
it loose in the real world. This has the advantage of
eliminating concerns about simulating the world, at the expense
of increasing the complexity of implementing the senses. Instead
of just grabbing the current rendered frame for processing, you
have to use an actual camera with real lenses and interact with
photons to get an image. It is much harder to change the
character, which is now partly a physical robot of some sort,
since doing so involves changing things around in the real world
instead of modifying lines of code. While the real world is very
rich and definitely provides enough stimulation for intelligence
to develop, as evidenced by our own existence, it is also
uncontrollable in the sense that a particular situation cannot
be recreated perfectly or saved for later use. It is harder to
conduct science because it is harder to repeat an experiment.

The worst thing about using the real world instead of a
simulation is the matter of time. Instead of simulated time you
get the constant and unstoppable flow of real time. This
severely limits the sorts of software you can use to program the
AI, because all sense inputs must be handled in real time.
Complicated ideas may have to be implemented in hardware, or may
simply be impossible given the current speed of our processors.
Contrast this with a simulation, in which the flow of time in
the simulated world can be slowed down to accommodate the
limitations of the character's programming. In terms of cost,
doing everything in software is far cheaper than building custom
real-time hardware. All you need is a laptop and some patience.
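To make that contrast concrete, here is a minimal,
engine-agnostic sketch in Clojure of the kind of loop a
simulation affords. Every name in it is hypothetical; the point
is only that the world advances by a fixed slice of /simulated/
time per tick, so the character's sense-processing can take as
much real time as it needs.

#+BEGIN_SRC clojure
;; Hypothetical sketch: simulated time is decoupled from real
;; time. `step-world` and `process-senses` stand in for a
;; physics engine and a character's (possibly very slow)
;; perception code.
(defn run-simulation
  [initial-world step-world process-senses ticks]
  (let [dt 0.016]  ; each tick is 16ms of *simulated* time
    (loop [world initial-world, n ticks]
      (if (zero? n)
        world
        (let [world' (step-world world dt)]
          ;; this call may take minutes of real time per tick;
          ;; the simulated clock doesn't care.
          (process-senses world')
          (recur world' (dec n)))))))
#+END_SRC

No matter how expensive =process-senses= becomes, the simulated
world never outruns the character; in the real world, photons
will not wait.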
* Choose a Simulation Engine

Mainly because of these issues with controlling the flow of
time, I chose to simulate both the world and the character. I
set out to make a world in which I could embed a character with
multiple senses. My main goal is to make an environment where I
can perform further experiments in simulated senses.

I examined many different 3D environments to try to find
something I could use as the base for my simulation; eventually
the choice came down to three engines: the Quake II engine, the
Source Engine, and jMonkeyEngine.

** [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]

I spent a bit more than a month working with the Quake II engine
from id Software to see if I could use it for my purposes. All
the source code was released by id Software into the public
domain several years ago, and as a result it has been ported and
modified for many different purposes. This engine was famous for
its advanced use of realistic shading, and it had decent, fast
physics simulation. Researchers at Princeton [[http://papers.cnl.salk.edu/PDFs/Intracelllular%20Dynamics%20of%20Virtual%20Place%20Cells%202011-4178.pdf][used this code]]
([[http://brainwindows.wordpress.com/2009/10/14/playing-quake-with-a-real-mouse/][video]]) to study spatial information encoding in the
hippocampal cells of rats. Those researchers created a special
Quake II level that simulated a maze, and added an interface
where a mouse could run on top of a ball in various directions
to move the character in the simulated maze. They measured
hippocampal activity during this exercise to try to tease out
how spatial data is stored in that area of the brain. I find
this promising because if a real living rat can interact with a
computer simulation of a maze in the same way it interacts with
a real-world maze, then maybe that simulation is close enough to
reality that a simulated sense of vision and motor control
interacting with that simulation could reveal useful information
about the real thing.

There is a Java port of the original C source code called
Jake2. The port demonstrates Java's OpenGL bindings and runs
anywhere from 90% to 105% as fast as the C version. After
reviewing much of the source of Jake2, I rejected it because the
engine is too tied to the concept of a first-person shooter
game. One of the problems I had was that there does not seem to
be any easy way to attach multiple cameras to a single
character. There are also several physics clipping issues that
are corrected in a way that applies only to the main character
and not to arbitrary objects. While there is a large community
of level modders, I couldn't find a community that supports
using the engine to make new kinds of things.

** [[http://source.valvesoftware.com/][Source Engine]]

The Source Engine evolved from the Quake I and Quake II engines
and is used by Valve in the Half-Life series of games. The
physics simulation in the Source Engine is quite accurate and
probably the best out of all the engines I investigated. There
is also an extensive community actively working with the engine.
However, applications that use the Source Engine must be written
in C++, the code is not open, it only runs on Windows, and the
tools that come with the SDK to handle models and textures are
complicated and awkward to use.
** [[http://jmonkeyengine.com/][jMonkeyEngine3]]

jMonkeyEngine3 is a new library for creating games in Java. It
uses OpenGL to render to the screen and uses scene graphs to
avoid drawing things that do not appear on the screen. It has an
active community and several games in the pipeline. The engine
was not built to serve any particular game, but is instead meant
to be used for any 3D game. After experimenting with each of
these three engines and a few others for about two months, I
settled on jMonkeyEngine3. I chose it because it had the most
features out of all the open projects I looked at, and because I
could then write my code in Clojure, an implementation of Lisp
that runs on the JVM.
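To give a feel for that combination, here is a minimal sketch
(not code from this project) of driving jMonkeyEngine3 from
Clojure: a =SimpleApplication= built with =proxy= that attaches
a single box to the scene graph. The class, method, and material
names are jMonkeyEngine3's; the rest is illustrative.

#+BEGIN_SRC clojure
(ns hello.jme3
  (:import (com.jme3.app SimpleApplication)
           (com.jme3.material Material)
           (com.jme3.math ColorRGBA Vector3f)
           (com.jme3.scene Geometry)
           (com.jme3.scene.shape Box)))

(defn hello-app
  "Create a jME3 application whose scene graph holds one blue
   box; anything attached to the root node gets drawn."
  []
  (proxy [SimpleApplication] []
    (simpleInitApp []
      (let [geom (Geometry. "box" (Box. Vector3f/ZERO 1 1 1))
            mat  (Material. (.getAssetManager this)
                            "Common/MatDefs/Misc/Unshaded.j3md")]
        (.setColor mat "Color" ColorRGBA/Blue)
        (.setMaterial geom mat)
        (.attachChild (.getRootNode this) geom)))))

;; (.start (hello-app)) ; opens a window, starts the render loop
#+END_SRC

Because =proxy= produces a real JVM subclass, the whole
jMonkeyEngine3 API is reachable from a Clojure REPL, which makes
it easy to poke at a running world interactively.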