#+title: Simulated Senses
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulating senses for AI research using jMonkeyEngine3
#+keywords: Alan Turing, AI, simulated senses, jMonkeyEngine3, virtual world
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes

* Background

Artificial Intelligence has tried and failed for more than half a century to produce programs as flexible, creative, and "intelligent" as the human mind itself. Clearly, we are still missing some important ideas concerning intelligent programs, or we would have strong AI already. What idea could be missing?

When Turing first proposed his famous "Turing Test" in the groundbreaking paper [[../sources/turing.pdf][/Computing Machinery and Intelligence/]], he gave little importance to how a computer program might interact with the world:

#+BEGIN_QUOTE
\ldquo{}We need not be too concerned about the legs, eyes, etc. The example of Miss Helen Keller shows that education can take place provided that communication in both directions between teacher and pupil can take place by some means or other.\rdquo{}
#+END_QUOTE

And from the example of Helen Keller he went on to assume that the only thing a fledgling AI program could need by way of communication is a teletypewriter. But Helen Keller did possess vision and hearing for the first few months of her life, and her tactile sense was far richer than any text stream could hope to achieve. She possessed a body she could move freely, and had continual access to the real world to learn from her actions.

I believe that our programs are suffering from too little sensory input to become really intelligent. Imagine for a moment that you lived in a world completely cut off from all sensory stimulation. You have no eyes to see, no ears to hear, no mouth to speak. No body, no taste, no feeling whatsoever. The only sense you get at all is a single point of light, flickering on and off in the void. If this were your life from birth, you would never learn anything, and could never become intelligent. Actual humans placed in sensory deprivation chambers experience hallucinations and can begin to lose their sense of reality. Most of the time, the programs we write are in exactly this situation. They do not interface with cameras and microphones, and they do not control a real or simulated body or interact with any sort of world.

* Simulation vs. Reality

I want to demonstrate that multiple senses are what enable intelligence. There are two ways of playing around with senses and computer programs:

** Simulation

The first is to go entirely with simulation: virtual world, virtual character, virtual senses. The advantages are that when everything is a simulation, experiments in that simulation are absolutely reproducible. It's also easier to change the character and world to explore new situations and different sensory combinations.
If the world is to be simulated on a computer, then not only do you have to worry about whether the character's senses are rich enough to learn from the world, but whether the world itself is rendered with enough detail and realism to give enough working material to the character's senses. To name just a few difficulties facing modern physics simulators: destructibility of the environment, simulation of water and other fluids, large areas, nonrigid bodies, lots of objects, smoke. I don't know of any computer simulation that would allow a character to take a rock and grind it into fine dust, then use that dust to make a clay sculpture, at least not without spending years calculating the interactions of every single small grain of dust. Maybe a simulated world with today's limitations doesn't provide enough richness for real intelligence to evolve.

** Reality

The other approach for playing with senses is to hook your software up to real cameras, microphones, robots, etc., and let it loose in the real world. This has the advantage of eliminating concerns about simulating the world, at the expense of increasing the complexity of implementing the senses. Instead of just grabbing the current rendered frame for processing, you have to use an actual camera with real lenses and interact with photons to get an image. It is much harder to change the character, which is now partly a physical robot of some sort, since doing so involves changing things around in the real world instead of modifying lines of code. While the real world is very rich and definitely provides enough stimulation for intelligence to develop, as evidenced by our own existence, it is also uncontrollable in the sense that a particular situation cannot be recreated perfectly or saved for later use. It is harder to conduct science because it is harder to repeat an experiment. The worst thing about using the real world instead of a simulation is the matter of time. Instead of simulated time you get the constant and unstoppable flow of real time. This severely limits the sorts of software you can use to program the AI, because all sense inputs must be handled in real time. Complicated ideas may have to be implemented in hardware, or may simply be impossible given the current speed of our processors. Contrast this with a simulation, in which the flow of time in the simulated world can be slowed down to accommodate the limitations of the character's programming. In terms of cost, doing everything in software is far cheaper than building custom real-time hardware. All you need is a laptop and some patience.

* Choose a Simulation Engine

Mainly because of issues with controlling the flow of time, I chose to simulate both the world and the character. I set out to make a world in which I could embed a character with multiple senses. My main goal is to make an environment where I can perform further experiments in simulated senses.
I examined many different 3D environments to try to find something I could use as the base for my simulation; eventually the choice came down to three engines: the Quake II engine, the Source Engine, and jMonkeyEngine.

** [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]

I spent a bit more than a month working with the Quake II engine from id Software to see if I could use it for my purposes. All of the source code was released by id Software under the GPL several years ago, and as a result it has been ported and modified for many different purposes. This engine was famous for its advanced use of realistic shading and had decent and fast physics simulation. Researchers at Princeton [[http://papers.cnl.salk.edu/PDFs/Intracelllular%20Dynamics%20of%20Virtual%20Place%20Cells%202011-4178.pdf][used this code]] to study spatial information encoding in the hippocampal cells of rats. Those researchers created a special Quake II level that simulated a maze, and added an interface where a mouse could run around inside a ball in various directions to move the character in the simulated maze. They measured hippocampal activity during this exercise to try to tease out how spatial data is stored in that area of the brain. I find this promising because if a real living rat can interact with a computer simulation of a maze in the same way it interacts with a real-world maze, then maybe that simulation is close enough to reality that a simulated sense of vision and motor control interacting with that simulation could reveal useful information about the real thing. There is a Java port of the original C source code called Jake2. The port demonstrates Java's OpenGL bindings and runs anywhere from 90% to 105% as fast as the C version. After reviewing much of the source of Jake2, I eventually rejected it because the engine is too tied to the concept of a first-person shooter game. One of the problems I had was that there did not seem to be any easy way to attach multiple cameras to a single character. There are also several physics clipping issues that are corrected in a way that only applies to the main character and does not apply to arbitrary objects. While there is a large community of level modders, I couldn't find a community to support using the engine to make new things.

** [[http://source.valvesoftware.com/][Source Engine]]

The Source Engine evolved from the Quake I and Quake II engines and is used by Valve in the Half-Life series of games. The physics simulation in the Source Engine is quite accurate and probably the best out of all the engines I investigated. There is also an extensive community actively working with the engine. However, applications that use the Source Engine must be written in C++, the code is not open, it only runs on Windows, and the tools that come with the SDK to handle models and textures are complicated and awkward to use.

** [[http://jmonkeyengine.com/][jMonkeyEngine3]]

jMonkeyEngine is a new library for creating games in Java.
It uses OpenGL to render to the screen and uses a scene graph to avoid drawing things that do not appear on the screen. It has an active community and several games in the pipeline. The engine was not built to serve any particular game but is instead meant to be used for any 3D game. After experimenting with each of these three engines and a few others for about two months, I settled on jMonkeyEngine. I chose it because it had the most features out of all the open projects I looked at, and because I could then write my code in Clojure, a dialect of Lisp that runs on the JVM.
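
To give a feel for what driving jMonkeyEngine3 from Clojure looks like, here is a minimal sketch. It is not part of this project's code: it assumes the stock jMonkeyEngine3 =SimpleApplication= API and a hypothetical namespace name, and simply opens a window containing a single blue box by subclassing =SimpleApplication= through Clojure's =proxy=.

#+BEGIN_SRC clojure
;; Minimal sketch (not this project's code): open a jMonkeyEngine3
;; window from Clojure and attach one blue box to the scene graph.
(ns hello.jme3  ; hypothetical namespace
  (:import (com.jme3.app SimpleApplication)
           (com.jme3.material Material)
           (com.jme3.math Vector3f ColorRGBA)
           (com.jme3.scene Geometry)
           (com.jme3.scene.shape Box)))

(defn blue-box-app
  "Return a SimpleApplication whose scene graph contains one blue box."
  []
  (proxy [SimpleApplication] []
    (simpleInitApp []
      (let [box  (Box. Vector3f/ZERO 1 1 1)
            geom (Geometry. "blue-box" box)
            mat  (Material. (.getAssetManager this)
                            "Common/MatDefs/Misc/Unshaded.j3md")]
        (.setColor mat "Color" ColorRGBA/Blue)
        (.setMaterial geom mat)
        ;; anything attached to the root node is rendered each frame
        (.attachChild (.getRootNode this) geom)))))

(defn -main [& _]
  (.start (blue-box-app)))
#+END_SRC

If jMonkeyEngine3 is on the classpath, evaluating =(-main)= from a REPL should open a window showing the box, which serves as a basic check that the engine and the Clojure interop are wired up correctly.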