#+title: Simulated Senses
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Simulating senses for AI research using jMonkeyEngine3
#+keywords: Alan Turing, AI, simulated senses, jMonkeyEngine3, virtual world
#+SETUPFILE: ../../aurellem/org/setup.org
#+INCLUDE: ../../aurellem/org/level-0.org
#+babel: :mkdirp yes :noweb yes

* Background

Artificial Intelligence has tried and failed for more than half
a century to produce programs as flexible, creative, and
"intelligent" as the human mind itself. Clearly, we are still
missing some important ideas concerning intelligent programs, or
we would have strong AI already. What idea could be missing?

When Turing first proposed his famous "Turing Test" in the
groundbreaking paper [[../sources/turing.pdf][/Computing Machinery and Intelligence/]], he gave
little importance to how a computer program might interact with
the world:

#+BEGIN_QUOTE
\ldquo{}We need not be too concerned about the legs, eyes,
etc. The example of Miss Helen Keller shows that education can
take place provided that communication in both directions
between teacher and pupil can take place by some means or
other.\rdquo{}
#+END_QUOTE

From the example of Helen Keller he went on to assume that the
only thing a fledgling AI program could need by way of
communication is a teletypewriter. But Helen Keller did possess
vision and hearing for the first few months of her life, and her
tactile sense was far richer than any text stream could hope to
convey. She possessed a body she could move freely, and had
continual access to the real world to learn from her actions.

I believe that our programs suffer from too little sensory input
to become really intelligent. Imagine for a moment that you
lived in a world completely cut off from all sensory
stimulation. You have no eyes to see, no ears to hear, no mouth
to speak. No body, no taste, no feeling whatsoever. The only
sense you get at all is a single point of light, flickering on
and off in the void. If this were your life from birth, you
would never learn anything, and could never become intelligent.
Actual humans placed in sensory deprivation chambers experience
hallucinations and can begin to lose their sense of reality.
Most of the time, the programs we write are in exactly this
situation. They do not interface with cameras and microphones,
and they do not control a real or simulated body or interact
with any sort of world.

* Simulation vs. Reality

I want to demonstrate that multiple senses are what enable
intelligence. There are two ways of playing around with senses
and computer programs:

** Simulation

The first is to go entirely with simulation: virtual world,
virtual character, virtual senses. The advantage is that when
everything is a simulation, experiments in that simulation are
absolutely reproducible. It's also easier to change the
character and the world to explore new situations and different
sensory combinations.
If the world is to be simulated on a computer, then not only do
you have to worry about whether the character's senses are rich
enough to learn from the world, but also about whether the world
itself is rendered with enough detail and realism to give those
senses enough material to work with. To name just a few
difficulties facing modern physics simulators: destructibility
of the environment, simulation of water and other fluids, large
areas, non-rigid bodies, many objects, smoke. I don't know of
any computer simulation that would allow a character to take a
rock and grind it into fine dust, then use that dust to make a
clay sculpture, at least not without spending years calculating
the interactions of every single grain of dust. Maybe a
simulated world with today's limitations doesn't provide enough
richness for real intelligence to evolve.

** Reality

The other approach for playing with senses is to hook your
software up to real cameras, microphones, robots, etc., and let
it loose in the real world. This has the advantage of
eliminating concerns about simulating the world, at the expense
of increasing the complexity of implementing the senses. Instead
of just grabbing the current rendered frame for processing, you
have to use an actual camera with real lenses and interact with
photons to get an image. It is much harder to change the
character, which is now partly a physical robot of some sort,
since doing so involves changing things around in the real world
instead of modifying lines of code. While the real world is very
rich and definitely provides enough stimulation for intelligence
to develop, as evidenced by our own existence, it is also
uncontrollable in the sense that a particular situation cannot
be recreated perfectly or saved for later use. It is harder to
conduct science because it is harder to repeat an experiment.

The worst thing about using the real world instead of a
simulation is the matter of time. Instead of simulated time you
get the constant and unstoppable flow of real time. This
severely limits the sorts of software you can use to program the
AI, because all sense inputs must be handled in real time.
Complicated ideas may have to be implemented in hardware, or may
simply be impossible given the current speed of our processors.
Contrast this with a simulation, in which the flow of time in
the simulated world can be slowed down to accommodate the
limitations of the character's programming. In terms of cost,
doing everything in software is far cheaper than building custom
real-time hardware. All you need is a laptop and some patience.
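To make that contrast concrete, here is a minimal,
engine-agnostic sketch in Clojure of the kind of loop a
simulation affords. Every name in it is hypothetical; the point
is only that the world advances by a fixed slice of /simulated/
time per tick, so the character's sense-processing can take as
much real time as it needs.

#+BEGIN_SRC clojure
;; Hypothetical sketch: simulated time is decoupled from real
;; time. `step-world` and `process-senses` stand in for a
;; physics engine and a character's (possibly very slow)
;; perception code.
(defn run-simulation
  [initial-world step-world process-senses ticks]
  (let [dt 0.016]  ; each tick is 16ms of *simulated* time
    (loop [world initial-world, n ticks]
      (if (zero? n)
        world
        (let [world' (step-world world dt)]
          ;; this call may take minutes of real time per tick;
          ;; the simulated clock doesn't care.
          (process-senses world')
          (recur world' (dec n)))))))
#+END_SRC

No matter how expensive =process-senses= becomes, the simulated
world never outruns the character; in the real world, photons
will not wait.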
* Choose a Simulation Engine

Mainly because of these issues with controlling the flow of
time, I chose to simulate both the world and the character. I
set out to make a world in which I could embed a character with
multiple senses. My main goal is to make an environment where I
can perform further experiments in simulated senses.

I examined many different 3D environments to try to find
something I could use as the base for my simulation; eventually
the choice came down to three engines: the Quake II engine, the
Source Engine, and jMonkeyEngine.

** [[http://www.idsoftware.com][Quake II]]/[[http://www.bytonic.de/html/jake2.html][Jake2]]

I spent a bit more than a month working with the Quake II engine
from id Software to see if I could use it for my purposes. All
the source code was released by id Software into the public
domain several years ago, and as a result it has been ported and
modified for many different purposes. This engine was famous for
its advanced use of realistic shading, and it had decent, fast
physics simulation. Researchers at Princeton [[http://papers.cnl.salk.edu/PDFs/Intracelllular%20Dynamics%20of%20Virtual%20Place%20Cells%202011-4178.pdf][used this code]]
([[http://brainwindows.wordpress.com/2009/10/14/playing-quake-with-a-real-mouse/][video]]) to study spatial information encoding in the
hippocampal cells of rats. Those researchers created a special
Quake II level that simulated a maze, and added an interface
where a mouse could run on top of a ball in various directions
to move the character in the simulated maze. They measured
hippocampal activity during this exercise to try to tease out
how spatial data is stored in that area of the brain. I find
this promising because if a real living rat can interact with a
computer simulation of a maze in the same way it interacts with
a real-world maze, then maybe that simulation is close enough to
reality that a simulated sense of vision and motor control
interacting with that simulation could reveal useful information
about the real thing.

There is a Java port of the original C source code called
Jake2. The port demonstrates Java's OpenGL bindings and runs
anywhere from 90% to 105% as fast as the C version. After
reviewing much of the source of Jake2, I rejected it because the
engine is too tied to the concept of a first-person shooter
game. One of the problems I had was that there does not seem to
be any easy way to attach multiple cameras to a single
character. There are also several physics clipping issues that
are corrected in a way that applies only to the main character
and not to arbitrary objects. While there is a large community
of level modders, I couldn't find a community that supports
using the engine to make new kinds of things.

** [[http://source.valvesoftware.com/][Source Engine]]

The Source Engine evolved from the Quake I and Quake II engines
and is used by Valve in the Half-Life series of games. The
physics simulation in the Source Engine is quite accurate and
probably the best out of all the engines I investigated. There
is also an extensive community actively working with the engine.
However, applications that use the Source Engine must be written
in C++, the code is not open, it only runs on Windows, and the
tools that come with the SDK to handle models and textures are
complicated and awkward to use.
** [[http://jmonkeyengine.com/][jMonkeyEngine3]]

jMonkeyEngine3 is a new library for creating games in Java. It
uses OpenGL to render to the screen and uses scene graphs to
avoid drawing things that do not appear on the screen. It has an
active community and several games in the pipeline. The engine
was not built to serve any particular game, but is instead meant
to be used for any 3D game. After experimenting with each of
these three engines and a few others for about two months, I
settled on jMonkeyEngine3. I chose it because it had the most
features out of all the open projects I looked at, and because I
could then write my code in Clojure, an implementation of Lisp
that runs on the JVM.
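To give a feel for that combination, here is a minimal sketch
(not code from this project) of driving jMonkeyEngine3 from
Clojure: a =SimpleApplication= built with =proxy= that attaches
a single box to the scene graph. The class, method, and material
names are jMonkeyEngine3's; the rest is illustrative.

#+BEGIN_SRC clojure
(ns hello.jme3
  (:import (com.jme3.app SimpleApplication)
           (com.jme3.material Material)
           (com.jme3.math ColorRGBA Vector3f)
           (com.jme3.scene Geometry)
           (com.jme3.scene.shape Box)))

(defn hello-app
  "Create a jME3 application whose scene graph holds one blue
   box; anything attached to the root node gets drawn."
  []
  (proxy [SimpleApplication] []
    (simpleInitApp []
      (let [geom (Geometry. "box" (Box. Vector3f/ZERO 1 1 1))
            mat  (Material. (.getAssetManager this)
                            "Common/MatDefs/Misc/Unshaded.j3md")]
        (.setColor mat "Color" ColorRGBA/Blue)
        (.setMaterial geom mat)
        (.attachChild (.getRootNode this) geom)))))

;; (.start (hello-app)) ; opens a window, starts the render loop
#+END_SRC

Because =proxy= produces a real JVM subclass, the whole
jMonkeyEngine3 API is reachable from a Clojure REPL, which makes
it easy to poke at a running world interactively.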