changeset 472:516a029e0be9

complete first draft of hearing.
author Robert McIntyre <rlm@mit.edu>
date Fri, 28 Mar 2014 18:14:04 -0400
parents f14fa9e5b67f
children 486ce07f5545
files org/hearing.org thesis/cortex.org
diffstat 2 files changed, 305 insertions(+), 6 deletions(-)
     1.1 --- a/org/hearing.org	Fri Mar 28 17:31:33 2014 -0400
     1.2 +++ b/org/hearing.org	Fri Mar 28 18:14:04 2014 -0400
     1.3 @@ -26,8 +26,8 @@
     1.4  jMonkeyEngine's sound system works as follows:
     1.5  
     1.6   - jMonkeyEngine uses the =AppSettings= for the particular application
     1.7 -   to determine what sort of =AudioRenderer= should be used.
     1.8 - - Although some support is provided for multiple AudioRendering
     1.9 +   to determine what sort of =AudioRenderer= should be used. 
    1.10 + - Although some support is provided for multiple AudioRendering
    1.11     backends, jMonkeyEngine at the time of this writing will either
    1.12     pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
    1.13   - jMonkeyEngine tries to figure out what sort of system you're
    1.14 @@ -78,7 +78,7 @@
    1.15  
    1.16  Therefore, in order to support multiple listeners, and get the sound
    1.17  data in a form that the AIs can use, it is necessary to create a new
    1.18 -Device which supports this features.
    1.19 +Device which supports this feature.
    1.20  
    1.21  ** The Send Device
    1.22  Adding a device to OpenAL is rather tricky -- there are five separate
    1.23 @@ -823,12 +823,12 @@
    1.24  Together, these three functions define how ears found in a specially
    1.25  prepared blender file will be translated to =Listener= objects in a
    1.26  simulation. =ears= extracts all the children of the top level node
    1.27 -named "ears".  =add-ear!= and =update-listener-velocity!= use
    1.28 +named "ears". =add-ear!= and =update-listener-velocity!= use
    1.29  =bind-sense= to bind a =Listener= object located at the initial
    1.30  position of an "ear" node to the closest physical object in the
    1.31  creature. That =Listener= will stay in the same orientation to the
    1.32  object with which it is bound, just as the camera in the [[http://aurellem.localhost/cortex/html/sense.html#sec-4-1][sense binding
    1.33 -demonstration]].  =OpenAL= simulates the Doppler effect for moving
    1.34 +demonstration]]. =OpenAL= simulates the Doppler effect for moving
    1.35  listeners; =update-listener-velocity!= ensures that this velocity
    1.36  information is always up-to-date.
    1.37  
     2.1 --- a/thesis/cortex.org	Fri Mar 28 17:31:33 2014 -0400
     2.2 +++ b/thesis/cortex.org	Fri Mar 28 18:14:04 2014 -0400
     2.3 @@ -954,7 +954,7 @@
     2.4      #+ATTR_LaTeX: :width 15cm
     2.5      [[./images/physical-hand.png]]
     2.6  
     2.7 -** Eyes reuse standard video game components
     2.8 +** COMMENT Eyes reuse standard video game components
     2.9  
    2.10     Vision is one of the most important senses for humans, so I need to
    2.11     build a simulated sense of vision for my AI. I will do this with
    2.12 @@ -1253,6 +1253,305 @@
    2.13  
    2.14  ** Hearing is hard; =CORTEX= does it right
    2.15  
    2.16 +   At the end of this section I will have simulated ears that work the
    2.17 +   same way as the simulated eyes in the last section. I will be able to
    2.18 +   place any number of ear-nodes in a blender file, and they will bind to
    2.19 +   the closest physical object and follow it as it moves around. Each ear
    2.20 +   will provide access to the sound data it picks up between every frame.
    2.21 +
    2.22 +   Hearing is one of the more difficult senses to simulate, because there
    2.23 +   is less support for obtaining the actual sound data that is processed
    2.24 +   by jMonkeyEngine3. There is no "split-screen" support for rendering
    2.25 +   sound from different points of view, and there is no way to directly
    2.26 +   access the rendered sound data.
    2.27 +
    2.28 +   =CORTEX='s hearing is unique because it avoids the limitations
    2.29 +   found in other simulation environments. As far as I know, no
    2.30 +   other system supports multiple listeners, and the sound demo at
    2.31 +   the end of this section is the first time this has been done in
    2.32 +   a video game environment.
    2.33 +
    2.34 +*** Brief Description of jMonkeyEngine's Sound System
    2.35 +
    2.36 +   jMonkeyEngine's sound system works as follows:
    2.37 +
    2.38 +   - jMonkeyEngine uses the =AppSettings= for the particular
    2.39 +     application to determine what sort of =AudioRenderer= should be
    2.40 +     used (see the configuration sketch after this list).
    2.41 +   - Although some support is provided for multiple AudioRendering
    2.42 +     backends, jMonkeyEngine at the time of this writing will either
    2.43 +     pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
    2.44 +   - jMonkeyEngine tries to figure out what sort of system you're
    2.45 +     running and extracts the appropriate native libraries.
    2.46 +   - The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game
    2.47 +     Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]]
    2.48 +   - =OpenAL= renders the 3D sound and feeds the rendered sound
    2.49 +     directly to any of various sound output devices with which it
    2.50 +     knows how to communicate.
    2.51 +  
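         +   The renderer selection described in the first bullet is driven
         +   entirely by =AppSettings=. The listing below is a minimal sketch
         +   (not part of =CORTEX= itself, and assuming jMonkeyEngine's
         +   =AppSettings/LWJGL_OPENAL= constant) of how an application can
         +   explicitly request the =LwjglAudioRenderer= before it starts.
         +
         +   #+caption: Sketch: requesting the =LwjglAudioRenderer= through
         +   #+caption: jMonkeyEngine's =AppSettings=.
         +   #+name: select-audio-renderer
         +   #+begin_listing clojure
         +(import '(com.jme3.system AppSettings)
         +        '(com.jme3.app SimpleApplication))
         +
         +;; Sketch: explicitly ask jMonkeyEngine for the LWJGL/OpenAL audio
         +;; backend. Passing nil here would select no AudioRenderer at all.
         +(defn request-openal-audio!
         +  "Configure 'app to use the LwjglAudioRenderer (call before start)."
         +  [#^SimpleApplication app]
         +  (let [settings (AppSettings. true)]
         +    (.setAudioRenderer settings AppSettings/LWJGL_OPENAL)
         +    (.setSettings app settings)))
         +   #+end_listing
         +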
    2.52 +   A consequence of this is that there's no way to access the actual
    2.53 +   sound data produced by =OpenAL=. Even worse, =OpenAL= only supports
    2.54 +   one /listener/ (it renders sound data from only one perspective),
    2.55 +   which normally isn't a problem for games, but becomes a problem
    2.56 +   when trying to make multiple AI creatures that can each hear the
    2.57 +   world from a different perspective.
    2.58 +
    2.59 +   To make many AI creatures in jMonkeyEngine that can each hear the
    2.60 +   world from their own perspective, or to make a single creature with
    2.61 +   many ears, it is necessary to go all the way back to =OpenAL= and
    2.62 +   implement support for simulated hearing there.
    2.63 +
    2.64 +*** Extending =OpenAL=
    2.65 +
    2.66 +    Extending =OpenAL= to support multiple listeners requires 500
    2.67 +    lines of =C= code and is too hairy to include here. Instead, I
    2.68 +    will show a small amount of extension code and go over the
    2.69 +    high-level strategy. Full source is of course available with the
    2.70 +    =CORTEX= distribution if you're interested.
    2.71 +
    2.72 +    =OpenAL= goes to great lengths to support many different systems,
    2.73 +    all with different sound capabilities and interfaces. It
    2.74 +    accomplishes this difficult task by providing code for many
    2.75 +    different sound backends in pseudo-objects called /Devices/.
    2.76 +    There's a device for the Linux Open Sound System and the Advanced
    2.77 +    Linux Sound Architecture, there's one for Direct Sound on Windows,
    2.78 +    and there's even one for Solaris. =OpenAL= solves the problem of
    2.79 +    platform independence by providing all these Devices.
    2.80 +
    2.81 +    Wrapper libraries such as LWJGL are free to examine the system on
    2.82 +    which they are running and then select an appropriate device for
    2.83 +    that system.
    2.84 +
    2.85 +    There are also a few "special" devices that don't interface with
    2.86 +    any particular system. These include the Null Device, which
    2.87 +    doesn't do anything, and the Wave Device, which writes whatever
    2.88 +    sound it receives to a file, if everything has been set up
    2.89 +    correctly when configuring =OpenAL=.
    2.90 +
    2.91 +    Actual mixing (Doppler shift and distance/environment-based
    2.92 +    attenuation) of the sound data happens in the Devices, and they
    2.93 +    are the only point in the sound rendering process where this data
    2.94 +    is available.
    2.95 +
    2.96 +    Therefore, in order to support multiple listeners, and get the
    2.97 +    sound data in a form that the AIs can use, it is necessary to
    2.98 +    create a new Device which supports this feature.
    2.99 +
   2.100 +    Adding a device to OpenAL is rather tricky -- there are five
   2.101 +    separate files in the =OpenAL= source tree that must be modified
   2.102 +    to do so. I named my device the "Multiple Audio Send" Device, or
   2.103 +    =Send= Device for short, since it sends audio data back to the
   2.104 +    calling application like an Aux-Send cable on a mixing board.
   2.105 +
   2.106 +    The main idea behind the Send device is to take advantage of the
   2.107 +    fact that LWJGL only manages one /context/ when using OpenAL. A
   2.108 +    /context/ is like a container that holds samples and keeps track
   2.109 +    of where the listener is. In order to support multiple listeners,
   2.110 +    the Send device identifies the LWJGL context as the master
   2.111 +    context, and creates any number of slave contexts to represent
   2.112 +    additional listeners. Every time the device renders sound, it
   2.113 +    synchronizes every source from the master LWJGL context to the
   2.114 +    slave contexts. Then, it renders each context separately, using a
   2.115 +    different listener for each one. The rendered sound is made
   2.116 +    available via JNI to jMonkeyEngine.
   2.117 +
   2.118 +    Switching between contexts is not the normal operation of a
   2.119 +    Device, and one of the problems with doing so is that a Device
   2.120 +    normally keeps around a few pieces of state, such as its
   2.121 +    =ClickRemoval= array, which will become corrupted if the
   2.122 +    contexts are not rendered in parallel. The solution is to create a
   2.123 +    copy of this normally global device state for each context, and
   2.124 +    copy it back and forth into and out of the actual device state
   2.125 +    whenever a context is rendered.
   2.126 +
   2.127 +    The core of the =Send= device is the =syncSources= function, which
   2.128 +    does the job of copying all relevant data from one context to
   2.129 +    another. 
   2.130 +
   2.131 +    #+caption: Program for extending =OpenAL= to support multiple
   2.132 +    #+caption: listeners via context copying/switching.
   2.133 +    #+name: sync-openal-sources
   2.134 +    #+begin_listing C
   2.135 +void syncSources(ALsource *masterSource, ALsource *slaveSource, 
   2.136 +		 ALCcontext *masterCtx, ALCcontext *slaveCtx){
   2.137 +  ALuint master = masterSource->source;
   2.138 +  ALuint slave = slaveSource->source;
   2.139 +  ALCcontext *current = alcGetCurrentContext();
   2.140 +
   2.141 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_PITCH);
   2.142 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_GAIN);
   2.143 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_DISTANCE);
   2.144 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_ROLLOFF_FACTOR);
   2.145 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_REFERENCE_DISTANCE);
   2.146 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MIN_GAIN);
   2.147 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_GAIN);
   2.148 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_GAIN);
   2.149 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_INNER_ANGLE);
   2.150 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_ANGLE);
   2.151 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SEC_OFFSET);
   2.152 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SAMPLE_OFFSET);
   2.153 +  syncSourcef(master,slave,masterCtx,slaveCtx,AL_BYTE_OFFSET);
   2.154 +    
   2.155 +  syncSource3f(master,slave,masterCtx,slaveCtx,AL_POSITION);
   2.156 +  syncSource3f(master,slave,masterCtx,slaveCtx,AL_VELOCITY);
   2.157 +  syncSource3f(master,slave,masterCtx,slaveCtx,AL_DIRECTION);
   2.158 +  
   2.159 +  syncSourcei(master,slave,masterCtx,slaveCtx,AL_SOURCE_RELATIVE);
   2.160 +  syncSourcei(master,slave,masterCtx,slaveCtx,AL_LOOPING);
   2.161 +
   2.162 +  alcMakeContextCurrent(masterCtx);
   2.163 +  ALint source_type;
   2.164 +  alGetSourcei(master, AL_SOURCE_TYPE, &source_type);
   2.165 +
   2.166 +  // Only static sources are currently synchronized! 
   2.167 +  if (AL_STATIC == source_type){
   2.168 +    ALint master_buffer;
   2.169 +    ALint slave_buffer;
   2.170 +    alGetSourcei(master, AL_BUFFER, &master_buffer);
   2.171 +    alcMakeContextCurrent(slaveCtx);
   2.172 +    alGetSourcei(slave, AL_BUFFER, &slave_buffer);
   2.173 +    if (master_buffer != slave_buffer){
   2.174 +      alSourcei(slave, AL_BUFFER, master_buffer);
   2.175 +    }
   2.176 +  }
   2.177 +  
   2.178 +  // Synchronize the state of the two sources.
   2.179 +  alcMakeContextCurrent(masterCtx);
   2.180 +  ALint masterState;
   2.181 +  ALint slaveState;
   2.182 +
   2.183 +  alGetSourcei(master, AL_SOURCE_STATE, &masterState);
   2.184 +  alcMakeContextCurrent(slaveCtx);
   2.185 +  alGetSourcei(slave, AL_SOURCE_STATE, &slaveState);
   2.186 +
   2.187 +  if (masterState != slaveState){
   2.188 +    switch (masterState){
   2.189 +    case AL_INITIAL : alSourceRewind(slave); break;
   2.190 +    case AL_PLAYING : alSourcePlay(slave);   break;
   2.191 +    case AL_PAUSED  : alSourcePause(slave);  break;
   2.192 +    case AL_STOPPED : alSourceStop(slave);   break;
   2.193 +    }
   2.194 +  }
   2.195 +  // Restore whatever context was previously active.
   2.196 +  alcMakeContextCurrent(current);
   2.197 +}
   2.198 +    #+end_listing
   2.199 +
   2.200 +    With this special context-switching device, and some ugly JNI
   2.201 +    bindings that are not worth mentioning, =CORTEX= gains the ability
   2.202 +    to access multiple sound streams from =OpenAL=. 
   2.203 +
   2.204 +    #+caption: Program to create an ear from a blender empty node. The ear
   2.205 +    #+caption: follows around the nearest physical object and passes 
   2.206 +    #+caption: all sensory data to a continuation function.
   2.207 +    #+name: add-ear
   2.208 +    #+begin_listing clojure
   2.209 +(defn add-ear!  
   2.210 +  "Create a Listener centered on the current position of 'ear 
   2.211 +   which follows the closest physical node in 'creature and 
   2.212 +   sends sound data to 'continuation."
   2.213 +  [#^Application world #^Node creature #^Spatial ear continuation]
   2.214 +  (let [target (closest-node creature ear)
   2.215 +        lis (Listener.)
   2.216 +        audio-renderer (.getAudioRenderer world)
   2.217 +        sp (hearing-pipeline continuation)]
   2.218 +    (.setLocation lis (.getWorldTranslation ear))
   2.219 +    (.setRotation lis (.getWorldRotation ear))
   2.220 +    (bind-sense target lis)
   2.221 +    (update-listener-velocity! target lis)
   2.222 +    (.addListener audio-renderer lis)
   2.223 +    (.registerSoundProcessor audio-renderer lis sp)))
   2.224 +    #+end_listing
   2.225 +
   2.226 +    
   2.227 +    The =Send= device, unlike most of the other devices in =OpenAL=,
   2.228 +    does not render sound unless asked. This enables the system to
   2.229 +    slow down or speed up depending on the needs of the AIs who are
   2.230 +    using it to listen. If the device tried to render samples in
   2.231 +    real-time, a complicated AI whose mind takes 100 seconds of
   2.232 +    computer time to simulate 1 second of AI-time would miss almost
   2.233 +    all of the sound in its environment!
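         +
         +    What this on-demand style looks like from the Clojure side can
         +    be sketched as follows; =step-audio!= is only a stand-in name
         +    for the actual JNI entry point into the =Send= device. The point
         +    is the arithmetic: each simulation tick asks for exactly one
         +    tick's worth of sound, no matter how long the tick took in
         +    wall-clock time.
         +
         +    #+caption: Hypothetical sketch of on-demand sound rendering,
         +    #+caption: one simulation tick at a time.
         +    #+name: on-demand-audio
         +    #+begin_listing clojure
         +;; Hypothetical sketch: decouple sound rendering from wall-clock
         +;; time. step-audio! stands in for the real JNI call into the
         +;; Send device.
         +(def sample-rate 44100)    ; audio frames per simulated second
         +(def sim-dt (/ 1.0 60.0))  ; simulated seconds per physics tick
         +
         +(defn render-audio-tick!
         +  "Render exactly one tick's worth of sound, however long the AI
         +   spends thinking during that tick."
         +  [step-audio!]
         +  (step-audio! (Math/round (* sample-rate sim-dt))))
         +    #+end_listing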
   2.234 +
   2.235 +    #+caption: Program to enable arbitrary hearing in =CORTEX=
   2.236 +    #+name: hearing
   2.237 +    #+begin_listing clojure
   2.238 +(defn hearing-kernel
   2.239 +  "Returns a function which returns auditory sensory data when called
   2.240 +   inside a running simulation."
   2.241 +  [#^Node creature #^Spatial ear]
   2.242 +  (let [hearing-data (atom [])
   2.243 +        register-listener!
   2.244 +        (runonce 
   2.245 +         (fn [#^Application world]
   2.246 +           (add-ear!
   2.247 +            world creature ear
   2.248 +            (comp #(reset! hearing-data %)
   2.249 +                  byteBuffer->pulse-vector))))]
   2.250 +    (fn [#^Application world]
   2.251 +      (register-listener! world)
   2.252 +      (let [data @hearing-data
   2.253 +            topology              
   2.254 +            (vec (map #(vector % 0) (range 0 (count data))))]
   2.255 +        [topology data]))))
   2.256 +    
   2.257 +(defn hearing!
   2.258 +  "Endow the creature in a particular world with the sense of
   2.259 +   hearing. Will return a sequence of functions, one for each ear,
   2.260 +   which when called will return the auditory data from that ear."
   2.261 +  [#^Node creature]
   2.262 +  (for [ear (ears creature)]
   2.263 +    (hearing-kernel creature ear)))
   2.264 +    #+end_listing
   2.265 +
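         +    As a usage sketch (assuming a loaded =creature= node and a
         +    running =world=), the functions returned by =hearing!= can
         +    simply be polled once per frame; each call returns the
         +    =[topology data]= pair built by =hearing-kernel=.
         +
         +    #+caption: Usage sketch: polling every hearing function once
         +    #+caption: per frame.
         +    #+name: poll-hearing
         +    #+begin_listing clojure
         +;; Usage sketch: poll every ear once per frame and report how much
         +;; sound each one picked up. The world argument is supplied by the
         +;; running simulation on every frame.
         +(defn print-hearing
         +  [#^Node creature]
         +  (let [ear-fns (hearing! creature)]
         +    (fn [#^Application world]
         +      (doseq [ear-fn ear-fns]
         +        (let [[topology data] (ear-fn world)]
         +          (println (count topology) "channels,"
         +                   (count data) "samples this frame"))))))
         +    #+end_listing
         +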
   2.266 +    Armed with these functions, =CORTEX= is able to test possibly the
   2.267 +    first ever instance of multiple listeners in a simulation built
   2.268 +    on a video game engine!
   2.269 +
   2.270 +    #+caption: Here a simple creature responds to sound by changing
   2.271 +    #+caption: its color from gray to green when the volume of the
   2.272 +    #+caption: loudest sample goes over a threshold.
   2.273 +    #+name: sound-test
   2.274 +    #+begin_listing java
   2.275 +/**
   2.276 + * Respond to sound!  This is the brain of an AI entity that 
   2.277 + * hears its surroundings and reacts to them.
   2.278 + */
   2.279 +public void process(ByteBuffer audioSamples, 
   2.280 +		    int numSamples, AudioFormat format) {
   2.281 +    audioSamples.clear();
   2.282 +    byte[] data = new byte[numSamples];
   2.283 +    float[] out = new float[numSamples];
   2.284 +    audioSamples.get(data);
   2.285 +    FloatSampleTools.
   2.286 +	byte2floatInterleaved
   2.287 +	(data, 0, out, 0, numSamples/format.getFrameSize(), format);
   2.288 +
   2.289 +    float max = Float.NEGATIVE_INFINITY;
   2.290 +    for (float f : out){if (f > max) max = f;}
   2.291 +    audioSamples.clear();
   2.292 +
   2.293 +    if (max > 0.1){
   2.294 +	entity.getMaterial().setColor("Color", ColorRGBA.Green);
   2.295 +    }
   2.296 +    else {
   2.297 +	entity.getMaterial().setColor("Color", ColorRGBA.Gray);
   2.298 +    }
         +}
   2.299 +    #+end_listing
   2.300 +
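         +    The same reaction can be sketched in Clojure using the hearing
         +    functions defined above; =entity= is assumed to be a =Geometry=
         +    whose material exposes a "Color" parameter, exactly as in the
         +    Java listing.
         +
         +    #+caption: Clojure sketch of the same reaction as listing
         +    #+caption: \ref{sound-test}.
         +    #+name: sound-reaction-clj
         +    #+begin_listing clojure
         +;; Sketch: the gray-to-green reaction from listing sound-test,
         +;; driven by the pulse vector a hearing function returns.
         +;; Geometry is com.jme3.scene.Geometry; ColorRGBA is
         +;; com.jme3.math.ColorRGBA.
         +(defn sound-reaction
         +  [hearing-fn #^Geometry entity]
         +  (fn [world]
         +    (let [[_ data] (hearing-fn world)
         +          peak (if (seq data) (apply max data) 0.0)]
         +      (.setColor (.getMaterial entity) "Color"
         +                 (if (> peak 0.1) ColorRGBA/Green ColorRGBA/Gray)))))
         +    #+end_listing
         +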
   2.301 +    #+caption: First ever simulation of multiple listeners in =CORTEX=.
   2.302 +    #+caption: Each cube is a creature which processes sound data with
   2.303 +    #+caption: the =process= function from listing \ref{sound-test}. 
   2.304 +    #+caption: The ball is constantly emitting a pure tone of
   2.305 +    #+caption: constant volume. As it approaches the cubes, they each
   2.306 +    #+caption: change color in response to the sound.
   2.307 +    #+name: sound-cubes
   2.308 +    #+ATTR_LaTeX: :width 10cm
   2.309 +    [[./images/aurellem-gray.png]]
   2.310 +
   2.311 +    This system of hearing has also been co-opted by the
   2.312 +    jMonkeyEngine3 community and is used to record audio for demo
   2.313 +    videos.
   2.314 +
   2.315  ** Touch uses hundreds of hair-like elements
   2.316  
   2.317  ** Proprioception is the sense that makes everything ``real''