changeset 472:516a029e0be9
complete first draft of hearing.
author    Robert McIntyre <rlm@mit.edu>
date      Fri, 28 Mar 2014 18:14:04 -0400
parents   f14fa9e5b67f
children  486ce07f5545
files     org/hearing.org thesis/cortex.org
diffstat  2 files changed, 305 insertions(+), 6 deletions(-)
--- a/org/hearing.org	Fri Mar 28 17:31:33 2014 -0400
+++ b/org/hearing.org	Fri Mar 28 18:14:04 2014 -0400
@@ -26,8 +26,8 @@
 jMonkeyEngine's sound system works as follows:
 
 - jMonkeyEngine uses the =AppSettings= for the particular application
-  to determine what sort of =AudioRenderer= should be used.
- - Although some support is provided for multiple AudioRendering
+  to determine what sort of =AudioRenderer= should be used.
+- Although some support is provided for multiple AudioRendering
   backends, jMonkeyEngine at the time of this writing will either
   pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
 - jMonkeyEngine tries to figure out what sort of system you're
@@ -78,7 +78,7 @@
 
 Therefore, in order to support multiple listeners, and get the sound
 data in a form that the AIs can use, it is necessary to create a new
-Device which supports this features.
+Device which supports this feature.
 
 ** The Send Device
 Adding a device to OpenAL is rather tricky -- there are five separate
@@ -823,12 +823,12 @@
 Together, these three functions define how ears found in a specially
 prepared blender file will be translated to =Listener= objects in a
 simulation. =ears= extracts all the children of to top level node
-named "ears". =add-ear!= and =update-listener-velocity!= use 
+named "ears". =add-ear!= and =update-listener-velocity!= use
 =bind-sense= to bind a =Listener= object located at the initial
 position of an "ear" node to the closest physical object in the
 creature. That =Listener= will stay in the same orientation to the
 object with which it is bound, just as the camera in the [[http://aurellem.localhost/cortex/html/sense.html#sec-4-1][sense binding
-demonstration]]. =OpenAL= simulates the Doppler effect for moving 
+demonstration]]. =OpenAL= simulates the Doppler effect for moving
 listeners, =update-listener-velocity!= ensures that this velocity
 information is always up-to-date.
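
The =update-listener-velocity!= function mentioned in the hunk above keeps each
=Listener='s velocity current so that =OpenAL= can apply the correct Doppler
shift. The changeset does not show its body; as a rough sketch (not the actual
code in =org/hearing.org=), such a function can attach a control to the bound
node and difference its position between frames:

#+begin_listing clojure
(import '(com.jme3.audio Listener)
        '(com.jme3.scene Spatial)
        '(com.jme3.scene.control AbstractControl))

;; Sketch only -- not the changeset's actual implementation.
;; Every frame, set the listener's velocity to
;; (new-position - old-position) / time-step.
(defn update-listener-velocity!
  [#^Spatial target #^Listener lis]
  (let [old-position (atom (.clone (.getWorldTranslation target)))]
    (.addControl
     target
     (proxy [AbstractControl] []
       (controlUpdate [tpf]
         (let [new-position (.getWorldTranslation target)]
           (.setVelocity lis
                         (.mult (.subtract new-position @old-position)
                                (float (/ 1 tpf))))
           (reset! old-position (.clone new-position))))
       (controlRender [_ _])))))
#+end_listing
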
--- a/thesis/cortex.org	Fri Mar 28 17:31:33 2014 -0400
+++ b/thesis/cortex.org	Fri Mar 28 18:14:04 2014 -0400
@@ -954,7 +954,7 @@
 #+ATTR_LaTeX: :width 15cm
 [[./images/physical-hand.png]]
 
-** Eyes reuse standard video game components
+** COMMENT Eyes reuse standard video game components
 
 Vision is one of the most important senses for humans, so I need to
 build a simulated sense of vision for my AI. I will do this with
@@ -1253,6 +1253,305 @@
 
 ** Hearing is hard; =CORTEX= does it right
 
+   At the end of this section I will have simulated ears that work the
+   same way as the simulated eyes in the last section. I will be able
+   to place any number of ear-nodes in a blender file, and they will
+   bind to the closest physical object and follow it as it moves
+   around. Each ear will provide access to the sound data it picks up
+   between frames.
+
+   Hearing is one of the more difficult senses to simulate, because
+   there is less support for obtaining the actual sound data that is
+   processed by jMonkeyEngine3. There is no "split-screen" support for
+   rendering sound from different points of view, and there is no way
+   to directly access the rendered sound data.
+
+   =CORTEX='s hearing is unique because it does not share the
+   limitations of other simulation environments. As far as I know,
+   there is no other system that supports multiple listeners, and the
+   sound demo at the end of this section is the first time this has
+   been done in a video game environment.
+
+*** Brief Description of jMonkeyEngine's Sound System
+
+   jMonkeyEngine's sound system works as follows:
+
+   - jMonkeyEngine uses the =AppSettings= for the particular
+     application to determine what sort of =AudioRenderer= should be
+     used (see the snippet after this list).
+   - Although some support is provided for multiple AudioRendering
+     backends, jMonkeyEngine at the time of this writing will either
+     pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
+   - jMonkeyEngine tries to figure out what sort of system you're
+     running and extracts the appropriate native libraries.
+   - The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game
+     Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]].
+   - =OpenAL= renders the 3D sound and feeds the rendered sound
+     directly to any of various sound output devices with which it
+     knows how to communicate.
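+
+   For illustration, here is a hypothetical snippet (this is not
+   =CORTEX='s actual setup code) showing how an application would
+   normally ask =AppSettings= for the LWJGL-backed =OpenAL= renderer:
+
+   #+caption: Hypothetical snippet: selecting the LWJGL/=OpenAL= audio
+   #+caption: renderer through =AppSettings=.
+   #+name: app-settings-audio
+   #+begin_listing clojure
+(import '(com.jme3.system AppSettings))
+
+;; Sketch only: passing these settings to an Application before it
+;; starts determines which AudioRenderer jMonkeyEngine constructs.
+(doto (AppSettings. true)
+  (.setAudioRenderer AppSettings/LWJGL_OPENAL))
+   #+end_listing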
+
+   A consequence of this is that there's no way to access the actual
+   sound data produced by =OpenAL=. Even worse, =OpenAL= only supports
+   one /listener/ (it renders sound data from only one perspective),
+   which normally isn't a problem for games, but becomes a problem
+   when trying to make multiple AI creatures that can each hear the
+   world from a different perspective.
+
+   To make many AI creatures in jMonkeyEngine that can each hear the
+   world from their own perspective, or to make a single creature with
+   many ears, it is necessary to go all the way back to =OpenAL= and
+   implement support for simulated hearing there.
+
+*** Extending =OpenAL=
+
+   Extending =OpenAL= to support multiple listeners requires 500
+   lines of =C= code and is too hairy to mention here. Instead, I
+   will show a small amount of extension code and go over the high
+   level strategy. Full source is of course available with the
+   =CORTEX= distribution if you're interested.
+
+   =OpenAL= goes to great lengths to support many different systems,
+   all with different sound capabilities and interfaces. It
+   accomplishes this difficult task by providing code for many
+   different sound backends in pseudo-objects called /Devices/.
+   There's a device for the Linux Open Sound System and the Advanced
+   Linux Sound Architecture, there's one for Direct Sound on Windows,
+   and there's even one for Solaris. =OpenAL= solves the problem of
+   platform independence by providing all these Devices.
+
+   Wrapper libraries such as LWJGL are free to examine the system on
+   which they are running and then select an appropriate device for
+   that system.
+
+   There are also a few "special" devices that don't interface with
+   any particular system. These include the Null Device, which
+   doesn't do anything, and the Wave Device, which writes whatever
+   sound it receives to a file, if everything has been set up
+   correctly when configuring =OpenAL=.
+
+   Actual mixing (Doppler shift and distance- and environment-based
+   attenuation) of the sound data happens in the Devices, and they
+   are the only point in the sound rendering process where this data
+   is available.
+
+   Therefore, in order to support multiple listeners, and get the
+   sound data in a form that the AIs can use, it is necessary to
+   create a new Device which supports this feature.
+
+   Adding a device to OpenAL is rather tricky -- there are five
+   separate files in the =OpenAL= source tree that must be modified
+   to do so. I named my device the "Multiple Audio Send" Device, or
+   =Send= Device for short, since it sends audio data back to the
+   calling application like an Aux-Send cable on a mixing board.
+
+   The main idea behind the Send device is to take advantage of the
+   fact that LWJGL only manages one /context/ when using OpenAL. A
+   /context/ is like a container that holds samples and keeps track
+   of where the listener is. In order to support multiple listeners,
+   the Send device identifies the LWJGL context as the master
+   context, and creates any number of slave contexts to represent
+   additional listeners. Every time the device renders sound, it
+   synchronizes every source from the master LWJGL context to the
+   slave contexts. Then, it renders each context separately, using a
+   different listener for each one. The rendered sound is made
+   available via JNI to jMonkeyEngine.
+
+   Switching between contexts is not the normal operation of a
+   Device, and one of the problems with doing so is that a Device
+   normally keeps around a few pieces of global state, such as its
+   =ClickRemoval= array, which will become corrupted if the contexts
+   are not rendered in parallel. The solution is to create a copy of
+   this normally global device state for each context, and copy it
+   back and forth into and out of the actual device state whenever a
+   context is rendered.
+
+   The core of the =Send= device is the =syncSources= function, which
+   does the job of copying all relevant data from one context to
+   another.
+
+   #+caption: Program for extending =OpenAL= to support multiple
+   #+caption: listeners via context copying/switching.
+   #+name: sync-openal-sources
+   #+begin_listing C
+void syncSources(ALsource *masterSource, ALsource *slaveSource,
+                 ALCcontext *masterCtx, ALCcontext *slaveCtx){
+  ALuint master = masterSource->source;
+  ALuint slave = slaveSource->source;
+  ALCcontext *current = alcGetCurrentContext();
+
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_PITCH);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_GAIN);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_DISTANCE);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_ROLLOFF_FACTOR);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_REFERENCE_DISTANCE);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MIN_GAIN);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_GAIN);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_GAIN);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_INNER_ANGLE);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_ANGLE);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SEC_OFFSET);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SAMPLE_OFFSET);
+  syncSourcef(master,slave,masterCtx,slaveCtx,AL_BYTE_OFFSET);
+
+  syncSource3f(master,slave,masterCtx,slaveCtx,AL_POSITION);
+  syncSource3f(master,slave,masterCtx,slaveCtx,AL_VELOCITY);
+  syncSource3f(master,slave,masterCtx,slaveCtx,AL_DIRECTION);
+
+  syncSourcei(master,slave,masterCtx,slaveCtx,AL_SOURCE_RELATIVE);
+  syncSourcei(master,slave,masterCtx,slaveCtx,AL_LOOPING);
+
+  alcMakeContextCurrent(masterCtx);
+  ALint source_type;
+  alGetSourcei(master, AL_SOURCE_TYPE, &source_type);
+
+  // Only static sources are currently synchronized!
+  if (AL_STATIC == source_type){
+    ALint master_buffer;
+    ALint slave_buffer;
+    alGetSourcei(master, AL_BUFFER, &master_buffer);
+    alcMakeContextCurrent(slaveCtx);
+    alGetSourcei(slave, AL_BUFFER, &slave_buffer);
+    if (master_buffer != slave_buffer){
+      alSourcei(slave, AL_BUFFER, master_buffer);
+    }
+  }
+
+  // Synchronize the state of the two sources.
+  alcMakeContextCurrent(masterCtx);
+  ALint masterState;
+  ALint slaveState;
+
+  alGetSourcei(master, AL_SOURCE_STATE, &masterState);
+  alcMakeContextCurrent(slaveCtx);
+  alGetSourcei(slave, AL_SOURCE_STATE, &slaveState);
+
+  if (masterState != slaveState){
+    switch (masterState){
+    case AL_INITIAL : alSourceRewind(slave); break;
+    case AL_PLAYING : alSourcePlay(slave);   break;
+    case AL_PAUSED  : alSourcePause(slave);  break;
+    case AL_STOPPED : alSourceStop(slave);   break;
+    }
+  }
+  // Restore whatever context was previously active.
+  alcMakeContextCurrent(current);
+}
+   #+end_listing
+
+   With this special context-switching device, and some ugly JNI
+   bindings that are not worth mentioning, =CORTEX= gains the ability
+   to access multiple sound streams from =OpenAL=.
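+
+   The =add-ear!= function below relies on a helper called
+   =hearing-pipeline= that this changeset does not show. Roughly (a
+   sketch, not the exact code from the =CORTEX= distribution, and the
+   =cleanup= hook is assumed), it wraps a continuation function in the
+   =SoundProcessor= interface exposed by those JNI bindings, so that
+   each frame of rendered audio is handed back to Clojure:
+
+   #+caption: Hypothetical sketch of =hearing-pipeline=, which wraps a
+   #+caption: continuation function in a =SoundProcessor=.
+   #+name: hearing-pipeline-sketch
+   #+begin_listing clojure
+;; Sketch only. SoundProcessor comes with CORTEX's JNI bindings; its
+;; process method receives a java.nio.ByteBuffer of rendered samples,
+;; a sample count, and a javax.sound.sampled.AudioFormat.
+(defn hearing-pipeline
+  "Create a SoundProcessor which forwards each frame of rendered
+   audio to 'continuation."
+  [continuation]
+  (proxy [SoundProcessor] []
+    (cleanup [])
+    (process [#^ByteBuffer audioSamples numSamples #^AudioFormat audioFormat]
+      (continuation audioSamples numSamples audioFormat))))
+   #+end_listing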
+
+   #+caption: Program to create an ear from a blender empty node. The ear
+   #+caption: follows around the nearest physical object and passes
+   #+caption: all sensory data to a continuation function.
+   #+name: add-ear
+   #+begin_listing clojure
+(defn add-ear!
+  "Create a Listener centered on the current position of 'ear
+   which follows the closest physical node in 'creature and
+   sends sound data to 'continuation."
+  [#^Application world #^Node creature #^Spatial ear continuation]
+  (let [target (closest-node creature ear)
+        lis (Listener.)
+        audio-renderer (.getAudioRenderer world)
+        sp (hearing-pipeline continuation)]
+    (.setLocation lis (.getWorldTranslation ear))
+    (.setRotation lis (.getWorldRotation ear))
+    (bind-sense target lis)
+    (update-listener-velocity! target lis)
+    (.addListener audio-renderer lis)
+    (.registerSoundProcessor audio-renderer lis sp)))
+   #+end_listing
+
+   The =Send= device, unlike most of the other devices in =OpenAL=,
+   does not render sound unless asked. This enables the system to
+   slow down or speed up depending on the needs of the AIs who are
+   using it to listen. If the device tried to render samples in
+   real-time, a complicated AI whose mind takes 100 seconds of
+   computer time to simulate 1 second of AI-time would miss almost
+   all of the sound in its environment!
+
+   #+caption: Program to enable arbitrary hearing in =CORTEX=
+   #+name: hearing
+   #+begin_listing clojure
+(defn hearing-kernel
+  "Returns a function which returns auditory sensory data when called
+   inside a running simulation."
+  [#^Node creature #^Spatial ear]
+  (let [hearing-data (atom [])
+        register-listener!
+        (runonce
+         (fn [#^Application world]
+           (add-ear!
+            world creature ear
+            (comp #(reset! hearing-data %)
+                  byteBuffer->pulse-vector))))]
+    (fn [#^Application world]
+      (register-listener! world)
+      (let [data @hearing-data
+            topology
+            (vec (map #(vector % 0) (range 0 (count data))))]
+        [topology data]))))
+
+(defn hearing!
+  "Endow the creature in a particular world with the sense of
+   hearing. Will return a sequence of functions, one for each ear,
+   which when called will return the auditory data from that ear."
+  [#^Node creature]
+  (for [ear (ears creature)]
+    (hearing-kernel creature ear)))
+   #+end_listing
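+
+   =hearing-kernel= converts each frame of raw audio into a flat
+   vector of samples using =byteBuffer->pulse-vector=, which is not
+   reproduced in this changeset. A rough sketch of what it does,
+   assuming the same =FloatSampleTools= helper from the Tritonus
+   sound library that the Java listing below uses:
+
+   #+caption: Hypothetical sketch of =byteBuffer->pulse-vector=, which
+   #+caption: converts a frame of PCM audio into a vector of floats.
+   #+name: pulse-vector-sketch
+   #+begin_listing clojure
+(import '(java.nio ByteBuffer)
+        '(javax.sound.sampled AudioFormat)
+        '(org.tritonus.share.sampled FloatSampleTools))
+
+;; Sketch only -- the actual implementation ships with CORTEX.
+(defn byteBuffer->pulse-vector
+  "Copy the rendered PCM samples out of 'audioSamples and convert
+   them to a Clojure vector of floats in the range [-1.0, 1.0]."
+  [#^ByteBuffer audioSamples numSamples #^AudioFormat audioFormat]
+  (let [frames (quot numSamples (.getFrameSize audioFormat))
+        bytes  (byte-array numSamples)
+        floats (float-array (* frames (.getChannels audioFormat)))]
+    (.clear audioSamples)
+    (.get audioSamples bytes 0 numSamples)
+    (FloatSampleTools/byte2floatInterleaved
+     bytes 0 floats 0 frames audioFormat)
+    (vec floats)))
+   #+end_listing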
+
+   Armed with these functions, =CORTEX= is able to test possibly the
+   first ever instance of multiple listeners in a video-game-engine-
+   based simulation!
+
+   #+caption: Here a simple creature responds to sound by changing
+   #+caption: its color from gray to green when the total volume
+   #+caption: goes over a threshold.
+   #+name: sound-test
+   #+begin_listing java
+/**
+ * Respond to sound! This is the brain of an AI entity that
+ * hears its surroundings and reacts to them.
+ */
+public void process(ByteBuffer audioSamples,
+                    int numSamples, AudioFormat format) {
+    audioSamples.clear();
+    byte[] data = new byte[numSamples];
+    float[] out = new float[numSamples];
+    audioSamples.get(data);
+    FloatSampleTools.
+        byte2floatInterleaved
+        (data, 0, out, 0, numSamples/format.getFrameSize(), format);
+
+    float max = Float.NEGATIVE_INFINITY;
+    for (float f : out){if (f > max) max = f;}
+    audioSamples.clear();
+
+    if (max > 0.1){
+        entity.getMaterial().setColor("Color", ColorRGBA.Green);
+    }
+    else {
+        entity.getMaterial().setColor("Color", ColorRGBA.Gray);
+    }
+}
+   #+end_listing
+
+   #+caption: First ever simulation of multiple listeners in =CORTEX=.
+   #+caption: Each cube is a creature which processes sound data with
+   #+caption: the =process= function from listing \ref{sound-test}.
+   #+caption: The ball is constantly emitting a pure tone of
+   #+caption: constant volume. As it approaches the cubes, they each
+   #+caption: change color in response to the sound.
+   #+name: sound-cubes
+   #+ATTR_LaTeX: :width 10cm
+   [[./images/aurellem-gray.png]]
+
+   This system of hearing has also been co-opted by the
+   jMonkeyEngine3 community and is used to record audio for demo
+   videos.
+
 ** Touch uses hundreds of hair-like elements
 
 ** Proprioception is the sense that makes everything ``real''