comparison thesis/cortex.org @ 472:516a029e0be9

complete first draft of hearing.

| author   | Robert McIntyre <rlm@mit.edu>   |
| date     | Fri, 28 Mar 2014 18:14:04 -0400 |
| parents  | f14fa9e5b67f                    |
| children | 486ce07f5545                    |
#+caption: simulation environment.
#+name: name
#+ATTR_LaTeX: :width 15cm
[[./images/physical-hand.png]]

** COMMENT Eyes reuse standard video game components

Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see
its own version of the world depending on where it is.

[...]

community and is now (in modified form) part of a system for
capturing in-game video to a file.

** Hearing is hard; =CORTEX= does it right

At the end of this section I will have simulated ears that work the
same way as the simulated eyes in the last section. I will be able to
place any number of ear-nodes in a blender file, and they will bind to
the closest physical object and follow it as it moves around. Each ear
will provide access to the sound data it picks up between every frame.

Hearing is one of the more difficult senses to simulate, because there
is less support for obtaining the actual sound data that is processed
by jMonkeyEngine3. There is no "split-screen" support for rendering
sound from different points of view, and there is no way to directly
access the rendered sound data.

=CORTEX='s hearing is unique because it suffers from none of the
limitations of other simulation environments. As far as I know, no
other system supports multiple listeners, and the sound demo at the
end of this section is the first time this has been done in a video
game environment.

*** Brief Description of jMonkeyEngine's Sound System

jMonkeyEngine's sound system works as follows:

- jMonkeyEngine uses the =AppSettings= for the particular
  application to determine what sort of =AudioRenderer= should be
  used.
- Although some support is provided for multiple audio rendering
  backends, jMonkeyEngine at the time of this writing will either
  pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
- jMonkeyEngine tries to figure out what sort of system you're
  running and extracts the appropriate native libraries.
- The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game
  Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]].
- =OpenAL= renders the 3D sound and feeds the rendered sound
  directly to any of various sound output devices with which it
  knows how to communicate.
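
For concreteness, here is a minimal sketch of the first step; this
is my own illustration, not =CORTEX= code. =AppSettings/LWJGL_OPENAL=
is jMonkeyEngine's constant naming the LWJGL backend.

#+caption: Hypothetical sketch of requesting the =LwjglAudioRenderer=
#+caption: through =AppSettings= from Clojure.
#+name: select-audio-renderer
#+begin_listing clojure
(import 'com.jme3.system.AppSettings)

(defn lwjgl-audio-settings
  "Return AppSettings asking jMonkeyEngine for the
   LwjglAudioRenderer (OpenAL via LWJGL). The `true` argument
   loads jMonkeyEngine's default settings first."
  []
  (doto (AppSettings. true)
    (.setAudioRenderer AppSettings/LWJGL_OPENAL)))
#+end_listing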

A consequence of this is that there's no way to access the actual
sound data produced by =OpenAL=. Even worse, =OpenAL= only supports
one /listener/ (it renders sound data from only one perspective),
which normally isn't a problem for games, but becomes a problem
when trying to make multiple AI creatures that can each hear the
world from a different perspective.

To make many AI creatures in jMonkeyEngine that can each hear the
world from their own perspective, or to make a single creature with
many ears, it is necessary to go all the way back to =OpenAL= and
implement support for simulated hearing there.

*** Extending =OpenAL=

Extending =OpenAL= to support multiple listeners requires 500
lines of =C= code and is too hairy to mention here. Instead, I
will show a small amount of extension code and go over the
high-level strategy. Full source is of course available with the
=CORTEX= distribution if you're interested.

=OpenAL= goes to great lengths to support many different systems,
all with different sound capabilities and interfaces. It
accomplishes this difficult task by providing code for many
different sound backends in pseudo-objects called /Devices/.
There's a device for the Linux Open Sound System and the Advanced
Linux Sound Architecture, there's one for Direct Sound on Windows,
and there's even one for Solaris. =OpenAL= solves the problem of
platform independence by providing all these Devices.

Wrapper libraries such as LWJGL are free to examine the system on
which they are running and then select an appropriate device for
that system.

There are also a few "special" devices that don't interface with
any particular system. These include the Null Device, which
doesn't do anything, and the Wave Device, which writes whatever
sound it receives to a file, if everything has been set up
correctly when configuring =OpenAL=.
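
As an illustration (again mine, not =CORTEX= code), LWJGL lets the
application name the device it wants when the =OpenAL= context is
created; "Wave File Writer" is the name OpenAL-Soft gives its
file-writing Wave device.

#+caption: Hedged sketch of opening a named =OpenAL= device
#+caption: through LWJGL.
#+name: open-named-device
#+begin_listing clojure
(import 'org.lwjgl.openal.AL)

;; Open the Wave device instead of a real sound card. The sample
;; rate (44100 Hz) and refresh rate (60 Hz) are example values.
(AL/create "Wave File Writer" 44100 60 false)
#+end_listing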

Actual mixing (Doppler shift and distance- and environment-based
attenuation) of the sound data happens in the Devices, and they
are the only point in the sound rendering process where this data
is available.

Therefore, in order to support multiple listeners, and get the
sound data in a form that the AIs can use, it is necessary to
create a new Device which supports this feature.

Adding a device to =OpenAL= is rather tricky -- there are five
separate files in the =OpenAL= source tree that must be modified
to do so. I named my device the "Multiple Audio Send" Device, or
=Send= Device for short, since it sends audio data back to the
calling application like an Aux-Send cable on a mixing board.

The main idea behind the =Send= device is to take advantage of the
fact that LWJGL only manages one /context/ when using =OpenAL=. A
/context/ is like a container that holds samples and keeps track
of where the listener is. In order to support multiple listeners,
the =Send= device identifies the LWJGL context as the master
context, and creates any number of slave contexts to represent
additional listeners. Every time the device renders sound, it
synchronizes every source from the master LWJGL context to the
slave contexts. Then, it renders each context separately, using a
different listener for each one. The rendered sound is made
available via JNI to jMonkeyEngine.
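
To make the master/slave idea concrete, here is a sketch of my own
(the real =Send= device does this in C inside =OpenAL= itself) of
creating an extra context on the same device through LWJGL's ALC
bindings.

#+caption: Hedged sketch of one =OpenAL= context per listener,
#+caption: using LWJGL's ALC bindings.
#+name: slave-context-sketch
#+begin_listing clojure
(import '(org.lwjgl.openal AL ALC10))

;; The context LWJGL created is treated as the master; each
;; additional listener gets its own slave context on the device.
(def master-context (ALC10/alcGetCurrentContext))
(def slave-context (ALC10/alcCreateContext (AL/getDevice) nil))
#+end_listing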

Switching between contexts is not the normal operation of a
Device, and one of the problems with doing so is that a Device
normally keeps around a few pieces of state, such as its
=ClickRemoval= array, which will become corrupted if the
contexts are not rendered in parallel. The solution is to create a
copy of this normally global device state for each context, and
copy it back and forth into and out of the actual device state
whenever a context is rendered.

The core of the =Send= device is the =syncSources= function, which
does the job of copying all relevant data from one context to
another.

#+caption: Program for extending =OpenAL= to support multiple
#+caption: listeners via context copying/switching.
#+name: sync-openal-sources
#+begin_listing C
void syncSources(ALsource *masterSource, ALsource *slaveSource,
                 ALCcontext *masterCtx, ALCcontext *slaveCtx){
  ALuint master = masterSource->source;
  ALuint slave = slaveSource->source;
  ALCcontext *current = alcGetCurrentContext();

  syncSourcef(master,slave,masterCtx,slaveCtx,AL_PITCH);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_DISTANCE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_ROLLOFF_FACTOR);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_REFERENCE_DISTANCE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MIN_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_INNER_ANGLE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_ANGLE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SEC_OFFSET);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SAMPLE_OFFSET);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_BYTE_OFFSET);

  syncSource3f(master,slave,masterCtx,slaveCtx,AL_POSITION);
  syncSource3f(master,slave,masterCtx,slaveCtx,AL_VELOCITY);
  syncSource3f(master,slave,masterCtx,slaveCtx,AL_DIRECTION);

  syncSourcei(master,slave,masterCtx,slaveCtx,AL_SOURCE_RELATIVE);
  syncSourcei(master,slave,masterCtx,slaveCtx,AL_LOOPING);

  alcMakeContextCurrent(masterCtx);
  ALint source_type;
  alGetSourcei(master, AL_SOURCE_TYPE, &source_type);

  // Only static sources are currently synchronized!
  if (AL_STATIC == source_type){
    ALint master_buffer;
    ALint slave_buffer;
    alGetSourcei(master, AL_BUFFER, &master_buffer);
    alcMakeContextCurrent(slaveCtx);
    alGetSourcei(slave, AL_BUFFER, &slave_buffer);
    if (master_buffer != slave_buffer){
      alSourcei(slave, AL_BUFFER, master_buffer);
    }
  }

  // Synchronize the state of the two sources.
  alcMakeContextCurrent(masterCtx);
  ALint masterState;
  ALint slaveState;

  alGetSourcei(master, AL_SOURCE_STATE, &masterState);
  alcMakeContextCurrent(slaveCtx);
  alGetSourcei(slave, AL_SOURCE_STATE, &slaveState);

  if (masterState != slaveState){
    switch (masterState){
    case AL_INITIAL : alSourceRewind(slave); break;
    case AL_PLAYING : alSourcePlay(slave);   break;
    case AL_PAUSED  : alSourcePause(slave);  break;
    case AL_STOPPED : alSourceStop(slave);   break;
    }
  }
  // Restore whatever context was previously active.
  alcMakeContextCurrent(current);
}
#+end_listing

With this special context-switching device, and some ugly JNI
bindings that are not worth mentioning, =CORTEX= gains the ability
to access multiple sound streams from =OpenAL=.

#+caption: Program to create an ear from a blender empty node. The ear
#+caption: follows around the nearest physical object and passes
#+caption: all sensory data to a continuation function.
#+name: add-ear
#+begin_listing clojure
(defn add-ear!
  "Create a Listener centered on the current position of 'ear
   which follows the closest physical node in 'creature and
   sends sound data to 'continuation."
  [#^Application world #^Node creature #^Spatial ear continuation]
  (let [target (closest-node creature ear)
        lis (Listener.)
        audio-renderer (.getAudioRenderer world)
        sp (hearing-pipeline continuation)]
    (.setLocation lis (.getWorldTranslation ear))
    (.setRotation lis (.getWorldRotation ear))
    (bind-sense target lis)
    (update-listener-velocity! target lis)
    (.addListener audio-renderer lis)
    (.registerSoundProcessor audio-renderer lis sp)))
#+end_listing

The =Send= device, unlike most of the other devices in =OpenAL=,
does not render sound unless asked. This enables the system to
slow down or speed up depending on the needs of the AIs who are
using it to listen. If the device tried to render samples in
real-time, a complicated AI whose mind takes 100 seconds of
computer time to simulate 1 second of AI-time would miss almost
all of the sound in its environment!

#+caption: Program to enable arbitrary hearing in =CORTEX=
#+name: hearing
#+begin_listing clojure
(defn hearing-kernel
  "Returns a function which returns auditory sensory data when called
   inside a running simulation."
  [#^Node creature #^Spatial ear]
  (let [hearing-data (atom [])
        register-listener!
        (runonce
         (fn [#^Application world]
           (add-ear!
            world creature ear
            (comp #(reset! hearing-data %)
                  byteBuffer->pulse-vector))))]
    (fn [#^Application world]
      (register-listener! world)
      (let [data @hearing-data
            topology
            (vec (map #(vector % 0) (range 0 (count data))))]
        [topology data]))))

(defn hearing!
  "Endow the creature in a particular world with the sense of
   hearing. Will return a sequence of functions, one for each ear,
   which when called will return the auditory data from that ear."
  [#^Node creature]
  (for [ear (ears creature)]
    (hearing-kernel creature ear)))
#+end_listing

Armed with these functions, =CORTEX= is able to test possibly the
first ever instance of multiple listeners in a simulation based on
a video game engine!
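
For illustration, here is a short usage sketch of my own, showing
how the functions returned by =hearing!= might be polled inside a
running simulation; =creature= and =world= are assumed to exist as
in the rest of this section.

#+caption: Hypothetical usage sketch: polling every ear once per
#+caption: simulation step.
#+name: poll-ears
#+begin_listing clojure
;; ear-fns comes from (hearing! creature); each element is a
;; function of the world returning a [topology data] pair.
(defn hear-everything
  "Return the latest [topology data] pair from each ear."
  [ear-fns world]
  (vec (for [hear! ear-fns]
         (hear! world))))

;; e.g. (def ear-fns (hearing! creature))
;;      (hear-everything ear-fns world)
#+end_listing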

#+caption: Here a simple creature responds to sound by changing
#+caption: its color from gray to green when the peak volume
#+caption: goes over a threshold.
#+name: sound-test
#+begin_listing java
/**
 * Respond to sound! This is the brain of an AI entity that
 * hears its surroundings and reacts to them.
 */
public void process(ByteBuffer audioSamples,
                    int numSamples, AudioFormat format) {
  audioSamples.clear();
  byte[] data = new byte[numSamples];
  float[] out = new float[numSamples];
  audioSamples.get(data);
  FloatSampleTools.byte2floatInterleaved(
    data, 0, out, 0, numSamples/format.getFrameSize(), format);

  float max = Float.NEGATIVE_INFINITY;
  for (float f : out){if (f > max) max = f;}
  audioSamples.clear();

  if (max > 0.1){
    entity.getMaterial().setColor("Color", ColorRGBA.Green);
  }
  else {
    entity.getMaterial().setColor("Color", ColorRGBA.Gray);
  }
}
#+end_listing

#+caption: First ever simulation of multiple listeners in =CORTEX=.
#+caption: Each cube is a creature which processes sound data with
#+caption: the =process= function from listing \ref{sound-test}.
#+caption: The ball is constantly emitting a pure tone of
#+caption: constant volume. As it approaches the cubes, they each
#+caption: change color in response to the sound.
#+name: sound-cubes
#+ATTR_LaTeX: :width 10cm
[[./images/aurellem-gray.png]]

This system of hearing has also been co-opted by the
jMonkeyEngine3 community and is used to record audio for demo
videos.

** Touch uses hundreds of hair-like elements

** Proprioception is the sense that makes everything ``real''

** Muscles are both effectors and sensors