# HG changeset patch # User Robert McIntyre # Date 1328969318 25200 # Node ID c5f6d880558b015ca512f5697406395abd6b6da2 # Parent 5f14fd7b12885ffab6fb93d39bf0c520418fe6a0 making hearing.org up-to-date diff -r 5f14fd7b1288 -r c5f6d880558b org/hearing.org --- a/org/hearing.org Sat Feb 11 00:51:54 2012 -0700 +++ b/org/hearing.org Sat Feb 11 07:08:38 2012 -0700 @@ -9,39 +9,46 @@ * Hearing -I want to be able to place ears in a similar manner to how I place -the eyes. I want to be able to place ears in a unique spatial -position, and receive as output at every tick the F.F.T. of whatever -signals are happening at that point. +At the end of this post I will have simulated ears that work the same +way as the simulated eyes in the last post. I will be able to place +any number of ear-nodes in a blender file, and they will bind to the +closest physical object and follow it as it moves around. Each ear +will provide access to the sound data it picks up between every frame. Hearing is one of the more difficult senses to simulate, because there is less support for obtaining the actual sound data that is processed -by jMonkeyEngine3. +by jMonkeyEngine3. There is no "split-screen" support for rendering +sound from different points of view, and there is no way to directly +access the rendered sound data. + +** Brief Description of jMonkeyEngine's Sound System jMonkeyEngine's sound system works as follows: - jMonkeyEngine uses the =AppSettings= for the particular application to determine what sort of =AudioRenderer= should be used. - - although some support is provided for multiple AudioRendering + - Although some support is provided for multiple AudioRendering backends, jMonkeyEngine at the time of this writing will either - pick no AudioRenderer at all, or the =LwjglAudioRenderer= + pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=. - jMonkeyEngine tries to figure out what sort of system you're running and extracts the appropriate native libraries. - - the =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game + - The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]] - - =OpenAL= calculates the 3D sound localization and feeds a stream of - sound to any of various sound output devices with which it knows - how to communicate. + - =OpenAL= renders the 3D sound and feeds the rendered sound directly + to any of various sound output devices with which it knows how to + communicate. A consequence of this is that there's no way to access the actual -sound data produced by =OpenAL=. Even worse, =OpenAL= only supports -one /listener/, which normally isn't a problem for games, but becomes -a problem when trying to make multiple AI creatures that can each hear -the world from a different perspective. +sound data produced by =OpenAL=. Even worse, =OpenAL= only supports +one /listener/ (it renders sound data from only one perspective), +which normally isn't a problem for games, but becomes a problem when +trying to make multiple AI creatures that can each hear the world from +a different perspective. To make many AI creatures in jMonkeyEngine that can each hear the -world from their own perspective, it is necessary to go all the way -back to =OpenAL= and implement support for simulated hearing there. 
+world from their own perspective, or to make a single creature with
+many ears, it is necessary to go all the way back to =OpenAL= and
+implement support for simulated hearing there.

* Extending =OpenAL=
** =OpenAL= Devices

@@ -71,22 +78,25 @@
Therefore, in order to support multiple listeners, and get the sound
data in a form that the AIs can use, it is necessary to create a new
-Device, which supports this features.
+Device which supports these features.

** The Send Device

Adding a device to OpenAL is rather tricky -- there are five separate
files in the =OpenAL= source tree that must be modified to do so. I've
-documented this process [[./add-new-device.org][here]] for anyone who is interested.
+documented this process [[../../audio-send/html/add-new-device.html][here]] for anyone who is interested.
-
-Onward to that actual Device!
-
-again, my objectives are:
+Again, my objectives are:
 - Support Multiple Listeners from jMonkeyEngine3
 - Get access to the rendered sound data for further processing from
   clojure.
+I named it the "Multiple Audio Send" Device, or =Send= Device for
+short, since it sends audio data back to the calling application like
+an Aux-Send cable on a mixing board.
+
+Onward to the actual Device!
+
** =send.c=
** Header
@@ -172,7 +182,7 @@
Switching between contexts is not the normal operation of a Device,
and one of the problems with doing so is that a Device normally keeps
around a few pieces of state such as the =ClickRemoval= array above
-which will become corrupted if the contexts are not done in
+which will become corrupted if the contexts are not rendered in
parallel. The solution is to create a copy of this normally global
device state for each context, and copy it back and forth into and out
of the actual device state whenever a context is rendered.
@@ -398,13 +408,13 @@
}
#+end_src
-=OpenAL= normally renders all Contexts in parallel, outputting the
+=OpenAL= normally renders all contexts in parallel, outputting the
whole result to the buffer. It does this by iterating over the
Device->Contexts array and rendering each context to the buffer in
turn. By temporarily setting Device->NumContexts to 1 and adjusting
the Device's context list to put the desired context-to-be-rendered
-into position 0, we can get trick =OpenAL= into rendering each slave
-context separate from all the others.
+into position 0, we can trick =OpenAL= into rendering each context
+separately from all the others.

** Main Device Loop
#+name: main-loop
@@ -419,7 +429,6 @@
   addContext(Device, masterContext);
 }
-
static void renderData(ALCdevice *Device, int samples){
  if(!Device->Connected){return;}
  send_data *data = (send_data*)Device->ExtraData;
@@ -451,8 +460,8 @@
#+end_src
The main loop synchronizes the master LWJGL context with all the slave
-contexts, then walks each context, rendering just that context to it's
-audio-sample storage buffer.
+contexts, then iterates through each context, rendering just that
+context to its audio-sample storage buffer.

** JNI Methods

@@ -461,9 +470,9 @@
waiting patiently in internal buffers, one for each listener. We need
a way to transport this information to Java, and also a way to drive
this device from Java. The following JNI interface code is inspired
-by the way LWJGL interfaces with =OpenAL=.
+by the LWJGL JNI interface to =OpenAL=.

-*** step
+*** Stepping the Device
#+name: jni-step
#+begin_src C
//////////////////// JNI Methods
@@ -490,7 +499,7 @@
its environment.
-*** getSamples
+*** Device->Java Data Transport
#+name: jni-get-samples
#+begin_src C
/*
@@ -639,9 +648,9 @@
}
#+end_src
-** Boring Device management stuff
+** Boring Device Management Stuff / Memory Cleanup
This code is more-or-less copied verbatim from the other =OpenAL=
-backends. It's the basis for =OpenAL='s primitive object system.
+Devices. It's the basis for =OpenAL='s primitive object system.
#+name: device-init
#+begin_src C
//////////////////// Device Initialization / Management
@@ -732,62 +741,98 @@
* The Java interface, =AudioSend=

The Java interface to the Send Device follows naturally from the JNI
-definitions. It is included here for completeness. The only thing here
-of note is the =deviceID=. This is available from LWJGL, but to only
-way to get it is reflection. Unfortunately, there is no other way to
-control the Send device than to obtain a pointer to it.
+definitions. The only thing here of note is the =deviceID=. This is
+available from LWJGL, but the only way to get it is with reflection.
+Unfortunately, there is no other way to control the Send device than
+to obtain a pointer to it.

-#+include: "../java/src/com/aurellem/send/AudioSend.java" src java :exports code
+#+include: "../../audio-send/java/src/com/aurellem/send/AudioSend.java" src java
+
+* The Java Audio Renderer, =AudioSendRenderer=
+
+#+include: "../../jmeCapture/src/com/aurellem/capture/audio/AudioSendRenderer.java" src java
+
+The =AudioSendRenderer= is a modified version of the
+=LwjglAudioRenderer= which implements the =MultiListener= interface to
+provide access and creation of more than one =Listener= object.
+
+** MultiListener.java
+
+#+include: "../../jmeCapture/src/com/aurellem/capture/audio/MultiListener.java" src java
+
+** SoundProcessors are like SceneProcessors
+
+A =SoundProcessor= is analogous to a =SceneProcessor=. Every frame, the
+=SoundProcessor= registered with a given =Listener= receives the
+rendered sound data and can do whatever processing it wants with it.
+
+#+include: "../../jmeCapture/src/com/aurellem/capture/audio/SoundProcessor.java" src java

* Finally, Ears in clojure!

-Now that the infrastructure is complete the clojure ear abstraction is
-simple. Just as there were =SceneProcessors= for vision, there are
-now =SoundProcessors= for hearing.
+Now that the =C= and =Java= infrastructure is complete, the clojure
+hearing abstraction is simple and closely parallels the [[./vision.org][vision]]
+abstraction.

-#+include "../../jmeCapture/src/com/aurellem/capture/audio/SoundProcessor.java" src java
+** Hearing Pipeline
-
+All sound rendering is done on the CPU, so =(hearing-pipeline)= is
+much less complicated than =(vision-pipeline)=. The bytes available in
+the ByteBuffer obtained from the =send= Device have different meanings
+dependent upon the particular hardware of your system. That is why
+the =AudioFormat= object is necessary to provide the meaning that the
+raw bytes lack. =(byteBuffer->pulse-vector)= uses the excellent
+conversion facilities from [[http://www.tritonus.org/][tritonus]] ([[http://tritonus.sourceforge.net/apidoc/org/tritonus/share/sampled/FloatSampleTools.html#byte2floatInterleaved%2528byte%5B%5D,%2520int,%2520float%5B%5D,%2520int,%2520int,%2520javax.sound.sampled.AudioFormat%2529][javadoc]]) to generate a clojure vector of
+floats which represent the linear PCM encoded waveform of the
+sound. With linear PCM (pulse code modulation), -1.0 represents maximum
+rarefaction of the air while 1.0 represents maximum compression of the
+air at a given instant.
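Once =(byteBuffer->pulse-vector)= (defined below) has turned a frame of
sound into such a vector, the samples are easy to work with directly.
Here is a small sketch (the =rms= helper is mine, not part of
=cortex.hearing=) showing one way to reduce a frame's pulse vector to a
single root-mean-square loudness value, the kind of summary a creature
might use to decide whether it has "heard" anything at all:

#+begin_src clojure
;; Illustrative sketch only; rms is not part of the hearing code in
;; this post. It assumes a pulse vector as produced by
;; byteBuffer->pulse-vector: a clojure vector of floats in [-1.0, 1.0].
(defn rms
  "Root-mean-square amplitude of one frame of linear PCM samples."
  [pulse-vector]
  (if (empty? pulse-vector)
    0.0
    (Math/sqrt (/ (reduce + (map #(* % %) pulse-vector))
                  (count pulse-vector)))))

;; (rms [0.0 0.5 -0.5 1.0]) => 0.6123724356957945
#+end_src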
#+name: ears
#+begin_src clojure
-(ns cortex.hearing
-  "Simulate the sense of hearing in jMonkeyEngine3. Enables multiple
-   listeners at different positions in the same world. Automatically
-   reads ear-nodes from specially prepared blender files and
-   instantiates them in the world as actual ears."
-  {:author "Robert McIntyre"}
-  (:use (cortex world util sense))
-  (:use clojure.contrib.def)
-  (:import java.nio.ByteBuffer)
-  (:import java.awt.image.BufferedImage)
-  (:import org.tritonus.share.sampled.FloatSampleTools)
-  (:import (com.aurellem.capture.audio
-            SoundProcessor AudioSendRenderer))
-  (:import javax.sound.sampled.AudioFormat)
-  (:import (com.jme3.scene Spatial Node))
-  (:import com.jme3.audio.Listener)
-  (:import com.jme3.app.Application)
-  (:import com.jme3.scene.control.AbstractControl))
+(in-ns 'cortex.hearing)

-(defn sound-processor
-  "Deals with converting ByteBuffers into Vectors of floats so that
-   the continuation functions can be defined in terms of immutable
-   stuff."
+(defn hearing-pipeline
+  "Creates a SoundProcessor which wraps a sound processing
+   continuation function. The continuation is a function that takes
+   [#^ByteBuffer b #^Integer numSamples #^AudioFormat af], each of which
+   has already been appropriately sized."
  [continuation]
  (proxy [SoundProcessor] []
    (cleanup [])
    (process
      [#^ByteBuffer audioSamples numSamples #^AudioFormat audioFormat]
-      (let [bytes (byte-array numSamples)
-            num-floats (/ numSamples (.getFrameSize audioFormat))
-            floats (float-array num-floats)]
-        (.get audioSamples bytes 0 numSamples)
-        (FloatSampleTools/byte2floatInterleaved
-         bytes 0 floats 0 num-floats audioFormat)
-        (continuation
-         (vec floats))))))
+      (continuation audioSamples numSamples audioFormat))))

+(defn byteBuffer->pulse-vector
+  "Extract the sound samples from the byteBuffer into a vector of
+   floats: a linear PCM encoded waveform with values ranging from
+   -1.0 to 1.0."
+  [#^ByteBuffer audioSamples numSamples #^AudioFormat audioFormat]
+  (let [num-floats (/ numSamples (.getFrameSize audioFormat))
+        bytes (byte-array numSamples)
+        floats (float-array num-floats)]
+    (.get audioSamples bytes 0 numSamples)
+    (FloatSampleTools/byte2floatInterleaved
+     bytes 0 floats 0 num-floats audioFormat)
+    (vec floats)))
+#+end_src
+
+** Physical Ears
+
+Together, these three functions define how ears found in a specially
+prepared blender file will be translated into =Listener= objects in a
+simulation. =(ears)= extracts all the children of the top level node
+named "ears". =(add-ear!)= and =(update-listener-velocity!)= use
+=(bind-sense)= to bind a =Listener= object located at the initial
+position of an "ear" node to the closest physical object in the
+creature. That =Listener= will stay in the same orientation to the
+object with which it is bound, just as the camera does in the [[http://aurellem.localhost/cortex/html/sense.html#sec-4-1][sense binding
+demonstration]]. =OpenAL= simulates the Doppler effect for moving
+listeners, so =(update-listener-velocity!)= ensures that this velocity
+information is always up-to-date.
+
+#+begin_src clojure
(defvar
  ^{:arglists '([creature])}
  ears
@@ -818,15 +863,19 @@
    (let [target (closest-node creature ear)
          lis (Listener.)
          audio-renderer (.getAudioRenderer world)
-         sp (sound-processor continuation)]
+         sp (hearing-pipeline continuation)]
      (.setLocation lis (.getWorldTranslation ear))
      (.setRotation lis (.getWorldRotation ear))
      (bind-sense target lis)
      (update-listener-velocity!
       target lis)
      (.addListener audio-renderer lis)
      (.registerSoundProcessor audio-renderer lis sp)))
+#+end_src

-(defn hearing-fn
+** Ear Creation
+
+#+begin_src clojure
+(defn hearing-kernel
  "Returns a function which returns auditory sensory data when called
   inside a running simulation."
  [#^Node creature #^Spatial ear]
@@ -836,19 +885,14 @@
       (fn [#^Application world]
         (add-ear!
          world creature ear
-          (fn [data]
-            (reset! hearing-data (vec data))))))]
+          (comp #(reset! hearing-data %)
+                byteBuffer->pulse-vector))))]
    (fn [#^Application world]
      (register-listener! world)
      (let [data @hearing-data
            topology
-            (vec (map #(vector % 0) (range 0 (count data))))
-            scaled-data
-            (vec
-             (map
-              #(rem (int (* 255 (/ (+ 1 %) 2))) 256)
-              data))]
-        [topology scaled-data]))))
+            (vec (map #(vector % 0) (range 0 (count data))))]
+        [topology data]))))

(defn hearing!
  "Endow the creature in a particular world with the sense of
@@ -856,58 +900,87 @@
   which when called will return the auditory data from that ear."
  [#^Node creature]
  (for [ear (ears creature)]
-    (hearing-fn creature ear)))
+    (hearing-kernel creature ear)))
+#+end_src

+Each function returned by =(hearing-kernel)= will register a new
+=Listener= with the simulation the first time it is called. Each time
+it is called, the hearing-function will return a vector of linear PCM
+encoded sound data that was heard since the last frame. The size of
+this vector is of course determined by the overall framerate of the
+game. With a constant framerate of 60 frames per second and a sampling
+frequency of 44,100 samples per second, the vector will have exactly
+735 elements.
+
+** Visualizing Hearing
+
+This is a simple visualization function which displays the waveform
+reported by the simulated sense of hearing. It converts the values
+reported in the vector returned by the hearing function from the range
+[-1.0, 1.0] to the range [0, 255], converts to integer, and displays
+the number as a greyscale pixel.
+
+#+begin_src clojure
(defn view-hearing
  "Creates a function which accepts a list of auditory data and
   displays each element of the list to the screen as an image."
  []
  (view-sense
   (fn [[coords sensor-data]]
-     (let [height 50
+     (let [pixel-data
+           (vec
+            (map
+             #(rem (int (* 255 (/ (+ 1 %) 2))) 256)
+             sensor-data))
+           height 50
           image (BufferedImage. (count coords) height
                                 BufferedImage/TYPE_INT_RGB)]
       (dorun
        (for [x (range (count coords))]
          (dorun
           (for [y (range height)]
-            (let [raw-sensor (sensor-data x)]
+            (let [raw-sensor (pixel-data x)]
              (.setRGB image x y (gray raw-sensor)))))))
       image))))
-
 #+end_src

-#+results: ears
-: #'cortex.hearing/hearing!
+* Testing Hearing

-* Example
+** Advanced Java Example
+
+I wrote a test case in Java that demonstrates the use of the Java
+components of this hearing system. It is part of a larger Java library
+to capture perfect audio from jMonkeyEngine. Some of the clojure
+constructs above are partially reiterated in the Java source file. But
+first, the video! As far as I know this is the first instance of
+multiple simulated listeners in a virtual environment using OpenAL.
+
+#+begin_html
+ [video: a blue sphere moves along a path among several listening cubes,
+ demonstrating multiple simulated listeners]
+
+ The blue ball is emitting a constant sound. Each blue box is
+ listening for sound, and will change color from blue to green if it
+ detects sound which is louder than a certain threshold. As the blue
+ sphere travels along the path, it excites each of the cubes in turn.
+
+#+end_html
+
+#+include: "../../jmeCapture/src/com/aurellem/capture/examples/Advanced.java" src java
+
+Here is a small clojure program to drive the Java program and make it
+available as part of my test suite.

#+name: test-hearing
-#+begin_src clojure :results silent
-(ns cortex.test.hearing
-  (:use (cortex world util hearing))
-  (:import (com.jme3.audio AudioNode Listener))
-  (:import com.jme3.scene.Node
-           com.jme3.system.AppSettings))
+#+begin_src clojure
+(in-ns 'cortex.test.hearing)

-(defn setup-fn [world]
-  (let [listener (Listener.)]
-    (add-ear world listener #(println-repl (nth % 0)))))
-
-(defn play-sound [node world value]
-  (if (not value)
-    (do
-      (.playSource (.getAudioRenderer world) node))))
-
-(defn test-basic-hearing []
-  (let [node1 (AudioNode. (asset-manager) "Sounds/pure.wav" false false)]
-    (world
-     (Node.)
-     {"key-space" (partial play-sound node1)}
-     setup-fn
-     no-op)))
-
-(defn test-advanced-hearing
+(defn test-java-hearing
  "Testing hearing:
   You should see a blue sphere flying around several
   cubes.  As the sphere approaches each cube, it turns
@@ -919,21 +992,56 @@
        (.setAudioRenderer "Send")))
    (.setShowSettings false)
    (.setPauseOnLostFocus false)))
-
 #+end_src
-This extremely basic program prints out the first sample it encounters
-at every time stamp. You can see the rendered sound being printed at
-the REPL.
+** Adding Hearing to the Worm
+
+
+* Headers
+
+#+name: hearing-header
+#+begin_src clojure
+(ns cortex.hearing
+  "Simulate the sense of hearing in jMonkeyEngine3. Enables multiple
+   listeners at different positions in the same world. Automatically
+   reads ear-nodes from specially prepared blender files and
+   instantiates them in the world as actual ears."
+  {:author "Robert McIntyre"}
+  (:use (cortex world util sense))
+  (:use clojure.contrib.def)
+  (:import java.nio.ByteBuffer)
+  (:import java.awt.image.BufferedImage)
+  (:import org.tritonus.share.sampled.FloatSampleTools)
+  (:import (com.aurellem.capture.audio
+            SoundProcessor AudioSendRenderer))
+  (:import javax.sound.sampled.AudioFormat)
+  (:import (com.jme3.scene Spatial Node))
+  (:import com.jme3.audio.Listener)
+  (:import com.jme3.app.Application)
+  (:import com.jme3.scene.control.AbstractControl))
+#+end_src
+
+#+begin_src clojure
+(ns cortex.test.hearing
+  (:use (cortex world util hearing))
+  (:import (com.jme3.audio AudioNode Listener))
+  (:import com.jme3.scene.Node
+           com.jme3.system.AppSettings))
+#+end_src
+
+
+* Next

 - As a bonus, this method of capturing audio for AI can also be used
   to capture perfect audio from a jMonkeyEngine application, for use
   in demos and the like.

+
* COMMENT Code Generation
#+begin_src clojure :tangle ../src/cortex/hearing.clj
+<<hearing-header>>
<<ears>>
#+end_src