comparison thesis/cortex.org @ 472:516a029e0be9

complete first draft of hearing.

| author   | Robert McIntyre <rlm@mit.edu>   |
| date     | Fri, 28 Mar 2014 18:14:04 -0400 |
| parents  | f14fa9e5b67f                    |
| children | 486ce07f5545                    |
#+caption: simulation environment.
#+name: name
#+ATTR_LaTeX: :width 15cm
[[./images/physical-hand.png]]

** COMMENT Eyes reuse standard video game components

Vision is one of the most important senses for humans, so I need to
build a simulated sense of vision for my AI. I will do this with
simulated eyes. Each eye can be independently moved and should see
its own version of the world depending on where it is.

[...]

community and is now (in modified form) part of a system for
capturing in-game video to a file.

** Hearing is hard; =CORTEX= does it right

At the end of this section I will have simulated ears that work the
same way as the simulated eyes in the last section. I will be able to
place any number of ear-nodes in a blender file, and they will bind to
the closest physical object and follow it as it moves around. Each ear
will provide access to the sound data it picks up between every frame.

Hearing is one of the more difficult senses to simulate, because there
is less support for obtaining the actual sound data that is processed
by jMonkeyEngine3. There is no "split-screen" support for rendering
sound from different points of view, and there is no way to directly
access the rendered sound data.

=CORTEX='s hearing is unique because it suffers from none of the
limitations of other simulation environments. As far as I know, no
other system supports multiple listeners, and the sound demo at the
end of this section is the first time this has been done in a video
game environment.

*** Brief Description of jMonkeyEngine's Sound System

jMonkeyEngine's sound system works as follows:

- jMonkeyEngine uses the =AppSettings= for the particular
  application to determine what sort of =AudioRenderer= should be
  used.
- Although some support is provided for multiple audio rendering
  backends, jMonkeyEngine at the time of this writing will either
  pick no =AudioRenderer= at all, or the =LwjglAudioRenderer=.
- jMonkeyEngine tries to figure out what sort of system you're
  running and extracts the appropriate native libraries.
- The =LwjglAudioRenderer= uses the [[http://lwjgl.org/][=LWJGL=]] (LightWeight Java Game
  Library) bindings to interface with a C library called [[http://kcat.strangesoft.net/openal.html][=OpenAL=]].
- =OpenAL= renders the 3D sound and feeds the rendered sound
  directly to any of various sound output devices with which it
  knows how to communicate.
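
For concreteness, here is a minimal sketch of the first step; this
is my own illustration, not =CORTEX= code. =AppSettings/LWJGL_OPENAL=
is jMonkeyEngine's constant naming the LWJGL backend.

#+caption: Hypothetical sketch of requesting the =LwjglAudioRenderer=
#+caption: through =AppSettings= from Clojure.
#+name: select-audio-renderer
#+begin_listing clojure
(import 'com.jme3.system.AppSettings)

(defn lwjgl-audio-settings
  "Return AppSettings asking jMonkeyEngine for the
   LwjglAudioRenderer (OpenAL via LWJGL). The `true` argument
   loads jMonkeyEngine's default settings first."
  []
  (doto (AppSettings. true)
    (.setAudioRenderer AppSettings/LWJGL_OPENAL)))
#+end_listing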

A consequence of this is that there's no way to access the actual
sound data produced by =OpenAL=. Even worse, =OpenAL= only supports
one /listener/ (it renders sound data from only one perspective),
which normally isn't a problem for games, but becomes a problem
when trying to make multiple AI creatures that can each hear the
world from a different perspective.

To make many AI creatures in jMonkeyEngine that can each hear the
world from their own perspective, or to make a single creature with
many ears, it is necessary to go all the way back to =OpenAL= and
implement support for simulated hearing there.

*** Extending =OpenAL=

Extending =OpenAL= to support multiple listeners requires 500
lines of =C= code and is too hairy to mention here. Instead, I
will show a small amount of extension code and go over the
high-level strategy. Full source is of course available with the
=CORTEX= distribution if you're interested.

=OpenAL= goes to great lengths to support many different systems,
all with different sound capabilities and interfaces. It
accomplishes this difficult task by providing code for many
different sound backends in pseudo-objects called /Devices/.
There's a device for the Linux Open Sound System and the Advanced
Linux Sound Architecture, there's one for Direct Sound on Windows,
and there's even one for Solaris. =OpenAL= solves the problem of
platform independence by providing all these Devices.

Wrapper libraries such as LWJGL are free to examine the system on
which they are running and then select an appropriate device for
that system.

There are also a few "special" devices that don't interface with
any particular system. These include the Null Device, which
doesn't do anything, and the Wave Device, which writes whatever
sound it receives to a file, if everything has been set up
correctly when configuring =OpenAL=.
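
As an illustration (again mine, not =CORTEX= code), LWJGL lets the
application name the device it wants when the =OpenAL= context is
created; "Wave File Writer" is the name OpenAL-Soft gives its
file-writing Wave device.

#+caption: Hedged sketch of opening a named =OpenAL= device
#+caption: through LWJGL.
#+name: open-named-device
#+begin_listing clojure
(import 'org.lwjgl.openal.AL)

;; Open the Wave device instead of a real sound card. The sample
;; rate (44100 Hz) and refresh rate (60 Hz) are example values.
(AL/create "Wave File Writer" 44100 60 false)
#+end_listing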

Actual mixing (Doppler shift and distance- and environment-based
attenuation) of the sound data happens in the Devices, and they
are the only point in the sound rendering process where this data
is available.

Therefore, in order to support multiple listeners, and get the
sound data in a form that the AIs can use, it is necessary to
create a new Device which supports this feature.

Adding a device to =OpenAL= is rather tricky -- there are five
separate files in the =OpenAL= source tree that must be modified
to do so. I named my device the "Multiple Audio Send" Device, or
=Send= Device for short, since it sends audio data back to the
calling application like an Aux-Send cable on a mixing board.

The main idea behind the =Send= device is to take advantage of the
fact that LWJGL only manages one /context/ when using =OpenAL=. A
/context/ is like a container that holds samples and keeps track
of where the listener is. In order to support multiple listeners,
the =Send= device identifies the LWJGL context as the master
context, and creates any number of slave contexts to represent
additional listeners. Every time the device renders sound, it
synchronizes every source from the master LWJGL context to the
slave contexts. Then, it renders each context separately, using a
different listener for each one. The rendered sound is made
available via JNI to jMonkeyEngine.
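
To make the master/slave idea concrete, here is a sketch of my own
(the real =Send= device does this in C inside =OpenAL= itself) of
creating an extra context on the same device through LWJGL's ALC
bindings.

#+caption: Hedged sketch of one =OpenAL= context per listener,
#+caption: using LWJGL's ALC bindings.
#+name: slave-context-sketch
#+begin_listing clojure
(import '(org.lwjgl.openal AL ALC10))

;; The context LWJGL created is treated as the master; each
;; additional listener gets its own slave context on the device.
(def master-context (ALC10/alcGetCurrentContext))
(def slave-context (ALC10/alcCreateContext (AL/getDevice) nil))
#+end_listing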

Switching between contexts is not the normal operation of a
Device, and one of the problems with doing so is that a Device
normally keeps around a few pieces of state, such as its
=ClickRemoval= array, which will become corrupted if the
contexts are not rendered in parallel. The solution is to create a
copy of this normally global device state for each context, and
copy it back and forth into and out of the actual device state
whenever a context is rendered.

The core of the =Send= device is the =syncSources= function, which
does the job of copying all relevant data from one context to
another.

#+caption: Program for extending =OpenAL= to support multiple
#+caption: listeners via context copying/switching.
#+name: sync-openal-sources
#+begin_listing C
void syncSources(ALsource *masterSource, ALsource *slaveSource,
                 ALCcontext *masterCtx, ALCcontext *slaveCtx){
  ALuint master = masterSource->source;
  ALuint slave = slaveSource->source;
  ALCcontext *current = alcGetCurrentContext();

  syncSourcef(master,slave,masterCtx,slaveCtx,AL_PITCH);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_DISTANCE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_ROLLOFF_FACTOR);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_REFERENCE_DISTANCE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MIN_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_MAX_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_GAIN);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_INNER_ANGLE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_CONE_OUTER_ANGLE);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SEC_OFFSET);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_SAMPLE_OFFSET);
  syncSourcef(master,slave,masterCtx,slaveCtx,AL_BYTE_OFFSET);

  syncSource3f(master,slave,masterCtx,slaveCtx,AL_POSITION);
  syncSource3f(master,slave,masterCtx,slaveCtx,AL_VELOCITY);
  syncSource3f(master,slave,masterCtx,slaveCtx,AL_DIRECTION);

  syncSourcei(master,slave,masterCtx,slaveCtx,AL_SOURCE_RELATIVE);
  syncSourcei(master,slave,masterCtx,slaveCtx,AL_LOOPING);

  alcMakeContextCurrent(masterCtx);
  ALint source_type;
  alGetSourcei(master, AL_SOURCE_TYPE, &source_type);

  // Only static sources are currently synchronized!
  if (AL_STATIC == source_type){
    ALint master_buffer;
    ALint slave_buffer;
    alGetSourcei(master, AL_BUFFER, &master_buffer);
    alcMakeContextCurrent(slaveCtx);
    alGetSourcei(slave, AL_BUFFER, &slave_buffer);
    if (master_buffer != slave_buffer){
      alSourcei(slave, AL_BUFFER, master_buffer);
    }
  }

  // Synchronize the state of the two sources.
  alcMakeContextCurrent(masterCtx);
  ALint masterState;
  ALint slaveState;

  alGetSourcei(master, AL_SOURCE_STATE, &masterState);
  alcMakeContextCurrent(slaveCtx);
  alGetSourcei(slave, AL_SOURCE_STATE, &slaveState);

  if (masterState != slaveState){
    switch (masterState){
    case AL_INITIAL : alSourceRewind(slave); break;
    case AL_PLAYING : alSourcePlay(slave);   break;
    case AL_PAUSED  : alSourcePause(slave);  break;
    case AL_STOPPED : alSourceStop(slave);   break;
    }
  }
  // Restore whatever context was previously active.
  alcMakeContextCurrent(current);
}
#+end_listing

With this special context-switching device, and some ugly JNI
bindings that are not worth mentioning, =CORTEX= gains the ability
to access multiple sound streams from =OpenAL=.

#+caption: Program to create an ear from a blender empty node. The ear
#+caption: follows around the nearest physical object and passes
#+caption: all sensory data to a continuation function.
#+name: add-ear
#+begin_listing clojure
(defn add-ear!
  "Create a Listener centered on the current position of 'ear
   which follows the closest physical node in 'creature and
   sends sound data to 'continuation."
  [#^Application world #^Node creature #^Spatial ear continuation]
  (let [target (closest-node creature ear)
        lis (Listener.)
        audio-renderer (.getAudioRenderer world)
        sp (hearing-pipeline continuation)]
    (.setLocation lis (.getWorldTranslation ear))
    (.setRotation lis (.getWorldRotation ear))
    (bind-sense target lis)
    (update-listener-velocity! target lis)
    (.addListener audio-renderer lis)
    (.registerSoundProcessor audio-renderer lis sp)))
#+end_listing

The =Send= device, unlike most of the other devices in =OpenAL=,
does not render sound unless asked. This enables the system to
slow down or speed up depending on the needs of the AIs who are
using it to listen. If the device tried to render samples in
real-time, a complicated AI whose mind takes 100 seconds of
computer time to simulate 1 second of AI-time would miss almost
all of the sound in its environment!

#+caption: Program to enable arbitrary hearing in =CORTEX=
#+name: hearing
#+begin_listing clojure
(defn hearing-kernel
  "Returns a function which returns auditory sensory data when called
   inside a running simulation."
  [#^Node creature #^Spatial ear]
  (let [hearing-data (atom [])
        register-listener!
        (runonce
         (fn [#^Application world]
           (add-ear!
            world creature ear
            (comp #(reset! hearing-data %)
                  byteBuffer->pulse-vector))))]
    (fn [#^Application world]
      (register-listener! world)
      (let [data @hearing-data
            topology
            (vec (map #(vector % 0) (range 0 (count data))))]
        [topology data]))))

(defn hearing!
  "Endow the creature in a particular world with the sense of
   hearing. Will return a sequence of functions, one for each ear,
   which when called will return the auditory data from that ear."
  [#^Node creature]
  (for [ear (ears creature)]
    (hearing-kernel creature ear)))
#+end_listing

Armed with these functions, =CORTEX= is able to test possibly the
first ever instance of multiple listeners in a simulation based on
a video game engine!
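
For illustration, here is a short usage sketch of my own, showing
how the functions returned by =hearing!= might be polled inside a
running simulation; =creature= and =world= are assumed to exist as
in the rest of this section.

#+caption: Hypothetical usage sketch: polling every ear once per
#+caption: simulation step.
#+name: poll-ears
#+begin_listing clojure
;; ear-fns comes from (hearing! creature); each element is a
;; function of the world returning a [topology data] pair.
(defn hear-everything
  "Return the latest [topology data] pair from each ear."
  [ear-fns world]
  (vec (for [hear! ear-fns]
         (hear! world))))

;; e.g. (def ear-fns (hearing! creature))
;;      (hear-everything ear-fns world)
#+end_listing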

#+caption: Here a simple creature responds to sound by changing
#+caption: its color from gray to green when the peak volume
#+caption: goes over a threshold.
#+name: sound-test
#+begin_listing java
/**
 * Respond to sound! This is the brain of an AI entity that
 * hears its surroundings and reacts to them.
 */
public void process(ByteBuffer audioSamples,
                    int numSamples, AudioFormat format) {
  audioSamples.clear();
  byte[] data = new byte[numSamples];
  float[] out = new float[numSamples];
  audioSamples.get(data);
  FloatSampleTools.byte2floatInterleaved(
    data, 0, out, 0, numSamples/format.getFrameSize(), format);

  float max = Float.NEGATIVE_INFINITY;
  for (float f : out){if (f > max) max = f;}
  audioSamples.clear();

  if (max > 0.1){
    entity.getMaterial().setColor("Color", ColorRGBA.Green);
  }
  else {
    entity.getMaterial().setColor("Color", ColorRGBA.Gray);
  }
}
#+end_listing

#+caption: First ever simulation of multiple listeners in =CORTEX=.
#+caption: Each cube is a creature which processes sound data with
#+caption: the =process= function from listing \ref{sound-test}.
#+caption: The ball is constantly emitting a pure tone of
#+caption: constant volume. As it approaches the cubes, they each
#+caption: change color in response to the sound.
#+name: sound-cubes
#+ATTR_LaTeX: :width 10cm
[[./images/aurellem-gray.png]]

This system of hearing has also been co-opted by the
jMonkeyEngine3 community and is used to record audio for demo
videos.

** Touch uses hundreds of hair-like elements

** Proprioception is the sense that makes everything ``real''

** Muscles are both effectors and sensors