comparison thesis/cortex.org @ 449:09b7c8dd4365

first chapter done, half of last chapter done.
author Robert McIntyre <rlm@mit.edu>
date Wed, 26 Mar 2014 02:42:01 -0400
parents af13fc73e851
children 432f2c4646cb
              (< 0.55 (contact worm-segment-top-tip head-touch))))))
#+end_src
#+end_listing


** =CORTEX= is a toolkit for building sensate creatures

   I built =CORTEX= to be a general AI research platform for doing
   experiments involving multiple rich senses and a wide variety and
   number of creatures. I intend it to be useful as a library for many
   more projects than just this one. =CORTEX= was necessary to meet a

   [...]

   =CORTEX= is built on top of =jMonkeyEngine3=, which is a video game
   engine designed to create cross-platform 3D desktop games. =CORTEX=
   is mainly written in Clojure, a dialect of =LISP= that runs on the
   Java Virtual Machine (JVM). The API for creating and simulating
   creatures and senses is entirely expressed in Clojure, though many
   senses are implemented at the layer of jMonkeyEngine or below. For
   example, for the sense of hearing I use a layer of Clojure code on
   top of a layer of Java JNI bindings that drive a layer of =C++=
   code which implements a modified version of =OpenAL= to support
   multiple listeners. =CORTEX= is the only simulation environment
   that I know of that can support multiple entities that can each
   hear the world from their own perspective. Other senses also
   require a small layer of Java code. =CORTEX= also uses =bullet=, a
   physics simulator written in =C=.
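
   To give a feel for how a layered sense surfaces at the Clojure
   level, here is a minimal sketch. It mirrors the pattern of =touch!=
   and =proprioception!= from the worm example below; the exact name
   and return shape of =hearing!= should be treated as assumptions for
   illustration, not as the definitive API.

#+begin_src clojure
;; Sketch: attach hearing to a creature, then poll each ear.
;; Assumes each element of `ears` is a function that, when called,
;; returns the latest sound data heard by that ear.
(let [creature (load-blender-model "Models/worm/worm.blend")
      ears     (hearing! creature)]
  ;; every creature hears the world from its own perspective
  (map (fn [ear] (ear)) ears))
#+end_src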

   #+caption: Here is the worm from above modeled in Blender, a free
   #+caption: 3D-modeling program. Senses and joints are described
   #+caption: using special nodes in Blender.
   #+name: worm-recognition-intro
   #+ATTR_LaTeX: :width 12cm
   [[./images/blender-worm.png]]

   Here are some things I anticipate that =CORTEX= might be used for:

   - exploring new ideas about sensory integration
   - distributed communication among swarm creatures
   - self-learning using free exploration
   - evolutionary algorithms involving creature construction
   - exploration of exotic senses and effectors that are not possible
     in the real world (such as telekinesis or a semantic sense)
   - imagination using subworlds

   During one test with =CORTEX=, I created 3,000 entities each with
   their own independent senses and ran them all at only 1/80 real
   time. In another test, I created a detailed model of my own hand,
   equipped with a realistic distribution of touch (more sensitive at
   the fingertips), as well as eyes and ears, and it ran at around 1/4
   real time.

#+BEGIN_LaTeX
\begin{sidewaysfigure}
\includegraphics[width=9.5in]{images/full-hand.png}
\caption{Here is a model of my own hand, created in Blender,
a free 3D-modeling program. Senses and joints are described
using special nodes in Blender. The senses are displayed on
the right, and the simulation is displayed on the left. Notice
that the hand is curling its fingers, that it can see its own
finger from the eye in its palm, and that it can feel its own
thumb touching its palm.}
\end{sidewaysfigure}
#+END_LaTeX

** Contributions

   I built =CORTEX=, a comprehensive platform for embodied AI
   experiments. =CORTEX= supports many new features lacking in other
   systems, such as sound. It is easy to create new creatures using
   Blender, a free 3D modeling program.

   I built =EMPATH=, which uses =CORTEX= to identify the actions of a
   worm-like creature using a computational model of empathy.

* Building =CORTEX=

** To explore embodiment, we need a world, body, and senses

** Because of Time, simulation is preferable to reality

   [...]

** =CORTEX= enables many possibilities for further research

* Empathy in a simulated worm

  Here I develop a computational model of empathy, using =CORTEX= as
  a base. Empathy in this context is the ability to observe another
  creature and infer what sorts of sensations that creature is
  feeling. My empathy algorithm involves multiple phases. First is
  free-play, where the creature moves around and gains sensory
  experience. From this experience I construct a representation of
  the creature's sensory state space, which I call \Phi-space. Using
  \Phi-space, I construct an efficient function which takes the
  limited data that comes from observing another creature and
  enriches it with a full complement of imagined sensory data. I can
  then use the imagined sensory data to recognize what the observed
  creature is doing and feeling, using straightforward embodied
  action predicates. This is all demonstrated using a simple
  worm-like creature, and recognizing worm-actions based on limited
  data.

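  As a road map, the phases compose into something like the
  hypothetical sketch below. =generate-phi-space= and =gen-phi-scan=
  are defined later in this chapter; =enrich=, =recognize-actions=,
  and =observed-proprioception= are placeholders for the imaginative
  process, the embodied action predicates, and the observation
  stream, and are assumptions for illustration only.

#+begin_src clojure
(comment
  ;; phase 1: free play builds the experience vector
  (def phi-space (generate-phi-space))
  ;; phase 2: index the experiences for fast lookup
  (def phi-scan (gen-phi-scan phi-space))
  ;; phase 3: imagine a full complement of senses for the limited
  ;; observations, then describe the result with action predicates
  (->> observed-proprioception
       (enrich phi-space phi-scan)
       (recognize-actions)))
#+end_src
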
  #+caption: Here is the worm with which we will be working.
  #+caption: It is composed of 5 segments. Each segment has a
  #+caption: pair of extensor and flexor muscles. Each of the
  #+caption: worm's four joints is a hinge joint which allows
  #+caption: 30 degrees of rotation to either side. Each segment
  #+caption: of the worm is touch-capable and has a uniform
  #+caption: distribution of touch sensors on each of its faces.
  #+caption: Each joint has a proprioceptive sense to detect
  #+caption: relative positions. The worm segments are all the
  #+caption: same except for the first one, which has a much
  #+caption: higher weight than the others to allow for easy
  #+caption: manual motor control.
  #+name: basic-worm-view
  #+ATTR_LaTeX: :width 10cm
  [[./images/basic-worm-view.png]]

  #+caption: Program for reading a worm from a Blender file and
  #+caption: outfitting it with the senses of proprioception,
  #+caption: touch, and the ability to move, as specified in the
  #+caption: Blender file.
  #+name: get-worm
  #+begin_listing clojure
  #+begin_src clojure
(defn worm []
  (let [model (load-blender-model "Models/worm/worm.blend")]
    {:body (doto model (body!))
     :touch (touch! model)
     :proprioception (proprioception! model)
     :muscles (movement! model)}))
  #+end_src
  #+end_listing

** Embodiment factors action recognition into manageable parts

   Using empathy, I divide the problem of action recognition into a
   recognition process expressed in the language of a full complement
   of senses, and an imaginative process that generates full sensory
   data from partial sensory data. Splitting the action recognition
   problem in this manner greatly reduces the total amount of work to
   recognize actions: the imaginative process is mostly just matching
   previous experience, and the recognition process gets to use all
   the senses to directly describe any action.

** Action recognition is easy with a full gamut of senses

   Embodied representation using multiple senses such as touch,
   proprioception, and muscle tension turns out to be exceedingly
   efficient at describing body-centered actions. It is the ``right
   language for the job''. For example, it takes only around 5 lines
   of LISP code to describe the action of ``curling'' using embodied
   primitives. It takes about 8 lines to describe the seemingly
   complicated action of wiggling.

   The following action predicates each take a stream of sensory
   experience, observe however much of it they desire, and decide
   whether the worm is doing the action they describe. =curled?=
   relies on proprioception, =resting?= relies on touch, =wiggling?=
   relies on a Fourier analysis of muscle contraction, and
   =grand-circle?= relies on touch and reuses =curled?= as a guard.

   #+caption: Program for detecting whether the worm is curled. This is the
   #+caption: simplest action predicate, because it only uses the last frame
   #+caption: of sensory experience, and only uses proprioceptive data. Even
   #+caption: this simple predicate, however, is automatically frame
   #+caption: independent and ignores vermopomorphic differences such as
   #+caption: worm textures and colors.
   #+name: curled
   #+begin_listing clojure
   #+begin_src clojure
(defn curled?
  "Is the worm curled up?"
  [experiences]
  (every?
   (fn [[_ _ bend]]
     (> (Math/sin bend) 0.64))
   (:proprioception (peek experiences))))
   #+end_src
   #+end_listing

   #+caption: Program for summarizing the touch information in a patch
   #+caption: of skin.
   #+name: touch-summary
   #+begin_listing clojure
   #+begin_src clojure
(defn contact
  "Determine how much contact a particular worm segment has with
   other objects. Returns a value between 0 and 1, where 1 is full
   contact and 0 is no contact."
  [touch-region [coords contact :as touch]]
  (-> (zipmap coords contact)
      (select-keys touch-region)
      (vals)
      (#(map first %))
      (average)
      (* 10)
      (- 1)
      (Math/abs)))
   #+end_src
   #+end_listing


   #+caption: Program for detecting whether the worm is at rest. This program
   #+caption: uses a summary of the tactile information from the underbelly
   #+caption: of the worm, and is only true if every segment is touching the
   #+caption: floor. Note that this function contains no references to
   #+caption: proprioception at all.
   #+name: resting
   #+begin_listing clojure
   #+begin_src clojure
(def worm-segment-bottom (rect-region [8 15] [14 22]))

(defn resting?
  "Is the worm resting on the ground?"
  [experiences]
  (every?
   (fn [touch-data]
     (< 0.9 (contact worm-segment-bottom touch-data)))
   (:touch (peek experiences))))
   #+end_src
   #+end_listing

   #+caption: Program for detecting whether the worm is curled up into a
   #+caption: full circle. Here the embodied approach begins to shine, as
   #+caption: I am able to both use a previous action predicate (=curled?=)
   #+caption: as well as the direct tactile experience of the head and tail.
   #+name: grand-circle
   #+begin_listing clojure
   #+begin_src clojure
(def worm-segment-bottom-tip (rect-region [15 15] [22 22]))

(def worm-segment-top-tip (rect-region [0 15] [7 22]))

(defn grand-circle?
  "Does the worm form a majestic circle (one end touching the other)?"
  [experiences]
  (and (curled? experiences)
       (let [worm-touch (:touch (peek experiences))
             tail-touch (worm-touch 0)
             head-touch (worm-touch 4)]
         (and (< 0.55 (contact worm-segment-bottom-tip tail-touch))
              (< 0.55 (contact worm-segment-top-tip head-touch))))))
   #+end_src
   #+end_listing


   #+caption: Program for detecting whether the worm has been wiggling for
   #+caption: the last few frames. It uses a Fourier analysis of the muscle
   #+caption: contractions of the worm's tail to determine wiggling. This is
   #+caption: significant because there is no particular frame that clearly
   #+caption: indicates that the worm is wiggling --- only when multiple frames
   #+caption: are analyzed together is the wiggling revealed. Defining
   #+caption: wiggling this way also gives the worm an opportunity to learn
   #+caption: and recognize ``frustrated wiggling'', where the worm tries to
   #+caption: wiggle but can't. Frustrated wiggling is very visually different
   #+caption: from actual wiggling, but this definition gives it to us for free.
   #+name: wiggling
   #+begin_listing clojure
   #+begin_src clojure
;; FFT provided by Apache Commons Math.
(import '[org.apache.commons.math3.transform
          FastFourierTransformer DftNormalization TransformType])

(defn fft [nums]
  (map
   #(.getReal %)
   (.transform
    (FastFourierTransformer. DftNormalization/STANDARD)
    (double-array nums) TransformType/FORWARD)))

(def indexed (partial map-indexed vector))

(defn max-indexed [s]
  (first (sort-by (comp - second) (indexed s))))

(defn wiggling?
  "Is the worm wiggling?"
  [experiences]
  (let [analysis-interval 0x40]
    (when (> (count experiences) analysis-interval)
      (let [a-flex 3
            a-ex   2
            muscle-activity
            (map :muscle (vector:last-n experiences analysis-interval))
            base-activity
            (map #(- (% a-flex) (% a-ex)) muscle-activity)]
        (= 2
           (first
            (max-indexed
             (map #(Math/abs %)
                  (take 20 (fft base-activity))))))))))
   #+end_src
   #+end_listing

   With these action predicates, I can now recognize the actions of
   the worm while it is moving under my control and I have access to
   all the worm's senses.

   #+caption: Use the action predicates defined earlier to report on
   #+caption: what the worm is doing while in simulation.
   #+name: report-worm-activity
   #+begin_listing clojure
   #+begin_src clojure
(defn debug-experience
  [experiences text]
  (cond
   (grand-circle? experiences) (.setText text "Grand Circle")
   (curled?       experiences) (.setText text "Curled")
   (wiggling?     experiences) (.setText text "Wiggling")
   (resting?      experiences) (.setText text "Resting")))
   #+end_src
   #+end_listing

   #+caption: Using =debug-experience=, the body-centered predicates
   #+caption: work together to classify the behaviour of the worm
   #+caption: while under manual motor control.
   #+name: worm-identify-init
   #+ATTR_LaTeX: :width 10cm
   [[./images/worm-identify-init.png]]

   These action predicates satisfy the recognition requirement of an
   empathic recognition system. There is a lot of power in the
   simplicity of the action predicates. They describe their actions
   without getting confused by visual details of the worm. Each one
   is frame independent, but more than that, they are each
   independent of irrelevant visual details of the worm and the
   environment. They will work regardless of whether the worm is a
   different color or heavily textured, or if the environment has
   strange lighting.

   The trick now is to make the action predicates work even when the
   sensory data on which they depend is absent. If I can do that, then
   I will have gained much.

** \Phi-space describes the worm's experiences

   As a first step towards building empathy, I need to gather all of
   the worm's experiences during free play. I use a simple vector to
   store all the experiences.

   #+caption: Program to gather the worm's experiences into a vector for
   #+caption: further processing. The =motor-control-program= line uses
   #+caption: a motor control script that causes the worm to execute a series
   #+caption: of ``exercises'' that include all the action predicates.
   #+name: generate-phi-space
   #+begin_listing clojure
   #+begin_src clojure
(defn generate-phi-space []
  (let [experiences (atom [])]
    (run-world
     (apply-map
      worm-world
      (merge
       (worm-world-defaults)
       {:end-frame 700
        :motor-control
        (motor-control-program worm-muscle-labels do-all-the-things)
        :experiences experiences})))
    @experiences))
   #+end_src
   #+end_listing

   Each element of the experience vector exists in the vast space of
   all possible worm experiences. Most of this vast space is actually
   unreachable due to the physical constraints of the worm's body.
   For example, the worm's segments are connected by hinge joints
   that put a practical limit on the worm's degrees of freedom. Also,
   the worm cannot be bent into a circle with its ends touching
   without also experiencing the sensation of touching itself.

   As the worm moves around during free play and the vector grows
   larger, the vector begins to define a subspace which is all the
   practical experiences the worm can have during normal operation,
   which I call \Phi-space, short for physical-space. The vector
   defines a path through \Phi-space. This path has interesting
   properties that all derive from embodiment. The proprioceptive
   components are completely smooth, because in order for the worm to
   move from one position to another, it must pass through the
   intermediate positions. The path invariably forms loops as actions
   are repeated. Finally and most importantly, proprioception
   actually gives very strong inference about the other senses. For
   example, when the worm is flat, you can infer that it is touching
   the ground and that its muscles are not active, because if the
   muscles were active, the worm would be moving and would not be
   perfectly flat. In order to stay flat, the worm has to be touching
   the ground, or it would again be moving out of the flat position
   due to gravity. If the worm is positioned in such a way that it
   interacts with itself, then it is very likely to be feeling the
   same tactile feelings as the last time it was in that position,
   because it has the same body as then. If you observe multiple
   frames of proprioceptive data, then you can become increasingly
   confident about the exact activations of the worm's muscles,
   because it generally takes a unique combination of muscle
   contractions to transform the worm's body along a specific path
   through \Phi-space.

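   The smoothness property can even be checked directly. The
   following is a hypothetical diagnostic, not part of =CORTEX=; it
   assumes the =:proprioception= entries are the [heading pitch bend]
   triples used by the action predicates above.

#+begin_src clojure
(defn max-proprio-jump
  "Largest frame-to-frame change in any joint angle along the
   experience vector; a small value indicates a smooth path
   through \\Phi-space."
  [experiences]
  (apply max
         ;; compare each frame against its successor
         (map (fn [a b]
                (apply max
                       (map #(Math/abs (- %1 %2))
                            (flatten (:proprioception a))
                            (flatten (:proprioception b)))))
              experiences (rest experiences))))
#+end_src
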
   There is a simple way of taking \Phi-space and the total ordering
   provided by an experience vector and reliably inferring the rest
   of the senses.

** Empathy is the process of tracing through \Phi-space

   The first piece of machinery is a fast way to scan proprioceptive
   observations into \Phi-space: nearest-neighbor matching with
   spatial binning.

   #+caption: Programs to convert proprioceptive data into bin keys
   #+caption: at several precisions, and to build from them a
   #+caption: nearest-neighbor lookup function over \Phi-space.
   #+name: gen-phi-scan
   #+begin_listing clojure
   #+begin_src clojure
(defn bin [digits]
  (fn [angles]
    (->> angles
         (flatten)
         (map (juxt #(Math/sin %) #(Math/cos %)))
         (flatten)
         (mapv #(Math/round (* % (Math/pow 10 (dec digits))))))))

(defn gen-phi-scan
  "Nearest-neighbors with spatial binning. Only returns a result if
   the proprioceptive data is within 10% of a previously recorded
   result in all dimensions."
  [phi-space]
  (let [bin-keys (map bin [3 2 1])
        bin-maps
        (map (fn [bin-key]
               (group-by
                (comp bin-key :proprioception phi-space)
                (range (count phi-space)))) bin-keys)
        lookups (map (fn [bin-key bin-map]
                       (fn [proprio] (bin-map (bin-key proprio))))
                     bin-keys bin-maps)]
    (fn lookup [proprio-data]
      (set (some #(% proprio-data) lookups)))))
   #+end_src
   #+end_listing
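
   To illustrate the binning (a hand-worked example): two nearby
   proprioceptive frames share their low-precision bin key, so
   =gen-phi-scan= can fall back to coarser bins when no fine-grained
   match exists.

#+begin_src clojure
;; two nearby [heading pitch bend] frames land in the same 1-digit bin
((bin 1) [[0.50 0.72 1.20]]) ;; => [0 1 1 1 1 0]
((bin 1) [[0.52 0.70 1.21]]) ;; => [0 1 1 1 1 0] (same key)
#+end_src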

   #+caption: Program to stitch candidate \Phi-space index sets,
   #+caption: ordered from most recent to least recent, into the
   #+caption: longest thread of consecutive experiences.
   #+name: longest-thread
   #+begin_listing clojure
   #+begin_src clojure
(defn longest-thread
  "Find the longest thread from phi-index-sets. The index sets should
   be ordered from most recent to least recent."
  [phi-index-sets]
  (loop [result '()
         [thread-bases & remaining :as phi-index-sets] phi-index-sets]
    (if (empty? phi-index-sets)
      (vec result)
      (let [threads
            (for [thread-base thread-bases]
              (loop [thread (list thread-base)
                     remaining remaining]
                (let [next-index (dec (first thread))]
                  (cond (empty? remaining) thread
                        (contains? (first remaining) next-index)
                        (recur
                         (cons next-index thread) (rest remaining))
                        :else thread))))
            longest-thread
            (reduce (fn [thread-a thread-b]
                      (if (> (count thread-a) (count thread-b))
                        thread-a thread-b))
                    '(nil)
                    threads)]
        (recur (concat longest-thread result)
               (drop (count longest-thread) phi-index-sets))))))
   #+end_src
   #+end_listing
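
   As a hand-worked example of =longest-thread=, consider four
   candidate index sets ordered from most recent to least recent:

#+begin_src clojure
(longest-thread [#{7 9} #{6} #{5 12} #{11}])
;; => [11 5 6 7]
;; The run 5, 6, 7 is stitched across the three most recent sets;
;; the isolated index 11 from the oldest set is prepended.
#+end_src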

   There is one final piece, which is to replace missing sensory data
   with a best-guess estimate. While I could fill in missing data by
   using a gradient over the closest known sensory data points,
   averages can be misleading. It is certainly possible to create an
   impossible sensory state by averaging two possible sensory states.
   Therefore, I simply replicate the most recent sensory experience
   to fill in the gaps.

   #+caption: Fill in blanks in sensory experience by replicating the most
   #+caption: recent experience.
   #+name: infer-nils
   #+begin_listing clojure
   #+begin_src clojure
(defn infer-nils
  "Replace nils with the next available non-nil element in the
   sequence, or barring that, 0."
  [s]
  (loop [i (dec (count s))
         v (transient s)]
    (if (zero? i) (persistent! v)
        (if-let [cur (v i)]
          (if (get v (dec i) 0)
            (recur (dec i) v)
            (recur (dec i) (assoc! v (dec i) cur)))
          (recur i (assoc! v i 0))))))
   #+end_src
   #+end_listing
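
   For example (hand-worked):

#+begin_src clojure
(infer-nils [nil 1 nil 2 nil])
;; => [1 1 2 2 0]
;; each nil takes the value of its successor; the trailing nil
;; has no successor, so it becomes 0
#+end_src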
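   Putting this section's pieces together, tracing through \Phi-space
   looks something like the following sketch. The function name and
   argument order are assumptions; the observed proprioception is
   ordered most recent first, as =longest-thread= expects.

#+begin_src clojure
(defn imagined-experience
  "Infer a full sensory experience from observed proprioception."
  [phi-space phi-scan observed-proprio]
  (->> observed-proprio   ; most recent frame first
       (map phi-scan)     ; candidate \Phi-space indices per frame
       (longest-thread)   ; stitch into one consistent path
       (infer-nils)       ; fill frames that matched nothing
       (mapv phi-space))) ; recover the full experiences
#+end_src
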
** Efficient action recognition with =EMPATH=

** Digression: bootstrapping touch using free exploration

* Contributions



# An anatomical joke: