#+title: =CORTEX=
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Using embodied AI to facilitate Artificial Imagination.
#+keywords: AI, clojure, embodiment

* Embodiment is a critical component of Intelligence

** Recognizing actions in video is extremely difficult
cat drinking, mimes, leaning, common sense

** Embodiment is the right language for the job

a new possibility for the question ``what is a chair?'' -- it's the
feeling of your butt on something and your knees bent, with your
back muscles and legs relaxed.

** =CORTEX= is a system for exploring embodiment

Hand integration demo

** =CORTEX= solves recognition problems using empathy

worm empathy demo

** Overview

* Building =CORTEX=

** To explore embodiment, we need a world, body, and senses

** Because of Time, simulation is preferable to reality

** Video game engines are a great starting point

** Bodies are composed of segments connected by joints

** Eyes reuse standard video game components

** Hearing is hard; =CORTEX= does it right

** Touch uses hundreds of hair-like elements

** Proprioception is the force that makes everything ``real''

** Muscles are both effectors and sensors

** =CORTEX= brings complex creatures to life!

** =CORTEX= enables many possibilities for further research

* Empathy in a simulated worm

** Embodiment factors action recognition into manageable parts

** Action recognition is easy with a full gamut of senses

** Digression: bootstrapping with multiple senses

** \Phi-space describes the worm's experiences

** Empathy is the process of tracing through \Phi-space

** Efficient action recognition via empathy

* Contributions
- Built =CORTEX=, a comprehensive platform for embodied AI
  experiments. It has many features lacking in other systems, such as
  sound, and makes it easy to model and create new creatures.
- Created a novel approach to action recognition that uses artificial
  imagination.

* =CORTEX= User Guide

In the second half of the thesis I develop a computational model of
empathy, using =CORTEX= as a base. Empathy in this context is the
ability to observe another creature and infer what sorts of sensations
that creature is feeling. My empathy algorithm involves multiple
phases. First is free-play, where the creature moves around and gains
sensory experience. From this experience I construct a representation
of the creature's sensory state space, which I call \phi-space. Using
\phi-space, I construct an efficient function for enriching the
limited data that comes from observing another creature with a full
complement of imagined sensory data based on previous experience. I
can then use the imagined sensory data to recognize what the observed
creature is doing and feeling, using straightforward embodied action
predicates. This is all demonstrated using a simple worm-like
creature, and recognizing worm-actions based on limited data.

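The enrichment phase can be sketched in a few lines of Clojure. This is
a minimal illustration under invented assumptions, not the thesis's
actual code: here an experience is a plain map from sense to data,
\phi-space is just a sequence of such maps gathered during free-play,
and =enrich= recalls the prior experience whose proprioception best
matches the partial observation.

```clojure
;; Hypothetical sketch: `distance`, `enrich`, and the map keys below
;; are illustrative inventions, not =CORTEX= definitions.

(defn distance
  "Euclidean distance between two proprioception vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn enrich
  "Given phi-space (full sensory maps gathered in free-play) and a
   partial observation (proprioception only), imagine the missing
   senses by recalling the nearest prior experience."
  [phi-space observed-proprioception]
  (apply min-key
         #(distance (:proprioception %) observed-proprioception)
         phi-space))

;; Free-play yields complete experiences; observation yields only
;; proprioception, which `enrich` completes with imagined touch data.
(def phi-space
  [{:proprioception [0.0 0.1] :touch :none    :action :resting}
   {:proprioception [1.2 1.3] :touch :ventral :action :curling}])

(enrich phi-space [1.1 1.2])
;; => recalls the :curling experience, imagined touch included
```

Action predicates then run on the enriched (partly imagined) sensory
data exactly as they would on first-person experience.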
Embodied representation using multiple senses such as touch,
proprioception, and muscle tension turns out to be exceedingly
efficient at describing body-centered actions. It is the ``right
language for the job''. For example, it takes only around 5 lines of
LISP code to describe the action of ``curling'' using embodied
primitives. It takes about 8 lines to describe the seemingly
complicated action of wiggling.
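As a rough illustration of why such predicates stay so short, here is
what a curling predicate could look like in this style. The
representation (a seq of per-joint =[pitch roll yaw]= triples) and the
2.0-radian threshold are illustrative assumptions, not the thesis's
actual definitions.

```clojure
;; Hypothetical sketch of an embodied action predicate: `bend-angles`,
;; the joint encoding, and the threshold are invented for illustration.

(defn bend-angles
  "Extract the bend (pitch) angle of each joint from a proprioception
   reading, modeled here as a seq of [pitch roll yaw] triples."
  [proprioception]
  (map first proprioception))

(defn curled?
  "A body is `curled' when every joint is bent past a threshold --
   the whole action stated in terms of felt body state, with no
   reference to pixels or external viewpoints."
  [proprioception]
  (every? #(> (Math/abs (double %)) 2.0) (bend-angles proprioception)))

(curled? [[2.5 0.0 0.0] [2.8 0.1 0.0]])  ;; => true
(curled? [[0.1 0.0 0.0] [2.8 0.1 0.0]])  ;; => false
```

Because the predicate reads body state directly, it works equally well
on first-person experience and on the imagined sensory data produced
by empathy.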