comparison: thesis/cortex.org @ 436:853377051f1e ("abstract v. 2")

| author   | Robert McIntyre <rlm@mit.edu>   |
|----------|---------------------------------|
| date     | Sun, 23 Mar 2014 19:09:14 -0400 |
| parents  | ae3bfc82ac7c                    |
| children | c1e6b7221b2f                    |
--- thesis/cortex.org  435:ae3bfc82ac7c
+++ thesis/cortex.org  436:853377051f1e
  #+author: Robert McIntyre
  #+email: rlm@mit.edu
  #+description: Using embodied AI to facilitate Artificial Imagination.
  #+keywords: AI, clojure, embodiment
  
- -- show hand
+ * Embodiment is a critical component of Intelligence
  
- * Embodiment is a critical component to Intelligence
- cat drinking, mimes, leaning, common sense
+ ** Recognizing actions in video is extremely difficult
  
- * To explore embodiment, we need a world, body, and senses
+ ** Embodiment is the right language for the job
  
- * Because of Time, simulation is perferable to reality
+ a new possibility for the question ``what is a chair?'' -- it's the
+ feeling of your butt on something and your knees bent, with your
+ back muscles and legs relaxed.
  
- * Video game engines are a great starting point
+ ** =CORTEX= is a system for exploring embodiment
  
- * Bodies are composed of segments connected by joints
+ Hand integration demo
  
- * Eyes reuse standard video game components
+ ** =CORTEX= solves recognition problems using empathy
+ 
+ worm empathy demo
  
- * Hearing is hard; =CORTEX= does it right
+ ** Overview
  
- * Touch uses hundreds of hair-like elements
+ * Building =CORTEX=
  
- * Proprioception is the force that makes everything ``real''
+ ** To explore embodiment, we need a world, body, and senses
  
- * Muscles are both effectors and sensors
+ ** Because of Time, simulation is preferable to reality
  
- * =CORTEX= brings complex creatures to life!
+ ** Video game engines are a great starting point
  
- * =CORTEX= enables many possiblities for further research
+ ** Bodies are composed of segments connected by joints
  
- * =CORTEX= User Guide
+ ** Eyes reuse standard video game components
+ 
+ ** Hearing is hard; =CORTEX= does it right
+ 
+ ** Touch uses hundreds of hair-like elements
+ 
+ ** Proprioception is the force that makes everything ``real''
+ 
+ ** Muscles are both effectors and sensors
+ 
+ ** =CORTEX= brings complex creatures to life!
+ 
+ ** =CORTEX= enables many possibilities for further research
  
  * Empathy in a simulated worm
  
- * Embodiment factors action recognition into managable parts
+ ** Embodiment factors action recognition into manageable parts
  
- * Action recognition is easy with a full gamut of senses
+ ** Action recognition is easy with a full gamut of senses
  
- * Digression: bootstrapping with multiple senses
+ ** Digression: bootstrapping with multiple senses
  
- * \Phi-space describes the worm's experiences
+ ** \Phi-space describes the worm's experiences
  
- * Empathy is the process of tracing though \Phi-space
+ ** Empathy is the process of tracing through \Phi-space
  
- * Efficient action recognition via empathy
+ ** Efficient action recognition via empathy
- 
- * Contributions
- 
- 
- * Vision
- 
- System for understanding what the actors in a video are doing --
- Action Recognition.
- 
- Separate action recognition into three components:
- 
-   - free play
-   - embodied action predicates
-   - model alignment
-   - sensory imagination
- 
- * Steps
- 
-   - Build cortex, a simulated environment for sensate AI
-     - solid bodies w/ joints
-     - vision
-     - touch
-     - vision
-     - hearing
-     - proprioception
-     - muscle contraction
- 
-   - Build experimental framework for worm-actions
-     - embodied stream predicates
-     - \phi-space
-     - \phi-scan
- 
- * News
- 
- Experimental results:
- 
-   - \phi-space actually works very well for the worm!
-   - self organizing touch map
- 
  
  * Contributions
  - Built =CORTEX=, a comprehensive platform for embodied AI
    experiments. Has many new features lacking in other systems, such
    as sound. Easy to model/create new creatures.
  - created a novel concept for action recognition by using artificial
    imagination.
  
+ * =CORTEX= User Guide
  
  
+ In the second half of the thesis I develop a computational model of
+ empathy, using =CORTEX= as a base. Empathy in this context is the
+ ability to observe another creature and infer what sorts of sensations
+ that creature is feeling. My empathy algorithm involves multiple
+ phases. First is free-play, where the creature moves around and gains
+ sensory experience. From this experience I construct a representation
+ of the creature's sensory state space, which I call \phi-space. Using
+ \phi-space, I construct an efficient function for enriching the
+ limited data that comes from observing another creature with a full
+ complement of imagined sensory data based on previous experience. I
+ can then use the imagined sensory data to recognize what the observed
+ creature is doing and feeling, using straightforward embodied action
+ predicates. This is all demonstrated using a simple worm-like
+ creature, and recognizing worm-actions based on limited data.
  
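A minimal sketch, in Clojure, of the empathy phases described in the
paragraph above, assuming a sensory snapshot is a map from sense to
data; every name here (=phi-space=, =record-experience!=, =imagine=)
is an illustrative assumption, not the actual =CORTEX= API:

#+begin_src clojure
;; Sketch only: all names and representations are assumptions.
;; A snapshot maps each sense to its data for one time step, e.g.
;; {:proprioception [0.1 0.2 0.3], :touch [...], :muscle [...]}

(def phi-space
  "Sensory snapshots remembered from the free-play phase."
  (atom []))

(defn record-experience!
  "Free play: remember one complete sensory snapshot."
  [snapshot]
  (swap! phi-space conj snapshot))

(defn- distance
  "Euclidean distance between two joint-angle vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn imagine
  "Empathy: given only observed joint angles, return the remembered
  snapshot whose proprioception is closest, enriching the observation
  with imagined touch, muscle, etc. Assumes phi-space is non-empty."
  [observed-joints]
  (apply min-key #(distance (:proprioception %) observed-joints)
         @phi-space))
#+end_src

An action predicate can then run on the imagined snapshot exactly as
it would on first-person experience.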
+ Embodied representation using multiple senses such as touch,
+ proprioception, and muscle tension turns out to be exceedingly
+ efficient at describing body-centered actions. It is the ``right
+ language for the job''. For example, it takes only around 5 lines of
+ LISP code to describe the action of ``curling'' using embodied
+ primitives. It takes about 8 lines to describe the seemingly
+ complicated action of wiggling.
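To make the ``around 5 lines'' claim concrete, here is a hedged
sketch of what such an embodied predicate could look like, reusing
the hypothetical snapshot representation from the sketch above; the
=curled?= name and the flexion threshold are assumptions:

#+begin_src clojure
(defn curled?
  "Illustrative embodied action predicate: the worm counts as curled
  when every joint is strongly flexed."
  [snapshot]
  (every? #(< % -0.5)               ; assumed flexion threshold, radians
          (:proprioception snapshot)))
#+end_src

A ``wiggling'' predicate would instead examine a short window of
snapshots and test for a dominant oscillation in the joint angles,
which is why it takes a few more lines.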