comparison thesis/cortex.org @ 437:c1e6b7221b2f

progress on intro.
author Robert McIntyre <rlm@mit.edu>
date Sun, 23 Mar 2014 22:20:44 -0400
parents 853377051f1e
children 4dcb923c9b16
#+author: Robert McIntyre
#+email: rlm@mit.edu
#+description: Using embodied AI to facilitate Artificial Imagination.
#+keywords: AI, clojure, embodiment

* Empathy and Embodiment as a problem solving strategy

By the end of this thesis, you will have seen a novel approach to
interpreting video using embodiment and empathy. You will also have
seen one way to efficiently implement empathy for embodied
creatures.

The core vision of this thesis is that one of the important ways in
which we understand others is by imagining ourselves in their
position and empathically feeling experiences based on our own past
experiences and imagination.

By understanding events in terms of our own previous corporeal
experience, we greatly constrain the possibilities of what would
otherwise be an unwieldy exponential search. This extra constraint
can be the difference between easily understanding what is happening
in a video and being completely lost in a sea of incomprehensible
color and movement.

** Recognizing actions in video is extremely difficult

Consider for example the problem of determining what is happening in
a video of which this is one frame:

#+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
#+ATTR_LaTeX: :width 7cm
[[./images/cat-drinking.jpg]]

It is currently impossible for any computer program to reliably
label such a video as "drinking". And rightly so -- it is a very
hard problem! What features can you describe in terms of low level
functions of pixels that can even begin to describe what is
happening here?

Or suppose that you are building a program that recognizes
chairs. How could you ``see'' the chair in the following picture?

#+caption: When you look at this, do you think ``chair''? I certainly do.
#+ATTR_LaTeX: :width 10cm
[[./images/invisible-chair.png]]

#+caption: The chair in this image is quite obvious to humans, but I doubt any computer program can find it.
#+ATTR_LaTeX: :width 10cm
[[./images/fat-person-sitting-at-desk.jpg]]

I think humans are able to label such a video as "drinking" because
they imagine /themselves/ as the cat, and imagine putting their face
up against a stream of water and sticking out their tongue. In that
imagined world, they can feel the cool water hitting their tongue,
and feel the water entering their body, and are able to recognize
that /feeling/ as drinking. So, the label of the action is not
really in the pixels of the image, but is found clearly in a
simulation inspired by those pixels. An imaginative system, having
been trained on drinking and non-drinking examples and learning that
the most important component of drinking is the feeling of water
sliding down one's throat, would analyze a video of a cat drinking
in the following manner:

- Create a physical model of the video by putting a "fuzzy" model of
  its own body in place of the cat. Also, create a simulation of the
  stream of water.

- Play out this simulated scene and generate imagined sensory
  experience. This will include relevant muscle contractions, a
  close up view of the stream from the cat's perspective, and most
  importantly, the imagined feeling of water entering the mouth.

- The action is now easily identified as drinking by the sense of
  taste alone. The other senses (such as the tongue moving in and
  out) help to give plausibility to the simulated action. Note that
  the sense of vision, while critical in creating the simulation, is
  not critical for identifying the action from the simulation.
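The final step above can be sketched in code. The thesis implements
its ideas in Clojure; the following is a minimal Python sketch in
which every sense name, data structure, and threshold is a
hypothetical stand-in, not the actual =CORTEX= API:

```python
# Illustrative sketch only: classify an imagined scene from simulated
# sense data. All names and thresholds here are invented.

def classify_action(imagined_senses):
    """Label a simulated scene from its imagined sensory experience.

    Taste is the decisive sense for "drinking"; the other senses
    (tongue motion, vision) only lend plausibility to the simulation
    itself and are not consulted for the label."""
    if imagined_senses.get("taste", 0.0) > 0.5:
        return "drinking"
    return "unknown"

# One imagined frame: a strong taste of water, a moving tongue, and a
# close-up view of the stream. Vision helped build the scene, but the
# classification rests on taste alone.
scene = {"taste": 0.9, "tongue-motion": 0.7, "vision": "stream-close-up"}
print(classify_action(scene))  # → drinking
```

The point of the sketch is the factoring: once the simulation has
produced imagined sense data, the recognition step itself is trivial.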
cat drinking, mimes, leaning, common sense

** =EMPATH= neatly solves recognition problems

factorization, right language, etc.

a new possibility for the question ``what is a chair?'' -- it's the
feeling of your butt on something and your knees bent, with your
back muscles and legs relaxed.

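Under this embodied definition, ``chair'' becomes a predicate over
felt body state rather than over pixels. A hypothetical sketch in
Python (the thesis's real definitions are written in Clojure; these
sense names and thresholds are invented for illustration):

```python
def feels_like_sitting(body):
    """True when the felt body state matches ``sitting in a chair'':
    weight on the seat, knees bent, back and legs relaxed."""
    return (body["seat-pressure"] > 0.5      # butt on something
            and body["knee-angle"] < 120     # knees bent (degrees)
            and body["back-tension"] < 0.2   # back muscles relaxed
            and body["leg-tension"] < 0.2)   # legs relaxed

sitting = {"seat-pressure": 0.8, "knee-angle": 90,
           "back-tension": 0.1, "leg-tension": 0.1}
standing = {"seat-pressure": 0.0, "knee-angle": 175,
            "back-tension": 0.3, "leg-tension": 0.4}
print(feels_like_sitting(sitting))   # → True
print(feels_like_sitting(standing))  # → False
```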
** =CORTEX= is a toolkit for building sensate creatures

Hand integration demo

** Contributions

* Building =CORTEX=

** To explore embodiment, we need a world, body, and senses

** Embodiment factors action recognition into manageable parts

** Action recognition is easy with a full gamut of senses

** Digression: bootstrapping touch using free exploration

** \Phi-space describes the worm's experiences

** Empathy is the process of tracing through \Phi-space

- Built =CORTEX=, a comprehensive platform for embodied AI
  experiments. Has many new features lacking in other systems, such
  as sound. Easy to model/create new creatures.
- Created a novel concept for action recognition by using artificial
  imagination.

In the second half of the thesis I develop a computational model of
empathy, using =CORTEX= as a base. Empathy in this context is the
ability to observe another creature and infer what sorts of sensations
that creature is feeling. My empathy algorithm involves multiple
language for the job''. For example, it takes only around 5 lines of
LISP code to describe the action of ``curling'' using embodied
primitives. It takes about 8 lines to describe the seemingly
complicated action of wiggling.

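To give a flavor of how short such definitions can be: the actual
thesis code is Clojure, but the brevity carries over to this
hypothetical Python sketch, which assumes an invented proprioception
stream of joint-flexion values in [0, 1]:

```python
def curled(proprioception):
    """A creature is ``curled'' when every joint is strongly flexed.
    `proprioception` is a hypothetical list of flexion values in [0, 1]."""
    return all(flex > 0.7 for flex in proprioception)

print(curled([0.9, 0.8, 0.95]))  # → True  (all joints flexed)
print(curled([0.9, 0.1, 0.95]))  # → False (one joint nearly straight)
```

The definition is short because the hard work -- producing the
proprioceptive data from a simulation -- has already been done.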
* COMMENT names for cortex
- bioland