Mercurial repository: cortex

changeset:   437:c1e6b7221b2f
author:      Robert McIntyre <rlm@mit.edu>
date:        Sun, 23 Mar 2014 22:20:44 -0400
parents:     853377051f1e
children:    4dcb923c9b16
summary:     progress on intro.
files:       thesis/Makefile thesis/abstract.org thesis/cortex.org thesis/cover.tex thesis/images/fat-person-sitting-at-desk.jpg thesis/images/invisible-chair.png thesis/rlm-cortex-meng.tex thesis/user-guide.org thesis/weave-thesis.sh
diffstat:    9 files changed, 104 insertions(+), 20 deletions(-)
--- a/thesis/Makefile	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/Makefile	Sun Mar 23 22:20:44 2014 -0400
@@ -7,7 +7,7 @@
 	rsync -avz --delete /home/r/proj/cortex/thesis "r@aurellem.org:~"
 	ssh r@aurellem.org cd "~/thesis; $(INVOKE_LATEX)"
 	scp "r@aurellem.org:/home/r/thesis/$(THESIS_NAME).pdf" .
-	rm cortex.tex abstract.tex
+	rm cortex.tex abstract.tex user-guide.tex
 
 
 
--- a/thesis/abstract.org	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/abstract.org	Sun Mar 23 22:20:44 2014 -0400
@@ -1,12 +1,12 @@
 Here I demonstrate the power of using embodied artificial intelligence
 to attack the /action recognition/ problem, which is the challenge of
 recognizing actions performed by a creature given limited data about
-the creature's actions, such as a video recording. I solve this problem
-in the case of a worm-like creature performing actions such as curling
-and wiggling.
+the creature's actions, such as a video recording. I solve this
+problem in the case of a worm-like creature performing actions such as
+curling and wiggling.
 
 To attack the action recognition problem, I developed a computational
-model of empathy which allows me to use simple, embodied
+model of empathy (=EMPATH=) which allows me to use simple, embodied
 representations of actions (which require rich sensory data), even
 when that sensory data is not actually available. The missing sense
 data is ``imagined'' by the system by combining previous experiences
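The abstract's key move, "imagining" missing sense data by combining previous experiences, can be pictured as a lookup against a library of complete past experiences. Below is a minimal sketch in Clojure (the project's language), assuming a simple nearest-neighbor match; every name in it is a hypothetical stand-in, not actual =EMPATH= or =CORTEX= code:

    ;; Sketch: fill in missing senses by recalling the most similar
    ;; complete past experience. All names here are hypothetical.

    (defn experience-distance
      "Euclidean distance between two equal-length sensory vectors."
      [a b]
      (Math/sqrt (reduce + (map #(let [d (- %1 %2)] (* d d)) a b))))

    (defn imagine
      "Given a partial observation (e.g. only what is visible in a
       video) and a library of complete past experiences, return the
       library entry whose observable part is closest."
      [observation library]
      (apply min-key #(experience-distance observation (:observed %))
             library))

    ;; Each library entry pairs the observable signal with the full
    ;; sensory experience recorded when it originally happened.
    (def library
      [{:observed [0.0 0.1] :full {:touch :none  :taste :none}}
       {:observed [0.9 0.8] :full {:touch :water :taste :water}}])

    (:full (imagine [1.0 0.9] library))
    ;; => {:touch :water, :taste :water}

The point of the sketch is only that once the closest remembered experience is found, its full sensory record, including the senses absent from the observation, becomes available for recognition.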
--- a/thesis/cortex.org	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/cortex.org	Sun Mar 23 22:20:44 2014 -0400
@@ -4,26 +4,102 @@
 #+description: Using embodied AI to facilitate Artificial Imagination.
 #+keywords: AI, clojure, embodiment
 
-* Embodiment is a critical component of Intelligence
+
+* Empathy and Embodiment as a problem solving strategy
+
+  By the end of this thesis, you will have seen a novel approach to
+  interpreting video using embodiment and empathy. You will also have
+  seen one way to efficiently implement empathy for embodied
+  creatures.
+
+  The core vision of this thesis is that one of the important ways in
+  which we understand others is by imagining ourselves in their
+  position and empathically feeling experiences based on our own past
+  experiences and imagination.
+
+  By understanding events in terms of our own previous corporeal
+  experience, we greatly constrain the possibilities of what would
+  otherwise be an unwieldy exponential search. This extra constraint
+  can be the difference between easily understanding what is happening
+  in a video and being completely lost in a sea of incomprehensible
+  color and movement.
 
 ** Recognizing actions in video is extremely difficult
+
+   Consider for example the problem of determining what is happening in
+   a video of which this is one frame:
+
+   #+caption: A cat drinking some water. Identifying this action is beyond the state of the art for computers.
+   #+ATTR_LaTeX: :width 7cm
+   [[./images/cat-drinking.jpg]]
+
+   It is currently impossible for any computer program to reliably
+   label such a video as "drinking". And rightly so -- it is a very
+   hard problem! What features can you describe in terms of low-level
+   functions of pixels that can even begin to describe what is
+   happening here?
+
+   Or suppose that you are building a program that recognizes
+   chairs. How could you ``see'' the chair in the following picture?
+
+   #+caption: When you look at this, do you think ``chair''? I certainly do.
+   #+ATTR_LaTeX: :width 10cm
+   [[./images/invisible-chair.png]]
+
+   #+caption: The chair in this image is quite obvious to humans, but I doubt any computer program can find it.
+   #+ATTR_LaTeX: :width 10cm
+   [[./images/fat-person-sitting-at-desk.jpg]]
+
+
+   I think humans are able to label
+   such a video as "drinking" because they imagine /themselves/ as the
+   cat, and imagine putting their face up against a stream of water and
+   sticking out their tongue. In that imagined world, they can feel the
+   cool water hitting their tongue, and feel the water entering their
+   body, and are able to recognize that /feeling/ as drinking. So, the
+   label of the action is not really in the pixels of the image, but is
+   found clearly in a simulation inspired by those pixels. An
+   imaginative system, having been trained on drinking and non-drinking
+   examples and having learned that the most important component of
+   drinking is the feeling of water sliding down one's throat, would
+   analyze a video of a cat drinking in the following manner:
+
+   - Create a physical model of the video by putting a "fuzzy" model
+     of its own body in place of the cat. Also, create a simulation of
+     the stream of water.
+
+   - Play out this simulated scene and generate imagined sensory
+     experience. This will include relevant muscle contractions, a
+     close-up view of the stream from the cat's perspective, and most
+     importantly, the imagined feeling of water entering the mouth.
+
+   - The action is now easily identified as drinking by the sense of
+     taste alone. The other senses (such as the tongue moving in and
+     out) help to give plausibility to the simulated action. Note that
+     the sense of vision, while critical in creating the simulation,
+     is not critical for identifying the action from the simulation.
+
+
+
+
+
+
+
 cat drinking, mimes, leaning, common sense
 
-** Embodiment is the the right language for the job
+** =EMPATH= neatly solves recognition problems
+
+   factorization, right language, etc.
 
 a new possibility for the question ``what is a chair?'' -- it's the
 feeling of your butt on something and your knees bent, with your
 back muscles and legs relaxed.
 
-** =CORTEX= is a system for exploring embodiment
+** =CORTEX= is a toolkit for building sensate creatures
 
 Hand integration demo
 
-** =CORTEX= solves recognition problems using empathy
-
-   worm empathy demo
-
-** Overview
+** Contributions
 
 * Building =CORTEX=
 
@@ -55,7 +131,7 @@
 
 ** Action recognition is easy with a full gamut of senses
 
-** Digression: bootstrapping with multiple senses
+** Digression: bootstrapping touch using free exploration
 
 ** \Phi-space describes the worm's experiences
 
@@ -70,10 +146,6 @@
 - created a novel concept for action recognition by using artificial
   imagination.
 
-* =CORTEX= User Guide
-
-
-
 In the second half of the thesis I develop a computational model of
 empathy, using =CORTEX= as a base. Empathy in this context is the
 ability to observe another creature and infer what sorts of sensations
@@ -97,3 +169,7 @@
 primitives. It takes about 8 lines to describe the seemingly
 complicated action of wiggling.
 
+
+
+* COMMENT names for cortex
+ - bioland
\ No newline at end of file
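The three-step analysis of the cat video in the new introduction reads almost like pseudocode, so here is one way it could be phrased in Clojure. This is a sketch under the section's own assumptions; every function and keyword is a hypothetical stand-in, not =CORTEX= API:

    ;; Sketch of the three steps described in the introduction.
    ;; Hypothetical scaffolding only.

    (defn build-simulation
      "Step 1: place a 'fuzzy' model of our own body where the cat is,
       and add a simulated stream of water."
      [video]
      {:body :fuzzy-self-model, :scene :stream-of-water, :source video})

    (defn imagined-senses
      "Step 2: play out the simulated scene and collect imagined
       sensory experience. Stand-in values; a real system would
       generate these by running the physics simulation."
      [simulation]
      {:muscles :lapping-motion
       :vision  :close-up-of-stream
       :taste   :water-on-tongue})

    (defn drinking?
      "Step 3: identify the action from imagined senses alone. Taste
       is decisive; vision is not consulted at all here."
      [senses]
      (= (:taste senses) :water-on-tongue))

    (-> "cat-video.avi" build-simulation imagined-senses drinking?)
    ;; => true

The same shape fits the section's embodied answer to ``what is a chair?'': a predicate over felt body state (butt pressure, bent knees, relaxed back and legs) rather than over pixels.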
--- a/thesis/cover.tex	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/cover.tex	Sun Mar 23 22:20:44 2014 -0400
@@ -45,7 +45,7 @@
 % however the specifications can change. We recommend that you verify the
 % layout of your title page with your thesis advisor and/or the MIT
 % Libraries before printing your final copy.
-\title{CORTEX : A Virtual World for Sensate AI}
+\title{Solving Problems using Embodiment \& Empathy.}
 \author{Robert Louis M\raisebox{\depth}{\small \underline{\underline{c}}}Intyre}
 %\author{Robert McIntyre}
 
Binary file thesis/images/fat-person-sitting-at-desk.jpg has changed

Binary file thesis/images/invisible-chair.png has changed
--- a/thesis/rlm-cortex-meng.tex	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/rlm-cortex-meng.tex	Sun Mar 23 22:20:44 2014 -0400
@@ -100,7 +100,7 @@
 %\bibliographystyle{agsm}
 %\bibliographystyle{apa}
 %\bibliographystyle{plainnat}
-
+\include{user-guide}
 \printbibliography
 \end{singlespace}
 \end{document}
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/thesis/user-guide.org	Sun Mar 23 22:20:44 2014 -0400
@@ -0,0 +1,6 @@
+* Appendix: =CORTEX= User Guide
+
+  For future students who would like to use =CORTEX= in their own
+  projects.
+
+
--- a/thesis/weave-thesis.sh	Sun Mar 23 19:09:14 2014 -0400
+++ b/thesis/weave-thesis.sh	Sun Mar 23 22:20:44 2014 -0400
@@ -9,6 +9,8 @@
 (progn
  (find-file \"cortex.org\")
  (org-latex-export-to-latex nil nil nil t nil) \
+ (find-file \"user-guide.org\")
+ (org-latex-export-to-latex nil nil nil t nil) \
  (find-file \"abstract.org\")
  (org-latex-export-to-latex nil nil nil t nil))" \
 \