comparison thesis/cortex.org @ 446:3e91585b2a1c

save.
author Robert McIntyre <rlm@mit.edu>
date Tue, 25 Mar 2014 03:24:28 -0400
parents 47cfbe84f00e
children 284316604be0
to interpret the actions of a simple, worm-like creature.

#+caption: The worm performs many actions during free play such as
#+caption: curling, wiggling, and resting.
#+name: worm-intro
#+ATTR_LaTeX: :width 15cm
[[./images/worm-intro-white.png]]

#+caption: The actions of a worm in a video can be recognized with
#+caption: proprioceptive data and sensory predicates by filling
#+caption: in the missing sensory detail with previous experience.
#+name: worm-recognition-intro
#+ATTR_LaTeX: :width 15cm
[[./images/worm-poses.png]]

One powerful advantage of empathic problem solving is that it
factors the action recognition problem into two easier problems. To
use empathy, you need an /aligner/, which takes the video and a
model of your body, and aligns the model with the video. Then, you
need a /recognizer/, which uses the aligned model to interpret the
action. The power of this method lies in the fact that you describe
all actions from a body-centered, rich viewpoint. This way, if you
teach the system what ``running'' is, and you have a good enough
aligner, the system will from then on be able to recognize running
from any point of view, even strange points of view like above or
underneath the runner. This is in contrast to action recognition
schemes that try to identify actions using a non-embodied approach
such as TODO:REFERENCE. If these systems learn about running as viewed
from the side, they will not automatically be able to recognize
running from any other viewpoint.

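To make this factoring concrete, here is a minimal sketch of how the
two stages might compose. This is not code from =CORTEX=: the names
=fit-pose=, =align-model=, and =recognize-action= are hypothetical
placeholders for the aligner and recognizer described above.

#+begin_src clojure
;; Hypothetical sketch of the aligner/recognizer factoring. None of
;; these names come from CORTEX; fit-pose stands for whatever
;; procedure fits the body model to a single video frame.

(defn align-model
  "Aligner: fit the body model to each video frame, returning the
   sequence of inferred body configurations (e.g. joint angles)."
  [body-model video-frames]
  (map #(fit-pose body-model %) video-frames))

(defn recognize-action
  "Recognizer: interpret an aligned body-configuration sequence
   using body-centered predicates such as curled? or wiggling?."
  [aligned-poses]
  (cond (curled?   aligned-poses) :curled
        (wiggling? aligned-poses) :wiggling
        :else                     :resting))

(defn empathic-recognize
  "Empathic action recognition, factored into align-then-recognize."
  [body-model video-frames]
  (recognize-action (align-model body-model video-frames)))
#+end_src
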
Another powerful advantage is that using the language of multiple
body-centered rich senses to describe body-centered actions offers a
massive boost in descriptive capability. Consider how difficult it
would be to compose a set of HOG filters to describe the action of
a simple worm-creature "curling" so that its head touches its tail,
and then behold the simplicity of describing this action in a
language designed for the task (listing \ref{grand-circle-intro}):

#+caption: Body-centered actions are best expressed in a body-centered
#+caption: language. This code detects when the worm has curled into a
#+caption: full circle. Imagine how you would replicate this functionality
#+caption: using low-level pixel features such as HOG filters!
#+name: grand-circle-intro
#+begin_listing clojure
#+begin_src clojure
(defn grand-circle?
  "Does the worm form a majestic circle (one end touching the other)?"
  [experiences]
  (and (curled? experiences)
       (let [worm-touch (:touch (peek experiences))
             tail-touch (worm-touch 0)
             head-touch (worm-touch 4)]
         (and (< 0.55 (contact worm-segment-bottom-tip tail-touch))
              (< 0.55 (contact worm-segment-top-tip head-touch))))))
#+end_src
#+end_listing

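As a usage sketch, assuming =experiences= is the worm's accumulated
sequence of per-timestep sensory maps, applying such a predicate is
just a function call:

#+begin_src clojure
;; Hypothetical usage; `experiences` is assumed to hold the worm's
;; per-timestep sensory maps, with touch data under :touch.
(when (grand-circle? experiences)
  (println "The worm has curled into a full circle."))
#+end_src
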
** =CORTEX= is a toolkit for building sensate creatures

Hand integration demo
