comparison thesis/cortex.org @ 446:3e91585b2a1c
save.
author | Robert McIntyre <rlm@mit.edu> |
date | Tue, 25 Mar 2014 03:24:28 -0400 |
parents | 47cfbe84f00e |
children | 284316604be0 |
445:47cfbe84f00e | 446:3e91585b2a1c |
---|---|
179 to interpret the actions of a simple, worm-like creature. | 179 to interpret the actions of a simple, worm-like creature. |
180 | 180 |
181 #+caption: The worm performs many actions during free play such as | 181 #+caption: The worm performs many actions during free play such as |
182 #+caption: curling, wiggling, and resting. | 182 #+caption: curling, wiggling, and resting. |
183 #+name: worm-intro | 183 #+name: worm-intro |
184 #+ATTR_LaTeX: :width 13cm | 184 #+ATTR_LaTeX: :width 15cm |
185 [[./images/worm-intro-white.png]] | 185 [[./images/worm-intro-white.png]] |
186 | 186 |
187 #+caption: The actions of a worm in a video can be recognized from | |
188 #+caption: proprioceptive data and sensory predicates by filling | |
189 #+caption: in the missing sensory detail with previous experience. | |
190 #+name: worm-recognition-intro | |
191 #+ATTR_LaTeX: :width 15cm | |
192 [[./images/worm-poses.png]] | |
193 | |
194 | |
195 One powerful advantage of empathic problem solving is that it | |
196 factors the action recognition problem into two easier problems. To | |
197 use empathy, you need an /aligner/, which takes the video and a | |
198 model of your body, and aligns the model with the video. Then, you | |
199 need a /recognizer/, which uses the aligned model to interpret the | |
200 action. The power in this method lies in the fact that you describe | |
201 all actions from a body-centered, rich viewpoint. This way, if you | |
202 teach the system what ``running'' is, and you have a good enough | |
203 aligner, the system will from then on be able to recognize running | |
204 from any point of view, even strange points of view like above or | |
205 underneath the runner. This is in contrast to action recognition | |
206 schemes that try to identify actions using a non-embodied approach | |
207 such as TODO:REFERENCE. If these systems learn about running as viewed | |
208 from the side, they will not automatically be able to recognize | |
209 running from any other viewpoint. | |
210 | |
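As a rough sketch of this factoring, the aligner and the recognizer can be treated as two separate functions and composed; the names =align-model-to-video= and =interpret-action= below are hypothetical stand-ins and are not part of =CORTEX= itself:

#+begin_src clojure
;; Sketch only: `align-model-to-video` and `interpret-action` are
;; hypothetical stand-ins for an aligner and a recognizer; neither is
;; defined in CORTEX.
(defn empathic-recognizer
  "Compose an aligner and a recognizer into a single function that
   maps a body model and a video to a recognized action."
  [align-model-to-video interpret-action]
  (fn [body-model video]
    (-> (align-model-to-video body-model video)
        (interpret-action))))
#+end_src

Because the recognizer only ever sees the aligned, body-centered model, a new camera angle changes the aligner's job but leaves the recognizer untouched.
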
211 Another powerful advantage is that using the language of multiple | |
212 body-centered rich senses to describe body-centered actions offers a | |
213 massive boost in descriptive capability. Consider how difficult it | |
214 would be to compose a set of HOG filters to describe the action of | |
215 a simple worm-creature "curling" so that its head touches its tail, | |
216 and then behold the simplicity of describing this action in a | |
217 language designed for the task (listing \ref{grand-circle-intro}): | |
187 | 218 |
188 #+caption: Body-centered actions are best expressed in a body-centered | 219 #+caption: Body-centered actions are best expressed in a body-centered |
189 #+caption: language. This code detects when the worm has curled into a | 220 #+caption: language. This code detects when the worm has curled into a |
190 #+caption: full circle. Imagine how you would replicate this functionality | 221 #+caption: full circle. Imagine how you would replicate this functionality |
191 #+caption: using low-level pixel features such as HOG filters! | 222 #+caption: using low-level pixel features such as HOG filters! |
202 (and (< 0.55 (contact worm-segment-bottom-tip tail-touch)) | 233 (and (< 0.55 (contact worm-segment-bottom-tip tail-touch)) |
203 (< 0.55 (contact worm-segment-top-tip head-touch)))))) | 234 (< 0.55 (contact worm-segment-top-tip head-touch)))))) |
204 #+end_src | 235 #+end_src |
205 #+end_listing | 236 #+end_listing |
206 | 237 |
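To connect a predicate like the one in the listing to video, one could imagine mapping it over the aligned experience frames produced by the aligner. This is only a sketch: =curled-into-circle?= is a hypothetical stand-in for the listing's predicate, and frames are represented as plain data:

#+begin_src clojure
;; Sketch only: `curled-into-circle?` stands in for the predicate in
;; the listing above; `experience` is the sequence of aligned,
;; body-centered frames recovered from the video.
(defn label-frames
  "Return, for each frame of experience, the set of action names whose
   predicates hold on that frame."
  [predicates experience]
  (map (fn [frame]
         (set (for [[action-name action?] predicates
                    :when (action? frame)]
                action-name)))
       experience))

;; e.g. (label-frames {:curled curled-into-circle?} experience)
#+end_src
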
207 #+caption: The actions of a worm in a video can be recognized from | |
208 #+caption: proprioceptive data and sensory predicates by filling | |
209 #+caption: in the missing sensory detail with previous experience. | |
210 #+name: worm-recognition-intro | |
211 #+ATTR_LaTeX: :width 10cm | |
212 [[./images/worm-poses.png]] | |
213 | |
214 | |
215 One powerful advantage of empathic problem solving is that it | |
216 factors the action recognition problem into two easier problems. To | |
217 use empathy, you need an /aligner/, which takes the video and a | |
218 model of your body, and aligns the model with the video. Then, you | |
219 need a /recognizer/, which uses the aligned model to interpret the | |
220 action. The power in this method lies in the fact that you describe | |
221 all actions from a body-centered, rich viewpoint. This way, if you | |
222 teach the system what ``running'' is, and you have a good enough | |
223 aligner, the system will from then on be able to recognize running | |
224 from any point of view, even strange points of view like above or | |
225 underneath the runner. This is in contrast to action recognition | |
226 schemes that try to identify actions using a non-embodied approach | |
227 such as TODO:REFERENCE. If these systems learn about running as viewed | |
228 from the side, they will not automatically be able to recognize | |
229 running from any other viewpoint. | |
230 | |
231 Another powerful advantage is that using the language of multiple | |
232 body-centered rich senses to describe body-centered actions offers a | |
233 massive boost in descriptive capability. Consider how difficult it | |
234 would be to compose a set of HOG filters to describe the action of | |
235 a simple worm-creature "curling" so that its head touches its tail, | |
236 and then behold the simplicity of describing this action in a | |
237 language designed for the task (listing \ref{grand-circle-intro}): | |
238 | |
239 | 238 |
240 ** =CORTEX= is a toolkit for building sensate creatures | 239 ** =CORTEX= is a toolkit for building sensate creatures |
241 | 240 |
242 Hand integration demo | 241 Hand integration demo |
243 | 242 |