Robot Learning From Human Demonstration
Maja J Matarić, Chad Jenkins, Marcelo Kallmann, Evan Drumwright, Nathan Miller, and Chi-Wei Chu
University of Southern California
Interaction Lab / Robotics Research Lab
Center for Robotics and Embedded Systems (CRES)
http://robotics.usc.edu/ agents/Mars2020/mars2020.html
MARS PI Meeting 9/2003

Motivation & Approach
Goals:
– Natural human-robot interaction in various domains
– Automated robot programming & learning by imitation
General Approach:
– Use an intrinsic behavior repertoire to facilitate control, human-robot interaction, and learning
– Use human interactive training methods (past work)
– Use human data-driven programming & training methods

Recent Progress
Getting more & better training data:
– A light-weight, low-cost motion-capture mechanism
Real-world validation of the method:
– Application of the method to Robonaut data
Application of the method I:
– Synthesis of novel humanoid motion from automatically derived movement primitives
Application of the method II:
– Movement classification, prediction, and imitation
The next big problem:
– Humanoid motion planning around (dynamic) obstacles & validation on Robonaut

Getting more & better training data: IMU Motion Capture Suit
Goal: Develop a low-cost motion-capture device capable of logging high-DOF human motion in an unstructured environment.
Solution: Use filtered Inertial Measurement Units (IMUs) for 3-DOF tracking of each joint. Each sensor costs $300.00 to develop, resulting in a suit cost of $4200.00 to track 14 DOF.
Advantages:
1) Motion tracking is not coupled to off-person emitters/detectors, so the suit can be used outdoors or anywhere else
2) Sensors are small and networked, allowing various configurations
3) High bandwidth allows for real-time interaction with visualization and simulation tools

Getting more & better training data: IMU Motion Capture Suit
[Figure: suit system diagram. Labels: wireless connection, human body model, sensor network. Legend: sensor location; onboard computer & battery]

Getting more & better training data: Suit Details
Specifications (REV 1):
– Atmel 8-bit microcontroller w/ 10-bit ADC, 8 MHz
– (3) 300 deg/sec gyroscopes
– (3) 2-G accelerometers
– $200.00/sensor
– 2 DOF filtered
Next Revision:
– (3) Honeywell magnetometers
– Change to 12-bit ADC, 16 MHz CPU
– $260.00/sensor
– Full 3 DOF filtered
[Figure: REV 1 sensor board, labeled 1.5"]

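The slides say the gyroscope and accelerometer readings are "filtered" into 2 DOF but do not name the filter. The sketch below assumes a complementary filter, one common choice for this sensor pairing; the function names, the `alpha` weight, and the axis conventions are illustrative assumptions, not details of the suit firmware.

```python
import math

def accel_tilt(ax, ay, az):
    """Roll and pitch (radians) implied by the measured gravity direction."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch

def complementary_step(roll, pitch, gyro, accel, dt, alpha=0.98):
    """One filter update (assumed design, not the suit's actual filter).

    gyro  = (gx, gy): angular rates in rad/s about the roll and pitch axes
    accel = (ax, ay, az): acceleration in units of g
    alpha = weight on the integrated gyro estimate (hypothetical value)
    """
    acc_roll, acc_pitch = accel_tilt(*accel)
    # Integrate the gyro for short-term accuracy (small-angle approximation),
    # then pull toward the accelerometer estimate to cancel gyro drift.
    roll = alpha * (roll + gyro[0] * dt) + (1.0 - alpha) * acc_roll
    pitch = alpha * (pitch + gyro[1] * dt) + (1.0 - alpha) * acc_pitch
    return roll, pitch
```

Gravity carries no heading information, so a filter of this kind recovers only two angles; that is consistent with the next revision adding magnetometers to reach full 3 DOF.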
Recap of the method: Automatically Deriving Behaviors
Input: kinematic motion; time series of joint angles
Motion segmentation
– Partition input motion into conceptually indivisible motion segments
Grouping of behavior exemplars
– Spatio-temporal Isomap dimension reduction and clustering
Generalizing behaviors into forward models
– Interpolation of a dense sampling for each behavior
Meta-level exemplar grouping
– Additional embedding iterations for higher-level behaviors
O. C. Jenkins, M. J Matarić, "Automated Derivation of Behavior Vocabularies for Autonomous Humanoid Motion", Autonomous Agents and Multiagent Systems, Melbourne, Australia, July 14-16, 2003.

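As a rough illustration of the spatio-temporal idea in the grouping step, the sketch below builds an Isomap-style geodesic distance matrix in which consecutive frames are pulled closer together. It is a simplification made under stated assumptions: the published ST-Isomap weighting is more elaborate, the parameters `k` and `c_temporal` are hypothetical, and the final embedding step (multidimensional scaling on the returned matrix) is omitted.

```python
import math

def st_geodesic_distances(frames, k=3, c_temporal=10.0):
    """Geodesic distances with temporally adjacent frames pulled together.

    frames: list of equal-length joint-angle vectors, in time order.
    k: spatial nearest neighbors linked per frame (assumed value).
    c_temporal: factor (>= 1) by which distances between consecutive
        frames are reduced (assumed value).
    """
    n = len(frames)
    d = [[math.dist(a, b) for b in frames] for a in frames]
    inf = float("inf")
    g = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        # Link each frame to its k nearest spatial neighbors.
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
        for j in nearest:
            g[i][j] = g[j][i] = d[i][j]
    for i in range(n - 1):
        # Consecutive frames are always linked, at a reduced distance.
        g[i][i + 1] = g[i + 1][i] = d[i][i + 1] / c_temporal
    # Floyd-Warshall all-pairs shortest paths give the geodesic distances.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g
```

The cubic-time shortest-path loop is fine for a few hundred frames; a larger data set would call for a sparse graph and Dijkstra's algorithm instead.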
Validation of the method on Robonaut: Applying the Method to Robonaut
– Work with Alan Peters, more tomorrow
– 80-D data from tactile and force sensors
– 5 tele-op grasps of a horizontal wrench: 460 frames each, 2300 total
– Applied sequentially continuous ST-Isomap
[Figures: PCA embedding (not informative); ST-Isomap distance matrix; ST-Isomap embedding]

Validation of the method on Robonaut: Uncovering Structure in the Data
– Mapping of a new grasp motion onto the derived embedding; structure is retained
– Useful for monitoring performance, data analysis, generating controllers, etc.
[Figure: ST-Isomap embedding]

Using Derived Behaviors
We now have a method for deriving vocabularies of behaviors from kinematic time series of human motion:
– Each primitive is a nonparametric exemplar-based motion model
– Each primitive can be eagerly evaluated to encode nonlinear dynamics in joint angle space
We can use those derived behaviors for motion synthesis, prediction, and classification.
Our recent work applied the behaviors toward:
– Individually indexing to provide state prediction
– Providing desired poses ("desireds") for control
– Matching against observed motion for classification and learning

Use of the method: generating movement. Forward Model Motion Synthesis
– Controller has a set of primitive behaviors
– Arbitrator decides which primitive to activate (e.g., based on transition probabilities)
– The active primitive incrementally updates the robot's current pose (i.e., sets the desireds)
– Controller can generate motion indefinitely

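The control loop described above can be sketched as follows. This is a minimal illustration, assuming each primitive is a function from the current pose to the next desired pose and that the arbitrator samples from a table of transition probabilities; the toy one-DOF primitives, the table, and all names are hypothetical stand-ins, not the derived behaviors themselves.

```python
import random

def choose_next(transitions, current, rng):
    """Arbitrator: sample the next primitive from the transition probabilities."""
    names = list(transitions[current])
    weights = [transitions[current][name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def synthesize(primitives, transitions, pose, start, n_activations, steps_each, seed=0):
    """Generate a pose trajectory by repeatedly activating primitives."""
    rng = random.Random(seed)
    active = start
    trajectory = [pose]
    for _ in range(n_activations):
        for _ in range(steps_each):
            pose = primitives[active](pose)  # active primitive sets the next desired pose
            trajectory.append(pose)
        active = choose_next(transitions, active, rng)
    return trajectory

# Toy one-DOF primitives (hypothetical): raise or lower a single joint angle.
def wave_in(pose):
    return (pose[0] + 0.1,)

def wave_out(pose):
    return (pose[0] - 0.1,)

PRIMITIVES = {"wave_in": wave_in, "wave_out": wave_out}
TRANSITIONS = {"wave_in": {"wave_out": 1.0}, "wave_out": {"wave_in": 1.0}}
```

Calling `synthesize(PRIMITIVES, TRANSITIONS, (0.0,), "wave_in", n_activations=4, steps_each=5)` alternates the two primitives; raising `n_activations` extends the motion for as long as desired.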
Use of the method: generating movement. Representation of the Behaviors
– Behavior primitives are manifold-like flow fields in joint angle space; temporal ordering creates the flow field gradient
– This representation is a forward model, allowing motion to be indexed, predicted, and synthesized dynamically, in real time
– The model can generalize the exemplars to create novel motion
[Figure: 3 main PCs of a primitive flow field in joint angle space. Blue: exemplar trajectories. Black to red: interpolated motion creating the temporal gradient flow field]

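One way to picture an exemplar-based flow field used as a forward model is nearest-neighbor interpolation of exemplar displacements, sketched below. The inverse-distance weighting, the neighbor count `k`, and the function names are assumptions made for illustration; the slides do not specify the interpolation scheme.

```python
import math

def build_flow_field(exemplars):
    """Collect (pose, displacement to the next pose) pairs from exemplar trajectories."""
    samples = []
    for traj in exemplars:
        for a, b in zip(traj, traj[1:]):
            samples.append((a, tuple(y - x for x, y in zip(a, b))))
    return samples

def predict_next(samples, pose, k=3, eps=1e-9):
    """Forward model: move the pose along the locally averaged flow.

    Uses an inverse-distance-weighted mean of the displacements stored at
    the k exemplar poses nearest to the query pose (assumed scheme).
    """
    nearest = sorted(samples, key=lambda s: math.dist(s[0], pose))[:k]
    weights = [1.0 / (math.dist(p, pose) + eps) for p, _ in nearest]
    total = sum(weights)
    step = [
        sum(w * disp[i] for w, (_, disp) in zip(weights, nearest)) / total
        for i in range(len(pose))
    ]
    return tuple(x + dx for x, dx in zip(pose, step))
```

Because the query pose need not lie on any exemplar, repeated calls to `predict_next` from a new starting pose trace out a trajectory that none of the exemplars contains, which is the sense in which the model generalizes.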
Use of the method: generating movement. Example: 1-Primitive-Based Synthesis
Three motions generated from the same primitive behavior, using different starting poses, showing flow and variation
[Figures: PCA view of the primitive flow field in joint angle space; resulting kinematic motion]

Use of the method: generating movement. Example: 2-Primitive-Based Synthesis
Motion generated by combining two primitives (wave-in and wave-out) with a high-level arbitrator that sequences their activation
[Video: arm waving]

Use of the method: generating movement. Examples: 78-Behavior-Based Synthesis
[Videos: Multi-activity (take 1); cabbage patch; twist; Multi-activity (take 2); cabbage patch; twist; Single activity reaching (no root info)]

Use of the method: generating movement. Synthesis from Isolated Activities
[Videos: cabbage patch (20000 frames); combined punching (3400 frames); jab punching (5000 frames); jab punching (view 2)]

Use of the method: classifying movement. Behavior Classification & Imitation
Goal: use the primitive behaviors to recognize, classify, predict, and imitate/reconstruct observed movement
Compare observed motion with predictions from behavior primitives
– Use Euclidean distance between end-effector positions as a metric
– Use a Bayesian classifier
Reconstruct/imitate the observed movement
– Concatenate best-match trajectories from classified primitives to reconstruct/imitate

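The distance-based comparison and the concatenation step can be sketched as below. The sketch assumes that each primitive has already produced a predicted end-effector trajectory for the observed segment; averaging the per-frame distances, and the names `classify`, `imitate`, and `predictions_for`, are illustrative choices rather than details given in the slides.

```python
import math

def trajectory_distance(observed, predicted):
    """Mean Euclidean distance between matching end-effector positions."""
    n = min(len(observed), len(predicted))
    return sum(math.dist(o, p) for o, p in zip(observed, predicted)) / n

def classify(observed, predictions):
    """Return the best-matching primitive name and the score of every primitive.

    predictions: {primitive name: predicted end-effector trajectory}
    """
    scores = {name: trajectory_distance(observed, traj) for name, traj in predictions.items()}
    return min(scores, key=scores.get), scores

def imitate(segments, predictions_for):
    """Concatenate the best-match trajectory for each observed segment.

    predictions_for: hypothetical callable mapping a segment to its
        {primitive name: predicted trajectory} dictionary.
    """
    reconstruction = []
    for segment in segments:
        predictions = predictions_for(segment)
        best, _ = classify(segment, predictions)
        reconstruction.extend(predictions[best])
    return reconstruction
```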
Use of the method: classifying movement. Classification & Imitation Schematic
[Figure: classification and imitation schematic]

Use of the method: imitating movement. Example: "Yo-yo" Imitation
[Videos: observed "yo-yo" motion (from MegaMocap V2); "yo-yo" from punching; "yo-yo" reconstruction from waving; "yo-yo" from the twist; "yo-yo" from cabbage patch]

Use of the method: classifying movement. Probabilistic Behavior Classification
– Behavior primitives are models
– Can use the flow-field representation; in this case, radial basis functions were used
– We can apply a Bayesian classifier: $P(C \mid X) \propto P(X \mid C)\,P(C)$, where C is a class (behavior) and X is an observation (joint angles)
– $P(X \mid C)$ can be determined from the primitives
– Classifier operates in real time on joint-angle data
– Applications: human avoidance, interactive tasks with human operators and collaborators and/or other robots
E. Drumwright, M. J Matarić, "Generating and Recognizing Free-Space Movements in Humanoid Robots", IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS-2003), Las Vegas, Nevada, Oct 25-30, 2003.

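To make the Bayes rule concrete, the sketch below computes the normalized posterior over behaviors for one observation of joint angles. It assumes an independent Gaussian likelihood per joint, which is a stand-in chosen for brevity: the slides state that radial basis functions were used, and the model format and names here are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def posterior(x, models, priors):
    """P(C | X) for every class C, normalized to sum to 1.

    x: observed joint angles
    models: {class: [(mean, var), ...]}, one pair per joint (assumed form)
    priors: {class: P(C)}
    """
    log_joint = {
        c: math.log(priors[c])
        + sum(log_gaussian(xi, m, v) for xi, (m, v) in zip(x, params))
        for c, params in models.items()
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_joint.values())
    unnormalized = {c: math.exp(v - top) for c, v in log_joint.items()}
    z = sum(unnormalized.values())
    return {c: u / z for c, u in unnormalized.items()}
```

Working in log space keeps the product of many per-joint likelihoods numerically stable, which matters once the observation covers a full arm or body.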
Use of the method: classifying movement. Bayesian Behavior Classification
- Model is a distribution of joint angles over time (see figure)
- Actual distribution is multivariate (the variables are the DOF used by the primitive behaviors)
[Figure: mixture spaces between 2 exemplars of the jab primitive for (left) one shoulder DOF and (right) a second shoulder DOF]

Use of the method: classifying movement. Bayesian Classification Results
Classification of novel movement is highly accurate

Dataset | Description | % error
Primitive movements | 50 non-exemplar instances of primitives executed on a physically simulated humanoid | 3.39
Motion capture and animation data | 550 movements from animation and motion capture | 0.03

Next problem: humanoid motion planning. Humanoid Motion Planning
Goal:
– Synthesize real-time, collision-free humanoid motion in dynamic environments
Approach:
– Use demonstrated motion data to compute a meaningful representation of valid motions
– This enables:
  - fast determination of collision-free paths
  - adaptation to new obstacles

Next problem: humanoid motion planning. Humanoid Motion Planning
Approach:
– Use pre-computed probabilistic roadmaps to represent the valid motion space of the humanoid
– Temporarily disable parts of the roadmap that are invalid when obstacles are perceived. If the remaining part is not enough, perform on-line planning.
Contribution:
– Introduction of dynamic roadmaps for motion planning, joining the advantages of multi-query methods (PRMs, PRTs, VGs) and single-query methods (RRTs, Exp. Spaces, SBLs)
– Solutions for the humanoid case, e.g., the use of demonstrated motions to construct suitable roadmaps

Next problem: humanoid motion planning. Details of the Approach (1/3)
Roadmap computation:
– In high-dimensional configuration space, comprising both arms and torso
– Pre-computed using PRM sampling
– Use density limits to achieve uniform sampling of end-effector positions in the reachable workspace
– Sample postures in the subspace covered by the demonstrated data (current work)
– Even without considering obstacles, the roadmap is useful for deriving motions without self-collisions
[Figures: models labeled 22 DOFs and 17 DOFs]

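The roadmap pre-computation can be sketched as a basic PRM build. The example works in a toy unit-hypercube configuration space and applies its spacing limit directly to configurations, whereas the slides apply density limits to end-effector positions; the collision callbacks `is_free` and `edge_free` and all parameter values are hypothetical placeholders.

```python
import math
import random

def build_prm(n_nodes, radius, is_free, edge_free,
              min_spacing=0.0, dim=2, seed=0, max_tries=10000):
    """Sample a roadmap; returns (nodes, edges) with edges as adjacency sets.

    is_free(q): True if configuration q is valid (e.g., no self-collision).
    edge_free(a, b): True if the straight motion from a to b is valid.
    min_spacing: reject samples closer than this to an existing node,
        a simple density limit (assumed form).
    """
    rng = random.Random(seed)
    nodes = []
    for _ in range(max_tries):
        if len(nodes) == n_nodes:
            break
        q = tuple(rng.random() for _ in range(dim))
        if not is_free(q):
            continue
        if any(math.dist(q, p) < min_spacing for p in nodes):
            continue
        nodes.append(q)
    edges = {i: set() for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            # Connect nearby nodes whose connecting motion is valid.
            if math.dist(nodes[i], nodes[j]) <= radius and edge_free(nodes[i], nodes[j]):
                edges[i].add(j)
                edges[j].add(i)
    return nodes, edges
```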
Next problem: humanoid motion planning. Details of the Approach (2/3)
On-line roadmap maintenance:
– When obstacles are detected, invalid edges and nodes are disabled
– Workspace cell decomposition is used for fast localization of invalid nodes and edges
– The time required to update the roadmap depends on the complexity of the environment and robot (collision detection)
– Trade-offs with on-line planning: roadmap update is not suitable for highly dynamic environments, but is fine for pick-and-place applications

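The cell-decomposition lookup can be illustrated with a grid index from workspace cells to the roadmap elements that occupy them. In this simplified sketch each node is treated as a point and each edge as a straight segment sampled at a fixed number of points, whereas a real humanoid roadmap would record every cell swept by the robot's body; the function names and the `samples` parameter are assumptions.

```python
def cell_of(point, cell_size):
    """Integer grid cell containing a point."""
    return tuple(int(c // cell_size) for c in point)

def cells_along(a, b, cell_size, samples=10):
    """Cells touched by the segment a-b, found by sampling points along it."""
    return {
        cell_of(tuple(x + (y - x) * t / samples for x, y in zip(a, b)), cell_size)
        for t in range(samples + 1)
    }

def index_roadmap(nodes, edges, cell_size):
    """Pre-compute cell -> nodes and cell -> edges lookup tables."""
    node_index, edge_index = {}, {}
    for i, q in enumerate(nodes):
        node_index.setdefault(cell_of(q, cell_size), set()).add(i)
        for j in edges[i]:
            if i < j:  # store each undirected edge once
                for c in cells_along(q, nodes[j], cell_size):
                    edge_index.setdefault(c, set()).add((i, j))
    return node_index, edge_index

def invalidated(obstacle_cells, node_index, edge_index):
    """Nodes and edges to disable for the cells an obstacle occupies."""
    bad_nodes, bad_edges = set(), set()
    for c in obstacle_cells:
        bad_nodes |= node_index.get(c, set())
        bad_edges |= edge_index.get(c, set())
    return bad_nodes, bad_edges
```

The tables are built once off-line, so each on-line update costs only a dictionary lookup per occupied cell; the sampling in `cells_along` can miss a cell if the sample spacing exceeds the cell size.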
Next problem: humanoid motion planning. Details of the Approach (3/3)
On-line query:
– A*-like graph search quickly finds a path to the roadmap node nearest the goal posture
– If the node cannot be directly connected to the goal posture, on-line single-query planning is used (currently using bidirectional RRTs)
– Better results when few portions of the roadmap are invalidated
– Worst cases achieve similar results to using single-query planning alone

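A minimal sketch of the query step follows: A* over the roadmap nodes that are still enabled, with a fallback when no roadmap path survives. The Euclidean edge cost and heuristic, and the `fallback_planner` callable standing in for the single-query planner, are assumptions made for this example.

```python
import heapq
import math

def astar(nodes, edges, start, goal, disabled=frozenset()):
    """Shortest path over enabled nodes; returns a list of node ids or None."""
    if start in disabled or goal in disabled:
        return None
    frontier = [(math.dist(nodes[start], nodes[goal]), 0.0, start)]
    came_from = {start: None}
    cost = {start: 0.0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale queue entry, a cheaper route was found already
        for nxt in edges[current]:
            if nxt in disabled:
                continue
            new_cost = g + math.dist(nodes[current], nodes[nxt])
            if new_cost < cost.get(nxt, math.inf):
                cost[nxt] = new_cost
                came_from[nxt] = current
                priority = new_cost + math.dist(nodes[nxt], nodes[goal])
                heapq.heappush(frontier, (priority, new_cost, nxt))
    return None

def query(nodes, edges, start, goal, disabled, fallback_planner):
    """Use the roadmap when possible, else a single-query planner (hypothetical)."""
    path = astar(nodes, edges, start, goal, disabled)
    if path is None:
        return fallback_planner(nodes[start], nodes[goal])
    return [nodes[i] for i in path]
```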
Next problem: humanoid motion planning. Validation: Results With Robonaut
Example motions:
– Visualization geometry: 23930 triangles
– Collision geometry: 1016 triangles
– Optimization (smoothing) takes about 0.3 s (Pentium III 2.8 GHz)

Next problem: humanoid motion planning. Path Optimization (1/2)
Incremental path linearization:
– Simple and efficient in most cases
– May be time-consuming, as collision detection must be invoked before each local linearization

Next problem: humanoid motion planning. Path Optimization (2/2)
Incremental path linearization (continued):
– Sub-configuration linearization may be required, e.g., to decouple arm motions

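The slides do not spell out the linearization procedure. As a stand-in, the sketch below uses greedy shortcutting, a common smoothing technique in the same spirit: a stretch of the path is replaced by a straight segment whenever a collision check allows it. The `segment_free` callback is a hypothetical collision checker, and consecutive waypoints of the input path are assumed to be connected by valid motions.

```python
def linearize(path, segment_free):
    """Greedy shortcutting of a waypoint path.

    path: list of configurations; consecutive ones are assumed valid to connect.
    segment_free(a, b): True if the straight motion from a to b is collision-free.
    """
    result = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        # Try the farthest waypoint first; every attempt costs one collision check.
        while j > i + 1 and not segment_free(path[i], path[j]):
            j -= 1
        result.append(path[j])
        i = j
    return result
```

Each shortcut attempt invokes the collision checker, which is the cost the slide warns about. Decoupling sub-configurations, such as linearizing each arm separately, would mean running the same loop on a subset of the joint coordinates.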
Summary
Getting more & better data:
– A light-weight, low-cost motion-capture mechanism
Real-world validation of the method:
– Successful application of the method to Robonaut data
Application of the method I:
– Successful synthesis of novel humanoid motion from automatically derived movement primitives
Application of the method II:
– Efficient movement classification, prediction, and imitation
The next big problem:
– Humanoid motion planning around obstacles validated on Robonaut; dynamic obstacles to be addressed next

Contributors and More Info
Contributors: Chad Jenkins, Marcelo Kallmann, Evan Drumwright, Chi-Wei Chu & Nathan Miller
Additional info, papers, videos: http://robotics.usc.edu/ agents/Mars2020/mars2020.html