We apply Constructive Learning from Demonstration (CLfD), an approach grounded in constructionism (a learning theory traditionally employed in human education), to robotic culinary skill acquisition by comparing two teaching methods: Direct Teaching and computer vision-based imitation. The first method, Direct Teaching, captures precise coordinate data from the robot arm's end-effector, ensuring accurate physical reproduction of culinary tasks. The second method uses depth cameras to capture detailed human hand movements while the demonstrator physically guides the robot arm during cooking instruction, then converts these observations into robot-compatible coordinates for imitation learning. Both methods use Convolutional Neural Networks (CNNs) to extract feature representations for imitation learning. We compared the two methods on a tofu-cutting task, measuring physical outcomes (the number and thickness of slices) alongside operational metrics (total robot arm movement distance and task completion time). Results indicated that Direct Teaching, especially with ample training data, led to more accurate motion execution and reflected a greater impact of constructionist learning. By comparison, the computer vision-based method was less influenced by constructionist principles. These findings suggest that applying constructionist learning principles significantly enhances robotic skill acquisition, with implications for food robotics and computer vision.