Robotics
HIGHLIGHT - Autonomous Multi-Robot Learning in Inherently Cooperative Tasks
A particularly challenging domain for multi-robot learning concerns tightly coupled, inherently cooperative tasks. In these tasks, the utility of one robot's action depends on the current actions of its team members. Such inherently cooperative tasks cannot be decomposed into independent subtasks to be solved by a loosely coupled distributed robot team. Instead, the success of the task is measured by the combined actions of the robot team, rather than by individual robot actions and contributions. This problem domain is extremely difficult for robots to learn autonomously because performance credit must be assigned to individual robots based only on global rewards provided to the team as a whole. However, solving this problem is critical for reducing the cost of programming multi-robot teams and for enabling the teams to adapt autonomously to new environmental conditions.
We have been studying autonomous multi-robot learning for inherently cooperative tasks and have developed two new approaches to learning in the domain of cooperative multi-robot observation of multiple moving targets. The first approach combines reinforcement learning, instance-based learning, and a Pessimistic Algorithm that computes, for each team member, a lower bound on the utility of executing an action in a given situation. The second approach applies Q-learning together with the VQQL (Vector Quantization Q-Learning) technique, which reduces the state space of large domains, and the Generalized Lloyd Algorithm, which addresses the generalization issue in reinforcement learning. The figures below show the empirical studies of these approaches on our multi-robot team, and some of the results of the second learning approach.

These new techniques allow robots to build up memories of their experiences in the environment, evaluate the utility of alternative cooperative actions, and then select actions that increase the likelihood that the desired global team goals will be achieved through the individual robot decisions. These multi-robot learning techniques are the first in the field that enable robot teams to automatically learn new inherently cooperative control tasks, rather than having to be programmed explicitly. These capabilities support a wide variety of applications, including environmental cleanup, space exploration, military applications, and industrial operations.

Research by L. E. Parker, C. Touzet, and F. Fernandez, ORNL Computer Science and Mathematics Division; for details see:
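As a rough illustration of the ingredients named in the second approach above, the following Python sketch pairs a Generalized Lloyd Algorithm codebook with a one-step Q-learning update over the quantized states. It is not the authors' implementation: the function names (`nearest`, `lloyd`, `q_update`), the toy two-dimensional states, the two-action table, and the learning-rate and discount values are all assumptions chosen for illustration.

```python
import random


def nearest(codebook, x):
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )


def lloyd(samples, k, iters=20, seed=0):
    """Generalized Lloyd Algorithm: alternate nearest-codeword partitioning
    and centroid updates to build a k-entry codebook from state samples."""
    rng = random.Random(seed)
    codebook = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for s in samples:
            cells[nearest(codebook, s)].append(s)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One-step Q-learning update on quantized (integer) state indices."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])


# Hypothetical toy data: continuous 2-D observations from two regions.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
codebook = lloyd(samples, k=2)

# One Q-table row per codeword, one column per (assumed) action.
n_actions = 2
Q = [[0.0] * n_actions for _ in codebook]

# A single illustrative transition with a placeholder team reward of 1.0.
s = nearest(codebook, (0.2, 0.4))
s_next = nearest(codebook, (9.5, 10.2))
q_update(Q, s, a=0, r=1.0, s_next=s_next)
print(codebook, Q)
```

The point of the quantization step is that the Q-table is indexed by a small set of codewords rather than by raw continuous observations, so experience gathered at one state generalizes to every nearby state that maps to the same codeword.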