Robots Learn Soccer (and the Game of Life)
http://www.nytimes.com/2001/11/27/science/physical/27ROBO.html?ex=1007442000&en=3c6027b157c5124c&ei=5040&partner=MOREOVER
BY YUDHIJIT BHATTACHARJEE
Tucker Balch is an unusual kind of soccer coach.
He rarely cheers from the sidelines, and he never gets angry with his players.
Instead of shouting or pummeling the air, he leans forward every once in a
while to punch commands into a computer.
But then, his team is unusual too. The players are all robots. And while Dr.
Balch, a robotics researcher at the Georgia Institute of Technology in
Atlanta, sees no great future for robotic soccer stars, his experiments could
provide surprising insights into the workings of human society.
The robots in Dr. Balch's experiments are nothing more than a few lines of
software code, gradually taught to play soccer on a computer screen. They are
computer simulations of physical robots built by Dr. Balch - shoe-box-size
machines that he has entered in robot soccer tournaments.
Dr. Balch plans to reproduce his experiments using the real robots, work that
he says will bring his findings closer to the realm of social reality.
On the computer, the robots learn to play soccer by executing a random
sequence of basic moves - running toward the ball, kicking it, moving behind
the ball, blocking it. For every sequence, a computer program either rewards
or punishes the robot with a digital signal telling it whether the sequence
made sense and whether it should be repeated.
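The trial-and-error loop described above can be sketched in a few lines of Python. This is a toy, bandit-style illustration, not Dr. Balch's actual algorithm; the move names and the `evaluate` reward rule are invented stand-ins for the "made sense" signal the article mentions.

```python
import random

random.seed(0)

# Hypothetical basic moves, paraphrasing the article's list.
MOVES = ["run_to_ball", "kick", "move_behind_ball", "block"]

def evaluate(sequence):
    """Stand-in reward signal: +1 if the sequence 'made sense',
    -1 otherwise. Here we arbitrarily reward sequences that end
    by approaching the ball and then kicking it."""
    return 1 if sequence[-2:] == ["run_to_ball", "kick"] else -1

# The robot tries random sequences and keeps a score for each one,
# so rewarded sequences become more likely to be repeated.
scores = {}
for trial in range(1000):
    seq = tuple(random.choices(MOVES, k=3))
    scores[seq] = scores.get(seq, 0) + evaluate(list(seq))

best = max(scores, key=scores.get)
print(best)  # sequences ending in run_to_ball -> kick accumulate the top score
```

After enough trials, only the rewarded sequences carry positive scores, which is the sense in which the digital signal tells a robot "whether the sequence should be repeated."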
Dr. Balch divides his robots into two teams, represented by little circles on
the computer screen. Robots on a control team are able to pass the ball,
defend and attack from the starting whistle; the test team must learn by trial
and error as the game progresses. And it turns out that the test team behaves
much differently depending on whether its members are rewarded as individuals
or as a group.
Under the first scheme, a reward signal is sent only to robots that score a
goal. As the match progresses, every team member ends up learning the same
sequence of behaviors - going after the ball in a solo effort to score. As a
result, the circles on the screen bunch around a single point - wherever the
ball is - leaving the rest of the field open to attack.
Under the group-based reward scheme, all members in the learning team receive
a reward whenever any of them scores a goal. After several learning loops,
some of the robots end up with behavioral sequences that make them good
defenders. Others on the team evolve into forwards. "Group rewarding produced
greater diversity," Dr. Balch said, "and that made the team a winning
combination."
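The contrast between the two reward schemes can be sketched as a small simulation. Everything here is an invented illustration, not Dr. Balch's code: the roles, the `team_payoff` rule (at most two attackers can convert chances, and a team with no defender concedes), and the crude stand-in that under individual rewards only attackers, as the scorers, get credit.

```python
import random

ROLES = ["attack", "defend"]

def team_payoff(roles):
    """Hypothetical team score: at most two attackers can convert
    chances, and a team with no defender concedes two goals."""
    a = roles.count("attack")
    return min(a, 2) - (2 if a == len(roles) else 0)

def train(shared, team_size=4, rounds=3000, eps=0.2, seed=42):
    rng = random.Random(seed)
    # Each robot keeps a running-average payoff per role: [sum, count].
    stats = [{r: [0.0, 0] for r in ROLES} for _ in range(team_size)]
    value = lambda s: s[0] / s[1] if s[1] else 0.0
    for _ in range(rounds):
        # Mostly play the best-looking role, sometimes explore at random.
        roles = [rng.choice(ROLES) if rng.random() < eps
                 else max(ROLES, key=lambda r: value(stats[i][r]))
                 for i in range(team_size)]
        payoff = team_payoff(roles)
        for i, role in enumerate(roles):
            # Group scheme: everyone receives the team payoff.
            # Individual scheme (crude stand-in): only scorers, i.e.
            # attackers, are rewarded.
            r = payoff if shared else (1 if role == "attack" else 0)
            stats[i][role][0] += r
            stats[i][role][1] += 1
    return [max(ROLES, key=lambda r: value(stats[i][r])) for i in range(team_size)]

print(train(shared=False))  # every robot converges on "attack"
print(train(shared=True))   # a mix of attackers and defenders emerges
```

Under the individual scheme every robot learns the same solo behavior and the team "bunches" on attack, while the shared signal lets different robots settle into complementary roles, mirroring the diversity Dr. Balch observed.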
The results may be surprising to those who believe that the pursuit of
individual rewards - as in capitalism - encourages people to develop a
diversity of ideas, points of view, goals and strategies to achieve them.
Dr. Balch is not rushing to extend his findings to human societies, and he
acknowledges that human complexities - including traits like motivation and
jealousy - are hard or impossible to reproduce in robotic systems. But he says
the experiments show that robot studies can serve as a window for
understanding human behavior.
"Robots can learn and plan and communicate," he said. "They are probably the
best model we have right now for controlled experiments on social systems."
Dr. Ross Burkhart, a political science professor at Boise State University in
Idaho, who has studied the relationship between economic freedom and
democracy, says there are interesting parallels between the robot studies and
the way economies work.
"The goal in capitalism is for individuals to make as much money as they can,"
he said. "Whatever strategy brings one person a lot of wealth will inevitably
become the strategy used by most people."
Would a group-based reward system in a corporation make its employees
specialize in different roles, as the soccer-playing robots did under the
group-based scheme? Dr. Andrew Schotter, an economist at New York University,
laughs at the comparison.
"Some employees would probably stop working," he said. "You can't eliminate
free riding among office colleagues like you can in a team of robots. There
will always be some people who will take advantage of the fact that their
individual performance does not really matter."
If employees are rewarded individually, Dr. Schotter argues, they can maximize
their rewards by specializing in a certain role and carving a niche in the
organization.
Dr. Roger T. Johnson, who studies cooperative learning at the University of
Minnesota, believes that competition for rewards pressures people to become
similar.
"Cooperative reward systems often encourage diversity," Dr. Johnson said,
"because a group can be more productive if its members have different
perspectives. They offer a wider choice of ideas for problem-solving and doing
things efficiently."
Still, Dr. Schotter said there were too many differences between robots and
human beings to justify any parallels. For one thing, people don't just work
for money; they also work for fun.
"Also, people don't care only about what they are making individually," Dr.
Schotter said. "They also worry about how much the person in the next cubicle
is making."