[AAMAS09] Multi-Agent Learning II. Emergent Behaviour II

Well, well, well, finally I’m in a room

Stigmergic Landmark Foraging
Nyree Lemmens, Karl Tuyls

Late

Integrating Organizational Control into Multi-Agent Learning
Chongjie Zhang, Shereif Abdallah, Victor Lesser

Problems of distributed learning

  1. 1
  2. 2
  3. 3

Basic idea: an organization-based supervision framework. It’s a multilevel structure (recursive?) where the lowest-level network agents are ‘workers’. Each learning agent reports its abstract state to its immediate supervisor, which then uses rules and suggestions to transmit supervisory information to its subordinates. Rules are sets of forbidden actions and suggestions are actions with a degree in [-1,1]; rules are hard constraints and suggestions are soft constraints that represent preferences.
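Just so I remember how rules and suggestions could plug into a worker’s learning, here is a minimal sketch of my own reading of it (hypothetical names like `supervised_action`, not the authors’ algorithm): forbidden actions are filtered out, and suggestion degrees bias exploration.

```python
import random

# Hypothetical sketch (my reading, not the paper's exact mechanism):
# a worker filters out actions forbidden by its supervisor's rules and
# biases its exploration with suggestion degrees in [-1, 1].

def supervised_action(q_values, rules, suggestions, epsilon=0.1):
    """q_values: dict action -> learned Q estimate
    rules: set of forbidden actions (hard constraints)
    suggestions: dict action -> degree in [-1, 1] (soft constraints)"""
    allowed = [a for a in q_values if a not in rules]
    if random.random() < epsilon:
        # Exploration: weight each allowed action by 1 + its suggestion
        # degree, so positively suggested actions are tried more often.
        weights = [1.0 + suggestions.get(a, 0.0) for a in allowed]
        return random.choices(allowed, weights=weights, k=1)[0]
    # Exploitation: greedy over the allowed actions only.
    return max(allowed, key=lambda a: q_values[a])

# Example: action 'b' is forbidden, 'c' is encouraged by the supervisor.
print(supervised_action({'a': 0.2, 'b': 0.9, 'c': 0.1},
                        rules={'b'}, suggestions={'c': 0.8}))
```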

The problem they’ve used to test this model is DTAP (distributed task allocation problem), on a 27×27 agent grid… only!!! Too small!! I can manage several million agents to do the same :-( The results are interesting, but I don’t understand why all this machinery is needed for a network as small as this: two supervision levels for such a group of agents.

It scales, but only by adding more supervision levels, which may affect performance. I don’t like it; you’d need a lot of layers for a really big network. Furthermore, in the experiments they’ve used a grid instead of a network, and this is not ‘elegant’.

Multiagent Learning in Large Anonymous Games
Ian Kash, Eric Friedman, Joseph Halpern

We need to learn quickly, with minimal information and despite noise. To test their method they’re using games, but instead of classic game-theoretic games they’re continuous, anonymous, designed games. He explains the method with a simple game, but in the end it’s similar to game theory… I hate utility functions for agents. Behaviour can’t be reduced to a number or a function. Agents are more complex than that. We are more complex than that.

A simple algorithm adapts each agent’s behaviour to the rest, so the dynamics converge despite agents making mistakes (i.e. introducing noise) in their decisions. As the number of agents increases, the system becomes more stable and converges faster… they’ve tried it with 100 agents (again, too small for me). This result allows the system to tolerate strange behaviours.
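A rough sketch of the kind of “simple adaptation with noise” I understood, on a toy anonymous congestion-style game (my own toy, not the paper’s setup or their exact algorithm): each agent mostly keeps its action, sometimes best-replies to the current population counts, and sometimes makes a mistake.

```python
import random
from collections import Counter

ACTIONS = ['A', 'B', 'C']
N = 100  # number of agents, as in their smallest experiments

def payoff(action, counts):
    # Anonymous game: utility depends only on how many agents share my action.
    return -counts[action] / N  # fewer agents on my action = better

def step(choices, noise=0.05, inertia=0.8):
    counts = Counter(choices)
    new_choices = []
    for a in choices:
        if random.random() < noise:                 # mistake / noise
            new_choices.append(random.choice(ACTIONS))
        elif random.random() < inertia:             # keep current action
            new_choices.append(a)
        else:                                       # myopic best reply
            new_choices.append(max(ACTIONS, key=lambda x: payoff(x, counts)))
    return new_choices

choices = [random.choice(ACTIONS) for _ in range(N)]
for _ in range(200):
    choices = step(choices)
print(Counter(choices))  # roughly even split, despite the noisy agents
```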

Learning of Coordination
Francisco Melo, Manuela Veloso

Problem: many MAS solutions assume full joint-state observability, because considering only local observability makes the problem too complex to solve. But in many of these problems agent interactions are local, so agents have to learn when interaction/coordination is advantageous. MDPs and Q-learning are used, and to show how it works they use an example of two robots that have to cross a gate.

They introduce a Coordination action (a pseudo-action) and agents have to decide when to use this Coordinate action (it has a small penalty). Interesting method: agents can decide when to coordinate instead of exchanging irrelevant messages all the time. They’ve tried it in many different scenarios.
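A minimal sketch of how I understood the idea (my own toy code, not Melo & Veloso’s implementation): the action set includes an extra COORDINATE pseudo-action that costs a small penalty, and plain Q-learning decides in which local states that cost is worth paying.

```python
import random
from collections import defaultdict

ACTIONS = ['left', 'right', 'wait', 'COORDINATE']
COORD_PENALTY = 0.1   # assumed small cost for asking to coordinate

Q = defaultdict(float)  # (state, action) -> estimated value

def choose(state, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, alpha=0.1, gamma=0.95):
    if action == 'COORDINATE':
        reward -= COORD_PENALTY  # coordination is not free
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next
                                   - Q[(state, action)])
```

With enough training, COORDINATE should end up with a high Q-value only in states like the gate, where acting alone tends to cause collisions.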

Abstraction Pathologies in Extensive Games
Kevin Waugh, Dave Schnizlein, Michael Bowling, Duane Szafron

Talking about the poker competition for agents, just two players. They use abstractions and the agent has to decide when to refine them. Tested with no-limit and Leduc hold’em (a small game: 6-card deck, one card ‘hidden’ and the other public). Boring… talking about the details of the game and many, many results.

State-Coupled Replicator Dynamics
Daniel Hennes, Karl Tuyls

Using evolutionary game theory, but it is a single-state dynamic, so it has to be extended to multi-state. They show the behaviour in different classic games (such as the Prisoner’s Dilemma). Definitely… I’m not interested in this at all.
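For my own reference, the plain single-state replicator dynamic they start from, on the Prisoner’s Dilemma with standard payoffs; this is the textbook dynamic, not their state-coupled extension.

```python
import numpy as np

# Row player's payoff matrix, actions = [Cooperate, Defect].
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def replicator_step(x, dt=0.01):
    # dx_i/dt = x_i * ((A x)_i - x^T A x)
    fitness = A @ x
    avg = x @ fitness
    x = x + dt * x * (fitness - avg)
    return x / x.sum()   # keep the population on the simplex after the Euler step

x = np.array([0.9, 0.1])  # start with 90% cooperators
for _ in range(5000):
    x = replicator_step(x)
print(x)  # the population drifts towards all-Defect
```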

Wait a minute, from the examples I’ve seen it’s very similar to our model of agreement, at least in how it behaves. I’ll need to take a look at it. Too formal, but I hope that Alberto can help us with this.
