Download Report

How to make intelligent
agents coordinated and
work for us?
Prof. Von-Wun Soo
Department of Computer Science
National Tsing Hua University
Outline of the talk
Introduction of intelligent agents
Agent Coordination techniques
Game theoretical
Negotiation
Research topics on agents
The advent of ubiquitous agents
The computing and communication technologies
on wireless LAN, PDA, mobile phones, GPS,
and Bluetooth, WAP, GPRS and so on have
made the advent of a new pervasive computing
era. *any time, any place, any platform, peer-topeer
Personal assistants or software agents on the
hand-held devices for many different applications
and services have been conceived and becomes
more and more feasible…
Characteristics of Intelligent
Agents
– Autonomous, Proactive, and Rational:
Intelligent agents have their own goals, and
execute their tasks that optimize certain
performance measures
– Interactive/communicative: Interacting with
environment and other agents
What’s software agent
technology?
A paradigm shift of information utilization from
direct manipulation to indirect access and
delegation
A kind of middleware between information
demand (client) and information supply (server)
A software that has autonomous, personalized,
adaptive, mobile, communicative, social abilities
What is software agent
technology?
Agent theories:
– cognitive representation (belief, goal, plan,
desire, intention, commitment)
– reasoning and problem solving (planning)
Agent architecture:
– components, layers, models, multi-agent teams,
brokers, community, society
– control, communication and coordination
mechanisms among multi-agents
What is software agent
technology?
Agent Language:
– agent communication language KQML
– agent construction language AgentBuilder;
WebL; Agent0
– domain ontology KIF
Agent Environment and Supporting Platform
– creation, registration, deletion, mobility,
communication, authorization, authentication,
security, etc.
Why is multi-agent coordination
important?
Real world problems are quite complex and
human problem solving are multi-agent in nature
resolve conflicts among multi-agents
satisfy global constraints and by working as a
team, enhance global welfare or performance
sharing information, synchronizing actions, avoid
redundancy, avoid deadlock, avoid livelock.
prevent an anarchy or chaos
What is coordination?
Coordination is a coherent task assignment and
execution.
Coordination can be achieved by means of
Planning + Control + Communication
(Negotiation) + Conflict resolving
+ Resource/information sharing
Coordination can result in
–
–
–
efficiency enhancing
avoiding both deadlock & livelock
reducing resource contention
A taxonomy of coordination
coordination
Cooperation
Competition
Planning
Negotiation
Distributed
Planning
Centralized
Planning
Techniques of coordination
Coordination without communication:
–
–
–
–
–
Compiled of social laws and conventions
Reason by focal points
Inferring other agents via observation
Partial/global planning distributed/centralized
Organization-structure
Coordination with communication
– Knowledge-transfer protocol -- blackboard
– Contract net protocol
– Negotiation approaches (game-theoretic)
– Market mechanisms
Distributed planning
[Edmond Durfee]
Centralized planning for distributed plans
Distributed planning for centralized plans
Distributed planning for distributed plans
Blackboard– A cooperative
problem solving model model
blackboard
Executing
Activated
KS
events
Control
components
Pending
KS
Activations
Library
of
KSs
Coordination by organization
structure
Pre-defined roles, responsibilities and
preferences of agents
Pre-defined control and communication
protocols among agents
Prioritizing tasks over agents (allow
overlapping responsibilities)
Organizational agents are not necessary
cooperative; they can be competitive
Coordination by a market
mechanism
Coordination with a large number of unknown agents
Coordination with minimal number of direct
communication among agents
The market reach competitive equilibrium when
– 1) consumer bids to maximize their utility, subjected to budget
constraints
– 2) provider bids to maximize their profit, subjected to technology
capacity
– 3) net demand of good is zero
Competitive equilibrium = Pareto efficient solution
Coordination of tasks
Decomposition of tasks
Distribution of tasks
Control or coordination
–
–
–
–
Determine shared goal
Determine common tasks
Avoid unnecessary conflicts
Pool knowledge and evidence
Task decomposition
Divide and conquer
– AND/OR tree
– Spatial decomposition vs functional
decomposition
Depends on designer’s choice
Must consider resources and capabilities of
agents
Distribution of tasks
Market mechanisms: generalized
agreement and mutual selection
Contract net
Multi-agent planning
Organizational structure
Recursive allocations
Agent-mediated matchmaking/brokerage
Contract net protocol
Manager announces tasks via (possible
selective) multicast
Contract net protocol
Agents evaluate the announcement, Some
of the agents submit bids
Contract net protocol
The manager awards a contract to the most
appropriate agent
Why game theory?
Previous work [Rosenschein and
Genesereth, 1985; Rosenschein, 1994,
Haynes and Sen, 1996, Wu and Soo, 1998a,
1999]
Provide fundamental and theoretical
explanation on how multi-agent might
behave under at various conflicting
situations.
Underlying assumptions
Rational agent assumption
– maximize its own expected utility (selfish)
– mutual rationality
Each agent has as set of strategies to choose and
each agent’s payoff is a utility function of the
strategy combination (the outcome) of all agents’
choices.
The question is how can we predict the outcome
of different game situations based on rational
agent’s decision making?
Nash equilibrium
A strategy combination is a Nash equilibrium
Nash equilibrium is a stable solution concept
– if none agent will gain more payoff by leaving from
the strategy combination alone.
– Namely, no single agent will deviate from the solution
based on individual rationality
Pareto-optimality
A strategy combination is Pareto optimal if no
other strategy combination could increase the
payoff of one agent without decreasing the
payoff of any other agent.
– Namely, a Pareto-optimality is a kind of quality of
measure on a solution that is a solution that is not
Pareo-dominated by any other possible solution
What are difficult games?
Prisoner’s dilemma [A Nash equilibrium that is
not Pareto-optimal]
Games with no Nash equilibrium
Games with multiple Nash equilibria
The need for a trusted third
party
Traditional game theory cannot easily
resolve the difficulty games
We propose to use a trusted third party to
to enforce the commitments and contracts
negotiated by both agents
Roles of a trusted third party
an intermediary agent who
– temporary holds the deposit of guarantee or
compensation from one agent
– forfeits or returns the guarantee or compensation
deposited if the other side does not obey the
agreement
– Ensures that rules or principles of negotiation be
obeyed by each side of agents
Examples: bank; government; court; referee
Create an Equilibrium
by TTP Negotiation
We propose two communication actions in
the TTP negotiation process
– Ask guarantee; offer guarantee
– Ask compensation; offer compensation
The guarantee communication action
in TTP negotiation
1. Ask for guarantee
Agent P
Agent Q
4. Play the game
2. Deposit guarantee
5. Return Guarantee
3. Notify P
Trusted third
party
The compensation communication action
In TTP negotiation
Agent P
1. Offering compensation
2.Agree
Agent Q
4. Pay the game
3. Deposit compensation
Trusted third
party
5. Send compensation
A prisoner’s dilemma game
P
Q
Cooperate
Fink
-1
Cooperate
-1
-1
Fink
-1.1
-10
-10
-8
-8
(a)
Cooperate
-1
Cooperate
-10
0
Q
0
-10
Fink
P
Fink
-1.1
-9.1
-9.1
(b)
Prisoner's Dilemma game matrix (a) A special case of
a PD game matrix. (b) A dilemma-free game matrix.
(Both P&Q promised to play cooperate and deposit
Guarantee 1.1)
Note: (-1,-1) Pareto-dominates all three other strategy
combinations
Battle of Sexes Game
with Multiple Nash Equilibria
Woman
Man
Wrestle
Ballet
1
Wrestle
2
-1
-1
-5
Ballet
-5
2
1
After Negotiation with
communication actions
Man offering compensation 0.5 to Woman to play “Wrestle”
& Woman offering 0.5 compensation to Man to play "Ballet".
Woman
Wrestle
Man
Ballet
1.5
Wrestle
-1
-4.5
-5.5
Ballet
1
-1
Wrestle
-1
1.5
Ballet
Woman
Wrestle
Man
2
-5.5
2
1
-1
Ballet
-4.5
1.5
1.5
A Welfare Game without Nash
Equilibrium
G
P
Work
Idle
2
Aid
3
Idle
3
1
-1
1
0
0
(a)
Work
2
Aid
-1
-1
P
3
1
No Aid
G
No Aid
-1
-2
0
(b)
A Welfare Game (a) The original game matrix. (b) The
game matrix after negotiation.
Some assumptions and
properties in TTP negotiation
Proper quantum principle
Equal concession principle
Order independent property
Existence of Nash and Pareto-optimality
property
Convergence of negotiation property
Negotiation without knowing
other agent’s payoff
Previous game theoretic reasoning assume
a complete payoff matrix
In real situations, it is often difficult to
obtain other agent’s payoff information.
How can agents reason and negotiate
without knowing other agent’s payoffs?
QP
P1
P2
-2
2
P’s view Q1
?
Q
P
P1
P2
2
Q
Q2
-2
1
-1
-1
1
-1
?
?
-2
2
Q1
Q2
?
1
A complete payoff
matrix
QP
Q’s view Q1
Q2
P1
P2
?
?
2
-2
?
?
-1
1
Min-max strategies under
incomplete information
QP
P1
P2
?
Q2
-2
2
Q1
?
?
P1
P2
Q1
Q2
?
?
2
1
-1
?
QP
-2
?
?
-1
1
Q will choose Q2
P will choose P1
Without communication, P and Q using min-max
strategies will end up in the state (Q2, P1) which is not a
Pareto-optimal state
Q asks P to pay guarantee
QP
P1
Q1
?
2
Q2
? -4
-2 +4
?
-1
P2
? -4
1+4
Q asks P to pay guarantee 4 to play P1
P asks Q to pay guarantee
QP
P1
-2
2
Q1
?
Q2
P2
?
-1+3
1+3
?-3
?-3
P asks Q to pay guarantee 3 to play Q1
Final results
Q
P
P1
P2
Q1
2
Q2
-2-4
2
-2+4
-1+3
-1-3
1-4+3
1+4-3
Agreed at the created NFD (P1,Q1),
P pays guarantee 4
Q pays guarantee 3
A Prisoner’s dilemma game
Q
P
Q1
Q2
P1
-1
-1
1
0
P2
1
0
Created Nash (P1,Q1)
-8
Q asks P to deposit guarantee 1
to play P1
-10
-10
-8
Complete information:
P asks Q to deposit guarantee 1
to play Q1
TTP negotiation under incomplete
payoff information
Q
P
P1
P2
-1
0
Q1
-1 99 -10
-8
-10
Q2
0
-8
Reach created NFD equilibrium
P asks Q to pay guarantee
9 to play Q1
Q asks P to pay guarantee
9 to play P1
Summary
Prisoner’s dilemma
games
Battle of Sexes
Games
Social Welfare Games
Traditional
game theory
without
communicatio
n
Trap into suboptimal Nash
Mini-max doesn’t
help
Cannot predict result
with pure strategy
Mini-max strategy
may lead to
suboptimal solution
No Nash eqilibrium
with pure strategy
Mini-max strategy may
lead to suboptimal
solution
Allows TTP
negotiation
Complete
Payoff
information
Created Paretooptimal Nash
Q pays guarantee 1
P pays guarantee 1
Created Paretooptial Nash
P pays compensation
1
Created Pareto-optimal
Nash
P pays compensation
0.5
Q pays guarantee 1
Allows TTP
negotiation
incomplete
Payoff
information
Created Pareto
optimal NFD
Q pays guarantee 9
P pays guarantee 9
Created Paretooptimal NFD
P pays compensation
1 and pays guarantee
6
Q pays guarantee 3
Created Pareto-optimal
NFD
P pays compensation
0.5
and pays guarantee 3
Q pays guarantee 3
Comments and current research
Only two agents are involved
When more than 2 agents are involved in the
negotiation, there are collusion problems that any
subgroup of agents might deviate a negotiated
solution simultaneously by sacrificing the other
agents.
The TTP negotiation protocol needs to be
designed in a more general way to take the
collusion into consideration.
Rosenschein’s Work on
Rules of Encounter
Negotiation on different domains
– Task oriented domain (postmen, database, fax)
– State oriented domain (block world)
– Worth oriented domain (agents rank the worth
on different states)
Deception in negotiation–
postmen domain
Post office
h
a
True task Assignment:
b
Must return to office
Agent 1
g
f
Agent 2
c
e
d
{b,f}
{e}
Flip a coin to decide who
Deliver all the mails
Hiding task
h
Post
office
a
g
f
Task claim during
negotiation
b
Agent 1
Agent 2
c
e
d
{f}
{e}
{b}
They decide agent 2
Delivers all the mails
Phantom task
True task assignment
Agent 1
Agent 2
a
c
b
{b,c}
{b,c}
Phantom letter
{b,c,d}
{b,c}
d
They agree agent 1
Goes to c
Negotiation over mixed deals
Divide the tasks {D1,D2}
Agent 1 has probability p to take D1 and 1p to take D2
Agent 2 has probability 1-p to take D1 and
p to take D2
Change discrete deals to continuous deals
Hiding letters with mixed all-ornothing deals
h
Post
office
a
g
f
Task claim during
negotiation
b
Agent 1
Agent 2
c
e
d
{f}
{e}
{b}
They will agree on the mixed deal where agent 1 has
3/7 chance of delivering to f and e
Mixed deal prevents hiding
tasks
• The expected utility of agent 1 with honest bid is
½*8= 4
• Agent 1 might still have a chance to delivery all
letters even if he hide the delivery task b
– The expected utility of agent 1 with deception is
(3/7)*8 + 2 = 5.43,
– there is no reason to hide the task under the mixed allor-nothing deal
Phantom letters with
Mixed deals
Agent 1
a
2
Agent 2
1
c
{b,c}
{b,c}
b
Phantom letter
{b,c,d}
{b,c}
They agree on a mixed deal that d
Agent 1 has 5/8 to deliver all letters
2
Mixed deal prevents phantom
tasks
The agent 1 with honest bid has expected
utility ½*6= 3
With phantom letter, the agent 1 has the
expected utility
5/8* 6 = 3.75
no reason to propose a phantom letter
Incentive compatibility
mechanism
Theorem
– For all encounters in all sub-additive TODs,
when using a PMM(product-maximizing
mechanism) over all-or-nothing deals, no
agent has an incentive to hide a task.
Recommending a Trip
Plan by Negotiation
with a Software
Travel Agent
Introduction
Traveler
Recommendation
By
Negotiation
Time Constraints
Budget Constraints
Preference
Travel
Agent
Spatial, Temporal,
Physical Constraints
Transportation
Accommodation
Cost and Profit
User Preference and Constraints
an origin and a destination of the trip,
a travel time and a return time the of trip,
an upper bound budget for the trip,
a preferred hotel class and hotel price,
a set of spots that the user prefer to visit.
Constraints on a valid trip plan
Time constraint: any activities scheduled in the plan
should be before the return time of the trip and after the
travel time.
Budget constraint: the total expense of a trip plan should
not exceed the budget specified by the user.
Necessary component constraint: a valid and minimal set
of constraints in trip plan.
Preference satisfaction constraint: the travel agent must
also try his best to satisfy users’ preferences and personal
constraints while building a trip plan.
Components in a travel agent
Travel
Database
Trip
Planner
Context
User Model
Service
Behaviours
Domain
Constraints
Inconsistency
Manager
Travel Agent
Communication
Toolkits
User Interface
Agent
Trip planner – stage 1: skeletal
plan formation
Prepares a round-trip transportation tickets,
hotels for the nights during the trip, and a
number of day slots
Check consistency with local domain
constraints
Trip planner – stage 2: visiting
spots routing and scheduling
1) Searching a route with the shortest
traveling time,
2) Determining the duration to stay at each
visiting spot, and
3) Allocating the visiting spots to a particular
period of a day.
Trip planner-- stage 3: plan
refinement
1) Recommending more visiting spots to the
trip plan,
2) Upgrade or downgrade the
recommendation of travel components
3) Elongation or reduction on the staying
period of a visiting spot.
The Context User Model
It models the likes and the dislikes of a
user in the current dialogue context.
The model is initialized and updated
during the negotiation dialogue.
Consistency maintenance
– By timestamps
The Inconsistency Manager
Derives several
candidate solutions
and persuade the user
to relax some
constraints.
Carried out by a
negotiation protocol.
• user constraint 1
• user constraint 2
•…
• user constraint N
• budget constraint
• time constraint
• minimal staying period
• necessary component
Trip Planner
Inconsistency
Manager
Resolving the violation of time
constraints
Vacant time problem
– Adding a deprecated spot
– Elongating the staying period
– Advancing the return time of the trip
Not-enough-time problem
– Shortening the staying periods
– Postponing the return time of the trip
Resolving the violation of
budget constraint
Strategies for resolving the violation of the
budget constraint
– Raising the budget
– Removing expensive spots
– Choosing a cheaper hotel
Resolving infeasible user
constraints
The ones inconsistent with the application
content, or do not comply with the limitations.
– Specifying a non-existing hotel
– Specifying a staying period that is less than the
minimal one
The resolution
– complying with the constraints and application content
– similar to the user’s original specification
Scenario 1(1): User Request on
Planning a Trip
Build an initial trip plan that satisfies a user’s request
Origin:
Destination:
Travel time:
Return time:
Budgets:
Hotel:
Visiting
spots:
HsinChu
Taipei
5/1 12:00PM
5/1 19:00PM
Below 3350 NTD
Normal class, below 1000 NTD
SKM skyscraper,
The President’s Office Building,
CKS Memorial Hall,
LoongSha Temple
National History Museum
Scenario 1 (2) Simple Trip Plan
May 1
Arrival Location
Cost
Remark
12:10 HsinChu station
0
Origin of the trip
13:20 Taipei station
180
13:49 LoongSha Temple
20
S1,*
14:54 SKM Skyscraper
200
S2,*
15:56 The President’s O.B.
20
S3,*
16:58 CKS Memorial. Hall
20
S4,*
17:59 N. History Museum
40
S5,*
18:53 Taipei station
40
19:58 HsinChu station
139
Time
Total traffic cost: 459
Total hotel cost: 0
Total entrance fee: 180
Estimated meals cost: 400
Total cost: 1059
Scenario 2(1): User Request by
Adding the Botanical Garden
Scenario 2 (2) time constraint
violation & recommend to
remove a spot
Scenario 2 (3) time constraint
violation & recommend to
postpone the return time
Scenario 3 infeasible hotel &
recommend a feasible hotel
A final trip plan
May 2nd
May 1st
Arrival Location
Cost
Remark
Arrival Location
Cost
Remark
Time
Time
12:10 HsinChu station
0
08:00 The Prince Hotel
0
13:20 Taipei station
180
08:41 History Museum
60
S5,*
13:46 The Prince Hotel
1520
11:51
15
S7
Science Museum
14:58 LoongSha Temple 40
S1,*
13:01 Botanical Garden
0
S8,*
16:08 ZhuShe Temple
0
S6
14:42 Simen Street
20
S9
17:21 SKM skyscraper
200
S2,*
16:52 Haunted House
150
S10
18:53 President’s O.B.
20
S3,*
18:40 Taipei station
20
19:54 CKS M. Hall
20
S4,*
19:58 HsinChu station
139
22:00 The Prince Hotel
20
Total traffic cost: 539
Total hotel cost: 1500
Total entrance fee: 365
Estimated meals cost: 900
Total cost: 3304
Comments
Previous methods of recommendation
– Collaborative filtering
– Content-based
Recommendation by negotiation can personalize
the recommendation by complying user’s
preference and domain constraints
Recommendation can be conducted at many
stages of trip planning in different forms
– Remove or adding a spots, raise budget, prolong a stay,
change hotels, and so on
Conclusions
Coordinate with sharable ontology
– For agents to share knowledge and
communicate with the same language and
content is also important for coordinate among
agents
– Semantic Web concepts provide such an
opportunity to share domain knowledge
among agents
Conclusions
Automated agent negotiation needs still further
investigation including
– Learning to negotiate (both tactics and opponent
model)
– Negotiation under non-cooperative situations or
hostile opponents (were threatening and cheating
might be unavoidable)
– Negotiation with different risk and trust attitudes and
under time constraints
– Negotiation and combining with other coordination
techniques such as agents in organization
– Ontology for negotiation and coordination