Decision support using social media: how to deal with different opinions?

Decision support using social media: how to deal with
different opinions?
Robin De Mol
Promotor: prof. dr. Guy De Tré
Thesis coach: Ana Tapia Rosero
Master's dissertation submitted in order to obtain the academic degree of
Master of Science in Engineering: Computer Science
Department of Telecommunications and Information Processing
Chair: prof. dr. ir. Herwig Bruneel
Faculty of Engineering and Architecture
Academic year 2012–2013
Faculty of Engineering and Architecture
Department of Telecommunications and Information Processing
Decision support using social media:
how to deal with different opinions?
by Robin De Mol
Promotor: prof. dr. G. De Tré
Thesis coach: ir. A. Tapia
Master thesis submitted to obtain the degree of master of science in
computer science: software engineering
2012–2013
Decision support using social media: how to deal with different opinions?
by
Robin De Mol
Master thesis submitted to obtain the degree of master of science in computer science:
software engineering
2012–2013
Ghent University
Faculty of Engineering and Architecture
Promotor: prof. dr. G. De Tré
Summary
Making decisions is part of our everyday life. Some decisions are more important than
others, and some are more complex. They are made in different contexts, ranging from
individual to group decisions, both from a social and a business point of view. Due to
the increased communication capabilities between people worldwide through the rise
of the internet, and the amount of resources made available this way, group decision
support is becoming increasingly relevant. The development of group decision support
software is currently scarcely documented, as such software is usually created in-house.
In this work, an extension to an existing modern decision support algorithm
is presented that enables multiple applications of group decision support. To this end,
aggregation techniques are discussed. Because aggregation techniques entail
information loss, we define a new measure that minimizes this loss by reintroducing it as a
degree of confidence.
The concept of pre-aggregation is explained in depth. Using soft computing techniques, the opinions of several experts can be expressed through membership
functions that set their preference levels, providing a flexible way to express the desired
values for the attributes of the problem. Regardless of their expertise, the experts are
clustered into groups based on the similarity of their opinions. These groups are further merged into
a single result, which is then used as an input to an already proven decision support
system. This makes it possible to keep existing decision support algorithms as they
are, with only slight modifications.
The clustering step uses a new technique based on the shape-similarity of membership
functions. The representativity of a cluster, i.e. the degree to which it represents what
is desired by the consulted experts, is referred to as its confidence. It is calculated from the fraction of experts in the cluster, the weights of these experts and the
degree of similarity of their opinions. In a second iteration,
the confidence of each cluster can be readjusted based on its distance to other clusters.
For each of these steps there are alternative approaches, which are compared in the
discussion.
The confidence levels are propagated through the regular decision support aggregation
structure and result in a new output parameter for the evaluated systems, the
overall confidence. This represents the extent to which we can trust the result to be
representative of the experts' opinions. The way this is calculated is also covered.
To illustrate the research, a case study is introduced with generated test data. The results
of the calculations using the proposed techniques are presented and discussed.
Finally, there is a section on future work introducing other techniques that can be
investigated, and their advantages and disadvantages.
I would like to thank professor De Tré for allowing me to conduct this research at the
faculty, and Ana and her colleagues for coaching me during this thesis.
The author gives permission to make this master dissertation available for consultation
and to copy parts of this master dissertation for personal use. In the case of any other
use, the limitations of the copyright have to be respected, in particular with regard
to the obligation to state expressly the source when quoting results from this master
dissertation.
Keywords: LSP, GDSS, Aggregation, Confidence, Soft Computing.
Decision support using social media: how to deal
with different opinions?
Robin De Mol
Supervisor(s): prof. dr. Guy De Tré, Ana Tapia-Rosero
Abstract— In this work we propose an extended group decision support
technique to deal with a group of possibly different opinions rather than
a single person's input. Information can be gathered from social media using
a soft computing technique that allows people to express their desires in
the form of membership functions. These focus on working with linguistic terms rather than mathematically "crisp" values, and are grouped into clusters based on their shape-similarity, for which a new calculation
technique is introduced. These clusters represent groups of people with
similar opinions. We continue by defining a measure for the gravity
of each cluster that represents how important it is. This helps find the "average" opinion of the consulted people, which is then used in the evaluation
process of an existing decision support system. Through a slight modification, the system generates an additional output parameter that reflects the accuracy
of the results, namely the representativity of the output with respect to the
actual average opinion. This technique allows us to evaluate large amounts
of possibly different opinions using existing group decision support systems,
which was previously impossible.
Keywords—Logic Scoring of Preferences (LSP), Group decision support
systems (GDSS), Aggregation, Confidence, Soft Computing
I. INTRODUCTION
TODAY, people can reach anyone anywhere thanks to the
internet. This creates a lot of new possibilities, also from
a business point of view. Companies now have to compete with
each other globally. To be the most successful, they need to create products that are innovative and user-friendly. There has
been a clear trend of shifting funding from the production line
to the research department. One of the techniques that is gaining popularity is including as many clients as possible in the
business decision making process concerning products in development. The clients are asked to give their opinions so that they have
a share in the selection of certain features, something which has
long been done only by experts.
This new way of gaining information can rapidly lead to huge
amounts of different opinions, which are troublesome to handle.
We will discuss a few techniques that help the analysis of large
amounts of data. Furthermore, we will show how this can be
used in decision support systems (DSS).
First, we introduce some general terminology. After that, we present an aggregation methodology to analyse the information
based on clustering. This is followed by an explanation of a new
concept called confidence. Finally, there is a short conclusion.
II. TERMINOLOGY
In the context of decision making and DSS some terms are
generally used to refer to certain concepts. Usually, there is a
given problem for which we want to find the best solution. The
analysis of the problem leads to a hierarchically structured requirements tree, of which the leaves are called the performance variables. Possible solutions to the problem are called (candidate)
systems. Normally we have a set of systems and we want to find
out which is the best to solve the problem.
For each performance variable, a scoring function called the
elementary criterion is specified, which indicates the values a
system should have to satisfy the corresponding requirement.
Evaluating a system will produce a set of elementary preference
scores, one per performance variable, indicating how well the
system satisfies all requirements individually. These are combined through an aggregation structure, which leads to the system’s global preference score. This indicates how well the system is fit to solve the problem as a whole.
After calculating this for each system, the results are analysed
by the decision maker, a person (or possibly a group) responsible for selecting the best system.
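To make the aggregation of elementary preference scores into a global preference score more tangible, the following sketch uses a weighted power mean, the kind of generalized conjunction/disjunction operator found in LSP aggregation structures. The scores, weights and exponents below are illustrative assumptions, not values taken from this work.

    def weighted_power_mean(scores, weights, r):
        """Generalized conjunction/disjunction as used in LSP aggregation:
        r < 1 pushes the result towards a conjunction (all requirements matter),
        r = 1 is the weighted arithmetic mean, large r tends towards a disjunction."""
        if abs(r) < 1e-9:                       # limit case: weighted geometric mean
            product = 1.0
            for s, w in zip(scores, weights):
                product *= s ** w
            return product
        return sum(w * s ** r for s, w in zip(scores, weights)) ** (1.0 / r)

    # Hypothetical elementary preference scores for one candidate system.
    scores = [0.9, 0.6, 0.8]
    weights = [0.5, 0.3, 0.2]                   # must sum to 1

    for r in (-2.0, 1.0, 3.0):                  # from conjunctive to disjunctive behaviour
        print(r, round(weighted_power_mean(scores, weights, r), 3))

The more negative the exponent, the closer the global score stays to the worst elementary score, which is how mandatory requirements can dominate the result.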
What differentiates a group decision support system is the fact
that the elementary criteria of more than one person are used in the
evaluation. The consulted people are called experts, regardless
of their actual expertise. They represent their opinions using
membership functions. These are functions which define a degree of satisfaction between 0 and 1 for each value in the range
of the performance variable. This specifies which of the values
are desired and which are not. Any intermediary value between
0 and 1 indicates a partial tolerance for the value. In the context of decision support, this degree of satisfaction represents the
elementary preference score of the corresponding performance
variable of a system.
Fig. 1. Example membership functions where the Y-axis has been converted to
a percentage of satisfaction, illustrating how a membership function can be
used to imply a range of desired values for a variable.
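As a concrete illustration of such a membership function, the minimal sketch below encodes an expert's preference for a hypothetical "price" performance variable as a trapezoidal function returning a satisfaction degree between 0 and 1; the attribute and its breakpoints are invented for illustration.

    def trapezoidal(a, b, c, d):
        """Build a trapezoidal membership function with support [a, d] and core [b, c]."""
        def mu(x):
            if x <= a or x >= d:
                return 0.0
            if b <= x <= c:
                return 1.0
            if x < b:                       # rising slope
                return (x - a) / (b - a)
            return (d - x) / (d - c)        # falling slope
        return mu

    # Hypothetical elementary criterion: prices around 20-40 are fully acceptable,
    # anything below 10 or above 60 is rejected, with gradual tolerance in between.
    price_criterion = trapezoidal(10, 20, 40, 60)

    for price in (5, 15, 30, 50, 70):
        print(price, round(price_criterion(price), 2))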
III. AGGREGATION
Our goal is to aggregate the large amounts of information
in order to reduce the complexity of the data to a manageable
size. For this, we have chosen a "pre-aggregation" technique, where the aggregation step occurs before the decision
support algorithm evaluates the systems. The consequences of doing
this are twofold:
• the experts are combined into a single, merged expert which
represents the “average opinion” before the decision support algorithm evaluates the systems;
• we can use existing DSS more or less without changes.
The representative of the average opinion is only an approximation. The correctness of its representativity is indicated by
the confidence parameter, which is discussed later on.
To aggregate the elementary criteria based on their similarity we first need to define the similarity between two membership functions. For this, we introduce an alternative notation for
them based on two aspects: their shape and the relative lengths
of their components. This is called the shape-symbolic notation
and consists of a shape-string and a feature-string.
A. The shape-string
The shape-string describes the shape of an elementary criterion. Generally, an elementary criterion consists of several typical components such as high values (indicating preferred values for the performance variable), low values (indicating non-preferred values) and slopes connecting them. These components can be easily identified and the shape-string can be constructed on sight by stringing together the characters that represent these typical components.
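A minimal sketch of how such a shape-string could be read off a piecewise-linear membership function. The single-letter alphabet used here (L for a low plateau, I for an increasing slope, H for a high plateau, D for a decreasing slope) is an assumption made for illustration; the character set used in the shape-symbolic notation itself may differ.

    def shape_string(points):
        """points: list of (x, y) breakpoints of a piecewise-linear membership function.
        Classify each segment between consecutive breakpoints by its shape."""
        symbols = []
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if y1 > y0:
                s = 'I'                      # increasing slope
            elif y1 < y0:
                s = 'D'                      # decreasing slope
            elif y0 >= 0.5:
                s = 'H'                      # high plateau
            else:
                s = 'L'                      # low plateau
            symbols.append(s)
        return ''.join(symbols)

    # Breakpoints of a trapezoidal criterion over a 0-100 domain (cf. the earlier sketch):
    # low, rise, high, fall, low.
    pts = [(0, 0), (10, 0), (20, 1), (40, 1), (60, 0), (100, 0)]
    print(shape_string(pts))   # -> 'LIHDL'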
B. The feature-string
The feature-string contains information about the relative
lengths of the components of an elementary criterion. We use
a soft computing technique to evaluate the length of a component. Instead of measuring its exact value, we classify it into one
of the following ranges: extremely short, very short, short, normal, long, very long, extremely long. The actual range
of values implied by these terms depends on the total length of the elementary criterion being translated. These intervals were chosen
based on research which has shown that between seven and
nine distinct levels are the optimal amount for visually-based
human decomposition.
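The following sketch classifies the relative length of each component into the seven linguistic terms listed above. The interval boundaries are illustrative assumptions; as stated, the actual ranges depend on the total length of the criterion being translated.

    TERMS = ["extremely short", "very short", "short", "normal",
             "long", "very long", "extremely long"]
    # Hypothetical upper bounds (as fractions of the total length) for each term.
    BOUNDS = [0.05, 0.10, 0.20, 0.35, 0.55, 0.80, 1.00]

    def length_term(component_length, total_length):
        """Map a component's relative length onto one of seven linguistic terms."""
        ratio = component_length / total_length
        for bound, term in zip(BOUNDS, TERMS):
            if ratio <= bound:
                return term
        return TERMS[-1]

    def feature_string(points):
        """Linguistic length labels for every segment of a piecewise-linear criterion."""
        total = points[-1][0] - points[0][0]
        return [length_term(x1 - x0, total)
                for (x0, _), (x1, _) in zip(points, points[1:])]

    pts = [(0, 0), (10, 0), (20, 1), (40, 1), (60, 0), (100, 0)]
    print(feature_string(pts))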
C. The shape-symbolic notation and similarity
Together, the shape-string and the feature-string represent the
shape-symbolic notation of an elementary criterion, which consists of shape-symbolic characters. These are couples of a shape
component and a length component.
The similarity between two elementary criteria is based on
a modified version of the Levenshtein distance between two
words. Originally, this works with characters and basic actions
such as insertion, deletion and replacement. We extended these
to work with shape-symbolic characters. Each action induces a
penalty defined by the gravity of the action. For example, replacing an extremely short character with a very long character
will have a high cost, which will be even higher if the shape
component is also different. Using this measure and given a
set of elementary criteria, we can calculate a similarity matrix
showing the pairwise similarity between every two criteria. This matrix is symmetric and equals one everywhere on the main diagonal.
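A sketch of how a Levenshtein-style edit distance could be extended to shape-symbolic characters, represented here as (shape, length-class) pairs, and normalized into a similarity value. The concrete penalty values and the normalization are assumptions, not the exact definition used in this work.

    def sub_cost(a, b):
        """Penalty for replacing one shape-symbolic character with another.
        a, b are (shape, length_index) pairs with length_index in 0..6."""
        shape_penalty = 0.0 if a[0] == b[0] else 1.0          # assumed penalty
        length_penalty = abs(a[1] - b[1]) / 6.0               # scaled to [0, 1]
        return shape_penalty + length_penalty

    def symbolic_distance(s, t, indel=1.0):
        """Levenshtein-style edit distance over sequences of shape-symbolic characters."""
        m, n = len(s), len(t)
        d = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            d[i][0] = i * indel
        for j in range(1, n + 1):
            d[0][j] = j * indel
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d[i][j] = min(d[i - 1][j] + indel,                 # deletion
                              d[i][j - 1] + indel,                 # insertion
                              d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]))
        return d[m][n]

    def similarity(s, t):
        """Normalize the distance into a similarity in [0, 1] (assumed normalization)."""
        worst = max(len(s), len(t)) * 2.0
        return 1.0 - symbolic_distance(s, t) / worst if worst else 1.0

    a = [('L', 1), ('I', 2), ('H', 3), ('D', 2), ('L', 1)]
    b = [('L', 0), ('I', 2), ('H', 5), ('D', 2), ('L', 1)]
    print(round(similarity(a, b), 3))

Applying this pairwise to all criteria of one performance variable yields exactly the kind of symmetric similarity matrix described above.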
D. Clustering
After calculating the similarity matrix, the elementary criteria
are grouped into clusters. This is done by using a hierarchical
bottom-up clustering technique, combining the most similar criteria first. This approach has the interesting property that the resulting clusters are unique.
Fig. 2. Example of the shape-symbolic notation of a membership function, showing its symbolic characters.
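The sketch below illustrates hierarchical bottom-up clustering driven by such a similarity matrix, always merging the two most similar clusters first. The average-linkage rule and the stopping threshold are assumptions made for the example; the shape-similarity-driven procedure used in this work may differ in detail.

    def agglomerative_clusters(sim, threshold=0.7):
        """sim: symmetric similarity matrix (list of lists), ones on the diagonal.
        Repeatedly merge the two most similar clusters (average linkage) until
        no pair of clusters is more similar than the threshold."""
        clusters = [[i] for i in range(len(sim))]

        def linkage(a, b):
            # Average pairwise similarity between the members of two clusters.
            return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

        while len(clusters) > 1:
            score, ia, ib = max(((linkage(a, b), ia, ib)
                                 for ia, a in enumerate(clusters)
                                 for ib, b in enumerate(clusters) if ia < ib),
                                key=lambda t: t[0])
            if score < threshold:
                break
            clusters[ia] = clusters[ia] + clusters[ib]
            del clusters[ib]
        return clusters

    # Toy similarity matrix for four experts: 0 and 1 agree, 2 and 3 agree.
    sim = [[1.0, 0.9, 0.2, 0.1],
           [0.9, 1.0, 0.3, 0.2],
           [0.2, 0.3, 1.0, 0.8],
           [0.1, 0.2, 0.8, 1.0]]
    print(agglomerative_clusters(sim))   # -> [[0, 1], [2, 3]]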
It is important to keep a clear view on what we have done so far.
We have gathered information from a large crowd of people.
They have all given their preferred range of values for each of
the performance variables of the problem. These are represented
using elementary criteria. We then cluster those into groups of
people with similar opinions per performance variable based on
the shape-similarity method.
In order to go to the next step of the decision support system,
we need to elect one representative per performance variable for
evaluation. To facilitate this decision we introduce the concept
of confidence.
IV. CONFIDENCE
The concept of confidence appears at different stages in the decision making process, each with its own definition. We have already mentioned that it indicates the representativity of the average
opinion of the consulted experts after clustering. It also facilitates the election process of finding this average opinion.
Confidence is also used in the presentation of the results. Much
like the aggregation of the elementary preference scores into one
global preference per system, the confidence in each of the representative elementary criteria, which is the same as the confidence of the cluster it is selected from, is propagated and aggregated into a global confidence value per system. Then we can
interpret this parameter as a degree of trust we can put in the
accuracy of the calculated global preference.
The three distinct instantiations of confidence are listed:
• cluster confidence for each cluster
• elementary confidence for the elementary preference score of
the elected representative criterion
• global confidence for each system (also called system confidence)
A. Cluster and elementary confidence
The first time we use the confidence concept is after the clustering algorithm has run. The goal is to find a measure for the
gravity of each of the clusters. Therefore we define the cluster
confidence as a combination of the relative size of that cluster
and the degree of similarity of the elementary criteria in it, referred to as the compactness of the cluster. The relative size
can further be weighted by adding weights to the experts to distinguish them by their level of expertise. The compactness can
be calculated in different ways but the concept is the same for
each of them. This measure defines how similar the opinions
in the cluster are. One possible approach of calculating this is
by taking the similarity of the most and least typical criteria of
a cluster. These are the members of the cluster that have respectively the highest and lowest average
similarity with all other elements in it. Another method is by
calculating and normalizing the enclosed surface between the
upper and lower bounds of an interval valued fuzzy set enclosing all elementary criteria of the cluster.
These two factors are then combined with weights, depending
on the relative importance of the similarity of the opinions and the
importance of representing the majority of the population. This
produces a single value, the cluster confidence.
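A minimal sketch of how a cluster confidence could be computed from the weighted relative size of a cluster and its compactness, here taken as the similarity between its most and least typical members. The expert weights, the balancing parameter alpha and the toy similarity matrix are assumptions for illustration.

    def typicality(cluster, sim):
        """Average similarity of each member to the other members of its cluster."""
        return {i: sum(sim[i][j] for j in cluster if j != i) / max(len(cluster) - 1, 1)
                for i in cluster}

    def compactness(cluster, sim):
        """Similarity between the most and the least typical member of the cluster."""
        if len(cluster) < 2:
            return 1.0
        t = typicality(cluster, sim)
        most = max(cluster, key=t.get)
        least = min((i for i in cluster if i != most), key=t.get)
        return sim[most][least]

    def cluster_confidence(cluster, sim, expert_weights, alpha=0.5):
        """alpha balances representing the majority (relative size) against
        agreement inside the cluster (compactness)."""
        total = sum(expert_weights)
        rel_size = sum(expert_weights[i] for i in cluster) / total
        return alpha * rel_size + (1 - alpha) * compactness(cluster, sim)

    sim = [[1.0, 0.9, 0.2, 0.1],
           [0.9, 1.0, 0.3, 0.2],
           [0.2, 0.3, 1.0, 0.8],
           [0.1, 0.2, 0.8, 1.0]]
    weights = [1.0, 1.0, 2.0, 1.0]        # expert 2 is deemed twice as knowledgeable
    for cluster in ([0, 1], [2, 3]):
        print(cluster, round(cluster_confidence(cluster, sim, weights), 3))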
Now we need to select a criterion per performance variable
from the clusters. This should represent the average opinion of
the consulted population on what they desire as values for the
variable. It is often simplest to select the most typical value
from the cluster with the highest cluster confidence. Alternatively, one could try to merge the most typical values of the top
k clusters, but this is often troublesome and can lead to illogical
results such as a criterion that accepts all values or none. The
confidence in the representativity of the elected criterion, and hence in the elementary preference score that follows from its
evaluation, is the same as the cluster confidence of the cluster
it was selected from. This is the elementary confidence score,
which is in fact just the highest cluster confidence.
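Continuing in the same spirit, the sketch below elects the representative criterion as the most typical member of the most confident cluster and passes that cluster's confidence on as the elementary confidence. All input values are hypothetical.

    def elect_representative(clusters, confidences, typicality_scores):
        """Pick the most typical criterion from the cluster with the highest confidence.
        clusters: list of lists of expert indices
        confidences: cluster confidence per cluster
        typicality_scores: per expert, average similarity to the rest of its cluster."""
        best = max(range(len(clusters)), key=lambda k: confidences[k])
        winner_cluster = clusters[best]
        representative = max(winner_cluster, key=lambda i: typicality_scores[i])
        elementary_confidence = confidences[best]   # confidence travels with the winner
        return representative, elementary_confidence

    clusters = [[0, 1], [2, 3]]
    confidences = [0.65, 0.70]                       # hypothetical cluster confidences
    typicality_scores = {0: 0.9, 1: 0.9, 2: 0.8, 3: 0.8}
    print(elect_representative(clusters, confidences, typicality_scores))  # -> (2, 0.7)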
B. System confidence and goodness
After choosing the representatives we can evaluate the systems using the underlying decision support system. We have
chosen to extend Logic Scoring of Preferences (LSP), a modern and flexible DSS based on soft computing techniques. Each
system is evaluated in turn. First, all performance variables of
a system are evaluated using the elected criteria, resulting in a
vector of elementary preference scores. These are then combined
into a single global preference score through an LSP aggregation
structure which defines the importance of the attributes. This
is constructed in advance by the decision maker and depends
largely on the original decomposition of the problem. Analogously to the global preference, a global confidence score is calculated per system, also called the system confidence.
The results of the evaluation step are not trivial to interpret.
Each system has two scores which makes it hard to rank them.
To facilitate the final step in the decision making we propose
to combine the preference score and the confidence value into
a single parameter. Note that this can also be combined with a
possible cost analysis, as discussed in [10]. We call the combined value the “goodness” of a system, which represents both
the degree to which the system satisfies the problem's requirements and the confidence we have in the accuracy thereof. It
is calculated by taking the weighted average of the global preference and the system confidence. This allows us to rank the
systems by their goodness.
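The final ranking step can be sketched as follows: each candidate system carries a global preference and a system confidence, and its goodness is their weighted average. The weight of 0.7 on preference and the example scores are arbitrary illustrations.

    def goodness(global_preference, system_confidence, preference_weight=0.7):
        """Weighted average of how well a system scores and how much we trust that score."""
        return (preference_weight * global_preference
                + (1 - preference_weight) * system_confidence)

    # Hypothetical evaluation results: (global preference, system confidence).
    systems = {
        "system A": (0.82, 0.55),
        "system B": (0.74, 0.90),
        "system C": (0.60, 0.95),
    }

    ranked = sorted(systems.items(),
                    key=lambda kv: goodness(*kv[1]),
                    reverse=True)
    for name, (pref, conf) in ranked:
        print(f"{name}: preference={pref:.2f} confidence={conf:.2f} "
              f"goodness={goodness(pref, conf):.2f}")

Note that in this toy example the confidence changes the ranking: the system with the highest raw preference is not the one with the highest goodness.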
V. CONCLUSION
In this work we have proposed a technique to aggregate large
amounts of data. This has allowed us to extend an existing DSS
in such a way that it remains practically unchanged while extending its field of applicability. Now it can handle the input of
more than one person. There is no real limit to the size of the
input. The number of people consulted can vary from a small
board of actual experts to a large population of clients, or a combination of both, using weights to distinguish them. This allows
us to use social media as a source of input without having to
worry about how to combine the different opinions of separate
individuals.
REFERENCES
[1] J.J. Dujmović and W.Y. Fang, "Reliability of LSP criteria," 2004.
[2] P.J.G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, pp. 1–25, 2003.
[3] P.G.W. Keen, "Decision support systems: The next decade," Decision Support Systems, vol. 3, pp. 253–265, 1987.
[4] G.P. Huber, "Issues in the design of group decision support systems," MIS Quarterly, pp. 195–204, September 1984.
[5] J.P. Shim, M. Warkentin, J.F. Courtney, D.J. Power, R. Sharda, and C. Carlsson, "Past, present, and future of decision support technology," Decision Support Systems, vol. 33, pp. 111–126, 2002.
[6] A. Tapia-Rosero, A. Bronselaer, and G. De Tré, "Similarity of membership functions - a shape-based approach," in Proceedings of the 4th International Joint Conference on Computational Intelligence, 2012, pp. 402–409, SciTePress - Science and Technology Publications.
[7] A. Tapia-Rosero, A. Bronselaer, and G. De Tré, "A shape-similarity based method for detecting similar opinions in group decision-making," Information Sciences, Special Issue on New Challenges of Computing with Words in Decision Making, 2013.
[8] G. DeSanctis and R.B. Gallupe, "Group decision support systems: A new frontier," 1985.
[9] G. DeSanctis and R.B. Gallupe, "A foundation for the study of group decision support systems," Management Science, vol. 33, no. 5, May 1987.
[10] H.-J. Zimmermann, "Fuzzy programming and linear programming with several objective functions," Fuzzy Sets and Systems, vol. 1, pp. 45–55, 1978.
[11] G. De Tré, Vage Databanken.
[12] J.J. Dujmović and W.Y. Fang, "An empirical analysis of assessment errors for weights and andness in LSP criteria," San Francisco State University, Department of Computer Science, 2004.
[13] J.J. Dujmović, "A comparison of andness/orness indicators," San Francisco State University, Department of Computer Science.
[14] J.J. Dujmović, "Optimum location of an elementary school."
[15] J.J. Dujmović, G. De Tré, and S. Dragićević, "Comparison of multicriteria methods for land-use suitability assessment," 2009.
[16] J.J. Dujmović, G. De Tré, and N. Van De Weghe, "LSP suitability maps," 2009.
[17] R. Bodea and E. El Sayr, "Code coverage tool evaluation," 2008.
[18] J.J. Dujmović and G. De Tré, "Multicriteria methods and logic aggregation in suitability maps," 2011.
[19] J.J. Dujmović and H. Nagashima, "LSP method and its use for evaluation of Java IDEs," 2005.
[20] J.J. Dujmović, "Characteristic forms of generalized conjunction/disjunction."
[21] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, Inc., 1980.
[22] G.J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, 1995.
[23] D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press, New York, NY, USA, 1997.
[24] J.J. Dujmović and H.L. Larsen, "Generalized conjunction/disjunction," International Journal of Approximate Reasoning, vol. 46, no. 3, pp. 423–446, Dec. 2007.
Decision support and social media: how should we
deal with large amounts of diverging opinions?
Robin De Mol
Supervisor(s): prof. dr. Guy De Tré, Ana Tapia-Rosero
Abstract— In this document we propose an extension to existing (group) decision support algorithms in order to cope with large amounts of data, which was impossible until now. Social media are used as a source of information. Through this medium, people are asked for their opinion on certain aspects. They are to express it in the form of membership functions, a soft computing technique that allows computing with words instead of with exact mathematics. These functions are grouped into clusters based on their similarity. The way this similarity is computed is based on a new technique derived from the Levenshtein distance. We then compute the representativity of each cluster with respect to the average opinion of the population. These data are used during the evaluation performed by the decision support program. With a small modification, an extra parameter is computed that indicates how accurate the results are. These techniques make it possible to evaluate large amounts of possibly diverging opinions using existing algorithms.
Keywords— Logic Scoring of Preferences (LSP), Group decision support systems (GDSS), Aggregation, Confidence, Soft Computing
I. INTRODUCTION
TODAY, anyone can reach anyone anywhere thanks to the internet. This creates a large number of possibilities, also from an economic point of view. Companies now compete with each other on a global scale. To come out on top, they have to focus more and more on what exactly the customer wants from a product. This has led to growth in the research departments of many companies, where much attention is paid to the opinion of the customer. For a long time only experts were involved in important decisions, but thanks to the possibilities the internet offers, the clientele is consulted more and more as well. This way, they can give their opinion on upcoming products.
However, this also means that much more information has to be processed. In this work we discuss a way to extend existing decision support software in such a way that it becomes possible to process these large amounts of information easily and clearly. We discuss a number of techniques that will help us with this. We then also show how this can be used with existing decision support systems.
First we introduce a number of important terms. Next we go deeper into the process of aggregating the information. After that we discuss the concept of "confidence". Finally, there is a short conclusion.
II. TERMINOLOGY
In the context of decision support software a number of terms are used to refer to certain concepts. In the most general scenario we consider a problem for which we are looking for the best solution. Possible solutions are called (candidate) systems. The analysis of a problem leads to a hierarchical tree whose leaves are called performance variables, or variables for short. For every variable an evaluation function is set up; these are the elementary criteria. They are used to evaluate the values of the variables of the systems, which leads to elementary preference scores. Per system these are aggregated into a global preference score, a measure that indicates to what extent the system is suited to solve the problem.
Once all systems have been evaluated, the results are presented to the decision maker, a person (or possibly a group) responsible for selecting the most suitable system. Note that the decision support program only serves as support for making a decision; it does not make the decision itself.
What distinguishes a group decision support program from the above is the fact that the elementary criteria are derived from a group of experts. These can be real experts, but the term also refers to a possibly large group of people consulted through social media. Every person who enters his or her opinion is called an expert. The more people are involved in this process, the more opinions can be collected. Per variable, one is asked to construct a function indicating which values one prefers. These functions originate from soft computing and are called membership functions. Their range is a value between 0 and 1 (inclusive) indicating which values from the domain (a performance variable) are preferable. In the context of decision support these are the elementary criteria, and the values obtained after evaluation are the elementary preference scores.
III. AGGREGATION
Our goal is to reduce the large amounts of data in order to lower their complexity. We have chosen a "pre-aggregation" technique, in which the aggregation takes place before the evaluation of the systems by the decision support system. The consequences of this are twofold:
• the data of all experts are combined into a single, general expert who represents the "average opinion", before the evaluation of the systems takes place;
• we can use existing decision support systems without many changes.
Fig. 1. An example of membership functions where the Y-axis has been converted to a percentage. The X-axis shows values from the domain of the corresponding represented term.
The general expert is only an approximation of the average opinion. The correctness of its representativity is indicated by a confidence parameter, which we will come back to later.
In order to group the opinions based on their similarity, we first need to define similarity between two membership functions. For this we introduce an alternative notation based on two aspects: their shape and the relative lengths of their components. Together these form the symbolic notation of a membership function.
A. The shape notation
The shape notation describes the shape of the elementary criteria. In general, these consist of a number of typical components such as high values (indicating which values from the domain are preferred), low values (indicating undesired values) and the parts connecting them. The components can easily be identified individually, which makes constructing the shape notation straightforward.
B. The length notation
The length notation contains information about the relative lengths of the components. Soft computing techniques are used here to classify the length of a component into one of the following seven categories: extremely short, very short, short, normal, long, very long and extremely long. The actual intervals covered by these categories depend on the total length of the elementary criterion.
C. The symbolic notation and similarity
Clearly, the shape notation and the length notation are equally long in terms of number of characters. Together they form the symbolic notation of an elementary criterion. It consists of a sequence of symbolic characters, each of which is in fact a couple of a shape component and a length component.
The similarity between two elementary criteria is based on the Levenshtein distance between two terms. Originally this works with a number of basic operations on characters, but here we have extended it to work with symbolic characters. We consider three actions: insertion, replacement and deletion. Each action has a certain cost that determines how heavy the operation is. If, for example, one wishes to replace an extremely short character with a very long character, the cost will be high, and even higher when the shape component differs as well. The cost of insertion or deletion also depends on the length of the symbolic character. With this technique we are able to measure a distance between two symbolic notations. We do this for every pair of elementary criteria, which yields a similarity matrix per performance variable. This matrix is symmetric around the main diagonal and equals 1 everywhere on it.
Fig. 2. Example of the symbolic notation of a membership function.
D. Clustering
After constructing the similarity matrix we can group the elementary criteria into clusters. We do this in a hierarchical way, merging the most similar first. This has the interesting property that the final clusters are unique.
It is important to keep a good overview of what we have achieved so far. We have gathered information from a large group of people. They have each given their opinion, in the form of a membership function, on what they consider good and not good for a number of performance variables of a problem. We have converted these functions into their symbolic notation and grouped them based on their mutual similarity.
To move on to the next step, we need to choose one representative criterion per performance variable. To make this choice we introduce a new concept, the confidence.
IV. CONFIDENCE
Confidence is a broad concept that appears at several places in the process, each time with its own definition. We have already mentioned that it expresses how well the elected criterion represents the average opinion, but the concept is also used in the results. Just as the final preference score is aggregated from elementary preference scores, a final confidence is computed from the elementary confidences that accompany the elementary preference scores of the elementary criteria. At that point, this parameter serves as an indicator of the extent to which we can trust the correctness of the final preference score.
The three different instances where we encounter confidence are the following:
• cluster confidence, for each cluster
• elementary confidence, for the elementary preference score of the elected representative elementary criterion
• global confidence, for each system (also called system confidence or final confidence)
A. Cluster confidence and elementary confidence
The first time we use the concept of confidence is after the clustering algorithm has done its work. We look for a measure of importance for each of the clusters. To this end we state that the cluster confidence, per cluster, depends on the relative size of the cluster and on the degree to which the members of the cluster are similar. The relative size of the clusters is easy to compute and can further be extended with weights for each expert, in order to distinguish them by their professional knowledge. The similarity of the elements within a cluster can be computed in several ways. One possible approach is to take the similarity of the most and least typical elements of the cluster. These elements are determined based on their average similarity or dissimilarity with respect to the other elements in the cluster. Another approach leans more towards fuzzy set theory, where we use the area between the lower and upper bounds of the interval-valued fuzzy set that encloses the cluster.
The weighted relative size and the overall similarity are combined with weights, where the weighting parameter can be adjusted to increase either the impact of the size of the clusters or that of the similarity within a cluster. The resulting value is the cluster confidence.
Now we need to find a single representative criterion per performance variable. It should represent the overall opinion of the group on what good and bad values are. It is often simplest to choose the most typical value of the cluster with the highest confidence for this. Merging the typical values of all clusters in one way or another would be too complex and would often lead to mathematically correct but illogical results, such as membership functions that accept all values or reject all values. The confidence in this representative criterion corresponds to the cluster confidence of the cluster to which the criterion belongs. Here we speak of the elementary confidence. It indicates how accurately this representative reflects the average opinion.
B. System confidence and goodness
After choosing a representative criterion per performance variable, we are ready to evaluate the systems one by one. As a basis we use Logic Scoring of Preferences (LSP), an existing decision support system. First we compute, for every performance variable, how well each system satisfies it, using the chosen representative criteria. For each system this yields a set of elementary preference scores with corresponding elementary confidences. These are combined into a global preference score per system through the LSP aggregation structure, which reflects the relative importance of the performance variables and closely follows the original decomposition of the problem into its attributes. Analogously, the elementary confidences are combined into a global confidence, the system confidence.
The results of the evaluation are not trivial to interpret. Ranking the systems is not straightforward, since each is characterized by two parameters. If the number of systems is large and no overview can be kept, one can perform a final step in which the preference and the confidence are combined with certain weights. This can optionally be merged with the cost and preference analysis proposed in [10]. The resulting value is called the goodness of the system and reflects both the degree to which the system satisfies the requirements of the problem (the degree to which it is suited as a solution) and the degree to which we are certain about the correctness of this result. Once this has been computed for every system, the systems can be ranked by decreasing goodness.
V. CONCLUSION
In this work we have developed a technique to aggregate large amounts of information. It allows us to extend existing decision support software in such a way that its current operation remains untouched while its field of application is enlarged. In this way it becomes possible to consult more than one person when making a decision. The number of people is not limited: one can work with a small group of domain experts or with an enormous group of people contacted through social media.
REFERENCES
[1] J.J. Dujmović and W.Y. Fang, "Reliability of LSP criteria," 2004.
[2] P.J.G. Lisboa, H. Wong, P. Harris, and R. Swindell, "A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer," Artificial Intelligence in Medicine, vol. 28, pp. 1–25, 2003.
[3] P.G.W. Keen, "Decision support systems: The next decade," Decision Support Systems, vol. 3, pp. 253–265, 1987.
[4] G.P. Huber, "Issues in the design of group decision support systems," MIS Quarterly, pp. 195–204, September 1984.
[5] J.P. Shim, M. Warkentin, J.F. Courtney, D.J. Power, R. Sharda, and C. Carlsson, "Past, present, and future of decision support technology," Decision Support Systems, vol. 33, pp. 111–126, 2002.
[6] A. Tapia-Rosero, A. Bronselaer, and G. De Tré, "Similarity of membership functions - a shape-based approach," in Proceedings of the 4th International Joint Conference on Computational Intelligence, 2012, pp. 402–409, SciTePress - Science and Technology Publications.
[7] A. Tapia-Rosero, A. Bronselaer, and G. De Tré, "A shape-similarity based method for detecting similar opinions in group decision-making," Information Sciences, Special Issue on New Challenges of Computing with Words in Decision Making, 2013.
[8] G. DeSanctis and R.B. Gallupe, "Group decision support systems: A new frontier," 1985.
[9] G. DeSanctis and R.B. Gallupe, "A foundation for the study of group decision support systems," Management Science, vol. 33, no. 5, May 1987.
[10] H.-J. Zimmermann, "Fuzzy programming and linear programming with several objective functions," Fuzzy Sets and Systems, vol. 1, pp. 45–55, 1978.
[11] G. De Tré, Vage Databanken.
[12] J.J. Dujmović and W.Y. Fang, "An empirical analysis of assessment errors for weights and andness in LSP criteria," San Francisco State University, Department of Computer Science, 2004.
[13] J.J. Dujmović, "A comparison of andness/orness indicators," San Francisco State University, Department of Computer Science.
[14] J.J. Dujmović, "Optimum location of an elementary school."
[15] J.J. Dujmović, G. De Tré, and S. Dragićević, "Comparison of multicriteria methods for land-use suitability assessment," 2009.
[16] J.J. Dujmović, G. De Tré, and N. Van De Weghe, "LSP suitability maps," 2009.
[17] R. Bodea and E. El Sayr, "Code coverage tool evaluation," 2008.
[18] J.J. Dujmović and G. De Tré, "Multicriteria methods and logic aggregation in suitability maps," 2011.
[19] J.J. Dujmović and H. Nagashima, "LSP method and its use for evaluation of Java IDEs," 2005.
[20] J.J. Dujmović, "Characteristic forms of generalized conjunction/disjunction."
[21] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, Inc., 1980.
[22] G.J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, 1995.
[23] D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press, New York, NY, USA, 1997.
[24] J.J. Dujmović and H.L. Larsen, "Generalized conjunction/disjunction," International Journal of Approximate Reasoning, vol. 46, no. 3, pp. 423–446, Dec. 2007.
Contents
1 Introduction
  1.1 Context
  1.2 The Current Situation
  1.3 Approach
  1.4 Applications
  1.5 Structure
2 Literary study
  2.1 Decision Support
    2.1.1 About Decision Support
    2.1.2 Characteristics of Decision Support Software
    2.1.3 Summary
  2.2 Multicriteria Problems
    2.2.1 Decision Support Algorithms
  2.3 Mathematical Preliminaries
    2.3.1 Soft Computing Techniques
    2.3.2 Logic Scoring of Preferences
    2.3.3 Suitability Maps
  2.4 Group Decision Support
    2.4.1 Aspects of GDSS
    2.4.2 Fields of Application
3 Research
  3.1 Aspects
  3.2 Scope
  3.3 Aggregation
    3.3.1 Clustering
    3.3.2 Further Tiers of Aggregation
  3.4 Confidence as a Concept
  3.5 Defining and Calculating Confidence
    3.5.1 Confidence at Cluster Level
    3.5.2 Elementary Confidence at Membership Function Level
    3.5.3 Global Confidence at System Level
  3.6 Combining Confidence and Preference
4 Case Study
  4.1 Background
  4.2 Evaluation
    4.2.1 Required Inputs
    4.2.2 Calculations
    4.2.3 Results
  4.3 Final Remarks
5 Conclusions
  5.1 Future Work
List of Figures
2.1 Membership function for "Slow"
2.2 General membership functions
2.3 Generalized conjunction and disjunction
2.4 Generalized conjunction/disjunction gradations
2.5 Generalized conjunction/disjunction Weighted Power Mean
2.6 LSP Aggregators
2.7 Disjunctive Partial Absorption aggregator
2.8 Mandatory-Desired-Optional compound aggregator, D-nested
2.9 Mandatory-Desired-Optional compound aggregator, A-nested
2.10 Mandatory-Desired-Optional compound aggregator, general
2.11 LSP Compound Aggregators
2.12 Suitability map
3.1 Shape-String
3.2 Shape-String example
3.3 Relative lengths
3.4 Feature-String
3.5 Shape-Symbolic notation
3.6 The Shape-Similarity method
3.7 Most and least typical values
3.8 Interval-Valued Fuzzy Set
3.9 Upper and lower bounds
3.10 Lower bound without core
4.1 Case study aggregation structure
4.2 Clustering dendrogram
4.3 Case study cluster 18
4.4 Case study cluster 30
4.5 Case study cluster 52
4.6 Case study cluster 60
List of Tables
4.1 Generated elementary criteria
4.2 Shape- and feature-strings
4.3 Similarity matrix
4.4 Confidence configuration A1
4.5 Confidence configuration A2
4.6 Confidence configuration A3
4.7 Performance variable confidences
4.8 Global preference and confidence
4.9 Global goodness B1
4.10 Global goodness B2
4.11 Global goodness B3
4.12 Derived optimal game prototype
Chapter 1
Introduction
1.1 Context
Today, the world has become more "flat" than ever before. As described in the book
by Thomas Friedman (The World Is Flat, 5 April 2005, Farrar, Straus and Giroux, ISBN
0-374-29288-4), the third wave of globalization is a result of the rise of the
internet. This allows us to connect to people on a more global scale than ever before:
we have the freedom to talk to people all around the globe. We can "see" people
everywhere around the world; the world is flat. This has multiple consequences, but it
mainly implies having easier access to information, which we can freely share. In the past
we were limited to our closest environment, but the arrival of personal transportation
introduced a wave of globalization allowing us to connect to more people in a larger
circle. Suddenly, other countries were also possible sources of information, but they might
also be considered as competition. Exchanging technology and expanding economy and
trade allowed for a personal enrichment leading to an increased standard of living, but this
life was not for everyone. Now, with the internet, we as individuals can reach other
individuals anywhere, at any time. These changes also have important consequences
for companies. This new, free medium of communication provides a large potential
market, as it allows them to reach more customers. But this is true for all businesses,
so in order to stay on top of their game, they had to shift their focus to marketing
techniques. In order to stay competitive with their rivals, two things have received a large
boost in the last decades: the research and development department, and advertising.
As opposed to the past, where a company focused on producing good
products on a steady scale to convince and keep a steady customer base, businesses now
invest more money in commercials to reach more people.
However, commercials alone are not enough. Because every company commercializes
their products, upgrades that appease the crowd are needed to convince possible buyers
that product A is more interesting than product B. Products are becoming more personalized, and customers like this. New techniques that focus on the ease of use can cause
a large boost in popularity. This explains the trend we see nowadays where companies
focus more on the customer than on the product. Often the overall quality of products
is showing a decline (mostly in durability), mainly because companies notice that their
clients do not mind consumerism. If they make the product attractive enough, people
will buy it and more importantly, buy a new one if the old one gives up. Even when the
old product is still working, people will replace their product by a new version as long
as the new version is attractive enough. Therefore, there is a strong trend of companies
focusing on the needs and desires of their customers.
One type of shift everyone experiences is the use of personalized commercials. Companies realize that bombarding everyone they can reach with commercials is not effective.
The spam effect causes people to ignore the overload of information, which has the
opposite effect of what is desired: instead of promoting their products, they push the
customer away.
Personalized commercials are a first step in the direction of handling this. Companies gather information on the habits of their users in multiple ways. A passive way
of doing this is by simply observing the behavior of their clients and trying to deduce
correlations in their actions. A typical example of this is the "shopping basket analysis",
in which an inventory is kept of recently made purchases and companies go looking for
trends of items that are often bought together. A famous example of this is the story of
how a company discovered a teenage girl was pregnant before her father did, simply
from her purchase history (http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/).
New techniques for passive information gathering have been developed in recent years in
the domain of data mining, which focuses on handling large amounts of information. Techniques such as clustering, data warehousing and trend analysis are becoming
more important in the data management systems of businesses.
An active way of gathering information from customers is through various types of questionnaires. An important factor of product development now depends on creating the
product the client wants, and hence it is important to know what the customer wants.
To find out what this is, companies ask their users questions about what they like. Often extra information can be derived from geographical and demographical analysis,
but in the new, flat world, where the spectrum of clients has vastly expanded, new
techniques are required.
1.2 The Current Situation
It is obvious that there is a need for data management techniques to stay competitive in the
flat environment where everyone can reach everyone. Incorporating
the customer is necessary to create better products in order to gain a position in
the global market. Therefore, information has to be analysed, as accurately as possible
and in a clever way. Artificial intelligence based on ontologies and mathematical analysis
tools based on statistics are up and coming.
Nowadays, users are being included in business decisions. The gathering of information
on their preferences is not the problem. The real question when facing the vast amount
of data being gathered is how to interpret this information. How can you use it to
derive what product someone really wants? Obviously, everyone
has their own opinion, but what steps should be taken during development to satisfy
more customers? What types of attributes of the product are mandatory, which are
optional? How important are they compared to each other?
Such problems are known as decision support problems. To that end, decision support
systems have been developed which will be further discussed in the next chapter. However, in the evolving flat world, the employed techniques should also evolve, as sticking
to old ones will not suffice. New technologies are needed to create accurate insight into
the vast amounts of data now available to us.
1.3 Approach
This paper introduces a set of techniques to handle large amounts of information. The
resulting software is based on an existing decision support system (DSS). The main idea
is that through the use of an aggregation technique, analysing the data is simplified.
With aggregation comes a loss of information, and to compensate for this a measure of
confidence is introduced. A discussion on the interpretation thereof follows later.
The technique heavily relies on soft computing, a branch of mathematics that is rapidly
gaining popularity. It revolves around a couple of principles stating that people do
not always reason in exact mathematics but more so in linguistic terms and ranges of
values rather than sharp (crisp) and precise values. This is illustrated by the following
example:
Imagine driving a car and approaching a red light. As the driver, you do not
think “I am exactly 24 m and 72 cm away from the lights, so I must brake
to exactly 31 km/h in order to stop at the right place”. In reality, what
happens is that you realise “I am close to the lights, so I should slow down”.
The exact distance and speed do not matter that much; what matters are terms such as
close and slow. These terms are different depending on the context and the individual.
What is close, and what is slow? Soft computing techniques allow us to model these
concepts on an individual level, while handing us a set of operations allowing us to use
this information to perform several calculations.
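As a small illustration of this example, the sketch below models "close" and "slow" as fuzzy sets over distance and speed. The breakpoints are invented; in practice they would be tailored to the context and the individual.

    def descending(x, full_until, zero_from):
        """Membership that is 1 below full_until, 0 above zero_from, linear in between."""
        if x <= full_until:
            return 1.0
        if x >= zero_from:
            return 0.0
        return (zero_from - x) / (zero_from - full_until)

    # Hypothetical definitions: fully "close" under 10 m, not close beyond 50 m;
    # fully "slow" under 20 km/h, not slow beyond 60 km/h.
    close = lambda distance_m: descending(distance_m, 10, 50)
    slow = lambda speed_kmh: descending(speed_kmh, 20, 60)

    print(round(close(24.72), 2))   # the driver is fairly close to the lights
    print(round(slow(31), 2))       # and driving moderately slowly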
It should also be clear that soft computing techniques do not always provide “the optimal” solution to a problem, such as the braking technique that wears out the brakes
the least, or uses the least gas. Instead, they provide a robust framework that is able
to gracefully handle a range of problems in a smooth way. Moreover, they simply present
a solution. An important property of soft computing techniques is that the solution,
when plotted, will display smooth curves, rather than the solutions produced by exact
techniques, which often show discontinuities and breaking points. That, combined with
the fact that these techniques allow the handling of a range of problems rather than reducing
every problem to an optimal path, makes soft computing similar to a natural way of reasoning, much like what humans do. This is exactly what we are trying to accomplish,
and that makes soft computing a perfect fit for the application field of this research. A more
detailed mathematical explanation is given later on, with the rest of the mathematical
preliminaries.
1.4 Applications
It is interesting to shed some light on possible fields of application for the proposed
techniques. Apart from the commercialization point of view, aggregation can be used
in different business contexts. Interesting types of applications are social media based,
incorporating inputs from a large number of users in business decisions. On the other
hand, several techniques can be applied to create a platform of flexible querying for
users, allowing them to ask questions to a system built with data aggregation, such as
“where would be a good place to go to on a sunny day?”. This platform can provide
a geographical map showing the suitability of possible locations matching the flexible
query, called a suitability map (S-map). Alternatively, these techniques can be used in
Group Decision Support Systems (GDSS) in a highly advanced context, being used by
groups of experts to reach a consensus on an important business decision, even when the
group of experts is not very large.
In this work, the focus lies on the social media aspect and on how large amounts of
data can be handled efficiently, though here and there the connection to other fields of
application is made.
1.5 Structure
The remainder of this document is structured as follows. Chapter 2 is a literature study elaborating on the background of the research conducted in the thesis. Chapter 3 is dedicated to the research itself, where new concepts related to group decision support, such as the proposed aggregation technique and the confidence measure, are introduced. Chapter 4 presents a case study to illustrate how it all works. Chapter 5 concludes with a summary of what has been accomplished and briefly outlines some opportunities for future work.
Chapter 2
Literature study
The purpose of this literature study is twofold. On the one hand, it portrays the current state of decision support systems, algorithms and techniques, backed by the works listed in the bibliography. On the other hand, it gives a detailed mathematical explanation of one of the more recent decision support algorithms, which is used as a base for the conducted research. This introduces the mathematical preliminaries necessary for understanding that research. The scarcity of literature on group decision support reflects the fact that this field of research is modern and ongoing, to which the subject of this thesis further contributes.
First, we discuss decision support in greater detail, followed by an elaboration on multicriteria decision problems. Then, some mathematical preliminaries are given to serve as a base for the research. Finally, there is a short section on group decision support.
2.1 Decision Support
2.1.1 About Decision Support
Throughout the day, people make choices concerning trivial matters such as what to eat
for lunch or what to do in the next hour. At work, choices are made that might influence the course of the business and thus have an economic impact. In clinical
environments, doctors make decisions regarding the healthcare of their patients. Often
this is backed by probabilistic models such as Bayesian Dependency Networks or Neural
Networks [LWHS03].
Clearly, systems to assist in making decisions, especially difficult ones, are a valuable
asset. This is what Decision Support Systems try to do.
Not every decision is made by an individual. On a larger scale, decisions are also made
by groups of people. It is a common practice for business decisions to consult a board
of professionals or managers. They come together to make decisions concerning their
company. Doing this improves the quality of the decision being made, as they combine their different areas of expertise and share their knowledge with one another. This way, they can accomplish more as a group than they would be able to alone.
This concept of sharing knowledge has been in place for a long time, but the way the information is exchanged has changed over time. Originally, meetings were held and all experts had to be physically present to attend. Due to the rise of tele- and videoconferencing there was no longer a direct need for executives to be together at the same location [Hub84], as it became possible to communicate over the internet. This first step has multiple beneficial effects. First, the experts no longer need to travel to a certain location, which takes time and effort; they can simply conference wherever they are, as long as they are online. Second, more experts can be consulted because there no longer is any geographical restriction. Today, it is not unusual to discuss problems with different individuals all over the world, using the internet as a medium [SWC+ 02]. Examples of this are the many websites dedicated to answering questions in a community-driven sense, often referred to as "stacks"¹ ².
But decision making on a large scale is not restricted to business decisions and experts. A large crowd of people can also be asked to make decisions; think of voting for a political leader or a referendum. Due to the rise of the internet, this can evolve further, allowing businesses to include large crowds in their decisions, to the point where social media are involved. Facebook originally had a policy allowing users to vote on "important site governance changes" but has recently decided to take away this right from its users³, ironically through that same voting system (which passed in Facebook's favour because not enough votes were cast, which was the reason Facebook chose to call the vote in the first place). However, the idea has been introduced and a door has been opened to
¹ http://stackoverflow.com
² http://serverfault.com
³ http://venturebeat.com/2012/11/21/no-more-voting/
companies that want to get a public opinion by using social media as a way of getting
information.
Alternatively, the public can be used to create a common place of information, much like
community-based projects. Social media can be consulted to produce a map of events
or to rate restaurants or to derive which locations are popular on a warm, summery
day. This can be done passively by observing the behaviour of users or actively through
asking people what they like and then aggregating these results into a publicly accessible
map.
Decision support has also paved the road to more advanced technology. Systems like
data warehouses (DW), Online Analytical Processing (OLAP) and techniques like data
mining are all examples of software that aids in making complex decisions [SWC+ 02]. These programs often grant the user an elevated view of the data at hand, allowing smarter choices to be made.
It should be clear that the development of DSS has many interesting applications in the modern world of today and tomorrow. Due to its versatile nature, DSS can be employed for different purposes, ranging from end-user personal assistance to business decisions to massive-scale collaboration. Furthermore, there is a growing market and with it a need for group decision algorithms, which will allow software to gather and aggregate large amounts of data and present them back to groups of users, be they experts or just a large group of social media users, in an orderly fashion and with a minimal loss of information.
Definition
Many different definitions have been proposed for what decision support is exactly. The truth is that decision support is exactly what the name implies: it supports making decisions. This is a broad concept and several people have given their point of view on decision support (DS).
One of the earliest definitions is given by Gerrity (1971) [DG85][SWC+ 02], who said
decision support is:
“An effective blend of human intelligence, information technology and software which interact closely to solve complex problems”.
This clearly reflects the fact that Gerrity considers decision support useful for solving problems that are too complex to handle by hand: problems that involve complex mathematical calculations, have a large range of parameters, or have too many candidate solutions to evaluate each of them manually. These are problems that are perfect for computers to solve. It also illustrates that decision support is not autonomous: it simply provides a human-computer interface for problem solving. The user is a required part of the decision support and is often referred to as the decision maker (DM).
A different definition is given by Keen (1987):
“The application of available and suitable computer-based technology to help
improve the effectiveness of managerial decision making in semi-structured
tasks”.
Clearly, Keen focusses on decision support from a business point of view, as he illustrates that managerial decisions are often a dread and not always very effective, something he hoped could be resolved by decision support systems. Elsewhere, DSS are also referred to as:
“Interactive decision aids for managers”.
For Keen, decision support systems are simply software implementations of decision aids
for managers. They vary based on the field of application they are used in, but
the bare essentials are the same for all.
Gorry and Morton defined the concept of a decision support system as follows:
“A decision support system is a computer system that deals with a problem
where at least some stage is semi-structured or unstructured.”
The terms structured, semi-structured and unstructured were defined to indicate the degree to which a decision problem is easily solvable. A new and unknown problem is seen as difficult, whereas an older, well-known problem might already have a known solution algorithm. Regardless, DSS should be able to solve a broad range of problems, but they are particularly interesting for solving those problems for which solution methods are not well known.
Decision support systems are thus exactly what the name says, systems that support the
process of decision making. They require input from the decision maker to model both
the problem and the evaluation logic. Decision support is especially useful for problems
where there are multiple possible solution systems and we want to find the best ones
from amongst them to solve the given problem.
DSS are, in other words, a human-computer interface exposing the available computational strength to aid in solving a whole range of problems, constructed to increase the efficiency of decision making and the effectiveness of the result. It is important to keep in mind that human evaluation lies at the heart of decision support, from defining what the problem is and what is preferred in a solution, to making the actual selection between the viable systems. A final desired property of DSS is the ability to quickly generate ad hoc results for new problems as they arise, which Keen discusses when talking about levels of support [Kee87]. He emphasizes that DSS should be able to answer "what if" questions.
History
Decision support systems have evolved a lot through the years, growing from analytical tools and information systems to computer-based software. They were originally introduced in the 1970s as a way to simplify the decision makers' task of making hard decisions by providing them with a powerful tool to do the calculations for them [Kee87]. Over time, decision support has evolved into several specific branches supporting specific decision making tasks, but the general idea remains the same: to improve the efficiency with which a user can make a decision and to improve the effectiveness of that decision [SWC+ 02].
Today, decision support has a growing potential due to the increase in resources made
available by the internet and cloud computing, allowing observers to ask questions to a
large audience through social media and to aggregate the received replies. This has led
to the birth of Group Decision Support Systems [DG85],[DG87].
It is widely believed that the future of DSS lies in mobile computing, where mobile devices such as smartphones, tablets or PDAs serve as clients, requesting on-demand decision support from a server [SWC+ 02]. Possibly, passive clients can act as agents collecting data or serve as small worker nodes performing part of the calculations needed to answer a
question asked by another client.
2.1.2 Characteristics of Decision Support Software
In the business world, large and important questions ([DDTD09]) such as "where to build the next expansion store to enlarge our influence" or "what location would be best suited for a new industrial infrastructure" often need to be answered. Because these are important decisions of great gravity, experts are consulted for their opinion on the matter. However, they do not always find a solution with ease. They might disagree on vital parts, or the problem might simply be too complex to handle. These questions depend on many factors ([DDTD09],[DDTVDW09]) and are too large to handle alone. The process of coming to a conclusion can take a long period of time, which is not only time-consuming by definition, but also costly, and even then there is no guarantee that every bit of information that was presented was used properly and that nothing was forgotten. This is obviously a perfect field of application for group decision support systems, which have certain interesting properties and advantages. For one, GDSS can handle large amounts of information quickly and will never overlook information or make calculation mistakes. These programs allow experts to model both the problem and their way of reasoning about the best solution. They perform all the tedious calculations, saving lots of time and making sure every piece of information is used and nothing is overlooked.
At the heart of a DSS lies a decision support algorithm. There are many different decision support algorithms ([DDT11]). The main factor that separates them from one another is the degree to which the user is capable of modelling the way he or she evaluates a logical problem ([DDTVDW09]). If we look at how decision support systems have evolved throughout time, it becomes clear that there is a trend towards systems that involve soft computing techniques [Duja],[SWC+ 02]. These allow more flexibility in expressing a human way of thinking, which is often not as strict as classical logic. Sometimes we do not want the exact and strict conjunction of two attributes, but rather want to express a desire for both to be satisfied, with a slight preference for one over the other. Hence we see that simple systems are no longer sufficient to compete in the actual world; complex systems based on advanced mathematical theory are needed.
Moving from DSS to GDSS, the underlying decision support algorithm becomes more
complex as well. Different experts have different opinions, and it is no longer trivial to say which attributes are more important than others, because not every expert has the same priorities. If you were to ask each expert individually which factors he or she thinks are more important than others, they would generally answer differently. For example, a geologist would have different requirements for a suitable location for building a new school than a biologist. The geologist would deem it necessary that the ground material is solid and rigid and would put this requirement high on his priority list, whereas the biologist might be more concerned with the environment and would hence attach a higher necessity to the proximity of trees and parks ([Dujc]).
Sometimes, experts will not be able to reach an agreement. How should the decision making process proceed then? Whose opinion should be valued more and whose suggestions should be followed? Is this even a correct way of handling the situation, keeping in mind that picking preferred experts means ignoring the input of some of the other experts, which results in a loss of information, which is undesirable in any scenario?
2.1.3 Summary
It is clear that good decision support software should be able to overcome many problems. More precisely, the program should be able to handle multiple users with diverging opinions and aggregate their input into a single output with as little loss of information as possible. Furthermore, it should be flexible in its modelling capabilities, able to handle difficult problems, and able to process large amounts of data quickly.
The development of aggregation techniques is a field of study that is under active research
and is also the subject of this thesis. As will be discussed later, there are many different
ways of combining the input to a single output.
2.2 Multicriteria Problems
For any decision problem, we say there are different systems (i.e., candidates or candidate
solutions) which are all possible solutions. The purpose of DSS is then to evaluate and
rank the systems. To define which of the evaluated systems are the good ones, we define a
measure to indicate how good a system is at solving the problem. In other words, we are
looking for the system that best satisfies the requirements of the problem [DN05].
A multicriteria problem is a problem that simply has multiple requirements. These
requirements can be decomposed into a hierarchical tree of measurable performance
variables, sometimes also called attributes. For each of the performance variables, we
define an elementary criterion, which is an evaluator function mapping a system's measured value for that variable onto a value indicating how well that system satisfies that criterion. This value is called an elementary preference score or elementary preference degree [DDT11],[Dujc].
In order to compare multiple systems, we aim to find a single measure that combines
all the information of the elementary preference scores. We call this score the global
preference score, or just global preference for short. This measure indicates how well the
system meets all requirements at the same time. A high value indicates most, if not
all, requirements are met. A low value indicates the opposite and usually carries the
meaning that that system is not a viable solution.
Preference scores are often normalized ([DF04]) so that a value of 0 indicates completely unsatisfactory whereas 1 indicates completely satisfactory; any gradation should be interpreted as the degree to which a requirement is satisfied, where a value closer to 1 means a higher degree of satisfaction ([DDTVDW09]). While evaluating all systems, a mapping between each system and its calculated global preference score is maintained. This makes it easy to evaluate the results at the end. The system with the highest score is in theory the "best" one, though that does not imply it is also the best in practice. It is rare to find a solution that fully satisfies all requirements. The proposed best solution, the one with the highest global preference, should be validated by the decision makers for its viability. It remains their responsibility to select the right system from the ones the algorithm rated as most suitable. The DSS simply aids in the process by evaluating the systems through making computations.
For finding the global preference score of a system, its elementary preference degrees
should be aggregated into one single value. It is usually the way this aggregation is
done that separates different MultiCriteria Decision Methods (MCDM) from one another.
2.2.1 Decision Support Algorithms
The most commonly used decision support algorithms are now briefly explained. Besides the fact that all decision support algorithms have as their main goal to combine elementary preference degrees into a single global preference, they should have at least the following ten fundamental properties, as discussed in [DDT11]:
1. Ability to combine any number of attributes (performance variables)
2. Ability to combine objective and subjective inputs
3. Ability to combine absolute and relative criteria
4. Flexible adjustment of relative importance of attributes
5. Modeling of simultaneity requirements (both soft and hard simultaneity)
6. Modeling of replaceability requirements (both soft and hard replaceability)
7. Modeling of balanced simultaneity/replaceability
8. Modeling of mandatory, desired and optional requirements
9. Modeling of sufficient, desired and optional requirements
10. Ability to express suitability as an aggregate of usefulness and inexpensiveness
The combination of these properties makes it possible to model a natural way of thinking and provides a flexible method for combining performance variables in many conceivable ways. A couple of approaches are briefly discussed: Simple Additive Scoring (SAS), the MultiAttribute Value Technique (MAVT) and the MultiAttribute Utility Technique (MAUT), the Analytic Hierarchy Process (AHP), Ordered Weighted Averaging (OWA) and Logic Scoring of Preferences (LSP) [DDTVDW09], [DDT11], [DN05].
SAS is the simplest of techniques. It is also the oldest and relies simply on the concept of
assigning weights to each performance variable to denote their relative importance. The
global preference of a system is then calculated as the weighted sum of the elementary
preference degrees of the attributes. This model is simple and fast but does not allow
for much flexibility and implies a disjunction between all attributes.
MAVT and MAUT are much like SAS but replace the way the elementary preference
degrees are calculated by specific functions with the intent to capture human judgements.
Furthermore, the weighting factors are chosen so that they sum to a total of 1. Like SAS, MAVT and MAUT are not very flexible.
AHP makes an assumption based on psychological research positing that the highest accuracy is achieved when the number of attributes to be compared is small. It is based on a hierarchical structure in which attributes are compared pairwise. The root of the resulting tree represents the global preference score.
In OWA, the performance variables are first ranked on their relevance by the decision
maker. Then, the relative importance between them is modelled by a vector of weights
that sum up to 1. The elementary preference scores of each performance variable are
used in an OWA aggregator function, which is a parametrized instance of a class of
mean-type aggregation operators. Through choosing the weights properly, adjustable
levels of simultaneity and replaceability can be achieved.
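As an illustration, here is a small sketch of an OWA aggregation (Python; the weight vectors are only examples). Note that the weights are applied to the ranked scores, not to particular attributes:

# Minimal OWA sketch: the weights apply to the ranked scores, not to specific attributes.
def owa(scores, weights):
    """Ordered Weighted Averaging; the weights are assumed to sum to 1."""
    ranked = sorted(scores, reverse=True)           # highest score first
    return sum(w * s for w, s in zip(weights, ranked))

scores = [0.4, 0.9, 0.7]
print(owa(scores, [1.0, 0.0, 0.0]))   # pure orness: behaves like max -> 0.9
print(owa(scores, [0.0, 0.0, 1.0]))   # pure andness: behaves like min -> 0.4
print(owa(scores, [1/3, 1/3, 1/3]))   # plain arithmetic mean -> ~0.67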
LSP is one of the more recent methods which is based on a step-by-step process where
the elementary preference scores are combined with desired simultaneity or replaceability
and aggregated into a new value, which can in turn be aggregated again recursively with
other degrees. This can be seen as a non-binary tree with the global preference of the system at its root.
In this document, LSP is expanded to handle group data, effectively transforming it into
a GDSS. To better understand LSP, a few mathematical concepts are needed, which are
explained in the next section.
2.3 Mathematical Preliminaries
LSP heavily relies on soft computing techniques, both for the elementary criteria and the
aggregation steps that combine the elementary preference scores to a global preference.
Therefore we will now briefly explain the necessary soft computing concepts.
2.3.1 Soft Computing Techniques
Membership Functions
Membership functions are a basic soft computing construct used in the context of thinking in terms instead of values. Mathematically, they are the characteristic representation of a fuzzy set [DT]. In this work, they are mainly used to represent elementary criteria. They are best understood by first understanding what a fuzzy set is.
Fuzzy Sets
A fuzzy set [DP80], [KY95] is similar to the concept of a set in a mathematical sense.
It is often used to describe a linguistic term. In classical mathematics, each individual
in a universe is either completely a member or completely not a member of a set. This
means any individual either “is” the linguistic term, or “is not” the linguistic term. For
example, in the universe of people, each individual either is or is not male. We call this
a strict or crisp set.
With fuzzy sets, each individual is a member of each set with a certain degree of membership. This degree of membership can be interpreted in different ways, depending
on the application context [DT]. Whatever the interpretation, each has the same characteristics for the extreme values, namely that a membership degree of 1 means total
participation (equivalent to the classical membership of an individual) and a 0 means
the opposite (equivalent to the non-membership of an individual).
A fuzzy set can be represented graphically using a membership function. It plots the
degree of participation for each element in a universe. The membership function is then
characterized by the fuzzy set it represents, though different sets may have the same
membership function. As an illustrative example, let's look at a possible membership function for the linguistic term "slow".
Keep in mind this represents the opinion of the person defining the membership function.
This is my individual function, others might produce a similar yet slightly different one
or even a completely different one.
In the context of driving a car, we aim to model the concept of a "slow" speed. We say that anything below 30 km/h is slow and that everything above 50 km/h is absolutely not slow. In this case, all real values from 0 to 30 receive a membership degree of 1 (indicating fully in accordance with the term "slow"), and everything above 50 gets a 0 (indicating "not slow").
Note that this does not imply that any speed above 50 km/h is necessarily
“fast”; this would be modelled by a different membership function.
The part between 30 and 50 km/h can be filled in as we please, representing our interpretation of the term slow. A technique that is often used is a linear connection. The reason for this is twofold: it is simple to work with mathematically, and it often properly represents human logic. The higher the speed, the less "slow" it is. This gradient is generally perceived as linear, meaning that a speed increase that is twice as large is interpreted as twice as much "less slow". The resulting membership function is shown in Figure 2.1.
Figure 2.1: Membership function for the linguistic term “Slow”.
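As a minimal sketch (Python), the membership function of Figure 2.1 could be written as follows; the 30 and 50 km/h break points come from the example above, and the function name is ours:

# Sketch of the membership function for the linguistic term "slow" from the example above.
def slow(speed_kmh):
    """Degree (in [0, 1]) to which a speed is considered 'slow'."""
    if speed_kmh <= 30:
        return 1.0                      # fully "slow"
    if speed_kmh >= 50:
        return 0.0                      # not "slow" at all
    return (50 - speed_kmh) / 20.0      # linear descent between 30 and 50 km/h

for v in (20, 35, 45, 60):
    print(v, "km/h ->", slow(v))        # 1.0, 0.75, 0.25, 0.0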
Typically, the graphical representation of a membership function is trapezoid-shaped. This stems from the fact that neighbouring elements in a universe are often close to each other and hence exhibit similar behaviour. Moreover, these are the easiest functions to work with, owing to the linearity of each part, yet they are also sufficient in their modelling capabilities. The membership function should be continuous.
Note that even if the ordered elements do not produce a continuous membership function, a bijection can be found that reorders the data in such a manner that a continuous membership function is obtained. In the rest of this work we will thus assume all membership functions are continuous, without violating the generality of the technique.
Finally, fuzzy sets can also be used to represent strict sets. Their corresponding membership function representation can then be split into rectangles. Each element is either part of the set or is not, indicated by a membership degree of 1 or 0 respectively. This gives a typical rectangular shape. It is thus clear that fuzzy sets extend strict sets with an extra dimension, the gradient of membership.
Interpreting Membership Functions
There are different types of interpretations that can be given to membership functions.
The interpretation in the context of decision making is that of elementary criteria, in which a performance variable is chosen whose values are presented on the x-axis. The function is then used to model regions of preferred values and the gradients between them. The y-axis then reflects the degree to which a value is satisfactory for the chosen criterion [DDTVDW09]. Applied to DSS, this value represents the elementary preference score.
Figure 2.2: Membership functions where the Y-axis has been converted to a percentage of satisfaction, illustrating how a membership function can be used to imply a range of desired values
for a variable.
We call the support of a membership function the range on the X-axis where the function
does not equal zero. The core of a membership function is the range on the X-axis where
the function equals one.
Generalized Conjunction and Disjunction
Another technique which is used in LSP is the generalized conjunction and disjunction
(GCD) [DL07]. This is a generalization of the common concepts of conjunction and
disjunction in the sense that it allows a gradient between the extremes. Soft computing
here extends the classical concepts of disjunction and conjunction by treating them as
points. Adding a connecting line between them creates a new dimension. The concepts
“andness” and “orness” are introduced as degrees of simultaneity and replaceability.
The mathematical AND, when weighted, then becomes the generalized conjunction. A new parameter α is introduced, representing the degree of "andness" between a set of input values. In the extreme case, the full conjunction, or the classical logical AND, is the generalized conjunction with α = 1. The other extreme, with value 0, expresses a full disjunction and is equivalent to the classical logical OR. Any gradation with α between 0.5 and 1 indicates a partial conjunction, also known as the "andor" function, and still favours conjunction over disjunction: there is a higher degree of simultaneity between the inputs than replaceability. Completely dual to this concept of "andness" is "orness", represented by the symbol ω. This is shown in Figures 2.3 and 2.4.
Figure 2.3: Graphical representation of the generalized conjunction/disjunction.
Figure 2.4: Symbolic representation of the GCD and some of its gradations as they are often used
in LSP. HPD, SPD, HPC and SPC stand for hard and soft partial disjunction and conjunction,
respectively.
There are different implementations of the GCD, but a commonly used implementation is based on the Weighted Power Mean (WPM)⁴ ⁵ [Duja]. This is a function which calculates a mean given two input vectors, E and W, and a parameter r. The vectors respectively represent the input values and their weights. The parameter r changes the behaviour of the mean. The formula for the weighted power mean is the following:
\[
E_1 \diamond \cdots \diamond E_k = \left( \sum_{i=1}^{k} W_i E_i^{\,r} \right)^{1/r}, \qquad -\infty \le r \le +\infty
\]
In this implementation of the GCD, the degree of "andness" (or equivalently "orness") is configurable through the choice of the r parameter. For minimal r, that is in the limit where r approaches negative infinity, the mean equals the minimum of the input values, independent of the individual weights, assuming they are non-zero. This represents a full conjunction between the inputs, better known as the logical AND between predicates. For maximal r, that is in the limit where r approaches positive infinity, the behaviour is reversed and represents a full disjunction, which equals the logical OR. In the specific case where r equals exactly 1, the weighted power mean reduces to the weighted arithmetic mean of its inputs. For any r larger than 1, the WPM results in a partial disjunction, with the degree of orness growing as r increases. Conversely, for r smaller than 1, the WPM produces a partial conjunction.
⁴ http://en.wikipedia.org/wiki/Generalized_mean
⁵ http://planetmath.org/weightedpowermean
For negative r, the conjunction gains an additional interpretation. In these cases, the inputs to the WPM are considered mandatory. This means that if the value of any of the inputs is 0 (completely unsatisfied), the result of the aggregation will also be 0. In the context of LSP, this is used to indicate that a set of requirements must be fulfilled. Systems that violate one of these requirements receive a global preference score of 0 because one of the mandatory attributes is not respected, leading to the conclusion that the evaluated system is unusable.
In order to simplify calculations and to establish a standard for comparisons, there is a set of suggested values for r to express specific gradations in andness and orness [DL07]. These are shown in Figure 2.5:
Figure 2.5: An example of GCD using WPM, mapping the connection between the WPM r-parameter and the respective GCD andness and orness levels between inputs.
2.3.2 Logic Scoring of Preferences
Because LSP uses the soft computing techniques discussed above, it allows accurate and
flexible modelling of human evaluation logic. It uses membership functions as elementary
criteria and the generalized conjunction and disjunction for the aggregation structure.
The steps of LSP are as follows:
• Compose a hierarchical requirement tree indicating the relevance of the attributes
• Define elementary criteria with membership functions [Zim78],[DT]
• Create the aggregation tree composed of LSP aggregators
• Evaluate each system by calculating its global preference
Requirement Trees and Elementary Criteria
For multicriteria decision problems, the requirements can often be decomposed hierarchically into a requirement tree. For a software product prototype, for example, these are
closely related to the software quality attributes. Similarly, they can be decomposed
into components that can be individually measured and evaluated. Such a component
is a performance variable. Each is represented by a linguistic term. Therefore, their
evaluator functions, the elementary criteria, are perfectly suited for the application of
membership functions.
To illustrate this, we give a short example of an elementary criterion. Let's say we want to build a new school and are looking for the best possible location to do so. One of the considered performance variables is the level of sound that is allowed in the neighbourhood. We could state that anything over 80 dB is completely unacceptable and that anything under 50 dB is considered best-case. We can then model our decreasing tolerance for increasing loudness using a linear downward slope. The resulting elementary criterion is a membership function of the family displayed in the middle graph in Figure 2.2, where C would be 50 dB and D would be 80 dB. This function models our preference for the range of acceptable values for this performance variable and can be reused to evaluate each system, calculating its elementary preference score for the performance variable "loudness". Different systems will typically have different values on the X-axis.
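As a sketch (Python; the measured values of the candidate locations are made up for illustration), the loudness criterion above can be expressed and applied to several candidate systems as follows:

# Sketch of the "loudness" elementary criterion: 50 dB fully acceptable, 80 dB unacceptable.
def loudness_criterion(level_db):
    """Elementary preference score in [0, 1] for a measured sound level."""
    if level_db <= 50:
        return 1.0
    if level_db >= 80:
        return 0.0
    return (80 - level_db) / 30.0          # linear downward slope between 50 and 80 dB

# Hypothetical measured values for three candidate locations.
systems = {"location A": 48, "location B": 65, "location C": 83}
elementary_scores = {name: loudness_criterion(db) for name, db in systems.items()}
print(elementary_scores)   # {'location A': 1.0, 'location B': 0.5, 'location C': 0.0}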
This process is applied for each performance variable, so there will be exactly as many
elementary criteria as there are performance variables.
Only after defining all of the elementary criteria does it make sense to evaluate a system.
Doing so generates a vector of elementary preference scores. The next step is to combine
them into one measure, the global preference score, which reflects the global ability of
the evaluated system to satisfy the requirements of the multicriteria problem. In LSP,
this is done using LSP aggregators.
LSP Aggregators
In LSP, the aggregation of elementary preference scores is heavily based on the GCD. The
r-parameter in each step is based on the hierarchical performance variable tree. During
its creation in the first phase of LSP, the performance variables can be annotated with
mandatory and optional indicators, simplifying the aggregation later on. The remaining
work to be done is to choose weights specifying the relative importance of the attributes.
An example illustrating this is displayed in Figure 2.6.
Figure 2.6: An example of GCD aggregators in an LSP aggregation structure.
Note that the conjunction is used to express the desire for simultaneous satisfaction of
multiple requirements in an LSP problem. Dually, the disjunction is used to indicate it
suffices when any of the inputs is satisfied.
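To give an idea of what such an aggregation structure could look like in code, here is a hedged sketch (Python) that nests WPM-based aggregators; the attributes, weights and r values are invented for illustration and do not come from the cited literature:

# Sketch of a tiny LSP aggregation structure built from nested WPM aggregators.
def wpm(scores, weights, r):
    if r < 0 and min(scores) == 0:
        return 0.0                                 # a mandatory branch is annihilated
    return sum(w * s ** r for s, w in zip(scores, weights)) ** (1.0 / r)

# Elementary preference scores of one candidate system (hypothetical values).
loudness, accessibility, cost = 0.8, 0.6, 0.9

# First combine loudness and accessibility with a soft partial conjunction (r < 1),
# then combine that result with cost using a neutral (arithmetic mean) aggregator.
environment = wpm([loudness, accessibility], [0.6, 0.4], r=-0.7)
global_preference = wpm([environment, cost], [0.7, 0.3], r=1.0)
print(round(global_preference, 3))                 # -> about 0.766 for these inputs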
Compound Aggregators
Using the concepts of LSP aggregators, we can already model a large part of human evaluation logic. However, after dealing with many problems, it became apparent that similar structures kept reappearing. It is possible to define a sort of design pattern for specific constructs of reasoning. These are usually realized through the composition of LSP aggregators in a specific way and are called canonical aggregation structures (CAS, [DDT11]). An example relation that is often used is the Conjunctive/Disjunctive Partial Absorption (CPA/DPA, [DN05]). This aggregation combines a mandatory (respectively sufficient) input x with a desired input y. In the mandatory case, this means that the resulting preference score will be 0 if the elementary preference of x is not satisfied, and x + xR or x − xP in case it is, where x is the elementary preference of the mandatory input and R and P are reward and penalty terms, which are based on the elementary preference score of the desired input y. Similarly, for a sufficient input x and a desired input y, the output preference is (close to) 1 if the input is completely satisfied, x + xR in case it is not but the desired input is satisfied, x − xP in case neither it nor the desired input is satisfied, and smaller than y when x is completely unsatisfied. The values of R and P are usually chosen in the ranges [0.05, 0.15] and −[0.10, 0.30] respectively, based on the importance of the desired input.
Figure 2.7: An illustration of the Disjunctive Partial Absorption aggregator, combining a Sufficient input and a Desired input with P = -0.1 and R = 0.25.
We can go even further and define compound aggregators using regular GCD and CPA/DPA aggregators as building blocks, yielding Mandatory/Desired/Optional (MDO) and Sufficient/Desired/Optional (SDO) aggregators ([DN05]). There are several implementations of these aggregators and they differ slightly in their properties. In these operators, the optional input is much like the desired input but has a lower compensational power.
Figure 2.8: A realization of the MDO operator, D-nested.
Figure 2.9: A realization of the MDO operator, M-nested.
2.3.3 Suitability Maps
A particularly interesting way of displaying the results of the previously mentioned system-preference mapping is through suitability maps (S-maps). These are applicable to problems whose candidate solutions can be displayed on a graphical map. A typical example is solving geographical problems, where the possible solutions
are locations. First, a geographical map is meshed into a grid of locations, and for
each location the performance variables are measured, such as altitude, slope, ground
material, etc. ([Dujc]). After calculating the global preferences, they can be visually
represented in a 3D-plot where the global preference, referred to as the suitability, is
displayed as a bar on the Z-axis and the locations are plotted in the x,y-plane based on
their grid coordinates. Figure 2.12 shows an example.
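A rough sketch of how such a grid could be evaluated is shown below (Python); the grid size, the measurement function and the suitability function are placeholders, and a real implementation would plug in an LSP aggregation at the marked point:

# Sketch: evaluate a meshed grid of locations into a suitability map (S-map).
def measure(x, y):
    """Placeholder: return the measured performance variables for grid cell (x, y)."""
    return {"altitude": 100 + x, "slope": y % 10}

def suitability(measurements):
    """Placeholder global preference in [0, 1]; a real LSP aggregation would go here."""
    return max(0.0, 1.0 - measurements["slope"] / 10.0)

GRID_WIDTH, GRID_HEIGHT = 4, 3
s_map = [[suitability(measure(x, y)) for x in range(GRID_WIDTH)] for y in range(GRID_HEIGHT)]
for row in s_map:
    print(row)    # each value is the suitability (global preference) of one grid location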
2.4 Group Decision Support
The decision support systems discussed thus far all assume that there is one central person running the operation. This person is supposed to enter both the aggregation structure and his or her personal preferences for the attributes in the form of elementary criteria, after first defining the performance variables of the problem. This person is often also the decision maker, because it is his or her task to eventually make the decision based on the information that is calculated and presented by the DSS.
Figure 2.10: A realization of the MDO operator using a weighted neutrality (A).
Figure 2.11: An example of an LSP aggregation structure with compound aggregators, outlined in gray.
Figure 2.12: An example S-map, where suitability is synonymous with global preference score.

GDSS extend regular DSS by trying to provide a solution for the presence of multiple DMs, which is a much more realistic scenario, especially in business environments. But also in a different context, that of social media, more people than just
a board of experts can be involved in the decision making process. Where classic DSS support one DM, and GDSS support a group of experts (say anywhere between 2 and 15 DMs), we strive to find techniques that handle many more DMs, in the order of hundreds or thousands. When doing this, certain factors start to play a role, such as the complexity of the calculations and the representation of the results, but most importantly ways to handle the large amounts of data, which we propose to do through aggregation. Therefore we need to split decision support algorithms into several aspects, which allows us to handle them separately.
In the current literature on group decision support systems, the focus lies on the sharing
of information among the members of a relatively small group of experts. This research
is an innovation in the sense that we want to broaden the spectrum to open up the
possibility of using GDSS in the context of social media, effectively consulting large
groups of experts.
2.4.1 Aspects of GDSS
Analysing the nature of decision support systems, we can find several parts in which
we can split the process. This mainly happens by looking at who does what, and by
separating these tasks into roles.
First, it is clear that we can separate the decision makers from the experts. This means
that opinions from outside the board of executives making the decision are also heard.
Second, the experts can be separated from the evaluators, the latter being the people who decide which performance variables of a problem are more important than others. Finally, the analysts can also be considered separately. They are the people who model the problem by deciding which performance variables there are and how they are hierarchically structured. Generally, a person can have multiple roles, though the roles are independent of each other.
This decomposition allows us to split GDSS into several independent roles, such as
problem modelling, expert opinion (data) gathering, aggregation techniques and decision
making based on the resulting system-preference mapping.
2.4.2 Fields of Application
Possible fields of application for decision support are diverse and each has its own requirements, so a flexible framework is necessary, with careful consideration by the user of when a certain technique is considered good or bad and for which contexts this holds.
The most traditional field of application for GDSS is that of business decisions, where
a group of experts is consulted in the decision making process. Their opinions need
to be both respected and combined at the same time, as accurately as possible. The
system should facilitate the sharing of information and compute the compatibility of
their possibly diverging opinions. The GDSS serves as an aiding tool to help keep a
clear overview on all the information that is at hand and at the same time provides a
mathematically grounded summary of all the input.
Another possible field of application is in the domain of medicine. Clinical decisions can
in some way be compared to business decisions in the sense that a group of experts is
consulted for their opinions on what would be the best course of action given a problem.
Such problems include determining the illness of a patient or the proper treatment for a tricky condition. This type of GDSS would be more probabilistically based, backed
by other types of probabilistic tools such as Neural Networks, Bayesian Networks and
general genetics science. These types of problems are considered critical, as the health
of a person is at stake, and the correct handling of information is a matter of great
importance. Therefore, the correctness of the computations, the interpretation of the
results and the use of the framework should be clear and precise.
Another field of application is one that has received little attention in the current literature so far, as it is created by a recent and upcoming trend. Including social media in the decision making process is becoming more and more valued due to the changing face of the world. Businesses can now reach far more people in a simple way through the internet to obtain valuable information about their clients, and use this information to produce a better product and gain a market advantage. A better product can be either a product that better satisfies the demands of the customer or a product of higher quality, based on the desired properties of a product. Because a client base can be very large, techniques to handle such vast quantities of information are needed, and this is what GDSS can provide.
An interesting, recent and well-known example of a business including social media in decision making is that of Hasbro, the company behind the popular board game Monopoly. Hasbro held a public poll to gather information about user preferences for the player tokens, including the well-known tokens and some new ones. This led to the replacement of the iron by the cat token⁶.

⁶ http://www.bbc.co.uk/news/entertainment-arts-21356033
Chapter 3
Research
It is clear that there is a large range of applications that would benefit from extended
decision support. From the discussion on decision support systems, however, it is also apparent that finding a one-size-fits-all solution is not obvious.
The performed research focusses on the terms aggregation and confidence. Aggregation
is used to minimize the trouble of handling large amounts of data input, and confidence
is used as a measure to minimize the loss of information.
First, the different aspects of decision support systems are analysed more in-depth than
before, followed by a clear declaration of the scope of the research. Then, an aggregation
technique is introduced and explained, followed by the definition of confidence, and
the calculation thereof. Finally, there is a section on how to interpret confidence by
combining it with global preferences.
3.1 Aspects
Before going into the depth of the research, it is interesting to analyse the different aspects of decision support systems. Step by step, we first have the problem analysis, in which a problem is decomposed into its performance variables. At this point, we can specify which are most important and which are mandatory. Then comes the definition of elementary criteria, gained from experts, defining the preferred values for these attributes. These are then used to evaluate several candidate systems. The hereby produced elementary preference scores are aggregated by an aggregation structure. This structure reflects the previously established relative importance between the attributes. The resulting global preference scores are gathered in a system-preference mapping. The decision maker(s) can now make a decision.
In conventional decision support, all these roles are usually performed by a single person. In group decision support, this is extended in such a way that a group of experts "acts" as one person.
The aim is to produce a system with a logical separation of the roles in the decision
making process. For example, it is desired that the experts should be able to model
their own opinion on what are good values for the performance variables individually
and independently.
Another point of difference is the aggregation structure. Some might find a certain
attribute more important, even going from optional or desired to mandatory. Some
might distribute the weights differently, indicating a different relative importance between the attributes of the problem.
It is apparent that there are different dimensions to decision support, and that a separation of the independent steps into the previously mentioned roles is possible. This
allows us to consider that separate people are involved for the separate steps, or even
scenarios where certain inputs are externally provided.
3.2 Scope
To further explain the developed techniques, an example case study is described in detail
in chapter 4. We hereby limit ourselves to an investigation into the field of social media
applications. Therein the “experts” consist of all the consulted people, which can be a
very large group, going from hundreds to tens of thousands.
Some important established organisations already consult their user base for making decisions, such as Google¹ and, until recently, Facebook². The decision makers are then a group consisting of the members of a board of executives of the considered business.

¹ http://www.google.com/about/company/philosophy/
² https://blog.facebook.com/blog.php?post=70896562130
The core of the research is about the aggregation of the large amounts of “expert” inputs. Their elementary criteria indicating their preferences are represented by generated
membership functions. From here on, we consider the aggregation structure to be externally provided. The actual origin thereof is not important, as it can be independently
developed by the decision makers. Alternatively, it can be the result of an aggregation technique for combining non-binary weighted trees, but this is outside the scope of the
work. The next section focusses on the aggregation of membership functions, and thus
the clustering of elementary criteria.
3.3 Aggregation
The aggregation technique applied in this research is based on a recently developed technique [TRBDT12], [TRBDT13]. It takes place in the preference expression step, directly after gathering all elementary criteria. Because this happens before the elementary preference scores are calculated and long before the aggregation structure is used, the aggregation structure remains oblivious of the aggregation, allowing it to stay largely unchanged with respect to the aggregation structure in traditional LSP.
The purpose of the extra early aggregation step is to combine the experts into a single,
merged expert. This means that all experts will appear to the aggregation structure as
a single person. Because this all happens in an early stage, it can also be referred to as
pre-aggregation.
The concept of aggregation is chosen to reduce the large amounts of information to a manageable portion, both logically and computationally. This brings with it an inevitable loss of information, which is mitigated by reintroducing it as the parameter called confidence.
Aggregation is a multiple-tier process based on the similarity of membership functions.
The purpose is to group experts with similar opinions in clusters followed by selecting
a representative cluster with certain confidence. This representative is then used for
further calculations.
3.3.1 Clustering
In order to cluster the experts into groups we need a way to compare membership
functions. To do this we first translate them to an alternative representation based on
their shape and the length of their components. This representation is called the shape-symbolic notation and consists of two parts, a shape-string and a feature-string. Because
this is purely mathematical and not directly related to the concept of decision support,
we here use the term membership functions instead of elementary criteria.
Definition 3.3.1 The shape-symbolic notation of a membership function is an alternative representation consisting of two parts: a shape-string describing its shape and a
feature-string describing the relative lengths of its components.
Shape-string
First, the membership functions are described based on their shape. The corresponding representation is called a shape-string, where each segment of the membership function uses a symbol: a sign {+, -} to represent an upward or downward slope, a value [0, 1] to represent the level of preference (on segments without a slope), or a letter {L, I, H} to denote a point, where L means 0, H means 1 and I means a value in ]0,1[.
This serves as an alphabet for a context-free grammar G(N;T;S;P) which we can use to
generate shape-strings, where
N = {<slope>, <preference level>, <point>, <segment>, <shape-string>} is the
set of non-terminal symbols;
T = {+, -, 0, 1, L, I, H} is the set of terminal symbols;
S = {<shape-string>} is the starting symbol; and
P is the following set of production rules:
<slope> ::= + | -
<preference level> ::= 0 | 1
<point> ::= L | I | H
<segment> ::= <slope> | <preference level> | <point>
<shape-string> ::= <segment> | <segment><shape-string>
Figure 3.1: Illustration of shape-string alphabet based on trapezoid shaped membership function.
Figure 3.2: Examples of shape-strings for membership functions.
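As a rough illustration (Python; the encoding below is our own simplification of the notation described above), a shape-string could be derived from a piecewise-linear membership function given as a list of (x, y) breakpoints:

# Simplified sketch: derive a shape-string from a piecewise-linear membership function
# given as breakpoints [(x0, y0), (x1, y1), ...] with y-values in [0, 1].
def point_symbol(y):
    return "L" if y == 0 else ("H" if y == 1 else "I")

def shape_string(points):
    symbols = [point_symbol(points[0][1])]           # starting point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 > y0:
            symbols.append("+")                       # upward slope
        elif y1 < y0:
            symbols.append("-")                       # downward slope
        else:
            symbols.append(str(int(y0)))              # horizontal segment at level 0 or 1
        symbols.append(point_symbol(y1))              # end point of the segment
    return "".join(symbols)

# Trapezoid membership function: rises from 0, stays at 1 on its core, drops back to 0.
print(shape_string([(0, 0), (30, 1), (50, 1), (80, 0)]))   # -> "L+H1H-L"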
Feature-string
Second, each membership function is related to a feature-string. This linguistic description tries to capture the length of the segments on the X-axis of the membership
functions. This allows us to make a distinction between two functions with similar
shape-strings. Here, soft computing is used for its flexibility. We define a set of relative
lengths:
R = {ES = extremely short, VS = very short, S = short, M = medium, L = long, VL
= very long, EL = extremely long}
Research has shown that around seven distinct linguistic terms is the optimum when weighing precision against correctness: it becomes harder to differentiate between more than seven different types of length, whereas fewer than seven terms do not provide enough descriptive power to be useful.
For each membership function, the lengths of its characteristic parts are translated into their corresponding linguistic representation. To this end, all membership functions are first scaled on the X-axis to have the same length. This is harder than it sounds, as the membership functions represent expert opinions and it is difficult to scale the functions without harming their representativity. Hence a maximum value has to be found at which all functions are clipped, and this value needs to be strictly larger than the maximum of the d-values of all elementary criteria.
When this known, fixed length has been determined, each part can be assigned a linguistic term describing its relative length. For this, we calculate the fraction of the
length of the part to the known length of the X-axis. A possible mapping of these fractions to the terms is depicted in Figure 3.3 using membership functions for each relative
length:
Figure 3.3: Possible representation of relative lengths using membership functions.
To determine the corresponding term, we look up the fraction on the X-axis and look at
the maximum membership function vertically above that point. Based on this principle
we can determine the feature-string for each membership function.
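A hedged sketch of this lookup is given below (Python); the triangular term definitions are invented placeholders, not the exact shapes of Figure 3.3:

# Sketch: map a relative length (a fraction of the X-axis) to a linguistic term by taking
# the term whose membership function is highest at that fraction. Term shapes are made up.
TERMS = {   # (left, peak, right) of hypothetical triangular membership functions
    "ES": (0.00, 0.00, 0.10), "VS": (0.05, 0.15, 0.25), "S": (0.20, 0.30, 0.40),
    "M":  (0.35, 0.50, 0.65), "L":  (0.60, 0.70, 0.80), "VL": (0.75, 0.85, 0.95),
    "EL": (0.90, 1.00, 1.00),
}

def triangle(x, left, peak, right):
    if x <= left or x >= right:
        return 1.0 if x == peak else 0.0
    return (x - left) / (peak - left) if x < peak else (right - x) / (right - peak)

def length_term(fraction):
    return max(TERMS, key=lambda t: triangle(fraction, *TERMS[t]))

print(length_term(0.07))   # 'ES' with these made-up term shapes
print(length_term(0.55))   # 'M'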
The Shape-symbolic Notation
Combining both the shape-string and the feature-string, we can annotate each membership function with its shape-symbolic notation. This consists of a sequence of symbolic characters, one for each part of the membership function, each consisting of a shape character and a length indicator.
Based on this representation, membership functions can be compared to each other. To do so, we use a similarity measure which respects the properties of reflexivity and symmetry, and which is based on the similarity between two shape-symbolic notations. It is calculated using a modified version of the Levenshtein distance [Gus97] with a cost function taking into account inserting, replacing and deleting shape-symbolic characters.
Figure 3.4: Example of feature-string for a membership function.
Figure 3.5: Example of the shape-symbolic notation of a membership function, showing symbolic
characters b1 and b2.
For inserting and deleting, the cost depends on the length of the respectively inserted and
deleted part. For replacing, the cost depends on the change of length. An extra penalty is
added in case the new shape-symbolic character has a different shape component.
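The following is a hedged sketch of such a cost-based edit distance (Python); the concrete cost values and the representation of a symbolic character as a (shape, relative length) pair are assumptions, not the exact costs of the cited method:

# Sketch of a modified Levenshtein distance over shape-symbolic characters, where each
# character is a (shape, relative_length) pair with relative_length in [0, 1].
def edit_distance(a, b, shape_penalty=0.5):
    def insert_cost(ch):
        return ch[1]                                 # cost grows with the inserted length
    def delete_cost(ch):
        return ch[1]                                 # cost grows with the deleted length
    def replace_cost(c1, c2):
        cost = abs(c1[1] - c2[1])                    # change of length
        if c1[0] != c2[0]:
            cost += shape_penalty                    # extra penalty for a different shape
        return cost

    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + delete_cost(a[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + insert_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + delete_cost(a[i - 1]),
                          d[i][j - 1] + insert_cost(b[j - 1]),
                          d[i - 1][j - 1] + replace_cost(a[i - 1], b[j - 1]))
    return d[n][m]

mf1 = [("+", 0.3), ("1", 0.4), ("-", 0.3)]
mf2 = [("+", 0.2), ("1", 0.5), ("-", 0.3)]
print(edit_distance(mf1, mf2))   # 0.2: a small distance, so the shapes are highly similar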
For each pair of membership functions, the similarity is calculated from this distance and stored in a matrix. This matrix is symmetric and has ones on its main diagonal, respecting the reflexivity and symmetry requirements. After all similarities have been found, the membership functions are clustered hierarchically in a bottom-up manner based on a highest-similarity-first policy. This policy has the interesting property that it produces unique results. The entire process is captured in Figure 3.6.
Figure 3.6: Detailed breakdown of the shape-similarity measure used to aggregate expert opinions.
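A simplified sketch of the highest-similarity-first clustering step is shown below (Python); the merge threshold and the way a merged cluster's similarity to others is computed here (average pairwise similarity) are assumptions of this sketch:

# Sketch of bottom-up, highest-similarity-first clustering over a similarity matrix.
def cluster(similarity, threshold=0.8):
    # similarity: {(i, j): value in [0, 1]} for expert indices i < j; start with singletons.
    clusters = [{i} for i in range(max(max(pair) for pair in similarity) + 1)]

    def sim(c1, c2):   # assumption: cluster similarity = average pairwise expert similarity
        pairs = [similarity[tuple(sorted((a, b)))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        s, i, j = max((sim(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        if s < threshold:
            break                                  # the remaining clusters are too dissimilar
        clusters[i] |= clusters[j]                 # merge the most similar pair first
        del clusters[j]
    return clusters

sims = {(0, 1): 0.95, (0, 2): 0.30, (1, 2): 0.25}  # hypothetical expert similarities
print(cluster(sims))                               # [{0, 1}, {2}]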
Note that this grouping process is in fact independent of the context and can be applied
purely mathematically to group any set of membership functions based on their similarities. This technique can thus also be used in other scenarios or applications. In the
context of GDSS and LSP, the elementary criteria are clustered: each expert is asked for
his or her opinion on each attribute. Per attribute, these opinions are grouped into clusters. The experts in a single cluster generally share the same opinion about which values are desired for the considered performance variable. This also implies that clustering has to be done for each of the performance variables.
3.3.2 Further Tiers of Aggregation
After the experts are grouped in clusters, further steps of aggregation can be done.
Representing a cluster with a single membership function, the output of the first round
of aggregation can be interpreted as the inputs of a small group of experts with dissimilar
opinions. This is similar to the current literature, which proposes several techniques for handling small groups of experts, though a couple of things need to be kept in mind:
• In the literature, no assumptions are made on the similarity between experts,
meaning that it is possible multiple experts have similar opinions. In our case, this
is no longer possible due to the fact we have already grouped experts with similar
opinions into clusters.
• As described further on, a new measure called confidence is introduced, which existing techniques do not deal with in the way it is used here. Not using confidence would increase the loss of information.
Therefore it is interesting to investigate new modes of aggregation after clustering, rather
than applying an already existing technique. Assuming LSP is used as decision support
algorithm, a single elementary criterion is needed per performance variable. In the
current situation, we have a group of clusters per attribute from among which we need
to find a representative.
To this end, we need to find a representative for each cluster and then combine these
into a single representative elementary criterion. To find these representatives there are
several possible approaches. For the representative of a cluster, the suggested approach
is based on some characteristics like the number of opinions (e.g., majority or minority)
and shape of membership functions representing the expert opinions (represented by
small cores), among other meaningful cluster characteristics [TRBDT13].
When selecting the final representative, the cluster representatives have to be either
merged or one of them has to be selected. Both have advantages and disadvantages:
merging means less loss of information whereas selection is easier.
In the context of elementary criteria however, the merging of membership functions is
often meaningless. When trying to merge a function that represents “only low values”
and one that prefers “only high values”, one could quickly end up with a meaningless
result that accepts all values, or that has a medium preference along the entire range.
This means the process of merging might in fact be counter-productive and lead to a
large loss of information. For this reason, selection is recommended to elect the final
representative. To make the choice of which cluster is elected as final representative,
taking the cluster with the highest confidence is recommended.
Extensive research on this topic is beyond the scope of this paper but it is not unimportant. We continue the research with a discussion of the concept of confidence.
3.4 Confidence as a Concept
To minimize the loss of information introduced by aggregation we define a new measure,
“confidence”. This is a broad concept and its interpretation is not trivial. The term
confidence appears at different levels throughout the system and a clear definition can
only be given in the proper context. Depending on the level at which we view confidence,
its interpretation is different. We will discuss three levels of confidence in order:
1. Cluster confidence, at cluster level
2. Elementary confidence, at elementary criterion level
3. Global confidence, at system level
At the system level, global confidence in itself implies several things. For one, it is an
indicator for the certainty that the global preference of the system is representative for
the bulk of experts. It indicates how closely the result follows their opinions. It also represents the degree of disagreement among them. In case all experts agree closely, the
global confidence will be higher than in case the experts disagree. Furthermore, it also
depends on the measured values for the performance variables of the specific system
being evaluated.
3.5 Defining and Calculating Confidence
After outlining the confidence concept, there is still a need for a clear definition and a way to calculate it. Finding a final global confidence value per system is a multi-step process. We start with confidence scores per cluster, which are then aggregated similarly to preference values.
The first appearance of confidence happens at the cluster level after the experts are
grouped into clusters. For each cluster, we calculate a value that indicates the degree in
which it represents the average opinion of the population. Afterwards, a single representative from the clusters is elected for further evaluation leaving us with one elementary
criterion per performance variable. We define a confidence measure for it to indicate
its representativity for the opinion of all the participating experts. We call this the
elementary confidence. Finally, these are propagated through the aggregation structure
and produce a global confidence value per system. This represents the certainty of the
correctness of the corresponding global preference score. Each of these levels is now
further explained. For each level we will clearly define confidence and discuss how it can
be calculated.
3.5.1 Confidence at Cluster Level
The calculation of cluster confidence is in itself a multi-step process. The value is
based on two aspects: the cluster itself and its similarity to the other clusters.
First we calculate the cluster confidence based only on the cluster itself. Then, it can
be readjusted based on the similarity of the cluster to the others. This may be done to
increase the confidence we have in clusters with similar typical values. This might be
the case when a larger portion of experts share similar opinions even though they were
originally separated into different clusters. The readjustment takes this into account and tries to balance the confidence levels out properly.
Definition of Cluster Confidence
The cluster confidence is calculated immediately after the clustering algorithm is finished.
It serves as a measure of the magnitude and compactness for each cluster. To that end,
it depends on two important factors:
• relative frequency fr, the relative number of experts in the cluster, and
• compactness c, the degree to which the elementary criteria in the cluster are similar.
If we call the cluster confidence level γ, we desire the following relations with these
parameters:
γ ∼ fr ,
γ ∼ c.
This leads to the following definition of cluster confidence:
Definition 3.5.1 The cluster confidence represents the importance of the cluster. It combines the relative size of a cluster and its compactness. Clusters containing a majority of the population or holding very similar opinions will often have a higher confidence value. Let γ be the cluster confidence, fr the weighted relative size of the cluster and c the degree of compactness of the cluster (see further). γ can then be calculated as follows:
γ = k1 · fr + k2 · c
In this definition we are free to choose k1 and k2 as normalisation constants. Note that these parameters can be interpreted as weights which can be modified to change the relative importance of the weighted relative frequency and the compactness. We can thus reduce them to a single parameter by replacing k1 and k2 with α and (1 − α), introducing α as the weight coefficient. The cluster confidence γ can then be rewritten as follows:
γ = α · fr + (1 − α) · c
We still need to define how to calculate the relative frequency fr and the compactness
c. The former seems straightforward but it is not. There is an important third factor
to keep in mind which we have thus far not discussed. This is the possibility of weights
among experts.
Relative Weighted Frequency There are multiple scenarios in which it would be interesting to assign weights to experts. This occurs when the experts can be partitioned into groups of expertise, independently of their elementary criteria. An example is a business decision for which a board of experts is consulted alongside the social media poll. As the director, you might value the opinion of a single expert more than that of an individual from the crowd, and this can be specified by adding weights.
These weights can be normalized but they do not have to be. In fact, when normalized, problems might occur when the number of experts becomes very large. When consulting tens of thousands of people alongside renowned experts, whose opinion you value five times as much, you might need to assign an individual a weight of one fifty-thousandth, which can cause a loss of precision in calculations. Much more natural is the approach where the weights are relative and only the ratio of two weights shows
the relative importance of one individual compared to another, where two equal weights
imply equal importance. In the case of social media, a suggested default value is to
assign all experts the same weight first, and possibly assign actual experts a (much)
higher weight.
The relative frequency then becomes a weighted relative frequency, defined in the following.
Definition 3.5.2 Let E be the set of all experts, n be the size of E, Ei be the i-th expert
in E and let w(e) be the weight of expert e. Then the relative frequency fr,k of cluster
k is calculated as follows:
fr,k = (∑_{e ∈ k} w(e)) / (∑_{i=1}^{n} w(Ei))
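The sketch below computes this weighted relative frequency and combines it with a compactness value into the cluster confidence γ = α · fr + (1 − α) · c. The example weights and the compactness value are invented for illustration only.

def weighted_relative_frequency(cluster_weights, all_weights):
    """f_r of a cluster: its total expert weight divided by the total weight of all experts."""
    return sum(cluster_weights) / sum(all_weights)

def cluster_confidence(relative_frequency, compactness, alpha=0.5):
    """gamma = alpha * f_r + (1 - alpha) * c."""
    return alpha * relative_frequency + (1 - alpha) * compactness

# Hypothetical population: mostly regular gamers with weight 1, a few in-house experts.
all_weights = [1] * 95 + [3, 4, 4, 5, 5]
cluster_weights = [1, 1, 3, 1, 1]            # a five-expert cluster
f_r = weighted_relative_frequency(cluster_weights, all_weights)
print(round(cluster_confidence(f_r, compactness=0.93, alpha=0.4), 3))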
Compactness The calculation of the compactness c is not as straightforward, because the concept of compactness represents the degree to which a cluster is internally coherent, or, equivalently, the inverse of a measure indicating how spread out it is. We propose two approaches for calculating the compactness: one based on typical values and one based on interval-valued fuzzy sets.
Typical Value Compactness A first approach uses the typical value of a cluster. The typical value is often taken as the median or mean. For membership functions, we have already defined a similarity measure based on the shape-symbolic notation. We can use this to select, from each cluster, the representative that has the highest similarity with all other membership functions in it. From now on, we call this the most typical value. Note that the most typical value is always a member of the cluster.
Definition 3.5.3 The most typical value of a cluster is the membership function that
has the highest similarity to all other membership functions in the cluster. It is also
necessarily a member of this cluster.
As opposed to the most typical value, there is also the least typical value. This is in itself also a typical value, but this time of the dissimilarity between the membership functions in a cluster, and it is found as the element with the highest distance (i.e., the lowest similarity) to the others.
Definition 3.5.4 The least typical value of a cluster is the membership function that
has the highest dissimilarity to all other membership functions in the cluster. It is also
necessarily a member of this cluster.
Based on the similarity between the most and the least typical values we can define the
compactness of the cluster as being proportional to their similarity. Their similarity
has already been calculated during the clustering step and can be found in the similarity matrix. Because this is already normalized it can directly serve as a measure for
compactness.
Definition 3.5.5 The compactness based on the most and least typical values of a
cluster is equal to the similarity between them.
However, keep in mind that a small cluster with few membership functions is more likely to have similar most typical and least typical values. In fact, in the extreme case where a cluster consists of exactly one membership function, that function is both at the same time, and the compactness will be maximal because the similarity of a function with itself is 1. This is mathematically correct, but it could have unwanted repercussions.
This would mean that a cluster with only one expert in it would end up with a higher
compactness than a cluster of five experts whose opinions somewhat differ. This is
not really a problem for the confidence though, as the weighted relative frequency also
appears as a parameter in its definition. Another possible mitigation would be through
the use of the relative frequency in the calculation of the compactness, too. The relation
should then be inversely proportional, but would not have to be linear. A suggestion
would be to divide the similarity by the square root of the relative frequency.
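A small sketch of this typical-value compactness is shown below, assuming the pairwise similarity matrix from the clustering step is available. The optional small-cluster adjustment discussed above is left out of the sketch.

def most_and_least_typical(members, similarity):
    """members: indices of the cluster's elements; similarity: full pairwise similarity matrix."""
    def total_similarity(i):
        return sum(similarity[i][j] for j in members if j != i)
    most = max(members, key=total_similarity)     # highest summed similarity to the rest
    least = min(members, key=total_similarity)    # lowest summed similarity to the rest
    return most, least

def typical_value_compactness(members, similarity):
    if len(members) == 1:                         # a single-member cluster is maximally compact
        return 1.0
    most, least = most_and_least_typical(members, similarity)
    return similarity[most][least]

# Tiny illustration with a three-member cluster and made-up similarities.
sim = [[1.00, 0.90, 0.80],
       [0.90, 1.00, 0.85],
       [0.80, 0.85, 1.00]]
print(typical_value_compactness([0, 1, 2], sim))   # similarity between most and least typical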
An illustration of a cluster with its most and least typical values indicated is given in
Figure 3.7.
An obvious advantage of this approach is that it is computationally fast, easy and cheap. However, selecting a representative is not exact, and defining a measure over a set of values based on only two members of the set implies a loss of information. Still, the measure depends on all members indirectly, as the most and least typical values are calculated based on the entire cluster.
Figure 3.7: Most and least typical values for a cluster, the most typical value is displayed in dark
red, the least typical value is displayed in blue.
Interval-Valued Fuzzy Set Compactness A second approach to calculating the compactness of a cluster is through the use of interval-valued fuzzy sets. An interval-valued fuzzy set is an extension of regular fuzzy sets in the sense that it defines an interval of possible membership grades for each value in the domain of the set. Graphically, this can be seen as the composition of two membership functions, where one represents the upper bound of possible values and the other represents the lower bound.
Figure 3.8: Example of an interval-valued fuzzy set where the lower bound has a maximum value
of λ.
In order to calculate the compactness from this, we need to find a bounding surface
enclosing all membership functions in the set. Analytically this is easily done by taking
the maximum and minimum of all membership functions in each point when plotting
them together on a graph. Computationally, this is difficult and to be exact the calculations would require an analytical engine and infinite precision. The idea is to then
calculate the surface enclosed by the upper and lower bound of the interval-valued fuzzy
set. A large surface then indicates a large spread and a small surface indicates a compact
cluster.
Because we do not have infinite precision, we need an alternative way to find these bounds and approximate their enclosed surface. Using approximations, the desired surface can be estimated in a computationally much cheaper and easier way.
Figure 3.9: The upper and lower bounds of the encapsulating interval-valued fuzzy set; the upper bound is displayed in green, the lower bound is displayed in teal.
If we consider each membership function as a collection of four values, a, b, c and d,
then we can find an approximate upper bound by taking the minimum a and b and
maximum c and d. To know whether or not doing this introduces errors it suffices to
check if the minimum a and b belong to the same membership function (dually for the
maximum c and d). In case they do not, it is possible the estimate is wrong. This
can be mitigated by either performing another iteration of calculations, further refining
the surface, or by simply reflecting this uncertainty by lowering the confidence in the
compactness, for example by increasing α by a certain percentage, lowering the weight
of the compactness.
The lower bound is harder to estimate. First we compute the maximum a and minimum
d. In case the a-value exceeds the d-value, the lower bound is simply the X-axis itself,
which is the simplest case. In case it does not, we need to compute the maximum b and
minimum c. Again we compare these, this time checking that the maximum b does not exceed the minimum c. If it does not, the lower bound estimate is given by the computed a,
b, c and d values. Similarly to before, we can test the possibility of an error by checking
if the maximum a and b (and dually minimum c and d) belong to the same membership
functions.
In case b does however exceed c, the lower bound will not have a core and is much harder
to specify. An additional parameter needs to be added to indicate the maximum height
reached by the lower bound. Moreover, a new b and c value need to be computed, which
will be equal to each other, and equal to the point on the x-axis where the lower bound
is maximal.
This can be heuristically done by looking at the inclination between a and b of the
membership function with maximum b and the inclination between c and d of the membership function with minimum c. The intersection of these lines can be used as an
estimate for the maximal value of the lower bound and its abscissa can be used as new
b and c value.
Again we face the problem of uncertainty, which can be mitigated as before, either by
reiterating with more precision or by lowering the confidence.
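The sketch below follows this bound-estimation heuristic for trapezoidal criteria (a, b, c, d) and turns the enclosed surface into a compactness value. It assumes strictly sloped edges (b > a and d > c), and the normalisation at the end is only one plausible reading of Definition 3.5.6.

def upper_bound(criteria):
    """Approximate upper bound: the widest trapezoid covering all criteria."""
    a = min(q[0] for q in criteria)
    b = min(q[1] for q in criteria)
    c = max(q[2] for q in criteria)
    d = max(q[3] for q in criteria)
    return (a, b, c, d, 1.0)                      # (a, b, c, d, height)

def lower_bound(criteria):
    """Approximate lower bound; returns None when it collapses onto the X-axis."""
    a = max(q[0] for q in criteria)
    d = min(q[3] for q in criteria)
    if a >= d:
        return None
    b = max(q[1] for q in criteria)
    c = min(q[2] for q in criteria)
    if b <= c:
        return (a, b, c, d, 1.0)                  # a trapezoid that still has a core
    # No core: intersect the rising edge (a -> b) with the falling edge (c -> d).
    rise, fall = 1.0 / (b - a), 1.0 / (d - c)
    x = (rise * a + fall * d) / (rise + fall)     # abscissa of the intersection
    return (a, x, x, d, rise * (x - a))           # degenerate trapezoid of reduced height

def area(bound):
    if bound is None:
        return 0.0
    a, b, c, d, height = bound
    return height * ((d - a) + (c - b)) / 2.0     # area of a (possibly degenerate) trapezoid

def interval_valued_compactness(criteria):
    up, low = upper_bound(criteria), lower_bound(criteria)
    gap = area(up) - area(low)                    # surface enclosed between the two bounds
    max_surface = up[3] - up[0]                   # reference surface of height 1 over [min a, max d]
    return 1.0 if max_surface == 0 else 1.0 - gap / max_surface

cluster = [(10, 70, 83, 90), (12, 68, 80, 86), (8, 72, 85, 95)]   # hypothetical criteria
print(round(interval_valued_compactness(cluster), 3))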
Definition 3.5.6 The compactness based on interval-valued fuzzy sets can be computed from the surface between the upper and lower bounds of the interval-valued fuzzy set enclosing all membership functions in the cluster. The surface has to be normalized, which can be done by rescaling the X-axis to map its maximum onto 1.
Figure 3.10: The lower bound of this interval-valued fuzzy set has no core and is clearly only an approximation to the real surface, as one of the membership functions lies partially below the lower bound.
Note that both approaches handle extreme cases correctly, such as clusters with a single expert and scenarios with one huge cluster containing all experts.
Inter-cluster Confidence Redistribution
We now have a confidence level per cluster that only depends on the elements of that cluster. Sometimes it can be interesting to readjust this, for multiple reasons. For one, we might have a number of clusters that are unusable because they have a very low confidence. Alternatively, we might find a group of clusters that are very similar.
In order to deal with this accordingly, we could do several things. A first possibility is to
review the clustering step and reiterate until a desired number of clusters is produced.
This is not always optimal however, because adjusting the parameters to the clustering algorithm to produce fewer clusters might have adverse effects on the compactness
thereof.
Therefore we recommend a different approach, which relies on selecting certain clusters.
This again can be done in different ways. Either we select the top-k clusters, after
ranking them by decreasing confidence, or we can simply set a threshold and ignore
all clusters that fall below it. The former has the side effect that k is fixed and that
sometimes a relatively important cluster might be ignored in case there are more than
k interesting ones. The latter has the problem that the threshold is not easy to find.
Either we set a fixed value, which might in some extreme cases lead to the elimination
of all clusters or none at all, or we calculate it dynamically based on the computed
intra-cluster confidences.
The approach of selecting clusters also has additional possibly negative effects. These
come from the fact that selecting clusters directly implies that information is purposefully
ignored and thus lost. This can be partially mitigated by redistributing the confidence
after making the selection.
After establishing the desired number of clusters, we might be in a situation where we can use our own insight to further aid the algorithm in its imitation of human evaluation. We can do this by translating our knowledge into an adjustment of the parameters that influences the decision.
To clarify why this can be interesting, take as an example a scenario where there are
three clusters with similar confidence, yet two of them have a similar typical value while
the third represents a quite different opinion yet has a slightly higher confidence. To now
select a final representative, we as humans would likely choose one of the two clusters that are closely related, because even though individually they are less important than the third, the overall picture shows they are in some way correlated as they represent groups
of experts that have similar opinions.
It can therefore be interesting to readjust the confidence after the initial calculation in order to achieve more “logical” results. This can be done by computing the pairwise distances between the clusters and boosting the confidence of those that have low distances to each other while lowering that of those considered to be outliers, resulting in a net redistribution.
Note of Caution The reader should be wary of the concept of inter-cluster confidence redistribution. This step is considered optional, as it can help but it can also hurt. It is not always justified to perform readjustment of confidence levels, as this in fact distorts the representation of the consulted experts. In social media applications this is probably not a problem, yet the decision maker should be careful not to end up tweaking the parameters in a way that steers the output in a desired direction. In critical applications, the manipulation and redistribution of confidence is generally discouraged, so as to present results that are as “clean” as possible, without distortion.
3.5.2 Elementary Confidence at Membership Function Level
The next level where we encounter confidence is when a representative for each cluster
is selected for further calculation. This is done because the computation of elementary
preferences requires a single elementary criterion per performance variable. This implies
all clusters per attribute must be recombined into one representative. We call this the
elementary confidence of the selected representative elementary criterion. Again there
are multiple approaches to calculate it, but first we need to find a representative per
cluster. The most obvious way to do this is by selecting the most typical value. We
choose the initial confidence of this representative to be the same as that of the cluster
it represents. Next, we discuss how we can combine these cluster representatives into
a single elementary criterion, which will be used to calculate the elementary preference
score.
First, we can try to merge the representatives into a single criterion. This is the path of least information loss, as each representative is taken into account, although the merging still implies some inevitable loss. The representatives can be combined in a weighted manner, where the confidence levels serve as their weights. These weights need not be normalized up front, as an extra normalization step can easily be introduced to achieve this.
Merging, however, is rarely meaningful. If two elementary criteria representing opposing opinions with similar confidence, such as “only low values” and “only high values”, are merged, the result would be “semi-preferable everywhere” in case normalization is used, and “all values everywhere” in case a pointwise maximum is chosen as
a merging strategy. Even “no values” is possible in case the strict pointwise minimum
is taken. These functions might be mathematically merged but they are logically void.
Therefore, we will not further examine this approach as additional research is necessary
to study the possibilities in this case.
Second, the opposite extreme to merging is selection. This strategy is by far the simplest, but it also introduces the largest amount of information loss. Selection can be
done based on confidence, for example taking the representative of the cluster with the
highest confidence score.
There are scenarios where selection is acceptable. This is when the clusters have clearly distinct confidence levels and one is obviously “ahead” of the others. In other cases, where the confidence levels are closer to each other, an extra selection criterion could be
included to ease the process, such as the size of the cluster, the distance to other clusters,
or others. The extra complication however lies in the fact that a choice from similar
representatives means a lot of information is lost. This can be reflected by lowering the
final representative’s confidence before performing further computations.
Note that in the rare case two clusters have the same confidence and selection is chosen,
an additional tiebreaker is necessary. When possible, in case the competing representatives allow it, a merging step can be used to minimize the loss of information and to keep
the confidence as high as possible. We will use the selection approach in the remainder
of this research and the case study.
Definition 3.5.7 The elementary confidence is the confidence of the elementary criterion selected for evaluation purposes. When using the selection approach, it equals that
of the cluster with the highest cluster confidence.
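A minimal sketch of this selection rule, using hypothetical cluster records, is given below.

def select_elementary_criterion(clusters):
    """clusters: list of dicts with a 'most_typical' criterion and a 'confidence' value."""
    best = max(clusters, key=lambda cluster: cluster["confidence"])
    return best["most_typical"], best["confidence"]   # the elementary criterion and its confidence

clusters = [
    {"most_typical": (0, 1, 4, 8), "confidence": 0.83},   # invented example clusters
    {"most_typical": (2, 3, 6, 9), "confidence": 0.41},
]
criterion, elementary_confidence = select_elementary_criterion(clusters)
print(criterion, elementary_confidence)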
3.5.3 Global Confidence at System Level
Finally there is the global confidence of a system. Its calculation occurs during the
propagation through the aggregation structure. This happens similarly to the global
preference calculation. Indeed, there is a close resemblance comparing the desired behaviour of preference and confidence: in case of a complete conjunction, the lowest
preference score will be dominant and produce a low output. Similarly, a low confidence
in the preference of either input of a conjunction should produce a low confidence in the
result.
The exact value of the resulting confidence after an aggregator, LSP or compound, also
depends on the preferences of the system being evaluated. This is the main difference
between confidence and preference propagation: the global preference only depends on
the elementary preferences and the aggregation structure but the global confidence depends on the elementary confidences, the aggregation structure and also the elementary
preferences. In the case of a full disjunction with two inputs with similarly high confidence, the output will depend on which of the inputs has the highest preference score.
Alternatively, in case both have a high preference but one of them has a significantly
higher confidence, the result will have a high confidence too. Similar situations are found
for the generalized version of the disjunction, and dually for the conjunction.
Definition 3.5.8 The global confidence at system level is an output parameter indicating the trust we can put in the accuracy and representativity of the global preference of that system. It is calculated similarly to the global preference, through propagation and aggregation of the elementary confidence values through the aggregation structure.
Each system will then have two parameters indicating its preference and confidence.
Both are necessary for the decision makers to perform their evaluation of the systems.
The importance of the global confidence depends on the nature of the problem. In the case of social media polling, the correctness of the decision is rarely critical. In fact, even when a mathematically wrong decision is made, there are often no real repercussions, though the sales numbers might not be as high
as desired. In other fields of application, however, such as medical analysis based on
decision support, the correctness is very important, and a high level of confidence is
required.
3.6 Combining Confidence and Preference
In traditional decision support systems, the output is typically a list of the evaluated
systems and their calculated global preference scores. The interpretation thereof is
straightforward. This is largely due to the fact that each system is linked with only
one parameter, making it trivial to rank them. This makes it simple to select the “best”
system. Often when the amount of evaluated systems is large, the ones with highest
global preference scores are selected by the DSS. Only they are presented to the decision
makers as contenders for the final solution.
In our case, this is no longer such a trivial process due to the fact there are now two
output parameters. The ordering of viable systems is complicated due to the fact there
is no natural total ordering on the couple (preference, confidence). We could define
a partial ordering or even a total ordering but this would be a strictly mathematical
solution to solve a logic problem, and is hence not the preferred way to go. Instead, we
look in the direction of comparing both confidence and preference in the context of the
application.
Much like the comparison between cost analysis and global preference is kept separate from the preference calculation, as explained by Jozo Dujmović, the combination of confidence and global preference is also best kept separate. The main question for which we are searching an answer is still “which system is the best”, which cannot simply be translated to “which system has the highest global preference” or “which candidate has the highest confidence”.
Clearly, for the best system we desire a high degree of confidence. At the same time, we
also want the solution to have a high preference score. However, what conclusion should
we draw in case the candidate with high confidence has a low preference, or vice versa?
Depending on the context and more importantly the criticality of the accuracy of the
decision, a high confidence might play a more important role than a high preference.
In the case of social media applications, the criticality is not very high, and thus the
solution with the highest global preference can be considered a viable solution, given that its confidence at least surpasses a desired threshold. In other, more critical cases, a high
level of confidence might be necessary. In that case, it is plausible to choose a candidate
that does not have the highest preference but that has a high degree of confidence.
Generally, we still want to find a solution with both high preference and confidence.
This is the best possible scenario. To facilitate the decision making process, we propose
a technique to combine the two parameters to one, again allowing the evaluated systems
to be ordered. The following properties must be respected:
• A minimum degree of confidence must be met.
• A high global preference is desired.
• A high global confidence is desired.
• Depending on the context, it must be possible to assign differing importance to
the impact of preference versus confidence.
We call this combined new parameter the goodness ν of system ξ, and define it as
follows:
Definition 3.6.1 The goodness ν of system ξ combines the global preference and global
confidence of the system with weights and can be used to rank evaluated systems. Let pξ
be the global preference score of the system and cξ be its global confidence, then we can
find νξ as follows:
νξ = k1 · pξ + k2 · cξ
Again we see the combination of two weight parameters k1 and k2. As before, they share the property that increasing k1 implies (at least relatively) lowering k2. We reduce them to a single parameter β and rewrite the calculation of the goodness ν as follows:
νξ = β · p + (1 − β) · c
where β is the parameter defining the importance of preference versus confidence. In
critical applications, β would typically be below 0.5, indicating the desire for a high level
of confidence. On the right-hand side of the equation, the subscript ξ on the preference and confidence is omitted because it is implied by the left-hand side of the equation.
After computing this for each system to be evaluated, we can define a filtering rule selecting all those ξ with a confidence above a certain threshold θ, resulting in a filtered set of still viable systems Ξ as follows:
Ξ = {ξ | cξ > θ}
These can then be ordered by an ordering function φ, which is defined as follows:
φ : {1, 2, ..., |Ξ|} → Ξ ∧ ∀(i, j) | i < j : νφ(i) ≥ νφ(j)
This ordering function ranks the viable systems in descending order according to their
goodness ν.
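The filtering and ranking described above can be sketched as follows; the (preference, confidence) pairs are illustrative values only.

def goodness(preference, confidence, beta):
    """nu = beta * p + (1 - beta) * c."""
    return beta * preference + (1 - beta) * confidence

def rank_systems(systems, beta=0.5, theta=0.6):
    """systems: dict mapping system id -> (global preference, global confidence)."""
    viable = {s: pc for s, pc in systems.items() if pc[1] > theta}    # Xi = {xi | c_xi > theta}
    return sorted(viable, key=lambda s: goodness(*viable[s], beta), reverse=True)

systems = {1: (0.711, 0.727), 2: (0.532, 0.876), 3: (0.856, 0.806)}
print(rank_systems(systems, beta=0.6, theta=0.6))    # ranked by decreasing goodness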
The introduction of this parameter allows us to select good systems for the given problem which respect all required properties. It can be tuned by changing the β and θ parameters, allowing the relative importance of confidence and preference to be specified and limiting the results to a set of systems with a minimal degree of confidence.
The choice of β can be done beforehand, but the choice of θ is more difficult, as it is not
known what confidences will be produced in the aggregation step. Therefore, the choice
of θ is best postponed until after the presentation of the global preference calculation
results. Based on the output, θ can then be dynamically calculated. Alternatively, in
case the results are unsatisfactory, for example when all confidence levels are significantly
low, the aggregation calculations can be repeated with different α, assigning different
weights to the importance of the intra-cluster confidence parameters. However, this
might also be an indication that the consulted experts have very diverging opinions. All
in all, unsatisfactory results should be further examined.
Chapter 4
Case Study
What follows is an illustrative case study to show the discussed techniques in action.
We have chosen a possible application using social media and simulate a problem with
multiple candidate systems. We evaluate each of them using the techniques discussed above to calculate a global preference and a global confidence, and combine these into a goodness. All the phases are explained and their results briefly analysed.
The remainder of this chapter is structured as follows. First, we sketch the background of the problem that we are trying to solve. Then, we go through the entire evaluation process, from selecting performance variables to calculating the goodness for each of the considered systems. Finally, there are some brief remarks on the results.
4.1 Background
An interesting application of social media interaction is found in the gaming industry.
Here, game developers often interact closely with their gaming community through various channels. The most popular games nowadays are those whose developers have chosen a maintenance model based on community feedback. Certain game developers maintain their creations through the use of a feedback loop, listening to what their players have to say and updating their product accordingly. The most well-known examples of this are the biggest hit by Blizzard, World of Warcraft, which has the most hours played per day according to recent observations [1], and League Of Legends, by Riot Games, which is paving the road for eSports in Europe and America, with the most active players at the moment [2]. Clearly, including users in their development choices pays off for them. Blizzard has held their dominant spot for almost nine years so far, almost dating back to the original release of their biggest hit.

[1] http://www.xfire.com/cms/stats
[2] http://euw.leagueoflegends.com
The gaming industry is a booming business with large companies and a lot of money.
To be successful in this competitive world, it is of key importance to make exactly
the product the customer wants. This, combined with the fact that as gamers the users are often also the most knowledgeable about the product itself, makes this sector ideal for GDSS. The users are ideal experts for consultation in business decisions.
This case study is about a fictitious game developer gathering information from its user base
for its next game. Of course, the precise details are undisclosed, but some general facts
are necessarily given to the community to get accurate feedback. To be as successful as
possible with this game, the game developer decides to acquire input from its user base
through social media techniques based on the methodologies described in this document.
That information will be combined with the opinions of experienced in-house experts who have worked there for many years and have already contributed to previously successful games.
4.2 Evaluation
The development of a game is a difficult process with a lot of phases, each with a lot
of decisions to be made. Some aspects of the game itself can be decomposed into a
hierarchical tree of measurable variables, making it an excellent candidate for GDSS.
The entire process of attribute decomposition is not relevant to the research and is not
discussed further. However, some performance variables are selected and discussed in
greater detail. One of them will be studied to elaborate on the clustering and confidence
calculations. Then, the results of other performance variables are shown and used as
input to the propagation through the aggregation structure.
Because the creation of a (possibly compound) aggregation structure is not in the scope
of this paper, a simple illustrative example is generated based on the chosen performance
variables. Note that this is an interesting topic for further research; it is worth investigating the viability of making aggregation structures through social media. In this case
study however, we don’t investigate the origin of the aggregation structure and treat
it as if it were created externally. Keep in mind that the accuracy of the aggregation
structure plays a big role in the calculation of global preference and overall confidence
and thus also the goodness of the evaluated systems, but in this example we are more
interested in showing the process of calculations and influence of the parameters than
the actual correctness of the results.
The rest of the evaluation process is structured as follows: first, we limit ourselves to a couple of performance variables. A small discussion motivates the choices made. Second, we
define the systems that we are going to evaluate. Next we simulate the consulting of
experts. Then the clustering and confidence calculations are illustrated by investigating
one of the performance variables in detail. Afterwards, the results of the others are
given and the propagation of confidence through the aggregation structure is studied.
Finally, the resulting goodness calculations are explained based on the evaluation of
three systems.
The entire process can be roughly split into three big parts:
1. Gathering inputs
2. Performing calculations
3. Analysing results
We will handle each of these separately.
4.2.1 Required Inputs
First, we take a look at the inputs that are required to perform the calculations. Given a
problem, we can define its performance variables. Then, we can generate a few candidate
systems to solve the problem. At the same time we can gather information from experts
on their elementary criteria. We also need to set up an aggregation structure that will
be used to combine the elementary scores into a global score. After gathering all this
data, we will be ready to perform the evaluation.
Performance Variables
The entire decomposition of a game into performance variables is large and cumbersome.
Generally, the highest tier of the tree contains quality attributes such as performance,
usability, availability, security and scalability [3]. Some of these, mostly from the usability
category, have a great influence on the end-user gaming experience and thus make good
candidates for social media consultation. The performance variables that are chosen for
further investigation are the following:
• Average loading time (ALT),
• Offline playthrough time (OPT),
• Ease of learning (EOL),
• Maximum amount of players per server (PPS).
These are all performance variables with a continuous range. For some, it is obvious
what kind of values will be preferred, such as low loading screen times; however, it is still useful to gather information to gain insight into the average amount of time players are
willing to wait.
The offline playthrough time is an important factor to decide the balance between offline
and online gameplay material, where offline material strives to be continuously innovative and online material needs enough variation to have replay value.
The ease of learning plays a big role in user experience and, conversely, in user frustration. It
is important to let the player grow in experience and discover the game piece by piece,
but of course the player should not get the feeling that he only unlocks his full potential
by the time the game is over. Therefore it is important to properly balance the learning
curve. A good game should be both challenging and rewarding.
For online gaming, there are always servers involved. An important performance variable for scalability is the maximal server load, but this also has an impact on the gaming experience, and player opinions should hence be taken into consideration.
These performance variables capture some of the most important aspects of a game,
going from offline to online experience and balancing reward and frustration.
[3] http://equis.cs.queensu.ca/~graham/cisc877/slides/CISC%20877%20%20Game%20Architecture.pdf
Other Measures Evidently, not every aspect can be decomposed into measurable components. Some choices are between discrete options, for which public opinion can be better gauged through questionnaires than through the use of GDSS. Examples
of such aspects are:
• Dedicated online servers versus private hosting on public servers,
• Target platforms (pc, console, ...),
• Target operating systems (Windows, Mac, Linux, iOS, Android, ...),
• Early release with lots of downloadable content (DLC) versus a late but full release.
These will not be investigated further, but they are also an important aspect of game development and are therefore mentioned.
Candidate Systems
Three possible game configurations are evaluated. This example is simplified, as only
the four mentioned attributes are taken into consideration, but suffices for the purpose
of illustration.
Candidate one is an example of a balanced game, with both elements for offline and
online play. The playthrough time does not take into account replay value but simply
indicates the time needed to play through the entire content once.
C1: {PPS = 32, EOL = 5, ALT = 25, OPT = 30}
Candidate two represents a single-player-centered game, but has some online elements in it as well. This implies the game has a long learning curve, constantly adding new elements and unlocking the full arsenal bit by bit as the game progresses. There are relatively long loading screens because there is a lot of different scenery with few reused elements.
C2: {PPS = 12, EOL = 7, ALT = 30, OPT = 80}
Candidate three is based around multiplayer action and online play, with a low amount of offline playthrough content. The learning curve is short, to allow players to fully dive into the game quickly. Here, the power of the player comes from repeating the same actions to get better at them, rather than spending more time to unlock features. Loading times are low so players don’t have to wait long.
C3: {PPS = 64, EOL = 3, ALT = 10, OPT = 12}
Experts
To illustrate the clustering algorithm, the performance variable OPT is selected for further investigation. N = 100 experts are generated, represented by their elementary criterion for the chosen performance variable. Alongside those, random weights between 1 and 5 are generated, where 5 indicates the highest level of expertise and 1 the lowest. Most experts get a weight of 1, representing regular gamers, the vast majority of the group of consulted people. Long-time gamers might get a 2, but the higher values in the spectrum are reserved for in-house game development experts, who are also consulted. Some
of the experts are displayed in Table 4.1.
ID    a     b     c     d     weight
0     0     0     65    98    1
1     0     0     48    84    1
2     0     0     36    82    2
...
49    71    77    77    90    4
50    39    71    71    84    1
51    10    44    44    76    3
...
97    26    55    56    81    1
98    22    35    43    45    4
99    16    36    82    86    1

Table 4.1: Some of the 100 generated elementary criteria, representing the already weighted experts.
These criteria can be converted into their shape-strings and feature-strings, giving their
shape-symbolic representations. This is depicted in Table 4.2.
ID    Shape-string    Feature-string
0     1-0             L|E|ES
1     1-0             M|S|VS
2     1-0             S|M|VS
...
49    0+H-0           L|ES|ES|ES|ES
50    0+H-0           S|S|ES|ES|VS
51    0+H-0           ES|S|ES|S|VS
...
97    0+1-0           VS|S|ES|VS|VS
98    0+1-0           VS|ES|ES|ES|M
99    0+1-0           VS|VS|M|ES|ES

Table 4.2: The shape-string and feature-string for some of the experts.

Aggregation Structure
The aggregation structure is kept simple in this example. This is mainly because of the fact we limited ourselves to a small number of performance variables. The structure that is used is displayed in Figure 4.1.
Figure 4.1: The aggregation structure used in the case study.
We understand from this that at least one performance variable should be fulfilled because of the mandatory conjunction at the highest level. “Ease of learning” and “offline playthrough time” are considered disjunctive. Their requirement is non-mandatory conjunctive with “average loading time”.
4.2.2 Calculations
Now that we have the required inputs we can move on to the actual evaluation process
based on clustering, confidence calculations and the core of LSP.
Clustering
We can calculate the shape-similarity matrix using the distance measure for the shape-symbolic notations we have derived earlier. These calculations result in the following similarity matrix:
       0      1      2     ...    49     50     51    ...    97     98     99
0    1.000  0.925  0.825   ...  0.450  0.425  0.475   ...  0.450  0.375  0.525
1    0.925  1.000  0.925   ...  0.450  0.475  0.525   ...  0.500  0.425  0.525
2    0.825  0.925  1.000   ...  0.450  0.475  0.525   ...  0.500  0.425  0.475
...
49   0.450  0.450  0.450   ...  1.000  0.875  0.775   ...  0.825  0.850  0.825
50   0.425  0.475  0.475   ...  0.875  1.000  0.900   ...  0.950  0.875  0.850
51   0.475  0.525  0.525   ...  0.775  0.900  1.000   ...  0.825  0.950  0.950
...
97   0.450  0.500  0.500   ...  0.825  0.950  0.825   ...  1.000  0.875  0.850
98   0.375  0.425  0.425   ...  0.850  0.875  0.950   ...  0.875  1.000  0.825
99   0.525  0.525  0.475   ...  0.825  0.850  0.950   ...  0.850  0.825  1.000

Table 4.3: Similarity matrix showing similarities for each pair of elementary criteria.
Note that this matrix is indeed symmetric around the main diagonal and contains only ones on the diagonal itself.
Next, the elementary criteria are hierarchically clustered based on a most-similar-first policy. The stopping criterion is based on a threshold, which is here set at 0.95. This is a rather high threshold, which results in a large number of compact clusters. Increasing the threshold increases both the number of clusters and their compactness, whereas lowering it decreases both.
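A minimal sketch of this highest-similarity-first clustering with a threshold stopping criterion is given below. The linkage rule between clusters (here: the lowest pairwise similarity between their members) is an assumption, as the thesis does not fix it explicitly.

def cluster(similarity, threshold=0.95):
    """similarity: square matrix (list of lists); returns a list of index lists."""
    clusters = [[i] for i in range(len(similarity))]

    def linkage(c1, c2):
        return min(similarity[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest mutual similarity.
        best_pair, best_sim = None, -1.0
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = linkage(clusters[x], clusters[y])
                if sim > best_sim:
                    best_pair, best_sim = (x, y), sim
        if best_sim < threshold:          # stopping criterion
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Tiny illustration with four hypothetical experts.
sim = [[1.00, 0.97, 0.40, 0.42],
       [0.97, 1.00, 0.38, 0.41],
       [0.40, 0.38, 1.00, 0.96],
       [0.42, 0.41, 0.96, 1.00]]
print(cluster(sim))   # -> [[0, 1], [2, 3]]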
Running the clustering algorithm produces 42 clusters. Part of the results can be seen
in Figure 4.2.
Purely based on the number of experts and clusters, there is a cluster for every two or three experts. However, we can see that most clusters contain one expert and some contain a larger number. The largest cluster has 10 experts and the second largest has 8. The clusters with one expert in them are a direct result of the high clustering threshold.
For each cluster we can calculate the most and least typical values and define the upper
and lower bounds. Some of them are shown here. The colouring scheme is the same as before: green and teal are respectively the upper and lower bounds of the enclosing interval-valued fuzzy set.

Figure 4.2: An excerpt of the clustering algorithm result dendrogram.
Figure 4.3: Cluster 18.
Figure 4.4: Cluster 30.
After the clustering is done we can start the preference and confidence calculations.
Figure 4.5: Cluster 52.
Figure 4.6: Cluster 60.
Confidence Calculations
First we calculate the cluster confidence. The results of using interval-valued fuzzy sets
are displayed for several values of α to study its impact. Based on the cluster confidence
the top five are selected and shown.
Cluster ID    Cluster Confidence
68            0.597
76            0.590
36            0.589
2             0.582
18            0.575

Table 4.4: Configuration A1 (α = 0.4, interval-valued fuzzy sets)

Cluster ID    Cluster Confidence
68            0.503
36            0.494
76            0.493
2             0.490
18            0.586

Table 4.5: Configuration A2 (α = 0.5, interval-valued fuzzy sets)

Cluster ID    Cluster Confidence
30            0.412
68            0.409
36            0.399
18            0.397
76            0.397

Table 4.6: Configuration A3 (α = 0.6, interval-valued fuzzy sets)

The calculations are illustrated for cluster 18. It contains five experts with IDs 18, 19, 25, 26 and 27. Their respective weights are 1, 1, 3, 1 and 1. Their relative weighted
frequency is thus equal to the sum of those weights divided by the total weight of all
experts. This leads to a value of fr = 0.042.
The normalized surface of the cluster is calculated by taking the minimum a and maximum d of all criteria and taking the surface of height 1 spanned between them as the maximal surface. The
surface enclosed by the interval-valued fuzzy set for cluster 18 is then divided by the
maximal surface. This results in a compactness c = 0.93.
Combining both with α = 0.4 gives the listed cluster confidence of γ = 0.575.
Increasing α increases the importance of the relative frequency of the cluster, which
increases the confidence in clusters with more members. The reason the overall confidence seems to drop when increasing α is that the original clustering threshold resulted in a lot of compact clusters. The clusters are all relatively small, meaning the relative frequency is low, even for the largest cluster. Therefore, shifting the weight towards its importance lowers the confidence in general. In a scenario where the examined population of experts is much larger, in the order of thousands, and the clustering algorithm produces a higher average number of experts per cluster, this behaviour would not occur.
Note that the top five clusters do not necessarily contain the largest clusters. Due to the
random weights and possibly the compactness of other clusters, some smaller clusters
such as clusters 18 and 68 score the best. Also note that the largest cluster does not
have a much higher confidence than the other clusters. This is because the weights of its members are low on average, as opposed to the smaller clusters that score well because the weights of their experts are high. This makes the relative
weighted frequency of the top five clusters about the same.
For the remainder of the calculations, configuration A3 is used. The choice of α = 0.6 is deliberate, as we wish to give a larger impact to the relative size of the clusters to compensate for the high threshold during the clustering phase. In what follows, each cluster is represented by its most typical value, which will serve as its elementary criterion.
Other Performance Variables The results of the other performance variables’ confidence calculations are shown in Table 4.7.
Performance variable    a    b     c     d      Elementary Confidence
EOL                     0    1     4     8      0.830
ALT                     0    0     0     60     0.772
PPS                     8    16    64    128    0.917

Table 4.7: Confidence results for the other performance variables
The ease of learning is rated on a scale from one to ten indicating how difficult it is to
learn the aspects of the game. This can be interpreted as the number of times an action should be repeated before it is considered to be an acquired skill. The average loading time is in seconds and shows the tolerance for sitting idle while the game loads various elements. The maximum players per server is straightforward and expresses the number of players that can be logged in and playing at the same time on one server.
Propagation Through the Aggregation Structure
The global preferences are derived by aggregating the elementary preferences of the
systems using the given aggregation tree. This is the core of the decision support program
that remains unchanged; we simply apply LSP here. In a similar but slightly different way, the global confidence is propagated as well. For any aggregator A, the output confidence is
calculated as follows:
1. Calculate the output preference from the input preferences
2. Find the input with preference closest to the output preference
3. Take that input’s confidence as a starting point
4. Calculate the output confidence from the input confidences (similar to preference)
5. Take the average of the calculated output confidence and the selected input’s confidence
This implementation leads to logical results, but it is not the focus of this research. It also respects the dependency of the output on the input confidences, the input preferences and the aggregation parameter. Alternative implementations are worth examining further, as is mentioned later on in the future work on compound aggregation structures.
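A sketch of this propagation for a single aggregator is shown below. The aggregator is taken to be an LSP-style weighted power mean; its exact form, the weights and the exponent are assumptions made for illustration.

def weighted_power_mean(values, weights, r):
    """Weighted power mean with exponent r (r < 0: conjunctive, r > 0: disjunctive)."""
    if abs(r) < 1e-9:                                    # geometric-mean limit
        result = 1.0
        for v, w in zip(values, weights):
            result *= max(v, 1e-12) ** w
        return result
    return sum(w * max(v, 1e-12) ** r for v, w in zip(values, weights)) ** (1.0 / r)

def aggregate_with_confidence(inputs, weights, r):
    """inputs: list of (preference, confidence) pairs for one aggregator."""
    prefs = [p for p, _ in inputs]
    confs = [c for _, c in inputs]
    out_pref = weighted_power_mean(prefs, weights, r)                             # step 1
    closest = min(range(len(prefs)), key=lambda i: abs(prefs[i] - out_pref))      # step 2
    anchor_conf = confs[closest]                                                  # step 3
    out_conf = weighted_power_mean(confs, weights, r)                             # step 4
    return out_pref, (out_conf + anchor_conf) / 2.0                               # step 5

# Hypothetical conjunctive node with two equally weighted inputs.
print(aggregate_with_confidence([(0.8, 0.9), (0.4, 0.6)], [0.5, 0.5], r=-1.0))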
The results are displayed in Table 4.8:
System ID    Global preference    Global confidence
1            0.711                0.727
2            0.532                0.876
3            0.856                0.806

Table 4.8: Global preference scores and global confidences per system after propagation through the aggregation structure
At this point, the algorithm can be stopped and the results can be analysed by the
decision maker(s). In case there are only a few systems being evaluated, as is the case
here, this is possible as there is a clear overview of all systems.
We see that candidate system 2 has a significantly lower preference than the others. At the same time, its confidence is the highest, meaning we can put a high amount of trust in the accuracy of these results. Apart from that, system 3 seems to beat system 1 in both preference and confidence, likely rendering it the most suitable system.
4.2.3 Results
If there are a lot of systems and there is no clear overview of the results, the next step
in the process is executed to rank the systems. This is illustrated here for the sake of
the example. The following shows the results of combining preference and confidence for
several values of β.
System ID    Goodness
1            0.720
2            0.739
3            0.827

Table 4.9: Configuration B1 (global goodness, β = 0.4)

System ID    Goodness
1            0.719
2            0.704
3            0.832

Table 4.10: Configuration B2 (global goodness, β = 0.5)

System ID    Goodness
1            0.717
2            0.669
3            0.837

Table 4.11: Configuration B3 (global goodness, β = 0.6)

In case the accuracy of the results is crucial, we want to have a high confidence for the solution. As we can derive from configuration B1, with β = 0.4, this may lead to a higher goodness for a system with relatively low preference, as long as the confidence in it is high, because the trust we have in the accuracy of this solution is important. In a way, the confidence can be interpreted as a sort of variance on the preference, which can then be seen as the expected value. That is why a solution with high preference yet low confidence is sometimes less trustworthy.
In this case study, the decision is not highly critical and β values closer to one are also viable. Choosing β = 0.5 or β = 0.6, we find that system 2 drops to the bottom as confidence becomes less important relative to preference. The results affirm our earlier brief evaluation, confirming that system 3 is the best.
The semantic analysis of these results translates to the following: overall, the gaming population seems to prefer a multiplayer game, though a balanced game is also viable.
4.3 Final Remarks
In the preceding sections, the proposed techniques were used to evaluate certain systems and estimate how viable they are. However, when the performance variables of the problem are sufficiently mutable and the set of all possible systems too large to evaluate exhaustively, the methodologies explained in this work can also be used to generate good possible solutions with attribute values in the optimally preferred range. For example, instead of evaluating certain prototype games, the “perfect” game could be constructed, where the performance variables are allotted a value based on the results of the clustering step. It is then no longer necessary to perform further aggregation and goodness calculations, as the system is by construction optimal.
For the chosen clustering configuration this would lead to a game with the following
parameters:
Performance variable    amax    bmax    cmax    dmax    Optimal value
EOL                     0       1       4       8       2.5
ALT                     0       0       0       60      as low as possible
PPS                     8       16      64      128     32
OPT                     10      71      83      83      75

Table 4.12: The optimal game prototype, as derived from the results of the clustering step (β = 0.6)
This derived prototype clearly has a bit of everything. It should be easy to learn, with average-sized servers, quick loading times and still a good amount of playthrough time. Of course, this might not be feasible, as it means the game is in fact a bit of everything. In reality, this carries the risk that it has everything but excels at nothing. This immediately
shows the drawback of constructing an optimal prototype versus evaluating multiple
candidate systems: the systems are constructed with care and take into account factors
that aren’t incorporated in the decision support algorithm, such as time to market,
available budget and resource requirements, whereas the optimally constructed prototype
might be completely infeasible.
Chapter 5
Conclusions
So far we have shown the possibilities of the proposed framework. It allows us to gather
information from a source that was previously unreachable, and it proposes methodologies
to analyse this data in a useful way. The techniques used are all recently developed and
offer a high degree of flexibility in modelling human logic through the use of soft
computing techniques and configurable parameters.
First, we gather weighted expert opinions using soft computing techniques, where each
expert can express his or her expertise or preferences through membership functions. We
then convert these functions to their corresponding shape-symbolic notations and cluster
them hierarchically according to a shape-similarity measure based on the Levenshtein
distance.
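As a rough illustration of these two steps, the Python sketch below reduces each membership function to a hypothetical symbolic shape string, derives a normalized similarity from the Levenshtein distance and groups opinions with a simple greedy single-linkage pass; the actual shape-symbolic notation and hierarchical procedure used in this work are more refined.

    # Minimal sketch of shape-based clustering of expert opinions. It assumes
    # each membership function has already been converted into a symbolic
    # string encoding its shape (e.g. 'i' increasing, 'c' constant, 'd'
    # decreasing segments); this encoding is a simplification.

    def levenshtein(a, b):
        """Classic dynamic-programming edit distance between two strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def similarity(a, b):
        """Normalized shape similarity in [0, 1]; 1 means identical shape strings."""
        longest = max(len(a), len(b), 1)
        return 1 - levenshtein(a, b) / longest

    def cluster(shapes, threshold=0.8):
        """Greedy single-linkage grouping of shape strings (a simplification of
        the hierarchical clustering procedure used in this work)."""
        clusters = []
        for s in shapes:
            for c in clusters:
                if any(similarity(s, m) >= threshold for m in c):
                    c.append(s)
                    break
            else:
                clusters.append([s])
        return clusters

    opinions = ["iccd", "icd", "iccd", "ccd", "iiccdd"]   # illustrative shape strings
    print(cluster(opinions))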
Second, we calculate the confidence of the clusters based on their relative size and
compactness. The balance between the two can be changed by tweaking the α parameter.
Optionally, a second round of confidence calculations can be executed, carefully
redistributing the confidence, boosting that of clusters with similar cores and lowering
that of the others to compensate.
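Purely as an illustration, the sketch below assumes the confidence is an α-weighted combination of a cluster's relative weighted size and its average pairwise similarity (compactness); the precise formulas used in this work differ in their details, and all input values are invented.

    # Rough sketch: confidence of a cluster as an alpha-weighted combination of
    # its relative weighted size and its compactness. Weights and similarities
    # below are illustrative only.

    def cluster_confidence(member_weights, pairwise_similarities, total_weight, alpha=0.5):
        """member_weights: expert weights inside the cluster;
        pairwise_similarities: shape similarities between the cluster's members;
        total_weight: sum of the weights of all consulted experts."""
        relative_size = sum(member_weights) / total_weight
        # Compactness: average pairwise similarity (1.0 for a singleton cluster).
        compactness = (sum(pairwise_similarities) / len(pairwise_similarities)
                       if pairwise_similarities else 1.0)
        return alpha * relative_size + (1 - alpha) * compactness

    # Example: a cluster holding 3 of 10 equally weighted experts with very similar opinions.
    print(cluster_confidence([1, 1, 1], [0.90, 0.95, 0.92], total_weight=10, alpha=0.4))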
Third, the aggregation step occurs, selecting the best clusters and their most typical
values as representatives. The confidence at this point reflects how well the representing
elementary criterion captures the experts' opinions.
Fourth, each candidate system is evaluated through the traversal of the aggregation
structure. A global preference and a global confidence per system are derived. The
confidence level can now be interpreted as a degree of trust we can put in the accuracy
of the global preference score.
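The following sketch illustrates one possible propagation scheme, assuming the aggregation structure is a tree whose inner nodes are weighted power means (generalized conjunction/disjunction operators) and that the confidence is carried along the same tree as a weighted average; the exact operators and propagation rule of this work may differ.

    # Sketch of propagating preference and confidence through an aggregation tree.
    # Assumptions: inner nodes are weighted power means (exponent r sets the
    # degree of conjunction), confidence is carried along the same tree as a
    # weighted average. Both choices are illustrative.

    def power_mean(values, weights, r):
        if r == 0:  # geometric mean as the limit case
            prod = 1.0
            for v, w in zip(values, weights):
                prod *= v ** w
            return prod
        return sum(w * v ** r for v, w in zip(values, weights)) ** (1 / r)

    def evaluate(node):
        """node is either a leaf (preference, confidence) or a triple
        (children, weights, r); returns (preference, confidence)."""
        if isinstance(node, tuple) and len(node) == 2 and not isinstance(node[0], list):
            return node                                   # leaf: elementary criterion
        children, weights, r = node
        prefs, confs = zip(*(evaluate(c) for c in children))
        return (power_mean(prefs, weights, r),
                sum(w * c for w, c in zip(weights, confs)))  # weighted-average confidence

    # Two elementary criteria aggregated by a mildly conjunctive node (r < 1).
    tree = ([(0.8, 0.9), (0.6, 0.7)], [0.5, 0.5], 0.5)
    print(evaluate(tree))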
Finally, the resulting preference and confidence can be combined. The cruciality of the
problem can be reflected by altering the β parameter. The result is an orderable set
of systems, from which the mathematically best can be selected for evaluation by the
decision maker(s).
It should be clear that the output depends heavily on the parameters. The most obvious
ones are the clustering threshold, α and β, but the aggregation structure, θ and the
weights of the experts also play an important role. A small change in any of these can
cause a drastic change in the results. Hence it is important to realize that their
correctness has a large influence on the accuracy of the final ranking of the evaluated
systems. The accuracy of the parameters reflects the actual opinions of the experts and
the decision makers, and the results of the framework are only as accurate as this
representation of their human logic. Obviously, the framework is just a tool to aid
humans in their selection process.
5.1 Future Work
Looking at possibilities for further exploration, it is interesting to consider different
aggregation techniques, as the confidence calculations proposed in this work are closely
tied to the context of pre-aggregation and are not applicable in other situations. Another
topic worth looking into is the construction of compound aggregation structures, as was
briefly mentioned before.
Alternative Aggregation Techniques
Pre-aggregation requires that all experts be consulted in advance. Their data needs to be
collected before the algorithm can run, since the opinions are aggregated first of all to
serve as an input to the decision support system. If new experts are introduced later on,
all calculations have to be redone and the results of a previous evaluation possibly
discarded, as the extra information can, and most likely will, lead to different results.
This also implies that all information on the previously consulted experts needs to be
stored somewhere, even after their input has been used in a calculation.
This is not always a problem, but it rules out the possibility of use in certain fields of
application. If, for example, we want to set up a community-driven, on-demand suitability
map generator for specific queries, the entire aggregation and confidence calculations
have to be redone every time a new user (here considered an expert) enters his or her
preferences.
It is clear that with pre-aggregation there is a separate input gathering phase, a short
calculation step and a long operational phase, in which the gathered input is used to
answer flexible queries, though the input can no longer be changed without recalculation.
Most importantly, these phases are separated and do not overlap.
It might be interesting to look for an aggregation technique that allows a certain form of
continuous (re-)evaluation, where the phases are not separated in time, and where the cost
of aggregation is amortized by splitting it into small chunks of short calculations, one
each time an expert inputs his or her opinions. The final desired scenario is one in which
experts can input their opinions and produce a resulting preference mapping. Then, a new
expert can join in and enter his or her input, and the previous results are updated
on-the-fly. In a way, the resulting preference and confidence scores per system can be
seen as a sort of running average of the results of all experts.
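One way such a running aggregate could be maintained, assuming a simple weighted mean of the per-expert preference scores suffices, is sketched below: the combined score is updated in constant time whenever a new expert enters an opinion. The class and the numbers are illustrative.

    # Sketch of continuous (re-)evaluation: the combined preference per system
    # is kept as a running weighted mean and updated incrementally whenever a
    # new expert's personal evaluation comes in. The choice of a weighted mean
    # as the aggregation operator is an assumption for illustration.

    class RunningPreference:
        def __init__(self):
            self.total_weight = 0.0
            self.mean = 0.0

        def add_expert(self, preference, weight=1.0):
            """Incorporate one expert's global preference for this system in O(1)."""
            self.total_weight += weight
            self.mean += (preference - self.mean) * weight / self.total_weight
            return self.mean

    system_3 = RunningPreference()
    for pref, w in [(0.82, 1.0), (0.91, 2.0), (0.78, 1.0)]:
        print(round(system_3.add_expert(pref, w), 3))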
A possible approach to that end is to postpone the aggregation step until after decision
support has been performed. The whole system would then consist of multiple independent
single decision support calculations, one per expert, each resulting in a personalized
output reflecting the opinions of that expert. Those outputs can then be aggregated in
multiple ways.
Note that there is no mention of confidence up to this point, as no aggregation has taken
place thus far. This also means the existing DSS can be used as-is, without modifications
or extensions, as all aggregation is done on its outputs.
The aggregation of the independent outputs can be done in different ways. Keep in mind
that thus far we have an array of mappings between each system and its calculated global
preference, one element per expert. The simplest way to aggregate these into a single
preference score, reflecting the combined opinions, is by taking a weighted average. The
weights can be distributed based on the expertise level of the experts, similarly to
pre-aggregation. A simple measure for confidence can then be derived from the spread of
the preference scores for each system: subtracting the minimal preference from the maximal
preference over all experts and rescaling the result to a normalized axis gives an insight
into how divergent the opinions are. This confidence measure is, however, sensitive to
outliers.
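A minimal sketch of this post-aggregation scheme, with illustrative per-expert preference values, could look as follows.

    # Sketch of post-aggregation: combine per-expert global preferences for one
    # system by a weighted average, and derive a simple confidence from the
    # range (max - min) of those preferences. Sensitive to outliers, as noted.

    def post_aggregate(preferences, weights):
        combined = sum(p * w for p, w in zip(preferences, weights)) / sum(weights)
        spread = max(preferences) - min(preferences)    # preferences assumed in [0, 1]
        confidence = 1 - spread                         # 1 means full agreement
        return combined, confidence

    # Hypothetical per-expert preferences for one candidate system.
    prefs = [0.80, 0.85, 0.78, 0.30]    # one outlier drags the confidence down
    print(post_aggregate(prefs, weights=[1, 1, 1, 1]))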
Another possibility worth investigating is a more probabilistic approach. Note that we can
interpret the array of global preference values per candidate as a probability
distribution, which would presumably converge to a Gaussian distribution if the number of
experts is large enough. The overall combined global preference can then be derived by
first finding the best matching Gaussian and taking its expected value. The accompanying
confidence can be seen as a measure of spread of this distribution and can thus be derived
from the standard deviation.
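This probabilistic variant could be sketched as follows; the mapping from the standard deviation to a normalized confidence is an illustrative assumption.

    # Sketch of the probabilistic alternative: treat the per-expert global
    # preferences of a system as samples, fit a Gaussian by its sample mean and
    # standard deviation, and map the spread to a confidence in [0, 1].

    from statistics import mean, pstdev

    def gaussian_aggregate(preferences):
        mu = mean(preferences)                  # combined global preference
        sigma = pstdev(preferences)             # spread of the expert opinions
        confidence = max(0.0, 1 - 2 * sigma)    # ad-hoc: narrow spread gives high confidence
        return mu, sigma, confidence

    print(gaussian_aggregate([0.80, 0.85, 0.78, 0.83, 0.81]))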
Compound Aggregation Structures
In the examined case study, the origin of the aggregation structure was not specified. In
this case, it was simply generated from my point of view, but it was meant to represent
the opinions of a small group of decision makers who agree on which of the performance
variables are most important and how they should be aggregated. Similarly to the expert
opinions, however, this structure could also be constructed from information gathered from
social media. The experts could be asked to rank the performance variables by importance
and indicate which they deem necessary. Some effort could go into exploring the
possibility of aggregating that information to create a compound aggregation structure, to
further include clients in the decision-making process.