The Wisdom of Crowds in Matters of Taste

Johannes Müller-Trede, Rady School of Management, University of California, San Diego
Shoham Choshen-Hillel, The University of Chicago Booth School of Business
Ilan Yaniv and Meir Barneron, Department of Psychology, Hebrew University of Jerusalem

Aggregating predictions
How far is it from Mihaylo Hall to John Wayne Airport?
Individual estimates: 20 mi, 27 km, 9 km, 5 mi, 16 mi, 13 mi
Average: 12.7 mi; comparison value shown: 14.9 mi

Aggregating taste predictions
How enjoyable will the Friday afternoon session at the conference be?
Individual predictions: "very", "very", "can't wait", "so-so", 9/10, 8/10
Average: ???

Can we benefit from aggregating predictions in matters of taste?

The taste prediction problem
1. Taste predictions are imperfect. Errors can be random (noise) or systematic (bias) (March, 1978; Gilbert, 2006).
2. We consider the problem from the point of view of single individuals predicting their own tastes.

The taste aggregation problem
3. In predicting their own tastes, should individuals take into account others' predictions of their respective tastes?

Prediction accuracy: MSE
We define accuracy in terms of the squared error between taste predictions and taste criteria. For individual i with prediction x_i and taste criterion u_i, the squared error is (x_i - u_i)².

Modeling assumptions
1. Predictions are unbiased: x_i = u_i + ε_i, with E[ε_i] = 0 and Var[ε_i] = σ²_εi.
2. Prediction errors are not correlated with tastes: Cov[u_i, ε_j] = 0 for all i, j.

Simple averages: The Wisdom of Crowds
We show that simple averages of taste predictions can lead to accuracy gains when predictability is low (i.e., σ²_ε is large).
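The claim about simple averages can be checked with a small simulation. The sketch below is not the authors' analysis. It assumes two individuals with standard-normal tastes whose correlation is `taste_corr`, equal error variances, and errors that are independent across individuals (this last point goes beyond the two modeling assumptions stated above). The function name and all parameter values are hypothetical.

```python
import numpy as np


def simulate_mse(sigma_eps, taste_corr, n_items=200_000, seed=0):
    """Return (MSE of person 1's own prediction, MSE of the two-person average).

    Both are measured against person 1's own taste criterion u1.
    """
    rng = np.random.default_rng(seed)
    # Tastes u1, u2: standard normal with correlation taste_corr (assumption).
    cov = [[1.0, taste_corr], [taste_corr, 1.0]]
    u = rng.multivariate_normal([0.0, 0.0], cov, size=n_items)
    # Unbiased predictions x_i = u_i + eps_i; errors are independent of tastes
    # and (assumption) of each other.
    eps = rng.normal(0.0, sigma_eps, size=(n_items, 2))
    x = u + eps
    mse_own = float(np.mean((x[:, 0] - u[:, 0]) ** 2))
    mse_avg = float(np.mean((x.mean(axis=1) - u[:, 0]) ** 2))
    return mse_own, mse_avg


if __name__ == "__main__":
    for sigma in (0.5, 1.5):
        own, avg = simulate_mse(sigma_eps=sigma, taste_corr=0.5)
        print(f"sigma_eps={sigma}: own MSE={own:.2f}, average MSE={avg:.2f}")
```

Under these assumptions the expected MSE of the average is (1 - taste_corr)/2 + sigma_eps²/2, against sigma_eps² for the own prediction, so in this setup averaging helps once sigma_eps² exceeds 1 - taste_corr.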
Simple averages: example
With x_1 = 3, u_1 = 5, x_2 = 4, u_2 = 9: ((x_1 + x_2)/2 - u_1)² < (x_1 - u_1)², i.e., 2.25 < 4.

Optimal weights: Taste Similarity
We show that individuals should often place a larger weight on their own predictions and on predictions of others who share their tastes (i.e., r(u_1, u_2) is large).
Example (x_1 = 3, u_1 = 5, x_2 = 4, u_2 = 9): min over w_1, w_2 of ((w_1 x_1 + w_2 x_2) - u_1)², with w_1 > w_2.

Study 1: Music
Method
N = 104 (108) undergraduate participants.
Stimuli. Participants listened to 22 one-minute excerpts from a variety of musical pieces in different styles, such as classical music, national and international pop music, and ethnic music from Africa. The 22 pieces consisted of 11 pairs (e.g., 2 orchestral pieces by Bach, 2 songs from a Bob Dylan album…).
Procedure. After listening to each piece, participants rated how much they liked it and how familiar they were with it on 10-point Likert scales.

[Figure: Averages of n random others' ratings, plotted against n]

Decomposing inaccuracy
1. Mean squared errors can be decomposed into different parts (Lee & Yates, 1992; Theil, 1966):
MSE_x = (M_x - M_u)² + (S_x - S_u)² + 2 (1 - r_a) S_x S_u
      = (M_x - M_u)² + (S_x - r_a S_u)² + (1 - r_a²) S_u²
where M and S are the means and standard deviations of the predictions x and the criteria u, and r_a is the correlation between predictions and criteria.
Bias: Systematic over- or underprediction.
Variability: Predictions should "regress to the mean".
Correlation: Between the predictions and the criteria.
2. We use this decomposition to identify the nature of the error in the (averages of the) taste predictions.

[Figure: Averages of n similar others' ratings, plotted against n]

Study 1, results
1. Averaging can be beneficial in matters of taste.
2. Averaging effects are more pronounced for similar others.
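The MSE decomposition used for Study 1 can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' code: it uses the three-term form with bias, (S_x - r_a S_u)² and (1 - r_a²) S_u², and the labels "variability bias" and "lack of correspondence" for the last two terms are our reading of the slides. The function name and the example numbers are hypothetical.

```python
import numpy as np


def decompose_mse(x, u):
    """Split the MSE between predictions x and criteria u into three parts.

    Uses population standard deviations (ddof=0) so that the parts add up
    to the MSE exactly. Requires x and u to vary (otherwise r is undefined).
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    m_x, m_u = x.mean(), u.mean()
    s_x, s_u = x.std(), u.std()
    r = np.corrcoef(x, u)[0, 1]  # correlation between predictions and criteria
    return {
        "mse": float(np.mean((x - u) ** 2)),
        "bias": float((m_x - m_u) ** 2),
        "variability_bias": float((s_x - r * s_u) ** 2),
        "lack_of_correspondence": float((1 - r ** 2) * s_u ** 2),
    }


if __name__ == "__main__":
    # Hypothetical predictions and enjoyment ratings on a 100-point scale.
    predictions = [60, 80, 40, 90, 55, 70, 30]
    ratings = [50, 65, 45, 70, 60, 62, 48]
    for name, value in decompose_mse(predictions, ratings).items():
        print(f"{name}: {value:.1f}")
```

The variability term is zero when S_x = r_a S_u, which is one way to read the remark that predictions should "regress to the mean".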
… but participants in Study 1 did not make predictions.

Study 2: Short films
Method
N = 62 (66) undergraduate participants.
Session 1. Participants viewed 10-second clips from 7 short films along with a brief description of each film. They then predicted, on a 100-point scale, how much they expected to enjoy each film.
Session 2. Two weeks later, participants returned to the lab and were shown the 7 short films. They then rated how much they enjoyed each film on a 100-point scale.

Averaging n taste predictions
[Figure: MSE plotted against n (0 to 60) for three series: the average of the predictions of n randomly chosen other participants, the average of the predictions of the n most similar other participants, and the benchmark (own prediction).]

Decomposing accuracy gains
Total MSE = Bias + Variability Bias + Lack of Correspondence
[Figure: four panels (Total MSE, Bias, Variability Bias, Lack of Correspondence), each plotted against N (0 to 60) for the same three series: n randomly chosen other participants, the n most similar other participants, and the benchmark (own prediction).]

Applications and Implications
1. Preference predictions inform decisions. By taking into account what other people might like, decision makers (DMs) can make better decisions.
2. Psychological foundations for similarity-based algorithms in recommender systems (e.g., collaborative filtering; Ansari, Essegaier, & Kohli, 2000; Koren & Bell, 2011).

Conclusions
1. Aggregating predictions can lead to accuracy gains even in the context of taste predictions and other "subjective truths".
2. Taste predictability and taste similarity determine the potential for, and the nature of, these accuracy gains.
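The comparison in Study 2 between averaging n randomly chosen others and averaging the n most similar others can be sketched on simulated data. Everything in this sketch is an assumption made for illustration: the data are generated from two hypothetical taste groups, similarity is measured as the correlation between participants' prediction profiles (the slides do not say how similarity was computed), and the function names and parameter values are ours.

```python
import numpy as np


def simulate_predictions(n_people=40, n_items=30, seed=1):
    """Simulate (predictions, tastes), each of shape (n_people, n_items)."""
    rng = np.random.default_rng(seed)
    group = np.arange(n_people) % 2               # two hypothetical taste groups
    profiles = rng.normal(size=(2, n_items))      # each group's typical liking per item
    taste = profiles[group] + 0.5 * rng.normal(size=(n_people, n_items))
    pred = taste + rng.normal(size=(n_people, n_items))  # unbiased, noisy predictions
    return pred, taste


def mse_of_others_average(pred, taste, n, similar=True, seed=0):
    """MSE of the average of n others' predictions against each person's own taste."""
    rng = np.random.default_rng(seed)
    n_people = pred.shape[0]
    sim = np.corrcoef(pred)  # person-by-person correlation of prediction profiles
    errors = []
    for i in range(n_people):
        others = np.array([j for j in range(n_people) if j != i])
        if similar:
            # The n others whose prediction profiles correlate most with person i's.
            chosen = others[np.argsort(sim[i, others])[::-1][:n]]
        else:
            chosen = rng.choice(others, size=n, replace=False)
        avg = pred[chosen].mean(axis=0)
        errors.append(np.mean((avg - taste[i]) ** 2))
    return float(np.mean(errors))


if __name__ == "__main__":
    pred, taste = simulate_predictions()
    print("own prediction:", round(float(np.mean((pred - taste) ** 2)), 2))
    for n in (1, 5, 10):
        print(n,
              "random:", round(mse_of_others_average(pred, taste, n, similar=False), 2),
              "similar:", round(mse_of_others_average(pred, taste, n, similar=True), 2))
```

In this toy setup, similar others mostly come from the same taste group, so their averaged predictions track the target person's tastes more closely than an average over randomly chosen others.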
Taste subjectivity: optimal weights
[Figure: optimal weights w* (vertical axis, 0 to 0.6) for three series: Weight Self, Weight Other 1, Weight Other 2.]
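Optimal weights of this kind can be approximated by least squares on simulated data. The sketch below is a minimal illustration, not the authors' derivation: it assumes three people with zero-mean normal tastes, a hypothetical taste correlation matrix in which Other 1 is similar to the self (0.8) and Other 2 is not (0.2), equal and independent prediction errors, and weights that are neither constrained to sum to one nor accompanied by an intercept.

```python
import numpy as np


def simulate_three_people(sigma_eps=1.0, n_items=100_000, seed=0):
    """Return (predictions of self and two others, the self's taste criterion)."""
    rng = np.random.default_rng(seed)
    # Hypothetical taste correlations: self/other 1 = 0.8, self/other 2 = 0.2.
    taste_cov = np.array([[1.0, 0.8, 0.2],
                          [0.8, 1.0, 0.2],
                          [0.2, 0.2, 1.0]])
    u = rng.multivariate_normal(np.zeros(3), taste_cov, size=n_items)
    # Unbiased predictions x_i = u_i + eps_i with independent errors (assumption).
    x = u + rng.normal(0.0, sigma_eps, size=u.shape)
    return x, u[:, 0]


def optimal_weights(x, u_self):
    """Weights w minimising mean((x @ w - u_self) ** 2), without an intercept."""
    w, *_ = np.linalg.lstsq(x, u_self, rcond=None)
    return w


if __name__ == "__main__":
    x, u_self = simulate_three_people()
    w_self, w_other1, w_other2 = optimal_weights(x, u_self)
    print(f"self={w_self:.2f}, other 1={w_other1:.2f}, other 2={w_other2:.2f}")
```

With these assumed parameters the population solution is roughly 0.40 for the self, 0.24 for the similar other, and 0.04 for the dissimilar other, which matches the qualitative pattern described for optimal weights: most weight on one's own prediction, more on similar others than on dissimilar ones.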