
Real-time Sociofeedback: A VoIP
Application and a Correlation Study
presented by
Debsubhra CHAKRABORTY
PhD student
Institute for Media Innovation/ Interdisciplinary Graduate School
Supervisor: Asst. Prof. Justin Dauwels (EEE)
Co-Supervisor: Prof. Daniel Thalmann (IMI/SCE)
21st May, 2015
Outline
• Introduction
• VoIP Application
• Corpora
• Correlation Study
• Conclusion
Introduction
• Computers are moving from tools, to facilitators of human-human
interaction, to social robots
• Social Signal Processing – endow computers with social
intelligence
Vinciarelli, A., Pantic, M., & Bourlard, H. (2009). Social signal processing: Survey of an emerging domain. Image and Vision
Computing, 27(12), 1743-1759.
Albrecht, K. (2006). Social Intelligence: The new science of success. John Wiley & Sons.
Introduction
• Social signals – key to SSP
• Social signals contain non-verbal cues
• Existing methods – offline analysis, one or two social signals, no
additional video data used, rule-based, no feedback
Vinciarelli, A., Pantic, M., Heylen, D., Pelachaud, C., Poggi, I., D'Errico, F., & Schröder, M. (2012). Bridging the gap between
social animal and unsocial machine: A survey of social signal processing. Affective Computing, IEEE Transactions on, 3(1), 69-87. (figure source)
Poggi, I. (2007). Mind, hands, face and body: a goal and belief view of multimodal communication. Weidler.
Our Approach
• Applications: Skype interview or online course
Politeness
Friendliness
Frustration
Respect
Empathy
Confusion
Hostility
Agreement
Dominance
Interest
VoIP Application: A Demo
VoIP Application
SuperTintin for Skype
• Records each speaker on separate .mp4 files
• Segments of 1 minute
Matlab
• File event handler analyzes each new file saved
• Separates audio from video
• Uses trained machine learning models on audio to get sociometrics
Skype
• Uses the Skype API to conveniently display the sociometrics
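The file-handling step above can be sketched as follows. This is only an illustration in Python, not the Matlab implementation used in the application: the function names `find_new_segments` and `audio_extract_command` are hypothetical, the use of ffmpeg for separating audio from video is an assumption, and so are the 16 kHz mono output settings.

```python
import os
import tempfile


def find_new_segments(directory, seen):
    """Return .mp4 files in `directory` that are not yet in `seen`, sorted by name.

    `seen` is updated in place so that each segment is reported only once.
    """
    current = sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(".mp4")
    )
    new = [name for name in current if name not in seen]
    seen.update(new)
    return new


def audio_extract_command(video_path, sample_rate=16000):
    """Build an ffmpeg command line (assumed tool) that drops the video stream
    (-vn) and writes mono audio at `sample_rate` Hz to a .wav file."""
    wav_path = os.path.splitext(video_path)[0] + ".wav"
    return ["ffmpeg", "-y", "-i", video_path, "-vn",
            "-ac", "1", "-ar", str(sample_rate), wav_path]


# Demo on a temporary directory holding empty placeholder files.
demo_dir = tempfile.mkdtemp()
seen_files = set()
for name in ("speaker1_001.mp4", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

for name in find_new_segments(demo_dir, seen_files):
    # The command is only printed here; a real pipeline would run it,
    # for example with subprocess.run, and then pass the .wav to the models.
    print(audio_extract_command(os.path.join(demo_dir, name)))
```

Calling `find_new_segments` repeatedly (for example once per second) approximates a file event handler by polling; a real deployment would also need to make sure a segment is fully written before it is processed.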
Machine Learning
• Different machine learning algorithms to obtain best
classification result
Support Vector Machine
Support Vector Ordinal Regression
Adaptive Boosting
Bagging
Artificial Neural Network
k-Nearest Neighbors
Random Subspace Ensembles
Naïve Bayes
Least Squares
• Model training is performed on two corpora: AC and
AVC
Corpora
Audio Corpus (AC)
• Audio on separate microphones
• No video
• 150 recordings
• 2.5-3 minutes long
• 22 participants (17 M, 5 F)
• Both speakers are participants
Audio-Visual Corpus (AVC)
• Audio on separate microphones
• Video on separate Kinects
• 100 recordings
• 1 minute long
• 21 participants (16 M, 5 F)
• One speaker is a participant, the other a control
Feature Extraction
• Speech detection using HMM
• Prosodic cues computed over 30 ms
• Visual cues from RGB + depth data from Kinect

Conversational       Prosodic       Visual
Speaking Duration    Frequencies    Postures
Speaking Turns       MFCC           Head Movement
Interruption         Amplitude      Hand Gestures
Interjection                        Head Pose
Basu, S. (2003, April). A linked-HMM model for robust voicing and speech detection. In Acoustics, Speech, and Signal
Processing, 2003. Proceedings. (ICASSP'03). 2003 IEEE International Conference on (Vol. 1, pp. I-816). IEEE.
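As a rough illustration of frame-based cues, the sketch below computes one prosodic cue (amplitude, taken here as RMS) over 30 ms frames and two conversational cues from per-frame speech flags. The 16 kHz sample rate, the non-overlapping frames, the RMS definition of amplitude, and the function names are all assumptions; the speech flags stand in for the output of a speech detector such as the HMM cited above, which is not implemented here.

```python
import math


def frame_rms(samples, sample_rate=16000, frame_ms=30):
    """RMS amplitude of each non-overlapping frame of `frame_ms` milliseconds.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    values = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return values


def conversational_cues(speech_flags, frame_ms=30):
    """Speaking duration in seconds and number of speaking turns.

    `speech_flags` holds one 0/1 value per frame; a turn is counted
    (by assumption) as each run of consecutive speech frames.
    """
    duration = sum(speech_flags) * frame_ms / 1000
    turns = sum(
        1 for i, flag in enumerate(speech_flags)
        if flag and (i == 0 or not speech_flags[i - 1])
    )
    return duration, turns


# Synthetic 100 Hz tone, 60 ms long at 16 kHz (two full frames).
tone = [math.sin(2 * math.pi * 100 * n / 16000) for n in range(960)]
print(frame_rms(tone))
print(conversational_cues([0, 1, 1, 0, 1]))
```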
Annotation Protocol
• Each recording annotated by multiple judges
• Annotation on a Likert scale ranging from 1 (low) to 3
(high)
Corpus   Maximum SD   Minimum SD   Mean SD   Median SD
AC       0.55         0.24         0.45      0.47
AVC      0.51         0.24         0.41      0.43
Tahir, Y., Chakraborty, D., Maszczyk, T., Dauwels, S., Dauwels, J., Thalmann, N., Thalmann, D. (2015). Real-Time
Sociometrics from Audio-Visual Features for Two-Person Dialogs. – Accepted in DSP 2015.
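One way to summarize agreement between judges is sketched below. It assumes that each SD is the standard deviation of the judges' ratings for one recording, and that the maximum, minimum, mean, and median are then taken across recordings; the slides do not spell out this procedure, so both the interpretation and the choice of the population standard deviation are assumptions. The ratings in the example are made up.

```python
import statistics


def rating_spread(ratings_per_recording):
    """Summarize how much judges disagree.

    `ratings_per_recording` is a list with one entry per recording, each
    entry being the list of ratings (1 to 3) given by the judges.
    """
    sds = [statistics.pstdev(ratings) for ratings in ratings_per_recording]
    return {
        "max": max(sds),
        "min": min(sds),
        "mean": statistics.mean(sds),
        "median": statistics.median(sds),
    }


# Hypothetical ratings from three judges for three recordings.
example = [[1, 2, 3], [2, 2, 2], [1, 1, 3]]
summary = rating_spread(example)
print(summary)
```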
Classification Results
• Leave-one-person-out cross validation technique
Sociometric     AC      AVC
Agreement       84 %    81 %
Dominance       86 %    90 %
Interest        85 %    92 %
Politeness      81 %    76 %
Friendliness    51 %    63 %
Frustration     50 %    67 %
Empathy         59 %    67 %
Respect         59 %    62 %
Confusion       81 %    89 %
Hostility       77 %    72 %
Tahir, Y., Chakraborty, D., Maszczyk, T., Dauwels, S., Dauwels, J., Thalmann, N., Thalmann, D. (2015). Real-Time
Sociometrics from Audio-Visual Features for Two-Person Dialogs. – Accepted in DSP 2015.
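Leave-one-person-out cross-validation holds out all recordings of one person, trains on the rest, and repeats this for every person, so no speaker appears in both training and test data. The sketch below pairs it with a small k-nearest-neighbors classifier purely for illustration; the function names, the toy features, and the choice of k are assumptions and do not reproduce the models or results reported above.

```python
from collections import Counter


def leave_one_person_out(person_ids):
    """Yield (held_out_person, train_indices, test_indices) for each person."""
    for held_out in sorted(set(person_ids)):
        train = [i for i, p in enumerate(person_ids) if p != held_out]
        test = [i for i, p in enumerate(person_ids) if p == held_out]
        yield held_out, train, test


def knn_predict(train_x, train_y, x, k=3):
    """Majority label among the k training points closest to x (Euclidean)."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def lopo_accuracy(features, labels, person_ids, k=3):
    """Fraction of recordings classified correctly over all held-out persons."""
    correct = 0
    for _, train, test in leave_one_person_out(person_ids):
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        correct += sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test
        )
    return correct / len(labels)


# Toy data: one made-up feature, labels on the 1 (low) to 3 (high) scale.
toy_features = [[0.0], [10.0], [0.5], [9.5], [1.0], [9.0]]
toy_labels = [1, 3, 1, 3, 1, 3]
toy_persons = ["A", "A", "B", "B", "C", "C"]
print(lopo_accuracy(toy_features, toy_labels, toy_persons))
```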
Correlation Study
• Understand interrelationships between indicators
(sociometrics) and features
• Express them in a visually engaging manner
• Identify redundancies
• Boost existing classification
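A minimal sketch of the redundancy check, assuming Pearson correlation as the measure (the slides do not name the coefficient used). Pairs of indicators or features whose absolute correlation exceeds a threshold are flagged as candidates for removal. The threshold of 0.9, the function names, and the data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant, where the
    coefficient is undefined.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def redundant_pairs(columns, threshold=0.9):
    """List (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Made-up values for three generic variables.
toy_columns = {
    "a": [1, 2, 3, 2],
    "b": [2, 4, 6, 4],
    "c": [3, 1, 2, 3],
}
print(redundant_pairs(toy_columns))
```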
Correlation: Between Indicators
[Figure: correlations between indicators, one panel each for AC and AVC]
Oleg Komarov, “Schemaball,” [Software], June 2013. Available from
http://www.mathworks.com/matlabcentral/fileexchange/42279-schemaball.
Correlation: Indicators & Features
[Figure: correlations between indicators and features, one panel each for AC and AVC]
Conclusion
• Feedback of social states in real-time is important in
certain situations
• Such feedback can be provided through VoIP
• Such feedback is based on machine learning of
annotated corpora
• Correlation study of corpora can help us understand
important relationships
Acknowledgements
• Asst. Prof. Justin Dauwels for his thoughtful insights
• Yasir Tahir and Tomasz Maszczyk for their generous
help
• IMI and IGS at NTU for supporting this research