11
Discussion
Abstract
This chapter will start with a wrap-up of what has been presented in this monograph. Its main contribution will lie in looking back and forth in time. After an introduction, a historical perspective will be taken, which will illustrate the vast amount of knowledge that is frequently ignored in ASP research. Subsequently, I will weigh this monograph’s contribution to emotion science’s 10 hot topics, as were recently identified [236]. After this, ASP will be brought back to practice by introducing affective computing’s I/O. Next, three applications that fit three disciplines of computer science will be unveiled, namely: Human-Computer Interaction (HCI), Artificial Intelligence (AI), and health informatics. It will be argued that the technique is ready to bring these applications to the market. Subsequently, the pros and cons of two possible future application areas (i.e., robot nannies and a digital human model) will be discussed. I will finish this chapter and, hence, this monograph with a brief conclusion.
In order of appearance, this chapter includes parts of:
Broek, E.L. van den, Zwaag, M.D. van der, Healey, J.A., Janssen, J.H., & Westerink, J.H.D.M. (2010). Prerequisites for Affective Signal Processing (ASP) - Part IV. In J. Kim & P. Karjalainen (Eds.), Proceedings of the 1st International Workshop on Bio-inspired Human-Machine Interfaces and Healthcare Applications – B-Interface 2010, pp. 59–66. January 21, Valencia, Spain;
Broek, E.L. van den, Sluis, F. van der, & Schouten, Th.E. (2010). User-centered digital preservation of multimedia. European Research Consortium for Informatics and Mathematics (ERCIM) News, No. 80 (January), 45–47;
Broek, E.L. van den (2010). Robot nannies: Future or fiction? Interaction Studies, 11(2), 274–282; and:
Broek, E.L. van den (2010). Beyond Biometrics. Procedia Computer Science, 1(1), 2505–2513. [invited].
11.1 Introduction
This monograph was divided into five parts: I. a prologue, II. baseline-free Affective Signal Processing (ASP) using statistical moments, III. bi-modal ASP that explored various possible key factors, IV. two studies towards affective computing, and V. an epilogue of which this discussion is the second part. In addition, Appendix A provides additional background information on the statistical techniques used in this monograph.
The first part, the prologue, started with a general introduction and an introduction on this monograph’s key concepts: affect (and emotion), affective computing, and Affective Signal Processing (ASP). Next, the closed loop model for affective computing and ASP was introduced, which served as the working model for this monograph. Moreover, the relevance of ASP for three branches of computer science (i.e., Human-Computer Interaction (HCI), Artificial Intelligence (AI), and health informatics) was explained. Last, an outline of this monograph was provided. The second and last chapter of the prologue provided a review of affective computing and, more in particular, ASP. Biosignals received most attention as this was the target modality of the monograph.
The second part of this monograph consisted of two chapters (Chapters 3 and 4) that presented two distinct sets of analyses on the same data set. The analyses differed in their choice of time windows. This way, the impact and usage of this parameter for ASP was explored. Dynamic real-world stimuli (i.e., fragments from movies) were used to induce emotions, instead of less ecologically valid static stimuli. The EMG of three facial muscles was recorded, as is often done to establish a ground truth measurement. In addition, the participants’ EDA was recorded. This is a robust, well-documented biosignal that reveals the level of experienced arousal. Baseline-free ASP was achieved through the use of statistical moments. The 3rd and 4th order moments (i.e., skewness and kurtosis) of the biosignals revealed hidden signal characteristics that made it possible to discriminate well between four emotional states, with up to 62% explained variance.
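To make the notion of statistical moments as signal features concrete, the following is a minimal sketch in Python, not the implementation used in Chapters 3 and 4. It assumes population (biased) estimators and excess kurtosis; the function name `statistical_moments` and the example values are hypothetical. Because skewness and kurtosis are standardized, they do not change when a constant offset is added to the signal or when it is rescaled, which is one reason such features can be computed without a baseline.

```python
import math


def statistical_moments(signal):
    """Return mean, standard deviation, skewness, and excess kurtosis.

    Assumption: population (biased) estimators are used; the monograph
    may have used different estimators.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("at least two samples are required")

    mean = sum(signal) / n
    deviations = [x - mean for x in signal]

    # Central moments of order 2, 3, and 4.
    m2 = sum(d ** 2 for d in deviations) / n
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n
    if m2 == 0:
        raise ValueError("a constant signal has no defined skewness or kurtosis")

    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,       # 3rd standardized moment
        "kurtosis": m4 / m2 ** 2 - 3.0,   # 4th standardized moment, minus 3
    }


if __name__ == "__main__":
    # Hypothetical EDA-like samples within one time window (arbitrary units).
    window = [2.1, 2.0, 2.2, 2.1, 3.4, 2.9, 2.4, 2.2]
    for name, value in statistical_moments(window).items():
        print(f"{name}: {value:.3f}")
```

In such a sketch, the function would be applied per time window and per signal, so that each window yields one small feature vector.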
The third part of this monograph also consisted of two chapters, Chapters 5 and 6. The studies presented in these chapters differed only with respect to the stimuli used. In the first study, Chapter 5, one of the reference sets, or perhaps the reference set, for affective computing was used: IAPS images. In the second study, Chapter 6, the same set of movie fragments was used as in Chapters 3 and 4. This enabled a comparison of static versus dynamic stimuli and, as such, assessed their validity. Both studies employed a bi-modal ASP approach to assess affective processes, including ECG as biosignal as well as speech. To the best of the author’s knowledge, in this context, this combination has so far only been explored by Kim and colleagues [336, 337, 339, 340]. Both studies also explored a range of issues important to ASP, namely: emotion models, environment, personality traits, and demographics. Surprisingly, some of these issues were shown to have little influence on ASP (e.g., the personality trait extroversion and demographics). In contrast, other issues (e.g., environment and gender) were shown to be of influence. Up to 90% of variance in the data was explained. Moreover, with both studies more support was found for the valence-arousal model than for basic emotions.
The fourth part consisted of three chapters that presented work bringing us further towards affective computing. The first chapter, Chapter 7, presented the execution of the complete signal processing + pattern recognition processing pipeline, see also Section 1.5. In the quest for an optimal processing pipeline, several signal processing aspects and classification methods were explored. The second chapter, Chapter 8, assessed the use of the speech signal as affective signal. The study’s aim was to explore the feasibility of a speech-based Computer-Aided Diagnosis (CAD) for mental health care. The study consisted of two experiments, one well controlled and one open, in which patients with a post-traumatic stress disorder (PTSD) participated. For both experiments, a model was developed that explained a significant amount of variance. In the third chapter, Chapter 9, the data of Chapter 8 was used to execute the complete signal processing + pattern recognition processing pipeline (cf. Chapter 7). As such, this chapter explored the feasibility of the envisioned ASP-based CAD for mental health care. I concluded that, from both a clinical and an engineering point of view, affective computing seems to be within reach.
The fifth part, the epilogue, consists of two parts: the discussion you are currently reading and a set of guidelines for ASP, which was presented in the previous chapter. This guideline chapter presented the lessons learned while working on the research reported in this monograph. These guidelines indicated possible problems, presented solutions for them, and provided research directives for affective computing. As such, this was perhaps the most important chapter of this monograph.
The remainder of this discussion will look back and forth in time. In Section 11.2, I will stress that we should go back to the basics and learn from the field’s research history. The reason for this is simple: energy spent on reinvention is wasted. In Sections 11.5 and 11.6, I will go from theory to practice and present some applications that could be realized with the current state-of-the-art ASP as presented in this monograph. Additionally, I will touch upon some of the ethical aspects of these applications. I will end this monograph with a brief conclusion in Section 11.7.
11.2 Historical reflection
Although a lot of knowledge on emotions has been gained over the last centuries [22, Chapter 1], researchers tend to ignore this to a great extent and to stick to some relatively recent theories; for example, the valence and arousal model or the approach avoidance model.
This holds in particular for ASP and affective computing, where an engineering approach is dominant and a theoretical framework is considered of lesser importance [680]. Consequently, for most engineering approaches, the valence-arousal model is applied as a default option, without considering other possibilities. Nonetheless, a higher awareness of other theories can heighten the understanding and, with that, the success of ASP.
It is far beyond the scope of this monograph to provide a complete overview of all of the literature relevant for ASP and affective computing. For such an overview, I refer to the various handbooks and review papers on emotions, affective sciences, and affective neuroscience [16, 52, 72, 139, 144, 208, 209, 238, 396, 492, 535, 573, 582, 631]. In this section, I will touch upon some of the major works on emotion research which originate from medicine, biology, physiology, and psychology.
Let us start with one of the earliest works on biosignals: De l’Électricité du corps humain by M. l’Abbé Bertholon (1780) [366], who was the first to describe human biosignals. Nearly a century later, Darwin (1872) published his book The expression of the emotions in man and animals [139]. Subsequently, independently of each other, William James and C.G. Lange presented their theories on emotions, which were remarkably similar [139]. Consequently, their theories have been merged and were baptized the James-Lange theory.
In a nutshell, the James-Lange theory argues that the perception of our own biosignals is the emotion. Consequently, no emotions can be experienced without these biosignals. Two decades after the publication of James’ theory, this was seriously challenged by Cannon [90, 91] and Bard [29, 30]. They emphasized the role of subcortical structures (e.g., the thalamus, the hypothalamus, and the amygdala) in experiencing emotions. Their rebuttal of the James-Lange theory was expressed in a theory that was founded on five notions:
1. Compared to a normal situation, experienced emotions are similar when biosignals are omitted; e.g., as with the transection of the spinal cord and vagus nerve.
2. Similar biosignals emerge with all emotions. So, these signals cannot cause distinct emotions.
3. The body’s internal organs have fewer sensory nerves than other structures. Hence, people are unaware of their possible biosignals.
4. Generally, biosignals have a long latency period, compared to the time emotional responses are expressed.
5. Drugs that trigger the emergence of biosignals do not necessarily trigger emotions in parallel.
I will now address each of Cannon’s notions from the perspective of ASP. It is important to consider these notions for current ASP, as will become apparent.
To the author’s knowledge, the first case that illustrated both theories’ weaknesses was that of a patient with a lesion, as denoted in Cannon’s first notion. This patient reported:
“Sometimes I act angry when I see some injustice. I yell and cuss and raise hell, because if you don’t do it sometimes, I learned people will take advantage of you, but it just doesn’t have the heat to it that it used to. It’s a mental kind of anger.” [285, p. 151]. On the one hand, this case seems to support the James-Lange theory, since the lesion disturbed the patient’s biosignals and, in parallel, his emotions have diminished or are even absent. On the other hand, the patient does still report emotions, although of a different kind. If biosignals are the emotion, how can this be explained? Can this be attributed to higher level cognition, to reasoning only? If not, this can be considered as support for the Cannon-Bard theory. More than anything else, this case once more illustrates the complexity of affective processes as well as the need for user identification, in particular research on special cases (see also Section 10.3.3).
The second notion of the Cannon-Bard theory strikes at the essence of ASP. It would imply that the quest of affective computing is doomed to fail. According to Cannon-Bard, ASP is of no use since there are no unique sets of biosignals that map to distinct emotions. Luckily, nowadays, this statement is judged to be too coarse [139]. However, it is generally acknowledged that it is very hard to apply ASP successfully [52]. So, to some extent, Cannon was right.
It was confirmed that the number of sensory nerves differs among distinct structures in the human body (Cannon’s notion 3). So, indeed, people’s physiological structures determine the internal variation in their emotional sensitivity. To make ASP even more challenging, there are cross-cultural and ethnic differences in people’s patterns of biosignals, as was already shown by Sternbach and Tursky [627, 660] and confirmed repeatedly [585, Chapter 28], [556, 557, 603]. Once more, this stresses the need for user identification, which is one of the guidelines proposed in this monograph (see Section 10.3.3).
The fourth notion concerns the latency period of biosignals, which Cannon denoted as being ‘long’. Indeed, a response time is present with biosignals, which one could denote as being long. Moreover, it varies considerably among the various biosignals used with ASP; see also Table 1.1 in Chapter 1. The former is a problem, although in most cases a workaround is, to some extent, possible. The latter is possibly even more important to take into account when conducting ASP. Regrettably, this is seldom done. This problem has been addressed as temporal construction in Section 10.2.2.
The fifth and last notion of Cannon is one that has not been addressed so far. It goes beyond biosignals since it concerns the neurochemical aspects of emotions. Although this component of human physiology can indeed have a significant influence on experienced emotions, this falls far beyond the scope of this monograph.
It should be noted that the current general opinion among neuroscientists is that the truth lies somewhere in between the theories of James-Lange and Cannon-Bard [139], as was first suggested by Schachter and Singer [575]. However, the various relations between
