Research team

Expertise

Computational linguistics, natural language processing and digital text analysis. Special interest in emotion analysis.

Emotions without borders: Studying the verbalization and automatic detection of emotions across languages. 01/05/2025 - 30/04/2029

Abstract

Emotions have attracted a lot of attention in psychology, socio- and psycholinguistics and communication science, and over the past decade also in computational linguistics and natural language processing. In the latter fields, the term emotion detection refers to the task of automatically identifying fine-grained emotions in texts. Research on emotion detection has mainly focused on English, but with the emergence of (multilingual) large language models, interest in multilingual approaches to emotion detection has increased. Meanwhile, state-of-the-art research in psychology has developed new theories about emotion, claiming that emotions are not universal, neither in their conceptualization nor in their expression. This might have consequences for how multilingual emotion detection models work. Therefore, it is crucial to investigate differences in emotional language use across languages. Most studies that deal with the cultural component of emotions are limited to studying the translatability of emotion words, or focus on very specific cases and language pairs. In this research project, we will transcend the word level and go beyond the comparison of language pairs by comparing emotion verbalization across ten languages, using methods from computational linguistics. Moreover, we will investigate how state-of-the-art emotion detection models deal with cross-lingual differences in emotion verbalization.

Researcher(s)

Research team(s)

Project type(s)

  • Research Project

Webcare through the eyes of the bystander: A cross-linguistic comparison of pragmatic-rhetorical features in hotel review-response interactions. 01/01/2025 - 31/12/2028

Abstract

Webcare, as a manifestation of digital reputation management, has become ubiquitous within the tourism industry. The significance of this publicly accessible online customer service communication cannot be overstated. It demonstrates a commitment to guest satisfaction, thereby positively influencing the hotel's image. Consequently, it can sway prospective clients who, as bystanders, follow this communication and subsequently opt for a specific hotel. Although recent studies suggest that guest reviews and hotel responses are influenced by cultural factors, cross-cultural analyses of hotel interactions remain scarce and cover only a limited range of languages and cultures. Therefore, the objective of this project is to conduct a cross-linguistic study of a multilingual corpus consisting of 80,000 hotel reviews and their corresponding responses in German, French, English (UK/US), Italian, Dutch, and Spanish (ES/MX). Specifically, this project aims to explore the cross-linguistic characteristics of hotel interactions in L1. It seeks to identify which of these characteristics are perceived as positive or negative by the bystander, who is ultimately the intended audience for these responses. The knowledge gained from this foundational research will inform the fields of pragmatics and marketing communication and present opportunities for the development of generative AI systems that can automatically craft responses tailored to the linguistic and cultural context.

Researcher(s)

Research team(s)

Project type(s)

  • Research Project

CLARIAH-VL+: paving the way for a SSH Open Science Cloud for Flanders. 01/01/2025 - 31/12/2028

Abstract

CLARIAH-VL+ focuses on boosting research in the Humanities and Social Sciences using digital tools and services, and on fostering collaborations to create an accessible cloud platform for researchers. It unifies disciplines that share the challenge of working with a wide range of sources and heritage, from ancient manuscripts to modern digital data. The project acts as a bridge, connecting researchers in Flanders with broader European research communities. It addresses three major societal issues: environmental change, social inequality, and the complexities of migration and cultural diversity. By integrating insights from the Social Sciences with the detailed analysis typical of the Humanities, and linking these to other fields like ecology and geography, CLARIAH-VL+ aims to provide comprehensive new understandings. A key aspect of CLARIAH-VL+ is its focus on transforming old and often neglected data, such as weather records and population statistics, into useful information. These data are crucial for tackling pressing societal challenges. CLARIAH-VL+ commits to providing the infrastructure to use these data in practice by harnessing the capabilities of large language models and generative AI and securing the sustainability of essential research components. Through these efforts, CLARIAH-VL+ not only makes research easier, but also makes Humanities and Social Sciences research more relevant and impactful for society.

Researcher(s)

Research team(s)

Project type(s)

  • Research Project

Classification of online multimodal data. 01/01/2025 - 31/12/2026

Abstract

In an increasingly digital world, we are confronted with an information supply that is not only increasing in scale, but is also becoming increasingly diverse in form. This often involves multimodal data, in which text, audio, image and video go hand in hand. The dominance of multimodal data in the online sphere offers both interesting opportunities and challenges in research disciplines such as computational linguistics and natural language processing (NLP). On the one hand, the diversity of multimodal data allows us to obtain a more complete picture of human communication (for example on social media, forums or on online news platforms), including relatively new forms of communication such as memes, vlogs and podcasts. On the other hand, robust automatic classification methods are required to allow large-scale and efficient analysis of this type of data. Computational linguistics and NLP mainly focus on the automatic processing of text. An important research domain within these disciplines is text classification, with typical examples being sentiment analysis (in which labels such as 'positive', 'negative' or emotion categories are assigned to texts), hate speech detection, topic classification or the detection of fake news. In many cases, only unimodal (text-only) data is collected, which means that a significant part of the data available online is not used. In many other cases, only the textual part in multimodal data (for example, transcribed text in audio/video, or text without images on social media) is included for the automatic analysis. This leads to potentially important information from other modalities being ignored, or having to be manually analysed on a much smaller scale. However, recent developments in machine learning show that integrating multiple modalities can significantly increase the accuracy of classification systems. 
For example, in sentiment analysis on social media, it can be crucial to include not only text but also images in the analysis, and in speech or video analysis, intonation might play an important role. For this postdoc challenge we therefore encourage candidates to prepare a proposal that focuses on research into innovative and robust methodologies for the classification of online multimodal data. The ultimate goal is to enable a more holistic understanding of communication in the digital world, which can lead to improved insights into social interactions and online communication. This research can also contribute to the development of advanced tools for monitoring online content and fostering a healthier digital communication environment.

Researcher(s)

Research team(s)

Project type(s)

  • Research Project