The Visual Language of Emotions: Vision-Language Models and Double Machine Learning for Robust Causal Estimation of Image Emotions on Engagement
Masoud Salehi, Mohammad Hakimi, Zhu Zhang
Abstract

This study investigates the causal impact of image emotions on user engagement in social media by leveraging vision-language models (VLMs) and double machine learning (DML). We develop a novel image emotion recognition model tailored for the social media context, addressing both measurement limitations and causal inference challenges. Using a dataset of over 1 million Instagram posts, we extract a comprehensive set of image attributes, including low-level pixel features, objects, scenes, actions, and aesthetic features, to control for high-dimensional confounders. Our findings reveal that Anger, Fear, Sadness, and Surprise significantly increase engagement, while Disgust reduces it, and surprisingly, Joy shows no significant effect. By incorporating advanced image processing models and robust causal estimation techniques, this study advances the understanding of visual emotions and contributes to affective computing research. These insights provide valuable implications for advertisers, content creators, and platform designers seeking to optimize engagement strategies through emotionally resonant imagery.
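The abstract's core estimation idea, using machine-learned nuisance models with cross-fitting to partial out high-dimensional image-attribute confounders, can be sketched as a minimal DML partialling-out estimator. This is an illustrative sketch on synthetic data, not the authors' implementation: the variable names (`d` for an emotion-intensity treatment, `y` for engagement, `X` for image attributes), the random-forest nuisance learners, and the two-fold cross-fitting are all assumptions for the example.

```python
# Minimal double machine learning (DML) partialling-out sketch.
# Synthetic stand-ins: X = image-attribute confounders, d = emotion
# treatment, y = engagement outcome; theta_true is the effect to recover.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 2000, 20
X = rng.normal(size=(n, p))                        # confounders
d = X[:, 0] + rng.normal(size=n)                   # treatment depends on X
theta_true = 1.5
y = theta_true * d + X[:, 0] + rng.normal(size=n)  # outcome (confounded)

# Cross-fitting: fit nuisance models on one fold, residualize the other,
# so the same observations are never used for both fitting and prediction.
res_y, res_d = np.empty(n), np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    m_y = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], y[train])
    m_d = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], d[train])
    res_y[test] = y[test] - m_y.predict(X[test])
    res_d[test] = d[test] - m_d.predict(X[test])

# Neyman-orthogonal estimate of the treatment effect theta:
# regress outcome residuals on treatment residuals.
theta_hat = (res_d @ res_y) / (res_d @ res_d)
print(theta_hat)
```

The orthogonal score makes the estimate robust to the slow convergence of the machine-learned nuisance functions, which is what motivates pairing DML with flexible VLM-derived features.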
