Large language models (LLMs) have proven themselves not only in tasks that require understanding and generating text, but also as decision-making tools. Machine intelligence is expected to be free of emotion, to reason "with a cool head", and to make balanced decisions. Such qualities are essential in areas where human lives are at stake: medicine, law, and other fields.
On the other hand, the ability of AI systems to capture how humans behave under the influence of emotions is also valuable, as it allows social and economic processes to be modeled. Training models to make decisions that are consistent with human preferences, ethical norms, and expectations is called alignment.
Large language models are trained on vast datasets generated by humans, and thus they absorb certain emotional patterns that can be surfaced through carefully tailored prompts. However, how accurately this emulation works and how to assess its correctness remain open questions.
Researchers from AIRI and their colleagues from several Russian and foreign institutes have taken an important step in this direction by conducting a series of experiments with several popular LLMs, placing them in game-theoretic and ethical scenarios where emotions play a significant role. The authors selected five basic emotions (anger, happiness, fear, sadness, and disgust) and developed several ways to induce these emotions in the models through prompts.
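To make the prompt-based setup concrete, here is a minimal sketch of how an emotional state might be injected into a model's system prompt. It is purely illustrative: the prompt wording, the emotion-to-instruction mapping, and the ask_with_emotion helper are assumptions rather than the authors' actual framework; the sketch only assumes the standard OpenAI Python client.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical emotion-to-instruction mapping; the prompt templates used in
# the paper may be worded differently.
EMOTION_PROMPTS = {
    "anger":     "You are feeling intense anger right now.",
    "happiness": "You are feeling genuine happiness right now.",
    "fear":      "You are feeling strong fear right now.",
    "sadness":   "You are feeling deep sadness right now.",
    "disgust":   "You are feeling strong disgust right now.",
}

def ask_with_emotion(emotion: str, task: str, model: str = "gpt-3.5-turbo") -> str:
    """Prepend an emotional state to the system prompt and query the model."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": EMOTION_PROMPTS[emotion]},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content
```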
The experiments covered commercial models (GPT-4, GPT-3.5, Claude) and open models (LLaMA-2, OpenChat, GigaChat). The researchers tried different game configurations and also took into account the language predominantly used to train each model. As scenarios, they chose ethical tasks of varying complexity, bargaining games, and repeated multi-player social games.
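As one example of how such a scenario could be posed to a model, the sketch below continues the previous one with a classic bargaining setup, the ultimatum game. The task wording and the loop over emotions are illustrative assumptions; the exact scenario descriptions and evaluation protocol in the paper may differ.

```python
# A simple bargaining scenario (the ultimatum game) phrased as a prompt.
# The wording is illustrative, not taken from the paper.
ULTIMATUM_TASK = (
    "You and another player must split 100 dollars. "
    "You propose how to divide the money; the other player can accept "
    "(each side gets the proposed share) or reject (both get nothing). "
    "How much do you offer to the other player? Answer with a number from 0 to 100."
)

# Query the model once per induced emotion and compare the offers.
for emotion in ["anger", "happiness", "fear", "sadness", "disgust"]:
    offer = ask_with_emotion(emotion, ULTIMATUM_TASK)
    print(f"{emotion}: {offer}")
```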
Figure: Scheme of the framework for emotion modeling in LLMs.
The experiments revealed a number of interesting patterns in how emotions influence LLMs. For instance, medium-sized models such as GPT-3.5 showed the closest alignment with human behavior, while larger models such as GPT-4 tended to choose near-optimal strategies and were less swayed by emotions. The authors also found that the language predominantly used to train a multilingual LLM plays an important role in its ability to reproduce human decisions: switching to a model's secondary languages degrades the quality of alignment.
More details about the experimental results are presented in a paper submitted to the NeurIPS 2024 conference. The code is open source and available on GitHub.