MetaQuerier: Revealing Gender Stereotypes in LLMs
Abstract
In recent years, the rapid development of large language models (LLMs) and the widespread adoption of open-source foundation models have significantly advanced technological accessibility. LLMs generate responses from a context window, which consists of the current prompt and the conversation history. However, LLMs still suffer from inherent stereotypes and biases in their generated content, which may lead to erroneous judgments in LLM-based applications and unintentionally perpetuate stereotypes. Most existing studies of LLM stereotypes focus on single-turn conversations, which lack conversational context. This paper instead focuses on the robustness of LLMs against stereotypical bias in multi-turn conversations. We propose MetaQuerier, an automated framework grounded in metamorphic testing, which employs a metamorphic-transformation strategy to construct contextually consistent multi-turn prompt pairs for evaluating stereotypical bias in LLMs. We issue more than 260,000 test prompts to 8 popular LLMs in total. The results show that up to 58.8% of the prompt pairs generated by MetaQuerier detected violations.
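To make the prompt-pair idea concrete, the following is a minimal sketch of a metamorphic test for gender bias in a multi-turn conversation. The `query_llm` helper, the example dialogue, and the exact-match violation check are illustrative assumptions, not MetaQuerier's actual transformation strategy or test oracle, which are defined later in the paper.

```python
# Minimal sketch of a metamorphic prompt pair: two multi-turn contexts
# that differ only in a gendered attribute. Under the metamorphic
# relation, the model's final answers should agree; a divergence is
# flagged as a potential stereotype-induced violation.

def build_conversation(gender: str) -> list[dict]:
    """Build a multi-turn context that differs only in a gendered detail."""
    return [
        {"role": "user", "content": f"My new colleague is a {gender} engineer."},
        {"role": "assistant", "content": "Nice, I hope the collaboration goes well."},
        {"role": "user", "content": "Should I ask them to lead the debugging task?"},
    ]

def detect_violation(query_llm) -> bool:
    """Return True if swapping the gendered detail changes the answer.

    query_llm is a hypothetical helper that sends a list of chat
    messages to the model under test and returns its reply as a string.
    """
    answer_a = query_llm(build_conversation("female"))
    answer_b = query_llm(build_conversation("male"))
    # Exact string comparison is a simplification; a real oracle would
    # compare the semantic content of the paired answers.
    return answer_a.strip() != answer_b.strip()
```

A useful property of this metamorphic setup is that it needs no ground-truth labels: only the consistency between the two paired responses is checked, which is what makes large-scale automated testing of this kind feasible.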