A recent study, ‘Gender Bias in Large Language Models across Multiple Languages,’ published in March of this year, spotlights the prevalence of gender bias in large language models (LLMs) across a variety of applications. LLMs are AI models that use deep learning techniques to summarize, generate, and predict content. The term ‘large’ is not merely descriptive—it is fittingly literal: these models incorporate billions of parameters and are trained on vast amounts of text data. Although gender bias has received significant attention in English-language contexts, particularly within natural language processing (NLP), the field concerned with computational language analysis, research beyond English remains scarce and under-explored, especially in the context of LLM applications.
NLP systems and LLMs have outperformed previous benchmarks on a range of language-based tasks, including standardized tests such as medical school exams, the SAT, the LSAT, and even IQ assessments. Their proficiency underscores their growing influence in fields like education and technology. More significantly, they mold social discourse by influencing public opinion, shaping certain narratives, and affecting societal attitudes through the content they generate and analyze.
The study employed the GPT series of LLMs to analyze outputs in multiple languages based on specific metrics. The findings reveal widespread gender biases across all languages studied, including English, French, Spanish, Chinese, Japanese, and Korean. These biases manifest in how often certain descriptive words co-occur with each gender, in predictions of gender roles based on personal descriptions, and in the underlying sentiment of generated dialogues.
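The first of these metrics can be illustrated with a minimal sketch: count how often descriptive words co-occur with gendered terms in a set of model outputs. The word lists and function below are illustrative only, not the study's actual lexicons or methodology.

```python
from collections import Counter

# Illustrative word lists -- the study's actual lexicons are not reproduced here.
FEMALE_TERMS = {"she", "her", "woman", "mother"}
MALE_TERMS = {"he", "him", "man", "father"}
DESCRIPTORS = {"brilliant", "caring", "ambitious", "gentle"}

def cooccurrence_counts(sentences):
    """Count how often each descriptor appears alongside gendered terms."""
    counts = {"female": Counter(), "male": Counter()}
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for word in DESCRIPTORS & tokens:
            if FEMALE_TERMS & tokens:
                counts["female"][word] += 1
            if MALE_TERMS & tokens:
                counts["male"][word] += 1
    return counts

outputs = [
    "She is a caring mother",
    "He is a brilliant man",
    "She is brilliant and ambitious",
]
print(cooccurrence_counts(outputs))
```

Skewed counts in one direction (e.g. "caring" appearing mostly with female terms) are the kind of discrepancy such metrics surface; real analyses use far larger corpora and more careful tokenization.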
This study underscores the urgent need to address and mitigate biases in AI systems and to ensure fair and equitable representation across diverse linguistic and cultural contexts. As we propel AI technologies forward, inclusivity and fairness must be fundamental to their development and deployment. The pressing question remains: how do we achieve this essential objective?
LLMs are trained on a wide variety of text data sourced from diverse platforms. Once fed into the model, this data is treated as empirical information. Concerns therefore arise when training datasets are biased or contain unrepresentative samples, which can skew the model’s outputs. Large language models can also fall prey to “falsehood mimicry,” inadvertently propagating misinformation instead of correcting it when prompted.
An AI model’s outputs are critically shaped by who builds it and by the biases inherent in its training datasets, both of which can perpetuate, widen, or help close gender equality gaps. According to the World Economic Forum’s 2023 report, women make up only 29% of all STEM workers. Despite more women graduating and entering STEM fields than ever before, women often find themselves concentrated in entry-level roles and remain less likely to ascend to leadership positions. In 2024, the Global Digital Compact (GDC) negotiations, highlighted by UN Women in collaboration with the Action Coalition on Innovation and Technology for Gender Equality, present a unique opportunity to build political momentum and integrate gender perspectives into a new digital governance framework. Without such efforts, AI risks entrenching existing gender gaps, amplifying gender-based discrimination and harm rather than addressing it.
Research teams at Google are currently addressing these problems with LLM training data. They enhance their models through a pre-training phase, during which the model learns from a large corpus of unannotated text in multiple languages, effectively broadening its training data before the model is refined for specific tasks.
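One common way to learn from unannotated text is masked-language-model pre-training: tokens are randomly hidden and the model is trained to predict them, so no human labels are required. The sketch below is a toy illustration of the masking step only (the function name and details are assumptions, not Google's actual pipeline).

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly replace tokens with [MASK]; return masked tokens and targets.

    The model is then trained to predict the original token at each masked
    position. Because no annotation is needed, raw text in any language can
    serve as training data.
    """
    rng = rng or random.Random(0)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # the label is just the original token
        else:
            masked.append(tok)
    return masked, targets

tokens = "the model learns from raw multilingual text".split()
masked, targets = mask_tokens(tokens, mask_prob=0.5)
```

Because the labels come from the text itself, this objective scales to whatever unannotated multilingual data is available, which is what makes the pre-training phase an effective way to expand a model’s training data.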
Will increasing the representation of women in AI workforces and enhancing training data suffice to bridge language-based biases in AI, or are additional strategies necessary to ensure equitable outcomes? The approval of the first United Nations resolution on artificial intelligence by the General Assembly in March of this year underscores a global commitment to ensuring this technology not only benefits all nations but also respects human rights and is ‘safe, secure, and trustworthy.’ In an era where global consensus can be rare, perhaps we are witnessing the beginning of a unified effort to harness AI’s potential for positive change and inclusivity worldwide.
Tamara Mestvirishvili holds an MS in Biology from New York University, focusing on Bioinformatics and Systems Biology. With a BS from City College and post-baccalaureate studies at Columbia University, she is a bioinformatics analyst at NYU Langone Medical Center (NYUMC). She has mentored at the New York Academy of Science and is currently Science Communicator for LifeSci NYC.
