Gender Bias in Artificial Intelligence
Word Embeddings
All contents extracted from: Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam T. Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems. 4349–4357.
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger faces us with word embeddings, a popular framework for representing text data as vectors that has been used in many machine learning and natural language processing tasks. This raises concerns because, as we describe, their widespread use often tends to amplify these biases.
Word embeddings, trained only on word co-occurrence in text corpora,
serve as a dictionary of sorts for computer programs that would like to use
word meaning.
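To make this dictionary-like role concrete, here is a minimal sketch of querying a pretrained embedding. It assumes the gensim library and its downloadable "glove-wiki-gigaword-50" vectors; the paper itself studies the word2vec embedding trained on Google News, but any pretrained embedding illustrates the same lookup.

```python
# A minimal sketch, assuming gensim and its public "glove-wiki-gigaword-50"
# vectors (an assumption; the paper uses word2vec Google News embeddings).
import gensim.downloader as api

# Each word maps to a dense vector learned from co-occurrence statistics.
vectors = api.load("glove-wiki-gigaword-50")

print(vectors["programmer"].shape)                 # (50,): one 50-dim vector per word
print(vectors.similarity("man", "woman"))          # cosine similarity between two words
print(vectors.most_similar("programmer", topn=5))  # nearest neighbors in vector space
```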
Word embeddings contain biases in their geometry that reflect gender stereotypes present in broader society. Due to their widespread use as basic features, word embeddings not only reflect such stereotypes but can also amplify them. This poses a significant risk and challenge for machine learning and its applications.
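One way to glimpse this geometric bias is to project words onto a gender direction. The sketch below approximates that direction with the single difference she − he; this is a simplification of the paper's construction, which builds the direction via PCA over several gendered word pairs, and the word list and GloVe vectors here are illustrative assumptions.

```python
# A simplified sketch of the gender-direction projection: the paper derives
# the direction by PCA over many gendered pairs; she - he is a stand-in here.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# Normalized difference vector: points from "he" toward "she".
g = vectors["she"] - vectors["he"]
g /= np.linalg.norm(g)

# Projection onto g: positive values lean toward "she", negative toward "he".
for word in ["programmer", "homemaker", "nurse", "engineer", "doctor"]:
    v = vectors[word] / np.linalg.norm(vectors[word])
    print(f"{word:12s} {float(v @ g):+.3f}")
```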
Stereotypes are biases that are widely held among a group of people. We show that the biases in word embeddings are in fact closely aligned with the social conception of gender stereotypes.
Analogies are a useful way to evaluate both the quality of a word embedding and the stereotypes it encodes.
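An analogy such as man : king :: woman : ? can be posed as vector arithmetic, and the same mechanism surfaces stereotyped completions such as the paper's titular man : computer programmer :: woman : homemaker. A sketch, again assuming the GloVe vectors above rather than the paper's exact embedding:

```python
# Analogy via vector offset: find the word closest to  king - man + woman.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Typically returns [('queen', ...)], showing how analogies fall out of the geometry.
```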