Bias in machine learning: “We need more diverse teams”

Can AI discriminate? Our MT Engineer Andrada Pumnea explains why bias occurs in machine learning, how it affects people's lives, and what data scientists can do about it.

Lengoo Marketing Team
Jun 16, 2021

Imagine you’re applying for a job, and your CV gets rejected. An algorithm filtered it out before someone even had the chance to look at it—based on your gender, age, or zip code. Is artificial intelligence to blame? The HR team using the machine learning (ML) application for recruiting? Or is it the data scientists who programmed the system? Our Lengoo MT Engineer, Andrada Pumnea, cares deeply about using data processing for good. She explains how bias in the machine learning process occurs and suggests what engineers can do about it.

What is bias in machine learning and why is it an issue?
Essentially, it’s computers learning prejudices and stereotypes. They learn these biases from the data humans feed them, and then reproduce them. The discriminatory output reinforces prejudices and stereotypes and contributes to oppression and social inequality. The phenomenon occurs when humans design and train machine learning models, either through unconscious cognitive bias or real-life prejudices. ML models can’t tell whether the information they get could cause harm in the real world, so they can further marginalize and alienate already marginalized groups.

What does biased AI output look like?
There’s the classic example of Amazon, which tried to automate its recruiting process. The models were trained on CVs that had led to successful hires in the past. But the training data skewed toward men, so the automation simply rejected women’s CVs, or even CVs that included the word “women”, as in “women’s organization.” A more recent example, which surfaced in 2019, was Facebook’s ad-delivery system, which optimized which ads users saw based on race, gender, and religion. This meant that job ads for secretarial work and nursing were shown mainly to women, while taxi driver and janitor jobs were shown mainly to men from minority groups.

Does bias occur at Lengoo when setting up language models for neural machine translation?
The translations we deliver are mainly technical. User manuals for machines or product descriptions don’t leave much room for harmful bias. More importantly, our human translators revise and check all machine-translation output. So if anything did fall through the cracks, they’d catch it.

Why does bias occur?
The goal of a machine learning model is to find patterns in the data we provide. This means generalizing and finding “universal” rules. If the data reflects historical prejudice, those rules encode it, and the model’s predictions will be biased. ML models learn from the past: they’re fed historical data. For example, a model trained on a large corpus of text from the internet can learn an “association bias”. This happens when two words frequently appear in the same context, like “woman-kitchen” or “man-office”. The model learns to associate women with homemaking and men with work, because historically, in the training data, these words tended to appear in the same context. And depending on how such a model is applied in the real world today, it might have unintended consequences that perpetuate harmful stereotypes.
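The association bias Andrada describes can be made concrete with a toy sketch. The sentences and word pairs below are invented for illustration; real training corpora contain billions of words, but the mechanism is the same: co-occurrence counts become learned associations.

```python
from collections import Counter
from itertools import combinations

# Invented toy corpus standing in for a large web-scraped text collection.
corpus = [
    "the woman cooked dinner in the kitchen",
    "the woman cleaned the kitchen",
    "the man worked late at the office",
    "the man took a call at the office",
    "the woman met the man at the office",
]

def cooccurrence_counts(sentences):
    """Count how often each unordered word pair appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        words = set(sentence.split())
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts

counts = cooccurrence_counts(corpus)
# "woman" co-occurs with "kitchen", "man" never does, so a model trained
# on this data absorbs the association as if it were a fact about the world.
print(counts[("kitchen", "woman")])  # 2
print(counts[("kitchen", "man")])    # 0
print(counts[("man", "office")])     # 3
```

Word-embedding models learn a smoothed version of exactly these statistics, which is why they can end up encoding stereotypes present in the text they were trained on.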

So the root of all problems is in the data. How exactly does the ML model become biased?
Bias can creep in at every stage of the process: when you collect and select the data, while you’re labeling it, when you process it. It all affects the outcome. Data collection, for example, introduces bias when the people gathering it don’t realize the data isn’t a representative sample. A scientist developing a model for the human resources department probably won’t have the same level of awareness and knowledge as an HR expert in the field. On the other hand, while processing data, an engineer might remove important information by treating it as “outliers” for the sake of making the model more accurate. That information may have reflected something from real life, so it’s important to consider the consequences before removing it.
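The outlier-removal risk can be shown with a small sketch. The incomes and group labels below are invented for the example; the point is that a routine “drop everything beyond two standard deviations” cleaning step can silently delete an entire underrepresented subgroup.

```python
import statistics

# Invented toy dataset: annual incomes (in thousands) with a group label.
# A historical pay gap puts the minority group's incomes far below the mean.
records = [("majority", v) for v in
           [47, 47, 48, 48, 49, 49, 50, 50, 50, 50, 50, 50,
            51, 51, 52, 52, 53, 53]]
records += [("minority", 20), ("minority", 22)]

incomes = [income for _, income in records]
mean = statistics.mean(incomes)
stdev = statistics.pstdev(incomes)

# A common "cleaning" step: drop everything more than two standard
# deviations from the mean.
cleaned = [(g, v) for g, v in records if abs(v - mean) <= 2 * stdev]

def share(rows, group):
    """Fraction of rows belonging to the given group."""
    return sum(1 for g, _ in rows if g == group) / len(rows)

print(share(records, "minority"))  # 0.1 -> 10% of the raw data
print(share(cleaned, "minority"))  # 0.0 -> the whole subgroup is gone
```

The model trained on `cleaned` would never see the minority group at all, which is exactly the kind of consequence worth checking before discarding “outliers.”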

"We need to ask: who am I impacting with this technology?"

You’re passionate about using data for good. What's your first-hand experience with bias in data labeling?
I was working on an open-source project developing a model for hate speech detection on Facebook comments. The experience was really valuable and showed me that bias starts early, when deciding what makes good data. I was labeling my own data and creating my own gold standard for labeling, but this alone was biased since I was working by myself. Ideally, you wouldn’t rely on just one source of truth. You’d have a diverse team of at least three to five people labeling the data sets. Then you’d have majority votes, and a group of domain experts would check the data and labels. For this project, I had to consider how different research groups define and label hate speech, how to write guidelines for labeling the data consistently, and how the model would impact people in a real-life situation. I learned a lot about the challenges and pitfalls of bias when using ML for social good.
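The majority-vote process Andrada describes can be sketched in a few lines. The function name and the labels are hypothetical; the key design choice is that items without a clear majority are flagged for the domain experts rather than silently resolved.

```python
from collections import Counter

def majority_label(annotations, min_agreement=2):
    """Resolve one item's label from several annotators by majority vote.

    Returns (label, needs_review). Items with no clear majority, or too
    few agreeing votes, are flagged for review by a domain expert.
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    tied = sum(1 for c in counts.values() if c == votes) > 1
    needs_review = votes < min_agreement or tied
    return label, needs_review

# Three hypothetical annotators labeling the same comment:
print(majority_label(["hate", "hate", "neutral"]))       # ('hate', False)
print(majority_label(["hate", "neutral", "offensive"]))  # flagged for review
```

With more annotators per item, the same idea extends to requiring a higher `min_agreement` threshold before a label counts as settled.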

Is there a solution for the bias issue in machine learning?
There's no definitive answer or ultimate solution. For sure, both individuals and especially companies need to be held accountable for the way they collect, process and use data to ensure that their practices are ethical. We have to keep educating ourselves and challenge assumptions. We need to ask: who am I impacting with this technology? What do we consider the gold standard? What's the “default”? There are techniques for mitigating bias in all steps of the process, but it all starts from being mindful and being aware of our decisions when creating a model.

What can companies do to ensure more fairness in their ML models?
They can build diverse teams with a good mix of domain experts, ML engineers, user researchers, and designers. That mix is a necessity when designing a machine learning system that many people will use. It’s about getting as many perspectives on the data as possible and creating a genuinely valuable product. The more diverse the team, the better the chance they’ll pick up on harmful outcomes. The most important thing is studying and understanding all the potential users and how automatic predictions impact them. In general, making fairness a requirement rather than an afterthought, and investing in educating people on ethics, is essential when going further with ML.

Why is it so important to be aware of bias in ML?
Technology touches so many people's lives at once. Its effect has only increased over the last two years, when pretty much everything moved online. When ML models are biased and reproduce stereotypes, the effect amplifies. We need to be mindful in ML. We need to understand the user, the system, and how it might backfire. Also, having a human in the loop, as we do at Lengoo, is a good idea, especially when you’re dealing with sensitive information like CVs or financial history. An ML prediction can change someone’s life. If we want to prevent ML applications from reproducing inequality, we need to be extremely careful, and maybe we shouldn’t fully automate certain processes that make life-changing decisions. As with all things AI, we should always ask the question: Yes, you can build it—but should you?