How can we eliminate bias from AI algorithms? The pen-testing manifesto

· Thomas Wood

Gender bias in credit scoring AI?

In recent weeks a number of Apple Card users in the US have been reporting that they and their partners have been allocated vastly different credit limits on the branded credit card, despite having the same income and credit score (see BBC article). Steve Wozniak, a co-founder of Apple, tweeted that his credit limit on the card was ten times higher than his wife’s, despite the couple having the same credit limit on all their other cards.

The Department of Financial Services in New York, a financial services regulator, is investigating allegations that the users' gender may be the basis of the disparity. Apple is keen to point out that Goldman Sachs is responsible for the algorithm, seemingly at odds with Apple's marketing slogan 'Created by Apple, not a bank'.

Since the regulator’s investigation is ongoing and no bias has yet been proven, I am writing only in hypotheticals in this article.

Bias in AI used in the justice system

The Apple Card story isn’t the only recent example of algorithmic bias hitting the headlines. In July last year the NAACP (National Association for the Advancement of Colored People) in the US signed a statement requesting a moratorium on the use of automated decision-making tools, since some of them have been shown to have racial bias when used to predict recidivism - in other words, how likely an offender is to re-offend.

In 2013, Eric Loomis was sentenced to six years in prison, after the state of Wisconsin used a program called COMPAS to calculate his odds of committing another crime. COMPAS is a proprietary algorithm whose inner workings are known only to its vendor Equivant. Loomis attempted to challenge the use of the algorithm in Wisconsin’s Supreme Court but his challenge was ultimately denied.

A screenshot of the questionnaire that a suspect fills out, which the COMPAS model uses to predict recidivism risk. The COMPAS model has been accused of AI bias.

Unfortunately, incidents such as these only worsen the widely held perception of AI as a dangerous tool: opaque, under-regulated and capable of encoding the worst of society's prejudices.

How can an AI be prejudiced, racist or biased? What went wrong?

I will focus here on the example of a loan application, since it is a simpler problem to frame and analyse, but the points I make are generalisable to any kind of bias and protected category.

I would like to point out first that I strongly doubt that anybody at Apple or Goldman Sachs has sat down and created an explicit set of rules that take gender into account for loan decisions.

Let us first of all imagine that we are creating a machine learning model which predicts the probability of a person defaulting on a loan. There are a number of ‘protected categories’, such as gender, which we are not allowed to discriminate on.

Developing and training a loan decision AI is that kind of ‘vanilla’ data science problem that routinely pops up on Kaggle (a website that lets you participate in data science competitions) and which aspiring data scientists can expect to be asked about in job interviews. The recipe to make a robot loan officer is as follows:

Imagine you have a large table of 10,000 rows, each describing a loan applicant that your bank has seen in the past:

| age | income | credit score | gender | education level | number of years at employer | job title | did they default? |
|-----|--------|--------------|--------|-----------------|-----------------------------|-----------|-------------------|
| 38  | 28000  | 460          | M      | BSc             | 2                           | Nurse     | No                |


The final column is what we want to predict.

You would take this data, and split the rows into three groups, called the training set, the validation set and the test set.

You then pick a machine learning algorithm, such as logistic regression, random forests or neural networks, and let it 'learn' from the training rows without letting it see the validation rows. You then test it on the validation set. You rinse and repeat for different algorithms, tweaking each one, and the model you eventually deploy is the one that scored highest on your validation rows.

When you have finished you are allowed to test your model on the test dataset and check its performance.
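
To make the recipe concrete, here is a minimal sketch in Python with pandas and scikit-learn. The file name loans.csv and the column names are hypothetical stand-ins for the table above, and the 60/20/20 split and choice of algorithms are purely illustrative.

```python
# A minimal sketch of the 'robot loan officer' recipe. The file loans.csv and the
# column names below are hypothetical stand-ins for the table shown earlier.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

df = pd.read_csv("loans.csv")
X = pd.get_dummies(df.drop(columns=["did_they_default"]))   # one-hot encode categoricals
y = df["did_they_default"].map({"Yes": 1, "No": 0})

# Split the rows into training, validation and test sets (here 60/20/20).
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Try a few algorithms and keep whichever scores highest on the validation set...
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
val_scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    val_scores[name] = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
best_name = max(val_scores, key=val_scores.get)

# ...and only at the very end check the chosen model once on the held-out test set.
best_model = candidates[best_name]
print(best_name, roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1]))
```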

The fallacy of removing a column and expecting bias to disappear from the AI

Now, obviously, if the 'gender' column were present in the training data, there would be a risk of building a biased model.

However the Apple/Goldman data scientists probably removed that column from their dataset at the outset.

So how can the digital money lender still be gender biased? Surely there’s no way for our algorithm to be sexist, right? After all it doesn’t even know an applicant’s gender!

Unfortunately and counter-intuitively, it is still possible for bias to creep in!

There might be information in our dataset that is a proxy for gender. For example: tenure in current job, salary and especially job title could all correlate with our applicant being male or female.

If it’s possible to train a machine learning model on your sanitised dataset to predict the gender with any degree of accuracy, then you are running the risk of your model accidentally being gender biased. Your loan prediction model could learn to use the implicit hints about gender in the dataset, even if it can’t see the gender itself.
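
One quick way to look for such proxies is to measure how much information each remaining column carries about the removed gender column. The sketch below uses scikit-learn's mutual information estimate; the file and column names are hypothetical.

```python
# A rough check for proxy features: how much information does each remaining
# column carry about the removed gender column? File and column names are hypothetical.
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

df = pd.read_csv("loans.csv")
gender = df["gender"]                                          # kept aside for this check only
X = pd.get_dummies(df.drop(columns=["gender", "did_they_default"]))

mi = mutual_info_classif(X, gender, random_state=0)
proxy_report = pd.Series(mi, index=X.columns).sort_values(ascending=False)
print(proxy_report.head(10))   # high-scoring features are potential gender proxies
```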

A manifesto for unbiased AI

I would like to propose an addition to the workflow of AI development: we should attack our AI from different angles, attempting to discover any possible bias, before deploying it.

It’s not enough just to remove the protected categories from your dataset, dust off your hands and think ‘job done’.

AI bias pen-test

We also need to play devil’s advocate when we develop an AI, and instead of just attempting to remove causes of bias, we should attempt to prove the presence of bias.

If you are familiar with the field of cyber security, then you will have heard of the concept of a pen-test or penetration test. A person who was not involved in developing your system, perhaps an external consultant, attempts to hack your system to discover vulnerabilities.

I propose that we introduce the AI pen-test: an analogue of the security pen-test, aimed at uncovering and eliminating AI bias.

What an AI pen-test would involve

To pen-test an AI for bias, either an external person, or an internal data scientist who was not involved in the algorithm development, would attempt to build a predictive model to reconstruct the removed protected categories.

So returning to the loan example, if you have scrubbed out the gender from your dataset, the pen-tester would try his or her hardest to make a predictive model to put it back. Perhaps you should pay them a bonus if they manage to reconstruct the gender with any degree of accuracy, reflecting the money you would otherwise have spent on damage control, had you unwittingly shipped a sexist loan prediction model.
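
As a rough illustration of what the pen-tester might run, the sketch below trains a classifier to reconstruct the removed gender column from the sanitised features and reports its cross-validated AUC. The file and column names are hypothetical, and the choice of gradient boosting is arbitrary.

```python
# A sketch of the AI bias pen-test: try to reconstruct the removed gender column
# from the sanitised features. File and column names are hypothetical, and the
# choice of gradient boosting is arbitrary.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("loans.csv")
gender = (df["gender"] == "F").astype(int)          # the target the pen-tester tries to recover
X_sanitised = pd.get_dummies(df.drop(columns=["gender", "did_they_default"]))

# Cross-validated AUC of the gender-reconstruction model. An AUC near 0.5 means
# gender is hard to recover from the sanitised data; an AUC well above 0.5 means
# proxies are present and the loan model could be implicitly gender biased.
auc = cross_val_score(GradientBoostingClassifier(random_state=0),
                      X_sanitised, gender, cv=5, scoring="roc_auc").mean()
print(f"Pen-test gender reconstruction AUC: {auc:.2f}")
```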

Further AI bias stress tests

In addition to the pen-test above, I suggest the following further checks:

  • Segment the data by gender and evaluate the accuracy of the model for each gender.

  • Identify any tendency to over- or under-estimate the probability of default for either gender.

  • Identify any difference in model accuracy by gender (see the sketch after this list).
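
Here is a minimal sketch of these checks, assuming a fitted model best_model, a held-out test set (X_test, y_test) as pandas objects, and a Series gender_test with the corresponding genders kept aside purely for evaluation. All of these names are hypothetical.

```python
# A sketch of the stress tests above, assuming a fitted model `best_model`, a held-out
# test set (X_test, y_test) and a pandas Series `gender_test` with the corresponding
# genders, kept aside purely for evaluation. All names are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

probs = best_model.predict_proba(X_test)[:, 1]
eval_df = pd.DataFrame({"gender": gender_test.values,
                        "actual": y_test.values,
                        "predicted_prob": probs})

rows = []
for g, group in eval_df.groupby("gender"):
    rows.append({
        "gender": g,
        "n": len(group),
        # Difference in model accuracy by gender (here measured as AUC per segment)
        "auc": roc_auc_score(group["actual"], group["predicted_prob"]),
        # Tendency to over- or under-estimate default risk for this gender
        "mean_predicted_default": group["predicted_prob"].mean(),
        "actual_default_rate": group["actual"].mean(),
    })

print(pd.DataFrame(rows))
```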

Further measures

I have not covered some of the more obvious causes of AI bias. For example it is possible that the training data itself is biased. This is highly likely in the case of some of the algorithms used in the criminal justice system.

What to do if you have discovered a bias?

Let’s assume that you have discovered that the algorithm you have trained does indeed exhibit a bias for a protected category such as gender. Your options to mitigate this are:

  • If the pen-test showed that another input parameter, such as job title, is serving as a proxy for gender, you can remove it, obfuscate its gender-related aspects, or sanitise the data further until the pen-tester is unable to reconstruct the gender.
  • You can reverse engineer the result of the pen-test to artificially morph your training data until the gender is no longer discoverable.
  • You can manually correct the inner workings of your model to compensate for the bias.
  • You can check your training table for bias. If your AI is learning from biased data then we cannot expect it to be unbiased.
  • If your predictions are less accurate for females than for males, it is likely that you have, for example, more training data for men than for women. In these cases you can use data augmentation: you duplicate every female entry in your data until your training dataset is balanced (see the sketch after this list).
  • You can also go out of your way to collect extra training data for underrepresented categories.
  • You can try to make your model explainable and identify where the bias is creeping in. If you are interested in going into more detail about machine learning explainability, I invite you to also read my earlier post about explainable AI.
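
As an illustration of the data augmentation point, the sketch below oversamples the underrepresented gender in a hypothetical training file until the genders are balanced, then drops the gender column again before training. File and column names are hypothetical.

```python
# A sketch of the data augmentation idea: oversample the underrepresented gender in
# the training data until the genders are balanced, then drop the gender column again
# before training. File and column names are hypothetical.
import pandas as pd

train_df = pd.read_csv("loans_train.csv")
majority_size = train_df["gender"].value_counts().max()

balanced_parts = []
for g, group in train_df.groupby("gender"):
    # Sample with replacement so every gender ends up with the same number of rows.
    balanced_parts.append(group.sample(n=majority_size, replace=True, random_state=0))

balanced_train = pd.concat(balanced_parts).sample(frac=1, random_state=0)   # shuffle
X_balanced = pd.get_dummies(balanced_train.drop(columns=["gender", "did_they_default"]))
y_balanced = balanced_train["did_they_default"].map({"Yes": 1, "No": 0})
```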

An aside… bias in recruitment?

One application of this approach that I would be interested in investigating further is how to eliminate bias if you are using machine learning for recruitment. Imagine you have an algorithm matching CVs to jobs. If it inadvertently spots gaps in people's CVs that correspond to maternity leave, and therefore to gender, we run the risk of a discriminatory AI. I imagine this could be compensated for by some of the above suggestions, such as tweaking the training data and artificially removing this kind of signal. I think that the pen-test would be a powerful tool for this challenge.

How can companies avoid bias re-appearing?

Today, large companies are very much aware of the potential for bad PR to go viral, so if the Apple Card algorithm is indeed biased, I am surprised that nobody checked it more thoroughly before shipping it.

A credit limit differing by a factor of ten depending on gender is an egregious error.

Had the data scientists involved in the loan algorithm, or indeed the recidivism prediction algorithm used by the state of Wisconsin, followed my checklist above for pen-testing and stress testing their algorithms, I imagine they would have spotted the PR disaster before it had a chance to make headlines.

Of course it is easy to point fingers after the fact, and the field of data science in big industry is as yet in its infancy. Some would call it a Wild West of under-regulation.

I think we can also be glad that some conservative industries such as healthcare have not yet adopted AI for important decisions. Imagine the fallout if a melanoma-analysing algorithm, or amniocentesis decision making model, turned out to have a racial bias.

For this reason I would strongly recommend that large companies releasing algorithms into the wild to take important decisions set up a separate team of data scientists whose job is not to develop algorithms, but to pen-test and stress test them.

The data scientists developing the models are under too much time pressure to be able to do this themselves, and as the cybersecurity industry has discovered through years of experience, sometimes it is best to have an external person play devil’s advocate and try to break your system.

