
Bias and Fairness in Machine Learning

Project Information

Project Status: Reviewing Applicants
Project Region: CAREERS
Submitted By: Ahmed Rashed
Project Email: amrashed@ship.edu
Anchor Institution: CR-Penn State
Project Address: 6127 Galleon Dr
Mechanicsburg, Pennsylvania 17050

Project Description

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in the design and engineering of such systems. AI systems are used in many sensitive settings to make important, life-changing decisions; it is therefore crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently, work in traditional machine learning and deep learning has begun to address these challenges in different subdomains. As these systems are commercialized, researchers are becoming more aware of the biases such applications can contain and are attempting to address them.

In industry, it has become critical to build fair ML models that respect the groups defined by legally protected sensitive features and do not favor some groups over others. Bias can appear in dataset sampling or in model performance with respect to protected groups or individuals. It is therefore important to establish a bias analysis system that identifies and mitigates bias in both the dataset and the model's performance, in terms of group and individual fairness. Several fairness libraries exist for this task; in industry, the libraries used for bias analysis are generally expected to come from well-established organizations, and such libraries are maintained by major companies including Microsoft (Fairlearn), IBM (AIF360), and Google (e.g., Fairness Indicators).
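As an illustration of the kind of group-fairness check such a system performs, the sketch below uses Microsoft's Fairlearn library to report per-group accuracy and the demographic parity difference for a toy classifier. The data, column names, and model are illustrative placeholders, not choices made by this project.

```python
# Minimal sketch of a group-fairness check with Fairlearn.
# Data, column names, and the model are illustrative placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, demographic_parity_difference

df = pd.DataFrame({
    "income": [30, 60, 45, 80, 25, 70, 55, 40],
    "age":    [25, 40, 35, 50, 22, 45, 38, 30],
    "sex":    [0, 1, 0, 1, 0, 1, 1, 0],   # hypothetical protected attribute
    "label":  [0, 1, 0, 1, 0, 0, 1, 1],   # hypothetical outcome
})
X, y, sensitive = df[["income", "age"]], df["label"], df["sex"]

model = LogisticRegression().fit(X, y)
y_pred = model.predict(X)

# Model performance broken down by protected group (group fairness view).
mf = MetricFrame(metrics=accuracy_score, y_true=y, y_pred=y_pred,
                 sensitive_features=sensitive)
print(mf.by_group)

# Demographic parity difference: gap in selection rates between groups.
print("Demographic parity difference:",
      demographic_parity_difference(y, y_pred, sensitive_features=sensitive))
```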

The goal of this project is to compare fairness libraries suitable for industrial use and to work through a use case, written in Python, on a published dataset. The project methodology includes the following steps:

1. Surveying the basics of bias and fairness in machine learning.
2. Selecting published structured and unstructured datasets.
3. Searching for fairness libraries that can be used in industry.
4. Choosing appropriate fairness metrics to identify bias.
5. Discussing possible mitigation algorithms for pre-processing, in-processing, and post-processing (a pre-processing example is sketched after this list).
6. Employing the libraries to identify and mitigate bias in structured and unstructured data as a use case.
7. Discussing the results and summarizing the comparison among the libraries.
8. Deploying a classification or regression model to be used with structured data.
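To make the pre-processing stage of step 5 concrete, the sketch below applies Reweighing from IBM's AIF360 library to a toy tabular dataset and reports the statistical parity difference before and after mitigation. The data, column names, and privileged/unprivileged group definitions are illustrative assumptions, not part of the project plan.

```python
# Sketch of a pre-processing mitigation with IBM's AIF360 (Reweighing).
# Data, column names, and group definitions are illustrative assumptions.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

df = pd.DataFrame({
    "income": [30, 60, 45, 80, 25, 70, 55, 40],
    "sex":    [0, 1, 0, 1, 0, 1, 1, 0],   # hypothetical protected attribute
    "label":  [0, 1, 0, 1, 0, 0, 1, 1],   # hypothetical outcome
})

dataset = BinaryLabelDataset(
    df=df, label_names=["label"], protected_attribute_names=["sex"])

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Dataset-level bias before mitigation (gap in favorable-outcome rates).
before = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("Statistical parity difference (before):",
      before.statistical_parity_difference())

# Reweighing assigns instance weights that balance outcomes across groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_transf = rw.fit_transform(dataset)

after = BinaryLabelDatasetMetric(
    dataset_transf, unprivileged_groups=unprivileged,
    privileged_groups=privileged)
print("Statistical parity difference (after):",
      after.statistical_parity_difference())
```

In-processing mitigations (e.g., constrained optimization in Fairlearn) and post-processing mitigations (e.g., threshold adjustment) offered by these libraries would be compared in the same way in steps 5 through 7.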
