
Bias and Fairness in Machine Learning: Identifying Bias

Project Information

machine-learning, natural-language-processing
Project Status: In Progress
Project Region: CAREERS
Submitted By: Ahmed Rashed
Project Email: amrashed@ship.edu
Anchor Institution: CR-Penn State
Project Address: 6127 Galleon Dr
Mechanicsburg, Pennsylvania 17050

Mentors: Pranav Venkit
Students: Abdelkrim Kallich

Project Description

As artificial intelligence (AI) systems and applications become widespread in everyday life, accounting for fairness has become increasingly important in the design and engineering of such systems. AI systems are used in many sensitive settings to make important, life-changing decisions, so it is crucial to ensure that these decisions do not discriminate against particular groups or populations. Recent work in traditional machine learning and deep learning addresses these challenges in different subdomains. As these systems are commercialized, researchers are becoming more aware of the biases they can contain and are attempting to address them.
In industry, it has become critical to build fair ML models that respect the groups defined by legally protected sensitive features and that do not favor some groups over others. Bias can appear in dataset sampling or in model performance for protected groups or individuals. It is therefore important to establish a bias analysis system that identifies and mitigates bias in both the dataset and model performance, with respect to both group and individual fairness.
Several fairness libraries are available for this task. In industry, the fairness libraries used for bias analysis must come from well-known organizations, and large companies such as Microsoft, IBM, and Google have created such libraries. The goal of this project is to compare the fairness libraries that can be used in industry and to work through a use case on a published dataset.
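
As a rough illustration of the two places bias can appear (dataset sampling and model performance across groups), the sketch below computes group shares in a dataset and two common group-fairness quantities for a model's predictions: the demographic parity difference and the disparate impact ratio. It uses plain Python rather than any of the libraries the project will compare, and the function names, the toy predictions, and the group labels "A" and "B" are all hypothetical and not taken from the project.

```python
from collections import Counter, defaultdict


def group_shares(groups):
    """Fraction of rows belonging to each group (a dataset-sampling check)."""
    counts = Counter(groups)
    total = len(groups)
    return {g: n / total for g, n in counts.items()}


def selection_rates(y_pred, groups):
    """Fraction of positive (== 1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, g in zip(y_pred, groups):
        totals[g] += 1
        positives[g] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(y_pred, groups):
    """Lowest selection rate divided by the highest (1 means parity)."""
    rates = selection_rates(y_pred, groups)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group receives positive predictions, so rates are equal
    return min(rates.values()) / highest


# Hypothetical toy data: binary predictions and one sensitive feature.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

shares = group_shares(groups)
rates = selection_rates(y_pred, groups)
dp_diff = demographic_parity_difference(y_pred, groups)
di_ratio = disparate_impact_ratio(y_pred, groups)

print("group shares:", shares)
print("selection rates:", rates)
print("demographic parity difference:", dp_diff)
print("disparate impact ratio:", di_ratio)
```

In this toy data both groups are equally represented, so the sampling check shows no imbalance, yet group "A" receives positive predictions three times as often as group "B". These are group-level measures only; individual fairness, which the description also mentions, needs a different kind of check.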
