We will be using the dataset at https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv. Reading it in:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn import linear_model
from sklearn import preprocessing
import learningmachine as lm
import warnings
warnings.filterwarnings('ignore')
np.random.seed(0)
# read in data as Pandas dataframe
df_in = pd.read_csv("https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv")
df_in.head()
In preparation for upcoming analysis, we'll also change categorical variables (sex
and c_charge_degree
) to numerical labels.
lm.label_encode(df_in, 'sex')
lm.label_encode(df_in, 'c_charge_degree')
Convert Pandas dataframe to list of lists.
compas = lm.df_to_list(df_in)
# inspect amount of data (number of defendants)
len(compas)
First, we'll split our data into train, validation, and test sets using a 70:15:15 split ratio.
idx = list(range(len(compas)))
np.random.shuffle(idx)
train_size = int(.7*len(compas))
valid_size = int(.15*len(compas))
test_size = int(.15*len(compas))
compas_train = [compas[i] for i in idx[:train_size]]
compas_valid = [compas[i] for i in idx[train_size+1:train_size+valid_size]]
compas_test = [compas[i] for i in idx[train_size+valid_size+1:train_size+valid_size+test_size+1]]
PART 1: COMPARING THE SCORES OF BLACK AND WHITE DEFENDANTS
We first explore if white and black defendants get the same COMPAS scores.
# data for black defendants
compas_train_b = [row for row in compas_train if 'African-American' in row]
compas_valid_b = [row for row in compas_valid if 'African-American' in row]
# data for white defendants
compas_train_w = [row for row in compas_train if 'Caucasian' in row]
compas_valid_w = [row for row in compas_valid if 'Caucasian' in row]
Since our lists no longer contain column names, use a function in the learningmachine
module that will allow us to index into our list using the original column names.
index_feature_list = lm.columnname_to_index(df_in)
def feature_ind(feat_name):
"""
Take feature name and return relevant index within list.
"""
for row in index_feature_list:
if feat_name == row[1]:
return row[0]
# list of decile scores for black defendants in training data
decile_scores_b = [x[feature_ind('decile_score')] for x in compas_train_b]
# list of decile scores for white defendants in training data
decile_scores_w = [x[feature_ind('decile_score')] for x in compas_train_w]
# overlapping histograms of decile scores of black and white defendants
plt.hist(decile_scores_b, density=True, alpha=0.5, label='black')
plt.hist(decile_scores_w, density=True, alpha=0.5, label='white')
plt.title("Score Distributions of Black and White Defendants")
plt.xlabel('Decile score')
plt.ylabel('Density')
plt.legend(loc="upper right")
plt.show()
For African-American defendants, the distribution of the scores is approximately uniform. For Caucasian defendants, many more get low scores than high scores.
PART 2: INITIAL EVALUATION OF THE COMPAS SCORES
Here, we are computing the false positive rate (FPR), false negative rate (FNR), and correct classification rate (CCR) for different populations. First, we'll define functions to compute the quantities needed.
def getFPR(data, thr):
"""
Return false positive rate for COMPAS data data, using
thr as the threshold on the decile score
Keyword arguments:
data -- dataset containing data with features and outcome
thr -- threshold
"""
false_positives = 0
total_negatives = 0
for row in data:
if row[feature_ind('decile_score')] >= thr and row[feature_ind('is_recid')] == 0:
false_positives += 1
if row[feature_ind('is_recid')] == 0:
total_negatives += 1
return false_positives/total_negatives
def getFNR(data, thr):
"""
Return false negative rate for COMPAS data data, using
thr as the threshold on the decile score
Keyword arguments:
data -- dataset containing data with features and outcome
thr -- threshold
"""
false_negatives = 0
total_positives = 0
for row in data:
if row[feature_ind('decile_score')] < thr and row[feature_ind('is_recid')] == 1:
false_negatives += 1
if row[feature_ind('is_recid')] == 1:
total_positives += 1
return false_negatives/total_positives
def getCCR(data, thr):
"""
Return correct classification rate for COMPAS data data, using
thr as the threshold on the decile score
Keyword arguments:
data -- dataset containing data with features and outcome
thr -- threshold
"""
correctly_classified = 0
for row in data:
if row[feature_ind('decile_score')] >= thr and row[feature_ind('is_recid')] == 1:
correctly_classified += 1
if row[feature_ind('decile_score')] < thr and row[feature_ind('is_recid')] == 0:
correctly_classified += 1
return correctly_classified/len(data)
Given a threshold of 5, we can compute the scores.
thr = 5
fps = [getFPR(compas_valid_b, thr), getFPR(compas_valid_w, thr), getFPR(compas_valid, thr)]
fns = [getFNR(compas_valid_b, thr), getFNR(compas_valid_w, thr), getFNR(compas_valid, thr)]
ccr = [getCCR(compas_valid_b, thr), getCCR(compas_valid_w, thr), getCCR(compas_valid, thr)]
print('FPR for black, white, and all defendants:', fps)
print()
print('FNR for black, white, and all defendants:', fns)
print()
print('CCR for black, white, and all defendants:', ccr)
We can see that the scores do not satisfy false positive parity and do not satisfy false negative parity. The scores do satisfy classification parity. Demographic parity is also not satisfied.
PART 3: ALTERING THE THRESHOLD
We will now see how changing the threshold influences the false positive, false negative, and correct classification rates.
def getRates(data, thr):
"""
Return list containing FPR, FNR, and CCR.
Keyword arguments:
data -- dataset containing data with features and outcome
thr -- threshold
"""
return [getFPR(data, thr), getFNR(data, thr), getCCR(data, thr)]
# list of thresholds
thrs = list(range(0,10,1))
# make lists of rates for white defendants, black defendants, and all defendants
rates_w = [getRates(compas_valid_w, thr) for thr in thrs]
rates_b = [getRates(compas_valid_b, thr) for thr in thrs]
rates_all = [getRates(compas_valid, thr) for thr in thrs]
def PlotRates(rates_data, thrs, title):
"""
Plot rate (FPR, FNR, CCR) on y axis and threshold on x axis.
Keyword arguments:
rates_data -- list of previously calculated rates (FPR, FNR, CCR)
thrs -- list of thresholds
title -- title of figure ('white defendants', 'black defendents', or 'all defendents')
"""
plt.plot(thrs, [x[0] for x in rates_data], label='FPR')
plt.plot(thrs, [x[1] for x in rates_data], label='FNR')
plt.plot(thrs, [x[2] for x in rates_data], label='CCR')
plt.xlim(1, 9)
plt.title(title)
plt.xlabel('threshold')
plt.ylabel('rate')
plt.legend(loc="upper right")
plt.show()
PlotRates(rates_w, thrs, "white defendants")
PlotRates(rates_b, thrs, "black defendants")
PlotRates(rates_all, thrs, "all defendants")
PART 4: TRYING TO REPRODUCE THE SCORE
Before we fit the model, let's split our datasets into predictors (x) and outcome (y). We'll build a function called get_x_y_split
to split the data into x and y components.
def get_x_y_split(data, predictors):
"""
Split data into x and y components. x will contain data corresponding to predictors of interest.
Keyword arguments:
data -- list of lists containing data
predictors -- list containing predictors of interest
"""
feats_inds = []
x = []
y = []
for feat in predictors:
feats_inds.append(feature_ind(feat))
for defendant_data in data:
x.append([defendant_data[i] for i in feats_inds])
y.append(defendant_data[feature_ind('is_recid')])
return x, y
We'll create our datasets to include two predictors: age
and priors_count
.
predictors = ['age', 'priors_count']
compas_train_x, compas_train_y = get_x_y_split(compas_train, predictors)
We'll now use the learning_machine
function to make our model.
my_model = lm.learning_machine(compas_train_x, compas_train_y, predictors)
We'll check out the coefficients of our model.
my_model.coefs
An increase of 1 in the number of priors is associated with an increase of 0.17 in the log-odds of recidivism, all other things being equal
An increase in age by one year corresponds to a decrease of 0.05 in the log-odds of recidivism
(If we are being a bit silly and extrapolate) according to the model, a newborn with no priors would have a probability of $\sigma(1.04) = 0.74$ of being re-arrested.
Let's now obtain the FPR, FNR, and CCR for our model, using the threshold 0.5.
We'll use the predict
function to generate the model's predictions.
Now let's build our scoring functions. This time, we have the model as another parameter.
def get_binary_preds(prob_preds, thr):
"""
Return the binary versions of the predictions by
thresholding prob_preds with threshold thr
"""
binary_preds = []
for prob_pred in prob_preds:
binary_preds.append(prob_pred > thr)
return binary_preds
def getFPR_fit(model, x_data, y_data, thr):
"""
Return the false positive rate for predictions
by the model on x_data using threshold thr,
with ground truth data y_data
Keyword arguments:
model -- model generated from the learning machine
x_data -- list of lists containing predictors
y_data -- list of lists containing outcome
thr -- threshold
"""
prob_pred = lm.predict(model, x_data)
pred = get_binary_preds(prob_pred, thr)
false_positives = 0
total_negatives = 0
for i, outcome in enumerate(y_data):
if pred[i] == 1 and outcome == 0:
false_positives += 1
if outcome == 0:
total_negatives +=1
return false_positives/total_negatives
def getFNR_fit(model, x_data, y_data, thr):
"""
Return the false negative rate for predictions
by the model on x_data using threshold thr,
with ground truth data y_data
Keyword arguments:
model -- model generated from the learning machine
x_data -- list of lists containing predictors
y_data -- list of lists containing outcome
thr -- threshold
"""
prob_pred = lm.predict(model, x_data)
pred = get_binary_preds(prob_pred, thr)
false_negatives = 0
total_positives = 0
for i, outcome in enumerate(y_data):
if pred[i] == 0 and outcome == 1:
false_negatives += 1
if outcome == 1:
total_positives +=1
return false_negatives/total_positives
def getCCR_fit(model, x_data, y_data, thr):
"""
Return the correct classification rate rate for predictions
by the model on x_data using threshold thr,
with ground truth data y_data
Keyword arguments:
model -- model generated from the learning machine
x_data -- list of lists containing predictors
y_data -- list of lists containing outcome
thr -- threshold
"""
prob_pred = lm.predict(model, x_data)
pred = get_binary_preds(prob_pred, thr)
correctly_classified = 0
for i, outcome in enumerate(y_data):
if pred[i] == outcome:
correctly_classified += 1
return correctly_classified/len(y_data)
# build relevant datasets
compas_valid_b_x, compas_valid_b_y = get_x_y_split(compas_valid_b, predictors)
compas_valid_w_x, compas_valid_w_y = get_x_y_split(compas_valid_w, predictors)
compas_valid_x, compas_valid_y = get_x_y_split(compas_valid, predictors)
thr = 0.5
fps_fit = [getFPR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getFPR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getFPR_fit(my_model, compas_valid_x, compas_valid_y, thr)]
fns_fit = [getFNR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getFNR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getFNR_fit(my_model, compas_valid_x, compas_valid_y, thr)]
ccr_fit = [getCCR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getCCR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getCCR_fit(my_model, compas_valid_x, compas_valid_y, thr)]
print('FPR for black, white, and all defendants:', fps_fit)
print()
print('FNR for black, white, and all defendants:', fns_fit)
print()
print('CCR for black, white, and all defendants:', ccr_fit)
It appears that there is basically no overfitting.
PART 5: ADJUSTING THRESHOLDS
We basically want to find the thresholds for which the false positive rates are at parity. Let's see what the rates are for different thresholds.
compas_train_b_x, compas_train_b_y = get_x_y_split(compas_train_b, predictors)
compas_train_w_x, compas_train_w_y = get_x_y_split(compas_train_w, predictors)
# list of thresholds
thrs = [0.3, 0.4, 0.5, 0.57, 0.6, 0.7, 0.8]
FP_b = [getFPR_fit(my_model, compas_train_b_x, compas_train_b_y, thr) for thr in thrs]
FP_w = [getFPR_fit(my_model, compas_train_w_x, compas_train_w_y, thr) for thr in thrs]
print('FPR for black defendants:', FP_b)
print()
print('FPR for white defendants:', FP_w)
def PlotFPR(rates_b, rates_w, thrs):
"""
Plot rate (FPR, FNR, CCR) on y axis and threshold on x axis.
Keyword arguments:
thrs -- list of thresholds
rates_b -- FPR list for black defendants
rates_w -- FPR list for white defendants
"""
plt.plot(thrs, rates_b, label="black")
plt.plot(thrs, rates_w, label="white")
plt.title('FPR across thresholds')
plt.xlabel('threshold')
plt.ylabel('FPR')
plt.legend()
We need to tweak the threshold for black defendants just a little:
thrs_detail = [0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59]
FP_b_new = [getFPR_fit(my_model, compas_train_b_x, compas_train_b_y, thr) for thr in thrs_detail]
print('FPR for black defendants for tweaked thresholds:', list(zip(thrs_detail, FP_b_new)))
Let's try to visualize the threshold at which the white and black demographic would be at parity.
PlotFPR(FP_b, FP_w, thrs)
plt.axvline(x=0.5, color="grey", ls="dashed")
plt.axhline(y=0.21, color="grey", ls="dashed")
plt.axvline(x=0.557, color="grey", ls="dashed")
plt.show()
thr = 0.557
seems about right.
Now the white and black demographic would be at parity. We'll compute the correct classification rate on the validation set.
compas_valid_x, compas_valid_y = get_x_y_split(compas_valid, predictors)
getCCR_fit(my_model, compas_valid_x, compas_valid_y, 0.557)
(Note that we ignored everyone who wasn't white or black. That's OK to do, but including other demographics (in any way you like) is OK too).