{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Auditing the COMPAS Score: Predictive Modeling and Algorithmic Fairness" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will be using the dataset at [https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv](https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv). Reading it in:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "from sklearn import linear_model\n", "from sklearn import preprocessing\n", "import learningmachine as lm\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "np.random.seed(0)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# read in data as Pandas dataframe\n", "df_in = pd.read_csv(\"https://github.com/propublica/compas-analysis/raw/master/compas-scores-two-years.csv\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<p>5 rows × 53 columns</p>
" ], "text/plain": [ " id name first last compas_screening_date sex \\\n", "0 1 miguel hernandez miguel hernandez 2013-08-14 Male \n", "1 3 kevon dixon kevon dixon 2013-01-27 Male \n", "2 4 ed philo ed philo 2013-04-14 Male \n", "3 5 marcu brown marcu brown 2013-01-13 Male \n", "4 6 bouthy pierrelouis bouthy pierrelouis 2013-03-26 Male \n", "\n", " dob age age_cat race ... v_decile_score \\\n", "0 1947-04-18 69 Greater than 45 Other ... 1 \n", "1 1982-01-22 34 25 - 45 African-American ... 1 \n", "2 1991-05-14 24 Less than 25 African-American ... 3 \n", "3 1993-01-21 23 Less than 25 African-American ... 6 \n", "4 1973-01-22 43 25 - 45 Other ... 1 \n", "\n", " v_score_text v_screening_date in_custody out_custody priors_count.1 \\\n", "0 Low 2013-08-14 2014-07-07 2014-07-14 0 \n", "1 Low 2013-01-27 2013-01-26 2013-02-05 0 \n", "2 Low 2013-04-14 2013-06-16 2013-06-16 4 \n", "3 Medium 2013-01-13 NaN NaN 1 \n", "4 Low 2013-03-26 NaN NaN 2 \n", "\n", " start end event two_year_recid \n", "0 0 327 0 0 \n", "1 9 159 1 1 \n", "2 0 63 0 1 \n", "3 0 1174 0 0 \n", "4 0 1102 0 0 \n", "\n", "[5 rows x 53 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_in.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In preparation for upcoming analysis, we'll also change categorical variables (`sex` and `c_charge_degree`) to numerical labels. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "lm.label_encode(df_in, 'sex')\n", "lm.label_encode(df_in, 'c_charge_degree')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert Pandas dataframe to list of lists." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "compas = lm.df_to_list(df_in)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7214" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# inspect amount of data (number of defendants)\n", "len(compas)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we'll shuffle the data and split it into train, validation, and test sets using an approximately 70:15:15 split ratio." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "idx = list(range(len(compas)))\n", "np.random.shuffle(idx)\n", "\n", "train_size = int(.7*len(compas))\n", "valid_size = int(.15*len(compas)) \n", "test_size = int(.15*len(compas))\n", "\n", "# note: the +1 offsets skip idx[train_size] and idx[train_size+valid_size],\n", "# so compas_valid has valid_size-1 rows and two rows are left unused\n", "compas_train = [compas[i] for i in idx[:train_size]]\n", "compas_valid = [compas[i] for i in idx[train_size+1:train_size+valid_size]]\n", "compas_test = [compas[i] for i in idx[train_size+valid_size+1:train_size+valid_size+test_size+1]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**PART 1: COMPARING THE SCORES OF BLACK AND WHITE DEFENDANTS**\n", "\n", "We first explore whether white and black defendants get the same COMPAS scores." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# data for black defendants (rows in which some field equals 'African-American')\n", "compas_train_b = [row for row in compas_train if 'African-American' in row]\n", "compas_valid_b = [row for row in compas_valid if 'African-American' in row]\n", "\n", "# data for white defendants (rows in which some field equals 'Caucasian')\n", "compas_train_w = [row for row in compas_train if 'Caucasian' in row]\n", "compas_valid_w = [row for row in compas_valid if 'Caucasian' in row]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since our lists no longer contain column names, we use a function in the `learningmachine` module that lets us index into each row using the original column names." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "index_feature_list = lm.columnname_to_index(df_in)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def feature_ind(feat_name):\n", "    \"\"\"\n", "    Take a feature (column) name and return its index within a row list.\n", "    \"\"\"\n", "    for row in index_feature_list:\n", "        if feat_name == row[1]:\n", "            return row[0]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# list of decile scores for black defendants in training data\n", "decile_scores_b = [x[feature_ind('decile_score')] for x in compas_train_b]\n", "\n", "# list of decile scores for white defendants in training data\n", "decile_scores_w = [x[feature_ind('decile_score')] for x in compas_train_w]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# overlapping histograms of decile scores of black and white defendants\n", "plt.hist(decile_scores_b, density=True, alpha=0.5, label='black')\n", "plt.hist(decile_scores_w, density=True, alpha=0.5, label='white')\n", "plt.title(\"Score Distributions of Black and White Defendants\")\n", "plt.xlabel('Decile score')\n", "plt.ylabel('Density')\n", "plt.legend(loc=\"upper right\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For African-American defendants, the distribution of the scores is approximately uniform. For Caucasian defendants, many more get low scores than high scores." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**PART 2: INITIAL EVALUATION OF THE COMPAS SCORES**\n", "\n", "Here, we compute the false positive rate (FPR), false negative rate (FNR), and correct classification rate (CCR) for different populations. First, we'll define functions to compute the quantities needed." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def getFPR(data, thr):\n", "    \"\"\"\n", "    Return the false positive rate for the COMPAS dataset data, using\n", "    thr as the threshold on the decile score\n", "\n", "    Keyword arguments:\n", "    data -- dataset containing data with features and outcome\n", "    thr -- threshold\n", "    \"\"\"\n", "    false_positives = 0\n", "    total_negatives = 0\n", "\n", "    for row in data:\n", "        if row[feature_ind('decile_score')] >= thr and row[feature_ind('is_recid')] == 0:\n", "            false_positives += 1\n", "        if row[feature_ind('is_recid')] == 0:\n", "            total_negatives += 1\n", "\n", "    return false_positives/total_negatives\n", "\n", "\n", "def getFNR(data, thr):\n", "    \"\"\"\n", "    Return the false negative rate for the COMPAS dataset data, using\n", "    thr as the threshold on the decile score\n", "\n", "    Keyword arguments:\n", "    data -- dataset containing data with features and outcome\n", "    thr -- threshold\n", "    \"\"\"\n", "    false_negatives = 0\n", "    total_positives = 0\n", "\n", "    for row in data:\n", "        if row[feature_ind('decile_score')] < thr and row[feature_ind('is_recid')] == 1:\n", "            false_negatives += 1\n", "        if row[feature_ind('is_recid')] == 1:\n", "            total_positives += 1\n", "\n", "    return false_negatives/total_positives\n", "\n", "\n", "def getCCR(data, thr):\n", "    \"\"\"\n", "    Return the correct classification rate for the COMPAS dataset data, using\n", "    thr as the threshold on the decile score\n", "\n", "    Keyword arguments:\n", "    data -- dataset containing data with features and outcome\n", "    thr -- threshold\n", "    \"\"\"\n", "    correctly_classified = 0\n", "\n", "    for row in data:\n", "        if row[feature_ind('decile_score')] >= thr and row[feature_ind('is_recid')] == 1:\n", "            correctly_classified += 1\n", "        if row[feature_ind('decile_score')] < thr and row[feature_ind('is_recid')] == 0:\n", "            correctly_classified += 1\n", "\n", "    return correctly_classified/len(data)" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "Using a threshold of 5 on the decile score, we can compute the three rates on the validation set." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FPR for black, white, and all defendants: [0.38846153846153847, 0.2706422018348624, 0.3120689655172414]\n", "\n", "FNR for black, white, and all defendants: [0.29372937293729373, 0.47058823529411764, 0.37924151696606784]\n", "\n", "CCR for black, white, and all defendants: [0.6625222024866785, 0.652542372881356, 0.6567992599444958]\n" ] } ], "source": [ "thr = 5\n", "\n", "fps = [getFPR(compas_valid_b, thr), getFPR(compas_valid_w, thr), getFPR(compas_valid, thr)]\n", "fns = [getFNR(compas_valid_b, thr), getFNR(compas_valid_w, thr), getFNR(compas_valid, thr)]\n", "ccr = [getCCR(compas_valid_b, thr), getCCR(compas_valid_w, thr), getCCR(compas_valid, thr)]\n", "\n", "print('FPR for black, white, and all defendants:', fps)\n", "print()\n", "print('FNR for black, white, and all defendants:', fns)\n", "print()\n", "print('CCR for black, white, and all defendants:', ccr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On the validation set, the scores satisfy neither false positive parity (FPR of about 0.39 for black defendants versus 0.27 for white defendants) nor false negative parity (FNR of about 0.29 versus 0.47). Classification parity approximately holds (CCR of about 0.66 versus 0.65). Demographic parity is also not satisfied." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**PART 3: ALTERING THE THRESHOLD**\n", "\n", "We will now see how changing the threshold influences the false positive, false negative, and correct classification rates." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "def getRates(data, thr):\n", "    \"\"\"\n", "    Return list containing FPR, FNR, and CCR.\n", "\n", "    Keyword arguments:\n", "    data -- dataset containing data with features and outcome\n", "    thr -- threshold\n", "    \"\"\"\n", "    return [getFPR(data, thr), getFNR(data, thr), getCCR(data, thr)]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# list of thresholds\n", "thrs = list(range(0,10,1))\n", "\n", "# make lists of rates for white defendants, black defendants, and all defendants\n", "rates_w = [getRates(compas_valid_w, thr) for thr in thrs] \n", "rates_b = [getRates(compas_valid_b, thr) for thr in thrs] \n", "rates_all = [getRates(compas_valid, thr) for thr in thrs]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "def PlotRates(rates_data, thrs, title):\n", "    \"\"\"\n", "    Plot rate (FPR, FNR, CCR) on y axis and threshold on x axis.\n", "\n", "    Keyword arguments:\n", "    rates_data -- list of previously calculated rates (FPR, FNR, CCR)\n", "    thrs -- list of thresholds\n", "    title -- title of figure ('white defendants', 'black defendants', or 'all defendants')\n", "    \"\"\"\n", "    plt.plot(thrs, [x[0] for x in rates_data], label='FPR')\n", "    plt.plot(thrs, [x[1] for x in rates_data], label='FNR')\n", "    plt.plot(thrs, [x[2] for x in rates_data], label='CCR')\n", "    plt.xlim(1, 9)\n", "    plt.title(title)\n", "    plt.xlabel('threshold')\n", "    plt.ylabel('rate')\n", "    plt.legend(loc=\"upper right\")\n", "    plt.show()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "PlotRates(rates_w, thrs, \"white defendants\")\n", "PlotRates(rates_b, thrs, \"black defendants\")\n", "PlotRates(rates_all, thrs, \"all defendants\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**PART 4: TRYING TO REPRODUCE THE SCORE**\n", "\n", "Before we fit the model, let's split our datasets into predictors (x) and outcome (y). We'll build a function called `get_x_y_split` to split the data into x and y components." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def get_x_y_split(data, predictors):\n", "    \"\"\"\n", "    Split data into x and y components. x will contain data corresponding to predictors of interest.\n", "\n", "    Keyword arguments:\n", "    data -- list of lists containing data\n", "    predictors -- list containing predictors of interest\n", "    \"\"\"\n", "    feats_inds = []\n", "    x = []\n", "    y = []\n", "\n", "    for feat in predictors:\n", "        feats_inds.append(feature_ind(feat))\n", "\n", "    for defendant_data in data:\n", "        x.append([defendant_data[i] for i in feats_inds])\n", "        y.append(defendant_data[feature_ind('is_recid')])\n", "\n", "    return x, y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll create our datasets to include two predictors: `age` and `priors_count`." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "predictors = ['age', 'priors_count']\n", "compas_train_x, compas_train_y = get_x_y_split(compas_train, predictors)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll now use the `learning_machine` function to make our model." 
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "my_model = lm.learning_machine(compas_train_x, compas_train_y, predictors)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll check out the coefficients of our model." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('age', -0.048346079830393386), ('priors_count', 0.16380407311845432)]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_model.coefs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* An increase of 1 in the number of priors is associated with an increase of 0.16 in the log-odds of recidivism, all other things being equal.\n", "\n", "* An increase in age by one year corresponds to a decrease of 0.05 in the log-odds of recidivism, all other things being equal.\n", "\n", "* (If we are being a bit silly and extrapolate) according to the model, a newborn with no priors would have a probability of $\\sigma(1.04) = 0.74$ of being re-arrested.\n", "\n", "Let's now obtain the FPR, FNR, and CCR for our model, using the threshold 0.5." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll use the `predict` function to generate the model's predictions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's build our scoring functions. This time, we have the model as another parameter." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def get_binary_preds(prob_preds, thr):\n", "    \"\"\"\n", "    Return the binary versions of the predictions by\n", "    thresholding prob_preds with threshold thr\n", "    \"\"\"\n", "    binary_preds = []\n", "    for prob_pred in prob_preds:\n", "        binary_preds.append(prob_pred > thr)\n", "\n", "    return binary_preds\n", "\n", "\n", "def getFPR_fit(model, x_data, y_data, thr):\n", "    \"\"\"\n", "    Return the false positive rate for predictions\n", "    by the model on x_data using threshold thr,\n", "    with ground truth data y_data\n", "\n", "    Keyword arguments:\n", "    model -- model generated from the learning machine\n", "    x_data -- list of lists containing predictors\n", "    y_data -- list containing outcomes\n", "    thr -- threshold\n", "    \"\"\"\n", "    prob_pred = lm.predict(model, x_data)\n", "    pred = get_binary_preds(prob_pred, thr)\n", "    false_positives = 0\n", "    total_negatives = 0\n", "\n", "    for i, outcome in enumerate(y_data):\n", "        if pred[i] == 1 and outcome == 0:\n", "            false_positives += 1\n", "        if outcome == 0:\n", "            total_negatives += 1\n", "\n", "    return false_positives/total_negatives\n", "\n", "\n", "def getFNR_fit(model, x_data, y_data, thr):\n", "    \"\"\"\n", "    Return the false negative rate for predictions\n", "    by the model on x_data using threshold thr,\n", "    with ground truth data y_data\n", "\n", "    Keyword arguments:\n", "    model -- model generated from the learning machine\n", "    x_data -- list of lists containing predictors\n", "    y_data -- list containing outcomes\n", "    thr -- threshold\n", "    \"\"\"\n", "    prob_pred = lm.predict(model, x_data)\n", "    pred = get_binary_preds(prob_pred, thr)\n", "\n", "    false_negatives = 0\n", "    total_positives = 0\n", "\n", "    for i, outcome in enumerate(y_data):\n", "        if pred[i] == 0 and outcome == 1:\n", "            false_negatives += 1\n", "        if outcome == 1:\n", "            total_positives += 1\n", "\n", "    return false_negatives/total_positives\n", "\n", "\n", "def getCCR_fit(model, x_data, y_data, thr):\n", "    \"\"\"\n", "    Return the correct classification rate for predictions\n", "    by the model on x_data using threshold thr,\n", "    with ground truth data y_data\n", "\n", "    Keyword arguments:\n", "    model -- model generated from the learning machine\n", "    x_data -- list of lists containing predictors\n", "    y_data -- list containing outcomes\n", "    thr -- threshold\n", "    \"\"\"\n", "\n", "    prob_pred = lm.predict(model, x_data)\n", "    pred = get_binary_preds(prob_pred, thr)\n", "\n", "    correctly_classified = 0\n", "\n", "    for i, outcome in enumerate(y_data):\n", "        if pred[i] == outcome:\n", "            correctly_classified += 1\n", "\n", "    return correctly_classified/len(y_data)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "# build relevant datasets\n", "compas_valid_b_x, compas_valid_b_y = get_x_y_split(compas_valid_b, predictors)\n", "compas_valid_w_x, compas_valid_w_y = get_x_y_split(compas_valid_w, predictors)\n", "compas_valid_x, compas_valid_y = get_x_y_split(compas_valid, predictors)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FPR for black, white, and all defendants: [0.36538461538461536, 0.23853211009174313, 0.3017241379310345]\n", "\n", "FNR for black, white, and all defendants: [0.2706270627062706, 0.4852941176470588, 0.35728542914171657]\n", "\n", "CCR for black, white, and all defendants: [0.6856127886323268, 0.6666666666666666, 0.6725254394079556]\n" ] } ], "source": [ "thr = 0.5\n", "\n", "fps_fit = [getFPR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getFPR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getFPR_fit(my_model, compas_valid_x, compas_valid_y, thr)]\n", "fns_fit = [getFNR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getFNR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getFNR_fit(my_model, compas_valid_x, compas_valid_y, thr)]\n", "ccr_fit = [getCCR_fit(my_model, compas_valid_b_x, compas_valid_b_y, thr), getCCR_fit(my_model, compas_valid_w_x, compas_valid_w_y, thr), getCCR_fit(my_model, compas_valid_x, compas_valid_y, thr)]\n", "\n", "print('FPR for black, white, and all defendants:', fps_fit)\n", "print()\n", "print('FNR for black, white, and all defendants:', fns_fit)\n", "print()\n", "print('CCR for black, white, and all defendants:', ccr_fit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It appears that there is basically no overfitting. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**PART 5: ADJUSTING THRESHOLDS**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to find group-specific thresholds at which the false positive rates for black and white defendants are at parity. Let's see what the rates are on the training set for different thresholds." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FPR for black defendants: [0.8432203389830508, 0.6754237288135593, 0.36610169491525424, 0.1788135593220339, 0.14491525423728813, 0.07203389830508475, 0.03898305084745763]\n", "\n", "FPR for white defendants: [0.6464646464646465, 0.47575757575757577, 0.2101010101010101, 0.07878787878787878, 0.06161616161616162, 0.024242424242424242, 0.010101010101010102]\n" ] } ], "source": [ "compas_train_b_x, compas_train_b_y = get_x_y_split(compas_train_b, predictors)\n", "compas_train_w_x, compas_train_w_y = get_x_y_split(compas_train_w, predictors)\n", "\n", "# list of thresholds\n", "thrs = [0.3, 0.4, 0.5, 0.57, 0.6, 0.7, 0.8]\n", "\n", "FP_b = [getFPR_fit(my_model, compas_train_b_x, compas_train_b_y, thr) for thr in thrs]\n", "FP_w = [getFPR_fit(my_model, compas_train_w_x, compas_train_w_y, thr) for thr in thrs]\n", "\n", "print('FPR for black defendants:', FP_b)\n", "print()\n", "print('FPR for white defendants:', FP_w)" ] }, { 
"cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def PlotFPR(rates_b, rates_w, thrs):\n", "    \"\"\"\n", "    Plot FPR for black and white defendants on y axis and threshold on x axis.\n", "\n", "    Keyword arguments:\n", "    rates_b -- FPR list for black defendants\n", "    rates_w -- FPR list for white defendants\n", "    thrs -- list of thresholds\n", "    \"\"\"\n", "    plt.plot(thrs, rates_b, label=\"black\")\n", "    plt.plot(thrs, rates_w, label=\"white\")\n", "    plt.title('FPR across thresholds')\n", "    plt.xlabel('threshold')\n", "    plt.ylabel('FPR')\n", "    plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At a threshold of 0.5, the FPR for white defendants is about 0.21. To match it, we need to raise the threshold for black defendants just a little, to somewhere between 0.5 and 0.6:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FPR for black defendants for tweaked thresholds: [(0.51, 0.3254237288135593), (0.52, 0.29745762711864404), (0.53, 0.2584745762711864), (0.54, 0.24067796610169492), (0.55, 0.21525423728813559), (0.56, 0.20254237288135593), (0.57, 0.1788135593220339), (0.58, 0.16779661016949152), (0.59, 0.15508474576271186)]\n" ] } ], "source": [ "thrs_detail = [0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59]\n", "FP_b_new = [getFPR_fit(my_model, compas_train_b_x, compas_train_b_y, thr) for thr in thrs_detail]\n", "\n", "print('FPR for black defendants for tweaked thresholds:', list(zip(thrs_detail, FP_b_new)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try to visualize the threshold at which the white and black demographics would be at parity." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "PlotFPR(FP_b, FP_w, thrs)\n", "plt.axvline(x=0.5, color=\"grey\", ls=\"dashed\")\n", "plt.axhline(y=0.21, color=\"grey\", ls=\"dashed\")\n", "plt.axvline(x=0.557, color=\"grey\", ls=\"dashed\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`thr = 0.557` seems about right. \n", "\n", "With a threshold of 0.557 for black defendants and 0.5 for white defendants, the two groups have approximately equal false positive rates (about 0.21) on the training set. We'll compute the correct classification rate on the validation set. Note that the cell below applies the 0.557 threshold to every defendant in the validation set." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.669750231267345" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compas_valid_x, compas_valid_y = get_x_y_split(compas_valid, predictors)\n", "getCCR_fit(my_model, compas_valid_x, compas_valid_y, 0.557)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Note that we ignored everyone who wasn't white or black when comparing groups. That's OK to do, but including other demographics, in any way you like, is OK too.)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }