Fair NLP - Stanford University

Transcription

Fair NLP
May 6, 2020
Dr. Wei Wei, Prof. James Landay
CS 335: Fair, Accountable, and Transparent (FAccT) Deep Learning
Stanford University

Recap: Basic Data Preprocessing Techniques for Fairness
The expected joint distribution vs. our observed joint distribution
Resample/reweigh the data to match the expected distribution

Recap: Reweighting and Resampling
Universal Sampling: sample uniformly
Preferential Sampling: sample based on model uncertainty
(Kamiran et al., 2012)
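
As a concrete refresher, here is a minimal sketch of the reweighing idea from Kamiran et al. (2012): each (sensitive attribute, label) group receives the weight that would make the sensitive attribute and the label look statistically independent after weighting. The column names "A" and "Y" are placeholders, not names from the lecture.

```python
import pandas as pd

def reweigh(df, sensitive="A", label="Y"):
    """Kamiran & Calders-style reweighing (sketch): weight each (a, y) group by
    P(A=a) * P(Y=y) / P(A=a, Y=y) so that, after weighting, the sensitive
    attribute and the label appear independent."""
    p_a = df[sensitive].value_counts(normalize=True)           # P(A=a)
    p_y = df[label].value_counts(normalize=True)                # P(Y=y)
    p_ay = df.groupby([sensitive, label]).size() / len(df)      # P(A=a, Y=y)

    def weight(row):
        a, y = row[sensitive], row[label]
        return (p_a[a] * p_y[y]) / p_ay[(a, y)]

    return df.assign(weight=df.apply(weight, axis=1))

# The returned "weight" column can be passed as sample_weight to most
# scikit-learn classifiers' fit() method.
```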

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Individual Fairness
Group 1: Income 50k, Credit Score 690 -> Accepted; Income 43k, Credit Score 650 -> Accepted
Group 2: Income 50k, Credit Score 690 -> Denied?; Income 70k, Credit Score 740 -> Accepted; Income 100k, Credit Score 750 -> Accepted

Individual Fairness
A predictor M achieves individual fairness under a distance metric d iff similar samples are treated similarly; in other words, the distance between the predictions is bounded by the distance between the inputs.
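
The inequality on this slide did not survive extraction; the standard Lipschitz-style formulation (following Dwork et al.) that the wording points to is:

```latex
D\big(M(x_i),\, M(x_j)\big) \;\le\; d\big(x_i,\, x_j\big) \qquad \forall\, x_i, x_j
```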

Individual Fairness
Individuals in Group 1: Income 19k, Credit Score 690; Income 23k, Credit Score 720; Income 60k, Credit Score 800
Individuals in Group 2: Income 20k, Credit Score 680; Income 27k, Credit Score 700; Income 65k, Credit Score 810
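
A tiny sketch of how the slide's numbers would be checked in practice: pick an input metric d (the feature scaling below is an assumption; choosing d is the hard part of individual fairness), then require that matched individuals across the two groups receive similar predictions.

```python
import numpy as np

# The individuals from the slide, as (income in $k, credit score).
group_1 = np.array([[19, 690], [23, 720], [60, 800]])
group_2 = np.array([[20, 680], [27, 700], [65, 810]])

# An assumed input metric d: rescale features so they are comparable,
# then use Euclidean distance.
scale = np.array([1.0, 0.1])

def d(x1, x2):
    return np.linalg.norm((x1 - x2) * scale)

# Individual fairness requires |M(x1) - M(x2)| <= d(x1, x2) for any pair,
# so these closely matched individuals must be scored almost identically.
for x1, x2 in zip(group_1, group_2):
    print(x1, x2, "input distance:", round(d(x1, x2), 2))
```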

Fairness Criteria
Individual Treatment: Fairness Through Unawareness (excludes sensitive information A from the predictor); Individual Fairness
Group Treatment: Demographic Parity; Equal Opportunity/Odds

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Optimized Pre-Processing for Fairness
Can we automate the resampling process?
- Turn the manual process into an optimization-based approach
- Include more criteria than demographic fairness
- Allow transformations of the data
Optimized Pre-Processing: given sensitive feature D, learn a probabilistic mapping that transforms the data and satisfies three constraints.
(Calmon et al., 2017)

Resampling and Transforming
(Figure: resampling vs. transforming the data)

Constraint 1: Utility Preservation
A utility function preserves the joint probability: the distribution of the transformed data must stay close to that of the original data, measured e.g. by KL divergence.
(Calmon et al., 2017)

Constraint 2: Discrimination Control
Constrain the dependency of the target variable y on the sensitive feature d to match a target distribution: J(p(y|d), p_target(y)) <= epsilon, where J is a distance measure and epsilon is a small number used as our tolerance. When epsilon = 0, we achieve demographic parity.
(Calmon et al., 2017)

Constraint 3: Distortion Control
An implementation of individual fairness: the mapped sample has to stay close to the original sample, with expected distortion below a tolerance c. The distortion delta is a similarity function (1 = very different, 0 = very similar).
(Calmon et al., 2017)

Putting Things Together
- Utility preservation
- Discrimination control (group fairness)
- Distortion control (individual fairness)
(Calmon et al., 2017)
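
The full optimization problem was lost in extraction; a sketch consistent with the three constraints above (and with the formulation in Calmon et al., 2017) is:

```latex
\begin{aligned}
\min_{p_{\hat{X},\hat{Y}\mid X,Y,D}} \quad
  & \Delta\!\left(p_{\hat{X},\hat{Y}},\; p_{X,Y}\right)
  && \text{(utility: stay close to the original data)} \\
\text{s.t.} \quad
  & J\!\left(p_{\hat{Y}\mid D}(y \mid d),\; p_{Y_T}(y)\right) \le \epsilon
  \quad \forall\, d, y
  && \text{(discrimination control)} \\
  & \mathbb{E}\!\left[\delta\big((x,y),(\hat{X},\hat{Y})\big)
    \,\middle|\, X = x,\, Y = y,\, D = d\right] \le c_{d,x,y}
  && \text{(distortion control)}
\end{aligned}
```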

COMPAS Dataset

Results on the COMPAS Dataset
- Logistic Regression
- Random Forest
- LFR: Learning Fair Representations (Zemel et al., 2013)
(Calmon et al., 2017)

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Post-Processing Methods for Fairness
Why post-processing?
- Flexibility: no need to fine-tune the ML model
- Model agnostic: can be applied across a wide range of models
Learning to Defer: a post-processing model working together with a decision-maker

Learning to Defer: Working Together with a Black-box Decision-maker Model
- Decision-maker models (e.g., humans) have access to important information that our model does not have
- Decision-maker models might be biased
Performance and fairness trade-offs:
- Fix the unfair predictions of the decision-maker model
- Defer to the decision-maker when the model is uncertain
(Diagram: the responsible model produces fair but possibly inaccurate predictions; when it defers, the decision-maker produces accurate but possibly biased predictions)

Learning to Defer
Decision-maker model:
- Considered a black-box model
- No fine-tuning, no access to its training data
Responsible model:
- Has access to additional data
- Sticks to fairness constraints
(Diagram: the responsible model gives possibly inaccurate but fair predictions; on deferral, the decision-maker gives possibly biased but accurate predictions)

Training the Defer Model
For each input x_i, a gate s_i selects the final prediction: s_i = 0 uses the responsible model's possibly inaccurate but fair prediction, s_i = 1 defers to the decision-maker's biased but accurate prediction. Training adds a fair regularizer.
(Madras et al., 2018)
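
A minimal sketch of this training objective, in the spirit of Madras et al. (2018): the gate s_i mixes the responsible model's prediction with the decision-maker's, and a fairness penalty is added. The exact loss and the demographic-parity-style regularizer below are simplifications, and all variable names are ours.

```python
import torch
import torch.nn.functional as F

def defer_loss(model_logits, defer_logits, dm_preds, labels, groups, alpha=1.0):
    """Sketch of a learning-to-defer objective (not the paper's exact loss).

    model_logits : responsible model's logits for each example
    defer_logits : logits of the deferral gate s_i (defer vs. predict ourselves)
    dm_preds     : the black-box decision-maker's predicted probabilities
    labels       : ground-truth binary labels
    groups       : binary sensitive attribute, used only by the regularizer
    """
    p_model = torch.sigmoid(model_logits)   # fair but possibly inaccurate prediction
    p_defer = torch.sigmoid(defer_logits)   # probability of deferring on this example

    # Expected system prediction: mix our prediction with the decision-maker's.
    p_final = (1 - p_defer) * p_model + p_defer * dm_preds

    task_loss = F.binary_cross_entropy(p_final, labels.float())

    # Simple demographic-parity-style penalty on the system's output
    # (the paper uses equalized-odds-style regularizers; this is a stand-in).
    gap = (p_final[groups == 1].mean() - p_final[groups == 0].mean()).abs()

    return task_loss + alpha * gap
```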

Results on COMPAS
DM model scenarios: High-Accuracy (DM has more data); Highly-Biased (DM is extremely biased)
Legend:
- DM: decision-maker model
- Defer-Fair: Learning to Defer
- Reject-Fair: only reject or accept the DM
- Baseline: model trained only to optimize accuracy, no DM
- Binary-Fair: baseline optimized with fairness
(Madras et al., 2018)

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Biases of NLP Models
- Denigration: the use of culturally or historically derogatory terms
- Under-representation: the disproportionately low representation of a specific group; e.g., a classifier's performance is adversely affected by sampling biases against the minority protected group
- Stereotyping: an over-generalized belief about a particular category of people; e.g., a classifier associates men with computers more than women
- Recognition: algorithms perform differently for protected groups because of their inherent characteristics; e.g., a voice recognition algorithm is better at recognizing low-frequency voices

Biases of NLP Models
S, D, R, U: (S)tereotyping, (D)enigration, (R)ecognition, (U)nder-representation
(Sun et al., 2019)

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Data Augmentation
(Diagram: biased dataset -> data augmentation -> original data plus augmented data)

Coreference Resolution
"A man and his son get into a terrible car crash. The father dies, and the boy is badly injured. In the hospital, the surgeon looks at the patient and exclaims, 'I can't operate on this boy, he's my son!'"
Does this paragraph make sense to you?
(Rudinger et al., 2018)

Gender Swapping in Coreference
(Rudinger et al., 2018)
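
A minimal sketch of gender-swap data augmentation for coreference training data; real pipelines also anonymize names and fix capitalization and grammatical agreement, and the word list below is purely illustrative.

```python
# Map each gendered word to its swapped counterpart (illustrative subset).
SWAP = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "father": "mother", "mother": "father",
    "son": "daughter", "daughter": "son",
}

def gender_swap(tokens):
    """Return a copy of the token list with gendered words swapped."""
    return [SWAP.get(t.lower(), t) for t in tokens]

original = "The father said he would defer to the surgeon".split()
augmented = gender_swap(original)
# Train the coreference model on the union of original and augmented sentences.
```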

Results
- STAT: statistical model (Durrett et al., 2013)
- RULE: rule-based model (Lee et al., 2011)
- NEURAL: neural model (Clark et al., 2016)
(Rudinger et al., 2018)

Results
- E2E: end-to-end neural model (Lee et al., 2017)
- Feature: feature-based model (Durrett et al., 2013)
- Diff: difference between pro- and anti-stereotypical test sets
(Zhao et al., 2018)

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Word Embeddings: An Essential Part of Deep NLP Models
- Classification (e.g., sentiment analysis)
- Text generation (e.g., translation, summarization)
- Text retrieval (e.g., question answering)
- Visual-language representations (e.g., image captioning)
(Diagram: text in discrete space -> embedding look-ups -> word embeddings in continuous space -> neural networks)

Word Embeddings
Embedding techniques:
- GloVe (Pennington et al., 2014)
- Word2Vec (Rong et al., 2014)
Trained through a proxy task:
- Word proximity (GloVe)
- Skip-gram (Word2Vec)
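
A small gensim sketch of the skip-gram proxy task; the toy corpus here is ours and far too small to learn anything meaningful, but the API calls are the real ones.

```python
from gensim.models import Word2Vec

# Toy corpus; in practice the proxy task runs over billions of tokens.
sentences = [
    ["the", "doctor", "treated", "the", "patient"],
    ["the", "nurse", "treated", "the", "patient"],
    ["the", "programmer", "wrote", "the", "code"],
]

# sg=1 selects the skip-gram proxy task mentioned on the slide.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

vec = model.wv["doctor"]                  # the learned continuous-space embedding
print(model.wv.most_similar("doctor"))    # nearest neighbours under cosine similarity
```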

Geometric Properties of Word Embeddings
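
The geometric regularities this slide refers to can be probed with simple vector arithmetic; a sketch using pretrained GloVe vectors through gensim's downloader (the model name is a standard gensim-data identifier, assumed available).

```python
import gensim.downloader as api

# Pretrained 100-dimensional GloVe vectors (downloaded on first use).
wv = api.load("glove-wiki-gigaword-100")

# The classic analogy: king - man + woman is closest to queen.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# The same arithmetic can surface biased analogies, which motivates the next slides.
print(wv.most_similar(positive=["programmer", "woman"], negative=["man"], topn=3))
```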

Can Word Embeddings Be Biased?
(Garg et al., 2017)

Types of Gender Associations
- Definitional gender associations
- Stereotypical gender associations
(Bolukbasi et al., 2016)

Definitional and Stereotypical Associations
(Bolukbasi et al., 2016)

Gender Subspace
(Bolukbasi et al., 2016)
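
A sketch of how the gender subspace is estimated in Bolukbasi et al. (2016): take the principal components of centered definitional pairs. Here `emb` is assumed to be a dict-like mapping from word to embedding vector (e.g., loaded GloVe), and the pair list is a small illustrative subset.

```python
import numpy as np

# Definitional pairs in the style of Bolukbasi et al. (2016).
PAIRS = [("she", "he"), ("her", "his"), ("woman", "man"),
         ("daughter", "son"), ("mother", "father"), ("female", "male")]

def gender_subspace(emb, pairs=PAIRS, k=1):
    """Estimate the gender subspace as the top-k principal components of the
    centered definitional pairs (the paper typically uses k = 1)."""
    diffs = []
    for a, b in pairs:
        center = (emb[a] + emb[b]) / 2
        diffs.append(emb[a] - center)
        diffs.append(emb[b] - center)
    diffs = np.stack(diffs)
    # The principal directions of these differences span the gender subspace.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]                     # shape (k, embedding_dim)

def gender_component(word_vec, g):
    """Project a word vector onto the gender direction (k = 1 case)."""
    direction = g[0]
    return word_vec.dot(direction) / np.linalg.norm(direction)
```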

Gender-Neutral Word Embeddings
Decompose word embeddings into gender-related and gender-neutral parts (e.g., the vector for "grandfather" splits into a gender-related component w(g) and a gender-neutral component w(a)).
(Zhao et al., 2018)

Gender-Neutral Word Embeddings
Fine-tuning word embeddings using debiasing regularizers: the GloVe loss function is combined with one regularizer for gender-related words (female and male seed words) and another regularizer for all other words.
(Zhao et al., 2018)

Gender-Neutral Word Embeddings
Regularizer 1: regulate gender-related words by pushing female and male seed words toward opposite extremes on the gender dimensions.
w(g): gender-related components; w(a): gender-neutral components
(Zhao et al., 2018)

Gender-Neutral Word Embeddings
Regularizer 2: regulate all other words by keeping their gender-neutral components w(a) at a right angle to the gender subspace (direction v_g).
w(g): gender-related components; w(a): gender-neutral components
(Zhao et al., 2018)
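
A sketch of the two regularizers in the spirit of GN-GloVe (Zhao et al., 2018); the notation follows the slides, but the exact terms in the paper differ in form and all variable names here are ours.

```python
import torch

def debias_regularizers(W_a, W_g, female_idx, male_idx, other_idx, v_g):
    """Sketch of the two debiasing regularizers.

    W_a : (V, d_a) gender-neutral components w(a) for all words
    W_g : (V, d_g) gender-related components w(g) for all words
    female_idx, male_idx : indices of the female / male seed words
    other_idx : indices of all remaining words
    v_g : (d_a,) unit vector spanning the gender subspace
    """
    # Regularizer 1: push seed words toward opposite extremes on the gender
    # dimension(s), e.g. female toward +1 and male toward -1.
    j_gender = ((W_g[female_idx] - 1.0) ** 2).mean() + ((W_g[male_idx] + 1.0) ** 2).mean()

    # Regularizer 2: keep the gender-neutral part of every other word
    # orthogonal to the gender direction.
    j_neutral = (W_a[other_idx] @ v_g).pow(2).mean()

    return j_gender, j_neutral

# Total training loss (sketch): glove_loss + lambda_g * j_gender + lambda_a * j_neutral
```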

Gender Attribute Separation
(Figure panels: w(g) of all occupations; w(a) of GloVe for gender-neutral occupations; w(a) of Gender-Neutral GloVe for gender-neutral occupations. w(g): gender-related components; w(a): gender-neutral components)

Gender Relational Analogy
(Jurgens et al., 2012)

Coreference Resolution
w(a): gender-neutral components
(Jurgens et al., 2012)

Outline
Fairness Through Data/Prediction Manipulations
- Individual Fairness
- Optimized Pre-processing
- Learning to Defer
Fair NLP
- Biases in NLP Models
- Data Augmentation
- Debiasing Word Embeddings
- Adversarial Learning

Summary
Optimized Pre-processing for Fairness
- Optimizes several fairness criteria (demographic parity, individual fairness) at the same time
- Transforms the data to meet the criteria
Post-processing Techniques for Fairness: Learning to Defer
- Fixes biased predictions from the decision-maker
- Takes advantage of the high performance of the decision-maker model
Data Augmentation
- Gender swapping
Word Debiasing
- Separate gender-specific and gender-neutral embeddings
Adversarial Learning

Reading Assignments
- Gonen, Hila, and Yoav Goldberg. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. NAACL 2019.
- Zhao, Jieyu, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang. Gender Bias in Contextualized Word Embeddings. NAACL 2019.
- Brunet, Marc-Etienne, Colleen Alkalay-Houlihan, Ashton Anderson, and Richard Zemel. Understanding the Origins of Bias in Word Embeddings. ICML 2019.
- Sheng, Emily, Kai-Wei Chang, Prem Natarajan, and Nanyun Peng. The Woman Worked as a Babysitter: On Biases in Language Generation. EMNLP 2019.
- Sap, Maarten, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. The Risk of Racial Bias in Hate Speech Detection. ACL 2019.

Next Lecture: Fair Visual Representations
