Author:
(1) Anuar Assamidanov, Department of Economics, Claremont Graduate University, 150 E 10th St, Claremont, CA 91711. (Email: anuar.assamidanov@cgu.edu). Table of Links Abstract Background Data Methodology Results Discussion and Conclusion, and References A Appendix Tables and Figures 3 Methodology The identification strategy used in this study is Difference in Differences with Dummy Outcomes (Price and Wolfers, 2010), which focuses on coach and artist gender interaction. Specifically, we estimate the following linear probability model: The regression model in the difference-in-differences framework does not capture potential heterogeneous effects across covariates. There could be a positive effect on some groups in the covariates but a negative or no effect on other groups. For example, the coaches in the initial stage of the blind audition could elicit own-gender bias, but in the later stage, opposite-gender bias. The standard regression model will provide a weighted average of effect size, even if an effect varies across a population represented by different groups. One plausible solution is to estimate heterogeneous treatment effects. Instead of just estimating one effect, I estimate a distribution of effect size for a given unit with a given set of attributes and what their impact might be. To analyze this heterogeneous effect of own-gender bias, I incorporated the first difference approach into the causal forest approach. I will discuss the performance of the proposed causal forest approach and how to apply it to my empirical example. The causal forest approach is consistent and asymptotically Gaussian and provides an estimator for their asymptotic variance that allows us to construct valid confidence intervals (Athey and Wager, 2019). Causal forest utilizes a forest-based approach to calculate a similarity weight and then applies the local generalized method of moments to estimate based on a weighted set of neighbors. The authors also show that using the centered instead of the original outcome and treatment allows a forest to focus its splits on the features that affect the treatment effect instead of the primary outcome. Intuitively, a causal forest consists of regression trees (Athey and Imbens, 2016; Breiman et al., 1984), which estimate conditional average treatment effects (CATE) at the leaves of the trees instead of predicting the outcome variables. I utilize a two-step approach by incorporating the first-difference approach into the causal forest approach, the similar two-step approach in the standard differencein-difference method. In the first step, I estimate the first difference, which is the change in the selection of individual contestants from a male coach to a female coach. This step teases out the systematic differences between coaches in female and male coach groups. In the second step, I use the outcome variables obtained from the first step and apply the causal forest approach for heterogeneous treatment-effect analysis. This step teases out the effect of contestant gender trends that exist in both women and men. My methods rely heavily on the causal forest approach, a machine learning method that can estimate heterogeneous causal effect functions under the above assumptions. The main strength associated with the causal forest is that it provides a strategy for learning patterns of heterogeneity from the data. This method requires little researcher discretion compared to traditional subgroup analyses. For instance, one does not have to decide which cutoffs to use to analyze heterogeneity in effects along continuous covariates, nor do we require to define the functional condition of the heterogeneity; the causal forest algorithm is designed to automate this search and find the most appropriate model that best characterizes the heterogeneous effect in the data (Bonander and Svensson, 2021). This paper is available on arxiv under CC 4.0 license. Author: (1) Anuar Assamidanov, Department of Economics, Claremont Graduate University, 150 E 10th St, Claremont, CA 91711. (Email: anuar.assamidanov@cgu.edu). Author: Author: (1) Anuar Assamidanov, Department of Economics, Claremont Graduate University, 150 E 10th St, Claremont, CA 91711. (Email: anuar.assamidanov@cgu.edu). Table of Links Abstract Abstract Background Background Data Data Methodology Methodology Results Results Discussion and Conclusion, and References Discussion and Conclusion, and References A Appendix Tables and Figures A Appendix Tables and Figures 3 Methodology The identification strategy used in this study is Difference in Differences with Dummy Outcomes (Price and Wolfers, 2010), which focuses on coach and artist gender interaction. Specifically, we estimate the following linear probability model: The regression model in the difference-in-differences framework does not capture potential heterogeneous effects across covariates. There could be a positive effect on some groups in the covariates but a negative or no effect on other groups. For example, the coaches in the initial stage of the blind audition could elicit own-gender bias, but in the later stage, opposite-gender bias. The standard regression model will provide a weighted average of effect size, even if an effect varies across a population represented by different groups. One plausible solution is to estimate heterogeneous treatment effects. Instead of just estimating one effect, I estimate a distribution of effect size for a given unit with a given set of attributes and what their impact might be. To analyze this heterogeneous effect of own-gender bias, I incorporated the first difference approach into the causal forest approach. I will discuss the performance of the proposed causal forest approach and how to apply it to my empirical example. The causal forest approach is consistent and asymptotically Gaussian and provides an estimator for their asymptotic variance that allows us to construct valid confidence intervals (Athey and Wager, 2019). Causal forest utilizes a forest-based approach to calculate a similarity weight and then applies the local generalized method of moments to estimate based on a weighted set of neighbors. The authors also show that using the centered instead of the original outcome and treatment allows a forest to focus its splits on the features that affect the treatment effect instead of the primary outcome. Intuitively, a causal forest consists of regression trees (Athey and Imbens, 2016; Breiman et al., 1984), which estimate conditional average treatment effects (CATE) at the leaves of the trees instead of predicting the outcome variables. I utilize a two-step approach by incorporating the first-difference approach into the causal forest approach, the similar two-step approach in the standard differencein-difference method. In the first step, I estimate the first difference, which is the change in the selection of individual contestants from a male coach to a female coach. This step teases out the systematic differences between coaches in female and male coach groups. In the second step, I use the outcome variables obtained from the first step and apply the causal forest approach for heterogeneous treatment-effect analysis. This step teases out the effect of contestant gender trends that exist in both women and men. My methods rely heavily on the causal forest approach, a machine learning method that can estimate heterogeneous causal effect functions under the above assumptions. The main strength associated with the causal forest is that it provides a strategy for learning patterns of heterogeneity from the data. This method requires little researcher discretion compared to traditional subgroup analyses. For instance, one does not have to decide which cutoffs to use to analyze heterogeneity in effects along continuous covariates, nor do we require to define the functional condition of the heterogeneity; the causal forest algorithm is designed to automate this search and find the most appropriate model that best characterizes the heterogeneous effect in the data (Bonander and Svensson, 2021). This paper is available on arxiv under CC 4.0 license. This paper is available on arxiv under CC 4.0 license. available on arxiv

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Using Causal Forests to Unveil Gender Bias in Hiring

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

AI Tries (and Fumbles) at Inflation Forecasting

What 'The Voice' Auditions Reveal About Gender Bias in Hiring

Blind Auditions of 'The Voice' Expose Gender Bias in Hiring Practices

Revealing Gender Bias: Insights from Blind Auditions on 'The Voice

Discrimination and Constraints: Evidence from The Voice

Exploring Own-Gender Bias in Talent Selections

AI Tries (and Fumbles) at Inflation Forecasting

What 'The Voice' Auditions Reveal About Gender Bias in Hiring

Blind Auditions of 'The Voice' Expose Gender Bias in Hiring Practices

Revealing Gender Bias: Insights from Blind Auditions on 'The Voice

Discrimination and Constraints: Evidence from The Voice

Exploring Own-Gender Bias in Talent Selections

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps