[musical cue] [Voiceover] National Data Archive on Child Abuse and Neglect. [Clayton Covington] Welcome everyone to the third session of the 2023 Summer Training Series hosted here at the National Data Archive on Child Abuse and Neglect, which you might hear referred to as NDACAN in all kinds of variations. My name is Clayton Covington, I'm the graduate research associate with NDACAN, and I'm going to help facilitate today's session with our presenter. Just a couple of reminders: this session is going to be recorded, and if you need any assistance you can reach out to Andres Arroyo, whose email is listed on the slide. Next slide please. Also, we are going to do a Q&A for this session, as we do for all of our sessions, but please keep your questions to the actual Q&A box, which should be available on your screens. We are going to hold all questions to the end, at which point the presenter will respond to them. Next slide. The National Data Archive on Child Abuse and Neglect is funded by the Children's Bureau, so this is just an acknowledgment of our funding sources. Next slide. And just to give you a little overview of what we've done so far and where we're going: the last two sessions were, first, an introduction to NDACAN and the administrative data series here at NDACAN. Last week we covered one of our newest data acquisitions, the CCOULD dataset, which is really interesting and links child welfare data with Medicaid data. And this session is going to be a workshop on causal inference using administrative data, led by our presenter Garrett Baker, who is a PhD candidate in public policy and sociology at Duke University. Next slide please. So without further ado I will hand it over to Garrett. Thank you so much, Garrett, for presenting; we're really looking forward to hearing from you. [Garrett Baker] All right, thanks so much Clayton for the intro, and thanks to NDACAN for inviting me to do this talk. A quick overview of the agenda and how the presentation will progress today. I'll start with some high-level background on what causal inference is, what we're talking about when we talk about causal inference. The middle section will shift to how we actually do causal inference and cover some of the key constructs and methodologies. And in the final section we'll go through a specific example, a relevant one from a topic perspective, from a recent paper on the child welfare system. At the bottom there I've put the goal: this session is really a high-level overview geared toward people who have heard the terms that will be used today, things like causal inference, confounding, diff-in-diffs, endogeneity, but maybe know just a little and don't know some of the details behind them. There will be very little math, equations, or anything really technical in detail; we just don't have time for that because it's such a big topic. So really my goal is to give you all an initial push to build some intuition and gain some initial knowledge, so you can go pursue more details on whatever things you hear today that seem potentially interesting or useful for your work. So what is causal inference? What's the point? At a really high level, the most simple thing we want to know is whether X causes Y, right?
Whether something causes or leads to another thing. This may just be an academic exercise, but it can also help us adjust our own lives and behavior accordingly. From a policy perspective it can help us allocate resources and inform the policies and practices we advocate for, and so forth. So really the only math you'll see here is a simple equation: in theory, for unit i, the causal effect of a given treatment is the outcome under the treatment minus the outcome without the treatment. If it were that simple we wouldn't really need an entire field of causal inference, but unfortunately in practice it's usually not this simple, and the main reason is that we can't observe this counterfactual, right? We can't observe a parallel universe in which a unit, and that can be a person, school, city, county, whatever, I'll often just say person or individual but it can really be any unit, both does and does not receive the treatment at a given time. We can't see a parallel world where a school does and does not do a reading intervention, right? They either do or don't do the reading intervention, and so we can't directly observe a given unit under both conditions. This is often referred to as the fundamental problem of causal inference. There's a lot of material on the potential outcomes framework and Rubin's causal model; I put some links in here [ONSCREEN https://en.wikipedia.org/wiki/Rubin_causal_model] because these slides will be made available afterwards, and I do that throughout the presentation. At the very end I also point you toward some references so you can learn more about the details of what we talk about here. So the specific problems we're trying to overcome are various issues of what's often referred to as endogeneity or confounding. The definitions of endogeneity and confounding, and of a couple of things I'm about to talk about, vary by group; economists might refer to them one way, statisticians and other social science folks another. I'm going to pay less attention to giving you a really specific definition or trying to tell you that one definition is correct. I just want to build some intuition around these two main constructs, which are fundamental to everything else we're going to talk about and to the concept of causal inference. First we have selection bias: the fact that individuals or units who receive a given treatment are often very different from those who don't receive that treatment, right? In the real world, for example, if you're interested in the effects of incarceration on a given outcome, or of child protective services investigations or foster care placement on a given outcome, the difficulty is that folks who are or are not incarcerated, or families who do and do not get investigated by CPS, are often very different from one another in a bunch of meaningful ways. So you can't simply compare outcomes for those who do and do not experience those things.
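[Illustrative sketch: a small R simulation of the potential-outcomes idea just described, with made-up numbers. Each unit has an outcome with the treatment, Y(1), and without it, Y(0); the individual effect is Y(1) minus Y(0), but in real data only one of the two is ever observed, and when the treated differ systematically from the untreated, the naive comparison is off.]

```r
# Potential outcomes and selection bias, purely hypothetical numbers.
set.seed(1)
n  <- 10000
u  <- rnorm(n)                       # background characteristic (e.g., family resources)
y0 <- 2 + u + rnorm(n)               # potential outcome without treatment, Y(0)
y1 <- y0 + 1                         # potential outcome with treatment, Y(1); true effect = 1
d  <- rbinom(n, 1, plogis(u))        # treatment is more likely when u is high (selection)
y  <- ifelse(d == 1, y1, y0)         # in real data we only ever see one potential outcome

mean(y1 - y0)                        # true average effect, knowable only in a simulation: 1
mean(y[d == 1]) - mean(y[d == 0])    # naive treated-vs-untreated gap: too large, because
                                     # treated units already differ on u
```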
Sometimes, if you have really good data access, and we'll talk about data more in a moment, you might be able to run a simple regression model and control for certain things that help deal with selection bias a bit. But our second point here is that much of the time the heterogeneity or the variation between these groups is unobserved: either it's unobservable in the sense that it's really difficult to measure, or it's just not in the data we have access to. We may think there are really important ways that the two groups, the treated and the untreated or control group, differ, but we can't see them in our data, so we can't simply control for them in the straightforward way we would like. So here we have a graphical representation. You might see these referred to as path models or DAGs, directed acyclic graphs, and this one just represents X causing Y. The main problem, that of confounding or endogeneity, is when we have some variable that is correlated with or causes both X and Y, right? It's represented here by a U, which is typical notation because oftentimes we're talking about it being unobserved. So we'd say here that U confounds the relationship between X and Y. A common example in statistics that is sort of silly but always sticks in my head is that, statistically, it usually looks like either drowning causes ice cream consumption or ice cream consumption causes drowning. You might think people eat ice cream, get too full, and drown, but that's obviously not really the case; the actual relationship is that when it's hotter out, in the summertime, people eat more ice cream and people also swim more and therefore drown more. So U here, the confounding variable, would be temperature or seasonality. In a really straightforward world, if that was the relationship or the model you were trying to estimate, you could control for temperature in a simple regression model and you'd be good to go. But of course in the real world, and in the questions I think most people here are interested in, things are not that simple, and that's really what we're going to talk about for the rest of the presentation. One quick aside before we start getting into more details: causal inference as a term is sometimes kind of confusing, because there's Causal Inference, like you see in the first bullet point here with a capital C and a capital I. It has become such a prominent part of empirical social science and statistics that it's almost its own distinct subfield, and you might hear people refer to it that way. But you might also hear people just say we're trying to make causal inferences, we're trying to make inferences about some empirical relationships. So I just want to make sure that distinction is clear. And again, at the end of the day we're basically trying to eliminate potential confounders and to identify or create a control group that is as similar as possible to the treatment group. That is really the most basic plain-language way of describing what we're trying to do here.
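[Illustrative sketch: the ice-cream-and-drowning confounding example above as a tiny R simulation, with made-up numbers. Temperature drives both variables, so they look related until temperature is controlled for.]

```r
# Confounding: temperature (the "U" in the DAG) drives both ice cream sales and drownings.
set.seed(2)
n        <- 5000
temp     <- rnorm(n, mean = 70, sd = 15)       # daily temperature
icecream <- 5 + 0.30 * temp + rnorm(n)         # ice cream consumption rises with heat
drown    <- 1 + 0.02 * temp + rnorm(n)         # drownings rise with heat, not with ice cream

coef(lm(drown ~ icecream))                     # naive model: ice cream "predicts" drowning
coef(lm(drown ~ icecream + temp))              # adjusting for temperature: coefficient near 0
```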
Okay, so how can we actually do causal inference? A first point I want to touch on, one that I think gets overlooked especially by people who are just getting into this, and it was certainly something I didn't really think about at first and was never explicitly taught in any class, is that the title of this talk is causal inference using administrative data, and people often focus on the first part, the causal inference part, and not so much on the second part, the administrative data part. But the data choices, and the data we have access to, are really important in informing what we're actually able to do empirically. So I'll spend a few minutes here, before I get into methodologies, talking a little about data. Survey data are, obviously, data collected by surveying individuals: they're asked about things in their lives. Some prominent examples you may have heard of are Add Health, Fragile Families, and the NLSY, the National Longitudinal Survey of Youth; there are many others. The benefits of surveys are that they're often easier to access; sometimes they're public and you can just go download them on your own, or maybe you just have to fill out some forms to get access. They're also nice because you can get at difficult-to-measure things, like household dynamics or people's feelings, by just asking people about them. That comes with its own limitations, but it can sometimes get you information that wouldn't be available in administrative data. The downside of surveys is that we can typically only do what's called a conditioning-on-observables strategy. That's just your simple linear or logistic regression model with control variables. You can maybe add a bit to that with matching or weighting, or maybe fixed-effects strategies if the survey data has a panel structure, meaning the same people are asked the same survey questions over time. This is an important aspect of causal inference, but there are two reasons I'm not going to go into it today. One is that the title of the talk is causal inference with administrative data, and these matching, weighting, or regression-with-controls strategies are really geared toward survey data, while the topic here is administrative data. The other is that last summer in the NDACAN Summer Training Series, Alex Roehrkasse gave a really nice presentation that was exclusively focused on matching strategies, so if that's something you're really interested in, it's available online and you can go find Alex's presentation. So, shifting to administrative data, the topic we're focusing on here, which is going to inform the specific methods I'm about to talk about. Administrative records are officially collected or recorded by organizations. Typically we're talking about government agencies, which can be local, state, federal, and so on; typically government data, although not always, it depends a bit on the topic and the discipline. With NDACAN you have the AFCARS and NCANDS data, which are two government-collected administrative record systems. You'll also see in research that school data are really common; they can come from an individual school, a school district, or a state.
Criminal records, coming from police departments, are another common source of administrative records. The downside compared to survey data is that administrative records are often harder to access, right? You can imagine why: these are often more identifiable, sensitive data, and they're owned by government agencies, so you're inevitably going to have a lot of bureaucratic hoops to jump through. NDACAN can make it a somewhat less burdensome process, but it's still a bit different than survey data. There will also be some things that are left out of administrative data. I have a link here [ONSCREEN https://mccourt.georgetown.edu/news/who-is-missing-from-administrative-data/#:~:text=Administrative%20data%20may%20be%20missing,to%20observe%20in%20administrative%20data] if you're interested in learning more, but there are some groups, some individuals, and some circumstances that are really important and are not going to be captured in administrative data. And then finally, and importantly this is not the case for NDACAN, as it's the national data archive on child abuse and neglect, administrative data are often not national level; they're collected by local agencies and are thus less generalizable to the national population. The benefits of administrative data, especially when thinking about causal inference, are that they typically have the level of detail, the timing, and sufficient sample size to at least potentially use conventional causal inference techniques, which we'll talk about in a moment. They're also prospective in nature, meaning we actually observe a child's test score, or in AFCARS we actually observe a child's foster care records, instead of just asking about them in a survey, right? Retrospectively, there are tons of biases and misunderstandings that can creep in if you ask people how they did on a test or in school a few years ago, or what their experiences as a child with the foster care system were. You're introducing a lot of potential for bias that you then have to deal with, which you don't have to deal with in administrative records. Okay, so jumping into the first and most prominent method, the thing most people will have heard of: the randomized controlled trial, the RCT. It's an experimental method where you randomly assign some people to a treatment and some people to a control. It's often referred to as the gold standard because you're basically getting rid of all the potential endogeneity, confounding, and differences between the two groups, since you're just randomly putting some people in the treatment bucket and some in the control bucket. The reason I'm honestly not going to go through this much here is that, especially in the child welfare system, you can imagine it's very often not practical or ethical to conduct an RCT, right? We're not going to randomly assign some people to have CPS contact. You're not going to randomly assign people to have contact with systems like this, so in general I don't find it very practical to dwell on in this field.
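[Illustrative sketch: a small R simulation, with made-up numbers, of why randomization is called the gold standard. When treatment is assigned by coin flip rather than by selection, the two groups are balanced on background characteristics and a plain difference in means recovers the true effect.]

```r
# Random assignment removes confounding on average, hypothetical numbers.
set.seed(3)
n  <- 10000
u  <- rnorm(n)                        # background characteristic
y0 <- 2 + u + rnorm(n)                # outcome without treatment
y1 <- y0 + 1                          # outcome with treatment; true effect = 1
d  <- rbinom(n, 1, 0.5)               # coin-flip assignment, unrelated to u
y  <- ifelse(d == 1, y1, y0)

mean(u[d == 1]) - mean(u[d == 0])     # groups balanced on u (close to 0)
mean(y[d == 1]) - mean(y[d == 0])     # simple difference in means is close to the true effect, 1
```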
The one thing I will mention, if this is something you're still interested in, is that there are a couple of important nuances related to the actual estimand, the thing you're trying to estimate, the two being the ATE, the average treatment effect, and the ATT, the average treatment effect on the treated. The main difference is that even if you're in the treatment group, you may not actually receive, follow, or comply with the treatment. If RCTs are something you're interested in, I'd encourage you to go learn about the ATE and the ATT and the important distinctions there. But again, I think for the most part, and especially if we're talking about administrative data that's already been collected, RCTs are not going to be quite as pertinent. So if that's an experimental method, the more common, and with administrative data more practical, approaches we're going to look at are called either natural experiments or quasi-experimental methods. There's some debate about what constitutes a natural versus a quasi-experimental method; for now we'll just use the terms synonymously to distinguish them from an experiment, an RCT. The basic intuition behind a natural experiment is that some randomization has occurred, whether literally by nature, something that just happened in the world, and this can sometimes literally be nature itself, hurricanes are sometimes used as natural experiments, but it doesn't have to be a natural event; it can also be some statistical procedure, some choice or cutoff made by an individual, an agency, a school, or whatever, that introduced some randomization in how people were split into groups. The first and probably, at this point, most prominent strategy is the instrumental variable technique; you'll sometimes hear it just called the IV technique. The intuition behind the IV is that you identify and then use a third variable, referred to as an instrument, which can affect the outcome only through its effect on the predictor. Getting back to the same DAG we saw earlier: we have X, and our interest is in whether X impacts Y, whether X causes Y, and we have this third variable, or however many variables, that are potentially correlated with both X and Y and confound this relationship in a way that's problematic for our ability to make inferences about the X-to-Y relationship. A good instrument, represented here by Z, is a variable that only affects X, right? It should not directly in any way affect Y, and I'll explain why in a moment. There are a few big assumptions we have to make; the most important one you'll hear about if you pursue this a little more is the exclusion restriction, which says the instrument influences Y only through X. We don't want any relationship between Z and Y, because if you imagine, I don't know if you can see my mouse here, that Z influences both X and Y, then it's actually just U, right? It just becomes something that confounds the relationship in a way we don't want; we want something that only influences X.
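[Illustrative sketch: a minimal instrumental-variable example in R on simulated data, with made-up numbers. The ivreg() function from the AER package is just one common implementation, not something specified in the presentation; naive OLS is biased by the unobserved confounder, while the IV estimate is not.]

```r
# IV on simulated data: Z shifts X, is unrelated to the confounder U, and has no direct path to Y.
library(AER)   # provides ivreg(); one common choice among several

set.seed(4)
n <- 10000
u <- rnorm(n)                      # unobserved confounder
z <- rbinom(n, 1, 0.5)             # instrument
x <- 1 + 0.5 * z + u + rnorm(n)    # treatment/predictor, contaminated by u
y <- 2 + 1.0 * x + u + rnorm(n)    # outcome; the true effect of x on y is 1

coef(lm(y ~ x))                    # naive OLS: biased upward, because u drives both x and y
coef(ivreg(y ~ x | z))             # two-stage least squares via the instrument: close to 1
```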
The intuition here is that the instrument should only cause changes in the treatment, right? It's independent of the unobserved confounders, and therefore it can be used to recover a treatment effect in a way that I'll discuss at the end: the paper I'm going to talk about in the third section of this presentation uses an instrumental variable technique. Now, there's an increased skepticism around IV in the empirical social science world at the moment. IV requires a lot of big assumptions, and there have been growing concerns about weak instruments and about instruments in general. Of the four papers I refer to here [ONSCREEN Andrews, Stock, and Sun (2018); Mellon (2020); Bound, Jaeger, and Baker (1995); Lal et al (2021)], a couple just give good background and I encourage people to read them, and a couple of the others are specific to the problem of weak instruments, weak referring, back to the DAG here, to the case where the impact of Z on X is really weak and may not actually be a very convincing instrument to induce changes in X and therefore allow us to recover a treatment effect on Y. So, moving on to a second construct: the regression discontinuity method. This method leverages a cutoff score that exists in the real world and is used to sort, treat, or select individuals into or out of something. Basically, imagine anyone who scores above a certain score receives X, some treatment, and anyone below the score receives nothing or some alternative. We then compare individuals just on either side of the cutoff, who should be essentially identical on average, the only difference between them being that they just made or just missed the cutoff. I'll give you a specific example here, one you can see visually. This is taken from Scott Cunningham's textbook, Causal Inference: The Mixtape; it references an older paper that you can get to if you have the book, which is online and is one of the references I'll point you to at the end. The gist of the paper is that the authors wanted to see whether attending a good college impacts future earnings. You can imagine this is a relevant academic and policy question, but it's difficult to answer using just the simple regression-with-controls model we talked about earlier, because people who go to a top college are fundamentally different in a lot of ways from those who do not, and so just comparing them may not be sufficient if you think that some of those differences, again that unobserved heterogeneity, cannot be captured in the data you have access to; that presents a problem for inference. The very clever thing these authors did is take data from Florida, where, I don't know if they still do this, but at least at the time of the paper, there was a minimum SAT cutoff score to get into the University of Florida, which is sort of the flagship school. The cutoff was 1250: you needed to score at least 1250 to even be considered for admission. The figure here standardizes the x-axis so that zero is 1250 on the SAT; to the left you see scores below 1250 and to the right scores above, so negative 50 means a score of 1200, 50 means a score of 1300, and so on. And what you see here on the y-axis, the enrollment rate, is basically what you'd expect.
Sorry, this is called the assignment variable or the running variable, right? That's what you use to assess the discontinuity, and you can see this big gap here. So it seems like what we'd expect if the University of Florida is indeed following its policy: basically, right at 1250 there's a huge jump in your likelihood of being admitted to the University of Florida. Now, it's not perfect; there may be a few people down here who were admitted even though they were just below, but you can see the percentages here are very, very small. And this is good, because it basically means that this group right here, just below the cutoff score, and this group just above the cutoff score should be essentially identical, right? I don't think we'd assume that people who score 1240 versus 1260, or 1230 versus 1270, are really going to be very different individuals at all; they just happen to score on either side of this somewhat arbitrary cutoff that the university set. So what you can then do, and again we're not going to get into equations or technical details here, is use a few statistical techniques to use this cutoff to predict the actual outcome. You can see on the y-axis here log of earnings, and if there is indeed an effect, which it seems like there is here, you'd also see a little discontinuity or jump in earnings at the cutoff, right? So from right here to right here, on either side of the cutoff, it does look like there is a bit of an effect of going to the University of Florida, going to a top college, on later earnings, seemingly above and beyond the other characteristics that might also influence earnings. The regression discontinuity design has, I think, become increasingly popular because these cutoffs are used in the real world all the time, right? In government contexts, in funding contexts, people have to make decisions about where things get cut off, and regression discontinuities are kind of everywhere; once you know the intuition behind them you start seeing them pop up all the time. It also relies on potentially less difficult assumptions compared to the instrumental variables technique, and I think it's relatively clear conceptually, something most people can wrap their head around fairly easily. The third bullet point here: the one thing you should familiarize yourself with, if you're interested in this and want to learn more, is the concept of a fuzzy versus a sharp regression discontinuity. The example we gave here is in theory a sharp discontinuity, right? 1250 is the cutoff and it is sharp: if you're above, you're in, or at least eligible to be in, and if you're below, you cannot be admitted. We can see there is a little bit of fuzziness at the cutoff, it's not perfect, but this is really more of a sharp regression discontinuity. A fuzzy one is where there's a bit more leeway in the cutoff, and you have to do a few different things statistically to try to deal with that.
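[Illustrative sketch: a sharp regression discontinuity in R on simulated data loosely in the spirit of the SAT example; the numbers are made up and the rdrobust package is just one commonly used implementation, not something specified in the presentation.]

```r
# Sharp RD: treatment switches on at a cutoff of 1250 on the running variable.
library(rdrobust)   # one common RD estimation package

set.seed(5)
n     <- 5000
score <- runif(n, 1000, 1500)                      # running variable (SAT score)
d     <- as.numeric(score >= 1250)                 # admitted if at or above the cutoff
y     <- 10 + 0.002 * score + 0.15 * d +           # outcome (say, log earnings) with a
         rnorm(n, sd = 0.3)                        # jump of 0.15 right at the cutoff

summary(rdrobust(y, score, c = 1250))              # local-polynomial estimate of the jump
coef(lm(y ~ d * I(score - 1250)))                  # simple linear version: coefficient on d
                                                   # approximates the same jump
```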
All right, and then our third quasi-experimental or natural experiment method here is the "diff-in-diff", the difference-in-differences. This is a method used when you have panel data, data collected from individuals or groups or whatever your unit is over time, so you can observe them over some period of time. The simplest case, and the intuition here, is that you have a control group and a treatment group, two groups where the only meaningful difference between them is that one of them will receive a treatment, some type of intervention, it can be anything really, at one point in time, and you then use that to compare your outcome of interest for the two groups. So you can imagine a county or a city randomly decides that in half of the neighborhoods, half of the city, we're just going to remove police officers: we don't have money anymore, and so we have to randomly get rid of half of our police force. You then have two different groups: those who live in an area where the police remained and those in areas where the police were removed. Say the outcome you're interested in is crime. The two groups may have different levels of crime beforehand; say this red line is the treatment group and the blue is the control group. They may have different levels of crime beforehand, but as long as they're changing, either increasing or decreasing, at parallel rates, and the assumption here is parallel pre-trends before the intervention, then you're okay. As long as the change is constant you're okay, and I'll show you why in a second. You can see here that in the absence of the treatment, so for the control group where the police officers remained, you'd expect crime to continue on the same trend as it did before. For the treatment group, this dotted line here is what you'd assume the trend would have continued to be in the absence of the intervention by the city. But because they removed police officers you'd expect potentially some change, and in this totally hypothetical example, the difference between what you actually observe and what would have been, if you assume the trend would have continued, is what's referred to as the treatment effect. You see on the left here the literal meaning of difference-in-differences: you take the before and after for the treatment group and the before and after for the control group, and you calculate the difference between the difference for the treatment group and the difference for the control group, hence the difference in the differences between those two groups. There are a lot of complications. That is an incredibly simple version of it, but again the goal here is just to build intuition. There are a lot of complications with anything more than what we have here, which is a simple one treatment group, one control group, one intervention or treatment that happens at one point in time, represented by this dotted line. In the real world this is often not what happens, right? There are often multiple treatment groups, and there are often groups that are treated multiple times at different times. There is a frankly complicated and ongoing debate and literature looking at the different things you can do statistically, two-way fixed effects and so on, to deal with these complications; these get very tricky very quickly.
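[Illustrative sketch: the simple two-group, two-period difference-in-differences in R, with made-up numbers echoing the police example. The coefficient on the interaction term is the diff-in-diff estimate; the by-hand calculation underneath is the "difference of the two differences" described above.]

```r
# Two-group, two-period diff-in-diff, hypothetical numbers.
set.seed(6)
dat <- expand.grid(id = 1:2000, group = c("control", "treated"), post = 0:1)
dat$treated <- as.numeric(dat$group == "treated")
# Parallel trends by construction; the treated group drops by 5 after the intervention.
dat$crime <- 50 + 10 * dat$treated + 3 * dat$post -
             5 * dat$treated * dat$post + rnorm(nrow(dat), sd = 2)

coef(lm(crime ~ treated * post, data = dat))       # interaction term ~ -5, the DID estimate

# By hand: (treated after - treated before) - (control after - control before)
with(dat, (mean(crime[treated == 1 & post == 1]) - mean(crime[treated == 1 & post == 0])) -
          (mean(crime[treated == 0 & post == 1]) - mean(crime[treated == 0 & post == 0])))
```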
These two papers that I reference here [ONSCREEN https://causalinf.substack.com/p/callaway-and-santanna-dd-estimator; http://resources.oliviajhealy.com/TWFE_Healy.pdf] are technically advanced, but if this is something you're interested in, you sort of have to be on the cutting edge of what's going on here. The two resources I provide are nice slides and explanations with a little more detail than what I've gotten into, but then I think you have to take those and dive into some of the more recent innovations in this space. All right, how are we doing on time? Okay, we're doing fine on time. So in this third section we're going to take a look at a particular paper and get into causal inference in practice. The paper I'll show you here is one by Jason Baron and Max Gross [ONSCREEN https://www.nber.org/papers/w29922]; it came out in 2022, and I provide a link to the paper here from NBER's website. The basic point of the paper is that they use administrative data in Michigan to investigate the effect of foster care placement on the likelihood of arrest and incarceration in adulthood. So for children who are placed in foster care at some point in their youth, how does that impact, if at all, their likelihood of various forms of criminal justice contact in adulthood? Now again, if you just compare those who do and do not experience foster care placement, even if you have some other controls, you may be missing a lot of important unobservable heterogeneity or variation between these two groups, and a simple regression model with control variables may not adequately capture some of those differences. So I'll show you what they did. Investigators from CPS in Michigan, who are, at least in Michigan, mostly randomly assigned to a given case, vary really significantly in their leniency, their likelihood or propensity to place children in foster care, and I'll show you a figure from the paper that shows that in just a moment. This is essentially a form of an instrumental variable technique, where FCP here is foster care placement and they're interested in the impacts on adult crime. What they basically say is, well, if two groups of families are essentially identical, but one group gets assigned to a caseworker who is highly likely to place children in foster care and the other group of families is assigned to a caseworker who is really not likely to, right, then the only thing that's different, the only thing influencing that FCP, that foster care placement, is the leniency or the tendency of the investigator to place children in foster care. And so you have two groups that would otherwise be identical, or at least very similar, and all they vary in is who they were assigned to by CPS. What you can see here, this is figure 2 from their paper [ONSCREEN bar graph shows increase in removal rate as stringency increases], is that there do indeed seem to be really big variations; the y-axis on the right here is the investigator's propensity to place children in foster care. Folks down here on the bottom left are barely ever removing children, placing children in foster care in about one percent of their cases, whereas up here in the 98th percentile, investigators are sending children to foster care in five or six percent of their cases.
And it's that variation, and you can see the distribution behind them, where most of them cluster around zero, but you can take the ones on either end of this distribution and leverage that for your empirical strategy. Now, table seven here shows their main outcomes, and we're going to pay less attention to the actual outcomes of the paper, although if this is of interest to you I encourage you to go read the paper and dive into it a little more, especially now that you have at least some introduction to the method. One thing I want to point out here is this 2SLS, or two-stage least squares regression, which is something you'll see if you're getting interested in instrumental variables and want to pursue them. This is literally a two-stage process: in the first stage you regress the endogenous variable on the instrument, and in the second stage you substitute the predicted values of the endogenous variable into the original regression model. Said another way, in the second stage the model-estimated values from the first stage are used in place of the actual values of your endogenous, problematic predictor variable, which then allows you to account for that endogeneity and recover some type of treatment effect. That's again a very simple and limited description, but if you do any further research on instrumental variables, two-stage least squares regression is going to show up very quickly, so I at least wanted to mention it, and you can see it here in the notes for the table as well. I'll just briefly touch on the main finding, which is that, especially for male children, in this first column, and for younger children in the third column, as well as for white and Black children in the fifth and sixth columns, they actually find that foster care placement seems to lead to a decrease in adult criminal activity using this instrumental variables technique. Now, two things if you do go and look at this paper on your own, which I encourage you to do, two things that I think are interesting for folks to think about. One is: does the exclusion restriction hold here, right? The exclusion restriction here being that the investigators only influence crime through their impact on foster care placement; or is it possible that the investigator might impact crime in some other way that the researchers don't account for? They talk about this a little in the paper, but for those with more content expertise, that's I think an interesting thing to go through. The other important assumption, one I didn't mention before, is the monotonicity assumption, which is basically that in this figure here [ONSCREEN bar graph showing increase in removal rate as stringency increases] an investigator's leniency, their likelihood of placing children in foster care, is uniform across different types of families. So if it's the case, for example, that a given investigator is a lot more likely to place children in foster care in one neighborhood versus another, or for one race or ethnicity group versus another, or for one type of reporter, right?
If it's a teacher reporter versus a healthcare reporter, or whatever other grouping, types of maltreatment or types of reports, if investigators are really different in that regard then you run into a bunch of problems very quickly, because the base modeling strategy here assumes uniformity across all of these groups. And again, if you read the paper, that's something interesting to think about; it's something they at least touch on a little, but for those with content expertise I think it's a really interesting thing to think about. All right, so to wrap up, a few resources that have been useful for me and that I think would be useful for folks who want to learn a bit more. The first is a book by Morgan and Winship, a really classic book that I encourage everyone to read, called Counterfactuals and Causal Inference. There are a few newer books: the Cunningham book, which has a free online version, which is really nice [ONSCREEN https://mixtape.scunning.com/]. This book is particularly useful for folks who already have some coding skills and potentially want to use something like this soon, because the book has a ton of code in both R and Stata, so you can follow along with the examples and potentially implement things quickly. Mostly Harmless Econometrics also has a website [ONSCREEN https://www.mostlyharmlesseconometrics.com]; it doesn't have quite as robust an online presence, but it's a really fundamental book in the econometrics field. The Huntington-Klein book, The Effect, has an online version, similar to Cunningham's book, that I think is really nice. And the author, an economics professor, has fantastic YouTube videos. So if you're more of a visual learner and want to hear and see him talk through examples, he does things in a really nice, accessible way, and I think that's a really good learning tool for folks. And YouTube in general: there are other people who make good videos, so if any of these particular topics are of interest to you, YouTube is good, but his page is a really good place to start. All right, I think I'm about at time, so thanks so much. Happy to take some questions; I think there are already some questions in the chat. [Clayton Covington] Excellent, thank you so much Garrett for that presentation. I think you gave us a really good high-level and accessible introduction to causal inference, and specifically causal inference with administrative data, because I agree with you that a lot of people read the first part and maybe don't pay as much attention to the second. [Garrett Baker] Yeah, totally. [Clayton Covington] The way this is going to work is that I'm going to read all the questions aloud, and again, to all of our attendees, if you have any additional questions please place them in the Q&A; I'll read them aloud and allow the presenter to respond.
The next thing I wanted to say before I read any of the questions is that this session is obviously a little more technical, especially methodologically, and while this is an excellent opportunity to ask questions, one of the other things NDACAN regularly does is host Office Hours events, which happen in the fall and spring semesters during the academic year. So if you ever want to come and get some more hands-on, one-on-one attention with questions like these, you're welcome to come to those; that's just a little plug for other programming at NDACAN, but I will get started with the questions. The first question says: I am very interested in methods for establishing baseline equivalence and testing the degree of reduction in selection bias in treatment effect studies, especially considering how much emphasis the Administration for Children and Families places on these for studies of Family First Prevention Services Act practices for preventing out-of-home care placements. This person is asking: are there any tips or tricks for doing a really thorough job of demonstrating between-group equivalence before a treatment was administered? [Garrett Baker] Clayton, do you want me to answer, or are you going to read all of them first? [Clayton Covington] Oh, we'll go one by one; you can answer this first one. [Garrett Baker] Yeah, sure. So, any tips or tricks for doing a thorough job of demonstrating between-group equivalence before a treatment was administered. I think that's really important; let's see if I can still get to my slides, yeah, so in this figure here, right? This shows the importance of establishing that before the treatment these groups are similar, even if not in absolute value, then similar in their pre-trends, right? [ONSCREEN graph of two parallel lines, one red, one blue, sloping upwards from bottom left to upper right] They're similar in how much they're changing, either increasing or decreasing, before the treatment. In terms of answering the question directly, doing a really thorough job of demonstrating this, I think it really depends on the context and on what your baseline equivalence is measured in; it can be a lot of different things, so I think visuals are really good, I don't know if that's what you're getting at. But being able to graph whatever the equivalence is over whatever time period is relevant to you, whether you're looking at absolute rates or mean values or whatever the statistic is, plotting these in a visual way is going to do the most convincing job, at least as a first pass, at establishing or at least getting an initial sense of what is going on. If these two groups literally look visually similar, that's usually my first step at least: to look visually at whatever the construct you're interested in. [Clayton Covington] Excellent, I'll move on to the next question. This next question actually has two parts; the first part is asking, if there's already selection bias in difference-in-differences models, can they still be used to investigate the bias? [Garrett Baker] Um, this is a sort of tricky question to answer. If there's already selection bias, can the data still be used to investigate the bias?
Investigate the selection bias itself? If I'm understanding correctly, I don't think that's going to be the goal. The goal is really to address the selection bias, in the sense that these two groups may be different and you're basically trying to leverage some type of treatment or intervention that happens to just one group or the other. But there are certainly cases where, for example, these two groups are not parallel, right? Imagine one group is going down and the other group is going up in whatever construct you're interested in; you would not really be able to recover an unbiased treatment effect. This is basically because there's some type of variation, some type of selection, that's already going into how that mechanism was chosen for an intervention. It may not have actually been random, or even if it was random in theory it was not implemented randomly, or there are just important differences between the two groups that make it very difficult to do a diff-in-diff, yeah. [Clayton Covington] And a related question asks, are there any R packages for better calculation of the variance of the parameter for DID? [Garrett Baker] I would just point you towards the package literally called did in R; that is sort of the most popular package, and they have really good documentation. So if you're trying to do a diff-in-diff and you're using R, I would get started with the did package and go through that. [Clayton Covington] OK, the next question asks: the rotational assignment approach was quite interesting, but one concern that I had was whether or not this actually was a valid source of exogenous variation? And this person follows up and says, maybe caseworkers place themselves in the next step order selectively? [Garrett Baker] Yeah, I think this is a real problem. I'm probably actually not the best person to answer this, because I'm not a content expert on the inner workings and dynamics of CPS, and especially not in Michigan. This is another important thing that I think I mentioned at the very beginning, which is that a potential limitation of administrative data is often that, if it's at the state level or otherwise not at the national level, you might run into problems where different states do things differently. I do think, though, that if the assignment mechanism of investigators to cases is not actually random, and there are some ways, really difficult to observe or even know about, in which it potentially varies and could violate the exclusion restriction, that is definitely a problem. So the short answer, I guess, would be yes, this is definitely a potential source of criticism if it's indeed true; it's at least something that would be interesting to hear the authors or content experts talk about a bit.
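[Illustrative sketch: a minimal usage example of the did package mentioned in the R question a moment ago, run on mpdta, the example panel dataset that ships with the package; this is the package's own example setup, not an analysis from the presentation.]

```r
# Callaway and Sant'Anna group-time ATT estimator from the did package.
library(did)
data(mpdta)   # example county-by-year panel bundled with the package

out <- att_gt(yname  = "lemp",         # outcome: log county employment
              tname  = "year",         # time period
              idname = "countyreal",   # unit identifier
              gname  = "first.treat",  # period each county was first treated (0 = never treated)
              data   = mpdta)
summary(out)                           # group-time effects with bootstrapped standard errors

summary(aggte(out, type = "simple"))   # one overall average effect
summary(aggte(out, type = "dynamic"))  # event-study-style aggregation
```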
[Clayton Covington] Okay, great. So the next question, and it might be helpful to go back to the slide where you have the FCP variable. Yeah, okay, I think this is it. So basically this person is asking: given this causal model that you're showing here, the DAG rather, wouldn't that make FCP a mediator variable? I did moderation analysis slash interactions for my dissertation and had to read up on the differences between mediation and moderation. FCP looks like a mediator, in that A is related to M and M is related to B, and B is only related to A through M. [Garrett Baker] Yeah, it's a good question. On one hand, I'll point you toward a really nice paper on causal mediation analysis. There's a package called ivmediate, and there's a whole paper by Dippel et al. where they talk more about this. There are a few differences; the main thing is that in mediation analysis the exclusion restriction doesn't necessarily have to hold, right? It could be the case that investigators do still have a direct effect on the outcome, and what you want to know is how much of that effect goes through the mediator variable, right? So there's that important conceptual distinction, and then there's also the empirical or technical distinction of how the mediation is calculated versus using two-stage least squares or some of the other ways you could recover a treatment effect using instrumental variables. So yeah, I'll stop there. There are a few interesting papers; I don't know if I can send papers through this, but if you look up the ivmediate paper by Christian Dippel, that will give you a slightly more technical but interesting overview of the differences. [Clayton Covington] Yeah, I don't think we can send papers through here per se, but because we do have a little bit of time, if you have a link to those, feel free to send it to Alex, and I think Alex might be able to share it in the larger chat; we can see if we can do that, but if not, Garrett did give you some references. The next question is, I think, actually related to one of the previous ones. [ONSCREEN https://christiandippel.com/DFH_ivmediate.pdf] Oh, perfect, okay, thank you Garrett. It says: as I recall, most of Jason Baron's work takes advantage of this rotational assignment as an instrument; however, as I understand it, many administrative data sources like NCANDS and AFCARS don't have this feature of randomness. Do you have any thoughts on causal inference with other administrative data sources? [Garrett Baker] Yeah, I do. I think it's important; the Baron and Gross paper, and a lot of other papers in the criminal justice world, use the rotational assignment of judges as well. The downside of these, like I mentioned before, is that they have to rely on state-level data, because the detail needed for judges or investigators is just not going to be in NCANDS or AFCARS or national-level data; you're just not going to be able to access that except at the state level. So I guess my other thoughts on causal inference with national data sources like NCANDS or AFCARS are that you may find some interesting policy changes. Going back to the diff-in-diff kind of model, where you have some groups that are quote-unquote treated with some type of intervention or policy change, say a given number of states all make a change that's significant to the child welfare system in a certain year, and you basically try to compare changes in whatever you're interested in before and after; that requires a lot of content knowledge of different states' policies and practices over time.
And you'd have to do a bunch of checking of pre-trends and those kinds of things to make sure that you're identifying good comparison groups. The other thing that I didn't talk about, because it's sort of looped into the conditioning-on-observables material, is synthetic matching and weighting, which is also becoming more popular. It's not that distinct from propensity score matching or other types of matching and weighting strategies; the synthetic control method is basically just a more technologically advanced version of that. And I think that's another promising approach when you're using national-level data like NCANDS and AFCARS, where you have a national data set, but the child welfare system really operates at a local level, and so in a country as big as the US you're really going to struggle, I think, to find really tight interventions or policy changes where you can implement something really cleanly. And so synthetic control might be a good in-between strategy when you have such a big and robust data source like AFCARS or NCANDS. Yeah, I think that's it. [Clayton Covington] Um, there's another comment, but that's not a question, so I'm going to move on to the next question. This person is clarifying what they were asking earlier about the difference-in-differences model and selection bias. They're specifically asking: can treatment be a bias? [Garrett Baker] Oh, that's a philosophical question. Can treatment be a bias? I'd probably ask you to clarify that more. The way I'm interpreting and thinking about this question is that treatment could be a bias if the treatment is not random, right? That's sort of a fundamental aspect of what we're trying to pursue in a bunch of these methods. If you imagine a diff-in-diff, let's get back to that slide, and this is related to what I said before. [ONSCREEN graph of two parallel lines, one red, one blue, sloping upwards from bottom left to upper right] I don't remember if it was in response to this earlier question or not, but say the treatment is not random, and to use the same example I was talking about before, say the changes in the police force were not random: a city didn't just say we're going to randomly cut or increase police in half of the city, but instead we're going to target it in really specific neighborhoods, either high crime or low crime or whatever. I guess you could consider that a source of bias in the sense that you're not going to be able to recover a treatment effect using this method, because you're going to have very different groups that are selected; there's a bias in selecting into the treatment, the change in police force, because the treatment or the intervention didn't happen randomly. In that sense, and it's maybe somewhat semantics, I guess you can consider treatment to be a bias. [Clayton Covington] All right, another follow-up is asking: can you expand a bit more on the pros and cons of TWFE? [Garrett Baker] Oh yeah, so my short answer is going to be kind of no, because it's such a big, complicated topic; within 10 seconds I'd get into things that honestly can get above my head quickly and will be really context specific.
Two-way fixed effects, my simplest answer is that they're most useful when you have a staggered treatment or intervention design, where multiple groups are receiving the treatment at different times, right? Where the clean picture that you're seeing here is not the case; it's not that one treatment happens at one time to one group and you can nicely put the two groups together and recover a treatment effect. I would mostly point you towards the Callaway and Sant'Anna and the Goodman-Bacon pieces. Those are really the main things in this area right now that have upended a lot of the diff-in-diff and fixed effects literatures. Depending on what your needs are, I would encourage you to go through those. [Clayton Covington] Awesome. The next question says: I appreciate your comments on unobserved covariates and their impact on analyses and making causal inferences. If you had a magic wand and could add one set of variables to our large national child welfare files, what would you add: socioeconomic variables, crime-related variables? So that's the question. [Garrett Baker] Wow, that's a really interesting question; I have a lot of thoughts. It's interesting, so I'll say on one hand socioeconomic and crime variables I actually probably wouldn't add, because, hmm, actually I might take that back, because depending on your unit of analysis, and this is going to lead into what my actual answer is in a second, you can merge in data. If you are interested in looking at county- or state-level variation in some given empirical relationship, there are national data files you can merge in; crime would be a little harder because that's not collected nationally. But you have some potential to merge in other data sources; it's not impossible, especially at the county level, to merge in Census data. This is something I'm actually working on right now. What I would add, and I guess I'm putting the privacy concerns aside here, is much more specific location, right? Again, putting the privacy and ethics concerns aside, if you knew the exact locations of people, you could really dive into local-level dynamics, because the child welfare system really operates at a local level more than a national one, and most of the time we're sitting with county-level data, which is nice, but a lot of the variation is going to happen within a county, and I think you could leverage a lot more potential interventions and treatments if you knew more precisely where people lived within a given area. But that's a really interesting question; I'm probably going to end up thinking about it for the next hour and come up with a different answer later. [Clayton Covington] It's an interesting question indeed. Okay, we're going to proceed to our last question before we wrap up, and it asks: suppose that the data on hand almost exclusively consist of binary and limited integer-value variables; could causal inference analysis work well with this data, or would this not play a factor? [Garrett Baker] In the grand scheme, that would not really play a factor.
I don't know if you have some more specific thoughts in mind, but a lot of these things are going to work; there will be different computational considerations depending on whether you have linear, continuous variables or binary, integer, or categorical variables. But causal inference as a whole is conceptually possible with all these different types of variables. There are different strengths, weaknesses, and limitations you'd have to take into consideration based on your specific variables, but otherwise, no, that would not be a big limitation. [Clayton Covington] Excellent. So, Garrett, I'm going to ask you to go to the very last slide, please. [Garrett Baker] Yes, sure. [Clayton Covington] All right, so again, thank you everyone for joining us for the third session of the Summer Training Series here at the National Data Archive on Child Abuse and Neglect. Join us again at the same time next week, on Wednesday, July 26th, from 12 to 1 PM Eastern time, when we will have a presentation by Dr. Frank Edwards of Rutgers University, who will be talking about dealing with missing data using R statistical software. So again, thank you, Garrett, for the wonderful presentation and for answering so many interesting questions. Thank you to all of our attendees for your questions; we look forward to seeing you all next week. [Garrett Baker] Thanks, y'all. [voiceover] The National Data Archive on Child Abuse and Neglect is a collaboration between Cornell University and Duke University. Funding for NDACAN is provided by the Children's Bureau, an office of the Administration for Children and Families. [musical cue]