Transcript for the video titled "Merging the National Youth in Transition Database (NYTD) Outcomes File with the Adoption and Foster Care Analysis and Reporting System (AFCARS) Foster Care File, using SPSS"

Slide 1
National Data Archive on Child Abuse and Neglect

Slide 2
Merging the National Youth in Transition Database Outcomes File with the Adoption and Foster Care Analysis and Reporting System Foster Care File, using SPSS. This video was produced by the National Data Archive on Child Abuse and Neglect.

Slide 3
This video uses the following acronyms and abbreviations: NDACAN, pronounced "N-D-A-CAN", is the National Data Archive on Child Abuse and Neglect. AFCARS, pronounced "AF-CARS", is the Adoption and Foster Care Analysis and Reporting System. NYTD, pronounced "NYTED", is the National Youth in Transition Database. FY means Fiscal Year: the Fiscal Year begins October 1st and runs through September 30th of the following year. For example, FY 2011 began October 1, 2010 and ran through September 30, 2011.

Slide 4
Video Overview. This video describes linking, or merging, the AFCARS Foster Care FY 2011 file with the NYTD Outcomes File Cohort Age 17 in FY 2011 data, using SPSS. The instructions in this video are generally applicable for merging other years of AFCARS and NYTD datasets. Instructional documents for merging the data files using the Stata and SAS statistical software programs can be found on the User Support page of the NDACAN website (www.ndacan.acf.hhs.gov/user-support/user-support.cfm).

Slide 5
Viewer Requirements. In order to follow along with the instructions in this video, you will need to have ordered and received NDACAN dataset numbers 167 and 201 from the National Data Archive on Child Abuse and Neglect. These are the AFCARS Foster Care FY 2011 and NYTD Outcomes File Cohort Age 17 in FY 2011 datasets. The data files are delivered inside compressed .zip folders, which must be unzipped into a new folder on your computer in order to gain read and write access to the files.
Secondary analysts are advised to create a copy of the data files and to use that copy for their work. Using a copy of the files preserves an unmodified version of the data in case an error occurs and there is a need to start over. The content of this video is based on Version 23 of the IBM SPSS Statistics software program. Be aware that, depending on your version of SPSS and your personal settings, the screenshots presented in this video may look different from what appears on your computer screen. This video assumes that you are familiar with using SPSS syntax. You need to download the companion document to this video, which contains the SPSS syntax that will be used in this tutorial. The companion document, titled "Merging the NYTD Outcomes file with the AFCARS Foster Care file, using SPSS", can be found on the User Support page of the NDACAN website (www.ndacan.acf.hhs.gov/user-support/user-support.cfm).

Slide 6
Getting Started: If you plan to follow along with this video, at this time please start SPSS on your workstation and open a new SPSS Syntax Editor window.

Slide 7
Section One: Restructure the NYTD Outcomes data file from long to wide. This section will assist you with restructuring (also called reshaping) the NYTD Outcomes data file from multiple records per participant to one record per participant. The NYTD Outcomes file is oriented in long format, also called stacked format. This means that there are multiple records per child. Each record in the data file represents a child at a given survey administration time point, as identified by the "wave" variable. The survey for this NYTD Outcomes File was administered at three different time points.
The Section One syntax from the companion document does several things: it assigns the nickname "t0" to the resulting NYTD Outcomes data file, a nickname that will be used later during the merge process; it sorts the file in ascending order by the "st" and "recnumbr" variables; and it restructures the data file so that there is only one record per participant. At this time, copy the syntax from Section One of the companion document and paste it into the SPSS Syntax Editor window. After pasting, you will need to update the GET FILE line of syntax, which appears at the top. The GET FILE statement should contain the location of the NYTD Outcomes file on your computer. In this example, we are using the NYTD Outcomes File Cohort Age 17 in FY 2011. You will also need to update the SAVE OUTFILE line of syntax with the location where you want to save your restructured file. Be sure to give the new file a name that will alert you to it being the newly restructured data.

Slide 8
This slide displays a screenshot of the SPSS Syntax Editor window with the pasted syntax from Section One of the companion document. Syntax lines 2 and 14 are circled in red to show that these are the lines of syntax that require updating, as described in the previous slide.

Slide 9
Run the Section One block of syntax: Once you have updated the file paths for the GET FILE and SAVE OUTFILE lines of syntax, highlight the entire block of syntax, select "run" from the menu bar across the top of the SPSS Syntax Editor window, and then choose "selection." This will run the syntax to restructure the data.

Slide 10
Section One Checkpoint: After running the syntax, always check the SPSS Output Window for information about any errors that may have been produced while the syntax was running. If there are errors, you need to stop here and remediate problems in the syntax. Once the syntax runs without errors, you can proceed to the next step.
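The long-to-wide restructure described in Section One can be illustrated outside of SPSS. The following is a minimal sketch in Python with pandas, not the SPSS syntax from the companion document. The data values and the variable names "outcome_a" and "outcome_b" are made up for illustration; only "st", "recnumbr", and "wave" come from this video.

```python
import pandas as pd

# Toy stand-in for a long-format file: one row per child per wave.
# All values and the outcome_* names are hypothetical.
long_df = pd.DataFrame({
    "st": ["AL", "AL", "AL", "NY", "NY"],
    "recnumbr": ["001", "001", "001", "002", "002"],
    "wave": [1, 2, 3, 1, 2],
    "outcome_a": [0, 1, 1, 0, 0],
    "outcome_b": [0, 0, 1, 1, 0],
})


def long_to_wide(df, id_cols=("st", "recnumbr"), index_col="wave"):
    """Reshape to one row per child, appending the wave number to each variable name."""
    ids = list(id_cols)
    # Sort by the identifiers (and wave) before reshaping.
    df = df.sort_values(ids + [index_col])
    # Move the wave level from the rows into the columns.
    wide = df.set_index(ids + [index_col]).unstack(index_col)
    # Flatten (variable, wave) column pairs into names like "outcome_a1".
    wide.columns = [f"{var}{wave}" for var, wave in wide.columns]
    return wide.reset_index()


wide_df = long_to_wide(long_df)
print(wide_df)
```

In this sketch, a child with no record for a given wave (the "NY" child at wave 3) still gets the wave 3 columns, filled with missing values.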

Slide 11
Navigating the newly restructured data file. The file will now be oriented so that there is one record per child. Each variable from the waves of the NYTD data collection will have a number appended to the end of its original name: "1" for the age 17 data collection, "2" for the age 19 data collection, and "3" for the age 21 data collection. The total record count after the restructure is 30,009. Feel free to pause this video to spend a few moments exploring the newly restructured data file. Please leave the newly restructured data file open in SPSS and do not close it for the remainder of this video.

Slide 12
Section Two: Assigning a Nickname to the AFCARS Foster Care data file. In this section, you will assign the nickname "t1" to the AFCARS Foster Care FY 2011 dataset. This nickname will be used during the merge process in Section Three. Use the same SPSS Syntax Editor window from the previous step. Copy the syntax from Section Two of the companion document and paste it below the syntax from Section One, which already appears in the Editor window. In this section of syntax, you will need to update the GET FILE statement with the location of the AFCARS Foster Care data file. In this example, we are using the AFCARS Foster Care FY 2011 dataset. You will also need to update the SAVE OUTFILE statement with the location where you want the resulting data file to be saved.

Slide 13
This slide displays a screenshot of the SPSS Syntax Editor window with the pasted syntax from Section Two of the companion document appearing below the syntax from Section One. Syntax lines 18 and 23 are circled in red to show that these are the lines of syntax that require updating, as described in the previous slide. Please note, the exact line numbers will depend on how much space you place between your Section One and Section Two blocks of syntax. Wherever your GET FILE and SAVE OUTFILE lines of syntax appear, those are the lines you need to update.

Slide 14
Section Two Continued: Once you have updated the GET FILE and SAVE OUTFILE lines for Section Two, please highlight the block of syntax pertaining to Section Two, select "run" from the menu bar across the top of the SPSS Syntax Editor window, and then choose "selection." This will run the syntax to assign the nickname "t1" to the AFCARS data file.

Slide 15
Section Two Checkpoint: After running the syntax, always check the SPSS Output Window for information regarding any errors that may have been produced while the syntax was running. If there are errors, you need to stop here and remediate problems in the syntax. Once the syntax runs without errors, you can proceed to the next step. Please leave the newly saved AFCARS Foster Care data file open and do NOT close it for the remainder of this video.

Slide 16
Section Three: Merging, or linking, the modified NYTD Outcomes File with the AFCARS Foster Care File. In this section, you will perform the data linkage, or merge, of the modified NYTD Outcomes File Cohort Age 17 in FY 2011 and the modified AFCARS Foster Care File FY 2011. This section of syntax will use the data file nicknames that we specified earlier in this video: "t0", which points to the restructured NYTD Outcomes File Cohort Age 17 in FY 2011, and "t1", which points to the AFCARS Foster Care FY 2011 data file. We will be using the STAR JOIN command to merge, or "join", the two data files together. We use the STAR JOIN command because the sample of interest to us is only those participants who were surveyed and appear in the NYTD Outcomes file. So, even though the AFCARS Foster Care file has hundreds of thousands of records, we are only interested in the 30,009 participants who were selected to participate in the corresponding NYTD Outcomes survey. This kind of merge, which keeps every record from the NYTD Outcomes file and attaches the matching AFCARS data to it, is also known as a "left outer join." We will be joining the data files together using both the "st" and "recnumbr" variables from each of the data files.
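The left outer join on "st" and "recnumbr" described in Section Three can be illustrated with a small example. The following is a minimal sketch in Python with pandas, not the STAR JOIN syntax from the companion document. The data values and the names "outcome_a1" and "placement_count" are made up for illustration. The toy data also assumes that a record number can repeat across states, which is one reason a join would need both key variables.

```python
import pandas as pd

# Toy stand-in for the restructured (one row per child) NYTD file.
nytd_wide = pd.DataFrame({
    "st": ["AL", "NY"],
    "recnumbr": ["001", "002"],
    "outcome_a1": [0, 1],            # hypothetical variable
})

# Toy stand-in for the much larger AFCARS file.
# Note that recnumbr "001" appears in two different states.
afcars = pd.DataFrame({
    "st": ["AL", "NY", "NY", "TX"],
    "recnumbr": ["001", "002", "003", "001"],
    "placement_count": [2, 5, 1, 3],  # hypothetical variable
})

# Left outer join: keep every NYTD row, attach the matching AFCARS columns.
merged = nytd_wide.merge(
    afcars,
    on=["st", "recnumbr"],
    how="left",
    validate="one_to_one",  # raise an error if a key repeats in either table
    indicator=True,         # add a "_merge" column showing where each row matched
)

# The merged file should have exactly as many rows as the NYTD file.
assert len(merged) == len(nytd_wide)
print(merged)
```

The AFCARS rows with no counterpart in the NYTD table (the "NY"/"003" and "TX"/"001" records) are dropped, which mirrors the way the merged file in this video keeps only the NYTD participants.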

Slide 17
Section Three Continued: At this time, copy the entirety of the syntax from Section Three of the companion document into the SPSS Syntax Editor that is already open and paste it beneath the syntax from Section Two. Update the line of syntax starting with "/OUTFILE FILE" by replacing the existing file path with the location where you would like to save the resulting data file, including a new file name. Also, copy this file path and paste it into the GET FILE line of syntax, as this will tell SPSS to open the newly created merged data file.

Slide 18
This slide shows a screenshot of the SPSS Syntax Editor window where the block of syntax from Section Three of the companion document has been pasted into the editor. Syntax lines 61 and 63 are circled in red to show that these are the lines of syntax that require updating, as described in the previous slide. Please note, the exact line numbers will depend upon how much space you place between the different sections of syntax. Wherever your "/OUTFILE FILE" and GET FILE lines of syntax appear, those are the lines you need to update.

Slide 19
Section Three Continued: Once you have finished pasting and updating the syntax, please highlight the block of syntax that you just pasted, select "run" from the menu bar across the top of the SPSS Syntax Editor window, and then choose "selection." This will perform the join and open the resulting merged data file.

Slide 20
Section Three Checkpoint: After running the syntax, always check the SPSS Output Window for information regarding any errors that may have been produced while the syntax was running. If there are errors, you need to stop here and remediate problems in the syntax.
Once the syntax runs without producing errors, you can compare your merged file with the following information: In this example, based on the AFCARS Foster Care FY 2011 file and the NYTD Outcomes File Cohort Age 17 in FY 2011 data, the total number of variables in the new file is 194 and the total number of observations, or records, is 30,009. Congratulations, you are now ready to begin exploring the newly merged dataset.

Slide 21
This concludes the video titled "Merging the National Youth in Transition Database Outcomes File with the Adoption and Foster Care Analysis and Reporting System Foster Care File, using SPSS". If you have any questions, please send an email to NDACANsupport@cornell.edu.

Slide 22
The National Data Archive on Child Abuse and Neglect is a project of the Bronfenbrenner Center for Translational Research at Cornell University. Funding for NDACAN is provided by the Children's Bureau.