Why Foundations in the Education Sector Should Use Administrative Data for Program Evaluation

The challenges of using primary data collection for evaluation purposes

Grant-making foundations in the education sector require timely, high-quality evaluation of the academic programs they invest in (Morariu et al. 2016). Foundations cannot expect real improvement without data on the success or failure of their funded programs (Gamoran 2016), and without real-time evaluation results they cannot make the necessary course corrections within required timeframes. Evaluation results must also be actionable, reliably identifying significant and meaningful differences between program participants and non-participants so that funders and program staff can adjust accordingly. For all these reasons, decision-making staff at major foundations report needing meaningful and useful information from evaluations of funded programs, according to a recent survey (The Center for Effective Philanthropy & the Center for Evaluation Innovation 2016). Yet almost two-thirds of the same survey respondents said their foundations fund evaluations for less than 10 percent of individual grantees. There is, then, a clear disconnect between foundations’ need for timely, high-quality evaluations and their ability to obtain them. Although demand is growing for evidence-based programs that are worthy of, and ready for, large-scale philanthropic investment, the slow trickle of high-quality evaluative studies cannot keep up (Fitzsimmons 2016).

The reason for this lack of evidence is simple. The standard approach to program evaluation – designing an experimental study, selecting a random sample, and then building, testing, and fielding a survey or assessment instrument – is expensive, slow, burdensome to schools, and often finds no significant program effects because of small sample sizes. Foundations considering this sort of primary data collection may not see opportunities to measure outcomes at reasonable cost and on reasonable timelines (Sullivan and McDonagh 2016). Outcome data about students and schools can be difficult to collect, particularly for intermediate and long-term outcomes achieved after a funded program has ended. Surveys can be used to collect outcome information in a school setting, but an hour-long survey can cost upwards of $500 per participant (Sullivan and McDonagh 2016). Like surveys, assessments administered for evaluation purposes also present challenges to foundations. While they may provide essential information about the educational progress of students affected by a funded program, educational assessments administered as part of an evaluation often impose a substantial burden on participating schools, teachers, and students (May et al. 2009).

A particular example of the challenges foundations face is the randomized control trial (RCT) conducted in a school setting. About one-fifth of respondents say their foundations have funded a randomized control trial of their grantees’ work in the past three years (The Center for Effective Philanthropy & the Center for Evaluation Innovation 2016). While the RCT is the gold standard in evaluation practice, it is both time-consuming and expensive. A typical RCT can take months or years of data collection, especially when multiple measurement waves are needed to generate longitudinal data on pre- and post-program outcomes. In addition, foundations must gather outcome data for a comparison group of students, which is necessary to isolate program impact in an RCT. In spite of the effort expended in conducting them, RCTs do not always provide actionable information. According to one recent survey of foundation staff who had funded an RCT, less than 50 percent found the results useful for understanding grant-dollar impact or for making future grantmaking decisions, and only 25 percent found the results useful for refining foundation strategies (The Center for Effective Philanthropy & the Center for Evaluation Innovation 2016). Without sufficient statistical power, RCTs may well yield no significant findings, leaving funders and program officers with no actionable results (U.S. Executive Office 2015).
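
To see why statistical power is such a binding constraint, consider a quick back-of-the-envelope calculation. The sketch below, written in Python with the statsmodels library, computes the sample needed for a two-arm school RCT; the effect size, significance level, and power targets are illustrative assumptions rather than figures from any particular study.

```python
# A minimal power sketch, assuming a two-arm RCT, a modest standardized
# effect size (Cohen's d = 0.20), a 5% two-sided test, and the conventional
# 80% power target. All inputs are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_arm = analysis.solve_power(effect_size=0.20, alpha=0.05,
                                 power=0.80, ratio=1.0,
                                 alternative='two-sided')
print(f"Students needed per arm: {n_per_arm:.0f}")   # roughly 394 per arm

# With only 100 students per arm, the same test is badly underpowered.
achieved = analysis.power(effect_size=0.20, nobs1=100, alpha=0.05, ratio=1.0)
print(f"Power with 100 students per arm: {achieved:.2f}")  # about 0.29
```

Under these assumptions, even a trial enrolling a few hundred students per arm can leave realistic program effects undetected, which is one reason so many school-based RCTs report null findings.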

It is clear, then, that relying exclusively on primary data collection for evaluation purposes is often too slow, expensive, incremental and insufficient for nonprofits and their funders (Fitzsimmons 2016). In order to meet their goals, grant-making foundations need less costly tools and more timely approaches to evaluation.

What are administrative data?

Administrative data are collected by public and private organizations as part of the ongoing administration of their programs. These data are not collected for research purposes, but for recordkeeping, typically tracking participants, registrants, employers, or transactions. However, these data sets are rich with information that can be useful for evaluating grant-funded programs. Administrative records often already contain information on key outcomes such as employment, earnings, college persistence and completion, contact with the criminal justice system, and hospital admissions, among others (Sullivan and McDonagh 2016).

In the education sector, administrative records contain information on outcomes of interest to evaluators such as students’ achievement on standardized tests, participation in advanced course-taking, drop-out and persistence rates, college entrance and completion, and post-educational employment and earnings, to name a few (May et al. 2009, Means, Padilla, and Gallagher 2010). At the federal and state levels, administrative data include rich information on students’ educational outcomes, teachers’ labor market outcomes, and other important topics, but they are often greatly underutilized in evaluating grant-funded programs’ effects (U.S. Executive Office 2015). Such data are usually collected on the universe of students (for example, state-mandated end-of-course exams or participation in the National School Lunch Program (NSLP)), in contrast to survey data, which are collected from samples of students for research or other statistical purposes.

Typically produced by the National Center for Education Statistics (NCES), state departments of education, or local educational agencies (LEAs) such as public school districts, these administrative data form a robust evidence base to support both public and private decision-making about educational outcomes (Office of Management and Budget 2015). Since foundation-funded educational interventions often take place in public schools, administrative data from these schools are a valuable but often untapped resource for evaluation. Rather than conducting primary data collection on grant-funded programs, foundations should consider how they can make better use of data that public schools already collect in order to make smarter policy decisions (Gamoran 2016).

Advantages of using administrative data for evaluation

Using administrative data for evaluation, either alone or in combination with survey data, offers a number of advantages over relying on survey data alone:

  • Administrative data can often be obtained at much lower cost than fielding a new survey because they are collected through the normal administration of programs and do not require additional primary data collection.
  • Using administrative data can substantially accelerate the evaluation timeline because the data have already been collected before the evaluative study even begins.
  • Administrative data are often more accurate than survey self-reports, especially for information such as participants’ socioeconomic status, enrollment status, or prior achievement.
  • Administrative data, especially when linked across multiple programs, are often available for long time periods, permitting study of long-term impacts that would be prohibitively expensive with a survey.
  • Because they typically cover large samples or entire populations, administrative data sets provide greater statistical power than small-sample surveys, which in turn increases the probability of detecting significant differences in program effects (see the power sketch following this list).
  • Large administrative data sets allow for quasi-experimental studies that would be impossible in most survey data sets, particularly research designs that depend on detecting small differences in outcomes based on small but near-random variation in program participation (U.S. Executive Office 2015).
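
To illustrate the power advantage noted above, the sketch below uses Python’s statsmodels library to compute the minimum detectable effect size at survey-scale versus administrative-scale samples; the sample sizes are illustrative assumptions, not figures from any particular data set.

```python
# Minimum detectable effect size (MDES) at 80% power and alpha = 0.05 for
# a survey-scale sample versus administrative-data-scale samples.
# The sample sizes are illustrative assumptions, not from any actual study.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (200, 2_000, 20_000):
    mdes = analysis.solve_power(effect_size=None, nobs1=n_per_group,
                                alpha=0.05, power=0.80, ratio=1.0)
    print(f"n = {n_per_group:>6,} per group -> MDES ~ {mdes:.3f} SD")
# Roughly 0.28 SD at n = 200, 0.089 SD at n = 2,000, and 0.028 SD at n = 20,000.
```

Moving from a few hundred surveyed students to tens of thousands of administrative records shrinks the smallest detectable effect by an order of magnitude, which is what makes small but policy-relevant program effects visible.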

Obtaining and using administrative data

While some administrative educational data sets, such as the NCES Common Core of Data, are publicly available and do not require a license, most are license-restricted because they contain personally identifiable information (PII) about students, teachers, and other school personnel. For educational data sets maintained by NCES, researchers may apply for a restricted-use data (RUD) license through their sponsoring agency (National Center for Education Statistics 2011). For state- or district-level data sets, researchers must follow the protocols of the relevant department or administering agency before obtaining restricted-use data. These protocols exist to protect the privacy of the individuals in the data.

Once the data are obtained, evaluators need a unique identifier that permits program participants to be located within the larger administrative data set. For programs affecting all students within a given public school, the linking identifier may exist at the group level, such as the school identifier that the National Center for Education Statistics (NCES) assigns to each public school in the nation. When a grant-funded program involves only some students within a school, additional identifiers (such as linkage to a participating classroom or teacher) may be required. In still other programs, the unit of observation may not be students at all, and evaluators will examine outcomes at the level of the classroom, the school, or an entire district affected by a grant-funded intervention. Identifiers at all of these levels are typically provided in educational data sets maintained by LEAs, state departments of education, or NCES. Unique identifiers for non-participants likewise allow evaluators to construct a comparison group for rigorous impact studies.
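
As a rough sketch of what this linkage step can look like in practice, the Python/pandas example below merges a hypothetical program roster onto a hypothetical extract of administrative records using a shared student identifier and flags same-school non-participants as a candidate comparison group. All data values, identifiers, and column names are illustrative assumptions rather than fields from any actual state data system.

```python
# Linking a program roster to administrative records on a shared student ID,
# then flagging same-school non-participants as a candidate comparison group.
# The tiny DataFrames stand in for hypothetical extracts; all names are illustrative.
import pandas as pd

# Hypothetical state administrative extract (one row per student).
admin = pd.DataFrame({
    "student_id": [101, 102, 103, 104, 105, 106],
    "nces_school_id": ["360001", "360001", "360001", "360002", "360002", "360002"],
    "grade": [8, 8, 8, 8, 8, 8],
    "eoc_math_score": [512, 547, 498, 530, 489, 555],
})

# Hypothetical grant-funded program roster.
roster = pd.DataFrame({"student_id": [101, 103, 104]})
roster["participant"] = 1

# Left-merge so every student in the administrative extract is retained.
linked = admin.merge(roster, on="student_id", how="left")
linked["participant"] = linked["participant"].fillna(0).astype(int)

# Non-participants enrolled in the same schools form a candidate comparison group.
program_schools = linked.loc[linked["participant"] == 1, "nces_school_id"].unique()
comparison = linked[(linked["participant"] == 0) &
                    (linked["nces_school_id"].isin(program_schools))]
print(linked)
print(f"Comparison-group candidates: {len(comparison)}")
```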

Once the data are linked, a wide range of variables becomes available for evaluation, including participants’ backgrounds, demographics, and outcomes. Since the most commonly used measure in evaluations of educational programs is student progress, researchers will usually want to link program participants to existing outcome data such as standardized achievement tests. While some national tests are available from NCES, the most commonly used results come from state-level mandatory end-of-course (EOC) tests. In many cases, state EOC results can even be compared for students in different states when evaluating cross-state interventions, provided the scores are first placed on a common scale (May et al. 2009). Other relevant outcomes available from administrative data include student course-taking, persistence, and graduation (Means, Padilla, and Gallagher 2010).
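
Because each state’s test has its own scale, evaluators typically rescale scores before pooling them across states; one common approach is to standardize scores within state, grade, and year. The sketch below shows that transformation in pandas on a small, made-up data set; the column names and values are illustrative assumptions.

```python
# Standardizing end-of-course (EOC) scores within state, grade, and year so
# that outcomes from different state tests can be pooled. The data frame and
# column names are illustrative assumptions, not actual state test files.
import pandas as pd

scores = pd.DataFrame({
    "state":     ["TX", "TX", "TX", "TX", "OH", "OH", "OH", "OH"],
    "grade":     [8, 8, 8, 8, 8, 8, 8, 8],
    "year":      [2016] * 8,
    "eoc_score": [500, 540, 470, 520, 210, 230, 190, 250],
})

grouped = scores.groupby(["state", "grade", "year"])["eoc_score"]
scores["eoc_z"] = (scores["eoc_score"] - grouped.transform("mean")) / grouped.transform("std")
print(scores)
```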

When conducting an evaluation, evaluators can use administrative data either to augment or to replace primary data collection. Because administrative records are collected regardless of whether a program evaluation is taking place, linking them to study-collected data can substantially reduce a study’s costs and accelerate its completion, yielding more real-time information on whether programs are meeting their aims. Administrative data also have great value for quasi-experimental studies, which can often be conducted without any data beyond the administrative records themselves (Schneider et al. 2007).
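
As a minimal illustration of the quasi-experimental point, the sketch below simulates school-level administrative records and estimates a simple difference-in-differences model with statsmodels; the variable names, the two-period setup, and the simulated data are assumptions made for illustration, not a template drawn from any of the studies cited here.

```python
# A minimal difference-in-differences sketch on simulated "administrative"
# records: test scores for program and non-program schools, before and after
# the program starts. Variable names and data are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_per_school = 40, 50
rows = []
for school in range(n_schools):
    treated = int(school < n_schools // 2)          # half the schools get the program
    for post in (0, 1):                             # pre- and post-program years
        base = 500 + 10 * treated + 5 * post + 8 * treated * post  # true effect = 8
        for score in rng.normal(base, 25, n_per_school):
            rows.append({"school_id": school, "treated": treated,
                         "post": post, "score": score})
panel = pd.DataFrame(rows)

# The coefficient on treated:post is the difference-in-differences estimate,
# with standard errors clustered at the school level.
did = smf.ols("score ~ treated * post", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["school_id"]})
print(did.params["treated:post"], did.bse["treated:post"])
```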

In recent years, federal agencies have increasingly used administrative data to conduct program evaluation and research, recognizing that these data offer a more efficient and cost-effective alternative to primary data collection or external data sources.

  • The Obama administration’s fiscal year 2016 budget placed particular emphasis on the use of administrative data for the evaluation of federally funded programs (Office of Management and Budget 2015).
  • Also in 2016, Congress established the Commission on Evidence-Based Policymaking (CEP) to develop a strategy for increasing the availability and use of administrative data in order to build evidence about government programs.
  • State- and district-level administrative data have been used for important and influential research on topics ranging from teacher value-added to disparities in educational outcomes by family income to the effects of universal prekindergarten, charter schools, intensive tutoring programs, and community college remediation programs (Abdulkadiroglu et al. 2011, Andrews, Jargowsky, and Kuhne 2012, Calcagno and Long 2008, Fryer 2014, Papay, Murnane, and Willett 2015, Rivkin, Hanushek, and Kain 2005).
  • Research on student aid simplification showing the feasibility and importance of simplifying the Free Application for Federal Student Aid (FAFSA) relied on administrative records (Bettinger et al. 2012).
  • State education data systems have contributed to the success of the Department of Education’s Investing in Innovation (“i3”) tiered evidence program, one of the Administration’s most successful grant reform efforts. Nearly all of i3’s Scale-up grantees have used administrative data for their evaluations, as have many of the Development and Validation grantees (U.S. Executive Office 2015).

Outside of government, administrative data have also been used effectively by private entities, including grant-making foundations, for evaluation purposes.

  • The Edna McConnell Clark Foundation helps nonprofits use administrative data to build evidence of their programs’ effectiveness (Fitzsimmons 2016).
  • The Laura and John Arnold Foundation sponsors a grant to demonstrate innovative strategies for linking data across programs and levels of government to advance evidence-based policymaking (Goerge 2017).
  • The William T. Grant Foundation promotes the use of linked administrative data to address critical questions related to evaluation of grant-funded programs (Gamoran 2016).

While an increasing number of foundations are turning to administrative data for evaluation, their use is still relatively uncommon. According to a recent survey, only 38 percent of major grant-making foundations reported currently using large-scale administrative data sets for evaluation purposes. An additional 9 percent reported having the ability to use such data but not currently doing so (Morariu et al. 2016).

References

Abdulkadiroglu, A., J. D. Angrist, S. M. Dynarski, T. J. Kane, and P. A. Pathak. 2011. “Accountability and Flexibility in Public Schools: Evidence from Boston’s Charters And Pilots.” The Quarterly Journal of Economics 126 (2):699–748. doi: 10.1093/qje/qjr017.

Andrews, Rodney J., Paul Jargowsky, and Kristin Kuhne. 2012. "The Effects of Texas's Targeted Pre-Kindergarten Program on Academic Performance." National Bureau of Economic Research Working Paper Series No. 18598. doi: 10.3386/w18598.

Bettinger, E. P., B. T. Long, P. Oreopoulos, and L. Sanbonmatsu. 2012. "The Role of Application Assistance and Information in College Decisions: Results from the H&R Block FAFSA Experiment." The Quarterly Journal of Economics 127 (3):1205-1242. doi: 10.1093/qje/qjs017.

Calcagno, Juan Carlos, and Bridget Terry Long. 2008. "The Impact of Postsecondary Remediation Using a Regression Discontinuity Approach: Addressing Endogenous Sorting and Noncompliance." National Bureau of Economic Research Working Paper Series No. 14194. doi: 10.3386/w14194.

Fitzsimmons, K. 2016. Evidence Commission: A Funder's Perspective. Edna McConnell Clark Foundation. http://www.emcf.org/our-strategies/investment-approach/evidence/.

Fryer, Roland G. 2014. “Injecting Charter School Best Practices into Traditional Public Schools: Evidence from Field Experiments.” The Quarterly Journal of Economics 129 (3):1355–1407. doi: 10.1093/qje/qju011.

Goerge, Robert. 2017. Statement on State Models for Acquiring, Linking, and Making Data Available to Researchers and Evaluators prepared for the Commission on Evidence-Based Policymaking. Chicago, IL: Chapin Hall, University of Chicago.

May, Henry, Irma Perez-Johnson, Joshua Haimson, Samina Sattar, and Phil Gleason. 2009. Using State Tests in Education Experiments: A Discussion of the Issues. NCEE 2009–013. Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Means, B., C. Padilla, and L. Gallagher. 2010. Use of Education Data at the Local Level: From Accountability to Instructional Improvement. Washington, DC: U.S. Department of Education, Office of Planning, Evaluation, and Policy Development.

Morariu, Johanna, Kat Athanasiades, Veena Pankaj, and Deborah Grodzicki. 2016. State of Evaluation 2016: Evaluation Capacity and Practice in the Nonprofit Sector. Washington, DC: Innovation Network.

National Center for Education Statistics. 2011. Restricted-Use Data Procedures Manual. Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, IES Data Security Office.

Office of Management and Budget. 2015. Fiscal year 2016: Budget of the U.S. Government. Washington, DC: U.S. Government Printing Office.

Papay, John P., Richard J. Murnane, and John B. Willett. 2015. “Income-Based Inequality in Educational Outcomes.” Educational Evaluation and Policy Analysis 37 (1 Suppl):29S–52S. doi: 10.3102/0162373715576364.

Rivkin, Steven G., Eric A. Hanushek, and John F. Kain. 2005. "Teachers, Schools, and Academic Achievement." Econometrica 73 (2):417-458. doi: 10.1111/j.1468-0262.2005.00584.x.

Schneider, Barbara, Martin Carnoy, Jeremy Kilpatrick, William H. Schmidt, and Richard J. Shavelson. 2007. Estimating Causal Effects Using Experimental and Observational Designs. Washington, DC: American Educational Research Association.

Sullivan, J. 2016. Statement for the Commission on Evidence-Based Policymaking Panel on “Non-Governmental Demand for Evaluation: Capacity to Support Public Good Activities.” https://www.cep.gov/content/dam/cep/events/2016-11-04/114SullivanTestimony.pdf.

The Center for Effective Philanthropy & the Center for Evaluation Innovation. 2016. Benchmarking Foundation Evaluation Practices. Cambridge, MA: The Center for Effective Philanthropy.

U.S. Executive Office. 2015. Fiscal year 2016: Analytical perspectives of the U.S. Government. Washington, DC: U.S. Government Printing Office.

Examples of Administrative Data Used for Evaluation of Educational Initiatives

STEM Training and Early Career Outcomes of Female and Male Graduate Students: Evidence from UMETRICS Data Linked to the 2010 Census

The Reading for Life program

Evaluation of the Effectiveness of the Alabama Math, Science, and Technology Initiative (AMSTI)

Children Left Behind: The Effects of Statewide Job Loss on Student Achievement

  • Study funded by the William T. Grant Foundation.
  • The authors linked Texas state administrative data from K-12 education, higher education, and employment and, using a difference-in-differences design that permits causal inference, tested the effects of two university scholarship programs for high-achieving, low-income youth on college and workforce outcomes.
  • http://www.nber.org/papers/w17104

Resources for the Use of Administrative Data in Educational Evaluations

Common Core of Data (CCD)

CCD is a program of the U.S. Department of Education’s National Center for Education Statistics that annually collects fiscal and non-fiscal data about all public schools, public school districts and state education agencies in the United States. The data are supplied by state education agency officials and include information that describes schools and school districts, including name, address, and phone number; descriptive information about students and staff, including demographics; and fiscal data, including revenues and current expenditures.

https://nces.ed.gov/ccd/
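
As a small, hypothetical illustration of how a downloaded CCD extract might be summarized with pandas, the sketch below tabulates schools and enrollment by state; the inline data stand in for a downloaded file, and the column names are illustrative assumptions rather than actual CCD field names.

```python
# A stand-in for a downloaded CCD school-level extract; in practice these rows
# would come from reading a file obtained at nces.ed.gov/ccd. Column names are
# illustrative assumptions; consult the CCD documentation for actual field names.
import pandas as pd

ccd = pd.DataFrame({
    "nces_school_id": ["010000500870", "010000500871", "060000101234"],
    "state": ["AL", "AL", "CA"],
    "enrollment": [430, 615, 980],
})

summary = (ccd.groupby("state")
              .agg(n_schools=("nces_school_id", "nunique"),
                   total_enrollment=("enrollment", "sum")))
print(summary)
```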

Private School Universe Survey (PSS)

The PSS produces data on private schools similar to what the NCES Common Core of Data (CCD) provides for public schools. The data are useful for a variety of policy- and research-relevant issues, such as the growth of religiously affiliated schools, the length of the school year, the number of private high school graduates, and the number of private school students and teachers.

https://nces.ed.gov/surveys/pss/

Stanford Education Data Archive (SEDA)

SEDA includes a range of detailed data on educational conditions, contexts, and outcomes in schools and school districts across the United States. It provides data at a range of institutional and geographic levels of aggregation, including schools, districts, counties, commuting zones, metropolitan areas, and states, with measures of academic achievement, achievement gaps, school and neighborhood racial and socioeconomic composition, school and neighborhood racial and socioeconomic segregation patterns, and other features of the schooling system.

https://cepa.stanford.edu/seda/overview

State Assessment Data

No Child Left Behind (NCLB) required states to develop a single accountability system to determine whether all students and key subgroups of students are meeting adequate yearly progress (AYP) targets. All students must be assessed with the same state assessment (with limited exceptions), and AYP definitions must apply to all public schools and districts in the state, Title I and non-Title I alike. A comprehensive list of state-level assessments by subject and grade level is available from the National Center for Education Evaluation and Regional Assistance at the Institute of Education Sciences (NCEE 2009-013, Appendix A).

Use of Education Data at the Local Level: From Accountability to Instructional Improvement

This report, prepared by researchers at SRI International on behalf of the U.S. Department of Education, outlines the use of educational data systems at the district level.

http://www.ed.gov/about/offices/list/opepd/ppss/reports.html#edtech

