Secondary analysis of large quantitative datasets (or doing research with other people's data).
Reusing large government datasets can reveal unfair gaps faced by people with intellectual disability (ID), but only after the data have been cleaned.
01 Research in Context
What this study did
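Dai et al. (2023) introduced a journal special issue of 11 papers that reuse large datasets already collected through surveys or held by governments and health services.
They asked: can we reuse these numbers to spot unfair gaps faced by people with intellectual disability?
The paper is a narrative overview of those studies, not a new experiment, and it analysed no data of its own.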
What they found
Old datasets can reveal who is left out of jobs, health care, or school.
Mistakes in the files, like wrong labels or missing rows, can hide the very gaps we want to fix.
Cleaning the data and checking with real families makes the picture clearer.
How this fits with other research
Dodd et al. (2010) audited one national database and found 28% of the need ratings were off.
That single-country audit illustrates the pitfall the authors warn about: bad input leads to bad policy decisions.
Sannicandro et al. (2018) show the upside: linking social-security files proved college boosts wages for adults with ID.
Cramm et al. (2009) reused clinic files across five hospitals to show vision loss piles on extra disability.
Together these papers make the same point: big data can guide fair rules if we clean it first.
Why it matters
Before you write a grant or a policy brief, mine the free public files. Check for missing groups, odd scores, or double entries. Share the cleaned file with self-advocates so the numbers truly speak for them.
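The three checks named above (missing groups, odd scores, double entries) can be sketched in a few lines of Python. This is a minimal illustration, not a method from the paper: the column names `person_id`, `group`, and `score`, the expected group labels, and the 0 to 100 score range are all assumptions made up for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: column names, group labels, and the 0-100 score
# range are assumptions for illustration, not taken from any real dataset.
SAMPLE = """person_id,group,score
1,ID,72
2,ID,
3,non-ID,55
3,non-ID,55
4,non-ID,140
"""

EXPECTED_GROUPS = {"ID", "non-ID", "unknown"}
SCORE_MIN, SCORE_MAX = 0, 100


def audit(rows):
    """Return simple data-quality findings for a list of row dicts."""
    groups_seen = {r["group"] for r in rows}
    id_counts = Counter(r["person_id"] for r in rows)

    odd_scores = []
    for r in rows:
        raw = r["score"].strip()
        if raw == "":
            continue  # blank scores are reported separately below
        if not SCORE_MIN <= float(raw) <= SCORE_MAX:
            odd_scores.append(r["person_id"])

    return {
        # Groups we expected to see but that never appear in the file.
        "missing_groups": sorted(EXPECTED_GROUPS - groups_seen),
        # Rows where the score cell is empty.
        "blank_scores": [r["person_id"] for r in rows if r["score"].strip() == ""],
        # Rows whose score falls outside the assumed valid range.
        "odd_scores": odd_scores,
        # IDs that occur on more than one row.
        "duplicate_ids": sorted(pid for pid, n in id_counts.items() if n > 1),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = audit(rows)
print(report)
```

With a real file, you would replace `io.StringIO(SAMPLE)` with an opened CSV and adjust the column names and valid range to match that file's data dictionary.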
Download your state’s special-education outcomes file, run a quick filter for students labeled ID, and flag any blank cells in the employment column.
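A rough sketch of that filter-and-flag step is shown here. It assumes a CSV export with columns named `student_id`, `disability_label`, and `employment_status`, and assumes students with intellectual disability are coded `ID`. Real state files will use their own column names and codes, so check the data dictionary first.

```python
import csv
import io

# Hypothetical extract: the column names and the "ID" label are assumptions.
SAMPLE = """student_id,disability_label,employment_status
S01,ID,employed
S02,ID,
S03,autism,
S04,ID,not employed
"""


def flag_blank_employment(rows, label="ID"):
    """Keep rows with the given label and list those with no employment value."""
    id_rows = [r for r in rows if r["disability_label"].strip().upper() == label]
    blanks = [
        r["student_id"]
        for r in id_rows
        if not (r["employment_status"] or "").strip()
    ]
    return id_rows, blanks


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
id_rows, blanks = flag_blank_employment(rows)
print(f"{len(blanks)} of {len(id_rows)} ID rows have no employment value: {blanks}")
```

The flagged rows are a prompt to investigate, not to delete: a blank cell may mean the outcome was never recorded, which is itself a gap worth reporting.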
02 At a glance
03 Original abstract
Over the past decade, secondary analysis of large quantitative datasets has begun to make a significant contribution to furthering our understanding of the lives of people (including children and young people) with intellectual disability and the inequities they experience compared to their nondisabled peers. This critical development brings population-level understanding about the lives of people with intellectual disability into line with the more established tradition of this research approach in areas such as child development, social policy, education, sociology, economics and public health. Secondary analysis in these fields has been primarily undertaken on either large-scale health or social surveys or clinical/administrative data held by health, social, census or welfare agencies or governments. This first special issue on this topic for the Journal of Intellectual Disability Research demonstrates that similar benefits can result from secondary analysis as it becomes a more established feature of the intellectual disability research landscape. Secondary analysis offers, among other benefits, the following three opportunities for improving our understanding of the lives of people with intellectual disability. The first is to better understand the overall prevalence of intellectual disability and prevalence among sub-groups of particular interest at a particular point in time and how this may change over time. The second is to describe and quantify the association between intellectual disability and indicators of health and well-being and broad social determinants of health and well-being such as income, housing, education, employment, discrimination, violence and social exclusion; these associations may reflect risk factors for the incidence and/or prevalence of intellectual disability or the consequences of having an intellectual disability in specific contexts at a particular point in history.
The third benefit comes from the opportunity to examine the barriers experienced by people with intellectual disability in accessing critical services such as health care or life opportunities such as employment and community participation. Linking national survey data and administrative datasets can bring additional opportunities such as tracing the service trajectories for people with intellectual disability and evaluating the reach of intellectual disability services compared to the nature and patterning of service and support needs over time. One particular benefit of secondary analysis is that it often allows each of these three areas to be explored using data that are reasonably representative of national or state/provincial populations. As such, findings from secondary analysis of large quantitative datasets can help establish points for national or regional policy change to reduce the inequities experienced by people with intellectual disability. Research using secondary analysis of large quantitative datasets can also contribute significantly to policy refinement or change such as when the impact of specific policies is shown to be trending away from the desired direction. Modelling the potential impact of interventions to reduce inequities or address unmet support or service needs is also possible using secondary analysis of large quantitative datasets. The 11 papers in this Special Issue demonstrate that researchers have moved forward quickly to leverage the benefits of secondary analysis as large quantitative datasets are now more readily available to researchers in many countries. The papers represent a breadth of countries, 79 in total, from which representative data on intellectual disability are available. Nine of the 11 papers utilise country specific datasets from six countries, all of which are high income.
Of the remaining two, one paper presents a systematic review and meta-analysis drawing on 14 studies utilising representative data from eight high-income countries with six of these 14 studies coming from the United States and three from the Netherlands (Kavanagh, Manninen & Issartel). The other reports on surveys from low-income and middle-income countries utilising 126 nationally representative surveys undertaken in 73 countries spread across the globe (Emerson & Llewellyn). The papers demonstrate a breadth of datasets containing identifiers for people with intellectual disability. Survey datasets are most frequently represented (three papers; Emerson & Llewellyn; Totsika et al.; Vaitsiakhovich & Landes), with register (Lin, Tseng & Lai; Nurminen et al.) and administrative datasets (Bakkum et al.; Schuengel et al.) represented twice each. There are also two instances of linked datasets; one utilises linked administrative data (Liao et al.) and the other linked administrative and survey data (İsvan, Bonardi & Hiersteiner). As data linkage becomes more commonplace, especially in high-income countries, more papers reporting findings from linked datasets would be expected. A breadth of issues was also explored across the 11 papers. Most of the papers focused on adults with intellectual disability; however, four papers addressed issues concerning the lives of children with intellectual disability (Emerson & Llewellyn; Kavanagh, Manninen & Issartel; Lin, Tseng & Lai; Totsika et al.). Five of the 11 papers addressed health matters all of which related to adults with intellectual disability. Two papers addressed physical concerns one of which utilised data for children, the other for adults. It is encouraging to see researchers in the field of intellectual disability taking up the opportunity to access large quantitative datasets and utilise secondary data analysis methods. However, challenges remain in this area. 
Despite the increasing access granted to researchers by governments and government agencies to utilise large quantitative datasets, including linked datasets, barriers still exist including cost in some instances or only sub-sets of data being available. There can also be issues with data being only on platforms not suitable for some statistical analysis programs, or not accompanied by standardised data dictionaries. Another difficulty is the multiple ways in which intellectual disability is defined and identified across countries and in administrative datasets within some countries. This may require researchers to identify items within administrative records or generic population-based surveys that taken together represent the characteristics of intellectual disability. These may include special education needs or schooling, difficulties with literacy and numeracy or difficulties with learning or understanding. Standardisation of intellectual disability identifiers would facilitate cross-country comparisons, which could help inform best practice. Although the papers included in this issue all relied on large quantitative datasets, most authors also discussed concerns about representativeness. Survey data, unless the survey includes appropriate reasonable accommodation including proxy responding, often excludes those with more severe disability (Chusamer, Melville & McGarty), those living in institutional settings (Vaitsiakhovich & Landes) and even those with mild disability who choose not to self-identify (Vaitsiakhovich & Landes). Administrative data often do not allow the identification of those with less severe disability (Schuengel et al.; Nurminen et al.; Lin, Tseng & Lai; Liao et al.). As shown by Schuengel et al., it is sometimes possible to apply analytic strategies to address the potential selection bias due to the under-representation of those with milder disability in administrative data.
Another common limitation discussed in this issue is the inability to examine whether findings are consistent across levels of disability/level of support needs (Totsika et al.; Kavanagh, Manninen & Issartel; Vaitsiakhovich & Landes; Schuengel et al.). These observations are critical to the interpretation of findings and suggest areas of improvement in data collection. Another ongoing and pressing concern is that a high proportion of this type of research comes from authors in high-income countries (all 11 papers in this Special Issue) or focuses on datasets from high-income countries (10 out of 11 papers). This concern requires specific attention, given the greater proportion of the global population is from low-income and middle-income countries, and in this ageing world, an increasing proportion of whom are children and young adults with intellectual disability. Despite these challenges, the future offers great promise for increasing our understanding of the lives of people with intellectual disability by using secondary analysis of large quantitative datasets. Census collections in many countries are increasingly including disability questions and making this data available to researchers. UN initiatives (e.g. UNICEF's Multiple Indicator Cluster Surveys) and other international initiatives (e.g. the USAID-funded Demographic and Health Surveys Programme) are also including disability identifiers in their surveys and making the resulting data widely and freely available. Ever-increasing computational power and online access to national survey and administrative datasets collectively reduce the personnel and infrastructure resources required to undertake research. Mixed methods approaches where qualitative data can inform secondary analysis of large quantitative datasets, or explicate, corroborate or contextualise quantitative data analysis results should also be considered (Bakkum et al.).
We look forward to secondary data analysis making an increasingly important contribution to the creation of globally and nationally relevant knowledge about the health and well-being of people with intellectual disability. No data were used in this paper.
Journal of Intellectual Disability Research (JIDR), 2023 · doi:10.1111/jir.13101