What is Secondary Research? | Definition, Types, & Examples

Published on January 20, 2023 by Tegan George. Revised on January 12, 2024.

Secondary research is a research method that uses data that was collected by someone else. In other words, whenever you conduct research using data that already exists, you are conducting secondary research. On the other hand, any type of research that you undertake yourself is called primary research.

Secondary research can be qualitative or quantitative in nature. It often uses data gathered from published peer-reviewed papers, meta-analyses, or government or private sector databases and datasets.

Table of contents

  • When to use secondary research
  • Types of secondary research
  • Examples of secondary research
  • Advantages and disadvantages of secondary research
  • Other interesting articles
  • Frequently asked questions

When to use secondary research

Secondary research is a very common research method, used in lieu of collecting your own primary data. It is often used in research designs or as a way to start your research process if you plan to conduct primary research later on.

Since it is often inexpensive or free to access, secondary research is a low-stakes way to determine if further primary research is needed, as gaps in secondary research are a strong indication that primary research is necessary. For this reason, while secondary research can theoretically be exploratory or explanatory in nature, it is usually explanatory: aiming to explain the causes and consequences of a well-defined problem.


Types of secondary research

Secondary research can take many forms, but the most common types are:

  • Statistical analysis
  • Literature reviews
  • Case studies
  • Content analysis

There is ample data available online from a variety of sources, often in the form of datasets. These datasets are often open-source or downloadable at a low cost, and are ideal for conducting statistical analyses such as hypothesis testing or regression analysis.
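As a rough illustration, the following Python sketch loads the Longley macroeconomic dataset that ships with the statsmodels package (standing in here for any dataset you might download from a government or institutional source) and fits a simple regression:

```python
# Illustrative sketch: a simple regression on an existing (secondary) dataset.
# The Longley data (US macroeconomic figures, 1947-1962) ship with statsmodels,
# so no download is needed; any credible downloaded dataset would work the same way.
import statsmodels.api as sm

data = sm.datasets.longley.load_pandas().data

# Does GNP predict total employment, controlling for population size?
X = sm.add_constant(data[["GNP", "POP"]])
y = data["TOTEMP"]
print(sm.OLS(y, X).fit().summary())
```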

Credible sources for existing data include:

  • The government
  • Government agencies
  • Non-governmental organizations
  • Educational institutions
  • Businesses or consultancies
  • Libraries or archives
  • Newspapers, academic journals, or magazines

A literature review is a survey of preexisting scholarly sources on your topic. It provides an overview of current knowledge, allowing you to identify relevant themes, debates, and gaps in the research you analyze. You can later apply these to your own work, or use them as a jumping-off point to conduct primary research of your own.

Structured much like a regular academic paper (with a clear introduction, body, and conclusion), a literature review is a great way to evaluate the current state of research and demonstrate your knowledge of the scholarly debates around your topic.

A case study is a detailed study of a specific subject. It is usually qualitative in nature and can focus on  a person, group, place, event, organization, or phenomenon. A case study is a great way to utilize existing research to gain concrete, contextual, and in-depth knowledge about your real-world subject.

You can choose to focus on just one complex case, exploring a single subject in great detail, or examine multiple cases if you’d prefer to compare different aspects of your topic. Preexisting interviews, observational studies, or other sources of primary data make for great case studies.

Content analysis is a research method that studies patterns in recorded communication by utilizing existing texts. It can be either quantitative or qualitative in nature, depending on whether you choose to analyze countable or measurable patterns, or more interpretive ones. Content analysis is popular in communication studies, but it is also widely used in historical analysis, anthropology, and psychology to make more semantic qualitative inferences.


Examples of secondary research

Secondary research is a broad research approach that can be pursued any way you’d like. Here are a few examples of different ways you can use secondary research to explore your research topic.

Advantages and disadvantages of secondary research

Secondary research is a very common research approach, but has distinct advantages and disadvantages.

Advantages of secondary research

Advantages include:

  • Secondary data is very easy to source and readily available.
  • It is also often free or accessible through your educational institution’s library or network, making it much cheaper to conduct than primary research.
  • As you are relying on research that already exists, conducting secondary research is much less time-consuming than primary research. Since your timeline is so much shorter, your research can be ready to publish sooner.
  • Using data from others allows you to show reproducibility and replicability, bolstering prior research and situating your own work within your field.

Disadvantages of secondary research

Disadvantages include:

  • Ease of access does not signify credibility. It’s important to be aware that secondary research is not always reliable, and can often be out of date. It’s critical to analyze any data you’re thinking of using prior to getting started, using a method like the CRAAP test.
  • Secondary research often relies on primary research already conducted. If this original research is biased in any way, those research biases could creep into the secondary results.

When many researchers use the same secondary data to reach similar conclusions, it can also take away from the uniqueness and reliability of your research. Overused datasets also tend to produce “kitchen-sink” models, where too many variables are added in an attempt to draw increasingly niche conclusions from the data. Data cleansing may be necessary to test the quality of the research.

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Inclusion and exclusion criteria

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Frequently asked questions

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The research methods you use depend on the type of data you need to answer your research question.

  • If you want to measure something or test a hypothesis, use quantitative methods. If you want to explore ideas, thoughts and meanings, use qualitative methods.
  • If you want to analyze a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables, use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.





The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2: Statistical Analysis

28 Secondary Data Analysis

Department of Psychology, Michigan State University

Richard E. Lucas, Department of Psychology, Michigan State University, East Lansing, MI

Published: 01 October 2013

Secondary data analysis refers to the analysis of existing data collected by others. Secondary analysis affords researchers the opportunity to investigate research questions using large-scale data sets that are often inclusive of under-represented groups, while saving time and resources. Despite the immense potential for secondary analysis as a tool for researchers in the social sciences, it is not widely used by psychologists and is sometimes met with sharp criticism among those who favor primary research. The goal of this chapter is to summarize the promises and pitfalls associated with secondary data analysis and to highlight the importance of archival resources for advancing psychological science. In addition to describing areas of convergence and divergence between primary and secondary data analysis, we outline basic steps for getting started and finding data sets. We also provide general guidance on issues related to measurement, handling missing data, and the use of survey weights.

The goal of research in the social sciences is to gain a better understanding of the world and how well theoretical predictions match empirical realities. Secondary data analysis contributes to these objectives through the application of “creative analytical techniques to data that have been amassed by others” (Kiecolt & Nathan, 1985, p. 10). Primary researchers design new studies to answer research questions, whereas the secondary data analyst uses existing resources. There is a deliberate coupling of research design and data analysis in primary research; however, the secondary data analyst rarely has had input into the design of the original studies in terms of the sampling strategy and measures selected for the investigation. For better or worse, the secondary data analyst simply has access to the final products of the data collection process in the form of a codebook or set of codebooks and a cleaned data set.

The analysis of existing data sets is routine in disciplines such as economics, political science, and sociology, but it is less well established in psychology ( but see   Brooks-Gunn & Chase-Lansdale, 1991 ; Brooks-Gunn, Berlin, Leventhal, & Fuligini, 2000 ). Moreover, biases against secondary data analysis in favor of primary research may be present in psychology ( see   McCall & Appelbaum, 1991 ). One possible explanation for this bias is that psychology has a rich and vibrant experimental tradition, and the training of many psychologists has likely emphasized this approach as the “gold standard” for addressing research questions and establishing causality ( see , e.g., Cronbach, 1957 ). As a result, the nonexperimental methods that are typically used in secondary analyses may be viewed by some as inferior. Psychological scientists trained in the experimental tradition may not fully appreciate the unique strengths that nonexperimental techniques have to offer and may underestimate the time, effort, and skills required for conducting secondary data analyses in a competent and professional manner. Finally, biases against secondary data analysis might stem from lingering concerns over the validity of the self-report methods that are typically used in secondary data analysis. These can include concerns about the possibility that placement of items in a survey can influence responses (e.g., differences in the average levels of reported marital and life satisfaction when questions occur back to back as opposed to having the questions separated in the survey; see   Schwarz, 1999 ; Schwarz & Strack, 1999 ) and concerns with biased reporting of sensitive behaviors ( but see   Akers, Massey, & Clarke, 1983 ).

Despite the initial reluctance to widely embrace secondary data analysis as a tool for psychological research, there are promising signs that the skepticism toward secondary analyses will diminish as psychology seeks to position itself as a hub science that plays a key role in interdisciplinary inquiry ( see   Mroczek, Pitzer, Miller, Turiano, & Fingerman, 2011 ). Accordingly, there is a compelling argument for including secondary data analysis into the suite of methodological approaches used by psychologists ( see   Trzesniewski, Donnellan, & Lucas, 2011 ).

The goal of this chapter is to summarize the promises and pitfalls associated with secondary data analysis and to highlight the importance of archival resources for advancing psychological science. We limit our discussion to analyses based on large-scale and often longitudinal national data sets such as the National Longitudinal Study of Adolescent Health (Add Health), the British Household Panel Study (BHPS), the German Socioeconomic Panel Study (GSOEP), and the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD). However, much of our discussion applies to all secondary analyses. The perspective and specific recommendations found in this chapter draw on the edited volume by Trzesniewski et al. (2011). Following a general introduction to secondary data analysis, we will outline the necessary steps for getting started and finding data sets. Finally, we provide some general guidance on issues related to measurement, approaches to handling missing data, and survey weighting. Our treatment of these important topics is intended to draw attention to the relevant issues rather than to provide extensive coverage. Throughout, we take a practical approach to the issues and offer tips and guidance rooted in our experiences as data analysts and researchers with substantive interests in personality and life span developmental psychology.

Comparing Primary Research and Secondary Research

As noted in the opening section, it is possible that biases against secondary data analysis exist in the minds of some psychological scientists. To address these concerns, we have found it can be helpful to explicitly compare the processes of secondary analyses with primary research ( see also   McCall & Appelbaum, 1991 ). An idealized and simplified list of steps is provided in Table 28.1 . As is evident from this table, both techniques start with a research question that is ideally rooted in existing theory and previous empirical results. The areas of biggest divergence between primary and secondary approaches occur after researchers have identified their questions (i.e., Steps 2 through 5 in Table 28.1 ). At this point, the primary researcher develops a set of procedures and then engages in pilot testing to refine procedures and methods, whereas the secondary analyst searches for data sets and evaluates codebooks. The primary researcher attempts to refine her or his procedures, whereas the secondary analyst determines whether a particular resource is appropriate for addressing the question at hand. In the next stages, the primary researcher collects new data, whereas the secondary data analyst constructs a working data set from a much larger data archive. At these stages, both types of researchers must grapple with the practical considerations imposed by real world constraints. There is no such thing as a perfect single study ( see   Hunter & Schmidt, 2004 ), as all data sets are subject to limitations stemming from design and implementation. For example, the primary researcher may not have enough subjects to generate adequate levels of statistical power (because of a failure to take power calculations into account during the design phase, time or other resource constraints during the data collection phase, or because of problems with sample retention), whereas the secondary data analyst may have to cope with impoverished measurement of core constructs. Both sets of considerations will affect the ability of a given study to detect effects and provide unbiased estimates of effect sizes.

Table 28.1 also illustrates the fact that there are considerable areas of overlap between the two techniques. Researchers stemming from both traditions analyze data, interpret results, and write reports for dissemination to the wider scientific community. Both kinds of research require a significant investment of time and intellectual resources. Many skills required in conducting high-quality primary research are also required in conducting high-quality secondary data analysis including sound scientific judgment, attention to detail, and a firm grasp of statistical methodology.

Note: Steps modified and expanded from McCall and Appelbaum (1991 ).

We argue that both primary research and secondary data analysis have the potential to provide meaningful and scientifically valid research findings for psychology. Both approaches can generate new knowledge and are therefore reasonable ways of evaluating research questions. Blanket pronouncements that one approach is inherently superior to the other are usually difficult to justify. Many of the concerns about secondary data analysis are raised in the context of an unfair comparison—a contrast between the idealized conceptualization of primary research with the actual process of a secondary data analysis. Our point is that both approaches can be conducted in a thoughtful and rigorous manner, yet both approaches involve concessions to real-world constraints. Accordingly, we encourage all researchers and reviewers of papers to keep an open mind about the importance of both types of research.

Advantages and Disadvantages of Secondary Data Analysis

The foremost reason why psychologists should learn about secondary data analysis is that there are many existing data sets that can be used to answer interesting and important questions. Individuals who are unaware of these resources are likely to miss crucial opportunities to contribute new knowledge to the discipline and even risk reinventing the proverbial wheel by collecting new data. Regrettably, new data collection efforts may occur on a smaller scale than what is available in large national datasets. Researchers who are unaware of the potential treasure trove of variables in existing data sets risk unnecessarily duplicating considerable amounts of time and effort. At the very least, researchers may wish to familiarize themselves with publicly available data to truly address gaps in the literature when they undertake projects that involve new data collection.

The biggest advantage of secondary analyses is that the data have already been collected and are ready to be analyzed (see Hofferth, 2005), thus conserving time and resources. Existing data sources are often much larger and of higher quality than a single investigator could feasibly collect. This advantage is especially pronounced when considering the investments of time and money necessary to collect longitudinal data. Some data sets were collected with scientific sampling plans (such as the GSOEP), which make it possible to generalize the findings to a specific population. Further, many publicly available data sets are quite large, and therefore provide adequate statistical power for conducting many analyses, including tests of hypotheses about statistical interactions. Investigations of interactions often require a surprisingly high number of participants to achieve respectable levels of statistical power in the face of measurement error (see Aiken & West, 1991). Large-scale data sets are also well suited for subgroup analyses of populations that are often under-represented in smaller research studies.
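As a rough illustration of this point, the short simulation below (written in Python with numpy and statsmodels; the effect sizes and the assumed reliability of .70 are arbitrary choices, not values from the chapter) estimates the power to detect a modest interaction when both predictors contain measurement error:

```python
# Rough simulation sketch: power to detect an x*z interaction when the
# predictors are measured with error. All parameter values are arbitrary.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def interaction_power(n, reliability=0.7, beta_int=0.2, n_sims=500):
    """Proportion of simulated samples in which the x*z interaction is significant."""
    hits = 0
    for _ in range(n_sims):
        x, z = rng.standard_normal(n), rng.standard_normal(n)
        y = 0.3 * x + 0.3 * z + beta_int * x * z + rng.standard_normal(n)
        # Add noise so the observed predictors have the stated reliability.
        err_sd = np.sqrt(1.0 / reliability - 1.0)
        x_obs = x + rng.normal(0.0, err_sd, n)
        z_obs = z + rng.normal(0.0, err_sd, n)
        X = sm.add_constant(np.column_stack([x_obs, z_obs, x_obs * z_obs]))
        hits += sm.OLS(y, X).fit().pvalues[3] < 0.05   # p-value of the interaction term
    return hits / n_sims

for n in (100, 500, 2000):
    print(n, interaction_power(n))
```

With measurement error attenuating the interaction term, power typically remains modest at the sample sizes feasible for a single primary study but becomes respectable at the scale of large national panels.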

Another advantage of secondary data analysis is that it forces researchers to adopt an open and transparent approach to their craft. Because data are publicly available, other investigators may attempt to replicate findings and specify alternative models for a given research question. This reality encourages transparency and detailed record keeping on the part of the researcher, including careful reporting of analysis and a reasoned justification for all analytic decisions. Freese (2007 ) has provided a useful discussion about policies for archiving material necessary for replicating results, and his treatment of the issues provides guidance to researchers interested in maintaining good records.

Despite the many advantages of secondary data analysis, it is not without its disadvantages. The most significant challenge is simply the flipside of the primary advantage—the data have already been collected by somebody else! Analysts must take advantage of what has been collected without input into design and measurement issues. In some cases, an existing data set may not be available to address the particular research questions of a given investigator without some limitations in terms of sampling, measurement, or other design feature. For example, data sets commonly used for secondary analysis often have a great deal of breadth in terms of the range of constructs assessed (e.g., finances, attitudes, personality, life satisfaction, physical health), but these constructs are often measured with a limited number of survey items. Issues of measurement reliability and validity are usually a major concern. Therefore, a strong grounding in basic and advanced psychometrics is extremely helpful for responding to criticisms and concerns about measurement issues that arise during the peer-review process.

A second consequence of the fact that the data have been collected by somebody else is that analysts may not have access to all of the information about data collection procedures and issues. The analyst simply receives a cleaned data set to use for subsequent analyses. Perhaps not obvious to the user is the amount of actual cleaning that occurred behind the scenes. Similarly, the complicated sampling procedures used in a given study may not be readily apparent to users, and this issue can prevent the appropriate use of survey weights ( Shrout & Napier, 2011 ).
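The stakes are easy to see even in a toy example (the numbers below are invented, not drawn from any real survey): when some respondents stand for more people than others, an unweighted estimate can be noticeably off.

```python
# Toy illustration of survey weights: the first two respondents each represent
# three times as many people in the population as the last two.
import numpy as np

values = np.array([10.0, 12.0, 30.0, 32.0])
weights = np.array([3.0, 3.0, 1.0, 1.0])

print(values.mean())                        # unweighted mean: 21.0
print(np.average(values, weights=weights))  # weighted mean: 16.0
```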

Another significant disadvantage of secondary data analysis is the large amount of time and energy initially required to review data documentation. It can take hours and even weeks to become familiar with the codebooks and to discover which research questions have already been addressed by investigators using the existing data sets. It is very easy to underestimate how long it will take to move from an initial research idea to a competent final analysis. There is a risk that, unbeknownst to one another, researchers in different locations will pursue answers to the same research questions. On the other hand, once a researcher has become familiar with a data set and developed skills to work with the resource, they are able to pursue additional research questions, resulting in multiple publications from the same data set. It is our experience that the process of learning about a data set can help generate new research ideas as it becomes clearer how the resource can be used to contribute to psychological science. Thus, the initial time and energy expended to learn about a resource can be viewed as an initial investment that holds the potential to pay larger dividends over time.

Finally, a possible disadvantage concerns how secondary data analyses are viewed within particular subdisciplines of psychology and by referees during the peer-review process. Some journals and some academic departments may not value secondary data analyses as highly as primary research. Such preferences might break along Cronbach’s two disciplines or two streams of psychology—correlational versus experimental ( Cronbach, 1957 ; Tracy, Robins, & Sherman, 2009 ). The reality is that if original data collection is more highly valued in a given setting, then new investigators looking to build a strong case for getting hired or getting promoted might face obstacles if they base a career exclusively on secondary data analysis. Similarly, if experimental methods are highly valued and correlational methods are denigrated in a particular subfield, then results of secondary data analyses will face difficulties getting attention (and even getting published). The best advice is to be aware of local norms and to act accordingly.

Steps for Beginning a Secondary Data Analysis

Step 1: Find Existing Data Sets . After generating a substantive question, the first task is to find relevant data sets ( see   Pienta, O’Rouke, & Franks, 2011 ). In some cases researchers will be aware of existing data sets through familiarity with the literature given that many well-cited papers have used such resources. For example, the GSOEP has now been widely used to address questions about correlates and developmental course of subjective well-being (e.g., Baird, Lucas, & Donnellan, 2010 ; Gerstorf, Ram, Estabrook, Schupp, Wagner, & Lindenberger, 2008 ; Gerstorf, Ram, Goebel, Schupp, Lindenberger, & Wagner, 2010 ; Lucas, 2005 ; 2007 ), and thus, researchers in this area know to turn to this resource if a new question arises. In other cases, however, researchers will attempt to find data sets using established archives such as the University of Michigan’s Interuniversity Consortium for Political and Social Research (ICPSR; http://www.icpsr.umich.edu/icpsrweb/ICPSR/ ). In addition to ICPSR, there are a number of other major archives ( see   Pienta et al., 2011 ) that house potentially relevant data sets. Here are just a few starting points:

The Henry A. Murray Research Archive ( http://www.murray.harvard.edu/ )

The Howard W Odum Institute for Research in Social Science ( http://www.irss.unc.edu/odum/jsp/home2.jsp )

The National Opinion Research Center ( http://norc.org/homepage.htm )

The Roper Center of Public Opinion Research ( http://ropercenter.uconn.edu/ )

The United Kingdom Data Archive ( http://www.data-archive.ac.uk/ )

Individuals in charge of these archives and data depositories often catalog metadata, which is the technical term for information about the constituent data sets. Typical kinds of metadata include information about the original investigators, a description of the design and process of data collection, a list of the variables assessed, and notes about sampling weights and missing data. Searching through this information is an efficient way of gaining familiarity with data sets. In particular, the ICPSR has an impressive infrastructure for allowing researchers to search for data sets through a cataloguing of study metadata. The ICPSR is thus a useful starting point for finding the raw material for a secondary data analysis. The ICPSR also provides a new user tutorial for searching their holdings ( http://www.icpsr.umich.edu/icpsrweb/ICPSR/help/newuser.jsp ). We recommend that researchers search through their holdings to make a list of potential data sets. At that point, the next task is to obtain relevant codebooks to learn more about each resource.

Step 2: Read Codebooks. Researchers interested in using an existing data set are strongly advised to thoroughly read the accompanying codebook (Pienta et al., 2011). There are several reasons why a comprehensive understanding of the codebook is a critical first step when conducting a secondary data analysis. First, the codebook will detail the procedures and methods used to acquire the data and provide a list of all of the questions and assessments collected. A thorough reading of the codebook can provide insights into important covariates that can be included in subsequent models, and a careful reading will draw the analyst’s attention to key variables that will be missing because no such information was collected. Reading through a codebook can also help to generate new research questions.

Second, high-quality codebooks often report basic descriptive information for each variable such as raw frequency distributions and information about the extent of missing values. The descriptive information in the codebook can give investigators a baseline expectation for variables under consideration, including the expected distributions of the variables and the frequencies of under-represented groups (such as ethnic minority participants). Because it is important to verify that the descriptive statistics in the published codebook match those in the file analyzed by the secondary analyst, a familiarity with the codebook is essential. In addition to codebooks, many existing resources provide copies of the actual surveys completed by participants (Pienta et al., 2011). However, the use of actual pencil-and-paper surveys is becoming less common with the advent of computer-assisted interview techniques and Internet surveys. It is often the case that survey methods involve skip patterns (e.g., a participant is not asked about the consequences of her drinking if she responds that she doesn’t drink alcohol) that make it more difficult to assume the perspective of the “typical” respondent in a given study (Pienta et al., 2011). Nonetheless, we recommend that analysts try to develop an understanding of the experiences of the participant in a given study. This perspective can help secondary analysts develop an intuitive understanding of certain patterns of missing data and anticipate concerns about question ordering effects (see, e.g., Schwarz, 1999).

Step 3: Acquire Datasets and Construct a Working Datafile . Although there is a growing availability of Web-based resources for conducting basic analyses using selected data sets (e.g., the Survey Documentation Analysis software used by ICPSR), we are convinced that there is no substitute for the analysis of the raw data using the software packages of preference for a given investigator. This means that the analysts will need to acquire the data sets that they consider most relevant. This is typically a very straightforward process that involves acknowledging researcher responsibilities before downloading the entire data set from a website. In some cases, data are classified as restricted-use, and there are more extensive procedures for obtaining access that may involve submitting a detailed security plan and accompanying legal paperwork before becoming an authorized data user. When data involve children and other sensitive groups, Institutional Review Board approval is often required.

Each data set has different usage requirements, so it is difficult to provide blanket guidance. Researchers should be aware of the policies for using each data set and recognize their ethical responsibility for adhering to those regulations. A central issue is that the researcher must avoid deductive disclosure whereby otherwise anonymous participants are identified because of prior knowledge in conjunction with the personal characteristics coded in the dataset (e.g., gender, racial/ethnic group, geographic location, birth date). Such a practice violates the major ethical principles followed by responsible social scientists and has the potential to harm research participants.

Once the entire set of raw data is acquired, it is usually straightforward to import the files into the kinds of statistical packages used by researchers (e.g., R, SAS, SPSS, and STATA). At this point, it is likely that researchers will want to create a smaller “working” file by pulling only relevant variables from the larger master files. It is often too cumbersome to work with a computer file that may have more than a thousand columns of information. The solution is to construct a working data file that has all of the needed variables tied to a particular research project. Researchers may also need to link multiple files by matching longitudinal data sets and linking to contextual variables such as information about schools or neighborhoods for data sets with a multilevel structure (e.g., individuals nested in schools or neighborhoods).
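As a concrete sketch of this step (using Python with pandas; the file names, identifiers, and variable names below are hypothetical placeholders rather than those of any real study), a working file might be assembled like this:

```python
# Hypothetical sketch: build a small "working" file from larger master files,
# linking two longitudinal waves and attaching school-level context variables.
import pandas as pd

KEEP_W1 = ["person_id", "school_id", "age", "self_esteem_1", "self_esteem_2"]
KEEP_W2 = ["person_id", "life_sat"]

wave1 = pd.read_csv("wave1.csv", usecols=KEEP_W1)
wave2 = pd.read_csv("wave2.csv", usecols=KEEP_W2)
schools = pd.read_csv("schools.csv")          # contextual (level-2) variables

# Link waves on the person identifier, then attach school-level context for a
# multilevel structure (individuals nested in schools).
working = (
    wave1.merge(wave2, on="person_id", how="left")
         .merge(schools, on="school_id", how="left")
)

print(working.shape)                          # track the sample size at every step
working.to_csv("working_file.csv", index=False)
```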

Explicit guidance about managing a working data file can be found in Willms (2011 ). Here, we simply highlight some particularly useful advice: (1) keep exquisite notes about what variables were selected and why; (2) keep detailed notes regarding changes to each variable and reasons why; and (3) keep track of sample sizes throughout this entire process. The guiding philosophy is to create documentation that is clear enough for an outside user to follow the logic and procedures used by the researcher. It is far too easy to overestimate the power of memory only to be disappointed when it comes time to revisit a particular analysis. Careful documentation can save time and prevent frustration. Willms (2011 ) noted that “keeping good notes is the sine qua non of the trade” (p. 33).

Step 4: Conduct Analyses. After assembling the working data file, the researcher will likely construct major study variables by creating scale composites (e.g., the mean of the responses to the items assessing the same construct) and conduct initial analyses. As previously noted, a comparison of the distributions and sample sizes with those in the study codebook is essential at this stage. Any deviations between the variables in the working data file and the codebook should be understood and documented. It is particularly useful to keep track of missing values to make sure that they have been properly coded. It should go without saying that an observed value of -9999 will typically require recoding to a missing value in the working file. Similarly, errors in reverse scoring items can be particularly common (and troubling), so researchers are well advised to conduct thorough item-level and scale analyses and check to make sure that reverse scoring was done correctly (e.g., examine the inter-item correlation matrix when calculating internal consistency estimates to screen for negative correlations). Willms (2011) provides some very savvy advice for the initial stages of actual data analysis: “Be wary of surprise findings” (p. 35). He noted that “too many times I have been excited by results only to find that I have made some mistake” (p. 35). Caution, skepticism, and a good sense of the underlying data set are essential for detecting mistakes.
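A minimal sketch of these checks (again in Python with pandas; the item names, the sentinel code, and the assumption of a 1–5 response scale are hypothetical) might look like the following:

```python
# Minimal sketch of initial data checks on a hypothetical working file with
# four self-esteem items (se1-se4) on a 1-5 response scale.
import numpy as np
import pandas as pd

df = pd.read_csv("working_file.csv")
items = ["se1", "se2", "se3", "se4"]

# 1. Recode sentinel values such as -9999 (or whatever the codebook lists) to missing.
df[items] = df[items].replace([-9999], np.nan)

# 2. Reverse-score the negatively worded items (hypothetically se2 and se4).
for item in ["se2", "se4"]:
    df[item] = 6 - df[item]

# 3. Screen the inter-item correlation matrix: negative correlations usually
#    signal a reverse-scoring mistake.
print(df[items].corr().round(2))

# 4. Build the scale composite and compare its distribution with the codebook.
df["self_esteem"] = df[items].mean(axis=1)
print(df["self_esteem"].describe())
```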

An important comment about the nature of secondary data analysis is again worth emphasizing: These data sets are available to others in the scholarly community. This means that others should be able to replicate your results! It is also very useful to adopt a self-critical perspective because others will be able to subject findings to their own empirical scrutiny. Contemplate alternative explanations and attempt to conduct analyses to evaluate the plausibility of these explanations. Accordingly, we recommend that researchers strive to think of theoretically relevant control variables and include them in the analytic models when appropriate. Such an approach is useful both from the perspective of scientific progress (i.e., attempting to curb confirmation biases) and in terms of surviving the peer-review process.

Special Issue: Measurement Concerns in Existing Datasets

One issue with secondary data analyses that is likely to perplex psychologists is the measurement of core constructs. The reality is that many of the measures available in large-scale data sets consist of a subset of items derived from instruments commonly used by psychologists (see Russell & Matthews, 2011). For example, the 10-item Rosenberg Self-Esteem scale (Rosenberg, 1965) is the most commonly used measure of global self-esteem in the literature (Donnellan, Trzesniewski, & Robins, 2011). Measures of self-esteem are available in many data sets like Monitoring the Future (see Trzesniewski & Donnellan, 2010), but these measures are typically shorter than the original Rosenberg scale. Similarly, the GSOEP has a single-item rating of subjective well-being in the form of happiness, whereas psychologists might be more accustomed to measuring this construct with at least five items (e.g., Diener, Emmons, Larsen, & Griffin, 1985). Researchers using existing data sets will have to grapple with the consequences of having relatively short assessments in terms of the impact on reliability and validity.

For purposes of this chapter, we will make use of a conventional distinction between reliability and validity. Reliability will refer to the degree of measurement error present in a given set of scores (or alternatively the degree of consistency or precision in scores), whereas validity will refer to the degree to which measures capture the construct of interest and predict other variables in ways that are consistent with theory. More detailed but accessible discussions of reliability and validity can be found in Briggs and Cheek (1986 ), Clark and Watson (1995 ), John and Soto (2007 ), Messick (1995 ), Simms (2008 ), and Simms and Watson (2007 ). Widaman, Little, Preacher, and Sawalani (2011 ) have provided a discussion of these issues in the context of the shortened assessments available in existing data sets.

Short Measures and Reliability. Classical Test Theory (e.g., Lord & Novick, 1968) is the measurement perspective most commonly used among psychologists. According to this measurement philosophy, any observed score is a function of the underlying attribute (the so-called “true score”) and measurement error. Measurement error is conceptualized as any deviation or inconsistency in observed scores for the same attribute across multiple assessments of that attribute. A thought experiment may help crystallize insights about reliability (e.g., Lord & Novick, 1968): Imagine a thousand identical clones each completing the same self-esteem instrument simultaneously. The underlying self-esteem attribute (i.e., the true scores) should be the same for each clone (by definition), whereas the observed scores may fluctuate across clones because of random measurement errors (e.g., a single clone misreading an item vs. another clone being frustrated by an extremely hot testing room). The extent of the observed fluctuations in reported scores across clones offers insight into how much measurement error is present in this instrument. If scores are tightly clustered around a single value, then measurement error is minimal; however, if scores are dramatically different across clones, then there is a clear indication of problems with reliability. The measure is imprecise because it yields inconsistent values across the same true scores.

These ideas about reliability can be applied to observed samples of scores such that the total observed variance is attributable to true score variance (i.e., true individual differences in underlying attributes) and variance stemming from random measurement errors. The assumption that measurement error is random means that it has an expected value of zero across observations. Using this framework, reliability can then be defined as the ratio of true score variance to the total observed variance. An assessment that is perfectly reliable (i.e., has no measurement error) will have a ratio of 1.0, whereas an assessment that is completely unreliable will yield a ratio of 0.0 ( see   John & Soto, 2007 , for an expanded discussion). This perspective provides a formal definition of a reliability coefficient.
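
In standard classical test theory notation (using conventional symbols rather than the chapter's own), these ideas can be written compactly as

$$X = T + E, \qquad \sigma^2_X = \sigma^2_T + \sigma^2_E, \qquad \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E},$$

where $X$ is the observed score, $T$ the true score, $E$ random measurement error with an expected value of zero, and $\rho_{XX'}$ the reliability coefficient.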

Psychologists have developed several tools to estimate the reliability of their measures, but the approach that is most commonly used is coefficient α (Cronbach, 1951; see Schmitt, 1996, for an accessible review). This approach considers reliability from the perspective of internal consistency. The basic idea is that fluctuations across items assessing the same construct reflect the presence of measurement error. The formula for the standardized α is a fairly simple function of the average inter-item correlation (a measure of inter-item homogeneity) and the total number of items in a scale. The α coefficient is typically judged acceptable if it is above 0.70, but the justification for this particular cutoff is somewhat arbitrary (see Lance, Butts, & Michels, 2006). Researchers are therefore advised to take a more critical perspective on this statistic. A relevant concern is that α is negatively impacted when the measure is short.
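
For reference, the standardized α for a scale of $k$ items with an average inter-item correlation of $\bar{r}$ is

$$\alpha_{\text{standardized}} = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}},$$

so, holding $\bar{r}$ constant, α rises as items are added and falls as the scale is shortened.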

Given concerns with scale length and α, many methodologically oriented researchers recommend evaluating and reporting the average inter-item correlation because it can be interpreted independently of length and thus represents a “more straightforward indicator of internal consistency” (Clark & Watson, 1995, p. 316). Consider that it is common to observe an average inter-item correlation for the 10-item Rosenberg Self-Esteem scale (Rosenberg, 1965) of around 0.40 (this is based on typically reported α coefficients; see Donnellan et al., 2011). This same level of internal homogeneity (i.e., an inter-item correlation of 0.40) yields an α of around 0.67 with a 3-item scale but an α of around 0.87 with 10 items. A measure of a broader construct like Extraversion may generate an average inter-item correlation of 0.20 (Clark & Watson, 1995, p. 316), which would translate to an α of 0.43 for a 3-item scale and 0.71 for a 10-item scale. The point is that α coefficients will fluctuate with scale length and the breadth of the construct. Because most scales in existing resources are short, the α coefficients might fall below the 0.70 convention despite having a respectable level of inter-item correlation.
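
These figures follow directly from the standardized α formula above; the short Python check below is simply a convenience for readers who want to verify or extend the arithmetic.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized coefficient alpha from the average inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Values discussed in the text (approximate):
print(round(standardized_alpha(3, 0.40), 2))   # ~0.67
print(round(standardized_alpha(10, 0.40), 2))  # ~0.87
print(round(standardized_alpha(3, 0.20), 2))   # ~0.43
print(round(standardized_alpha(10, 0.20), 2))  # ~0.71
```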

Given these considerations, we recommend that researchers consider the average inter-item correlation more explicitly when working with secondary data sets. It is also important to consider the breadth of the underlying construct to generate expectations for reasonable levels of item homogeneity as indexed by the average inter-item correlation. Clark and Watson (1995; see also Briggs & Cheek, 1986) recommend values of around 0.40 to 0.50 for measures of fairly narrow constructs (e.g., self-esteem) and values of around 0.15 to 0.20 for measures of broader constructs (e.g., neuroticism). It is our experience that considerations about internal consistency often need to be made explicit in manuscripts so that reviewers will not take an unnecessarily harsh perspective on α’s that fall below their expectations. Finally, we want to emphasize that internal consistency is but one kind of reliability. In some cases, it might be that test-retest reliability is more informative and diagnostic of the quality of a measure (McCrae, Kurtz, Yamagata, & Terracciano, 2011). Fortunately, many secondary data sets are longitudinal, so it is possible to get an estimate of longer-term test-retest reliability from the existing data.

Beyond simply reporting estimates of reliability, it is worth considering why measurement reliability is such an important issue in the first place. One consequence of imperfect reliability for substantive research is that measurement imprecision tends to depress observed correlations with other variables. This notion of attenuation resulting from measurement error, along with a correction for it, was discussed by Spearman as far back as 1904 (see, e.g., pp. 88–94). Unreliable measures can affect the conclusions drawn from substantive research by imposing a downward bias on effect size estimation. This is perhaps why Widaman et al. (2011) advocate using latent variable structural modeling methods to combat this important consequence of measurement error. Their recommendation is well worth considering for those with experience with this technique (see Kline, 2011, for an introduction). Regardless of whether researchers use observed variables or latent variables for their analyses, it is important to recognize and appreciate the consequences of imperfect reliability.
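
Spearman's classic correction for attenuation expresses this relation explicitly: the correlation between the underlying true scores equals the observed correlation divided by the square root of the product of the two reliabilities,

$$r_{T_X T_Y} = \frac{r_{XY}}{\sqrt{r_{XX'}\,r_{YY'}}}.$$

For example, an observed correlation of .30 between measures with reliabilities of .70 and .60 corresponds to a disattenuated correlation of roughly .46; this is an illustrative calculation, not a result from any of the data sets discussed here.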

Short Measures and Validity . Validity, for our purposes, reflects how well a measure captures the underlying conceptual attribute of interest. All discussions of validity are based, in part, on agreement in a field as to how to understand the construct in question. Validity, like reliability, is assessed as a matter of degree rather than a categorical distinction between valid or invalid measures. Cronbach and Meehl (1955 ) have provided a classic discussion of construct validity, perhaps the most overarching and fundamental form of validity considered in psychological research ( see also   Smith, 2005 ). However, we restrict our discussion to content validity and criterion-related validity because these two types of validity are particularly relevant for secondary data analysis and they are more immediately addressable.

Content validity describes how well a measure captures the entire domain of the construct in question. Judgments regarding content validity are ideally made by panels of experts familiar with the focal construct. A measure is considered construct deficient if it fails to assess important elements of the construct. For example, if thoughts of suicide are an integral aspect of the construct of depression and a given self-report measure is missing items that tap this content, then the measure would be deemed construct deficient. A measure can also suffer from construct contamination if it includes extraneous items that are irrelevant to the focal construct. For example, if somatic symptoms like a rapid heartbeat are considered to reflect the construct of anxiety and not part of depression, then a depression inventory that has such an item would suffer from construct contamination. Given the reduced length of many assessments, concerns over construct deficiency are likely to be especially pressing. A short assessment may not include enough items to capture the full breadth of a broad construct. This limitation is not readily addressed and should be acknowledged (see Widaman et al., 2011). In particular, researchers may need to clearly specify that their findings are based on a narrower content domain than is normally associated with the focal construct of interest.

A subtle but important point can arise when considering the content of measures with particularly narrow content. Internal consistency will increase when there is redundancy among items in the scale; however, the presence of similar items may decrease predictive power. This is known as the attenuation paradox in psychometrics (see Clark & Watson, 1995). When items are nearly identical, they contribute redundant information about a very specific aspect of the construct. However, that very specific attribute may not have predictive power. In essence, reliability can be maximized at the expense of creating a measure that is not very useful from the point of view of prediction (and likely explanation). Indeed, Clark and Watson (1995) have argued that the “goal of scale construction is to maximize validity rather than reliability” (p. 316). In short, an evaluation of content validity is also important when considering the predictive power of a given measure.

Whereas content validity is focused on the internal attributes of a measure, criterion-related validity is based on the empirical relations between measures and other variables. Using previous research and theory surrounding the focal construct, the researcher should develop an expectation regarding the magnitude and direction of observed associations (i.e., correlations) with other variables. A good supporting theory of a construct should stipulate a pattern of association, or nomological network, concerning those other variables that should be related and unrelated to the focal construct. This latter requirement is often more difficult to specify from existing theories, which tend to provide a more elaborate discussion of convergent associations rather than discriminant validity ( Widaman et al., 2011 ). For example, consider a very truncated nomological network for Agreeableness (dispositional kindness and empathy). Measures of this construct should be positively associated with romantic relationship quality, negatively related to crime (especially violent crime), and distinct from measures of cognitive ability such as tests of general intelligence.

Evaluations of criterion-related validity can be conducted within a data set as researchers document that a measure has an expected pattern of associations with existing criterion-related variables. Investigators using secondary data sets may want to conduct additional research to document the criterion-related validity of short measures with additional convenience samples (e.g., the ubiquitous college student samples used by many psychologists; Sears, 1986). For example, there are six items in the Add Health data set that appear to measure self-esteem (e.g., “I have a lot of good qualities” and “I like myself just the way I am”) (see Russell, Crockett, Shen, & Lee, 2008). Although many of the items bear a strong resemblance to the items on the Rosenberg Self-Esteem scale (Rosenberg, 1965), they are not exactly the same items. To obtain some additional data on the usefulness of this measure, we administered the Add Health items to a sample of 387 college students at our university along with the Rosenberg Self-Esteem scale and an omnibus measure of personality based on the Five-Factor model (Goldberg, 1999). The six Add Health items were strongly correlated with the Rosenberg (r = 0.79), and both self-esteem measures had a similar pattern of convergent and divergent associations with the facets of the Five-Factor model (the two profiles were very strongly associated: r > 0.95). This additional information can help bolster the case for the validity of the short Add Health self-esteem measure.

Special Issue: Missing Data in Existing Data Sets

Missing data are a fact of life in research: individuals may drop out of longitudinal studies or refuse to answer particular questions. These behaviors can affect the generalizability of findings because results may only apply to those individuals who choose to complete a study or a measure. Missing data can also diminish statistical power when common techniques like listwise deletion are used (e.g., only using cases with complete information, thereby reducing the sample size) and even lead to biased effect size estimates (e.g., McKnight & McKnight, 2011; McKnight, McKnight, Sidani, & Figueredo, 2007; Widaman, 2006). Thus, concerns about missing data are important for all aspects of research, including secondary data analysis. The development of specific techniques for appropriately handling missing data is an active area of research in quantitative methods (Schafer & Graham, 2002).

Unfortunately, the literature surrounding missing data techniques is often technical and steeped in jargon, as noted by McKnight et al. (2007). The reality is that researchers attempting to understand issues of missing data need to pay careful attention to terminology. For example, a novice researcher may not immediately grasp the classification of missing data used in the literature (see Schafer & Graham, 2002, for a clear description). Consider the confusion that may stem from learning that data are missing at random (MAR) versus missing completely at random (MCAR). The term MAR does not mean that missing values occurred only because of chance factors; that is the case when data are MCAR. Data that are MCAR are absent because of truly random factors. Data that are MAR are missing with a probability that depends only on other observed information in the data set, and the missingness can essentially be “ignored” when those other factors are included in the statistical model. The last type of missing data, data missing not at random (MNAR), is likely to characterize the variables in many real-life data sets. As it stands, methods for handling data that are MAR and MCAR are better developed and more easily implemented than methods for handling data that are MNAR. Thus, many applied researchers will assume data are MAR for purposes of statistical modeling (and the ability to sleep comfortably at night). Fortunately, such an assumption might not create major problems for many analyses and may in fact represent the “practical state of the art” (Schafer & Graham, 2002, p. 173).
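
The distinction between MCAR and MAR is easy to demonstrate with a small simulation (our own illustration, not taken from the sources cited above): when missingness on y depends on an observed covariate x, a complete-case estimate of the mean of y is biased, whereas under MCAR it is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)            # fully observed variable (e.g., a Time 1 score)
y = 0.5 * x + rng.normal(size=n)  # variable subject to missingness (e.g., Time 2)

# MCAR: each y value is missing with the same probability, regardless of anything.
miss_mcar = rng.random(n) < 0.3

# MAR: the probability that y is missing depends only on the observed x
# (here, cases with low x are much more likely to be missing).
p_missing = 1 / (1 + np.exp(2 * x))
miss_mar = rng.random(n) < p_missing

print(y.mean())               # full-data mean (close to 0)
print(y[~miss_mcar].mean())   # complete-case mean under MCAR: roughly unbiased
print(y[~miss_mar].mean())    # complete-case mean under MAR: noticeably biased upward
```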

The literature on missing data techniques is growing, so we simply recommend that researchers keep current on developments in this area. McKnight et al. (2007) and Widaman (2006) each provide an accessible primer on missing data techniques. In keeping with the largely practical bent of the chapter, we suggest that researchers keep careful track of the amount of missing data present in their analyses and report such information clearly in research papers (see McKnight & McKnight, 2011). Similarly, we recommend that researchers thoroughly screen their data sets for evidence that missing values depend on other measured variables (e.g., scores at Time 1 might be associated with Time 2 dropout). In general, we suggest that researchers avoid listwise and pairwise deletion methods because there is very little evidence that these are good practices (see Jeličić, Phelps, & Lerner, 2009; Widaman, 2006). Rather, it might be easiest to use direct fitting methods such as the estimation procedures used in conventional structural equation modeling packages (e.g., Full Information Maximum Likelihood; see Allison, 2003). At the very least, it is usually instructive to compare results obtained using listwise deletion with results obtained with direct model fitting in terms of the effect size estimates and basic conclusions regarding the statistical significance of focal coefficients.
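
As a practical illustration of these recommendations, the sketch below reports the amount of missing data per variable and compares listwise (complete-case) correlations with available-case (pairwise) correlations as a rough sensitivity check. Full information maximum likelihood itself requires an SEM package, so this is only a first-pass diagnostic; the variable names are hypothetical.

```python
import pandas as pd

working = pd.read_csv("working_wave1.csv")          # hypothetical working file
vars_of_interest = ["selfesteem", "gpa", "parent_support"]

# Report the amount of missing data per variable.
print(working[vars_of_interest].isna().mean().round(3))

# Compare listwise (complete-case) and available-case correlation estimates.
complete = working[vars_of_interest].dropna()
print("n after listwise deletion:", len(complete))
listwise = complete.corr()
pairwise = working[vars_of_interest].corr()          # pandas computes pairwise by default
print((listwise - pairwise).abs().round(3))          # large gaps warrant closer scrutiny
```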

Special Issue: Sample Weighting in Existing Data Sets

One of the advantages of many existing data sets is that they were collected using probabilistic sampling methods so that researchers can obtain unbiased population estimates. Such estimates, however, are only obtained when complex survey weights are formally incorporated into the statistical modeling procedures. Such weighting schemes can affect the correlations between variables, and therefore all users of secondary data sets should become familiar with sampling design when they begin working with a new data set. A considerable amount of time and effort is dedicated toward generating complex weighting schemes that account for the precise sampling strategies used in the given study, and users of secondary data sets should give careful consideration to using these weights appropriately.
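
A toy example with made-up numbers shows why the weights matter: estimates can shift once each respondent is weighted by the number of people he or she represents. Note that a proper design-based analysis also requires variance estimation that accounts for strata and clusters, which this sketch ignores.

```python
import numpy as np

# Made-up respondent scores and the survey weights supplied with a data set
# (larger weights mean the respondent represents more people in the population).
scores = np.array([3.2, 4.1, 2.8, 3.9, 4.4, 3.0])
weights = np.array([150.0, 80.0, 300.0, 120.0, 60.0, 250.0])

print(round(scores.mean(), 2))                        # unweighted mean: ~3.57
print(round(np.average(scores, weights=weights), 2))  # weighted mean:   ~3.26
```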

In some cases, the addition of sampling weights will have little substantive impact on findings, so extensive concern over weighting might be overstated. On the other hand, any potential difference is ultimately an empirical question, so researchers are well advised to consider the importance of sampling weights (Shrout & Napier, 2011). The problem is that many psychologists are not well versed in the use of sampling weights (Shrout & Napier, 2011). Thus, psychologists may not be in a strong position to evaluate whether sample weighting concerns are relevant. In addition, it is sometimes necessary to use specialized software packages or add-ons to adjust analytic models appropriately for sampling weights. Programs such as STATA and SAS have such capabilities in the base package, whereas packages like SPSS sometimes require a complex samples add-on that integrates with its existing capabilities. Whereas the graduate training of the modal sociologist or demographer is likely to emphasize survey research and thus presumably cover sampling, this is not the case with the methodological training of many psychologists (Aiken, West, & Millsap, 2008). Psychologists who are unfamiliar with sample weighting procedures are well advised to seek the counsel of a survey methodologist before undertaking data analysis.

In terms of practical recommendations, it is important for the user of the secondary data set to develop a clear understanding of how the data were collected by reading documentation about the design and sampling procedure ( Shrout & Napier, 2011 ). This insight will provide a conceptual framework for understanding weighting schemes and for deciding how to appropriately weight the data. Once researchers have a clear idea of the sampling scheme and potential weights, actually incorporating available weights into analyses is not terribly difficult, provided researchers have the appropriate software ( Shrout & Napier, 2011 ). Weighting tutorials are often available for specific data sets. For example, the Add Health project has a document describing weighting ( http://www.cpc.unc.edu/projects/addhealth/faqs/aboutdata/weight1.pdf ) as does the Centers for Disease Control and Prevention for use with their Youth Risk Behavior Surveys ( http://www.cdc.gov/HealthyYouth/yrbs/pdf/YRBS_analysis_software.pdf ). These free documents may also provide useful and accessible background even for those who may not use the data from these projects.

Secondary data analysis refers to the analysis of existing data that may not have been explicitly collected to address a particular research question. Many of the quantitative techniques described in this volume can be applied using existing resources. To be sure, strong data analytic skills are important for fully realizing the potential benefits of secondary data sets, and such skills can help researchers recognize the limits of a data set for any given analysis.

In particular, measurement issues are likely to create the biggest hurdles for psychologists conducting secondary analyses, both in terms of offering a reasonable interpretation of the results and in surviving the peer-review process. Accordingly, a familiarity with basic issues in psychometrics is very helpful. Beyond such skills, the effective use of these existing resources requires patience and strong attention to detail. Effective secondary data analysis also requires a fair bit of curiosity to seek out those resources that might be used to make important contributions to psychological science.

Ultimately, we hope that the field of psychology becomes more and more accepting of secondary data analysis. As psychologists use this approach with increasing frequency, it is likely that the organizers of major ongoing data collection efforts will be increasingly open to including measures of prime interest to psychologists. The individuals in charge of projects like the BHPS, the GSOEP, and the National Center for Education Statistics ( http://nces.ed.gov/ ) want their data to be used by the widest possible audiences and will respond to researcher demands. We believe that it is time that psychologists join their colleagues in economics, sociology, and political science in taking advantage of these existing resources. It is also time to move beyond divisive discussions surrounding the presumed superiority of primary data collection over secondary analysis. There is no reason to choose one over the other when the field of psychology can profit from both. We believe that the relevant topics of debate are not about the method of initial data collection but, rather, about the importance and intrinsic interest of the underlying research questions. If the question is important and the research design and measures are suitable, then there is little doubt in our minds that secondary data analysis can make a contribution to psychological science.

Author Note

M. Brent Donnellan, Department of Psychology, Michigan State University, East Lansing, MI 48824.

Richard E. Lucas, Department of Psychology, Michigan State University, East Lansing, MI 48824.

One consequence of large sample sizes, however, is that issues of effect size interpretation become paramount given that very small correlations or very small mean differences between groups are likely to be statistically significant using conventional null hypothesis significance tests (e.g., Trzesniewski & Donnellan, 2009 ). Researchers will therefore need to grapple with issues related to null hypothesis significance testing ( see   Kline, 2004 ).

Aiken, L. S. , & West, S. G. ( 1991 ). Multiple regression: Testing and interpreting interactions . Newbury Park, CA: Sage.

Aiken, L. S. , West, S. G. , & Millsap, R. E. ( 2008 ). Doctoral training in statistics, measurement, and methodology in psychology: Replication and extension of Aiken, West, Sechrest, and Reno’s (1990) survey of Ph.D. programs in North America.   American Psychologist, 63, 32–50.

Akers, R. L. , Massey, J. , & Clarke, W ( 1983 ). Are self-reports of adolescent deviance valid? Biochemical measures, randomized response, and the bogus pipeline in smoking behavior.   Social Forces, 62, 234–251.

Allison, P. D. ( 2003 ). Missing data techniques for structural equation modeling.   Journal of Abnormal Psychology, 112, 545–557.

Baird, B. M. , Lucas, R. E. , & Donnellan, M. B. ( 2010 ). Life Satisfaction across the lifespan: Findings from two nationally representative panel studies.   Social Indicators Research, 99, 183–203.

Briggs, S. R. , & Cheek, J. M. ( 1986 ). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 106–148.

Brooks-Gunn, J. , Berlin, L. J. , Leventhal, T. , & Fuligini, A. S. ( 2000 ). Depending on the kindness of strangers: Current national data initiatives and developmental research.   Child Development, 71, 257–268.

Brooks-Gunn, J. , & Chase-Lansdale, P. L. ( 1991 ) (Eds.). Secondary data analyses in developmental psychology [Special section].   Developmental Psychology, 27, 899–951.

Clark, L. A. , & Watson, D. ( 1995 ). Constructing validity: Basic issues in objective scale development.   Psychological Assessment, 7, 309–319.

Cronbach, L. J. ( 1951 ). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.

Cronbach, L. J. ( 1957 ). The two disciplines of scientific psychology.   American Psychologist, 12, 671–684.

Cronbach, L. J. , & Meehl, P. ( 1955 ). Construct validity in psychological tests.   Psychological Bulletin, 52, 281–302.

Diener, E. , Emmons, R. A. , Larsen, R. J. , & Griffin, S. ( 1985 ). The Satisfaction with Life Scale.   Journal of Personality Assessment, 49, 71–75.

Donnellan, M. B. , Trzesniewski, K. H. , & Robins, R. W. ( 2011 ). Self-esteem: Enduring issues and controversies. In T Chamorro-Premuzic , S. von Stumm , and A. Furnham (Eds). The Wiley-Blackwell Handbook of Individual Differences (pp. 710–746). New York: Wiley-Blackwell.

Freese, J. ( 2007 ). Replication standards for quantitative social science: Why not sociology?   Sociological Methods & Research, 36, 153–172.

Gerstorf, D. , Ram, N. , Estabrook, R. , Schupp, J. , Wagner, G. G. , & Lindenberger, U. ( 2008 ). Life satisfaction shows terminal decline in old age: Longitudinal evidence from the German Socio-Economic Panel Study (SOEP).   Developmental Psychology, 44, 1148–1159.

Gerstorf, D. , Ram, N. , Goebel, J. , Schupp, J. , Lindenberger, U. , & Wagner, G. G. ( 2010 ). Where people live and die makes a difference: Individual and geographic disparities in well-being progression at the end of life.   Psychology and Aging, 25, 661–676.

Goldberg, L. R. ( 1999 ). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I Mervielde , I. Deary , F. De Fruyt , & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7–28). Tilburg, The Netherlands: Tilburg University Press.

Hofferth, S. L. , ( 2005 ). Secondary data analysis in family research.   Journal of Marriage and the Family, 67, 891–907.

Hunter, J. E. , & Schmidt, F. L. ( 2004 ). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Newbury Park, CA: Sage.

Jeličić, H. , Phelps, E. , & Lerner, R. M. ( 2009 ). Use of missing data methods in longitudinal studies: The persistence of bad practices in developmental psychology.   Developmental Psychology, 45, 1195–1199.

John, O. P. , & Soto, C. J. ( 2007 ). The importance of being valid. In R. W Robins , R. C. Fraley , and R. F. Krueger (Eds). Handbook of Research Methods in Personality Psychology (pp. 461–494). New York: Guilford Press.

Kiecolt, K. J. , & Nathan, L. E. ( 1985 ). Secondary analysis of survey data (Sage University Paper Series on Quantitative Applications in the Social Sciences, No. 53). Newbury Park, CA: Sage.

Kline, R. B. ( 2004 ). Beyond significance testing: Reforming data analysis methods in behavioral research . Washington, DC: American Psychological Association.

Kline, R. B. ( 2011 ). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.

Lance, C. E. , Butts, M. M. , & Michels, L. C. ( 2006 ). The sources of four commonly reported cutoff criteria: What did they really say?   Organizational Research Methods, 9, 202–220.

Lord, F. , & Novick, M. R. ( 1968 ). Statistical theories of mental test scores . Reading, MA: Addison-Wesley.

Lucas, R. E. ( 2005 ). Time does not heal all wounds.   Psychological Science, 16, 945–950.

Lucas, R. E. ( 2007 ). Adaptation and the set-point model of subjective well-being: Does happiness change after major life events?   Current Directions in Psychological Science, 16, 75–79.

McCall, R. B. , & Appelbaum, M. I. ( 1991 ). Some issues of conducting secondary analyses.   Developmental Psychology, 27, 911–917.

McCrae, R. R. , Kurtz, J. E. , Yamagata, S. , & Terracciano, A. ( 2011 ). Internal consistency, retest reliability, and their implications for personality scale validity.   Personality and Social Psychology Review, 15, 28–50.

Messick, S. ( 1995 ). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning.   American Psychologist, 50, 741–749.

McKnight, P. E. , & McKnight, K. M. ( 2011 ). Missing data in secondary data analysis. In K. H. Trzesniewski , M. B. Donnellan , & R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 83–101). Washington, DC: American Psychological Association.

McKnight, P. E. , McKnight, K. M. , Sidani, S. , & Figueredo, A. J. ( 2007 ). Missing data: A gentle introduction . New York: Guilford Press.

Mroczek, D. K. , Pitzer, L. , Miller, L. , Turiano, N. , & Fingerman, K. ( 2011 ). The use of secondary data in adult development and aging research. In K. H. Trzesniewski , M. B. Donnellan , and R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 121–132). Washington, DC: American Psychological Association.

Pienta, A. M. , O’Rourke, J. M. , & Franks, M. M. ( 2011 ). Getting started: Working with secondary data. In K. H. Trzesniewski , M. B. Donnellan , and R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 13–25). Washington, DC: American Psychological Association.

Rosenberg, M. ( 1965 ). Society and the adolescent self-image . Princeton, NJ: Princeton University Press.

Russell, S. T. , Crockett, L. J. , Shen, Y-L , & Lee, S-A. ( 2008 ). Cross-ethnic invariance of self-esteem and depression measures for Chinese, Filipino, and European American adolescents.   Journal of Youth and Adolescence, 37, 50–61.

Russell, S. T. , & Matthews, E. ( 2011 ). Using secondary data to study adolescence and adolescent development. In K. H. Trzesniewski , M. B. Donnellan , & R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 163–176). Washington, DC: American Psychological Association.

Schafer, J. L. & Graham, J. W ( 2002 ). Missing data: Our view of the state of the art.   Psychological Methods, 7, 147–177.

Schmitt, N. ( 1996 ). Uses and abuses of coefficient alpha.   Psychological Assessment, 8, 350–353.

Schwarz, N. ( 1999 ). Self-reports: How the questions shape the answers.   American Psychologist, 54, 93–105.

Schwarz, N. & Strack, F. ( 1999 ). Reports of subjective well-being: Judgmental processes and their methodological implications. In D. Kahneman , E. Diener , & N. Schwarz (Eds.). Well-being: The foundations of hedonic psychology (pp.61–84). New York: Russell Sage Foundation.

Sears, D. O. ( 1986 ). College sophomores in the lab: Influences of a narrow data base on social psychology’s view of human nature.   Journal of Personality and Social Psychology, 51, 515–530.

Shrout, P. E. , & Napier, J. L. ( 2011 ). Analyzing survey data with complex sampling designs. In K. H. Trzesniewski , M. B. Donnellan , & R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 63–81). Washington, DC: American Psychological Association.

Simms, L. J. ( 2008 ). Classical and modern methods of psychological scale construction.   Social and Personality Psychology Compass, 2/1, 414–433.

Simms, L. J. , & Watson, D. ( 2007 ). The construct validation approach to personality scale creation. In R. W Robins , R. C. Fraley , & R. F. Krueger (Eds). Handbook of Research Methods in Personality Psychology (pp. 240–258). New York: Guilford Press.

Smith, G. T. ( 2005 ). On construct validity: Issues of method and measurement.   Psychological Assessment, 17, 396–408.

Tracy, J. L. , Robins, R. W. , & Sherman, J. W. ( 2009 ). The practice of psychological science: Searching for Cronbach’s two streams in social-personality psychology.   Journal of Personality and Social Psychology, 96, 1206–1225.

Trzesniewski, K.H. & Donnellan, M. B. ( 2009 ). Re-evaluating the evidence for increasing self-views among high school students: More evidence for consistency across generations (1976–2006).   Psychological Science, 20, 920–922.

Trzesniewski, K. H. , & Donnellan, M. B. ( 2010 ). Rethinking “Generation Me”: A study of cohort effects from 1976–2006.   Perspectives on Psychological Science, 5, 58–75.

Trzesniewski, K. H. , Donnellan, M. B. , & Lucas, R. E. ( 2011 ) (Eds). Secondary data analysis: An introduction for psychologists . Washington, DC: American Psychological Association.

Widaman, K. F. ( 2006 ). Missing data: What to do with or without them.   Monographs of the Society for Research in Child Development, 71, 42–64.

Widaman, K. F. , Little, T. D. , Preacher, K. K. , & Sawalani, G. M. ( 2011 ). On creating and using short forms of scales in secondary research. In K. H. Trzesniewski , M. B. Donnellan , & R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 39–61). Washington, DC: American Psychological Association.

Willms, J. D. ( 2011 ). Managing and using secondary data sets with multidisciplinary research teams. In K. H. Trzesniewski , M. B. Donnellan , & R. E. Lucas (Eds). Secondary data analysis: An introduction for psychologists (pp. 27–38). Washington, DC: American Psychological Association.

A Guide To Secondary Data Analysis

What is secondary data analysis? How do you carry it out? Find out in this post.  

Historically, the only way data analysts could obtain data was to collect it themselves. This type of data is often referred to as primary data and is still a vital resource for data analysts.   

However, technological advances over the last few decades mean that much past data is now readily available online for data analysts and researchers to access and utilize. This type of data—known as secondary data—is driving a revolution in data analytics and data science.

Primary and secondary data share many characteristics. However, there are some fundamental differences in how you prepare and analyze secondary data. This post explores the unique aspects of secondary data analysis. We’ll briefly review what secondary data is before outlining how to source, collect, and validate it. We’ll cover:

  • What is secondary data analysis?
  • How to carry out secondary data analysis (5 steps)
  • Summary and further reading

Ready for a crash course in secondary data analysis? Let’s go!

1. What is secondary data analysis?

Secondary data analysis uses data collected by somebody else. This contrasts with primary data analysis, which involves a researcher collecting predefined data to answer a specific question. Secondary data analysis has numerous benefits, not least that it is a time and cost-effective way of obtaining data without doing the research yourself.

It’s worth noting here that secondary data may be primary data for the original researcher. It only becomes secondary data when it’s repurposed for a new task. As a result, a dataset can simultaneously be a primary data source for one researcher and a secondary data source for another. So don’t panic if you get confused! We explain exactly what secondary data is in this guide . 

In reality, the statistical techniques used to carry out secondary data analysis are no different from those used to analyze other kinds of data. The main differences lie in collection and preparation. Once the data have been reviewed and prepared, the analytics process continues more or less as it usually does. For a recap on what the data analysis process involves, read this post . 

In the following sections, we’ll focus specifically on the preparation of secondary data for analysis. Where appropriate, we’ll refer to primary data analysis for comparison. 

2. How to carry out secondary data analysis

Step 1: Define a research topic

The first step in any data analytics project is defining your goal. This is true regardless of the data you’re working with, or the type of analysis you want to carry out. In data analytics lingo, this typically involves defining:

  • A statement of purpose
  • Research design

Defining a statement of purpose and a research approach are both fundamental building blocks for any project. However, for secondary data analysis, the process of defining these differs slightly. Let’s find out how.

Step 2: Establish your statement of purpose

Before beginning any data analytics project, you should always have a clearly defined intent. This is called a ‘statement of purpose.’ A healthcare analyst’s statement of purpose, for example, might be: ‘Reduce admissions for mental health issues relating to Covid-19.’ The more specific the statement of purpose, the easier it is to determine which data to collect, analyze, and draw insights from.

A statement of purpose is helpful for both primary and secondary data analysis. It’s especially relevant for secondary data analysis, though. This is because there are vast amounts of secondary data available. Having a clear direction will keep you focused on the task at hand, saving you from becoming overwhelmed. Being selective with your data sources is key.

Step 3: Design your research process

After defining your statement of purpose, the next step is to design the research process. For primary data, this involves determining the types of data you want to collect (e.g. quantitative, qualitative, or both ) and a methodology for gathering them.

For secondary data analysis, however, your research process will more likely be a step-by-step guide outlining the types of data you require and a list of potential sources for gathering them. It may also include (realistic) expectations of the output of the final analysis. This should be based on a preliminary review of the data sources and their quality.

Once you have both your statement of purpose and research design, you’re in a far better position to narrow down potential sources of secondary data. You can then start with the next step of the process: data collection.

Step 4: Locate and collect your secondary data

Collecting primary data involves devising and executing a complex strategy that can be very time-consuming to manage. The data you collect, though, will be highly relevant to your research problem.

Secondary data collection, meanwhile, avoids the complexity of defining a research methodology. However, it comes with additional challenges. One of these is identifying where to find the data. This is no small task because there are a great many repositories of secondary data available. Your job, then, is to narrow down potential sources. As already mentioned, it’s necessary to be selective, or else you risk becoming overloaded.  

Some popular sources of secondary data include:  

  • Government statistics , e.g. demographic data, censuses, or surveys, collected by government agencies/departments (like the US Bureau of Labor Statistics).
  • Technical reports summarizing completed or ongoing research from educational or public institutions (colleges or government).
  • Scientific journals that outline research methodologies and data analysis by experts in fields like the sciences, medicine, etc.
  • Literature reviews of research articles, books, and reports, for a given area of study (once again, carried out by experts in the field).
  • Trade/industry publications , e.g. articles and data shared in trade publications, covering topics relating to specific industry sectors, such as tech or manufacturing.
  • Online resources: Repositories, databases, and other reference libraries with public or paid access to secondary data sources.

Once you’ve identified appropriate sources, you can go about collecting the necessary data. This may involve contacting other researchers, paying a fee to an organization in exchange for a dataset, or simply downloading a dataset for free online .

Step 5: Evaluate your secondary data

Secondary data is usually well-structured, so you might assume that once you have your hands on a dataset, you’re ready to dive in with a detailed analysis. Unfortunately, that’s not the case! 

First, you must carry out a careful review of the data. Why? To ensure that they’re appropriate for your needs. This involves two main tasks:

  • Evaluating the secondary dataset’s relevance

  • Assessing its broader credibility

Both these tasks require critical thinking skills. However, they aren’t heavily technical. This means anybody can learn to carry them out.

Let’s now take a look at each in a bit more detail.  

The main point of evaluating a secondary dataset is to see if it is suitable for your needs. This involves asking some probing questions about the data, including:

What was the data’s original purpose?

Understanding why the data were originally collected will tell you a lot about their suitability for your current project. For instance, was the project carried out by a government agency or a private company for marketing purposes? The answer may provide useful information about the population sample, the data demographics, and even the wording of specific survey questions. All this can help you determine if the data are right for you, or if they are biased in any way.

When and where were the data collected?

Over time, populations and demographics change. Identifying when the data were first collected can provide invaluable insights. For instance, a dataset that initially seems suited to your needs may be out of date.

On the flip side, you might want past data so you can draw a comparison with a present dataset. In this case, you’ll need to ensure the data were collected during the appropriate time frame. It’s worth mentioning that secondary data are the sole source of past data. You cannot collect historical data using primary data collection techniques.

Similarly, you should ask where the data were collected. Do they represent the geographical region you require? Does geography even have an impact on the problem you are trying to solve?

What data were collected and how?

A final report for past data analytics is great for summarizing key characteristics or findings. However, if you’re planning to use those data for a new project, you’ll need the original documentation. At the very least, this should include access to the raw data and an outline of the methodology used to gather them. This can be helpful for many reasons. For instance, you may find raw data that wasn’t relevant to the original analysis, but which might benefit your current task.

What questions were participants asked?

We’ve already touched on this, but the wording of survey questions—especially for qualitative datasets—is significant. Questions may deliberately be phrased to preclude certain answers. A question’s context may also impact the findings in a way that’s not immediately obvious. Understanding these issues will shape how you perceive the data.  

What is the form/shape/structure of the data?

Finally, to practical issues. Is the structure of the data suitable for your needs? Is it compatible with other sources or with your preferred analytics approach? This is purely a structural issue. For instance, if people’s ages are stored as categorical ranges (or as text) rather than as continuous numeric values, this could potentially impact your analysis. In general, reviewing a dataset’s structure helps you understand how the data are categorized, allowing you to account for any discrepancies. You may also need to tidy the data to ensure they are consistent with any other sources you’re using.  
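
If you want a quick way to carry out this structural check, a few lines of Python/pandas will usually do; the file and column names below are hypothetical.

```python
import pandas as pd

df = pd.read_csv("secondary_dataset.csv")  # hypothetical downloaded dataset

# See how each column is actually stored before analyzing anything.
print(df.dtypes)

# Example fix: an "age" column stored as text (e.g., "42 years") won't behave as a
# continuous variable until it is cleaned and converted to a numeric type.
df["age"] = pd.to_numeric(df["age"].astype(str).str.extract(r"(\d+)")[0], errors="coerce")
print(df["age"].describe())
```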

This is just a sample of the types of questions you need to consider when reviewing a secondary data source. The answers will have a clear impact on whether the dataset—no matter how well presented or structured it seems—is suitable for your needs.

Assessing secondary data’s credibility

After identifying a potentially suitable dataset, you must double-check the credibility of the data. Namely, are the data accurate and unbiased? To figure this out, here are some key questions you might want to ask:

What are the credentials of those who carried out the original research?

Do you have access to the details of the original researchers? What are their credentials? Where did they study? Are they an expert in the field or a newcomer? Data collection by an undergraduate student, for example, may not be as rigorous as that of a seasoned professor.  

And did the original researcher work for a reputable organization? What other affiliations do they have? For instance, if a researcher who works for a tobacco company gathers data on the effects of vaping, this represents an obvious conflict of interest! Questions like this help determine how thorough or qualified the researchers are and if they have any potential biases.

Do you have access to the full methodology?

Does the dataset include a clear methodology, explaining in detail how the data were collected? This should be more than a simple overview; it must be a clear breakdown of the process, including justifications for the approach taken. This allows you to determine if the methodology was sound. If you find flaws (or no methodology at all) it throws the quality of the data into question.  

How consistent are the data with other sources?

Do the secondary data match with any similar findings? If not, that doesn’t necessarily mean the data are wrong, but it does warrant closer inspection. Perhaps the collection methodology differed between sources, or maybe the data were analyzed using different statistical techniques. Or perhaps unaccounted-for outliers are skewing the analysis. Identifying all these potential problems is essential. A flawed or biased dataset can still be useful but only if you know where its shortcomings lie.

Have the data been published in any credible research journals?

Finally, have the data been used in well-known studies or published in any journals? If so, how reputable are the journals? In general, you can judge a dataset’s quality based on where it has been published. If in doubt, check out the publication in question on the Directory of Open Access Journals . The directory has a rigorous vetting process, only permitting journals of the highest quality. Meanwhile, if you found the data via a blurry image on social media without cited sources, then you can justifiably question its quality!  

Again, these are just a few of the questions you might ask when determining the quality of a secondary dataset. Consider them as scaffolding for cultivating a critical thinking mindset; a necessary trait for any data analyst!

Presuming your secondary data holds up to scrutiny, you should be ready to carry out your detailed statistical analysis. As we explained at the beginning of this post, the analytical techniques used for secondary data analysis are no different than those for any other kind of data. Rather than go into detail here, check out the different types of data analysis in this post.

3. Secondary data analysis: Key takeaways

In this post, we’ve looked at the nuances of secondary data analysis, including how to source, collect and review secondary data. As discussed, much of the process is the same as it is for primary data analysis. The main difference lies in how secondary data are prepared.

Carrying out a meaningful secondary data analysis involves spending time and effort exploring, collecting, and reviewing the original data. This will help you determine whether the data are suitable for your needs and if they are of good quality.

Why not get to know more about what data analytics involves with this free, five-day introductory data analytics short course ? And, for more data insights, check out these posts:

  • Discrete vs continuous data variables: What’s the difference?
  • What are the four levels of measurement? Nominal, ordinal, interval, and ratio data explained
  • What are the best tools for data mining?

Quantitative Research: What It Is, Practices & Methods

Quantitative research involves analyzing and gathering numerical data to uncover trends, calculate averages, evaluate relationships, and derive overarching insights. It’s used in various fields, including the natural and social sciences. Quantitative data analysis employs statistical techniques for processing and interpreting numeric data.

Research designs in the quantitative realm outline how data will be collected and analyzed with methods like experiments and surveys. Qualitative methods complement quantitative research by focusing on non-numerical data, adding depth to understanding. Data collection methods can be qualitative or quantitative, depending on research goals. Researchers often use a combination of both approaches to gain a comprehensive understanding of phenomena.

What is Quantitative Research?

Quantitative research is a systematic investigation of phenomena by gathering quantifiable data and performing statistical, mathematical, or computational techniques. Quantitative research collects statistically significant information from existing and potential customers using sampling methods and sending out online surveys , online polls , and questionnaires , for example.

One of the main characteristics of this type of research is that the results can be depicted in numerical form. After carefully collecting structured observations and understanding these numbers, it’s possible to predict the future of a product or service, establish causal relationships (causal research), and make changes accordingly. Quantitative research primarily centers on the analysis of numerical data and utilizes inferential statistics to derive conclusions that can be extrapolated to the broader population.

An example of a quantitative research study is the survey conducted to understand how long a doctor takes to tend to a patient when the patient walks into the hospital. A patient satisfaction survey can be administered to ask questions like how long a doctor takes to see a patient, how often a patient walks into a hospital, and other such questions, which are dependent variables in the research. This kind of research method is often employed in the social sciences, and it involves using mathematical frameworks and theories to effectively present data, ensuring that the results are logical, statistically sound, and unbiased.

Data collection in quantitative research uses a structured method and is typically conducted on larger samples representing the entire population. Researchers use quantitative methods to collect numerical data, which is then subjected to statistical analysis to determine statistically significant findings. This approach is valuable in both experimental research and social research, as it helps in making informed decisions and drawing reliable conclusions based on quantitative data.

Quantitative Research Characteristics

Quantitative research has several unique characteristics that make it well-suited for specific projects. Let’s explore the most crucial of these characteristics so that you can consider them when planning your next research project:

  • Structured tools: Quantitative research relies on structured tools such as surveys, polls, or questionnaires to gather quantitative data . Using such structured methods helps collect in-depth and actionable numerical data from the survey respondents, making it easier to perform data analysis.
  • Sample size: Quantitative research is conducted on a significant sample size  representing the target market . Appropriate Survey Sampling methods, a fundamental aspect of quantitative research methods, must be employed when deriving the sample to fortify the research objective and ensure the reliability of the results.
  • Close-ended questions: Closed-ended questions , specifically designed to align with the research objectives, are a cornerstone of quantitative research. These questions facilitate the collection of quantitative data and are extensively used in data collection processes.
  • Prior studies: Before collecting feedback from respondents, researchers often delve into previous studies related to the research topic. This preliminary research helps frame the study effectively and ensures the data collection process is well-informed.
  • Quantitative data: Typically, quantitative data is represented using tables, charts, graphs, or other numerical forms. This visual representation aids in understanding the collected data and is essential for rigorous data analysis, a key component of quantitative research methods.
  • Generalization of results: One of the strengths of quantitative research is its ability to generalize results to the entire population. It means that the findings derived from a sample can be extrapolated to make informed decisions and take appropriate actions for improvement based on numerical data analysis.

Quantitative Research Methods

Quantitative research methods are systematic approaches used to gather and analyze numerical data to understand and draw conclusions about a phenomenon or population. Here are the quantitative research methods:

  • Primary quantitative research methods
  • Secondary quantitative research methods

Primary Quantitative Research Methods

Primary quantitative research is the most widely used method of conducting market research. Its distinct feature is that the researcher collects data directly rather than depending on data gathered from previous research. Primary quantitative research design can be broken down into three distinct tracks: the techniques and types of studies, data collection methodologies, and data analysis techniques.

A. Techniques and Types of Studies

There are multiple types of primary quantitative research. They can be divided into the following four distinct methods:

01. Survey Research

Survey research is fundamental to all quantitative outcome research methodologies and studies. Surveys are used to ask questions of a sample of respondents, using formats such as online polls, online surveys, paper questionnaires, and web-intercept surveys. Organizations large and small want to understand what their customers think about their products and services, how well new features are faring in the market, and other such details.

By conducting survey research, an organization can ask multiple survey questions, collect data from a pool of customers, and analyze this collected data to produce numerical results. It is the first step towards collecting data for any research. You can also use single ease questions; a single-ease question is a straightforward query that elicits a concise and uncomplicated response.

This type of research can be conducted with a specific target audience group and can also be conducted across multiple groups, along with comparative analysis. A prerequisite for this type of research is that the sample of respondents must be randomly selected. This way, a researcher can more easily maintain the accuracy of the obtained results, as a wide variety of respondents is reached through random selection.

Traditionally, survey research was conducted face-to-face or via phone calls, but with the rise of online channels such as email and social media, survey research has spread online as well. There are two types of surveys, either of which can be chosen based on the time available and the kind of data required:

Cross-sectional surveys: Cross-sectional surveys are observational surveys conducted when the researcher intends to collect data from a sample of the target population at a single point in time. Researchers can evaluate various variables at that particular time. Data gathered using this type of survey comes from people who are similar in all variables except the ones being studied, which remain constant throughout the survey.

  • Cross-sectional surveys are popular with retail, SMEs, and healthcare industries. Information is garnered without modifying any parameters in the variable ecosystem.
  • Multiple samples can be analyzed and compared using a cross-sectional survey research method.
  • Multiple variables can be evaluated using this type of survey research.
  • The main disadvantage of cross-sectional surveys is that cause-and-effect relationships between variables cannot be established, since variables are evaluated at one point in time rather than across a continuous time frame.

Longitudinal surveys: Longitudinal surveys are also observational surveys , but unlike cross-sectional surveys, longitudinal surveys are conducted across various time durations to observe a change in respondent behavior and thought processes. This time can be days, months, years, or even decades. For instance, a researcher planning to analyze the change in buying habits of teenagers over 5 years will conduct longitudinal surveys.

  • Whereas cross-sectional surveys evaluate variables at a single point in time, longitudinal surveys allow different variables to be analyzed at different intervals.
  • Longitudinal surveys are extensively used in the fields of medicine and applied sciences. Apart from these two fields, they are also used to observe changes in market trends, analyze customer satisfaction, or gain feedback on products and services.
  • In situations where the sequence of events is highly essential, longitudinal surveys are used.
  • When research subjects need to be inspected thoroughly before drawing conclusions, researchers rely on longitudinal surveys.

02. Correlational Research

Correlational research is conducted to establish a relationship between two closely related entities, to examine how one impacts the other, and to observe what changes result. This research method is carried out to give value to naturally occurring relationships, and a minimum of two different groups are required to conduct it successfully. A relationship between two groups or entities must be established without assuming various aspects.

Researchers use this quantitative research design to correlate two or more variables using mathematical analysis methods. Patterns, relationships, and trends between variables are identified as they exist in their original setup. The impact of one of these variables on the other is observed, along with how it changes the relationship between the two. Researchers do not manipulate the variables; they observe them as they naturally occur.

Ideally, conclusions should not be drawn merely on the basis of correlational research: the fact that two variables move together does not mean that one causes the other.

Example of Correlational Research Questions :

  • The relationship between stress and depression.
  • The equation between fame and money.
  • The relation between activities in a third-grade class and its students.
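For instance, the first example above (stress and depression) could be quantified by computing a correlation coefficient. The following is a minimal Python sketch with made-up scores, intended only to illustrate the calculation and the caution about causation; none of the numbers come from the original article:

```python
# Hypothetical illustration: Pearson correlation between two survey-derived
# variables (self-reported stress and depression scores on a 1-10 scale).
# The scores below are invented example data, not real research results.
import numpy as np
from scipy import stats

stress = np.array([3, 5, 2, 8, 7, 4, 6, 9, 1, 5])
depression = np.array([2, 4, 3, 7, 6, 3, 5, 8, 2, 4])

r, p_value = stats.pearsonr(stress, depression)
print(f"Pearson r = {r:.2f}, p-value = {p_value:.4f}")

# A high r only shows that the variables move together;
# it does not establish that one causes the other.
```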

03. Causal-comparative Research

This research method mainly depends on the factor of comparison. Also called quasi-experimental research , this quantitative research method is used by researchers to conclude the cause-effect equation between two or more variables, where one variable is dependent on the other independent variable. The independent variable is established but not manipulated, and its impact on the dependent variable is observed. These variables or groups must be formed as they exist in the natural setup. As the dependent and independent variables will always exist in a group, it is advised that the conclusions are carefully established by keeping all the factors in mind.

Causal-comparative research is not restricted to the statistical analysis of two variables but extends to analyzing how various variables or groups change under the influence of the same changes. This research is conducted irrespective of the type of relationship that exists between two or more variables. A statistical analysis plan is used to present the outcomes of this quantitative research method.

Example of Causal-Comparative Research Questions:

  • The impact of drugs on a teenager.
  • The effect of good education on a freshman.
  • The effect of substantial food provision in the villages of Africa.

04. Experimental Research

Also known as true experimentation, this research method relies on one or more theories. A theory in this context is a statement that has not yet been proven and is merely a supposition; it can be verified or refuted.

In experimental research, the analysis is built around proving or disproving that statement. After the statement is established, efforts are made to determine whether it is valid or invalid. This quantitative research method is mainly used in the natural and social sciences, where various statements must be proved right or wrong. Examples of statements that experimental research might test include:

  • Traditional research methods are more effective than modern techniques.
  • Systematic teaching schedules help children who struggle to cope with the course.
  • It is a boon to have responsible nursing staff for ailing parents.

B. Data Collection Methodologies

The second major step in primary quantitative research is data collection. Data collection can be divided into sampling methods and data collection using surveys and polls.

01. Data Collection Methodologies: Sampling Methods

There are two main sampling methods for quantitative research: Probability and Non-probability sampling .

Probability sampling: In probability sampling, probability theory is used to select individuals from a population and create a sample. Participants are chosen through a random selection process, so each member of the target audience has an equal opportunity to be selected for the sample.

There are four main types of probability sampling:

  • Simple random sampling: As the name indicates, simple random sampling is simply a random selection of elements for a sample. This sampling technique is implemented where the target population is considerably large.
  • Stratified random sampling: In the stratified random sampling method, a large population is divided into groups (strata), and members of a sample are chosen randomly from these strata. The various strata should ideally not overlap one another.
  • Cluster sampling: Cluster sampling is a probability sampling method in which the population is divided into clusters, usually using geographic and demographic segmentation parameters.
  • Systematic sampling: Systematic sampling is a technique where the starting point of the sample is chosen randomly, and all the other elements are chosen using a fixed interval. This interval is calculated by dividing the population size by the target sample size (see the sketch after this list).
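As a rough illustration of how two of the techniques above work in practice, here is a minimal Python sketch using a hypothetical population of 1,000 customer IDs; all values are assumptions made for the example, not figures from the original article:

```python
# Minimal sketch of two probability sampling techniques, using only the
# Python standard library. The population here is hypothetical.
import random

population = list(range(1, 1001))   # e.g., 1,000 customer IDs
sample_size = 50

# Simple random sampling: every member has an equal chance of selection.
simple_random_sample = random.sample(population, sample_size)

# Systematic sampling: random starting point, then every k-th member,
# where k = population size / target sample size.
k = len(population) // sample_size          # fixed interval (k = 20 here)
start = random.randrange(k)                 # random start within the first interval
systematic_sample = population[start::k]

print(len(simple_random_sample), len(systematic_sample))
```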

Non-probability sampling: Non-probability sampling is where the researcher’s knowledge and experience are used to create samples. Because of the researcher’s involvement, not all the target population members have an equal probability of being selected to be a part of a sample.

There are five non-probability sampling models:

  • Convenience sampling: In convenience sampling , elements of a sample are chosen only due to one prime reason: their proximity to the researcher. These samples are quick and easy to implement as there is no other parameter of selection involved.
  • Consecutive sampling: Consecutive sampling is quite similar to convenience sampling, except for the fact that researchers can choose a single element or a group of samples and conduct research consecutively over a significant period and then perform the same process with other samples.
  • Quota sampling: Using quota sampling , researchers can select elements using their knowledge of target traits and personalities to form strata. Members of various strata can then be chosen to be a part of the sample as per the researcher’s understanding.
  • Snowball sampling: Snowball sampling is conducted with target audiences that are difficult to contact and gather information from. It is popular in cases where the target audience for the research is hard to assemble.
  • Judgmental sampling: Judgmental sampling is a non-probability sampling method where samples are created only based on the researcher’s experience and research skill .

02. Data collection methodologies: Using surveys & polls

Once the sample is determined, then either surveys or polls can be distributed to collect the data for quantitative research.

Using surveys for primary quantitative research

A survey is defined as a research method used for collecting data from a predefined group of respondents to gain information and insights on various topics of interest. The ease of distributing a survey and the wide number of people it can reach, depending on the research time and objective, make it one of the most important aspects of conducting quantitative research.

Fundamental levels of measurement – nominal, ordinal, interval, and ratio scales

Four measurement scales are fundamental to creating a multiple-choice question in a survey: nominal, ordinal, interval, and ratio. Without an understanding of these measurement levels, multiple-choice questions cannot be framed properly, so it is crucial to understand them to develop a robust survey.
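As a purely illustrative aside (not part of the original article), the four levels can be mimicked with pandas data types; the variable names and values below are assumptions chosen only to show the distinctions between the scales:

```python
# Illustrative sketch: representing the four levels of measurement
# for survey responses with pandas data structures.
import pandas as pd

# Nominal: categories with no inherent order (e.g., preferred brand).
brand = pd.Categorical(["A", "B", "A", "C"], ordered=False)

# Ordinal: ordered categories with unequal/unknown spacing (e.g., satisfaction).
satisfaction = pd.Categorical(
    ["Low", "High", "Medium", "High"],
    categories=["Low", "Medium", "High"],
    ordered=True,
)

# Interval: numeric with equal spacing but no true zero (e.g., temperature in Celsius).
temperature_c = pd.Series([18.5, 21.0, 19.2, 22.8])

# Ratio: numeric with a true zero (e.g., monthly spend), so ratios are meaningful.
monthly_spend = pd.Series([0.0, 25.0, 50.0, 100.0])

print(satisfaction.min(), satisfaction.max())  # order is meaningful for ordinal data
```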

Use of different question types

To conduct quantitative research, close-ended questions must be used in a survey. They can be a mix of multiple question types, including multiple-choice questions like semantic differential scale questions , rating scale questions , etc.

Survey Distribution and Survey Data Collection

Above, we covered the process of building a survey along with the research design for primary quantitative research. Distributing the survey to collect data is the other important aspect of the survey process. There are different ways of distributing a survey; some of the most commonly used methods are:

  • Email: Sending a survey via email is the most widely used and effective survey distribution method. This method’s response rate is high because the respondents know your brand. You can use the QuestionPro email management feature to send out and collect survey responses.
  • Buy respondents: Another effective way to distribute a survey and conduct primary quantitative research is to use a purchased respondent panel (sample). Since panel respondents have opted in and are experienced survey takers, response rates are much higher.
  • Embed survey on a website: Embedding a survey on a website yields a high number of responses, as the respondent is already in close proximity to the brand when the survey appears.
  • Social distribution: Using social media to distribute the survey helps collect a higher number of responses from people who are aware of the brand.
  • QR code: QuestionPro QR codes store the URL for the survey. You can print/publish this code in magazines, signs, business cards, or on just about any object/medium.
  • SMS survey: The SMS survey is a quick and time-effective way to collect a high number of responses.
  • Offline Survey App: The QuestionPro App allows users to circulate surveys quickly, and the responses can be collected both online and offline.

Survey example

An example of a survey is a short customer satisfaction (CSAT) survey that can quickly be built and deployed to collect feedback about what customers think of a brand, how satisfied they are, and how likely they are to recommend it.

Using polls for primary quantitative research

Polls are a method to collect feedback using close-ended questions from a sample. The most commonly used types of polls are election polls and exit polls . Both of these are used to collect data from a large sample size but using basic question types like multiple-choice questions.

C. Data Analysis Techniques

The third aspect of primary quantitative research design is data analysis . After collecting raw data, there must be an analysis of this data to derive statistical inferences from this research. It is important to relate the results to the research objective and establish the statistical relevance of the results.

Remember to consider aspects of research that were not considered for the data collection process and report the difference between what was planned vs. what was actually executed.

You then need to select appropriate statistical analysis methods, such as SWOT, conjoint, or cross-tabulation analysis, to analyze the quantitative data.

  • SWOT analysis: SWOT analysis is an acronym for Strengths, Weaknesses, Opportunities, and Threats. Organizations use this analysis technique to evaluate their performance internally and externally and to develop effective strategies for improvement.
  • Conjoint Analysis: Conjoint Analysis is a market analysis method to learn how individuals make complicated purchasing decisions. Trade-offs are involved in an individual’s daily activities, and these reflect their ability to decide from a complex list of product/service options.
  • Cross-tabulation: Cross-tabulation is one of the preliminary statistical market analysis methods which establishes relationships, patterns, and trends within the various parameters of the research study.
  • TURF Analysis: TURF Analysis , an acronym for Totally Unduplicated Reach and Frequency Analysis, is executed in situations where the reach of a favorable communication source is to be analyzed along with the frequency of this communication. It is used for understanding the potential of a target market.
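As a rough illustration of the cross-tabulation method listed above, here is a minimal pandas sketch; the column names and response values are hypothetical and are not from the original study:

```python
# Minimal cross-tabulation sketch with hypothetical survey data.
import pandas as pd

responses = pd.DataFrame({
    "age_group": ["18-24", "25-34", "18-24", "35-44", "25-34", "35-44"],
    "would_recommend": ["Yes", "No", "Yes", "Yes", "Yes", "No"],
})

# The cross-tab shows how answers are distributed across the two parameters.
crosstab = pd.crosstab(responses["age_group"], responses["would_recommend"])
print(crosstab)
```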

Inferential statistics, such as confidence intervals and margins of error, can then be used to report the results.
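A similarly small sketch, under the assumption of a simple random sample, shows how a margin of error and 95% confidence interval might be computed for a survey proportion; the proportion and sample size below are made up for illustration:

```python
# Margin of error and 95% confidence interval for a survey proportion.
import math

p = 0.62      # e.g., 62% of respondents answered "satisfied" (hypothetical)
n = 400       # sample size (hypothetical)
z = 1.96      # z-score for a 95% confidence level

margin_of_error = z * math.sqrt(p * (1 - p) / n)
ci_lower, ci_upper = p - margin_of_error, p + margin_of_error

print(f"Margin of error: +/- {margin_of_error:.3f}")
print(f"95% CI: {ci_lower:.3f} to {ci_upper:.3f}")
```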

Secondary Quantitative Research Methods

Secondary quantitative research or desk research is a research method that involves using already existing data or secondary data. Existing data is summarized and collated to increase the overall effectiveness of the research.

This research method involves collecting quantitative data from existing data sources such as the internet, government resources, libraries, and research reports. Secondary quantitative research helps to validate the data collected from primary quantitative research and aids in strengthening, proving, or disproving previously collected data.

The following are five popularly used secondary quantitative research methods:

  • Data available on the internet: With the high penetration of the internet and mobile devices, it has become increasingly easy to conduct quantitative research using the internet. Information about most research topics is available online, and this aids in boosting the validity of primary quantitative data.
  • Government and non-government sources: Secondary quantitative research can also be conducted with the help of government and non-government sources that deal with market research reports. This data is highly reliable and in-depth and hence, can be used to increase the validity of quantitative research design.
  • Public libraries: Though now a sparingly used method of conducting quantitative research, public libraries remain a reliable source of information. They hold copies of important earlier research and are a storehouse of valuable information and documents from which data can be extracted.
  • Educational institutions: Educational institutions conduct in-depth research on multiple topics, and hence, the reports that they publish are an important source of validation in quantitative research.
  • Commercial information sources: Local newspapers, journals, magazines, radio, and TV stations are great sources to obtain data for secondary quantitative research. These commercial information sources have in-depth, first-hand information on market research, demographic segmentation, and similar subjects.

Quantitative Research Examples

Some examples of quantitative research are:

  • A customer satisfaction template can be used if an organization would like to conduct a customer satisfaction (CSAT) survey. Through this kind of survey, an organization can collect quantitative data and metrics on the goodwill of the brand or organization in the customer’s mind, based on multiple parameters such as product quality, pricing, and customer experience. This data can be collected by asking a net promoter score (NPS) question, matrix table questions, and similar question types that provide data in the form of numbers that can be analyzed and acted upon (a small NPS calculation sketch follows these examples).
  • Another example of quantitative research is an organization that conducts an event and collects feedback from attendees about the value they see in it. By using an event survey, the organization can collect actionable feedback about customer satisfaction levels during various phases of the event (before, during, and after), the likelihood of attendees recommending the organization to friends and colleagues, hotel preferences for future events, and other such questions.
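The NPS question mentioned in the first example can be turned into a single metric with a few lines of code. The following is a minimal sketch; the ratings are fabricated example data, and the promoter/detractor cut-offs follow the standard NPS convention rather than anything stated in this article:

```python
# Computing a Net Promoter Score (NPS) from 0-10 ratings (hypothetical data).
ratings = [10, 9, 7, 6, 8, 10, 3, 9, 5, 10]

promoters = sum(1 for r in ratings if r >= 9)    # 9-10 = promoter
detractors = sum(1 for r in ratings if r <= 6)   # 0-6 = detractor
nps = (promoters - detractors) / len(ratings) * 100

print(f"NPS = {nps:.0f}")   # % promoters minus % detractors
```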

What are the Advantages of Quantitative Research?

There are many advantages to quantitative research. Some of the major advantages of why researchers use this method in market research are:


Collect Reliable and Accurate Data:

Quantitative research is a powerful method for collecting reliable and accurate quantitative data. Since data is collected, analyzed, and presented in numbers, the results obtained are highly reliable and objective, offering a precise picture of the conducted research with few discrepancies. In situations where a researcher aims to reduce bias and predict potential conflicts, quantitative research is the method of choice.

Quick Data Collection:

Quantitative research involves studying a group of people representing a larger population. Researchers use a survey or another quantitative research method to efficiently gather information from these participants, making the process of analyzing the data and identifying patterns faster and more manageable through the use of statistical analysis. This advantage makes quantitative research an attractive option for projects with time constraints.

Wider Scope of Data Analysis:

Quantitative research, thanks to its utilization of statistical methods, offers an extensive range of data collection and analysis. Researchers can delve into a broader spectrum of variables and relationships within the data, enabling a more thorough comprehension of the subject under investigation. This expanded scope is precious when dealing with complex research questions that require in-depth numerical analysis.

Eliminate Bias:

One of the significant advantages of quantitative research is its ability to eliminate bias. This research method leaves no room for personal comments or the biasing of results, as the findings are presented in numerical form. This objectivity makes the results fair and reliable in most cases, reducing the potential for researcher bias or subjectivity.

In summary, quantitative research involves collecting, analyzing, and presenting quantitative data using statistical analysis. It offers numerous advantages, including the collection of reliable and accurate data, quick data collection, a broader scope of data analysis, and the elimination of bias, making it a valuable approach in the field of research. When considering the benefits of quantitative research, it’s essential to recognize its strengths in contrast to qualitative methods and its role in collecting and analyzing numerical data for a more comprehensive understanding of research topics.

Best Practices to Conduct Quantitative Research

Here are some best practices for conducting quantitative research:


  • Differentiate between quantitative and qualitative: Understand the difference between the two methodologies and apply the one that suits your needs best.
  • Choose a suitable sample size: Ensure that your sample is representative of your population and large enough to be statistically meaningful (see the sample-size sketch after this list).
  • Keep your research goals clear and concise: Know your research goals before you begin data collection to ensure you collect the right amount and the right kind of data.
  • Keep the questions simple: Remember that you will be reaching out to a demographically wide audience. Pose simple questions for your respondents to understand easily.
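For the sample-size point above, a commonly used back-of-the-envelope calculation is the formula for estimating a proportion. The sketch below assumes a 95% confidence level and a ±5% margin of error; these defaults are illustrative assumptions, not recommendations from this article:

```python
# Classic sample-size formula for estimating a proportion in a large population.
import math

z = 1.96    # z-score for 95% confidence
p = 0.5     # assumed proportion (0.5 gives the most conservative estimate)
e = 0.05    # desired margin of error

n = (z ** 2) * p * (1 - p) / (e ** 2)
print(math.ceil(n))   # about 385 respondents
```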

Quantitative Research vs Qualitative Research

Quantitative research and qualitative research are two distinct approaches to conducting research, each with its own set of methods and objectives. Here’s a comparison of the two:


Quantitative Research

  • Objective: The primary goal of quantitative research is to quantify and measure phenomena by collecting numerical data. It aims to test hypotheses, establish patterns, and generalize findings to a larger population.
  • Data Collection: Quantitative research employs systematic and standardized approaches for data collection, including techniques like surveys, experiments, and observations that involve predefined variables. It is often collected from a large and representative sample.
  • Data Analysis: Data is analyzed using statistical techniques, such as descriptive statistics, inferential statistics, and mathematical modeling. Researchers use statistical tests to draw conclusions and make generalizations based on numerical data.
  • Sample Size: Quantitative research often involves larger sample sizes to ensure statistical significance and generalizability.
  • Results: The results are typically presented in tables, charts, and statistical summaries, making them highly structured and objective.
  • Generalizability: Researchers intentionally structure quantitative research to generate outcomes that can be generalized to a larger population, and they frequently seek to establish causal connections.
  • Emphasis on Objectivity: Researchers aim to minimize bias and subjectivity, focusing on replicable and objective findings.

Qualitative Research

  • Objective: Qualitative research seeks to gain a deeper understanding of the underlying motivations, behaviors, and experiences of individuals or groups. It explores the context and meaning of phenomena.
  • Data Collection: Qualitative research employs adaptable and open-ended techniques for data collection, including methods like interviews, focus groups, observations, and content analysis. It allows participants to express their perspectives in their own words.
  • Data Analysis: Data is analyzed through thematic analysis, content analysis, or grounded theory. Researchers focus on identifying patterns, themes, and insights in the data.
  • Sample Size: Qualitative research typically involves smaller sample sizes due to the in-depth nature of data collection and analysis.
  • Results: Findings are presented in narrative form, often in the participants’ own words. Results are subjective, context-dependent, and provide rich, detailed descriptions.
  • Generalizability: Qualitative research does not aim for broad generalizability but focuses on in-depth exploration within a specific context. It provides a detailed understanding of a particular group or situation.
  • Emphasis on Subjectivity: Researchers acknowledge the role of subjectivity and the researcher’s influence on the research process. Participant perspectives and experiences are central to the findings.

Researchers choose between quantitative and qualitative research methods based on their research objectives and the nature of the research question. Each approach has its advantages and drawbacks, and the decision between them hinges on the particular research objectives and the data needed to address research inquiries effectively.

Quantitative research is a structured way of collecting and analyzing data from various sources. Its purpose is to quantify the problem and understand its extent, seeking results that someone can project to a larger population.

Companies that use quantitative rather than qualitative research typically aim to measure magnitudes and seek objectively interpreted statistical results. So if you want to obtain quantitative data that helps you define the structured cause-and-effect relationship between the research problem and the factors, you should opt for this type of research.

At QuestionPro, we have various data collection tools and features to conduct investigations of this type. You can create questionnaires and distribute them through our various methods. We also offer sample services and a variety of question types to help ensure the success of your study and the quality of the collected data.


Quantitative research is a systematic and structured approach to studying phenomena that involves the collection of measurable data and the application of statistical, mathematical, or computational techniques for analysis.

Quantitative research is characterized by structured tools like surveys, substantial sample sizes, closed-ended questions, reliance on prior studies, data presented numerically, and the ability to generalize findings to the broader population.

The two main methods of quantitative research are Primary quantitative research methods, involving data collection directly from sources, and Secondary quantitative research methods, which utilize existing data for analysis.

1. Surveying to measure employee engagement with numerical rating scales.
2. Analyzing sales data to identify trends in product demand and market share.
3. Examining test scores to assess the impact of a new teaching method on student performance.
4. Using website analytics to track user behavior and conversion rates for an online store.

1. Differentiate between quantitative and qualitative approaches.
2. Choose a representative sample size.
3. Define clear research goals before data collection.
4. Use simple and easily understandable survey questions.


Secondary Research: Definition, Methods, & Examples

This ultimate guide to secondary research helps you understand changes in market trends, customers’ buying patterns, and your competition using existing data sources.

In situations where you’re not involved in the data gathering process ( primary research ), you have to rely on existing information and data to arrive at specific research conclusions or outcomes. This approach is known as secondary research.

In this article, we’re going to explain what secondary research is, how it works, and share some examples of it in practice.


What is secondary research?

Secondary research, also known as desk research, is a research method that involves compiling existing data sourced from a variety of channels. This includes internal sources (e.g., in-house research) or, more commonly, external sources (such as government statistics, organizational bodies, and the internet).

Secondary research comes in several formats, such as published datasets, reports, and survey responses , and can also be sourced from websites, libraries, and museums.

The information is usually free — or available at a limited access cost — and gathered using surveys , telephone interviews, observation, face-to-face interviews, and more.

When using secondary research, researchers collect, verify, analyze, and incorporate it to help them achieve their research goals for the research period.

As well as the above, it can be used to review previous research into an area of interest. Researchers can look for patterns across data spanning several years and identify trends — or use it to verify early hypothesis statements and establish whether it’s worth continuing research into a prospective area.

How to conduct secondary research

There are five key steps to conducting secondary research effectively and efficiently:

1.    Identify and define the research topic

First, understand what you will be researching and define the topic by thinking about the research questions you want to be answered.

Ask yourself: What is the point of conducting this research? Then, ask: What do we want to achieve?

This may indicate an exploratory reason (why something happened) or confirm a hypothesis. The answers may indicate ideas that need primary or secondary research (or a combination) to investigate them.

2.    Find research and existing data sources

If secondary research is needed, think about where you might find the information. This helps you narrow down your secondary sources to those that help you answer your questions. What keywords do you need to use?

Which organizations are closely working on this topic already? Are there any competitors that you need to be aware of?

Create a list of the data sources, information, and people that could help you with your work.

3.    Begin searching and collecting the existing data

Now that you have the list of data sources, start accessing the data and collect the information into an organized system. This may mean you start setting up research journal accounts or making telephone calls to book meetings with third-party research teams to verify the details around data results.

As you search and access information, remember to check the data’s date, the credibility of the source, the relevance of the material to your research topic, and the methodology used by the third-party researchers. Start small and as you gain results, investigate further in the areas that help your research’s aims.

4.    Combine the data and compare the results

When you have your data in one place, you need to understand, filter, order, and combine it intelligently. Data may come in different formats where some data could be unusable, while other information may need to be deleted.

After this, you can start to look at different data sets to see what they tell you. You may find that you need to compare the same datasets over different periods for changes over time or compare different datasets to notice overlaps or trends. Ask yourself: What does this data mean to my research? Does it help or hinder my research?

5.    Analyze your data and explore further

In this last stage of the process, look at the information you have and ask yourself if this answers your original questions for your research. Are there any gaps? Do you understand the information you’ve found? If you feel there is more to cover, repeat the steps and delve deeper into the topic so that you can get all the information you need.

If secondary research can’t provide these answers, consider supplementing your results with data gained from primary research. As you explore further, add to your knowledge and update your findings. This will help you present clear, credible information.

Primary vs secondary research

Unlike secondary research, primary research involves creating data first-hand by directly working with interviewees, target users, or a target market. Primary research focuses on the method for carrying out research, asking questions, and collecting data using approaches such as:

  • Interviews (panel, face-to-face or over the phone)
  • Questionnaires or surveys
  • Focus groups

Using these methods, researchers can get in-depth, targeted responses to questions, making results more accurate and specific to their research goals. However, it does take time to do and administer.

Unlike primary research, secondary research uses existing data, which also includes published results from primary research. Researchers summarize the existing research and use the results to support their research goals.

Both primary and secondary research have their places. Primary research can support the findings found through secondary research (and fill knowledge gaps), while secondary research can be a starting point for further primary research. Because of this, these research methods are often combined for optimal research results that are accurate at both the micro and macro level.

Sources of Secondary Research

There are two types of secondary research sources: internal and external. Internal data refers to in-house data that can be gathered from the researcher’s organization. External data refers to data published outside of and not owned by the researcher’s organization.

Internal data

Internal data is a good first port of call for insights and knowledge, as you may already have relevant information stored in your systems. Because you own this information — and it won’t be available to other researchers — it can give you a competitive edge . Examples of internal data include:

  • Database information on sales history and business goal conversions
  • Information from website applications and mobile site data
  • Customer-generated data on product and service efficiency and use
  • Previous research results or supplemental research areas
  • Previous campaign results

External data

External data is useful when you: 1) need information on a new topic, 2) want to fill in gaps in your knowledge, or 3) want data that breaks down a population or market for trend and pattern analysis. Examples of external data include:

  • Government, non-government agencies, and trade body statistics
  • Company reports and research
  • Competitor research
  • Public library collections
  • Textbooks and research journals
  • Media stories in newspapers
  • Online journals and research sites

Three examples of secondary research methods in action

How and why might you conduct secondary research? Let’s look at a few examples:

1.    Collecting factual information from the internet on a specific topic or market

There are plenty of sites that hold data for people to view and use in their research. For example, Google Scholar, ResearchGate, or Wiley Online Library all provide previous research on a particular topic. Researchers can create free accounts and use the search facilities to look into a topic by keyword, before following the instructions to download or export results for further analysis.

This can be useful for exploring a new market that your organization is considering entering. For instance, by viewing the U.S. Census Bureau demographic data for that area, you can see what the demographics of your target audience are, and create compelling marketing campaigns accordingly.

2.    Finding out the views of your target audience on a particular topic

If you’re interested in seeing the historical views on a particular topic, for example, attitudes to women’s rights in the US, you can turn to secondary sources.

Textbooks, news articles, reviews, and journal entries can all provide qualitative reports and interviews covering how people discussed women’s rights. There may be multimedia elements like video or documented posters of propaganda showing biased language usage.

By gathering this information, synthesizing it, and evaluating the language, who created it and when it was shared, you can create a timeline of how a topic was discussed over time.

3.    When you want to know the latest thinking on a topic

Educational institutions, such as schools and colleges, create a lot of research-based reports on younger audiences or their academic specialisms. Dissertations from students also can be submitted to research journals, making these places useful places to see the latest insights from a new generation of academics.

Information can be requested — and sometimes academic institutions may want to collaborate and conduct research on your behalf. This can provide key primary data in areas that you want to research, as well as secondary data sources for your research.

Advantages of secondary research

There are several benefits of using secondary research, which we’ve outlined below:

  • Easily and readily available data – There is an abundance of readily accessible data sources that have been pre-collected for use, in person at local libraries and online using the internet. This data is usually sorted by filters or can be exported into spreadsheet format, meaning that little technical expertise is needed to access and use the data.
  • Faster research speeds – Since the data is already published and in the public arena, you don’t need to collect this information through primary research. This can make the research easier to do and faster, as you can get started with the data quickly.
  • Low financial and time costs – Most secondary data sources can be accessed for free or at a small cost to the researcher, so the overall research costs are kept low. In addition, by saving on preliminary research, the time costs for the researcher are kept down as well.
  • Secondary data can drive additional research actions – The insights gained can support future research activities (like conducting a follow-up survey or specifying future detailed research topics) or help add value to these activities.
  • Secondary data can be useful pre-research insights – Secondary source data can provide pre-research insights and information on effects that can help resolve whether research should be conducted. It can also help highlight knowledge gaps, so subsequent research can consider this.
  • Ability to scale up results – Secondary sources can include large datasets (like Census data results across several states) so research results can be scaled up quickly using large secondary data sources.

Disadvantages of secondary research

The disadvantages of secondary research are worth considering in advance of conducting research :

  • Secondary research data can be out of date – Secondary sources can be updated regularly, but if you’re exploring the data between two updates, the data can be out of date. Researchers will need to consider whether the data available provides the right research coverage dates, so that insights are accurate and timely, or if the data needs to be updated. Also, fast-moving markets may find secondary data expires very quickly.
  • Secondary research needs to be verified and interpreted – Where there’s a lot of data from one source, a researcher needs to review and analyze it. The data may need to be verified against other data sets or your hypotheses for accuracy and to ensure you’re using the right data for your research.
  • The researcher has had no control over the secondary research – As the researcher has not been involved in the secondary research, invalid data can affect the results. It’s therefore vital that the methodology and controls are closely reviewed so that the data is collected in a systematic and error-free way.
  • Secondary research data is not exclusive – As data sets are commonly available, there is no exclusivity and many researchers can use the same data. This can be problematic where researchers want to have exclusive rights over the research results and risk duplication of research in the future.

When do we conduct secondary research?

Now that you know the basics of secondary research, when do researchers normally conduct secondary research?

It’s often used at the beginning of research, when the researcher is trying to understand the current landscape . In addition, if the research area is new to the researcher, it can form crucial background context to help them understand what information exists already. This can plug knowledge gaps, supplement the researcher’s own learning or add to the research.

Secondary research can also be used in conjunction with primary research. Secondary research can become the formative research that helps pinpoint where further primary research is needed to find out specific information. It can also support or verify the findings from primary research.

You can use secondary research where high levels of control aren’t needed by the researcher, but a lot of knowledge on a topic is required from different angles.

Secondary research should not be used in place of primary research as both are very different and are used for various circumstances.

Questions to ask before conducting secondary research

Before you start your secondary research, ask yourself these questions:

  • Is there similar internal data that we have created for a similar area in the past?

If your organization has past research, it’s best to review this work before starting a new project. The older work may provide you with the answers, and give you a starting dataset and context of how your organization approached the research before. However, be mindful that the work is probably out of date and view it with that note in mind. Read through and look for where this helps your research goals or where more work is needed.

  • What am I trying to achieve with this research?

When you have clear goals and understand what you need to achieve, you can look for the right type of secondary or primary research to support those aims. Different secondary research data will provide you with different information – for example, looking at news stories for a breakdown of your market’s buying patterns won’t be as useful as internal or external e-commerce and sales data sources.

  • How credible will my research be?

If you are looking for credibility, you want to consider how accurate the research results will need to be, and if you can sacrifice credibility for speed by using secondary sources to get you started. Bear in mind which sources you choose — low-credibility data sites, like political party websites that are highly biased to favor their own party, would skew your results.

  • What is the date of the secondary research?

When you’re looking to conduct research, you want the results to be as useful as possible , so using data that is 10 years old won’t be as accurate as using data that was created a year ago. Since a lot can change in a few years, note the date of your research and look for earlier data sets that can tell you a more recent picture of results. One caveat to this is using data collected over a long-term period for comparisons with earlier periods, which can tell you about the rate and direction of change.

  • Can the data sources be verified? Does the information you have check out?

If you can’t verify the data by looking at the research methodology, speaking to the original team or cross-checking the facts with other research, it could be hard to be sure that the data is accurate. Think about whether you can use another source, or if it’s worth doing some supplementary primary research to replicate and verify results to help with this issue.

We created a front-to-back guide on conducting market research, The ultimate guide to conducting market research , so you can understand the research journey with confidence.

In it, you’ll learn more about:

  • What effective market research looks like
  • The use cases for market research
  • The most important steps to conducting market research
  • And how to take action on your research findings

Download the free guide for a clearer view on secondary research and other key research types for your business.


5 Methods of Data Collection for Quantitative Research


In this blog, read up on five different data collection techniques for quantitative research studies. 

Quantitative research forms the basis for many business decisions. But what is quantitative data collection, why is it important, and which data collection methods are used in quantitative research? 

Table of Contents: 

  • What is quantitative data collection?
  • The importance of quantitative data collection
  • Methods used for quantitative data collection
  • Example of a survey showing quantitative data
  • Strengths and weaknesses of quantitative data

What is quantitative data collection? 

Quantitative data collection is the gathering of numeric data that puts consumer insights into a quantifiable context. It typically involves a large number of respondents - large enough to extract statistically reliable findings that can be extrapolated to a larger population.

The actual data collection process for quantitative findings is typically done using a quantitative online questionnaire that asks respondents yes/no questions, ranking scales, rating matrices, and other quantitative question types. With these results, researchers can generate data charts to summarize the quantitative findings and generate easily digestible key takeaways. 
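As a rough illustration of that last step, the sketch below aggregates hypothetical yes/no responses and plots a simple summary chart with pandas and matplotlib; the question wording and data are invented for the example:

```python
# Summarizing yes/no questionnaire responses and plotting a simple chart.
import pandas as pd
import matplotlib.pyplot as plt

responses = pd.Series(["Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "No"])

counts = responses.value_counts(normalize=True) * 100   # convert to percentages
ax = counts.plot(kind="bar", title="Would you buy this product again?")
ax.set_ylabel("% of respondents")
plt.tight_layout()
plt.show()
```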


The importance of quantitative data collection 

Quantitative data collection can confirm or deny a brand's hypothesis, guide product development, tailor marketing materials, and much more. It provides brands with reliable information to make decisions off of (i.e. 86% like lemon-lime flavor or just 12% are interested in a cinnamon-scented hand soap). 

Compared to qualitative data collection, quantitative data collection allows insights to be compared against one another, since the higher base sizes make statistical significance testing possible. Brands can cut and analyze their dataset in a variety of ways, looking at their findings among different demographic groups, behavioral groups, and other segments of interest. It's also generally easier and quicker to collect quantitative data than it is to gather qualitative feedback, making it an important data collection tool for brands that need quick, reliable, concrete insights. 
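To make the point about statistical significance concrete, here is a minimal sketch of a chi-square test of independence comparing two hypothetical demographic groups on a single survey question; the counts are invented, and the test choice is just one common option rather than a method prescribed here:

```python
# Testing whether two groups differ significantly on a survey question.
from scipy.stats import chi2_contingency

#            Likes flavor, Dislikes flavor
observed = [[172, 28],   # Group A (n=200), hypothetical counts
            [154, 46]]   # Group B (n=200), hypothetical counts

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")   # p < 0.05 suggests a significant difference
```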

In order to make justified business decisions from quantitative data, brands need to recruit a high-quality sample that's reflective of their true target market (one that's comprised of all ages/genders rather than an isolated group). For example, a study into usage and attitudes around orange juice might include consumers who buy and/or drink orange juice at a certain frequency or who buy a variety of orange juice brands from different outlets. 

Methods used for quantitative data collection 

So, knowing what quantitative data collection is and why it's important, how does one go about researching a large, high-quality, representative sample?

Below are five examples of how to conduct your study through various data collection methods : 

Online quantitative surveys 

Online surveys are a common and effective way of collecting data from a large number of people. They tend to be made up of closed-ended questions so that responses across the sample are comparable; however, a small number of open-ended questions can be included as well (i.e., questions that require a written response rather than a selection from a closed-ended list). Open-ended questions are helpful for gathering the actual language respondents use on a certain issue or for collecting feedback on a view that might not be covered in a set list of responses.

Online surveys are quick and easy to send out, typically done so through survey panels. They can also appear in pop-ups on websites or via a link embedded in social media. From the participant’s point of view, online surveys are convenient to complete and submit, using whichever device they prefer (mobile phone, tablet, or computer). Anonymity is also viewed as a positive: online survey software ensures respondents’ identities are kept completely confidential.

To gather respondents for online surveys, researchers have several options. Probability sampling is one route, where respondents are selected using a random selection method. As such, everyone within the population has an equal chance of getting selected to participate. 

There are four common types of probability sampling . 

  • Simple random sampling is the most straightforward approach, which involves randomly selecting individuals from the population without any specific criteria or grouping. 
  • Stratified random sampling  divides the population into subgroups (strata) and selects a random sample from each stratum. This is useful when a population includes subgroups that you want to be sure you cover in your research. 
  • Cluster sampling  divides the population into clusters and then randomly selects some of the clusters to sample in their entirety. This is useful when a population is geographically dispersed and it would be impossible to include everyone.
  • Systematic sampling  begins with a random starting point and then selects every nth member of the population after that point (i.e. every 15th respondent). 
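As a small illustration of the stratified approach described above, the sketch below draws a 50% sample from each stratum of a hypothetical respondent table using pandas; the column names and group labels are assumptions made for the example:

```python
# Stratified random sampling: sample each subgroup so all strata are represented.
import pandas as pd

population = pd.DataFrame({
    "respondent_id": range(1, 13),
    "age_group": ["18-24", "18-24", "18-24", "18-24",
                  "25-34", "25-34", "25-34", "25-34",
                  "35-44", "35-44", "35-44", "35-44"],
})

# Randomly sample 50% of each age group (stratum).
stratified_sample = (
    population.groupby("age_group", group_keys=False)
    .sample(frac=0.5, random_state=42)
)
print(stratified_sample)
```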


While online surveys are by far the most common way to collect quantitative data in today’s modern age, there are still some harder-to-reach respondents where other mediums can be beneficial; for example, those who aren’t tech-savvy or who don’t have a stable internet connection. For these audiences, offline surveys   may be needed.

Offline quantitative surveys

Offline surveys (though much rarer to come across these days) are a way of gathering respondent feedback without digital means. This could be something like postal questionnaires that are sent out to a sample population and asked to return the questionnaire by mail (like the Census) or telephone surveys where questions are asked of respondents over the phone. 

Offline surveys certainly take longer to collect data than online surveys and they can become expensive if the population is difficult to reach (requiring a higher incentive). As with online surveys, anonymity is protected, assuming the mail is not intercepted or lost.

Despite the major difference in data collection to an online survey approach, offline survey data is still reported on in an aggregated, numeric fashion. 

Interviews

In-person interviews are another popular way of researching or polling a population. They can be thought of as a survey in a verbal, in-person, or virtual face-to-face format. The online format of interviews is becoming more popular, as it is cheaper and logistically easier to organize than in-person face-to-face interviews, yet it still allows the interviewer to see and hear the respondent in their own words.

Though many interviews are collected for qualitative research, interviews can also be leveraged quantitatively; like a phone survey, an interviewer runs through a survey with the respondent, asking mainly closed-ended questions (yes/no, multiple choice questions, or questions with rating scales that ask how strongly the respondent agrees with statements). The advantage of structured interviews is that the interviewer can pace the survey, making sure the respondent gives enough consideration to each question. It also adds a human touch, which can be more engaging for some respondents. On the other hand, for more sensitive issues, respondents may feel more inclined to complete a survey online for a greater sense of anonymity - so it all depends on your research questions, the survey topic, and the audience you're researching.

Observations

Observation studies in quantitative research are similar in nature to qualitative ethnographic studies (in which a researcher also observes consumers in their natural habitats), yet observation studies for quant research remain focused on the numbers - how many people perform an action, how much of a product consumers pick up, etc.

For quantitative observations, researchers record the number and types of people who perform a certain action - such as choosing a specific product from a grocery shelf, speaking to a company representative at an event, or passing through a certain area within a given timeframe. Observation studies are generally structured, with the observer asked to note behavior using set parameters. Structured observation means that the observer has to home in on very specific behaviors, which can be quite nuanced. This requires the observer to use their own judgment about what type of behavior is being exhibited (e.g. reading labels on products before selecting them; considering different items before making the final choice; making a selection based on price).

Document reviews and secondary data sources

A fifth method of data collection for quantitative research is known as secondary research: reviewing existing research to see how it can contribute to understanding a new issue in question. This contrasts with the primary research methods above, which involve research specially commissioned and carried out for a particular research project.

There are numerous secondary data sources that researchers can analyze, such as public records, government research, company databases, existing reports, paid-for research publications, magazines, journals, case studies, websites, books, and more.

Aside from using secondary research alone, secondary research documents can also be used in anticipation of primary research, to understand which knowledge gaps need to be filled and to nail down the issues that might be important to explore further in a primary research study.

Example of a survey showing quantitative data 

The study below shows what quantitative data might look like in a final study dashboard, taken from quantilope's Sneaker category insights study.

The study includes a variety of usage and attitude metrics around sneaker wear, sneaker purchases, seasonality of sneakers, and more. Check out some of the data charts below showing these quantitative data findings - the first of which even cuts the quantitative data findings by demographics. 

[Chart: sneaker study usage and attitude data, cut by demographics]

Beyond these basic usage and attitude (or, descriptive) data metrics, quantitative data also includes advanced methods - such as implicit association testing. See what these quantitative data charts look like from the same sneaker study below:

[Chart: implicit association test results from the sneaker study]

These are just a few examples of how a researcher or insights team might present their quantitative data findings. However, there are many ways to visualize quantitative data in an insights study - bar charts, column charts, pie charts, donut charts, spider charts, and more - depending on what best suits the story your data is telling.

Strengths and weaknesses of quantitative data collection

Quantitative data is a great way to capture informative insights about your brand, product, category, or competitors. It's relatively quick to collect (depending on your sample audience) and more affordable than other data collection methods such as qualitative focus groups. With quantitative panels, it's easy to access nearly any audience you might need - from something as general as the US population to something as specific as cannabis users. There are many ways to visualize quantitative findings, making it a customizable form of insights - whether you want to show the data in a bar chart, pie chart, etc.

For those looking for quick, affordable, actionable insights, quantitative studies are the way to go.  

Quantitative data collection, despite the many benefits outlined above, might not be the right fit for your exact needs. For example, you often don't get answers as detailed and in-depth as you would with an in-person interview, focus group, or ethnographic observation (all forms of qualitative research). When running a quantitative survey, it's best practice to review your data for quality measures to ensure all respondents are ones you want to keep in your data set. Fortunately, there are many precautions research providers can take to navigate these obstacles - such as automated data cleaners and data flags. Of course, the first step to ensuring high-quality results is to use a trusted panel provider.

Quantitative research typically needs to undergo statistical analysis for it to be useful and actionable to any business. It is therefore crucial that the method of data collection, sample size, and sample criteria are considered in light of the research questions asked.

quantilope’s online platform is ideal for quantitative research studies. The online format means a large sample can be reached easily and quickly through connected respondent panels that effectively reach the desired target audience. Response rates are high, as respondents can take their survey from anywhere, using any device with internet access.

Surveys are easy to build with quantilope's online survey builder. Simply choose questions from pre-designed survey templates or build your own using the platform's drag & drop functionality (both options are fully customizable). Once the survey is live, findings update in real time, so brands can get an idea of consumer attitudes long before the survey is complete. In addition to basic usage and attitude questions, quantilope's suite of advanced research methodologies provides an AI-driven approach to many types of research questions - from exploring the product features that drive purchase through a Key Driver Analysis, to compiling the ideal portfolio of products using a TURF, to identifying the optimal price point for a product or service using a Price Sensitivity Meter (PSM).
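As an illustration of the idea behind a key driver analysis (relating attribute ratings to an overall outcome to see which attributes matter most), here is a simple correlation-based sketch in Python. This is a generic, hypothetical example - it is not quantilope's implementation, and real key driver analyses typically use more robust techniques such as regression-based relative importance.

```python
# Generic, correlation-based sketch of a key driver analysis (hypothetical data).
import pandas as pd

df = pd.DataFrame({
    "comfort":     [8, 6, 9, 5, 7, 8, 4, 9],
    "style":       [7, 8, 9, 4, 6, 7, 5, 8],
    "price_value": [5, 7, 8, 3, 6, 6, 4, 7],
    "overall_sat": [8, 7, 9, 4, 6, 8, 4, 9],
})

drivers = ["comfort", "style", "price_value"]
importance = df[drivers].corrwith(df["overall_sat"]).sort_values(ascending=False)
print(importance)  # higher correlation = stronger (simple) driver of overall satisfaction
```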

Depending on the type of data sought, it might be worth considering a mixed-method approach, including both qual and quant in a single research study. Alongside quantitative online surveys, quantilope's video research solution, inColor, offers qualitative research in the form of videoed responses to survey questions. inColor's qualitative data analysis includes an AI-driven read on respondent sentiment, keyword trends, and facial expressions.

To find out more about how quantilope can help with any aspect of your research design, and to start conducting high-quality quantitative research, get in touch with the quantilope team.

Quantitative Data – Types, Methods and Examples


Quantitative Data

Definition:

Quantitative data refers to numerical data that can be measured or counted. This type of data is often used in scientific research and is typically collected through methods such as surveys, experiments, and statistical analysis.

Quantitative Data Types

There are two main types of quantitative data: discrete and continuous. A brief illustration of the difference follows the list below.

  • Discrete data: Discrete data refers to numerical values that can only take on specific, distinct values. This type of data is typically represented as whole numbers and cannot be broken down into smaller units. Examples of discrete data include the number of students in a class, the number of cars in a parking lot, and the number of children in a family.
  • Continuous data: Continuous data refers to numerical values that can take on any value within a certain range or interval. This type of data is typically represented as decimal or fractional values and can be broken down into smaller units. Examples of continuous data include measurements of height, weight, temperature, and time.
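As a quick illustration of the distinction (with hypothetical values):

```python
# Discrete data: whole-number counts. Continuous data: measurements on a scale.
import statistics

class_sizes = [28, 31, 27, 30, 29]        # discrete: number of students per class
heights_m = [1.72, 1.68, 1.81, 1.645]     # continuous: heights in metres

# Discrete values are often summarised with frequencies; continuous values with
# means, medians, and other measures on a numeric scale.
print(statistics.mean(class_sizes), statistics.mean(heights_m))
```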

Quantitative Data Collection Methods

There are several common methods for collecting quantitative data. Some of these methods include:

  • Surveys : Surveys involve asking a set of standardized questions to a large number of people. Surveys can be conducted in person, over the phone, via email or online, and can be used to collect data on a wide range of topics.
  • Experiments : Experiments involve manipulating one or more variables and observing the effects on a specific outcome. Experiments can be conducted in a controlled laboratory setting or in the real world.
  • Observational studies : Observational studies involve observing and collecting data on a specific phenomenon without intervening or manipulating any variables. Observational studies can be conducted in a natural setting or in a laboratory.
  • Secondary data analysis : Secondary data analysis involves using existing data that was collected for a different purpose to answer a new research question. This method can be cost-effective and efficient, but it is important to ensure that the data is appropriate for the research question being studied.
  • Physiological measures: Physiological measures involve collecting data on biological or physiological processes, such as heart rate, blood pressure, or brain activity.
  • Computerized tracking: Computerized tracking involves collecting data automatically from electronic sources, such as social media, online purchases, or website analytics.

Quantitative Data Analysis Methods

There are several methods for analyzing quantitative data, including:

  • Descriptive statistics: Descriptive statistics are used to summarize and describe the basic features of the data, such as the mean, median, mode, standard deviation, and range.
  • Inferential statistics: Inferential statistics are used to make generalizations about a population based on a sample of data. These methods include hypothesis testing, confidence intervals, and regression analysis (a brief sketch of descriptive and inferential analysis follows this list).
  • Data visualization: Data visualization involves creating charts, graphs, and other visual representations of the data to help identify patterns and trends. Common types of data visualization include histograms, scatterplots, and bar charts.
  • Time series analysis: Time series analysis involves analyzing data that is collected over time to identify patterns and trends in the data.
  • Multivariate analysis : Multivariate analysis involves analyzing data with multiple variables to identify relationships between the variables.
  • Factor analysis : Factor analysis involves identifying underlying factors or dimensions that explain the variation in the data.
  • Cluster analysis: Cluster analysis involves identifying groups or clusters of observations that are similar to each other based on multiple variables.
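To make the first two methods concrete, here is a minimal sketch combining descriptive statistics with a simple inferential test. The scores, group labels, and use of SciPy's independent-samples t-test are illustrative assumptions only.

```python
# Descriptive summary of two hypothetical groups, plus an inferential t-test.
import statistics
from scipy import stats

group_a = [7.1, 6.8, 7.4, 8.0, 6.5, 7.2, 7.9, 6.9]  # e.g., satisfaction scores, segment A
group_b = [6.2, 6.0, 6.8, 5.9, 6.5, 6.1, 6.7, 6.3]  # e.g., satisfaction scores, segment B

# Descriptive statistics: mean, median, and standard deviation for each group.
print("A:", statistics.mean(group_a), statistics.median(group_a), round(statistics.stdev(group_a), 2))
print("B:", statistics.mean(group_b), statistics.median(group_b), round(statistics.stdev(group_b), 2))

# Inferential statistics: test whether the difference in means is statistically significant.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # a small p-value suggests a real difference
```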

Quantitative Data Formats

Quantitative data can be represented in different formats, depending on the nature of the data and the purpose of the analysis. Here are some common formats:

  • Tables : Tables are a common way to present quantitative data, particularly when the data involves multiple variables. Tables can be used to show the frequency or percentage of data in different categories or to display summary statistics.
  • Charts and graphs: Charts and graphs are useful for visualizing quantitative data and can be used to highlight patterns and trends in the data. Some common types of charts and graphs include line charts, bar charts, scatterplots, and pie charts.
  • Databases : Quantitative data can be stored in databases, which allow for easy sorting, filtering, and analysis of large amounts of data.
  • Spreadsheets : Spreadsheets can be used to organize and analyze quantitative data, particularly when the data is relatively small in size. Spreadsheets allow for calculations and data manipulation, as well as the creation of charts and graphs.
  • Statistical software : Statistical software, such as SPSS, R, and SAS, can be used to analyze quantitative data. These programs allow for more advanced statistical analyses and data modeling, as well as the creation of charts and graphs.

Quantitative Data Gathering Guide

Here is a basic guide for gathering quantitative data:

  • Define the research question: The first step in gathering quantitative data is to clearly define the research question. This will help determine the type of data to be collected, the sample size, and the methods of data analysis.
  • Choose the data collection method: Select the appropriate method for collecting data based on the research question and available resources. This could include surveys, experiments, observational studies, or other methods.
  • Determine the sample size: Determine the appropriate sample size for the research question. This will depend on the level of precision needed and the variability of the population being studied (a worked example follows this list).
  • Develop the data collection instrument: Develop a questionnaire or survey instrument that will be used to collect the data. The instrument should be designed to gather the specific information needed to answer the research question.
  • Pilot test the data collection instrument : Before collecting data from the entire sample, pilot test the instrument on a small group to identify any potential problems or issues.
  • Collect the data: Collect the data from the selected sample using the chosen data collection method.
  • Clean and organize the data : Organize the data into a format that can be easily analyzed. This may involve checking for missing data, outliers, or errors.
  • Analyze the data: Analyze the data using appropriate statistical methods. This may involve descriptive statistics, inferential statistics, or other types of analysis.
  • Interpret the results: Interpret the results of the analysis in the context of the research question. Identify any patterns, trends, or relationships in the data and draw conclusions based on the findings.
  • Communicate the findings: Communicate the findings of the analysis in a clear and concise manner, using appropriate tables, graphs, and other visual aids as necessary. The results should be presented in a way that is accessible to the intended audience.
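For the sample size step above, one common starting point is the standard formula for estimating a proportion at a given confidence level and margin of error. The sketch below is a simplified illustration; it ignores finite-population correction and design effects.

```python
# n = z^2 * p(1 - p) / e^2, rounded up.
# expected_p = 0.5 gives the most conservative (largest) sample size.
import math

def sample_size_for_proportion(margin_of_error=0.05, z=1.96, expected_p=0.5):
    n = (z ** 2) * expected_p * (1 - expected_p) / (margin_of_error ** 2)
    return math.ceil(n)

print(sample_size_for_proportion())      # ~385 respondents for +/-5% at 95% confidence
print(sample_size_for_proportion(0.03))  # ~1068 respondents for +/-3% at 95% confidence
```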

Examples of Quantitative Data

Here are some examples of quantitative data:

  • Height of a person (measured in inches or centimeters)
  • Weight of a person (measured in pounds or kilograms)
  • Temperature (measured in Fahrenheit or Celsius)
  • Age of a person (measured in years)
  • Number of cars sold in a month
  • Amount of rainfall in a specific area (measured in inches or millimeters)
  • Number of hours worked in a week
  • GPA (grade point average) of a student
  • Sales figures for a product
  • Time taken to complete a task.
  • Distance traveled (measured in miles or kilometers)
  • Speed of an object (measured in miles per hour or kilometers per hour)
  • Number of people attending an event
  • Price of a product (measured in dollars or other currency)
  • Blood pressure (measured in millimeters of mercury)
  • Amount of sugar in a food item (measured in grams)
  • Test scores (measured on a numerical scale)
  • Number of website visitors per day
  • Stock prices (measured in dollars)
  • Crime rates (measured by the number of crimes per 100,000 people)

Applications of Quantitative Data

Quantitative data has a wide range of applications across various fields, including:

  • Scientific research: Quantitative data is used extensively in scientific research to test hypotheses and draw conclusions. For example, in biology, researchers might use quantitative data to measure the growth rate of cells or the effectiveness of a drug treatment.
  • Business and economics: Quantitative data is used to analyze business and economic trends, forecast future performance, and make data-driven decisions. For example, a company might use quantitative data to analyze sales figures and customer demographics to determine which products are most popular among which segments of their customer base.
  • Education: Quantitative data is used in education to measure student performance, evaluate teaching methods, and identify areas where improvement is needed. For example, a teacher might use quantitative data to track the progress of their students over the course of a semester and adjust their teaching methods accordingly.
  • Public policy: Quantitative data is used in public policy to evaluate the effectiveness of policies and programs, identify areas where improvement is needed, and develop evidence-based solutions. For example, a government agency might use quantitative data to evaluate the impact of a social welfare program on poverty rates.
  • Healthcare : Quantitative data is used in healthcare to evaluate the effectiveness of medical treatments, track the spread of diseases, and identify risk factors for various health conditions. For example, a doctor might use quantitative data to monitor the blood pressure levels of their patients over time and adjust their treatment plan accordingly.

Purpose of Quantitative Data

The purpose of quantitative data is to provide a numerical representation of a phenomenon or observation. Quantitative data is used to measure and describe the characteristics of a population or sample, and to test hypotheses and draw conclusions based on statistical analysis. Some of the key purposes of quantitative data include:

  • Measuring and describing : Quantitative data is used to measure and describe the characteristics of a population or sample, such as age, income, or education level. This allows researchers to better understand the population they are studying.
  • Testing hypotheses: Quantitative data is often used to test hypotheses and theories by collecting numerical data and analyzing it using statistical methods. This can help researchers determine whether there is a statistically significant relationship between variables or whether there is support for a particular theory.
  • Making predictions : Quantitative data can be used to make predictions about future events or trends based on past data. This is often done through statistical modeling or time series analysis.
  • Evaluating programs and policies: Quantitative data is often used to evaluate the effectiveness of programs and policies. This can help policymakers and program managers identify areas where improvements can be made and make evidence-based decisions about future programs and policies.

When to use Quantitative Data

Quantitative data is appropriate to use when you want to collect and analyze numerical data that can be measured and analyzed using statistical methods. Here are some situations where quantitative data is typically used:

  • When you want to measure a characteristic or behavior : If you want to measure something like the height or weight of a population or the number of people who smoke, you would use quantitative data to collect this information.
  • When you want to compare groups: If you want to compare two or more groups, such as comparing the effectiveness of two different medical treatments, you would use quantitative data to collect and analyze the data.
  • When you want to test a hypothesis : If you have a hypothesis or theory that you want to test, you would use quantitative data to collect data that can be analyzed statistically to determine whether your hypothesis is supported by the data.
  • When you want to make predictions: If you want to make predictions about future trends or events, such as predicting sales for a new product, you would use quantitative data to collect and analyze data from past trends to make your prediction.
  • When you want to evaluate a program or policy : If you want to evaluate the effectiveness of a program or policy, you would use quantitative data to collect data about the program or policy and analyze it statistically to determine whether it has had the intended effect.

Characteristics of Quantitative Data

Quantitative data is characterized by several key features, including:

  • Numerical values : Quantitative data consists of numerical values that can be measured and counted. These values are often expressed in terms of units, such as dollars, centimeters, or kilograms.
  • Continuous or discrete : Quantitative data can be either continuous or discrete. Continuous data can take on any value within a certain range, while discrete data can only take on certain values.
  • Objective: Quantitative data is objective, meaning that it is not influenced by personal biases or opinions. It is based on empirical evidence that can be measured and analyzed using statistical methods.
  • Large sample size: Quantitative data is often collected from a large sample size in order to ensure that the results are statistically significant and representative of the population being studied.
  • Statistical analysis: Quantitative data is typically analyzed using statistical methods to determine patterns, relationships, and other characteristics of the data. This allows researchers to make more objective conclusions based on empirical evidence.
  • Precision : Quantitative data is often very precise, with measurements taken to multiple decimal points or significant figures. This precision allows for more accurate analysis and interpretation of the data.

Advantages of Quantitative Data

Some advantages of quantitative data are:

  • Objectivity : Quantitative data is usually objective because it is based on measurable and observable variables. This means that different people who collect the same data will generally get the same results.
  • Precision : Quantitative data provides precise measurements of variables. This means that it is easier to make comparisons and draw conclusions from quantitative data.
  • Replicability : Since quantitative data is based on objective measurements, it is often easier to replicate research studies using the same or similar data.
  • Generalizability : Quantitative data allows researchers to generalize findings to a larger population. This is because quantitative data is often collected using random sampling methods, which help to ensure that the data is representative of the population being studied.
  • Statistical analysis : Quantitative data can be analyzed using statistical methods, which allows researchers to test hypotheses and draw conclusions about the relationships between variables.
  • Efficiency : Quantitative data can often be collected quickly and efficiently using surveys or other standardized instruments, which makes it a cost-effective way to gather large amounts of data.

Limitations of Quantitative Data

Some Limitations of Quantitative Data are as follows:

  • Limited context: Quantitative data does not provide information about the context in which the data was collected. This can make it difficult to understand the meaning behind the numbers.
  • Limited depth: Quantitative data is often limited to predetermined variables and questions, which may not capture the complexity of the phenomenon being studied.
  • Difficulty in capturing qualitative aspects: Quantitative data is unable to capture the subjective experiences and qualitative aspects of human behavior, such as emotions, attitudes, and motivations.
  • Possibility of bias: The collection and interpretation of quantitative data can be influenced by biases, such as sampling bias, measurement bias, or researcher bias.
  • Simplification of complex phenomena: Quantitative data may oversimplify complex phenomena by reducing them to numerical measurements and statistical analyses.
  • Lack of flexibility: Quantitative data collection methods may not allow for changes or adaptations in the research process, which can limit the ability to respond to unexpected findings or new insights.


Secondary Quantitative Data Collection

Secondary Data

Data that have already been collected and are readily available from other sources are called secondary data. Compared to primary data, secondary data are cheaper and more quickly obtainable. Usually, desk-based research is used to collect secondary data. After obtaining secondary data, the researcher should examine their validity and reliability; the researcher should therefore use secondary data that are highly valid and well referenced in academic articles (Creswell, 2003).

Types of secondary data

Secondary data is categorised as internal data (from the organization under observation), which is routinely supplied by management, and external data (from outside the organization), which is obtained from various sources such as the internet, journals, books, directories, non-governmental statistics, and census data.

Figure 1. Types of Secondary data (Source: http://www.ilo.org)

Based on the suggestions of Bryman (1989), Dale et al. (1988), and Hakim (1982), secondary data is also classified into documentary data, survey-based data, and multiple-source secondary data.

Documentary data

Written documents such as notices, correspondence, minutes of meetings, reports to shareholders, diaries, transcripts of speeches, and administrative and public records come under this category. Non-written documents such as tape and video recordings, pictures, drawings, films, television programmes, and DVDs/CDs can also be considered documentary data.

Survey-based secondary data

Data collected by questionnaires that have already been analyzed for their original purpose are called survey-based secondary data. They can come from:

  • Continuous/regular surveys
  • Ad hoc surveys

Multiple-source secondary data

Multiple-source secondary data can be based on documentary data, survey data, or a combination of the two. Such data can be used in cohort studies that follow the same population over a period of time, and also to develop area-based data sets.

Sources of secondary data

  • Reports from central, state, or local governments
  • Reports from international firms and foreign governments
  • Journals, magazines, and books
  • Publications from research scholars, universities, and research groups

These are the major sources of secondary data (Dash, 2011).

Uses of secondary data

Although all data is intended to provide information for analysis and decision making, secondary data can be used in several ways in the context and conduct of a research or consultancy project. According to Malhotra and Birks (2000) and McDaniel and Gates (2004), secondary data can be used in the following ways:

  • To identify the research problem
  • To develop a strategy to arrive at the solutions for the research problem
  • To construct a sampling plan
  • To formulate a suitable research design
  • To find out the answers for certain research questions or to test some hypotheses
  • To interpret primary data
  • To validate the outcomes from qualitative research
  • To identify the potential problems
  • To obtain the required background information and to improve the credibility of the study.

Evaluation of secondary data

Researchers must view secondary data with the same caution as any primary data. They should check whether they can access the data and whether the available secondary data will support the research objectives. Secondary data sources should be evaluated against the following criteria (Stewart, 1984).

Methodology

It is necessary to evaluate secondary data based on the methodology through which it was collected (Stewart, 1984). Hence, the researcher has to evaluate factors such as the sampling procedure, sample size, response rate, fieldwork procedures, and data analysis methods.

Error/accuracy

The researcher should assess the accuracy of data from a secondary source in order to establish the trustworthiness of the study. However, the specifications and methodologies are often not reported in detail, so it can be very difficult to assess the accuracy of secondary data. In such cases, the researcher can check accuracy through triangulation.

Date of data collection

Because secondary data relates to events that have already happened, it is often outdated. The researcher therefore has to consider the date of data collection, the time between collection and publication, and the relevance of the data to the current situation. In the case of census data, the date of collection is a major issue, as it is collected only once every few years.

Purpose of data collection

Again, by definition, secondary data is data that was collected for a purpose or objective other than the one the researcher or consultant is now addressing. The researcher must assess the extent to which data collected with another purpose in mind is appropriate and relevant to the current situation or problem.

Nature: content of data

Data with high validity and accuracy is of little use if its content is not suitable. Sometimes there is no proper link between the relationships examined in the data and the measurement categories the researcher wants to adopt. For example, the data may consider the relationship between salary levels and motivation, but motivation may have been defined and measured in a way that is inconsistent with how the researcher using the secondary data wishes to measure it.

Dependability/source credibility

Factors such as the expertise, credibility, and overall trustworthiness of the source should be considered when evaluating secondary data resources. In general, government reports are found to be more credible than commercial sources of secondary data. If the researcher knows the provider of the data, the primary data collection methodologies that were used, or the source of the primary data (whether original or acquired), dependability and source credibility will be higher (Stewart, 1984).

These are the key factors in the evaluation of secondary data resources. In addition, other considerations such as costs and benefits, access, and control over data quality should also be weighed when evaluating secondary data (Stewart & Kamins, 1993; Denscombe, 1998).

Advantages of secondary data

Researchers (Boslaugh, 2007) have observed the following advantages with secondary data:

  • Lower resource requirements
  • Unobtrusive method
  • Suitable for longitudinal studies
  • Comparative and contextual data can be obtained
  • Can lead to unforeseen discoveries
  • Durability of data

Disadvantages of secondary data

  • The purpose for which the data was collected may not match the objectives of the researcher, and it may sometimes be very difficult to get access to the data.
  • Aggregations and definitions may not match the expectations of the researcher.
  • The researcher has no control over data quality

Using more than one method

Sometimes the researcher may need to adopt more than one method of data collection; for example, conducting secondary research followed by observation and focus group interviews. Because this approach combines two or more methods, it is referred to as triangulation. Most management and consultancy research is not exclusively one or the other; it may include both qualitative and quantitative research. Similarly, focus groups and surveys can be used together in experimental research.

Factors affecting the selection of the research method

The research methodology adopted has a strong influence on the success of the study, so the researcher should be careful when selecting it. To arrive at outcomes of the expected quality, various factors should be considered when selecting research methods. Among them, validity, reliability, and generalizability are the most important factors influencing the selection of a research method.

Validity and reliability of measures

Validity and reliability of measures are two important parts of the research process. As stated by Bryman and Cramer (2005), "It is generally accepted that when a concept has been operationally defined, in that a measure of it has been proposed, the ensuing measurement device should be both valid and reliable". Accuracy or exactness of data relates to the validity measure (Churchill & Iacobucci, 2002), which denotes the accuracy of the survey instrument (Litwin, 1995). As stated by Oppenheim (1984), Baines and Chansarkar (2002), Parasuraman (1991), and Peterson (2000), the validity measure checks whether the survey instrument measures the variables under study. Validity can be inferred through direct assessment of validity and indirect assessment via reliability (Churchill & Iacobucci, 2002).
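As a concrete illustration of an indirect (reliability-based) check, below is a minimal sketch of Cronbach's alpha, a widely used measure of the internal consistency of a multi-item scale. The item scores are hypothetical, and the function is a bare-bones illustration rather than a substitute for dedicated statistical software.

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array with rows = respondents and columns = scale items."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

scores = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
])
print(round(cronbach_alpha(scores), 2))  # values above ~0.7 are commonly read as acceptable
```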

Generalizability

Generalizability, another dimension of validity, relates to the degree to which research outcomes can be generalised to other situations. Generalizability is crucial in two respects. (i) Sampling is often used to generate data in research and consultancy projects, and the researcher should be able to examine the extent to which results from the sample will also hold in the wider population from which the sample was drawn. Generalizability is therefore associated not only with the data collection methods but also with the sample design and sampling method. (ii) It also concerns the extent to which the data and results of a particular research project can be generalized to other situations. This, of course, is crucial in developing theories, particularly in the deductive approach to research.

Hence, these three measures support the researcher in generating scientific and reproducible research outcomes. It is believed that they were originally developed for research in the physical sciences, where quantitative data is mostly used (Easterby-Smith et al., 2002).

Objectives/purpose of research

The objectives and purpose of the research strongly influence the selection of the research method. For example, the researcher may be attempting to arrive at solutions for a particular set of questions, and the methodology or techniques should be chosen with these objectives in mind.

Skills and expertise of the researcher

An experienced researcher may be able to opt for any of the alternative research methodologies and data collection techniques, but not every researcher can be an expert in all types of research methods. The researcher should opt for the methodology in which they have the highest degree of expertise and familiarity; this factor is a major determinant of the success of the study.

Cost/budgets

The cost of the proposed research methodology is an important factor in the choice of research methods, as data collection costs are high in many projects. Sometimes the research methodology that would be most effective in quality and potential value is rejected in favour of a slightly less effective but cheaper methodology.

Time

The researcher should also consider the time available to complete the research while choosing the research methodology. If only a short duration has been allotted, time-consuming data collection methods such as large-scale surveys should be avoided.

Availability

To make the study easier, the researcher should opt for readily available and accessible research methods. Because secondary data already exists, it is usually easy to collect. Sometimes, however, the data may not be available, in which case an alternative method of data collection must be chosen.

Preferences/values of the consultant

The consultant should not allow their own preferences and values to influence the choice of research methodology; instead, they should give weight to the preferences of the client and to the validity and cost of the methods.

Preferences/values of the client

The preferences and values of the client strongly influence the choice between alternative research methodologies in consultancy research projects. For example, the consultancy may feel that data for a particular project can be effectively collected through observation, but the client may say that their department will not allow an observation method. In such cases, the research methodology should be discussed at the contracting stage of the consultancy process.

Ethical, legal and other issues

The consultant and client should consider several other issues when selecting between alternative research methodologies. Ethical issues concerning the choice of research methodology and data collection techniques are one example; for instance, there may be problems with informing participants that they are under observation (Wysocki, 2004). There may also be issues around the collection, storage, and use of data. Most countries deal with these issues through legislation; for example, the Data Protection Act was passed in the UK to address them.

Factors to consider before utilizing secondary data

At Statswork, we are very conscious of the following when collecting secondary data:

  • 1. The reliability of the data
  • 2. The accuracy and quality of the data for the specific research goal
  • 3. The suitability of the data


Primary vs Secondary Research: Differences, Methods, Sources, and More

Primary vs Secondary Research – What’s the Difference?

In the search for knowledge and data to inform decisions, researchers and analysts rely on a blend of research sources. These sources are broadly categorized into primary and secondary research, each serving unique purposes and offering different insights into the subject matter at hand. But what exactly sets them apart?

Primary research is the process of gathering fresh data directly from its source. This approach offers real-time insights and specific information tailored to specific objectives set by stakeholders. Examples include surveys, interviews, and observational studies.

Secondary research , on the other hand, involves the analysis of existing data, most often collected and presented by others. This type of research is invaluable for understanding broader trends, providing context, or validating hypotheses. Common sources include scholarly articles, industry reports, and data compilations.

The crux of the difference lies in the origin of the information: primary research yields firsthand data which can be tailored to a specific business question, whilst secondary research synthesizes what's already out there. In essence, primary research listens directly to the voice of the subject, whereas secondary research hears it secondhand.

When to Use Primary and Secondary Research

Selecting the appropriate research method is pivotal and should be aligned with your research objectives. The choice between primary and secondary research is not merely procedural but strategic, influencing the depth and breadth of insights you can uncover.

Primary research shines when you need up-to-date, specific information directly relevant to your study. It's the go-to for fresh insights, understanding consumer behavior, or testing new theories. Its bespoke nature makes it indispensable for tailoring questions to get the exact answers you need.


Secondary research is your first step into the research world. It helps set the stage by offering a broad understanding of the topic. Before diving into costly primary research, secondary research can validate the need for further investigation or provide a solid background to build upon. It's especially useful for identifying trends, benchmarking, and situating your research within the existing body of knowledge.

Combining both methods can significantly enhance your research. Starting with secondary research lays the groundwork and narrows the focus, whilst subsequent primary research delves deep into specific areas of interest, providing a well-rounded, comprehensive understanding of the topic.

Primary vs Secondary Research Methods

In the landscape of market research, the methodologies employed can significantly influence the insights and conclusions drawn. Let's delve deeper into the various methods underpinning both primary and secondary research, shedding light on their unique applications and the distinct insights they offer.

Primary Research Methods:

  • Surveys: Surveys are a cornerstone of primary research, offering a quantitative approach to gathering data directly from the target audience. By employing structured questionnaires, researchers can collect a vast array of data ranging from customer preferences to behavioral patterns. This method is particularly valuable for acquiring statistically significant data that can inform decision-making processes and strategy development. Applying statistical approaches to this data, such as key driver analysis, MaxDiff, or conjoint analysis, can further enhance the value of the collected data.
  • One on One Interviews: Interviews provide a qualitative depth to primary research, allowing for a nuanced exploration of participants' attitudes, experiences, and motivations. Conducted either face-to-face or remotely, interviews enable researchers to delve into the complexities of human behavior, offering rich insights that surveys alone may not uncover. This method is instrumental in exploring new areas of research or obtaining detailed information on specific topics.
  • Focus Groups: Focus groups bring together a small, diverse group of participants to discuss and provide feedback on a particular subject, product, or idea. This interactive setting fosters a dynamic exchange of ideas, revealing consumers' perceptions, experiences, and preferences. Focus groups are invaluable for testing concepts, exploring market trends, and understanding the factors that influence consumer decisions.
  • Ethnographic Studies: Ethnographic studies involve the systematic watching, recording, and analysis of behaviors and events in their natural setting. This method offers an unobtrusive way to gather authentic data on how people interact with products, services, or environments, providing insights that can lead to more user-centered design and marketing strategies.

Secondary Research Methods:

  • Literature Reviews: Literature reviews involve the comprehensive examination of existing research and publications on a given topic. This method enables researchers to synthesize findings from a range of sources, providing a broad understanding of what is already known about a subject and identifying gaps in current knowledge.
  • Meta-Analysis: Meta-analysis is a statistical technique that combines the results of multiple studies to arrive at a comprehensive conclusion. This method is particularly useful in secondary research for aggregating findings across different studies, offering a more robust understanding of the evidence on a particular topic.
  • Content Analysis: Content analysis is a method for systematically analyzing texts, media, or other content to quantify patterns, themes, or biases. This approach allows researchers to assess the presence of certain words, concepts, or sentiments within a body of work, providing insights into trends, representations, and societal norms. It can be performed across a range of sources including social media, customer forums, or review sites (a simple word-frequency sketch follows this list).
  • Historical Research: Historical research involves the study of past events, trends, and behaviors through the examination of relevant documents and records. This method can provide context and understanding of current trends and inform future predictions, offering a unique perspective that enriches secondary research.
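To give a feel for the mechanics of content analysis, here is a minimal word-frequency sketch in Python. The documents and target terms are hypothetical; real content analysis would typically involve a coding frame, multiple coders, or dedicated text-analytics tooling.

```python
# Count how often a set of target terms appears across a small body of text.
import re
from collections import Counter

documents = [
    "Delivery was fast but the packaging was damaged.",
    "Great price and fast delivery, will order again.",
    "Customer service was slow to respond about my damaged item.",
]
target_terms = ["fast", "slow", "damaged", "price"]

tokens = Counter()
for doc in documents:
    tokens.update(re.findall(r"[a-z']+", doc.lower()))

frequencies = {term: tokens[term] for term in target_terms}
print(frequencies)  # {'fast': 2, 'slow': 1, 'damaged': 2, 'price': 1}
```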

Each of these methods, whether primary or secondary, plays a crucial role in the mosaic of market research, offering distinct pathways to uncovering the insights necessary to drive informed decisions and strategies.

Primary vs Secondary Sources in Research

Both primary and secondary sources form the backbone of the insight generation process; when utilized in tandem, they provide the perfect stepping stone for the generation of real insights. Let's explore how each category serves its unique purpose in the research ecosystem.

Primary Research Data Sources

Primary research data sources are the lifeblood of firsthand research, providing raw, unfiltered insights directly from the source. These include:

  • Customer Satisfaction Survey Results: Direct feedback from customers about their satisfaction with a product or service. This data is invaluable for identifying strengths to build on and areas for improvement and typically renews each month or quarter so that metrics can be tracked over time.
  • NPS Rating Scores from Customers: Net Promoter Score (NPS) provides a straightforward metric to gauge customer loyalty and satisfaction. This quantitative data can reveal much about customer sentiment and the likelihood of referrals (a short calculation sketch follows this list).
  • Ad-hoc Surveys: Ad-hoc surveys can be about any topic that requires investigation; they are typically one-off surveys that zero in on one particular business objective. Ad-hoc projects are useful for situations such as investigating issues identified in other tracking surveys, new product development, ad testing, brand messaging, and many other kinds of projects.
  • A Field Researcher’s Notes: Detailed observations from fieldwork can offer nuanced insights into user behaviors, interactions, and environmental factors that influence those interactions. These notes are a goldmine for understanding the context and complexities of user experiences.
  • Recordings Made During Focus Groups: Audio or video recordings of focus group discussions capture the dynamics of conversation, including reactions, emotions, and the interplay of ideas. Analyzing these recordings can uncover nuanced consumer attitudes and perceptions that might not be evident in survey data alone.
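As a short illustration of how the NPS figure mentioned above is typically derived from 0-10 "likelihood to recommend" ratings (the ratings here are hypothetical):

```python
# NPS = % promoters (9-10) minus % detractors (0-6); passives (7-8) only affect the base.
ratings = [10, 9, 9, 8, 7, 10, 6, 3, 9, 10, 8, 5]

promoters = sum(1 for r in ratings if r >= 9)
detractors = sum(1 for r in ratings if r <= 6)
nps = (promoters - detractors) / len(ratings) * 100

print(round(nps, 1))  # 25.0 for this sample
```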

These primary data sources are characterized by their immediacy and specificity, offering a direct line to the subject of study. They enable researchers to gather data that is specifically tailored to their research objectives, providing a solid foundation for insightful analysis and strategic decision-making.

Secondary Research Data Sources

In contrast, secondary research data sources offer a broader perspective, compiling and synthesizing information from various origins. These sources include:

  • Books, Magazines, Scholarly Journals: Published works provide comprehensive overviews, detailed analyses, and theoretical frameworks that can inform research topics, offering depth and context that enriches primary data.
  • Market Research Reports: These reports aggregate data and analyses on industry trends, consumer behavior, and market dynamics, providing a macro-level view that can guide primary research directions and validate findings.
  • Government Reports: Official statistics and reports from government agencies offer authoritative data on a wide range of topics, from economic indicators to demographic trends, providing a reliable basis for secondary analysis.
  • White Papers, Private Company Data: White papers and reports from businesses and consultancies offer insights into industry-specific research, best practices, and market analyses. These sources can be invaluable for understanding the competitive landscape and identifying emerging trends.

Secondary data sources serve as a compass, guiding researchers through the vast landscape of information to identify relevant trends, benchmark against existing data, and build upon the foundation of existing knowledge. They can significantly expedite the research process by leveraging the collective wisdom and research efforts of others.

By adeptly navigating both primary and secondary sources, researchers can construct a well-rounded research project that combines the depth of firsthand data with the breadth of existing knowledge. This holistic approach ensures a comprehensive understanding of the research topic, fostering informed decisions and strategic insights.

Examples of Primary and Secondary Research in Marketing

In the realm of marketing, both primary and secondary research methods play critical roles in understanding market dynamics, consumer behavior, and competitive landscapes. By comparing examples across both methodologies, we can appreciate their unique contributions to strategic decision-making.

Example 1: New Product Development

Primary Research: Direct Consumer Feedback through Surveys and Focus Groups

  • Objective: To gauge consumer interest in a new product concept and identify preferred features.
  • Process: Surveys distributed to a target demographic to collect quantitative data on consumer preferences, and focus groups conducted to dive deeper into consumer attitudes and desires.
  • Insights: Direct insights into consumer needs, preferences for specific features, and willingness to pay. These insights help in refining product design and developing a targeted marketing strategy.

Secondary Research: Market Analysis Reports

  • Objective: To understand the existing market landscape, including competitor products and market trends.
  • Process: Analyzing published market analysis reports and industry studies to gather data on market size, growth trends, and competitive offerings.
  • Insights: Provides a broader understanding of the market, helping to position the new product strategically against competitors and align it with current trends.

Example 2: Brand Positioning

Primary Research: Brand Perception Analysis through Surveys

  • Objective: To understand how the brand is perceived by consumers and identify potential areas for repositioning.
  • Process: Conducting surveys that ask consumers to describe the brand in their own words, rate it against various attributes, and compare it to competitors.
  • Insights: Direct feedback on brand strengths and weaknesses from the consumer's perspective, offering actionable data for adjusting brand messaging and positioning.

Secondary Research: Social Media Sentiment Analysis

  • Objective: To analyze public sentiment towards the brand and its competitors.
  • Process: Utilizing software tools to analyze mentions, hashtags, and discussions related to the brand and its competitors across social media platforms.
  • Insights: Offers an overview of public perception and emerging trends in consumer sentiment, which can validate findings from primary research or highlight areas needing further investigation.

Example 3: Market Expansion Strategy

Primary Research: Consumer Demand Studies in New Markets

  • Objective: To assess demand and consumer preferences in a new geographic market.
  • Process: Conducting surveys and interviews with potential consumers in the target market to understand their needs, preferences, and cultural nuances.
  • Insights: Provides specific insights into the new market’s consumer behavior, preferences, and potential barriers to entry, guiding market entry strategies.

Secondary Research: Economic and Demographic Analysis

  • Objective: To evaluate the economic viability and demographic appeal of the new market.
  • Process: Reviewing existing economic reports, demographic data, and industry trends relevant to the target market.
  • Insights: Offers a macro view of the market's potential, including economic conditions, demographic trends, and consumer spending patterns, which can complement insights gained from primary research.

By leveraging both primary and secondary research, marketers can form a comprehensive understanding of their market, consumers, and competitors, facilitating informed decision-making and strategic planning. Each method brings its strengths to the table, with primary research offering direct consumer insights and secondary research providing a broader context within which to interpret those insights.

What Are the Pros and Cons of Primary and Secondary Research?

When it comes to market research, both primary and secondary research offer unique advantages and face certain limitations. Understanding these can help researchers and businesses make informed decisions about which approach to use for their specific needs. The considerations below summarize how to weigh the pros and cons of each research type.

Navigating the Pros and Cons

  • Balance Your Research Needs: Consider starting with secondary research to gain a broad understanding of the subject matter, then delve into primary research for specific, targeted insights that are tailored to your precise needs.
  • Resource Allocation: Evaluate your budget, time, and resource availability. Primary research can offer more specific and actionable data but requires more resources. Secondary research is more accessible but may lack the specificity or recency you need.
  • Quality and Relevance: Assess the quality and relevance of available secondary sources before deciding if primary research is necessary. Sometimes, the existing data might suffice, especially for preliminary market understanding or trend analysis.
  • Combining Both for Comprehensive Insights: Often, the most effective research strategy involves a combination of both primary and secondary research. This approach allows for a more comprehensive understanding of the market, leveraging the broad perspective provided by secondary sources and the depth and specificity of primary data.


Conducting secondary analysis of qualitative data: Should we, can we, and how?

Nicole Ruggiano

School of Social Work, University of Alabama, USA

Tam E Perry

School of Social Work, Wayne State University, USA

While secondary data analysis of quantitative data has become commonplace and encouraged across disciplines, the practice of secondary data analysis with qualitative data has met more criticism and concerns regarding potential methodological and ethical problems. Though commentary about qualitative secondary data analysis has increased, little is known about the current state of qualitative secondary data analysis or how researchers are conducting secondary data analysis with qualitative data. This critical interpretive synthesis examined research articles (n = 71) published between 2006 and 2016 that involved qualitative secondary data analysis and assessed the context, purpose, and methodologies that were reported. Implications of findings are discussed, with particular focus on recommended guidelines and best practices of conducting qualitative secondary data analysis.

There has been increasing commentary in the literature regarding secondary data analysis (SDA) with qualitative data. Many critics assert that there are potential methodological and ethical problems regarding such practice, especially when qualitative data is shared and SDA is conducted by researchers not involved with data collection. However, less has been written on how sharing and SDA of qualitative data is actually conducted by scholars. To better understand this practice with qualitative research, this critical interpretive synthesis (CIS) appraised studies that have involved SDA with qualitative data, examining their context, analytical techniques, and methods applied to promote rigor and ethical conduct of research. Following this analysis, the strengths and weaknesses of such practice and strategies for promoting the advancement of science will be discussed in light of findings.

SDA involves investigations where data collected for a previous study is analyzed – either by the same researcher(s) or different researcher(s) – to explore new questions or use different analysis strategies that were not a part of the primary analysis ( Szabo and Strang, 1997 ). For research involving quantitative data, SDA, and the process of sharing data for the purpose of SDA, has become commonplace. Though not without its limitations, Hinds et al. (1997) argue that it is a “respected, common, and cost-effective approach to maximizing the usefulness of collected data” (p. 408). They describe four approaches to SDA: (1) research where SDA focuses on a different unit of analysis from that of the parent study; (2) research involving a more in-depth analysis of themes from the parent study with a subset of data from that study; (3) analyses of data from the parent study that appear important, but not sufficiently focused on in the primary analysis; and (4) analyses with a dataset that includes data from a parent study and newly-collected data that refines the parent study’s purpose or research questions ( Hinds et al., 1997 ).

Scholars have also promoted the practice of sharing data for the purpose of SDA, asserting that it may answer new research questions, as well as increase sample sizes and statistical power ( Perrino et al., 2013 ). Sharing data also allows for the generation of new knowledge without the costs of administration and implementation of additional data collection and maximizes the output of large-scale studies that are funded by public or private sources. Recognizing the value of sharing data, researchers and institutions have created an infrastructure to promote such practice by: making datasets more available through the process of archiving; making archived data available through a number of media, such as the internet, CD-ROMS, and other removable storage devices; and documenting and providing detailed information about the sampling, design, and data collection strategies from such parent studies so that researchers can better understand the qualities of the data they obtain for future use ( Hox and Boeije, 2005 ; Perrino et al., 2013 ).

Concerns about secondary data analysis when using qualitative data

The primary concerns about SDA with qualitative data surround rigor and ethics from a number of stakeholder perspectives, including research participants, funders, and the researchers themselves. Heaton (2004) suggests that a strength of secondary analysis of qualitative data is that it relieves the burden of participation from research participants and from the community partners who collaborate with researchers to identify, access, and recruit participants. However, we must also consider how SDA fits within guidelines for duplicate publishing of qualitative research (Morse, 2007) in an era in which quantity-driven publishing is one mark of scholarliness.

Debates regarding rigor in qualitative SDA.

Despite the demonstrated benefits of its practice in quantitative studies, sharing qualitative data for SDA has not been as widely promoted and has even received considerable criticism in the literature. One criticism relates to the socio-cultural-political context under which qualitative studies are implemented. As highlighted by Walters (2009), qualitative research involves the collection and interpretation of subjective data that are often shaped by the social, cultural, and political realities evident at the time of data collection. When such data are re-analyzed or reinterpreted during another time period, changes in social, cultural, and/or political norms may lead investigators to explore research questions or use analysis strategies that are inappropriate, or to misinterpret the original data. Mauthner et al. (1998) assert that the process of re-analyzing data can be different even for researchers who are revisiting their own data collected at an earlier time. However, they also report that some researchers may find benefits in this process. For instance, some researchers may find themselves less emotionally invested in the data and therefore more objective, though others may find that this emotional distance results in less immersion in the data. Thorne (1994) has provided a number of approaches to increasing rigor in SDA, such as audit trails and critical and reflective constant comparison. However, it is unclear to what extent such practices actually overcome the challenges that compromise qualitative SDA, such as inappropriate coding and interpretation of data and/or SDA researchers' lack of first-hand knowledge of the data (Thorne, 1994).

Debates regarding ethics in qualitative secondary data analysis.

In addition to questions of methodological rigor, there are criticisms regarding ethical dilemmas posed by SDA of qualitative data. Many criticisms center on basic questions of research ethics – the risks to informed consent, confidentiality, and anonymity when such data are archived and/or shared ( Morrow et al., 2014 ). For instance, Parry and Mauthner (2004) argue that the in-depth nature of qualitative data may pose particular challenges to de-identifying data for the purpose of archiving it for shared use. The descriptiveness of the data alone may allow others to identify respondents, while removing such descriptors may compromise the quality of the data.

There are also arguments that qualitative data are not created by researchers alone – they represent the "joint endeavor between respondent and researcher" – and that allowing other researchers to re-use data therefore poses significant ethical and legal dilemmas by disregarding the respondent's ownership of the data (Parry and Mauthner, 2004: 142). Parry and Mauthner (2004) write that the collaborative effort of creating qualitative data also poses ethical dilemmas for qualitative researchers, who often offer personal information to respondents in an attempt to develop rapport. They therefore risk breaches of anonymity/confidentiality when such data are shared for future use.

To date, there has been increasing dialogue and controversy surrounding the practice of SDA with qualitative data. However, few studies have examined how qualitative SDA is being conducted, or offered guidelines for conducting such investigations rigorously and ethically. To address this issue, a CIS of studies identified as using qualitative SDA as a methodology was undertaken to address the following questions:

  • What is the extent and context under which SDA is conducted with qualitative data?
  • What are common approaches and purposes for conducting SDA with qualitative data?
  • In what ways do researchers maintain rigor and ethics in qualitative SDA? and
  • What limitations in qualitative SDA have been identified in practice?

Methodology

Although systematic reviews are commonly used to synthesize quantitative studies on a specific topic, Dixon-Woods et al. (2006) argue that the nature of systematic reviews and their focus on examining studies that emphasize testing theories is inappropriate when different types of evidence are being synthesized and/or there is a need for interpretation of studies. This review involved a CIS of literature that was identified through multiple search strategies. CIS differs from quantitative systematic reviews in several ways: (1) it uses broad review questions to guide the identification and analysis of studies, rather than specific hypotheses; (2) it relies on sources other than bibliographic databases to identify studies for inclusion; (3) it does not use a preconceived hierarchy of methods to guide study inclusion (e.g. only including randomized control trials, due to their perceived higher level of rigor); and (4) it uses ongoing inductive and interpretive strategies in the identification and analysis of studies, which may result in ongoing revision to the guiding review questions or revisiting search criteria and/or strategies ( Dixon-Woods et al., 2006 ). CIS differs from meta-ethnography in that the latter involves a more interpretive way of linking ethnographic findings from multiple studies, often on a specific topic ( Flemming, 2010 ). By contrast, the current analysis involves the interpretation and comparison of context and methodologies of studies focused on a wide variety of topics.

Eligibility criteria

This CIS identified and assessed research published in peer-reviewed, scholarly journals between the years 1996 and 2016. They also had to meet the following inclusion criteria: (a) involving analysis of data derived through qualitative methodologies; (b) research involving social or health-related research with human subjects; (c) use of SDA or repurposing of parent study data for subsequent analysis; and (d) research published in English. For the purpose of time sensitivity, unpublished dissertations were excluded from the final review. Given prior assertions that not all qualitative studies using SDA are identified as being such ( Hinds et al., 1997 ), the researchers cast a wide net and did not impose any additional exclusion criteria based on the perceived quality or approach to methodology, analysis, or focus area ( Dixon-Woods et al., 2006 ; Walsh and Downe, 2005 ).

Sources and process of search

Studies were identified between May and June of 2016 (see Figure 1) by searching the following eight databases: Expanded Academic ASAP, EBSCO Host, PsycINFO, PubMed, Social Services Abstracts, Social Work Abstracts, Sociological Abstracts, and Web of Science. For each database, a search was conducted using combinations of the following search terms: qualitative research OR qualitative analysis OR qualitative study AND secondary data analysis OR secondary analysis OR combining data* OR sharing data* OR integrating data* OR two studies OR two field studies. The titles and/or abstracts of the more than 10,373 results yielded by the initial search were reviewed, and 76 unduplicated studies were selected for full-text review (a small sketch of assembling this query and de-duplicating results appears after Figure 1). A second search took place in September of 2016, in which peer-reviewed journals dedicated to qualitative research and holding impact factors (International Journal of Qualitative Methods, Qualitative Health Research, Qualitative Inquiry, Qualitative Research, Qualitative Social Work, and Qualitative Sociology) were searched. This subsequent search yielded 49 additional articles selected for full-text review. Among the 125 articles that were fully reviewed, 54 did not meet the inclusion criteria and were excluded from the final analysis.

Figure 1. Search strategy and results for systematic review.
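
For illustration, the Boolean query above can be assembled programmatically and the combined results de-duplicated before screening. This is only a rough sketch of that housekeeping step, with made-up record titles and a simplified title-matching rule; it is not the authors' actual procedure.

```python
# Sketch: building the Boolean search string reported above and de-duplicating
# results pulled from several databases. Records and fields are illustrative only.
qual_terms = ["qualitative research", "qualitative analysis", "qualitative study"]
sda_terms = ["secondary data analysis", "secondary analysis", "combining data*",
             "sharing data*", "integrating data*", "two studies", "two field studies"]

query = "({}) AND ({})".format(
    " OR ".join(f'"{t}"' for t in qual_terms),
    " OR ".join(f'"{t}"' for t in sda_terms),
)
print(query)

# De-duplicate records returned by different databases on a normalized title
records = [
    {"title": "Reusing Interview Data: A Hypothetical Example", "db": "PubMed"},
    {"title": "Reusing interview data: a hypothetical example", "db": "Web of Science"},
    {"title": "Another Hypothetical Qualitative SDA Study", "db": "PsycINFO"},
]
seen, unduplicated = set(), []
for rec in records:
    key = rec["title"].strip().lower()
    if key not in seen:
        seen.add(key)
        unduplicated.append(rec)
print(len(unduplicated), "unduplicated records")  # -> 2 unduplicated records
```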

Appraisal of studies

The approach for appraising the included studies was derived from a number of recommendations in the literature (Barnett-Page and Thomas, 2009; Schoenberg and McAuley, 2007; Walsh and Downe, 2005). Given that the current CIS focuses on an analysis of context and methodologies, rather than the findings of qualitative research on a specific topic, the appraisal of primary studies focused on the inclusion, description, and comparison/contrast of methods across the following categories:

  • Relationship of researchers with parent study: Here, the extent to which researchers conducting the SDA were involved with the parent study or studies was assessed. The relationships were identified by: authors self-citing the parent study, authors describing their contribution to the parent study, and authors describing their use of other researchers’ data or archived data.
  • Context of secondary analysis: For this category, articles were assessed by the context under which SDA took place. For instance, whether the data from parent study were analyzed post hoc, whether entire datasets or subsets were analyzed in the SDA, whether data from multiple studies were combined, or whether new research questions or analytical approaches were explicitly used. It was also assessed whether the secondary analysis aimed at advancing theory regarding a certain topic or methodology.
  • Details about parent study: To understand the context under which the data were initially collected in the parent studies, articles were assessed for whether they included details about the parent studies, such as their: context and methodologies, IRB approval, funding sources, and process of sharing data (when applicable).
  • Ethical considerations in secondary analysis: Articles were assessed for whether ethical considerations were described that were specific to secondary analysis. For instance, whether researchers made additional steps in the SDA to protect human subjects who participated in the parent study or descriptions of obtaining IRB approval for SDA.
  • Methodological rigor in secondary analysis: Articles were assessed for whether the researchers described aspects specific to the secondary analysis that were used to increase rigor, including descriptions of the SDA process or specific strategies to improve rigor.
  • Methodological challenges in secondary analysis: Articles were assessed for whether the researchers identified aspects of SDA that created challenges or limitations for their findings.

Both authors assessed each article independently and created a thematic chart based on these assessment criteria. Discrepancies in this assessment were resolved through discussion until agreement was reached. The authors acknowledge that the assessment is based on the published text, and thus, may not reflect further details outlined in other articles on the research or details not published. For instance, in cases where researchers did not identify obtaining IRB approval specifically for the SDA, that does not necessarily mean that the authors did not obtain IRB approval.

Seventy-one studies were included in this analysis. A table listing the studies and their appraisal using the criteria above can be accessed as an online supplementary appendix file. Most of the studies (n = 51, 71.8%) that met the inclusion criteria involved research focused on physical and mental health research, with fewer studies focused on social or economic issues.

Authors of these qualitative studies used a variety of terms to describe their efforts to "repurpose parent study data for subsequent analysis," including secondary data analysis, post hoc analysis, re-analysis, and supplemental analysis. Hence, the term qualitative secondary data analysis is not used consistently in the qualitative research literature. Through the appraisal of these studies, three central themes emerged that shed light on the current state of qualitative SDA and relate to current controversies about such practices within the literature: (1) the relationship of the SDA study to the parent study or studies; (2) ethical considerations and human subject protections in qualitative SDA; and (3) attention given to methods and rigor in writing about primary and secondary studies. These themes, along with their sub-themes, are described in detail below. Please note, when interpreting these thematic findings, that the articles were assessed based on what information they included or did not include in the reporting of their studies, and that the findings should not be used to assess actual rigor or quality in the methodologies of individual studies.

Relationship of the SDA study to the parent study or studies

In most cases (n = 60, 84.5%), qualitative SDA among the included studies involved researchers re-examining qualitative data from parent studies that they were involved with to explore new research questions or analytic strategies. Therefore, most were familiar with the methodologies and data of the parent studies and were able to write about the parent studies and quality of data in significant detail. Variation in relationships between parent and secondary studies was generally based on the following characteristics:

Involvement of researchers across studies.

In the majority of cases (n = 60, 84.5%), it was clear when the researchers conducting the SDA were involved in the parent study, as indicated by researchers self-citing their previous work on the parent study or directly referring to their participation in the parent study (e.g. We conducted in-depth interviews ...). However, it was not always clear when new investigators were included on the research team for SDA and therefore, the exact number of SDA researchers who were also involved with the parent study was not always easily determined. In some cases, the relationship could be assumed (but was not assumed for the current analysis), such as those where the SDA researchers did not explicitly indicate their involvement with the parent study, but described that the study was conducted at their institution and/or their IRB approved the study for research with human subjects (see Bergstrom et al., 2009 ). There were other cases where researchers shared their data with one another and combined data from independent parent studies for the purpose of SDA and indicated that they were involved with one or more of the studies that data were derived from, but not all of the studies (see Sallee and Harris, 2011 ; Taylor and Brown, 2011 ). Hence, the in-depth knowledge of parent study methodologies and data by each researcher was limited.

There were a smaller number of cases (n = 8, 11.3%) where researchers reported that they conducted an SDA with qualitative data derived from a qualitative data archive and the author(s) did not indicate having an affiliation with the archive team (see Kelly et al., 2013; Wilbanks et al., 2016). In these cases, it was common for SDA researchers to describe the methods used to collect the data for the archive or, at a minimum, to describe the purpose and source of the data archive. Very few studies included in this analysis involved researchers conducting SDA using data that they were not involved with at all and/or that were not obtained through an archive. The most common case for this (n = 3, 4.3%) involved researchers who conducted analyses with data collected through program or government evaluations (see Hohl and Gaskell, 2008; Romero et al., 2012; Wint and Frank, 2006). One notable case involved an SDA using data collected by unrelated independent researchers to reanalyze classic sociological research (Fielding and Fielding, 2000).

Context and purpose of SDA.

In almost all cases of research included in this analysis (n = 68, 95.7%), the SDA researchers provided the context and methodologies of the parent studies, though these descriptions varied in detail. Most were explicit in whether the data used in the SDA involved an entire dataset, a subset of data, or combination of data from the parent study or studies. The most common reason (n = 57, 80.2%) to conduct SDA was to explore new research questions post hoc that would advance theory in a particular area. In a smaller number of cases (n = 18, 25.4%), SDA was conducted post hoc to advance methodology. For instance, Myers and Lampropoulou (2016) conducted an SDA with data from several studies to examine the practice of identifying laughter in transcriptions of audio data. In other cases, SDA was conducted to demonstrate novel analytic approaches (see Henderson et al., 2012 ; Patel et al., 2015) or approaches to research (see Morse and Pooler, 2002 ; Schwartz et al., 2010 ).

Clarity in distinguishing between primary and secondary analyses.

While the distinction was clear in most studies, there was a lack of consistency in the identification and description of SDA among the articles assessed. Some studies did not identify themselves as SDA, but described methods and purposes that diverged from those of the parent studies and/or indicated that the analysis of the data for SDA was completed after the primary analysis in the parent study. In other cases, the researchers identified the research as SDA, but it was not clear whether the purpose or aims of the SDA diverged from the initial analysis or occurred subsequent to the parent study. For instance, Cortes et al. (2016) indicated that their study was considered SDA because the theme that emerged was not sufficiently explored before the IRB protocol period ended, and therefore the findings being presented actually emerged during the primary analysis. In Coltart and Henwood's (2012) study, the authors reported that they "routinely crossed conventional boundaries between primary and secondary analysis" (p. 39).

Ethical considerations and human subject protections in qualitative SDA

The articles assessed in this analysis also varied in the extent to which they discussed ethical considerations and protections of human subjects. The following is an analysis of the extent to which ethical issues were identified and/or addressed in the parent and/or SDA research presented in the articles.

Attention given to ethical safeguards in writing about primary and secondary studies.

For the majority of studies assessed, it was most common for researchers to provide information regarding IRB approval and/or ethical considerations given in the parent study methodology (n = 26, 36.6%) with fewer cases indicating that IRB approval or exemption was specifically obtained or ethical considerations were made in their effort to conduct SDA. Most articles indicated that IRB approval was obtained for the parent study with no mention about IRB review of the SDA (n = 19, 26.8%). In 17 cases (23.9%), the researchers indicated that IRB approval was obtained for the SDA study alone or for both the parent study and SDA. In one of these cases, a researcher using archived data reported that IRB approval was sought out, but not required for the scope of their study ( Heaton, 2015 ).

Examples of ethical procedures in secondary analysis.

Some researchers described steps for protecting human subjects that extended to the SDA, such as de-identifying data before SDA was conducted. Very few studies (n = 5, 7.0%) specifically indicated that participants in the parent studies consented to having their data available for SDA. Some researchers identified ethical considerations that are intrinsic to the nature of SDA, such as their efforts to conduct SDA in order not to overburden vulnerable populations that were participating in research (see Turcotte et al., 2015). It was also less common for researchers to report ethical dilemmas or concerns in conducting SDA, as in Coltart and Henwood's (2012) research with longitudinal qualitative data, where the researchers presented concerns about anonymity and ethics regarding archived data.

Attention given to methods and rigor in writing about primary and secondary studies

Finally, articles varied in the extent to which they described issues of rigor and limitations stemming specifically from the SDA. There was variation on the attention researchers gave to describing methods and rigor in the parent and SDA studies, their approaches to increasing rigor in SDA, and the limitations they identified that were specific from conducting an SDA.

Attention and focus of parent and secondary studies.

For most of the articles appraised (n = 60, 84.5%), researchers provided detail on the methodologies used to collect and analyze data in the parent study. The level of detail in these descriptions varied significantly, ranging from a few sentences on the overall approach to data collection in the parent study with little to no detail on the primary analysis, to extensive sections of research articles dedicated to the methods of the parent studies. Some researchers also reported the funding sources of the parent studies (n = 28, 39.4%), which may further help readers assess bias in the SDA. Many studies also described the process of SDA as being distinctly different from primary analysis, though in some articles it was difficult to assess how the SDA differed from the primary data analysis.

Examples of rigor in secondary analysis.

Some studies presented strategies used by researchers to increase rigor in the SDA study. Many studies (n = 25, 35.2%) reported common practices in qualitative data analysis to increase rigor, such as member checking, memoing, triangulation, peer debriefing, inter-rater agreement, and maintaining audit trails. In some articles, researchers indicated inclusion of members of the parent study research team or new researchers with expertise in the area of focus for the SDA with the intent of increasing rigor. Other articles asserted that the research questions explored through SDA were “a good fit” with those of the parent study, and therefore increased the trustworthiness of findings. Only a few studies reported that steps were taken in SDA to analyze data with a lens that was not influenced by the researchers’ involvement with the parent study, such as using clean, uncoded transcripts from parent study (see Williams and Collins, 2002 ) or purposefully reading transcripts with new perspective (see Moran and Russo-Netzer, 2016 ). Some articles reported that a strength in the SDA was that the researchers involved were very familiar with the parent study methodology and data. In one case ( Volume and Farris, 2000 ), the researchers indicated that one source of rigor was that emerging findings during analysis could not influence future interviews, since the data were already all collected, which may minimize bias.
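
One of the practices listed above, inter-rater agreement, has a standard quantitative summary in Cohen's kappa. The sketch below computes it for two coders' labels; the coding decisions are made up for illustration and are not drawn from any of the appraised studies.

```python
# Minimal Cohen's kappa calculation for two coders' labels (illustrative data).
from collections import Counter

coder_a = ["theme1", "theme2", "theme1", "theme1", "theme2", "theme1"]
coder_b = ["theme1", "theme2", "theme2", "theme1", "theme2", "theme1"]

n = len(coder_a)
# Observed agreement: proportion of items both coders labeled identically
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

# Expected agreement by chance, from each coder's label proportions
pa, pb = Counter(coder_a), Counter(coder_b)
expected = sum((pa[label] / n) * (pb[label] / n) for label in set(coder_a) | set(coder_b))

kappa = (observed - expected) / (1 - expected)
print(f"observed={observed:.2f}, expected={expected:.2f}, kappa={kappa:.2f}")
# -> observed=0.83, expected=0.50, kappa=0.67
```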

Identification of limitations in secondary analysis.

Most articles reported limitations in their studies that are often reported in qualitative research (e.g. small samples, not generalizable), though most of these descriptions did not relate specifically to SDA. About half (n = 36, 50.7%) of articles identified limitations in their study that resulted from the nature of their SDA, such as: not being able to return to participants for member checking or conduct further interviews to clarify or validate thematic findings in the SDA; conducting research with one purpose using data that were collected for another purpose, which limited the number of cases or extent to which a thematic finding could be identified; and conducting qualitative research with data that may not be as relevant as when it was first collected, given changes in context and/or time that may have influenced the data if collected in present day.

In response to growing dialogue and criticisms about conducting SDA with qualitative data, this CIS set out to better understand the context of qualitative SDA in practice, with particular attention given to issues of methodological rigor and ethical principles. Overall, 71 articles met the inclusion criteria and were appraised, a number that is expectedly dwarfed by the number of quantitative studies that are identified as using SDA. However, thematic findings in this assessment address controversies in the literature and also raise issues in conducting SDA with qualitative data that can be used to guide future research and assessment of qualitative SDA studies.

The need for better and consistent definitions of qualitative SDA

Revisiting Hinds et al.'s (1997) approaches to qualitative SDA described earlier, most qualitative SDA studies identified and appraised through this CIS best reflect the approaches of conducting a more in-depth analysis of themes from the parent study with a subset of data from that study and conducting an analysis of data from the parent study that appear important, but not sufficiently focused on in the primary analysis, though all four approaches they identified were observed among studies. However, the main concern that arose from this CIS was that researchers often failed to describe the differences between primary and secondary analysis (or at least the relationship between the two analyses). Many described SDA strategies that were similar in scope and appeared to have been conducted close in time to the primary analysis. As a result, it was not always clear-cut whether findings were more related to the primary analysis than to an actual secondary analysis.

There were also cases where researchers described conducting qualitative SDA, but did not label it as such. As a result, one of the primary limitations of this CIS is that the extent to which qualitative SDA studies were excluded from search results, and therefore not included in this synthesis, is unclear. Scholars can address this issue by explicitly referring to qualitative SDA as such and describing the study methods in a way that makes clear how the SDA differed from the primary analysis in scope, context, and/or methodology. Otherwise, given the fluid and/or emerging nature of many qualitative analyses and the fact that many researchers conduct qualitative SDA with their own data, there are limits on the extent to which audiences can fully appraise such research.

Maintaining ethical standards in qualitative SDA

It is generally accepted that almost all research involving human subjects, including research involving SDA, should be reviewed by an IRB, which determines whether the study is exempt from further review or can be approved based on its treatment of human subjects. However, the majority of articles included in this analysis reported that IRB approval was obtained for the parent study with no mention of whether review was sought for the SDA or whether the SDA was included under the same protocol. In the case of quantitative SDA, this issue may be more clearly explained in research reporting, since data are often shared among researchers who were not involved with the parent study and therefore SDA researchers cannot claim to be covered under the protocol approval for the parent study. As was found through this CIS, many qualitative SDA researchers are conducting analysis with their own data and may feel that the SDA is covered under the original protocol approval. However, it is unclear whether this is always appropriate, given that many SDA investigations involve new research questions, units of analysis, or foci beyond what the participants of the parent study consented to.

In addition, specific safeguards aimed at protecting human subjects should always be taken in qualitative SDA and described in the research reporting. For researchers who are interested in conducting studies that may be open to SDA in the future, this may mean taking specific steps that would make additional IRB review unnecessary (when the same researchers are conducting further analysis) or eligible for exemption. For instance, qualitative researchers should have participants consent to SDA of their data during the recruitment process, or explain to participants during the consent process that researchers may report findings from their data that are derived unexpectedly and therefore cannot feasibly be explained in the purpose and goals of the study through the initial consent form. They can also design interview and focus group guides that can more easily be de-identified for researchers to use later, and think critically about whether additional safeguards should be in place to protect the participants in primary studies. Researchers should report these procedures so that their audience can adequately assess the ethical considerations taken in their research.
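
As a concrete example of the kind of safeguard described above, direct identifiers can be scrubbed from transcripts before data are archived or re-analyzed. The pattern-based pass below is only a minimal sketch with a hypothetical name list and example patterns; thorough de-identification of qualitative data generally also requires careful human review.

```python
# Sketch: simple pattern-based de-identification of a transcript excerpt.
# The name list and patterns are illustrative; real de-identification
# needs careful human review as well.
import re

KNOWN_NAMES = ["Maria Lopez", "Dr. Singh"]   # hypothetical participants/clinicians

def deidentify(text: str) -> str:
    for name in KNOWN_NAMES:
        text = text.replace(name, "[NAME]")
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)  # phone numbers
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)        # email addresses
    return text

excerpt = "Maria Lopez said she called Dr. Singh at 555-123-4567 or maria@example.org."
print(deidentify(excerpt))
# -> "[NAME] said she called [NAME] at [PHONE] or [EMAIL]."
```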

Ways to move forward

Promoting qualitative data sharing.

While much of the literature on the topic has criticized the use of qualitative data for SDA, some scholars have recognized its potential benefit to the state of science and have offered suggestions to promote this practice. Drawing upon the literature, Dargentas (2006) identified several ways of advancing the practice of SDA of qualitative data, including increasing access to archived qualitative data, training researchers in using computer-assisted qualitative analysis software, and addressing issues related to qualitative methodologies (p. 3). Such efforts have been initiated, but have been slower to develop than those for quantitative data. Examples include the UK Data Service, the Timescapes Archive (University of Leeds), and the Oxford Health Experiences Research Group (University of Oxford).

Arguments have also been made that qualitative researchers can deploy strategies to collect data that is suitable and appropriate for SDA by other investigators. Walters (2009) asserts that through effective use of reflexivity, qualitative researchers can collect data that identifies and documents the socio-cultural-political context under which the data are collected so the dataset is relevant and important for future use by other researchers. However, Parry and Mauthner (2004) caution that researchers who develop plans at the beginning of their projects to collect qualitative data that may be shared in the future may run the risk of restraining themselves, through the questions that they ask, data collection strategies, or even their own contributions to creating the data (e.g. offering personal information to respondents to develop rapport) in a way that they would not if they were creating the data for solely their own use. This could compromise the quality of the data.

Recommendations

After our review of the literature, we offer three sets of recommendations to give SDA common anchors in qualitative research, designed to stress its strengths and reveal its limitations.

1. Increasing clarity and transparency in SDA.

We recommend a clearer and more consistent definition of qualitative SDA, in which some or all of the following information is included in manuscripts: (1a) describing if and how the SDA researchers were involved with the parent study or studies; and (1b) distinguishing between the primary and secondary analyses so that readers can determine whether findings reflect the emerging nature of qualitative research or a new approach or purpose for re-analysis. Such descriptions will help readers evaluate the researchers' familiarity with the parent study methods, sample, data, and context. They will also help readers evaluate whether findings were the result of the emerging process of qualitative analysis, as opposed to SDA, which ideally would be a new analysis with a different purpose or approach from the parent study, even if the researchers remain the same across studies. A number of exemplary studies were identified that helped create clear and transparent understandings of the difference between the parent and SDA studies, including Molloy et al. (2015), Myers and Lampropoulou (2016), and Pleschberger et al. (2011).

2. Ethics in conducting qualitative SDA studies.

The ethics of conducting qualitative SDA is one of the most common topics in the literature about this practice. Hence, it was surprising that many studies in this CIS did not discuss IRB approval or strategies for protecting human subjects in the SDA study. It may be that researchers and peer reviewers assume that IRB approval was given or extended from the parent study's protocol. However, researchers should take responsibility for reporting their efforts to protect human subjects through qualitative SDA. Some specific recommendations include: (2a) being clear about how the researchers obtained approval or exemption for the SDA; and (2b) describing methods used to protect human subjects in the SDA, such as de-identifying data or using consent forms that covered SDA.

3. Increasing rigor and identifying our limitations in qualitative SDA.

Researchers are expected to maximize rigor in their research methodologies and identify limitations in their studies that may influence their audience's interpretation of findings. However, this CIS found that only about half of the articles identified how the nature of SDA may have affected their findings. Some recommendations for increasing rigor and transparency include: (3a) employing and describing strategies for increasing rigor within the SDA, such as including research team members from the parent study, including new team members with specific expertise or fresh perspectives uninfluenced by the primary analysis, conducting SDA with uncoded transcripts, or using other methods (audit trails, peer debriefing, member checking); and (3b) identifying limitations in qualitative SDA, such as how time or context may have changed the relevance of the data and/or the extent to which the goals and purpose of the SDA research were a good fit with those of the parent study. Examples of SDA studies that described rigor include Borg et al. (2013), Chau et al. (2008), and Mayer and Rosenfeld (2006).

Qualitative research often involves long data collection sessions and/or participants who share intimate, sensitive, and detailed information about themselves with researchers to promote the goal of generating new knowledge that may benefit society. SDA of qualitative research is one way to advance this goal while minimizing the burden on research participants. Although SDA of qualitative data may not be appropriate or ethical in all cases, researchers should take responsibility for recognizing when qualitative data are appropriate and safe for SDA and/or for finding creative ways to design new studies that support SDA. In such efforts, researchers should also take responsibility for identifying ways of promoting rigor and ethical research practices in SDA, and should clearly identify and describe these efforts so that the academic community can appropriately appraise such work while also learning from one another to advance methodology.

Acknowledgments

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Contributor Information

Nicole Ruggiano, School of Social Work, University of Alabama, USA.

Tam E Perry, School of Social Work, Wayne State University, USA.

References

  • Barnett-Page E and Thomas J (2009) Methods for the synthesis of qualitative research: A critical review. BMC Medical Research Methodology 9(1): 1.
  • Bergstrom L, Richards L, Proctor A, et al. (2009) Birth talk in second stage labor. Qualitative Health Research 19: 954–964.
  • Borg M, Veseth M, Binder PE, et al. (2013) The role of work in recovery from bipolar disorders. Qualitative Social Work 12: 323–339.
  • Chau L, Hegedus L, Praamsma M, et al. (2008) Women living with a spinal cord injury: Perceptions about their changed bodies. Qualitative Health Research 18: 209–221.
  • Coltart C and Henwood K (2012) On paternal subjectivity: A qualitative longitudinal and psychosocial case analysis of men's classed positions and transitions to first-time fatherhood. Qualitative Research 12: 35–52.
  • Cortés YI, Arcia A, Kearney J, et al. (2017) Urban-dwelling community members' views on biomedical research engagement. Qualitative Health Research 27: 130–137.
  • Dargentas M (2006, September) Secondary analysis and culture of disputation in European qualitative research. Forum: Qualitative Social Research 7(4): 1–19.
  • Dixon-Woods M, Cavers D, Agarwal S, et al. (2006) Conducting a critical interpretive synthesis of the literature on access to healthcare by vulnerable groups. BMC Medical Research Methodology 6(1): 1.
  • Fielding NG and Fielding JL (2000) Resistance and adaptation to criminal identity: Using secondary analysis to evaluate classic studies of crime and deviance. Sociology 34: 671–689.
  • Flemming K (2010) Synthesis of quantitative and qualitative research: An example using critical interpretive synthesis. Journal of Advanced Nursing 66: 201–217.
  • Heaton J (2004) Reworking Qualitative Data. Thousand Oaks, CA: Sage.
  • Heaton J (2015) Use of social comparisons in interviews about young adults' experiences of chronic illness. Qualitative Health Research 25: 336–347.
  • Henderson S, Holland J, McGrellis S, et al. (2012) Storying qualitative longitudinal research: Sequence, voice and motif. Qualitative Research 12: 16–34.
  • Hinds PS, Vogel RJ and Clarke-Steffen L (1997) The possibilities and pitfalls of doing a secondary analysis of a qualitative data set. Qualitative Health Research 7: 408–424.
  • Hohl K and Gaskell G (2008) European public perceptions of food risk: Cross-national and methodological comparisons. Risk Analysis 28: 311–324.
  • Hox JJ and Boeije HR (2005) Data collection, primary vs. secondary. In: Kempf-Leonard K (ed.) Encyclopedia of Social Measurement. Atlanta, GA: Elsevier Science, pp. 593–599.
  • Kelly L, Jenkinson C and Ziebland S (2013) Measuring the effects of online health information for patients: Item generation for an e-health impact questionnaire. Patient Education and Counseling 93: 433–438.
  • Mauthner NS, Parry O and Backett-Milburn K (1998) The data are out there, or are they? Implications for archiving and revisiting qualitative data. Sociology 32: 733–745.
  • Mayer D and Rosenfeld A (2006) Symptom interpretation in women with diabetes and myocardial infarction: A qualitative study. The Diabetes Educator 32: 918–924.
  • Molloy J, Evans M and Coughlin K (2015) Moral distress in the resuscitation of extremely premature infants. Nursing Ethics 22: 52–63.
  • Moran G and Russo-Netzer P (2016) Understanding universal elements in mental health recovery: A cross-examination of peer providers and a non-clinical sample. Qualitative Health Research 26: 273–287.
  • Morrow V, Boddy J and Lamb R (2014) The Ethics of Secondary Data Analysis: Learning from the Experience of Sharing Qualitative Data from Young People and their Families in an International Study of Childhood Poverty. London: Thomas Coram Research Unit and the Institute of Education, University of London. Available at: http://sro.sussex.ac.uk/49123/1/NOVELLA_NCRM_ethics_of_secondary_analysis.pdf.
  • Morse JM (2007) Duplicate publication. Qualitative Health Research 17: 1307–1308.
  • Morse JM and Pooler C (2002) Analysis of videotaped data: Methodological considerations. International Journal of Qualitative Methods 1: 62–67.
  • Myers G and Lampropoulou S (2016) Laughter, non-seriousness and transitions in social research interview transcripts. Qualitative Research 16: 78–94.
  • Parry O and Mauthner NS (2004) Whose data are they anyway? Practical, legal and ethical issues in archiving qualitative research data. Sociology 38: 139–152.
  • Patel K, Auton MF, Carter B, et al. (2016) Parallel-serial memoing: A novel approach to analyzing qualitative data. Qualitative Health Research 26: 1745–1752.
  • Perrino T, Howe G, Sperling A, et al. (2013) Advancing science through collaborative data sharing and synthesis. Perspectives on Psychological Science 8: 433–444.
  • Pleschberger S, Seymour JE, Payne S, et al. (2011) Interviews on end-of-life care with older people: Reflections on six European studies. Qualitative Health Research 21: 1588–1600.
  • Romero SL, Ellis AA and Gurman TA (2012) Disconnect between discourse and behavior regarding concurrent sexual partnerships and condom use: Findings from a qualitative study among youth in Malawi. Global Health Promotion 19: 20–28.
  • Sallee MW and Harris F (2011) Gender performance in qualitative studies of masculinities. Qualitative Research 11: 409–429.
  • Schoenberg NE and McAuley WJ (2007) Promoting qualitative research. The Gerontologist 47: 576–577.
  • Schwartz S, Hoyte J, James T, et al. (2010) Challenges to engaging Black male victims of community violence in healthcare research: Lessons learned from two studies. Psychological Trauma: Theory, Research, Practice, and Policy 2: 54–61.
  • Szabo V and Strang VR (1997) Secondary analysis of qualitative data. Advances in Nursing Science 20: 66–74.
  • Taylor S and Brown CW (2011) The contribution of gifts to the household economy of low-income families. Social Policy and Society 10: 163–175.
  • Thorne S (1994) Secondary analysis in qualitative research: Issues and implications. In: Morse JM (ed.) Critical Issues in Qualitative Research Methods. Thousand Oaks, CA: Sage, pp. 263–279.
  • Turcotte PL, Carrier A, Desrosiers J, et al. (2015) Are health promotion and prevention interventions integrated into occupational therapy practice with older adults having disabilities? Insights from six community health settings in Quebec, Canada. Australian Occupational Therapy Journal 62: 56–67.
  • Volume CI and Farris KB (2000) Hoping to maintain a balance: The concept of hope and the discontinuation of anorexiant medications. Qualitative Health Research 10: 174–187.
  • Walsh D and Downe S (2005) Meta-synthesis method for qualitative research: A literature review. Journal of Advanced Nursing 50: 204–211.
  • Walters P (2009) Qualitative archiving: Engaging with epistemological misgivings. Australian Journal of Social Issues 44: 309–320.
  • Wilbanks BA, Geisz-Everson M and Boust RR (2016) The role of documentation quality in anesthesia-related closed claims: A descriptive qualitative study. Computers, Informatics, Nursing: CIN 34: 406–412.
  • Williams CC and Collins AA (2002) The social construction of disability in schizophrenia. Qualitative Health Research 12: 297–309.
  • Wint E and Frank C (2006) From poor to not poor: Improved understandings and the advantage of the qualitative approach. Journal of Sociology & Social Welfare 33: 163–177.


Finding Data for Secondary Data Analysis: Finding Your Secondary Data

Video tutorial, Sage Research Methods Video: Quantitative and Mixed Methods (SAGE Publications, Ltd., 2024), duration 00:05:06. DOI: https://doi.org/10.4135/9781529697971

Dr. Simon Massey emphasizes the importance of formulating a research question and conducting a literature review before searching for secondary data sources.

COMMENTS

  1. What is Secondary Research?

  2. Secondary Data

    Types of secondary data are as follows: Published data: Published data refers to data that has been published in books, magazines, newspapers, and other print media. Examples include statistical reports, market research reports, and scholarly articles. Government data: Government data refers to data collected by government agencies and departments.

  3. Secondary Data Analysis

    Secondary data analysis refers to the analysis of existing data collected by others. Secondary analysis affords researchers the opportunity to investigate research questions using large-scale data sets that are often inclusive of under-represented groups, while saving time and resources.

  4. Sage Research Methods Foundations

    Secondary analysis is the analysis of data that have originally been collected either for a different purpose or by a different researcher or organisation. Because of the cost and complexity of primary data collection, and because of the opportunities offered by "found" data not originally collected for research purposes (e.g ...

  5. Secondary Analysis Research

Secondary data analysis research may be limited to descriptive, exploratory, and correlational designs and nonparametric statistical tests. By their nature, SDA studies are observational and retrospective, and the investigator cannot examine causal relationships (by a randomized, controlled design). ... Qualitative and Quantitative Methods in ... (A brief code sketch of such nonparametric tests appears after this list.)

  6. Secondary Data Analysis: Your Complete How-To Guide

Step 3: Design your research process. After defining your statement of purpose, the next step is to design the research process. For primary data, this involves determining the types of data you want to collect (e.g. quantitative, qualitative, or both) and a methodology for gathering them. For secondary data analysis, however, your research ... (See the dataset-screening sketch after this list.)

  7. Quantitative Research: What It Is, Practices & Methods

The following are five popularly used secondary quantitative research methods: Data available on the internet: With the high penetration of the internet and mobile devices, it has become increasingly easy to conduct quantitative research using the internet. Information about most research topics is available online, and this aids in boosting ... (See the web-data sketch after this list.)

  8. Secondary Research: Definition, Methods & Examples

Secondary research, also known as desk research, is a research method that involves compiling existing data sourced from a variety of channels. This includes internal sources (e.g. in-house research) or, more commonly, external sources (such as government statistics, organizational bodies, and the internet).

  9. Sage Research Methods

    Volume 2: Quantitative Approaches to Secondary Analysis covers the broad range of approaches adopted in quantitative secondary analysis research designs. Volume 3: Qualitative Data and Research in Secondary Analysis focuses on qualitative research methods that offer the social researcher the opportunity to examine additional themes or explore ...

  10. Using Secondary Data in Mixed Methods is More Straight-Forward Than You

    Secondary data in mixed methods research is the process of identifying, evaluating, and incorporating one or more secondary qualitative or quantitative data sources into a mixed methods project. Incorporating secondary data expands on the original definition of mixed methods research, which involves collecting, analyzing, and integrating qualitative and quantitative approaches to study a ...

  11. The Importance Of Secondary Data

    The Importance Of Secondary Data. Data Collection. Apr 29, 2022. by Helen Kara. Dr. Helen Kara was the Mentor in Residence for April 2022 to focus on the topic: Be expansive: Research outside academia. Dr. Kara has been an independent researcher since 1999 and writes and teaches on research methods. She is the author of Qualitative Research for ...

  12. What is Quantitative Research? Definition, Methods, Types, and Examples

Quantitative research methods. Quantitative research methods are classified into two types: primary and secondary. Primary quantitative research methods: in this type of quantitative research, data are collected directly by the researchers using the following methods. Survey research: Surveys are the easiest and most commonly used ...

  13. Secondary Data in Research

    This research employs mixed qualitative and quantitative methods (Onwuegbuzie and Johnson, 2006), and it is strongly based on secondary data (Martins et al., 2018). In order to obtain data from ...

  14. How Can I Use Secondary Quantitative Data in My Research?

  15. 5 Methods of Data Collection for Quantitative Research

A fifth method of data collection for quantitative research is known as secondary research: reviewing existing research to see how it can contribute to understanding the issue in question. This contrasts with the primary research methods above, which are specially commissioned and carried out for a specific research project.

  16. Quantitative Data

Secondary data analysis: Secondary data analysis involves using existing data that was collected for a different purpose to answer a new research question. This method can be cost-effective and efficient, but it is important to ensure that the data is appropriate for the research question being studied. ... Scientific research: Quantitative ... (The dataset-screening sketch after this list illustrates such an appropriateness check.)

  17. Sage Research Methods Video: Quantitative and Mixed Methods

    Product: Sage Research Methods Video: Quantitative and Mixed Methods; Type of Content: Tutorial Title: Finding Data for Secondary Data Analysis: Why Use Secondary Data? Publisher: SAGE Publications, Ltd. Series: Finding Data for Secondary Data Analysis; Publication year: 2024; Online pub date: February 26, 2024

  18. Secondary Quantitative Data Collection

Secondary Quantitative Data Collection ... The cost of the proposed methodology is an important factor in choosing research methods. Data collection costs are high in many projects, and the methodology with the greatest quality and potential value is sometimes rejected in favour of a ...

  19. Secondary Qualitative Research Methodology Using Online Data within the

There are many benefits to qualitative over quantitative research, particularly in the context of forced migration. ... For this reason, we propose a new systematic step-by-step guideline with a set of methods for secondary data collection, filtering, and analysis to mitigate the drawbacks of secondary data analysis, particularly in the setting ... (The filtering sketch after this list gives a minimal illustration of one such step.)

  20. Primary vs Secondary Research: Differences, Methods, Sources, and More

    Navigating the Pros and Cons. Balance Your Research Needs: Consider starting with secondary research to gain a broad understanding of the subject matter, then delve into primary research for specific, targeted insights that are tailored to your precise needs. Resource Allocation: Evaluate your budget, time, and resource availability. Primary research can offer more specific and actionable data ...

  21. Conducting secondary analysis of qualitative data: Should we, can we

    While secondary data analysis of quantitative data has become commonplace and encouraged across disciplines, the practice of secondary data analysis with qualitative data has met more criticism and concerns regarding potential methodological and ethical problems. ... Walsh D and Downe S (2005) Meta-synthesis method for qualitative research: A ...

  22. Sage Research Methods Video: Quantitative and Mixed Methods

    Product: Sage Research Methods Video: Quantitative and Mixed Methods; Type of Content: Tutorial Title: Finding Data for Secondary Data Analysis: Finding Your Secondary Data Publisher: SAGE Publications, Ltd. Series: Finding Data for Secondary Data Analysis; Publication year: 2024
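
CODE SKETCHES

The short Python sketches below illustrate points made in the sources listed above. Every file name, URL, variable name, and value in them is a hypothetical placeholder, not a reference to a real dataset, and each sketch is a minimal example under stated assumptions rather than a prescribed procedure.

Item 5 notes that secondary data analysis is often restricted to descriptive, exploratory, and correlational designs and to nonparametric statistical tests. A minimal sketch of such a correlational analysis, assuming a small survey extract with hypothetical income, life_satisfaction, and urban variables, could look like this:

    # Minimal sketch (hypothetical data): a correlational design on secondary
    # data using nonparametric tests. The data frame below stands in for an
    # existing dataset that would normally be loaded with pd.read_csv().
    import pandas as pd
    from scipy import stats

    df = pd.DataFrame({
        "income":            [1200, 2100, 1800, 3500, 2600, 1500, 4000, 2200],
        "life_satisfaction": [4, 6, 5, 8, 7, 5, 9, 6],
        "urban":             [0, 1, 0, 1, 1, 0, 1, 0],
    })

    # Spearman rank correlation: monotonic association, no normality assumption.
    rho, p_rho = stats.spearmanr(df["income"], df["life_satisfaction"])
    print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3f}")

    # Mann-Whitney U test: compare urban and rural respondents without assuming
    # normally distributed satisfaction scores.
    urban = df.loc[df["urban"] == 1, "life_satisfaction"]
    rural = df.loc[df["urban"] == 0, "life_satisfaction"]
    u, p_u = stats.mannwhitneyu(urban, rural, alternative="two-sided")
    print(f"Mann-Whitney U = {u:.0f}, p = {p_u:.3f}")

Neither test implies causation; as item 5 stresses, observational secondary data support correlational claims only.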
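
Items 6 and 16 both stress that, for secondary analysis, a central design task is checking whether an existing dataset actually fits the research question. A sketch of that screening step, assuming a hypothetical household_panel.csv file and made-up variable names, might look like this:

    # Minimal sketch (hypothetical file and variables): screening an existing
    # dataset against the variables a research question requires.
    import pandas as pd

    required = ["age", "education", "employment_status", "survey_year"]
    df = pd.read_csv("household_panel.csv")   # hypothetical secondary dataset

    # 1. Are all required variables present in this source?
    missing_vars = [v for v in required if v not in df.columns]
    if missing_vars:
        print("Not covered by this source:", missing_vars)

    # 2. For the variables that are present, how complete are they?
    present = [v for v in required if v in df.columns]
    print(df[present].notna().mean().round(2))   # share of non-missing values

    # 3. Does the time coverage match the research question?
    if "survey_year" in df.columns:
        print("Years covered:", int(df["survey_year"].min()), "to", int(df["survey_year"].max()))

In practice this check is run against the dataset's codebook as well as the data file, since variable labels and coding schemes matter as much as variable names.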
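
Item 7 points out that much quantitative secondary research now starts from data published on the internet. A minimal sketch of reading such an open dataset directly from the web, using a placeholder URL rather than a real resource, could be:

    # Minimal sketch (placeholder URL): reading an openly published dataset
    # straight from the web and summarising it.
    import pandas as pd

    url = "https://example.org/open-data/labour_force_survey.csv"
    df = pd.read_csv(url)

    print(df.shape)        # number of cases and variables downloaded
    print(df.describe())   # descriptive statistics for the numeric variables

The source's documentation, licence, and collection method should be reviewed before any figures from such a download are reported.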
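
Item 19 proposes a step-by-step guideline for collecting, filtering, and analysing secondary online data. The filtering step can be as simple as a keyword screen; the sketch below uses made-up records and keywords purely for illustration:

    # Minimal sketch (hypothetical records and keywords): a simple relevance
    # filter applied to secondary textual data before analysis.
    records = [
        "Interview excerpt about housing conditions after resettlement",
        "Forum post about local weather",
        "Report section on access to education for displaced children",
    ]
    keywords = {"resettlement", "displaced", "asylum"}

    def is_relevant(text: str) -> bool:
        # Keep a record if it mentions at least one topic keyword.
        return bool(set(text.lower().split()) & keywords)

    filtered = [r for r in records if is_relevant(r)]
    print(f"Kept {len(filtered)} of {len(records)} records")

A real filtering stage would normally be documented and tested against a hand-coded sample, since an over-strict keyword list can silently exclude relevant material.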