What is the Scientific Method: How does it work and why is it important?

The scientific method is a systematic process involving steps like defining questions, forming hypotheses, conducting experiments, and analyzing data. It minimizes biases and enables replicable research, leading to groundbreaking discoveries like Einstein's theory of relativity, penicillin, and the structure of DNA. This ongoing approach promotes reason, evidence, and the pursuit of truth in science.

Updated on November 18, 2023

Beginning in elementary school, we are exposed to the scientific method and taught how to put it into practice. As a tool for learning, it prepares children to think logically and use reasoning when seeking answers to questions.

Rather than jumping to conclusions, the scientific method gives us a recipe for exploring the world through observation and trial and error. We use it regularly, sometimes knowingly in academics or research, and sometimes subconsciously in our daily lives.

In this article we will refresh our memories on the particulars of the scientific method, discussing where it comes from, which elements comprise it, and how it is put into practice. Then, we will consider the importance of the scientific method, who uses it and under what circumstances.

What is the scientific method?

The scientific method is a dynamic process that involves objectively investigating questions through observation and experimentation. Applicable to all scientific disciplines, this systematic approach to answering questions is more accurately described as a flexible set of principles than as a fixed series of steps.

The following representations of the scientific method illustrate how it can be both condensed into broad categories and also expanded to reveal more and more details of the process. These graphics capture the adaptability that makes this concept universally valuable as it is relevant and accessible not only across age groups and educational levels but also within various contexts.

[Figure: a graph of the scientific method]

Steps in the scientific method

While the scientific method is versatile in form and function, it encompasses a collection of principles that create a logical progression to the process of problem solving:

  • Define a question: Constructing a clear and precise problem statement that identifies the main question or goal of the investigation is the first step. The wording must lend itself to experimentation by posing a question that is both testable and measurable.
  • Gather information and resources: Researching the topic to find out what is already known and what related questions others are asking is the next step in this process. This background information is vital to gaining a full understanding of the subject and to determining the best design for experiments.
  • Form a hypothesis: Composing a concise statement that identifies specific variables and potential results, which can then be tested, is a crucial step that must be completed before any experimentation. A flaw in the composition of a hypothesis can weaken the entire design of an experiment.
  • Perform the experiments: Testing the hypothesis through replicable experiments and collecting the resulting data is another fundamental step of the scientific method. By controlling some elements of an experiment while purposely manipulating others, researchers establish cause-and-effect relationships.
  • Analyze the data: Interpreting the experimental process and results by recognizing trends in the data is necessary for comprehending its meaning and supporting the conclusions. Drawing inferences through this systematic process provides substantive evidence for either supporting or rejecting the hypothesis.
  • Report the results: Sharing the outcomes of an experiment, through an essay, presentation, graphic, or journal article, is often regarded as the final step in this process. Detailing the project's design, methods, and results not only promotes transparency and replicability but also adds to the body of knowledge for future research.
  • Retest the hypothesis: Repeating experiments to see whether a hypothesis holds in all cases can take several forms. Sometimes a researcher immediately checks their own work or replicates it later; at other times, another researcher repeats the experiments to further test the hypothesis.
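
The iterative shape of these steps can be sketched in code. This is an illustrative model only: the `Investigation` class, `run_cycle` function, and the 0.05 tolerance are hypothetical choices for the example, not part of any formal definition of the scientific method.

```python
# Illustrative sketch: the scientific method as an iterative loop.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Investigation:
    question: str                       # define a question
    hypothesis: str                     # form a hypothesis
    results: List[float] = field(default_factory=list)
    supported: Optional[bool] = None

def run_cycle(inv: Investigation,
              experiment: Callable[[str], float],
              analyze: Callable[[float], bool]) -> bool:
    """One pass: perform the experiment, record and analyze the data."""
    data = experiment(inv.hypothesis)   # perform the experiments
    inv.results.append(data)            # collect the resulting data
    inv.supported = analyze(data)       # analyze the data
    return inv.supported

# Toy usage: a coin-flip investigation. "Retest the hypothesis" is simply
# calling run_cycle again with fresh data.
inv = Investigation(question="Does this coin land heads about half the time?",
                    hypothesis="P(heads) = 0.5")
observed = 0.52                         # pretend experimental frequency
supported = run_cycle(inv,
                      experiment=lambda h: observed,
                      analyze=lambda d: abs(d - 0.5) < 0.05)
print(supported)  # → True
```

The point of the loop structure is that a "completed" investigation is just an input to the next cycle, which mirrors the retesting step above.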

[Figure: a chart of the scientific method]

Where did the scientific method come from?

Oftentimes, ancient peoples attempted to answer questions about the unknown by:

  • Making simple observations
  • Discussing the possibilities with others deemed worthy of a debate
  • Drawing conclusions based on dominant opinions and preexisting beliefs

For example, take Greek and Roman mythology. Myths were used to explain everything from the seasons and stars to the sun and death itself.

However, as societies began to grow through advancements in agriculture and language, ancient civilizations like Egypt and Babylonia shifted to a more rational analysis for understanding the natural world. They increasingly employed empirical methods of observation and experimentation that would one day evolve into the scientific method.

In the 4th century BCE, Aristotle, considered the Father of Science by many, suggested these elements, which closely resemble the contemporary scientific method, as part of his approach for conducting science:

  • Study what others have written about the subject.
  • Look for the general consensus about the subject.
  • Perform a systematic study of everything even partially related to the topic.

[Figure: a pyramid of the scientific method]

By continuing to emphasize systematic observation and controlled experiments, scholars such as Al-Kindi and Ibn al-Haytham helped expand this concept throughout the Islamic Golden Age.

In his 1620 treatise Novum Organum, Sir Francis Bacon codified the scientific method, arguing not only that hypotheses must be tested through experiments but also that the results must be replicated to establish a truth. Coming at the height of the Scientific Revolution, this text made the scientific method accessible to European thinkers like Galileo and Isaac Newton, who then put the method into practice.

As science modernized in the 19th century, the scientific method became more formalized, leading to significant breakthroughs in fields such as evolution and germ theory. Today, it continues to evolve, underpinning scientific progress in diverse areas like quantum mechanics, genetics, and artificial intelligence.

Why is the scientific method important?

The history of the scientific method illustrates how the concept developed out of a need to find objective answers to scientific questions by overcoming biases based on fear, religion, power, and cultural norms. This still holds true today.

By implementing this standardized approach to conducting experiments, the impacts of researchers’ personal opinions and preconceived notions are minimized. The organized manner of the scientific method guards against such errors while promoting the replicability and transparency necessary for solid scientific research.

The importance of the scientific method is best observed through its successes, for example: 

  • “Albert Einstein stands out among modern physicists as the scientist who not only formulated a theory of revolutionary significance but also had the genius to reflect in a conscious and technical way on the scientific method he was using.” Devising a hypothesis based on the prevailing understanding of Newtonian physics eventually led Einstein to formulate the theory of general relativity.
  • As Howard Florey observed, “Perhaps the most useful lesson which has come out of the work on penicillin has been the demonstration that success in this field depends on the development and coordinated use of technical methods.” After discovering a mold that prevented the growth of Staphylococcus bacteria, Dr. Alexander Fleming designed experiments to identify and reproduce it in the lab, leading to the development of penicillin.
  • As James D. Watson put it, “Every time you understand something, religion becomes less likely. Only with the discovery of the double helix and the ensuing genetic revolution have we had grounds for thinking that the powers held traditionally to be the exclusive property of the gods might one day be ours. . . .” By using wire models to conceive a structure for DNA, Watson and Crick tested candidate arrangements of nucleotide bases against X-ray diffraction images and current research in structural chemistry, resulting in the discovery of DNA’s double-helix structure.

Final thoughts

As these cases exemplify, the scientific method is never truly completed, but rather started and restarted. It gave these researchers a structured process that was easily replicated, modified, and built upon.

While the scientific method may “end” in one context, it never literally ends. When a hypothesis, design, methods, and experiments are revisited, the scientific method simply picks up where it left off. Each time a researcher builds upon previous knowledge, the scientific method resumes with the pieces of past efforts in place.

By guiding researchers towards objective results based on transparency and reproducibility, the scientific method acts as a defense against bias, superstition, and preconceived notions. As we embrace the scientific method's enduring principles, we ensure that our quest for knowledge remains firmly rooted in reason, evidence, and the pursuit of truth.

The AJE Team

Scientific Method

Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories. How these are carried out in detail can vary greatly, but characteristics like these have been looked to as a way of demarcating scientific activity from non-science, where only enterprises which employ some canonical form of scientific method or methods should be considered science (see also the entry on science and pseudo-science ). Others have questioned whether there is anything like a fixed toolkit of methods which is common across science and only science. Some reject privileging one view of method as part of rejecting broader views about the nature of science, such as naturalism (Dupré 2004); some reject any restriction in principle (pluralism).

Scientific method should be distinguished from the aims and products of science, such as knowledge, predictions, or control. Methods are the means by which those goals are achieved. Scientific method should also be distinguished from meta-methodology, which includes the values and justifications behind a particular characterization of scientific method (i.e., a methodology) — values such as objectivity, reproducibility, simplicity, or past successes. Methodological rules are proposed to govern method and it is a meta-methodological question whether methods obeying those rules satisfy given values. Finally, method is distinct, to some degree, from the detailed and contextual practices through which methods are implemented. The latter might range over: specific laboratory techniques; mathematical formalisms or other specialized languages used in descriptions and reasoning; technological or other material means; ways of communicating and sharing results, whether with other scientists or with the public at large; or the conventions, habits, enforced customs, and institutional controls over how and what science is carried out.

While it is important to recognize these distinctions, their boundaries are fuzzy. Hence, accounts of method cannot be entirely divorced from their methodological and meta-methodological motivations or justifications. Moreover, each aspect plays a crucial role in identifying methods. Disputes about method have therefore played out at the detail, rule, and meta-rule levels. Changes in beliefs about the certainty or fallibility of scientific knowledge, for instance (which is a meta-methodological consideration of what we can hope for methods to deliver), have meant different emphases on deductive and inductive reasoning, or on the relative importance attached to reasoning over observation (i.e., differences over particular methods). Beliefs about the role of science in society will affect the place one gives to values in scientific method.

The issue which has shaped debates over scientific method the most in the last half century is the question of how pluralist we need to be about method. Unificationists continue to hold out for one method essential to science; nihilism is a form of radical pluralism, which considers the effectiveness of any methodological prescription to be so context sensitive as to render it not explanatory on its own. Some middle degree of pluralism regarding the methods embodied in scientific practice seems appropriate. But the details of scientific practice vary with time and place, from institution to institution, across scientists and their subjects of investigation. How significant are the variations for understanding science and its success? How much can method be abstracted from practice? This entry describes some of the attempts to characterize scientific method or methods, as well as arguments for a more context-sensitive approach to methods embedded in actual scientific practices.

Entry Contents

  • 1. Overview and organizing themes
  • 2. Historical review: Aristotle to Mill
  • 3.1 Logical constructionism and operationalism
  • 3.2 H-D as a logic of confirmation
  • 3.3 Popper and falsificationism
  • 3.4 Meta-methodology and the end of method
  • 4. Statistical methods for hypothesis testing
  • 5.1 Creative and exploratory practices
  • 5.2 Computer methods and the ‘new ways’ of doing science
  • 6.1 “The scientific method” in science education and as seen by scientists
  • 6.2 Privileged methods and ‘gold standards’
  • 6.3 Scientific method in the court room
  • 6.4 Deviating practices
  • 7. Conclusion
  • Bibliography
  • Other Internet Resources
  • Related Entries

1. Overview and organizing themes

This entry could have been given the title Scientific Methods and gone on to fill volumes, or it could have been extremely short, consisting of a brief summary rejection of the idea that there is any such thing as a unique Scientific Method at all. Both unhappy prospects are due to the fact that scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations.

The choice of scope for the present entry is more optimistic, taking a cue from the recent movement in philosophy of science toward a greater attention to practice: to what scientists actually do. This “turn to practice” can be seen as the latest form of studies of methods in science, insofar as it represents an attempt at understanding scientific activity, but through accounts that are neither meant to be universal and unified, nor singular and narrowly descriptive. To some extent, different scientists at different times and places can be said to be using the same method even though, in practice, the details are different.

Whether the context in which methods are carried out is relevant, or to what extent, will depend largely on what one takes the aims of science to be and what one’s own aims are. For most of the history of scientific methodology the assumption has been that the most important output of science is knowledge and so the aim of methodology should be to discover those methods by which scientific knowledge is generated.

Science was seen to embody the most successful form of reasoning (but which form?) to the most certain knowledge claims (but how certain?) on the basis of systematically collected evidence (but what counts as evidence, and should the evidence of the senses take precedence, or rational insight?) Section 2 surveys some of the history, pointing to two major themes. One theme is seeking the right balance between observation and reasoning (and the attendant forms of reasoning which employ them); the other is how certain scientific knowledge is or can be.

Section 3 turns to 20th century debates on scientific method. In the second half of the 20th century the epistemic privilege of science faced several challenges and many philosophers of science abandoned the reconstruction of the logic of scientific method. Views changed significantly regarding which functions of science ought to be captured and why. For some, the success of science was better identified with social or cultural features. Historical and sociological turns in the philosophy of science were made, with a demand that greater attention be paid to the non-epistemic aspects of science, such as sociological, institutional, material, and political factors. Even outside of those movements there was an increased specialization in the philosophy of science, with more and more focus on specific fields within science. The combined upshot was very few philosophers arguing any longer for a grand unified methodology of science. Sections 3 and 4 survey the main positions on scientific method in 20th century philosophy of science, focusing on where they differ in their preference for confirmation or falsification or for waiving the idea of a special scientific method altogether.

In recent decades, attention has primarily been paid to scientific activities traditionally falling under the rubric of method, such as experimental design and general laboratory practice, the use of statistics, the construction and use of models and diagrams, interdisciplinary collaboration, and science communication. Sections 4–6 attempt to construct a map of the current domains of the study of methods in science.

As these sections illustrate, the question of method is still central to the discourse about science. Scientific method remains a topic for education, for science policy, and for scientists. It arises in the public domain where the demarcation or status of science is at issue. Some philosophers have recently returned, therefore, to the question of what it is that makes science a unique cultural product. This entry will close with some of these recent attempts at discerning and encapsulating the activities by which scientific knowledge is achieved.

Attempting a history of scientific method compounds the vast scope of the topic. This section briefly surveys the background to modern methodological debates. What can be called the classical view goes back to antiquity, and represents a point of departure for later divergences. [ 1 ]

We begin with a point made by Laudan (1968) in his historical survey of scientific method:

Perhaps the most serious inhibition to the emergence of the history of theories of scientific method as a respectable area of study has been the tendency to conflate it with the general history of epistemology, thereby assuming that the narrative categories and classificatory pigeon-holes applied to the latter are also basic to the former. (1968: 5)

To see knowledge about the natural world as falling under knowledge more generally is an understandable conflation. Histories of theories of method would naturally employ the same narrative categories and classificatory pigeon holes. An important theme of the history of epistemology, for example, is the unification of knowledge, a theme reflected in the question of the unification of method in science. Those who have identified differences in kinds of knowledge have often likewise identified different methods for achieving that kind of knowledge (see the entry on the unity of science ).

Different views on what is known, how it is known, and what can be known are connected. Plato distinguished the realms of things into the visible and the intelligible (The Republic, 510a, in Cooper 1997). Only the latter, the Forms, could be objects of knowledge. The intelligible truths could be known with the certainty of geometry and deductive reasoning. What could be observed of the material world, however, was by definition imperfect and deceptive, not ideal. The Platonic way of knowledge therefore emphasized reasoning as a method, downplaying the importance of observation. Aristotle disagreed, locating the Forms in the natural world as the fundamental principles to be discovered through the inquiry into nature (Metaphysics Z, in Barnes 1984).

Aristotle is recognized as giving the earliest systematic treatise on the nature of scientific inquiry in the western tradition, one which embraced observation and reasoning about the natural world. In the Prior and Posterior Analytics, Aristotle reflects first on the aims and then the methods of inquiry into nature. A number of features can be found which are still considered by most to be essential to science. For Aristotle, empiricism, careful observation (but passive observation, not controlled experiment), is the starting point. The aim is not merely recording of facts, though. For Aristotle, science (epistêmê) is a body of properly arranged knowledge or learning—the empirical facts, but also their ordering and display are of crucial importance. The aims of discovery, ordering, and display of facts partly determine the methods required of successful scientific inquiry. Also determinant is the nature of the knowledge being sought, and the explanatory causes proper to that kind of knowledge (see the discussion of the four causes in the entry on Aristotle on causality).

In addition to careful observation, then, scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation. Methods of reasoning may include induction, prediction, or analogy, among others. Aristotle’s system (along with his catalogue of fallacious reasoning) was collected under the title the Organon. This title would be echoed in later works on scientific reasoning, such as the Novum Organum by Francis Bacon, and the Novum Organon Renovatum by William Whewell (see below). In Aristotle’s Organon, reasoning is divided primarily into two forms, a rough division which persists into modern times. The division, known most commonly today as deductive versus inductive method, appears in other eras and methodologies as analysis/synthesis, non-ampliative/ampliative, or even confirmation/verification. The basic idea is there are two “directions” to proceed in our methods of inquiry: one away from what is observed, to the more fundamental, general, and encompassing principles; the other, from the fundamental and general to instances or implications of principles.

The basic aim and method of inquiry identified here can be seen as a theme running throughout the next two millennia of reflection on the correct way to seek after knowledge: carefully observe nature and then seek rules or principles which explain or predict its operation. The Aristotelian corpus provided the framework for a commentary tradition on scientific method independent of science itself (cosmos versus physics). During the medieval period, figures such as Albertus Magnus (1206–1280), Thomas Aquinas (1225–1274), Robert Grosseteste (1175–1253), Roger Bacon (1214/1220–1292), William of Ockham (1287–1347), Andreas Vesalius (1514–1546), and Giacomo Zabarella (1533–1589) all worked to clarify the kind of knowledge obtainable by observation and induction, the source of justification of induction, and the best rules for its application.[2] Many of their contributions we now think of as essential to science (see also Laudan 1968). As Aristotle and Plato had employed a framework of reasoning either “to the forms” or “away from the forms”, medieval thinkers employed directions away from the phenomena or back to the phenomena. In analysis, a phenomenon was examined to discover its basic explanatory principles; in synthesis, explanations of a phenomenon were constructed from first principles.

During the Scientific Revolution these various strands of argument, experiment, and reason were forged into a dominant epistemic authority. The 16th–18th centuries were a period of not only dramatic advance in knowledge about the operation of the natural world—advances in mechanical, medical, biological, political, economic explanations—but also of self-awareness of the revolutionary changes taking place, and intense reflection on the source and legitimation of the method by which the advances were made. The struggle to establish the new authority included methodological moves. The Book of Nature, according to the metaphor of Galileo Galilei (1564–1642) or Francis Bacon (1561–1626), was written in the language of mathematics, of geometry and number. This motivated an emphasis on mathematical description and mechanical explanation as important aspects of scientific method. Through figures such as Henry More and Ralph Cudworth, a neo-Platonic emphasis on the importance of metaphysical reflection on nature behind appearances, particularly regarding the spiritual as a complement to the purely mechanical, remained an important methodological thread of the Scientific Revolution (see the entries on Cambridge platonists; Boyle; Henry More; Galileo).

In Novum Organum (1620), Bacon was critical of the Aristotelian method for leaping from particulars to universals too quickly. The syllogistic form of reasoning readily mixed those two types of propositions. Bacon aimed at the invention of new arts, principles, and directions. His method would be grounded in methodical collection of observations, coupled with correction of our senses (and particularly, directions for the avoidance of the Idols, as he called them, kinds of systematic errors to which naïve observers are prone). The community of scientists could then climb, by a careful, gradual and unbroken ascent, to reliable general claims.

Bacon’s method has been criticized as impractical and too inflexible for the practicing scientist. Whewell would later criticize Bacon for paying too little attention to the practices of scientists. It is hard to find convincing examples of Bacon’s method being put into practice in the history of science, but there are a few who have been held up as real examples of 17th century scientific, inductive method, even if not in the rigid Baconian mold: figures such as Robert Boyle (1627–1691) and William Harvey (1578–1657) (see the entry on Bacon).

It is to Isaac Newton (1642–1727), however, that historians of science and methodologists have paid greatest attention. Given the enormous success of his Principia Mathematica and Opticks, this is understandable. The study of Newton’s method has had two main thrusts: the implicit method of the experiments and reasoning presented in the Opticks, and the explicit methodological rules given as the Rules for Philosophising (the Regulae) in Book III of the Principia.[3] Newton’s law of gravitation, the linchpin of his new cosmology, broke with explanatory conventions of natural philosophy, first for apparently proposing action at a distance, but more generally for not providing “true”, physical causes. The argument for his System of the World (Principia, Book III) was based on phenomena, not reasoned first principles. This was viewed (mainly on the continent) as insufficient for proper natural philosophy. The Regulae counter this objection, re-defining the aims of natural philosophy by re-defining the method natural philosophers should follow. (See the entry on Newton’s philosophy.)

To his list of methodological prescriptions should be added Newton’s famous phrase “hypotheses non fingo” (commonly translated as “I frame no hypotheses”). The scientist was not to invent systems but infer explanations from observations, as Bacon had advocated. This would come to be known as inductivism. In the century after Newton, significant clarifications of the Newtonian method were made. Colin Maclaurin (1698–1746), for instance, reconstructed the essential structure of the method as having complementary analysis and synthesis phases, one proceeding away from the phenomena in generalization, the other from the general propositions to derive explanations of new phenomena. Denis Diderot (1713–1784) and the editors of the Encyclopédie did much to consolidate and popularize Newtonianism, as did Francesco Algarotti (1712–1764). The emphasis was often as much on the character of the scientist as on their process, a character which is still commonly assumed. The scientist is humble in the face of nature, not beholden to dogma, obeys only his eyes, and follows the truth wherever it leads. It was certainly Voltaire (1694–1778) and du Châtelet (1706–1749) who were most influential in propagating the latter vision of the scientist and their craft, with Newton as hero. Scientific method became a revolutionary force of the Enlightenment. (See also the entries on Newton, Leibniz, Descartes, Boyle, Hume, and enlightenment, as well as Shank 2008 for a historical overview.)

Not all 18th century reflections on scientific method were so celebratory. Famous also are George Berkeley’s (1685–1753) attack on the mathematics of the new science, as well as the over-emphasis of Newtonians on observation; and David Hume’s (1711–1776) undermining of the warrant offered for scientific claims by inductive justification (see the entries on George Berkeley; David Hume; Hume’s Newtonianism and Anti-Newtonianism). Hume’s problem of induction motivated Immanuel Kant (1724–1804) to seek new foundations for empirical method, though as an epistemic reconstruction, not as any set of practical guidelines for scientists. Both Hume and Kant influenced the methodological reflections of the next century, such as the debate between Mill and Whewell over the certainty of inductive inferences in science.

The debate between John Stuart Mill (1806–1873) and William Whewell (1794–1866) has become the canonical methodological debate of the 19th century. Although often characterized as a debate between inductivism and hypothetico-deductivism, the role of the two methods on each side is actually more complex. On the hypothetico-deductive account, scientists work to come up with hypotheses from which true observational consequences can be deduced—hence, hypothetico-deductive. Because Whewell emphasizes both hypotheses and deduction in his account of method, he can be seen as a convenient foil to the inductivism of Mill. However, equally if not more important to Whewell’s portrayal of scientific method is what he calls the “fundamental antithesis”. Knowledge is a product of the objective (what we see in the world around us) and the subjective (the contributions of our mind to how we perceive and understand what we experience, which he called the Fundamental Ideas). Both elements are essential according to Whewell, and he was therefore critical of Kant for too much focus on the subjective, and of John Locke (1632–1704) and Mill for too much focus on the senses. Whewell’s fundamental ideas can be discipline relative. An idea can be fundamental even if it is necessary for knowledge only within a given scientific discipline (e.g., chemical affinity for chemistry). This distinguishes fundamental ideas from the forms and categories of intuition of Kant. (See the entry on Whewell.)

Clarifying fundamental ideas would therefore be an essential part of scientific method and scientific progress. Whewell called this process “Discoverer’s Induction”. It was induction, following Bacon or Newton, but Whewell sought to revive Bacon’s account by emphasising the role of ideas in the clear and careful formulation of inductive hypotheses. Whewell’s induction is not merely the collecting of objective facts. The subjective plays a role through what Whewell calls the Colligation of Facts, a creative act of the scientist, the invention of a theory. A theory is then confirmed by testing, where more facts are brought under the theory, called the Consilience of Inductions. Whewell felt that this was the method by which the true laws of nature could be discovered: clarification of fundamental concepts, clever invention of explanations, and careful testing. Mill, in his critique of Whewell, and others who have cast Whewell as a fore-runner of the hypothetico-deductivist view, seem to have under-estimated the importance of this discovery phase in Whewell’s understanding of method (Snyder 1997a,b, 1999). Down-playing the discovery phase would come to characterize methodology of the early 20th century (see section 3).

Mill, in his System of Logic, put forward a narrower view of induction as the essence of scientific method. For Mill, induction is the search first for regularities among events. Among those regularities, some will continue to hold for further observations, eventually gaining the status of laws. One can also look for regularities among the laws discovered in a domain, i.e., for a law of laws. Which law of laws will hold is time and discipline dependent and open to revision. One example is the Law of Universal Causation, and Mill put forward specific methods for identifying causes—now commonly known as Mill’s methods. These five methods look for circumstances which are common among the phenomena of interest, those which are absent when the phenomena are absent, or those for which both vary together. Mill’s methods are still seen as capturing basic intuitions about experimental methods for finding the relevant explanatory factors (System of Logic (1843); see the entry on Mill). The methods advocated by Whewell and Mill, in the end, look similar. Both involve inductive generalization to covering laws. They differ dramatically, however, with respect to the necessity of the knowledge arrived at; that is, at the meta-methodological level (see the entries on Whewell and Mill).

3. Logic of method and critical responses

The quantum and relativistic revolutions in physics in the early 20th century had a profound effect on methodology. Conceptual foundations of both theories were taken to show the defeasibility of even the most seemingly secure intuitions about space, time and bodies. Certainty of knowledge about the natural world was therefore recognized as unattainable. Instead, a renewed empiricism was sought which rendered science fallible but still rationally justifiable.

Analyses of the reasoning of scientists emerged, according to which the aspects of scientific method which were of primary importance were the means of testing and confirming theories. A distinction in methodology was made between the contexts of discovery and justification. The distinction could be used as a wedge between the particularities of where and how theories or hypotheses are arrived at, on the one hand, and the underlying reasoning scientists use (whether or not they are aware of it) when assessing theories and judging their adequacy on the basis of the available evidence, on the other. By and large, for most of the 20th century, philosophy of science focused on the second context, although philosophers differed on whether to focus on confirmation or refutation as well as on the many details of how confirmation or refutation could or could not be brought about. By the mid-20th century these attempts at defining the method of justification and the context distinction itself came under pressure. During the same period, philosophy of science developed rapidly, and from section 4 this entry will therefore shift from a primarily historical treatment of the scientific method towards a primarily thematic one.

Advances in logic and probability held out the promise of elaborate reconstructions of scientific theories and empirical method, the best example being Rudolf Carnap’s The Logical Structure of the World (1928). Carnap attempted to show that a scientific theory could be reconstructed as a formal axiomatic system—that is, a logic. That system could refer to the world because some of its basic sentences could be interpreted as observations or operations which one could perform to test them. The rest of the theoretical system, including sentences using theoretical or unobservable terms (like electron or force), would then either be meaningful because they could be reduced to observations, or have purely logical meanings (called analytic, like mathematical identities). This has been referred to as the verifiability criterion of meaning. According to the criterion, any statement not either analytic or verifiable was strictly meaningless. Although the view was endorsed by Carnap in 1928, he would later come to see it as too restrictive (Carnap 1956). Another familiar version of this idea is the operationalism of Percy Williams Bridgman. In The Logic of Modern Physics (1927) Bridgman asserted that every physical concept could be defined in terms of the operations one would perform to verify the application of that concept. Making good on the operationalisation of even a concept as simple as length, however, can easily become enormously complex (for measuring very small lengths, for instance) or impractical (for measuring large distances like light years).

Carl Hempel’s (1950, 1951) criticisms of the verifiability criterion of meaning had enormous influence. He pointed out that universal generalizations, such as most scientific laws, were not strictly meaningful on the criterion. Verifiability and operationalism both seemed too restrictive to capture standard scientific aims and practice. The tenuous connection between these reconstructions and actual scientific practice was criticized in another way. In both approaches, scientific methods are instead recast in methodological roles. Measurements, for example, were looked to as ways of giving meanings to terms. The aim of the philosopher of science was not to understand the methods per se, but to use them to reconstruct theories, their meanings, and their relation to the world. When scientists perform these operations, however, they will not report that they are doing them to give meaning to terms in a formal axiomatic system. This disconnect between methodology and the details of actual scientific practice would seem to violate the empiricism the Logical Positivists and Bridgman were committed to. The view that methodology should correspond to practice (to some extent) has been called historicism, or intuitionism. We turn to these criticisms and responses in section 3.4.[4]

Positivism also had to contend with the recognition that a purely inductivist approach, along the lines of Bacon-Newton-Mill, was untenable. There was no pure observation, for starters. All observation was theory laden. Theory is required to make any observation, therefore not all theory can be derived from observation alone. (See the entry on theory and observation in science .) Even granting an observational basis, Hume had already pointed out that one could not deductively justify inductive conclusions without begging the question by presuming the success of the inductive method. Likewise, positivist attempts at analyzing how a generalization can be confirmed by observations of its instances were subject to a number of criticisms. Goodman (1965) and Hempel (1965) both point to paradoxes inherent in standard accounts of confirmation. Recent attempts at explaining how observations can serve to confirm a scientific theory are discussed in section 4 below.

The standard starting point for a non-inductive analysis of the logic of confirmation is known as the Hypothetico-Deductive (H-D) method. In its simplest form, a sentence of a theory which expresses some hypothesis is confirmed by its true consequences. As noted in section 2, this method had been advanced by Whewell in the 19th century, as well as by Nicod (1924) and others in the 20th century. Often, Hempel’s (1966) description of the H-D method, illustrated by the case of Semmelweis’ inferential procedures in establishing the cause of childbed fever, has been presented as a key account of H-D as well as a foil for criticism of the H-D account of confirmation (see, for example, Lipton’s (2004) discussion of inference to the best explanation; also the entry on confirmation). Hempel described Semmelweis’ procedure as examining various hypotheses explaining the cause of childbed fever. Some hypotheses conflicted with observable facts and could be rejected as false immediately. Others needed to be tested experimentally by deducing which observable events should follow if the hypothesis were true (what Hempel called the test implications of the hypothesis), then conducting an experiment and observing whether or not the test implications occurred. If the experiment showed a test implication to be false, the hypothesis could be rejected. If the experiment showed the test implications to be true, however, this did not prove the hypothesis true. The confirmation of a test implication does not verify a hypothesis, though Hempel did allow that “it provides at least some support, some corroboration or confirmation for it” (Hempel 1966: 8). The degree of this support then depends on the quantity, variety and precision of the supporting evidence.
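The asymmetry in this schema can be caricatured in a few lines of code. This is only a toy sketch of the H-D logic described above, not Hempel's own formalism; the function name and the Semmelweis-style scenario are invented for illustration:

```python
# Toy illustration of the hypothetico-deductive schema: a hypothesis is
# tested via an observable implication deduced from it. A failed test
# refutes the hypothesis by modus tollens; a passed test only lends it
# some corroboration, never proof.

def hd_verdict(test_implication_observed: bool) -> str:
    """Return the H-D verdict given whether the deduced test
    implication was actually observed in the experiment."""
    if not test_implication_observed:
        return "refuted"       # implication false -> hypothesis false
    return "corroborated"      # implication true -> some support, not proof

# Hypothetical example: if "cadaveric matter causes childbed fever" is
# true, then mandatory chlorine handwashing should lower mortality.
print(hd_verdict(test_implication_observed=True))   # corroborated
print(hd_verdict(test_implication_observed=False))  # refuted
```

The point the code makes explicit is that the two branches are not symmetric: only the `False` branch licenses a deductively valid conclusion.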

Another approach that took off from the difficulties with inductive inference was Karl Popper’s critical rationalism or falsificationism (Popper 1959, 1963). Falsification is deductive and similar to H-D in that it involves scientists deducing observational consequences from the hypothesis under test. For Popper, however, the important point was not the degree of confirmation that successful prediction offered to a hypothesis. The crucial thing was the logical asymmetry between confirmation, based on inductive inference, and falsification, which can be based on a deductive inference. (This simple opposition was later questioned by Lakatos, among others. See the entry on historicist theories of scientific rationality.)

Popper stressed that, regardless of the amount of confirming evidence, we can never be certain that a hypothesis is true without committing the fallacy of affirming the consequent. Instead, Popper introduced the notion of corroboration as a measure for how well a theory or hypothesis has survived previous testing—but without implying that this is also a measure for the probability that it is true.

Popper was also motivated by his doubts about the scientific status of theories like the Marxist theory of history or psycho-analysis, and so wanted to demarcate between science and pseudo-science. Popper saw this as an importantly different distinction than demarcating science from metaphysics. The latter demarcation was the primary concern of many logical empiricists. Popper used the idea of falsification to draw a line instead between pseudo and proper science. Science was science because its method involved subjecting theories to rigorous tests which offered a high probability of failing and thus refuting the theory.

A commitment to the risk of failure was important. Avoiding falsification could be done all too easily. If a consequence of a theory is inconsistent with observations, an exception can be added by introducing auxiliary hypotheses designed explicitly to save the theory, so-called ad hoc modifications. This Popper saw done in pseudo-science where ad hoc theories appeared capable of explaining anything in their field of application. In contrast, science is risky. If observations showed the predictions from a theory to be wrong, the theory would be refuted. Hence, scientific hypotheses must be falsifiable. Not only must there exist some possible observation statement which could falsify the hypothesis or theory, were it observed, (Popper called these the hypothesis’ potential falsifiers) it is crucial to the Popperian scientific method that such falsifications be sincerely attempted on a regular basis.

The more potential falsifiers of a hypothesis, the more falsifiable it would be, and the more the hypothesis claimed. Conversely, hypotheses without falsifiers claimed very little or nothing at all. Originally, Popper thought that this meant the introduction of ad hoc hypotheses only to save a theory should not be countenanced as good scientific method. These would undermine the falsifiability of a theory. However, Popper later came to recognize that the introduction of modifications (immunizations, he called them) was often an important part of scientific development. Responding to surprising or apparently falsifying observations often generated important new scientific insights. Popper’s own example was the observed motion of Uranus which originally did not agree with Newtonian predictions. The ad hoc hypothesis of an outer planet explained the disagreement and led to further falsifiable predictions. Popper sought to reconcile the view by blurring the distinction between falsifiable and not falsifiable, and speaking instead of degrees of testability (Popper 1985: 41f.).

From the 1960s on, sustained meta-methodological criticism emerged that drove philosophical focus away from scientific method. A brief look at those criticisms follows, with recommendations for further reading at the end of the entry.

Thomas Kuhn’s The Structure of Scientific Revolutions (1962) begins with a well-known shot across the bow for philosophers of science:

History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed. (1962: 1)

The image Kuhn thought needed transforming was the a-historical, rational reconstruction sought by many of the Logical Positivists, though Carnap and other positivists were actually quite sympathetic to Kuhn’s views. (See the entry on the Vienna Circle.) Kuhn shares with others among his contemporaries, such as Feyerabend and Lakatos, a commitment to a more empirical approach to philosophy of science. Namely, the history of science provides important data, and necessary checks, for philosophy of science, including any theory of scientific method.

The history of science reveals, according to Kuhn, that scientific development occurs in alternating phases. During normal science, the members of the scientific community adhere to the paradigm in place. Their commitment to the paradigm means a commitment to the puzzles to be solved and the acceptable ways of solving them. Confidence in the paradigm remains so long as steady progress is made in solving the shared puzzles. Method in this normal phase operates within a disciplinary matrix (Kuhn’s later concept of a paradigm) which includes standards for problem solving, and defines the range of problems to which the method should be applied. An important part of a disciplinary matrix is the set of values which provide the norms and aims for scientific method. The main values that Kuhn identifies are prediction, problem solving, simplicity, consistency, and plausibility.

An important by-product of normal science is the accumulation of puzzles which cannot be solved with the resources of the current paradigm. Once accumulation of these anomalies has reached some critical mass, it can trigger a communal shift to a new paradigm and a new phase of normal science. Importantly, the values that provide the norms and aims for scientific method may have transformed in the meantime. Method may therefore be relative to discipline, time or place.

Feyerabend also identified the aims of science as progress, but argued that any methodological prescription would only stifle that progress (Feyerabend 1988). His arguments are grounded in re-examining accepted “myths” about the history of science. Heroes of science, like Galileo, are shown to be just as reliant on rhetoric and persuasion as they are on reason and demonstration. Others, like Aristotle, are shown to be far more reasonable and far-reaching in their outlooks than they are given credit for. As a consequence, the only rule that could provide what he took to be sufficient freedom was the vacuous “anything goes”. More generally, even the methodological restriction that science is the best way to pursue knowledge, and to increase knowledge, is too restrictive. Feyerabend suggested instead that science might, in fact, be a threat to a free society, because it and its myth had become so dominant (Feyerabend 1978).

An even more fundamental kind of criticism was offered by several sociologists of science from the 1970s onwards who rejected the methodology of providing philosophical accounts for the rational development of science and sociological accounts of the irrational mistakes. Instead, they adhered to a symmetry thesis on which any causal explanation of how scientific knowledge is established needs to be symmetrical in explaining truth and falsity, rationality and irrationality, success and mistakes, by the same causal factors (see, e.g., Barnes and Bloor 1982, Bloor 1991). Movements in the Sociology of Science, like the Strong Programme, or in the social dimensions and causes of knowledge more generally led to extended and close examination of detailed case studies in contemporary science and its history. (See the entries on the social dimensions of scientific knowledge and social epistemology .) Well-known examinations by Latour and Woolgar (1979/1986), Knorr-Cetina (1981), Pickering (1984), Shapin and Schaffer (1985) seem to bear out that it was social ideologies (on a macro-scale) or individual interactions and circumstances (on a micro-scale) which were the primary causal factors in determining which beliefs gained the status of scientific knowledge. As they saw it therefore, explanatory appeals to scientific method were not empirically grounded.

A late, and largely unexpected, criticism of scientific method came from within science itself. Beginning in the early 2000s, a number of scientists attempting to replicate the results of published experiments could not do so. There may be close conceptual connection between reproducibility and method. For example, if reproducibility means that the same scientific methods ought to produce the same result, and all scientific results ought to be reproducible, then whatever it takes to reproduce a scientific result ought to be called scientific method. Space limits us to the observation that, insofar as reproducibility is a desired outcome of proper scientific method, it is not strictly a part of scientific method. (See the entry on reproducibility of scientific results .)

By the close of the 20th century the search for the scientific method was flagging. Nola and Sankey (2000b) could introduce their volume on method by remarking that “For some, the whole idea of a theory of scientific method is yester-year’s debate …”.

4. Statistical methods for hypothesis testing

Despite the many difficulties that philosophers encountered in trying to provide a clear methodology of confirmation (or refutation), important progress has still been made on understanding how observation can provide evidence for a given theory. Work in statistics has been crucial for understanding how theories can be tested empirically, and in recent decades a huge literature has developed that attempts to recast confirmation in Bayesian terms. Here these developments can be covered only briefly, and we refer to the entry on confirmation for further details and references.

Statistics has come to play an increasingly important role in the methodology of the experimental sciences from the 19th century onwards. At that time, statistics and probability theory took on a methodological role as an analysis of inductive inference, and attempts to ground the rationality of induction in the axioms of probability theory have continued throughout the 20th century and into the present. Developments in the theory of statistics itself, meanwhile, have had a direct and immense influence on the experimental method, including methods for measuring the uncertainty of observations such as the Method of Least Squares developed by Legendre and Gauss in the early 19th century, criteria for the rejection of outliers proposed by Peirce by the mid-19th century, and the significance tests developed by Gosset (a.k.a. “Student”), Fisher, Neyman & Pearson and others in the 1920s and 1930s (see, e.g., Swijtink 1987 for a brief historical overview; and also the entry on C.S. Peirce).
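The Method of Least Squares mentioned above is simple enough to sketch directly. The following is a minimal pure-Python illustration of fitting a straight line by minimizing squared residuals, in the spirit of Legendre and Gauss's technique for combining discrepant observations; the data points are invented for the example:

```python
# Ordinary least squares for a straight line y = a + b*x: choose the
# intercept a and slope b that minimize the sum of squared residuals.
# Closed-form solution via the sample means and (co)variances.

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)            # spread of x
    sxy = sum((x - x_bar) * (y - y_bar)                # x-y co-variation
              for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical noisy observations of a process close to y = 1 + 2x:
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = least_squares_line(xs, ys)
print(round(a, 2), round(b, 2))  # 1.04 1.99
```

The fitted slope and intercept land near the "true" values despite the scatter, which is exactly the error-combining behaviour that made the method so influential in experimental practice.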

These developments within statistics then in turn led to a reflective discussion among both statisticians and philosophers of science on how to perceive the process of hypothesis testing: whether it was a rigorous statistical inference that could provide a numerical expression of the degree of confidence in the tested hypothesis, or whether it should be seen as a decision between different courses of action that also involved a value component. This led to a major controversy between Fisher on the one side and Neyman and Pearson on the other (see especially Fisher 1955, Neyman 1956 and Pearson 1955, and for analyses of the controversy, e.g., Howie 2002, Marks 2000, Lenhard 2006). On Fisher’s view, hypothesis testing was a methodology for deciding when to accept or reject a statistical hypothesis, namely that a hypothesis should be rejected by evidence if this evidence would be unlikely relative to other possible outcomes, given that the hypothesis were true. In contrast, on Neyman and Pearson’s view, the consequences of error also had to play a role when deciding between hypotheses. Introducing the distinction between the error of rejecting a true hypothesis (type I error) and that of accepting a false hypothesis (type II error), they argued that it depends on the consequences of the error whether it is more important to avoid rejecting a true hypothesis or accepting a false one. Hence, Fisher aimed for a theory of inductive inference that enabled a numerical expression of confidence in a hypothesis. To him, the important point was the search for truth, not utility. In contrast, the Neyman-Pearson approach provided a strategy of inductive behaviour for deciding between different courses of action. Here, the important point was not whether a hypothesis was true, but whether one should act as if it were.
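The type I/type II trade-off at the heart of the Neyman-Pearson view can be computed explicitly for a toy test. The scenario below (a coin tossed 20 times, with an arbitrary rejection cutoff) is invented for illustration and is not drawn from their papers:

```python
# Neyman-Pearson trade-off in miniature: test H0: p = 0.5 against
# H1: p = 0.8 for a coin tossed n = 20 times, rejecting H0 when at
# least k heads are seen. Raising the cutoff k lowers the type I error
# rate (alpha) but raises the type II error rate (beta): which cutoff
# to use depends on which error is costlier.
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

n = 20
for k in (12, 14, 16):
    alpha = binom_tail(n, 0.5, k)     # P(reject H0 | H0 true)
    beta = 1 - binom_tail(n, 0.8, k)  # P(retain H0 | H1 true)
    print(f"cutoff {k}: alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Running this shows alpha falling and beta rising as the cutoff moves up, which is precisely why Neyman and Pearson held that choosing a test requires weighing the consequences of each kind of error, not just the evidence.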

Similar discussions are found in the philosophical literature. On the one side, Churchman (1948) and Rudner (1953) argued that because scientific hypotheses can never be completely verified, a complete analysis of the methods of scientific inference includes ethical judgments in which the scientists must decide whether the evidence is sufficiently strong, or the probability sufficiently high, to warrant the acceptance of the hypothesis, which again will depend on the importance of making a mistake in accepting or rejecting the hypothesis. Others, such as Jeffrey (1956) and Levi (1960), disagreed and instead defended a value-neutral view of science on which scientists should bracket their attitudes, preferences, temperament, and values when assessing the correctness of their inferences. For more details on this value-free ideal in the philosophy of science and its historical development, see Douglas (2009) and Howard (2003). For a broad set of case studies examining the role of values in science, see e.g. Elliott & Richards 2017.

In recent decades, philosophical discussions of the evaluation of probabilistic hypotheses by statistical inference have largely focused on Bayesianism, which understands probability as a measure of a person’s degree of belief in an event given the available information, and frequentism, which instead understands probability as the long-run frequency of a repeatable event. Hence, for Bayesians probabilities refer to a state of knowledge, whereas for frequentists probabilities refer to frequencies of events (see, e.g., Sober 2008, chapter 1 for a detailed introduction to Bayesianism and frequentism as well as to likelihoodism). Bayesianism aims at providing a quantifiable, algorithmic representation of belief revision, where belief revision is a function of prior beliefs (i.e., background knowledge) and incoming evidence. Bayesianism employs a rule based on Bayes’ theorem, a theorem of the probability calculus which relates conditional probabilities. The probability that a particular hypothesis is true is interpreted as a degree of belief, or credence, of the scientist. There will also be a probability and a degree of belief that a hypothesis will be true conditional on a piece of evidence (an observation, say) being true. Bayesianism prescribes that it is rational for the scientist to update their belief in the hypothesis to that conditional probability should it turn out that the evidence is, in fact, observed (see, e.g., Sprenger & Hartmann 2019 for a comprehensive treatment of Bayesian philosophy of science). Originating in the work of Neyman and Pearson, frequentism aims at providing the tools for reducing long-run error rates, such as the error-statistical approach developed by Mayo (1996), which focuses on how experimenters can avoid both type I and type II errors by building up a repertoire of procedures that detect errors if and only if they are present.
Both Bayesianism and frequentism have developed over time; they are interpreted in different ways by their various proponents, and their relations to previous criticisms of attempts at defining scientific method are seen differently by proponents and critics. The literature, surveys, reviews and criticism in this area are vast and the reader is referred to the entries on Bayesian epistemology and confirmation.
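The Bayesian updating rule described above amounts to a one-line application of Bayes' theorem, P(H|E) = P(E|H)·P(H) / P(E). A minimal sketch, with all the numbers invented for illustration:

```python
# Bayesian belief revision: the posterior credence in hypothesis H given
# evidence E is P(H|E) = P(E|H) P(H) / P(E), where the total probability
# of the evidence is P(E) = P(E|H) P(H) + P(E|not-H) P(not-H).

def bayes_update(prior, likelihood_h, likelihood_not_h):
    """Return the posterior P(H | E) from the prior P(H) and the
    likelihoods of the evidence under H and under not-H."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# A scientist starts at credence 0.2 in H; the observed evidence is
# three times likelier if H is true (0.9) than if it is false (0.3):
posterior = bayes_update(prior=0.2, likelihood_h=0.9, likelihood_not_h=0.3)
print(round(posterior, 3))  # 0.429
```

The evidence raises the credence from 0.2 to about 0.43: on the Bayesian picture, this conditional probability is exactly the degree of belief the rational scientist should adopt once the evidence is observed.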

5. Method in Practice

Attention to scientific practice, as we have seen, is not itself new. However, the recent turn to practice in the philosophy of science can be seen as a correction to the pessimism with respect to method in philosophy of science in later parts of the 20th century, and as an attempted reconciliation between sociological and rationalist explanations of scientific knowledge. Much of this work sees method as detailed and context-specific problem-solving procedures, and methodological analyses as at the same time descriptive, critical and advisory (see Nickles 1987 for an exposition of this view). The following subsections survey some of these practice focuses. From here on we turn fully to topics rather than chronology.

5.1 Creative and exploratory practices

A problem with the distinction between the contexts of discovery and justification that figured so prominently in philosophy of science in the first half of the 20th century (see section 2) is that no such distinction can be clearly seen in scientific activity (see Arabatzis 2006). Thus, in recent decades, it has been recognized that the study of conceptual innovation and change should not be confined to psychology and sociology of science, but that these are also important aspects of scientific practice which philosophy of science should address (see also the entry on scientific discovery). Looking for the practices that drive conceptual innovation has led philosophers to examine both the reasoning practices of scientists and the wide realm of experimental practices that are not directed narrowly at testing hypotheses, that is, exploratory experimentation.

Examining the reasoning practices of historical and contemporary scientists, Nersessian (2008) has argued that new scientific concepts are constructed as solutions to specific problems by systematic reasoning, and that analogy, visual representation and thought-experimentation are among the important reasoning practices employed. These ubiquitous forms of reasoning are reliable—but also fallible—methods of conceptual development and change. On her account, model-based reasoning consists of cycles of construction, simulation, evaluation and adaptation of models that serve as interim interpretations of the target problem to be solved. Often, this process will lead to modifications or extensions, and a new cycle of simulation and evaluation. However, Nersessian also emphasizes that

creative model-based reasoning cannot be applied as a simple recipe, is not always productive of solutions, and even its most exemplary usages can lead to incorrect solutions. (Nersessian 2008: 11)

Thus, while on the one hand she agrees with many previous philosophers that there is no logic of discovery, discoveries can derive from reasoned processes, such that a large and integral part of scientific practice is

the creation of concepts through which to comprehend, structure, and communicate about physical phenomena …. (Nersessian 1987: 11)

Similarly, work on heuristics for discovery and theory construction by scholars such as Darden (1991) and Bechtel & Richardson (1993) presents science as problem solving and investigates scientific problem solving as a special case of problem-solving in general. Drawing largely on cases from the biological sciences, much of their focus has been on reasoning strategies for the generation, evaluation, and revision of mechanistic explanations of complex systems.

Addressing another aspect of the context distinction, namely the traditional view that the primary role of experiments is to test theoretical hypotheses according to the H-D model, other philosophers of science have argued for additional roles that experiments can play. The notion of exploratory experimentation was introduced to describe experiments driven by the desire to obtain empirical regularities and to develop concepts and classifications in which these regularities can be described (Steinle 1997, 2002; Burian 1997; Waters 2007). However, the difference between theory-driven experimentation and exploratory experimentation should not be seen as a sharp distinction. Theory-driven experiments are not always directed at testing hypotheses, but may also be directed at various kinds of fact-gathering, such as determining numerical parameters. Vice versa, exploratory experiments are usually informed by theory in various ways and are therefore not theory-free. Instead, in exploratory experiments phenomena are investigated without first limiting the possible outcomes of the experiment on the basis of extant theory about the phenomena.

The development of high throughput instrumentation in molecular biology and neighbouring fields has given rise to a special type of exploratory experimentation that collects and analyses very large amounts of data. These new ‘omics’ disciplines are often said to represent a break with the ideal of hypothesis-driven science (Burian 2007; Elliott 2007; Waters 2007; O’Malley 2007), and are instead described as data-driven research (Leonelli 2012; Strasser 2012) or as a special kind of “convenience experimentation” in which many experiments are done simply because they are extraordinarily convenient to perform (Krohs 2012).

5.2 Computer methods and ‘new ways’ of doing science

The field of omics just described is possible because of the ability of computers to process, in a reasonable amount of time, the huge quantities of data required. Computers allow for more elaborate experimentation (higher speed, better filtering, more variables, sophisticated coordination and control), but also, through modelling and simulations, might constitute a form of experimentation themselves. Here, too, we can pose a version of the general question of method versus practice: does the practice of using computers fundamentally change scientific method, or merely provide a more efficient means of implementing standard methods?

Because computers can be used to automate measurements, quantifications, calculations, and statistical analyses where, for practical reasons, these operations cannot be otherwise carried out, many of the steps involved in reaching a conclusion on the basis of an experiment are now made inside a “black box”, without the direct involvement or awareness of a human. This has epistemological implications, regarding what we can know, and how we can know it. To have confidence in the results, computer methods are therefore subjected to tests of verification and validation.

The distinction between verification and validation is easiest to characterize in the case of computer simulations. In a typical computer simulation scenario computers are used to numerically integrate differential equations for which no analytic solution is available. The equations are part of the model the scientist uses to represent a phenomenon or system under investigation. Verifying a computer simulation means checking that the equations of the model are being correctly approximated. Validating a simulation means checking that the equations of the model are adequate for the inferences one wants to make on the basis of that model.
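The verification/validation distinction can be pictured with a small numerical sketch. The decay model, step sizes, and error bounds below are my own invented example, not drawn from the works cited here:

```python
import math

def simulate_decay(k, x0, t_end, dt):
    """Forward-Euler integration of the model equation dx/dt = -k*x,
    standing in for equations that lack an analytic solution."""
    x = x0
    for _ in range(round(t_end / dt)):
        x += dt * (-k * x)  # one Euler step
    return x

# Verification: check that the numerical scheme correctly approximates
# the model's own equation, here by confirming that the error against
# the analytic solution x0*exp(-k*t) shrinks as the step size is halved.
k, x0, t_end = 0.5, 1.0, 2.0
exact = x0 * math.exp(-k * t_end)
err_coarse = abs(simulate_decay(k, x0, t_end, 0.1) - exact)
err_fine = abs(simulate_decay(k, x0, t_end, 0.05) - exact)
assert err_fine < err_coarse  # smaller steps, smaller error: verified

# Validation, by contrast, would compare simulate_decay's output with
# measured data, asking whether dx/dt = -k*x is an adequate model of
# the phenomenon at all -- a question verification cannot answer.
```

A simulation can pass verification perfectly while failing validation: the integrator may approximate dx/dt = -k*x flawlessly even if the phenomenon being modelled does not decay exponentially at all.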

A number of issues related to computer simulations have been raised. The identification of verification and validation as the testing methods has been criticized. Oreskes et al. (1994) raise concerns that “validation”, because it suggests deductive inference, might lead to over-confidence in the results of simulations. The distinction itself is probably too clean, since actual practice in the testing of simulations mixes and moves back and forth between the two (Weissart 1997; Parker 2008a; Winsberg 2010). Computer simulations do seem to have a non-inductive character, given that the principles by which they operate are built in by the programmers, and any results of the simulation follow from those in-built principles in such a way that those results could, in principle, be deduced from the program code and its inputs. The status of simulations as experiments has therefore been examined (Kaufmann and Smarr 1993; Humphreys 1995; Hughes 1999; Norton and Suppe 2001). This literature considers the epistemology of these experiments: what we can learn by simulation, and also the kinds of justifications which can be given in applying that knowledge to the “real” world (Mayo 1996; Parker 2008b). As pointed out, part of the advantage of computer simulation derives from the fact that huge numbers of calculations can be carried out without requiring direct observation by the experimenter/​simulator. At the same time, many of these calculations are approximations to the calculations which would be performed first-hand in an ideal situation. Both factors introduce uncertainties into the inferences drawn from what is observed in the simulation.

For many of the reasons described above, computer simulations do not seem to belong clearly to either the experimental or theoretical domain. Rather, they seem to crucially involve aspects of both. This has led some authors, such as Fox Keller (2003: 200), to argue that we ought to consider computer simulation a “qualitatively different way of doing science”. The literature in general tends to follow Kaufmann and Smarr (1993) in referring to computer simulation as a “third way” for scientific methodology (theoretical reasoning and experimental practice being the first two ways). It should also be noted that the debates around these issues have tended to focus on the form of computer simulation typical in the physical sciences, where models are based on dynamical equations. Other forms of simulation might not have the same problems, or have problems of their own (see the entry on computer simulations in science).

In recent years, the rapid development of machine learning techniques has prompted some scholars to suggest that the scientific method has become “obsolete” (Anderson 2008; Carrol and Goodstein 2009). This has resulted in an intense debate on the relative merits of data-driven and hypothesis-driven research (for samples, see e.g. Mazzocchi 2015 or Succi and Coveney 2018). For a detailed treatment of this topic, we refer to the entry on scientific research and big data.

6. Discourse on scientific method

Despite philosophical disagreements, the idea of the scientific method still figures prominently in contemporary discourse on many different topics, both within science and in society at large. Often, reference to scientific method is used in ways that either convey the legend of a single, universal method characteristic of all science, or grant a particular method or set of methods privileged status as a special ‘gold standard’, often with reference to particular philosophers to vindicate the claims. Discourse on scientific method also typically arises when there is a need to distinguish between science and other activities, or to justify the special status conferred on science. In these areas, the philosophical attempts at identifying a set of methods characteristic of scientific endeavors are closely related to the philosophy of science’s classical problem of demarcation (see the entry on science and pseudo-science) and to the philosophical analysis of the social dimension of scientific knowledge and the role of science in democratic society.

One of the settings in which the legend of a single, universal scientific method has been particularly strong is science education (see, e.g., Bauer 1992; McComas 1996; Wivagg & Allchin 2002).[5] Often, ‘the scientific method’ is presented in textbooks and educational web pages as a fixed four- or five-step procedure starting from observations and description of a phenomenon, progressing through formulation of a hypothesis which explains the phenomenon and the design and conduct of experiments to test the hypothesis, then analyzing the results, and ending with drawing a conclusion. Such references to a universal scientific method can be found in educational material at all levels of science education (Blachowicz 2009), and numerous studies have shown that the idea of a general and universal scientific method often forms part of both students’ and teachers’ conception of science (see, e.g., Aikenhead 1987; Osborne et al. 2003). In response, it has been argued that science education needs to focus more on teaching about the nature of science, although views have differed on whether this is best done through student-led investigations, contemporary cases, or historical cases (Allchin, Andersen & Nielsen 2014).

Although occasionally phrased with reference to the H-D method, the legend in science education of a single, universal scientific method has important historical roots in the American philosopher and psychologist John Dewey’s account of inquiry in How We Think (1910) and the British mathematician Karl Pearson’s account of science in The Grammar of Science (1892). On Dewey’s account, inquiry is divided into the five steps of

(i) a felt difficulty, (ii) its location and definition, (iii) suggestion of a possible solution, (iv) development by reasoning of the bearing of the suggestions, (v) further observation and experiment leading to its acceptance or rejection. (Dewey 1910: 72)

Similarly, on Pearson’s account, scientific investigations start with measurement of data and observation of their correlation and sequence, from which scientific laws can be discovered with the aid of creative imagination. These laws have to be subjected to criticism, and their final acceptance will have equal validity for “all normally constituted minds”. Both Dewey’s and Pearson’s accounts should be seen as generalized abstractions of inquiry and not restricted to the realm of science—although both Dewey and Pearson referred to their respective accounts as ‘the scientific method’.

Occasionally, scientists make sweeping statements about a simple and distinct scientific method, as exemplified by Feynman’s simplified version of a conjectures-and-refutations method presented, for example, in the last of his 1964 Cornell Messenger lectures.[6] However, just as often scientists have come to the same conclusion as recent philosophy of science: that there is no unique, easily described scientific method. For example, the physicist and Nobel laureate Weinberg described in the paper “The Methods of Science … And Those By Which We Live” (1995) how

The fact that the standards of scientific success shift with time does not only make the philosophy of science difficult; it also raises problems for the public understanding of science. We do not have a fixed scientific method to rally around and defend. (1995: 8)

Interview studies with scientists on their conception of method show that scientists often find it hard to figure out whether available evidence confirms their hypothesis, and that there are no direct translations between general ideas about method and specific strategies to guide how research is conducted (Schickore & Hangel 2019; Hangel & Schickore 2017).

Reference to the scientific method has also often been used to argue for the scientific nature or special status of a particular activity. Philosophical positions that argue for a simple and unique scientific method as a criterion of demarcation, such as Popperian falsification, have often attracted practitioners who felt that they had a need to defend their domain of practice. For example, references to conjectures and refutation as the scientific method are abundant in much of the literature on complementary and alternative medicine (CAM)—alongside the competing position that CAM, as an alternative to conventional biomedicine, needs to develop its own methodology different from that of science.

Also within mainstream science, reference to the scientific method is used in arguments regarding the internal hierarchy of disciplines and domains. A frequently seen argument is that research based on the H-D method is superior to research based on induction from observations because in deductive inferences the conclusion follows necessarily from the premises. (See, e.g., Parascandola 1998 for an analysis of how this argument has been made to downgrade epidemiology compared to the laboratory sciences.) Similarly, based on an examination of the practices of major funding institutions such as the National Institutes of Health (NIH), the National Science Foundation (NSF) and the Biotechnology and Biological Sciences Research Council (BBSRC) in the UK, O’Malley et al. (2009) have argued that funding agencies seem to have a tendency to adhere to the view that the primary activity of science is to test hypotheses, while descriptive and exploratory research is seen as merely preparatory activity that is valuable only insofar as it fuels hypothesis-driven research.

In some areas of science, scholarly publications are structured in a way that may convey the impression of a neat and linear process of inquiry, from stating a question, through devising the methods by which to answer it and collecting the data, to drawing a conclusion from the analysis of the data. For example, the codified format of publications in most biomedical journals, known as the IMRAD format (Introduction, Methods, Results, and Discussion), is explicitly described by the journal editors as “not an arbitrary publication format but rather a direct reflection of the process of scientific discovery” (see the so-called “Vancouver Recommendations”, ICMJE 2013: 11). However, scientific publications do not in general reflect the process by which the reported scientific results were produced. For example, under the provocative title “Is the scientific paper a fraud?”, Medawar argued that scientific papers generally misrepresent how the results have been produced (Medawar 1963/1996). Similar views have been advanced by philosophers, historians and sociologists of science (Gilbert 1976; Holmes 1987; Knorr-Cetina 1981; Schickore 2008; Suppe 1998) who have argued that scientists’ experimental practices are messy and often do not follow any recognizable pattern. Publications of research results, they argue, are retrospective reconstructions of these activities that often do not preserve the temporal order or the logic of these activities, but are instead often constructed in order to screen off potential criticism (see Schickore 2008 for a review of this work).

Philosophical positions on the scientific method have also made it into the court room, especially in the US where judges have drawn on philosophy of science in deciding when to confer special status to scientific expert testimony. A key case is Daubert v. Merrell Dow Pharmaceuticals (92–102, 509 U.S. 579, 1993). In this case, the Supreme Court argued in its 1993 ruling that trial judges must ensure that expert testimony is reliable, and that in doing so the court must look at the expert’s methodology to determine whether the proffered evidence is actually scientific knowledge. Further, referring to the works of Popper and Hempel, the court stated that

ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge … is whether it can be (and has been) tested. (Justice Blackmun, Daubert v. Merrell Dow Pharmaceuticals; see Other Internet Resources for a link to the opinion)

But as argued by Haack (2005a,b, 2010) and by Foster & Huber (1999), by equating the question of whether a piece of testimony is reliable with the question of whether it is scientific as indicated by a special methodology, the court was producing an inconsistent mixture of Popper’s and Hempel’s philosophies, and this has later led to considerable confusion in subsequent case rulings that drew on the Daubert case (see Haack 2010 for a detailed exposition).

The difficulties around identifying the methods of science are also reflected in the difficulties of identifying scientific misconduct in the form of improper application of the method or methods of science. One of the first and most influential attempts at defining misconduct in science was the US definition from 1989 that defined misconduct as

fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community. (Code of Federal Regulations, part 50, subpart A, August 8, 1989; italics added)

However, the “other practices that seriously deviate” clause was heavily criticized because it could be used to suppress creative or novel science. For example, the National Academy of Sciences stated in its report Responsible Science (1992) that it

wishes to discourage the possibility that a misconduct complaint could be lodged against scientists based solely on their use of novel or unorthodox research methods. (NAS: 27)

This clause was therefore later removed from the definition. For an entry into the key philosophical literature on conduct in science, see Shamoo & Resnik (2009).

The question of the source of the success of science has been at the core of philosophy since the beginning of modern science. If viewed as a matter of epistemology more generally, scientific method is a part of the entire history of philosophy. Over that time, science and whatever methods its practitioners may employ have changed dramatically. Today, many philosophers have taken up the banners of pluralism or of practice to focus on what are, in effect, fine-grained and contextually limited examinations of scientific method. Others hope to shift perspectives in order to provide a renewed general account of what characterizes the activity we call science.

One such perspective has been offered recently by Hoyningen-Huene (2008, 2013), who argues from the history of philosophy of science that after three lengthy phases of characterizing science by its method, we are now in a phase where the belief in the existence of a positive scientific method has eroded and what has been left to characterize science is only its fallibility. First was a phase from Plato and Aristotle up until the 17th century where the specificity of scientific knowledge was seen in its absolute certainty established by proof from evident axioms; next was a phase up to the mid-19th century in which the means to establish the certainty of scientific knowledge had been generalized to include inductive procedures as well. In the third phase, which lasted until the last decades of the 20th century, it was recognized that empirical knowledge was fallible, but it was still granted a special status due to its distinctive mode of production. But now in the fourth phase, according to Hoyningen-Huene, historical and philosophical studies have shown how “scientific methods with the characteristics as posited in the second and third phase do not exist” (2008: 168) and there is no longer any consensus among philosophers and historians of science about the nature of science. For Hoyningen-Huene, this is too negative a stance, and he therefore urges that the question about the nature of science be raised anew. His own answer to this question is that “scientific knowledge differs from other kinds of knowledge, especially everyday knowledge, primarily by being more systematic” (Hoyningen-Huene 2013: 14). Systematicity can have several different dimensions: among them are more systematic descriptions, explanations, predictions, defense of knowledge claims, epistemic connectedness, ideal of completeness, knowledge generation, representation of knowledge and critical discourse.
Hence, what characterizes science is the greater care in excluding possible alternative explanations, the more detailed elaboration with respect to data on which predictions are based, the greater care in detecting and eliminating sources of error, the more articulate connections to other pieces of knowledge, etc. On this position, what characterizes science is not that the methods employed are unique to science, but that the methods are more carefully employed.

Another, similar approach has been offered by Haack (2003). Like Hoyningen-Huene, she sets off from a dissatisfaction with the recent clash between what she calls Old Deferentialism and New Cynicism. The Old Deferentialist position is that science progressed inductively by accumulating true theories confirmed by empirical evidence, or deductively by testing conjectures against basic statements; the New Cynics’ position is that science has no epistemic authority and no uniquely rational method and is merely politics. Haack insists that, contrary to the views of the New Cynics, there are objective epistemic standards, and there is something epistemologically special about science, even though the Old Deferentialists pictured this in the wrong way. Instead, she offers a new Critical Commonsensist account on which standards of good, strong, supportive evidence and well-conducted, honest, thorough and imaginative inquiry are not exclusive to the sciences, but are the standards by which we judge all inquirers. In this sense, science does not differ in kind from other kinds of inquiry, but it may differ in the degree to which it requires broad and detailed background knowledge and a familiarity with a technical vocabulary that only specialists may possess.

Bibliography

  • Aikenhead, G.S., 1987, “High-school graduates’ beliefs about science-technology-society. III. Characteristics and limitations of scientific knowledge”, Science Education, 71(4): 459–487.
  • Allchin, D., H.M. Andersen and K. Nielsen, 2014, “Complementary Approaches to Teaching Nature of Science: Integrating Student Inquiry, Historical Cases, and Contemporary Cases in Classroom Practice”, Science Education, 98: 461–486.
  • Anderson, C., 2008, “The end of theory: The data deluge makes the scientific method obsolete”, Wired magazine, 16(7).
  • Arabatzis, T., 2006, “On the inextricability of the context of discovery and the context of justification”, in Revisiting Discovery and Justification, J. Schickore and F. Steinle (eds.), Dordrecht: Springer, pp. 215–230.
  • Barnes, J. (ed.), 1984, The Complete Works of Aristotle, Vols I and II, Princeton: Princeton University Press.
  • Barnes, B. and D. Bloor, 1982, “Relativism, Rationalism, and the Sociology of Knowledge”, in Rationality and Relativism, M. Hollis and S. Lukes (eds.), Cambridge: MIT Press, pp. 1–20.
  • Bauer, H.H., 1992, Scientific Literacy and the Myth of the Scientific Method, Urbana: University of Illinois Press.
  • Bechtel, W. and R.C. Richardson, 1993, Discovering complexity, Princeton, NJ: Princeton University Press.
  • Berkeley, G., 1734, The Analyst, in De Motu and The Analyst: A Modern Edition with Introductions and Commentary, D. Jesseph (trans. and ed.), Dordrecht: Kluwer Academic Publishers, 1992.
  • Blachowicz, J., 2009, “How science textbooks treat scientific method: A philosopher’s perspective”, The British Journal for the Philosophy of Science, 60(2): 303–344.
  • Bloor, D., 1991, Knowledge and Social Imagery, Chicago: University of Chicago Press, 2nd edition.
  • Boyle, R., 1682, New experiments physico-mechanical, touching the air, Printed by Miles Flesher for Richard Davis, bookseller in Oxford.
  • Bridgman, P.W., 1927, The Logic of Modern Physics, New York: Macmillan.
  • –––, 1956, “The Methodological Character of Theoretical Concepts”, in The Foundations of Science and the Concepts of Science and Psychology, Herbert Feigl and Michael Scriven (eds.), Minnesota: University of Minneapolis Press, pp. 38–76.
  • Burian, R., 1997, “Exploratory Experimentation and the Role of Histochemical Techniques in the Work of Jean Brachet, 1938–1952”, History and Philosophy of the Life Sciences, 19(1): 27–45.
  • –––, 2007, “On microRNA and the need for exploratory experimentation in post-genomic molecular biology”, History and Philosophy of the Life Sciences, 29(3): 285–311.
  • Carnap, R., 1928, Der logische Aufbau der Welt, Berlin: Bernary; transl. by R.A. George, The Logical Structure of the World, Berkeley: University of California Press, 1967.
  • –––, 1956, “The methodological character of theoretical concepts”, Minnesota studies in the philosophy of science, 1: 38–76.
  • Carrol, S., and D. Goodstein, 2009, “Defining the scientific method”, Nature Methods, 6: 237.
  • Churchman, C.W., 1948, “Science, Pragmatics, Induction”, Philosophy of Science, 15(3): 249–268.
  • Cooper, J. (ed.), 1997, Plato: Complete Works, Indianapolis: Hackett.
  • Darden, L., 1991, Theory Change in Science: Strategies from Mendelian Genetics, Oxford: Oxford University Press.
  • Dewey, J., 1910, How we think, New York: Dover Publications (reprinted 1997).
  • Douglas, H., 2009, Science, Policy, and the Value-Free Ideal, Pittsburgh: University of Pittsburgh Press.
  • Dupré, J., 2004, “Miracle of Monism”, in Naturalism in Question, Mario De Caro and David Macarthur (eds.), Cambridge, MA: Harvard University Press, pp. 36–58.
  • Elliott, K.C., 2007, “Varieties of exploratory experimentation in nanotoxicology”, History and Philosophy of the Life Sciences, 29(3): 311–334.
  • Elliott, K.C., and T. Richards (eds.), 2017, Exploring inductive risk: Case studies of values in science, Oxford: Oxford University Press.
  • Falcon, Andrea, 2005, Aristotle and the science of nature: Unity without uniformity, Cambridge: Cambridge University Press.
  • Feyerabend, P., 1978, Science in a Free Society, London: New Left Books.
  • –––, 1988, Against Method, London: Verso, 2nd edition.
  • Fisher, R.A., 1955, “Statistical Methods and Scientific Induction”, Journal of the Royal Statistical Society. Series B (Methodological), 17(1): 69–78.
  • Foster, K. and P.W. Huber, 1999, Judging Science. Scientific Knowledge and the Federal Courts, Cambridge: MIT Press.
  • Fox Keller, E., 2003, “Models, Simulation, and ‘computer experiments’”, in The Philosophy of Scientific Experimentation, H. Radder (ed.), Pittsburgh: Pittsburgh University Press, pp. 198–215.
  • Gilbert, G., 1976, “The transformation of research findings into scientific knowledge”, Social Studies of Science, 6: 281–306.
  • Gimbel, S., 2011, Exploring the Scientific Method, Chicago: University of Chicago Press.
  • Goodman, N., 1965, Fact, Fiction, and Forecast, Indianapolis: Bobbs-Merrill.
  • Haack, S., 1995, “Science is neither sacred nor a confidence trick”, Foundations of Science, 1(3): 323–335.
  • –––, 2003, Defending science—within reason, Amherst: Prometheus.
  • –––, 2005a, “Disentangling Daubert: an epistemological study in theory and practice”, Journal of Philosophy, Science and Law, 5, available online. doi:10.5840/jpsl2005513
  • –––, 2005b, “Trial and error: The Supreme Court’s philosophy of science”, American Journal of Public Health, 95: S66–S73.
  • –––, 2010, “Federal Philosophy of Science: A Deconstruction-and a Reconstruction”, NYUJL & Liberty, 5: 394.
  • Hangel, N. and J. Schickore, 2017, “Scientists’ conceptions of good research practice”, Perspectives on Science, 25(6): 766–791.
  • Harper, W.L., 2011, Isaac Newton’s Scientific Method: Turning Data into Evidence about Gravity and Cosmology, Oxford: Oxford University Press.
  • Hempel, C., 1950, “Problems and Changes in the Empiricist Criterion of Meaning”, Revue Internationale de Philosophie, 41(11): 41–63.
  • –––, 1951, “The Concept of Cognitive Significance: A Reconsideration”, Proceedings of the American Academy of Arts and Sciences, 80(1): 61–77.
  • –––, 1965, Aspects of scientific explanation and other essays in the philosophy of science, New York–London: Free Press.
  • –––, 1966, Philosophy of Natural Science, Englewood Cliffs: Prentice-Hall.
  • Holmes, F.L., 1987, “Scientific writing and scientific discovery”, Isis, 78(2): 220–235.
  • Howard, D., 2003, “Two left turns make a right: On the curious political career of North American philosophy of science at midcentury”, in Logical Empiricism in North America, G.L. Hardcastle & A.W. Richardson (eds.), Minneapolis: University of Minnesota Press, pp. 25–93.
  • Hoyningen-Huene, P., 2008, “Systematicity: The nature of science”, Philosophia, 36(2): 167–180.
  • –––, 2013, Systematicity. The Nature of Science, Oxford: Oxford University Press.
  • Howie, D., 2002, Interpreting probability: Controversies and developments in the early twentieth century, Cambridge: Cambridge University Press.
  • Hughes, R., 1999, “The Ising Model, Computer Simulation, and Universal Physics”, in Models as Mediators, M. Morgan and M. Morrison (eds.), Cambridge: Cambridge University Press, pp. 97–145.
  • Hume, D., 1739, A Treatise of Human Nature, D. Fate Norton and M.J. Norton (eds.), Oxford: Oxford University Press, 2000.
  • Humphreys, P., 1995, “Computational science and scientific method”, Minds and Machines, 5(1): 499–512.
  • ICMJE, 2013, “Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals”, International Committee of Medical Journal Editors, available online, accessed August 13, 2014.
  • Jeffrey, R.C., 1956, “Valuation and Acceptance of Scientific Hypotheses”, Philosophy of Science, 23(3): 237–246.
  • Kaufmann, W.J., and L.L. Smarr, 1993, Supercomputing and the Transformation of Science, New York: Scientific American Library.
  • Knorr-Cetina, K., 1981, The Manufacture of Knowledge, Oxford: Pergamon Press.
  • Krohs, U., 2012, “Convenience experimentation”, Studies in History and Philosophy of Biological and Biomedical Sciences, 43: 52–57.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions, Chicago: University of Chicago Press.
  • Latour, B. and S. Woolgar, 1986, Laboratory Life: The Construction of Scientific Facts, Princeton: Princeton University Press, 2nd edition.
  • Laudan, L., 1968, “Theories of scientific method from Plato to Mach”, History of Science, 7(1): 1–63.
  • Lenhard, J., 2006, “Models and statistical inference: The controversy between Fisher and Neyman-Pearson”, The British Journal for the Philosophy of Science, 57(1): 69–91.
  • Leonelli, S., 2012, “Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences”, Studies in the History and Philosophy of the Biological and Biomedical Sciences, 43(1): 1–3.
  • Levi, I., 1960, “Must the scientist make value judgments?”, Philosophy of Science, 57(11): 345–357.
  • Lindley, D., 1991, Theory Change in Science: Strategies from Mendelian Genetics, Oxford: Oxford University Press.
  • Lipton, P., 2004, Inference to the Best Explanation, London: Routledge, 2nd edition.
  • Marks, H.M., 2000, The progress of experiment: science and therapeutic reform in the United States, 1900–1990, Cambridge: Cambridge University Press.
  • Mazzocchi, F., 2015, “Could Big Data be the end of theory in science?”, EMBO reports, 16: 1250–1255.
  • Mayo, D.G., 1996, Error and the Growth of Experimental Knowledge, Chicago: University of Chicago Press.
  • McComas, W.F., 1996, “Ten myths of science: Reexamining what we think we know about the nature of science”, School Science and Mathematics, 96(1): 10–16.
  • Medawar, P.B., 1963/1996, “Is the scientific paper a fraud?”, in The Strange Case of the Spotted Mouse and Other Classic Essays on Science, Oxford: Oxford University Press, pp. 33–39.
  • Mill, J.S., 1963, Collected Works of John Stuart Mill, J.M. Robson (ed.), Toronto: University of Toronto Press.
  • NAS, 1992, Responsible Science: Ensuring the integrity of the research process, Washington, DC: National Academy Press.
  • Nersessian, N.J., 1987, “A cognitive-historical approach to meaning in scientific theories”, in The process of science, N. Nersessian (ed.), Berlin: Springer, pp. 161–177.
  • –––, 2008, Creating Scientific Concepts, Cambridge: MIT Press.
  • Newton, I., 1726, Philosophiae naturalis Principia Mathematica (3rd edition), in The Principia: Mathematical Principles of Natural Philosophy: A New Translation, I.B. Cohen and A. Whitman (trans.), Berkeley: University of California Press, 1999.
  • –––, 1704, Opticks or A Treatise of the Reflections, Refractions, Inflections & Colors of Light, New York: Dover Publications, 1952.
  • Neyman, J., 1956, “Note on an Article by Sir Ronald Fisher”, Journal of the Royal Statistical Society. Series B (Methodological), 18: 288–294.
  • Nickles, T., 1987, “Methodology, heuristics, and rationality”, in Rational changes in science: Essays on Scientific Reasoning, J.C. Pitt (ed.), Berlin: Springer, pp. 103–132.
  • Nicod, J., 1924, Le problème logique de l’induction, Paris: Alcan. (Engl. transl. “The Logical Problem of Induction”, in Foundations of Geometry and Induction, London: Routledge, 2000.)
  • Nola, R. and H. Sankey, 2000a, “A selective survey of theories of scientific method”, in Nola and Sankey 2000b: 1–65.
  • –––, 2000b, After Popper, Kuhn and Feyerabend. Recent Issues in Theories of Scientific Method, London: Springer.
  • –––, 2007, Theories of Scientific Method, Stocksfield: Acumen.
  • Norton, S., and F. Suppe, 2001, “Why atmospheric modeling is good science”, in Changing the Atmosphere: Expert Knowledge and Environmental Governance, C. Miller and P. Edwards (eds.), Cambridge, MA: MIT Press, pp. 88–133.
  • O’Malley, M., 2007, “Exploratory experimentation and scientific practice: Metagenomics and the proteorhodopsin case”, History and Philosophy of the Life Sciences, 29(3): 337–360.
  • O’Malley, M., C. Haufe, K. Elliot, and R. Burian, 2009, “Philosophies of Funding”, Cell, 138: 611–615.
  • Oreskes, N., K. Shrader-Frechette, and K. Belitz, 1994, “Verification, Validation and Confirmation of Numerical Models in the Earth Sciences”, Science, 263(5147): 641–646.
  • Osborne, J., S. Simon, and S. Collins, 2003, “Attitudes towards science: a review of the literature and its implications”, International Journal of Science Education, 25(9): 1049–1079.
  • Parascandola, M., 1998, “Epidemiology—2nd-Rate Science”, Public Health Reports, 113(4): 312–320.
  • Parker, W., 2008a, “Franklin, Holmes and the Epistemology of Computer Simulation”, International Studies in the Philosophy of Science, 22(2): 165–183.
  • –––, 2008b, “Computer Simulation through an Error-Statistical Lens”, Synthese, 163(3): 371–384.
  • Pearson, K., 1892, The Grammar of Science, London: J.M. Dent and Sons, 1951.
  • Pearson, E.S., 1955, “Statistical Concepts in Their Relation to Reality”, Journal of the Royal Statistical Society, B, 17: 204–207.
  • Pickering, A., 1984, Constructing Quarks: A Sociological History of Particle Physics, Edinburgh: Edinburgh University Press.
  • Popper, K.R., 1959, The Logic of Scientific Discovery, London: Routledge, 2002.
  • –––, 1963, Conjectures and Refutations, London: Routledge, 2002.
  • –––, 1985, Unended Quest: An Intellectual Autobiography, La Salle: Open Court Publishing Co.
  • Rudner, R., 1953, “The Scientist Qua Scientist Making Value Judgments”, Philosophy of Science, 20(1): 1–6.
  • Rudolph, J.L., 2005, “Epistemology for the masses: The origin of ‘The Scientific Method’ in American Schools”, History of Education Quarterly, 45(3): 341–376.
  • Schickore, J., 2008, “Doing science, writing science”, Philosophy of Science, 75: 323–343.
  • Schickore, J. and N. Hangel, 2019, “‘It might be this, it should be that…’ uncertainty and doubt in day-to-day science practice”, European Journal for Philosophy of Science, 9(2): 31. doi:10.1007/s13194-019-0253-9
  • Shamoo, A.E. and D.B. Resnik, 2009, Responsible Conduct of Research, Oxford: Oxford University Press.
  • Shank, J.B., 2008, The Newton Wars and the Beginning of the French Enlightenment, Chicago: The University of Chicago Press.
  • Shapin, S. and S. Schaffer, 1985, Leviathan and the air-pump, Princeton: Princeton University Press.
  • Smith, G.E., 2002, “The Methodology of the Principia”, in The Cambridge Companion to Newton, I.B. Cohen and G.E. Smith (eds.), Cambridge: Cambridge University Press, pp. 138–173.
  • Snyder, L.J., 1997a, “Discoverers’ Induction”, Philosophy of Science, 64: 580–604.
  • –––, 1997b, “The Mill-Whewell Debate: Much Ado About Induction”, Perspectives on Science, 5: 159–198.
  • –––, 1999, “Renovating the Novum Organum: Bacon, Whewell and Induction”, Studies in History and Philosophy of Science , 30: 531–557.
  • Sober, E., 2008, Evidence and Evolution. The logic behind the science , Cambridge: Cambridge University Press
  • Sprenger, J. and S. Hartmann, 2019, Bayesian philosophy of science , Oxford: Oxford University Press.
  • Steinle, F., 1997, “Entering New Fields: Exploratory Uses of Experimentation”, Philosophy of Science (Proceedings), 64: S65–S74.
  • –––, 2002, “Experiments in History and Philosophy of Science”, Perspectives on Science , 10(4): 408–432.
  • Strasser, B.J., 2012, “Data-driven sciences: From wonder cabinets to electronic databases”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 43(1): 85–87.
  • Succi, S. and P.V. Coveney, 2018, “Big data: the end of the scientific method?”, Philosophical Transactions of the Royal Society A , 377: 20180145. doi:10.1098/rsta.2018.0145
  • Suppe, F., 1998, “The Structure of a Scientific Paper”, Philosophy of Science , 65(3): 381–405.
  • Swijtink, Z.G., 1987, “The objectification of observation: Measurement and statistical methods in the nineteenth century”, in The probabilistic revolution. Ideas in History, Vol. 1 , L. Kruger (ed.), Cambridge MA: MIT Press, pp. 261–285.
  • Waters, C.K., 2007, “The nature and context of exploratory experimentation: An introduction to three case studies of exploratory research”, History and Philosophy of the Life Sciences , 29(3): 275–284.
  • Weinberg, S., 1995, “The methods of science… and those by which we live”, Academic Questions , 8(2): 7–13.
  • Weissert, T., 1997, The Genesis of Simulation in Dynamics: Pursuing the Fermi-Pasta-Ulam Problem , New York: Springer Verlag.
  • William H., 1628, Exercitatio Anatomica de Motu Cordis et Sanguinis in Animalibus , in On the Motion of the Heart and Blood in Animals , R. Willis (trans.), Buffalo: Prometheus Books, 1993.
  • Winsberg, E., 2010, Science in the Age of Computer Simulation , Chicago: University of Chicago Press.
  • Wivagg, D. & D. Allchin, 2002, “The Dogma of the Scientific Method”, The American Biology Teacher , 64(9): 645–646
How to cite this entry . Preview the PDF version of this entry at the Friends of the SEP Society . Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers , with links to its database.
  • Blackmun opinion , in Daubert v. Merrell Dow Pharmaceuticals (92–102), 509 U.S. 579 (1993).
  • Scientific Method at philpapers. Darrell Rowbottom (ed.).
  • Recent Articles | Scientific Method | The Scientist Magazine

al-Kindi | Albert the Great [= Albertus magnus] | Aquinas, Thomas | Arabic and Islamic Philosophy, disciplines in: natural philosophy and natural science | Arabic and Islamic Philosophy, historical and methodological topics in: Greek sources | Arabic and Islamic Philosophy, historical and methodological topics in: influence of Arabic and Islamic Philosophy on the Latin West | Aristotle | Bacon, Francis | Bacon, Roger | Berkeley, George | biology: experiment in | Boyle, Robert | Cambridge Platonists | confirmation | Descartes, René | Enlightenment | epistemology | epistemology: Bayesian | epistemology: social | Feyerabend, Paul | Galileo Galilei | Grosseteste, Robert | Hempel, Carl | Hume, David | Hume, David: Newtonianism and Anti-Newtonianism | induction: problem of | Kant, Immanuel | Kuhn, Thomas | Leibniz, Gottfried Wilhelm | Locke, John | Mill, John Stuart | More, Henry | Neurath, Otto | Newton, Isaac | Newton, Isaac: philosophy | Ockham [Occam], William | operationalism | Peirce, Charles Sanders | Plato | Popper, Karl | rationality: historicist theories of | Reichenbach, Hans | reproducibility, scientific | Schlick, Moritz | science: and pseudo-science | science: theory and observation in | science: unity of | scientific discovery | scientific knowledge: social dimensions of | simulations in science | skepticism: medieval | space and time: absolute and relational space and motion, post-Newtonian theories | Vienna Circle | Whewell, William | Zabarella, Giacomo



A Guide to Using the Scientific Method in Everyday Life


The scientific method, the process scientists use to understand the natural world, has the merit of investigating natural phenomena in a rigorous manner. Working from hypotheses, scientists draw conclusions based on empirical data. These data are validated across large numbers of observations and take into account the intrinsic variability of the real world. To people unfamiliar with its jargon and formalities, science may seem esoteric, and this is a huge problem: science invites criticism when it is not easily understood. So why is it important that every person understand how science is done?

Because the scientific method is, first of all, a matter of logical reasoning and only afterwards, a procedure to be applied in a laboratory.

Individuals without training in logical reasoning more easily fall victim to distorted perspectives about themselves and the world. One example is the so-called “cognitive biases”: systematic mistakes that individuals make when they try to think rationally, and which lead to erroneous or inaccurate conclusions. People can easily overestimate the relevance of their own behaviors and choices. They can lack the ability to judge the quality of their own performance and thinking. Unconsciously, they may even end up selecting only the arguments that support their hypothesis or beliefs. This is why the scientific framework should be conceived not only as a mechanism for understanding the natural world, but also as a framework for engaging in logical reasoning and discussion.

A brief history of the scientific method

The scientific method has its roots in the sixteenth and seventeenth centuries. The philosophers Francis Bacon and René Descartes are often credited with formalizing it because they opposed the idea that research should be guided by preconceived metaphysical notions of the nature of reality, a position that, at the time, was widely held by their contemporaries. In essence, Bacon held that inductive reasoning based on empirical observation was critical to the formulation of hypotheses and the generation of new understanding: general or universal principles describing how nature works are derived only from observations of recurring phenomena and the data recorded from them. The inductive method was used, for example, by the scientist Rudolf Virchow to formulate the third principle of the well-known cell theory, according to which every cell derives from a pre-existing one. The rationale behind this conclusion is that because all observations of cell behavior show that cells derive only from other cells, the assertion must always be true.

Inductive reasoning, however, is not immune to mistakes and limitations. Returning to cell theory, there may be rare occasions in which a cell does not arise from a pre-existing one, even if we have not yet observed them; our observations of cell behavior, although numerous, can still benefit from additional observations that either refute or support the conclusion that all cells arise from pre-existing ones. This is how limited observations can lead to erroneous conclusions reasoned inductively. To take another example, someone who has never seen a swan that is not white might conclude that all swans are white, even though black swans do exist, however rare they may be.

The universally accepted scientific method, as practiced in science laboratories today, is grounded in hypothetico-deductive reasoning. Research progresses through iterative empirical testing of testable hypotheses (which are themselves formulated through inductive reasoning). A testable hypothesis is one that can be rejected (falsified) by empirical observation, a concept known as the principle of falsification. Initially, ideas and conjectures are formulated; experiments are then performed to test them. If the body of evidence fails to reject a hypothesis, the hypothesis stands, but only until some empirical observation, even a single one, falsifies it. Just as with inductive reasoning, however, hypothetico-deductive reasoning is not immune to pitfalls: assumptions built into hypotheses can be shown to be false, thereby nullifying previously unrejected hypotheses. The bottom line is that science does not work to prove anything about the natural world. Instead, it builds hypotheses that explain the natural world and then attempts to find the hole in the reasoning; that is, it works to disprove things about the natural world.
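The falsification loop described above can be sketched in a few lines of Python. This is a toy illustration with invented swan observations, not part of the original article: the hypothesis survives only as long as no observation contradicts it.

```python
# Toy sketch of hypothetico-deductive testing: the hypothesis
# "all swans are white" stands until a single observation falsifies it.

def hypothesis_holds(swan_color: str) -> bool:
    """Prediction derived from the hypothesis: every swan is white."""
    return swan_color == "white"

# Hypothetical stream of empirical observations (invented data).
observations = ["white", "white", "white", "black", "white"]

falsified = False
for i, color in enumerate(observations, start=1):
    if not hypothesis_holds(color):
        falsified = True  # one counterexample is enough to reject
        print(f"Observation {i} ({color} swan) falsifies the hypothesis.")
        break

if not falsified:
    print("Hypothesis stands, for now: no observation has rejected it.")
```

Note the asymmetry: a run containing only white swans would leave the hypothesis standing but still unproven, which is precisely the point of falsificationism.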

How do scientists test hypotheses?

Controlled experiments

The word “experiment” can be misleading because it suggests a lack of control over the process. It is therefore important to understand that science uses controlled experiments to test hypotheses and contribute new knowledge. So what exactly is a controlled experiment?

Let us take a practical example. Our starting hypothesis is the following: we have a novel drug that we think inhibits the division of cells, meaning that it prevents one cell from dividing into two cells (recall the description of cell theory above). To test this hypothesis, we could treat some cells with the drug on a plate that contains nutrients and fuel required for their survival and division (a standard cell biology assay). If the drug works as expected, the cells should stop dividing. This type of drug might be useful, for example, in treating cancers because slowing or stopping the division of cells would result in the slowing or stopping of tumor growth.

Although this experiment is relatively easy to do, the mere process of doing science means that several experimental variables (such as the temperature of the cells or of the drug, the dosage, and so on) could play a major role in the experiment. These could make the experiment fail even when the drug actually works, or make the drug appear to work when it does not. Since these variables cannot be eliminated, scientists always run control experiments in parallel with the real ones, so that the effects of these other variables can be determined. Control experiments are designed so that all variables, with the exception of the one under investigation, are kept constant. In simple terms, the conditions must be identical between the control and the actual experiment.

Coming back to our example: a drug is not administered pure. Often, it is dissolved in a solvent such as water or oil. The ideal control for the actual experiment is therefore to administer pure solvent (without the added drug) at the same time, with the same tools, and with all other experimental variables (such as temperature, as mentioned above) identical between the two (Figure 1). Any difference in effect on cell division in the actual experiment can then be attributed to the drug, because the effects of the solvent were controlled for.
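The logic of the solvent control can be sketched as a small simulation. Everything here, the baseline division count, the assumed drug effect, and the noise range, is invented for illustration; only the structure (one condition differs, all else constant) mirrors the experiment described above.

```python
import random

random.seed(0)  # fixed seed, in the spirit of reproducible experiments

def divisions_observed(drug_present: bool) -> int:
    """Hypothetical assay readout: cell divisions counted on one plate.
    Temperature, nutrients, and solvent are identical for both plates;
    only the presence of the drug differs."""
    baseline = 100                           # divisions with solvent alone (assumed)
    inhibition = 60 if drug_present else 0   # assumed drug effect
    noise = random.randint(-5, 5)            # unavoidable experimental variability
    return baseline - inhibition + noise

control = divisions_observed(drug_present=False)  # pure solvent
treated = divisions_observed(drug_present=True)   # solvent + drug

# Because only the drug differs between the two conditions, the gap
# between the counts can be attributed to the drug, not the solvent.
print(f"control: {control} divisions, treated: {treated} divisions")
```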


To provide evidence of the quality of a single, specific experiment, it must be performed multiple times under the same experimental conditions. These repeated runs are called “replicates” of the experiment (Figure 2). The more replicates of the same experiment, the more confident the scientist can be about its conclusions under the given conditions. However, multiple replicates under the same experimental conditions are of no help when scientists aim to acquire more empirical evidence for their hypothesis. For that, they need independent experiments (Figure 3), in their own lab and in other labs across the world, to validate their results.
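With replicate counts in hand, the spread across replicates is what quantifies how repeatable the measurement is. A minimal sketch using Python's standard `statistics` module, with invented numbers:

```python
import statistics

# Hypothetical division counts from five replicates of the SAME treated
# condition (same plate type, temperature, dose, and protocol).
replicates = [42, 38, 45, 40, 41]

mean = statistics.mean(replicates)     # best estimate under these conditions
spread = statistics.stdev(replicates)  # sample standard deviation

# More replicates narrow the uncertainty for THIS setup, but they cannot
# substitute for independent experiments in other laboratories.
print(f"mean = {mean:.1f} divisions, sample std dev = {spread:.2f}")
```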


Often, especially when a given experiment has been repeated and its outcome is still not fully clear, it is better to find alternative experimental assays to test the hypothesis.


Applying the scientific approach to everyday life

So, what can we take from the scientific approach to apply to our everyday lives?

A few weeks ago, I had an animated conversation with a group of friends about the following question: What is the definition of intelligence?

Defining “intelligence” is not easy. At the beginning of the conversation, everybody had a different, personal conception of intelligence in mind, which tacitly implied that the conversation could take several different directions. We soon realized that one person thought an intelligent person is whoever adapts fastest to new situations; someone else thought an intelligent person is whoever can deal with other people and empathize with them. Personally, I thought an intelligent person is whoever displays high cognitive skills, especially in abstract reasoning.

The scientific method has the merit of providing a reference system, with precise protocols and rules to follow. Remember: experiments must be reproducible, meaning that an independent scientist in a different laboratory, provided with the same equipment and protocols, should obtain comparable results. Fruitful conversations likewise need precise language, a kind of reference vocabulary everybody agrees upon, in order to discuss the same “content”. This is something we often forget, and something that was missing at the opening of the aforementioned conversation: even among friends, we should always agree on the premises, and define them rigorously, so that they are the same for everybody. When speaking about “intelligence”, we must all make sure we understand the meaning and context of the vocabulary adopted in the debate (Figure 4, point 1). This is the first step in “controlling” a conversation.

There is another pitfall that a discussion well grounded in a scientific framework would avoid: failing to structure the debate so that all its elements, except the one under investigation, are kept constant (Figure 4, point 2). This is particularly true when people make comparisons between groups to support a claim. For example, they may try to define intelligence by comparing the achievements in life of different individuals: “Stephen Hawking is a brilliant example of intelligence because of his great contribution to the physics of black holes.” This statement does not help define intelligence, simply because it compares Stephen Hawking, a famous and exceptional physicist, with people who, statistically speaking, know nothing about physics. Hawking studied first at the University of Oxford and then at the University of Cambridge, and he was in contact with the most influential physicists on Earth. Other people were not. None of this disproves Hawking’s intelligence, of course; but from a logical and methodological point of view, given the multitude of variables involved in the comparison, it cannot prove it either. The sentence “Stephen Hawking is a brilliant example of intelligence because of his great contribution to the physics of black holes” is therefore not a valid argument for describing what intelligence is. If we really intend to approximate a definition of intelligence, Stephen Hawking should be compared with other physicists, ideally his classmates in college and his colleagues during his years of academic research.

In simple terms, while debating we should do what scientists do in the lab: compare groups of elements that display identical, or highly similar, features. As previously mentioned, all variables except the one under investigation must be kept constant.

This insightful piece  presents a detailed analysis of how and why science can help to develop critical thinking.


In a nutshell

Here is how to approach a daily conversation in a rigorous, scientific manner:

  • First agree on the reference vocabulary; then discuss the content. Think of a researcher writing an experimental protocol that will be used by thousands of other scientists on different continents. If the protocol is rigorously written, every scientist using it should get comparable experimental outcomes. In science this means reproducible knowledge; in daily life it means fruitful conversations in which individuals are on the same page.
  • Adopt “controlled” arguments to support your claims. When making comparisons between groups, visualize two blank scenarios. As you start adding details to them, you have two options. If your aim is to hide a specific detail, the better choice is to design the two scenarios completely differently, that is, to increase the number of variables. But if your intention is to help the observer isolate a specific detail, the better choice is to design identical scenarios except for the intended detail, that is, to keep most of the variables constant. This is precisely how scientists design adequate experiments to isolate new pieces of knowledge, and how individuals should organize their thoughts in order to test them and make them easier for others to understand.

The scientific method should offer individuals not an elitist way to investigate reality, but an accessible tool for reasoning about it and discussing it properly.

Edited by Jason Organ, PhD, Indiana University School of Medicine.


Simone is a molecular biologist on the verge of obtaining a doctoral title at the University of Ulm, Germany. He is Vice-Director at Culturico (https://culturico.com/), where his writings span from Literature to Sociology, from Philosophy to Science. His writings recently appeared in Psychology Today, openDemocracy, Splice Today, Merion West, Uncommon Ground and The Society Pages. Follow Simone on Twitter: @simredaelli



Chemistry LibreTexts

1.3: The Scientific Method - How Chemists Think


Learning Objectives

  • Identify the components of the scientific method.

Scientists search for answers to questions and solutions to problems by using a procedure called the scientific method. This procedure consists of making observations, formulating hypotheses, and designing experiments, which in turn lead to additional observations, hypotheses, and experiments in repeated cycles (Figure \(\PageIndex{1}\)).


Step 1: Make observations

Observations can be qualitative or quantitative. Qualitative observations describe properties or occurrences in ways that do not rely on numbers. Examples of qualitative observations include the following: "the outside air temperature is cooler during the winter season," "table salt is a crystalline solid," "sulfur crystals are yellow," and "dissolving a penny in dilute nitric acid forms a blue solution and a brown gas." Quantitative observations are measurements, which by definition consist of both a number and a unit. Examples of quantitative observations include the following: "the melting point of crystalline sulfur is 115.21° Celsius," and "35.9 grams of table salt—the chemical name of which is sodium chloride—dissolve in 100 grams of water at 20° Celsius." For the question of the dinosaurs’ extinction, the initial observation was quantitative: iridium concentrations in sediments dating to 66 million years ago were 20–160 times higher than normal.

Step 2: Formulate a hypothesis

After deciding to learn more about an observation or a set of observations, scientists generally begin an investigation by forming a hypothesis, a tentative explanation for the observation(s). The hypothesis may not be correct, but it puts the scientist’s understanding of the system being studied into a form that can be tested. For example, the observation that we experience alternating periods of light and darkness corresponding to observed movements of the sun, moon, clouds, and shadows is consistent with either one of two hypotheses:

  • Earth rotates on its axis every 24 hours, alternately exposing one side to the sun.
  • The sun revolves around Earth every 24 hours.

Suitable experiments can be designed to choose between these two alternatives. For the disappearance of the dinosaurs, the hypothesis was that the impact of a large extraterrestrial object caused their extinction. Unfortunately (or perhaps fortunately), this hypothesis does not lend itself to direct testing by any obvious experiment, but scientists can collect additional data that either support or refute it.

Step 3: Design and perform experiments

After a hypothesis has been formed, scientists conduct experiments to test its validity. Experiments are systematic observations or measurements, preferably made under controlled conditions, that is, conditions in which only a single variable is changed.

Step 4: Accept or modify the hypothesis

A properly designed and executed experiment enables a scientist to determine whether or not the original hypothesis is valid. If the hypothesis is valid, the scientist can proceed to step 5. In other cases, experiments often demonstrate that the hypothesis is incorrect or that it must be modified and requires further experimentation.

Step 5: Development into a law and/or theory

More experimental data are then collected and analyzed, at which point a scientist may begin to think that the results are sufficiently reproducible (i.e., dependable) to merit being summarized in a law, a verbal or mathematical description of a phenomenon that allows for general predictions. A law simply states what happens; it does not address the question of why.

One example of a law, the law of definite proportions, which was discovered by the French scientist Joseph Proust (1754–1826), states that a chemical substance always contains the same proportions of elements by mass. Thus, sodium chloride (table salt) always contains the same proportion by mass of sodium to chlorine, in this case 39.34% sodium and 60.66% chlorine by mass, and sucrose (table sugar) is always 42.11% carbon, 6.48% hydrogen, and 51.41% oxygen by mass.
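The percentages quoted above can be checked directly from standard atomic masses. A short sketch (the atomic masses are rounded reference values):

```python
# Verify the law-of-definite-proportions figures from atomic masses (g/mol).
ATOMIC_MASS = {"Na": 22.99, "Cl": 35.45, "C": 12.011, "H": 1.008, "O": 15.999}

def mass_percent(formula):
    """Mass percent of each element, given a dict of atom counts."""
    total = sum(ATOMIC_MASS[el] * n for el, n in formula.items())
    return {el: 100 * ATOMIC_MASS[el] * n / total for el, n in formula.items()}

nacl = mass_percent({"Na": 1, "Cl": 1})              # sodium chloride
sucrose = mass_percent({"C": 12, "H": 22, "O": 11})  # table sugar

print({el: round(p, 2) for el, p in nacl.items()})
print({el: round(p, 2) for el, p in sucrose.items()})
```

Both results reproduce the proportions stated in the text: 39.34%/60.66% for sodium chloride, and 42.11%/6.48%/51.41% for sucrose.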

Whereas a law states only what happens, a theory attempts to explain why nature behaves as it does. Laws are unlikely to change greatly over time unless a major experimental error is discovered. In contrast, a theory, by definition, is incomplete and imperfect, evolving with time to explain new facts as they are discovered.

Because scientists can enter the cycle shown in Figure \(\PageIndex{1}\) at any point, the actual application of the scientific method to different topics can take many different forms. For example, a scientist may start with a hypothesis formed by reading about work done by others in the field, rather than by making direct observations.

Example \(\PageIndex{1}\)

Classify each statement as a law, a theory, an experiment, a hypothesis, or an observation.

  • Ice always floats on liquid water.
  • Birds evolved from dinosaurs.
  • Hot air is less dense than cold air, probably because the components of hot air are moving more rapidly.
  • When 10 g of ice were added to 100 mL of water at 25°C, the temperature of the water decreased to 15.5°C after the ice melted.
  • The ingredients of Ivory soap were analyzed to see whether it really is 99.44% pure, as advertised.

Solution

  • This is a general statement of a relationship between the properties of liquid and solid water, so it is a law.
  • This is a possible explanation for the origin of birds, so it is a hypothesis.
  • This is a statement that tries to explain the relationship between the temperature and the density of air based on fundamental principles, so it is a theory.
  • The temperature is measured before and after a change is made in a system, so these are observations.
  • This is an analysis designed to test a hypothesis (in this case, the manufacturer’s claim of purity), so it is an experiment.

Exercise \(\PageIndex{1}\) 

Classify each statement as a law, a theory, an experiment, a hypothesis, a qualitative observation, or a quantitative observation.

  • Measured amounts of acid were added to a Rolaids tablet to see whether it really “consumes 47 times its weight in excess stomach acid.”
  • Heat always flows from hot objects to cooler ones, not in the opposite direction.
  • The universe was formed by a massive explosion that propelled matter into a vacuum.
  • Michael Jordan is the greatest pure shooter to ever play professional basketball.
  • Limestone is relatively insoluble in water, but dissolves readily in dilute acid with the evolution of a gas.

The scientific method is a procedure for investigation involving experimentation and observation to acquire new knowledge, solve problems, and answer questions. The key steps in the scientific method include the following:

  • Step 1: Make observations.
  • Step 2: Formulate a hypothesis.
  • Step 3: Test the hypothesis through experimentation.
  • Step 4: Accept or modify the hypothesis.
  • Step 5: Develop into a law and/or a theory.
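Because the method is a cycle rather than a straight line, it can be caricatured as a loop that revises a hypothesis until it survives testing. The "experiment" below is an invented stand-in (checking a guessed proportionality against recorded measurements), meant only to show the accept-or-modify structure of steps 2 through 4:

```python
# Minimal sketch of the observe -> hypothesize -> test -> revise cycle.
observations = [(0, 0), (1, 2), (2, 4), (3, 6)]  # invented (x, y) measurements

slope = 1  # step 2: initial hypothesis, "y = slope * x"
while True:
    # Step 3: test the hypothesis against every observation.
    if all(slope * x == y for x, y in observations):
        break      # step 4: hypothesis survives; a candidate regularity (law)
    slope += 1     # step 4: hypothesis rejected, so modify it and retest

print(f"surviving hypothesis: y = {slope} * x")
```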



Identifying and solving scientific problems in medicine: the key to becoming a competent scientist

The scientific method can be described as a detailed, multistep process in which finding the best question is the first and most crucial step. A scientific problem should therefore be examined thoroughly from different angles and perspectives. The amount and diversity of scientific data are increasing enormously and becoming more specialized day by day, so traditional observational biology alone is not sufficient to understand and treat multifactorial diseases. Moreover, protocols, documentation, outcome measures, precision, and the weighing of evidence should be improved so that scientific questions can be answered correctly during research. Because of the diversity of data and methods, statisticians and methodologists should be involved in, and contribute to, all stages of research. In addition, all scientific data should be fully reproducible and repeatable. Scientific knowledge is in a state of flux and becomes more complex day by day; becoming a competent scientist therefore requires abilities and skills such as creativity, hard work, and self-discipline, all of which demand lifelong learning, searching, and a consistent widening of one's scientific horizons.

Introduction

The scientific method in medicine comprises research design, the conduct of research, data analysis, and interpretation, all of which contribute to solving a specified problem. Research designs can be categorized as case studies, surveys, observational studies, semi-experimental or experimental studies, reviews, meta-analyses, or comparative studies [ 1 ]. Before choosing a research design, however, finding the best question, whether it concerns huge populations such as patients with diabetes or cancer or small groups such as people with rare diseases, is the first and most crucial step. Although each rare disease affects few individuals, collectively these diseases affect many people worldwide, since most have no cure [ 2 ], [ 3 ], [ 4 ].

Present problems in the medical and biological sciences should be examined thoroughly from different angles and perspectives to find the best scientific question. Researchers should therefore consistently widen their scientific horizons and develop deep insight into their specific fields [ 5 ], [ 6 ], [ 7 ], [ 8 ], [ 9 ], [ 10 ], [ 11 ]. The amount and diversity of scientific data are increasing enormously and becoming more specialized day by day, so traditional observational biology alone is not sufficient to understand and treat multifactorial diseases such as obesity, cancer, or neurological disorders. Every dataset contributes to scientific knowledge worldwide. Access to large-scale data through omic technologies such as lipidomics, metabolomics, proteomics, and genomics has revolutionized the medical and biological sciences, enabling scientists to reveal the complex mechanisms behind diseases that affect either large or small populations. Thus, not only defining the problem but also knowing how to analyze and integrate the scientific data is crucial to becoming a competent scientist [ 4 ], [ 5 ], [ 6 ], [ 7 ], [ 8 ], [ 9 ], [ 10 ].

Protocols, documentation, outcome measures, precision, and the weighing of evidence should be improved for data analysis and interpretation. Other factors in research design, such as originality and the instruments used in the experiments, also affect research quality; together, these parameters increase the validity and reliability of a study [ 11 ]. Because the methods used in each field are diversifying day by day, choosing the best and most effective methods plays a vital role in obtaining the most accurate and reliable data. Statisticians and methodologists should therefore be involved in, and contribute to, all stages of medical and biological research. In addition, all scientific data and procedures should be fully reproducible and repeatable in every area of the discipline, including medicine [ 11 ].
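The computational side of reproducibility can be illustrated with a small, generic sketch (the simulated measurements below are invented for demonstration and are not from the article): fixing the random seed makes a simulated analysis exactly repeatable.

```python
# Generic illustration of computational reproducibility (invented example):
# fixing the random seed makes a simulated analysis exactly repeatable.
import random
import statistics

def simulate_measurements(seed, n=100):
    rng = random.Random(seed)  # seeded generator -> deterministic run
    return [rng.gauss(mu=5.0, sigma=1.0) for _ in range(n)]

run1 = simulate_measurements(seed=42)
run2 = simulate_measurements(seed=42)  # same seed, same protocol

# Identical inputs and protocol reproduce the identical result.
print(run1 == run2)  # True
print(round(statistics.mean(run1), 3))
```

The same principle applies to a full analysis pipeline: recording seeds, software versions, and protocols lets an independent group rerun the work and obtain the same numbers.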

The scientific world is continuously progressing and improving itself day by day. New methods and data-analysis approaches, including various omic technologies, are revolutionizing medical research. Researchers now encounter new concepts, such as subtyping patients by disease to reveal biomarkers, that enable the development of personalized medicine. Personalized treatments are promising therapeutic approaches that increase treatment efficacy and reduce side effects, and they enable us to predict disease susceptibility; together, these advances improve human health [ 10 ], [ 11 ], [ 12 ].

A researcher's creativity, critical-thinking skills, abilities, and success are directly correlated with deep knowledge of a specific topic, of current technologies, and of data analysis and interpretation. Because science builds continuously from past to present, every step we take reflects one step in that evolution. Scientific knowledge is in a state of flux and becomes more complex and competitive day by day. Becoming a competent scientist therefore demands skills such as creativity, hard work, and self-discipline, since the process is a lifelong journey of continual learning, searching, and widening of one's scientific horizons.

The world has now recognized the importance of identifying the causes of various public-health concerns, since this is the key to solving them. Finding the best question, whether it concerns huge populations or small groups, is therefore the first and most crucial step in the medical and biological sciences. The amount and diversity of scientific data and novel methods are increasing enormously and becoming more specialized day by day, so a researcher's creative and critical thinking, abilities, and success are directly correlated with deep knowledge of a specific topic, current technologies, and data analysis and interpretation.

1. Mann CJ. Observational research methods. Research design II: cohort, cross sectional, and case-control studies. Emerg Med J 2003;20:54–60. 10.1136/emj.20.1.54

2. Lewis ND. The scientific method in medicine. J Natl Med Assoc 1958;50:325–8.

3. Donders Y. The right to enjoy the benefits of scientific progress: in search of state obligations in relation to health. Med Health Care Philos 2011;14:371–81. 10.1007/s11019-011-9327-y

4. Whicher D, Philbin S, Aronson N. An overview of the impact of rare disease characteristics on research methodology. Orphanet J Rare Dis 2018;13:14. 10.1186/s13023-017-0755-5

5. Dale H. Scientific method in medical research. Br Med J 1950;2:1185–90. 10.1136/bmj.2.4690.1185

6. Carney TJ, Weber DJ. Public health intelligence: learning from the Ebola crisis. Am J Public Health 2015;105:1740–4. 10.2105/AJPH.2015.302771

7. Dolmans DH, Loyens SM, Marcq H, Gijbels D. Deep and surface learning in problem-based learning: a review of the literature. Adv Health Sci Educ 2016;21:1087–112. 10.1007/s10459-015-9645-6

8. Crockett ET. A research education program model to prepare a highly-qualified workforce in biomedical and health-related research and increase diversity. BMC Med Educ 2014;14:202. 10.1186/1472-6920-14-202

9. Brown SA. Patient similarity: emerging concepts in systems and precision medicine. Front Physiol 2016;7:561. 10.3389/fphys.2016.00561

10. Gligorijević V, Malod-Dognin N, Pržulj N. Integrative methods for analyzing big data in precision medicine. Proteomics 2016;16:741–58. 10.1002/pmic.201500396

11. Ioannidis JP, Greenland S, Hlatky MA, Khoury MJ, Macleod MR, Moher D, et al. Increasing value and reducing waste in research design, conduct, and analysis. Lancet 2014;383:166–75. 10.1016/S0140-6736(13)62227-8

12. Redekop WK, Mladsi D. The faces of personalized medicine: a framework for understanding its meaning and scope. Value Health 2013;16:4–9. 10.1016/j.jval.2013.06.005

©2019 Walter de Gruyter GmbH, Berlin/Boston

Turkish Journal of Biochemistry


Biology LibreTexts

1.3: The Science of Biology - The Scientific Method

Learning Objectives

  • Discuss hypotheses and the components of a scientific experiment as part of the scientific method

The Scientific Method

Biologists study the living world by posing questions about it and seeking science-based responses. This approach is common to other sciences as well and is often referred to as the scientific method. The scientific method was used even in ancient times, but it was first documented by England's Sir Francis Bacon (1561–1626), who set up inductive methods for scientific inquiry. The scientific method can be applied to almost all fields of study as a logical, rational, problem-solving method.

The scientific process typically starts with an observation (often a problem to be solved) that leads to a question. Let's think about a simple problem that starts with an observation and apply the scientific method to solve the problem. A teenager notices that his friend is really tall and wonders why. So his question might be, "Why is my friend so tall?"

Proposing a Hypothesis

Recall that a hypothesis is an educated guess that can be tested. Hypotheses often also include an explanation for the educated guess. To solve one problem, several hypotheses may be proposed. For example, the student might believe that his friend is tall because he drinks a lot of milk. So his hypothesis might be "If a person drinks a lot of milk, then they will grow to be very tall because milk is good for your bones." Generally, hypotheses have the format "If…then…" Keep in mind that there could be other responses to the question; therefore, other hypotheses may be proposed. A second hypothesis might be, "If a person has tall parents, then they will also be tall, because they have the genes to be tall."

Once a hypothesis has been selected, the student can make a prediction: the outcome he expects to observe if the hypothesis is correct. For instance, he might predict that because his friend drinks a lot of milk, his friend has grown to be very tall.

Testing a Hypothesis

A valid hypothesis must be testable. It should also be falsifiable, meaning that it can be disproven by experimental results. Importantly, science does not claim to "prove" anything, because scientific understandings are always subject to modification in light of further information. This openness to disproving ideas is what distinguishes the sciences from non-sciences. The existence of the supernatural, for instance, is neither testable nor falsifiable.

To test a hypothesis, a researcher conducts one or more experiments designed to eliminate one or more of the hypotheses. Each experiment has one or more variables and one or more controls. A variable is any part of the experiment that can vary or change during the experiment. The control group contains every feature of the experimental group except that it is not given the hypothesized manipulation. For example, a control group could be a group of varied teenagers who did not drink milk, compared with the experimental group, a group of varied teenagers who did drink milk. If the results of the experimental group differ from those of the control group, the difference must be due to the hypothesized manipulation rather than some outside factor.

To test the first hypothesis, the student would find out whether drinking milk affects height. If drinking milk has no effect on height, then there must be another reason for the friend's height. To test the second hypothesis, the student could check whether his friend has tall parents. Each hypothesis should be tested by carrying out appropriate experiments. Be aware that rejecting one hypothesis does not determine whether the other hypotheses can be accepted; it simply eliminates one hypothesis that is not valid. Using the scientific method, hypotheses that are inconsistent with experimental data are rejected.
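As an illustrative sketch of how such a milk-versus-control comparison might be analyzed, the snippet below runs a simple permutation test on invented heights. The numbers, the group sizes, and the choice of a permutation test are all assumptions for demonstration, not part of the original text.

```python
# Illustration only: the heights (cm) below are invented, not real data.
import random
import statistics

milk_group    = [172, 168, 175, 171, 169, 174]  # experimental: drank milk
control_group = [170, 167, 173, 169, 168, 172]  # control: did not drink milk

observed_diff = statistics.mean(milk_group) - statistics.mean(control_group)

# Permutation test: if milk had no effect, randomly reshuffling the group
# labels should often produce a difference at least this large.
random.seed(0)  # fixed seed so the run is reproducible
pooled = milk_group + control_group
extreme = 0
n_perm = 10_000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:6]) - statistics.mean(pooled[6:])
    if abs(diff) >= abs(observed_diff):
        extreme += 1

p_value = extreme / n_perm
print(f"observed difference: {observed_diff:.2f} cm, p = {p_value:.3f}")
```

A large p-value here would mean the observed height difference is easily produced by chance alone, i.e., the data would not support the milk hypothesis.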

While this “tallness” example is based on observational results, other hypotheses and experiments might have clearer controls. For instance, a student might attend class on Monday and realize she had difficulty concentrating on the lecture. One hypothesis to explain this occurrence might be, “If I eat breakfast before class, then I am better able to pay attention.” The student could then design an experiment with a control to test this hypothesis.

The scientific method may seem too rigid and structured. It is important to keep in mind that although scientists often follow this sequence, there is flexibility. Many times, science does not operate in a linear fashion. Instead, scientists continually draw inferences and make generalizations, finding patterns as their research proceeds. Scientific reasoning is more complex than the scientific method alone suggests.

  • In the scientific method, observations lead to questions that require answers.
  • In the scientific method, the hypothesis is a testable statement proposed to answer a question.
  • In the scientific method, experiments (often with controls and variables) are devised to test hypotheses.
  • In the scientific method, analysis of the results of an experiment will lead to the hypothesis being accepted or rejected.
  • scientific method : a way of discovering knowledge based on making falsifiable predictions (hypotheses), testing them, and developing theories based on collected data
  • hypothesis : an educated guess that usually is found in an “if…then…” format
  • control group : a group that contains every feature of the experimental group except it is not given the manipulation that is hypothesized

PrepScholar

The 6 Scientific Method Steps and How to Use Them

When you’re faced with a scientific problem, solving it can seem like an impossible prospect. There are so many possible explanations for everything we see and experience—how can you possibly make sense of them all? Science has a simple answer: the scientific method.

The scientific method is a method of asking and answering questions about the world. These guiding principles give scientists a model to work through when trying to understand the world, but where did that model come from, and how does it work?

In this article, we’ll define the scientific method, discuss its long history, and cover each of the scientific method steps in detail.

What Is the Scientific Method?

At its most basic, the scientific method is a procedure for conducting scientific experiments. It’s a set model that scientists in a variety of fields can follow, going from initial observation to conclusion in a loose but concrete format.

The number of steps varies, but the process begins with an observation, progresses through an experiment, and concludes with analysis and sharing data. One of the most important pieces of the scientific method is skepticism: the goal is to find truth, not to confirm a particular thought. That requires reevaluation and repeated experimentation, as well as examining your thinking through rigorous study.

There are in fact multiple scientific methods, as the basic structure can be easily modified. The one we typically learn about in school is the basic method, based in logic and problem solving, typically used in "hard" science fields like biology, chemistry, and physics. It may vary in other fields, such as psychology, but the basic premise of making observations, testing, and continuing to improve a theory from the results remains the same.

The History of the Scientific Method

The scientific method as we know it today is based on thousands of years of scientific study. Its development goes all the way back to ancient Mesopotamia, Greece, and India.

The Ancient World

In ancient Greece, Aristotle devised an inductive-deductive process, which weighs broad generalizations from data against conclusions reached by narrowing down possibilities from a general statement. However, he favored deductive reasoning, as it identifies causes, which he saw as more important.

Aristotle wrote a great deal about logic and many of his ideas about reasoning echo those found in the modern scientific method, such as ignoring circular evidence and limiting the number of middle terms between the beginning of an experiment and the end. Though his model isn’t the one that we use today, the reliance on logic and thorough testing are still key parts of science today.

The Middle Ages

The next big step toward the development of the modern scientific method came in the Middle Ages, particularly in the Islamic world. Ibn al-Haytham, a physicist from what we now know as Iraq, developed a method of testing, observing, and deducing for his research on vision. al-Haytham was critical of Aristotle’s lack of inductive reasoning, which played an important role in his own research.

Other scientists, including Abū Rayhān al-Bīrūnī, Ibn Sina, and Robert Grosseteste also developed models of scientific reasoning to test their own theories. Though they frequently disagreed with one another and Aristotle, those disagreements and refinements of their methods led to the scientific method we have today.

Following those major developments, particularly Grosseteste’s work, Roger Bacon developed his own cycle of observation (seeing that something occurs), hypothesis (making a guess about why that thing occurs), experimentation (testing that the thing occurs), and verification (an outside person ensuring that the result of the experiment is consistent).

After joining the Franciscan Order, Bacon was granted a special commission to write about science; typically, friars were not allowed to write books or pamphlets. With this commission, Bacon outlined important tenets of the scientific method, including causes of error, methods of knowledge, and the differences between speculative and experimental science. He also used his own principles to investigate the causes of a rainbow, demonstrating the method's effectiveness.

Scientific Revolution

Throughout the Renaissance, more great thinkers became involved in devising a thorough, rigorous method of scientific study. Francis Bacon brought inductive reasoning further into the method, whereas Descartes argued that the laws of the universe meant that deductive reasoning was sufficient. Galileo’s research was also inductive reasoning-heavy, as he believed that researchers could not account for every possible variable; therefore, repetition was necessary to eliminate faulty hypotheses and experiments.

All of this led to the birth of the Scientific Revolution, which took place during the sixteenth and seventeenth centuries. In 1660, a group of philosophers and physicians joined together to work on scientific advancement. After approval from England's crown, the group became known as the Royal Society, which helped create a thriving scientific community and an early academic journal to help introduce rigorous study and peer review.

Previous generations of scientists had touched on the importance of induction and deduction, but Sir Isaac Newton proposed that both were equally important. This contribution helped establish the importance of multiple kinds of reasoning, leading to more rigorous study.

As science began to splinter into separate areas of study, it became necessary to define different methods for different fields. Karl Popper was a leader in this area: he argued that a claim is scientific only if it is falsifiable, capable of being proven wrong by evidence. This was particularly tricky for "soft" sciences like psychology and the social sciences, which require different methods. Popper's criterion widened the divide between sciences like psychology and "hard" sciences like chemistry or physics.

Paul Feyerabend argued that Popper’s methods were too restrictive for certain fields, and followed a less restrictive method hinged on “anything goes,” as great scientists had made discoveries without the Scientific Method. Feyerabend suggested that throughout history scientists had adapted their methods as necessary, and that sometimes it would be necessary to break the rules. This approach suited social and behavioral scientists particularly well, leading to a more diverse range of models for scientists in multiple fields to use.

The Scientific Method Steps

Though different fields may have variations on the model, the basic scientific method is as follows:

#1: Make Observations 

Notice something, such as the air temperature during the winter, what happens when ice cream melts, or how your plants behave when you forget to water them.

#2: Ask a Question

Turn your observation into a question. Why is the temperature lower during the winter? Why does my ice cream melt? Why does my toast always fall butter-side down?

This step can also include doing some research. You may be able to find answers to these questions already, but you can still test them!

#3: Make a Hypothesis

A hypothesis is an educated guess of the answer to your question. Why does your toast always fall butter-side down? Maybe it’s because the butter makes that side of the bread heavier.

A good hypothesis leads to a prediction that you can test, phrased as an if/then statement. In this case, we can pick something like, “If toast is buttered, then it will hit the ground butter-first.”

#4: Experiment

Your experiment is designed to test whether your prediction about what will happen is true. A good experiment will test one variable at a time; for example, we're trying to test whether butter weighs down one side of toast, making it more likely to hit the ground first.

The unbuttered toast is our control. If we determine the chance that a slice of unbuttered toast, marked with a dot, will hit the ground on a particular side, we can compare those results to our buttered toast to see if there's a correlation between the presence of butter and which way the toast falls.

If we decided not to toast the bread, that would be introducing a new question—whether or not toasting the bread has any impact on how it falls. Since that’s not part of our test, we’ll stick with determining whether the presence of butter has any impact on which side hits the ground first.

#5: Analyze Data

After our experiment, we discover that both buttered toast and unbuttered toast have a 50/50 chance of hitting the ground on the buttered or marked side when dropped from a consistent height, straight down. It looks like our hypothesis was incorrect—it’s not the butter that makes the toast hit the ground in a particular way, so it must be something else.

Since we didn’t get the desired result, it’s back to the drawing board. Our hypothesis wasn’t correct, so we’ll need to start fresh. Now that you think about it, your toast seems to hit the ground butter-first when it slides off your plate, not when you drop it from a consistent height. That can be the basis for your new experiment.
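As a sketch of this analysis step, the snippet below tallies invented drop counts chosen to mirror the 50/50 outcome described above. The counts and the 10% "close to chance" threshold are assumptions for illustration, not real measurements.

```python
# Invented counts mirroring the 50/50 result above; not real measurements.
n_drops = 20
buttered_down = 10  # buttered toast landing butter-side down
marked_down = 11    # unbuttered (control) toast landing marked-side down

buttered_rate = buttered_down / n_drops
marked_rate = marked_down / n_drops

# If butter mattered, the buttered rate should pull clearly away from both
# the control rate and the 50% chance level; here neither does.
consistent_with_chance = (abs(buttered_rate - 0.5) < 0.1
                          and abs(marked_rate - 0.5) < 0.1)

print(f"buttered: {buttered_rate:.0%}, control: {marked_rate:.0%}")
print("no butter effect detected" if consistent_with_chance
      else "possible butter effect")
```

With more drops, a formal test (such as a binomial or chi-squared test) would replace the simple threshold, but the logic is the same: compare the experimental rate against the control and against chance.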

#6: Communicate Your Results

Good science needs verification. Your experiment should be replicable by other people, so you can put together a report about how you ran your experiment to see whether other people's findings are consistent with yours.

This may be useful for class or a science fair. Professional scientists may publish their findings in scientific journals, where other scientists can read and attempt their own versions of the same experiments. Being part of a scientific community helps your experiments be stronger because other people can see if there are flaws in your approach—such as if you tested with different kinds of bread, or sometimes used peanut butter instead of butter—that can lead you closer to a good answer.

A Scientific Method Example: Falling Toast

We’ve run through a quick recap of the scientific method steps, but let’s look a little deeper by trying again to figure out why toast so often falls butter side down.

#1: Make Observations

At the end of our last experiment, where we learned that butter doesn’t actually make toast more likely to hit the ground on that side, we remembered that the times when our toast hits the ground butter side first are usually when it’s falling off a plate.

The easiest question we can ask is, “Why is that?”

We can actually search this online and find a pretty detailed answer as to why this is true. But we’re budding scientists—we want to see it in action and verify it for ourselves! After all, good science should be replicable, and we have all the tools we need to test out what’s really going on.

Why do we think that buttered toast hits the ground butter-first? We know it’s not because it’s heavier, so we can strike that out. Maybe it’s because of the shape of our plate?

That’s something we can test. We’ll phrase our hypothesis as, “If my toast slides off my plate, then it will fall butter-side down.”

Just seeing that toast falls off a plate butter-side down isn’t enough for us. We want to know why, so we’re going to take things a step further—we’ll set up a slow-motion camera to capture what happens as the toast slides off the plate.

We’ll run the test ten times, each time tilting the same plate until the toast slides off. We’ll make note of each time the butter side lands first and see what’s happening on the video so we can see what’s going on.

When we review the footage, we’ll likely notice that the bread starts to flip when it slides off the edge, changing how it falls in a way that didn’t happen when we dropped it ourselves.

That answers our question, but it's not the complete picture: how do other plates affect how often toast hits the ground butter-first? What if the toast is already butter-side down when it falls? These are things we can test in further experiments with new hypotheses!

Now that we have results, we can share them with others who can verify our results. As mentioned above, being part of the scientific community can lead to better results. If your results were wildly different from the established thinking about buttered toast, that might be cause for reevaluation. If they’re the same, they might lead others to make new discoveries about buttered toast. At the very least, you have a cool experiment you can share with your friends!

Key Scientific Method Tips

Though science can be complex, the benefit of the scientific method is that it gives you an easy-to-follow means of thinking about why and how things happen. To use it effectively, keep these things in mind!

Don’t Worry About Proving Your Hypothesis

One of the important things to remember about the scientific method is that it’s not necessarily meant to prove your hypothesis right. It’s great if you do manage to guess the reason for something right the first time, but the ultimate goal of an experiment is to find the true reason for your observation to occur, not to prove your hypothesis right.

Good science sometimes means that you’re wrong. That’s not a bad thing—a well-designed experiment with an unanticipated result can be just as revealing, if not more, than an experiment that confirms your hypothesis.

Be Prepared to Try Again

If the data from your experiment doesn’t match your hypothesis, that’s not a bad thing. You’ve eliminated one possible explanation, which brings you one step closer to discovering the truth.

The scientific method isn’t something you’re meant to do exactly once to prove a point. It’s meant to be repeated and adapted to bring you closer to a solution. Even if you can demonstrate truth in your hypothesis, a good scientist will run an experiment again to be sure that the results are replicable. You can even tweak a successful hypothesis to test another factor, such as if we redid our buttered toast experiment to find out whether different kinds of plates affect whether or not the toast falls butter-first. The more we test our hypothesis, the stronger it becomes!

What’s Next?

Want to learn more about the scientific method? These important high school science classes will no doubt cover it in a variety of different contexts.

Test your ability to follow the scientific method using these at-home science experiments for kids !

Need some proof that science is fun? Try making slime


Melissa Brinks graduated from the University of Washington in 2014 with a Bachelor's in English with a creative writing emphasis. She has spent several years tutoring K-12 students in many subjects, including in SAT prep, to help them prepare for their college education.



1 Thinking Like a Scientist

Learning Objectives

After studying this chapter, you should be able to:

  • Identify the shared characteristics of the natural sciences
  • Compare inductive reasoning with deductive reasoning
  • Illustrate the steps in the scientific method
  • Explain how to design a controlled experiment
  • Describe the goals of descriptive science and hypothesis-based science
  • Apply the Claim-Evidence-Reasoning process to a scientific investigation

The Nature of Science

Environmental science (also known as environmental biology) is a field of study that focuses on the earth and its many complex systems. It is an interdisciplinary field that brings together elements of biology, geology, chemistry, and other natural sciences. It may even include elements of social sciences such as economics and political science. The discoveries of environmental science are made by a community of researchers who work individually and together using agreed-on methods. In this sense, environmental science, like all sciences, is a social enterprise like politics or the arts. The methods of science include careful observation, record keeping, logical and mathematical reasoning, experimentation, and submitting conclusions to the scrutiny of others. Science also requires considerable imagination and creativity; a well-designed experiment is commonly described as elegant, or beautiful. Like politics, science has considerable practical implications, and some science is dedicated to practical applications, such as improvements to farming practices (Figure 1). Other science proceeds largely motivated by curiosity. Whatever its goal, there is no doubt that science has transformed human existence and will continue to do so.

Figure 1. George Washington Carver working in a laboratory.

What exactly is science? What does the study of environmental science share with other scientific disciplines? Science (from the Latin scientia, meaning "knowledge") can be defined as knowledge about the natural world. But science is not just a collection of facts and theories; it is also a process used to gain that knowledge.

Science is a very specific way of learning, or knowing, about the world. The history of the past 500 years demonstrates that science is a very powerful way of knowing about the world; it is largely responsible for the technological revolutions that have taken place during this time. There are, however, areas of knowledge and human experience to which the methods of science cannot be applied. These include purely moral questions, aesthetic questions, and what can generally be categorized as spiritual questions. Science cannot investigate these areas because they lie outside the realm of material phenomena (the phenomena of matter and energy) and cannot be observed and measured.

The scientific method is a method of research with defined steps that include experiments and careful observation. The steps of the scientific method will be examined in detail later, but one of the most important aspects of this method is the testing of hypotheses. A hypothesis is a suggested explanation for an event, which can be tested. Hypotheses, or tentative explanations, are generally produced within the context of a scientific theory. A scientific theory is a generally accepted, thoroughly tested, and confirmed explanation for a set of observations or phenomena. Scientific theory is the foundation of scientific knowledge. In addition, in many scientific disciplines (less so in biology) there are scientific laws, often expressed in mathematical formulas, which describe how elements of nature will behave under certain specific conditions.

A common misconception is that a hypothesis is elevated to the level of theory after being confirmed, and then a theory is promoted to a scientific law after it is confirmed. However, there is no evolution of hypotheses through theories to laws as if they represented some increase in certainty about the world. Hypotheses are the day-to-day material that scientists work with, and they are developed within the context of theories. You can think of theories as being "bigger" than hypotheses because a theory incorporates many hypotheses and facts. Laws, on the other hand, are concise descriptions of natural events that can usually be described mathematically. For example, Newton's Law of Gravity describes how strongly objects attract one another depending on their masses and the distance between them.
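Newton's law, for instance, takes a familiar compact mathematical form, where F is the attractive force between two objects, G the gravitational constant, m_1 and m_2 the two masses, and r the distance between them:

```latex
F = G \, \frac{m_1 m_2}{r^2}
```

The formula illustrates what distinguishes a law: it concisely predicts behavior under specific conditions without explaining why gravity exists.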

Natural Sciences

What would you expect to see in a museum of natural sciences? Frogs? Plants? Dinosaur skeletons? Exhibits about how the brain functions? A planetarium? Gems and minerals? Or maybe all of the above? Science includes such diverse fields as astronomy, biology, computer sciences, geology, logic, physics, chemistry, and mathematics. However, those fields of science related to the physical world and its phenomena and processes are considered natural sciences . Thus, a museum of natural sciences might contain any of the items listed above (Figure 2).

Figure 2. Natural History Museum of Los Angeles County.

There is no complete agreement when it comes to defining what the natural sciences include. For some experts, the natural sciences are astronomy, biology, chemistry, earth science, and physics. Other scholars choose to divide natural sciences into  life sciences , which study living things and include biology, and  physical sciences , which study nonliving matter and include astronomy, physics, and chemistry. Some disciplines such as biophysics and biochemistry build on two sciences and are interdisciplinary.


Scientific Inquiry

One thing is common to all forms of science: an ultimate goal “to know.” Curiosity and inquiry are the driving forces for the development of science. Scientists seek to understand the world and the way it operates by using one of two main pathways of scientific study: descriptive science and hypothesis-based science.  Descriptive  (or discovery)  science  aims to observe, explore, and discover, while  hypothesis-based science  begins with a specific question or problem and a potential answer or solution that can be tested. The boundary between these two forms of study is often blurred, because most scientific endeavors combine both approaches. Observations lead to questions, questions lead to forming a hypothesis as a possible answer to those questions, and then the hypothesis is tested. Thus, descriptive science and hypothesis-based science are in continuous dialogue.

Hypothesis Testing

Biologists study the living world by posing questions about it and seeking science-based responses. This approach is common to other sciences as well and is often referred to as the scientific method (Figure 3). The scientific method was used even in ancient times, but it was first documented by England’s Sir Francis Bacon (1561–1626), who set up inductive methods for scientific inquiry. The scientific method is not exclusively used by biologists but can be applied to almost anything as a logical problem-solving method.

Figure 3. A flow chart shows the steps in the scientific method. In step 1, an observation is made. In step 2, a question is asked about the observation. In step 3, an answer to the question, called a hypothesis, is proposed. In step 4, a prediction is made based on the hypothesis. In step 5, an experiment is done to test the prediction. In step 6, the results are analyzed to determine whether or not the hypothesis is supported. If the hypothesis is not supported, another hypothesis is made. In either case, the results are reported.

The scientific process typically starts with an observation (often a problem to be solved) that leads to a question. Let's think about a simple problem that starts with an observation and apply the scientific method to solve the problem. One Monday morning, a student arrives at class and quickly discovers that the classroom is too warm. That is an observation that also describes a problem: the classroom is too warm. The student then asks a question: "Why is the classroom so warm?"

Recall that a hypothesis is a suggested explanation that can be tested. To solve a problem, several hypotheses may be proposed. For example, one hypothesis might be, “The classroom is warm because no one turned on the air conditioning.” But there could be other responses to the question, and therefore other hypotheses may be proposed. A second hypothesis might be, “The classroom is warm because there is a power failure, and so the air conditioning doesn’t work.”

Once a hypothesis has been selected, a prediction may be made. A prediction is similar to a hypothesis but it typically has the format “If . . . then . . . .” For example, the prediction for the first hypothesis might be, “ If  the student turns on the air conditioning,  then  the classroom will no longer be too warm.”

A hypothesis must be testable to ensure that it is valid. For example, a hypothesis that depends on what a bear thinks is not testable, because it can never be known what a bear thinks. It should also be falsifiable, meaning that it can be disproven by experimental results. An example of an unfalsifiable hypothesis is "Botticelli's Birth of Venus painting is beautiful." There is no experiment that might show this statement to be false. To test a hypothesis, a researcher will conduct one or more experiments designed to eliminate one or more of the hypotheses. This is important. A hypothesis can be disproven, or eliminated, but it can never be proven. Science does not deal in proofs like mathematics. If an experiment fails to disprove a hypothesis, then we find support for that explanation, but this does not mean that a better explanation will not be found down the road, or that a more carefully designed experiment will eventually falsify the hypothesis.

The best way to test a hypothesis is to conduct a controlled experiment. A controlled experiment is a scientific test performed under controlled conditions, meaning just one (or a few) variables are changed at a time, while all other factors are kept constant. A  variable  is any part of the experiment that can vary or change during the experiment.

What are the key components of a controlled experiment? Let's say you want to know what it takes to grow the healthiest tomatoes. Your hypothesis is that tomato plants will grow better if given fertilizer. To test this hypothesis, you give fertilizer to some of your tomato plants and give others only water. Your prediction might be "If I give fertilizer to a group of tomato plants, they will grow better than tomato plants without fertilizer." In this example, the tomatoes with fertilizer are known as the experimental group, and the ones without fertilizer are the control group because they did not receive the treatment.

The factor that is different between the experimental and control group is known as the  independent variable (in this case, the fertilizer). It can also be thought of as the variable that is directly manipulated by the experimenter. The dependent variable is the response that is measured to determine if the experimental treatment had any effect. In this case, the dependent variable is the growth of the tomato plants.

Experimental results or data are the observations made in the course of an experiment. In this case, the height, number of leaves, and other signs of plant growth are the data you would collect in your experiment. Looking at Figure 4, we can conclude that the hypothesis was supported. If the fertilized plants did not grow better than the unfertilized plants, we would conclude that the hypothesis was not supported, and we may need to generate a new hypothesis.

Figure 4. Tomato plants grown with fertilizer are larger and healthier than tomato plants grown without fertilizer.

Note that in the tomato experiment, three plants were used in each group. This is because there may have been an unhealthy or slow-growing plant that would affect the results. Having a larger sample size helps eliminate the effects of random factors like this.
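As a toy illustration, the comparison at the heart of this controlled experiment can be sketched in a few lines of code. The height values below are invented for the example, not real measurements:

```python
# Hypothetical tomato-experiment data: plant height (cm) after several weeks.
# Heights are invented for illustration only.
from statistics import mean

experimental_group = [52.0, 48.5, 55.2]  # received fertilizer (the independent variable)
control_group = [41.3, 39.8, 43.1]       # water only

def hypothesis_supported(treated, untreated):
    """Return True if the treated group's mean response (the dependent
    variable) exceeds the control group's mean response."""
    return mean(treated) > mean(untreated)

print(hypothesis_supported(experimental_group, control_group))  # True for this data
```

Using three plants per group, as the text notes, means we compare group means rather than single plants, which dampens the effect of any one unhealthy or unusually vigorous plant.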

Not all scientific questions can be answered using controlled experiments. It may be unethical to test the effects of a virus on humans, or impractical to see how changing rainfall affects plants in the desert. In such cases, a scientist may simply collect data from the real world to test a hypothesis. In recent years, a new approach to testing hypotheses has developed as a result of the exponential growth of data deposited in various databases. Using computer algorithms and statistical analyses of these data, the emerging field of so-called "data research" (also referred to as "in silico" research) provides new methods of analyzing and interpreting data. This will increase the demand for specialists trained in both biology and computer science, a promising career opportunity.
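A hedged sketch of this database-driven style of hypothesis testing: instead of manipulating a variable directly, we query records that already exist. The table, column names, and values here are all invented for illustration:

```python
# "In silico" sketch: test a hypothesis against existing records in a database
# rather than running a new experiment. All data below are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plots (rainfall_mm REAL, plant_count INTEGER)")
conn.executemany(
    "INSERT INTO plots VALUES (?, ?)",
    [(120, 34), (95, 28), (210, 61), (180, 52), (60, 15)],
)

# Hypothesis: wetter plots support more plants. Compare mean plant counts in
# wetter versus drier plots instead of manipulating rainfall ourselves.
wet = conn.execute(
    "SELECT AVG(plant_count) FROM plots WHERE rainfall_mm >= 120").fetchone()[0]
dry = conn.execute(
    "SELECT AVG(plant_count) FROM plots WHERE rainfall_mm < 120").fetchone()[0]
print(wet > dry)  # True for this invented data
```

Note that such observational comparisons can only show association, not causation, which is why controlled experiments remain the stronger test when they are feasible.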

In the example below, the scientific method is used to solve an everyday problem. Which part in the example below is the hypothesis? Which is the prediction? Based on the results of the experiment, is the hypothesis supported? If it is not supported, propose some alternative hypotheses.

  • My toaster doesn’t toast my bread.
  • Why doesn’t my toaster work?
  • There is something wrong with the electrical outlet.
  • If something is wrong with the outlet, my coffeemaker also won’t work when plugged into it.
  • I plug my coffeemaker into the outlet.
  • My coffeemaker works.

In practice, the scientific method is not as rigid and structured as it might at first appear. Sometimes an experiment leads to conclusions that favor a change in approach; often, an experiment brings entirely new scientific questions to the puzzle. Many times, science does not operate in a linear fashion; instead, scientists continually draw inferences and make generalizations, finding patterns as their research proceeds. Scientific reasoning is more complex than the scientific method alone suggests.

Basic and Applied Science

The scientific community has been debating for the last few decades about the value of different types of science. Is it valuable to pursue science for the sake of simply gaining knowledge, or does scientific knowledge only have worth if we can apply it to solving a specific problem or bettering our lives? This question focuses on the differences between two types of science: basic science and applied science.

Basic science  or “pure” science seeks to expand knowledge regardless of the short-term application of that knowledge. It is not focused on developing a product or a service of immediate public or commercial value. The immediate goal of basic science is knowledge for knowledge’s sake, though this does not mean that in the end it may not result in an application.

In contrast,  applied science  or “technology,” aims to use science to solve real-world problems, making it possible, for example, to improve a crop yield, find a cure for a particular disease, or save animals threatened by a natural disaster. In applied science, the problem is usually defined for the researcher.

Some individuals may perceive applied science as “useful” and basic science as “useless.” A question these people might pose to a scientist advocating knowledge acquisition would be, “What for?” A careful look at the history of science, however, reveals that basic knowledge has resulted in many remarkable applications of great value. Many scientists think that a basic understanding of science is necessary before an application is developed; therefore, applied science relies on the results generated through basic science. Other scientists think that it is time to move on from basic science and instead to find solutions to actual problems. Both approaches are valid. It is true that there are problems that demand immediate attention; however, few solutions would be found without the help of the knowledge generated through basic science.

One example of how basic and applied science can work together to solve practical problems occurred after the discovery of DNA structure led to an understanding of the molecular mechanisms governing DNA replication. Strands of DNA, unique in every human, are found in our cells, where they provide the instructions necessary for life. During DNA replication, new copies of DNA are made, shortly before a cell divides to form new cells. Understanding the mechanisms of DNA replication enabled scientists to develop laboratory techniques that are now used to identify genetic diseases, pinpoint individuals who were at a crime scene, and determine paternity. Without basic science, it is unlikely that applied science would exist.

Figure 5. Illustration of some of the letters in the human DNA sequence.

Another example of the link between basic and applied research is the Human Genome Project, a study in which each human chromosome was analyzed and mapped to determine the precise sequence of DNA subunits and the exact location of each gene. (The gene is the basic unit of heredity; an individual’s complete collection of genes is his or her genome.) Other organisms have also been studied as part of this project to gain a better understanding of human chromosomes. The Human Genome Project (Figure 5) relied on basic research carried out with non-human organisms and, later, with the human genome. An important end goal eventually became using the data for applied research seeking cures for genetically related diseases.

While research efforts in both basic science and applied science are usually carefully planned, it is important to note that some discoveries are made by serendipity, that is, by means of a fortunate accident or a lucky surprise. Penicillin was discovered when biologist Alexander Fleming accidentally left a petri dish of Staphylococcus  bacteria open. An unwanted mold grew, killing the bacteria. The mold turned out to be  Penicillium , and a new antibiotic was discovered. Even in the highly organized world of science, luck—when combined with an observant, curious mind—can lead to unexpected breakthroughs.

Communicating Scientific Work

Whether scientific research is basic science or applied science, scientists must share their findings for other researchers to expand and build upon their discoveries. Communication and collaboration within and between subdisciplines of science are key to the advancement of knowledge in science. For this reason, an important aspect of a scientist's work is disseminating results and communicating with peers. Scientists can share results by presenting them at a scientific meeting or conference, but this approach can reach only the limited few who are present. Instead, most scientists present their results in peer-reviewed articles that are published in scientific journals. Peer-reviewed articles are scientific papers that are reviewed, usually anonymously, by a scientist's colleagues, or peers. These colleagues are qualified individuals, often experts in the same research area, who judge whether or not the scientist's work is suitable for publication. The process of peer review helps to ensure that the research described in a scientific paper or grant proposal is original, significant, logical, and thorough. Grant proposals, which are requests for research funding, are also subject to peer review. Scientists publish their work so other scientists can reproduce their experiments under similar or different conditions to expand on the findings. The experimental results must be consistent with the findings of other scientists.

Many journals, and much of the popular press, do not use a peer-review system. A large number of online open-access journals (journals whose articles are available without cost) are now available, many of which use rigorous peer-review systems, but some of which do not. Results of studies published in these forums without peer review are not reliable and should not form the basis for other scientific work. As one exception, journals may allow a researcher to cite a personal communication from another researcher about unpublished results, with the cited author's permission.

Claim-Evidence-Reasoning

Ultimately, the goal of science is to understand and explain how things work in the natural world. One of the tools scientists use to achieve this goal is the Claim-Evidence-Reasoning process. A claim is a statement that answers a scientific question. It can be an explanation of a natural phenomenon or a conclusion that can be drawn after conducting a scientific investigation. Evidence is the scientific data that supports the claim that is being made. The evidence must be sufficient , meaning there must be enough data to fully support the claim, and it must be appropriate , leaving out any unnecessary information. Reasoning is a justification that connects the evidence to the claim. It shows why the data count as evidence to support this specific claim by using appropriate and sufficient scientific principles.

Attribution

Concepts in Biology by OpenStax, modified by Sean Whitcomb. License: CC-BY

Media Attributions

  • George Washington Carver is licensed under a Public Domain license
  • Natural History Museum of Los Angeles County © Matthew Dillon is licensed under a CC BY (Attribution) license
  • Scientific_method © OpenStax is licensed under a CC BY (Attribution) license
  • Fertilizer_tomatoes © SuSanA Secretariat is licensed under a CC BY (Attribution) license
  • Human Genome Reference Sequence © National Human Genome Research Institute is licensed under a CC BY-NC (Attribution NonCommercial) license

Environmental Science Copyright © by Sean Whitcomb is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.

Identifying problems and solutions in scientific text

Kevin Heffernan

Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD UK

Simone Teufel

Research is often described as a problem-solving activity, and as a result, descriptions of problems and solutions are an essential part of the scientific discourse used to describe research activity. We present an automatic classifier that, given a phrase that may or may not be a description of a scientific problem or a solution, makes a binary decision about problemhood and solutionhood of that phrase. We recast the problem as a supervised machine learning problem, define a set of 15 features correlated with the target categories and use several machine learning algorithms on this task. We also create our own corpus of 2000 positive and negative examples of problems and solutions. We find that we can distinguish problems from non-problems with an accuracy of 82.3%, and solutions from non-solutions with an accuracy of 79.7%. Our three most helpful features for the task are syntactic information (POS tags), document and word embeddings.
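As a rough caricature of the binary decision the abstract describes (and emphatically not the authors' actual system, which trains supervised classifiers over 15 features including POS tags and document/word embeddings), problemhood could be scored with a handful of hand-picked lexical cues:

```python
# Toy stand-in for a problemhood classifier. The cue list and threshold are
# illustrative assumptions, not features from the paper.
PROBLEM_CUES = {"problem", "difficult", "fail", "fails", "drawback",
                "limitation", "issue", "lack", "error", "unable"}

def looks_like_problem(phrase: str, threshold: int = 1) -> bool:
    """Crude binary decision on problemhood: count problem-signalling words
    in the phrase and compare against a threshold."""
    tokens = phrase.lower().replace(",", " ").replace(".", " ").split()
    score = sum(1 for token in tokens if token in PROBLEM_CUES)
    return score >= threshold

print(looks_like_problem("A major drawback of this approach is data sparsity"))  # True
print(looks_like_problem("We use a convolutional network for encoding"))         # False
```

A keyword scorer like this captures only the most explicit signals; the point of the paper's richer feature set is precisely to catch the many reformulations that plain keywords miss.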

Introduction

Problem solving is generally regarded as the most important cognitive activity in everyday and professional contexts (Jonassen 2000 ). Many studies exist on formalising the cognitive process behind problem-solving, for instance (Chandrasekaran 1983 ). Jordan ( 1980 ) argues that we all share knowledge of the thought/action problem-solution process involved in real life, and so our writings will often reflect this order. There is general agreement amongst theorists that the nature of the research process can be viewed as a problem-solving activity (Strübing 2007 ; Van Dijk 1980 ; Hutchins 1977 ; Grimes 1975 ).

One of the best-documented problem-solving patterns was established by Winter ( 1968 ). Winter analysed thousands of examples of technical texts, and noted that these texts can largely be described in terms of a four-part pattern consisting of Situation, Problem, Solution and Evaluation. This is very similar to the pattern described by Van Dijk ( 1980 ), which consists of Introduction-Theory, Problem-Experiment-Comment and Conclusion. The difference is that in Winter’s view, a solution only becomes a solution after it has been evaluated positively. Hoey changes Winter’s pattern by introducing the concept of Response in place of Solution (Hoey 2001 ). This seems to describe the situation in science better, where evaluation is mandatory for research solutions to be accepted by the community. In Hoey’s pattern, the Situation (which is generally treated as optional) provides background information; the Problem describes an issue which requires attention; the Response provides a way to deal with the issue, and the Evaluation assesses how effective the response is.

An example of this pattern in the context of the Goldilocks story can be seen in Fig.  1 . In this text, there is a preamble providing the setting of the story (i.e. Goldilocks is lost in the woods), which is called the Situation in Hoey’s system. A Problem is encountered when Goldilocks becomes hungry. Her first Response is to try the porridge in big bear’s bowl, but she gives this a negative Evaluation (“too hot!”) and so the pattern returns to the Problem. This continues in a cyclic fashion until the Problem is finally resolved when Goldilocks gives a particular Response, trying baby bear’s porridge, a positive Evaluation (“it’s just right”).

Fig. 1. Example of the problem-solving pattern applied to the Goldilocks story. Reproduced with permission from Hoey ( 2001 )

It would be attractive to detect problem and solution statements automatically in text. This holds true both from a theoretical and a practical viewpoint. Theoretically, we know that sentiment detection is related to problem-solving activity, because of the perception that “bad” situations are transformed into “better” ones via problem-solving. The exact mechanism of how this can be detected would advance the state of the art in text understanding. In terms of linguistic realisation, problem and solution statements come in many variants and reformulations, often in the form of positive or negated statements about the conditions, results and causes of problem–solution pairs. Detecting and interpreting those would give us a reasonably objective manner to test a system’s understanding capacity. Practically, being able to detect any mention of a problem is a first step towards detecting a paper’s specific research goal. Being able to do this has been a goal for scientific information retrieval for some time, and if successful, it would improve the effectiveness of scientific search immensely. Detecting problem and solution statements of papers would also enable us to compare similar papers and eventually even lead to automatic generation of review articles in a field.

There has been some computational effort on the task of identifying problem-solving patterns in text. However, most of the prior work has not gone beyond the usage of keyword analysis and some simple contextual examination of the pattern. Flowerdew ( 2008 ) presents a corpus-based analysis of lexico-grammatical patterns for problem and solution clauses using articles from professional and student reports. Problem and solution keywords were used to search their corpora, and each occurrence was analysed to determine grammatical usage of the keyword. More interestingly, the causal category associated with each keyword in its context was also analysed. For example, Reason–Result or Means–Purpose were common causal categories found to be associated with problem keywords.

The goal of the work by Scott ( 2001 ) was to determine words which are semantically similar to problem and solution, and to determine how these words are used to signal problem-solution patterns. However, their corpus-based analysis used articles from the Guardian newspaper. Since the domain of newspaper text is very different from that of scientific text, we decided not to consider those keywords associated with problem-solving patterns for use in our work.

Instead of a keyword-based approach, Charles ( 2011 ) used discourse markers to examine how the problem-solution pattern was signalled in text. In particular, they examined how adverbials associated with a result such as “thus, therefore, then, hence” are used to signal a problem-solving pattern.

Problem solving also has been studied in the framework of discourse theories such as Rhetorical Structure Theory (Mann and Thompson 1988 ) and Argumentative Zoning (Teufel et al. 2000 ). Problem- and solutionhood constitute two of the original 23 relations in RST (Mann and Thompson 1988 ). While we concentrate solely on this aspect, RST is a general theory of discourse structure which covers many intentional and informational relations. The relationship to Argumentative Zoning is more complicated. The status of certain statements as problem or solutions is one important dimension in the definitions of AZ categories. AZ additionally models dimensions other than problem-solution hood (such as who a scientific idea belongs to, or which intention the authors might have had in stating a particular negative or positive statement). When forming categories, AZ combines aspects of these dimensions, and “flattens” them out into only 7 categories. In AZ it is crucial who it is that experiences the problems or contributes a solution. For instance, the definition of category “CONTRAST” includes statements that some research runs into problems, but only if that research is previous work (i.e., not if it is the work contributed in the paper itself). Similarly, “BASIS” includes statements of successful problem-solving activities, but only if they are achieved by previous work that the current paper bases itself on. Our definition is simpler in that we are interested only in problem solution structure, not in the other dimensions covered in AZ. Our definition is also more far-reaching than AZ, in that we are interested in all problems mentioned in the text, no matter whose problems they are. Problem-solution recognition can therefore be seen as one aspect of AZ which can be independently modelled as a “service task”. This means that good problem solution structure recognition should theoretically improve AZ recognition.

In this work, we approach the task of identifying problem-solving patterns in scientific text. We choose to use the model of problem-solving described by Hoey ( 2001 ). This pattern comprises four parts: Situation, Problem, Response and Evaluation. The Situation element is considered optional to the pattern, and so our focus centres on the core pattern elements.

Goal statement and task

Many surface features in the text offer themselves up as potential signals for detecting problem-solving patterns. However, since Situation is an optional element, we decided to focus on either Problem or Response and Evaluation as signals of the pattern. Moreover, we decided to look for each type in isolation, for the following reasons. It is quite rare for an author to introduce a problem without resolving it using some sort of response, so the Problem element is a good starting point for identifying the pattern. There are exceptions to this, as authors will sometimes introduce a problem and then leave it to future work, but overall there should be enough signal in the Problem element to make looking for it in isolation worthwhile. The second signal we look for is the use of Response and Evaluation within the same sentence. As with Problem elements, we hypothesise that this formulation is well enough signalled externally to help us detect the pattern. For example, consider the following Response and Evaluation: "One solution is to use smoothing". In this statement, the author explicitly states that smoothing is a solution to a problem which must have been mentioned in a prior statement. In scientific text, we often observe that solutions implicitly contain both Response and (positive) Evaluation elements. For these reasons, there should be sufficient external signals for the two pattern elements we concentrate on here.

When attempting to find Problem elements in text, we run into the issue that the word "problem" has at least two word senses that need to be distinguished. One sense of "problem" denotes something which must be undertaken (i.e., a task), while the other is the core sense of the word: something that is problematic and negative. Only the latter sense is aligned with our notion of problemhood, because the simple description of a task does not presuppose problemhood, just a wish to perform some act. Consider the following examples, where the non-desired word sense is being used:

  • “Das and Petrov (2011) also consider the problem of unsupervised bilingual POS induction”. (Chen et al. 2011 ).
  • “In this paper, we describe advances on the problem of NER in Arabic Wikipedia”. (Mohit et al. 2012 ).

Here, although the authors explicitly label these phrases as problems, they align with our definition of research tasks and not with what we call here 'problematic problems'. We will now give some examples from our corpus for the desired, core word sense:

  • “The major limitation of supervised approaches is that they require annotations for example sentences.” (Poon and Domingos 2009 ).
  • “To solve the problem of high dimensionality we use clustering to group the words present in the corpus into much smaller number of clusters”. (Saha et al. 2008 ).

When creating our corpus of positive and negative examples, we took care to select only problem strings that satisfy our definition of problemhood; “ Corpus creation ” section will explain how we did that.

Corpus creation

Our new corpus is a subset of the latest version of the ACL Anthology, released in March 2016, 1 which contains 22,878 articles in the form of PDFs and OCRed text. 2

The 2016 version was also parsed using ParsCit (Councill et al. 2008 ). ParsCit recognises not only document structure, but also bibliography lists as well as references within running text. A random subset of 2500 papers was collected covering the entire ACL timeline. In order to disregard non-article publications such as introductions to conference proceedings or letters to the editor, only documents containing abstracts were considered. The corpus was preprocessed using tokenisation, lemmatisation and dependency parsing with the Rasp Parser (Briscoe et al. 2006 ).

Definition of ground truth

Our goal was to define a ground truth for problem and solution strings, while covering as wide a range as possible of syntactic variations in which such strings naturally occur. We also want this ground truth to cover phenomena of problem and solution status which are applicable whether or not the problem or solution status is explicitly mentioned in the text.

To simplify the task, we only consider problem and solution descriptions that are at most one sentence long. In reality, of course, many problem and solution descriptions go beyond a single sentence and require, for instance, an entire paragraph. However, we also know that short summaries of problems and solutions are very prevalent in science, and that these tend to occur in the most prominent places in a paper. This is because scientists are trained to express their contribution, and the obstacles possibly hindering their success, in an informative, succinct manner. That is why we can afford to look only for shorter problem and solution descriptions, ignoring those that cross sentence boundaries.

To define our ground truth, we examined the parsed dependencies and looked for a target word (“problem/solution”) in subject position, and then chose its syntactic argument as our candidate problem or solution phrase. To increase the variation, i.e., to find as many different-worded problem and solution descriptions as possible, we additionally used semantically similar words (near-synonyms) of the target words “problem” or “solution” for the search. Semantic similarity was defined as cosine in a deep learning distributional vector space, trained using Word2Vec (Mikolov et al. 2013 ) on 18,753,472 sentences from a biomedical corpus based on all full-text Pubmed articles (McKeown et al. 2016 ). From the 200 words which were semantically closest to “problem”, we manually selected 28 clear synonyms. These are listed in Table  1 . From the 200 semantically closest words to “solution” we similarly chose 19 (Table  2 ). Of the sentences matching our dependency search, a subset of problem and solution candidate sentences were randomly selected.
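The dependency search described above can be sketched in code. This is an illustration only: the paper used the RASP parser, whereas here a minimal token structure stands in for real parser output, and the word set is an assumed subset of the near-synonyms in Table 1.

```python
from dataclasses import dataclass
from typing import List, Optional

# A few near-synonyms of "problem"; an assumed subset of Table 1.
PROBLEM_WORDS = {"problem", "drawback", "limitation", "difficulty", "obstacle"}

@dataclass
class Token:
    text: str    # surface string (for a clause head, the whole subtree text here)
    lemma: str
    dep: str     # dependency label, e.g. "nsubj", "ccomp"
    head: int    # index of the head token, -1 for the root

def find_problem_candidate(tokens: List[Token]) -> Optional[str]:
    """If a problem word sits in subject position, return the clausal
    argument (ccomp) attached to the same head as the candidate phrase."""
    for tok in tokens:
        if tok.lemma in PROBLEM_WORDS and tok.dep == "nsubj":
            for other in tokens:
                if other.head == tok.head and other.dep == "ccomp":
                    return other.text
    return None
```

For the Fig. 2 example, "drawback" in subject position yields the clause "(that) it achieves low performance" as the candidate.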

Selected words for use in problem candidate phrase extraction

Selected words for use in solution candidate phrase extraction

An example of this is shown in Fig.  2 . Here, the target word “drawback” is in subject position (highlighted in red), and its clausal argument (ccomp) is “(that) it achieves low performance” (highlighted in purple). Examples of other arguments we searched for included copula constructions and direct/indirect objects.


Fig. 2 Example of our extraction method for problems using dependencies

If more than one candidate was found in a sentence, one was chosen at random. Non-grammatical sentences were excluded; these might appear in the corpus as a result of its source being OCRed text.

800 candidate phrases expressing problems and 800 expressing solutions (1600 total) were automatically extracted and then independently checked for correctness by two annotators (the two authors of this paper). Both authors found the task simple and straightforward. Correctness was defined by two criteria:

  • The phrase must describe one of the following: an unexplained phenomenon or a problematic state in science; a research question; or an artifact that does not fulfil its stated specification.
  • The phrase must not lexically give away its status as problem or solution phrase.

The second criterion saves us from machine learning cues that are too obvious. If for instance, the phrase itself contained the words “lack of” or “problematic” or “drawback”, our manual check rejected it, because it would be too easy for the machine learner to learn such cues, at the expense of many other, more generally occurring cues.

Sampling of negative examples

We next needed to find negative examples for both cases. We wanted them not to stand out on the surface as negative examples, so we chose them to mimic the obvious characteristics of the positive examples as closely as possible. We call the negative examples 'non-problems' and 'non-solutions' respectively. We wanted the only differences between problems and non-problems to be of a semantic nature, nothing that could be read off the surface. We therefore sampled a population of phrases that obey the same statistical distribution as our problem and solution strings, while making sure they really were negative examples. We started from sentences not containing any problem/solution words (i.e., those used as target words). From each such sentence, we randomly selected one syntactic subtree contained in it. From these, we randomly selected a subset of negative examples of problems and solutions that satisfy the following conditions:

  • The distribution of the head POS tags of the negative strings should perfectly match the head POS tags 3 of the positive strings. This has the purpose of achieving the same proportion of surface syntactic constructions as observed in the positive cases.
  • The average lengths of the negative strings must be within a tolerance of the average length of their respective positive candidates, e.g., non-solutions must have an average length very similar (i.e., ± a small tolerance) to solutions. We chose a tolerance value of 3 characters.
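The two conditions above can be sketched in code. This is a simplification under stated assumptions: it matches each negative to one positive per item (identical head POS tag, length within the tolerance), which is stricter than the aggregate constraints described but implies them; the tuple format and names are illustrative.

```python
import random

def sample_negatives(positives, candidates, tol=3, seed=0):
    """Pick one negative per positive whose head POS tag matches exactly
    and whose length is within `tol` characters of the positive's length.
    `positives` and `candidates` are lists of (phrase, head_pos) tuples."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)  # avoid any ordering bias in the candidate pool
    negatives = []
    for phrase, pos in positives:
        for i, (cand, cand_pos) in enumerate(pool):
            if cand_pos == pos and abs(len(cand) - len(phrase)) <= tol:
                negatives.append(pool.pop(i))
                break
    return negatives
```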

Again, a human quality check was performed on non-problems and non-solutions. For each candidate non-problem statement, the candidate was accepted if it did not contain a phenomenon, a problematic state, a research question or a non-functioning artefact. If the string expressed a research task, without explicit statement that there was anything problematic about it (i.e., the ‘wrong’ sense of “problem”, as described above), it was allowed as a non-problem. A clause was confirmed as a non-solution if the string did not represent both a response and positive evaluation.

If the annotator found that the sentence had been slightly mis-parsed, but did contain a candidate, they were allowed to move the boundaries for the candidate clause. This resulted in cleaner text, e.g., in the frequent case of coordination, when non-relevant constituents could be removed.

From the set of sentences which passed the quality-test for both independent assessors, 500 instances of positive and negative problems/solutions were randomly chosen (i.e. 2000 instances in total). When checking for correctness we found that most of the automatically extracted phrases which did not pass the quality test for problem-/solution-hood were either due to obvious learning cues or instances where the sense of problem-hood used is relating to tasks (cf. “ Goal statement and task ” section).

Experimental design

In our experiments, we used three classifiers, namely Naïve Bayes, Logistic Regression and a Support Vector Machine. For all classifiers an implementation from the WEKA machine learning library (Hall et al. 2009 ) was chosen. Given that our dataset is small, tenfold cross-validation was used instead of a held out test set. All significance tests were conducted using the (two-tailed) Sign Test (Siegel 1956 ).
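The experimental setup can be sketched in Python. The paper used the WEKA implementations; scikit-learn classifiers stand in here as analogues, and the sign test is omitted.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score, StratifiedKFold

def evaluate(X, y):
    """Mean tenfold stratified cross-validation accuracy for the three
    classifier families used in the paper."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    models = {
        "NB": GaussianNB(),
        "LR": LogisticRegression(max_iter=1000),
        "SVM": SVC(kernel="linear"),
    }
    return {name: cross_val_score(model, X, y, cv=cv).mean()
            for name, model in models.items()}
```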

Linguistic correlates of problem- and solution-hood

We first define a set of features without taking the phrase's context into account. This will tell us about the disambiguation ability of the problem/solution description's semantics alone. In particular, we cut out the rest of the sentence other than the phrase and never use it for classification. This is done for reasons similar to excluding certain 'give-away' phrases inside the phrases themselves (as explained above). As the phrases were found using templates, the machine learner would simply pick up on the semantics of the template, which always contains a synonym of "problem" or "solution", thus drowning out the subtler features we hope are inherent in the semantics of the phrases themselves. If we allowed the machine learner to use these stronger features, it would suffer in its ability to generalise to the real task.

ngrams Bags of words have traditionally been used successfully for classification tasks in NLP, so we included bags of words (lemmas) within the candidate phrases as one of our features (and treat it as a baseline later on). We also include bigrams and trigrams, as multi-word combinations can be indicative of problems and solutions, e.g., "combinatorial explosion".
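A minimal sketch of this feature extraction, using scikit-learn for illustration (the paper's counts were built over lemmas rather than raw tokens):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Uni-, bi- and trigram counts over candidate phrases.
phrases = ["combinatorial explosion of the search space",
           "smoothing is a good way to overcome data sparsity"]
vectorizer = CountVectorizer(ngram_range=(1, 3))
X = vectorizer.fit_transform(phrases)  # sparse phrase-by-ngram count matrix
```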

Polarity Our second feature concerns the polarity of each word in the candidate strings. Consider the following example of a problem taken from our dataset: "very conservative approaches to exact and partial string matches overgenerate badly". In this sentence, words such as "badly" are associated with negative polarity and are therefore useful in determining problem-hood. Similarly, solutions are often associated with a positive sentiment, e.g., "smoothing is a good way to overcome data sparsity". To capture this, we perform word sense disambiguation of each word using the Lesk algorithm (Lesk 1986 ). The polarity of the resulting synset in SentiWordNet (Baccianella et al. 2010 ) was then looked up and used as a feature.
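As an illustration of the polarity feature, the sketch below sums per-word scores from a toy lexicon. The scores are invented stand-ins for SentiWordNet values, and the Lesk disambiguation step is omitted.

```python
# Toy polarity lexicon standing in for SentiWordNet scores
# (values are assumptions for illustration only).
TOY_POLARITY = {"badly": -0.6, "overgenerate": -0.4, "good": 0.7, "overcome": 0.3}

def phrase_polarity(tokens):
    """Sum per-word polarity scores; negative totals hint at problemhood,
    positive totals at solutionhood."""
    return sum(TOY_POLARITY.get(t.lower(), 0.0) for t in tokens)
```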

Syntax Next, a set of syntactic features were defined by using the presence of POS tags in each candidate. This feature could be helpful in finding syntactic patterns in problems and solutions. We were careful not to base the model directly on the head POS tag and the length of each candidate phrase, as these are defining characteristics used for determining the non-problem and non-solution candidate set.

Negation Negation is an important property that can greatly affect the polarity of a phrase. For example, a keyword pertinent to solution-hood may be a good indicator, but the presence of negation may flip the polarity to problem-hood, e.g., "this can't work as a solution". We therefore determine the presence of negation.
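Presence of negation can be determined with a simple surface check; the cue list below is an assumption for illustration, not the paper's actual lexicon.

```python
# Assumed surface negation cues; "n't" covers clitic negation after tokenisation.
NEGATION_CUES = {"not", "n't", "no", "never", "cannot", "without"}

def has_negation(tokens):
    """Flag whether any surface negation cue occurs in the candidate."""
    return any(t.lower() in NEGATION_CUES for t in tokens)
```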

Exemplification and contrast Problems and solutions are often coupled with examples, as these allow the author to elucidate their point. For instance, consider the following solution: "Once the translations are generated, an obvious solution is to pick the most fluent alternative, e.g., using an n-gram language model" (Madnani et al. 2012 ). To acknowledge this, we check for the presence of exemplification. In addition, problems in particular are often found where contrast is signalled by the author (e.g., "however", "but"); we therefore also check for the presence of contrast, in the problem and non-problem candidates only.

Discourse Problems and solutions have also been found to correlate with discourse properties. For example, problem-solving patterns often occur in the background sections of a paper. The rationale is that the author is conventionally asked to objectively criticise other work in the background (e.g., describing research gaps which motivate the current paper). To take this into account, we examine the context of each string and capture the section header under which it is contained (e.g., Introduction, Future work). In addition, problems and solutions are often found following the Situation element in the problem-solving pattern (cf. " Introduction " section). This preamble setting up the problem or solution means that these elements are unlikely to occur at the very beginning of a section (it will usually take some sort of introduction to detail how something is problematic and why a solution is needed). We therefore record the distance from the candidate string to the nearest section header.
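The second discourse feature, distance to the nearest section header, can be sketched as follows; the sentence list and header set are illustrative.

```python
def distance_to_header(sentences, headers, idx):
    """Number of sentences between sentence `idx` and the nearest
    preceding section header."""
    for i in range(idx, -1, -1):
        if sentences[i] in headers:
            return idx - i
    return idx  # no header found; fall back to distance from document start
```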

Subcategorisation and adverbials Solutions often involve an activity (e.g., a task), so we also model the subcategorisation properties of the verbs involved. Our intuition was that since problematic situations are often described as non-actions, they are more likely to involve intransitive verbs; conversely, solutions are often actions and are likely to have at least one argument. This feature was calculated by running the C&C parser (Curran et al. 2007 ) on each sentence; C&C is a supertagger and parser that has access to subcategorisation information. Solutions are also associated with resultative adverbial modification (e.g., "thus", "therefore", "consequently"), as it expresses the solutionhood relation between the problem and the solution; it has been seen to occur frequently in problem-solving patterns, as studied by Charles ( 2011 ). We therefore check for the presence of resultative adverbial modification, in the solution and non-solution candidates only.

Embeddings We also add information from word embeddings, in several ways. Firstly, we created a Doc2Vec model (Le and Mikolov 2014 ), trained on  ∼  19  million sentences from scientific text (no overlap with our data set); an embedding was created for each candidate sentence. Secondly, word embeddings were calculated using the Word2Vec model (cf. " Corpus creation " section); for each candidate head, the full word embedding was included as a feature. Lastly, when creating our polarity feature we query SentiWordNet using synsets assigned by the Lesk algorithm. However, not all words are assigned a sense by Lesk, so care is needed when that happens. In those cases, the distributional semantic similarity of the word is compared to two words of known polarity, namely "poor" and "excellent". These particular words have consistently been good indicators of polarity status in many studies (Turney 2002 ; Mullen and Collier 2004 ). Semantic similarity was defined as cosine similarity on the embeddings of the Word2Vec model (cf. " Corpus creation " section).
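The embedding fallback for polarity can be sketched with toy two-dimensional vectors; in the paper, the real vectors came from the Word2Vec model described in "Corpus creation".

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def fallback_polarity(word_vec, poor_vec, excellent_vec):
    """When Lesk assigns no sense, compare the word's embedding to the two
    anchors: closer to "excellent" gives a positive score, closer to "poor"
    a negative one."""
    return cosine(word_vec, excellent_vec) - cosine(word_vec, poor_vec)
```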

Modality Responses to problems in scientific writing often express possibility and necessity, and so have a close connection with modality. Modality can be broken into three main categories, as described by Kratzer ( 1991 ), namely epistemic (possibility), deontic (permission / request / wish) and dynamic (expressing ability).

Problems have a strong relationship to modality within scientific writing. Often, this is due to a tactic called “hedging” (Medlock and Briscoe 2007 ) where the author uses speculative language, often using Epistemic modality, in an attempt to make either noncommital or vague statements. This has the effect of allowing the author to distance themselves from the statement, and is often employed when discussing negative or problematic topics. Consider the following example of Epistemic modality from Nakov and Hearst ( 2008 ): “A potential drawback is that it might not work well for low-frequency words”.

To take this linguistic correlate into account as a feature, we replicated the modality classifier described by Ruppenhofer and Rehbein ( 2012 ). More sophisticated modality classifiers have been introduced recently, for instance using a wide range of features and convolutional neural networks, e.g., Zhou et al. ( 2015 ) and Marasović and Frank ( 2016 ). However, we wanted to check the effect of a simpler method of modality classification on the final outcome before investing heavily in their implementation. We trained three classifiers using the subset of features which Ruppenhofer et al. reported as performing best, and evaluated them on the gold standard dataset provided by the authors 4 . The results are shown in Table  3 . The dataset contains annotations of English modal verbs on the 535 documents of the first MPQA corpus release (Wiebe et al. 2005 ).

Modality classifier results (precision/recall/f-measure) using Naïve Bayes (NB), logistic regression, and a support vector machine (SVM)

Italicized results reflect highest f-measure reported per modal category

Logistic Regression performed best overall, so this model was chosen for our upcoming experiments. The optative and concessive modal categories performed extremely poorly, with the optative category receiving a null score across all three classifiers. This is due to a limitation of the dataset, which is unbalanced and contains very few instances of these two categories. This imbalance is also the reason we report results in terms of precision, recall and f-measure in Table  3 .

The modality classifier was then retrained on the entirety of the dataset used by Ruppenhofer and Rehbein ( 2012 ) using the best performing model from training (Logistic Regression). This new model was then used in the upcoming experiment to predict modality labels for each instance in our dataset.

As can be seen from Table  4 , we are able to achieve good results for distinguishing a problematic statement from a non-problematic one. The bag-of-words baseline achieves a very good performance of 71.0% for the Logistic Regression classifier, showing that there is enough signal in the candidate phrases alone to distinguish them much better than random chance.

Results distinguishing problems from non-problems using Naïve Bayes (NB), logistic regression (LR) and a support vector machine (SVM)

Each feature set’s performance is shown in isolation followed by combinations with other features. Tenfold stratified cross-validation was used across all experiments. Statistical significance with respect to the baseline at the p  < 0.05 , 0.01, 0.001 levels is denoted by *, ** and *** respectively

Table  5  shows the information gain for the top lemmas.

Information gain (IG) in bits of top lemmas from the bag-of-words baseline in Table  4

We can see that the top lemmas are indeed indicative of problemhood (e.g., "limit", "explosion"). Bigrams achieved good performance on their own (as did negation and discourse), but performance unfortunately deteriorated when using trigrams, particularly with the SVM and LR. The subcategorisation feature was the worst performing feature in isolation. Upon taking a closer look at our data, we saw that our hypothesis that intransitive verbs are commonly used in problematic statements was confirmed, with over 30% of our problems (153) using them. However, due to our sampling method for the negative cases, we also picked up many intransitive verbs (163). This explains the near-chance performance (i.e., ∼50%), given that the distribution of intransitive verbs amongst the positive and negative candidates was almost even.
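The information gain ranking of lemmas can be reproduced with a small routine: for a binary lemma-presence feature, IG in bits is the reduction in class entropy. A minimal sketch:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature):
    """IG in bits of a feature (e.g. presence of a lemma in the phrase)
    with respect to the class labels."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [l for l, f in zip(labels, feature) if f == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional
```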

The modality feature was the most expensive to produce, but it also did not perform very well in isolation. This surprising result may be partly due to a data sparsity issue: only a small portion (169) of our instances contained modal verbs. The breakdown of the modal senses that occurred is displayed in Table  6 . The most dominant modal sense was epistemic. This is a good indicator of problemhood (e.g., hedging, cf. " Linguistic correlates of problem- and solution-hood " section), but if additional data could be accumulated, we think this feature has the potential to be much more valuable in determining problemhood. Another reason for the performance may be the domain dependence of the classifier, since it was trained on text from different domains (e.g., news). Additionally, modality has been shown to be helpful in determining contextual polarity (Wilson et al. 2005 ) and argumentation (Becker et al. 2016 ), so the output of this modality classifier may also prove useful for further feature engineering in future work.

Number of instances of modal senses

Polarity performed well, though not as well as we had hoped. This feature also suffers from a sparsity issue, resulting from cases where the Lesk algorithm (Lesk 1986 ) is not able to resolve the synset of the syntactic head.

Knowledge of syntax provides a big improvement with a significant increase over the baseline results from two of the classifiers.

Examining this in greater detail, POS tags with high information gain mostly included tags from open classes (i.e. VB-, JJ-, NN- and RB-). These tags are often more associated with determining polarity status than tags such as prepositions and conjunctions (i.e. adverbs and adjectives are more likely to be describing something with a non-neutral viewpoint).

The embeddings from Doc2Vec allowed us to obtain another significant increase in performance (72.9% with Naïve Bayes) over the baseline and polarity using Word2Vec provided the best individual feature result (77.2% with SVM).

Combining all features together, each classifier managed to achieve a significant result over the baseline, with the best result coming from the SVM (81.8%). Problems were also better classified than non-problems, as shown in the confusion matrix in Table  7 . The addition of the Word2Vec vectors may be seen as a form of smoothing in cases where previous linguistic features had a sparsity issue, i.e., instead of a NULL entry, the embeddings provide some value for each candidate. In particular, for the polarity feature, cases where Lesk was unable to resolve a synset meant that a ZERO entry was added to the vector supplied to the machine learner. Amongst the possible combinations, the best subset of features was found by combining all features with the exception of bigrams, trigrams, subcategorisation and modality. This subset improved results for both the Naïve Bayes and SVM classifiers, with the highest overall result coming from the SVM (82.3%).

Confusion matrix for problems

The results for disambiguation of solutions from non-solutions can be seen in Table  8 . The bag-of-words baseline performs much better than random, with the performance being quite high for the SVM (this result was also higher than any of the baseline performances from the problem classifiers). As shown in Table  9 , the top-ranked lemmas from the best performing model (using information gain) included "use" and "method". These lemmas are very indicative of solutionhood and so give some insight into the high baseline returned by the machine learners. Subcategorisation and the result adverbials were the two worst performing features. However, the low performance for subcategorisation is due to the sampling of the non-solutions (the same reason as for the low performance of the problem transitivity feature). When fitting the POS-tag distribution for the negative samples, we noticed that over 80% of the head POS tags were verbs (much higher than for the problem heads), the most frequent verb type being the infinitive form.

Results distinguishing solutions from non-solutions using Naïve Bayes (NB), logistic regression (LR) and a support vector machine (SVM)

Each feature set’s performance is shown in isolation followed by combinations with other features. Tenfold stratified cross-validation was used across all experiments

Information gain (IG) in bits of top lemmas from the bag-of-words baseline in Table  8

This is not surprising, given that a very common formulation for describing a solution uses the infinitive "TO", since it often describes a task, e.g., "One solution is to find the singletons and remove them". Since the head POS tags of the non-solutions had to match this high distribution of infinitive verbs present in the solutions, the subcategorisation feature is not particularly discriminative. Polarity, negation, exemplification and syntactic features were slightly more discriminative and provided comparable results. However, as in the problem experiment, the embeddings from Word2Vec and Doc2Vec proved to be the best features, with polarity using Word2Vec providing the best individual result (73.4% with SVM).

Combining all features together improved over each feature in isolation and beat the baseline using all three classifiers. Furthermore, as the confusion matrix in Table  10  shows, solutions were classified more accurately than non-solutions. The best subset of features was found by combining all features except adverbials of result, bigrams, exemplification, negation, polarity and subcategorisation. The best result using this subset was achieved by the SVM with 79.7%. It greatly improved upon the baseline but fell just short of statistical significance ( p = 0.057 ).

Confusion matrix for solutions

Conclusion

In this work, we have presented new supervised classifiers for the task of identifying problem and solution statements in scientific text. We have also introduced a new corpus for this task and used it to evaluate our classifiers. Great care was taken in constructing the corpus by ensuring that the negative and positive samples were closely matched in terms of syntactic shape. Had we simply selected random subtrees for negative samples without regard for any syntactic similarity with the positive samples, the machine learner might have exploited easy signals such as sentence length. Additionally, since we did not allow the machine learner to see the surroundings of the candidate string within the sentence, the task was made even harder. Our performance on the corpus shows promise for this task, and demonstrates that there are strong signals for determining both the problem and solution parts of the problem-solving pattern independently.

With regard to classifying problems from non-problems, features such as the POS tag, document and word embeddings provide the best features, with polarity using the Word2Vec embeddings achieving the highest feature performance. The best overall result was achieved using an SVM with a subset of features (82.3%). Classifying solutions from non-solutions also performs well using the embedding features, with the best feature also being polarity using the Word2Vec embeddings, and the highest result also coming from the SVM with a feature subset (79.7%).

In future work, we plan to link problem and solution statements which were found independently during our corpus creation. Given that our classifiers were trained on data solely from the ACL anthology, we also hope to investigate the domain specificity of our classifiers and see how well they can generalise to domains other than ACL (e.g., bioinformatics). Since we took great care to remove the classifiers' knowledge of the explicit statements of problem and solution (i.e., the classifiers were trained only on the syntactic argument of the explicit statement of problem-/solution-hood), they should in principle be in a good position to generalise, i.e., to find implicit statements too. In future work, we will measure to what degree this is the case.

To facilitate further research on this topic, all code and data used in our experiments can be found here: www.cl.cam.ac.uk/~kh562/identifying-problems-and-solutions.html

Acknowledgements

The first author has been supported by an EPSRC studentship (Award Ref: 1641528). We thank the reviewers for their helpful comments.

1 http://acl-arc.comp.nus.edu.sg/ .

2 The corpus comprises 3,391,198 sentences, 71,149,169 words and 451,996,332 characters.

3 The head POS tags were found using a modification of the Collins’ Head Finder. This modified algorithm addresses some of the limitations of the head finding heuristics described by Collins ( 2003 ) and can be found here: http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/ModCollinsHeadFinder.html .

4 https://www.uni-hildesheim.de/ruppenhofer/data/modalia_release1.0.tgz.

Contributor Information

Kevin Heffernan, Email: [email protected] .

Simone Teufel, Email: [email protected] .

  • Baccianella S, Esuli A, Sebastiani F. SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. LREC. 2010;10:2200–2204.
  • Becker, M., Palmer, A., & Frank, A. (2016). Clause types and modality in argumentative microtexts. In Workshop on foundations of the language of argumentation (in conjunction with COMMA).
  • Briscoe, T., Carroll, J., & Watson, R. (2006). The second release of the RASP system. In Proceedings of the COLING/ACL on interactive presentation sessions, Association for Computational Linguistics (pp. 77–80).
  • Chandrasekaran B. Towards a taxonomy of problem solving types. AI Magazine. 1983;4(1):9.
  • Charles M. Adverbials of result: Phraseology and functions in the problem-solution pattern. Journal of English for Academic Purposes. 2011;10(1):47–60. doi:10.1016/j.jeap.2011.01.002.
  • Chen, D., Dyer, C., Cohen, S. B., & Smith, N. A. (2011). Unsupervised bilingual POS tagging with Markov random fields. In Proceedings of the first workshop on unsupervised learning in NLP, Association for Computational Linguistics (pp. 64–71).
  • Collins M. Head-driven statistical models for natural language parsing. Computational Linguistics. 2003;29(4):589–637. doi:10.1162/089120103322753356.
  • Councill, I. G., Giles, C. L., & Kan, M. Y. (2008). ParsCit: An open-source CRF reference string parsing package. In LREC.
  • Curran, J. R., Clark, S., & Bos, J. (2007). Linguistically motivated large-scale NLP with C&C and Boxer. In Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Association for Computational Linguistics (pp. 33–36).
  • Flowerdew L. Corpus-based analyses of the problem-solution pattern: A phraseological approach. Amsterdam: John Benjamins Publishing; 2008.
  • Grimes JE. The thread of discourse. Berlin: Walter de Gruyter; 1975.
  • Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter. 2009;11(1):10–18. doi:10.1145/1656274.1656278.
  • Hoey M. Textual interaction: An introduction to written discourse analysis. Portland: Psychology Press; 2001.
  • Hutchins J. On the structure of scientific texts. UEA Papers in Linguistics. 1977;5(3):18–39.
  • Jonassen DH. Toward a design theory of problem solving. Educational Technology Research and Development. 2000;48(4):63–85. doi:10.1007/BF02300500.
  • Jordan MP. Short texts to explain problem-solution structures-and vice versa. Instructional Science. 1980;9(3):221–252. doi:10.1007/BF00177328.
  • Kratzer, A. (1991). Modality. In von Stechow & Wunderlich (Eds.), Semantics: An international handbook of contemporary research.
  • Le QV, Mikolov T. Distributed representations of sentences and documents. ICML. 2014;14:1188–1196.
  • Lesk, M. (1986). Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the 5th annual international conference on systems documentation, ACM (pp. 24–26).
  • Madnani, N., Tetreault, J., & Chodorow, M. (2012). Exploring grammatical error correction with not-so-crummy machine translation. In Proceedings of the seventh workshop on building educational applications using NLP, Association for Computational Linguistics (pp. 44–53).
  • Mann WC, Thompson SA. Rhetorical structure theory: Toward a functional theory of text organization. Text-Interdisciplinary Journal for the Study of Discourse. 1988;8(3):243–281. doi:10.1515/text.1.1988.8.3.243.
  • Marasović, A., & Frank, A. (2016). Multilingual modal sense classification using a convolutional neural network. In Proceedings of the 1st workshop on representation learning for NLP.
  • McKeown K, Daume H, Chaturvedi S, Paparrizos J, Thadani K, Barrio P, Biran O, Bothe S, Collins M, Fleischmann KR, et al. Predicting the impact of scientific concepts using full-text features. Journal of the Association for Information Science and Technology. 2016;67:2684–2696. doi:10.1002/asi.23612.
  • Medlock B, Briscoe T. Weakly supervised learning for hedge classification in scientific literature. ACL. 2007:992–999.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111–3119).
  • Mohit, B., Schneider, N., Bhowmick, R., Oflazer, K., & Smith, N. A. (2012). Recall-oriented learning of named entities in Arabic Wikipedia. In Proceedings of the 13th conference of the European chapter of the Association for Computational Linguistics (pp. 162–173).
  • Mullen T, Collier N. Sentiment analysis using support vector machines with diverse information sources. EMNLP. 2004;4:412–418.
  • Nakov, P., & Hearst, M. A. (2008). Solving relational similarity problems using the web as a corpus. In ACL (pp. 452–460).
  • Poon, H., & Domingos, P. (2009). Unsupervised semantic parsing. In Proceedings of the 2009 conference on empirical methods in natural language processing, Association for Computational Linguistics (pp. 1–10).
  • Ruppenhofer, J., & Rehbein, I. (2012). Yes we can!? Annotating the senses of English modal verbs. In Proceedings of the 8th international conference on language resources and evaluation (LREC) (pp. 24–26).
  • Saha, S. K., Mitra, P., & Sarkar, S. (2008). Word clustering and word selection based feature reduction for MaxEnt based Hindi NER. In ACL (pp. 488–495).
  • Scott, M. (2001). Mapping key words to problem and solution. In Patterns of text: In honour of Michael Hoey. Amsterdam: Benjamins (pp. 109–127).
  • Siegel S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill; 1956.
  • Strübing, J. (2007). Research as pragmatic problem-solving: The pragmatist roots of empirically-grounded theorizing. In The Sage handbook of grounded theory (pp. 580–602).
  • Teufel, S. (2000). Argumentative zoning: Information extraction from scientific text. PhD thesis.
  • Turney, P. D. (2002). Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics (pp. 417–424).
  • Van Dijk TA. Text and context: Explorations in the semantics and pragmatics of discourse. London: Longman; 1980.
  • Wiebe J, Wilson T, Cardie C. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation. 2005;39(2):165–210. doi:10.1007/s10579-005-7880-9.
  • Wilson, T., Wiebe, J., & Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing (pp. 347–354).
  • Winter, E. O. (1968). Some aspects of cohesion. In Sentence and clause in scientific English. University College London.
  • Zhou, M., Frank, A., Friedrich, A., & Palmer, A. (2015). Semantically enriched models for modal sense classification. In Workshop on linking models of lexical, sentential and discourse-level semantics (LSDSem) (p. 44).

Scientific Research – Types, Purpose and Guide

Scientific Research

Definition:

Scientific research is the systematic and empirical investigation of phenomena, theories, or hypotheses, using various methods and techniques in order to acquire new knowledge or to validate existing knowledge.

It involves the collection, analysis, interpretation, and presentation of data, as well as the formulation and testing of hypotheses. Scientific research can be conducted in various fields, such as natural sciences, social sciences, and engineering, and may involve experiments, observations, surveys, or other forms of data collection. The goal of scientific research is to advance knowledge, improve understanding, and contribute to the development of solutions to practical problems.

Types of Scientific Research

There are different types of scientific research, which can be classified based on their purpose, method, and application. Below, we discuss the four main types of scientific research.

Descriptive Research

Descriptive research aims to describe or document a particular phenomenon or situation, without altering it in any way. This type of research is usually done through observation, surveys, or case studies. Descriptive research is useful in generating ideas, understanding complex phenomena, and providing a foundation for future research. However, it does not provide explanations or causal relationships between variables.

Exploratory Research

Exploratory research aims to explore a new area of inquiry or develop initial ideas for future research. This type of research is usually conducted through observation, interviews, or focus groups. Exploratory research is useful in generating hypotheses, identifying research questions, and determining the feasibility of a larger study. However, it does not provide conclusive evidence or establish cause-and-effect relationships.

Experimental Research

Experimental research aims to test cause-and-effect relationships between variables by manipulating one variable and observing the effects on another variable. This type of research involves the use of an experimental group, which receives a treatment, and a control group, which does not receive the treatment. Experimental research is useful in establishing causal relationships, replicating results, and controlling extraneous variables. However, it may not be feasible or ethical to manipulate certain variables in some contexts.
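The core logic of an experiment, comparing an outcome between a treatment group and a control group, can be sketched with a simple permutation test, which asks how often a mean difference as large as the observed one would arise by chance alone. The group values below are hypothetical:

```python
# Minimal sketch of experimental comparison: treatment vs. control,
# assessed with a one-sided permutation test. Data are hypothetical.
import random

treatment = [14, 15, 16, 17, 18, 19]   # outcomes with the treatment
control   = [10, 11, 12, 13, 14, 15]   # outcomes without the treatment

# Observed difference in group means.
observed = sum(treatment) / len(treatment) - sum(control) / len(control)

random.seed(0)                 # fixed seed for reproducibility
pooled = treatment + control   # under the null, group labels are arbitrary
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    # Re-split the shuffled data into two groups of the original sizes (6 and 6).
    diff = sum(pooled[:6]) / 6 - sum(pooled[6:]) / 6
    if diff >= observed:
        count += 1
p_value = count / trials       # fraction of chance differences >= observed

print("observed difference:", observed)
print("approximate p-value:", p_value)
```

A small p-value suggests the treatment effect is unlikely to be due to random assignment alone; it does not, by itself, rule out confounding from a poorly controlled design.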

Correlational Research

Correlational research aims to examine the relationship between two or more variables without manipulating them. This type of research involves the use of statistical techniques to determine the strength and direction of the relationship between variables. Correlational research is useful in identifying patterns, predicting outcomes, and testing theories. However, it does not establish causation or control for confounding variables.
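As a concrete illustration of such a statistical technique, the Pearson correlation coefficient r measures the strength (magnitude) and direction (sign) of a linear relationship between two variables, neither of which is manipulated. The data below are hypothetical:

```python
# Minimal sketch of correlational analysis: Pearson's r between two
# observed (not manipulated) variables. Data are hypothetical.
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

hours_studied = [1, 2, 3, 4, 5, 6]
exam_score    = [52, 55, 61, 64, 70, 74]

# Values near +1 or -1 indicate a strong linear relationship.
print(round(pearson_r(hours_studied, exam_score), 3))
```

Note that even a strong correlation here would not establish that studying causes higher scores; a third variable could drive both, which is exactly the limitation described above.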

Scientific Research Methods

Scientific research methods are used to investigate phenomena, acquire knowledge, and answer questions using empirical evidence. Here are some commonly used scientific research methods:

Observational Studies

This method involves observing and recording phenomena as they occur in their natural setting. It can be done through direct observation or by using tools such as cameras, microscopes, or sensors.

Experimental Studies

This method involves manipulating one or more variables to determine the effect on the outcome. This type of study is often used to establish cause-and-effect relationships.

Survey Research

This method involves collecting data from a large number of people by asking them a set of standardized questions. Surveys can be conducted in person, over the phone, or online.

Case Studies

This method involves in-depth analysis of a single individual, group, or organization. Case studies are often used to gain insights into complex or unusual phenomena.

Meta-analysis

This method involves combining data from multiple studies to arrive at a more reliable conclusion. This technique can be used to identify patterns and trends across a large number of studies.
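One common way of combining studies is fixed-effect inverse-variance weighting: each study's effect estimate is weighted by the inverse of its squared standard error, so more precise studies contribute more to the pooled estimate. The effect sizes and standard errors below are hypothetical:

```python
# Minimal sketch of fixed-effect meta-analysis via inverse-variance
# weighting. Effect sizes and standard errors are hypothetical.
def pooled_effect(effects, std_errors):
    """Return the inverse-variance-weighted pooled effect and its SE."""
    weights = [1 / se ** 2 for se in std_errors]          # precision weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5                 # SE of the pooled effect
    return pooled, pooled_se

effects = [0.30, 0.45, 0.25]   # e.g. standardized mean differences per study
ses     = [0.10, 0.20, 0.15]   # standard error of each estimate

est, se = pooled_effect(effects, ses)
print(round(est, 3), round(se, 3))
```

The pooled standard error is smaller than any single study's, which is the sense in which combining studies yields a more reliable conclusion (assuming the studies estimate a common effect; random-effects models relax that assumption).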

Qualitative Research

This method involves collecting and analyzing non-numerical data, such as interviews, focus groups, or observations. This type of research is often used to explore complex phenomena and to gain an understanding of people’s experiences and perspectives.

Quantitative Research

This method involves collecting and analyzing numerical data using statistical techniques. This type of research is often used to test hypotheses and to establish cause-and-effect relationships.

Longitudinal Studies

This method involves following a group of individuals over a period of time to observe changes and to identify patterns and trends. This type of study can be used to investigate the long-term effects of a particular intervention or exposure.

Data Analysis Methods

There are many different data analysis methods used in scientific research, and the choice of method depends on the type of data being collected and the research question. Here are some commonly used data analysis methods:

  • Descriptive statistics: This involves using summary statistics such as mean, median, mode, standard deviation, and range to describe the basic features of the data.
  • Inferential statistics: This involves using statistical tests to make inferences about a population based on a sample of data. Examples of inferential statistics include t-tests, ANOVA, and regression analysis.
  • Qualitative analysis: This involves analyzing non-numerical data such as interviews, focus groups, and observations. Qualitative analysis may involve identifying themes, patterns, or categories in the data.
  • Content analysis: This involves analyzing the content of written or visual materials such as articles, speeches, or images. Content analysis may involve identifying themes, patterns, or categories in the content.
  • Data mining: This involves using automated methods to analyze large datasets to identify patterns, trends, or relationships in the data.
  • Machine learning: This involves using algorithms to analyze data and make predictions or classifications based on the patterns identified in the data.
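The descriptive statistics in the first bullet can be computed directly with Python's standard library; the sample data below are hypothetical survey responses on a 1-10 scale:

```python
# Minimal sketch of descriptive statistics using the standard library.
# The responses are hypothetical ratings on a 1-10 scale.
import statistics

responses = [7, 8, 6, 9, 7, 5, 8, 7, 10, 6]

print("mean:  ", statistics.mean(responses))              # average value
print("median:", statistics.median(responses))            # middle value
print("mode:  ", statistics.mode(responses))              # most frequent value
print("stdev: ", round(statistics.stdev(responses), 2))   # sample spread
print("range: ", max(responses) - min(responses))         # max minus min
```

Inferential statistics such as t-tests, ANOVA, and regression build on these quantities to draw conclusions about the wider population the sample came from.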

Application of Scientific Research

Scientific research has numerous applications in many fields, including:

  • Medicine and healthcare: Scientific research is used to develop new drugs, medical treatments, and vaccines. It is also used to understand the causes and risk factors of diseases, as well as to develop new diagnostic tools and medical devices.
  • Agriculture: Scientific research is used to develop new crop varieties, to improve crop yields, and to develop more sustainable farming practices.
  • Technology and engineering: Scientific research is used to develop new technologies and engineering solutions, such as renewable energy systems, new materials, and advanced manufacturing techniques.
  • Environmental science: Scientific research is used to understand the impacts of human activity on the environment and to develop solutions for mitigating those impacts. It is also used to monitor and manage natural resources, such as water and air quality.
  • Education: Scientific research is used to develop new teaching methods and educational materials, as well as to understand how people learn and develop.
  • Business and economics: Scientific research is used to understand consumer behavior, to develop new products and services, and to analyze economic trends and policies.
  • Social sciences: Scientific research is used to understand human behavior, attitudes, and social dynamics. It is also used to develop interventions to improve social welfare and to inform public policy.

How to Conduct Scientific Research

Conducting scientific research involves several steps, including:

  • Identify a research question: Start by identifying a question or problem that you want to investigate. This question should be clear, specific, and relevant to your field of study.
  • Conduct a literature review: Before starting your research, conduct a thorough review of existing research in your field. This will help you identify gaps in knowledge and develop hypotheses or research questions.
  • Develop a research plan: Once you have a research question, develop a plan for how you will collect and analyze data to answer that question. This plan should include a detailed methodology, a timeline, and a budget.
  • Collect data: Depending on your research question and methodology, you may collect data through surveys, experiments, observations, or other methods.
  • Analyze data: Once you have collected your data, analyze it using appropriate statistical or qualitative methods. This will help you draw conclusions about your research question.
  • Interpret results: Based on your analysis, interpret your results and draw conclusions about your research question. Discuss any limitations or implications of your findings.
  • Communicate results: Finally, communicate your findings to others in your field through presentations, publications, or other means.

Purpose of Scientific Research

The purpose of scientific research is to systematically investigate phenomena, acquire new knowledge, and advance our understanding of the world around us. Scientific research has several key goals, including:

  • Exploring the unknown: Scientific research is often driven by curiosity and the desire to explore uncharted territory. Scientists investigate phenomena that are not well understood, in order to discover new insights and develop new theories.
  • Testing hypotheses: Scientific research involves developing hypotheses or research questions, and then testing them through observation and experimentation. This allows scientists to evaluate the validity of their ideas and refine their understanding of the phenomena they are studying.
  • Solving problems: Scientific research is often motivated by the desire to solve practical problems or address real-world challenges. For example, researchers may investigate the causes of a disease in order to develop new treatments, or explore ways to make renewable energy more affordable and accessible.
  • Advancing knowledge: Scientific research is a collective effort to advance our understanding of the world around us. By building on existing knowledge and developing new insights, scientists contribute to a growing body of knowledge that can be used to inform decision-making, solve problems, and improve our lives.

Examples of Scientific Research

Here are some examples of scientific research that are currently ongoing or have recently been completed:

  • Clinical trials for new treatments: Scientific research in the medical field often involves clinical trials to test new treatments for diseases and conditions. For example, clinical trials may be conducted to evaluate the safety and efficacy of new drugs or medical devices.
  • Genomics research: Scientists are conducting research to better understand the human genome and its role in health and disease. This includes research on genetic mutations that can cause diseases such as cancer, as well as the development of personalized medicine based on an individual’s genetic makeup.
  • Climate change: Scientific research is being conducted to understand the causes and impacts of climate change, as well as to develop solutions for mitigating its effects. This includes research on renewable energy technologies, carbon capture and storage, and sustainable land use practices.
  • Neuroscience: Scientists are conducting research to understand the workings of the brain and the nervous system, with the goal of developing new treatments for neurological disorders such as Alzheimer’s disease and Parkinson’s disease.
  • Artificial intelligence: Researchers are working to develop new algorithms and technologies to improve the capabilities of artificial intelligence systems. This includes research on machine learning, computer vision, and natural language processing.
  • Space exploration: Scientific research is being conducted to explore the cosmos and learn more about the origins of the universe. This includes research on exoplanets, black holes, and the search for extraterrestrial life.

When to use Scientific Research

Some specific situations where scientific research may be particularly useful include:

  • Solving problems: Scientific research can be used to investigate practical problems or address real-world challenges. For example, scientists may investigate the causes of a disease in order to develop new treatments, or explore ways to make renewable energy more affordable and accessible.
  • Decision-making: Scientific research can provide evidence-based information to inform decision-making. For example, policymakers may use scientific research to evaluate the effectiveness of different policy options or to make decisions about public health and safety.
  • Innovation: Scientific research can be used to develop new technologies, products, and processes. For example, research on materials science can lead to the development of new materials with unique properties that can be used in a range of applications.
  • Knowledge creation: Scientific research is an important way of generating new knowledge and advancing our understanding of the world around us. This can lead to new theories, insights, and discoveries that can benefit society.

Advantages of Scientific Research

There are many advantages of scientific research, including:

  • Improved understanding: Scientific research allows us to gain a deeper understanding of the world around us, from the smallest subatomic particles to the largest celestial bodies.
  • Evidence-based decision making: Scientific research provides evidence-based information that can inform decision-making in many fields, from public policy to medicine.
  • Technological advancements: Scientific research drives technological advancements in fields such as medicine, engineering, and materials science. These advancements can improve quality of life, increase efficiency, and reduce costs.
  • New discoveries: Scientific research can lead to new discoveries and breakthroughs that can advance our knowledge in many fields. These discoveries can lead to new theories, technologies, and products.
  • Economic benefits: Scientific research can stimulate economic growth by creating new industries and jobs, and by generating new technologies and products.
  • Improved health outcomes: Scientific research can lead to the development of new medical treatments and technologies that can improve health outcomes and quality of life for people around the world.
  • Increased innovation: Scientific research encourages innovation by promoting collaboration, creativity, and curiosity. This can lead to new and unexpected discoveries that can benefit society.

Limitations of Scientific Research

Scientific research has some limitations that researchers should be aware of. These limitations can include:

  • Research design limitations: The design of a research study can impact the reliability and validity of the results. Poorly designed studies can lead to inaccurate or inconclusive results. Researchers must carefully consider the study design to ensure that it is appropriate for the research question and the population being studied.
  • Sample size limitations: The size of the sample being studied can impact the generalizability of the results. Small sample sizes may not be representative of the larger population, and may lead to incorrect conclusions.
  • Time and resource limitations: Scientific research can be costly and time-consuming. Researchers may not have the resources necessary to conduct a large-scale study, or may not have sufficient time to complete a study with appropriate controls and analysis.
  • Ethical limitations: Certain types of research may raise ethical concerns, such as studies involving human or animal subjects. Ethical concerns may limit the scope of the research that can be conducted, or require additional protocols and procedures to ensure the safety and well-being of participants.
  • Limitations of technology: Technology may limit the types of research that can be conducted, or the accuracy of the data collected. For example, certain types of research may require advanced technology that is not yet available, or may be limited by the accuracy of current measurement tools.
  • Limitations of existing knowledge: Existing knowledge may limit the types of research that can be conducted. For example, if there is limited knowledge in a particular field, it may be difficult to design a study that can provide meaningful results.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

When Science Is Taught This Way, Students Become Critical Friends: Setting the Stage for Student Teachers

  • Open access
  • Published: 08 July 2023
  • Volume 53, pages 1063–1079 (2023)

  • Paul Nnanyereugo Iwuanyanwu, ORCID: orcid.org/0000-0001-7641-6238

Effective science education draws on many different ways of teaching science. The literature on science education documents some potential benefits of argumentation instruction as a powerful tool for learning science and maintaining wonder and curiosity in the classroom. Unlike expository teaching, which relies on a teacher-driven pedagogy in which students accept the teacher’s authority over any content to be justified a priori, argumentation teaching allows students to focus on the importance of high-quality evidence for epistemic knowledge, reasoning, and justification. Using a quasi-experimental design, two study groups of undergraduate student teachers were exposed to two different learning conditions: the Exp-group with dialogic argumentation instruction (DAI) and the Ctrl-group with expository instruction. Each group received the same science content twice a week for 12 weeks (2 h per lesson). Pre- and posttests were administered to collect data. One-way MANCOVA with the pretest results as covariates showed that the instructional approaches (Wilks’ Λ = 0.765, p < 0.001) had a significant effect on the tested variables after the intervention. A pairwise comparison of performance indices between the two study groups revealed that the Exp-group was better able to evaluate alternative solutions and defend arguments for collaborative consensus on unstructured scientific problems. This suggests that dialogic argumentation instruction can be used to help students improve their scientific reasoning, thinking, and argumentation skills, which are required to solve problems involving scientific phenomena.

Introduction

Over the years, science education research has examined various issues related to science learning, including its nature, content, goals, and problems, and has found that some problems that appear to be specific to science education actually reveal broader instructional issues that affect students’ interest in science, their achievement, and their perception of its usefulness (Eccles & Wigfield, 2002; Ogunniyi, 2022; Toma & Lederman, 2020). Considering this, scholars and science education reformers recommend that science instruction be tailored to encourage students to reflect critically on science (International Council for Science [ICS], 2011), solve real-world problems, make and defend arguments about scientific knowledge (Iwuanyanwu, 2020), and gather and evaluate scientific evidence on a variety of topics to gain new knowledge (Osborne, 2019; Tarekegn et al., 2022). According to the social constructivist perspective (Schreiber & Valle, 2013), the process of gathering and analyzing scientific evidence to gain new knowledge requires students to engage in a dialogic relationship with their teachers and peers, who contribute to the construction of knowledge through reasoning, thinking, and sense-making (Iwuanyanwu, 2022). When participating in a dialogic relationship with peers, students are able to freely express their views, learn to ask questions, identify assumptions, think through problems with peers, and gather and weigh evidence to find solutions or reach a collaborative consensus.

The many benefits of engaging students, especially student teachers, in a dialogic relationship that promotes the social construction of knowledge can provide them with the skills they need to function as critical friends within the teaching profession as well as the scientific community in general. However, research has shown that student teachers and in-service teachers have difficulty completing these cognitive tasks (Erduran et al., 2016; Ghebru & Ogunniyi, 2017; Iwuanyanwu, 2017). For example, one of the reasons for conducting the current study was that student teachers in physics lectures were becoming increasingly perfunctory in formulating, presenting, and defending arguments for the best solution to unstructured problems and in evaluating evidence from opposing sides based on new data. In the literature, physics teachers who have encountered this type of student difficulty report observations that led them to incorporate dialogic argumentation in their lectures to allow their students and preservice teachers to explore possibilities for social negotiation and to discuss the uncertainty of multiple solutions to unstructured scientific problems (Etkina et al., 2019; Gürel & Süzük, 2017; Syafril et al., 2021). In light of this, dialogic argumentation instruction plays a significant role in equipping student teachers with the methodological repertoire needed to resolve scientific and socio-scientific issues both inside and outside the science classroom (Iwuanyanwu & Ogunniyi, 2018).

Considering these useful suggestions, the present study expands on the exploration of dialogic argumentation instruction in the physics classroom, focusing on two study groups of student teachers (experimental and control groups) as they learn to construct arguments, evaluate solutions, and justify them when faced with uncertainty. The following research question guided the study’s data gathering and analysis: How do student teachers in expository and DAI-based classes differ in their ability to develop valid problem-solving strategies, formulate, present, and defend arguments, and provide reasonable solutions to given problems?

Review of Literature

Recent research has explored how the argumentation framework can be used in physics education to enhance physics instruction and achieve various learning objectives (Erduran & Park, 2023; Syafril et al., 2021). Argumentation, which involves asserting claims and using evidence to support them, mirrors as closely as possible the enterprise of science learning (Kuhn & Udell, 2007). According to this perspective, the structure of an argument is determined by how evidence, data, reasons, and claims are presented to support the argument (Toulmin, 2003), which can follow inductive or deductive reasoning from premises to conclusions (Iwuanyanwu, 2019). Consistent with social constructivism, students’ dialogic interactions with teachers and peers have been shown to promote argumentation in physics classrooms (Gürel & Süzük, 2017; Hansson & Leden, 2016). Thus, the present study fits within the paradigm of dialogic argumentation (Ghebru & Ogunniyi, 2017; Iwuanyanwu, 2022). In science classrooms, the process of dialogic argumentation becomes evident when a student presents a reason for or against an assertion about a phenomenon (Iwuanyanwu, 2017). In such cases, a common goal is to reach consensus between different perspectives about plausible or acceptable claims (Erduran & Park, 2023). As Ogunniyi (2022) points out, learning and teaching science through dialogic argumentation (DAI) provides students and teachers with a forum to express themselves, clarify doubts or anomalies, better understand scientific phenomena, and possibly revise their viewpoints based on new knowledge about scientific phenomena.

In the current study, the focus on DAI aligns with the objective of helping physics student teachers become better at making and defending arguments and producing reasoned solutions to science problems, which are essential skills for the science teaching profession (Erduran & Park, 2023 ). In learning environments where dialogic argumentation has flourished, student teachers’ drive to explore has also evolved (Iwuanyanwu, 2022 ). In this regard, the DAI approach requires teachers to serve as mediators and ask thought-provoking questions to help students understand science (Iwuanyanwu & Ogunniyi, 2020 ; Koichu et al., 2022 ).

Conceptual Scheme of Dialogic Argumentation Instruction

Dialogic argumentation instruction (DAI) focuses on the components of learning that occur in a socially interactive context in which students learn about subject-specific concepts, formulate, present, and defend arguments and counterarguments to resolve contentious issues. Essentially, DAI includes three types of arguments that reflect the exploration of classroom activities, beginning with the individual argument or self-talk (intra-locutory arguments). Intra-locutory arguments are driven by a student’s self-talk, wonder, curiosity, or passion to understand and solve a particular scientific problem or phenomenon. During this phase, the student may notice something that intrigues her or stimulates her curiosity, leading her to ask questions, which in turn stimulates her inquiring mind. As she moves through this phase, the student puzzles over the task, asks more questions, and hypothesizes to create a new mental framework for the phenomenon. By focusing on the phenomenon more carefully and resolving related problems, the student may gain a better understanding of it (Iwuanyanwu, 2020 ). Consequently, the student provides evidence/data to support claims and/or counterclaims, and at best uses such evidence to justify arguments or solutions to given problems (Belland et al., 2011 ; Iwuanyanwu, 2022 ).

After completing the individual tasks, students move to their assigned small group to work on the next tasks, which are saturated with arguments and require reflective judgments. In this case, students enter into dialogues with their peers (inter-locutory arguments) to address issues that arise from their individual tasks or that are part of the group tasks. In this type of argument, dialogic relationships occur within and between subgroups in which students ask questions, critically engage with each other’s arguments and counterarguments, solicit responses, and connect meanings to reach a common consensus. In the current study, this is evident when students analyze different strategies for solving problems by collecting and comparing critical evidence for opposing viewpoints, making arguments and counterarguments using structures such as if/and/then/but/therefore, or considering facts for which additional evidence/data is needed to establish a common consensus about the uncertainty of solutions to particular problems (Gürel & Süzük, 2017 ; Iwuanyanwu, 2022 ; Jonassen, 2011 ). When dealing with unstructured problems, a plausible or acceptable solution is derived from the sum of all defended reasonable solutions (Geifman & Raban, 2015 ; Iwuanyanwu, 2020 ).

Moreover, in large group sessions, decisions made at the individual and group levels are mobilized again and typically articulated by group representatives (trans-locutory arguments), and the teacher serves as a mediator to facilitate learning with the explicit goal of reaching consensus. From this perspective, it can be said that teaching physics through dialogic argumentation can provide student teachers in the current study with a better understanding of physics that goes beyond the presentation of facts, definitions, laws, and problem-solving skills (Erduran & Park, 2023 ; Iwuanyanwu, 2019 ). When physics is taught in the context of dialogic argumentation instruction, it provides insight into students’ prior knowledge and views about specific scientific phenomena (Ghebru & Ogunniyi, 2017 ), including views that may not be consistent with valid scientific knowledge and may influence the way they approach learning physics concepts (Voss, 2006 ). In addition, some research evidence agrees that learning physics concepts through DAI can improve students’ ability to present and defend arguments and find reasoned solutions to given scientific problems (Erduran & Park, 2023 ; Gürel & Süzük, 2017 ; Iwuanyanwu & Ogunniyi, 2020 ).

Furthermore, the importance of solving scientific problems, making and defending judgments regarding the problems, and developing reasoned solutions to the problems cannot be overstated. After all, problem solving is a ubiquitous activity and an indispensable skill for students and teachers to acquire (Iwuanyanwu, 2020 ). According to UNESCO ( 2020 ), real-life challenges and recent COVID-19 health problems have led to an urgent need to address the gaps in students’ ability to think for themselves, reason, and solve different types of problems related to STEM education and other socio-scientific contexts. Recent studies by Erduran and Park ( 2023 ) and Tarekegn et al. ( 2022 ) suggest that educators could use dialogic argumentation instruction to address these identified gaps, which are among the skills students need in the twenty-first century. In light of this, DAI is a very important tool to help student teachers in the current study think independently; challenge the views of their classmates, lecturers, and others with reasoned arguments and counterarguments; and investigate the unresolved issues related to physics phenomena and problems. By doing so, students can build a broad range of knowledge and inquiry skills and gain more experience to be better able to act as critical friends within a larger scientific community.

Methodology

This study followed a pre–posttest control group design with quantitative data collection and analysis, as described below.

Setting and Samples

The faculty of education at a South African university where this study was conducted offers a four-year science education program that prepares students to become science teachers. The program includes science subjects such as chemistry, physics, and biological sciences. Since the study was prompted by a problem identified in a physics education class, it focuses on physics instruction. In the first and second years of the physics education program, student teachers must take an introductory physics module, usually offered in two separate classes. The module covers concepts such as the dynamics of uniform circular motion, thermodynamics, fluids, forces and motion, waves, and sound. Specific topics were taught in two separate classes in the first year and again at mid-year of the second year, when the study was conducted. Lectures are organized so that students attend two hours of physics lectures twice a week for 16 weeks.

The participants were second-year physics student teachers (n = 79; 37 females, 42 males) enrolled in the above program. Of the two classes, one was assigned as the experimental group (Exp-group, n = 46) and the other as the control group (Ctrl-group, n = 33). Rural, suburban, and urban student teachers were represented in each class. They were mostly between 19 and 23 years old (average age = 20; standard deviation = 3.86). Family incomes ranged from dual income to middle income. Student teachers who volunteered to participate in the study after it was approved by the ethics committee gave their consent by completing the POPIA consent form. In line with POPIA guidelines, all tasks submitted and assessed were treated with privacy, confidentiality, and anonymity. The first step of the study was to use a self-developed research instrument to generate baseline data for comparing students in the two classes. Details of the instrument review process are provided under Instrumentation. The first administration of the instrument (pretest) lasted two hours in week 1. Pretest data suggested that the study groups did not differ significantly (F = 4.82; p = 0.295). Based on this result, the study groups were inducted according to the pretest–posttest control group design.

Instrumentation

Seventeen different science problems were developed. The tasks required students to develop valid problem-solving strategies, formulate and defend arguments, evaluate the reasonableness of alternative solutions, and defend reasonable solutions, as shown in Table 1. Most tasks do not have a unique correct answer/solution (see, for example, science problem SP-Q2). The tasks were designed to elicit agreement and/or disagreement about their solutions, even when the tasks are considered solved. The instrument was tested for validity and reliability by two independent science educators who reviewed the tasks, appraising the instrument for content level, language appropriateness, and conceptual coverage. After several revisions, the final version yielded a Cohen’s kappa of 0.78, and the Kuder–Richardson 21 reliability coefficient was 0.73, indicating that the overall consistency and reliability of the 17 items were satisfactory (Creswell, 2013). Each of the 17 items required students to (a) develop valid problem-solving strategies, (b) formulate defensible arguments for their chosen strategies, (c) judge the reasonableness of alternative solutions (if applicable) to resolve ambiguities, and (d) defend the reasonable solutions/collaborative consensus reached. Following the existing literature, it was believed that the combination of the four variables (a–d) would allow for the integration of many of the skills and abilities student teachers need to collect and evaluate data/evidence, formulate and defend arguments, and, at their best, use arguments to solve physics problems and communicate scientific knowledge (Adams & Wieman, 2015; Belland et al., 2011; Iwuanyanwu, 2020).
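The two reliability statistics reported above can be reproduced from raw coding and score data. The sketch below uses hypothetical rater codes and total scores (the study's raw data are not reported) to show how Cohen's kappa and the Kuder–Richardson 21 coefficient are computed:

```python
from statistics import mean, pvariance

def cohen_kappa(rater1, rater2):
    """Cohen's kappa: chance-corrected agreement between two raters' codes."""
    n = len(rater1)
    p_obs = sum(a == b for a, b in zip(rater1, rater2)) / n
    cats = set(rater1) | set(rater2)
    # Expected agreement if raters coded independently at their marginal rates
    p_exp = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

def kr21(total_scores, n_items):
    """Kuder-Richardson 21 reliability from total scores on dichotomous items."""
    m = mean(total_scores)
    v = pvariance(total_scores)  # population variance of total scores
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * v))

# Hypothetical data: two reviewers' accept/revise codes on 8 items, and
# 8 students' total scores on a 17-item instrument.
kappa = cohen_kappa([1, 1, 0, 1, 0, 1, 1, 0], [1, 1, 0, 0, 0, 1, 1, 1])
reliability = kr21([10, 12, 8, 14, 9, 11, 13, 7], 17)
```

KR-21 assumes items of roughly equal difficulty, which is why it needs only total scores rather than the full item-response matrix.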

In accordance with the experimental pre–posttest control group design, the two study groups attended a series of biweekly physics lectures lasting 2 h per session over a 12-week period. The language of instruction was English. Both groups received the same content (advanced mechanics) and the same amount of teacher contact and were pretested and posttested with the same instrument (4 h total, in weeks 1 and 12). During the 12-week teaching–learning period, the Exp-group participated in a series of argumentation-based lessons delivered by the author using DAI (Erduran & Park, 2023; Gürel & Süzük, 2017; Iwuanyanwu & Ogunniyi, 2020), while the Ctrl-group received expository instruction from another instructor in the science education faculty (Adams & Wieman, 2015). Both teaching approaches consisted of cycles of preparation and reflection, which in turn led to the next cycle of instruction (Geifman & Raban, 2015). During week 1 (0.5 h on Day 1) of the study, students were briefed on the study project and given guidelines for their learning behaviors throughout the study. Following this meeting, the two-hour pretest data collection session took place. In week 2 (5 min before the DAI lesson began), students were asked to form groups of 5–6 students per group and were given a unique identification code for data collection.

Using the DAI guidelines as presented in the literature review, teaching and learning were facilitated through three key phases to actively engage students in learning scientific concepts as individuals (elaborating intra-locutory arguments within individuals), as small groups (elaborating inter-locutory arguments between subgroups), and as a whole class (elaborating trans-locutory arguments across groups). Following this mode, basic argument structures were taught so that students learned to use the structural elements of arguments, such as claims, data, evidence, reasons, and counterclaims, to formulate reasoned solutions to scientific problems (see Table 1). In weeks 2 through 12, the DAI process was scaffolded and students were guided; the structure and guidance decreased as students became better at creating evidence-based arguments to solve scientific problems and communicate scientific knowledge.

Moreover, since scientific problems by their nature require arguments and counterarguments to solve them (Voss, 2006), students in the Ctrl-group received guided instruction from their instructor, who liked to play the role of devil’s advocate. Since both groups received the same content and research instrument, the instructor of the Ctrl-group was asked to guide her students to become aware of the four target variables in order to solve scientific problems and develop reasoned solutions. Her instruction proceeded as shown in Table 2.

Data Collection and Analysis

The instrument (consisting of 17 items) was administered to both the Exp-group and the Ctrl-group as a posttest (within 2 h in week 12) toward the end of the semester. Following posttest data collection, the researcher compiled each student’s solution script, removed identifying information, and then assigned a number code. The same number code was used to track student performance and progress throughout the study. Student performance on the targeted science concepts and variables (Tables 1 and 2, respectively) was then analyzed using one-way MANCOVA (multivariate analysis of covariance), which was considered the most appropriate analysis to examine whether the two study groups differed on the outcomes of the four variables after the intervention. The purpose was to test whether participation in DAI and expository classroom activities helps students develop valid problem-solving strategies, formulate and defend arguments, and find reasoned solutions to given problems. Using pretest scores as covariates, the outcome variables were checked for linearity, multivariate normality, and homogeneity of variance–covariance matrices \(({\chi }^{2} = 33.19, df=28, p=.012)\) (Creswell, 2013).
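The core logic of using pretest scores as covariates can be illustrated, in a simplified univariate form, by a pooled within-group regression adjustment of posttest means. The actual analysis was a one-way MANCOVA across four outcomes; the sketch below uses hypothetical data for a single outcome:

```python
from statistics import mean

def pooled_within_slope(groups):
    """Pooled within-group regression slope of posttest on pretest."""
    sxy = sxx = 0.0
    for pre, post in groups.values():
        mx, my = mean(pre), mean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    return sxy / sxx

def adjusted_means(groups):
    """Posttest group means adjusted to the grand pretest mean (ANCOVA logic)."""
    b = pooled_within_slope(groups)
    grand_pre = mean(x for pre, _ in groups.values() for x in pre)
    return {g: mean(post) - b * (mean(pre) - grand_pre)
            for g, (pre, post) in groups.items()}

# Hypothetical (pretest, posttest) scores for two groups:
data = {"Exp": ([1, 2, 3], [3, 5, 7]), "Ctrl": ([2, 3, 4], [7, 9, 11])}
adj = adjusted_means(data)  # group means after removing the pretest difference
```

The adjusted means answer the question the covariate analysis poses: how would the groups have compared at posttest had they started from the same pretest level?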

Findings and Discussion

According to the results of this study, under two different teaching approaches, both study groups made significant progress in learning how to create and use evidence-based arguments to solve scientific problems and communicate knowledge. For example, the quantitative data and student reasoning episodes showed that a significant number of students progressed from the pretest to the posttest phase in terms of (a) constructing valid problem-solving strategies, (b) giving defensible arguments about their chosen strategies, (c) judging the reasonableness of alternative solutions to resolve ambiguities, and (d) defending reasonable solutions/collaborative consensus. From pretest to posttest, the Exp-group made significant progress in constructing valid problem-solving strategies (t = 6.75, p < 0.001) and was able to provide defensible arguments for its chosen strategies (t = 4.39, p < 0.001). The Ctrl-group made significant progress from pretest to posttest only in developing valid problem-solving strategies (t = 3.26, p < 0.001).

Table 3 summarizes the results for the four variables measured in the two study groups at the pre- and posttest levels. A one-way MANCOVA using pretest scores as a covariate suggests that instructional approach (Wilks’ Λ = 0.765, p < 0.001) had a statistically significant impact on postintervention outcomes for the variables tested. Univariate tests confirmed significant differences in posttest scores between the study groups on judging the reasonableness of alternative solutions to resolve ambiguities (F = 10.67, p = 0.0021) and defending reasonable solutions/collaborative consensus (F = 3.40, p < 0.001). Thus, students’ posttest scores on defending arguments related to solutions or consensus about scientific phenomena were significantly influenced by the instructional approach used. Further post hoc tests (Tukey HSD), shown in Table 4, indicate that the two study groups differed at posttest (Exp-group > Ctrl-group), with p(post-Var.3) = 0.042 and p(post-Var.4) = 0.016. Note: see Table 4 for a full description of “Var. 3” and “Var. 4.”

The results of a repeated-measures ANOVA with group (Exp or Ctrl) as the between-subjects factor and time (pre or post scores) as the within-subjects factor are summarized in Table 4. Scores showed a significant time × group effect for the Exp-group (F = 5.36, p < 0.001), along with a significant main effect of time (F = 4.82, p < 0.001). However, posttest scores were not significantly affected in the Ctrl-group (F = 0.290, p = 0.461).

A pairwise comparison of test time for learning in the study groups is shown in Table 5. There was a significant difference in posttest scores between the study groups (F = 5.36, p = 0.001, ηp² = 0.032), whereas pretest scores did not differ significantly between the two groups (F = 4.82, p = 0.195, ηp² = 0.0063). Posttest results differed significantly from pretest results in the Exp-group (F = 0.46, p < 0.001, ηp² = 0.02) and in the Ctrl-group (F = 6.56, p < 0.001, ηp² = 0.26). Moreover, the difference between the pre- and posttest means was greater in the experimental group than in the control group.
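Partial eta squared values like those reported in Table 5 can be recovered from an F statistic and its degrees of freedom via ηp² = (F·df_effect)/(F·df_effect + df_error). The degrees of freedom below are assumptions chosen purely for illustration, not values taken from the study's tables:

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Convert an F statistic and its degrees of freedom to partial eta squared."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Hypothetical: 2 groups (df_effect = 1) and df_error = 77 (n = 79 minus
# the two group means). F = 5.36 then corresponds to a small effect.
eta = partial_eta_squared(5.36, 1, 77)
```

This conversion is handy for checking reported effect sizes against reported F values when the raw data are unavailable.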

Overall student performance suggests that the control group performed at a much lower level than the experimental group on physics problems that depend on theoretical foundations or assumptions. This suggests that encouraging students to learn physics concepts and solve related problems in a DAI-based classroom may help students develop different reasoning skills (Asterhan & Schwarz, 2007; Osborne, 2010; Voss, 2006). Of the 46 student teachers in the Exp-group, 28 provided evidence-based arguments and rationales for multiple or alternative solutions. By comparison, 11 of their peers in the Ctrl-group did likewise. More than 49% of the group could not judge the adequacy of their proposed solutions to ill-structured problems using the structural elements of arguments such as claims/counterclaims, data, evidence, and reasoning. The following is an example of an ill-defined scientific problem that student teachers addressed during the pre- and posttests (SP-Q2, Table 6, Figs. 1, 2, 3, 4).

Figure 1: Pyramid scientific ill-defined problem. Source: unsplash.com

Figure 2: Initial predictions of work done during the construction of the pyramid

Figure 3: Final predictions of work done during the construction of the pyramid

Figure 4: Free-body diagram to support arguments generated

Student Teachers’ Deliberation on Scientific Problem

When student teachers were presented with task SP-Q2 during the pre- and posttests, they were asked to perform the first two of the four variables examined in the study. In this phase, they were allowed to make mistakes, correct mistakes, make predictions, and self-regulate their own thinking. In small groups of five, students negotiated the question and its meaning. They gathered data and used the available evidence to approach the problem in their subgroups. After this phase, students moved on to the remaining variables, but this time they were asked to develop possible strategic solutions based on sound arguments and to evaluate their solutions. To do this, they planned, collected, recorded, and analyzed data on the approach they would take to address the problem. To complete the final phase, each individual or group was asked to explain their proposed solutions, applying the argumentation skills they had acquired. As each group shared their progress, they were encouraged to challenge the evolving evidence by making claims/counterclaims, presenting their evidence, and communicating and justifying their solutions. Using the guidelines outlined in Table 1, the instructor constantly challenged the Exp-group to provide reasons for their claims/counterclaims, to pose thought-provoking questions, to persuade their classmates, who served as critical friends, and to provide evidence or data to support or refute opposing viewpoints.

Comparison of Student Teachers’ Responses to Ill-Defined Problems SP-Q2

Five student teachers in the Ctrl-group reported closely related problem-solving strategies and some arguments that included text structures such as if/then/but/therefore to support their solutions to task SP-Q2 (Table 6, Fig. 1), as did most of their peers in the Exp-group. However, the extent to which they presented arguments in support of their solutions based on the four outcome variables differed markedly from their peers in the DAI-based class, who consistently presented better arguments and used reasoned evidence to support their solutions. Considering that a problem solver may frame ill-structured problem-solving strategies differently based on his or her knowledge, experience, and/or insight into the problem context, the final solutions produced by the Ctrl-group showed some inconclusive reasoning episodes. In their final proposed solutions to SP-Q2, some incompatible excerpts of invalid scientific knowledge found in their solution scripts suggest that they considered the problem statement superficially, without adequately defining the problem context or exploring it using reasoned arguments. Their lack of experience with argumentation instruction may account for the discrepancy. Engaging students in argumentation instruction while they learn physics concepts can help them develop complex learning skills that will assist them in their future workplaces and in everyday life (Iwuanyanwu, 2022). Due to space limitations, only a few subgroup presentations were selected to show how students developed problem-solving strategies, made and defended arguments, and developed reasoned solutions to item SP-Q2. The question was as follows: How did the ancient Egyptians move the blocks up and into position when building the Great Pyramid? For simplicity, a student teacher is referred to as ‘ST.’

Excerpt 1: Subgroup 3 Presented, Subgroup 1 Responded

ST33: The question says: …in building the Great Pyramid, how did the ancient Egyptians move the blocks up and into position?

ST77: The Pyramid has a shape of a prism block…how they moved the blocks up is a mystery…

ST33: …but we can’t simply say the problem belongs to mystery…

ST49: I have been looking at the Pyramid image, I think they probably moved the blocks up the side of the Pyramid using a rope…

ST61: How is that possible? …how did they build the side of the Pyramid then, to now use it to support the upward movements of the blocks?

ST49: …they probably started from the foundation to get to a height that requires to pull the blocks up and into position

ST49: Here is my sketch, the rope is parallel by the opposite link…

ST33: …they probably had no machines, like crane, so those men pulling the rope must be able men to pull 2000–3000 kg block up there against gravity (\({F}_{g}\)) and frictional force (\({f}_{f}\))…how many men could have done so?

ST61: …certainly, it will depend on how strong the men were…and the magnitude of force each man could produce

ST77: That I agree, even so one wonders how they managed to secure the block against gravity and frictional force that Mercy was alluding to…

ST49: … with the frictional force downward the plane to oppose the pending motion… one can say from Newton’s second law that \(F-mg \mathrm{sin} \theta -{f}_{s}=0\)

ST61: The block could have been secured to a wood sled or something and …is pulled by multiple ropes

ST49: So, are you saying that one rope…a strong rope will not do the job?

ST77, ST33: …definitely not, Dawson …think about 2000–3000 kg…

ST49: Therefore, with multiple ropes, once the block began to move upward, one expects \(F={\mu }_{s}mg \mathrm{cos} \theta +mg \mathrm{sin} \theta\)

ST77: When the block is on the verge of moving up the pyramid side, the static friction reaches its maximum: \({f}_{s}={f}_{s,\mathrm{max}}\)

ST61: It makes sense then from this

ST33: So, the blocks could have been pulled up into position by teams of able men as Dawson said from the beginning…no evidence is available to suggest exactly how many men did so

ST49: Yeah

For student ST61, the premise presented by Dawson (ST49) suggesting that “the builders of the Great Pyramid probably used ropes to move the blocks up the side of the pyramid” (ST49) led him to ask, “…how did they build the side of the pyramid then to use it now to support the upward movements of the blocks?” His search for warrant was simply to point out that ‘facts’ are the arguments that can be made to support the premise. However, to make sure he understood the context, ST49 tacitly constructed his initial problem strategies and asked his group to identify and explore possible limitations. Available research suggests that ST49 is aware of the activity at levels of different cognitive styles, prior knowledge, experience, and reasoning ability (Jonassen, 2011 ; Osborne, 2010 ), and it is this level of awareness of the activity that is most likely to have an impact on him (Redish & Kuo, 2015 ) and lead to internalization (Adams & Wieman, 2015 ).

As can be seen, their arguments relate to important components of the kinematics and dynamics of moving objects. This supports the assumption that it would have required teams of capable men to lift 2000 to 3000 kg blocks of stone against gravity and the force of friction and move them into position, since machines such as cranes did not exist in those years (descriptive claim of ST33). With much to consider, and after refuting her colleague’s initial claim that “getting the blocks up and into position is a mystery” (ST77), she framed her thoughts in Newtonian concepts: how many men did it take? In response, ST61 added that it certainly depends on the strength and power that each man can muster. In fact, as part of the solution path, his argument holds that “…several ropes could have been used to move the blocks upward,” which explains the connection between epistemological optimism and the later inclusion of \(F={\mu }_{S}mg\mathrm{cos}\theta +mg \mathrm{sin }\theta\). After looking at some major components of the problem, they concluded that “…the blocks could have been pulled into position by teams of capable men …but in terms of the hard core, there is no indication of exactly how many men did this.”
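The force balance the students arrived at, \(F={\mu }_{s}mg\mathrm{cos}\theta +mg\mathrm{sin}\theta\), can be evaluated numerically. All inputs below (block mass, slope angle, friction coefficient, per-person pull) are illustrative assumptions, not values from the transcript:

```python
import math

def min_pull_force(m_kg: float, theta_deg: float, mu_s: float, g: float = 9.8) -> float:
    """Minimum force (N) to start a block moving up an incline,
    from F = mu_s*m*g*cos(theta) + m*g*sin(theta)."""
    th = math.radians(theta_deg)
    return m_kg * g * (mu_s * math.cos(th) + math.sin(th))

# Assumed values: 2500 kg block, 52-degree slope, mu_s = 0.3
F = min_pull_force(2500, 52, 0.3)

# Assuming each person can sustain roughly 350 N of pull, a rough
# lower bound on the team size is:
men = math.ceil(F / 350)
```

The order-of-magnitude answer (tens of people per block) is consistent with the students' conclusion that teams of able men, rather than any single strong rope-puller, would have been required.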

According to the findings of this study, approximately 27 students who received DAI lessons and 11 students who received expository lessons completed 9 of the 17 tasks at a level that met all four dependent variables tested. The remainder of both groups were unable to dismiss the potential conflicts inherent in the ill-structured tasks as unimportant or as the whims of mythical science. Although Shin et al. (2003) argued that students can solve unstructured science problems using general strategies if they are aware of or have sufficient knowledge about the problem domain, the current study adds that in order to solve unstructured problems and formulate reasoned solutions, students must be able to formulate and defend arguments that support their solution paths/strategies.

In terms of initiating and reflecting on alternatives, constructing arguments, evaluating, and reasoning, most students were idiosyncratic on some of the unstructured problems in the area of force and motion as they occur in everyday life. They were unable to identify with the meanings articulated by their classmates, who served as critical friends. Other studies have shown that successful solvers of unstructured problems need to justify their decisions about selected strategies in order to generate plausible or acceptable solutions (Belland et al., 2011; Gürel & Süzük, 2017; Syafril et al., 2021). While some of the students in the current study attempted to justify the decisions that led to the formulation of problem-solving strategies, others did not do so satisfactorily. As a result, this study shows that the creation and defence of arguments to develop a reasoned solution to an unstructured scientific problem occur through processes controlled by conditions that include students’ existing knowledge, reflective judgement, and subject-specific knowledge. Although not all student teachers were able to integrate expert arguments and reasoning skills when they were needed to construct valid problem-solving strategies, the arguments and justifications generated from the completed worksheets indicated that they were motivated to engage in the activity as critical friends (e.g., ST11, ST27, and ST8).

This study has described an interesting type of science learning that fuses explicit teaching of science knowledge with student solving of unstructured problems. It has also provided compelling data showing how dialogic argumentation instruction can help foster students’ thinking when they must make and defend judgments about different sides of issues, thereby contributing to the extant literature (Erduran & Park, 2023 ; Evagorou & Osborne, 2013 ; Tarekegn et al., 2022 ). The results of the current study are consistent with other research evidence demonstrating the efficacy of dialogic argumentation instruction for cultivating scientific knowledge among students and teachers (Asterhan & Schwarz, 2007 ; Iwuanyanwu & Ogunniyi, 2020 ; Lubben et al., 2010 ; Ogunniyi, 2022 ).

Additionally, the current study supports the claims of Erduran and Park (2023) and Gürel and Süzük (2017) that argumentation instruction plays a prominent role in helping student teachers better understand physics concepts and solve related problems. To some extent, the use of dialogic argumentation instruction appears to have helped students in the Exp-group mobilize better strategies for solving scientific problems. For example, the underlying assumptions they made about the four variables listed in Table 1 all showed indicators of shared tacit knowledge, shared cognitive discourse, and understanding among their classmates. In addition, the group was generally enthusiastic about the intervention program delivered via DAI, except for the few who found the repetition of the stages of the four variables tedious when solving problems individually. Nonetheless, students need argumentation skills and knowledge to solve complex problems such as those encountered in everyday life.

Finally, this study has shown that engaging student teachers in dialogic argumentation instruction can help them improve their ability to think, reason, and solve different types of problems related to subject knowledge and socio-scientific contexts, which are essential skills for the science teaching profession. It should be noted that the results of this study may be affected by some limitations. Primarily, this was a case study of physics student teachers from a single institution, taught by different instructors; generalizations of the results to the entire population of university science students should therefore be made with caution. Moreover, the study lasted only 12 weeks and included only 79 students, so a study with larger samples is recommended before implementing the approach on a large scale.

Adams, W. K., & Wieman, C. E. (2015). Analyzing the many skills involved in solving complex physics problems. American Journal of Physics, 83 (5), 459–467. https://doi.org/10.1119/1.4913923


Asterhan, C., & Schwarz, B. (2007). The effect of monological and dialogical argumentation on concept learning in evolution theory. Journal of Education Psychology, 99 (3), 626–639.

Belland, B. R., Glazewski, K. D., & Richardson, J. C. (2011). Problem-based learning and argumentation: Testing a scaffolding framework to support school students’ creation of evidence-based arguments. Instructional Science, 39 (5), 667–694.

Creswell, J. W. (2013). Qualitative inquiry and research design: Choosing among five approaches (3rd ed.). Sage.

Eccles, J. S., & Wigfield, A. (2002). Motivational beliefs, values, and goals. Annual Review of Psychology, 53 (1), 109–132.

Erduran, S., & Park, W. (2023). Argumentation in physics education research: Recent trends and key themes. In M. F. Taşar & P. R. L. Heron (Eds.), The international handbook of physics education research: Learning physics (pp. 16–32). AIP Publishing.

Erduran, S., Kaya, E., & Çetin, P. S. (2016). Pre-service teachers’ perceptions of argumentation: Impact of a teacher education project in Rwanda. Boğaziçi University Journal of Education, 33 (1), 1–21.

Etkina, E., Brookes, D., & Planinsic, G. (2019). Examples of ISLE-based learning of traditional physics topics and examples of ISLE-based physics problems. In E. Etkina, D. Brookes, & G. Planinsic (Eds.), Investigative science learning environment: When learning physics mirrors doing physics (pp. 2.1–2.32). IOP Publishing.

Evagorou, M., & Osborne, J. (2013). Exploring young students’ collaborative argumentation within a socio-scientific issue. Journal of Research in Science Teaching, 50 (2), 209–237.

Geifman, D., & Raban, D. R. (2015). Collective problem-solving: The role of self-efficacy, skill, and prior knowledge. Interdisciplinary Journal of e-Skills and Lifelong Learning, 11 , 159–178. https://doi.org/10.28945/2319

Ghebru, S., & Ogunniyi, M. (2017). Pre-service science teachers’ understanding of argumentation. African Journal of Research in Mathematics, Science and Technology Education, 21 (1), 49–60.

Gürel, C., & Süzük, E. (2017). Pre-service physics teachers’ argumentation in a model rocketry physics experience. Educational Sciences: Theory & Practice, 17 , 83–104. https://doi.org/10.12738/estp.2017.1.0042

Hansson, L., & Leden, L. (2016). Working with the nature of science in physics class: Turning ‘ordinary’ classroom situations into nature of science learning situations. Physics Education, 51 (5), 1–6.

International Council for Science. (2011). Report of the ICSU ad-hoc review panel on science. Paris, France. Available from http://www.icsu.org/publications/reports-and-reviews/external-review-of-icsu

Iwuanyanwu, P. N. (2017). An analysis of pre-service teachers’ ability to use a dialogical argumentation instructional model to solve mathematical problems in physics. Unpublished doctoral thesis. Cape Town: University of the Western Cape.

Iwuanyanwu, P. N. (2019). Students understanding of calculus-based kinematics and the arguments they generated for problem solving: The case of understanding physics. Journal of Education in Science, Environment and Health, 5 (2), 283–295. https://doi.org/10.21891/jeseh.581588

Iwuanyanwu, P. N. (2020). Nature of problem-solving skills for 21st century STEM learners: What teachers need to know. Journal of STEM Teacher Education, 55 (1), 27–40. https://doi.org/10.30707/JSTE55.1/MMDZ8325

Iwuanyanwu, P. N. (2022). What students gain by learning through argumentation. International Journal of Teaching and Learning in Higher Education, 34 (1), 97–107.

Iwuanyanwu, P. N., & Ogunniyi, M. B. (2018). Scientific and indigenous worldviews of pre-service teachers in an interactive learning environment. In Proceedings of the 4th Annual Conference of the African Association for the Study of Indigenous Knowledge Systems. Moshi, Tanzania: Ngā Pae o te Māramatanga.

Iwuanyanwu, P. N., & Ogunniyi, M. B. (2020). Effects of dialogical argumentation instructional model on pre-service teachers’ ability to solve conceptual mathematical problems in physics. African Journal of Research in Mathematics, Science and Technology Education, 24 (1), 121–141. https://doi.org/10.1080/18117295.2020.1748325

Jonassen, D. H. (2011). Learning to solve problems: A handbook for designing problem solving learning environments . Routledge.

Koichu, B., Schwarz, B. B., Heyd-Metzuyanim, E., Tabach, M., & Yarden, A. (2022). Design practices and principles for promoting dialogic argumentation via interdisciplinarity. Learning, Culture and Social Interaction, 37, 100657. https://doi.org/10.1016/j.lcsi.2022.100657

Kuhn, D., & Udell, W. (2007). Coordinating own and other perspectives in argument. Thinking and Reasoning, 13 (2), 90–104.

Lubben, F., Sadeck, M., Scholtz, Z., & Braund, M. (2010). Gauging students’ untutored ability in argumentation about experimental data: A South African case study. International Journal of Science Education, 32 (16), 2143–2166. https://doi.org/10.1080/09500690903331886

Ogunniyi, M. B. (2022). Implementing a socioculturally relevant science curriculum: The South African experience. In M. M. Atwater (Ed.), International handbook of research on multicultural science education (pp. 819–837). Springer International Handbooks of Education. Springer, Cham. https://doi.org/10.1007/978-3-030-83122-6_31

Osborne, J. F. (2010). Arguing to learn in science: The role of collaborative, critical discourse. Science, 328 , 463–466.

Osborne, J. F. (2019). Not “hands on” but “minds on”: A response to Furtak and Penuel. Science Education, 103 , 1280–1283. https://doi.org/10.1002/sce.21543

Redish, E. F., & Kuo, E. (2015). Language of physics, language of math. Science and Education, 25 (5–6), 561–590.

Schreiber, L. M., & Valle, B. E. (2013). Social constructivist teaching strategies in the small group classroom. Small Group Research, 44 (4), 395–411. https://doi.org/10.1177/1046496413488422

Shin, N., Jonassen, D. H., & McGee, S. (2003). Predictors of well-structured and ill-structured problem solving in an astronomy simulation. Journal of Research in Science Teaching, 40 (1), 6–33.

Syafril, S., Latifah, S., Engkizar, E., Damri, D., Asril, Z., & Yaumas, N. E. (2021). Hybrid learning on problem-solving abilities in physics learning: A literature review. Journal of Physics: Conference Series, 1796 (1), 012021. https://doi.org/10.1088/1742-6596/1796/1/012021

Tarekegn, G., Osborne, J., & Tadesse, M. (2022). Impact of dialogic argumentation pedagogy on grade 8 students’ epistemic knowledge of science. In M. Kalogiannakis, & M. Ampartzaki (Eds.), Advances in research in STEM education . IntechOpen. https://doi.org/10.5772/intechopen.104536 .

Toma, R. B., & Lederman, N. G. (2020). A comprehensive review of instruments measuring attitudes toward science. Research in Science Education, 1–16. https://doi.org/10.1007/s11165-020-09967-1

Toulmin, S. (2003). The uses of argument . Cambridge University Press.

UNESCO. (2020). Policy brief: Education during Covid-19 and beyond. http://unesdoc.unesco.org

Voss, J. F. (2006). Toulmin’s model and the solving of ill-structured problems. In D. Hitchcock & B. Verheij (Eds.), Arguing on the Toulmin model: New essays in argument analysis and evaluation (pp. 303–311). Springer.

Open access funding provided by North-West University.

Author information

Authors and Affiliations

Faculty of Education, School of Mathematics, Science and Technology Education (SDL research unit), North-West University, Potchefstroom, South Africa

Paul Nnanyereugo Iwuanyanwu

Corresponding author

Correspondence to Paul Nnanyereugo Iwuanyanwu .

Ethics declarations

Conflict of interest.

The author declares no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Iwuanyanwu, P. N. When Science Is Taught This Way, Students Become Critical Friends: Setting the Stage for Student Teachers. Res Sci Educ 53, 1063–1079 (2023). https://doi.org/10.1007/s11165-023-10122-9

Accepted: 21 June 2023

Published: 08 July 2023

Issue Date: December 2023


Keywords

  • Dialogic argumentation
  • Traditional instruction
  • Problem-solving
