
Address for correspondence: Dr. (Prof.) Amitav Banerjee, Department of Community Medicine, D. Y. Patil Medical College, Pune - 411 018, India.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Hypothesis experimentation is crucial activity that empirical research and also evidence-based medicine. A well functioned up theory is half the answer come the research study question. For this, both understanding of the subject acquired from extensive review of the literature and also working understanding of basic statistical ideas are desirable. The present record discusses the methods of working up a an excellent hypothesis and also statistical principles of theory testing.

Karl Popper is probably the most influential philosopher of science of the 20th century (Wulff et al., 1986). Many scientists, even those who do not usually read books on philosophy, are acquainted with the basic principles of his views on science. The popularity of Popper's philosophy is due partly to the fact that it has been well explained in simple terms by, among others, the Nobel Prize winner Peter Medawar (Medawar, 1969). Popper makes the very important point that empirical scientists (those who stress observation alone as the starting point of research) put the cart before the horse when they claim that science proceeds from observation to theory, since there is no such thing as a pure observation that does not depend on theory. Popper states, "… the belief that we can start with pure observations alone, without anything in the nature of a theory, is absurd: as may be illustrated by the story of the man who dedicated his life to natural science, wrote down everything he could observe, and bequeathed his 'priceless' collection of observations to the Royal Society to be used as inductive (empirical) evidence."


The first step in the scientific process is not observation but the generation of a hypothesis, which may then be tested critically by observations and experiments. Popper also makes the important claim that the goal of the scientist's efforts is not the verification but the falsification of the initial hypothesis. It is logically impossible to verify the truth of a general law by repeated observations, but, at least in principle, it is possible to falsify such a law by a single observation. Repeated observations of white swans did not prove that all swans are white, but the observation of a single black swan sufficed to falsify that general statement (Popper, 1976).


A good hypothesis must be based on a good research question. It should be simple, specific and stated in advance (Hulley et al., 2001).

Hypothesis should be simple

A simple hypothesis contains one predictor and one outcome variable, e.g., positive family history of schizophrenia increases the risk of developing the condition in first-degree relatives. Here the single predictor variable is positive family history of schizophrenia and the outcome variable is schizophrenia. A complex hypothesis contains more than one predictor variable or more than one outcome variable, e.g., a positive family history and stressful life events are associated with an increased incidence of Alzheimer's disease. Here there are 2 predictor variables (positive family history and stressful life events) and one outcome variable (Alzheimer's disease). Complex hypotheses like this cannot be easily tested with a single statistical test and should always be separated into 2 or more simple hypotheses.

Hypothesis should be specific

A specific hypothesis leaves no ambiguity about the subjects and variables, or about how the test of statistical significance will be applied. It uses concise operational definitions that summarize the nature and source of the subjects and the approach to measuring variables (History of medication with tranquilizers, as measured by review of medical store records and physicians' prescriptions in the past year, is more common in patients who attempted suicide than in controls hospitalized for other conditions). This is a long-winded sentence, but it explicitly states the nature of the predictor and outcome variables, how they will be measured, and the research hypothesis. Often these details may be included in the study proposal and may not be stated in the research hypothesis. However, they should be clear in the mind of the investigator while conceptualizing the study.

Hypothesis should be stated in advance

The hypothesis must be stated in writing during the proposal stage. This will help to keep the research effort focused on the primary objective and create a stronger basis for interpreting the study's results, as compared with a hypothesis that emerges as a result of inspecting the data. The habit of post hoc hypothesis testing (common among researchers) is nothing but using third-degree methods on the data (data dredging) to yield at least something significant. This leads to overrating the occasional chance associations in the study.


For the purpose of testing statistical significance, hypotheses are classified by the way they describe the expected difference between the study groups.

Null and alternative hypotheses

The null hypothesis states that there is no association between the predictor and outcome variables in the population (There is no difference between the tranquilizer habits of patients with attempted suicide and those of age- and sex-matched "control" patients hospitalized for other diagnoses). The null hypothesis is the formal basis for testing statistical significance. By starting with the proposition that there is no association, statistical tests can estimate the probability that an observed association could be due to chance.

The proposition that there is an association (that patients with attempted suicide will report different tranquilizer habits from those of the controls) is called the alternative hypothesis. The alternative hypothesis cannot be tested directly; it is accepted by exclusion if the test of statistical significance rejects the null hypothesis.

One- and two-tailed alternative hypotheses

A one-tailed (or one-sided) hypothesis specifies the direction of the association between the predictor and outcome variables. The prediction that patients with attempted suicide will have a higher rate of tranquilizer use than control patients is a one-tailed hypothesis. A two-tailed hypothesis states only that an association exists; it does not specify the direction. The prediction that patients with attempted suicide will have a different rate of tranquilizer use (either higher or lower than control patients) is a two-tailed hypothesis. (The word "tails" refers to the tail ends of the statistical distribution, such as the familiar bell-shaped normal curve that is used to test a hypothesis. One tail represents a positive effect or association; the other, a negative effect.) A one-tailed hypothesis has the statistical advantage of permitting a smaller sample size than is permissible with a two-tailed hypothesis. Unfortunately, one-tailed hypotheses are not always appropriate; in fact, some investigators believe that they should never be used. However, they are appropriate when only one direction of the association is important or biologically meaningful. An example is the one-sided hypothesis that a drug has a higher frequency of side effects than a placebo; the possibility that the drug has fewer side effects than the placebo is not worth testing. Whichever strategy is used, it should be stated in advance; otherwise, it would lack statistical rigor. Dredging the data after they have been collected and deciding post hoc to change over to one-tailed hypothesis testing to reduce the sample size and P value are indicative of a lack of scientific integrity.
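The tail arithmetic above is easy to make concrete. A minimal sketch (not from the original paper; the z statistic of 1.80 is an illustrative value) showing that the two-tailed P value is twice the one-tailed P value for the same test statistic, so the same result can clear a 0.05 threshold one-tailed but miss it two-tailed:

```python
from statistics import NormalDist

def p_values(z: float) -> tuple[float, float]:
    """Return (one_tailed, two_tailed) p-values for a z statistic.
    The one-tailed version assumes the predicted direction is positive."""
    upper_tail = 1.0 - NormalDist().cdf(z)                 # P(Z > z)
    both_tails = 2.0 * (1.0 - NormalDist().cdf(abs(z)))    # P(|Z| > |z|)
    return upper_tail, both_tails

one_p, two_p = p_values(1.80)
print(f"one-tailed P = {one_p:.3f}, two-tailed P = {two_p:.3f}")
# one-tailed is ~0.036 (significant at 0.05); two-tailed is ~0.072 (not significant)
```

This doubling is exactly why switching to a one-tailed test after seeing the data inflates the apparent significance.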

STATISTICAL PRINCIPLES OF HYPOTHESIS TESTING

A hypothesis (for example, Tamiflu, a drug of choice in H1N1 influenza, is associated with an increased incidence of acute psychotic manifestations) is either true or false in the real world. Because the investigator cannot study all people who are at risk, he must test the hypothesis in a sample of the target population. No matter how much data a researcher collects, he can never absolutely prove (or disprove) his hypothesis. There will always be a need to draw inferences about phenomena in the population from events observed in the sample (Hulley et al., 2001). In some ways, the investigator's problem is similar to that faced by a judge trying a defendant. The absolute truth of whether the defendant committed the crime cannot be determined. Instead, the judge begins by presuming innocence: the defendant did not commit the crime. The judge must decide whether there is sufficient evidence to reject the presumed innocence of the defendant; the standard is known as beyond a reasonable doubt. A judge can err, however, by convicting a defendant who is innocent, or by failing to convict one who is actually guilty. In similar fashion, the investigator starts by presuming the null hypothesis, or no association between the predictor and outcome variables in the population. Based on the data collected in his sample, the investigator uses statistical tests to determine whether there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis that there is an association in the population. The standard for these tests is known as the level of statistical significance.

Table 1

The analogy between a judge's decisions and statistical tests

Judge's decision | Statistical test
Innocence: The defendant did not commit the crime | Null hypothesis: No association between Tamiflu and psychotic manifestations
Guilt: The defendant did commit the crime | Alternative hypothesis: There is an association between Tamiflu and psychosis
Standard for rejecting innocence: Beyond a reasonable doubt | Standard for rejecting the null hypothesis: Level of statistical significance (α)
Correct judgment: Convict a criminal | Correct inference: Conclude that there is an association when one does exist in the population
Correct judgment: Acquit an innocent person | Correct inference: Conclude that there is no association between Tamiflu and psychosis when one does not exist
Incorrect judgment: Convict an innocent person | Incorrect inference (type I error): Conclude that there is an association when there actually is none
Incorrect judgment: Acquit a criminal | Incorrect inference (type II error): Conclude that there is no association when there actually is one


TYPE I (ALSO KNOWN AS 'α') AND TYPE II (ALSO KNOWN AS 'β') ERRORS

Just like a judge's conclusion, an investigator's conclusion may be wrong. Sometimes, by chance alone, a sample is not representative of the population. Then the results in the sample do not reflect reality in the population, and the random error leads to an erroneous inference. A type I error (false-positive) occurs if an investigator rejects a null hypothesis that is actually true in the population; a type II error (false-negative) occurs if the investigator fails to reject a null hypothesis that is actually false in the population. Although type I and type II errors can never be avoided entirely, the investigator can reduce their likelihood by increasing the sample size (the larger the sample, the less likely it is to differ substantially from the population).
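The idea that type I errors arise "by chance alone" can be demonstrated by simulation. A small sketch (illustrative, not from the paper; the seed, group size, and trial count are arbitrary choices): draw both groups from the same population, so the null hypothesis is true by construction, and count how often a conventional two-sample test rejects it anyway.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2024)

def rejects_null(a, b, z_crit=1.96):
    """Two-sample z test (normal approximation): True if H0 is rejected at alpha = 0.05."""
    n = len(a)
    se = sqrt(stdev(a) ** 2 / n + stdev(b) ** 2 / n)
    return abs((mean(a) - mean(b)) / se) > z_crit

TRIALS, N = 2000, 50
# Both samples come from the same N(0, 1) population, so any rejection is a type I error.
false_positive_rate = sum(
    rejects_null([random.gauss(0, 1) for _ in range(N)],
                 [random.gauss(0, 1) for _ in range(N)])
    for _ in range(TRIALS)
) / TRIALS
print(false_positive_rate)  # hovers near the chosen alpha of 0.05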

False-positive and false-negative results can also occur because of bias (observer, instrument, recall, etc.). (Errors due to bias, however, are not referred to as type I and type II errors.) Such errors are troublesome, since they may be difficult to detect and cannot usually be quantified.


The likelihood that a study will be able to detect an association between a predictor variable and an outcome variable depends, of course, on the actual magnitude of the association in the target population. If it is large (such as a 90% increase in the incidence of psychosis in people who are on Tamiflu), it will be easy to detect in the sample. Conversely, if the size of the association is small (such as a 2% increase in psychosis), it will be difficult to detect in the sample. Unfortunately, the investigator often does not know the actual magnitude of the association; one of the purposes of the study is to estimate it. Instead, the investigator must choose the size of the association that he would like to be able to detect in the sample. This quantity is known as the effect size. Selecting an appropriate effect size is the most difficult aspect of sample size planning. Sometimes, the investigator can use data from other studies or pilot tests to make an informed guess about a reasonable effect size. When there are no data with which to estimate it, he can choose the smallest effect size that would be clinically meaningful, for example, a 10% increase in the incidence of psychosis. Of course, from the public health point of view, even a 1% increase in psychosis incidence would be important. Thus the choice of the effect size is always somewhat arbitrary, and considerations of feasibility are often paramount. When the number of available subjects is limited, the investigator may have to work backward to determine whether the effect size that his study will be able to detect with that number of subjects is reasonable.
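The link between effect size and sample size can be sketched with the standard normal-approximation formula for comparing two proportions. The incidence figures (5% vs. 10%) are purely illustrative, not taken from the paper:

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, beta=0.20):
    """Approximate per-group sample size for a two-sided comparison of
    two proportions (normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(1 - beta)         # ~0.84 for beta = 0.20
    p_bar = (p1 + p2) / 2
    n = ((z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar)) / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical figures: detect a rise in incidence from 5% to 10%.
print(sample_size_two_proportions(0.05, 0.10))  # roughly 436 subjects per group
```

Halving the detectable difference roughly quadruples the required n, which is why choosing the effect size dominates sample size planning.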


After a study is completed, the investigator uses statistical tests to try to reject the null hypothesis in favor of its alternative (much in the same way that a prosecuting attorney tries to convince a judge to reject innocence in favor of guilt). Depending on whether the null hypothesis is true or false in the target population, and assuming that the study is free of bias, 4 situations are possible, as shown in Table 2 below. In 2 of these, the findings in the sample and reality in the population are concordant, and the investigator's inference will be correct. In the other 2 situations, either a type I (α) or a type II (β) error has been made, and the inference will be incorrect.

Table 2

Truth in the population versus the results in the study sample: The four possibilities

Truth in the population | Association present | No association
Reject null hypothesis | Correct | Type I error
Fail to reject null hypothesis | Type II error | Correct


The investigator establishes the maximum chance of making type I and type II errors in advance of the study. The probability of committing a type I error (rejecting the null hypothesis when it is actually true) is called α (alpha); another name for it is the level of statistical significance.

If a study of Tamiflu and psychosis is designed with α = 0.05, for example, then the investigator has set 5% as the maximum chance of incorrectly rejecting the null hypothesis (and erroneously inferring that use of Tamiflu and psychosis incidence are associated in the population). This is the level of reasonable doubt that the investigator is willing to accept when he uses statistical tests to analyze the data after the study is completed.

The probability of making a type II error (failing to reject the null hypothesis when it is actually false) is called β (beta). The quantity (1 - β) is called power, the probability of observing an effect in the sample if an effect of the specified size, or greater, exists in the population.

If β is set at 0.10, then the investigator has decided that he is willing to accept a 10% chance of missing an association of a given effect size between Tamiflu and psychosis. This represents a power of 0.90, i.e., a 90% chance of finding an association of that size. For example, suppose that there really would be a 30% increase in psychosis incidence if the entire population took Tamiflu. Then 90 times out of 100, the investigator would observe an effect of that size or larger in his study. This does not mean, however, that the investigator will be absolutely unable to detect a smaller effect; only that he will have less than 90% likelihood of doing so.
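Power can be computed directly once the effect size, sample size, and alpha are fixed. A sketch using the same normal approximation as before (the 5% vs. 10% incidence figures and the group size of 436 are illustrative assumptions, not figures from the paper):

```python
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided comparison of two
    proportions with n subjects per group (normal approximation)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se = (2 * p_bar * (1 - p_bar) / n) ** 0.5
    return nd.cdf(abs(p1 - p2) / se - z_alpha)

# Hypothetical figures: 5% vs. 10% incidence, 436 subjects per group.
print(round(power_two_proportions(0.05, 0.10, 436), 2))  # ~0.8
```

Running the same function with a smaller true difference shows the power dropping below the planned level, which is the "smaller effect" caveat in the paragraph above.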

Ideally alpha and beta errors would be set at zero, eliminating the possibility of false-positive and false-negative results. In practice they are made as small as possible. Reducing them, however, usually requires increasing the sample size. Sample size planning aims at choosing a sufficient number of subjects to keep alpha and beta at acceptably low levels without making the study unnecessarily expensive or difficult.

Many studies set alpha at 0.05 and beta at 0.20 (a power of 0.80). These are somewhat arbitrary values, and others are sometimes used; the conventional range for alpha is between 0.01 and 0.10, and for beta, between 0.05 and 0.20. In general, the investigator should choose a low value of alpha when the research question makes it particularly important to avoid a type I (false-positive) error, and a low value of beta when it is especially important to avoid a type II error.


The null hypothesis acts like a punching bag: it is assumed to be true in order to shadowbox it into false with a statistical test. When the data are analyzed, such tests determine the P value, the probability of obtaining the study results by chance if the null hypothesis is true. The null hypothesis is rejected in favor of the alternative hypothesis if the P value is less than alpha, the predetermined level of statistical significance (Daniel, 2000). "Nonsignificant" results (those with a P value greater than alpha) do not imply that there is no association in the population; they only mean that the association observed in the sample is small compared with what could have occurred by chance alone. For example, an investigator might find that men with a family history of mental illness were twice as likely to develop schizophrenia as those with no family history, but with a P value of 0.09. This means that even if family history and schizophrenia were not associated in the population, there was a 9% chance of finding such an association due to random error in the sample. If the investigator had set the significance level at 0.05, he would have to conclude that the association in the sample was "not statistically significant." It might be tempting for the investigator to change his mind about the level of statistical significance ex post facto and report that the results "showed statistical significance at P < 0.10". A better choice would be to report that the "results, although suggestive of an association, did not achieve statistical significance (P = .09)". This solution acknowledges that statistical significance is not an "all or none" situation.
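The recommended reporting discipline can be captured in a few lines: the threshold is fixed in advance and the exact P value is reported either way, so there is no temptation to move the goalposts. A minimal sketch (the wording of the report strings is my own, modeled on the paragraph above):

```python
def report(p: float, alpha: float = 0.05) -> str:
    """Report a P value against a pre-specified alpha.
    The exact P value is stated regardless of which side of alpha it falls on."""
    if p < alpha:
        return f"statistically significant (P = {p:.2f})"
    return f"suggestive of an association but not statistically significant (P = {p:.2f})"

print(report(0.09))  # the family-history example from the text
print(report(0.03))
```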



Hypothesis testing is the sheet anchor of empirical research and of the rapidly emerging practice of evidence-based medicine. However, empirical research and, ipso facto, hypothesis testing have their limits. The empirical approach to research cannot eliminate uncertainty completely; at best, it can quantify uncertainty. This uncertainty can be of 2 types: type I error (falsely rejecting a null hypothesis) and type II error (falsely accepting a null hypothesis). The acceptable magnitudes of type I and type II errors are set in advance and are important for sample size calculations. Another important point to remember is that we cannot 'prove' or 'disprove' anything by hypothesis testing and statistical tests. We can only knock down or reject the null hypothesis and by default accept the alternative hypothesis. If we fail to reject the null hypothesis, we accept it by default.