One way to understand the p-value is as the likelihood of observing a difference as large as the one found in a random sample, or a more extreme one, in a population where the null hypothesis is true. Because hypothesis testing is so common, p-values appear in almost all quantitative research papers. Before getting to the details, let us briefly review hypothesis testing.
Research hypotheses are predictions about the outcomes of a study, derived from theory or prior research. The best theories are definite and explicit, leaving little room for doubt about what would confirm or disconfirm them. Dependent variables are the criteria against which hypotheses are tested. Formulating testable hypotheses helps to clarify the questions surrounding a research problem: it compels the researcher to specify exactly what information is required to evaluate the hypothesis and how that information will be obtained.
When statistical techniques are used to test a hypothesis, it is conventionally assumed that any variation in the dependent variables is the result of chance. The procedures then determine how likely it is that an apparent difference is due to chance alone. Therefore, to be tested statistically, hypotheses must be stated in the null form (i.e., predicting no difference). The null hypothesis is retained when the analysis indicates a high probability that the observed difference could have arisen by chance, and it is rejected when that probability is low.
When a study hypothesis is supported, it signifies that the observed changes in the data are unlikely to be explained by chance alone. It also suggests that the causal variables proposed by the hypothesis may explain those changes, though they do not necessarily do so as long as plausible alternative hypotheses remain. Theory development is therefore less a matter of proving a hypothesis than of weeding out weak hypotheses until one remains that continues to hold up under attempts at disconfirmation.
Two types of error can occur when deciding whether to accept or reject a statistical hypothesis: Type I and Type II. A Type I (alpha) error occurs when the null hypothesis is rejected even though it is true, that is, when the dependent variable has not actually changed. A Type II (beta) error occurs when the null hypothesis is accepted even though a real change has occurred.
The probability of a Type I error in a hypothesis test is determined by the level of significance chosen, typically denoted alpha (α) or P critical. Alpha is a probability value between 0 and 1 that is selected prior to data collection. The most common alpha values are .05 and .01, but researchers may choose other values after weighing the consequences of Type I and Type II errors. With an alpha of .05, there is a 5% chance of rejecting H0 when it is true (a Type I error); with an alpha of .01, that chance is 1%. Alpha is the criterion for rejecting H0.
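As a rough illustration of what alpha means in practice, the following sketch (assuming NumPy and SciPy are available; the numbers and variable names are ours, not from the text) simulates many two-sample t-tests on data drawn from a single population, so H0 is true by construction. Roughly 5% of the simulated tests reject H0 at alpha = .05, matching the stated Type I error rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05          # chosen before "data collection"
n_tests = 10_000      # number of simulated studies
rejections = 0

for _ in range(n_tests):
    # Both samples come from the same population, so H0 (no difference) is true.
    a = rng.normal(loc=50, scale=10, size=30)
    b = rng.normal(loc=50, scale=10, size=30)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < alpha:
        rejections += 1   # rejecting a true H0 is a Type I error

print(f"Empirical Type I error rate: {rejections / n_tests:.3f}")  # close to 0.05
```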
After setting alpha, researchers gather data on a sample drawn at random from the population(s) about which they intend to draw conclusions. It is crucial to remember that H0 is specified before data collection. The statistic of interest is then computed from this sample. Decisions about accepting or rejecting H0 are based on the likelihood of the observed difference between the sample statistic and the hypothesized population parameter, in a population where H0 is true.
The next step is calculating the likelihood of obtaining the observed difference, or a more extreme one, in a random sample from a population where H0 is true. This likelihood is known as the p-value or P calculated. If the p-value is less than alpha, H0 is rejected and the result is called statistically significant. If the p-value is greater than alpha, H0 is not rejected and the result is called statistically nonsignificant. The p-value is calculated from sampling distributions and, like alpha and all other probabilities, ranges from 0 to 1.
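A minimal sketch of this decision rule, using a one-sample t-test from SciPy as a stand-in for whatever statistic a study actually uses (the measurements and the hypothesized mean are made up for illustration):

```python
from scipy import stats

alpha = 0.05                       # significance level, set before data collection
sample = [51.2, 48.9, 53.4, 50.1,  # hypothetical sample measurements
          55.0, 49.7, 52.8, 54.1]
mu_0 = 50.0                        # hypothesized population mean under H0

# Compute the test statistic and the p-value from the sampling distribution.
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu_0)

if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject H0 (statistically significant)")
else:
    print(f"p = {p_value:.3f} >= {alpha}: fail to reject H0 (nonsignificant)")
```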
The p-value was first formally introduced by Karl Pearson in his chi-squared test, using the chi-squared distribution and notated as a capital P. The p-values for the chi-squared distribution (for various values of χ² and degrees of freedom), now notated as P, were calculated in (Elderton 1902). Ronald Fisher later popularized the p-value, which plays a central role in his approach to statistics.
Suppose a researcher rolls a pair of dice once, with the null hypothesis that the dice are fair. The test statistic is the total of the two numbers rolled, and the test is one-tailed. The researcher rolls the dice, observes a 6 on both, and obtains a test statistic of 12. This result has a p-value of 1/36, or about 0.028, since 12 is the highest possible total among the 6 × 6 = 36 equally likely outcomes. If the researcher used a significance threshold of 0.05, this finding would be declared significant and the hypothesis that the dice are fair would be rejected. Yet a single roll provides a very poor foundation (inadequate evidence) for drawing conclusions about the dice, which illustrates the risk of applying p-values without considering the design of the experiment.
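The 1/36 figure can be checked by enumerating all 36 equally likely outcomes of a fair pair of dice; a small sketch in Python (the variable names are ours, chosen only for illustration):

```python
from itertools import product

# All 36 equally likely totals for a fair pair of dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]

observed = 12  # both dice showed 6
# One-tailed p-value: probability of a total at least as extreme as the observed one.
p_value = sum(total >= observed for total in outcomes) / len(outcomes)

print(p_value)  # 1/36 ≈ 0.028, below a 0.05 significance threshold
```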
Because the purpose of hypothesis testing in applied settings can be complex, quantitative results frequently require careful interpretation. A partial understanding of hypothesis testing can lead to systematic problems that compromise the quality of research reports. Understanding hypothesis testing and p-values, and using them carefully, is therefore essential for high-quality research reporting.