Statistics and Probability: Sampling and Hypothesis Testing
What Is Sampling?
Sampling is the process of selecting a subset of individuals or items from a larger population to make inferences about the population as a whole. In statistics, it’s often not practical or possible to study an entire population, so samples are used instead.
There are different sampling methods that can be used, including:
- Random Sampling: Every individual in the population has an equal chance of being selected.
- Systematic Sampling: Individuals are selected at regular intervals from a list.
- Stratified Sampling: The population is divided into subgroups, or strata, and samples are taken from each subgroup.
- Cluster Sampling: The population is divided into clusters, a random sample of clusters is selected, and the individuals within the chosen clusters are studied.
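The four methods above can be sketched in a few lines of Python. This is only an illustration under assumptions: the student IDs, the two year-group strata, the ten classes used as clusters, and all function names are hypothetical and not part of the original lesson.

```python
import random

def simple_random_sample(population, n, rng):
    """Every individual has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th individual from the list."""
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start inside the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
students = list(range(1, 101))    # hypothetical student IDs 1..100
strata = {"year_1": students[:60], "year_2": students[60:]}
clusters = {f"class_{c}": students[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(students, 10, rng)
sys_s = systematic_sample(students, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
```

Note how the stratified sample keeps the 60/40 split of the two year groups, while the cluster sample contains two complete classes and nobody from the other eight.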
Key Concepts in Sampling
Sample Size and Population Size
- Sample Size: The number of individuals or items selected from the population. A larger sample size typically leads to more accurate estimates of population parameters.
- Population Size: The total number of individuals or items in the population. The sample is usually a small fraction of it, so how the sample is drawn matters as much as how large it is: a large sample chosen in a biased way can still misrepresent the population.
Sampling Error
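Sampling error is the difference between a sample statistic and the corresponding population parameter that arises because the sample is only a subset of the population. This error tends to shrink as the sample size increases: the standard error of a sample mean is $\sigma / \sqrt{n}$, so quadrupling the sample size halves it.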
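A small simulation can make this visible. The sketch below assumes a made-up population of 10,000 heights (mean about 170 cm, standard deviation about 10 cm) and a hypothetical helper `average_abs_error`; it measures how far sample means fall from the population mean, on average, for three sample sizes.

```python
import random
from statistics import fmean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]
mu = fmean(population)   # the population parameter we try to estimate

def average_abs_error(n, trials=500):
    """Average gap between the sample mean and mu over many random samples."""
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)
        errors.append(abs(fmean(sample) - mu))
    return fmean(errors)

results = {n: average_abs_error(n) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"n = {n:>4}: average sampling error is about {err:.2f} cm")
```

The printed errors should fall as n grows, by roughly a factor of $\sqrt{10}$ for each tenfold increase in sample size.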
What Is Hypothesis Testing?
Hypothesis testing is a statistical method used to make inferences or decisions about population parameters based on sample data. It involves testing a claim or assumption about a population using data collected from a sample.
The process of hypothesis testing involves several key steps:
- Formulate the Null and Alternative Hypotheses
- Select the Significance Level
- Calculate the Test Statistic
- Make a Decision (Reject or Fail to Reject the Null Hypothesis)
Null and Alternative Hypotheses
In hypothesis testing, we define two competing hypotheses:
- Null Hypothesis ($H_0$): The statement that there is no effect or no difference in the population (e.g., the population mean is equal to a specific value).
- Alternative Hypothesis ($H_1$): The statement that contradicts the null hypothesis (e.g., the population mean is different from the specific value).
Example Hypothesis
Example: We want to test if the average height of a group of students is 170 cm.
- Null Hypothesis ($H_0$): The average height is 170 cm.
$H_0: \mu = 170$
- Alternative Hypothesis ($H_1$): The average height is not 170 cm.
$H_1: \mu \neq 170$
Significance Level and p-Value
Significance Level ($\alpha$)
The significance level, denoted by $\alpha$, is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common significance levels are $0.05$, $0.01$, and $0.10$.
p-Value
The p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. If the p-value is less than the significance level ($\alpha$), we reject the null hypothesis.
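For a z-test, the p-value can be read off the standard normal distribution. The following is a minimal sketch using Python's standard library; the function names `two_tailed_p_value` and `decide` are illustrative, and a two-tailed alternative is assumed.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. the two-tailed p-value."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(z, alpha=0.05):
    """Return the p-value and the decision at significance level alpha."""
    p = two_tailed_p_value(z)
    return p, ("reject H0" if p < alpha else "fail to reject H0")

for z in (1.0, 1.96, 2.5):
    p, decision = decide(z)
    print(f"z = {z}: p-value = {p:.4f} -> {decision}")
```

A test statistic right at the familiar cutoff of 1.96 gives a p-value very close to 0.05, which is why 1.96 is the two-tailed critical value at the 5% level.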
Types of Errors in Hypothesis Testing
Type I Error
A Type I error occurs when the null hypothesis is rejected when it is actually true. The probability of a Type I error is equal to the significance level $\alpha$.
Type II Error
A Type II error occurs when the null hypothesis is not rejected when it is actually false. The probability of a Type II error is denoted by $\beta$.
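Both error rates can be estimated by simulation. The sketch below is an illustration under assumptions: it uses a two-tailed z-test with a known standard deviation of 8, samples of size 50, a null value of 60, and, for the Type II case, a made-up true mean of 63. The function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

Z = NormalDist()  # standard normal distribution

def rejects_h0(sample, mu0, sigma, alpha):
    """Two-tailed z-test with known sigma; True if H0: mu = mu0 is rejected."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return abs(z) > Z.inv_cdf(1 - alpha / 2)

def rejection_rate(true_mu, mu0, sigma, n, alpha, trials, rng):
    """Share of simulated samples in which H0 is rejected."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        hits += rejects_h0(sample, mu0, sigma, alpha)
    return hits / trials

rng = random.Random(1)  # fixed seed so the run is repeatable

# H0 is true (true mean equals 60): every rejection is a Type I error.
type_1_rate = rejection_rate(60, 60, 8, 50, 0.05, 2000, rng)

# H0 is false (true mean is 63, an assumed value): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(63, 60, 8, 50, 0.05, 2000, rng)

print(f"Estimated Type I error rate:  {type_1_rate:.3f}")
print(f"Estimated Type II error rate: {type_2_rate:.3f}")
```

The estimated Type I error rate should land close to the chosen $\alpha$, while the Type II error rate depends on how far the true mean is from the null value: the closer the two are, the larger $\beta$ becomes.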
Example Problem
Problem: A sample of 50 students is selected to test if the average weight of the students in a school is 60 kg. The null hypothesis is that the average weight is 60 kg, and the alternative hypothesis is that the average weight is not 60 kg. The significance level is $\alpha = 0.05$, and the sample mean is 62 kg with a standard deviation of 8 kg.
Test the hypothesis at the 5% significance level.
Solution:
- State the hypotheses:
$H_0: \mu = 60$, $H_1: \mu \neq 60$
- Calculate the test statistic:
Use the z-test formula:
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$
Where:
- $\bar{x} = 62$ (sample mean),
- $\mu = 60$ (population mean under $H_0$),
- $\sigma = 8$ (sample standard deviation, used as an estimate of the population standard deviation because the sample is large),
- $n = 50$ (sample size).
$z = \frac{62 - 60}{\frac{8}{\sqrt{50}}} = \frac{2}{\frac{8}{7.071}} = \frac{2}{1.131} = 1.77$
- Find the critical value for a two-tailed test at $\alpha = 0.05$. The critical z-values are $\pm 1.96$.
- Decision:
Since $|z| = 1.77 < 1.96$, we fail to reject the null hypothesis. There is not enough evidence to suggest that the average weight is different from 60 kg.
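The same calculation can be checked in Python. This is a minimal sketch using only the numbers given in the problem; the variable names are illustrative, and the p-value is an extra step that the worked solution does not compute.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
x_bar, mu0, s, n, alpha = 62, 60, 8, 50, 0.05

standard_error = s / sqrt(n)                  # 8 / sqrt(50)
z = (x_bar - mu0) / standard_error            # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

decision = "reject H0" if abs(z) > z_crit else "fail to reject H0"
print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}, p-value = {p_value:.3f}")
print(decision)
```

The p-value comes out at about 0.077, which is above 0.05, so the p-value approach and the critical-value approach lead to the same decision.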
Common Mistakes in Hypothesis Testing
- Not Defining Hypotheses Clearly: Make sure to clearly define both the null and alternative hypotheses before conducting the test.
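- Confusing p-Value with Significance Level: The p-value is calculated from the data: it is the probability of observing a test statistic at least as extreme as the one obtained, assuming the null hypothesis is true. The significance level is a threshold chosen before the test is run.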
- Misinterpreting Type I and Type II Errors: Be aware of the consequences of making a Type I or Type II error, and adjust the significance level accordingly.
Practice Questions
- A sample of 100 students has an average height of 160 cm. Test the hypothesis that the average height of students in the school is 165 cm at the 5% significance level.
- A new drug is tested on 200 patients. The null hypothesis is that the drug has no effect. The sample mean recovery time is 12 days with a standard deviation of 4 days. Perform a hypothesis test at the 1% significance level.
- In a factory, a machine is tested for accuracy. The null hypothesis is that the machine is accurate to within 0.5 mm. The sample measurement shows a mean of 0.8 mm with a standard deviation of 0.2 mm. Perform a hypothesis test at the 10% significance level.
Skinat Tuition | Where Expert Tutoring Meets Proven Results.