A chemical company has two factories, one in London and the other in Windsor. We wish to determine if there is any difference between the two factories with respect to how much methane is being put into the air. Readings of methane expulsion are taken at randomly spaced intervals in the two cities. The 8 readings in London show an average of 0.23 parts per million (ppm) with a standard deviation of 0.07 ppm. The 11 readings in Windsor show an average of 0.32 ppm with a standard deviation of 0.12 ppm.
(a) Is the methane expulsion different in the two cities? First, determine whether you should use the pooled or non-pooled test. Once you have determined that, test your hypothesis at the 5% level of significance by calculating the p-value. Make sure you state your null and alternative hypotheses (in symbols and words), make your decision, and draw your conclusion.
(b) Calculate a 99% confidence interval for the difference in actual mean methane levels for the two cities and provide a clear interpretation of the resulting interval.
Solution:
Part a
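Here, we use the two-sample t test for the difference between two population means. The two sample standard deviations are similar (0.12/0.07 ≈ 1.71, so the larger is less than twice the smaller), so by the usual rule of thumb it is reasonable to assume equal population variances and use the pooled test.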
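One common way to choose between the pooled and non-pooled test is a rule of thumb on the ratio of the sample standard deviations. The short sketch below applies that rule to the given summary statistics. The threshold of 2 and the variable names are assumptions of this sketch, not part of the original problem, and the rule is not a formal test of equal variances.

```python
# Sample standard deviations from the problem statement (ppm).
s_london, s_windsor = 0.07, 0.12

# Rule of thumb (an assumption of this sketch, not a formal test):
# pooling is considered reasonable when the larger sample standard
# deviation is less than twice the smaller one.
sd_ratio = max(s_london, s_windsor) / min(s_london, s_windsor)
use_pooled = sd_ratio < 2

print(f"SD ratio = {sd_ratio:.2f}, pooled test reasonable: {use_pooled}")
```

If the ratio were 2 or more, the non-pooled (unequal variances) test would usually be the safer choice.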
The null and alternative hypotheses for this test are given as below:
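Null hypothesis: H0: The mean methane expulsion is not different in the two cities.
Alternative hypothesis: Ha: The mean methane expulsion is different in the two cities.
H0: µ1 = µ2 versus Ha: µ1 ≠ µ2, where µ1 and µ2 are the population mean methane expulsion (ppm) in London and Windsor, respectively.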
Test statistic formula for pooled variance t test is given as below:
t = (X1bar – X2bar) / sqrt[Sp2*((1/n1)+(1/n2))]
Where Sp2 is pooled variance
Sp2 = [(n1 – 1)*S1^2 + (n2 – 1)*S2^2]/(n1 + n2 – 2)
We are given
X1bar = 0.23
X2bar = 0.32
S1 = 0.07
S2 = 0.12
n1 = 8
n2 = 11
df = n1 + n2 – 2 = 8 + 11 – 2 = 17
α = 0.05
Sp2 = [(n1 – 1)*S1^2 + (n2 – 1)*S2^2]/(n1 + n2 – 2)
Sp2 = [(8 – 1)* 0.07^2 + (11 – 1)* 0.12^2]/(8 + 11 – 2)
Sp2 = 0.0105
t = (X1bar – X2bar) / sqrt[Sp2*((1/n1)+(1/n2))]
t = (0.23 – 0.32) / sqrt[0.0105*((1/8)+(1/11))]
t = -0.09/0.0476
t = -1.8913
P-value = 0.0758 (two-tailed, df = 17)
(by using t-table)
P-value > α = 0.05
So, we do not reject the null hypothesis at the 5% level of significance.
There is insufficient evidence to conclude that the mean methane expulsion is different in the two cities.
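The arithmetic in Part a can be checked with a short script. The sketch below uses only the Python standard library. The function names are hypothetical, and the p-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is a stand-in for the t-table lookup used in the solution.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, standard error) for the pooled two-sample t test."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|].

    steps must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3   # P(0 < T < |t|)
    return 1 - 2 * central    # P(|T| > |t|)

# Group 1 = London, group 2 = Windsor (summary statistics in ppm).
t, df, se = pooled_t_test(0.23, 0.07, 8, 0.32, 0.12, 11)
p = two_sided_p(t, df)
alpha = 0.05
print(f"t = {t:.4f}, df = {df}, p-value = {p:.4f}, reject H0: {p < alpha}")
```

The printed values should agree with the hand calculation above up to rounding, and the decision is the same: the p-value exceeds 0.05, so H0 is not rejected.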
Part b
Here, we have to find the 99% confidence interval for the difference between the two population means.
Confidence interval for difference between two population means is given as below:
Confidence interval = (X1bar – X2bar) ± t*sqrt[Sp2*((1/n1)+(1/n2))]
Where Sp2 is pooled variance
Sp2 = [(n1 – 1)*S1^2 + (n2 – 1)*S2^2]/(n1 + n2 – 2)
Sp2 = 0.0105
Standard error = sqrt[Sp2*((1/n1)+(1/n2))] = 0.0476
Confidence level = 99%
df = 17
Critical value = 2.8982
(by using t-table)
(X1bar – X2bar) = 0.23 – 0.32 = -0.09
Confidence interval = (X1bar – X2bar) ± t*sqrt[Sp2*((1/n1)+(1/n2))]
Confidence interval = -0.09 ± 2.8982*0.0476
Confidence interval = -0.09 ± 0.137954
Lower limit = -0.09 - 0.137954 = -0.2279
Upper limit = -0.09 + 0.137954 = 0.0479
Confidence interval = (-0.2279, 0.0479)
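We are 99% confident that the difference between the population mean methane expulsion in the two cities (London minus Windsor) lies between -0.2279 ppm and 0.0479 ppm. Because this interval contains 0, the data do not show a difference between the two cities at the 1% level of significance.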
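The interval in Part b can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library. The critical value 2.8982 is taken from the solution's t-table lookup rather than computed, and the variable names are assumptions of this sketch.

```python
import math

# Summary statistics from the problem (ppm); group 1 = London, group 2 = Windsor.
mean1, sd1, n1 = 0.23, 0.07, 8
mean2, sd2, n2 = 0.32, 0.12, 11

# Critical value for 99% confidence with df = 17, as read from a t-table
# in the solution (not computed here).
t_crit = 2.8982

df = n1 + n2 - 2
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))             # standard error of the difference
diff = mean1 - mean2
margin = t_crit * se

lower, upper = diff - margin, diff + margin
contains_zero = lower < 0 < upper
print(f"99% CI: ({lower:.4f}, {upper:.4f}), contains 0: {contains_zero}")
```

Carrying the unrounded standard error through the calculation gives the same limits to four decimal places as the hand calculation.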