Chapter 9 Inferences from Two Samples

Learning Outcome


Perform hypothesis tests about two population proportions and about two population means (independent and dependent samples).


The previous chapters covered the methods of estimating values of population parameters using confidence intervals and testing hypotheses about population parameters with a sample from one population. This chapter extends these methods to situations involving two populations.

9.1 Inferences about Two Proportions

Objectives:

  • Hypothesis test: conduct a hypothesis test of a claim about two population proportions.
  • Confidence interval: construct a confidence interval estimate of the difference between two population proportions.

Notations:

\[ \begin{cases} p_1 = \text {proportion in population 1 } \\ n_1 = \text {size of the sample drawn from population 1 } \\ x_1 = \text {number of successes observed in sample 1} \\ \hat p_1 = \dfrac{x_1}{n_1} \ (\text{sample 1 proportion}) \\ \hat q_1 = 1- \hat p_1 \ (\text{complement of sample 1 proportion}) \\ \end{cases} \] The corresponding notations \(p_2, n_2, x_2, \hat p_2, \hat q_2\) apply to population 2.

Requirements

  • Random - The sample proportions are from two simple random samples.
  • Independent - The two samples are independent: the sample values from one population are not related to, or naturally paired or matched with, the sample values from the other population.
  • Sample size - For each of the two samples, there are at least \(5\) successes and at least \(5\) failures. (That is \(n \hat p \ge 5\) and \(n \hat q \ge 5\) for each of the two samples.)

9.1.1 Hypothesis Test

The null hypothesis claims that the two population proportions are equal, which implies that both samples come from populations with the same proportion of successes. Under this assumption of equal proportions, the common population proportion is best estimated by pooling both samples into one large sample, so that \(\bar p\) is the estimator of the common population proportion. The pooled standard error is then calculated from \(\bar p\).

Let’s consider the difference between the two population proportions: \(p_1 - p_2\).

A reasonable point estimate of \(p_1 - p_2\), based on the samples drawn from the two populations, is \(\hat p_1 - \hat p_2\).

\[ H_0: p_1 - p_2 = 0 \\ H_A: p_1 - p_2 \ne 0 \\ \]

Pooled Sample Proportion

\[ \bar p = \dfrac{x_1 + x_2}{n_1 + n_2} \\ \bar q = 1 - \bar p \]

Hypothesis Test Statistic for Two Proportions

The mean of the sampling distribution of \((\hat p_1 - \hat p_2)\) is \(p_1 - p_2\), and its standard error under \(H_0\) is

\(\sqrt{\dfrac{\bar p \bar q}{n_1}+\dfrac{\bar p \bar q }{n_2}}\).

The test statistic is the \(z\)-score of \((\hat p_1 - \hat p_2)\):

\[ z = \dfrac{(\hat p_1 - \hat p_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar p \bar q}{n_1}+\dfrac{\bar p \bar q }{n_2}}} \\ \text{where, } p_1 - p_2 = 0 \]
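
For readers who want to check calculations by computer, here is a minimal Python sketch of this pooled test statistic (the function name `two_prop_z_test` is just a placeholder, and scipy is assumed to be available for the normal tail probability):

```python
from math import sqrt
from scipy.stats import norm

def two_prop_z_test(x1, n1, x2, n2):
    """Two-tailed pooled z-test of H0: p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)            # pooled sample proportion
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat - 0) / se           # (p1 - p2) = 0 under H0
    p_value = 2 * norm.sf(abs(z))            # two-tailed p-value
    return z, p_value
```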

9.1.2 Confidence Interval Estimate of \(p_1 - p_2\)

The confidence interval estimate of the difference \(p_1 - p_2\) is

\((\hat p_1 - \hat p_2) - E \lt (p_1 - p_2) \lt (\hat p_1 - \hat p_2) + E\)

where,

the standard error: \(SE_{\hat p_1 - \hat p_2} = \sqrt{SE^2_{\hat p_1} + SE^2_{\hat p_2}} = \sqrt{\dfrac{\hat p_1 \hat q_1 }{n_1}+\dfrac{\hat p_2 \hat q_2}{n_2}}\)

margin of error: \(E = z^*_{\alpha/2} \sqrt{\dfrac{\hat p_1 \hat q_1}{n_1}+\dfrac{\hat p_2 \hat q_2 }{n_2}}\)

\(z^*_{\alpha/2}\) is the critical value that corresponds to the confidence level \((1-\alpha)\).

Notice that the standard error calculation in the confidence interval is based on \(\hat p_1\) and \(\hat p_2\), whereas the hypothesis test uses a standard error based on pooled proportion \(\bar p\).
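
A corresponding sketch for the confidence interval, built on the unpooled standard error above (again the function name `two_prop_z_interval` is only illustrative, with scipy assumed):

```python
from math import sqrt
from scipy.stats import norm

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = norm.ppf(1 - (1 - confidence) / 2)   # critical value z*_{alpha/2}
    e = z_star * se                               # margin of error
    diff = p1_hat - p2_hat
    return diff - e, diff + e
```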

Example (Two-proportional \(z\)-interval):

How much difference is there between the proportion of male drivers who wear seat belts when sitting next to a female passenger and the proportion who do so when sitting next to a male passenger? Construct a \(95\%\) confidence interval for the difference.
With female passengers: \(2777\) wore seat belts, \(1431\) did not.
With male passengers: \(1363\) wore seat belts, \(1400\) did not.

Solution:

\[ \begin{align} n_F &= 4208, \\ n_M &= 2763, \\ \hat p_F &= \frac{2777}{4208} = 0.660, \\ \hat p_M &= \frac{1363}{2763} = 0.493 \\ \\ SE_{\hat {p_F} - \hat p_M} &= \sqrt{\frac{\hat p_F(1- \hat p_F)}{n_F}+\frac{\hat p_M(1-\hat p_M)}{n_M}} \\ &= \sqrt{\frac{(0.660)(1-0.660)}{4208}+\frac{0.493(1-0.493)}{2763}} \\ &= 0.012 \\ \\ ME &= z^* \times SE(\hat p_F - \hat p_M) = 1.96 \times 0.012 = 0.024 \\ \\ \hat p_F - \hat p_M &= 0.660 - 0.493 = 0.167 \\ \\ \text{Confidence Interval:} \\ (\hat p_F - \hat p_M) - ME &\lt ( p_F - p_M) \lt (\hat p_F - \hat p_M) + ME \\ \\ 0.167 - 0.024 &\lt ( p_F - p_M) \lt 0.167 + 0.024 \\ \\ 0.143 &\lt ( p_F - p_M) \lt 0.191 \end{align} \]
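
A few lines of Python reproduce this interval from the raw counts (scipy assumed; the small discrepancy in the upper limit comes from rounding the margin of error above):

```python
from math import sqrt
from scipy.stats import norm

p_f, p_m = 2777 / 4208, 1363 / 2763                         # sample proportions
se = sqrt(p_f * (1 - p_f) / 4208 + p_m * (1 - p_m) / 2763)  # unpooled SE
me = norm.ppf(0.975) * se                                   # z* = 1.96 at 95%
diff = p_f - p_m
print(diff - me, diff + me)  # about 0.143 and 0.190 (0.191 above uses the rounded ME)
```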

Example (Two-proportional \(z\)-test):

The Sleep in America Poll found that \(205\) of \(293\) Gen-Y respondents and \(235\) of \(469\) Gen-X respondents use the Internet before sleep. Is this difference real?

\[ \hat p_Y = \dfrac{205}{293} = 0.700 \\ \hat p_X = \dfrac{235}{469} = 0.501 \\ \hat p_Y - \hat p_X = 0.700 - 0.501 = 0.199 \] The null hypothesis claims that the two proportions are equal. We want to test whether the difference observed in the sample is statistically different from \(0\) or not.

In other words,

Null Model: \(p_Y - p_X = 0\)

The sampling distribution of \((\hat p_Y - \hat p_X)\) is centered around \(0\).

\[ \begin{align} \hat p_{pooled} &= \frac{x_Y + x_X}{n_Y + n_X} = \frac{205+235}{293+469} = 0.5774 \\ \\ SE_{pooled}(\hat p_Y - \hat p_X) &= \sqrt{\frac{\hat p_{pooled}(1- \hat p_{pooled})}{n_Y}+\frac{\hat p_{pooled}(1-\hat p_{pooled})}{n_X}} \\ &= \sqrt{\frac{0.5774 \times (1- 0.5774)}{293}+\frac{0.5774 \times (1-0.5774)}{469}} \\ &= 0.0368 \\ \\ \end{align} \]

\[ \begin{align} &\text{Two-tailed two-proportional z-test} \\ \\ &H_0: p_Y - p_X = 0 \\ &H_A: p_Y - p_X \ne 0 \\ \\ &\hat p_Y - \hat p_X = 0.700 - 0.501 = 0.199 \\ \\ &z = \frac{0.199 - 0}{0.0368} = 5.41 \\ \\ &p \text{-value} = 2 \times P(z \gt 5.41) \lt 0.001 \lt 0.05 \end{align} \]

Hence, reject \(H_0\). The proportions for Gen Y and Gen X are significantly different.
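
The same test can be cross-checked with the `proportions_ztest` function from the statsmodels package, which pools the two samples by default (this sketch assumes statsmodels is installed):

```python
from statsmodels.stats.proportion import proportions_ztest

# successes and sample sizes for Gen Y and Gen X
z, p = proportions_ztest(count=[205, 235], nobs=[293, 469])
print(z, p)  # z is about 5.41; the two-tailed p-value is far below 0.05
```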

Example (Two-proportional \(z\)-test):

\(62\) of \(325\) girls and \(75\) of \(268\) boys have online profiles. Is there a real difference between the proportions for all boys and all girls?

Null Model:

mean: \(p_B - p_G = 0\), so the sampling distribution of \((\hat p_B - \hat p_G)\) is centered at \(0\).

\[ \begin{align} \hat p_{pooled} &= \frac{x_B + x_G}{n_B + n_G} = \frac{75+62}{268+325} = 0.231 \\ \\ SE_{pooled}(\hat p_B - \hat p_G) &= \sqrt{\frac{\hat p_{pooled}(1- \hat p_{pooled})}{n_B}+\frac{\hat p_{pooled}(1-\hat p_{pooled})}{n_G}} \\ &= \sqrt{\frac{0.231 \times (1- 0.231)}{268}+\frac{0.231 \times (1-0.231)}{325}} \\ &= 0.0348 \\ \\ \end{align} \]

\[ \begin{align} &\text{Two-tailed two-proportional z-test} \\ &H_0: p_B - p_G = 0 \\ &H_A: p_B - p_G \ne 0 \\ \\ &\hat p_B - \hat p_G = 0.28 - 0.19 = 0.09 \\ \\ &z = \frac{0.09 - 0}{0.0348} = 2.59 \\ \\ &p \text{-value} = 2 \times P(z \gt 2.59) = 0.0096 \lt 0.05 \end{align} \]

Reject \(H_0\). There is strong evidence of a difference between the proportions of boys and girls who have online profiles.

9.2 Inferences about Two Independent Sample Means

Objectives:

  • Hypothesis test: conduct a hypothesis test of a claim about two population means.
  • Confidence interval: construct a confidence interval estimate of the difference between two population means.

Three Scenarios:

  1. The standard deviations of the two populations are unknown and are not assumed to be equal.
  2. The two population standard deviations are unknown but are assumed to be equal.
  3. The two population standard deviations are both known.

Notations:

\[ \begin{cases} \mu_1 = \text {mean of population 1 } \\ \sigma_1 = \text {standard deviation of population 1} \\ n_1 = \text {size of the sample drawn from population 1 } \\ \bar x_1 = \text{mean of sample 1} \\ s_1 = \text{standard deviation of sample 1} \\ \end{cases} \] The corresponding notations \(\mu_2, \sigma_2, n_2, \bar x_2, s_2\) apply to population 2.

9.2.1 Hypothesis Test of Independent Samples: \(\sigma_1\) and \(\sigma_2\) Unknown and Not Assumed Equal

Requirements

  • Unequal variance - The values of \(\sigma_1\) and \(\sigma_2\) are unknown and we do not assume that they are equal.
  • Independent - The two samples are independent: the sample values from one population are not related to, or naturally paired or matched with, the sample values from the other population.
  • Sample size - Both samples are simple random samples, and either both sample sizes are large (with \(n_1 \gt 30\) and \(n_2 \gt 30\)) or both samples come from populations having normal distributions. The methods presented in this section are robust against departures from normality, so they perform well with small samples as long as the departures from normality are not too extreme.

The null hypothesis claims that the two population means are equal, i.e. the difference in two population means: \(\mu_1 - \mu_2 = 0\).

A reasonable point estimate of \(\mu_1 - \mu_2\) based on the samples drawn from two populations can be written in the form: \(\bar x_1 - \bar x_2\).

The sampling distribution of \(\bar x_1 - \bar x_2\) is distributed with a mean or expected value of \(\mu_1 - \mu_2\) and standard error: \(SE_{\bar x_1 - \bar x_2} = \sqrt{SE^2_{\bar x_1} + SE^2_{\bar x_2}} = \sqrt{\dfrac{\sigma^2_1}{n_1}+\dfrac{\sigma^2_2}{n_2}}\)

When the population standard deviations \((\sigma_1, \sigma_2)\) are unknown, the standardized difference is assumed to follow a \(t_{df=\nu}\)-distribution with mean \(0\) under \(H_0\), and the standard error is estimated by \(SE_{\bar x_1 - \bar x_2} = \sqrt{\dfrac{s^2_1}{n_1}+\dfrac{s^2_2}{n_2}}\).

Hypothesis Test Statistic for Two Means

\[ t_\nu = \dfrac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s^2_1}{n_1}+\dfrac{s^2_2}{n_2}}} \ \text{ where, } \mu_1 - \mu_2 = 0 \\ \\ \text{degrees of freedom } (\nu) = \dfrac{(A + B)^2}{\dfrac{A^2}{n_1 - 1} + \dfrac{B^2}{n_2 - 1}} \\ \\ \text{where, } \ A = \dfrac{s^2_1}{n_1} \ \text{ and } \ B = \dfrac{s^2_2}{n_2} \\ \text{Alternatively, } df = \min(n_1-1, n_2-1) \\ \]
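
A minimal Python sketch of this unpooled (Welch) test, including the degrees-of-freedom formula above (the function name `welch_t_test` is a placeholder; scipy is assumed):

```python
from math import sqrt
from scipy.stats import t

def welch_t_test(xbar1, s1, n1, xbar2, s2, n2):
    """Unpooled (Welch) two-sample t-test of H0: mu1 - mu2 = 0."""
    a, b = s1**2 / n1, s2**2 / n2
    se = sqrt(a + b)
    nu = (a + b)**2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))  # Welch degrees of freedom
    t_stat = (xbar1 - xbar2 - 0) / se                      # (mu1 - mu2) = 0 under H0
    p_value = 2 * t.sf(abs(t_stat), df=nu)                 # two-tailed p-value
    return t_stat, nu, p_value
```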

9.2.2 Confidence Interval Estimate of \(\mu_1 - \mu_2\): Independent Samples with \(\sigma_1\) and \(\sigma_2\) Unknown and Not Assumed Equal

The confidence interval estimate of the difference \(\mu_1 - \mu_2\) is

\[(\bar x_1 - \bar x_2) - E < \mu_1 - \mu_2 < (\bar x_1 - \bar x_2) + E \]
where,

\[ E = t^*_{\nu, \alpha/2} \times \sqrt{\dfrac{s^2_1}{n_1}+\dfrac{s^2_2}{n_2}}\]

\(t^*_{\nu, \alpha/2}\) is the critical \(t\) score that corresponds to the confidence level \((1-\alpha)\).

Example: Two-Sample \(t\)-interval

Find the \(95\%\) confidence interval for the difference between the two population means, given the following sample statistics.

\[ \begin{array}{c|c|c} & \text{Sample 1} & \text{Sample 2} \\ \hline n & 27 & 27 \\ \bar x & 8.5 & 14.7 \\ s & 6.1 & 8.4 \end{array} \]

\[ \begin{align} & \bar x_2 - \bar x_1 = 14.7 - 8.5 = 6.2 \\ & t^*_{47.46} = 2.011 \text { at CL} = 95\% \\ \\ & SE = \sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}} = \sqrt{\frac{6.1^2}{27}+\frac{8.4^2}{27}} = 2 \\ \\ & ME = 2.011 \times 2 = 4.02 \\ & CI: 6.2 \pm 4.02 = [2.18, 10.22] \end{align} \]
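
The interval can be reproduced in a few lines of Python, with the degrees of freedom computed from the Welch formula in Section 9.2.1 (scipy assumed):

```python
from math import sqrt
from scipy.stats import t

a, b = 6.1**2 / 27, 8.4**2 / 27              # s^2/n for samples 1 and 2
se = sqrt(a + b)                             # about 2.0
nu = (a + b)**2 / (a**2 / 26 + b**2 / 26)    # Welch df, about 47.5
t_star = t.ppf(0.975, df=nu)                 # about 2.011
print(6.2 - t_star * se, 6.2 + t_star * se)  # about 2.18 and 10.22
```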

Example: Two-Sample \(t\)-test: Testing for the Difference between the Two Means

Is there a statistically significant difference between two sample means?

In general, use the unpooled \(t\)-test; the equal-variance assumption is often violated and is difficult to check with small samples.

\[ \begin{array}{c|c|c} & \text{Sample 1} & \text{Sample 2} \\ \hline n & 8 & 7 \\ \bar y & 281.88 & 211.43 \\ s & 18.31 & 46.43 \end{array} \]

\[ \begin{align} \text{Two-tailed t-test} \\ \\ H_0: \mu_1 - \mu_2 &= 0 \\ H_A: \mu_1 - \mu_2 &\ne 0 \\ \\ \bar y_1 - \bar y_2 &= 281.88 - 211.43 = 70.45 \\ \\ SE(\bar y_1 - \bar y_2) &= \sqrt{\frac{18.31^2}{8}+\frac{46.43^2}{7}} = 18.70 \\ t_{7.62} &= \frac{70.45 - 0}{18.70} = 3.77 \\ \\ p\text{-value} &= 2 \times P(t>3.77) = 0.006 \lt 0.05 \end{align} \] Hence, we reject \(H_0\).
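
For comparison, scipy can run this unpooled test directly from the summary statistics via `ttest_ind_from_stats` with `equal_var=False` (assuming scipy is available):

```python
from scipy.stats import ttest_ind_from_stats

stat, p = ttest_ind_from_stats(mean1=281.88, std1=18.31, nobs1=8,
                               mean2=211.43, std2=46.43, nobs2=7,
                               equal_var=False)  # unpooled (Welch) test
print(stat, p)  # t is about 3.77, two-tailed p is about 0.006
```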

9.2.3 Pooled \(t\)-test: \(\sigma_1 = \sigma_2\)

To use the pooled \(t\)-test, we must make the Equal Variance Assumption that the variances of the two populations from which the samples have been drawn are equal. That is \(\sigma_1^2 = \sigma_2^2.\)

Even when the specific values of \(\sigma_1\) and \(\sigma_2\) are unknown but assumed equal, the sample variances \(s_1^2\) and \(s_2^2\) can be pooled to obtain an estimate of the common population variance \(\sigma^2\).

\[ \begin{align} &\text {Pooled variance:} \\ &s^2_{p} = \frac{(n_1-1)s^2_1+(n_2-1)s^2_2}{(n_1-1)+(n_2-1)} \\ &SE_{p} = \sqrt{s^2_{p} \bigg (\dfrac{1}{n_1} + \dfrac{1}{n_2} \bigg )} \end{align} \]

\[ \text{Test Statistic: } \ t = \dfrac{(\bar x_1- \bar x_2) - (\mu_1- \mu_2)}{\sqrt{s^2_{p} \bigg (\dfrac{1}{n_1} + \dfrac{1}{n_2} \bigg )}} \]

\[ \begin{align} &\text{Hypothesis test: } \\\\ &H_0: \mu_1 - \mu_2 = 0 \\ &H_A: \mu_1 - \mu_2 \ne 0 \\ \\ &\text {pooled t-score, } t = \dfrac{(\bar x_1 - \bar x_2 )-0}{SE_{p}} \\ &df = (n_1 + n_2 - 2) \\ \\ &\text {Confidence Interval: } (\bar x_1- \bar x_2) \pm t_{df}^* \times SE_{p} &\end{align} \]
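
A minimal sketch of the pooled test under the equal-variance assumption (the function name `pooled_t_test` is a placeholder; scipy is assumed):

```python
from math import sqrt
from scipy.stats import t

def pooled_t_test(xbar1, s1, n1, xbar2, s2, n2):
    """Pooled two-sample t-test assuming sigma1 = sigma2; H0: mu1 - mu2 = 0."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    t_stat = (xbar1 - xbar2) / se
    p_value = 2 * t.sf(abs(t_stat), df=df)  # two-tailed p-value
    return t_stat, df, p_value
```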

9.3 Two Dependent Samples (Matched Pairs)

This section presents methods for testing hypotheses and constructing confidence intervals involving the mean of the differences of the values from two populations that are dependent in the sense that the data consist of matched pairs. The pairs must be matched according to some relationship, such as before/after measurements from the same subjects or IQ scores of siblings.

Notation for Dependent Samples

\[ \begin{cases} d = \text {individual difference between the two values in a single matched pair} \\ \mu_d = \text {mean value of the differences } d \text{ for the population of all matched pairs} \\ \bar d = \text {mean value of the difference } d \text{ for the matched sample data} \\ s_d = \text{standard deviation of the differences } d \text{ for the paired sample data} \\ n = \text{ number of pairs of sample data} \\ \end{cases} \]

Requirements

  1. The sample data are dependent (matched pairs).
  2. The matched pairs are a simple random sample.
  3. Either or both of these conditions are satisfied: the number of pairs of sample data is large \((n > 30)\) or the pairs of values have differences that are from a population having a distribution that is approximately normal.

9.3.1 Hypothesis Testing for Paired Data: The paired \(t\)-test

The null hypothesis claims that the mean of the differences between two dependent samples (matched pairs) is equal to \(0\).

Test Statistic for Dependent Samples (with \(H_0: \mu_d = 0\))

\[ t_{n-1} = \dfrac{\bar d - \mu_d}{\dfrac{s_d}{\sqrt n}} \] Example:

The table shows the times recorded by \(17\) track athletes to complete their inner and outer circles in a track and field competition. Conduct a hypothesis test of the claim that the recorded inner circle times are not significantly different from the outer circle times.

\[ \begin{array}{c|c|c|c} & \text{Inner Time} & \text{Outer Time} & \text{Diff}\\ \hline 1 & 125.75 & 122.34 & 3.41 \\ 2 & 121.63 & 122.12 & -0.49 \\ 3 & 122.24 & 123.35 & -1.11 \\ 4 & 120.85 & 120.45 & 0.40 \\ ... & ... & ... & ... \\ 17 & 122.15 & 122.75 & -0.60 \\ \end{array} \]

\[ \begin{align} & \text{Hypotheses} \\ & H_0:\mu_d = 0 \\ & H_A: \mu_d \ne 0 \\ \\ & n = 17, \\ &\bar d = 0.499, \\ &s_d = 2.333 \\ \\ & SE(\bar d) = \dfrac{s_d}{\sqrt n} = \dfrac{2.333}{\sqrt{17}} = 0.5658 \\ \\ & t_{16} = \dfrac{\bar d-0}{SE(\bar d)} = \dfrac{0.499}{0.5658} = 0.882 \\ \\ & p\text{-value} = 2 \times P(t_{16}>0.882) = 0.39 \gt 0.05 \end{align} \]

Hence, \(H_0\) cannot be rejected. The inner and outer circle times are not significantly different.
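
The summary statistics above reproduce the test statistic and p-value in a few lines of Python (scipy assumed):

```python
from math import sqrt
from scipy.stats import t

n, d_bar, s_d = 17, 0.499, 2.333
se = s_d / sqrt(n)                         # about 0.566
t_stat = d_bar / se                        # about 0.882
p_value = 2 * t.sf(abs(t_stat), df=n - 1)  # two-tailed p-value
print(t_stat, p_value)                     # about 0.882 and 0.39
```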

9.3.2 Paired \(t\)-Interval

The \(95\%\) confidence interval for the mean paired difference is

\[ \begin{align} & \bar d \pm t_{n-1}^* \times \frac{s_d}{\sqrt n} \\ \\ & = 0.499 \pm 2.12 \times 0.5658 \\ \\ CI &= [-0.7005, 1.6985] \end{align} \]
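
A quick check of this interval (scipy assumed; `t.ppf` returns the critical value \(t^*_{16} \approx 2.12\)):

```python
from scipy.stats import t

t_star = t.ppf(0.975, df=16)   # about 2.120 for a 95% interval
e = t_star * 0.5658            # margin of error
print(0.499 - e, 0.499 + e)    # about -0.700 and 1.698
```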