difference between two population means
The sample sizes will be denoted by \(n_1\) and \(n_2\). In the context of estimating or testing hypotheses concerning two population means, "large samples" means that both samples are large. The rejection region is \(t^*<-1.7341\). The populations are normally distributed. (Assume that the two samples are independent simple random samples selected from normally distributed populations.) A point estimate for the difference in two population means is simply the difference in the corresponding sample means. Considering a nonparametric test would be wise. The hypotheses for two population means are similar to those for two population proportions. 9.2: Inferences for Two Population Means- Large, Independent Samples is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by LibreTexts. Perform the 2-sample t-test in Minitab with the appropriate alternative hypothesis. In a packing plant, a machine packs cartons with jars. The null hypothesis has the form \(H_0\colon \mu_1-\mu_2=D_0\), where \(D_0\) is a number that is deduced from the statement of the situation. B. the sum of the variances of the two distributions of means. (In the relatively rare case that both population standard deviations \(\sigma _1\) and \(\sigma _2\) are known, they would be used instead of the sample standard deviations.) The \(99\%\) confidence level means that \(\alpha =1-0.99=0.01\), so that \(z_{\alpha /2}=z_{0.005}\). The pooled test statistic \(t^*\) follows a t-distribution with \(n_1+n_2-2\) degrees of freedom. The 99% confidence interval is (-2.013, -0.167). Describe how to design a study involving Answer: Allow all the subjects to rate both Coke and Pepsi. \(t^*=\dfrac{\bar{x}_1-\bar{x}_2-0}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\).
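The critical value \(z_{\alpha/2}=z_{0.005}\) mentioned above can be looked up in a table or computed. The following is a minimal sketch using only the Python standard library; the helper name `z_critical` is hypothetical and is not part of the original text.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level.

    Hypothetical helper for illustration: alpha = 1 - confidence, and the
    critical value cuts off an upper-tail area of alpha/2 under the
    standard normal curve.
    """
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# 99% confidence: alpha = 0.01, so we need z_{0.005}
z_99 = z_critical(0.99)
print(round(z_99, 3))  # prints 2.576
```

For small samples the same idea applies with the t-distribution in place of the standard normal, but the standard library has no t quantile function, so a table or a statistics package would be needed there.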
In particular, even if one sample is of size \(30\) or more, if the other is of size less than \(30\) the formulas of this section must be used. Replacing > with ≠ in \(H_1\) would change the test from a one-tailed test to a two-tailed test. It is important to be able to distinguish between independent samples and dependent samples. When dealing with large samples, we can use \(s^2\) to estimate \(\sigma^2\). Nutritional experts want to establish whether obese patients on a new special diet have a lower weight than the control group. Test at the \(1\%\) level of significance whether the data provide sufficient evidence to conclude that Company \(1\) has a higher mean satisfaction rating than does Company \(2\). How do the distributions of each population compare? We call this the two-sample T-interval, or the confidence interval to estimate a difference in two population means. The difference between the two values is due to the fact that our population includes military personnel from D.C., which accounts for 8,579 of the total number of military personnel reported by the US Census Bureau. The value of the standard deviation that we calculated in Exercise 8a is 16. If we find the difference as the concentration of the bottom water minus the concentration of the surface water, then the null and alternative hypotheses are \(H_0\colon \mu_d=0\) vs \(H_a\colon \mu_d>0\). Given data from two samples, we can do a significance test to compare the sample means with a test statistic and p-value, and determine whether there is enough evidence to suggest a difference between the two population means. Each population is either normal or the sample size is large.
In practice, when the sample mean difference is statistically significant, our next step is often to calculate a confidence interval to estimate the size of the population mean difference. Conduct this test using the rejection region approach. 95% CI for mu sophomore - mu juniors: (-0.45, 0.173); T-Test mu sophomore = mu juniors (vs not =): T = -0.92. It seems natural to estimate \(\sigma_1\) by \(s_1\) and \(\sigma_2\) by \(s_2\). As was the case with a single population, the alternative hypothesis can take one of three forms, with the same terminology: \(H_a\colon \mu_1-\mu_2<D_0\) (left-tailed), \(H_a\colon \mu_1-\mu_2>D_0\) (right-tailed), or \(H_a\colon \mu_1-\mu_2\neq D_0\) (two-tailed). As long as the samples are independent and both are large, the following formula for the standardized test statistic is valid, and it has the standard normal distribution: \(Z=\dfrac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\). The null and alternative hypotheses will always be expressed in terms of the difference of the two population means. (The actual value is approximately \(0.000000007\).) You can use a paired t-test in Minitab to perform the test. Confidence interval to estimate \(\mu_1-\mu_2\): the data provide sufficient evidence, at the \(1\%\) level of significance, to conclude that the mean customer satisfaction for Company \(1\) is higher than that for Company \(2\). Here, we describe estimation and hypothesis-testing procedures for the difference between two population means when the samples are dependent. It is supposed that a new machine will pack faster on the average than the machine currently used. If \(\bar{d}\) is normal (or the sample size is large), the sampling distribution of \(\bar{d}\) is (approximately) normal with mean \(\mu_d\), standard error \(\dfrac{\sigma_d}{\sqrt{n}}\), and estimated standard error \(\dfrac{s_d}{\sqrt{n}}\).
The possible null and alternative hypotheses are \(H_0\colon \mu_d=0\) against \(H_a\colon \mu_d\neq 0\), \(H_a\colon \mu_d>0\), or \(H_a\colon \mu_d<0\). We still need to check the conditions, and at least one of the following needs to be satisfied: the differences are normally distributed, or the sample size is large. The test statistic is \(t^*=\dfrac{\bar{d}-0}{\frac{s_d}{\sqrt{n}}}\). The confidence interval for the difference between two means contains all the values of \(\mu_1-\mu_2\) (the difference between the two population means) which would not be rejected in the two-sided hypothesis test of \(H_0\colon \mu_1=\mu_2\) against \(H_a\colon \mu_1\neq\mu_2\). The only difference is in the formula for the standardized test statistic. The point estimate for the difference between the means of the two populations is 2. A confidence interval for a difference in proportions is a range of values that is likely to contain the true difference between two population proportions with a certain level of confidence. The alternative is left-tailed, so the critical value is the value \(a\) such that \(P(T<a)=0.05\). For a right-tailed test, the rejection region is \(t^*>1.8331\).
This is a two-sided test, so alpha is split between the two tails. In the context of the problem we say we are \(99\%\) confident that the average level of customer satisfaction for Company \(1\) is between \(0.15\) and \(0.39\) points higher, on this five-point scale, than that for Company \(2\). The mean difference is 1.91; the null hypothesis mean difference is 0. This is made possible by the central limit theorem. Hypotheses concerning the relative sizes of the means of two populations are tested using the same critical value and \(p\)-value procedures that were used in the case of a single population. The Significance of the Difference Between Two Means when the Population Variances are Unequal. However, when the sample standard deviations are very different from each other, and the sample sizes are different, the separate variances 2-sample t-procedure is more reliable. We can be more specific about the populations. There was no significant difference between the two groups in regard to level of control (9.01±1.75 in the family medicine setting compared to 8.93±1.98 in the hospital setting). The significance level is 5%. Do the data provide sufficient evidence to conclude that, on the average, the new machine packs faster? When we developed the inference for the independent samples, we depended on statistical theory to help us. There is no indication of a violation of the normal assumption for either sample. To find the interval, we need all of the pieces. The students were inspired by a similar study at City University of New York, as described in David Moore's textbook The Basic Practice of Statistics (4th ed., W. H. Freeman, 2007). When we consider the difference of two measurements, the parameter of interest is the mean difference, denoted \(\mu_d\).
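The paired test statistic \(t^*=\bar{d}/(s_d/\sqrt{n})\) can be sketched in a few lines. This is an illustrative sketch only: the function name `paired_t_statistic` is hypothetical, and the `bottom` and `surface` values are made-up toy numbers, not the water-concentration data the text refers to.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t_star, df) for a paired t-test of H0: mu_d = 0.

    Differences are taken as first - second, e.g. bottom minus surface.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)       # sample mean of the differences
    s_d = stdev(diffs)        # sample standard deviation of the differences
    t_star = d_bar / (s_d / sqrt(n))
    return t_star, n - 1      # a paired t-test has n - 1 degrees of freedom


# Toy data (hypothetical, for illustration only)
bottom = [10, 12, 14, 16]
surface = [9, 10, 13, 12]
t_star, df = paired_t_statistic(bottom, surface)
print(round(t_star, 4), df)
```

The resulting `t_star` would then be compared with the critical value from a t-table for `df` degrees of freedom, in the direction given by the alternative hypothesis.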
Carry out a 5% test to determine if the patients on the special diet have a lower weight. The mean glycosylated hemoglobin for the whole study population was 8.97±1.87. We are \(99\%\) confident that the difference in the population means lies in the interval \([0.15,0.39]\), in the sense that in repeated sampling \(99\%\) of all intervals constructed from the sample data in this manner will contain \(\mu _1-\mu _2\). If we can assume the populations are independent, that each population is normal or has a large sample size, and that the population variances are the same, then it can be shown that \(t^*=\dfrac{\bar{x}_1-\bar{x}_2-0}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\). - Large effect size: d 0.8, medium effect size: d . And \(t^*\) follows a t-distribution with degrees of freedom equal to \(df=n_1+n_2-2\). To learn how to perform a test of hypotheses concerning the difference between the means of two distinct populations using large, independent samples. Since the mean \(\bar{x}_1\) of the sample drawn from Population \(1\) is a good estimator of \(\mu _1\) and the mean \(\bar{x}_2\) of the sample drawn from Population \(2\) is a good estimator of \(\mu _2\), a reasonable point estimate of the difference \(\mu _1-\mu _2\) is \(\bar{x}_1-\bar{x}_2\). In the preceding few pages, we worked through a two-sample T-test for the calories and context example. The critical value is the value \(a\) such that \(P(T>a)=0.05\). An informal check for this is to compare the ratio of the two sample standard deviations.
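The pooled statistic above depends on \(s_p\), which the text uses without defining. The sketch below assumes the standard pooled estimate \(s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}\); the function name and the toy inputs are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
from math import sqrt


def pooled_t_statistic(mean1, mean2, s1, s2, n1, n2, d0=0.0):
    """Return (t_star, s_p, df) for the pooled two-sample t-test.

    Assumes independent samples and equal population variances.
    d0 is the hypothesized difference mu_1 - mu_2 (0 in the text's formula).
    """
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the sample variances
    s_p = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    t_star = (mean1 - mean2 - d0) / (s_p * sqrt(1 / n1 + 1 / n2))
    return t_star, s_p, df


# Toy summary statistics (hypothetical): equal sample SDs give s_p = 2
t_star, s_p, df = pooled_t_statistic(5.0, 3.0, 2.0, 2.0, 10, 10)
print(round(t_star, 4), s_p, df)
```

When the two sample standard deviations differ greatly, the separate-variances procedure mentioned earlier in the text would be the safer choice, and this function would not apply.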
Question 32: For a test of the equality of the mean returns of two non-independent populations based on a sample, the numerator of the appropriate test statistic is the: A. average difference between pairs of returns. Without reference to the first sample we draw a sample from Population \(2\) and label its sample statistics with the subscript \(2\). The samples must be independent, and each sample must be large: \(n_1\geq 30\) and \(n_2\geq 30\). To compare customer satisfaction levels of two competing cable television companies, \(174\) customers of Company \(1\) and \(355\) customers of Company \(2\) were randomly selected and were asked to rate their cable companies on a five-point scale, with \(1\) being least satisfied and \(5\) most satisfied. We should proceed with caution. Assume the population variances are approximately equal and hotel rates in any given city are normally distributed. The explanatory variable is location (bottom or surface) and is categorical. BA analysis demonstrated difference scores between the two testing sessions that ranged from 3.0 to 17.3% and 4.5 to 28.5% of the mean score for intra- and inter-rater measures, respectively. The same five-step procedure used to test hypotheses concerning a single population mean is used to test hypotheses concerning the difference between two population means. The confidence interval is \(\bar{x}_1-\bar{x}_2\pm t_{\alpha/2}s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}\), which here gives \((42.14-43.23)\pm 2.878(0.7173)\sqrt{\frac{1}{10}+\frac{1}{10}}\).
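The interval arithmetic above can be checked numerically. The sketch below plugs in the values quoted in the text (sample means 42.14 and 43.23, \(s_p=0.7173\), \(n_1=n_2=10\), and the table value \(t_{\alpha/2}=2.878\)); the function name `pooled_t_interval` is hypothetical.

```python
from math import sqrt


def pooled_t_interval(mean1, mean2, s_p, n1, n2, t_crit):
    """Return (lower, upper) for the pooled two-sample t-interval.

    t_crit is the table value t_{alpha/2} with n1 + n2 - 2 degrees of freedom;
    it is passed in rather than computed.
    """
    diff = mean1 - mean2
    margin = t_crit * s_p * sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin


# Values quoted in the text
lower, upper = pooled_t_interval(42.14, 43.23, 0.7173, 10, 10, 2.878)
print(round(lower, 3), round(upper, 3))  # prints -2.013 -0.167
```

The result agrees with the 99% confidence interval (-2.013, -0.167) stated earlier in the text.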
Source: Introductory Statistics (Shafer and Zhang), 9.1: Comparison of Two Population Means- Large, Independent Samples, https://2012books.lardbucket.org/books/beginning-statistics (CC BY-NC-SA 3.0).