The one-way ANOVA extends the two-sample t-test for independent
samples to three or more samples. ANOVA is an acronym for
ANalysis Of VAriance. The name is a little misleading, since the
goal is to compare means; variances are the tool used to do so.
In a one-way ANOVA there is one dependent variable and one
independent variable (the factor that defines the groups).
Means Model
-----------
y_{ij} = μ_{j} + e_{ij}
Where,
y_{ij} = measured value on the ith subject in the jth group
μ_{j} = mean value for group j
e_{ij} = random error about μ_{j}
In analysis of variance (ANOVA), sums of squares partition the
total variation into a part attributable to the treatments and a
part attributable to random error.
Sum of Squares
--------------
Let:
k = number of treatments (columns)
r = number of observations in each treatment (rows)
n = total number of observations (rows x columns)
y_{ij} = ith observation in jth column
ȳ = grand mean = Σ_{ij}y_{ij}/n
ȳ_{j} = column means = Σ_{i}y_{ij}/r
Total SS = Σ_{ij}(y_{ij} - ȳ)^{2}
SST = Σ_{j}r(ȳ_{j} - ȳ)^{2}          (treatment sum of squares)
SSE = Σ_{ij}(y_{ij} - ȳ_{j})^{2}     (error sum of squares)
Mean Total SS = Total SS/(n - 1)
MSST = SST/(k - 1)
MSSE = SSE/(n - k)
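The definitions above translate directly into a short computation. What follows is a minimal Python sketch, assuming equal group sizes as in the notation above. The function name `sums_of_squares`, the list-of-columns input layout, and the toy data are illustrative choices, not part of these notes.

```python
def sums_of_squares(columns):
    """Return (total_ss, sst, sse) for a list of equal-length treatment columns."""
    k = len(columns)                 # number of treatments
    r = len(columns[0])              # observations per treatment (assumed equal)
    n = k * r                        # total number of observations
    grand_mean = sum(sum(col) for col in columns) / n
    col_means = [sum(col) / r for col in columns]

    # Total variation of every observation about the grand mean.
    total_ss = sum((y - grand_mean) ** 2 for col in columns for y in col)
    # Variation of the column means about the grand mean.
    sst = sum(r * (m - grand_mean) ** 2 for m in col_means)
    # Variation of each observation about its own column mean.
    sse = sum((y - m) ** 2 for col, m in zip(columns, col_means) for y in col)
    return total_ss, sst, sse


# Hypothetical data: three treatments with two observations each.
total_ss, sst, sse = sums_of_squares([[1, 3], [2, 4], [6, 8]])
print(total_ss, sst, sse)  # 34.0 28.0 6.0
```

For this toy data, Total SS = SST + SSE (34 = 28 + 6), which is the identity the worked examples rely on.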
Assumptions:
1. Samples are randomly selected from the k treatment populations.
2. All k treatment populations are normal.
3. All k treatment variances are equal.
Hypotheses:
H_{0}: μ_{1} = μ_{2} = ... = μ_{k}
H_{1}: μ_{i} ≠ μ_{j} for at least one i and j
Test statistic: F = MSST/MSSE
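If H_{0} is rejected, how do we determine which means are
different? Two post hoc methods are commonly used: LSD (Fisher's
Least Significant Difference) and Bonferroni.
Both methods are based on pairwise t-tests.
SPSS: Analyze>Compare Means>One-Way ANOVA
Post Hoc => check LSD and Bonferroni (more conservative, since
it adjusts for the number of comparisons)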
            df      SS       MS                   F
            -----   ------   ------------------   ---------
Treatment   k - 1   SST      MSST = SST/(k - 1)   MSST/MSSE
Error       n - k   SSE      MSSE = SSE/(n - k)
Total       n - 1   Tot SS   ***
*** Mean Total SS = Total SS/(n - 1) is not the sum of the MS column.
Why ANOVA and not t-test?
1. Comparing three groups using t-tests would require that
3 t-tests be conducted. This increases the chances of
making a type I error.
2. The t-test does not make use of all of the available
information from which the samples were drawn. For
example, in a comparison of group 1 vs. group 2, the
information from group 3 is neglected. An ANOVA makes
use of the entire data set.
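To put a number on point 1: if each of c tests is run at level α and the tests are treated as independent, the chance of at least one type I error is 1 - (1 - α)^c. Independence is a simplifying assumption here, since pairwise t-tests that share a group are not strictly independent. A small Python sketch, with the hypothetical helper name `familywise_error`:

```python
def familywise_error(alpha, c):
    """Probability of at least one type I error across c independent tests."""
    return 1 - (1 - alpha) ** c


# Three pairwise t-tests, each at the 0.05 level.
rate = familywise_error(0.05, 3)
print(round(rate, 4))  # 0.1426
```

Under that assumption, three tests at α = 0.05 carry roughly a 14% chance of a false positive, compared with 5% for a single test.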
Pairwise Comparisons
--------------------
The number of ways that the means can be compared is given by:
c = k(k - 1)/2
Thus, for example, if there are 3 treatments c = 3. These
combinations are:
μ_{1} with μ_{2}
μ_{1} with μ_{3}
μ_{2} with μ_{3}
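The pairs can also be enumerated programmatically. The sketch below uses Python's standard `itertools.combinations`. The helper name `pairwise_comparisons` is hypothetical. The last lines show the usual Bonferroni adjustment, which divides the overall α by the number of comparisons; that step is included as an assumption about how the Bonferroni method mentioned earlier is typically applied.

```python
from itertools import combinations


def pairwise_comparisons(k):
    """List the index pairs (i, j), i < j, for comparing k treatment means."""
    return list(combinations(range(1, k + 1), 2))


pairs = pairwise_comparisons(3)
print(pairs)  # [(1, 2), (1, 3), (2, 3)]

# Bonferroni: test each pair at alpha divided by the number of comparisons.
alpha = 0.05
per_comparison_alpha = alpha / len(pairs)
print(round(per_comparison_alpha, 4))  # 0.0167
```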
A simple completely randomized design example:
SAT Score means
           Boys     Girls
Sample 1   540      530
Sample 2   530      540
Sample 3   540      520
ȳ_{j}      536.67   530
ȳ = (540 + 530 + 540 + 530 + 540 + 520)/6 = 533.33
k = 2, r = 3, c = 1, n = 6
Total SS = Σ_{ij}(y_{ij} - ȳ)^{2}
         = (540 - 533.33)^{2} + (530 - 533.33)^{2}
         + (540 - 533.33)^{2} + (530 - 533.33)^{2}
         + (540 - 533.33)^{2} + (520 - 533.33)^{2}
         = 333.333
SST = Σ_{j}r(ȳ_{j} - ȳ)^{2}
    = 3(536.67 - 533.33)^{2} + 3(530 - 533.33)^{2}
    = 33.333 + 33.333
    = 66.667
SSE = Σ_{ij}(y_{ij} - ȳ_{j})^{2}
    = (540 - 536.67)^{2} + (530 - 536.67)^{2}
    + (540 - 536.67)^{2} + (530 - 530)^{2}
    + (540 - 530)^{2} + (520 - 530)^{2}
    = 266.667
(Results are computed from the unrounded means.)
Note that:
Total SS = SST + SSE
         = 66.667 + 266.667
         = 333.333 (up to rounding)
Mean Total SS = Total SS/(n - 1)
              = 333.333/5
              = 66.667
MSST = SST/(k - 1)
     = 66.667/1
     = 66.667
MSSE = SSE/(n - k)
     = 266.667/4
     = 66.667
F = MSST/MSSE = 66.667/66.667 = 1.000
MSST df = k - 1 = 1
MSSE df = n - k = 4
Critical value: F_{1,4} = 7.7086 for α = 0.05
Since F = 1.000 < 7.7086, H_{0} is not rejected at α = 0.05.
Summarizing:
            df   SS        MS       F
            --   -------   ------   -----
Treatment    1    66.667   66.667   1.000
Error        4   266.667   66.667
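The hand calculation can be checked with a few lines of Python. This is a sketch using only the SAT data and formulas given above. The variable names are illustrative, and the critical value is simply the F_{1,4} figure quoted in these notes rather than something computed here. Because the means are not rounded, the last digits may differ slightly from a hand calculation.

```python
boys = [540, 530, 540]
girls = [530, 540, 520]
columns = [boys, girls]

k = len(columns)          # number of treatments
r = len(boys)             # observations per treatment
n = k * r                 # total number of observations
grand_mean = sum(boys + girls) / n
col_means = [sum(col) / r for col in columns]

sst = sum(r * (m - grand_mean) ** 2 for m in col_means)
sse = sum((y - m) ** 2 for col, m in zip(columns, col_means) for y in col)

msst = sst / (k - 1)
msse = sse / (n - k)
f_stat = msst / msse

F_CRITICAL = 7.7086       # F_{1,4} at alpha = 0.05, as quoted in the notes

print(f"SST = {sst:.3f}, SSE = {sse:.3f}")
print(f"MSST = {msst:.3f}, MSSE = {msse:.3f}, F = {f_stat:.4f}")
print("reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
```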
When 2 samples are being compared, the t and F tests are
equivalent: F = t^{2}.
t = (x̄_{1} - x̄_{2})/√(s^{2}(1/n_{1} + 1/n_{2}))
s^{2} = MSSE = 66.667
t = (536.67 - 530)/√(66.667(1/3 + 1/3))
  = 6.67/6.67
  = 1
so t^{2} = 1, which matches F up to rounding.
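The equivalence can be checked numerically with Python's standard `statistics` module. This is a sketch, assuming the pooled-variance (equal variances) form of the two-sample t-test, which is the form whose s^{2} coincides with MSSE. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

boys = [540, 530, 540]
girls = [530, 540, 520]
n1, n2 = len(boys), len(girls)

# Pooled sample variance; for two groups this equals MSSE from the ANOVA.
pooled_var = ((n1 - 1) * variance(boys) + (n2 - 1) * variance(girls)) / (n1 + n2 - 2)

t = (mean(boys) - mean(girls)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
print(round(t, 4), round(t ** 2, 4))  # 1.0 1.0
```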
A simple randomized block design example:
With a randomized block design, the experimenter divides subjects
into subgroups called blocks, such that the variability within
blocks is less than the variability between blocks. Then,
subjects within each block are randomly assigned to treatment
conditions. Compared to a completely randomized design, this
design reduces variability within treatment conditions and
potential confounding, producing a better estimate of treatment
effects.
SPSS Output:
            df              SS       MS                           F
            -------------   ------   --------------------------   ---------
Treatment   k - 1           SST      MSST = SST/(k - 1)           MSST/MSSE
Block       b - 1           SSB      MSSB = SSB/(b - 1)           MSSB/MSSE
Error       n - b - k + 1   SSE      MSSE = SSE/(n - b - k + 1)
Total       n - 1           Tot SS   ***
*** Mean Total SS = Total SS/(n - 1) is not the sum of the MS column.
SAT Score means
           Boys     Girls   Block Means
School 1   540      530     535
School 2   530      540     535
School 3   540      520     530
ȳ_{j}      536.67   530
ȳ = (540 + 530 + 540 + 530 + 540 + 520)/6 = 533.33
k = 2, r = 3, c = 1, n = 6, b = 3
Total SS = Σ_{ij}(y_{ij} - ȳ)^{2}
         = (540 - 533.33)^{2} + (530 - 533.33)^{2}
         + (540 - 533.33)^{2} + (530 - 533.33)^{2}
         + (540 - 533.33)^{2} + (520 - 533.33)^{2}
         = 333.333
SST = Σ_{j}r(ȳ_{j} - ȳ)^{2}
    = 3(536.67 - 533.33)^{2} + 3(530 - 533.33)^{2}
    = 33.333 + 33.333
    = 66.667
SSB = Σ_{i}k(ȳ_{i} - ȳ)^{2}, where ȳ_{i} = block means
    = 2(535 - 533.33)^{2} + 2(535 - 533.33)^{2} + 2(530 - 533.33)^{2}
    = 5.556 + 5.556 + 22.222
    = 33.333
(Results are computed from the unrounded means.)
SSE = Total SS - SST - SSB
    = 333.333 - 66.667 - 33.333
    = 233.333
MSST = SST/(k - 1)
     = 66.667
MSSE = SSE/(n - b - k + 1)
     = 233.333/(6 - 3 - 2 + 1)
     = 116.667
MSSB = SSB/(b - 1)
     = 33.333/2
     = 16.667
F = MSST/MSSE = 66.667/116.667 = 0.5714
MSST df = k - 1 = 1
MSSE df = n - b - k + 1 = 2
MSSB df = b - 1 = 2
Critical value: F_{1,2} = 18.5128 for α = 0.05
Since F = 0.5714 < 18.5128, H_{0} is not rejected at α = 0.05.
This randomized block design removes school as a potential
source of variability and as a potential confounding variable.
Summarizing:
            df   SS        MS        F
            --   -------   -------   ------
Treatment    1    66.667    66.667   0.5714
Block        2    33.333    16.667   0.1429
Error        2   233.333   116.667
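The block-design arithmetic can be reproduced in a few lines of Python. This is a sketch using only the SAT data and the formulas in these notes. The variable names and the rows-as-blocks layout are illustrative choices. Results come from unrounded means, so the last digits may differ slightly from a hand calculation.

```python
# Rows are blocks (schools); columns are treatments (boys, girls).
scores = [
    [540, 530],  # School 1
    [530, 540],  # School 2
    [540, 520],  # School 3
]

b = len(scores)           # number of blocks
k = len(scores[0])        # number of treatments
n = b * k                 # total number of observations
grand_mean = sum(sum(row) for row in scores) / n
treatment_means = [sum(row[j] for row in scores) / b for j in range(k)]
block_means = [sum(row) / k for row in scores]

total_ss = sum((y - grand_mean) ** 2 for row in scores for y in row)
sst = sum(b * (m - grand_mean) ** 2 for m in treatment_means)
ssb = sum(k * (m - grand_mean) ** 2 for m in block_means)
sse = total_ss - sst - ssb    # what remains after treatments and blocks

msst = sst / (k - 1)
mssb = ssb / (b - 1)
msse = sse / (n - b - k + 1)

f_treatment = msst / msse
f_block = mssb / msse
print(f"SST = {sst:.3f}, SSB = {ssb:.3f}, SSE = {sse:.3f}")
print(f"F(treatment) = {f_treatment:.4f}, F(block) = {f_block:.4f}")
```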