Sampling Distributions
----------------------
If we draw samples of size n from a given population and compute
a statistic (mean, standard deviation, proportion) for each sample,
the probability distribution of that statistic is called a sampling
distribution.
Central Limit Theorem
---------------------
For large enough sample sizes* the sample MEANS follow N(μ, σ/√n),
where n is the sample size. In other words, the mean of all the
sample means is equal to the population mean. This holds regardless
of the distribution of the parent population. As the sample size
increases, the standard deviation, skew and kurtosis of the
sampling distribution all decrease.
The SD of the sampling distribution of the mean is called the
STANDARD ERROR OF THE MEAN.
SE = σ/√n = σ_x̄ (the SD of the sample means)
When sampling without replacement from a finite population we can
generalize this to:
SE = (σ/√n)√((N - n)/(N - 1))
Where N is the population size.
For infinite N this reduces to:
SE = σ/√n as before
For n = 1 it reduces to:
SE = σ
* A sample size ≥ 30 is generally considered to be the minimum
for the CLT to apply.
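The theorem is easy to check numerically. The sketch below (Python
with NumPy; the exponential parent population, the sample size and
the trial count are arbitrary choices, not from the text) draws many
samples from a skewed population and compares the spread of the
sample means against σ/√n:

  import numpy as np

  # Draw many samples of size n from a skewed (exponential) parent
  # population and look at the distribution of the sample means.
  rng = np.random.default_rng(0)
  scale = 2.0          # exponential parent: mean = 2.0, sd = 2.0
  n = 30               # rule-of-thumb minimum sample size
  trials = 100_000

  means = rng.exponential(scale, size=(trials, n)).mean(axis=1)

  print("mean of sample means:", means.mean())   # ≈ μ = 2.0
  print("sd of sample means:  ", means.std())    # ≈ σ/√n
  print("predicted SE:        ", scale / np.sqrt(n))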
Distribution of Means
---------------------
If x̄ is the mean of a random sample of size n taken from
a normal population having mean μ and variance σ², then
we can state the following:
Large Sample Size (n > 30)
--------------------------
z = (x̄ - μ)/(σ/√n)
Generally, the population standard deviation is not given.
Under these circumstances it is necessary to compute it from
the sample using:
s² = Σᵢ(xᵢ - x̄)²/(n - 1)
Why use n - 1? To answer that we need to look at biased
versus unbiased estimators.
Biased versus Unbiased Estimators
---------------------------------
Consider a small distribution: 5 7 12
N = 3
n = 2
μ = 8
σ² = Σ(x - μ)²/N = 8.67
σ = 2.94
s² = Σ(x - x̄)²/(n - 1) = Σ(x - x̄)²/1
Sample    x̄     s²     s     s²/σ²
-------  ----  -----  ----  ------
 5  5     5.0    0.0  0.00    0.00
 5  7     6.0    2.0  1.41    0.23
 5 12     8.5   24.5  4.95    2.83
 7  5     6.0    2.0  1.41    0.23
 7  7     7.0    0.0  0.00    0.00
 7 12     9.5   12.5  3.54    1.44
12  5     8.5   24.5  4.95    2.83
12  7     9.5   12.5  3.54    1.44
12 12    12.0    0.0  0.00    0.00
-------  ----  -----  ----  ------
Average   8.0   8.67  2.20    1.00
Notes:
- (x̄ - μ)/(s/√n) follows a t distribution (approximately a
  normal distribution if the sample size is large).
- s is a biased estimator of σ.
- (n - 1)s²/σ² follows the χ² distribution.
- A denominator of n gives the biased estimator for σ².
- A denominator of n - 1 gives the unbiased estimator for σ².
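The table lends itself to a direct check. The sketch below (Python,
standard library only) enumerates the nine with-replacement samples
and averages the estimators:

  import itertools
  import statistics

  population = [5, 7, 12]
  sigma = statistics.pstdev(population)      # 2.94 (denominator N)

  # All samples of size 2 drawn with replacement from the population.
  samples = list(itertools.product(population, repeat=2))
  s2_vals = [statistics.variance(s) for s in samples]  # denominator n - 1
  s_vals = [statistics.stdev(s) for s in samples]

  print("average s²:", statistics.mean(s2_vals))  # 8.67 = σ²  -> unbiased
  print("average s: ", statistics.mean(s_vals))   # 2.20 ≠ σ = 2.94 -> biased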
Small Sample Size (n ≤ 30)
--------------------------
If n ≤ 30 then we must use the t-statistic instead:
t_v = (x̄ - μ)/(s/√n)
with v = n - 1 degrees of freedom.
The t-distribution has the following properties:
mean: μ = 0
variance: σ² = v/(v - 2), where v is the degrees of freedom
and v > 2
The variance is always greater than 1, although it is close
to 1 when there are many degrees of freedom. With infinite
degrees of freedom, the t distribution is the same as the
standard normal distribution.
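As a concrete illustration, here is a sketch of the small-sample
statistic (Python with SciPy; the data values and the hypothesized
mean μ = 5.0 are made up for the example):

  import math
  from scipy import stats

  data = [4.8, 5.3, 4.9, 5.6, 5.1, 4.7, 5.0, 5.4]
  mu0 = 5.0
  n = len(data)

  xbar = sum(data) / n
  s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
  t = (xbar - mu0) / (s / math.sqrt(n))

  # Two-sided p-value from the t distribution with v = n - 1 df.
  p = 2 * stats.t.sf(abs(t), df=n - 1)
  print(f"t = {t:.3f}, p = {p:.3f}")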
Distribution of Proportions
---------------------------
If p is the proportion (probability) of successes in the
population, then:
σ_p = √(p(1 - p)/n)
The sampling distribution of p is a discrete rather than a
continuous distribution. It is approximately normally
distributed if n is fairly large and p is not close to 0 or 1.
A general rule of thumb is that the approximation is good
when:
np and n(1 - p) are both ≥ 10
z = (p̄ - p)/σ_p = (p̄ - p)/√(p(1 - p)/n)
where p̄ is the sample proportion. As before, p is generally not
given for the original population. Under these circumstances it
is necessary to estimate it from the sample. Thus,
p ≈ p̄
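A sketch of the corresponding test statistic (Python with SciPy;
the hypothesized proportion p = 0.40 and the sample counts are made
up for the example):

  import math
  from scipy import stats

  p0 = 0.40          # hypothesized population proportion
  n = 200            # sample size
  successes = 95

  # Rule-of-thumb check for the normal approximation.
  assert n * p0 >= 10 and n * (1 - p0) >= 10

  p_bar = successes / n
  se = math.sqrt(p0 * (1 - p0) / n)   # σ_p under the hypothesis
  z = (p_bar - p0) / se

  p_value = 2 * stats.norm.sf(abs(z))
  print(f"z = {z:.3f}, p = {p_value:.3f}")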
Distribution of Variances
-------------------------
If s² is the variance of a random sample of size n taken from
a normal population having the variance σ², then
χ² = (n - 1)s²/σ²
with v = n - 1 degrees of freedom.
The χ² distribution has the following properties:
mean: μ = v
variance: σ² = 2v
Example:
An optical firm purchases glass to be ground into lenses, and
past experience has shown that the variance of the refractive
index is 1.26 × 10⁻⁴. A shipment is received and a sample of
20 pieces is pulled. The measured variance of the sample is
2.10 × 10⁻⁴. Should the shipment be rejected?
H₀: σ² = 1.26 × 10⁻⁴
H₁: σ² > 1.26 × 10⁻⁴
χ² = (20 - 1)(2.10 × 10⁻⁴)/(1.26 × 10⁻⁴) = 31.67
From tables, χ²_0.05 for v = n - 1 = 19 is equal to 30.144.
Since 31.67 > 30.144 the result is significant at the 0.05
level and there is sufficient reason to reject H₀.
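The same numbers fall out of a short sketch (Python with SciPy;
the values are taken from the example above):

  from scipy import stats

  sigma2 = 1.26e-4   # historical population variance
  s2 = 2.10e-4       # measured sample variance
  n = 20

  chi2 = (n - 1) * s2 / sigma2              # ≈ 31.67
  crit = stats.chi2.ppf(0.95, df=n - 1)     # upper 5% point ≈ 30.144

  print(f"chi² = {chi2:.2f}, critical value = {crit:.3f}")
  print("reject H0" if chi2 > crit else "fail to reject H0")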
F Distribution
--------------
If s₁² and s₂² are the variances of independent random samples
of size n₁ and n₂ taken from two normal populations having the
same variance, then
F = s₁²/s₂²
with v₁ = n₁ - 1 and v₂ = n₂ - 1 degrees of freedom.
The F distribution has the following properties:
mean: μ = v₂/(v₂ - 2) for v₂ > 2.
variance: σ² = [2v₂²(v₁ + v₂ - 2)]/[v₁(v₂ - 2)²(v₂ - 4)]
for v₂ > 4.
Example:
Suppose you randomly select 7 marbles from company 1's
production line and 12 marbles from company 2's production
line and measure their diameters. Assume you are given:
s₁ = 1.0 and s₂ = 1.1
F = s₁²/s₂² = 1/1.21 = 0.83
H₀: σ₁² = σ₂²
H₁: σ₁² ≠ σ₂²
From tables, F_0.05 for v₁ = n₁ - 1 = 6 and v₂ = n₂ - 1 = 11 is
equal to 3.0946. Since 0.83 < 3.0946 the result is not
significant at the 0.05 level and there is insufficient reason
to reject H₀.
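And the corresponding sketch for the F test (Python with SciPy;
the values are taken from the example above):

  from scipy import stats

  s1, s2 = 1.0, 1.1
  n1, n2 = 7, 12

  F = s1**2 / s2**2                                  # ≈ 0.83
  crit = stats.f.ppf(0.95, dfn=n1 - 1, dfd=n2 - 1)   # ≈ 3.0946

  print(f"F = {F:.2f}, critical value = {crit:.4f}")
  print("reject H0" if F > crit else "fail to reject H0")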