Sample, Small

a statistical sample of such a small size n that the simple classical formulas, which are valid only asymptotically as n → ∞, cannot be applied to it. The distinctive features of statistical estimation of parameters from small samples are most easily understood from the example of a normal distribution, for which samples of size n ≤ 30 are usually considered small. Suppose it is necessary to estimate the unknown mean a of a normal population with unknown variance σ² from a sample x₁, x₂, … , x_n drawn from the population. We denote

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 .$$

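In estimating a, we proceed from the fact that the probability distribution of the variable

$$t = \frac{\sqrt{n}\,(\bar{x} - a)}{s},$$

formed from the sample mean x̄ and the sample standard deviation s, is independent of a and σ.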

The probability ω that the inequality −t_ω < t < t_ω holds, and hence that the equivalent inequalities

$$\bar{x} - t_\omega \frac{s}{\sqrt{n}} < a < \bar{x} + t_\omega \frac{s}{\sqrt{n}} \tag{1}$$

hold, is calculated here using the formula

$$\omega = \int_{-t_\omega}^{t_\omega} s(t, n-1)\,dt, \tag{2}$$

where s(t, n − 1) is the probability density of the Student distribution with n − 1 degrees of freedom. By determining the corresponding t_ω for given n and ω (0 < ω < 1), for example from tables, we obtain rule (1) for finding the confidence limits for a at confidence level ω.
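The determination of t_ω and of the confidence limits can also be sketched numerically instead of with tables. The following Python sketch is only an illustration under stated assumptions: it takes s to be the sample standard deviation with divisor n − 1, integrates the Student density by Simpson's rule, and solves for t_ω by bisection. The function names and the sample values are hypothetical and not part of the original article.

```python
import math
from statistics import mean, stdev


def student_density(t, k):
    """Density s(t, k) of the Student distribution with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1 + t * t / k) ** (-(k + 1) / 2)


def central_probability(t_w, k, steps=2000):
    """Integral of s(t, k) over (-t_w, t_w), computed by Simpson's rule."""
    h = t_w / steps  # integrate over (0, t_w) and double, using symmetry
    total = student_density(0.0, k) + student_density(t_w, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, k)
    return 2 * total * h / 3


def t_omega(omega, n):
    """Find t_omega for sample size n and probability omega by bisection."""
    lo, hi = 0.0, 1.0
    while central_probability(hi, n - 1) < omega:
        hi *= 2  # enlarge the bracket until it contains the solution
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_probability(mid, n - 1) < omega:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def confidence_limits(sample, omega):
    """Confidence limits for the mean a: x_bar -/+ t_omega * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # assumption: divisor n - 1
    half_width = t_omega(omega, n) * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical measurements, invented for this illustration only.
sample = [10.2, 9.8, 10.5, 10.1, 9.9]
low, high = confidence_limits(sample, 0.99)
print(f"t_0.99 for n = 5: {t_omega(0.99, 5):.2f}")
print(f"confidence limits for a: ({low:.3f}, {high:.3f})")
```

Bisection is used here only because the probability is a monotonically increasing function of t_ω; a statistical library's quantile function would serve equally well.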

For large n, equation (2), which relates ω and t_ω, can be replaced by the approximate formula

$$\omega \approx \frac{1}{\sqrt{2\pi}} \int_{-t_\omega}^{t_\omega} e^{-t^2/2}\,dt. \tag{3}$$

This formula is sometimes incorrectly used to determine t_ω for small n, which leads to gross errors. Thus, for ω = 0.99, formula (3) gives t_0.99 = 2.58. The true values of t_0.99 for small n are given in Table 1.

Table 1
n	2	3	4	5	10	20	30
t_0.99	63.66	9.92	5.84	4.60	3.25	2.86	2.76
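The size of the error can be made concrete by comparing the normal-approximation value with the entries of Table 1. The sketch below, given only as an illustration, obtains the large-n value from the standard normal quantile in Python's standard library and copies the Student values from Table 1; the names `TABLE_1` and `width_ratio` are hypothetical.

```python
from statistics import NormalDist

OMEGA = 0.99

# Large-n value of t_omega: the two-sided quantile of the standard normal law.
t_normal = NormalDist().inv_cdf((1 + OMEGA) / 2)

# Values of t_0.99 copied from Table 1, keyed by sample size n.
TABLE_1 = {2: 63.66, 3: 9.92, 4: 5.84, 5: 4.60, 10: 3.25, 20: 2.86, 30: 2.76}


def width_ratio(n):
    """Ratio of the correct interval width to the normal-based width."""
    return TABLE_1[n] / t_normal


for n in TABLE_1:
    print(f"n = {n:2d}: Student {TABLE_1[n]:6.2f}, "
          f"normal {t_normal:.2f}, ratio {width_ratio(n):.2f}")
```

The ratio shows by what factor the interval based on the normal approximation is too narrow for each sample size; it approaches 1 as n grows.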

If formula (3) is used for n = 5, we may conclude that the inequality

$$\left| \bar{x} - a \right| < 2.58\,\frac{s}{\sqrt{5}}$$

is satisfied with probability 0.99. In fact, in the case of five observations, the probability of this inequality is only 0.94, while the inequality

$$\left| \bar{x} - a \right| < 4.60\,\frac{s}{\sqrt{5}}$$

has probability 0.99, as can be seen from Table 1.
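The two probabilities quoted for n = 5 can be checked directly. With 4 degrees of freedom the Student density is (3/8)(1 + t²/4)^(−5/2), which integrates in closed form; the sketch below relies on that closed form, and the function name is hypothetical.

```python
import math


def central_probability_4df(t_w):
    """P(-t_w < t < t_w) for the Student distribution with 4 degrees of freedom.

    Closed form of the integral of (3/8) * (1 + t**2 / 4) ** (-5/2)
    over the interval (-t_w, t_w).
    """
    q = 1 + t_w ** 2 / 4
    return 0.75 * (t_w / math.sqrt(q)) * (1 - t_w ** 2 / (12 * q))


# Limit taken from the normal approximation versus the limit from Table 1.
print(f"P(|t| < 2.58) = {central_probability_4df(2.58):.3f}")
print(f"P(|t| < 4.60) = {central_probability_4df(4.60):.3f}")
```

The closed form applies only to 4 degrees of freedom, that is, to n = 5; other sample sizes require a different expression or numerical integration.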

Similar methods of estimating the parameters of multivariate distributions (for example, correlation coefficients) using small samples have been developed.

REFERENCES

Cramér, H. Matematicheskie metody statistiki. Moscow, 1948. (Translated from English.)
Kolmogorov, A. N. “Opredelenie tsentra rasseivaniia i mery tochnosti po ogranichennomu chislu nabliudenii.” Izv. AN SSSR: Seriia matematicheskaia, 1942, vol. 6, nos. 1-2.
Bol’shev, L. N., and N. V. Smirnov. Tablitsy matematicheskoi statistiki. Moscow, 1965.

IU. V. PROKHOROV