Small sample formulas. Sample types. Methods for selecting units from the general population
18. Theory of small samples.
With a large number of sampling units (n > 100), the distribution of random errors of the sample mean, in accordance with A.M. Lyapunov's theorem, is normal or approaches normal as the number of observations increases.
However, in the practice of statistical research in a market economy, it is increasingly necessary to deal with small samples.
A small sample is a sample observation in which the number of units does not exceed 30.
When evaluating the results of a small sample, the size of the general population is not used. To determine the possible margins of error, Student's t-criterion is used.
The value of σ is calculated from the sample observation data. It is used only for the studied population, not as an approximate estimate of σ in the general population.
A probabilistic estimate of the results of a small sample differs from an estimate in a large sample in that, with a small number of observations, the probability distribution for the mean depends on the number of selected units.
However, for a small sample, the value of the confidence coefficient t is related to the probabilistic estimate in a different way than for a large sample (since the distribution law differs from the normal one).
According to the distribution law established by Student, the probable distribution error depends both on the value of the confidence coefficient t and on the sample size n.
The average error of a small sample is calculated by the formula
μ = √(S² / n),
where S² is the variance of the small sample.
In a small sample, the coefficient n / (n − 1) must be taken into account as a correction: when determining the variance S², the number of degrees of freedom is n − 1:
S² = Σ(xᵢ − x̄)² / (n − 1).
The marginal error of a small sample is determined by the formula
Δ = t · μ.
In this case, the value of the confidence coefficient t depends not only on the given confidence probability but also on the number of sample units n. For individual values of t and n, the confidence probability of a small sample is determined from special Student tables, which give the distribution of the standardized deviations
t = (x̄ − X̄) / μ.
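As a sketch of these formulas, the average and marginal errors of a small sample can be computed with scipy; the data below are hypothetical and serve only to illustrate the calculation.

```python
import math
from scipy import stats

# Hypothetical small sample (n = 10) used only to illustrate the formulas.
sample = [12, 14, 11, 15, 13, 12, 16, 14, 13, 15]
n = len(sample)
mean = sum(sample) / n

# Small-sample variance with n - 1 degrees of freedom
s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)

mu = math.sqrt(s2 / n)              # average error of the small sample
t = stats.t.ppf(0.975, df=n - 1)    # Student's coefficient for P = 0.95
delta = t * mu                      # marginal error
print(mean, mu, delta)
```

The general mean is then asserted, with probability 0.95, to lie within mean ± delta.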
19. Methods for selecting units in the sample.
1. The sample must be large enough in number.
2. The structure of the sample population should best reflect the structure of the general population.
3. The selection method must be random.
Depending on whether a selected unit is returned to the population for further selection, two methods are distinguished: non-repetitive and repeated selection.
Non-repetitive selection is selection in which a unit that has fallen into the sample is not returned to the population from which further selection is carried out.
The average error of a non-repetitive simple random sample:
μ = √( (σ² / n) · (1 − n / N) ).
The marginal error of non-repetitive random sampling:
Δ = t · μ.
In repeated selection, a unit that has fallen into the sample is returned, after registration of the observed characteristics, to the original (general) population to participate in the further selection procedure.
The average error of a repeated simple random sample is calculated as follows:
μ = √(σ² / n).
The marginal error of repeated random sampling:
Δ = t · μ.
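A minimal numerical sketch of the two error formulas; the population size N, sample size n and general variance σ² below are assumed values, not taken from the text.

```python
import math

# Assumed illustrative values: population N, sample n, general variance sigma2.
N, n, sigma2 = 10_000, 400, 25.0
t = 2  # confidence coefficient for P ≈ 0.954

mu_repeated = math.sqrt(sigma2 / n)               # repeated (with replacement)
mu_nonrep = math.sqrt(sigma2 / n * (1 - n / N))   # non-repetitive (without replacement)

delta_repeated = t * mu_repeated                  # marginal errors
delta_nonrep = t * mu_nonrep
print(mu_repeated, mu_nonrep)
```

The non-repetitive error is always smaller because of the finite-population correction (1 − n/N).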
By the type of sample formation, selection is divided into individual, group and combined.
The selection method determines the specific mechanism for sampling units from the general population and is divided into: proper random; mechanical; typical; serial; combined.
Proper random selection is the most common method of selection in a random sample; it is also called the lottery method. A ticket with a serial number is prepared for each unit of the statistical population, and the required number of units is then drawn at random. Under these conditions, each unit has the same probability of being included in the sample.
Mechanical sampling. It is used in cases where the general population is in some way ordered, i.e. there is a certain sequence in the arrangement of units.
To determine the average error of mechanical sampling, the formula of the average error is used for proper random non-repetitive selection.
Typical selection. It is used when all units of the general population can be divided into several typical groups. Typical selection involves selecting units from each group, either randomly or mechanically.
For a typical sample, the value of the standard error depends on the accuracy of determining the group means. Thus, the formula for the marginal error of a typical sample takes into account the average of the group (within-group) variances, i.e.
μ = √(σ̄ᵢ² / n).
Serial selection. It is used in cases where the units of the population are combined into small groups or series. The essence of serial sampling lies in the proper random or mechanical selection of series, within which a complete survey of units is carried out.
With serial sampling, the size of the sampling error does not depend on the number of units studied, but on the number of surveyed series (s) and on the value of the intergroup variance:
μ = √(δ² / s).
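A sketch contrasting the typical and serial error formulas described above; all numbers (group variances, intergroup variance, sizes) are hypothetical.

```python
import math

# Typical (stratified) sample: error uses the mean of the within-group variances.
group_variances = [4.0, 6.0, 5.0]   # assumed within-group variances
n = 100                             # total sample size
avg_within = sum(group_variances) / len(group_variances)
mu_typical = math.sqrt(avg_within / n)

# Serial (cluster) sample: error depends on the intergroup variance and
# the number of selected series s, not on the number of units.
delta2 = 0.9                        # assumed intergroup variance
s = 16                              # number of surveyed series
mu_serial = math.sqrt(delta2 / s)
print(mu_typical, mu_serial)
```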
Combined selection may go through one or more steps. A sample is called single-stage if the units of the population selected once are subjected to study.
A sample is called multi-stage if selection of the population proceeds in steps, in successive stages, and each stage of selection has its own selection unit.
" |
In assessing the representativeness of sample observation data, the question of the sample size becomes important.
It determines not only the magnitude of the limits that the sampling error will not exceed with a given probability, but also the methods for determining these limits.
With a large number of sampling units (n > 100), the distribution of random errors of the sample mean, in accordance with Lyapunov's theorem, is normal or approaches normal as the number of observations increases.
The probability that an error exceeds certain limits is estimated using tables of the Laplace integral function. The calculation of the sampling error is based on the value of the general variance, since at large n the coefficient n / (n − 1), by which the sample variance is multiplied to obtain the general variance, plays no significant role.
In the practice of statistical research, however, one often encounters so-called small samples.
A small sample is such a sample observation, the number of units of which does not exceed 30.
The development of small-sample theory was begun by the English statistician W.S. Gosset (who published under the pseudonym Student) in 1908. He proved that the estimate of the discrepancy between the mean of a small sample and the general mean has a special distribution law.
To determine the possible margins of error, the so-called Student criterion is used, determined by the formula
t = (x̄ − X̄) / μ,
where μ is a measure of the random fluctuations of the sample mean in a small sample.
The value of μ is calculated on the basis of the sample observation data:
μ = S / √n, where S² = Σ(xᵢ − x̄)² / (n − 1).
This value is used only for the study population, and not as an approximate estimate in the general population.
With a small sample size, the Student distribution differs from the normal: large values of the criterion have a higher probability here than under a normal distribution.
The marginal error of a small sample as a function of the average error is
Δ = t · μ.
But in this case the magnitude of t is related differently to the probable estimate than with a large sample. According to the Student distribution, the probable estimate depends both on the magnitude of t and on the sample size n, for the case when the marginal error does not exceed t-fold the average error of the small sample.
Table 3.1. Probability distribution in small samples depending on the confidence coefficient t and the sample size n
As seen from Table 3.1, as n increases this distribution tends to the normal one, and for larger n it already differs little from it.
Let's show how to use the Student's distribution table.
Suppose that a sample survey of workers at a small enterprise recorded the time (min.) spent by workers on one of the production operations: . Find the sample average time:
The sample variance:
Hence the average error of the small sample:
From Table 3.1 we find that, for this confidence coefficient t and this small-sample size n, the probability equals .
Thus, it can be argued with this probability that the discrepancy between the sample average and the general average lies in the range from to , i.e. the difference will not exceed in absolute value ().
Therefore, the average time spent in the entire population will lie in the range from to .
The probability that this assumption is actually wrong, and that the error for random reasons will be greater than this, is: .
The Student probability table is often presented in a different form than Table 3.1; in some cases this form is considered more convenient for practical use (Table 3.2).
From Table 3.2 it follows that for each number of degrees of freedom a limiting value of t is indicated which, with the given probability, will not be exceeded due to random fluctuations in the sample results.
On the basis of Table 3.2 the confidence intervals are determined: and .
This is the region of values of the general average beyond which it falls with only a very small probability, equal to:
As the confidence level in a two-sided test one usually uses or , which does not, however, exclude the choice of other values not listed in Table 3.2.
Table 3.2. Some values of t in the Student distribution
The probabilities that the estimated average value randomly falls outside the limits of the confidence interval will be, respectively, and , i.e. very small.
The choice between probabilities and is to a certain extent arbitrary. This choice is largely determined by the content of those tasks for which a small sample is used.
In conclusion, we note that the calculation of errors in a small sample differs little from the analogous calculations for a large sample. The difference is that with a small sample the probability of our assertion is somewhat lower than with a larger sample (in the example above, and respectively).
However, all this does not mean that you can use a small sample when you need a large sample. In many cases, the discrepancies between the found limits can reach significant sizes, which hardly satisfies researchers. Therefore, a small sample should be used in a statistical study of socio-economic phenomena with great care, with appropriate theoretical and practical justification.
So, conclusions based on the results of a small sample are of practical importance only on the condition that the distribution of the feature in the general population is normal or asymptotically normal. It is also necessary to take into account the fact that the accuracy of the results of a small sample is still lower than with a large sample.
A.M. Nosovsky1*, A.E. Pihlak2, V.A. Logachev2, I.I. Chursinova3, N.A. Mutyeva2 STATISTICS OF SMALL SAMPLES IN MEDICAL RESEARCH
"State Scientific Center Russian Federation- Institute of Biomedical Problems Russian Academy Sciences, 123007, Moscow, Russia; 2A.I. Evdokimov Moscow State University of Medicine and Dentistry, Ministry of Health of Russia, 127473, Moscow, Russia; 3ANO Arthrological Hospital NPO SKAL, 109044, Moscow, Russia
*Nosovsky Andrey Maksimovich, E-mail: [email protected]
♦ Characteristics of statistical criteria were found experimentally. As a result, the values of the W. Ansari-Bradley and K. Klotz statistics were calculated. For each initial statistic, a normal approximation (Z-statistic) and a significance level p of the null hypothesis of no difference in the spread of the values of the two samples were calculated. If p > 0.05, the null hypothesis can be accepted.
The proposed methods of mathematical statistics make it possible to confirm the reliability of the differences in the results obtained even in small groups of observations, if the differences are significant enough. Clinical examples of patients with osteoarticular pathology served as an illustration. Keywords: small sample, power of the criterion, coxarthrosis, gouty arthritis
The principles of evidence-based medicine place high demands on the reliability of the comparative assessment of the obtained research results. This becomes all the more important because most physicians have a very superficial understanding of the methods of statistical processing, limiting themselves in their publications, apart from calculating percentages, at best to Student's t-criterion.
However, for a full analysis of the results of the study, in some cases this is not enough. There is usually no doubt about the reliability of the revealed regularities when the number of observations is several thousand or even hundreds. What if it's several dozen? What if we only have a few cases? Indeed, in medicine there are quite rare diseases, surgeons sometimes perform unique operations when the number of observations is very small. Where is that line, that necessary and sufficient amount of research that allows us to assert the undoubted presence of this or that regularity?
This question is of great importance not only in the evaluation of studies already carried out, but also in planning scientific work. Is it enough to observe 20 patients or is a minimum of 40 necessary? Or maybe 10 cases will suffice? Not only the reliability of the conclusions drawn, but also the timing of the research, their cost, the need for personnel, equipment, etc. depend on a timely and correct answer to this question.
Modern statistics has quite a few techniques with which the reliability of results can be determined even with a small number of observations. These are "small sample" methods. It is generally accepted that small-sample statistics began in the first decade of the 20th century with the publication of the work of W. Gosset, who, under the pseudonym "Student", introduced the so-called t-distribution. Unlike the theory of the normal distribution, the distribution theory for small samples does not require a priori knowledge or exact estimates of the mathematical expectation and variance of the population, and does not require assumptions about the parameters. In the t-distribution, one of the deviations from the sample mean is always fixed, since the sum of all such deviations must equal zero. This affects the sum of squares when calculating the sample variance as an unbiased estimate of the population variance, and leads to the number of degrees of freedom df being equal to the number of measurements minus one for each sample. Hence, in the formulas and procedures for calculating the t-statistic to test the null hypothesis, df = n − 1. Also well known are the classic works of the eminent English statistician R.A. Fisher (after whom the F-distribution is named) on analysis of variance, a statistical method clearly oriented toward the analysis of small samples. Of the numerous statistics that can reasonably be applied to small samples, one can mention: Fisher's exact probability test; Friedman's two-factor non-parametric (rank) analysis of variance; Kendall's rank correlation coefficient τ; Kendall's coefficient of concordance; the Kruskal-Wallis H-test for non-parametric (rank) one-way analysis of variance; the Mann-Whitney U-test; the median criterion; the sign criterion; Spearman's rank correlation coefficient ρ; the Wilcoxon T-test.
There is no definite answer to the question of how small a sample must be in order to be considered small. However, the conventional boundary between a small and a large sample is considered to be df = 30. The basis for this, to some extent arbitrary, decision is the result of comparing the t-distribution (for small samples) with the normal distribution (z). The discrepancy between the values of t and z tends to increase as df decreases and to decrease as df increases. In fact, t begins to closely approach z long before the limiting case t = z. Simple visual examination of the tabulated values of t shows that this approximation becomes quite close starting from df = 30 and above. Comparative values of t (at df = 30) and z are, respectively: 2.04 and 1.96 for p = 0.05; 2.75 and 2.58 for p = 0.01; 3.65 and 3.29 for p = 0.001.
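The quoted comparison is easy to reproduce; this sketch uses scipy to compute the two-sided critical values of t at df = 30 and of the normal z.

```python
from scipy import stats

# Two-sided critical values of Student's t at df = 30 versus normal z.
for p in (0.05, 0.01, 0.001):
    t_crit = stats.t.ppf(1 - p / 2, df=30)
    z_crit = stats.norm.ppf(1 - p / 2)
    print(f"p={p}: t={t_crit:.2f}, z={z_crit:.2f}")
```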
In mathematical statistics the confidence coefficient t is used; the values of the function are tabulated for its different values, and the corresponding confidence levels are obtained (Table 1).
The confidence coefficient allows the marginal sampling error Δx̄ to be calculated by the formula Δx̄ = t·μx̄, i.e. the marginal sampling error equals t times the average sampling error.
Thus, the value of the marginal sampling error can be set with a certain probability. As can be seen from the last column of Table 1, the probability of an error equal to or greater than three times the average sampling error, i.e. Δx̄ = 3μx̄, is extremely small, equal to 0.003 (1 − 0.997). Such improbable events are considered practically impossible, and therefore the value Δx̄ = 3μx̄ can be taken as the limit of the possible sampling error [3].
The interval which, with a given degree of probability, will contain the unknown value of the estimated parameter is called a confidence interval, and the probability P the confidence probability. Most often the confidence probability is taken equal to 0.95 or 0.99; the confidence coefficient t is then equal to 1.96 and 2.58, respectively.
This means that the confidence interval with the given probability contains the general average.
The greater the value of the marginal sampling error, the greater the value of the confidence interval and, consequently, the lower the accuracy of the estimate.
The application of this approach can be illustrated by the observation of 20 patients with coxarthrosis who were treated at the Arthrological Hospital NPO "SKAL" (Scientific and Production Association "Specialized Course Outpatient Treatment") in Moscow.
When testing a statistical hypothesis, errors of two kinds are possible. A Type I error is to reject the null hypothesis when in reality it is true. A Type II error occurs when the null hypothesis is accepted when in fact it is false.
The probability of a Type I error is called the significance level and is denoted α. Thus α = P(w ∈ W | H0), i.e. the significance level α is the probability that the test statistic falls into the critical region W, calculated under the assumption that the null hypothesis H0 is true.
The significance level and the power of the test are combined in the concept of the power function of the test: the function that gives the probability that the null hypothesis will be rejected. The power function depends on the critical region W and on the actual distribution of the observation results.
Table 1
Confidence coefficient t and corresponding confidence levels

t:     1.00   1.96   2.00   2.58   3.00
Φ(t):  0.683  0.950  0.954  0.990  0.997
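The tabulated levels are the two-sided normal probabilities 2Φ(t) − 1, which can be reproduced with scipy.

```python
from scipy import stats

# Confidence level for each tabulated t: probability that a normal deviate
# falls within ±t standard errors of the mean.
for t in (1.00, 1.96, 2.00, 2.58, 3.00):
    level = 2 * stats.norm.cdf(t) - 1
    print(f"t={t}: F(t)={level:.3f}")
```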
In the parametric problem of testing hypotheses, the distribution of the observation results is given by a parameter θ. In this case the power function is denoted M(W, θ) and depends on the critical region W and on the actual value of the parameter θ. If H0: θ = θ0 and H1: θ = θ1, then M(W, θ0) = α and M(W, θ1) = 1 − β, where α is the probability of an error of the first kind and β the probability of an error of the second kind. The power of the test is then the probability that the null hypothesis will be rejected when the alternative hypothesis is true.
In the case of a one-dimensional parameter θ, the power function M(W, θ) usually reaches a minimum equal to α at θ = θ0, increases monotonically with distance from θ0, and approaches 1 as |θ − θ0| → ∞.
Let us estimate the required power of statistical criteria (Fig. 1), which could be used to analyze the treatment of 20 patients with coxarthrosis.
As can be seen, with a standard deviation of 3.0, which is extremely rare, results with a high degree of reliability (p < 0.05) will be obtained if the difference between the means exceeds 8. But already with a standard deviation of 1.5, this difference need only exceed 4.
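A power analysis of this kind can be sketched with a noncentral-t calculation. The function below is a standard textbook approximation (not the authors' code), and the group sizes 7 and 8 are assumed for illustration.

```python
import math
from scipy import stats

def t_test_power(diff, sigma, n1, n2, alpha=0.05):
    """Power of a two-sided two-sample t-test with equal variances."""
    df = n1 + n2 - 2
    nc = diff / (sigma * math.sqrt(1 / n1 + 1 / n2))  # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return 1 - stats.nct.cdf(t_crit, df, nc) + stats.nct.cdf(-t_crit, df, nc)

# Difference of 8 at sigma = 3.0 versus difference of 4 at sigma = 1.5
print(t_test_power(8, 3.0, 7, 8))
print(t_test_power(4, 1.5, 7, 8))
```

Both settings have the same standardized effect (8/3.0 vs 4/1.5 after halving), so they yield the same power.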
To determine the significance level p, an approximate normal Z-approximation of the corresponding statistic is usually used. This approximation is good for sufficiently large sample sizes. With a small sample size and p-values close to 0.05, we checked the conclusion about the null hypothesis by comparing
Fig. 1. Experimentally found characteristics of statistical criteria (power curves at alpha = 0.05 for different sigma; x-axis: true difference between means).
Table 2
Observation groups

Treatment                                                  | Group 1 | Group 2 | Group 3 | Total observations
Nimesulide, vitamins, chondroprotectors, exercise therapy  |    +    |    +    |    +    | 20
Physiotherapy                                              |    −    |    +    |    +    | 15
Massage                                                    |    −    |    −    |    +    | 8
Pain on movement
Pain at rest 43±13 27±17
the calculated value of the statistic with the critical value in the table of the corresponding distribution from a statistical handbook.
Criteria for differences in shift (position). We used these criteria to test the following hypotheses:
♦ no differences in the mutual position (medians) of the two studied samples;
♦ the shift of samples relative to each other is equal to some value d;
♦ the median of one analyzed sample is equal to the value d.
In the second case it was necessary to reduce all the values of the second sample by the value d: yᵢ′ = yᵢ − d.
In the third case, an auxiliary paired sample is prepared, all elements of which are equal to d.
As a result, we calculated:
♦ the value of W. Wilcoxon's statistic: the sum of the ranks Rxi of the elements of one of the samples in the combined ranked sample;
♦ the value of the van der Waerden V statistic, based on the "arbitrary marks" method.
For each statistic, a normal approximation (Z-statistic) and a significance level p of the null hypothesis of no shift of the samples relative to each other were calculated. If p > 0.05, the null hypothesis can be accepted.
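A minimal sketch of such a shift test, using scipy's Mann-Whitney (Wilcoxon rank-sum) implementation; the two small samples are hypothetical.

```python
from scipy import stats

# Two hypothetical small samples; the test asks whether one is shifted
# relative to the other (a difference in position/medians).
x = [62, 70, 68, 75, 66]
y = [55, 58, 61, 57, 60, 59]
stat, p = stats.mannwhitneyu(x, y, alternative="two-sided")
print(stat, p)
# If p > 0.05 the null hypothesis of no shift would be accepted;
# here the samples are fully separated, so p is small.
```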
Some packages and authors suggest using the Mann-Whitney U-test and the Wald-Wolfowitz test. However, it has long been proven that the Mann-Whitney criterion is equivalent to, i.e. has the same capabilities as, the Wilcoxon test, while the Wald-Wolfowitz test suffers from relatively low sensitivity.

Table 3
Mean indicators of pain intensity (in points according to VAS)

Parameters recorded for Group 1 (n = 5), Group 2 (n = 7) and Group 3 (n = 8): start of follow-up, end of follow-up, pain decrease.

Table 4
Data of laboratory examination of patient B.

No. | Indicator                  | Norm      | Penultimate visit | Last visit
1   | Hematocrit, %              | 40-48     | 38.7              | 43.5
2   | Lymphocytes, %             | 19-37     | 42                | 39
3   | ESR, mm/hour               | 2-10      | 39                | 10
4   | Uric acid, µmol/l          | 200-416   | 504               | 489
5   | Creatinine, µmol/l         | 44-106    | 238               | 202
6   | Parathyroid hormone, pg/ml | 7-53      | 76.8              | 101
7   | Fibrinogen, g/l            | 1.69-3.92 | 5.7               | 3
8   | Protein in urine, g/l      | 0-0.1     | 1                 |

Fig. 2. p-values of clinical indicators of patient B. at the penultimate and last examination.
Scale difference criteria (scattering). We used these criteria to test the following hypotheses:
♦ the hypothesis that there are no differences in the scales (in the spread or dispersion of values) of the studied samples;
♦ the hypothesis that the ratio of the sample scales is equal to a given value g.
In the latter case, the values of the second sample must first be transformed, yᵢ′ = (yᵢ − m0)/g + m0, where m0 is the common median of the two studied samples.
If the medians of the populations from which the samples are drawn are not equal in magnitude but are known, the criteria can be applied after modifying one of the samples, for example yᵢ′ = yᵢ − m2 + m1.
If the medians are not equal and are not known, then the hypothesis of the absence of differences in the shift should be confirmed, or the method should be used to detect arbitrary alternatives.
As a result, the values of the W. Ansari-Bradley and K. Klotz statistics, which are conceptual analogues of the Wilcoxon and van der Waerden statistics, were calculated.
For each initial statistic, a normal approximation (Z-statistic) and a significance level p of the null hypothesis of no difference in the scatter of the values of the two samples are calculated. If p > 0.05, the null hypothesis can be accepted.
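A sketch of such a scale comparison with scipy.stats.ansari; the data are hypothetical and chosen to have similar medians and no ties.

```python
from scipy import stats

# Ansari-Bradley test for a difference in scale (spread) between two samples.
x = [1, 4, 13, 16, 8, 10]    # wider spread
y = [7, 9, 6, 11, 12, 5]     # narrower spread
result = stats.ansari(x, y)
print(result.statistic, result.pvalue)
# If the p-value exceeds 0.05, the null hypothesis of equal spread is accepted.
```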
Thus, the methods of mathematical statistics proposed above make it possible to confirm the reliability of differences in the obtained results even in small groups of observations, if the differences are significant enough.
Two clinical examples of patients with osteoarticular pathology can serve as an illustration.
Clinical example No. 1. In 20 patients with coxarthrosis, a basic treatment complex was used, including oral administration of nimesulide, chondroprotectors, intramuscular injections of vitamins and physiotherapy exercises. In addition, physiotherapy was used in 15 of them, and massage was used in 6 patients. Thus, 3 groups of patients were formed with a small (from 5 to 8) number of observations (Table 2).
Among other parameters, before the start of treatment and after the completion of the course (21 ± 2 days), the intensity of pain during movement and at rest was assessed using a 100-point visual analogue scale (VAS).
The statistical methods of W. Ansari-Bradley and K. Klotz described above were used (Table 3).
According to the data obtained (Table 3), it was noted that the reduction in pain at rest in group 1 at the end of the observation was not significant. However, significant values were found for all other studied parameters. The considered clinical example indicates the possibility of obtaining reliable results on a small sample size.
In clinical example No. 2, laboratory data of patient B., who suffers from chronic gouty polyarthritis, gouty nephropathy with symptoms of CRF, which were outside the reference values, are considered in dynamics (Table 4).
Let us calculate the probability that the analysis results statistically significantly exceed the boundaries of the clinical norm. To do this, we use the probability calculator of the statistical package STATISTICA 6.0. Here the p-value characterizes the Type I error: the probability of rejecting the null hypothesis when it is in fact true. In most cases, the results of the penultimate visit differed statistically significantly from the norm (Fig. 2). Since we take the threshold significance level equal to 0.05, the hematocrit, lymphocyte, ESR and fibrinogen results improved statistically significantly by the last visit. Accordingly, the clinical indicators of uric acid, creatinine, parathyroid hormone and protein in the urine did not, in terms of mathematical statistics, improve.
Thus, when planning a study, it is important to take into account the power of the applied statistical criteria, which are determined by the variability of the sample and the given level of significance.
The proposed approach may be of interest to specialists in the field of personalized medicine for
analysis in the dynamics of the methods of treatment and medicines used, while monitoring the ongoing therapeutic and diagnostic measures.
LITERATURE
1. Bolshev L.N., Smirnov N.V. Tables of mathematical statistics. M.: Nauka; 1995.
2. Korn G., Korn T. Handbook of mathematics for scientists and engineers. M.: Nauka; 2003.
3. Kobzar A.I. Applied math statistics. For engineers and scientists. Moscow: FIZMATLIT; 2006.
4. Pravetsky N.V., Nosovsky A.M., Matrosova M.A., Kholin S.F., Shakin V.V. Mathematical substantiation of a sufficient number of measurements for a reliable assessment of the recorded parameters in space biology and medicine. Space biology and aerospace medicine. M.: Medicine; 1990; 5:53-6.
5. Hollander M., Wolfe D.A. Nonparametric statistical methods. Moscow: Finansy i statistika; 1983.
6. Nosovsky A.M. Application of probabilistic models on a circle in biomedical research. Space biology and aerospace medicine. Abstracts IX All-Union Conference. Kaluga, June 19-21, 1990.
7. Nosovsky A.M., Pravetsky N.V., Kholin S.F. Mathematical approach to assessing the accuracy of measurements of a physiological parameter by various methods. Space biology and aerospace medicine. M.: Medicine; 1991; 6:53-5.
1. Bol "shev L.N., Smirnov N.V. Tables of Mathematical Statistics. Moscow: Nauka; 1995 (in Russian).
2. Korn G., Korn T. Mathematical Handbook for Scientists and Engineers. Moscow: Nauka; 2003 (in Russian).
3. Kobzar" A.I. Applied Mathematical Statistics. For engineers and scientists. Moscow: FIZMATLIT; 2006 (in Russian).
4. Pravetskiy N.V., Nosovskiy A.M., Matrosova M.A., Kholin S.F., Shakin V.V. Mathematical justification of a sufficient number of measurements for reliable evaluation of recorded parameters in space biology and medicine. Space Biology and Aerospace Medicine. Moscow: Meditsina; 1990; 5:53-6 (in Russian).
5. Khollender M., Vul "f D.A. Non-parametric statistical methods. Moscow: Finansy i statistika; 1983 (in Russian).
6. Nosovskiy A.M. The use of probabilistic models on the circle in biomedical research. Space Biology and Aerospace Medicine. Abstracts of the IX All-Union Conference. Kaluga, June 19-21, 1990 (in Russian).
7. Nosovskiy A.M., Pravetskiy N.V., Kholin S.F. Mathematical approach to estimation accuracy of the physiological parameter by different methods. Space Biology and Aerospace Medicine. Moscow: Me-ditsina; 1991; 6:53-5 (in Russian).
When controlling the quality of goods in economic research, an experiment can be carried out on the basis of a small sample. A small sample is understood as a non-continuous statistical survey in which the sample population is formed from a relatively small number of units of the general population. The size of a small sample usually does not exceed 30 units and can be as few as 4-5 units.
The average error of a small sample is calculated by the formula μ = √(S² / n), where S² is the variance of the small sample. When determining the variance, the number of degrees of freedom is n − 1: S² = Σ(xᵢ − x̄)² / (n − 1).
The marginal error of a small sample is determined by the formula Δ = t · μ. In this case, the value of the confidence coefficient t depends not only on the given confidence probability but also on the number of sample units n. For individual values of t and n, the confidence probability of a small sample is determined from special Student tables (Table 9.1), which give the distribution of the standardized deviations t = (x̄ − X̄) / μ; these values of the Student distribution are then used to determine the marginal error of the small sample.
In addition to the proper random sample with its clear probabilistic justification, there are other samples that are not strictly random but are widely used. It should be noted that strict application of actual random selection of units from the general population is by no means always possible in practice. Such samples include mechanical, typical, serial (or nested), multi-phase sampling and a number of others.
It rarely happens that the general population is homogeneous; this is the exception rather than the rule. Therefore, if different types of phenomena are present in the general population, it is often desirable to ensure a more even representation of the different types in the sample population. This goal is successfully achieved by using a typical sample. The main difficulty is that we must have additional information about the entire general population, which in some cases is hard to obtain.
A typical sample is also called a stratified (layered) sample; it is likewise used to represent different regions more evenly in the sample, in which case it is called a regionalized sample.
Thus, a typical sample is one in which the general population is divided into typical subgroups formed by one or more essential features (for example, the population is divided into 3-4 subgroups according to average per capita income, or by level of education: primary, secondary, higher, etc.). Units can then be selected from all the typical groups into the sample in several ways, forming:
a) a typical sample with uniform placement, where an equal number of units is selected from the different types (layers). This scheme works well if the layers (types) in the general population do not differ greatly from one another in the number of units;
b) a typical sample with proportional placement, when it is required (in contrast to uniform placement) that the proportion (%) of selection be the same for all layers (for example, 5% or 10%);
c) a typical sample with optimal placement, when the degree of variation of the features in the different groups of the general population is taken into account. With this placement, the proportion of selection increases for groups with greater variability of the trait, which ultimately reduces the random error.
The formula for the average error in typical selection is similar to the error formula for a proper random sample, with the only difference that the average of the partial within-group variances is substituted for the total variance, which naturally reduces the error compared to a proper random sample. However, its application is not always possible (for many reasons). If great accuracy is not needed, it is easier and cheaper to use serial sampling.
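The calculation just described can be sketched as follows. The strata sizes and within-group variances are hypothetical, and proportional placement is assumed, so that the average error is μ = √(σ̄²/n), where σ̄² is the average of the within-group variances.

```python
import math

# Hypothetical strata: (sample size n_i, within-group variance s2_i).
strata = [(40, 12.0), (35, 9.5), (25, 15.0)]

n = sum(n_i for n_i, _ in strata)
# Average of the partial within-group variances, weighted by stratum size.
avg_within = sum(n_i * s2_i for n_i, s2_i in strata) / n

mu_typical = math.sqrt(avg_within / n)    # average error of typical selection
print(f"average within-group variance = {avg_within:.3f}, "
      f"mu = {mu_typical:.4f}")
```

Because the average of the within-group variances is never larger than the total variance, μ here is never larger than the corresponding proper-random-sample error.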
Serial (nested) sampling consists in selecting into the sample not individual units of the population (for example, students) but separate series or nests (for example, study groups). In other words, in serial (nested) selection the unit of observation and the unit of selection do not coincide: certain groups of adjacent units (nests) are selected, and the units included in these nests are then surveyed. For example, in a sample survey of housing conditions, we can randomly select a certain number of houses (the selection units) and then ascertain the living conditions of the families residing in these houses (the observation units).
Series (nests) consist of units interconnected geographically (districts, cities, etc.), organizationally (enterprises, workshops, etc.), or in time (for example, the set of units of product produced over a given period of time).
Serial selection can be organized in the form of one-stage, two-stage or multi-stage selection.
Randomly selected series are then surveyed completely. Thus, serial sampling consists of two stages: random selection of the series and a continuous study of those series. Serial selection provides significant savings in manpower and resources and is therefore often used in practice. The serial selection error differs from the proper random selection error in that the interseries (intergroup) variance is used instead of the total variance, and the number of series is used instead of the sample size. The accuracy is usually not very high, but in some cases it is acceptable. Serial sampling can be repeated or non-repeated, and the series can be equal or unequal in size.
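A minimal sketch of the serial-selection error, under the assumption of a non-repeated scheme with equal, completely surveyed series; the series means and the total number of series R are hypothetical, and the correction factor (1 − r/R) is one common textbook form for non-repeated selection.

```python
import math

# Hypothetical series (nests): mean of the studied trait in each
# randomly selected and completely surveyed series.
series_means = [72.0, 75.5, 70.8, 74.2, 73.0]
r = len(series_means)                     # number of selected series
R = 40                                    # total series in the population (assumed)

grand_mean = sum(series_means) / r
# Interseries (intergroup) variance of the series means.
delta2 = sum((m - grand_mean) ** 2 for m in series_means) / r

# Average error of serial selection, non-repeated scheme.
mu_serial = math.sqrt(delta2 / r * (1 - r / R))
print(f"interseries variance = {delta2:.4f}, mu = {mu_serial:.4f}")
```

Note that the number of series r, not the number of surveyed units, enters the denominator, which is why serial selection is usually less accurate than proper random selection of the same total size.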
Serial sampling can be organized according to different schemes. For example, it is possible to form a sampling set in two stages: first, the series to be examined are randomly selected, then a certain number of units to be directly observed (measured, weighed, etc.) are also randomly selected from each selected series. The error of such a sample will depend on the error of serial selection and on the error of individual selection, i.e. multi-stage sampling generally gives less accurate results than single-stage sampling, which is explained by the occurrence of representativeness errors at each sampling stage. In this case, it is required to use the sampling error formula for combined selection.
Another form of selection is multi-phase selection (1, 2, 3 phases or stages). This selection differs in structure from multi-stage selection, since in multi-phase selection the same selection units are used in each phase. Errors in multi-phase selection are calculated for each phase separately. The main feature of two-phase sampling is that the samples differ from each other according to three criteria: 1) the proportion of units studied in the first phase of the sample and re-included in the second and subsequent phases; 2) the observance of equal chances for each sample unit of the first phase to become an object of study again; 3) the size of the interval separating the phases from each other.
Let us dwell on one more type of selection, namely mechanical (or systematic) selection. This is probably the most common type, apparently because, of all the selection methods, it is the simplest. In particular, it is much simpler than proper random selection, which requires the ability to use tables of random numbers, and it does not require additional information about the general population and its structure. In addition, mechanical selection is closely intertwined with proportional stratified selection, which leads to a decrease in the sampling error.
For example, mechanically selecting members of a housing cooperative from a list drawn up in the order of admission to the cooperative will ensure proportional representation of members with different lengths of membership. Using the same technique to select respondents from an alphabetical list of persons provides equal chances for surnames beginning with different letters, and so on. The use of personnel or other lists at enterprises, educational institutions, etc., can provide the necessary proportionality in the representation of workers with different lengths of service. Note that mechanical selection is widely used in sociology, in the study of public opinion, and elsewhere.
In order to reduce the magnitude of the error, and especially the cost of sampling, various combinations of particular types of selection (mechanical, serial, individual, multi-phase, etc.) are used. In such cases, more complex sampling errors should be calculated, consisting of the errors that arise at the different stages of the study.
A small sample is a set of fewer than 30 units. Small samples are quite common in practice: for example, counts of cases of rare diseases or of units bearing a rare trait. In addition, small samples are used when research is expensive or involves the destruction of the products or specimens studied. Small samples are widely used in product quality surveys. The theoretical foundations for determining the errors of a small sample were laid by the English statistician W. Gosset (who wrote under the pseudonym Student).
It must be remembered that when determining the error for a small sample, the value (n − 1) is taken instead of the sample size, or, before determining the average sampling error, the so-called corrected sample variance is calculated (with (n − 1) in the denominator instead of n). Note that this correction is made only once: either when calculating the sample variance or when determining the error. The value (n − 1) is called the number of degrees of freedom. In addition, the normal distribution is replaced by the t-distribution (Student's distribution), which is tabulated and depends on the number of degrees of freedom. The only parameter of the Student distribution is (n − 1). We emphasize once again that the (n − 1) correction is important and significant only for small sample populations; for n > 30 the difference disappears, approaching zero.
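The effect of the (n − 1) correction can be illustrated with Python's standard statistics module, where variance divides by n − 1 (corrected) and pvariance divides by n; the data are hypothetical.

```python
import statistics

# Hypothetical small sample, n = 5.
data_small = [4.1, 3.8, 4.5, 4.0, 3.6]

uncorr = statistics.pvariance(data_small)   # divides by n
corr = statistics.variance(data_small)      # divides by n - 1 (corrected)
print(f"n=5: uncorrected={uncorr:.4f}, corrected={corr:.4f}, "
      f"ratio={corr / uncorr:.4f}")         # ratio equals n/(n-1) = 1.25

# With larger n the correction factor n/(n-1) approaches 1,
# which is why the distinction matters only for small samples.
for n in (5, 30, 100):
    print(f"n={n:>3}: correction factor = {n / (n - 1):.4f}")
```

At n = 5 the corrected variance is 25% larger than the uncorrected one; at n = 100 the factor is already 1.0101, effectively negligible.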
Until now, we have been talking about random samples, i.e. such when the selection of units from the general population is made randomly (or almost randomly) and all units have an equal (or almost equal) probability of being included in the sample. However, the selection of units can be based on the principle of non-random selection, when the principle of accessibility and purposefulness is at the forefront. In such cases, it is impossible to speak about the representativeness of the obtained sample, and the calculation of representativeness errors can be made only if we have information about the general population.
Several schemes for forming a non-random sample are known, widespread mainly in sociological research: selection of available observation units, selection by the Nuremberg method, targeted sampling when selecting experts, etc. The quota sample is also important: it is formed by the researcher according to a small number of significant parameters and gives a very close match with the general population. In other words, quota selection should provide the researcher with an almost complete match between the sample and the general population with respect to the parameters chosen. Such targeted closeness of the two populations on a limited range of indicators is achieved, as a rule, with a sample of considerably smaller size than with random selection. It is this circumstance that makes quota selection attractive to a researcher who cannot afford a large self-weighting random sample. A smaller sample size is usually also combined with lower monetary costs and shorter study times, which adds to the advantages of this selection method. Note, however, that a quota sample requires a fair amount of preliminary information about the structure of the general population. The selected characteristics (most often socio-demographic: sex, age, education) should correlate closely with the studied characteristics of the general population, i.e., the object of study.
As already mentioned, the sampling method makes it possible to obtain information about the general population with much less money, time and effort than with continuous observation. It is also clear that a continuous study of the entire general population is impossible in a number of cases, for example, when checking the quality of products whose samples are destroyed.
Along with this, however, it should be pointed out that the general population is not a complete "black box": we still have some information about it. When conducting, for example, a sample study of students' life, way of life, property status, incomes and expenditures, opinions, interests, etc., we still have information about their total number and their grouping by sex, age, marital status, place of residence, course of study and other characteristics. This information is always used in a sample study.
There are several ways of extending sample characteristics to the general population: the method of direct recalculation and the method of correction factors. The recalculation of sample characteristics is carried out, as a rule, with allowance for confidence intervals and can be expressed in absolute and relative terms.
It is appropriate to emphasize here that most statistical information concerning the economic life of society in its various manifestations and forms is based on sample data. Of course, such data are supplemented by complete registration data and by information obtained from censuses (of the population, enterprises, etc.). For example, all budget statistics (on the incomes and expenditures of the population) provided by Rosstat are based on sample survey data. Information on prices, production volumes and trade volumes, expressed in the corresponding indices, is also largely based on sample data.
Statistical hypotheses and statistical tests. Basic concepts
The concepts of statistical test and statistical hypothesis are closely related to sampling. A statistical hypothesis (unlike other scientific hypotheses) is an assumption about some property of the general population that can be tested using data from a random sample. It should be remembered that the result obtained is probabilistic in nature. Consequently, a result that confirms the hypothesis put forward can almost never serve as grounds for its final acceptance, whereas a result inconsistent with it is quite sufficient for rejecting the hypothesis as erroneous or false. This is because the result obtained may be consistent with other hypotheses as well, not only with the one put forward.
A statistical criterion is a set of rules that make it possible to answer the question of under which observation results the hypothesis is rejected and under which it is not. In other words, a statistical test is a decision rule that ensures, with high probability, the acceptance of a true hypothesis and the rejection of a false one. Statistical tests may be one-sided or two-sided, parametric or non-parametric, and more or less powerful. Some criteria are used frequently, others less often. Some criteria are designed to solve special problems, while others can be applied to a wide class of problems. These criteria are widespread in sociology, economics, psychology, the natural sciences, etc.
Let us introduce some basic concepts of statistical hypothesis testing. Hypothesis testing begins with a null hypothesis H0, i.e., some assumption made by the researcher, and a competing, alternative hypothesis H1 that contradicts it. For example: H0: x̄ = a, H1: x̄ ≠ a (where a is the general average).
The main aim of the researcher when testing a hypothesis is to reject the hypothesis he has put forward. As R. Fisher wrote, the purpose of testing any hypothesis is to reject it. Hypothesis testing proceeds by contradiction. Therefore, if we believe that, for example, the average wage of workers obtained from a particular sample and equal to 186 monetary units per month does not coincide with the actual wage for the entire population, then the null hypothesis assumed is that these wages are equal.
The competing hypothesis H1 can be formulated in different ways:
H1: x̄ ≠ a, H1: x̄ > a, H1: x̄ < a.
Next, the type I error (α) is determined, which sets the probability that a true hypothesis will be rejected. Obviously, this probability should be small (usually from 0.01 to 0.1, most often 0.05 by default, the so-called 5% significance level). These levels follow from the sampling method, according to which double or triple the error marks the limits beyond which the random variation of sample characteristics rarely goes. The type II error (β) is the probability that a false hypothesis will be accepted. As a rule, the type I error is the more "dangerous" one, and it is the one fixed by the statistician. If at the outset of the study we want to fix α and β simultaneously (for example, α = 0.05, β = 0.1), we must first calculate the required sample size.
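A hedged sketch of this last point: for a one-sided test on a mean with known standard deviation, fixing α and β in advance determines the required sample size via n = ((z_α + z_β)·σ/δ)², where δ is the smallest shift in the mean worth detecting. All numerical inputs below are assumed for illustration; the z values are standard normal quantiles taken from tables.

```python
import math

# One-sided test on a mean; we want alpha = 0.05 and beta = 0.10
# for detecting a shift of delta units, given sigma (all assumed).
sigma = 15.0      # standard deviation of the trait
delta = 5.0       # smallest shift in the mean worth detecting
z_alpha = 1.645   # normal quantile for alpha = 0.05 (one-sided, tabulated)
z_beta = 1.282    # normal quantile for beta = 0.10 (tabulated)

# Required sample size, rounded up to a whole number of units.
n = math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)
print(f"required sample size: n = {n}")
```

This illustrates why α and β cannot both be reduced at a fixed sample size: shrinking either one increases the corresponding z quantile and hence the required n.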
The critical zone (or region) is the set of criterion values for which H0 is rejected. The critical point T_cr is the point separating the region of acceptance of the hypothesis from the region of rejection, i.e., the critical zone.
As already mentioned, the type I error (α) is the probability of rejecting a correct hypothesis. The smaller α, the lower the probability of committing a type I error. But at the same time, as α decreases (for example, from 0.05 to 0.01), it becomes harder to reject the null hypothesis, which, in effect, is what the researcher has set out to do. We emphasize once again that decreasing α still further will in fact cause all hypotheses, true and false, to fall into the region of acceptance of the null hypothesis and make it impossible to distinguish between them.
A type II error (β) occurs when H0 is accepted but the alternative hypothesis H1 is actually true. The value γ = 1 − β is called the power of the criterion. The type II error (i.e., erroneous acceptance of a false hypothesis) decreases as the sample size and the significance level increase. It follows that, for a fixed sample size, α and β cannot be decreased simultaneously; this can be achieved only by increasing the sample size (which is not always possible).
Most often, hypothesis-testing problems reduce to comparing two sample means or proportions; comparing a general mean (or proportion) with a sample one; comparing empirical and theoretical distributions (goodness-of-fit criteria); comparing two sample variances (χ²-criterion); comparing two sample correlation coefficients or regression coefficients; and some other comparisons.
The decision to accept or reject the null hypothesis consists in comparing the actual value of the criterion with the tabular (theoretical) one. If the actual value is less than the table value, then it is concluded that the discrepancy is random, insignificant, and the null hypothesis cannot be rejected. The reverse situation (the actual value is greater than the tabular one) leads to the rejection of the null hypothesis.
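The decision rule just described can be sketched for the wage example mentioned earlier. The sample data are hypothetical, and the tabular (theoretical) value is assumed from Student's table for P = 0.95 and n − 1 = 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of monthly wages; H0: the general mean equals 186.
sample = [182, 190, 178, 195, 188, 180, 184, 192, 186, 179]
a0 = 186.0
n = len(sample)

# Actual value of the criterion: standardized deviation of the sample
# mean from the hypothesized general mean.
t_actual = (statistics.fmean(sample) - a0) / math.sqrt(
    statistics.variance(sample) / n)
t_table = 2.262   # Student's table, P = 0.95, 9 d.f. (tabulated, assumed)

# Actual value below the tabular one: the discrepancy is random.
if abs(t_actual) < t_table:
    print("discrepancy is random and insignificant: H0 cannot be rejected")
else:
    print("discrepancy is significant: H0 is rejected")
```

Here the sample mean (185.4) differs from 186 by far less than the marginal error allows, so the null hypothesis of equal wages cannot be rejected.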
When testing statistical hypotheses, the most commonly used tables are those of the normal distribution, the χ² distribution (read: chi-square), the t-distribution (Student's distribution) and the F-distribution (Fisher's distribution).