Mathematical statistics is a modern branch of mathematics that deals with the statistical description of the results of experiments and observations, as well as with building mathematical models that involve the concept of probability. The theoretical basis of mathematical statistics is probability theory.

In the structure of mathematical statistics, two main sections are traditionally distinguished: descriptive statistics and statistical inference (Figure 1.1).

Fig. 1.1. Main sections of mathematical statistics

Descriptive statistics is used for:

o summarizing the values of a single variable (statistics of a random sample);

o identifying relationships between two or more variables (correlation-regression analysis).

Descriptive statistics makes it possible to obtain new information and to grasp and evaluate it quickly and comprehensively; that is, it performs the scientific function of describing the objects of study, which justifies its name. Its methods are designed to turn a set of individual empirical data into a system of forms and numbers that are easy to perceive: frequency distributions and measures of central tendency, variability, and association. These methods compute the statistics of a random sample, which serve as the basis for statistical inference.

Statistical inference makes it possible to:

o evaluate the accuracy, reliability, and efficiency of sample statistics and find the errors that arise in the course of statistical research (statistical estimation);

o generalize to the parameters of the general population on the basis of sample statistics (testing of statistical hypotheses).

The main objective of scientific research is the acquisition of new knowledge about a large class of phenomena, persons, or events, which is commonly called the general population.

The general population is the totality of objects of study; a sample is a part of it, formed in a certain scientifically substantiated way (see note 2).

The term "general population" is used when we are talking about a large but finite set of objects under study, for example, the totality of university applicants in Ukraine in 2009 or the totality of preschool-age children in the city of Rivne. General populations can be very large, and they can be finite or infinite. In practice, as a rule, one deals with finite sets. If the ratio of the size of the general population to the size of the sample is more than 100, then, according to Glass and Stanley, the estimation methods for finite and infinite populations give essentially the same results. The general population can also be understood as the complete set of values of some attribute. The fact that the sample belongs to the general population is the main basis for assessing the characteristics of the general population from the characteristics of the sample.
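The effect of the population-to-sample ratio can be illustrated with the finite population correction, a standard result for sampling without replacement. The source does not state this formula, so its use here is an assumption, and the function name `finite_population_correction` is illustrative. The sketch shows by what factor the standard error of the sample mean for a finite population of size N differs from the infinite-population value.

```python
import math


def finite_population_correction(population_size, sample_size):
    """Factor by which the standard error of the sample mean shrinks
    when sampling without replacement from a finite population."""
    if population_size < 2 or not 0 < sample_size <= population_size:
        raise ValueError("need population_size >= 2 and 0 < sample_size <= population_size")
    return math.sqrt((population_size - sample_size) / (population_size - 1))


if __name__ == "__main__":
    n = 100  # hypothetical sample size
    for ratio in (2, 10, 100, 1000):
        N = ratio * n
        factor = finite_population_correction(N, n)
        print(f"N/n = {ratio:>4}: correction factor = {factor:.4f}")
```

For a ratio of 100 the factor differs from 1 by about half a percent, which is consistent with the remark attributed to Glass and Stanley.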

The main idea of mathematical statistics rests on the conviction that a complete study of all objects of the general population is, in most scientific problems, either practically impossible or economically impractical, since it requires a great deal of time and significant material costs. Therefore, mathematical statistics uses the sampling approach, whose principle is shown in the diagram in Fig. 1.2.

Note 2. For example, by the method of formation, samples are randomized (simple and systematic), stratified, or clustered (see Section 4).

Fig. 1.2. Scheme of application of the methods of mathematical statistics

According to the sampling approach, the methods of mathematical statistics can be applied in the following sequence (see Fig. 1.2):

o from the general population, whose properties are the subject of the research, a sample is formed by certain methods: a typical but limited number of objects to which the research methods are applied;

o observation, experiment, and measurement on the sample objects yield empirical data;

o processing the empirical data with the methods of descriptive statistics gives sample indicators, which are called statistics (like the name of the discipline, by the way);

o applying the methods of statistical inference to the statistics yields the parameters that characterize the properties of the general population.
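The sequence above can be sketched in code. This is a minimal illustration, not a prescribed procedure: the population is simulated, the function names (`draw_sample`, `describe`, `mean_confidence_interval`) are hypothetical, and the interval uses a normal approximation with z = 1.96, which is an assumption that is reasonable only for fairly large samples.

```python
import random
import statistics


def draw_sample(population, n, seed=0):
    """Step 1: simple random sample of size n, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def describe(sample):
    """Step 3: descriptive statistics computed from the empirical data."""
    return {
        "n": len(sample),
        "mean": statistics.fmean(sample),
        "stdev": statistics.stdev(sample),  # divisor n - 1
    }


def mean_confidence_interval(sample, z=1.96):
    """Step 4: approximate 95% interval for the population mean."""
    stats = describe(sample)
    half_width = z * stats["stdev"] / stats["n"] ** 0.5
    return stats["mean"] - half_width, stats["mean"] + half_width


if __name__ == "__main__":
    # Hypothetical general population of 10,000 test scores.
    rng = random.Random(42)
    population = [rng.gauss(70, 10) for _ in range(10_000)]
    sample = draw_sample(population, 100)  # step 2: the "measurements"
    print(describe(sample))
    print(mean_confidence_interval(sample))
```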

Example 1.1. To assess the stability of the level of knowledge (variable X), a randomized sample (see note 3) of n students was tested. The tests contained m tasks, each of which was scored as follows: "completed" = 1, "not completed" = 0. Did the students' average current achievement $\bar{X}$ remain at the level of previous years, $\mu$?

Note 3. A randomized sample (from the English "random") is a representative sample formed according to the strategy of random trials.

Solution sequence:

o state a substantive hypothesis of the type: "if the current test results do not differ from past results, then the level of students' knowledge can be considered unchanged and the learning process stable";

o formulate an adequate statistical hypothesis, such as the null hypothesis $H_0$ that "the current mean score $\bar{X}$ does not differ statistically from the mean of previous years $\mu$", i.e. $H_0: \bar{X} = \mu$, against the corresponding alternative hypothesis $H_1: \bar{X} \neq \mu$;

o build empirical distributions of the investigated variable X;

o determine (if necessary) correlations, for example between the variable X and other indicators, and build regression lines;

o check whether the empirical distribution conforms to the normal law;

o estimate point values and confidence intervals of the parameters, for example the mean;

o choose the criteria (tests) for testing the statistical hypotheses;

o test the statistical hypotheses using the selected criteria;

o formulate a decision on the statistical null hypothesis at a chosen significance level;

o move from the decision to accept or reject the statistical null hypothesis to the interpretation of the conclusions regarding the substantive hypothesis;

o formulate meaningful conclusions.
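The testing steps of Example 1.1 can be sketched as a two-sided test of the current mean score against the mean of previous years. The source does not name a specific test, so the z-test with a normal approximation used below is an assumption (for small samples a t-test would be more appropriate). The function name and the data are hypothetical.

```python
import math
import statistics


def one_sample_z_test(scores, mu0, alpha=0.05):
    """Two-sided test of H0: mean == mu0 using a normal approximation."""
    n = len(scores)
    mean = statistics.fmean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("scores have zero variance; the statistic is undefined")
    z = (mean - mu0) / standard_error
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return {"mean": mean, "z": z, "p_value": p_value, "reject_h0": p_value < alpha}


if __name__ == "__main__":
    # Hypothetical numbers of completed tasks (out of m = 20) for n = 8 students.
    scores = [10, 12, 11, 13, 9, 12, 11, 10]
    print(one_sample_z_test(scores, mu0=11))  # H0 is not rejected
    print(one_sample_z_test(scores, mu0=20))  # H0 is rejected
```

If the null hypothesis is not rejected, the substantive conclusion is that the data give no grounds to consider the level of knowledge changed; this is not the same as proving that it is unchanged.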

So, if we summarize the above procedures, the application of statistical methods consists of three main blocks:

The transition from an object of reality to an abstract mathematical and statistical scheme, that is, the construction of a probabilistic model of a phenomenon, process, property;

Carrying out computational actions by proper mathematical means within the framework of a probabilistic model based on the results of measurements, observations, experiments and the formulation of statistical conclusions;

Interpretation of statistical conclusions about the real situation and making an appropriate decision.

Statistical methods for processing and interpreting data rest on probability theory. Without the fundamental concepts and laws of probability theory, the conclusions of mathematical statistics cannot be generalized, and hence cannot be used with justification for scientific and practical purposes.

Thus, the task of descriptive statistics is to transform a set of sample data into a system of indicators (statistics): frequency distributions, measures of central tendency and variability, measures of association, and the like. However, statistics are in fact characteristics of one particular sample. Of course, it is possible to calculate sample distributions, sample means, variances, etc., but such "data analysis" is of limited scientific and educational value. The "mechanical" transfer of any conclusions drawn from such indicators to other populations is not correct.

In order to transfer sample indicators to other, or to broader, populations, one needs mathematically justified propositions about the correspondence and consistency of sample characteristics with the characteristics of these broader, so-called general, populations. Such propositions rest on theoretical approaches and schemes associated with probabilistic models of reality, for example on the axiomatic approach, the law of large numbers, etc. Only with their help can the properties established by analyzing limited empirical information be transferred to other or to broader populations. Thus the construction, the laws of functioning, and the use of probabilistic models, which are the subject of the mathematical field called "probability theory", become the essence of statistical methods.

Thus, mathematical statistics uses two parallel lines of indicators: the first relates to practice (sample indicators), the second rests on theory (indicators of a probabilistic model). For example, the empirical frequencies determined on a sample correspond to theoretical probabilities; the sample mean (practice) corresponds to the expected value (theory), and so on. In research, the sample characteristics are, as a rule, primary. They are calculated from observations, measurements, and experiments, then undergo statistical assessment of their consistency and efficiency and testing of statistical hypotheses in accordance with the objectives of the research, and in the end are accepted with a certain probability as indicators of the properties of the populations under study.

Questions and tasks

1. Describe the main sections of mathematical statistics.

2. What is the main idea of mathematical statistics?

3. Describe the relationship between the general population and the sample.

4. Explain the scheme for applying the methods of mathematical statistics.

5. Specify the list of the main tasks of mathematical statistics.

6. What are the main blocks of the application of statistical methods? Describe them.

7. Explain the connection between mathematical statistics and probability theory.

Introduction

1. Subject and methods of mathematical statistics

2. Basic concepts of mathematical statistics

2.1 Basic concepts of sampling

2.2 Sampling

2.3 Empirical distribution function, histogram

Conclusion

Bibliography

Introduction

Mathematical statistics is the science of mathematical methods for systematizing and using statistical data to draw scientific and practical conclusions. In many of its branches, mathematical statistics is based on the theory of probability, which makes it possible to assess the reliability and accuracy of conclusions drawn from limited statistical material (for example, to estimate the sample size needed to obtain results of the required accuracy in a sample survey).

Probability theory considers random variables with a given distribution, or random experiments whose properties are fully known. Its subject is the properties and relationships of these quantities (distributions).

But often the experiment is a black box, giving only some results, according to which it is required to draw a conclusion about the properties of the experiment itself. The observer has a set of numerical (or they can be made numerical) results obtained by repeating the same random experiment under the same conditions.

In this case the following questions arise, for example: if we observe a single random variable, how can we draw the most accurate conclusion about its distribution from the set of its values in several experiments?

Examples of such a series of experiments are a sociological survey, a set of economic indicators, or, finally, the sequence of heads and tails in a thousand coin tosses.

All of the above makes the topic of this work relevant and important at the present stage; the work aims at a deep and comprehensive study of the basic concepts of mathematical statistics.

Accordingly, the purpose of this work is to systematize, accumulate, and consolidate knowledge about the concepts of mathematical statistics.

1. Subject and methods of mathematical statistics

Mathematical statistics is the science of mathematical methods for analyzing data obtained during mass observations (measurements, experiments). Depending on the mathematical nature of the specific results of observations, mathematical statistics is divided into statistics of numbers, multivariate statistical analysis, analysis of functions (processes) and time series, and statistics of objects of non-numerical nature. A significant part of mathematical statistics is based on probabilistic models. The general tasks of data description, estimation, and hypothesis testing are distinguished. More specific tasks are also considered, related to conducting sample surveys, recovering dependencies, building and using classifications (typologies), etc.

To describe the data, tables, charts, and other visual representations are built, for example, correlation fields. Probabilistic models are usually not used. Some data description methods rely on advanced theory and the capabilities of modern computers. These include, in particular, cluster analysis, aimed at identifying groups of objects that are similar to each other, and multidimensional scaling, which makes it possible to visualize objects on a plane, distorting the distances between them to the least extent.

Estimation and hypothesis testing methods rely on probabilistic models of data generation. These models are divided into parametric and non-parametric. In parametric models, it is assumed that the objects under study are described by distribution functions that depend on a small number (1-4) of numerical parameters. In non-parametric models, the distribution functions are assumed to be arbitrary continuous functions. Mathematical statistics estimates the parameters and characteristics of a distribution (expected value, median, variance, quantiles, etc.), densities and distribution functions, and dependencies between variables (based on linear and non-parametric correlation coefficients, as well as parametric or non-parametric estimates of the functions expressing the dependencies), etc. Both point estimates and interval estimates (which give bounds for the true values) are used.

In mathematical statistics there is a general theory of hypothesis testing and a large number of methods devoted to testing specific hypotheses. Hypotheses are considered about the values of parameters and characteristics, about homogeneity (that is, the coincidence of characteristics or distribution functions in two samples), about the agreement of the empirical distribution function with a given distribution function or with a parametric family of such functions, about the symmetry of the distribution, etc.

Of great importance is the section of mathematical statistics associated with conducting sample surveys, with the properties of various sampling schemes and the construction of adequate methods for estimating and testing hypotheses.

Problems of recovering dependencies have been actively studied for more than 200 years, since C. F. Gauss developed the method of least squares in 1794. Currently, methods for finding an informative subset of variables and non-parametric methods are the most relevant.

The development of methods for data approximation and dimensionality reduction began more than 100 years ago, when K. Pearson created the method of principal components. Later, factor analysis and numerous non-linear generalizations were developed.

Various methods of constructing classifications (cluster analysis), and of analyzing and using them (discriminant analysis), are also called methods of pattern recognition (supervised and unsupervised), automatic classification, etc.

Mathematical methods in statistics are based either on the use of sums (based on the Central Limit Theorem of probability theory) or difference indicators (distances, metrics), as in the statistics of non-numerical objects. Usually only asymptotic results are rigorously substantiated. Nowadays computers play a big role in mathematical statistics. They are used both for calculations and for simulation modeling (in particular, in sampling methods and in studying the suitability of asymptotic results).

2. Basic concepts of mathematical statistics

2.1 Basic concepts of the sampling method

Let $\xi: \Omega \to \mathbb{R}$ be a random variable observed in a random experiment. It is assumed that the probability space is given (and it will not interest us).

We will assume that, having carried out this experiment $n$ times under the same conditions, we obtained the numbers $X_1, X_2, \dots, X_n$: the values of this random variable in the first, second, etc. experiments. The random variable $\xi$ has some distribution $F$, which is partially or completely unknown to us.

Let us take a closer look at the set $X = (X_1, \dots, X_n)$, called a sample.

In a series of experiments already performed, a sample is a set of numbers. But if this series of experiments is repeated, then instead of this set we will get a new set of numbers. Instead of the number $X_1$, another number will appear: one of the values of the random variable $\xi$. That is, $X_1$ (and $X_2$, and $X_3$, etc.) is a variable that can take the same values as the random variable $\xi$, and just as often (with the same probabilities). Therefore, before the experiment $X_1$ is a random variable with the same distribution as $\xi$, and after the experiment it is the number that we observe in this first experiment, i.e. one of the possible values of the random variable $\xi$.

A sample $X = (X_1, \dots, X_n)$ of size $n$ is a set of $n$ independent and identically distributed random variables ("copies of $\xi$") that, like $\xi$, have the distribution $F$.

What does it mean to "draw a conclusion about the distribution from a sample"? The distribution is characterized by a distribution function, density or table, and a set of numerical characteristics: $\mathsf{E}\,\xi$, $\mathsf{D}\,\xi$, $\mathsf{E}\,\xi^k$, etc. Based on the sample, one must be able to build approximations for all these characteristics.

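2.2 Sampling

Consider the realization of the sample on one elementary outcome $\omega_0$: the set of numbers $X_1 = X_1(\omega_0), \dots, X_n = X_n(\omega_0)$. On a suitable probability space, we introduce a random variable $\xi^*$ taking the values $X_1, \dots, X_n$ with probabilities $1/n$ each (if some of the values coincide, we add the probabilities the corresponding number of times). The probability distribution table and the distribution function of the random variable $\xi^*$ look like this:

$$\begin{array}{c|ccc} \xi^* & X_1 & \dots & X_n \\ \hline \mathsf{P} & 1/n & \dots & 1/n \end{array} \qquad F_n^*(y) = \sum_{X_i < y} \frac{1}{n} = \frac{\text{number of } X_i \in (-\infty, y)}{n}.$$

The distribution of $\xi^*$ is called the empirical or sample distribution. Let us calculate the expected value and variance of $\xi^*$ and introduce notation for these quantities:

$$\mathsf{E}\,\xi^* = \sum_{i=1}^{n} \frac{1}{n} X_i = \bar{X}, \qquad \mathsf{D}\,\xi^* = \sum_{i=1}^{n} \frac{1}{n} (X_i - \bar{X})^2 = S^2.$$

In the same way, we calculate the moment of order $k$:

$$\mathsf{E}\,(\xi^*)^k = \frac{1}{n} \sum_{i=1}^{n} X_i^k = \overline{X^k}.$$

In the general case, we denote by $\overline{g(X)}$ the quantity

$$\mathsf{E}\,g(\xi^*) = \frac{1}{n} \sum_{i=1}^{n} g(X_i) = \overline{g(X)}.$$

If, when constructing all the characteristics introduced above, we consider the sample $X_1, \dots, X_n$ as a set of random variables, then these characteristics themselves, $F_n^*(y)$, $\bar{X}$, $S^2$, $\overline{X^k}$, $\overline{g(X)}$, become random variables. These characteristics of the sample distribution are used to estimate (approximate) the corresponding unknown characteristics of the true distribution.

The reason for using the characteristics of the distribution of $\xi^*$ to estimate the characteristics of the true distribution of $\xi$ (or $X_1$) is the closeness of these distributions for large $n$.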
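As a minimal sketch, the characteristics of the empirical distribution described in this section (the sample mean, the sample variance with divisor n, the moment of order k, and the sample average of a function g) can be computed as follows. The function names are illustrative and are not taken from the source.

```python
def sample_mean(xs):
    """Expected value of the empirical distribution: each point has weight 1/n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Variance of the empirical distribution (divisor n, not n - 1)."""
    mean = sample_mean(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def sample_moment(xs, k):
    """Moment of order k of the empirical distribution."""
    return sum(x ** k for x in xs) / len(xs)


def sample_average_of(g, xs):
    """General case: average of g over the sample points."""
    return sum(g(x) for x in xs) / len(xs)


if __name__ == "__main__":
    xs = [1, 2, 3, 4]  # hypothetical realization of a sample
    print(sample_mean(xs), sample_variance(xs), sample_moment(xs, 2))
```

Note that the divisor here is n, whereas `statistics.variance` in the Python standard library uses n - 1.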

Consider, for example, $n$ tosses of a fair die. Let $X_i \in \{1, \dots, 6\}$ be the number of points that came up on the $i$-th toss, $i = 1, \dots, n$. Suppose that one occurs in the sample $n_1$ times, two occurs $n_2$ times, and so on. Then the random variable $\xi^*$ takes the values $1, \dots, 6$ with probabilities $n_1/n, \dots, n_6/n$ respectively. But these proportions approach $1/6$ as $n$ grows, by the law of large numbers. That is, the distribution of $\xi^*$ in some sense approaches the true distribution of the number of points that come up when a fair die is tossed.
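The die example can be checked by simulation. The sketch below, with illustrative function names and a pseudo-random generator standing in for a real die, computes the empirical proportions of the six faces and their largest deviation from 1/6.

```python
import random
from collections import Counter


def empirical_die_distribution(n, seed=0):
    """Proportions of the faces 1..6 in n simulated tosses of a fair die."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n))
    return {face: counts[face] / n for face in range(1, 7)}


def max_deviation(n, seed=0):
    """Largest absolute difference between a face proportion and 1/6."""
    distribution = empirical_die_distribution(n, seed)
    return max(abs(p - 1 / 6) for p in distribution.values())


if __name__ == "__main__":
    for n in (60, 600, 6_000, 60_000):
        print(n, round(max_deviation(n), 4))
```

In a typical run the deviation shrinks as n grows, although for any single seed the decrease need not be monotone.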

We will not specify what is meant by the closeness of the sample and true distributions. In the following paragraphs, we will take a closer look at each of the characteristics introduced above and examine its properties, including its behavior with increasing sample size.

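2.3 Empirical distribution function, histogram

Since the unknown distribution $F$ can be described, for example, by its distribution function $F(y) = \mathsf{P}(X_1 < y)$, we will construct an "estimate" for this function from the sample.

Definition 1.

The empirical distribution function built on a sample $X = (X_1, \dots, X_n)$ of size $n$ is the random function $F_n^*: \mathbb{R} \times \Omega \to [0, 1]$ that for each $y \in \mathbb{R}$ equals

$$F_n^*(y) = \frac{\text{number of } X_i \in (-\infty, y)}{n} = \frac{1}{n} \sum_{i=1}^{n} I(X_i < y).$$

Reminder: the random function

$$I(X_i < y) = \begin{cases} 1, & X_i < y, \\ 0, & \text{otherwise} \end{cases}$$

is called the indicator of the event $\{X_i < y\}$. For each $y$, it is a random variable having a Bernoulli distribution with parameter $p = \mathsf{P}(X_i < y) = F(y)$. (Why?)

In other words, for any $y$, the value $F(y)$, equal to the true probability that the random variable $X_1$ is less than $y$, is estimated by the proportion of sample elements less than $y$.

If the sample elements $X_1, \dots, X_n$ are sorted in ascending order (on each elementary outcome), a new set of random variables is obtained, called the variation series:

$$X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}.$$

The element $X_{(k)}$, $k = 1, \dots, n$, is called the $k$-th member of the variation series or the $k$-th order statistic.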

Example 1

Sample:

Variation series:

Fig. 1. Example 1

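The empirical distribution function has jumps at the sample points; the size of the jump at the point $X_i$ is $m/n$, where $m$ is the number of sample elements that coincide with $X_i$.

The empirical distribution function can be built from the variation series:

$$F_n^*(y) = \begin{cases} 0, & y \le X_{(1)}, \\ k/n, & X_{(k)} < y \le X_{(k+1)}, \\ 1, & y > X_{(n)}. \end{cases}$$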
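A minimal sketch of building the empirical distribution function from the variation series follows. It assumes the convention used in this section, in which the function counts sample elements strictly less than y; the names `variation_series` and `ecdf` are illustrative.

```python
from bisect import bisect_left


def variation_series(sample):
    """Sample elements sorted in ascending order (the order statistics)."""
    return sorted(sample)


def ecdf(sample):
    """Return F(y) = (number of sample elements strictly less than y) / n."""
    ordered = variation_series(sample)
    n = len(ordered)

    def distribution_function(y):
        # bisect_left gives the number of elements strictly below y.
        return bisect_left(ordered, y) / n

    return distribution_function


if __name__ == "__main__":
    F = ecdf([3, 1, 2, 2])  # hypothetical sample
    for y in (1, 2, 2.5, 3, 4):
        print(y, F(y))
```

With the repeated value 2 in this sample, the function rises by 2/4 just after y = 2, in line with the jump size m/n described above.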

Another characteristic of a distribution is its table (for discrete distributions) or density (for absolutely continuous ones). The empirical, or sample, analogue of the table or density is the so-called histogram.

The histogram is built from grouped data. The presumed range of values of the random variable (or the range of the sample data) is divided, independently of the sample, into a certain number of intervals (not necessarily equal). Let $A_1, \dots, A_k$ be intervals on the line, called grouping intervals. For $j = 1, \dots, k$, denote by $\nu_j$ the number of sample elements that fall into the interval $A_j$:

$$\nu_j = \{\text{number of } X_i \in A_j\} = \sum_{i=1}^{n} I(X_i \in A_j). \tag{1}$$

On each of the intervals $A_j$, a rectangle is built whose area is proportional to $\nu_j$. The total area of all rectangles must equal one. Let $l_j$ be the length of the interval $A_j$. The height $f_j$ of the rectangle above $A_j$ is

$$f_j = \frac{\nu_j}{n\, l_j}.$$

The resulting figure is called a histogram.
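The construction of the histogram can be sketched as follows. The function name, the sample, and the interval boundaries are hypothetical; the intervals are taken as half-open [a, b), with the last one closed on the right, which is one possible convention and is not specified in the source.

```python
def histogram_heights(sample, edges):
    """Counts per grouping interval and rectangle heights count / (n * length)."""
    if any(b <= a for a, b in zip(edges, edges[1:])):
        raise ValueError("edges must be strictly increasing")
    n = len(sample)
    k = len(edges) - 1
    counts = [0] * k
    for x in sample:
        for j in range(k):
            is_last = j == k - 1
            # Half-open intervals; the right end of the last one is included.
            if edges[j] <= x < edges[j + 1] or (is_last and x == edges[-1]):
                counts[j] += 1
                break
    heights = [counts[j] / (n * (edges[j + 1] - edges[j])) for j in range(k)]
    return counts, heights


if __name__ == "__main__":
    sample = [0.5, 1.5, 1.7, 3.0, 4.0]
    edges = [0, 1, 2, 4]  # three grouping intervals of unequal length
    counts, heights = histogram_heights(sample, edges)
    area = sum(h * (b - a) for h, a, b in zip(heights, edges, edges[1:]))
    print(counts, heights, area)
```

The total area equals one only when every sample element falls into one of the grouping intervals.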

Example 2

Given the variation series (see Example 1):

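A common rule for the number of grouping intervals is Sturges' formula $k = k(n) = 1 + [3.322 \lg n]$. Here $\lg n$ is the decimal logarithm, therefore $k \approx 1 + \log_2 10 \cdot \log_{10} n = 1 + \log_2 n$, i.e. when the sample is doubled, the number of grouping intervals increases by 1. Note that the more grouping intervals, the better. But if we take the number of intervals to be, say, of the order of $n$, then as $n$ grows the histogram will not approach the density.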

The following statement is true:

If the density of the distribution of the sample elements is a continuous function, then as $k(n) \to \infty$ in such a way that $k(n)/n \to 0$, the histogram converges pointwise in probability to the density.

So the choice of the logarithm is reasonable, but not the only possible one.
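Assuming that the logarithmic rule discussed here is Sturges' formula in the form k = 1 + floor(log2 n), which is an assumption because the formula itself did not survive in this text, the behaviour of the number of grouping intervals can be sketched as follows. The function name is illustrative.

```python
import math


def sturges_intervals(n):
    """Number of grouping intervals by Sturges' rule, k = 1 + floor(log2(n))."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return 1 + math.floor(math.log2(n))


if __name__ == "__main__":
    for n in (50, 100, 200, 400, 10_000):
        k = sturges_intervals(n)
        # k grows without bound while k / n tends to zero.
        print(n, k, round(k / n, 5))
```

Doubling the sample size adds exactly one interval, while the ratio k(n)/n tends to zero, which is the condition in the convergence statement above.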

Conclusion

Mathematical (or theoretical) statistics is based on the methods and concepts of probability theory, but in a sense it solves inverse problems.

If we observe the simultaneous manifestation of two (or more) attributes, i.e. we have a set of values of several random variables, what can be said about their dependence? Does it exist or not? And if so, what kind of dependence is it?

It is often possible to make some assumptions about the distribution hidden in the "black box" or about its properties. In this case, the experimental data are used to confirm or refute these assumptions ("hypotheses"). At the same time, we must remember that the answer "yes" or "no" can be given only with a certain degree of confidence, and the longer we can continue the experiment, the more accurate the conclusions can be. The most favorable situation for research is when some properties of the observed experiment can be asserted with confidence: for example, the presence of a functional dependence between the observed quantities, the normality of the distribution, its symmetry, the existence of a density, or its discrete nature, etc.

So, it makes sense to turn to (mathematical) statistics if

there is a random experiment whose properties are partially or completely unknown, and

we are able to reproduce this experiment under the same conditions some (or better, any) number of times.

Bibliography

1. Baumol W. Economic Theory and Operations Research. Moscow: Nauka, 1999.

2. Bolshev L.N., Smirnov N.V. Tables of Mathematical Statistics. Moscow: Nauka, 1995.

3. Borovkov A.A. Mathematical Statistics. Moscow: Nauka, 1994.

4. Korn G., Korn T. Handbook of Mathematics for Scientists and Engineers. St. Petersburg: Lan Publishing House, 2003.

5. Korshunov D.A., Chernova N.I. Collection of Tasks and Exercises in Mathematical Statistics. Novosibirsk: Publishing House of the S.L. Sobolev Institute of Mathematics SB RAS, 2001.

6. Peheletsky I.D. Mathematics: Textbook for Students. Moscow: Academy, 2003.

7. Sukhodolsky V.G. Lectures on Higher Mathematics for the Humanities. St. Petersburg: Publishing House of St. Petersburg State University, 2003.

8. Feller W. An Introduction to Probability Theory and Its Applications. Vol. 2. Moscow: Mir, 1984.

9. Harman H. Modern Factor Analysis. Moscow: Statistics, 1972.



Mathematical statistics is one of the main branches of mathematics; it studies the methods and rules for processing data. In other words, it explores ways of uncovering the regularities inherent in large collections of homogeneous objects, based on a sample survey of them.

The task of this branch is to construct methods for estimating probabilities or for making decisions about the nature of developing events on the basis of the results obtained. Tables, charts, and correlation fields are used to describe the data; probabilistic models are rarely applied for this purpose.

Mathematical statistics is used in various fields of science. For example, in economics it is important to process information about homogeneous sets of phenomena and objects, such as manufactured products, personnel, profit data, etc. Depending on the mathematical nature of the results of observations, one can distinguish the statistics of numbers, the analysis of functions, the statistics of objects of a non-numerical nature, and multivariate analysis. In addition, general and particular tasks (the latter related to recovering dependencies, using classifications, and sample surveys) are considered.

The authors of some textbooks believe that the theory of mathematical statistics is only a section of the theory of probability, while others believe that it is an independent science with its own goals, objectives and methods. However, in any case, its use is very extensive.

Mathematical statistics is, for example, clearly applicable in psychology. Using it allows the specialist to substantiate conclusions correctly, find relationships between data, generalize them, avoid many logical errors, and much more. It should be noted that it is often simply impossible to measure a given psychological phenomenon or personality trait without computational procedures, which shows that a grounding in this science is necessary. Its own source and basis, in turn, is probability theory.

The research method that relies on statistical data is used in other areas as well. However, its features are always specific to the nature of the objects to which it is applied. Therefore, it makes little sense to merge the statistics of different fields, physical statistics for instance, into a single science. The common features of the method come down to counting the objects that belong to a particular group, studying the distribution of quantitative attributes, and applying probability theory to obtain conclusions.

Elements of mathematical statistics are used in areas such as physics, astronomy, etc. Here one may consider the values of characteristics and parameters, hypotheses about the coincidence of characteristics in two samples, about the symmetry of a distribution, and much more.

Mathematical statistics plays an important role in conducting sample surveys. Their goal is most often to build adequate methods for estimating and testing hypotheses. At present, computer technologies are of great importance in this science. They not only simplify the calculations considerably but also make it possible to generate samples by resampling or to study the suitability of the results obtained in practice.

In the general case, the methods of mathematical statistics lead to one of two conclusions: either the desired judgment is made about the nature or properties of the data being studied and their relationships, or it is shown that the results obtained are not sufficient to draw conclusions.


Contents.

1. Introduction:
- How are probability and mathematical statistics used?
- What is "mathematical statistics"?
2. Examples of applying the theory of probability and mathematical statistics:
- Sample.
- Estimation problems.
- Probabilistic-statistical methods and optimization.
3. Conclusion.

Introduction.

How are probability theory and mathematical statistics used? These disciplines are the basis of probabilistic-statistical decision-making methods. To take advantage of their mathematical apparatus, decision-making problems must be expressed in terms of probabilistic-statistical models. Applying a specific probabilistic-statistical decision-making method consists of three stages:
- transition from economic, managerial, or technological reality to an abstract mathematical-statistical scheme, i.e. building a probabilistic model of a control system, a technological process, a decision-making procedure (in particular one based on the results of statistical control), etc.;
- carrying out calculations and drawing conclusions by purely mathematical means within the framework of the probabilistic model;
- interpretation of the mathematical-statistical conclusions in relation to the real situation and making an appropriate decision (for example, on whether product quality conforms to the established requirements, or on the need to adjust the technological process), in particular reaching conclusions on the proportion of defective units in a batch, on the specific form of the distribution laws of the controlled parameters of the technological process, etc.

Mathematical statistics uses the concepts, methods and results of probability theory. Let's consider the main issues of building probabilistic decision-making models in economic, managerial, technological and other situations. For the active and correct use of normative-technical and instructive-methodical documents on probabilistic-statistical methods of decision-making, preliminary knowledge is needed. So, it is necessary to know under what conditions one or another document should be applied, what initial information is necessary to have for its selection and application, what decisions should be made based on the results of data processing, etc.

What is "mathematical statistics"? Mathematical statistics is understood as “a branch of mathematics devoted to mathematical methods for collecting, systematizing, processing, and interpreting statistical data, as well as using them for scientific or practical conclusions. The rules and procedures of mathematical statistics are based on the theory of probability, which makes it possible to evaluate the accuracy and reliability of the conclusions obtained in each problem on the basis of the available statistical material”. Here, statistical data means information about the number of objects in some more or less extensive collection that have certain characteristics.

According to the type of problems being solved, mathematical statistics is usually divided into three sections: data description, estimation, and hypothesis testing.

According to the type of statistical data being processed, mathematical statistics is divided into four areas:

One-dimensional statistics (statistics of random variables), in which the result of an observation is described by a real number;

Multivariate statistical analysis, where the result of observation of an object is described by several numbers (vector);

Statistics of random processes and time series, where the result of observation is a function;

Statistics of objects of a non-numerical nature, in which the result of an observation is of a non-numerical nature, for example, it is a set (a geometric figure), an ordering, or obtained as a result of a measurement by a qualitative attribute.

Examples of application of probability theory and mathematical statistics.
Let us consider several examples in which probabilistic-statistical models are a good tool for solving managerial, industrial, economic and national-economic problems. A coin used for drawing lots must be "symmetrical" (fair): when it is tossed, heads should come up in half the cases on average, and tails in the other half. But what does "on average" mean? If many series of 10 tosses each are carried out, there will often be series in which heads comes up exactly 4 times. For a symmetrical coin this happens in 20.5% of the series. And if 100,000 tosses give 40,000 heads, can the coin be considered symmetrical? The procedure for making this decision is based on probability theory and mathematical statistics.
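
The two numbers in the coin example can be checked with a short calculation. The sketch below is only an illustration, not part of the original text: it assumes the tosses are independent, uses the binomial formula for the 10-toss series, and uses the normal approximation to express how far 40,000 heads lies from the expected 50,000. The function names are hypothetical.

```python
from math import comb, sqrt

def prob_exact_heads(n_tosses: int, n_heads: int, p: float = 0.5) -> float:
    """Binomial probability of exactly n_heads heads in n_tosses independent tosses."""
    return comb(n_tosses, n_heads) * p**n_heads * (1 - p)**(n_tosses - n_heads)

def z_score(n_tosses: int, n_heads: int, p: float = 0.5) -> float:
    """Deviation of the observed head count from its expected value, in standard deviations."""
    mean = n_tosses * p
    std = sqrt(n_tosses * p * (1 - p))
    return (n_heads - mean) / std

p4 = prob_exact_heads(10, 4)      # 210 / 1024, about 0.205
z = z_score(100_000, 40_000)      # about -63 standard deviations
print(f"P(exactly 4 heads in 10 tosses) = {p4:.4f}")
print(f"z-score for 40,000 heads in 100,000 tosses = {z:.1f}")
```

A deviation of roughly 63 standard deviations is far beyond anything a symmetrical coin would plausibly produce, so in this case the hypothesis of symmetry would be rejected.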

The example under consideration may not seem serious enough. However, it is. Drawing lots is widely used in organizing industrial technical and economic experiments, for example, when processing the results of measuring a quality index (friction moment) of bearings depending on various technological factors (the influence of the preservation medium, the methods of preparing bearings before measurement, the effect of bearing load during measurement, etc.). Suppose it is necessary to compare the quality of bearings after storage in different preservative oils, i.e. in oils of composition A and B. When planning such an experiment, the question arises which bearings should be placed in oil of composition A and which in oil of composition B, in such a way as to avoid subjectivity and ensure the objectivity of the decision.

Sample
The answer to this question can be obtained by drawing lots. A similar example can be given for the quality control of any product. To decide whether an inspected batch of products meets the established requirements, a sample is taken from it. A conclusion about the entire batch is then made from the results of inspecting the sample. Here it is very important to avoid subjectivity in forming the sample, i.e. each unit of product in the inspected lot must have the same probability of being selected. Under production conditions, units are usually selected for the sample not by drawing lots but by means of special tables of random numbers or computer random number generators.
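
Selection with a computer random number generator can be sketched as follows. This is an illustrative assumption, not a procedure taken from the text: the lot of 500 numbered units, the sample size of 20 and the function name `draw_sample` are all hypothetical, and Python's standard `random` module stands in for whatever generator is used in practice.

```python
import random

def draw_sample(lot, sample_size, seed=None):
    """Simple random sample without replacement: every unit is equally likely to be chosen."""
    rng = random.Random(seed)          # a fixed seed makes the draw reproducible
    return rng.sample(lot, sample_size)

lot = list(range(1, 501))              # hypothetical lot of 500 numbered units
sample = draw_sample(lot, 20, seed=1)
print(sorted(sample))
```
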
Similar problems of ensuring objective comparison arise when comparing different schemes for organizing production or remuneration, when holding tenders and competitions, when selecting candidates for vacant positions, etc. Everywhere a draw or a similar procedure is needed. Let us explain this using the example of identifying the strongest and second strongest teams in a tournament organized according to the Olympic system (the loser is eliminated). Suppose the stronger team always beats the weaker one. Clearly, the strongest team will become the champion. The second strongest team will reach the final if and only if it does not play the future champion before the final. If such a game is scheduled, the second strongest team will not reach the final. Whoever plans the tournament can either eliminate the second strongest team early by pairing it with the leader in the first round, or secure second place for it by arranging games only with weaker teams until the final. To avoid subjectivity, lots are drawn. For an 8-team tournament, the probability that the two strongest teams meet in the final is 4/7. Accordingly, with probability 3/7 the second strongest team leaves the tournament early.
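
The value 4/7 can be verified by enumerating every possible draw. The sketch below assumes a standard 8-slot bracket in which neighbouring slots play each other, all 8! seedings are equally likely, and teams are numbered 0 (strongest) to 7 (weakest); these conventions and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def finalists(bracket):
    """Play a single-elimination bracket; the lower number (stronger team) always wins."""
    teams = list(bracket)
    while len(teams) > 2:
        # neighbouring slots meet, the stronger team advances to the next round
        teams = [min(teams[i], teams[i + 1]) for i in range(0, len(teams), 2)]
    return set(teams)

def prob_top_two_in_final(n_teams=8):
    """Exact probability, over all equally likely draws, that teams 0 and 1 meet in the final."""
    draws = list(permutations(range(n_teams)))
    favourable = sum(finalists(d) == {0, 1} for d in draws)
    return Fraction(favourable, len(draws))

p_final = prob_top_two_in_final(8)
print(p_final)        # 4/7
print(1 - p_final)    # 3/7
```

The same result follows from a direct argument: once the strongest team is placed, the second strongest reaches the final only if it lands in one of the 4 slots of the other half, out of the 7 slots that remain.
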
Any measurement of product units (with a caliper, micrometer, ammeter, etc.) contains errors. To find out whether there are systematic errors, repeated measurements must be made on a unit whose characteristics are known (for example, a standard sample). It should be remembered that, in addition to the systematic error, there is also a random error.

The question therefore arises of how to determine from the measurement results whether a systematic error is present. If we note only whether the error in each successive measurement is positive or negative, this problem reduces to the previous one. Indeed, let us liken a measurement to a coin toss, a positive error to heads and a negative error to tails (with a sufficiently fine scale, a zero error almost never occurs). Then checking for the absence of a systematic error is equivalent to checking the symmetry of the coin.

The purpose of these considerations is to reduce the problem of checking for the absence of a systematic error to the problem of checking the symmetry of a coin. This reasoning leads to the so-called "sign test" in mathematical statistics.
The "sign test" is a statistical test of the null hypothesis that the sample obeys the binomial distribution with parameter p=1/2. It can be used as a nonparametric test of the hypothesis that the median is equal to a given value (in particular, zero), and of the absence of a shift (no treatment effect) in two paired samples. It also makes it possible to test the hypothesis that a distribution is symmetric, although there are more powerful tests for this purpose: the one-sample Wilcoxon test and its modifications.

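The sign test described above can be sketched in a few lines. The measurement errors below are made-up illustrative data, the function name is hypothetical, and the two-sided p-value is obtained by doubling the smaller binomial tail, which is one common convention rather than the only possible one.

```python
from math import comb

def sign_test_p_value(errors):
    """Two-sided sign test of H0: positive and negative errors are equally likely (p = 1/2)."""
    positives = sum(1 for e in errors if e > 0)
    negatives = sum(1 for e in errors if e < 0)
    n = positives + negatives              # zero errors are dropped
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for the two-sided alternative
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)

# made-up errors of 10 repeated measurements of a standard sample
errors = [0.12, 0.05, -0.03, 0.08, 0.10, 0.02, 0.07, -0.01, 0.09, 0.04]
p_value = sign_test_p_value(errors)
print(f"p-value = {p_value:.4f}")
```

With 8 positive and 2 negative errors the p-value is about 0.11, so at the usual 5% level these ten measurements alone would not be enough to claim a systematic error.
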
In the statistical regulation of technological processes, the methods of mathematical statistics are used to develop rules and plans for statistical process control, aimed at detecting in good time that a process has gone out of adjustment, correcting it, and preventing the release of products that do not meet the established requirements. These measures reduce production costs and losses from the supply of low-quality products. In statistical acceptance control, the methods of mathematical statistics are used to develop quality control plans based on the analysis of samples from product batches. The difficulty lies in correctly building the probabilistic-statistical decision-making models on which answers to the questions posed above can be based. For this purpose, mathematical statistics has developed probabilistic models and hypothesis-testing methods, in particular for hypotheses that the proportion of defective units is equal to a given number p0, for example, p0 = 0.23.

Estimation problems.
In a number of managerial, industrial, economic and national-economic situations, problems of a different type arise: problems of estimating the characteristics and parameters of probability distributions.

Consider an example. Suppose a batch of N electric lamps arrives for inspection. A sample of n electric lamps is randomly selected from this batch. A number of natural questions arise. How can the average service life of the lamps be determined from the results of testing the sample, and with what accuracy can this characteristic be estimated? How does the accuracy change if a larger sample is taken? For what number of hours T can it be guaranteed that at least 90% of the lamps will last T hours or more?

Suppose that when testing a sample of n electric lamps, X electric lamps turned out to be defective. Then the following questions arise. What limits can be specified for the number D of defective electric lamps in a batch, for the level of defectiveness D/N, etc.?
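
One rough way to obtain limits for the defect level D/N is a confidence interval based on the normal approximation with a finite population correction. The sketch below is an assumption chosen for illustration, not a method prescribed by the text: the values of N, n and X are hypothetical, the 95% level (z = 1.96) is arbitrary, and for small defect counts an exact method would be more appropriate.

```python
from math import sqrt

def defect_fraction_interval(n_sample, n_defective, lot_size, z=1.96):
    """Approximate confidence limits for the lot defect level D/N (normal approximation)."""
    p_hat = n_defective / n_sample
    # finite population correction: the sample is drawn without replacement from N units
    fpc = sqrt((lot_size - n_sample) / (lot_size - 1))
    half_width = z * sqrt(p_hat * (1 - p_hat) / n_sample) * fpc
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

N, n, X = 10_000, 200, 6                     # hypothetical lot size, sample size, defect count
low, high = defect_fraction_interval(n, X, N)
print(f"D/N between {low:.4f} and {high:.4f}")
print(f"D between {low * N:.0f} and {high * N:.0f}")
```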

Or, in a statistical analysis of the accuracy and stability of technological processes, it is necessary to evaluate such quality indicators as the average value of the controlled parameter and the degree of its spread in the process under consideration. According to the theory of probability, it is advisable to use its mathematical expectation as the mean value of a random variable, and the variance, standard deviation, or coefficient of variation as a statistical characteristic of the spread. This raises the question: how to estimate these statistical characteristics from sample data and with what accuracy can this be done? There are many similar examples. Here it was important to show how probability theory and mathematical statistics can be used in production management when making decisions in the field of statistical product quality management.
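
The sample counterparts of these characteristics can be computed directly. The following sketch uses made-up measurements of a controlled parameter and Python's standard `statistics` module; the function name and the data are hypothetical. The standard error of the mean is included as one common measure of the accuracy of the mean estimate, and it decreases as the sample size grows.

```python
import statistics
from math import sqrt

def spread_summary(values):
    """Sample estimates of the mean, variance, standard deviation and coefficient of variation."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)     # unbiased estimate, divides by n - 1
    std = sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "cv": std / mean,                      # coefficient of variation
        "std_error": std / sqrt(len(values)),  # accuracy of the mean estimate
    }

measurements = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7]   # made-up data
summary = spread_summary(measurements)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```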

Probabilistic-statistical methods and optimization. The idea of optimization permeates modern applied mathematical statistics and other statistical methods, namely the methods of planning experiments, statistical acceptance control, statistical control of technological processes, etc. On the other hand, optimization formulations in decision theory, for example in the applied theory of optimizing product quality and standard requirements, provide for the widespread use of probabilistic-statistical methods, primarily applied mathematical statistics.

In production management, in particular when optimizing product quality and standard requirements, it is especially important to apply statistical methods at the initial stage of the product life cycle, i.e. at the stage of research preparation for experimental design developments (development of promising requirements for products, preliminary design, terms of reference for experimental design development). This is due to the limited information available at the initial stage of the product life cycle and the need to predict the technical possibilities and the economic situation for the future. Statistical methods should be applied at all stages of solving an optimization problem: when scaling variables, developing mathematical models for the functioning of products and systems, conducting technical and economic experiments, etc.

In optimization problems, including optimization of product quality and standard requirements, all areas of statistics are used: statistics of random variables, multivariate statistical analysis, statistics of random processes and time series, and statistics of objects of a non-numerical nature. The choice of a statistical method for the analysis of specific data should be carried out according to the recommendations.


Every investigation in the field of random phenomena is always rooted in experiment, in experimental data. Numerical data that is collected when studying any feature of some object is called statistical. Statistical data are the initial material of the study. In order for them to be of scientific or practical value, they must be processed by methods of mathematical statistics.

Mathematical statistics is a scientific discipline whose subject is the development of methods for recording, describing and analyzing statistical experimental data obtained from observations of mass random phenomena.

The main tasks of mathematical statistics are:

    determination of the law of distribution of a random variable or a system of random variables;

    testing the plausibility of hypotheses;

    determination of unknown distribution parameters.

All methods of mathematical statistics are based on the theory of probability. However, due to the specificity of the problems being solved, mathematical statistics is separated from the theory of probability into an independent field. If in the theory of probability the model of the phenomenon is considered to be given and the possible real course of this phenomenon is calculated (Fig. 1), then in mathematical statistics an appropriate probabilistic model is selected based on statistical data (Fig. 2).

Fig.1. General problem of probability theory

Fig.2. General problem of mathematical statistics

As a scientific discipline, mathematical statistics developed along with the theory of probability. The mathematical apparatus of this science was built in the second half of the 19th century.

2. General population and sample.

To study statistical methods, the concepts of the general population and the sample are introduced. In general, the general population is understood as a random variable X with distribution function F(x). A sample of size n for a given random variable X is a set x_1, x_2, ..., x_n of independent observations of this variable, where x_i is called a sample value or realization of the random variable X. Thus, x_1, x_2, ..., x_n can be regarded both as numbers (once the experiment has been carried out and the sample taken) and as random variables (before the experiment), since they vary from sample to sample.

Example 1. To determine the dependence of the thickness of a tree trunk on its height, 200 trees were selected. In this case, the sample size is n=200.

Example 2. As a result of sawing particle boards on a circular saw, 15 values of the specific cutting work were obtained. In this case, n=15.

In order to judge with confidence the feature of the general population that interests us from sample data, the objects of the sample must represent it correctly, that is, the sample must be representative. Representativeness is usually achieved by random selection of objects: every object of the general population has the same probability of being included in the sample as every other.

Fig.3. Demonstration of the representativeness of the sample