Mathematical statistics. Variation series and their characteristics
Mathematical statistics is the branch of mathematics that studies mathematical methods of gathering, ordering, processing and interpreting the results of observations with the purpose of revealing statistical regularities.
Establishing the statistical regularities inherent in mass random phenomena is based on studying statistical data, that is, data on which values the attribute of interest (a random variable X) took as a result of observation.
In real social and economic systems it is usually impossible to carry out active experiments; therefore the data usually represent observations of an ongoing process, for example: an exchange rate at a stock exchange during a month, the wheat yield of a farm over 30 years, the labor productivity of workers per shift, etc. The results of observations are, in general, an unordered series of numbers, which has to be ordered (ranked) before it can be studied.
The operation of ordering the values of an attribute in increasing (or decreasing) order is called ranking of the experimental data.
After ranking, the experimental data can be grouped so that within each group the attribute takes the same value, which is called a variant (xi); in other words, the distinct values of an attribute are its variants. The number of elements in each group is called the frequency (ni) of the variant. The sum of all frequencies is equal to a certain number n, which is called the volume of the set: $n = \sum_i n_i$.
The ratio of the frequency of a given variant to the volume of the set is the relative frequency (wi) of this variant: wi = ni/n. Frequencies and relative frequencies are called weights.
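A short check that follows from these two definitions: the relative frequencies always sum to one. The index range of the sum is left implicit here, which is a notational assumption rather than the text's own notation.

```latex
\sum_i w_i \;=\; \sum_i \frac{n_i}{n} \;=\; \frac{1}{n}\sum_i n_i \;=\; \frac{n}{n} \;=\; 1
```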
A variation series is a ranked series of variants with corresponding weights (frequencies and relative frequencies) in increasing (or decreasing) order.
When studying variation series, the notion of cumulative frequency is used alongside the notion of frequency. The cumulative frequency shows how many variants with a value of the attribute smaller than x were observed. The ratio of the cumulative frequency to the volume of the set n is called the cumulative relative frequency.
To specify a variation series it is enough to give the variants together with the corresponding frequencies (relative frequencies) or cumulative frequencies (cumulative relative frequencies).
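As a rough sketch of how cumulative frequencies can be obtained from ordinary ones, the following Python fragment keeps a running total. It assumes that the cumulative frequency of a variant counts all observations up to and including that variant, which is the convention the cumulative column of Example 2 appears to follow. The function name `cumulative_series` is hypothetical, and the input is the frequency column of Example 1.

```python
from fractions import Fraction
from itertools import accumulate

def cumulative_series(variants, frequencies):
    """Return (variant, cumulative frequency, cumulative relative frequency) rows.

    Assumption: the cumulative frequency of a variant is the running total of
    frequencies up to and including that variant.
    """
    n = sum(frequencies)                     # volume of the set
    running = list(accumulate(frequencies))  # running totals of n_i
    return [(x, c, Fraction(c, n)) for x, c in zip(variants, running)]

# Variants and frequencies of Example 1 (points 0..4 for 24 persons)
rows = cumulative_series([0, 1, 2, 3, 4], [6, 7, 3, 5, 3])
for x, c, rel in rows:
    print(x, c, rel)
```

The last cumulative frequency equals n and the last cumulative relative frequency equals 1, which is a convenient sanity check.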
Variation series are either discrete or continuous (interval). A discrete variation series is a ranked sequence of variants with the corresponding frequencies and (or) relative frequencies.
Example 1. In a test, a group of 24 persons obtained the following points: 4, 0, 3, 4, 1, 0, 3, 1, 0, 4, 0, 0, 3, 1, 0, 1, 1, 3, 2, 3, 1, 2, 1, 2. Construct the discrete variation series.
Solution: Rank the original series and count the frequency and relative frequency of each variant:
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4.
As a result we obtain a discrete variation series:
| Point, xi | Frequency (number of students), ni | Relative frequency, wi |
|---|---|---|
| 0 | 6 | 6/24 |
| 1 | 7 | 7/24 |
| 2 | 3 | 3/24 |
| 3 | 5 | 5/24 |
| 4 | 3 | 3/24 |
| Total | 24 | 1 |
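The construction in Example 1 can be sketched in a few lines of Python. This is only an illustration under the definitions above; the function name `discrete_variation_series` is hypothetical, and `Fraction` prints relative frequencies in lowest terms (6/24 appears as 1/4).

```python
from collections import Counter
from fractions import Fraction

def discrete_variation_series(data):
    """Rank the data and return (variant, frequency, relative frequency) rows."""
    n = len(data)            # volume of the set
    counts = Counter(data)   # frequency n_i of each variant x_i
    return [(x, counts[x], Fraction(counts[x], n)) for x in sorted(counts)]

# Points obtained by the 24 persons of Example 1
points = [4, 0, 3, 4, 1, 0, 3, 1, 0, 4, 0, 0,
          3, 1, 0, 1, 1, 3, 2, 3, 1, 2, 1, 2]

series = discrete_variation_series(points)
for x, n_i, w_i in series:
    print(x, n_i, w_i)
```

Sorting the keys of the counter plays the role of ranking; the frequencies must sum to 24 and the relative frequencies to 1.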
If the number of distinct values of an attribute is large, constructing a discrete variation series is impractical. In this case an interval variation series should be constructed. To build such a series, the interval of variation of the attribute is subdivided into a number of partial intervals, and the number of observed values falling into each of them is counted.
The recommended number of intervals is calculated by the formula $m = 1 + 3{,}322 \lg n$, and the size of an interval (interval difference, interval width) by $k = \dfrac{x_{\max} - x_{\min}}{m}$, where $x_{\max} - x_{\min}$ is the difference between the greatest and the least values of the attribute.
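A minimal sketch of these two formulas in Python follows. The function names are hypothetical, rounding m to the nearest integer is an assumption (the text only shows the rounded result), and the numbers n = 60, xmin = 4, xmax = 15 are those of Example 2.

```python
import math

def recommended_intervals(n):
    """Number of intervals m = 1 + 3.322 * lg(n), before rounding."""
    return 1 + 3.322 * math.log10(n)

def interval_width(x_min, x_max, m):
    """Width k = (x_max - x_min) / m of one partial interval."""
    return (x_max - x_min) / m

m_raw = recommended_intervals(60)   # about 6.907
m = round(m_raw)                    # assumption: round to the nearest integer
k = interval_width(4, 15, m)        # about 1.57, taken as 1.6 in Example 2

print(m_raw, m, k)
```

Rounding the width up slightly (1,57 to 1,6) makes the m intervals cover the whole range from xmin to xmax with a small margin.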
Example 2. Suppose the distribution of farms by the number of workers per 100 hectares of agricultural land is given (n = 60):
12, 6, 8, 10, 11, 7, 10, 12, 8, 7,
7, 6, 7, 8, 6, 11, 9, 11, 9, 10,
11, 9, 10, 7, 8, 8, 11, 9, 8, 7,
5, 9, 7, 7, 14, 11, 9, 8, 7, 4,
7, 5, 5, 10, 7, 7, 5, 8, 10, 10,
15, 10, 10, 13, 12, 11, 15, 6, 6, 8.
Find the recommended number of intervals: m = 1 + 3,322 lg 60 ≈ 6,907; take m = 7.
Find the size of a partial interval: k = (15 – 4)/7 ≈ 1,6.
Construct an interval variation series using xmin as the initial value. Divide the interval of variation of the attribute X into m = 7 partial intervals with step k = 1,6 and count the number of farms falling into each interval:
| Groups of farms by number of workers per 100 hectares | Frequency (number of farms in the group), ni | Cumulative frequency (cumulative number of farms) | Relative frequency, wi |
|---|---|---|---|
| 4 – 5,6 | 5 | 5 | 5/60 |
| 5,61 – 7,2 | 17 | 22 | 17/60 |
| 7,21 – 8,8 | 9 | 31 | 9/60 |
| 8,81 – 10,4 | 15 | 46 | 15/60 |
| 10,41 – 12,0 | 10 | 56 | 10/60 |
| 12,01 – 13,6 | 1 | 57 | 1/60 |
| 13,61 – 15,2 | 3 | 60 | 3/60 |
| Total | 60 | - | 1 |
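The grouping of Example 2 can be reproduced with the following Python sketch. It assumes that a value equal to an upper boundary belongs to the interval ending there (the table places 12 in the interval ending at 12,0), and it uses exact fractions for the boundaries so that this comparison is not disturbed by floating-point rounding. The function name `interval_series` is hypothetical; lower boundaries are printed as 5.6, 7.2, and so on, rather than 5,61, 7,21.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

def interval_series(data, m, k):
    """Group data into m right-closed intervals of width k starting at min(data).

    Returns rows (lower, upper, frequency, cumulative frequency, relative frequency).
    Assumes m * k covers the whole range of the data.
    """
    x_min = min(data)
    uppers = [x_min + k * (i + 1) for i in range(m)]  # upper boundaries
    lowers = [x_min] + uppers[:-1]
    freq = [0] * m
    for x in data:
        # index of the first interval whose upper boundary is >= x
        freq[bisect_left(uppers, x)] += 1
    n = len(data)
    cumulative = list(accumulate(freq))
    return [(lo, up, f, c, Fraction(f, n))
            for lo, up, f, c in zip(lowers, uppers, freq, cumulative)]

# Number of workers per 100 hectares for the 60 farms of Example 2
workers = [12, 6, 8, 10, 11, 7, 10, 12, 8, 7,
           7, 6, 7, 8, 6, 11, 9, 11, 9, 10,
           11, 9, 10, 7, 8, 8, 11, 9, 8, 7,
           5, 9, 7, 7, 14, 11, 9, 8, 7, 4,
           7, 5, 5, 10, 7, 7, 5, 8, 10, 10,
           15, 10, 10, 13, 12, 11, 15, 6, 6, 8]

rows = interval_series(workers, m=7, k=Fraction(16, 10))
for lo, up, f, c, w in rows:
    print(f"{float(lo):g} - {float(up):g}", f, c, w)
```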
Variation series are presented graphically by means of a polygon and a histogram. A polygon of frequencies is a broken line whose segments connect the points (x1; n1), (x2; n2), …, (xk; nk). A polygon of relative frequencies is a broken line whose segments connect the points (x1; n1/n), (x2; n2/n), …, (xk; nk/n).
Construct a polygon of frequencies for Example 1:
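The vertices of this polygon follow directly from the table of Example 1. A minimal sketch that lists them, without drawing anything, might look as follows; the function name `polygon_vertices` is hypothetical.

```python
from fractions import Fraction

def polygon_vertices(variants, frequencies, relative=False):
    """Return the points (x_i; n_i), or (x_i; n_i/n) if relative is True."""
    n = sum(frequencies)
    if relative:
        return [(x, Fraction(f, n)) for x, f in zip(variants, frequencies)]
    return list(zip(variants, frequencies))

# Variants and frequencies of Example 1
variants = [0, 1, 2, 3, 4]
frequencies = [6, 7, 3, 5, 3]

freq_polygon = polygon_vertices(variants, frequencies)
rel_polygon = polygon_vertices(variants, frequencies, relative=True)
print(freq_polygon)
```

Joining consecutive vertices by straight segments gives the polygon; any plotting tool can be used for the drawing itself.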
The figure consisting of rectangles with base k (the interval width) and heights ni is called a histogram of frequencies. For a histogram of relative frequencies the heights are ni/n.
Construct a histogram of frequencies for Example 2.
