Lecture 1 (part 1)
In what cases is the non-linear fitting problem reduced to the well-known linear least squares method?
The ECs method.
1.1. Classical approaches. Traditional formulation of the regression problem
1.2. Procedure of the optimal linear smoothing (POLS) of some noisy data
1.3. The description of the Eigen-Coordinates (ECs) method
1.4. Further generalizations and some recommendations on the usage of the ECs method
1.5. Concluding remarks
Short preamble to my course
Until a certain time, fluctuations of very different natures (electromagnetic, mechanical, acoustic, etc.) that take place in any physical system were considered only as a source of interference, and the efforts of many leading scientists were devoted to developing methods for extracting a useful signal from these noises (fluctuations). The concept of "noise" has a negative connotation and does not carry any physical meaning, so in my lectures it makes sense to replace it with a more understandable term: physical fluctuations. A fluctuation should be understood as the deviation of a certain quantity from its "equilibrium" value.
Since the end of the last century, researchers have changed their attitude to these random fluctuations and have begun to ask: what kind of information can be extracted from these fluctuations, and about what phenomena occurring in the system do they "inform" the researcher? In Russia, these issues were raised and partially resolved by Professors R.M. Yulmetyev, S.F. Timashev and others. Unfortunately, these studies suffered one setback. Since these small-amplitude fluctuations are rather subtle phenomena, the mathematical apparatus used for their analysis should not introduce any uncontrolled errors. However, such errors were allowed to be present in that research.
Is it possible to keep only the measurement errors, so that the other two types of errors (treatment errors and errors associated with the model) need not be taken into account? Any expert immediately has an emotional objection: it's impossible! How can we abandon the assumptions about Gaussian noise, the Poisson process, the binomial distribution and other distributions of random variables that have proven their usefulness in practice? Where is the truth if one can so easily ignore these useful and necessary models? As further research by the author of this course shows, the truth lies in the limits of applicability of these widely used models: they all assume that their reliability is achieved as N goes to infinity, where N is the number of tests, i.e. the number of repetitions of the phenomenon. If N is not large, it becomes difficult to give a reliable assessment of the event. In my lectures you will see how to avoid the language of distributions and take into account only the measurement errors.
[Diagram: a physical quantity and its fluctuation; random variables; information; quantitative description of physical quantities; quantitative description of fluctuations; principles and fitting functions.]
1.1. Classical approaches. Traditional formulation of the one-dimensional regression problem
Initial data. In fact, there are three basic problems related to initial data:
1) All measured data are available only in a limited interval of the measured variables;
2) The data set is a random sample (a set of random points), which is always accompanied by measurement error and by the limitations of the apparatus function;
3) The data set can be fitted by a certain set of functions containing some number of fitting parameters, within the limits of an admissible error variance.
Fig. 1. Realization of a random sampling of R(x) given in the limited interval and aggravated by a random error.
Regression model. Usually it is supposed that the measured values of the response y are divided into two parts: (1) the basic part of y depends strongly on x and so is described by the function f(x); (2) the second part reflects the influence of other uncontrollable factors and so is described by a random function (error) $\varepsilon(x)$ with respect to the given factor x.
$$y(x) = f(x) + \varepsilon(x) \qquad (1)$$
Usually, the random function is associated mainly with experimental error (which includes all errors that appear as a result of the experimental measurements, the equipment used and the influence of different uncontrollable factors). The general relationship can be written in the equivalent discrete form
$$y_j = f(x_j) + \varepsilon_j, \quad j = 1, 2, \dots, N \qquad (2)$$
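The discrete model (2) can be illustrated with a minimal Python sketch. The deterministic part `f`, the noise level `sigma` and the use of Gaussian noise are assumptions made only for illustration; the lecture itself argues that such distributional assumptions are not obligatory.

```python
import numpy as np

def simulate_measurements(f, x, sigma, seed=0):
    """Return (y, eps) with y_j = f(x_j) + eps_j.

    Gaussian eps_j is an illustrative assumption, not a requirement
    of the regression model itself.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=x.shape)
    return f(x) + eps, eps

# Hypothetical deterministic part of the response, chosen for the demo.
def f(x):
    return 2.0 * np.exp(-0.5 * x)

# Data are available only in a limited interval of x.
x = np.linspace(0.1, 5.0, 200)
y, eps = simulate_measurements(f, x, sigma=0.05)
```

In a real experiment only `x` and `y` are known; `f` and `eps` cannot be observed separately, which is exactly the difficulty discussed next.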
Basic suppositions about errors. In the conventional models of regression analysis we cannot separate the function $f(x_j)$ from its random function $\varepsilon_j$. Nevertheless, an approximate separation is possible if one can approximately replace the function $f(x_j)$ by the regression function $f(x_j, \mathbf{A})$, where $\mathbf{A}$ is the fitting vector that belongs to the set of admissible fitting parameters. What should the dimension of the fitting vector $\mathbf{A}$ be? This question leads to the idea of reduction!
Usually we suppose that:
(S1) the realization of the random function $\varepsilon(x)$ in one set of experiments is totally independent of the realization of the random function belonging to another set of experiments (but this supposition is not true);
(S2) the distribution function of the errors remains the same; in other words, it keeps its statistical proximity during the whole process of measurements (as we shall see below, this can be valid only in the case of stable experiments).
Below we shall see that these two suppositions are not obligatory. We can overcome them and consider cases where the error function keeps a systematic component and does not follow a normal or uniform distribution. Below we show several examples from real applications that confirm our point of view.
Supposition about the regression function and its possible recognition. It is supposed that the set of admissible functions belongs to a parametric family of functions $f(x, \mathbf{A})$. The fitting vector $\mathbf{A}$ should minimize (in a certain sense) the random function $\varepsilon(x)$. So, the classical problem of one-dimensional regression analysis can be formulated as:
$$y_j = f(x_j, \mathbf{A}) + \varepsilon_j, \quad j = 1, 2, \dots, N \qquad (3)$$
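The components of the fitting vector $\mathbf{A}$ should minimize the error dispersion. Mathematically, this requirement can be expressed in the following simple form
$$\sigma^2(\mathbf{A}) = \frac{1}{N}\sum_{j=1}^{N}\left[y_j - f(x_j, \mathbf{A})\right]^2 \to \min \qquad (4)$$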
As we shall see below, the eigen-coordinates (ECs) method allows one to reduce the problem of non-linear fitting to the linear least squares method (LLSM), in which the components of the fitting vector enter into (3) in a linear way
$$Y(x_j) = \sum_{k=1}^{s} C_k X_k(x_j) + \varepsilon_j \qquad (5)$$
Throughout the lectures we refer to relationship (5) as the basic linear relationship (BLR). In writing expression (5), we suppose that all functions figuring in (5) are shifted relative to their mean values, i.e.
$$Y(x_j) \to Y(x_j) - \langle Y \rangle, \quad X_k(x_j) \to X_k(x_j) - \langle X_k \rangle, \quad \langle F \rangle = \frac{1}{N}\sum_{j=1}^{N} F(x_j) \qquad (6)$$
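A minimal Python sketch of how a relationship of the BLR type (5) can be solved by linear least squares after the mean-shift (6). The function name `fit_blr` and the test functions are hypothetical, and `numpy.linalg.lstsq` is used here as one convenient LLSM solver, not necessarily the one used in the lecture.

```python
import numpy as np

def fit_blr(Y, X):
    """Estimate the constants C_k in Y = sum_k C_k X_k.

    Y has shape (N,), X has shape (N, s). Both are shifted to zero mean
    before the fit, so any constant offset drops out of the problem.
    """
    Yc = Y - Y.mean()
    Xc = X - X.mean(axis=0)
    C, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    return C

# Illustrative data: two basis functions X_1 = x, X_2 = x^2 and an offset.
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([x, x**2])
Y = 3.0 * x - 2.0 * x**2 + 7.0

C = fit_blr(Y, X)  # close to [3, -2]; the offset 7 is removed by centering
```

The centering step is why no separate intercept column is needed: the constant term cancels when the means are subtracted from both sides.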
This procedure is very useful and can be considered obligatory because it keeps the deviations of the transformed error function near its mean value, equal to zero ($\langle \varepsilon \rangle = 0$). In other words, the systematic error of the chosen model is eliminated. In the opposite case, one can expect an uncontrollable increase of the initial and systematic error $\varepsilon(x)$ as a result of the integration and other transformations that lead from the initial expression (3) to (5).
Here (in (5)) the fitting vector is represented by a finite set (k = 1, 2, …, s) of independent constants $C_k$, and the fitting function is presented as a linear combination of the functions $X_k(x_j)$. Besides solving the non-linear fitting problem (reducing it in most cases to the linear problem), we are going to show how to recognize the proper hypothesis when there are at least two competing hypotheses. We are going to formulate a more reliable criterion that helps to select the suitable hypothesis.
As we know, in several cases the uncertainty in selecting the proper hypothesis can be decreased by presenting some curves (after finding the corresponding coordinates) in the form of straight lines. For example, the exponential function is recognized in the semi-log scale and the power-law function in the double-log scale, but for more complex cases such a representation is not known or is impossible.
In Table 1 we show the simplest functions that can be presented in the form of straight lines by selecting the corresponding coordinates. The proper coordinates should be selected in such a way that the initial data, presented in these coordinates, give a curve close to a straight line.
Table 1. Simple functions that admit presentation in the form of straight lines
The recognized function | Straight-line presentation Y = aX(x) + b
$y = A\exp(\lambda x)$ | $Y = \ln(y)$, $a = \lambda$, $X(x) = x$, $b = \ln(A)$
$y = A x^{\mu}$ | $Y = \ln(y)$, $a = \mu$, $X(x) = \ln(x)$, $b = \ln(A)$
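The two rows of Table 1 can be checked numerically with a short Python sketch. The parameter values and the helper `fit_line` are illustrative assumptions, the data are noise-free, and `numpy.polyfit` stands in for any straight-line least squares fit.

```python
import numpy as np

def fit_line(X, Y):
    """Least squares straight line Y = a*X + b; returns (a, b)."""
    a, b = np.polyfit(X, Y, 1)
    return a, b

x = np.linspace(0.5, 4.0, 40)

# Exponential hypothesis: a straight line in semi-log coordinates.
y_exp = 1.5 * np.exp(-0.8 * x)
a, b = fit_line(x, np.log(y_exp))
lam_est, A_exp_est = a, np.exp(b)   # slope = exponent rate, b = ln(A)

# Power-law hypothesis: a straight line in double-log coordinates.
y_pow = 2.0 * x**0.6
a, b = fit_line(np.log(x), np.log(y_pow))
mu_est, A_pow_est = a, np.exp(b)    # slope = power exponent, b = ln(A)
```

With noisy data the logarithm distorts the error, so the recovered values are only approximate; this sketch shows the change of coordinates, not a treatment of the errors.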
Certainly, the list of functions presented in Table 1 can be continued. Analyzing these functions, one comes to the following conclusion: the list of functions in Table 1 can be considerably extended if we are able to present the fitting function in the form of the BLR (5). So one can try to find another presentation of the initial fitting function (possible hypothesis) in order to express it in the equivalent form (5). Is it possible to realize this idea in practice or not? Figure 1 above represents a typical example. The presented data are measured in a limited interval of the variable x, contain measurement error, and could be fitted reasonably well either by a Gaussian or by some Lorentzian bell-like profile. However, careful investigation shows that this curve is neither Gaussian nor Lorentzian. The best fit is realized with the help of the function $R(x; \mathbf{A}) = B x^{\alpha} \exp(-a_1 x - a_2 x^2/2)$ with $B = 1.55$, $\alpha = 0.6$, $a_1 = 1.5$, $a_2 = 0.3$.
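As a rough illustration of the question raised above, the following Python sketch shows that the bell-like function with B = 1.55, exponent 0.6 and decay constants 1.5 and 0.3 becomes linear in its parameters after taking the logarithm, so it can be written in a BLR-like form. This is only a log-transform demonstration on noise-free data under assumed names (`R`, `alpha`, `M`); it is not the ECs construction itself, which is described in a later section.

```python
import numpy as np

def R(x, B, alpha, a1, a2):
    """Bell-like profile B * x**alpha * exp(-a1*x - a2*x**2/2)."""
    return B * x**alpha * np.exp(-a1 * x - a2 * x**2 / 2.0)

x = np.linspace(0.1, 3.0, 60)        # limited interval, x > 0
y = R(x, 1.55, 0.6, 1.5, 0.3)        # noise-free samples for the demo

# ln R = ln B + alpha*ln x - a1*x - (a2/2)*x^2 is linear in
# (ln B, alpha, -a1, -a2/2), with basis functions 1, ln x, x, x^2.
M = np.column_stack([np.ones_like(x), np.log(x), x, x**2])
coef, *_ = np.linalg.lstsq(M, np.log(y), rcond=None)

B_est = np.exp(coef[0])
alpha_est = coef[1]
a1_est = -coef[2]
a2_est = -2.0 * coef[3]
```

For real data with additive measurement error the logarithm amplifies the error where the signal is small, which is one motivation for looking for other linear presentations of the same hypothesis.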
Analysis of remnants. When the final fitting with the chosen hypothesis has been realized, it is useful to analyze the error function
$$\varepsilon_j = y_j - f(x_j, \mathbf{A}) \qquad (7)$$
which is usually defined as the remnant function (or the function of remnants). If the hypothesis is final, the remnant function (7) should behave as an equally distributed random function in the vicinity of zero, with no clearly visible trend. If there is some trend, it can point to some unexpected dependence that was not taken into account in the initial fitting procedure. Special attention should be paid to outliers. These outliers can considerably distort the values of the fitting parameters, and so two problems can appear:
