be close, in a certain sense, to the true value of the estimated parameter, and the optimal estimate must minimize this measure of closeness in accordance with the chosen criterion.
To simplify notation and discussion in what follows, we assume, unless stated otherwise, that the random process ξ(t) has a single unknown parameter l. Nevertheless, all conclusions drawn from our analysis of the estimation of the single parameter l remain valid for the joint estimation of several parameters l = {l1, l2,…, lμ} of the same random process ξ(t). Thus, it is natural to construct from the observed realization x(t) a single function to estimate the parameter l of the random process ξ(t). Evidently, the more knowledge we have about the characteristics of the analyzed random process ξ(t) and about the noise and interference in the received realization x(t), the more accurately we can estimate the possible values of the parameters of the random process ξ(t), and thus the more accurate will be the solution based on the synthesis of devices designed, using the chosen criterion, with minimal errors in the estimation of the random process parameters of interest.
More specifically, the estimated parameter is a random variable. Under this condition, the most complete data about the possible values of the parameter l of the random process ξ(t) are given by the a posteriori pdf p_post(l) = p{l|x(t)}, which is the conditional pdf given that the realization x(t) is received. The formula for the a posteriori pdf can be obtained from the theorem on conditional probabilities of two random variables l and X, where X = {x1, x2,…, xn} is the multidimensional (n-dimensional) sample of the realization x(t) within the limits of the interval [0, T]. According to the theorem on conditional probabilities [1],
p(l, X) = p(l) p(X|l) = p(X) p(l|X),    (11.1)
we can write
p_post(l) = p(l|X) = p(l) p(X|l) / p(X).    (11.2)
In (11.1) and (11.2), p(l) ≡ p_prior(l) is the a priori pdf of the estimated parameter l, and p(X) is the pdf of the multidimensional sample X of the realization x(t). The pdf p(X) does not depend on the current value of the estimated parameter l and can be determined from the normalization condition of p_post(l):
p(X) = ∫ p(X|l) p_prior(l) dl.    (11.3)
Integration is carried out over the a priori region (interval) of all possible values of the estimated parameter l. Taking (11.3) into consideration, we can rewrite (11.2) in the following form:
p_post(l) = p_prior(l) p(X|l) / ∫ p(X|l) p_prior(l) dl.    (11.4)
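As a minimal numerical sketch of (11.3) and (11.4) (the Gaussian data model, the Gaussian prior, and every name below are illustrative assumptions, not taken from the text), the a posteriori pdf can be evaluated on a grid of candidate values of l:

import numpy as np

# Assumed model: x(t) sampled as n i.i.d. Gaussian points with unknown mean l
# (the estimated parameter) and known noise standard deviation sigma.
rng = np.random.default_rng(0)
true_l, sigma, n = 1.5, 1.0, 50
x = rng.normal(true_l, sigma, size=n)            # observed sample X

l_grid = np.linspace(-2.0, 5.0, 1001)            # a priori region of l
dl = l_grid[1] - l_grid[0]

# ln p(X|l) on the grid; the log domain avoids numerical underflow
log_lik = np.array([-0.5 * np.sum((x - l) ** 2) / sigma**2 for l in l_grid])

p_prior = np.exp(-0.5 * (l_grid / 2.0) ** 2)     # assumed Gaussian prior
p_prior /= np.sum(p_prior) * dl                  # normalize on the grid

# Eq. (11.4): prior times likelihood, divided by the integral (11.3)
post_unnorm = p_prior * np.exp(log_lik - log_lik.max())
p_post = post_unnorm / (np.sum(post_unnorm) * dl)
print("posterior mean:", np.sum(l_grid * p_post) * dl)

The common factor exp(log_lik.max()) cancels between the numerator and the normalizing integral of (11.4), which is why it can be dropped.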
The conditional pdf of the observed data sample X, under the condition that the estimated parameter takes a value l, has the following form:
p(X|l) = p(x1, x2,…, xn|l),    (11.5)
and can be considered as a function of l; it is called the likelihood function. For a fixed sample X, this function shows which possible values of the parameter l are more likely than others.
The likelihood function plays a very important role in the solution of signal detection problems, especially in radar systems. However, in a number of applications, it is worthwhile to consider the likelihood ratio instead of the likelihood function:
Λ(l) = p(x1, x2,…, xn|l) / p(x1, x2,…, xn|l_fix),    (11.6)
where p(x1, x2,…, xn|l_fix) is the pdf of the observed data sample at some fixed value l_fix of the estimated random process parameter. As applied to the analysis of a continuous realization x(t) within the limits of the interval [0, T], we introduce the likelihood functional in the following form:
Λ̂(l) = lim_{n→∞} p(x1, x2,…, xn|l) / p(x1, x2,…, xn|l_fix),    (11.7)
where the interval between samples is defined as

Δt = T/n.    (11.8)
Using the introduced notation, the a posteriori pdf takes the following form:

p_post(l) = κ p_prior(l) Λ(l),    (11.9)
where κ is a normalizing coefficient independent of the current value of the estimated parameter l:
κ = 1 / ∫ p_prior(l) Λ(l) dl.    (11.10)
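The following sketch (same assumed Gaussian setup as above; l_fix and the uniform prior are arbitrary illustrative choices) checks numerically that the posterior (11.9) built from the likelihood ratio (11.6) and the coefficient (11.10) integrates to one:

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(1.5, 1.0, size=50)                # observed sample, assumed Gaussian
l_grid = np.linspace(-2.0, 5.0, 1001)
dl = l_grid[1] - l_grid[0]

log_lik = np.array([-0.5 * np.sum((x - l) ** 2) for l in l_grid])
log_LR = log_lik - log_lik[0]                    # ln Λ(l) with l_fix = l_grid[0]

p_prior = np.ones_like(l_grid) / (l_grid[-1] - l_grid[0])   # assumed uniform prior

# Rescale Λ(l) by its maximum for numerical stability; the constant cancels
# between (11.9) and (11.10), leaving p_post unchanged.
w = np.exp(log_LR - log_LR.max())
kappa = 1.0 / (np.sum(p_prior * w) * dl)         # eq. (11.10), up to that constant
p_post = kappa * p_prior * w                     # eq. (11.9)
print("posterior integrates to:", np.sum(p_post) * dl)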
We need to note that the a posteriori pdf p_post(l) of the estimated parameter l and the likelihood ratio Λ(l) are random functions depending on the received realization x(t).
In the theory of statistical parameter estimation, two types of estimates are used:
• Interval estimates, based on the definition of a confidence interval
• Point estimates, that is, estimates defined at a point
Employing interval estimation, we need to indicate the interval within the limits of which the true value of the unknown random process parameter lies with a probability not less than a prescribed value. This prescribed probability is called the confidence coefficient, and the indicated interval of possible values of the estimated random process parameter is called the confidence interval. The upper and lower bounds of the confidence interval are called the confidence limits; both the limits and the interval itself are functions of the received realization x(t), whether it is subjected to digital signal processing (a discretization) or to analog signal processing (as a continuous function). In the point estimation case, we assign to the unknown parameter one value from the interval of possible parameter values; that is, some value is obtained based on the analysis of the received realization x(t), and we use this value as the true value of the estimated parameter.
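As a sketch of an interval estimate (assuming Gaussian observations with known sigma; the 95% level and all names are illustrative), the confidence interval for the mean can be built from the two-sided Gaussian quantile:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
sigma, n = 1.0, 100
x = rng.normal(1.5, sigma, size=n)               # assumed observed samples

conf = 0.95                                      # confidence coefficient
z = norm.ppf(0.5 * (1 + conf))                   # two-sided Gaussian quantile
gamma = x.mean()                                 # point estimate (sample mean)
half = z * sigma / np.sqrt(n)                    # half-length of the interval
print(f"point estimate {gamma:.3f}, {conf:.0%} confidence interval "
      f"[{gamma - half:.3f}, {gamma + half:.3f}]")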
In addition to the procedure of estimating the random process parameter from the received realization x(t), there is a sequential estimation method, which essentially involves sequential statistical analysis to estimate the random process parameter [2,3]. The basic idea
of sequential estimation is to define the time of analysis of the received realization x(t) within the limits of which we are able to obtain the parameter estimate with a prescribed reliability. In the case of a point estimate, the root-mean-square deviation of the estimate, or another convenient function characterizing the deviation of the estimate from the true value of the estimated random process parameter, can be considered as the measure of reliability. From the viewpoint of interval sequential estimation, the estimate reliability can be defined using the length of the confidence interval with a given confidence coefficient.
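A minimal sketch of the sequential idea (assumed setup: Gaussian samples with known sigma, reliability specified as a target confidence-interval length): observation continues until the interval is short enough.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
sigma, conf, target_len = 1.0, 0.95, 0.2         # prescribed reliability
z = norm.ppf(0.5 * (1 + conf))

samples = []
while True:
    samples.append(rng.normal(1.5, sigma))       # take the next sample of x(t)
    n = len(samples)
    ci_len = 2 * z * sigma / np.sqrt(n)          # known-sigma interval length
    if ci_len <= target_len:                     # stop once reliability is reached
        break
print(f"stopped after n={n} samples, estimate {np.mean(samples):.3f}")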
11.2 POINT ESTIMATE AND ITS PROPERTIES
Making a point estimation means that some number γ = γ[x(t)] from the interval of possible values of the estimated random process parameter l must correspond to each possible received realization x(t). This number γ = γ[x(t)] is called the point estimate. Owing to the random nature of the point estimate of a random process parameter, it is characterized by the conditional pdf p(γ|l). This is the most complete characteristic of the point estimate. The shape of this pdf defines the quality of the point estimate and, consequently, all its properties. For a given estimation rule γ = γ[x(t)], the conditional pdf p(γ|l) can be obtained from the pdf of the received realization x(t) based on the well-known transformations of pdfs [4]. We need to note that a direct determination of the pdf p(γ|l) is very difficult in many application problems. Because of this, if there are reasons to suppose that this pdf is unimodal and very close to symmetric, then the bias, dispersion, and variance of the estimate, which can be determined without direct definition of p(γ|l), are widely used as characteristics of the estimate γ.
In accordance with definitions, the bias, dispersion, and variance of estimate are defined as follows:
b(γ|l) = ⟨γ − l⟩ = ∫_X [γ(X) − l] p(X|l) dX;    (11.11)

D(γ|l) = ⟨(γ − l)²⟩ = ∫_X [γ(X) − l]² p(X|l) dX;    (11.12)

Var(γ|l) = ⟨(γ − ⟨γ⟩)²⟩ = ∫_X [γ(X) − ⟨γ⟩]² p(X|l) dX.    (11.13)
Here and further, ⟨·⟩ means averaging over realizations. The estimate obtained taking the a priori pdf into consideration is called the unconditional estimate. The unconditional estimates are obtained as a result of averaging (11.11) through (11.13) over the possible values of the variable l with the a priori pdf p_prior(l); that is, the unconditional bias, dispersion, and variance of the estimate are determined in the following form:
b(γ) = ∫ b(γ|l) p_prior(l) dl;    (11.14)

D(γ) = ∫ D(γ|l) p_prior(l) dl;    (11.15)

Var(γ) = ∫ Var(γ|l) p_prior(l) dl.    (11.16)
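Since D(γ|l) = Var(γ|l) + b²(γ|l) follows directly from the definitions, a Monte Carlo check is easy to run. The sketch below (the sample-mean estimator and Gaussian data are illustrative assumptions) replaces the integrals over p(X|l) in (11.11) through (11.13) with averaging over simulated realizations:

import numpy as np

rng = np.random.default_rng(4)
l_true, sigma, n, trials = 1.5, 1.0, 20, 200_000

# gamma(X) = sample mean, computed for many independent realizations of X
gammas = rng.normal(l_true, sigma, size=(trials, n)).mean(axis=1)

b   = np.mean(gammas - l_true)                   # bias (11.11)
D   = np.mean((gammas - l_true) ** 2)            # dispersion (11.12)
Var = np.mean((gammas - gammas.mean()) ** 2)     # variance (11.13)
print(f"b={b:.5f}  D={D:.5f}  Var={Var:.5f}  Var+b^2={Var + b**2:.5f}")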
Since the conditional and unconditional estimate characteristics have different notations, we will drop the term “conditional” when discussing a single type of characteristics.
The estimate of a random process parameter for which the conditional bias is equal to zero is called the conditionally unbiased estimate; that is, in this case, the mathematical expectation of the estimate coincides with the true value of the estimated parameter: ⟨γ⟩ = l. If the unconditional bias is equal to zero, then the estimate is the unconditionally unbiased estimate; that is, ⟨γ⟩ = l_prior, where l_prior is the a priori mathematical expectation of the estimated parameter. Evidently, if the estimate is conditionally unbiased, then we can be sure that the estimate is unconditionally unbiased. The inverse proposition, generally speaking, is not correct. In practice, the conditional unbiasedness often plays a very important role. During simultaneous estimation of several random process parameters, for example, estimation of the vector parameter l = {l1, l2,…, lμ}, we need to know the statistical relationship between the estimates in addition to the introduced conditional and unconditional bias, dispersion, and variance. For this purpose, we can use the mutual correlation function of the estimates.
If the estimates of the random process parameters l1, l2,…, lμ are denoted by γ1, γ2,…, γμ, then the conditional mutual correlation function of the estimates of the parameters li and lj is defined in the following form:
R_ij(γ|l) = ⟨(γ_i − ⟨γ_i⟩)(γ_j − ⟨γ_j⟩)⟩.    (11.17)
The correlation matrix is formed from the elements R_ij(γ|l); moreover, the diagonal elements of this matrix are the conditional variances of the estimates. By averaging the conditional mutual correlation function over the possible a priori values of the estimated random process parameters, we obtain the unconditional mutual correlation function of the estimates.
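As an empirical illustration of (11.17) (the two estimators, sample mean and sample standard deviation of assumed Gaussian data, are illustrative choices), the matrix of mutual correlations can be approximated by averaging over simulated realizations:

import numpy as np

rng = np.random.default_rng(5)
trials, n = 100_000, 30
x = rng.normal(1.5, 1.0, size=(trials, n))       # realizations of the sample

g1 = x.mean(axis=1)                              # estimate of the mean
g2 = x.std(axis=1, ddof=1)                       # estimate of the standard deviation

# Rows are the two estimates; np.cov averages the centered products, so the
# diagonal holds the variances and the off-diagonal the mutual correlation
# in the sense of (11.17).
R = np.cov(np.stack([g1, g2]))
print(R)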
There are several approaches to defining the properties of point estimates. We consider the following requirements on the properties of point estimates in the terminology of conditional characteristics:
• It is natural to try to define the point estimate γ so that the conditional pdf p(γ|l) is concentrated near the true value l.
• It is desirable that, as the observation interval increases, that is, T → ∞, the estimate coincides with or stochastically approaches the true value of the estimated random process parameter. In this case, we say that the estimate is consistent.
• The estimate must be unbiased, ⟨γ⟩ = l, or, in extreme cases, asymptotically unbiased, that is, lim_{T→∞} ⟨γ⟩ = l.
• The estimate must be the best by some criterion; for example, it must be characterized by minimal values of dispersion or variance at zero or constant bias.
• The estimate must be statistically sufficient.
A statistic, that is, in the considered case, a function or functions of the observed data, is sufficient if all statements about the estimated random process parameter can be made based on this statistic without any additional observation of the received realization data. Evidently, the a posteriori pdf is always a sufficient statistic. The condition of estimate sufficiency can be formulated in terms of the likelihood function: the necessary and sufficient condition is the possibility of presenting the likelihood function as the product of two functions [5,6]:
p(X|l) = h[x(t)] g(γ|l).    (11.18)
Here, h[x(t)] is an arbitrary function of the received realization x(t) independent of the current value of the estimated random process parameter l. Since the parameter l does not enter into the function
h[x(t)], we cannot use this function to obtain any information about the parameter l. The factor g(γ|l) depends on the received realization x(t) only through the estimate γ = γ[x(t)]. For this reason, all information about the estimated random process parameter l must be contained in γ[x(t)].
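As a standard worked example of (11.18) (not taken from this text): for n independent Gaussian samples with unknown mean l and known variance σ², the identity Σ(x_k − l)² = Σ(x_k − x̄)² + n(x̄ − l)² gives the factorization

p(X|l) = (2πσ²)^(−n/2) exp[−(1/2σ²) Σ_{k=1}^{n} (x_k − x̄)²] · exp[−n(x̄ − l)²/(2σ²)] = h[x(t)] g(γ|l),

with γ = x̄ = (1/n) Σ_{k=1}^{n} x_k; the first two factors form h[x(t)] and the last one forms g(γ|l), so the sample mean is a sufficient statistic for l.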
11.3 EFFECTIVE ESTIMATIONS
One of the main requirements is to obtain an estimate with minimal variance or minimal dispersion. Accordingly, the concept of effective estimates was introduced in mathematical statistics. As applied to biased estimates of a random process parameter, the estimate l_ef is considered effective if the mathematical expectation of its squared deviation from the true value of the estimated random process parameter l does not exceed the mathematical expectation of the squared deviation of any other estimate γ; in other words, the following condition
D_ef(l) = ⟨(l_ef − l)²⟩ ≤ ⟨(γ − l)²⟩    (11.19)

must be satisfied. The dispersion of an unbiased estimate coincides with its variance; consequently, the effective unbiased estimate is defined as the estimate with the minimal variance.
The Cramér–Rao lower bound [5] defines the minimal conditional variance and dispersion of estimates, that is, the variance and dispersion of effective estimates, under the condition that the latter exist for the given random process parameters. Thus, in particular, the biased estimate variance satisfies
Var(γ|l) ≥ [1 + db(γ|l)/dl]² / ⟨[d ln Λ(l)/dl]²⟩.    (11.20)
For unbiased estimates and estimates with a constant bias, the bound simplifies and takes the following form:
Var(γ|l) ≥ 1 / ⟨[d ln Λ(l)/dl]²⟩.    (11.21)
We need to note that the averaging is carried out over the multidimensional sample X of the observed data in the case of digital signal processing, or over all possible realizations x(t) in the case of analog signal processing, and the derivatives are taken at the point where the estimated random process parameter takes its true value. Equality in (11.20) and (11.21) takes place only for effective estimates and only if two conditions are satisfied. The first condition is that the estimate is sufficient, that is, (11.18) holds. The second condition is the following: the derivative of the logarithm of the likelihood function or likelihood ratio must satisfy the equality [5]
d ln Λ(l)/dl = q(l)(γ − ⟨γ⟩),    (11.22)
where the function q(l) does not depend on the estimate γ or on the sample of observed data but depends on the current value of the estimated random process parameter l. The condition (11.22) can hold only if the estimate is sufficient, that is, only if the condition (11.18) is satisfied; at the same time, the sufficiency condition can hold even when (11.22) is not satisfied. Analogous statements apply to effective unbiased estimates, for which the inequality sign in (11.21) becomes an equality sign.
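A quick numerical check of (11.21) and (11.22) under an assumed Gaussian-mean model (all names illustrative): here the derivative of the log-likelihood equals (n/σ²)(x̄ − l), which has the form (11.22) with q(l) = n/σ², so the sample mean attains the bound.

import numpy as np

rng = np.random.default_rng(6)
l_true, sigma, n, trials = 1.5, 1.0, 25, 200_000

x = rng.normal(l_true, sigma, size=(trials, n))  # simulated realizations
gammas = x.mean(axis=1)                          # sample-mean estimates

# d ln p(X|l)/dl = sum(x_k - l)/sigma^2; its mean square at the true l is
# the denominator of (11.21) (the Fisher information, ~ n/sigma^2)
score = np.sum(x - l_true, axis=1) / sigma**2
fisher = np.mean(score ** 2)

print(f"Var(gamma) = {gammas.var():.5f}, bound 1/I = {1.0 / fisher:.5f}")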
In general, there may be a constant factor on the right-hand side of the loss functions given by (11.23) through (11.24). These functions are symmetric functions of the difference |γ − l|; with such losses, deviations of the parameter estimate from the true value of the estimated random process parameter are equally undesirable regardless of their sign. In addition, there are some application problems in which the observer cannot ignore the sign of the estimation error; in this case, the loss function is not symmetric.
Because of the random nature of the estimate γ and the random process parameter l, the losses are random for any decision-making rule and cannot by themselves be used to characterize the estimate quality. To characterize the estimate quality, we can apply the mathematical expectation of the loss function, which takes into consideration all incorrect solutions and the relative frequency of their appearance. The choice of the mathematical expectation, rather than another statistical characteristic, to characterize the quality of the estimate is rational but somewhat arbitrary. The mathematical expectation (conditional or unconditional) of the loss function is called the risk (conditional or unconditional). The conditional risk is obtained by averaging the loss function over all possible values of the multidimensional sample of the observed data, which are characterized by the conditional pdf p(X|l):
R(γ|l) = ∫_X L(γ, l) p(X|l) dX,    (11.27)

where L(γ, l) denotes the loss function.
As we can see from (11.27), the preferable estimates are those with minimal conditional risk. However, at various values of the estimated random process parameter l, the conditional risk takes different values; for this reason, the preferable decision-making rules can be various, too. Thus, if we know the a priori pdf of the estimated parameter, then it is worthwhile to define the best decision-making rule from the condition of minimum unconditional average risk, which can be written in the following form:
R(γ) = ∫_X p(X) [∫ L(γ, l) p_post(l) dl] dX,    (11.28)
where p(X) is the pdf of the observed data sample.
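The sketch below (uniform prior, quadratic loss, and the Gaussian data model are illustrative assumptions) finds the estimate minimizing the inner integral of (11.28) on a grid; for quadratic loss the minimizer is the posterior mean:

import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(1.5, 1.0, size=50)                # assumed Gaussian observations

l_grid = np.linspace(-2.0, 5.0, 801)
dl = l_grid[1] - l_grid[0]
log_lik = np.array([-0.5 * np.sum((x - l) ** 2) for l in l_grid])
p_post = np.exp(log_lik - log_lik.max())         # uniform prior assumed
p_post /= np.sum(p_post) * dl

# Posterior expected loss for each candidate estimate gamma (quadratic loss)
exp_loss = [np.sum((g - l_grid) ** 2 * p_post) * dl for g in l_grid]
gamma_bayes = l_grid[int(np.argmin(exp_loss))]
print(f"Bayes estimate {gamma_bayes:.3f}, "
      f"posterior mean {np.sum(l_grid * p_post) * dl:.3f}")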
Estimates obtained by the criterion of minimum conditional and unconditional average risk are called the conditional and unconditional Bayes estimates, respectively. The unconditional Bayes estimate is often called simply the Bayes estimate. Furthermore, by the Bayes estimate γ_m of the parameter l we will understand the estimate ensuring the minimum average risk at the given loss function L(γ, l). The minimal value of the average risk corresponding to the Bayes estimate is called the Bayes risk:
R_m = ⟨∫ L(γ_m, l) p_post(l) dl⟩.    (11.29)
Here, the averaging is carried out over samples of the observed data X (digital signal processing) or over realizations x(t) (analog signal processing). The average risk can be determined for any given decision-making rule and, by the definition of the Bayes estimate, the following condition is always satisfied:
R_m ≤ R(γ).    (11.30)
Computing the average risk by (11.28) for different estimates and comparing each of these risks with the Bayes risk, we can evaluate how much better one estimate is compared to another.