
Table 11.9: Mean percentage improvement in segmentation obtained when the mammogram is enhanced using the predicted enhancement method from each strategy, compared with segmenting the unenhanced image, for all breast types

| Strategy | Mean TP | Mean SUBTP | Total |
|---|---|---|---|
| Target expert | 1.00 | 2.00 | 1.20 |
| FUZZY expert | 0.11 | 4.25 | 0.79 |
| DNM | 0.08 | 1.16 | 0.24 |
| (A) BPM FBP26 | 0.20 | 3.75 | 0.78 |
| (B) BPM FBP316 | 0.13 | 3.66 | 0.72 |
| Types 1–3 (A); Type 4 (B) | 0.29 | 3.88 | 0.88 |
of the target optimal values from Table 11.4. Additionally, the table shows the result obtained by applying the FUZZY method to all images (given in Table 11.5, part (c)) over all four breast types. The last row of Table 11.9 shows the result of using the prediction from the BPM strategy with feature set FBP26 on breast types 1–3 and feature set FBP316 on type 4. From these results the following key observations are made:
1. Utility of contrast enhancement: Of the complete dataset of mammograms, 75% showed improved sensitivity following application of the expert contrast enhancement, compared with the unenhanced original images.
2. Target experts: Figure 11.5 highlighted that, given a set of contrast enhancement methods, different methods can be identified as target enhancement experts for different mammograms. This observation is the motivation for learning the optimal expert.
3. Characterizing a mammogram: Reviewing the results in Table 11.9, it can be seen that the DNM strategy performs poorly because it relies on characterizing a mammogram by a suspicious ROI. In contrast, the BPM strategy utilizes an image feature vector extracted from the breast, comprising an extensive set of features, and performs better.
4. The superior BPM approach: The modified BPM strategy based on breast type yields greater performance than simply using the FUZZY method. The result remains inferior to the target contrast enhancement baseline, indicating that learning the expert enhancement is a nontrivial problem. In implementing the modified BPM strategy, a mechanism for predicting the breast type is required.
5. Use of mammogram grouping knowledge: The BPM approach has been developed to utilize a priori knowledge describing the mammogram grouping, which indicates the mammographic breast density type. This knowledge determines the feature extraction method to be used, either FBP26 for breast types 1–3 or FBP316 for type 4. In the experimental results presented above, the target breast type was used.
11.4 Image Segmentation Layer
The image segmentation layer applies a number of image segmentation schemes and then adopts a mixture-of-experts model. In other words, on a per-pixel basis, a number of segmentation experts make classification decisions that are fused together. The decisions can be fused either with standard combination rules or with an adaptive scheme that determines appropriate combination weights from image properties. Our approach is based on the use of parametric models of image segmentation.
Gaussian mixture models (GMMs) have recently gained considerable prominence in the image segmentation literature, since a vast range of training data is available from which a priori information can be gathered. One of their key strengths is that such statistical models are underpinned by well-founded probability and information theory. Such approaches can be used in supervised or unsupervised modes. In addition, the output of such models is an a posteriori probability estimate, which can be used to tune the model to operate at a given point on the ROC curve. By expressing the result as an a posteriori probability, the outputs of several experts can also be combined within a unified framework. Finally, postprocessing of images is cheaper with statistical methods, since only those regions that contain suspicious pixels need further examination, as opposed to a region-based approach where all regions must be considered.
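As an illustration of operating-point selection, the posterior for the abnormal class can be thresholded instead of taking the arg max; sweeping the threshold traces out the ROC curve. This is a minimal sketch under our own naming (`posterior_abnormal`, the threshold value), not code from the chapter:

```python
import numpy as np

def segment_at_operating_point(posterior_abnormal: np.ndarray,
                               threshold: float = 0.3) -> np.ndarray:
    """Label a pixel abnormal when p(abnormal | x_n) exceeds the threshold.

    posterior_abnormal: H x W array of a posteriori probabilities from a GMM.
    Lowering the threshold raises sensitivity at the cost of false positives.
    """
    return (posterior_abnormal >= threshold).astype(np.uint8)
```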
The GMM approach does not consider the spatial arrangement of class labels in an image, which can be usefully exploited, for example through relaxation labeling [28]. Markov random fields (MRFs) have been shown to be a powerful class of techniques [29–31] for modeling the spatial arrangement of class labels. MRFs can be expressed within a probabilistic framework and combined with a statistical observation model of the mammogram. An MRF can increase the homogeneity of the formed regions, which leads to a reduction in false positives.
In this study we propose a weighted Gaussian mixture model (WGMM) for both supervised (WGMM_S) and unsupervised (WGMM_U) data analysis. A set of GMMs is constructed, each modeling a particular class distribution, and the set can be combined into a single unconditional density. We combine the WGMM with a hidden MRF model and propose two further approaches for the supervised (WGMMMRF_S) and unsupervised (WGMMMRF_U) modes. The four models or experts (WGMM_S, WGMM_U, WGMMMRF_S, and WGMMMRF_U) each produce a label for the test pixel. We use a number of different features, each forming the basis of a different expert and relying on one of the above four models for segmentation. The expert outputs can be combined using well-known expert combination methods. In this chapter we propose an adaptive weighted model (AWM) for the combination of the four experts and show that this new method of combination outperforms other popular methods.
11.4.1 Weighted Gaussian Mixture Models
A gray-scale image is represented as a 1-D array $X = \{x_1, x_2, \ldots, x_N\}$, where $x_n$ is an input feature for pixel $n$ and $N$ is the total number of pixels in the image. The input feature $x_n$ may be a $D$-dimensional vector or simply the gray-scale value of pixel $n$. Let the underlying true segmentation of the image be denoted as $Y = \{y_1, y_2, \ldots, y_N\}$. It is assumed that the number of classes is predetermined as a set of known class labels $\omega_l$, $l \in \{1, \ldots, L\}$, and therefore the class label of pixel $n$ is indicated as $y_n \in \{\omega_l\}_{l=1}^{L}$. A common assumption in modeling a density with a GMM for image segmentation is that each component $m$, $m \in \{1, \ldots, M\}$, models the pdf of one class, so that $M = L$. Let $\hat{y}_n$ represent the estimate of the segmentation. Each component is weighted by a coefficient $\gamma_{mn}$ that indicates the relationship of pixel $x_n$ to the class label $\omega_l$ modeled by component $m$. To ensure that the parameters of each component density are learnt correctly, the weight $\gamma_{mn}$ is set to indicate the class to which
data point $x_n$ belongs:

$$\gamma_{mn} = \begin{cases} 1 & \text{if } y_n = \omega_m \\ 0 & \text{otherwise} \end{cases}$$
If $\gamma_{mn} = 1$, then data point $x_n$ is considered only when setting the parameters of the class $\omega_l$ modeled by component $m$. Using the labelled training data, a maximum likelihood (ML) estimate of all component parameters and mixing coefficients can be found.
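Concretely, with the hard 0/1 weights above, the ML estimates reduce to per-class sample statistics. The following numpy sketch (our variable names, one Gaussian component per class) illustrates the supervised fit:

```python
import numpy as np

def fit_supervised_components(X, y, n_classes):
    """ML fit when gamma_mn is a 0/1 indicator: component m is estimated
    only from the training pixels labelled with class m."""
    N = len(X)
    priors, means, covs = [], [], []
    for m in range(n_classes):
        Xm = X[y == m]                         # pixels with gamma_mn = 1
        priors.append(len(Xm) / N)             # mixing coefficient
        means.append(Xm.mean(axis=0))          # component mean
        covs.append(np.cov(Xm, rowvar=False))  # component covariance
    return np.array(priors), np.array(means), np.array(covs)
```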
We first describe the two modes of test image segmentation, supervised and unsupervised, in section 11.4.2. We then detail our weighted GMM/MRF models in section 11.4.3.
11.4.2 Supervised and Unsupervised Test Image Segmentation
A test image to be segmented is represented in the same way as the training image, by a 1-D array $X$. For the test image, a 1-D array $\hat{Y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N\}$ is the estimate of the segmentation. We can now adopt one of two strategies for test image segmentation.
1. Supervised segmentation with GMM: Using the ML estimates of the parameter values obtained from the training images, a segmentation of the test image is performed. This is achieved by substituting the model parameters $\theta$ learnt during training when performing testing. The image is segmented by setting the class label estimate $\hat{y}_n$ of pixel $x_n$ to the one with the maximum estimate of the component-conditional probability:

$$\hat{y}_n = \arg\max_{m=1}^{M}\; p(y_n = m \mid x_n, \theta_m)$$
2. Unsupervised segmentation with GMM: This alternative approach assumes no a priori knowledge except for the number of classes in the image, which corresponds to the number of components in the GMM, $L = M$. Therefore the weight $\gamma_{mn} = 1$ for all pixels, indicating that every sample is considered as being generated from the distribution. Using the GMM-EM algorithm, an ML estimate of the parameter values is found. The segmentation can then be estimated from the GMM by extracting the component-conditional probabilities using the Bayes rule (see the sketch below).
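Both strategies can be sketched in a few lines; the supervised path applies the Bayes rule with parameters learnt from labelled training pixels, while the unsupervised path runs EM on the test image itself. This is an illustrative sketch using scipy and scikit-learn, not the authors' implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def segment_supervised(X_test, priors, means, covs):
    """Strategy 1: Bayes rule with parameters fixed from training."""
    post = np.stack([p * multivariate_normal.pdf(X_test, mean=mu, cov=cov)
                     for p, mu, cov in zip(priors, means, covs)], axis=1)
    return post.argmax(axis=1)     # y_hat_n = most probable component

def segment_unsupervised(X_test, n_classes):
    """Strategy 2: EM on the test image; only L = M is assumed known."""
    gmm = GaussianMixture(n_components=n_classes).fit(X_test)
    return gmm.predict(X_test)     # arg max of component posteriors
```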
11.4.3 A Weighted GMM/MRF Model of Segmentation
A finite mixture model (FMM) [23, 27, 32] is defined as a linear combination of $M$ component-conditional densities $f(x \mid m, \theta_m)$, $m \in \{1, \ldots, M\}$, and $M$ mixing coefficients $f(m)$, of the form

$$f(x) = \sum_{m=1}^{M} f(m)\, f(x \mid m, \theta_m) \qquad (11.14)$$
such that the mixing coefficients $f(m)$ satisfy the constraints $\sum_{m=1}^{M} f(m) = 1$ and $0 \le f(m) \le 1$.
The WGMM framework comprises $L$ class densities, $l \in \{1, \ldots, L\}$, each modeled independently using a GMM of the form given in Eq. (11.14), and a set of mixing coefficients $p(\omega_l)$:

$$p(x) = \sum_{l=1}^{L} p(\omega_l)\, p(x \mid \omega_l, \Theta_l) \qquad (11.15)$$

The $l$th GMM estimates the class-conditional pdf $p(x \mid \omega_l, \Theta_l)$, itself another mixture model, for each data point for each class $\{\omega_l\}_{l=1}^{L}$. The vector $\Theta_l$ collects the $M$ component Gaussian parameters of the $l$th GMM, $\Theta_l = \{P_l(m), \mu_{lm}, \Sigma_{lm}\}$, $m \in \{1, \ldots, M\}$. Each estimate of the class-conditional pdf is mixed to model the overall unconditional density $p(x)$ using a mixing coefficient $p(\omega_l)$, identifying the contribution of the $l$th class density to the unconditional pdf.
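Equation (11.15) is thus a mixture of mixtures. A direct evaluation of the unconditional density, written to mirror the notation, is sketched below (`class_gmms` is our hypothetical container of per-class parameters, not a structure from the chapter):

```python
from scipy.stats import multivariate_normal

def wgmm_density(x, class_priors, class_gmms):
    """p(x) = sum_l p(omega_l) * sum_m P_l(m) * N(x; mu_lm, Sigma_lm)."""
    total = 0.0
    for p_wl, (weights, means, covs) in zip(class_priors, class_gmms):
        class_pdf = sum(w * multivariate_normal.pdf(x, mean=mu, cov=cov)
                        for w, mu, cov in zip(weights, means, covs))
        total += p_wl * class_pdf   # mix the class-conditional pdfs
    return total
```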
If it is assumed that the complete dataset $X \equiv \{x_1, \ldots, x_N\}$ of points $x_n$ is drawn independently from the distribution $f(x \mid \theta)$, then the joint occurrence of the whole dataset can be conveniently expressed as the log-likelihood:
$$\log \mathcal{L}(\Theta) = \sum_{n=1}^{N} \log p(x_n \mid \Theta) = \sum_{n=1}^{N} \log \sum_{l=1}^{L} \gamma_{nl}\, p(\omega_l)\, p(x_n \mid \omega_l, \Theta_l) \qquad (11.16)$$
Using a modified version of the expectation-maximization (EM) algorithm, as described below, we derive an ML estimate of the parameter values of each of the $L$ GMMs $\{\Theta_l\}_{l=1}^{L}$.
The general framework for parameter estimation in a GMM can be used to learn the parameters of the WGMM. Here the component-conditional densities appearing in Eq. (11.15) are themselves mixture models. In the EM algorithm, the update equations for the mixing coefficients do not depend on the functional particulars of the component densities. Hence, the mixing coefficients of the WGMM are updated according to
$$P^{\text{new}}(\omega_l) = \frac{1}{N} \sum_{n=1}^{N} p^{\text{old}}(\omega_l \mid x_n, \Theta_l^{\text{old}}) \qquad (11.17)$$
The M-step involves maximizing the auxiliary function with respect to the parameters $\{\Theta_l\}_{l=1}^{L}$. The auxiliary function can be written as
$$Q(\Theta^{\text{new}}, \Theta^{\text{old}}) = \sum_{n=1}^{N} \sum_{l=1}^{L} p^{\text{old}}(\omega_l \mid x_n, \Theta_l^{\text{old}}) \log\left[P^{\text{new}}(\omega_l)\, p^{\text{new}}(x_n \mid \omega_l, \Theta_l^{\text{new}})\right] \qquad (11.18)$$

where

$$p^{\text{new}}(x_n \mid \omega_l, \Theta_l^{\text{new}}) = \sum_{m=1}^{M} p^{\text{new}}(m_l)\, p^{\text{new}}(x_n \mid m_l, \theta_{ml}^{\text{new}}) \qquad (11.19)$$
Writing $\gamma_{nl} = p^{\text{old}}(\omega_l \mid x_n, \Theta_l^{\text{old}})$, the auxiliary function can be written as the sum of $L$ auxiliary functions, one for each mixture model:
$$Q(\Theta^{\text{new}}, \Theta^{\text{old}}) = \sum_{n=1}^{N} \sum_{l=1}^{L} \gamma_{nl} \log\left[P^{\text{new}}(\omega_l)\, p^{\text{new}}(x_n \mid \omega_l, \Theta_l^{\text{new}})\right] \qquad (11.20)$$

$$Q(\Theta^{\text{new}}, \Theta^{\text{old}}) = \sum_{l=1}^{L} \hat{Q}_l(\Theta_l^{\text{new}}, \Theta_l^{\text{old}}) \qquad (11.21)$$

where

$$\gamma_{nl} = p(\omega_l \mid x_n, \Theta_l) = \frac{p(x_n \mid \omega_l, \Theta_l)\, P(\omega_l)}{\sum_{j=1}^{L} p(x_n \mid \omega_j, \Theta_j)\, P(\omega_j)} \qquad (11.22)$$

and

$$\hat{Q}_l(\Theta_l^{\text{new}}, \Theta_l^{\text{old}}) = \sum_{n=1}^{N} \gamma_{nl} \log\left[P^{\text{new}}(\omega_l)\, p^{\text{new}}(x_n \mid \omega_l, \Theta_l^{\text{new}})\right] \qquad (11.23)$$
The procedure for maximizing the overall likelihood of a WGMM is outlined in Algorithm 1. It consists of an outer EM loop in which $L$ inner EM loops are nested. Each time the outer loop is traversed, the mixing weights $p(\omega_l)$ are updated according to Eq. (11.17), and the $L$ inner loops are iterated to update the mixing weights $P_l(m)$, means $\mu_{lm}$, and covariances $\Sigma_{lm}$ of each of the components. It should be noted that it is not necessary to iterate the inner loops to convergence on each outer EM step, since it is only necessary to increase the auxiliary function at each step to guarantee an increase in the overall likelihood.
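The nested loop structure can be summarized in pseudocode-style Python. The helpers `e_step` (Eq. 11.22) and `update_class_gmm` (inner EM update of $P_l(m)$, $\mu_{lm}$, $\Sigma_{lm}$) are hypothetical placeholders; this sketch fixes only the loop structure and is not the chapter's Algorithm 1 verbatim:

```python
def wgmm_em(X, class_priors, class_gmms, n_outer=20, n_inner=1):
    """Outer EM loop updates p(omega_l); L nested inner EM loops refine
    each class's GMM. One inner iteration per outer step suffices
    (a generalized EM step): Q need only increase, not be maximized."""
    for _ in range(n_outer):
        gamma = e_step(X, class_priors, class_gmms)  # gamma_nl, Eq. (11.22)
        class_priors = gamma.mean(axis=0)            # Eq. (11.17)
        for l in range(len(class_gmms)):             # the L inner loops
            for _ in range(n_inner):
                class_gmms[l] = update_class_gmm(X, gamma[:, l],
                                                 class_gmms[l])
    return class_priors, class_gmms
```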
The update equations for the means and covariances in the GMM-EM algorithm remain unchanged. The MRF-MAP estimate is combined in the conditional density function $p^{\text{old}}(\omega_l \mid x_n, \Theta_l^{\text{old}})$ as
$$\gamma_{nl} = p(\omega_l \mid x_n, \Theta_l) = \frac{p(x_n \mid \omega_l, \Theta_l)\, p(y_n = \omega_l \mid \mathcal{N}_n)}{\sum_{j=1}^{M} p(x_n \mid \omega_j, \Theta_j)\, p(y_n = \omega_j \mid \mathcal{N}_n)} \qquad (11.25)$$

where $\mathcal{N}_n$ denotes the neighborhood of pixel $n$.
The WGMMMRF-EM algorithm is used to determine the ML estimates of the parameter values by iterating the WGMM-EM algorithm while constraining the density estimation with the hidden MRF model. For supervised learning, the labelled training data is used to initialize the WGMM and WGMMMRF models, giving WGMM_S and WGMMMRF_S; no training data is used in the unsupervised case, giving WGMM_U and WGMMMRF_U.
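As one concrete realization of the spatial prior in Eq. (11.25), a Potts-style model can score each label by how many 4-connected neighbours currently carry it. The sketch below is our construction (the smoothness parameter `beta` is illustrative), not necessarily the clique potentials used by the authors:

```python
import numpy as np
from scipy.ndimage import convolve

def mrf_label_prior(labels, n_classes, beta=1.5):
    """Potts-style p(y_n = omega_l | neighbourhood), proportional to
    exp(beta * count of 4-connected neighbours labelled l)."""
    kernel = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]], dtype=float)
    counts = np.stack([convolve((labels == l).astype(float), kernel,
                                mode="constant") for l in range(n_classes)],
                      axis=-1)
    prior = np.exp(beta * counts)
    return prior / prior.sum(axis=-1, keepdims=True)  # normalize per pixel

# In Eq. (11.25) this prior multiplies the class-conditional likelihoods,
# and the products are renormalized per pixel to give gamma_nl.
```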
11.4.4 Combination of Image Segmentation Experts
In the previous section we developed four new models of image segmentation and mentioned the use of different experts based on different texture features that rely on them. It is beneficial to fuse the decisions of the different experts on a per-pixel basis. In this section we first detail the conventional strategy of classifier decision combination, the "ensemble-based combination rules," and then propose a novel strategy for combining expert outputs, the "adaptive weighted model (AWM)." We begin by describing a generic framework of combination, and then discuss the combination strategies within that framework.
11.4.4.1 Expert Combination Framework and Nomenclature
The image to be segmented can be represented as a 1-D array $X = \{x_1, \ldots, x_N\}$, where $x_n$ is an input feature for pixel $n$ and $N$ is the total number of pixels in the image. Let the estimate of the segmentation be denoted by the array $\hat{Y} = \{\hat{y}_1, \ldots, \hat{y}_N\}$. It is assumed that the number of classes is predetermined from a set of known class labels $\omega_l$, $l \in \{1, \ldots, L\}$, and therefore the estimated class label of pixel $n$ is indicated as $\hat{y}_n = \omega_l$.
We assume that there are $R$ image segmentation experts, where the $r$th expert provides a segmentation decision for a given pixel feature $x_n$ from a set of learnt parameter vectors $\theta_r$. Using a WGMM expert, the parameter vector $\theta_r$ of each expert is defined as a set of component mixing coefficients $P_l(m)$, means $\mu_{lm}$, and covariances $\Sigma_{lm}$.
Majority voting (MV):

$$p_{\text{MV}}(\hat{y}_n = \omega_l \mid x_n, \theta_1, \ldots, \theta_R) = \frac{1}{R} \sum_{r=1}^{R} \Delta_{lr}$$

where

$$\Delta_{lr} = \begin{cases} 1 & \text{if } p(\hat{y}_n = \omega_l \mid x_n, \theta_r) = \max_{j=1,\ldots,L} p(\hat{y}_n = \omega_j \mid x_n, \theta_r) \\ 0 & \text{otherwise} \end{cases}$$

that is, expert $r$ casts a vote $\Delta_{lr} = 1$ for the class it finds most probable at pixel $n$.
The above combination rules have been used in several studies and form the basis of our baseline comparison.
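For reference, with the expert posteriors stacked into an array of shape R × N × L (experts × pixels × classes), the baseline rules reduce to a few array operations; the following sketch assumes that layout (our convention, not the chapter's):

```python
import numpy as np

def mean_rule(posteriors):
    """Sum/mean rule: average the R expert posteriors, then arg max."""
    return posteriors.mean(axis=0).argmax(axis=-1)

def majority_vote(posteriors):
    """Each expert votes for its most probable class per pixel."""
    votes = posteriors.argmax(axis=-1)            # R x N expert decisions
    n_classes = posteriors.shape[-1]
    counts = np.stack([(votes == l).sum(axis=0)   # votes per class
                       for l in range(n_classes)], axis=-1)
    return counts.argmax(axis=-1)                 # ties -> lowest label
```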
11.4.4.3 Adaptive Weighted Model (AWM) Classifier Combination
In our proposed approach, the expert decisions are modeled as a probability density function. In a linear opinion pool of $R$ experts, assume that the $r$th segmentation expert provides an estimate of the a posteriori probability:
$$p(\hat{y}_n \mid r, x_n) = p(\hat{y}_n \mid x_n, \theta_r), \quad n = 1, \ldots, N \qquad (11.28)$$
We assume that accompanying this pdf is a linear weight or mixing coefficient $p(r)$, indicating the contribution of the $r$th expert to the joint pdf $p(\hat{y} \mid x, \Lambda)$ resulting from the combination of experts. The vector $\Lambda$ is the complete set of parameters describing the combined pdf. Hence, following the expert combination, the complete pdf can be written as
$$p(\hat{y}_n \mid x_n, \Lambda) = \sum_{r=1}^{R} p(r)\, p(\hat{y}_n \mid r, x_n) \qquad (11.29)$$
given that the mixing coefficients satisfy the constraints $\sum_{r=1}^{R} p(r) = 1$ and $0 \le p(r) \le 1$. If we treat the weighted contribution of each expert to the unconditional distribution as a probability, then statistical models such as the mixture of experts (MOE) framework [34] can be trained to learn the individual classifier and weight contribution distributions. For this we propose using a GMM trained with the EM algorithm. We now present a method for identifying the weights in a probabilistic manner, motivated by the MOE framework. Our proposed approach differs from the conventional MOE method in two ways: (i) first, the a posteriori pdf from each segmentation expert remains fixed, having been generated during segmentation; (ii) second, the mixing coefficients for