Abstract-In radiotherapy using 18-fluorodeoxyglucose
positron emission tomography (18F-FDG-PET), the accurate delineation of the
biological tumour volume (BTV) is a crucial step. In this study, new approach
to segment the BTV in F-FDG-PET images is suggested. The technique is based on
the k-means clustering algorithm incorporating automatic optimal cluster number
estimation, using intrinsic positron emission tomography image information. Partitioning
data into a finite number of k homogenous and separate clusters (groups)
without use of prior knowledge is carried out by some unsupervised partitioning
algorithm like the k-means clustering algorithm. To evaluate these resultant
clusters for finding optimal number of clusters, properties such as cluster
density, size, shape and separability are typically examined by some cluster
validation methods. Mainly the aim of clustering analysis is to find the
overall compactness of the clustering solution, for example variance within
cluster should be a minimum and separation between the clusters should be a



are several approaches to validate the segmentation techniques such as phantom
studies and the macroscopic surgical specimen obtained from histology. The use
of macroscopic samples for validation of segmentation techniques in positron
emission tomography (PET) images is one of the most promising approaches
reported so far in clinical studies, the procedure consists of the comparison
of the tumour volumes defined on the PET data with actual tumour volumes
measured on the macroscopic samples recorded from histology (where PET was
performed prior to surgery). Segmentation using the cluster –based algorithms
is very popular, but the main problem in this case is the determination of the
optimal and desired number of clusters. In this, we have implemented an
approach based on k-means algorithm with an automatic estimation12 of the
optimal number of clusters, based on the maximum intensity ratio in a given
volume of interest (VOI).



the VE for a range of k (2–50
clusters), and the optimal number which corresponds to the minimum of SVEs. This method gives
good results but consumes a significant computation time by performing the
clustering for a large range of cluster values before selecting the optimal
number of clusters. Several approaches have been proposed in the literature to
identify the optimal cluster number to better fit the data, three of them are

the results are not promising because they are not adapted to PET image
segmentation. So our goal in this study is to improve the k-means clustering
method, by incorporating an automatic determination of the optimal number of
clusters using a new criterion based on PET image features.

analysing the variation of the maximum activity (intensity) of the uptake


) in
the VOI by scanning all slices, we conclude for all patients that the maximum
intensity value decreases from middle to frontier slices, and the maximum
intensity is often situated almost at the centre of the BTV. The optimal
cluster number has a minimum value at the centre of the BTV, and increases from
central to frontier slices. This correlation between the optimal number of
clusters and the maximum intensity motivates our choice of the following slice
image feature:





 is the maximum activity
(intensity) of the uptake

 in the corresponding slice,

 is the maximum activity
in all slices that encompasses tumour volume BTV inside the

, and

 is the difference between the maximum and the
minimum values of (Imax (slice)/Imax (VOI)) in the



, the new criterion

, has a maximum value for the middle
slices and decreases for the frontiers of the BTV. Note that the r values
range from ‘0’ to ‘1’ for all patients.


Modelling: This section is dedicated to finding a
relationship between the optimal number of clusters k, and the new criterion r.
This relationship could be used to determine the optimal cluster number for the
segmentation of new PET images using only the new slice image feature r.
After analysing the variation of k in function of r criterion
(for all patients included in this study), we use two fitting models: an
exponential and a power function given by (a) and (b), respectively,

= ? ? e?. r + 1 (a)

= a ?

 + 1  (b)

?, ?, a, b are coefficients of fitting models and r
is the proposed criterion. The fitting accuracy evaluation is based on the R-square
criterion. Note that we added ‘1’ to the original fitting equation to avoid
clustering the image with one cluster for the high values of r.

Generalisation: The aim of this step is to automate the
choice of the optimal cluster number for all patients using one corresponding
relationship function by defining a generalised model for all patients. For
this reason, we have divided the database randomly into two parts of 50% each.
The first part (validation set), contains three patients is used for optimising
the model coefficients and fixing the optimal power and exponential generalised
model. The second part (test set), contains four patients, is used for testing
the accuracy of the fixed optimal model.

to the R-square criterion, the optimal exponential and power generalised
function can be rewritten as follows:


= 46.52e?5.918 × r
+ 1

= 1.683r?1.264 + 1


mean alogorithm 3 :



1.  K-mean
clustering algorithm


Optimization by new technique: The
underlying idea behind it is an analysis of the “movement” of objects between
clusters, considered either forward from k to k+1 or backwards from k+1 to k
groups. In other words we find the movement in membership or joint probability
around k groups. The joint probability obtained from adjacent consecutive k
numbers of groups will be used to produce a diagonally dominant probability
matrix for optimal value of k homogenous and separable groups. The maximum normalised
value of the trace as the greatest value for k within the range tested, will be
defined as the optimal value of k clusters for the given dataset. Formally, we
may describe our approach as follows. For a given choice of k = number of
clusters, a given choice of clustering technique U, and a given choice of V =
set of parameters v1…vn used to control the clustering technique, we first
construct a set of clusters

(U,V) = {

} with i=1..k. Next, we construct sets
of clusters

(U,V), and

(U,V) using the same clustering
technique. In the work reported here, we will not vary U and V so we may write
these cluster sets more simply as



. Now these three (



) consecutive groups around k will be
used to find the proportion of common objects from Ck to



, to create a rectangular proportion
matrix of size m x n, where m and n correspond to rows and columns of
proportion matrix

. We denote the proportion of data
elements in common between a particular pair of clusters, say cluster


 and cluster



, which can be abbreviated to

. Similarly, we can compute

, k to create a rectangular matrix of
size n x m where n correspond to columns and m to rows. Note that in general

 is not equal to

as they have different cardinality
meaning k+1 not equal to k and vice versa. To investigate how much movement of
objects occurs from Ck to Ck+1 and from


 we will apply the dot product of matrix for
size m x n (


) and n x m rectangular matrices (


) to get the joint information in

 square matrix m x m for

clusters. Due to the row sum constant of
1 the resultant

 square matrix is also known as a row
stochastic matrix 4 or probability and transition matrix 5. The trace of
resultant set of

clusters will be normalised (average of
the trace) to determine the set of more stable (optimal k)

clusters as if the normalised trace is
maximum and may change occur in the

depending on the dataset for the range
of adjacent set of k values. This

matrix will be used to determine the
maximum normalised trace value for determining set of more stable or consistent
clusters , that will indicate set of clusters in

are stable and completely separated from
one another. The steps involves in determining the optimal value of k from the
resultant clusters are follows as:


Create the m x n forward proportion matrix from k to k+1






Create the n x m backward proportion matrix from k+1 to k







1,2,3 … . .


1,2,3 … . .


dot product of (1) by (2) will give us a m x m matrix as in (3) below with the
entries showing the joint probabilities of the forward/back movement of the
objects between the set of clusters from k to k+1 and k+1 to k.






new index

 can be calculated from

 in (3) as follows:






 From (4) the normalised maximum trace value

 will indicate the set of stable cluster at k
value. In an extreme situation the normalised trace is equal to 1, that is
where the set of clusters in

will keep splitting until the value of
the trace is 1 and it may decrease or increase as we continue but will always
stay less than 1.




A new unsupervised cluster-based approach for segmenting the BTV in

images is introduced. The system is more reliable and has very less error. It can be improved by

 technique used in determining an optimal value of K in K-means clustering, for which k-means clustering it uses a
method to find an optimal value of k number of clusters, using the features and
variables inherited from datasets. The new proposed method is based on
comparison of movement of objects forward/back from k to k+1 and k+1 to k set
of clusters to find the joint probability, which is different from the other
methods and indexes that are based on the distance.



I'm Katy!

Would you like to get a custom essay? How about receiving a customized one?

Check it out