Abstract
It is commonly believed that not all degrees of freedom are needed to produce good solutions for the treatment planning problem in intensity modulated radiation therapy (IMRT). However, typical methods to exploit this fact either increase the complexity of the optimization problem or are heuristic in nature. In this work we introduce a technique based on adaptively refining variable clusters to successively attain better treatment plans. The approach creates approximate solutions based on smaller models that may come arbitrarily close to the optimal solution. Although the method is illustrated using a specific treatment planning model, the components constituting the variable clustering and the adaptive refinement are independent of the particular optimization problem.
1 Treatment planning and application
The treatment planning problem in IMRT is the derivation of intensity maps that produce, as nearly as possible, a prescribed dose distribution when beam directions and couch angles are specified. For this report we will assume that the directions and couch angles of the linear accelerator are fixed because finding optimal values for these parameters is a global optimization problem. The variables of the treatment planning problem are therefore the intensity maps which describe the modulation of the radiation on the cross-section of the beams. Usually, these are discretized on a regular grid with a resolution determined by the width of the multileaf collimator (MLC) leaves. This corresponds to a discretization to fields of size typically less than or equal to 10 × 10 mm. Each small field is called a beamlet. The number of variables in a typical treatment planning problem with five to seven beam directions corresponds to several hundred beamlets. The aim of the IMRT planning and dose delivery process is to destroy cancerous cells while sparing nearby healthy structures. This process naturally lends itself to the formulation of a multicriteria optimization problem in which the dose in each tumor volume and each healthy structure is assessed with separate objective functions [19]. A decision-support system to select a treatment plan from the Pareto frontier of such a multicriteria approach is described in detail in the publication [9].
This multicriteria approach provides control over the trade-off between overdosing healthy structures and destroying cancerous cells. However, it does not consider the number of monitor units (related to total treatment time) needed to deliver the intensity maps using an MLC. Long treatment times increase the risk of treatment errors due to patient movement. An increased number of monitor units leads to increased leakage radiation reaching the patient’s whole body, so that the risk of secondary cancers goes up [6]. An objective function that provides some control over this aspect was introduced in a recently published article [3]. However, adding an objective function to the treatment planning problem increases its complexity. Complicated cases with many healthy structures close to the tumor volumes can result in a significant increase in computation time during the optimization step of the planning process. If a step-and-shoot method is used to deliver the treatment, the intensity maps can be transformed before they are translated into MLC leaf configurations for the patient treatment. In the following, we assume that a static step-and-shoot method is used to deliver the plans.
One way of reducing the total number of different field shapes making up the MLC sequence of apertures is to reduce the number of different intensity levels in an intensity map. Experiments show that if the number of intensity levels decreases, the number of apertures can be expected to decrease [8, 17]. This stratification can, for example, be performed for a given intensity map as shown in figure 1. However, it is required that the quality of the solution for the planning problem be maintained when the intensity levels are reduced [12]. A reduction of the number of variables in the treatment planning problem by requiring that some beamlets have the same intensity value is the a priori counterpart to stratification. The benefit of a priori stratification is that calculations are simplified by a reduction in the number of variables. Also, because the plans are simpler, the number of apertures and the total number of monitor units needed to treat the patient should be reduced. The potential problem with this approach is that it limits control over the dose in the patient’s body. Thus, the trade-off between the consequences of stratification must be carefully evaluated during the treatment planning process.
Figure 1.

The original intensity map was stratified to 5 distinct intensity values. The map in 1(a) can be delivered with 27 apertures, whereas the map in 1(b) needs only 5 apertures.
Reducing the degrees of freedom of the treatment planning problem is not a new idea. A related approach is aperture-based optimization where the monitor units of a set of pre-determined MLC leaf configurations are the variables of the planning problem [5, 18]. Some approaches suggest an automated procedure for generating these apertures [4]. The advantage of these methods is that physical effects of the plan delivery like leaf leakage can be incorporated into the planning, and no sequencing is needed after the optimization. However, it may be difficult to select a set of “good” apertures that will guarantee sufficient control over the dose distribution so that the plan attains an acceptable quality.
Some attempts have been made to modify optimization algorithms used for IMRT treatment planning so that the number of apertures resulting from sequencing is reduced. Most notably, Alber and Nüsslin [1] have proposed an operator that modifies the solution at fixed iterations during the optimization. This operator sets neighboring beamlets to equal intensities, which are then grouped to apertures. The intensities corresponding to all created apertures are then additional variables in the optimization. A different approach was taken by Keller-Reichenbecher et al. [8]. Here, the solution is stratified to a small number of intensity levels every fixed number of iterations. Although the results presented for the latter approach indicate that the method performs quite well, it remains a heuristic and does not in general converge to an optimal solution.
To address the goal of reduced monitor units, the new objective function presented by Craft et al. [3] and the approach proposed by Alber and Nüsslin [1] increase the complexity of the problem. In comparison, the heuristic method presented by Keller-Reichenbecher et al. [8] does retain the complexity level of the original planning problem. In this paper, we describe a method to first reduce the number of variables of the planning problem to a very small number. In contrast to earlier aperture-based optimization approaches, this is done automatically. Additionally, the method presented here does not restrict the aggregation of variables to apertures but allows clusters of possibly unconnected regions on the beam surface. As such, this is one step closer to intensity modulation than aperture-based planning. It does, however, require the solutions to be sequenced. During the solver iterations, we adaptively add more degrees of freedom to refine the solution until it is acceptable. The planner may set a limit to the number of variables that are used and can control the resulting complexity of the intensity maps. By setting this limit equal to the number of variables in the treatment planning problem, this approach produces optimal solutions. Moreover, it is numerically verified that the adaptive refinement of the problem formulation based on clinically meaningful guidelines has a positive effect on the convergence of a solution mechanism.
Chapter 2 presents the idea of aggregating the variables by means of the dose computation necessary in IMRT planning. This chapter also justifies the use of a heuristic clustering procedure. Chapter 3 introduces the clustering technique and demonstrates its applicability using a simulation example and a clinical prostate case. Chapter 3 also specifies a treatment planning problem formulation for the prostate case, which serves as an illustration for the techniques presented later. In chapter 4, the refinement strategy is proposed and carried out for the prostate case. Numerical results on the solver progress and a comparison of the solutions obtained by aggregation and refinement with the solution of the original formulation are also presented in this chapter.
2 Dose computation and variable aggregation
The standard approach of discretizing the intensity maps to a regular grid of small beamlets is taken as a starting point for solving the IMRT planning problem. The patient’s body is also discretized into small volume elements called voxels to simplify the computation of the dose received by each small volume part. The slices of the CT scan imply a natural dissection in the z-direction. Together with a further sectioning of the x–y plane, the voxels typically are of dimensions of a few millimeters. As a consequence of these discretizations and the superposition principle of dose deposits in photon therapy, the dose distribution over the voxels can be calculated by the matrix multiplication
$$d = P \cdot x \tag{1}$$
where d is the m-dimensional vector of dose values for all voxels, the matrix P is the dose information matrix, and the n-dimensional vector x contains the intensities of the beamlets over all beams, written as one column vector. The entry $p_{ji}$ of the dose information matrix represents the contribution of the ith beamlet to the absorbed dose in the jth voxel under unit intensity. There are several methods to estimate these values. They might be calculated using the pencil beam approach, a superposition algorithm, or some Monte Carlo method. In this paper we do not discuss this important issue - the interested reader is referred to the books by Webb [15, 16]. We assume that P is given in some satisfactory way. Note that the rows of P correspond to the voxels and the columns to the beamlets. Therefore, P typically has about one million rows and several hundred columns. Moreover, the matrix is sparse - often only 10% of the entries are positive. The reason is that one beamlet only hits a small portion of voxels compared to the entire volume. The dose calculation takes significant time in an iterative optimization algorithm to determine the intensity maps - even when techniques to exploit the sparsity of P are used. Further, as MLC hardware becomes more sophisticated, the leaves will become thinner, and the number of columns of P will increase.
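The dose calculation (1) can be illustrated with a minimal sketch that exploits the sparsity of P. The column-wise dictionary storage, the function name `dose`, and the toy numbers are assumptions made for illustration only; they are not taken from the planning software used in this work.

```python
def dose(P_cols, x, m):
    """Compute d = P x for a sparse dose information matrix.

    P_cols[i] maps a voxel index j to the entry p_ji, i.e. the dose in
    voxel j per unit intensity of beamlet i (hypothetical storage layout).
    m is the number of voxels.
    """
    d = [0.0] * m
    for col, intensity in zip(P_cols, x):
        if intensity == 0.0:
            continue  # a closed beamlet deposits no dose
        for j, p_ji in col.items():
            d[j] += p_ji * intensity  # superposition of dose deposits
    return d


# Toy data (hypothetical): 4 voxels, 3 beamlets.
P_cols = [{0: 1.0, 1: 0.5}, {1: 0.5, 2: 1.0}, {2: 0.25, 3: 2.0}]
x = [2.0, 1.0, 4.0]
d = dose(P_cols, x, 4)
```

Only the stored (positive) entries are visited, so the cost grows with the number of nonzeros of P rather than with m times n.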
A numeric technique to reduce the number of rows of P using an adaptive clustering method is presented in another publication in this special issue by Scherrer and Küfer [11]. Neighboring voxels belonging to the same organs or tumors are treated as groups if their dose deposits are “similar”. The optimization is carried out on these clusters of voxels with a dose information matrix that has relatively few rows. The largest errors due to the clustering are identified and the clusters broken up to attain a refined description of the body. This iteration between optimization and refinement continues until the clustering error is below a threshold. In this paper, we focus on the aggregation of variables, so we are interested in reducing the number of columns of P.
To reduce the degrees of freedom of the treatment planning problem means to constrain some of the beamlets to have equal intensity values. Imagine that the beamlets are partitioned into L groups $B_1, \dots, B_L$, and each group of beamlets has its own intensity $\ell_b$. Thus, $x_i = \ell_b$ for all beamlets i belonging to group $B_b$. This means that the dose calculation (1) can be written as
$$d = \sum_{b=1}^{L} \ell_b \sum_{i \in B_b} P_i \tag{2}$$
where $P_i$ denotes the ith column of P.
Aggregating the variables in the treatment planning problem effectively reduces the size of P by summing up all columns that correspond to beamlets with identical intensity values. To encode the allocation of beamlets to groups, we introduce the n × L matrix A with entries
$$a_{ib} = \begin{cases} 1 & \text{if beamlet } i \text{ belongs to group } B_b, \\ 0 & \text{otherwise.} \end{cases}$$
The dose calculation (2) can then be written as d = P · A · ℓ, with ℓ as the vector of all group intensities. The “small dose information matrix” P · A makes the computation of d much faster if L ≪ n.
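As a small sketch of the aggregation, the following code sums the columns of a toy matrix P by group and checks that the full product P · x and the reduced product (P · A) · ℓ coincide. The helper names `aggregate_columns` and `matvec`, the list `groups` (which stores the group index of each beamlet instead of the matrix A), and all numbers are hypothetical.

```python
def aggregate_columns(P, groups, L):
    """Return P·A: column b is the sum of the columns of P whose
    beamlets belong to group b (groups[i] is the group of beamlet i)."""
    PA = [[0.0] * L for _ in P]
    for j, row in enumerate(P):
        for i, b in enumerate(groups):
            PA[j][b] += row[i]
    return PA


def matvec(M, v):
    """Plain dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


# Toy data (hypothetical): 3 voxels, 4 beamlets, 2 groups.
P = [[1.0, 0.5, 0.0, 0.0],
     [0.0, 0.5, 1.0, 0.0],
     [0.0, 0.0, 0.5, 2.0]]
groups = [0, 0, 1, 1]            # beamlets 0, 1 -> group 0; 2, 3 -> group 1
ell = [2.0, 3.0]                 # one intensity per group
x = [ell[b] for b in groups]     # x_i = ell_b for i in B_b

PA = aggregate_columns(P, groups, 2)
d_full = matvec(P, x)            # uses all n = 4 columns
d_small = matvec(PA, ell)        # uses only L = 2 columns
```

The reduced matrix needs to be formed only once per allocation; every subsequent dose evaluation in the solver then works with L instead of n columns.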
If the number of different intensity levels L is fixed, we can formulate the allocation problem to find A as an optimization problem to minimize a metric that describes the error due to the aggregation. In other words, right-multiplying A to P should keep the norm ‖d − P · A · ℓ‖q for q ≥ 1 and any given dose d and intensities ℓ small.
Problem 2.1 (Allocation problem)
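Given a dose distribution d, a dose information matrix P and a set of group intensities ℓ, find an allocation A that approximates the resulting dose distribution as closely as possible under the norm ‖ · ‖q for q ≥ 1:
$$\min_{A} \; \lVert d - P \cdot A \cdot \ell \rVert_q \quad \text{subject to} \quad \sum_{b=1}^{L} a_{ib} = 1 \;\; (i = 1, \dots, n), \qquad a_{ib} \in \{0, 1\}.$$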
Note that the constraints in Problem 2.1 produce a partition of the beamlets. Unfortunately, the Allocation problem is NP-hard.
To see that the Allocation problem is hard, we reduce PARTITION to an instance of Problem 2.1. Let k1, …, kn be a set of positive integers and $g = \tfrac{1}{2} \sum_{i=1}^{n} k_i$. PARTITION asks if there is a partition of the integers into subsets S1 and S2 such that $\sum_{i \in S_j} k_i = g$ for j = 1, 2. Now consider the following Allocation problem:
| (3) |
If the optimal objective function value (3) is 0 for any q ≥ 1, then PARTITION is answered with yes, and no otherwise. This is the Allocation problem with d = [g g]T, P with the set of integers k1, …, kn as row vectors, and the group intensities ℓ given by 1, the vector of all 1s. This result discourages a search for the optimal aggregation. The alternative is to develop a method that produces solutions of acceptable quality.
Note that if two columns of P are “similar”, meaning the positive entries hit voxels belonging to the same structures with similar contributions, it may be expected that the optimal intensities corresponding to the two beamlets are similar as well. That is, we would like to group together those beamlets with similar impact on the dose distribution d. Grouping similar objects is achieved by clustering methods [7]. In the following chapter, we derive a clustering algorithm to group similar beamlets.
3 Beamlet clustering and implications
The ingredients for a clustering method are a measure of similarity between objects and between objects and clusters, and an algorithm to group the objects based on these similarities. Instead of maximizing the similarity between objects inside a cluster, the dissimilarity could be minimized. For objects x and y that are characterized by vectors of real numbers, a simple family of dissimilarity measures is given by the distance metrics
$$\mathrm{dist}_q(x, y) = \Bigl( \sum_{j} \lvert x_j - y_j \rvert^{q} \Bigr)^{1/q} \tag{4}$$
for q ≥ 1. For q = 1, (4) is the rectilinear or Manhattan metric, and for q = 2, (4) becomes the Euclidean metric. We will use q = 2 and take “dist” without the subscript q to mean the Euclidean metric from here on. A cluster will be represented by an average over all columns that are grouped to it. For a cluster $B_b$, this representative is given by
$$\mu_b = \frac{1}{\lvert B_b \rvert} \sum_{i \in B_b} P_i,$$
where $P_i$ is the ith column of P.
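The distance metrics and the cluster representative can be sketched in a few lines. The function names `dist` and `representative` and the sample vectors are chosen for illustration only.

```python
def dist(u, v, q=2):
    """Distance metric (4): Manhattan for q = 1, Euclidean for q = 2."""
    return sum(abs(a - b) ** q for a, b in zip(u, v)) ** (1.0 / q)


def representative(vectors):
    """Component-wise average of the vectors grouped into one cluster."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


u, v = [0.0, 0.0], [3.0, 4.0]
euclidean = dist(u, v)         # q = 2
manhattan = dist(u, v, q=1)
center = representative([[1.0, 2.0], [3.0, 6.0]])
```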
A first attempt to characterize the beamlets might be to use the columns of P. There are, however, two disadvantages associated with taking the entire “information” of each beamlet. First, evaluating the distance (4) between two beamlets takes a long time because potentially many entries have to be compared. This calculation becomes even more tedious as clusters grow in size because a vector of averages over sparse columns is in general less sparse. The other disadvantage is that the distance measure does not “discriminate” enough if the original columns are used. The positive entries in P range from the order of $10^{-5}$ to $10^{2}$, and large values will have a dominating effect on the distance. As a result, only large deviations between two objects determine their dissimilarity and small entries are largely ignored. In IMRT, however, it is especially the many small contributions that add up to significant doses that can be exploited to shape the dose distribution. Therefore, a different characteristic for beamlets and cluster representatives must be found.
Better clusters can be expected when the information contained in the columns of P is condensed on an organ level. The contribution of a beamlet to a specific organ is given by the entries in the rows corresponding to voxels of that organ. From a statistics viewpoint, if these entries are seen as random variables, their moments suffice to characterize the beamlets. The kth moment of a random variable Y is given by the expected value of the kth power of Y, $E(Y^k)$. We will characterize the beamlets and cluster representatives by a vector of moments for each organ in the patient body. We limit the number of moments to at most 3, since higher moments are typically only of theoretical value [14, Chapter 3.9]. Thus, the vectors characterizing our objects only contain (number of organs × number of moments ≈ 10 to 20) entries and could look like the following if 2 moments are used:
$$c_i = \bigl( c^{s_1}_1(i),\; c^{s_1}_2(i),\; c^{s_2}_1(i),\; c^{s_2}_2(i),\; \dots \bigr)^T,$$
where the entry $c^{s_1}_h(i)$ denotes the hth moment of the contributions of beamlet i to the organ $s_1$, given by
$$c^{s_1}_h(i) = \frac{1}{\lvert s_1 \rvert} \sum_{j \in s_1} p_{ji}^{\,h} \tag{5}$$
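A minimal sketch of how such a characteristic vector could be assembled, assuming the hth moment is the average of the hth powers of a beamlet's entries over the voxels of an organ. The function name, the data layout, and the toy numbers are hypothetical.

```python
def characteristic_vector(column, organs, n_moments=2):
    """Moments of one beamlet's dose contributions, organ by organ.

    column: dict voxel index -> p_ji (sparse column of P, assumed layout)
    organs: list of (name, voxel indices); the order fixes the layout
    """
    c = []
    for _name, voxels in organs:
        for h in range(1, n_moments + 1):
            # Voxels that the beamlet does not hit contribute zero.
            c.append(sum(column.get(j, 0.0) ** h for j in voxels) / len(voxels))
    return c


# Toy data (hypothetical): the beamlet hits voxels 0, 1 and 3.
column = {0: 2.0, 1: 4.0, 3: 1.0}
organs = [("tumor", [0, 1]), ("organ", [2, 3])]
c = characteristic_vector(column, organs)
```

The resulting vector has one block of `n_moments` entries per organ, so its length is independent of the number of voxels.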
Now that we have decided on the dissimilarity measure for beamlets and cluster representatives, we need a method to group the beamlets into clusters. Because we fixed the number of clusters in Problem 2.1, we use the K-means algorithm 3.1 to aggregate the variables.
| Algorithm 3.1 K-means algorithm adapted from [7] | |
|---|---|
| Procedure: KMeans | |
| Input: Characteristic vectors ci, i = 1,…,n, number of clusters K, distance measure “dist” | |
| Output: allocation A | |
| Step 1: | Produce initial clusters 1, 2,…,K and allocation A and compute the cluster means μ1,…,μK. |
| Step 2: | For beamlet i = 1, compute for every cluster k the increase in error in transferring this beamlet from cluster a(i) to cluster k, given by $\frac{\lvert k \rvert \, \mathrm{dist}(c_i, \mu_k)^2}{\lvert k \rvert + 1} - \frac{\lvert a(i) \rvert \, \mathrm{dist}(c_i, \mu_{a(i)})^2}{\lvert a(i) \rvert - 1}$, where $\lvert k \rvert$ denotes the size of cluster k, and $a(i) = \{b : a_{ib} = 1\}$ denotes the cluster to which beamlet i is assigned. If the minimum of this quantity over all k ≠ a(i) is negative, transfer the beamlet i from cluster a(i) to this minimal k, adjust the cluster means of a(i) and k, and set a(i) := k. |
| Step 3: | Repeat Step 2 for i = 2,…, n. |
| Step 4: | If no movement of a beamlet from one cluster to another occurs, stop. Otherwise, return to Step 2. |
The characteristic vectors are the collections of moments of organ contributions, and the distance measure is the Euclidean metric. The cluster means are updated by adding or subtracting the dose contributions of newly added or removed beamlets; this update is the most time-consuming operation of KMeans. Algorithm 3.1 is performed separately for each beam to ensure that exactly L clusters represent the beamlets of each direction. One method to obtain an initial clustering for Step 1 is to randomly assign beamlets to the K clusters. This also has the advantage that it produces different starting points for KMeans, each possibly leading to a different locally optimal allocation. Typically the method runs fast enough so that the clustering can be performed several times with different starting points. The clustering with the smallest error is taken as the final aggregation.
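The transfer-based K-means procedure can be sketched as follows. This is a simplified illustration rather than the implementation used for the results in this paper: cluster means are recomputed from scratch instead of being updated incrementally, a beamlet is never moved out of a cluster of size one, and the names `kmeans`, `seed` and `max_sweeps` are assumptions.

```python
import random


def sqdist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(vectors):
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def kmeans(c, K, seed=0, max_sweeps=100):
    """Cluster the characteristic vectors c into K groups (needs len(c) >= K).

    Returns a list a with a[i] = cluster index of object i.
    """
    n = len(c)
    rng = random.Random(seed)
    a = [i % K for i in range(n)]   # Step 1: every cluster is non-empty ...
    rng.shuffle(a)                  # ... and the assignment is random
    for _ in range(max_sweeps):
        moved = False
        for i in range(n):          # Steps 2 and 3: visit every object
            src = [c[t] for t in range(n) if a[t] == a[i]]
            if len(src) == 1:
                continue            # assumption: never empty a cluster
            gain = len(src) * sqdist(c[i], mean(src)) / (len(src) - 1)
            best_k, best_delta = None, 0.0
            for k in range(K):
                if k == a[i]:
                    continue
                dst = [c[t] for t in range(n) if a[t] == k]
                # Increase in total squared error if i moves to cluster k.
                delta = len(dst) * sqdist(c[i], mean(dst)) / (len(dst) + 1) - gain
                if delta < best_delta:
                    best_k, best_delta = k, delta
            if best_k is not None:
                a[i] = best_k
                moved = True
        if not moved:               # Step 4: stop when nothing moved
            break
    return a


# Toy data (hypothetical): two well-separated groups of objects.
points = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
labels = kmeans(points, 2)
```

Running the function with several values of `seed` and keeping the allocation with the smallest total error mirrors the multi-start strategy described above.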
3.1 Case 1: Artificial example
To demonstrate the basic effects of this type of variable aggregation, a simple, artificial case was created. Figure 2 shows one of the transverse slices of this case. The “body” is a large cube, and there are only three relevant structures. The cuboid at the bottom inside the body represents the tumor volume, and the other two structures resemble healthy organs. Five beam directions were chosen and the beamlets clustered to only 2 or 3 groups to illustrate the effect of this type of variable aggregation technique.
Figure 2.

The figure shows the view of a transverse cut of the artificial example in the planning software KonRad (developed by the DKFZ in Heidelberg). The tumor is the cuboid where the beam directions intersect.
The tumor and the organs have size 20 × 4 × 16 cm. The voxel sizes were set to 1 × 1 × 3 mm. This corresponds to a total of 363,825 voxels for this case. Over all 5 beams, a total of 441 beamlets actually hit the tumor volume and constitute the degrees of freedom in the treatment planning problem for this artificial case. The number of positive entries in P is about 14% of all entries in the matrix. We refer to the beam directions by their angles. Starting from the beam entering from the top of figure 2 with angle 0, the directions are separated by 72 degrees.
The beamlets of the directions 0 and 288 degrees were grouped in 3 clusters, and the rest of the directions were clustered in 2 groups. As a result, the matrix P · A has only 12 columns, and about 36% of its entries are positive. Calculating the characteristic vectors for all beamlets and the clustering procedure took a total of about 2.4 seconds per beam on a 2.2 GHz processor. Figure 3(a) shows the cluster number of each beamlet in the intensity map corresponding to beam 0. It was natural to choose 3 clusters because the back-projection of the structures on the surface of the beam can be parted in three: one area in the middle where both organs are in front of the tumor, and two areas where only one is in the way of the beam. Note that the clustering algorithm has no information about the location of the beamlets on the beam surface - they were clustered solely based on the information about their contributions to the structures. The other beam directions showed similar geometric back-projections. The clusters in beam 216 (figure 3(d)), for example, also simulate the projection of the structures on the beam: the area on the beam where only the farthest structure is hit constitutes a separate cluster of beamlets.
Figure 3.

Cluster numbers of the beamlets in each beam for the artificial example. The beams are of different size because the planning software automatically eliminates those beamlets that do not hit the tumor.
3.2 Case 2: Clinical prostate example
While the first artificial example demonstrates that the clustering method based on organ information is in principle capable of identifying different critical regions on the surface of the beam, this does not yet warrant a successful application to real cases. In this section, a prostate case provided to us by the Massachusetts General Hospital in Boston is studied and the variables are aggregated. We will additionally formulate a treatment planning problem and compare the results from the original formulation with the solution to the aggregated problem. We delay the discussion of improving the aggregated solution until chapter 4.
In this case, the prostate and the seminal vesicles are the target volumes. They compose the structure marked with the crosshairs in the left window of figure 4. The structure in front of the prostate is the bladder, and behind the prostate the rectal walls (anterior and posterior) are segmented separately. Finally, the femoral heads are also included as critical structures in this case to limit the dose absorbed by lateral beam directions.
Figure 4.

This view is taken from VIRTUOS (developed by the DKFZ in Heidelberg) and shows the three-dimensional representation of the patient’s body. The left window shows the view from the direction of beam 0.
The dose information matrix consists of 799,200 rows and 173 columns. Again, 5 equiangular beam directions were chosen. There are 13,414,539 positive entries in P, which is 9.7% of all entries. The number of clusters was set to 4 for beam 0 and 5 for the rest. Determining the characteristics of all beamlets and aggregating them took a total of only 2 seconds per beam on a 2.2 GHz processor. The clusters are depicted in figure 5. Again, the clusters closely resemble projections of the structures on the beam. In beam 0, for example, the beamlets corresponding to cluster 1 hit the prostate and both the anterior rectal wall and the posterior rectal wall. Cluster 2 consists of those beamlets that hit either only one rectal wall or none.
Figure 5.

Cluster numbers of the beamlets in each beam. Some beamlets are “switched off” because the planning software automatically eliminates those beamlets that do not hit the tumor.
The beamlets in clusters 0 and 3, finally, have to shoot through the bladder and also hit both rectal walls behind the target. Cluster 0 hits the prostate and the beamlets in cluster 3 hit the seminal vesicles. Similar observations can be made for the other beam directions. As a result of the variable aggregation, P · A contains only 24 columns, and the percentage of positive entries increased to 17.6%.
To illustrate the loss in control we imposed, we now compare the original and aggregated solution to the following planning problem. The objective functions for the healthy structures are based on the equivalent uniform dose (EUD) concept [10]. They are of the form
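$$f_{\mathrm{EUD},s}(d) = \frac{1}{d_{\mathrm{ref},s}} \left[ \alpha_s \Bigl( \frac{1}{\lvert s \rvert} \sum_{j \in s} d_j^{\,p_s} \Bigr)^{1/p_s} + (1 - \alpha_s) \Bigl( \frac{1}{\lvert s \rvert} \sum_{j \in s} d_j^{\,q_s} \Bigr)^{1/q_s} \right] \tag{6}$$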
where s denotes the corresponding organ, |s| is the number of voxels in that organ, αs a weight between 0 and 1, dref,s a reference dose value for organ s, and ps and qs are organ-specific modeling parameters (≥ 1). If ps and qs are relatively small, the smaller dose values dj in s are emphasized more. By combining two EUD-type functions, it is possible to model the objective function according to the flexible max-and-mean EUD concept introduced in [13]. The reference dose values dref,s are included to ensure comparability between the objective functions for different organs. The planner must choose the reference doses according to the statement “a dose of dref,s1 in structure s1 is of same importance to me as a dose of dref,s2 in structure s2”. Note that the scale of these reference values does not matter - only the relative magnitudes to each other.
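The effect of the exponent can be seen in a short sketch of a single EUD-type term, assuming the usual generalized-mean form. The function name `eud` and the dose values are illustrative only.

```python
def eud(doses, p):
    """Generalized mean of the voxel doses: ((1/|s|) * sum d_j**p) ** (1/p)."""
    return (sum(d ** p for d in doses) / len(doses)) ** (1.0 / p)


# Toy dose values (hypothetical) for one organ with a single hot voxel.
doses = [10.0, 20.0, 60.0]
mean_like = eud(doses, 1)   # arithmetic mean, all voxels weighted equally
moderate = eud(doses, 3)
max_like = eud(doses, 8)    # dominated by the hot voxel
```

For p = 1 the term equals the mean dose; as p grows it approaches the maximum dose. Combining a small and a large exponent therefore penalizes both the average and the peak dose in an organ.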
The objective functions for the tumor volumes are given by
| (7) |
to evaluate a lower bound for the target dose, and
| (8) |
with dhom slightly larger than dcur to ensure that the dose is homogeneous in the tumor volume.
The values for each parameter of the functions (6)–(8) are given in the following table.
| Structure s | dref,s | ps | qs | αs |
|---|---|---|---|---|
| tissue | 45 | 2 | 2 | - |
| right femoral head | 50 | 3 | 8 | 0.3 |
| left femoral head | 50 | 3 | 8 | 0.3 |
| anterior rectal wall | 40 | 3 | 8 | 0.75 |
| posterior rectal wall | 25 | 3 | 8 | 0.25 |
| bladder | 30 | 3 | 8 | 0.35 |
The values for the tumor volumes are dcur = 76 and dhom = 80. The values for qcur and qhom are both 4.
The scalarized multicriteria optimization formulation to solve the treatment planning problem can now be stated:
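Problem 3.1 (Scalarized treatment planning problem)
$$\min_{x \ge 0} \; \max_{s} \; f_{\mathrm{EUD},s}(d) \quad \text{subject to} \quad f_{\mathrm{cur}}(d) \le 0.5, \quad f_{\mathrm{hom}}(d) \le 0.5, \quad d = P \cdot x,$$
where the maximum is taken over the healthy structures s and the constraints are imposed for each target volume.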
The constraints fcur(d), fhom(d) ≤ 0.5 on the tumor volumes largely prevent under-shooting dcur and exceeding dhom. There is no method to test a priori if these constraints can be met. In addition, we would like to use the solution to one aggregated problem as a starting solution for the next refined problem as described in the next chapter. That is, the solver used must be able to cope with infeasible as well as feasible iterates. For this reason, problem 3.1 is solved by a penalty sequential linear programming solver [2, Chapter 10.3].
One graphical output of the quality of a plan is the dose-volume histogram (DVH). These histograms display the percentage of a structure that receives at least a certain dose over the relevant dose interval. The DVH for the optimal original problem is given in figure 6. All calculations were done on the same 2.2 GHz processor to ensure comparability. The solver needed 23 minutes to obtain this solution. The DVH of the solution on the aggregated variables is depicted in figure 7. In this case the solver needed 3.5 minutes. At first glance, it is obvious that the solution to the aggregated problem is not feasible. This may be expected as the degrees of freedom in the planning problem were severely reduced. However, there is a striking similarity in the DVH curves for the organs at risk in both histograms. The following table containing the EUD values (not normalized by their reference dose) also shows this.
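A cumulative DVH of the kind described here can be computed with a few lines. The function name `dvh` and the sample doses are hypothetical.

```python
def dvh(doses, levels):
    """Percentage of voxels of a structure receiving at least each dose level."""
    n = len(doses)
    return [100.0 * sum(1 for d in doses if d >= t) / n for t in levels]


# Toy data (hypothetical): four voxels of one structure.
structure_doses = [10.0, 20.0, 30.0, 40.0]
curve = dvh(structure_doses, [0.0, 25.0, 40.0, 50.0])
```

Each curve starts at 100% and decreases monotonically; for organs at risk a curve further to the left is preferable, while target curves should stay near 100% up to the curative dose.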
Figure 6.

DVH for the original solution.
Figure 7.

DVH for the aggregated solution.
| Structure s | original fEUD,s | aggregated fEUD,s | % deterioration |
|---|---|---|---|
| right femoral head | 31.14 | 29.90 | −4.00 |
| left femoral head | 32.92 | 32.97 | 0.15 |
| anterior rectal wall | 49.92 | 47.46 | −4.93 |
| posterior rectal wall | 30.65 | 32.31 | 5.42 |
| bladder | 37.80 | 38.42 | 1.64 |
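The last column can be reproduced, up to rounding of the tabulated EUD values, as the relative change of the aggregated value with respect to the original one. The sketch below assumes this definition; the small differences in the second decimal presumably arise because the published percentages were computed from unrounded values.

```python
def deterioration(original, aggregated):
    """Relative change in percent; negative values mean an improvement."""
    return 100.0 * (aggregated - original) / original


# (original EUD, aggregated EUD, tabulated % deterioration)
table = {
    "right femoral head": (31.14, 29.90, -4.00),
    "left femoral head": (32.92, 32.97, 0.15),
    "anterior rectal wall": (49.92, 47.46, -4.93),
    "posterior rectal wall": (30.65, 32.31, 5.42),
    "bladder": (37.80, 38.42, 1.64),
}
recomputed = {s: deterioration(o, a) for s, (o, a, _) in table.items()}
```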
The following comparison of the true minimum dose in the tumor volumes indicates that the solution to the aggregated problem is not feasible. In fact, neither of the demands on the curative doses for the prostate and the seminal vesicles is satisfied.
| Target | original min dose | aggregated min dose |
|---|---|---|
| Prostate | 73.41 | 57.18 |
| Seminal vesicles | 74.41 | 49.47 |
The constraints pertaining to the maximum doses in the targets, however, can be met as the maxima of each volume for both solutions indicate:
| Target | original max dose | aggregated max dose |
|---|---|---|
| Prostate | 81.73 | 81.79 |
| Seminal vesicles | 81.42 | 81.52 |
It is, of course, not surprising that reducing the number of variables from 173 to only 24 may not produce feasible solutions. Some additional variables in the planning problem are definitely needed. An iterative procedure to decide which variables to “free” for a subsequent optimization problem is the topic of the next chapter.
4 Adaptive control and refinement of clusters
In chapter 3, a method to aggregate the variables of the treatment planning problem was presented. In this chapter the refinement of variable clusters is discussed. It is expected that the quality of a solution can deteriorate rather strongly, or that the solution may not even be feasible, when the number of clusters is small. Hence, a mechanism to break up existing clusters at various stages in the algorithm must be implemented to improve the solution of an aggregated problem. Similar to the voxel clustering technique in the article [11], we call such a method an adaptive refinement. In principle, the adaptive refinement creates a series of aggregated optimization problems starting from the first variable clustering, and successively breaks existing clusters into two child clusters. The treatment planning problem with the increased number of beamlet clusters is solved again and the objective function values are checked. An upper limit on how many variables can be freed this way serves as a stopping criterion. Of course, if the solution is still infeasible or the planner is not satisfied with this result, the procedure may be continued.
We will first introduce an idea to identify child clusters of existing aggregations based on the reference doses for each organ. Then we elaborate on how to control the iterations in the refinement. The prostate case of the previous chapter serves as a continuing illustration of the methods proposed in this chapter.
The critical question in a disaggregation procedure is which variables to free from existing clusters. We will make this decision based on the characteristics of the beamlets. In every iteration we identify one organ S that has an unfavorable dose distribution. Then all the clusters are searched and those beamlets with significant influence on the selected organ are separated into a new cluster. A limit on how many clusters are broken up this way is one of the control parameters of the refinement. To prevent moving too many beamlets into a single new cluster, the new clusters are constrained to contain only as many entries as the average cluster size of the old clusters. The procedure Refinement_Iteration is given in detail in algorithm 4.1.
In Step 1 of each refinement iteration, the cluster errors regarding the target organ are evaluated and the worst clusters identified. These worst clusters all contain beamlets that could be used to better control the dose distribution in S. Those beamlets are identified in Step 2 of Refinement_Iteration and separated into new clusters. Since problem 3.1 demands to minimize the maximum EUD normalized by the reference doses, it is natural to pick that organ for which the maximum is attained as S.
| Algorithm 4.1 Adaptive refinement iteration | |
|---|---|
| Procedure: Refinement_Iteration | |
| Input: Characteristic vectors $c_i$, $i = 1, \ldots, n$, of all beamlets; allocation $A \in \mathbb{B}^{n \times K}$ of beamlets to clusters; cluster means $\mu(1), \ldots, \mu(K)$; average cluster size $t_{\mathrm{ave}}$; target organ $S$; number of clusters $d$ to break up | |
| Output: new allocation $A_N$, new cluster means $\mu(1), \mu(2), \ldots$ | |
| Step 1: | For each cluster $k$ that hits $S$, compute the clustering error of target organ $S$ given by $\sum_h \left( \sum_{i \in k} \left( c^S_h(i) - \mu^S_h(k) \right)^2 \right)^{h^{-1}}$, where $h$ denotes the degree of the moment, and rank the clusters in descending order of these errors. |
| Step 2: | For the worst $d$ clusters found, allocate all beamlets in those clusters that have a higher contribution to $S$ than their cluster mean to a new cluster: |
| n := 1 // counter for newly created clusters | |
| for k := 1 to d // consider the worst clusters | |
| for all beamlets i ∈ k | |
| if $c^S_h(i) > \mu^S_h(k)$ then | |
| $a_N(i)$ := K + n // separate this beamlet from k | |
| else | |
| $a_N(i)$ := $a(i)$ | |
| end if | |
| if cluster K + n contains more than $t_{\mathrm{ave}}$ beamlets | |
| n := n + 1 | |
| end if | |
| next beamlet i | |
| next cluster k | |
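
The two steps of algorithm 4.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used for the results in this chapter: the function name, the list-based data layout and the 0-based cluster indices are hypothetical, a cluster is assumed to "hit" S when its mean first moment is positive, and the comparison in Step 2 is assumed to use the first moment only. Cluster means are recomputed inside the function to keep the sketch self-contained.

```python
from collections import defaultdict


def refinement_iteration(c_S, alloc, t_ave, d):
    """One refinement iteration for a fixed target organ S (illustrative sketch).

    c_S   : list of per-beamlet moment lists; c_S[i][h-1] is the moment of
            degree h of beamlet i with respect to organ S
    alloc : list mapping beamlet index -> cluster index (0 .. K-1)
    t_ave : average size of the existing clusters
    d     : number of worst clusters to break up
    Returns the new allocation as a list.
    """
    members = defaultdict(list)
    for i, k in enumerate(alloc):
        members[k].append(i)
    K = max(alloc) + 1
    H = len(c_S[0])

    # Cluster means per moment degree (recomputed for self-containment).
    mu = {k: [sum(c_S[i][h] for i in idx) / len(idx) for h in range(H)]
          for k, idx in members.items()}

    # Step 1: clustering error per cluster, summed over the moment degrees;
    # the inner sum of squares is raised to the power 1/degree.
    def error(k):
        return sum(
            sum((c_S[i][h] - mu[k][h]) ** 2 for i in members[k]) ** (1.0 / (h + 1))
            for h in range(H))

    # Assumption: a cluster "hits" S if its mean first moment is positive.
    hitting = [k for k in members if mu[k][0] > 0]
    worst = sorted(hitting, key=error, reverse=True)[:d]

    # Step 2: move beamlets above their cluster mean into new clusters and
    # open another new cluster once the current one exceeds t_ave entries.
    new_alloc = list(alloc)
    new_id, size = K, 0
    for k in worst:
        for i in members[k]:
            if c_S[i][0] > mu[k][0]:
                new_alloc[i] = new_id
                size += 1
                if size > t_ave:
                    new_id, size = new_id + 1, 0
    return new_alloc
```

As in the pseudocode, the counter for new clusters runs across the worst clusters, so a new cluster is closed only once it holds more than the average number of beamlets.
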
The last choice that remains is how to set the parameters of the refinement iteration. How many clusters should be formed in each iteration, and how many iterations should be done? One refinement iteration does not take much time because only the characteristic vectors of beamlets and clusters are compared. Each refined formulation of the treatment planning problem has to be solved again. The solution to the previous formulation should be an excellent starting point for the new problem, and the refined solution should be found in a few solver iterations. Since this can all be achieved in little time, the number of clusters to be broken up in algorithm 4.1 can be set rather low, say 20 clusters over all beams.
As there is a lot of empirical evidence that not many degrees of freedom are necessary to produce treatment plans of good quality, the limit on the number of variables at which the refinement procedure ends can initially be set rather low. A simple refinement strategy is then to choose a low threshold for the number of variables (say 60% of the number of beamlets). Once the number of variables is above this threshold, the refinement is continued only if the solution is not yet feasible.
We now illustrate the refinement strategy using the prostate case we began in the previous chapter. Starting from the solution in chapter 3, the refinement is carried out using the following rules:
1. The organ to refine is the one for which the maximum (normalized) EUD is realized.
2. If an organ is refined for two consecutive iterations, it cannot be refined in the next iteration.
3. In every iteration, 20 clusters are broken up.
4. Stop with the first feasible solution after the number of variables is above 60% of all beamlets.
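
The organ selection and the stopping test in these rules can be written down compactly. The following Python sketch is illustrative only: the function names, the dictionary of normalized EUD values and the history list are hypothetical, and the fallback for the case in which the only organ is blocked is an assumption that the rules above do not address.

```python
def choose_organ(norm_eud, history):
    """Pick the organ with the largest normalized EUD, skipping an organ
    that was already refined in the two preceding iterations.

    norm_eud : dict mapping organ name -> EUD normalized by its reference dose
    history  : list of organs refined so far, oldest first
    """
    blocked = None
    if len(history) >= 2 and history[-1] == history[-2]:
        blocked = history[-1]
    candidates = {o: v for o, v in norm_eud.items() if o != blocked}
    if not candidates:
        # Assumption: with nothing else to choose from, ignore the block.
        candidates = norm_eud
    return max(candidates, key=candidates.get)


def should_stop(n_vars, n_beamlets, feasible, threshold=0.6):
    """Stop at the first feasible solution once the number of variables
    exceeds the given fraction of all beamlets."""
    return feasible and n_vars > threshold * n_beamlets
```

A driver loop would alternate between solving the aggregated problem, calling `should_stop`, and, if the refinement continues, calling `choose_organ` followed by one pass of Refinement_Iteration with d = 20.
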
Rule 2 is included to prevent one organ from being refined too aggressively. The following table indicates the progress of the refinement. The first row is the solution from the previous chapter. The third column indicates the accumulated time the solver has taken up to that point.
| Refinement | Organ refined | Accumulated solver time | Number of variables (% of beamlets) | Feasible? |
|---|---|---|---|---|
| 0 | | 3 m 38 s | 24 (14%) | NO |
| 1 | post. rectal wall | 6 m 23 s | 45 (26%) | NO |
| 2 | bladder | 9 m 13 s | 65 (38%) | NO |
| 3 | ant. rectal wall | 12 m 10 s | 84 (49%) | NO |
| 4 | ant. rectal wall | 15 m 14 s | 108 (62%) | NO |
| 5 | bladder | 18 m 26 s | 133 (77%) | YES |
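
Since the third column reports accumulated times, the solver time spent on each individual problem is the difference between consecutive rows. The short Python sketch below carries out this arithmetic on the values of the table; the helper name `to_seconds` is hypothetical.

```python
def to_seconds(stamp):
    """Convert a time written like '6 m 23 s' into seconds."""
    minutes, _, seconds, _ = stamp.split()
    return 60 * int(minutes) + int(seconds)


# Accumulated solver times from the table, refinements 0 to 5.
accumulated = ["3 m 38 s", "6 m 23 s", "9 m 13 s",
               "12 m 10 s", "15 m 14 s", "18 m 26 s"]

totals = [to_seconds(s) for s in accumulated]
# Time spent on each problem: difference to the previous accumulated value.
increments = [b - a for a, b in zip([0] + totals, totals)]
```

The resulting increments are 218, 165, 170, 177, 184 and 192 seconds for refinements 0 to 5.
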
The process stopped after 18 minutes and 26 seconds, which is over 4 minutes less than the time needed for the original problem formulation. The DVHs of the last refinement are shown in figure 8. Perhaps the most striking difference is in the curves pertaining to the femoral heads and the anterior rectal wall. It is evident that the solution using the refinement strategy spared large parts of the anterior rectal wall at the cost of increasing the dose in the femoral heads. As the reference dose for these two organs is rather high compared to the realized dose, this has no effect on the objective function value. For comparison, figure 9 displays the normalized EUD values for both the original solution and the solution obtained from the last refinement.
Figure 8.

The DVH of the last refinement.
Figure 9.

The objective function values (EUDs normalized by their reference doses) for the optimal solution found by the original problem formulation and by the last refinement. The maximum of the values for the optimal solution is obtained by the bladder, and the maximum of the objective function values for the last refinement is given by the value of the anterior rectal wall.
The objective function value for refinement 5 is even slightly better than the solution to the original formulation. This is because the solver stops if no significant improvement can be made for a long time. In the original formulation, the solver may have stopped too early. This shows that the clustering approach may also improve the convergence to the optimal solution.
As may be expected from the quality of the objective function values after the first clustered solution, the normalized EUD values did not change much over the solution process. However, the graphs in figure 10 show that, especially in the first two refinements, the biggest improvement of the objective function values was attained by the organ which was refined in that step.
Figure 10.

The improvements in the objective functions remained small over the solution process. However, larger improvements were realized for those organs which were chosen for the refinement.
While the original solution took (25+20+20+22+22=) 109 apertures to be delivered, the solution to the last refinement problem took only (25+15+19+19+17=) 95. The number of monitor units, however, was the same at 216.
5 Discussion
In this work we introduced a variable aggregation technique for the treatment planning problem. The aggregation was motivated by a faster dose calculation that would speed up the solver iterations. A disaggregation method motivated by clinically meaningful indicators (i.e. the maximum EUD normalized by the reference dose) was developed to pose adaptively refined versions of the treatment planning problem. An example calculation on a clinical prostate case demonstrated the potential of this method. The method introduced found a superior solution in less time. Because some beamlets were still in clusters, the solution attained after clustering and refinement also needed fewer shapes after sequencing (95 instead of 109 apertures). The success of this method supports the hypothesis that not all degrees of freedom have to be used to produce treatment plans of high quality. Current research is focused on how to integrate this beamlet aggregation and refinement strategy with a voxel clustering and refinement strategy. In particular, coordinating the refinements and estimating the distance to optimality at any given iteration remain a challenge.
Footnotes
The research work presented herein was partly supported by NIH grant CA103904-01A1.
References
- 1. Alber M, Nüsslin F. IMRT optimisation under constraints for static and dynamic MLC delivery. Physics in Medicine and Biology. 2001;46(12):3229–3239. doi: 10.1088/0031-9155/46/12/311.
- 2. Bazaraa M, Sherali H, Shetty C. Nonlinear Programming - Theory and Algorithms. John Wiley & Sons, Inc.; 1979.
- 3. Craft D, Süss P, Bortfeld T. The tradeoff between treatment plan quality and required number of monitor units in intensity-modulated radiotherapy. International Journal of Radiation Oncology, Biology, Physics. 2007;67(5):1596–1605. doi: 10.1016/j.ijrobp.2006.11.034.
- 4. De Gersem W, Claus F, De Wagter C, De Neve W. An anatomy-based beam segmentation tool for intensity-modulated radiation therapy and its application to head-and-neck cancer. International Journal of Radiation Oncology, Biology, Physics. 2001;51(3):849–859. doi: 10.1016/s0360-3016(01)01727-8.
- 5. De Neve W, De Wagter C, De Jaeger K, Thienpont M, Colle C, Derycke S, Schelfhout J. Planning and delivering high doses to targets surrounding the spinal cord at the lower neck and upper mediastinal levels: static beam-segmentation technique executed with a multileaf collimator. Radiotherapy and Oncology. 1996;40:271–279. doi: 10.1016/0167-8140(96)01784-7.
- 6. Hall E. Intensity-modulated radiation therapy, protons, and the risk of second cancers. International Journal of Radiation Oncology, Biology, Physics. 2006;65(1):1–7. doi: 10.1016/j.ijrobp.2006.01.027.
- 7. Hartigan J. Clustering Algorithms. John Wiley & Sons, Inc.; 1975.
- 8. Keller-Reichenbecher MA, Bortfeld T, Levegrün S, Stein J, Preiser K, Schlegel W. Intensity modulation with the "step and shoot" technique using a commercial MLC: a planning study. International Journal of Radiation Oncology, Biology, Physics. 1999;45(5):1315–1324. doi: 10.1016/s0360-3016(99)00324-7.
- 9. Küfer KH, Monz M, Scherrer A, Süss P, Alonso F, Azizi Sultan A, Bortfeld T, Thieke C. Multicriteria optimization in intensity modulated radiotherapy planning. In: Pardalos PM, Romeijn HE, editors. Handbook of Optimization in Medicine. 2007.
- 10. Niemierko A. A generalized concept of equivalent uniform dose (EUD). Medical Physics. 1999;26(1):1100. doi: 10.1118/1.598063.
- 11. Scherrer A, Küfer KH. Accelerated IMRT plan optimization using the adaptive clustering method. Linear Algebra and its Applications (same issue).
- 12. Süss P, Küfer KH, Thieke C. Stratification for step-and-shoot MLC delivery in IMRT. Physics in Medicine and Biology. 2007;52:6039–6051. doi: 10.1088/0031-9155/52/19/022.
- 13. Thieke C, Bortfeld T, Küfer KH. Characterization of dose distributions through the max and mean dose concept. Acta Oncologica. 2002;41(2):158–161. doi: 10.1080/028418602753669535.
- 14. Wackerly D, Mendenhall W, Scheaffer R. Mathematical Statistics with Applications. Sixth Edition. Duxbury; 2002.
- 15. Webb S. The physics of three-dimensional radiation therapy. IOP Publishing Ltd; 1993.
- 16. Webb S. The physics of conformal radiotherapy. IOP Publishing Ltd; 1997.
- 17. Xia P, Verhey L. Multileaf collimator leaf-sequencing algorithm for intensity modulated beams with multiple static segments. Medical Physics. 1998;25:1424–1434. doi: 10.1118/1.598315.
- 18. Xiao Y, Galvin J, Hossain M, Valicenti R. An optimized forward-planning technique for intensity modulated radiation therapy. Medical Physics. 2000;27(9):2093–2099. doi: 10.1118/1.1289255.
- 19. Yu Y. Multiobjective decision theory for computational optimization in radiation therapy. Medical Physics. 1997;24:1445–1454. doi: 10.1118/1.598033.
