Purpose: To utilize a deformable phantom to objectively evaluate the accuracy of 11 different deformable image registration (DIR) algorithms.
Methods: The phantom represents an axial plane of the pelvic anatomy. Urethane plastic serves as the bony anatomy and urethane rubber with three levels of Hounsfield units (HU) is used to represent fat and organs, including the prostate. A plastic insert is placed into the phantom to simulate bladder filling. Nonradiopaque markers reside on the phantom surface. Optical camera images of these markers are used to measure the positions and determine the deformation from the bladder insert. Eleven different DIR algorithms are applied to the full and empty-bladder computed tomography images of the phantom (fixed and moving volumes, respectively) to calculate the deformation. The algorithms include those from MIM Software (MIM) and Velocity Medical Solutions (VEL) and nine different implementations from the deformable image registration and adaptive radiotherapy toolbox for Matlab. These algorithms warp one image to make it similar to another, but must utilize a method for regularization to avoid physically unrealistic deformation scenarios. The mean absolute difference (MAD) between the HUs at the marker locations on one image and the calculated location on the other serves as a metric to evaluate the balance between image similarity and regularization. To demonstrate the effect of regularization on registration accuracy, an additional beta version of MIM was created with a variable smoothness factor that controls the emphasis of the algorithm on regularization. The distance to agreement between the measured and calculated marker deformations is used to compare the overall spatial accuracy of the DIR algorithms. This overall spatial accuracy is also utilized to evaluate the phantom geometry and the ability of the phantom soft-tissue heterogeneity to represent patient data.
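As an illustration of the two marker-based metrics described above, the following is a minimal sketch rather than the authors' implementation. It assumes the HU values have already been sampled at the marker locations and at the DIR-calculated locations, and that the distance to agreement can be treated as the Euclidean distance between measured and calculated marker displacements; the abstract does not spell out either detail. All function names and numbers are hypothetical.

```python
import math


def mean_absolute_difference(hu_reference, hu_mapped):
    """Mean absolute HU difference over paired samples.

    hu_reference: HU sampled at the marker locations on one image.
    hu_mapped:    HU sampled at the DIR-calculated locations on the other.
    """
    if len(hu_reference) != len(hu_mapped) or len(hu_reference) == 0:
        raise ValueError("need two non-empty HU lists of equal length")
    total = sum(abs(a - b) for a, b in zip(hu_reference, hu_mapped))
    return total / len(hu_reference)


def marker_spatial_errors(measured, calculated):
    """Per-marker Euclidean distance between measured and calculated
    displacement vectors (assumed definition of the spatial error)."""
    if len(measured) != len(calculated):
        raise ValueError("need one calculated displacement per marker")
    errors = []
    for m, c in zip(measured, calculated):
        errors.append(math.sqrt(sum((mi - ci) ** 2 for mi, ci in zip(m, c))))
    return errors


# Hypothetical values for illustration only (not data from the study).
hu_at_markers = [40.0, -100.0, 10.0]
hu_at_mapped = [50.0, -90.0, 10.0]
measured_mm = [(3.0, 4.0), (0.0, 0.0)]    # camera-measured displacements
calculated_mm = [(0.0, 0.0), (0.0, 1.0)]  # DIR-calculated displacements

mad = mean_absolute_difference(hu_at_markers, hu_at_mapped)
errors = marker_spatial_errors(measured_mm, calculated_mm)
mean_error = sum(errors) / len(errors)
print(f"MAD = {mad:.2f} HU, mean spatial error = {mean_error:.2f} mm")
```

A low MAD with a high mean spatial error would correspond to the behavior the abstract attributes to an algorithm that favors image similarity over regularization.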
To evaluate the ability of the DIR algorithms to accurately transfer anatomical contours, the rectum is delineated on both the fixed and moving images. A Dice similarity coefficient is then calculated between the contour on the fixed image and that transferred, via the calculated deformation, from the moving to the fixed image.
Results: The phantom possesses sufficient soft-tissue heterogeneity to act as a proxy for patient data. Large discrepancies appear between the algorithms and the measured ground-truth deformation. VEL yields the smallest mean spatial error and a Dice coefficient of 0.90. MIM produces the lowest MAD value and the highest Dice coefficient of 0.96, but creates the largest spatial errors. Increasing the MIM smoothness factor above the default value improves the overall spatial accuracy, but the factor associated with the lowest mean error decreases the Dice coefficient to 0.85.
Conclusions: Different applications of DIR require disparate balances between image similarity and regularization. A DIR algorithm that is optimized only for its ability to transfer anatomical contours will yield large deformation errors in homogeneous regions, which is problematic for dose mapping. For this reason, these algorithms must be tested for their overall spatial accuracy. The developed phantom is an objective tool for this purpose.
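The Dice similarity coefficient used for the contour-transfer evaluation is conventionally defined as 2|A ∩ B| / (|A| + |B|). The sketch below shows that calculation under the assumption that each contour has been rasterized to a set of pixel coordinates; the function name and the example masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks, each given
    as a collection of (row, column) pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty masks agree fully.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical masks for illustration only.
fixed_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
transferred_contour = {(0, 1), (1, 0), (1, 1), (2, 1)}

dice = dice_coefficient(fixed_contour, transferred_contour)
print(f"Dice = {dice:.2f}")
```

Because the coefficient depends only on overlap at the contour, it says nothing about displacement errors inside a homogeneous region, which is why the abstract pairs it with a marker-based spatial error.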
ASJC Scopus subject areas
- Radiology, Nuclear Medicine and Imaging