Dr. Alexander Horsch
Title: Senior Lecturer
Munich University of Technology,
Email: [email protected]
Dr. Thomas M. Lehmann
Department of Medical Informatics,
Aachen University of Technology,
Email: [email protected]
For a copy of the references contained in this article, please contact [email protected].
According to the European Federation for Medical Informatics (EFMI) Working Group on Medical Image Processing (WG-MIP), there is a lack of integration of medical image processing into routine applications of image management systems. In general, the insufficient handling of both processing algorithms and image data is seen as a major weak point. Two particular problems are identified: (i) reliable evaluation of algorithms for medical image processing and (ii) automatic content analysis of medical images on a high level of abstraction. For the first point, the activities of the EFMI WG-MIP are described, which are based on the EFMI reference image database initiative. For the second, the focus is on content-based image access in medical applications, a topic of increasing importance as the data volume of digital images acquired in the healthcare industry explodes.
The Past and Present of Medical Image Processing
Medical image processing and analysis have been active fields of research for more than 25 years1,2,3. Currently, a method-driven modelling approach dominates the field of biomedical image processing, as algorithms for registration, segmentation, classification and measurement are developed on a methodological level. The future of medical image processing is, however, seen in task-oriented solutions that are integrated into diagnosis, intervention planning, therapy and follow-up studies4.
In 2001, the WG-MIP was established within the EFMI to foster this integration5. In particular, the WG-MIP aims at supporting the discussion of how to integrate decision support by means of medical image processing into clinical practice, including the important topics of clinical evaluation, standardisation and technology transfer.
Successful technology transfer is based on evidence, i.e., research and approvals. However, evidence in complex domains such as medicine, pharmacy and medical image processing cannot be created without appropriate evaluation methods and validation platforms. The latter serve as an environment for testing the performance of novel methods and systems in terms of absolute measurement, comprehensive benchmarking and detailed comparison with known and proven methods and systems.
Evaluating Computer Algorithms for Medical Image Processing
In medical image processing, a non-trivial problem exists with respect to validation environments. The development of new methods is typically based on images taken from one or a few image acquisition units. Hence, the algorithms tend to be optimised for these machines and can seldom be used on other devices without substantial modifications. Furthermore, research groups usually use different and incompatible datasets, which prevents comparison of methods. Image datasets obtained from only one research center never represent the medical variety desirable for sound clinical studies. In academia, innovation in medical image processing is emphasised in terms of algorithmic novelty. Instead of sound validation and evaluation on clinically relevant data, only feasibility studies are conducted. Consequently, the industry has problems with the acceptance of image processing applications as automated tools in the approval procedures by certification authorities.
In 2002, the WG-MIP of EFMI started an initiative aiming to establish a reference image database in order to support reliable validation and comparison of methods and systems. In close contact with other initiatives, especially with the National Institutes of Health (NIH, Bethesda, MD, USA) and the Insight Software Consortium (ISC, Clifton Park, NY, USA), the reference database was built for research and development groups in medical image processing as well as the industry. The concept of the EFMI reference image database initiative consists of the following main points6:
+ create an overall, economically sustainable framework for life cycles of reference image datasets and corresponding tools meeting the demands for validation and quality control of both academia and industry in research and approval processes;
+ establish a board of experts and let them define criteria to assess the relevance of a medical problem with respect to the importance of image processing;
+ perform an assessment of medical problems using the defined criteria and identify the most relevant ones with a high potential of improvement of diagnostic and treatment outcomes through the application of digital image processing methods;
+ specify the image datasets needed and quality criteria for scientifically sound validation and evaluation of these highly relevant problems, and standardise data structures for annotations (gold standards7);
+ collect image data from image providers (single institution or a group of institutions) that meet these specifications and prepare validated image datasets to serve as common references for research and development groups in academia and industry;
+ set up a platform for the dissemination of the reference image datasets, including bilateral cooperation agreements or contracts between provider and user with or without licensing, depending on the type of dataset and usage, and
+ follow up the impact of the dissemination in terms of outcome indicators such as number and quality of published results, or number, costs and time for approval processes using the datasets, compared with before their introduction.
During the last three years, conceptual and promotional work was completed by the EFMI WG-MIP. Although there is, to a certain extent, awareness of the usefulness or even necessity of the initiative, the concrete commitment to contribute is still rather limited. Academic institutions do not have the resources to set up such a framework, and therefore have to focus on the outcomes of their own research projects, including image data acquisition and management. Industry concentrates on the procedures required by the regulatory authorities and struggles with development and approval cycles that are too long and threaten its economic benefits. Both sides would benefit from an approved and powerful common platform for validation. Since setting up such a platform needs joint efforts from the public sector and industry, the WG-MIP tries to form a strong alliance of academia and industry to create a model for a sustainable platform which will be implemented in a common public-private effort. Currently, an ongoing discussion among representatives of public initiatives and industry is fostered by workshops and meetings at various events (recently in the USA and Europe). The goal is to coordinate and strengthen the activities and make the results available to the global community.
Content-Based Management of Medical Images
Digital image archives in medicine are required, and must be managed, not only for the sake of evaluation and validation. However, such image repositories are growing dramatically in volume. In addition to the growing number of digital modalities, improved resolution in time and space results in more and more medical images.
With the increasing data volume of medical images that are routinely acquired in today’s healthcare institutions, common methods of image management become inefficient. Even in modern picture archiving and communication systems (PACS) that are based on the Digital Imaging and Communications in Medicine (DICOM) standard, image data is addressed by alphanumerical indexes such as patient name and examination date. Since an image tells more than a thousand words, recall and precision of this type of medical image information retrieval are generally limited8,9.
Content-based access to images relies on numerical features that are computed from the pixel values. In the medical field, the context of an image might change between the time the image was captured and stored, and the time of image retrieval. It is therefore difficult to define appropriate features that satisfy complex queries at the time of data entry. As a solution, features are extracted on three different levels:
Global Features: On a basic level, a numerical feature vector is extracted from the entire image or volume dataset. Using this representation, medical images can be automatically categorised according to the anatomy (A) and bio-system (B) shown in the image as well as the creation (C) and direction (D) of imaging. Relying on a reference database of more than 10,000 images, this categorisation can be performed with an error rate of about 15%, 9% or less than 5% if the best match or a set of the five or ten best matches, respectively, is considered10. More recently, this annotated reference image database has also been used for benchmarking and comparison of different algorithms for automatic image annotation11.
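The effect of considering only the best match versus a set of the k best matches can be illustrated with a small sketch. It assumes a plain Euclidean nearest-neighbour search over global feature vectors, which is not necessarily the classifier behind the figures quoted above; the function name `top_k_error` and the toy feature vectors and labels are hypothetical.

```python
import math

def top_k_error(ref_features, ref_labels, query_features, query_labels, k):
    """Fraction of queries whose true category does not occur among the
    categories of the k nearest reference images (Euclidean distance)."""
    misses = 0
    for feat, true_label in zip(query_features, query_labels):
        # Rank reference images by distance to the query feature vector.
        order = sorted(range(len(ref_features)),
                       key=lambda i: math.dist(ref_features[i], feat))
        top_labels = {ref_labels[i] for i in order[:k]}
        if true_label not in top_labels:
            misses += 1
    return misses / len(query_labels)

# Hypothetical two-dimensional global features and category labels.
refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
labels = ["chest", "hand", "chest", "skull"]
queries = [[0.9, 0.1], [0.1, 0.8], [0.6, 0.0]]
truth = ["hand", "chest", "chest"]

error_best_match = top_k_error(refs, labels, queries, truth, k=1)
error_two_best = top_k_error(refs, labels, queries, truth, k=2)
```

In this toy setting the third query is closest to a "hand" image, so it counts as an error for the best match only, but its second-nearest neighbour carries the correct label. Enlarging the candidate set can therefore only lower the error rate, never raise it.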
Local Features: On the next level of complexity, image objects are modeled as local regions of interest. Such an approach involves several challenges. First, meaningful regions that correspond to objects must be extracted from the medical imagery. Since the level of detail of these objects depends on the context and application (e.g., the entire bone for maturity but a fracture as a small part of the bone for emergencies), a multi-scale partitioning should be used12. This is done in the Image Retrieval in Medical Applications (IRMA) project13. Then, each region on each level can be represented by a numerical feature vector describing shape and texture. First experiments were made on a set of 105 radiographs of human hands, which were taken arbitrarily from the routine of bone age assessment. For a query for the metacarpal bones based on 25 manually selected sample regions, recall and precision of 0.6 and 0.53, respectively, are obtained for the images that have not been used for training14. For automatic training, the best result was obtained using a support vector machine. Based on 50 training regions, recall and precision reached 0.58 and 0.67, respectively. In relation to the complexity of the problem, these results are very promising.
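For readers less familiar with the two retrieval measures, the following minimal sketch shows how recall and precision are computed for a single query. The function name `recall_precision` and the region identifiers are hypothetical and are not taken from the IRMA experiments.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    recall    = retrieved relevant items / all relevant items
    precision = retrieved relevant items / all retrieved items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical identifiers: five regions truly show the queried object,
# and the system returns four regions, three of them correct.
relevant_regions = {1, 2, 3, 4, 5}
retrieved_regions = {3, 4, 5, 6}
recall, precision = recall_precision(retrieved_regions, relevant_regions)
```

Here three of the five relevant regions are found (recall 0.6) and three of the four returned regions are correct (precision 0.75).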
Structural Features: However, modeling individual objects in medical images is insufficient for many applications. For instance, maturity assessment of infants is based on the size and shape of several bones as well as their distances. In other words, a spatial or temporal constellation of multiple objects within an image must be regarded8. In the IRMA project, structural prototypes are trained from manual references, where node and edge attributes are represented by Gaussian Mixture Models (GMM). Edge attributes such as the normalised distance, the angle between the main axes of two regions, or the relative gray scale are used to represent spatial and/or temporal relations between individual objects (scene description). Accordingly, image similarity is expressed by means of graph-to-subgraph matching techniques. In particular, a neural network based on the approach of Schädler and Wysotzki15 is used to efficiently compute the graph-based image similarity.
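The edge attributes named above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the `Region` structure, the normalisation of the distance by the image diagonal, and the definition of relative gray scale as a difference of mean gray values divided by the gray range are all hypothetical choices, not the definitions used in the IRMA project. The GMM modelling and the graph matching are not shown.

```python
import math
from dataclasses import dataclass

@dataclass
class Region:
    cx: float          # centroid x (pixels)
    cy: float          # centroid y (pixels)
    axis_angle: float  # orientation of the main axis (radians)
    mean_gray: float   # mean gray value inside the region

def edge_attributes(a, b, image_diagonal, gray_range=255.0):
    """Attributes of the edge linking regions a and b in a scene graph."""
    # Centroid distance, normalised by the image diagonal (assumption).
    distance = math.hypot(a.cx - b.cx, a.cy - b.cy) / image_diagonal
    # Main axes are undirected lines, so fold the angle into [0, pi/2].
    diff = abs(a.axis_angle - b.axis_angle) % math.pi
    angle = min(diff, math.pi - diff)
    # Relative gray scale as a normalised difference (assumption).
    relative_gray = (a.mean_gray - b.mean_gray) / gray_range
    return distance, angle, relative_gray

# Two hypothetical regions in an image with a diagonal of 10 pixels.
bone_a = Region(cx=0.0, cy=0.0, axis_angle=0.0, mean_gray=100.0)
bone_b = Region(cx=3.0, cy=4.0, axis_angle=0.75 * math.pi, mean_gray=50.0)
distance, angle, relative_gray = edge_attributes(bone_a, bone_b, 10.0)
```

Attributes of this kind are invariant to image translation, and normalising the distance also reduces the dependence on image size. This is one plausible reason to prefer them over raw pixel coordinates when comparing scenes.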
Although research in medical image processing is only beginning to develop such sophisticated methods of image analysis and interpretation, it is foreseen that in the near future these methods will be required to handle the increasing volume of image data in healthcare. Content-based image management supports research, diagnostics and training of physicians. It will open new opportunities for case-based reasoning and evidence-based medicine.
The technology transfer of medical image processing into clinical routine application requires standardisation and interoperability. In particular, standardised image databases must be established to support reliable and comprehensive evaluation of algorithms. Also, more sophisticated approaches for modeling and understanding the content of medical images are required to support an interoperable management of image databases.
The authors would like to thank the Co-Chairs of the EFMI WG-MIP, Thomas Wittenberg, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany, and Vytenis Punys, Kaunas University of Technology, Kaunas, Lithuania, for their helpful comments on the manuscript.