AUTOMATIC FOOT ULCER SEGMENTATION USING AN ENSEMBLE OF CONVOLUTIONAL NEURAL NETWORKS

Page created by Kelly Brewer
 
CONTINUE READING
AUTOMATIC FOOT ULCER SEGMENTATION USING AN ENSEMBLE OF CONVOLUTIONAL NEURAL NETWORKS
AUTOMATIC FOOT ULCER SEGMENTATION USING AN ENSEMBLE OF
                                                                 CONVOLUTIONAL NEURAL NETWORKS

                                                                    Amirreza Mahbod?              Rupert Ecker†          Isabella Ellinger?
                                            ?
                                                Institute for Pathophysiology and Allergy Research, Medical University of Vienna, Vienna, Austria
                                                       †
                                                         Department of Research and Development, TissueGnostics GmbH, Vienna, Austria
arXiv:2109.01408v1 [eess.IV] 3 Sep 2021

                                                                  ABSTRACT                                                    1. INTRODUCTION

                                                                                                           Diabetes mellitus is one of the world most common chronic
                                                                                                           diseases with an increasing rate of prevalence over the past
                                                                                                           decades [1, 2]. Diabetes mellitus can cause several complica-
                                          Foot ulcer is a common complication of diabetes mellitus;        tions for the patient such as cardiovascular disease, retinopa-
                                          it is associated with substantial morbidity and mortality and    thy and neuropathy [3]. A serious medical condition of di-
                                          remains a major risk factor for lower leg amputation. Extract-   abetes mellitus is skin ulcers on the foot, which can lead to
                                          ing accurate morphological features from the foot wounds         amputation [4]. Up to 3% of patients with diabetes mellitus
                                          is crucial for proper treatment. Although visual and manual      have an active foot ulcer with a lifetime risk of developing a
                                          inspection by medical professionals is the common approach       foot ulcer as high as 25% [5]. Different treatments may be
                                          to extract the features, this method is subjective and error-    applied based on the foot ulcer types and their morphological
                                          prone. Computer-mediated approaches are the alternative          appearances. Parameters such as lesion length, width, area,
                                          solutions to segment the lesions and extract related morpho-     and volume need to be measured for proper treatment [6]. Vi-
                                          logical features. Among various proposed computer-based          sual inspection and measurement of foot ulcers by medical ex-
                                          approaches for image segmentation, deep learning-based           perts is a common approach to investigate the morphological
                                          methods and more specifically convolutional neural networks      features of the wounds. However, this method is subjective
                                          (CNN) have shown excellent performances for various image        and error-prone [7, 8]. An alternative approach is acquiring
                                          segmentation tasks including medical image segmentation.         images from the wound area and using computer-aided meth-
                                          In this work, we proposed an ensemble approach based on          ods to segment the lesions. Wound area is a useful predictor
                                          two encoder-decoder-based CNN models, namely LinkNet             of the final outcome; it allows for monitoring the healing pro-
                                          and UNet, to perform foot ulcer segmentation. To deal with       cess and evaluating the effect of treatment [6].
                                          limited training samples, we used pre-trained weights (Ef-           A number of computerised methods have been proposed
                                          ficientNetB1 for the LinkNet model and EfficientNetB2 for        in the literature to perform foot ulcer segmentation. These
                                          the UNet model) and further pre-training by the Medetec          methods include approaches from classical image process-
                                          dataset. We also applied a number of morphological-based         ing techniques to state-of-the-art machine learning and deep
                                          and colour-based augmentation techniques to train the mod-       learning models. K-means and fuzzy clustering, edge de-
                                          els. We integrated five-fold cross-validation, test time aug-    tection, adaptive thresholding, and region growing method
                                          mentation and result fusion in our proposed ensemble ap-         are examples of the classical computer vision approaches
                                          proach to boost the segmentation performance. Applied on         for wound segmentation [9, 10, 11]. Classical machine ap-
                                          a publicly available foot ulcer segmentation dataset and the     proaches based on hand-crafted features sets and training
                                          MICCAI 2021 Foot Ulcer Segmentation (FUSeg) Challenge,           classifiers such as multi-layer perception or support vector
                                          our method achieved state-of-the-art data-based Dice scores      machine were also used for foot ulcer segmentation [9, 12].
                                          of 92.07% and 88.80%, respectively. Our developed method         Similar to other medical image segmentation tasks such as
                                          achieved the first rank in the FUSeg challenge leaderboard.      skin lesion segmentation [13] or nuclei segmentation in his-
                                          The Dockerised guideline, inference codes and saved trained      tological images [14], deep learning and convolutional neural
                                          models are publicly available in the published GitHub repos-     networks (CNN)-based approaches have shown to outper-
                                          itory:https://github.com/masih4/Foot_Ulcer_                      form other approaches for foot ulcer segmentation [15].
                                          Segmentation.                                                    Some well-known CNN-based architectures such as fully
                                                                                                           convolutional neural network (FCN), UNet, and mask-RCNN
                                              Index Terms— Foot ulcer, segmentation, machine learn-        were utilised to perform ulcer segmentation in former stud-
                                          ing, deep learning, ensemble, medical image analysis             ies [15, 16, 17, 18].
AUTOMATIC FOOT ULCER SEGMENTATION USING AN ENSEMBLE OF CONVOLUTIONAL NEURAL NETWORKS
Fig. 1. The generic workflow of the proposed method. The training phase is depicted on the top and the inference phase is
shown on the bottom.

    In this work, inspired by our former studies for other med-
ical image segmentation tasks [13, 19, 20], we proposed and
developed a model based on two well-established encoder-
decoder-based CNNs, namely UNet and LinkNet, to segment
wounds in clinical foot images. We used pre-trained CNNs,
5-fold cross-validation, test time augmentation (TTA), and re-
sult fusion to boost the segmentation performance. We evalu-
ated our method on a publicly available dataset and the MIC-
CAI 2021 Foot Ulcer Segmentation (FUSeg) Challenge and
achieved the state-of-the-art segmentation performances for
both cases 1 .
                                                                     Fig. 2. Image samples of the Medetec dataset on the left and
                                                                     the chronic wound dataset/FUSeg dataset of the right.
                         2. METHOD

The generic workflow of the proposed method for training and
testing is shown in Fig. 1.

                                                                        to localise and crop the wounds inside the images. For
2.1. Datasets
                                                                        non-square images, zero-padding was utilised to have a
We used the following publicly available datasets in this               fixed size of 512 × 512 pixels for the entire dataset. From
study:                                                                  1010 images, 810 images were used for training and 200
  • Medetec dataset [21]: this dataset consists of 152 clini-           images were kept aside for testing as suggested in [15].
    cal images with the corresponding segmentation masks of           • FUSeg dataset [15]: this dataset is an extended version of
    several types of open foot wounds. As suggested by [15],            the chronic wound dataset. FUSeg dataset was used as the
    the images were zero-padded and resized to a fixed size             training and testing sets of the MICCAI 2021 Foot Ulcer
    of 224 × 224 pixels. We used this dataset only for pre-             Segmentation Challenge. It contains 1210 images where
    training of the utilised segmentation models and not for            1010 images (identical to the chronic wound dataset) were
    testing (refer to Fig. 1).                                          provided for training and 200 images were used for eval-
  • Chronic wound dataset [15]: this dataset contains 1010              uation. The ground truth segmentation masks of the 200
    clinical images, taken with a digital camera, from 889 pa-          test images were kept private by the challenge organisers
    tients. The corresponding segmentation masks of all im-             and were only used for evaluation purposes. Similar to the
    ages are also publicly available. To have a fixed size for all      chronic wound dataset, all images had a fixed size of 512
    images, a YOLOV3 object detection model [22] was used               × 512 pixels.
  1 Challenge leaderboard (accessed on 2021-09-03):    https://
uwm-bigdata.github.io/wound-segmentation/                               Fig. 2 shows some image samples of the used datasets.
2.2. CNN models                                                    2.5. Evaluation

We used two CNN models models, namely UNet [23] and                To evaluate the segmentation performance of the proposed
LinkNet [24]. Instead of using the models in their plain forms,    method and also to compare the results to the other state-of-
we used pre-trained CNNs in the decoder parts of the models.       the-art models for the chronic would dataset, we used preci-
For LinkNet, a pre-trained EfficientNetB1 model [25] was           sion, recall, and data-based Dice score as suggested in [15].
used and for UNet, we utilised the EfficientNetB2 model [25].      Besides these scores, we also report the image-based Dice
As shown in Fig. 1, the entire Medetec dataset was also used       score and data-based intersection over union (IOU) score for
for pre-training. To train the models, we used random scal-        our proposed method. For the FUSeg dataset, we only report
ing, random rotations, vertical and horizontal flipping, and       the results based on data-based Dice scores as only this score
brightness and contrast shifts as augmentation techniques as       was provided by the MICCAI 2021 Foot Ulcer Segmentation
suggested in [13]. We trained each model for 80 epochs with        Challenge organisers.
a learning rate (LR) scheduler that reduced the LR by 90% af-
ter every 25 epoch. The initial LR was set to 0.001. The batch
size was set to 4 and we used the full-size images to train the                    3. RESULTS & DISCUSSION
models. We used Adam optimiser and a combination of Dice
loss and Focal loss for model training. For each dataset, five-    We report the results for the test set of the chronic wound
fold cross-validation was exploited and the best models based      dataset in Table. 1. We compare our proposed method with a
on the segmentation scores of the validation sets were saved       number of deep learning-based approaches that were reported
to be used in the inference phase.                                 in [15]. The last three rows of the table show the performance
                                                                   of our proposed method. Besides the final fusion scheme (last
                                                                   row), we also report the results for single LinkNet (with Effi-
                                                                   cientNetB1 backbone) and UNet (with EfficientNetB2 back-
2.3. Ensemble
                                                                   bone) models in Table. 1. To achieve the reported results
To boost the segmentation performance, we used three dis-          in the last three rows, 5-fold cross-validation and TTA were
tinct ensembling strategies, namely 5-fold cross-validation,       used as explained in Section. 2.3. As the results in the last
TTA and result fusion from the two exploited models.               three rows show, our proposed method even without the fi-
                                                                   nal ensemble of the LinkNet and UNet models, outperforms
     Instead of using the entire training set to train a single
                                                                   the other state-of-the-art models for most evaluation indexes
model, we divided it randomly into five subsets. Then, we
                                                                   (all except precision). Moreover, the results in the last row
train five sub-models based on the derived subsets (i.e. for
                                                                   show that the final fusion scheme yields slightly better overall
each of the sub-models, we used four subsets for training and
                                                                   segmentation performances in comparison to single LinkNet
the hold-out set for validation). In the inference phase, we
                                                                   and UNet models for image-based Dice score, recall and data-
sent the test images to all five derived sub-models and then
                                                                   based Dice score.
took the average over the results.
                                                                       With the same methodology, we attended the MICCAI
     As shown in former studies for various image segmenta-
                                                                   2021 Foot Ulcer Segmentation Challenge and submitted our
tion or classification tasks [26, 27], TTA can boost the overall
                                                                   inference codes and saved models in the frame of a Dock-
performance. Therefore, we used TTA in the inference phase
                                                                   ersied container. The results from our approach and other
to have a better segmentation performance. For TTA, we used
                                                                   top four participating teams in the challenge are reported in
0, 90, 180, and 270-degree rotations as well as horizontal flip-
                                                                   Table. 2. It is worth mentioning that the reported results in
ping.
                                                                   Table. 2 were directly calculated by the challenge organisers
     As we trained two distinct models (LinkNet with Effi-         and are available in the challenge leaderboard 2 . As the re-
cientNetB1 backbone and UNet with EfficientNetB2 back-             sults show, our method outperforms the other approaches and
bone), we fused their results in the inference phase for a given   we achieved first place in the challenge.
test image. We used averaging to fuse the prediction proba-            Although our proposed approach delivers promising per-
bility masks from the two models as shown in Fig. 1.               formances for foot ulcer segmentation, the main drawback of
                                                                   our method is the time needed to perform all described en-
                                                                   sembling in the inference phase that may hinder the practical
2.4. Post-processing                                               application of such an extensive ensemble strategy in clinical
                                                                   routine. However, this is mainly application dependent and
To form the final segmentation masks for the test images,          can be still a handful and accurate approach in many situa-
first, we binarised the fused prediction probability vectors us-   tions where the test set sizes are not that large.
ing a 0.5 threshold. We also apply two post-processing steps,
namely filling the wholes and removing very small detected            2 As the challenge is open for new post submissions, the ranking may be

objects, with the identical settings as described in [15].         changed in the future
Table 1. Comparison of the segmentation performance of our proposed method with other state-of-the-art segmentation models
for the chronic wound dataset [15]. For each evaluation index, the best results are shown in bold.
                                     Image-based                                       Data-based Data-based
             Method                                   Precision (%) Recall (%)
                                     Dice (%)                                          IOU (%)     Dice (%)
             VGG16                        N/A              83.91            78.35           N/A       81.03
             SegNet                       N/A              83.66            86.49           N/A       85.05
             UNet                         N/A              89.04            91.29           N/A       90.15
             Mask-RCNN                    N/A              94.30            86.40           N/A       90.20
             MobileNetV2                  N/A              90.86            89.76           N/A       90.30
             MobileNetV2 + CCL            N/A              91.01            89.97           N/A       90.47
             LinkNet-EffB1               83.93             92.88            91.33          85.35      92.09
             UNet-EffB2                  84.09             92.23            91.57          85.01      91.90
             Ensemble                    84.42             92.68            91.80          85.51      92.07

                                                                    diabetes atlas, 9th edition,” Diabetes Research and Clin-
Table 2. Top five performers of the MICCAI 2021 Foot Ulcer
                                                                    ical Practice, vol. 157, pp. 107843, 2019.
Segmentation (FUSeg) Challenge. Our proposed approach
achieved first place in the challenge leaderboard (shown as     [2] L. Guariguata, D.R. Whiting, I. Hambleton, J. Beagley,
bold in the table).                                                 U. Linnenkamp, and J.E. Shaw, “Global estimates of
 Team                      Approach           Ref Dice(%)           diabetes prevalence for 2013 and projections for 2035,”
 Mahbod et al.             this work           –      88.80         Diabetes Research and Clinical Practice, vol. 103, no.
 Yichen Zhang       UNet with HarDNet68 N/A           87.57         2, pp. 137–149, 2014.
 Adrian Galdran         Stacked U-Nets       [28]     86.91
                                                                [3] Jessica L Harding, Meda E Pavkov, Dianna J Magliano,
 Hong et al.                  N/A            N/A      86.27
                                                                    Jonathan E Shaw, and Edward W Gregg, “Global trends
 Qayyum et al.         UNet with ASPP        N/A      82.29
                                                                    in diabetes complications: a review of current evidence,”
                                                                    Diabetologia, vol. 62, no. 1, pp. 3–16, 2019.
                    4. CONCLUSION                               [4] Dennis F. Bandyk, “The diabetic foot: Pathophysiol-
                                                                    ogy, evaluation, and treatment,” Seminars in Vascular
Computer-mediated foot ulcer segmentation can be consid-            Surgery, vol. 31, no. 2, pp. 43–48, 2018.
ered as an alternative solution for manual analysis. In this
work, we proposed and developed an ensemble strategy based      [5] Jonathan Zhang Ming Lim, Natasha Su Lynn Ng, and
on fully convolutional neural networks to segment foot ul-          Cecil Thomas, “Prevention and treatment of diabetic
cers in clinical images. Applied on two datasets, our method        foot ulcers,” Journal of the Royal Society of Medicine,
shows excellent segmentation performances and outperforms           vol. 110, no. 3, pp. 104–109, 2017.
the other state-of-the-art algorithms.
                                                                [6] Jens A Jørgensen, Line Bisgaard and Sørensen, Gre-
                                                                    gor BE Jemec, and Knud B Yderstræde, “Methods to
              5. ACKNOWLEDGMENTS                                    assess area and volume of wounds – a systematic re-
                                                                    view,” International Wound Journal, vol. 13, no. 4, pp.
This project was supported by the Austrian Research Promo-          540–553, 2016.
tion Agency (FFG), No.872636. We would like to also thank
NVIDIA for their generous GPU donation.                         [7] Kleopatra Alexiadou and John Doupis, “Management
                                                                    of diabetic foot ulcers,” Diabetes Therapy, vol. 3, no. 1,
                                                                    pp. 4, 2012.
                    6. REFERENCES
                                                                [8] Rania Niri, Hassan Douzi, Yves Lucas, and Sylvie
 [1] Pouya Saeedi, Inga Petersohn, Paraskevi Salpea, Belma          Treuillet, “A superpixel-wise fully convolutional neural
     Malanda, Suvi Karuranga, Nigel Unwin, Stephen Co-              network approach for diabetic foot ulcer tissue classifi-
     lagiuri, Leonor Guariguata, Ayesha A. Motala, Kather-          cation,” in International Conference on Pattern Recog-
     ine Ogurtsova, Jonathan E. Shaw, Dominic Bright, and           nition. Springer, 2021, pp. 308–320.
     Rhys Williams, “Global and regional diabetes preva-
     lence estimates for 2019 and projections for 2030 and      [9] Bo Song and Ahmet Sacan, “Automated wound identi-
     2045: Results from the international diabetes federation       fication system based on image segmentation and artifi-
cial neural networks,” in IEEE International Conference       [16] Muñoz, PL and Rodrı́guez, R and Montalvo, N, “Au-
     on Bioinformatics and Biomedicine, 2012, pp. 1–4.                  tomatic segmentation of diabetic foot ulcer from mask
                                                                        region-based convolutional neural networks,” Journal
[10] Mohammad Faizal Ahmad Fauzi, Ibrahim Khansa,                       of Biomedical Research and Clinical Investigation, vol.
     Karen Catignani, Gayle Gordillo, Chandan K. Sen, and               1, no. 1.1006, 2020.
     Metin N. Gurcan, “Computerized segmentation and
     measurement of chronic wound images,” Computers in            [17] Manu Goyal, Moi Hoon Yap, Neil D. Reeves, Satyan
     Biology and Medicine, vol. 60, pp. 74–85, 2015.                    Rajbhandari, and Jennifer Spragg, “Fully convolutional
                                                                        networks for diabetic foot ulcer segmentation,” in IEEE
[11] Kittichai Wantanajittikul, Sansanee Auephanwiriyakul,              International Conference on Systems, Man, and Cyber-
     Nipon Theera-Umpon, and Taweethong Koanantakool,                   netics, 2017, pp. 618–623.
     “Automatic segmentation and degree identification in
     burn color images,” in The 4th 2011 Biomedical En-            [18] Niri Rania, Hassan Douzi, Lucas Yves, and Treuil-
     gineering International Conference, 2012, pp. 169–173.             let Sylvie, “Semantic segmentation of diabetic foot
                                                                        ulcer images: Dealing with small dataset in dl ap-
[12] Lei Wang, Peder C. Pedersen, Emmanuel Agu, Di-                     proaches,” in Image and Signal Processing, Abder-
     ane M. Strong, and Bengisu Tulu, “Area determina-                  rahim El Moataz, Driss Mammass, Alamin Mansouri,
     tion of diabetic foot ulcer images using a cascaded two-           and Fathallah Nouboud, Eds., Cham, 2020, pp. 162–
     stage svm-based classification,” IEEE Transactions on              169.
     Biomedical Engineering, vol. 64, no. 9, pp. 2098–2109,
     2017.                                                         [19] Amirreza Mahbod, Gerald Schaefer, Benjamin Bancher,
                                                                        Christine Löw, Georg Dorffner, Rupert Ecker, and Is-
[13] Amirreza Mahbod, Philipp Tschandl, Georg Langs, Ru-                abella Ellinger, “CryoNuSeg: A dataset for nuclei in-
     pert Ecker, and Isabella Ellinger, “The effects of skin le-        stance segmentation of cryosectioned h&e-stained his-
     sion segmentation on the performance of dermatoscopic              tological images,” Computers in Biology and Medicine,
     image classification,” Computer Methods and Programs               vol. 132, pp. 104349, 2021.
     in Biomedicine, vol. 197, pp. 105725, 2020.
                                                                   [20] Amirreza Mahbod, Gerald Schaefer, Christine
[14] Ruchika Verma, Neeraj Kumar, Abhijeet Patil,                       Löw, Georg Dorffner, Rupert Ecker, and Isabella
     Nikhil Cherian Kurian, Swapnil Rane, Simon Graham,                 Ellinger, “Investigating the impact of the bit depth
     Quoc Dang Vu, Mieke Zwager, Shan E Ahmed Raza,                     of fluorescence-stained images on the performance
     Nasir Rajpoot, Xiyi Wu, Huai Chen, Yijie Huang,                    of deep learning-based nuclei instance segmentation,”
     Lisheng Wang, Hyun Jung, G Thomas Brown, Yanling                   Diagnostics, vol. 11, no. 6, 2021.
     Liu, Shuolin Liu, Seyed Alireza Fatemi Jahromi,
     Ali Asghar Khani, Ehsan Montahaei, Mahdieh So-                [21] Steve Thomas,         “Medetec wound database,”
     leymani Baghshah, Hamid Behroozi, Pavel Semkin,                    http://www.medetec.co.uk/files/
     Alexandr Rassadin, Prasad Dutande, Romil Lodaya,                   medetec-image-databases.html,            2021,
     Ujjwal Baid, Bhakti Baheti, Sanjay Talbar, Amirreza                Accessed: 2021-09-03.
     Mahbod, Rupert Ecker, Isabella Ellinger, Zhipeng
     Luo, Bin Dong, Zhengyu Xu, Yuehan Yao, Shuai Lv,              [22] Joseph Redmon and Ali Farhadi,     “YOLOv3:
     Ming Feng, Kele Xu, Hasib Zunair, Abdessamad Ben                   An incremental improvement,”   arXiv preprint
     Hamza, Steven Smiley, Tang-Kai Yin, Qi-Rui Fang,                   arXiv:1804.02767, 2018.
     Shikhar Srivastava, Dwarikanath Mahapatra, Lubomira           [23] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Con-
     Trnavska, Hanyun Zhang, Priya Lakshmi Narayanan,                   volutional networks for biomedical image segmenta-
     Justin Law, Yinyin Yuan, Abhiroop Tejomay, Aditya                  tion,” in International Conference on Medical Image
     Mitkari, Dinesh Koka, Vikas Ramachandra, Lata Kini,                Computing and Computer-Assisted Intervention, 2015,
     and Amit Sethi, “MoNuSAC2020: A multi-organ                        pp. 234–241.
     nuclei segmentation and classification challenge,” IEEE
     Transactions on Medical Imaging, pp. 1–1, 2021.               [24] A. Chaurasia and E. Culurciello, “LinkNet: Exploiting
                                                                        encoder representations for efficient semantic segmenta-
[15] Chuanbo Wang, DM Anisuzzaman, Victor Williamson,                   tion,” in IEEE Visual Communications and Image Pro-
     Mrinal Kanti Dhar, Behrouz Rostami, Jeffrey Niezgoda,              cessing, Dec 2017, pp. 1–4.
     Sandeep Gopalakrishnan, and Zeyun Yu, “Fully auto-
     matic wound segmentation with deep convolutional neu-         [25] Mingxing Tan and Quoc V Le, “EfficientNet: Rethink-
     ral networks,” Scientific Reports, vol. 10, no. 1, pp. 1–9,        ing model scaling for convolutional neural networks,”
     2020.                                                              arXiv preprint arXiv:1905.11946, 2019.
[26] Nikita Moshkov, Botond Mathe, Attila Kertesz-Farkas,
     Reka Hollandi, and Peter Horvath, “Test-time augmen-
     tation for deep learning-based cell segmentation on mi-
     croscopy images,” Scientific Reports, vol. 10, no. 1, pp.
     1–7, 2020.

[27] Amirreza Mahbod, Gerald Schaefer, Chunliang Wang,
     Rupert Ecker, Georg Dorffner, and Isabella Ellinger,
     “Investigating and exploiting image resolution for trans-
     fer learning-based skin lesion classification,” in Inter-
     national Conference on Pattern Recognition, 2021, pp.
     4047–4053.

[28] Adrian Galdran, Gustavo Carneiro, and Miguel
     A González Ballester, “Double encoder-decoder net-
     works for gastrointestinal polyp segmentation,” in Inter-
     national Conference on Pattern Recognition, 2021, pp.
     293–307.
You can also read