Machine learning for medical imaging - Daniel Remondini DIFA - INFN Sezione di Bologna

 
Machine learning for
medical imaging

Daniel Remondini
DIFA – daniel.remondini@unibo.it
INFN Sezione di Bologna
State of the art – “Big data” challenge

  Large amounts of public data are available (BIG, i.e. at least TB-sized):

  Heterogeneous types of information
  (imaging, omics, clinical)

  Databases connect and integrate different data types
  (gene and metabolic networks, clinical trials, in vitro experiments,
  catalogues of drug effects and targets)

  Increased computing and storage power (HPC, GPU, Cloud)

  Rapid availability and management of data
Public databases and repositories

 Allow in silico meta-analyses and studies
 Provide preliminary information before new experiments

      Transcriptome, Epigenomics, Drugs, Clinical trials, protein structure, …
Big Data for biomedical studies

TCIA – The Cancer Imaging Archive
CT, PET, MRI (NMR) data

TCGA – The Cancer Genome Atlas
Omics (GEP, NGS, SNP, MET)
TCIA-TCGA integrated imaging/omics databases
Collection   Cancer type                                                        Modalities                             # subjects
TCGA-BLCA    Bladder Urothelial Carcinoma                                       CT, CR, MR, PT                             106
TCGA-BRCA    Breast Cancer                                                      MR, MG                                     139
TCGA-CESC    Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma   MR                                          54
TCGA-COAD    Colon Adenocarcinoma                                               CT                                          25
TCGA-ESCA    Esophageal Carcinoma                                               CT                                          16
TCGA-GBM     Glioblastoma Multiforme                                            MR, CT, DX                                 262
TCGA-HNSC    Head and Neck Squamous Cell Carcinoma                              CT, MR, PT, RTSTRUCT, RTPLAN, RTDOSE       227
TCGA-KICH    Kidney Chromophobe                                                 CT, MR                                      15
TCGA-KIRC    Kidney Renal Clear Cell Carcinoma                                  CT, MR, CR                                 267
TCGA-KIRP    Kidney Renal Papillary Cell Carcinoma                              CT, MR, PT                                  33
TCGA-LGG     Low Grade Glioma                                                   MR, CT                                     199
TCGA-LIHC    Liver Hepatocellular Carcinoma                                     MR, CT, PT                                  97
TCGA-LUAD    Lung Adenocarcinoma                                                CT, PT, NM                                  69
TCGA-LUSC    Lung Squamous Cell Carcinoma                                       CT, NM, PT                                  37
TCGA-OV      Ovarian Serous Cystadenocarcinoma                                  CT, MR                                     143
TCGA-PRAD    Prostate Cancer                                                    CT, PT, MR                                  14
TCGA-READ    Rectum Adenocarcinoma                                              CT, MR                                       3
TCGA-SARC    Sarcomas                                                           CT, MR                                       5
TCGA-STAD    Stomach Adenocarcinoma                                             CT                                          46
TCGA-THCA    Thyroid Cancer                                                     CT, PT                                       6
TCGA-UCEC    Uterine Corpus Endometrial Carcinoma                               CT, CR, MR, PT                              58

 TCIA – public database of biomedical imaging for many tumours
 For 21 tumour types, omics data are also available in TCGA (same samples)
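
Because the imaging and omics data come from the same samples, the two archives can in principle be linked by patient identifier. The sketch below is only an illustration of that idea, not a TCIA or TCGA API: it assumes that the imaging patient ID equals the first three dash-separated fields of the TCGA sample barcode, and every identifier in it is invented. The helper names `patient_id` and `match_imaging_to_omics` are hypothetical.

```python
from collections import defaultdict


def patient_id(barcode: str) -> str:
    """Return the patient-level prefix (first three fields) of a TCGA-style barcode."""
    return "-".join(barcode.split("-")[:3])


def match_imaging_to_omics(imaging_ids, omics_barcodes):
    """Map each imaging patient ID to the omics barcodes that share its prefix."""
    by_patient = defaultdict(list)
    for barcode in omics_barcodes:
        by_patient[patient_id(barcode)].append(barcode)
    # Keep only patients present in both collections, in imaging order.
    return {pid: by_patient[pid] for pid in imaging_ids if pid in by_patient}


# Toy identifiers, invented for illustration only (assumption, not real data).
imaging_ids = ["TCGA-AA-0001", "TCGA-AA-0002", "TCGA-AA-0003"]
omics_barcodes = [
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0001-10A-01D-0000-01",
    "TCGA-AA-0003-01A-11R-0000-07",
]

matched = match_imaging_to_omics(imaging_ids, omics_barcodes)
for pid, barcodes in matched.items():
    print(pid, len(barcodes))
```

Patients with imaging but no omics entry (here the second toy ID) simply drop out of the result, which is the usual starting point for an integrated imaging/omics cohort.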
Big Data for clinical studies

ADNI – Alzheimer's Disease Neuroimaging Initiative
(imaging, omics, clinical, biospecimens)

ABIDE – Autism Brain Imaging Data Exchange
(imaging data on control and autistic samples)

IXI – Information eXtraction from Images
(MRI of 600 normal, healthy subjects)
Omics multivariate analysis: examples
   TCGA database
     11 Cancer types
     > 2000 samples
     [Figure labels: Glioblastoma, Lung Adenocarcinoma, Lung Squamous, Breast, Kidney, Ovarian, Colon, Endometrial, Rectum]

Drug repurposing

Target identification

   [Figure: first page of an open-access article in Scientific Reports (www.nature.com/scientificreports), 2017]

   Identification of a T cell gene expression clock obtained by exploiting a MZ twin design
   Daniel Remondini, Nathan Intrator, Claudia Sala, Michela Pierini, Paolo Garagnani,
   Isabella Zironi, Claudio Franceschi, Stefano Salvioli & Gastone Castellani
   Received: 14 January 2016
   Accepted: 1 June 2017
   Published: xx xx xxxx

   Many studies investigated age-related changes in gene expression of different tissues, with scarce
   agreement due to the high number of affecting factors. Similarly, no consensus has been reached
   on which genes change expression as a function of age and not because of environment. In this
   study we analysed gene expression of T lymphocytes from 27 healthy monozygotic twin couples,
   with ages ranging over whole adult lifespan (22 to 98 years). This unique experimental design
   allowed us to identify genes involved in normative aging, which expression changes independently
   from environmental factors. We obtained a transcriptomic signature with 125 genes, from which
   chronological age can be estimated. This signature has been tested in two datasets of same cell type
   hybridized over two different platforms, showing a significantly better performance compared to
   random signatures. Moreover, the same signature was applied on a dataset from a different cell type
   (human muscle). A lower performance was obtained, indicating the possibility that the signature is T
   cell-specific. As a whole our results suggest that this approach can be useful to identify age-modulated
   genes.

   Aging is a complex phenomenon characterised by decreased fitness and increased risk of diseases, disability and
   death. All these features are sustained by changes in gene expression, as a response of the cells to the environmental
   stimuli. Whether this response is programmed and stereotyped or totally random has been (and still is) a puzzling
   question for gerontologists. This question stems from the old theoretical dichotomy which has dominated
   the field of aging studies, that can be summarized in two conflicting positions: “aging is programmed” vs “aging is
   a random process” [1–3]. The fact that no gerontogenes (that is, genes whose expression actively induces aging of the
   organism without any other apparent benefit) have been found so far does not exclude that other (possibly epigenetic)
   types of control exist, so the question is still open. A third possibility also exists, that, according to the
   conceptualization of Blagosklonny and Hall [4], aging is quasi-programmed, and should be interpreted as a continuation
   of developmental programs which, in the post-reproductive period of life, loose their strict and finely tuned modulation.

   BioPlex                                 Ontocancro
Protein-Protein                         Genes annotated
                                                                                                                                                                                                                                twins
                                                                                                                                                                                                                           of twins
                                                                                                                                                                                                                                5

                                                                                                                                                                                                                                      model
                                                                                                                                                                                                                the further possibility
                                                                                                                                                                                                                                             . Indeed, monozygotic (MZ) twins share the same genome
                                                                                                                                                                                                                                            5 monozygotic (MZ) twins share the same genome
                                                                                                                                                                                                                                      . Indeed,
                                                                                                                                                                     and they can be therefore considered a powerful model to identify genes whose expression is independent from
                                                                                                                                                                                                                                                     to identify
                                                                                                                                                                                                                                            to cross-validate                 genesin awhose
                                                                                                                                                                                                                                                                  the data obtained          member ofexpression
                                                                                                                                                                                                                                                                                                            the           is independent from
  Interaction                           in cancer-related
                                                                                                                                                                     pair with those obtained in the other. Therefore, as MZ twins grow old, it should be possible to observe whether
                                                                                                                                              environmental perturbations,
                                                                                                                                                                     some of their geneswith           the further
                                                                                                                                                                                              change expression     accordingly, possibility             to cross-validate
                                                                                                                                                                                                                                     indicating the presence       of some kind of geneticthe   control data
                                                                                                                                                                                                                                                                                                           over obtained in a member of the
                                                                                                                                              pair with those obtained           in the        other.       Therefore,
                                                                                                                                                                                                                 gene expression as  timeMZseries intwins         grow         old, brain
                                                                                                                                                                     this phenomenon, or rather if changes are totally private (not shared by the two members of the twin couple). Until
                                                                                                                                                                     today, a plethora    of studies  analyzed                                        different tissues  (including     it should
                                                                                                                                                                                                                                                                                              areas, adipose be possible to observe whether
   Network                                  pathways                                                                                          some of theirFigure        2.andon twin
                                                                                                                                                              genes studies
                                                                                                                                                                     change       Plot expressionofbeenMZ             couple
                                                                                                                                                                                                              accordingly,                   age
                                                                                                                                                                                                                        of different age indicating
                                                                                                                                                                                                                                                       ofvs.
                                                                                                                                                                                                                                              , but not in twins.the
                                                                                                                                                                                                                                                                    age          estimated
                                                                                                                                                                                                                                                                     On thepresence
                                                                                                                                                                                                                                                                                   case-control studies . by ridge regression with
                                                                                                                                                                                                                                                                               other side, geneof      some kind of genetic control over
                                                                                                                                                                125-gene signature to predict
                                                                                                                                                                     tissue       skeletal  muscle) from subjects                       6–14
                                                                                                                                                                                                                                                                                                  expression
                                                                                                                                                                                         pairs have       performed      so far in limited number         old subjects
                                                                                                                                                                                                                                                           15
                                                                                                                                                                                                                                                                          , or in   16, 17

                                                                                                                                              this phenomenon, or rather if changes are totally private (not shared by the two members of the twin couple). Until
                                                                                                                                                           shows
                                                                                                                                              today, a plethora       the estimation
                                                                                                                                                                 of studies
                                                                                                                                                                      Departmentanalyzed
                                                                                                                                                                               1
                                                                                                                                                                                      of     sics an gene
                                                                                                                                                                                                                     obtained
                                                                                                                                                                                                        stronom expression
                                                                                                                                                                                                                       ni ersit of o o time
                                                                                                                                                                                                                                                  for
                                                                                                                                                                                                                                            na o o series
                                                                                                                                                                                                                                                              theIta validation
                                                                                                                                                                                                                                                        na 40126 in
                                                                                                                                                                                                                                                                2
                                                                                                                                                                                                                                                                          .different
                                                                                                                                                                                                                                                                             Inter epartmenta tissues       set   (i.e. all
                                                                                                                                                                                                                                                                                                      enter (including
                                                                                                                                                                                                                                                                                                               .
                                                                                                                                                                                                                                                                                                                               the
                                                                                                                                                                                                                                                                                                                           brain    twins
                                                                                                                                                                                                                                                                                                                                 areas,       from
                                                                                                                                                                                                                                                                                                                                        adipose
                                                                                                                                                           training).
                                                                                                                                                                                                                                    3
                                                                                                                                                                       a ani       ni ersit of o o na o o na 40126 Ita . Department of omputer cience act ciences acu t
                                                                                                                                              tissue and skeletal  muscle)           frome subjects                of different              age6–14         , butannot          int twins.
                                                                                                                                                                                                                                                                                         e icine On   ni ersitthe other side, gene expression

                                                                                                                                                                sample age from blood cells
                                                                                                                                                                                                                4
                                                                                                                                                                       e    i ni ersit               i Israe . Department        of perimenta         Dia nostic            pecia

   Nature Communications 2018                                                                                                                 studies on twin pairs haveresent a been         performed
                                                                                                                                                                                                    e eneration a so        far in       limited
                                                                                                                                                                                                                                             Institute number
                                                                                                                                                                                                                                                          o i i a- utti of    i ooldi Ort subjects
                                                                                                                                                                                                                                                                                          opae ic Institute, or in case-control studies          .
                                                                                                                                                                     of o o na o o na 40138 Ita . I             5
                                                                                                                                                                                                                          Institute of euro o ica ciences of o o na o o na 40124 Ita15.                                                    16, 17
                                                                                                                                                                               6
                                                                                                                                                                                     ress: one                         orator       esearc
                                                                                                                                                                                ia i ar iano 1/10 40136 o o na Ita . Danie emon ini an at an Intrator contri ute e ua to t is wor .
                                                                                                                                                                                tefano a io i an astone aste ani oint super ise t is wor . orrespon ence an re uests for materia s
                                                                                                                                                                               s ou e a resse to D. . emai : anie .remon ini uni o.it)

                  2018
                                                                                                                                              1
                                                                                                                                               Department of                  sics an stronom                   ni ersit of o o na o o na 40126 Ita . 2Inter epartmenta enter .
                                                                                                                                                 Scientific RepoRts | 7: 6005 | DOI:10.1038/s41598-017-05694-2                                              1
                                                                                                                                                a ani          ni ersit of o o na o o na 40126 Ita . 3Department of omputer cience act ciences acu t
ARTICLE                                                                                                                                         e      i nifor   ersit the evalidation                      dataset obtained
                                                                                                                                                                                         i Israe . 4Department           of perimenta   fromDiathe   twin
                                                                                                                                                                                                                                                nostic an couple
                                                                                                                                                                                                                                                              pecia t splitting,
                                                                                                                                                                                                                                                                        e icine ni asersitdescr
                                                                                                                                              of o o napared      o o nawith       40138the     Ita results
                                                                                                                                                                                                         5
                                                                                                                                                                                                        . I        obtained
                                                                                                                                                                                                                   Institute of in
                                                                                                                                                                                                                                 euro   for a signature based on 71 methylation
                                                                                                                                                                                                                                    21 o ica ciences of o o na o o na 40124 Ita .
DOI: 10.1038/s41467-018-06992-7   OPEN                                                                                                        6
                                                                                                                                                 resent a ress: one e eneration a orator esearc Institute o i i a- utti i o i Ort opae ic Institute
                                                                                                                                                ia i ar iano   and1/10    R=         0.91oin
                                                                                                                                                                                 40136             o na theItavalidation
                                                                                                                                                                                                                 . Danie emon   set.  As aatfirst
                                                                                                                                                                                                                                  ini an            test for
                                                                                                                                                                                                                                               an Intrator contrithe
                                                                                                                                                                                                                                                                   utegoodness
                                                                                                                                                                                                                                                                       e ua to t isofworour. sig
Network integration of multi-tumour omics data                                                                                                  tefano adataset  io i an 100,000   astone asterandomly                   chosen
                                                                                                                                                                                                            ani oint super    ise t signatures        of 125
[Figure: first pages of two articles]
- Scientific Reports 7: 6005, DOI:10.1038/s41598-017-05694-2
  (visible excerpt: regression trained on one split of the dataset and validated on the other; Pearson’s R values with an average of 0.75 and a SD of 0.09)
- ”… suggests novel targeting strategies”, Ítalo Faria do Valle, Giulia Menichetti, Giorgia Simonetti, Samantha Bruno, Isabella Zironi,
  Danielle Fernandes Durso, José C.M. Mombach, Giovanni Martinelli, Gastone Castellani & Daniel Remondini, DOI:10.1038/s41467-018-06992-7
Machine Learning

    Huge amount of data: data-driven approaches

    Machine Learning: usage of advanced algorithms for data
    analysis (e.g. image analysis)

    1) Unsupervised methods:
       a) data clustering (e.g. ROI segmentation)
       b) feature extraction (e.g. texture features)

    2) Supervised methods: the algorithm uses known information
    (e.g. reference samples, standards) for
      a) sample classification
      b) parameter regression (e.g. risk score, age)
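A minimal sketch of the unsupervised case 1a) above, assuming that a region of interest can be separated from the background by pixel intensity alone. It uses plain Python k-means (Lloyd's algorithm) on a toy 4x4 image; the image values and the function name `kmeans_1d` are illustrative assumptions, not taken from the talk.

```python
def kmeans_1d(values, k=2, n_iter=20):
    """Cluster scalar intensities into k groups (Lloyd's algorithm)."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each pixel goes to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its assigned pixels.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres


# Toy 4x4 "image": dark background with a bright 2x2 region of interest.
image = [
    [10, 12, 11, 9],
    [11, 200, 205, 10],
    [12, 198, 202, 11],
    [10, 11, 12, 10],
]
pixels = [v for row in image for v in row]
labels, centres = kmeans_1d(pixels, k=2)
# Reshape the flat label list back into a 4x4 segmentation mask.
mask = [labels[r * 4:(r + 1) * 4] for r in range(4)]
for row in mask:
    print(row)
```

No labels are given to the algorithm: the two groups emerge from the data, which is what distinguishes this from the supervised case 2).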
Artificial Intelligence and Deep Learning

Some machine learning techniques take inspiration from anatomical and
functional structures of the brain (e.g. visual cortex, known since the ’60s)

                                           Layered (modular, organized)
                                           Hierarchical (from contours to shapes)
                                           Somatotopic (organization)

                                                          Hubel & Wiesel J Physiol 160, 1962
Artificial Intelligence and Deep Learning

The functional units of Neural Networks take inspiration from
neurons (since Rosenblatt’s perceptron, 1957)
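As an illustration of such a functional unit, here is a minimal sketch of a single threshold neuron trained with the classic perceptron learning rule. The task (logical AND), the learning rate and the number of epochs are assumptions chosen for the example.

```python
def predict(w, b, x):
    """Threshold unit: fires (1) when the weighted input sum exceeds zero."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, targets, lr=1.0, epochs=20):
    """Perceptron rule: nudge the weights only when a sample is misclassified."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            err = t - predict(w, b, x)  # -1, 0 or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


# Logical AND is linearly separable, so a single unit can learn it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
T = [0, 0, 0, 1]
w, b = train_perceptron(X, T)
print([predict(w, b, x) for x in X])
```

A single unit can only draw one linear boundary, which is one reason why layered networks of such units are used in practice.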
Deep Neural Network for Machine Learning - supervised

Supervised methods (classification, regression):
Feedforward Deep Networks FDN
FDN is trained with examples, and generalizes to unseen data

My thesis (1996): 4 neurons, Intel 386 CPU, 20 MB HD
Current FDN: 10^5-10^6 neurons, multi-core CPU & GPU, TB HD (& RAM)
Deep Neural Network for Machine Learning - unsupervised

Unsupervised feature extraction: Convolutional Neural Networks CNN
CNN layers extract a hierarchy of features (from contours to shapes)
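The basic operation of a CNN layer can be sketched as a small kernel slid over the image, followed by a non-linearity. In a real CNN the kernels are learned from data; here a fixed Sobel kernel is used as a stand-in for a first-layer "contour" detector, which is an assumption made for illustration only.

```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation ('valid' mode): slide the kernel over the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Rectified Linear Unit applied element-wise."""
    return [[max(0, v) for v in row] for row in feature_map]


# 5x6 image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
# Horizontal-gradient Sobel kernel: responds to vertical contours.
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
edges = relu(conv2d_valid(image, sobel_x))
for row in edges:
    print(row)
```

The feature map is non-zero only where the kernel window straddles the edge. Stacking further layers on such maps is what lets a CNN move from contours to shapes.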
Deep Neural Network for Machine Learning

Typical DNN architecture: encoding (CNN) + task (FDN)
Machine Learning for NMR data: examples

     Data processing:

     1) NMR fingerprinting

     2) QSM processing

     Image analysis:

     1) Automated segmentation

     2) Quantification (feature & texture analysis)

     3) Image quality enhancement (super-resolution)
Fingerprinting
  Associate vectors of features for each MRI ”pixel” (training set) with
  specific values (B, T1, T2, …)

  Original strategy [Ma et al, Nature 495, 2013]: define a ”dictionary” of
  feature/value associations
  Our strategy: train a DL network to discriminate the feature vectors and
  reliably associate the physical NMR parameters

  [Figure labels: T1, T2, B1, D, Off]

                                                   Barbieri, … Remondini arXiv:1811.11477
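For comparison with the DL strategy, the original dictionary strategy can be sketched as follows: simulate a signal for every parameter combination on a grid, then assign to each measured signal the parameters of the best-matching dictionary entry. The signal model `toy_signal`, the sampling times and the parameter grids are hypothetical placeholders, not a real MR fingerprinting sequence and not the method of the cited works.

```python
import math

TIMES_MS = [10.0 * (i + 1) for i in range(50)]  # hypothetical sampling times


def toy_signal(t1, t2):
    """Hypothetical toy signal model, NOT a real fingerprinting simulation."""
    return [(1.0 - math.exp(-t / t1)) * math.exp(-t / t2) for t in TIMES_MS]


def normalise(signal):
    """Scale to unit norm so that matching ignores the overall amplitude."""
    norm = math.sqrt(sum(s * s for s in signal))
    return [s / norm for s in signal]


def build_dictionary(t1_grid, t2_grid):
    """Map every (T1, T2) pair on the grid to its normalised signal."""
    return {(t1, t2): normalise(toy_signal(t1, t2))
            for t1 in t1_grid for t2 in t2_grid}


def match(signal, dictionary):
    """Return the (T1, T2) whose entry has the largest inner product."""
    sig = normalise(signal)
    return max(dictionary,
               key=lambda params: sum(a * b for a, b in zip(sig, dictionary[params])))


t1_grid = [200.0 * i for i in range(1, 11)]  # 200 ... 2000 ms
t2_grid = [20.0 * i for i in range(1, 11)]   # 20 ... 200 ms
dictionary = build_dictionary(t1_grid, t2_grid)

# "Measured" pixel signal: a dictionary signal with an unknown scale factor.
measured = [3.7 * s for s in toy_signal(800.0, 80.0)]
t1_est, t2_est = match(measured, dictionary)
print(t1_est, t2_est)
```

The dictionary grows with every added parameter and the estimates are limited to the grid values, which is one motivation for replacing the lookup with a trained network that maps a feature vector directly to parameter values.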
Quantitative Susceptibility Mapping

  Transform phase data into magnetic susceptibility data χ
  Analytical function has singularities (”magic angle”)

  Deep Learning can learn the transform from ”good” examples (in our case,
  reconstructions obtained with other approaches) and overcome singularities

                                                             See Cristiana Fiscone’s
                                                             talk tomorrow (C15)
In collaboration with Prof. R. Bowtell
Sir Peter Mansfield Institute, Nottingham UK
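The singularity can be made concrete with a small sketch. It assumes the slide refers to the standard k-space dipole kernel used in QSM, D(k) = 1/3 - kz^2/|k|^2 with the main field along z, which the slide itself does not write out. Susceptibility is obtained by dividing the field data by D(k), and D(k) vanishes on a cone at the magic angle.

```python
import math


def dipole_kernel(kx, ky, kz):
    """Standard unit dipole kernel in k-space, main field along z (assumed form)."""
    k2 = kx * kx + ky * ky + kz * kz
    if k2 == 0.0:
        return 0.0  # the k = 0 term is undefined; setting it to 0 is a common convention
    return 1.0 / 3.0 - kz * kz / k2


# The kernel is zero where cos^2(theta) = 1/3, theta being the angle between k and B0.
magic_angle_deg = math.degrees(math.acos(1.0 / math.sqrt(3.0)))
theta = math.radians(magic_angle_deg)
d_on_cone = dipole_kernel(math.sin(theta), 0.0, math.cos(theta))

print(f"magic angle = {magic_angle_deg:.2f} deg")
print(f"D on the cone = {d_on_cone:.1e}")
print(f"D along B0 = {dipole_kernel(0.0, 0.0, 1.0):.3f}")
print(f"D across B0 = {dipole_kernel(1.0, 0.0, 0.0):.3f}")
```

Direct division by D(k) therefore amplifies noise without bound near the cone, which is why regularised or learned inversions are used instead.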
Feature extraction & analysis

Many observables can be extracted from single pixels (or larger patches) of
MRI images
- Graylevel histogram
- Texture features (based on spatial and intensity proximity)
- Segmented region parameters (eccentricity, complexity, fractal dim, …)

Each sample is mapped into a high-dimensional feature space

                                        N = 10^2-10^3

                                        Feature vectors can be used for
                                        - low-dimensional visualization
                                        - Supervised and unsupervised
                                          machine learning
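A minimal sketch of how two of the observables listed above could be computed, assuming images already quantised to a few integer gray levels. The texture part uses a gray-level co-occurrence matrix (GLCM) for horizontally adjacent pixels; the two toy images and the function names are illustrative assumptions.

```python
def graylevel_histogram(image, n_levels):
    """Fraction of pixels at each gray level."""
    hist = [0] * n_levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    return [h / total for h in hist]


def glcm(image, n_levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for pixel pairs offset by (dr, dc)."""
    counts = [[0] * n_levels for _ in range(n_levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """GLCM contrast: large when neighbouring pixels differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def feature_vector(image, n_levels):
    """Concatenate histogram and texture features into one vector."""
    return graylevel_histogram(image, n_levels) + [contrast(glcm(image, n_levels))]


smooth = [[0, 0, 1, 1] for _ in range(4)]
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
print(feature_vector(smooth, 2))
print(feature_vector(checker, 2))
```

The two toy images have identical histograms but different contrast, which shows why texture features add information that intensity statistics alone miss.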
Visualization

       Several techniques can be used for low-dimensional reduction and
       visualization (in 2-3 dim)
       - PCA/SVD ”family” of methods
       - ISOMAP (network geodesics)
       - SNE (Stochastic Neighbor Embedding)

[Figure: ISOMAP 2-d representation of face images, with an excerpt of the Isomap paper. The algorithm has three steps:
 (1) determine which points are neighbours on the manifold M, based on the input-space distances dX(i, j), connecting each point to all points within a fixed radius ε or to its K nearest neighbours;
 (2) estimate the geodesic distances dM(i, j) between all pairs of points by computing their shortest-path distances dG(i, j) in the neighbourhood graph G;
 (3) apply classical MDS to the matrix of graph distances DG = {dG(i, j)}, choosing coordinates yi in a d-dimensional Euclidean space Y that minimize the cost function
     $E = \lVert \tau(D_G) - \tau(D_Y) \rVert_{L^2}$, where $D_Y = \{ d_Y(i,j) = \lVert y_i - y_j \rVert \}$.
 Face example: N = 698 images of 64 x 64 pixels (4096-dimensional vectors), K = 6.]

[Figure: SNE 2-d representation of hand-written digit images (digits 0-9), from van der Maaten and Hinton]
              (a) Visualization by t-SNE.
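A minimal sketch of the first option in the list (the PCA/SVD family), written in plain Python so that it is self-contained. It finds the two leading principal axes by power iteration with deflation; the toy 3-d data set is an assumption for illustration. ISOMAP and SNE are not shown here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centre(data):
    """Subtract the per-feature mean from every sample."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]


def covariance(centred):
    """Sample covariance matrix of already-centred data."""
    n, d = len(centred), len(centred[0])
    return [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_eigenvector(mat, n_iter=200):
    """Power iteration: repeatedly apply the matrix and renormalise."""
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [dot(row, v) for row in mat]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v


def pca_2d(data):
    """Project the samples onto their two leading principal axes."""
    x = centre(data)
    cov = covariance(x)
    d = len(cov)
    pc1 = leading_eigenvector(cov)
    lam1 = dot(pc1, [dot(row, pc1) for row in cov])  # variance along pc1
    # Deflation: remove the first component, then find the next one.
    deflated = [[cov[a][b] - lam1 * pc1[a] * pc1[b] for b in range(d)]
                for a in range(d)]
    pc2 = leading_eigenvector(deflated)
    return [[dot(row, pc1), dot(row, pc2)] for row in x], pc1, pc2


# Toy data: 3-d points spread along (1, 2, 2)/3 with a small wiggle along (2, -1, 0).
samples = []
for i, t in enumerate(range(-5, 6)):
    s = 0.1 if i % 2 == 0 else -0.1
    samples.append([t / 3 + 2 * s, 2 * t / 3 - s, 2 * t / 3])

embedding, pc1, pc2 = pca_2d(samples)
print([round(c, 3) for c in pc1])
```

PCA is linear, so it recovers straight directions of maximal variance. ISOMAP and SNE are used when the data lie on a curved manifold that a linear projection would fold onto itself.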
Super resolution

                A DNN can learn to ”improve” image quality (resolution)
                from an adequate training set
                Results of a DNN trained on natural images

[Figure: SRCNN architecture. Low-resolution image (input) -> patch extraction and representation (feature maps of low-resolution image) -> non-linear mapping (feature maps of high-resolution image) -> reconstruction -> high-resolution image (output)]
Given a low-resolution image Y, the first convolutional layer of the SRCNN extracts a set of feature maps. The
second layer maps these feature maps nonlinearly to high-resolution patch representations. The last layer combines
the predictions within a spatial neighbourhood to produce the final high-resolution image F(Y).

[Figure: average test PSNR (dB) vs number of backprops. Example image, PSNR with respect to the original: Bicubic 24.04 dB, SC (sparse coding) 25.58 dB, SRCNN 27.95 dB]

                                                                                                                                   Dong et al., arXiv:1501.00092v3
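The PSNR values quoted for the bicubic, SC and SRCNN results compare each reconstruction with the original image. A minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), is given below, assuming 8-bit images (MAX = 255) stored as nested lists; the toy pixel values are made up for the example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    n_pixels = 0
    for ref_row, est_row in zip(reference, estimate):
        for a, b in zip(ref_row, est_row):
            squared_error += (a - b) ** 2
            n_pixels += 1
    mse = squared_error / n_pixels
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [[100, 100], [100, 100]]
coarse = [[110, 90], [110, 90]]   # every pixel off by 10 -> MSE = 100
fine = [[101, 99], [101, 99]]     # every pixel off by 1  -> MSE = 1

print(f"coarse: {psnr(original, coarse):.2f} dB")
print(f"fine:   {psnr(original, fine):.2f} dB")
```

Because the scale is logarithmic, a gain of a few dB, as between the bicubic and SRCNN results, corresponds to a substantial reduction in mean squared error.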
Classification & regression

       Machine learning techniques (including DL) can
       be used to classify samples or to regress
       parameters (e.g. tumour grade, age):

       -   Partial Least Squares
       -   Support Vector Machine
       -   Discriminant Analysis
       -   K-Nearest Neighbour
       -   Ridge regression
       -   LASSO
       -   …
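As an example of one method from the list, here is a minimal sketch of a K-Nearest Neighbour classifier in plain Python (it needs Python 3.8+ for `math.dist`). The two-feature samples and the "low"/"high" grade labels are invented for illustration and do not come from any data set in the talk.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (e.g. two texture features) with known grades.
train_x = [(1.0, 1.2), (0.8, 1.0), (1.3, 0.9),
           (5.0, 5.1), (4.8, 5.3), (5.2, 4.9)]
train_y = ["low", "low", "low", "high", "high", "high"]

# Held-out samples: never seen during training, used only to check predictions.
test_x = [(1.1, 1.0), (4.9, 5.0)]
test_y = ["low", "high"]

predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

Whatever the method, performance should be estimated on samples kept out of training, as done here with the separate test set.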
Daniel Remondini

              Department of Physics and Astronomy – DIFA
INFN Sezione di Bologna – AIM initiative (Artificial Intelligence in Medicine)

                           daniel.remondini@unibo.it

                                www.unibo.it
For moderators, speakers, trainers, tutors and lecturers, the current State-Regions Agreement requires an explicit declaration
by the person concerned, disclosing sources of funding and relationships with parties holding commercial interests over the two years
                                                                 preceding the date of the event.
                           The documentation must be available from the Provider and kept for at least 5 years.

                                              Conflict of Interest Declaration

                                              The undersigned DANIEL REMONDINI, in his capacity as:

                                                                      □ speaker

                    at the event “X CONGRESSO AIRMM - RISONANZA MAGNETICA IN MEDICINA 2019:
                            DALLA RICERCA TECNOLOGICA AVANZATA ALLA PRATICA CLINICA”
                                              Milan, 28-29 March 2019
                                          to be held on behalf of Biomedia srl, Provider no. 148,

 pursuant to the current State-Regions Agreement on continuing education in the “Health” sector (ECM training),
                                              declares
   X that in the last two years he has NOT had any relationships, including funding, with parties holding
                                commercial interests in the healthcare field