NXP Connects i.MX 8 Family Hardware - Lydia Ziegler - Nxp corporate template, INTERNAL ...

Page created by Rosa Rodriguez
 
NXP Connects i.MX 8 Family
Hardware
Lydia Ziegler
i.MX 8 DRAM Introduction and Tools Overview

October 2018 | AMF-AUT-T3361

Company External – NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of NXP B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V.
Agenda
•   i.MX 8 Family Overview
•   i.MX 8QM/QXP DDR Controller Overview
•   i.MX 8QM/QXP DDR Initialization Flow
•   i.MX 8QM/QXP DDR Calibration Details
•   i.MX 8QM/QXP DDR Tools Introduction
•   Debugging DDR Failures

i.MX Explosive Growth

•   Over 460M i.MX SoCs shipped to date.
•   Over 140M i.MX shipped in vehicles since 2007.
•   #1 in eReaders
•   #1 in Auto Infotainment Applications Processors

[Chart: i.MX and i.MX Auto, 2007 to 2017]

        Scalability • Trusted Supply • World Class Support
i.MX Automotive Roadmap

Scalability of Embedded Processing: i.MX Subsystem Reuse

[Figure: block diagrams of six devices built from shared subsystems]

i.MX 8QM
•   Cores: 4x A53, 2x A72, 2x M4, DSP, SCU, HSM
•   Multimedia: 2x GPU (8 shaders each), 4K Video, 2x Display Controller
•   Display and camera: 2x MIPI-DSI, 2x LVDS, HDMI 2.0, 2x MIPI-CSI
•   Connectivity: 2x PCIe, 2x 1GbE, Audio, USB 3.0 & 2.0
•   Memory: x64 LPDDR4/DDR4

i.MX 8DualMax
•   Cores: 2x A72, 2x M4, DSP, SCU, HSM
•   Multimedia: 1x GPU (8 shaders), 4K Video, Display Controller
•   Display and camera: MIPI-DSI, 2x LVDS, HDMI 2.0, 2x MIPI-CSI
•   Connectivity: 2x PCIe, 2x 1GbE, Audio, USB 3.0 & 2.0
•   Memory: x64 LPDDR4/DDR4

i.MX 8QXP
•   Cores: 4x A35, M4, DSP, SCU, HSM
•   Multimedia: 1x GPU (4 shaders), 4K Video, Display Controller
•   Display and camera: 2x LVDS/MIPI, MIPI-CSI
•   Connectivity: PCIe, 2x 1GbE, Audio, USB 3.0 & 2.0
•   Memory: x32 LPDDR4/DDR3L

i.MX 8DX
•   Cores: 2x A35, M4, DSP, SCU, HSM
•   Multimedia: 1x GPU (4 shaders), 1080p Video, Display Controller
•   Display and camera: 2x LVDS/MIPI, MIPI-CSI
•   Connectivity: PCIe, 1GbE, 10/100, Audio, USB 2.0
•   Memory: x16 LPDDR4/DDR3L

i.MX 8DXL
•   Cores: 2x A35, M4, SCU, HSM
•   Display: Parallel Display
•   Connectivity: PCIe, 1GbE, 10/100, USB 2.0
•   Memory: x16 LPDDR4/DDR3L

i.MX 8SXL
•   Cores: A35, M4, SCU, HSM
•   Display: Parallel Display
•   Connectivity: PCIe, 1GbE, 10/100, USB 2.0
•   Memory: x16 LPDDR4/DDR3L

   Most Scalable Family of Automotive Applications Processors for eCockpit,
            Instrument Cluster, Display Audio and Telematics/V2X
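Automotive Applications Processor Roadmap

[Figure: roadmap chart by ARM architecture generation: ARM v5-v7, ARM v8, ARM v8.2]
•   25-50k DMIPS, 128-300 GFLOPS (eCockpit, big.LITTLE, Vision, Audio DSP): i.MX 8QuadMax; Next Gen i.MX High (ARM v8.2)
•   15-20k DMIPS, 64 GFLOPS (eCockpit, Vision, Audio DSP): i.MX 8QuadPlus, i.MX 8DualMax
•   Label beside the i.MX 8 devices: Pin Compatible Family
•   Also shown: i.MX 6Quad, i.MX 6QuadPlus (ARM v5-v7); i.MX 8QuadXPlus (ARM v8)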
Pin Compatible eCockpit Processors (i.MX 8QuadMax, i.MX 8QuadPlus, i.MX 8DualMax)
•   Up to 4x 1080p / 1x 4k Displays
•   x64 LP-DDR4 / 3200
•   HiFi4 DSP option
•   Common Software and Hardware platform

Also shown in the ARM v8.2 column: Next Gen i.MX Entry
Callout beside i.MX 8QuadXPlus: Pin Compatible Display Audio and Instrument
Callout in the ARM v8.2 column: Next Generation i.MX 10 Scalable Family
i.MX 8 & 8X Introduction

i.MX 8 Family of Automotive Applications Processors

Across the family
•   ARM CPU: Cortex-M4 | Cortex-A53 | Cortex-A72
•   OpenVX and ISI Vision Acceleration
•   Software Compatibility
•   Pin Compatibility

8QuadMax
•   GPU: Dual Core GPU, 16 Vec4 Shaders, up to 128 GFLOPS, 64 execution units, High Speed, Tessellation / Geometry Shaders
•   Display: Up to 4 displays (total pixels)
•   DSP Option: Audio DSP, HiFi 4
•   Virtualization: SoC Level

8QuadPlus
•   GPU: Dual Core GPU, 16 Vec4 Shaders, up to 80 GFLOPS, 64 execution units, Full Speed, Tessellation / Geometry Shaders
•   Display: Up to 4 displays (total pixels)
•   DSP Option: Audio DSP, HiFi 4
•   Virtualization: SoC Level

8DualMax
•   GPU: Single Core GPU, 8 Vec4 Shaders, up to 64 GFLOPS, 32 execution units, High Speed, Tessellation / Geometry Shaders
•   Display: Up to 3 displays (total pixels)
•   DSP Option: Audio DSP, HiFi 4
•   Virtualization: SoC Level

Family of Scalable Automotive Multimedia Processors
•   eCockpit
•   Infotainment
•   Graphical Instrument Clusters
preliminary

i.MX 8 Family – Block Diagrams

Feature               i.MX 8QuadMax                      i.MX 8QuadPlus                     i.MX 8DualMax
Package               29x29 Flip-Chip BGA, 0.75mm pitch  29x29 Flip-Chip BGA, 0.75mm pitch  29x29 Flip-Chip BGA, 0.75mm pitch
DMIPS (Cortex-A)      26k                                18.5k                              15k
ARM® Core Complex 1   4x Cortex-A53                      4x Cortex-A53                      2x Cortex-A72
ARM® Core Complex 2   2x Cortex-A72                      1x Cortex-A72                      -
Display Controller    2x                                 2x                                 1x
GPU                   2x GC7000 XSVX                     2x GC7000Lite XSVX                 1x GC7000 XSVX
MIPI CSI              2x 4-lane                          2x 4-lane                          2x 4-lane
MLB150                1x                                 1x                                 via USB
HDMI In               1x                                 1x                                 -
HDMI/eDP Out          1x                                 1x                                 1x
DDR                   2x x32                             2x x32                             2x x32
PCIe                  2x PCIe 3.0                        2x PCIe 3.0                        2x PCIe 3.0
SATA                  1x SATA3                           1x SATA3                           -
Ethernet              2x 1Gb w/AVB                       2x 1Gb w/AVB                       1x 1Gb w/AVB + 1x 10/100 w/AVB
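
As a rough illustration of how the feature table above can be used to shortlist a device, the following sketch encodes a few of its rows and filters them by requirement. The `Variant` class, its field names, and the `candidates` helper are hypothetical names made up for this example; the values are copied from the preliminary table and may change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A subset of the rows from the preliminary feature table (illustrative only)."""
    name: str
    dmips: int               # Cortex-A DMIPS
    a53_cores: int
    a72_cores: int
    display_controllers: int
    hdmi_in: bool
    sata: bool


VARIANTS = [
    Variant("i.MX 8QuadMax", 26000, 4, 2, 2, True, True),
    Variant("i.MX 8QuadPlus", 18500, 4, 1, 2, True, True),
    Variant("i.MX 8DualMax", 15000, 0, 2, 1, False, False),
]


def candidates(min_dmips=0, need_hdmi_in=False, need_sata=False):
    """Return the names of variants that meet every stated requirement."""
    return [
        v.name
        for v in VARIANTS
        if v.dmips >= min_dmips
        and (v.hdmi_in or not need_hdmi_in)
        and (v.sata or not need_sata)
    ]


if __name__ == "__main__":
    # Example: a design that needs SATA and at least 18k DMIPS.
    print(candidates(min_dmips=18000, need_sata=True))
```

Because the three devices are pin compatible, a shortlist like this mostly affects the bill of materials rather than the board layout.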

Preliminary – Subject to Change

i.MX 8X Family of Applications Processors

[Table columns: GPU, Video, Displays, DSP, USB, DDR, ARM CPU]

Across the family
•   ARM CPU: Cortex-A35 + M4
•   Software Compatibility
•   Pin Compatibility

8QuadXPlus
•   GPU: Single Core GPU, 4 Vec4 Shaders (high performance), 16 execution units, OpenGL ES 3.1, OpenCL Embedded
•   Video: + Legacy
•   Displays: Up to 3 (2x 1080p, 1x WVGA)
•   DSP: HiFi 4
•   DDR: x32, DDR3L-1866 (ECC option), LP-DDR4-2400 (no ECC)

8DualXPlus
•   GPU: Single Core GPU, 4 Vec4 Shaders (high performance), 16 execution units, OpenGL ES 3.1, OpenCL Embedded
•   Video: + Legacy
•   Displays: Up to 3 (2x 1080p, 1x WVGA)
•   DSP: HiFi 4
•   DDR: x32, DDR3L-1866 (ECC option), LP-DDR4-2400 (no ECC)

8DualX
•   GPU: Single Core GPU, 4 Vec4 Shaders (power optimized), 16 execution units, OpenGL ES 3.1, OpenCL Embedded
•   Video: + Legacy
•   Displays: Up to 3 (2x 1080p, 1x WVGA)
•   DSP: HiFi 4
•   DDR: x16, DDR3L-1866 (no ECC), LP-DDR4-2400 (no ECC)

Family of Scalable Automotive Multimedia Processors
•   Display Audio Applications
•   Graphical Instrument Clusters
•   Telematics and V2X
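
The DDR entries above pair a bus width (x32 or x16) with a data rate (1866 or 2400, in mega-transfers per second). As a back-of-the-envelope illustration, the sketch below computes the theoretical peak bandwidth implied by each pairing, and shows the usual double-data-rate relation between clock frequency and transfer rate (933 MHz gives 1866 MT/s, 1200 MHz gives 2400 MT/s). The function names and configuration labels are illustrative and do not come from any NXP tool. Sustained bandwidth on real hardware is lower because of refresh, bank conflicts, and read/write turnaround.

```python
def peak_bandwidth_mb_s(bus_width_bits: int, data_rate_mt_s: int) -> float:
    """Theoretical peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    if bus_width_bits % 8 != 0:
        raise ValueError("bus width must be a whole number of bytes")
    return bus_width_bits / 8 * data_rate_mt_s


def data_rate_from_clock(clock_mhz: int) -> int:
    """DDR transfers data on both clock edges, so MT/s = 2 x clock in MHz."""
    return 2 * clock_mhz


# (bus width in bits, data rate in MT/s) for the configurations in the table
CONFIGS = {
    "x32 DDR3L-1866": (32, 1866),
    "x32 LP-DDR4-2400": (32, 2400),
    "x16 DDR3L-1866": (16, 1866),
    "x16 LP-DDR4-2400": (16, 2400),
}

if __name__ == "__main__":
    for name, (width, rate) in CONFIGS.items():
        print(f"{name}: {peak_bandwidth_mb_s(width, rate):.0f} MB/s peak")
```

For example, x32 LP-DDR4-2400 works out to 4 bytes × 2400 MT/s = 9600 MB/s, and halving the bus to x16 halves that to 4800 MB/s.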
i.MX 8X Family Block Diagram
[Block diagram, summarized by block. Peripheral counts vary by device.]
•   Core Complex 1: 2-4x Cortex-A35, 32KB L1-I, 32KB L1-D, 512KB L2 w/ECC
•   Core Complex 2: 1x Cortex-M4F, 16KB L1 I-cache, 16KB L1 D-cache, 256KB SRAM, 1x I2C, 1x UART, 6x GPIO, 1x TPM Timer
•   Multimedia
    −   GPU: 4 shaders, OpenGL ES 3.1, Vulkan®
    −   VPU: h.265 dec 4k, h.264 dec/enc 1080p
    −   Audio: 1x Tensilica® HiFi 4 DSP, 32KB I, 48KB D, 512 KB SRAM, 64KB TCM
•   Display & Camera I/O: Display Processor w/ SafeAssure®, 2 x MIPI-DSI/LVDS Combo PHY*, 1x Parallel Display, 1x MIPI CSI, 1x Parallel CSI
•   Memory: DDR3 @933 MHz (ECC Option), LPDDR4 @ 1200 MHz (no ECC), 2x SDIO3.0/eMMC5.1, 2x Quad / 1x Octal SPI, RAW NAND – BCH62
•   Security: HAB, SRTC, SJTAG, TrustZone, AES256, RSA4096, SHA-512, 3DES, ARC4, MD-5, Flashless SHE, ECC, Tamper detection, Inline Enc Engine
•   System Control: Power Control, Clocks, Reset, Boot ROMs, PMIC interface (dedicated I2C), Resource Domain Partitioning
•   Connectivity: 4x UART, 8x I2C, 4x SPI, 2x Gbit Ethernet, 1x 10/100 Ethernet, 3.3V / 1.8V GPIO, PCIe 3.0 with L1 Substate (1-lane), 1x USB3 OTG w/PHY, 1 or 2x USB2 OTG w/PHY, 3x CAN/CAN FD, MOST 25/50, 4x4 Keypad, 4x PWM, 1x 12-bit ADC, 2x ASRC, SPDIF, 4x SAI, ESAI, MQS

  * Each single PHY can either be a 1×4 lane MIPI-DSI or a 1×1 channel LVDS interface for a total of 2 display interfaces.
  In combination, the two PHYs can be configured to be a single 2-channel LVDS interface.

    Feature         i.MX 8QuadXPlus / i.MX 8DualXPlus        i.MX 8DualX
    ARM® Core       4 x Cortex-A35 (i.MX 8QuadXPlus)         2 x Cortex-A35
                    2 x Cortex-A35 (i.MX 8DualXPlus)
    ARM® Core       1 x Cortex-M4F                           1 x Cortex-M4F
    DSP Core        Tensilica® HiFi 4 DSP                    Tensilica HiFi 4 DSP
    DRAM            *32-bit DDR3L (ECC option)               16-bit DDR3L (no ECC)
                    LPDDR4 (no ECC)                          LPDDR4 (no ECC)
    GPU             1 x GC7000Lite, High Performance         1 x GC7000Lite, Power Optimized
    VPU             4K h.265 dec, 1080p h.264 enc/dec        1080p h.264 enc/dec
    Ethernet        2 x Gigabit with AVB                     1 x Gigabit with AVB, 1 x 10/100
    USB with PHY    1 x USB 3.0 (or USB 2.0), 1 x USB 2.0    2 x USB 2.0

    *21x21 package only. 17x17 will have 16-bit memory interface
i.MX 8QM/QXP
DDR Controller Overview

DDR Controller/PHY Features
•   i.MX 8QM
    −   Supports LPDDR4 up to 3200Mbps (1.6GHz DDR clock)
    −   Supports DDR4 up to 2400Mbps (1.2GHz DDR clock)
    −   Two DDR Controllers (4KB interleave between controllers)
•   i.MX 8QXP
    −   Supports LPDDR4 up to 2400Mbps (1.2GHz DDR clock)
    −   Supports DDR3L (with ECC) up to 1866Mbps (933MHz DDR clock)
    −   One DDR Controller
•   32-bit or 16-bit data bus width for all DDR protocols
•   Supports up to 2 ranks for all DDR protocols
•   Voltage and temperature compensation runs in the background
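
The data rates and DDR clocks listed above differ by a factor of two because data moves on both clock edges. The sketch below illustrates that relationship and the resulting theoretical peak bandwidth per controller. The function names are illustrative, a 32-bit bus is assumed for every entry, and real throughput is lower because of refresh and protocol overhead.

```python
def ddr_clock_mhz(data_rate_mbps: float) -> float:
    """DDR transfers data on both clock edges, so clock = data rate / 2."""
    return data_rate_mbps / 2.0


def peak_bandwidth_mbytes(data_rate_mbps: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in MB/s for one controller (no overhead modeled)."""
    return data_rate_mbps * bus_width_bits / 8.0


# (name, data rate in Mbps per pin, assumed bus width in bits)
configs = [
    ("i.MX 8QM LPDDR4", 3200, 32),
    ("i.MX 8QM DDR4", 2400, 32),
    ("i.MX 8QXP LPDDR4", 2400, 32),
    ("i.MX 8QXP DDR3L", 1866, 32),
]

for name, rate, width in configs:
    print(f"{name}: {ddr_clock_mhz(rate):.0f} MHz clock, "
          f"{peak_bandwidth_mbytes(rate, width):.0f} MB/s peak per controller")
```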

DDR Subsystem Architecture
[Block diagram. Labels shown:]
•   DDR Controller (DRC): RRB and uMCTL2
    −   uMCTL2: AXI, Port Arbiter, Scheduler and SDRAM command generator (DDRC), WB
•   PHYv1 28FDSOI: PHY, PHY PLL, PUB, data training
    −   Up to 32-bit data bus along with associated DQS/DM control signals
    −   Address and control signals are configurable based on DRAM type

•   QM has two sets of DDR controllers/PHYs
•   QX has one DDR controller/PHY

Comparison With i.MX6/7
•   i.MX 6 series uses the MMDC

•   i.MX 8QM/QXP and i.MX7D use third-party IP
    −   DDR controller IP has a programming model similar to the i.MX7D
    −   DDR PHY is completely different from the i.MX7D

•   i.MX 8QM/QXP DDR runs at higher speed
    −   Ultra high speed brings more challenges for customer PCB design
    −   Previous i.MX maximum DDR frequency was 528MHz; i.MX 8QM runs up to 1.6GHz
    −   Follow layout recommendations provided in the Hardware Developers Guide

i.MX 8QM/QXP and i.MX 8M High-level Comparison

              Feature                  i.MX8 QM/QXP                    i.MX8M
    System Control Unit (SCU)   Yes                          No, architecture similar to
                                                             MX7D
    DDR Initialization          Performed by SCU             Performed by SPL
    Automatic Data training     Performed as part of         Performed by the PHY
                                initialization script (PIR   MCU (firmware loaded into
                                writes)                      MCU IRAM/DRAM)
    Controller version          SNPS DDR Controller          SNPS DDR Controller
                                (dwc_ddr_umctl2)             (dwc_ddr_umctl2)
    PHY version                 SNPS PHY v1                  SNPS PHY v2 (integrated
                                                             MCU)

High Level Feature Set Comparison of the i.MX 8 / 8X / 8M Families
[Figure: feature comparison with columns for the QM Family, QX Family, and mScale Family]

DDR Pin Function
•   Pins configurable based on DDR type
•   Refer to NXP board schematics for examples

i.MX8 QM
    IO name    LPDDR4 name    DDR4 name
    DCF_00     CA2_A          A5
    DCF_01     CA4_A          A6
    DCF_02     -              ALERT_N
    DCF_03     CA5_A          A7
    DCF_04     -              A8
    DCF_05     -              A9
    DCF_06     -              BG1
    DCF_07     -              ACT_N
    DCF_08     CA3_A          A3
    DCF_09     ODT_CA_A       ODT
    DCF_10     CS0_A          A1
    DCF_11     CA0_A          A0
    DCF_12     CS1_A          A2
    DCF_13     -              PARITY
    DCF_14     CKE0_A         -
    DCF_15     CKE1_A         -
    DCF_16     CA1_A          A4
    DCF_17     CA4_B          A12
    DCF_18     RESET_N        RESET_N
    DCF_19     CA5_B          A14
    DCF_20     -              A15
    DCF_21     -              BA0
    DCF_22     -              BA1
    DCF_23     -              BG0
    DCF_24     -              A17
    DCF_25     ODT_CA_B       ODT1
    DCF_26     CA3_B          A13
    DCF_27     CA0_B          A10
    DCF_28     CS0_B          CS_N[0]
    DCF_29     CS1_B          CS_N[1]
    DCF_30     CKE0_B         CKE0
    DCF_31     CKE1_B         CKE1
    DCF_32     CA1_B          A11
    DCF_33     CA2_B          A16

i.MX8 QXP
    IO name    LPDDR4 name    DDR3 name
    DCF_00     CA2_A          A5
    DCF_01     CA4_A          A6
    DCF_03     CA5_A          A7
    DCF_04     -              A8
    DCF_05     -              A9
    DCF_07     -              RAS#
    DCF_08     CA3_A          A3
    DCF_09     ODT_CA_A       ODT
    DCF_10     CS0_A          A1
    DCF_11     CA0_A          A0
    DCF_12     CS1_A          A2
    DCF_14     CKE0_A         -
    DCF_15     CKE1_A         -
    DCF_16     CA1_A          A4
    DCF_17     CA4_B          A12
    DCF_18     RESET_N        RESET#
    DCF_19     CA5_B          A14
    DCF_20     -              A15
    DCF_21     -              BA0
    DCF_22     -              BA1
    DCF_23     -              BA2
    DCF_24     -              CAS#
    DCF_25     ODT_CA_B       -
    DCF_26     CA3_B          A13
    DCF_27     CA0_B          A10
    DCF_28     CS0_B          CS_N[0]
    DCF_29     CS1_B          CS_N[1]
    DCF_30     CKE0_B         CKE0
    DCF_31     CKE1_B         CKE1
    DCF_32     CA1_B          A11
    DCF_33     CA2_B          WE#
JEDEC Timing

Timing Budget for Read – JEDEC Min From LPDDR4

•   1.6 GHz frequency has a clock period of 625 picoseconds
    − Double data rate gives a theoretical window of 312.5 picoseconds
•   JEDEC standards require LPDDR4 to provide a minimum window of 70% of the
    theoretical window (about 219 picoseconds), so the DRAM may consume up to 94 picoseconds
    − Accounts for all skew, slew rate differences, and jitter from the LPDDR4 package
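
The arithmetic behind these figures can be checked with a short sketch. It assumes the 94 picosecond figure is the 30% of the theoretical window that the DRAM is allowed to consume, which is the reading consistent with the 70% requirement; the function name is illustrative.

```python
def unit_interval_ps(ddr_clock_ghz: float) -> float:
    """Half of one clock period, in picoseconds (one data bit time for DDR)."""
    period_ps = 1000.0 / ddr_clock_ghz
    return period_ps / 2.0


ui = unit_interval_ps(1.6)          # theoretical data window
dram_min_window = 0.70 * ui         # window the DRAM must still provide
dram_consumed = ui - dram_min_window

print(f"theoretical window: {ui:.1f} ps")
print(f"DRAM minimum window (70%): {dram_min_window:.2f} ps")
print(f"consumed by DRAM (30%): {dram_consumed:.2f} ps")
```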
Timing Budget for Read – Processor Flip-Flop times

•   Set up time requirement for Read FIFO of processor
    −   17 picoseconds
•   Hold time requirement for Read FIFO of processor
    −   17 picoseconds
Timing Budget for Read – Vref Uncertainty

•   Vref must meet the following tolerance: +/- 1%
    −   Vref affects the time at which a signal (DQ/DM/CA) is latched into the pads
•   Timing fluctuations for maximum Vref variations
    −   4 picoseconds for Set Up
    −   4 picoseconds for Hold

Timing Budget for Read – DQS Placement Uncertainty

•   Accounts for delay element granularity in the DLL
    −   One delay element is ~ 5 picoseconds long
    −   Manufacturing process variations can change this value.
•   Timing budget for DQS variation is 7 picoseconds applied to Set Up
Timing Budget for Read – Voltage-Temperature Drift

• ZQ calibrations account for signal drive strength on the PCB
• Variations in voltage and temperature affect the delay element time
• Timing budget for maximum allowed Volt-Temp drift
    −   7 picoseconds for Hold
Timing Budget for Read – Tap Size Variation

• The actual delay element tap point may vary
• Timing budget allows for 2.2 picoseconds based on manufacturing
  process variations.
Timing Budget for Read – Power Supply Noise

•   Maximum allowed internal power rail ripple is +/- 2%
•   Accounts for jitter introduced on Read signal from package ball to the input
    of the Read FIFO.
•   Timing budget allowances:
    −   Set Up: 8 picoseconds
    −     Hold: 8 picoseconds
Timing Budget for Read – I/O Rise/Fall Skew mismatch

• Accounts for internal Rise/Fall mismatches of the Read signal from the processor
  balls to the Read FIFO.
• Typically caused by different slew rates for rising and falling edges
• Timing budget allowances:
    −   Set Up: 9 picoseconds
    −     Hold: 9 picoseconds
Timing Budget for Read – InterSymbol Interference ISI

•   Accounts for interactions between data traces internal to the processor, from the
    processor balls to the Read FIFO.
•   Timing budget allowances:
    −   Set Up: 8 picoseconds
    −     Hold: 8 picoseconds
Timing Budget – Allowance for Trace Length Mismatch

•   The remaining Timing Budget is allocated to PCB trace length, internal
    package length, and design margin.
•   For the most robust design, match trace lengths as closely as possible:
    − Add the given internal package length to the PCB trace length, and then match lengths by
     group.

Timing Budget
•   As DDR frequency increases, the time between strobe edges (rise/fall) becomes
    so small that the DRAM system designer needs to account for all possible errors
    in timing.
•   The frequency itself provides the maximum available time in a window.
•   Three major components of a DRAM system account for all errors:
    − The DRAM device
    − The PHY on the processor
    − The interconnecting system (the PCB)
     ▪   Includes package substrate up to silicon pads.
     ▪   IBIS models include necessary information.

At 1.6 GHz, the maximum data window is 313 picoseconds. Uncertainties on the DRAM and processor
reduce this window to 110 picoseconds. If further errors on the PCB amount to more than 110
picoseconds, there are potential problems with data integrity.
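
The 110 picosecond figure can be reproduced by subtracting the per-slide allowances from the DRAM's minimum window. The sketch below is one way to tally them. It assumes each allowance is counted once exactly as listed on the preceding slides, and that the tap size variation (2.2 ps, not labeled as setup or hold in the source) is booked once. The dictionary and function names are illustrative.

```python
UI_PS = 312.5                      # theoretical window at 1.6 GHz
DRAM_MIN_WINDOW_FRACTION = 0.70    # JEDEC minimum window provided by LPDDR4

# (setup_ps, hold_ps) allowances on the processor side
PROCESSOR_ALLOWANCES = {
    "Read FIFO setup/hold":        (17.0, 17.0),
    "Vref uncertainty":            (4.0, 4.0),
    "DQS placement uncertainty":   (7.0, 0.0),
    "Voltage-temperature drift":   (0.0, 7.0),
    "Tap size variation":          (2.2, 0.0),   # assumption: counted once
    "Power supply noise":          (8.0, 8.0),
    "I/O rise/fall skew mismatch": (9.0, 9.0),
    "Inter-symbol interference":   (8.0, 8.0),
}


def remaining_budget_ps(ui_ps, dram_fraction, allowances):
    """Window left for PCB traces, package length, and design margin."""
    window_after_dram = ui_ps * dram_fraction
    consumed = sum(setup + hold for setup, hold in allowances.values())
    return window_after_dram - consumed


remaining = remaining_budget_ps(UI_PS, DRAM_MIN_WINDOW_FRACTION, PROCESSOR_ALLOWANCES)
print(f"remaining for PCB and margin: about {remaining:.0f} ps")
```

Under these assumptions the result lands within a picosecond of the 110 ps quoted above.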
i.MX 8QM/QX
DDR Initialization Flow

DDR Initialization Flow
•   Three main initialization components
    − Controller/PHY initialization
    − DRAM initialization
    − Data training
•   Data training (calibration) is part of the init flow
    − Data training is specific to the DRAM technology
•   Initialization sequence must adhere to the order shown here
    − Includes sequence order for data training
•   The DDR Register Programming Aid (RPA) takes care of this

[Flow diagram: DDR Controller/PHY register initialization → DRAM initialization → DRAM training (LP4 / DDR4 / DDR3) → PHY/DRAM Ready]
i.MX 6 Versus i.MX 8QM/QXP DDR Initialization Process
i.MX 6 Series
1. Create an initial DRAM initialization script from RPA
2. Run initial DRAM initialization
3. Run calibration and then test to make sure board works
4. Run calibration on a number of boards and obtain average values
5. Place averaged calibration values into DRAM initialization script
6. Run updated DRAM initialization
7. Perform testing on several boards

i.MX 8QM/QX
1. Create an initial DRAM initialization script from RPA
2. Run DDR stress test based on the script
3. Tweak the script (if necessary) to make sure it can pass on several boards

i.MX 8QM /QX
DDR Calibration Details

DDR Data Training
• Different DDR technologies (LPDDR4, DDR4, DDR3) require different data training
• Data training is part of the initialization process
    − Write PIR register
    − Poll for completion

•   Command Bus Training (CBT) is not automatic and requires a SW algorithm
    − Currently under investigation and development by R&D

DDR Training/Calibration Introduction
    DRAM Calibration                  LPDDR4    DDR4    DDR3L
    Impedance (ZQ) calibration        ✓         ✓       ✓
    Command/address bus training*     ✓
    Write Leveling                    ✓         ✓       ✓
    DQS Gate training                 ✓         ✓       ✓
    Write DQS2DQ training             ✓
    Data Eye training                 ✓         ✓       ✓
    VREF training                     ✓         ✓

* Command Bus Training (CBT) not automatic, requires SW algorithm; currently under investigation and development by R&D

DDR Training (calibration) During Initialization
•   Reason for data training (calibration) during DRAM initialization
    − New DRAM technologies are increasingly fast
    − Tighter timings are affected by delays between the PHY and the DDR memory
     ▪   Factors like board trace length affect these delays
     ▪   Process variations of the SoC and DRAM may also affect these delays
    − JEDEC requires data training for LPDDR4 and DDR4 as part of the initialization

•   Data training is implemented completely by the DDR PHY
    − Some setup may be needed (e.g. enable/disable DQS pull up/down for DQS gate)
    − A simple write to the PHY PIR starts training; then poll the PHY PGSR0 for training complete
    − The RPA handles all of this with no user interaction

•   No longer need to manually run calibration on various boards to take an average
    (as in the case of previous i.MX SoC)
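
The write-then-poll pattern described above can be sketched against a simulated register interface. This is only an illustration: `FakePhy`, the bit positions, and the error mask are hypothetical stand-ins rather than the actual PHY register layout, and on real hardware the RPA-generated script performs these accesses.

```python
PIR_INIT = 1 << 0                  # hypothetical "start" bit in PIR
PGSR0_IDONE = 1 << 0               # hypothetical "done" bit in PGSR0
PGSR0_ERROR_MASK = 0xFFF << 19     # hypothetical error flag field


class FakePhy:
    """Simulated PHY: reports done after a fixed number of status polls."""

    def __init__(self, polls_until_done=3):
        self.pir = 0
        self._polls_left = polls_until_done

    def write_pir(self, value):
        self.pir = value

    def read_pgsr0(self):
        if self.pir & PIR_INIT and self._polls_left <= 0:
            return PGSR0_IDONE
        self._polls_left -= 1
        return 0


def run_training(phy, training_bits, max_polls=1000):
    """Trigger the selected training steps, then poll until done or error."""
    phy.write_pir(training_bits | PIR_INIT)
    for _ in range(max_polls):
        status = phy.read_pgsr0()
        if status & PGSR0_ERROR_MASK:
            raise RuntimeError(f"training error, PGSR0=0x{status:08x}")
        if status & PGSR0_IDONE:
            return status
    raise TimeoutError("training did not complete")


phy = FakePhy()
status = run_training(phy, training_bits=0x200)   # 0x200: hypothetical step bit
print(f"PIR=0x{phy.pir:08x} PGSR0=0x{status:08x}")
```

Bounding the poll loop matters in practice: a board with a DDR fault may never report completion, and a timeout gives a clear failure instead of a hang.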

DDR Calibration After Initialization (Run-time)
• Run-time calibration during DRAM operation compensates for variations in voltage and temperature
• Enabled during initialization of the DRAM, no further user interaction required
    −   Delay line VT compensation
        ▪   Delays vary over time due to voltage and temperature fluctuations
        ▪   PHY contains circuits to monitor delay in the background during DRAM operation
        ▪ Drift compensation logic periodically adjusts delay line select input for variations in voltage/temperature
        ▪ Ensures each delay line maintains a constant time delay as voltage and temperature change during chip operation

    −   Impedance (ZQ) calibration
        ▪   PHY has background calibration/compensation engine
        ▪ Boot time: during PHY initialization, full calibration performed to find initial values
        ▪ Run time: during DRAM operation
            •   ZQ calibration periodically calibrates the output driver impedance and ODT of SoC and DRAM I/Os
            •   Incremental compensation performed in the background

    −   DQS drift detection (applicable only to LPDDR4)
        ▪   PHY logic monitors drift in read DQS signal compared to DQS_GATE input due to DRAM tDQSCK variations over time
        ▪   tDQSCK for DDR3/4 is kept relatively constant by the DRAM and hence does not require DQS drift detection
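
The idea behind delay line VT compensation can be illustrated with a toy model: when the per-tap delay drifts, the tap select is recomputed so the total delay stays near its trained value. The numbers and the function name below are hypothetical, and the real PHY does this in hardware.

```python
def select_taps(target_delay_ps: float, tap_delay_ps: float) -> int:
    """Tap count whose total delay is closest to the target."""
    return max(0, round(target_delay_ps / tap_delay_ps))


target = 78.0                       # hypothetical trained delay in ps
for tap_ps in (5.0, 5.5, 4.6):      # per-tap delay drifting with voltage/temperature
    taps = select_taps(target, tap_ps)
    actual = taps * tap_ps
    print(f"tap={tap_ps} ps -> select={taps}, delay={actual:.1f} ps "
          f"(error {actual - target:+.1f} ps)")
```

The residual error is bounded by half a tap, which is why delay element granularity appears as its own line item in the timing budget.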

DDR Calibration Modes
•   Impedance (ZQ) calibration: occurs as part of PHY initialization and at run-time
•   Performed by the PHY during initialization:
    −   Command/address bus training*
    −   Write Leveling
    −   DQS Gate training
    −   Write DQS2DQ training*
    −   Data Eye training
    −   VREF training**

Note: The DQ training items are performed automatically during DRAM initialization by the DDR PHY.

Specifically, each of these trainings is triggered by programming its specific bit in the
PHY Initialization Register (PIR).

* Applicable only to LPDDR4
** Applicable only to LPDDR4 and DDR4
Impedance (ZQ) Calibration
What
ZQ calibration calibrates I/O driver impedance across PVT

Why
This automatic process tunes the DRAM and the SoC I/O Pad output drivers (drive strength) and ODT values
across changes in process, voltage, and temperature.

How
ZQ calibration is performed as part of the DRAM initialization process.
Auto ZQ calibration is configured via the register DDRC_ZQCTL0 during DRAM initialization

When
ZQ calibration is configured during DRAM initialization to run periodically. Once configured, there is no further
user interaction required.

Command/Address Bus Training (LPDDR4 only)
What
Command/Address Bus Training (CBT) is used to center the Command/Address bus (CS and CA[5:0]) on the rising
clock edge by adjusting internal delays associated with the CA bus

Why
Higher DRAM speeds imply more stringent timing. However, the LPDDR4 CA bus is single data rate, which
gives it more timing margin than a double data rate bus.

How
The QM/QX SNPS PHYv1 does not perform CBT automatically (within JEDEC spec by default). It requires a software
algorithm, which is under investigation by R&D.

When
JEDEC recommends but does not require CBT to be performed during initialization. Another proposal is to run
CBT on a few boards to obtain an average CA delay value and apply to initialization.
Write Leveling
What
Compensates for CK to DQS timing skew by aligning clock
with data strobe to improve signal integrity performance

Why
• For non-LPDDR4: compensates for skew between clock
  and data strobe caused by fly-by topology

•   LPDDR4: compensates for CK-to-DQS timing skew
    affecting timing parameters such as tDQSS (write
    command to 1st DQS latching), tDSS and tDSH (DQS
    setup/hold time)

How
DDR PHY invokes write leveling mode in SDRAM then
delays DQS to align with clock at SDRAM

When
Write leveling training is performed automatically by the
DDR PHY during DRAM initialization
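
As a rough model of the mechanism: in write leveling mode the SDRAM samples CK on each DQS rising edge and returns the sampled level, and the DQS delay is increased until that feedback changes from 0 to 1. The sketch below simulates this with an ideal clock. The 5 ps step, the function names, and the skew values are assumptions for illustration, and the PHY performs the real search in hardware.

```python
def ck_level_at(time_ps: float, ck_period_ps: float) -> int:
    """Ideal CK: high during the first half of each period (rising edge at t = 0)."""
    return 1 if (time_ps % ck_period_ps) < ck_period_ps / 2 else 0


def write_level(dqs_skew_ps: float, ck_period_ps: float,
                step_ps: float = 5.0, max_steps: int = 256) -> float:
    """Return the added DQS delay at which the sampled CK flips from 0 to 1."""
    previous = ck_level_at(dqs_skew_ps, ck_period_ps)
    for step in range(1, max_steps + 1):
        delay = step * step_ps
        sample = ck_level_at(dqs_skew_ps + delay, ck_period_ps)
        if previous == 0 and sample == 1:
            return delay            # DQS edge now lines up with a CK rising edge
        previous = sample
    raise RuntimeError("no 0 -> 1 transition found within the delay range")


# DQS initially arrives 400 ps after the CK rising edge (625 ps period)
added = write_level(dqs_skew_ps=400.0, ck_period_ps=625.0)
print(f"added DQS delay: {added} ps")
```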

DQS Gate Training
What
Training that sweeps read DQS gate over possible gating positions to discover appropriate placement

Why
• The PHY internally gates DQS during non-read operations to prevent erroneous latching of DQS edges
• Precise alignment of the gate within the read preamble is a prerequisite for proper reads
• Delays (such as board trace lengths) in the read path are imprecisely known, so the gate must be trained for a particular system

How
DQS Gate training is performed automatically by the DDR PHY. The PUB features a built-in read DQS strobe gate training unit that
may be triggered as part of the initialization process using the PIR register

When
DQS Gate training is performed automatically during DRAM initialization.
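
The sweep described under "What" can be sketched as: try every gate position, record which ones give good reads, and pick the middle of the longest passing run. The pass criterion and position range below are made up for illustration, and this is not the PUB's actual algorithm.

```python
def center_of_longest_pass(results):
    """Index at the center of the longest run of True values."""
    best_start, best_len = 0, 0
    start = None
    for i, ok in enumerate(list(results) + [False]):   # sentinel closes a final run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_len == 0:
        raise ValueError("no passing gate position found")
    return best_start + (best_len - 1) // 2


def train_dqs_gate(read_ok, positions):
    """Sweep all gate positions and return the one centered in the passing window."""
    results = [read_ok(p) for p in positions]
    return positions[center_of_longest_pass(results)]


# Hypothetical system where gate positions 12..20 fall inside the read preamble
gate = train_dqs_gate(lambda p: 12 <= p <= 20, list(range(32)))
print(f"selected gate position: {gate}")
```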

Write DQS2DQ Training (LPDDR4 only)
What
DQS to DQ training is referred to as “Write training” in JEDEC and “Write DQ training” in DFI.

Why
LPDDR4 Memory devices use an unmatched DQS-DQ path to enable high-speed performance and save
power. As a result, the DQS strobe must be trained to arrive at the DQ latch center-aligned with the data eye.

How
The DQ receiver will latch the data present on the DQ bus when DQS reaches the latch, and DQS2DQ training is
accomplished by delaying the DQ signals relative to DQS such that the data eye arrives at the receiver latch
centered on the DQS transition. Above picture shows the DQ position after the training.

When
DQS2DQ training is performed automatically by the DDR PHY during DRAM initialization.

Data Eye Training
What
The PHY training firmware contains automatic training sequences to perform read and write de-skew which aligns
the data bits to the DQ bit with the longest delay using a bit delay line (BDL). After performing bit de-skew the read
and write eye centering training is executed to place the strobe in the center of the eye defined by the bits in the
respective byte. Below is an illustration of before and after de-skewing and centering.

                  Before                                      After

Why
As bit rates increase to 2133Mbps and beyond, maintaining timing margins in the DDR interfaces has
become more difficult. The PHY solution includes delay lines to compensate for per-bit skew due to factors
such as PHY to IO routing skews, package skews, PCB skew, etc.

When
Read/write de-skew and eye centering are performed automatically by the DDR PHY during DRAM initialization.
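
The two steps described under "What" (de-skew to the slowest bit, then center the strobe) can be illustrated numerically. The arrival times, the 5 ps bit delay line step, and the function names are hypothetical; the PHY determines the real values during training.

```python
def deskew(bit_arrival_ps, bdl_step_ps=5.0):
    """BDL steps to add per bit so every bit lines up with the slowest one."""
    slowest = max(bit_arrival_ps)
    return [round((slowest - t) / bdl_step_ps) for t in bit_arrival_ps]


def center_strobe(eye_open_ps, eye_close_ps):
    """Strobe position in the middle of the common data eye."""
    return (eye_open_ps + eye_close_ps) / 2.0


BDL_STEP_PS = 5.0
UI_PS = 312.5                                   # bit time at 1.6 GHz

arrivals = [102.0, 110.0, 96.0, 121.0, 105.0, 99.0, 115.0, 108.0]   # one byte lane
steps = deskew(arrivals, BDL_STEP_PS)
aligned = [t + s * BDL_STEP_PS for t, s in zip(arrivals, steps)]

# Common eye: opens when the last bit is valid, closes when the first bit changes
strobe = center_strobe(max(aligned), min(aligned) + UI_PS)

print(f"skew before: {max(arrivals) - min(arrivals):.1f} ps, "
      f"after: {max(aligned) - min(aligned):.1f} ps")
print(f"strobe placed at {strobe:.2f} ps")
```

Residual skew after de-skew is limited by the delay line step size, so the common eye ends up only slightly narrower than one bit time.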

VREF Training (LPDDR4 and DDR4)
What
• Write/read eyes should be as wide as possible to provide stable/robust
  memory access.
• Eye position depends upon LCDL (delay line) and VREF values.

Why
• VREF is internally generated by SoC and DRAM.
• VREF training used to determine range of VREF values where memory
  interface (write/read) is stable and then find out an optimum write/read
  eye position.

The following types of VREF training are supported:
DRAM VREF Training: Optimizes the write eye by sweeping DRAM VREF
DQ values inside memory.
Host (i.MX8) VREF Training: Optimizes the read eye by sweeping the PHY
I/O’s VREF setting.

How
VREF training is performed automatically by the DDR PHY during DRAM
initialization.

Note: for DDR3L, VREF is externally supplied, so no VREF training is required.
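
As a rough illustration of why eye position depends on both the delay line and VREF, the sketch below sweeps a made-up pass/fail grid and selects the VREF setting with the widest passing window. The selection rule (widest eye wins), the VREF codes, and the grid values are all assumptions for illustration; the actual PHY training firmware may use a different criterion.

```python
def longest_pass_run(taps):
    """Return (start, length) of the longest run of passing delay taps."""
    best = (0, 0)
    start = None
    for i, ok in enumerate(list(taps) + [False]):  # sentinel closes the last run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best[1]:
                best = (start, i - start)
            start = None
    return best


def pick_vref(sweep):
    """sweep maps a VREF code to its pass/fail list over delay taps.

    Returns (vref_code, center_tap, eye_width) for the widest eye found.
    """
    best_vref, best_start, best_len = None, 0, 0
    for vref, taps in sorted(sweep.items()):
        start, length = longest_pass_run(taps)
        if length > best_len:
            best_vref, best_start, best_len = vref, start, length
    if best_vref is None:
        raise ValueError("interface never passed at any VREF setting")
    return best_vref, best_start + (best_len - 1) // 2, best_len


# Hypothetical sweep: 1 = access passed, 0 = access failed.
sweep = {
    10: [0, 0, 0, 1, 1, 0, 0, 0],
    12: [0, 0, 1, 1, 1, 1, 0, 0],
    14: [0, 1, 1, 1, 1, 1, 1, 0],
    16: [0, 0, 1, 1, 1, 1, 0, 0],
    18: [0, 0, 0, 1, 0, 0, 0, 0],
}
result = pick_vref(sweep)
```

In this example the eye is widest at VREF code 14, so that code and the middle tap of its window are returned.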
i.MX 8QM/QXP DDR Tools Introduction

i.MX 8QM/QX DDR Register Programming Aid (RPA)
Highlights
•   Developed by the SE team; no formal rollout or maintenance
    − Based on scripts provided by design/validation
•   Excel spreadsheet based, transparent, easy to use
•   Helps compute the DDRC register configuration
    − JEDEC timing parameters
    − DDRC DFI timing parameters
    − DDRPHY configuration
•   Helps configure the DDR mode registers
•   Includes the necessary data training for the specific memory type
•   “BoardDataBusConfig” worksheet for data bus swizzling
•   Two output formats
    − DCD CFG file, for SCFW usage (copy into the SCFW board folder)
    − DDR Stress Test Script, for use with the DDR stress test
i.MX 8QM/QX RPA
•   Each tool is specific to a DDR technology:
    LPDDR4, DDR4 or DDR3
•   Applies the correct order of initialization
    steps
    − Controller/PHY initialization
    − DRAM initialization
    − Data training
•   Includes a worksheet for data bus
    mapping
    − Configures the relevant registers for data bit/byte
      swizzling
•   Generates two initialization formats
    − CFG file for use with SCFW (save as .cfg)
    − DDR Stress Test Script (save as .ds)
•   Color-coded cells provide usage
    guidance
RPA – Register Configuration
•   In most cases, the user only needs to update the Device Information table
    − The RPA automatically updates configuration and timings (all timings are based on the JEDEC
      standard)
    − No need to go through all register fields manually (manually editing those fields is
      strongly discouraged)

Figure callouts:
    − Indicates the DDR type the RPA is applicable to
    − Recommend listing the vendor and exact part number
    − User must ensure these are accurate; values are found in the memory device data sheet

RPA – BoardDataBusConfig
• Board layout guidelines allow users to swizzle data bits within a byte
  lane and swap byte lanes
• “BoardDataBusConfig” worksheet – users input SoC data bus
  connection
    − Data bus mapping must be accurate for PHY data training
    − Relevant registers are automatically updated

User must accurately populate this field based on the customer schematics. Errors in this
field may result in data training errors.
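
Because mapping errors surface later as training errors, it can help to sanity-check the mapping taken from the schematic before entering it. The sketch below is a hypothetical helper, not part of the RPA: it assumes the mapping is expressed as a list where index i is the SoC DQ bit and the value is the DRAM DQ bit it is wired to, and it checks the two rules stated above (bits may be swizzled within a byte lane, and whole byte lanes may be swapped).

```python
def validate_bus_map(soc_to_dram):
    """Return a list of problems found in a SoC-DQ to DRAM-DQ mapping."""
    errors = []
    n = len(soc_to_dram)
    if n == 0 or n % 8 != 0:
        return ["bus width must be a non-zero multiple of 8 bits"]
    # Every DRAM DQ must appear exactly once.
    if sorted(soc_to_dram) != list(range(n)):
        errors.append("every DRAM DQ must be used exactly once")
    # All 8 bits of a SoC byte lane must land in the same DRAM byte.
    for lane in range(n // 8):
        bits = soc_to_dram[lane * 8:(lane + 1) * 8]
        dram_bytes = sorted({b // 8 for b in bits})
        if len(dram_bytes) != 1:
            errors.append(f"SoC byte lane {lane} mixes DRAM bytes {dram_bytes}")
    return errors


# Hypothetical 16-bit examples.
swizzled = [3, 1, 0, 2, 7, 5, 6, 4] + list(range(8, 16))       # bits swizzled in lane 0
lane_swap = list(range(8, 16)) + list(range(0, 8))             # byte lanes swapped
crossed = [0, 1, 2, 3, 4, 5, 6, 8, 7, 9, 10, 11, 12, 13, 14, 15]  # bit crosses lanes

ok_swizzled = validate_bus_map(swizzled)
ok_lane_swap = validate_bus_map(lane_swap)
bad_crossed = validate_bus_map(crossed)
```

The first two mappings return no errors, while the third reports both byte lanes because a single bit was routed across a lane boundary.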

RPA – Initialization Scripts
•   Two file formats; simply copy and paste each into a text document:
    − [DCD CFG file] for SCFW (to support SCFW porting); save as .cfg
    − [DDR Stress Test Script] for use with the DDR stress test; save as .ds
•   Manually editing these tabs is strongly discouraged
    − Make changes only to the Register Configuration and BoardDataBusConfig tabs
•   Yellow cells indicate values affected by changes on the Register Configuration and
    BoardDataBusConfig tabs

Figures: DCD CFG file example; DDR Stress Test Script example
i.MX 8QM/QX DDR Stress Test Tool – Overview
•   Supports i.MX 8QM/QX

•   Board hardware requirements
    − USB OTG port for serial download mode
    − Debug AP UART port*
    − SCU UART port (highly recommended)

•   Requires functional SCFW

•   Use the RPA to generate the stress test
    script

* Note: on Windows 10, the COM port driver (FTDI, SiLabs, …) may need to be
installed manually.

Figures: DDR Stress Test folder structure; DDR Stress Test GUI
i.MX 8QM/QX DDR Stress Test Tool – High Level Steps
•   User must first ensure SCFW is working
•   Create a new DDR script with the RPA tool
    −   Based on the DDR device and board hardware design
•   Power on the i.MX 8QM/QX board in serial download mode
    −   USB OTG and AP UART ports connected correctly
    −   Highly recommend connecting the SCU UART port to a serial terminal
•   Load the DDR script and download the i.MX 8QM/QX binaries to the target board
•   If the DDR Stress Test passes, use the RPA DCD CFG file to create the *.cfg file for
    SCFW
    −   Rebuild SCFW with the updated *.cfg and proceed with u-boot/OS porting
    −   Recommend running an OS stress test (e.g. memtester)

i.MX 8QM/QX DDR Stress Test Tool – SCFW
•   User must first port SCFW to customer board (ensure SCFW is up and
    running)

•   Then build the SCFW for the DDR Stress Test
        make qx R=B0 DDR_CON=ddr_stress_test_parser
    − SCFW will run a special “parser” instead of running DDR init
    − The DDR Stress Test loads the DDR initialization into OCRAM, then the “parser” executes it
    − Copy scfw_tcm.bin to the DDR Stress Test bin folder and rename it as follows:
     ▪ QM: mx8qm_scfw_download.bin
     ▪ QX: mx8qx_scfw_download.bin

•   Connect the SCU UART port to a serial terminal
    − Confirms SCFW is up and running
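
The copy-and-rename step can be scripted. The sketch below is a hypothetical helper: only the file names scfw_tcm.bin, mx8qm_scfw_download.bin and mx8qx_scfw_download.bin come from the slide, while the function name and the directory arguments are assumptions that depend on where the SCFW build output and the DDR Stress Test bin folder live on your machine.

```python
import shutil
from pathlib import Path

# Target file names from the slide, keyed by SoC.
DOWNLOAD_NAMES = {
    "QM": "mx8qm_scfw_download.bin",
    "QX": "mx8qx_scfw_download.bin",
}


def install_scfw(build_dir, stress_bin_dir, soc):
    """Copy scfw_tcm.bin into the stress test bin folder under the expected name."""
    key = soc.upper()
    if key not in DOWNLOAD_NAMES:
        raise ValueError(f"unknown SoC {soc!r}; expected one of {sorted(DOWNLOAD_NAMES)}")
    src = Path(build_dir) / "scfw_tcm.bin"
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; build the SCFW first")
    dst = Path(stress_bin_dir) / DOWNLOAD_NAMES[key]
    shutil.copyfile(src, dst)
    return dst
```

A call such as `install_scfw("path/to/scfw/build", "path/to/stress_test/bin", "QX")` would then place the binary under the QX name; both paths are placeholders.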

i.MX 8QM/QX DDR Stress Test Tool – How to Run
1. Select the correct COM port number for the AP UART, then hit Connect
2. Select the desired DDR initialization script and SoC
3. When the AP UART, DDR script, and SoC are selected, hit Download
4. Select operational features
5. Select the frequency range for the test, or leave as 0 to test at the target frequency
6. Hit Stress test to start running

Figure callouts:
    − Double check the DDR parameters and ensure they match what’s on the board
    − DDR data training status

DDR Stress Test Fails to Run – Common Causes
• The DDR Stress Test should run even when a data training error occurs
• However, in some corner cases, the DDR Stress Test may fail to run

• Make sure the board is in serial download mode and USB OTG is connected
• If all you see is this, first make sure the SCFW is running properly (check
  the SCFW UART port)
• Make sure to build the SCFW for the DDR Stress Test
• If SCFW hangs during DDR init, make sure you are selecting the correct
  *.ds file (in other words, don’t select a QM *.ds file when using QX)
• If SCFW is successful and DDR init has completed, then check that
  you are connected to the correct COM port for the AP UART

Figure: example of successful SCFW execution
i.MX 8QM/QX DDR Stress Test Versus Memtester
•   Once the DDR stress test passes with ample margin, are we guaranteed the OS will
    never fail due to DDR issues?
    −   High degree of confidence that DDR is robust enough, but…
    −   The OS is still the most stressful, particularly an OS stress test like memtester or u-boot
        decompressing the Linux kernel
    −   Recommend running OS stress tests to double-check

i.MX 8X MEK Connection for DDR Stress Test

Figure callouts:
    − USB OTG Type C (direct connection to PC, do not connect through a USB hub)
    − USB-to-UART serial connection (debug UART port)

i.MX 8QM/QX RPA and DDR Stress Test Tools
• As the i.MX 8QM and QXP families are not released yet, please
  contact your local NXP FAE for the RPA and DDR Stress Test tools.
• Eventually these will be posted to Community

Debugging DDR Failures

Potential Causes of DDR Failures
•   DDR data training (during DDR init) finds the best achievable timing and VREF parameters
    for optimal performance
    −   If failures occur, they are most likely to occur early in data training
    −   If failures do occur in data training, here are some suggestions:
        ▪ First, re-check the RPA tool and ensure the DDR parameters/configuration are correct and accurate
        ▪ For errors like DQS2DQ (LP4) and WLERR (write leveling) training, ensure the RPA BoardDataBusConfig is accurate
        ▪ Other errors (less likely): try adjusting drive strength and ODT parameters
        ▪ Other causes: poor board layout or manufacturing issues; bad memory device
    −   Data training results are reported by the DDR stress test

•   Post-training DDR failures are unlikely, but here are some possible causes
    −   Incorrect row, col, chip select, or data bus size (failures would occur consistently when crossing
        certain memory boundaries)
    −   Power supply noise or spikes; refer to the HW Developers Guide for board design techniques (cap placements,
        power supply design, etc.)
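
One quick way to cross-check the row, column, chip select, and bus width entries is to compute the total capacity they imply and compare it with the expected memory size. The sketch below uses the common addressing model in which each row/column/bank address selects one bus-width word; the function name and the example geometry are illustrative assumptions, so take the real values from the memory data sheet.

```python
def ddr_capacity_bytes(row_bits, col_bits, bank_bits, bus_width_bits, chip_selects=1):
    """Total addressable bytes implied by the DRAM geometry settings."""
    if bus_width_bits % 8 != 0:
        raise ValueError("bus width must be a whole number of bytes")
    words_per_cs = 1 << (row_bits + col_bits + bank_bits)
    return words_per_cs * (bus_width_bits // 8) * chip_selects


GIB = 1024 ** 3

# Illustrative geometry: 16 row bits, 10 column bits, 8 banks, 32-bit bus.
expected = ddr_capacity_bytes(16, 10, 3, 32, chip_selects=1)

# Same device entered with one column bit too few: only half the memory is
# addressed correctly, so accesses fail once they cross that boundary.
wrong_cols = ddr_capacity_bytes(16, 9, 3, 32, chip_selects=1)
```

If the computed figure does not match the fitted memory, one of the geometry fields is wrong, and the mismatch also indicates the address boundary at which failures would start to appear.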

Debugging DDR Failures Flow Chart

Start: DDR initialization and data training (RPA)

1. Data training pass?
   Y: DDR good to go
   N: Re-check DDR initialization and “BoardDataBusConfig” to account for bit/byte swizzling
2. Data training pass?
   Y: DDR good to go
   N: Adjust drive strengths and ODT
3. Data training pass?
   Y: DDR good to go
   N: Likely board layout/manufacturing/power-supply-design issue or bad DDR
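
The flow chart can also be read as a small decision function. The sketch below is only a restatement of the chart in code; the function name and the idea of counting remediation steps already tried are assumptions made for illustration.

```python
REMEDIATION_STEPS = [
    "Re-check DDR initialization and BoardDataBusConfig to account for bit/byte swizzling",
    "Adjust drive strengths and ODT",
]


def next_action(training_passed, steps_already_tried):
    """Return the next step of the debug flow for the latest training result."""
    if steps_already_tried < 0:
        raise ValueError("steps_already_tried must be >= 0")
    if training_passed:
        return "DDR good to go"
    if steps_already_tried < len(REMEDIATION_STEPS):
        return REMEDIATION_STEPS[steps_already_tried]
    # Both remediation steps were tried and training still fails.
    return "Likely board layout/manufacturing/power-supply-design issue or bad DDR"


first_failure = next_action(False, 0)
second_failure = next_action(False, 1)
third_failure = next_action(False, 2)
fixed = next_action(True, 1)
```

Each failed training run moves one step further along the chart, and a pass at any point ends the flow.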

How to Adjust Drive Strength and ODT in RPA
• Values can be adjusted in the
  Register Configuration tab
• Adjustable parameters depend
  on DRAM type (green-shaded
  cells)
• Adjusts parameters for:
    −   CA (command and address) bus
    −   DQ bus
• Pull-down menus list the impedance
  options
• Recommend starting with the RPA
  defaults
    −   Tuned by validation for best possible
        signal integrity on NXP validation
        boards
    −   To date, we’ve not seen a need to
        adjust them

How to Adjust Drive Strength and ODT in RPA
LPDDR4 Example

Figure callouts:
    − Controls pull-up and pull-down drive strength for CA bus
    − Note: for CA bus (output only), ODT irrelevant
    − Controls pull-up and pull-down drive strength for DQ bus
    − ODT control for DQ bus. Note, also adjusts DRAM MR22: SOC_ODT

 Note: DRAM drive strength control can be found in MR3 register and ODT control can be found in the MR11 register
How to Adjust Drive Strength and ODT in RPA
DDR3 Example

Figure callouts:
    − Controls pull-up and pull-down drive strength for CA bus
    − Note: for CA bus (output only), ODT irrelevant
    − Controls pull-up and pull-down drive strength for DQ bus
    − ODT control for DQ bus

Note: DRAM drive strength and ODT control can be found in the MR1 register
www.nxp.com