OPENGL BLUEPRINT RENDERING - Christoph Kubisch, 4/7/2016

Page created by Terrance Sanders

Uncategorized

English

Like
Share
Embed
Fullscreen
Slides
Download HTML
Download PDF
Abuse

←

→

Page content transcription

If your browser does not render page correctly, please read the page content below

OPENGL BLUEPRINT RENDERING - Christoph Kubisch, 4/7/2016

April 4-7, 2016 | Silicon Valley

OPENGL BLUEPRINT RENDERING
Christoph Kubisch, 4/7/2016

MOTIVATION

Blueprints / drawings in CAD/graph viewer
applications

Documents can contain many LINES and LINE_STRIPS

Various line styles can be used (world-space widths,
stippling, joints, caps...)

Potential CPU bottlenecks

  Generating geometry for complex styles

  Collecting and rendering geometry
                                                       Model courtesy of PTC

                                                                     2

MOTIVATION
                       Not targeting full vector graphics

NV_path_rendering covers high fidelity vector
graphics rendering

Per-pixel quadratic Bézier evaluation

Stencil & Cover pass to allow sophisticated
blending

Focus of this talk is rendering lines defined by
traditional vertices

Rendering data from OpenGL buffer objects

Single-pass, but does mean not safe for blending
(does self-overlap)
                                                            3

DEMO: BASIC DEMONSTRATION
                            4

LINE RASTERIZATION
                              Representation

Standard:
skewed rectangle
pixel snapped lines

Multisampling:
aligned rectangle
smooth lines

Both suffer from visible gaps and
overlaps on increasing line width
                                               5

LINE RASTERIZATION
                                Stippling

Stippling only in screenspace

Patterns must be expressable with
16 bits

LINES re-start pattern every
segment

LINE_STRIPS have continous
distance
                                            6

SHADER-DRIVEN LINES

         TECH                         Appearance
                                       on screen

Create TRIANGLES/QUADS          Geometry in world
for line segments                  coordinates
Project extruded vertices
to keep line width
consistent
                             Shapes via fragment
Clip and color in fragment     shader discard
shader based on UV
coordinates and line
distance

                                                    7

SHADER-DRIVEN LINES

        TECH                    FLEXIBILITY

Create TRIANGLES for line   Arbitrary stippling
segments, project           patterns and line widths
extrusion to
world/screen, discard       Joint- and cap-styles
fragments
                            Different distance metrics

                            New coloring/animation
                            possibilities via shaders

                                                         Thin center line as effect
                                                                                  8

SHADER-DRIVEN LINES

        TECH                    FLEXIBILITY                    CAVEATS

Create TRIANGLES for line   Arbitrary stippling          Cannot be as fast as basic
segments, project           patterns and line widths     line rasterization
extrusion to
world/screen, discard       Joint- and cap-styles        Not all data local at
fragments                                                rendering time (line strip
                            Different distance metrics   distances need extra
                                                         calculation)
                            New coloring/animation
                            possibilities via shaders    Geometry still self-
                                                         overlaps

                                                                                      9

SHADER-DRIVEN LINES
                        Sample implementation/library

C interface library to render different
line primitives (LINES, LINE_STRIPS,
ARCS) provided as flexible framework
rather than black-box

Two different render-modes:
render as extruded triangles,
or one pixel wide lines

Uses NVIDIA and ARB OpenGL extensions
if available

                                                        10

SHADER-DRIVEN LINES
                               Sample implementation/library

Global style and stipple definitions                               Style-                         Stipple-
                                                                 Definitions                      Patterns
Stipple from arbitrary bit-pattern, or                     Style 0                          Pattern texture A

float values                                               Style 1                          Pattern texture B
                                                           ...                              ...

                                                  typedef enum NVLSpaceType_e {    typedef enum NVLCapsType_e {
  typedef struct NVLStyleInfo_s {
                                                    NVL_SPACE_SCREEN,                 NVL_CAPS_NONE,
    NVLSpaceType            projectionSpace;
                                                    NVL_SPACE_SCREENDIST3D,           NVL_CAPS_ROUND,
    NVLJoinType             join;
                                                    NVL_SPACE_CUSTOM,                 NVL_CAPS_BOX,
    NVLCapsType             capsBegin;
                                                    NVL_SPACE_CUSTOMDIST3D,           NVL_NUM_CAPS,
    NVLCapsType             capsEnd;
                                                    NVL_NUM_SPACES,                 }NVLCapsType;
    float                   thickness;
                                                  }NVLSpaceType;
    NVLStippleID            stipplePattern;
    float                   stippleLength;
                                                  typedef enum NVLAnchorType_e {   typedef enum NVLJoinType_e {
    float                   stippleOffsetBegin;
                                                    NVL_ANCHOR_BEGIN,                NVL_JOIN_NONE,
    float                   stippleOffsetEnd;
                                                    NVL_ANCHOR_END,                  NVL_JOIN_ROUND,
    NVLAnchorType           stippleAnchor;
                                                    NVL_ANCHOR_BOTH,                 NVL_JOIN_MITER,
    NVLboolean              stippleClamp;
                                                    NVL_NUM_ANCHORS,                 NVL_NUM_JOINS,
  } NVLStyleInfo;
                                                  }NVLAnchorType;                  }NVLJoinType;
                                                                                                         11

SHADER-DRIVEN LINES
                        Sample implementation/library

Uses GPU friendly collection mechanism:
Record many primitives then render               Geometry/Raw Recording
Optionally render sub-sections
                                           Geometry Primitives     VBO reference
Raw Primitives pass vertex data directly
                                            Raw Primitives             Vertex values
Geometry Primitives reference existing
                                             Matrix            Color
Vertex Buffers
                                             Style reference
Collections have usage-style flags:

  filled new per-frame

  recorded once, re-used many frames
                                                                                   12

SHADER-DRIVEN LINES
                                  Quad extrusion

Faster geometry creation by just using Vertex-                   VS               GS
Shader, avoiding extra Geometry-Shader stage

Render GL_QUADS (4 vertices each segment)
                                                       VertexBuffer
Use gl_VertexID to fetch line points               texelFetch(...gl_VertexID/4 + 0 or 1)

Use it for the offsets as well

Using custom vertex-fetch generally not
recommended, but useful for special
situations
                                                 gl_VertexID % 4 + 0   gl_VertexID % 4 + 1

                                                                                       13

SHADER-DRIVEN LINES
                              Minimize Overdraw

No naive rectangles but adjacency in
LINE_STRIP is used to tighten the geometry

Reduces overdraw and minimizes potential
artifacts resulting from that

                                                  14

SHADER-DRIVEN LINES
                                   Depth clamping

Joints and caps exceed original line
definition

Can cause depth-buffer artifacts

Prevent depth over-shooting by passing
closest depth to fragment shader and
                                           #extension GL_ARB_conservative_depth : require
clamp there                                layout (depth_greater) out float gl_FragDepth;

                                           in flat float closestPointDepth;
Can use ARB_conservative_depth or just     ...
min/max to keep hardware z-cull active
                                           gl_FragDepth = max(gl_FragCoord.z,
                                                              closestPointDepth);

                                                                                      15

DISTANCE COMPUTATION

                                                                          V0           V1
LINE_STRIPS need dedicated calculation phase                                                     V2

Read vertices and calculate distances along the strip
                                                                                                        V3
                   VertexBuffer V

                                                                               Sections drawn indepedently
                                                                               Fetch vertices & distances
    Strip Length     4          0           1             2               3
                                                                                 D0                     D1

                                                                                 D1         D2

      DistanceBuffer D      0   [0,1]   [0,1]+[1,2]   [0,1]+[1,2]+[2,3]
                                                                                 D2         D3

Distances are fetched at render-time
                                                                                                   16

DISTANCE COMPUTATION
                                  Shader Tips
                                                              Thread: 0 ... 3
One LINE_STRIP per thread can lead to
under utilization and non ideal memory     Strip Length   3      2     4        8
access due to divergence

SIMT hardware processes threads together   VertexBuffer
in lock-step, common instruction pointer
(masks out inactive threads).                Distance     0       3     5       9
NVIDIA: 1 warp = 32 threads                Accumulation   1       4     6       10
                                               Loop       2       -     7       11
                                                          -       -     8       12
                                                          -       -     -       13
                                                          -       -     -       14
                                                          -       -     -       15
                                                          -       -     -       16
                                                                                 17

DISTANCE COMPUTATION
                                                Shader Tips
                                                                                Thread: 0 ... 3
Compute one LINE_STRIP at a time across
warp, gives nice memory fetch                           Strip Length        9        9         9       9
NV_shader_thread_shuffle to access
neighbors and do prefix-sum calculation                 VertexBuffer

                                                          Distance
vec3 posA = getPosition ( gl_ThreadInWarpNV + …)
vec3 posB = shuffleUpNV (posA, 1, gl_WarpSizeNV);       Accumulation       0         1        2        3
... Handle first thread point differently                   Loop
float dist = distance(posA, posB);
                                                        Access neighbor
                                                            point via     [0,0]    [0,1]    [1,2]     [2,3]
                                                        shuffleUpNV and
Short strips may still under-utilize warp,             compute distance   ... Prefix-sum over distances ...
but are taking only one iteration
                                                                           4         5        6        7
                                                                           8         -        -         -

                                                                                                        18

DISTANCE COMPUTATION
                           Batching & Latency hiding
                                                                                     Effective
                                                Warp 0   Warp 1   Warp 2   Warp 3
                                                                                    Utilization
Memory intensive operations prefer many         Fetch
threads to hide latency of fetch
                                                Wait
                                                 For
Would not „compute“ distance for a single      Memory
strip, but need many strips to work on
                                               Compute

Use one warp per strip if total amount of
threads is low

                                       Hardware switches activity
                                         between entire warps
                                                                                           19

DISTANCE COMPUTATION
                         Batching & Latency hiding

Launch overhead of compute dispatch not     ... “Compute” alternative for few threads
                                            if (numThreads < FEW_THREADS){
negligable for < 10 000 threads               glUseProgram( vs );
                                              glEnable    ( GL_RASTERIZER_DISCARD );
                                              glDrawArrays( GL_POINTS, 0, numThreads );
Use glEnable(GL_RASTERIZER_DISCARD); and    }
                                              glDisable   ( GL_RASTERIZER_DISCARD );

Vertex-Shader to do compute work            else {
                                              glUseProgram( cs );
                                              numGroups = (numThreads+GroupSize-1)/GroupSize;
No shared memory but warp data sharing as     glUniformi1 (0, numThreads);
                                              glDispatchCompute ( numGroups, 1, 1 );
seen before (ARB_shader_ballot or           }

NV_shader_thread_shuffle)                   ... Shader
                                            #if USE_COMPUTE
                                              layout (local_size_x=GROUP_SIZE) in;
                                              layout (location=0) uniform int numThreads;
                                              int threadID = int( gl_GlobalInvocationID.x );
                                            #else
                                              int threadID = int( gl_VertexID );
                                            #endif

                                                                                          20

SMOOTH TRANSITIONS
                       Anti-aliasing edges within shader

Fragment shader effects cause outlines of visible     No geometric edges  No MSAA benefit
shapes to be within geometry

MSAA will not add quality „within triangle“

Need to compute coverage accurately (sample-
shading) or approximate

Use of gl_SampleID (e.g. with
interpolateAtSample) automatically makes            in float stippleCoord;
                                                    ...
shader run per-sample, „discard“ will affect
coverage mask properly                              sc = interpolateAtSample (stippleCoord, gl_SampleID);
                                                    stippleResult = computeStippling( sc );
                                                    if (stippleResult < 0) discard;
Cheaper: GL_SAMPLE_ALPHA_TO_COVERAGE or
clear bits in gl_SampleMask
                                                                                               21

SMOOTH TRANSITIONS
                                     Using Pixel Derivatives
                                                                       1                    1
Simple trick to get smooth transitions, also works             0           0
well on surface contour lines

Use a signed distance field, instead of step                       1
                                                                        fwidth
                                                                       ( signal )
function                                                                            smoothing
                                                                                    zone
                                                                                    around zero
Find if sample is close to transition (zero          0
crossing) via fwidth
                                                                                    signal
Compute smooth weight if required                   -1                              within
                                                                                    smoothing
                                                                                    zone
   float weight = signal < 0 ? -1 : 1;
   float zone = fwidth ( signal ) * 0.5;
   if (abs (signal) < zone){
     weight = signal / zone;
   }

                                                                                       22

RECORDING RAW DATA
                       Using persistent mapped buffers
                                                                          Timeline
When primitives & vertices are not re-used, but
                                                                CPU filling GPU rendering
regenerated by CPU, we want a fast way to get
                                                                  Buffers A
them to GPU
                                                   Cycle sets                     Buffers A
Use ARB_buffer_storage/OpenGL 4.3 to have          of buffers
                                                      each        Buffers B    Signal Frame Fence
buffers in CPU memory for fast copying               frame
                                                                                  Buffers B
                                                                  Buffers C
Need fences to avoid overwriting data still used
by GPU, 3 frames typically enough to avoid                        Wait Fence
                                                                                  Buffers C
synchronization                                                   Buffers A

CPU memory access „okayish“ if data only read                                     Buffers A
rarely (once for stipple-compute, once for                        Buffers B
render)                                                                           Buffers B

                                                                                          23

RENDERING ARCS

Not trivial to compute distance along an arbitrary
projected arc/circle

Approximate circle as line strip

Allocate maximum subdivision

Compute adaptively based on screen-space size
(or frustum cull)

Rendering only needs to fetch distance values,
can still compute position on the fly

                                                     24

OUTLOOK & CONCLUSION

Preserving all primitive order not optimal for
performance, ideally application can operate in layers.

Code your own special primitives for annotations
(arrows...)

Use of shaders can increase visual quality beyond
„fancy surface shading“

Do not need actual geometry for everything (distance
fields are great)

GPU programmable enough to move more effects from
CPU to GPU
                                                          25

April 4-7, 2016 | Silicon Valley

THANK YOU

 JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join
 ckubisch@nvidia.com @pixeljetstream

You can also read

Archdiocese of Newark Catholic Schools Curriculum Mapping

Empirical determination of atomic line parameters of the 1.5 µm spectral region

DEEPENING UNDERSTANDING MARK MY WRITING - YEAR 6 NEWSPAPER REPORT

Modeling Agent-based Applications with LTSA - February 13, 2001

SPECTROSCOPIC EVIDENCE ON REALIZATION OF A GENUINE TOPOLOGICAL NODAL LINE SEMIMETAL IN LASBTE

Dynamically induced cascading failures in power grids

Optimizing timetable and network reopen plans for public transportation networks during a COVID19-like pandemic

Electrical Line Clearance Management Plan - Acciona

The Rail Market in the Nordic Countries 2013 - Brooks Reports

Origin and diversity of mutants controlled by the Uq transposable element system in maize

Molecular hydrogen in the chromosphere IRIS observations and a simple model

Cellular and Molecular Characteristics of Established Childhood Soft-Tissue Sarcoma Cell Lines

Establishing Esthetic Lateral Cephalometric Values for Thai Adults after Orthodontic Treatment - ThaiJO

CLICK BAR User Manual - Duotone

Astrophysics of Galaxies 2019-2020 - Lecture V: The Spectral Energy Distribution of Galaxies Star Formation Rate, Ionized ISM

Draft decision TransGrid transmission determination 2015-16 to 2017-18 Attachment 11: Service target performance incentive scheme - November 2014 ...

A K-Nearest Neighbor Heuristic for Real-Time DC Optimal Transmission Switching

CLYDEPORT MOORING GUIDELINES - June 2021 - Peel Ports

Emission lines from X-ray illuminated accretion disc in black hole binaries

AN2656 Application note - STM32F10xxx LCD glass driver firmware

Estimating numbers of EMS-induced mutations affecting life history traits in Caenorhabditis elegans in crosses between inbred sublines

ROAD EXTRACTION AND VECTORIZATION FROM AERIAL IMAGE DATA

Get Coordinated! Grade 6-8 STEM Challenge Inspired by Cory, a CNC Machinist in the Indiana Uplands - Regional ...

Assessing HVDC Transmission for Impacts of Non Dispatchable Generation - June 2018 - EIA

Thematic Area: Disaster Risk Management and Resilient Development - Annual Work Plan 2020-2021 - Foro Prosur

Whozz Calling? Basic POS 4&8 Caller ID Unit Product Manual - Revision 3.2 -01/05/2021

Package 'geosphere' - CRAN

KITEBOARDING USER MANUAL - Naish Kiteboarding

Credit Line Models for Supply Chain Enterprises with Channel Background and Soft Information - MDPI

Cardinia Shire Council Electric Line Clearance Management Plan 2021 2022

Sparkrock 365 Fall 2020 Update 2 Release Notes - April 2021 Copyright 2021 Sparkrock. All rights reserved - NET

Plan of Services for Annexation A-1-2021 Of Lands North and West of Shaw Drive April 2021 - PUBLIC REVIEW COPY Display from April 21, 2021 - May ...

Coronagraphic Observations of Si X λ14301 and Fe XIII λ10747 Linearly Polarized Spectra Using the SOLARC Telescope

The Hubble PanCET program: Long-term chromospheric evolution and flaring activity of the M dwarf host GJ 3470

MAGIC THIPPRO INTERCOM - CONFIGURATION GUIDE - AVT - AUDIO VIDEO TECHNOLOGIES GMBH

Code of Practice for Avoiding Danger from Overhead Electricity Lines - Issue Date to be confirmed 2018 Stock No 9803203-2

EEX Group DataSource SFTP - Specifications

HYBRID RICE PROGRAM - Philippine Rice Research Institute

Rail technical background - November 2020 - GOV.WALES

Rail Strategy Hertfordshire County Council - June 2016

BASIC LAYOUT GUIDELINES - REQUIREMENTS. RECOMMENDATIONS. NICE TO HAVE - KIT-Bibliothek

2019 QUICK VARIO PRO BAR USER MANUAL - Odo Kiteboarding

2010-2019REVISION 1 TRANSMISSION DEVELOPMENT PLAN - ESKOM

Capital Regulations and Credit Line Management during Crisis Times

Aerosol Jet Printing of Silver Lines with A High Aspect Ratio on A Heated Silicon Substrate - Mdpi

Cécile Favre (IPAG, Grenoble) - Detection of weak transitions in forests of lines

MAGIC PHONERSET QUICK GUIDE - VERSION: 1.001 (18. MARCH 2020) 2020, AVT AUDIO VIDEO TECHNOLOGIES GMBH - AVT - AUDIO VIDEO TECHNOLOGIES GMBH

SP_Ace v1.4 and the new GCOG library for deriving stellar

FUNDAMENTALS - DRAWINGS AND SPECIFICATIONS - BC Campus - LibreTexts

Transmission Ten-Year Development Plan 2011-2020 - Public Version - Eskom