Writing Custom GLSL Shaders for Scatter Plots
You need to render 100k+ scatter points at 60fps, but DOM nodes thrash layout and Canvas 2D bottlenecks the CPU — so the points must move to GLSL shaders.
This is the implementation companion to WebGL shader optimization: the exact vertex and fragment code, attribute layout, and picking pass for a production scatter plot.
Done well, this pipeline renders a million-point scatter in a single draw call at full refresh rate, with smooth circular marks, correct alpha compositing, and instant hover — work that would melt a DOM-based chart and bottleneck a Canvas 2D one. The cost is the GLSL itself, which is unforgiving: a missing precision qualifier, a wrong stride, or a silent shader-compile error renders nothing, with no exception thrown. The discipline below front-loads the checks that catch those failures before they reach a user.
Diagnostic checklist
Mapping a dataset to the GPU
A scatter plot is point-sprite rendering: each row becomes one vertex, the vertex shader projects it to clip space and sets gl_PointSize, and the fragment shader clips the square sprite into an anti-aliased circle. Coordinates and sizes live in pre-allocated Float32Array buffers uploaded once, so per-frame work is a single draw call.
Getting the buffer layout right is where silent bugs hide. Pack X and Y interleaved into one Float32Array, upload with bufferData(gl.ARRAY_BUFFER, data, gl.STATIC_DRAW) for a static dataset, and describe it to the shader with vertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0) — a stride and offset of zero because the two floats are contiguous. The normalize flag must be false for floating-point position data; setting it true would remap your coordinates into the [0,1] range and silently corrupt the plot. If you also pack a per-point size or category, the stride and offsets change, and a mismatch there produces the classic “renders but in the wrong places” failure. During initialization, never call gl.finish() — it forces a CPU-GPU sync that blocks the event loop and inflates Time to Interactive for no benefit.
The fragment shader is where a scatter plot earns its visual quality. WebGL point sprites rasterize as squares by default, so without intervention every point is a hard-edged box. The fix is radial-distance math: gl_PointCoord gives normalized [0,1] coordinates across the sprite, so length(gl_PointCoord - vec2(0.5)) is the distance from the sprite’s center, zero at the middle and about 0.707 at the corners. Feeding that through smoothstep produces a soft, anti-aliased edge for free on the GPU, and discarding fragments beyond the radius clips the square into a circle. Because dense scatter plots overlap heavily, enable alpha blending (SRC_ALPHA, ONE_MINUS_SRC_ALPHA) so overlapping points composite rather than overwrite, and pass alpha as a uniform rather than computing it per fragment to keep fill-rate low on crowded clusters.
Broken vs fixed
// ❌ BROKEN fragment shader: no circle clip, no precision — renders hard squares.
varying vec4 vColor;
void main() {
gl_FragColor = vColor; // Fills the entire square point sprite; edges are aliased squares.
}
// ✅ FIXED fragment shader: radial clip + smoothstep edge = anti-aliased circle.
#version 300 es
precision highp float; // A11Y: highp avoids coordinate truncation that distorts small marks.
in vec4 vColor;
uniform float uAlpha;
out vec4 fragColor;
void main() {
float dist = length(gl_PointCoord - vec2(0.5)); // 0 at center, ~0.707 at corner.
float alpha = smoothstep(0.5, 0.45, dist) * uAlpha; // PERF: branchless soft edge.
if (alpha < 0.01) discard; // Drop fully transparent fragments.
fragColor = vec4(vColor.rgb, alpha);
}
Step-by-step fix
- Compile and link with logs. After compiling each shader, read
gl.getShaderInfoLog(); after linking,gl.getProgramInfoLog(). Silent failures otherwise render black. - Upload coordinates once. Pack X/Y into a
Float32Array,bufferData(..., STATIC_DRAW), thenvertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0). - Project in the vertex shader.
gl_Position = uProjection * vec4(aPosition, 0.0, 1.0)with a CPU-computed orthographicmat4. - Size for HiDPI. Set
gl_PointSize = clamp(uZoom * uBaseSize * uDPR, 1.0, 64.0)so points stay crisp on Retina without branching. - Clip to a circle. In the fragment shader, compute
length(gl_PointCoord - 0.5)andsmoothstepthe edge;discardnear-zero alpha. - Enable blending.
gl.enable(gl.BLEND)andgl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA)before drawing so overlaps composite correctly. - Add GPU picking for hover. Render unique RGB IDs to an offscreen framebuffer,
readPixelsone pixel under the cursor on rAF, and map back to the array index. - Cull off-screen points by checking
gl_Positionbounds in the vertex shader to cut fragment load on zoomed views.
// ✅ FIXED vertex shader: projection + branchless HiDPI sizing.
#version 300 es
precision highp float;
in vec2 aPosition;
in float aSize;
uniform mat4 uProjection;
uniform float uZoom;
uniform float uBaseSize;
uniform float uDPR;
void main() {
gl_Position = uProjection * vec4(aPosition, 0.0, 1.0);
// PERF: clamp instead of if/else — smooth size scaling with no divergence.
gl_PointSize = clamp(uZoom * aSize * uBaseSize * uDPR, 1.0, 64.0);
}
// PERF: throttled GPU pick — readPixels once per frame, O(1) hit-test vs O(n) CPU scan.
let pickScheduled = false;
canvas.addEventListener('pointermove', (e: PointerEvent): void => {
if (pickScheduled) return;
pickScheduled = true;
requestAnimationFrame(() => {
gl.bindFramebuffer(gl.FRAMEBUFFER, pickFBO);
renderPickPass(); // Each point outputs its index packed as RGB.
const px = new Uint8Array(4);
gl.readPixels(e.offsetX, canvas.height - e.offsetY, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, px);
const index = (px[0] << 16) | (px[1] << 8) | px[2];
// A11Y: route the same highlight to keyboard focus so screen readers announce it.
if (index < pointCount) highlightPoint(index);
pickScheduled = false;
});
});
Verification
// PERF/correctness: fail fast on shader and program errors before the first draw.
gl.compileShader(fragShader);
console.assert(
gl.getShaderParameter(fragShader, gl.COMPILE_STATUS) === true,
gl.getShaderInfoLog(fragShader) ?? 'fragment compile failed'
);
gl.linkProgram(program);
console.assert(
gl.getProgramParameter(program, gl.LINK_STATUS) === true,
gl.getProgramInfoLog(program) ?? 'link failed'
);
console.assert(gl.getUniformLocation(program, 'uProjection') !== null, 'uProjection unused/optimized out');
Visually, points should render as soft circles, not squares, and stay sharp on a 2× display. Use EXT_disjoint_timer_query to confirm GPU frame time stays under 16.67ms at full point count, and verify hover highlights the correct point via the picking pass.
The GPU-picking pass deserves its own verification because it fails silently. After rendering the pick pass, read back a known point’s pixel and assert that decoding (R << 16) | (G << 8) | B returns that point’s index. If it returns garbage, the usual causes are a Y-axis flip (gl.readPixels uses bottom-left origin, so you must read at canvas.height - offsetY), an index that overflows 24 bits past ~16.7M points, or a pick shader that writes interpolated rather than flat colors. Throttle the readback to one readPixels per animation frame; calling it on every raw mousemove creates a GPU-CPU synchronization stall that freezes the UI precisely when the dataset is large enough to need picking in the first place.
Edge cases & gotchas
- Beyond framebuffer resolution. Past ~4M points, RGB IDs can collide with sub-pixel coverage; fall back to a CPU quadtree for coarse picking.
- NaN coordinates. A single
NaNin the buffer corrupts the projection; guard withisnan()before division or filter on upload. - Z-fighting on transparency. For overlapping translucent points, disable depth writes (
gl.depthMask(false)) during the transparent pass or sort by depth before upload, and avoid writinggl_FragDepthin a transparent fragment shader. - Contrast and color encoding. Alpha-blended points can fall below WCAG 1.4.11 non-text contrast against the background when stacked thinly; expose a high-contrast toggle that overrides the alpha uniform with opaque fallback colors for low-vision users, and never encode a category by hue alone — pair it with a redundant channel.
- Black-screen-after-resize. Recreating the context on resize without rebinding buffers and re-resolving uniform locations renders nothing; rebuild GPU state on
webglcontextrestoredand after any deliberate context recreation rather than assuming the old handles survive.