Rendering 100k Scatter Points Without Frame Drops

Your scatter plot is smooth at a few thousand points but stutters badly the moment the dataset hits six figures and the user pans or zooms.

At 100,000 points you are firmly past what a CPU rasterizer can sustain inside a 16.67ms frame, which is exactly the threshold the Canvas 2D vs WebGL for Data Visualization guide flags as the move-to-WebGL line. This page is the implementation: typed-array geometry, a single instanced or point-sprite draw call, and pushing setup off the main thread.

The arithmetic is unforgiving. A Canvas 2D arc + fill for one point costs on the order of a microsecond once you include path setup and the JS-to-rasterizer boundary crossing. Multiply by 100,000 and a single frame’s drawing alone comes to roughly 100ms, about six times the 16.67ms budget, before you have done any layout, hit-testing, or data work. There is no Canvas 2D micro-optimization that closes a 6x-to-10x gap; the fix is structural. You stop issuing one command per point and start handing the GPU a flat block of coordinates that it rasterizes in parallel. Once the geometry lives in a GPU buffer, a frame is a single draw call plus a couple of uniform updates, and the per-point cost effectively vanishes from the main thread.
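The budget arithmetic above can be written down as a quick back-of-the-envelope check. This is a minimal sketch: the function names are illustrative, and the one-microsecond per-point cost is the order-of-magnitude figure from the text, not a measurement.

```typescript
// Frame budget at 60fps, in milliseconds (1000 / 60 ≈ 16.67).
const FRAME_BUDGET_MS = 1000 / 60;

// Estimated CPU draw time for one frame when every point costs a fixed
// number of microseconds. The per-point cost is an assumed input.
function estimateCanvasFrameMs(pointCount: number, microsPerPoint: number): number {
  return (pointCount * microsPerPoint) / 1000;
}

// How many frame budgets a given draw time consumes.
function budgetMultiple(frameMs: number): number {
  return frameMs / FRAME_BUDGET_MS;
}

const frameMs = estimateCanvasFrameMs(100_000, 1); // 100 ms of drawing
const multiple = budgetMultiple(frameMs);          // about 6 frame budgets
```

Plugging in a measured per-point cost from your own profile gives a rough estimate of the point count at which Canvas 2D drawing alone exceeds the frame budget.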

Diagnostic checklist

[Figure: From per-point commands to one instanced draw. Axes: point count (x) against frame ms (y). Series: Canvas per-point, WebGL instanced. Reference line: 16.67ms budget.]
Per-point Canvas commands cross the frame budget early; one instanced WebGL draw call stays nearly flat as point count grows.

Broken vs fixed

// ❌ BROKEN: object array + one Canvas command per point, every frame.
interface Pt { x: number; y: number; }
function draw(ctx: CanvasRenderingContext2D, pts: Pt[]) {
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  for (const p of pts) {              // 100k iterations
    ctx.beginPath();
    ctx.arc(p.x, p.y, 2, 0, Math.PI * 2);
    ctx.fill();                        // 100k CPU draw commands → frame blown
  }
}
// ✅ FIXED: flat Float32Array uploaded once, one WebGL draw call (gl.POINTS) per frame.
const gl = canvas.getContext("webgl2")!;
// Build geometry ONCE (ideally in a worker) as interleaved x,y:
const positions = new Float32Array(pointCount * 2); // PERF: flat typed array, no per-point objects
// ...fill positions...

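const posBuf = gl.createBuffer()!;
gl.bindBuffer(gl.ARRAY_BUFFER, posBuf);
gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW); // upload once, not per frame
// Point the shader's a_pos attribute at the buffer.
// Assumes `program` is an already linked shader program that is currently in use.
const aPos = gl.getAttribLocation(program, "a_pos");
gl.enableVertexAttribArray(aPos);
gl.vertexAttribPointer(aPos, 2, gl.FLOAT, false, 0, 0); // 2 floats per point, tightly packed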

function frame() {
  gl.clear(gl.COLOR_BUFFER_BIT);
  // PERF: a single drawArrays rasterizes all 100k points on the GPU.
  gl.drawArrays(gl.POINTS, 0, pointCount);
  requestAnimationFrame(frame);
}
// A11Y: expose the data as a hidden, focusable summary table; GPU pixels are invisible to AT.

The vertex shader maps data coordinates into clip space and sets gl_PointSize; the fragment shader colors the sprite:

#version 300 es
// vertex shader (the #version directive must be the very first line of the source)
in vec2 a_pos;
uniform vec2 u_scale;   // data → clip-space scale
uniform vec2 u_offset;  // pan offset
void main() {
  gl_Position = vec4(a_pos * u_scale + u_offset, 0.0, 1.0);
  gl_PointSize = 3.0;
}
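The matching fragment shader is not shown on this page. One possible version, as a sketch, is below: it is held in a TypeScript string ready for gl.shaderSource, the uniform name u_color is an assumption, and the smoothstep edge values are arbitrary starting points. It shades each sprite as a soft-edged circle using gl_PointCoord.

```typescript
// Fragment shader source for WebGL2 (GLSL ES 3.00).
// `u_color` is a hypothetical uniform name chosen for this sketch.
const fragmentShaderSource = `#version 300 es
precision mediump float;
uniform vec4 u_color;
out vec4 outColor;
void main() {
  // gl_PointCoord runs 0..1 across the sprite; measure distance from its center.
  float d = length(gl_PointCoord - vec2(0.5));
  // Fade alpha to zero between 0.4 and 0.5 for a soft circular edge.
  float alpha = 1.0 - smoothstep(0.4, 0.5, d);
  outColor = vec4(u_color.rgb, u_color.a * alpha);
}
`;
```

For the alpha falloff to be visible, blending has to be enabled on the context, for example with gl.enable(gl.BLEND) and gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA).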

Pan and zoom now update two uniforms (u_scale, u_offset) instead of rebuilding geometry — the 800KB buffer never moves.

This uniform-driven transform is the single most important pattern for interactive density. The naive approach recomputes every point’s screen position in JavaScript on each pan frame and re-uploads the buffer, which puts you right back to O(n) main-thread work and a per-frame upload of the full 800KB buffer. By keeping the points in data space in the buffer and applying the data-to-screen mapping in the vertex shader, the only thing that changes per frame is two vec2 uniforms, 16 bytes in total. The GPU re-projects all 100,000 points itself. The same idea extends to color and size: store a per-point category or magnitude as an extra attribute, read it in the vertex shader to set gl_PointSize or pass it through to the fragment shader as a color input, and even styling changes never touch the main thread.
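As a sketch of the CPU side of this pattern, the helpers below turn a visible data-space rectangle into the two values the vertex shader expects, so that `a_pos * u_scale + u_offset` lands in the -1..1 clip range. The names ViewRect, viewToUniforms and zoomAbout are illustrative, not part of any library.

```typescript
// Visible region of the plot, in data coordinates.
interface ViewRect { xMin: number; xMax: number; yMin: number; yMax: number; }

interface ViewUniforms { scale: [number, number]; offset: [number, number]; }

// Map the view rectangle onto clip space: xMin -> -1, xMax -> +1 (same for y).
function viewToUniforms(view: ViewRect): ViewUniforms {
  const w = view.xMax - view.xMin;
  const h = view.yMax - view.yMin;
  return {
    scale: [2 / w, 2 / h],
    offset: [-(view.xMax + view.xMin) / w, -(view.yMax + view.yMin) / h],
  };
}

// Zoom by `factor` (> 1 zooms in) while keeping the data point (cx, cy)
// at the same place on screen.
function zoomAbout(view: ViewRect, factor: number, cx: number, cy: number): ViewRect {
  return {
    xMin: cx - (cx - view.xMin) / factor,
    xMax: cx + (view.xMax - cx) / factor,
    yMin: cy - (cy - view.yMin) / factor,
    yMax: cy + (view.yMax - cy) / factor,
  };
}
```

On each pan or zoom event you would recompute the view, call viewToUniforms, and push the result with gl.uniform2f for u_scale and u_offset. The position buffer is never touched.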

If each mark needs to be more than a single point sprite (say a small textured glyph or a multi-vertex shape), use instanced rendering instead of gl.POINTS. You upload the shape’s geometry once, upload a per-instance attribute buffer of positions (and optionally colors and sizes), and issue one gl.drawArraysInstanced call. The GPU stamps the shape once per instance, reading the per-instance position from an attribute whose divisor is set to 1 with gl.vertexAttribDivisor, so it advances once per instance instead of once per vertex. This is how you draw 100,000 little squares or markers in a single draw call without 100,000 separate draws.
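A minimal sketch of that instanced setup follows. It assumes the vertex shader declares `layout(location = 0) in vec2 a_corner;` and `layout(location = 1) in vec2 a_pos;` (both names and locations are hypothetical), and it types the context as a small structural subset of WebGL2RenderingContext so the example stands alone. A real WebGL2 context can be passed in directly.

```typescript
// Structural subset of WebGL2RenderingContext used by this sketch.
interface InstancingGL {
  readonly ARRAY_BUFFER: number;
  readonly STATIC_DRAW: number;
  readonly FLOAT: number;
  readonly TRIANGLE_STRIP: number;
  createBuffer(): unknown;
  bindBuffer(target: number, buffer: unknown): void;
  bufferData(target: number, data: Float32Array, usage: number): void;
  enableVertexAttribArray(index: number): void;
  vertexAttribPointer(index: number, size: number, type: number,
                      normalized: boolean, stride: number, offset: number): void;
  vertexAttribDivisor(index: number, divisor: number): void;
  drawArraysInstanced(mode: number, first: number, count: number, instanceCount: number): void;
}

const CORNER_LOC = 0; // assumed location of a_corner (per-vertex quad corner)
const POS_LOC = 1;    // assumed location of a_pos (per-instance data position)

// One unit quad as a 4-vertex triangle strip, shared by every instance.
const UNIT_QUAD = new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]);

// Uploads both buffers once and returns a function that draws all marks.
function setupInstancedQuads(gl: InstancingGL, positions: Float32Array): () => void {
  const instanceCount = positions.length / 2;

  const quadBuf = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, quadBuf);
  gl.bufferData(gl.ARRAY_BUFFER, UNIT_QUAD, gl.STATIC_DRAW);
  gl.enableVertexAttribArray(CORNER_LOC);
  gl.vertexAttribPointer(CORNER_LOC, 2, gl.FLOAT, false, 0, 0);

  const posBuf = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, posBuf);
  gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);
  gl.enableVertexAttribArray(POS_LOC);
  gl.vertexAttribPointer(POS_LOC, 2, gl.FLOAT, false, 0, 0);
  gl.vertexAttribDivisor(POS_LOC, 1); // advance a_pos once per instance, not per vertex

  // One call draws the 4-vertex quad once for every instance.
  return () => gl.drawArraysInstanced(gl.TRIANGLE_STRIP, 0, 4, instanceCount);
}
```

In the vertex shader, the final position would then combine the two attributes, for example by offsetting the projected a_pos by a_corner scaled to the desired mark size.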

Step-by-step fix

// Off-main-thread geometry build keeps the page responsive during load.
const worker = new Worker(new URL("./build-geometry.ts", import.meta.url), { type: "module" });
worker.postMessage({ rows });
worker.onmessage = (e: MessageEvent<{ positions: ArrayBuffer }>) => {
  const positions = new Float32Array(e.data.positions); // transferred, not copied
  gl.bindBuffer(gl.ARRAY_BUFFER, posBuf);
  gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW); // PERF: one upload after worker finishes
};
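The worker side of this exchange is not shown above. A minimal sketch of what build-geometry.ts might contain follows; the Row shape is an assumption, since the page does not define what `rows` looks like, and the worker wiring is left in comments because it only runs inside a worker scope.

```typescript
// Hypothetical row shape; adapt to whatever your parsed data looks like.
interface Row { x: number; y: number; }

// Pack rows into one interleaved [x0, y0, x1, y1, ...] Float32Array.
function buildPositions(rows: readonly Row[]): Float32Array {
  const positions = new Float32Array(rows.length * 2);
  let i = 0;
  for (const row of rows) {
    positions[i++] = row.x;
    positions[i++] = row.y;
  }
  return positions;
}

// Worker wiring, inside build-geometry.ts:
//
//   self.onmessage = (e: MessageEvent<{ rows: Row[] }>) => {
//     const positions = buildPositions(e.data.rows);
//     // The second argument is the transfer list: ownership of the
//     // underlying ArrayBuffer moves to the main thread without a copy.
//     self.postMessage({ positions: positions.buffer }, [positions.buffer]);
//   };
```

After the transfer, the buffer is detached in the worker, so the worker must not read or write `positions` again.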

Verification

// Assert geometry is uploaded once, not per frame.
let uploads = 0;
const realBufferData = gl.bufferData.bind(gl);
(gl as WebGL2RenderingContext).bufferData = ((...a: Parameters<typeof realBufferData>) => {
  uploads++; return realBufferData(...a);
}) as typeof gl.bufferData;
// ...pan/zoom for a while...
console.assert(uploads <= 2, `re-uploaded geometry ${uploads} times; expected at most 2 (initial upload plus worker upload)`);

Record a Performance trace while panning a 100k-point plot; the main-thread frames should sit comfortably under 16.67ms with no long tasks. Watch the FPS meter (DevTools rendering overlay) hold at 60 during continuous zoom.

Why typed arrays beat object arrays

The shift from an array of { x, y } objects to a flat Float32Array is not a stylistic preference; it is the difference between fitting in budget and not. An array of 100,000 small objects scatters those objects across the heap, each with its own header, hidden class, and boxed number properties — easily an order of magnitude more memory than the raw coordinates, and laid out so that reading them thrashes the CPU cache. A Float32Array of 200,000 floats is a single contiguous 800KB block: cache-friendly to iterate, cheap for the garbage collector to ignore (it is one allocation, not 100,000), and — decisively — it is the exact memory layout WebGL’s bufferData wants. Handing a typed array straight to the GPU buffer means no per-point marshaling; the bytes are uploaded as-is. Building the same buffer from an object array would require copying every value into a typed array first, which reintroduces the O(n) main-thread pass you were trying to avoid.
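The 800KB figure falls straight out of the typed array's layout, which a few lines can confirm. This is only an illustration of the sizes involved; pointAt is a hypothetical helper name.

```typescript
const POINT_COUNT = 100_000;
const FLOATS_PER_POINT = 2; // x and y

// One contiguous allocation for every coordinate.
const positions = new Float32Array(POINT_COUNT * FLOATS_PER_POINT);

// 200,000 floats * 4 bytes each = 800,000 bytes.
const totalBytes = positions.byteLength;

// Reading point i is plain index arithmetic, with no object to dereference.
function pointAt(buf: Float32Array, i: number): [number, number] {
  return [buf[2 * i], buf[2 * i + 1]];
}
```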

This is also why the off-main-thread build pays off. Constructing and filling the Float32Array from raw rows — parsing, projecting, packing — is itself O(n) work, and at 100,000+ rows it can block the main thread long enough to drop frames during load. Doing it in a Web Worker keeps the UI responsive, and because a typed array is backed by a transferable ArrayBuffer, you can move it to the main thread with zero-copy transfer rather than a structured-clone copy. The worker hands over ownership of the buffer, the main thread uploads it once, and the points appear without the page ever stuttering.

Edge cases & gotchas

  • Overdraw at high density. Hundreds of overlapping point sprites blend into solid blobs and waste fill rate. Add alpha falloff in the fragment shader or aggregate to a density/heatmap representation past a zoom threshold.
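  • Context loss with large buffers. A lost GL context drops all GPU resources, including buffers, shaders, and programs. Call preventDefault() in the webglcontextlost handler so the context can be restored; then, on webglcontextrestored, recreate those resources and re-upload geometry from the retained Float32Array, so keep that array around.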
  • Worker transfer, not copy. Send the ArrayBuffer in the worker’s transfer list (postMessage(buf, [buf])) to move ownership instead of structured-cloning a megabyte each load.
  • Hit-testing must not redraw the scene. With 100k points, re-rendering on every mousemove to find the hovered point is catastrophic. Build a d3.quadtree from the same coordinates once and query it in O(log n), or use a color-pick framebuffer pass; never scan all points per pointer event.
  • Antialiasing and point size. Hard-edged gl.POINTS look harsh; a fragment-shader smoothstep on distance from the point center gives soft circular sprites. Keep gl_PointSize modest, because very large point sprites multiply fill-rate cost and can hit driver size limits.
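For the hit-testing point above, the sketch below uses a plain uniform grid as a dependency-free stand-in for d3.quadtree. It is a different structure from a quadtree, but it illustrates the same idea of indexing the coordinates once and touching only nearby points per pointer event. All names are illustrative, and queries are in data space, so the pointer position has to be converted from screen space first by inverting the scale and offset.

```typescript
// Uniform-grid spatial index over an interleaved [x0, y0, x1, y1, ...] array.
interface GridIndex {
  positions: Float32Array;
  cellSize: number;
  cells: Map<string, number[]>; // "cx,cy" -> indices of points in that cell
}

// Built once per dataset, not per pointer event.
function buildGridIndex(positions: Float32Array, cellSize: number): GridIndex {
  const cells = new Map<string, number[]>();
  for (let i = 0; i < positions.length / 2; i++) {
    const cx = Math.floor(positions[2 * i] / cellSize);
    const cy = Math.floor(positions[2 * i + 1] / cellSize);
    const key = `${cx},${cy}`;
    const bucket = cells.get(key);
    if (bucket) bucket.push(i);
    else cells.set(key, [i]);
  }
  return { positions, cellSize, cells };
}

// Index of the nearest point within `radius` of (x, y), or -1 if none.
// Only the cells overlapping the search square are visited.
function findNearest(index: GridIndex, x: number, y: number, radius: number): number {
  const { positions, cellSize, cells } = index;
  const minCx = Math.floor((x - radius) / cellSize);
  const maxCx = Math.floor((x + radius) / cellSize);
  const minCy = Math.floor((y - radius) / cellSize);
  const maxCy = Math.floor((y + radius) / cellSize);
  let best = -1;
  let bestDist2 = radius * radius;
  for (let cx = minCx; cx <= maxCx; cx++) {
    for (let cy = minCy; cy <= maxCy; cy++) {
      const bucket = cells.get(`${cx},${cy}`);
      if (!bucket) continue;
      for (const i of bucket) {
        const dx = positions[2 * i] - x;
        const dy = positions[2 * i + 1] - y;
        const d2 = dx * dx + dy * dy;
        if (d2 <= bestDist2) {
          best = i;
          bestDist2 = d2;
        }
      }
    }
  }
  return best;
}
```

A grid works best when density is fairly even. For heavily clustered data a quadtree adapts better, which is one reason to prefer d3.quadtree in practice.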

When even WebGL is not enough

At extreme scales — millions of points, or 100k points redrawn under heavy per-frame computation — the bottleneck can shift away from rasterization toward data movement and aggregation. Two techniques extend the ceiling. First, aggregation: past a zoom threshold where individual points are smaller than a pixel and overplot into solid blobs anyway, switch to a binned density or heatmap representation computed once per view, which both reads better and draws far fewer primitives. Second, off-main-thread rendering with OffscreenCanvas transferred to a worker, so the entire render loop — not just geometry construction — runs off the UI thread and the main thread stays free for interaction and layout. Reach for these only when a profiler shows the single-draw-call WebGL approach itself is the bottleneck; for the common 100k-point case, typed arrays plus one instanced draw and uniform-driven pan/zoom is already enough to hold a steady 60fps.
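The aggregation step can be as simple as counting points per screen-space cell for the current view. A minimal sketch, assuming the same interleaved position array and a hypothetical BinView rectangle in data coordinates:

```typescript
// Visible region in data coordinates (illustrative type name).
interface BinView { xMin: number; xMax: number; yMin: number; yMax: number; }

// Count the points falling in each cell of a cols x rows grid over the view.
// Returns a row-major Uint32Array; points outside the view are skipped.
function binDensity(positions: Float32Array, view: BinView, cols: number, rows: number): Uint32Array {
  const counts = new Uint32Array(cols * rows);
  const w = view.xMax - view.xMin;
  const h = view.yMax - view.yMin;
  for (let i = 0; i < positions.length; i += 2) {
    const x = positions[i];
    const y = positions[i + 1];
    if (x < view.xMin || x >= view.xMax || y < view.yMin || y >= view.yMax) continue;
    // Clamp to guard against floating-point rounding at the upper edge.
    const col = Math.min(cols - 1, Math.floor(((x - view.xMin) / w) * cols));
    const row = Math.min(rows - 1, Math.floor(((y - view.yMin) / h) * rows));
    counts[row * cols + col]++;
  }
  return counts;
}
```

The resulting counts can be uploaded as a small texture and colored in a fragment shader, so the number of drawn primitives depends on the grid size, not the point count. The pass itself is O(n), so it belongs in a worker and should run once per settled view instead of on every frame.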

Frequently Asked Questions

Why does my 100k-point Canvas 2D scatter plot jank?

Because Canvas 2D issues one CPU-bound drawing command per point, so 100,000 points means 100,000 commands per frame, far over the 16.67 millisecond budget. There is no Canvas micro-optimization that closes that gap; move the geometry into a WebGL buffer and draw it in a single GPU call.

How do I keep pan and zoom fast with so many points?

Store points in data space in the GPU buffer and apply the data-to-screen mapping in the vertex shader through scale and offset uniforms. Panning then updates two small uniforms instead of recomputing positions and re-uploading the buffer, so the per-frame cost stays nearly constant.

Should I build the geometry in a Web Worker?

For large datasets, yes. Constructing and packing a 100,000-row Float32Array is O(n) work that can block the main thread during load. Build it in a worker and transfer the ArrayBuffer with zero-copy transfer so the UI stays responsive while the buffer is uploaded once.