WASM-Native Browser Extension for Video Anonymization

Dillip Chowdary
Tech Entrepreneur & Innovator · May 12, 2026 · 8 min read

Bottom Line

A production-grade video anonymization extension is mostly a pipeline problem: capture the tab in an MV3-safe context, move frames into a worker, and keep the hot pixel loop inside WebAssembly. The cleanest Chrome-first design chains a service worker, an offscreen document, a dedicated worker, and a Wasm core.

Key Takeaways

  • Target Chrome 116+ so a service worker can hand a tab-capture stream ID to an offscreen document.
  • Run MediaStreamTrackProcessor and VideoTrackGenerator in a dedicated worker, not on the main thread.
  • Keep the anonymization loop in WebAssembly; keep capture, ROI selection, and messaging in JavaScript.
  • Use the MV3 CSP minimum with 'wasm-unsafe-eval' for extension pages when loading local Wasm.

Real-time video anonymization inside a browser extension sounds like an AI problem, but the harder part is systems design. In Manifest V3, your background logic lives in a service worker, tab capture has user-gesture constraints, and the high-frequency frame loop must stay off the main thread. The stable Chrome-first answer is to capture in an offscreen document, process frames in a dedicated worker, and push the hottest loop into WebAssembly.

Prerequisites and architecture

Bottom Line

Build the extension around a streaming pipeline, not around popup code. Keep capture in MV3-safe extension contexts, keep frame transforms in a worker, and let Wasm own the expensive pixel math.

Prerequisites

  • Chrome 116+ for the clean service-worker-to-offscreen tab capture flow.
  • A current Rust toolchain and wasm-pack to build the local Wasm module.
  • Working knowledge of Manifest V3, service workers, and browser media streams.
  • A simple anonymization target. This tutorial uses user-defined rectangular regions of interest instead of automatic face detection.

Why this architecture

  • MV3 service workers do not have DOM access, so they should coordinate capture, not draw frames.
  • Offscreen documents can use DOM APIs without opening visible UI and are explicitly supported for user media and workers.
  • MediaStreamTrackProcessor and VideoTrackGenerator are worker-centric APIs, which makes them a natural fit for a dedicated transform worker.
  • WebAssembly is best used for the inner blur or mosaic loop, where JavaScript would otherwise spend most of its time walking typed arrays.
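
The pipeline spans three contexts, so it pays to fix the message contract before writing any of them. Sketched below as comments; the message names are conventions of this tutorial, not Chrome APIs:

// context-to-context message contract (a sketch)
// service worker -> offscreen document:              { type: 'START_ANONYMIZER', streamId }
// offscreen doc  -> worker (track is transferred):   { track, rois: [{ x, y, w, h, radius }] }
// worker         -> offscreen doc (track is transferred): { type: 'READY', track }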

If you also need to sanitize test payloads, session exports, or synthetic labels while documenting the system, TechBytes’ Data Masking Tool fits naturally beside this pipeline.

Watch out: MediaStreamTrackProcessor and VideoTrackGenerator are not Baseline APIs across browsers. This tutorial is intentionally Chromium-first, not cross-browser by default.
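
Before you commit to this design, add a cheap capability check in whichever context will own the frame loop. A minimal sketch; how you react to an unsupported browser is up to you:

// feature-detect insertable streams before wiring the pipeline (a sketch)
const supportsInsertableStreams =
  typeof MediaStreamTrackProcessor !== 'undefined' &&
  typeof VideoTrackGenerator !== 'undefined';

if (!supportsInsertableStreams) {
  throw new Error('Insertable media streams are unavailable in this browser');
}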

Step 1: Scaffold the extension

Create an MV3 shell with a service worker and an offscreen page. The service worker responds to the user gesture, requests a stream ID, ensures the offscreen page exists, and forwards the start message.

{
  "manifest_version": 3,
  "name": "Wasm Video Anonymizer",
  "version": "0.1.0",
  "permissions": ["tabCapture", "offscreen"],
  "action": { "default_title": "Start anonymizer" },
  "background": { "service_worker": "background.js", "type": "module" },
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self';"
  },
  "web_accessible_resources": [{
    "resources": ["wasm/pkg/*"],
    "matches": ["<all_urls>"]
  }]
}
// background.js
const OFFSCREEN_URL = 'offscreen.html';

async function ensureOffscreen() {
  const url = chrome.runtime.getURL(OFFSCREEN_URL);
  const contexts = await chrome.runtime.getContexts({
    contextTypes: ['OFFSCREEN_DOCUMENT'],
    documentUrls: [url]
  });
  if (contexts.length) return;
  await chrome.offscreen.createDocument({
    url: OFFSCREEN_URL,
    reasons: ['USER_MEDIA', 'WORKERS'],
    justification: 'Process captured tab video in a worker'
  });
}

chrome.action.onClicked.addListener(async (tab) => {
  await ensureOffscreen();
  const streamId = await chrome.tabCapture.getMediaStreamId({
    targetTabId: tab.id
  });
  await chrome.runtime.sendMessage({
    type: 'START_ANONYMIZER',
    streamId
  });
});
<!-- offscreen.html -->
<!doctype html>
<html>
  <body>
    <script type="module" src="offscreen.js"></script>
  </body>
</html>

The key detail is the CSP. Chrome’s extension-pages minimum allows local Wasm with 'wasm-unsafe-eval'. That is the supported path for packaged Wasm, while remotely hosted code remains off-limits in MV3.
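
The same CSP line permits any local instantiation path, not just wasm-pack’s wrapper. A minimal sketch of loading a packaged binary by hand, assuming a hypothetical self-contained module at wasm/pkg/standalone.wasm; wasm-bindgen output should go through its generated init() instead:

// hand-rolled load of a packaged .wasm (a sketch)
const wasmUrl = chrome.runtime.getURL('wasm/pkg/standalone.wasm');
const { instance } = await WebAssembly.instantiateStreaming(fetch(wasmUrl), {});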

Step 2: Build the Wasm core

Keep the Wasm surface area small. You do not need the whole pipeline in Rust. You need one deterministic function that mutates RGBA pixels inside a rectangular region.

// src/lib.rs
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn blur_region(
    pixels: &mut [u8],
    width: usize,
    height: usize,
    x: usize,
    y: usize,
    w: usize,
    h: usize,
    radius: usize,
) {
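    // Box blur: average each pixel's neighborhood (up to (2 * radius + 1)^2
    // samples, clamped at the frame edges), reading from the original buffer
    // and writing into a copy so already-blurred pixels never feed back into
    // later samples.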
    let mut out = pixels.to_vec();
    let max_x = (x + w).min(width);
    let max_y = (y + h).min(height);

    for py in y..max_y {
        for px in x..max_x {
            let mut r = 0u32;
            let mut g = 0u32;
            let mut b = 0u32;
            let mut a = 0u32;
            let mut count = 0u32;

            let y0 = py.saturating_sub(radius);
            let x0 = px.saturating_sub(radius);
            let y1 = (py + radius).min(height - 1);
            let x1 = (px + radius).min(width - 1);

            for sy in y0..=y1 {
                for sx in x0..=x1 {
                    let i = (sy * width + sx) * 4;
                    r += pixels[i] as u32;
                    g += pixels[i + 1] as u32;
                    b += pixels[i + 2] as u32;
                    a += pixels[i + 3] as u32;
                    count += 1;
                }
            }

            let i = (py * width + px) * 4;
            out[i] = (r / count) as u8;
            out[i + 1] = (g / count) as u8;
            out[i + 2] = (b / count) as u8;
            out[i + 3] = (a / count) as u8;
        }
    }

    pixels.copy_from_slice(&out);
}
wasm-pack build --target web

This produces a pkg directory with a JS wrapper and a .wasm binary. Bundle that output into your extension under wasm/pkg/. If your glue code needs cleanup before shipping, run it through the TechBytes Code Formatter so the generated imports stay readable during review.

Pro tip: Do not push frame scheduling, messaging, or ROI management into Wasm. Crossing the JS-Wasm boundary too often can erase the performance win you wanted.

Step 3: Wire the video worker

The offscreen document converts the stream ID into a real MediaStream, then transfers the video track into a dedicated module worker. That worker owns the processor, the output generator, the canvas, and the Wasm module.

// offscreen.js
const worker = new Worker('worker.js', { type: 'module' });

chrome.runtime.onMessage.addListener(async (msg) => {
  if (msg.type !== 'START_ANONYMIZER') return;

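  // Chrome's consumption path for tab capture: the non-standard 'mandatory'
  // constraints carry the stream ID minted by the service worker.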
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: false,
    video: {
      mandatory: {
        chromeMediaSource: 'tab',
        chromeMediaSourceId: msg.streamId
      }
    }
  });

  const [track] = stream.getVideoTracks();
  worker.postMessage({
    track,
    rois: [{ x: 220, y: 80, w: 260, h: 160, radius: 8 }]
  }, [track]);
});
// worker.js
import init, { blur_region } from './wasm/pkg/your_crate.js';

await init();

self.onmessage = ({ data }) => {
  const { track, rois } = data;
  const processor = new MediaStreamTrackProcessor({ track });
  const generator = new VideoTrackGenerator();
  const canvas = new OffscreenCanvas(1280, 720);
  const ctx = canvas.getContext('2d', { willReadFrequently: true });

  // Hand the output track back immediately. pipeTo() only settles when the
  // stream ends, so posting READY after awaiting it would never fire while
  // capture is live.
  self.postMessage({ type: 'READY', track: generator.track }, [generator.track]);

  processor.readable
    .pipeThrough(new TransformStream({
      async transform(frame, controller) {
        if (canvas.width !== frame.displayWidth || canvas.height !== frame.displayHeight) {
          canvas.width = frame.displayWidth;
          canvas.height = frame.displayHeight;
        }

        // Copy the frame onto the canvas so Wasm can mutate raw RGBA bytes.
        ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
        const image = ctx.getImageData(0, 0, canvas.width, canvas.height);

        for (const roi of rois) {
          blur_region(
            image.data,
            canvas.width,
            canvas.height,
            roi.x,
            roi.y,
            roi.w,
            roi.h,
            roi.radius
          );
        }

        ctx.putImageData(image, 0, 0);
        const output = new VideoFrame(canvas, { timestamp: frame.timestamp });
        frame.close();
        controller.enqueue(output);
      }
    }))
    .pipeTo(generator.writable)
    .catch((err) => self.postMessage({ type: 'ERROR', message: String(err) }));
};

For a production build, replace the hard-coded ROI array with coordinates from a content script overlay, or with detections from a separate model worker. The important system boundary stays the same: the transform worker receives a track plus anonymization regions and returns a processed track.
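
On the worker side that boundary can stay tiny. A sketch, assuming the offscreen document forwards a hypothetical ROI_UPDATE message whenever the overlay changes, and that transform() reads the shared binding instead of the array captured at startup:

// worker.js addition (a sketch) — swap ROIs between frames
let activeRois = [];

self.addEventListener('message', ({ data }) => {
  if (data.type === 'ROI_UPDATE') {
    activeRois = data.rois; // picked up by the next transform() call
  }
});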

Step 4: Verify and measure

Verification checklist

  1. Load the unpacked extension from chrome://extensions.
  2. Open a tab with moving video content and click the extension action.
  3. Confirm the tab capture indicator appears in Chrome.
  4. Open the offscreen document’s DevTools and verify the worker posts READY.
  5. Inspect frame timing in the worker console and confirm the blurred rectangles repaint smoothly, without visible stutter, as the video beneath them changes.

Expected output

  • The service worker logs a valid stream ID and no permission errors.
  • The offscreen document creates a single video track from getUserMedia().
  • The worker initializes the local Wasm package once, not once per frame.
  • The processed stream remains continuous, and anonymized rectangles appear blurred in every frame that passes through the pipeline.
// optional debug hook in background.js
chrome.tabCapture.onStatusChanged.addListener((info) => {
  console.log('capture-status', info.status, info.tabId);
});

If frame latency spikes, profile the transform stage first. The most common culprits are repeated Wasm initialization, oversized blur radii, or unnecessary copies of ImageData.
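
The cheapest first measurement is bracketing the stages inside transform() with performance.now(), which is available in workers. A sketch:

// inside transform() (a sketch) — rough per-stage timing
const t0 = performance.now();
ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
const t1 = performance.now();
for (const roi of rois) {
  blur_region(image.data, canvas.width, canvas.height,
              roi.x, roi.y, roi.w, roi.h, roi.radius);
}
const t2 = performance.now();
console.log(`copy ${(t1 - t0).toFixed(1)}ms, blur ${(t2 - t1).toFixed(1)}ms`);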

Troubleshooting and what’s next

Top 3 issues

  • "Cannot use stream ID": make sure you are targeting Chrome 116+ and consuming the ID from the extension’s offscreen document, not an arbitrary web page.
  • Wasm fails to load: verify the extension CSP includes 'wasm-unsafe-eval' for extension pages and that the Wasm files are packaged locally.
  • Dropped frames: reduce ROI size, reduce blur radius, or batch ROI updates so the worker spends less time copying pixels per frame.

What’s next

  • Swap rectangular ROIs for a content-script selection layer so users can drag privacy zones directly over the page.
  • Add a second worker for detection so tracking logic and anonymization logic do not contend on one frame loop.
  • Replace blur with mosaic or solid-color masking when you need stronger privacy guarantees.
  • Pipe the processed track into MediaRecorder or WebRTC if you want recording or live publishing; a recording sketch follows this list.
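
For the recording case, the processed track drops straight into MediaRecorder on the offscreen side. A minimal sketch, assuming the worker posted the generator track in its READY message:

// offscreen.js addition (a sketch) — record the anonymized output
worker.addEventListener('message', ({ data }) => {
  if (data.type !== 'READY') return;
  const recorder = new MediaRecorder(new MediaStream([data.track]), {
    mimeType: 'video/webm;codecs=vp8'
  });
  // Persist chunks however you like, e.g. accumulate Blobs for a download.
  recorder.ondataavailable = (e) => { /* handle e.data */ };
  recorder.start(1000); // emit a chunk roughly every second
});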

The core lesson is that a WASM-native extension is not one that rewrites everything in Rust. It is one that reserves Wasm for the hot path, respects MV3 boundaries, and treats the browser as a streaming runtime instead of a page script host.

Frequently Asked Questions

How do I load WebAssembly in a Manifest V3 extension?
Use a packaged local .wasm file and keep your extension pages on the MV3 minimum CSP that includes 'wasm-unsafe-eval'. Do not rely on remotely hosted code; MV3 explicitly restricts that path.

Why use an offscreen document instead of doing everything in the service worker?
A service worker has no DOM access, which makes media plumbing and worker setup awkward for video processing. An offscreen document gives you a hidden extension page with DOM APIs while still keeping the user flow clean.

Can I make this work in Firefox or Safari?
Not with the exact same pipeline. MediaStreamTrackProcessor and VideoTrackGenerator have limited cross-browser support, so this tutorial is best treated as a Chromium-first design.

Should face detection also run in WebAssembly?
Only if profiling proves it is the bottleneck. A strong default is to keep detection, ROI updates, and control flow in JavaScript or a separate worker, and reserve Wasm for the deterministic per-pixel transform.
