Anti-Flake Engine

How UI Inspect filters false positives from anti-aliasing, sub-pixel rendering, and other noise sources.

The flaky test problem

Visual tests are notoriously flaky. A screenshot can differ from the baseline due to:

Anti-aliasing: edge and text smoothing can differ slightly between runs
Sub-pixel rendering: font smoothing varies by platform
GPU rendering: hardware acceleration can produce non-deterministic output
Dynamic content: timestamps, animations, ads
Font loading: a flash of unstyled or invisible text (FOUT/FOIT) captures the page before web fonts finish loading

These produce pixel differences that aren't real regressions. The anti-flake engine filters them.

How it works

Anti-aliasing detection

The engine identifies pixels that differ only due to font/edge anti-aliasing:

Check if the pixel is on an edge (adjacent to high-contrast pixels)
Check if the color difference is within the anti-aliasing threshold
If both conditions are met, classify as anti-aliasing noise
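
The three checks above can be sketched as follows. This is a minimal illustration rather than UI Inspect's actual implementation: it assumes grayscale images stored as 2D arrays, and the names `isOnEdge`, `isAntiAliasingNoise`, `EDGE_CONTRAST`, and `AA_THRESHOLD`, along with their values, are hypothetical.

```typescript
// Hypothetical sketch: names and thresholds are illustrative, not UI Inspect's API.
type Gray = number[][]; // grayscale image, values 0-255, indexed [y][x]

const EDGE_CONTRAST = 64; // assumed: neighbor contrast that marks an edge
const AA_THRESHOLD = 48; // assumed: largest difference still treated as anti-aliasing

// Safe pixel lookup: returns undefined outside the image bounds.
function at(img: Gray, x: number, y: number): number | undefined {
  return img[y]?.[x];
}

// Step 1: a pixel is on an edge if any of its 8 neighbors contrasts sharply with it.
function isOnEdge(img: Gray, x: number, y: number): boolean {
  const center = at(img, x, y);
  if (center === undefined) return false;
  for (let dy = -1; dy <= 1; dy++) {
    for (let dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      const neighbor = at(img, x + dx, y + dy);
      if (neighbor !== undefined && Math.abs(neighbor - center) >= EDGE_CONTRAST) {
        return true;
      }
    }
  }
  return false;
}

// Steps 2 and 3: a differing pixel counts as anti-aliasing noise only if it sits
// on an edge (in either image) and the difference stays within the threshold.
function isAntiAliasingNoise(baseline: Gray, current: Gray, x: number, y: number): boolean {
  const a = at(baseline, x, y);
  const b = at(current, x, y);
  if (a === undefined || b === undefined) return false;
  const delta = Math.abs(a - b);
  if (delta === 0) return false; // identical pixels are not a difference at all
  const onEdge = isOnEdge(baseline, x, y) || isOnEdge(current, x, y);
  return onEdge && delta <= AA_THRESHOLD;
}
```

Checking the edge condition in both images is a design choice of this sketch: anti-aliasing can move an edge by a pixel, so the edge may be visible in only one of the two screenshots.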

Sub-pixel rendering filter

RGB sub-pixel rendering (ClearType on Windows, LCD font smoothing on older macOS releases) causes systematic one-pixel color shifts along glyph edges:

Analyze patterns of R/G/B channel differences
If differences follow sub-pixel patterns, classify as noise
Filter from the diff count
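
One way to picture the pattern check is the heuristic below. It is a sketch under stated assumptions, not the engine's real algorithm: it treats a difference as sub-pixel noise when the red and blue channels move in opposite directions while overall brightness barely changes. The names `isSubPixelNoise`, `countDiffs`, and `MAX_LUMA_SHIFT` are hypothetical.

```typescript
// Hypothetical sketch: the heuristic and threshold are assumptions for illustration.
type RGB = [number, number, number];

const MAX_LUMA_SHIFT = 16; // assumed: largest brightness change still treated as a color fringe

// Perceived brightness using the BT.601 luma weights.
function luma([r, g, b]: RGB): number {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}

// A sub-pixel fringe shifts red and blue in opposite directions
// while leaving overall brightness nearly unchanged.
function isSubPixelNoise(baseline: RGB, current: RGB): boolean {
  const dR = current[0] - baseline[0];
  const dB = current[2] - baseline[2];
  const oppositeFringe = dR * dB < 0;
  const lumaShift = Math.abs(luma(current) - luma(baseline));
  return oppositeFringe && lumaShift <= MAX_LUMA_SHIFT;
}

// Counts all differing pixels (raw) and those that remain after filtering (adjusted).
function countDiffs(baseline: RGB[], current: RGB[]): { raw: number; adjusted: number } {
  let raw = 0;
  let adjusted = 0;
  baseline.forEach((a, i) => {
    const b = current[i];
    if (b === undefined) return; // ignore pixels missing from the second image
    const same = a[0] === b[0] && a[1] === b[1] && a[2] === b[2];
    if (same) return;
    raw++;
    if (!isSubPixelNoise(a, b)) adjusted++;
  });
  return { raw, adjusted };
}
```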

Stability scoring

After filtering, the engine classifies the result:

Verdict	Criteria	Action
Stable	Very few flaky pixels, consistent results	Trust the diff
Likely Flaky	Moderate noise, may vary between runs	Review manually
Flaky	High noise, unreliable comparison	Consider ignore regions
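
The table gives no numeric cut-offs, so the sketch below invents them purely for illustration. It derives a score from the share of compared pixels that were filtered as noise and maps that score to a verdict. Only the `"stable"` verdict string and the `score` and `flakyPixelsFiltered` fields appear in this page's sample output; the other verdict strings, the function `scoreStability`, and both thresholds are assumptions. The "consistent results" criterion across runs is not modeled here.

```typescript
// Hypothetical sketch: cut-offs and the non-"stable" verdict strings are assumed.
type Verdict = "stable" | "likely_flaky" | "flaky";

interface Stability {
  verdict: Verdict;
  score: number; // 1 = no noise filtered, 0 = every compared pixel was noise
  flakyPixelsFiltered: number;
}

const STABLE_MIN_SCORE = 0.9; // assumed
const LIKELY_FLAKY_MIN_SCORE = 0.7; // assumed

function scoreStability(flakyPixelsFiltered: number, totalPixels: number): Stability {
  if (totalPixels <= 0) throw new RangeError("totalPixels must be positive");
  // Share of compared pixels that were classified as noise, capped at 1.
  const noiseRatio = Math.min(1, flakyPixelsFiltered / totalPixels);
  const score = 1 - noiseRatio;
  const verdict: Verdict =
    score >= STABLE_MIN_SCORE
      ? "stable"
      : score >= LIKELY_FLAKY_MIN_SCORE
        ? "likely_flaky"
        : "flaky";
  return { verdict, score, flakyPixelsFiltered };
}
```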

Configuration

The anti-flake engine runs automatically during smart diff analysis. Run a build from the CLI with a diff threshold:

npx ui-inspect build --project proj_id --threshold 0.1

To enable it explicitly via the API, pass antiFlake: true:

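// designBase64 and implementationBase64 hold the two screenshots, each base64-encoded
const result = await client.diff.analyze.mutate({
  designImage: designBase64,
  implementationImage: implementationBase64,
  antiFlake: true, // filter anti-aliasing and sub-pixel noise
  threshold: 0.1,
});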

console.log(result.analysis.stability);
// { verdict: "stable", score: 0.95, flakyPixelsFiltered: 42 }

console.log(result.analysis.pixelDiff.percentage); // 3.2% (raw)
console.log(result.analysis.pixelDiff.adjustedPercentage); // 1.1% (filtered)
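
A CI script might combine the verdict with the filtered percentage along these lines. This is a hedged sketch: the field names mirror the sample output above, while the `Analysis` interface, the `gateOnAnalysis` helper, the `maxAdjustedPercent` parameter, and the three decision labels are hypothetical and not part of the documented API.

```typescript
// Field names follow the sample output above; everything else is hypothetical.
interface Analysis {
  stability: { verdict: string; score: number; flakyPixelsFiltered: number };
  pixelDiff: { percentage: number; adjustedPercentage: number };
}

type GateDecision = "pass" | "fail" | "review";

// Trust the diff only when the comparison is stable; otherwise ask for a manual look.
function gateOnAnalysis(analysis: Analysis, maxAdjustedPercent: number): GateDecision {
  if (analysis.stability.verdict !== "stable") return "review";
  return analysis.pixelDiff.adjustedPercentage <= maxAdjustedPercent ? "pass" : "fail";
}
```

Gating on `adjustedPercentage` rather than the raw `percentage` keeps filtered noise from failing a build.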

Impact

Metric	Without Anti-Flake	With Anti-Flake
False positive rate	~15-30%	~1-3%
Flaky test rate	~20%	~2%
CI reliability	Inconsistent	Stable