Smart Diff Engine

How UI Inspect's multi-layered visual comparison engine works — pixel matching, SSIM, layout shift detection, and anti-flake filtering.

Overview

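UI Inspect's diff engine does more than compare pixels. It runs each image pair through a multi-stage pipeline (alignment, pixel matching, SSIM, layout shift detection, and anti-flake filtering) so that reported differences reflect visible changes rather than rendering noise.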

Pipeline stages

Input Images → Alignment → Pixel Match → SSIM → Layout Shift → Anti-Flake → Results

1. Image alignment

Before comparison, images are aligned and normalized:

Scale normalization — Images are resized to match dimensions
Offset detection — Compensates for slight positioning differences
Format conversion — Both images are converted to RGBA PNG
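
As a rough illustration of scale normalization, the sketch below resizes one raw RGBA buffer to another's dimensions with nearest-neighbor sampling. The `RawImage` shape, the function names, and the choice of nearest-neighbor sampling are all assumptions made for this example; the engine's actual resampling method is not described here.

```typescript
// Hypothetical raw image: 4 bytes per pixel (RGBA), row-major order.
interface RawImage {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

// Nearest-neighbor resize: each output pixel copies the closest source pixel.
function resizeNearest(src: RawImage, width: number, height: number): RawImage {
  const data = new Uint8ClampedArray(width * height * 4);
  for (let y = 0; y < height; y++) {
    const sy = Math.min(src.height - 1, Math.floor((y * src.height) / height));
    for (let x = 0; x < width; x++) {
      const sx = Math.min(src.width - 1, Math.floor((x * src.width) / width));
      const si = (sy * src.width + sx) * 4;
      const di = (y * width + x) * 4;
      for (let c = 0; c < 4; c++) {
        data[di + c] = src.data[si + c];
      }
    }
  }
  return { width, height, data };
}

// Resize the candidate only when its dimensions differ from the baseline's.
function normalizeScale(baseline: RawImage, candidate: RawImage): RawImage {
  if (candidate.width === baseline.width && candidate.height === baseline.height) {
    return candidate;
  }
  return resizeNearest(candidate, baseline.width, baseline.height);
}
```

Nearest-neighbor keeps the example short; a smoother filter such as bilinear would usually introduce fewer artificial edges before comparison.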

2. Pixel matching (pixelmatch)

The first pass uses the pixelmatch library for raw pixel comparison:

Flags a pixel pair as different when its color difference exceeds a configurable threshold
Anti-aliased pixels are detected and handled separately
Returns total pixels, different pixels, and a diff visualization image
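
To make the thresholding idea concrete, here is a simplified, dependency-free sketch. It is not pixelmatch's algorithm: pixelmatch uses a perceptual color-difference metric and its own anti-aliasing detection, while this example uses plain normalized RGB distance and skips anti-aliasing entirely. The `comparePixels` function, the `PixelDiffResult` shape, and the `diffMask` output are hypothetical names chosen for illustration.

```typescript
interface PixelDiffResult {
  totalPixels: number;
  differentPixels: number;
  percentage: number;
  diffMask: Uint8Array; // 1 where the pixel pair differs, 0 elsewhere
}

// Compare two RGBA buffers of equal size. `threshold` is a 0-1 fraction of the
// largest possible RGB distance (an assumption of this sketch).
function comparePixels(
  a: Uint8ClampedArray,
  b: Uint8ClampedArray,
  width: number,
  height: number,
  threshold = 0.1
): PixelDiffResult {
  const totalPixels = width * height;
  if (a.length !== totalPixels * 4 || b.length !== a.length) {
    throw new Error("Both buffers must hold width * height RGBA pixels");
  }
  const maxDistance = Math.sqrt(3 * 255 * 255);
  const diffMask = new Uint8Array(totalPixels);
  let differentPixels = 0;
  for (let p = 0; p < totalPixels; p++) {
    const i = p * 4;
    const dr = a[i] - b[i];
    const dg = a[i + 1] - b[i + 1];
    const db = a[i + 2] - b[i + 2];
    const distance = Math.sqrt(dr * dr + dg * dg + db * db) / maxDistance;
    if (distance > threshold) {
      diffMask[p] = 1;
      differentPixels++;
    }
  }
  const percentage = totalPixels === 0 ? 0 : (differentPixels / totalPixels) * 100;
  return { totalPixels, differentPixels, percentage, diffMask };
}
```

A lower threshold makes the comparison stricter, at the cost of flagging more rendering noise.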

3. SSIM (Structural Similarity Index)

SSIM provides a perceptual quality assessment that better matches human vision:

Uses a sliding window approach across the image
Analyzes luminance, contrast, and structural similarity
Returns per-channel scores (red, green, blue) and an overall score
Generates a spatial SSIM map highlighting regions of low similarity
Clusters low-similarity regions for targeted fix suggestions
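
The sliding-window computation can be sketched as follows, using the standard SSIM formula with constants C1 = (0.01 × 255)² and C2 = (0.03 × 255)² on 8-bit grayscale values. The function names, the uniform (non-Gaussian) window, and the default window size and stride are assumptions for this example, not the engine's confirmed settings.

```typescript
// SSIM for one window of grayscale samples (0-255), using population statistics.
function ssimWindow(x: number[], y: number[]): number {
  const n = x.length;
  if (n === 0 || n !== y.length) throw new Error("Windows must be non-empty and equal in size");
  const c1 = (0.01 * 255) * (0.01 * 255);
  const c2 = (0.03 * 255) * (0.03 * 255);
  let meanX = 0;
  let meanY = 0;
  for (let i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
  meanX /= n;
  meanY /= n;
  let varX = 0;
  let varY = 0;
  let cov = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    varX += dx * dx;
    varY += dy * dy;
    cov += dx * dy;
  }
  varX /= n;
  varY /= n;
  cov /= n;
  const numerator = (2 * meanX * meanY + c1) * (2 * cov + c2);
  const denominator = (meanX * meanX + meanY * meanY + c1) * (varX + varY + c2);
  return numerator / denominator;
}

// Mean SSIM over windows that slide across the image in steps of `stride`.
function meanSsim(
  a: number[], b: number[], width: number, height: number,
  windowSize = 8, stride = 4
): number {
  let sum = 0;
  let count = 0;
  for (let y0 = 0; y0 + windowSize <= height; y0 += stride) {
    for (let x0 = 0; x0 + windowSize <= width; x0 += stride) {
      const wa: number[] = [];
      const wb: number[] = [];
      for (let y = y0; y < y0 + windowSize; y++) {
        for (let x = x0; x < x0 + windowSize; x++) {
          wa.push(a[y * width + x]);
          wb.push(b[y * width + x]);
        }
      }
      sum += ssimWindow(wa, wb);
      count++;
    }
  }
  if (count === 0) throw new Error("Image is smaller than the window");
  return sum / count;
}
```

Keeping the per-window scores instead of averaging them would give the raw material for a spatial SSIM map.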

Score interpretation:

SSIM Score	Meaning
1.0	Identical
0.95+	Nearly identical (sub-pixel differences)
0.85-0.95	Minor differences
0.70-0.85	Noticeable differences
Below 0.70	Significant differences
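
The table can be expressed as a small lookup. Because adjacent bands share their boundary values, this sketch assumes each band's lower bound is inclusive; that reading, along with the `interpretSsim` name and label strings, is an assumption rather than documented behavior.

```typescript
type SsimLabel =
  | "identical"
  | "nearly identical"
  | "minor differences"
  | "noticeable differences"
  | "significant differences";

// Map an overall SSIM score to the bands in the table above.
// Assumption: lower bounds are inclusive (0.95 counts as "nearly identical").
function interpretSsim(score: number): SsimLabel {
  if (score >= 1) return "identical";
  if (score >= 0.95) return "nearly identical";
  if (score >= 0.85) return "minor differences";
  if (score >= 0.7) return "noticeable differences";
  return "significant differences";
}
```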

4. Layout shift detection

Detects systematic content movement:

Vertical shift — Row-signature correlation to detect content pushed up/down
Horizontal shift — Column-signature correlation for left/right shifts
Reports the shift direction and magnitude in pixels
Identifies whether it's a primary (intentional) shift or secondary (cascading) shift
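
A minimal sketch of the row-signature idea for vertical shifts follows. It reduces each row to its mean luminance, then searches for the offset at which the two signatures line up best. For simplicity it scores alignment with mean absolute difference rather than a true correlation coefficient, and the function names, the `maxShift` parameter, and the sign convention (positive `dy` means content moved down) are assumptions for this example.

```typescript
// One value per row: the mean of that row's grayscale samples.
function rowSignature(gray: number[], width: number, height: number): number[] {
  const signature: number[] = [];
  for (let y = 0; y < height; y++) {
    let sum = 0;
    for (let x = 0; x < width; x++) {
      sum += gray[y * width + x];
    }
    signature.push(sum / width);
  }
  return signature;
}

// Return the dy in [-maxShift, maxShift] where baseline row y best matches
// candidate row y + dy. Ties prefer the smaller absolute shift.
function detectVerticalShift(baseline: number[], candidate: number[], maxShift: number): number {
  let bestDy = 0;
  let bestCost = Infinity;
  for (let dy = -maxShift; dy <= maxShift; dy++) {
    let cost = 0;
    let overlap = 0;
    for (let y = 0; y < baseline.length; y++) {
      const yc = y + dy;
      if (yc < 0 || yc >= candidate.length) continue;
      cost += Math.abs(baseline[y] - candidate[yc]);
      overlap++;
    }
    if (overlap === 0) continue;
    const meanCost = cost / overlap;
    const better =
      meanCost < bestCost ||
      (meanCost === bestCost && Math.abs(dy) < Math.abs(bestDy));
    if (better) {
      bestCost = meanCost;
      bestDy = dy;
    }
  }
  return bestDy;
}
```

The same approach applies to horizontal shifts by building column signatures and searching over `dx` instead.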

5. Anti-flake filtering

Removes noise that causes false positives:

Anti-aliasing detection — Identifies pixels that differ only due to font rendering
Sub-pixel rendering — Filters differences caused by sub-pixel RGB rendering
Stability scoring — Classifies results as stable, likely-flaky, or flaky
Returns an adjusted diff percentage after filtering
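
The relationship between the raw count, the filtered pixels, and the verdict might look like the sketch below. The scoring formula and the 0.8 and 0.5 cutoffs are invented for illustration only; the engine's real scoring rules are not documented in this section.

```typescript
type StabilityVerdict = "stable" | "likely-flaky" | "flaky";

interface StabilityResult {
  verdict: StabilityVerdict;
  score: number;               // 1 = no differing pixels attributed to noise
  flakyPixelsFiltered: number;
  adjustedPercentage: number;  // diff percentage after removing flaky pixels
}

// `flakyPixels` is the number of differing pixels attributed to anti-aliasing
// or sub-pixel rendering by earlier detection steps (not shown here).
function assessStability(
  totalPixels: number,
  differentPixels: number,
  flakyPixels: number
): StabilityResult {
  const filtered = Math.min(flakyPixels, differentPixels);
  const remaining = differentPixels - filtered;
  const adjustedPercentage = totalPixels === 0 ? 0 : (remaining / totalPixels) * 100;
  // Hypothetical score: the share of differing pixels that survive filtering.
  const score = differentPixels === 0 ? 1 : 1 - filtered / differentPixels;
  // Hypothetical cutoffs, chosen only to demonstrate the three verdicts.
  let verdict: StabilityVerdict = "flaky";
  if (score >= 0.8) verdict = "stable";
  else if (score >= 0.5) verdict = "likely-flaky";
  return { verdict, score, flakyPixelsFiltered: filtered, adjustedPercentage };
}
```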

Analysis output

The combined analysis provides:

{
  pixelDiff: {
    totalPixels: number;
    differentPixels: number;
    percentage: number;           // Raw diff percentage
    adjustedPercentage: number;   // After anti-flake filtering
    diffImage: string;            // Base64 visualization
  };
  ssimScore: number;              // 0-1 overall similarity
  ssimChannels: { red: number; green: number; blue: number };
  ssimDiffRegions: Region[];      // Clustered diff areas
  layoutShift: {
    type: string;
    dx: number;                   // Horizontal shift
    dy: number;                   // Vertical shift
  };
  stability: {
    verdict: "stable" | "likely-flaky" | "flaky";
    score: number;
    flakyPixelsFiltered: number;
  };
  colors: ColorAnalysis;
  spacing: SpacingAnalysis;
  typography: TypographyAnalysis;
  dimensions: DimensionAnalysis;
  confidence: number;             // Overall confidence 0-1
}
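
One way a caller might consume this output is to turn it into a pass, fail, or retry decision. The sketch below reads only three of the fields above; the `decide` function, the `GateOptions` shape, and the default thresholds are hypothetical and would need tuning per project.

```typescript
// Subset of the analysis output that this sketch reads.
interface AnalysisSummary {
  pixelDiff: { adjustedPercentage: number };
  ssimScore: number;
  stability: { verdict: "stable" | "likely-flaky" | "flaky" };
}

// Hypothetical thresholds, not engine defaults.
interface GateOptions {
  maxAdjustedPercent: number;
  minSsim: number;
}

type GateDecision = "pass" | "fail" | "retry";

function decide(
  analysis: AnalysisSummary,
  options: GateOptions = { maxAdjustedPercent: 0.5, minSsim: 0.95 }
): GateDecision {
  const withinBudget =
    analysis.pixelDiff.adjustedPercentage <= options.maxAdjustedPercent &&
    analysis.ssimScore >= options.minSsim;
  if (withinBudget) return "pass";
  // A flaky verdict suggests rendering noise, so capture again before failing.
  if (analysis.stability.verdict === "flaky") return "retry";
  return "fail";
}
```

Using the adjusted percentage rather than the raw one keeps filtered noise from failing a build.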

Fix generation

From the analysis, the engine generates targeted fix suggestions:

CSS property changes with specific selectors
Tailwind class suggestions as alternatives
Confidence scores for each fix (how likely it is to resolve the diff)
Priority levels (high/medium/low) based on visual impact
Before/after values showing what needs to change
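
A fix suggestion carrying the properties listed above could be modeled roughly as follows. The `FixSuggestion` field names and the `rankFixes` ordering rule (priority first, then confidence) are assumptions for illustration, not the engine's documented schema.

```typescript
type Priority = "high" | "medium" | "low";

// Hypothetical shape for a single suggestion.
interface FixSuggestion {
  selector: string;        // CSS selector the change applies to
  property: string;        // CSS property to change
  before: string;          // current value
  after: string;           // suggested value
  tailwindClass?: string;  // optional Tailwind alternative
  confidence: number;      // 0-1, how likely the fix resolves the diff
  priority: Priority;
}

const PRIORITY_RANK: Record<Priority, number> = { high: 0, medium: 1, low: 2 };

// Return a sorted copy: highest priority first, then highest confidence.
function rankFixes(fixes: FixSuggestion[]): FixSuggestion[] {
  return [...fixes].sort(
    (a, b) =>
      PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] ||
      b.confidence - a.confidence
  );
}
```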
