Compare Documents

Paste JSON, XML, or HTML documents to analyze their semantic similarity. The format is auto-detected.

Enter a document in each panel, then click Compare to see the similarity score.

When enabled, scrolling one panel scrolls the other.
When enabled, synonyms like member_id and subscriber_id are treated as equivalent.

Load sample documents or paste your own to get started.

Sample Documents

Explore how the similarity algorithm responds to different document pairs. The algorithm measures structural and token similarity.

Same Structure (High Similarity)

Documents with identical field names score high, even with different values.
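
As a rough illustration of why field names alone can drive a high score, the sketch below computes a Jaccard overlap of field names and ignores values entirely. The tool's actual scoring is not specified here; the helper names `field_names` and `field_similarity` are hypothetical, and the sample scores below 100% suggest the real algorithm also weighs value tokens or types.

```python
import json

def field_names(doc):
    """Collect every key that appears anywhere in a parsed JSON value."""
    names = set()
    if isinstance(doc, dict):
        for key, value in doc.items():
            names.add(key)
            names |= field_names(value)
    elif isinstance(doc, list):
        for item in doc:
            names |= field_names(item)
    return names

def field_similarity(a, b):
    """Jaccard overlap of field names; values are deliberately ignored."""
    fa, fb = field_names(a), field_names(b)
    if not fa and not fb:
        return 1.0  # two documents without fields are treated as identical
    return len(fa & fb) / len(fa | fb)

# Same patient, different visit: identical fields, different values.
visit_1 = json.loads('{"claim_id": "C-1", "patient": {"name": "A"}, "procedure_code": "D0120"}')
visit_2 = json.loads('{"claim_id": "C-2", "patient": {"name": "A"}, "procedure_code": "D1110"}')
print(field_similarity(visit_1, visit_2))  # 1.0
```

Because only key sets are compared, changing a procedure code or a date leaves this score untouched.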

Near Duplicate

Identical claim data

~100%

Same Patient, Different Visit

Same fields, different procedure/dates

~100%

Same Provider, Different Patients

Same structure, different patient data

~98%

String vs Numeric Types

"1200.00" vs 1200.00 - same structure

~95%

Date Format Variations

2024-01-15 vs 01/15/2024 - same fields

~98%

Claim Resubmission

Original vs corrected - some new fields

~85%

Partial Overlap (Medium Similarity)

Documents sharing some field names but with structural differences.

Eligibility Request/Response

270/271 pair - shared subscriber fields

~60%

Dental vs Medical

Different claim types, some field overlap

~63%

Case Normalization (High Similarity)

Field names in camelCase, PascalCase, and kebab-case are automatically normalized to snake_case.
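
One way such a normalization could work is sketched below with regular expressions. This is an assumption for illustration, not the tool's own code; `to_snake_case` is a hypothetical helper.

```python
import re

def to_snake_case(name):
    """Normalize camelCase, PascalCase, and kebab-case names to snake_case."""
    # Hyphens and whitespace become underscores: "claim-id" -> "claim_id"
    s = re.sub(r"[-\s]+", "_", name)
    # Split an acronym from a following capitalized word: "HTTPCode" -> "HTTP_Code"
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    # Split a lowercase letter or digit from a following capital: "claimId" -> "claim_Id"
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()

for raw in ["claimId", "ClaimId", "claim-id", "claim_id"]:
    print(raw, "->", to_snake_case(raw))  # every line ends in claim_id
```

Applying this to every field name before comparison makes `claimId` and `claim_id` count as the same field.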

camelCase vs snake_case

claimId → claim_id (normalized)

~99%

Semantic Synonyms

Different words for the same concept (member_id vs subscriber_id). Enable "Synonym resolution" above to see high scores.
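
A minimal sketch of how synonym resolution might raise a score: each field name is mapped to a canonical name before the overlap is computed. The `SYNONYMS` table and the `canonical` and `overlap` helpers are hypothetical; the tool's real synonym list and scoring are not shown on this page.

```python
# Hypothetical synonym table: variant name -> canonical name.
SYNONYMS = {
    "subscriber_id": "member_id",
    "dob": "date_of_birth",
}

def canonical(name, resolve_synonyms=True):
    """Return the canonical field name, or the name unchanged if resolution is off."""
    return SYNONYMS.get(name, name) if resolve_synonyms else name

def overlap(fields_a, fields_b, resolve_synonyms=True):
    """Jaccard overlap of two field-name lists after optional synonym resolution."""
    a = {canonical(f, resolve_synonyms) for f in fields_a}
    b = {canonical(f, resolve_synonyms) for f in fields_b}
    return len(a & b) / len(a | b) if (a | b) else 1.0

left = ["member_id", "date_of_birth", "plan_code"]
right = ["subscriber_id", "dob", "plan_code"]
print(overlap(left, right, resolve_synonyms=False))  # 0.2
print(overlap(left, right, resolve_synonyms=True))   # 1.0
```

With resolution off, only `plan_code` matches; with it on, all three fields line up.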

Field Synonyms

member_id vs subscriber_id

~39% (~95% with synonyms)

Abbreviated vs Full Names

dob vs date_of_birth

~39% (~95% with synonyms)

Structural Differences (Low Similarity)

Documents with a different JSON structure, or of completely different document types.
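
To see why nesting depth can lower a score, consider comparing documents by field path rather than by bare key. The sketch below flattens a parsed JSON value into dotted paths; `field_paths` is a hypothetical helper, and whether the tool compares paths this way is an assumption.

```python
def field_paths(doc, prefix=""):
    """Return the set of dotted paths that lead to scalar values."""
    paths = set()
    if isinstance(doc, dict):
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            if isinstance(value, (dict, list)):
                paths |= field_paths(value, path)
            else:
                paths.add(path)
    elif isinstance(doc, list):
        # List items share their parent's path; indexes are not recorded.
        for item in doc:
            paths |= field_paths(item, prefix)
    elif prefix:
        # A scalar inside a list contributes the list's own path.
        paths.add(prefix)
    return paths

nested = {"patient": {"name": "A", "dob": "1980-01-01"}, "claim_id": "C-1"}
flat = {"patient_name": "A", "patient_dob": "1980-01-01", "claim_id": "C-1"}
print(sorted(field_paths(nested)))  # ['claim_id', 'patient.dob', 'patient.name']
print(sorted(field_paths(flat)))    # ['claim_id', 'patient_dob', 'patient_name']
```

The two documents carry the same information, yet only `claim_id` appears under the same path in both.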

Nested vs Flat JSON

Different structure depth

~39%

Claim vs Remittance

837D vs 835 - different transaction types

~35%

EDI Segments vs JSON

X12 format vs normalized JSON

~31%

Patient vs Provider

Completely different entity types

~39%

Cross-Format: XML ↔ JSON

Compare documents in different formats. Format is auto-detected.
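
A cross-format comparison needs two steps first: guessing the format and reducing each document to a common set of field names. The sketch below does both with the Python standard library. The first-character heuristic and all function names are assumptions for illustration; the tool's real detection logic is not documented here, and this sketch does not distinguish HTML from XML.

```python
import json
import xml.etree.ElementTree as ET

def detect_format(text):
    """Guess the format from the first non-whitespace character."""
    stripped = text.lstrip()
    if stripped.startswith(("{", "[")):
        return "json"
    if stripped.startswith("<"):
        return "xml"
    return "unknown"

def xml_field_names(text):
    """Element tags and attribute names both count as fields."""
    names = set()
    for element in ET.fromstring(text).iter():
        names.add(element.tag)
        names.update(element.attrib)  # adds the attribute names (dict keys)
    return names

def json_field_names(value):
    """Collect every key at any depth of a parsed JSON value."""
    names = set()
    if isinstance(value, dict):
        for key, child in value.items():
            names.add(key)
            names |= json_field_names(child)
    elif isinstance(value, list):
        for child in value:
            names |= json_field_names(child)
    return names

def extract_field_names(text):
    kind = detect_format(text)
    if kind == "json":
        return json_field_names(json.loads(text))
    if kind == "xml":
        return xml_field_names(text)
    raise ValueError("unrecognized document format")

xml_doc = '<claim status="paid"><claim_id>C-1</claim_id><member_id>M-9</member_id></claim>'
json_doc = '{"claim_id": "C-1", "member_id": "M-9", "status": "paid"}'
print(sorted(extract_field_names(xml_doc)))   # ['claim', 'claim_id', 'member_id', 'status']
print(sorted(extract_field_names(json_doc)))  # ['claim_id', 'member_id', 'status']
```

In this sketch the XML root element (`claim`) shows up as an extra field with no JSON counterpart, which is one plausible reason cross-format pairs score below 100% even when every data field matches.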

XML to JSON - Same Fields

Identical field names across formats

~88%

XML to JSON - Synonyms

subscriber_id ↔ member_id with synonyms

~88% (with synonyms)

XML with Attributes

XML attributes extracted as fields

~83%

Cross-Format: HTML ↔ JSON

Compare HTML forms and tables with JSON. Field names are extracted from inputs, table headers, and labels.
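
The extraction described above could look roughly like the following, using `html.parser` from the Python standard library. `FieldNameParser` and `html_field_names` are hypothetical names, and the choice to read the `name` attribute of form controls plus the text of `th` and `label` elements is an assumption about what "inputs, headers, labels" means in practice.

```python
from html.parser import HTMLParser

class FieldNameParser(HTMLParser):
    """Collect candidate field names from form controls, <th>, and <label>."""

    def __init__(self):
        super().__init__()
        self.fields = []
        self._capture = None  # tag whose text is being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("input", "select", "textarea") and attributes.get("name"):
            self.fields.append(attributes["name"])
        elif tag in ("th", "label"):
            self._capture = tag
            self._buffer = []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            text = "".join(self._buffer).strip()
            if text:
                self.fields.append(text)
            self._capture = None

def html_field_names(markup):
    parser = FieldNameParser()
    parser.feed(markup)
    parser.close()
    return parser.fields

form = '<form><label>Member ID</label><input name="member_id"><input name="dob"></form>'
table = '<table><tr><th>claim_id</th><th>amount</th></tr><tr><td>C-1</td><td>12.00</td></tr></table>'
print(html_field_names(form))   # ['Member ID', 'member_id', 'dob']
print(html_field_names(table))  # ['claim_id', 'amount']
```

Label text such as "Member ID" would still need to be normalized to a form like `member_id` before it can match a JSON key.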

HTML Form to JSON

Form inputs → JSON fields

~85%

HTML Table to JSON

Table headers → JSON fields

~85%

HTML Data Attributes

data-* attributes as fields

~83%