POST /api/compare
Compare two documents and compute their semantic similarity score using an ensemble of algorithms.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| doc_a | string | required | First document (JSON, XML, or HTML) |
| doc_b | string | required | Second document (JSON, XML, or HTML) |
| use_synonyms | boolean | optional | Enable dental EDI synonym resolution (default: false) |
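
As an illustration of the request body described above, the following sketch builds a request with Python's standard library. The host in `BASE_URL`, the JSON encoding of the body, and the helper name `build_compare_request` are assumptions for illustration only; the source documents just the path and the three parameters.

```python
import json
import urllib.request

# Hypothetical host; the source documents only the /api/compare path.
BASE_URL = "http://localhost:8000"


def build_compare_request(doc_a: str, doc_b: str, use_synonyms: bool = False,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/compare."""
    body = {"doc_a": doc_a, "doc_b": doc_b, "use_synonyms": use_synonyms}
    return urllib.request.Request(
        url=base_url + "/api/compare",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the endpoint accepts a JSON-encoded body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Documents are passed as strings, whatever their format.
request = build_compare_request('{"patient": "A"}', "<patient>A</patient>")
```

Against a running server, the request could then be sent with `urllib.request.urlopen(request)` and the reply decoded with `json.load`.
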
Response
{
  "score": 0.856,
  "simhash_score": 0.891,
  "minhash_score": 0.823,
  "structural_score": 0.854,
  "is_similar": true,
  "confidence": "high",
  "confidence_reason": "All algorithms strongly agree...",
  "format_a": "json",
  "format_b": "json",
  "use_synonyms": false,
  "field_matches": [...],
  "weights": {
    "simhash": 0.4,
    "minhash": 0.4,
    "structural": 0.2
  }
}
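
The numbers in the example response are consistent with `score` being a plain weighted sum of the three algorithm scores. The sketch below checks that arithmetic; the linear combination is an assumption inferred from the example values, not a documented formula, and `ensemble_score` is a hypothetical helper name.

```python
# Values copied from the example response above.
response = {
    "score": 0.856,
    "simhash_score": 0.891,
    "minhash_score": 0.823,
    "structural_score": 0.854,
    "weights": {"simhash": 0.4, "minhash": 0.4, "structural": 0.2},
}


def ensemble_score(resp: dict) -> float:
    """Assumed formula: weighted sum of the per-algorithm scores."""
    weights = resp["weights"]
    return (
        weights["simhash"] * resp["simhash_score"]
        + weights["minhash"] * resp["minhash_score"]
        + weights["structural"] * resp["structural_score"]
    )


computed = ensemble_score(response)
# 0.4 * 0.891 + 0.4 * 0.823 + 0.2 * 0.854 = 0.8564, reported as 0.856
print(round(computed, 3))
```
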
Response Fields
| Field | Type | Description |
|---|---|---|
| score | number | Weighted ensemble similarity score (0.0 - 1.0) |
| simhash_score | number | SimHash content similarity (0.0 - 1.0) |
| minhash_score | number | MinHash token overlap (0.0 - 1.0) |
| structural_score | number | Field schema similarity (0.0 - 1.0) |
| is_similar | boolean | Whether score exceeds threshold (0.5) |
| confidence | string | Confidence level: "high", "medium", or "low" |
| format_a | string | Detected format of doc_a: "json", "xml", or "html" |
| format_b | string | Detected format of doc_b: "json", "xml", or "html" |
| field_matches | array | Field-by-field match details with match types |
| weights | object | Algorithm weights used for ensemble scoring |
| error | string? | Error message if parsing failed (null on success) |
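
One way a client might consume these fields is sketched below: check `error` first, then read the typed values. `CompareResult`, `CompareError`, and `parse_compare_response` are hypothetical names introduced for illustration, and treating a missing `error` key the same as null is an assumption.

```python
from dataclasses import dataclass


class CompareError(Exception):
    """Raised when the response carries a non-null error message."""


@dataclass
class CompareResult:
    score: float
    is_similar: bool
    confidence: str
    format_a: str
    format_b: str


def parse_compare_response(payload: dict) -> CompareResult:
    """Convert a decoded /api/compare response into a typed result."""
    # Assumption: an absent "error" key is treated like null.
    error = payload.get("error")
    if error is not None:
        raise CompareError(error)
    return CompareResult(
        score=float(payload["score"]),
        is_similar=bool(payload["is_similar"]),
        confidence=payload["confidence"],
        format_a=payload["format_a"],
        format_b=payload["format_b"],
    )


# Subset of the example response, with error set to null on success.
result = parse_compare_response({
    "score": 0.856,
    "is_similar": True,
    "confidence": "high",
    "format_a": "json",
    "format_b": "json",
    "error": None,
})
```

Raising on a non-null `error` keeps parse failures from being mistaken for a low similarity score.
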