Skip to main content

SQONs In Detail

SQON is a JSON-based filter language Overture uses to describe query logic in a backend-neutral way, while retaining human readability and portability. This page focuses on the shape of a SQON itself: what nodes exist, what operators are supported, what aliases are accepted, and which edge cases matter when generating SQON programmatically.

Example: Flat Filter vs. SQON

Flat Filter Approach:

A typical flat filter uses simple key-value pairs with implicit AND logic:

{
"province": "Ontario",
"age": "20-29"
}

General limitations of a flat filter:

  • Can only express AND logic (all conditions must match)
  • Cannot express OR relationships between filters
  • Cannot nest conditions or create complex boolean expressions

SQON Approach:

SQON can express the same filter explicitly:

{
"op": "and",
"content": [
{
"op": "in",
"content": {
"fieldName": "province",
"value": ["Ontario"]
}
},
{
"op": "in",
"content": {
"fieldName": "age",
"value": ["20-29"]
}
}
]
}

Mental Model

Visualize a SQON as a "tree" of nested operations, that may contain one of two kinds of elements:

  • Leaf nodes apply an operator to one or more field values. e.g. age is between 0 and 100.
  • Group nodes combine other SQON nodes with boolean logic. e.g. the AND in "and eye color is brown".

A SQON may start at either level:

  • a single leaf node
  • a group node containing one or more child nodes

Group Nodes

Group nodes combine child SQON nodes. e.g. "filter X" AND "filter Y"
{
"op": "and",
"content": [
{
"op": "in",
"content": {
"fieldName": "fruit.color",
"value": ["red"]
}
},
{
"op": "gte",
"content": {
"fieldName": "fruit.weight_grams",
"value": 100
}
}
]
}

This results in:

  • fruit color is red
  • and fruit weight is at least 100 grams

Supported group operators

  • and
  • or
  • not

Group shape

{
"op": "and | or | not",
"content": ["SQON", "SQON", "..."]
}

Leaf Nodes

Leaf nodes describe a field-level filter. e.g. "value X" IN "field Y"
{
"op": "in",
"content": {
"fieldName": "fruit.color",
"value": ["red"]
}
}

This results in:

  • fruit color is red

Most leaf nodes use:

  • fieldName
  • value

The filter operator is the exception and instead uses fieldNames (plural).

Field Operators

SQONs can apply several kinds of filtering to fields and values:

  • membership answers questions like "is this value in the allowed set?" or "is it excluded from that set?"
  • range compares values against bounds such as greater than, less than, or between two endpoints
  • fuzzy performs a broader text-style search across one or more fields instead of exact matching

Membership-style operators

  • in
  • not-in
  • some-not-in
  • all
Example:
{
"op": "in",
"content": {
"fieldName": "fruit.color",
"value": ["red", "green"]
}
}

This results in:

  • fruit color is red or green

Range-style operators

  • gt
  • gte
  • lt
  • lte
  • between
Example:
{
"op": "between",
"content": {
"fieldName": "fruit.weight_grams",
"value": [100, 200]
}
}

This results in:

  • fruit weight is between 100 and 200 grams

Fuzzy operator

  • filter
Example:
{
"op": "filter",
"content": {
"fieldNames": ["fruit.name", "fruit.nickname"],
"value": "*app*"
}
}

This results in:

  • search for text like app in either fruit.name or fruit.nickname

Accepted Operator Aliases

Arranger accepts several shorthand aliases in addition to canonical operators.

AliasCanonical Operator
=in
==in
===in
!=not-in
!==not-in
>gt
>=gte
<lt
<=lte

For interoperability, the canonical operator names are always preferred when generating new SQONs.

Pivot

A SQON node may also include pivot.

Consider a set of records shaped like this:

[
{
"basket_name": "Andy's basket",
"items": [
{ "name": "apple", "color": "red" },
{ "name": "pear", "color": "yellow" }
]
},
{
"basket_name": "Max's basket",
"items": [
{ "name": "apple", "color": "green" },
{ "name": "cherry", "color": "red" }
]
}
]

Now imagine we want to express:

  • there exists an item whose name is apple
  • and that same item is red
Without a pivot
{
"op": "and",
"content": [
{
"op": "in",
"content": {
"fieldName": "items.name",
"value": ["apple"]
}
},
{
"op": "in",
"content": {
"fieldName": "items.color",
"value": ["red"]
}
}
]
}

This can be read as:

  • some item has the name apple
  • and some item has the color red

That may accidentally match across different nested objects, providing green apples and red cherries.

With a pivot
{
"op": "and",
"pivot": "items",
"content": [
{
"op": "in",
"content": {
"fieldName": "items.name",
"value": ["apple"]
}
},
{
"op": "in",
"content": {
"fieldName": "items.color",
"value": ["red"]
}
}
]
}

The pivot is used to anchor a filter to a nested path. This matters when Arranger translates SQON into Elasticsearch nested queries.

This results in:

  • look within the items nested path, and
  • require the apple and red conditions to be true for the same nested item

So the practical difference is:

  • without pivot: the conditions may be satisfied by different nested rows
  • with pivot: the conditions are scoped to the same nested row

In practice:

  • most simple SQONs omit pivot
  • nested aggregations and nested field filtering are where pivot becomes important
  • pivot may appear on either leaf or group nodes

A pivot can still be rejected later at runtime if it does not match a valid nested field path for the active catalog.

Current Accepted Value Shapes

The current Arranger SQON schema accepts:

  • membership operators: scalar or array
  • range operators: scalar or array
  • between: scalar or array with at least 2 items
  • filter: string

That reflects current compatibility behavior, not necessarily the final ideal shape. In particular:

  • range operators currently tolerate arrays even though scalar values are usually clearer
  • between currently accepts arrays longer than 2, and downstream logic may reduce them to a min/max pair

Programmatic clients should prefer the clearer forms:

  • gt, gte, lt, lte: single scalar value
  • between: exactly 2 values

Extra Keys

Arranger currently accepts but ignores extra keys on SQON nodes and content objects.

That means this is structurally valid today:

{
"op": "in",
"content": {
"fieldName": "fruit.color",
"value": ["red"],
"extraContent": true
},
"extraTopLevel": "ignored"
}

This results in:

  • fruit color is red
  • and the extra keys are ignored by current SQON validation

Important Edge Cases

These are worth handling explicitly when generating SQON in other systems such as MCP servers.

Falsy values can still be valid

These are valid SQON values and should not be treated as missing:

  • 0
  • ""
Examples:
{
"op": "gte",
"content": {
"fieldName": "fruit.weight_grams",
"value": 0
}
}

This results in:

  • fruit weight is at least 0 grams
{
"op": "in",
"content": {
"fieldName": "fruit.label",
"value": ""
}
}

This results in:

  • fruit label is the empty string

Special Arranger values

Some values have special downstream meaning in Arranger.

Examples include:

  • set_id:<id>
  • __missing__
  • wildcard-like strings such as ABC*

These are still ordinary SQON values structurally, but Arranger may compile them into specialized Elasticsearch queries.

If you are generating SQON automatically, the safest defaults are:

  1. Prefer canonical operator names over aliases.
  2. Use leaf-root SQON only for a single simple condition.
  3. Use group-root SQON for composed logic.
  4. Use scalar values for gt, gte, lt, and lte.
  5. Use exactly 2 values for between.
  6. Preserve valid falsy values such as 0 and "".
  7. Do not invent pivot unless you know the nested path semantics.
  8. Treat catalog field names and allowed values as catalog-specific introspection data, not SQON syntax.

When Arranger introspection is enabled, the following endpoints are useful together.

GET /introspection

Lists all catalogues served by the instance. Each entry includes the catalogue's document type, its GraphQL and introspection paths, and an optional description when configured.

{
"catalogCount": 1,
"mode": "single",
"sqonSchemaPath": "/introspection/sqon",
"catalogs": {
"participants": {
"description": "Clinical trial participant records.",
"documentType": "participant",
"paths": {
"fields": "/introspection/fields",
"graphql": "/graphql",
"introspection": "/introspection/participants"
}
}
}
}

description is omitted when not set in the catalogue config. paths.fields is only present in single-catalogue mode.


GET /introspection/:catalogId

Returns field-level details for one catalogue. Use this to understand what fields exist, their types, and which SQON operators apply to each type.

{
"catalogId": "participants",
"description": "Clinical trial participant records.",
"documentType": "participant",
"generatedAt": "2026-05-27T00:00:00.000Z",
"meta": { "authFiltered": false },
"operators": {
"keyword": ["in", "not-in", "some-not-in", "all", "filter"],
"long": ["in", "not-in", "gt", "gte", "lt", "lte", "between"],
"date": ["in", "not-in", "gt", "gte", "lt", "lte", "between"]
},
"fields": {
"participant_id": {
"displayName": "Participant ID",
"type": "keyword"
},
"age_at_diagnosis": {
"displayName": "Age at Diagnosis",
"type": "long",
"unit": "year"
},
"diagnosis_date": {
"displayName": "Diagnosis Date",
"type": "date"
}
}
}

operators groups valid SQON operators by field type. To find which operators apply to a field, look up operators[field.type]. Only the types actually present in the catalogue's fields appear here.

fields lists each field with its displayName, type, and optional unit. The description key is omitted when not configured.

In single-catalogue mode, /introspection/fields is an alias for this endpoint.


GET /introspection/sqon

Returns the SQON JSON Schema and operator metadata shared across all catalogues — combination operators (and, or, not), field operators with value types and applicability, and accepted aliases. Use this to validate or describe SQON structure independently of any catalogue's field set.