Comparison

SQE vs DuckDB

The goal isn’t to be DuckDB. SQE is the Iceberg-first analytical engine that runs both as an embedded single binary and as a distributed cluster with OIDC pass-through and policy enforcement. Where SQE matches DuckDB on the embedded side, the same binary scales to a multi-tenant cluster.

Current as of 4 July 2026 · synced from the SQE source

What SQE has that DuckDB doesn’t

Capability	SQE	DuckDB
Execution model	embedded and distributed	single process
Per-query OIDC bearer passthrough	yes	no
OPA / Cedar policy (row filters, masks)	yes	no
Multi-catalog in one engine	Polaris · Nessie · Glue · HMS · S3 Tables	extension-by-extension
Iceberg V3 read + write	native	extension, read-only
Arrow Flight SQL wire protocol	yes	extension

DuckDB-inspired, now in SQE

The embedded-mode capabilities SQE grew while chasing DuckDB’s developer ergonomics — all on the same binary that runs the cluster.

Embedded single binary

sqe-cli --embedded, a DuckDB-class single-process engine with a persistent SQLite-backed warehouse at ~/.sqe/warehouse/.

File-format table functions

read_parquet, read_csv (full DuckDB-parity options), and read_json / read_json_auto.

Query a file directly

SELECT * FROM 'file.parquet' auto-detects parquet, csv, and json on local, S3, HTTPS, and hf:// paths.

COPY … TO

Native COPY <source> TO 'file' (FORMAT csv|json|parquet) export, plus CTAS across formats.
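A minimal sketch of the export forms named above. The `orders` table, its `region` column, and the output paths are illustrative. The bare-table form of COPY and the reading of "CTAS across formats" as creating a table from a file in another format are assumptions, not confirmed SQE behaviour.

```sql
-- Export a query result as CSV, naming the format explicitly
COPY (SELECT * FROM orders WHERE region = 'EU') TO 'eu_orders.csv' (FORMAT csv);

-- Export a whole table as Parquet (assumes a bare table name is a valid <source>)
COPY orders TO 'orders.parquet' (FORMAT parquet);

-- CTAS across formats: materialize the CSV export as a table
CREATE TABLE eu_orders AS
SELECT * FROM read_csv('eu_orders.csv', header => true);
```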

Network-transparent access

An httpfs-equivalent for HTTP/HTTPS/S3, the AWS provider chain, Azure ADLS Gen2, GCS, and Cloudflare R2.

HuggingFace hf:// URLs

hf://datasets/owner/name/path with ?revision= / @rev pinning and the @~parquet auto-view.
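The pinning forms might be used as follows. This is a sketch only: `owner/name`, the revision `main`, and the file paths are placeholders, and putting `@rev` directly after the dataset name follows DuckDB's convention rather than anything the SQE source confirms.

```sql
-- Pin a revision with the @rev form (placement after the dataset name is assumed)
SELECT * FROM 'hf://datasets/owner/name@main/data/train.parquet' LIMIT 5;

-- The same pin expressed as a query parameter
SELECT * FROM 'hf://datasets/owner/name/data/train.parquet?revision=main' LIMIT 5;

-- Read from the auto-converted Parquet view (path under the view is illustrative)
SELECT * FROM 'hf://datasets/owner/name@~parquet/default/train/0000.parquet' LIMIT 5;
```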

CSV ergonomics

Extension-based delimiter + codec auto-detect (.tsv, .gz, .zst…) and DuckDB-friendly aliases (sep, delim, header).
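As a rough illustration, assuming the named-argument style (`name => value`) used elsewhere on this page and hypothetical file paths:

```sql
-- .tsv implies a tab delimiter and .gz implies gzip, so no delimiter or codec is given
SELECT * FROM read_csv('data/orders.tsv.gz', header => true) LIMIT 3;

-- When the extension says nothing, set the delimiter through the sep alias
SELECT * FROM read_csv('data/orders.txt', sep => '|', header => true) LIMIT 3;
```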

SQL niceties

DESCRIBE, SUMMARIZE, SELECT * EXCLUDE (…), and SELECT * REPLACE (… AS col).
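A short sketch of these statements, written in the forms DuckDB accepts and assuming SQE takes the same ones. The `orders` table and its `internal_id` and `region` columns are hypothetical.

```sql
-- Column names and types
DESCRIBE orders;

-- Per-column summary statistics
SUMMARIZE orders;

-- Every column except internal_id
SELECT * EXCLUDE (internal_id) FROM orders;

-- Every column, with region swapped for its upper-cased value
SELECT * REPLACE (upper(region) AS region) FROM orders;
```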

See it — DuckDB-style, in SQE

The same file-first ergonomics, in an embedded sqe-cli session. No cluster, no catalog setup.

sqe-cli --embedded

# Query a Parquet file directly: no table, no setup
sqe> SELECT count(*) FROM 'data/events.parquet';
┌──────────┐
│ count(*) │
├──────────┤
│  1048576 │
└──────────┘

# read_csv with DuckDB-parity options + gzip auto-detect
sqe> SELECT * FROM read_csv('s3://bucket/orders.tsv.gz', header => true) LIMIT 3;

# A HuggingFace dataset, straight from the hf:// URL
sqe> SELECT * FROM 'hf://datasets/squad/plain_text/train.parquet' LIMIT 5;

# Delta Lake with time travel
sqe> SELECT * FROM read_delta('/data/delta/sales', version => '5');

# Export results back out
sqe> COPY (SELECT * FROM orders WHERE region='EU') TO 'eu.parquet';

What DuckDB still has that SQE doesn’t

Capability	Why not (yet)
FROM-first syntax (`FROM t SELECT …`)	DataFusion parser does not support it
Struct / list / map literal sugar	Partial: nested types work, but the literal syntax is less ergonomic
List comprehensions, lambdas	DataFusion does not support them
`PIVOT` / `UNPIVOT`	DataFusion parser does not support it
`ASOF JOIN`	Open upstream issue, not yet landed
`postgres` / `mysql` / `sqlite` connectors	Positioning: SQE is Iceberg-first
`spatial`, `vss`, `fts`, `excel`	Niche; use a purpose-built engine
HuggingFace glob (`*/.parquet`)	In progress: tree-API cache landed, object-store wiring is next

Smoke test

Loading DuckDB’s own train_services.parquet over HTTPS, same machine, same network:

1.618 s · SQE embedded

1.815 s · DuckDB v1.4.4

Not a benchmark claim: this is a single-query smoke test, and it shows only that embedded mode was at least as fast as DuckDB on one basic file load.