
Appendix: Tree Normal Form

Tree grouping bridges two worlds: the relational (flat, tabular, join-oriented) and the hierarchical (nested, tree-structured, document-oriented). Not all JSON has a sensible relational interpretation – arbitrary nesting can be too irregular to map cleanly. But some trees do map cleanly, and understanding which ones helps clarify what tree grouping actually computes.

This appendix defines tree normal forms – a vocabulary for describing JSON structure and its relational interpretation. Unlike database normal forms (which form a linear hierarchy), tree normal forms are organized as a graph. Nodes are forms; edges are constraints or interpretations. Some edges restrict structure; others assign meaning to nesting.

The Graph

[Mind map of the tree normal form graph]

Edges represent:

Edge Type Meaning
TNF-0 → TNF-T Restriction Structural hygiene
TNF-T → TNF-N Interpretation Nesting means namespacing
TNF-T → TNF-G Interpretation Nesting means grouping
TNF-T → TNF-M Interpretation Nesting means pivot
TNF-G → TNF-SR Restriction No nested groups (flat)
TNF-G → TNF-R Restriction Single path, no siblings
TNF-G → TNF-GN Combination Grouping with namespaced leaves

Tree normal form edge types

The graph admits extension. New forms slot in by defining their edges.


The Forms

TNF-0: Valid JSON

The baseline: any valid JSON per RFC 7159.

  • Keys may be duplicated
  • Arrays may be heterogeneous
  • Top-level value may be a scalar
  • Objects may be empty

[ 1, [], "a string", true, {},
  { "a": [2, 3], "a": [2, 3] },
  [ 2, 3, [ 4, 5 ] ]
]

Relational interpretation: None guaranteed. This is raw material. But it’s still queryable – pathing and json_each work on any valid JSON.
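
Even TNF-0 trees are queryable. As a sketch, assuming a Python build whose bundled SQLite includes the JSON1 functions (the default in modern builds), json_each enumerates the top level of any valid JSON:

```python
import json
import sqlite3

# Arbitrary TNF-0 JSON: heterogeneous array, empty containers.
raw = json.dumps([1, [], "a string", True, {}])

conn = sqlite3.connect(":memory:")
# json_each is a table-valued function: one row per top-level element,
# with key (array index), type, and value columns.
rows = conn.execute(
    "SELECT key, type, value FROM json_each(?)", (raw,)
).fetchall()
```

No relational interpretation is required – json_each exposes the raw elements as rows regardless of shape.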


TNF-T: Well-Typed JSON

Restriction from TNF-0: no duplicate keys, homogeneous arrays, non-empty objects, array or object at top level.

{
  "name": "Alice",
  "scores": [95, 87, 91]
}

Relational interpretation: Arrays can be interpreted as collections; objects as records. Pathing is unambiguous. But nesting semantics are not yet defined – is a nested object a namespace? A grouped row? A pivot?

TNF-T is the foundation for the interpretive forms that follow.


TNF-N: Namespaced

Interpretation from TNF-T: nesting means semantic organization.

Nested objects group related fields – structure, not data rows.

{
  "LastName": "eklund",
  "address": {
    "City": "boston",
    "State": "MA"
  }
}

The address object is a namespace. The tree is semantically equivalent to:

{
  "LastName": "eklund",
  "address_City": "boston",
  "address_State": "MA"
}

Relational interpretation: Namespaced trees flatten to a single row. Pathing (.address.City) navigates the namespace.

Trade-off: More expressive (preserves semantic grouping) but less directly relational (requires flattening).
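
The flattening can be sketched in a few lines of Python (the helper name flatten and the "_" separator are illustrative choices, not part of any spec):

```python
def flatten(tree, prefix=""):
    """Flatten a namespaced (TNF-N) object into a single flat row.
    Nested object keys are joined with '_', as in address_City."""
    row = {}
    for key, value in tree.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}_"))
        else:
            row[name] = value
    return row

doc = {"LastName": "eklund", "address": {"City": "boston", "State": "MA"}}
flat = flatten(doc)
```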


TNF-G: Grouped

Interpretation from TNF-T: nesting means aggregation.

Arrays represent grouped rows – the result of GROUP BY.

[
  { "Title": "Engineer",
    "people": [
      { "FirstName": "Alice", "LastName": "Smith" },
      { "FirstName": "Bob", "LastName": "Jones" }
    ]
  },
  { "Title": "Manager",
    "people": [
      { "FirstName": "Carol", "LastName": "White" }
    ]
  }
]

Each nesting level is a grouping context. The outer array groups by Title; the inner people array collects rows within each title.

Relational interpretation: Direct correspondence to GROUP BY. Construction compresses cardinality; destructuring expands it.
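
As an illustrative Python sketch (the helper names group and destructure are assumptions, not DQL operators), construction and destructuring are inverse passes over the Title grouping above:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"Title": "Engineer", "FirstName": "Alice", "LastName": "Smith"},
    {"Title": "Engineer", "FirstName": "Bob", "LastName": "Jones"},
    {"Title": "Manager", "FirstName": "Carol", "LastName": "White"},
]

def group(rows, key, into):
    """Construct a TNF-G tree: GROUP BY `key`, collecting the
    remaining columns into the nested `into` array."""
    out = []
    for k, grp in groupby(sorted(rows, key=itemgetter(key)), key=itemgetter(key)):
        out.append({key: k,
                    into: [{c: v for c, v in r.items() if c != key} for r in grp]})
    return out

def destructure(tree, key, into):
    """Invert group(): expand nested arrays back into flat rows."""
    return [{key: node[key], **leaf} for node in tree for leaf in node[into]]

tree = group(rows, "Title", "people")
```

Construction compresses three rows into two grouped nodes; destructuring recovers the original three.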


TNF-M: Metadata-Keyed

Interpretation from TNF-T: nesting means pivot.

Data values become object keys.

{
  "Engineer": [
    { "FirstName": "Alice", "LastName": "Smith" }
  ],
  "Manager": [
    { "FirstName": "Carol", "LastName": "White" }
  ]
}

The keys (Engineer, Manager) are data values lifted to metadata.

Relational interpretation: Keys map to a column; values map to grouped rows. Destructuring recovers the key as a column value.

Trade-off: Convenient for lookup but constrained – only one column can serve as keys per level. Two metadata-keyed objects with the same key type create ambiguous destructuring.
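
A minimal Python sketch of the pivot and its inverse (the names pivot and unpivot are illustrative):

```python
def pivot(rows, key):
    """Lift a column's values to object keys (TNF-M)."""
    out = {}
    for r in rows:
        out.setdefault(r[key], []).append(
            {c: v for c, v in r.items() if c != key})
    return out

def unpivot(obj, key):
    """Recover the metadata keys as column values."""
    return [{key: k, **leaf} for k, leaves in obj.items() for leaf in leaves]

rows = [
    {"Title": "Engineer", "FirstName": "Alice", "LastName": "Smith"},
    {"Title": "Manager", "FirstName": "Carol", "LastName": "White"},
]
keyed = pivot(rows, "Title")
```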


TNF-SR: Simply Relational

Restriction from TNF-G: no nested groups.

A flat array of homogeneous objects – the simplest grouped form.

[
  { "Title": "Engineer", "FirstName": "Alice", "LastName": "Smith" },
  { "Title": "Engineer", "FirstName": "Bob", "LastName": "Jones" },
  { "Title": "Manager", "FirstName": "Carol", "LastName": "White" }
]

No nested arrays. Each object is a row; the array is a table.

Relational interpretation: Direct. The JSON is a table in array-of-objects form. No grouping, no hierarchy – just rows.


TNF-R: Round-Trippable

Restriction from TNF-G: single path from root to deepest leaf, no sibling groups.

[
  { "Title": "Engineer",
    "State": "CA",
    "people": [
      { "FirstName": "Alice", "LastName": "Smith" }
    ]
  }
]

Relational interpretation: Lossless. relation → tree → relation recovers the original data (modulo column order).

Why siblings break round-tripping:

employee(*) ~> { Title,
                 "people": ~> {FirstName, LastName},
                 "cities": ~> [City] }

Siblings aggregate independently. The join – which person was in which city – is not preserved. Destructuring recovers each path independently:

  • Title, FirstName, LastName (via people)
  • Title, City (via cities)

But not the original four-column row. This is TNF-G but not TNF-R.
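
A small Python sketch makes the information loss concrete: each sibling path destructures on its own, but recombining them can only cross all people with all cities.

```python
rows = [
    {"Title": "Engineer", "FirstName": "Alice", "LastName": "Smith", "City": "Boston"},
    {"Title": "Engineer", "FirstName": "Bob", "LastName": "Jones", "City": "Austin"},
]

# Sibling groups aggregate independently under the shared Title.
tree = {
    "Title": "Engineer",
    "people": [{"FirstName": r["FirstName"], "LastName": r["LastName"]} for r in rows],
    "cities": [r["City"] for r in rows],
}

# Each path destructures fine on its own...
people_rows = [{"Title": tree["Title"], **p} for p in tree["people"]]
city_rows = [{"Title": tree["Title"], "City": c} for c in tree["cities"]]

# ...but which person was in which city is gone: recombining the siblings
# yields the cross product (4 rows), not the original pairing (2 rows).
recombined = [{**p, "City": c["City"]} for p in people_rows for c in city_rows]
```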


TNF-GN: Grouped with Namespaced Leaves

Combination of TNF-G and TNF-N: grouping structure with namespaced leaf objects.

[
  { "Title": "Engineer",
    "people": [
      { "name": { "first": "Alice", "last": "Smith" },
        "contact": { "email": "[email protected]", "phone": "555-1234" }
      }
    ]
  }
]

The outer structure is grouped (array of objects with nested arrays). The leaf objects use namespacing (name, contact).

Relational interpretation: Destructure the grouping levels; flatten the namespaced leaves. The result has columns Title, name_first, name_last, contact_email, contact_phone.
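
Combining the two interpretations in Python (flatten here is the same illustrative helper as under TNF-N):

```python
def flatten(obj, prefix=""):
    """Flatten namespaced leaf objects, joining keys with '_'."""
    row = {}
    for k, v in obj.items():
        if isinstance(v, dict):
            row.update(flatten(v, prefix=f"{prefix}{k}_"))
        else:
            row[f"{prefix}{k}"] = v
    return row

tree = [
    {"Title": "Engineer",
     "people": [
         {"name": {"first": "Alice", "last": "Smith"},
          "contact": {"email": "[email protected]", "phone": "555-1234"}}
     ]}
]

# Destructure the grouping level, then flatten each namespaced leaf.
rows = [{"Title": g["Title"], **flatten(p)} for g in tree for p in g["people"]]
```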


Mixing Forms

Real trees often combine forms at different levels. The graph shows which combinations make sense:

{
  "metadata": {
    "generated": "2024-01-15",
    "version": "1.0"
  },
  "data": [
    { "Title": "Engineer",
      "people": [
        { "FirstName": "Alice", "LastName": "Smith" }
      ]
    }
  ]
}
  • metadata is TNF-N (namespacing)
  • data is TNF-G (grouping)

The relational interpretation:

  • Flatten metadata.generated, metadata.version to columns
  • Destructure data.people to rows
  • Result: one row per person, metadata fields repeated

Understanding which form applies where clarifies what operations make sense.


Edge Types

The graph has two kinds of edges:

Restriction edges add structural constraints:

  • TNF-0 → TNF-T: hygiene (no dup keys, homogeneous arrays)
  • TNF-G → TNF-SR: flatness (no nested groups)
  • TNF-G → TNF-R: single-path (no siblings)

Interpretation edges assign meaning to structure:

  • TNF-T → TNF-N: nesting is namespacing
  • TNF-T → TNF-G: nesting is grouping
  • TNF-T → TNF-M: nesting is pivot

Restriction edges constrain what trees are valid. Interpretation edges determine how to read them relationally.


Summary

Form Key Property Relational Interpretation
TNF-0 Valid JSON None guaranteed; queryable via pathing
TNF-T Well-typed Arrays are collections; objects are records
TNF-N Namespaced Flatten to single row
TNF-G Grouped GROUP BY; destructure to rows
TNF-M Metadata-keyed Pivot; keys become column values
TNF-SR Simply relational Direct table (array of flat objects)
TNF-R Round-trippable Lossless construction/destruction
TNF-GN Grouped + namespaced Destructure groups, flatten namespaces

Summary of tree normal forms

The forms answer different questions:

  • TNF-0 / TNF-T: Is this JSON structurally sound?
  • TNF-N / TNF-G / TNF-M: What does nesting mean here?
  • TNF-SR / TNF-R: How constrained is the grouping?
  • TNF-GN: Can I mix interpretations?

Tree normal forms are not prescriptive – TNF-0 is sometimes exactly what you need. They are a vocabulary for understanding what your tree structure means relationally, and what operations it supports.


Extending the Graph

The graph admits new forms by defining edges. Examples:

  • TNF-MR (metadata-keyed, round-trippable): TNF-M + single-path constraint
  • TNF-SN (simply namespaced): TNF-N + flat (no nested namespaces)
  • TNF-GM (grouped + metadata): grouping with metadata-keyed intermediate levels

Each new form names a useful combination.

Guiding Principles

  1. Trees and tables mix. Trees have a valid relational interpretation under certain structural constraints.

  2. Start with relations. The cleanest trees arise from grouping relations, not from arbitrary JSON. Construction informs understanding.

  3. Arrays are rows; objects are records. Arrays represent homogeneous collections (multiple rows of the same shape). Objects represent heterogeneous structure (named fields, like columns).

  4. Grouping compresses; destructuring expands. Construction decreases cardinality (many rows → fewer rows with nested arrays). Destructuring increases cardinality (nested arrays → many rows).

  5. Siblings lose information. Sibling tree groups aggregate independently. The relationship between siblings – which person was in which city – is not preserved. This is inherent, not a bug.

  6. Metadata-oriented trees are pivots. When data values become object keys, the structure resembles a pivot table. This helps with the object-relational impedance mismatch but introduces constraints.

  7. Zeroth normal form has its place. Arbitrary JSON can still be queried via json_each and pathing. Tree normal forms define what’s cleanly relational, not what’s queryable at all.

Appendix: Error URI Taxonomy

Every compilation error carries a hierarchical URI that identifies the error category. Error hooks use these URIs for prefix matching:

-- matches any DQL semantic error
users(*) |> (foo.*) (~~error://dql/semantic ~~)

-- matches only table resolution failures
nonexistent_table(*) (~~error://dql/semantic/resolution/table ~~)

-- matches any error at all
bad_query(*) (~~error ~~)

The URI is a stable identifier independent of the error message text. It doubles as the canonical reference for documentation, tooling, and diagnostics.

Design Principles

  1. Domain first. The top level identifies the language being processed: dql/, ddl/, dml/. Users know whether they wrote a query or a definition.

  2. Parse vs semantic. The second level is the classic compiler split. Parse errors mean the source text is structurally invalid. Semantic errors mean the structure is valid but the meaning is wrong.

  3. Prefix matching does the work. Each level narrows usefully: error://dql catches any DQL error. error://dql/semantic catches any semantic error. error://dql/semantic/resolution catches any name binding failure.

  4. No validation. The term is too vague. semantic says what the category is. constraint, arity, resolution say what went wrong.

Prefix Matching

Error hooks match by prefix. An expected URI of dql/semantic matches dql/semantic itself and any actual URI that starts with dql/semantic/:

Expected Matches
error://dql any DQL error (parse or semantic)
error://dql/semantic any semantic error
error://dql/semantic/resolution resolution/table, resolution/column, resolution/ambiguous, etc.
error://dql/semantic/resolution/table table resolution failures only
error://dql/parse any parse failure
(bare) any error
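
Segment-wise prefix matching can be sketched as follows (matches is a hypothetical helper, not delightql's actual implementation):

```python
def matches(expected, actual):
    """Hierarchical prefix match: `expected` matches `actual` when it
    names the same URI or an ancestor of it. Matching is per path
    segment, so dql/parse does not match dql/parser."""
    exp = expected.removeprefix("error://")
    act = actual.removeprefix("error://")
    return act == exp or act.startswith(exp + "/") or exp == ""

ok_ancestor = matches("error://dql/semantic", "error://dql/semantic/resolution/table")
ok_exact = matches("error://dql/semantic", "error://dql/semantic")
no_segment = matches("error://dql/parse", "error://dql/parser")  # segment-wise, not string-wise
ok_bare = matches("", "error://dql/parse")  # bare form matches any error
```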

URI Hierarchy

dql/parse/ — Structural Failures

The source text does not form a valid CST, or CST-to-AST conversion finds malformed structure. The problem is syntactic.

URI Condition Trigger
dql/parse Any parse failure
dql/parse/tree_sitter Tree-sitter library error
dql/parse/literal Malformed literal 0xGG, 0o89
dql/parse/expression Malformed expression x +, empty expression
dql/parse/anon Malformed anonymous table _(a @ 2, 3)
dql/parse/pipe Malformed pipe expression x /->
dql/parse/function Malformed function call missing name, lambda body
dql/parse/case Malformed CASE expression missing arm, missing result
dql/parse/window Malformed window spec invalid frame mode
dql/parse/json_path Malformed JSON path [name], {42}
dql/parse/projection Empty or invalid projection |> -(*)
dql/parse/subquery Malformed scalar subquery missing table, missing continuation
dql/parse/pattern Malformed pattern literal invalid /pattern/ format

Fine-grained leaves (e.g. dql/parse/literal/hex) can be added later. The second level is the useful grain for error hooks.

dql/semantic/ — Semantic Failures

The structure is valid but the meaning is wrong. Names do not resolve, arities do not match, or domain constraints are violated.

dql/semantic/resolution/ — Name Binding Failures

URI Condition Trigger
dql/semantic/resolution Any name binding failure
dql/semantic/resolution/table Table or view not found nonexistent(*)
dql/semantic/resolution/column Column cannot be resolved |> (bad_col)
dql/semantic/resolution/function Function or HO view not found
dql/semantic/resolution/sigma Sigma predicate not found
dql/semantic/resolution/ambiguous Name matches multiple entities cross-join with shared column
dql/semantic/resolution/scope Name exists but unreachable column behind pipe barrier, post-group leak

Why ambiguous lives under resolution. Ambiguity is the dual of not-found: resolution fails because there are zero matches (not found) or multiple matches (ambiguous). Both are failures of name binding.

Why scope lives under resolution. The name exists in the schema, but the current scope cannot see it. The column is behind a pipe barrier, or a group-by reduced the visible columns. It is a resolution failure with a specific cause.

dql/semantic/arity/ — Wrong Argument Count

URI Condition Trigger
dql/semantic/arity Wrong argument count (general)
dql/semantic/arity/function Function call arity
dql/semantic/arity/predicate Predicate arity +between(1, age)
dql/semantic/arity/sigma Sigma predicate arity
dql/semantic/arity/pattern Positional pattern element count users(a, b, c)

Why arity is separate from resolution. Resolution is about finding the entity. Arity is about calling it. A function can resolve successfully and still fail on arity. These are different failure modes with different fixes: “did you spell it right?” vs “did you pass the right number of arguments?”

dql/semantic/constraint/ — Domain Rule Violations

The query is valid and all names resolve with correct arity, but a domain-specific rule is violated.

URI Condition Trigger
dql/semantic/constraint Any constraint violation
dql/semantic/constraint/pivot Pivot requirements not met missing IN predicate, duplicate column
dql/semantic/constraint/destructuring Destructuring rule violated multiple ~>, comparison in pattern
dql/semantic/constraint/join Join constraint violated multiple full outer, missing condition
dql/semantic/constraint/context Context-aware function misuse typo, wrong args, missing marker
dql/semantic/constraint/unsupported Construct not supported in this position IN in projection, EXISTS in CASE

Why constraint replaces validation. The word constraint names what went wrong: a domain rule was violated. Pivot requires an IN predicate. Destructuring forbids comparisons. Full outer join cannot have multiple targets. These are specific rules, not generic “validation.”

dql/semantic/limitation/ — Known Limitations

URI Condition
dql/semantic/limitation Any known limitation
dql/semantic/limitation/qualified_name_ambiguity Grammar ambiguity with qualified names ending in .
dql/semantic/limitation/not_implemented Feature not yet implemented

ddl/ — DDL Errors

DDL errors are structurally similar to DQL errors but fewer in number.

URI Condition
ddl/parse DDL syntax failure
ddl/semantic/resolution Referenced entity not found
ddl/semantic/constraint DDL rule violated (circular dependency, duplicate definition)

dml/ — DML Errors

URI Condition
dml/parse DML syntax failure
dml/semantic/resolution Target entity not found
dml/semantic/constraint DML rule violated

database/ and io/ — Runtime Errors

These errors occur during query execution, not compilation. They do not belong to a language domain.

URI Condition
database Any database operation error
database/connection Connection lock poisoned
io I/O error

Implementation Notes

The current implementation derives subcategories from error message keywords for ValidationError, TransformationError, and TranspilationError. Stable, static error types (TableNotFoundError, ColumnNotFoundError) already carry precise URIs. A planned refactor will add explicit subcategory fields to all dynamic error types, making URIs independent of message text.

Appendix: Danger URI Taxonomy

Certain behaviors are safe in most contexts but dangerous in others. Rather than forbid them outright, delightql gates them behind danger URIs – named safety boundaries that are closed by default and opened explicitly per-query.

-- open a specific danger for one query
employee(*) as e (~~danger://dql/cardinality/nulljoin ON~~),
  department(*) as d,
  e.DepartmentId = d.DepartmentId

-- the danger auto-closes at query end
employee(*) as e, department(*) as d,
  e.DepartmentId = d.DepartmentId
-- this query uses safe defaults again

The URI is a stable identifier. It doubles as the canonical reference for documentation, tooling, and diagnostics – the same role that error URIs serve for compilation errors.

Design Principles

  1. Off by default. Every danger starts OFF. The safe behavior is active unless the programmer explicitly requests otherwise.

  2. Domain first. The top level identifies the language domain: dql/, ddl/, dml/. This mirrors the error URI hierarchy.

  3. What-goes-wrong second. The second level names the category of harm: cardinality/ (row-count blowup), termination/ (non-halting computation), precision/ (silent data loss). Where error URIs use what phase failed (parse, semantic), danger URIs use what goes wrong – because dangers are not phase-specific.

  4. Prefix matching does the work. Each level narrows usefully. danger://dql catches any DQL danger. danger://dql/cardinality catches any cardinality blowup. danger://dql/cardinality/nulljoin catches only that specific case.

  5. No bare form. (~~danger://dql/cardinality/nulljoin~~) without ON or OFF is an error. Being explicit about the toggle is the entire point.

  6. Query-scoped. A danger gate opens for one query and auto-closes at query end. It does not leak into subsequent queries.

  7. The URI is the documentation. The danger URI in source code is also the canonical reference for what the danger means and why it exists.

Syntax

employee(*) as e (~~danger://dql/cardinality/nulljoin ON~~),
  department(*) as d,
  e.DepartmentId = d.DepartmentId

The annotation lives inside the annotation delimiters (~~ ... ~~) and attaches at a continuation point (after a relation). It is an annotation that travels with the query but is not part of the relational algebra.

Component Meaning
danger:// URI scheme identifying a danger gate
dql/cardinality/nulljoin Hierarchical path to the specific danger
ON Enable the dangerous behavior for this query
OFF Restore the safe default (useful to override a CLI baseline)
ALLOW Permit but do not force – the compiler may use the dangerous path if needed
1-9 Graduated severity levels for host-defined behavior

Toggle Values

ON and OFF are the common cases. They are binary: the dangerous behavior is either active or not.

ALLOW is a middle ground. It tells the compiler that the dangerous behavior is acceptable but not required. The compiler may choose the safe path when it can and the dangerous path when it must. This is useful for queries where the programmer has verified that the data does not trigger the danger but wants the compiler to retain latitude.

The severity levels 1 through 9 exist for host-defined policies where binary on/off is too coarse. The language defines no semantics for specific levels – the host interprets them. Example uses:

  • A linter that warns at level 3 but errors at level 7
  • A monitoring system that logs at level 1 but alerts at level 5
  • A deployment pipeline that permits level 1-4 in staging but only level 1-2 in production

The severity levels are ordered: higher numbers indicate greater willingness to accept the danger. A tool checking “is danger level at least N?” can compare numerically.
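
A sketch of that numeric comparison from the host's side (the lint_action policy is hypothetical, mirroring the linter example above):

```python
def at_least(setting, threshold):
    """Host-side check: is a numeric danger level at least `threshold`?
    ON/OFF/ALLOW are not numeric; only levels 1-9 compare."""
    return setting.isdigit() and int(setting) >= threshold

def lint_action(level):
    """A hypothetical linter that warns at level 3 and errors at level 7."""
    if at_least(level, 7):
        return "error"
    if at_least(level, 3):
        return "warn"
    return "ok"
```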

Multiple dangers may be opened for the same query:

employee(*) as e
  (~~danger://dql/cardinality/nulljoin ON~~)
  (~~danger://dql/cardinality/cartesian ON~~),
  department(*) as d,
  e.DepartmentId = d.DepartmentId

Defaults and Overrides

The program starts with a default table where every danger is OFF:

danger://dql/cardinality/nulljoin            OFF
danger://dql/cardinality/cartesian           OFF
danger://dql/termination/unbounded           OFF
danger://dql/semantics/min_multiplicity      OFF

Override Scopes

Not all dangers accept overrides from the same places. The scope at which a danger can be overridden depends on whether it changes language semantics or execution guardrails:

URI Inline File CLI Category
dql/cardinality/nulljoin yes yes no semantic
dql/cardinality/cartesian yes yes yes guardrail
dql/termination/unbounded yes yes yes guardrail
dql/semantics/min_multiplicity yes yes no semantic

Semantic dangers change what operators mean. The nulljoin gate redefines = in join position from SQL = to IS NOT DISTINCT FROM. A DQL script should mean the same thing regardless of who runs it and what CLI flags they pass. Semantic overrides must live in the source text – either inline on the query or at the top of the file – so the script is self-documenting.

Guardrail dangers control whether the engine permits certain operations. They do not change expression semantics. Cartesian product rejection and unbounded recursion prevention are resource limits, not language redefinitions. These may be overridden at any scope, including the CLI.

The guiding principle: operator semantics are fixed by the source text. CLI flags may change SQL shape (via option://) or execution policy (via guardrail danger://), but never language meaning.

Session Baseline (CLI)

The CLI can shift the baseline for guardrail dangers:

dql query --danger dql/cardinality/cartesian=ON --db test.db "..."

Attempting to override a semantic danger from the CLI is an error:

# REJECTED: nulljoin is a semantic danger -- use inline annotation
dql query --danger dql/cardinality/nulljoin=ON --db test.db "..."

Override Precedence

Per-query annotations override the file-level directive. The file-level directive overrides the session baseline. At query end, the danger reverts to the file-level or session-level value:

CLI baseline  ---->  file directive  ---->  per-query  ---->  revert
    OFF                   ON                   OFF             ON
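
The precedence chain can be sketched as a simple resolution function (effective is a hypothetical name; per-query beats the file directive, the file directive beats the CLI baseline, and OFF is the default):

```python
def effective(cli=None, file=None, query=None):
    """Resolve the effective toggle for one query: the most specific
    scope that set a value wins; every danger defaults to OFF."""
    for setting in (query, file, cli):
        if setting is not None:
            return setting
    return "OFF"

# CLI baseline OFF, file directive ON, per-query OFF: this query runs OFF.
during_query = effective(cli="OFF", file="ON", query="OFF")

# At query end the per-query value is dropped and the file level applies again.
after_query = effective(cli="OFF", file="ON")
```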

Prefix Matching

Danger hooks match by prefix, identically to error hooks:

Expected Matches
danger://dql any DQL danger
danger://dql/cardinality nulljoin, cartesian, any future cardinality danger
danger://dql/cardinality/nulljoin null-join only
danger://dql/termination unbounded, any future termination danger

URI Hierarchy

dql/cardinality/ — Row-Count Blowups

The query may produce far more rows than the programmer expects. These dangers guard against silent multiplicative explosions in result cardinality.

URI Default Condition What happens when ON
dql/cardinality/nulljoin OFF = in join position compiles to SQL = = in join position compiles to IS NOT DISTINCT FROM. NULL keys match each other, producing a cartesian product of all NULL rows.
dql/cardinality/cartesian OFF Cross joins without an explicit condition are rejected Cross joins without conditions are permitted.

Why nulljoin is a cardinality danger. The NULL-by-NULL cross product is a multiplicative blowup. Five NULLs on the left and three on the right produce fifteen matched rows. The danger is not that NULLs participate in the join – it is that they participate combinatorially.

Why cartesian is a cardinality danger. A cross join of two million-row tables produces a trillion rows. Explicit cross joins are sometimes intended (for generating combinations), but an accidental cross join – one caused by a missing join condition – is one of the most common and costly SQL mistakes.

dql/termination/ — Non-Halting Computation

The query may not terminate.

URI Default Condition What happens when ON
dql/termination/unbounded OFF Recursive CTEs must include a termination condition Recursive CTEs without termination conditions are permitted.

Why unbounded is a termination danger. A recursive CTE without a termination condition produces an infinite result. In practice, the database engine will hit a resource limit and error – but only after consuming significant time and memory. The compiler can detect the absence of a termination condition statically and reject it early.

dql/semantics/ — Operator Semantics

The query’s meaning changes. These dangers alter what an operator computes, not merely whether it is permitted. They are semantic dangers: inline-only, never CLI-overridable.

URI Default Condition What happens when ON
dql/semantics/min_multiplicity OFF Intersection-via-correlation uses bidirectional semijoin (UNION ALL of EXISTS-filtered operands), producing m+n copies of matching tuples Intersection-via-correlation uses ROW_NUMBER + equi-join, producing min(m,n) copies – true INTERSECT ALL multiplicity.

Why min_multiplicity is a semantic danger. The bidirectional semijoin and the ROW_NUMBER path compute different multisets for duplicate tuples. Three copies in the left operand and two in the right yield five rows under bidirectional semijoin but two under min-multiplicity. The difference only surfaces with genuinely duplicate tuples, but it changes what the operator means – the same query produces different results. This is a semantic redefinition, so it must live in the source text.

Future Categories

The hierarchy is designed to grow. Possible future categories:

dql/precision/ — Silent Data Loss

URI Condition
dql/precision/implicit_cast Implicit type coercion that loses information
dql/precision/truncation String or numeric truncation without warning

dml/destructive/ — Irreversible Mutations

URI Condition
dml/destructive/unfiltered_update UPDATE without a WHERE condition
dml/destructive/unfiltered_delete DELETE without a WHERE condition

Relationship to Error URIs

Danger URIs and error URIs are sibling systems:

Error URIs Danger URIs
Scheme error:// danger://
When After compilation fails Before compilation (gate check)
Mechanism Prefix matching for error hooks Prefix matching for gate control
Top level Domain (dql/, ddl/, dml/) Domain (dql/, ddl/, dml/)
Second level Phase (parse/, semantic/) What goes wrong (cardinality/, termination/)
Default Errors always fire Dangers always off

Both use hierarchical URIs. Both support prefix matching. Both serve as stable identifiers for documentation and tooling. The difference is directional: error URIs report what went wrong; danger URIs prevent what could go wrong.

Namespace Directives

The image

A DQL session is a filesystem. You mount databases, install libraries, create directories. When you close the lid, the state persists. When you reopen it, everything is where you left it.

~::                             -- your home directory
├── data::wh                    -- a mounted database
├── analytics                   -- a consulted DDL library
│   └── helpers                 -- the library's internal dependency
├── analytics::grounded         -- library bound to data
└── scratch                     -- a namespace you made

~:: is home. :: is root (where sys and std live). Directives are the shell commands that shape this tree. Queries run inside it.

The image is a SQLite file – the bootstrap database serialized to disk. Not a replay script, but the actual state: namespace tree, entity definitions, connection metadata, timestamps, history. Since DQL already uses SQLite for its internal state, the image format is the system’s own storage format. Dogfooding.

# Ephemeral (default) -- fresh home, dies on exit
echo 'users(*)' | dql query --db warehouse.db

# Persistent -- your laptop
dql --session workspace.db --db warehouse.db -i
> mount!("ref.db", "data::ref")
> consult!("analytics.dql", "analytics")
> weekly_report(*)
> .quit                        # state saved to workspace.db

# Next day -- everything is where you left it
dql --session workspace.db -i
> weekly_report(*)             # just works

The image is queryable. mount!("old_session.db", "prev") and browse what you had last week. Diff two environments by joining their bootstrap tables. The session IS a database.

This is the Smalltalk image model applied to a query environment. Smalltalk’s images were opaque heap dumps. Jupyter notebooks improved this with ordered cells, but introduced a desync problem – run cells out of order and the kernel diverges from what the notebook shows. A DQL image has neither problem: it’s inspectable (it’s SQLite) and it’s the actual state (not a recipe that might diverge).

Directives

Queries transpile to SQL. Directives shape the environment in which queries run. mount! doesn’t produce SQL – it connects a database. consult! loads view definitions. enlist! makes names visible.

Every directive produces, consumes, borrows, or transforms a namespace.

Produce

mount!("warehouse.db", "data::wh")        -- connect database → DataNs
consult!("analytics.dql", "analytics")     -- load DDL file → LibNs
copy!("subset")                            -- pipe terminal: create from entity metadata → LibNs
consult_tree!("models/", "lib")            -- directory tree → nested LibNs
mount_tree!("postgres://host/db", "data")  -- database catalog → nested DataNs

The _tree variants mirror an external hierarchy (filesystem or database catalog) into the namespace tree. The caller names the root; the source names the branches. models/util/greet.dql becomes lib::util::greet.
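
The path-to-namespace mapping can be sketched as follows (namespace_for is a hypothetical helper illustrating the rule, not a directive):

```python
from pathlib import PurePosixPath

def namespace_for(root, rel_path):
    """Map a file inside a consulted tree to its namespace: the caller
    names the root; the source's directory structure names the branches."""
    parts = PurePosixPath(rel_path).with_suffix("").parts
    return "::".join((root, *parts))

ns = namespace_for("lib", "util/greet.dql")
```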

Consume

unmount!("data::wh")
unconsult!("analytics")
imprint!("analytics", "data::wh")         -- materializes views as tables, consumes LibNs

imprint! is linear – the library namespace is consumed. This prevents ghost duality (abstract definitions alongside concrete tables that inevitably drift).

Borrow

ground!("data::wh", "analytics", "analytics::g")   -- bind lib to data → GroundedNs
serialize!("analytics", "backup.dql")               -- write to file

Transform

refresh!("data::wh")           -- re-introspect schema
reconsult!("analytics")        -- reload from file

Scope-local (visibility)

enlist!("analytics")           -- bare names visible in my scope
alias!("data::wh", "wh")      -- wh.users(*) shorthand
delist!("analytics")           -- remove enlistment + alias

Scope-local operations are saved/restored at DDL boundaries. A DDL that enlists a namespace doesn’t pollute its caller.

Scratch namespaces

Inline DDL ((~~ddl:"name" ~~)) creates scratch namespaces that are ambient – they automatically bind to the database they were created under. A consulted library needs explicit ground! to connect its table references to data. A scratch namespace doesn’t – you’re defining views against the database that’s right here, and the system captures that binding at creation time.

(~~ddl:"helpers"
  young(*) :- users(*), age < 20
~~)
enlist!("helpers")
young(*)        -- users resolves against the current database

See book/design/inline-ddl.md for details on ambient binding, provenance, and the relationship between scratch and consulted namespaces.

Execution

play!("setup.dql")                  -- execute in my scope (source)
exec!("report.dql") |> (total)      -- execute, return last expression
run!("job.dql", "sandbox")          -- isolated sub-session
save!()                             -- persist ~:: to session file

Pipe schemas

Every directive produces one unnamed positional column: the namespace it affected. No status column – rows mean success, errors mean failure.

consult!("a.dql","ns1";"b.dql","ns2")(*) |> enlist!()
lib::(*) |> pick("view1";"view2") |> copy!("subset")
mount!("a.db","da";"b.db","db")(*) |> enlist!()

Scalar-lifted arguments (; between pairs) produce multiple rows. Pipe terminals read the single column positionally.

Nesting

DDL files don’t know their own name. The caller chooses:

consult!("analytics.dql", "analytics")   -- caller's choice
consult!("analytics.dql", "reports")     -- different caller, different name

A DDL that needs helpers cannot self-nest – it doesn’t have crate:: or __name__. Auto-nesting solves this: directives inside a DDL are prefixed under the DDL’s namespace automatically.

-- Inside analytics.dql:
consult!("helpers.dql", "helpers")     -- becomes analytics::helpers
consult!("shared.dql", "::shared")     -- :: escapes to global root

Prefix Target Unix analogy
(bare) relative to current DDL ./
~:: session root ~/
:: global root /

When two DDLs consult the same file, the namespace tree has two entries. The engine shares resources behind the scenes (connections are ref-counted by URI). The semantics are value-level copies; the implementation shares structure. Functional data structures.

Ownership

Namespace directives have ownership semantics. Each directive either produces, consumes, borrows, or transforms a namespace resource.

Key rules:

  • Can’t unmount! a DataNs that’s borrowed by a ground!
  • Can’t unconsult! a LibNs that’s borrowed by a ground!
  • imprint! consumes the LibNs – use-after-imprint is an error
  • delist! drops both enlistments and aliases
  • Destroying a parent namespace cascades to children

These enforce real invariants (no dangling views, no stale groundings) through the type system rather than programmer discipline.

Full directive signatures with ownership annotations are in DESIGN-namespace-directives.md.