Precedence and Scoping
Relational Precedence
All relational operators in delightql are left associative. This creates an explicit order of operations that is understandable and can be analyzed in full piped notation. A rule of thumb is that if the grammar of a language makes it easier for the compiler, then that same grammar will make it easier for the programmer who needs to write and maintain code. As predicates and relational operators are introduced from the left, so do newly scoped variables and transformations become available to the right.
employee(*)
{++} employee_2019(*)
, ? department(*.(DepartmentId))
, trim:(lower:(Department)) != "executive"
|> $( coalesce:(DepartmentCity, "UNKNOWN") as DepartmentCity)
|> %( DepartmentCity ~> count:(*) as employee_count, avg:(Salary) )
The above delightql uses full piped notation where every line represents a relation as it travels through a series of relational operators.
- Line 1 requests a predicate. The current piped relation is the same as the
employeepredicate. - Line 2 uses a
union allbinary relational operator. The current piped relation has the same schema but with additional rows ofemployee_2019, and thus a larger cardinality. - Line 3 uses a binary outer join. The current piped relation has a schema of both predicates and a cardinality of the product of both predicates’ cardinality.
- Line 4 changes the cardinality by issuing a
usingsigma clause. Bothonandusingare semantically the same as awhereclause. The current piped relation has the same schema as before but rows not matching the join condition have been removed. The cardinality is lower-bounded by the cardinality ofemployee union all employee_2019asdepartmentis participating as anouterrelation. - Line 5 uses a sigma clause to remove more rows. The current piped relation has the same schema as the prior two, but now more rows have been removed.
- Line 6 uses a BASIC-COVER to transform a dimension. The current piped relation has a schema with the same names but a transformed column. The cardinality (number of rows) stays the same.
- Line 7 runs a
group byon the previous relation. The current piped relation has a new schema based on the grouping and reduced columns and has a cardinality equal to the unique combinations ofDepartmentCity.
The Problem of or/disjunction
The left-associativity reading of a suite of relational operators has an Achilles heel, the introduction of disjunction or the or operator. In Sql, this is not that big of a problem as the natural language usage of or and judicious usage of parentheses make a clear reading of a sigma selection after a where easy enough to understand.
But because delightql borrows from the syntax of Prolog, and disjunction (via the SEMI-OR) has interesting semantics (and because the L-AND , serves double duty as a join and a where) the following piped notation breaks a left-associative reading. Early attempts at delightql’s sigilization of union and union all tried to make use of the SEMI-OR sigilization ; instead of {+} but the details on unifying union-compatible semantics with more general disjunction were hard to overcome. See the essay “Disjunction and Union and Messy Or”.
employee(*)
, trim:(lower:(Department)) = "executive"
; Salary > 120000
, Title != "Engineer"
|> %( DepartmentCity ~> count:(*) as employee_count, avg:(Salary) )
Line 4, introducing an or Salary > 120000 has an ambiguous reading in regards to the context of the sigma clauses that surround it – should it pre-bind with the sigma-clause to its left as an adverb? and as an exception to left-associative reading where sigilization implies a relational operator?. To solve this, there are two solutions, both of which are possible in delightql (but one of which is recommended)
- the programmer may introduce parentheses and use the classic Prolog symbol SEMI-OR
;, and thus override the associative reading with an explicit order of evaluation. - recommended: the programmer may use the keywords
andandoras a way of building de-sigilized sigma clauses. Delightql has very few keywords. They are only introduced to solve semantic difficulties and where their usage is a massive ergonomic advantage to other mechanisms.
What follows are both of these solutions wherein an intended meaning is assumed.
employee(*)
, (trim:(lower:(Department)) = "executive" ; Salary > 120000 )
, Title != "Engineer"
|> %( DepartmentCity ~> count:(*) as employee_count, avg:(Salary) )
employee(*)
, trim:(lower:(Department)) = "executive" or Salary > 120000
, Title != "Engineer"
|> %( DepartmentCity ~> count:(*) as employee_count, avg:(Salary) )
The de-sigilized sigma clause may be more pleasant to write, but their introduction to the language is critical for solving ambiguities in other locations where domain expressions may be used and where syntactic ambiguity is unavoidable.
Consider the following
employee(*)
|> +( DepartmentCity="San Francisco",
Title!="Engineer",
iif:(DepartmentCity="San Francisco",
Title!="Engineer", 100,20))
Without the introduction of the keywords and and or delightql has no idea if the intended meaning of the above should be one or two columns and whether the iif() sql construct has 3 or 4 parameters. The iif() function in Sql takes a predicate expression (truth expression) as its first argument, as a simplification over the standard case search expression. It was introduced in T-SQL but has been implemented in SQLite.
-- ONE COLUMN, 4-arity??
select
*,
DepartmentCity = 'San Francisco' and Title != 'Engineer',
iff(DepartmentCity = 'San Francisco' , Title != 'Engineer', 100,20)
from employee;
-- TWO COLUMNS??? 3-arity???
select
*,
DepartmentCity = 'San Francisco',
Title != 'Engineer',
iff(DepartmentCity = 'San Francisco' and Title != 'Engineer', 100,20)
from employee;
Thus the introduction of and and or makes an unambiguous reading possible, and the following delightql
employee(*)
|> +( DepartmentCity="San Francisco" and Title!="Engineer",
DepartmentCity="San Francisco", Title!="Engineer",
iif(DepartmentCity="San Francisco" and Title!="Engineer",
100, 20) )
has the Sql transpilation of:
select
*,
DepartmentCity = 'San Francisco' and Title != 'Engineer',
DepartmentCity = 'San Francisco',
Title != 'Engineer',
iff(DepartmentCity = 'San Francisco' and Title != 'Engineer', 100,20)
from employee;
Variable Scoping
There are many places in delightql where variables introduced by an explicit order of operations make the scope of these variables natural and obvious.
There are others, where we will need to make specific explicit explanations.
Operator Precedence
Delightql favors an approach used by the Pony language for operator precedence, namely the removal of memorized rules of operator precedence by an explicit usage of parentheses when domain expressions possess potential ambiguity. Rules like PEMDAS are useful for simple expressions, but their complexity becomes a source of bugs as soon as other operators are accepted. See: . If the compiler needs a memorized table of precedence priorities, so does the human programmer. Being explicit is a
Expressions like the below are explicitly forbidden in delightql, as the expression possesses two operators whose presence creates an ambiguity.
1 + 2 * 4
1 + (2 * 4)
// or
(1 + 2) * 4
Keywords, Identifiers and Ground Terms
Keywords
Delightql has very few keywords. The sum totality of them are
astruefalseandorinnullrowsgroupsrangeofunderwithinqua
A programmer may not use any of these keywords as identifiers for tables, columns, namespaces or functions. The upside of this choice is that reading delightql code makes it very clear what is data and/or metadata. The cost of this choice is that there is a large sigil vocabulary to be aware of.
Reserved Namespaces
original: used for the schema that is being updated in an update DML statement.insert: used for convenience higher-order predicatesdql: for system tablesdqlutil:dqlc: for compiler tablestargetalso known assql.shell
Target
employee(*) , + target.between(50_000,Salary,100_000)
select
*
from employee
where Salary between 50000 and 100000;
Identifiers
Identifiers name tables, columns, functions, namespaces, and other entities.
Basic Identifiers
Basic identifiers follow the pattern: start with a letter or underscore, followed by any number of letters, digits, or underscores.
identifier ::= [a-zA-Z_][a-zA-Z0-9_]*
Examples:
employee
FirstName
_private
user2
Qualified Identifiers
Columns can be qualified with their source table using dot notation:
employee.FirstName
e.Salary
Namespaced Identifiers
Entities from different namespaces use the :: separator:
db::employee(*) // Table from 'db' namespace
std::upper:(name) // Function from 'std' namespace
lib::utils::helper:(x) // Multi-level namespace
Quoted Identifiers
For JSON keys containing special characters, use double quotes within path access:
employee:{."@types/react"} // JSON key with special characters
data:{."server.production"} // JSON key containing a dot
Numeric Ground Terms
Delightql supports several numeric literal formats.
Integer Literals
Plain integers, optionally negative:
42
-17
0
1000000
Decimal Literals
Numbers with a decimal point:
3.14159
-0.5
100.0
Hexadecimal Literals
Prefixed with 0x or 0X:
0xFF
0x0A
0X1a2b
Octal Literals
Prefixed with 0o or 0O:
0o755
0o12
0O644
String Ground Terms
Quoted Strings
Strings use double quotes:
"Hello, World"
"Say hello"
String Templates
String templates allow expression interpolation using :" syntax:
:"Hello, {name}!"
:"Total: {price * quantity}"
:"Employee: {FirstName} {LastName}"
Templates compile to string concatenation in the target SQL.
Pattern Literals
Regular expression patterns for column matching use slashes:
/_name/ // Columns ending in _name
/^id$/ // Column exactly named 'id'
/user.*/ // Columns starting with 'user'
Used in column selection:
employee(*) |> ({/_name/}) // Select all columns matching pattern
Other Ground Terms
Boolean Literals
true
false
Null Literal
null
Path Literals
Path literals access nested data structures (JSON, arrays). They start with .:
.name // Object key
.0 // Array index
.scripts.dev // Nested path
Used with the path access operator:
employee:{.address.city} // JSON path access
data:{.0.name} // Array then object access
Placeholder
The underscore _ is the wildcard placeholder:
employee(_, LastName) // Ignore first column
department_kind(_ -> "other") // Default case in function facts