Motivation

My database model is based on functions, I think. The basic idea came to me as I was trying to model a database as a repository for the results of computations.

For example - adding two numbers. Were I a normal person, I would write it like so

fn sum(a: i32, b: i32) -> i32
{
    a + b
}

But that’s not weird enough. Instead, I’ll store the mapping from the arguments to the result, so I won’t have to compute it a second time.

fn sum(a: i32, b: i32) -> i32
{
    match lookup_pair(a, b) {
        Some(result) => result,
        None => {
            let result = a + b;
            store_pair(a, b, result);
            result
        }
}

Think of all the additions I can save on, just for the low, low cost of some I/O operations.

Of course, that’s just the motivation. Memoizing to disk the results of computations makes more sense when I cannot compute the result procedurally but I still want to be able to recall it later. You know, like a database.

Take, for example, sensor data. You can model sensor data as a function sensor_value, with signature sensor_value(sensor_id, time) -> value. Obviously I have no way to calculate this, but once a (sensor, time) pair is mapped to some value, I should always get the same value for that pair, as if it is a deterministic function.

Or.

It could be modeled as a function sensor_id(time) -> value, because there is no reason to have a single function for all our sensors. Or maybe there is, what do I know, I’m not the sensor data modeling police.

Or

It could be modeled as a function sensor_id(time, value) -> bool, which returns true iff sensor with sensor id sensor_id recorded a value of value at time time. This may seem contrived, but it’s more useful than what you may think. It also feels a lot like rediscovering currying.

It’s tuples all the way down

So far, I have found that the most convenient core data type for this model is the tuple. Let’s assume I have a standard menagerie of primitive types - I can make do with Strings, Integers and Booleans for now.

Tuples have a type that is defined by the number of elements they contain, their order and the type they have, as usual. The tuple (1, "string", true) has type (Integer, String, Boolean).

Tuples can nest. The tuple ("answer", 42, (true, false)) has the type (String, Integer, (Boolean, Boolean)). Tuples with just one element are not defined - instead, they reduce to the contained scalar value. (123) has type Integer and value 123. This rule obviously recurses, which means 123, (123), ((123)) are all the same Integer with value 123.

I also define the empty tuple, () as its own type and value.

Functions

Functions are basically named maps from tuples to tuples. I can write

define f(1) = 2

to store the mapping 1 -> 2 in function f and I can write

f(1)

to retrieve the value 2.

Asking for a non existent mapping returns an empty tuple. This also gives us a convenient way to erase values, by mapping them to ()

x = f(1)   // x is now ()

f(1) = 3

x = f(1)   // x is now 3

f(1) = ()  // erase f(1)

x = f(1)   // x is now ()

I think the schema must be relatively rigid here, by which I mean that the type of the co-domain of functions must be consistent for all mappings. Consider the following:

define f(1) = 2
define f(2) = true

Asking for a value of f now may return an Integer or a Boolean and I don’t know which one it will be until the program executes and the values of f are retrieved. This would require dynamic typing but dynamic typing is hard to plan queries for and optimize and I definitely don’t know how to do it.

A related question could be - is the type of the domain value relevant? Should the following be allowed:

define f(1) = 2
define f(1,2) = 3

I think that is ok, as long as f(Integer) and f(Integer, Integer) are treated as completely different functions that just happen to have the same name.

Before we can do anything interesting with our functions, we need to introduce…

Variables

Variables are strongly typed and their type is determined at initialization. When I type

a = 1

a is now an Integer with value 1. I can write

a = 2

and a is now 2, but if I write

a = false   // doesn't compile

I will get a compilation error.

Typical stuff.

Pattern matching

Functions can be defined with variables in their formal arguments and have bodies, delimited by { and } which allow computations and lookups of other functions. For example, can you guess what this does?

define sum(a, b) = { a + b }

x = sum(1, 2)   // x is now 3

These can be combined with value definitions, like so

define sum(a, b) = { a, b }
define sum(1, 1) = 11
define sum(1, 2) = 12

Lookups work with pattern matching on values first and if a value match cannot be found, the definition is executed and the value returned. Given the above,

x = sum(1, 2)
y = sum(1, 3)

makes x equal to 12 and y equal to 4.

I can even have partial matches:

define argument_pos(1, a, b) = { a }
define argument_pos(2, a, b) = { b }

which defines a function that returns its numbered argument.

arg_one  = argument_pos(1, "first", "second")   // arg_one is now "first"
arg_two  = argument_pos(2, "first", "second")   // arg_two is now "second"
arg_none = argument_pos(3, "doesn't", "matter") // arg_none is ()

I can use a similar pattern to define an if function

define if(true, a, b) = { a }
define if(false, a, b) = { b }

// Some constants
a = 1
b = 1
c = 2
d = "string"

e = if( a == b, c, d) // e is 2
f = if( a == c, c, d) // f is "string"

Obviously if here returns different types, which is not consistent with strong typing. I will have to revisit this.

Take a breath, queries are next

I’ll stop here, let this sink in a bit. I just scratched the surface of how queries work (I just showed pattern matching), and before I get to that, I want to let these fundamentals mature a bit.

I have a VM working for this language, but I don’t have persistence yet, everything is in memory. There is also no schema definitions. In general, this is really really early in the evolution.

Come, tell me that this makes no sense, why would anyone build this. Or that if it’s about tuples, where are they? How would you even query this?

I know, right?

But I have solved the most pressing issue of all. I found a name for it.

Glowdust.