- Datalog - Propositional Calculus (0th order logic) - Facts (P), Basic operations: (Not, And, Or), Implication - Example Facts: AliceWentToTheStore, BobWentToTheStore, AliceWentToHome - Implication: - (if P and Q then R) - === R \or \not P \or \not Q - === “Horn Clause” - 1st order logic - Goal: Quantification - Challenge, need way to enumerate classes/sets of facts - Groups of facts: WentTo(Store, Alice), WentTo(Store, Bob), WentTo(Home, Alice) - For all facts in a group (\forall) a property holds - There exists a fact in a group (\exists) such that a property holds - New way to discuss implication: - Given one or more facts P(X,Y), Q(X), …., “infer” a new fact R(Y) - If WentTo(X, Y) and ShoppingAt(X) then ShoppingDone(Y) - \forall Y, \exists X: If P(X,Y) and Q(X) then R(Y) - \forall Y, \exists X: \not P(X,Y) \or \not Q(X) \or R(Y) - Which elements of R must be true? - SELECT Y FROM P NATURAL JOIN Q INTO R - Datalog - R(Y) -= P(X,Y), Q(X) - Head, Body - Find values of Y for which R is true? - Find a value of X for which P(X,Y) and Q(X) are true - What about R(Y, Z) -= P(X,Y), Q(X) - Alternative View: - R(Y) is a function - Dom => Bool - Given Y, find a value of X for which P(X,Y) and Q(X) evaluate to true. - Support - Support: The set of values of Y for which R(Y) evaluates to true - Finite Support: The support set has a fixed size - If P(X,Y), Q(X) have finite support, so does R(Y) - actually, we can do a bit better… to be discussed shortly - Natural consequence: - R(Y,Z) is true for any value of Z as long as R(Y) would be true. - R(Y,Z) has an infinite support! - Z is “unsafe” or “unbound" - Y is “bound” or “safe” - Safety and Support - Assume we have a S(Y,Z) with finite support. - R(Y,Z) dies not have finite support - What about ( S(Y,Z) and R(Y,Z) ) - Interestingly enough, this actually does have finite support: Because Z is safe in S, it does not need to be safe in R. - In general, a variable is safe in a conjunction of terms IFF it is safe in at least one of the terms. - What else can you do with this idea? - Functions F(X) -> ??? - What about other functions? - F(X,Y) -> true if (X < Y) - How about to natural numbers? - Simple way to express Bags! - R(X) -> N = number of instances of X in R (multiplicity) - Leads to some interesting math: - R(X) U S(X) === R(X) + S(X) - R(X) |><| S(X) === R(X) * S(X) - Q(X) -= R(X,Y) * S(X, Y) === Aggregation: Sum[Y] R(X,Y) * S(X,Y) - SELECT COUNT(*) FROM R NATURAL JOIN S - How about real numbers? - R(X) -> Multiplicity - F(X) -> {X} - Q() -= SUM[X] ( R(X) * {X} ) - SELECT SUM(X) FROM R - Q() -= SUM[X] ( R(X,Y) * S(Y,Z) * {Z < 10} * {X} ) - SELECT SUM(R.X) FROM R NATURAL JOIN S WHERE S.Z < 10 - OR: Aggregate { Start with R, Join with S, Filter on Z < 10, Multiply multiplicity by X } - Sequence of transformations, each modifying the output of the last - Pipelining technique sometimes referred to as a “Monad" - ^— all actually used in practice. DBToaster’s AGCA and LogicBlox LogiQL - Connections to Rings: - Semiring: - Set/Type, +: SxS->S, 0, *: SxS->, 1 - +/* associative, commutative - + distributive over * - 0+x = x - 0*x = 0 - 1*x = x - Ring = Semiring + Invertor: - -x + x = 0 - U = + // |><| = * - Monad Algebra - Problem: Nested data! We want to transform it in bulk - Same idea of pipelining operations: - Applied to - Overview - Types: Dom, Tuple(X,Y), List(X) - Basic Ops: Const, Emptyset, Singleton, Map, Union, Tuple, DeTuple, f(DOM) - Key Features: Fold - If/Then/Else - Key Features: Tensor Strength — PairWith (equiv: Cross Product)