This page lists various comments on solutions provided by the students to the Optimization of Logical Query Plans exercises. The goal is to provide hints at common mistakes, or on how to improve the solutions.
Excerpts of the originally submitted and incorrect solutions are marked as follows:
Original solution
etc.
Exercise 4.1
The relations have the following schema:
S(C), T(D, E), U(F, G)
$Q_1(x,y) \leftarrow S(x), T(x,3), U(x,y)$
SPJ: $\pi_{\text{C, G}} \sigma_{\text{S.C} = \text{T.D} \wedge \text{S.C} = \text{U.F} \wedge \text{T.E} = 3}
( S \times T \times U )$
In the relational algebra, without any renaming, the fields do not have a relation name as a prefix. For instance, given the relations $A(X,Y)$ and $B(Z)$, the following are valid expressions:
But not:
In the submitted solution, the SPJ should not mention relation names in $\sigma$.
A matching of a query into a database is a mapping of each variable in the query to a value in the database, in such a way that each atom of the query corresponds to an element of the database. For instance, consider the following queries:
and the database D that contains the following relations:
We can see that m1 = { x: 1, y: 2 } is a matching of Q1 into D. In particular A(x, y) for x = 1 and y = 2 occurs in the database D. Similarly, m2 = { w: 1, x: 2, y: 3, z: 4 } is a matching of Q2 into D, and in particular A(1, 2) (corresponding to A(w,x) ) occurs in D, as well as B(2, 3), and B(3, 4).
In general many different matchings exist for a same query into a same database. For instance, m3 = { w: 1, x: 2, y: 6, z: 7 } is also a matching of Q2 into D.
For a given matching, an individual result of the query on the database is obtained by replacing the variables in the head of the query by the corresponding constants. For instance, m1 leads to the tuple (x) for x = 1 as a result of Q1 on D. Similarly, m2 leads to the tuple (w, z) for w = 1, and z = 4 as a result of Q2 on D.
The result of the query on the database is the set of all individual results of the query. In particular, for Q2 on D, we have as the result the set { (1, 4), (1, 7) }.
Building matchings by hands can be tedious, as you may need to try and associate each atom of the query with each element of the database. The following is a list of tricks that can help speeding up the process. The essence of these is to construct a mapping by considering each atom of the query in turn, and checking the equalities they must imply.
These rules of thumb ensure that a deterministic choice is made whenever it can be. However, in many circumstances an atom can be bound to many different elements of the database; then all possible associations of atoms and elements must be considered, one at a time.
Also note that for some queries, no matching can be built. This can be detected, for instance, if building the matching would require a variable to take two different values or if no value is available at all for a given variable.
As an example, let us building all matchings of Q2 into the database D from the example above. Since the relation A only contains the pair (1, 2) any matching of Q2 into D must contain { w: 1, x : 2 }. We now consider B(x, y), with the constraint that x = 2. There are two elements of the database satisfying this constraint: (2, 3) and (2, 6). We must consider both of them in turn.
consider B(y, z) with y = 3. There is only one element of the database satisfying this condition, which is (3, 4). The corresponding matching is { w: 1, x: 2, y: 3, z: 4 } and the individual result is (1, 4).
consider B(y, z) with y = 6. There is only one element of the database satisfying this condition, which is (6, 7). The corresponding matching is { w: 1, x: 2, y: 6, z: 7 } and the individual result is (1, 7).
Exercise 5.
$Q_2(x,y) \leftarrow Q(x,a), Q(a,b), Q(b,c), Q(c,y)$
$Q_3(x,y) \leftarrow Q(x,a), Q(a,1), Q(1,b), Q(b,y)$
$Q_2 \subseteq Q_3$ ?
We build the following matching from $Q_3$ into the canonical database $D(Q_2)$:
$\{ x \rightarrow x_c,
a \rightarrow a_c,
1 \rightarrow b_c,
b \rightarrow c_c,
y \rightarrow y_c \}$
and hence we have $Q_2 \subseteq Q_3$
…
Exercise 6.
$Q_1(x,z) \leftarrow R(x,y), R(y,w), R(y,z)$
$Q_1'(x,z) \leftarrow R(x,y), R(y,z)$
$Q_1' \subseteq Q_1$ ?
We build the following matching from $Q_1$ into the canonical database $D(Q_1')$:
$\{ x \rightarrow x_c, y \rightarrow y_c, w \rightarrow z_c \}$
and hence we have $Q_1' \subseteq Q_1$
Exercise 6.
$Q_1(x,z) \leftarrow R(x,y), R(y,w), R(y,z)$
$Q_1'(x,z) \leftarrow R(x,y), R(y,z)$
$Q_1' \subseteq Q_1$ ?
We build the following matching from $Q_1$ into the canonical database $D(Q_1')$:
$\{ x \rightarrow x_c, y \rightarrow y_c, w \rightarrow z_c, z \rightarrow z_c \}$
To test the inclusion of a conjunctive query Q into Q' you need to:
1. Construct the canonical database D of Q.
2. Try to find a matching of Q' into D such that the corresponding individual result is the header of Q where variables are seen as constants.
When writing down exercises solutions, it is recommended that you indicate the reason for not being able to build a matching. For instance you could indicate that a matching would need to associate a variable with two different values.
In the provided solution, the answer should also mention that $(x_c, z_c)$ is in the result of $Q_1$ on $D(Q_1')$.
Exercise 6.2
$Q_2(x,y) \leftarrow R(x,z), R(y,w), R(a, w), R(x, y)$
Let $Q_2'(x,y) \leftarrow R(y,w), R(a, w), R(x, y)$, and let us check whether
$Q_2' \subseteq Q_2$.
Then $\{ x \rightarrow x_c, y \rightarrow y_c, z \rightarrow y_c, w \rightarrow w_c, a \rightarrow a_c \}$ is a matching of $Q_2$ into $D(Q_2')$.
Hence $Q_2' \equiv Q_2$.
Let $Q_2^{\prime\prime}(x,y) \leftarrow R(a, w), R(x, y)$, and let us check
whether $Q_2^{\prime\prime} \subseteq Q_2$.
Then for any matching, the variable $y$ must refer at the same time
to the constants $y_c$ (from the head) and $a_c$ (from $R(y, w)$ and
$R(a,w)$). Therefore, $(x_c,y_c)$ is not part of the result of $Q_2$ on
$D(Q_2^{\prime\prime})$, and $Q_2^{\prime\prime} \not\equiv Q_2$.
When optimizing conjunctive queries, you can reuse the optimized expression when optimizing further: for instance, if you have optimized $Q$ into $Q'$ and you want to check whether $Q^{\prime\prime}$ is even more optimal, you can test $Q^{\prime\prime}$ against $Q'$. Indeed, while $Q'$ is smaller that $Q$, it is strictly equivalent. However, since it is smaller, constructing a matching is usually easier.