Differences

This shows you the differences between two versions of the page.

--- teaching:mfe:is [2015/04/13 14:27]
svsummer [Compiling SPARQL queries into machine code]
+++ teaching:mfe:is [2015/04/13 14:31]
svsummer
@@ Line 31: / Line 31: @@
 A number of recent research prototypes exist that compile SQL queries into machine code in this sense:
-  * HyPer – A Hybrid OLTP&OLAP High Performance DBMS http://hyper-db.de/
+  HyPer – A Hybrid OLTP&OLAP High Performance DBMS http://hyper-db.de/ and Legobase - https://github.com/epfldata/NewLegoBase and http://data.epfl.ch/legobase.
-  * Legobase - https://github.com/epfldata/NewLegoBase and http://data.epfl.ch/legobase
 The objective of this master thesis is to apply the same methodology to engineer a compiler that translates (fragments of) SPARQL (the standard query language for querying RDF data on the semantic web) into machine code. The overall methodology should follow the methodology used by HyPer and Legobase:
@@ Line 43: / Line 42: @@
 **Validation of the approach** The thesis should propose a benchmark collection of SPARQL queries that can be used to test the obtained SPARQL-to-machine-code compiler and compare its performance against a reference, interpreter-based SPARQL compiler.
-**Deliverables** of the master thesis project:
+**Deliverables** of the master thesis project:  * An overview of the state of the art in query-to-machine-code compilation.
-  * An overview of the state of the art in query-to-machine-code compilation.
   * A description of lightweight modular staging and how it can be used to construct machine-code compilers.
   * The SPARQL compiler (software artifact)
   * A benchmark set of SPARQL queries and associated data sets for the experimental validation
   * An experimental validation of the compiler, comparing efficiency of compiled queries against a reference compiler based on query plan interpretation.
-  *
-**Interested?** Contact : //Stijn Vansummeren//
+**Interested?**
+  * Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
 **Status**: available
@@ Line 65: / Line 63: @@
 The objective of this master thesis is to implement a recent proposal for such a schema language named SCULPT (http://arxiv.org/abs/1411.2351). Concretely, this entails:
-  * propose an elegant concrete syntax for SCULPT schemas
+  * proposing an elegant concrete syntax for SCULPT schemas
   * implement both the in-memory and streaming validation algorithms of SCULPT proposed in http://arxiv.org/abs/1411.2351
   * extend the SCULPT proposal, by investigating how SCULPT can be combined with complementary features recently proposed by the W3C CSV on the Web Working group (http://www.w3.org/2013/csvw/wiki/Main_Page)
@@ Line 78: / Line 76: @@
   * extension of SCULPT with features for converting into RDF (document + software)
-**Interested?**
-  * Contact: //Stijn Vansummeren//
+**Interested?** Contact: [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
 **Status**: available
@@ Line 119: / Line 117: @@
   * [[http://www.diku.dk/kmc/documents/AiPL-CrashCourse.pdf|A Crash-Course in Regular Expression Parsing and Regular Expressions as Types.]]
-**Interested?**
+**Interested?** Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
-  * Contact : //Stijn Vansummeren//
 **Status**: available
@@ Line 139: / Line 136: @@
   * Experimental analysis of distributed algorithm on a number of datasets. (document)
-**Required reading**:
+**Interested?** Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
-  * Computing simulations on finite and infinite graphs. Henzinger, Henzinger and Kopke
-  * Ranzato and Tapparo - An efficient simulation algorithm based on abstract interpretation
-  * Ranzato - An efficient simulation algorithm on Kripke structures.
-  * Blom, Orzan - Distributed State Space Minimization
-  * Blom, Orzan - A distributed algorithm for strong bisimulation reduction of state spaces
-  * Ma et al - distributed graph pattern matching
-  * (+ signature-based simulation part of Yongming Luo's thesis).
-**Interested?**
-  * Contact : //Stijn Vansummeren//
 **Status**: available
@@ Line 162: / Line 149: @@
 **Validation of the approach** The thesis should propose a benchmark collection of Datalog queries and associated data workloads that can be used to test the obtained system, and measure key performance characteristics (elasticity of the system; memory footprint; overall running time, ...)
-**Required reading**:
-  * Datalog and Recursive Query Processing - Foundations and trends in query processing.
-  * LogicBlox, Platform and Language: A Tutorial (Todd J. Green, Molham Aref, and Grigoris Karvounarakis)
-  * Dedalus: Datalog in Time and Space (Peter Alvaro, William R. Marczak, Neil Conway, Joseph M. Hellerstein, David Maier, and Russell Sears)
-  * Declarative Networking (Loo et al). For the distributed evaluation strategy.
-  * Parallel processing of recursive queries in distributed architectures (VLDB 1989)
-  * Evaluating recursive queries in distributed databases (IEEE Transactions on Knowledge and Data Engineering, 1993)
 **Deliverables**:
@@ Line 175: / Line 154: @@
   * Description of the leapfrog trie join
-**Interested?**
-  * Contact : //Stijn Vansummeren//
+**Interested?** Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
 **Status**: available
@@ Line 201: / Line 180: @@
   * Interact with the administration of the Ecole Polytechnique to fine-tune the above requirements; test the implementation; and integrate remarks after testing
-**Interested?**
+**Interested?** Contact : Stijn Vansummeren (stijn.vansummeren@ulb.ac.be), Frédéric Robert <frrobert@ulb.ac.be>
-  * Contact : Stijn Vansummeren (stijn.vansummeren@ulb.ac.be), Frédéric Robert <frrobert@ulb.ac.be>