This shows you the differences between two versions of the page.
teaching:mfe:is [2015/04/13 14:44] svsummer [An implementation of the SCULPT schema language for tabular data on the Web] |
teaching:mfe:is [2015/04/13 14:45] svsummer [A Scala-based runtime and compiler for Distributed Datalog] |
||
---|---|---|---|
Line 144: | Line 144: | ||
Datalog is a fundamental query language in data management based on logic programming. It essentially extends select-from-where SQL queries with recursion. There is a recent trend in data management research to use datalog to specify distributed applications, most notably on the web, as well as to perform inference on the semantic web. The goal of this thesis is to engineer a basic **distributed datalog system**, i.e., a system that is capable of compiling and running distributed datalog queries. The system should be implemented in the Scala programming language. Learning Scala is part of the master thesis project. | Datalog is a fundamental query language in data management based on logic programming. It essentially extends select-from-where SQL queries with recursion. There is a recent trend in data management research to use datalog to specify distributed applications, most notably on the web, as well as to perform inference on the semantic web. The goal of this thesis is to engineer a basic **distributed datalog system**, i.e., a system that is capable of compiling and running distributed datalog queries. The system should be implemented in the Scala programming language. Learning Scala is part of the master thesis project. | ||
- | The system should: | + | The system should incorporate recently proposed worst-case optimal join algorithms (e.g., the [[http://arxiv.org/abs/1210.0481|leapfrog trie join]]) and employ known local datalog optimizations (such as magic sets and QSQ). |
- | * incorporate recently proposed worst-case optimal join algorithms (e.g., the [[http://arxiv.org/abs/1210.0481|leapfrog trie join]]) | + | |
- | * employ known local datalog optimizations (such as magic sets and QSQ) | + | |
**Validation of the approach** The thesis should propose a benchmark collection of datalog queries and associated data workloads that can be used to test the obtained system and to measure key performance characteristics (elasticity of the system, memory footprint, overall running time, ...). | **Validation of the approach** The thesis should propose a benchmark collection of datalog queries and associated data workloads that can be used to test the obtained system and to measure key performance characteristics (elasticity of the system, memory footprint, overall running time, ...). | ||
- | **Deliverables**: | ||
+ | **Deliverables**: | ||
* Semantics of datalog; overview of known optimization strategies (document) | * Semantics of datalog; overview of known optimization strategies (document) | ||
* Description of the leapfrog trie join (document) | * Description of the leapfrog trie join (document) | ||
Line 157: | Line 155: | ||
* Experimental analysis of the developed system on a number of use cases (document) | * Experimental analysis of the developed system on a number of use cases (document) | ||
+ | \\ | ||
**Interested?** Contact: [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]] | **Interested?** Contact: [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]] | ||