Differences

This shows you the differences between two versions of the page.

teaching:projh402 [2018/10/02 08:13]
svsummer
teaching:projh402 [2022/09/06 10:37]
ezimanyi
Line 1: Line 1:
-====== MA Computer Science Projects (PROJ-H-402) ====== +====== PROJ-H-402 : Computing Projects ======
  
- +This is the list of Computing Projects topics proposed for the current academic year by the CoDE department, École polytechnique de Bruxelles, ULB.
-===== Course objective ===== +
-The course PROJ-H-402 is managed by Dr. Mauro Birattari. Please refer to the course description page http://iridia.ulb.ac.be/proj-h-402/index.php/Main_Page for the rules concerning the project. What follows is a list of project proposals supervised by academic members of CoDE. +
- +
-===== Project proposals ===== +
- +
-=== Engineering of a Rule-Based Information Extraction Engine === +
- +
-Information extraction, the activity of extracting structured +
-information from unstructured text, is a core data preparation +
-step. Systems for information extraction fall into two main +
-categories. The first category contains machine-learning based +
-systems, where a significant amount of training is required to train +
-good models for specific extraction tasks. The second category +
-consists of rule-based systems in which the data to be extracted from +
-the text is specified by (human-written) rules in some (often +
-declarative) extraction language. Despite advances in machine +
-learning, rule-based systems are widely used in practice. +
- +
-In recent years, novel theoretical algorithms have been proposed to +
-execute rule-based information extraction workloads more efficiently. +
-The objective of this project is to implement one such algorithm, +
-by Florenzano et al. (2018), experimentally analyze its performance, +
-and propose extensions of the algorithm to overcome performance +
-bottlenecks. +
- +
- +
-References:  +
- +
-- Fernando Florenzano, Cristian Riveros, Martín Ugarte, +
-Stijn Vansummeren, Domagoj Vrgoč: Constant Delay Algorithms for +
-Regular Document Spanners. PODS 2018: 165-177 +
- +
- +
-**Interested?​** Contact Stijn Vansummeren (stijn.vansummeren@ulb.ac.be) +
- +
-**Status**: available +
- +
- +
-=== Query processing for mixed database-machine learning based workloads === +
- +
-Because of the growing importance and wide deployment of large-scale +
-Machine Learning (ML), there is wide interest in the design and +
-implementation of processing engines that can efficiently evaluate ML +
-workloads. One class of systems, embodied by systems such as TensorFlow +
-and SystemML, takes linear algebra as the key primitive for expressing +
-ML workflows, and obtains efficient processing engines by porting known +
-database-style optimization techniques to the linear algebra +
-setting. Another class of systems, embodied by FAQ queries, takes +
-relational algebra as the key primitive, but modifies it to allow +
-expression of certain ML workloads. To some extent, the classical +
-optimization techniques as well as recent results for exploiting +
-modern hardware transfer to this extended relational algebra. As an +
-added bonus, traditional database workloads (OLTP/OLAP style) can be +
-trivially supported. +
- +
-The focus of this project is on the latter style of systems. The +
-overall goal is to experimentally identify classes of FAQ queries for +
-which it would be beneficial to exploit techniques developed in the +
-former class of systems. Concretely, this can be approached by +
-experimentally studying queries in the FAQ framework (featuring joins) +
-for which known results in evaluating linear algebra operations +
-(specifically, matrix multiplication algorithms that run in less than +
-O(n^3) time) can be exploited. +
- +
-**Contact** : Stijn Vansummeren (stijn.vansummeren@ulb.ac.be) +
- +
-**Status**: available +
  
  
 +  * [[teaching:projh402:wis|Data Science]]
 +  * [[teaching:projh402:ia|Artificial Intelligence]]
 +  * [[teaching:projh402:or|Operational Research and Decision Aid]]
 
teaching/projh402.txt · Last modified: 2022/09/06 10:39 by ezimanyi