--- teaching:mfe:is [2017/10/25 11:46]
msakr [Assessing Existing Communication Protocols In The Context Of DaaS]
+++ teaching:mfe:is [2018/04/23 09:54]
svsummer [Master Thesis in Collaboration with Euranova]
@@ Line 24: / Line 24: @@
   * Contact : [[ezimanyi@ulb.ac.be|Esteban Zimanyi]]
+** Dynamic Query Processing on GPU Accelerators
+   This master thesis is put forward in the context of the DFAQ
+   Research Project: "Dynamic Processing of Frequently Asked
+   Queries", funded by the Wiener-Anspach foundation.
+   Within this project, our lab is developing novel ways for
+   processing "fast Big Data", i.e., processing of analytical queries
+   where the underlying data is constantly being updated.  The
+   analytics problems envisioned cover wide areas of computer science
+   and include database aggregate queries, probabilistic inference,
+   matrix chain computation, and building statistical models.
+   The objective of this master thesis is to build upon the novel
+   dynamic processing algorithms being developed in the lab, and
+   complement these algorithms by proposing dynamic evaluation
+   algorithms that execute on modern GPU architectures, thereby
+   exploiting their massive parallel processing capabilities.
+   Since our current development is done in the Scala programming
+   language, prospective students should either know Scala, or be
+   willing to learn it within the context of the master thesis.
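As a rough illustration of what "dynamic processing" of an aggregate query means, the sketch below maintains a per-group SUM, COUNT and AVG under single-tuple inserts and deletes instead of recomputing the aggregate from scratch after every update. It is a minimal CPU-only sketch in Python (chosen for brevity, not the lab's Scala code base), and the class name `IncrementalGroupAvg` and its methods are hypothetical; it does not represent the algorithms developed in the DFAQ project.

```python
from collections import defaultdict


class IncrementalGroupAvg:
    """Hypothetical materialized view: SUM, COUNT and AVG per group."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def insert(self, group, value):
        # An insert touches only the running totals of one group: O(1) work.
        self._sum[group] += value
        self._count[group] += 1

    def delete(self, group, value):
        # A delete is the inverse update; it is only valid for a tuple
        # that was inserted earlier.
        if self._count.get(group, 0) == 0:
            raise KeyError(f"no tuples left in group {group!r}")
        self._sum[group] -= value
        self._count[group] -= 1
        if self._count[group] == 0:
            # Drop empty groups so the view mirrors the base data.
            del self._sum[group]
            del self._count[group]

    def avg(self, group):
        # The query result is read from the maintained totals,
        # without rescanning the underlying data.
        count = self._count.get(group, 0)
        return self._sum[group] / count if count else None


view = IncrementalGroupAvg()
view.insert("a", 10.0)
view.insert("a", 20.0)
view.delete("a", 10.0)
print(view.avg("a"))
```

Each update costs constant time regardless of how much data has been seen, which is the property a GPU-based evaluation strategy would need to preserve or improve on when updates arrive in large batches.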
+   *Validation of the approach* Validation of the master thesis work
+   should be done on two levels:
+  - a theoretical level; by proposing and discussing alternative ways
+    to do incremental computation on GPU architectures, and comparing
+    these from a theoretical complexity viewpoint
+  - an experimental level; by proposing a benchmark collection of CEP
+    queries that can be used to test the obtained versions of the
+    interpreter/compiler, and reporting on the experimentally observed
+    performance on this benchmark.
+  *Deliverables* of the master thesis project
+   - An overview of query processing on GPUs
+   - A definition of the analytics queries under consideration
+   - A description of different possible dynamic evaluation algorithms
+     for the analytical queries on GPU architectures.
+   - A theoretical comparison of these possibilities
+   - The implementation of the evaluation algorithm(s) (as an interpreter/compiler)
+   - A benchmark set of queries and associated data sets for
+     the experimental validation
+   - An experimental validation of the compiler, and analysis of the results.
+   *Interested?*
+   - Contact :  [[svsummer@ulb.ac.be][Stijn Vansummeren]]
+   *Status*: available
+** Complex Event Processing in Apache Spark and Apache Storm
+  The master thesis is put forward in the context of the SPICES
+  "Scalable Processing and mIning of Complex Events for
+  Security-analytics" research project, funded by Innoviris.
+  Within this project, our lab is developing a declarative language
+  for Complex Event Processing (CEP for short). The goal in Complex
+  Event Processing is to detect pre-defined patterns in a stream of
+  raw events. Raw events are typically sensor readings (such as
+  "password incorrect for user X trying to log in on machine Y" or
+  "file transfer from machine X to machine Y"). The goal of CEP is
+  then to correlate these events into complex events. For example,
+  repeated failed login attempts by X to Y should trigger a complex
+  event "password cracking warning" that refers to all failed login
+  attempts.
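The failed-login example above can be sketched as a small single-machine detector. This is only an illustration in plain Python, not the declarative CEP language under development and not Spark or Storm code. The `Event` fields, the class `FailedLoginDetector`, the threshold of 3 attempts and the 60-second window are all assumptions made for the example, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event; the field names are illustrative only."""
    timestamp: float
    kind: str
    user: str
    machine: str


class FailedLoginDetector:
    """Emits a complex event after repeated failed logins by X on Y."""

    def __init__(self, threshold=3, window=60.0):
        self.threshold = threshold  # assumed number of failures that triggers a warning
        self.window = window        # assumed sliding-window length in seconds
        self._recent = defaultdict(deque)

    def process(self, event):
        # Only failed logins are relevant to this pattern.
        if event.kind != "login_failed":
            return None
        recent = self._recent[(event.user, event.machine)]
        recent.append(event)
        # Evict failures that fell out of the sliding window
        # (relies on events arriving in timestamp order).
        while recent and event.timestamp - recent[0].timestamp > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            # The complex event refers to all failed attempts that caused it.
            return ("password_cracking_warning", list(recent))
        return None


detector = FailedLoginDetector()
for t in (0.0, 20.0, 30.0):
    result = detector.process(Event(t, "login_failed", "X", "Y"))
print(result[0], len(result[1]))
```

A distributed backend would additionally have to partition this per-key state across workers, for instance by the (user, machine) pair.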
+  The objective of this master thesis is to build an
+  interpreter/compiler for this declarative CEP language that targets
+  the distributed computing frameworks Apache Spark and/or Apache
+  Storm as backends.  Getting acquainted with these technologies is
+  part of the master thesis objective.
+  *Validation of the approach* Validation of the proposed
+  interpreter/compiler should be done on two levels:
+  - a theoretical level; by comparing the generated Spark/Storm
+    processors to a processor based on "Incremental computation" that
+    is being developed at the lab
+  - an experimental level; by proposing a benchmark
+    collection of CEP queries that can be used to test the obtained
+    interpreter/compiler, and reporting on the experimentally observed
+    performance on this benchmark.
+  *Deliverables* of the master thesis project
+   - An overview of the processing models of Spark and Storm
+   - A definition of the declarative CEP language under consideration
+   - A description of the interpretation/compilation algorithm
+   - A theoretical comparison of this algorithm with respect to an incremental
+     evaluation algorithm.
+   - The interpreter/compiler itself (software artifact)
+   - A benchmark set of CEP queries and associated data sets for
+     the experimental validation
+   - An experimental validation of the compiler, and analysis of the results.
+   *Interested?*
+   - Contact :  [[svsummer@ulb.ac.be][Stijn Vansummeren]]
+   *Status*: available
+** Graph Indexing for Fast Subgraph Isomorphism Testing
+   There is an increasing amount of scientific data, mostly from the
+   bio-medical sciences, that can be represented as collections of
+   graphs (chemical molecules, gene interaction networks, ...). A
+   crucial operation when searching in this data is that of subgraph
+   isomorphism testing: given a pattern P that one is interested in
+   (also a graph) and a collection D of graphs (e.g., chemical
+   molecules), find all graphs in D that have P as a
+   subgraph. Unfortunately, the subgraph isomorphism problem is
+   computationally intractable. In ongoing research, to enable
+   tractable processing of this problem, we aim to reduce the number
+   of candidate graphs in D on which a subgraph isomorphism test needs
+   to be executed. Specifically, we index the graphs in the collection
+   D by means of decomposing them into graphs for which subgraph
+   isomorphism *is* tractable. An associated algorithm that filters
+   graphs that certainly cannot match P can then be formulated based on
+   ideas from information retrieval.
+   In this master thesis project, the student will empirically
+   validate on real-world datasets the extent to which graphs can be
+   decomposed into graphs for which subgraph isomorphism is
+   tractable, and run experiments to validate the effectiveness of
+   the proposed method in terms of filtering power.
+   *Interested?*
+   - Contact :  [[svsummer@ulb.ac.be][Stijn Vansummeren]]
+   *Status*: available
 ===== Complex Event Processing in Apache Spark and Apache Storm =====