Differences

This shows you the differences between two versions of the page.

teaching:mfe:is [2017/10/25 11:46]
msakr [Assessing Existing Communication Protocols In The Context Of DaaS]
teaching:mfe:is [2018/04/23 09:54]
svsummer [Master Thesis in Collaboration with Euranova]
Line 24: Line 24:
  
   * Contact : [[ezimanyi@ulb.ac.be|Esteban Zimanyi]]
 +
 +** Dynamic Query Processing on GPU Accelerators
 +   
 +   This master thesis is put forward in the context of the DFAQ
 +   Research Project: "Dynamic Processing of Frequently Asked
 +   Queries", funded by the Wiener-Anspach foundation.
 +
 +   Within this project, our lab is developing novel ways of
 +   processing "fast Big Data", i.e., processing analytical queries
 +   whose underlying data is constantly being updated. The
 +   analytics problems envisioned cover wide areas of computer science
 +   and include database aggregate queries, probabilistic inference,
 +   matrix chain computation, and building statistical models.
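
To make the idea of processing a query over constantly updated data concrete, here is a minimal sketch of incremental maintenance for one simple aggregate query (a grouped SUM). It is an illustration only, not the algorithms developed in the lab: the class name `IncrementalSumByKey`, the helper `recompute`, and the choice of aggregate are all assumptions, and the sketch is written in Python for brevity rather than in Scala.

```python
from collections import defaultdict


class IncrementalSumByKey:
    """Maintains the result of SELECT key, SUM(value) ... GROUP BY key
    under single-tuple inserts and deletes (hypothetical example class)."""

    def __init__(self):
        self._sums = {}    # key -> running sum of live tuples
        self._counts = {}  # key -> number of live tuples in the group

    def insert(self, key, value):
        # O(1) update instead of re-scanning the whole data set.
        self._sums[key] = self._sums.get(key, 0) + value
        self._counts[key] = self._counts.get(key, 0) + 1

    def delete(self, key, value):
        # Assumes the tuple (key, value) was inserted earlier.
        if self._counts.get(key, 0) == 0:
            raise KeyError(f"no live tuple with key {key!r}")
        self._sums[key] -= value
        self._counts[key] -= 1
        if self._counts[key] == 0:
            # The group becomes empty, so it disappears from the result.
            del self._sums[key]
            del self._counts[key]

    def result(self):
        return dict(self._sums)


def recompute(tuples):
    """Baseline: evaluate the same query from scratch over all tuples."""
    out = defaultdict(int)
    for key, value in tuples:
        out[key] += value
    return dict(out)
```

After any sequence of updates, `result()` should agree with `recompute` applied to the tuples that are still live; the difference is that each update costs constant time instead of a full scan.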
 +
 +   The objective of this master thesis is to build upon the novel
 +   dynamic processing algorithms being developed in the lab, and to
 +   complement these algorithms by proposing dynamic evaluation
 +   algorithms that execute on modern GPU architectures, thereby
 +   exploiting their massively parallel processing capabilities.
 +
 +   Since our current development is done in the Scala programming
 +   language, prospective students should either know Scala or be
 +   willing to learn it within the context of the master thesis.
 +
 +   *Validation of the approach* Validation of the master thesis work
 +   should be done on two levels:
 +  - a theoretical level: by proposing and discussing alternative ways
 +    to do incremental computation on GPU architectures, and comparing
 +    these from a theoretical complexity viewpoint
 +  - an experimental level: by proposing a benchmark collection of
 +    analytical queries that can be used to test the obtained versions
 +    of the interpreter/compiler, and reporting on the experimentally
 +    observed performance on this benchmark.
 +   
 +  *Deliverables* of the master thesis project
 +   - An overview of query processing on GPUs
 +   - A definition of the analytics queries under consideration
 +   - A description of different possible dynamic evaluation algorithms
 +     for the analytical queries on GPU architectures.
 +   - A theoretical comparison of these possibilities
 +   - The implementation of the evaluation algorithm(s) (as an interpreter/compiler)
 +   - A benchmark set of queries and associated data sets for
 +     the experimental validation
 +   - An experimental validation of the compiler, and analysis of the results.
 +
 +   *Interested?*
 +   - Contact :  [[svsummer@ulb.ac.be|Stijn Vansummeren]]
 +
 +   *Status*: available
 +
 +
 +** Complex Event Processing in Apache Spark and Apache Storm
 +
 +  This master thesis is put forward in the context of the SPICES
 +  ("Scalable Processing and mIning of Complex Events for
 +  Security-analytics") research project, funded by Innoviris.
 +
 +  Within this project, our lab is developing a declarative language
 +  for Complex Event Processing (CEP for short). The goal in Complex
 +  Event Processing is to detect pre-defined patterns in a stream of
 +  raw events. Raw events are typically sensor readings (such as
 +  "password incorrect for user X trying to log in on machine Y" or
 +  "file transfer from machine X to machine Y"). The goal of CEP is
 +  then to correlate these events into complex events. For example,
 +  repeated failed login attempts by X to Y should trigger a complex
 +  event "password cracking warning" that refers to all failed login
 +  attempts.
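
The failed-login example can be sketched as a small hand-written detector. This is only an illustration of what a CEP pattern computes, not the declarative language or its compiler: the names `Event` and `FailedLoginDetector`, the threshold of 3 attempts, and the 60-second window are all assumptions, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Event(NamedTuple):
    """A raw event (hypothetical schema)."""
    timestamp: float
    kind: str      # e.g. "login_failed" or "file_transfer"
    user: str
    machine: str


class FailedLoginDetector:
    """Emits a complex event when one user accumulates `threshold` failed
    logins on one machine within `window` seconds (assumed parameters)."""

    def __init__(self, threshold=3, window=60.0):
        self.threshold = threshold
        self.window = window
        # (user, machine) -> failed-login events still inside the window
        self._recent = defaultdict(deque)

    def process(self, event):
        if event.kind != "login_failed":
            return None
        queue = self._recent[(event.user, event.machine)]
        queue.append(event)
        # Drop events that have fallen out of the time window.
        while event.timestamp - queue[0].timestamp > self.window:
            queue.popleft()
        if len(queue) >= self.threshold:
            # The complex event refers to all contributing raw events.
            return {
                "type": "password_cracking_warning",
                "user": event.user,
                "machine": event.machine,
                "events": list(queue),
            }
        return None
```

Feeding the detector one event at a time with `process` returns `None` until the pattern is complete, and then returns a warning that carries the list of failed attempts that triggered it.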
 +
 +  The objective of this master thesis is to build an
 +  interpreter/compiler for this declarative CEP language that targets
 +  the distributed computing frameworks Apache Spark and/or Apache
 +  Storm as backends. Getting acquainted with these technologies is
 +  part of the master thesis objective.
 +
 +  *Validation of the approach* Validation of the proposed
 +  interpreter/compiler should be done on two levels:
 +  - a theoretical level: by comparing the generated Spark/Storm
 +    processors to a processor based on "Incremental computation" that
 +    is being developed at the lab
 +  - an experimental level: by proposing a benchmark
 +    collection of CEP queries that can be used to test the obtained
 +    interpreter/compiler, and reporting on the experimentally observed
 +    performance on this benchmark.
 +   
 +  *Deliverables* of the master thesis project
 +   - An overview of the processing models of Spark and Storm
 +   - A definition of the declarative CEP language under consideration
 +   - A description of the interpretation/compilation algorithm
 +   - A theoretical comparison of this algorithm with an incremental
 +     evaluation algorithm.
 +   - The interpreter/compiler itself (software artifact)
 +   - A benchmark set of CEP queries and associated data sets for
 +     the experimental validation
 +   - An experimental validation of the compiler, and analysis of the results.
 +
 +   *Interested?*
 +   - Contact :  [[svsummer@ulb.ac.be|Stijn Vansummeren]]
 +
 +   *Status*: available
 +
 +** Graph Indexing for Fast Subgraph Isomorphism Testing
 +
 +   There is an increasing amount of scientific data, mostly from the
 +   bio-medical sciences, that can be represented as collections of
 +   graphs (chemical molecules, gene interaction networks, ...). A
 +   crucial operation when searching in this data is that of subgraph
 +   isomorphism testing: given a pattern P that one is interested in
 +   (also a graph) and a collection D of graphs (e.g., chemical
 +   molecules), find all graphs in D that have P as a
 +   subgraph. Unfortunately, the subgraph isomorphism problem is
 +   computationally intractable (NP-complete). In ongoing research, to
 +   enable tractable processing of this problem, we aim to reduce the
 +   number of candidate graphs in D on which a subgraph isomorphism test
 +   needs to be executed. Specifically, we index the graphs in the
 +   collection D by decomposing them into graphs for which subgraph
 +   isomorphism *is* tractable. An associated algorithm that filters out
 +   graphs that certainly cannot match P can then be formulated based on
 +   ideas from information retrieval.
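
The filter-then-verify scheme described above can be sketched as follows. Note that this sketch swaps in a much simpler index than the decomposition-based one of the project: each graph is indexed by the multiset of label pairs on its edges, which is a necessary (not sufficient) condition for containing P. The names `Graph`, `may_contain`, `is_subgraph`, and `search` are hypothetical, graphs are assumed undirected, vertex-labeled, and free of self-loops, and the exact test is a brute-force one meant only for tiny graphs.

```python
from collections import Counter
from itertools import permutations


class Graph:
    """Undirected vertex-labeled graph without self-loops (assumed model)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)                  # vertex -> label
        self.edges = {frozenset(e) for e in edges}  # set of 2-vertex sets

    def edge_features(self):
        # Multiset of (label, label) pairs, one per edge.
        return Counter(
            tuple(sorted(self.labels[v] for v in e)) for e in self.edges
        )


def may_contain(features_g, features_p):
    """Filter: False means the graph certainly cannot contain the pattern."""
    return all(features_g[f] >= n for f, n in features_p.items())


def is_subgraph(p, g):
    """Exact but exponential-time test: is p isomorphic to a subgraph of g?"""
    p_nodes = list(p.labels)
    for image in permutations(g.labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        labels_ok = all(p.labels[v] == g.labels[mapping[v]] for v in p_nodes)
        edges_ok = all(
            frozenset(mapping[v] for v in e) in g.edges for e in p.edges
        )
        if labels_ok and edges_ok:
            return True
    return False


def search(pattern, collection):
    """Returns (candidates after filtering, verified matches)."""
    # In practice the index would be built once, ahead of query time.
    index = {name: g.edge_features() for name, g in collection.items()}
    wanted = pattern.edge_features()
    candidates = [n for n, f in index.items() if may_contain(f, wanted)]
    matches = [n for n in candidates if is_subgraph(pattern, collection[n])]
    return candidates, matches
```

The filtering power of an index is then the fraction of graphs in D that are discarded before the expensive `is_subgraph` test; candidates that pass the filter but fail verification are the false positives an experimental validation would count.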
 +
 +   In this master thesis project, the student will empirically
 +   validate on real-world datasets the extent to which graphs can be
 +   decomposed into graphs for which subgraph isomorphism is
 +   tractable, and run experiments to validate the effectiveness of
 +   the proposed method in terms of filtering power.
 +
 +   *Interested?*
 +   - Contact :  [[svsummer@ulb.ac.be|Stijn Vansummeren]]
 +
 +   *Status*: available
 +
  
 ===== Complex Event Processing in Apache Spark and Apache Storm =====
 
teaching/mfe/is.txt · Last modified: 2020/09/29 17:03 by mahmsakr