Differences

This shows you the differences between two versions of the page.

teaching:infoh415 [2021/09/21 13:40]
ezimanyi [Topics for the current academic year]
teaching:infoh415 [2021/10/07 12:47]
ezimanyi [Topics for the current academic year]
Line 2: Line 2:
  
  
 +===== Last important announcement =====
 +All VUB students registered for the course who are not yet in the course Teams should contact gilles.dejaegere@ulb.be
 +
 +
 +07/10:
 +
 +Hello everyone,
 +
 +
 +Unfortunately, due to a medical issue, I will get home late for today's exercise session (around 15h15). If you have any questions, please send me a Teams message so that I can call you then or at another time.
 +
 +
 +Best regards,
 +
 +Gilles
 ===== Lecturer =====
  
Line 32: Line 47:
  
 The course is given during the first semester
-  * Lectures on Mondays from 4 pm to 6 pm
+  * Lectures on Mondays from 4 pm to 6 pm in room K.4.601 (Solbosch campus)
   * Exercises on Thursdays from 2 pm to 4 pm
  
Line 129: Line 144:
 */
  
-Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please notice that the template for these topics is "<Technology> and <Tool>".
+Students, in groups of two or four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> and <Tool>" for groups of 2 students and "<Technology> with <Tool1> and <Tool2>" for groups of 4 students.
  
-Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology.
+Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in a database management system to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found, for instance, on the following [[https://db-engines.com/en/ranking|web site]].
  
 It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool against the database requirements of your application. Therefore, it is necessary to determine the set of queries and updates that your application requires and run a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc., depending on the technology used) to determine whether the tool shows linear or exponential behavior. As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because it differs from the others, since the cache structures must first be filled) and the average of the execution times is computed. A comparison with traditional relational technology must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale.
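
The measurement procedure described above (n executions, the first one discarded, the rest averaged, repeated for growing data sizes) could be sketched as follows. This is only a minimal illustration, assuming Python with the standard-library sqlite3 module as a stand-in for the relational baseline. The table ''item'', the sample query, and the function names ''build_table'', ''timed_runs'' and ''benchmark'' are hypothetical and not part of the course material.

```python
import sqlite3
import statistics
import time


def build_table(n_rows):
    """Create an in-memory SQLite table holding n_rows synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, grp INTEGER, val REAL)"
    )
    conn.executemany(
        "INSERT INTO item (id, grp, val) VALUES (?, ?, ?)",
        ((i, i % 100, i * 0.5) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def timed_runs(conn, sql, runs=6):
    """Run sql `runs` times; drop the first (cold-cache) run, average the rest."""
    if runs < 2:
        raise ValueError("need at least 2 runs: the first one is discarded")
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch everything so the query really runs
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations[1:])


def benchmark(sizes=(1_000, 10_000, 100_000), runs=6):
    """Return {data size: mean warm execution time in seconds} for one query."""
    sql = "SELECT grp, AVG(val) FROM item GROUP BY grp"  # hypothetical workload
    results = {}
    for n in sizes:
        conn = build_table(n)
        try:
            results[n] = timed_runs(conn, sql, runs)
        finally:
            conn.close()
    return results


if __name__ == "__main__":
    for n, avg in benchmark().items():
        print(f"{n:>9} rows: {avg * 1e3:.3f} ms (mean of warm runs)")
```

Adding 1_000_000 to ''sizes'' covers the largest data size mentioned above. Comparing the times obtained for consecutive sizes gives a first indication of how the tool scales: a tenfold increase in data that costs roughly ten times more suggests linear behavior. The same loop would be run against the chosen tool with its own client library so that both sets of timings can be compared.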
  
 The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the program of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology chosen, (2) explain how these foundations are implemented by the database management system chosen and (3) illustrate all these concepts with the application implemented.
 +
 +For 2-student groups, the presentation lasts 30 minutes. It will be structured in two parts of similar length:
 +   * An introduction to the technology
 +   * An introduction to the tool, illustrated with an example application assessing its advantages and disadvantages.
 +
 +For 4-student groups, the presentation lasts 45 minutes. It will be structured in three parts of similar length:
 +   * An introduction to technologies, presented jointly by the two groups
 +   * An introduction to the two tools, each presented by one of the groups
 +   * A common assessment of the advantages and disadvantages of both tools, tested in a common example application.
  
 The evaluation of the project focuses on the following criteria:
Line 150: Line 174:
   * Column stores and Cassandra, HBase, ...
   * Data warehouses and Apache Hive
-  * Deductive Databases and XSB
-  * Distributed databases and SQL Server, DynamoDB, ...
+  * Distributed databases and SQL Server, Oracle, Citus, ...
   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...
   * Embedded databases and BerkeleyDB
   * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ...
   * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ...
-  * Multimedia databases and Oracle
   * Multi-model databases and MarkLogic
   * NewSQL databases and VoltDB
Line 169: Line 191:
 ===== Topics for the current academic year =====
  
 +  * Analytical databases and Endeca: David Silberwasser, Sami Abdul Sater
   * Cloud databases and Microsoft Azure SQL: Davide Rendina, Margarita Hernandez
-  * Datawarehouses and Apache Hive: Nicole Zafalón, Andrés Espinal
+  * Column databases with Cassandra and HBase: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir, Hind Bakkali and Gaëlle Frauenkron
-  * Distributed databases and SQL Server: Asha Seif, Kainaat Amjid
+  * Column stores and Apache Kudu: Pei Liao, Minxing Jiang
 +  * Data warehouses and Apache Hive: Nicole Zafalón, Andrés Espinal
 +  * Distributed databases with Citus: Asha Seif, Kainaat Amjid
 +  * Distributed databases with Apache Ignite: Fan Chen, Mathieu Pardon
 +  * Distributed Databases with DynamoDB: Loïc Caudron, Matteo Snellings
 +  * Document stores with CouchBase and CouchDB: Mohammadreza Amini, Ossoama Benaissa, Zheng Ren, Adriana Sirbu
 +  * Document stores and Firestore: Luca De Santos, Sacha Keserovic
   * Document stores and MongoDB: Hang Yu, Zhiyang Guo
 +  * Embedded databases and BerkeleyDB: Starygin Evgueniy, Bernard Loic
   * In-memory databases and Memcached: Diogo Repas and Sandra Hillergren
 +  * Key-value databases with DynamoDB: Aline Desmet, Chloé Dekeyser
   * Key-value databases with Cloud Bigtable and Redis: Luiz Fonseca, Zyrako Musaj, Yanjian Zhang and Zhicheng Luo
   * Multimedia databases and Oracle: Wassim Belgada, Imestir Ibrahim
-  * Object-oriented databases and ObjectBox: Filip Sotiroski, Niccolo Morabito
+  * NewSQL databases with VoltDB and CockroachDB: Ali Imam Manzer, Maciej Piekarski, Johan Gjini, Nabil Souissi
-  * Object-oriented databases and Perst: Andrea Gonzato, Pietro Ferrazi
+  * Object-oriented databases with ObjectBox and Perst: Filip Sotiroski, Niccolo Morabito, Andrea Gonzato, Pietro Ferrazi
   * Real-time databases and Firebase: Himanshu Choudhary, Sergio Postigo, Tejaswini dhuppad
 +  * Search engines with Apache Solr and ElasticSearch: Pap Sanou, Szymon Swirydowicz, Alexandre Chapelle, Nicolas Dardenne
   * Spatial raster databases and Rasdaman: Adam Broniewski, Victor Divi
   * Stream databases and Apache Kafka: Vlada Kylynnyk, Mahmut Asım Onat
-  * Time series databases with Influx DB and Kdb+: Mohammad Zain Abbas, Muhammad Ismail, Yi Wu, Zhonghan Li
+  * Time series databases with Influx DB and Kdb+: Mohammad Zain Abbas, Muhammad Ismail, Yi Wu, Chonghan Li
-  * Search engines and ElasticSearch: Alexandre Chapelle, Nicolas Dardenne
+  * Time series databases and Prometheus: Dumitru Negru, Brice Petit
-  * Column stores and Cassandra: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir
+  * XML Databases and BaseX: Maxime Renversez, Mael Touret
  
  
 
teaching/infoh415.txt · Last modified: 2023/12/04 18:14 by ezimanyi