Differences

This shows you the differences between two versions of the page.

teaching:infoh415 [2021/10/03 19:52]
ezimanyi [Project]
teaching:infoh415 [2021/10/17 16:29]
ezimanyi [Topics for the current academic year]
Line 2: Line 2:
  
  
-===== Last important announcment ====
+===== Last important announcement ====
-All VUB student registered to the course which are not on the Teams of the course should take contact with gilles.dejaegere@ulb.be
+All VUB student registered to the course who are not on the Teams of the course should take contact with gilles.dejaegere@ulb.be
  
  
 +07/10:
 +
 +Hello everyone,
 +
 +
 +Unfortunately,​ due to a medical issue, I will be back home late for the exercise session of today (around 15h15). If you have any question, please send me a teams message so that I can call you at that time or at another time.
 +
 +
 +Best regards,
 +
 +Gilles
 ===== Lecturer =====
  
Line 135: Line 146:
 Students, in groups of two or four students, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please notice that the template for these topics is "<Technology> and <Tool>" for groups of 2 students and "<Technology> with <Tool1> and <Tool2>" for groups of 4 students.
  
-Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology.
+Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in a database management system to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found for example in the following [[https://db-engines.com/en/ranking|web site]].
  
 It is important to understand that the objective of the project is NOT about developing an application with GUI. The objective is to benchmark the proposed tool in relation to the database requirements of your application. Therefore, it is necessary to determine the set of queries and updates that your application requires and do a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc. depending on the technology used) to determine if the tool shows a linear or exponential behavior. As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times where the first execution is not considered because it is different from the others since the cache structures must be filled) and the average of the execution times is computed. A comparison with traditional relational technology must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale.
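The benchmarking procedure described above (run each query n times, discard the first cold-cache run, average the rest, and repeat at several data sizes) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `time_workload`, `build_table`, and `benchmark_sizes`, the `item` table, and the aggregate query are hypothetical, and an in-memory SQLite database merely stands in for the relational baseline. The course page does not prescribe any of these choices.

```python
import sqlite3
import statistics
import time
from typing import Callable, Dict, Sequence


def time_workload(workload: Callable[[], object], runs: int = 6) -> float:
    """Run `workload` `runs` times and return the mean wall-clock time
    (in seconds) of all runs except the first, which only warms the caches."""
    if runs < 2:
        raise ValueError("need at least 2 runs: one warm-up plus one measured")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings[1:])  # drop the cold first execution


def build_table(n_rows: int) -> sqlite3.Connection:
    """Create a hypothetical `item` table with `n_rows` synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, category INTEGER, price REAL)"
    )
    conn.executemany(
        "INSERT INTO item (id, category, price) VALUES (?, ?, ?)",
        ((i, i % 100, float(i % 1000)) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def benchmark_sizes(
    sizes: Sequence[int] = (1_000, 10_000, 100_000, 1_000_000),
    runs: int = 6,
) -> Dict[int, float]:
    """Time one illustrative aggregate query at each data size."""
    query = "SELECT category, AVG(price) FROM item GROUP BY category"
    results = {}
    for n in sizes:
        conn = build_table(n)
        results[n] = time_workload(lambda: conn.execute(query).fetchall(), runs)
        conn.close()
    return results


if __name__ == "__main__":
    for n, seconds in benchmark_sizes().items():
        print(f"{n:>9} rows: {seconds * 1000:.2f} ms on average")
```

Plotting the averaged times against the data size for each tool is one way to judge how execution time grows as the data scales. The same `time_workload` helper can wrap the equivalent query or update in the tool under study, so that both systems are measured in the same way.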
Line 163: Line 174:
   * Column stores and Cassandra, Hbase, ...
   * Data warehouses and Apache Hive
-  * Distributed databases and SQL Server, DynamoDB, ...
+  * Distributed databases and SQL Server, Oracle, Citus, ...
   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...
   * Embedded databases and BerkeleyDB
   * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ....
   * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldermort, ...
-  * Multimedia databases and Oracle
   * Multi-model databases and MarkLogic
   * NewSQL databases and VoltDB
Line 182: Line 192:
  
   * Analytical databases and Endeca: David Silberwasser, Sami Abdul Sater
+  * Analytical databases with Apache Druid and ClickHouse: Andrzej Krzysztof Pietrusiak, Tripat Kaur, Viktor Stavrinopoulos
   * Cloud databases and Microsoft Azure SQL: Davide Rendina, Margarita Hernandez
-  * Column stores and Cassandra: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir
-  * Datawarehouses and Apache Hive: Nicole Zafalón, Andrés Espinal
-  * Distributed databases and SQL Server: Asha Seif, Kainaat Amjid
+  * Column databases with Cassandra and HBase: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir, Hind Bakkali and Gaëlle Frauenkron
+  * Column stores and Apache Kudu: Pei Liao, Minxing Jiang
+  * Data warehouses and Apache Hive: Nicole Zafalón, Andrés Espinal
+  * Distributed databases with Citus: Asha Seif, Kainaat Amjid
+  * Distributed databases with Apache Ignite: Fan Chen, Mathieu Pardon
   * Distributed Databases with DynamoDB: Loïc Caudron, Matteo Snellings
   * Document stores with CouchBase and CouchDB: Mohammadreza Amini, Ossoama Benaissa, Zheng Ren, Adriana Sirbu
   * Document stores and Firestore: Luca De Santos, Sacha Keserovic
   * Document stores and MongoDB: Hang Yu, Zhiyang Guo
+  * Embedded databases and BerkeleyDB: Starygin Evgueniy, Ndele-A-Mulenghe Mashini
   * In-memory databases and Memcached: Diogo Repas and Sandra Hillergren
   * Key-value databases with DynamoDB: Aline Desmet, Chloé Dekeyser
Line 195: Line 209:
   * Multimedia databases and Oracle: Wassim Belgada, Imestir Ibrahim
   * NewSQL databases with VoltDB and CockroachDB: Ali Imam Manzer, Maciej Piekarski, Johan Gjini, Nabil Souissi,
-  * Object-oriented databases with ObjectBox and Perst: Filip Sotiroski, Niccolo Morabito, Andrea Gonzato, Pietro Ferrazi
+  * Object-oriented databases with ObjectBox and Perst: Filip Sotiroski, Niccolo Morabito, Vlada Kylynnyk, Pietro Ferrazi
   * Real-time databases and Firebase: Himanshu Choudhary, Sergio Postigo, Tejaswini dhuppad
+  * Search engines with Apache Solr and ElasticSearch: Pap Sanou, Szymon Swirydowicz, Alexandre Chapelle, Nicolas Dardenne
   * Spatial raster databases and Rasdaman: Adam Broniewski, Victor Divi
-  * Stream databases and Apache Kafka: Vlada Kylynnyk, Mahmut Asım Onat
   * Time series databases with Influx DB and Kdb+: Mohammad Zain Abbas, Muhammad Ismail, Yi Wu, Chonghan Li
-  * Search engines with Apache Solr and ElasticSearch: Pap Sanou, Szymon Swirydowicz, Alexandre Chapelle, Nicolas Dardenne
+  * Time series databases and Promoteus: Dumitru Negru, Brice Petit
   * XML Databases and BaseX: Maxime Renversez, Mael Touret
  
 
teaching/infoh415.txt · Last modified: 2023/12/04 18:14 by ezimanyi