This shows you the differences between two versions of the page.
teaching:infoh415 [2019/07/30 10:23] ezimanyi [Examples of topics from the previous academic year] |
teaching:infoh415 [2021/12/21 12:58] ezimanyi [Project] |
||
---|---|---|---|
Line 2: | Line 2: | ||
+ | ===== Last important announcement ===== | ||
+ | All VUB students registered for the course who are not on the course's Teams should contact gilles.dejaegere@ulb.be | ||
+ | |||
+ | === Additional sessions next week === | ||
+ | |||
+ | Hello everyone, | ||
+ | |||
+ | After checking with the professor of infoh419, it seems that many of you are busy on Thursday 18/11 from 16h to 18h. The additional lecture of infoh415 is therefore cancelled, and we will catch up another time. Tomorrow we will have only one exercise session, from 14h to 16h, which will cover the last part of temporal databases. | ||
===== Lecturer ===== | ===== Lecturer ===== | ||
Line 32: | Line 40: | ||
The course is given during the first semester | The course is given during the first semester | ||
- | * Lectures on Thursdays from 2 pm to 4 pm at the room S.UA4.218 | + | * Lectures on Mondays from 4 pm to 6 pm in room K.4.601 (Solbosch campus) |
- | * Exercises on Mondays from 4 pm to 6 pm at the room S.UB4.130 | + | * Exercises on Thursdays from 2 pm to 4 pm |
+ | /* | ||
{{:teaching:infoh415:infoh415schedule2018.pdf|Schedule}} | {{:teaching:infoh415:infoh415schedule2018.pdf|Schedule}} | ||
- | |||
- | /* | ||
* [[http://www.google.com/calendar/embed?src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&ctz=Europe/Brussels|Online schedule]] | * [[http://www.google.com/calendar/embed?src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&ctz=Europe/Brussels|Online schedule]] | ||
*/ | */ | ||
+ | |||
+ | |||
+ | |||
===== Objectives ===== | ===== Objectives ===== | ||
Today, databases are moving away from typical management applications and addressing new application areas. To this end, databases must consider (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. | Today, databases are moving away from typical management applications and addressing new application areas. To this end, databases must consider (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. | ||
+ | |||
+ | |||
+ | |||
===== Content ===== | ===== Content ===== | ||
Line 54: | Line 67: | ||
Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. | Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. | ||
- | ==== Object Databases ==== | + | ==== Graph Databases ==== |
- | Object-oriented model. Object Persistance. ODMG standard: Object Definition Language and Object Query Language. | + | ... |
==== Spatial Databases ==== | ==== Spatial Databases ==== | ||
Line 76: | Line 89: | ||
* Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001 | * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001 | ||
* Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002 | * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002 | ||
- | * R.G.G. Cattel et al., The Object Database Standard: ODMG 3.0, Morgan Kaufmann, 2000 ({{:teaching:odmg.pdf|version pdf}}) | + | * Ian Robinson, Jim Webber, Emil Eifrem, Graph Databases, 2nd Edition, O'Reilly Media, 2015 |
* Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001 | * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001 | ||
Line 92: | Line 105: | ||
* IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:infoh415:a_matter_of_time.pdf|version pdf}}) | * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:infoh415:a_matter_of_time.pdf|version pdf}}) | ||
===== Links ===== | ===== Links ===== | ||
- | * Temporal databases | + | * Spatial databases |
- | * [[http://timecenter.cs.aau.dk/|TimeCenter]], an international research centre for temporal databases. | + | * [[https://postgis.net/workshops/postgis-intro/|Introduction to PostGIS]] |
- | * [[http://www.timeconsult.com/Software/Software.html|TimeDB]], a temporal relational database | + | * [[https://learn.crunchydata.com/postgis|Crunchy Data Interactive PostGIS Learning Portal]] |
+ | * Spatio-temporal (or mobility) databases | ||
+ | * [[https://mobilitydb.com/|MobilityDB]] | ||
* Object databases | * Object databases | ||
* [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases. | * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases. | ||
- | * [[http://www.db4o.com/|db4o]], an open source object database. | ||
* [[http://www.objectstore.com/datasheet/index.ssp|ObjectStore]], an object database | * [[http://www.objectstore.com/datasheet/index.ssp|ObjectStore]], an object database | ||
* [[http://www.objectivity.com|Objectivity]], an object database | * [[http://www.objectivity.com|Objectivity]], an object database | ||
- | * [[http://www.versant.com/|Versant]], an object database | ||
- | * [[http://www.jade.co.nz/jade/|Jade]], an object database | ||
- | * [[http://sourceforge.net/projects/ozone/|Ozone]], an object database | ||
* Post-relational databases | * Post-relational databases | ||
- | * [[http://www.fresher.com/|Matisse]] | ||
* [[http://www.intersystems.com/cache/index.html|Caché]] | * [[http://www.intersystems.com/cache/index.html|Caché]] | ||
Line 111: | Line 121: | ||
* {{teaching:infoh415:activenotes.pdf|Active databases}} | * {{teaching:infoh415:activenotes.pdf|Active databases}} | ||
* {{teaching:infoh415:temporalnotes.pdf|Temporal databases}} | * {{teaching:infoh415:temporalnotes.pdf|Temporal databases}} | ||
- | * {{teaching:infoh415:objectnotes.pdf|Object databases}} | + | * {{:teaching:infoh415:graphdb-ulb-2021.zip|Graph Notes (2021 version)}} |
+ | /* * {{teaching:infoh415:objectnotes.pdf|Object databases}} | ||
+ | * {{:teaching:infoh415:graph_databases_notes.zip|Graph Notes}}*/ | ||
* {{teaching:infoh415:spatialnotes.pdf|Spatial databases}} | * {{teaching:infoh415:spatialnotes.pdf|Spatial databases}} | ||
Line 126: | Line 138: | ||
*/ | */ | ||
- | Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document. | + | Students, in groups of either two or four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> and <Tool>" for groups of 2 students and "<Technology> with <Tool1> and <Tool2>" for groups of 4 students. |
- | Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., Oracle, PostgreSQL, DB2, SQL Server, mySQL, etc..). | + | Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in a database management system to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology. Technologies and tools are listed, for example, on the following [[https://db-engines.com/en/ranking|web site]]. |
- | The topic should be addressed in a technical way, to explain the underlying technologies. The application must use the specific technology manipulated. | + | |
- | The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the programme of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and written report will explain the possibilities offered by the database management system chosen and give a general description of the application implemented. | + | It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool in relation to the database requirements of your application. Therefore, it is necessary to determine the set of queries and updates that your application requires and run a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc. depending on the technology used) to determine whether the tool shows linear or exponential behavior. Please note that you SHOULD NOT generate random data, since (1) a huge number of datasets are available and (2) alternatively, many data generators are available. |
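One possible way to judge whether measured times grow linearly with the number of objects is to fit a straight line to log(time) against log(size): a slope close to 1 suggests linear growth, a slope close to 2 suggests quadratic growth, and a slope that keeps increasing as larger sizes are added hints at exponential behavior. The sketch below is only an illustration under these assumptions. The function name `loglog_slope` is hypothetical, and the timing values are computed from formulas purely to exercise the code; they are not measurements.

```python
import math


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 indicates roughly linear scaling,
    a slope near 2 roughly quadratic scaling.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs of equal length")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Dataset sizes suggested in the project description.
SIZES = [1_000, 10_000, 100_000, 1_000_000]

# Synthetic timings derived from formulas (NOT real measurements),
# used only to show what the slope looks like in two idealized cases.
LINEAR_TIMES = [2e-6 * n for n in SIZES]
QUADRATIC_TIMES = [1e-9 * n * n for n in SIZES]

if __name__ == "__main__":
    print("idealized linear case, slope:", round(loglog_slope(SIZES, LINEAR_TIMES), 3))
    print("idealized quadratic case, slope:", round(loglog_slope(SIZES, QUADRATIC_TIMES), 3))
```

In a real project the `times` list would hold the averaged execution times of one query at each dataset size, and the slope would be reported per query.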
+ | |||
+ | As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because the cache structures must first be filled) and the average of the execution times is computed. A comparison with traditional relational technology must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale. | ||
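The measurement procedure described above (n executions, first one discarded as a warm-up, average of the rest) could be sketched as follows. This is a minimal illustration, not a prescribed harness: the helper names `time_query`, `build_baseline`, and `count_by_category` are hypothetical, SQLite (from the Python standard library) merely stands in for the relational baseline, and the rows are deterministic placeholder filler used only to keep the sketch self-contained. A real project would load an existing dataset instead.

```python
import sqlite3
import statistics
import time


def time_query(run_query, repetitions=6):
    """Run `run_query` `repetitions` times and average all runs except the first.

    The first run is treated as a warm-up because caches are still cold.
    """
    if repetitions < 2:
        raise ValueError("need at least one warm-up run and one measured run")
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    measured = timings[1:]  # drop the warm-up run
    return statistics.mean(measured)


def build_baseline(num_rows):
    """Create an in-memory SQLite table as a stand-in relational baseline.

    The rows are placeholder filler; replace this with a real dataset load.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, category INTEGER)")
    conn.executemany(
        "INSERT INTO item (id, category) VALUES (?, ?)",
        ((i, i % 10) for i in range(num_rows)),
    )
    conn.commit()
    return conn


def count_by_category(conn):
    """Example query to benchmark: number of items per category."""
    return conn.execute(
        "SELECT category, COUNT(*) FROM item GROUP BY category ORDER BY category"
    ).fetchall()


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        conn = build_baseline(size)
        average = time_query(lambda: count_by_category(conn))
        print(f"{size:>9} rows: {average:.6f} s on average")
        conn.close()
```

The same `time_query` helper can wrap the equivalent query against the tool under study, so that both systems are measured with an identical procedure.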
+ | |||
+ | The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the program of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology chosen, (2) explain how these foundations are implemented by the database management system chosen and (3) illustrate all these concepts with the application implemented. | ||
+ | |||
+ | For 2-student groups, the duration of the presentation is 30 minutes. It will be structured in two parts of similar length: | ||
+ | * An introduction to the technology | ||
+ | * An introduction to the tool illustrated with an example application assessing its advantages and disadvantages. | ||
+ | |||
+ | For 4-student groups, the duration of the presentation is 45 minutes. It will be structured in three parts of similar length: | ||
+ | * An introduction to technologies presented jointly by the two groups | ||
+ | * An introduction to the two tools, each presented by one of the two groups | ||
+ | * A common assessment of the advantages and disadvantages of both tools tested in a common example application. | ||
The evaluation of the project focuses on the following criteria: | The evaluation of the project focuses on the following criteria: | ||
Line 140: | Line 164: | ||
The project will count for 25% of the final grade. | The project will count for 25% of the final grade. | ||
- | The project must be submitted by **Monday, December 16, 2019**. | + | The project must be submitted by **Monday, December 13, 2021**. |
- | + | ||
- | ===== Examples of topics from the previous academic year ===== | + | |
- | + | ||
- | You can take a look at the [[https://db-engines.com/en/|DB-Engines]] web site to get an idea of the currently available technologies and tools. Examples of previous topics are given next: | + | |
* Analytical databases and Endeca | * Analytical databases and Endeca | ||
Line 150: | Line 170: | ||
* Column stores and Cassandra, Hbase, ... | * Column stores and Cassandra, Hbase, ... | ||
* Data warehouses and Apache Hive | * Data warehouses and Apache Hive | ||
- | * Deductive Databases and XSB | + | * Distributed databases and SQL Server, Oracle, Citus, ... |
- | * Distributed databases and SQL Server, DynamoDB, ... | + | |
* Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ... | * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ... | ||
* Embedded databases and BerkeleyDB | * Embedded databases and BerkeleyDB | ||
- | * Graph Databases and Neo4J, OrientDB, ... | ||
* In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, .... | * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, .... | ||
- | * Key-value stores and Redis, Voldemort, ... | + | * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ... |
- | * Multimedia databases and Oracle | + | |
* Multi-model databases and MarkLogic | * Multi-model databases and MarkLogic | ||
* NewSQL databases and VoltDB | * NewSQL databases and VoltDB | ||
- | * Object-oriented databases and db4o | + | * Object-oriented databases and ObjectBox, Perst |
* Real-time databases and Firebase | * Real-time databases and Firebase | ||
* Search engines and Solr, ElasticSearch, Sphinx ... | * Search engines and Solr, ElasticSearch, Sphinx ... | ||
- | * Spatial databases and Rasdaman | + | * Spatial raster databases and Rasdaman |
- | * Stream databases and Apache Kafka | + | * Stream databases and Apache Kafka, Event Stores |
* Time series databases and Influx DB, Kdb+, ... | * Time series databases and Influx DB, Kdb+, ... | ||
* XML databases and BaseX | * XML databases and BaseX | ||
- | |||
===== Topics for the current academic year ===== | ===== Topics for the current academic year ===== | ||
- | * {{:teaching:infoh415:student_projects:2019:azure.pdf|Cloud databases and Microsoft Azure}}: Sara Diaz, Buse Ozer | + | |
- | * {{:teaching:infoh415:student_projects:2019:xsb.pdf|Deductive databases and XSB}}: Gonçalo Moreira, Kaoutar Chennaf | + | * Analytical databases with Apache Druid and ClickHouse: Andrzej Krzysztof Pietrusiak, Tripat Kaur, Viktor Stavrinopoulos, Deven Ramani |
- | * {{:teaching:infoh415:student_projects:2019:kafka.pdf|Distributed messaging with Apache Kafka}}: René Gómez Londoño, Ankush Sharma | + | * Cloud databases and Microsoft Azure SQL: Davide Rendina, Margarita Hernandez |
- | * {{:teaching:infoh415:student_projects:2019:dynamodb.pdf|Distributed databases and DynamoDB}}: Elena Ouro, Carlos Badillo | + | * Column databases with Cassandra and HBase: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir, Hind Bakkali and Gaëlle Frauenkron |
- | * {{:teaching:infoh415:student_projects:2019:hive.pdf|Distributed databases and Apache Hive}}: Ricardo Rojas, Danilo Acosta | + | * Column stores and Apache Kudu: Pei Liao, Minxing Jiang |
- | * {{:teaching:infoh415:student_projects:2019:mongodb.pdf|Document stores and MongoDB}}: Sivaporn Homvanish, Tzu-Man Wu | + | * Data warehouses and Apache Hive: Nicole Zafalón, Andrés Espinal |
- | * {{:teaching:infoh415:student_projects:2019:couchbase.pdf|Document stores and CouchBase}}: Carlos Martinez Lorenzo, Pablo Molina Mata | + | * Data Warehouses with Redshift and Google BigQuery: Manar EL AMRANI, Hamza MAHMOUDI, Salma SALMANI, Cédric HANSSENS |
- | * {{:teaching:infoh415:student_projects:2019:couchdb.pdf|Document stores and CouchDB}}: Aparna Khire, Mingrui Dong | + | * Distributed databases with Citus and DynamoDB: Asha Seif, Kainaat Amjid, Loïc Caudron, Matteo Snellings |
- | * {{:teaching:infoh415:student_projects:2019:berkeleydb.pdf|Embedded databases and Berkeley DB}}: Ainhoa Zapirain, Nazrin Najafzade | + | * Distributed databases with Apache Ignite: Fan Chen, Mathieu Pardon |
- | * {{:teaching:infoh415:student_projects:2019:memsql.pdf|In-memory databases and MemSQL}}: Haydar Ali Ismail, Dwi Prasetyo Adi Nugroho | + | * Distributed databases with RethinkDB: Thapa Darshan, Sami Akroune |
- | * {{:teaching:infoh415:student_projects:2019:redis.pdf|Key-value stores and Redis}}: Amritansh Sharma, Haftamu Hailu | + | * Document stores with CouchBase and CouchDB: Mohammadreza Amini, Ossoama Benaissa, Zheng Ren, Adriana Sirbu |
- | * {{:teaching:infoh415:student_projects:2019:memcached.pdf|Key-value stores and Memcached}}: Nathan Hullebroeck, Julien Delbeke | + | * Document stores and Firestore: Luca De Santos, Sacha Keserovic |
- | * {{:teaching:infoh415:student_projects:2019:cassandra.pdf|NoSQL databases and Cassandra}}: Pratham Solanki, Braulio Blanco | + | * Document stores and MongoDB: Hang Yu, Zhiyang Guo |
- | * {{:teaching:infoh415:student_projects:2019:db4o.pdf|Object-oriented databases and db4o}}: Pinar Turkyilmaz, Annemarie Burger | + | * Document stores and Supabase: Shady Al Shoha, Nabil El Ouahabi |
- | * {{:teaching:infoh415:student_projects:2019:firebase.pdf|Real-time databases and Firebase}}: Pablo Lopez, Maria Gabriela Martinez | + | * Embedded databases and BerkeleyDB: Starygin Evgueniy, Ndele-A-Mulenghe Mashini |
- | * {{:teaching:infoh415:student_projects:2019:elasticsearch.pdf|Search engines and ElasticSearch}}: Ioannis Prapas, Sokratis Papadopulos | + | * In-memory databases and Memcached: Diogo Repas and Sandra Hillergren |
- | * {{:teaching:infoh415:student_projects:2019:sphinx.pdf|Search engines and Sphinx}}: Kevin SEFU, Antonio RAFAELE, Nestor RAMOS PEREZ | + | * Key-value databases with DynamoDB: Aline Desmet, Chloé Dekeyser |
- | * {{:teaching:infoh415:student_projects:2019:rasdaman.pdf|Spatial data and Rasdaman}}: Fernando Mendes Stefanini, Evgeny Pozdeev | + | * Key-value databases with Cloud Bigtable and Redis: Luiz Fonseca, Zyrako Musaj, Yanjian Zhang and Zhicheng Luo |
- | * {{:teaching:infoh415:student_projects:2019:influxdb.pdf|Time series databases and Influx DB}}: Shabana Salmaan, Danish Amjad | + | * Multimedia databases and Oracle: Wassim Belgada, Imestir Ibrahim |
- | * {{:teaching:infoh415:student_projects:2019:kdb.pdf|Time series databases with Kdb+}}: Eugen Robert Patrascu, Kunal Arora | + | * Multimodel databases and ArangoDB: David Silberwasser, Sami Abdul Sater |
- | * {{:teaching:infoh415:student_projects:2019:hbase.pdf|Wide-column databases and Apache HBase}}: Edoardo Conte, Carlos E. Muniz Cuza | + | * Multimodel databases and MarkLogic: Yassine Hodaibi, Jean-Jacques Debilde |
- | * {{:teaching:infoh415:student_projects:2019:basex.pdf|XML databases and BaseX}}: Marine Devers, Richard Bauwens | + | * NewSQL databases with VoltDB and CockroachDB: Ali Imam Manzer, Maciej Piekarski, Johan Gjini, Nabil Souissi |
- | * {{:teaching:infoh415:student_projects:2019:solr_1.pdf|Search engine and Solr}}: Maazouz Mehdi, Meire Wouter | + | * Object-oriented databases with ObjectBox and Perst: Filip Sotiroski, Niccolo Morabito, Vlada Kylynnyk, Pietro Ferrazi |
- | * {{:teaching:infoh415:student_projects:2019:solr_2.pdf|Search engine and Solr}}: Mulham Aryan, Samia Azzouzi, Kamdem Tagne Thomas Borel | + | * Real-time databases and Firebase: Himanshu Choudhary, Sergio Postigo, Tejaswini Dhuppad |
+ | * Search engines with Apache Solr and ElasticSearch: Pap Sanou, Szymon Swirydowicz, Alexandre Chapelle, Nicolas Dardenne | ||
+ | * Spatial raster databases and Rasdaman: Adam Broniewski, Victor Divi | ||
+ | * Stream databases and Apache Kafka, Event Stores: Nazgul Rakhimzhanova | ||
+ | * Time series databases with Influx DB and Kdb+: Mohammad Zain Abbas, Muhammad Ismail, Yi Wu, Chonghan Li | ||
+ | * Time series databases and TimescaleDB: Dumitru Negru, Brice Petit | ||
+ | * XML Databases and BaseX: Maxime Renversez, Mael Touret | ||
+ | |||
===== Examinations from Previous Years ===== | ===== Examinations from Previous Years ===== |