Differences

This shows you the differences between two versions of the page.

teaching:infoh415 [2018/12/27 11:31]
ezimanyi [Topics for the current academic year]
teaching:infoh415 [2025/10/13 18:13] (current)
ezimanyi [Topics for the current academic year]
Line 1: Line 1:
 ====== INFO-H-415: Advanced Databases ====== ====== INFO-H-415: Advanced Databases ======
 +
  
  
Line 10: Line 11:
 ===== Teaching Assistant ===== ===== Teaching Assistant =====
  
-  * [[Gilles.Dejaegere@ulb.ac.be|Gilles Dejaegere]]+  * [[boris.coquelet@ulb.be|Boris Coquelet]]
  
  
Line 22: Line 23:
   * Master in Computer Science and Engineering [MA-IRIF]   * Master in Computer Science and Engineering [MA-IRIF]
   * Master in Computer Sciences [INFO]   * Master in Computer Sciences [INFO]
-  * Erasmus Mundus Master in Big Data Management and Analytics (BDMA) 
  
  
Line 32: Line 32:
  
 The course is given during the first semester ​ The course is given during the first semester ​
-  * Lectures on Thursdays ​from pm to pm at the room S.UA4.218 +  * Lectures on Mondays ​from pm to pm 
-  * Exercises on Mondays ​from pm to pm at the room S.UB4.130+  * Exercises on Thursdays ​from pm to pm
  
 +/* 
 {{:​teaching:​infoh415:​infoh415schedule2018.pdf|Schedule}} {{:​teaching:​infoh415:​infoh415schedule2018.pdf|Schedule}}
  
- 
-/*  
   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]
 */ */
 +
 +
 +===== Grading =====
 +  * Group project (25%)
 +  * Written exam (75%)
 +    * the exam is open book; notes and books can be used. Laptops and other electronic devices are **not** allowed. Please prepare your paper material well in advance, not the day before the examination, to avoid any printing problems.
 +
 +
 ===== Objectives ===== ===== Objectives =====
  
 Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications.
 +
 +
 +
 ===== Content ===== ===== Content =====
  
-==== Active Databases ====
 +==== Spatial Databases ====
 
-Taxonomy of concepts. Applications of active databases: integrity maintenance, derived data, replication. Design of active databases: termination, confluence, determinism, modularisation.
 +Spatial data and applications. Space ontology. Conceptual modeling of spatial aspects. Manipulation of spatial data with standard SQL.
 + 
 +==== Mobility Databases ==== 
 + 
 +...
  
 ==== Temporal Databases ==== ==== Temporal Databases ====
Line 54: Line 68:
 Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL.
  
-==== Object Databases ====
 +==== Active Databases ====
 
-Object-oriented model. Object Persistence. ODMG standard. Object Definition Language and Object Query Language.
 +Taxonomy of concepts. Applications of active databases: integrity maintenance, derived data, replication. Design of active databases: termination, confluence, determinism, modularisation.
  
-==== Spatial Databases ==== 
- 
-Spatial data and applications. Space ontology. Conceptual modeling of spatial aspects. Manipulation of spatial data with standard SQL. 
  
  
Line 73: Line 84:
   * Tom Johnston, Bitemporal Data: Theory and Practice, Morgan Kaufmann, 2014   * Tom Johnston, Bitemporal Data: Theory and Practice, Morgan Kaufmann, 2014
   * R.T. Snodgrass, The TSQL2 Temporal Query Language, Kluwer Academic Publishers, 1995   * R.T. Snodgrass, The TSQL2 Temporal Query Language, Kluwer Academic Publishers, 1995
-  * S.W. Dietrich, S.D. Urban, Fundamentals of Object Databases: Object-Oriented and Object-Relational Design, Morgan & Claypool, 2011 
   * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001   * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001
   * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002   * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002
-  * R.G.G. Cattel et al., The Object Database Standard: ODMG 3.0, Morgan Kaufmann, 2000  ({{:​teaching:​odmg.pdf|version pdf}}) 
   * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001   * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001
  
Line 86: Line 95:
   * E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):​16-21,​ 2006. ({{http://​code.ulb.ac.be/​dbfiles/​Zim2006article.pdf|version pdf}})   * E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):​16-21,​ 2006. ({{http://​code.ulb.ac.be/​dbfiles/​Zim2006article.pdf|version pdf}})
   * Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):​34-43,​ 2012. ({{teaching:​infoh415:​TempFeaturesSQL2011.pdf|version pdf}})   * Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):​34-43,​ 2012. ({{teaching:​infoh415:​TempFeaturesSQL2011.pdf|version pdf}})
-  * Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:infoh415:teradata_enabling_temporal.pdf|version pdf}})
 +  * Michael H. Böhlen, Anton Dignös, Johann Gamper, Christian S. Jensen, Temporal Data Management: An Overview, Proc. of the 7th European Summer School on Business Intelligence and Big Data, eBISS 2017, Bruxelles, Belgium, LNBIP 324, Springer 2018. ({{teaching:infoh415:bohlen.pdf|version pdf}})
 +  * Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:infoh415:teradata_enabling_temporal.pdf|version pdf}})
   * Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:​infoh415:​teradata_temporal_case_study.pdf|version pdf}})   * Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:​infoh415:​teradata_temporal_case_study.pdf|version pdf}})
   * Teradata, Temporal Table Support. ({{teaching:​infoh415:​teradata_temporal_support.pdf|version pdf}})   * Teradata, Temporal Table Support. ({{teaching:​infoh415:​teradata_temporal_support.pdf|version pdf}})
Line 92: Line 101:
   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})
 ===== Links ===== ===== Links =====
-  * Temporal ​databases  +  * Spatial ​databases 
-    * [[http://timecenter.cs.aau.dk/|TimeCenter]], an international research centre for temporal databases. +    * [[https://postgis.net/​workshops/​postgis-intro/|Introduction to PostGIS]] 
-    * [[http://www.timeconsult.com/Software/Software.html|TimeDB]], a temporal relational database+    * [[https://learn.crunchydata.com/postgis|Crunchy Data Interactive PostGIS Learning Portal]] 
 +  * Mobility databases 
 +    * [[https://mobilitydb.com/|MobilityDB]]  
   * Object databases   * Object databases
     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.
-    * [[http://​www.db4o.com/​|db4o]],​ an open source object database. 
     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database
     * [[http://​www.objectivity.com|Objectivity]],​ an object database     * [[http://​www.objectivity.com|Objectivity]],​ an object database
-    * [[http://​www.versant.com/​|Versant]],​ an object database 
-    * [[http://​www.jade.co.nz/​jade/​|Jade]],​ an object database 
-    * [[http://​sourceforge.net/​projects/​ozone/​|Ozone]],​ an object database 
   * Post-relational databases   * Post-relational databases
-    * [[http://​www.fresher.com/​|Matisse]] 
     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]
  
 ===== Course Slides ===== ===== Course Slides =====
  
-  * {{teaching:​infoh415:​activenotes.pdf|Active databases}} 
-  * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}} 
-  * {{teaching:​infoh415:​objectnotes.pdf|Object databases}} 
   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}
 +  * Mobility databases
 +  * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}}
 +  * {{teaching:​infoh415:​activenotes.pdf|Active databases}}
 +/*  * {{:​teaching:​infoh415:​graphdb-ulb-2021.zip|Graph Notes (2021 version)}} */
 +/*   * {{teaching:​infoh415:​objectnotes.pdf|Object databases}} ​  
 +  * {{:​teaching:​infoh415:​graph_databases_notes.zip|Graph Notes}}*/
  
  
Line 118: Line 127:
  
   * [[teaching:infoh415:TP|Exercises Web page]]   * [[teaching:infoh415:TP|Exercises Web page]]
 +
 ===== Project ===== ===== Project =====
  
Line 126: Line 136:
 */ */
  
-Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document.
 +Students, in groups of four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> with <Tool1> and <Tool2>".
 + 
 +Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in two database management systems to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way that explains the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found, for instance, on the following [[https://db-engines.com/en/ranking|web site]].
 + 
 +It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool in relation to the database requirements of your application. Therefore, it is necessary to determine the set of queries and updates that your application requires and to run a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc., depending on the technology used) to determine whether the tool shows linear or exponential behavior. Please note that you SHOULD NOT generate data for the benchmark yourselves, since on the Internet you can find (1) a huge number of available datasets and (2) alternatively, many available data generators.
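One possible way to judge how execution time grows across the dataset sizes is to fit a straight line to the measurements on a log-log scale. The sketch below is only an illustration under assumptions: the function name `loglog_slope` and the sample timings are hypothetical and are not part of the course material. A slope close to 1 suggests roughly linear growth, a slope close to 2 suggests quadratic growth, and a slope that keeps increasing between consecutive sizes hints at faster-than-polynomial growth.

```python
import math


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares: slope = covariance(x, y) / variance(x).
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


if __name__ == "__main__":
    sizes = [1_000, 10_000, 100_000, 1_000_000]
    # Hypothetical average execution times in seconds, for illustration only.
    hypothetical_times = [0.002, 0.021, 0.19, 2.1]
    print(f"log-log slope: {loglog_slope(sizes, hypothetical_times):.2f}")
```

With only four dataset sizes the slope is a rough indicator, so it is best read alongside a plot of the raw timings.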
 + 
 +As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because it differs from the others: the cache structures must first be filled) and the average of the execution times is computed. A comparison with traditional relational technology (e.g., using PostgreSQL) must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale. Please note that there are MANY standard benchmarks for various database technologies; where one exists, you should prefer using it rather than reinventing the wheel by creating your own benchmark.
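The measurement procedure described above (run each operation several times, discard the first cold-cache run, average the rest) could be sketched as follows. This is a minimal illustration, not a prescribed harness: the names `average_runtime`, `run_query`, `warmup`, and `clock` are assumptions introduced here, and the workload in the demo is a stand-in for a real query against the chosen DBMS.

```python
import statistics
import time


def average_runtime(run_query, runs=6, warmup=1, clock=time.perf_counter):
    """Call run_query `runs` times and average the timings after the warm-up runs."""
    if runs <= warmup:
        raise ValueError("runs must be greater than warmup")
    timings = []
    for _ in range(runs):
        start = clock()
        run_query()
        timings.append(clock() - start)
    # The first `warmup` timings are dropped because caches are still being filled.
    return statistics.mean(timings[warmup:])


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would execute a query or update here.
    def workload():
        return sum(range(100_000))

    print(f"average over warm runs: {average_runtime(workload):.6f} s")
```

Passing the clock as a parameter is a design choice that makes the function easy to check with a fake clock; in a real benchmark the default `time.perf_counter` would normally be kept.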
  
-Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., Oracle, PostgreSQL, DB2, SQL Server, mySQL, etc.).
-The topic should be addressed in a technical way, to explain the underlying technologies. The application must use the specific technology manipulated.
 +The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the program of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology chosen, (2) explain how these foundations are implemented by the database management systems chosen and (3) illustrate all these concepts with the application implemented.
  
-The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the programme of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and written report will explain the possibilities offered by the database management system chosen and give a general description of the application implemented.
 +The duration of the presentation is 45 minutes. It will be structured in three parts of SIMILAR length:
 +   * An introduction to the technology
 +   * An introduction to the two tools, each presented by a subgroup of two persons
 +   * A common assessment of the advantages and disadvantages of both tools, tested in a common example application.
  
 The evaluation of the project focuses on the following criteria: The evaluation of the project focuses on the following criteria:
Line 140: Line 158:
 The project will count for 25% of the final grade. The project will count for 25% of the final grade.
  
-The project must be submitted by **Monday, December 17, 2018**.
 +The project must be submitted **immediately after** the project presentation, which will take place in the week of Monday, December 16, 2025. Please send the report and the presentation in PDF format to the lecturer.
  
-===== Examples of topics from the previous academic year =====
-
-You can take a look at the [[https://db-engines.com/en/|DB-Engines]] web site to get an idea of the currently available technologies and tools. Examples of previous topics are given next:
-
-  * Analytical databases and Endeca
-  * Cloud databases and Microsoft Azure
 +  * Cloud databases and Microsoft Azure, AWS, ...
   * Column stores and Cassandra, Hbase, ...   * Column stores and Cassandra, Hbase, ...
-  * Deductive Databases and XSB
 +  * Data warehouses and Apache Hive, ...
-  * Distributed databases and SQL Server, DynamoDB, ...
 +  * Distributed databases and SQL Server, Oracle, Citus, ...
   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...
-  * Embedded databases and BerkeleyDB
 +  * Embedded databases and BerkeleyDB, DuckDB, ...
-  * Graph Databases and Neo4J, OrientDB, ...
 +  * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ...
-  * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, ...
 +  * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ...
-  * Key-value stores and Redis, Voldemort, ...
 +  * Multi-model databases and MarkLogic, CosmosDB, ...
-  * Multimedia databases and Oracle
 +  * NewSQL databases and VoltDB, CockroachDB, ...
-  * Multi-model databases and MarkLogic
 +  * Object-oriented databases and ObjectBox, Perst, ...
-  * NewSQL databases and VoltDB
 +  * Real-time databases and Firebase, ...
-  * Object-oriented databases and db4o
 +  * Search engines and Solr, ElasticSearch, Sphinx, ...
-  * Real-time databases and Firebase
 +  * Spatial raster databases and Rasdaman, ...
-  * XML databases and BaseX
 +  * Stream databases and Kafka, Event Stores, Flink, NebulaStream, ...
 +  * Time series databases and Influx DB, Kdb+, ...
 +  * Vector databases and PGVector, Chromadb, ...
 +  * XML databases and BaseX, ...
  
 +===== Topics for the current academic year =====
 +
 +  * Columnar Databases with XX and YY: David Amsens, Ahmed Talhaoui, Alexandru Tirpan
 +  * Distributed Databases with MongoDB and Citus: Soffack Mafoken Irène, Moaaz Afzal, Meli Annabelle Grace, Noor ul Hassan
 +  * Document Databases with MongoDB and CouchDB: Yassmine Debbaghi, Louis Batisse, Maxime Hainaut, Matteo Padonou
 +  * Embedded databases with DuckDB and BerkeleyDB: Ethan Rogge, Basile Donnay, Anas Cheouirfa, Hac Le
 +  * Graph databases with Neo4j and ArangoDB: Antoine BERTHION, Nha TRUONG, Andrius EZERSKIS, Capucine SPEILERS
 +  * Key-value stores with Redis and Memcached: Sharof Imad, Manuelle Ndamtang, Francis Yaun, Nauman Ahmad
 +  * NewSQL databases with CockroachDB and VoltDB: Luigi Cristanelli, Eurielle Nkwinga, Louis Devroye, Lina Shahada
 +  * Search engines with ElasticSearch and Apache Solr: Sirine Ameraoui, Othman El Kazbani, Flament Franklin, Siddharth Sahay
 +  * Time Series databases with InfluxDB and TimescaleDB: Deliallisi Klejdi, Ziane Yasser, Touimer Amin, Delhaise Thalia
 +  * Vector DBMS with Elasticsearch and Chroma: DAHRI Mohamed, BENMASSAOUD El Mamoune, BELKHIRI Rida, EL HARROUTI Imad
  
-===== Topics for the current academic year ===== 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​azure.pdf|Cloud databases and Microsoft Azure}}: Sara Diaz, Buse Ozer 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​xsb.pdf|Deductive databases and XSB}}: Gonçalo Moreira, Kaoutar Chennaf 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​kafka.pdf|Distributed messaging with Apache Kafka}}: René Gómez Londoño, Ankush Sharma 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​dynamodb.pdf|Distributed databases and DynamoDB}}: Elena Ouro, Carlos Badillo 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​hive.pdf|Distributed databases and Apache Hive}}: Ricardo Rojas, Danilo Acosta 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​mongodb.pdf|Document stores and MongoDB}}: Sivaporn Homvanish, Tzu-Man Wu 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​couchbase.pdf|Document stores and CouchBase}}:​ Carlos Martinez Lorenzo, Pablo Molina Mata 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​couchdb.pdf|Document stores and CouchDB}}: Aparna Khire, Mingrui Dong 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​berkeleydb.pdf|Embedded databases and Berkeley DB}}: Ainhoa Zapirain, Nazrin Najafzade 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​memsql.pdf|In-memory databases and MemSQL}}: Haydar Ali Ismail, Dwi Prasetyo Adi Nugroho 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​redis.pdf|Key-value stores and Redis}}: Amritansh Sharma, Haftamu Hailu 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​memcached.pdf|Key-value stores and Memcached}}:​ Nathan Hullebroeck,​ Julien Delbeke 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​cassandra.pdf|NoSQL databases and Cassandra}}:​ Pratham Solanki, Braulio Blanco 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​db4o.pdf|Object-oriented databases and db4o}}: Pinar Turkyilmaz, Annemarie Burger 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​firebase.pdf|Real-time databases and Firebase}}: Pablo Lopez, Maria Gabriela Martinez 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​elasticsearch.pdf|Search engines and ElasticSearch}}:​ Ioannis Prapas, Sokratis Papadopulos 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​sphinx.pdf|Search engines and Sphinx}}: Kevin SEFU, Antonio RAFAELE, Nestor RAMOS PEREZ 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​rasdaman.pdf|Spatial data and Rasdaman}}: Fernando Mendes Stefanini, Evgeny Pozdeev 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​influxdb.pdf|Time series databases and Influx DB}}: Shabana Salmaan, Danish Amjad 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​kdb.pdf|Time series databases with Kdb+}}: Eugen Robert Patrascu, Kunal Arora 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​hbase.pdf|Wide-column databases and Apache HBase}}: Edoardo Conte, Carlos E. Muniz Cuza 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​basex.pdf|XML databases and BaseX}}: Marine Devers, Richard Bauwens 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​solr_1.pdf|Search engine and Solr}}: Maazouz Mehdi, Meire Wouter 
-  * {{:​teaching:​infoh415:​student_projects:​2019:​solr_2.pdf|Search engine and Solr}}: Mulham Aryan, Samia Azzouzi, Kamdem Tagne Thomas Borel 
  
 ===== Examinations from Previous Years ===== ===== Examinations from Previous Years =====
 +  * Academic year 2024-2025 
 +    * {{:​teaching:​infoh415_exam_jan25.pdf|First session}} 
 +  * Academic year 2023-2024 
 +    * {{:​teaching:​infoh415:​infoh415-2024-january.pdf|First session}}
   * Academic year 2016-2017   * Academic year 2016-2017
     * {{:​teaching:​infoh415:​infoh415-2017-january.pdf|First session}}     * {{:​teaching:​infoh415:​infoh415-2017-january.pdf|First session}}
 
teaching/infoh415.1545906666.txt.gz · Last modified: 2018/12/27 11:31 by ezimanyi