Differences

This shows you the differences between two versions of the page.

teaching:infoh415 [2017/12/19 20:43]
ezimanyi [Topics for the current academic year]
teaching:infoh415 [2022/09/20 10:19]
ezimanyi [Last important announcement]
Line 1: Line 1:
 ====== INFO-H-415: Advanced Databases ====== ====== INFO-H-415: Advanced Databases ======
  
 +
 +===== Last important announcement =====
 +All VUB students registered for the course who are not yet on the course's Teams should contact gilles.dejaegere@ulb.be
  
 ===== Lecturer ===== ===== Lecturer =====
Line 10: Line 13:
 ===== Teaching Assistant ===== ===== Teaching Assistant =====
  
-  * [[http://code.ulb.ac.be/​code.people.php?​id=1285|Dhananjay Ipparthi]] ([[dhananjay.ipparthi@ulb.ac.be ​]])+  * [[Gilles.Dejaegere@ulb.ac.be|Gilles Dejaegere]]
  
  
Line 32: Line 35:
  
 The course is given during the first semester ​ The course is given during the first semester ​
-  * Lectures on Thursdays ​from pm to pm at the room S.UA4.218 +  * Lectures on Mondays ​from pm to pm in the K.4.601 (Solbosch campus) 
-  * Exercises on Mondays ​from pm to pm at the room S.UB4.130 +  * Exercises on Thursdays ​from pm to pm
- +
-/* +
-{{:​teaching:​infoh415:​infoh415-1415-courseplan-rev.1.pdf|Schedule}} +
-*/ +
  
 /*  /* 
 +{{:​teaching:​infoh415:​infoh415schedule2018.pdf|Schedule}}
 +
   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]
 */ */
 +
 +
 +
 ===== Objectives ===== ===== Objectives =====
  
 Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications.
 +
 +
 +
 ===== Content ===== ===== Content =====
  
Line 56: Line 62:
 Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL.
  
-==== Object ​Databases ====+==== Graph Databases ====
  
-Object-oriented model. Object Persistence. ODMG standard: Object Definition Language and Object Query Language. +...
  
 ==== Spatial Databases ==== ==== Spatial Databases ====
Line 78: Line 84:
   * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001   * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001
   * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002   * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002
-  * R.G.G. Cattel et al., The Object Database Standard: ODMG 3.0, Morgan Kaufmann, 2000 +  * Ian Robinson, Jim Webber, Emil Eifrem, Graph Databases, 2nd Edition, O'Reilly Media, 2015
   * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001   * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001
  
Line 88: Line 94:
   * E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):​16-21,​ 2006. ({{http://​code.ulb.ac.be/​dbfiles/​Zim2006article.pdf|version pdf}})   * E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):​16-21,​ 2006. ({{http://​code.ulb.ac.be/​dbfiles/​Zim2006article.pdf|version pdf}})
   * Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):​34-43,​ 2012. ({{teaching:​infoh415:​TempFeaturesSQL2011.pdf|version pdf}})   * Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):​34-43,​ 2012. ({{teaching:​infoh415:​TempFeaturesSQL2011.pdf|version pdf}})
-  * Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:​infoh415:​teradata_enabling_temporal.pdf|version pdf}})+  ​* Michael H. Böhlen, Anton Dignös, Johann Gamper, Christian S. Jensen, Temporal Data Management: An Overview, Proc. of the 7th European Summer School on Business Intelligence and Big Data, eBISS 2017, Bruxelles, Belgium, LNBIP 324, Springer 2018. ({{teaching:​infoh415:​bohlen.pdf|version pdf}})  ​* Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:​infoh415:​teradata_enabling_temporal.pdf|version pdf}})
   * Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:​infoh415:​teradata_temporal_case_study.pdf|version pdf}})   * Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:​infoh415:​teradata_temporal_case_study.pdf|version pdf}})
   * Teradata, Temporal Table Support. ({{teaching:​infoh415:​teradata_temporal_support.pdf|version pdf}})   * Teradata, Temporal Table Support. ({{teaching:​infoh415:​teradata_temporal_support.pdf|version pdf}})
Line 94: Line 100:
   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})
 ===== Links ===== ===== Links =====
-  * Temporal ​databases  +  * Spatial ​databases 
-    * [[http://timecenter.cs.aau.dk/|TimeCenter]], an international research centre for temporal databases. +    * [[https://postgis.net/​workshops/​postgis-intro/​|Introduction to PostGIS]] 
-    * [[http://www.timeconsult.com/Software/​Software.html|TimeDB]], a temporal relational database+    * [[https://​learn.crunchydata.com/postgis|Crunchy Data Interactive PostGIS Learning Portal]] 
 +  * Spatio-temporal ​(or mobility) ​databases 
 +    * [[https://mobilitydb.com/|MobilityDB]]  
   * Object databases   * Object databases
     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.
-    * [[http://​www.db4o.com/​|db4o]],​ an open source object database. 
     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database
     * [[http://​www.objectivity.com|Objectivity]],​ an object database     * [[http://​www.objectivity.com|Objectivity]],​ an object database
-    * [[http://​www.versant.com/​|Versant]],​ an object database 
-    * [[http://​www.jade.co.nz/​jade/​|Jade]],​ an object database 
-    * [[http://​sourceforge.net/​projects/​ozone/​|Ozone]],​ an object database 
   * Post-relational databases   * Post-relational databases
-    * [[http://​www.fresher.com/​|Matisse]] 
     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]
  
Line 113: Line 116:
   * {{teaching:​infoh415:​activenotes.pdf|Active databases}}   * {{teaching:​infoh415:​activenotes.pdf|Active databases}}
   * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}}   * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}}
-  * {{teaching:​infoh415:​objectnotes.pdf|Object databases}}+  * {{:​teaching:​infoh415:​graphdb-ulb-2021.zip|Graph Notes (2021 version)}} 
 +/*   * {{teaching:​infoh415:​objectnotes.pdf|Object databases}} ​   
 +  * {{:​teaching:​infoh415:​graph_databases_notes.zip|Graph Notes}}*/
   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}
  
Line 128: Line 133:
 */ */
  
-Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document. +Students, in groups of two or four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> and <Tool>" for groups of 2 students and "<Technology> with <Tool1> and <Tool2>" for groups of 4 students.
  
-Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., Oracle, PostgreSQL, DB2, SQL Server, mySQL, etc..). +Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in a database management system of its choice (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, explaining the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found, for instance, on the following [[https://db-engines.com/en/ranking|web site]].
-The topic should be addressed in a technical way, to explain the underlying technologies. The application must use the specific technology manipulated.
  
-The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the programme of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and written report will explain the possibilities offered by the database management system chosen and give a general description of the application implemented. +It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool against the database requirements of your application. It is therefore necessary to determine the set of queries and updates that your application requires and to run a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc., depending on the technology used) to determine whether the tool's execution time grows linearly or exponentially with the data size. Please note that you SHOULD NOT generate random data for the benchmark: on the Internet you can find (1) a large number of available datasets and (2) many data generators.
 + 
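One way to judge whether measured times grow linearly with the number of objects is to fit a line to log(time) against log(size). A slope close to 1 suggests linear growth and a slope close to 2 suggests quadratic growth, while exponential growth shows up as a slope that keeps increasing from one pair of sizes to the next. The sketch below is only an illustration: the function name `loglog_slope` and the sample timings are hypothetical placeholders, not measurements from any tool.

```python
import math


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size).

    `sizes` and `times` are equally long sequences of positive numbers,
    e.g. object counts and mean execution times in seconds.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    sizes = [1_000, 10_000, 100_000, 1_000_000]
    # Placeholder timings (seconds) invented for illustration only;
    # replace them with the means measured in your own benchmark.
    placeholder_times = [0.002, 0.021, 0.19, 2.1]
    print(f"estimated growth exponent: {loglog_slope(sizes, placeholder_times):.2f}")
```

Reporting the exponent next to the raw timings makes the comparison between the chosen tool and the relational baseline easier to read than a table of times alone.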
 +As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because the cache structures must first be filled) and the average of the remaining execution times is computed. A comparison with traditional relational technology (e.g., using PostgreSQL) must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale. Please note that there are MANY standard benchmarks for various database technologies; where one applies, you should use it rather than reinventing the wheel by creating your own benchmark.
 + 
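The measurement procedure described above (repeat each query, discard the first cold-cache run, average the rest) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `mean_time` and `build_table`, the `item` table, and the use of an in-memory SQLite database as a stand-in are all hypothetical. A real project would load an existing dataset or a standard benchmark into the tool under study and into the relational baseline.

```python
import sqlite3
import statistics
import time


def mean_time(run, repetitions=6):
    """Call `run` `repetitions` times and return the mean wall-clock time
    in seconds, ignoring the first (cache-warming) execution."""
    if repetitions < 2:
        raise ValueError("need at least 2 repetitions to discard the warm-up run")
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings[1:])


def build_table(n_rows):
    """Create an in-memory SQLite table with `n_rows` placeholder rows.

    The rows are deterministic filler used only to make the sketch runnable;
    they are not a substitute for a real dataset.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO item (id, value) VALUES (?, ?)",
        ((i, i % 100) for i in range(n_rows)),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for n_rows in (1_000, 10_000, 100_000):
        conn = build_table(n_rows)
        query = lambda: conn.execute(
            "SELECT value, COUNT(*) FROM item GROUP BY value"
        ).fetchall()
        print(n_rows, mean_time(query))
        conn.close()
```

Keeping the timing helper independent of the database driver lets the same function wrap queries against both the chosen tool and the PostgreSQL baseline.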
 +The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the program of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology chosen, (2) explain how these foundations are implemented by the database management system chosen, and (3) illustrate all these concepts with the application implemented.
 + 
 +For 2-student groups, the duration of the presentation is 30 minutes. It will be structured in two parts of SIMILAR length:
 +   * An introduction to the technology 
 +   * An introduction to the tool illustrated with an example application assessing its advantages and disadvantages. 
 + 
 +For 4-student groups, the duration of the presentation is 45 minutes. It will be structured in three parts of SIMILAR length:
 +   * An introduction to technologies presented jointly by the two groups
 +   * An introduction to the two tools, one presented by each group
 +   * A common assessment of the advantages and disadvantages of both tools tested in a common example application.
  
 The evaluation of the project focuses on the following criteria: The evaluation of the project focuses on the following criteria:
Line 142: Line 159:
 The project will count for 25% of the final grade. The project will count for 25% of the final grade.
  
-The project must be submitted by **Monday, December 18, 2017**. +The project must be submitted by **Monday, December 13, 2021**. Please send the report and the presentation in PDF format to the lecturer.
  
-===== Examples of topics from the previous academic year ===== 
- 
-  * Analytical databases and Endeca 
   * Cloud databases and Microsoft Azure   * Cloud databases and Microsoft Azure
   * Column stores and Cassandra, Hbase, ...   * Column stores and Cassandra, Hbase, ...
-  * Database Security ​and Oracle +  * Data warehouses ​and Apache Hive 
-  * Deductive Databases and XSB +  * Distributed databases and SQL Server, ​Oracle, Citus, ...
-  * Distributed databases and SQL Server, ​DynamoDB, ...+
   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...
   * Embedded databases and BerkeleyDB   * Embedded databases and BerkeleyDB
-  ​* Graph Databases and Neo4J, OrientDB, ... +  * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, .... 
-  * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, .... +  * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ...
-  * Key-value stores and Redis, Voldemort, ... +  * Multi-model databases and MarkLogic, CosmosDB
-  * Multimedia databases and Oracle +  * NewSQL databases and VoltDB, CockroachDB, ...
-  * Multi-model databases and MarkLogic +  * Object-oriented databases and ObjectBox, Perst
-  * NewSQL databases and VoltDB +
-  * Object-oriented databases and db4o +
-  * Object-relational mappings and Entity Framework, Hibernate, Java Data Objects, ...
   * Real-time databases and Firebase   * Real-time databases and Firebase
-  * Spatial databases and SQL Server +  ​* Search engines and Solr, ElasticSearch,​ Sphinx ... 
-  * Spatial 3D Databases ​and PostgreSQL+  ​* Spatial ​raster ​databases and Rasdaman 
 +  * Stream databases and Apache Kafka, Event Stores 
 +  * Time series databases ​and Influx DB, Kdb+, ...
   * XML databases and BaseX   * XML databases and BaseX
- 
  
 ===== Topics for the current academic year ===== ===== Topics for the current academic year =====
  
-  * Kaïs Albichari, Tanguy d'Hose: {{:teaching:mongodb_2017.pdf|Document stores and MongoDB}} +  * Analytical databases with Apache Druid and ClickHouse: Andrzej Krzysztof Pietrusiak, Tripat Kaur, Viktor Stavrinopoulos, Deven Ramani
-  * Alexis Reynouard, Rémy Detobel: {{:teaching:elasticsearch_2017.pdf|Search engines and Elastic Search}} +  * Cloud databases and Microsoft Azure SQL: Davide Rendina, Margarita Hernandez
-  * Tiffany Ong Lopez, Sergio Ruiz Sainz: {{:​teaching:​ignite_2017.pdf|In-memory ​databases and Apache Ignite}} +  * Column ​databases ​with Cassandra ​and HBase: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir, ​ Hind Bakkali and Gaëlle Frauenkron 
-  * Sofia Yfantidou, Noor Zehra: {{:teaching:influxdb_2017.pdf|Time series DBs and InfluxDB}} +  * Column stores and Apache Kudu: Pei Liao, Minxing Jiang
-  * Todi Thanasi, Lev Denisov: {{:teaching:cassandra_2017.pdf|NoSQL databases and Cassandra}} +  * Data warehouses and Apache Hive: Nicole Zafalón, Andrés Espinal
-  * Mi Zhou, Prabhdeep Minhas: {{:teaching:basex_2017.pdf|XML Databases and BaseX}} +  * Data Warehouses with Redshift and Google BigQuery: Manar EL AMRANI, Hamza MAHMOUDI, Salma SALMANI, Cédric HANSSENS
-  * Lucie Bauwin, Nicolas Baudoux: {{:​teaching:​firebase_2017.pdf|Real-time ​databases and Firebase}} +  * Distributed ​databases ​with Citus and DynamoDB: Asha Seif, Kainaat Amjid, Loïc Caudron, Matteo Snellings 
-  * Antoine Vandevenne, Akira Baes: Document stores and RethinkDB +  * Distributed databases with Apache Ignite: Fan Chen, Mathieu Pardon
-  * Marc Garnica, Batuhan Tuter: {{:​teaching:​pipelinedb_2017.pdf|Stream ​databases ​and PipelineDB}} +  * Distributed ​databases ​with RethinkDB: Thapa Darshan, Sami Akroune 
-  * Maksim Hrytsenia, Rui Liu: Column ​stores and HBase +  * Document ​stores ​with CouchBase ​and CouchDB: Mohammadreza Amini, Ossoama Benaissa, Zheng Ren, Adriana Sirbu 
-  * Kumar Kshitij, Arthur Valingot: Stream databases ​and StreamSQL +  * Document stores ​and Firestore: Luca De Santos, Sacha Keserovic ​ 
-  * Ozge Koroglu, Anna Turu Pi: {{:teaching:neo4jj_2017.pdf|Graph databases and Neo4J}} +  * Document stores and MongoDB: Hang Yu, Zhiyang Guo
-  * Kashif Rabbani, Ivan Putera Masli: NewSQL Databases and CockroachDB +  * Document stores and Supabase: Shady Al Shoha, Nabil El Ouahabi
-  * Jayanthi Kambayatughar,​ Marie Elisabeth Heinrich: {{:​teaching:​azure_2017.pdf|Cloud ​databases and Microsoft Azure}} +  * Embedded ​databases and BerkeleyDB: Starygin Evgueniy, Ndele-A-Mulenghe Mashini 
-  * Dagoberto Herrera, Keneth Ubeda: {{:​teaching:​opentsdb_2017.pdf|Time series ​databases and OpenTSDB}} +  * In-memory ​databases and Memcached: Diogo Repas and Sandra Hillergren 
-  * Bruno Baldez Correa, Yue Wang: Object-relational mapping tools and Hibernate +  * Key-value databases with DynamoDB: Aline Desmet, Chloé Dekeyser
-  * Raisa Uku, Fatemeh Shafiee: {{:​teaching:​redis_2017.pdf|Key-value ​stores ​and Redis}} +  * Key-value ​databases with Cloud bigtable ​and Redis: Luiz Fonseca, Zyrako Musaj, Yanjian Zhang and Zhicheng Luo 
-  * Dany Efila: ​Multimedia databases and Oracle +  * Multimedia databases and Oracle: Wassim Belgada, Imestir Ibrahim 
-  * Anastasiia Zavolozhina,​ Ferdiansyah Dolot: Object oriented ​databases and Db4o +  * Multimodel ​databases and ArangoDB: David Silberwasser,​ Sami Abdul Sater 
-  * Batra Shubham, Liccardo Nathan: {{:​teaching:​berkeleydb_2017.pdf|Embedded ​databases and BerkeleyDB}} +  * Multimodel ​databases and MarkLogic: Yassine Hodaibi, Jean-Jacques Debilde 
-  * Yasin Arslan, Jacky Trinh: {{:​teaching:​ravendb_2017.pdf|Document stores ​and RavenDB}} +  * NewSQL databases with VoltDB ​and CockroachDB:​ Ali Imam Manzer, Maciej Piekarski, Johan Gjini, Nabil Souissi, ​ 
-  * Alex Buléon, Antoine Chédin: Multi-model databases and OrientDB +  * Object-oriented ​databases ​with ObjectBox ​and Perst: Filip Sotiroski, Niccolo Morabito, Vlada Kylynnyk, Pietro Ferrazi 
-  * Beyens Ziad, Nougba Hamza: Document stores ​and CouchDB +  * Real-time databases ​and Firebase: Himanshu Choudhary, Sergio Postigo, Tejaswini dhuppad 
-  * Aleksei Karetnikov, David Pieschacon: {{:​teaching:​arangodb_2017.pdf|Graph stores ​and ArangoDB}} +  * Search engines with Apache Solr and ElasticSearch:​ Pap Sanou, Szymon Swirydowicz,​ Alexandre Chapelle, Nicolas Dardenne 
-  * Dany-Simone Efila Efila, Michel Noucha: Document stores ​and Cloudant +  * Spatial raster databases ​and Rasdaman: Adam Broniewski, Victor Divi 
-  * George Kagramanyan, Léni Poliseno: Document stores and Couchbase +  * Stream databases and Apache Kafka, Event Stores: Nazgul Rakhimzhanova
-  * Hajji Issam, Toure Ibrahim: {{:​teaching:​nuodb_2017.pdf|newSQL ​databases and nuoDB}} +  * Time series ​databases ​with Influx DB and Kdb+: Mohammad Zain Abbas, Muhammad Ismail, Yi Wu, Chonghan Li 
-  * Madrane Sofiane, Pierre-Alexandre Bourdais: Object-oriented embedded database and Perst +  * Time series databases and TimescaleDB: Dumitru Negru, Brice Petit
-  * Kirubel Yaekob, Yasmine Daoud: {{:teaching:opentsdb_2017.pdf|Database security and SQL Server}} +  * XML Databases and BaseX: Maxime Renversez, Mael Touret
  
  
 
teaching/infoh415.txt · Last modified: 2023/12/04 18:14 by ezimanyi