Differences

This shows you the differences between two versions of the page.

teaching:infoh415 [2019/09/17 09:42]
ezimanyi [Schedule]
teaching:infoh415 [2022/09/28 08:08]
gdejaege [Last important announcement]
Line 2: Line 2:
  
  
 +===== Last important announcement =====
 +All VUB students registered for the course who are not yet in the course's Teams should contact gilles.dejaegere@ulb.be.
 +
 +The first exercise session will take place Thursday 29th between 14h and 16h in UB4.126 (not UB4.130).
 +
 +Before this session, please have a look at the software that must be installed in order to be able to do the exercises. It is indicated on the website: https://cs.ulb.ac.be/public/teaching/infoh415/tp
 +
 +
 +If you are using your own computer, please install and test the software before the exercise session; this usually takes some time. If you plan to use the classroom computers (which is possible but not recommended), I advise you to arrive a few minutes early so that you can start and test all the necessary software before the session begins. Using the classroom computers requires connecting to a server (the instructions are in the slides of the first session).
 ===== Lecturer ===== ===== Lecturer =====
  
Line 32: Line 41:
  
 The course is given during the first semester ​ The course is given during the first semester ​
-  * Lectures on Mondays from pm to pm at the room S.UA4.218 +  * Lectures on Mondays from pm to pm in the K.4.601 (Solbosch campus) 
-  * Exercises on Thursdays from 2 pm to 4 pm at the room S.UB4.130+  * Exercises on Thursdays from 2 pm to 4 pm
  
 +/* 
 {{:​teaching:​infoh415:​infoh415schedule2018.pdf|Schedule}} {{:​teaching:​infoh415:​infoh415schedule2018.pdf|Schedule}}
  
- 
-/*  
   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]   * [[http://​www.google.com/​calendar/​embed?​src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&​ctz=Europe/​Brussels|Online schedule]]
 */ */
 +
 +
 +
 ===== Objectives ===== ===== Objectives =====
  
 Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. Today, databases are moving away from typical management applications and addressing new application areas. To do so, databases must take into account (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications.
 +
 +
 +
 ===== Content ===== ===== Content =====
  
Line 54: Line 68:
 Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL.
  
-==== Object ​Databases ====+==== Graph Databases ====
  
-Object-oriented model. Object persistence. ODMG standard: Object Definition Language and Object Query Language.+...
  
 ==== Spatial Databases ==== ==== Spatial Databases ====
Line 92: Line 106:
   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})   * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:​infoh415:​a_matter_of_time.pdf|version pdf}})
 ===== Links ===== ===== Links =====
-  * Temporal ​databases  +  * Spatial ​databases 
-    * [[http://timecenter.cs.aau.dk/|TimeCenter]], an international research centre for temporal databases. +    * [[https://postgis.net/​workshops/​postgis-intro/​|Introduction to PostGIS]] 
-    * [[http://www.timeconsult.com/Software/​Software.html|TimeDB]], a temporal relational database+    * [[https://​learn.crunchydata.com/postgis|Crunchy Data Interactive PostGIS Learning Portal]] 
 +  * Spatio-temporal ​(or mobility) ​databases 
 +    * [[https://mobilitydb.com/|MobilityDB]]  
   * Object databases   * Object databases
     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.     * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases.
-    * [[http://​www.db4o.com/​|db4o]],​ an open source object database. 
     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database     * [[http://​www.objectstore.com/​datasheet/​index.ssp|ObjectStore]],​ an object database
     * [[http://​www.objectivity.com|Objectivity]],​ an object database     * [[http://​www.objectivity.com|Objectivity]],​ an object database
-    * [[http://​www.versant.com/​|Versant]],​ an object database 
-    * [[http://​www.jade.co.nz/​jade/​|Jade]],​ an object database 
-    * [[http://​sourceforge.net/​projects/​ozone/​|Ozone]],​ an object database 
   * Post-relational databases   * Post-relational databases
-    * [[http://​www.fresher.com/​|Matisse]] 
     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]     * [[http://​www.intersystems.com/​cache/​index.html|Caché]]
-  * Spatial databases 
-    * [[https://​postgis.net/​workshops/​postgis-intro/​|Introduction to PostGIS]]  ​ 
  
 ===== Course Slides ===== ===== Course Slides =====
Line 113: Line 122:
   * {{teaching:​infoh415:​activenotes.pdf|Active databases}}   * {{teaching:​infoh415:​activenotes.pdf|Active databases}}
   * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}}   * {{teaching:​infoh415:​temporalnotes.pdf|Temporal databases}}
-  * {{teaching:​infoh415:​objectnotes.pdf|Object databases}}+  * {{:​teaching:​infoh415:​graphdb-ulb-2021.zip|Graph Notes (2021 version)}} 
 +/*   * {{teaching:​infoh415:​objectnotes.pdf|Object databases}} ​   
 +  * {{:​teaching:​infoh415:​graph_databases_notes.zip|Graph Notes}}*/
   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}   * {{teaching:​infoh415:​spatialnotes.pdf|Spatial databases}}
  
Line 120: Line 131:
  
   * [[teaching:infoh415:TP|Exercises web page]]   * [[teaching:infoh415:TP|Exercises web page]]
 +
 ===== Project ===== ===== Project =====
  
Line 128: Line 140:
 */ */
  
-Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document.+Students, in groups of four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> with <Tool1> and <Tool2>".
 + 
 +Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in two database management systems of its choice (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, explaining the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found, for instance, on the following [[https://db-engines.com/en/ranking|web site]].
 + 
 +It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool against the database requirements of your application. Therefore, you need to determine the set of queries and updates that your application requires and benchmark them with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc., depending on the technology used) to determine whether the tool shows linear or exponential behavior. Please note that you SHOULD NOT create the benchmark data yourself: (1) a huge number of datasets are available on the Internet and (2) alternatively, many data generators are available.
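The question of linear versus exponential behavior can be checked numerically once one mean execution time per dataset size is available. The following is a minimal Python sketch and not part of the course material: the function name `local_growth_exponents` is hypothetical, and the approach (comparing slopes on a log-log scale) is only one reasonable way to read the measurements.

```python
import math
from typing import List, Sequence


def local_growth_exponents(sizes: Sequence[int], times: Sequence[float]) -> List[float]:
    """Return the log-log slope between each pair of consecutive measurements.

    A slope k between two points means time grew roughly like size**k there.
    `sizes` must be strictly increasing and all values must be positive.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs of equal length")
    if any(n <= 0 for n in sizes) or any(t <= 0 for t in times):
        raise ValueError("sizes and times must be positive")

    exponents: List[float] = []
    for i in range(len(sizes) - 1):
        n0, n1 = sizes[i], sizes[i + 1]
        t0, t1 = times[i], times[i + 1]
        if n1 <= n0:
            raise ValueError("sizes must be strictly increasing")
        # Slope of the straight line through the two points on a log-log plot.
        exponents.append(math.log(t1 / t0) / math.log(n1 / n0))
    return exponents
```

Called with the sizes 1K, 10K, 100K and 1M and the corresponding mean times, slopes that stay close to 1 suggest linear behavior, slopes that stay close to some constant above 1 suggest polynomial growth, and slopes that keep increasing from one step to the next suggest faster-than-polynomial (for example exponential) growth.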
 + 
 +As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because it differs from the others: the cache structures must first be filled) and the average of the execution times is computed. A comparison with traditional relational technology (e.g., using PostgreSQL) must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale. Please note that there are MANY standard benchmarks for various database technologies; when one exists for your technology, you should prefer it over reinventing the wheel and creating your own benchmark.
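The measurement procedure described above (n executions, first one discarded, average of the rest) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a prescribed harness: the name `mean_warm_time` is hypothetical, and `operation` stands for any zero-argument callable that runs one of your queries or updates against the system under test.

```python
import statistics
import time
from typing import Callable, List


def mean_warm_time(operation: Callable[[], object], runs: int = 6) -> float:
    """Run `operation` `runs` times and return the mean wall-clock time in seconds.

    The first run is treated as a cold-cache warm-up and excluded from the mean.
    """
    if runs < 2:
        raise ValueError("need at least two runs: one warm-up plus one measured run")

    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()  # e.g. execute one query through your database driver
        timings.append(time.perf_counter() - start)

    # Drop the first (cold) execution, average the remaining ones.
    return statistics.mean(timings[1:])
```

With the default `runs=6`, five timings contribute to the result. The same function can be called once per query and per dataset size, for each of the two tools and for the relational baseline, so that all figures in the report are obtained in the same way.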
  
-Each group will study a database technology ​and illustrate it with an application ​developed ​​​in ​a database management system to be chosen (e.g., Oracle, PostgreSQL, DB2, SQL Server, mySQL, etc..). +The choice of topic and the application ​must be made ​​in ​agreement with the lecturer. The topic should ​not be included ​in the program of the Master in Computer Science and Engineering. The project will be presented ​to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology ​chosen, (2) explain how these foundations are implemented by the database management systems chosen and (3) illustrate all these concepts with the application implemented.
-The topic should be addressed ​in a technical way, to explain ​the underlying technologies. The application must use the specific ​technology ​manipulated.+
  
-The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the programme of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and written report will explain the possibilities offered by the database management system chosen and give a general description of the application implemented.+The duration of the presentation is 45 minutes. It will be structured in three parts of SIMILAR length:
 +   * An introduction to the technology
 +   * An introduction to the two tools, each presented by a subgroup of two persons
 +   * A common assessment of the advantages and disadvantages of both tools, tested in a common example application.
  
 The evaluation of the project focuses on the following criteria: The evaluation of the project focuses on the following criteria:
Line 142: Line 162:
 The project will count for 25% of the final grade. The project will count for 25% of the final grade.
  
-The project must be submitted by **Monday, December 16, 2019**.+The project must be submitted by **Monday, December 12, 2022**. Please send the report and the presentation in PDF format to the lecturer.
  
-===== Examples of topics from the previous academic year ===== +  ​* Cloud databases and Microsoft Azure, AWS, ...
- +
-You can take a look at the [[https://​db-engines.com/​en/​|DB-Engines]] web site to get an idea of the currently available technologies and tools. Examples of previous topics are given next: +
- +
-  * Analytical databases and Endeca +
-  ​* Cloud databases and Microsoft Azure+
   * Column stores and Cassandra, Hbase, ...   * Column stores and Cassandra, Hbase, ...
   * Data warehouses and Apache Hive   * Data warehouses and Apache Hive
-  ​* Deductive Databases and XSB +  * Distributed databases and SQL Server, ​Oracle, Citus, ...
-  ​* Distributed databases and SQL Server, ​DynamoDB, ...+
   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...   * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ...
   * Embedded databases and BerkeleyDB   * Embedded databases and BerkeleyDB
-  * Graph Databases and Neo4J, OrientDB, ... 
   * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ....   * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ....
-  * Key-value stores and Redis, Voldemort, ... +  * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ... 
-  * Multimedia databases and Oracle +  * Multi-model databases and MarkLogic, CosmosDB 
-  * Multi-model databases and MarkLogic +  * NewSQL databases and VoltDB, CockroachDB, ... 
-  * NewSQL databases and VoltDB +  * Object-oriented databases and ObjectBox, Perst
-  * Object-oriented databases and db4o+
   * Real-time databases and Firebase   * Real-time databases and Firebase
   * Search engines and Solr, ElasticSearch,​ Sphinx ...   * Search engines and Solr, ElasticSearch,​ Sphinx ...
-  * Spatial databases and Rasdaman +  * Spatial ​raster ​databases and Rasdaman 
-  * Stream databases and Apache Kafka+  * Stream databases and Apache Kafka, Event Stores
   * Time series databases and Influx DB, Kdb+, ...   * Time series databases and Influx DB, Kdb+, ...
   * XML databases and BaseX   * XML databases and BaseX
- 
  
 ===== Topics for the current academic year ===== ===== Topics for the current academic year =====
  
-To be done 
  
-/*  * {{:teaching:infoh415:student_projects:2019:azure.pdf|Cloud databases and Microsoft Azure}}: Sara Diaz, Buse Ozer */+  * Document databases with MongoDB and CouchBase: Abd Abu Sbei, Hoschek Maren, Gupta Prashant, TBD 
 +  * Document databases with CouchDB and RavenDB: Aissa Abdoul-Aziz, Helin Demirel, Imane Moussaoui, Salma Mekarnia 
 +  * Key-value databases with etcd and Hazelcast: Liliia Aliakberova, Arina Gepalova, Jose Antonio Lorencio Abril, Mariana Mayorga Llano 
 +  * Key-value databases with Redis and ScyllaDB: Mir Wise Khan, Rishika Gupta, Ahmad, Chidiebere Ogbuchi 
 +  * RDF databases with Virtuoso and Apache Jena: Nikola Ivanović, Bogdana Živković, Tianheng Zhou, You Xu 
 +  * Search engines with ElasticSearch and OpenSearch: Muhammad Rizwan Khalid, Sayyor Yusupov, Ali Abusaleh 
 +  * Time Series databases with InfluxDB and Prometheus: Luis Alfredo León, Jezuela Gega, Satria Wicaksono, Isabella Forero 
 +  * Timeseries databases with TimescaleDB and Graphite: Koumudi Ganepola, Adina Bondoc, Zyad Al-Azazi, Alaa Almutawa 
 +  * Wide column stores with Cassandra and HBase: ZHOU Anthony Zhou, Arnaud Cools, TBD, TBD
 ===== Examinations from Previous Years ===== ===== Examinations from Previous Years =====
  
 
teaching/infoh415.txt · Last modified: 2023/12/04 18:14 by ezimanyi