This shows you the differences between two versions of the page.
teaching:infoh415 [2017/12/19 18:49] ezimanyi [Topics for the current academic year] |
teaching:infoh415 [2023/10/07 14:51] ezimanyi [Topics for the current academic year] |
||
---|---|---|---|
Line 2: | Line 2: | ||
+ | ===== Last important announcement ===== | ||
+ | Dear all, | ||
+ | |||
+ | The first practical session will take place next week on Thursday (see the schedule for the room and time). Please have your computer ready with the required tools installed. If you do not have a laptop, I advise you to pair up with somebody who has one for the practical session. | ||
+ | |||
+ | The first three sessions will be on spatial databases (see [[teaching:infoh415:TP|Exercises Web page]])! | ||
+ | |||
+ | See you next Thursday, | ||
+ | |||
+ | Boris. | ||
===== Lecturer ===== | ===== Lecturer ===== | ||
Line 10: | Line 20: | ||
===== Teaching Assistant ===== | ===== Teaching Assistant ===== | ||
- | * [[http://code.ulb.ac.be/code.people.php?id=1285|Dhananjay Ipparthi]] ([[dhananjay.ipparthi@ulb.ac.be ]]) | + | * [[boris.coquelet@ulb.be|Boris Coquelet]] |
Line 32: | Line 42: | ||
The course is given during the first semester | The course is given during the first semester | ||
- | * Lectures on Thursdays from 2 pm to 4 pm at the room S.UA4.218 | + | * Lectures on Mondays from 4 pm to 6 pm in room UB4.132 (Solbosch campus) |
- | * Exercises on Mondays from 4 pm to 6 pm at the room S.UB4.130 | + | * Exercises on Thursdays from 2 pm to 4 pm |
- | + | ||
- | /* | + | |
- | {{:teaching:infoh415:infoh415-1415-courseplan-rev.1.pdf|Schedule}} | + | |
- | */ | + | |
/* | /* | ||
+ | {{:teaching:infoh415:infoh415schedule2018.pdf|Schedule}} | ||
+ | |||
* [[http://www.google.com/calendar/embed?src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&ctz=Europe/Brussels|Online schedule]] | * [[http://www.google.com/calendar/embed?src=dug2eihu8tqtnkjhmtuupj0je0%40group.calendar.google.com&ctz=Europe/Brussels|Online schedule]] | ||
*/ | */ | ||
+ | |||
+ | |||
+ | |||
===== Objectives ===== | ===== Objectives ===== | ||
Today, databases are moving away from typical management applications and addressing new application areas. For this, databases must consider (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. | Today, databases are moving away from typical management applications and addressing new application areas. For this, databases must consider (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications. | ||
+ | |||
+ | |||
+ | |||
===== Content ===== | ===== Content ===== | ||
- | ==== Active Databases ==== | + | ==== Spatial Databases ==== |
- | Taxonomy of concepts. Applications of active databases: integrity maintenance, derived data, replication. Design of active databases: termination, confluence, determinism, modularisation. | + | Spatial data and applications. Space ontology. Conceptual modeling of spatial aspects. Manipulation of spatial data with standard SQL. |
+ | |||
+ | ==== Mobility Databases ==== | ||
+ | |||
+ | ... | ||
==== Temporal Databases ==== | ==== Temporal Databases ==== | ||
Line 56: | Line 73: | ||
Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. | Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. | ||
- | ==== Object Databases ==== | + | ==== Active Databases ==== |
- | Object-oriented model. Object Persistance. ODMG standard: Object Definition Language and Object Query Language. | + | Taxonomy of concepts. Applications of active databases: integrity maintenance, derived data, replication. Design of active databases: termination, confluence, determinism, modularisation. |
- | ==== Spatial Databases ==== | ||
- | |||
- | Spatial data and applications. Space ontology. Conceptual modeling of spatial aspects. Manipulation of spatial data with standard SQL. | ||
Line 78: | Line 92: | ||
* Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001 | * Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001 | ||
* Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002 | * Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002 | ||
- | * R.G.G. Cattel et al., The Object Database Standard: ODMG 3.0, Morgan Kaufmann, 2000 | + | * Ian Robinson, Jim Webber, Emil Eifrem, Graph Databases, 2nd Edition, O'Reilly Media, 2015 |
* Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001 | * Philippe Rigaux, Michel Scholl, Agnès Voisard, Spatial Databases: With Application to GIS, Morgan Kaufmann, 2001 | ||
Line 88: | Line 102: | ||
* E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):16-21, 2006. ({{http://code.ulb.ac.be/dbfiles/Zim2006article.pdf|version pdf}}) | * E. Zimányi, Temporal Aggregates and Temporal Universal Quantifiers in Standard SQL, SIGMOD Record, 35(2):16-21, 2006. ({{http://code.ulb.ac.be/dbfiles/Zim2006article.pdf|version pdf}}) | ||
* Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):34-43, 2012. ({{teaching:infoh415:TempFeaturesSQL2011.pdf|version pdf}}) | * Krishna Kulkarni, Jan-Eike Michels, Temporal features in SQL:2011, SIGMOD Record, 41(3):34-43, 2012. ({{teaching:infoh415:TempFeaturesSQL2011.pdf|version pdf}}) | ||
+ | * Michael H. Böhlen, Anton Dignös, Johann Gamper, Christian S. Jensen, Temporal Data Management: An Overview, Proc. of the 7th European Summer School on Business Intelligence and Big Data, eBISS 2017, Bruxelles, Belgium, LNBIP 324, Springer 2018. ({{teaching:infoh415:bohlen.pdf|version pdf}}) | ||
* Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:infoh415:teradata_enabling_temporal.pdf|version pdf}}) | * Gregory Sannik, Fred Daniels, Enabling the Temporal Data Warehouse, Teradata White paper. ({{teaching:infoh415:teradata_enabling_temporal.pdf|version pdf}}) | ||
* Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:infoh415:teradata_temporal_case_study.pdf|version pdf}}) | * Richard T. Snodgrass, A Case Study of Temporal Data, Teradata White paper. ({{teaching:infoh415:teradata_temporal_case_study.pdf|version pdf}}) | ||
* Teradata, Temporal Table Support. ({{teaching:infoh415:teradata_temporal_support.pdf|version pdf}}) | * Teradata, Temporal Table Support. ({{teaching:infoh415:teradata_temporal_support.pdf|version pdf}}) | ||
Line 94: | Line 108: | ||
* IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:infoh415:a_matter_of_time.pdf|version pdf}}) | * IBM, A Matter of Time: Temporal Data Management in DB2 for z/OS. ({{teaching:infoh415:a_matter_of_time.pdf|version pdf}}) | ||
===== Links ===== | ===== Links ===== | ||
- | * Temporal databases | + | * Spatial databases |
- | * [[http://timecenter.cs.aau.dk/|TimeCenter]], an international research centre for temporal databases. | + | * [[https://postgis.net/workshops/postgis-intro/|Introduction to PostGIS]] |
- | * [[http://www.timeconsult.com/Software/Software.html|TimeDB]], a temporal relational database | + | * [[https://learn.crunchydata.com/postgis|Crunchy Data Interactive PostGIS Learning Portal]] |
+ | * Mobility databases | ||
+ | * [[https://mobilitydb.com/|MobilityDB]] | ||
* Object databases | * Object databases | ||
* [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases. | * [[http://www.odbms.org/|ODBMS.ORG]], portal of resources about object databases. | ||
- | * [[http://www.db4o.com/|db4o]], an open source object database. | ||
* [[http://www.objectstore.com/datasheet/index.ssp|ObjectStore]], an object database | * [[http://www.objectstore.com/datasheet/index.ssp|ObjectStore]], an object database | ||
* [[http://www.objectivity.com|Objectivity]], an object database | * [[http://www.objectivity.com|Objectivity]], an object database | ||
- | * [[http://www.versant.com/|Versant]], an object database | ||
- | * [[http://www.jade.co.nz/jade/|Jade]], an object database | ||
- | * [[http://sourceforge.net/projects/ozone/|Ozone]], an object database | ||
* Post-relational databases | * Post-relational databases | ||
- | * [[http://www.fresher.com/|Matisse]] | ||
* [[http://www.intersystems.com/cache/index.html|Caché]] | * [[http://www.intersystems.com/cache/index.html|Caché]] | ||
===== Course Slides ===== | ===== Course Slides ===== | ||
- | * {{teaching:infoh415:activenotes.pdf|Active databases}} | ||
- | * {{teaching:infoh415:temporalnotes.pdf|Temporal databases}} | ||
- | * {{teaching:infoh415:objectnotes.pdf|Object databases}} | ||
* {{teaching:infoh415:spatialnotes.pdf|Spatial databases}} | * {{teaching:infoh415:spatialnotes.pdf|Spatial databases}} | ||
+ | * Mobility databases | ||
+ | * {{teaching:infoh415:temporalnotes.pdf|Temporal databases}} | ||
+ | * {{teaching:infoh415:activenotes.pdf|Active databases}} | ||
+ | /* * {{:teaching:infoh415:graphdb-ulb-2021.zip|Graph Notes (2021 version)}} */ | ||
+ | /* * {{teaching:infoh415:objectnotes.pdf|Object databases}} | ||
+ | * {{:teaching:infoh415:graph_databases_notes.zip|Graph Notes}}*/ | ||
Line 120: | Line 134: | ||
* [[teaching:infoh415:TP|Exercises Web page]] | * [[teaching:infoh415:TP|Exercises Web page]] | ||
+ | |||
===== Project ===== | ===== Project ===== | ||
Line 128: | Line 143: | ||
*/ | */ | ||
- | Students, in groups of two, will realize a project in a topic relevant to advanced databases. Examples of topics are given in the next section of this document. | + | Students, in groups of four, will carry out a project on a topic relevant to advanced databases. Examples of topics are given in the next section of this document. Please note that the template for these topics is "<Technology> with <Tool1> and <Tool2>". |
+ | |||
+ | Each group will study a database technology (e.g., document stores, time series databases, etc.) and illustrate it with an application developed in two database management systems to be chosen (e.g., SQL Server, PostgreSQL, MongoDB, etc.). The topic should be addressed in a technical way, to explain the foundations of the underlying technology. The application must use the chosen technology. Examples of technologies and tools can be found in the following [[https://db-engines.com/en/ranking|web site]]. | ||
+ | |||
+ | It is important to understand that the objective of the project is NOT to develop an application with a GUI. The objective is to benchmark the proposed tool against the database requirements of your application. Therefore, it is necessary to determine the set of queries and updates that your application requires and run a benchmark with, e.g., 1K, 10K, 100K, and 1M "objects" (rows, documents, nodes, etc. depending on the technology used) to determine whether the execution time of the tool grows linearly or exponentially with the data size. Please note that you SHOULD NOT generate data for the benchmark yourself, since you can find on the Internet (1) a huge number of available datasets and (2) many available data generators. | ||
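One possible way to judge the growth behavior described above is to fit a straight line to the measured times on a log-log scale. A slope close to 1 suggests linear growth, a slope close to 2 suggests quadratic growth, and a slope that keeps rising as the sizes increase hints at worse-than-polynomial behavior. The sketch below is only an illustration: the function name `loglog_slope` and the sample timings are hypothetical and are not part of the course material.

```python
import math
from typing import Sequence


def loglog_slope(sizes: Sequence[int], times: Sequence[float]) -> float:
    """Return the least-squares slope of log(time) against log(size).

    `sizes` are the numbers of objects (e.g., 1K, 10K, 100K, 1M) and
    `times` the corresponding average execution times in seconds.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need at least two (size, time) pairs of equal length")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares for a single predictor.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical timings, invented for illustration only.
    sizes = [1_000, 10_000, 100_000, 1_000_000]
    measured = [0.012, 0.11, 1.3, 12.5]
    print(f"estimated growth exponent: {loglog_slope(sizes, measured):.2f}")
```

Four data points give only a rough estimate, so the slope is best read as an indication and reported together with the raw timings.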
+ | |||
+ | As usual when performing benchmarks, the queries and updates are executed n times (e.g., 6 times, where the first execution is discarded because it differs from the others: the cache structures must first be filled) and the average of the remaining execution times is computed. A comparison with traditional relational technology (e.g., using PostgreSQL) must be provided to show that the chosen tool is THE technology of choice for your application, better than all other alternatives, and that it will perform correctly when the system is deployed at full scale. Please note that there are MANY standard benchmarks for various database technologies; when one exists for your technology, you should use it rather than reinventing the wheel by creating your own benchmark. | ||
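The measurement protocol described above (run n times, ignore the first run, average the rest) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `mean_without_warmup` and `time_operation` are hypothetical, and the callable passed in stands for whatever query or update your chosen database driver executes.

```python
import statistics
import time
from typing import Callable, List, Sequence


def mean_without_warmup(durations: Sequence[float]) -> float:
    """Average all durations except the first one (the warm-up run)."""
    if len(durations) < 2:
        raise ValueError("need one warm-up run plus at least one measured run")
    return statistics.mean(durations[1:])


def time_operation(operation: Callable[[], object], runs: int = 6) -> float:
    """Execute `operation` `runs` times and return the mean time in seconds.

    The first execution only fills the caches, so it is excluded
    from the average.
    """
    durations: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return mean_without_warmup(durations)


if __name__ == "__main__":
    # Placeholder workload; replace it with a call that runs one of
    # your benchmark queries against the system under test.
    average = time_operation(lambda: sorted(range(100_000), reverse=True))
    print(f"average over 5 measured runs: {average:.6f} s")
```

Reporting the spread of the measured runs (for example with `statistics.stdev`) next to the mean makes it easier to spot unstable measurements.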
- | Each group will study a database technology and illustrate it with an application developed in a database management system to be chosen (e.g., Oracle, PostgreSQL, DB2, SQL Server, mySQL, etc..). | + | The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the program of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and the report will (1) explain the foundations of the technology chosen, (2) explain how these foundations are implemented by the database management systems chosen and (3) illustrate all these concepts with the application implemented. |
- | The topic should be addressed in a technical way, to explain the underlying technologies. The application must use the specific technology manipulated. | + | |
- | The choice of topic and the application must be made in agreement with the lecturer. The topic should not be included in the programme of the Master in Computer Science and Engineering. The project will be presented to the lecturer and the fellow students at the end of the semester. This presentation will be supported by a slideshow. A written report containing the contents of the presentation is also required. The presentation and written report will explain the possibilities offered by the database management system chosen and give a general description of the application implemented. | + | The duration of the presentation is 45 minutes. It will be structured in three parts of SIMILAR length: |
+ | * An introduction to the technology | ||
+ | * An introduction to the two tools, each presented by a subgroup of two persons | ||
+ | * A common assessment of the advantages and disadvantages of both tools tested in a common example application. | ||
The evaluation of the project focuses on the following criteria: | The evaluation of the project focuses on the following criteria: | ||
Line 142: | Line 165: | ||
The project will count for 25% of the final grade. | The project will count for 25% of the final grade. | ||
- | The project must be submitted by **Monday, December 18, 2017**. | + | The project must be submitted by **Monday, December 11, 2023**. Please send the report and the presentation in PDF format to the lecturer. |
- | ===== Examples of topics from the previous academic year ===== | + | * Cloud databases and Microsoft Azure, AWS, ... |
- | + | ||
- | * Analytical databases and Endeca | + | |
- | * Cloud databases and Microsoft Azure | + | |
* Column stores and Cassandra, HBase, ... | * Column stores and Cassandra, HBase, ... | ||
- | * Database Security and Oracle | + | * Data warehouses and Apache Hive |
- | * Deductive Databases and XSB | + | * Distributed databases and SQL Server, Oracle, Citus, ... |
- | * Distributed databases and SQL Server, DynamoDB, ... | + | |
* Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ... | * Document stores and Cloudant, Couchbase, CouchDB, MongoDB, RavenDB, RethinkDB, ... | ||
* Embedded databases and BerkeleyDB | * Embedded databases and BerkeleyDB | ||
- | * Graph Databases and Neo4J, OrientDB, ... | + | * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, Memcached, ... |
- | * In-memory databases and Kdb+, MemSQL, Oracle TimesTen, ... | + | * Key-value stores and BerkeleyDB, DynamoDB, Redis, Voldemort, ... |
- | * Key-value stores and Redis, Voldemort, ... | + | * Multi-model databases and MarkLogic, CosmosDB |
- | * Multimedia databases and Oracle | + | * NewSQL databases and VoltDB, CockroachDB, ... |
- | * Multi-model databases and MarkLogic | + | * Object-oriented databases and ObjectBox, Perst |
- | * NewSQL databases and VoltDB | + | |
- | * Object-oriented databases and db4o | + | |
- | * Object-relational mappings and Entity Framework, Hibernate, Java Data Objects, ... | + | |
* Real-time databases and Firebase | * Real-time databases and Firebase | ||
- | * Spatial databases and SQL Server | + | * Search engines and Solr, ElasticSearch, Sphinx ... |
- | * Spatial 3D Databases and PostgreSQL | + | * Spatial raster databases and Rasdaman |
+ | * Stream databases and Apache Kafka, Event Stores | ||
+ | * Time series databases and Influx DB, Kdb+, ... | ||
* XML databases and BaseX | * XML databases and BaseX | ||
- | |||
===== Topics for the current academic year ===== | ===== Topics for the current academic year ===== | ||
- | * Kaïs Albichari, Tanguy d'Hose: {{:teaching:mongodb_2017.pdf|Document stores and MongoDB}} | + | * Cloud databases with Microsoft Azure and AWS: Maria Camila Salazar, Valerio Rocca, Ludovica Caiola, Simon Coessens |
- | * Alexis Reynouard, Rémy Detobel: {{:teaching:elasticsearch_2017.pdf|Search engines and Elastic Search}} | + | * Time Series DBMS with InfluxDB and Kdb: Gian Tejada Gargate, Gabriel Lozano Pinzón, José Carlos Lozano Dibildox, Enxhi Nushi |
- | * Tiffany Ong Lopez, Sergio Ruiz Sainz: In-memory databases and Apache Ignite | + | * Document Stores with CouchDB and MongoDB: Aryan Gupta, Dilbar Isakova, Hareem Raza, Muhammad Qasim Khan |
- | * Sofia Yfantidou, Noor Zehra: Time series DBs and InfluxDB | + | * Graph Databases with Neo4J and JanusGraph: Gabriela Kaczmarek, Berat Furkan Koçak, Jakub Kwiatkowski, Arijit Samal |
- | * Todi Thanasi, Lev Denisov: NoSQL databases and Cassandra | + | * Distributed Databases with SQL Server and Oracle: Sony Shrestha, Aayush Paudel, MD Kamrul Islam, Shofiyyah Nadhiroh |
- | * Mi Zhou, Prabhdeep Minhas: XML Databases and BaseX | + | * Search engines with Elasticsearch and Solr: Benjamin Gold, Quentin Demonceau, Nils Van Es Ostos, David García Morillo |
- | * Lucie Bauwin, Nicolas Baudoux: Real-time databases and Firebase | + | * Data Warehousing with Google BigQuery and Snowflake: Yutao Chen, Qianyun Zhuang, Min Zhang, Ziyong Zhang |
- | * Antoine Vandevenne, Akira Baes: Document stores and RethinkDB | + | * Distributed databases with Apache Cassandra and Citus: Catalina Correa, Vassili Papadakis, Paeg Hing Leong, Mohamed Bouchkhachakh |
- | * Marc Garnica, Batuhan Tuter: Stream databases and PipelineDB | + | * Key-value Stores with Redis and Amazon DynamoDB: Dionisius Mayr, Herma Elezi, Rana İşlek, Thomas Suau |
- | * Maksim Hrytsenia, Rui Liu: Column stores and HBase | + | |
- | * Kumar Kshitij, Arthur Valingot: Stream databases and StreamSQL | + | |
- | * Ozge Koroglu, Anna Turu Pi: {{:teaching:neo4jj_2017.pdf|Graph databases and Neo4J}} | + | |
- | * Kashif Rabbani, Ivan Putera Masli: NewSQL Databases and CockroachDB | + | |
- | * Jayanthi Kambayatughar, Marie Elisabeth Heinrich: Cloud databases and Microsoft Azure | + | |
- | * Dagoberto Herrera, Keneth Ubeda: {{:teaching:opentsdb_2017.pdf|Time series databases and OpenTSDB}} | + | |
- | * Bruno Baldez Correa, Yue Wang: Object-relational mapping tools and Hibernate | + | |
- | * Raisa Uku, Fatemeh Shafiee: Key-value stores and Redis | + | |
- | * Dany Efila: Multimedia databases and Oracle | + | |
- | * Anastasiia Zavolozhina, Ferdiansyah Dolot: Object oriented databases and Db4o | + | |
- | * Batra Shubham, Liccardo Nathan: Embedded databases and BerkeleyDB | + | |
- | * Yasin Arslan, Jacky Trinh: Document stores and RavenDB | + | |
- | * Alex Buléon, Antoine Chédin: Multi-model databases and OrientDB | + | |
- | * Beyens Ziad, Nougba Hamza: Document stores and CouchDB | + | |
- | * Aleksei Karetnikov, David Pieschacon: Graph stores and ArangoDB | + | |
- | * Dany-Simone Efila Efila, Michel Noucha: Document stores and Cloudant | + | |
- | * George Kagramanyan, Léni Poliseno: Document stores and Couchbase | + | |
- | * Hajji Issam, Toure Ibrahim: newSQL databases and nuoDB | + | |
- | * Madrane Sofiane, Pierre-Alexandre Bourdais: Object-oriented embedded database and Perst | + | |
- | + | ||
+ | /* | ||
+ | * Microsoft Azure and Google Cloud SQL: Marques Correia Tiago, Kellian Germain, Sébastien Arte, Nehili Adel | ||
+ | * Document databases with ArangoDB and MarkLogic: Mir Wise Khan, Rishika Gupta, Ahmad, Chidiebere Ogbuchi | ||
+ | * Document databases with MongoDB and CouchBase: Abd Abu Sbei, Hoschek Maren, Gupta Prashant, TBD | ||
+ | * Document databases with CouchDB and RavenDB: Aissa Abdoul-Aziz, Helin Demirel, Imane Moussaoui, Salma Mekarnia | ||
+ | * Embedded databases with BerkeleyDB and CouchBase Lite: Talhaoui Yassin, Arfani Abdessamad, Faek Ilias, Adegnon Kokou | ||
+ | * Key-value databases with etcd and Hazelcast: Liliia Aliakberova, Arina Gepalova, Jose Antonio Lorencio Abril, Mariana Mayorga Llano | ||
+ | * Key-value databases with OrientDB and Memcached: Mustapha Ayadi, Valentin De Baene, Soumaya Izmar, Yi Zhu | ||
+ | * Object-oriented databases with ObjectBox and Perst: Belgada Naoufal, El Hamri Ayoub, Akroune Sami, Sif Eddine Boughris | ||
+ | * RDF databases with Virtuoso and Apache Jena: Nikola Ivanović, Bogdana Živković, Tianheng Zhou, You Xu | ||
+ | * Search engines with ElasticSearch and OpenSearch: Muhammad Rizwan Khalid, Sayyor Yusupov, Ali Abusaleh, Ali Belyazid | ||
+ | * Search engines databases with Solr and Manticore Search: Rachel Aouad Albshara, Loïc Cordeiro Fonseca, Quentin Magron, Dang Phi L. Pham | ||
+ | * Stream Databases with PipelineDB and HStreamDB: Idil Dikbas, Ehsan Gifani, TBD, TBD | ||
+ | * Time Series databases with InfluxDB and KDB: Luis Alfredo León, Jezuela Gega, Satria Wicaksono, Isabella Forero | ||
+ | * Timeseries databases with TimescaleDB and QuestDB: Koumudi Ganepola, Adina Bondoc, Zyad Al-Azazi, Alaa Almutawa | ||
+ | * Wide column stores with Cassandra and HBase: Anthony Zhou, Arnaud Cools, Damien Decleire, Thomas Dudziak | ||
+ | */ | ||
===== Examinations from Previous Years ===== | ===== Examinations from Previous Years ===== | ||