Differences

This shows you the differences between two versions of the page.

teaching:infoh419 [2018/10/17 09:50]
ezimanyi [Groups of the current year]
teaching:infoh419 [2022/09/19 11:25]
ezimanyi [Books]
Line 38:

 ===== Books =====
-  * [[https://www.springer.com/9783642546549|Data Warehouse Systems: Design and Implementation]] by Alejandro A. Vaisman and Esteban Zimányi. Springer, 2014.
+  * [[https://link.springer.com/978-3-662-65167-4|Data Warehouse Systems: Design and Implementation]], second edition, Alejandro A. Vaisman and Esteban Zimányi. Springer, 2022.
   * [[http://www.morganclaypool.com/doi/abs/10.2200/s00299ed1v01y201009dtm009|Multidimensional Databases and Data Warehousing]] by Cristian S. Jensen, Torben Bach Pedersen, and Christian Thomsen. Morgan & Claypool Publishers.
   * [[http://www.mcgraw-hill.co.uk/html/0071610391.html|Data Warehouse Design: Modern Principles and Methodologies]] by Matteo Golfarelli and Stefano Rizzi. McGraw-Hill, 2009
Line 95:

 The project of the course consists of 2 parts:
-  * Part I: Implement the TPC-DS benchmark (deadline 1/11/2018)
+  * Part I: Implement the TPC-DS benchmark (deadline 1/11/2021)
-  * Part II: Implement the TPC-DI benchmark (deadline 20/12/2018)
+  * Part II: Implement the TPC-DI benchmark (deadline 24/12/2021)
 You are free to choose the tools with which the two benchmarks will be implemented. For example, the TPC-DS benchmark could be implemented on SQL Server Analysis Services, Pentaho Analysis Services (aka Mondrian), etc. Similarly, the TPC-DI benchmark could be implemented on SQL Server Integration Services, Pentaho Data Integration, Talend Data Studio, SQL scripts, etc., which then load the data warehouse on a DBMS such as SQL Server, Oracle, PostgreSQL, etc.
  
-Furthermore, both benchmarks can be implemented with several scale factors, which determine the size of the resulting data warehouse. For the purposes of this project you can use the smallest scale factor.
+Furthermore, both benchmarks must be implemented with several scale factors, which determine the size of the resulting data warehouse. You DO NOT need to use the scale factors mentioned in the TPC requirements. The pedagogical objective is for you to learn how to properly perform a benchmark. Therefore, you need to estimate the biggest scale factor that fits on your own computer: this will be your reference scale factor, say 1.0. You will then need 3 smaller scale factors, e.g., 0.1, 0.2, and 0.5 of the full size, in order to see how the performance evolves.
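The scale-factor scheme described in the new revision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`scale_factors`, `time_runs`, `benchmark`) and the `run_at_scale` callback are hypothetical and not part of the course material or of any TPC tool, and taking the median of a few repetitions is one reasonable choice, not a requirement of the project.

```python
import time
from statistics import median

# Fractions of the reference scale factor, as suggested in the text above.
FRACTIONS = (0.1, 0.2, 0.5, 1.0)


def scale_factors(reference_sf, fractions=FRACTIONS):
    """Return the scale factors to benchmark, smallest first."""
    if reference_sf <= 0:
        raise ValueError("reference_sf must be positive")
    # Rounding only hides floating-point noise such as 0.30000000000000004.
    return [round(reference_sf * f, 6) for f in sorted(fractions)]


def time_runs(run_once, repetitions=3):
    """Call run_once() several times; return the median wall-clock seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return median(timings)


def benchmark(reference_sf, run_at_scale, repetitions=3):
    """Map each scale factor to the median time of run_at_scale(sf).

    run_at_scale is a hypothetical callback supplied by the caller, e.g. a
    function that loads the data set generated at that scale factor and
    executes the benchmark queries against the chosen DBMS.
    """
    return {
        sf: time_runs(lambda sf=sf: run_at_scale(sf), repetitions)
        for sf in scale_factors(reference_sf)
    }
```

For instance, if the largest data set that fits on your machine corresponds to a reference scale factor of 10, `scale_factors(10)` yields the four sizes 1.0, 2.0, 5.0 and 10.0, and `benchmark(10, my_runner)` returns one median timing per size, which can then be plotted to see how performance evolves.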
  
-The project is carried out in groups of 3 to 4 persons, which will be the same for the two parts. Before you can submit part I of the project, you will have to register in a group. For this, please send an email to the lecturer with the information about your group by 1/10/2018 at the latest. The submission deadlines for parts I and II are strict.
+The project is carried out in groups of 3-4 persons, which will be the same for the two parts. Before you can submit part I of the project, you will have to register in a group. For this, please send an email to the lecturer with the information about your group by 1/10/2020 at the latest. The submission deadlines for parts I and II are strict.
  
 The deliverables expected for each part of the project are the following: The deliverables expected for each part of the project are the following:
Line 111:
 ===== Groups of the current year =====
  
-  * MySQL: Sara Diaz, Buse Ozer, Pinar Turkyilmaz and Shabana Salmaan
-  * Group 2: Carlos Badillo, Sokratis Papadopoulos, Ioannis Prapas, Gabriela Martinez
-  * Apache Spark, Apache Hive, Kubernetes, Google Cloud Platform (GPC): Ricardo Rojas Ruiz, Annemarie Burger, Danilo J. Acosta Villalobos, Elena Ouro Paz
-  * MariaDB: Gonçalo Moreira, Nezrin Necefzade, Rémy Detobel, Shafagh Kashefzarelialestani
-  * Cloudera Impala, Hadoop, Google Cloud Platform (GPC): Eugen Patrascu, Kunal Arora, Edoardo Conte, Carlos E. Muniz Cuza
-  * Group 6: Evgeny Pozdeev, Fernando Mendes Stefanini, Braulio Blanco Lambruschini, Pablo Jose Lopez Estigarribia
-  * Apache Spark over MySQL: Ankush Sharma, René Gomez, Haftamu Hailu, Kaoutar Chennaf
-  * PostgreSQL: Pablo Molina Mata, Danish Amjad, Carlos Martínez Lorenzo
-  * SQL Server: Sivaporn Homvanish, Tzu-man Wu, Ainhoa Zapirain Mariezcurrena
-  * MemSQL: Amritansh Sharma, Dwi Prasetyo Adi Nugroho, Haydar Ali Ismail, Pratham Solanki
-
+  * SQL Server: Nicole Zafalón, Diogo Rapas, Andrés Espinal, Adam Broniewski
+  * PostgreSQL: Niccolò Morabito, CHUN HAN LI, Víctor Diví, Filip Sotiroski
+  * mySQL: Valada kylynnyk, Yanjian Zhang, Zhicheng Lou, Kainaat Amjid
+  * Oracle: El Achouchi Iliass, Belgada Wassim, Ajouaou Soufiane
+  * SQLite: Laamiri Achraf, Mareghni Nidhal, Kuete Kamta Frank Jordan
+  * mariadb: Tejaswini Dhupad, Himanshu Choudhary, Kamdem Tagne Thomas Borel, Sergio Postigo
+  * Spark SQL: Yi Wu, Hang Yu, Zhiyang Guo, Mohammad Zain Abbas
+  * DB2/Airflow: Md Jamiur Rahman Rifat, Khushnur Binte Jahangir, Asha Said Seif, Pietro Ferrazzi
+  * Microsoft Azure SQL: Davide Rendina, Marita Hernandez, Luiz Fonseca, Zyrako Musaj
+  * Citus: Nazgul K. Rakhimzhanova, Mohammad Ismail Tirmizi, Maël Touret, Wassim Kezai
+  * AWS Aurora: Hind Bakkali, Gaëlle Frauenkron, Mahmut Asım Onat, Salma Salmani
+  * Google BigQuery: Soufian El Bakkali Tamara, Maciej Piekarski, David Silberwasser, Sami Abdul Sater
+  * Impala: Yahya Bakkali, Amirmohammad Fallahi, Maxime Hauwaert, Alexandre Libert
 ===== Examinations from Previous Years =====
  
 
teaching/infoh419.txt · Last modified: 2023/11/20 16:18 by ezimanyi