Differences

This shows you the differences between two versions of the page.

teaching:infoh419 [2018/09/01 10:41]
ezimanyi [Grading]
teaching:infoh419 [2018/09/18 11:01]
ezimanyi [Course Slides]
Line 6: Line 6:
   * [[http://cs.ulb.ac.be/members/esteban/|Esteban Zimányi]]
   * <ezimanyi@ulb.ac.be>
-  * Tuesday 2 pm - 4 pm 
-  * Friday 4 pm - 6 pm 
- 
 ===== Volume =====
  
Line 65: Line 62:
   * {{teaching:infoh419:dw00-refresher.pdf|Refresher Databases}}
   * {{teaching:infoh419:dw01-introduction.pdf|Introduction}}
-  * {{teaching:infoh419:dw02-cubes.pdf|Cubes}}
-  * {{teaching:infoh419:dw03-dfm.pdf|Dimension Fact Model}}
+    * {{teaching:infoh419:database_explosion_report.pdf|Database explosion report}}
+    * {{teaching:infoh419:database_explosion.pdf|Database explosion}}
+  * {{teaching:infoh419:dw02-dfm.pdf|Dimension Fact Model}}
   * {{teaching:infoh419:dw04-logicalmodel.pdf|Logical Model}}
   * {{teaching:infoh419:dw05-dimensionchanges.pdf|Dimension Changes}}
Line 90: Line 88:
   * [[teaching:infoh419:TP|Exercices Web page]]
  
-===== Group assignment =====
+===== Group Project =====
 + 
+[[http://www.tpc.org|TPC]] is a non-profit corporation that defines transaction processing and database benchmarks and disseminates objective, verifiable TPC performance data to the industry. Regarding data warehouses, two TPC benchmarks are relevant:
+  * [[http://www.tpc.org/tpcds/|TPC-DS]], the Decision Support Benchmark, which models the decision support functions of a retail product supplier.
+  * [[http://www.tpc.org/tpcdi/|TPC-DI]], the Data Integration Benchmark, which models a typical ETL process that loads a data warehouse.
  
-The assignment is carried out in groups of 3 to 4 people. Before you can submit assignment part I, you will have to register in a group. The link to register a group is included below. Please select your group on or before 25/10/2018.
+The project of the course consists of 2 parts:
+  * Part I: Implement the TPC-DS benchmark (deadline 1/11/2018)
+  * Part II: Implement the TPC-DI benchmark (deadline 20/12/2018)
+You are free to choose the tools on which the two benchmarks will be implemented. For example, the TPC-DS benchmark could be implemented on SQL Server Analysis Services, Pentaho Analysis Services (aka Mondrian), etc. Similarly, the TPC-DI benchmark could be implemented on SQL Server Integration Services, Pentaho Data Integration, Talend Data Studio, SQL scripts, etc., which then load the data warehouse on a DBMS such as SQL Server, Oracle, PostgreSQL, etc.
  
-The assignment consists of 2 parts:
+Furthermore, both benchmarks can be implemented with several scale factors, which determine the size of the resulting data warehouse. For the purposes of this project you can use the smallest scale factor.
  
-  * Part I: Create a conceptual model and translate it to a logical schema (deadline 15/11/2018)
-  * Part II: (deadline 20/12/2018)
-    * Creating ETL scripts for updating the database in SSIS,
-    * Predicting how the size of the data warehouse will grow over time,
-    * Deploying a data cube on top of the data warehouse and creating a report.
+The project is carried out in groups of 3 to 4 persons, which will be the same for the two parts. Before you can submit part I of the project, you will have to register in a group. For this, please send an email to the lecturer with the information about your group by 1/10/2018 at the latest. The submission deadlines for parts I and II are strict.
  
-Assignment part I will be available on 25/10. For the next parts, assignment II will become available right after the submission deadline of assignment part I. The submission deadlines for parts I and II are strict.
+The deliverables expected for each part of the project are the following:
+  * A report in PDF explaining the essential aspects of your implementation, and
+  * A zip file containing the code of your implementation, with all the instructions the lecturer needs to replicate it on standard computing infrastructure.
  
-The assignment evaluation will count for 30% of your total grade. This may seem undervalued; however, putting effort into the assignment will definitely help you achieve a better understanding of the course material, which will result in a better score in the paper exam, which accounts for 70% of the grade.
+The project evaluation will count for 30% of your total grade. This may seem undervalued; however, putting effort into the project will definitely help you achieve a better understanding of the course material, which will result in a better score in the paper exam, which accounts for 70% of the grade.
  
 ===== Examinations from Previous Years =====
 
teaching/infoh419.txt · Last modified: 2023/11/20 16:18 by ezimanyi