teaching:infoh419 [2018/09/17 18:50] ezimanyi
  * [[teaching:infoh419:TP|Exercises Web page]]
===== Group Project =====
[[http://www.tpc.org|TPC]] is a non-profit corporation that defines transaction processing and database benchmarks and disseminates objective, verifiable TPC performance data to the industry. Regarding data warehouses, two TPC benchmarks are relevant:
  * [[http://www.tpc.org/tpcds/|TPC-DS]], the Decision Support Benchmark, which models the decision support functions of a retail product supplier.
  * [[http://www.tpc.org/tpcdi/|TPC-DI]], the Data Integration Benchmark, which models a typical ETL process that loads a data warehouse.
The course project consists of two parts:
  * Part I: Implement the TPC-DS benchmark (deadline 1/11/2018)
  * Part II: Implement the TPC-DI benchmark (deadline 20/12/2018)
You are free to choose the tools on which the two benchmarks will be built. For example, the TPC-DS benchmark can be implemented on SQL Server Analysis Services, Pentaho Analysis Services (aka Mondrian), etc. Similarly, the TPC-DI benchmark could be implemented on SQL Server Integration Services, Pentaho Data Integration, Talend Data Studio, or even with SQL scripts.
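Whatever tool you choose, you will need some way to run the benchmark queries and record their execution times. A minimal sketch of such a timing harness is shown below; note that the schema and queries here are hypothetical placeholders (a single ''store_sales'' table), NOT the official TPC-DS workload, and SQLite merely stands in for whatever database system your group selects.

```python
# Sketch of a benchmark timing harness. The table and queries are
# placeholders, NOT the official TPC-DS schema or query set.
import sqlite3
import time

def run_benchmark(conn, queries):
    """Execute each named query once; return elapsed wall-clock seconds."""
    timings = {}
    for name, sql in queries.items():
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetchall() forces full materialization
        timings[name] = time.perf_counter() - start
    return timings

# Tiny in-memory table standing in for a real data warehouse fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_sales (item_id INTEGER, price REAL)")
conn.executemany("INSERT INTO store_sales VALUES (?, ?)",
                 [(i % 10, float(i)) for i in range(1000)])

queries = {
    "q1_total_revenue": "SELECT SUM(price) FROM store_sales",
    "q2_revenue_by_item":
        "SELECT item_id, SUM(price) FROM store_sales GROUP BY item_id",
}
timings = run_benchmark(conn, queries)
for name, secs in timings.items():
    print(f"{name}: {secs:.6f} s")
```

In your report, measured times should of course come from the benchmark's own queries at a stated scale factor, not from toy data like this.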
The project is carried out in groups of 3 to 4 people; the groups will be the same for both parts. Before you can submit part I of the project, you will have to register a group by sending an email to the lecturer. Please select your group on or before 1/10/2018. The submission deadlines for parts I and II are strict.
The deliverables expected for each part of the project are the following:
  * A report in pdf explaining the essential aspects of your implementation, and
  * A zip file containing the code of your implementation, with all necessary instructions for the lecturer to replicate your implementation.
The project evaluation will count for 30% of your total grade. This may seem low; however, putting effort into the project will help you achieve a better understanding of the course material, which will result in a better score on the written exam, which accounts for the remaining 70% of the grade.
===== Examinations from Previous Years =====