INFO-H-417 : Database Systems Architecture

GENERAL INFORMATION

In contrast to a typical introductory course in database systems, where one learns to design and query relational databases, the goal of this course is to gain fundamental insight into the implementation aspects of systems designed to manage and process large amounts of data. Our objective is twofold: (1) to gain the background required to design and implement future data management and processing systems, and (2) to understand how the performance of practical data management systems can be tuned.

In particular, we take a look under the hood of relational database management systems, with a focus on query and transaction processing. The focus on relational database management systems is motivated by the fact that the algorithms and architectures underlying relational databases have strongly influenced the design of contemporary data processing and management systems: graph databases, in-memory database systems, stream databases, and even NoSQL systems.

With respect to query processing, we study the whole workflow of how a typical relational database management system optimizes and executes SQL queries. This entails an in-depth study of: (1) translating the SQL query into a “logical query plan”; (2) optimizing the logical query plan; (3) implementing each logical operator algorithmically at the physical (disk) level, and using secondary-memory index structures to speed up these algorithms; and (4) translating the logical query plan into a physical query plan using cost-based plan estimation.
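To make steps (1) and (2) concrete, the following minimal sketch builds a logical query plan as a tree of relational-algebra operators and applies one classical rewrite, pushing a selection below a join. The class names (`Scan`, `Select`, `Join`, `Project`), the helper functions, and the toy `Movie`/`Studio` data are all hypothetical and chosen for illustration only; they are not taken from the course material, and a real optimizer applies many more rules.

```python
from dataclasses import dataclass

# Hypothetical logical operators (illustrative only, not the course's notation).
@dataclass(frozen=True)
class Scan:
    relation: str

@dataclass(frozen=True)
class Select:
    attr: str
    value: object
    child: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object
    attr: str  # equi-join on an attribute name shared by both inputs

@dataclass(frozen=True)
class Project:
    attrs: tuple
    child: object

def attributes(plan, schema):
    """Return the set of attribute names a plan produces."""
    if isinstance(plan, Scan):
        return set(schema[plan.relation])
    if isinstance(plan, Select):
        return attributes(plan.child, schema)
    if isinstance(plan, Join):
        return attributes(plan.left, schema) | attributes(plan.right, schema)
    return set(plan.attrs)  # Project

def push_selections(plan, schema):
    """Rewrite rule: move a selection below a join when one input has the attribute."""
    if isinstance(plan, Select):
        child = push_selections(plan.child, schema)
        if isinstance(child, Join):
            if plan.attr in attributes(child.left, schema):
                pushed = push_selections(Select(plan.attr, plan.value, child.left), schema)
                return Join(pushed, child.right, child.attr)
            if plan.attr in attributes(child.right, schema):
                pushed = push_selections(Select(plan.attr, plan.value, child.right), schema)
                return Join(child.left, pushed, child.attr)
        return Select(plan.attr, plan.value, child)
    if isinstance(plan, Join):
        return Join(push_selections(plan.left, schema),
                    push_selections(plan.right, schema), plan.attr)
    if isinstance(plan, Project):
        return Project(plan.attrs, push_selections(plan.child, schema))
    return plan  # Scan

def evaluate(plan, db):
    """Naive in-memory evaluation with bag semantics (no duplicate elimination)."""
    if isinstance(plan, Scan):
        return [dict(row) for row in db[plan.relation]]
    if isinstance(plan, Select):
        return [r for r in evaluate(plan.child, db) if r[plan.attr] == plan.value]
    if isinstance(plan, Join):
        right = evaluate(plan.right, db)
        return [{**l, **r} for l in evaluate(plan.left, db)
                for r in right if l[plan.attr] == r[plan.attr]]
    return [{a: r[a] for a in plan.attrs} for r in evaluate(plan.child, db)]

# Toy schema and data (assumed for this example).
schema = {"Movie": ("title", "year", "studio"), "Studio": ("studio", "city")}
db = {
    "Movie": [{"title": "Alpha", "year": 1999, "studio": "S1"},
              {"title": "Beta", "year": 2005, "studio": "S2"}],
    "Studio": [{"studio": "S1", "city": "Brussels"},
               {"studio": "S2", "city": "Ghent"}],
}

# SELECT title FROM Movie, Studio
# WHERE Movie.studio = Studio.studio AND city = 'Brussels'
logical = Project(("title",),
                  Select("city", "Brussels",
                         Join(Scan("Movie"), Scan("Studio"), "studio")))
optimized = push_selections(logical, schema)
```

Both `evaluate(logical, db)` and `evaluate(optimized, db)` return the same rows, but the rewritten plan filters `Studio` before the join, so the join processes fewer tuples. Choosing how to execute each operator physically, steps (3) and (4), is deliberately left out of this sketch.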

With respect to transaction processing, we study how a typical relational database management system ensures recovery from errors and controls concurrent access to the data. Topics studied in transaction processing include logging, serializability, concurrency control, and their combination.
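As a small illustration of the serializability topic, the sketch below tests whether a schedule is conflict-serializable by building its precedence graph and checking it for cycles. The schedule encoding (a list of `(transaction, action, item)` triples) and the function names are assumptions made for this example, not notation from the course.

```python
def precedence_graph(schedule):
    """Build precedence edges from a schedule of (transaction, action, item) triples.

    action is "r" (read) or "w" (write). Two actions conflict when they come
    from different transactions, touch the same item, and at least one writes.
    """
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))  # t1 must precede t2 in any equivalent serial order
    return edges

def is_conflict_serializable(schedule):
    """A schedule is conflict-serializable iff its precedence graph is acyclic."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge from remaining nodes.
    while nodes:
        sources = {n for n in nodes
                   if not any(v == n and u in nodes for u, v in edges)}
        if not sources:
            return False  # every remaining node has a predecessor: a cycle exists
        nodes -= sources
    return True

# Hypothetical schedules over a single item A.
serial_like = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A")]
interleaved = [("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]
```

Here `is_conflict_serializable(serial_like)` holds because the only edge is T1 → T2, whereas `interleaved` yields both T1 → T2 and T2 → T1 and is therefore rejected.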

Contacts

Organisation

  • The course is taught during the first semester.
  • The list of competences that will be taught during the course and assessed in the exam is available in the course plan.

Course Material

The course uses the book "Database Systems: The Complete Book (second, international edition)" by H. Garcia-Molina, J. D. Ullman, and J. Widom (ISBN-13: 978-0131354289), complemented by course notes made available on this website.

Method of Evaluation

Students are evaluated on both a project to be developed during the semester and a written exam. The project work contributes 6/20 points to the overall score, and the written exam contributes the remaining 14/20 points. Participation in both the project work and the written exam is mandatory for passing the course.

COURSE TRAJECTORY

Lecture 1: Course Introduction and Translation of SQL into the Relational Algebra

  • During lecture 1 we refresh the basic background knowledge on relational database management systems (relations, relational algebra, SQL). To re-acquaint yourself with this background, you are expected to read thoroughly chapter 1, chapter 2 (only sections 2.2 and 2.4), chapter 5 (only sections 5.1 and 5.2), and chapter 6 of the textbook (TCB).
  • During lecture 1 (slides), we present an overview of the architecture of a query compiler (see chapter 16, sections 16.1, 16.3.1 and 16.3.2 in the book) and study the translation of SQL into the extended relational algebra (see course notes for the full translation algorithm).
  • You are expected to solve exercise 1 of the translation exercises (pdf) by the exercise session of Friday 27 September. Exercise 2 offers additional practice, but will not be corrected in class.
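
For orientation, the simplest case of the translation studied in lecture 1, a select-from-where query without subqueries, takes the familiar textbook form below. This is only a sketch: $L$, $C$ and $R_1, \dots, R_n$ are placeholder symbols for the select list, the where-condition and the relations in the from-clause, and the full translation algorithm in the course notes (not reproduced here) handles more than this case.

```latex
\[
\texttt{SELECT } L \texttt{ FROM } R_1, \dots, R_n \texttt{ WHERE } C
\quad\longmapsto\quad
\pi_{L}\bigl(\sigma_{C}(R_1 \times \cdots \times R_n)\bigr)
\]
```

That is, take the Cartesian product of the relations in the from-clause, keep the tuples satisfying the where-condition, and project onto the select list.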
 
teaching/infoh417.1568110263.txt.gz · Last modified: 2019/09/10 12:11 by svsummer