Differences

This shows you the differences between two versions of the page.

teaching:projh402 [2020/10/03 14:11]
ezimanyi [K-D-Tree Indexes for MobilityDB]
teaching:projh402 [2020/10/03 17:50]
ezimanyi [VODKA Indexes for MobilityDB]
Line 79: Line 79:
 Indexes are essential in databases for quickly locating data without having to search every row in a table every time a database table is accessed. Thus, an index is an auxiliary data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index. PostgreSQL provides [[https://habr.com/ru/company/postgrespro/blog/441962/|multiple types of indexes]] for various data types.
  
-In MobilityDB two types of indexes have been implemented, [[https://habr.com/en/company/postgrespro/blog/444742/|GiST]] and [[https://habr.com/ru/company/postgrespro/blog/446624/||SP-GiST]]. More precisely, in PostgreSQL, these types of indexes are frameworks for developing multiple types of indexes. Concerning SP-GiST indexes, in MobilityDB we have developed 4-dimensional quad-trees where the dimensions are X, Y, and possibly Z for the spatial dimension and T for the time dimension. An alternative approach would be to use [[https://en.wikipedia.org/wiki/K-d_tree|K-D Trees]]. K-D trees can be implemented in PostgreSQL using the SP-GiST framework and an example [[https://github.com/postgres/postgres/blob/master/src/backend/access/spgist/spgkdtreeproc.c|implementation]] for simple [[https://www.postgresql.org/docs/current/datatype-geometric.html|geometric types]] exists. The goal of the project is to implement K-D indexes for MobilityDB and perform a benchmark comparison between K-D trees and the existing 4-dimensional quad-trees.
+In MobilityDB two types of indexes have been implemented, namely, [[https://habr.com/en/company/postgrespro/blog/444742/|GiST]] and [[https://habr.com/ru/company/postgrespro/blog/446624/|SP-GiST]]. More precisely, in PostgreSQL, these types of indexes are frameworks for developing multiple types of indexes. Concerning SP-GiST indexes, in MobilityDB we have developed 4-dimensional quad-trees where the dimensions are X, Y, and possibly Z for the spatial dimension and T for the time dimension. An alternative approach would be to use [[https://en.wikipedia.org/wiki/K-d_tree|K-D Trees]]. K-D trees can be implemented in PostgreSQL using the SP-GiST framework and an example [[https://github.com/postgres/postgres/blob/master/src/backend/access/spgist/spgkdtreeproc.c|implementation]] for simple [[https://www.postgresql.org/docs/current/datatype-geometric.html|geometric types]] exists.
+The goal of the project is to implement K-D indexes for MobilityDB and perform a benchmark comparison between K-D trees and the existing 4-dimensional quad-trees.
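To illustrate the difference between the two structures: a 4-dimensional quad-tree node splits all four dimensions at once (up to 16 children), whereas a K-D tree node splits a single dimension and cycles through the dimensions level by level. The following is a minimal in-memory sketch of that idea in Python over (x, y, z, t) points. It is not the SP-GiST implementation and not MobilityDB code; the names ''KDNode'', ''build_kdtree'', and ''range_search'' are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]  # assumed layout for this sketch: (x, y, z, t)


@dataclass
class KDNode:
    point: Point                      # point stored at this node
    axis: int                         # dimension this node splits on
    left: Optional["KDNode"] = None   # points with coordinate <= point[axis]
    right: Optional["KDNode"] = None  # points with coordinate >= point[axis]


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a K-D tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(
        point=ordered[mid],
        axis=axis,
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )


def range_search(node: Optional[KDNode], lo: Point, hi: Point,
                 found: Optional[List[Point]] = None) -> List[Point]:
    """Collect all points p with lo[i] <= p[i] <= hi[i] in every dimension."""
    if found is None:
        found = []
    if node is None:
        return found
    if all(l <= c <= h for l, c, h in zip(lo, node.point, hi)):
        found.append(node.point)
    a = node.axis
    # Descend only into the subtrees that can intersect the query range.
    if lo[a] <= node.point[a]:
        range_search(node.left, lo, hi, found)
    if node.point[a] <= hi[a]:
        range_search(node.right, lo, hi, found)
    return found


points = [(1, 1, 0, 10), (2, 5, 0, 20), (4, 2, 0, 30), (7, 8, 0, 40), (9, 3, 0, 50)]
tree = build_kdtree(points)
hits = range_search(tree, lo=(0, 0, 0, 15), hi=(5, 6, 0, 35))
```

A benchmark along the lines described above would compare how many nodes each structure visits for the same queries. This sketch only shows the single-dimension splitting rule and how it prunes subtrees.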
 + 
 +===== VODKA Indexes for MobilityDB ===== 
 + 
 +MobilityDB implements [[https://habr.com/en/company/postgrespro/blog/444742/|GiST]] and [[https://habr.com/ru/company/postgrespro/blog/446624/|SP-GiST]] indexes for temporal types. These indexes are based on bounding boxes, that is, the inner and leaf levels of the indexes store a bounding box that keeps the minimum and maximum values of each of the dimensions X, Y, Z (if available), and T, where X, Y, and Z are the spatial dimensions and T is the temporal dimension. The reason for this is that a temporal value (for example, a moving point representing the movement of a vehicle) can have thousands of timestamped points, and indexing all of these points for each vehicle in a table is very inefficient. By keeping only the bounding box, the index can quickly filter the rows of a table, and a more detailed analysis is then performed only on the rows selected by the index.
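The filter-then-refine idea described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not MobilityDB code: moving points are 2D (X, Y, T, with Z omitted), the boxes are kept in a plain dictionary rather than an index tree, and the refinement step tests only the recorded instants without interpolating between them. All names (''Box'', ''bounding_box'', ''filter_candidates'', ''refine'') are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Instant = Tuple[float, float, float]  # assumed layout: (x, y, t)


@dataclass(frozen=True)
class Box:
    xmin: float
    xmax: float
    ymin: float
    ymax: float
    tmin: float
    tmax: float

    def overlaps(self, other: "Box") -> bool:
        """True if the two boxes intersect in every dimension."""
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax
                and self.tmin <= other.tmax and other.tmin <= self.tmax)

    def contains(self, inst: Instant) -> bool:
        x, y, t = inst
        return (self.xmin <= x <= self.xmax
                and self.ymin <= y <= self.ymax
                and self.tmin <= t <= self.tmax)


def bounding_box(trip: List[Instant]) -> Box:
    """Summarize a whole trip by its minimum and maximum X, Y, and T values."""
    xs, ys, ts = zip(*trip)
    return Box(min(xs), max(xs), min(ys), max(ys), min(ts), max(ts))


def filter_candidates(boxes: Dict[str, Box], window: Box) -> List[str]:
    """Filter step: compare one box per trip, never the individual points."""
    return [key for key, box in boxes.items() if box.overlaps(window)]


def refine(trips: Dict[str, List[Instant]], candidates: List[str],
           window: Box) -> List[str]:
    """Refine step: inspect the actual instants, but only for the candidates."""
    return [key for key in candidates
            if any(window.contains(inst) for inst in trips[key])]


trips = {
    "A": [(0, 0, 0), (10, 10, 10)],   # box overlaps the window, no instant inside
    "B": [(5, 0.5, 3), (5, 0.8, 4)],  # really passes through the window
    "C": [(20, 20, 0), (30, 30, 5)],  # box is far from the window
}
boxes = {key: bounding_box(trip) for key, trip in trips.items()}
window = Box(xmin=4, xmax=6, ymin=0, ymax=1, tmin=0, tmax=10)

candidates = filter_candidates(boxes, window)  # trips kept by the box test
result = refine(trips, candidates, window)     # trips kept after the detailed test
```

Trip A shows why the second step is needed: its bounding box overlaps the query window although the trip itself never enters it, so the box test alone would report a false positive.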
  
  
  
 
teaching/projh402.txt · Last modified: 2022/09/06 10:39 by ezimanyi