Differences

This shows you the differences between two versions of the page.

teaching:projh402 [2020/10/01 11:24]
ezimanyi [Projects in Mobility Databases]
teaching:projh402 [2020/10/01 16:37]
ezimanyi [Visualization Moving Objects on the Web]
Line 20: Line 20:
  
  
-===== Visualization Moving Objects on the Web =====
+===== Visualization of Moving Objects on the Web =====
  
 <TBD>
Line 26: Line 26:
  
 ===== Implementing TSBS on MobilityDB =====
-which includes devising a spatio-temporal bucket function equivalent to time_bucket from TimescaleDB. 
  
-<TBD>
+The Time Series Benchmark Suite ([[https://github.com/timescale/tsbs|TSBS]]) is a collection of Go programs that generate datasets and then benchmark the read and write performance of various time series databases. This benchmark has been developed by [[https://www.timescale.com/|TimescaleDB]], which is a time series extension of PostgreSQL.
 + 
 +A significant addition of TimescaleDB to PostgreSQL is the [[https://blog.timescale.com/blog/simplified-time-series-analytics-using-the-time_bucket-function/|time_bucket]] function. This function partitions the timeline into user-defined interval units that are used for aggregating data.
 + 
 +The project consists of two parts. The first is to implement a multidimensional generalization of the time_bucket function that allows the user to partition the spatial and/or temporal domain of a table into units (or tiles) that can be used for aggregating data. The second is to perform a benchmark comparison of TimescaleDB and MobilityDB.
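To make the idea of a multidimensional bucket concrete, here is a minimal Python sketch. It is not part of either revision and is neither the TimescaleDB nor the MobilityDB implementation. The function names, the planar `(x, y, t)` point layout, the square grid cell, and the bucket origin are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed bucket origin; any fixed instant would do for this sketch.
ORIGIN = datetime(2000, 1, 1, tzinfo=timezone.utc)


def time_bucket(width, t, origin=ORIGIN):
    """Return the start of the bucket of size `width` that contains `t`."""
    n = (t - origin) // width  # floor division of two timedeltas gives an int
    return origin + n * width


def space_time_tile(x, y, t, cell_size, width, origin=ORIGIN):
    """Return a (column, row, bucket start) key identifying the tile of a point.

    Hypothetical generalization: a regular grid of square cells in space
    combined with fixed-width buckets in time.
    """
    return (
        math.floor(x / cell_size),
        math.floor(y / cell_size),
        time_bucket(width, t, origin),
    )


def count_per_tile(points, cell_size, width):
    """Aggregate (x, y, t) points by tile; here the aggregate is a count."""
    return Counter(space_time_tile(x, y, t, cell_size, width) for x, y, t in points)
```

Grouping by the tile key plays the same role as grouping by the result of time_bucket in a query: every point falls into exactly one tile, and aggregates are computed per tile.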
  
  
Line 34: Line 38:
 A distributed database is an architecture in which multiple database instances on different machines are integrated in order to form a single database server. Both the data and the queries are then distributed over these database instances. This architecture is effective in deploying big databases on a cloud platform.
  
-MobilityDB is engineered as an extension of PostgreSQL. AWS supports PostgreSQL databases in Amazon RDS for PostgreSQL and in Amazon Aurora. The goal of this project is to integrate MobilityDB with these products. The key outcomes are a comprehensive assessment of which MOD API can/cannot be distributed, and an assessment of the performance gain. These outcomes should serve as a base for a thesis project to achieve effective integration.
+MobilityDB is engineered as an extension of PostgreSQL. AWS supports PostgreSQL databases in [[https://aws.amazon.com/rds/postgresql/|Amazon RDS]] for PostgreSQL and in [[https://aws.amazon.com/rds/aurora/postgresql-features/|Amazon Aurora]]. The goal of this project is to integrate MobilityDB with these products. The key outcomes are a comprehensive assessment of which MOD API can/cannot be distributed, and an assessment of the performance gain. These outcomes should serve as a base for a thesis project to achieve effective integration.
  
  
Line 43: Line 47:
  
 ===== Map-matching as a Service =====
-When GPS tracks typically contain errors, because the GPS receiver
+GPS location tracks typically contain errors, as the GPS points will normally be some meters away from the true position. If we know that the movement happened on a street network, e.g., a bus or a car, then we can correct this by placing the points back on the street. Fortunately, there are algorithms for this, called map-matching. There are also a handful of open source systems that do map-matching. It remains, however, difficult for end users to use them, because they involve non-trivial installation and configuration effort. Preparing the base map that will be used in the matching is also an issue for users.
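The correction step described above can be illustrated with the simplest geometric approach: snapping each point to the closest street segment. This sketch is not part of either revision and is not the algorithm of any particular map-matching system, since practical systems usually also take the sequence of points and the network topology into account. Planar coordinates and the function names are assumptions.

```python
import math


def project_on_segment(p, a, b):
    """Return the point of segment a-b that is closest to p (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: both endpoints coincide
        return a
    # Parameter of the orthogonal projection, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_network(p, segments):
    """Snap p to the nearest location on any of the given street segments."""
    candidates = (project_on_segment(p, a, b) for a, b in segments)
    return min(candidates, key=lambda q: math.dist(p, q))
```

Snapping points independently can assign consecutive points to different parallel streets, which is one reason why more elaborate algorithms exist.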
 + 
 +The goal of this project is to build an architecture for a map-matching service. The challenges are that the GPS data arrives in different formats, and that map-matching is a time-consuming algorithm. This architecture should thus allow different input formats, and should be able to automatically scale according to the request rate. Another key outcome of this project is to compare the existing map-matching implementations, and to discuss their suitability for real-world problems.
 + 
 +Links: 
 +  * [[https://​github.com/​bmwcarit/​barefoot|Barefoot]] 
 +  * [[https://​valhalla.readthedocs.io/​en/​latest/​api/​map-matching/​api-reference/​|Valhalla Map Matching API]]  
 +  * [[https://​github.com/​graphhopper/​map-matching|GraphHopper]] 
 +  * [[https://​github.com/​cyang-kth/​fmm|Fast Map Matching]]
  
  
 ===== Geospatial Trajectory Data Cleaning =====
 +Data cleaning is essential preprocessing for analysing the data and extracting meaningful insights. Real data will typically include outliers, inconsistencies,​ missing data, repeated transactions possibly with different keys, and other kinds of acquisition errors. In geospatial trajectory data, there are even more sources of error, such as GPS inaccuracies. ​
  
 +The goal of this project is to survey the state of the art in geospatial trajectory data cleaning, both model-based and machine learning. The work also includes prototyping and empirically evaluating a selection of these methods in the MobilityDB system, and on different real datasets. These outcomes should serve as a base for a thesis project to enhance geospatial trajectory data cleaning.
  
 ===== Geospatial Trajectory Similarity Measure =====
 +One of the main functions for a wide range of application domains is to measure the similarity between two moving objects' trajectories. This is desirable for similarity-based retrieval, classification, clustering and other querying and mining tasks over moving objects' data. The existing movement similarity measures can be classified into two classes: (1) spatial similarity that focuses on finding trajectories with similar geometric shapes, ignoring the temporal dimension; and (2) spatio-temporal similarity that takes into account both the spatial and the temporal dimensions of movement data.
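The difference between the two classes can be seen in a small Python sketch, which is not part of either revision. It uses the discrete Hausdorff distance over sample points as an example of a spatial measure, and the mean distance at common timestamps as an example of a spatio-temporal measure. The `(t, x, y)` sample layout and the assumption that both trajectories share timestamps are simplifications made for illustration.

```python
import math


def hausdorff(a, b):
    """Discrete Hausdorff distance between the sample points; time is ignored."""

    def directed(p, q):
        return max(min(math.dist(u[1:], v[1:]) for v in q) for u in p)

    return max(directed(a, b), directed(b, a))


def mean_sync_distance(a, b):
    """Mean distance between positions observed at the same timestamps."""
    b_by_time = {t: (x, y) for t, x, y in b}
    dists = [math.dist((x, y), b_by_time[t]) for t, x, y in a if t in b_by_time]
    if not dists:
        raise ValueError("the trajectories share no timestamps")
    return sum(dists) / len(dists)


# The same street traversed in opposite directions, as (t, x, y) samples.
forward = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]
backward = [(0, 2, 0), (1, 1, 0), (2, 0, 0)]
```

For these two trajectories the spatial measure is zero because the shapes coincide, while the spatio-temporal measure is positive because the objects are at different places at the same time.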
  
 +The goal of this project is to survey the state-of-the-art methods in trajectory similarity and to prototype them in MobilityDB. Since it is a complex problem, these outcomes should serve as a base for a thesis project to propose effective and efficient trajectory similarity measures.
 ===== Spatiotemporal k-Nearest Neighbour (kNN) Queries =====
 +An example of continuous kNN is when the GPS device of a vehicle initiates a query to find the three closest gas stations to the vehicle at any time instant during its trip from source to destination. According to the location of the vehicle, the set of three nearest gas stations can change. The result is thus a set of intervals, where every interval is associated with a set of three gas stations. The challenge in this type of query is to find an efficient incremental way of evaluation.
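The shape of the result, a list of intervals each associated with a set of stations, can be sketched in Python as follows. This is not part of either revision, and it is a naive baseline that recomputes the neighbours at every sampled position, not the efficient incremental evaluation the project asks for. The function name, the data layout, and planar coordinates are assumptions.

```python
import math


def knn_intervals(trip, stations, k=3):
    """Return a list of (t_start, t_end, set of k nearest stations).

    trip: list of (t, x, y) samples sorted by t.
    stations: dict mapping a station name to its (x, y) position.
    Consecutive samples with the same neighbour set are merged into one interval.
    """
    result = []
    for t, x, y in trip:
        ranked = sorted(stations, key=lambda s: math.dist((x, y), stations[s]))
        nearest = frozenset(ranked[:k])
        if result and result[-1][2] == nearest:
            start, _, _ = result[-1]
            result[-1] = (start, t, nearest)  # extend the current interval
        else:
            result.append((t, t, nearest))  # the neighbour set changed
    return result
```

With sampling, the interval boundaries fall on sample times, whereas the true change of the neighbour set happens somewhere between two samples.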
  
 +The goal of the project is to survey the state of the art in continuous kNN queries, and to prototype selected methods in MobilityDB. Since it is a complex problem, these outcomes should serve as a base for a more elaborate thesis project.
  
  
 
teaching/projh402.txt · Last modified: 2022/09/06 10:39 by ezimanyi