Differences

This shows you the differences between two versions of the page.

teaching:mfe:is [2014/02/14 12:59]
ezimanyi [From Relational Databases to Linked Open Data]
teaching:mfe:is [2014/03/25 12:44]
svsummer [Efficient computation of simulation for structural indexing]
Line 1: Line 1:
-====== MFE 2013-2014 : Web and Information Systems ======
+====== MFE 2014-2015 : Web and Information Systems ======
  
 ===== Introduction =====
Line 104: Line 104:
  
 Our laboratory performs collaborative research with Euranova R&D (http://euranova.eu/). The list of subjects proposed for this year by Euranova can be found
-{{:teaching:mfe:euranova_master_thesis_2013_2014.pdf|here}}.
+{{:teaching:mfe:mt2014_euranova.pdf|here}}
  
 These subjects include topics on distributed graph processing, processing big data using Map/Reduce, cloud computing, and social networks.
  
   * Contact : [[ezimanyi@ulb.ac.be|Esteban Zimanyi]]
 +===== Structural compression of relational and semantic web databases =====
 +
 +Recent research in database management systems at ULB has shown how to
 +theoretically construct succinct (compressed) representations for
 +relational databases and semantic web databases. The advantage of
 +these succinct representations is that they allow querying directly
 +*on the succinct representation*, without needing to consult the
 +underlying database.
 +
 +The goal of this thesis is to study scalable algorithms for
 +constructing the actual succinct representations. Some in-memory
 +algorithms are already known, but given the large size of typical
 +databases, distributed and out-of-memory alternatives need to be found.
 +
 +
 +  * Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]] ​  
  
 ===== Aspects of Text Analytics and Information Extraction =====
Line 137: Line 153:
  
 Interested? Contact [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
 +
 ===== Models for programming Data Management in the Cloud =====
  
Line 190: Line 207:
  
                                                                                                                                        
- 
-===== Aspects of File and Data Synchronization ===== 
- 
-With the ubiquitous use of mobile computational devices such as 
-laptops and PDAs, it has become increasingly important to be able to 
-synchronize data between these devices. To give a few examples: a new 
-appointment inserted in the calendar on a PDA must become visible on 
-the user's laptop after synchronization (and vice versa); files 
-modified on the laptop have to be synchronized with the Desktop 
-computer, and so on. But what do we do when an appointment is modified 
-on the PDA and at the same time deleted on the laptop? Dealing with 
-such conflicts is the main difficulty in designing good 
-synchronization software. 
- 
-The goal of this thesis is to study, compare, and implement various 
-approaches to file and data synchronizers. This entails studying some 
-of the techniques used by distributed systems and version control 
-systems (such as CVS and Subversion),​ but also requires an 
-investigation of some more recent synchronization proposals like 
-Unison (http://​www.cis.upenn.edu/​~bcpierce/​unison/​index.html) and the 
-so-called "​Lenses"​ proposed in the Harmony project 
-(http://​www.seas.upenn.edu/​~harmony). 
- 
-  * Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]] 
- 
  
 =====Foundations of Data Description Languages=====
Line 240: Line 232:
   * Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
  
-=====Capturing Semantic ​ Web Data from Web Pages===== 
  
- 
-The [[http://​linkeddata.org/​|Linked Open Data]] (LOD) initiative is aimed at extending the Web  by means of publishing various open datasets as RDF,  setting RDF links between data items from different data sources. ​ In spite of  the interest of organization in publishing their data, many of them are not willing to pay the price of devoting working hours or their employees for doing the hard work that preparing and updating these data requires. Therefore, a very interesting and practical problem that arises is how to produce LOD automatically from Web sites. This   ​problem can be tackled if selected and well-defined domains are chosen. ​ 
- 
-  
-In his thesis we propose to select a site of a broadcasting company, and, through intelligent crawling techniques capture data of interest and publish it as RDF data. In a second step, we propose to  use these data to pose queries that involve different nodes of the Web of linked ​ data.  ​ 
-  
- 
-* Contacts :  
-    * [[ezimanyi@ulb.ac.be|Esteban Zimányi]] (CoDE) 
-  
 =====Publishing and Using Spatio-temporal Data on the Semantic Web=====
  
Line 260: Line 241:
 by application providers, that can build attractive and useful applications, in particular, for devices like mobile phones, tablets, etc.
  
-The goals of this thesis are: (i) study the existing proposals for mapping spatio-temporal data into LOD; (ii) apply this mapping to a real world case study (the "Open Semantic Cloud for Brussels" project; (iii) Based on the produced mapping, and using existing applications like the [[http://http://linkedgeodata.org/|Linked Geo Data project]], build applications that make use of LOD for example, to find out which cultural events are taking place at a given time at a given location.
+The goals of this thesis are: (1) study the existing proposals for mapping spatio-temporal data into LOD; (2) apply this mapping to a real-world case study (as was the case for the [[http://www.oscb.be/|Open Semantic Cloud for Brussels]] project); (3) based on the produced mapping, and using existing applications like the [[http://linkedgeodata.org/|Linked Geo Data project]], build applications that make use of LOD, for example, to find out which cultural events are taking place at a given time at a given location.
    
  
-Contacts
-    * [[ezimanyi@ulb.ac.be|Esteban Zimányi]] (CoDE)
+    Contact: [[ezimanyi@ulb.ac.be|Esteban Zimányi]]
  
 =====Extending SPARQL for Spatio-temporal Data Support=====
  
 [[http://www.w3.org/TR/rdf-sparql-query/|SPARQL]] is the W3C standard language to query RDF data over the semantic web. Although syntactically similar to SQL, SPARQL is based on graph matching. In addition, SPARQL is aimed primarily at querying alphanumerical data.
-Therefore, a proposal to extend SPARQL to support spatial data has been presented to the Open Geospatial Consortium. This proposal is called [[http://www.opengeospatial.org/projects/groups/geosparqlswg/|GeoSPARQL]].
+Therefore, a proposal to extend SPARQL to support spatial data, called [[http://www.opengeospatial.org/projects/groups/geosparqlswg/|GeoSPARQL]], has been presented to the Open Geospatial Consortium.
    
-In his thesis we propose to (a) perform an analysis of the current proposal for GeoSARQL; (b) a study of  current implementations of SPARQL that support spatial data; (c) implement simple extensions for SPARQL to support spatial data, and use these language in real-world use cases.
+In this thesis we propose to (1) analyze the current proposal for GeoSPARQL; (2) study current implementations of SPARQL that support spatial data; (3) implement simple extensions for SPARQL to support spatial data, and use these extensions in real-world use cases.
    
  
-Contacts
-    * [[ezimanyi@ulb.ac.be|Esteban Zimányi]] (CoDE)
+   Contact: [[ezimanyi@ulb.ac.be|Esteban Zimányi]]
    
- 
-===OLD Subjects 2011-2012==== 
- 
-===== From Relational Databases to Linked Open Data ===== 
- 
- 
-[[http://​www.w3c.org/​|RDF]] is the [[http://​www.w3c.org/​|W3C]] proposed framework for representing information 
-in the Web. Basically, information in RDF is represented as a set of triples of the form (subject,​predicate,​object). ​ RDF syntax is based on directed labeled graphs, where URIs are used as node labels and edge labels. ​ In spite of the  constant growth in the amount of RDF data available on the Web, and the growing ​ number of applications for these data, most companies still store their data in relational databases. Nevertheless,​ many of these companies are interested in publishing (part of) their data on the Web in RDF format. Further, the [[http://​linkeddata.org/​|Linked Open Data]] (LOD) initiative is aimed at extending the Web  by means of publishing various open datasets as RDF,  setting RDF links between data items from different data sources. ​ This suggests that the problem at hand will be, in the near future, to transform ​ relational to Linked Open Data. 
- 
-The increasing interest in publishing relational data as RDF  resulted in the creation of the W3C RDB2RDF Working Group, which is elaborating a recommendation for mapping relational to RDF data.  This mapping ​ poses challenges not only from a theoretical point of view, but also from a practical one as well. Just to mention a few ones: generating IRIs to represent RDF resources is not trivial, as it is not the representation ​ of keys (e.g., primary and foreign) in the RDF world.  ​ 
- 
-We propose to  develop a Relational to RDF translation framework, using the principles stated in the [[http://​www.w3c.org/​TR/​201/​WD-rdb-direct-mapping-20110324/​|W3C RDB2RDF]] working group document. This framework will be tested ​ using real-world datasets provided by interested partners. Further, our goal is to transform the resulting RDF graph using the LOD principles, ​ which will allow to leverage the value of each partner'​s dataset.  ​ 
- 
-* Contacts :  
-    * [[svsummer@ulb.ac.be|Stijn Vansummeren]] (CoDE) 
- 
- 
-===== Projects and issues management tools for public safety software development ===== 
- 
-Intergraph S.A. (SG&I Division) is the leading provider of software solutions to the Belgian Emergency Services (100, 101 &112 call centers). Intergraph’s responsibility is to deliver high quality software and provide maintenance servicesto the users of these services (police, medical and fire brigades). 
- 
-Intergraph uses several tools (Project Plan, Redmine, SharePoint,​…) to plan new projects and to report and track issues. Each tool has its strengths and its target audience. For example, Redmine is used as a defect tracking tool by the development team. It enables to control the bug fixing workflow from reporting to solution verification by the quality team. Project Plan and SharePoint are used by managers to communicate higher level information like cost, milestones and progress. 
- 
-Intergraph Development Team has the objective to reach CMMI Level 2 (Capability Maturity Model Integration). This level of maturity requests to develop software with processes that are planned and measured. Planning is based on agile Sprint methodology. ​ Measures are for examples effort, tasks burn down, cost, reaction time…We want to integrate these processes – planning and key measures - in our Redmine environment. ​ We want to be able to report automatically project delays to the executive management and costs to the accounting department using the data reported in Redmine. 
- 
- 
-The goal of this thesis is: 
-  - To understand the CMMI Level 2 and make proposals to reach that level. 
-  - To study the feasibility of enhancing the Redmine application with tools that will help us to implement better planning and reporting. 
-  - To design, develop and test these tools. 
-Intergraph S.A. is looking for profiles able to handle such tasks. This thesis can be seen as a main entrance gate to the Company for a permanent position. ​ 
- 
-  * References : 
-    * [[http://​www.intergraph.com/​]] 
-    * [[http://​www.intergraph.com/​global/​be/​]] 
-    * [[http://​www.sei.cmu.edu/​reports/​10tr033.pdf]] 
- 
- 
- 
- 
-===== Automatic Support for Spatio-Temporal Integrity Constraints ===== 
- 
-The Object Constraint Language (OCL), part of the UML standard, is a formal language for defining constraints on UML models. The [[http://​dresden-ocl.sourceforge.net/​|Dresden OCL toolkit]] is an open source software platform for OCL tool support. One of the tools comprising the OCL toolkit is OCL2SQL, an SQL code generator that generates an SQL check constraint, assertion or trigger for an OCL invariant. OCL2SQL can be used and adapted for different relational database systems and different object-to-table mappings. 
- 
-The objective of the project is to extend the toolkit for taking into account spatial, temporal and multi-representation constraints,​ as those proposed by the MADS model. 
- 
-  * Contact : [[ezimanyi@ulb.ac.be|Esteban Zimányi]] ​ 
- 
- 
-===== A database infrastructure for storing and manipulating trajectories ===== 
- 
-Thanks to current sensors and GPS technologies,​ large-scale capture of the evolving position of individual mobile objects has become technically and economically feasible. 
- 
-Typical examples of moving objects include cars, persons and planes equipped with a GPS device, animals bearing a transmitter whose signals are captured by satellites, and parcels tagged with RFIDs. 
- 
-Analysis of trajectory data is the key to a growing number of 
-applications aiming at global understanding and management of complex phenomena that involve moving objects (e.g. worldwide courier distribution,​ city traffic management, bird migration monitoring). ​ 
- 
-This project consists of studying and extending the limited capabilities of commercial data management systems for storing and manipulating the position of moving objects all along their lifespan. 
- 
- 
-  * Contact : [[ezimanyi@ulb.ac.be|Esteban Zimányi]] ​ 
- 
- 
-===== Extending PostGIS for the support of continuous fields ===== 
- 
-PostGIS is an popular open-source database system supporting spatial application applications. ​ 
- 
-Continuous fields are phenomena that are perceived as having a value at each point in space and/or time. Examples of such phenomena include 
-temperature,​ altitude, or land use. In [[http://​code.ulb.ac.be/​dbfiles/​VaiZim2009aincollection.pdf|this paper]] we defined a data type that encapsulates the different operations needed 
-for manipulating continuous fields. The objective of the project is to implement such a data type in the PostGISsystem. ​ 
- 
-  * Contacts :  
-    * [[ezimanyi@ulb.ac.be|Esteban Zimányi]] 
 
teaching/mfe/is.txt · Last modified: 2020/09/29 17:03 by mahmsakr