MFE 2009-2010: Information Systems and Semantic Web

Introduction

One of the research areas of the Department of Computer & Decision Engineering concerns information systems. For several years the department has been involved in the development of the MADS conceptual model, an extension of the Entity-Relationship model for representing spatial, temporal, and multi-representation features. One of the characteristics of the MADS model is that it covers both data definition and query and update facilities. More recently, the department has been working on the development of the MultiDim model, an extension of the Entity-Relationship model for representing multidimensional data, i.e., the data contained in data warehouses. Finally, the department is also actively involved in the Semantic Web, especially in the areas of ontologies, web services, and XML querying.

The subjects presented below cover these areas. Note that this list of subjects is not exhaustive; students interested in these topics are invited to propose original subjects.

Automatic Support for Spatio-Temporal Integrity Constraints

The Object Constraint Language (OCL), part of the UML standard, is a formal language for defining constraints on UML models. The Dresden OCL toolkit is an open-source software platform for OCL tool support. One of the tools in the toolkit is OCL2SQL, an SQL code generator that produces an SQL check constraint, assertion, or trigger for an OCL invariant. OCL2SQL can be used and adapted for different relational database systems and different object-to-table mappings.

The objective of the project is to extend the toolkit to take into account spatial, temporal, and multi-representation constraints, such as those proposed by the MADS model.
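As an illustration of what such a generator does, the sketch below (plain Java, not the Dresden toolkit's actual API; all names are invented) renders a simple attribute invariant and a spatial invariant as SQL check constraints:

```java
// Minimal sketch (not the Dresden OCL2SQL API): mapping simple
// OCL-style invariants onto SQL check constraints. Names are illustrative.
public class Ocl2SqlSketch {

    /** Render an attribute invariant, e.g. age >= 18, as a SQL
     *  CHECK constraint on the mapped table. */
    static String checkConstraint(String table, String constraintName, String condition) {
        return "ALTER TABLE " + table
             + " ADD CONSTRAINT " + constraintName
             + " CHECK (" + condition + ")";
    }

    /** A spatial invariant could map to a predicate of a spatially
     *  enabled DBMS, e.g. "the geometry of a parcel is a polygon". */
    static String spatialCheck(String table, String geomColumn) {
        // ST_GeometryType is a PostGIS-style predicate; other DBMSs
        // would need a different code generator (as in OCL2SQL).
        return checkConstraint(table, "chk_" + geomColumn + "_polygon",
                "ST_GeometryType(" + geomColumn + ") = 'ST_Polygon'");
    }

    public static void main(String[] args) {
        System.out.println(checkConstraint("Person", "chk_adult", "age >= 18"));
        System.out.println(spatialCheck("Parcel", "geom"));
    }
}
```

A real generator would parse OCL and consult the object-to-table mapping; temporal and multi-representation constraints would likewise map to DBMS-specific predicates or triggers.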

A database infrastructure for storing and manipulating trajectories

Thanks to current sensors and GPS technologies, large-scale capture of the evolving position of individual mobile objects has become technically and economically feasible.

Typical examples of moving objects include cars, persons and planes equipped with a GPS device, animals bearing a transmitter whose signals are captured by satellites, and parcels tagged with RFIDs.

Analysis of trajectory data is the key to a growing number of applications aiming at global understanding and management of complex phenomena that involve moving objects (e.g. worldwide courier distribution, city traffic management, bird migration monitoring).

This project consists of studying and extending the limited capabilities of commercial data management systems for storing and manipulating the position of moving objects throughout their lifespan.
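As an illustration of the abstraction involved (a deliberately minimal sketch, not any vendor's API), a trajectory can be modeled as a time-ordered list of GPS fixes, with the position at an arbitrary instant obtained by linear interpolation between fixes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal trajectory abstraction: time-ordered GPS fixes plus
// position lookup by interpolation. Representation is illustrative only.
public class TrajectorySketch {

    record Fix(double t, double x, double y) {}

    private final List<Fix> fixes = new ArrayList<>();  // assumed sorted by t

    void append(double t, double x, double y) {
        fixes.add(new Fix(t, x, y));
    }

    /** Interpolated position at time t (t must lie within the lifespan). */
    double[] positionAt(double t) {
        for (int i = 1; i < fixes.size(); i++) {
            Fix a = fixes.get(i - 1), b = fixes.get(i);
            if (t >= a.t() && t <= b.t()) {
                double r = (t - a.t()) / (b.t() - a.t());
                return new double[] { a.x() + r * (b.x() - a.x()),
                                      a.y() + r * (b.y() - a.y()) };
            }
        }
        throw new IllegalArgumentException("t outside trajectory lifespan");
    }

    public static void main(String[] args) {
        TrajectorySketch tr = new TrajectorySketch();
        tr.append(0, 0, 0);
        tr.append(10, 100, 0);
        System.out.println(Arrays.toString(tr.positionAt(5))); // [50.0, 0.0]
    }
}
```

Inside a DBMS this type would come with indexing and query operators (range, nearest-neighbor, spatio-temporal joins), which is where commercial systems are still limited.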

Creating updatable SQL views using triggers

Nowadays, databases (DBs) are used by users with increasingly varied profiles. Users have become demanding and want to work with databases whose structure matches their personal expectations. Today's databases must therefore be able to accept queries from users who have a view of the database structure that differs from the actual one.

Currently, the SQL standard allows views to be defined; these views are in fact virtual structures over the database. Most database management systems (DBMSs) implement this mechanism but restrict data insertion through such views.

To work around this problem, SQL provides triggers: pieces of SQL code that fire on insert, delete, or update statements and can modify their behavior. Administrators can therefore write triggers that alter the behavior of insertions on views so that they remain compatible with the underlying DB. Writing triggers, however, is often tedious and error-prone.

The goal of this MFE is, first, to survey the limitations of views in the main DBMSs such as Microsoft SQL Server, Oracle, etc., and then to propose a set of transformation rules for generating triggers that update the DB when insertions are made through views.
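One transformation rule of the kind this MFE would formalize can be sketched as follows (the generator is a plain-Java illustration; the emitted SQL uses SQL Server's INSTEAD OF trigger syntax, and all table and column names are invented): for a view defined as a join of two base tables, generate a trigger that splits each inserted view row back into the base tables.

```java
import java.util.List;

// Sketch of a trigger-generation rule: given a join view over two base
// tables, emit an INSTEAD OF INSERT trigger (SQL Server-style syntax)
// that redirects each inserted view row into the base tables.
public class ViewTriggerGenerator {

    static String insteadOfInsert(String view,
                                  String table1, List<String> cols1,
                                  String table2, List<String> cols2) {
        String c1 = String.join(", ", cols1);
        String c2 = String.join(", ", cols2);
        return "CREATE TRIGGER trg_" + view + "_ins ON " + view + "\n"
             + "INSTEAD OF INSERT AS\n"
             + "BEGIN\n"
             + "  INSERT INTO " + table1 + " (" + c1 + ")\n"
             + "    SELECT " + c1 + " FROM inserted;\n"
             + "  INSERT INTO " + table2 + " (" + c2 + ")\n"
             + "    SELECT " + c2 + " FROM inserted;\n"
             + "END";
    }

    public static void main(String[] args) {
        System.out.println(insteadOfInsert("EmpDept",
                "Employee", List.of("empId", "empName", "deptId"),
                "Department", List.of("deptId", "deptName")));
    }
}
```

A complete rule set would also handle duplicates, NULLs, key conflicts, and the other DBMS dialects surveyed in the first part of the thesis.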

An XQuery reasoning engine for OWL-QL

An ontology is a data model that represents a body of knowledge about a given domain through concepts and the semantic relations between those concepts.

Description logics (DLs) are a family of formalisms for representing ontologies with varying degrees of expressiveness. Reasoning over ontologies, such as subsumption, is in general a complex and time-consuming problem. A subfamily of description logics (DL-Lite), however, keeps the complexity of reasoning logarithmic. The QuOnto project is an implementation of the DL-Lite logics on top of a relational database using SQL.

OWL is the W3C's XML standard for representing and storing ontologies. A subset of the OWL 2 standard (OWL QL) is designed to represent DL-Lite ontologies in XML.

The goal of this thesis is to study the different DL-Lite logics and the QuOnto project, and then to define and implement a reasoning engine for OWL-QL using XQuery.
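The core inference such an engine must provide can be illustrated outside XQuery. The sketch below (plain Java with invented class names) checks subsumption as reachability over declared subclass axioms; in the thesis, the same traversal would be expressed in XQuery over the OWL-QL/XML serialization:

```java
import java.util.*;

// Subsumption A ⊑ B checked as reachability over declared
// subClassOf axioms. Class names are invented for the example.
public class SubsumptionSketch {

    private final Map<String, Set<String>> superOf = new HashMap<>();

    void addSubClassOf(String sub, String sup) {
        superOf.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    /** Is sub subsumed by sup, given the declared axioms? */
    boolean isSubsumedBy(String sub, String sup) {
        Deque<String> todo = new ArrayDeque<>(List.of(sub));
        Set<String> seen = new HashSet<>();
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (c.equals(sup)) return true;
            if (seen.add(c)) {
                todo.addAll(superOf.getOrDefault(c, Set.of()));
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SubsumptionSketch kb = new SubsumptionSketch();
        kb.addSubClassOf("Professor", "Teacher");
        kb.addSubClassOf("Teacher", "Person");
        System.out.println(kb.isSubsumedBy("Professor", "Person")); // true
        System.out.println(kb.isSubsumedBy("Person", "Professor")); // false
    }
}
```

Full DL-Lite reasoning also involves role axioms and query rewriting (as in QuOnto), but subsumption-as-reachability conveys the flavor of the computation.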

Native XML support for Multidimensional Data in Data Warehouses

eXist is a very active Open Source project to develop a native XML database system with index-based XQuery processing. The database is completely written in Java and may be deployed in a number of ways, either running as a stand-alone server process, inside a servlet-engine, or directly embedded into an application.

The objective of the project is to extend the eXist database and the XQuery language to take into account multidimensional data, i.e., to enable XML for data warehouse applications. Two previous master's theses developed a group-by extension for XQuery that is now integrated into eXist.

This project can be subdivided into several master's theses:

  • analyzing and implementing topological roll-up functionality for XQuery
  • enhancing the processing of the XQuery group-by extension using indexes
  • analyzing and benchmarking various XML multidimensional data models and databases
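The roll-up computation named in the first item can be sketched as follows (plain Java rather than XQuery; data and dimension hierarchy are invented): sales are grouped by city and then rolled up along the city → country hierarchy:

```java
import java.util.*;
import java.util.stream.*;

// Group-by and roll-up along a dimension hierarchy (city -> country),
// the kind of operation the XQuery extension would provide over XML.
public class RollupSketch {

    record Sale(String city, int amount) {}

    static Map<String, Integer> byCity(List<Sale> sales) {
        return sales.stream().collect(Collectors.groupingBy(
                Sale::city, Collectors.summingInt(Sale::amount)));
    }

    static Map<String, Integer> rollupToCountry(Map<String, Integer> byCity,
                                                Map<String, String> countryOf) {
        return byCity.entrySet().stream().collect(Collectors.groupingBy(
                e -> countryOf.get(e.getKey()),
                Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(new Sale("Brussels", 10),
                new Sale("Antwerp", 20), new Sale("Paris", 30));
        Map<String, String> countryOf = Map.of(
                "Brussels", "Belgium", "Antwerp", "Belgium", "Paris", "France");
        System.out.println(rollupToCountry(byCity(sales), countryOf).get("Belgium")); // 30
    }
}
```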



Indexing learning object metadata using Hadoop

The Learning Resource Exchange (LRE) consists mainly of an infrastructure for collecting and indexing IEEE Learning Object Metadata. Currently, this infrastructure is deployed on a single server, and indexing the LRE metadata using the Apache Lucene framework takes up to 8 hours. Given the current growth of the federation, the number of LRE resources is expected to grow tenfold within the next 24 months.

Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.
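The paradigm can be illustrated with a plain-Java simulation (this is not the Hadoop API; a real job would subclass Hadoop's Mapper and Reducer classes): the map phase emits (term, 1) pairs from metadata titles and the reduce phase sums the counts per term, as a Lucene-style indexing job would:

```java
import java.util.*;
import java.util.stream.*;

// Plain-Java simulation of the Map/Reduce steps Hadoop would distribute
// across the cluster. Titles are invented sample data.
public class MapReduceSketch {

    /** Map phase: one input record -> a list of (term, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String title) {
        return Arrays.stream(title.toLowerCase().split("\\s+"))
                .map(t -> Map.entry(t, 1))
                .collect(Collectors.toList());
    }

    /** Shuffle + reduce phase: group pairs by term and sum the counts. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<String> titles = List.of("Learning Java", "Learning XQuery");
        List<Map.Entry<String, Integer>> pairs = titles.stream()
                .flatMap(t -> map(t).stream()).collect(Collectors.toList());
        System.out.println(reduce(pairs).get("learning")); // 2
    }
}
```

Because map runs independently per record and reduce per term, Hadoop can spread both phases over the cluster, which is what would cut the 8-hour indexing time.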

The student will be working at the European Schoolnet (http://www.eun.org), an international partnership of 31 European Ministries of Education developing learning for schools, teachers and pupils across Europe.


Developing reusable LRE portlets using the Liferay Framework

Liferay is a framework that enables the creation of reusable web components named “portlets”. Portlets can be combined to rapidly build web portals. The trainee will be invited to participate in the analysis, design, and implementation of LRE portlets, i.e., reusable web components that support LRE-related functionalities such as the discovery of learning resources, the social tagging of learning resources, the management of bookmarks of learning resources, the tracking of learning resource usage, etc.

These portlets will be used to build new portals as well as to add LRE functionalities to existing ones.

The student will be working at the European Schoolnet (http://www.eun.org), an international partnership of 31 European Ministries of Education developing learning for schools, teachers and pupils across Europe.


Developing and testing a Protocol for Metadata Publishing

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH - http://www.openarchives.org/OAI/openarchivesprotocol.html) is an XML-based protocol for metadata information retrieval. It is a synchronous messaging protocol, transmitted over HTTP, that uses a fixed set of protocol requests and parameters. Results are returned in XML and conform to defined XSD schemas, which constrain the result format. However, this constraint doesn't extend down to the record level, and record formats are generally agreed between clients and data providers depending upon the community of practice or domain. Simply put, OAI-PMH is a pull mechanism for mirroring XML documents.

As many repositories prefer pushing their metadata rather than being passively harvested, there is a need for a simple and robust protocol for publishing metadata in a repository. As a lightweight protocol for depositing content from one location to another, SWORD APP (http://www.swordapp.org/) is a good candidate for transporting XML documents. Similarly to what is defined in OAI-PMH, this XML document should allow for identifying, publishing, updating, and deleting metadata and their collections. The traineeship will consist of defining this XML document (XSD), examining to what extent it can be bound to SWORD APP or another transport layer, implementing a prototype and, once the approach is validated, deploying it between the LRE repositories.
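The kind of XML document the traineeship would specify can be sketched as follows (all element and attribute names below are invented illustrations, not part of OAI-PMH or SWORD APP): a publish request carrying one metadata record and its identifier, built here with the standard Java DOM API:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Builds a hypothetical metadata-publishing message; the XSD to be
// defined in the traineeship would constrain documents like this one.
public class MetadataPublishSketch {

    static String publishRequest(String recordId, String title) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element req = doc.createElement("publishRequest");  // hypothetical root
            doc.appendChild(req);
            req.setAttribute("verb", "publish");                // publish | update | delete
            Element rec = doc.createElement("record");
            rec.setAttribute("identifier", recordId);
            Element t = doc.createElement("title");
            t.setTextContent(title);
            rec.appendChild(t);
            req.appendChild(rec);

            Transformer tr = TransformerFactory.newInstance().newTransformer();
            tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tr.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(publishRequest("oai:example:42", "Learning fractions"));
    }
}
```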

The student will be working at the European Schoolnet (http://www.eun.org), an international partnership of 31 European Ministries of Education developing learning for schools, teachers and pupils across Europe.


Rule Engine in Event-Driven Architecture

Nowadays, rule engines are used in IT infrastructures to evaluate business rules. In such architectures, analysts translate the enterprise business processes into a set of computerized processes and rules. For instance, a bank loan application could define the process for accepting a loan as: (1) retrieving user information from a central data warehouse, (2) loading user profiles from the ETL, (3) according to the information loaded for this user, asking the rule engine whether the loan can be accepted. The rule engine would evaluate a set of rules such as: does the user have enough savings, has he avoided spending 2/3 of his revenues in the last three months, etc.

This kind of rule engine could be used in real-time or high-availability applications such as Next Generation IN (NgIN). For instance, we could define rules for charging a call according to network information, such as the location, the pre-paid units, etc. Network alarm management could use real-time information to infer a root-cause analysis and even propose solutions to service provider operators. We could also imagine a real-time service orchestrator based on rules, deciding which service to load according to network information. Today, no vendor is able to provide such rule engines for network applications.

On the other hand, more and more IT players propose new paradigms for executing rules in high-availability architectures, such as Complex Event Processing (CEP). These architectures make it possible to execute rules and to correlate events asynchronously over a specific period. Applications have been proposed for fraud detection, surveillance, real-time risk management, market aggregation, etc. However, CEP rule engines have not reached the maturity of IT rule engines, which still provide more flexibility and are better integrated into J2EE infrastructures.
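The loan rules mentioned above can be sketched as predicates a rule engine would evaluate (plain Java, not the Drools rule language; the savings threshold and field names are invented):

```java
import java.util.List;
import java.util.function.Predicate;

// Loan-acceptance rules expressed as predicates; a real engine such as
// JBoss Drools would express these in its own rule language and match
// them against facts in working memory.
public class LoanRulesSketch {

    record Applicant(int savings, int revenues, int spentLastThreeMonths) {}

    static final List<Predicate<Applicant>> RULES = List.of(
            a -> a.savings() >= 5_000,                            // "enough savings" (invented threshold)
            a -> a.spentLastThreeMonths() < 2 * a.revenues() / 3  // less than 2/3 of revenues spent
    );

    static boolean accept(Applicant a) {
        return RULES.stream().allMatch(r -> r.test(a));
    }

    public static void main(String[] args) {
        System.out.println(accept(new Applicant(10_000, 9_000, 4_000))); // true
        System.out.println(accept(new Applicant(1_000, 9_000, 8_000)));  // false
    }
}
```

A CEP engine differs in that such rules would fire over streams of events correlated within time windows rather than over a single fact snapshot, which is exactly the architectural comparison the thesis targets.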

The aim of this master thesis is to study and compare the two possible architectures for implementing a rule engine in an Event-Driven Architecture. In addition, the student will propose an architecture for integrating the JBoss rule engine with a Java real-time application server (the Red Hat Mobicents JAIN SLEE).

The thesis covers four aspects: (1) a study of the JBoss Drools rule engine, (2) a survey of Complex Event Processing in terms of architecture and products, (3) a comparison of the rule execution models of CEP and IT infrastructures, and finally (4) the implementation of a proof of concept by integrating JBoss Drools within the JAIN SLEE AS.

The thesis is organized by the ULB in collaboration with Euranova Labs and the Alcatel-Lucent Application Software Group R&D center at Namur. The student will be coached by the R&D team responsible for developing the OSP AS, a real-time AS deployed at over 250 operators.


Semantic Transformation of the ECORE model

The Model Driven Architecture (MDA) is becoming a standard approach for building software in industry. Starting from a model describing the business logic, MDA tools can interpret the model and generate an application from it. EMF is the cornerstone of the MDA approach on the Eclipse platform. The EMF project is a modeling framework and code generation facility for building tools and other applications based on a structured data model. From a model specification described in XMI, EMF provides tools and runtime support to produce a set of Java classes for the model, along with a set of adapter classes that enable viewing and command-based editing of the model, and a basic editor.

While the MDA approach is becoming a de facto standard, model transformation remains project-specific. Indeed, the application development process often requires transforming a generic model (PIM: Platform Independent Model) into more specific models (PSM: Platform Specific Model). For instance, there is no tool for transforming a class diagram into an activity diagram, system events, or an entity-relationship diagram. Some applications already provide transformation paradigms, but these are specific to particular protocols or application domains: (1) XMLSpy and XSLT transformations, (2) Talend Open Studio for ETL. Even if these tools provide intuitive and powerful user interfaces, they are not adapted to software engineering.
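One PIM-to-PSM rule of the kind the thesis would give a semantics to can be sketched as follows (the model representation is a toy one, invented for the example; EMF and Wazaabi use far richer metamodels): each class of a class model becomes a relational table with an id column plus one column per attribute.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy PIM -> PSM transformation: class model element -> relational DDL.
// Types are fixed to VARCHAR for simplicity; a real transformation would
// map attribute types, associations, and inheritance as well.
public class ModelTransformSketch {

    record ClassModel(String name, List<String> attributes) {}

    static String toTableDdl(ClassModel c) {
        String cols = c.attributes().stream()
                .map(a -> a + " VARCHAR(255)")
                .collect(Collectors.joining(", "));
        return "CREATE TABLE " + c.name() + " (id INT PRIMARY KEY, " + cols + ")";
    }

    public static void main(String[] args) {
        ClassModel customer = new ClassModel("Customer", List.of("name", "email"));
        System.out.println(toTableDdl(customer));
        // CREATE TABLE Customer (id INT PRIMARY KEY, name VARCHAR(255), email VARCHAR(255))
    }
}
```

The point of a transformation semantics is to specify rules like this declaratively, so they can be composed, checked, and edited graphically rather than hard-coded per project.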

Alcatel-Lucent has been developing such a graphical model binding (or weaving) editor for its Java Service Creation Environment. This editor will be used for defining the transformation from a database schema model to the corresponding GUI in JSF. The idea is to be able to define graphically the GUI representing the DB by transforming models. However, Alcatel-Lucent lacks a transformation semantics to clearly express a specific transformation.

The aim of this master thesis is to study the different model transformation tools, such as Talend Open Studio and XMLSpy, but also the BPEL semantics, and to propose a transformation semantics for transforming models. In addition, the student will implement a prototype as part of the open-source project Wazaabi, which is used as a base for the ALU binding editor.

The thesis covers four aspects: (1) a study of the transformation semantics in BPMN, XSLT and ETL tools, (2) a study of the binding editor of the Wazaabi project, (3) proposing a transformation semantics, and finally (4) implementing a prototype of a set of transformations in the Wazaabi binding editor.

The thesis is organized by the ULB in collaboration with Euranova Labs and the Alcatel-Lucent Application Software Group R&D center at Namur. The student will be coached by the R&D team responsible for developing the OSP AS, a real-time AS deployed at over 250 operators, and by the creator of the Wazaabi project.


Enterprise Service Bus Integration

The Alcatel-Lucent Namur center addresses the problem of developing real-time services for NgIN and IMS, but also of developing management applications for controlling and provisioning those services. For instance, the charging engine used for real-time billing must be managed by customer care web applications from operator shops (Mobistar teleboutique, Orange shops, etc.). In addition, this kind of application must be able to integrate operator IT applications such as Customer Relationship Management (CRM), SAP, accounting systems, etc. Thus the code generated by IDEs must be able to integrate with such IT infrastructures and expose management features to other applications.

In IT architectures, a usual way to integrate different applications having different communication formats is to provide an Enterprise Service Bus (ESB) as a common backbone. Alcatel-Lucent management applications could therefore be integrated through specific connectors, offering access to the RT platform management from the bus.
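The bus idea can be sketched as follows (a toy illustration with invented formats and payloads; JBoss ESB provides this routing, and much more, out of the box): connectors register for a message format, and the bus dispatches each message to the matching connector.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy ESB: connectors plug onto a common bus, which routes each message
// to the connector registered for its format.
public class BusSketch {

    private final Map<String, Function<String, String>> connectors = new HashMap<>();

    void register(String format, Function<String, String> connector) {
        connectors.put(format, connector);
    }

    String route(String format, String payload) {
        Function<String, String> c = connectors.get(format);
        if (c == null) throw new IllegalArgumentException("no connector for " + format);
        return c.apply(payload);
    }

    public static void main(String[] args) {
        BusSketch bus = new BusSketch();
        bus.register("crm", p -> "CRM handled: " + p);
        bus.register("mgmt", p -> "RT platform command: " + p);
        System.out.println(bus.route("mgmt", "resetCounter")); // RT platform command: resetCounter
    }
}
```

In the actual thesis, the "mgmt" connector would wrap the platform's EJB 3.0 management interface so that new commands become reachable from the ESB.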

The RT platform is managed by a J2EE server, JBoss 5.0 with EJB 3.0. The aim of this thesis is to define the requirements for integrating the J2EE management of the platform with the JBoss ESB. The student will first list the technical requirements for integrating such a bus and, after a technical analysis of the Alcatel-Lucent J2EE management, will recommend and develop a flexible architecture to expose new management commands to the ESB.

The thesis covers four aspects: (1) a study of the JBoss ESB and of Event-Driven Architecture in general, (2) a technical analysis of the J2EE management server and EJB 3.0 generation, (3) a study of JET template generation, and finally (4) proposing and implementing an architecture to expose new management commands to the ESB.

The thesis is organized by the ULB in collaboration with Euranova Labs and the Alcatel-Lucent Application Software Group R&D center at Namur. The student will be coached by the R&D team responsible for developing the OSP AS, a real-time AS deployed at over 250 operators, and by a J2EE specialist from the CTO R&D division.


 
teaching/mfe0910/is.txt · Last modified: 2009/04/21 11:27 by ezimanyi