Differences

This shows you the differences between two versions of the page.

--- teaching:projh402 [2014/10/24 08:54]
svsummer [Project proposals]
+++ teaching:projh402 [2015/06/26 16:09]
svsummer
@@ Line 6: / Line 6: @@
 ===== Project proposals =====
+===== Graph Indexing for Fast Subgraph Isomorphism Testing =====
+There is an increasing amount of scientific data, mostly from the bio-medical sciences, that can be represented as collections of graphs (chemical molecules, gene interaction networks, ...). A crucial operation when searching in this data is that of subgraph isomorphism testing: given a pattern P that one is interested in (also a graph) and a collection D of graphs (e.g., chemical molecules), find all graphs in D that have P as a subgraph. Unfortunately, the subgraph isomorphism problem is computationally intractable (NP-complete in general). In ongoing research, to enable tractable processing of this problem, we aim to reduce the number of candidate graphs in D on which a subgraph isomorphism test needs to be executed. Specifically, we index the graphs in the collection D by decomposing them into graphs for which subgraph isomorphism *is* tractable. An associated algorithm that filters out graphs that certainly cannot match P can then be formulated based on ideas from information retrieval.
+In this project, the student will empirically validate on real-world datasets the extent to which graphs can be decomposed into graphs for which subgraph isomorphism is tractable, and run experiments to validate the effectiveness of the proposed method in terms of filtering power.
+**Interested?** Contact : [[stijn.vansummeren@ulb.ac.be|Stijn Vansummeren]]
+**Status**: available
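The filter-then-verify idea described in the proposal above can be illustrated with a minimal Python sketch. It is not the proposed method: instead of decomposing graphs into tractable pieces, it indexes each graph by a multiset of labelled edges, which is an assumption made purely for illustration. All names (`edge_features`, `may_contain`, `is_subgraph`, `search`) and the toy molecule-like graphs are hypothetical. The filter is only a necessary condition, so surviving candidates are still checked with an exhaustive (exponential) test.

```python
from collections import Counter
from itertools import permutations

# A graph is a pair (labels, edges): labels maps node -> label,
# edges is a set of frozenset({u, v}) (undirected, no self-loops).
def make_graph(labels, edges):
    return labels, {frozenset(e) for e in edges}

def edge_features(graph):
    """Multiset of labelled edges; a stand-in index key (assumption, not the proposed decomposition)."""
    labels, edges = graph
    feats = Counter()
    for e in edges:
        u, v = tuple(e)
        feats[tuple(sorted((labels[u], labels[v])))] += 1
    return feats

def may_contain(pattern_feats, graph_feats):
    """Necessary condition: the graph has at least as many of each labelled edge as the pattern."""
    return all(graph_feats[f] >= c for f, c in pattern_feats.items())

def is_subgraph(pattern, graph):
    """Exhaustive (non-induced) subgraph isomorphism test; exponential in the worst case."""
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if any(p_labels[n] != g_labels[mapping[n]] for n in p_nodes):
            continue
        if all(frozenset(mapping[n] for n in e) in g_edges for e in p_edges):
            return True
    return False

def search(pattern, index):
    """Filter with the index first, then verify only the surviving candidates."""
    p_feats = edge_features(pattern)
    candidates = [name for name, (_, feats) in index.items() if may_contain(p_feats, feats)]
    matches = [name for name in candidates if is_subgraph(pattern, index[name][0])]
    return candidates, matches

# Toy collection D (hypothetical data).
D = {
    "glycol": make_graph({1: "O", 2: "C", 3: "C", 4: "O"}, [(1, 2), (2, 3), (3, 4)]),
    "dioxide": make_graph({1: "O", 2: "C", 3: "O"}, [(1, 2), (2, 3)]),
    "ethane": make_graph({1: "C", 2: "C"}, [(1, 2)]),
    "methanol": make_graph({1: "C", 2: "O"}, [(1, 2)]),
}
index = {name: (g, edge_features(g)) for name, g in D.items()}

# Pattern P: a carbon bonded to two oxygens.
P = make_graph({"a": "O", "b": "C", "c": "O"}, [("a", "b"), ("b", "c")])
candidates, matches = search(P, index)
print(candidates, matches)
```

In this toy run the filter discards two of the four graphs without any isomorphism test, while one candidate that passes the filter is still rejected by verification. The fraction of graphs discarded before verification is one simple way to think about "filtering power".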
 ==== Principles of Database Management Architectures in Managed Virtual Environments ====
@@ Line 70: / Line 80: @@
-==== Development of a Personal Scientific Digital Library Management System ====
-In this project, the student is asked to construct a software system to help manage large collections of scientific papers in digital form. Specifically, the system must be able to:
-  - Scan a given filesystem location for given filetypes (PDFs, EPUB, ...) containing scientific articles.
-  - Extract the metadata from each identified file. Here, the metadata includes the title of the article, its authors, the publishing venue, the publisher, the year of publication, the article's abstract ... The development of an intelligent way to retrieve this metadata is required. This could be done, for example, by a combination of parsing the file, contacting the internet repositories of known publishers (ACM, Springer, Elsevier), etc., to retrieve the data.
-  - Offer search capabilities, in order to allow a user to find all indexed articles matching certain criteria (title, author, ...)
-  - Offer archiving capabilities
-Use of semantic web technologies (RDF, SPARQL, ...) to store and search the metadata is encouraged.
-**Contact** : Stijn Vansummeren (stijn.vansummeren@ulb.ac.be)
-**Status**: taken