[Dbworld] PhD Position on Uncertainty Management at TELECOM ParisTech
From: “Pierre Senellart” <pierre at senellart.com>
PhD Position on Uncertainty Management at TELECOM ParisTech
===========================================================
Title
-----
Probabilistic Uncertainty Management in the DataRing
Description
-----------
Sources of uncertainty in data abound: noisy measurements, data resulting
from imperfect automatic systems such as information extraction or
natural language processing, inherently imprecise data such as a
human-made diagnosis, etc. In the context of an autonomous,
heterogeneous, decentralized system such as the one investigated in the
DataRing project [1], uncertainty also originates from essentially
imperfect schema matchings, doubts about the actual presence of a fact or
of a whole document on a given peer, or redundancy and contradiction of
the information present in different peers. One of the most natural ways
to represent this uncertainty is through probabilistic databases.
The objective of this PhD position is to find formal models for the
representation and efficient querying of probabilistic databases in a
peer-to-peer environment, and to build corresponding prototype systems.
Because of the heterogeneous nature of the information shared in the
DataRing, semi-structured (i.e., XML) models should be favored, though
the simplicity of the flat-tuple representation of the relational model
can also be an inspiration. Previously studied probabilistic
semi-structured models [2,3,4] can be a basis for the proposed work.
Particular aspects of interest include:
- management of the various forms of uncertainty;
- routing and distributed computation of probabilistic queries over
the peer-to-peer network;
- corroboration of information across sources;
- ranking of query results and top-k query processing.
Supervision
-----------
The 3-year PhD thesis will be supervised by Pierre Senellart and Talel
Abdessalem in the Computer Science and Networking Department at TELECOM
ParisTech, in interaction with the other partners of the ANR DataRing
project, notably Serge Abiteboul’s Gemo team at INRIA Saclay.
TELECOM ParisTech, formerly known as ENST, is the leading French
engineering school specialized in information technology, and is located
inside Paris.
Conditions
----------
Starting date: beginning of 2009 (flexible).
Prerequisites for applying: Master’s degree in computer science (or
equivalent diploma), background in applied and theoretical database
management.
Salary: ~1500 € net per month, over 3 years.
Please contact Pierre Senellart
for any information and for applications.
References
----------
[1] S. Abiteboul and N. Polyzotis, The Data Ring: Community Content
Sharing. In Proc. CIDR, January 2007, Asilomar, USA.
[2] P. Senellart and S. Abiteboul, On the complexity of managing
probabilistic XML data. In Proc. PODS, June 2007, Beijing, China.
[3] B. Kimelfeld, Y. Kosharovski, and Y. Sagiv, Query efficiency in
probabilistic XML models. In Proc. SIGMOD, June 2008, Vancouver, Canada.
[4] S. Cohen, B. Kimelfeld, and Y. Sagiv, Incorporating constraints in
probabilistic XML. In Proc. PODS, June 2008, Vancouver, Canada.