Reasoning about Disclosure in Data Integration in the Presence of Source Constraints
Abstract
Data integration systems allow users to access data
sitting in multiple sources by means of queries over
a global schema, related to the sources via mappings. Data sources often contain sensitive information, and thus an analysis is needed to verify that a
schema satisfies a privacy policy, given as a set of
queries whose answers should not be accessible to
users. Such an analysis should take into account not
only knowledge that an attacker may have about the
mappings, but also what they may know about the
semantics of the sources. In this paper, we show
that source constraints can have a dramatic impact
on disclosure analysis. We study the problem of determining whether a given data integration system
discloses a source query to an attacker in the presence of constraints, providing both lower and upper
bounds on source-aware disclosure analysis.