"""
This module is a brute force implementation of the 'finite' version of
U{RDFS semantics<http://www.w3.org/TR/rdf-mt/>} and of
U{OWL 2 RL<http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules>}
on top of RDFLib (with some caveats, see below). Some extensions to these are also implemented.
Brute force means that, in all cases, simple forward chaining rules are used to extend (recursively) the incoming graph with all triples
that the rule sets permit (ie, the "deductive closure" of the graph is computed).
An extra option controls whether the axiomatic triples are added to the graph (prior to the forward chaining step).
These typically set the domain and range for properties or define some core classes.
In the case of RDFS, the implementation uses a 'finite' version of the axiomatic triples only (as proposed, for example,
by Herman ter Horst). This means that it adds only those C{rdf:_i} type predicates that do appear in the original graph,
thereby keeping this step finite. OWL 2 does not define axiomatic triples formally for OWL 2 RL, but they can be deduced from the
U{OWL 2 RDF Based Semantics<http://www.w3.org/TR/owl2-rdf-based-semantics/>} document and are listed in Appendix 6 (though informally).
Note, however, that this implementation adds only those triples that refer to OWL terms that are meaningful for the OWL 2 RL case.
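
The forward chaining idea can be sketched without RDFLib at all. The snippet below is only an illustration, not the code of this package: triples are plain tuples of strings, and the two rule functions and the C{deductive_closure} helper are hypothetical names covering just two RDFS-style rules.

```python
# Hypothetical, simplified stand-ins: a triple is a tuple of three strings,
# and a rule is a function from the current triple set to derived triples.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rule_subclass_transitivity(triples):
    # (a subClassOf b), (b subClassOf c)  =>  (a subClassOf c)
    sub = [(s, o) for (s, p, o) in triples if p == SUBCLASS]
    return set((a, SUBCLASS, c) for (a, b) in sub for (b2, c) in sub if b == b2)


def rule_type_inheritance(triples):
    # (x type a), (a subClassOf b)  =>  (x type b)
    sub = [(s, o) for (s, p, o) in triples if p == SUBCLASS]
    typed = [(s, o) for (s, p, o) in triples if p == TYPE]
    return set((x, TYPE, b) for (x, a) in typed for (a2, b) in sub if a == a2)


def deductive_closure(triples, rules):
    # Apply every rule again and again until a full pass adds nothing new.
    closure = set(triples)
    while True:
        new = set()
        for rule in rules:
            new |= rule(closure) - closure
        if not new:
            return closure
        closure |= new


graph = set([
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
])
closed = deductive_closure(graph, [rule_subclass_transitivity, rule_type_inheritance])
```

The loop stops because each pass can only add triples built from terms already present, so a fixed point is reached after finitely many steps; this is also why the axiomatic triples have to be kept finite.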

Package Entry Points
====================

The main entry point to the package is via the L{DeductiveClosure<DeductiveClosure>} class. This class should be initialized to control
the parameters of the deductive closure; the forward chaining is done via the L{expand<DeductiveClosure.expand>} method.
The simplest way to use the package from an RDFLib application is as follows::

    graph = Graph()                                  # creation of an RDFLib graph
    ...
    ...                                              # normal RDFLib application, eg, parsing RDF data
    ...
    DeductiveClosure(OWLRL_Semantics).expand(graph)  # calculate an OWL 2 RL deductive closure of graph
                                                     # without axiomatic triples

The first argument of the C{DeductiveClosure} initialization can be replaced with other classes, providing different
types of deductive closure; other arguments are also possible. For example::

    DeductiveClosure(OWLRL_Extension, rdfs_closure = True, axiomatic_triples = True, datatype_axioms = True).expand(graph)

will calculate the deductive closure including RDFS and some extensions to OWL 2 RL, and with all possible axiomatic
triples added to the graph (this is about the maximum the package can do…)

The same instance of L{DeductiveClosure<DeductiveClosure>} can be used for several graph expansions. In other words, the expand function does
not change any state.

For convenience, a second entry point to the package is provided in the form of a function called L{convert_graph<convert_graph>},
that expects an object holding the various options, including a file name. The function parses the file, creates the expanded graph, and serializes the result into RDF/XML or
Turtle. This function is particularly useful as an entry point for a CGI call (where the HTML form parameters are collected into such an object) and
is easy to use with a command line interface. The package distribution contains an example for both.

There are three major closure types (ie, semantic closure possibilities); these can be controlled through the appropriate
parameters of the L{DeductiveClosure<DeductiveClosure>} class:

    - using the L{RDFS_Semantics<RDFSClosure.RDFS_Semantics>} class, implementing the U{RDFS semantics<http://www.w3.org/TR/rdf-mt/>}
    - using the L{OWLRL_Semantics<OWLRL.OWLRL_Semantics>} class, implementing the U{OWL 2 RL<http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules>}
    - using the L{RDFS_OWLRL_Semantics<CombinedClosure.RDFS_OWLRL_Semantics>} class, implementing a combined semantics of U{RDFS semantics<http://www.w3.org/TR/rdf-mt/>} and U{OWL 2 RL<http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules>}

In all three cases there are other dimensions that can control the exact closure being generated:

    - to reduce the number of generated triples, the so called axiomatic triples (see, eg, the U{axiomatic triples in RDFS<http://www.w3.org/TR/rdf-mt/#rdfs_interp>}) are, by default, I{not} added to the graph closure. This can be controlled through a separate initialization argument
    - similarly, the axiomatic triples for D-entailment are controlled separately

Extensions
----------

The three major entry points (ie, L{RDFS Semantics<RDFSClosure.RDFS_Semantics>}, L{OWL2 RL Semantics<OWLRL.OWLRL_Semantics>},
and L{RDFS + OWL 2 RL Semantics<CombinedClosure.RDFS_OWLRL_Semantics>}) represent clearly documented rule sets that correspond to various
inference regimes defined by the RDFS and OWL 2 standards. They can also be viewed as incomplete implementations of a full
U{OWL 2 specification following the RDF based semantics (a.k.a. “OWL 2 Full”)<http://www.w3.org/TR/owl2-rdf-based-semantics/>}. While the approach of
using a simple forward chaining process cannot be used for a complete OWL 2 Full implementation, it is nevertheless possible to add some features that, while not
being mandated by, say, the U{OWL 2 RL<http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules>} specification, are
useful and implementable. This can be done by providing a suitable subclass of the L{RDFS + OWL 2 RL Semantics<CombinedClosure.RDFS_OWLRL_Semantics>} class that adds, eg, to
the set of rules that are implemented.

As an example, this package contains such an L{extension<OWLRLExtras.OWLRL_Extension>} that can also be used by the entry points. The features implemented by this extension, ie, added to the core
OWL 2 RL features, are:

    - self restriction
    - owl:rational datatype
    - datatype restrictions via facets

(There are some minor restrictions on the datatype restriction implementation, see the L{description of the corresponding module<RestrictedDatatype>}.)

When initializing this L{extension<OWLRLExtras.OWLRL_Extension>} class, the user can control whether RDFS reasoning should also
be used or not (default is C{False}).

Some Technical/implementation aspects
=====================================

The core processing is done in the L{Core<Closure.Core>} class, which is subclassed by the L{RDFS<RDFS_Semantics>} and
the L{OWL 2 RL<OWLRL_Semantics>} classes (these two are then, in turn, subclassed by the
L{RDFS + OWL 2 RL Semantics<CombinedClosure.RDFS_OWLRL_Semantics>} class). The core implements the core functionality of cycling
through the rules, whereas the rules themselves are defined and implemented in the subclasses. There are also methods that are executed only once, either
at the beginning or at the end of the full processing cycle. Adding axiomatic triples is handled separately, which allows a finer user control over
these features.

Literals must be handled separately. Indeed, the functionality relies on 'extended' RDF graphs that allow literals
to be in a subject position, too. Because RDFLib does not allow that, processing begins by exchanging all literals in the
graph for bnodes (identical literals get the same associated bnode). Processing occurs on these bnodes; at the end of the process
all these bnodes are replaced by their corresponding literals if possible (if the bnode occurs in a subject position, that triple
is removed from the resulting graph). Details of this processing are handled in the separate L{Literals Proxies<RDFClosure.Literals.LiteralProxies>}
class.
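
The literal exchange can be pictured with the following sketch. It is not the code of the L{Literals Proxies<RDFClosure.Literals.LiteralProxies>} class: literals are modelled here as C{("lit", lexical_form)} tuples, bnodes as strings, and the helper names C{to_proxies} and C{from_proxies} are made up for the illustration.

```python
def is_literal(term):
    # Hypothetical literal model: a ("lit", lexical_form) tuple.
    return isinstance(term, tuple) and len(term) == 2 and term[0] == "lit"


def to_proxies(triples):
    # Replace every literal by a bnode; identical literals share one bnode.
    proxies = {}
    result = set()
    for triple in triples:
        terms = []
        for term in triple:
            if is_literal(term):
                if term not in proxies:
                    proxies[term] = "_:lit%d" % len(proxies)
                term = proxies[term]
            terms.append(term)
        result.add(tuple(terms))
    return result, proxies


def from_proxies(triples, proxies):
    # Put the literals back; a triple whose subject is a proxy is dropped,
    # because a literal cannot stand in subject position in the final graph.
    literal_of = dict((bnode, lit) for (lit, bnode) in proxies.items())
    result = set()
    for (s, p, o) in triples:
        if s in literal_of:
            continue
        result.add((s, p, literal_of.get(o, o)))
    return result


original = set([
    ("ex:a", "ex:p", ("lit", "42")),
    ("ex:b", "ex:q", ("lit", "42")),
])
proxied, proxies = to_proxies(original)
# A rule may produce a triple with the proxy in subject position:
bnode = proxies[("lit", "42")]
proxied.add((bnode, "rdf:type", "xsd:integer"))
restored = from_proxies(proxied, proxies)
```

In the sketch the derived triple about the literal itself is usable during chaining, but it disappears again when the literals are restored.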

The OWL specification includes references to datatypes that are not in the core RDFS specification and are, consequently, not
directly implemented by RDFLib. These are added in a separate module of the package.

Problems with Literals with datatypes
-------------------------------------

The current distribution of RDFLib is fairly poor in handling datatypes, particularly in checking whether the lexical form
of a literal is "proper" for its declared datatype. A typical example is::
    "-1234"^^xsd:nonNegativeInteger
which should not be accepted as a valid literal. Because the requirements of OWL 2 RL are much stricter in this respect, an alternative set of
datatype handling routines (essentially, conversions) had to be implemented (see the L{XsdDatatypes} module).
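
As an illustration of what such a stricter conversion has to do (this is a sketch only, not the code of the L{XsdDatatypes} module, and the function name is hypothetical), a lexical-to-Python conversion for C{xsd:nonNegativeInteger} could look like this:

```python
def to_non_negative_integer(lexical):
    # Hypothetical strict converter: the lexical form must be an optional
    # sign followed by decimal digits only, and the value must not be negative.
    digits = lexical[1:] if lexical[:1] in ("+", "-") else lexical
    if not digits or not all(c in "0123456789" for c in digits):
        raise ValueError("not an integer lexical form: %r" % (lexical,))
    value = int(lexical)
    if value < 0:
        raise ValueError("negative value for xsd:nonNegativeInteger: %r" % (lexical,))
    return value


def is_valid(lexical):
    # Convenience wrapper that reports validity instead of raising.
    try:
        to_non_negative_integer(lexical)
        return True
    except ValueError:
        return False
```

With such a converter the literal of the example above is rejected instead of being silently turned into a Python integer.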

The L{DeductiveClosure<DeductiveClosure>} class has an additional instance variable that controls whether
the default RDFLib conversion routines should be exchanged for the new ones. If this flag is set to True at instance creation (this is
the default), then the conversion routines are set back
to the originals once the expansion is complete, so that older applications that may not work properly with the
new set of conversion routines are not affected.
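
The "exchange, expand, set back" pattern can be sketched as follows. The C{CONVERTERS} registry, the C{strict_integer} converter and the C{run_with_converters} helper are all hypothetical; they only stand in for the conversion tables that RDFLib and this package actually use.

```python
# Hypothetical registry: datatype name -> lexical-to-Python conversion routine.
CONVERTERS = {"xsd:integer": int}


def strict_integer(lexical):
    # Hypothetical stricter routine: int() tolerates surrounding whitespace,
    # this one does not.
    if lexical != lexical.strip():
        raise ValueError("whitespace is not allowed: %r" % (lexical,))
    return int(lexical)


def run_with_converters(improved, work):
    # Install the improved routines, run the work, and always set the
    # original routines back, even if the work raises an exception.
    saved = dict(CONVERTERS)
    CONVERTERS.update(improved)
    try:
        return work()
    finally:
        CONVERTERS.clear()
        CONVERTERS.update(saved)


def work():
    # Stands in for the graph expansion; reports which routine was active.
    return CONVERTERS["xsd:integer"]


used = run_with_converters({"xsd:integer": strict_integer}, work)
```

Restoring in a C{finally} clause means that other parts of the application see the original routines again whatever happens during the expansion.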

If the user wants to use these alternative lexical conversions everywhere in the application, then
the L{use_improved_datatypes_conversions<DeductiveClosure.use_improved_datatypes_conversions>} method can be invoked. That method changes
the conversion routines and, from that point on, all usage of L{DeductiveClosure<DeductiveClosure>} instances will use the
improved conversion methods without resetting them. Ie, the code structure can be something like::
    DeductiveClosure().use_improved_datatypes_conversions()
    ... RDFLib application
    DeductiveClosure().expand(graph)
    ...
The default situation can be set back using the L{use_rdflib_datatypes_conversions<DeductiveClosure.use_rdflib_datatypes_conversions>} call.

It is, however, not I{required} to use these methods at all. Ie, the user can use::
    DeductiveClosure(improved_datatypes=False).expand(graph)
which will result in a proper graph expansion, except for the datatype specific comparisons, which will be incomplete.

Serializer bugs
---------------

During the development of the software a number of small bugs in the RDFLib serializers were found. Alternative RDF/XML
and Turtle serializers, originally developed for the U{RDFa distiller<http://www.w3.org/2007/08/pyRdfa>}, have therefore been added to this package, too.

The L{convert_graph<convert_graph>} entry point used, for example, by the CGI service, uses these serializers.

Turtle Parsing bug
------------------

Unfortunately, there are some bugs in the underlying Turtle parser used by RDFLib. All bugs are related to the way common datatypes can be
abbreviated in Turtle. According to the latest U{Turtle grammar<http://www.w3.org/TeamSubmission/2008/SUBM-turtle-20080114/>}, the following
abbreviations apply:

    - Constants of the form 1234 should be interpreted as xsd:integer, which is done correctly by the parser.
    - Constants of the form 1.2345 should be interpreted as xsd:decimal. Unfortunately, the parser interprets them as xsd:double.
    - Constants of the form 'true' or 'false' (without the quotes, that is) should be interpreted as xsd:boolean. Instead, they are put as symbols into the default namespace.
    - Constants of the form 1.2345E12 should be interpreted as xsd:double. Unfortunately, the parser crashes on those.

The current distribution includes a modified version of the RDFLib Turtle parser that takes care of the second and third items. Unfortunately,
the fourth problem seems to come from something missing in the core grammar of the parser, and could not be handled. Ie, "1.2345E+12"^^xsd:double should be used
instead.
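
The expected mapping from abbreviated constants to datatypes can be sketched as below. The patterns are deliberately simplified with respect to the Turtle grammar, and the function name C{abbreviated_datatype} is hypothetical; neither is taken from the parser shipped with this package.

```python
import re

# Simplified patterns for the abbreviated numeric constants.
_INTEGER = re.compile("^[+-]?[0-9]+$")
_DECIMAL = re.compile("^[+-]?[0-9]*[.][0-9]+$")
_DOUBLE = re.compile("^[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)[eE][+-]?[0-9]+$")


def abbreviated_datatype(token):
    # Return the datatype an unquoted Turtle constant should get,
    # or None if the token is not one of the abbreviated forms.
    if token in ("true", "false"):
        return "xsd:boolean"
    if _INTEGER.match(token):
        return "xsd:integer"
    if _DECIMAL.match(token):
        return "xsd:decimal"
    if _DOUBLE.match(token):
        return "xsd:double"
    return None


examples = ["1234", "1.2345", "true", "1.2345E12", "ex"]
datatypes = [abbreviated_datatype(token) for token in examples]
```

The order of the checks does not matter for these patterns, because a token with an exponent matches only the double pattern and a token without a decimal point matches only the integer pattern.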

The L{convert_graph<convert_graph>} entry point used, for example, by the CGI service, uses this parser.

@requires: U{RDFLib<http://rdflib.net>}, version 2.2.2 or higher
@license: This software is available for use under the U{W3C Software License<http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231>}
@organization: U{World Wide Web Consortium<http://www.w3.org>}
@author: U{Ivan Herman<http://www.w3.org/People/Ivan/>}
"""

"""
$Id: __init__.py,v 1.31 2009/10/01 11:06:28 ivan Exp $ $Date: 2009/10/01 11:06:28 $
"""

__version__ = "4.1"
__author__  = 'Ivan Herman'
__contact__ = 'Ivan Herman, ivan@w3.org'
__license__ = u'W3C® SOFTWARE NOTICE AND LICENSE, http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231'

import StringIO
from types import *

from rdflib import Graph
from rdflib.plugin import register
from rdflib.syntax import serializers, parsers
from rdflib.Literal import Literal as rdflibLiteral

import DatatypeHandling, Closure
from OWLRLExtras import OWLRL_Extension, OWLRL_Extension_Trimming
from OWLRL import OWLRL_Semantics
from RDFSClosure import RDFS_Semantics
from CombinedClosure import RDFS_OWLRL_Semantics
from OWL import imports

# Serialization format identifiers
RDFXML = "xml"
TURTLE = "turtle"
AUTO   = "auto"

NONE = "none"
RDF  = "rdf"
RDFS = "rdfs"
OWL  = "owl"
FULL = "full"




def interpret_owl_imports(iformat, graph) :
    """Interpret the owl import statements. Essentially, recursively merge with all the objects in the owl import statements, and remove the corresponding
    triples from the graph.

    This method can be used by an application prior to expansion. It is I{not} done by the L{DeductiveClosure} class.

    @param iformat: input format; can be one of L{AUTO}, L{TURTLE}, or L{RDFXML}. L{AUTO} means that the suffix of the file name or URI will decide: '.ttl' means Turtle, RDF/XML otherwise.
    @param graph: the RDFLib Graph instance to parse into.
    """
    while True :
        # 1. collect the import statements that are currently in the graph
        all_imports = [ t for t in graph.triples((None, imports, None)) ]
        if len(all_imports) == 0 :
            # no (more) import statements: the graph is complete
            return
        # 2. remove the import statements, so that they are not processed again
        for t in all_imports : graph.remove(t)
        # 3. parse each imported resource into the same graph; the next
        #    iteration of the loop picks up the imports of the imported data
        for (s,p,uri) in all_imports:
            # the target of an import may have been given as a literal;
            # it is turned into a plain string before parsing
            if isinstance(uri, rdflibLiteral) :
                __parse_input(iformat, str(uri), graph)
            else :
                __parse_input(iformat, uri, graph)


def return_closure_class(owl_closure, rdfs_closure, owl_extras, trimming) :
    """
    Return the right semantic extension class based on three possible choices (this method is here to help potential users, the result can be
    fed into a L{DeductiveClosure} instance at initialization)
    @param owl_closure: whether OWL 2 RL deductive closure should be calculated
    @type owl_closure: boolean
    @param rdfs_closure: whether RDFS deductive closure should be calculated. In case C{owl_closure==True}, this parameter should also be used in the initialization of L{DeductiveClosure}
    @type rdfs_closure: boolean
    @param owl_extras: whether the extra possibilities (rational datatype, etc) should be added to an OWL 2 RL deductive closure. This parameter has no effect in case C{owl_closure==False}.
    @param trimming: whether extra trimming is done on the OWL RL + Extension output
    @return: deductive class reference or None
    @rtype: Class type
    """
    if owl_closure :
        if owl_extras :
            if trimming :
                return OWLRL_Extension_Trimming
            else :
                return OWLRL_Extension
        else :
            if rdfs_closure :
                return RDFS_OWLRL_Semantics
            else :
                return OWLRL_Semantics
    elif rdfs_closure :
        return RDFS_Semantics
    else :
        return None

class DeductiveClosure :
    """
    Entry point to generate the deductive closure of a graph. The exact choice of deductive
    closure is controlled by a class reference. The important initialization parameter is the C{closure_class}: a Class object referring to a
    subclass of L{Closure.Core}. Although this package includes a number of
    such subclasses (L{OWLRL_Semantics}, L{RDFS_Semantics}, L{RDFS_OWLRL_Semantics}, and L{OWLRL_Extension}), users can provide their own if additional rules are
    implemented.

    Note that owl:imports statements are I{not} interpreted in this class; that has to be done beforehand on the graph that is to be expanded.

    @ivar rdfs_closure: Whether the RDFS closure should also be executed. Default: False.
    @type rdfs_closure: boolean
    @ivar axiomatic_triples: Whether relevant axiomatic triples are added before chaining, except for datatype axiomatic triples. Default: False.
    @type axiomatic_triples: boolean
    @ivar datatype_axioms: Whether further datatype axiomatic triples are added to the output. Default: False.
    @type datatype_axioms: boolean
    @ivar closure_class: the class reference used to expand the graph
    @type closure_class: L{Closure.Core}
    @cvar improved_datatype_generic: Whether the improved set of lexical-to-Python conversions should be used for datatype handling I{in general}, ie, not only for a particular instance and not only for inference purposes. Default: False.
    @type improved_datatype_generic: boolean
    """
    improved_datatype_generic = False

    def __init__(self, closure_class, improved_datatypes = True, rdfs_closure = False, axiomatic_triples = False, datatype_axioms = False) :
        """
        @param closure_class: a closure class reference.
        @type closure_class: subclass of L{Closure.Core}
        @param rdfs_closure: whether RDFS rules are executed or not
        @type rdfs_closure: boolean
        @param axiomatic_triples: Whether relevant axiomatic triples are added before chaining, except for datatype axiomatic triples. Default: False.
        @type axiomatic_triples: boolean
        @param datatype_axioms: Whether further datatype axiomatic triples are added to the output. Default: False.
        @type datatype_axioms: boolean
        @param improved_datatypes: Whether the improved set of lexical-to-Python conversions should be used for datatype handling. See the introduction for more details. Default: True.
        @type improved_datatypes: boolean
        """
        if closure_class is None :
            self.closure_class = None
        else :
            if not isinstance(closure_class, ClassType) :
                raise ValueError("The closure type argument must be a class reference")
            else :
                self.closure_class = closure_class
        self.axiomatic_triples  = axiomatic_triples
        self.datatype_axioms    = datatype_axioms
        self.rdfs_closure       = rdfs_closure
        self.improved_datatypes = improved_datatypes






def convert_graph(options) :
    """
    Entry point for external scripts (CGI or command line) to parse RDF file(s), possibly execute OWL and/or RDFS closures,
    and serialize back the result in some format.
    Note that this entry point can also be used without requiring any entailment at all.
    Because both the input and the output format for the package can be RDF/XML or Turtle, such usage would
    simply mean a format conversion.

    If OWL 2 RL processing is required, that also means that the owl:imports statements are interpreted. Ie,
    ontologies can be spread over several files. Note, however, that the output of the process would then include all
    imported ontologies, too.

    @param options: object with specific attributes, namely:
     - options.sources: list of uris or file names for the source data; for each one, if the name ends with 'ttl', it is considered to be turtle, RDF/XML otherwise (this can be overridden by options.iformat, though)
     - options.text: direct Turtle encoding of a graph as a text string (useful, eg, for a CGI call using a text field)
     - options.owlClosure: can be yes or no
     - options.rdfsClosure: can be yes or no
     - options.owlExtras: can be yes or no; whether the extra rules beyond OWL 2 RL are used or not.
     - options.axioms: whether relevant axiomatic triples are added before chaining (can be a boolean, or the strings "yes" or "no")
     - options.daxioms: whether further datatype axiomatic triples are added to the output (can be a boolean, or the strings "yes" or "no")
     - options.format: output format, can be "turtle" or "rdfxml"
     - options.iformat: input format, can be "turtle", "rdfxml", or "auto". "auto" means that the suffix of the file is considered: '.ttl' is for turtle, rdfxml otherwise
     - options.trimming: whether the extension to OWLRL should also include trimming
    """
    def __convert_to_turtle(graph) :
        """Using a non-rdflib Turtle Serializer"""
        register("my-turtle",serializers.Serializer,"RDFClosure.serializers.TurtleSerializer","TurtleSerializer")
        return graph.serialize(format="my-turtle")

    def __convert_to_XML(graph) :
        """Using a non-rdflib RDF/XML Serializer"""
        register("my-xml",serializers.Serializer,"RDFClosure.serializers.PrettyXMLSerializer","PrettyXMLSerializer")
        return graph.serialize(format="my-xml")

    def __modify_request_header() :
        """Older versions of RDFLib, though they added an accept header, did not include anything for turtle. This is
        taken care of here."""
        from rdflib.URLInputSource import headers
        acceptHeader = "application/rdf+xml, text/turtle, text/n3, application/xml;q=0.8, application/xhtml+xml;q=0.5"
        headers['Accept'] = acceptHeader

    def __check_yes_or_true(opt) :
        return opt == True or opt == "yes" or opt == "Yes" or opt == "True" or opt == "true"

    import warnings
    warnings.filterwarnings("ignore")
    if len(options.sources) == 0 and (options.text == None or len(options.text.strip()) == 0) : raise Exception("No graph specified either via a URI or text")

    __modify_request_header()

    graph = Graph()

    # the input format is optional: fall back to AUTO if the attribute is not set
    iformat = AUTO
    try :
        iformat = options.iformat
    except :
        pass

    # a single, additional source may also be given through options.source
    try :
        if options.source != None : options.sources.append(options.source)
    except :
        pass

    # the "n3" format name is bound to the parser shipped with this package
    register("n3", parsers.Parser, "RDFClosure.parsers.N3Parser","N3Parser")

    # parse every source exactly once
    for inp in set(options.sources) : __parse_input(iformat, inp, graph)

    # add the directly provided Turtle text, if any
    if options.text != None :
        graph.parse(StringIO.StringIO(options.text),format="n3")

    # normalize the yes/no style options into booleans
    owlClosure  = __check_yes_or_true(options.owlClosure)
    rdfsClosure = __check_yes_or_true(options.rdfsClosure)
    owlExtras   = __check_yes_or_true(options.owlExtras)
    trimming    = __check_yes_or_true(options.trimming)
    axioms      = __check_yes_or_true(options.axioms)
    daxioms     = __check_yes_or_true(options.daxioms)

    if owlClosure : interpret_owl_imports(iformat, graph)

    # bind the usual prefixes so that the serialized output is more readable
    graph.bind("owl","http://www.w3.org/2002/07/owl#")
    graph.bind("xsd","http://www.w3.org/2001/XMLSchema#")

    closure_class = return_closure_class(owlClosure, rdfsClosure, owlExtras, trimming)
    DeductiveClosure(closure_class, improved_datatypes = True, rdfs_closure = rdfsClosure, axiomatic_triples = axioms, datatype_axioms = daxioms).expand(graph)

    if options.format == TURTLE :
        return __convert_to_turtle(graph)
    else :
        return __convert_to_XML(graph)
