Package RDFClosure

This module is a brute-force implementation of the 'finite' version of RDFS semantics and of OWL 2 RL on top of RDFLib (with some caveats, see below). Some extensions to these are also implemented. Brute force means that, in all cases, simple forward chaining rules are used to extend (recursively) the incoming graph with all triples that the rule sets permit (ie, the "deductive closure" of the graph is computed).

An extra option controls whether the axiomatic triples are added to the graph (prior to the forward chaining step). These typically set the domain and range for properties or define some core classes. In the case of RDFS, the implementation uses only a 'finite' version of the axiomatic triples (as proposed, for example, by Herman ter Horst). This means that it adds only those rdf:_i type predicates that do appear in the original graph, thereby keeping this step finite. OWL 2 does not define axiomatic triples formally for OWL 2 RL, but they can be deduced from the OWL 2 RDF-Based Semantics document and are listed in Appendix 6 (though informally). Note, however, that this implementation adds only those triples that refer to OWL terms that are meaningful for the OWL 2 RL case.
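The forward chaining idea described above can be illustrated with a minimal, self-contained sketch. It does not use the package or RDFLib: triples are plain tuples of strings, and the two rule functions (subclass transitivity and type inheritance) as well as all names in the block are assumptions made for the illustration, not the package's actual rule set.

```python
# Illustrative sketch only: triples are tuples of strings, not RDFLib terms.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rule_subclass_transitivity(graph):
    """If A subClassOf B and B subClassOf C, infer A subClassOf C."""
    inferred = set()
    for (a, p1, b) in graph:
        if p1 != SUBCLASS:
            continue
        for (b2, p2, c) in graph:
            if p2 == SUBCLASS and b2 == b:
                inferred.add((a, SUBCLASS, c))
    return inferred


def rule_type_inheritance(graph):
    """If x type A and A subClassOf B, infer x type B."""
    inferred = set()
    for (x, p1, a) in graph:
        if p1 != TYPE:
            continue
        for (a2, p2, b) in graph:
            if p2 == SUBCLASS and a2 == a:
                inferred.add((x, TYPE, b))
    return inferred


def deductive_closure(graph, rules):
    """Apply every rule repeatedly until a full cycle adds no new triple."""
    closure = set(graph)
    while True:
        added = set()
        for rule in rules:
            added |= rule(closure) - closure
        if not added:
            return closure
        closure |= added


triples = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
}
closure = deductive_closure(
    triples, [rule_subclass_transitivity, rule_type_inheritance]
)
```

The loop stops because each cycle can only add triples built from terms already present in the graph, so the set of possible triples is finite.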

Package Entry Points

The main entry point to the package is via the DeductiveClosure class. This class should be initialized to control the parameters of the deductive closure; the forward chaining is done via the expand method. The simplest way to use the package from an RDFLib application is as follows:

graph = Graph()                                 # creation of an RDFLib graph
...
...                                             # normal RDFLib application, eg, parsing RDF data
...
DeductiveClosure(OWLRL_Semantics).expand(graph) # calculate an OWL 2 RL deductive closure of graph
                                                # without axiomatic triples

The first argument of the DeductiveClosure initialization can be replaced with other classes, providing different types of deductive closure; other arguments are also possible. For example:

DeductiveClosure(OWLRL_Extension, rdfs_closure = True, axiomatic_triples = True, datatype_axioms = True).expand(graph)

will calculate the deductive closure including RDFS and some extensions to OWL 2 RL, and with all possible axiomatic triples added to the graph (this is about the maximum the package can do…)

The same instance of DeductiveClosure can be used for several graph expansions. In other words, the expand function does not change any state.

For convenience, a second entry point to the package is provided in the form of a function called convert_graph, which expects an object holding various options, including a file name. The function parses the file, creates the expanded graph, and serializes the result into RDF/XML or Turtle. This function is particularly useful as an entry point for a CGI call (where the HTML form parameters can be collected into such an object) and is easy to use with a command line interface. The package distribution contains an example for both.

There are three major closure types (ie, semantic closure possibilities), namely RDFS Semantics, OWL 2 RL Semantics, and RDFS + OWL 2 RL Semantics; these can be controlled through the appropriate parameters of the DeductiveClosure class.

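In all three cases there are other dimensions that can control the exact closure being generated, for example whether axiomatic triples and datatype axioms are added (see the axiomatic_triples and datatype_axioms arguments above).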

Extensions

The three major entry points (ie, RDFS Semantics, OWL 2 RL Semantics, and RDFS + OWL 2 RL Semantics) represent clearly documented rule sets that correspond to various inference regimes defined by the RDFS and OWL 2 standards. They can also be viewed as incomplete implementations of a full OWL 2 specification following the RDF-based semantics (a.k.a. “OWL 2 Full”). While a simple forward chaining process cannot be used for a complete OWL 2 Full implementation, it is possible to add some features that, while not mandated by, say, the OWL 2 RL specification, are nevertheless useful and implementable. This can be done by providing a suitable subclass of the RDFS + OWL 2 RL Semantics class, adding, eg, to the set of rules that are implemented.

As an example, this package contains such an extension that can also be used by the entry points. The features implemented by this extension, ie, added to the core OWL 2 RL features are:

(There are some minor restrictions on the datatype restriction implementation; see the description of the corresponding module.)

When initializing this extension class, the user can control whether RDFS reasoning should also be used or not (default is False).

Some Technical/implementation aspects

The core processing is done in the Core class, which is subclassed by the RDFS and the OWL 2 RL classes (these two are then, in turn, subclassed by the RDFS + OWL 2 RL Semantics class). The core implements the core functionality of cycling through the rules, whereas the rules themselves are defined and implemented in the subclasses. There are also methods that are executed only once, either at the beginning or at the end of the full processing cycle. Adding axiomatic triples is handled separately, which allows finer user control over these features.

Literals must be handled separately. Indeed, the functionality relies on 'extended' RDF graphs, which allow literals to be in a subject position, too. Because RDFLib does not allow that, processing begins by exchanging all literals in the graph for bnodes (identical literals get the same associated bnode). Processing occurs on these bnodes; at the end of the process all these bnodes are replaced by their corresponding literals if possible (if the bnode occurs in a subject position, that triple is removed from the resulting graph). Details of this processing are handled in the separate Literals Proxies class.
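The literal-to-bnode exchange described above can be sketched as follows. This is a standalone illustration, not the package's Literals Proxies class: the Literal tuple, the "_:lit" proxy naming, and both function names are assumptions made for the example.

```python
from collections import namedtuple

# Stand-in for a typed literal; a real application would use RDFLib terms.
Literal = namedtuple("Literal", ["lexical", "datatype"])


def to_proxies(graph):
    """Replace every literal object by a bnode proxy.

    Identical literals share one proxy. Returns the rewritten graph and
    the literal -> proxy table needed to undo the exchange later.
    """
    literal_to_bnode = {}
    proxied = set()
    for s, p, o in graph:
        if isinstance(o, Literal):
            if o not in literal_to_bnode:
                literal_to_bnode[o] = "_:lit%d" % len(literal_to_bnode)
            o = literal_to_bnode[o]
        proxied.add((s, p, o))
    return proxied, literal_to_bnode


def from_proxies(graph, literal_to_bnode):
    """Put literals back; drop triples whose subject is a literal proxy."""
    bnode_to_literal = {b: lit for lit, b in literal_to_bnode.items()}
    restored = set()
    for s, p, o in graph:
        if s in bnode_to_literal:
            continue  # a literal in subject position cannot be serialized
        restored.add((s, p, bnode_to_literal.get(o, o)))
    return restored


age = Literal("42", "xsd:integer")
original = {("ex:a", "ex:age", age), ("ex:b", "ex:age", age)}

proxied, table = to_proxies(original)
# Simulate a rule that produced a triple with the proxy as subject.
proxied.add((table[age], "rdf:type", "xsd:integer"))
restored = from_proxies(proxied, table)
```

After the round trip, `restored` contains the two original triples again; the triple with the proxy in subject position has been dropped.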

The OWL specification includes references to datatypes that are not in the core RDFS specification and are consequently not directly implemented by RDFLib. These are added in a separate module of the package.

Problems with Literals with datatypes

The current distribution of RDFLib is fairly poor in handling datatypes, particularly in checking whether the lexical form of a literal is "proper" for its declared datatype. A typical example is:

 "-1234"^^xsd:nonNegativeInteger

which should not be accepted as a valid literal. Because the requirements of OWL 2 RL are much stricter in this respect, an alternative set of datatype handling (essentially, conversions) had to be implemented (see the XsdDatatypes module).
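As an illustration of the kind of lexical check meant here, the following standalone sketch validates the lexical form of an xsd:nonNegativeInteger. It is not the code of the XsdDatatypes module; the function names and the decision not to trim surrounding whitespace are assumptions of the example.

```python
import re

# Optional sign followed by ASCII digits only; \Z forbids a trailing newline.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+\Z")


def parse_non_negative_integer(lexical):
    """Return the integer value, or raise ValueError for an improper form."""
    if not _INTEGER_LEXICAL.match(lexical):
        raise ValueError("not an integer lexical form: %r" % lexical)
    value = int(lexical)
    if value < 0:
        raise ValueError("negative value for nonNegativeInteger: %r" % lexical)
    return value


def is_valid_non_negative_integer(lexical):
    """Boolean wrapper around parse_non_negative_integer."""
    try:
        parse_non_negative_integer(lexical)
        return True
    except ValueError:
        return False
```

With this check, "1234" is accepted while "-1234" and "12.5" are rejected; "-0" is accepted because its value is zero.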

The DeductiveClosure class has an additional instance variable that controls whether the default RDFLib conversion routines should be exchanged for the new ones. If this flag is set to True at instance creation (this is the default), then the conversion routines are set back to the originals once the expansion is complete, thereby avoiding any influence on older applications that may not work properly with the new set of conversion routines.
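The "exchange, then set back" behaviour described above follows a common save-and-restore pattern, sketched below in isolation. The CONVERTERS dictionary is a hypothetical stand-in for a library-wide table of conversion routines; it is not RDFLib's actual data structure, and the function names are assumptions of the example.

```python
from contextlib import contextmanager

# Hypothetical registry: datatype name -> conversion routine.
CONVERTERS = {"xsd:nonNegativeInteger": int}  # lenient default


def strict_non_negative_integer(lexical):
    """Stricter routine that rejects negative values."""
    value = int(lexical)
    if value < 0:
        raise ValueError("negative value: %r" % lexical)
    return value


@contextmanager
def improved_conversions(registry, replacements):
    """Install the replacements, then restore the originals on exit."""
    saved = dict(registry)
    registry.update(replacements)
    try:
        yield registry
    finally:
        # Restore even if the processing inside the block raised.
        registry.clear()
        registry.update(saved)


with improved_conversions(
    CONVERTERS, {"xsd:nonNegativeInteger": strict_non_negative_integer}
):
    strict_inside = (
        CONVERTERS["xsd:nonNegativeInteger"] is strict_non_negative_integer
    )

lenient_after = CONVERTERS["xsd:nonNegativeInteger"] is int
```

Restoring in a `finally` clause keeps other code that relies on the original routines unaffected, even when the expansion fails part-way.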

If the user wants to use these alternative lexical conversions everywhere in the application, then the use_improved_datatypes_conversions method can be invoked. That method changes the conversion routines and, from that point on, all usage of DeductiveClosure instances will use the improved conversion methods without resetting them. Ie, the code structure can be something like:

 DeductiveClosure(OWLRL_Semantics).use_improved_datatypes_conversions()
 ... RDFLib application
 DeductiveClosure(OWLRL_Semantics).expand(graph)
 ...

The default situation can be set back using the use_rdflib_datatypes_conversions call.

It is, however, not required to use these methods at all. Ie, the user can use:

 DeductiveClosure(OWLRL_Semantics, improved_datatypes=False).expand(graph)

which will result in a proper graph expansion, except for the datatype-specific comparisons, which will be incomplete.

Serializer bugs

During the development of the software, a number of small bugs in the RDFLib serializers were found. The alternative RDF/XML and Turtle serializers, originally developed for the RDFa distiller, have been added to this package, too.

The convert_graph entry point used, for example, by the CGI service, uses these serializers.

Turtle Parsing bug

Unfortunately, there are some bugs in the underlying Turtle parser used by RDFLib. All bugs are related to the way common datatypes can be abbreviated in Turtle. According to the latest Turtle grammar, the following abbreviations should happen

The current distribution includes a modified version of the RDFLib Turtle parser that takes care of the second and third item. Unfortunately, the abbreviated form of doubles seems to be missing in the core grammar of the parser and could not be handled. Ie, "1.2345E+12"^^xsd:double should be used instead.

The convert_graph entry point used, for example, by the CGI service, uses this parser.


Requires: RDFLib 2.2.2 or higher

License: This software is available for use under the W3C Software License

Organization: World Wide Web Consortium

Author: Ivan Herman

Version: 4.1

Contact: Ivan Herman, ivan@w3.org

Submodules

Classes
  DeductiveClosure
Entry point to generate the deductive closure of a graph.
Functions
  __parse_input(iformat, inp, graph)
Parse the input into the graph, possibly checking the suffix for the format.
  interpret_owl_imports(iformat, graph)
Interpret the owl import statements.
  return_closure_class(owl_closure, rdfs_closure, owl_extras, trimming=False)
Return the right semantic extension class based on three possible choices (this method is here to help potential users; the result can be fed into a DeductiveClosure instance at initialization). Returns: Class type.
  convert_graph(options)
Entry point for external scripts (CGI or command line) to parse one or more RDF files, possibly execute OWL and/or RDFS closures, and serialize back the result in some format.
Variables
  __author__ = 'Ivan Herman'
  __license__ = u'W3C® SOFTWARE NOTICE AND LICENSE, http://www.w...
  RDFXML = 'xml'
  TURTLE = 'turtle'
  AUTO = 'auto'
  NONE = 'none'
  RDF = 'rdf'
  RDFS = 'rdfs'
  OWL = 'owl'
  FULL = 'full'
  __package__ = 'RDFClosure'

Imports: StringIO, IntType, TypeType, BooleanType, CodeType, UnboundMethodType, StringType, BuiltinMethodType, FloatType, DictionaryType, NotImplementedType, BuiltinFunctionType, DictProxyType, GeneratorType, InstanceType, ObjectType, DictType, GetSetDescriptorType, FileType, EllipsisType, StringTypes, ListType, MethodType, TupleType, ModuleType, FrameType, LongType, BufferType, TracebackType, ClassType, MemberDescriptorType, UnicodeType, SliceType, ComplexType, LambdaType, FunctionType, XRangeType, NoneType, Graph, register, serializers, parsers, rdflibLiteral, DatatypeHandling, Closure, OWLRL_Extension, OWLRL_Extension_Trimming, OWLRL_Semantics, RDFS_Semantics, RDFS_OWLRL_Semantics, imports, AxiomaticTriples, CombinedClosure, Literals, OWLRL, OWLRLExtras, RDFSClosure, RestrictedDatatype, XsdDatatypes


Function Details

__parse_input(iformat, inp, graph)

Parse the input into the graph, possibly checking the suffix for the format.

Parameters:
  • iformat - input format; can be one of AUTO, TURTLE, or RDFXML. AUTO means that the suffix of the file name or URI will decide: '.ttl' means Turtle, RDF/XML otherwise.
  • inp - input file; anything that RDFLib accepts in that position (URI, file name, file object). If '-', standard input is used.
  • graph - the RDFLib Graph instance to parse into.
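The AUTO rule described in the iformat parameter can be sketched as a small helper. The constants reuse the values listed among the package variables; the function name guess_format is hypothetical, and treating every non-string input (such as a file object) as RDF/XML is an assumption of this sketch.

```python
# Values as listed among the package variables.
RDFXML = "xml"
TURTLE = "turtle"
AUTO = "auto"


def guess_format(iformat, inp):
    """Return the format to parse with, following the documented AUTO rule.

    An explicit format always wins. With AUTO, a name ending in '.ttl'
    means Turtle and anything else means RDF/XML.
    """
    if iformat != AUTO:
        return iformat
    if isinstance(inp, str) and inp.endswith(".ttl"):
        return TURTLE
    return RDFXML
```

For example, `guess_format(AUTO, "data.ttl")` yields the Turtle constant, whereas standard input given as '-' falls back to RDF/XML unless a format is stated explicitly.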

interpret_owl_imports(iformat, graph)

Interpret the owl import statements. Essentially, recursively merge with all the objects in the owl import statement, and remove the corresponding triples from the graph.

This method can be used by an application prior to expansion. It is not done by the DeductiveClosure class.

Parameters:
  • iformat - input format; can be one of AUTO, TURTLE, or RDFXML. AUTO means that the suffix of the file name or URI will decide: '.ttl' means Turtle, RDF/XML otherwise.
  • graph - the RDFLib Graph instance to parse into.
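The merge-and-remove behaviour described for this function can be sketched without RDFLib as follows. Triples are plain tuples, `fetch` is a hypothetical callable that returns the triples of an imported document, and the `seen` set that guards against circular imports is an assumption of this sketch rather than documented behaviour.

```python
IMPORTS = "owl:imports"


def merge_imports(graph, fetch):
    """Merge all transitively imported documents into graph (in place).

    Each owl:imports triple is removed and the triples of its object are
    added; this repeats until no import statement is left.
    """
    seen = set()
    while True:
        pending = [t for t in graph if t[1] == IMPORTS]
        if not pending:
            return graph
        for triple in pending:
            graph.discard(triple)
            uri = triple[2]
            if uri in seen:
                continue  # already merged; avoids looping on cycles
            seen.add(uri)
            graph.update(fetch(uri))


# Two toy documents that import each other.
DOCUMENTS = {
    "ex:b": {("ex:b", IMPORTS, "ex:c"), ("ex:B", "rdf:type", "owl:Class")},
    "ex:c": {("ex:c", IMPORTS, "ex:b"), ("ex:C", "rdf:type", "owl:Class")},
}

graph = {("ex:a", IMPORTS, "ex:b"), ("ex:A", "rdf:type", "owl:Class")}
merged = merge_imports(graph, lambda uri: DOCUMENTS.get(uri, set()))
```

The result holds the class declarations from all three documents and no owl:imports triples.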

return_closure_class(owl_closure, rdfs_closure, owl_extras, trimming=False)

Return the right semantic extension class based on three possible choices (this method is here to help potential users; the result can be fed into a DeductiveClosure instance at initialization).

Parameters:
  • owl_closure (boolean) - whether OWL 2 RL deductive closure should be calculated
  • rdfs_closure (boolean) - whether RDFS deductive closure should be calculated. In case owl_closure==True, this parameter should also be used in the initialization of DeductiveClosure
  • owl_extras - whether the extra possibilities (rational datatype, etc) should be added to an OWL 2 RL deductive closure. This parameter has no effect in case owl_closure==False.
  • trimming - whether extra trimming is done on the OWL RL + Extension output
Returns: Class type; a deductive class reference, or None
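One plausible reading of the parameter descriptions above is the following selection logic. It is a sketch, not the package source: the function name is hypothetical, and it returns the class names (taken from the package's import list) as strings so that it runs without the package installed.

```python
def pick_closure_class_name(owl_closure, rdfs_closure, owl_extras,
                            trimming=False):
    """Return the name of the closure class suggested by the flags, or None."""
    if owl_closure:
        if owl_extras:
            # rdfs_closure is then expected to be passed on to DeductiveClosure.
            if trimming:
                return "OWLRL_Extension_Trimming"
            return "OWLRL_Extension"
        if rdfs_closure:
            return "RDFS_OWLRL_Semantics"
        return "OWLRL_Semantics"
    if rdfs_closure:
        return "RDFS_Semantics"
    # Neither closure requested: owl_extras has no effect.
    return None
```

Under this reading, requesting no closure at all yields None, which matches the documented return value.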

convert_graph(options)

Entry point for external scripts (CGI or command line) to parse one or more RDF files, possibly execute OWL and/or RDFS closures, and serialize back the result in some format. Note that this entry point can also be used without requiring any entailment at all. Because both the input and the output format for the package can be RDF/XML or Turtle, such usage would simply mean a format conversion.

If OWL 2 RL processing is required, that also means that the owl:imports statements are interpreted. Ie, ontologies can be spread over several files. Note, however, that the output of the process would then include all imported ontologies, too.

Parameters:
  • options - object with specific attributes, namely:
    • options.sources: list of URIs or file names for the source data; for each one, if the name ends with 'ttl', it is considered to be Turtle, RDF/XML otherwise (this can be overridden by options.iformat, though)
    • options.text: direct Turtle encoding of a graph as a text string (useful, eg, for a CGI call using a text field)
    • options.owlClosure: can be yes or no
    • options.rdfsClosure: can be yes or no
    • options.owlExtras: can be yes or no; whether the extra rules beyond OWL 2 RL are used or not.
    • options.axioms: whether relevant axiomatic triples are added before chaining (can be a boolean, or the strings "yes" or "no")
    • options.daxioms: further datatype axiomatic triples are added to the output (can be a boolean, or the strings "yes" or "no")
    • options.format: output format, can be "turtle" or "rdfxml"
    • options.iformat: input format, can be "turtle", "rdfxml", or "auto". "auto" means that the suffix of the file is considered: '.ttl' is for turtle, rdfxml otherwise
    • options.trimming: whether the extension to OWLRL should also include trimming
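An options object with the attributes listed above could be assembled as in the following sketch. The Options class is a hypothetical plain attribute holder, and every default value in it is an assumption made for the example; the documentation above does not state the package's defaults. The call to convert_graph is left as a comment because it needs the package and a real input file.

```python
class Options(object):
    """Plain attribute holder mirroring the documented option names."""

    def __init__(self, **overrides):
        # Defaults below are assumptions of this sketch only.
        self.sources = []
        self.text = None
        self.owlClosure = "no"
        self.rdfsClosure = "no"
        self.owlExtras = "no"
        self.axioms = False
        self.daxioms = False
        self.format = "turtle"
        self.iformat = "auto"
        self.trimming = False
        for name, value in overrides.items():
            if not hasattr(self, name):
                raise AttributeError("unknown option: %s" % name)
            setattr(self, name, value)


options = Options(
    sources=["ontology.ttl"],  # hypothetical file name
    owlClosure="yes",
    axioms=True,
)

# With the package installed and the file present, the call would be:
#   from RDFClosure import convert_graph
#   convert_graph(options)
```

Rejecting unknown names early helps catch misspelled options, such as owlclosure instead of owlClosure, before any parsing starts.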

Variables Details

__license__

Value:
u'W3C® SOFTWARE NOTICE AND LICENSE, http://www.w3.org/Consortium/Legal\
/2002/copyright-software-20021231'