Ivan's Blog

Wed, 29 Nov, 2006

Literals as subject in RDF and internationalization

I had one of those “aha” moments today. I am at the W3C AC meeting and listened to a presentation on the Internationalization Tag Set (ITS) work done at W3C. Essentially, the group is trying to define a set of generic XML attributes and elements that matter for internationalization. One example is the proper handling of right-to-left scripts embedded in left-to-right text (like Arabic or Hebrew inside an English sentence): how does one ensure that the text “פעילות הבינאום, W3C” is displayed properly (i.e., with the ‘W3C’ to the left of the Hebrew characters)? It so happens that this requires an extra attribute. HTML already has it (it is called ‘dir’), but what the ITS group is working on is a generic set that could be used in other XML dialects, too. Another example is an attribute that denotes whether a specific text should be translated or not.
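To see why the example string needs extra direction information, here is a minimal Python sketch that looks at the Unicode bidirectional class of each character. The helper name `strong_directions` is my own, and "mixes both strong directions" is only a rough heuristic for "layout depends on the base direction", not a full implementation of the Unicode bidirectional algorithm.

```python
import unicodedata

# The example string from the paragraph above.
text = "פעילות הבינאום, W3C"

def strong_directions(s):
    """Return the strong bidi classes (L, R, AL) that occur in s.

    Hypothetical helper for illustration only.
    """
    strong = {"L", "R", "AL"}
    return {unicodedata.bidirectional(ch) for ch in s} & strong

classes = strong_directions(text)

# Heuristic: when both left-to-right and right-to-left strong characters
# are present, the placement of the neutral characters between them (the
# comma and the spaces) depends on the base direction, which is the kind
# of information an attribute like 'dir' supplies.
needs_base_direction = len(classes) > 1
print(sorted(classes), needs_base_direction)
```

A pure Latin string such as "W3C" yields only the class `L`, so the question of a base direction does not come up in the same way.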

The question is: what about RDF literals? How can one ensure that the right set of information is attached to literals? At the moment we have only the language attribute, which is necessary, but may not be enough. And my “aha” moment was: if the RDF model did not have the restriction that literals cannot be subjects of triples, this would be much easier. The terminology currently used in ITS could have a variant in the form of an RDF vocabulary and be used to characterize literals. This may not cover all internationalization issues (Ruby markup, for example), but may be o.k. for most.
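To make the idea concrete, here is a small Python sketch of a toy triple model. Everything in it is an assumption made for illustration: the classes, the `is_valid_rdf` check (which ignores blank nodes), and above all the `http://example.org/its#` vocabulary with its `dir` and `translate` properties, which is hypothetical and not something the ITS group has defined.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class IRI:
    value: str

@dataclass(frozen=True)
class Literal:
    lexical: str
    lang: Optional[str] = None  # the only i18n information RDF carries today

Term = Union[IRI, Literal]

@dataclass(frozen=True)
class Triple:
    subject: Term
    predicate: IRI
    obj: Term

def is_valid_rdf(triple: Triple) -> bool:
    """Current RDF restriction (simplified): a literal cannot be a subject."""
    return not isinstance(triple.subject, Literal)

# Hypothetical RDF variant of the ITS terminology; not a real vocabulary.
ITS = "http://example.org/its#"

doc = IRI("http://example.org/doc")
dc_title = IRI("http://purl.org/dc/elements/1.1/title")
title = Literal("פעילות הבינאום, W3C", lang="he")

triples = [
    Triple(doc, dc_title, title),                           # fine today
    Triple(title, IRI(ITS + "dir"), Literal("rtl")),        # literal as subject
    Triple(title, IRI(ITS + "translate"), Literal("no")),   # literal as subject
]

# The two statements that describe the literal itself are exactly the
# ones the current model rejects.
rejected = [t for t in triples if not is_valid_rdf(t)]
print(len(rejected))
```

Under the present rules one would have to route these statements through some intermediate resource instead; lifting the restriction would let them be said about the literal directly.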

Of course, one has to be very, very cautious about changing the RDF model and semantics, i.e., there might be some hidden mines here. But the internationalization issue may certainly be a good use case for looking at this restriction again…

(There is a really nice overview of the bidirectional issue mentioned above on the I18N site of W3C, if you are interested.)

Category: /WorkRelated/SemanticWeb; Posted at: 06:06 UTC; Permalink

