Ivan's Blog

Thu, 30 Nov, 2006

Book mashup, SW book list…

I regularly follow the updates on the “Books on Semantic Web” wiki page. I just found out that for most of the books a new entry has been added: a reference to the book mashup service of Chris Bizer & friends. It is really a great service, thanks Chris!

The only disadvantage is that the various RDF references are spread over the wiki page, so it is not easy to make, for example, SPARQL queries over the whole list of books. To make this easier I made a small Python program that simply collects all mashup data for the books on the wiki, merges them into one, and puts the result back on the Web as one RDF resource. The script is run once a day, so there might be a bit of delay when updating the book list. (I could have added it as a CGI script, and I may do it at some point; the problem is that collecting all the data takes quite a long time, so it is not ideal as a service…)
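To give an idea of the merging step, here is a minimal sketch; it is not the actual script. It assumes that the data of each book has already been fetched and parsed into (subject, predicate, object) tuples of URIs, and that there are no blank nodes, so that merging boils down to a set union. The function names and the example.org URIs are made up for the illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def merge_graphs(graphs: Iterable[Iterable[Triple]]) -> Set[Triple]:
    """Merge several graphs into one; duplicate triples collapse automatically."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged.update(graph)
    return merged


def to_ntriples(graph: Iterable[Triple]) -> str:
    """Serialize as N-Triples style lines, assuming every term is a URI."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))


# Hypothetical data for two books (the example.org URIs are made up).
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
book_a = [("http://example.org/book/a", DC_CREATOR, "http://example.org/person/x")]
book_b = [
    ("http://example.org/book/b", DC_CREATOR, "http://example.org/person/y"),
    # The same statement as in book_a; it appears only once after merging.
    ("http://example.org/book/a", DC_CREATOR, "http://example.org/person/x"),
]

merged = merge_graphs([book_a, book_b])
print(to_ntriples(merged))
```

With blank nodes in the data a plain union would not be enough (they would have to be kept apart per source), which is one reason to leave that part to an RDF library in practice.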

Category: /WorkRelated/SemanticWeb; Posted at: 21:52 UTC; Permalink

Wed, 29 Nov, 2006

Literals as subject in RDF and internationalization

I had one of those “aha” moments today. I am at the AC meeting of W3C and listened to a presentation on the Internationalization Tag Set work done at W3C. Essentially, the group tries to define a set of generic (XML) attributes and elements relevant and important for internationalization. One of the possible examples is the proper handling of right-to-left scripts in left-to-right ones (like Arabic or Hebrew put in an English text): how should one ensure that the text: “פעילות הבינאום, W3C” is properly displayed (i.e., with the ‘W3C’ at the left of the Hebrew characters). It so happens that this requires an extra attribute (called ‘dir’ in HTML). HTML happens to have this attribute, but what the ITS group is working on is to define a generic set that could be used in other XML dialects, too. Another example is an attribute to denote whether a specific text should be translated or not.

The question is: what about RDF literals? How can one ensure that the right set of information is set on literals? At the moment we have only the language attribute, which is necessary, but may not be enough. And my “aha” moment was: if we did not have the restriction in the RDF model which says that literals cannot be subjects of triples, then this would be way easier. The current terminologies used in ITS could have a variant in the form of an RDF vocabulary and be used to characterize literals. This may not cover all internationalization issues (like Ruby markup, for example), but may be o.k. for most.
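To make the idea a bit more concrete, here is a small sketch of a toy triple model in which a literal may appear as subject. This is purely an illustration: the classes are made up, and the ITS namespace and property names below are hypothetical (ITS does not define an RDF vocabulary).

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    text: str
    lang: Optional[str] = None


Term = Union[URI, Literal]


@dataclass(frozen=True)
class Triple:
    subject: Term    # the RDF model as it stands would not allow a Literal here
    predicate: URI
    obj: Term


# Hypothetical namespace and properties, loosely modelled on the ITS data categories.
ITS = "http://example.org/its#"

label = Literal("פעילות הבינאום, W3C", lang="he")
statements = [
    Triple(label, URI(ITS + "dir"), Literal("rtl")),
    Triple(label, URI(ITS + "translate"), Literal("no")),
]

for statement in statements:
    print(statement)
```

The point is only that the directionality and the translation information become ordinary statements about the literal itself, next to the language tag it already carries.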

Of course, one has to be very, very cautious in changing the RDF model and semantics, i.e., there might be some hidden mines here. But the internationalization issue may certainly be a good use case for looking at this restriction again…

(There is a really nice overview on the bidirectional issue above on the I18N site of W3C, if you are interested.)

Category: /WorkRelated/SemanticWeb; Posted at: 06:06 UTC; Permalink

Thu, 23 Nov, 2006

Tabulate bookmarklet

As a typical example of programming by example, I have copied and changed the foaf explorer bookmarklet to make a tabulator bookmarklet. It looks for a link header element referring to application/rdf+xml, and sends the corresponding URI to the (latest version of the) tabulator. The tabulator will automatically load that RDF file, instead of starting up with its default values. Nothing fancy, but maybe useful. (Caveat: if there are several such link entries, the bookmarklet finds the first one only…)

(I wanted to put the bookmarklet itself, ie, the javascript: URI into this blog directly, but blosxom had problems displaying it. Oh well…)
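Since the bookmarklet itself cannot be shown here, the same logic can be sketched in Python. This is not the bookmarklet: the tabulator address is a placeholder and the ‘uri’ query parameter is an assumption made for the example. Like the bookmarklet, it picks up the first matching link element only.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import quote, urljoin

# Placeholder address and parameter name; both are assumptions for this sketch.
TABULATOR = "http://example.org/tabulator/"


class RDFLinkFinder(HTMLParser):
    """Remember the href of the first <link> typed application/rdf+xml."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        media_type = (attributes.get("type") or "").lower()
        if media_type == "application/rdf+xml" and attributes.get("href"):
            self.href = attributes["href"]


def tabulator_url(page_uri: str, html: str) -> Optional[str]:
    """Return the URI that hands the page's RDF file to the tabulator, if any."""
    finder = RDFLinkFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    rdf_uri = urljoin(page_uri, finder.href)  # resolve relative references
    return TABULATOR + "?uri=" + quote(rdf_uri, safe="")


page = (
    '<html><head>'
    '<link rel="meta" type="application/rdf+xml" href="foaf.rdf"/>'
    '<link rel="alternate" type="application/rdf+xml" href="other.rdf"/>'
    '</head><body></body></html>'
)
print(tabulator_url("http://example.org/people/ivan/", page))
```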

Category: /WorkRelated/SemanticWeb; Posted at: 15:49 UTC; Permalink

Fri, 10 Nov, 2006

ISWC Conference (ISWC Day 5)

Great keynote from Rudi Studer. An essential part of his keynote was the relationships between the Semantic Web and other communities (KR, Database, Software Engineering, Natural Language Processing, Machine Learning). One aspect that I really appreciated in the keynote is that Rudi did not use his keynote on what he (or his team) did and does, but gave an overall view instead (I have seen too many of those self-centric keynotes…). My only respectful criticism is that the Web aspect was a bit missing. I think it was Chris Welty who said somewhere that the new thing in the Semantic Web is the Web; very often, when analysing the relationships, Rudi did not really make it clear what the Semantic Web’s relationship is to a specific area as opposed to Semantic’s (ie, essentially, knowledge representation’s) relationships.

The most interesting part for me was what he said about the relationship to the Database community. Some new things I will have to learn, that is for sure (eg, he referred to the term “dataspaces”, which seems to be the new buzzword in that community). And, clearly, there is some extra outreach to be done in that space; I just read the blogs of Ian and Danny on the keynote of Mårten Mickos at the Web 2.0 Summit, which does not look very good. We already had hallway discussions on trying to organize an event around the relationship between SW and databases in 2007 (eg, at W3C); maybe the topic should be a bit larger than I originally thought. To be followed up…

I had to give a talk in the next session, and I spent the rest of the day chatting with different people; ie, no more sessions. But I went to the closing ceremony. I was pleased to see the paper on DartGrid (see my previous blog) win the best paper prize in the use cases category. And it was also a pleasure to see the team of Amsterdam win the Semantic Web challenge with their MultimediaN E-Culture demonstrator. What I particularly liked in their demonstration, beyond the technical content, was the beautifully designed user interface with nice, pastel colours and a clear, clean, and non-cluttered screen. On the other hand, I can only repeat what I said elsewhere: I had to come to the other side of the globe to learn what people are doing about 20 meters away from my office at CWI (and I also pass the Vrije Universiteit on the tramway every time I go to the office). Shame on me. Something should be done about it…

It was a great, albeit exhausting week (I also took part in a face-to-face meeting of the Rule Interchange Format Working Group last week). And I have a bunch of things to read in the proceedings still…

Category: /WorkRelated/SemanticWeb; Posted at: 11:50 UTC; Permalink

Thu, 09 Nov, 2006

ISWC Conference (ISWC Day 4)

Only two papers really stuck with me on the second day; I missed a bunch of other sessions, sadly…

The paper of Motik et al.[1] is one of the papers which explore the connections of Description Logic and Rules. Obviously, I will have to read the details in the proceedings, but my first understanding is that this is also based on rules, with the DL side being some sort of a black box. The only unusual aspect is the introduction of what they call “autoepistemic extension of DLs”. I must admit, this is the first time I hear about this thing, so I still have to understand it. But all this should be interesting. The paper is theoretical for now; my understanding is that there is no implementation yet, though somehow the feeling I got from the presentation is that the authors have a clear idea on how to implement it, so it should be a matter of time only.

The paper on DartGrid[2] is impressive. These guys did a huge amount of work that covers a large palette of achievements: a system to map relational database content to RDF, with a nice user interface to control the details of that mapping; a query rewriter from SPARQL to the necessary SQL queries using those mappings; a semantic query system (backed up by ontologies) to search through those databases; user interfaces to generate queries, etc. And an application combining ca. 50 databases in one coherent system, with each database containing between 70,000 and 100,000 records. This application is deployed in the Chinese Academy of Traditional Medicine and is used by researchers there. Another application on neural research is under development at Yale. (By the way, the home page of the DartGrid project is in English, so you can look it up…) The paper was part of a session on eScience use cases; the other papers were also pretty nice!
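For those who have not seen a relational-to-RDF mapping before, the basic idea can be sketched in a few lines. This is only a generic illustration and not how DartGrid works: the table, the namespace, and the column-to-property mapping are all made up, and each row simply becomes a subject with one triple per mapped column.

```python
import sqlite3

BASE = "http://example.org/tcm/"     # made-up namespace
MAPPING = {                          # column name -> property URI (made up)
    "name": BASE + "name",
    "effect": BASE + "effect",
}


def rows_to_triples(conn, table, key, mapping):
    """Yield (subject, predicate, value) for every mapped, non-NULL cell.

    Table and column names are trusted identifiers in this sketch; they are
    put into the SQL text directly.
    """
    columns = [key] + list(mapping)
    query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY {key}"
    for row in conn.execute(query):
        subject = f"{BASE}{table}/{row[0]}"   # the primary key names the resource
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                yield (subject, mapping[column], value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE herb (id INTEGER PRIMARY KEY, name TEXT, effect TEXT)")
conn.executemany(
    "INSERT INTO herb VALUES (?, ?, ?)",
    [(1, "ginseng", "tonic"), (2, "ginger", None)],
)

triples = list(rows_to_triples(conn, "herb", "id", MAPPING))
for triple in triples:
    print(triple)
```

A query rewriter has to go in the opposite direction, using the same kind of mapping to turn a pattern over properties into a SELECT over the right tables and columns; that is the hard part, and it is not attempted here.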

There was also a panel on Web 2.0 in the evening. It was not really controversial and the goal was really to see how the Semantic Web could, in some ways, help the further development of Web 2.0. I liked Benjamin Grossof’s characterization, who said something like “it is better to be the child riding the elephant than the ant crushed by it”…

  1. “Can OWL and Logic Programming Live Together Happily Ever After?”, by Boris Motik, Ian Horrocks, Riccardo Rosati, Ulrike Sattler
  2. “From Legacy Relational Databases to the Semantic Web: an In-Use Application for Traditional Chinese Medicine”, by Chunyin Zhou, Yimin Wang, Heng Wang, Jinmin Tang, Zhaohui Wu, Ainin Yin, Huajun Chen, Yuxin Mao, Meng Cui
Category: /WorkRelated/SemanticWeb; Posted at: 12:38 UTC; Permalink

Wed, 08 Nov, 2006

ISWC Conference (ISWC Day 3)

The problem with such a conference is that one spends as much time talking to other people as listening to the presentations. Great hallway talks, technical discussion, work… it is all good and important, but I always feel a bit guilty when I miss a session. Oh well…

I quite liked Tom Gruber’s keynote on “Social Web”. Tom tried to avoid the controversial Web 2.0 term and talked rather of the collective intelligence of folksonomies, tagging, blogging, etc. It was good to hear a talk that avoids the unnecessary controversy on the relationship between Web 2.0 and, say, the Semantic Web. Tom also talked about an attempt to give a more coherent ontological model for tagging, though it seems that this work is stalled for lack of people to work on it (see also an earlier blog he had on this for some more details). Would be good to pick this up…

The survey of the Web Ontology Landscape[1] was interesting. They survey a bunch of OWL and RDFS files, trying to characterize them in terms of what level of OWL they use, what the frequencies of usage of various facilities are, etc. Although there were some criticisms on whether the sample they use was fully o.k., the conclusions were still interesting. Two things stuck with me: that a large number of OWL ontologies use a very “light” level of functionalities (which leads to the issue of having light ontologies and how important those are); and that a number of ontologies “slip” into OWL Full out of OWL DL due to some very small additional features they use. Bijan told me afterwards that if OWL1.1 were used, then those ontologies would remain OWL DL, actually, which is interesting.

I also quite liked the presentation of Fresnel[2], a language to express how one wants to see RDF data (a distant analogy is what CSS is to HTML). The demonstrations shown by Emmanuel were really nice, and one would hope that more tools were at least testing this thing. It is really promising. I talked to Emmanuel later, by the way: it seems that IsaViz is one of the tools that does use this, though not the “stable” version at W3C. He said he would make a new stable version with Fresnel soon; we can then announce it on the SW Activity News page.

The same session included a presentation on /facet[3], a faceted user interface to RDF data. It looks really nice, though one can only really get a “feel” for a tool like that by trying it. So I hope to install it on my machine soon to play with it (it requires SWI-prolog; well, that should not be a big problem…). The only slight caveat with this project: I had to come to the other side of the globe to learn what Michiel & Co. are doing, although their office is around 20 meters away from me at CWI…

  1. “A Survey of the Web Ontology Landscape”, by Taowei Wang, Bijan Parsia, Jim Hendler
  2. “Fresnel: A Browser-Independent Presentation Vocabulary for RDF”, by Christian Bizer, Emmanuel Pietriga, David Karger, Ryan Lee
  3. “/facet: A Browser for Heterogeneous Semantic Web Repositories”, by Michiel Hildebrand, Jacco van Ossenbruggen, Lynda Hardman
Category: /WorkRelated/SemanticWeb; Posted at: 12:35 UTC; Permalink

Tue, 07 Nov, 2006

Beijing photos

Summer palace, Beijing

Having spent some vacation time in Beijing lately, I also took photos, of course. I have finally worked through all of them and have put a selection of those on the Web (mixed with older photos that I took on previous occasions).

Actually, I also play with the Picasa Web Album site, and I have the same album on my site there. I quite like the Picasa program that I have been using to organize my photos for a while. I also quite like the Picasa Web Album, and I would love to use that site to store my photos but… there is a limitation on the amount of space that one can use for free, and to buy a larger space one has to be in the US. No kidding: if you are not in the US (more exactly, if you do not have a US credit card, I guess), then you cannot buy more disk space. This parochialism is really offensive.

Category: /Private/General; Posted at: 13:29 UTC; Permalink

Workshop on SW in Health Care and Life Sciences (ISWC Day 2)

Part of the Workshop was, in fact, a report on what the W3C Interest Group does. Being part of that one, it was not really new to me, but I hope that it was new to the audience… I was impressed, by the way, by the high turnout. This is really good!

But there were also “non-IG” presentations, although I was around only during the morning, so I cannot really comment on all of them. I think the one which impressed me the most came from the Center for Disease Control, in Atlanta, on how they collect data from all over the world for the purpose of global disease surveillance, ie, information on diseases that go beyond national boundaries (SARS, malaria, bird flu, etc). They collect very heterogeneous data and they use RDF based technologies to combine those and present them to the user. (See an example interface of what they present.) Impressive stuff. Mashup on a giant scale.

One question did come up, actually, during question time (with them and others): it is nice to see an application pattern whereby very heterogeneous data are integrated via RDF, and then sophisticated search interfaces are offered to the user to query the data. However, it is very important that the core RDF data should be reachable and shared as raw data. Ie, if other people want to use that data directly, it should not be “hidden” behind beautiful SPARQL engine interfaces… Moreover, the links to the RDF data should be easy to find for everyone to use. The Semantic Web is (also) about sharing data…

In the afternoon I went to the panel on Interaction Design Grand Challenges and the Semantic Web. The panelists (TimBL, Jim Hendler, Nigel Shadbolt, David Karger) had lots of interesting (and sometimes fun) things to say on the issue of interaction with Semantic Web Data (or user interaction in general), on the role (or the lack of role) of ontologies, etc. Nigel had an interesting slide which showed the graphical representation of a pretty large ontology, but it also turned out that, in practice, users use only a very small part of that ontology. The whole issue of “shallow” ontologies came up several times, something definitely to follow up.

However, I must admit I was a bit disappointed, though I also realize that this was my fault. The panel being sponsored by the new WSRI, and all the panelists being somehow involved with this stuff, I was hoping to hear more about some more general, WSRI related ideas and plans (personally, I am mostly interested in the question whether and how WSRI will look at issues like the effect of the Web on societies at large, a new type of web-illiteracy, etc). But, as I said, it was my fault because it was a false expectation; after all, this panel was indeed part of a Semantic Web User Interface Workshop… Anyway. I think I will have the opportunity to hear about those issues, too.

Category: /WorkRelated/SemanticWeb; Posted at: 12:21 UTC; Permalink

Mon, 06 Nov, 2006

Workshop on Uncertainty Reasoning on the SW (ISWC Day 1)

I had the pleasure of participating in the Uncertainty Reasoning for the Semantic Web (URSW) Workshop today in Athens, GA, right before ISWC. There were quite a number of nice presentations although, I must say, I missed some more practical application examples; most of the papers were quite theoretical. But well, I guess that phase is also important. I must admit I still have to read through the papers themselves to grasp all details, though.

I have already written about a similar subject some time ago, where I referred to some work on extending SHOIN(D) towards fuzzy logic. The drawback of the approaches I knew about is that they all needed the development of a new type of reasoner, i.e., fuzzy DL engines. And that is really an obstacle for their acceptance. In this respect it was interesting to see one of the papers[1] that aims at overcoming this drawback by attempting a mapping of a fuzzy SHOIN(D) to traditional (“crisp”) DL, which means that, though the user can still express his/her statements using fuzzy DL, after the translation a traditional DL engine can be used. The drawback seems to be that the number of generated crisp axioms is pretty high (sometimes quadratic). But it is certainly a direction to explore.

Another (position) paper that caught my interest was a paper on “Probabilistic DLP”[2]. The Description Logic Programming approach has become pretty well known in the past few years: essentially, the way to combine logic programming with DL is to treat a DL knowledge base as a black box, and to ask “questions” of a DL reasoner while evaluating your LP rules. It is a neat separation of two different views of the World. Calì and Lukasiewicz[2] have put an additional twist on this by allowing probabilistic values to be assigned to the logic programming part (the “rules”) while leaving the DL part unchanged, ie, “crisp”. It is a bit like the previous paper: you leave your DL reasoner intact for efficiency. I must say I missed some real-life examples to see that this approach is really useful, ie, that you do not have to mix probabilistic reasoning into the DL World.

Nickles and Cobos[3] extend DL (ie, OWL) by adding so-called social contexts. If you have statements like Hero(Columbus), you can express things like “Tina asserts to Tim and Tom that Hero(Columbus)”, whereas “Tim and Tom assert to Tina that Exploiter(Columbus) and ¬Hero(Columbus)” (it seems that the AI community has used similar constructs for a while; I must admit it is new to me). Beyond the familiar ABox and TBox, one now has a set of SBox-s to express this “context” information. It may actually be quite interesting to see where this would lead: when talking to outsiders about the Semantic Web, one of the issues that does come up is how to describe the provenance of RDF assertions, what their value is in a given context, etc. This type of work may lead in that direction. Again, I would have liked to see a real-World example… (they also have a Web site for the project).

Last but certainly not least, it is worth bookmarking the PR-OWL site. Not only does it refer to a line of research on combining probabilistic reasoning with OWL (that is what “PR-OWL” stands for), but the site also has a number of good references to what other people have done in the area. Such resources are really good to see and have…

The last session of the Workshop was actually very interesting. The organisers (Paulo da Costa, Kathryn and Ken Laskey) clearly wanted to move towards more practical goals, trying to get out of the ivory tower of academic research. So the audience was divided into small break-out groups to draft variants of real use cases in using some form of uncertainty reasoning. A number of those were drafted, from bioinformatics to wine ontologies. The idea is to propose a W3C Incubator Group where such use cases, with categorization and sketches for solutions, could be collected more systematically, and the use cases drafted at the Workshop would be the first, starting set. The goal is that, eventually, the Incubator Group could provide valuable input to various other W3C groups (like the Rule Interchange Format or Deployment Working Groups) that could address this whole area in conjunction with their respective work. That may be very interesting indeed. One more area to watch…

It was a good day. Looking forward to the rest of the week!

  1. “Crisp Representation for Fuzzy SHOIN with Fuzzy Nominals and General Concept Inclusions”, by F. Bobillo, M. Delgado, and J. Gómez-Romero.
  2. “An Approach to Probabilistic Data Integration for the Semantic Web”, by A. Calì and T. Lukasiewicz
  3. “Social Contexts and the Probabilistic Fusion and Ranking of Opinions: Towards a Social Semantics for the Semantic Web”, by M. Nickles and R. Cobos
  4. “Probabilistic Ontologies for Efficient Resource Sharing in Semantic Web Services”, by P.C.G. da Costa, K.B. Laskey, and K.J. Laskey
Category: /WorkRelated/SemanticWeb; Posted at: 00:17 UTC; Permalink

Thu, 02 Nov, 2006

Girl with Pearl Earring

Reproduction of Vermeer's 'girl with a pearl earring' painting

I had the pleasure, yesterday evening, to see the movie “Girl With a Pearl Earring” (courtesy of the Belgian TV…). The movie itself is, well, a fiction; I think I actually preferred the original novel by T. Chevalier. What captivated me was not really the story but the beautiful work done by the cameraman. Some of the shots were really like the paintings of J. Vermeer in terms of light, colour, etc.

And the movie reminded me again how absolutely beautiful the original painting of Vermeer is. I remember being so captivated by it at my last visit in the Hague at the Mauritshuis that I sat down on a couch facing the painting and I just stayed there for a very very long time, completely absorbed by the view. It certainly is, at least for me, one of the highlights of classical Flemish/Dutch painting, and of European painting in general. It was good to see such a movie and be reminded again…

 

Category: /Private/General; Posted at: 07:53 UTC; Permalink

