Archive for the ‘Semantic web’ Category

I’m on the Semantic Web! Pt. 2

June 1st, 2010

Ok. So in the last post I was talking about how I created an RDF graph to describe myself and found that I’d entered into a huge rambling geekfest about the design of the FOAF vocabulary. So, I decided to cut all of that out and post it here separately. For context, it follows directly from having created this RDF document. If you’re interested, read on…

From that experience of creating my own RDF graph, I had only one hiccup: using the FOAF vocabulary, while it is relatively simple to define a group (such as the company which employs you) and list its members (in that case, staff), it seems impossible to do it the other way around. Essentially, you cannot say “I work for OpenText” but can say “OpenText employs me”. I do understand why this is, though: it is fairly standard for predicates to assume a “has” relationship, not an “is” one (#me has foaf:name Chris, not #me is foaf:name Chris), and standards are essential for Linked Data to work.

You may think that the problem described above sounds pretty irrelevant (you may be right: read on), so let me run through my thought process:

Imagine two graphs, one describing me and one describing OpenText. In the OpenText graph there is a list of employees (as there is on Freebase) which includes a reference to my graph. You could, then, search (for the purpose of an example) for the weblogs of OpenText employees fairly successfully. If, however, you were using my graph, you couldn’t find a list of colleagues of mine because I couldn’t add “Chris is employed by OpenText” to the graph and, hence, the two could not be connected.
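To make the asymmetry concrete, here is a minimal sketch in plain Python, with each graph modelled as a set of (subject, predicate, object) tuples. The OpenText and colleague URIs, and the weblog addresses, are made up for illustration; only my own URI and the foaf: predicates come from the post.

```python
# Toy model: a graph is a set of (subject, predicate, object) tuples.
# Every URI below except ME is a hypothetical placeholder.
ME = "http://chrisscott.org/about/card#me"
OPENTEXT = "http://example.org/opentext#org"        # hypothetical
COLLEAGUE = "http://example.org/people/alice#me"    # hypothetical

# The organization's graph lists its members ("OpenText has member ...").
opentext_graph = {
    (OPENTEXT, "foaf:member", ME),
    (OPENTEXT, "foaf:member", COLLEAGUE),
    (COLLEAGUE, "foaf:weblog", "http://example.org/alice/blog"),
}

# My own graph can describe me, but has no FOAF predicate pointing
# from me to my employer, so nothing links it to OpenText.
my_graph = {
    (ME, "foaf:name", "Chris"),
    (ME, "foaf:weblog", "http://example.org/chris/blog"),  # hypothetical
}


def colleagues_of(person, graph):
    """Return everyone who shares a foaf:member group with `person`."""
    groups = {s for (s, p, o) in graph if p == "foaf:member" and o == person}
    return {
        o
        for (s, p, o) in graph
        if p == "foaf:member" and s in groups and o != person
    }


print(colleagues_of(ME, opentext_graph))  # finds the colleague
print(colleagues_of(ME, my_graph))        # finds nobody
```

Starting from the OpenText graph the query works; starting from mine it comes back empty, which is exactly the hiccup described above.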

Someone obviously agreed with that assessment as I discovered the RoleVocab vocabulary on the FOAF wiki. I used that vocab in my personal profile document to assert that “Chris has a role in the organization OpenText”.

With hindsight, I think that might have been a mistake, though. My mind-frame was trapped in the resource – the me. Perhaps I should have been thinking about the whole RDF graph. Why couldn’t I include a separate resource about OpenText which only included my employment? Well, because the domain of the foaf:member property is foaf:Group and the foaf:Organization type is a direct subclass of foaf:Agent. Essentially, the FOAF vocabulary is saying that you can only be a member of a group and not an organization. Personally, I think that the most semantically correct way around this issue would be to make foaf:Organization a subclass of foaf:Group or, failing that, foaf:Organization could be added as a second rdfs:domain property of foaf:member… I may make the suggestion.
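The domain problem can be sketched in a few lines of Python. This is only an illustration: it treats rdfs:domain as a validation rule, whereas RDFS strictly uses it for inference (a reasoner would conclude that the subject is a foaf:Group rather than reject the statement). The class hierarchy is the one described above; the function names are my own.

```python
# Class hierarchy as described in the post: Group, Organization and
# Person are each direct subclasses of foaf:Agent.
SUBCLASS_OF = {
    "foaf:Group": "foaf:Agent",
    "foaf:Organization": "foaf:Agent",
    "foaf:Person": "foaf:Agent",
}

# The declared domain of foaf:member.
DOMAIN = {"foaf:member": "foaf:Group"}


def is_a(cls, ancestor, subclass_of):
    """Walk up the subclass chain looking for `ancestor`."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False


def fits_domain(subject_class, predicate, subclass_of=SUBCLASS_OF):
    """Assumption: read rdfs:domain as a check on the subject's class."""
    return is_a(subject_class, DOMAIN[predicate], subclass_of)


print(fits_domain("foaf:Group", "foaf:member"))         # True
print(fits_domain("foaf:Organization", "foaf:member"))  # False

# The first suggestion: make foaf:Organization a subclass of foaf:Group.
proposed = {**SUBCLASS_OF, "foaf:Organization": "foaf:Group"}
print(fits_domain("foaf:Organization", "foaf:member", proposed))  # True
```

Under the proposed hierarchy an organization is still a foaf:Agent, just via foaf:Group, so nothing that already relies on it being an agent should break.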

In the meantime, I’ve also added an OWL Object Property to the top of my RDF document which describes the predicate “employee”, as in “OpenText has employee Chris”.

So: apologies for the geeky and rambling post and please let me know your thoughts on the whole “Group has member Person”/“Person participates in Group” conundrum.

Author: admin Categories: Semantic web Tags:

I’m on the Semantic Web!

June 1st, 2010

That’s right. About a fortnight ago I decided it was about time to practice what I preach (well, specifically what I was due to preach at last week’s excellent ePublishing Innovation Forum) and get myself onto the Semantic Web. For those new to the concept of the Semantic Web, I’m talking about creating an RDF graph which includes a resource describing me.

So, without further ado, here I am:

http://chrisscott.org/about/card#me

The document at the end of that link is a FOAF Personal Profile Document. As you can see, the URI above includes the fragment “me”. This is a fairly important part of the Linked Data concept: the URI stays dereferenceable (one of the axioms), because the part before the fragment is the address of a real document, whilst the fragment identifies a resource, “me”, which can be used to link the graph to others. So, if you are curious, take a look at my personal profile and check out the “me” resource – it’s pretty simplistic but a good starting point.
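If the role of the fragment is unclear, this small Python sketch (standard library only) splits the URI into the document a client would actually fetch and the name of the resource described inside it:

```python
from urllib.parse import urldefrag

resource_uri = "http://chrisscott.org/about/card#me"

# urldefrag separates the fetchable document URL from the fragment.
document_url, fragment = urldefrag(resource_uri)

print(document_url)  # http://chrisscott.org/about/card
print(fragment)      # me
```

A client requests the document URL (the fragment is never sent to the server) and then looks inside the returned RDF for the resource named by the full URI.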

So, how did I go about creating my personal profile on the Semantic Web? Well, I started with a step I urge everyone to do: I signed up to the Opera community. You can do the same here. Once you’ve done that you can go to your profile and click on the “FOAF” link on the right-hand side of the footer:

My profile page in the Opera community.

That’s the quickest and easiest way to get yourself represented on the Semantic Web but, for me, Opera do not give you enough control. For example, I cannot use the foaf:weblog predicate to point to this blog, only the one which Opera host for me (that said, they do support the rdfs:seeAlso predicate so my private personal profile is referenced by my Opera one). For that reason, I took the XML generated for my Opera community profile, tweaked it a bit and uploaded it onto this domain.

Give it a go! I’d love to hear how people get on…

NB: I ended up going on a bit in the draft of this post about the FOAF vocab design and got a bit technical, so I’ve separated that content off into this post.

Author: admin Categories: Semantic web Tags:

Semantic Web? What’s in it for me?

June 9th, 2009

There’s no doubt about it: the Semantic Web is the hottest thing in the on-line industry at the moment. It’s all over the web, on the speaker circuits, in multitudes of product labs. On-line publishers are being told again and again that they need to get their content into RDF triplets and create linked data. One of the questions they should be asking is: why?

This article assumes some knowledge of RDF, although it does not go into technical details. There are many good sites which introduce RDF, including RDF: about, which I recommend reading.

Okay, so some of the reasons why are obvious. Tim Berners-Lee’s vision of linked data tying the WWW together has inarguable and massive benefits. The potential for applications utilizing knowledge gleaned from RDF triplets is mind-boggling. One of the points Dame Wendy Hall made at last month’s ePublishing forum was that if publishers felt that they missed out by not getting on board with the World Wide Web as sharply as they would have, with hindsight, then now was their opportunity to make up for that. Don’t miss the boat twice was the message; start thinking about the Semantic Web now.

And, for me, that sentiment – start thinking about the Semantic Web now – is even more pertinent than, perhaps, it was intended. In the mid-nineties, when the modern Web was taking off, putting content on-line was a risky, uncertain business for publishers. There may have been some publishers who jumped in early and reaped the rewards, some were burned and some joined the party late, but, knowing what we do now, no publisher would have hesitated. So now the Semantic Web is the big, new thing; largely unknown and poorly understood (aren’t all new concepts?). But unlike the boom of the WWW – the scale of which was never predicted, even by TBL – we now do have some concept of the magnitude of what the Semantic Web could achieve. Certainly there is enough hype about it, now, that I, at least, can’t imagine the Semantic Web (in some form) not taking off.

So, more than just looking at the augmentation of the Web with linked data as another opportunity to not miss the boat, we should be planning what we are going to do with this data. I can see the uses (visualisation?) of RDF triplets falling, broadly speaking, into two (not mutually exclusive) categories:

  1. Representations of specific facts
  2. Representations of generic facts

Currently there are a number of examples of interfaces for interacting with linked data available on the web. RKBExplorer is one of the best. There are also numerous examples of geo-data mapping applications, etc. These are representations of specific facts. That is, we have a question in mind and are displaying the answer(s). Take, for example, a set of triplets which link articles to their author, in the form:

chris | authored | this post

Using this information a piece of software can now ask the question: who wrote this article? And it would get back the correct answer: me. Now, in reality, this would be an extremely oversimplified knowledge base; a more likely setup would include a foaf:Person and possibly a bnode referencing some Dublin Core meta-data (don’t worry about the terminology). Then the scope of available questions widens dramatically. Where do the colleagues of the person who wrote this article live? Where can I find a photo of the author? By complying with these standard ontologies, software can make pretty accurate assumptions about valid questions to ask.
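As a rough sketch of what “asking a question” means here, the following Python models the knowledge base as a list of (subject, predicate, object) tuples. The ex: identifiers and the photo URL are invented for the example, and I have assumed Dublin Core’s dc:creator (pointing from the post to its author) in place of the simplified “authored” predicate above.

```python
# A tiny knowledge base. All ex: names and the photo URL are hypothetical.
triples = [
    ("ex:this-post", "dc:creator", "ex:chris"),
    ("ex:chris", "rdf:type", "foaf:Person"),
    ("ex:chris", "foaf:name", "Chris"),
    ("ex:chris", "foaf:depiction", "http://example.org/chris.jpg"),
]


def objects(subject, predicate, triples):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def ask_about_author(article, predicate, triples):
    """Follow dc:creator from the article, then read a fact about the author."""
    return [
        value
        for author in objects(article, "dc:creator", triples)
        for value in objects(author, predicate, triples)
    ]


# Who wrote this article?
print(objects("ex:this-post", "dc:creator", triples))
# Where can I find a photo of the author?
print(ask_about_author("ex:this-post", "foaf:depiction", triples))
```

Because the predicates come from shared vocabularies, the same two functions would work unchanged against anyone else’s data that uses dc:creator and FOAF.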

In the same vein, whole new possibilities become achievable in terms of mash-ups. Say I’m writing a review of a new novel. If I can assume that Amazon and all the other big online vendors are producing RDF documents describing their stock, I can simply query for the ISBN, which I know is stored as dc:identifier, and return all prices, which I can assume (for the purpose of an example) are commerce:Price. In short, RDF is a great way of managing distributed data – which is something you’ll hear a lot of if you dig into the subject.
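A sketch of that mash-up, again in plain Python: each vendor publishes its own list of triples, and because they all use the same predicates one query works across every one of them. The vendor names, item identifiers, ISBN string and prices are all placeholders, and commerce:Price is the made-up predicate from the paragraph above; the identifier predicate is Dublin Core’s identifier property.

```python
ISBN = "ISBN-PLACEHOLDER-1"  # not a real ISBN

# Each vendor's stock as (subject, predicate, object) tuples.
# Vendors, items and prices are hypothetical.
vendor_graphs = {
    "vendor-a": [
        ("a:item1", "dc:identifier", ISBN),
        ("a:item1", "commerce:Price", "7.99"),
        ("a:item2", "dc:identifier", "ISBN-PLACEHOLDER-2"),
        ("a:item2", "commerce:Price", "12.50"),
    ],
    "vendor-b": [
        ("b:book9", "dc:identifier", ISBN),
        ("b:book9", "commerce:Price", "6.49"),
    ],
    "vendor-c": [
        ("c:x", "dc:identifier", "ISBN-PLACEHOLDER-2"),
        ("c:x", "commerce:Price", "11.00"),
    ],
}


def prices_for(isbn, vendor_graphs):
    """Map each vendor that stocks `isbn` to the price it publishes."""
    prices = {}
    for vendor, triples in vendor_graphs.items():
        # Items in this vendor's graph carrying the requested identifier.
        items = {s for (s, p, o) in triples if p == "dc:identifier" and o == isbn}
        for (s, p, o) in triples:
            if p == "commerce:Price" and s in items:
                prices[vendor] = o
    return prices


print(prices_for(ISBN, vendor_graphs))
```

Nothing in `prices_for` knows anything about a particular vendor, which is the point: the shared vocabulary does the integration work.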

But even with applications utilizing complex webs of linked data in this way, they are still only asking predefined questions. “I know how to display a latitude and longitude on a map so I’ll find out those details”. “If a foaf:Person has a picture I’ll display it by their posts”.

The second category of uses I described for RDF triplets was the representation of generic facts. This is something I haven’t seen done yet (with the possible exception of SPARQL – which is not appropriate for this discussion) but seems, to me at least, to be an obvious next step. Let me explain…

The beauty of the RDF approach – beyond any other – is that it allows the document owner to describe any fact with computers still able to extract some kind of meaning from it. This is where the predicate of the triplet comes in and why using a URI is so important. It goes without saying that if a well used standard exists for describing any component of the triplet then it should be used, but if one doesn’t exist you can still describe the fact. I could create my own URI which describes the predicate ‘ate for lunch’, if I so pleased. And then I could publish the fact that I, Chris, ate for lunch beans-on-toast and, in theory, an application with no prior relation to me could understand what I meant (at least to some degree). The application in question would, possibly, not understand what “ate for lunch” means but it could point its user to the URI I created and, hence, explain the fact to them.
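Here is a minimal sketch of how such an application might behave, assuming it knows a handful of standard predicates and has never seen mine. The “ate for lunch” URI is hypothetical, as is the wording of the output; the FOAF name URI is the real one.

```python
# Predicates this imaginary application already understands.
KNOWN_LABELS = {
    "http://xmlns.com/foaf/0.1/name": "is named",
}

# A home-made predicate (hypothetical URI).
ATE_FOR_LUNCH = "http://example.org/vocab#ateForLunch"


def describe(triple, known_labels=KNOWN_LABELS):
    """Render a fact, falling back to the predicate's URI when unknown."""
    subject, predicate, obj = triple
    if predicate in known_labels:
        return f"{subject} {known_labels[predicate]} {obj}"
    # Unknown predicate: still show the fact and point the user at the
    # URI, where its publisher can explain what it means.
    return f"{subject} [{predicate}] {obj} (follow the predicate URI for its meaning)"


print(describe(("Chris", "http://xmlns.com/foaf/0.1/name", "Chris Scott")))
print(describe(("Chris", ATE_FOR_LUNCH, "beans-on-toast")))
```

The second call is the interesting one: the application cannot interpret the predicate, but because the predicate is a URI it still has somewhere to send the user.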

Finding new ways to represent these generic facts has to be on the horizon of anyone interested in pushing the Semantic Web into the mainstream. It may be through widgets and apps, it may require a new generation of browser, but it should happen. I have no doubt that the kind of mash-ups and queries that I described as representations of specific facts are achieved much, much more easily using RDF channels for data but, essentially, we could already represent those kinds of links between data. I could build a database of all the authors who write on my site and produce a Google Maps integration to show you where they live. However, I could never – without a unified system of triplets – even conceive of displaying arbitrary facts to accompany an article unless someone had manually written them. Certainly, one could not display those facts dynamically; it would be impractical. But, as the RDF standard becomes more popular, allowing applications (widgets, etc.) and search portals to do just that is very much a realistic prospect.

If we, as online publishers, are going to jump, two-footed, into the Semantic Web (which I firmly believe we should) we should also be thinking about our goals and reasons for doing so. No publisher’s target is to help search engines answer a searcher’s query without visiting their site, or to contribute to the building of a knowledge base of unparalleled proportions. No. The goal has to be the same as it always was: to improve the user’s experience and to drive web traffic. So, sure, don’t get left behind, get content into RDF format, but why stop there? This is the time to be thinking about how to get ahead of the curve, how to use this data. Certainly I am…

Author: chris Categories: Semantic web Tags:

Creating compelling content in the Web 5.0 world

April 30th, 2009

Whoa, there. Web 5.0?

Okay, so I made up web 5.0. Actually, I detest the numbered generations we’ve applied to the web. The main problem I have with these terms is that they imply a linear progression. They suggest that we are going to abandon the interactive web, Web 2.0, for the semantic web, Web 3.0. Obviously we aren’t. I doubt anyone would even suggest it. Web developers will continue to use both. Hence Web 5.0 (do the maths).

I’m going to drop the term now – it was just a joke. The modern World Wide Web is, in fact, much more than just the three so-called generations – although clearly they are very important. I can identify three main concepts (not technologies) which are facilitating the current evolution of the web:

  • Interactivity (2.0)
  • Semantic understanding (3.0)
  • Commoditization (the Cloud)

Nothing groundbreaking there. And we, as users, are certainly seeing more and more of these big three in our daily use of the web.

Interactivity is fairly obvious. I think the biggest revolution in interactive content came about as Wikipedia took off. Undoubtedly the most expansive (centralized) base of knowledge the world has ever seen – and written by volunteers, members of the public. It really is a staggering collaborative achievement. Then there’s blogging, micro-blogging, social networking, professional networking, content discovery (Digg, etc.): pretty much anything you might want to contribute, you can.

Semantic understanding is a little trickier to see. That’s hardly surprising as it is so much newer and far less understood. Believe the hype, though. The semantic web is coming and it will change everything (everything web related, that is). If you don’t believe me try googling for “net income IBM”. You should see something like this:

Google results using RDF info

That top result is special. It’s special because it’s the answer; it’s what you were looking for. No need to trawl through ten irrelevant pages to find the data – it’s just there. Google managed to display this data because IBM published it as part of an RDF document. If you search for the same information about Amazon – who don’t – no such luck. (That particular example was given by Ellis Mannoia in a great Web 3.0 talk at Internet World this week – so thanks Ellis.)

That leaves us with commoditization. Specifically, the commoditization of functionality from a developer’s point of view. This concept is largely, although not exclusively, linked to the Cloud. The term “the Cloud” is used broadly to describe services made available over the internet. Gmail, for example, is email functionality in the cloud. Users don’t need to install anything to use Gmail (bar a web client); they just use it when they want, from any computer. Many of the Cloud services out there are available as APIs, and that leads to the commoditization of functionality. Say I want to add a mapping application to my web site to show my audience where I am. A few years ago that would have been a significant amount of development work. These days it’s trivial – you just make a call to the Google Maps API. And so map functionalities become a commodity.

The point of this post, however, is that these are not mutually exclusive concepts. There is no reason why you cannot combine semantic understanding with Cloud computing, or user-generated content (UGC), or both. Quite the opposite: combining the three should be the goal.

There are problems, however. Utilizing Cloud computing requires a certain amount of adherence to standards – fitting into an API. And semantic understanding (and metadata, in general) takes time to accrue. In general those two constraints don’t work well with Web 2.0 functionality.

Let me give an example: if a user contributes a comment to an article they probably won’t take the time to add the metadata required for semantic understanding to be achieved. In the same way, if they don’t give their location you can’t show them as a pin on Google Maps.

However, semantic understanding is (IMHO) more than just the use of RDF documents. Tools like Nstein’s Text Mining Engine can be used to create a semantic footprint describing a piece of text. I’ve talked, in previous posts, about using the data gleaned by the TME in imaginative and experimental ways. Take the example above. If a user were to post a comment about a talk they attended, the TME could extract not only the concepts of the comment, but also data like the location of the subject. That semantic understanding can be used to programmatically call the Google Maps API to add a new pin in your map.
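The shape of that pipeline can be sketched as follows. To be clear, this is not the TME or the Google Maps API: `extract_location` is a toy stand-in that looks place names up in a hard-coded table, and the resulting “pins” are just dictionaries of the kind you might hand to a mapping service. The function names, the comments and the approximate coordinates are all assumptions for the example.

```python
# Toy gazetteer standing in for real text mining (approximate coordinates).
GAZETTEER = {
    "London": (51.5074, -0.1278),
}


def extract_location(text, gazetteer=GAZETTEER):
    """Hypothetical stand-in for a text mining step: find a known place name."""
    for place, coords in gazetteer.items():
        if place in text:
            return place, coords
    return None


def pins_for(comments):
    """Turn free-text comments into map pins, skipping those with no location."""
    pins = []
    for comment in comments:
        found = extract_location(comment)
        if found is not None:
            place, (lat, lng) = found
            pins.append({"label": place, "lat": lat, "lng": lng})
    return pins


comments = [
    "Great Web 3.0 talk at Internet World in London this week.",
    "Nice post, thanks!",
]
print(pins_for(comments))  # one pin, for the first comment only
```

The user never supplied a location; it was inferred from what they wrote, which is the gap between UGC and API-ready data that the paragraph above is describing.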

And there you have it. Semantic understanding of interactive content used to harness the power of Cloud computing. One of the most important benefits of the TME, for me, is the flexibility it affords you. If you know that you can get access to that kind of information it opens up all kinds of possibilities. Exploring some of these possibilities has to be the focus for making a brand stand out against the plethora of content suppliers and aggregators available; for improving the user’s experience and gaining their loyalty.

So it’s time to stop thinking about Web 2.0 or Web 3.0 and start thinking about the technology and techniques available and how they can be used to the greatest effect.

Author: chris Categories: Semantic web, Social Media Tags: