14 January 2007

A microscope on Microformats

While reading through my feeds today, I came across an article about how Microformats might be integrated into the next release of Firefox. If you remember, Microformats are present in the current stable release of Firefox as a means of getting 'dynamic' bookmark titles, but that is all there is to it as far as Firefox 2.0 is concerned. Reading a little deeper into this new concept, I discovered a whole new way of creating relational data content by following simple, standard semantics. What's even better is how easy it is to start creating Microformat content: it takes nothing more than giving standard values to a few attributes in your HTML markup! But first, a little more about Microformats.

Microformats

The term Microformat was given to semantic mark-up by Tantek Çelik and Kevin Marks during (or sometime after) a presentation on 'real world semantics' at EtCon 2004. A microformat is just a standard format for presenting data so that services, applications, bots etc. can detect it, understand it, and do whatever they're programmed to do with the values they find. For example, the following is a sample event marked up as an hCalendar microformat.

Going back home for vacations
13th January '07 (12:15 AM)
(January 13- February 10)
Gurgaon, Haryana
I'll be going back home after my end semester examinations. I've been missing home for quite some time, and want to get away from this hectic college schedule quickly.

And here is the code for it:

<div class='vevent'>
<span class='summary'>Going back home for vacations</span>
<span class='dtstamp' title='20070113T1215+0530'>13th January '07 (12:15 AM)</span>
(<span class='dtstart' title='2007-01-13'>January 13</span> -
<span class='dtend' title='2007-02-10'>February 10</span>)
<span class='location'>Gurgaon, Haryana</span>
<span class='description'>I'll be going back home after my end semester examinations. I've been missing home for quite some time, and want to get away from this hectic college schedule quickly.</span>
</div>

If you have one of the microformat recognition extensions installed (I've listed the two I know of towards the end of the post), you'll see a calendar event pop-up listing that event. If your extension offers actions to perform on the recognised content, you can use them; in this case, that means adding the event to 'your' calendar so that you are reminded of it later. The same approach works for other types of data, such as contact details (hCard), reviews (hReview), social networks (XFN) and lists/outlines (XOXO). The complete list can be found here.

How microformats work

Of course, any type of data can be turned into a microformat, as long as the format is recognised as a standard and not everyone is cooking up 'their' own version of it. It is there to help us maintain a standard, so that building applications to work with the data held in these formats becomes simpler. It'd be as simple as finding a block of markup with the specified attributes, and reading the values of the other attributes as well as the contained text. The markup can be seamlessly integrated to look like a part of your page, while the hidden microformat properties are there only for those who want them. The information can be anything from stock prices to package tracking. The point is to allow easy access to content. As Alex Faaborg puts it:

The general model is the user travels to a particular site, and then proceeds to enter data (classified ad, review, list of friends) for a particular purpose. Your information is scattered all over the Web, and you have to pick which sites you want to use.

The combination of blogging and microformats is now reversing this model. Now, your information remains in your blog, and the Web sites come to you. For instance, if you want to sell something, you can blog about it using an hListing, and a site like Edgeio will find it when it aggregates classified advertisements across the Web.
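The lookup described earlier in this section (find a block with a known class name, then read its attributes and contained text) can be sketched in a few lines of Python. This is only a rough illustration using the standard library's html.parser, not how Operator or any aggregator actually does it. The names VEventParser, extract_events and SAMPLE are made up for this sketch, and preferring the title attribute over the visible text is a simplification of the real parsing rules.

```python
from html.parser import HTMLParser

# Property class names used by the hCalendar example in this post.
EVENT_PROPERTIES = {"summary", "dtstamp", "dtstart", "dtend",
                    "location", "description"}
# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class VEventParser(HTMLParser):
    """Collects the properties of every element marked class='vevent'."""

    def __init__(self):
        super().__init__()
        self.events = []    # one dict per vevent found
        self._event = None  # dict being filled, or None outside a vevent
        self._depth = 0     # element depth relative to the vevent root
        self._props = []    # open properties: (name, depth, title, text_parts)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._event is None:
            if "vevent" in classes:
                self._event = {}
                self._depth = 1
            return
        self._depth += 1
        for name in classes:
            if name in EVENT_PROPERTIES:
                self._props.append((name, self._depth, attrs.get("title"), []))

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for _, _, _, parts in self._props:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._event is None:
            return
        # Close any property that was opened at the current depth.
        while self._props and self._props[-1][1] == self._depth:
            name, _, title, parts = self._props.pop()
            text = " ".join("".join(parts).split())
            # Simplification: use the title attribute when there is one,
            # otherwise fall back to the visible text.
            self._event.setdefault(name, title if title is not None else text)
        self._depth -= 1
        if self._depth == 0:
            self.events.append(self._event)
            self._event = None


def extract_events(html):
    """Return a list of dicts, one per vevent block found in html."""
    parser = VEventParser()
    parser.feed(html)
    parser.close()
    return parser.events


SAMPLE = """
<div class='vevent'>
<span class='summary'>Going back home for vacations</span>
(<span class='dtstart' title='2007-01-13'>January 13</span> -
<span class='dtend' title='2007-02-10'>February 10</span>)
<span class='location'>Gurgaon, Haryana</span>
</div>
"""

if __name__ == "__main__":
    for event in extract_events(SAMPLE):
        print(event)
```

Because the class names are standard, the same few lines would work on any page that marks up its events this way, which is really the whole point of agreeing on a format.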

I've been looking at the prospect of having a microformat for blog posts across the internet. Stephen has already argued pretty strongly that the XOXO format is great for creating templates, which can make working with them (and the data they hold) much simpler. I will top that and ask for a standard outline to hold all post data in. That way, whenever we look for post details, the structure will be the same for a Blogger blog or a WordPress blog (and the many other blogging platforms out there). Sharing data becomes tons easier! I realised this while developing my way of creating expandable posts that load asynchronously. Default Blogger templates already follow a pseudo-microformat (I don't know if it was intentional or not), and hence, if you've not modified your template too much, or have kept the standard class names, your blog is already following a microformat for your posts. Any information can now be scraped right off the page as and when it is required! There has, however, been a proposal for any kind of listing to be presented in a microformat called hListing, so anyone interested might want to look into that.

Currently, as I said earlier, Firefox is the only browser which handles microformats natively, and even then only for dynamic bookmark titles. However, there are a couple of extensions which detect microformats on a page and present actions based on them. The one I use and recommend is Operator by Michael Kaply. This is a very useful extension if you wish to completely integrate Firefox with the currently known microformat standards. It detects the data, and then presents you with ways of using that data. For example, on this page, you should see two microformats. One is my label, and the other is the calendar event written above. Operator will give you options to search Flickr, Del.icio.us and Technorati for this tag to find related content, or to add the calendar event to Google Calendar. Makes my job that much easier! :) The second extension, called Tails Export by Robert De Bruin, does pretty much the same thing, without the integration with other applications. It works by 'exporting' your microformats to known file types which can be used in other applications.
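To give an idea of how a label like mine can be detected, here is a small Python sketch of the rel-tag convention, where the tag is taken from the last segment of the link's URL path and not from the visible link text. It is an illustration only, not Operator's actual code; RelTagParser, extract_tags and the example.com URLs are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class RelTagParser(HTMLParser):
    """Collects the tag named by every <a rel="tag" href="..."> link."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if "tag" in rel_values and href:
            # The tag is the last path segment of the URL; a trailing
            # slash is ignored here.
            path = urlsplit(href).path.rstrip("/")
            segment = path.rsplit("/", 1)[-1]
            if segment:
                # Spaces may be written as '+' or '%20' in the URL.
                self.tags.append(unquote(segment.replace("+", " ")))


def extract_tags(html):
    """Return the list of tags found in rel="tag" links, in page order."""
    parser = RelTagParser()
    parser.feed(html)
    parser.close()
    return parser.tags


SAMPLE = """
<a href="http://technorati.com/tag/microformats" rel="tag">Microformats</a>
<a href="http://example.com/labels/semantic+web/" rel="tag">Semantic Web</a>
<a href="http://example.com/about">About</a>
"""

if __name__ == "__main__":
    print(extract_tags(SAMPLE))
```

Once the tag is known, building a search link for Flickr or Technorati is just a matter of dropping it into that site's URL pattern.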

The functions of these extensions will eventually be integrated into the browser natively, making it something of an 'information broker', as some people out there are calling it. The web browser, in general, will do all the information collecting 'for' you, so that all you have to do is click and collect.

This graphic from Mozilla sums up their plan for integrating this new standard very well:

Firefox information broker

Since microformats are such an invisible yet solid way of presenting little bits of shareable data, I think they have the potential to get to where feeds are today, since technically they offer the same thing: a simple way of sharing data amongst applications. Practically any data can be turned into a microformat, as long as enough of it is being generated. With some touting the new year as the year of Microformats, I somewhat agree with them. It 'could' be the next big thing! I am going to begin using them in places where I think they're necessary. My little contribution to building a smarter, more semantic web! Seems fun and useful! :P Install one of those extensions, and discover a new way to work with information for yourself!


Deepak said...

A very good post.

I checked my blog with the extension. It seems I've stripped some of the inbuilt microformats (rel-tag) in Blogger with my hack. I'll have to learn and integrate that.

Singpolyma said...

While I make a nice case for XOXO over OPML, and perhaps even for XOXO in blog formats (using the standard format I invented and use myself) I strongly urge that to be mixed with hAtom.

hAtom by itself can be powerful.

hAtom encompasses most standard blog data... and it's microformats.org 'approved'.

Aditya said...

Thank you! :)

It's not really a necessity. Not like you're breaking some kind of code of conduct :P

It just becomes easy for other people to work with various data available in your pages :) But yes, it's good to follow 'some' format.

Singpolyma said...

rel=tag is actually always a good idea because of Technorati, hehe.

I just noticed that Operator is detecting the hCal in this page, but won't let me do anything with it because it has only a dtstamp and no dtstart...

Deepak said...

We, as hackers and the "service providers", should take the initiative for integrating these into our hacks. That's what I meant.

If it is a good thing, let's promote it. :)

Efendi said...

there's this nice article on how to add hCard ;)