05 February 2007

Making the TOC: A look into DOM

I got the idea for it when I saw the two fixed menus over at Quirksmode. Now, I'm not a great book-publishing, tutorial-blog-writing coder. But I can write code, so with the new look, I wanted to add that little detail in. It becomes a nuisance when you don't know exactly what is on the page you're looking at, especially on ones like the archive pages, which stretch on forever (Blogger offers no way of restricting the number of posts, or am I missing something here?) and can get very disorienting.

The concept

I wanted the TOC to be dynamic, as in it would adapt to the page you're on (rather than just show you the names of the posts on that page, a la Classic Blogger). A simple widget modified a little would have done the trick, but I wanted this one to go a step further and list out the sections in each post. I got into the habit of using headings after reading Postbubble, so those seemed the perfect section markers.

I also wanted the TOC to be visible all the time, but not really fixed like in other places. So here again I borrowed the scrolling idea from Quirksmode: the TOC scrolls with the page until it's about to scroll out of view, and then stays fixed in the window. This creates a catchy effect of the page dragging it. When you scroll back up to the top, it regains its original position. Perfect! :)
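The dragging effect boils down to one decision per scroll event. Here is a minimal sketch of that decision, assuming the TOC's resting distance from the top of the document is known. The names tocPlacement, applyTocPlacement and restingTop are made up for illustration and are not taken from the Quirksmode script or from the script on this page:

```javascript
// Hypothetical helper: decides how the TOC should be positioned for a given scroll offset.
// restingTop is the TOC's original distance (in px) from the top of the document.
function tocPlacement(scrollY, restingTop) {
    if (scrollY > restingTop) {
        // The TOC would scroll out of view, so pin it to the top of the window.
        return { position: 'fixed', top: 0 };
    }
    // Otherwise leave it at its resting place so it scrolls with the page.
    return { position: 'absolute', top: restingTop };
}

// Applies the placement to an element; meant to be called from a scroll handler.
function applyTocPlacement(toc, scrollY, restingTop) {
    var placement = tocPlacement(scrollY, restingTop);
    toc.style.position = placement.position;
    toc.style.top = placement.top + 'px';
}
```

In a browser this could be wired up with something like window.onscroll = function () { applyTocPlacement(toc, window.pageYOffset, restingTop); }, where toc is the TOC element. Keeping the decision in its own function makes it easy to check without a page.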

I decided to hide it on post pages, because with just one post, you don't really need a table of contents. The purpose is to help navigate a page internally when on a multiple posts page. So if you want to see it, the front page would be a good start :)

Finally, to be expandable, or not to be expandable. That was the question. The initial plan was to make the sections list for every post expandable. But then the problem of an intuitive interface came up. How do you expand the list when the user wants to see the sections, and take him to the post itself when he wants to go to the post? I could have created a click/double-click interface, but that isn't immediately obvious. So I stuck to an open sections list approach. I had to let it overflow, since this can cause the TOC to extend beyond the page on a page with a lot of posts (with a lot of sections), but that's only on the archive pages. Search pages see the best application of the TOC, since you get to see a list of posts matching your criteria, and can immediately jump to the one you wish to.

Making it and DOM glory

I tried a few methods to get the optimum performance with the least amount of code. Sure as heck, XPath crossed my path more than once, but I just couldn't generate a post-specific section list. I always ended up with the same sections for every post. I'm sure I was doing something wrong, but I'm new to XPath, so until I perfect it, I'd rather not depend on it.

I then tried to use if...else statements to pinpoint the exact node I wanted to extract information from. Here again I had the option of switching to PHP to easily scrape the data, but making an external call for something so native to a page seemed like overkill. So I decided to go hardcore, and make DOM JavaScript do my bidding. This was harder than I had initially thought.

One of the basic mistakes I made was trying to append a bunch of children to one element. This was needed to generate the sections list, and append it as a child to the newly generated post list item. Of course, this would be only if that post had sections. The ideal way is to use document.createDocumentFragment() to create a fragment. Then, inside the loop I used to search for the headings (a simple nodeName == 'H4' check), I created a list item and an internal link pointing to the ID of the heading being traversed, appended the link to the list item, and then the list item to the fragment, for as long as the post body had child nodes left to traverse.

Two notes on that. Even now, I don't know why I didn't use element.getElementsByTagName('h4') to find all the headings in every post body. It wouldn't have reduced lines of code, but would have saved me a brain wreck. As for the internal links, I have a separate function which runs through the headings inside a post-body element, and gives them an ID of 'heading-(i)' where (i) comes from the loop counter (just for uniqueness).
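That ID-assigning function isn't shown in this post, so here is a rough sketch of what such a helper could look like, using getElementsByTagName('h4') as mentioned above. The name assignHeadingIds and the startAt parameter are assumptions for illustration. Carrying the counter across posts is also an assumption, made so that IDs stay unique on a page with several posts:

```javascript
// Hypothetical helper: gives every H4 inside a post body an ID of the form 'heading-N'.
// startAt lets the caller keep one counter running across posts (an assumption, so that
// IDs do not repeat on a multi-post page). Returns the next unused counter value.
function assignHeadingIds(postBody, startAt) {
    var headings = postBody.getElementsByTagName('h4');
    var counter = startAt;
    for (var k = 0; k < headings.length; k++) {
        headings[k].id = 'heading-' + counter;
        counter++;
    }
    return counter;
}
```

Called once per post body, feeding the returned value into the next call, it would number the headings 0, 1, 2 and so on down the page.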

Since code generated by Blogger follows the same pattern, it was easy hardcoding the position of the post body. I then traversed this node for the H4 elements. At the end of the loop, I created a new list, and appended the entire fragment to it. This is the 'only' way I know how to work with adding a lot of content in one go, and it also helped me to check if there 'were' nodes to append or not. No point adding a list if it has nothing, right? :) The code I ended up with was this:

// Runs once per post; posts, i and ul (the main TOC list) come from the surrounding loop.
var body = posts[i].childNodes[7]; // body points to the post body
var j = 0;
var fr = document.createDocumentFragment(); // the fragment
while (body.childNodes[j]) {
    if (body.childNodes[j].nodeName == 'H4') {
        var sec_li = document.createElement('li');
        var sec_link = document.createElement('a');
        sec_link.href = '#' + body.childNodes[j].id; // link to the ID of the heading being traversed
        sec_link.innerHTML = '- ' + body.childNodes[j].textContent;
        sec_li.appendChild(sec_link);
        fr.appendChild(sec_li);
    }
    j++;
} // end of body.childNodes
var ul_child = document.createElement('ul'); // create a new list
ul_child.id = 'sectionUL';
ul_child.className = 'headingList';
if (fr.childNodes.length > 0)
    ul_child.appendChild(fr); // append fragment to new list if it has children
if (ul_child.childNodes.length > 0)
    ul.appendChild(ul_child); // append new list to main list if it has children

Notice that I could have easily used innerHTML to come up with the required links and everything, and as the discussion here goes, it would have saved me lines of code. However, the beauty of DOM Javascript tipped the balance. It helped to visualise the hierarchy of the TOC being generated. It would be as simple as:

  • First post
    • Section 1
    • Section 2
  • Second post
  • Third post
    • Section 1
    • Section 2

And so on, with each section list being created only if the post was divided into sections. Remember though, that the way the script is written, a single H4 element in the post body is enough to produce this section list. I didn't want to add unnecessary clutter by putting in checks to ascertain whether a post was legitimately divided into sections.
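For comparison, the innerHTML route mentioned earlier would mean building the section list as a string. This is only a sketch of that alternative, not code from the script. The function names and the { id, text } shape of the headings array are made up for illustration:

```javascript
// Escapes the characters that matter inside HTML text and attribute values.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical innerHTML-style builder: headings is an array of { id, text } objects.
// Returns '' when there are no headings, so the caller can skip the list entirely.
function sectionListHtml(headings) {
    if (headings.length === 0) {
        return '';
    }
    var html = '<ul class="headingList">';
    for (var k = 0; k < headings.length; k++) {
        html += '<li><a href="#' + escapeHtml(headings[k].id) + '">- ' +
            escapeHtml(headings[k].text) + '</a></li>';
    }
    return html + '</ul>';
}
```

The string version is shorter to write, but the nesting lives inside quotes, and every piece of text has to be escaped by hand. With the DOM calls the hierarchy is right there in the appendChild order.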

That is the dynamic part of the function, which analyses the page. The remaining part (which you'll see if you view the source) works to create static links to the top, footnotes and the bottom content. The best part is, this is very easily extendible if one adds more sections or something. Blogger makes life easy with uniform hierarchy of posts, hence if you write the loop with one post in mind, you'll effectively be writing it for all of them. I used the excellent getElementsByClassName function to get references to the different post blocks on a page.
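The getElementsByClassName function I used isn't reproduced here. As a rough idea of what a helper of that kind does, here is a bare-bones stand-in. The name findByClassName is made up, and the 'post' class in the usage note is an assumption about the template's markup:

```javascript
// Hypothetical stand-in for a getElementsByClassName helper: returns every descendant
// of root whose class attribute contains className as a whole word.
function findByClassName(root, className) {
    var all = root.getElementsByTagName('*');
    var matches = [];
    for (var k = 0; k < all.length; k++) {
        var raw = all[k].className;
        // Skip elements whose className is not a plain string.
        var classes = typeof raw === 'string' ? raw.split(/\s+/) : [];
        for (var c = 0; c < classes.length; c++) {
            if (classes[c] === className) {
                matches.push(all[k]);
                break;
            }
        }
    }
    return matches;
}
```

Something like var posts = findByClassName(document, 'post'); would then hand back the post blocks to loop over.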

The one little thing to notice is that if you click the link before the posts have finished loading, it'll give an error. That's because it adds links to various parts of a post, and does that for all the posts. It won't be able to traverse the DOM properly if elements are missing, and wherever it catches an error, it'll halt. If that happens, just wait for the page to finish loading, and then click the link again. You'll have your TOC.
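One way to soften this, which the script as described above does not do, would be to ignore clicks until the page has loaded and to catch any traversal error instead of halting. A small sketch, where createTocLink, markLoaded and click are made-up names and buildToc stands for whatever function generates the TOC:

```javascript
// Hypothetical guard around the TOC builder.
function createTocLink(buildToc) {
    var pageLoaded = false;
    return {
        // Call this from window.onload.
        markLoaded: function () {
            pageLoaded = true;
        },
        // Call this from the link's click handler; returns true if the TOC was built.
        click: function () {
            if (!pageLoaded) {
                return false; // posts may still be missing, so do nothing yet
            }
            try {
                buildToc();
                return true;
            } catch (e) {
                return false; // the DOM was not in the expected shape
            }
        }
    };
}
```

The click handler can use the return value to decide whether to show a "still loading" hint instead of an error.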

Conclusion

This was a nice and healthy experiment into scripting with proper DOM functions (for the most part). Took some nice brain racking to come up with the fastest method, and the result was a nice and fancy add-on to the page which I think people will appreciate in the long run. I hope to release it in the future, after it's survived my tests and hammering, and a few tweaks that I have planned for the public version. Hope you enjoy using it as much as I did making it!


5 comments

Efendi said...

great post ^^ , i thought i've seen it somewhere :P and now, i remember again :) it's the quirksmode's ;)

*bookmarked this ;)

Aditya said...

Thanks! Although it is not really the same. Mine is a bit more complicated if you have a look at both the codes.

The end result is pretty much the same, but his pages are easier to make TOCs for :P I had to break it up into individual posts, and sections within those posts.

Deepak said...

Nice hack. I see the extensive use of DOM. :D

BTW, could you have a look at the code highlighting colors? The code comments are difficult to read (I had to squint! :) ), because the text and BG colors are not complementary.

Aditya said...

It's not really a hack (shh... Avatar could be watching!), but just a simple script :)

I've fixed the colours, but the comments are light because they're comments, not that important.

And yes, I forced myself to use DOM functions to partly show the 'innerHTML' coders that you don't need it if you're just a little organised! :)

Deepak said...

Aah.. I didn't go through your code. Never getting enough time of late.

Comments should be light. No doubt about that. But I felt that it should have been distanced a bit more from the background color. What happened was that it was so near to the BG in color space that it was virtually unreadable...too bright to the eyes.