Extendible BBCode Parser in JavaScript

Photo By Dean Terry

I decided to try my hand at implementing a BBCode parser in JavaScript. You can play around with it online here, and download the source here.

I had looked around a little bit and noticed that the existing JavaScript BBCode parsers had at least a few of the following issues:

  • They didn’t report errors on misaligned tags (e.g., [b][u]test[/b][/u]).
  • They couldn’t handle tags of the same type that were nested within each other (e.g., [color=red]red[color=blue]blue[/color]red again[/color]). This happens because their regex will look for the first closing tag it can find.
  • They couldn’t handle BBCode’s list format (e.g., [list][*]item 1[*]item 2[/list]).
  • They didn’t report errors on incorrect parent-child relationships (e.g., [list][td]item 1?[/td][/list]).
  • They weren’t easily extendible.
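The nested same-type problem from the second bullet is easy to reproduce. The snippet below is a minimal sketch of the naive approach, not code from any particular parser; the regex and the `<span>` output format are assumptions made for illustration.

```javascript
// Naive approach: pair each opening [color=...] with the first [/color] after it.
var input = "[color=red]red[color=blue]blue[/color]red again[/color]";
var naive = /\[color=([^\]]+)\](.*?)\[\/color\]/gi;

var output = input.replace(naive, function (match, color, content) {
    return '<span style="color:' + color + '">' + content + "</span>";
});

console.log(output);
// <span style="color:red">red[color=blue]blue</span>red again[/color]
```

The outer opening tag gets paired with the inner closing tag, so the blue tag is never converted and a stray [/color] is left behind.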

I naively thought it’d be easy to quickly whip up a parser, and at first it was. Most BBCode tags can be implemented with a simple find-and-replace. However, I quickly ran into the issues of dealing with nested tags of the same type, the noparse tag, and the list tag’s annoying [*] tag (which doesn’t have a closing tag). Luckily, I came across a neat blog post on finding nested patterns in JavaScript, which came in handy for isolating tag pairs, from the innermost on up. Taking the idea from that post, one can do something like this to process the inner tags first and avoid the nested tag problem:

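var str = "[list][list]test[/list][/list]",
    // The content group excludes "[" so a match can never contain another tag;
    // that is what forces the innermost pair to be matched first.
    re = /\[([^\]]*?)\]([^\[]*?)\[\/\1\]/gi;
// Keep replacing until a pass leaves the string unchanged.
while (str !== (str = str.replace(re, function(strMatch, subM1, subM2) {
    return "" + subM2 + "";
})));
// str = "test"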

That idea works well, though you can’t implement a noparse tag if you process the innermost tags first, because the tags inside a noparse block would already have been converted by the time the noparse tag itself is reached. So I decided to pre-process the BBCode with something similar to the idea above and add nesting-depth information to each open and close tag. Once all of the tags had that, I could parse the processed code with a regex that could easily match up the correct open and close tags.
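To make the depth idea concrete, here is a rough sketch. It is not the code from xbbcode.js: it uses a single left-to-right pass with a per-tag counter instead of repeated innermost replacement, and the function name `annotateDepth` and the `name:depth` annotation format are assumptions made for this example.

```javascript
// Hypothetical helper: tag every open/close tag with its nesting depth.
function annotateDepth(text) {
    var depth = Object.create(null); // current nesting depth per tag name
    return text.replace(/\[(\/?)([a-z]+)([^\]]*)\]/gi, function (match, slash, name, rest) {
        var key = name.toLowerCase();
        if (depth[key] === undefined) {
            depth[key] = 0;
        }
        if (slash) {
            if (depth[key] === 0) {
                return match; // stray closing tag: leave it untouched
            }
            depth[key] -= 1;
            return "[/" + name + ":" + depth[key] + "]";
        }
        var current = depth[key];
        depth[key] += 1;
        return "[" + name + ":" + current + rest + "]";
    });
}

var annotated = annotateDepth("[color=red]red[color=blue]blue[/color]red again[/color]");
console.log(annotated);
// [color:0=red]red[color:1=blue]blue[/color:1]red again[/color:0]

// With depths attached, backreferences pair each open tag with its own close tag.
var paired = /\[([a-z]+):(\d+)([^\]]*)\]([\s\S]*?)\[\/\1:\2\]/i;
var m = paired.exec(annotated);
console.log(m[4]);
// red[color:1=blue]blue[/color:1]red again
```

Because the closing tag must carry the same name and depth as the opening tag, the lazy match can no longer stop at an inner closing tag of the same type.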

To get around the issue of the [*] tag having no closing tag, I wrote code that inserted [/*] tags where they were supposed to go during the pre-processing step. I won’t go into the algorithm here, but you can dig into the code if you’re interested.
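For readers who want the general shape of such a step, the following is one possible sketch and should not be read as the algorithm used in xbbcode.js. The function name `closeListItems` is hypothetical, and the sketch assumes that an item ends at the next [*] in the same list or at that list’s [/list].

```javascript
// Hypothetical helper: insert [/*] before the next [*] or [/list] of the same list.
function closeListItems(text) {
    var openItem = []; // one flag per open [list]: does it have an unclosed item?
    return text.replace(/\[(list(?:=[^\]]*)?|\/list|\*)\]/gi, function (match, tag) {
        var t = tag.toLowerCase();
        var top = openItem.length - 1;
        if (t === "*") {
            if (top < 0) {
                return match; // [*] outside any list: leave it alone
            }
            var prefix = openItem[top] ? "[/*]" : "";
            openItem[top] = true;
            return prefix + match;
        }
        if (t === "/list") {
            if (top < 0) {
                return match; // unmatched [/list]: leave it alone
            }
            return (openItem.pop() ? "[/*]" : "") + match;
        }
        openItem.push(false); // an opening [list] or [list=...]
        return match;
    });
}

console.log(closeListItems("[list][*]item 1[*]item 2[/list]"));
// [list][*]item 1[/*][*]item 2[/*][/list]
```

Keeping one flag per open list means items in a nested list are closed independently of the items in the list that contains it.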

Also, I should note that the fact that JavaScript allows you to use a function as the second parameter to the replace method makes processing the tags really easy. Once you match a set of tags, you can recursively call the parse function on that tag’s contents from inside of the function you passed to replace.
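As a small illustration of that recursion (a simplified sketch, not the parser’s actual code), the function below handles only three hard-coded tags. The name `parse` and the tag set are assumptions, and it deliberately ignores same-type nesting, which is what the depth information is for.

```javascript
// Simplified sketch: convert [b], [i] and [u], recursing into each tag's contents.
function parse(text) {
    return text.replace(/\[(b|i|u)\]([\s\S]*?)\[\/\1\]/gi, function (match, tag, content) {
        var name = tag.toLowerCase();
        // Recursively process whatever sits between the matched tag pair.
        return "<" + name + ">" + parse(content) + "</" + name + ">";
    });
}

console.log(parse("[b]bold [i]both[/i][/b] plain"));
// <b>bold <i>both</i></b> plain
```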

Using the parser

To use the parser, you’d simply include the xbbcode.js and xbbcode.css files somewhere on your page (both are contained in the zip file linked above), and then call the XBBCODE object from somewhere in your JavaScript:

var result = XBBCODE.process({
    text: "Some bbcode to process here",
    removeMisalignedTags: false,
    addInLineBreaks: false
});
console.log("Errors: " + result.error);
console.dir(result.errorQueue);
console.log(result.html); // the HTML form of your BBCode

Adding new tags

To add a new tag to your BBCode, add properties to the “tags” object inside of the XBBCODE object. For example, say you wanted to add a tag called [googleit] which would turn its contents into a link to the Google search results for that text. You’d implement that by adding this to the tags object:

"googleit": {
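    openTag: function(params,content) {
        // Encode the content so spaces, quotes, etc. can't break the URL or the attribute
        var website = "\"http://www.google.com/#q=" + encodeURIComponent(content) + '"';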
        return '<a href=' + website + '>';
    },
    closeTag: function(params,content) {
        return '</a>';
    }
}

Then you could have BBCode like this: “[googleit]ta-da![/googleit]” which would be transformed into this: “<a href="http://www.google.com/#q=ta-da!">ta-da!</a>”

If you have any suggestions or find any bugs, let me know.

“HTML5: Up and Running” Book Review

I’ve read a decent number of articles on what will be new in HTML5. I’ve read up on the canvas element, localStorage, web workers, and a couple of the other features one can use when creating Chrome Web Browser Extensions (which I did while building my Typing Speed Monitor and Image Definitions for Dictionaries extensions).

However, I hadn’t really sat down and taken the time to thoroughly go through all the goodies that are planned for/coming with HTML5. So when my office mate showed me a huge pile of books he had just purchased, I saw the one titled HTML5: Up and Running and got kind of curious. After flipping through it, I found out that it’s also available online for free under the title of Dive into HTML5, but I ended up ordering my own copy since I prefer to read paper editions. That said, the book links to a good number of resources, so a digital version is somewhat advantageous.

Anyway, the book starts off with some history on how HTML developed. It goes through an old thread in a 1993 W3C mailing list archive, where participants were discussing the creation of the image tag. Essentially no one could really agree on how it should be set up (Should it be img, icon or include? Should its properties be src or href?), and ultimately an author of Mosaic (an early web browser) decided to just use what he had initially proposed and shipped his browser with a working img tag. The point of the story is that HTML isn’t a carefully crafted language. It grew out of discussion, and many of its features came about simply because a popular web browser decided to stand behind them.

The next chapter discusses how you as a developer can use the new HTML5 tags in your web pages today, and still have your site be backward compatible with older browsers. It uses a JavaScript technique to do this; however, there are a couple of ways to use the new tags and stay backward compatible, some of which you can read about here.

The rest of the book focuses on giving introductions to the various new features you’ll have access to in HTML5, specifically: the canvas element, the video tag, the geolocation API, the localStorage API, how to set up your site for offline storage, all the new form elements, and how microdata works. These discussions are all pretty good, though I especially liked the chapter on the video tag. I didn’t really know much about the different video formats going into the chapter, so it was nice to have a high-level discussion of how videos are encoded. It was also interesting to have the author touch on the licensing issues of H.264 video. After reading about all the fees involved, especially those possibly coming after Dec. 31, 2015, it seems like it’d be a bad format to use as a standard.

Overall I liked the book and would recommend checking it out if you’re curious about using and playing around with the currently available features of HTML5.