Notes on Offline Web Storage for TAAG

After you’ve visited it at least once, the Text to ASCII Art Generator (TAAG) will now work even when you don’t have an internet connection. I had a user request this, so I figured I’d add it in for them and anyone else out there who may want to use the app offline.

Offline web applications are one of the new capabilities being introduced with HTML5. An offline web application has its files downloaded and permanently installed in a web browser’s application cache. This allows users to go to an app’s URL and have it work even when they aren’t online. It’s a smart idea, but it has some interesting pros and cons. In this entry I’ll explain what I did to set TAAG up to work offline, describe some issues I ran into, and provide some notes for anyone looking to create an offline app.

The first step in creating an offline web application is to create a manifest file. It tells the browser which files to store in its application cache, which files it should try getting from the network, and which fallback files to use in place of certain standard files when offline. Any HTML file that references this manifest is implicitly added to the CACHE section of the manifest, and after a user’s first visit, all cached data will come from the application cache, even if the user presses F5 to refresh the page. The application cache is only ever updated when the manifest file itself changes, and even then it takes two visits for a user to see any updates. This is because when a user visits the app’s URL, the application is first loaded from the cache; only then does the browser see that the manifest has changed and update the application cache, ready for the user’s next visit. These idiosyncrasies are a bit to take in, but once you know what’s going on it’s not that bad. Here’s what a sample manifest file would look like:

CACHE MANIFEST
# This is a comment.
# The cache is only ever refreshed when this file changes.
# Last updated: 2012.08.18

# The CACHE section indicates what we want cached.
# The HTML file linking to this manifest is auto-added to this list.
CACHE:
one.jpg
two.jpg
jquery.js?12345

# The NETWORK section indicates which non-cached files should be 
# obtained from the network. This will almost always be "*".
NETWORK:
*

# The FALLBACK section allows us to set up "fallback" pages to use
# when offline.
FALLBACK:
page2.htm page2-is-offline.htm

An HTML file linking to this manifest might look like this:

<html manifest="my.appcache">
<head><title>Test</title><meta charset="utf-8"></head>
<body>
<div>Some text</div>
<img src="one.jpg"/>
<img src="two.jpg"/>
<script src="jquery.js?12345"></script>
</body>
</html>
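
The two-visit delay described earlier can be shortened with the application cache's JavaScript API. What follows is only a rough sketch, not code taken from TAAG: the function name watchForUpdates and the reload callback are made up for illustration, while the updateready event, the UPDATEREADY status constant, and swapCache() are part of the application cache API. The idea is to swap in the new cache and reload as soon as the browser reports that an updated copy has finished downloading.

```javascript
// Sketch only: `appCache` is expected to behave like window.applicationCache,
// and `reload` is a hypothetical callback supplied by the caller.
function watchForUpdates(appCache, reload) {
  function onUpdateReady() {
    // UPDATEREADY means a newer copy of the cache has finished downloading.
    if (appCache.status === appCache.UPDATEREADY) {
      appCache.swapCache(); // switch over to the freshly downloaded cache
      reload();             // reload so the page is rendered from it
    }
  }
  appCache.addEventListener('updateready', onUpdateReady, false);
  // The event may already have fired before this script ran, so check once now.
  onUpdateReady();
}

// Only wire it up in a browser that exposes the application cache.
if (typeof window !== 'undefined' && window.applicationCache) {
  watchForUpdates(window.applicationCache, function () {
    window.location.reload();
  });
}
```

Reloading without asking can be jarring if the user is in the middle of something, so a real app would probably want to prompt first.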

And one last gotcha: to keep the manifest file itself from being cached by the browser, and to ensure it is served with the correct MIME type (an incorrect MIME type will keep the manifest from being recognized), we need to update the web server’s configuration. For Apache, we’d add the following rules to our .htaccess file:

AddType text/cache-manifest .appcache

<Files *.appcache>
    ExpiresActive On
    ExpiresDefault "access"
</Files>
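
One way to sanity-check these rules is to look at the response headers the server actually sends for the manifest (for example in the browser's network panel). The helper below is a hypothetical sketch: the name checkManifestHeaders and the shape of its input (a plain object with lower-cased header names) are assumptions made for illustration. It flags a wrong MIME type and any Cache-Control max-age above zero; the exact headers Apache emits for the rules above have not been verified here.

```javascript
// Hypothetical helper: `headers` is a plain object mapping lower-cased
// response header names to their values.
function checkManifestHeaders(headers) {
  var problems = [];

  // Ignore any parameters such as "; charset=utf-8" after the MIME type.
  var type = (headers['content-type'] || '').split(';')[0].trim().toLowerCase();
  if (type !== 'text/cache-manifest') {
    problems.push('Content-Type is "' + type + '", expected "text/cache-manifest"');
  }

  // A positive max-age lets the browser reuse a stale copy of the manifest.
  var cacheControl = (headers['cache-control'] || '').toLowerCase();
  var maxAge = /max-age=(\d+)/.exec(cacheControl);
  if (maxAge && Number(maxAge[1]) > 0) {
    problems.push('Cache-Control allows caching for ' + maxAge[1] + ' seconds');
  }

  return problems;
}

console.log(checkManifestHeaders({
  'content-type': 'text/plain',
  'cache-control': 'max-age=3600'
}));
```

An empty array means neither of the two problems was detected; it does not prove the manifest will never be cached.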

That should do it! With this technique your users will be able to use your apps even when they don’t have an internet connection. I find this cool because it makes web apps more useful and puts them a step closer to supplanting desktop apps. Unfortunately, there are a number of issues that have held offline web apps back, and they’re worth mentioning.

Issues with application caching

A lot of people aren’t happy with how application caching currently works. The W3C has set up a working group that’s discussing possible improvements, and hopefully the main components of the current spec will continue to work. I agree that some changes are needed, though. Below I’ll list my biggest concerns.

  • How would users know that certain URLs still work when they’re offline?

    I don’t understand how this isn’t the most important issue for offline applications. Without someone manually telling you that you can use a particular application while offline, I don’t see how someone would think they could browse to example.com/webapp and expect it to work when they didn’t have an internet connection. In their current form, I only see offline applications as useful to techies or people in the know.

    It’d be nice if offline applications could provide an icon, name, and description, and then browsers could show users which offline applications they had installed. It could be argued that this is a browser feature, but bookmarks are a browser feature too, and all browsers allow webpages to provide a favicon for bookmarking purposes. Whatever the case, there should be a way to let users know certain web pages/apps work offline.

    I’ve only seen this discussed once, and the people involved didn’t come to any conclusions, so I’m not too hopeful on this one. However, I think it’s probably the most important point, since no users = no usage.

    2012.09.01 Edit: Someone alerted me that there is a Widgets spec in the works that’s supposed to solve this issue. I haven’t taken a good look through it yet, but it’s good to know there are attempts to solve this problem.

  • Fresh updates aren’t served immediately / F5 doesn’t break the cache / the page that links to the manifest has to be cached.

    This seems to be the most popular feature request, though after reading spec author Hixie’s take on the issue, I sort of agree that many of the use cases for using the application cache in this way are very similar to straight-up HTTP caching. Still, it would be nice if an app could use the HTTP cache while online and have that cache sync with the application cache, which would then be used when the user was offline.

    Around two months ago, a new feature was added to the spec in an attempt to address this. It takes the form of a new setting that gets added to the manifest:

    SETTINGS:
    prefer-online

    It’s explained here, though there’s some debate on whether it’s the right solution. I was unable to get it to work in Chrome or the nightly Firefox build, so I’m not sure how it works or whether it actually solves the problem.

Other Notes

  • Chrome was easier to work with than Firefox

    Chrome prints to the console as it’s loading the application into the cache, so you know instantly whether there’s a problem. Firefox didn’t do this, which made it a bit more annoying to work with.

  • Listing installed applications

    • For Chrome, browse to the URL “chrome://appcache-internals/”
    • For Firefox, in the menu, go to “Tools > Options > Advanced > Network” and see the section on “Offline Web Content and User Data”.

  • In the manifest, URLs need to be URL-encoded and query strings need to be included

    Kind of obvious, but worth mentioning.

  • The popular iframe trick doesn’t work in Firefox

    There is a popular trick that aims to let an HTML page use the application cache without being cached itself. The page includes an iframe whose document has an HTML tag that links to the manifest, so the iframe’s document is the one that gets implicitly cached. From my own tests, this trick works in Chrome but not in Firefox.

    2012.09.10 Update: After updating to Firefox 15, it seems to work now.

  • JavaScript API

    I didn’t go into it here, but there’s also a JavaScript API for the application cache. See the “Additional Resources” section below for more information.

  • What would happen if every website wanted to work offline?

    The space on a user’s computer is finite. I’m not sure how offline storage will work if offline apps end up becoming very popular in the future. Maybe the solution is to focus on installable apps which have the option of working offline – and users can pick and choose what they want installed?
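
Returning to the note above about URL encoding and query strings: one way to avoid mistakes is to generate the manifest rather than write it by hand. The function below is a hypothetical sketch, not something TAAG uses. The name buildManifest and its options object are made up for illustration; encodeURI is the standard JavaScript function. It escapes spaces and non-ASCII characters while leaving "?", "&" and "=" alone, so query strings stay intact.

```javascript
// Hypothetical helper, not part of TAAG: builds manifest text from a list
// of URLs and a version string.
function buildManifest(options) {
  var lines = ['CACHE MANIFEST', '# Version: ' + options.version, '', 'CACHE:'];

  options.cache.forEach(function (url) {
    // encodeURI leaves query-string punctuation untouched but escapes
    // characters such as spaces that are not allowed in a URL.
    lines.push(encodeURI(url));
  });

  // Everything not listed above is fetched from the network when online.
  lines.push('', 'NETWORK:', '*');
  return lines.join('\n') + '\n';
}

var manifest = buildManifest({
  version: '2012.08.18',
  cache: ['one.jpg', 'my photo.jpg', 'jquery.js?12345']
});
console.log(manifest);
```

Changing the version string changes the manifest's contents, which is what prompts the browser to refresh the application cache. Note that passing an already-encoded URL through encodeURI would encode its "%" signs a second time, so the input list should hold unencoded URLs.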

Additional Resources
