Modelling HTTP cache configuration in the Semantic Web
The W3C Semantic Web Interest Group is considering URI best practices, whether to use LSIDs or HTTP URIs, etc. I ran into some of them at MIT last week. At first it sounded like they wanted some solution so general it would solve the only two hard things in Computer Science: cache invalidation and naming things, as Phil Karlton would say. But then we started talking about a pretty interesting approach: using the semantic web to model cache configuration. It has long been a thorn in my side that there is no standard/portable equivalent to .htaccess files, no RDF schema for HTTP and MIME, etc.
At WWW9 in May 2000, I gave a talk on formalizing HTTP caching. Where I used Larch there, I'd use RDF, OWL, and N3 rules today. I made some progress in that direction in August 2000: An RDF Model for GET/PUT and Document Management.
Web Architecture: Protocols for State Distribution is a draft I worked on around 1996 to 1999 without ever really finishing it.
I can't find Norm Walsh's item on wwwoffle config, but I did find his XML 2003 paper Caching in with Resolvers:
This paper discusses entity resolvers, caches, and other strategies for dealing with access to sporadically available resources. Our principle focus is on XML Catalogs and local proxy caches. We’ll also consider in passing the ongoing debate of names and addresses, most often arising in the context of URNs vs. URLs.
In Nov 2003 I worked on Web Architecture Illustrated with RDF diagramming tools.
The tabulator, as it's doing HTTP, propagates stuff like content type, last modified, etc. from JavaScript into its RDF store. Meanwhile, the accessibility evaluation and repair folks just released HTTP Vocabulary in RDF. I haven't managed to compare the tabulator's vocabulary with that one yet. I hope somebody does soon.
And while we're doing this little survey, check out the URI Template stuff by Joe Gregorio and company. I haven't taken a very close look yet, but I suspect it'll be useful for various problems, if not this one in particular.
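For a rough idea of what propagating HTTP headers into an RDF store can look like, here is a minimal Python sketch. It is not the tabulator's code, and the namespace `http://example.org/http-headers#` and the function names are made up for illustration; they match neither the tabulator's vocabulary nor the HTTP Vocabulary in RDF.

```python
# Hypothetical namespace, for illustration only. A real store would use
# terms from a published vocabulary instead.
HTTP_NS = "http://example.org/http-headers#"


def headers_to_triples(uri, headers):
    """Turn response headers into (subject, predicate, object) tuples.

    Each header becomes one statement about the resource `uri`, with the
    lower-cased header name as the predicate's local name and the header
    value as a plain literal.
    """
    triples = []
    for name, value in headers.items():
        predicate = HTTP_NS + name.strip().lower()
        triples.append((uri, predicate, value.strip()))
    return triples


def to_ntriples(triples):
    """Serialize the tuples as N-Triples-style lines with literal objects."""
    lines = []
    for s, p, o in triples:
        # Escape backslashes and double quotes inside the literal.
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    response_headers = {
        "Content-Type": "text/html",
        "Last-Modified": "Tue, 01 Aug 2006 12:00:00 GMT",
    }
    print(to_ntriples(headers_to_triples("http://example.org/doc", response_headers)))
```

Treating every header as a plain literal is the simplest possible mapping; comparing vocabularies would mostly be a matter of deciding which headers deserve richer modelling (dates as typed literals, media types as resources, and so on).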
WRT modelling HTTP caching itself, I've been thinking about writing some cosmogol for it to see how both fare. Want to help?
As far as modelling cache configuration, we spent a lot of time working on this at Akamai, leading to URISpace (which wasn't really compatible with the Semantic Web, but it did what we needed). The main trick is balancing between editability and readability; it's hard to make it both flexible and easy to understand. The other tricky part is mapping between stuff you get in-request (URI, headers), in-response (usually headers), and in out-of-band configuration, as well as clearly establishing precedence among them. This is probably just a use case for the Web configuration problem, to some degree. I've often thought about putting together a translator for URISpace to Apache config, squid accelerator config, robots.txt, p3p.xml, site maps, and so on.
WRT URI Templates: they're more one-off; I'd suggest looking at something like WADL -- which uses them -- for more capable configuration.
Cheers,


Although it may seem to be a dull subject from the outside looking in, the standardization of URI Templates will probably be the most positive thing to happen for URL Design in the last 10 years. Cheers, Sebie