Access control and version control: an over-constrained problem?
For a long time, when it came to sharing code, life was simple: everything went into the w3ccvs repository via ssh; when I commit WWW/People/Connolly/Overview.html the latest version is propagated out to all the web mirrors automatically and everybody can see the results at http://www.w3.org/People/Connolly/. I can also write via HTTP, using Amaya or any command-line tool with PUT support, e.g. curl. The PUT results in a cvs commit, followed by the same mirroring magic.
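For readers without curl to hand, the write-via-PUT step might look roughly like this in Python. This is only a sketch: the example.org address and the page body are made up for illustration, authentication is left out entirely, and the request is built but never sent.

```python
import urllib.request


def build_put_request(url, body, content_type="text/html"):
    """Build (but do not send) an HTTP PUT request carrying a new page body."""
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Content-Type", content_type)
    return request


# Hypothetical target and content, for illustration only.
req = build_put_request(
    "http://example.org/People/Connolly/Overview.html",
    b"<html><head><title>test</title></head><body></body></html>",
)
```

Passing req to urllib.request.urlopen would actually send it; on a server set up like the one described here, that is the point at which the commit and mirroring would happen.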
We have a forms-based system for setting the access control for each file -- access control to the latest version, that is; access to older versions, to logs and the rest of the usual cvs goodies is limited to the few dozen people with access to the w3ccvs repository.
To overcome that limitation, Daniel Veillard set up dev.w3.org which conforms more to the open source norms, providing anonymous read-only CVS access and web-browseable history. There is no http-based access control there, though CVS commit access is managed with ssh.
The downside of dev.w3.org is that it doesn't support web publishing like w3ccvs does. http://dev.w3.org/cvsweb/2001/palmagent/event-test.html is the cvs history of event-test.html, not the page itself. The address that gives the page itself is http://dev.w3.org/cvsweb/~checkout~/2001/palmagent/event-test.html?rev=HEAD&content-type=text/html;%20charset=iso-8859-1. And every GET involves cvs reading the ,v file. And of course, relative links don't work.
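To show how unwieldy the checkout address is, here is a small Python sketch that rebuilds it from a repository path. The function name and its defaults are my own; only the URL layout is taken from the example above.

```python
from urllib.parse import quote

CVSWEB_BASE = "http://dev.w3.org/cvsweb"


def checkout_url(path, rev="HEAD",
                 content_type="text/html; charset=iso-8859-1"):
    """Return the cvsweb address that serves the file itself, not its history."""
    # Keep "/", ";" and "=" literal so the result matches the form quoted above;
    # the space in the content type becomes %20.
    query = "rev=" + quote(rev) + "&content-type=" + quote(content_type, safe="/;=")
    return CVSWEB_BASE + "/~checkout~/" + path + "?" + query


print(checkout_url("2001/palmagent/event-test.html"))
```

Every link to a page has to carry the revision and content type this way, which is part of why relative links between pages break.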
For the SWAP project, we use a horrible kludge of both: we commit to w3ccvs, and every 15 minutes, dev.w3.org is updated by rsync.
Then I have a personal web site, where I run Zope. That gives me thru-the-web editing with revision history/audit/undo, but at the cost of having the whole site in one big file on the server, and no local goodies like cvs diff. Ugh. I'd like to switch to something else, but I haven't found anything else that talks webdav to iCal out-of-the-box. Plus, I can spare precious little time for sysadmin on my personal site, where there's basically just one user. I was really reluctant to use flickr.com URIs for my photos, but their tools are so much nicer that the alternative is that I basically don't publish photos at all. Plus, the social networking benefits of publishing on flickr are considerable. But that's really a separate point (or is it? hmm).
As I wrote in a January item, we're using svn in DIG. DIG is part of CSAIL, which is a long-time Kerberos shop. Public key authentication is so much nicer than shared key. I can use cvs or svn over ssh to any number of places (mindswap, microformats, ...) with the same public key. But I need separate credentials (and a separate kinit step) for CSAIL. Ah... but it does propagate svn commit to http space in the right way. I think I'll try it ... see data4.
I'd rather use hg, which is peer-to-peer; it's theoretically possible to use svn as an hg peer, but that's slightly beyond the documented state-of-the-art.
citing W3C specs from WWW conference papers
As I said in a July 2000 message to www-rdf-interest:
There are very few data formats I trust... when I use the computer to capture my knowledge, I pretty much stick to plain text, XML (esp XHTML, or at least HTML that tidy can turn into XHTML for me), RCS/CVS, and RFC822/MIME. I use JPG, PNG, and PDF if I must, but not for capturing knowledge for exchange, revision, etc.
And as I explained in a 1994 essay, converting from LaTeX is hard, so I try not to write in LaTeX either.
The Web conference has instructions for submitting PDF using LaTeX or MS Word and (finally!) for submitting XHTML. (The WWW2006 paper CSS stylesheet is horrible... who wants to read 9pt times on screen?!?! Anyway...) So when the IRW 2006 organizers told me they'd like a PDF version of my paper in that style, I dusted off my Transforming XHTML to LaTeX and BibTeX tools and got to work.
My paper cites a number of W3C specs, including HTML 4. The W3C tech reports index/digital library has an associated bibliography generator. I fed it http://www.w3.org/TR/html401 and it generated a nice bibliographic reference from an RDF data set. I'm interested in the ongoing citation microformats work that might make that transformation lossless, since I need not just XHTML, but BibTeX. What I'm doing currently is adding some BibTeX vocabulary in class and rel attributes:
<dt class="TechReport"> <a name="HTML4" id="HTML4">[HTML4]</a> </dt>
<dd><span class="author">Le Hors, Arnaud and Raggett, Dave and Jacobs, Ian</span> Editors,
  <cite> <a href="http://www.w3.org/TR/1999/REC-html401-19991224">HTML 4.01 Specification</a> </cite>,
  <span class="institution">W3C</span> Recommendation,
  24 <span class="month">December</span> <span class="year">1999</span>,
  <tt class="number">http://www.w3.org/TR/1999/REC-html401-19991224</tt>.
  <a href="http://www.w3.org/TR/html401" title="Latest version of HTML 4.01 Specification">Latest version</a> available at http://www.w3.org/TR/html401 .</dd>
When run thru my xh2bib.xsl, out comes:
@TechReport{HTML4,
title = "{
HTML 4.01 Specification
}",
author = {Le Hors, Arnaud
and Raggett, Dave
and Jacobs, Ian},
institution = {W3C},
month = {December},
year = {1999},
number = {http://www.w3.org/TR/1999/REC-html401-19991224},
howpublished = { \url{http://www.w3.org/TR/1999/REC-html401-19991224} }
}
I think I should be using editor = rather than author = but that didn't work the 1st time I tried and I haven't investigated further.
In any case, I'm reasonably happy with the PDF output.
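The mapping from class attributes to BibTeX fields can be roughly sketched in Python. This is not xh2bib.xsl, just an approximation of the idea. It assumes the dt/dd pair is wrapped in a dl, that the entry key is the id on the anchor inside dt, and that every element with a class attribute names a field; the function names are made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup above, wrapped in <dl> so it parses as one document.
SNIPPET = """<dl>
<dt class="TechReport"><a name="HTML4" id="HTML4">[HTML4]</a></dt>
<dd><span class="author">Le Hors, Arnaud and Raggett, Dave and Jacobs, Ian</span> Editors,
<cite><a href="http://www.w3.org/TR/1999/REC-html401-19991224">HTML 4.01 Specification</a></cite>,
<span class="institution">W3C</span> Recommendation,
24 <span class="month">December</span> <span class="year">1999</span>,
<tt class="number">http://www.w3.org/TR/1999/REC-html401-19991224</tt>.</dd>
</dl>"""


def entries(dl):
    """Yield (entry_type, key, fields) for each dt/dd pair in a <dl> element."""
    children = list(dl)
    for dt, dd in zip(children[0::2], children[1::2]):
        fields = {}
        link = dd.find("cite/a")
        if link is not None:
            fields["title"] = "".join(link.itertext()).strip()
        # Assumption: any element carrying a class attribute is a BibTeX field.
        for el in dd.iter():
            cls = el.get("class")
            if cls:
                fields[cls] = "".join(el.itertext()).strip()
        if link is not None:
            fields["howpublished"] = "\\url{%s}" % link.get("href")
        yield dt.get("class"), dt.find("a").get("id"), fields


def to_bibtex(entry_type, key, fields):
    """Serialize one entry in BibTeX syntax."""
    body = ",\n".join("  %s = {%s}" % (name, value) for name, value in fields.items())
    return "@%s{%s,\n%s\n}" % (entry_type, key, body)


for entry in entries(ET.fromstring(SNIPPET)):
    print(to_bibtex(*entry))
```

Keeping the field names in the markup means the XHTML stays the single source and the BibTeX is regenerated whenever needed.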
On GData, SPARQL update, and RDF Diff/Sync
The Google Data APIs Protocol is pretty interesting. It seems to be based on the Atom publishing protocol, which is a pretty straightforward application of HTTP and XML, so that's a good thing.
The query features seem to be less expressive than the SPARQL protocol, but GData has an update feature, while the SPARQL update issue is postponed. Updating at the triple level is tricky. I helped TimBL refine Delta: an ontology for the distribution of differences between RDF graphs a bit, and there's working code in cwm. But I haven't really managed to use it in practical settings. My PDA's calendar has an XMLRPC service where I can update a whole record at a time, just like GData. I assume caldav does likewise.
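To make the triple-level idea concrete, here is a deliberately naive Python sketch. It treats a graph as a set of (subject, predicate, object) tuples and a difference as the triples to delete and to insert. It is not the cwm algorithm, and it ignores blank nodes, which are a large part of what makes the real problem tricky. The sample data and the dictionary keys are labels chosen for this sketch.

```python
def diff(old, new):
    """Return the triples to delete from and insert into `old` to obtain `new`."""
    return {"deletion": old - new, "insertion": new - old}


def patch(graph, delta):
    """Apply a delta produced by diff() to a graph."""
    return (graph - delta["deletion"]) | delta["insertion"]


# Made-up calendar data: one event whose summary changes.
old = frozenset({
    ("ex:event1", "ex:summary", "Lunch"),
    ("ex:event1", "ex:start", "2006-05-01T12:00"),
})
new = frozenset({
    ("ex:event1", "ex:summary", "Lunch with Tim"),
    ("ex:event1", "ex:start", "2006-05-01T12:00"),
})

delta = diff(old, new)
assert patch(old, delta) == new
```

Note that changing one property shows up as a deletion plus an insertion, whereas a record-at-a-time update just sends the whole event again.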
The GData approach to concurrency looks quite reasonable. I haven't studied the authentication mechanism. I hope to get to that presently.

