RDF

A look at emerging Web security architectures from a Semantic Web perspective

Submitted by connolly on Fri, 2006-03-17 17:51.

W3C had a workshop, Toward a more Secure Web this week. Citigroup hosted; the view from the 50th floor was awesome.

Some notes on the workshop are taking shape:

A look at emerging Web security architectures from a Semantic Web perspective

Comparing OpenID, SXIP/DIX, InfoCard, SAML to RDF, GRDDL, FOAF, P3P, XFN and hCard

At the W3C security workshop this week, I finally got to study SXIP in some detail after hearing about it and wondering how it compares to OpenID, Yadis, and the other "Identity 2.0" techniques brewing. And just in time, with a DIX/SXIP BOF at the Dallas IETF next week.

Getting my Personal Finance data back with hCalendar and hCard

Submitted by connolly on Wed, 2006-03-08 19:25.

The Quicken Interchange Format (QIF) is notoriously inadequate for clean import/export. The instructions for migrating Quicken data across platforms say:

  1. From the old platform, dump it out as QIF
  2. On the new platform, read in the QIF data
  3. After importing the file, verify that account balances in your new Quicken for Mac 2004 data file are the same as those in Quicken for Windows. If they don't match, look for duplicate or missing transactions.

I have not migrated my data from Windows 98 to OS X because of this mess. I use Win4Lin on my Debian Linux box as life-support for Quicken 2001.

Meanwhile, Quicken supports printing any report to a tab-separated file, and I found that an exhaustive transaction report represents transfers unambiguously. Since October 2000, when my testing showed that I could re-create various balances and reports from these tab-separated reports, I have been maintaining a CVS history of my exported Quicken data, splitting it every few years:

 $ wc *qtrx.txt
    4785   38141  276520 1990-1996qtrx.txt
    6193   61973  432107 1997-1999qtrx.txt
    4307   46419  335592 2000qtrx.txt
    5063   54562  396610 2002qtrx.txt
    5748   59941  437710 2004qtrx.txt
   26096  261036 1878539 total
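Re-creating the balances is not much more than summing an amount column per account. Here's a minimal sketch of that kind of check in present-day Python; the column positions (acct_col, amt_col) are assumptions for illustration, not the actual layout of a Quicken transaction report:

import csv
import sys
from collections import defaultdict
from decimal import Decimal, InvalidOperation

def balances(filenames, acct_col=3, amt_col=-1):
    """Sum the amount column per account across tab-separated reports.

    acct_col and amt_col are hypothetical positions chosen only to
    illustrate the idea; a real report needs a real column mapping.
    """
    totals = defaultdict(Decimal)
    for fn in filenames:
        with open(fn) as f:
            for row in csv.reader(f, delimiter='\t'):
                if len(row) < 5:
                    continue  # headers, blank lines, report footer
                try:
                    amt = Decimal(row[amt_col].replace(',', ''))
                except (InvalidOperation, IndexError):
                    continue  # not a transaction row
                totals[row[acct_col]] += amt
    return totals

if __name__ == '__main__':
    for acct, total in sorted(balances(sys.argv[1:]).items()):
        print("%-30s %12s" % (acct, total))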

I started a little module on dev.w3.org... I call it Quacken currently, but I think I'm going to have to rename it for trademark reasons. I started with normalizeQData.py to load the data into PostgreSQL for use with saCASH, but then saCASH went Java/Struts well before Debian supported Java well enough for me to follow along. Without a way to run them in parallel and sync back and forth, it was a losing proposition anyway.

Then I managed to export the data to the web by first converting it to RDF/XML:

qtrx93.rdf: $(TXTFILES)
	$(PYTHON) $(QUACKEN)/grokTrx.py $(TXTFILES) >$@

... and then using searchTrx.xsl (inside a trivial CGI script) that puts up a search form, looks for the relevant transactions, and returns them as XHTML. I have done a few other reports with XSLT; nothing remarkable, but enough that I'm pretty confident I could reproduce all the reports I use from Quicken. But the auto-fill feature is critical, and I didn't see a way to do that.

Then came Google Suggest and Ajax. I'd really like to do an Ajax version of Quicken.

I switched the data from CVS to mercurial a few months ago, carrying the history over. I seem to have 189 commits/changesets, of which 154 are on the qtrx files (others are on the makefile and related scripts). So that's about one commit every two weeks.

Mercurial makes it easy to keep the whole 10 year data set, with all the history, in sync on several different computers. So I had it all with me on the flight home from the W3C Tech Plenary in France, where we did a microformats panel. Say... transactions are events, right? And payee info is kinda like hCard...

So I factored out the parts of grokTrx.py that do the TSV file handling (trxtsv.py) and wrote an hCalendar output module (trxht.py).
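To give a feel for the hCalendar side (this is only a sketch, not the actual trxht.py code; the dict keys are made up for illustration), rendering one transaction as a vevent comes down to something like:

from xml.sax.saxutils import escape

def trx_as_vevent(trx):
    """Render one transaction dict as an hCalendar vevent (XHTML fragment).

    Assumes 'date' (ISO yyyy-mm-dd), 'payee', and 'amount' keys;
    the real transaction structure in trxtsv.py differs.
    """
    return ('<div class="vevent">'
            '<abbr class="dtstart" title="%s">%s</abbr> '
            '<span class="summary">%s</span> '
            '<span class="amount">%s</span>'
            '</div>' % (trx['date'], trx['date'],
                        escape(trx['payee']), trx['amount']))

print(trx_as_vevent({'date': '2000-09-20',
                     'payee': 'SEPTEMBERS STEAKHOUSE ELMSFORD NY',
                     'amount': '29.33'}))

The dtstart and summary classes are the hCalendar part; the amount span is just extra styling.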

I also added some SPARQL-ish filtering, so you can do:

 python trxht.py --account 'MIT 2000' --class 200009xml-ny  2000qtrx.txt

And get a little microformat expense report:

Date     Payee                                Acct      Memo                                 Clr  Category/Class                     Amount
9/20/00  SEPTEMBERS STEAKHOUSE ELMSFORD NY    MIT 2000  19:19                                c    [Citi Visa HI]/200009xml-ny         29.33
9/22/00  RAMADA INNS ELMSFORD GR ELMSFORD NY  MIT 2000  3 nights                             c    [Citi Visa HI]/200009xml-ny        603.96
9/24/00  AVIS RENT-A-CAR 1 WHITE PLAINS NY    MIT 2000                                       c    [Citi Visa HI]/200009xml-ny        334.45
1/16/01  MIT                                  MIT 2000  MIT check # 20157686 dated 12/28/00  c    [Intrust Checking]/200009xml-ny    -967.74
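The --account and --class options are just predicates over the parsed transactions. A minimal sketch of the filtering idea, with hypothetical field names rather than the actual ClassFilter code in trxtsv.py:

def class_filter(classname):
    """Keep transactions that have a split in the given class.

    Assumes each transaction dict carries a 'splits' list whose items
    have a 'class' key; the real trxtsv.py structure may differ.
    """
    def keep(trx):
        return any(s.get('class') == classname for s in trx.get('splits', []))
    return keep

def account_filter(acct):
    """Keep transactions in the given account."""
    def keep(trx):
        return trx.get('acct') == acct
    return keep

def each_matching_trx(transactions, filters):
    """Yield only the transactions that pass every filter."""
    for trx in transactions:
        if all(f(trx) for f in filters):
            yield trx

Composing filters this way keeps the TSV parsing, the filtering, and the hCalendar output independent of one another.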

Mercurial totally revolutionizes coding on a plane. There's no way I would have been as productive if I couldn't commit and diff and such right there on the plane. I'm back to using CVS for the project now, in order to share it over the net, since I don't have mercurial hosting figured out just yet. But here's the log of what I did on the plane:

changeset:   19:d1981dd8e140
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 20:48:44 2006 -0600
summary: playing around with places

changeset: 18:9d2f0073853b
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 18:21:35 2006 -0600
summary: fixed filter arg reporting

changeset: 17:3993a333747b
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 18:10:10 2006 -0600
summary: more dict work; filters working

changeset: 16:59234a4caeae
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 17:30:28 2006 -0600
summary: moved trx structure to dict

changeset: 15:425aab9bcc52
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 20:57:17 2006 +0100
summary: vcards for payess with phone numbers, states

changeset: 14:cbd30e67647a
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 19:12:38 2006 +0100
summary: filter by trx acct

changeset: 13:9a2b49bc3303
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 18:45:06 2006 +0100
summary: explain the filter in the report

changeset: 12:2ea13bafc379
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 18:36:09 2006 +0100
summary: class filtering option

changeset: 11:a8f550c8759b
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 18:24:45 2006 +0100
summary: filtering in eachFile; ClassFilter

changeset: 10:acac37293fdd
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 17:53:18 2006 +0100
summary: moved trx/splits fixing into eachTrx in the course of documenting trxtsv.py

changeset: 9:5226429e9ef6
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 17:28:01 2006 +0100
summary: clarify eachTrx with another test

changeset: 8:afd14f2aa895
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 17:19:36 2006 +0100
summary: replaced fp style grokTransactions with iter style eachTrx

changeset: 7:eb020cda1e67
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 16:16:43 2006 +0100
summary: move isoDate down with field routines

changeset: 6:123f66ac79ed
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 16:14:45 2006 +0100
summary: tweak docs; noodle on CVS/hg scm stuff

changeset: 5:4f7ca3041f9a
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 16:04:07 2006 +0100
summary: split trxtsv and trxht out of grokTrx

changeset: 4:95366c104b42
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 14:48:04 2006 +0100
summary: idea dump

changeset: 3:62057f582298
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 09:55:48 2006 +0100
summary: handle S in num field

changeset: 2:0c23921d0dd3
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 09:38:54 2006 +0100
summary: keep tables bounded; even/odd days

changeset: 1:031b9758304c
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 09:19:05 2006 +0100
summary: table formatting. time to land

changeset: 0:2d515c48130b
user: Dan Connolly <connolly@w3.org>
date: Sat Mar 4 07:55:58 2006 +0100
summary: working on plane

I used doctest unit testing quite a bit, and rst for documentation:

trxht -- format personal finance transactions as hCalendar

Usage

Run a transaction report over all of your data in some date range and print it to a tab-separated file, say, 2004qtrx.txt. Then invoke a la:

$ python trxht.py 2004qtrx.txt  >,x.html
$ xmlwf ,x.html
$ firefox ,x.html

You can give multiple files, as long as the ending balance of one matches the starting balance of the next:

$ python trxht.py 2002qtrx.txt 2004qtrx.txt  >,x.html

Support for SPARQL-style filtering is in progress. Try:

$ python trxht.py --class myclass myqtrx.txt  >myclass-transactions.html

to simulate:

describe ?TRX where { ?TRX qt:split [ qs:class "9912mit-misc"] }.

Future Work

  • add hCards for payees (in progress)
  • pick out phone numbers, city/state names
  • support a form of payee smushing on label
  • make URIs for accounts, categories, classes, payees
  • support round-trip with QIF; sync back up with RDF export work in grokTrx.py
  • move the quacken project to mercurial
  • proxy via dig.csail.mit.edu or w3.org? both?
  • run hg serve on homer? swada? login.csail?
  • publish hg log stuff in a _scm/ subpath; serve the current version at the top

Reflections on the W3C Technical Plenary week

Submitted by connolly on Tue, 2006-03-07 20:31.

Here comes (some of) the TAG
Originally uploaded by Norm Walsh.

The last item on the agenda of the TAG meeting in France was "Reviewing what we have learned during a full week of meetings". I proposed that we do it on the beach, and it carried.

By then, the network woes of Monday and Tuesday had largely faded from memory.

I was on two of the plenary day panels. Tantek reports on one of them: Microformats voted best session at W3C Technical Plenary Day!. My presentation in that panel was on GRDDL and microformats. Jim Melton followed with his SPARQL/SQL/XQuery talk. Between the two of them, Noah Mendelsohn said he thought the Semantic Web might just be turning a corner.

My other panel presentation was Feedback loops and formal systems, where I talked about UML and OWL after touching on the contrast between symbolic approaches like the Semantic Web and statistical approaches like PageRank. Folksonomies are an interesting mixture of both, I suppose. Alistair took me to task for being sloppy with the term "chaotic system"; he's quite right that complex system is the more appropriate description of the Web.

The TAG discussion of that session started with jokes about how formal systems are soporific enough without putting them right after a big French lunch. TimBL mentioned the Scheme denotational semantics, and TV said that Jonathan Rees is now at Creative Commons. News to me. I spent many, many hours poring over his scheme48 code a few years back. I don't think I knew where the name came from until today: "Within 48 hours we had designed and implemented a working Scheme, including read, write, byte code compiler, byte code interpreter, garbage collector, and initialization logic."

The SemWeb IG meeting on Thursday was full of fun lightning talks and cool hacks. I led a GRDDL discussion that went well, I think. The SPARQL calendar demo rocked. Great last-minute coding, Lee and Elias!

There and back again

On the return leg of my itinerary, the captain announced the cruising altitude, as usual, and then added "... which means you'll spend most of today 6 miles above the earth."

My travel checklist worked pretty well, with a few exceptions. The postcard thing isn't a habit yet. I forgot a paperback book; that was OK since I slept quite a bit on the way over, and I got into the coding zone on the way back; more about that later, I hope.

Other Reflections

See also reflections by:

... and stay tuned for something from

See also: Flickr photo group, NCE bookmarks

Toward Semantic Web data from Wikipedia

Submitted by connolly on Tue, 2006-03-07 17:23.

When I heard about Wikimania 2006 in August in Boston, I put it on my travel schedule, at least tentatively.

Then I had an idea...

Wikipedia:Infobox where the data lives in wikipedia. sparql, anyone? or grddl?

my bookmarks, 2006-02-16

Then I put the idea in a wishlist slide in my presentation on microformats and GRDDL at the W3C technical plenary last week.

The next day, in the SemWeb IG meeting, I met Markus Krötzsch and at lunch I learned he's working on Semantic MediaWiki, a project to do just what I'm hoping for. From our discussion, I think this could work out really well.

For reference, he's 3rd from the left in a photo from wikimania 2005.

I use Wikipedia quite regularly to look up airport codes, latitudes, longitudes, lists of postal codes, and the like; boy, would I love to have it all in RDF... maybe using GRDDL on the individual pages, maybe a SPARQL interface to their DB... maybe both.

Hmm... the RDF export of their San Diego demo page seems to conflate pages with topics of pages. I guess I should file a bug.

Investigating logical reflection, constructive proof, and explicit provability

Submitted by connolly on Thu, 2006-02-16 02:38.

Prompted by a question about RDF schemas for dollars, gallons, and other units, I found myself looking into SUMO again.

The SUMO/WordNet mappings are great, and the SUMO time concepts are backed by a nice KIF formalization of Allen's work, but it's overly constrained; I prefer the Cyc style, where the before and after relations apply directly to conferences and events, without indirection via time intervals.

But what got me really thinking was the way holdsDuring takes formulas as arguments, despite the claim that SUMO is written in SUO-KIF, which has pretty traditional first-order syntax and semantics. I eventually found this addressed explicitly:

... SUMO does include a number of predicates that take formulae as arguments, such as holdsDuring, believes, and KappaFn. In Sigma, we also perform a number of "tricks" which allow the user to state things which appears to be higher order, but which are in fact first order and have a simple syntactic transformation to standard first order form.

How does SUMO employ higher order logic? in the Ontology Portal FAQ, March 2005

I'm curious about how variables are handled in these tricks. The code is all there, and I downloaded it, but I didn't get very far.

I actually don't think first-order semantics are the best fit for the Semantic Web, where formulas refer to documents and the formulas they contain; reflection is a key mechanism, and prohibiting loops is unscalable. I think constructive proof is a much better way to think about proofs in N3.

I discovered Artemov's explicit provability stuff a while ago; my notes date from September 2003:

... explicit provability provides a tool of increasing the efficiency of a formal system by verified rules without explosion of the metamathematical level of the system.

So I dug out my LogicOfProofs larch transcription notes and the Explicit provability and constructive semantics article from the Bulletin of Symbolic Logic, volume 7, No.1, pp. 1-36, 2001 and started working on lp.n3, and an N3 rule for the constructive form of modus ponens:

# application
{ (t s) comp ts.
  t proves { F => G }.
  s proves F
} => { ts proves G }.

that is: if t is an algorithm for proving that F implies G, and s is an algorithm for proving F, then ts, the composition of t and s, is an algorithm for proving G. This Curry-Howard correspondence is really nifty.
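For reference, the rule above is meant as an N3 transcription of the application axiom of Artemov's Logic of Proofs, which in the usual notation reads:

\[ t\!:\!(F \rightarrow G) \;\rightarrow\; \bigl( s\!:\!F \rightarrow (t \cdot s)\!:\!G \bigr) \]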

The proof that "Socrates is mortal" from "if Socrates is a man then he is mortal" and "Socrates is a man" looks like:

2000/10/swap$ python cwm.py util/lp.n3 --think
...
    ( lpex:a2
      lpex:a1 ) a l:_g0;
        :comp [
            a :Proof;
            :proves {lpex:Socrates a lpex:Mortal .
            } ] .

    lpex:a1 a :Proof;
        :proves {lpex:Socrates a lpex:Man .
        } .

    lpex:a2 a :Proof;
        :proves {{
            lpex:Socrates a lpex:Man .

            } log:implies {lpex:Socrates a lpex:Mortal .
            } .
        } .

... which is much easier to read than cwm's --why style:

2000/10/swap$ python cwm.py test/reason/socrates.n3 --think --why
@forSome :_g0 .
[ a pr:Conjunction,
    pr:Proof;
  pr:component [
      a pr:Inference;
      pr:evidence (
          [ a pr:Extraction;
            pr:because :_g0;
            pr:gives {:socrates a :Man .
            } ] );
      pr:rule [
          a pr:Extraction;
          pr:because :_g0;
          pr:gives {{
              :socrates a :Man .

              } log:implies {:socrates a :Mortal .
              } .
          } ] ],
    :_g0;
  pr:gives {:socrates a :Man,
      :Mortal .
      {
      :socrates a :Man .

      } log:implies {:socrates a :Mortal .
      } .
    } ].

I didn't actually see formulas occurring as terms in that 2001 paper. So it might be a distraction with respect to the original holdsDuring issue. And that version of the logic of proofs includes all of propositional calculus, including the law of the excluded middle. But among his accomplishments I see:

Reflexive lambda-calculus. The Curry-Howard isomorphism converting intuitionistic proofs into typed lambda-terms is a simple instance of an internalization property of our system lambda-infinity which unifies intuitionistic propositions (types) with lambda-calculus and which is capable of internalizing its own derivations as lambda-terms.

so perhaps I should keep studying his stuff. I wish he'd use s-expressions and QUOTE like Moore's ACL2 paper and Chaitin's work rather than doing reflection with Gödel numbering. I wonder what HA is; ah, Wikipedia to the rescue (it's Heyting arithmetic):

It is essentially Peano arithmetic, minus the law of the excluded middle...

Toward integration of cwm's proof structures with InferenceWeb's PML

Submitted by connolly on Wed, 2006-02-15 23:11.

The proof generation feature of cwm has been in development for a long time. The idea goes back at least as far as the section on proofs and validation in the original 1998 Semantic Web Roadmap, where we find seeds of the proof-based access control idea that is now funded under the Policy Aware Web project:

The HTTP "GET" will contain a proof that the client has a right to the response.

And in the TAMI project, we're looking at proofs as part of an audit mechanism for accountable datamining.

The cwm development process incorporates aspects of extreme programming: test-driven development and a variation on pair programming; when somebody has a new feature working, somebody else in the group tries to (a) reproduce the test results and (b) review the tests, if not the code. When the pair agree that the tests are OK, we claim victory in a small group setting, and if that goes well, we make a release or at least send mail about the new feature. This typically takes a week or two or three.

In the case of cwm --why, I have been evaluating the tests since at least as far back as this December 2002 check-in comment on swap/test/reason/Makefile, and I still haven't made up my mind:

date: 2002/12/30 15:00:35; author: timbl; state: Exp; lines: +9 -6
--why works up to reason/t5. GK and SBP's list bugs fixed.

Tim has explained the simplicity of the reason proof ontology to me many times, and many times I have tried and failed to understand it. I think I'm finally starting to get it. I'm nowhere near certain it's free of use/mention problems, but I'm starting to see how it works.

The InferenceWeb folks have all sorts of nifty proof browsing stuff, and they're working with us in the TAMI project. In our meeting last August, they explained PML well enough that TimBL started on to-pml.n3, some rules to convert cwm's proof structures to PML. The rest of the integration task has been starved by work on SPARQL and Paulo moving to UTEP and all sorts of other stuff, but we seem to be moving again.

I tried running one of my versioning proofs through to-pml.n3 and then looking at it with the InferenceWeb proof browser, but I haven't got the PML structure right and it doesn't give very detailed diagnostics.

I made some progress by loading one of the PML examples into the tabulator (alpha) along with my swap/reason style proof, and using the outline view to browse the structure. (It turns out that TimBL started coding the tabulator one day when he was having trouble reading a proof. GMTA ;) I discovered that PML is striped:

  • NodeSet
    • isConsequentOf
      • InferenceStep
        • hasAntecedent
          • NodeSet
            • ...

... where the swap/reason ontology just has Step objects and hangs the conclusions off them.

That was the key to some big changes to to-pml.n3. I don't have the output working in the PML browser yet, but Paulo sent me a pointer to a PML primer, which seems to have the remaining clues that I need.

See also: help checking/reading some proof related to versioning? to cwm-talk.

Using RDF and OWL to model language evolution

Submitted by connolly on Wed, 2006-02-15 18:37.

[photo: diagram on whiteboard] Back in September, the TAG had this great whiteboard discussion of versioning.

[image: versioning diagram] TimBL captured an OmniGraffle version, which is nice because it's XML and I can convert it to RDF with XSLT, but it's a Mac thing, and Dave Orchard, who's doing most of the relevant writing, was more inclined to build UML diagrams with Visio. Then Henry Thompson found this Violet tool that's just what we need: a lightweight, cross-platform (Java, in this case) editor that produces pretty clean XML representations of UML class diagrams.

This week, Dave sent an update on the terminology section.

That was just the prompt I was looking for to swap in my work on formally defining W3C's namespace change policy options from last September. I thought for a while about how to test that the RDF version of the UML diagram was right, and I realized I could get cwm to smush producers and consumers based on cardinality constraints.

First we extract RDF from the UML diagram...

ext-vers$ make test-agent-pf.n3
xsltproc ... --output ext-vers-uml.rdf grokVioletUML.xsl ext-vers-uml.violet

... and then we mix with some rules to implement OWL cardinalities...

python $swap/cwm.py ext-vers-uml.rdf test-agent.n3 \
owl-excerpt.n3 rdfs-excerpt.n3 --think \
--filter=test-agent-goal.n3 --why >test-agent-pf.n3

And indeed, it concludes a1 = a2.

I'm working on getting a proof/justification using --why and some proof browsing tools, but that's another story.

One of the reasons I'm pleased with this ontology that the TAG came up with in Edinburgh is that it allows me to formalize the concept of sublanguage that I have heard TimBL talk about now and again. For example:

  1. HTML2 is a sublanguage of HTML4.
  2. a production of chap2 is in HTML2; the intent is chap1in
  3. a consumption of chap2, ?C is in HTML4
  4. Show: ?C intent chap1in
  5. ?C intent chap1in by 1, 2, 3 and defn sublanguage
where the definition of sublanguage is:

{ [] ev:text ?TXT; :language [ is :sublanguage of ?BIG ]; :intent ?I.
  ?COMM ev:text ?TXT; :language ?BIG.
} => { ?COMM :intent ?I }.

{ [] ev:text ?TXT; :language ?BIG; :impact ?I.
  ?COMM ev:text ?TXT; :language [ is :sublanguage of ?BIG ].
} => { ?COMM :impact ?I }.

bnf2turtle -- write a turtle version of an EBNF grammar

Submitted by connolly on Fri, 2006-02-10 01:11.

In order to cross one of the few remaining t's on the SPARQL spec, I wrote bnf2turtle.py today. It turned out to be such a nice piece of code that I elaborated the module documentation using ReStructuredText. It's checked into the SPARQL spec editor's draft materials, but I'll probably move it to the swap codebase presently. Meanwhile, here's the formatted version of the documentation:

Author: Dan Connolly
Version: $Revision: 1.13 $ of 2006-02-10
Copyright: W3C Open Source License. Share and enjoy.

Usage

Invoke a la:

python bnf2turtle.py foo.bnf >foo.ttl

where foo.bnf is full of lines like:

[1] document ::= prolog element Misc*

as per the XML formal grammar notation. The output is Turtle - Terse RDF Triple Language:

:document rdfs:label "document"; rdf:value "1";
 rdfs:comment "[1] document ::= prolog element Misc*";
 a g:NonTerminal;
  g:seq (
    :prolog
    :element
    [ g:star
      :Misc
     ]
   )
.
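The input side is mostly line-oriented pattern matching. Here's a rough sketch of picking a rule line apart (not the actual bnf2turtle.py code; the real notation also has quoted strings, character classes, and comments):

import re

RULE = re.compile(r'\[(?P<num>\w+)\]\s+(?P<name>\w+)\s+::=\s+(?P<body>.*)')

def each_rule(lines):
    """Yield (number, name, body tokens) for each EBNF rule line."""
    for line in lines:
        m = RULE.match(line.strip())
        if m:
            yield m.group('num'), m.group('name'), m.group('body').split()

for num, name, body in each_rule(['[1] document ::= prolog element Misc*']):
    print(num, name, body)   # 1 document ['prolog', 'element', 'Misc*']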

Motivation

Many specifications include grammars that look formal but are not actually checked, by machine, against test data sets. Debugging the grammar in the XML specification has been a long, tedious manual process. Only when the loop is closed between a fully formal grammar and a large test data set can we be confident that we have an accurate specification of a language [1].

The grammar in the N3 design note has evolved based on the original manual transcription into a python recursive-descent parser and subsequent development of test cases. Rather than maintain the grammar and the parser independently, our goal is to formalize the language syntax sufficiently to replace the manual implementation with one derived mechanically from the specification.

[1] and even then, only the syntax of the language.

Related Work

Sean Palmer's n3p announcement demonstrated the feasibility of the approach, though that work did not cover some aspects of N3.

In development of the SPARQL specification, Eric Prud'hommeaux developed Yacker, which converts EBNF syntax to perl and C and C++ yacc grammars. It includes an interactive facility for checking strings against the resulting grammars. Yosi Scharf used it in cwm Release 1.1.0rc1, which includes a SPARQL parser that is almost completely mechanically generated.

The N3/turtle output from yacker is lower level than the EBNF notation from the XML specification; it has the ?, +, and * operators compiled down to pure context-free rules, obscuring the grammar structure. Since that transformation is straightforwardly expressed in semantic web rules (see bnf-rules.n3), it seems best to keep the RDF expression of the grammar in terms of the higher level EBNF constructs.

Open Issues and Future Work

The yacker output also has the terminals compiled to elaborate regular expressions. The best strategy for dealing with lexical tokens is not yet clear. Many tokens in SPARQL are case insensitive; this is not yet captured formally.

The schema for the EBNF vocabulary used here (g:seq, g:alt, ...) is not yet published; it should be aligned with swap/grammar/bnf and the bnf2html.n3 rules (and/or the style of linked XHTML grammar in the SPARQL and XML specifications).

It would be interesting to corroborate the claim in the SPARQL spec that the grammar is LL(1) with a mechanical proof based on N3 rules.

Background

The N3 Primer by Tim Berners-Lee introduces RDF and the Semantic web using N3, a teaching and scribbling language. Turtle is a subset of N3 that maps directly to (and from) the standard XML syntax for RDF.

I started with a kludged and broken algorithm for handling the precedence of | vs concatenation in EBNF rules; for a moment I thought the task required a yacc-like LR parser, but then I realized recursive descent would work well enough. A dozen or so doctests later, it did indeed work. I haven't checked the resulting grammar against the SPARQL tests yet, but it sure looks right.
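The precedence fix comes down to a two-level recursive descent: an expression is alternatives separated by '|', and each alternative is a sequence. A sketch of just that idea, over a pre-tokenized list and ignoring ?, +, * and grouping (which the real code handles):

def parse_alt(tokens, pos=0):
    """expr ::= seq ('|' seq)*  -- '|' binds looser than concatenation."""
    branches = []
    seq, pos = parse_seq(tokens, pos)
    branches.append(seq)
    while pos < len(tokens) and tokens[pos] == '|':
        seq, pos = parse_seq(tokens, pos + 1)
        branches.append(seq)
    return (('alt', branches) if len(branches) > 1 else branches[0]), pos

def parse_seq(tokens, pos):
    """seq ::= item*  -- a run of names up to the next '|' or end."""
    items = []
    while pos < len(tokens) and tokens[pos] != '|':
        items.append(tokens[pos])
        pos += 1
    return (('seq', items) if len(items) != 1 else items[0]), pos

print(parse_alt('a b | c'.split())[0])   # ('alt', [('seq', ['a', 'b']), 'c'])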

Then I wondered how much of the formal grammar notation from the XML spec I hadn't covered, so I tried it out on the XML grammar (after writing a 20 line XSLT transformation to extract the grammar from the XML REC) and it worked the first time! So I think it's reasonably complete, though it has a few details that are hard-coded to SPARQL.

See also: cwm-talk discussion, swig chump entry.

tabulator use cases: when can we meet? and PathCross

Submitted by connolly on Wed, 2006-02-08 13:48.

I keep all sorts of calendar info in the web, as do my colleagues and the groups and organizations we participate in.

Suppose it was all in RDF, either directly as RDF/XML or indirectly via GRDDL as hCalendar or the like.

Wouldn't it be cool to grab a bunch of sources, and then tabulate names vs. availability on various dates?

I would probably need rules in the tabulator; Jos's work sure seems promising.
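A back-of-the-envelope sketch of the tabulation, not the tabulator itself: it assumes the sources are already RDF Calendar data (ical:Vevent with ical:dtstart) and uses rdflib; hCalendar pages would go through GRDDL first.

import sys
from collections import defaultdict
from rdflib import Graph, Namespace, RDF

ICAL = Namespace('http://www.w3.org/2002/12/cal/ical#')

def busy_dates(sources):
    """Map each calendar source to the set of dates on which it has events."""
    table = defaultdict(set)
    for src in sources:
        g = Graph()
        g.parse(src)
        for ev in g.subjects(RDF.type, ICAL['Vevent']):
            start = g.value(ev, ICAL['dtstart'])
            if start is not None:
                table[src].add(str(start)[:10])  # keep just the date part
    return table

if __name__ == '__main__':
    table = busy_dates(sys.argv[1:])
    dates = sorted(set().union(*table.values())) if table else []
    print('\t'.join(['source'] + dates))
    for src, days in sorted(table.items()):
        print('\t'.join([src] + ['busy' if d in days else '' for d in dates]))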

Closely related is the PathCross use case...

Suppose I'm travelling to Boston and San Francisco in the next couple of months. I'd like my machine to let me know if I have a FriendOfaFriend who lives there or plans to be there.

See also the Open Group's Federated Free/Busy Challenge.

RSS is dead; long live RSS

Submitted by connolly on Thu, 2006-02-02 23:36.

Just after I saw the news that Norm is killing his RSS feed in favor of Atom, my city's newsletter touts that they're starting to use RSS.

They don't seem to keep back issues of the newsletter online, and their markup is invalid. Maybe they'd better study some of the older technologies like URIs and HTML before moving up to RSS.

Escaped markup really is nasty. Is there a way to do it with rdf:parseType="Literal" that works? Why isn't this answer readily available from a test suite? There are validation services for RSS and Atom, and I'm sure various tools have various regression suites, but I don't see much in the way of collaborative development of test suites for feeds.

I've done a little bit of work on hCard testing; I try to stay test-driven as I write GRDDL transformations for microformats. I'd like to find more time for that sort of thing.
