Eben Moglen on Free Software and Social Justice
The original appearance of this entry was in Danny Weitzner - Open Internet Policy
Leader of the free software movement, colleague and friend of mine Eben Moglen gave an important talk on Free Software and Social Justice as the Plone Conference keynote earlier this year. Eben gives a most compelling account of the connection between open access to software and general social welfare. I encourage you to read the whole speech, but consider how Eben characterizes the importance of software in our society by analogy to previous large-scale intellectual, technological and economic developments:
The twenty-first century economy is undergirded by software. Which is as crucial as the underlying element in economic development in the twenty-first century as the production of steel ingots was in the twentieth. We have moved to a societal structure in this country, are moving elsewhere in the developed world, will continue to move throughout the developing economies, towards economies whose primary underlying commodity of production is software. And the good news is that nobody owns it.
The reason that this is good news requires us to go back to a moment in the past in the development of the economies of the West, before steel. What was, after all, characteristic of the economy before steel was the slow persistent motivated expansion of European societies and European economies out into the larger world for both much evil and much good built around the possession of a certain number of basic technological improvements, mostly around naval transportation and armament. All of which was undergirded by a control of mathematics superior to the control of mathematics available in other cultures around the world. There are lots of ways we could conceive the great European expansion which redescribed human beings’ relationship to the globe. But one way to put it is they had the best math. And nobody owned that either.
Imagine if you will for a moment a society in which mathematics has become property, and it’s owned by people. Now every time you want to do anything useful: build a house, make a boat, start a bridge, devise a market, move objects weighing certain numbers of kilos from one place to another your first stop is at the mathematics store to buy enough mathematics to complete the task which lies before you. You can only use as much arithmetic at a time as you can afford, and it is difficult to build a sufficient inventory of mathematics, given its price, to have any extra on hand. You can predict, of course, that the mathematics sellers will get rich. And you can predict that every other activity in society, whether undertaken for economic benefit or for the common good, will pay taxes in the form of mathematics payments.
From there he goes on to show how sharing software will contribute to greater equality, prosperity and justice. I’ve sometimes wondered why so many smart, dedicated, insightful people are so passionate about free software. Eben explains why. Read it!
Celebrating OWL interoperability and spec quality
In a Standards and Pseudo Standards item in July, Holger Knublauch gripes that SQL interoperability is still tricky after all these years, and UML is still shaking out bugs, while RDF and OWL are really solid. I hope GRDDL and SPARQL will get there soon too.
At the OWL: Experiences and Directions workshop in Athens today, as the community gathered to talk about problems they see with OWL and what they'd like to add to OWL, I felt compelled to point out (using a few slides) that:
- XML interoperability is quite good and tools are pretty much ubiquitous, but don't forget the XML Core working group has fixed over 100 errata in the specifications since they were originally adopted in 1998.
- HTML interoperability is a black art; the specification is only a small part of what you need to know to build interoperable tools.
- XML Schema interoperability is improving, but interoperability problem reports are still fairly common, and it's not always clear from the spec which tool is right when they disagree.
And while the OWL errata do include a repeated sentence and a missing word, there have been no substantive problems reported in the normative specifications.
How did we do that? The OWL deliverables include:
- Rigorous normative specification using mathematical logic
- based on mature research results
- Overview, Guide, Reference, also part of the standard
- Note translations in French, Hungarian, Japanese contributed by the community.
- 100s of tests developed concurrently with the spec
- demonstrating each feature
- capturing dozens of issues

Jeremy and Jos did great work on the tests. And Sandro's approach to getting test results back from the tool developers was particularly inspired. He asked them to publish their test results as RDF data in the web. Then he provided immediate feedback in the form of an aggregate report that included updates live. After our table of test results had columns from one or two tools, several other developers came out of the woodwork and said "here are my results too." Before long we had results from a dozen or so tools and our implementation report was compelling.
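The aggregation step described above can be sketched roughly as follows. This is only an illustration: the tool names, test names and outcomes are made up, and the rows stand in for results that would really be read from RDF data published by each developer.

```python
from collections import defaultdict

# Hypothetical (tool, test, outcome) rows, standing in for results that each
# developer would publish as RDF data on the web.
RESULTS = [
    ("ToolA", "test-001", "pass"),
    ("ToolA", "test-002", "fail"),
    ("ToolB", "test-001", "pass"),
    ("ToolB", "test-002", "pass"),
]

def aggregate(rows):
    """Build {test: {tool: outcome}} so that each tool becomes a column."""
    table = defaultdict(dict)
    for tool, test, outcome in rows:
        table[test][tool] = outcome
    return dict(table)

def render(table):
    """Render the table as plain text, one row per test."""
    tools = sorted({tool for row in table.values() for tool in row})
    lines = ["test".ljust(10) + "  ".join(tools)]
    for test in sorted(table):
        cells = [table[test].get(tool, "-").ljust(len(tool)) for tool in tools]
        lines.append(test.ljust(10) + "  ".join(cells))
    return "\n".join(lines)

report = aggregate(RESULTS)
print(render(report))
```

Adding a new tool is just a matter of appending its rows and re-running the report, which is what makes immediate feedback cheap.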
The GRDDL tests are coming along nicely; Chime's message on implementation and testing shows that the spec is quite straightforward to implement, and he updated the test harness so that we should be able to support Evaluation and Report Language (EARL) soon.
SPARQL looks a bit more challenging, but I hope we start to get some solid reports from developers about the SPARQL test collection soon too.
Blogging is great
People have, since it started, complained about the fact that there is junk on the web. And as a universal medium, of course, it is important that the web itself doesn't try to decide what is publishable. The way quality works on the web is through links.
It works because reputable writers make links to things they consider reputable sources. So readers, when they find something distasteful or unreliable, don't just hit the back button once, they hit it twice. They remember not to follow links again through the page which took them there. One's chosen starting page, and a nurtured set of bookmarks, are the entrance points, then, to a selected subweb of information which one is generally inclined to trust and find valuable.
A great example of course is the blogging world. Blogs provide a gently evolving network of pointers of interest. As do FOAF files. I've always thought that FOAF could be extended to provide a trust infrastructure for (e.g.) spam filtering and OpenID-style single sign-on and it's good to see things happening in that space.
In a recent interview with the Guardian, alas, my attempt to explain this was turned upside down into a "blogging is one of the biggest perils" message. Sigh. I think they took their lead from an unfortunate BBC article, which for some reason stressed concerns about the web rather than excitement, failure modes rather than opportunities. (This happens because, when you launch a Web Science Research Initiative, people ask what the opportunities are and what the dangers are for the future. And some editors are tempted to just edit out the opportunities and headline the fears to get the eyeballs, which is old and boring newspaper practice. We expect better from the Guardian and BBC, generally very reputable sources.)
In fact, it is a really positive time for the web. Startups are launching, and being sold [Disclaimer: people I know] again, academics are excited about new systems and ideas, conferences and camps and wikis and chat channels are hopping with energy, and every morning demands an excruciating choice of which exciting link to follow first.
And, fortunately, we have blogs. We can publish what we actually think, even when misreported.
Reinventing HTML
Making standards is hard work. It's hard because it involves listening to other people and figuring out what they mean, which means figuring out where they are coming from, how they are using words, and so on.
There is the age-old tradeoff for any group as to whether to zoom along happily, in relative isolation, putting off the day when they ask for reviews, or whether to get lots of people involved early on, so a wider community gets on board earlier, with all the time that costs. That's a trade-off which won't go away.
The solutions tend to be different for each case, each working group. Some have lots of reviewers and some few, some have lots of time, some urgent deadlines.
A particular case is HTML. HTML has the potential interest of millions of people: anyone who has designed a web page may have useful views on new HTML features. It is the earliest spec of W3C, a battleground of the browser wars, and now the most widespread spec.
The perceived accountability of the HTML group has been an issue. Sometimes this was a departure from the W3C process, sometimes a sticking to it in principle, but not actually providing assurances to commenters. An issue was the formation of the breakaway WHAT WG, which attracted reviewers though it did not have a process or specific accountability measures itself.
There has been discussion in blogs where Daniel Glazman, Björn Hörmann, Molly Holzschlag, Eric Meyer, Jeffrey Zeldman and others have shared concerns about W3C's work, particularly in the HTML area. The validator and other subjects cropped up too, but let's focus on HTML now. We had a W3C retreat in which we discussed what to do about these things.
Some things are very clear. It is really important to have real developers on the ground involved with the development of HTML. It is also really important to have browser makers intimately involved and committed. And also all the other stakeholders, including users and user companies and makers of related products.
Some things are clearer with hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once, didn't work. The large HTML-generating public did not move, largely because the browsers didn't complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all. It is important to maintain HTML incrementally, as well as continuing a transition to a well-formed world, and developing more power in that world.
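To make the difference concrete, here is a minimal sketch using Python's standard library; the two markup strings are invented examples. A strict XML parser rejects unquoted attribute values and unclosed empty tags, while a lenient HTML parser reads the same tag soup without complaint, which is roughly the situation authors were in with browsers.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

TAG_SOUP = "<p class=intro>Hello<br>world</p>"        # unquoted attribute, bare <br>
WELL_FORMED = '<p class="intro">Hello<br/>world</p>'  # quoted attribute, closed empty tag

def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

class TagCollector(HTMLParser):
    """Lenient HTML parser that just records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

collector = TagCollector()
collector.feed(TAG_SOUP)
collector.close()

print(is_well_formed(TAG_SOUP))     # strict XML parsing of the tag soup
print(is_well_formed(WELL_FORMED))  # strict XML parsing of the cleaned-up version
print(collector.tags)               # what the lenient HTML parser saw in the tag soup
```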
The plan is to charter a completely new HTML group. Unlike the previous one, this one will be chartered to do incremental improvements to HTML and, in parallel, to xHTML. It will have a different chair and staff contact. It will work on HTML and xHTML together. We have strong support for this group, from many people we have talked to, including browser makers.
There will also be work on forms. This is a complex area, as existing HTML forms and XForms are both form languages. HTML forms are ubiquitously deployed, and there are many implementations and users of XForms. Meanwhile, the Webforms submission has suggested sensible extensions to HTML forms. The plan is, informed by Webforms, to extend HTML forms. At the same time, there is a work item to look at how HTML forms (existing and extended) can be thought of as XForm equivalents, to allow an easy escalation path. A goal would be to have an HTML forms language which is a superset of the existing HTML language, and a subset of an XForms language with added HTML compatibility. We will see to what extent this is possible. There will be a new Forms group, and a common task force between it and the HTML group.
There is also a plan for a separate group to work on the XHTML2 work which the old "HTML working group" was working on. There will be no dependency of HTML work on the XHTML2 work.
As well as the new HTML work, there are other things I want to change. The validator I think is a really valuable tool both for users and in helping standards deployment. I'd like it to check (even) more stuff, be (even) more helpful, and prioritize carefully its errors, warnings and mild chidings. I'd like it to link to explanations of why things should be a certain way. We have, by the way, just ordered some new server hardware, paid for by the Supporters program -- thank you!
This is going to be hard work. I'd like everyone to go into this realizing this. I'll be asking these groups to be very accountable, to have powerful issue tracking systems on the w3.org web site, and to be responsive in spirit as well as in letter to public comments. As always, we will be insisting on working implementations and test suites. Now we are going to be asking for things like talking with validator developers, maybe providing validator modules and validator test suites. (That's like a language test suite but backwards, in a way). I'm going to ask commenters to be respectful of the groups, as always. Try to check whether the comment has been made before, suggest alternative text, one item per message, etc, and add to technical perception social awareness.
This is going to be a very major collaboration on a very important spec, one of the crown jewels of web technology. Even though hundreds of people will be involved, we are evolving the technology which millions going on billions will use in the future. There won't seem to be enough thank-yous to go around some days. But we will be maintaining something very important and creating something even better.
Tim BL
p.s. comments are disabled here in breadcrumbs, the DIG research blog, but they are welcome in the W3C QA weblog.
Now is a good time to try the tabulator
Tim presented the tabulator to the W3C team today; see slides: Tabulator: AJAX Generic RDF browser.
The tabulator was sorta all over the floor when I tried to present it in Austin in September, but David Sheets put it back together in the last couple weeks. Yay David!
In particular, the support for viewing the HTTP data that you pick up by tabulating is working better than ever before. The HTTP vocabulary has URIs like http://dig.csail.mit.edu/2005/ajar/ajaw/httph#content-type. That seems like an interesting contribution to the WAI ER work on HTTP Vocabulary in RDF.
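As a rough illustration of what such a vocabulary makes possible, the sketch below turns response headers into N-Triples. It rests on assumptions: that every property URI is the namespace plus the lowercased header name (generalizing from the single `content-type` example above), and that the headers hang directly off a hypothetical `http://example.org/data` resource. The tabulator's actual modelling may differ.

```python
HTTPH = "http://dig.csail.mit.edu/2005/ajar/ajaw/httph#"

def header_triples(resource_uri, headers):
    """Turn HTTP response headers into N-Triples lines.

    Assumption: each property URI is the namespace plus the lowercased
    header name, generalizing from the one `content-type` example.
    """
    lines = []
    for name, value in headers.items():
        # Escape backslashes and double quotes for an N-Triples literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{resource_uri}> <{HTTPH}{name.lower()}> "{escaped}" .')
    return lines

triples = header_triples(
    "http://example.org/data",  # hypothetical resource
    {"Content-Type": "application/rdf+xml", "Content-Length": "1024"},
)
print("\n".join(triples))
```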
Note comments are disabled here in breadcrumbs until we figure out OpenID comment policies and Drupal, etc. The tabulator issue tracker is probably a better place to report problems anyway. We don't have OpenID working there yet either, unfortunately, but we do support email callback based account setup.
Talking with U.T. Austin students about Microformats, Drug Discovery, the Tabulator, and the Semantic Web
Working with the MIT tabulator students has been such a blast that while I was at U.T. Austin for the research library symposium, I thought I would try to recruit some undergrads there to get into it. Bob Boyer invited me to speak to his PHL313K class on why the heck they should learn logic, and Alan Cline invited me to the Dean's Scholars lunch, which I used to attend when I was at U.T.
To motivate logic in the PHL313K class, I started with their experience with HTML and blogging and explained how the Semantic Web extends the web by looking at links as logical propositions.
I used my XML 2005 slides to talk a little bit about web history and web architecture, and then I moved into using hCalendar (and GRDDL, though I left that largely implicit) to address the personal information disaster. This was the first week or so of class and they had just started learning propositional logic, and hadn't even gotten as far as predicate calculus where atomic formulas like those in RDF show up. And none of them had heard of microformats. I promised not to talk for the full hour but then lost track of time and didn't get to the punch line, "so the computer tells you that no, you can't go to both the conference and Mom's birthday party because you can't be in two places at once" until it was time for them to head off to their next class.
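The punch line boils down to an interval-overlap check over calendar data. A minimal sketch, with invented events and times standing in for what would be extracted from hCalendar markup, and assuming every event is at a different place:

```python
from datetime import datetime

# Hypothetical events, as they might be extracted from hCalendar markup.
EVENTS = {
    "conference": (datetime(2006, 9, 16, 9, 0), datetime(2006, 9, 16, 17, 0)),
    "Mom's birthday party": (datetime(2006, 9, 16, 12, 0), datetime(2006, 9, 16, 15, 0)),
    "squash game": (datetime(2006, 9, 16, 19, 0), datetime(2006, 9, 16, 20, 0)),
}

def overlaps(a, b):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]

def conflicts(events):
    """Return pairs of event names that cannot both be attended."""
    names = sorted(events)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if overlaps(events[x], events[y])
    ]

print(conflicts(EVENTS))
```

The logic is nothing more than "one person, one place at a time" written down as a rule the computer can apply to data gathered from several pages.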
One student did stay after to pose a question that is very interesting and important, if only tangentially related to the Semantic Web: with technology advancing so fast, how do you maintain balance in life?
While Boyer said that talk went well, I think I didn't do a very good job of connecting with them; or maybe they just weren't really awake; it was an 8am class after all. At the Dean's Scholars lunch, on the other hand, the students were talking to each other so loudly as they grabbed their sandwiches that Cline had to really work to get the floor to introduce me as a "local boy done good." They responded with a rousing ovation.
Elaine Rich had provided the vital clue for connecting with this audience earlier in the week. She does AI research and had seen TimBL's AAAI talk. While she didn't exactly give the talk four stars overall, she did get enough out of it to realize it would make an interesting application to add to a book that she's writing, where she's trying to give practical examples that motivate automata theory. So after I took a look at what she had written about URIs and RDF and OWL and such, she reminded me that not all the Dean's Scholars are studying computer science; but many of them do biology, and I might do well to present the Semantic Web more from the perspective of that user community.
So I used TimBL's Bio-IT slides. They weren't shy when I went too fast with terms like hypertext, and there were a lot of furrowed brows for a while. But when I got to the drug discovery diagram, I said I didn't even know some of these words and asked them which ones they knew. After a chuckle about "drug", one of them explained about SNP, i.e. single nucleotide polymorphism and another told me about OMM and the discussion really got going. I didn't make much more use of Tim's slides. One great question about integrating data about one place from lots of sources prompted me to tempt the demo gods and try the tabulator. The demo gods were not entirely kind; perhaps I should have used the released version rather than the development version. But I think I did give them a feel for it. In answer to "so what is it you're trying to do, exactly?" I gave a two part answer:
- Recruit some of them to work on the tabulator so that their name might be on the next paper like the SWUI06 paper, Tabulator: Exploring and Analyzing linked data on the Semantic Web.
- Integrate data across applications and across administrative boundaries all over the world, like the Web has done for documents.
We touched on the question of local and global consistency, and someone asked if you can reason about disagreement. I said that yes, I had presented a paper in Edinburgh just this May that demonstrated formally a disagreement between several parties.
One of the last questions was "So what is computer science research anyway?" which I answered by appeal to the DIG mission statement:
The Decentralized Information Group explores technical, institutional and public policy questions necessary to advance the development of global, decentralized information environments.
And I said how cool it is to have somebody in the TAMI project with real-world experience with the Privacy Act. One student followed up and asked if we have anybody with real legal background in the group, and I pointed him to Danny. He asked me afterward how to get involved, and it turned out that IRC and freenode are known to him, so the #swig channel was in our common neighborhood in cyberspace, even though geography would separate us as I headed to the airport to fly home.
technorati tags:Austin, semantic, web
ACL2 seminar at U.T. Austin: Toward proof exchange in the Semantic Web
In our PAW and TAMI projects, we're making a lot of progress on the practical aspects of proof exchange: in PAW we're working out the nitty gritty details of making an HTTP client (proxy) and server that exchange proofs, and in TAMI, we're working on user interfaces for audit trails and justifications and on integration with a truth maintenance system.
It doesn't concern me too much that cwm does some crazy stuff when finding proofs; it's the proof checker that I expect to deploy as part of trusted computing bases and the proof language specification that I hope will complete the Semantic Web standards stack.
But N3 proof exchange is no longer a completely hypothetical problem; the first examples of interoperating with InferenceWeb (via a mapping to PML) and with Euler are working. So it's time to take a close look at the proof representation and the proof theory in more detail.
My trip to Austin for a research library symposium at the University of Texas gave me a chance to re-connect with Bob Boyer. A while back, I told him about RDF and asked him about Semantic Web logic issues and he showed me the proof checking part of McCune's Robbins Algebras Are Boolean result:
Proofs found by programs are always questionable. Our approach to this problem is to have the theorem prover construct a detailed proof object and have a very simple program (written in a high-level language) check that the proof object is correct. The proof checking program is simple enough that it can be scrutinized by humans, and formal verification is probably feasible.
In my Jan 2000 notes, that excerpt is followed by...
I offer a 500 brownie-point bounty to anybody who converts it to Java and converts the ()'s in the input format to <>'s.
5 points for perl. ;-)
Bob got me invited to the ACL2 seminar this week; in my presentation, Toward proof exchange in the Semantic Web, I reviewed a bit of Web Architecture and the standardization status of RDF, RDFS, OWL, and SPARQL as background to demonstrating that we're close to collecting that bounty. (Little did I know in 2000 that TimBL would pick up python so that I could avoid Java as well as perl ;-)
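The idea in the excerpt above, a small checker that trusts nothing about how the proof was found, can be sketched in a few lines of Python. This is a toy, not the N3 proof checker or McCune's program: the formula encoding, the rule names `axiom` and `mp`, and the sample proofs are all made up for illustration.

```python
def check_proof(axioms, steps):
    """Check a proof object in which every step must justify itself.

    Formulas are atoms (strings) or implications ("->", antecedent, consequent).
    Each step is (formula, rule, refs), where refs index earlier steps.
    """
    proved = []
    for formula, rule, refs in steps:
        if any(r < 0 or r >= len(proved) for r in refs):
            return False  # a step may cite only earlier steps
        if rule == "axiom":
            ok = not refs and formula in axioms
        elif rule == "mp" and len(refs) == 2:
            # modus ponens: from A and A -> B, conclude B
            premise, implication = proved[refs[0]], proved[refs[1]]
            ok = implication == ("->", premise, formula)
        else:
            ok = False
        if not ok:
            return False
        proved.append(formula)
    return True

AXIOMS = {"p", ("->", "p", "q")}
GOOD = [
    ("p", "axiom", []),
    (("->", "p", "q"), "axiom", []),
    ("q", "mp", [0, 1]),
]
BAD = [("q", "axiom", [])]  # "q" is not an axiom, so the checker refuses it

print(check_proof(AXIOMS, GOOD), check_proof(AXIOMS, BAD))
```

The point of the design is that the search for the proof can be as crazy as it likes; only this short loop has to be trusted.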
Matt Kaufmann and company gave all sorts of great feedback on my presentation. I had to go back to the Semantic Web Wave diagram a few times to clarify the boundary between research and standardization:
- RDF is fully standardized/ratified
- turtle has the same expressive capability as RDF's XML syntax, but isn't fully ratified, and
- N3 goes beyond the standards in both syntax and expressiveness
One of the people there who knew about RDF and OWL and such really encouraged me to get N3/turtle done, since every time he does any Semantic Web advocacy, the RDF/XML syntax is a deal-killer. I tried to show them my work on a turtle bnf, but what I was looking for was in June mailing list discussion, not in my February bnf2turtle breadcrumbs item.
They asked what happens if an identifier is used before it appears in an @forAll directive and I had to admit that I could test what the software does if they wanted to, but I couldn't be sure whether that was by design or not; exactly how quantification and {}s interact in N3 is sort of an open issue, or at least something I'm not quite sure about.
Moore noticed that our conjunction introduction (CI) step doesn't result in a formula whose main connective is conjunction; the conjunction gets pushed inside the quantifiers. It's not wrong, but it's not traditional CI either.
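One way to picture the difference, using assumed predicates P and Q purely for illustration (this is not actual cwm output): both rules below are sound and their conclusions are classically equivalent, but only the first conclusion has conjunction as its main connective.

```latex
% Traditional conjunction introduction: the main connective of the conclusion is \land
\[
\frac{\forall x\, P(x) \qquad \forall y\, Q(y)}
     {\bigl(\forall x\, P(x)\bigr) \land \bigl(\forall y\, Q(y)\bigr)}
\]
% The step as described: quantifiers stay outermost, the conjunction moves inside
\[
\frac{\forall x\, P(x) \qquad \forall y\, Q(y)}
     {\forall x\, \forall y\, \bigl(P(x) \land Q(y)\bigr)}
\]
```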
I asked about ACL2's proof format, and they said what goes in an ACL2 "book" is not so much a proof as a sequence of lemmas and such, but Jared was working on Milawa, a simple proof checker that can be extended with new proof techniques.
I started talking a little after 4pm; different people left at different times, but it wasn't until about 8 that Matt realized he was late for a squash game and headed out.
I went back to visit them in the U.T. tower the next day to follow up on ACL2/N3 connections and Milawa. Matt suggested a translation of N3 quantifiers and {}s into ACL2 that doesn't involve quotation. He offered to guide me as I fleshed it out, but I only got as far as installing Lisp and ACL2; I was too tired to get into a coding fugue.
Jared not only gave me some essential installation clues, but for every technical topic I brought up, he printed out two papers showing different approaches. I sure hope I can find time to follow up on at least some of this stuff.
del.icio.us tags:Austin, semantic, web, logic, research
On the Future of Research Libraries at U.T. Austin
Wow. What a week!
I'm always on the lookout for opportunities to get back to Austin, so I was happy to accept an invitation to this 11 - 12 September symposium, The Research Library in the 21st Century run by University of Texas Libraries:
In today's rapidly changing digital landscape, we are giving serious thought to shaping a strategy for the future of our libraries. Consequently, we are inviting the best minds in the field and representatives from leading institutions to explore the future of the research library and new developments in scholarly communication. While our primary purpose is to inform a strategy for our libraries and collections, we feel that all participants and their institutions will benefit.
I spent the first day getting a feel for this community, where evidently a talk by Clifford Lynch of CNI is a staple. "There is no scholarship without scholarly communication," he said, quoting Courant. He noted that traditionally, publishers disseminate and libraries preserve, but we're shifting to a world where the library helps disseminate and makes decisions on behalf of the whole world about which works to preserve. He said there's a company (I wish I had made a note of the name) that has worked out the price of an endowed web site; at 4% annual return, they figure it at $2500/gigabyte.
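The arithmetic behind that figure is worth spelling out: an endowment pays for hosting out of its return, so $2500 at 4% implies a budget of about $100 per gigabyte per year. A small sketch, in which the function names are mine and the only inputs are the two numbers quoted above:

```python
def implied_annual_cost(endowment, annual_return):
    """Yearly income an endowment yields, i.e. the hosting cost it can cover."""
    return endowment * annual_return

def endowment_needed(annual_cost, annual_return):
    """Principal whose yearly return covers a recurring cost indefinitely."""
    return annual_cost / annual_return

per_gb_per_year = implied_annual_cost(2500, 0.04)
print(f"${per_gb_per_year:.0f} per gigabyte per year")
print(f"${endowment_needed(per_gb_per_year, 0.04):.0f} endowment per gigabyte")
```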
James Duderstadt from the University of Michigan told us that the day when the entire contents of the library fits on an iPod (or "a device the size of a football" for other audiences that didn't know about iPods ;-) is not so far off. He said that the University of Michigan started digitizing their 7.8 million volumes even before becoming a Google Book Search library partner. They initially estimated it would take 10 years, but the current estimate is 6 years and falling. He said that yes, there are copyright issues and other legal challenges, and he wouldn't be surprised to end up in court over it; he had done that before. Even the Sakai project might face litigation. What got the most attention, I think, was when he relayed first-hand experience from the Spellings Commission on the Future of Higher Education; their report is available to those that know where to look, though it is not due for official release until September 26.
He also talked about virtual organizations, i.e. groups of researchers from universities all over, and even the "meta university," with no geographical boundaries at all. That sort of thing fueled my remarks for the Challenges of Access and Preservation panel on the second day. I noted that my job is all about virtual organizations, and if the value of research libraries is connected to recruiting good people, you should keep in mind the fact that "get together and go crazy" events like football games are a big part of building trust and loyalty.
Kevin Guthrie, President of ITHAKA, made a good point that starting new things is usually easier than changing old things, which was exactly what I was thinking when President Powers spoke of "preserving our investment" in libraries in his opening address. U.T. invested $650M in libraries since 1963. That's not counting bricks and mortar; that's special collections, journal subscriptions, etc.
My point that following links is 96% reliable sparked an interesting conversation; it was misunderstood as "96% of web sites are persistent" and then "96% of links persist"; when I clarified that it's 96% of attempts to follow links that succeed, and this is because most attempts to follow links are from one popular resource to another, we had an interesting discussion of ephemera vs. the scholarly record and which parts need what sort of attention and what sort of policies. The main example was that 99% of political websites about the California run-off election went offline right after the election. My main point was: for the scholarly record, HTTP/DNS is as good as it gets for the foreseeable future; don't throw up your hands at the 4% and wait for some new technology; apply your expertise of curation and organizational change to the existing technologies.
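The gap between "links that persist" and "attempts that succeed" is a matter of weighting by traffic. The sketch below uses invented numbers, chosen only to show the effect; they are not measurements and are not the source of the 96% figure.

```python
# Hypothetical link population: (number of links, chance each one still works,
# follow attempts per link). The figures are invented to show the effect.
LINKS = [
    (100, 0.99, 1000),  # links between popular, well-maintained resources
    (900, 0.40, 5),     # links into the long tail of ephemera
]

def share_of_links_that_persist(groups):
    """Unweighted: what fraction of distinct links still resolve?"""
    total = sum(count for count, _, _ in groups)
    alive = sum(count * p for count, p, _ in groups)
    return alive / total

def share_of_attempts_that_succeed(groups):
    """Weighted by traffic: what fraction of clicks reach their target?"""
    attempts = sum(count * clicks for count, _, clicks in groups)
    successes = sum(count * p * clicks for count, p, clicks in groups)
    return successes / attempts

print(f"{share_of_links_that_persist(LINKS):.1%} of links persist")
print(f"{share_of_attempts_that_succeed(LINKS):.1%} of attempts succeed")
```

With these made-up inputs fewer than half of the links survive, yet the large majority of attempts succeed, because nearly all the traffic flows over the well-maintained links.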
In fact, I didn't really get beyond URIs and basic web architecture in my remarks. I had prepared some points about the Semantic Web, but I didn't have time for them in my opening statement and they didn't come up much later in the conversation, except when Ann Wolpert, Director of Libraries at MIT, brought up DSpace a bit.
Betsy Wilson of the University of Washington suggested that collaboration would be the hallmark of the library of the future. I echoed that back in the wrap-up session referring to library science as the "interdisciplinary discipline"; I didn't think I was making that up (and a google search confirms I did not), but it seemed to be new to this audience.
By the end of the event I was pretty much up to speed on the conversation; but on the first day, I felt a little out of place and when I saw the sound engineer getting things ready, I mentioned to him that I had a little experience using and selling that sort of equipment. It turned out that he's George Geranios, sound man for bands like Blue Oyster Cult for about 30 years. We had a great conversation on digital media standards and record companies. I'm glad I sat next to David Seaman of the DLF at lunch; we had a mutual colleague in Michael Sperberg-McQueen. I asked him about IFLA, one of the few acronyms from the conversation that I recognized; he helped me understand that IFLA conferences are relevant, but they're about libraries in general, and the research library community is not the same. And Andrew Dillon got me up to speed on all sorts of things and made the panel I was on fun and pretty relaxed.
Fred Heath made an oblique reference to a New York Times article about moving most of the books out of the U.T. undergraduate library as if everyone knew, but it was news to me. Later in the week I caught up with Ben Kuipers; we didn't have time for my technical agenda of linked data and access limited logic, but we did discover that both of us were a bit concerned with the fragility of civilization as we know it and the value of books over DVDs if there's no reliable electricity.
The speakers' comments at the symposium were recorded; there's some chance that edited transcripts will appear in a special issue of a journal. Stay tuned for that. And stay tuned for more breadcrumbs items on talks I gave later in the week where I did get beyond the basic http/DNS/URI layer of Semantic Web Architecture.
tags:Austin, URI, Web Architecture
Stitching the Semantic Web together with OWL at AAAI-06
I was pleased to find that AAAI '06 in Boston a couple weeks ago had a spectrum of people I know and don't know and work that's near and far from my own. The talk about the DARPA grand challenge was inspiring.
But closer to my work, I ran into Jeff Heflin, who I worked with on DAML and especially the OWL requirements document. Amid too many papers about ontologies for the sake of ontologies and threads like Is there real world RDF-S/OWL instance data?, his Investigation into the Feasibility of the Semantic Web is a breath of fresh air. The introduction sets out their approach this way:
Our approach is to use axioms of OWL, the de facto Semantic Web language, to describe a map for a set of ontologies. The axioms will relate concepts from one ontology to the other. ... There is a well-established body of research in the area of automated ontology alignment. This is not our focus. Instead we investigate the application of these alignments to provide an integrated view of the Semantic Web data.
(emphasis mine). The rest of the paper justifies this approach, leading up to:
We first query the knowledge base from the perspective of each of the 10 ontologies that define the concept Person. We now ask for all the instances of the concept Person. The results vary from 4 to 1,163,628. We then map the Person concept from all the ontologies to the Person concept defined in the FOAF ontology. We now issue the same query from the perspective of this map and we get 1,213,246 results. The results now encompass all the data sources that commit to these 10 ontologies. Note: a pair wise mapping would have taken 45 mapping axioms to establish this alignment instead of the 9 mapping axioms that we used. More importantly due to this network effect of the maps, by contributing just a single map, one will
automatically get the benefit of all the data that is available in the network.
That's fantastic stuff.
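To make the 45-versus-9 arithmetic concrete, here is a minimal sketch. It assumes the simplest reading of the quote: a pairwise alignment needs one axiom per pair of ontologies, while the hub approach needs one axiom from each non-FOAF ontology to FOAF. The function names and the toy instance data are my own illustration, not anything from the paper.

```python
from itertools import combinations


def pairwise_axioms(n: int) -> int:
    """Axioms needed to align every pair of n ontologies directly: n*(n-1)/2."""
    return len(list(combinations(range(n), 2)))


def hub_axioms(n: int) -> int:
    """Axioms needed when the other n - 1 ontologies each map to a single hub."""
    return n - 1


print(pairwise_axioms(10))  # 45
print(hub_axioms(10))       # 9

# Toy instance data (made up): each ontology sees only its own Person instances.
instances = {"foaf": {"alice"}, "onto_a": {"bob"}, "onto_b": {"carol", "dave"}}


def query_via_hub(instances, mapped):
    """Union the instances of every ontology whose Person concept maps to the hub."""
    result = set()
    for name in mapped:
        result |= instances[name]
    return result


everyone = query_via_hub(instances, ["foaf", "onto_a", "onto_b"])
print(sorted(everyone))  # ['alice', 'bob', 'carol', 'dave']
```

The second half is the network effect in miniature: adding one more entry to the mapped list makes that ontology's instances visible to everyone who queries through the hub.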
We now pause for a word from Steve Lawrence, NEC Research Institute, to lament the lack of free online proceedings for AAAI: Articles freely available online are more highly cited. For greater impact and faster scientific progress, authors and publishers should aim to make research easy to access.
OK, now back to the great paper...
Along the way, they give a definition of a knowledge function, K, that is remarkably similar to log:semantics from N3. They also define a commitment function that is basically the ontological closure pattern.
The approach to querying all this data is something they call DLDB, which comes from a paper they submitted to the ISWC Practical and Scalable Semantic Systems workshop. Darn! no full text proceedings online again. Ah... Jeff's pubs include a tech report version. To paraphrase: there's a table for each class and a table for each property that relates rows from the class tables. They use a DL reasoner to find subclass relationships, and they make views out of them. I have never seen this approach to RdfAndSql before; it sure looks promising. I wonder if we can integrate it into our dbview work somehow and perhaps into our truth-maintenance system in the TAMI project.
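Here is a rough sketch of that table-per-class, view-per-class idea, using Python's sqlite3 module. It is only my paraphrase turned into code, not DLDB's actual schema: the class names (Person, Student), the property table (knows), and the hard-coded subclass dictionary standing in for the DL reasoner's output are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One table per class (hypothetical classes).
cur.execute("CREATE TABLE Person (id TEXT PRIMARY KEY)")
cur.execute("CREATE TABLE Student (id TEXT PRIMARY KEY)")
# One table per property, relating rows from the class tables.
cur.execute("CREATE TABLE knows (subject TEXT, object TEXT)")

cur.executemany("INSERT INTO Person VALUES (?)", [("alice",), ("bob",)])
cur.executemany("INSERT INTO Student VALUES (?)", [("carol",)])
cur.execute("INSERT INTO knows VALUES (?, ?)", ("alice", "carol"))

# Stand-in for what a DL reasoner would infer: Student is a subclass of Person.
subclasses = {"Person": ["Student"]}


def create_class_view(cur, cls, subclasses):
    """Create a view that unions a class table with its direct subclass tables."""
    parts = [f"SELECT id FROM {cls}"]
    parts += [f"SELECT id FROM {sub}" for sub in subclasses.get(cls, [])]
    cur.execute(f"CREATE VIEW {cls}_view AS " + " UNION ".join(parts))


create_class_view(cur, "Person", subclasses)

# Querying the view returns instances asserted as Person or as any subclass.
people = sorted(row[0] for row in cur.execute("SELECT id FROM Person_view"))
print(people)  # ['alice', 'bob', 'carol']
```

This sketch only follows direct subclasses; a fuller version would have to use the transitive closure of the subclass relation that the reasoner computes.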
This wasn't the only work at AAAI on scalable, practical knowledge representation. I caught just a glance at some other papers at the conference that exploit wikipedia as a dataset in various algorithms. I hope to study those more.
I also ran into Ben Kuipers, whose Algernon and Access-Limited Logic has long appealed to me as an approach to reasoning that might work well when scaled up to Semantic Web data sets. That work is mostly on hold; we started talking about getting it going again, but didn't get very far into the conversation. I hope to pick that up again soon.
I gather the 1.0 release of OpenCyc happened at the conference; there's a lot of great stuff in Cyc, but only time will tell how well it will integrate with other Semantic Web stuff.
Meanwhile, a handy citation for Heflin's paper...
- An Investigation into the Feasibility of the Semantic Web. In Proc. of the Twenty First National Conference on Artificial Intelligence (AAAI 2006), Boston, USA, 2006 (abstract)
That's marked up using an XHTML/LaTeX/BibTeX idiom that I'm working on so that we get BibTeX for free:
@inproceedings{pan06a,
title = "{An Investigation into the Feasibility of the Semantic Web}",
author = {Z. Pan and A. Qasem and J. Heflin},
booktitle = {Proc. of the Twenty First National Conference on Artificial Intelligence (AAAI 2006)},
year = {2006},
address = {Boston, USA},
}
on Wikimania 2006, from a few hundred miles away
Wikimania 2006 was last week in Boston; I had it on my travel schedule, tentatively, months in advance, but I didn't really come up with a solid justification, and there were conflicts, so I ended up not going.
I was very interested to see the online participation options, but I didn't get my hopes up too high, because I know that ConnectingAudiences is challenging.
I tried to participate in the transcription stuff real-time; installation of the Gobby collaborative editor went smoothly enough (it looks like an interesting alternative to SubEthaEdit, though it's client/server, not peer-to-peer; they're talking about switching to the Jabber protocol...) but I couldn't seem to connect to any sessions while people were active in them.
The real-time video feed of mako on a definition of Freedom was surprisingly good, though I couldn't give it my full attention during the work day. I didn't understand the problem he was speaking to (isn't GFDL good enough?) until I listened to Lessig on Free Culture and realized that CC share-alike and GFDL don't interoperate. (Yet another reason to keep the test of independent invention in mind at all times.)
Lessig read this quote, but only referred to the author using a photo that I couldn't see via the audio feed; when I looked it up, I realized there was a gap in this student's free culture education:
If we don't want to live in a jungle, we must change our attitudes. We must start sending the message that a good citizen is one who cooperates when appropriate, not one who is successful at taking from others.
RMS, 1992
These sessions on the wikipedia process look particularly interesting; I hope to find time to see or listen to a recording:
- The Process of Requests for Adminship on the English Wikipedia: The Role of Trust in an Open System
- A Question-and-answer session with the English Wikipedia Arbitration Committee
I bumped into TimBL online and reminded him about the Wikipedia and the Semantic Web panel; he had turned it down because of other travel obligations, but he just managed to stop by after all. I hope it went all right; he was pretty jet-lagged.
I see WikiSym 2006 coming up August 21-23, 2006 in Odense, Denmark. I'm not sure I can find justification to make travel plans on just a few weeks of notice. But Denny's hottest conference ever item burns like salt in an open wound and motivates me to give it a try. It looks like the SweetWiki folks, who participate in the GRDDL WG, will be there; that's the start of a justification...

