Modelling HTTP cache configuration in the Semantic Web

Submitted by connolly on Fri, 2006-12-22 19:10. ::

The W3C Semantic Web Interest Group is considering URI best practices, whether to use LSIDs or HTTP URIs, etc. I ran into some of them at MIT last week. At first it sounded like they wanted some solution so general it would solve the only two hard things in Computer Science: cache invalidation and naming things, as Phil Karlton would say. But then we started talking about a pretty interesting approach: using the semantic web to model cache configuration. It has long been a thorn in my side that there is no standard/portable equivalent of .htaccess files, no RDF schema for HTTP and MIME, etc.

At WWW9 in May 2000, I gave a talk on formalizing HTTP caching. Where I used Larch there, I'd use RDF, OWL, and N3 rules today. I made some progress in that direction in August 2000: An RDF Model for GET/PUT and Document Management.
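
Just to make that concrete, here is a minimal sketch of the kind of thing I have in mind, in N3; the cfg: and h: vocabularies here are made-up placeholders, not any published schema:

    @prefix cfg: <http://example.org/vocab/cacheConfig#> .  # hypothetical cache-config vocabulary
    @prefix h:   <http://example.org/vocab/http#> .         # hypothetical HTTP vocabulary

    # A per-directory default, roughly what .htaccess does for Apache:
    <pics/> cfg:defaultMaxAge 86400 .   # one day, in seconds

    # A rule: a response that carries an Expires header is fresh until that time.
    { ?resp h:expires ?t } => { ?resp cfg:freshUntil ?t } .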

Web Architecture: Protocols for State Distribution is a draft I worked on around 1996 to 1999 without ever really finishing it.

I can't find Norm Walsh's item on wwwoffle config, but I did find his XML 2003 paper Caching in with Resolvers:

This paper discusses entity resolvers, caches, and other strategies for dealing with access to sporadically available resources. Our principle focus is on XML Catalogs and local proxy caches. We’ll also consider in passing the ongoing debate of names and addresses, most often arising in the context of URNs vs. URLs.

In Nov 2003 I worked on Web Architecture Illustrated with RDF diagramming tools.

The tabulator, as it's doing HTTP, propagates stuff like content type, last modified, etc. from JavaScript into its RDF store. Meanwhile, the accessibility evaluation and repair folks just released HTTP Vocabulary in RDF. I haven't managed to compare the tabulator's vocabulary with that one yet. I hope somebody does soon.

And while we're doing this little survey, check out the URI Templates stuff by Joe Gregorio and company. I haven't taken a very close look yet, but I suspect it'll be useful for various problems, if not this one in particular.

Is it now illegal to link to copyrighted material in Australia? NO

Submitted by Danny Weitzner on Wed, 2006-12-20 12:26. ::

The original appearance of this entry was in Danny Weitzner - Open Internet Policy

There’s been a lot of coverage (Sydney Morning Herald, Copyright ruling puts hyperlinking on notice, 19 December 2006) about a recent copyright case from the Australian Federal Court. This is an important case, but from my reading of the decision itself, it’s a mistake to see it as a general rule against linking to copyrighted material, as some of the press coverage suggests. Of course, it would cripple the Web if it became illegal merely to link to copyrighted material. As virtually all Web pages are copyrighted by someone, a rule that any link is an invitation to engage in copyright violation would mean one could only link to pages with permission. That would, indeed, break the Web.

But that is not what this case seems to say. From an admittedly cursory reading of the opinion, the Australian court seems to have tied its decision to the fact that:

“…it was the deliberate choice of Mr Cooper to establish and maintain his website in a form which did not give him the power immediately to prevent, or immediately to restrict, internet users from using links on his website to access remote websites for the purpose of copying sound recordings in which copyright subsisted.” (41)*

and the court went on to accept the trial court’s finding that:

“… Mr Cooper [the defendant and operator of mp3s4free.net site] benefited financially from sponsorship and advertisements on the website; that is, that the relationship between Mr Cooper and the users of his website had a commercial aspect. Mr Cooper’s benefits from advertising and sponsorship may be assumed to have been related to the actual or expected exposure of the website to internet users. As a consequence Mr Cooper had a commercial interest in attracting users to his website for the purpose of copying digital music files.” (48)

To boil it down: though Cooper didn’t actually have the power to stop people from illegally copying the MP3 files to which he provided links, his intent was that people engage in copying he knew to be illegal, and he actually benefited from that behavior.

The court also addressed the defendant’s argument that a ruling against him could also outlaw search engines in Australia. The court said: “Google is a general purpose search engine rather than a website designed to facilitate the downloading of music files.”

Copyright law has developed elaborate doctrine in order to try to determine when to punish those who have some role in enabling infringement as opposed to those who are the actual infringers. I’m not sure that that balance is always right, but this case, like the US Supreme Court case MGM v. Grokster, is an effort to find a way to indicate when linking to copyrighted material goes beyond building the Web and violates the law. I’m not always happy about where that line is drawn, but it’s a lot more subtle than the simple technical question of whether a link is provided or not.

* Note that the Australian courts have adopted the enlightened practice of using paragraph numbers to refer inside an opinion, rather than relying on page numbers, which work poorly with digital copies (such as web pages that lack pagination) and which give certain legal publishers undue control over search/retrieval services for legal documents.

Eben Moglen on Free Software and Social Justice

The original appearance of this entry was in Danny Weitzner - Open Internet Policy

Leader of the free software movement, and a colleague and friend of mine, Eben Moglen gave an important talk on Free Software and Social Justice as the Plone Conference keynote earlier this year. Eben gives a most compelling account of the connection between open access to software and general social welfare. I encourage you to read the whole speech, but consider how Eben characterizes the importance of software in our society by analogy to previous large-scale intellectual, technological and economic developments:

The twenty-first century economy is undergirded by software. Which is as crucial as the underlying element in economic development in the twenty-first century as the production of steel ingots was in the twentieth. We have moved to a societal structure in this country, are moving elsewhere in the developed world, will continue to move throughout the developing economies, towards economies whose primary underlying commodity of production is software. And the good news is that nobody owns it.
The reason that this is good news requires us to go back to a moment in the past in the development of the economies of the West, before steel. What was, after all, characteristic of the economy before steel was the slow persistent motivated expansion of European societies and European economies out into the larger world for both much evil and much good built around the possession of a certain number of basic technological improvements, mostly around naval transportation and armament. All of which was undergirded by a control of mathematics superior to the control of mathematics available in other cultures around the world. There are lots of ways we could conceive the great European expansion which redescribed human beings’ relationship to the globe. But one way to put it is they had the best math. And nobody owned that either.
Imagine if you will for a moment a society in which mathematics has become property, and it’s owned by people. Now every time you want to do anything useful: build a house, make a boat, start a bridge, devise a market, move objects weighing certain numbers of kilos from one place to another your first stop is at the mathematics store to buy enough mathematics to complete the task which lies before you. You can only use as much arithmetic at a time as you can afford, and it is difficult to build a sufficient inventory of mathematics, given its price, to have any extra on hand. You can predict, of course, that the mathematics sellers will get rich. And you can predict that every other activity in society, whether undertaken for economic benefit or for the common good, will pay taxes in the form of mathematics payments.

From there he goes on to show how the sharing of software will contribute to greater equality, prosperity and justice. I’ve sometimes wondered why so many smart, dedicated, insightful people are so passionate about free software. Eben explains why. Read it!

Celebrating OWL interoperability and spec quality

Submitted by connolly on Sat, 2006-11-11 00:29. ::

In a Standards and Pseudo Standards item in July, Holger Knublauch gripes that SQL interoperability is still tricky after all these years, and UML is still shaking out bugs, while RDF and OWL are really solid. I hope GRDDL and SPARQL will get there soon too.

At the OWL: Experiences and Directions workshop in Athens today, as the community gathered to talk about problems they see with OWL and what they'd like to add to OWL, I felt compelled to point out (using a few slides) that:

  • XML interoperability is quite good and tools are pretty much ubiquitous, but don't forget the XML Core working group has fixed over 100 errata in the specifications since they were originally adopted in 1998.
  • HTML interoperability is a black art; the specification is only a small part of what you need to know to build interoperable tools.
  • XML Schema interoperability is improving, but interoperability problem reports are still fairly common, and it's not always clear from the spec which tool is right when they disagree.

And while the OWL errata do include a repeated sentence and a missing word, there have been no substantive problems reported in the normative specifications.

How did we do that? The OWL deliverables include:

[Image: OWL test results screenshot]

Jeremy and Jos did great work on the tests. And Sandro's approach to getting test results back from the tool developers was particularly inspired. He asked them to publish their test results as RDF data in the web. Then he provided immediate feedback in the form of an aggregate report that updated live. After our table of test results had columns from one or two tools, several other developers came out of the woodwork and said "here are my results too." Before long we had results from a dozen or so tools and our implementation report was compelling.

The GRDDL tests are coming along nicely; Chime's message on implementation and testing shows that the spec is quite straightforward to implement, and he updated the test harness so that we should be able to support Evaluation and Report Language (EARL) soon.
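
For a flavor of what machine-readable results look like, here is a rough sketch of a single test result in EARL-ish N3 (the tool and test URIs are made up for illustration, and I haven't double-checked the property names against the current EARL draft):

    @prefix earl: <http://www.w3.org/ns/earl#> .
    @prefix : <http://example.org/my-grddl-results#> .       # made-up results document

    :assertion1 a earl:Assertion ;
        earl:assertedBy :myGrddlImplementation ;              # hypothetical tool
        earl:subject :myGrddlImplementation ;
        earl:test <http://example.org/grddl-tests#case1> ;    # illustrative test URI
        earl:result [ a earl:TestResult ; earl:outcome earl:passed ] .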

SPARQL looks a bit more challenging, but I hope we start to get some solid reports from developers about the SPARQL test collection soon too.

tags: QA, GRDDL, SPARQL, OWL, RDF, Semantic Web

Blogging is great

Submitted by timbl on Fri, 2006-11-03 10:11. ::

People have, since it started, complained about the fact that there is junk on the web. And as a universal medium, of course, it is important that the web itself doesn't try to decide what is publishable. The way quality works on the web is through links.

It works because reputable writers make links to things they consider reputable sources. So readers, when they find something distasteful or unreliable, don't just hit the back button once, they hit it twice. They remember not to follow links again through the page which took them there. One's chosen starting page, and a nurtured set of bookmarks, are the entrance points, then, to a selected subweb of information which one is generally inclined to trust and find valuable.

A great example of course is the blogging world. Blogs provide a gently evolving network of pointers of interest. As do FOAF files. I've always thought that FOAF could be extended to provide a trust infrastructure for (e.g.) spam filtering and OpenID-style single sign-on, and it's good to see things happening in that space.

In a recent interview with the Guardian, alas, my attempt to explain this was turned upside down into a "blogging is one of the biggest perils" message. Sigh. I think they took their lead from an unfortunate BBC article, which for some reason stressed concerns about the web rather than excitement, failure modes rather than opportunities. (This happens because, when you launch a Web Science Research Initiative, people ask what the opportunities are and what the dangers are for the future. And some editors are tempted to just edit out the opportunities and headline the fears to get the eyeballs, which is old and boring newspaper practice. We expect better from the Guardian and BBC, generally very reputable sources.)

In fact, it is a really positive time for the web. Startups are launching, and being sold [Disclaimer: people I know], again; academics are excited about new systems and ideas; conferences and camps and wikis and chat channels are hopping with energy; and every morning demands an excruciating choice of which exciting link to follow first.

And, fortunately, we have blogs. We can publish what we actually think, even when misreported.

Reinventing HTML

Submitted by timbl on Fri, 2006-10-27 16:14. ::

Making standards is hard work. It's hard because it involves listening to other people and figuring out what they mean, which means figuring out where they are coming from, how they are using words, and so on.

There is the age-old tradeoff for any group as to whether to zoom along happily, in relative isolation, putting off the day when they ask for reviews, or whether to get lots of people involved early on, so a wider community gets on board earlier, with all the time that costs. That's a trade-off which won't go away.

The solutions tend to be different for each case, each working group. Some have lots of reviewers and some few, some have lots of time, some urgent deadlines.

A particular case is HTML. HTML has the potential interest of millions of people: anyone who has designed a web page may have useful views on new HTML features. It is the earliest spec of W3C, a battleground of the browser wars, and now the most widespread spec.

The perceived accountability of the HTML group has been an issue. Sometimes this was a departure from the W3C process, sometimes a sticking to it in principle, but not actually providing assurances to commenters. An issue was the formation of the breakaway WHAT WG, which attracted reviewers though it did not have a process or specific accountability measures itself.

There has been discussion in blogs where Daniel Glazman, Björn Höhrmann, Molly Holzschlag, Eric Meyer, Jeffrey Zeldman and others have shared concerns about W3C's work, particularly in the HTML area. The validator and other subjects cropped up too, but let's focus on HTML now. We had a W3C retreat in which we discussed what to do about these things.

Some things are very clear. It is really important to have real developers on the ground involved with the development of HTML. It is also really important to have browser makers intimately involved and committed. And also all the other stakeholders, including users and user companies and makers of related products.

Some things are clearer with the hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces, all at once didn't work. The large HTML-generating public did not move, largely because the browsers didn't complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all. It is important to maintain HTML incrementally, as well as continuing a transition to a well-formed world, and developing more power in that world.

The plan is to charter a completely new HTML group. Unlike the previous one, this one will be chartered to do incremental improvements to HTML, and in parallel to XHTML. It will have a different chair and staff contact. It will work on HTML and XHTML together. We have strong support for this group, from many people we have talked to, including browser makers.

There will also be work on forms. This is a complex area, as existing HTML forms and XForms are both form languages. HTML forms are ubiquitously deployed, and there are many implementations and users of XForms. Meanwhile, the Webforms submission has suggested sensible extensions to HTML forms. The plan is, informed by Webforms, to extend HTML forms. At the same time, there is a work item to look at how HTML forms (existing and extended) can be thought of as XForms equivalents, to allow an easy escalation path. A goal would be to have an HTML forms language which is a superset of the existing HTML language, and a subset of an XForms language with added HTML compatibility. We will see to what extent this is possible. There will be a new Forms group, and a common task force between it and the HTML group.

There is also a plan for a separate group to work on the XHTML2 work which the old "HTML working group" was working on. There will be no dependency of HTML work on the XHTML2 work.

As well as the new HTML work, there are other things I want to change. The validator, I think, is a really valuable tool both for users and in helping standards deployment. I'd like it to check (even) more stuff, be (even) more helpful, and prioritize carefully its errors, warnings and mild chidings. I'd like it to link to explanations of why things should be a certain way. We have, by the way, just ordered some new server hardware, paid for by the Supporters program -- thank you!

This is going to be hard work. I'd like everyone to go into this realizing this. I'll be asking these groups to be very accountable, to have powerful issue tracking systems on the w3.org web site, and to be responsive in spirit as well as in letter to public comments. As always, we will be insisting on working implementations and test suites. Now we are going to be asking for things like talking with validator developers, maybe providing validator modules and validator test suites. (That's like a language test suite but backwards, in a way.) I'm going to ask commenters to be respectful of the groups, as always: try to check whether the comment has been made before, suggest alternative text, one item per message, etc., and add social awareness to technical perception.

This is going to be a very major collaboration on a very important spec, one of the crown jewels of web technology. Even though hundreds of people will be involved, we are evolving the technology which millions going on billions will use in the future. There won't seem like enough thankyous to go around some days. But we will be maintaining something very important and creating something even better.

Tim BL

p.s. comments are disabled here in breadcrumbs, the DIG research blog, but they are welcome in the W3C QA weblog.

Now is a good time to try the tabulator

Submitted by connolly on Thu, 2006-10-26 11:40. ::

Tim presented the tabulator to the W3C team today; see slides: Tabulator: AJAX Generic RDF browser.

The tabulator was sorta all over the floor when I tried to present it in Austin in September, but David Sheets put it back together in the last couple weeks. Yay David!

In particular, the support for viewing the HTTP data that you pick up by tabulating is working better than ever before. The HTTP vocabulary has URIs like http://dig.csail.mit.edu/2005/ajar/ajaw/httph#content-type. That seems like an interesting contribution to the WAI ER work on HTTP Vocabulary in RDF.
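
Roughly speaking, the triples look something like this (the property names other than content-type are my guesses at what the tabulator records, so treat it as a sketch):

    @prefix httph: <http://dig.csail.mit.edu/2005/ajar/ajaw/httph#> .

    <http://example.org/some/page>                            # a document you tabulated
        httph:content-type "text/html; charset=utf-8" ;
        httph:last-modified "Tue, 24 Oct 2006 16:20:00 GMT" .  # guessed property name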

Note comments are disabled here in breadcrumbs until we figure out OpenID comment policies and drupal, etc. The tabulator issue tracker is probably a better place to report problems anyway. We don't have OpenID working there yet either, unfortunately, but we do support email callback based account setup.

Talking with U.T. Austin students about the Microformats, Drug Discovery, the Tabulator, and the Semantic Web

Submitted by connolly on Sat, 2006-09-16 21:36. ::

Working with the MIT tabulator students has been such a blast that while I was at U.T. Austin for the research library symposium, I thought I would try to recruit some undergrads there to get into it. Bob Boyer invited me to speak to his PHL313K class on why the heck they should learn logic, and Alan Cline invited me to the Dean's Scholars lunch, which I used to attend when I was at U.T.

To motivate logic in the PHL313K class, I started with their experience with HTML and blogging and explained how the Semantic Web extends the web by looking at links as logical propositions. I used my XML 2005 slides to talk a little bit about web history and web architecture, and then I moved into using hCalendar (and GRDDL, though I left that largely implicit) to address the personal information disaster. This was the first week or so of class and they had just started learning propositional logic, and hadn't even gotten as far as predicate calculus where atomic formulas like those in RDF show up. And none of them had heard of microformats. I promised not to talk for the full hour but then lost track of time and didn't get to the punch line, "so the computer tells you that no, you can't go to both the conference and Mom's birthday party because you can't be in two places at once" until it was time for them to head off to their next class.
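
For the curious, the gist of the hCalendar/GRDDL step is that event markup in an ordinary web page comes out the other end as RDF statements, roughly like these (I'm writing the calendar vocabulary from memory, so take the property names as a sketch):

    @prefix ical: <http://www.w3.org/2002/12/cal/ical#> .  # RDF Calendar vocabulary, from memory

    [] a ical:Vevent ;
        ical:summary "the conference" ;
        ical:dtstart "2006-09-13" .     # simplified; real data carries datatypes

    [] a ical:Vevent ;
        ical:summary "Mom's birthday party" ;
        ical:dtstart "2006-09-13" .
    # Two events on the same date: exactly the "can't be in two places at once"
    # conflict a rule engine can detect.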

One student did stay after to pose a question that is very interesting and important, if only tangentially related to the Semantic Web: with technology advancing so fast, how do you maintain balance in life?

While Boyer said that talk went well, I think I didn't do a very good job of connecting with them; or maybe they just weren't really awake; it was an 8am class after all. At the Dean's Scholars lunch, on the other hand, the students were talking to each other so loudly as they grabbed their sandwiches that Cline had to really work to get the floor to introduce me as a "local boy done good." They responded with a rousing ovation.

Elaine Rich had provided the vital clue for connecting with this audience earlier in the week. She does AI research and had seen TimBL's AAAI talk. While she didn't exactly give the talk four stars overall, she did get enough out of it to realize it would make an interesting application to add to a book that she's writing, where she's trying to give practical examples that motivate automata theory. So after I took a look at what she had written about URIs and RDF and OWL and such, she reminded me that not all the Dean's Scholars are studying computer science; but many of them do biology, and I might do well to present the Semantic Web more from the perspective of that user community.

So I used TimBL's Bio-IT slides. They weren't shy when I went too fast with terms like hypertext, and there were a lot of furrowed brows for a while. But when I got to the drug discovery diagram ("FOAF, OMM, UMLS, SNP, Uniprot, BioPax, Patents all have some overlap with drug target ontology"), I said I didn't even know some of these words and asked them which ones they knew. After a chuckle about "drug", one of them explained about SNP, i.e. single nucleotide polymorphism, and another told me about OMM and the discussion really got going. I didn't make much more use of Tim's slides. One great question about integrating data about one place from lots of sources prompted me to tempt the demo gods and try the tabulator. The demo gods were not entirely kind; perhaps I should have used the released version rather than the development version. But I think I did give them a feel for it. In answer to "so what is it you're trying to do, exactly?" I gave a two-part answer:

  1. Recruit some of them to work on the tabulator so that their name might be on the next paper like the SWUI06 paper, Tabulator: Exploring and Analyzing linked data on the Semantic Web.
  2. Integrate data across applications and across administrative boundaries all over the world, like the Web has done for documents.

We touched on the question of local and global consistency, and someone asked if you can reason about disagreement. I said that yes, I had presented a paper in Edinburgh just this May that demonstrated formally a disagreement between several parties.

One of the last questions was "So what is computer science research anyway?" which I answered by appeal to the DIG mission statement:

The Decentralized Information Group explores technical, institutional and public policy questions necessary to advance the development of global, decentralized information environments.

And I said how cool it is to have somebody in the TAMI project with real-world experience with the Privacy Act. One student followed up and asked if we have anybody with real legal background in the group, and I pointed him to Danny. He asked me afterward how to get involved, and it turned out that IRC and freenode are known to him, so the #swig channel was in our common neighborhood in cyberspace, even though geography would separate us as I headed to the airport to fly home.


Blogged with Flock

ACL 2 seminar at U.T. Austin: Toward proof exchange in the Semantic Web

Submitted by connolly on Sat, 2006-09-16 21:15. ::

 

In our PAW and TAMI projects, we're making a lot of progress on the practical aspects of proof exchange: in PAW we're working out the nitty gritty details of making an HTTP client (proxy) and server that exchange proofs, and in TAMI, we're working on user interfaces for audit trails and justifications and on integration with a truth maintenance system.

It doesn't concern me too much that cwm does some crazy stuff when finding proofs; it's the proof checker that I expect to deploy as part of trusted computing bases and the proof language specification that I hope will complete the Semantic Web standards stack.

But N3 proof exchange is no longer a completely hypothetical problem; the first examples of interoperating with InferenceWeb (via a mapping to PML) and with Euler are working. So it's time to take a close look at the proof representation and the proof theory in more detail.

My trip to Austin for a research library symposium at the University of Texas gave me a chance to re-connect with Bob Boyer. A while back, I told him about RDF and asked him about Semantic Web logic issues and he showed me the proof checking part of McCune's Robbins Algebras Are Boolean result:

Proofs found by programs are always questionable. Our approach to this problem is to have the theorem prover construct a detailed proof object and have a very simple program (written in a high-level language) check that the proof object is correct. The proof checking program is simple enough that it can be scrutinized by humans, and formal verification is probably feasible.

In my Jan 2000 notes, that excerpt is followed by...

I offer a 500 brownie-point bounty to anybody who converts it to Java and converts the ()'s in the input format to <>'s.

5 points for perl. ;-)

Bob got me invited to the ACL2 seminar this week; in my presentation, Toward proof exchange in the Semantic Web, I reviewed a bit of Web Architecture and the standardization status of RDF, RDFS, OWL, and SPARQL as background to demonstrating that we're close to collecting that bounty. (Little did I know in 2000 that TimBL would pick up python so that I could avoid Java as well as perl ;-)

Matt Kaufmann and company gave all sorts of great feedback on my presentation. I had to go back to the Semantic Web Wave diagram a few times to clarify the boundary between research and standardization:

  • RDF is fully standardized/ratified
  • turtle has the same expressive capability as RDF's XML syntax, but isn't fully ratified, and
  • N3 goes beyond the standards in both syntax and expressiveness

One of the people there who knew about RDF and OWL and such really encouraged me to get N3/turtle done, since every time he does any Semantic Web advocacy, the RDF/XML syntax is a deal-killer. I tried to show them my work on a turtle bnf, but what I was looking for was in June mailing list discussion, not in my February bnf2turtle breadcrumbs item.
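
To illustrate the layering with a toy example (not something from the talk): the data statement below is plain turtle, i.e. just the standard RDF data model in a friendlier syntax, while the rule after it is where N3 goes beyond the standards.

    @prefix : <http://example.org/family#> .

    # Plain data: legal turtle (and legal N3).
    :alice :parentOf :bob .

    # A rule: N3 only; there is no way to say this in turtle or RDF/XML.
    { ?x :parentOf ?y } => { ?y :childOf ?x } .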

They asked what happens if an identifier is used before it appears in an @forAll directive and I had to admit that I could test what the software does if they wanted to, but I couldn't be sure whether that was by design or not; exactly how quantification and {}s interact in N3 is sort of an open issue, or at least something I'm not quite sure about.
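
The kind of document they were asking about looks roughly like this (a made-up example; as I say, I'm not sure what the right answer is):

    @prefix : <http://example.org/test#> .

    :x :loves :y .            # :x and :y used here as ordinary names...
    @forAll :x , :y .         # ...and only then declared universally quantified.
    { :x :loves :y } => { :y :lovedBy :x } .
    # Does the first statement now mean "everything loves everything",
    # or does it still refer to the particular terms :x and :y?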

Moore noticed that our conjunction introduction (CI) step doesn't result in a formula whose main connective is conjunction; the conjunction gets pushed inside the quantifiers. It's not wrong, but it's not traditional CI either.
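
Schematically, the observation is this (my paraphrase, not an actual proof trace):

    Premises:      (forall x) P(x)          (forall y) Q(y)
    Textbook CI:   ( (forall x) P(x) )  AND  ( (forall y) Q(y) )
    Our CI step:   (forall x, y) ( P(x) AND Q(y) )

The two forms are logically equivalent here, since the quantified variables are distinct, which is why the step is sound; it just isn't the textbook shape.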

I asked about ACL2's proof format, and they said what goes in an ACL2 "book" is not so much a proof as a sequence of lemmas and such, but Jared was working on Milawa, a simple proof checker that can be extended with new proof techniques.

I started talking a little after 4pm; different people left at different times, but it wasn't until about 8 that Matt realized he was late for a squash game and headed out.

[Image: MLK and the UT Tower]
I went back to visit them in the U.T. tower the next day to follow up on ACL2/N3 connections and Milawa. Matt suggested a translation of N3 quantifiers and {}s into ACL2 that doesn't involve quotation. He offered to guide me as I fleshed it out, but I only got as far as installing lisp and ACL2; I was too tired to get into a coding fugue.

Jared not only gave me some essential installation clues, but for every technical topic I brought up, he printed out two papers showing different approaches. I sure hope I can find time to follow up on at least some of this stuff.


Blogged with Flock

On the Future of Research Libraries at U.T. Austin

Submitted by connolly on Sat, 2006-09-16 17:14. ::

Wow. What a week!

I'm always on the lookout for opportunities to get back to Austin, so I was happy to accept an invitation to this 11-12 September symposium, The Research Library in the 21st Century, run by University of Texas Libraries:
[Image: San Jacinto Residence Hall]

In today's rapidly changing digital landscape, we are giving serious thought to shaping a strategy for the future of our libraries. Consequently, we are inviting the best minds in the field and representatives from leading institutions to explore the future of the research library and new developments in scholarly communication. While our primary purpose is to inform a strategy for our libraries and collections, we feel that all participants and their institutions will benefit.

I spent the first day getting a feel for this community, where evidently a talk by Clifford Lynch of CNI is a staple. "There is no scholarship without scholarly communication," he said, quoting Courant. He noted that traditionally, publishers disseminate and libraries preserve, but we're shifting to a world where the library helps disseminate and makes decisions on behalf of the whole world about which works to preserve. He said there's a company (I wish I had made a note of the name) that has worked out the price of an endowed web site; at 4% annual return, they figure it at $2500/gigabyte.

James Duderstadt from the University of Michigan told us that the day when the entire contents of the library fits on an iPod (or "a device the size of a football" for other audiences that didn't know about iPods ;-) is not so far off. He said that the University of Michigan started digitizing their 7.8 million volumes even before becoming a Google Book Search library partner. They initially estimated it would take 10 years, but the current estimate is 6 years and falling. He said that yes, there are copyright issues and other legal challenges, and he wouldn't be surprised to end up in court over it; he had done that before. Even the Sakai project might face litigation. What got the most attention, I think, was when he relayed first-hand experience from the Spellings Commission on the Future of Higher Education; their report is available to those that know where to look, though it is not due for official release until September 26.

He also talked about virtual organizations, i.e. groups of researchers from universities all over, and even the "meta university," with no geographical boundaries at all. That sort of thing fueled my remarks for the Challenges of Access and Preservation panel on the second day. I noted that my job is all about virtual organizations, and if the value of research libraries is connected to recruiting good people, you should keep in mind the fact that "get together and go crazy" events like football games are a big part of building trust and loyalty.

Kevin Guthrie, President of ITHAKA, made a good point that starting new things is usually easier than changing old things, which was exactly what I was thinking when President Powers spoke of "preserving our investment" in libraries in his opening address. U.T. invested $650M in libraries since 1963. That's not counting bricks and mortar; that's special collections, journal subscriptions, etc.

My point that following links is 96% reliable sparked an interesting conversation; it was misunderstood as "96% of web sites are persistent" and then "96% of links persist"; when I clarified that it's 96% of attempts to follow links that succeed, and this is because most attempts to follow links are from one popular resource to another, we had an interesting discussion of ephemera vs. the scholarly record and which parts need what sort of attention and what sort of policies. The main example was that 99% of political websites about the California run-off election went offline right after the election. My main point was: for the scholarly record, HTTP/DNS is as good as it gets for the foreseeable future; don't throw up your hands at the 4% and wait for some new technology; apply your expertise of curation and organizational change to the existing technologies.

In fact, I didn't really get beyond URIs and basic web architecture in my remarks. I had prepared some points about the Semantic Web, but I didn't have time for them in my opening statement and they didn't come up much later in the conversation, except when Ann Wolpert, Director of Libraries at MIT, brought up DSpace a bit.

Betsy Wilson of the University of Washington suggested that collaboration would be the hallmark of the library of the future. I echoed that back in the wrap-up session referring to library science as the "interdisciplinary discipline"; I didn't think I was making that up (and a google search confirms I did not), but it seemed to be new to this audience.

By the end of the event I was pretty much up to speed on the conversation; but on the first day, I felt a little out of place and when I saw the sound engineer getting things ready, I mentioned to him that I had a little experience using and selling that sort of equipment. It turned out that he's George Geranios, sound man for bands like Blue Oyster Cult for about 30 years. We had a great conversation on digital media standards and record companies. I'm glad I sat next to David Seaman of the DLF at lunch; we had a mutual colleague in Michael Sperberg-McQueen. I asked him about IFLA, one of the few acronyms from the conversation that I recognized; he helped me understand that IFLA conferences are relevant, but they're about libraries in general, and the research library community is not the same. And Andrew Dillon got me up to speed on all sorts of things and made the panel I was on fun and pretty relaxed.

Fred Heath made an oblique reference to a New York Times article about moving most of the books out of the U.T. undergraduate library as if everyone knew, but it was news to me. Later in the week I caught up with Ben Kuipers; we didn't have time for my technical agenda of linked data and access limited logic, but we did discover that both of us were a bit concerned with the fragility of civilization as we know it and the value of books over DVDs if there's no reliable electricity.

The speakers' comments at the symposium were recorded; there's some chance that edited transcripts will appear in a special issue of a journal. Stay tuned for that. And stay tuned for more breadcrumbs items on talks I gave later in the week, where I did get beyond the basic HTTP/DNS/URI layer of Semantic Web Architecture.

