authorization

OpenID "Hello World" on apache still deep magic

Submitted by connolly on Thu, 2009-01-08 18:37.

I have a home movie that I want to show to just a few friends around the Web. With OpenID, I should be able to just give my web server a list of my friends' pages, right?

I eventually found a README for mpopenid with just what I wanted:

PythonOption authorized-users "http://alice.com/ http://bob.com/"

But that wasn't on the top page of hits on a search for "apache OpenID". (Like most sites, mine runs on apache.) The top hit is mod_auth_openid, but its FAQ says my use case isn't directly supported:

Is it possible to limit login to some users, like htaccess/htpasswd does?
No. ... If you want to restrict to specific users that span multiple identity providers, then OpenID probably isn't the authentication method you want. Note that you can always do whatever vetting you want using the REMOTE_USER CGI environment variable after a user authenticates.
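
In fairness, that REMOTE_USER vetting needn't be much code. Here's a minimal sketch in Python, assuming mod_auth_openid has already authenticated the user and left their OpenID in REMOTE_USER; the whitelist filename and the page text are made up:

#!/usr/bin/env python
# Sketch: post-authentication vetting of REMOTE_USER in a CGI script.
# mod_auth_openid sets REMOTE_USER to the authenticated OpenID URL;
# the whitelist file here (one OpenID URL per line) is hypothetical.
import os

def allowed(openid, whitelist="/etc/apache2/openid-friends.txt"):
    urls = [line.strip() for line in open(whitelist) if line.strip()]
    return openid in urls

openid = os.environ.get("REMOTE_USER", "")
if allowed(openid):
    print "Status: 200 OK\nContent-Type: text/html\n"
    print "<p>Welcome, %s.</p>" % openid
else:
    print "Status: 403 Forbidden\nContent-Type: text/html\n"
    print "<p>Sorry, this page is for friends only.</p>"

Still, I'd rather just give Apache the list of friends' pages directly.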

So I installed the prerequisites for mpopenid: libapache2-mod-python and python-elementtree were straightforward, but I struggled to find a version of python-openid that matched. I almost gave up at that point, but heartened by somebody else who got mpopenid working, I went back to searching and found a launchpad development version of mpopenid. That seems to work with python-openid-1.1.0.

In /etc/apache2/sites-available/mysite, I have this bit that glues mpopenid's login page into my site:

<Location "/openid-test-aux">
SetHandler mod_python
PythonOption action-path "/openid-test-aux"
PythonHandler mpopenid::openid
</Location>

And in mysite/movies/.htaccess, this bit says only I get to see http://mysite.example/sekret:

<Files "sekret">
PythonAccessHandler mpopenid::protect
PythonOption authorized-users "http://www.w3.org/People/Connolly/"
</Files>

The mpopenid README also shows an option to put the list of pages in a separate file:

PythonOption authorized-users-list-url file:///my/directory/allowed-users.txt

But I haven't tried that yet. So far I'm happy to put the list right in the .htaccess file.

FOAF and OpenID: two great tastes that taste great together

Submitted by connolly on Wed, 2007-10-24 23:00.

As Simon Willison notes, OpenID solves the identity problem, not the trust problem. Meanwhile, FOAF and RDF are potential solutions to lots of problems but not yet actual solutions to very many. I think they go together like peanut butter and chocolate, creating a deliciously practical testbed for our Policy Aware Web research.

Our struggle to build a community is fairly typical:

In Dec 2006, Ryan did a Drupal upgrade that included OpenID support, but that only held the spammers back for a couple weeks. Meanwhile, Six Apart is Opening the Social Graph:


... if you manage a social networking service, we strongly encourage you to embrace OpenID, hCard, XFN, FOAF and the other open standards around data portability.

With that in mind, a suggestion to outsource to a centralized commercial blog spam filtering service seemed like a step in the wrong direction; we are the Decentralized Information Group after all; time to eat our own cooking!

The policy we have working right now is, roughly: you can comment on our blog if you're a friend of a friend of a member of the group.

In more detail, you can comment on our blog if:

  1. You can show ownership of a web page via the OpenID protocol.
  2. That web page is related by the foaf:openid property to a foaf:Person, and
  3. That foaf:Person is
    1. listed as a member of the DIG group in http://dig.csail.mit.edu/data, or
    2. related to a dig member by one or two foaf:knows links.

The implementation has two components so far:

  • an enhancement to drupal's OpenID support to check a whitelist
  • a FOAF crawler that generates a whitelist periodically (sketched below)
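
To make the crawler half concrete, here's a minimal sketch of the idea using rdflib. The function name and the two-hop parameter are mine, and it assumes each person's URI serves their FOAF data directly; real deployments usually need rdfs:seeAlso chasing and more care with blank nodes than this shows:

# Sketch: start from the group members, follow foaf:knows up to two
# hops, and collect every foaf:openid page along the way.
from rdflib import Graph, Namespace

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

def crawl_whitelist(seed_members, max_hops=2):
    g = Graph()
    seen, frontier = set(), set(seed_members)
    for hop in range(max_hops + 1):
        nxt = set()
        for person in frontier - seen:
            seen.add(person)
            try:
                g.parse(person)  # fetch that person's FOAF data
            except Exception:
                continue  # unreachable data just shrinks the whitelist
            nxt.update(g.objects(person, FOAF.knows))
        frontier = nxt
    # the whitelist: the OpenID page of everyone within reach
    return set(str(o) for s in seen for o in g.objects(s, FOAF.openid))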

We're looking into policies such as You can comment if you're in a class taught by a DIG group member, but there are challenges reconciling policies protecting privacy of MIT students with this approach.

We're also interested in federating with other communities. The Advogato community is particularly interesting because

  1. The DIG group is pretty into Open Source, the core value of advogato.
  2. Advogato's trust metric is designed to be robust in the face of spammers and seems to work well in practice.

So I'd like to be able to say You can comment on our blog if you're certified Journeyer or above in the Advogato community. Advogato has been exporting basic foaf:name and foaf:knows data since a Feb 2007 update, but they didn't export the results of the trust metric computation in RDF.

Asking for that data in RDF has been on my todo list for months, but when Sean Palmer found out about this OpenID and FOAF stuff, he sent an enhancement request, and Steven Rainwater joined the #swig channel to let us alpha test it in no time. Sean also did a nice write-up.

This is a perfect example of the sort of integration of statistical methods into the Semantic Web that we have been talking about as far back as our DAML proposal in 2000:

Some of these systems use relatively simple and straightforward manipulation of well-characterized data, such as an access control system. Others, such as search engines, use wildly heuristic manipulations to reach less clearly justified but often extremely useful conclusions. In order to achieve its potential, the Semantic Web must provide a common interchange language bridging these diverse systems. Like HTML, the Semantic Web language should be basic enough that it does not impose an undue burden on the simplest web software systems, but powerful enough to allow more sophisticated components to use it to advantage as well.

Now we just have to enhance our crawler to get that data or otherwise integrate it with the drupal whitelist. (I'm particularly interested in using GRDDL to get FOAF data right from the OpenID page; stay tuned for more on that.) And I guess we need Advogato to provide a user interface for foaf:openid support... or maybe links to supplementary FOAF files via rdfs:seeAlso or owl:sameAs.

New Commenting Policy

Submitted by ryanlee on Tue, 2007-10-02 12:57.

I've added a new commenting policy to combat our OpenID-based spammers. It's a whitelist based on FOAF. There will be more about it in this space as development moves forward, so stay tuned if you'd like to know how to be placed on the whitelist.

As for specifics, the whitelist implementation is a Drupal module that reads a list of OpenIDs from an externally generated set every hour. The user's OpenID is checked against the whitelist at login time, and matches are allowed to proceed with account creation, commenting, etc.
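
The module itself is PHP, but the flow is simple enough to sketch. In Python, with the whitelist URL and all the names made up for illustration:

# Illustration only; the real implementation is a Drupal (PHP) module.
# An hourly refresh of the cached whitelist, consulted at login time.
import time, urllib

WHITELIST_URL = "http://example.org/whitelist.txt"  # hypothetical
_cache = {"when": 0, "ids": set()}

def whitelist():
    if time.time() - _cache["when"] > 3600:  # regenerate hourly
        text = urllib.urlopen(WHITELIST_URL).read()
        _cache["ids"] = set(l.strip() for l in text.splitlines() if l.strip())
        _cache["when"] = time.time()
    return _cache["ids"]

def openid_login_allowed(openid_url):
    # matches may proceed to account creation, commenting, etc.
    return openid_url in whitelist()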

Potential improvements:

  • a UI for settings
  • better database transaction flow, particularly for error handling
  • viewable whitelist

Feel free to contact me if you're interested in the module portion of this equation for use with your own openid.module (again, it does none of the whitelist generation).

A design for web content labels built from GRDDL and rules

Submitted by connolly on Thu, 2007-01-25 13:35.

In #swig discussion, Tim mentioned he did some writing on labels and rules and OWL which prompted me to flesh out some related ideas I had. The result is a Makefile and four tests with example labels. One of them is:

All resources on example.com are accessible for all users and meet WAI AA guidelines except those on visual.example.com which are not suitable for users with impaired vision.

I picked an XML syntax out of the air and wrote visaa.lbl:

<label
  xmlns="http://www.w3.org/2007/01/lbl22/label"
  xmlns:mobilebp="http://www.w3.org/2007/01/lbl22/mobilebp@@#"
  xmlns:wai="http://www.w3.org/2007/01/lbl22/wai@@#"
  >
  <scope>
    <domain>example.com</domain>
    <except>
      <domain>visual.example.com</domain>
    </except>
  </scope>
  <audience>
    <wai:AAuser />
  </audience>
</label>

And then in testdata.ttl we have:

<http://example.com/pg1simple> a webarch:InformationResource.
<http://visual.example.com/pg2needsVision> a webarch:InformationResource.
:charlene a wai:AAuser.

Then we run the test thusly...

$ make visaa_test.ttl
xsltproc --output visaa.rdf label2rdf.xsl visaa.lbl
python ../../../2000/10/swap/cwm.py visaa.rdf lblrules.n3 owlAx.n3 testdata.ttl \
  --think --filter=findlabels.n3 --n3 >visaa_test.ttl

and indeed, it concludes:

    <http://example.com/pg1simple> lt:suitableFor :charlene .

but doesn't conclude that pg2needsVision is OK for charlene.

The .lbl syntax becomes RDF data via GRDDL and label2rdf.xsl. Then owlAx.n3 contains rules derived from the RDFS and OWL specs, i.e. stuff that's already standard. As Tim wrote, "A label is a fairly direct use of OWL restrictions. This is very much the sort of thing OWL is designed for." Only the lblrules.n3 bit goes beyond what's standardized, and it's written in the N3 Rules subset of N3, which, assuming a few built-ins, maps pretty neatly to recent RIF designs.

A recent item from Bijan notes a SPARQL-rules design by Axel; I wonder if these rules fit in that design too. I hope to take a look soonish.

ACL2 seminar at U.T. Austin: Toward proof exchange in the Semantic Web

Submitted by connolly on Sat, 2006-09-16 21:15.


In our PAW and TAMI projects, we're making a lot of progress on the practical aspects of proof exchange: in PAW we're working out the nitty gritty details of making an HTTP client (proxy) and server that exchange proofs, and in TAMI, we're working on user interfaces for audit trails and justifications and on integration with a truth maintenance system.

It doesn't concern me too much that cwm does some crazy stuff when finding proofs; it's the proof checker that I expect to deploy as part of trusted computing bases and the proof language specification that I hope will complete the Semantic Web standards stack.

But N3 proof exchange is no longer a completely hypothetical problem; the first examples of interoperating with InferenceWeb (via a mapping to PML) and with Euler are working. So it's time to take a close look at the proof representation and the proof theory in more detail.

My trip to Austin for a research library symposium at the University of Texas gave me a chance to re-connect with Bob Boyer. A while back, I told him about RDF and asked him about Semantic Web logic issues and he showed me the proof checking part of McCune's Robbins Algebras Are Boolean result:

Proofs found by programs are always questionable. Our approach to this problem is to have the theorem prover construct a detailed proof object and have a very simple program (written in a high-level language) check that the proof object is correct. The proof checking program is simple enough that it can be scrutinized by humans, and formal verification is probably feasible.

In my Jan 2000 notes, that excerpt is followed by...

I offer a 500 brownie-point bounty to anybody who converts it to Java and converts the ()'s in the input format to <>'s.

5 points for perl. ;-)

Bob got me invited to the ACL2 seminar this week; in my presentation, Toward proof exchange in the Semantic Web, I reviewed a bit of Web Architecture and the standardization status of RDF, RDFS, OWL, and SPARQL as background to demonstrating that we're close to collecting that bounty. (Little did I know in 2000 that TimBL would pick up python so that I could avoid Java as well as perl ;-)

Matt Kaufmann and company gave all sorts of great feedback on my presentation. I had to go back to the Semantic Web Wave diagram a few times to clarify the boundary between research and standardization:

  • RDF is fully standardized/ratified
  • turtle has the same expressive capability as RDF's XML syntax, but isn't fully ratified, and
  • N3 goes beyond the standards in both syntax and expressiveness

One of the people there who knew about RDF and OWL and such really encouraged me to get N3/turtle done, since every time he does any Semantic Web advocacy, the RDF/XML syntax is a deal-killer. I tried to show them my work on a turtle bnf, but what I was looking for was in June mailing list discussion, not in my February bnf2turtle breadcrumbs item.

They asked what happens if an identifier is used before it appears in an @forAll directive and I had to admit that I could test what the software does if they wanted to, but I couldn't be sure whether that was by design or not; exactly how quantification and {}s interact in N3 is sort of an open issue, or at least something I'm not quite sure about.

Moore noticed that our conjunction introduction (CI) step doesn't result in a formula whose main connective is conjunction; the conjunction gets pushed inside the quantifiers. It's not wrong, but it's not traditional CI either.
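
Roughly, in symbols (my gloss, not Moore's): traditional CI would conclude

  $(\forall x\, P(x)) \land (\forall x\, Q(x))$

with the conjunction on top, whereas our step concludes the logically equivalent

  $\forall x\, (P(x) \land Q(x))$

whose main connective is the quantifier.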

I asked about ACL2's proof format, and they said what goes in an ACL2 "book" is not so much a proof as a sequence of lemmas and such, but Jared was working on Milawa, a simple proof checker that can be extended with new proof techniques.

I started talking a little after 4pm; different people left at different times, but it wasn't until about 8 that Matt realized he was late for a squash game and headed out.

(photo: MLK and the UT Tower)

I went back to visit them in the U.T. tower the next day to follow up on ACL2/N3 connections and Milawa. Matt suggested a translation of N3 quantifiers and {}s into ACL2 that doesn't involve quotation. He offered to guide me as I fleshed it out, but I only got as far as installing lisp and ACL2; I was too tired to get into a coding fugue.

Jared not only gave me some essential installation clues, but for every technical topic I brought up, he printed out two papers showing different approaches. I sure hope I can find time to follow up on at least some of this stuff.


Blogged with Flock

OpenID, verisign, and my life: mediawiki, bugzilla, mailman, roundup, ...

Submitted by connolly on Mon, 2006-07-31 15:45.

Please, don't ask me to manage another password! In fact, how about getting rid of most of the ones I already manage?

I have sent support requests for some of these; the response was understandable, if disappointing: when debian/ubuntu supports it, or at least when the core Mailman/mediawiki guys support it, we'll give it a try. I opened Issue 18: OpenID support in roundup too; there are good OpenID libraries in python, after all.

A nice thing about OpenID is that the service provider doesn't have to manage passwords either. I was thinking about where my OpenID password(s) should live, and I realized the answer is: nowhere. If we put the key fingerprint in the OpenID persona URL, I can build an OpenID server that does public-key challenge-response authentication and doesn't store any passwords at all.
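
Here's a rough sketch of that idea, reusing paramiko as in my sshagent experiments; the URL layout, the fingerprint convention, and the function names are all invented for illustration:

# Sketch: the persona URL ends in the key fingerprint, so the server
# stores no secrets; it just checks a signature over a fresh nonce.
import hashlib, os
from paramiko import RSAKey, Message

def make_challenge():
    return os.urandom(32)  # fresh nonce per login attempt

def verify_persona(persona_url, pubkey_blob, nonce, sigblob):
    # 1. the only "credential" the server knows is in the URL itself
    claimed = persona_url.rstrip("/").rsplit("/", 1)[-1]
    actual = hashlib.md5(pubkey_blob).hexdigest()  # ssh-style fingerprint
    if claimed != actual:
        return False
    # 2. verify the signature over the nonce with the presented key
    key = RSAKey(Message(pubkey_blob))
    return key.verify_ssh_sig(nonce, Message(sigblob))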

As I sat down to tinker with that idea, I remembered the verisign labs openid service and gave it a try. Boy, it's nice! They use the user-chosen photo anti-phishing trick and provide nice audit trails. So it will probably be quite a while before I feel the need to code my own OpenID server.

I'm still hoping for mac keychain support for OpenID. Meanwhile, has anybody seen a nice gnome applet for keeping the state of my ssh-agent credentials and my CSAIL kerberos credentials visible?

On GData, SPARQL update, and RDF Diff/Sync

Submitted by connolly on Tue, 2006-04-25 17:38.

The Google Data APIs Protocol is pretty interesting. It seems to be based on the Atom publishing protocol, which is a pretty straightforward application of HTTP and XML, so that's a good thing.

The query features seem to be less expressive than the SPARQL protocol, but GData has an update feature, while the SPARQL update issue is postponed. Updating at the triple level is tricky. I helped TimBL refine Delta: an ontology for the distribution of differences between RDF graphs a bit, and there's working code in cwm. But I haven't really managed to use it in practical settings. My PDA's calendar has an XMLRPC service where I can update a whole record at a time, just like GData. I assume caldav does likewise.
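
Blank nodes are a big part of why triple-level updating is tricky: two documents can say the same thing with different bnode labels, so a naive diff overstates the change. As a stand-in sketch of the diff step (this uses rdflib's compare module, not the Delta machinery itself):

# Canonicalize blank nodes, then diff at the triple level.
from rdflib import Graph
from rdflib.compare import to_isomorphic, graph_diff

old = Graph().parse("old.ttl", format="turtle")
new = Graph().parse("new.ttl", format="turtle")

in_both, only_old, only_new = graph_diff(to_isomorphic(old),
                                         to_isomorphic(new))
# a patch, in Delta's terms, is roughly: delete only_old, insert only_new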

The GData approach to concurrency looks quite reasonable. I haven't studied the authentication mechanism. I hope to get to that presently.

Access control and version control: an over-constrained problem?

Submitted by connolly on Tue, 2006-04-25 03:47.

For a long time, when it came to sharing code, life was simple: everything went into the w3ccvs repository via ssh; when I commit WWW/People/Connolly/Overview.html, the latest version is propagated out to all the web mirrors automatically and everybody can see the results at http://www.w3.org/People/Connolly/. I can also write via HTTP, using Amaya or any command-line tool with PUT support, e.g. curl. The PUT results in a cvs commit, followed by the same mirroring magic.

We have a forms-based system for setting the access control for each file -- access control to the latest version, that is; access to older versions, to logs and the rest of the usual cvs goodies is limited to the few dozen people with access to the w3ccvs repository.

To overcome that limitation, Daniel Veillard set up dev.w3.org which conforms more to the open source norms, providing anonymous read-only CVS access and web-browseable history. There is no http-based access control there, though CVS commit access is managed with ssh.

The downside of dev.w3.org is that it doesn't support web publishing like w3ccvs does. http://dev.w3.org/cvsweb/2001/palmagent/event-test.html is the cvs history of event-test.html, not the page itself. The address that gives the page itself is http://dev.w3.org/cvsweb/~checkout~/2001/palmagent/event-test.html?rev=HEAD&content-type=text/html;%20charset=iso-8859-1. And every GET involves cvs reading the ,v file. And of course, relative links don't work.

For the SWAP project, we use a horrible kludge of both: we commit to w3ccvs, and every 15 minutes, dev.w3.org is updated by rsync.

Then I have a personal web site, where I run Zope. That gives me thru-the-web editing with revision history/audit/undo, but at the cost of having the whole site in one big file on the server, and no local goodies like cvs diff. Ugh. I'd like to switch to something else, but I haven't found anything else that talks webdav to iCal out-of-the-box. Plus, I can spare precious little time for sysadmin on my personal site, where there's basically just one user. I was really reluctant to use flickr.com URIs for my photos, but their tools are so much nicer that the alternative is that I basically don't publish photos at all. Plus, the social networking benefits of publishing on flickr are considerable. But that's really a separate point (or is it? hmm).

As I wrote in a January item, we're using svn in DIG. DIG is part of CSAIL, which is a long-time kerberos shop. Public key authentication is so much nicer than shared key. I can use cvs or svn over ssh to any number of places (mindswap, microformats, ...) with the same public key. But I need separate credentials (and a separate kinit step) for CSAIL. Ah... but it does propagate svn commit to http space in the right way. I think I'll try it ... see data4.

I'd rather use hg, which is peer-to-peer; it's theoretically possible to use svn as an hg peer, but that's slightly beyond the documented state-of-the-art.

A step forward with python and sshagent, and a walk around gnome security tools

Submitted by connolly on Wed, 2006-03-29 09:34.

At the August PAW meeting, I dropped a pointer in IRC to sshAuth.py, my attempt to use sshagent to make digital signatures. I started on it in 2003/09, and I banged my head against it for quite a while trying to get it to work.

Last night, while noodling on calendar synchronization and delegation, I took another run at the problem; this time, it worked! Thanks to paramiko:


from paramiko import Agent, RSAKey, Message
import Crypto.Util.randpool
import binascii

data = "hoopy" # data to sign
user = "connolly" # salt to taste

# get my public key
authkeys = file("/home/%s/.ssh/authorized_keys" % user)
authkeys.next() # skip 1st one
keyd = authkeys.next()
tn, uu, other = keyd.split()
keyblob = binascii.a2b_base64(uu)
pubkey = RSAKey(Message(keyblob))

pool = Crypto.Util.randpool.RandomPool() # randomness for signing
a = Agent() # talk to the running ssh-agent
agtkey = a.get_keys()[0] # first key the agent holds
sigblob = agtkey.sign_ssh_data(pool, data) # ask the agent to sign

print pubkey.verify_ssh_sig(data, Message(sigblob))

That skip 1st one bit took me a while to figure out. I have 2 keys in my ~/.ssh/authorized_keys file. I wonder if sshAuth.py would work with that fix.
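
One fix would be to match the agent's keys against authorized_keys instead of hard-coding which line to skip. A sketch, assuming paramiko's get_base64() method (untested):

# Pick the agent key that actually appears in authorized_keys,
# rather than assuming it's on any particular line.
from paramiko import Agent

def pick_agent_key(user="connolly"):
    known = set()
    for line in file("/home/%s/.ssh/authorized_keys" % user):
        parts = line.split()
        if len(parts) >= 2:
            known.add(parts[1]) # the base64 key material
    for agtkey in Agent().get_keys():
        if agtkey.get_base64() in known:
            return agtkey
    raise ValueError("no agent key matches authorized_keys")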

I also took a look at the state of the art in password agents and managers for gnome. revelation looks interesting. I'm still hoping for something like OpenID/SXIP integrated with password managers like the OSX keychain.

I took notes in the #swig channel while I was at it. I got a kick out of this exchange:

04:44:59 <Ontogon_> dan, are you talking to yourself?
04:45:32 <dajobe> he's talking to the web

A look at emerging Web security architectures from a Semantic Web perspective

Submitted by connolly on Fri, 2006-03-17 17:51.

W3C had a workshop, Toward a more Secure Web this week. Citigroup hosted; the view from the 50th floor was awesome.

Some notes on the workshop are taking shape:

A look at emerging Web security architectures from a Semantic Web perspective

Comparing OpenID, SXIP/DIX, InfoCard, SAML to RDF, GRDDL, FOAF, P3P, XFN and hCard

At the W3C security workshop this week, I finally got to study SXIP in some detail after hearing about it and wondering how it compares to OpenID, Yadis, and the other "Identity 2.0" techniques brewing. And just in time, with a DIX/SXIP BOF at the Dallas IETF next week.
