IRC log of dig on 2010-10-04

Timestamps are in UTC.

11:30:55 [DIGlogger]
DIGlogger (~dig-logge@groups.csail.mit.edu) has joined #dig
11:30:55 [lindbohm.freenode.net]
topic is: Decentralized Information Group @ MIT http://dig.csail.mit.edu/
11:30:55 [lindbohm.freenode.net]
Users on #dig: DIGlogger RalphS mhausenblas melvster nunnun_away timbl webr3 Yudai presbrey drrho ericP sandro ernestchiang_ gbot73
11:37:32 [mcherian]
mcherian (~mathew@c-24-91-159-17.hsd1.ma.comcast.net) has joined #dig
11:43:55 [melvster]
melvster has quit (Ping timeout: 240 seconds)
12:22:58 [betehess]
betehess (~betehess@betehess.w3.org) has joined #dig
12:25:01 [amy]
amy (~amy@31-34-225.wireless.csail.mit.edu) has joined #dig
12:28:12 [timbl_]
timbl_ (~timbl@pool-96-237-236-230.bstnma.fios.verizon.net) has joined #dig
12:31:52 [timbl]
timbl has quit (Ping timeout: 252 seconds)
12:32:38 [timbl_]
timbl_ has quit (Ping timeout: 245 seconds)
13:03:17 [oshani]
oshani (~oshani@c-71-233-151-72.hsd1.ma.comcast.net) has joined #dig
13:05:32 [nunnun_away]
nunnun_away is now known as nunnun
13:14:03 [melvster]
melvster (~melvster@p579F9B57.dip.t-dialin.net) has joined #dig
13:16:43 [timbl]
timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
13:18:43 [lkagal]
lkagal (~lkagal@30-6-179.wireless.csail.mit.edu) has joined #dig
13:21:00 [lkagal]
lkagal has quit (Client Quit)
13:39:17 [oshani]
oshani has quit (Quit: Mama nidi!)
14:04:40 [marisol]
marisol (~marisol@31-35-238.wireless.csail.mit.edu) has joined #dig
14:48:37 [oshani]
oshani (~oshani@c-71-233-151-72.hsd1.ma.comcast.net) has joined #dig
16:00:12 [mhausenblas]
mhausenblas has quit (Quit: mhausenblas)
16:14:07 [timbl]
timbl has quit (Quit: timbl)
16:24:26 [timbl]
timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
16:36:48 [mcherian]
mcherian has quit (Ping timeout: 245 seconds)
17:23:43 [oshani]
oshani has quit (Quit: oshani)
18:08:34 [mcherian]
mcherian (~mathew@64.134.67.3) has joined #dig
18:11:26 [timbl]
timbl has quit (Quit: timbl)
18:21:10 [lkagal]
lkagal (~lkagal@30-6-179.wireless.csail.mit.edu) has joined #dig
18:22:52 [timbl]
timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
19:11:15 [marisol]
marisol has quit (Quit: marisol)
19:26:22 [timbl]
timbl has quit (Quit: timbl)
19:44:02 [mcherian]
mcherian has quit (Ping timeout: 255 seconds)
20:13:05 [timbl]
timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
20:16:39 [presbrey]
timbl, have you loaded any huge documents in tabulator?
20:17:16 [presbrey]
I'm trying to load ~300k triples I converted from medicare.gov
20:17:32 [timbl]
No I haven't
20:17:38 [presbrey]
script timeout, continue/stop script warning pops up
20:17:56 [timbl]
Well, it does take a while to load things
20:18:21 [timbl]
(It would be nice to get a progress bar of course)
20:18:31 [timbl]
Do you worry that the time is not linear?
13:18:40 [presbrey]
it's a very simple graph, single degree, 15k instances I linked only by zip code
20:19:21 [presbrey]
I could isolate them into individual graphs by zipcode for better performance but that seems silly
20:19:30 [presbrey]
yes I guess a progress bar would be nice
20:20:10 [timbl]
Jim Hendler, I know, breaks big files up into medium-sized chunks when he generates linked data for tabulator
20:20:25 [timbl]
Maybe though we need to profile it
20:20:48 [presbrey]
that's the workflow I was guessing I'd have to use with tabulator
20:21:20 [mcherian]
mcherian (~mathew@30-6-25.wireless.csail.mit.edu) has joined #dig
20:21:26 [presbrey]
I'll just sparql through it
20:21:37 [timbl]
Another possibility is to just ask for the first MB of the file, and quit, or background the rest in some way, or revert to firing SPARQL queries against the server directly as one explores
20:22:04 [timbl]
Do you have a sparql endpoint for it?
20:23:07 [presbrey]
sure
20:23:47 [timbl]
I'd be interested in fixing tabulator up so that you can switch to just using SPARQL for URIs in a given tree
20:24:15 [presbrey]
I can help with some infrastructure there
20:24:43 [presbrey]
these files I put up on a host that handles application/sparql-query sent to any HTTP URI
20:24:51 [presbrey]
data-wiki style but w/o the update for now
20:25:38 [presbrey]
though I suppose there should be some explicit header or metadata for tabulator to follow
20:26:02 [presbrey]
thinking more, maybe ms-author-via is enough
20:26:10 [presbrey]
via sparql, rather
20:27:11 [timbl]
What is the RDF label spec P____
20:27:15 [timbl]
POWDER
20:27:32 [timbl]
http://www.w3.org/TR/powder-voc/
20:29:10 [presbrey]
here's the one I'm looking at now:
20:29:20 [presbrey]
http://assets.qwobl.com/2010/medicare/NHC_NH
20:29:26 [presbrey]
A script on this page may be busy, or it may have stopped responding. You can stop the script now, open the script in the debugger, or let the script continue.
20:29:27 [presbrey]
Script: chrome://tabulator/content/js/rdf/rdflib.js:2432
20:30:23 [timbl]
and you say stop and you have some of the data?
20:30:34 [timbl]
BTW you can change that timeout
20:30:45 [timbl]
about:config
20:32:17 [presbrey]
that 12.5MB gets to me at 3.78M/s in 3.2s
20:32:33 [presbrey]
...in about:config, dom.max_chrome_script_run_time? dom.max_script_run_time
20:33:05 [presbrey]
if I hit stop and refresh, tabulator does show partial data
20:33:28 [presbrey]
I suppose it thinks it already retrieved it and doesn't detect partial-parsing
20:34:50 [timbl]
Well, I like that if it partially parses it I get what it did get
20:34:57 [timbl]
if you look at the internal pane
20:35:18 [timbl]
http://assets.qwobl.com/2010/medicare/schema#NursingHome 404
20:35:34 [presbrey]
hehe
20:35:45 [presbrey]
yes that's not formalized yet
20:36:09 [presbrey]
the data is still in early browsing stage for me
20:36:15 [presbrey]
very weakly linked, zip only, etc.
20:36:53 [presbrey]
medicare.gov ships it in CSV
20:37:03 [presbrey]
so schema: is a placeholder
20:38:21 [timbl]
Would be nice if for example http://assets.qwobl.com/2010/medicare/schema#NursingHomeName were specified as being an rdfs:subPropertyOf rdfs:label, and then the homes would be labelled with their names
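The labelling idea above can be sketched roughly as follows. This is only an illustration, not how tabulator does it: the function name label_for, the tuple representation of triples, and the example.org instance URI in the usage note are assumptions, and only direct subproperties of rdfs:label are considered (no transitive closure).

```python
from typing import Iterable, Optional, Tuple

Triple = Tuple[str, str, str]

RDFS = 'http://www.w3.org/2000/01/rdf-schema#'
LABEL = f'<{RDFS}label>'
SUB_PROPERTY_OF = f'<{RDFS}subPropertyOf>'


def label_for(subject: str, triples: Iterable[Triple]) -> Optional[str]:
    """Return a label for `subject`, honouring direct subproperties of rdfs:label.

    Hypothetical helper: terms are plain strings in N-Triples syntax.
    """
    triples = list(triples)
    # rdfs:label itself, plus every property declared as a direct subproperty of it.
    label_props = {LABEL} | {
        s for s, p, o in triples if p == SUB_PROPERTY_OF and o == LABEL
    }
    for s, p, o in triples:
        if s == subject and p in label_props:
            return o
    return None
```

With a triple stating that schema#NursingHomeName is an rdfs:subPropertyOf rdfs:label, label_for('<http://example.org/home/1>', triples) would return that home's NursingHomeName value instead of None.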
20:38:29 [timbl]
But I digress ..
20:38:43 [timbl]
My IRC client has gone transparent ... restarting it
20:38:51 [timbl]
timbl has left #dig
20:39:14 [timbl]
timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
20:39:33 [timbl]
That's better
20:40:19 [webr3]
would be very nice to have either an ntriples stream or a sax like processor for rdf, then you could just consume and add to the store as you go with a v low memory footprint
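A streaming consumer of the kind described above might look roughly like this. It is a simplified sketch, not the implementation discussed in the channel: stream_triples is a hypothetical name, the regular expressions cover only the common N-Triples forms (no validation of escape sequences or IRI contents), and terms are kept as raw strings.

```python
import re
from typing import Iterable, Iterator, Tuple

# Simplified N-Triples term patterns (assumption: enough for well-formed input).
IRI = r'<[^>]*>'
BNODE = r'_:[A-Za-z0-9_]+'
LITERAL = r'"(?:[^"\\]|\\.)*"(?:@[A-Za-z-]+|\^\^<[^>]*>)?'
TRIPLE = re.compile(
    rf'^({IRI}|{BNODE})\s+({IRI})\s+({IRI}|{BNODE}|{LITERAL})\s*\.$'
)

Triple = Tuple[str, str, str]


def stream_triples(lines: Iterable[str]) -> Iterator[Triple]:
    """Yield one (subject, predicate, object) tuple per N-Triples line.

    Works on any iterable of lines (an open file, a network stream), so only
    the current line is held in memory.
    """
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith('#'):
            continue
        match = TRIPLE.match(stripped)
        if match is None:
            raise ValueError(f'not a valid N-Triples line: {line!r}')
        yield match.group(1), match.group(2), match.group(3)
```

Because the function is a generator, a caller can add each triple to its store as it arrives, for example by looping over stream_triples(open_file) and calling the store's own add method on every tuple.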
20:41:00 [timbl]
Yes, it is a shame that there is no attempt to stop the script without extreme prejudice first, e.g. generate an exception, before totally destroying the thread
20:41:32 [timbl]
cwm's rdf/xml processor is stream-based.
20:41:51 [timbl]
You can use it for pipeline processing. Based on ax
20:41:53 [timbl]
sax
20:42:32 [timbl]
But the way XMLHttpRequest works at the moment is to give you a DOM fully parsed.
20:42:36 [timbl]
Maybe one could fix that
20:42:50 [webr3]
nice, have been working with manu to define the same for the rdfa api, so far implemented a streaming one for ntriples in js, working on turtle next, then finally rdf/xml which may be a challenge without dom
20:43:06 [webr3]
timbl, good point - I'll chase that up and talk to webapps guys to make it happen
20:43:07 [timbl]
Then if you had live update code, presbrey, a table which had subscribed to changes in the store would get real-time updates
20:43:52 [timbl]
webr3, you working on the RDF API?
20:43:58 [timbl]
RDFA I mean
20:44:27 [timbl]
I think there is a big mistake happening to not make the RDFA API look like (tabulator's) RDF API where it can
20:44:48 [webr3]
timbl, yes - part of rdfa wg now specifically to cover the rdfa api - and to bring it as in line with tabulator as I can!
20:45:42 [presbrey]
so if I split this data by zip code
20:45:56 [presbrey]
I might produce an index.n3 that explicitly links to all zip code graphs
20:45:59 [webr3]
timbl, spending most of my time recoding tabulator and also doing rdfa api at the same time so i can understand from the inside out - brb
20:46:25 [presbrey]
seeAlso could be useful for linking to each
20:47:03 [presbrey]
powder?
20:47:08 [webr3]
presbrey, link:listDocumentProperty is nice - a typed seeAlso
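The split-by-zip-code workflow with an index document could be sketched as below. Everything here is illustrative: partition_by_zip and build_index are hypothetical names, the example.org URIs in the usage note are placeholders, zip codes are assumed to be plain literals, and rdfs:seeAlso is used for the links (a typed property such as the one webr3 mentions could be substituted).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

RDFS_SEE_ALSO = '<http://www.w3.org/2000/01/rdf-schema#seeAlso>'


def partition_by_zip(triples: Iterable[Triple],
                     zip_predicate: str) -> Dict[str, List[Triple]]:
    """Group all triples about a subject under that subject's zip code.

    Assumes each subject has at most one zip triple with a plain literal
    object; subjects without a zip triple are left out.
    """
    triples = list(triples)
    # First pass: which zip code does each subject belong to?
    zip_of = {s: o.strip('"') for s, p, o in triples if p == zip_predicate}
    # Second pass: file every triple under its subject's zip code.
    graphs: Dict[str, List[Triple]] = defaultdict(list)
    for s, p, o in triples:
        if s in zip_of:
            graphs[zip_of[s]].append((s, p, o))
    return dict(graphs)


def build_index(index_uri: str, base_uri: str,
                zips: Iterable[str]) -> List[str]:
    """Return index lines linking the index document to each zip graph."""
    return [
        f'<{index_uri}> {RDFS_SEE_ALSO} <{base_uri}{z}> .'
        for z in sorted(zips)
    ]
```

Calling build_index('http://example.org/index', 'http://example.org/zip/', graphs) on the partitioned result gives one seeAlso statement per zip graph, which a browser could follow on demand instead of loading all ~300k triples at once.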
20:51:48 [timbl]
webr3, I hacked section 3.1.2 of http://www.w3.org/2010/rdf-api/RDF-API.html to see what the tabulator API would look like
20:51:52 [timbl]
Just as good IMHO
20:51:57 [timbl]
and already implemented
20:52:11 [timbl]
If you look at the examples
20:54:18 [timbl]
document.data could support the graph (IndexedFormula) interface from tabulator, including each and any etc
20:56:38 [webr3]
timbl, yup I've made a load of changes which aren't in the document yet (no editor rights), so far aligned data store pretty much with formula, then working in indexed formula over the top
20:58:11 [webr3]
findAllMembers, sym etc all v good :) thanks!
20:58:43 [timbl]
You've been making an alternative version too?
21:00:55 [webr3]
changed that version, need to get some time with manu to update the api though - still not finished, need to turn data store into an (optional) indexed formula
21:01:25 [webr3]
sorry, to update the docs, have all the IDL defined and implemented for a version not in the editor's draft yet
21:02:05 [timbl]
who is editing it? Manu?
21:06:25 [webr3]
yes, manu is editing it, when he gets a chance, he has a huge changeset from me to add in, most of the Data Interfaces have changed, and will change further
21:06:39 [webr3]
apologies, doing many things at once
21:07:59 [webr3]
timbl, shall I give you a nudge when the next revision is done so you can review?
21:08:57 [webr3]
also, fyi, webapps/html wg have blocked IRI, BlankNode, TypedLiteral, RDFTriple, PlainLiteral from being in the global namespace (as in having named constructors in the browsers)
21:11:50 [timbl]
Well, could you pass on my comments to the WG?
21:12:15 [timbl]
I only changed 3.1.2
21:12:44 [timbl]
in that
21:12:53 [timbl]
timbl has quit (Quit: timbl)
21:22:40 [kennyluck]
kennyluck (~kennyluck@114-43-122-83.dynamic.hinet.net) has joined #dig
21:22:40 [kennyluck]
kennyluck has quit (Excess Flood)
21:33:41 [kennyluck]
kennyluck (~kennyluck@114-43-122-83.dynamic.hinet.net) has joined #dig
21:33:45 [kennyluck]
kennyluck has quit (Excess Flood)
21:54:59 [mcherian]
mcherian has quit (Read error: Operation timed out)
22:37:26 [lkagal]
lkagal has quit (Quit: lkagal)
23:00:16 [marisol]
marisol (~marisol@pool-141-154-118-225.bos.east.verizon.net) has joined #dig
23:01:55 [RalphS]
RalphS has quit (Quit: heading to train ...)
23:30:40 [marisol]
marisol has quit (Quit: marisol)