IRC log of dig on 2010-10-04
Timestamps are in UTC.
- 11:30:55 [DIGlogger]
- DIGlogger (~dig-logge@groups.csail.mit.edu) has joined #dig
- 11:30:55 [lindbohm.freenode.net]
- topic is: Decentralized Information Group @ MIT http://dig.csail.mit.edu/
- 11:30:55 [lindbohm.freenode.net]
- Users on #dig: DIGlogger RalphS mhausenblas melvster nunnun_away timbl webr3 Yudai presbrey drrho ericP sandro ernestchiang_ gbot73
- 11:37:32 [mcherian]
- mcherian (~mathew@c-24-91-159-17.hsd1.ma.comcast.net) has joined #dig
- 11:43:55 [melvster]
- melvster has quit (Ping timeout: 240 seconds)
- 12:22:58 [betehess]
- betehess (~betehess@betehess.w3.org) has joined #dig
- 12:25:01 [amy]
- amy (~amy@31-34-225.wireless.csail.mit.edu) has joined #dig
- 12:28:12 [timbl_]
- timbl_ (~timbl@pool-96-237-236-230.bstnma.fios.verizon.net) has joined #dig
- 12:31:52 [timbl]
- timbl has quit (Ping timeout: 252 seconds)
- 12:32:38 [timbl_]
- timbl_ has quit (Ping timeout: 245 seconds)
- 13:03:17 [oshani]
- oshani (~oshani@c-71-233-151-72.hsd1.ma.comcast.net) has joined #dig
- 13:05:32 [nunnun_away]
- nunnun_away is now known as nunnun
- 13:14:03 [melvster]
- melvster (~melvster@p579F9B57.dip.t-dialin.net) has joined #dig
- 13:16:43 [timbl]
- timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
- 13:18:43 [lkagal]
- lkagal (~lkagal@30-6-179.wireless.csail.mit.edu) has joined #dig
- 13:21:00 [lkagal]
- lkagal has quit (Client Quit)
- 13:39:17 [oshani]
- oshani has quit (Quit: Mama nidi!)
- 14:04:40 [marisol]
- marisol (~marisol@31-35-238.wireless.csail.mit.edu) has joined #dig
- 14:48:37 [oshani]
- oshani (~oshani@c-71-233-151-72.hsd1.ma.comcast.net) has joined #dig
- 16:00:12 [mhausenblas]
- mhausenblas has quit (Quit: mhausenblas)
- 16:14:07 [timbl]
- timbl has quit (Quit: timbl)
- 16:24:26 [timbl]
- timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
- 16:36:48 [mcherian]
- mcherian has quit (Ping timeout: 245 seconds)
- 17:23:43 [oshani]
- oshani has quit (Quit: oshani)
- 18:08:34 [mcherian]
- mcherian (~mathew@64.134.67.3) has joined #dig
- 18:11:26 [timbl]
- timbl has quit (Quit: timbl)
- 18:21:10 [lkagal]
- lkagal (~lkagal@30-6-179.wireless.csail.mit.edu) has joined #dig
- 18:22:52 [timbl]
- timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
- 19:11:15 [marisol]
- marisol has quit (Quit: marisol)
- 19:26:22 [timbl]
- timbl has quit (Quit: timbl)
- 19:44:02 [mcherian]
- mcherian has quit (Ping timeout: 255 seconds)
- 20:13:05 [timbl]
- timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
- 20:16:39 [presbrey]
- timbl, have you loaded any huge documents in tabulator?
- 20:17:16 [presbrey]
- I'm trying to load ~300k triples I converted from medicare.gov
- 20:17:32 [timbl]
- No I haven't
- 20:17:38 [presbrey]
- script timeout, continue/stop script warning pops up
- 20:17:56 [timbl]
- Well, it does take a while to load things
- 20:18:21 [timbl]
- (It would be nice to get a progress bar of course)
- 20:18:31 [timbl]
- Do you worry that the time is not linear?
- 20:18:40 [presbrey]
- it's a very simple graph, single degree, 15k instances I linked only by zip code
- 20:19:21 [presbrey]
- I could isolate them into individual graphs by zipcode for better performance but that seems silly
- 20:19:30 [presbrey]
- yes I guess a progress bar would be nice
- 20:20:10 [timbl]
- Jim Hendler, I know, breaks big files up into medium-sized chunks when he generates linked data for tabulator
- 20:20:25 [timbl]
- Maybe though we need to profile it
- 20:20:48 [presbrey]
- that's the workflow I was guessing I'd have to use with tabulator
- 20:21:20 [mcherian]
- mcherian (~mathew@30-6-25.wireless.csail.mit.edu) has joined #dig
- 20:21:26 [presbrey]
- I'll just sparql through it
- 20:21:37 [timbl]
- Another possibility is to just ask for the first MB of the file, and quit, or background the rest in some way, or revert to firing SPARQL queries against the server directly as one explores
- 20:22:04 [timbl]
- Do you have a sparql endpoint for it?
- 20:23:07 [presbrey]
- sure
- 20:23:47 [timbl]
- I'd be interested in fixing tabulator up so that you can switch to just using SPARQL for URIs in a given tree
- 20:24:15 [presbrey]
- I can help with some infrastructure there
- 20:24:43 [presbrey]
- these files I put up on a host that treats application/sparql-query to any HTTP URI
- 20:24:51 [presbrey]
- data-wiki style but w/o the update for now
- 20:25:38 [presbrey]
- though I suppose there should be some explicit header or metadata for tabulator to follow
- 20:26:02 [presbrey]
- thinking more, maybe ms-author-via is enough
- 20:26:10 [presbrey]
- via sparql, rather
- 20:27:11 [timbl]
- What is the RDF label spec P____
- 20:27:15 [timbl]
- POWDER
- 20:27:32 [timbl]
- http://www.w3.org/TR/powder-voc/
- 20:29:10 [presbrey]
- here's the one I'm looking at now:
- 20:29:20 [presbrey]
- http://assets.qwobl.com/2010/medicare/NHC_NH
- 20:29:26 [presbrey]
- A script on this page may be busy, or it may have stopped responding. You can stop the script now, open the script in the debugger, or let the script continue.
- 20:29:27 [presbrey]
- Script: chrome://tabulator/content/js/rdf/rdflib.js:2432
- 20:30:23 [timbl]
- and you say stop and you have some of the data?
- 20:30:34 [timbl]
- BTW you can change that timeout
- 20:30:45 [timbl]
- about:config
- 20:32:17 [presbrey]
- that 12.5MB gets to me at 3.78M/s in 3.2s
- 20:32:33 [presbrey]
- ...in about:config, dom.max_chrome_script_run_time? dom.max_script_run_time
- 20:33:05 [presbrey]
- if I hit stop and refresh, tabulator does show partial data
- 20:33:28 [presbrey]
- I suppose it thinks it already retrieved it and doesn't detect partial-parsing
- 20:34:50 [timbl]
- Well, I like that if it partially parses it I get what it did get
- 20:34:57 [timbl]
- if you look at the internal pane
- 20:35:18 [timbl]
- http://assets.qwobl.com/2010/medicare/schema#NursingHome 404
- 20:35:34 [presbrey]
- hehe
- 20:35:45 [presbrey]
- yes that's not formalized yet
- 20:36:09 [presbrey]
- the data is still in early browsing stage for me
- 20:36:15 [presbrey]
- very weakly linked, zip only, etc.
- 20:36:53 [presbrey]
- medicare.gov ships it in CSV
- 20:37:03 [presbrey]
- so schema: is a placeholder
- 20:38:21 [timbl]
- Would be nice if for example http://assets.qwobl.com/2010/medicare/schema#NursingHomeName were specified as being an rdfs:subPropertyOf rdfs:label and then the homes would be labelled with their names
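The labelling idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not how tabulator implements it: triples are plain Python tuples, and the nursing home URI and its name are made-up sample data. Only the `schema#NursingHomeName` URI comes from the log.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
NAME = "http://assets.qwobl.com/2010/medicare/schema#NursingHomeName"
HOME = "http://example.org/home/1"  # hypothetical instance URI


def label_properties(triples):
    """Return rdfs:label plus every property declared, directly or
    transitively, as an rdfs:subPropertyOf it."""
    found = {RDFS_LABEL}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == RDFS_SUBPROPERTY_OF and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def label_for(subject, triples):
    """Return the first value of any label-like property of subject, or None."""
    props = label_properties(triples)
    for s, p, o in triples:
        if s == subject and p in props:
            return o
    return None


triples = [
    (NAME, RDFS_SUBPROPERTY_OF, RDFS_LABEL),  # the schema statement timbl suggests
    (HOME, NAME, "Example Nursing Home"),     # made-up data
]
home_label = label_for(HOME, triples)
```

With the single subPropertyOf statement in place, a viewer that only knows about rdfs:label can still pick up the home's name.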
- 20:38:29 [timbl]
- But I digress ..
- 20:38:43 [timbl]
- My IRC client has gone transparent ... restarting it
- 20:38:51 [timbl]
- timbl has left #dig
- 20:39:14 [timbl]
- timbl (~timbl@31-34-129.wireless.csail.mit.edu) has joined #dig
- 20:39:33 [timbl]
- That's better
- 20:40:19 [webr3]
- would be very nice to have either an ntriples stream or a sax like processor for rdf, then you could just consume and add to the store as you go with a v low memory footprint
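A consume-as-you-go loader of the kind described above might look roughly like this. It is a sketch under assumptions, not the streaming parser webr3 mentions: the regular expression covers only a simplified subset of N-Triples, the store is a plain Python set, and `stream_triples`, `load_into_store` and the example.org data are invented for illustration. The optional callback shows where a progress bar could hook in.

```python
import io
import re

# Simplified N-Triples terms: IRIs, blank nodes, and literals with an
# optional language tag or datatype. This is not the full grammar.
_SUBJECT = r'(<[^>]*>|_:\w+)'
_PREDICATE = r'(<[^>]*>)'
_OBJECT = r'(<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
_LINE = re.compile(
    r'^\s*' + _SUBJECT + r'\s+' + _PREDICATE + r'\s+' + _OBJECT + r'\s*\.\s*$'
)


def stream_triples(lines):
    """Yield one (subject, predicate, object) tuple per line, lazily."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # skip blank lines and comments
        match = _LINE.match(stripped)
        if match is None:
            raise ValueError(f"unparseable N-Triples line: {stripped!r}")
        yield match.group(1), match.group(2), match.group(3)


def load_into_store(lines, store, batch_size=1000, on_progress=None):
    """Add triples to the store as they are parsed; report progress per batch."""
    count = 0
    for triple in stream_triples(lines):
        store.add(triple)
        count += 1
        if on_progress is not None and count % batch_size == 0:
            on_progress(count)
    return count


sample = io.StringIO(
    '<http://example.org/home/1> <http://example.org/schema#zip> "02139" .\n'
    '# a comment line\n'
    '<http://example.org/home/2> <http://example.org/schema#zip> "02139" .\n'
)
store = set()
progress = []
loaded = load_into_store(sample, store, batch_size=1, on_progress=progress.append)
```

Because only one line is held at a time, memory use depends on the store rather than on the size of the input file.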
- 20:41:00 [timbl]
- Yes, it is a shame that there is no attempt to stop the script without extreme prejudice first, generating an exception, before totally destroying the thread
- 20:41:32 [timbl]
- cwm's rdf/xml processor is stream-based.
- 20:41:51 [timbl]
- You can use it for pipeline processing. Based on ax
- 20:41:53 [timbl]
- sax
- 20:42:32 [timbl]
- But the way XMLHttpRequest works at the moment is to give you a DOM fully parsed.
- 20:42:36 [timbl]
- Maybe one could fix that
- 20:42:50 [webr3]
- nice, have been working with manu to define the same for the rdfa api, so far implemented a streaming one for ntriples in js, working on turtle next, then finally rdf/xml which may be a challenge without dom
- 20:43:06 [webr3]
- timbl, good point - I'll chase that up and talk to webapps guys to make it happen
- 20:43:07 [timbl]
- Then if you had live update code, presbrey, a table which had subscribed to changes in the store would get real-time updates
- 20:43:52 [timbl]
- webr3, you are working on the RDF API?
- 20:43:58 [timbl]
- RDFA I mean
- 20:44:27 [timbl]
- I think there is a big mistake happening in not making the RDFA API look like (tabulator's) RDF API where it can
- 20:44:48 [webr3]
- timbl, yes - part of rdfa wg now specifically to cover the rdfa api - and to bring it as in line with tabulator as I can!
- 20:45:42 [presbrey]
- so if I split this data by zip code
- 20:45:56 [presbrey]
- I might produce an index.n3 that explicitly links to all zip code graphs
- 20:45:59 [webr3]
- timbl, spending most of my time recoding tabulator and also doing rdfa api at the same time so i can understand from the inside out - brb
- 20:46:25 [presbrey]
- seeAlso could be useful for linking to each
- 20:47:03 [presbrey]
- powder?
- 20:47:08 [webr3]
- presbrey, link:listDocumentProperty is nice - a typed seeAlso
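The split-by-zip-code plan with an index document could be sketched as follows. Everything here is an assumption made for illustration: the example.org base URI, the zip predicate, and the layout of per-zip graph URIs are hypothetical, and plain rdfs:seeAlso is used for the links rather than the typed property webr3 suggests.

```python
from collections import defaultdict

RDFS_SEE_ALSO = "<http://www.w3.org/2000/01/rdf-schema#seeAlso>"
ZIP_PREDICATE = "<http://example.org/schema#zip>"  # hypothetical predicate
BASE = "http://example.org/medicare/"              # hypothetical base URI


def partition_by_zip(triples):
    """Group triples into one list per zip code, keyed by the subject's zip.
    Subjects with no zip triple are left out of every graph."""
    triples = list(triples)
    zip_of = {s: o.strip('"') for s, p, o in triples if p == ZIP_PREDICATE}
    graphs = defaultdict(list)
    for s, p, o in triples:
        zip_code = zip_of.get(s)
        if zip_code is not None:
            graphs[zip_code].append((s, p, o))
    return dict(graphs)


def build_index(graphs):
    """Return index statements, one seeAlso link per zip code graph."""
    index_uri = f"<{BASE}index>"
    return [
        f"{index_uri} {RDFS_SEE_ALSO} <{BASE}zip/{zip_code}> ."
        for zip_code in sorted(graphs)
    ]


triples = [
    ("<http://example.org/home/1>", ZIP_PREDICATE, '"02139"'),
    ("<http://example.org/home/1>", "<http://example.org/schema#beds>", '"120"'),
    ("<http://example.org/home/2>", ZIP_PREDICATE, '"02140"'),
]
graphs = partition_by_zip(triples)
index_lines = build_index(graphs)
```

Each line in `index_lines` is a complete statement, so the list can be written out as the index file and each per-zip graph fetched only when a user follows its link.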
- 20:51:48 [timbl]
- webr3, I hacked section 3.1.2 of http://www.w3.org/2010/rdf-api/RDF-API.html to see what the tabulator API would look like
- 20:51:52 [timbl]
- Just as good IMHO
- 20:51:57 [timbl]
- and already implemented
- 20:52:11 [timbl]
- If you look at the examples
- 20:54:18 [timbl]
- document.data could support the graph (IndexedFormula) interface from tabulator, including each and any etc
- 20:56:38 [webr3]
- timbl, yup I've made a load of changes which aren't in the document yet (no editor rights), so far aligned data store pretty much with formula, then working in indexed formula over the top
- 20:58:11 [webr3]
- findAllMembers, sym etc all v good :) thanks!
- 20:58:43 [timbl]
- You've been making an alternative version too?
- 21:00:55 [webr3]
- changed that version, need to get some time with manu to update the api though - still not finished, need to turn data store into an (optional) indexed formula
- 21:01:25 [webr3]
- sorry, to update the docs, have all the IDL defined and implemented for a version not in the editor's draft yet
- 21:02:05 [timbl]
- who is editing it? Manu?
- 21:06:25 [webr3]
- yes, manu is editing it, when he gets a chance, he has a huge changeset from me to add in, most of the Data Interfaces have changed, and will change further
- 21:06:39 [webr3]
- apologies, doing many things at once
- 21:07:59 [webr3]
- timbl, shall I give you a nudge when the next revision is done so you can review?
- 21:08:57 [webr3]
- also, fyi, webapps/html wg have blocked IRI, BlankNode, TypedLiteral, RDFTriple, PlainLiteral from being in the global namespace (as in having named constructors in the browsers)
- 21:11:50 [timbl]
- Well, could you pass on my comments to the WG?
- 21:12:15 [timbl]
- I only changed 3.1.2
- 21:12:44 [timbl]
- in that
- 21:12:53 [timbl]
- timbl has quit (Quit: timbl)
- 21:22:40 [kennyluck]
- kennyluck (~kennyluck@114-43-122-83.dynamic.hinet.net) has joined #dig
- 21:22:40 [kennyluck]
- kennyluck has quit (Excess Flood)
- 21:33:41 [kennyluck]
- kennyluck (~kennyluck@114-43-122-83.dynamic.hinet.net) has joined #dig
- 21:33:45 [kennyluck]
- kennyluck has quit (Excess Flood)
- 21:54:59 [mcherian]
- mcherian has quit (Read error: Operation timed out)
- 22:37:26 [lkagal]
- lkagal has quit (Quit: lkagal)
- 23:00:16 [marisol]
- marisol (~marisol@pool-141-154-118-225.bos.east.verizon.net) has joined #dig
- 23:01:55 [RalphS]
- RalphS has quit (Quit: heading to train ...)
- 23:30:40 [marisol]
- marisol has quit (Quit: marisol)