IRC log of dig on 2007-07-04
Timestamps are in UTC.
- 00:58:04 [lkagal]
- lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
- 02:45:36 [lkagal]
- lkagal has quit ()
- 03:43:06 [djweitzner]
- djweitzner has quit ()
- 10:58:16 [lkagal]
- lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
- 10:58:41 [lkagal]
- lkagal has quit (Client Quit)
- 12:33:25 [timbl]
- UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 128: ordinal not in range(128)
- 12:33:40 [timbl]
- Why an ascii codec error when utf-8 encoding?
- 12:55:37 [sbp]
- timbl: you have an '\x8E' in the str to be encoded, i.e. before encoding
- 12:55:50 [sbp]
- it's trying to decode that input as ascii before encoding it as utf-8...
- 12:56:05 [sbp]
- presumably you'll need to decode it as iso-8859-1 first before encoding it
- 12:56:36 [sbp]
- >>> '...\x8e...'.encode('utf-8')
- 12:56:36 [sbp]
- Traceback (most recent call last):
- 12:56:36 [sbp]
- File "<stdin>", line 1, in <module>
- 12:56:36 [sbp]
- UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 3: ordinal not in range(128)
- 12:56:36 [sbp]
- >>> '...\x8e...'.decode('iso-8859-1').encode('utf-8')
- 12:56:38 [sbp]
- '...\xc2\x8e...'
- 12:56:40 [sbp]
- >>>
- 12:56:42 [sbp]
- like that
- 12:57:24 [timbl]
- This is in the N3 serialization stage.
- 12:57:37 [timbl]
- So everything it is encoding should be unicode.
- 12:58:39 [sbp]
- so the \x8e input byte is actually representing the codepoint U+008E?
- 12:59:17 [sbp]
- the input string can't be a unicode string, otherwise you wouldn't get an error
- 12:59:18 [sbp]
- >>> u'\x8e'.encode('utf-8')
- 12:59:19 [sbp]
- '\xc2\x8e'
- 12:59:19 [sbp]
- >>>
- 12:59:23 [timbl]
- I think I get it. It is encoding a string which it read from a file, a stdout error log
- 12:59:40 [timbl]
- without decoding it properly.
- 12:59:45 [sbp]
- aha
- 13:00:21 [timbl]
- that makes sense. I'll assume all files are encoded in utf8
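The fix discussed above (decode the bytes read from the file before encoding them) can be sketched as follows. The session in the log is Python 2, where calling .encode on a byte str first decodes it implicitly as ASCII; this sketch uses Python 3, where bytes and text are separate types. The function name and the default source encoding are assumptions for illustration, not part of cwm.

```python
def reencode_as_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode raw bytes using an explicitly stated encoding, then encode as UTF-8.

    Hypothetical helper: the point is only that the decode step is explicit
    instead of being left to an implicit ASCII default.
    """
    text = raw.decode(source_encoding)  # bytes -> unicode text
    return text.encode("utf-8")         # unicode text -> UTF-8 bytes


# Same input as in the log: byte 0x8e, read as ISO-8859-1, is U+008E,
# which UTF-8 represents as the two bytes C2 8E.
converted = reencode_as_utf8(b"...\x8e...")
```

If the file really is UTF-8, as assumed in the log, a lone 0x8e byte is not valid input and raw.decode("utf-8") would raise UnicodeDecodeError, so the choice of source encoding matters.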
- 13:00:44 [timbl]
- What I haven't found is the equivalent to .decode in Javascript.
- 13:01:24 [timbl]
- This is all testing the n3 parser ported to js
- 13:17:56 [lkagal]
- lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
- 13:45:42 [sbp]
- timbl: the only thing I've found so far is http://developer.mozilla.org/en/docs/nsIScriptableUnicodeConverter which is a Gecko interface for doing bytes <-> unicode conversion
- 13:46:24 [sbp]
- there's an example of using it to read a file in a user-specified encoding at http://developer.mozilla.org/en/docs/Reading_textual_data#Reading_strings_2
- 14:34:43 [timbl]
- Pity they call everything ns .. makes it socially more difficult for others to standardize later. Nice find. Thank you!
- 14:35:25 [sbp]
- yup. you're welcome
- 14:36:00 [timbl]
- They have inheritance graphs in gif (which seem not to work) but no inheritance data in rdf... one day!
- 14:36:25 [sbp]
- I wonder what the GIFs are backed by, if anything? UML?
- 14:38:12 [timbl]
- or just the IDL maybe
- 14:42:07 [timbl]
- Ok, so I will have that in the browser but not in the rhino environment. But Rhino has access to the java runtime, so I am sure I can find an equivalent.
- 14:43:22 [sbp]
- why the use of Rhino? a standalone Tabulator?
- 14:43:57 [timbl]
- For testing the n3 parser in the cwm development environment - batch
- 14:44:19 [timbl]
- http://www.w3.org/2000/10/swap/pyjs/Makefile
- 14:44:20 [sbp]
- aha
- 14:44:48 [timbl]
- It might even be useful for debugging
- 14:45:06 [timbl]
- I found a bug in the store with it. At least it gives stack traces.
- 14:45:21 [timbl]
- Actually has a simple debugger, though not as nice as Firebug
- 14:46:30 [timbl]
- It is kinda weird. notation3.py was the first bit of code which started all of cwm etc off, now I am dealing with it again. danc emailed it to me and installed Python and played with it on a plane in 2000
- 14:46:35 [sbp]
- there's also http://www.mozilla.org/js/spidermonkey/ - same kind of thing, but in C
- 14:46:45 [sbp]
- might be faster. it's the implementation of Javascript used in the browser
- 14:47:08 [sbp]
- I think you can do apt-get install smjs or something, too, on debianesque distros
- 14:47:17 [sbp]
- heh, yeah
- 14:47:56 [timbl]
- Rhino I had tried before and not had the energy, but this time I found it is just a case of sticking the tag file unwrapped into classpath.
- 14:48:16 [sbp]
- interesting that after all of the formalisation attempts, too, the original simple recursive descent module is still tops
- 14:49:02 [sbp]
- ah. I wonder how much harder it'd be with Spidermonkey; I might give it a go and report on it myself
- 14:50:29 [timbl]
- tops? I haven't timed it.
- 14:50:37 [timbl]
- The predictive parser should be faster
- 14:50:56 [timbl]
- as it can jump to the right production as a function of the next character
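The idea above, jumping to the right production as a function of the next character, can be sketched with a dispatch table. This is a toy illustration only: the three productions and all names are hypothetical and much simpler than the N3 grammar or the cwm parsers.

```python
def parse_uri(s, i):
    # Production for <...>: read up to the closing angle bracket.
    end = s.index(">", i)
    return ("uri", s[i + 1:end]), end + 1


def parse_string(s, i):
    # Production for "...": read up to the closing quote (no escapes handled).
    end = s.index('"', i + 1)
    return ("literal", s[i + 1:end]), end + 1


def parse_variable(s, i):
    # Production for ?name: consume alphanumeric characters after the '?'.
    j = i + 1
    while j < len(s) and s[j].isalnum():
        j += 1
    return ("var", s[i + 1:j]), j


# One lookup on the next character selects the production; no backtracking.
DISPATCH = {"<": parse_uri, '"': parse_string, "?": parse_variable}


def parse_term(s, i=0):
    if i >= len(s) or s[i] not in DISPATCH:
        raise SyntaxError("no production starts at position %d" % i)
    return DISPATCH[s[i]](s, i)


term, next_pos = parse_term("<http://example.org/a> rest")
```

Each call returns the parsed term and the position where parsing should resume, so a caller can loop over a sequence of terms.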
- 14:51:45 [sbp]
- tops in terms of being still the main parser in cwm, I mean
- 14:52:06 [sbp]
- is that because all the other parsers simply lag in terms of features?
- 14:52:14 [sbp]
- are they harder to maintain and upgrade?
- 14:58:12 [timbl]
- I have never gotten around to integrating them and testing them. You made one or two, didn't you?
- 14:58:20 [timbl]
- n3p?
- 14:58:33 [sbp]
- yeah, n3p is the most recent
- 14:58:43 [timbl]
- The predictive parser I only had parsing but not generating any triples.
- 14:58:51 [sbp]
- its test suite might be quite useful too
- 14:58:59 [timbl]
- did you get n3p to the point where it would generate triples?
- 14:59:12 [sbp]
- yep, I did indeed
- 14:59:21 [timbl]
- Test suite? ah. Yosi is accumulating test suites.
- 14:59:47 [timbl]
- swap/test/n3
- 14:59:52 [sbp]
- though with the 2004 version of n3.n3. when I tried to feed in the latest n3.n3 last year, with all the new unicode stuff, rdflib (which is used in the schema-parsing stage only) started to complain about stuff. I didn't get it fixed
- 15:00:18 [sbp]
- yeah, I saw the HTML table of the test output you linked to in here a while ago
- 15:00:22 [timbl]
- Ah .. the 32 bit regexps
- 15:00:25 [timbl]
- 64 bit
- 15:00:42 [timbl]
- Oh, ok, so Yosi has it in his test matrix?
- 15:01:01 [sbp]
- right, but I don't think the regexps were the problem. it was just an API change in rdflib... though of course since I didn't get past that I don't know if there'd be further problems
- 15:01:12 [timbl]
- ah
- 15:01:12 [sbp]
- nope, n3p isn't in the test matrix. nor its test suite, I believe
- 15:01:54 [timbl]
- Do you have CVS write access to swap?
- 15:01:58 [sbp]
- nope
- 15:02:11 [timbl]
- I wonder whether we could get you that
- 15:02:52 [sbp]
- for me to integrate n3p into the test suite?
- 15:03:10 [timbl]
- and the n3p tests
- 15:03:25 [timbl]
- we should be moving toward publishing turtle with tests too
- 15:03:40 [timbl]
- so having a good test matrix will be important
- 15:04:00 [timbl]
- Currently I don't have a class of turtle tests which don't use extra n3 features
- 15:04:22 [sbp]
- publishing turtle in TR space? which group is responsible for that?
- 15:05:38 [timbl]
- Ivan felt that the sw interest group should do it
- 15:05:56 [timbl]
- as a note
- 15:06:03 [sbp]
- interesting
- 15:06:07 [timbl]
- to get it out there
- 15:06:11 [sbp]
- good idea
- 15:06:19 [timbl]
- then the ietf would anoint the mime type
- 15:06:32 [sbp]
- they only need a Note? hmm
- 15:07:04 [sbp]
- what's the hold up of the N3 MIME type, by the way? similar situation?
- 16:04:13 [davidli]
- davidli (i=dli@MACGREGOR-FIVE-FIFTY-NINE.MIT.EDU) has joined #dig
- 16:06:18 [davidli]
- davidli has quit (Client Quit)
- 16:15:03 [timbl]
- The IETF said on a liaison call they need some 'stable' document to refer to
- 16:15:06 [timbl]
- a note would do
- 19:36:59 [lkagal]
- lkagal has quit ()