IRC log of dig on 2007-07-04

Timestamps are in UTC.

00:58:04 [lkagal]
lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
02:45:36 [lkagal]
lkagal has quit ()
03:43:06 [djweitzner]
djweitzner has quit ()
10:58:16 [lkagal]
lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
10:58:41 [lkagal]
lkagal has quit (Client Quit)
12:33:25 [timbl]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 128: ordinal not in range(128)
12:33:40 [timbl]
Why an ascii codec error when utf-8 encoding?
12:55:37 [sbp]
timbl: you have an '\x8E' in the str to be encoded, i.e. before encoding
12:55:50 [sbp]
it's trying to decode that input as ascii before encoding it as utf-8...
12:56:05 [sbp]
presumably you'll need to decode it as iso-8859-1 first before encoding it
12:56:36 [sbp]
>>> '...\x8e...'.encode('utf-8')
12:56:36 [sbp]
Traceback (most recent call last):
12:56:36 [sbp]
File "<stdin>", line 1, in <module>
12:56:36 [sbp]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 3: ordinal not in range(128)
12:56:36 [sbp]
>>> '...\x8e...'.decode('iso-8859-1').encode('utf-8')
12:56:38 [sbp]
'...\xc2\x8e...'
12:56:40 [sbp]
>>>
12:56:42 [sbp]
like that
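A minimal sketch of the same failure and fix in Python 3 terms, where byte strings and text are separate types and the ASCII decode has to be written out explicitly. The sample bytes and both function names are assumptions made for illustration; the assumption that the input is really iso-8859-1 follows sbp's guess above.

```python
# Hypothetical sample input: a byte string containing the non-ASCII byte 0x8E.
raw = b"...\x8e..."


def implicit_ascii_decode(data: bytes) -> str:
    """Spell out the ASCII decode that Python 2 performs implicitly
    when .encode() is called on a byte string."""
    return data.decode("ascii")


def recode_latin1_to_utf8(data: bytes) -> bytes:
    """Decode with the (assumed) real source encoding, then encode as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


try:
    implicit_ascii_decode(raw)
except UnicodeDecodeError as err:
    # The byte 0x8E is outside the ASCII range, so the decode step fails.
    print("decode failed:", err.reason)

# With an explicit source encoding, the round trip succeeds.
print(recode_latin1_to_utf8(raw))
```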
12:57:24 [timbl]
This is in the N3 serialization stage.
12:57:37 [timbl]
So everything it is encoding should be unicode.
12:58:39 [sbp]
so the \x8e input byte is actually representing the codepoint U+008E?
12:59:17 [sbp]
the input string can't be a unicode string, otherwise you wouldn't get an error
12:59:18 [sbp]
>>> u'\x8e'.encode('utf-8')
12:59:19 [sbp]
'\xc2\x8e'
12:59:19 [sbp]
>>>
12:59:23 [timbl]
I think I get it. It is encoding a string which it read from a file, a stdout error log
12:59:40 [timbl]
without decoding it properly.
12:59:45 [sbp]
aha
13:00:21 [timbl]
that makes sense. I'll assume all files are encoded in utf8
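A rough sketch of that approach, assuming Python 3: read the file as raw bytes and decode it explicitly as UTF-8 before anything tries to encode it. The function name, the temporary file, and the choice of errors="replace" for bytes that are not valid UTF-8 are all assumptions for illustration, not what cwm does.

```python
import os
import tempfile


def read_text_assuming_utf8(path: str) -> str:
    """Read raw bytes and decode them explicitly as UTF-8.

    Undecodable bytes become U+FFFD instead of raising an error.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    return data.decode("utf-8", errors="replace")


# Hypothetical log file: valid UTF-8 text followed by a stray 0x8E byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("caf\u00e9 ".encode("utf-8") + b"\x8e")
    log_path = tmp.name

text = read_text_assuming_utf8(log_path)
os.remove(log_path)

# The result is a text string, so encoding it as UTF-8 can no longer
# trigger an implicit ASCII decode.
print(repr(text))
print(text.encode("utf-8"))
```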
13:00:44 [timbl]
What I haven't found is the equivalent to .decode in Javascript.
13:01:24 [timbl]
This is all testing the n3 parser ported to js
13:17:56 [lkagal]
lkagal (n=lkagal1@79.68.171.66.subscriber.vzavenue.net) has joined #dig
13:45:42 [sbp]
timbl: the only thing I've found so far is http://developer.mozilla.org/en/docs/nsIScriptableUnicodeConverter which is a Gecko interface for doing bytes <-> unicode conversion
13:46:24 [sbp]
there's an example of using it to read a file in a user-specified encoding at http://developer.mozilla.org/en/docs/Reading_textual_data#Reading_strings_2
14:34:43 [timbl]
Pity they call everything ns .. makes it socially more difficult for others to standardize later. Nice find. Thank you!
14:35:25 [sbp]
yup. you're welcome
14:36:00 [timbl]
They have inheritance graphs in gif (which seem not to work) but no inheritance data in rdf... one day!
14:36:25 [sbp]
I wonder what the GIFs are backed by, if anything? UML?
14:38:12 [timbl]
or just the IDL maybe
14:42:07 [timbl]
Ok, so I will have that in the browser but not in the rhino environment. But Rhino has access to the java runtime, so I am sure I can find an equivalent.
14:43:22 [sbp]
why the use of Rhino? a standalone Tabulator?
14:43:57 [timbl]
For testing the n3 parser in the cwm development environment - batch
14:44:19 [timbl]
http://www.w3.org/2000/10/swap/pyjs/Makefile
14:44:20 [sbp]
aha
14:44:48 [timbl]
It might even be useful for debugging
14:45:06 [timbl]
I found a bug in the store with it. At least it gives stack traces.
14:45:21 [timbl]
Actually has a simple debugger, though not as nice as Firebug
14:46:30 [timbl]
It is kinda weird. notation3.py was the first bit of code which started all of cwm etc off, now I am dealing with it again. danc emailed it to me and installed python and played with it on a plane in 2000
14:46:35 [sbp]
there's also http://www.mozilla.org/js/spidermonkey/ - same kind of thing, but in C
14:46:45 [sbp]
might be faster. it's the implementation of Javascript used in the browser
14:47:08 [sbp]
I think you can do apt-get install smjs or something, too, on debianesque distros
14:47:17 [sbp]
heh, yeah
14:47:56 [timbl]
Rhino I had tried before and not had the energy, but this time I found it is just a case of sticking the tag file unwrapped into classpath.
14:48:16 [sbp]
interesting that after all of the formalisation attempts, too, the original simple recursive descent module is still tops
14:49:02 [sbp]
ah. I wonder how much harder it'd be with Spidermonkey; I might give it a go and report on it myself
14:50:29 [timbl]
tops? I haven't timed it.
14:50:37 [timbl]
The predictive parser should be faster
14:50:56 [timbl]
as it can jump to the right production as a function of the next character
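A toy sketch of that idea, not the cwm predictive parser: a table maps the next input character straight to the production that handles it, so no alternatives have to be tried in turn. The three productions, the table, and all names are hypothetical and cover only a tiny made-up subset of term syntax.

```python
def parse_iri(text, pos):
    """Production for <...>: everything up to the closing angle bracket."""
    end = text.index(">", pos)
    return ("iri", text[pos + 1:end]), end + 1


def parse_string(text, pos):
    """Production for "...": no escape handling in this sketch."""
    end = text.index('"', pos + 1)
    return ("literal", text[pos + 1:end]), end + 1


def parse_variable(text, pos):
    """Production for ?name: a run of alphanumeric characters."""
    end = pos + 1
    while end < len(text) and text[end].isalnum():
        end += 1
    return ("variable", text[pos + 1:end]), end


# The prediction table: next character -> production function.
PRODUCTIONS = {"<": parse_iri, '"': parse_string, "?": parse_variable}


def parse_term(text, pos=0):
    """Pick the production from the next character alone, then run it."""
    production = PRODUCTIONS.get(text[pos])
    if production is None:
        raise SyntaxError("no production starts with %r" % text[pos])
    return production(text, pos)


term, next_pos = parse_term("<http://example.org/a> .")
print(term)
```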
14:51:45 [sbp]
tops in terms of being still the main parser in cwm, I mean
14:52:06 [sbp]
is that because all the other parsers simply lag in terms of features?
14:52:14 [sbp]
are they harder to maintain and upgrade?
14:58:12 [timbl]
I have never gotten around to integrating them and testing them. You made one or two, didn't you?
14:58:20 [timbl]
n3p?
14:58:33 [sbp]
yeah, n3p is the most recent
14:58:43 [timbl]
The predictive parser I only had parsing but not generating any triples.
14:58:51 [sbp]
its test suite might be quite useful too
14:58:59 [timbl]
did you get n3p to the point where it would generate triples?
14:59:12 [sbp]
yep, I did indeed
14:59:21 [timbl]
Test suite? ah. Yosi is accumulating test suites.
14:59:47 [timbl]
swap/test/n3
14:59:52 [sbp]
though with the 2004 version of n3.n3. when I tried to feed in the latest n3.n3 last year, with all the new unicode stuff, rdflib (which is used in the schema-parsing stage only) started to complain about stuff. I didn't get it fixed
15:00:18 [sbp]
yeah, I saw the HTML table of the test output you linked to in here a while ago
15:00:22 [timbl]
Ah .. the 32 bit regexps
15:00:25 [timbl]
64 bit
15:00:42 [timbl]
Oh, ok, so Yosi has it in his test matrix?
15:01:01 [sbp]
right, but I don't think the regexps were the problem. it was just an API change in rdflib... though of course since I didn't get past that I don't know if there'd be further problems
15:01:12 [timbl]
ah
15:01:12 [sbp]
nope, n3p isn't in the test matrix. nor its test suite, I believe
15:01:54 [timbl]
Do you have CVS write access to swap?
15:01:58 [sbp]
nope
15:02:11 [timbl]
I wonder whether we could get you that
15:02:52 [sbp]
for me to integrate n3p into the test suite?
15:03:10 [timbl]
and the n3p tests
15:03:25 [timbl]
we should be moving toward publishing turtle with tests too
15:03:40 [timbl]
so having a good test matrix will be important
15:04:00 [timbl]
Currently I don't have a class of turtle tests which don't use extra n3 features
15:04:22 [sbp]
publishing turtle in TR space? which group is responsible for that?
15:05:38 [timbl]
Ivan felt that the sw interest group should do it
15:05:56 [timbl]
as a note
15:06:03 [sbp]
interesting
15:06:07 [timbl]
to get it out there
15:06:11 [sbp]
good idea
15:06:19 [timbl]
then the ietf would anoint the mime type
15:06:32 [sbp]
they only need a Note? hmm
15:07:04 [sbp]
what's the hold up of the N3 MIME type, by the way? similar situation?
16:04:13 [davidli]
davidli (i=dli@MACGREGOR-FIVE-FIFTY-NINE.MIT.EDU) has joined #dig
16:06:18 [davidli]
davidli has quit (Client Quit)
16:15:03 [timbl]
The IETF said on a liaison call they need some 'stable document' to refer to
16:15:06 [timbl]
a note would do
19:36:59 [lkagal]
lkagal has quit ()