
n3 Cwm Changes


@@which release?

Release 1.1rc1

see 11 Aug announcement

Performance Improvements

Python 2.3 or later now required

Used for Sets

Experimental SPARQL Server support

Cwm can now run as a SPARQL server. This includes:

RDF/XML serialization fixes

A few strange bugs in RDF/XML serialization, many related to the rdf: prefix or xml: prefix, have been fixed.

delta exit statuses

delta now returns exit statuses similar to those of the diff utility for plaintext files. An exit status of 0 means no differences between the from and to graphs were found. An exit status of 1 means some differences were found. An exit status of 2 means differences were not computed for some reason.
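
A calling script might interpret these statuses along the following lines. This is only a sketch: the three meanings come from the paragraph above, while the names DELTA_STATUS and describe_delta_status are made up for illustration and are not part of cwm.

```python
# Meanings of the exit statuses described above.
# The dictionary and function names are illustrative, not part of cwm.
DELTA_STATUS = {
    0: "no differences found",
    1: "differences found",
    2: "differences not computed",
}


def describe_delta_status(returncode):
    """Translate an exit status from delta into a readable message."""
    return DELTA_STATUS.get(returncode, "unexpected exit status %d" % returncode)


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, describe_delta_status(code))
```

The return code itself would typically come from something like subprocess.run(argv).returncode, where argv is whatever delta command line you normally use.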

Release 1.0

delta and cwm --patch

General bugfixes

Fixes in decimal support

A bug introduced in 0.8.0, where cwm crashed if no input files were specified, has been fixed.

Other

A crude hack in WebAccess.py allows a local file system to be used instead of a given website. The intent is to clean this up in the future.

There has been much work on a grammar for n3. grammar/n3.n3 is the grammar; grammar/predictiveParser.n3 can understand it.

RDF/XML support changes

Performance improvements

Performance work has resulted in some tasks taking 1/10 the time that they used to. Much more work is planned in this regard.

Packaging

Cwm now uses Python's distutils for distribution. This allows cwm to be installed. As an added bonus, there are now RPMs and Windows installers.

Flatten support

--flatten and --unflatten have been rewritten, replacing the old --flat. Flatten minimally reifies an n3 file to make it an rdf graph. Note that the graph may still fail to serialize as rdf, due to literals as subjects.

Release 0.8

This release has some bugfixes, as well as some changed and new functionality.

General bugfixes

Cwm will now do the expected thing on log:uri if given an invalid URI string.

An integer over the platform's native integer size will no longer crash Cwm.

Cwm now understands most xsd datatypes for use in builtins.

By setting the environment variable CWM_RDFLIB to 1, you can make cwm use rdflib to parse RDF/XML files. Note that this is unsupported, may break in the future, and requires that rdflib be installed. Rdflib supports some RDF features that Cwm's parser does not, so this may be useful to you notwithstanding.
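
If cwm is launched from a Python script, the variable can be set for that one child process without touching the caller's own environment. A minimal sketch, in which the helper name cwm_rdflib_env is hypothetical and only the variable name and value come from the note above:

```python
import os


def cwm_rdflib_env(base_env=None):
    """Return a copy of the environment with CWM_RDFLIB set to "1".

    The helper name is illustrative; only CWM_RDFLIB=1 comes from the notes.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CWM_RDFLIB"] = "1"
    return env


if __name__ == "__main__":
    print(cwm_rdflib_env({"PATH": "/usr/bin"}))
```

The result can be passed as the env argument of subprocess.run together with your usual cwm command line.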

BNodes scope changes

BNodes given names using the _: notation now have formula scope, not document scope.

Release / Test changes

Cwm is now released in a new file, cwm.tar.gz, which, while being a fraction of the size of the old cwm.tgz, still includes a complete set of tests. It was discovered that Cwm was not actually running all of its tests. This has been fixed, and the tests have been split into tests guaranteed to work in the .tar.gz file (offline) and those which will not.

N3QL support

Initial support for N3QL query language handling in cwm. The command line looks like: cwm myKB.n3 --query=myquery.n3ql. The initial test case is in swap/test/ql/detailed.tests.

N3 serialization changes

The default output form is @forAll, not this log:forAll, and @forSome, not this log:forSome. Use --n3=v to force the old, obsolete form for variable declarations. The parser has supported both forms for a long time and will continue to do so for some time.

Reification support

Reification, after being broken for almost a year, has been rewritten and is better than ever. See reify/detailed.tests. The options, as always, are --reify and --dereify. The tests for this are now in. There is insufficient documentation for this feature, and it has not been completely tested, so there may yet be bugs in it. If there is demand for it, fixing the --flatten feature should be possible now.

2004/6/23:

MIME type changes

Mime type for Notation3 switched to text/rdf+n3. This could be a major change. The application for application/n3 never seemed to get anywhere with IANA anyway. Text has the advantage that browsers will by default display it, and N3 is designed to be readable. The downside is that if it is not ASCII, then you must include the charset specification in the MIME type. This must be UTF-8, as Notation3 is always UTF-8.
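
For anyone serving N3 from their own code, the header would look roughly like this. The sketch below is not taken from cwm; the function name n3_response is invented for illustration, and only the media type and the UTF-8 requirement come from the note above.

```python
N3_MEDIA_TYPE = "text/rdf+n3"  # media type named in the note above


def n3_response(text):
    """Return (headers, body) for serving an N3 document.

    The function name is illustrative. The charset parameter is always
    included, so the result stays correct for non-ASCII content.
    """
    body = text.encode("utf-8")
    headers = {
        "Content-Type": "%s; charset=utf-8" % N3_MEDIA_TYPE,
        "Content-Length": str(len(body)),
    }
    return headers, body


if __name__ == "__main__":
    headers, body = n3_response('<#me> <#name> "Zo\u00e9" .\n')
    print(headers["Content-Type"])
```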

Release 0.7.3

HTTP errors

Cwm now correctly identifies HTTP 404 errors, throwing an exception, and does not try to parse the returned (HTML) file.

Reflexive statements now work

An n3 statement which references the same universal quantifier twice now does the right thing.

Release 0.7.2

Ooops - nodeID was misspelled nodeid on RDF/XML output.

Patch functionality

The --patch=patchfile command line argument allows a patch file to be applied to the knowledge base (current working formula). See the Diff, Patch, Update and Sync note in the Design Issues series on the motivation for exchanging difference files, and how they work. Diff files can in certain specific circumstances (a well-labeled graph) be produced by the diff.py program included with this distribution. The new tests are in $SWAP/test/delta/detailed.tests

Internationalization undefined, Numerics canonicalized

The i18n/detailed.tests have been removed from the test harness. They were not right, and URI/IRI issues are not clear yet. It is not clear whether cwm should URI-canonicalize, or IRI-canonicalize. (My instinct is that it should - Tim).

Cwm now canonicalizes numerical (xsd:double and xsd:integer) values on N3 output. It uses Python's str(float(s)) and str(int(s)). The effect is to reduce some over-precision in the output.
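
The effect of the two calls named above can be seen in isolation. This is a small sketch rather than cwm's own code; the wrapper names are invented, and the exact digits printed for a double depend on the Python version's float formatting.

```python
def canonical_integer(lexical):
    """Round-trip an xsd:integer lexical form through int()."""
    return str(int(lexical))


def canonical_double(lexical):
    """Round-trip an xsd:double lexical form through float()."""
    return str(float(lexical))


if __name__ == "__main__":
    print(canonical_integer("007"))    # leading zeros are dropped
    print(canonical_double("1.50"))    # trailing zeros are dropped
    print(canonical_double("2.0000"))
```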

Release 0.7.1, 2004-03-04

RDF Parser improved

The cwm regression test now incorporates the RDF Core Positive Parser Tests except for those which deal with reification or with XML literals. In the process, xml:base support was added in the parser.

A new test found in the updated core tests requires RDF to be parsed even when there is no enveloping <rdf:RDF> tag, even if the outermost element is a typed node production, and so not something in the RDF namespace at all. This makes rdf much less self-describing, and makes it more dangerous that one might parse say an HTML file as RDF by accident. Use with care. If you need this feature, use the --rdf=R flag.

The RDF core tests are done with --rdf=RT to make the parser parse naked RDF or RDF buried in foreign XML.

nodeid generated on RDF output

This has been a missing feature of the RDF generator for a while. The nodeid feature allows bnodes to be output in RDF/XML. I may not have got this right, as I don't have RDF generation tests, only RDF parse tests.

Ordering of output

The ordering of Terms has been changed. Automatically generated terms with no URIs sort after anything which has a URI.

This will change the order of N3 and RDF/XML output but does not change its semantics.
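
The rule can be pictured as a two-part sort key. The sketch below is not cwm's comparison code; the Term tuple and the key function are hypothetical stand-ins that only illustrate "terms with a URI first, generated terms after".

```python
from collections import namedtuple

# Hypothetical stand-in for a term: either it has a URI, or it only has
# an automatically generated identifier.
Term = namedtuple("Term", ["uri", "generated_id"])


def output_sort_key(term):
    """Sort terms with a URI first, generated terms without a URI after."""
    if term.uri is not None:
        return (0, term.uri)
    return (1, term.generated_id)


terms = [
    Term(None, "_g1"),
    Term("http://example.org/b", None),
    Term(None, "_g0"),
    Term("http://example.org/a", None),
]
ordered = sorted(terms, key=output_sort_key)

if __name__ == "__main__":
    for term in ordered:
        print(term.uri or term.generated_id)
```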

Namespace prefix smarts on output

Cwm now does output in a two-pass process. This makes its counting of the number of occurrences of namespaces more accurate, which determines the default namespace it chooses. This does take more time, though not as long as the previous method of working out which was going to be most common. To skip this process, use the "d" flag on output (N3 or RDF/XML) to suppress the use of a default namespace.

Because this counting is now accurate, it now suppresses namespace prefix declarations which are not actually needed in the output.

Cwm will also make up prefixes when it needs them for a namespace, and none of the input data uses one. It peeks into the namespace URI, and looks around for a short string after the last "/", adding numbers if necessary to make the prefix unique.
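
The prefix guessing described above might be sketched as follows. This is an illustration of the idea, not cwm's actual algorithm: the function name, the fallback prefix "ns" and the numbering scheme are all assumptions.

```python
import re


def make_prefix(namespace_uri, used_prefixes):
    """Guess a prefix from the text after the last "/" of a namespace URI.

    Illustrative only: falls back to "ns" when no usable string is found,
    and appends 2, 3, ... until the prefix is not already in use.
    """
    tail = namespace_uri.rstrip("#/").rsplit("/", 1)[-1]
    match = re.match(r"[A-Za-z][A-Za-z0-9]*", tail)
    stem = match.group(0) if match else "ns"
    candidate = stem
    n = 1
    while candidate in used_prefixes:
        n += 1
        candidate = "%s%d" % (stem, n)
    return candidate


if __name__ == "__main__":
    print(make_prefix("http://www.w3.org/2000/10/swap/log#", set()))
    print(make_prefix("http://www.w3.org/2000/10/swap/log#", {"log"}))
```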

Namespaces without hashes

When writing N3, cwm does not normally use namespace names for URIs which do not have a "#". Including a "/" in the flags overrides this.

cwm mydcdata.n3 --n3="/"

Namespaces which end in "/" lead (in my opinion) to an unfortunate confusion between the RDF properties and the HTTP document they identify. This is related to W3C TAG Issue httpRange-14.

ValueError: You cannot use 'this' except as subject of forAll or forSome

The "this" syntax in a formula refers to the formula itself. It was used for the pseudo-statements this log:forSome x, and this log:forAll y.

In a few rare cases it actually was used to refer logically to the formula itself. A classic is of course { this a log:falseHood }. I decided that this was going to make more problems than it would solve. The pseudo-statements have gone anyway (in response to popular request); they became just syntax. And the @forAll syntax has been introduced as the way to go. So with this release, while you can still, like many N3 files, use this to qualify variables, you can't use it for anything else.

Release 0.7, 2004-02-04

2004-02-04: This is the first numbered release. After much discussion we picked 0.7 as the number. A CVS tag rel-0-7 has been added, so that if you have the source through CVS, you can up- or downgrade to this release with

cvs update -r rel-0-7

The idea is that this release has a well-defined set of bugs, and that we work toward a more community-supported platform with time.

  1. The set of open bugs or requests for enhancement (RFEs) which have been sent to the list is now available, every time we run make, in an iCalendar format ToDo list. This at least tracks the outstanding ones. Those for which a "[closed] Re: ..." response exists are given Completed status.
  2. A few have actually been closed. Closure now, where appropriate, involves including a suitable test in the regression tests. This test currently includes all the n3 format files detailed.tests in subdirectories of $SWAP/test.
  3. New mailing lists have been made.
    public-cwm-announce
    This low-traffic list is for announcements about releases of cwm software.
    public-cwm-bugs
    This is for the announcement and brief discussion/clarification of cwm bugs. A mail with subject [closed] Re: .... marks a thread as closed. We may make this protocol more sophisticated with time. When responding, use only mailers which send the reference headers, so that the threads on this mailing list work. For new threads, please make the subject line informative, and use the word "bug" or "RFE" as appropriate. The current plan is to review changes in this monthly and send it to the announce list.
    public-cwm-talk
    Discussion by users and/or developers of the use and abuse of project software.
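
The "[closed] Re:" convention is simple enough to process mechanically. The sketch below shows one way a script could derive a status per thread from a list of subject lines. It is not the tool that produces the ToDo list: the function name is invented, the "Open" label is an assumption, and grouping by subject text is a simplification (the notes above rely on reference headers for threading).

```python
CLOSED_PREFIX = "[closed] Re:"


def thread_status(subjects):
    """Map each thread's original subject to "Completed" or "Open".

    Illustrative only: threads are grouped by subject text.
    """
    status = {}
    for subject in subjects:
        subject = subject.strip()
        if subject.startswith(CLOSED_PREFIX):
            # A closing reply always wins, whatever came before it.
            status[subject[len(CLOSED_PREFIX):].strip()] = "Completed"
        elif subject.startswith("Re:"):
            status.setdefault(subject[len("Re:"):].strip(), "Open")
        else:
            status.setdefault(subject, "Open")
    return status


if __name__ == "__main__":
    print(thread_status([
        "bug: crash on empty input",
        "[closed] Re: bug: crash on empty input",
        "RFE: faster output",
    ]))
```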

Changes 2003-09

  1. More on-added functionality: --closure=e smushes nodes which are = into one. That is, any time the working formula (knowledge base) has a triple added where a = b, the store with "e" in the closure mode will replace all occurrences of b with a. When choosing which of the two to use as the node, the preference will be for literals or lists or formulae, then for symbols with URIs (lexically lowest preferred, e.g. <a> rather than <b>), then for blank nodes. It will remember the quality.
  2. --closure=T causes the adding of a triple {F a log:Truth}, where F is a formula, to load the content of F into the knowledge base.
  3. The ordering of terms has been fixed so that all constants (literals, lists and formulae) appear before any symbols in a listing.

    (code is formula.compareTerm())
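
The preference order used when smushing with --closure=e can be illustrated with a toy model. This is a sketch only and does not reproduce the store's code: the Node tuple, the kind names and both functions are hypothetical.

```python
from collections import namedtuple

# Hypothetical node model; kind is one of
# "literal", "list", "formula", "symbol" or "bnode".
Node = namedtuple("Node", ["kind", "value"])

_RANK = {"literal": 0, "list": 0, "formula": 0, "symbol": 1, "bnode": 2}


def preferred(a, b):
    """Pick the node to keep: constants first, then the lexically lowest
    URI symbol, then blank nodes. Ties keep the first argument."""
    def key(node):
        return (_RANK[node.kind], node.value if node.kind == "symbol" else "")
    return min(a, b, key=key)


def smush(triples, a, b):
    """Replace every occurrence of the less preferred node with the other."""
    keep = preferred(a, b)
    drop = b if keep is a else a
    return [tuple(keep if term == drop else term for term in triple)
            for triple in triples]


if __name__ == "__main__":
    a = Node("symbol", "http://example.org/a")
    b = Node("symbol", "http://example.org/b")
    print(preferred(b, a).value)
```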

Cwm Changes 2003/08

At this point cwm made a number of changes at once, so we document them here.

@prefix test: <http://www.w3.org/2000/10/swap/test/regression#>.

  1. The namespaces used for lists changed from DAML to RDF. The namespace used for "=" switched to OWL. DAML and DAML+OIL are used no more, though we remember them fondly. This change actually shouldn't change much in many applications, where files use the collection syntax in RDF or the () and = syntaxes in N3. It does affect the order of statements cwm uses to pretty-print files.

  2. Lists are handled differently internally. This is a bit of an experiment. Instead of being stored in their first/rest forms in the graph, complete lists are stored as List objects. These are first class objects.
  3. New option --data strips the store down to an RDF graph, losing universal variables (forAll's) and any statements which mention nested formulae. --purge-rules is now deprecated and does the same thing. (--purge-rules used to just remove things mentioning forAll or implies.)
  4. Generated Ids have in the past been generated relative to the current input file (or base) like <#_g0>. This had two problems. The first is that purists point out that cwm is asserting things about the user's namespace which aren't necessarily true - why does cwm have the right to do this? The second is pragmatic: when two files already with cwm-generated genids were re-processed using cwm, it could in some cases end up reusing the same id. Now, cwm makes up, by default, much more resilient ones. It imagines a file called something like .run-19435661734.7651234 in the local directory, and uses that file's local space for its names. If you don't like this (eg for repeatable testing) you can set an environment variable CWM_RUN_NS to something like "#", which will restore the original behaviour, or to a specific namespace you know you won't reuse in related work.
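
The genid scheme in the last item can be pictured like this. It is a sketch under stated assumptions, not cwm's code: the function names are invented, and the exact shape of the per-run namespace (here ".run-" plus a timestamp plus "#") is a guess based on the example file name above.

```python
import os
import time


def run_namespace(environ=None, now=None):
    """Choose the namespace in which generated ids are minted.

    CWM_RUN_NS, if set, is used as is. Otherwise a per-run name is made
    up from the current time (the format is an assumption).
    """
    environ = os.environ if environ is None else environ
    override = environ.get("CWM_RUN_NS")
    if override is not None:
        return override
    stamp = time.time() if now is None else now
    return ".run-%s#" % stamp


def genid(namespace, n):
    """Build the n-th generated id in the given namespace."""
    return "%s_g%d" % (namespace, n)


if __name__ == "__main__":
    print(genid(run_namespace({"CWM_RUN_NS": "#"}), 0))
    print(genid(run_namespace({}), 0))
```

With CWM_RUN_NS set to "#", the first generated id comes out as #_g0, matching the old behaviour described above.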

More changes 2003-08-25

  1. 2003-08 Pretty printing split from llyn.py to pretty.py. Interface may change. Pretty printing should be faster now - for some runs it was taking a lot longer than the parsing and inference steps. It had never been designed for speed, but it was getting ridiculous.
  2. 2003-08-30 List processing improved. test/list/append.n3 works, reverse.n3 doesn't yet. substituteEquals is supported by all Terms; it does substitution of equals, and is used in combination with F._redirection, which is a dictionary of aliases. Preparing for handling of equality and smushing of nodes.

For python cwm developers

  1. The store is being separated into components. llyn.py was too big. The idea is to be able to support a redland underlying store in the future. The unification engine which includes the built-in function operation now uses the store's api to access the store without peeking into internals.
  2. thing.py split into myStore.py and term.py
  3. It used to be that the universal and existential variables in a formula were stored in forAll and forSome pseudo-property triples. This was basically a kludge. One always had to be aware that the pseudo-properties didn't act like real ones (no substitution for = etc). So now the lists of variables have gone into the formula objects. This should make it much easier for people to follow what is happening in the code, and removes all kinds of special case code.

Getting an old version

You can get the old version before these 2.0 changes using CVS, by checking out with the tag oldLists.

cvs update -r oldLists

Features done earlier

Done

- sucking in the schema (http library?)

--schemas ; - to know about r1 see r2;

- split Query engine out as subclass of RDFStore? (DWC) SQL-equivalent client

- split out separate modules: CGI interface, command-line stuff, built-ins (DWC 30Aug2001)

- (test/retest.sh is another/better list of completed functionality --DWC)

- BUG: a [ b c ] d. gets improperly output. See anon-pred

- Separate the store hash table from the parser.

- DONE - regeneration of genids on output.

- DONE - representation of genids and foralls in model - regression test

- DONE (once!) Manipulation: { } as notation for bag of statements

- DONE - filter

- DONE - graph match

- DONE - recursive dump of nested bags

- DONE - semi-reification - reifying only subexpressions

- DONE - Bug :x :y :z as data should match [ :y :z ] as query. Fixed by stripping forSomes from top of query.

- BUG: {} is a context but that is lost on output!!! statements not enough. See foo2.n3 - change existential representation :-( to make context a real conjunction again? (the forSome triple is special in that you can't remove it and reduce info) - filter out duplicate conclusions

- BUG! - DONE - Validation: validate domain and range constraints against closure of classes and mutually disjoint classes.

- Use unambiguous property to infer synonyms (see sameDan.n3 test case in test/retest.sh)

- schema validation - done partly but no "no schema for xx predicate".

UTILS WE HAVE DONE

- includes(expr1, expr2) (cf >= , dixitInterAlia )

- indirectlyImplies(expr1, expr2)

- startsWith(x,y)

- uri(x, str)

- usesNamespace(x,y) # find transitive closure for validation - awful function in reality


$Id: changes.html,v 1.48 2005/11/03 22:25:04 connolly Exp $