The repository is essentially a virtual filesystem. Each file is a 'resource', and is modeled as an object of one of the following classes:
                                 __ Container __________ Repository
           __ UriReferenceFile  |__ User
          |                     |__ Group
RawFile --|                     |__ Command
          |__ XmlDocument ______|__ Server
                                |__ Alias
                                |__ Container
                                |__ SchematronDocument
                                |__ (RdfDocument)
                                |__ XsltDocument ___________
                                |                           |
                                |__ (DocumentDefinition)    |
                                              |_________ XsltDocumentDefinition
                                              |_________ XPathDocumentDefinition
If the resource that you're storing in the repository isn't one of the more specialized classes, then it is just a RawFile. Generally, as a user, you're mainly going to work with just RawFile, XmlDocument, XsltDocument, and Container.
Every resource is referenceable by a path, just as in a POSIX file system, and paths behave much as you'd expect. Each path has an equivalent ftss URI.
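The path-to-URI correspondence can be sketched in a few lines of Python. This is only an illustration: it assumes the ftss URI is the scheme prefix 'ftss://' followed by the absolute repository path, which is the shape of the 'ftss:///ftss/data/dublin_core.docdef' URI used in the docdef example in this overview. The helper names are made up here and are not part of the 4Suite API.

```python
import posixpath
from urllib.parse import quote, unquote

FTSS_PREFIX = "ftss://"  # assumed URI form: ftss:// + absolute path


def path_to_ftss_uri(path):
    """Map an absolute repository path to an ftss URI (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("repository paths are absolute: %r" % path)
    # Collapse '.', '..' and repeated slashes, POSIX style.
    normalized = posixpath.normpath(path)
    # normpath keeps a leading '//' (POSIX allows it); reduce it to '/'.
    if normalized.startswith("//"):
        normalized = normalized[1:]
    # Percent-encode characters that are not URI-safe; '/' is left alone.
    return FTSS_PREFIX + quote(normalized)


def ftss_uri_to_path(uri):
    """Inverse mapping: recover the repository path from an ftss URI."""
    if not uri.startswith(FTSS_PREFIX + "/"):
        raise ValueError("not an ftss URI: %r" % uri)
    return unquote(uri[len(FTSS_PREFIX):])
```

For instance, path_to_ftss_uri('/ftss/data/dublin_core.docdef') gives back the URI used in the docdef example, and ftss_uri_to_path reverses it.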
A minimal repository includes a root (path '/') Container that contains an 'ftss' Container, in which can be found various Containers for Users, Groups, Commands, Servers, and some essential data files.
There is an RDF 'model', which is an unordered set of RDF statements, stored in a database. It is divided into a "system" model for the statements pertaining to the repository (and other system-managed statements), and a "user" model that is under your control and is essentially a blank slate. Certain features, such as the '4ss rdf complete' command, act on the combined system and user models.
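As a rough mental picture (not how 4Suite stores anything internally), the two models can be thought of as two sets of triples, with "combined" operations working on their union. The predicate name 'contained-by', the sample resources, and the match() helper below are all invented for illustration.

```python
# Each statement is a (subject, predicate, object) tuple; a model is a set.
system_model = {
    # Hypothetical system-managed statement about a repository resource.
    ("/ftss", "contained-by", "/"),
}
user_model = {
    # Hypothetical statement added by a user.
    ("/docs/a.xml", "http://purl.org/dc/elements/1.1/creator", "alice"),
}


def match(model, subject=None, predicate=None, obj=None):
    """Return the statements matching a pattern; None acts as a wildcard."""
    return {
        s for s in model
        if (subject is None or s[0] == subject)
        and (predicate is None or s[1] == predicate)
        and (obj is None or s[2] == obj)
    }


# Features that act on "the combined system and user models" see the union.
combined = system_model | user_model
```

A query such as match(combined, subject='/ftss') finds the system statement, while the same query against user_model alone finds nothing.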
An XmlDocument is an XML document, of course, and it can optionally be associated with a DocumentDefinition ("docdef"). A DocumentDefinition is a resource that describes how to derive RDF statements from the XML: deserialization guidelines, basically. Its content can be either XML or XSLT that follows certain conventions. Whenever an XmlDocument associated with a docdef is created, updated, or deleted, the RDF statements derived from it are updated automatically in the model. This is really powerful, and is described in more detail here (free registration required). As an example, if the XML doc is XHTML, then you could write a docdef to generate a Dublin Core 'title' RDF statement from the /html/head/title element; any time the XML doc is updated, that statement is updated too. These statements, being automatically managed, are stored in the "system" model, but there has been some discussion as to whether that is appropriate and how it might change in the future. Only one docdef can be associated with a document, but docdefs can import definitions from one another, if needed, as in this example:
<ftss:DocDef name='foo'
    xmlns:ftss='http://xmlns.4suite.org/reserved'
    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <!-- import dublin_core.docdef's definitions -->
  <ftss:BaseNames>
    <ftss:Base xmlns:xlink='http://www.w3.org/1999/xlink'
        xlink:href='ftss:///ftss/data/dublin_core.docdef'
        xlink:type='simple' xlink:actuate='onLoad' xlink:show='embed'/>
  </ftss:BaseNames>
  <!-- foo.docdef's own definitions follow -->
  ...
</ftss:DocDef>
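To make the XHTML 'title' example concrete, here is a small sketch in plain Python that mimics what a docdef accomplishes: derive a Dublin Core title statement from /html/head/title and refresh it whenever the document is stored again. It uses only the standard library, not the 4Suite docdef machinery. The function names and the sample URI are assumptions made for this illustration, and the code takes the shortcut of tracking derived statements by subject.

```python
import xml.etree.ElementTree as ET

DC_TITLE = "http://purl.org/dc/elements/1.1/title"
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def derive_statements(doc_uri, xml_text):
    """Return the (subject, predicate, object) statements derived from a doc."""
    root = ET.fromstring(xml_text)
    # Accept plain <html> as well as namespaced XHTML for the demo.
    title = root.find("head/title")
    if title is None:
        title = root.find(XHTML_NS + "head/" + XHTML_NS + "title")
    if title is None or not (title.text or "").strip():
        return set()
    return {(doc_uri, DC_TITLE, title.text.strip())}


def store_document(model, doc_uri, xml_text):
    """Replace the statements previously derived from doc_uri (in place)."""
    # Simplification: statements are tracked by subject in this sketch.
    stale = {s for s in model if s[0] == doc_uri}
    model.difference_update(stale)
    model.update(derive_statements(doc_uri, xml_text))


def delete_document(model, doc_uri):
    """Drop every statement derived from doc_uri."""
    model.difference_update({s for s in model if s[0] == doc_uri})
```

Calling store_document() twice for the same URI with different titles leaves only the newer title statement in the model, which is the behavior described above for updates.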
An RdfDocument is a virtual document, of sorts. When you view its content, it looks like serialized RDF/XML. But whenever it is loaded into the repository, modified, or deleted, corresponding changes are made to statements in the user model. For writing, it is like an XmlDocument with a docdef that deserializes RDF/XML. For reading, it is a nice RDF/XML view of statements in a particular scope.
In the model, statements are grouped by 'scope', perhaps better thought of as 'asserted by' or 'the context in which the statements apply'. It is often just a URI representing the origin of the statements; e.g., if the statements were deserialized from an XML document, it is the URI of that document. The scope essentially turns an RDF statement into a quadruple instead of a triple, because in scope A you can have statement Sub1 Pred1 Obj1, and at the same time in scope B you can have statement Sub1 Pred1 Obj2 (ordinarily, such statements would be mutually exclusive). Uche wrote up a bit about it here and here.
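The quadruple idea can be sketched with a set of four-field tuples. This is a toy model in plain Python, not the 4RDF API; the Statement type and the helper functions are invented for illustration, and the sample values are the Sub1/Pred1/Obj1 placeholders from the paragraph above.

```python
from collections import namedtuple

# A statement is a triple plus the scope that asserts it.
Statement = namedtuple("Statement", "subject predicate obj scope")

model = set()


def add(subject, predicate, obj, scope):
    """Assert a statement within the given scope."""
    model.add(Statement(subject, predicate, obj, scope))


def in_scope(scope):
    """Return all statements asserted by one scope."""
    return {s for s in model if s.scope == scope}


def remove_scope(scope):
    """Retract everything a scope asserted, e.g. when its document is deleted."""
    model.difference_update(in_scope(scope))


# The same subject and predicate can carry different objects in two scopes.
add("Sub1", "Pred1", "Obj1", scope="A")
add("Sub1", "Pred1", "Obj2", scope="B")
```

Because the scope is part of each tuple, both statements coexist in the set, and calling remove_scope('A') retracts only what scope A asserted.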
The RDF model can be queried with Versa, a somewhat XPath-inspired query language that 4Suite developed in conjunction with engineers from Sun Microsystems. The '4ss rdf' commands can be used to do Versa queries and other operations on statements in the model.
It is possible to create and use an RDF model without repository integration via our '4RDF' API. This is demonstrated briefly here and you can also use the 4rdf command-line tool for that.
The repository is built within a real database, typically either PostgreSQL or "FlatFile", though we do support a few others. FlatFile is just a homegrown storage system that uses your local file system and probably should be renamed, as it is no longer 'flat'. It's the easiest to set up, but suffers from I/O overhead, as one would expect.
All of the resources in the repository have RDF statements associated with them in the system model. The statements model metadata such as creation time, last-modified time, size, owner, group, relationship between containers and contents, etc. -- basically everything except the content itself, if content exists.
The repository is accessible through various layers of APIs:
 ____________________________________________________________
| repository & RDF back-end (Postgres, FlatFile, whatever)   |
|____________________________________________________________|
                              |
                   server core Python API
                 (Ft.Server.Server.SCore.*)
           ___________________|___________________________________
          |                                           |           |
RPC service (4ssd, usually listening on port 8803;    |      FTP service
controlled by 4ss_manager command-line script)        |
          |                                         HTTP
client core Python API                             service
(Ft.Server.Client.Core.*)
          |
    command-line
    script (4ss)
So the repository can be on machine X, and you can access it from machine Y via these APIs.
The RPC, HTTP, and FTP services also manifest as Server objects in the repository, which makes setting them up and turning them on or off as easy as loading an XML document into the repository or modifying one that is already there.
The most powerful part is the HTTP service. It integrates XSLT support, invoked through form data or through rules established at the server level. You can cause any requested XML document to be transformed and the result given in the HTTP response.
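As a sketch of the "invoked through form data" case, the snippet below builds a request URL that names a stylesheet in the query string. The parameter name 'xslt', the host and port, and both repository paths are assumptions chosen for illustration; check your server's configuration for the names it actually recognizes. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode, urlunsplit


def transform_url(host, doc_path, stylesheet_path, param="xslt"):
    """Build a URL asking for doc_path to be transformed by stylesheet_path.

    'param' is the form field naming the stylesheet; its default value
    here is an assumption, not a documented 4Suite setting.
    """
    query = urlencode({param: stylesheet_path})
    return urlunsplit(("http", host, doc_path, query, ""))


# Hypothetical host, document, and stylesheet paths.
url = transform_url("localhost:8080", "/docs/page.xml", "/styles/page.xslt")
```

Here url comes out as 'http://localhost:8080/docs/page.xml?xslt=%2Fstyles%2Fpage.xslt'; a browser or any HTTP client could then fetch it from a running server.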
The XSLT support includes stylesheet chaining, and extension elements & functions that provide repository & RDF model access/modification including XUpdate and Versa queries. You get XInclude and XPointer as well, and of course any form data and other HTTP request info is also available to the stylesheet to do with as you wish.
You can build an entire web application just by turning on the server and loading some XSLT documents into the repository. This is what our repository demos and the Dashboard do. Once you get your head around how it works, it's amazingly cool, in our collective opinion. If you're going to compare it to other web application frameworks, it's more like Zope than Cocoon, but it is different enough to be in a class all its own.
See also the info here.
One caveat is that the HTTP server isn't designed with a huge amount of security in mind, nor does it replicate all the features you'd want in an HTTP server (notably, virtual hosts!), so it is best to run the repository's HTTP service behind a proxying server such as Apache using its ProxyPass directive.
