Publishing based on a repository export
Introduction
Here we consider a semi-static publishing scenario. The publishing frontend works disconnected from the repository server, based on an "export" of the (relevant) repository data. There can be as many of these publishing frontends (with corresponding exports) as you want.
                             +------------+
                             |repo server |.....|
                             +------------+     |
                                                |
                                                |
+----------------------+   +-------+            v
| publishing frontend  |   |       |     +--------------+
|                      |   |export |<....|export updater|
|  +------+            |   |       |     +--------------+
|  |cache |            |   |       |
|  +------+            |   |       |
+----------------------+   +-------+
The export & export updater
The export contains the documents that satisfy a certain expression (e.g. one based on collection membership, document types, ...).
What is exported for a document depends on its document type:
- Image/Attachment: export only the ImageData/AttachmentData
- Navigation: export the expanded navigation tree from the NavigationManager
- others: perform a publisher request
  - publisher requests might depend on the site, so it needs to be configured which ones are performed
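The rules above amount to a small dispatch on document type. A minimal sketch, assuming the document type names "Image", "Attachment" and "Navigation" (the names ExportKind and export_kind are hypothetical and not part of any existing API):

```python
from enum import Enum


class ExportKind(Enum):
    BINARY_DATA = "binary data only"
    NAVIGATION_TREE = "expanded navigation tree"
    PUBLISHER_REQUEST = "publisher request"


# Assumed document type names, mapped to the part that would be exported.
BINARY_PARTS = {"Image": "ImageData", "Attachment": "AttachmentData"}


def export_kind(document_type: str) -> ExportKind:
    """Choose how a document is exported, following the rules listed above."""
    if document_type in BINARY_PARTS:
        return ExportKind.BINARY_DATA
    if document_type == "Navigation":
        return ExportKind.NAVIGATION_TREE
    # Everything else goes through a (site-specific, configured) publisher request.
    return ExportKind.PUBLISHER_REQUEST
```

Which publisher requests are performed per site would still have to come from configuration; that part is left out here.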
The exporter does its work as a certain user (default guest).
The export needs a one-time initialisation and is then incrementally updated by the export updater, which is notified of changes through JMS events.
The exporter can be configured with multiple destination locations, i.e. it can manage multiple identical exports (alternative: use rsync?).
For each document, the exporter must maintain a list of dependencies, i.e. the documents on which that document depends; when any of these change, the export of the document needs to be refreshed too. These are:
- included documents
- linked-to documents (links can get annotated with information about linked-to documents)
- documents in query results? (so that these can be updated immediately in case a document gets deleted)
- for documents containing queries, a time-based refresh interval could be used
When a document is updated, both the export of the document itself and the exports of its dependent documents need to be refreshed.
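One possible in-memory structure for this is a reverse index from each document to the documents that depend on it. The sketch below is only an illustration: the class and method names are hypothetical, document ids are plain strings, and it assumes that each document's dependency list is already complete (nested includes listed explicitly), so only direct dependents are looked up.

```python
from collections import defaultdict


class DependencyIndex:
    """Tracks which exported documents must be refreshed when a document changes."""

    def __init__(self):
        self._deps = {}                      # doc id -> ids it depends on
        self._dependents = defaultdict(set)  # doc id -> ids that depend on it

    def set_dependencies(self, doc_id, depends_on):
        """Record the dependencies found while exporting doc_id."""
        # Remove the reverse edges of the previous export of this document.
        for old in self._deps.get(doc_id, ()):
            self._dependents[old].discard(doc_id)
        self._deps[doc_id] = set(depends_on)
        for dep in self._deps[doc_id]:
            self._dependents[dep].add(doc_id)

    def to_refresh(self, changed_doc_id):
        """Return the changed document plus every document depending on it."""
        return {changed_doc_id} | self._dependents.get(changed_doc_id, set())
```

For example, if documents A and D both include B, then `to_refresh("B")` yields A, B and D. Persisting the index (e.g. as XML) is not covered.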
To minimize work, it might be good to collect the required updates over a certain interval (e.g. 5 minutes), so that documents that would need refreshing several times during that interval (e.g. because many updates go live in a short time) are refreshed only once and no unnecessary duplicate work is done.
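Collecting updates over an interval could look roughly like the following. This is a sketch under assumptions: the class name PendingUpdates is hypothetical, the interval starts at the first pending update, and the current time is passed in explicitly (in seconds) to keep the example deterministic.

```python
class PendingUpdates:
    """Collects document ids to refresh and releases them once per interval."""

    def __init__(self, interval_seconds=300.0):
        self.interval_seconds = interval_seconds
        self._pending = set()       # a set, so repeated updates collapse into one
        self._window_start = None   # time of the first update in this window

    def add(self, doc_id, now):
        if not self._pending:
            self._window_start = now
        self._pending.add(doc_id)

    def drain_if_due(self, now):
        """Return the collected ids if the interval has elapsed, else an empty set."""
        if not self._pending or now - self._window_start < self.interval_seconds:
            return set()
        due, self._pending = self._pending, set()
        self._window_start = None
        return due
```

A document that is updated three times within the same five minutes then appears only once in the drained set.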
The export updater should work such that there is no difference between a fresh export and one that has been kept up to date by the export updater.
To be done:
- filesystem structure of this export
- which JMS events to react to and what to do on each event
- how to track the dependency information (in-memory structure + XML persistence?)
- what to do when a new site is added or the publisher requests change? -> new export from scratch
Publishing frontend
The publishing frontend still needs to do all the publishing work, as in the Daisy Wiki.
It can however cache the result of the styling process. Note that this is not done as part of the export, so that when the stylesheets change, only the cache needs to be flushed and the export can remain. (todo: need to think about what to base the cache key on: site + doc id + nav path?). This cache should exclude the navigation parts, so that the entire cache does not need to be flushed when the navigation is updated. Possibly, we could have a two-level cache: one with, and one without, the navigation included. The one without navigation could be updated when the export updates (does it?).
The caching should however be independent of the navigation tree, so that updates to the navigation tree don't require flushing the entire cache (?).
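As an illustration of the navigation-independent level of such a cache, here is a minimal sketch. It assumes a cache key of (site, doc id), which is only one of the candidates mentioned above; the nav path is left out on the assumption that navigation is merged in after the cache lookup. All names are hypothetical.

```python
class StyledContentCache:
    """Caches styled document content without navigation, keyed by (site, doc id)."""

    def __init__(self):
        self._entries = {}

    def get_or_render(self, site, doc_id, render):
        """Return the cached result, calling render() only on a cache miss."""
        key = (site, doc_id)
        if key not in self._entries:
            self._entries[key] = render()
        return self._entries[key]

    def invalidate_document(self, doc_id):
        """Drop one document for all sites, e.g. after the export updated it."""
        for key in [k for k in self._entries if k[1] == doc_id]:
            del self._entries[key]

    def flush(self):
        """Drop everything, e.g. after a stylesheet change."""
        self._entries.clear()
```

Because navigation is not part of the cached value, a navigation update touches none of these entries; only export updates and stylesheet changes do.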
It has its own "NavigationManager" implementation that works based on the exported navigation tree.
To do:
- How to handle links to documents which are not part of the export
- can reuse the same site definitions, skins & doctype styling as the Wiki


