Publication Process Tasks Reference

General

All publication tasks read and write their files in the book instance that is currently being processed. All input and output paths specified on the individual publication tasks are therefore relative paths within the book instance, and are prefixed with the directory of the current publication output, that is, /publications/<publication output name>/.

applyDocumentTypeStyling

Syntax

<applyDocumentTypeStyling/>

Description

Applies document type specific stylesheets.

Location of the document type specific stylesheets

The stylesheets are searched in the following locations:

First a document-type-specific, publication-type-specific stylesheet is searched here:

<wikidata dir>/books/publicationtypes/<publication-type-name>/document-styling/<document-type-name>.xsl

If not found, then a document-type-specific stylesheet is searched here:

<wikidata dir>/books/publicationtypes/document-styling/<document-type-name>.xsl

Finally, if not found, a generic stylesheet is used:

webapp/daisy/books/publicationtypes/common/book-document-to-html.xsl
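
The lookup order above can be sketched as a simple first-match search. This is only an illustration of the documented fallback, not Daisy's actual implementation; the function name and its parameters are hypothetical.

```python
from pathlib import Path


def find_document_stylesheet(wikidata_dir: Path, webapp_dir: Path,
                             publication_type: str, document_type: str) -> Path:
    """Return the stylesheet to use, following the documented search order.

    Hypothetical helper: wikidata_dir and webapp_dir stand for the
    <wikidata dir> and webapp directories mentioned in the text.
    """
    base = wikidata_dir / "books" / "publicationtypes"
    candidates = [
        # 1. document-type-specific and publication-type-specific
        base / publication_type / "document-styling" / f"{document_type}.xsl",
        # 2. document-type-specific only
        base / "document-styling" / f"{document_type}.xsl",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # 3. generic stylesheet shipped with the webapp
    return (webapp_dir / "daisy" / "books" / "publicationtypes"
            / "common" / "book-document-to-html.xsl")
```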

Output of the document type specific stylesheets

In contrast with the Daisy Wiki, the document type specific stylesheets should always produce Daisy-HTML output. This is because later on a lot of processing still needs the logical HTML structure (such as header shifting, formatting cross references, ...). If you want to do output-medium specific things, you can leave custom attributes and elements in the output and later on interpret them in the stylesheet that will translate the HTML to its final format.

The output of the stylesheets should follow the following structure:

<html>
  <body>
   <h0 id="dsy<document id>"
       daisyDocument="<document id>"
       daisyBranch="<branch>"
       daisyLanguage="<language>">document name</h0>

   ... the rest of the content ...

  </body>
</html>

Since headers in a document start at h1, but the name of the document usually corresponds to the section title, the name of the document should be left in a <h0> tag. This will then later be corrected by the shiftHeaders task.

The id attribute on the <h0> element is required for proper resolving of links and cross references pointing to this document/book section.

The applyDocumentTypeStyling task writes its result files in the book instance in the following directory:

publications/<publication-name>/documents

addSectionTypes

Syntax

<addSectionTypes/>

Description

In the book definition, types can be assigned to sections (e.g. a type of 'appendix' might be assigned to a section). This task will add an attribute called daisySectionType to the <h0> tag of each document for which a section type is specified.

This task should be run after the applyDocumentTypeStyling task and will make its changes to the files generated by that task (thus no new files are written).

shiftHeaders

Syntax

<shiftHeaders/>

Description

This task is deprecated: it still exists but does nothing. Header shifting is now performed as part of the assembleBook task.

assembleBook

Syntax

<assembleBook output="filename"/>

Description

Assembles one big XML document containing all the content of the book, by combining all documents specified in the book definition. It also inserts headers for sections in the book definition that only specify a title. If a document contains included documents, these are merged in at the include position.

The <html> and <body> tags of the individual documents are removed in the process; the resulting assembled XML has just one <html> element containing one <body> element.

While the assembling is done, the headers in the documents are also shifted depending on their hierarchical nesting in the book definition. All headers are always shifted by at least 1, to move the h0 headers to h1, see the applyDocumentTypeStyling task.
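
As a rough illustration of header shifting, the sketch below renames every hN tag to h(N + shift). It is not the code used by assembleBook; the function name is hypothetical, and how the shift is derived from the nesting in the book definition is left to the caller (a shift of 1 is the documented minimum).

```python
import re

# Matches the start of an opening or closing single-digit header tag,
# e.g. "<h0" or "</h2". Tags such as <html> or <hr> do not match.
_HEADER_TAG = re.compile(r"<(/?)h(\d)\b")


def shift_headers(html: str, shift: int) -> str:
    """Rename every <hN> and </hN> tag to <h(N + shift)>."""
    def rename(match: re.Match) -> str:
        slash, level = match.group(1), int(match.group(2))
        return f"<{slash}h{level + shift}"
    return _HEADER_TAG.sub(rename, html)
```

With a shift of 1, the <h0> carrying the document name becomes an <h1>, and the document's own <h1> headers become <h2>.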

The output is written to the specified output path (in the book instance).

addNumbering

Syntax

<addNumbering input="filename" output="filename"/>

Description

Assigns numbers to headers (sections), figures and tables.

The numbering is done based on numbering patterns specified in the publication properties. The numbering is different for sections, figures or tables of different types.

For example, the following properties define the numbering patterns for sections of the type "default":

Property name           Property value
numbering.default.h1    1
numbering.default.h2    h1.1
numbering.default.h3    h1.h2.1

The property values are numbering patterns, which define the formatting of the number, following a certain syntax, explained below.

The properties for figures and tables are called "figure.<figuretype>.numberpattern" and "table.<tabletype>.numberpattern", respectively. The numbering of figures and tables happens per chapter (per h1-level). Figures and tables are only numbered if they have a caption (defined by the daisy-caption attribute).

For sections, the following additional properties can be defined:

  • numbering.<section-type>.increase-number: defines whether the section number must be increased when a section of this type is encountered. It can be useful to disable this for "anonymous" sections that do not require numbering, so that the numbering of the following sections simply continues as if the anonymous sections were not there.
  • numbering.<section-type>.reset-number: defines whether the section numbering must be restarted when a section of this type is encountered after a section of another type, inside the same section level.
  • numbering.<section-type>.hx.start-number: defines the initial number for sections of this type on level x (default: 1)

On the elements to which a number is assigned, the following attributes are added:

  • daisyNumber: the number formatted according to the number pattern
  • daisyPartialNumber: only the number of the element itself formatted according to the style indicated in the numbering pattern (1, i, I, a or A), without any of the other parts of the numbering pattern
  • daisyRawNumber: the unformatted number of the element.

Syntax of the numbering patterns

Each numbering pattern should contain exactly one of the following characters: 1, i, I, a or A. The number of the section (or figure/table) will be inserted at the location of that character. The character indicates the type of numbering (e.g. A for numbering with letters).

The numbers of ancestor sections can be referred to using 'h1', 'h2', ... up to 'h9'. The number of the highest ancestor which has a number can be referred to using 'hr' ("root header").

It is possible to retrieve text from the resource bundle of the current publication type by putting a resource bundle key between $ signs, for example $mykey$.

Any other characters used in the numbering pattern will be output as-is.
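
To make the pattern syntax more concrete, here is a minimal sketch of how such a pattern could be expanded. It is an interpretation of the rules above, not Daisy's implementation. All function names are hypothetical, and two things are assumptions: the caller supplies the already formatted numbers of the ancestors (keyed 'h1' ... 'h9' and 'hr'), and letter numbering continues with 'aa' after 'z'.

```python
import re

# One token per match: $key$, an ancestor reference (h1..h9, hr),
# the number placeholder (1, i, I, a or A), or any other single character.
_TOKEN = re.compile(r"\$([^$]*)\$|h([1-9r])|([1iIaA])|(.)", re.DOTALL)

_ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]


def to_roman(number: int) -> str:
    result = ""
    for value, symbol in _ROMAN:
        while number >= value:
            result += symbol
            number -= value
    return result


def to_letters(number: int) -> str:
    # Assumption: 1 -> a, 26 -> z, 27 -> aa
    result = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        result = chr(ord("a") + remainder) + result
    return result


def format_own_number(style: str, number: int) -> str:
    if style == "1":
        return str(number)
    if style == "i":
        return to_roman(number).lower()
    if style == "I":
        return to_roman(number)
    if style == "a":
        return to_letters(number)
    return to_letters(number).upper()  # style == "A"


def apply_pattern(pattern: str, number: int, ancestors: dict, bundle: dict) -> str:
    """Expand a numbering pattern for an element with the given raw number."""
    parts = []
    for match in _TOKEN.finditer(pattern):
        key, ancestor, style, literal = match.groups()
        if key is not None:
            parts.append(bundle[key])                 # resource bundle text
        elif ancestor is not None:
            parts.append(ancestors["h" + ancestor])   # ancestor section number
        elif style is not None:
            parts.append(format_own_number(style, number))
        else:
            parts.append(literal)                     # output as-is
    return "".join(parts)
```

For instance, under these assumptions the pattern h1.h2.1 for the third h3 inside section 2.4 would be expanded by calling apply_pattern("h1.h2.1", 3, {"h1": "2", "h2": "4"}, {}).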

verifyIdsAndLinks

Syntax

<verifyIdsAndLinks input="filename" output="filename"/>

Description

This task does two things:

  1. It performs some link-related checks: it warns about duplicate IDs, and about Daisy links and cross-references pointing to documents or IDs not present in the book. It also warns about images whose source starts with "file:", which is usually a mistake. All these warnings are written to the link log.
  2. It will assign IDs to the following elements that do not have an ID yet:
    • headers
    • images and tables which have a caption
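
The two responsibilities can be illustrated with a small sketch on a parsed document. This is not the task's actual code: the function names and the "gen" ID prefix are hypothetical, and only headers are handled (captioned images and tables are left out for brevity).

```python
import xml.etree.ElementTree as ET
from collections import Counter

HEADER_TAGS = {f"h{level}" for level in range(1, 10)}


def find_duplicate_ids(root: ET.Element) -> list:
    """Return the IDs that occur on more than one element."""
    counts = Counter(el.get("id") for el in root.iter() if el.get("id"))
    return sorted(value for value, count in counts.items() if count > 1)


def assign_missing_header_ids(root: ET.Element, prefix: str = "gen") -> None:
    """Give every header without an ID a generated one that is not yet in use."""
    used = {el.get("id") for el in root.iter() if el.get("id")}
    counter = 0
    for el in root.iter():
        if el.tag in HEADER_TAGS and not el.get("id"):
            counter += 1
            while f"{prefix}{counter}" in used:
                counter += 1
            el.set("id", f"{prefix}{counter}")
            used.add(el.get("id"))
```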

addIndex

Syntax

<addIndex input="..." output="..."/>

Description

This task generates the index based on the index entries in the document (which are marked with <span class="indexentry">...</span>).

It collects all index entries, sorts them, creates a hierarchy in them (by splitting index entries on any colon that appears in them), and writes out the original document with the index appended just before the closing </body> tag. The index has the following structure:

<h1 id="index">Index</h1>
<index>
  <indexGroup name="A">
    <indexEntry name="A...">
      <id>...</id>
      <id>...</id>
      [... more id-s ...]
      [... nested index entries ...]
    </indexEntry>
    [... more index entries ...]
  </indexGroup>
  [... more index groups ...]
</index>

The <indexGroup> elements combine index entries based on their first letter. Any entries before the letter A are grouped in an <indexGroup> without a name attribute.

The <id> elements inside the indexEntries list all the IDs of the indexentry-spans that define this index entry.

Note that this task will also assign IDs to the indexentry spans.
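
The collecting, splitting and grouping steps described above can be sketched as follows. This is a simplified illustration with hypothetical names, not the addIndex implementation. It assumes case-insensitive sorting of the top-level entries and does not sort nested entries.

```python
def build_index(entries):
    """Group index entries by first letter and nest them on colons.

    entries: iterable of (entry_text, span_id) pairs.
    Returns {group_name: {entry_name: {"ids": [...], "children": {...}}}},
    where group_name is None for entries sorting before the letter A.
    """
    root = {}
    for text, span_id in entries:
        children = root
        node = None
        # "XSLT:stylesheets" becomes entry "stylesheets" nested under "XSLT"
        for name in (part.strip() for part in text.split(":")):
            node = children.setdefault(name, {"ids": [], "children": {}})
            children = node["children"]
        node["ids"].append(span_id)

    groups = {}
    for name in sorted(root, key=str.lower):
        first = name[:1].upper()
        group = first if first >= "A" else None
        groups.setdefault(group, {})[name] = root[name]
    return groups
```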

addTocAndLists

Syntax

<addTocAndLists input="filename" output="filename"/>

Description

This task creates the Table Of Contents and the lists of figures and tables.

Table Of Contents (TOC)

The TOC is created based on the HTML header elements (h1, h2, etc.). Only headers up to a certain level are included, which is configurable using the publication property called "toc.depth", whose value should be an integer number (1, 2, etc.).

The TOC is inserted at the beginning of the document, after the <body> opening tag, and has an XML structure like this:

<toc>
  <tocEntry targetId="..." daisyNumber="..." daisyPartialNumber="..." daisyRawNumber="...">
    <caption>...</caption>
    [... nested tocEntry elements ...]
  </tocEntry>
  [... more tocEntry elements ...]
</toc>

The targetId attribute is the ID of the corresponding header. The daisyNumber, daisyPartialNumber and daisyRawNumber attributes are only present if the corresponding header had a number assigned by the addNumbering task. See the description of that task for the meaning of these attributes.

The caption element contains the content of the header tag, including any mixed content. However, footnotes or index entries which might occur in the heading are not copied into the caption element.
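
The nesting of tocEntry elements and the effect of toc.depth can be sketched as follows. This is an illustrative approximation with hypothetical names; the headers are assumed to be given in document order as (level, id, caption) tuples, and the numbering attributes are omitted.

```python
def build_toc(headers, depth):
    """Build a nested TOC from (level, target_id, caption) tuples.

    Headers deeper than `depth` (the toc.depth property) are skipped.
    """
    toc = []
    stack = []  # (level, entry) pairs for the currently open ancestors
    for level, target_id, caption in headers:
        if level > depth:
            continue
        entry = {"targetId": target_id, "caption": caption, "children": []}
        # Close entries that are at the same or a deeper level
        while stack and stack[-1][0] >= level:
            stack.pop()
        parent_children = stack[-1][1]["children"] if stack else toc
        parent_children.append(entry)
        stack.append((level, entry))
    return toc
```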

Lists of figures and lists of tables

Lists of figures and lists of tables are created per type of figure or table. The types for which the lists should be created have to be specified in two properties:

  • list-of-figures.include-types
  • list-of-tables.include-types

These properties should contain a comma separated list of types. For figures and tables that do not have a specific type assigned, the type is assumed to be "default". For example to have a list of all default figures, and a list of all figures with type "screenshot", one would set the list-of-figures.include-types property to "default,screenshot". Note that the order in which the types are specified is the order in which the lists will be inserted in the output.

The lists are inserted in the output after the TOC, and have an XML structure like this:

<list-of-figures type="...">
  <list-item targetId="..." daisyNumber="..." daisyPartialNumber="..." daisyRawNumber="...">the caption</list-item>
  [... more list-item elements ...]
</list-of-figures>

For tables the root element is "list-of-tables".

applyPipeline

Syntax

<applyPipeline input="..." output="..." pipe="..."/>

Description

This task calls a Cocoon pipeline in the publication type sitemap.

The pipeline is supplied with the following parameters (flow context attributes):

  • bookXmlInputStream: an inputstream for the file specified in the input attribute
  • bookInstanceName
  • bookInstance (the BookInstance object)
  • locale: java.util.Locale object for the locale in which to publish the book
  • localeAsString: the locale as a string
  • pubProps: java.util.Map containing the publication properties
  • bookMetadata: java.util.Map containing the book metadata
  • publicationTypeName
  • publicationOutputName

The pipe attribute specifies the pipeline to be called (thus the path to be matched by a matcher in the sitemap). The output of the pipeline execution is saved to the file specified in the output attribute.

For practical usage examples, see the default publication types included with Daisy.

copyResource

Syntax

<copyResource from="..." to="..."/>

Description

Copies a file or directory (recursively) from the publication type to the book instance. As with all other tasks, the "to" path will automatically be prepended with the directory of the current publication output (/publications/<publication output name>/).

splitInChunks

Syntax

<splitInChunks input="..." output="..." firstChunkName="..."/>

Description

Groups the input into chunks. New chunks are started on each <hX>, in which X is configurable using the publication property "chunker.chunklevel".

The output will have the following format:

<chunks>
  <chunk name="...">
    <html>
      <body>
        [content of the chunk]
      </body>
    </html>
  </chunk>
  [... more chunk elements ...]
</chunks>

By default the name of each chunk will be the ID of the header where the new chunk started, except for the first chunk for which the chunk name can optionally be defined using the firstChunkName attribute on the splitInChunks task element.

The original <html> and <body> elements are discarded, the new <chunks> element will be the root of the output. New <html> and <body> elements are inserted into each chunk, so that the content of each chunk forms a stand-alone HTML document.
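
A minimal sketch of the chunking logic, using Python's ElementTree, is shown below. It is not the splitInChunks implementation. The function name is hypothetical, it only looks at direct children of <body>, and it assumes that the first chunk takes the firstChunkName value when one is given.

```python
import xml.etree.ElementTree as ET


def split_in_chunks(html_text, chunk_level, first_chunk_name=None):
    """Return a <chunks> element, starting a new chunk at each <hX> header."""
    body = ET.fromstring(html_text).find("body")
    header_tag = f"h{chunk_level}"   # X comes from chunker.chunklevel
    chunks = ET.Element("chunks")
    chunk_body = None
    for child in list(body):
        if chunk_body is None or child.tag == header_tag:
            # Default chunk name: the ID of the header starting the chunk
            name = child.get("id", "")
            if chunk_body is None and first_chunk_name is not None:
                name = first_chunk_name
            chunk = ET.SubElement(chunks, "chunk", {"name": name})
            html = ET.SubElement(chunk, "html")
            chunk_body = ET.SubElement(html, "body")
        chunk_body.append(child)
    return chunks
```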

writeChunks

Syntax

<writeChunks input="..." outputPrefix="..." chunkFileExtension="..."
             applyPipeline="..." pipelineOutputPrefix="..." chunkAfterPipelineFileExtension="..."/>

Description

Writes the content of individual chunks, as created by the splitInChunks task, to separate XML files.

The attributes applyPipeline, pipelineOutputPrefix and chunkAfterPipelineFileExtension are optional. If present, the pipeline specified in the applyPipeline attribute will be applied to each of the chunks, and the result will be written to a file with the same name as the original chunk, but with the extension specified in the attribute chunkAfterPipelineFileExtension.

makePDF

Syntax

<makePDF input="..." output="..."/>

Description

Transforms an XSL-FO file to PDF. The current implementation uses the (commercial) Ibex PDF serializer. [todo: note on serialized execution]

getDocumentPart

Syntax

<getDocumentPart propertyName="..." propertyOrigin="..." partName="..." saveAs="..." setProperty="..."/>

Description

This task retrieves the content of a part of a Daisy document. The Daisy document is specified using a "daisy:" link in a publication property or book metadata attribute.

  • propertyName: specifies the name of the property
  • propertyOrigin: either 'publication' for a publication property or 'metadata' for a book metadata attribute
  • partName: the name of the part from which to get the data. For example, "ImageData" for images.
  • saveAs: where the data should be saved.
  • setProperty (optional): specifies the name of a (publication) property which will be set to true if the part data has been effectively retrieved

This task can be useful when you let the user specify e.g. a logo to put in the header or footer by specifying a daisy link in a publication/metadata property.

copyBookInstanceResources

Previously (Daisy 1.4) this was called copyBookInstanceImages. The old name currently still works for backwards compatibility.

Syntax

<copyBookInstanceResources input="..." output="..." to="..."/>

Description

Copies all resources which are linked to using the "bookinstance:" scheme to the directory specified in the to attribute, unless the resource would already be in the output directory (thus when the resource link starts with "bookinstance:output/"). The links are adjusted to the new path and the resulting XML is written to the file specified in the output attribute.

The following resource links are taken into account:

HTML element    Corresponding attribute
img             src
a               href
object          data
embed           src

This task is ideally suited to copy e.g. the images to the output directory when publishing as HTML (for PDF, this is not needed since the images are embedded inside the PDF file).

zip

Syntax

<zip/>

Creates a zip file containing all files in the output directory. The zip file itself is also written to the output directory; its name is the name of the book instance, followed by a dash and the name of the publication.

custom

Syntax

<custom class="..." [... any other attributes ...]/>

Description

This task provides a hook for implementing your own tasks. A publication process task should implement the following interface:

org.outerj.daisy.books.publisher.impl.publicationprocess.PublicationProcessTask

The implementation class can have three possible constructors (availability checked in the order listed here):

  • A constructor taking an XMLBeans XmlObject object as argument. The XmlObject will represent the <custom> XML element. This constructor is useful if you want to access nested XML content of the <custom> element (for advanced configuration needs).
  • A constructor taking a java.util.Map as argument. The Map will contain all attributes of the <custom> element.
  • A default constructor (no arguments)

renderSVG

Syntax

This task currently has no native tag, so it should be used through the custom task capability.

<custom class="org.outerj.daisy.books.publisher.impl.publicationprocess.SvgRenderTask"
        input="..." output="..."/>

The following additional optional attributes are available:

  • outputPrefix: where the generated SVGs should be stored in the book instance, relative to the output of the current publication process. Default: from-svg/
  • format: jpg or png. Default: jpg. The Ibex XSL-FO renderer doesn't seem to handle PNGs.
  • dpi: dots per inch. Default: 96. For good quality, set this to e.g. 250.
  • quality: for JPEGs, a value between 0 and 1. Default: 1
  • backgroundColor: color for transparent areas in images, specified in a form like #FFFFFF. Default: leave transparent
  • enableScripts: whether scripts in the SVG should be executed. Default: false. (Note: the Rhino version included with Cocoon 2.1 doesn't work well together with Batik. This can be resolved by upgrading to Rhino 1.6-RC2, though this also requires a recompile of Cocoon. Ask on the mailing list if you need help.)
  • maxPrintWidth: maximum value for the generated print-width attribute, in inches. The other dimension scales proportionally. Default: 6.45
  • maxPrintHeight: maximum value for the generated print-height attribute, in inches. The other dimension scales proportionally. Default: 8.6

Description

Parses the file specified in the input attribute, and reacts to all renderSVG tags it encounters:

<rs:renderSVG xmlns:rs="http://outerx.org/daisy/1.0#bookSvgRenderTask"
              bookStorePath="..."/>

The bookStorePath attribute points to some resource in the book instance, which will be interpreted as an SVG file and rendered. The renderSVG tag is removed, and replaced with an <img> tag, with:

  • a src attribute pointing to the generated image (the produced image file name is the same as the original file name, but in a different directory as defined by the outputPrefix attribute)
  • height and width attributes specifying the size in pixels (for HTML)
  • print-height and print-width attributes specifying the size in a form suited for XSL-FO (e.g. 3in), depending on the specified dpi.
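
The relation between pixel size, dpi and the print dimensions can be sketched with some simple arithmetic. The exact calculation used by the task is not documented here, so this rests on an assumption: the print size in inches is the pixel size divided by the dpi, after which both dimensions are scaled down by the same factor until they fit within maxPrintWidth and maxPrintHeight. The function name is hypothetical.

```python
def print_size(width_px, height_px, dpi=96,
               max_width_in=6.45, max_height_in=8.6):
    """Return (print_width, print_height) in inches for a rendered image."""
    width_in = width_px / dpi
    height_in = height_px / dpi
    # Scale both dimensions by the same factor so that neither maximum
    # is exceeded; never scale up.
    scale = min(1.0, max_width_in / width_in, max_height_in / height_in)
    return width_in * scale, height_in * scale
```

Under this assumption, an image of 960 by 480 pixels at 96 dpi would measure 10 by 5 inches and would be scaled down so that its print width equals the maximum of 6.45 inches.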

To make use of this task, you will typically download the content of some document part in the book instance using <requiredParts/>, and have a doctype XSL which generates the renderSVG tag. This will eventually be illustrated in a tutorial on the community Wiki.

The renderSVG task requires Batik, which is (at the time of this writing) not included by default in the Daisy Wiki.

callPipeline

Syntax

This task currently has no native tag, so it should be used through the custom task capability.

<custom class="org.outerj.daisy.books.publisher.impl.publicationprocess.CallPipelineTask"
        input="..." output="..." outputPrefix="..."/>

The outputPrefix attribute is optional and defaults to after-call-pipeline/.

Description

Parses the file specified in the input attribute, and reacts to all callPipeline tags it encounters. This is different from the applyPipeline task, which applies a Cocoon pipeline to the file specified in the input attribute itself. This task is ideally suited to do some processing on part content downloaded using <requiredParts/>. The callPipeline tag is typically produced in the document-type specific XSLT for the document type containing the part.

Syntax for the callPipeline tag:

<cp:callPipeline xmlns:cp="http://outerx.org/daisy/1.0#bookCallPipelineTask"
                 bookStorePath="..."
                 pipe="..."
                 outputPrefix="..."
                 outputExtension="...">
   ... any nested content ...
</cp:callPipeline>

When such a tag is encountered, the Cocoon pipeline specified in the pipe attribute will be applied on the document specified in the bookStorePath attribute. The pipeline is a pipeline in the sitemap of the current publication type.

The outputPrefix attribute is optional, and can specify a different outputPrefix from the one globally configured. The outputExtension attribute is optional too, and specifies an extension for the result file. The base file name is the same as the current filename (the one specified in the bookStorePath attribute).

The cp:callPipeline tag itself is removed from the output, however its nested content is passed through. For all elements nested inside <cp:callPipeline>, the attributes will be searched for the string {callPipelineOutput}, which will be replaced with the path of the produced file. For example, if you want to transform some XML file to SVG and then render it with the renderSVG task, you can do something like:

<cp:callPipeline bookStorePath="something"
                 outputPrefix="something/"
                 pipe="MyPipe">
  <rs:renderSVG bookStorePath="{callPipelineOutput}"/>
</cp:callPipeline>

If you would put the above fragment in an XSL, don't forget to escape the braces by doubling them: "{{callPipelineOutput}}".
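
The placeholder substitution described above can be sketched with ElementTree as follows. This only illustrates the behaviour, it is not the CallPipelineTask code; the function name is hypothetical, and the output path passed in is whatever file the pipeline produced.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = "{callPipelineOutput}"


def fill_output_path(call_pipeline_element, output_path):
    """Replace the placeholder in the attributes of all nested elements.

    Returns the nested elements, which are what remains in the output
    once the cp:callPipeline tag itself has been removed.
    """
    for element in call_pipeline_element.iter():
        if element is call_pipeline_element:
            continue  # attributes of cp:callPipeline itself are not touched
        for name, value in list(element.attrib.items()):
            element.set(name, value.replace(PLACEHOLDER, output_path))
    return list(call_pipeline_element)
```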
