This document specifies the DevEver Documentation and Specification Language (DEDOC). DEDOC is an XML schema and markup language intended for writing technical specifications, standards and documentation. For more information on DEDOC and its supporting tooling, see the DEDOC website.
How DEDOC is specified. Note that this specification is itself written as a Guile Scheme program, which, when executed, outputs XHTML with embedded RELAX NG schema definitions. Thus, this document is both human-readable and machine-readable. Generally, the RELAX NG schema will be automatically extracted from this document so that it can be used for validation.
Because the RELAX NG schema is written as part of this document, this document is the canonical source for the RELAX NG schema definitions for DEDOC. Thus, this document constitutes the normative specification for DEDOC for the purposes of both human-readable and machine-readable expressions.
Moreover, the expression of DEDOC herein, in which narrative is interwoven with RELAX NG definitions shown inline, constitutes an application of “literate programming” methodology to XML schema definition. This is directly inspired by other attempts to apply literate programming to XML schema definition while keeping a single source of truth for the schema, most notably TEI's “One Document Does It All” (TEI ODD) model.
The choice of XHTML as the schema for this document was made to avoid circular dependencies on DEDOC.
Purposes of DEDOC. DEDOC is intended to support writing documentation once and producing multiple production-quality output formats, including:
An XML-based markup language which aims to cover much of this ground already exists, namely DocBook, but DocBook suffers from several issues:
When designing a universal source format for documentation which aims to target the production of multiple output forms, the means by which diagrams are to be expressed becomes a potentially complicated question. In particular:
What complicates this matter further is that diagrams may contain text. Where a specific typesetting system such as TeX is used, it is likely to be jarring if text in a diagram is rendered differently from the main body text, as there can be distinctive differences in rendering. Conversely, if a diagram has its text typeset in TeX because the diagram was generated via TeX, it may be jarring for such a diagram to appear inside a web page.
Thus, the following determinations are made:
Firstly, that it may be unavoidable for different input representations to be used for the generation of different output formats. For example, a diagram might be provided as two files, one to be used for XHTML output and one to be used for PDF output.
This is not constrained to just diagrams. For example, a piece of mathematics might need to be expressed both as TeX code (for use when generating PDF output) and as MathML code (for web use). (Though ConTeXt does support MathML input, it is anticipated that there will be cases where its MathML support is inadequate for complex formulas.)
Thus a general solution of forks is adopted. A fork is a construct inside a DEDOC document which offers several alternative representations of the same content; a processor consuming the document chooses exactly one of them, and is free to choose the one most appropriate to it. For example, a math formula might be expressed in a fork containing both MathML and TeX representations, or a diagram might be expressed in a fork containing SVG, PDF and PNG representations.
Secondly, support for diagrams receives specific attention. A diagram is essentially expressed as a fork (though it may be a degenerate fork containing only one representation). Diagrams may be expressed in a variety of formats, such as external SVG, PDF or raster files, or as some kind of program or program fragment which, when executed, generates the desired diagram. An example of the latter is the TikZ DSL for drawing diagrams, which has been implemented on top of TeX, but there are also countless examples of non-TeX programs, such as Asymptote, which are designed to consume some kind of textual input and generate diagrammatic output. This constitutes a very convenient and time-saving way for developers to express (and version control) diagrams which might otherwise have to be created manually inside a graphical editor and versioned as opaque binary files.
A diagram fork thus specifies one or more representations; each representation specifies its format, and either its text or a filename containing the representation data. If multiple representations are provided, an output generator is free to choose the one best suited to its output format.
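The selection rule described above can be illustrated with a small Python sketch. It is not part of DEDOC or its tooling: the `Representation` class, the `choose_representation` function and the format labels are hypothetical names introduced only for illustration, and a real processor may rank representations differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Representation:
    """One alternative inside a fork (hypothetical model, not the DEDOC schema)."""
    format: str                     # e.g. "tikz", "svg", "png"
    text: Optional[str] = None      # inline representation data, if any
    filename: Optional[str] = None  # external file holding the data, if any


def choose_representation(fork: Sequence[Representation],
                          preferences: Sequence[str]) -> Representation:
    """Return exactly one representation from the fork.

    `preferences` lists the formats a processor can consume, best first.
    """
    for wanted in preferences:
        for rep in fork:
            if rep.format == wanted:
                return rep
    raise LookupError("no representation in this fork suits the processor")


diagram = [
    Representation("tikz", text=r"\draw (0,0) -- (1,1);"),
    Representation("svg", filename="line.svg"),
]

# A hypothetical XHTML generator prefers ready-made vector images.
xhtml_choice = choose_representation(diagram, ["svg", "png", "tikz"])
# A hypothetical TeX-based PDF generator prefers TikZ source.
pdf_choice = choose_representation(diagram, ["tikz", "pdf", "svg"])
```

A degenerate fork with a single representation works the same way: the processor either accepts that representation or reports that it cannot handle the fork.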
In some cases, the desired output format may be especially “aligned” with a provided representation. For example, if a diagram is provided as TikZ code, and the output is being produced using a TeX processor, rather than generating the diagram as a vector file and embedding it, the TikZ code can be directly executed inside the TeX environment during the typesetting process. This has the advantage that the diagram inherits any font and other settings applied to the TeX document and thus matches the look and feel of the rest of the output as closely as possible.
In other cases, the only provided representations may be “unaligned”. If the provided representation is a raster or vector image, it is simply included directly. Another example of an unaligned representation is TikZ input code where the desired output is XHTML; in this case, TeX must be invoked for each such diagram to produce SVG output for that diagram alone. Compare this with the case where TeX is being used for typesetting, in which the TikZ code is simply included in the document and does not result in a separate invocation of TeX. This process should be managed automatically, so that TikZ can be used to generate diagrams for both XHTML and TeX (PDF) output methods.
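The distinction between aligned and unaligned representations can be summarised as a small decision function. This is a sketch under stated assumptions: the function name `plan_diagram`, the format labels and the three outcome strings are hypothetical, and the set of formats treated as ready-made images is an illustrative guess rather than anything this specification defines.

```python
# Formats assumed (for illustration only) to be ready-made image files.
IMAGE_FORMATS = {"svg", "pdf", "png"}


def plan_diagram(rep_format: str, output: str) -> str:
    """Classify how a diagram representation reaches a given output.

    Returns one of three hypothetical outcomes:
      "inline"  - aligned: the source runs inside the main typesetting run
      "include" - the representation is an image and is included directly
      "convert" - unaligned: an external tool must be invoked per diagram
    """
    if rep_format == "tikz" and output == "tex":
        return "inline"
    if rep_format in IMAGE_FORMATS:
        return "include"
    return "convert"
```

For example, `plan_diagram("tikz", "xhtml")` yields `"convert"`, which corresponds to invoking TeX separately to produce SVG for that diagram alone.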
Numerous other text-input diagram generators are available and it is anticipated that these will generally require the invocation of some external program for both the TeX and XHTML output pathways; thus the diagram support in DEDOC must be extensible to arbitrary external processing tools.
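One way to picture this extensibility requirement is a registry keyed by source and target format. The sketch below is purely illustrative: `CONVERTERS`, `register` and `convert` are hypothetical names, and the registered converter is a stand-in that returns a placeholder string instead of invoking a real external program.

```python
from typing import Callable, Dict, Tuple

# Maps (source format, target format) to a converter function.
CONVERTERS: Dict[Tuple[str, str], Callable[[str], str]] = {}


def register(source_format: str, target_format: str):
    """Decorator which adds a converter to the registry."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        CONVERTERS[(source_format, target_format)] = func
        return func
    return decorator


@register("tikz", "svg")
def tikz_to_svg(source: str) -> str:
    # Stand-in only: a real converter would invoke TeX as an external process.
    return f"<!-- svg rendered from {len(source)} characters of TikZ -->"


def convert(source_format: str, target_format: str, source: str) -> str:
    """Look up and run the converter for the requested format pair."""
    try:
        converter = CONVERTERS[(source_format, target_format)]
    except KeyError:
        raise LookupError(
            f"no converter from {source_format} to {target_format}"
        ) from None
    return converter(source)
```

Supporting a further diagram generator then amounts to registering one more function, without changing the lookup logic.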
Although diagrams are modelled as forks, so that an author can provide multiple input representations if truly necessary to maintain acceptable fidelity across all output formats, it is anticipated that in the majority of cases a single source will yield diagrams of acceptable quality, and DEDOC focuses on ensuring that this is the case.
The namespace URI for DEDOC is “https://www.devever.net/ns/dedoc”.
Where used, namespace prefixes in this document refer to the following namespaces:
namespace mml = "http://www.w3.org/1998/Math/MathML"
namespace xlink = "http://www.w3.org/1999/xlink"
namespace bib = "https://www.devever.net/ns/bib"
namespace local = ""
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
This page is RDDL-enabled. The following schema artifacts generated from this document are available:
For more information on the tooling surrounding DEDOC, see the DEDOC website.
The DEDOC language is composed of three layers: structural constructs, block-level constructs and inline constructs.
Inline constructs contain text and other inline constructs, block-level constructs contain other block constructs and inline constructs, and structural constructs contain other structural constructs or block-level constructs. Inline constructs are constructs which occur inside a single paragraph and often tend to relate to the horizontal formatting of text. Block constructs include the paragraph (p) element and other constructs and tend to relate to the vertical layout of text. Structural constructs group blocks and create a hierarchical logical document structure.
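The containment rules in the paragraph above can be written down as a lookup table. This sketch transcribes that summary literally and nothing more: the layer labels and the `may_contain` function are hypothetical, and special cases such as breakouts are deliberately not modelled.

```python
# Which layers may appear directly inside which, per the summary above.
ALLOWED_CHILDREN = {
    "structural": {"structural", "block"},
    "block": {"block", "inline"},
    "inline": {"inline", "text"},
}


def may_contain(parent_layer: str, child_layer: str) -> bool:
    """Return True if a construct of child_layer may sit inside parent_layer."""
    return child_layer in ALLOWED_CHILDREN.get(parent_layer, set())
```

For instance, `may_contain("structural", "inline")` is false under this table, because inline constructs reach a structural construct only through a block such as a paragraph.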
The remainder of this document introduces the three layers, starting with the lowest layer and building upwards.
The root element is doc:
An entire document.
The control information comprises metadata which does not appear in the document body itself, and which should not necessarily be rendered.
Contains information about a build process which produced a DEDOC XML file.
Contains a single-line VCS revision summary. The form attribute indicates whether a full or abbreviated revision summary is used. For example, a short VCS revision summary might contain only a few hexadecimal characters to cryptographically identify the revision, whereas the long summary may contain the full hash. Note that there is no set form for either of these strings and they are not required to contain only hexadecimal characters.
Contains a full-length cryptographic identifier for the VCS revision from which the DEDOC XML file was built. If the VCS being used does not have a suitable cryptographic identifier, the best available unambiguous identifier should be used. A + should be appended if the tree was 'dirty' when building, meaning that changes may have been made since the referenced revision.
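The “+” convention for dirty trees can be sketched in a couple of lines of Python. The function names and the example hash are hypothetical; how a build process obtains the hash and detects a dirty tree depends on the VCS and is not covered here.

```python
def revision_identifier(full_hash: str, dirty: bool) -> str:
    """Append "+" when the tree had uncommitted changes at build time."""
    return full_hash + "+" if dirty else full_hash


def is_dirty(identifier: str) -> bool:
    """Report whether an identifier carries the dirty marker."""
    return identifier.endswith("+")


# Hypothetical 40-character hash, used only as example data.
example = revision_identifier("0123456789abcdef0123456789abcdef01234567", dirty=True)
```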
Contains a timestamp for the VCS revision from which the DEDOC XML file was built.
The document body contains structural constructs.
A section. A section may begin with block-level constructs which belong directly to it, but block-level constructs directly within a given section may not come after a subsection of that section. Sections may nest to arbitrary depth, but specific output systems may have limits on the depth supported.
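The ordering constraint on a section's direct children can be checked with a single pass. In this sketch a section is modelled, as an assumption made only for illustration, as a list of the strings "block" and "section" describing its direct children in document order; the function name is hypothetical.

```python
from typing import Sequence


def blocks_precede_subsections(children: Sequence[str]) -> bool:
    """Return True if no direct block follows a direct subsection."""
    seen_subsection = False
    for kind in children:
        if kind == "section":
            seen_subsection = True
        elif seen_subsection:
            # A block-level construct appeared after a subsection.
            return False
    return True
```

A schema would normally enforce this ordering through its content model; the function only restates the rule in executable form.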
Container of header and metadata information for structural (and formal float) constructs which have a title.
The title of a structural construct, such as a section or document.
The number of a section.
Block constructs contain other block constructs, inline constructs or text, and generally relate to the vertical layout of text in a document.
Denotes a paragraph, which contains only inline constructs and which is the most commonly used construct to place inline constructs in a block construct environment.
An unordered list. May contain only <li> elements, which constitute the elements of the list.
An ordered list. May contain only <li> elements, which constitute the elements of the list.
A list item in an ordered or unordered list.
A dictionary list, which maps keys to values.
A dictionary list item.
The key of a dictionary list item.
The body of a dictionary list item.
Formal floats are numbered containers such as “figures” and “tables”. These form separate numbering namespaces independent of section numbering. They do not necessarily contain actual tables.
A figure is a formal float numbered with the prefix word “Figure”. It is generally used to show diagrams but need not be. It contains block constructs.
A table is a formal float numbered with the prefix word “Table”. It is generally used to show tabular data but need not be. It contains block constructs.
An equation is a formal float. It is used for display math and contains math code directly; it cannot be used for other purposes. This is a forking construct.
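The independent numbering of formal floats can be illustrated with one counter per kind of float. This is a minimal sketch: the `number_floats` function is hypothetical, and only the “Figure” and “Table” prefixes named above are used as example data.

```python
from collections import defaultdict
from typing import List, Sequence


def number_floats(kinds: Sequence[str]) -> List[str]:
    """Label floats in document order, counting each kind separately."""
    counters = defaultdict(int)
    labels = []
    for kind in kinds:
        counters[kind] += 1
        labels.append(f"{kind} {counters[kind]}")
    return labels


labels = number_floats(["Figure", "Table", "Figure", "Table", "Figure"])
```

Here the third figure is labelled “Figure 3” even though two tables precede it, since the counters never interact with each other or with section numbers.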
Verbatims are blocks of text which are laid out in monospaced, verbatim form with no elision of spaces. They are typically used for displaying source code fragments. Note that unlike e.g. LaTeX verbatims, they can contain other markup.
A generic code listing verbatim. This should be your default choice of verbatim if in doubt.
TODO
Inline constructs contain text and other inline constructs, and generally relate to the horizontal formatting of text within a given paragraph.
A “semantic phrase” is one or a few words annotated with their semantic meaning so that they can be specially typeset where appropriate. Examples of semantic phrases that appear in many manuals are typed commands, class names, RFC 2119 keywords, etc.
A proword is a word or phrase with normative power in the context of a standard or specification. Examples include RFC 2119 capitalized words in RFCs, and the phrases “shall”, “shall not”, “should”, “should not”, “may”, “may not”, “must” and “must not” in ISO standards.
A procedure name. Used to refer to a procedure by name in prose.
A keyword. Usually typeset in monospace.
Inline mathematics. This is also a fork construct, and can therefore contain multiple representations of the same mathematics.
A breakout is a construct which is considered a block, and which can contain blocks, yet which is allowed to appear in an inline context.
A footnote defined inline. A footnote contains block constructs.
Inline constructs which reference other documents, or other constructs in the same document. Some of these are also considered semantic phrases.
A use in prose of a term which was previously defined. This is used to link the item of terminology back to its definition site.
The optional attribute “sp” specifies whether this use of the term is singular or plural.
Inline reference to another construct in the same or another document, generating a hyperlink where possible. The text is manually specified.
Inline citation. This differs from link in that the text of the hyperlink is generated automatically.
Though discouraged, some elements are defined which can be used to express a specific typesetting request. This should only be done if no alternatives are suitable.
Request emphasis (generally represented as italics). Avoid where possible.
Request typesetting in monospace. Avoid using this if an appropriate semantic phrase element is available.
Provides constructs which should not be used unless absolutely necessary, because they expose the semantics of underlying typesetting systems.
TeX passthrough. The TeX code specified is executed. For non-TeX output, this element and its contents are removed and ignored.
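For a non-TeX output path, removing passthrough elements might look like the following sketch, which uses Python's standard `xml.etree.ElementTree` module. The element name `tex` and the sample markup are assumptions made for illustration only; they are not taken from the DEDOC schema. Text that follows a removed element is kept, since only the element and its contents are to be ignored.

```python
import xml.etree.ElementTree as ET


def strip_passthrough(parent: ET.Element, tag: str = "tex") -> None:
    """Remove every `tag` element below parent, keeping surrounding text."""
    previous = None
    for child in list(parent):
        if child.tag == tag:
            # ElementTree stores the text after an element in its tail,
            # so move the tail onto the preceding node before removal.
            tail = child.tail or ""
            if previous is None:
                parent.text = (parent.text or "") + tail
            else:
                previous.tail = (previous.tail or "") + tail
            parent.remove(child)
        else:
            strip_passthrough(child, tag)
            previous = child


paragraph = ET.fromstring("<p>a<tex>\\vfill</tex>b<em>c</em><tex>x</tex>d</p>")
strip_passthrough(paragraph)
result = ET.tostring(paragraph, encoding="unicode")
```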
Lint is semantically meaningless text wrapped in an element designating it as such. It is usually used to contain spacing or punctuation between other things.