From one to many: AsciiDoc converts a text file to various output formats
Single Source
AsciiDoc syntax along with its eponymous command lets users create a text document with unobtrusive markup and convert it to a variety of output formats.
Write once, publish many – the idea behind AsciiDoc [1] is not new. The AsciiDoc syntax was created as a simple method of editing DocBook documents and has established itself as a more or less ubiquitous document format that acts as a source for a variety of other output formats.
AsciiDoc is both in wide use and actively developed. Even publishing companies accept manuscripts in this format or use it internally. The system comprises a source text and a converter that converts the source into the desired output.
The asciidoc command accepts three document types with the -d switch (book, article, and manpage), each with default front and back matter (Table 1). The software uses back ends to generate various formats with the -b switch (Table 2). The default document type is article.
Table 2
AsciiDoc Back Ends
Back End | Formats |
---|---|
docbook5 | DocBook (current version), PDF |
docbook45 | DocBook (widespread version), PDF |
xhtml11 | XHTML 1.1 (default) |
html4 | HTML 4 |
html5 | HTML5 |
slidy | (X)HTML presentations |
wordpress | Websites, blogs, CMS |
latex | Standard LaTeX (for PDF Fineprint) |
epub | eBooks |
Table 1
AsciiDoc Document Types
Type | Default Sections |
---|---|
book | dedication, preface, appendix, bibliography, glossary, index, colophon |
article | abstract, appendix, glossary, bibliography, index |
manpage | NAME*, SYNOPSIS* |
* Mandatory sections
The design goal for the source text markup was a format that is easily understandable for humans but that still offers relatively advanced options, including support for all of the usual tags and structures, as well as hierarchical layers for structuring text, references within the document, and URLs for external content. Beyond this, the software supports the ability to embed keywords, indexes and footnotes, literary references, images, and tables.
Text
AsciiDoc borrows many popular conventions for marking up plain text files. You will be familiar with many of these and probably have used some of them already: for example, underlining a title with equals signs or a heading with dashes, or marking a bullet point with an asterisk at the start of the line.
Paragraphs, probably the most important element for structuring texts, are created by an empty line between blocks of text. If the body text of a paragraph does not start in the first column (i.e., if you have at least one space or a tab), AsciiDoc treats the whole paragraph differently, applying the formatting literally without interpretation and using monospaced font in the output.
AsciiDoc also supports a special type of paragraph formatting: You can add an instruction of the form [type] on the otherwise empty line before the paragraph (Listing 1); AsciiDoc differentiates between four types (Table 3).
Listing 1
Listing Paragraph Style
Table 3
Paragraph Types
Instruction | Result |
---|---|
verse | Normal typeface taking hard line breaks into account (left-justified). |
quote | Normal typeface, balanced typography. |
listing | Monospace font, often with a background or border; takes hard line breaks into account; left-justified. |
literal | Monospace font; takes hard line breaks into account; left-justified. |
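As a minimal sketch of the paragraph types in Table 3, the following fragment assumes the classic asciidoc converter; the sample text is purely illustrative. It also shows the indented paragraph described earlier, which is treated literally without any instruction.

```asciidoc
A normal paragraph starts in the first column
and is reflowed during conversion.

  This indented paragraph is treated literally
  and rendered in a monospaced font.

[verse]
Hard line breaks
are preserved here,
in a normal typeface.

[listing]
Monospaced text, often with a border,
with line breaks preserved.
```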
AsciiDoc collapses multiple successive spaces in the body text during conversion. To add spaces at a specific position, you need special non-breaking spaces, which you generate using {nbsp} singly or in succession. Inline markup in the text follows conventions similar to those of the popular Markdown syntax (Table 4).
Table 4
A Brief Introduction to Markdown
Instruction | Result |
---|---|
*Text* | Bold |
_Text_ | Italics |
+Text+ | Literal text in a typeface with fixed spacing |
`Text` | Literal text in a typeface with fixed spacing |
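The inline markup from Table 4 and the {nbsp} attribute might be combined as in the following sketch; the sentences themselves are only placeholders.

```asciidoc
This sentence contains *bold text*, _italic text_, and +monospaced text+.

Three forced spaces follow the colon:{nbsp}{nbsp}{nbsp}they are not collapsed.
```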
You can create the layers required to add structure in two ways: with single-line instructions or with a variant that takes up two lines.
The single-line headers (Table 5) are correctly identified by the converters only if the text really does just take up one line, but you can leave out the closing tags. Alternatively, you can mark the structural layer by underlining headings with different characters: an equals sign for the first layer, a dash for the second, then a tilde, a circumflex, and finally a plus sign at the lowest level (Listing 2).
Listing 2
Document Headings
Table 5
Layers
Instruction | Result |
---|---|
= Text = | Document title |
== Text == | Chapter |
=== Text === | Section |
==== Text ==== | Subsection |
===== Text ===== | Sub-subsection |
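A sketch of the two-line variant, assuming the classic asciidoc converter, could look like the fragment below. The heading texts are placeholders, and each underline is kept at the same length as its heading, which is the safest choice. The comments name the equivalent single-line form from Table 5.

```asciidoc
Document Title
==============

// Same level as "== Chapter"
Chapter
-------

// Same level as "=== Section"
Section
~~~~~~~

// Same level as "==== Subsection"
Subsection
^^^^^^^^^^

// Same level as "===== Sub-subsection"
Sub-subsection
++++++++++++++
```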
Additionally, you can insert anchors (i.e., labels) on the line in front of a heading:
[[Label]]
=== Section heading
The heading formats are pretty flexible because they support versatile formatting in the source code.
The ability to omit the closing tags for individual markups is convenient but can make it difficult to troubleshoot bugs that cause a conversion to fail. This problem also occurs with the use of multiline tags: They tend to cause far more problems than their single-line counterparts in practical applications.
AsciiDoc reserves the first lines in a document (the Header) for special tasks. In addition to the document title, you can enter the author's name, an email address, a version number, and a date. The software uses these details both as the document's metadata and as the basic document information in the output.
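A document header along these lines might look like the following sketch. The title, author, email address, version, and date are invented for illustration; the layout assumes the classic asciidoc header format of title, author line, and revision line, ended by an empty line.

```asciidoc
= Sample Article
Jane Doe <jane.doe@example.com>
v1.0, 2015-01-15

The body text starts after the first empty line.
```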
References and Links
If you need references within the document, [[Label]] creates the anchor that you can point to at any position in the text using <<Label>> or <<Label,Text>>. Although AsciiDoc relies fully on Unicode, non-standard characters in the label can cause problems during conversion.
The reference Text mitigates the situation somewhat: The program displays it at the point where the reference appears in the text, and it can contain arbitrary characters.
If the reference text is missing, AsciiDoc inserts the previously defined label or a different element during conversion. The rules for what the software uses here vary: In sections, for example, the default is the headline text.
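As a brief sketch, an anchor and two references to it could be written as follows. The label install-steps and the surrounding text are hypothetical; the label sticks to plain ASCII to sidestep the conversion problems mentioned above.

```asciidoc
[[install-steps]]
== Installation

Unpack the archive and run the installer.

== Troubleshooting

// Without reference text, the heading text is used by default.
If the installer fails, repeat the steps in <<install-steps>>.

// With reference text, arbitrary characters are allowed.
See also <<install-steps,the chapter on installing (step by step)>>.
```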
You can reference other documents in one of these forms:
link:filename#ID
link:filename#ID[Text]
If you need to do so, you can state paths that include filenames. A simple syntax exists for URLs:
http://address
http://address[Text]
The program detects both automatically because of the http: keyword. This also applies to mail addresses:
mailto:address
mailto:address[Text]
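Put together, the link forms above might appear in running text as in this sketch. The addresses, the file chapter2.html, and the ID setup are placeholders.

```asciidoc
Visit http://www.example.com for details.

Visit http://www.example.com[the example site] for details.

Write to mailto:jane.doe@example.com[Jane Doe] with questions.

The procedure is described in link:chapter2.html#setup[the setup section].
```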
AsciiDoc supports two approaches to embedding images of different kinds: in the body text as inline graphics or as a separate block (i.e., an image paragraph). The image: keyword is used for images in the body text:
image:Filename[Text]
Even if the text is missing, you need to use the square brackets. In addition to the text, the brackets can include formatting options for the image, for example, defining the size with height=14pt.
If you put images in separate paragraphs, use block macros. The first part of the keyword is image followed by two colons, then the filename (with an optional path) followed by square brackets. To enter an image caption, use the caption= keyword with the caption text in double quotes. Alternatively, and more simply, you can use the following form, which additionally defines a label for the image:
[[Label]]
.Text
image::Filename[Options]
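Both image forms could be used as in the following sketch. The file names, the label fig-overview, and the width value are assumptions made for illustration; the height=14pt setting is the one mentioned above.

```asciidoc
// Inline image: one colon, sits within the body text.
Click the image:images/icon.png[Settings icon,height=14pt] button to continue.

// Block image: two colons, with a label and a title line.
[[fig-overview]]
.System overview
image::images/overview.png[Overview diagram,width=400]
```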
Lists are easy to format in AsciiDoc. Bulleted lists begin with a single dash or an asterisk. To create nested lists, increase the number of leading asterisks for each deeper level.
Numbered lists use a similar principle. AsciiDoc changes the enumeration style at each level (Listing 3). For an even more elegant way of doing this, just type dots. You can insert up to five dots at the start of the line to create an enumerated list.
Listing 3
Numbered Lists
You can mix and embed the types of lists, but you need to pay close attention to an empty line so that AsciiDoc can tell which elements belong together and where the list ends. Because this typically leads to difficulties, it is useful to initially save the lists in a separate file and convert them as a sanity check. If it works, you can then add the code to the document.
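A small mixed list that could serve as such a sanity check is sketched below; the item texts are invented. It nests a bulleted list and a second numbered level inside a numbered list written with dots, and the empty line before the final paragraph ends the list.

```asciidoc
. Install the package
* from the distribution repository, or
* from the source archive
. Edit the configuration file
.. Set the output directory
.. Choose a back end
. Run the converter

This paragraph follows an empty line and is no longer part of the list.
```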
Delimited blocks are another kind of structural element that uses specific formatting, such as displaying source code. In LaTeX-speak these elements are known as "environments." A number of delimited blocks are already defined in asciidoc.conf. Each of these environments normally has several variants; however, not all are available for all formats, or their appearance might differ. Listing 4 shows some of the basic structures. A block comprises the desired text delimited by two lines of special characters (e.g., asterisks, dashes, etc.): one above the text and one below.
Listing 4
Predefined Blocks
In the line that precedes the first line of a block, you define the environment style. However, many of the styles do not harmonize with all output formats, which can cause the conversion process to exit prematurely.
For example, the open block structure, which starts and ends with two dashes,
[Style]
--
Code
--
is used for summaries in the abstract style or to store introductory text for parts of a publication in the partintro style. In a list, a double-dashed section of text allows you to add an open block to a list element [2].
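A few delimited blocks, including an open block in the abstract style, are sketched below. The block titles and contents are placeholders, and the file name article.txt is hypothetical; the delimiters are those of the classic asciidoc converter.

```asciidoc
// Open block with a style: rendered as the abstract.
[abstract]
--
This article introduces the AsciiDoc markup.
--

// Listing block: four or more dashes above and below.
.Converting to HTML5
----
asciidoc -b html5 article.txt
----

// Sidebar block: delimited by asterisks.
.Background
****
Sidebars hold supplementary text outside the main flow.
****
```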
The AsciiDoc table format gives you useful results for simple and short content. Listing 5 shows the source code for a simple table. All the major settings rely on options, which are included in the square brackets before the table. The table itself is created with pipe and equals signs. Although this is unproblematic for short tables, it can be error prone for longer tables.
Listing 5
AsciiDoc Tables
The table options are organized in several groups (Table 6). In AsciiDoc, tables can always have a header, typically for the column designators, and a footer. Both of these are given special formatting if they are defined using the header and footer keywords.
Table 6
Lines and Columns
Option | Values | Function |
---|---|---|
grid | none, cols, rows, all | Table grid lines |
frame | topbot, none, sides, all | Table border |
options | header, footer | Header, footer |
format | psv, csv, dsv | Separator character for columns |
valign | top, bottom, middle | Vertical alignment within table cell |
width | Between 1% and 100% | Table width |
cols | Multiplier (*), alignment, width, style | Column description |
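As a rough sketch, several of the options in Table 6 could be combined as follows. The table title and the specific option values are chosen only for illustration, and the rows reuse entries from Table 2.

```asciidoc
.A few back ends
[options="header",cols="1,3",frame="topbot",grid="rows",width="60%"]
|====
|Back End |Formats
|xhtml11  |XHTML 1.1 (default)
|html5    |HTML5
|latex    |Standard LaTeX
|====
```

Here, cols="1,3" requests two columns with the second three times as wide as the first, and the header option turns the first row into the table header.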
Defining the style for the columns causes a number of side effects; you can try them out using the example in Listing 5 by changing the format for the columns from a to v. Formatting individual cells is extremely messy; Figure 1 shows an example of what you will want to avoid.
Problems
Errors in authoring AsciiDoc documents obviously do not become apparent until you try to convert. Some editors, including Emacs, support the input of source code by providing special modes that discover at least simple oversights while you are typing; however, even these helpers are unable to find more complex logical errors or errors in syntax.
The AsciiDoc documentation [3] is fairly sparse; thus, it makes a lot of sense to look at a few examples before you start on your own projects. Some documents are available on the website, including an article that explains the use of indexes [4], a variant on the User Guide [5] formatted as an article, and a template for a book [6] that includes more complex structures, such as a reference list, a colophon, and a dedication.
Converting with the asciidoc script and the a2x toolchain takes a surprising amount of time just to generate EPUBs or PDFs from relatively simple source text. The -v option gives you more detailed information on what is taking place.
Things look a little better in the case of text-only or HTML output; however, these formats only support a subset of the available options, so they are only useful for previewing documents at best.
To minimize the time overhead, you can use the shell && operator to chain the creation of an HTML file with the conversion to a more complex format, so that the slower step only runs if the HTML conversion succeeds. You can then discover and resolve syntax errors with the faster HTML conversion before converting to the final format once everything else has been worked out.
That said, a successful conversion to HTML is no guarantee that a PDF file will give you the desired results. In fact, it is quite difficult to create PDFs with an attractive layout directly in AsciiDoc (Figure 2). The workaround is to export to LaTeX. AsciiDoc generates standard LaTeX, which you can then convert to a PDF document after manual editing.
The advantage of the workaround is that you have access to all of LaTeX's options with minimal additional overhead. AsciiDoc only supports a small subset of the formats available in LaTeX; among other things, it does not support picture environments, which either rules out their use, or forces you to do some post-editing.