Documentation with RST

Author: Roland Puntaier

Overview

GitHub does not support the RST include directive. You can find a rendered version on readthedocs.

Purpose

The doc folder

  • motivates the use of light markup text, specifically reStructuredText (RST), over DOCX or ODT for the documentation of software projects
  • is an example of documentation using rstdoc, a wrapper around Sphinx and Pandoc

For your project documentation, a readme

  • gives an overview of the documents: Files
  • summarizes the dependencies between the documents: Dependencies

Files

readme.rest

Overview

readme.rest provides an overview of the documentation, such that new team members or reviewers can find their way in the documentation.

No actual content is placed in the readme.rest file.

ra.rest

Risk Analysis

Motivations and risks.

The argumentation is kept general to motivate general requirements that do not presume solutions.

sr.rest

System Requirements

General requirements motivated from Risk Analysis.

dd.rest

Design Description

The detailed choices made to satisfy System Requirements.

Here the actual format, the conventions and the tools are proposed.

tp.rest

Test Plan

Documentation for the tests of this package (rstdoc).

rstdoc.rest

rstdoc

The description of the python API and command line tools provided by this package.

Dependencies

[Figure: _images/_traceability_file.png, a clickable FCA lattice diagram of the ID dependencies with the nodes tr0 to tr17]

Figure 1: FCA diagram of dependencies: ra lightblue, sr red, dd yellow, tp green

Risk Analysis

Purpose

This risk analysis focuses on displaying the benefits of using light markup text, and specifically RST, for documentation of (software) projects.

This whole documentation is also an example of documentation using RST. Specifically, this file’s original is a template file that integrates automatic generation of some of its parts.

Only a part of rstdoc documentation deals with the provided python code.

Qualitative Analysis

Productivity

To achieve more with less effort, one must switch to tools that improve productivity.

The objective is to find a documentation format that is more productive than MS Office or LibreOffice for technical documentation.

DOCX or ODT cannot be well integrated into a software project. The reason does not lie in the writing itself, but in the organization of information and the further development and handling of text. They

  • have low accessibility: san, stq
  • have low traceability: s9v, s0t
  • produce too much redundancy: sgt, s8c
  • are poorly suited to automation: sgt, s8c

As a result:

  • They lead to low productivity (sa7).
  • The quality of the content suffers.

Code is written in a text editor. Documentation must be written with the same text editor, because accessing information with two different tools adds overhead.

Formatting vs Content

r9g:s45

The purpose of DOCX or ODT, in general the WYSIWYG idea, is about providing easy formatting. The information coded in a human language is surrounded by layers of formatting.

  • The DOCX XML files are zipped, which makes them binary.
  • The XML-based formatting is so full of formatting markup, that it is not very readable. This also applies to non-zipped formats like docbook.

But formatting should have no importance in development.

There is a less obtrusive alternative for formatting than via XML, HTML or even TeX:

The content is important in documentation, not formatting.

Every bit of information needs a location. This location cannot be in a DOCX or ODT, because there it is not easily accessible (Accessibility).

Certain content can be stored in a text database and reused in other documents via templating (r8d).

Data can be integrated better into text than into DOCX or ODT.

From an immediate but naive perspective it may seem easier to compose documents using DOCX or ODT. Given the complex task of bringing a big project to a consistent final state, it is not. More detailed reasons are the topic of this documentation.

Final Version

rio:scf

This proposal targets the development time.

After the project is over documents are

  1. archived or
  2. placed on a web server

The formats usually used are:

  1. PDF
  2. HTML
  3. DOCX or ODT

In case the final version is a printout, DOCX or ODT allow final formatting correction before printing.

Parallelism

Parallel processing is faster than serial processing. The productivity of a team increases if the team members can work in parallel.

In order for the developer to work independently he needs to be allowed to make his own decisions.

The decisions get their input from the documentation done by others and the information generated by the developer himself.

Every developer needs to understand how the product will be used.

The external requirements are kept

  • minimal
  • mostly soft, i.e. modifiable
  • with good rationale, especially for hard requirements

If a developer has an idea, a conflict or an issue, he can adapt the source code and the documentation and the tests, also of others, to resolve the issue by himself.

The chief developer only

  • does initial coordination
  • observes, i.e. reviews the changes, as other developers do, too.

The format of the documentation matters regarding independence:

  • changes can be traced in the VCS (Traceability)
  • information can more easily be found via grep-like tools over all files (Accessibility)
  • a final document file can be decomposed into separate source files for developers

Traceability

Trace changes
rnn:s9v

Documentation is the description of the system in a human language. It is meant for humans. Nevertheless it is not a novel, but more like code.

  • It defines variables and values (concepts) like code.
  • It undergoes the same changes as code.
  • It has dependencies and a hierarchical structure like code.

Team members need to be able to follow changes. A version control system (VCS) like SVN or GIT is needed to trace the changes in documentation.

Trace dependencies
rw9:s0t, san

Code uses identifiers for its items (variables, functions, classes, …).

The documentation can use IDs to mark an item (paragraph, figure, table, …).

The ID can be used to reference an item from somewhere else: m-n. A special case is 1-n, e.g. the ID of a header comprises all IDs of the paragraphs below.

Flat addressing: Relations are not reflected in the names and especially not in the IDs. Especially the IDs do not have an order. Flat and unordered IDs are more flexible, because they are independent of the changes in structure and order.

Accessibility

Hypertext
r33:san

The productivity depends much on how fast information can be found.

Access time: The time to access stored information.

The access time is fastest for information stored in the brain. The brain of most humans is very slow to memorize, though. And the brain forgets. Normally one can expect only the current topic to be present for immediate processing.

Related information can be looked up quickly if the documentation contains references that can be jumped to immediately (hypertext). The importance of this can be seen in the immense success of hypertext on the internet.

To allow hypertext, referenced items must have a unique ID, comparable to a URI (uniform resource identifier).

Community

rj4:s9o

Community spreads the effort for tooling to more people.

  • The commercial model makes more people dependent on one company. In the case of DOCX there is no alternative to MS Word that renders documents in the same way. This makes Microsoft a monopoly, leading to over-pricing.
  • The open source model is a decentralized community effort: With software there is no effort and therefore no loss in sharing. One gains the effort of others.

The open source model is preferred, because one has more control.

  • one can add a feature if needed
  • one can fix a bug immediately

The total effort is less than for the commercial model.

Sustainability

ref:sed

The information shall be accessible

  • over a long time
  • by many people

But if the format is only readable by one of many commercial tools,

  • at some point one may not want or be able to pay the license
  • some people might use a different tool

Changing the tool later is not possible without substantial costs (vendor lock-in).

Because of the sustainability argument, a DOCX document needs to be converted to PDF, e.g. before sending to someone else or maybe even when checking into a VCS.


Redundancy

r90:sgt, s8c

Redundancy: When the same information needs to be maintained in more than one place.

Less redundancy means higher productivity.

Redundancy

  • needs more resources
    • more pages, more memory
    • more time to read
    • more time to write
    • more effort when changing something
  • leads to inconsistencies

The reasons for redundancy are

  • barrier between formats: DOCX and text, computer language and human language
  • inability to link to information: no hypertext
  • inability to exploit functional dependencies: no automation available
  • normative boilerplate texts: no automation available


Automation

Scripting
r1p:sgt, s8c

Why don’t we write code in MS Word or LibreOffice? Because it would be hard to parse away all the formatting.

It does not help to have a library like Office-XML-SDK or DOM, because the additional complexity through formatting elements still needs to be dealt with when parsing or creating documentation parts.

A format where formatting is less important and less obtrusive can be handled more easily via scripts.

Templates
r8d:

Many internet sites are generated with a mixture of text and a scripting language (PHP, JS, Python, …). Such templates allow one

  • to mix text and data or
  • to generate text from data.

Text files can easily be generated from template files.

Quantitative Analysis

Introduction to risk analysis

Risk analysis is basically a simulation.

Event: A possible and recurring configuration of values of variables.
Frequency: \(f\). How often an event is observed per time interval.
Probability: \(p\). Compares the frequency of mutually exclusive values of one variable. At least one value must occur (exhaustiveness).
Rating: \(v\). Judges an event by associating a value expressing harm/benefit, loss/profit or advantage/disadvantage.
Risk: \(r\). The risk is frequency times rating:

\[r_{e} = f_{e} v_{e}\]

The total risk \(R\) sums over all events:

\[R = \sum_e r_e\]

Events can depend on other events functionally or statistically. One can start with the probability for an event once a day and then follow conditional probability chains to other events.

The risk analysis tries to analyse these dependencies to get to a more precise estimation of the frequency.

It is hard to get good estimates of frequencies in a complex real world, because there are

  • many variables
  • many dependencies
  • unknown probabilities

Because the frequencies will be inaccurate, instead of numbers one can use more imprecise but realistic values, that need to be defined for the special area (Table 1).

Table 1: Occurrence values for a medical device

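======  =================  ==========================================
Number  Category           Explanation
======  =================  ==========================================
1       Unimaginable       Never occurs in the lifetime of the device
2       Improbable         Occurs once in the lifetime of the device
3       Remote imaginable  Occurs once in 100 applications
4       Sometimes          Occurs once in 10 applications
5       Probable           Occurs once per application
6       Frequent           Occurs multiple times per application
======  =================  ==========================================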

The rating depends on the

  • area (health sector, finance, …) and the
  • circumstances (war or peace, rich or poor, …)

Table 2: Severity rating in the health sector.

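======  =============  =============================================
Number  Category       Explanation
======  =============  =============================================
1       Non-essential  Minor injury not needing medical intervention
2       Minor          Small to moderate injury
3       Critical       Severe injury or death
4       Catastrophic   Multiple deaths
======  =============  =============================================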

In this discrete description, risk value could be

ac: acceptable
alarp: as low as reasonably practicable
nac: not acceptable

The risk function is defined by a table. The total risk can

  • count each risk value occurrence
  • count each cell occurrence in the risk table (Table 3)

Table 3: Occurrence/Severity matrix. AC, NAC, ALARP are counts of events in the respective cell.

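Risk R by occurrence O (rows) and severity S (columns):

=====  =====  =====  =====  =====
O / S  1      2      3      4
=====  =====  =====  =====  =====
6      ALARP  NAC    NAC    NAC
5      ALARP  ALARP  NAC    NAC
4      ALARP  ALARP  ALARP  NAC
3      AC     ALARP  ALARP  ALARP
2      AC     AC     ALARP  ALARP
1      AC     AC     AC     ALARP
=====  =====  =====  =====  =====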
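
The table-defined risk function can be sketched in a few lines of Python. The names `RISK_TABLE`, `risk_value` and `total_risk` and the sample events are assumptions made for this illustration only; the cell values are copied from Table 3.

```python
from collections import Counter

# Occurrence/severity matrix copied from Table 3.
# Key: occurrence (1..6); list index: severity - 1 (severity 1..4).
RISK_TABLE = {
    6: ["ALARP", "NAC", "NAC", "NAC"],
    5: ["ALARP", "ALARP", "NAC", "NAC"],
    4: ["ALARP", "ALARP", "ALARP", "NAC"],
    3: ["AC", "ALARP", "ALARP", "ALARP"],
    2: ["AC", "AC", "ALARP", "ALARP"],
    1: ["AC", "AC", "AC", "ALARP"],
}


def risk_value(occurrence, severity):
    """Look up the discrete risk value of one event."""
    return RISK_TABLE[occurrence][severity - 1]


def total_risk(events):
    """Count how often each risk value occurs over all events."""
    return Counter(risk_value(o, s) for o, s in events)


# Hypothetical events given as (occurrence, severity) pairs.
events = [(6, 1), (3, 1), (4, 4), (2, 3)]
counts = total_risk(events)
print(dict(counts))
```

Counting per risk value is the first of the two options named above; counting per cell would use the `(occurrence, severity)` pair itself as the counter key.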

Countermeasures

r2m:

The purpose of the risk analysis is not to make a yes/no decision for a project, but to derive countermeasures that reduce the risk or prevent harm or financial loss.

The countermeasures change the probability of the events, by changing the causal dependencies between events.

The rating probably will not change, unless circumstances change.

In the Occurrence/Severity example, in Table 3:

  • before the measures: events are possibly in the upper right corner
  • after the measures: events are ideally only in the lower left corner
  • the events on the diagonal from top left to bottom right involve a trade-off and should be kept “as low as reasonably practicable”

Risk analysis for documenting with RST

rp5:

This risk analysis maps to the above introduction to risk analysis in this way:

  • Event is a task a developer performs
  • Time consumed per event corresponds to severity (per developer)
  • Occurrence per developer

Instead of the discrete values, numbers are used for time and occurrence. The numbers are rough estimates, because they depend heavily


  • on the developer
  • on the tools he uses (editor and plugins)
  • how well he knows his tools
  • which phase the development is in
  • how long the project takes
  • how much documentation there is

The risk is the effort per developer.

\[R = \sum_{e} v_e f_e \qquad (1)\]

Math 1:

  • \(e\): event to perform a task
  • \(v_e\): time consumed for task
  • \(f_e\): how often per day the task \(e\) occurs
  • \(R\): total effort per developer

The countermeasures taken lead to:

  • RST for documentation instead of MS Word or LibreOffice

Events

The following events have a

  • one-line description
  • occurrence \(f\)
  • countermeasure
  • the effort \(v_1\) [min] before countermeasure
  • the effort \(v_2\) [min] after countermeasure

As a check for the estimation, \(\sum f v_1\) should give \(1d = 8h = 480\text{min}\).

The estimates assume a project that

  • takes about a year
  • has 5 team members
  • needs to be consistently documented

The columns are: occurrence \(f\) per day, countermeasure, effort \(v_1\) before and \(v_2\) after the countermeasure, and the description of the event.

=========  ========  ========  ========  ============================================================================================
f [1/day]  measure   v1 [min]  v2 [min]  description
=========  ========  ========  ========  ============================================================================================
1/5/365    sxr       0         10        Include documentation in the build system
1/5/100    s10       10        0         Create separate version of documentation file (e.g. doc_1.1.docx)
20         sed       1         1/10      Look for file and open in editor then open another file in another tool (office application)
1          s8c       40        30        Plan the design of a software component and document it
1          s9v       20        1         Review the changes in a documentation file
10         san       1         1/60      Look up an ID in a documentation file
2                    100       100       Solve an implementation detail or a bug report
1          san       10        9         Discuss an interface with another team member, consulting the documentation
2          san       30        20        Describe an implementation detail or how a bug was fixed in the documentation
1/30       sxr       30        1         Merge contributions to a documentation file from several developers
1/5/100    scf       5         10        Start a printout of the documentation (without printing time)
1/5/100    s0t       3*480     1         Create a traceability file that shows how documentation items are linked
10         stq       4         1         Search for all occurrences of a name in all project files
5          stq       4         1         Replace all occurrences of a name in all project files
1          s8c, san  30        20        Refactor and re-describe parts of code and update documentation
10         s45       1         1/2       Fix a formatting issue
1          s8c       2         0         Check for consistency of limit values between code and documentation
1          sgt       20        10        Make the documentation of automatic tests or a test report of a test run
=========  ========  ========  ========  ============================================================================================

Result

The assumed 1 year project with 5 developers would take only 0.7 years.

  • Effort without RST: 486 min ≈ 1.0 day
  • Effort with RST: 332 min ≈ 0.7 day
  • Less effort (sa7): -154 min ≈ -0.3 day
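
These totals can be reproduced from the events table with Math 1. The following sketch assumes that an occurrence such as 1/5/365 means 1/(5·365) per day and that one day has 480 working minutes; the names `EVENTS` and `effort` are chosen for this illustration only.

```python
from fractions import Fraction as F

# (occurrence f per day, v1 minutes before, v2 minutes after),
# copied row by row from the events table.
EVENTS = [
    (F(1, 5 * 365), 0, 10),
    (F(1, 5 * 100), 10, 0),
    (20, 1, F(1, 10)),
    (1, 40, 30),
    (1, 20, 1),
    (10, 1, F(1, 60)),
    (2, 100, 100),
    (1, 10, 9),
    (2, 30, 20),
    (F(1, 30), 30, 1),
    (F(1, 5 * 100), 5, 10),
    (F(1, 5 * 100), 3 * 480, 1),
    (10, 4, 1),
    (5, 4, 1),
    (1, 30, 20),
    (10, 1, F(1, 2)),
    (1, 2, 0),
    (1, 20, 10),
]


def effort(events, column):
    """Total effort R = sum of f * v over all events, in minutes per day."""
    return sum(event[0] * event[column] for event in events)


before = effort(EVENTS, 1)  # effort without RST
after = effort(EVENTS, 2)   # effort with RST
print(round(float(before)), round(float(after)))  # 486 332
print(round(float(after) / 480, 1))               # 0.7
```

Using `Fraction` keeps the small occurrence values exact until the final rounding.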

The benefit is not so much due to using a text editor instead of an office application to write documentation. It is due to a good exploitation of all the possibilities opened by pure text (Requirements on Documentation and Requirements on Project).

System Requirements

Purpose

Propose an alternative to MS Office for technical documentation.

Requirements on Documentation

sa7:Productivity

The documentation format increases productivity in comparison to MS Office.

s9o:Community, rj4

The documentation format is not new. The documentation format is supported by a large open source community.

s9v:Traceability

It is possible to diff a documentation file with the version control system (VCS). Therefore it must be text-based.

scf:Accessibility

Tools shall be available for conversion to the following formats

  • HTML: to make the documentation viewable over the internet
  • PDF: to archive a version and for printing
  • DOCX and ODT: to satisfy existing procedures

The tools shall be

  • open-source and community supported
  • stable

san:Accessibility, Traceability, hyperlinks

The documentation items shall be marked by flat and unordered IDs.

Use these IDs to jump to documentation items inside the editor via a keyboard shortcut or a mouse click: hypertext.

Support hyperlinks in the formats the documentation can be converted to.

stq:Accessibility

Full text search over all files with regular expressions shall be available from inside the editor for

  • source code and
  • documentation

sed:Sustainability

The documentation can be opened by a normal text editor.

The documentation is easy to read and write in a text editor.

s45:Formatting vs Content

Support formatting:

paragraphs, sections with headers
enumerated and bullet lists, footnotes, citations, comments
bold, italic, typeface, hyperlinks
tables, images, figures, code listings, mathematics

The formatting

  • is not obtrusive (r9g)
  • shall be intuitive
  • does not need much learning

Table-like data is stored as text using a format that is

  • not too verbose
  • easily accessible by scripting (sgt)

sgt:Redundancy, r1p

Make it easy to automatically generate parts of documentation

  • from source code
  • from data

Data shall be usable

  • in source code
  • in documentation

s0t:Traceability

Automatically generate a dependencies file that shows how documentation items are linked. Warn about missing or duplicate targets.

s8c:Redundancy, r8d

Provide means to integrate into the documentation

  • defines that are also usable in source code
  • calculation results

Requirements on Project

s10:Traceability

The project uses a version control system (VCS) like SVN or GIT.

Documentation history is handled by the VCS. Team members follow changes of documentation on the VCS.

sxr:Automation

The project uses a build system.

Creation of documentation is integrated in the build system.

s1g:Redundancy

Whenever something is used in both code and documentation, generate it from a master copy: constants, struct definitions, …

scs:Accessibility

All documentation of concern to development is integrated in the text-based documentation.

  • risk analysis / motivation
  • specification
  • design description
  • test plan
  • issues
  • meeting minutes

sim:Accessibility

There is a readme document that explains where to put and how to find each kind of information.

seo:r90

The developer

  • only works on and cares about the text sources of documentation
  • does not spend time fixing formatting issues of a generated format

sil:Parallelism

Developers can work independently.

This is linked to s10. The VCS needs to enable independent development. A distributed VCS like GIT has advantages in this regard.

Design Description

Purpose

An implementation of the System Requirements is described briefly, with links to the motivation further down in the document.

For an illustrative implementation of the following conventions see the end of dcx.py or the files generated by rstdcx --rest/--stpl smpl, and this documentation itself.

Documentation Format

dje:s9v, san, stq, sed, sgt, s8c

Documentation files are text files.

dio:dt7, dbz

Use pure RST as documentation format. Don’t use Sphinx extensions.

dld:doe, d03

Place a target ID before paragraphs, figures, list-tables, code listings, maths:

.. _`dx8`:

:dx8: <optionally key words here>

``:dx8:`` is an RST definition that will get special formatting in the generated document.
Else, ``dx8:`` would do, too.
Omit it if the ID is not wanted in the generated document.

|dx8| is an example ID.

.. _`dz3`:

.. figure:: _images/smpl.png
   :name:

   |dz3|: Caption here.

Reference via |dz3|.

.. _`dta`:

|dta|: Table legend

.. list-table::
   :name:
   :widths: 20 80
   :header-rows: 1

   * - Bit
     - ...

.. _`dyi`:

|dyi|: Listing showing struct.

.. code-block:: cpp
   :name:

   struct xxx{
      int yyy; //yyy for zzz
   };

.. _`d9x`:

.. math::
   :name:

If headers are referenced across files, add an ID before them, too:

.. _`d99`:

header
------

Inside the file, headers are automatically targets, by the definition of RST.

The ID

dfq:san

The ID within a file

  • has only word characters: easily search word under cursor in editor
  • is short: easy to memorize
  • is lowercase: some editors are case insensitive and it is easier to write in lowercase
  • is random, not ordered: no problem when reordering documentation parts
  • is not derived from the topic of the paragraph, which is often not well defined and would make the ID long

All IDs that end up in a top level file should start with the same letter. This way one can immediately tell which top level file the ID is from. dcx.links_and_tags uses this to color a graph of all the ID dependencies.
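
The ID rules above can be sketched as a small generator. This is only an illustration under stated assumptions: `new_id`, its parameters and the three-character length are hypothetical and not part of rstdoc.

```python
import random
import string

# Word characters only, lowercase: letters and digits.
ALPHABET = string.ascii_lowercase + string.digits


def new_id(prefix, existing, length=3, rng=random):
    """Return a short random lowercase ID that starts with `prefix`
    (the letter of the top level file) and is not yet in `existing`."""
    while True:
        tail = "".join(rng.choice(ALPHABET) for _ in range(length - len(prefix)))
        candidate = prefix + tail
        if candidate not in existing:
            return candidate


existing = {"dx8", "dz3", "dta"}            # IDs already used in dd.rest
fresh = new_id("d", existing, rng=random.Random(0))
print(fresh)
```

Because the ID is random and flat, it stays valid when the paragraph is moved or its topic is reworded.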

Files

d75:scs, sim, seo, sil, dt7

Top level files use the extension .rest.

The last line of a top level .rest file must be the following (d1z):

``.. include:: _links_sphinx.rst``

For image substitutions to work, place the .. |xxx| image:: _images/xxx.jpg into the top level .rest file.

Sphinx uses an index.rest that becomes the entry point to the project site. It includes all top level .rest files for a project-wide table of contents:

.. toctree::
   readme.rest
   ra.rest
   sr.rest
   dd.rest
   tp.rest
   rstdoc.rest

The names of the files are an example. See Overview for their purpose.

dfy:dt7

Included files use the extension .rst. Include them with:

.. include:: somefile.rst

d0t:s9v, s8c

Images (.png, .jpg, .svg, …) are in the _images or ../_images folder.

Images can be generated from tikz files (.tikz) or templates of it (.tikz.stpl).

dyn:s8c, dv6

.rest, .rst, .tikz files can be SimpleTemplate templates.

Then they have the additional extension

  • .stpl, if they can be expanded by themselves
  • .tpl, if they are only used by other template files (%include('some.rst.tpl')). .tpl files can have parameters, which are provided via %include('some.rst.tpl',aparam="test"). .tpl files can be in the same or the parent folder, without the need for a path in %include().

In this documentation ra.rest.stpl includes utility.rst.tpl.

Tools

dru:scf

The tools are all open source and community driven.

dmm:scf

Pandoc is used to convert to HTML, PDF, DOCX and ODT.

dsn:Editor, san

A text editor is used for writing. It should support ctags .tags files.

d13:sgt, s8c

Python is used

  • for scripting (da0) and
  • for templates (dv6)

d23:sgt

Data is preferably written directly in Python. Any text format that is readable in Python works, too. Very table-like data is written in yaml or json (dg8).

dwm:doe, san, sgt, s8c

rstdoc’s rstdcx is used to

  • create .tags and _links_xxx.rst files to support hypertext (d03)
  • generate files from source code using the gen file (dhy)
  • expand template files .stpl (dv6)
  • convert .tikz files to .png and place them into ./_images or ../_images

dqf:sxr, scf, sgt, s8c

waf is used as the build system (dw8).

d7o:Sphinx, Project Site, scf

Optionally, Sphinx can be used to create a central HTML site for the project with links to all the top level files (.rest) (d1w).

df3:Latex, scf

LaTeX is used to

  • convert RST directly to PDFs or
  • create graphics using tikz

Motivation

Light Markup

dt7:light markup

LaTeX and other document markup languages are not easy to learn (s45) and do not impose enough constraints.

The alternatives are light markup formats:

  • sgt, s8c: It is easier to generate parts of the documentation with scripts from source code or source code comments. It allows mixing source code with documentation for better cohesion and less redundancy.

  • s45: It can be easily learned, because it restricts itself to essential elements.

  • s45: The elements are of a conceptual nature (header, list item, …), not actual formatting. The formatting is done when creating the final document. This makes it easier to keep the formatting consistent when several people work on the documentation.

  • s9v: As text, it is well suited to version control systems. One can commit documentation changes together with the corresponding source code changes. Outdated information stays hidden in the VCS history instead of lying around and cluttering the documentation. It is easy to review documentation changes. To make reviews even easier, one should try to have

    • one sentence per line
    • one clause per line or
    • one list item per line
  • s0t: It is easier to extract, which items link to which other ones, especially if the team agrees on facilitating conventions.

  • san, stq, sed: It can be edited with a text editor, i.e. the same tool developers work with all the time.

  • stq: It is accessible to grep.

  • san: Ctags can be used to jump around while editing.

  • sed: It is very readable.

  • scf: It can be translated to several final formats, e.g.

    Sphinx converts all files with an extension provided in conf.py. .rest is chosen for such main files. .rst is then for included files.


RST

dbz:RST

There are many light markup formats. But especially restructuredText (RST)

http://rst.ninjs.org can be used to play with RST.

For the conversion from RST to DOCX, the best tool is currently Pandoc. Pandoc only takes pure RST and does not know about Sphinx extensions like :ref:.

Hypertext

doe:hypertext in text, r33

rstdoc’s rstdcx generates a .tags file for target IDs. Ctags .tags files are supported by many editors. With .tags files, your editor can jump around in RST files as if they were marked-up hypertext (HTML).
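
How target IDs can end up in a tags file is sketched below. This is not rstdcx’s implementation: the function `tags_lines` and the regular expression are assumptions for this illustration, and the output uses the basic ctags line layout (tag name, file name and line number separated by tabs).

```python
import re

# Matches an RST target line such as:  .. _`dx8`:
TARGET = re.compile(r"^\.\. _`?(\w+)`?:\s*$")


def tags_lines(files):
    """files maps a file name to its text.
    Returns sorted lines of the form 'id<TAB>file<TAB>line number'."""
    lines = []
    for name, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = TARGET.match(line)
            if match:
                lines.append(f"{match.group(1)}\t{name}\t{lineno}")
    return sorted(lines)


sample = {"dd.rest": ".. _`dx8`:\n\n:dx8: key words\n\n.. _`dz3`:\n"}
print("\n".join(tags_lines(sample)))
```

An editor with tags support can then jump from the word dx8 under the cursor to line 1 of dd.rest.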

d03:hypertext in HTML, PDF and DOCX, r33

To reference paragraphs, figures, mathematics and tables use RST’s replacement substitutions:

This is text that references |dx8|.

As the RST’s link format differs between HTML, PDF and DOCX, rstdoc’s rstdcx generates separate files with definitions for the replacement substitutions:

_links_docx.rst:

.. |targetid| replace:: `targetid <file.docx#targetid>`_

_links_pdf.rst:

.. |targetid| replace:: `targetid <file.pdf#targetid>`_

_links_sphinx.rst:

.. |targetid| replace:: :ref:`targetid <file.html#targetid>`
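
Generating such a definitions file, together with the check for missing or duplicate targets required by s0t, can be sketched as follows. The function names, the regular expressions and the sample files are assumptions for this illustration; rstdcx’s actual implementation may differ.

```python
import re
from collections import Counter

# Target lines such as  .. _`dx8`:  and references such as  |dx8|
TARGET = re.compile(r"^\.\. _`?(\w+)`?:[ \t]*$", re.MULTILINE)
REFERENCE = re.compile(r"\|(\w+)\|")


def link_definitions(files, extension):
    """Build the substitution definitions for one output format,
    e.g. extension='docx' for _links_docx.rst."""
    lines = []
    for name, text in sorted(files.items()):
        stem = name.rsplit(".", 1)[0]
        for target in TARGET.findall(text):
            lines.append(
                f".. |{target}| replace:: `{target} <{stem}.{extension}#{target}>`_"
            )
    return lines


def check_targets(files):
    """Return (duplicate targets, referenced but missing targets)."""
    counts = Counter(t for text in files.values() for t in TARGET.findall(text))
    referenced = {r for text in files.values() for r in REFERENCE.findall(text)}
    duplicates = sorted(t for t, n in counts.items() if n > 1)
    missing = sorted(referenced - set(counts))
    return duplicates, missing


# Hypothetical file contents; |dzz| has no target.
files = {
    "dd.rest": ".. _`dx8`:\n\n:dx8: refers to |s0t| and |dzz|\n",
    "sr.rest": ".. _`s0t`:\n\n:s0t: Traceability\n",
}
print("\n".join(link_definitions(files, "docx")))
print(check_targets(files))
```
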
d1z:Pandoc bug

Substitutions cannot be in included files, until the Pandoc include bug is corrected. For the _links_docx.txt this helps:

cat file.rst _links_docx.txt | sed -e's/.. include:: _links_sphinx.txt//g' | pandoc -f rst -t docx -o file.docx

The last line of a top level .rest file must be:

``.. include:: _links_sphinx.rst``

For image substitutions to work place the .. |xxx| image:: xxx.jpg into the top level .rest file.

Editor

If most work is done on text, it becomes important to have a very flexible and powerful editor. One should invest some time to get to know the editor well.

There are a lot of editors that work well with RST, e.g. Emacs.

Atom
language-restructuredtext
rst-preview-pandoc
table-editor
rst-snippets
atom-build-waf
find-and-replace-under-cursor

atom-build and atom-ctags were modified to allow finding files by putting the relevant subdirectory into Atom’s project paths.

Scripting

da0:scripting

Python is a good choice as a scripting language, because it

  • is easy
  • is powerful
  • has many libraries
  • has a huge community

Build System

dw8:waf

waf (dqf) is a good choice as build system:

  • It is written in Python
  • It is made part of the project, i.e. not an external dependency
  • It supports many computer languages
  • rstdoc’s dcx.py is a waf plugin

As a general automation tool, doit would be a good choice, but it does not provide abstraction for compiler handling.

make works, too, but it is less flexible.

Generated documentation

dhy:gen

To generate documentation files from source code, rstdoc’s rstdcx

  • Looks into a gen file, to see which source files lead to which target file. The gen file can be

    • python code defining from_to_fun_kw = [[fromfile,tofile,fun,kw],...] or
    • from|to|fun|kwargs lines

    fun means gen_fun() (or just gen() if empty) that can be found commented #gen_fun in fromfile.

  • Gets the # def gen_fun(lns,**kw) python function. fun comes from the gen file, the rest must match exactly. # def gen_fun marks the end of the function.

  • Executes the def gen_fun(lns,**kw) function with the lines of the source file and the kw from the gen file

  • Saves the result in the target files

In waf’s wscript_build, call gen_files() to initiate interpretation of the gen file.

See the gen file of this documentation as an example.
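
The shape of such a function can be sketched as follows. Only the signature (lns, **kw) is taken from the description above. The function name `gen_api`, the `#: ` comment marker, the keyword names and the assumption that the function returns a list of lines are hypothetical. In a real source file the function would be kept in comments as described above; here it is plain code so that it can be run.

```python
def gen_api(lns, **kw):
    """Turn marked comment lines of a source file into RST lines.

    lns: the lines of the source file
    kw:  the keyword arguments from the gen file entry
    """
    title = kw.get("title", "API")
    marker = kw.get("marker", "#: ")
    out = [title, "=" * len(title), ""]
    for line in lns:
        text = line.strip()
        if text.startswith(marker):
            out.append(text[len(marker):])
    return out


# Hypothetical source file content.
source = [
    "#: MAX_SPEED is the upper limit in m/s.",
    "MAX_SPEED = 30",
    "#: MIN_SPEED is the lower limit in m/s.",
    "MIN_SPEED = 0",
]
result = gen_api(source, title="Limits")
print("\n".join(result))
```

A matching gen file entry in the Python form would look like `from_to_fun_kw = [["code.py", "limits.rst", "api", {"title": "Limits"}]]`, with file names that are again only examples.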

Templating

dv6:templating text

Python has many template libraries. An important one is Jinja2. But they are targeted especially to HTML and consider aspects (e.g. security), that are not of relevance for technical documentation.

bottle’s SimpleTemplate is inverted Python (text un-enclosed, code enclosed) without any further restrictions to the Python code.

  • It is easy to learn
  • It is very powerful

dpv:templating tikz

One can use it to replace native control structures.

This smpl.tikz:

[thick]
\draw (0,0) grid (3,3);
\foreach \c in {(0,0), (1,0), (2,0), (2,1), (1,2)}
    \fill \c + (0.5,0.5) circle (0.42);

can become smpl.tikz.stpl:

[thick]
\draw (0,0) grid (3,3);
%for cx,cy in {(0,0), (1,0), (2,0), (2,1), (1,2)}:
    \fill ({{cx+0.5}},{{cy+0.5}}) circle (0.42);
%end

dhl:defines for code and documentation

One can have

  • limits defined in a python file (specifications.py) and use them in
  • a code file template (e.g. specifications.h.stpl)
  • and in a documentation file template (e.g. specifications.rest.stpl)

Alternatively one can have

  • a code file with limits (e.g. specifications.h)
  • parse that code file in the python code and use the values in the text of a documentation file template (all in e.g. specifications.rest.stpl)
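
The first variant, one Python master copy that feeds both the code and the documentation, can be sketched with the standard library alone. Note that this sketch uses `string.Template` in place of SimpleTemplate to stay free of dependencies; the limit names, their values and the helper `render` are hypothetical.

```python
from string import Template

# Master copy of the limits, as it could be kept in specifications.py.
LIMITS = {"MAX_TEMPERATURE": 85, "MIN_TEMPERATURE": -40}

# One line template for the C header, one for the documentation.
HEADER_LINE = Template("#define $name $value")
DOC_LINE = Template("The limit ``$name`` is $value.")


def render(template, limits):
    """Expand one line per limit, in alphabetical order."""
    return [template.substitute(name=n, value=v) for n, v in sorted(limits.items())]


header = render(HEADER_LINE, LIMITS)
doc = render(DOC_LINE, LIMITS)
print("\n".join(header))
print("\n".join(doc))
```

Changing a value in `LIMITS` changes the header and the documentation together, which removes the redundancy described in s1g.
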
drz:mathematics

One can use

  • python mathematics (e.g. sympy)
  • use it for actual calculations
  • expand the formulas in the text
  • use the calculated values in the text
%from sympy.abc import *
%from sympy import Eq, latex
%pythagoras=Eq(c**2,a**2+b**2)
Pythagorean theorem

.. math::

    {{latex(pythagoras)}}

is very important.

becomes:

Pythagorean theorem

.. math::

    c^{2} = a^{2} + b^{2}

is very important.

Data

dg8:data

Data needs to be text, because then changes can be traced via VCS.

Since the scripting and templating is done with Python, data written in Python is most easily accessible, e.g. via an import.

If the data is more table-like:

yaml is good for direct editing, because

  • it is very terse,
  • it can almost be used directly as RST, and
  • changes are easy to follow, since there is little syntax

json is appropriate, if the data

  • is generated
  • needs to be read and possibly edited by humans
  • needs to be rendered on HTML via javascript

XML is only appropriate for a predefined schema, that is unlikely to change throughout the development time.
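As an illustration of using table-like data in the documentation, the following sketch turns generated JSON records into an RST list-table. The function name `json_to_list_table` and the sample record are assumptions for this example; only the standard library is used.

```python
import json


def json_to_list_table(json_text):
    """Render a JSON array of flat objects as RST list-table lines."""
    rows = json.loads(json_text)
    header = list(rows[0].keys())
    lines = [".. list-table::", "   :header-rows: 1", ""]
    table = [header] + [[str(row[key]) for key in header] for row in rows]
    for cells in table:
        lines.append("   * - " + cells[0])            # first cell starts a row
        lines.extend("     - " + c for c in cells[1:])  # remaining cells
    return lines


table_lines = json_to_list_table('[{"name": "MAX_CHANNELS", "value": 8}]')
```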

There are alternatives for XML’s XPath in yaml and json:

Project Site

d1w:project site

A central HTML project site can be a central point of team coordination. One can have

  • an issue file
  • minutes file for meetings
  • a project coordination file with plans, deadlines
  • a progress file per developer

This keeps all the data together in

  • one format and
  • one tool chain
  • one repository

Test Plan

Purpose

The System Requirements are tested by

The tests here concentrate on rstdoc.

Test Driver

There are only automatic tests.

pytest and tox are used to run the tests.

[tox]
envlist=py38
[testenv]
commands=
    py.test --doctest-modules --junit-xml=test.xml -k 'not test_with_images[rst_odt'
deps=
    gitpython
    txdir
    cffi
    cairocffi
    pillow
    pyx
    pyfca
    pygal
    cairosvg
    numpy
    matplotlib
    sympy
    pint
    pyyaml
    svgwrite
    drawsvg
    stpl
    pypandoc
    docutils
    sphinx
    sphinx_bootstrap_theme
    mock
    virtualenv
    pytest-coverage

To run tox, in the root folder, enter:

tox

To run pytest, in the root folder, enter:

py.test

To have a test coverage report, enter:

py.test --cov=rstdoc --cov-report term-missing

Test Coverage

The tests aim to produce 100% test coverage.

The current test coverage is shown below.

============================= test session starts ==============================
platform linux -- Python 3.10.9, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/roland/mine/rstdoc, configfile: pytest.ini
plugins: toolbox-0.4, Flask-Dance-6.2.0, anyio-3.6.2, mock-3.10.0, xonsh-0.13.3, cov-4.0.0
collected 518 items

rstdoc/dcx.py .....................                                      [  4%]
rstdoc/reflow.py .                                                       [  4%]
rstdoc/retable.py ..                                                     [  4%]
test/test_dcx.py ....................................................... [ 15%]
........................................................................ [ 29%]
........................................................................ [ 43%]
........................................................................ [ 56%]
........................................................................ [ 70%]
........................................................................ [ 84%]
............................                                             [ 90%]
test/test_fromdocx.py .                                                  [ 90%]
test/test_rst_tables.py ..................................               [ 96%]
test/test_unretable.py ................                                  [100%]

---------- coverage: platform linux, python 3.10.9-final-0 -----------
Name                  Stmts   Miss  Cover   Missing
---------------------------------------------------
rstdoc/__init__.py        2      0   100%
rstdoc/dcx.py          2116    202    90%   50-51, 55-56, 59-60, 278-279, 283-284, 289-291, 296-297, 313-318, 353-354, 734-736, 741, 786, 790-791, 799, 802, 835, 867-872, 896, 951, 1014, 1028-1029, 1130, 1149, 1156, 1158-1160, 1178-1179, 1220-1221, 1231-1232, 1265, 1277-1279, 1304-1305, 1337-1338, 1461-1463, 1618-1620, 1628-1629, 1639-1640, 1670, 1679, 1693-1694, 1700-1701, 1919-1920, 1926-1927, 1945, 1954, 2021, 2177-2178, 2182, 2206-2207, 2461, 2532, 2550, 2565-2569, 2588, 2593, 2595, 2717, 2754-2756, 2816-2818, 2826-2834, 2853, 2904-2905, 2934, 2955-2957, 3038, 3089, 3101, 3179, 3328, 3386, 3509-3510, 3525-3526, 3570, 3803-3809, 3848-3881, 3921, 3935, 3994, 4004, 4008-4013, 4153-4154, 4164-4171, 6402, 6476-6477, 6492-6493, 6726-6727, 6756, 6770, 6779, 6800-6801, 6804
rstdoc/fromdocx.py      164    135    18%   83-86, 90, 94-95, 99, 110-134, 139-142, 147-159, 164-180, 185-206, 211-213, 228-304, 323-333, 337
rstdoc/listtable.py     107     11    90%   211-233, 236, 252-254, 263
rstdoc/reflow.py        167     14    92%   365-395, 398, 400, 402, 416-418, 427
rstdoc/reimg.py          83     14    83%   120-122, 150-164, 167, 181-183, 187, 196
rstdoc/retable.py       267     30    89%   238, 318-319, 425, 486-528, 532
rstdoc/untable.py       131     13    90%   88, 102-103, 243-259, 262, 275-277, 287
rstdoc/wafw.py           86     58    33%   36-41, 47-55, 60-63, 70-84, 88-95, 98-107, 111-114, 117-128
---------------------------------------------------
TOTAL                  3123    477    85%


======================= 518 passed in 1271.21s (0:21:11) =======================

Tests

rstdcx, dcx.py

test_lnkname:

Test the extraction of the name for different kinds of targets:

header, figure, list-table, table,
code-block, code, math, definition (:id:)
test_dcx_regex:

Test the regular expressions used in dcx.py.

test_rstincluded:
 

Tests dcx.rstincluded.

test_init:

Tests the initialization of a sample directory tree with the --stpl tmp or --rest tmp options.

test_dcx_alone_samples:
 

Tests calling rstdcx/dcx.py without parameters.

test_dcx_in_out:
 

Tests calling rstdcx/dcx.py with in-file or standard in to standard out.

test_dcx_out_file:
 

Tests calling rstdcx/dcx.py with in-file and out-file and out type parameter.

test_make_samples:
 

Tests building the samples with Makefile

test_waf_samples:
 

Tests running Waf on the sample projects.

test_docparts_after:
 

Tests dcx.doc_parts with different parameters for documentation extraction.

test_convert_with_images_no_outinfo:
 

Tests dcx.convert with images on the fly in rest.stpl files for different targets.

test_include_cmd:
 

Tests rstdcx with -I option and .rest.stpl files generating images on the fly and embedding for HTML and DOCX.

RST tables

These tests mostly originate from the history of vim-rst-tables.

testCreateTable:
 

Test retable.reformat_table by creating a grid table from lines where columns are separated by two blanks.

testReformatEmpty:
 

Tests retable.reformat_table with a table with an empty cell.

testReflowTable:
 

Tests retable.reflow_table with a table whose start line was reduced.

testReflowWithReplacements:
 

Tests retable.reflow_table with a table containing replacement substitutions with successive rows reduced in length.

testReflowWithLineBreak:
 

Tests retable.reflow_table with a successive line lengthened.

testReTitle:

Tests retable.re_title on a fixture file.

testCreateFromData:
 

Tests creation of table from data (retable.create_rst_table).

rstdoc

Purpose

rstdoc(1) Version 1.8.2 | rstdoc

See background and documentation.

Many companies use DOCX and thus produce an information barrier. Working with text is more integrated in the (software) development process. A final format can be DOCX, but, at least during development, text is better.

Sphinx is an extension of Docutils used for many (software) projects, but it does not support creation of DOCX files, which certain companies demand. Pandoc does support DOCX, but does not support the Sphinx extensions, hence :ref: and the like cannot be used.

This python package supports working with RST as documentation format without depending on Sphinx.

  • link RST documents using substitutions (generated in _links_xxx.r?st)
  • create a .tags file to jump around in an editor that supports ctags
  • RST handling with python: reformat/create RST tables
  • post-process Pandoc’s conversion from DOCX to RST
  • pre-process Pandoc’s conversion from RST to DOCX
  • Support in building with WAF (or Makefile)
    • expand SimpleTemplate template files .stpl
    • graphics files (.tikz, .svg, .dot, .uml, .eps or .stpl thereof, and .pyg) are converted to .png and placed into ./_images or <updir>/_images, if there, else into current directory.
    • a gen file specifies how RST should be generated from source code files (see dcx.py)

The conventions used are shown

pip install rstdoc installs:

=========  ==============  ==========================================
Module     CLI Script      Description
=========  ==============  ==========================================
dcx        rstdcx, rstdoc  create .tags, labels and links
fromdocx   rstfromdocx     Convert DOCX to RST using Pandoc
listtable  rstlisttable    Convert RST grid tables to list-tables
untable    rstuntable      Converts certain list-tables to paragraphs
reflow     rstreflow       Reflow paragraphs and tables
reimg      rstreimg        Rename images referenced in the RST file
retable    rstretable      Transforms list tables to grid tables
=========  ==============  ==========================================

rstdcx

restructuredText sources are split into two types of files: main files considered by Sphinx, and included files. Which of .rest or .rst is main or included is determined by source_suffix in a <root>/conf.py or opposite to the extension of the included _links_sphinx.r?st file:

  • if you have .. include:: /_links_sphinx.rest, then the main file extension is .rst

rstdoc creates documentation (PDF, HTML, DOCX) from restructuredText (.rst, .rest) using either

rstdoc and rstdcx command line tools call dcx.py, which

  • creates .tags to jump around with the editor
  • handles .stpl files
  • processes gen files (see examples produced by --rest)
  • creates links files (_links_docx.r?st, _links_sphinx.r?st, …)
  • forwards known files to either Pandoc, Sphinx or Docutils

See example at the end of dcx.py. It is supposed to be used with a build tool. make and waf examples are included.

  • Initialize example tree (add --rstrest to make .rst main and .rest included files):

    $ ./dcx.py --rest repo #repo/doc/{sy,ra,sr,dd,tp}.rest files OR
    $ ./dcx.py --stpl repo #repo/doc/{sy,ra,sr,dd,tp}.rest.stpl files
    $ ./dcx.py --ipdt repo #repo/pdt/AAA/{i,p,d,t}.rest.stpl files
    $ ./dcx.py --over repo #.rest all over
    
  • Only create .tags and _links_xxx.r?st:

    $ cd repo
    $ rstdoc
    
  • Create the docs (and .tags and _links_xxx.r?st) with make:

    $ make html #OR
    $ make epub #OR
    $ make latex #OR
    $ make docx #OR
    $ make pdf
    

    The latter two are done by Pandoc, the others by Sphinx.

  • Create the docs (and .tags and _links_xxx.r?st) with waf:

    Instead of using make one can load dcx.py (rstdoc.dcx) in waf. waf also considers all recursively included files, such that a change in any of them results in a rebuild. All files can have an additional .stpl extension to use SimpleTemplate.

    $ waf configure #also copies the latest version of waf in here
    $ waf --docs docx,sphinx_html,rst_odt
    $ #or you provide --docs during configure to always compile the docs

The following image language files should be parallel to the .r?st files. They are automatically converted to .png and placed into ./_images or <updir>/_images or else parallel to the file.

  • .tikz or .tikz.stpl. This needs LaTeX.

  • .svg or .svg.stpl

  • .dot or .dot.stpl

    This needs graphviz.

  • .uml or .uml.stpl

    This needs plantuml. Provide either

    • plantuml.bat with e.g. java -jar "%~dp0plantuml.jar" %* or
    • plantuml sh script with java -jar `dirname $BASH_SOURCE`/plantuml.jar "$@"
  • .eps or .eps.stpl encapsulated PostScript files.

    This needs inkscape.

  • .pyg contains python code that produces a graphic. If the python code defines a to_svg or a save_to_png function, then that is used, to create a png. Else the following is tried

    • pyx.canvas.canvas from the pyx library or
    • cairocffi.Surface from cairocffi
    • matplotlib. If matplotlib.pyplot.get_fignums()>1 the figures result in <name><fignum>.png

    The same code or the file names can be used in a .r?st.stpl file with pngembed() or dcx.svgembed() to embed in html output.

    {{!svgembed("egpyx.pyg",outinfo)}}
    <%
    ansvg=svgembed('''
    from svgwrite import cm, mm, drawing
    d=drawing.Drawing(viewBox=('0 0 300 300'))
    d.add(d.circle(center=(2*cm, 2*cm), r='1cm', stroke='blue', stroke_width=9))
    '''.splitlines(),outinfo)
    %>
    {{!ansvg}}
    

Conventions

Files

  • main files and included files: .rest, .rst or vice versa. .txt are for literally included files (use :literal: option).
  • templates rendered separately: *.rest.stpl and *.rst.stpl; included templates: *.rst.tpl. Template lookup is done in . and .. with respect to the current file.
    • with %include('some.rst.tpl', param="test") with optional parameters
    • with %globals().update(include('utility.rst.tpl')) if it contains only definitions

Links

  • .. _`id`: are reST targets. reST targets should not be template-generated. The template files should have a higher or equal number of targets than the generated file, in order for tags to jump to the template original. If one wants to generate reST targets, then this should better happen in a previous step, e.g. with gen files mentioned above.
  • References use replacement substitutions: |id|.
  • If you want an overview of the linking (traceability), add .. include:: _traceability_file.rst to index.rest or another .rest parallel to it. It is there in the example project, to include it in tests. _traceability_file.{svg,png,rst} are all in the same directory.
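The target and reference conventions can be checked mechanically. The sketch below finds `.. _`id`:` targets and reports `|id|` references that have no target. It is not how rstdoc generates its link files; the regular expressions and function names are assumptions made for this example.

```python
import re

TARGET = re.compile(r"^\.\. _`?([^`:]+)`?:\s*$")
REFERENCE = re.compile(r"\|(\w+)\|")


def find_targets(lns):
    """Return (line index, id) for every reST target line."""
    found = []
    for i, ln in enumerate(lns):
        m = TARGET.match(ln)
        if m:
            found.append((i, m.group(1)))
    return found


def dangling_references(lns):
    """Return referenced ids, in order of appearance, that have no target."""
    targets = {name for _, name in find_targets(lns)}
    missing = []
    for ln in lns:
        for name in REFERENCE.findall(ln):
            if name not in targets and name not in missing:
                missing.append(name)
    return missing


doc = [".. _`r1p`:", "", "r1p: some requirement", "", "See |r1p| and |sxr|."]
```

Here `|sxr|` is reported, because its target would live in another file of the same link root.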

Link files are created in link roots, which are folders where the first main file (.rest or .rst) is encountered during depth-first traversal. Non-overlapping link root paths produce separately linked file sets.
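A minimal sketch of how link roots could be located is given here. It assumes that `os.walk` order is an acceptable stand-in for the traversal in dcx.py and that only files ending in the plain main extension count; `link_roots` is a hypothetical name.

```python
import os


def link_roots(top, main_ext=".rest"):
    """Return the folders where a main file is first met while walking down."""
    roots = []
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()  # deterministic traversal order
        if any(name.endswith(main_ext) for name in filenames):
            roots.append(dirpath)
            dirnames[:] = []  # a link root covers everything below it
    return roots
```

With `doc/a.rest`, `doc/sub/b.rest` and `src/pdt/c.rest` in a tree, the link roots are `doc` and `src/pdt`; `doc/sub` belongs to the `doc` link root.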

.. include:: /_links_sphinx.r?st, with the one initial / instead of a relative or absolute path, will automatically search upward for the _links_xxx.r?st file (_sphinx is replaced by what is needed by the wanted target when the docs are generated).

Sphinx conf.py is augmented by configuration for Pandoc and Docutils. It should be where the input file is, or better at the project root to be usable with waf.

See the example project created with --rest/stpl/ipdt/over and the sources of the documentation of rstdoc.

rstdcx CLI

rstdcx is the same as rstdoc.

Without parameters: creates |substitution| links and .tags ctags for reST targets.

With two or three parameters: process file or dir to out file or dir through Pandoc, Sphinx, Docutils (third parameter):

  • html, docx, odt, pdf, … uses Pandoc.
  • rst_html, rst_odt, rst_pdf, … uses rst2html, …
  • sphinx_html, sphinx_pdf, … uses Sphinx. Sphinx provides a nice entry point via the sphinx bootstrap theme.

4th parameter onward become python defines usable in .stpl files.

Pdf output needs latex. Else you can make odt or docx and use

  • win: swriter.exe --headless --convert-to pdf Untitled1.odt
  • linux: lowriter --headless --convert-to pdf Untitled1.odt

Inkscape (.eps, .svg), Dot (.dot), PlantUML (.uml), latex (.tex,.tikz) are converted to .png into ./_images or <updir>/_images or ‘.’. Any of the files can be a SimpleTemplate template (xxx.yyy.stpl).

Configuration is in conf.py or ../conf.py.

rstdoc --stpl|--rest|--ipdt|--over creates sample project trees:

  • --stpl with .rest.stpl template files,
  • --rest with only a doc folder with .rest files,
  • --ipdt with inform-plan-do-test enhancement cycles,
  • --over with .rest files all over the project tree including symbolic links

Examples

Example folders (see wscript and Makefile there):

rstdoc --rest <folder> [--rstrest]
rstdoc --stpl <folder> [--rstrest]
rstdoc --ipdt <folder> [--rstrest]
rstdoc --over <folder> [--rstrest]

Use --rstrest to produce .rst for the main file, as .rest is not recognized by github/gitlab. They also don’t support file inclusion, so there is no need for two extensions anyway.

Example usages with the files generated by rstdoc --stpl tmp:

cd tmp/doc
rstdcx   #expand .stpl and produce .tags and _links_xxx files

#expand stpl and append substitutions (for simple expansion use ``stpl <file> .``)
rstdcx dd.rest.stpl - rest           # expand to stdout, appending dd.html substitutions, to pipe to Pandoc
rstdcx dd.rest.stpl - html.          # as before
rstdcx dd.rest.stpl - docx.          # expand to stdout, appending dd.docx substitutions, to pipe to Pandoc
rstdcx dd.rest.stpl - newname.docx.  # expand template, appending substitutions for target newname.docx
rstdcx dd.rest.stpl - html           # expand to stdout, already process through Pandoc to produce html on stdout
rstdcx dd.rest.stpl                  # as before
rstdcx sy.rest.stpl - rst_html       # expand template, already process through Docutils to produce html on stdout
stpl sy.rest.stpl | rstdcx - - sy.html. # appending sy.html substitutions, e.g. to pipe to Pandoc
stpl dd.rest.stpl | rstdcx - - dd.html  # appending dd.html substitutions and produce html on stdout via Pandoc
rstdcx dd.rest.stpl dd.rest          # expand into dd.rest, appending substitutions for target dd.html
rstdcx dd.rest.stpl dd.html html     # expand template, process through Pandoc to produce dd.html
rstdcx dd.rest.stpl dd.html          # as before
rstdcx dd.rest.stpl dd.html rst_html # expand template, already process through Docutils to produce dd.html
rstdcx dd.rest.stpl dd.docx          # expand template, process through Pandoc to produce dd.docx
rstdcx dd.rest.stpl dd.odt pandoc    # expand template, process through Pandoc to produce dd.odt
rstdcx dd.rest.stpl dd.odt           # as before
rstdcx dd.rest.stpl dd.odt rst_odt   # expand template, process through Docutils to produce dd.odt
rstdcx dd.rest.stpl dd.odt rst       # as before
rstdcx . build html                  # convert current dir to build output dir using pandoc
rstdcx . build sphinx_html           # ... using sphinx (if no index.rest, every file separately)

#Sphinx is not file-oriented
#but with rstdcx you need to provide the files to give Sphinx ``master_doc`` (normally: index.rest)
#Directly from ``.stpl`` does not work with Sphinx
rstdcx index.rest ../build/index.html sphinx_html   # via Sphinx the output directory must be different

#convert the graphics and place them into _images or <updir>/_images
#if no _images directory exists they will be placed into the same directory
rstdcx egcairo.pyg
rstdcx egdot.dot.stpl
rstdcx egeps.eps
rstdcx egother.pyg
rstdcx egplt.pyg
rstdcx egpygal.pyg
rstdcx egpyx.pyg
rstdcx egsvg.svg.stpl
rstdcx egtikz.tikz
rstdcx egtikz1.tikz
rstdcx eguml.uml

#Convert graphics to a png (even if _images directory exists):
rstdcx eguml.uml eguml.png

#Files to other files:

rstdoc dd.rest.stpl dd.rest
rstdoc dd.rest.stpl dd.html html
rstdoc dd.rest.stpl dd.html
rstdoc sr.rest.stpl sr.html rst_html
rstdoc dd.rest.stpl dd.docx
rstdoc dd.rest.stpl dd.odt pandoc
rstdoc dd.rest.stpl dd.odt
rstdoc sr.rest.stpl sr.odt rst_odt
rstdoc sr.rest.stpl sr.odt rst
rstdoc index.rest build/index.html sphinx_html

#Directories to other directories with out info:

rstdoc . build html
rstdoc . build sphinx_html

Grep with python re in .py, .rst, .rest, .stpl, .tpl:

rstdoc --pygrep inline

Grep for keyword lines containing ‘png’:

rstdoc --kw png

Default keyword lines:

.. {{{kw1,kw2
.. {kw1,kw2}
{{_ID3('kw1 kw2')}}
%__ID3('kw1 kw2')
:ID3: kw1 kw2

API

import rstdoc.dcx as dcx

The functions in dcx.py are available to the gen_xxx(lns,**kw) functions (dhy).

dcx.is_project_root_file:
 
def is_project_root_file(filename):
    return filename=='.git' or filename=='waf' or filename=='Makefile' or filename.lower().startswith('readme')

Identifies the root of the project by a file name contained there.

dcx.DPI:
DPI = 600

Used for png creation.

dcx.g_config:
g_config = None

g_config can be used to inject a global config. This overrides the defaults and is overridden by an updir conf.py.

dcx.cmd:
def cmd(cmdlist, **kwargs):

Runs cmdlist via subprocess.run and returns stdout. In case of problems RstDocError is raised.

param cmdlist:command as list
param kwargs:arguments forwarded to subprocess.run()
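The described behavior can be approximated as below. This is a sketch, not the dcx.py source: `CmdError` stands in for RstDocError, and returning decoded text (rather than bytes) is an assumption.

```python
import subprocess
import sys


class CmdError(Exception):
    """Hypothetical stand-in for RstDocError."""


def cmd_sketch(cmdlist, **kwargs):
    """Run a command given as list and return its standard output as text."""
    kwargs.setdefault("stdout", subprocess.PIPE)
    kwargs.setdefault("stderr", subprocess.PIPE)
    kwargs.setdefault("universal_newlines", True)
    kwargs.setdefault("check", True)  # non-zero exit raises CalledProcessError
    try:
        result = subprocess.run(cmdlist, **kwargs)
    except (OSError, subprocess.CalledProcessError) as err:
        raise CmdError("command failed: {}".format(cmdlist)) from err
    return result.stdout


output = cmd_sketch([sys.executable, "-c", "print('ok')"])
```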
dcx.new_cwd:
@contextlib.contextmanager
def new_cwd(apth):

Use as:

with new_cwd(dir):
    #inside that directory
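A context manager with this behavior can be written with the standard library alone. The following is a sketch of the idea, with the hypothetical name `new_cwd_sketch`; the actual implementation may differ in details.

```python
import contextlib
import os


@contextlib.contextmanager
def new_cwd_sketch(apth):
    """Change into apth for the duration of the with block."""
    previous = os.getcwd()
    os.chdir(apth)
    try:
        yield
    finally:
        os.chdir(previous)  # restore even if the block raised
```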
dcx.startfile:
def startfile(filepath):

Extends the Python startfile to non-Windows platforms

dcx.up_dir:
def up_dir(match,start=None):

Find a parent path producing a match on one of its entries. Without a match an empty string is returned.

param match:a function returning a bool on a directory entry
param start:absolute path or None
return:directory with a match on one of its entries
>>> up_dir(lambda x: False)
''
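The upward search can be sketched as follows. Whether the start directory itself is inspected is an assumption of this sketch, and `up_dir_sketch` is a hypothetical name.

```python
import os


def up_dir_sketch(match, start=None):
    """Walk from start towards the file system root until match() accepts an entry."""
    current = os.path.abspath(start or os.getcwd())
    while True:
        try:
            entries = os.listdir(current)
        except OSError:
            entries = []  # unreadable directories are skipped
        if any(match(entry) for entry in entries):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the root without a match
            return ""
        current = parent
```

Passing dcx.is_project_root_file as `match` would locate the project root in this way.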
dcx.tempdir:
def tempdir():

Make temporary directory and register it to be removed with atexit.

This can be used inside a .stpl file to create images from inlined image sources, place them in temporary files, and include them in the final .docx or .odt.
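The combination of a temporary directory with cleanup at interpreter exit can be sketched like this. `tempdir_sketch` is a hypothetical name and the code only illustrates the mechanism.

```python
import atexit
import os
import shutil
import tempfile


def tempdir_sketch():
    """Create a temporary directory that is removed when Python exits."""
    path = tempfile.mkdtemp()
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path


image_dir = tempdir_sketch()
image_path = os.path.join(image_dir, "inline.png")  # place generated images here
```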

dcx.run_inkscape:
 
def run_inkscape(infile,  outfile, dpi=DPI):

Uses inkscape commandline to convert to .png

param infile:.svg, .eps, .pdf filename string (for list with actual .eps or .svg data use dcx.svgpng or dcx.epspng)
param outfile:.png file name
dcx.rst_sphinx:
@infile_cwd
def rst_sphinx(
        infile, outfile, outtype=None, **config
        ):

Run Sphinx on infile.

param infile:.txt, .rst, .rest filename
param outfile:the path to the target file (not target directory)
param outtype:html, latex,… or any other sphinx writer
param config:keys from config_defaults
>>> olddir = os.getcwd()
>>> cd(dirname(__file__))
>>> cd('../doc')

>>> infile, outfile = ('index.rest',
... '../build/doc/sphinx_html/index.html')
>>> rst_sphinx(infile, outfile) #doctest: +ELLIPSIS
>>> exists(outfile)
True

>>> infile, outfile = ('dd.rest',
... '../build/doc/sphinx_html/dd.html')
>>> rst_sphinx(infile, outfile) #doctest: +ELLIPSIS
>>> exists(outfile)
True

>>> infile, outfile = ('dd.rest',
... '../build/doc/sphinx_latex/dd.tex')
>>> rst_sphinx(infile, outfile) #doctest: +ELLIPSIS
>>> exists(outfile)
True

>>> cd(olddir)
dcx.g_include:
g_include = []

One can append paths to rstdoc.dcx.g_include for stpl expansion or finding other files.

dcx.rst_pandoc:
@infile_cwd
def rst_pandoc(
        infile, outfile, outtype, **config
        ):

Run Pandoc on infile.

param infile:.txt, .rst, .rest filename
param outfile:the path to the target document
param outtype:html,…
param config:keys from config_defaults
dcx.rst_rst2:
@infile_cwd
def rst_rst2(
        infile, outfile, outtype, **config
        ):

Run the rst2xxx docutils frontend tool on infile.

param infile:.txt, .rst, .rest filename
param outfile:the path to the target document
param outtype:html,…
param config:keys from config_defaults
dcx.PageBreakHack:
 
def PageBreakHack(destination_path):

This introduces a PageBreak style into content.xml to allow the following raw page break of opendocument odt:

.. raw:: odt

    <text:p text:style-name="PageBreak"/>

This is not a good solution, as it introduces an empty line at the top of the new page.

Unfortunately the following does not work with or without text:use-soft-page-breaks="true"

.. for docutils
.. raw:: odt

    <text:p text:style-name="PageBreak"/>

.. for pandoc
.. raw:: opendocument

    <text:p text:style-name="PageBreak"/>

According to C066363e.pdf it should work.

See utility.rst.tpl in the --stpl created example project tree.

dcx.svgpng:
@png_post_process_if_any
@normoutfile
@readin
def svgpng(infile, outfile=None, *args, **kwargs):

Converts a .svg file to a png file.

param infile:a .svg file name or list of lines
param outfile:if not provided the input file with new extension .png in ./_images, <updir>/_images or parallel to infile.
dcx.texpng:
@png_post_process_if_any
@partial(in_temp_if_list, suffix='.tex')
@infile_cwd
def texpng(infile, outfile=None, *args, **kwargs):

Latex has several graphic packages, like

  • tikz
  • chemfig

that can be converted to .png with this function.

For .tikz file use dcx.tikzpng.

param infile:a .tex file name or list of lines (provide outfile in the latter case)
param outfile:if not provided, the input file with .png in ./_images, <updir>/_images or parallel to infile.
dcx.tikzpng:
tikzpng = normoutfile(readin(_tikzwrap(_texwrap(texpng))))

Converts a .tikz file to a png file.

See dcx.texpng.

dcx.dotpng:
@png_post_process_if_any
@partial(in_temp_if_list, suffix='.dot')
@infile_cwd
def dotpng(
        infile,
        outfile=None,
        *args,
        **kwargs
        ):

Converts a .dot file to a png file.

param infile:a .dot file name or list of lines (provide outfile in the latter case)
param outfile:if not provided the input file with new extension .png in ./_images, <updir>/_images or parallel to infile.
dcx.umlpng:
@png_post_process_if_any
@partial(in_temp_if_list, suffix='.uml')
@infile_cwd
def umlpng(
        infile,
        outfile=None,
        *args,
        **kwargs
        ):

Converts a .uml file to a png file.

param infile:a .uml file name or list of lines (provide outfile in the latter case)
param outfile:if not provided the input file with new extension .png in ./_images, <updir>/_images or parallel to infile.
dcx.epspng:
@png_post_process_if_any
@partial(in_temp_if_list, suffix='.eps')
@infile_cwd
def epspng(
        infile,
        outfile=None,
        *args,
        **kwargs):

Converts an .eps file to a png file using inkscape.

param infile:a .eps file name or list of lines (provide outfile in the latter case)
param outfile:if not provided the input file with new extension .png in ./_images, <updir>/_images or parallel to infile.
dcx.pygpng:
@png_post_process_if_any
@normoutfile
@readin
@infile_cwd
def pygpng(
        infile, outfile=None, *args,
        **kwargs
        ):

Converts a .pyg file to a png file.

.pyg contains python code that produces a graphic. If the python code defines a to_svg or a save_to_png function, then that is used. Else the following is tried

  • pyx.canvas.canvas from the pyx library or
  • svgwrite.drawing.Drawing from the svgwrite library or
  • cairocffi.Surface from cairocffi
  • matplotlib. If matplotlib.pyplot.get_fignums()>1 the figures result in <name><fignum>.png
param infile:a .pyg file name or list of lines (provide outfile in the latter case)
param outfile:if not provided the input file with new extension .png in ./_images, <updir>/_images or parallel to infile.
dcx.pygsvg:
@readin
@infile_cwd
def pygsvg(infile, *args, **kwargs):

Converts a .pyg file or according python code to an svg string.

.pyg contains python code that produces an SVG graphic. Either there is a to_svg() function or the following is tried

  • io.BytesIO containing SVG, e.g via cairo.SVGSurface(ioobj,width,height)
  • io.StringIO containing SVG
  • object with attribute _repr_svg_
  • svgwrite.drawing.Drawing from the svgwrite library or
  • cairocffi.SVGSurface from cairocffi
  • matplotlib.
param infile:a .pyg file name or list of lines
dcx.svgembed:
def svgembed(
        pyg_or_svg, outinfo, *args, **kwargs
        ):

If outinfo ends with html, SVG is embedded. Else the SVG is converted to a temporary image file and included in the DOCX or ODT zip.

dcx.pngembed:
def pngembed(
        pyg_or_pngfile, outinfo, *args, **kwargs
        ):

If outinfo ends with html, the PNG is embedded. Else the PNG is included in the DOCX or ODT zip.

dcx.dostpl:
@infile_cwd
def dostpl(
        infile,
        outfile=None,
        lookup=None,
        **kwargs
        ):

Expands an .stpl file.

The whole rstdoc.dcx namespace is forwarded to the template code.

.stpl is unrestrained python:

  • e.g. one can create temporary images, which are then included in the final .docx or .odt. See dcx.tempdir.
param infile:a .stpl file name or list of lines
param outfile:if not provided the expanded lines are returned
param lookup:lookup paths can be absolute or relative to infile
>>> infile = ['hi {{2+3}}!']
>>> dostpl(infile)
['hi 5!']
dcx.dorst:
def dorst(
        infile,
        outfile=io.StringIO,
        outinfo=None,
        fn_i_ln=None,
        **kwargs
        ):

Default interpreted text role is set to math. The link lines are added to the rest file or rst lines

param infile:

a .rest, .rst, .txt file name or list of lines

param outfile:

None and ‘-’ mean standard out.

If io.StringIO, then the lines are returned. |xxx| substitutions for reST link targets in infile are appended if no _links_sphinx.rst there

param outinfo:

specifies the tool to use.

  • html, docx, odt,… via pandoc
  • sphinx_html,… via sphinx
  • rst_html,… via rst2xxx frontend tools

General format of outinfo:

[infile/][tgtfile.]docx[.]

infile is used if the function's infile parameter is a list of lines.

tgtfile is target file used in links.

tgtfile is the target file to create. A final dot tells not to create the target file. This is of use in the command line if piping a file to rstdoc then to pandoc. The doc will only be generated by pandoc, but links need to know the doc to link to already before that.
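The general format can be taken apart as in the following sketch. The helper `parse_outinfo` and the returned dictionary are hypothetical; dcx.py may split the string differently, in particular for paths containing several dots.

```python
def parse_outinfo(outinfo):
    """Split '[infile/][tgtfile.]type[.]' into its parts."""
    create = not outinfo.endswith(".")  # a final dot: do not create the target
    core = outinfo if create else outinfo[:-1]
    infile, _, rest = core.rpartition("/")
    tgtfile, _, outtype = rest.rpartition(".")
    return {
        "infile": infile or None,
        "tgtfile": tgtfile or None,
        "outtype": outtype,
        "create": create,
    }


plain = parse_outinfo("html")
piped = parse_outinfo("newname.docx.")
```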

param fn_i_ln:

(fn, i, ln) of the .stpl with all stpl includes sequenced (used by dcx.convert)

>>> olddir = os.getcwd()
>>> cd(dirname(__file__))
>>> cd('../doc')

>>> dorst('dd.rest') #doctest: +ELLIPSIS
['.. default-role:: math\n', ...

>>> dorst('ra.rest.stpl') #doctest: +ELLIPSIS
['.. default-role:: math\n', ...

>>> dorst(['hi there']) #doctest: +ELLIPSIS
['.. default-role:: math\n', '\n', 'hi there\n', ...

>>> dorst(['hi there'], None,'html') #doctest: +ELLIPSIS
<!DOCTYPE html>
...

>>> drst=lambda x,y: dorst(x,y,None,pandoc_doc_optref={'docx':'--reference-doc doc/reference.'+y.split('.')[1]})
>>> dorst('ra.rest.stpl','ra.docx') #doctest: +ELLIPSIS
>>> exists('ra.docx')
True
>>> rmrf('ra.docx')
>>> exists('ra.docx')
False
>>> rmrf('ra.rest.stpl.rest')
>>> exists('ra.rest.stpl.rest')
False

>>> dorst(['hi there'],'test.html') #doctest: +ELLIPSIS
>>> exists('test.html')
True
>>> rmrf('test.html')
>>> exists('test.html')
False
>>> rmrf('rest.rest.rest')
>>> exists('rest.rest.rest')
False

>>> dorst(['hi there'],'test.odt','rst') #doctest: +ELLIPSIS
>>> exists('rest.rest.rest')
True
>>> rmrf('rest.rest.rest')
>>> exists('rest.rest.rest')
False
>>> exists('test.odt')
True
>>> rmrf('test.odt')
>>> exists('test.odt')
False
>>> cd(olddir)
dcx.convert:
def convert(
        infile,
        outfile=io.StringIO,
        outinfo=None,
        **kwargs
        ):

Converts any of the known files.

Stpl files are forwarded to the next converter.

The main job is to normalize the input params, because this is called from dcx.main and via Python. It forwards to the right converter.

Examples:

>>> olddir = os.getcwd()
>>> cd(dirname(__file__))
>>> cd('../doc')

>>> convert([' ','   hi {{2+3}}!'], outinfo='rest')
['   .. default-role:: math\n', '\n', ' \n', '   hi 5!\n', '\n']

>>> convert([' ','   hi {{2+3}}!'])  #doctest: +ELLIPSIS
['<!DOCTYPE html>\n', ...]
>>> rmrf('rest.rest.rest')

>>> infile, outfile, outinfo = ([
... "newpath {{' '.join(str(i)for i in range(4))}} rectstroke showpage"
... ],'tst.png','eps')
>>> 'tst.png' in convert(infile, outfile, outinfo) #doctest: +ELLIPSIS
True
>>> exists('tst.png')
True
>>> rmrf('tst.png')
>>> exists('tst.png')
False

>>> convert('ra.rest.stpl') #doctest: +ELLIPSIS
['<!DOCTYPE html>\n', ...

>>> cnvrt=lambda x,y: convert(x,y,None,pandoc_doc_optref={'docx':'--reference-doc doc/reference.'+y.split('.')[1]})
>>> cnvrt('ra.rest.stpl','ra.docx')
>>> exists('ra.rest.rest')
True
>>> rmrf('ra.rest.rest')
>>> exists('ra.rest.rest')
False
>>> exists('ra.docx')
True
>>> rmrf('ra.docx')
>>> exists('ra.docx')
False

>>> convert('dd.rest', None,'html') #doctest: +ELLIPSIS
<!DOCTYPE html>
...
>>> exists('dd.rest.rest')
True
>>> rmrf('dd.rest.rest')
>>> exists('dd.rest.rest')
False
>>> cd(olddir)
param infile:a file of type .tikz, .svg, .dot, .uml, .eps or .pyg; for anything else stpl is assumed. Can also be a list of lines.
param outfile:- means standard out, else a file name, or None for automatic (using outinfo), or io.StringIO to return lines instead of stdout
param outinfo:html, sphinx_html, docx, odt, file.docx,… interpret the input as rest; any other value specifies the graph type
dcx.convert_in_tempdir:
 
convert_in_tempdir = in_temp_if_list(infile_cwd(convert))

Same as dcx.convert, but creates a temporary directory when the infile argument is a list of lines.

>>> tmpfile = convert_in_tempdir("""digraph {
... %for i in range(3):
...    "From {{i}}" -> "To {{i}}";
... %end
...    }""".splitlines(), outinfo='dot')
>>> stem_ext(tmpfile)[1]
'.png'
>>> tmpfile = convert_in_tempdir("""
... This is re{{'st'.upper()}}
...
... .. _`xx`:
...
... xx:
...     text
...
... """.splitlines(), outinfo='rst_html')
>>> stem_ext(tmpfile)[1]
'.html'
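The wrapping idea behind convert_in_tempdir can be illustrated with a small decorator. This is a minimal sketch, not the code in dcx.py: the names in_temp_if_list_sketch and where_am_i are hypothetical, and the sketch assumes the wrapped function takes the input as its first positional argument.

```python
import functools
import os
import tempfile


def in_temp_if_list_sketch(func):
    """Run func inside a fresh temporary directory when infile is a list."""
    @functools.wraps(func)
    def wrapper(infile, *args, **kwargs):
        if not isinstance(infile, list):
            # A file name: run in the current directory.
            return func(infile, *args, **kwargs)
        olddir = os.getcwd()
        # The directory is deliberately not deleted here, because the
        # caller still needs the generated output file.
        os.chdir(tempfile.mkdtemp())
        try:
            return func(infile, *args, **kwargs)
        finally:
            os.chdir(olddir)
    return wrapper


@in_temp_if_list_sketch
def where_am_i(infile):
    # Hypothetical stand-in for a converter: report the working directory.
    return os.getcwd()
```

Called with a file name, where_am_i reports the current directory; called with a list of lines, it reports a new temporary directory, and the original working directory is restored afterwards.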
dcx.rindices:
def rindices(regex, lns):

Return the indices matching the regular expression regex.

param regex:regular expression string or compiled
param lns:lines
>>> lns=['a','ab','b','aa']
>>> [lns[i] for i in rindices(r'^a\w*', lns)]==['a', 'ab', 'aa']
True
dcx.rlines:
def rlines(regex, lns):

Return the lines matched by regex.

param regex:regular expression string or compiled
param lns:lines
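The relation between rindices and rlines can be sketched as follows. This is an illustration under assumptions, not the code in dcx.py: the _sketch names are hypothetical, and the use of search (rather than match) is a guess.

```python
import re


def rindices_sketch(regex, lns):
    """Yield the index of every line that regex matches."""
    pattern = re.compile(regex)  # also accepts an already compiled pattern
    for i, ln in enumerate(lns):
        if pattern.search(ln):  # assumption: search, not match
            yield i


def rlines_sketch(regex, lns):
    """Return the matching lines themselves."""
    return [lns[i] for i in rindices_sketch(regex, lns)]
```

With lns=['a','ab','b','aa'] and the regex r'^a\w*', the indices are 0, 1 and 3, and the lines are 'a', 'ab' and 'aa'.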
dcx.doc_parts:
def doc_parts(
        lns,
        relim=r"^\s*r?'''([\w.:]*)\s*\n*$",
        reid=r"\s(\w+)[(:]|(\w+)\s\=",
        reindent=r'[^#/\s]',
        signature=None,
        prefix=''
        ):

doc_parts() yields the doc parts delimited by the relim regular expression, together with an ID if reid matches.

If the start and stop delimiters differ, use the regular expression | in relim.

  • There is no empty line between doc string and preceding code lines that should be included.
  • There is no empty line between doc string and succeeding code lines that should be included.
  • Included code lines end with an empty line.

In case of __init__() the ID can come from the class line and the included lines can be those of __init__(), if there is no empty line between the doc string and the class line above, nor between the doc string and __init__() below.

If the included code comes only from one side of the doc string, have an empty line at the other side.

Immediately after the initial doc string marker there can be a prefix, e.g. classname..

param lns:list of lines
param relim:regular expression marking lines enclosing the documentation. The group is a prefix.
param reid:extract id from preceding or succeeding non-empty lines
param reindent:determines start of text
param signature:
 if signature language is given the preceding or succeeding lines will be included
param prefix:prefix to make id unique, e.g. module name. Include the dot.
>>> with open(__file__) as f:
...     lns = f.readlines()
...     docparts = list(doc_parts(lns, signature='py'))
...     doc_parts_line = rlines('doc_parts', docparts)
>>> doc_parts_line[1]
':doc_parts:\n'
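To see what the default relim accepts, the following sketch applies it to a few sample lines. The regular expression is copied from the signature above; the helper delimiter_lines and the sample list are hypothetical and only illustrate the delimiter matching, not the full doc_parts() logic.

```python
import re

# Default relim from the doc_parts signature above.
RELIM = r"^\s*r?'''([\w.:]*)\s*\n*$"


def delimiter_lines(lns, relim=RELIM):
    """Return (line index, prefix) for every line matching relim."""
    rx = re.compile(relim)
    found = []
    for i, ln in enumerate(lns):
        m = rx.match(ln)
        if m:
            found.append((i, m.group(1)))
    return found


sample = [
    "def f(x):\n",
    "    '''\n",
    "    Doc text for f.\n",
    "    '''\n",
    "    return x\n",
    "    '''classname.\n",
]
```

Only lines consisting of the triple quote, optionally followed by a prefix such as classname., count as delimiters; lines with other text after the marker do not match.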
dcx.rstincluded:
 
@_memoized
def rstincluded(
        fn,
        paths=(),
        withimg=False,
        withrest=False
        ):

Yield the files recursively included from an RST file.

param fn:file name without path
param paths:paths where to look for fn
param withimg:also yield image files, not just other RST files
param withrest:rest files are not supposed to be included
>>> olddir = os.getcwd()
>>> cd(dirname(__file__))
>>> list(rstincluded('ra.rest',('../doc',)))
['ra.rest.stpl', '_links_sphinx.rst']
>>> list(rstincluded('sr.rest',('../doc',)))
['sr.rest', '_links_sphinx.rst']
>>> list(rstincluded('meta.rest',('../doc',)))
['meta.rest', 'files.rst', '_traceability_file.rst', '_links_...']
>>> 'dd.rest' in list(rstincluded(
... 'index.rest',('../doc',), False, True))
True
>>> cd(olddir)
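The recursive lookup can be sketched as below. This is a simplified illustration, not the rstincluded implementation: included_sketch is a hypothetical name, it only follows ``.. include::`` directives, and it ignores images and the fallback to .stpl files seen in the examples above.

```python
import os
import re

INCLUDE = re.compile(r'^\s*\.\.\s+include::\s*(\S+)')


def included_sketch(fn, paths=('.',), _seen=None):
    """Yield fn and, recursively, every file it includes."""
    seen = set() if _seen is None else _seen
    if fn in seen:
        return  # guard against include cycles
    full = next(
        (os.path.join(p, fn) for p in paths
         if os.path.isfile(os.path.join(p, fn))),
        None)
    if full is None:
        return  # not found in any search path
    seen.add(fn)
    yield fn
    with open(full, encoding='utf-8') as f:
        for line in f:
            m = INCLUDE.match(line)
            if m:
                yield from included_sketch(m.group(1), paths, seen)
```

Files that cannot be found in any of the search paths are silently skipped in this sketch; whether the real function does the same is not stated here.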
dcx.pair:
def pair(alist, blist, cmp):

pair two sorted lists where the second must be at least as long as the first

param alist:first list
param blist:second list longer or equal to alist
param cmp:compare function
>>> alist=[1,2,4,7]
>>> blist=[1,2,3,4,5,6,7]
>>> cmp = lambda x,y: x==y
>>> list(pair(alist,blist,cmp))
[(1, 1), (2, 2), (None, 3), (4, 4), (None, 5), (None, 6), (7, 7)]

>>> alist=[1,2,3,4,5,6,7]
>>> blist=[1,2,3,4,5,6,7]
>>> cmp = lambda x, y: x==y
>>> list(pair(alist, blist, cmp))
[(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)]
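One way to obtain the pairing shown in the examples is a single pass over blist. The following is a sketch under the stated precondition (both lists sorted, every element of alist has a partner in blist); pair_sketch is a hypothetical name and not the code in dcx.py.

```python
def pair_sketch(alist, blist, cmp):
    """Pair each element of blist with its partner in alist, or with None."""
    i = 0
    for b in blist:
        if i < len(alist) and cmp(alist[i], b):
            yield alist[i], b
            i += 1  # this element of alist is consumed
        else:
            yield None, b  # no partner for b
```

For alist=[1,2,4,7] and blist=[1,2,3,4,5,6,7] this yields the same pairs as the first example above.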
dcx.gen:
def gen(
        source,
        target=None,
        fun=None,
        **kw
        ):

Take the gen_[fun] functions enclosed by #def gen_[fun](lns,**kw) to create a new file.

param source:either a list of lines or a path to the source code
param target:either save to this file or return the generated documentation
param fun:use #gen_<fun>(lns,**kw): to extract the documentation
param kw:kw arguments to the gen_<fun>() function
>>> source=[i+'\\n' for i in """
...        #def gen(lns,**kw):
...        #  return [l.split('#@')[1] for l in rlines(r'^\s*#@', lns)]
...        #def gen
...        #@some lines
...        #@to extract
...        """.splitlines()]
>>> [l.strip() for l in gen(source)]
['some lines', 'to extract']
dcx.parsegenfile:
 
def parsegenfile(genpth):

Parse the file genpth which is either

  • python code or

  • has format

    sourcefile | targetfile | suffix | kw parameters or {}

suffix refers to gen_<suffix>.

The yields are used for the dcx.gen function.

param genpth:path to gen file
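A line in the second format could be split as sketched below. The helper parse_gen_line is hypothetical, and it assumes that the last field is a Python dict literal; the sample file names in the usage note are invented for illustration.

```python
import ast


def parse_gen_line(line):
    """Split 'sourcefile | targetfile | suffix | kw' into its four parts."""
    source, target, suffix, kw = [part.strip() for part in line.split('|', 3)]
    # Assumption: the kw field is a Python literal such as {} or {'n': 2}.
    return source, target, suffix, ast.literal_eval(kw)
```

For example, parse_gen_line("src.py | out.rst | tst | {'n': 2}") gives the tuple ('src.py', 'out.rst', 'tst', {'n': 2}), which has the shape of the arguments of dcx.gen.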
dcx.RstFile.__init__:
 
class RstFile:
    def __init__(self, reststem, doc, tgts, lnks, nlns):

Contains the targets for a .rst or .rest file.

param reststem:rest file this doc belongs to (without extension)
param doc:doc belonging to reststem, either included or itself (.rest, .rst, .stpl)
param tgts:list of Tgt objects yielded by dcx.RstFile.make_tgts.
param lnks:list of (line index, target name (|target|)) tuples
param nlns:number of lines of the doc
dcx.RstFile.make_tgts:
 
@staticmethod
def make_tgts(
        lns,
        doc,
        counters=None,
        fn_i_ln=None
        ):

Yields ((line index, tag address), target, link name) for the lines lns of a restructuredText file. For a .stpl file the link name comes from the generated RST file.

param lns:lines of the document
param doc:the rst/rest document for tags
param counters:if None, then starts with {".. figure":1,".. math":1,".. table":1,".. code":1}
param fn_i_ln:(fn, i, ln) of the .stpl with all stpl includes sequenced
def links_and_tags(
    scanroot='.'
    ):

Creates _links_xxx.rst files and a .tags.

param scanroot:directory for which to create links and tags
>>> olddir = os.getcwd()
>>> cd(dirname(__file__))
>>> rmrf('../doc/_links_sphinx.rst')
>>> '_links_sphinx.rst' in ls('../doc')
False

>>> links_and_tags('../doc')
>>> '_links_sphinx.rst' in ls('../doc')
True
>>> cd(olddir)
dcx.grep:
def grep(
      regexp=rexkw,
      dir=None,
      exts=set(['.rst','.rest','.stpl','.tpl','.adoc','.md','.wiki','.py','.jl','.lua','.tex',
                '.js', '.h','.c','.hpp','.cpp','.java','.cs','.vb','.r','.sh','.vim','.el',
                '.php','.sql','.swift','.go','.rb','.m','.pl','.rs','.f90','.dart','.bib',
                '.yml','.mm','.d','.lsp','.kt','.hs','.lhs','.ex','.scala','.clj']),
      **kwargs
):

Uses Python's re to find regexp in the files of dir (default: os.getcwd()) whose extension is in exts, and returns [(file,1-based index,line),...].

param regexp:default is ‘^s*.. {’
param dir:default is current dir
param exts:the extension of files searched
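The search described above can be sketched with os.walk and re. This is an illustration, not the dcx.grep implementation: grep_sketch is a hypothetical name, the default exts is shortened, and walking into subdirectories is an assumption.

```python
import os
import re


def grep_sketch(regexp, dir=None,
                exts=frozenset({'.rst', '.rest', '.stpl', '.py'})):
    """Yield (file, 1-based line number, line) for lines matching regexp."""
    rx = re.compile(regexp)
    # Assumption: subdirectories are searched as well.
    for root, _dirs, files in os.walk(dir or os.getcwd()):
        for name in sorted(files):
            if os.path.splitext(name)[1] not in exts:
                continue
            path = os.path.join(root, name)
            with open(path, encoding='utf-8', errors='ignore') as f:
                for number, line in enumerate(f, start=1):
                    if rx.search(line):
                        yield path, number, line
```

Files whose extension is not listed in exts are skipped without being opened.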
def yield_with_kw (kws, fn_ln_kw=None, **kwargs):

Find keyword lines in fn_ln_kw list or using grep(), that contain the keywords in kws.

Keyword lines are any of:

.. {{{kw1,kw2
.. {kw1,kw2}
{{_ID3('kw1 kw2')}}
%__ID3('kw1 kw2')
:ID3: kw1 kw2

.. can also be two comment chars of popular programming languages. This is due to dcx.rexkw, which you can change. See also dcx.grep() for the keyword parameters.

param kws:string that is split at non-word characters
param fn_ln_kw:list of (file, line, keywords) tuples or regexp for grep()
>>> list(yield_with_kw('a',[('a/b',1,'a b'),('c/d',1,'c d')]))
[(0, ['a/b', 1, 'a b'])]
>>> list(yield_with_kw('a c',[('a/b',1,'a b'),('c/d',1,'c d')]))
[]
>>> list(yield_with_kw('a',[('a/b',1,'a b'),('c/d',1,'a c d')]))
[(0, ['a/b', 1, 'a b']), (1, ['c/d', 1, 'a c d'])]
>>> kwargs={'dir':normjoin(dirname(__file__),'../test/fixtures')}
>>> kws = 'svg'
>>> len(list(yield_with_kw(kws,**kwargs)))
6
>>> kws = 'png'
>>> len(list(yield_with_kw(kws,**kwargs)))
7
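The filtering shown in the first three examples can be reproduced with a set comparison: an entry is yielded only if it contains all requested keywords. The sketch below covers only the case where fn_ln_kw is a list; yield_with_kw_sketch is a hypothetical name, and splitting at non-word characters is an assumption.

```python
import re


def yield_with_kw_sketch(kws, fn_ln_kw):
    """Yield (index, [file, line, keywords]) for entries with all of kws."""
    wanted = {w for w in re.split(r'\W+', kws) if w}
    for i, (fn, ln, kw) in enumerate(fn_ln_kw):
        present = set(re.split(r'\W+', kw))
        if wanted <= present:  # every wanted keyword is present
            yield i, [fn, ln, kw]
```

Searching for 'a c' in the entries 'a b' and 'c d' therefore yields nothing, because neither entry contains both keywords.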
dcx.Counter.__init__:
 
class Counter:
    def __init__(self, before_first=0):

Counter object.

param before_first:
 the value before the first one returned (first-1)
>>> myc = Counter()
>>> myc()
1
>>> myc()
2
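A callable counter with this behavior can be sketched in a few lines. CounterSketch is a hypothetical stand-in that mirrors the example above; it is not the class from dcx.py.

```python
class CounterSketch:
    """Each call returns the next integer after before_first."""

    def __init__(self, before_first=0):
        self.value = before_first

    def __call__(self):
        self.value += 1
        return self.value
```

CounterSketch(9) would return 10 on the first call and 11 on the second.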
dcx.x:
gpdtid = pdtid
def pdtAAA(pdtfile,dct,pdtid=pdtid,
            pdtfileid=lambda x:'ipdt'[int(x[0])]):

pdtAAA is for use in an .stpl document:

% pdtAAA(__main_file__,globals())

See the example generated with:

rstdoc --ipdt
param pdtfile:file path of pdt
param dct:dict to take up the generated defines
param pdtid:function returning the ID for the pdt cycle or regular expression with group for full path or regular expression for just the base name without extension (pdtok)
param pdtfileid:
 extracts/maps a file base name to one of the letters ipdt. E.g. to have the files in order one could name them {0,1,2,3}.rest.stpl, and map each to one of ‘ipdt’.

A pdt is a project enhancement cycle with its own documentation. pdt stands for

  • plan: why
  • do: specification
  • test: tests according to the specification

Additionally there should be an

  • inform: non-technical purpose for or from external people.

There can also be only the inform document, if the pdt item is only informative.

The repo looks like this (preferred):

project repo
    pdt
        ...
        AAA
            i*.rest.stpl
            p*.rest.stpl
            d*.rest.stpl
            t*.rest.stpl

or:

project repo
    pdt
        ...
        AAA.rst.stpl

In the first case, the UID starts with {i,p,d,t}AAA. This is useful to trace related items by their plan-do-test-aspect.

Further reading: pdt

pdtAAA makes these Python defines:

  • _[x]AAA returns next item number as AAABB. Use: {{_[x]AAA('kw1')}}
  • _[x]AAA_, _[x]AAA__, _[x]AAA___, … returns headers. Use: {{_[x]AAA_('header text')}}
  • __[x]AAA, same as _[x]AAA, but use: %__[x]AAA('kw1') (needs _printlist in dct)
  • __[x]AAA_, __[x]AAA__, __[x]AAA___, … Use: %__[x]AAA_('header text')

A, B are base36 letters and x is the initial of the file. The generated macros do not work for indented text, as they produce line breaks in RST text.

>>> dct={'_printlist':str}
>>> pdtfile = "a/b/a.rest.stpl"
>>> pdtAAA(pdtfile,dct,pdtid=r'.*/(.)\.rest\.stpl')
>>> dct['_a']('x y').strip()
'.. {a01 x y}\\n\\na01: **x y**'
>>> dct['__a']('x y').strip() #needs _printlist
"['\\\\n.. {a02 x y}\\\\n\\\\na02: **x y**', '\\\\n']"
>>> dct={}
>>> pdtfile = "pdt/000/d.rest.stpl"
>>> pdtAAA(pdtfile,dct)
>>> dct['_d000']('x y').strip()
'.. {d00001 x y}\\n\\nd00001: **x y**'
>>> dct={}
>>> pdtfile = "a/b/003.rest.stpl"
>>> pdtAAA(pdtfile,dct)
>>> dct['_003']('x y').strip()
'.. {00301 x y}\\n\\n00301: **x y**'
>>> dct['_003_']('x y')
'\\n.. {003 x y}\\n\\n003 x y\\n======='
>>> pdtfile="a/b/003/d.rest.stpl"
>>> pdtAAA(pdtfile,dct)
>>> dct['_003']('x y').strip()
'.. {00301 x y}\\n\\n00301: **x y**'
>>> dct['_d003']('x y').strip()
'.. {d00301 x y}\\n\\nd00301: **x y**'
>>> dct['_003_']('x y')
'\\n.. {003 x y}\\n\\n003 x y\\n======='
>>> dct['_d003_']('x y')
'\\n.. {d003 x y}\\n\\nd003 x y\\n========'
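The numbering of items as AAABB can be sketched with a base36 counter. This is an illustration inferred from the examples above, not the pdtAAA code: base36 and make_item_sketch are hypothetical names, and the lowercase digits and the two-digit width are assumptions. The sketch uses real newlines, whereas the doctests above show them escaped.

```python
import string

BASE36 = string.digits + string.ascii_lowercase  # lowercase is an assumption


def base36(n, width=2):
    """Format n with base36 digits, left-padded with '0' to width."""
    digits = ''
    while n:
        n, rest = divmod(n, 36)
        digits = BASE36[rest] + digits
    return digits.rjust(width, '0')


def make_item_sketch(aaa):
    """Return a function that numbers items aaa01, aaa02, ..."""
    count = 0

    def item(kws):
        nonlocal count
        count += 1
        uid = aaa + base36(count)
        # Keyword comment line followed by the visible item line.
        return '\n.. {%s %s}\n\n%s: **%s**' % (uid, kws, uid, kws)

    return item
```

The first call of make_item_sketch('a') with 'x y' produces the keyword line .. {a01 x y} followed by a01: **x y**; the 36th item would get the suffix 10.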
dcx.index_toctree:
 
def index_toctree(index_file):

Construct:

.. toctree::
    file1
    file2

for the sphinx index file, i.e. index.rest.stpl or index.rst.stpl. Use like:

{{! index_toctree(__file__) }}
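The text that such a call inserts can be sketched as follows. toctree_sketch is a hypothetical helper that takes the file names directly instead of deriving them from the index file; the blank line after the directive follows the conventional Sphinx layout and is an assumption about the real output.

```python
def toctree_sketch(names, indent='    '):
    """Build a toctree directive listing the given document names."""
    lines = ['.. toctree::', '']
    lines.extend(indent + name for name in names)
    return '\n'.join(lines) + '\n'
```

toctree_sketch(['file1', 'file2']) returns the directive line, a blank line, and the two indented entries.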
dcx.initroot:
def initroot(
        rootfldr
        ,sampletype
        ):

Creates a sample tree in the file system.

param rootfldr:directory name that becomes root of the sample tree
param sampletype:
 either ‘ipdt’ or ‘stpl’ for templated sample trees, or ‘rest’ or ‘over’ for non-templated

See example_rest_tree, example_stpl_subtree, example_ipdt_tree, example_over_tree in dcx.py.

dcx.index_dir:
def index_dir(
    root='.'
    ):

Index a directory.

param root:All subdirectories of root that contain a .rest or .rest.stpl file are indexed.
  • expands the .stpl files
  • generates the files as defined in the gen file (see example in dcx.py)
  • generates _links_xxx.rst for xxx = {sphinx latex html pdf docx odt}
  • generates .tags with jumps to reST targets
dcx.main:
def main(**args):

This corresponds to the rstdcx shell command.

rstfromdocx

rstfromdocx: shell command
fromdocx: rstdoc module

Convert DOCX to RST in a subfolder of current dir, named after the DOCX file. It also creates conf.py, index.py and Makefile and copies dcx.py into the folder.

See rstdcx for format conventions for the RST.

There are options to post-process through:

--listtable (--join can be provided)
--untable
--reflow (--sentence True,  --join 0)
--reimg

rstfromdocx -lurg combines all of these.

To convert more DOCX documents into the same RST documentation folder, proceed like this:

  • rename/copy the original DOCX to the name you want for the .rest file
  • run rstfromdocx -lurg doc1.docx; instead of -lurg use your own options
  • check the output in the doc1 subfolder
  • repeat the previous 2 steps with the next DOCX files
  • create a new folder, e.g. doc
  • merge all other folders into that new folder

fromdocx.docx_rst_5 creates 5 different rst files with different postprocessing.

See rstreflow for an alternative procedure.

API

import rstdoc.fromdocx as fromdocx
fromdocx.extract_media:
 
def extract_media(adocx):

extract media files from a docx file to a subfolder named after the docx

param adocx:docx file name
fromdocx.main:
def main(**args):

This corresponds to the rstfromdocx shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

listtable, untable, reflow, reimg default to False.

returns: The file name of the generated file.

fromdocx.docx_rst_5:
 
def docx_rst_5(docx ,rename ,sentence=True):

Creates 5 rst files:

  • without postprocessing: rename/rename.rest
  • with listtable postprocessing: rename/rename_l.rest
  • with untable postprocessing: rename/rename_u.rest
  • with reflow postprocessing: rename/rename_r.rest
  • with reimg postprocessing: rename/rename_g.rest
param docx:the docx file name
param rename:the new name to give to the converted files (no extension)
param sentence:split sentences into new lines (reflow)

rstlisttable

rstlisttable: shell command
listtable: rstdoc module

Convert RST grid tables to list-tables.

  1. Convert grid tables in a file to list-tables. The result is output to stdout:

    $ listtable.py input.rst
    
  2. Convert several files:

    $ listtable.py input1.rst input2.rst
    $ listtable.py *.rst
    
  3. Use pipe (but cat might not keep the encoding):

    $ cat in.rst | listtable.py -  | untable.py - > out.rst
    

Options

-j, --join e.g. 002. Join method per column: 0="".join; 1=" ".join; 2="\n".join
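How a join string such as 012 selects the method per column can be sketched like this. join_cell_sketch is a hypothetical helper; applying the last digit to all remaining columns is an assumption based on the description of the join parameter of listtable.gridtable.

```python
JOINERS = {'0': ''.join, '1': ' '.join, '2': '\n'.join}


def join_cell_sketch(cell_lines, column, join='012'):
    """Combine the lines of one cell with the method chosen for its column."""
    # Assumption: the last digit applies to all remaining columns.
    digit = join[min(column, len(join) - 1)]
    return JOINERS[digit](cell_lines)
```

With the default '012', a cell ['ID-', '01'] in column 0 becomes 'ID-01', a cell ['a', 'b'] in column 1 becomes 'a b', and cells in later columns keep their line breaks.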

API

import rstdoc.listtable as listtable
listtable.row_to_listtable:
 
def row_to_listtable(
        row ,colwidths ,withheader ,join ,indent ,tableend
    ):

This is the default process_row parameter of listtable.gridtable.

param row:list of cells for the row
param colwidths:
 The widths of the columns
param withheader:
 produce :header-rows: 1
param join:0,1,2 telling how to combine the lines of a cell
  • 0 = without space
  • 1 = with space
  • 2 = keep multi-line
param indent:indentation of the table
param tableend:True, if end of table
listtable.gridtable:
 
def gridtable(
        data ,join='012' ,process_row=row_to_listtable
    ):

Convert grid table to list table with same column number throughout. See listtable.row_to_listtable.

param data:from file.readlines() or str.splitlines(True)
param join:join column 0 without space, column 1 with space and leave the rest as-is
param process_row:
 creates a list-table entry for the row
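The conversion itself can be illustrated with a much reduced sketch. It handles only simple grid tables without spanning cells, header rows or pipes inside cells, and it always joins cell lines with a space; grid_to_listtable_sketch is a hypothetical name and not the gridtable implementation.

```python
def grid_to_listtable_sketch(lines):
    """Convert a simple grid table into the lines of a list-table."""
    rows, current = [], None
    for ln in lines:
        ln = ln.strip()
        if ln.startswith('+'):
            # A border closes the row collected so far.
            if current is not None:
                rows.append(current)
            current = None
        elif ln.startswith('|') and ln.endswith('|'):
            cells = [c.strip() for c in ln[1:-1].split('|')]
            if current is None:
                current = [[] for _ in cells]
            for collected, cell in zip(current, cells):
                if cell:
                    collected.append(cell)
    out = ['.. list-table::', '']
    for row in rows:
        for j, cell in enumerate(row):
            # The first cell of a row starts a new list item.
            out.append(('   * - ' if j == 0 else '     - ') + ' '.join(cell))
    return out
```

Each grid row becomes one "* -" item whose sub-items are the cells of that row.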
listtable.main:
def main(**args):

This corresponds to the rstlisttable shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

rstfile is the file name

in_place defaults to False

join defaults to “012”

rstuntable

rstuntable: shell command
untable: rstdoc module

Convert tables of the following format to paragraphs with an ID. The ‘-’ in the ID is removed and the ID is made lower case. Bold is removed.

ID-XY-00 text goes here
ID-XY-01 text again goes here

If the first entry does contain no word chars or spaces between words, then the table stays. For a different behavior replace paragraph23.

A file produced from a DOCX using Pandoc or fromdocx.py needs pre-processing with rstlisttable to convert grid tables to list-tables. rstfromdocx -lu doc.docx does this in one step.
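The ID normalization described above (remove ‘-’, lower case, remove bold) can be sketched as a one-line helper. normalize_id_sketch is a hypothetical name; how the real code detects and strips bold markup is an assumption.

```python
def normalize_id_sketch(cell):
    """Turn a table ID such as '**ID-XY-00**' into 'idxy00'."""
    # Assumption: bold is marked with RST asterisks.
    return cell.replace('*', '').replace('-', '').strip().lower()
```

The IDs from the example table, ID-XY-00 and ID-XY-01, become idxy00 and idxy01.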

API

import rstdoc.untable as untable
untable.paragraph23:
 
def paragraph23(row, nColumns, org, islast, withheader):

For process_row parameter of untable.

For a table of 2 or 3 columns, transform to text. The first column must hold only one line for an ID.

If not transformed to paragraph, then the original text (org) is yielded.

param row:list of strings representing the row
param nColumns:number of columns in the table
param org:original text
param islast:this call is with the last table entry
param withheader:
 the table has a header line
untable.untable:
 
def untable(lns, process_row=paragraph23):

Transform an RST list-table to normal paragraphs. The table is supposed to have this format:

  • The first column holds an ID.
  • Optionally the second column holds keywords.
  • The last column holds the details.
param lns:list of strings
param process_row:
 called for each row to transform to paragraph
untable.main:
def main(**args):

This corresponds to the rstuntable shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

rstfile is the file name

in_place defaults to False

rstreflow

rstreflow: shell command
reflow: rstdoc module

Reflow tables and paragraphs in a rst document produced from a docx.

Post-process a docx in this order:

rstfromdocx doc.docx
rstlisttable doc/doc.rst > doc/tmp.rst
rstuntable doc/tmp.rst > doc/tmp1.rst
rstreflow doc/tmp1.rst > doc/tmp2.rst
rstreimg doc/tmp2.rst > doc/tmp3.rst
rm doc/doc.rst
mv doc/tmp3.rst doc/doc.rst
rm doc/tmp*

Check the intermediate results.

Alternatively, one can work in place:

rstfromdocx doc.docx
rstlisttable -i doc/doc.rst
rstuntable -i doc/doc.rst
rstreflow -i doc/doc.rst
rstreimg -i doc/doc.rst

Note

DOCX to RST using Pandoc

rstfromdocx -lurg doc.docx converts a docx to RST and does all the post-processing in one step.

It is advisable, though, to compare the output with the original and make manual corrections where needed.

API

import rstdoc.reflow as reflow
reflow.reflowparagraph:
 
def reflowparagraph(p, sentence=False):

Reflow a paragraph using textwrap.wrap. Possibly split sentences.

param p:paragraph
param sentence:if True lines are split at the end of the sentence
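The combination of textwrap.wrap and sentence splitting can be sketched as below. This is not the reflow.reflowparagraph code: reflow_paragraph_sketch is a hypothetical name, the width parameter and the sentence-end pattern are assumptions, and RST indentation or markup is not treated specially.

```python
import re
import textwrap


def reflow_paragraph_sketch(p, sentence=False, width=80):
    """Rewrap paragraph p; optionally start every sentence on a new line."""
    text = ' '.join(p.split())  # collapse line breaks and repeated spaces
    if sentence:
        # Assumption: a sentence ends with '.', '!' or '?' before whitespace.
        parts = re.split(r'(?<=[.!?])\s+', text)
    else:
        parts = [text]
    lines = []
    for part in parts:
        lines.extend(textwrap.wrap(part, width))
    return lines
```

With sentence=True, the text "One sentence.  Another\nsentence here." is returned as two lines, one per sentence.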
reflow.reflowparagraphs:
 
def reflowparagraphs(lns, sentence=False):

Reflow paragraphs using reflow.reflowparagraph.

param lns:lines from rst file
param sentence:if True lines are split at the end of the sentence
reflow.nostrikeout:
 
def nostrikeout(lns):

Removes [strikeout:xxx]

param lns:lines from rst file
reflow.rmextrablankline:
 
def rmextrablankline(lns):

Remove excessive blank lines.

param lns:lines from rst file
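A possible reading of "excessive" is more than one blank line in a row. The sketch below implements that reading; rm_extra_blank_lines_sketch is a hypothetical name, and the real function may use a different rule.

```python
def rm_extra_blank_lines_sketch(lns):
    """Yield lns with runs of blank lines reduced to a single blank line."""
    previous_blank = False
    for ln in lns:
        blank = not ln.strip()
        if blank and previous_blank:
            continue  # skip the second and later blank lines of a run
        previous_blank = blank
        yield ln
```

Three blank lines between two paragraphs are reduced to one, while single blank lines are kept.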
reflow.no3star:
def no3star(lns):

Removes three stars, as they are not supported by docutils.

param lns:lines from rst file
reflow.noblankend:
 
def noblankend(lns):

Removes blanks at the end of the line.

param lns:lines from rst file
reflow.reflowrow:
 
class reflowrow():

This replaces listtable.row_to_listtable in listtable.gridtable to reflow a grid table.

reflow.reflow:
def reflow(lns, join='1', sentence=False):

Combines all rst corrections of this file.

param lns:lines from rst file
param join:0 no space, 1 with space, 2 keep as-is
param sentence:if True lines are split at the end of the sentence
reflow.main:
def main(**args):

This corresponds to the rstreflow shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

rstfile is the file name

in_place defaults to False

rstreimg

rstreimg: shell command
reimg: rstdoc module

Reimg renames the images in the rst file and the files themselves. It uses part of the document name and a number as new names.

This is useful if several RST documents converted from DOCX are to be combined in one directory and the names of the images have no meaning (image13,…).

API

import rstdoc.reimg as reimg
reimg.reimg:
def reimg(data, prefix):

This renames all the images in the rst file converted from docx, to avoid

  • images having strange names
  • collision of image names from different docx
param data:rst file read by f.read()
param prefix:string prefix for images, should be derived from docx file name
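The renaming of image references in the text can be sketched with a regular expression substitution. This is an illustration only: reimg_sketch is a hypothetical name, the three-digit numbering is an assumption, and the sketch returns the rename map instead of renaming files on disk.

```python
import posixpath
import re

IMAGE = re.compile(r'((?:image|figure)::\s*)(\S+)')


def reimg_sketch(data, prefix):
    """Return (new text, {old path: new path}) for image/figure directives."""
    renames = {}

    def replace(match):
        old = match.group(2)
        if old not in renames:
            folder, name = posixpath.split(old)
            ext = posixpath.splitext(name)[1]
            # Assumption: new names are prefix plus a running 3-digit number.
            new_name = '%s%03d%s' % (prefix, len(renames) + 1, ext)
            renames[old] = posixpath.join(folder, new_name)
        return match.group(1) + renames[old]

    return IMAGE.sub(replace, data), renames
```

A caller could then rename the files on disk according to the returned map; repeated references to the same image receive the same new name.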
reimg.main:
def main(**args):

This corresponds to the rstreimg shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

rstfile is the file name

in_place defaults to False

rstretable

rstretable: shell command
retable: rstdoc module

Transforms list tables to grid tables.

This file also contains the code from the Vim plugin vim-rst-tables-py3, with a few small fixes. rstdoc is used by the Vim plugin vim_py3_rst, which replaces vim-rst-tables-py3.

API

import rstdoc.retable as retable
retable.title_some:
 
title_some = """=-^"'`._~+:;,"""

The rst title order was partly taken from https://github.com/jimklo/atom-rst-snippets then converted to http://documentation-style-guide-sphinx.readthedocs.io/en/latest/style-guide.html

retable.reformat_table:
 
def reformat_table(lines, row=0, col=0, withheader=0):

Create or reformat a grid table in lines. The table is delimited by empty lines starting from (row,col).

param lines:list of strings
param row:of cursor position,
param col:… as only the lines delimited by an empty line are used
param withheader:
 use the first line as table header
retable.create_rst_table:
 
def create_rst_table(data, withheader=0):

Create a rst table from data

Example:

>>> lns=[['one','two','three'],[1,2,3]]
>>> create_rst_table(lns)
'+-----+-----+-------+\n| one | two | three |\n+-----+-----+-------+\n| 1   | 2   | 3     |\n+-----+-----+-------+'
param data:list of list of data
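The layout of the result can be reproduced with a short sketch that pads every column to its widest entry. grid_table_sketch is a hypothetical name; it ignores withheader and assumes that all rows have the same number of single-line cells.

```python
def grid_table_sketch(data):
    """Render a list of rows as an RST grid table string."""
    rows = [[str(cell) for cell in row] for row in data]
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    border = '+' + '+'.join('-' * (w + 2) for w in widths) + '+'
    out = [border]
    for row in rows:
        cells = ' | '.join(c.ljust(w) for c, w in zip(row, widths))
        out.append('| ' + cells + ' |')
        out.append(border)
    return '\n'.join(out)
```

For the data of the example above, this sketch returns the same string as shown there.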
retable.reflow_table:
 
def reflow_table(lines, row=0, col=0):

Adapt an existing table to the widths of the first line. The table is delimited by empty lines starting from (row,col).

param lines:list of strings
param row:of cursor position,
param col:… as only the lines delimited by an empty line are considered

retable.re_title:
 
def re_title(lines, row=0, col=0, down=0):

Adjust the under- or overline of a title.

param lines:list of lines
param row:of cursor position,
param col:… as only the lines delimited by an empty line are considered
param down:>0 down, <0 up
>>> lines="""\
...   ###########
...       title
...   ###########
...   """.splitlines()
>>> re_title(lines)
>>> lines
['      #####', '      title', '      #####', '  ']
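The adjustment of the adornment lines can be sketched as follows. Unlike re_title, the hypothetical re_title_sketch expects the index of the title line itself and does not change the title level; it only aligns the over- and underline with the title text.

```python
def re_title_sketch(lines, row):
    """Fit the lines above and below lines[row] to the title text, in place."""
    title = lines[row]
    text = title.strip()
    indent = ' ' * (len(title) - len(title.lstrip()))
    for i in (row - 1, row + 1):
        if not 0 <= i < len(lines):
            continue
        marks = lines[i].strip()
        # An adornment line repeats one punctuation character.
        if marks and len(set(marks)) == 1 and not marks[0].isalnum():
            lines[i] = indent + marks[0] * len(text)
```

Applied with row=1 to the lines of the example above, the sketch produces the same result as shown there.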
retable.retable:
 
def retable(lns):

Transform listtable to grid table. Yield the resulting lines.

param lns:list of strings
retable.main:
def main(**args):

This corresponds to the rstretable shell command.

param args:Keyword arguments. If empty the arguments are taken from sys.argv.

rstfile is the file name

in_place defaults to False