From fe08dcdb9220bf5b8b42ea503637e20ddbb818eb Mon Sep 17 00:00:00 2001 From: Vivien Kraus Date: Wed, 5 Oct 2022 20:57:30 +0200 Subject: Start the document. --- mped.xml | 368 +++++++++++++++++++++++++++++++++++++++++++++++++++ tangle-bootstrap.xsl | 111 ++++++++++++++++ 2 files changed, 479 insertions(+) create mode 100644 mped.xml create mode 100644 tangle-bootstrap.xsl diff --git a/mped.xml b/mped.xml new file mode 100644 index 0000000..ecfe859 --- /dev/null +++ b/mped.xml @@ -0,0 +1,368 @@ + + + + Meta-Programming in Extensible Documents + + Vivien Kraus + + + + Literate Programming is a technique for writing programs, in a + way that the code comes to support the ideas developed in a + human language. Successful programs written with that + technique are easy to understand, because you can get all the + important ideas while reading the document, cover to cover. + + + It is tempting to add code evaluation to the literate + programming technique. With this common addition, the + technique becomes a meta-programming technique as it lets + programmers use programs to write programs, possibly in other + programming languages. + + + While many tools provide such meta-programming capabilities to + the literate programming task, it remains fairly uncommon to + have it applied to extensible documents, in the XML + ecosystem. This book provides a new extension to Docbook, to + support meta-programming. + + + + Meta-Programming + Literate Programming + XML + + + 2022 + Vivien Kraus + + + + I have not made a decision about the license of the program. + + + + + What this book is trying to do + + As developers, we like to broaden our ideas about how + programming should be done. Repeating the same design process of + computer programs is boring, to the point that it seems robots + would do it better than us. 
After all, writing programs is very + far from a purely scientific or engineering task: a large part + of the writing process is deciding how to lay out the idea on + the medium. Choices of style, or of technologies to write the + program, are always more a question of personal preference than + of objective development cost. + + + Lawyers worldwide seem to have noticed that, too, which is why + it has been decided that computer programs would be ruled by + copyright law: there are different ways to express an idea, none + of which is inherently better than the others, so the law + controls the expression of these ideas, not the ideas + themselves. + + + Among the different ways to write a program, literate + programming must be one of the most appealing. Write a book, + develop your ideas, and support them with code. I like the idea! + Let’s do it. What do we need? + + + First, we need to write a book. A very special book, that is: it + must feature text, programs, and documentation of this + program. This is not very typical of a book, so we want the + authoring process to be extensible, so that + it lets authors add elements to their books without modifying + the process they use to write their books. Books are typically + written in a markup language: text is divided into elements that + carry some intrinsic semantics, such as chapters. We want an + extensible markup language, so that we can create new semantic + elements without changing the language. I know two classes of + extensible markup languages: ones where the extensions are code + plug-ins to editors for that markup language, which is how you + add features for org-mode through Emacs plugins, for instance, + and XML. For this present task, I want to use XML. + + + We also need to write a program. Thus, our markup language + should be able to collect the pieces of code spread through the + document and assemble them into a program. 
While it is possible to write a program that + would parse the document and extract the source code, I find it + far more elegant to leverage XSLT, the stylesheet and + transformation language for markup languages based on XML. + + + Finally, we need to combine everything into a printable + document. XSLT is a suitable tool for that, too. + + + The work presented here uses its own namespace: + https://labo.planete-kraus.eu/mped.git, which we will + abbreviate as “mped”. + + + + Tangling pieces of code from the document + + One of the most iconic features of literate programming is its + ability to extract source code blocks and put them in files. + 
+ One source block to one file + + The document contains program listings that support the + development of ideas. These are usually written in elements, + siblings to paragraphs, and for Docbook, of type + <programlisting>. The most important attribute, + “language”, identifies the programming language. + + + However, there is no attribute in Docbook that tells the + tangling program where each piece of code should end up. This is + why we introduce our first extension: the “mped:tangle-to” + attribute. + + + To tangle a document, an XSLT stylesheet is defined. It reads a + Docbook document, and outputs a shell script that writes the + correct pieces of code to the correct file names. The key + template to do the task is: + + + + mkdir -p $(dirname " + + ") + + cat >> + + << "_MPED_EOF" + + + + _MPED_EOF + + ]]> + + + This template starts by creating the directory where the file + should go, then fills the file with the source code. For this to + work, we need to do two things about the text of the program + listing: remove the first empty lines and the last empty + lines of the content (but preserve indentation). + + + Let us start with removing leading or trailing empty + lines. Removing leading empty lines seems easier. + + + + + + + + + + + + + + + + + + + + + + + + + + + ]]> + + + There are three different cases. If the text starts with a + newline, discard the indentation that we carried and the + newline. If the text starts with whitespace, carry it and look + at the next character. Otherwise, the whitespace that we carried + is indentation, so print it, and print the text. + + + To avoid exposing the carried indentation, it is better to mark + this template as internal and wrap it in a new template. + + + + + + + + + + ]]> + + + To remove trailing empty lines, the solution is easier since + there is no indentation to keep around: just discard all the + trailing whitespace. 
+ + + + + + + + + + + + + + + + + + + + + + + + + ]]> + + + Using these templates, we can process the program listing code + in the “copy-source-code” mode: if there is only one text node, + then remove the leading empty lines and trailing + whitespace. Otherwise, remove the leading empty lines from the + first text node and the trailing whitespace from the last text + node. By “first” (respectively, “last”) text node, I mean the + text node that has no preceding (respectively, following) + siblings. Maybe there are no such text nodes. + + + + + + + + + + + + + + + + + + + + + + + + ]]> + + + Tangling should never touch anything else. So, text should not + be copied to output. + + + + ]]> + +
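The trimming rules described in this section can also be sketched outside XSLT, to make the intended behavior easier to check by hand. The following is a minimal illustration in shell and awk, not the stylesheet's own templates: the function name `trim_listing` is hypothetical, and it only covers the case of a listing made of a single text node. Whitespace-only lines at the start are dropped, the indentation of the first line with content is kept, and all trailing whitespace is discarded.

```shell
# Hypothetical helper: mimics the trimming applied to a program listing.
# It is an illustration of the rules, not the XSLT implementation.
trim_listing() {
  awk '
    # Leading whitespace-only lines are discarded, indentation included.
    !started && /^[ \t]*$/ { next }
    # From the first line with content on, keep every line as-is,
    # so the indentation of that first line is preserved.
    { started = 1; lines[++n] = $0 }
    END {
      # Trailing whitespace-only lines are discarded.
      while (n > 0 && lines[n] ~ /^[ \t]*$/) n--
      # Trailing whitespace on the last kept line is discarded too.
      if (n > 0) sub(/[ \t]+$/, "", lines[n])
      for (i = 1; i <= n; i++) print lines[i]
    }
  '
}

# Usage: two empty lines, an indented line, a line with trailing spaces,
# then two more empty lines.
printf '\n   \n    indented\n  next  \n\n  \n' | trim_listing
```

Empty lines in the middle of a listing are left untouched, since only the two ends of the text are trimmed.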
+
+ Paste other listings in place + + Literate programming requires the author to be able to discuss + bits of code in isolation, and then insert each bit into a + larger bit. Mped provides this operation with a new tag, + “mped:copy”. It has a “linkend” attribute that resolves to a + program listing anywhere in the document. When copying source + code, matching this element will insert the linked listing + directly here. + + + + + + + + There are no listing with ID ' + + '. + + + + + There are multiple listings with ID ' + + '. + + + + + ]]> + +
+
+ Putting it all together + + The collection of all these templates gives the following: + + + + + + + + ]]> + + + + + + + + + ]]> + +
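To make the result of tangling concrete, here is a sketch of the kind of shell script the stylesheet is meant to emit, following the shape of the key template (`mkdir -p`, then `cat >>` with a quoted `_MPED_EOF` delimiter). The target file name `src/hello.sh` and both listing bodies are hypothetical, as is the initial `cd` into a scratch directory, which is only there so that the example can be run repeatedly.

```shell
# Work in a scratch directory so repeated runs do not append to old files.
cd "$(mktemp -d)"

# First hypothetical listing, with mped:tangle-to="src/hello.sh".
mkdir -p $(dirname "src/hello.sh")
cat >> src/hello.sh << "_MPED_EOF"
greeting="hello"
_MPED_EOF

# Second hypothetical listing with the same target.
# ">>" appends, so listings accumulate in document order.
mkdir -p $(dirname "src/hello.sh")
cat >> src/hello.sh << "_MPED_EOF"
echo "$greeting, $USER"
_MPED_EOF
```

Two properties of this shape are worth noting. Because the here-document delimiter is quoted, the shell performs no expansion on the body, so `$greeting` and `$USER` reach the file literally. Because the redirection appends, running the generated script twice in the same directory duplicates the content, so a clean output directory is assumed.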
+
+
diff --git a/tangle-bootstrap.xsl b/tangle-bootstrap.xsl new file mode 100644 index 0000000..ffe19f5 --- /dev/null +++ b/tangle-bootstrap.xsl @@ -0,0 +1,111 @@ + + + + + + mkdir -p $(dirname " + + ") + + cat >> + + << "_MPED_EOF" + + + +_MPED_EOF + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + There are no listing with ID ' + + '. + + + + + + There are multiple listings with ID ' + + '. + + + + + + -- cgit v1.2.3