From: Ihor Radchenko <yantar92@posteo.net>
To: emacs-orgmode@gnu.org, Thomas S. Dye <tsd@tsdye.online>,
	Karthik Chikmagalur <karthikchikmagalur@gmail.com>,
	Matt <matt@excalamus.com>
Subject: [PATCH v2] org-manual: Describe export process flow
Date: Wed, 27 Dec 2023 13:43:42 +0000	[thread overview]
Message-ID: <87le9fep69.fsf@localhost> (raw)
In-Reply-To: <87wmt1dp1c.fsf@localhost>

[-- Attachment #1: Type: text/plain, Size: 3258 bytes --]

Ihor Radchenko <yantar92@posteo.net> writes:

> I'd like to add a new section to Org mode manual.
> The new section will describe all the steps performed by Org export
> process. This should hopefully create a more clear picture on how
> various export hooks and filters are used.
>
> The patch is attached.
> I'd appreciate feedback from people not familiar with ox.el.

Thank you all for the feedback!
I am attaching a revised version of the patch with most of the comments
addressed.

I will put more detailed responses inline.


"Thomas S. Dye" <tsd@tsdye.online> writes:
> I'm not too familiar with ox.el.
>
> I edited mostly to use an active voice. I put author queries in 
> parentheses.  I haven't paid attention to manual formatting 
> conventions.
>
> IMO, more links would likely be helpful.
>
> * Suggested revision
> ...

I believe that I have addressed everything you commented on.

Karthik Chikmagalur <karthikchikmagalur@gmail.com> writes:
> - When exporting a sub-tree, at what stage of the export process is the
>   buffer narrowed to the sub-tree?

I added a clarification on subtree export now.

> - Are "inner" and "outer" templates described in the manual, and if they
>   are could you add a link to those sections when mentioning them in
>   this summary?  I only found references to the plist properties
>   BEAMER_INNER_THEME etc.

This is internal terminology. I changed the wording, expanding on what
the inner and outer templates do.

Matt <matt@excalamus.com> writes:
> Here are all the hooks and functions for org-export (via =C-h v
> org-export--hooks TAB= and =C-h v org-export--function TAB=).  I see
> 59 of them.
> ...
> * Feedback 1:
> How are the functions not present in the patch handled?

- I fixed the obsolete variable names.
- `org-export-stack-mode-hook' is not directly relevant to the export
  process - it is for the buffer that lists asynchronous exports.
- Syntax-specific filters are applied according to the corresponding Org
  syntax element. I tried to make this clearer.
- Special filters, like `org-export-before-parsing-functions', are
  described separately. I think I have mentioned all of them.
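
To illustrate the difference between the two kinds of filters, here is
a minimal sketch. It is not part of the patch, and the `my/...' names
are hypothetical. The first function is a syntax-specific filter that
only sees converted bold objects; the second one is a special filter
that sees the whole exported string once. Both take the usual three
filter arguments (text, backend, info), and returning nil from a filter
leaves the text unchanged.

```elisp
(require 'ox)
(require 'ox-ascii)
(require 'subr-x)

;; Hypothetical syntax-specific filter: called once per converted
;; bold object, never for other syntax elements.
(defun my/bold-filter (text backend _info)
  "Upcase TEXT of bold objects when BACKEND derives from ASCII.
Return nil otherwise, which keeps TEXT as is."
  (when (org-export-derived-backend-p backend 'ascii)
    (upcase text)))

;; Hypothetical special filter: called once on the final output.
(defun my/final-output-filter (output _backend _info)
  "Make exported OUTPUT end with exactly one newline."
  (concat (string-trim-right output) "\n"))

(add-to-list 'org-export-filter-bold-functions #'my/bold-filter)
(add-to-list 'org-export-filter-final-output-functions
             #'my/final-output-filter)
```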

> I would write out "src" as "source".  Do we have an official way to
> refer to source blocks?  For example, we standardize on "Org":
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/tree/doc/Documentation_Standards.org#n47

We use "source block", "code block", and "source code block" across the
manual. Not "src block" though.
I went with "code block" in the patch.

I am not sure if it is necessary to standardize the above three terms.
They are all equally understandable I believe.

> Remove the trailing period or add periods to all the others.  I tend
> leave the period of the last sentence of a list.  I'm not sure of a
> style guide that recommends one or the other.  Maybe someone knows
> what's "right"?

No idea. The manual is not consistent either.  I went with
";" as Thomas suggested.

> Maybe use "converted" instead of "transcoded"?  I'm a native speaker
> but I wonder if "converted" is a simpler word for people who aren't.

"transcoded" is closer to what we use in the code. I tried to use
"converted" in the more general descriptions, while still hinting at
the internal terminology.
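
For readers unfamiliar with the term, a "transcoder" is just a function
that converts one AST node to text. A minimal sketch, not part of the
patch: the backend name `my-ascii' and the function `my/ascii-bold' are
made up for illustration, and the choice of the ASCII backend as parent
is arbitrary.

```elisp
(require 'ox)
(require 'ox-ascii)

;; Hypothetical transcoder for bold objects.  CONTENTS is the
;; already converted text of the child nodes.
(defun my/ascii-bold (_bold contents _info)
  "Convert a bold object to text, wrapping CONTENTS in double stars."
  (format "**%s**" contents))

;; Hypothetical backend derived from ASCII; only the bold
;; transcoder is overridden via :translate-alist.
(org-export-define-derived-backend 'my-ascii 'ascii
  :translate-alist '((bold . my/ascii-bold)))

;; Body-only export of a short string with the derived backend.
(org-export-string-as "Some *bold* text." 'my-ascii t)
```

The children are converted first and handed to the parent transcoder as
CONTENTS, which is the depth-first order the new manual section
describes.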


[-- Attachment #2: v2-0001-doc-org-manual.org-Fix-some-obsolete-variable-nam.patch --]
[-- Type: text/x-patch, Size: 1919 bytes --]

From efee8fb5e8aca473b1b80aacc2b38951421225cc Mon Sep 17 00:00:00 2001
Message-ID: <efee8fb5e8aca473b1b80aacc2b38951421225cc.1703683799.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Wed, 27 Dec 2023 14:23:29 +0100
Subject: [PATCH v2 1/2] doc/org-manual.org: Fix some obsolete variable names

* doc/org-manual.org (Export hooks): Use the new
`org-export-before-processing-functions' and
`org-export-before-parsing-functions' instead of their obsolete
aliases.
---
 doc/org-manual.org | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/doc/org-manual.org b/doc/org-manual.org
index 23f250fa7..b35a83434 100644
--- a/doc/org-manual.org
+++ b/doc/org-manual.org
@@ -16397,12 +16397,14 @@ *** Export hooks
 :END:
 
 #+vindex: org-export-before-processing-hook
+#+vindex: org-export-before-processing-functions
 #+vindex: org-export-before-parsing-hook
 The export process executes two hooks before the actual exporting
-begins.  The first hook, ~org-export-before-processing-hook~, runs
-before any expansions of macros, Babel code, and include keywords in
-the buffer.  The second hook, ~org-export-before-parsing-hook~, runs
-before the buffer is parsed.
+begins.  The first hook, ~org-export-before-processing-functions~,
+runs before any expansions of macros, Babel code, and include keywords
+in the buffer.  The second hook,
+~org-export-before-parsing-functions~, runs before the buffer is
+parsed.
 
 Functions added to these hooks are called with a single argument: the
 export backend actually used, as a symbol.  You may use them for
@@ -16421,7 +16423,7 @@ *** Export hooks
      ;; the docstring of `org-map-entries' for details.
      (setq org-map-continue-from (point)))))
 
-(add-hook 'org-export-before-parsing-hook #'my-headline-removal)
+(add-hook 'org-export-before-parsing-functions #'my-headline-removal)
 #+end_src
 
 *** Filters
-- 
2.42.0


[-- Attachment #3: v2-0002-doc-org-manual.org-Describe-export-flow.patch --]
[-- Type: text/x-patch, Size: 6504 bytes --]

From 704eda63d6abf847efcd79cd97e3560ba4729921 Mon Sep 17 00:00:00 2001
Message-ID: <704eda63d6abf847efcd79cd97e3560ba4729921.1703683799.git.yantar92@posteo.net>
In-Reply-To: <efee8fb5e8aca473b1b80aacc2b38951421225cc.1703683799.git.yantar92@posteo.net>
References: <efee8fb5e8aca473b1b80aacc2b38951421225cc.1703683799.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 26 Dec 2023 15:15:23 +0100
Subject: [PATCH v2 2/2] doc/org-manual.org: Describe export flow

* doc/org-manual.org (Summary of the export process): Explain how the
export process is handled in Org mode.
---
 doc/org-manual.org | 149 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 149 insertions(+)

diff --git a/doc/org-manual.org b/doc/org-manual.org
index b35a83434..12e7d031a 100644
--- a/doc/org-manual.org
+++ b/doc/org-manual.org
@@ -16505,6 +16505,155 @@ *** Defining filters for individual files
 ,#+END_SRC
 #+end_example
 
+*** Summary of the export process
+:PROPERTIES:
+:UNNUMBERED: notoc
+:END:
+
+Org mode export is a multi-step process that works on a temporary copy
+of the buffer.  At a high level, the export process consists of four
+major steps:
+
+1. Process the temporary copy, making necessary changes to the buffer
+   text;
+
+2. Parse the buffer, converting plain Org markup into an abstract
+   syntax tree (AST);
+
+3. Convert the AST to text, as prescribed by the selected export
+   backend;
+
+4. Post-process the resulting exported text.
+
+
+#+texinfo: @noindent
+Process the temporary copy of the source Org buffer [fn::Unless
+otherwise specified, each step of the export process only operates on
+the accessible portion of the buffer.  When subtree export is selected
+(see [[*The Export Dispatcher]]), the buffer is narrowed to the body of
+the selected subtree, so that the rest of the buffer text, except
+export keywords, does not contribute to the export output.]:
+
+1. Execute ~org-export-before-processing-functions~ (see [[*Export hooks]]);
+
+2. Expand =#+include= keywords in the whole buffer (see
+   [[*Include Files]]);
+
+3. Remove commented subtrees in the whole buffer (see [[*Comment
+   Lines]]);
+
+4. Replace macros in the whole buffer (see [[*Macro Replacement]]);
+
+5. When ~org-export-use-babel~ is non-nil (default), process code
+   blocks:
+
+   - Leave code blocks inside archived subtrees (see [[*Internal
+     archiving]]) as is;
+
+   - Evaluate all the other code blocks according to code block
+     headers (see [[*Limit code block evaluation]]);
+
+   - Remove code, results of evaluation, both, or neither according
+     to =:exports= header argument (see [[*Exporting Code Blocks]]).
+
+
+#+texinfo: @noindent
+Parse the temporary buffer, creating the AST:
+
+1. Execute ~org-export-before-parsing-functions~ (see [[*Export hooks]]).
+   The hook functions may still modify the buffer;
+
+2. Calculate export option values according to subtree-specific export
+   settings, in-buffer keywords, =#+BIND= keywords, and buffer-local
+   and global customization.  The whole buffer is considered;
+
+3. Determine contributing bibliographies and record them into export
+   options (see [[*Citations]]).  The whole buffer is considered;
+
+4. Execute ~org-export-filter-options-functions~;
+
+5. Parse the accessible portion of the temporary buffer to generate
+   the AST.  The AST is a nested list of lists representing Org syntax
+   elements (see [[https://orgmode.org/worg/dev/org-element-api.html][Org Element API]] for more details):
+
+   : (org-data ...
+   :  (heading
+   :   (section
+   :    (paragraph (plain-text) (bold (plain-text))))
+   :   (heading)
+   :   (heading (section ...))))
+
+   Past this point, modifications in the temporary buffer copy no
+   longer affect export; Org export works only with the AST;
+
+6. Remove elements that will not be exported from the AST:
+
+   - Headings according to =SELECT_TAGS= and =EXCLUDE_TAGS= export
+     keywords, and =tasks=, =inline=, =arch= export options (see
+     [[*Export Settings]]);
+
+   - Comments;
+
+   - Clocks, drawers, fixed-width environments, footnotes, LaTeX
+     environments and fragments, node properties, planning lines,
+     property drawers, statistics cookies, timestamps, etc.,
+     according to the =#+OPTIONS= keyword (see [[*Export Settings]]);
+
+   - Table rows containing width and alignment markers (see [[*Column
+     Width and Alignment]]);
+
+   - Table columns containing recalc marks (see [[*Advanced features]]);
+
+7. Expand environment variables in file link AST nodes, according to
+   the =expand-links= export option (see [[*Export Settings]]);
+
+8. Execute ~org-export-filter-parse-tree-functions~.  These
+   functions can modify the AST by side effect;
+
+9. Replace citation AST nodes and =#+print_bibliography= keyword AST
+   nodes as prescribed by the selected citation export processor
+   (see [[*Citation export processors]]).
+
+
+#+texinfo: @noindent
+Convert the AST to text by traversing the AST nodes, depth-first:
+
+1. Convert the leaf nodes (without children) to text as prescribed
+   by "transcoders" in the selected export backend
+   [fn:: See transcoders and ~:translate-alist~ in the docstrings
+   of ~org-export-define-backend~ and ~org-export-define-derived-backend~.];
+
+2. Pass the converted nodes through the corresponding export
+   filters (see [[*Filters]]);
+
+3. Concatenate all the converted child nodes to produce parent
+   node contents;
+
+4. Convert the nodes with children to text, passing the nodes
+   themselves and their contents to the corresponding transcoders
+   and then to export filters (see [[*Filters]]).
+
+
+#+texinfo: @noindent
+Post-process the exported text:
+
+  1. Post-process the converted AST, as prescribed by the export
+     backend. [fn:: See ~inner-template~ in the docstring of ~org-export-define-backend~.]
+     This step usually adds generated content (like Table of Contents)
+     to the exported text;
+
+  2. Execute ~org-export-filter-body-functions~;
+
+  3. Unless body-only export is selected (see [[*The Export Dispatcher]]),
+     add the necessary metadata to the final document, as prescribed
+     by the export backend.  Examples: Document author/title; HTML
+     headers/footers; LaTeX preamble;
+
+  4. Add bibliography metadata, as prescribed by the citation export
+     processor;
+
+  5. Execute ~org-export-filter-final-output-functions~.
+
 *** Extending an existing backend
 :PROPERTIES:
 :UNNUMBERED: notoc
-- 
2.42.0


[-- Attachment #4: Type: text/plain, Size: 224 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
