Fossil

Artifact [9c64b9d1]

Artifact 9c64b9d1aa5a836099b4b2d1090e8f116eea2fe3e6ec7333a6c47477751a1376:

Wiki page [branch/markdown-footnotes] by george 2022-02-21 04:48:24.
D 2022-02-21T04:48:24.401
L branch/markdown-footnotes
N text/x-markdown
P 1e9b7701cee09b9092efe0e7ee61cb012dbc35dc5f6763de1c13367318a69c03
U george
W 10086
The primary purpose of a footnote is to provide a reader with *supplementary*
information in an unobtrusive way, so that the reader's attention is
not distracted from the main line of a narrative.
Thus the natural length of a footnote is usually a couple of sentences,
maybe a paragraph. If a footnote exceeds a paragraph then it may be beneficial
to rearrange the composition, incorporate that footnote as a subsection of
the current (or some other) document, and point there using a regular hyperlink.

There may be (at least) two ways to think of footnotes:

 1. From the viewpoint of a document's author (i.e. Markdown syntax):
    
    Within this point of view a footnote is defined either in the place
    where the corresponding numeric marker is inserted
    or at some other location in the document (e.g. after a paragraph or at
    the end of a document's source) and referenced by the corresponding label.  
    The former will be called **inline** and the latter **referenced**
    (or **labeled**).
 
 2. From the viewpoint of a person who reads a rendered document:
    
    Within this point of view either a footnote is explicitly associated
    with a specific text or it is the task of the reader to deduce an exact
    phrase to which that footnote applies. The former will be called
    **span-bounded** (or *fragment-bounded*) and the latter **free-standing**.

Hence there are four types of footnotes
(suggestions for better naming [are welcome](forum:/forumthread/d752446a4f63f390)),
and this branch implements all of them.
Footnote rendering is supported in all places where Markdown is supported
(but see [issues and limitations](#il)).

## The syntax

The [proposed syntax](/doc/markdown-footnotes/src/markdown.md#ftnts)
is documented along with the other Markdown rules.[^1]
The goal was a syntax that naturally extends conventional Markdown
and is consistent across all four types of footnotes.[^2]

### Free-standing labeled footnotes

It seems that by "footnotes in Markdown" the Internet typically means
the **referenced free-standing** case.
The syntax for that is more or less widespread and settled as:

```md
  Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
  tempor incididunt[^label] ut labore et dolore magna aliqua.

  [^label]: Ut enim ad minim veniam, quis nostrud exercitation ullamco
     laboris nisi ut aliquip ex ea commodo consequat.
```

The advantage of *referenced* footnotes is that a single footnote
may be referenced multiple times and the place(s) of use may appear before
the footnote's definition.

### Free-standing inline footnotes

These may be more convenient for simple cases and, as
[noted in the forum](forum:/forumpost/b189beee861943a4),
might be easier to maintain.

For the time being, no *free* Markdown processors that support inline footnotes
are known. The following variants were considered:

```
  1.  Lorem ipsum dolor sit amet(^Ut enim ad minim veniam)

  2.  Lorem ipsum dolor sit amet[^Ut enim ad minim veniam]

  3.  Lorem ipsum dolor sit amet^[Ut enim ad minim veniam]

  4.  Lorem ipsum dolor sit amet^(Ut enim ad minim veniam)
```

The first variant was chosen since it looks a bit more natural
in the source form.

### Span-bounded footnotes

These seem to be rather rare.  
For *referenced* footnotes the following syntax variants were considered:

```
  1.  Lorem ipsum [dolor sit amet][^label]
   
  2.  Lorem ipsum [dolor sit amet](^label)
  
  3.  Lorem ipsum [dolor sit amet]^[label]
  
  4.  Lorem ipsum [dolor sit amet]^(label)
```

and for *inline* footnotes the following:

```
  1.  Lorem ipsum [dolor sit amet](^Ut enim ad minim veniam)

  2.  Lorem ipsum [dolor sit amet][^Ut enim ad minim veniam]

  3.  Lorem ipsum [dolor sit amet]^[Ut enim ad minim veniam]

  4.  Lorem ipsum [dolor sit amet]^(Ut enim ad minim veniam)
```

In the examples above both the scope of the footnote and its content are
pretty short. That might become the typical case, but the parser can
handle longer snippets that span several lines (albeit not several paragraphs).

In all cases the scope is specified inside square brackets because
that is consistent with the syntax for regular links.
The first variant in each category was selected because it most closely
follows the syntax for regular hyperlinks and simplifies implementation.

The corresponding text fragment of a *span-bounded* footnote is
highlighted when a user follows the footnote's back-reference
or hovers over that span.
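
The four selected forms can be told apart mechanically. The following is a minimal Python sketch of such a classification, not the parser used in this branch: the regular expressions and the `classify` helper are hypothetical, they ignore nesting and multi-line content, and they do not distinguish a footnote definition from a reference.

```python
import re

# Span-bounded forms are tried first because "[span][^label]" also
# contains a bare "[^label]". All patterns are illustrative assumptions.
PATTERNS = [
    ("span-bounded referenced",
     re.compile(r"\[(?!\^)([^\]\[]+)\]\[\^([^\]\s]+)\]")),
    ("span-bounded inline",
     re.compile(r"\[(?!\^)([^\]\[]+)\]\(\^([^)]+)\)")),
    ("free-standing referenced",
     re.compile(r"\[\^([^\]\s]+)\]")),
    ("free-standing inline",
     re.compile(r"\(\^([^)]+)\)")),
]


def classify(fragment):
    """Return (type name, captured groups) for the first footnote found,
    or None if the fragment contains no footnote marker."""
    for name, pattern in PATTERNS:
        match = pattern.search(fragment)
        if match:
            return name, match.groups()
    return None
```

For a span-bounded footnote the first captured group is the span and the second is the label or the inline text; for a free-standing footnote there is only one group.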

### Styling with user-provided classes

If a footnote's text starts with a token of a special form
then this token is used to derive a set of CSS classes
that are added to that footnote and its references.
This enables users to style elements of a particular footnote,
provided that the administrator has provisioned and documented
some special CSS classes in a custom skin.
The default skin does not provide any such classes.

The token must start with a dot and end with a colon;
between them there must be a dot-separated list of words;
each word may contain only ASCII alphanumeric characters and hyphens.
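
The token rule above maps directly onto a regular expression. Below is a minimal Python sketch under stated assumptions: the `split_classes` helper and the sample class names are hypothetical, the treatment of whitespace after the colon is a guess, and the way Fossil derives the final CSS class names from the words is not covered here.

```python
import re

# A dot, then one or more words separated by dots, then a colon.
# Each word is limited to ASCII letters, digits and hyphens.
TOKEN = re.compile(r"\.([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*):")


def split_classes(footnote_text):
    """Split a leading class token off a footnote's text.

    Returns (list of words, remaining text). If the text does not start
    with a well-formed token, the list is empty and the text is unchanged.
    """
    match = TOKEN.match(footnote_text)
    if match is None:
        return [], footnote_text
    words = match.group(1).split(".")
    # Assumption: whitespace right after the colon is not part of the text.
    return words, footnote_text[match.end():].lstrip()
```

With this sketch, a footnote text such as `.warning.big-note: Be careful` yields the words `warning` and `big-note`, while `.bad_word: x` is left untouched because an underscore is not allowed in a word.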

### Linting

The parser counts misreferences, unreferenced footnotes and joined footnotes
(those that have several definitions with the same label).

If any of these counters is non-zero then the TH1 variable `$footnotes_issues_counters`
is set to a space-separated list of the corresponding integers.
This simplifies reporting issues with footnotes from within the header
of a page (if such a warning is provisioned in a custom skin).

A `--lint-footnotes` option is also added to the `test-markdown-render` command[^3].
If this flag is given and footnotes in the input have issues, then the
above-mentioned counters are printed to `stderr` and a non-zero exit code is set.
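
To make the three counters concrete, here is a minimal Python sketch that computes them for labeled footnotes in a Markdown source. It is an illustration only: the `lint_footnotes` helper and its regular expressions are hypothetical, inline footnotes are ignored, and the counting rules (misreferences per occurrence, the other two per label) are assumptions that may differ from what Fossil reports.

```python
import re
from collections import Counter

# "[^label]:" at the start of a line defines a footnote.
DEFINITION = re.compile(r"^[ \t]*\[\^([^\]\s]+)\]:", re.MULTILINE)
# Any other "[^label]" is treated as a reference.
REFERENCE = re.compile(r"\[\^([^\]\s]+)\]")


def lint_footnotes(source):
    """Return (misreferences, unreferenced, joined) for labeled footnotes."""
    definitions = Counter(DEFINITION.findall(source))
    # Drop the definition markers so they are not counted as references;
    # references inside a definition's text are kept.
    body = DEFINITION.sub("", source)
    references = Counter(REFERENCE.findall(body))

    misreferences = sum(count for label, count in references.items()
                        if label not in definitions)
    unreferenced = sum(1 for label in definitions if label not in references)
    joined = sum(1 for count in definitions.values() if count > 1)
    return misreferences, unreferenced, joined
```

A result of `(0, 0, 0)` corresponds to the case where no issue is reported; any other result is what a space-separated list of counters would carry.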

<a id="il"></a>
### Known issues and limitations

 1. There is an issue for webpages where `<base href="...">` is
    inconsistent with the actual REQUEST_URI of a page. As of this writing
    preview tabs of the `/wikiedit` and `/fileedit` pages are affected.
    There was an [attempt to fix `base href`](/timeline?r=base-href-fix)
    but as of this writing it has not been merged into the mainline.
    An attempt to [fix it by including `REQUEST_URI`](/info/2c1f8f3592ef00e0)
    in the generated hyperlinks does not seem to help for these AJAXified pages.  
    The hope is to resolve this issue before version 2.19 of Fossil.

 2. A footnote's text is parsed and rendered in "inline mode".
    This is what is needed for the usual case, but
    it means that block-level markup might not work inside a footnote.
    If block-level markup is desired then it can be produced with HTML tags
    in the text of a footnote.

 3. The source text of an inline footnote must not contain blank lines.
    This limitation comes from the current architecture of the parser,
    where paragraphs are identified before links (and footnotes alike).  
    That does not seem like a big problem in light of the second point and
    the intended purpose of footnotes (described at the beginning).

## Notes about implementation

A static integer counter is incremented upon each rendering
of a Markdown document within a single process execution.
This makes it possible to generate unique IDs for footnotes and back-references
within a single webpage. The counter is incremented on each
invocation of `markdown_to_html()` in the hope of better durability
of hyperlinks that point to a footnote (or a footnote's marker)
within a particular forum post.
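
The idea of a per-render counter can be sketched in a few lines of Python. This is only an illustration: the `render_ids` helper and the `footnote-<render>-<index>` ID format are hypothetical and do not reflect the IDs that Fossil actually emits.

```python
import itertools

# Process-wide counter: one tick per rendered Markdown document.
_render_counter = itertools.count(1)


def render_ids(n_footnotes):
    """Return unique IDs for the footnotes of one rendered document."""
    render_no = next(_render_counter)
    return [f"footnote-{render_no}-{index}"
            for index in range(1, n_footnotes + 1)]
```

Because the render number is part of every ID, two documents rendered into the same webpage cannot produce colliding IDs, even if both contain a footnote with the same index.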

Counters for issues are global variables (`g.ftntsIssues`).
Other options were considered an over-complication.

### Security considerations

This implementation adds three places (beyond ordinary Markdown processing) where
user input is used to construct HTML.

 1. `REQUEST_URI` is escaped via the newly added `escape_quotes()` function before
    getting into the values of `href` attributes.
    
 2. Labels and source code of unreferenced footnotes are processed in the same
    way as other code blocks in Markdown source (via the `html_escape()` function).

 3. User-provided classes are constrained to ASCII alphanumeric characters and hyphens.

Thus it is believed that this feature does not introduce an HTML injection vulnerability.
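
The first point can be illustrated with a short Python sketch. Python's standard `html.escape` is used here as a stand-in for Fossil's C functions `escape_quotes()` and `html_escape()`, whose exact behavior is not reproduced; the `footnote_href` helper and the sample values are hypothetical.

```python
import html


def footnote_href(request_uri, footnote_id):
    """Build the opening tag of a link to a footnote.

    Both user-influenced values are escaped (quotes included) before
    they are placed inside the double-quoted href attribute.
    """
    safe_uri = html.escape(request_uri, quote=True)
    safe_id = html.escape(footnote_id, quote=True)
    return '<a href="{}#{}">'.format(safe_uri, safe_id)
```

A request URI that tries to close the attribute, such as `/wiki?name=a"><script>`, ends up as inert text inside the `href` value because the quote and the angle brackets are replaced by character references.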

### Performance considerations

Footnote processing always terminates.
The amount of CPU required to process typical footnotes should be *on par*
with the rest of the Markdown processor.
The theoretical worst case consists of a long chain of recursively nested
inline footnotes. For this case the theoretical complexity is O(n<sup>2</sup>).
Thus, to keep CPU consumption bounded, the maximum nesting depth that is
considered is limited to 5.
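
The effect of a nesting limit can be shown with a toy extractor for inline footnotes written as `(^...)`. This is a sketch under stated assumptions, not the algorithm used in this branch: the `find_close` and `collect` helpers and the `[n]` marker format are hypothetical. In this toy version each nesting level rescans the remaining text for its closing parenthesis, which is one way a quadratic cost can arise for a long chain.

```python
MAX_DEPTH = 5  # the nesting limit stated in the text


def find_close(text, start):
    """Return the index of the ')' matching the '(' at text[start], or -1."""
    level = 0
    for index in range(start, len(text)):
        if text[index] == "(":
            level += 1
        elif text[index] == ")":
            level -= 1
            if level == 0:
                return index
    return -1


def collect(text, notes, depth=1):
    """Replace each "(^...)" with a "[n]" marker and append its text to notes.

    Footnotes nested deeper than MAX_DEPTH are left in the text verbatim.
    """
    out, i = [], 0
    while i < len(text):
        if depth <= MAX_DEPTH and text.startswith("(^", i):
            end = find_close(text, i)
            if end != -1:
                notes.append(None)          # reserve the footnote number
                number = len(notes)
                notes[number - 1] = collect(text[i + 2:end], notes, depth + 1)
                out.append(f"[{number}]")
                i = end + 1
                continue
        out.append(text[i])
        i += 1
    return "".join(out)
```

With a chain of seven nested footnotes only the first five levels are extracted; the two innermost ones stay as literal text inside the fifth footnote, so the recursion depth never exceeds the limit.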

## See also:

> [Wikipedia](https://en.wikipedia.org/wiki/Markdown)  
  [GitHub](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#footnotes)  
  [StackExchange](https://meta.stackexchange.com/questions/5017/markdown-footnotes)  
  <https://daringfireball.net/2005/07/footnotes>  
  <https://www.markdownguide.org/extended-syntax/>  
  <https://support.typora.io/Markdown-Reference/#footnotes>  
  <https://michelf.ca/projects/php-markdown/extra/#footnotes>  
  <https://rephrase.net/box/word/footnotes/syntax/>  

Footnotes
---------

[^1]: If this version of Fossil supports[^4] footnotes then
  the actual syntax should be documented at [/md_rules](/md_rules#ftnts).

[^2]: Some examples (albeit unusual) may be seen at
  [](/doc/markdown-footnotes/test/markdown-test3.md)

[^3]: A link to [the corresponding check-in](/info/1f525713ff85cf5f)
  should also test extraction of backlinks from within footnotes.

[^4]: These four footnotes should test the functionality
  of footnotes on Wiki pages.
Z e312f9458cd82daccda8b4ee8751a484