Fossil User Forum

Using LibreOffice FODT and Fossil for document revisioning

(1) By Andy Bradford (andybradford) on 2025-04-03 02:58:40 [source]

A  few  months  ago  I  suggested  to  some  people  who  were  planning
on  collaborating  on a  document  to  use Fossil  as  a  VCS for  their
document which they were planning  on writing in LibreOffice. After some
investigation I discovered  that LibreOffice has a  "flat" format called
FODT which  looked promising so  I thought  that it might  actually work
well.

After a  few commits, however, it  became apparent that there  were some
things  about LibreOffice  that made  managing  the file  with Fossil  a
nightmare, despite the fact that the  FODT format seemed to be ideal for
this.

The biggest challenges we discovered were:

1) LibreOffice has a counter of sorts called Rsid that is updated
every time the document is saved.

2) LibreOffice generates automatic styles for just about every paragraph,
and sometimes even for individual words and characters within a
paragraph, and these change with almost every paragraph you touch when
you alter text. This happens even if you think you have applied a
specific style to something, because the automatic style is actually
linked to the original style through a parent-child relationship.

3) The rsid is part of the automatic styles, so every time the document
is changed dozens of new styles get inserted automatically into the
document. As a result, a one-character change to a paragraph (e.g. for a
spelling correction) could actually end up being committed to Fossil as
a 500-line change. Completely useless and unmanageable.

These things, among others, result in automatic styles that are
identical in "function" but have entirely different names and different
rsids. After just a few commits there were over 300 such automatic
styles in the document, which made comparing revisions an impossible
task: even though two paragraphs visually had the same style, in the
actual source of the document they had a different name and rsid.
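
To see how many of these functionally identical styles a file has
accumulated, something like the following can help. This is only a
sketch: it assumes the automatic styles sit in an
`office:automatic-styles` element directly under the document root and
that every rsid is stored in an attribute whose name ends in "rsid".
The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard OpenDocument namespaces.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def _signature(elem, is_root=True):
    """Describe a style by what it does, ignoring its name and any rsid."""
    attrs = tuple(sorted(
        (key, value)
        for key, value in elem.attrib.items()
        if not key.lower().endswith("rsid")
        and not (is_root and key == f"{{{STYLE}}}name")
    ))
    children = tuple(_signature(child, False) for child in elem)
    return (elem.tag, attrs, children)


def duplicate_automatic_styles(fodt_xml):
    """Return groups of automatic style names that differ only by name/rsid."""
    root = ET.fromstring(fodt_xml)
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is None:
        return []
    groups = defaultdict(list)
    for style in auto:
        name = style.get(f"{{{STYLE}}}name")
        if name is not None:
            groups[_signature(style)].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Calling `duplicate_automatic_styles()` on the bytes of a FODT file
returns one list per group of interchangeable styles, so a long result
is a sign that direct formatting has crept in.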

What we  found, however, is  that it is possible  to avoid all  of these
behaviors by very careful and deliberate use of user-defined styles. For
example, rather than highlighting a word  in a paragraph and changing it
to an italic font, one should instead apply a predefined character style
to the word.

Similarly, instead  of highlighting  a paragraph  and changing  the font
size or font name to something, one should simply click on the paragraph
and change the style using user-defined styles.

In other words, one should have explicit styles and always be deliberate
in  assigning styles  to words,  paragraphs and  characters. If  a style
doesn't exist for some text effect  that is desired, create it first and
then assign it to the text in question.

Finally,  it is  possible  to  disable the  Rsid  which also  eliminates
another source  of noise  in the  diff between  commits. The  setting is
called  "Store it  when changing  the  document" and  disabling it  will
prevent LibreOffice from saving this with the document.
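
One way to confirm that the setting took effect is to count the rsid
attributes left in the saved file. A minimal sketch, assuming that rsids
only ever appear as XML attributes whose names end in "rsid" (the
function name is hypothetical):

```python
import xml.etree.ElementTree as ET


def count_rsid_attributes(fodt_xml):
    """Count attributes anywhere in the document whose name ends in 'rsid'."""
    root = ET.fromstring(fodt_xml)
    return sum(
        1
        for elem in root.iter()
        for key in elem.attrib
        if key.lower().endswith("rsid")
    )
```

Note that rsids written before the setting was disabled may still be in
the file, so a non-zero count does not by itself mean the setting is
still on. What matters is that the count stops growing from one save to
the next.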

It even works with images because images are placed in the document as
base64 encoded blobs. This is not really ideal because compression is
likely not going to be as good as it could be, but some sacrifices have
to be made to get the useful history that managing the document in
Fossil can provide.
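
For a rough idea of the cost, base64 turns every 3 bytes of input into 4
characters of output, so an embedded image takes about a third more
space before any compression is applied. A small illustration, using
random bytes as a stand-in for an already-compressed image (the helper
name and the 30,000-byte size are arbitrary):

```python
import base64
import math
import os


def base64_size(raw_size):
    """Length in bytes of the padded base64 encoding of raw_size bytes."""
    return 4 * math.ceil(raw_size / 3)


# Random bytes stand in for an already-compressed image (PNG, JPEG).
image = os.urandom(30_000)
encoded = base64.b64encode(image)

# Prints: 30000 40000 40000
print(len(image), len(encoded), base64_size(len(image)))
```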

Before every commit, it is necessary to run "fossil diff" or use the new
"fossil ui" to look at the diff and verify that no new automatic styles
have been introduced. If any are found, it's easy to track them down in
the document: highlight the affected text, use the "Clear direct
formatting" option to reset it to the user-defined style, and then apply
a proper style if necessary. Continue doing this until the diff shows no
new automatic styles and then commit.
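
Part of this check can be scripted. The following is a sketch rather
than a replacement for reading the diff: it compares the automatic style
names in two versions of the file and reports the ones that only exist
in the newer version. It assumes the automatic styles live in an
`office:automatic-styles` element directly under the document root, and
the function names are made up for illustration. Because it only
compares names, it will miss a new style that happens to reuse an old
name.

```python
import xml.etree.ElementTree as ET

# Standard OpenDocument namespaces.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def automatic_style_names(fodt_xml):
    """Return the set of automatic style names defined in a FODT document."""
    root = ET.fromstring(fodt_xml)
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    if auto is None:
        return set()
    names = (style.get(f"{{{STYLE}}}name") for style in auto)
    return {name for name in names if name}


def new_automatic_styles(committed_xml, working_xml):
    """Return automatic style names present only in the working copy."""
    added = automatic_style_names(working_xml) - automatic_style_names(committed_xml)
    return sorted(added)
```

The committed version can be taken from the output of "fossil cat" for
the file, and the working version read straight from disk. An empty
result suggests that no new automatic styles have appeared since the
last commit.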

After all of this, I'm going to  consider this a success because as long
as one is careful about not  letting LibreOffice run away with automatic
styles, Fossil can actually be quite  useful to compare revisions of the
document.

I'm  sure I've  left something  out, but  I thought  I would  share this
experience  just  in  case  someone  else gets  the  notion  to  combine
LibreOffice FODT documents with Fossil revision management.

Andy

(2) By Mike Swanson (chungy) on 2025-04-03 03:17:29 in reply to 1 [link] [source]

That's some pretty good advice. I encountered similar drawbacks a couple years ago when I tried to store versioned *.fod? files (it was Git at the time, but the problems with the Flat OpenDocument format will be apparent with every VCS). I didn't really dig into workarounds, and ended up just storing the binary *.od? files instead. One drawback I found particularly damaging was that *.fods files did not keep conditional formatting, and it wasn't always consistent. Using *.ods never seemed to exhibit a similar problem.

I'm glad that there is a way, with care, to maintain diff-able *.fod? files; I may have to keep this post in mind for the future.

For text documents, I think there are a few alternatives that make it easier to maintain sensible diffs, in order from easiest to hardest:

  • Plain text files, especially the Markdown variety.
    • Pros: Any text editor in the world can see them, diffs will look natural, Fossil can display Markdown files with built-in HTML rendering.
    • Cons: Pandoc can be difficult to use, and Markdown can be much more limiting than a normal word processor.
  • TeXmacs
    • Pros: stores files in a fairly readable plain text format, making diffs look natural. Brings almost all the features of LaTeX in a nice-ish GUI.
    • Cons: Uncommon software that needs installation, and the GUI isn't as nice as LibreOffice is.
  • (La)TeX
    • Pros: Stable format (documents from the 1980s should still render fine in modern versions), plain text format making diffs look natural
    • Cons: The language is infamously difficult to learn; even LaTeX, which was created to provide macros for common things that TeX doesn't offer out of the box, did not solve the problem completely.

(3) By Andy Bradford (andybradford) on 2025-04-03 03:32:20 in reply to 2 [link] [source]

> For text documents, I think there  are a few alternatives that make it
> easier to maintain sensible diffs, in order from easiest to hardest:

Absolutely, and in fact  I tend to prefer using TeX,  but this wasn't an
option for the others who wanted to contribute.

Searching this Forum for FODT will turn up comments by others (including
some by me when I first started investigating FODT with Fossil).

Andy

(4) By anonymous on 2025-04-05 01:29:34 in reply to 2 [link] [source]

You could also use org-mode or djot as alternatives to Markdown, which in theory would allow you to dispense with Pandoc. Both Org-mode and Djot have the favorable capability of not being based on HTML output, but rather being more general lightweight markup formats.

Or just use a smaller, more focused markdown utility like lowdown(1).

(5) By Vadim Goncharov (nuclight) on 2025-06-24 17:39:15 in reply to 2 [link] [source]

BTW, has anyone seriously considered the *roff language (nroff, groff), or is it nowhere in the modern UTF-8 world?

(6) By Andy Bradford (andybradford) on 2025-06-24 19:33:38 in reply to 5 [link] [source]

> BTW, has someone seriously considered *roff language (nroff, groff)

That's  another good  idea  and one  that would  lend  itself nicely  to
version  control.  Unfortunately,  I  had to  find  something  that  was
otherwise  easily consumed  by  others  with less  time  to dedicate  to
learning.

Andy

(7) By Warren Young (wyoung) on 2025-06-25 18:05:02 in reply to 5 [link] [source]

Quoting the groff man page:

> Input to GNU troff…must be in the character encoding it recognizes: ISO Latin-1 (8859-1).

It then goes on to offer a preprocessor that smashes UTF-8 down to Latin-1.

These tools were written decades before Unicode existed, much less UTF-8, and there seems to be no interest in updating them to fix that, doubtless out of backwards compatibility concerns.

(8) By aitap on 2025-06-27 14:49:23 in reply to 5 [link] [source]

There's neatroff (on GitHub) which does speak UTF-8 (and UTF-8 only), but it's not widely used. There's even an equation typesetter by the same author.