Fossil

The Fossil Build Process

1.0 Introduction

The build process for Fossil is tricky in that the source code needs to be processed by three different preprocessor programs before it is compiled. Most users will download a precompiled binary, so this is of no consequence to them, and even those who want to compile the code themselves can use one of the existing makefiles. So most people do not need to be concerned with the build complexities of Fossil. But hard-core developers who desire a deep understanding of how Fossil is put together can benefit from reviewing this article.

2.0 Source Code Tour

The source code for Fossil is found in the src/ subdirectory of the source tree. The src/ subdirectory contains all code, including the code for the separate preprocessor programs.

Each preprocessor program is a separate C program implemented in a single file of C source code. The three preprocessor programs are:

  1. mkindex.c
  2. translate.c
  3. makeheaders.c

Fossil uses SQLite for on-disk storage. The SQLite implementation is contained in three source code files that do not participate in the preprocessing steps. These three files that implement SQLite are:

  1. sqlite3.c
  2. sqlite3.h
  3. shell.c

All three SQLite source files are byte-for-byte copies of files by the same name in the standard amalgamation. The sqlite3.c file implements the database engine. The shell.c file implements the command-line shell, which is accessed in fossil using the fossil sql command.

The shell.c command-line shell uses the linenoise library to implement line editing. linenoise comprises two source files which were copied from the upstream repository with only very minor portability edits:

  1. linenoise.c
  2. linenoise.h

The TH1 script engine is implemented using files:

  1. th.c
  2. th.h

The preprocessing steps are omitted for all of these imported files.

The VERSION.h header file is generated from other information sources using a small program called:

  1. mkversion.c

The builtin_data.h header file contains the definitions of C-language byte-array constants that contain various resources such as scripts and images. The builtin_data.h header file is generated from the original resource files using a small program called:

  1. mkbuiltin.c

Examples of built-in resources include the diff.tcl script used to implement the --tk option to fossil diff, the markdown documentation, and the various CSS scripts, headers, and footers used to implement built-in skins. New resource files are added to the "extra_files" variable in makemake.tcl.

The src/ subdirectory also contains documentation about the makeheaders preprocessor program:

  1. makeheaders.html

That file documents makeheaders in detail. In addition there is a Tcl script used to build the various makefiles:

  1. makemake.tcl

Running this Tcl script will automatically regenerate all makefiles. In order to add a new source file to the Fossil implementation, simply edit makemake.tcl to add the new filename, then rerun the script, and all of the makefiles for all targets will be rebuilt.

There is an optional code verification step implemented using:

  1. codecheck1.c

This file implements a small utility program ("codecheck1") that scans other Fossil source files looking for errors in printf-style format strings. The codecheck1 utility detects missing or surplus arguments on printf-like functions and dangerous uses of "%s" that might permit SQL injection or cross-site scripting attacks. This code check step is run automatically on each build of Fossil, and can also be run separately by typing "make codecheck". Note that the built-in printf format checking of GCC does not function for Fossil since Fossil implements its own printf (in the printf.c source file) that includes special features and formatting letters that are useful to Fossil. The codecheck1 utility can be seen as an enhanced application-specific replacement for the GCC printf format checker.

Finally, there is one of the makefiles generated by makemake.tcl:

  1. main.mk

The main.mk makefile is invoked from the Makefile in the top-level directory. The main.mk is generated by makemake.tcl and should not be hand edited. Other makefiles generated by makemake.tcl are in other subdirectories (currently all in the win/ subdirectory).

All the other files in the src/ subdirectory (79 files at the time of this writing) are C source code files that are subject to the preprocessing steps described below. In the sequel, we will call these other files "src.c" in order to have a convenient name. The reader should understand that whenever "src.c" or "src.h" is used in the text that follows, we really mean all 79 ordinary source files, that is, every source file except the exceptions described above.

3.0 Automatically generated files

The "VERSION.h" header file contains some C preprocessor macros that identify the version of Fossil that is to be built. The VERSION.h file is generated automatically from information extracted from the "manifest", "manifest.uuid", and "VERSION" source files in the root directory of the source tree. (The "manifest" and "manifest.uuid" files are automatically generated and updated by Fossil itself. See the fossil set manifest command for additional information.)

The VERSION.h header file is generated by a C program: tools/mkversion.c. To run the VERSION.h generator, first compile the tools/mkversion.c source file into a command-line program (named "mkversion.exe") then run:

mkversion.exe manifest.uuid manifest VERSION >VERSION.h

The pathnames in the above command might need to be adjusted to get the directories right. The point is that the manifest.uuid, manifest, and VERSION files in the root of the source tree are the three arguments and the generated VERSION.h file appears on standard output.
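
The path adjustment described above can be sketched as a small POSIX shell helper. This is only an illustration: the function name mkversion_cmd and its single argument (the path to the root of the source tree) are hypothetical and not part of Fossil, and the helper merely prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fossil): print the mkversion command
# line with the three input files taken from the root of the source tree.
# $1 = path to the root of the source tree (defaults to ".")
mkversion_cmd() {
  root=${1:-.}
  printf '%s %s %s %s\n' \
    "./mkversion.exe" \
    "$root/manifest.uuid" "$root/manifest" "$root/VERSION"
}

# Show the command for a build directory one level below the root.
mkversion_cmd ..
```

When the real command is run, its standard output is redirected into VERSION.h as shown earlier.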

The builtin_data.h header file is generated by a C program: tools/mkbuiltin.c. The builtin_data.h file contains C-language byte-array definitions for the content of resource files used by Fossil. To generate the builtin_data.h file, first compile the mkbuiltin.c program, then run:

mkbuiltin.exe diff.tcl OtherFiles... >builtin_data.h

At the time of this writing, the "diff.tcl" script (a Tcl/Tk script used to implement the --tk option on the diff command) is the only resource file processed using mkbuiltin.exe. However, new resources will likely be added using this facility in future versions of Fossil.

4.0 Preprocessing

There are three preprocessors for the Fossil sources. The mkindex and translate preprocessors can be run in any order. The makeheaders preprocessor must be run after translate.

4.1 The mkindex preprocessor

The mkindex program scans the "src.c" source files looking for special comments that identify routines that implement various Fossil commands, web interface methods, and help text comments. The mkindex program generates some C code that Fossil uses in order to dispatch commands and HTTP requests and to show on-line help. Compile the mkindex program from the mkindex.c source file. Then run:

./mkindex src.c >page_index.h

Note that "src.c" in the above is a stand-in for the (79) regular source files of Fossil - all source files except for the exceptions described in section 2.0 above.

The output of the mkindex program is a header file that is #include-ed by the main.c source file during the final compilation step.
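
To get a feel for what the scanning step looks at, the following shell sketch lists candidate marker comments in a set of files. It is not mkindex and generates no C code. The "** COMMAND:" and "** WEBPAGE:" spellings are an assumption of this sketch (mkindex.c is the authority on the exact markers), and the function name list_index_markers is hypothetical.

```shell
#!/bin/sh
# Rough approximation of the scanning step only: list the special comment
# lines found in the given files. The marker spellings below are an
# assumption of this sketch; consult mkindex.c for the real rules.
list_index_markers() {
  grep -h -E '^\*\* (COMMAND|WEBPAGE):' "$@"
}

# Self-contained demonstration on a throwaway file.
tmp=$(mktemp)
cat >"$tmp" <<'EOF'
/*
** COMMAND: hello
**
** Usage: print a greeting
*/
void hello_cmd(void){}
/*
** WEBPAGE: hello
*/
void hello_page(void){}
EOF
list_index_markers "$tmp"
rm -f "$tmp"
```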

4.2 The translate preprocessor

The translate preprocessor looks for lines of source code that begin with "@" and converts those lines into string constants or (depending on context) into special "printf" operations for generating the output of an HTTP request. The translate preprocessor is a simple C program whose sources are in the translate.c source file. The translate preprocessor is run on each of the other ordinary source files separately, like this:

./translate src.c >src_.c

In this case, the "src.c" file represents any single source file from the set of ordinary source files as described in section 2.0 above. Note that each source file is translated separately. By convention, the names of the translated source files are the names of the input sources with a single "_" character at the end. But a new makefile can use any naming convention it wants - the "_" is not critical to the build process.

After being translated, the output files (the "src_.c" files) should be used for all subsequent preprocessing and compilation steps.
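
The per-file nature of this step and the "_" naming convention can be sketched as a dry run in POSIX shell. The function names translated_name and translate_plan are hypothetical, and the file names are used only for illustration. The sketch prints the commands instead of executing them.

```shell
#!/bin/sh
# Map an ordinary source file name to its translated name using the
# "_" convention: src/add.c -> add_.c
translated_name() {
  base=$(basename "$1" .c)
  printf '%s_.c\n' "$base"
}

# Print (dry run) one translate command per input file.
translate_plan() {
  for f in "$@"; do
    printf './translate %s >%s\n' "$f" "$(translated_name "$f")"
  done
}

# File names used only for illustration.
translate_plan src/add.c src/timeline.c
```

A makefile that prefers a different naming convention would only need to change translated_name.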

4.3 The makeheaders preprocessor

For each C source module "src.c", there is an automatically generated header module "src.h" that contains all of the datatype and procedure declarations needed by the source module. These header files are generated automatically by the makeheaders program. The sources to makeheaders are contained in a single file "makeheaders.c". Additional documentation on makeheaders can be found in tools/makeheaders.html.

The makeheaders program is run once. It scans all input source files and generates a header file for each one. Note that the sqlite3.c and shell.c source files are not scanned by makeheaders. Makeheaders only runs over "ordinary" source files, not the exceptional source files. However, makeheaders also uses some extra header files as input. The general format is like this:

makeheaders src_.c:src.h sqlite3.h th.h VERSION.h

In the example above the "src_.c" and "src.h" names represent all of the (79) ordinary C source files, each as a separate argument.
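
Because every ordinary source file contributes its own "src_.c:src.h" argument, the argument list is usually built in a loop. A minimal sketch in POSIX shell, assuming the "_" naming convention from section 4.2; the function name makeheaders_args is hypothetical and the module names are only illustrative:

```shell
#!/bin/sh
# Build the makeheaders argument list: one "name_.c:name.h" pair per
# ordinary module, followed by the extra header inputs named in the text.
makeheaders_args() {
  args=""
  for f in "$@"; do
    base=$(basename "$f" .c)
    args="$args ${base}_.c:${base}.h"
  done
  # Extra header files used as input, as listed in the text.
  args="$args sqlite3.h th.h VERSION.h"
  # Drop the leading space left by the concatenation.
  printf '%s\n' "${args# }"
}

# Dry run: print the single makeheaders invocation for two modules.
echo "makeheaders $(makeheaders_args add.c timeline.c)"
```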

5.0 Compilation

After all generated files have been created and all ordinary source files have been preprocessed, the generated and preprocessed files can be combined into a single executable using a C compiler. This can be done all at once, or each preprocessed source file can be compiled into a separate object code file and the resulting object code files linked together in a final step.
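
The second approach (one object file per source file, then a final link) can be sketched as a dry run in POSIX shell. The function name compile_plan, the default compiler name "cc", and the output name "fossil" are assumptions of this sketch. It omits the special macro definitions and the additional objects and libraries discussed in the following sections.

```shell
#!/bin/sh
# Dry run: print one compile command per translated file, then the link.
# The compiler defaults to "cc" unless CC is set in the environment.
compile_plan() {
  cc=${CC:-cc}
  objs=""
  for f in "$@"; do
    obj="${f%.c}.o"
    printf '%s -c %s -o %s\n' "$cc" "$f" "$obj"
    objs="$objs $obj"
  done
  printf '%s -o fossil%s\n' "$cc" "$objs"
}

# File names used only for illustration.
compile_plan add_.c timeline_.c
```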

Some files require special C-preprocessor macro definitions. When compiling sqlite3.c, the following macros are recommended:

The first three symbol definitions above are required; the others are merely recommended. Extension loading is omitted as a security measure. The dbstat virtual table is needed for the /repo-tabsize page. FTS4 is needed for the search feature. Fossil is single-threaded so mutexing is disabled in SQLite as a performance enhancement. The SQLITE_ENABLE_EXPLAIN_COMMENTS option makes the output of "EXPLAIN" queries in the "fossil sql" command much more readable.

When compiling the shell.c source file, these macros are required:

The "main()" routine in the shell must be changed into sqlite3_main() to prevent it from colliding with the real main() in Fossil, and to give Fossil an entry point to jump to when the fossil sql command is invoked.

All the other source code files can be compiled without any special options.

6.0 Linkage

Fossil needs to be linked against zlib. If the HTTPS option is enabled, then it will also need to link against the appropriate SSL implementation. And, of course, Fossil needs to link against the standard C library. No other libraries or external dependencies are used.

7.0 Debugging

Debug mode is controlled by the FOSSIL_DEBUG preprocessor macro, which can be set explicitly on the make command line for the target platform.

In practice, however, it is better to pass the corresponding option to the configure step for the target platform and then perform a clean build. This way the debug flags are applied consistently across the whole build process. For example, use these debug flags in addition to any other flags passed to the configure scripts:

On Linux, *NIX and similar platforms:

./configure --fossil-debug

On Windows:

win\buildmsvc.bat FOSSIL_DEBUG=1

The resulting fossil binary can then be loaded into a platform-specific debugger. The source files displayed in the debugger are the ones generated by the translation stage of the build process, that is, the files that were actually compiled into the object files.

8.0 See Also