Fossil: Artifact [c0198bbf]

Artifact c0198bbfaf878e9cd32fd3790b8f02763eda5606c6f00cf3d4cf0c079463d3de:

File www/server.wiki — part of check-in [f153777f] at 2019-03-17 07:01:18 on branch trunk — Expanded the "Standalone server" section of www/server.wiki to add more examples. Also fixed a few grammar problems elsewhere in the doc. (user: wyoung size: 15755) [more...]
<title>How To Configure A Fossil Server</title>

<h2>Introduction</h2>

<blockquote>
A server is not necessary to use Fossil, but a server does help in collaborating with
peers.  A Fossil server also works well as a complete website for a project.
For example, the complete [https://www.fossil-scm.org/] website, including the
page you are now reading,
is just a Fossil server displaying the content of the
self-hosting repository for Fossil.

This article is a guide for setting up your own Fossil server.

See "[./aboutcgi.wiki|How CGI Works In Fossil]" for background
information on the underlying CGI technology.
See "[./sync.wiki|The Fossil Sync Protocol]" for information on the
wire protocol used for client/server communication.
</blockquote>

<h2>Overview</h2>

<blockquote>
There are basically four ways to set up a Fossil server:

<ol>
  <li>A stand-alone server
  <li>Using inetd, xinetd, or stunnel
  <li>CGI
  <li>SCGI (a.k.a. SimpleCGI)
</ol>

Each of these can serve either a single repository, or a directory hierarchy
containing many repositories with names ending in ".fossil".
</blockquote>


<h2 id="standalone">Standalone server</h2>

<blockquote>
The easiest way to set up a Fossil server is to use either the
[/help/server|server] or the [/help/ui|ui] commands:

<ul>
  <li><b>fossil server</b> <i>REPOSITORY</i>
  <li><b>fossil ui</b> <i>REPOSITORY</i>
</ul>

The <i>REPOSITORY</i> argument is either the name of the repository file, or
a directory containing many repositories named <tt>*.fossil</tt>.
Both of these commands start a Fossil server, usually on TCP port 8080, though
a higher numbered port might also be used if 8080 is already occupied.  You can
access these using URLs of the form <b>http://localhost:8080/</b>, or if
<i>REPOSITORY</i> is a directory, URLs of the form
<b>http://localhost:8080/</b><i>repo</i><b>/</b> where <i>repo</i> is the base
name of the repository file without the ".fossil" suffix.

There are several key differences between "ui" and "server":

<ul>
  <li>"ui" always binds the server to the loopback IP address
      (127.0.0.1) so that it cannot serve to other machines.
  <li>anyone who visits this URL is treated as the all-powerful Setup
      user, which is why the first difference exists.
  <li>"ui" launches a local web browser pointed at this URL.
</ul>

You can omit the <i>REPOSITORY</i> argument if you run one of the above
commands from within a Fossil checkout directory to serve that
repository:

<blockquote><pre>
$ fossil ui          # or...
$ fossil serve
</pre></blockquote>

Note that you can abbreviate Fossil sub-commands, as long as they are
unambiguous. "<tt>server</tt>" can currently be as short as
"<tt>ser</tt>".

As a more complicated example, you can serve a directory containing
multiple <tt>*.fossil</tt> files like so:

<blockquote><pre>
$ fossil server --port 9000 --repolist /path/to/repo/dir
</pre></blockquote>

There is an [/file/tools/fslsrv | example script] in the Fossil
distribution that wraps <tt>fossil server</tt> to produce more
complicated effects. Feel free to take it, study it, and modify it to
suit your local needs.

See the [/help/server|online documentation] for more information on the
options and arguments you can give to these commands.
</blockquote>


<h2 id="inetd">Fossil as an inetd/xinetd service</h2>

<blockquote>

A Fossil server can be launched on-demand by inetd or xinetd using
the [/help/http|fossil http] command. To launch Fossil from inetd, modify
your inetd configuration file (typically "/etc/inetd.conf") to contain a
line something like this:

<blockquote>
<pre>
80 stream tcp nowait.1000 root /usr/bin/fossil /usr/bin/fossil http /home/fossil/repo.fossil
</pre>
</blockquote>

In this example, you are telling "inetd" that when an incoming connection
appears on TCP port "80", that it should launch the binary "/usr/bin/fossil"
program with the arguments shown.
Obviously you will
need to modify the pathnames for your particular setup.
The final argument is either the name of the fossil repository to be served,
or a directory containing multiple repositories.


If you use a non-standard TCP port on
systems where the port-specification must be a symbolic name and cannot be
numeric, add the desired name and port to /etc/services.  For example, if
you want your Fossil server running on TCP port 12345 instead of 80, you
will need to add:

<blockquote>
<pre>
fossil          12345/tcp  #fossil server
</pre>
</blockquote>

and use the symbolic name ('fossil' in this example) instead of the numeral ('12345')
in inetd.conf. For details, see the relevant section in your system's documentation, e.g.
the [https://www.freebsd.org/doc/en/books/handbook/network-inetd.html|FreeBSD Handbook] in
case you use FreeBSD.

If your system is running xinetd, then the configuration is likely to be
in the file "/etc/xinetd.conf" or in a subfile of "/etc/xinetd.d".
An xinetd configuration file will appear like this:

<blockquote>
<pre>
service http
{
  port = 80
  socket_type = stream
  wait = no
  user = root
  server = /usr/bin/fossil
  server_args = http /home/fossil/repos/
}
</pre>
</blockquote>

The xinetd example above has Fossil configured to serve multiple
repositories, contained under the "/home/fossil/repos/" directory.

In both cases notice that Fossil was launched as root.  This is not required,
but if it is done, then Fossil will automatically put itself into a chroot
jail for the user who owns the fossil repository before reading any information
off of the wire.

Inetd or xinetd must be enabled, and must be (re)started whenever their configuration
changes - consult your system's documentation for details.

Using inetd or xinetd is a more complex setup
than the "standalone" server, but it has the
advantage of only using system resources when an actual connection is
attempted.  If no-one ever connects to that port, a Fossil server will
not (automatically) run. It has the disadvantage of requiring "root" access
and therefore may not normally be available to lower-priced "shared" servers
on the Internet.

The configuration for <tt>stunnel</tt> is similar, but it is covered in
[./ssl.wiki#stunnel|a separate document].
</blockquote>

<h2 id="cgi">Fossil as CGI</h2>

<blockquote>

A Fossil server can also be run from an ordinary web server as a CGI program.
This feature allows Fossil to be seamlessly integrated into a larger website.
CGI is how the [./selfhost.wiki | self-hosting fossil repositories] are
implemented.

To run Fossil as CGI, create a CGI script (here called "repo") in the CGI directory
of your web server and having content like this:

<blockquote><pre>
#!/usr/bin/fossil
repository: /home/fossil/repo.fossil
</pre></blockquote>

As always, adjust your paths appropriately.
It may be necessary to set permissions properly, or to modify an ".htaccess"
file or make other server-specific changes.  Consult the documentation
for your particular web server. In particular, the following permissions are
<em>normally</em> required (but, again, may be different for a particular
configuration):

<ul>
  <li>The Fossil binary (/usr/bin/fossil in the example above)
  must be readable/executable, and ALL directories leading up to it
  must be readable by the process which executes the CGI.</li>
  <li>ALL directories leading to the CGI script must also be readable and the CGI
  script itself must be executable for the user under which it will run (which often differs
  from the one running the web server - consult your site's documentation or administrator).</li>
  <li>The repository file AND the directory containing it must be writable by the same account
  which executes the Fossil binary (again, this might differ from the WWW user). The directory
  needs to be writable so that sqlite can write its journal files.</li>
  <li>Fossil must be able to create temporary files, the default directory
  for which depends on the OS.  When the CGI process is operating within
  a chroot, ensure that this directory exists and is readable/writeable
  by the user who executes the Fossil binary.</li>
</ul>

Once the script is set up correctly, and assuming your server is also set
correctly, you should be able to access your repository with a URL like:
<b>http://mydomain.org/cgi-bin/repo</b> (assuming the "repo" script is
accessible under "cgi-bin", which would be a typical deployment on Apache
for instance).

To serve multiple repositories from a directory using CGI, use the "directory:"
tag in the CGI script rather than "repository:".   You might also want to add
a "notfound:" tag to tell where to redirect if the particular repository requested
by the URL is not found:

<blockquote><pre>
#!/usr/bin/fossil
directory: /home/fossil/repos
notfound: http://url-to-go-to-if-repo-not-found/
</pre></blockquote>

Once deployed, a URL like: <b>http://mydomain.org/cgi-bin/repo/XYZ</b>
will serve up the repository "/home/fossil/repos/XYZ.fossil" (if it exists).

Additional options available to the CGI script are documented in the
source code.  As of 2017-07-02, the available options are described at
[/artifact/9a52a07b?ln=1777-1824|main.c lines 1777 through 1824].
</blockquote>

<h2 id="scgi">Fossil as SCGI</h2>
<blockquote>

The [/help/server|fossil server] command, described above as a way of
starting a stand-alone web server, can also be used for SCGI.  Simply add
the --scgi command-line option and the stand-alone server will interpret
and respond to the SimpleCGI or SCGI protocol rather than raw HTTP.  This can
be used in combination with a webserver (such as [http://nginx.org|Nginx])
that does not support CGI.  A typical Nginx configuration to support SCGI
with Fossil would look something like this:

<blockquote><pre>
location /demo_project/ {
    include scgi_params;
    scgi_pass localhost:9000;
    scgi_param SCRIPT_NAME "/demo_project";
    scgi_param HTTPS "on";
}
</pre></blockquote>

Note that Fossil requires the SCRIPT_NAME variable
in order to function properly, but Nginx does not provide this
variable by default,
so it is necessary to provide the SCRIPT_NAME parameter in the configuration.
Failure to do this will cause Fossil to return an error.

All of the features of the stand-alone server mode described above,
such as the ability to serve a directory full of Fossil repositories
rather than just a single repository, work the same way in SCGI mode.

For security, it is probably a good idea to add the --localhost option
to the [/help/server|fossil server] command to prevent Fossil from accepting
off-site connections.  One might also want to specify the listening TCP port
number, rather than letting Fossil choose one for itself, just to avoid
ambiguity.  A typical command to start a Fossil SCGI server
would be something like this:

<blockquote><pre>
fossil server $REPOSITORY --scgi --localhost --port 9000
</pre></blockquote>
</blockquote>

<h2 id="tls">Securing a repository with TLS</h2>

<blockquote>
  Fossil's built-in HTTP server (e.g. "fossil server") does not support
  TLS, but there are multiple ways to protect your Fossil server with
  TLS. All of this is covered in a separate document, <a
  href="./ssl.wiki">Using TLS-Encrypted Communications with Fossil</a>.
</blockquote>

<h2 id="loadmgmt">Managing Server Load</h2>

<blockquote>
A Fossil server is very efficient and normally presents a very light
load on the server.
The Fossil [./selfhost.wiki | self-hosting server] is a 1/24th slice VM at
[http://www.linode.com | Linode.com] hosting 65 other repositories in
addition to Fossil (and including some very high-traffic sites such
as [http://www.sqlite.org] and [http://system.data.sqlite.org]) and
it has a typical load of 0.05 to 0.1.  A single HTTP request to Fossil
normally takes less than 10 milliseconds of CPU time to complete, so
requests can be arriving at a continuous rate of 20 or more per second,
and the CPU can still be mostly idle.

However, there are some Fossil web pages that can consume large
amounts of CPU time, especially on repositories with a large number
of files or with long revision histories.  High CPU usage pages include
[/help?cmd=/zip | /zip], [/help?cmd=/tarball | /tarball],
[/help?cmd=/annotate | /annotate] and others.  On very large repositories,
these commands can take 15 seconds or more of CPU time.
If these kinds of requests arrive too quickly, the load average on the
server can grow dramatically, making the server unresponsive.

Fossil provides two capabilities to help avoid server overload problems
due to excessive requests to expensive pages:

<ol>
  <li><p>An optional cache is available that remembers the 10 most recently
      requested /zip or /tarball pages and returns the precomputed answer
      if the same page is requested again.</p>
  <li><p>Page requests can be configured to fail with a
      [http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.4 | "503 Server Overload"]
      HTTP error if an expensive request is received while the host load
      average is too high.</p>
</ol>

Both of these load-control mechanisms are turned off by default, but they
are recommended for high-traffic sites.

The webpage cache is activated using the [/help?cmd=cache|fossil cache init]
command-line on the server.  Add a -R option to specify the specific repository
for which to enable caching.  If running this command as root, be sure to
"chown" the cache database (which is a separate file in the same directory
and with the same name as the repository but with the suffix changed to ".cache")
to give it write permission for the userid of the webserver.

To activate the server load control feature
visit the Admin → Access setup page in the administrative web
interface; in the "<b>Server Load Average Limit</b>" box
enter the load average threshold above which "503 Server
Overload" replies will be issued for expensive requests.  On the
self-hosting Fossil server, that value is set to 1.5, but you could easily
set it higher on a multi-core server.

The maximum load average can also be set on the command line using
commands like this:
<blockquote><pre>
fossil set max-loadavg 1.5
fossil all set max-loadavg 1.5
</pre></blockquote>

The second form is especially useful for changing the maximum load average
simultaneously on a large number of repositories.

Note that this load-average limiting feature is only available on operating
systems that support the "getloadavg()" API.  Most modern Unix systems have
this interface, but Windows does not, so the feature will not work on Windows.
Note also that Linux implements "getloadavg()" by accessing the "/proc/loadavg"
file in the "proc" virtual filesystem.  If you are running a Fossil instance
inside a chroot() jail on Linux, you will need to make the "/proc" file
system available inside that jail in order for this feature to work.  On
the [./selfhost.wiki|self-hosting Fossil repositories], this was accomplished
by adding a line to the "/etc/fstab" file that looks like:

<blockquote><pre>
chroot_jail_proc /home/www/proc proc ro 0 0
</pre></blockquote>

The /home/www/proc pathname should be adjusted so that the "/proc" component is
in the root of the chroot jail, of course.

To see if the load-average limiter is functional, visit the [/test_env] page
of the server to view the current load average.  If the value for the load
average is greater than zero, that means that it is possible to activate
the load-average limiter on that repository.  If the load average shows
exactly "0.0", then that means that Fossil is unable to find the load average
(either because it is in a chroot() jail without /proc access, or because
it is running on a system that does not support "getloadavg()") and so the
load-average limiter will not function.

</blockquote>