Fossil

Check-in [f52d63e3]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Added Jupyter nbviewer link for www/image-format-vs-repo-size.ipynb, and made a few small edits to the notebook after seeing it in that viewer.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: f52d63e37bf8d2ec66fd36d5b62342193b2b42753e7535ef41e1681a040e361d
User & Date: wyoung 2019-03-27 19:59:23.619
Context
2019-03-30
00:46
Support both "1)" and "1." for numbered lists in markdown, as commonmark does. Patch from A. Kupries. ... (check-in: 9d6a0aac user: drh tags: trunk)
2019-03-27
19:59
Added Jupyter nbviewer link for www/image-format-vs-repo-size.ipynb, and made a few small edits to the notebook after seeing it in that viewer. ... (check-in: f52d63e3 user: wyoung tags: trunk)
18:45
Several improvements to the image-format-vs-repo-size experiment and the report documenting it: dropped the first 3 columns of data to make the bar chart clearer; drawing the bar chart on a transparent BG in case it's used on a page with a non-white BG, as with a selected Fossil forum post; added axis labels; added a run time calculation to the expensive first step; fixed a few syntax problems that prevent the Python code from compiling on Python 3; documented some problems with running it under Anaconda on macOS; better documented the notebook's dependencies; many clarifications to the experimental report text. ... (check-in: 41e5237a user: wyoung tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to www/image-format-vs-repo-size.ipynb.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image Format vs Fossil Repository Size\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "This notebook was developed with [JupyterLab][jl]. To follow in my footsteps, install that and the needed Python packages:\n",
    "\n",
    "    $ sudo pip install jupyterlab matplotlib pandas wand\n",
    "\n",
    "In principle, it should also work with [Anaconda Navigator][an], but because [Wand][wp] is not currently in the Anaconda base package set, you may run into difficulties making it work, as we did on macOS. These problems might not occur on Windows or Linux.\n",
    "\n",
    "This notebook uses the Python 2 kernel because macOS does not include Python 3, and we don't want to make adding that a prerequisite for those re-running this experiement on their own macOS systems. The code was written with Python 3 syntax changes in mind, but we haven't yet successfully tried it in a Python 3 Jupyter kernel.\n",
    "\n",
    "[an]: https://www.anaconda.com/distribution/\n",
    "[jl]: https://github.com/jupyterlab/\n",












|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image Format vs Fossil Repository Size\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "This notebook was developed with [JupyterLab][jl]. To follow in my footsteps, install that and the needed Python packages:\n",
    "\n",
    "    $ pip install jupyterlab matplotlib pandas wand\n",
    "\n",
    "In principle, it should also work with [Anaconda Navigator][an], but because [Wand][wp] is not currently in the Anaconda base package set, you may run into difficulties making it work, as we did on macOS. These problems might not occur on Windows or Linux.\n",
    "\n",
    "This notebook uses the Python 2 kernel because macOS does not include Python 3, and we don't want to make adding that a prerequisite for those re-running this experiement on their own macOS systems. The code was written with Python 3 syntax changes in mind, but we haven't yet successfully tried it in a Python 3 Jupyter kernel.\n",
    "\n",
    "[an]: https://www.anaconda.com/distribution/\n",
    "[jl]: https://github.com/jupyterlab/\n",
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
    "\n",
    "# Merge per-format test data into a single DataFrame without the first\n",
    "# first 3 rows: the initial empty repo state (boring) and the repo DB\n",
    "# size as it \"settles\" in its first few checkins.\n",
    "data = pd.concat(repo_sizes, axis=1).drop(range(3))\n",
    "\n",
    "mpl.rcParams['figure.figsize'] = (6, 4)\n",
    "#plt.rcParams['axes.facecolor'] = 'white'\n",
    "ax = data.plot(kind = 'bar', colormap = 'coolwarm',\n",
    "          grid = False, width = 0.8,\n",
    "          edgecolor = 'white', linewidth = 2)\n",
    "ax.axes.set_xlabel('Checkin index')\n",
    "ax.axes.set_ylabel('Repo size (MiB)')\n",
    "plt.savefig('image-format-vs-repo-size.svg', transparent=True)\n",
    "plt.show()"







<







148
149
150
151
152
153
154

155
156
157
158
159
160
161
    "\n",
    "# Merge per-format test data into a single DataFrame without the first\n",
    "# first 3 rows: the initial empty repo state (boring) and the repo DB\n",
    "# size as it \"settles\" in its first few checkins.\n",
    "data = pd.concat(repo_sizes, axis=1).drop(range(3))\n",
    "\n",
    "mpl.rcParams['figure.figsize'] = (6, 4)\n",

    "ax = data.plot(kind = 'bar', colormap = 'coolwarm',\n",
    "          grid = False, width = 0.8,\n",
    "          edgecolor = 'white', linewidth = 2)\n",
    "ax.axes.set_xlabel('Checkin index')\n",
    "ax.axes.set_ylabel('Repo size (MiB)')\n",
    "plt.savefig('image-format-vs-repo-size.svg', transparent=True)\n",
    "plt.show()"
Changes to www/image-format-vs-repo-size.md.
75
76
77
78
79
80
81
82
83

84
85
86
87
88
89
90
[oox]: https://en.wikipedia.org/wiki/Office_Open_XML
[wi]:  https://en.wikipedia.org/wiki/Windows_Installer



## Demonstration

The companion [`image-format-vs-repo-size.ipynb` file][nb] is a
[Jupyter][jp] notebook implementing the following experiment:


1.  Create an empty Fossil repository; save its initial size.

2.  Use [ImageMagick][im] via [Wand][wp] to generate a JPEG file of a
    particular size — currently 256 px² — filled with Gaussian noise to
    make data compression more difficult than with a solid-color image.








|
|
>







75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
[oox]: https://en.wikipedia.org/wiki/Office_Open_XML
[wi]:  https://en.wikipedia.org/wiki/Windows_Installer



## Demonstration

The companion `image-format-vs-repo-size.ipynb` file ([download][nbd],
[preview][nbp]) is a [Jupyter][jp] notebook implementing the following
experiment:

1.  Create an empty Fossil repository; save its initial size.

2.  Use [ImageMagick][im] via [Wand][wp] to generate a JPEG file of a
    particular size — currently 256 px² — filled with Gaussian noise to
    make data compression more difficult than with a solid-color image.

105
106
107
108
109
110
111
112
113
114

115
116
117
118
119
120
121
122
modify the notebook to try different things.  Want to see how the
results change with a different image size?  Easy, change the `size`
value in the second cell of the notebook.  Want to try more image
formats?  You can put anything ImageMagick can recognize into the
`formats` list. Want to find the break-even point for images like those
in your own respository?  Easily done with a small amount of code.

[im]: https://www.imagemagick.org/
[jp]: https://jupyter.org/
[nb]: ./image-format-vs-repo-size.ipynb

[wp]: http://wand-py.org/


## Results

Running the notebook gives a bar chart something like⁴ this:

![results bar chart](./image-format-vs-repo-size.svg)







|
|
|
>
|







106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
modify the notebook to try different things.  Want to see how the
results change with a different image size?  Easy, change the `size`
value in the second cell of the notebook.  Want to try more image
formats?  You can put anything ImageMagick can recognize into the
`formats` list. Want to find the break-even point for images like those
in your own respository?  Easily done with a small amount of code.

[im]:  https://www.imagemagick.org/
[jp]:  https://jupyter.org/
[nbd]: ./image-format-vs-repo-size.ipynb
[nbp]: https://nbviewer.jupyter.org/urls/fossil-scm.org/fossil/doc/trunk/www/image-format-vs-repo-size.ipynb
[wp]:  http://wand-py.org/


## Results

Running the notebook gives a bar chart something like⁴ this:

![results bar chart](./image-format-vs-repo-size.svg)