Fossil Forum

Wiki search: page titles, case sensitivity

(1) By Konstantin Khomutov (kostix) on 2022-10-18 17:58:21 [link] [source]


I'm trying to use Fossil 2.15.2 818f40f62f to manage a set of assorted notes using its wiki functionality (I like the idea of fossil ui and no-brainer backups or syncs to a remote machine with SSH access).

Unfortunately, I have two issues with it: the wiki search apparently does not consider page titles—only contents, and the search is case-sensitive.

Both issues are rather critical to me, as I'm accustomed to somewhat fuzzy search, where it's better to have more search hits than fewer.

So, the questions are:

  • Are these aspects configurable (I failed to find suitable configuration knobs)?
  • If no, does anyone know whether this is fixed in a later version?
  • If no, is it worth filing a feature request(s)?

(2) By Stephan Beal (stephan) on 2022-10-18 20:34:16 in reply to 1 [link] [source]

Are these aspects configurable


If no, does anyone know whether this is fixed in a later version?

Not insofar as i recall.

If no, is it worth filing a feature request(s)?

You just did ;).

(3) By anonymous on 2023-09-08 19:44:16 in reply to 2 [link] [source]

I was looking for this functionality as well. Thankfully the search was able to lead me to this thread.

I was looking at Fossil mainly for its Wiki functionality and configuration to allow searching of titles would be greatly appreciated. It does seem that it is no longer case sensitive.

Seeing as non-case-sensitivity was added, will title searchability be added as well?

(4) By Stephan Beal (stephan) on 2023-09-08 21:35:02 in reply to 3 [link] [source]

will title searchability be added as well?

It's just waiting on someone with the time, energy, and inclination to implement it :). Patches would be thoughtfully considered.

(5) By Preben Guldberg (preben) on 2023-09-12 07:02:43 in reply to 4 [source]

As I read the code, the problem is that the wiki page content does not include the title of the page, causing it not to be indexed.

I think this can be addressed by explicitly setting the title for the search in get_stext_by_mimetype() for wiki pages. The patch below allows for that.

In my testing, this works for wiki pages and I did not see differences for checkins, docs, forums or tickets.

BTW, after initializing the title Blob like this, the many calls that add the title to pOut and call markdown_to_html() became fairly repetitive to look at. I took it a notch further and dealt with these after handling the mimetypes. It may be too much, but I felt this better separated the code into two steps: first handling the different mimetypes to determine the title and convert to HTML, then generating the resulting text in pOut.

Index: src/search.c
--- src/search.c
+++ src/search.c
@@ -1229,50 +1229,52 @@
 ** This is a helper function for search_stext().  Writing into pOut
 ** the search text obtained from pIn according to zMimetype.
+** If a title is not specified in zTitle (e.g. for wiki pages that do not
+** include the title in the body), it is determined from the page content.
 ** The title of the document is the first line of text.  All subsequent
 ** lines are the body.  If the document has no title, the first line
 ** is blank.
 static void get_stext_by_mimetype(
   Blob *pIn,
   const char *zMimetype,
+  const char *zTitle,
   Blob *pOut
   Blob html, title;
+  Blob *pHtml = &html;
   blob_init(&html, 0, 0);
-  blob_init(&title, 0, 0);
+  if( zTitle==0 ){
+    blob_init(&title, 0, 0);
+  }else{
+    blob_init(&title, zTitle, -1);
+  }
   if( zMimetype==0 ) zMimetype = "text/plain";
   if( fossil_strcmp(zMimetype,"text/x-fossil-wiki")==0 ){
-    Blob tail;
-    blob_init(&tail, 0, 0);
-    if( wiki_find_title(pIn, &title, &tail) ){
-      blob_appendf(pOut, "%s\n", blob_str(&title));
+    if( blob_size(&title) ){
+      wiki_convert(pIn, &html, 0);
+    }else{
+      Blob tail;
+      blob_init(&tail, 0, 0);
+      wiki_find_title(pIn, &title, &tail);
       wiki_convert(&tail, &html, 0);
-    }else{
-      blob_append(pOut, "\n", 1);
-      wiki_convert(pIn, &html, 0);
-    html_to_plaintext(blob_str(&html), pOut);
   }else if( fossil_strcmp(zMimetype,"text/x-markdown")==0 ){
-    markdown_to_html(pIn, &title, &html);
-    if( blob_size(&title) ){
-      blob_appendf(pOut, "%s\n", blob_str(&title));
-    }else{
-      blob_append(pOut, "\n", 1);
-    }
-    html_to_plaintext(blob_str(&html), pOut);
+    markdown_to_html(pIn, blob_size(&title) ? NULL : &title, &html);
   }else if( fossil_strcmp(zMimetype,"text/html")==0 ){
-    if( doc_is_embedded_html(pIn, &title) ){
-      blob_appendf(pOut, "%s\n", blob_str(&title));
-    }
-    html_to_plaintext(blob_str(pIn), pOut);
+    if( blob_size(&title)==0 ) doc_is_embedded_html(pIn, &title);
+    pHtml = pIn;
+  }
+  blob_appendf(pOut, "%s\n", blob_str(&title));
+  if( blob_size(pHtml) ){
+    html_to_plaintext(blob_str(pHtml), pOut);
-    blob_append(pOut, "\n", 1);
     blob_append(pOut, blob_buffer(pIn), blob_size(pIn));
@@ -1305,11 +1307,11 @@
       blob_appendf(pAccum, "%s: %s |\n", zColName, db_column_text(pQuery,i));
       Blob txt;
       blob_init(&txt, db_column_text(pQuery,i), -1);
       blob_appendf(pAccum, "%s: ", zColName);
-      get_stext_by_mimetype(&txt, zMime, pAccum);
+      get_stext_by_mimetype(&txt, zMime, NULL, pAccum);
       blob_append(pAccum, " |", 2);
@@ -1344,11 +1346,11 @@
   switch( cType ){
     case 'd': {   /* Documents */
       Blob doc;
       content_get(rid, &doc);
       blob_to_utf8_no_bom(&doc, 0);
-      get_stext_by_mimetype(&doc, mimetype_from_name(zName), pOut);
+      get_stext_by_mimetype(&doc, mimetype_from_name(zName), NULL, pOut);
     case 'f':     /* Forum messages */
     case 'e':     /* Tech Notes */
@@ -1366,11 +1368,11 @@
         blob_appendf(&wiki, "From %s:\n\n%s", pWiki->zUser, pWiki->zWiki);
         blob_init(&wiki, pWiki->zWiki, -1);
       get_stext_by_mimetype(&wiki, wiki_filter_mimetypes(pWiki->zMimetype),
-                            pOut);
+                            cType=='w' ? pWiki->zWikiTitle : NULL, pOut);
     case 'c': {   /* Check-in Comments */
@@ -1396,11 +1398,11 @@
           db_column_blob(&q, 0, pOut);
           Blob x;
           db_column_blob(&q, 0, &x);
-          get_stext_by_mimetype(&x, "text/x-fossil-wiki", pOut);
+          get_stext_by_mimetype(&x, "text/x-fossil-wiki", NULL, pOut);
@@ -1507,11 +1509,11 @@
   Blob in, out;
   if( g.argc!=4 ) usage("FILENAME MIMETYPE");
   blob_read_from_file(&in, g.argv[2], ExtFILE);
   blob_init(&out, 0, 0);
-  get_stext_by_mimetype(&in, g.argv[3], &out);
+  get_stext_by_mimetype(&in, g.argv[3], NULL, &out);
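
For readers who do not want to dig through the diff, the idea can be illustrated outside of Fossil with a minimal standalone sketch. It is not the code from the patch: build_search_text() and split_first_line() are hypothetical names, plain C strings stand in for Fossil's Blob type, and taking the title from the first line of the body is only a stand-in for what wiki_find_title(), markdown_to_html() and doc_is_embedded_html() do in the real code. What it keeps is the contract described in the function's header comment: the first line of the search text is the title, everything after it is the body, and an explicitly supplied title takes precedence over one derived from the content.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Fossil): copy the first line of zBody
** into zOut, truncated to nOut-1 bytes, and return a pointer to the text
** that follows that line. */
static const char *split_first_line(const char *zBody, char *zOut, size_t nOut){
  size_t n = strcspn(zBody, "\n");           /* length of the first line */
  size_t nCopy = n<nOut-1 ? n : nOut-1;
  memcpy(zOut, zBody, nCopy);
  zOut[nCopy] = 0;
  return zBody[n]=='\n' ? zBody+n+1 : zBody+n;
}

/* Hypothetical stand-in for the patched helper: write "TITLE\nBODY" into
** zOut.  If zTitle is given and non-empty, it is used and the body is kept
** whole.  Otherwise the title is derived from the content (here simply the
** first line, which is an assumption made for this sketch).
** Returns what snprintf() returns. */
static int build_search_text(
  const char *zTitle,   /* Explicit title, or NULL to derive it */
  const char *zBody,    /* Page content */
  char *zOut,           /* Output buffer */
  size_t nOut           /* Size of zOut in bytes */
){
  char zFirst[256];
  assert( zBody!=0 && zOut!=0 && nOut>0 );
  if( zTitle==0 || zTitle[0]==0 ){
    zBody = split_first_line(zBody, zFirst, sizeof(zFirst));
    zTitle = zFirst;
  }
  return snprintf(zOut, nOut, "%s\n%s", zTitle, zBody);
}
```

Calling build_search_text("Home Page", "Welcome to the wiki", buf, sizeof(buf)) puts the title on the first line of buf even though it never occurs in the body, which is what makes a wiki page findable by its title. Passing NULL as the title leaves other content types on the derive-from-content path, mirroring the NULL arguments added at the other call sites in the patch.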

(6) By Daniel Dumitriu (danield) on 2023-09-12 14:28:04 in reply to 5 [link] [source]

Since we cannot use your patch as-is and this wasn't the first one coming from you: would you be interested in contributing to Fossil? If so, you'd need to fill out the Contributor Agreement and send it to Richard (per snail mail). After that's taken care of, we'll set you up with commit rights.

(7) By Preben Guldberg (preben) on 2023-09-13 11:00:45 in reply to 6 [link] [source]

I'd certainly be interested.

I will sign and send the agreement, if not later today then tomorrow.

Let me know if you need anything else. I'll read up on related documentation and try to be careful with commits.

Thank you.

(8) By Stephan Beal (stephan) on 2023-09-13 11:14:40 in reply to 7 [link] [source]

Let me know if you need anything else.

Once your waiver arrives and Richard has it in his fire safe, you'll be set up with dev access and will be sent a mail with the "golden rules." We look forward to your patches.

(9) By Preben Guldberg (preben) on 2023-09-26 10:34:18 in reply to 8 [link] [source]

Thanks for accepting me in.

I checked in the changes that led to the patch above in the search-wiki-titles branch.

(10) By Preben Guldberg (preben) on 2023-11-14 15:45:53 in reply to 9 [link] [source]

I never got to merge this after creating the branch. AFAICT searching titles works well (also if I merge trunk in).

Any objections to me merging the search-wiki-titles branch into trunk?

(11) By Stephan Beal (stephan) on 2023-11-14 16:01:30 in reply to 10 [link] [source]

Any objections to me merging the search-wiki-titles branch into trunk?

Just a few hours ago i was wondering whether this had been merged. If you're happy with how it works, please do.

A tip, in case you don't know this already:

$ fossil update trunk
$ fossil merge --integrate search-wiki-titles

When you check that in, the --integrate flag will cause the search-wiki-titles branch to be closed automatically.

(12) By Preben Guldberg (preben) on 2023-11-14 16:15:52 in reply to 11 [link] [source]

That worked like a charm.