Fossil

Check-in [e483b3b1]

Overview
Comment: Fix the unicode code-point width estimating function to align with the SQLite CLI.
SHA3-256: e483b3b15fad08e6dc03fe1d7f304d6cee69db48539dd3a2912561813ae9b8d4
User & Date: drh 2024-09-30 18:21:31
Context
2024-10-05
13:29
Merge updates to the character width measurements of the comment formatter. Note that multi-byte and wide characters are not handled in the comment prefix, which is entirely controlled by the application and only contains ASCII text. ... (check-in: 725af947 user: florian tags: trunk)
2024-10-02
14:43
Fix the off-by-one errors if a fullwidth character only fits partially, and take into account character widths when scanning forward to find the distance to the next space. ... (check-in: d5479ba7 user: florian tags: comment-formatter-wcwidth)
06:51
Render forum content as <description> in RSS feed. page /timeline.rss renders final HTML; command rss renders the source. ... (Leaf check-in: 9fbdea8b user: vor0nwe tags: rss-forum-content)
2024-09-30
18:21
Fix the unicode code-point width estimating function to align with the SQLite CLI. ... (check-in: e483b3b1 user: drh tags: trunk)
2024-09-28
18:21
Take into account zero-width and double-width unicode character when formatting the command-line timeline. ... (check-in: 83743188 user: drh tags: trunk)
Changes
Changes to src/comformat.c.
Old version (lines 37-51):
/* Lookup table to estimate the number of columns consumed by a Unicode
** character.
*/
static const struct {
  unsigned char w;    /* Width of the character in columns */
  int iFirst;         /* First character in a span having this width */
} aUWidth[] = {
   /* {0, 0x00000},  {1, 0x00020},  {0, 0x0007f},  {1, 0x000a0}, */
  {0, 0x00300},  {1, 0x00370},  {0, 0x00483},  {1, 0x00487},  {0, 0x00488},
  {1, 0x0048a},  {0, 0x00591},  {1, 0x005be},  {0, 0x005bf},  {1, 0x005c0},
  {0, 0x005c1},  {1, 0x005c3},  {0, 0x005c4},  {1, 0x005c6},  {0, 0x005c7},
  {1, 0x005c8},  {0, 0x00600},  {1, 0x00604},  {0, 0x00610},  {1, 0x00616},
  {0, 0x0064b},  {1, 0x0065f},  {0, 0x00670},  {1, 0x00671},  {0, 0x006d6},
  {1, 0x006e5},  {0, 0x006e7},  {1, 0x006e9},  {0, 0x006ea},  {1, 0x006ee},
  {0, 0x0070f},  {1, 0x00710},  {0, 0x00711},  {1, 0x00712},  {0, 0x00730},

New version (lines 37-51):
/* Lookup table to estimate the number of columns consumed by a Unicode
** character.
*/
static const struct {
  unsigned char w;    /* Width of the character in columns */
  int iFirst;         /* First character in a span having this width */
} aUWidth[] = {
   /* {1, 0x00000}, */
  {0, 0x00300},  {1, 0x00370},  {0, 0x00483},  {1, 0x00487},  {0, 0x00488},
  {1, 0x0048a},  {0, 0x00591},  {1, 0x005be},  {0, 0x005bf},  {1, 0x005c0},
  {0, 0x005c1},  {1, 0x005c3},  {0, 0x005c4},  {1, 0x005c6},  {0, 0x005c7},
  {1, 0x005c8},  {0, 0x00600},  {1, 0x00604},  {0, 0x00610},  {1, 0x00616},
  {0, 0x0064b},  {1, 0x0065f},  {0, 0x00670},  {1, 0x00671},  {0, 0x006d6},
  {1, 0x006e5},  {0, 0x006e7},  {1, 0x006e9},  {0, 0x006ea},  {1, 0x006ee},
  {0, 0x0070f},  {1, 0x00710},  {0, 0x00711},  {1, 0x00712},  {0, 0x00730},
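
The only change in this hunk is inside a comment: the entries that document the code points below 0x300, which the table itself does not store. As a rough illustration of what those commented-out entries mean, the sketch below reads them as ordinary spans, where each entry gives the width of every code point from its iFirst up to the next entry's iFirst. The names Span, aOldPrefix, aNewPrefix and span_width are hypothetical and do not appear in src/comformat.c; the linear scan is only for illustration.

```c
/* Hypothetical span type with the same two fields as aUWidth[]. */
typedef struct {
  unsigned char w;    /* Width of the character in columns */
  int iFirst;         /* First character in a span having this width */
} Span;

/* The entries that appear only inside the comment, before and after
** this check-in.  They are copied from the diff; the array names are
** made up for this sketch. */
static const Span aOldPrefix[] = {
  {0, 0x00000},  {1, 0x00020},  {0, 0x0007f},  {1, 0x000a0},
};
static const Span aNewPrefix[] = {
  {1, 0x00000},
};

/* Return the width of the last span whose iFirst is <= c.
** Assumes nSpan>=1 and aSpan[0].iFirst<=c. */
static int span_width(const Span *aSpan, int nSpan, int c){
  int i;
  int w = aSpan[0].w;
  for(i=0; i<nSpan && aSpan[i].iFirst<=c; i++){
    w = aSpan[i].w;
  }
  return w;
}
```

Under this reading, span_width(aOldPrefix, 4, '\n') is 0 while span_width(aNewPrefix, 1, '\n') is 1, so control characters go from zero columns to one column. Printable ASCII such as 'A' is 1 in both.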
Old version (lines 115-131):
** Inaccuracies in the width estimates might cause columns to be misaligned.
** Unfortunately, there is nothing we can do about that.
*/
static int cli_wcwidth(int c){
  int iFirst, iLast;

  /* Fast path for common characters */
  if( c<0x20 ) return 0;
  if( c<0x7f ) return 1;
  if( c<0xa0 ) return 0;
  if( c<=0x300 ) return 1;

  /* The general case */
  iFirst = 0;
  iLast = sizeof(aUWidth)/sizeof(aUWidth[0]) - 1;
  while( iFirst<iLast-1 ){
    int iMid = (iFirst+iLast)/2;

New version (lines 115-128):
** Inaccuracies in the width estimates might cause columns to be misaligned.
** Unfortunately, there is nothing we can do about that.
*/
static int cli_wcwidth(int c){
  int iFirst, iLast;

  /* Fast path for common characters */
  if( c<=0x300 ) return 1;

  /* The general case */
  iFirst = 0;
  iLast = sizeof(aUWidth)/sizeof(aUWidth[0]) - 1;
  while( iFirst<iLast-1 ){
    int iMid = (iFirst+iLast)/2;
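
The diff cuts off inside the binary-search loop, so the rest of cli_wcwidth() is not visible here. The following is a minimal, self-contained sketch of the same idea: a fast path followed by a binary search for the last span that starts at or before c. It is an assumption-laden illustration, not the code from src/comformat.c. The names aSketch and sketch_wcwidth are hypothetical, the loop is a standard "last entry <= key" search rather than the original loop, and the table holds only the first 16 entries shown in the diff, so every code point at or above 0x5c8 is reported as width 1, which the full table does not do.

```c
/* Truncated copy of the span table: only the first 16 entries visible in
** the diff.  Anything at or above 0x5c8 falls into the last span here. */
static const struct {
  unsigned char w;    /* Width of the character in columns */
  int iFirst;         /* First character in a span having this width */
} aSketch[] = {
  {0, 0x00300},  {1, 0x00370},  {0, 0x00483},  {1, 0x00487},  {0, 0x00488},
  {1, 0x0048a},  {0, 0x00591},  {1, 0x005be},  {0, 0x005bf},  {1, 0x005c0},
  {0, 0x005c1},  {1, 0x005c3},  {0, 0x005c4},  {1, 0x005c6},  {0, 0x005c7},
  {1, 0x005c8},
};

/* Hypothetical stand-in for cli_wcwidth(): estimate the column width of
** code point c using the truncated table above. */
static int sketch_wcwidth(int c){
  int lo, hi;

  /* Same fast path as the new revision in the diff */
  if( c<=0x300 ) return 1;

  /* Find the last entry with iFirst<=c.  Because c>0x300 at this point,
  ** entry 0 always qualifies, so aSketch[lo].iFirst<=c holds throughout. */
  lo = 0;
  hi = (int)(sizeof(aSketch)/sizeof(aSketch[0])) - 1;
  while( lo<hi ){
    int mid = (lo+hi+1)/2;        /* round up so the loop always shrinks */
    if( aSketch[mid].iFirst<=c ){
      lo = mid;
    }else{
      hi = mid - 1;
    }
  }
  return aSketch[lo].w;
}
```

With this sketch, sketch_wcwidth(0x301) is 0 (a combining mark in the first zero-width span) and sketch_wcwidth(0x370) is 1. Note that 0x300 itself takes the fast path and is reported as 1, because the test shown in the diff is c<=0x300 even though the first table span starts at 0x300 with width 0.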