Overview
Comment: | Assorted improvements to www/globs.md, mainly to clarity and grammar. |
---|---|
SHA3-256: | 7898593d9de25804fc1c737148ab0f96 |
User & Date: | wyoung 2020-03-21 19:57:43.803 |
Context
2020-03-22
14:29 | Update the built-in SQLite to the latest 3.32.0 alpha that includes fixes for the DBCONFIG_MAINDBNAME problem. ... (check-in: 8d114c2a user: drh tags: trunk) |
2020-03-21
19:57 | Assorted improvements to www/globs.md, mainly to clarity and grammar. ... (check-in: 7898593d user: wyoung tags: trunk) |
2020-03-20
04:02 | Rename test function to match the test command name ... (check-in: 77be1777 user: andygoth tags: trunk) |
Changes
Changes to www/globs.md.
# File Name Glob Patterns

A [glob pattern][glob] is a text expression that matches one or more
file names using wild cards familiar to most users of a command line.
For example, `*` is a glob that matches any name at all and
`Readme.txt` is a glob that matches exactly one file.

A glob should not be confused with a [regular expression][regexp] (RE),
even though they use some of the same special characters for similar
purposes, because [they are not fully compatible][greinc] pattern
matching languages. Fossil uses globs when matching file names with the
settings described in this document, not REs.

[glob]:   https://en.wikipedia.org/wiki/Glob_(programming)
[greinc]: https://unix.stackexchange.com/a/57958/138
[regexp]: https://en.wikipedia.org/wiki/Regular_expression

These settings hold one or more file glob patterns to cause Fossil to
give matching named files special treatment.
Glob patterns are also accepted in options to certain commands and as
query parameters to certain Fossil UI web pages.

Where Fossil also accepts globs in commands, this handling may interact
with your OS’s command shell or its C runtime system, because they may
have their own glob pattern handling. We will detail such interactions
below.

## Syntax

Where Fossil accepts glob patterns, it will usually accept a *list* of
such patterns, each individual pattern separated from the others by
white space or commas. If a glob must contain white spaces or commas,
it can be quoted with either single or double quotation marks. A list
is said to match if any one glob in the list matches.

A glob pattern matches a given file name if it successfully consumes and
matches the *entire* name. Partial matches are failed matches.

Most characters in a glob pattern consume a single character of the file
name and must match it exactly. For instance, “a” in a glob simply
matches the letter “a” in the file name unless it is inside a special
character sequence.

Other characters have special meaning, and they may include otherwise
normal characters to give them special meaning:

:Pattern |:Effect
---------------------------------------------------------------------
`*`      | Matches any sequence of zero or more characters
`?`      | Matches exactly one character
`[...]`  | Matches one character from the enclosed list of characters
`[^...]` | Matches one character *not* in the enclosed list

Note that unlike [POSIX globs][pg], these special characters and
sequences are allowed to match `/` directory separators as well as the
initial `.` in the name of a hidden file or directory. This is because
Fossil file names are stored as complete path names. The distinction
between file name and directory name is “below” Fossil in this sense.
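The matching rules in this section can be sketched in a few lines of Python. This is only an illustration of the rules as documented here, not Fossil's actual implementation. The function names `glob_match` and `_match_class` are hypothetical, and treating an unterminated `[` list as a failed match is an assumption.

```python
def _match_class(pattern: str, i: int, ch: str):
    """Test ch against the bracket list that opens at pattern[i] == '['.

    Returns (matched, index just past the closing ']').
    An unterminated list is treated as a failed match (an assumption).
    """
    j = i + 1
    negate = j < len(pattern) and pattern[j] == "^"
    if negate:
        j += 1
    found = False
    first = True  # a ']' in first position is a literal, not the terminator
    while j < len(pattern) and (first or pattern[j] != "]"):
        lo = pattern[j]
        is_range = (j + 2 < len(pattern)
                    and pattern[j + 1] == "-"
                    and pattern[j + 2] != "]")
        if is_range:
            # Ranges compare raw code points, with no locale collation.
            found = found or (lo <= ch <= pattern[j + 2])
            j += 3
        else:
            found = found or (ch == lo)
            j += 1
        first = False
    if j >= len(pattern):
        return False, -1
    return found != negate, j + 1


def glob_match(pattern: str, name: str) -> bool:
    """Return True only if pattern consumes and matches the entire name."""
    def rec(p: int, n: int) -> bool:
        while p < len(pattern):
            c = pattern[p]
            if c == "*":
                # '*' may swallow any run of characters, including '/'.
                return any(rec(p + 1, k) for k in range(n, len(name) + 1))
            if n >= len(name):
                return False
            if c == "[":
                ok, nxt = _match_class(pattern, p, name[n])
                if not ok:
                    return False
                p = nxt - 1
            elif c != "?" and c != name[n]:
                return False
            p += 1
            n += 1
        return n == len(name)
    return rec(0, 0)
```

With this sketch, `glob_match("*/README", "src/library/README")` is true because `*` is allowed to cross `/`, while `glob_match("README", "src/README")` is false because nothing in the pattern consumes the `src/` prefix.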
[pg]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13

The bracket expressions above require some additional explanation:

 *  A range of characters may be specified with `-`, so `[a-f]` matches
    exactly the same characters as `[abcdef]`. Ranges reflect Unicode
    code points without any locale-specific collation sequence.
    Therefore, this particular sequence never matches the Unicode
    pre-composed character `é` (U+00E9), for example.

 *  This dependence on character/code point ordering may have other
    effects to surprise you. For example, the glob `[A-z]` not only
    matches upper and lowercase ASCII letters, it also matches several
    punctuation characters placed between `Z` and `a` in both ASCII
    and Unicode: `[`, `\`, `]`, `^`, `_`, and <tt>\`</tt>.

 *  You may include a literal `-` in a list by placing it last, just
    before the `]`.

 *  You may include a literal `]` in a list by making it the first
    character after the `[` or `[^`. At any other place, `]` ends the
    list.

 *  You may include a literal `^` in a list by placing it anywhere
    except after the opening `[`.

 *  Beware that a range must be specified from low value to high
    value: `[z-a]` does not match any character at all, preventing the
    entire glob from matching.

Some examples of character lists:

:Pattern |:Effect
---------------------------------------------------------------------
`[a-d]`  | Matches any one of `a`, `b`, `c`, or `d` but not `ä`
`[^a-d]` | Matches exactly one character other than `a`, `b`, `c`, or `d`
`[0-9a-fA-F]` | Matches exactly one hexadecimal digit
`[a-]`   | Matches either `a` or `-`
`[][]`   | Matches either `]` or `[`
`[^]]`   | Matches exactly one character other than `]`
`[]^]`   | Matches either `]` or `^`
`[^-]`   | Matches exactly one character other than `-`

White space means the specific ASCII characters TAB, LF, VT, FF, CR,
and SPACE. Note that this does not include any of the many additional
spacing characters available in Unicode such as U+00A0, NO-BREAK SPACE.
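As a rough illustration of how a glob list might be split using the separators described above, here is a minimal Python sketch. It is not Fossil's parser. The name `split_glob_list` is hypothetical, and two behaviors are assumptions: a quote character only opens a quoted glob when it is the first character of that glob, and an unterminated quote runs to the end of the text.

```python
# The six ASCII white space characters named in the text: TAB, LF, VT, FF, CR, SPACE.
WHITESPACE = "\t\n\v\f\r "


def split_glob_list(text: str) -> list[str]:
    """Split a glob list on ASCII white space and commas, honoring quotes."""
    globs = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in WHITESPACE or ch == ",":
            i += 1  # skip separators between globs
            continue
        if ch in "'\"":
            # Quoted glob: everything up to the matching straight quote.
            end = text.find(ch, i + 1)
            if end == -1:
                end = len(text)  # assumption: unterminated quote runs to the end
            globs.append(text[i + 1:end])
            i = end + 1
        else:
            j = i
            while j < len(text) and text[j] not in WHITESPACE and text[j] != ",":
                j += 1
            globs.append(text[i:j])
            i = j
    return globs
```

For example, `split_glob_list("*.o, *.obj\n'My Documents/*'")` yields three globs, and a U+00A0 NO-BREAK SPACE inside a glob stays part of that glob because it is not one of the six listed characters.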
Because both LF and CR are white space and leading and trailing spaces
are stripped from each glob in a list, a list of globs may be broken
into lines between globs when the list is stored in a file, as for a
versioned setting.

Note that 'single quotes' and "double quotes" are the ASCII straight
quote characters, not any of the other quotation marks provided in
Unicode and specifically not the "curly" quotes preferred by
typesetters and word processors.

## File Names to Match

Before it is compared to a glob pattern, each file name is transformed
to a canonical form:

  *  all directory separators are changed to `/`
  *  redundant slashes are removed
  *  all `.` path components are removed
  *  all `..` path components are resolved

(There are additional details we are ignoring here, but they cover rare
edge cases and follow the principle of least surprise.)

The glob must match the *entire* canonical file name to be considered a
match.

The goal is to have a name that is the simplest possible for each
particular file, and that will be the same regardless of the platform
you run Fossil on. This is important when you have a repository cloned
from multiple platforms and have globs in versioned settings: you want
those settings to be interpreted the same way everywhere.

Beware, however, that all glob matching in Fossil is case sensitive
regardless of host platform and file system. This will not be a surprise
on POSIX platforms where file names are usually treated case
sensitively. However, most Windows file systems are case preserving but
case insensitive. That is, on Windows, the names `ReadMe` and `README`
are usually names of the same file. The same is true in other cases,
such as by default on macOS file systems and in the file system drivers
for Windows file systems running on non-Windows systems (e.g. exfat on
Linux). Therefore, write your Fossil glob patterns to match the name of
the file as checked into the repository.
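The four canonicalization steps listed above can be approximated with the following Python sketch. It deliberately ignores the "additional details" the text mentions. The function name `canonical_name` is hypothetical, and it assumes that `\` is the only alternative directory separator, that names are relative to the root of the tree, and that a `..` with nothing left to pop is simply dropped.

```python
def canonical_name(name: str) -> str:
    """Approximate the canonical form described above (a sketch only)."""
    parts = []
    # Assumption: backslash is the only non-'/' separator we need to handle.
    for part in name.replace("\\", "/").split("/"):
        if part in ("", "."):
            continue            # redundant slash or '.' component
        if part == "..":
            if parts:
                parts.pop()     # resolve '..' against the previous component
            continue
        parts.append(part)
    return "/".join(parts)
```

Under these assumptions, `canonical_name("src\\.\\lib//..//README")` gives `src/README`. Case is never folded, so `ReadMe` and `README` remain distinct names for matching purposes.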
Some example cases:

:Pattern     |:Effect
--------------------------------------------------------------------------------
`README`     | Matches only a file named `README` in the root of the tree. It does not match a file named `src/README` because it does not include any characters that consume (and match) the `src/` part.
`*/README`   | Matches `src/README`. Unlike Unix file globs, it also matches `src/library/README`. However it does not match the file `README` in the root of the tree.
︙
[`test-echo`]: /help?cmd=test-echo
[`test-glob`]: /help?cmd=test-glob

## Converting `.gitignore` to `ignore-glob`

Many other version control systems handle the specific case of
ignoring certain files differently from Fossil: they have you create
individual "ignore" files in each folder, which specify things ignored
in that folder and below. Usually some form of glob pattern is used
in those files, but the details differ from Fossil.

In many simple cases, you can just store a top level "ignore" file in
`.fossil-settings/ignore-glob`. But as usual, there will be lots of
edge cases.

[Git has a rich collection of ignore files][gitignore] which
accumulate rules that affect the current command. There are global
files, per-user files, per workspace unmanaged files, and fully
version controlled files. Some of the files used have no set name, but
are called out in configuration files.

[gitignore]: https://git-scm.com/docs/gitignore

In contrast, Fossil has a global setting and a local setting, but the
local setting overrides the global rather than extending it. Similarly,
a Fossil command's `--ignore` option replaces the `ignore-glob` setting
rather than extending it.

With that in mind, translating a `.gitignore` file into
`.fossil-settings/ignore-glob` may be possible in many cases. Here are
some of the features of `.gitignore` and comments on how they relate to
Fossil:

 *  "A blank line matches no files...": same in Fossil.
 *  "A line starting with # serves as a comment....": not in Fossil.
 *  "Trailing spaces are ignored unless they are quoted..." is similar
    in Fossil. All whitespace before and after a glob is trimmed in
    Fossil unless quoted with single or double quotes. Git uses
    backslash quoting instead, which Fossil does not.
 *  "An optional prefix "!" which negates the pattern...": not in
    Fossil.
 *  Git's globs are relative to the location of the `.gitignore` file:
    Fossil's globs are relative to the root of the workspace.
 *  Git's globs and Fossil's globs treat directory separators
    differently. Git includes a notation for zero or more directories
    that is not needed in Fossil.

### Example

In a project with source and documentation:

    work
      +-- doc
︙
## Implementation and References

The implementation of the Fossil-specific glob pattern handling is here:

:File               |:Description
--------------------------------------------------------------------------------
[`src/glob.c`][]    | pattern list loading, parsing, and generic matching code
[`src/file.c`][]    | application of glob patterns to file names

[`src/glob.c`]: https://www.fossil-scm.org/index.html/file/src/glob.c
[`src/file.c`]: https://www.fossil-scm.org/index.html/file/src/file.c

See the [Adding Features to Fossil][aff] document for broader details
about finding and working with such code.

The actual pattern matching leverages the `GLOB` operator in SQLite, so
you may find [its documentation][gdoc], [source code][gsrc] and
[test harness][gtst] helpful.

[aff]:  ./adding_code.wiki
[gdoc]: https://sqlite.org/lang_expr.html#like
[gsrc]: https://www.sqlite.org/src/artifact?name=9d52522cc8ae7f5c&ln=570-768
[gtst]: https://www.sqlite.org/src/artifact?name=66a2c9ac34f74f03&ln=586-673
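Since the matching is said to build on SQLite's `GLOB` operator, one way to experiment with a pattern outside Fossil is to call that operator directly. The sketch below uses Python's standard `sqlite3` module. The helper name `sqlite_glob` is hypothetical, and this exercises only SQLite's operator: it does not reproduce Fossil's list parsing or file name canonicalization, and how Fossil calls into this code internally is not shown here.

```python
import sqlite3


def sqlite_glob(pattern: str, name: str) -> bool:
    """Evaluate `name GLOB pattern` using an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    try:
        # In SQL the operand order is: <string> GLOB <pattern>.
        (result,) = con.execute("SELECT ? GLOB ?", (name, pattern)).fetchone()
    finally:
        con.close()
    return bool(result)


if __name__ == "__main__":
    for pat, fname in [("*/README", "src/library/README"),
                       ("*/README", "README"),
                       ("README", "readme")]:
        print(pat, fname, sqlite_glob(pat, fname))
```

The first pair matches because `*` may span `/`. The other two do not, because the whole name must be consumed and matching is case sensitive.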