Update docs and prepare for 7.0 release (#130)
facelessuser authored Jul 27, 2020
1 parent 0d3d728 commit 534924f
Showing 6 changed files with 37 additions and 55 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/deploy.yml
@@ -17,6 +17,8 @@ jobs:

steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v1
with:
11 changes: 5 additions & 6 deletions docs/src/markdown/about/changelog.md
@@ -11,13 +11,12 @@ Check out [Release Notes](./release.md#upgrade-to-70) to learn more about upgrad
can be disabled.
- **NEW**: Search functions that use `scandir` will not return `.` and `..` for wildcard patterns that require iterating
over a directory to match the files against a pattern. This matches Python's glob and is most likely what most users
expect. Using a literal `.` or `..` in a pattern will still cause `.` and `..` to be matched.
expect. Pattern matching logic is unaffected.
- **NEW**: Add `SCANDOTDIR` flag to enable previous behavior of injecting `.` and `..` in `scandir` results.
This means that wildcard patterns (such as `.*`) will cause `glob` to return `.` and `..`, which matches Bash's
behavior. This only controls `scandir` behavior and will not affect match patterns in things like `globmatch`.
- **NEW**: Flag `NODOTDIR` has been added to disable patterns such as `.*` from matching `.` and `..` in matching
functions (that don't crawl the filesystem) such as `globmatch`, `pathlib.PurePath.match`, etc. When enabled, matching
functions will require a literal pattern of `.` and `..` to match the special directories `.` and `..`.
`SCANDOTDIR` has no effect on match functions such as `globmatch` which don't use directory scanning.
- **NEW**: Flag `NODOTDIR` has been added to disable patterns such as `.*` from matching `.` and `..`. When enabled,
matching logic is changed to require a literal pattern of `.` and `..` to match the special directories `.` and `..`.
This is more Zsh like.
- **FIX**: Negative extended glob patterns (`!(...)`) incorrectly allowed for hidden files to be returned when one of
the subpatterns started with `.`, even when `DOTMATCH`/`DOTGLOB` was not enabled.
- **FIX**: When `NOUNIQUE` is enabled and `pathlib` is being used, you could still get non-unique results across
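
The distinction these changelog entries draw between what a directory scan returns and what a pattern can match can be sketched with a toy model. This is not Wildcard Match's implementation: `scan`, `toy_glob`, and the `scan_dot_dir` toggle are hypothetical names, and the standard library's `fnmatch` stands in for the pattern matcher.

```python
import fnmatch


def scan(entries, scan_dot_dir=False):
    """Toy stand-in for a directory scan; `scan_dot_dir` is a hypothetical toggle."""
    names = list(entries)
    if scan_dot_dir:
        # Emulate the older, Bash-like behavior of injecting the special directories.
        names = [".", ".."] + names
    return names


def toy_glob(entries, pattern, scan_dot_dir=False):
    """Filter scanned names with a pattern; the matching logic itself never changes."""
    return [n for n in scan(entries, scan_dot_dir) if fnmatch.fnmatchcase(n, pattern)]


entries = [".hidden", "visible.txt"]
default_result = toy_glob(entries, ".*")                        # scan omits "." and ".."
bash_like_result = toy_glob(entries, ".*", scan_dot_dir=True)   # scan injects them
matches_dot_dot = fnmatch.fnmatchcase("..", ".*")               # the pattern alone matches ".."
```

In this model the pattern `.*` is always able to match `.` and `..`; whether they show up in results depends only on whether the scan step yields them.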
47 changes: 25 additions & 22 deletions docs/src/markdown/about/release.md
@@ -9,15 +9,18 @@ Notable changes will be highlighted here to help with migration to 7.0.
File globbing with [`glob.glob`](../glob.md#glob), [`glob.iglob`](../glob.md#iglob),
[`pathlib.Path.glob`](../pathlib.md#glob), and [`pathlib.Path.rglob`](../pathlib.md#rglob) no longer inject `.` and `..`
into results when scanning directories. This *only* affects the results of a scanned directory and does not
fundamentally change how glob patterns evaluate a path. If there is a desire to have glob pattern evaluation adopt this
behavior, the flag [`NODITDIR`](../glob.md#nodotdir) can be enabled, and will change pattern evaluation to act the same
way.
fundamentally change how glob patterns evaluate a path.

Python's default glob will not return `.` or `..` for any "magic" (non-literal) patterns in `glob`. This is because
magic patterns trigger glob to iterate over a directory in an attempt to find a file that can can match the given
"magic" pattern. Since `.` and `..` are not returned by `scandir`, `.` and `..` never get evaluated. Literal
patterns can side step the directory iterating with a simple check to see if the file exists. What this
means is that a "magic" pattern of `.*` will not match `.` or `..`, but a literal pattern of `.` or `..` will.
Python's default glob does not return `.` or `..` for any "magic" (non-literal) patterns in `glob`. This is because
magic patterns trigger glob to iterate over a directory in an attempt to find a file that can match the given "magic"
pattern. Since `.` and `..` are not returned by Python's implementation of `scandir`, `.` and `..` never get evaluated.
Literal patterns can side step the directory iteration with a simple check to see if the file exists. What this means is
that a "magic" pattern of `.*` will not match `.` or `..`, because it is not returned in the scan, but a literal pattern
of `.` or `..` will as the literal patterns are simply checked to see if they exist.
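
The scan-versus-literal behavior described above can be checked against Python's standard library `glob`. The following is a small sketch using a temporary directory; the helper name `demo_stdlib_glob` and the file name `.hidden` are illustrative choices, not part of any library.

```python
import glob
import os
import tempfile


def demo_stdlib_glob():
    """Compare a magic pattern with a literal one using the stdlib glob."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create a single hidden file so the magic pattern has something to find.
        open(os.path.join(tmp, ".hidden"), "w").close()
        base = glob.escape(tmp)  # guard against magic characters in the temp path

        # Magic pattern: the directory is iterated, and "." / ".." are never yielded.
        magic = [os.path.basename(p) for p in glob.glob(os.path.join(base, ".*"))]

        # Literal pattern: no iteration, only an existence check.
        literal = [os.path.basename(p) for p in glob.glob(os.path.join(base, ".."))]
        return magic, literal


magic_names, literal_names = demo_stdlib_glob()
```

Here `magic_names` holds only the hidden file, while `literal_names` holds `..`, since the literal pattern bypasses the directory iteration.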

This is common behavior for a number of libraries, Python, [node-glob], etc., but not all. Moving forward, we have
chosen to adopt Python's behavior as our default behavior, with the option of forcing Bash's behavior of returning
`.` and `..` in a directory scan if desired.

These examples will illustrate the behavior. In the first example, Python's `pathlib` is used to glob a
directory. We can note that not a single entry in the results is `.` or `..`.
@@ -37,16 +40,16 @@ We can also show that if we search for the literal pattern of `..` that glob wil
```

When using the `match` function, we see that the pattern can match `..` just fine. This illustrates that it is not the
patterns the pattern logic that restricts this, but a result of the behavior exhibited by `scandir`.
pattern logic that restricts this, but a result of the behavior exhibited by `scandir`.

```pycon3
>>> import pathlib
>>> pathlib.Path('..').match('.*')
True
```

While our algorithm is more complicated due to some of the features we support, and it may oversimplify things to say we
now turn off injecting `.` and `..` into `scandir` results, bit for all intents and purposes, all of our file system
While our algorithm is different due to some of the features we support, and it may oversimplify things to say we
now turn off injecting `.` and `..` into `scandir` results, but for all intents and purposes, all of our file system
globbing functions exhibit the same behavior as Python's default glob now.


@@ -71,7 +74,7 @@ patterns, which are used to filter the results, can match `.` or `..` with `.*`:
[]
```

If we want to modify the pattern matcher and not just the the directory scanner, we can use the flag
If we want to modify the pattern matcher, and not just the directory scanner, we can use the flag
[`NODOTDIR`](../glob.md#nodotdir).

```pycon3
@@ -86,8 +89,8 @@ These changes were done for a couple of reasons:

1. Generally, it is rare to specifically want `.` and `..`, so often when people glob with something like `**/.*`, they
are just trying to get hidden files. While we generally model our behavior off Bash, there are many alternative
shells (such as Zsh) that do not return `.` and `..` except when a literal pattern of `.` and `..` is
provided.
shells (such as Zsh) that do not return or match `.` and `..` with magic patterns by design, regardless of what
the directory scanner returns.

2. Many people who come to use our library are probably coming from having experience with Python's glob. By mirroring
this behavior out of the box, it may help people adapt to the library easier.
@@ -156,23 +159,23 @@ were attempted, failure was likely.
7.0 brings improvements related to Windows drives and UNC paths. Glob patterns will now properly respect extended UNC
paths such as `//?/UNC/LOCALHOST/c$` and others. This means you can use these patterns without issues. And just like
simple cases (`//server/mount`), extended case do not require escaping meta characters, except when using pattern
simple cases (`//server/mount`), extended cases do not require escaping meta characters, except when using pattern
expansion syntax that is available via [`BRACE`](../glob.md#brace) and [`SPLIT`](../glob.md#split).
### Glob Escaping
Because it can be problematic trying to mix Windows drives that use characters such as `{` and `}` with the
[`BRACE`](../glob.md#brace) flag, you can now escape these meta characters in drives if required. Prior to 7.0, such
escaping was disallowed, but now you can safely escape `{` and `}` to ensure optimal brace handling. While you can
safely escape other meta characters in drive as well, it is never actually needed.
safely escape other meta characters in drives as well, it is never actually needed.
Additionally, [`glob.escape`](../glob.md#escape) and [`glob.raw_escape`](../glob.md#raw_escape) will automatically
escape `{`, `}` and `|` to avoid in complications [`BRACE`](../glob.md#brace) and [`SPLIT`](../glob.md#split).
escape `{`, `}` and `|` to avoid complications with [`BRACE`](../glob.md#brace) and [`SPLIT`](../glob.md#split).
In general, a lot of corner cases with [`glob.escape`](../glob.md#escape) and [`glob.raw_escape`](../glob.md#raw_escape)
were cleaned up. [`glob.escape`](../glob.md#escape) is meant to handle the escaping of normal paths, into strings that
can be used in patterns. For instance, to use back slashes in a glob pattern, you must use escaped back slashes because
you can also escape meta characters:
were cleaned up.
[`glob.escape`](../glob.md#escape) is meant to handle the escaping of normal paths so that they can be used in patterns.
```pycon3
>>> glob.escape(r'my\file-[work].txt', unix=False)
@@ -186,7 +189,7 @@ represented by two `\`), then [`glob.raw_escape`](../glob.md#raw_escape) is what
'my\\\\file\\-\\[work\\].txt'
```
By default [`glob.raw_escape`](../glob.md#raw_escape) always translates Python character back references into actual
By default, [`glob.raw_escape`](../glob.md#raw_escape) always translates Python character back references into actual
characters, but if this is not needed, a new option called `raw_chars` (`True` by default) has been added to disable
this behavior:
@@ -213,7 +216,7 @@ would still be returned for multiple patterns, and even a case where duplicates
Due to `pathlib` file path normalization, `.` directories are stripped out, and trailing slashes are stripped off paths.
With the changes noted in [Globbing](#globbing-special-directories) single pattern cases no longer return duplicate
paths, but results across multiple patterns still could. For instance, it is possible that three different patterns,
provided at the same time (or through pattern expansion) could match the following paths: `file/./path`, `file/path/.`
provided at the same time (or through pattern expansion) could match the following paths: `file/./path`, `file/path/.`,
and `file/path`. Each of these results is unique as far as glob is concerned, but due to the `pathlib` normalization of
`.` and trailing slashes, `pathlib` glob will return all three of these results as `file/path`, giving three identical
results.
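
The normalization described above can be reproduced with the standard library's `pathlib` alone. This is a minimal sketch that takes the three example paths from the text as plain strings; it does not involve Wildcard Match's own `pathlib` wrapper.

```python
from pathlib import PurePosixPath

# The three distinct glob results mentioned in the text.
raw_matches = ["file/./path", "file/path/.", "file/path"]

# pathlib drops "." components and trailing slashes when it parses a path.
normalized = [str(PurePosixPath(p)) for p in raw_matches]

# Removing duplicates after normalization, preserving first-seen order.
unique = list(dict.fromkeys(normalized))
```

All three inputs normalize to `file/path`, which is why results that are distinct as strings become identical once they are `pathlib` objects.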
7 changes: 4 additions & 3 deletions docs/src/markdown/glob.md
@@ -72,9 +72,10 @@ Pattern | Meaning
- In general, Wildcard Match's behavior is modeled off of Bash's, and prior to version 7.0, unlike Python's default
[`glob`][glob], Wildcard Match's [`glob`](#glob) would match and return `.` and `..` for magic patterns like `.*`.
This is because our directory scanning logic inserts `.` and `..` into results to be faithful to Bash. While this
emulates Bash's behavior, it can be surprising to the user. In 7.0 we now avoid returning `.` and `..` in our
directory scanner. You can once again enable the old Bash-like behavior with the flag [`SCANDOTDIR`](#scandotdir) if
this old behavior is desired.
emulates Bash's behavior, it can be surprising to the user, especially if they are used to Python's default glob. In
7.0 we now avoid returning `.` and `..` in our directory scanner. This does not affect how patterns are matched, just
what is returned via our directory scan logic. You can once again enable the old Bash-like behavior with the flag
[`SCANDOTDIR`](#scandotdir) if this old behavior is desired.

Python's default:

23 changes: 0 additions & 23 deletions docs/src/markdown/pathlib.md
@@ -109,29 +109,6 @@ matching:
- [`match`](#match) will exhibit the same right to left behavior.
- Prior to version 7.0, Wildcard Match used to return `.` and `..` when scanning a directory for a "magic" pattern, this
would cause [`glob`](#glob) and [`rglob`](#rglob) to return `.` and `..` for "magic" patterns such as `.*`. While this
matched Bash's behavior quite well, this did not match Python's default library and created some confusion in certain
scenarios. In version 7.0+, Wildcard Match's directory scanning will no longer return `.` and `..`. In order to match
`.` and `..`, a literal pattern of `.` or `..` should be used.
It is important to note that this only affects the directory scanning behavior, a glob pattern of `.*` will still
match `.` and `..`, you just won't see these results when a directory is scanned for a "magic" pattern. Exclude
patterns via [`NEGATE`](#negate) will still match `.` and `..` with `.*` as these are applied to the final results
after directory scanning has occurred.
For an example why this change is important to `pathlib`, let's consider the pattern `**/.*`. Wildcard Match's glob
patterns would reasonably match `.hidden` and `.hidden/.` with such a pattern. `pathlib` would normalize both of
these results to simply `.hidden` as `.` and trailing slashes would get removed. This made it difficult for users to
understand why `.hidden` was matched twice. Even more confusing to users was when `**/.*` would match
`not-hidden/.` but be normalized as `not-hidden`.
For more information as to why these changes were made, please see the
[Release Note](./about/release.md#upgrade-to-70).
!!! new "New 7.0"
[`glob`](#glob) and [`rglob`](#rglob) directory scanning does not return `.` and `..`.
## Classes
#### `pathlib.PurePath` {: #purepath}
2 changes: 1 addition & 1 deletion requirements/docs.txt
@@ -1,4 +1,4 @@
mkdocs_pymdownx_material_extras==1.0.4
mkdocs_pymdownx_material_extras==1.0.6
mkdocs-git-revision-date-localized-plugin
mkdocs-minify-plugin
pyspelling
