From b7f92423b9595cd4cd1b87005292d853282476c8 Mon Sep 17 00:00:00 2001
From: facelessuser
Date: Sun, 30 Dec 2018 19:25:35 -0700
Subject: [PATCH] Update wording of documentation.

---
 README.md                      | 37 +++++++++++++++-------------------
 docs/src/markdown/index.md     |  6 +++---
 docs/src/markdown/selectors.md | 27 +++++++++++++++----------
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/README.md b/README.md
index 495a8552..ca9a79f6 100644
--- a/README.md
+++ b/README.md
@@ -8,27 +8,22 @@
 
 ## Overview
 
-Soup Sieve is a CSS4 selector library designed to be used with
-[Beautiful Soup 4](https://beautiful-soup-4.readthedocs.io/en/latest/#). It aims to provide selecting, matching, and
-filtering with using modern CSS selectors.
-
-While Beautiful Soup comes with a builtin CSS selection API, it is not without issues. In addition, it also lacks
-support for some more modern CSS features.
-
-Soup Sieve implements most of the CSS4 selectors, though there are a number that don't make sense in a non-browser
-environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported
-selectors are:
-
-- `.classes`
-- `#ids`
-- `[attributes=value]`
-- `parent child`
-- `parent > child`
-- `sibling ~ sibling`
-- `sibling + sibling`
-- `:not(element.class, element2.class)`
-- `:is(element.class, element2.class)`
-- `parent:has(> child)`
+Soup Sieve is a CSS selector library designed to be used with [Beautiful Soup 4](https://beautiful-soup-4.readthedocs.io/en/latest/#). It aims to provide selecting, matching, and filtering using modern CSS selectors. Soup Sieve currently provides selectors from the CSS level 1 specifications up through the latest CSS level 4 drafts (though some are not yet implemented).
+
+While Beautiful Soup comes with a builtin CSS selection API, it is very basic and not without issues. It lacks support for many modern CSS features. Soup Sieve is planned to officially replace Beautiful Soup's current internal CSS selector implementation, but can also be imported in order to use its API directly.
+
+Soup Sieve implements most of the CSS selectors, though there are a number that don't make sense in a non-browser environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported selectors are:
+
+- `#!css .classes`
+- `#!css #ids`
+- `#!css [attributes=value]`
+- `#!css parent child`
+- `#!css parent > child`
+- `#!css sibling ~ sibling`
+- `#!css sibling + sibling`
+- `#!css :not(element.class, element2.class)`
+- `#!css :is(element.class, element2.class)`
+- `#!css parent:has(> child)`
 - and many more
 
 ## Installation
 
diff --git a/docs/src/markdown/index.md b/docs/src/markdown/index.md
index a4f46ac3..88ffa793 100644
--- a/docs/src/markdown/index.md
+++ b/docs/src/markdown/index.md
@@ -2,11 +2,11 @@
 
 ## Overview
 
-Soup Sieve is a CSS selector library designed to be used with [Beautiful Soup 4][bs4]. It aims to provide selecting, matching, and filtering using modern CSS selectors. Soup Sieve currently provides selectors from a subset of the CSS4 specification.
+Soup Sieve is a CSS selector library designed to be used with [Beautiful Soup 4][bs4]. It aims to provide selecting, matching, and filtering using modern CSS selectors. Soup Sieve currently provides selectors from the CSS level 1 specifications up through the latest CSS level 4 drafts (though some are not yet implemented).
 
-While Beautiful Soup comes with a builtin CSS selection API, it is not without issues. In addition, it also lacks support for some more modern CSS features.
+While Beautiful Soup comes with a builtin CSS selection API, it is very basic and not without issues. It lacks support for many modern CSS features. Soup Sieve is planned to officially replace Beautiful Soup's current internal CSS selector implementation, but can also be imported in order to use its API directly.
 
-Soup Sieve implements most of the CSS4 selectors, though there are a number that don't make sense in a non-browser environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported selectors are:
+Soup Sieve implements most of the CSS selectors, though there are a number that don't make sense in a non-browser environment. Selectors that cannot provide meaningful functionality simply do not match anything. Some of the supported selectors are:
 
 - `#!css .classes`
 - `#!css #ids`
diff --git a/docs/src/markdown/selectors.md b/docs/src/markdown/selectors.md
index cf1d3fc2..f8c8877c 100644
--- a/docs/src/markdown/selectors.md
+++ b/docs/src/markdown/selectors.md
@@ -4,11 +4,11 @@
 
 ### HTML and XML Selectors
 
-The CSS selectors are based off of the CSS level 4 specification. Primarily support has been added for selectors that were feasible to implement and most likely to get practical use. Selectors that cannot provide meaningful matches, simply match nothing. An example would be `:focus` which will match nothing because elements cannot be focused outside of a browser. Though most of the selectors have been implemented, there are still a few that are not.
+The CSS selectors are based off of the CSS level 4 specification. Primarily support has been added for selectors that were feasible to implement and most likely to get practical use. Selectors that cannot provide meaningful matches will match nothing. An example would be `:focus` which will match nothing because elements cannot be focused outside of a browser. Though most of the selectors have been implemented, there are still a few that have not.
 
 Below shows accepted selectors. When speaking about namespaces, they only apply to XML, XHTML, or when dealing with recognized foreign tags in HTML5. You must configure the CSS [namespaces](./api.md#namespaces) when attempting to evaluate namespaces.
 
-While an effort is made to mimic CSS selector behavior, there may be some differences or quirks, please report issues if any are found. We do not support all CSS selector features, but enough to make filtering and searching more enjoyable.
+While an effort is made to mimic CSS selector behavior, there may be some differences or quirks, please report issues if any are found.
 
 Selector                        | Example                             | Description
 ------------------------------- | ----------------------------------- | -----------
@@ -57,16 +57,23 @@ Selector                        | Example                             | Descript
 `:root`                         | `#!css :root`                       | Selects the root element. In HTML, this is usually the `#!html <html>` element.
 `:scope`                        | `#!css :scope div`                  | Selects all `#!html <div>` elements under the current scope element. `:scope` is the element under match or select. In the case where a document (`BeautifulSoup` object, not a `Tag` object) is under select or match, `:scope` equals `:root`.
 
-!!! warning "Experimental Selectors"
+!!! warning "Expensive Selectors"
+    Some selectors are more expensive to use than others. For instance, `:has()` can be a bit more expensive as `:has(a)` will search all children of every element to find if the element contains an `#!html <a>` element.
+
+    While an effort is made to prioritize evaluation of less expensive selectors first in the hopes to invalidate the search early on and avoid evaluating expensive selectors unless needed, you should still try to be as specific as possible to limit how often expensive selectors are evaluated. For instance, using `p.special:has(a)` will limit evaluating `:has()` to only `#!html <p>` elements that contain the `special` class.
+
+!!! warning "CSS4 Selectors"
     In general, CSS4 specific features and selectors are not finalized in the official CSS4 specification, and may change in the future. While some are most likely quite stable, some may be less certain.
 
-    Some implementations are based from our interpretation of the specification. It is possible our interpretation is incorrect. This is more likely with selectors that currently have no reference implementations in browsers such as `:has()` and `of S` support in `:nth-child(an+b [of S]?)`. If any issues are discovered please report the issue with details and examples so we can get them right.
+    Some implementations are based off our interpretation of the specification. It is possible our interpretation is incorrect. This is more likely with selectors that currently have no reference implementations in browsers, such as `:has()` and `of S` support in `:nth-child(an+b [of S]?)`. If any issues are discovered please report the issue with details and examples so we can get them right.
+
+    If at anytime CSS4 drops a selector from the current draft, it will most likely also be removed here, except in the rare case that the selector is found to be far too useful despite being rejected.
 
 !!! danger "Not Implemented"
     Pseudo elements are not supported as they do not represent real elements.
 
-    At-rules (`@page`) are not supported.
+    At-rules (`@page`, etc.) are not supported.
 
 ### HTML Only Selectors
 
@@ -106,19 +113,17 @@ Selector                        | Example                             | Descript
 
 ## Custom Selectors
 
-Below is listed non-standard CSS selectors. These can contain useful selectors that were rejected from the official CSS specifications, selectors implemented by other systems such as JQuery, or even selectors specific to Soup Sieve.
+Below is a list of non-standard CSS selectors that we support. These can contain useful selectors that were rejected from the official CSS specifications, selectors implemented by other systems such as JQuery, or even selectors specifically created for Soup Sieve.
 
-Just because we include selectors from one source, does not mean we have intentions of implementing other selectors from the same source.
+Just because we include selectors from one source, does not mean we have intentions of implementing other selectors from the same such source.
 
 Selector                        | Example                             | Description
 ------------------------------- | ----------------------------------- | -----------
 `[attribute!=value]`            | `#!css [target!=_blank]`            | Equivalent to `#!css :not([target=_blank])`.
-`:contains(text)`               | `#!css p:contains(text)`            | Select all `#!html <p>` elements that contain "text" in their content, either directly in themselves or indirectly in their decedents.
+`:contains(text)`               | `#!css p:contains(text)`            | Select all `#!html <p>` elements that contain "text" in their content, either directly in themselves or indirectly in their descendants.
 
 !!! warning "Contains"
-    Contains is an expensive operation as it scans every element's content, which includes all of the content of each child of the element under consideration.
-
-    If you use a pattern such as `#!css *:contains(text)` it will scan all the children of every element. This will cause some elements to get scanned over and over again as the tree is walked. The more specific you are the better. If you use `#!css p.special:contains(text)`, only `#!html <p>` elements with the class `special` will have their content scanned (along with their children). `contains` is evaluated as one of the last checks, so if any other comparison invalidates the match, `contains` will not be performed.
+    `:contains()` is an expensive operation as it scans all the text nodes of an element under consideration, which includes all descendants. Using highly specific selectors can reduce how often it is evaluated.
 
 --8<--
 refs.txt