feat(filter): allow for PartialOrd comparisons
* Extend the filter language to support PartialOrd comparisons such as 5 >= 4 or 4 < 20.
* Document the available filter fields in README.md and fix some markdown lint warnings.
* Add a few filter fields in src/helpers/github.rs that I found interesting or useful (e.g. to back up only repositories with at least 5 stargazers).
cedi committed Nov 29, 2024
1 parent 6803644 commit ab27e2a
Showing 10 changed files with 342 additions and 9 deletions.
56 changes: 49 additions & 7 deletions README.md
@@ -1,4 +1,5 @@
# GitHub Backup

**Automatically backup your GitHub repositories to your local machine.**

This tool is designed to automatically pull the list of GitHub repositories from one, or more,
@@ -7,8 +8,11 @@ as part of a scheduled backup process with the ultimate goal of ensuring that yo
copy of all of your GitHub repositories should the unthinkable happen.

## Features

- **Backup Multiple Organizations**, automatically gathering the full list of repositories for
each organization through the GitHub API.
- **Backup Starred Repos**, automatically gathering the full list of your starred repositories for your account.
- **Repo Allowlists/Denylists** to provide fine-grained control over which repositories are backed
up and which are not.
- **GitHub Enterprise Support** for those of you running your own GitHub instances and not relying
@@ -53,6 +57,7 @@ backups:
```
### OpenTelemetry Reporting
In addition to the standard logging output, this tool also supports reporting metrics to an
OpenTelemetry-compatible backend. This can be useful for tracking the performance of the tool
over time and configuring monitoring in case backups start to fail.
@@ -67,6 +72,7 @@ OTEL_TRACES_SAMPLER_ARG=1.0
```

## Filters

This tool allows you to configure filters to control which GitHub repositories are backed up and
which are not. Filters are used within the `backups` section of your configuration file and can
be specified on a per-user or per-organization basis.
@@ -75,12 +81,48 @@ When writing a filter, the goal is to write a logical expression which evaluates
you wish to include a repository and `false` when you wish to exclude it. The filter language supports
several operators and properties which can be used to control this process.

### Available filters

For `kind: github/repo` and `kind: github/star`

| Field | Type | Description (_Example_) |
| --------------------- | --------- | ------------------------------------------------------------------------------------------------- |
| `repo.name` | `string` | The name of the repository (_Hello-World_) |
| `repo.fullname` | `string` | The full name of the repository (_octocat/Hello-World_) |
| `repo.private` | `boolean` | Whether the repository is private |
| `repo.public` | `boolean` | Whether the repository is public |
| `repo.fork` | `boolean` | Whether the repository is a fork |
| `repo.size` | `integer` | The size of the repository, in kilobytes (_1024_). |
| `repo.archived` | `boolean` | Whether the repository is archived |
| `repo.disabled` | `boolean` | Whether the repository is disabled |
| `repo.default_branch` | `string` | The default branch of the repository (_main_) |
| `repo.empty` | `boolean` | Whether the repository is empty (When a repository is initially created, `repo.empty` is `true`) |
| `repo.template` | `boolean` | Whether this repository acts as a template that can be used to generate new repositories |
| `repo.forks` | `integer` | The number of times this repository has been forked |
| `repo.stargazers` | `integer` | The number of people who have starred this repository |

For `kind: github/release`

| Field | Type | Description (_Example_) |
| -------------------- | --------- | ----------------------------------------------------------------- |
| `release.tag` | `string` | The name of the tag (_v1.0.0_) |
| `release.name` | `string` | The name of the release (_v1.0.0_) |
| `release.draft` | `boolean` | Whether the release is a draft (unpublished) release |
| `release.prerelease` | `boolean` | Whether the release is marked as a prerelease rather than a full release |
| `release.published` | `boolean` | Whether the release is a published (not a draft) release |
| `asset.name` | `string` | The file name of the asset (_github-backup-darwin-arm64_) |
| `asset.size` | `integer` | The size of the asset, in kilobytes (_1024_) |
| `asset.downloaded` | `boolean` | Whether the asset has been downloaded at least once from the GitHub release |

### Examples

Here are some examples of filters you might choose to use:

- `!repo.fork && !repo.archived && !repo.empty` - Do not include repositories which are forks, archived, or empty.
- `repo.private` - Only include private repositories in your list.
- `repo.public && !repo.fork` - Only include public repositories which are not forks.
- `repo.name contains "awesome"` - Only include repositories which have "awesome" in their name.
- `(repo.name contains "awesome" || repo.name contains "cool") && !repo.fork` - Only include repositories which have "awesome" or "cool" in their name and are not forks.
- `!release.prerelease && !asset.source-code` - Only include release artifacts which are not marked as pre-releases and are not source code archives.
- `repo.name in ["git-tool", "grey"]` - Only include repositories with the names "git-tool" or "grey".
- `!repo.fork && !repo.archived && !repo.empty` - Do not include repositories which are forks, archived, or empty.
- `repo.private` - Only include private repositories in your list.
- `repo.public && !repo.fork` - Only include public repositories which are not forks.
- `repo.name contains "awesome"` - Only include repositories which have "awesome" in their name.
- `(repo.name contains "awesome" || repo.name contains "cool") && !repo.fork` - Only include repositories which have "awesome" or "cool" in their name and are not forks.
- `!release.prerelease && !asset.source-code` - Only include release artifacts which are not marked as pre-releases and are not source code archives.
- `repo.name in ["git-tool", "grey"]` - Only include repositories with the names "git-tool" or "grey".
- `repo.stargazers >= 5` - Only include repositories with at least 5 stars.
23 changes: 23 additions & 0 deletions src/filter/interpreter.rs
@@ -33,6 +33,10 @@ impl<'a, T: Filterable> ExprVisitor<FilterValue> for FilterContext<'a, T> {
Token::In(..) => right.contains(&left).into(),
Token::StartsWith(..) => left.startswith(&right).into(),
Token::EndsWith(..) => left.endswith(&right).into(),
Token::GreaterThan(..) => (left > right).into(),
Token::SmallerThan(..) => (left < right).into(),
Token::GreaterEqual(..) => (left >= right).into(),
Token::SmallerEqual(..) => (left <= right).into(),
token => unreachable!("Encountered an unexpected binary operator '{token}'"),
}
}
@@ -139,6 +143,25 @@ mod tests {
assert_eq!(TestFilterable::matches(filter), expected);
}

#[rstest]
#[case("2 > 1", true)]
#[case("1 > 2", false)]
#[case("2 >= 1", true)]
#[case("2 >= 2", true)]
fn greater_than(#[case] filter: &str, #[case] expected: bool) {
assert_eq!(TestFilterable::matches(filter), expected);
}

#[rstest]
#[case("1 < 2", true)]
#[case("2 < 1", false)]
#[case("1 <= 2", true)]
#[case("1 <= 1", true)]
#[case("2 <= 1", false)]
fn smaller(#[case] filter: &str, #[case] expected: bool) {
assert_eq!(TestFilterable::matches(filter), expected);
}

#[rstest]
#[case("boolean != true", false)]
#[case("boolean != false", true)]
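The interpreter change above maps each new token onto the matching Rust comparison operator of the value type. A minimal sketch of that dispatch follows, assuming a reduced value type; `Value`, `CompareOp`, and `compare` are hypothetical names for illustration and are not part of the crate.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `FilterValue`, reduced to two variants.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

impl PartialOrd for Value {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        match (self, other) {
            (Value::Number(a), Value::Number(b)) => a.partial_cmp(b),
            (Value::Text(a), Value::Text(b)) => a.partial_cmp(b),
            // Values of different kinds have no ordering.
            _ => None,
        }
    }
}

/// Hypothetical operator enum mirroring the four new tokens.
#[derive(Debug, Clone, Copy)]
enum CompareOp {
    GreaterThan,
    GreaterEqual,
    SmallerThan,
    SmallerEqual,
}

/// Evaluates `left <op> right`; incomparable operands yield `false`
/// because every operator is derived from `partial_cmp` returning `None`.
fn compare(left: &Value, op: CompareOp, right: &Value) -> bool {
    match op {
        CompareOp::GreaterThan => left > right,
        CompareOp::GreaterEqual => left >= right,
        CompareOp::SmallerThan => left < right,
        CompareOp::SmallerEqual => left <= right,
    }
}
```

With this shape, comparing a number against a string is simply `false` for all four operators, which is one reasonable way to handle mixed-type filters without raising an error.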
32 changes: 31 additions & 1 deletion src/filter/lexer.rs
@@ -210,6 +210,32 @@ impl<'a> Iterator for Scanner<'a> {
))));
}
}
'>' => {
if self.match_char('=') {
return Some(Ok(Token::GreaterEqual(Loc::new(
self.line,
1 + idx - self.line_start,
))));
} else {
return Some(Ok(Token::GreaterThan(Loc::new(
self.line,
idx - self.line_start,
))));
}
}
'<' => {
if self.match_char('=') {
return Some(Ok(Token::SmallerEqual(Loc::new(
self.line,
1 + idx - self.line_start,
))));
} else {
return Some(Ok(Token::SmallerThan(Loc::new(
self.line,
idx - self.line_start,
))));
}
}
'"' => {
return Some(self.read_string(idx));
}
@@ -275,13 +301,17 @@ mod tests {
#[test]
fn test_comparison_operators() {
assert_sequence!(
"== != contains in startswith endswith",
"== != contains in startswith endswith > >= < <=",
Token::Equals(..),
Token::NotEquals(..),
Token::Contains(..),
Token::In(..),
Token::StartsWith(..),
Token::EndsWith(..),
Token::GreaterThan(..),
Token::GreaterEqual(..),
Token::SmallerThan(..),
Token::SmallerEqual(..),
);
}
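The lexer hunk relies on one character of lookahead to tell `>` from `>=` and `<` from `<=`. The sketch below shows the same idea in isolation, using the standard library's `Peekable::next_if` instead of the scanner's own `match_char`. `Op` and `scan_comparisons` are hypothetical names, and the recorded position is the 0-based byte offset of the operator's first character, which is an assumption of this sketch and not the `Loc` convention used by the crate.

```rust
/// Hypothetical token type covering only the four comparison operators.
#[derive(Debug, PartialEq)]
enum Op {
    GreaterThan,
    GreaterEqual,
    SmallerThan,
    SmallerEqual,
}

/// Scans `input` for comparison operators and returns each one together with
/// the byte offset of its first character. All other characters are skipped.
fn scan_comparisons(input: &str) -> Vec<(Op, usize)> {
    let mut chars = input.char_indices().peekable();
    let mut tokens = Vec::new();

    while let Some((idx, c)) = chars.next() {
        if c != '>' && c != '<' {
            continue;
        }

        // Consume a directly following '=' so that ">=" becomes a single token.
        let has_eq = chars.next_if(|&(_, next)| next == '=').is_some();

        // `c` can only be '>' or '<' here, so the wildcard arms cover '<'.
        let op = match (c, has_eq) {
            ('>', true) => Op::GreaterEqual,
            ('>', false) => Op::GreaterThan,
            (_, true) => Op::SmallerEqual,
            (_, false) => Op::SmallerThan,
        };
        tokens.push((op, idx));
    }

    tokens
}
```

Checking for the two-character form first matters: if `>` were emitted before looking ahead, `>=` would lex as `>` followed by a stray `=`.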

4 changes: 4 additions & 0 deletions src/filter/mod.rs
@@ -152,6 +152,10 @@ mod tests {
#[case("age != 30", false)]
#[case("age == 31", false)]
#[case("age != 31", true)]
#[case("age > 31", false)]
#[case("age < 31", true)]
#[case("age >= 30", true)]
#[case("age <= 30", true)]
#[case("tags == [\"red\"]", true)]
#[case("tags != [\"red\"]", false)]
#[case("tags == [\"blue\"]", false)]
8 changes: 8 additions & 0 deletions src/filter/parser.rs
@@ -84,6 +84,10 @@ impl<'a, I: Iterator<Item = Result<Token<'a>, Error>>> Parser<'a, I> {
| Some(Ok(Token::Contains(..)))
| Some(Ok(Token::StartsWith(..)))
| Some(Ok(Token::EndsWith(..)))
| Some(Ok(Token::GreaterThan(..)))
| Some(Ok(Token::GreaterEqual(..)))
| Some(Ok(Token::SmallerThan(..)))
| Some(Ok(Token::SmallerEqual(..)))
) {
let token = self.tokens.next().unwrap().unwrap();
let right = self.unary()?;
@@ -222,6 +226,10 @@ mod tests {
#[case("true != false", Expr::Binary(Box::new(Expr::Literal(true.into())), Token::NotEquals(Loc::new(1, 6)), Box::new(Expr::Literal(false.into()))))]
#[case("\"xyz\" startswith \"x\"", Expr::Binary(Box::new(Expr::Literal("xyz".into())), Token::StartsWith(Loc::new(1, 7)), Box::new(Expr::Literal("x".into()))))]
#[case("\"xyz\" endswith \"z\"", Expr::Binary(Box::new(Expr::Literal("xyz".into())), Token::EndsWith(Loc::new(1, 7)), Box::new(Expr::Literal("z".into()))))]
#[case("1 < 2", Expr::Binary(Box::new(Expr::Literal(1.0.into())), Token::SmallerThan(Loc::new(1, 2)), Box::new(Expr::Literal(2.0.into()))))]
#[case("1 > 2", Expr::Binary(Box::new(Expr::Literal(1.0.into())), Token::GreaterThan(Loc::new(1, 2)), Box::new(Expr::Literal(2.0.into()))))]
#[case("1 <= 2", Expr::Binary(Box::new(Expr::Literal(1.0.into())), Token::SmallerEqual(Loc::new(1, 3)), Box::new(Expr::Literal(2.0.into()))))]
#[case("1 >= 2", Expr::Binary(Box::new(Expr::Literal(1.0.into())), Token::GreaterEqual(Loc::new(1, 3)), Box::new(Expr::Literal(2.0.into()))))]
fn parse_comparison_expressions(#[case] input: &str, #[case] ast: Expr) {
let tokens = crate::filter::lexer::Scanner::new(input);
match Parser::parse(tokens.into_iter()) {
12 changes: 12 additions & 0 deletions src/filter/token.rs
@@ -24,6 +24,10 @@ pub enum Token<'a> {
In(Loc),
StartsWith(Loc),
EndsWith(Loc),
GreaterThan(Loc),
SmallerThan(Loc),
GreaterEqual(Loc),
SmallerEqual(Loc),

Not(Loc),
And(Loc),
@@ -53,6 +57,10 @@ impl Token<'_> {
Token::In(..) => "in",
Token::StartsWith(..) => "startswith",
Token::EndsWith(..) => "endswith",
Token::GreaterThan(..) => ">",
Token::GreaterEqual(..) => ">=",
Token::SmallerThan(..) => "<",
Token::SmallerEqual(..) => "<=",

Token::Not(..) => "!",
Token::And(..) => "&&",
@@ -82,6 +90,10 @@ impl Token<'_> {
Token::In(loc) => *loc,
Token::StartsWith(loc) => *loc,
Token::EndsWith(loc) => *loc,
Token::GreaterThan(loc) => *loc,
Token::SmallerThan(loc) => *loc,
Token::GreaterEqual(loc) => *loc,
Token::SmallerEqual(loc) => *loc,

Token::Not(loc) => *loc,
Token::And(loc) => *loc,
101 changes: 101 additions & 0 deletions src/filter/value.rs
@@ -1,4 +1,5 @@
use std::fmt::{Debug, Display};
use std::cmp::Ordering;

/// A trait for types which can be filtered by the filter system.
///
@@ -92,6 +93,81 @@ impl PartialEq for FilterValue {
}
}

impl PartialOrd for FilterValue {
fn lt(&self, other: &Self) -> bool {
match (self, other) {
(FilterValue::Null, FilterValue::Null) => false, // equal values are never strictly less
(FilterValue::Bool(a), FilterValue::Bool(b)) => a < b,
(FilterValue::Number(a), FilterValue::Number(b)) => a < b,
(FilterValue::String(a), FilterValue::String(b)) => a < b,
(FilterValue::Tuple(a), FilterValue::Tuple(b)) => {
a.len() < b.len() && a.iter().zip(b.iter()).all(|(a, b)| a < b)
}
_ => false,
}
}

fn le(&self, other: &Self) -> bool {
match (self, other) {
(FilterValue::Null, FilterValue::Null) => true,
(FilterValue::Bool(a), FilterValue::Bool(b)) => a <= b,
(FilterValue::Number(a), FilterValue::Number(b)) => a <= b,
(FilterValue::String(a), FilterValue::String(b)) => a <= b,
(FilterValue::Tuple(a), FilterValue::Tuple(b)) => {
a.len() <= b.len() && a.iter().zip(b.iter()).all(|(a, b)| a <= b)
}
_ => false,
}
}

fn gt(&self, other: &Self) -> bool {
match (self, other) {
(FilterValue::Null, FilterValue::Null) => false, // equal values are never strictly greater
(FilterValue::Bool(a), FilterValue::Bool(b)) => a > b,
(FilterValue::Number(a), FilterValue::Number(b)) => a > b,
(FilterValue::String(a), FilterValue::String(b)) => a > b,
(FilterValue::Tuple(a), FilterValue::Tuple(b)) => {
a.len() > b.len() && a.iter().zip(b.iter()).all(|(a, b)| a > b)
}
_ => false,
}
}

fn ge(&self, other: &Self) -> bool {
match (self, other) {
(FilterValue::Null, FilterValue::Null) => true,
(FilterValue::Bool(a), FilterValue::Bool(b)) => a >= b,
(FilterValue::Number(a), FilterValue::Number(b)) => a >= b,
(FilterValue::String(a), FilterValue::String(b)) => a >= b,
(FilterValue::Tuple(a), FilterValue::Tuple(b)) => {
a.len() >= b.len() && a.iter().zip(b.iter()).all(|(a, b)| a >= b)
}
_ => false,
}
}

fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
match (self, other) {
(FilterValue::Null, FilterValue::Null) => Some(Ordering::Equal),
(FilterValue::Bool(a), FilterValue::Bool(b)) => a.partial_cmp(b),
(FilterValue::Number(a), FilterValue::Number(b)) => a.partial_cmp(b),
(FilterValue::String(a), FilterValue::String(b)) => a.partial_cmp(b),
(FilterValue::Tuple(a), FilterValue::Tuple(b)) => {
if a.len() != b.len() {
a.len().partial_cmp(&b.len())
} else {
a.iter()
.zip(b.iter())
.map(|(x, y)| x.partial_cmp(y))
.find(|&cmp| cmp != Some(Ordering::Equal))
.unwrap_or(Some(Ordering::Equal))
}
}
_ => None, // Return None for non-comparable types
}
}
}

impl Display for FilterValue {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
@@ -194,4 +270,29 @@ mod tests {
fn test_truthy<V: Into<FilterValue>>(#[case] value: V, #[case] truthy: bool) {
assert_eq!(value.into().is_truthy(), truthy);
}

#[test]
fn test_bool_comparison() {
assert!(FilterValue::Bool(false) < FilterValue::Bool(true));
assert!(FilterValue::Bool(true) > FilterValue::Bool(false));
assert_eq!(FilterValue::Bool(true), FilterValue::Bool(true));
assert_eq!(FilterValue::Bool(false), FilterValue::Bool(false));
}

#[test]
fn test_number_comparison() {
assert!(FilterValue::Number(1.0) < FilterValue::Number(2.0));
assert!(FilterValue::Number(2.0) > FilterValue::Number(1.0));
assert_eq!(FilterValue::Number(2.0), FilterValue::Number(2.0));
}

#[test]
fn test_string_comparison() {
assert!(FilterValue::String(String::from("abc")) < FilterValue::String(String::from("xyz")));
assert!(FilterValue::String(String::from("xyz")) > FilterValue::String(String::from("abc")));
assert_eq!(
FilterValue::String(String::from("abc")),
FilterValue::String(String::from("abc"))
);
}
}
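Rust's `PartialOrd` documentation requires that `a < b` holds if and only if `partial_cmp(a, b) == Some(Less)`, and likewise for the other operators. One way to guarantee this is to implement only `partial_cmp` and let the trait's default methods derive `lt`, `le`, `gt`, and `ge` from it. The sketch below does that for a hypothetical, reduced `Value` enum; it reuses the tuple rule from `partial_cmp` above (shorter tuples sort first, otherwise the first unequal element decides) and is an illustration, not the crate's implementation.

```rust
use std::cmp::Ordering;

/// Hypothetical, reduced stand-in for `FilterValue`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Tuple(Vec<Value>),
}

impl PartialOrd for Value {
    // Only `partial_cmp` is written by hand; `<`, `<=`, `>` and `>=` use the
    // trait's default methods and therefore cannot disagree with it.
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        match (self, other) {
            (Value::Null, Value::Null) => Some(Ordering::Equal),
            (Value::Bool(a), Value::Bool(b)) => a.partial_cmp(b),
            (Value::Number(a), Value::Number(b)) => a.partial_cmp(b),
            (Value::Tuple(a), Value::Tuple(b)) => {
                if a.len() != b.len() {
                    // Tuples of different lengths are ordered by length.
                    a.len().partial_cmp(&b.len())
                } else {
                    // Otherwise the first non-equal pair decides the result.
                    a.iter()
                        .zip(b.iter())
                        .map(|(x, y)| x.partial_cmp(y))
                        .find(|cmp| *cmp != Some(Ordering::Equal))
                        .unwrap_or(Some(Ordering::Equal))
                }
            }
            // Values of different kinds are not comparable.
            _ => None,
        }
    }
}
```

Under this approach `Null < Null` is `false` while `Null <= Null` is `true`, and a NaN number compares as `None`, so every operator involving it evaluates to `false`.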
3 changes: 3 additions & 0 deletions src/helpers/github.rs
@@ -343,6 +343,9 @@ impl MetadataSource for GitHubRepo {
metadata.insert("repo.disabled", self.disabled);
metadata.insert("repo.default_branch", self.default_branch.as_str());
metadata.insert("repo.empty", self.size == 0);
metadata.insert("repo.template", self.is_template);
metadata.insert("repo.forks", self.forks_count as u32);
metadata.insert("repo.stargazers", self.stargazers_count as u32);
}
}
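The new metadata keys are what make a filter such as `repo.stargazers >= 5` possible. As a rough sketch of the lookup-then-compare step, the example below stores numeric fields as `f64` (the parser tests above show numeric literals becoming floating-point values) in a plain `HashMap`. `repo_metadata` and `at_least` are hypothetical helpers, not functions from the crate.

```rust
use std::collections::HashMap;

/// Hypothetical, numbers-only stand-in for the metadata a repository exposes to filters.
fn repo_metadata(forks: u32, stargazers: u32) -> HashMap<&'static str, f64> {
    let mut metadata = HashMap::new();
    metadata.insert("repo.forks", f64::from(forks));
    metadata.insert("repo.stargazers", f64::from(stargazers));
    metadata
}

/// Evaluates `<field> >= <minimum>`; a field that is not present never matches.
fn at_least(metadata: &HashMap<&'static str, f64>, field: &str, minimum: f64) -> bool {
    metadata
        .get(field)
        .map_or(false, |value| *value >= minimum)
}
```

For a repository with 4 forks and 7 stargazers, `at_least(&metadata, "repo.stargazers", 5.0)` holds while `at_least(&metadata, "repo.forks", 5.0)` does not.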

6 changes: 5 additions & 1 deletion src/pairing.rs
@@ -180,13 +180,17 @@ mod tests {
}

#[rstest]
#[case("true", 30)]
#[case("true", 31)]
#[case("false", 0)]
#[case("repo.fork", 19)]
#[case("!repo.fork", 11)]
#[case("repo.empty", 2)]
#[case("!repo.empty", 28)]
#[case("!repo.fork && !repo.empty", 11)]
#[case("repo.stargazers >= 1", 7)]
#[case("repo.forks > 3", 1)]
#[case("repo.template", 1)]
#[case("!repo.template", 30)]
#[tokio::test]
async fn filtering(#[case] filter: &str, #[case] matches: usize) {
use tokio_stream::StreamExt;