diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 917077c1..e2e49ebb 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -1,8 +1,8 @@
 # Harper's Architecture

-This document seeks to solve one simple problem:
+This document seeks to solve one simple problem:

-> "Roughly, it takes 2x more time to write a patch if you are unfamiliar with the project, but it takes 10x more time to figure out __where__ you should change the code." - [Alex Kladov](https://matklad.github.io/2021/02/06/ARCHITECTURE.md.html)
+> "Roughly, it takes 2x more time to write a patch if you are unfamiliar with the project, but it takes 10x more time to figure out **where** you should change the code." - [Alex Kladov](https://matklad.github.io/2021/02/06/ARCHITECTURE.md.html)

 This document is meant to serve as a kind of table of contents for the Harper project.
 Hopefully, we can reduce that 10x down to something a little more reasonable.
@@ -13,7 +13,7 @@ Harper tries to do one thing well: find grammatical and spelling errors in Engli
 If possible, provide suggestions to correct those errors.
 An error and it's possible corrections together form what we call a lint.

-In this vein, Harper serves the role of a [Linter](https://en.wikipedia.org/wiki/Lint_(software)) for English.
+In this vein, Harper serves the role of a [Linter](<https://en.wikipedia.org/wiki/Lint_(software)>) for English.

 ## `harper-core`

@@ -24,7 +24,7 @@ At a high level, there are just a couple types you need to worry about.
 - [Document](https://docs.rs/harper-core/latest/harper_core/struct.Document.html): A representation of an English document. Implements [`TokenStringExt`](https://docs.rs/harper-core/latest/harper_core/trait.TokenStringExt.html) to make it easier to query.
 - [Parser](https://docs.rs/harper-core/latest/harper_core/parsers/trait.Parser.html): A trait that describes an object that consumes text and emits tokens.
 The name is somewhat of a misnomer since it is supposed to only lex English (and emit [Tokens](https://docs.rs/harper-core/latest/harper_core/struct.Token.html)), not parse it. It is called a parser since most types that implement this trait parse _other_ languages (JavaScript) to extract the English text.
- - The [Markdown parser](https://docs.rs/harper-core/latest/harper_core/parsers/struct.Markdown.html) is a great example.
+ - The [Markdown parser](https://docs.rs/harper-core/latest/harper_core/parsers/struct.Markdown.html) is a great example.
 - [Linter](https://docs.rs/harper-core/latest/harper_core/linting/trait.Linter.html): A trait that, provided a document, will produce zero or more [Lints](https://docs.rs/harper-core/latest/harper_core/linting/struct.Lint.html#). This is usually done using direct queries on the document or by implementing a [`PatternLinter`](https://docs.rs/harper-core/latest/harper_core/linting/trait.PatternLinter.html).

 If you want to add a linter to Harper, create a new file under the `linters` module in `harper-core` and create a public struct that implements the `Linter` trait.
diff --git a/COMPARISON.md b/COMPARISON.md
index 51bd811e..e8b2c0e7 100644
--- a/COMPARISON.md
+++ b/COMPARISON.md
@@ -1,8 +1,8 @@
 # Comparison to Other Grammar Checkers

-| | Suggestion Time | License | LSP Support | Ruleset | Multi-Lingual/Multi-Dialect |
-| ------------ | --------------- | ------------------------ | ---------------------------------------------------------------------------------------------------- | --------------------------- | --------------------------- |
-| harper | 10ms | Apache-2.0 | ✅ | hunspell/MySpell derivative | ❌ |
-| LanguageTool | 650ms | LGPL-2.1 | 🟨 Through [ltex-ls](https://github.com/valentjn/ltex-ls) | LanguageTool(XML) | 🟨 Not simultaneously |
-| hunspell | | LGPL/GPL/MPL tri-license | ❌ | hunspell/MySpell | 🟨 Not simultaneously |
-| Grammarly | 4000ms | Proprietary | 🟨 Through [grammarly-language-server](https://github.com/emacs-grammarly/grammarly-language-server) | Proprietary | ❌ |
+| | Suggestion Time | License | LSP Support | Ruleset | Multi-Lingual/Multi-Dialect |
+| ------------ | --------------- | ------------------------ | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | --------------------------- |
+| Harper | 10ms | Apache-2.0 | ✅ | [Custom](https://github.com/elijah-potter/harper/tree/master/harper-core/src/linting) | ❌ |
+| LanguageTool | 650ms | LGPL-2.1 | 🟨 Through [ltex-ls](https://github.com/valentjn/ltex-ls) | [Custom](https://community.languagetool.org/rule/list?lang=en) + N-Gram Based + LLM Based | 🟨 Not simultaneously |
+| hunspell | | LGPL/GPL/MPL tri-license | ❌ | hunspell/MySpell | 🟨 Not simultaneously |
+| Grammarly | 4000ms | Proprietary | 🟨 Through [grammarly-language-server](https://github.com/emacs-grammarly/grammarly-language-server) | Proprietary | ❌ |
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 486086bb..6beb6f3b 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -26,7 +26,7 @@ I would highly recommend that you run `just setup` to populate your build caches

 ## Committing

-Harper follows [conventional commit practices](https://www.conventionalcommits.org/en/v1.0.0/).
+Harper follows [conventional commit practices](https://www.conventionalcommits.org/en/v1.0.0/).
 Before creating a pull request, please make sure all your commits follow the linked conventions.

 Additionally, to minimize the labor required to review your commit, we run a relatively strict suite of formatting and linting programs.
diff --git a/README.md b/README.md
index fb7556d9..61ffa977 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ Harper is an English grammar checker designed to be _just right._

 I created it after years of dealing with the shortcomings of the competition.

-Grammarly was too expensive and too overbearing.
+Grammarly was too expensive and too overbearing.
 Its suggestions lacked context, and were often just plain _wrong_.
 Not to mention: it's a privacy nightmare.
 Everything you write with Grammarly is sent to their servers.
@@ -22,7 +22,7 @@ LanguageTool is great, if you have gigabytes of RAM to spare and are willing to
 Besides the memory requirements, I found LanguageTool too slow: it would take several seconds to lint even a moderate-size document.

 That's why I created Harper: it is the grammar checker that fits my needs.
-Not only does it take milliseconds to lint a document, take less than 1/50th of LanguageTool's memory footprint,
+Not only does it take milliseconds to lint a document, take less than 1/50th of LanguageTool's memory footprint,
 but it is also completely private.

 Harper is even small enough to load via [WebAssembly.](https://writewithharper.com)
@@ -33,7 +33,7 @@ If you want to use Harper on your machine, you have two choices.

 ### `harper-ls`

-`harper-ls` provides an integration that works for most code editors.
+`harper-ls` provides an integration that works for most code editors.

 [Read more here.](./harper-ls/README.md)

diff --git a/demo.md b/demo.md
index a74dcccb..e11d75f1 100644
--- a/demo.md
+++ b/demo.md
@@ -1,4 +1,4 @@
-There are some cases where the the standard grammar checkers
+There are some cases where the the standard grammar checkers
 don't cut it. That s where Harper comes in handy.

 Harper is an language checker for developers. it can detect
@@ -7,9 +7,9 @@ as well as a number of other issues.
 Like if you break up words you shouldn't.

 Harper works everywhere, even offline. Since you r data
-never leaves your device, you don't ned to worry aout us
+never leaves your device, you don't ned to worry aout us
 selling it or using it to train large language models.

 The best part: Harper can give you feedback instantly.
-For most documents, Harper can serve up suggestions in
+For most documents, Harper can serve up suggestions in
 under 10 ms.