Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement grit format command #595

Merged
merged 38 commits into from
Jan 26, 2025
Merged

feat: implement grit format command #595

merged 38 commits into from
Jan 26, 2025

Conversation

Arian8j2
Copy link
Contributor

@Arian8j2 Arian8j2 commented Jan 1, 2025

trying to resolve #574
this is just a proof of concept right now and needs more work and time to get complete
/claim #574

@Arian8j2 Arian8j2 requested a review from a team as a code owner January 1, 2025 16:24
@algora-pbc algora-pbc bot mentioned this pull request Jan 1, 2025
Copy link

algora-pbc bot commented Jan 1, 2025

💵 To receive payouts, sign up on Algora, link your Github account and connect with Stripe.

@Arian8j2 Arian8j2 marked this pull request as draft January 1, 2025 16:25
@morgante
Copy link
Contributor

morgante commented Jan 3, 2025

Thanks @Arian8j2 for working on this, just wanted to chime in that it's definitely on the right track for what I had in mind. Once you're ready, feel free to request a review.

resolved.sort();

let file_path_to_resolved = group_resolved_patterns_by_group(resolved);
for (file_path, resovled_patterns) in file_path_to_resolved {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be easily executed in parallel, however the tests that verify stdout will fail due to inconsistencies in the output

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the input to make tests happy, another option is to sort the test output by line so it is stabilized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Ok(())
}

fn format_yaml_file(file_content: &str) -> Result<String> {
Copy link
Contributor Author

@Arian8j2 Arian8j2 Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@morgante Is it ok to also format the whole yaml file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we need to parse the whole yaml and edit the body and then serialize it back to yaml format, in this way the yaml code also get's formatted. i couldn't find a way to just change body field, also because it's usually multi line string it's also hard to just find the bytes and replace it, because we need additional space and tabs to make it valid yaml code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not not serialize/deserialize the whole body. Round-tripping is lossy. We'd also lose any comments in the yaml.

I understand finding and replacing just the bytes is harder but it's very much possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem is in finding the code, for example consider this yaml file:

version: 0.0.1
patterns:
  - name: aspect_ratio_yaml
    description: Yaml version of aspect_ratio.md
    body: |
      language css

      `a { $props }` where {
        $props <: contains `aspect-ratio: $x`
      }

  - file: ./others/test_move_import.md

i can easily get the grit code:

language css

`a { $props }` where {
  $props <: contains `aspect-ratio: $x`
}

and format it no problem, but then changing the body with new formatted code is difficult because how i find the old grit code in the file? each line is prefixed with indent spaces:

      language css

      `a { $props }` where {
        $props <: contains `aspect-ratio: $x`
      }

i need to know how much space i need to put before the code (i used to hard code this in the past commits but then it introduces other problems), thats the hard part, and i couldn't think of a way that is not hacky and breakable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i could try to find the code like this:

for i in 1..10 {
    let indent =  " ".repeat(i);
    let grit_code_with_indent = prefix_each_line(indent, grit_code)
    if let Some(pos) = yaml_file_content.position(grit_code_with_indent) {
        // ...
        break
    }
}

but it's a bit hacky

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually have a lot of utils in GritQL itself for handling indentation—tested here—so I think you should be able to leverage those.

Copy link
Contributor Author

@Arian8j2 Arian8j2 Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice hint, GritQL really shines in these issues but i don't know if it's overkill or not for this, btw i updated the code, now it uses a grit pattern to find and replace yaml bodies with formatted ones

@Arian8j2 Arian8j2 marked this pull request as ready for review January 4, 2025 07:30
resolved.sort();

let file_path_to_resolved = group_resolved_patterns_by_group(resolved);
for (file_path, resovled_patterns) in file_path_to_resolved {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the input to make tests happy, another option is to sort the test output by line so it is stabilized.

Ok(())
}

fn format_yaml_file(file_content: &str) -> Result<String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason?

@morgante
Copy link
Contributor

@Arian8j2 Please make sure all tests / clippy pass.

@Arian8j2 Arian8j2 changed the title implement grit format command feat: implement grit format command Jan 12, 2025
@Arian8j2
Copy link
Contributor Author

@morgante Couldn't build biome with current cargo version of CI (1.76.0-nightly):

error[E0658]: the `#[expect]` attribute is an experimental feature
   --> .cargo/git/checkouts/biome-d49ce8898b420a50/8bec9fc/crates/biome_string_case/src/lib.rs:629:13
    |
629 |             #[expect(clippy::disallowed_methods)]
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #54503 <https://github.com/rust-lang/rust/issues/54503> for more information
    = help: add `#![feature(lint_reasons)]` to the crate attributes to enable

error[E0658]: the `#[expect]` attribute is an experimental feature
   --> .cargo/git/checkouts/biome-d49ce8898b420a50/8bec9fc/crates/biome_string_case/src/lib.rs:645:13
    |
645 |             #[expect(clippy::disallowed_methods)]
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #54503 <https://github.com/rust-lang/rust/issues/54503> for more information
    = help: add `#![feature(lint_reasons)]` to the crate attributes to enable

error[E0658]: the `#[expect]` attribute is an experimental feature
   --> .cargo/git/checkouts/biome-d49ce8898b420a50/8bec9fc/crates/biome_string_case/src/lib.rs:660:13
    |
660 |             #[expect(clippy::disallowed_methods)]
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: see issue #54503 <https://github.com/rust-lang/rust/issues/54503> for more information
    = help: add `#![feature(lint_reasons)]` to the crate attributes to enable

Should we perhaps increase the CI rust version?

@morgante
Copy link
Contributor

I think newer Rust versions were breaking our binary on osx. Can you see about activating the feature directly?

@Arian8j2
Copy link
Contributor Author

Arian8j2 commented Jan 14, 2025

Can you see about activating the feature directly?

I don't think we can activate feature of other crates without changing them directly, We need to fork and change if we really want to keep the rust version. stack overflow of same issue
Edit: Also we need to fork one dependency of biome, oxc-resolver because it also uses expect attribute too

I think newer Rust versions were breaking our binary on osx

Whats the issue on osx? Maybe fixing that would be easier

@Arian8j2
Copy link
Contributor Author

@morgante If you are comfortable with maintaining forks, These are the patches that needs to be applied to forks.

biome patch:

diff --git a/Cargo.toml b/Cargo.toml
index a861ceaf44..c1281620f8 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -186,7 +186,7 @@ ignore             = "0.4.23"
 indexmap           = { version = "2.7.1", features = ["serde"] }
 insta              = "1.42.0"
 natord             = "1.0.9"
-oxc_resolver       = "3.0.3"
+oxc_resolver       = { git = "our_oxc_resolver_fork"}
 proc-macro2        = "1.0.86"
 quickcheck         = "1.0.3"
 quickcheck_macros  = "1.0.0"
diff --git a/crates/biome_formatter/src/lib.rs b/crates/biome_formatter/src/lib.rs
index dc84856781..7adb5f4d5c 100644
--- a/crates/biome_formatter/src/lib.rs
+++ b/crates/biome_formatter/src/lib.rs
@@ -19,6 +19,7 @@
 //! * [`format_args!`]: Concatenates a sequence of Format objects.
 //! * [`write!`]: Writes a sequence of formatable objects into an output buffer.
 
+#![feature(lint_reasons)]
 #![deny(rustdoc::broken_intra_doc_links)]
 
 mod arguments;
diff --git a/crates/biome_grit_factory/src/lib.rs b/crates/biome_grit_factory/src/lib.rs
index 4eb787ecb1..2d19ead6fe 100644
--- a/crates/biome_grit_factory/src/lib.rs
+++ b/crates/biome_grit_factory/src/lib.rs
@@ -1,3 +1,5 @@
+#![feature(lint_reasons)]
+
 use biome_grit_syntax::GritLanguage;
 use biome_rowan::TreeBuilder;
 
diff --git a/crates/biome_grit_formatter/src/lib.rs b/crates/biome_grit_formatter/src/lib.rs
index fa7b8a3305..bb3438e08a 100644
--- a/crates/biome_grit_formatter/src/lib.rs
+++ b/crates/biome_grit_formatter/src/lib.rs
@@ -1,3 +1,5 @@
+#![feature(lint_reasons)]
+
 mod comments;
 pub mod context;
 mod cst;
diff --git a/crates/biome_rowan/src/lib.rs b/crates/biome_rowan/src/lib.rs
index 30a6cd76b3..7a7f651409 100644
--- a/crates/biome_rowan/src/lib.rs
+++ b/crates/biome_rowan/src/lib.rs
@@ -1,5 +1,6 @@
 //! A generic library for lossless syntax trees.
 //! See `examples/s_expressions.rs` for a tutorial.
+#![feature(lint_reasons, ptr_addr_eq, offset_of)]
 #![forbid(
 // missing_debug_implementations,
 unconditional_recursion,
diff --git a/crates/biome_string_case/src/lib.rs b/crates/biome_string_case/src/lib.rs
index 7d062b61ce..8d2df5a477 100644
--- a/crates/biome_string_case/src/lib.rs
+++ b/crates/biome_string_case/src/lib.rs
@@ -1,5 +1,6 @@
 //! Identify string case and convert to various string cases.
 
+#![feature(lint_reasons)]
 use std::{borrow::Cow, cmp::Ordering, ffi::OsStr};
 
 // Include the file generated by `../build.rs`
diff --git a/crates/biome_text_size/src/lib.rs b/crates/biome_text_size/src/lib.rs
index fd27c65523..34815c9aca 100644
--- a/crates/biome_text_size/src/lib.rs
+++ b/crates/biome_text_size/src/lib.rs
@@ -16,6 +16,7 @@
 //!
 //! Minimal Supported Rust Version: latest stable.
 
+#![feature(lint_reasons)]
 #![forbid(unsafe_code)]
 #![warn(missing_debug_implementations, missing_docs)]
 

oxc-resolver patch:

diff --git a/src/lib.rs b/src/lib.rs
index 3e7f743..77d1a65 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -47,6 +47,8 @@
 #![doc = include_str!("../examples/resolver.rs")]
 //! ```
 
+#![feature(lint_reasons)]
+
 mod builtins;
 mod cache;
 mod context;

@morgante
Copy link
Contributor

@Arian8j2 I'e upgraded the main branch to 1.82, so hopefully you can rebase and this will be mergeable.

@morgante
Copy link
Contributor

Looks like you need to run cargo format: https://github.com/getgrit/gritql/actions/runs/12972060346/job/36179168549

@Arian8j2
Copy link
Contributor Author

@morgante Also be careful when including this on release, It still panics on some stdlib patterns due to biome issues

@morgante morgante merged commit 5af841f into getgrit:main Jan 26, 2025
5 checks passed
@morgante
Copy link
Contributor

Hi @Arian8j2 I'm going to take this over. While this initial implementation is kind of okay, it actually has massive memory usage and doesn't work remotely well.

@Arian8j2
Copy link
Contributor Author

Arian8j2 commented Jan 27, 2025

@morgante

it actually has massive memory usage

Is the issue from grit side or biome?

doesn't work remotely well

Whats the issue?

Please feel free to make improvements, but I believe a complete rewrite of this PR isn't necessary. Additionally, will the bounty be awarded? Also i can give a hand on rewrite if you like, And also knowing what is on your mind will also help.

@morgante
Copy link
Contributor

morgante commented Feb 4, 2025

Hi @Arian8j2, the main issue I found is explosive memory usage. Running grit format on the stdlib uses 25GB of memory.

Obviously we can't ship it in that state. Some of the problems are obvious (we shouldn't keep all patterns in memory at once) but others will require deeper fixes. I started on a rewrite here: #606

Additionally, will the bounty be awarded?

I can only reward it once it's working viably.

@Arian8j2
Copy link
Contributor Author

Arian8j2 commented Feb 4, 2025

@morgante

we shouldn't keep all patterns in memory at once

I don't think patterns take that much memory, Even with stdlib that has unusual amount of patterns.


I looked into this and the problem is from biome, Really small test that reproduces it:

#[test]
fn biome_memory_issue() {
    let code = "language js\n";
    biome_grit_parser::parse_grit(code);
}

Run this test in release mode and boom it will explode memory, Interestingly it doesn't explode on debug mode, Previously i just tested the debug version, So i was shocked when you said it has memory issues, One other thing that i found out is that the \n has something to do with it, If you remove that it doesn't explode.

@morgante
Copy link
Contributor

morgante commented Feb 4, 2025

I don't think patterns take that much memory, Even with stdlib that has unusual amount of patterns.

We still shouldn't need to keep them all in memory at the same time.

I looked into this and the problem is from biome, Really small test that reproduces it:

I've opened an issue upstream: biomejs/biome#5032

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

grit format command
2 participants