From cd6c76570b4a1c1635ad0c08dcd277ba6102786c Mon Sep 17 00:00:00 2001 From: Stevan Andjelkovic Date: Mon, 30 Sep 2024 13:39:57 +0200 Subject: [PATCH] Comment on coverage-guidence code. --- README-unprocessed.md | 44 ++++++++++++++++++++++---------- README.md | 58 +++++++++++++++++++++++++++++-------------- src/QuickCheckV1.hs | 3 ++- 3 files changed, 73 insertions(+), 32 deletions(-) diff --git a/README-unprocessed.md b/README-unprocessed.md index 5e5714e..e5c07d7 100644 --- a/README-unprocessed.md +++ b/README-unprocessed.md @@ -113,15 +113,10 @@ To give you an idea of how powerful this idea is, check out the list of [bugs](https://lcamtuf.coredump.cx/afl/#bugs) that it found and this post about how it manages to figure out the [jpeg format](https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html) -on its own. - -For more details about how it works, see the [AFL -"whitepaper"](https://lcamtuf.coredump.cx/afl/technical_details.txt) and its -[mutation -heuristics](https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html). +on its own[^2]. Since AFL is the tool that Dan explicitly mentions in his post, let's stop at -this point and go back to this point, before looking at what happened since +this point and go back to his point, before looking at what happened since with coverage-guided fuzzers. Recall that Dan was asking why this idea of coverage-guidance wasn't present in @@ -164,9 +159,9 @@ By now we should have enough background to see that the idea of combining coverage-guidance and property-based testing makes sense. What if we can use user provided generators to kick start the exploration, but then mutate the data and use coverage information to find problems that wouldn't have been -surfaced with the user provided input generators alone (or recall from the +surfaced with the user provided input generators alone, or recall from the example in the introduction, the probability of generating the input in a -single shot is simply too unlikely). +single shot is simply too unlikely. ### After 2015 @@ -323,7 +318,7 @@ machinary for gathering run-time statistics of the generated data! This machinary is [crucial](https://www.youtube.com/watch?v=NcJOiQlzlXQ) for writing good tests and has been part of the QuickCheck implementation since the -very first version[^2]! +very first version[^3]! So the question is: can we implement coverage-guided property-based testing using the internal notion of coverage that property-based testing already has? @@ -374,15 +369,27 @@ functions. ### The extension to add coverage-guidance +Okey, so the above is the first version of the original property-based testing +tool, QuickCheck. Now let's add coverage-guidence to it! + +The function that checks a property with coverage-guidance slight different +from `quickCheck`[^4]: ``` {.haskell include=src/QuickCheckV1.hs snippet=coverCheck} ``` +In particular notice that instead of `Testable a` we use an explicit predicate +on a list of `a`, `[a] -> Property`. The reason for using a list in the +predicate is so that we can iteratively make progress, using the coverage +information. We see this more clearly if we look at the coverage-guided +analogue of the `tests` function, in particular the `xs` parameter: + ``` {.haskell include=src/QuickCheckV1.hs snippet=testsC1} ``` -``` {.haskell include=src/QuickCheckV1.hs snippet=testsC2} -``` +The other important difference is the `cov`erage parameter, which keeps track +of how many things have been `classify`ed (the `stamps` parameter). Notice how +we only add the newly generated input, `x`, if the `cov`erage increases. The full source code is available [here](https://github.com/stevana/coverage-guided-pbt). @@ -598,7 +605,18 @@ Falsifiable, after 88 tests: } ``` -[^2]: See the appendix of the original +[^2]: For more details about how it works, see the [AFL + "whitepaper"](https://lcamtuf.coredump.cx/afl/technical_details.txt) and + its [mutation + heuristics](https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html). + +[^3]: See the appendix of the original [paper](https://dl.acm.org/doi/10.1145/351240.351266) that first introduced property-based testing. It's interesting to note that the collecting statistics functionality is older than shrinking. + +[^4]: It might be interesting to note that we can implement this signature + using the original combinators: + ``` {.haskell include=src/QuickCheckV1.hs snippet=testsC2} + ``` + diff --git a/README.md b/README.md index 7350a15..02f6cbb 100644 --- a/README.md +++ b/README.md @@ -118,15 +118,10 @@ To give you an idea of how powerful this idea is, check out the list of [bugs](https://lcamtuf.coredump.cx/afl/#bugs) that it found and this post about how it manages to figure out the [jpeg format](https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html) -on its own. - -For more details about how it works, see the [AFL -"whitepaper"](https://lcamtuf.coredump.cx/afl/technical_details.txt) and -its [mutation -heuristics](https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html). +on its own[^2]. Since AFL is the tool that Dan explicitly mentions in his post, let's -stop at this point and go back to this point, before looking at what +stop at this point and go back to his point, before looking at what happened since with coverage-guided fuzzers. Recall that Dan was asking why this idea of coverage-guidance wasn't @@ -171,8 +166,8 @@ combining coverage-guidance and property-based testing makes sense. What if we can use user provided generators to kick start the exploration, but then mutate the data and use coverage information to find problems that wouldn't have been surfaced with the user provided input generators -alone (or recall from the example in the introduction, the probability -of generating the input in a single shot is simply too unlikely). +alone, or recall from the example in the introduction, the probability +of generating the input in a single shot is simply too unlikely. ### After 2015 @@ -360,7 +355,7 @@ the generated data! This machinary is [crucial](https://www.youtube.com/watch?v=NcJOiQlzlXQ) for writing good tests and has been part of the QuickCheck -implementation since the very first version[^2]! +implementation since the very first version[^3]! So the question is: can we implement coverage-guided property-based testing using the internal notion of coverage that property-based @@ -531,6 +526,12 @@ tests config gen rnd0 ntest nfail stamps ### The extension to add coverage-guidance +Okey, so the above is the first version of the original property-based +testing tool, QuickCheck. Now let's add coverage-guidence to it! + +The function that checks a property with coverage-guidance slight +different from `quickCheck`[^4]: + ``` haskell coverCheck :: (Arbitrary a, Show a) => Config -> ([a] -> Property) -> IO () coverCheck config prop = do @@ -538,6 +539,13 @@ coverCheck config prop = do testsC config arbitrary prop [] 0 rnd 0 0 [] ``` +In particular notice that instead of `Testable a` we use an explicit +predicate on a list of `a`, `[a] -> Property`. The reason for using a +list in the predicate is so that we can iteratively make progress, using +the coverage information. We see this more clearly if we look at the +coverage-guided analogue of the `tests` function, in particular the `xs` +parameter: + ``` haskell testsC :: Show a => Config -> Gen a -> ([a] -> Property) -> [a] -> Int -> StdGen -> Int -> Int -> [[String]] -> IO () @@ -571,13 +579,10 @@ testsC config gen prop xs cov rnd0 ntest nfail stamps (rnd3,rnd4) = split rnd2 ``` -``` haskell -testsC' :: Show a => Config -> Gen a -> ([a] -> Property) -> StdGen -> Int -> Int -> [[String]] -> IO () -testsC' config gen prop = tests config genResult - where - Prop genResult = forAll (genList gen) prop - genList gen = sized $ \len -> replicateM len gen -``` +The other important difference is the `cov`erage parameter, which keeps +track of how many things have been `classify`ed (the `stamps` +parameter). Notice how we only add the newly generated input, `x`, if +the `cov`erage increases. The full source code is available [here](https://github.com/stevana/coverage-guided-pbt). @@ -815,7 +820,24 @@ you can see how it first find the `"b"`, then `"ba"`, etc: return 5 } -[^2]: See the appendix of the original +[^2]: For more details about how it works, see the [AFL + "whitepaper"](https://lcamtuf.coredump.cx/afl/technical_details.txt) + and its [mutation + heuristics](https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html). + +[^3]: See the appendix of the original [paper](https://dl.acm.org/doi/10.1145/351240.351266) that first introduced property-based testing. It's interesting to note that the collecting statistics functionality is older than shrinking. + +[^4]: It might be interesting to note that we can implement this + signature using the original combinators: + + ``` haskell + testsC' :: Show a => Config -> Gen a -> ([a] -> Property) + -> StdGen -> Int -> Int -> [[String]] -> IO () + testsC' config gen prop = tests config genResult + where + Prop genResult = forAll (genList gen) prop + genList gen = sized $ \len -> replicateM len gen + ``` diff --git a/src/QuickCheckV1.hs b/src/QuickCheckV1.hs index fbe06e8..468402a 100644 --- a/src/QuickCheckV1.hs +++ b/src/QuickCheckV1.hs @@ -387,7 +387,8 @@ done mesg ntest stamps = -- the end. -- start snippet testsC2 -testsC' :: Show a => Config -> Gen a -> ([a] -> Property) -> StdGen -> Int -> Int -> [[String]] -> IO () +testsC' :: Show a => Config -> Gen a -> ([a] -> Property) + -> StdGen -> Int -> Int -> [[String]] -> IO () testsC' config gen prop = tests config genResult where Prop genResult = forAll (genList gen) prop