Add first draft of conclusion.

stevana · Oct 3, 2024 · 9d44314
1 parent 1876280
commit 9d44314

Showing 5 changed files with 113 additions and 82 deletions.
diff --git a/README-unprocessed.md b/README-unprocessed.md
@@ -683,48 +683,48 @@ The full source code is available
 
 ## Conclusion and further work
 
-XXX:
-
-* Exponential -> polynomial
-
-* Makes more sense for stateful systems than pure functions? Or atleast
-  properties that expect a sequence of inputs?
-
-* Don't rerun all commands for every newly generate command
-  + only reset the system when shrinking
-
-* Problem of strategy (pick something as basis for progress): coverage, logs,
-  value of memory, helps bootstap the process. Generalise to support more?
+We've seen how to add coverage-guidance to the first version of the first
+property-based testing tool, QuickCheck, in about 35 lines of code.
 
-* Local maxima?
+Coverage-guidance effectively reduced an exponential problem to a polynomial
+one, by building on previous test runs' successes in increasing the coverage.
 
-* Problem of tactics: picking a good input distributed for the testing problem
-  at hand. Make previous input influence the next input? Dependent events, e.g.
-  if one packet gets lost, there's a higher chance that the next packet will be
-  lost as well.
+The solution does change the QuickCheck API slightly by requiring a property on
+a list of `a`, rather than merely `a`, so it's not suitable for all properties.
 
-* Save `(Coverage, Mutation, Frequency, Coverage)` stats?
+I think this limitation isn't so important, because going further I'd like to
+apply coverage-guidance to testing stateful systems. When testing stateful
+systems, which I've written about
+[here](https://stevana.github.io/the_sad_state_of_property-based_testing_libraries.html),
+one always generates a list of commands anyway, so the limitation doesn't matter.
 
-* More realistic example, e.g.: leader election, transaction rollback,
-  failover?
-* Annoying to sprinkle sometimes assertions everywhere?
-  - Can it be combined with logging or tracing?
+A more serious limitation with the current approach is that it's too greedy and
+will seek to maximise coverage, without ever backtracking. This means that it
+can easily get stuck in local maxima. Consider the example:
 
-* Use size parameter to implement AFL heuristic for choosing integers? Or just
-  use `frequency`?
+```
+if input[0] == 'b'
+  if input[1] == 'a'
+    if input[2] == 'd'
+      skip
+if input[0] == 'w'
+  if input[1] == 'o'
+    if input[2] == 'r'
+      if input[3] == 's'
+        if input[4] == 'e'
+          error
+```
 
-* Type-generic mutation?
-* sometimes_each?
-* https://en.wikipedia.org/wiki/L%C3%A9vy_flight (optimises search)
+If we generate an input that starts with 'b' (rather than 'w'), then we'll get
+stuck never finding the error.
 
-## See also
+Real coverage-guided tools, like AFL, will not get stuck like that. While I
+have a variant of the code that can cope with this, I chose to present the
+above greedy version because it's simpler. 
 
-* https://carstein.github.io/fuzzing/2020/04/18/writing-simple-fuzzer-1.html
-* https://carstein.github.io/fuzzing/2020/04/25/writing-simple-fuzzer-2.html
-* https://carstein.github.io/fuzzing/2020/05/02/writing-simple-fuzzer-3.html
-* https://carstein.github.io/fuzzing/2020/05/21/writing-simple-fuzzer-4.html
-* [How Antithesis finds bugs (with help from the Super Mario
-  Bros)](https://antithesis.com/blog/sdtalk/)
+I might write another post with a more AFL-like solution at some later point,
+but I'd also like to encourage others to port these ideas to their favorite
+language and experiment!
 
 
 [^1]: This example is due to Dmitry Vyukov, the main author of

diff --git a/README.md b/README.md
@@ -896,54 +896,51 @@ The full source code is available
 
 ## Conclusion and further work
 
-XXX:
-
-- Exponential -\> polynomial
-
-- Makes more sense for stateful systems than pure functions? Or atleast
-  properties that expect a sequence of inputs?
-
-- Don't rerun all commands for every newly generate command
-
-  - only reset the system when shrinking
-
-- Problem of strategy (pick something as basis for progress): coverage,
-  logs, value of memory, helps bootstap the process. Generalise to
-  support more?
-
-- Local maxima?
-
-- Problem of tactics: picking a good input distributed for the testing
-  problem at hand. Make previous input influence the next input?
-  Dependent events, e.g. if one packet gets lost, there's a higher
-  chance that the next packet will be lost as well.
-
-- Save `(Coverage, Mutation, Frequency, Coverage)` stats?
-
-- More realistic example, e.g.: leader election, transaction rollback,
-  failover?
-
-- Annoying to sprinkle sometimes assertions everywhere?
-
-  - Can it be combined with logging or tracing?
-
-- Use size parameter to implement AFL heuristic for choosing integers?
-  Or just use `frequency`?
-
-- Type-generic mutation?
-
-- sometimes_each?
-
-- <https://en.wikipedia.org/wiki/L%C3%A9vy_flight> (optimises search)
-
-## See also
+We've seen how to add coverage-guidance to the first version of the
+first property-based testing tool, QuickCheck, in about 35 lines of
+code.
 
-- <https://carstein.github.io/fuzzing/2020/04/18/writing-simple-fuzzer-1.html>
-- <https://carstein.github.io/fuzzing/2020/04/25/writing-simple-fuzzer-2.html>
-- <https://carstein.github.io/fuzzing/2020/05/02/writing-simple-fuzzer-3.html>
-- <https://carstein.github.io/fuzzing/2020/05/21/writing-simple-fuzzer-4.html>
-- [How Antithesis finds bugs (with help from the Super Mario
-  Bros)](https://antithesis.com/blog/sdtalk/)
+Coverage-guidance effectively reduced an exponential problem to a
+polynomial one, by building on previous test runs' successes in
+increasing the coverage.
+
+The solution does change the QuickCheck API slightly by requiring a
+property on a list of `a`, rather than merely `a`, so it's not suitable
+for all properties.
+
+I think this limitation isn't so important, because going further I'd
+like to apply coverage-guidance to testing stateful systems. When
+testing stateful systems, which I've written about
+[here](https://stevana.github.io/the_sad_state_of_property-based_testing_libraries.html),
+one always generates a list of commands anyway, so the limitation
+doesn't matter.
+
+A more serious limitation with the current approach is that it's too
+greedy and will seek to maximise coverage, without ever backtracking.
+This means that it can easily get stuck in local maxima. Consider the
+example:
+
+    if input[0] == 'b'
+      if input[1] == 'a'
+        if input[2] == 'd'
+          skip
+    if input[0] == 'w'
+      if input[1] == 'o'
+        if input[2] == 'r'
+          if input[3] == 's'
+            if input[4] == 'e'
+              error
+
+If we generate an input that starts with 'b' (rather than 'w'), then
+we'll get stuck never finding the error.
+
+Real coverage-guided tools, like AFL, will not get stuck like that.
+While I have a variant of the code that can cope with this, I chose to
+present the above greedy version because it's simpler.
+
+I might write another post with a more AFL-like solution at some later
+point, but I'd also like to encourage others to port these ideas to their
+favorite language and experiment!
 
 [^1]: This example is due to Dmitry Vyukov, the main author of
     [go-fuzz](https://github.com/dvyukov/go-fuzz), but it's basically an

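The local-maximum example added to the conclusion can be made concrete with a small model. The following is only a sketch: `matched`, `branchesHit`, `reachesError` and `acceptGreedy` are hypothetical names, and both the per-input way of counting coverage and the "never accept a drop in coverage" rule are simplifying assumptions, not the code from this repository.

```haskell
-- How many of the nested ifs in one chain are entered, i.e. the length of
-- the prefix of `target` that `input` matches.
matched :: String -> String -> Int
matched target input = length (takeWhile id (zipWith (==) target input))

-- Branches entered by a single input across both chains of the example.
branchesHit :: String -> Int
branchesHit input = matched "bad" input + matched "worse" input

-- The `error` branch is only reached once all five "worse" checks pass.
reachesError :: String -> Bool
reachesError input = matched "worse" input == 5

-- Assumed greedy rule: keep a mutated input only if coverage does not drop.
acceptGreedy :: String -> String -> Bool
acceptGreedy old new = branchesHit new >= branchesHit old
```

Under these assumptions `"bad"` enters three branches, while changing its first character to `'w'` gives `"wad"`, which enters only one. The greedy rule therefore rejects the very mutation that is needed to start moving towards `"worse"`.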
diff --git a/SEE_ALSO.md b/SEE_ALSO.md
@@ -0,0 +1,9 @@
+# See also
+
+* https://carstein.github.io/fuzzing/2020/04/18/writing-simple-fuzzer-1.html
+* https://carstein.github.io/fuzzing/2020/04/25/writing-simple-fuzzer-2.html
+* https://carstein.github.io/fuzzing/2020/05/02/writing-simple-fuzzer-3.html
+* https://carstein.github.io/fuzzing/2020/05/21/writing-simple-fuzzer-4.html
+
+* [How Antithesis finds bugs (with help from the Super Mario
+  Bros)](https://antithesis.com/blog/sdtalk/)
diff --git a/TODO.md b/TODO.md
@@ -0,0 +1,26 @@
+# Todo 
+
+* Don't rerun all commands for every newly generated command
+  + only reset the system when shrinking
+
+* Problem of strategy (pick something as basis for progress): coverage, logs,
+  value of shared memory, helps bootstrap the process. Generalise to support more?
+
+* Problem of tactics: picking a good input distribution for the testing problem
+  at hand. Make previous input influence the next input? Dependent events, e.g.
+  if one packet gets lost, there's a higher chance that the next packet will be
+  lost as well.
+
+* More realistic example, e.g.: leader election, transaction rollback,
+  failover?
+* Annoying to sprinkle sometimes assertions everywhere?
+  - Can it be combined with logging or tracing?
+
+* Use size parameter to implement AFL heuristic for choosing integers? Or just
+  use `frequency`?
+
+* Type-generic mutation?
+* sometimes_each?
+* https://en.wikipedia.org/wiki/L%C3%A9vy_flight (optimises search)
+
+
diff --git a/src/Mutator.hs b/src/Mutator.hs
@@ -14,7 +14,6 @@ type Mutate a = StdGen -> a -> a
 
 mutateChar :: Mutate Char
 mutateChar prng ch =  generate 0 prng genChar
--- if ch == 'A' then 'Z' else pred ch
 
 mutateInt16 :: Mutate Int16
 mutateInt16 prng _i = fst (random prng) -- XXX: this doesn't actually mutate...
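As the `XXX` comment notes, `mutateInt16` ignores its input and simply draws a fresh value. One possible way to make it depend on the input is to flip a single randomly chosen bit, loosely in the spirit of the bit flips that AFL performs. This is a sketch only: `mutateInt16BitFlip` and `hammingDistance` are hypothetical names, not part of this commit, and the `Mutate` type is restated here so that the snippet is self-contained.

```haskell
import Data.Bits (complementBit, popCount, xor)
import Data.Int (Int16)
import System.Random (StdGen, mkStdGen, randomR)

-- Same shape as the Mutate type in src/Mutator.hs.
type Mutate a = StdGen -> a -> a

-- Hypothetical alternative to mutateInt16: flip one randomly chosen bit, so
-- the result is derived from the input and always differs from it.
mutateInt16BitFlip :: Mutate Int16
mutateInt16BitFlip prng i = complementBit i bitIndex
  where
    (bitIndex, _) = randomR (0, 15 :: Int) prng

-- Number of bit positions in which two values differ; useful for checking
-- that the mutator changes exactly one bit.
hammingDistance :: Int16 -> Int16 -> Int
hammingDistance x y = popCount (x `xor` y)

-- Example use with a fixed seed.
exampleMutation :: Int16
exampleMutation = mutateInt16BitFlip (mkStdGen 42) 1000
```

A single bit flip keeps the mutated value close to the original in terms of bits, although not necessarily in magnitude. Whether that is a good distribution for a given testing problem is a separate question.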