Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copilot-theorem: Make What4 backend refuse to prove existentially quantified properties #254

Open
robdockins opened this issue Sep 21, 2021 · 9 comments · May be fixed by #596
Open

copilot-theorem: Make What4 backend refuse to prove existentially quantified properties #254

robdockins opened this issue Sep 21, 2021 · 9 comments · May be fixed by #596
Assignees
Labels
CR:Status:Verification Admin only: Change request that is currently being verified CR:Type:Bug Admin only: Change request pertaining to error detected
Milestone

Comments

@robdockins
Copy link
Contributor

Looking at the datatypes involved in the high-level and low-level Spec datatypes (e.g. SpecItem), it seems to me that properties and theorems are both represented as a bare stream of booleans. It strikes me as odd that we don't retain information about which quantifier was used to state the property, as it makes it impossible later to determine what semantics where intended; in particular, the quantifier for the property is not passed to the askProver method of Copilot.Theorem.Prove.prove, which strikes me as very strange. Am I missing something here?

@ivanperez-keera
Copy link
Member

I'm with you. I was also confused.

But I don't see how Spec could have a quantifier to give theorems a stream-based interpretation that also has operational meaning in online RV.

What do you suggest?

@robdockins
Copy link
Contributor Author

I think the current stream-based interpretation is fine, as long as we also remember the associated quantifier. I don't know that I would necessarily expect a run-time interpretation of these properties, but they are certainly an important distinction when proving properties about a spec. It's important not to assume an existential property as though it is universal (which is unsound); and the proof techniques one can use for universal properties are different than for existential (induction vs model checking).

In principle, one could interpret a universal property as a safety property and monitor it, but the trigger mechanism already basically does that.

@ivanperez-keera ivanperez-keera added the question Further information is requested label Sep 21, 2021
@RyanGlScott
Copy link
Collaborator

Here is a test case that demonstrates the issue:

{-# LANGUAGE NoImplicitPrelude #-}
module Main (main) where

import Data.Foldable
import Data.Functor

import Copilot.Theorem.What4
import Language.Copilot

trueThenFalses :: Stream Bool
trueThenFalses = [True] ++ constant False

spec :: Spec
spec = do
  void $ prop "forAll trueThenFalses (should be invalid)" (forAll trueThenFalses)
  void $ prop "exists trueThenFalses (should be valid)" (exists trueThenFalses)

main :: IO ()
main = do
  spec' <- reify spec
  results <- prove Z3 spec'
  forM_ results $ \(nm, res) -> do
    putStr $ nm <> ": "
    case res of
      Valid   -> putStrLn "valid"
      Invalid -> putStrLn "invalid"
      Unknown -> putStrLn "unknown"

As its name suggests, exists trueThenFalses (should be valid) should yield valid, but it instead yields invalid. I suspect this is happening because Copilot is dropping the existential quantifier and incorrectly treating the underlying stream as though it were universally quantified.

@RyanGlScott
Copy link
Collaborator

Here is the Kind2 equivalent of the program above:

{-# LANGUAGE NoImplicitPrelude #-}
module Main (main) where

import Data.Functor

import Copilot.Theorem.Kind2
import Copilot.Theorem.Prove
import Language.Copilot

trueThenFalses :: Stream Bool
trueThenFalses = [True] ++ constant False

spec :: Spec
spec = do
  void $ theorem "t1" (forAll trueThenFalses) (check (kind2Prover def))
  void $ theorem "t2" (exists trueThenFalses) (check (kind2Prover def))

main :: IO ()
main = void $ reify spec
$ PATH=~/Software/kind2-0.7.2/bin:$PATH runghc Bug.hs
(define-pred top
  ((prop-t1.out Bool)
   (prop-t2.out Bool)
   (s0.out Bool)
   (s0.out.1 Bool))
  (init
    (and
      (= prop-t1.out s0.out)
      (= prop-t2.out s0.out)
      (= s0.out true)
      (= s0.out.1 false)))
  (trans
    (and
      (= (prime prop-t1.out) (prime s0.out))
      (= (prime prop-t2.out) (prime s0.out))
      (= (prime s0.out) s0.out.1)
      (= (prime s0.out.1) false))))

(check-prop
  ((t2 prop-t2.out)))
t2: invalid ()
Finished: t2: proof failed
(define-pred top
  ((prop-t1.out Bool)
   (prop-t2.out Bool)
   (s0.out Bool)
   (s0.out.1 Bool))
  (init
    (and
      (= prop-t1.out s0.out)
      (= prop-t2.out s0.out)
      (= s0.out true)
      (= s0.out.1 false)))
  (trans
    (and
      (= (prime prop-t1.out) (prime s0.out))
      (= (prime prop-t2.out) (prime s0.out))
      (= (prime s0.out) s0.out.1)
      (= (prime s0.out.1) false))))

(check-prop
  ((t1 prop-t1.out)))
t1: invalid ()
Finished: t1: proof failed
Warning: failed to check some proofs.

Both the t1 and t2 proofs fail, but t2 ought to succeed.

@ivanperez-keera
Copy link
Member

ivanperez-keera commented Jan 29, 2025

Description

Copilot.Language.Spec.prop and Copilot.Language.Spec.theorem are both reified into as a bare stream of booleans, and do not retain information about whether the Prop passed as argument was a Forall or an Exists. Because this information is not retained when later using Copilot.Theorem.Prove.prove to check the properties, the results returned by the SMT solvers may be incorrect.

Type

  • Bug: Information is lost, leading to unexpected results .

Additional context

None.

Requester

  • Rob Dockins.

Method to check presence of bug

The issue can be detected with the following sample programs:

{-# LANGUAGE NoImplicitPrelude #-}
module Main (main) where

import Data.Foldable
import Data.Functor

import Copilot.Theorem.What4
import Language.Copilot

trueThenFalses :: Stream Bool
trueThenFalses = [True] ++ constant False

spec :: Spec
spec = do
  void $ prop "forAll trueThenFalses (should be invalid)" (forAll trueThenFalses)
  void $ prop "exists trueThenFalses (should be valid)" (exists trueThenFalses)

main :: IO ()
main = do
  spec' <- reify spec
  results <- prove Z3 spec'
  forM_ results $ \(nm, res) -> do
    putStr $ nm <> ": "
    case res of
      Valid   -> putStrLn "valid"
      Invalid -> putStrLn "invalid"
      Unknown -> putStrLn "unknown"

Expected result

Running the specs above or a similar variant via What4 should yield valid when the property is indeed valid (the second property above), indicating that respects the quantifier is being respected.

If the prover is not able to prove a property of that kind, a suitable error message should be printed.

Desired result

Running the specs above or a similar variant via What4 should yield valid when the property is indeed valid (the second property above), indicating that respects the quantifier is being respected.

If the prover is not able to prove a property of that kind, a suitable error message should be printed.

Proposed solution

Modify copilot-language:Copilot.Language.Spec to carry enough information about the quantifier used, and either use the information to ensure a proper encoding of the property when connecting to the SMT-solver, or have the backend refuse to prove the property when the result would be incorrect.

Further notes

None.

@ivanperez-keera ivanperez-keera added CR:Type:Bug Admin only: Change request pertaining to error detected CR:Status:Initiated Admin only: Change request that has been initiated and removed question Further information is requested labels Jan 29, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 4, 2025
…ge#254.

Previously, `copilot-language` would remember whether a proposition (i.e., a
stream of booleans) was quantified universally (i.e., using `forAll`) or
existentially (i.e., using `exists`). When translating from `copilot-language`
to `copilot-core`, however, the quantifier would be discarded. This meant that
a `copilot-core` `Property` would not record any quantifier information at all,
making it impossible for downstream libraries that use `copilot-core` to handle
universal quantification differently from existential quantification.

This commit changes the `copilot-core` API to preserve quantifier information
when translating `copilot-language` propositions into `copilot-core` ones.
Specifically, it introduces a `Prop` data type in `copilot-core` (largely
inspired by a data type of the same name in `copilot-language`) to record a
proposition's quantifier, and it changes the `propertyExpr :: Expr Bool` field
of `copilot-core`'s `Property` data type to `propertyProp :: Prop`. These API
changes require some corresponding code changes in `copilot-language`,
`copilot-prettyprinter`, and `copilot-theorem`.

This commit also introduces an `extractProp :: Prop -> Expr Bool` function for
retrieving the underlying boolean expression. Generally, this function should
not be used, as different quantifiers usually require different treatment, and
misuse of the `extractProp` function can potentially lead to unsoundness. There
are a handful of places where the use of `extractProp` is justified, however.
In each such place, a comment has been left to justify why the use of
`extractProp` is sound.

The payoff for these API changes will happen in a follow-up commit, where
`copilot-theorem` will use the quantifier information in a `copilot-core`
`Prop`, rather than unsoundly discarding the quantifiers using `extractProp`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 4, 2025
…4 backend. Refs Copilot-Language#254.

The functions in `Copilot.Theorem.What4` (e.g., `prove`) take a proposition and
attempt to prove that it is valid at all possible time steps. Previously, they
would do this regardless of whether the proposition was universally quantified
or existentiall quantifier, however, which was unsound. The reason that this
happened is because the functions would call `extractProp` on the proposition
being proven, causing the proposition's quantifier not to be taken into
account.

This commit removes the unsound uses of `extractProp` and instead checks if the
proposition was actually quantified universally (i.e., with a `Forall`), which
is the intended use case for `Copilot.Theorem.What4`. If the proposition was
instead quantified existentially (i.e., with an `Exists`), then a
`ProveException` will be thrown. Warnings have been added to Haddocks of the
relevant functions that can potentially throw a `ProveException`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 4, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 4, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 4, 2025
@ivanperez-keera
Copy link
Member

Change Manager: Confirmed that the issue exists.

@ivanperez-keera ivanperez-keera added CR:Status:Confirmed Admin only: Change request that has been acknowledged by the change manager and removed CR:Status:Initiated Admin only: Change request that has been initiated labels Feb 5, 2025
@ivanperez-keera
Copy link
Member

Technical Lead: Confirmed that the issue should be addressed.

@ivanperez-keera ivanperez-keera added CR:Status:Accepted Admin only: Change request accepted by technical lead and removed CR:Status:Confirmed Admin only: Change request that has been acknowledged by the change manager labels Feb 5, 2025
@ivanperez-keera
Copy link
Member

Technical Lead: Issue scheduled for fixing in Copilot 4.3.

Fix assigned to: @RyanGlScott .

@ivanperez-keera ivanperez-keera added CR:Status:Scheduled Admin only: Change requested scheduled and removed CR:Status:Accepted Admin only: Change request accepted by technical lead labels Feb 5, 2025
@ivanperez-keera ivanperez-keera added this to the 4.3 milestone Feb 5, 2025
@ivanperez-keera ivanperez-keera added CR:Status:Implementation Admin only: Change request that is currently being implemented and removed CR:Status:Scheduled Admin only: Change requested scheduled labels Feb 5, 2025
@ivanperez-keera ivanperez-keera changed the title prop and theorem forget their quantifier? copilot-theorem: Make What4 backend refuse to prove existentially quantified properties Feb 5, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
…4 backend. Refs Copilot-Language#254.

The functions in `Copilot.Theorem.What4` (e.g., `prove`) take a proposition and
attempt to prove that it is valid at all possible time steps. Previously, they
would do this regardless of whether the proposition was universally quantified
or existential quantifier, however, which was unsound. The reason that this
happened is because the functions would call `extractProp` on the proposition
being proven, causing the proposition's quantifier not to be taken into
account.

This commit removes the unsound uses of `extractProp` and instead checks if the
proposition was actually quantified universally (i.e., with a `Forall`), which
is the intended use case for `Copilot.Theorem.What4`. If the proposition was
instead quantified existentially (i.e., with an `Exists`), then a
`ProveException` will be thrown. Warnings have been added to Haddocks of the
relevant functions that can potentially throw a `ProveException`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
@RyanGlScott
Copy link
Collaborator

Implementor: Solution implemented, review requested.

RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
…ge#254.

Previously, `copilot-language` would remember whether a proposition (i.e., a
stream of booleans) was quantified universally (i.e., using `forAll`) or
existentially (i.e., using `exists`). When translating from `copilot-language`
to `copilot-core`, however, the quantifier would be discarded. This meant that
a `copilot-core` `Property` would not record any quantifier information at all,
making it impossible for downstream libraries that use `copilot-core` to handle
universal quantification differently from existential quantification.

This commit changes the `copilot-core` API to preserve quantifier information
when translating `copilot-language` propositions into `copilot-core` ones.
Specifically, it introduces a `Prop` data type in `copilot-core` (largely
inspired by a data type of the same name in `copilot-language`) to record a
proposition's quantifier, and it changes the `propertyExpr :: Expr Bool` field
of `copilot-core`'s `Property` data type to `propertyProp :: Prop`. These API
changes require some corresponding code changes in `copilot-language`,
`copilot-prettyprinter`, and `copilot-theorem`.

This commit also introduces an `extractProp :: Prop -> Expr Bool` function for
retrieving the underlying boolean expression. Generally, this function should
not be used, as different quantifiers usually require different treatment, and
misuse of the `extractProp` function can potentially lead to unsoundness. There
are a handful of places where the use of `extractProp` is justified, however.
In each such place, a comment has been left to justify why the use of
`extractProp` is sound.

The payoff for these API changes will happen in a follow-up commit, where
`copilot-theorem` will use the quantifier information in a `copilot-core`
`Prop`, rather than unsoundly discarding the quantifiers using `extractProp`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
…4 backend. Refs Copilot-Language#254.

The functions in `Copilot.Theorem.What4` (e.g., `prove`) take a proposition and
attempt to prove that it is valid at all possible time steps. Previously, they
would do this regardless of whether the proposition was universally quantified
or existential quantifier, however, which was unsound. The reason that this
happened is because the functions would call `extractProp` on the proposition
being proven, causing the proposition's quantifier not to be taken into
account.

This commit removes the unsound uses of `extractProp` and instead checks if the
proposition was actually quantified universally (i.e., with a `Forall`), which
is the intended use case for `Copilot.Theorem.What4`. If the proposition was
instead quantified existentially (i.e., with an `Exists`), then a
`ProveException` will be thrown. Warnings have been added to Haddocks of the
relevant functions that can potentially throw a `ProveException`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 26, 2025
@ivanperez-keera ivanperez-keera added CR:Status:Verification Admin only: Change request that is currently being verified and removed CR:Status:Implementation Admin only: Change request that is currently being implemented labels Feb 27, 2025
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 28, 2025
…ge#254.

Previously, `copilot-language` would remember whether a proposition (i.e., a
stream of booleans) was quantified universally (i.e., using `forAll`) or
existentially (i.e., using `exists`). When translating from `copilot-language`
to `copilot-core`, however, the quantifier would be discarded. This meant that
a `copilot-core` `Property` would not record any quantifier information at all,
making it impossible for downstream libraries that use `copilot-core` to handle
universal quantification differently from existential quantification.

This commit changes the `copilot-core` API to preserve quantifier information
when translating `copilot-language` propositions into `copilot-core` ones.
Specifically, it introduces a `Prop` data type in `copilot-core` (largely
inspired by a data type of the same name in `copilot-language`) to record a
proposition's quantifier, and it changes the `propertyExpr :: Expr Bool` field
of `copilot-core`'s `Property` data type to `propertyProp :: Prop`. These API
changes require some corresponding code changes in `copilot-language`,
`copilot-prettyprinter`, and `copilot-theorem`.

This commit also introduces an `extractProp :: Prop -> Expr Bool` function for
retrieving the underlying boolean expression. Generally, this function should
not be used, as different quantifiers usually require different treatment, and
misuse of the `extractProp` function can potentially lead to unsoundness. There
are a handful of places where the use of `extractProp` is justified, however.
In each such place, a comment has been left to justify why the use of
`extractProp` is sound.

The payoff for these API changes will happen in a follow-up commit, where
`copilot-theorem` will use the quantifier information in a `copilot-core`
`Prop`, rather than unsoundly discarding the quantifiers using `extractProp`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 28, 2025
…4 backend. Refs Copilot-Language#254.

The functions in `Copilot.Theorem.What4` (e.g., `prove`) take a proposition and
attempt to prove that it is valid at all possible time steps. Previously, they
would do this regardless of whether the proposition was universally quantified
or existential quantifier, however, which was unsound. The reason that this
happened is because the functions would call `extractProp` on the proposition
being proven, causing the proposition's quantifier not to be taken into
account.

This commit removes the unsound uses of `extractProp` and instead checks if the
proposition was actually quantified universally (i.e., with a `Forall`), which
is the intended use case for `Copilot.Theorem.What4`. If the proposition was
instead quantified existentially (i.e., with an `Exists`), then a
`ProveException` will be thrown. Warnings have been added to Haddocks of the
relevant functions that can potentially throw a `ProveException`.
RyanGlScott added a commit to GaloisInc/copilot-1 that referenced this issue Feb 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CR:Status:Verification Admin only: Change request that is currently being verified CR:Type:Bug Admin only: Change request pertaining to error detected
3 participants