fix bug, add new encodings, rename methods, and improve code

toda-lab · Dec 5, 2024 · 0540bc2 · 0540bc2
1 parent e49cd8c
commit 0540bc2
Show file tree

Hide file tree

Showing 41 changed files with 2,864 additions and 3,367 deletions.
diff --git a/CHANGES.rst b/CHANGES.rst
@@ -3,6 +3,77 @@
 CHANGES
 =======
 
+Version 2.3.0 - 2024-12-05
+--------------------------
+
+Fixed
+^^^^^
+
+- absexpr.py: constant atoms of different classes (say, true constants of Prop and Fog) were represented as the same objects due to bug in _to_key method of absexpr.py, which are now distinguished by adding class name to key.
+- ecc.py: compute_separating_ecc method did not compute separating ecc, which is now corrected and tested.
+- cnf.py: all external indices from 1 to self.base are ensured to be decodable.
+
+Changed
+^^^^^^^
+
+- absexpr.py: clear method of IndexGen class is changed so as to reset initial index too.
+- absprop.py: reduce_formula_step method is changed so as to reduce more by handling cases of implies and iff operators.
+- prop.py: syntax of variable accepted by read method is simplified
+- fog.py: _normalize_aux method (nomalizing the order of values in aux fields) is changed, depending on atom type (less-than relation, introduced in this release, is not normalized, while the others are normalized same as before).
+- ecc.py: simpgraph module is used to handle graph.
+- cnf.py: constructor is changed to accept an iterable object.
+- cnf.py: encoding to external indices and decoding to internal indices are improved and tested.
+- name.py: lookup_index method is changed to fails if names of auxiliary variables introduced by get_aux_index method (see below) are given; if the leading character is "_" and the name is already registered, then this means that it is the index of an auxiliary variable.
+- be.py: management of variables introduced in boolean encodings are changed; they are holded in instance variables of be object and managed by be object. Such variables were previosly registered to name manager with variable names seprated by '@', which is now deprecated.
+- grst.py: it is improved so as to be extendable and easy to add new encodings.
+
+Added
+^^^^^
+
+- fog.py:  atom of form x < y becomes accepted by read method
+- name.py: get_aux_index method is added to be able to introduce auxiliary variables in boolean encoding as needed
+- be.py: is_decodable_boolean_var method to determine if a boolean variable is decodable into an FO-variable.
+- grst.py: new encoding methods (direct, log, and vertex) are implemented.
+- op.py: compute_size method to compute formula size.
+- constraints.py: encoding of cardinality constraint (at-most-r constraint)
+
+Deprecated
+^^^^^^^^^^
+
+- the following methods on the left of ">" have been duplicated and will be remove in version 3.0.0.
+    - absexpr.py: is_atom_term     > is_atom
+    - absexpr.py: is_unop_term     > is_unop
+    - absexpr.py: is_binop_term    > is_binop
+    - absneg.py:  is_neg_term      > is_neg
+    - absprop.py: is_land_term     > is_land
+    - absprop.py: is_lor_term      > is_lor
+    - absprop.py: is_implies_term  > is_implies
+    - absprop.py: is_iff_term      > is_iff
+    - absfo.py:   is_forall_term   > is_forall
+    - absfo.py:   is_exists_term   > is_exists
+    - absfo.py:   is_qf_term       > is_qf
+    - prop.py:    get_atom_value   > (to be deleted)
+    - fog.py:     get_atom_value   > get_atom_arg
+    - op.py:      generator        > generate_subformulas
+    - fog.py:     le               > lt
+　  - fog.py:     _LE              > _LT
+    - fog.py:     get_le_tag       > get_lt_tag
+    - fog.py:     is_le_atom       > is_lt_atom
+    - fog.py:     action_le_rel    > action_lt_rel
+    - cnf.py:     get_nvar         > (to be deleted)
+    - cnf.py:     get_ncls         > (to be deleted)
+    - cnf.py:     get_clause       > (to be deleted)
+    - cnf.py:     encode_lit       > _encode_lit
+    - cnf.py:     decode_lit       > _decode_lit
+    - be.py:      get_symbol_index > get_variable_position_pair
+    - be.py:      get_code_pos     > get_variable_position_pair
+    - be.py:      exists_symbol    > (deleted)
+    - be.py:      exists_boolean_var > (deleted)
+    - grst.py:    encode_eq        > be_eq
+    - grst.py:    encode_edg       > be_edg
+    - grst.py:    encode_F         > be_F
+    - grst.py:    encode_T         > be_T
+
 Version 2.2.0 - 2024-09-09
 --------------------------
 

diff --git a/docs/getting-started.rst b/docs/getting-started.rst
@@ -11,7 +11,7 @@ Install Pygplib
 Quick Example
 -------------
 
-Let us count the number of independent sets of size ``3`` in the following
+Let us randomly generate independent sets of size ``3`` in the following
 graph, ``G``, with ``pygplib``.
 
 .. code:: shell-session
@@ -24,7 +24,7 @@ graph, ``G``, with ``pygplib``.
     # |    |     |
     # V4---V7   V6
 
-Import necessary modules and create a graph structure.
+Let us import necessary modules and create a graph structure.
 
 .. code:: python
 
@@ -34,7 +34,7 @@ Import necessary modules and create a graph structure.
     edge_list = [(1,2),(1,3),(2,4),(2,5),(3,6),(4,7),(5,7)]  # edges of G
     st = GrSt(vertex_list, edge_list)
 
-Parse an expression and construct a first-order formula object.
+Let us parse an expression and construct a first-order formula object.
 See :ref:`FirstOrderLogicofGraphs` and :ref:`FormatofFirstOrderFormula` for
 details.
 
@@ -43,7 +43,7 @@ details.
     f = Fog.read("(~ edg(x1,x2)) & (~ edg(x1,x3)) & (~ edg(x2,x3)) "\
                 +"& (~ x1=x2) & (~ x1=x3) & (~ x2=x3)")
 
-Perform Boolean encoding for ``f`` and 
+Let us perform Boolean encoding for ``f`` and 
 compute the domain constraint for each free variable of ``f``.
 The list ``[g, ] + li`` consists of propositional formulas such that 
 ``f`` is satisfiable in ``G`` if and only if the conjunction of propositional formulas is
@@ -58,34 +58,52 @@ The encodings performed in this code block are described in more details in :ref
     g = op.perform_boolean_encoding(f, st)
     li  = [st.compute_domain_constraint(v) \
                     for v in op.get_free_vars(f)]
-    mgr = Cnf( [g, ] + li )
+    mgr = Cnf( [g, ] + li, st=st)
 
-Generate a CNF formula to ``f.cnf`` in `DIMACS CNF format
-<http://www.satcompetition.org/2009/format-benchmarks2009.html>`__ .
 
-.. code:: python
+Let us import unigen, `UniGen approximately uniform sampler <https://github.com/meelgroup/unigen>`__ , 
+and perform random sampling of solutions of the cnf. 
+(If necessary, install unigen module beforehand.)
 
-    with open("f.cnf","w") as out:
-        mgr.write(stream=out)
+.. code:: python
 
+    from pysat.formula import CNF
+    from pysat.solvers import Solver
+    from pyunigen import Sampler
+
+    num=5 # number of samples to be generated
+
+    sampler=Sampler()
+    for clause in mgr.cnf:
+        sampler.add_clause(clause)
+
+    cells, hashes, samples = sampler.sample(num=num, sampling_set=range(1,mgr.base+1))
+    for ext_partial_assign in samples:
+        with Solver(\
+            bootstrap_with=CNF(\
+                from_clauses=\
+                    list(mgr.cnf) + [(lit,) for lit in ext_partial_assign])) as solver:
+            if solver.solve():
+                ext_full_assign = solver.get_model() # external CNF vars.
+                int_assign = mgr.decode_assignment(ext_full_assign) # internal CNF vars.
+                fo_assign = struct.decode_assignment(int_assign) # first-order vars.
+                ans = sorted([struct.object_to_vertex(fo_assign[key]) \
+                                        for key in fo_assign.keys()])
+                print(ans)
+            else:
+                print("Unexpected error occured during sampling!")
+
+Sampling solutions of combinatrial problems is computationally hard in general.
+To make the above computation more efficient, pygplib provides a technique of so-called symmetry breaking.
+The formula of independent set is symmetry, i.e., any performulation of 
+a satisfying assignment of vertices to first-order variables is also a solution of the formula,
+which results in an enormous number of solutions, making it hard to perform sampling.
+To overcome this, let us consider the following formula to which the constraint that all vertices assigned to variables are sorted is added instead of all-different constraint.
 
-To count the number of solutions, 
-download and build a model counter `sharpSAT
-<https://github.com/marcthurley/sharpSAT.git>`__ .
-Run the following command.
+.. code:: python
 
-.. code:: shell-session
+    f = Fog.read("x1<x2 & x2<x3"\
+                +"& (~ x1=x2) & (~ x1=x3) & (~ x2=x3)")
 
-    $ path-to-sharpSAT/sharpSAT f.cnf
-    (The first part omitted)
-    # solutions 
-    48
-    # END
-    
-    time: 0.108726s
-
-Note that solutions are the permutations of all independent sets of size
-``3``.
-For example, the assignment ``x1=2,x2=7,x3=3`` is distinguished
-from any other permutation of it, say ``x1=7,x1=2,x3=3``.
-Hence the number of all independent sets of size ``3`` amounts to ``48/3!=8``.
+After that, let us encode it into CNF and perform sampling in the same way as described just above.
+Sampling for larger graphs would become more efficient.
diff --git a/docs/howto-use-pygplib.rst b/docs/howto-use-pygplib.rst
@@ -411,18 +411,6 @@ A ``Cnf`` object manages the index mapping between
 *internal* CNF variables (those in propositional formulas in ``[g, ] + li``)
 and *external* CNF variables (those in the output DIMACS CNF).
 
-A ``Cnf`` object provides the following instance methods.
-
-- ``get_nvars()``: returns the number of CNF variables (in other words, the
-  maximum index of an external CNF variable)
-- ``get_ncls()``: returns the number of clauses
-- ``get_clause(i)``: returns the ``i``-th clause, a tuple of nonzero-integers
-  (indices of external CNF variables),
-  where ``i`` ranges from ``0`` to the number of clauses minus ``1``.
--  ``write(stream=stdout)``: generates a CNF in DIMACS format to stream (``stdout`` if not given).
-- ``decode_assignment(assign)``: decodes the assignment of DIMACS CNF
-  variables (external CNF variables), ``assign``, to the assignment of internal CNF variables.
-
 The easiest way to solve CNF would be to use 
 ``pysat``, `a toolkit for SAT-based prototyping in Python <https://pysathq.github.io/>`__ .
 The ``pygplib`` in itself does not provide any functionality of