% proof.tex
\documentclass[12pt, a4paper]{article}
\usepackage[british]{babel}
\usepackage{graphicx, hyperref, xcolor, fancyhdr, amssymb, amsmath, amsthm, mdframed, tikz, listings}
\usepackage{mathrsfs}
\usepackage[acronym, toc]{glossaries}
\usepackage{pdfpages}
\usepackage{svg}
\usepackage{algorithmic}
\usepackage[ruled, linesnumbered]{algorithm2e}
\usetikzlibrary{arrows.meta, automata, positioning}
% Colors definition
\definecolor{isa_red}{RGB}{255, 58, 71}
\definecolor{isa_orange}{RGB}{219, 130, 57}
\definecolor{isa_light_blue}{RGB}{0, 163, 255}
\definecolor{isa_blue}{RGB}{0, 103, 158}
\definecolor{isa_dark_blue}{RGB}{26, 0, 253}
\definecolor{isa_green}{RGB}{0, 157, 97}
\definecolor{isa_dark_green}{RGB}{0,131, 0}
\definecolor{isa_purple}{RGB}{174, 5, 238}
% Define custom language for syntax highlighting
\lstdefinelanguage{Isabelle}{
% List of custom keywords
keywords=[1]{definition, if, then, else, let, in, case, of, do, SOME, lemma, note, THE},
% Custom keyword styles
keywordstyle=[1]\color{isa_blue},
keywords=[2]{where, and},
keywordstyle=[2]\color{isa_green},
keywords=[3]{RETURNT, monadic_WHILEIET, SPEC, ASSERT, consume, cost},
keywordstyle=[3]\color{isa_red},
keywords=[4]{assume, define},
keywordstyle=[4]\color{isa_light_blue},
literate={`}{\textasciigrave\ }1,
% Other language elements styles
commentstyle=\color{isa_dark_green},
string=[b]{''},
stringstyle=\color{isa_purple},
numberstyle=\tiny\color{isa_dark_blue},
% Frame and background settings
frame=l,
framerule=3pt,
rulecolor=\color{black!15},
backgroundcolor=\color{black!5},
% Other settings
basicstyle=\ttfamily\small,
breaklines=true,
breakatwhitespace=true,
showstringspaces=false,
showtabs=false,
tabsize=2
}
\lstset{%
language=Isabelle,
columns=fixed,
extendedchars,
basewidth={0.5em,0.45em},
basicstyle=\ttfamily,
mathescape=true
}
% editorial comments in the text or in marginal notes
% 1st argument: initials of the person making the comment,
% 2nd argument: comment to insert
\long\def\ednote#1#2{\par\noindent\framebox{\begin{minipage}{.98\linewidth}\linespread{.7}\footnotesize #1: #2\end{minipage}}\par}
\newcommand{\edmargin}[2]{\marginpar{\raggedright\linespread{.7}\tiny #1: #2}}
% redefine proof environment with grey background and black square at the end
\renewenvironment{proof}[1][Proof]{\begin{mdframed}[backgroundcolor=black!5, topline=false, rightline=false, bottomline=false, linecolor=black!15, linewidth=3pt]{\noindent\textit{#1.}\ }}{\noindent\par\hfill$\blacksquare$\end{mdframed}}
% \usepackage{times}
\hypersetup{
colorlinks,
linkcolor={red!60!black},
citecolor={blue!80!black},
urlcolor={blue!80!black}
}
% \pagestyle{fancy}
% \renewcommand\headrulewidth{.5pt}
% \fancyhead[L]{
% V. Trélat
% }
% \fancyhead[C]{
% \textbf{Verification in Isabelle/HOL of Hopcroft's algorithm for minimizing DFAs including runtime analysis}
% }
% \fancyhead[R]{
% % Current section
% \ifnum\c@secnumdepth>0
% \thesection
% }
\newcommand{\midtilde}{\raisebox{0.5ex}{\texttildelow}}
\newcommand{\isavar}[1]{{\color{isa_dark_green}\text{#1}}}
\newcommand{\isafixed}[1]{{\color{isa_dark_blue}\text{#1}}}
% theorems
\theoremstyle{definition}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{remark}{Remark}
\newtheorem{definition}{Definition}
\newtheorem{requirement}{Requirement}
% Glossary
\makeglossaries
% \newacronym{tiaa}{TIAA}{This Is An Acronym}
\begin{document}
\begin{titlepage}
\vspace*{4cm}
\begin{center}
\Large{Verification in Isabelle/HOL of Hopcroft's algorithm for minimizing DFAs including runtime analysis}
\end{center}
\vspace{2cm}
\begin{center}
Vincent Trélat
\vspace{1cm}
\textit{\today}
\end{center}
\vfill
\end{titlepage}
\tableofcontents
\pagebreak
\section{Introduction}
The goal of this project was to verify Hopcroft's algorithm for minimizing DFAs in Isabelle/HOL and to analyze its worst-case time complexity using the Refinement Framework.
Considerable work on the proof of correctness had already been carried out by Peter Lammich and Thomas T\"urk about a decade ago, in an older version of Isabelle \cite{lammich:hop}.
The goal was to port this proof to the latest version of Isabelle and to extend it with a time complexity analysis.
\subsection{Setting up the environment}
In order to be able to comfortably browse the theories and inspect the proofs, it is recommended to use Isabelle\footnote{The project was carried out using \href{https://isabelle.in.tum.de/}{Isabelle2022}.}/jEdit. Yet, the project is not part of the \href{https://www.isa-afp.org/}{Archive of Formal Proofs} (AFP) and can only be downloaded from the \href{https://github.com/VTrelat/Hopcroft_Verif}{GitHub repository}.
Then, building an image of the main framework is strongly recommended, as it will greatly speed up the compilation of the theories, especially for the time complexity analysis. This may take a few minutes, but only needs to be done once:
\begin{lstlisting}[language=bash]
cd Hopcroft_Verif/Hopcroft_Time
isabelle build -d . -l Isabelle_LLVM_Time Hopcroft_Time.thy
\end{lstlisting}
Isabelle may then be launched with jEdit by running the following command. This will open the main theory file \texttt{Hopcroft\_Time.thy} in jEdit and should take less than a minute:
\begin{lstlisting}[language=bash]
isabelle jedit -d . -l Isabelle_LLVM_Time Hopcroft_Time.thy
\end{lstlisting}
\subsection{Hopcroft's algorithm}
John E. Hopcroft's algorithm for minimizing deterministic finite automata (DFA) was first presented in his original 1971 paper \cite{Hop71} as a formal algorithm.
It was a major breakthrough in the field of automata theory, since it allowed for minimization in linearithmic\footnote{$O(n \log n)$} time in the worst case\footnote{Other minimization algorithms were presented prior to Hopcroft's, e.g.\ by Moore (quadratic) or Brzozowski (exponential).}.
Minimization thus became comparatively cheap, and the algorithm has been widely used in the field of formal verification, e.g.\ in model checking, since it reduces the size of the automaton to be verified.
The formal idea behind the algorithm is to partition the set of states of the DFA into equivalence classes, i.e.\ sets of states that are equivalent\footnote{Two states are equivalent if they accept the same language.} with respect to the language they recognize.
The algorithm starts with a partition consisting of two blocks, namely the set of final states and the set of non-final states.
Then, it iteratively refines the partition by splitting blocks into smaller ones until no further refinement is possible.
The algorithm is guaranteed to terminate since the number of blocks is bounded by the number of states and strictly increases at each refinement step.
The resulting partition is the set of equivalence classes of $\mathcal{Q}$, so that any two states taken from different blocks of that partition recognize different languages.
\bigskip
\begin{algorithm}[H]
\SetAlgoLined
\caption{Hopcroft's original formal algorithm}
\label{alg:original}
\KwData{a finite DFA $\mathcal{A} = (\mathcal{Q}, \Sigma, \delta, q_0, \mathcal{F})$}
\KwResult{the equivalence class of $\mathcal{Q}$ under state equivalence}
Construct $\delta^{-1}(q, a) := \{t \in \mathcal{Q}\ |\ \delta(t, a) = q\}$ for all $q \in \mathcal{Q}$ and $a \in \Sigma$ \;\label{alg:original:1}
Construct $P_1 := \mathcal{F}$, $P_2 := \mathcal{Q} \setminus \mathcal{F}$ and $s_{a,i} := \{q \in P_i \ | \ \delta^{-1}(q, a) \neq \varnothing \}$ for all $i \in \{1, 2\}$ and $a \in \Sigma$ \;
Let $k := 3$ \;
For all $a \in \Sigma$, construct $L_a := \underset{1 \leq i < k}{\operatorname*{arg\,min}}\ \left| s_{a, i} \right|$ \; \label{alg:original:4}
\While{$\exists a \in \Sigma, L_a \neq \varnothing$\label{alg:original:loop}}{
Pick $a \in \Sigma$ such that $L_a \neq \varnothing$ and $i \in L_a$ \; \label{alg:original:pick}
$L_a := L_a \setminus \{i\}$ \;
\ForAll{$j < k, \exists q \in P_j, \delta(q, a) \in s_{a,i}$}{
$P'_j := \{t \in P_j\ |\ \delta(t, a) \in s_{a,i} \}$ and $P''_j := P_j \setminus P'_j$ \;
$P_j := P'_j$ and $P_{k} := P''_j$; construct $s_{a,j}$ and $s_{a,k}$ for all $a \in \Sigma$ accordingly \;
For all $a \in \Sigma$, $L_a := \begin{cases}
L_a \cup \{j\} & \text{if $j \notin L_a \land \left| s_{a,j} \right| \leq \left| s_{a,k} \right|$} \\
L_a \cup \{k\} & \text{otherwise}
\end{cases}$ \;
$k := k + 1$ \;
}
}
\end{algorithm}
\bigskip
Algorithm~\ref{alg:original} is a direct translation of the original algorithm from \cite{Hop71} with only slight changes in the notations.
\subsection{Modern formalisation}
The algorithm is now usually given in a more mathematical and formalised way\footnote{see for example \cite{esparza2023automata}}, which leaves more freedom in the choice of an implementation.
The algorithm is given below in Algorithm~\ref{alg:modern}.
\bigskip
\begin{algorithm}[H]
\SetAlgoLined
\caption{Hopcroft's algorithm in a modern style}
\label{alg:modern}
\KwData{a finite DFA $\mathcal{A} = (\mathcal{Q}, \Sigma, \delta, q_0, \mathcal{F})$}
\KwResult{the language partition $P_\ell$}
\eIf{$\mathcal{F} = \varnothing \lor \mathcal{Q} \setminus \mathcal{F} = \varnothing$}{
\Return $\mathcal{Q}$
}{
$\mathcal{P} := \{ \mathcal{F}, \mathcal{Q} \setminus \mathcal{F} \}$ \;
$\mathcal{W} := \{ (a, \min \{ \mathcal{F}, \mathcal{Q} \setminus \mathcal{F}\}), a \in \Sigma \}$ \;
\While{$\mathcal{W} \neq \varnothing$}{
Pick $(a, C)$ from $\mathcal{W}$ and remove it\;
\ForAll{$B \in \mathcal{P}$}{
Split $B$ with $(a, C)$ into $B_0$ and $B_1$ \;
$\mathcal{P} := (\mathcal{P} \setminus \{B\}) \cup \{B_0, B_1\}$ \;
\ForAll{$b \in \Sigma$}{
\eIf{$(b, B) \in \mathcal{W}$}{
$\mathcal{W} := (\mathcal{W} \setminus \{(b, B)\}) \cup \{(b, B_0), (b, B_1)\}$ \;
}{
$\mathcal{W} := \mathcal{W} \cup \{(b, \min \{ B_0, B_1 \})\}$ \;
}
}
}
}
}
\end{algorithm}
\bigskip
We justify this transformation.
First, instead of storing indices for the blocks, we directly deal with sets and explicitly work with a partition $\mathcal{P}$ of $\mathcal{Q}$.
Then, we define splitters.
\begin{definition}
Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, I, F)$ be a DFA. Let $P$ be a partition of $Q$. Let $B \in P$. For $a \in \Sigma$ and $C \in P$ we say that
$(a, C)$ splits $B$ or is a splitter of $B$ if $$\exists q_1, q_2 \in B \quad \delta(q_1, a) \in C \land \delta(q_2, a) \notin C.$$
\end{definition}
\begin{remark}
If $(a, C)$ is a splitter of $B$, $P$ can be updated to $P \setminus \{B\} \cup \{B', B''\}$, where $$B' := \{q \in B \ | \ \delta(q, a) \in C\} \text{ and } B'' := B \setminus B'.$$
\end{remark}
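The definition and the remark can be phrased as small executable checks. The following sketch is purely illustrative (the transition table \texttt{delta} and the function names are assumptions, not part of the formalisation):

\begin{lstlisting}[language=Python]
# delta maps (state, symbol) to the successor state.
def splits(delta, a, C, B):
    # (a, C) splits B iff some state of B enters C under a and some does not
    hits = [q for q in B if delta[(q, a)] in C]
    return 0 < len(hits) < len(B)

def split(delta, a, C, B):
    # return the pair (B', B'') from the remark above
    Bp = frozenset(q for q in B if delta[(q, a)] in C)
    return Bp, frozenset(B) - Bp
\end{lstlisting}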
\begin{definition}
For a splitter $s \in \Sigma \times P$, we define its size $\| s \|$ as the size of its second component, i.e.\ if $s = (a, C)$, then $\| s \| := |C|$.
\end{definition}
In Algorithm \ref{alg:original}, picking an index $i \in L_a$ corresponds to picking a set $s_{a, i}$, i.e.\ -- by definition -- the subset of $P_i$ of states having at least one predecessor.
We cannot directly replace $L_a$ and $s_{a, i}$ by a set of splitters $(a, P_i)$ as is; however, the next step of the algorithm goes over all blocks of the partition that have a successor in $s_{a, i}$.
If such a block $B$ is found, it is split into two blocks, namely the set of states having a successor in $s_{a, i}$ and the others.
Overall, we look for $B$ such that $P_i$ has a predecessor in $B$. We add the condition that $B$ must also have at least one successor not in $P_i$ to avoid splitting $B$ into $\{B, \varnothing\}$.
Since this is equivalent to splitting $B$ with $(a, P_i)$, we can replace $L_a$ by a set of splitters $(a, P_i)$.
Since the symbol $a$ is chosen such that $L_a \neq \varnothing$, we can define a workset $\mathcal{W}$ of all splitters, with all symbols of $\Sigma$. Therefore, $L_a$ is empty if and only if there is no $a$-splitter in $\mathcal{W}$.
Since testing whether $L_a$ is empty amounts to testing whether $\mathcal{W}$ contains an $a$-splitter, and because of the existential quantifier in the definition of splitters, the sets $s_{a, i}$ are no longer needed.
We can explicitly write the trivial cases where either $\mathcal{F}$ or $\mathcal{Q}\setminus\mathcal{F}$ is empty to improve performance for this specific case.
Algorithm \ref{alg:original} does not treat these cases separately, yet it still produces correct results.
If one of the two sets is empty, one of the $s_{a, 1}$ or $s_{a, 2}$ will be empty for all $a \in \Sigma$, and the \textit{while} loop will be entered $|\Sigma|$ times but the inner \textit{for} loop will never be entered.
This results in a useless $O(|\Sigma|)$ computation.
Another difference in that case is that it would return the partition $\min\{\mathcal{F}, \mathcal{Q}\setminus\mathcal{F}\}$ -- where $\min$ returns the set with the least number of elements -- which we avoid in the modern version by just returning $\mathcal{Q}$.
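The transformation can also be illustrated in executable form. The following sketch follows Algorithm~\ref{alg:modern} directly, using plain sets and an assumed total transition table \texttt{delta}; it is merely an illustration of the pseudocode, not the verified implementation discussed later:

\begin{lstlisting}[language=Python]
# Illustrative sketch of Algorithm 2; delta maps (state, symbol)
# to the successor state and must be total on Q x Sigma.
def hopcroft(Q, Sigma, delta, F):
    F = frozenset(F)
    rest = frozenset(Q) - F
    if not F or not rest:                # trivial cases
        return {frozenset(Q)}
    P = {F, rest}                        # current partition
    smaller = min(F, rest, key=len)
    W = {(a, smaller) for a in Sigma}    # workset of splitters
    while W:
        a, C = W.pop()                   # pick a splitter and remove it
        for B in list(P):
            B0 = frozenset(q for q in B if delta[(q, a)] in C)
            B1 = B - B0
            if B0 and B1:                # (a, C) splits B
                P.remove(B)
                P |= {B0, B1}
                for b in Sigma:
                    if (b, B) in W:
                        W.remove((b, B))
                        W |= {(b, B0), (b, B1)}
                    else:
                        W.add((b, min(B0, B1, key=len)))
    return P
\end{lstlisting}

For instance, on a three-state automaton where two non-final states both lead to the single final state, the sketch merges the two non-final states into one block.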
\section{Proof of correctness}
Since the proof of correctness is not the purpose of this paper, we do not go into details. John E. Hopcroft's original paper \cite{Hop71} contains a proof of correctness for his low-level algorithm, however the ideas behind the invariants are the same for the formalisation.
The proof of correctness for Hopcroft's algorithm for DFA minimization typically involves two main components: partition refinement and equivalence checking.
The basic idea of the proof is to show that the algorithm correctly partitions the states of the input DFA into equivalence classes, where states within the same class are indistinguishable from each other with respect to the language recognized by the DFA. This partition represents the minimal DFA that recognizes the same language as the input DFA.
The algorithm starts with an initial partition that distinguishes accepting states from non-accepting states. Then, it iteratively refines this partition by considering pairs of states that are currently in different equivalence classes and determining if they can be distinguished by a symbol from the input alphabet.
During each iteration, the algorithm examines each symbol in the alphabet and checks if it distinguishes any pair of states in different equivalence classes. If a symbol does distinguish a pair, it splits the equivalence class containing those states into two new classes. This process continues until no more splits can be made, and the resulting partition represents the minimal DFA.
To prove the correctness of Hopcroft's algorithm, one needs to show two key properties:
\begin{itemize}
\item the resulting partition is a valid equivalence relation, meaning it satisfies the properties of reflexivity, symmetry, and transitivity,
\item the resulting partition is indeed the minimal partition that correctly distinguishes states based on their language equivalence.
\end{itemize}
The proof typically involves reasoning about the behavior of the algorithm during each iteration, considering how the partition is refined inductively and ensuring that it converges to the minimal partition.
In the Refinement Framework, the algorithm is first described at the abstract level, so its correctness can be proved independently of any implementation details. Then, the algorithm is progressively refined into a concrete implementation, and the correctness of each refinement until the implementation is proved by showing that it refines the abstract algorithm.
\section{Time complexity analysis}
\subsection{Paper proof}
We first focus on the original algorithm presented in Algorithm~\ref{alg:original} in order to examine the arguments given in \cite{Hop71}.
The data structures used at the time were mostly linked lists; rather than fixing actual implementations, let us state some requirements on the data structures.
The goal is to show that the algorithm can be executed in $O(m \cdot n \log n)$ time, where $m$ is the number of symbols in the alphabet and $n$ is the number of states in the DFA.
The following requirements come directly from \cite{Hop71} and are specific to the algorithm presented in Algorithm~\ref{alg:original}.
\begin{requirement}
\label{req:1}
Sets such as $\delta^{-1}(q, a)$ and $L_a$ must be represented in a way that allows $O(1)$ time for addition and deletion in ``front position''\footnote{This does not make sense for sets and is specific to a data structure. What is meant here is somewhat informal and designates a position directly accessible, e.g.\ the head of a stack, the value of the root of a tree, etc.}.
\end{requirement}
\begin{requirement}
\label{req:2}
Vectors must be maintained to indicate whether a state is in a given set.
\end{requirement}
\begin{requirement}
\label{req:3}
Sets such as $P_i$ must be represented in a way that allows $O(1)$ time for addition and deletion at any given position.
\end{requirement}
\begin{remark}
\label{req:4}
From Req.\ \ref{req:2} and Req.\ \ref{req:3}, we can show that for a state $q$ in the representation of a set $P_i$ or $s_{a,i}$, its ``position'' -- in the implemented data structure -- must be determined in $O(1)$ time.
\end{remark}
\bigskip
\begin{lemma}
In Algorithm \ref{alg:original}, lines \ref{alg:original:1} to \ref{alg:original:4} can be executed in $O(\left| \Sigma \right| \cdot \left| \mathcal{Q} \right|)$ time.
\end{lemma}
\begin{proof}
The non-trivial part is the computation of the inverse transition function $\delta^{-1}(q, a)$ for all $q \in \mathcal{Q}$ and $a \in \Sigma$. This can be done in $O(\left| \Sigma \right| \cdot \left| \mathcal{Q} \right|)$ time by iterating over $\Sigma$ and traversing the automaton (e.g.\ with a DFS) while keeping track of the predecessor at each step.
\end{proof}
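For illustration, the computation of the inverse transition table can be sketched as a single pass over all pairs of states and symbols, which makes the $O(\left| \Sigma \right| \cdot \left| \mathcal{Q} \right|)$ bound apparent (an assumed dictionary-based layout, not the representation required by Req.\ \ref{req:1}):

\begin{lstlisting}[language=Python]
def inverse_transitions(Q, Sigma, delta):
    # delta maps (q, a) to delta(q, a); one entry per (state, symbol) pair
    inv = {(q, a): set() for q in Q for a in Sigma}
    for q in Q:           # |Q| iterations
        for a in Sigma:   # |Sigma| iterations, O(1) work each
            inv[(delta[(q, a)], a)].add(q)
    return inv
\end{lstlisting}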
\bigskip
\begin{lemma}\label{lem:time_iteration}
An iteration of the \textit{while} loop in both algorithms for a splitter $s = (a, C)$ takes a time proportional to the number of transitions terminating in $C$ and the number of symbols in the alphabet, i.e. $\Theta\left(\left| \Sigma \right| \cdot \left| \delta^{-1}(C, a) \right| \right)$ time.
\end{lemma}
\begin{proof}
We start by showing it for Algorithm \ref{alg:original}. We pick $a \in \Sigma$ such that $L_a \neq \varnothing$ and an $i \in L_a$. We need to examine all $j < k$ such that $\exists q \in P_j, \delta(q, a) \in s_{a,i}$ to construct the sets corresponding to splitting the block $P_j$ w.r.t.\ $a$ and $P_i$.
Let $j < k$. From the definition of $s_{a,i}$, we obtain the following:
\begin{align*}
\exists q \in P_j, \delta(q, a) \in s_{a,i} &\iff \exists q \in P_j, \delta(q, a) \in P_i \land \underset{\text{true}}{\underbrace{\delta^{-1}(\delta(q, a), a) \neq \varnothing}} \\
& \iff \exists q \in P_j, \delta(q, a) \in P_i
\end{align*}
Which corresponds to finding states in $P_j$ having an outgoing $a$-transition to a state in $P_i$, as represented in the following scheme:
\begin{center}
\begin{tikzpicture}
\definecolor{color1}{RGB}{213, 216, 224}
\definecolor{color2}{RGB}{202, 222, 200}
\fill[color=color1] (0,0) ellipse (1.8cm and 1.4cm);
\draw[dashed] (-0.5,-1.35) -- (0,1.4);
\fill[color=color2] (5,0) ellipse (2cm and 1cm);
\node at (-2.1,0) {$P_j$};
\node at (-1,0) {$P''_j$};
\node at (0.5,0) {$P'_j$};
\node at (7.3,0) {$P_i$};
\draw[->] (0, 0.8) to[in=160, out=20] node[above]{$a$} (5,0.5);
\draw[->] (1,0) to[in=180, out=10] node[above]{$a$} (5,0.5);
\draw[->] (-.2,-1) to[in=210, out=-20] node[above]{$a$} (4,-.2);
\end{tikzpicture}
\end{center}
This set of states can be expressed via the inverse transition function:
$$ \{q \in P_j\ |\ \delta(q,a) \in P_i \} = \left( \bigcup_{q' \in P_i} \delta^{-1}(q', a)\right) \cap P_j $$
Since $\delta^{-1}$ was already computed in the first step of the algorithm, we can determine using Req.\ \ref{req:2} whether a state of $\bigcup_{q \in P_i} \delta^{-1}(q, a)$ is also in $P_j$ in $\Theta(1)$ time.
Thus, instead of examining $P_j$ for all $j < k$, we rather go through the table of $\delta^{-1}$: for each state $q$ such that $\delta(q, a) \in P_i$, we know from Remark \ref{req:4} that we can determine the index $j < k$ (there are $k$ blocks) of the block $P_j$ containing $q$ in $\Theta(1)$ time.
The sets $P'_j$ and $P''_j = P_k$ resulting from the partition can be constructed on the fly without any additional time cost. The construction of the sets $s_{b, j}$ and $s_{b, k}$ as well as the update of $L_b$ for all $b \in \Sigma$ can also be done on the fly but require $\Theta(1)$ time for each symbol $b \in \Sigma$ and thus add up to a total of $\Theta(\left| \Sigma \right|)$ time.
\ednote{VT}{the on-the-fly computation part is a little hand-waving...}
Overall, one iteration of the loop runs in time
$$\Theta \left( \left| \Sigma \right| \cdot \left| \bigcup_{q \in P_i} \delta^{-1}(q, a) \right| \right) $$
\bigskip
We now show the result for Algorithm \ref{alg:modern}. A splitter ${s =: (a, C)}$ is picked from $\mathcal{W}$.
Then, we iterate over all blocks $B \in \mathcal{P}$ that may be split with $s$. Let $B \in \mathcal{P}$.
\begin{align*}
s\ \text{splits}\ B &\iff \exists q_1, q_2 \in B \quad \delta(q_1, a) \in C \land \delta(q_2, a) \notin C\\
&\iff \{q_1 \in B, \delta(q_1, a) \in C\} \neq \varnothing \land \{q_2 \in B, \delta(q_2, a) \notin C\} \neq \varnothing \\
&\iff \left(\bigcup_{q \in C} \delta^{-1}(q, a)\right)\cap B \neq \varnothing \land \left(\bigcup_{q \in B}\delta(q, a)\right)\setminus C \neq \varnothing
\end{align*}
\ednote{VT}{we may express the right-hand conjunct in terms of $\delta^{-1}$ in order to prepare the paragraph that follows (But it wouldn't be the same union because you have to start from states outside $C$)}
Likewise, instead of examining all blocks $B \in \mathcal{P}$, we go through $\delta^{-1}$ and we update the splitters for all symbols in $\Sigma$ and one iteration of the loop runs in time
$$\Theta \left( \left| \Sigma \right| \cdot \left| \bigcup_{q \in C} \delta^{-1}(q, a) \right| \right) $$
\end{proof}
\newcommand{\preds}[2]{\ensuremath{\overset{\hookleftarrow {#1}}{#2}}}
For the sake of simplicity, we now denote $\displaystyle{\bigcup_{q \in C} \delta^{-1}(q, a)}$ by $\preds{a}{C}$.
We will now justify the logarithmic factor in the time complexity, first explaining informally why a logarithm appears.
The idea is that for each symbol $a \in \Sigma$, a state $q \in \mathcal{Q}$ can be in at most one of the splitters in $\mathcal{W}$.
When the loop iterates over this splitter, it will split the block and keep the smaller one, whose size will be at most the size of the splitter divided by two.
This means that a splitter $s$ can be processed at most $\log \| s \|$ times.
We now properly state and prove the property.
We view splitters as updatable program variables.
This means that $s_{a,i}$, as a set, may change over the course of the loop; however, there exists some $q \in \mathcal{Q}$ such that $s_{a,i}$ is the unique set containing $q$ throughout the execution.
This gives a way to identify splitters throughout the whole execution.
\bigskip
\begin{lemma}\label{lem:log}
Any splitter in the workset $s \in \mathcal{W}$ is processed -- i.e.\ picked at the beginning of the \textit{while} loop -- at most $\lfloor \log \| s \| \rfloor$ times.
\end{lemma}
\begin{proof}
We first show the following statement:
\textit{During an iteration, the chosen splitter $s$ is either removed from the set of splitters or replaced by a splitter of size at most $\frac{\| s\|}{2}$ after the iteration\footnote{That is a bit informal because the original splitter is actually removed and another ``similar'' splitter (cf.\ the paragraph above the statement of the lemma) of size at most $\frac{\| s \|}{2}$ may be added.}.}
\bigskip
Let $\{P_1, \cdots, P_\ell\}$ be the current partition and let $s =: (a, C)$ be the picked splitter. Thus, $s$ is no longer in the set of splitters $\mathcal{W}$. Since we go over all splittable blocks, $C$ may also be split by $s$.
\begin{itemize}
\item If $C$ is not split by $s$, then $s$ simply remains removed from the splitters. Note that it can be added again later if $C$ is split by another splitter.
\item If $C$ is split into $C'$ and $C''$, then $(a, \min\{C', C''\})$ is added to the splitters and its size is at most $\frac{\| s \|}{2}$.
\end{itemize}
Therefore, since a splitter cannot be empty, $s$ can be processed at most $m$ times where $m$ is such that
$$ 1 \leq \frac{\| s \|}{2^m} < 2$$
which is equivalent to
$$ m \leq \log \| s \| < m + 1 \quad \text{i.e.}\quad \left\lfloor\log \| s \|\right\rfloor = m $$
\end{proof}
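The counting argument can be checked numerically: starting from a block of size $n$ and keeping at most half of it at every split, one can survive exactly $\lfloor \log_2 n \rfloor$ splits before reaching a singleton. A small sanity check (illustrative only):

\begin{lstlisting}[language=Python]
def max_processings(n):
    # count how often a block of n states can be halved
    # before a single state remains
    m = 0
    while n >= 2:
        n //= 2   # the surviving half has at most n // 2 states
        m += 1
    return m
\end{lstlisting}

One checks that \texttt{max\_processings(n)} coincides with $\lfloor \log_2 n \rfloor$ for every $n \geq 1$.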
\bigskip
\begin{lemma}\label{lem:log_non_splitter}
Any block $B$ resulting from a split and not added as a splitter is processed at most $\lfloor \log \frac{| B |}{2} \rfloor$ times.
\end{lemma}
\begin{proof}
Let $P \in \mathcal{P}$ be a block in the partition and $s =: (a, C)$ be a splitter such that $s$ splits $P$ into $S$ and $B$ so that $(a, S)$ is added to the workset and $B$ is not.
In order for $B$ to be processed, some $(b, B)$ must be added to the workset.
Since the only way to create a fresh splitter is to split $B$, this means that there exists some later step in the loop such that some splitter $(b, B')$ splits $B$ into $B_1$ and $B_2$ and $(b, \min\{B_1, B_2\})$ is added to the workset.
From lemma \ref{lem:log}, we know that this splitter can be processed at most $\lfloor\log |\min \{B_1,B_2\}|\rfloor$ times.
Since $|\min\{B_1, B_2\}| \leq \frac{| B |}{2}$ and since there is no splitter containing $B$ in the workset, we obtain that $B$ can be processed at most $\lfloor\log \frac{| B |}{2}\rfloor$ times.
\end{proof}
\bigskip
\begin{lemma}\label{lem:bound}
Let us consider some step in the algorithm such that $\mathcal{P}$ is the current partition. We give an estimation of the total time spent in the loop until termination as an upper bound:
$$ T := \theta \cdot \left(\sum_{(a, C) \in \mathcal{W}} | \preds{a}{C} | \log | C | + \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{W}} | \preds{a}{B} | \log \frac{| B |}{2} \right)$$
where $\theta$ is the constant factor of proportionality that may be obtained from lemma \ref{lem:time_iteration}.
\end{lemma}
\begin{proof}
We show the result by induction over the steps.
\bigskip
\textbf{Base case:} the current partition is $\{P_1, P_2\}$ and we may assume w.l.o.g.\ that $P_1$ is the smaller set, so that $\mathcal{W} = \{(a, P_1), a \in \Sigma\}$.
We have to show that the total time spent in the loop is bounded by
$$T = \theta \cdot \sum_{a \in \Sigma} \left(| \preds{a}{P_1} | \log | P_1 | + | \preds{a}{P_2} | \log \frac{| P_2 |}{2}\right)$$
We know from lemma \ref{lem:time_iteration} that an iteration of the loop for $(a, P_i)$ takes $\theta | \preds{a}{P_i} | $ time.
\begin{itemize}
\item From lemma \ref{lem:log}, for all $a\in\Sigma$, $(a, P_1)$ can be processed at most $\log \| (a, P_1) \| = \log |P_1|$ times\footnote{We drop the floor function for simplicity.
Note that since we are giving upper bounds, this is completely valid. However, the floor function will have some importance later in the formalisation.}, hence the total time for any splitter $(a, P_1)$ is bounded by:
$$\theta | \preds{a}{P_1} | \log | P_1 |$$
\item From lemma \ref{lem:log_non_splitter}, for all $a\in\Sigma$, $(a, P_2)$ can be processed at most $\log \frac{\| (a, P_2) \|}{2} = \log \frac{| P_2 |}{2}$ times. Thus, the total time for any non-splitter $(a, P_2)$ is bounded by:
$$ \theta | \preds{a}{P_2} | \log \frac{| P_2 |}{2}$$
\end{itemize}
\textbf{Inductive step:} Let $\{P_1, \cdots, P_\ell\} := \mathcal{P}$ be the current partition.
The induction hypothesis states that the total time spent in the loop until termination is bounded by
$$ T := \theta \left(\sum_{(a, C) \in \mathcal{W}} | \preds{a}{C} | \log | C | + \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{W}} | \preds{a}{B} | \log \frac{| B |}{2} \right)$$
By going through one more step, some blocks of $\mathcal{P}$ may be split and we define a new estimate $\hat{T}$ over this new partition and we have to show that $\hat{T} \leq T$.
Suppose some $B \in \mathcal{P}$ is split into $B'$ and $B''$ by any splitter.
We have to consider the cases where $(a , B)$ is a splitter or not for all $a \in \Sigma$.
Let $a \in \Sigma$.
\begin{itemize}
\item $(a, B) \in \mathcal{W}$: $\mathcal{W}$ is updated as follows:
$$ \mathcal{W} := \mathcal{W} \setminus \{(a, B)\} \cup \{(a, B'), (a, B'')\}$$
Thus, instead of taking $\theta | \preds{a}{B} | \log | B |$ time for $(a, B)$, the time is now bounded by:
$$\theta | \preds{a}{B'} | \log | B' | + \theta | \preds{a}{B''} | \log | B'' |$$
Finally, we show the following:
\begin{equation*}
| \preds{a}{B'} | \log | B' | + | \preds{a}{B''} | \log | B'' | \leq | \preds{a}{B} | \log | B |
\end{equation*}
From $B = B' \cup B''$ and because the automaton is deterministic, we get $|B| = |B'| + |B''|$ and $| \preds{a}{B} | = | \preds{a}{B'} | + | \preds{a}{B''} |$, hence:
\begin{align*}
& |\preds{a}{B'} | \log | B' | + | \preds{a}{B''} | \log | B'' | - | \preds{a}{B} | \log | B | \\
&= | \preds{a}{B'} | \log | B' | + (| \preds{a}{B} | - | \preds{a}{B'}|) \log \left(|B|-|B'|\right) - | \preds{a}{B} | \log | B | \\
&= | \preds{a}{B'} | \log \frac{|B'|}{|B|-|B'|} + | \preds{a}{B} | \log \frac{|B|-|B'|}{|B|} \\
&\leq | \preds{a}{B'} | \log \frac{|B|}{|B|-|B'|} + | \preds{a}{B} | \log \frac{|B|-|B'|}{|B|} \\
&= (| \preds{a}{B}| - |\preds{a}{B'} |) \log \underset{\leq 1}{\underbrace{\frac{|B|-|B'|}{|B|}}} \leq 0
\end{align*}
\item $(a, B) \notin \mathcal{W}$: $(a, \min\{B', B''\})$ is added to the set of splitters. We may assume w.l.o.g.\ that $B'$ is the smaller set.
Thus, the corresponding term $|\preds{a}{B}| \log \frac{| B |}{2}$ in the sum is replaced by:
\begin{equation*}
\theta \underset{\text{lemma \ref{lem:log}}}{\underbrace{| \preds{a}{B'} | \log | B' |}} + \theta \underset{\text{lemma \ref{lem:log_non_splitter}}}{\underbrace{| \preds{a}{B''} | \log \left(\frac{| B'' |}{2}\right)}}
\end{equation*}
Since $B = B' \cup B''$, the sum above can be written as:
\begin{equation*}
\theta | \preds{a}{B'} | \log | B' | + \theta (| \preds{a}{B} | - | \preds{a}{B'} |) \log \left(\frac{|B|-|B'|}{2}\right)
\end{equation*}
By using the fact that $\displaystyle{| B' | \leq \frac{|B|}{2}}$, we have:
\begin{align*}
& | \preds{a}{B'} | \log | B'| + (| \preds{a}{B} | - | \preds{a}{B'}|)\log \left(\frac{| B | - |B'|}{2} \right) - |\preds{a}{B}| \log \frac{|B|}{2} \\
& \leq | \preds{a}{B'}| \log \frac{| B |}{2} + (| \preds{a}{B}| - | \preds{a}{B'}|) \log \left(\frac{|B| - |B'|}{2}\right) - | \preds{a}{B}| \log \frac{| B |}{2} \\
& \leq (| \preds{a}{B}| - | \preds{a}{B'} |) \log \left(\frac{| B | - | B' |}{| B |}\right) \leq 0
\end{align*}
\end{itemize}
Overall, splitting any block of $\mathcal{P}$ does not increase the total time spent in the loop.
By iterating over all blocks, we obtain that the total time $\hat{T}$ spent in the loop from this new iteration until termination satisfies $\hat{T} \leq T$.
\end{proof}
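Both cases of the inductive step rest on the same splitting inequality $| \preds{a}{B'} | \log | B' | + | \preds{a}{B''} | \log | B'' | \leq | \preds{a}{B} | \log | B |$. As an illustration only (this is our own sanity check, not part of the Isabelle development), the following Python snippet verifies it exhaustively over small instances, using the facts $|B''| = |B| - |B'|$ and $|\preds{a}{B''}| = |\preds{a}{B}| - |\preds{a}{B'}|$:

```python
import math

def split_bound_holds(preds_B, preds_B1, size_B, size_B1):
    """Check  |preds(B')| log|B'| + |preds(B'')| log|B''|  <=  |preds(B)| log|B|,
    where B'' = B \\ B', so |B''| = |B| - |B'| and, by determinism,
    |preds(B'')| = |preds(B)| - |preds(B')|."""
    preds_B2 = preds_B - preds_B1
    size_B2 = size_B - size_B1
    lhs = (preds_B1 * math.log2(size_B1) if size_B1 > 0 else 0) \
        + (preds_B2 * math.log2(size_B2) if size_B2 > 0 else 0)
    rhs = preds_B * math.log2(size_B)
    return lhs <= rhs + 1e-9  # tolerance for floating-point rounding

# Exhaustive check over small proper splits (both halves non-empty).
ok = all(
    split_bound_holds(pB, pB1, sB, sB1)
    for sB in range(2, 25)
    for sB1 in range(1, sB)
    for pB in range(0, 30)
    for pB1 in range(0, pB + 1)
)
print(ok)  # True
```

The check succeeds because $|B'| \leq |B|$ and $|B''| \leq |B|$, so each summand on the left is bounded by the corresponding share of the right-hand side.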
\bigskip
\begin{theorem}
\label{thm:complexity}
Algorithm \ref{alg:original} and Algorithm \ref{alg:modern} run in $O\left(|\Sigma|\cdot|\mathcal{Q}| \log |\mathcal{Q}|\right)$ time.
\end{theorem}
\begin{proof}
From lemma \ref{lem:bound}, we obtain in particular that for the initial partition the time spent in the loop until termination is bounded by:
$$T = \theta \cdot \sum_{a \in \Sigma}\left(| \preds{a}{P_1} | \log | P_1 | + | \preds{a}{P_2} | \log \frac{| P_2 |}{2}\right)$$
where $\theta$ is the constant of proportionality obtained in lemma \ref{lem:time_iteration} and $\{P_1, P_2\}$ is the initial partition.
Thus using concavity of the logarithm, the following holds:
$$T \leq \theta \sum_{a \in \Sigma} (\underset{\leq |\mathcal{Q}|}{\underbrace{|\preds{a}{P_1}| + |\preds{a}{P_2}|}})\log(\underset{\leq |\mathcal{Q}|}{\underbrace{|P_1| + |P_2|}}) \leq \theta|\Sigma||\mathcal{Q}| \log |\mathcal{Q}|$$
\end{proof}
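To make the object of this analysis concrete, here is a minimal executable sketch of Hopcroft's partition refinement in Python. It is our own simplified illustration, not the verified Isabelle development: the names (\texttt{hopcroft}, \texttt{delta}, \texttt{inv}) and the workset bookkeeping are ours, and the example automaton at the end is a hypothetical test input.

```python
from collections import defaultdict

def hopcroft(states, alphabet, delta, finals):
    """Minimal sketch of Hopcroft's algorithm (simplified illustration).
    delta maps (state, letter) to the successor state."""
    # Precompute predecessor sets: inv[(a, q)] = delta^{-1}(q, a).
    inv = defaultdict(set)
    for (q, a), q2 in delta.items():
        inv[(a, q2)].add(q)

    finals = frozenset(finals)
    others = frozenset(states) - finals
    partition = {b for b in (finals, others) if b}
    # Initialize the workset with the smaller of the two initial blocks.
    small = min(finals, others, key=len) if finals and others else next(iter(partition))
    work = {(a, small) for a in alphabet}

    while work:
        a, C = work.pop()
        preds = set().union(*(inv[(a, q)] for q in C))  # preds_a(C)
        for B in list(partition):
            B1, B2 = B & preds, B - preds
            if B1 and B2:  # the splitter (a, C) splits B into B1 and B2
                partition.remove(B)
                partition |= {B1, B2}
                for b in alphabet:
                    if (b, B) in work:       # splitter: replace by both halves
                        work.remove((b, B))
                        work |= {(b, B1), (b, B2)}
                    else:                    # non-splitter: add the smaller half
                        work.add((b, min(B1, B2, key=len)))
    return partition

# Hypothetical example: a 6-state DFA over {a, b}.
delta = {('q0','a'):'q1', ('q0','b'):'q2', ('q1','a'):'q0', ('q1','b'):'q3',
         ('q2','a'):'q4', ('q2','b'):'q5', ('q3','a'):'q4', ('q3','b'):'q5',
         ('q4','a'):'q4', ('q4','b'):'q5', ('q5','a'):'q5', ('q5','b'):'q5'}
blocks = hopcroft({'q0','q1','q2','q3','q4','q5'}, {'a','b'}, delta,
                  {'q2','q3','q4'})
print(sorted(sorted(b) for b in blocks))
# [['q0', 'q1'], ['q2', 'q3', 'q4'], ['q5']]
```

The two branches of the workset update mirror the two cases of lemma \ref{lem:bound}: a splitter present in the workset is replaced by both halves, while for an absent one only the smaller half is added.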
\subsection{Towards a formal proof}
The objective of the project is to obtain a formal proof for the time complexity of the algorithm using the Refinement Framework with its extension to runtime analysis.
We follow a top-down strategy, from the abstract level to the implementation.
\subsubsection{Abstract level}
The purpose of this section is to explain how the analysis at the abstract level is structured without adding too many details about Isabelle.
\paragraph{Setting up the framework:}
First, we define the algorithm on the abstract level using the NREST framework as shown below:
\noindent
\begin{minipage}{\linewidth}
\begin{lstlisting}[language=Isabelle]
definition Hopcroft_abstractT where
Hopcroft_abstractT $\color{isa_dark_blue}\mathcal{A}$ $\equiv$
if (check_states_empty_spec $\color{isa_dark_blue}\mathcal{A}$) then mop_partition_empty else
if (check_final_states_empty_spec $\color{isa_dark_blue}\mathcal{A}$) then mop_partition_singleton ($\mathcal{Q}$ $\color{isa_dark_blue}\mathcal{A}$) else
do {
PL $\leftarrow$ init_spec $\color{isa_dark_blue}\mathcal{A}$;
(P, _) $\leftarrow$ monadic_WHILEIET (Hopcroft_abstract_invar $\color{isa_dark_blue}\mathcal{A}$)
($\lambda \isavar{s}$. estimate_iteration $\color{isa_dark_blue}\mathcal{A}$ $\isavar{s}$) check_b_spec (Hopcroft_abstract_f $\color{isa_dark_blue}\mathcal{A}$) PL;
RETURNT P
}
\end{lstlisting}
\end{minipage}
It is parameterized by abstract costs for each operation, like checking emptiness, returning an empty partition or a singleton, initializing the state\footnote{States in this setting are tuples $(P, L)$ where $P$ is a partition of the set of states $\mathcal{Q}$ of $\mathcal{A}$ together with a workset $L$ of splitters.}, etc.
For the sake of simplicity and since names are rather self-explanatory, we only give the definition of \texttt{check\_states\_empty\_spec} as an example and detail the \textit{while} loop. Checking emptiness for the set of states costs one coin of the currency \texttt{check\_states\_empty} and can be defined as follows:
\begin{lstlisting}[language=Isabelle]
definition check_states_empty_spec ${\color{isa_dark_blue}\mathcal{A}}$ $\equiv$
consume (RETURNT ($\mathcal{Q}$ ${\color{isa_dark_blue}\mathcal{A}}$ = {})) (cost ''check_states_empty'' 1)
\end{lstlisting}
The purpose of using currencies is to give an abstract description of the costs, which are then instantiated with concrete values in the implementation, expressed in terms of elementary costs such as the number of comparisons or assignments. For example, if the set of states is represented as a tree \texttt{t}, there should be a method or an attribute \texttt{t.isempty} kept up to date during the execution of the algorithm, so that the cost of checking emptiness is the cost of a function call plus the cost of reading memory at the location of the boolean \texttt{t.isempty}, i.e.\ constant time.
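The idea of currencies can be sketched as cost vectors over named coins that are later exchanged for elementary costs. The following Python snippet is our own illustration of that exchange, not the NREST implementation; the exchange rates are made up:

```python
from collections import Counter

# Abstract costs are vectors over named "currencies" (illustration only;
# the exchange rates below are invented for the example).
cost_1_iteration = Counter({'call': 1, 'check_l_empty': 1, 'if': 1,
                            'pick_splitter': 1, 'update_split': 1})

# Instantiation: each abstract coin is exchanged for elementary costs.
exchange = {
    'call':          Counter({'instr': 2}),
    'check_l_empty': Counter({'mem_read': 1, 'instr': 1}),
    'if':            Counter({'instr': 1}),
    'pick_splitter': Counter({'mem_read': 2, 'instr': 3}),
    'update_split':  Counter({'mem_read': 4, 'mem_write': 2, 'instr': 5}),
}

def instantiate(abstract_cost, rates):
    """Exchange every abstract coin for its concrete elementary costs."""
    concrete = Counter()
    for currency, amount in abstract_cost.items():
        for elem, unit_cost in rates[currency].items():
            concrete[elem] += amount * unit_cost
    return concrete

print(instantiate(cost_1_iteration, exchange))
```

The abstract analysis only manipulates the currency vector; refining the data structures later amounts to choosing a concrete exchange table.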
The \textit{while} loop must be provided with an invariant, an estimate of the total amount of resource to be consumed -- here, time -- which has to decrease at each iteration, the loop condition, the function to be executed in the loop body, and the initial state.
The loop body defined in \texttt{Hopcroft\_abstract\_f} is given below:
\noindent
\begin{minipage}{\linewidth}
\begin{lstlisting}[language=Isabelle]
definition Hopcroft_abstract_f where
Hopcroft_abstract_f $\color{isa_dark_blue}\mathcal{A}$ =
$\lambda$PL. do {
ASSERT (Hopcroft_abstract_invar $\color{isa_dark_blue}\mathcal{A}$ PL);
(a,p) $\leftarrow$ pick_splitter_spec PL;
PL' $\leftarrow$ update_split_spec $\color{isa_dark_blue}\mathcal{A}$ PL a p;
RETURNT PL'
}
\end{lstlisting}
\end{minipage}
The main operation, i.e.\ splitting blocks of the partition and updating the workset accordingly is specified in \texttt{update\_split\_spec}.
As noted in the informal paper proof of lemma \ref{lem:time_iteration}, this operation is a bit tricky since splitting and updating are performed on-the-fly at the same time; they are therefore specified together for the moment.
The definition of \texttt{update\_split\_spec} is given below:
\begin{lstlisting}[language=Isabelle]
definition update_split_spec $\color{isa_dark_blue}\mathcal{A}$ $\equiv$ $\lambda$(P,L) a p.
SPEC ($\lambda$(P', L').
Hopcroft_update_splitters_pred $\color{isa_dark_blue}\mathcal{A}$ p a P L L' $\land$ P' = Hopcroft_split $\color{isa_dark_blue}\mathcal{A}$ p a {} P
) ($\lambda$_. cost ''update_split'' (enat ($|\Sigma$ $\color{isa_dark_blue}\mathcal{A}|$ * $|$ preds $\color{isa_dark_blue}\mathcal{A}$ a p $|$)))
\end{lstlisting}
where \texttt{preds $\mathcal{A}$ a p} represents the set of all predecessors $\bigcup_{q \in p} \delta^{-1}(q, a) = \preds{a}{p}$ of all states of $p$ labelled by $a$. The cost specified by this definition stems from the result of lemma \ref{lem:time_iteration}, namely that the cost for splitting and updating the partition and the workset is $\Theta(|\Sigma| \cdot |\preds{a}{p}|)$.
\bigskip
The most important part of this definition is the estimate of the cost for the loop, i.e.\ the amount of resource consumed by the loop. From lemma \ref{lem:bound}, we have the following estimate:
\begin{equation}\label{eq:estimate}
T = \theta \cdot \left(\sum_{(a, C) \in \mathcal{L}} | \preds{a}{C} | \lfloor \log_2 | C | \rfloor + \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{L}} | \preds{a}{B} | \lfloor \log_2 \frac{| B |}{2} \rfloor \right)
\end{equation}
\begin{remark}
In the following, we write $\log$ for the function $x \mapsto \lfloor \log_2 x \rfloor$ for simplicity.
\end{remark}
The constant $\theta$ comes from lemma \ref{lem:time_iteration} and represents constant costs inherent to the loop body, like the cost of checking emptiness, calling a function, etc.
In Isabelle, this factor $\theta$ is defined following the definition of \texttt{Hopcroft\_abstract\_f} and \texttt{check\_b\_spec} as follows:
\begin{lstlisting}[language=Isabelle]
definition cost_1_iteration $\equiv$
cost ''call'' 1 + cost ''check_l_empty'' 1 + cost ''if'' 1 +
cost ''pick_splitter'' 1 + cost ''update_split'' 1
\end{lstlisting}
Then, given a suitable definition \texttt{estimate} for the inner expression in $T$ -- i.e.\ the ``raw'' estimate -- the estimate for the loop is defined as the product\footnote{The operator \texttt{{\color{isa_blue}*m}} lifts multiplication to cost expressions.} of this constant factor and the inner expression, as follows:
\begin{lstlisting}[language=Isabelle]
definition estimate_iteration where
estimate_iteration $\mathcal{A}$ PL $\equiv$ cost_1_iteration ${\text{\color{isa_blue}{*m}}}$ estimate $\mathcal{A}$ PL
\end{lstlisting}
\paragraph{Finding a suitable definition for the estimate:}
Finding a suitable definition for the inner expression in $T$ is not straightforward. The first attempt was to define it as follows:
\begin{lstlisting}[language=Isabelle]
definition estimate1 ${\color{isa_dark_blue}\mathcal{A}}$ $\equiv$ $\lambda$(P,L).
$\sum${(card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst s) (snd s)))*Discrete.log(card(snd s)) | s. s $\in$ L} +
$\sum${(card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst s) (snd s)))*Discrete.log(card(snd s)/2) | s. s $\in$ $\Sigma$ ${\color{isa_dark_blue}\mathcal{A}}$ $\times$ P - L}
\end{lstlisting}
\begin{remark}
The function \texttt{Discrete.log :: nat $\Rightarrow$ nat} is the floor of the logarithm in base 2, i.e.\ it formalizes the function $x \mapsto \lfloor \log_2 x \rfloor$. Unlike the informal proof of lemma \ref{lem:bound}, the floor function cannot be dropped -- as is -- in Isabelle, mostly because the Refinement Framework works with natural numbers for the costs. This makes sense because the logarithm essentially captures the number of times blocks are split, which is a natural number.
\end{remark}
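The behaviour of \texttt{Discrete.log} can be mimicked in a few lines of Python; this is our own sketch for intuition, not the Isabelle definition:

```python
def discrete_log(n: int) -> int:
    """Floor of the base-2 logarithm on naturals, with discrete_log(0) = 0
    (a Python sketch of the behaviour of Isabelle's Discrete.log)."""
    return n.bit_length() - 1 if n > 0 else 0

print([discrete_log(n) for n in [0, 1, 2, 3, 4, 7, 8]])  # [0, 0, 1, 1, 2, 2, 3]
```

In particular, \texttt{discrete\_log} increases by exactly one each time its argument doubles, which is the discrete counterpart of the "number of times a block can be halved" intuition.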
The problem with this definition is that the function $s := (a, C) \mapsto |\preds{a}{C}| \log |C|$ has no reason to be injective on $L$ or on its complement, and thus the set comprehensions within the sums may contain duplicates that are merged together. One way to fix this is to sum over multisets. Multisets can also be defined via set comprehension in Isabelle, but having a structure allowing for induction is often more convenient. Thus, we prefer lists to multisets.
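The duplicate-merging problem can be seen on a toy example (our own Python illustration; the splitter names and values are invented):

```python
# Two distinct splitters may contribute the same summand value; summing over a
# set of values then silently merges them.
contributions = [(('a', 'B1'), 2),   # |preds_a(B1)| * log|B1| = 2
                 (('b', 'B2'), 2)]   # |preds_b(B2)| * log|B2| = 2, same value

over_set  = sum({value for (_, value) in contributions})  # duplicates merged
over_list = sum(value for (_, value) in contributions)    # each splitter counted
print(over_set, over_list)  # 2 4
```

Summing over the list (or, equivalently, the multiset) of contributions keeps one term per splitter, which is what the estimate requires.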
\bigskip
A very convenient way to link lists and multisets is to use permutations, which are defined in Isabelle using multisets: a list $\ell$ is a permutation of a list $\ell'$ if and only if $\ell$ and $\ell'$ have the same multiset, i.e.\ in Isabelle if and only if \texttt{mset $\ell$ = mset $\ell'$}. We can lift this definition\footnote{A list $\ell$ permutes a set $S$, which we denote by \texttt{$\ell$ <\midtilde\midtilde\midtilde> $S$}, iff \texttt{mset $\ell$ = mset\_set $S$}.} to permutations between lists and sets with the help of the function \texttt{mset\_set} in Isabelle. Thus, we can define the estimate as follows:
\begin{lstlisting}[language=Isabelle]
definition estimate ${\color{isa_dark_blue}\mathcal{A}}$ $\equiv$ $\lambda$(P,L).
let xs = (SOME xs. xs <$\midtilde\midtilde\midtilde$> L) and
ys = (SOME ys. ys <$\midtilde\midtilde\midtilde$> $\Sigma$ ${\color{isa_dark_blue}\mathcal{A}}$ $\times$ P - L) in
$\sum$s$\leftarrow$xs. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst s) (snd s)) * Discrete.log(card(snd s)) +
$\sum$s$\leftarrow$ys. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst s) (snd s)) * Discrete.log(card(snd s)/2)
\end{lstlisting}
\begin{remark}
This definition makes use of the non-deterministic choice operator \texttt{SOME} of Isabelle to choose a permutation of $L$ and $\Sigma \times \mathcal{P} \setminus \mathcal{L}$ respectively. Since we are working with finite automata, such permutations always exist.
\end{remark}
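The relation \texttt{xs <\midtilde\midtilde\midtilde> S} can be illustrated in Python (our own sketch; the set and names are invented): any enumeration of a finite set is such a permutation, since its multiset of elements equals the set.

```python
from collections import Counter

# The analogue of  SOME xs. xs <~~~> S : any list whose multiset of elements
# equals the finite set S.
S = {('a', 'B1'), ('b', 'B2'), ('c', 'B3')}
xs = list(S)  # one such list; SOME only requires that one exists

print(Counter(xs) == Counter(S))  # True: mset xs = mset_set S
```

Summing over \texttt{xs} then visits every element of \texttt{S} exactly once, without merging equal summands.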
\paragraph{Simplifying the estimate:}
Unfortunately, such a definition makes the proofs very difficult. For a state transition $(\mathcal{P}, \mathcal{L}) \rightarrow (\mathcal{P}', \mathcal{L}')$, we would have to examine, for every block, whether it is a splitter in $\mathcal{L}$, whether it is split by some other splitter, and whether new splitters are created from it in $\mathcal{L}'$. This would be very tedious to formalize, but here is a sketch of some ideas.
Splitters of the workset can be uniquely characterized via ``$a$-$q$-splitters'', defined as follows for a state $(P, L)$:
\begin{definition}
Let $a \in \Sigma$ and $q \in \mathcal{Q}$. Let $B$ be the unique block of the partition $\mathcal{P}$ containing $q$. If $(a, B) \in L$, it is the \textbf{$a$-$q$-splitter}. Otherwise, there is no $a$-$q$-splitter.
\end{definition}
This can be achieved in Isabelle using the option type. Although this is a good characterization of splitters, it is not easy to work with it. For example, if we want to show that the number of splitters decreases, we would have to show that the number of $a$-$q$-splitters decreases for any $a \in \Sigma$ and any $q \in \mathcal{Q}$. This is rather tricky to formalize.
\bigskip
In order to simplify the expression of the estimate (\ref{eq:estimate}), we can remove the halving factor in the second term, which allows both terms to be gathered under one sum. We obtain an upper bound for the previous expression, and the difference between the two is ``small'':
\begin{align*}
T & \leq \theta \cdot \left(\sum_{(a, C) \in \mathcal{L}} | \preds{a}{C} | \log | C | + \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{L}} | \preds{a}{B} | \log | B | \right)\\
& = \theta \sum_{(a, C) \in \Sigma\times \mathcal{P}} | \preds{a}{C} | \log | C |
\end{align*}
Thus, we work with the following simplified estimate:
\begin{equation}\label{eq:estimate_simplified}
\hat{T} := \theta \sum_{(a, C) \in \Sigma\times \mathcal{P}} | \preds{a}{C} | \log | C |
\end{equation}
\begin{remark}
Recall that we dropped the floor function in the notations for simplicity.
However, the floor function will have some importance later in the formalization.
The full expression of the estimate is given by
\begin{equation*}
\hat{T} = \theta \sum_{(a, C) \in \Sigma\times \mathcal{P}} | \preds{a}{C} | \lfloor \log_2 | C | \rfloor
\end{equation*}
\end{remark}
\noindent
The error $\Delta T$ is given by
\begin{equation}
\Delta T := \hat{T} - T = \theta \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{L}} | \preds{a}{B} | \left(\log | B | - \log \frac{| B |}{2}\right) = \theta \sum_{(a, B) \in (\Sigma\times \mathcal{P})\setminus \mathcal{L}} | \preds{a}{B} | \leq \theta \, |\Sigma| \, |\mathcal{Q}|
\end{equation}
Thus, performing this simplification does not change the asymptotic behavior of the estimate.
The definition in Isabelle is also simpler:
\begin{lstlisting}[language=Isabelle]
definition estimate ${\color{isa_dark_blue}\mathcal{A}}$ $\equiv$ $\lambda$($\isavar{P}$,$\isavar{L}$).
let $\isavar{xs}$ = (SOME $\isavar{xs}$. $\isavar{xs}$ <$\midtilde\midtilde\midtilde$> $\Sigma$ ${\color{isa_dark_blue}\mathcal{A}}$ $\times$ $\isavar{P}$) in
$\sum \isavar{s} \leftarrow \isavar{xs}$. card (preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * Discrete.log (card (snd $\isavar{s}$))
\end{lstlisting}
\begin{remark}
The previous definition could be equivalently achieved by using Isabelle's summation over sets:
\begin{lstlisting}[language=Isabelle]
definition estimate ${\color{isa_dark_blue}\mathcal{A}}$ $\equiv$ $\lambda$($\isavar{P}$,$\isavar{L}$).
$\sum \isavar{s} \in$ $\Sigma$ ${\color{isa_dark_blue}\mathcal{A}}$ $\times$ $\isavar{P}$. card (preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * Discrete.log (card (snd $\isavar{s}$))
\end{lstlisting}
Unfortunately, I only noticed this alternative after the formalization was done. It may shorten some proofs and improve overall understanding, so it is worth switching to this definition.
\end{remark}
\paragraph{Proving that the estimate decreases:} We can now prove that the estimate decreases. We can consider two approaches:
\begin{itemize}
\item[(1)] either we follow the idea of the paper proof given in theorem \ref{thm:complexity} and a non-strict inequality is sufficient,
\item[(2)] or we follow the rules given by the Refinement Framework (for a \textit{while} loop given an estimate of the time spent in the loop) and we show a strict inequality.
\end{itemize}
We start with approach (1) since it is easier to prove. We first sketch out the idea of the proof, followed by some technical details about the proof in Isabelle. Given a state $(\mathcal{P}, \mathcal{L})$ satisfying some invariants and a state transition $(\mathcal{P}, \mathcal{L}) \rightarrow (\mathcal{P}', \mathcal{L}')$, we have to show that
\begin{equation*}
\hat{T}(\mathcal{P}', \mathcal{L}') \leq \hat{T}(\mathcal{P}, \mathcal{L})
\end{equation*}
Forgetting the constant $\theta$, we have to show that
\begin{equation*}
\sum_{(a, C) \in \Sigma\times \mathcal{P'}} | \preds{a}{C} | \log | C | \leq \sum_{(a, C) \in \Sigma\times \mathcal{P}} | \preds{a}{C} | \log | C |
\end{equation*}
Let $\{P_1, \dots, P_m\} := \mathcal{P}$ be an enumeration of the $m$ blocks of $\mathcal{P}$ such that the first $k$ blocks $P_1, \dots, P_k$ are split and $P_{k+1}, \dots, P_m$ are not.
This means that for each $i \in \{1, \dots, k\}$, there exists $P_i'$ and $P_i''$ in $\mathcal{P'}$ resulting from the split of $P_i$ in $\mathcal{P}$.
Thus, $\mathcal{P'}$ can be written as follows:
\begin{equation*}
\mathcal{P'} = \{P_1', P_1'', \dots, P_k', P_k'', P_{k+1}, \dots, P_m\}
\end{equation*}
Therefore, we can rewrite the previous inequality as follows:
\begin{equation*}
\sum_{(a, C) \in \Sigma\times \{P_1', P_1'', \dots, P_k', P_k'', P_{k+1}, \dots, P_m\}} | \preds{a}{C} | \log | C | \leq \sum_{(a, C) \in \Sigma\times \{P_1, \dots, P_m\}} | \preds{a}{C} | \log | C |
\end{equation*}
By splitting the left-hand side sum into two sums over $\{P_1', P_1'', \dots, P_k', P_k''\}$ and $\{P_{k+1}, \dots, P_m\}$ and subtracting the terms over $\{P_{k+1}, \dots, P_m\}$ on both sides, we can reduce the problem to showing that
\begin{equation*}
\sum_{(a, C) \in \Sigma\times \{P_1', P_1'', \dots, P_k', P_k''\}} | \preds{a}{C} | \log | C | \leq \sum_{(a, C) \in \Sigma\times \{P_1, \dots, P_k\}} | \preds{a}{C} | \log | C |
\end{equation*}
that is, summing over all letters,
\begin{equation*}
\sum_{a \in \Sigma} \sum_{i=1}^k \left(| \preds{a}{P_i'} | \log | P_i' | + | \preds{a}{P_i''} | \log | P_i'' |\right) \leq \sum_{a \in \Sigma} \sum_{i=1}^k | \preds{a}{P_i} | \log | P_i |
\end{equation*}
\bigskip
Then, we consider the following facts\footnote{The proofs of those facts are omitted here since they are simple and not very interesting.}:
\begin{itemize}
\item A property induced by the concavity of the logarithm:
\begin{equation*}
\forall a, b, x, y \in \mathbb{R}_+^* \quad (a+b)\log (x+y) > a\log x + b\log y
\end{equation*}
\item A property on the set of predecessors and the definition of a split:
\begin{equation*}
(a, B) \text{ splits } C \text{ into } (C', C'') \Longrightarrow \preds{a}{C} = \preds{a}{C'} \cup \preds{a}{C''} \land \preds{a}{C'} \cap \preds{a}{C''} = \varnothing
\end{equation*}
from which we deduce that
\begin{equation*}
(a, B) \text{ splits } C \text{ into } (C', C'') \Longrightarrow |\preds{a}{C}| = |\preds{a}{C'}| + |\preds{a}{C''}|
\end{equation*}
\item A property on the definition of a split:
\begin{equation*}
(a, B) \text{ splits } C \text{ into } (C', C'') \Longrightarrow |C| = |C'| + |C''|
\end{equation*}
\end{itemize}
Given those properties, we show that
\begin{equation*}
\forall a \in \Sigma, \forall i \in \{1, \dots, k\} \quad | \preds{a}{P_i'} | \log | P_i' | + | \preds{a}{P_i''} | \log | P_i'' | \leq | \preds{a}{P_i} | \log | P_i |
\end{equation*}
which is sufficient to conclude the proof.
\bigskip
We now give some technical details about the proof in Isabelle. The goal is to show the following lemma:
\begin{lstlisting}[language=Isabelle]
lemma estimate_decrease: estimate ${\color{isa_dark_blue}\mathcal{A}}$ ($\isafixed{P'}$, $\isafixed{L'}$) $\leq$ estimate ${\color{isa_dark_blue}\mathcal{A}}$ ($\isafixed{P}$, $\isafixed{L}$)
\end{lstlisting}
The definition of the estimate using permutations allows us to distinguish between split and non-split blocks in the partition. First, we have to obtain a list \texttt{xs} (resp. \texttt{xs'}) permuting $\Sigma \times \mathcal{P}$ (resp. $\Sigma \times \mathcal{P'}$) such that
\begin{lstlisting}[language=Isabelle]
estimate ${\color{isa_dark_blue}\mathcal{A}}$ ($\isafixed{P}$, $\isafixed{L}$) = $\sum\isavar{s}\leftarrow\color{isa_orange}\text{xs}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * log(card(snd $\isavar{s}$))
\end{lstlisting}
resp.
\begin{lstlisting}[language=Isabelle]
estimate ${\color{isa_dark_blue}\mathcal{A}}$ ($\isafixed{P'}$, $\isafixed{L'}$) = $\sum\isavar{s}\leftarrow\color{isa_orange}\text{xs'}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * log(card(snd $\isavar{s}$))
\end{lstlisting}
Then, we obtain two lists \texttt{ys} and \texttt{zs} (resp. \texttt{ys'} and \texttt{zs'}) such that
\begin{lstlisting}[language=Isabelle]
$\color{isa_orange}\text{xs}$ = $\color{isa_orange}\text{ys}$ @ $\color{isa_orange}\text{zs}$ $\land$
$\forall$ $\isavar{e}$ $\in$ set $\color{isa_orange}\text{ys}$. snd $\isavar{e}$ $\in$ $\isafixed{P}$ $\cap$ $\isafixed{P'}$ $\land$
$\forall$ $\isavar{e}$ $\in$ set $\color{isa_orange}\text{zs}$. snd $\isavar{e}$ $\notin$ $\isafixed{P}$ $\cap$ $\isafixed{P'}$
\end{lstlisting}
resp.
\begin{lstlisting}[language=Isabelle]
$\color{isa_orange}\text{xs'}$ = $\color{isa_orange}\text{ys'}$ @ $\color{isa_orange}\text{zs'}$ $\land$
$\forall$ $\isavar{e}$ $\in$ set $\color{isa_orange}\text{ys'}$. snd $\isavar{e}$ $\in$ $\isafixed{P}$ $\cap$ $\isafixed{P'}$ $\land$
$\forall$ $\isavar{e}$ $\in$ set $\color{isa_orange}\text{zs'}$. snd $\isavar{e}$ $\notin$ $\isafixed{P}$ $\cap$ $\isafixed{P'}$
\end{lstlisting}
Just like above, by showing that \texttt{ys = ys'} we reduce the current goal to showing that
\begin{lstlisting}[language=Isabelle]
$\sum\isavar{s}\leftarrow\color{isa_orange}\text{zs}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * log(card(snd $\isavar{s}$)) $\leq$
$\sum\isavar{s}\leftarrow\color{isa_orange}\text{zs'}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * log(card(snd $\isavar{s}$))
\end{lstlisting}
\bigskip
Then, we show two key properties which will be useful to conclude the proof.
Since picking an arbitrary pair $(A, B) \in \mathcal{P'} \times \mathcal{P'}$ does not guarantee that $A$ and $B$ result from the same split, we cannot directly capture the whole idea with a single equivalence.
Instead, we prove two dual implications.
The first states that, given a block of the current partition, we can find the two blocks of the new partition resulting from its split.
Conversely, given a block of the new partition, we can recover both the other block of the new partition and the block of the current partition whose split produced the two of them.
The latter property is what makes a formulation as an equivalence tedious, since we cannot directly pick such a pair of blocks together from the new partition.
Those two properties are named \texttt{unique\_split} and \texttt{unique\_split\_conv} and are defined in Isabelle as follows:
\begin{lstlisting}[language=Isabelle]
note unique_split=
$\isafixed{B}$ $\in$ set $\color{isa_orange}\text{zs}$ $\Longrightarrow$ $\exists$!($\isavar{B1}$, $\isavar{B2}$). $\isavar{B1}$ $\in$ set $\color{isa_orange}\text{zs'}$ $\land$ $\isavar{B2}$ $\in$ set $\color{isa_orange}\text{zs'}$ $\land$
(snd $\isafixed{B}$, snd $\isavar{B1}$, snd $\isavar{B2}$) $\in$ Hopcroft_splitted ${\color{isa_dark_blue}\mathcal{A}}$ $\isafixed{C}$ $\isafixed{a}$ {} $\isafixed{P}$ $\land$
fst $\isafixed{B}$ = fst $\isavar{B1}$ $\land$ fst $\isafixed{B}$ = fst $\isavar{B2}$
\end{lstlisting}
and
\begin{lstlisting}[language=Isabelle]
note unique_split_conv=
$\isafixed{B1}$ $\in$ set $\color{isa_orange}\text{zs'}$ $\Longrightarrow$ $\exists$!($\isavar{B}$, $\isavar{B2}$). $\isavar{B}$ $\in$ set $\color{isa_orange}\text{zs}$ $\land$ $\isavar{B2}$ $\in$ set $\color{isa_orange}\text{zs'}$ $\land$
split_pred ${\color{isa_dark_blue}\mathcal{A}}$ $\isafixed{P}$ (snd $\isavar{B}$) $\isafixed{a}$ $\isafixed{C}$ (snd $\isafixed{B1}$) (snd $\isavar{B2}$) $\land$
fst $\isavar{B}$ = fst $\isafixed{B1}$ $\land$ fst $\isavar{B}$ = fst $\isavar{B2}$
\end{lstlisting}
Let us dissect those two properties and translate them into words in order to understand their meaning.
\begin{itemize}
\item[\texttt{unique\_split}:] Let $\sigma \in \Sigma$ be a label and let $B \in \mathcal{P} \setminus \mathcal{P'}$ be a block of the current partition $\mathcal{P}$ that is about to be split, i.e.\ that is not in $\mathcal{P'}$ -- which is represented in Isabelle by $(\sigma, B)$ being a member of the list \texttt{zs}.
We can find two unique blocks $B_1$ and $B_2$ of the new partition $\mathcal{P'}$ -- actually, in $\mathcal{P'} \setminus \mathcal{P}$ which is represented in Isabelle by $(\alpha, B_1)$ and $(\alpha, B_2)$ being members of the list \texttt{zs'} for some label $\alpha \in \Sigma$ -- such that $B_1$ and $B_2$ are the result of the split of $B$ by some splitter\footnote{Any splitter, in particular it can be with the label $\sigma$ or with the block $B$ itself.} -- and $B_1$ and $B_2$ are both labelled by the same letter $\sigma$ if they are splitters in the new workset, i.e.\ if $(\sigma, B) \in \mathcal{L}$ then $(\sigma, B_1) \in \mathcal{L}'$ and $(\sigma, B_2) \in \mathcal{L}'$.
This property allows us to uniquely characterize and obtain blocks resulting from a split.
\item[\texttt{unique\_split\_conv}:] Let $\sigma \in \Sigma$ be a label and let $B_1 \in \mathcal{P'} \setminus \mathcal{P}$ be a block of the new partition $\mathcal{P'}$ that is not in $\mathcal{P}$ -- which is represented in Isabelle by $(\sigma, B_1)$ being a member of the list \texttt{zs'}.
We can find two unique blocks $B$ of the current partition $\mathcal{P}$ and $B_2$ of the new partition $\mathcal{P'}$ -- actually, in $\mathcal{P} \setminus \mathcal{P'}$ (resp. $\mathcal{P'} \setminus \mathcal{P}$) which is represented in Isabelle by $(\alpha, B)$ (resp. $(\alpha, B_2)$) being a member of the list \texttt{zs} (resp. \texttt{zs'}) for some label $\alpha \in \Sigma$ -- such that $B_1$ and $B_2$ result from the split of $B$ by some splitter and $B$, $B_1$ and $B_2$ are all labelled by the same letter $\sigma$ if they are splitters in the current and new worksets, i.e.\ if $(\sigma, B_1) \in \mathcal{L}'$ then $(\sigma, B) \in \mathcal{L}$ and $(\sigma, B_2) \in \mathcal{L}'$.
\end{itemize}
Then, we define\footnote{This definition is local to this proof, thus $f$ depends on all formerly obtained variables appearing in its definition.} a function $f$ which maps a block of the current partition to the two blocks of the new partition resulting from the split of this block. This function is defined as follows:
\begin{lstlisting}[language=Isabelle]
define $\color{isa_orange}f$ = $\lambda$$\isavar{B}$. (THE ($\isavar{B1}$, $\isavar{B2}$).
$\isavar{B1}$ $\in$ set ${\color{isa_orange}\text{zs'}}$ $\land$ $\isavar{B2}$ $\in$ set ${\color{isa_orange}\text{zs'}}$ $\land$
(snd $\isavar{B}$, snd $\isavar{B1}$, snd $\isavar{B2}$) $\in$ Hopcroft_splitted $\color{isa_dark_blue}\mathcal{A}$ $\isafixed{C}$ $\isafixed{a}$ {} $\isafixed{P}$ $\land$
fst $\isavar{B}$ = fst $\isavar{B1}$ $\land$ fst $\isavar{B}$ = fst $\isavar{B2}$)
\end{lstlisting}
The operator {\color{isa_blue}\texttt{THE}} in Isabelle allows us to select ``the'' element satisfying a predicate. Thanks to the uniqueness guaranteed by the property \texttt{unique\_split}, the function $f$ is well-defined.
Then, we prove that the image of \texttt{zs} by $f$ is exactly (a permutation of) \texttt{zs'}. Formally, we prove that
\begin{lstlisting}[language=Isabelle]
$\bigcup${{fst ($\color{isa_orange}\text{f}$ $\isavar{B}$), snd ($\color{isa_orange}\text{f}$ $\isavar{B}$)} | $\isavar{B}$. $\isavar{B}$ $\in$ set $\color{isa_orange}\text{zs}$} = set $\color{isa_orange}\text{zs'}$
\end{lstlisting}
and
\begin{lstlisting}[language=Isabelle]
fold ((@) $\circ$ ($\lambda\isavar{x}$. [fst ($\color{isa_orange}\text{f}$ $\isavar{x}$), snd ($\color{isa_orange}\text{f}$ $\isavar{x}$)])) $\color{isa_orange}\text{zs}$ [] <$\midtilde\midtilde$> $\color{isa_orange}\text{zs'}$
\end{lstlisting}
\bigskip
We can then replace the expression on the blocks\footnote{Actually, the tuples of letters and blocks.} in the sum corresponding to the estimate for $(P', L')$ over \texttt{zs'} by their preimages in \texttt{zs} by $f$ and we obtain the sum for the estimate for $(P, L)$ over \texttt{zs}.
Formally, we prove that
\begin{lstlisting}[language=Isabelle]
$\sum\isavar{s}\leftarrow$$\color{isa_orange}\text{zs'}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst $\isavar{s}$) (snd $\isavar{s}$)) * log(card(snd $\isavar{s}$)) =
$\sum\isavar{s}\leftarrow$$\color{isa_orange}\text{zs}$. card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst (fst ($\color{isa_orange}\text{f}$ $\isavar{s}$))) (snd (fst ($\color{isa_orange}\text{f}$ $\isavar{s}$)))) * log(card(snd (fst ($\color{isa_orange}\text{f}$ $\isavar{s}$)))) +
card(preds ${\color{isa_dark_blue}\mathcal{A}}$ (fst (snd ($\color{isa_orange}\text{f}$ $\isavar{s}$))) (snd (snd ($\color{isa_orange}\text{f}$ $\isavar{s}$)))) * log(card(snd (snd ($\color{isa_orange}\text{f}$ $\isavar{s}$))))
\end{lstlisting}
We then use the property \texttt{unique\_split\_conv} to show that the sum on the right-hand side is bounded by the estimate for $(P, L)$ over \texttt{zs}, which concludes the proof.
\bigskip
We now consider approach (2) which requires a strict inequality.
This approach is still a work in progress and is not yet finished.
We would need to find an estimate $\Gamma$ such that for any state transition $(\mathcal{P}, \mathcal{L}) \rightarrow (\mathcal{P}', \mathcal{L}')$, we have
\begin{equation*}
\Gamma(\mathcal{P}', \mathcal{L}') < \Gamma(\mathcal{P}, \mathcal{L})
\end{equation*}
My guess is that the estimate $T$ defined in (\ref{eq:estimate}) would satisfy the strict inequality; however, I am not sure that the estimate $\hat{T}$ defined in (\ref{eq:estimate_simplified}) would. The reason for this intuition is that $T$ is defined on the workset $\mathcal{L}$, whereas $\hat{T}$ is defined on the partition $\mathcal{P}$.
Indeed, we can prove that for any state transition $(\mathcal{P}, \mathcal{L}) \rightarrow (\mathcal{P}', \mathcal{L}')$ of the algorithm, we have $\mathcal{L} \neq \mathcal{L}'$ but it may happen that $\mathcal{P} = \mathcal{P}'$.
\begin{remark}
Consider the following automaton:
\begin{figure}[!ht]
\centering
\begin{tikzpicture}[shorten >=1pt,node distance=2.5cm, on grid, >={Stealth[round]},initial text=,
every state/.style={thick}]
\node[state, initial] (q0) {$q_0$};
\node[state, above of=q0] (q1) {$q_1$};
\node[state, accepting, right of=q0] (q2) {$q_2$};
\node[state, accepting, above of=q2] (q3) {$q_3$};
\node[state, accepting, right of=q2] (q4) {$q_4$};
\node[state, right of=q3] (q5) {$q_5$};
\path[->] (q0) edge[bend left] node[left]{$\alpha$} (q1);
\path[->] (q0) edge node[below]{$\beta$} (q2);
\path[->] (q1) edge[bend left] node[right]{$\alpha$} (q0);
\path[->] (q1) edge node[above]{$\beta$} (q3);
\path[->] (q2) edge node[below]{$\alpha$} (q4);
\path[->] (q2) edge node[left=.3cm]{$\beta$} (q5);
\path[->] (q3) edge node[above]{$\beta$} (q5);
\path[->] (q3) edge node[above=.3cm]{$\alpha$} (q4);
\path[->] (q4) edge[loop right] node[right]{$\alpha$} (q4);
\path[->] (q4) edge node[right]{$\beta$} (q5);
\path[->] (q5) edge[loop right] node[right]{$\alpha,\beta$} (q5);
\end{tikzpicture}
\end{figure}
\begin{table}[!ht]
\centering
\small
\begin{tabular}{c|c|c}
Splitter & Partition $\mathcal{P}$ & Workset $\mathcal{L}$\\
\hline
-- & $\{q_0,q_1, q_5\}\{q_2, q_3, q_4\}$ & $(\alpha, \{q_0,q_1, q_5\})\ (\beta, \{q_0,q_1, q_5\})$\\
$(\beta, \{q_0,q_1, q_5\})$ & $\{q_0,q_1\}\{q_5\}\{q_2, q_3, q_4\}$ & $(\alpha, \{q_0,q_1\})\ (\alpha, \{q_5\})$\\
$(\alpha, \{q_0,q_1\})$ & $\{q_0,q_1\}\{q_5\}\{q_2, q_3, q_4\}$ & $(\alpha, \{q_5\})$\\
$(\alpha, \{q_5\})$ & $\{q_0,q_1\}\{q_5\}\{q_2, q_3, q_4\}$ & $\varnothing$\\
\end{tabular}
\end{table}
Note that the partition is stable after the second iteration of the algorithm. The workset, however, changes after each iteration.
\end{remark}
Thus, it may happen that the estimate $\hat{T}$ does not decrease after a state transition where the partition does not change.
With the estimate $T$, we are sure that at each step, a splitter is used and thus a term in the sum over $\mathcal{L}$ is removed and ``goes'' into the sum over $\Sigma \times \mathcal{P} \setminus \mathcal{L}$.
This remark is very informal.
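To make this accounting slightly less informal, suppose, purely for illustration (the actual definition is the one in \ref{eq:estimate}), that the estimate separates the workset from its complement:
\begin{equation*}
\Gamma(\mathcal{P}, \mathcal{L}) = \sum_{(a,B)\,\in\,\mathcal{L}} g(B) \;+\; \sum_{(a,B)\,\in\,(\Sigma\times\mathcal{P})\setminus\mathcal{L}} h(B),
\qquad\text{with } h(B) < g(B) \text{ for every block } B.
\end{equation*}
When a step consumes a splitter $(a, B)$ without splitting any block, the pair $(a, B)$ moves from $\mathcal{L}$ to its complement, so the estimate loses the term $g(B)$ and gains the term $h(B)$: it decreases by $g(B) - h(B) > 0$ even though $\mathcal{P}$ is unchanged. An estimate defined on $\mathcal{P}$ alone, like $\hat{T}$, cannot register this move.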
However, because of the floor operation, we still cannot guarantee that the estimate $T$ strictly decreases\footnote{Consider $\lfloor \log_2 n \rfloor$ and $\lfloor \log_2 (n+1) \rfloor$: the bigger $n$ is, the more likely it is that both values are equal. Such terms might show up in border cases, such as when a splitter splits off only one element from a block.}.
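A concrete instance of this border case: for a block of size $6$ from which a splitter splits off a single element,
\begin{equation*}
\lfloor \log_2 6 \rfloor = 2 = \lfloor \log_2 5 \rfloor + \underbrace{\lfloor \log_2 1 \rfloor}_{=\,0},
\end{equation*}
so replacing the term for the block of size $6$ by the terms for the blocks of sizes $5$ and $1$ leaves that part of the sum unchanged.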
\section{Further work}
As mentioned above, finding a proof for a strict inequality is still work in progress. It involves finding a suitable estimate, although the estimate $T$ seems to be a good candidate. However, this estimate makes the formalization more complex, since it requires considering separately the elements of the workset and of its complement, and how each is affected by a state transition.

The estimate aside, the formalization is still at an early stage, since we have only focused on the abstract level. This was nevertheless the most important part of the formalization, as it is the most complex one. The next steps would be to dive into the successive refinements and refine the abstract currencies. The formalization of the concrete currencies should be more straightforward, since it mostly amounts to implementing the abstract currencies with the chosen data structures, which probably already exist.
% \printglossary[type=\acronymtype]
% \addcontentsline{toc}{section}{CV}
% \includepdf{cv.pdf}
\pagebreak
% bibliography
\bibliographystyle{alpha}
\bibliography{ref}
\end{document}