## [7860bc]: thys / Collections / Userguide.thy  Maximize  Restore  History

### 565 lines (469 with data), 32.2 kB

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 (* Title: Isabelle Collections Library Author: Peter Lammich Maintainer: Peter Lammich *) header {* \chapter{Isabelle Collections Framework Userguide} *} theory Userguide imports Collections Efficient_Nat begin text_raw {*\label{thy:Userguide}*} section "Introduction" text {* The Isabelle Collections Framework defines interfaces of various collection types and provides some standard implementations and generic algorithms. The relation between the data structures of the collection framework and standard Isabelle types (e.g. for sets and maps) is established by abstraction functions. Currently, the following interfaces and data-structures are provided by the Isabelle Collections Framework: \begin{itemize} \item Set and map implementations based on (associative) lists, red-black trees, hashing and tries. \item An implementation of a FIFO-queue based on two stacks. \item Annotated lists implemented by finger trees. \item Priority queues implemented by binomial heaps, skew binomial heaps, and annotated lists (via finger trees). \end{itemize} The red-black trees are imported from the standard isabelle library. The binomial and skew binomial heaps are imported from the {\em Binomial-Heaps} entry of the archive of formal proofs. The finger trees are imported from the {\em Finger-Trees} entry of the archive of formal proofs. *} subsection "Getting Started" text {* To get started with the Isabelle Collections Framework (assuming that you are already familiar with Isabelle/HOL and Isar), you should first read the introduction (this section), that provides many basic examples. Section~\ref{sec:userguide.structure} explains the concepts of the Isabelle Collections Framework in more detail. Section~\ref{sec:userguide.ext} provides information on extending the framework along with detailed examples, and Section~\ref{sec:userguide.design} contains a discussion on the design of this framework. There is also a paper \cite{LammichLochbihler2010ITP} on the design of the Isabelle Collections Framework available. *} subsection "Introductory Example" text {* We introduce the Isabelle Collections Framework by a simple example. Given a set of elements represented by a red-black tree, and a list, we want to filter out all elements that are not contained in the set. This can be done by Isabelle/HOL's @{const filter}-function\footnote{Note that Isabelle/HOL uses the list comprehension syntax @{term [source] "[x\l. P x]"} as syntactic sugar for filtering a list.}: *} definition rbt_restrict_list :: "'a::linorder rs \ 'a list \ 'a list" where "rbt_restrict_list s l == [ x\l. rs_memb x s ]" text {* The type @{typ "'a rs"} is the type of sets backed by red-black trees. Note that the element type of sets backed by red-black trees must be of sort @{text linorder}. The function @{const rs_memb} tests membership on such sets. *} text {* Next, we show correctness of our function: *} lemma rbt_restrict_list_correct: assumes [simp]: "rs_invar s" shows "rbt_restrict_list s l = [x\l. x\rs_\ s]" by (simp add: rbt_restrict_list_def rs.memb_correct) text {* The lemma @{thm [source] rs.memb_correct}: @{thm [display] rs.memb_correct[no_vars]} states correctness of the @{const rs_memb}-function. The function @{const rs_\} maps a red-black-tree to the set that it represents. Moreover, we have to explicitely keep track of the invariants of the used data structure, in this case red-black trees. The premise @{thm (prem 1) rs.memb_correct} represents the invariant assumption for the collection data structure. Red-black-trees are invariant-free, so this defaults to @{term "True"}. For uniformity reasons, these (unnecessary) invariant assumptions are present in all correctness lemmata. Many of the correctness lemmas for standard RBT-set-operations are summarized by the lemma @{thm [source] rs_correct}: @{thm [display] rs_correct[no_vars]} *} text {* All implementations provided by this library are compatible with the Isabelle/HOL code-generator. Now follow some examples of using the code-generator and the related value-command: *} text {* There are conversion functions from lists to sets and, vice-versa, from sets to lists: *} value "list_to_rs [1::int..10]" value "rs_to_list (list_to_rs [1::int .. 10])" value "rs_to_list (list_to_rs [1::int,5,6,7,3,4,9,8,2,7,6])" text {* Note that sets make no guarantee about ordering, hence the only thing we can prove about conversion from sets to lists is: @{thm [source] rs.to_list_correct}: @{thm [display] rs.to_list_correct[no_vars]} *} value "rbt_restrict_list (list_to_rs [1::nat,2,3,4,5]) [1::nat,9,2,3,4,5,6,5,4,3,6,7,8,9]" definition "test n = (list_to_rs [(1::int)..n])" ML {* @{code test} 9000 *} subsection "Theories" text {* To make available the whole collections framework to your formalization, import the theory @{theory Collections}. Other theories in the Isabelle Colelction Framework include: \begin{description} \item[@{theory SetSpec}] Specification of sets and set functions \item[@{theory OrderedSet}] Specification of ordered sets and set functions \item[@{theory SetGA}] Generic algorithms for sets \item[@{theory SetStdImpl}] Standard set implementations (list, rb-tree, hashing, tries) \item[@{theory MapSpec}] Specification of maps \item[@{theory OrderedMap}] Specification of ordered maps \item[@{theory MapGA}] Generic algorithms for maps \item[@{theory MapStdImpl}] Standard map implementations (list,rb-tree, hashing, tries) \item[@{theory Algos}] Various generic algorithms \item[@{theory SetIndex}] Generic algorithm for building indices of sets \item[@{theory ListSpec}] Specification of lists \item[@{theory Fifo}] Amortized fifo queue \item[@{theory DatRef}] Data refinement for the while combinator \end{description} *} subsection "Iterators" text {* An important concept when using collections are iterators. An iterator is a kind of generalized fold-functional. Like the fold-functional, it applies a function to all elements of a set and modifies a state. There are no guarantees about the iteration order. But, unlike the fold functional, you can prove useful properties of iterations even if the function is not left-commutative. Proofs about iterations are done in invariant style, establishing an invariant over the iteration. The iterator combinator for red-black tree sets is @{const rs_iterate}, and the proof-rule that is usually used is: @{thm [source] rs.iterate_rule_P}: @{thm [display] rs.iterate_rule_P[no_vars]} The invariant @{term I} is parameterized with the set of remaining elements that have not yet been iterated over and the current state. The invariant has to hold for all elements remaining and the initial state: @{term "I (rs_\ S) \0"}. Moreover, the invariant has to be preserved by an iteration step: @{term [display] "\x it \. \x \ it; it \ rs_\ S; I it \\ \ I (it - {x}) (f x \)"} And the proposition to be shown for the final state must be a consequence of the invarant for no elements remaining: @{term "\\. I {} \ \ P \"}. A generalization of iterators are {\em interruptible iterators} where iteration is only continues while some condition on the state holds. Reasoning over interruptible iterators is also done by invariants: @{thm [source] rs.iteratei_rule_P}: @{thm [display] rs.iteratei_rule_P[no_vars]} Here, interruption of the iteration is handled by the premise @{term [display] "\\ it. \it \ rs_\ S; it \ {}; \ c \; I it \\ \ P \"} that shows the proposition from the invariant for any intermediate state of the iteration where the continuation condition does not hold (and thus the iteration is interrupted). *} text {* As an example of reasoning about results of iterators, we implement a function that converts a hashset to a list that contains precisely the elements of the set. *} definition "hs_to_list' s == hs_iterate (op #) s []" text {* The correctness proof works by establishing the invariant that the list contains all elements that have already been iterated over. Again @{term "hs_invar s"} denotes the invariant for hashsets which defaults to @{term "True"}. *} lemma hs_to_list'_correct: assumes INV: "hs_invar s" shows "set (hs_to_list' s) = hs_\ s" apply (unfold hs_to_list'_def) apply (rule_tac I="\it \. set \ = hs_\ s - it" in hs.iterate_rule_P[OF INV]) txt {* The resulting proof obligations are easily discharged using auto: *} apply auto done text {* As an example for an interruptible iterator, we define a bounded existential-quantification over the list elements. As soon as the first element is found that fulfills the predicate, the iteration is interrupted. The state of the iteration is simply a boolean, indicating the (current) result of the quantification: *} definition "hs_bex s P == hs_iteratei (\\. \ \) (\x \. P x) s False" lemma hs_bex_correct: "hs_invar s \ hs_bex s P \ (\x\hs_\ s. P x)" apply (unfold hs_bex_def) txt {* The invariant states that the current result matches the result of the quantification over the elements already iterated over: *} apply (rule_tac I="\it \. \ \ (\x\hs_\ s - it. P x)" in hs.iteratei_rule_P) txt {* The resulting proof obligations are easily discharged by auto: *} apply auto done section "Structure of the Framework" text_raw {*\label{sec:userguide.structure}*} text {* The concepts of the framework are roughly based on the object-oriented concepts of interfaces, implementations and generic algorithms. The concepts used in the framework are the following: \begin{description} \item[Interfaces] An interface describes some concept by providing an abstraction mapping $\alpha$ to a related Isabelle/HOL-concept. The definition is generic in the datatype used to implement the concept (i.e. the concrete data structure). An interface is specified by means of a locale that fixes the abstraction mapping and an invariant. For example, the set-interface contains an abstraction mapping to sets, and is specified by the locale @{text SetSpec.set}. An interface roughly matches the concept of a (collection) interface in Java, e.g. {\em java.util.Set}. \item[Functions] A function specifies some functionality involving interfaces. A function is specified by means of a locale. For example, membership query for a set is specified by the locale @{const [source] SetSpec.set_memb} and equality test between two sets is a function specified by @{const [source] SetSpec.set_equal}. A function roughly matches a method declared in an interface, e.g. {\em java.util.Set\#contains, java.util.Set\#equals}. \item[Generic Algorithms] A generic algorithm specifies, in a generic way, how to implement a function using other functions. For example, the equality test for sets may be implemented using a subset function. It is described by the constant @{const [source] SetGA.subset_equal} and the corresponding lemma @{thm [source] SetGA.subset_equal_correct}. There is no direct match of generic algorithms in the Java Collections Framework. The most related concept are abstract collection interfaces, that provide some default algorithms, e.g. {\em java.util.AbstractSet}. The concept of {\em Algorithm} in the C++ Standard Template Library \cite{C++STL} matches the concept of Generic Algorithm quite well. \item[Implementation] An implementation of an interface provides a data structure for that interface together with an abstraction mapping and an invariant. Moreover, it provides implementations for some (or all) functions of that interface. For example, red-black trees are an implementation of the set-interface, with the abstraction mapping @{const rs_\} and invariant @{const rs_invar}; and the constant @{const rs_ins} implements the insert-function, as stated by the lemma @{thm [source] rs_ins_impl}. An implementation matches a concrete collection interface in Java, e.g. {\em java.util.TreeSet}, and the methods implemented by such an interface, e.g. {\em java.util.TreeSet\#add}. \item[Instantiation] An instantiation of a generic algorithm provides actual implementations for the used functions. For example, the generic equality-test algorithm can be instantiated to use red-black-trees for both arguments (resulting in the function @{const rr_equal} and the lemma @{thm [source] rr_equal_impl}). While some of the functions of an implementation need to be implemented specifically, many functions may be obtained by instantiating generic algorithms. In Java, instantiation of a generic algorithm is matched most closely by inheriting from an abstract collection interface. In the C++ Standard Template Library instantiation of generic algorithms is done implicitely when using them. \end{description} *} subsection "Naming Conventions" text {* The Isabelle Collections Framework follows these general naming conventions. Each implementation has a two-letter (or three-letter) and a one-letter (or two-letter) abbreviation, that are used as prefixes for the related constants, lemmas and instantiations. The two-letter and three-letter abbreviations should be unique over all interfaces and instantiations, the one-letter abbreviations should be unique over all implementations of the same interface. Names that reference the implementation of only one interface are prefixed with that implementation's two-letter abbreviation (e.g. @{const hs_ins} for insertion into a HashSet (hs,h)), names that reference more than one implementation are prefixed with the one-letter (or two-letter) abbreviations (e.g. @{const lhh_union} for set union between a ListSet(ls,l) and a HashSet(hs,h), yielding a HashSet) Currently, there are the following abbreviations: \begin{description} \item[lm,l] List Map \item[lmi,li] List Map with explicit invariant \item[rm,r] RB-Tree Map \item[hm,h] Hash Map \item[ahm,a] Array-based hash map \item[tm,t] Trie Map \item[ls,l] List Set \item[lsi,li] List Set with explicit invariant \item[rs,r] RB-Tree Set \item[hs,h] Hash Set \item[ahs,a] Array-based hash map \item[ts,t] Trie Set \end{description} Each function @{text name} of an interface @{text interface} is declared in a locale @{text interface_name}. This locale provides a fact @{text name_correct}. For example, there is the locale @{const set_ins} providing the fact @{thm [source] set_ins.ins_correct}. An implementation instantiates the locales of all implemented functions, using its two-letter abbreviation as instantiation prefix. For example, the HashSet-implementation instantiates the locale @{const set_ins} with the prefix {\em hs}, yielding the lemma @{thm [source] hs.ins_correct}. Moreover, an implementation with two-letter abbreviation {\em aa} provides a lemma @{text aa_correct} that summarizes the correctness facts for the basic operations. It should only contain those facts that are safe to be used with the simplifier. E.g., the correctness facts for basic operations on hash sets are available via the lemma @{thm [source] hs_correct}. *} section "Extending the Framework" text_raw {*\label{sec:userguide.ext}*} text {* This section illustrates, by example, how to add new interfaces, functions, generic algorithms and implementations to the framework: *} subsection "Interfaces" text {* An interface provides a new concept, that is usually mapped to a related Isabelle/HOL-concept. An interface is defined by providing a locale that fixes an abstraction mapping and an invariant. For example, consider the definition of an interface for sets: *} locale set' = -- "Abstraction mapping to Isabelle/HOL sets" fixes \ :: "'s \ 'a set" -- "Invariant" fixes invar :: "'s \ bool" text {* The invariant makes it possible for an implementation to restrict to certain subsets of the type's universal set. Usually, it is convenient to hide this invariant in a typedef and to set up the code generator approriately. However, in some cases such invariants may enable more efficient implementations (e.g. disoint insert for distinct lists), so all specifications should be with respect to a implementation-provided invariant. Most implementations will just set this invariant to @{text "\_. True"}. *} subsection "Functions" text {* A function describes some operation on instances of an interface. It is specified by providing a locale that includes the locale of the interface, fixes a parameter for the operation and makes a correctness assumption. For an interface @{text interface} and an operation @{text name}, the function's locale has the name @{text interface_name}, the fixed parameter has the name @{text name} and the correctness assumption has the name @{text name_correct}. As an example, consider the specifications of the insert function for sets and the empty set: *} locale set'_ins = set' + -- "Give reasonable names to types:" constrains \ :: "'s \ 'a set" -- "Parameter for function:" fixes ins :: "'a \ 's \ 's" -- {* Correctness assumption. A correctness assumption usually consists of two parts: \begin{itemize} \item A description of the operation on the abstract level, assuming that the operands satisfy the invariants. \item The invariant preservation assumptions, i.e. assuming that the result satisfies its invariants if the operands do. \end{itemize} *} assumes ins_correct: "invar s \ \ (ins x s) = insert x (\ s)" "invar s \ invar (ins x s)" locale set'_empty = set' + constrains \ :: "'s \ 'a set" fixes empty :: "'s" assumes empty_correct: "\ empty = {}" "invar empty" text {* In general, more than one interface or more than one instance of the same interface may be involved in a function. Consider, for example, the intersection of two sets. It involves three instances of set interfaces, two for the operands and one for the result: *} locale set'_inter = set' \1 invar1 + set' \2 invar2 + set' \3 invar3 for \1 :: "'s1 \ 'a set" and invar1 and \2 :: "'s2 \ 'a set" and invar2 and \3 :: "'s3 \ 'a set" and invar3 + fixes inter :: "'s1 \ 's2 \ 's3" assumes inter_correct: "\invar1 s1; invar2 s2\ \ \3 (inter s1 s2) = \1 s1 \ \2 s2" "\invar1 s1; invar2 s2\ \ invar3 (inter s1 s2)" text {* For use in further examples, we also specify a function that converts a list to a set *} locale set'_list_to_set = set' + constrains \ :: "'s \ 'a set" fixes list_to_set :: "'a list \ 's" assumes list_to_set_correct: "\ (list_to_set l) = set l" "invar (list_to_set l)" subsection "Generic Algorithm" text {* A generic algorithm describes how to implement a function using implementations of other functions. Thereby, it is parametric in the actual implementations of the functions. A generic algorithm comes with the definition of a function and a correctness lemma. The function takes the required functions as arguments. The convention for argument order is that the required functions come first, then the implemented function's arguments. Consider, for example, the generic algorithm to convert a list to a set\footnote{To keep the presentation simple, we use a non-tail-recursive version here}. This function requires implementations of the {\em empty} and {\em ins} functions\footnote{Due to name-clashes with @{const [long_names] Map.empty} we have to use slightly different parameter names here}: *} fun list_to_set' :: "'s \ ('a \ 's \ 's) \ 'a list \ 's" where "list_to_set' empt ins' [] = empt" | "list_to_set' empt ins' (a#ls) = ins' a (list_to_set' empt ins' ls)" lemma list_to_set'_correct: fixes empty ins -- {* Assumptions about the required function implementations: *} assumes "set'_empty \ invar empty" assumes "set'_ins \ invar ins" -- {* Provided function: *} shows "set'_list_to_set \ invar (list_to_set' empty ins)" proof - interpret set'_empty \ invar empty by fact interpret set'_ins \ invar ins by fact { fix l have "\ (list_to_set' empty ins l) = set l \ invar (list_to_set' empty ins l)" by (induct l) (simp_all add: empty_correct ins_correct) } thus ?thesis by unfold_locales auto qed text_raw {*\paragraph{Generic Algorithms with ad-hoc function specification}*} text {* The collection framework also contains a few generic algorithms that do not implement a function that is specified via a locale, but the function is specified ad-hoc within the correctness lemma. An example is the generic algorithm @{const [source] Algos.map_to_nat} that computes an injective map from the elements of a given finite set to an initial segment of the natural numbers. There is no locale specifying such a function, but the function is implicitly specified by the correctness lemma @{thm [source] map_to_nat_correct}: @{thm [display] map_to_nat_correct[no_vars]} This kind of ad-hoc specification should only be used when it is unlikely that the same function may be implemented differently. *} subsection {* Implementation *} text {* An implementation of an interface defines an actual data structure, an invariant, and implementations of the functions. An implementation has a two-letter (or three-letter) abbreviation that should be unique and a one-letter (or two-letter) abbreviation that should be unique amongst all implementations of the same interface. Consider, for example, a set implementation by distinct lists. It has the abbreviations (lsi,li). To avoid name clashes with the existing list-set implementation in the framework, we use ticks (') here and there to disambiguate the names. *} -- "The type of the data structure should be available as the two-letter abbreviation: " types 'a lsi' = "'a list" -- "The abstraction function:" definition "lsi'_\ == set" -- "The invariant: In our case we constrain the lists to be distinct:" definition "lsi'_invar == distinct" -- "The locale of the interface is interpreted with the two-letter abbreviation as prefix:" interpretation lsi': set' lsi'_\ lsi'_invar . text {* Next, we implement some functions. The implementation of a function @{text name} is prefixed by the two-letter prefix: *} definition "lsi'_empty == []" text {* Each function implementation has a corresponding lemma that shows the instantiation of the locale. It is named by the function's name suffixed with @{text "_impl"}: *} lemma lsi'_empty_impl: "set'_empty lsi'_\ lsi'_invar lsi'_empty" by (unfold_locales) (auto simp add: lsi'_empty_def lsi'_invar_def lsi'_\_def) text {* The corresponding function's locale is interpreted with the function implementation and the interface's two-letter abbreviation as prefix: *} interpretation lsi': set'_empty lsi'_\ lsi'_invar lsi'_empty using lsi'_empty_impl . text {* This generates the lemma @{thm [source] lsi'.empty_correct}: @{thm [display] lsi'.empty_correct[no_vars]} *} definition "lsi'_ins x l == if x\set l then l else x#l" text {* Correctness may optionally be established using separate lemmas. These should be suffixed with {\em \_aux} to indicate that they should not be used by other proofs:*} lemma lsi'_ins_correct_aux: "lsi'_invar l \ lsi'_\ (lsi'_ins x l) = insert x (lsi'_\ l)" "lsi'_invar l \ lsi'_invar (lsi'_ins x l)" by (auto simp add: lsi'_ins_def lsi'_invar_def lsi'_\_def) lemma lsi'_ins_impl: "set'_ins lsi'_\ lsi'_invar lsi'_ins" by unfold_locales (simp_all add: lsi'_ins_correct_aux) interpretation lsi': set'_ins lsi'_\ lsi'_invar lsi'_ins using lsi'_ins_impl . subsection {* Instantiations (Generic Algorithm)*} text {* The instantiation of a generic algorithm substitutes actual implementations for the required functions. A generic algorithm is instantiated by providing a definition that fixes the function parameters accordingly. Moreover, an @{text "impl"}-lemma and an interpretation of the implemented function's locale is provided. These can usually be constructed canonically from the generic algorithm's correctness lemma: For example, consider conversion from lists to list-sets by instantiating the @{const list_to_set'}-algorithm: *} definition "lsi'_list_to_set == list_to_set' lsi'_empty lsi'_ins" lemmas lsi'_list_to_set_impl = list_to_set'_correct[OF lsi'_empty_impl lsi'_ins_impl, folded lsi'_list_to_set_def] interpretation lsi': set'_list_to_set lsi'_\ lsi'_invar lsi'_list_to_set using lsi'_list_to_set_impl . text {* Note that the actual framework slightly deviates from the naming convention here, the corresponding instantiation of @{const [source] SetGA.gen_list_to_set} is called @{const list_to_ls}, the @{text impl}-lemma is called @{thm [source] list_to_ls_impl}. *} text {* Generating all possible instantiations of generic algorithms based on the available implementations can be done mechanically. Currently, we have not implemented such an approach on the Isabelle ML-level. However, we used an ad-hoc ruby-script ({\em scripts/inst.rb}) to generate the thy-file {\em StdInst.thy} from the file {\em StdInst.in.thy}. *} section "Design Issues" text_raw {*\label{sec:userguide.design}*} text {* In this section, we motivate some of the design decisions of the Isabelle Collections Framework and report our experience with alternatives. Many of the design decisions are justified by restrictions of Isabelle/HOL and the code generator, so that there may be better options if those restrictions should vanish from future releases of Isabelle/HOL. *} text {* The main design goals of this development are: \begin{enumerate} \item\label{dg_unified} Make available various implementations of collections under a unified interface. \item\label{dg_extensible} It should be easy to extend the framework by new interfaces, functions, algorithms, and implementations. \item\label{dg_concise} Allow simple and concise reasoning over functions using collections. \item\label{dg_genalgo} Allow generic algorithms, that are independent of the actual data structure that is used. \item\label{dg_exec} Support generation of executable code. \item\label{dg_control} Let the user precisely control what data structures are used in the implementation. \end{enumerate} *} subsection {* Data Refinement *} text {* In order to allow simple reasoning over collections, we use a data refinement approach. Each collection interface has an abstraction function that maps it on a related Isabelle/HOL concept (abstract level). The specification of functions are also relative to the abstraction. This allows most of the correctness reasoning to be done on the abstract level. On this level, the tool support is more elaborated and one is not yet fixed to a concrete implementation. In a next step, the abstract specification is refined to use an actual implementation (concrete level). The correctness properties proven on the abstract level usually transfer easily to the concrete level. Moreover, the user has precise control how the refinement is done, i.e. what data structures are used. An alternative would be to do refinement completely automatic, as e.g. done in the code generator setup of the Theory~{\em Executable-Set}. This has the advantage that it induces less writing overhead. The disadvantage is that the user loses a great amount of control over the refinement. For example, in {\em Executable-Set}, all sets have to be represented by lists, and there is no possibility to represent one set differently from another. *} subsection {* Record Based Interfaces *} text {* We have experimented with grouping functions of an interface together via a record. This has the advantage that parameterization of generic algorithms becomes simpler, as multiple function parameters are replaced by a single record parameter. For maps and sets, theories @{theory RecordSetImpl} and @{theory RecordMapImpl} provide these instantiations for all implementations (except for tries). Moreover, the priority queue implementations contain such records for all important operations. The records do not include operations that depend on extra type variables because these operations would become monomorphic due to Isabelle's type system restrictions. *} subsection {* Locales for Generic Algorithms *} text {* Another tempting possibility to define a generic algorithm is to define a locale that includes the locales of all required functions, and do the definition of the generic algorithm inside that locale. This has the advantage that the function parameters are made implicit, thus improving readability. On the other hand, the code generator has problems with generating code from definitions inside a locale. Currently, one has to manually set up the code generator for such definitions. Moreover, when fixing function parameters in the declaration of the locale, their types will be inferred independently of the definitions later done in the locale context. In order to get the correct types, one has to add explicit type constraints. These tend to become rather lengthy, especially for iterator states. The approach taken in this framework -- passing the required functions as explicit parameters to a generic algorithm -- usually needs less type constraints, as type inference usually does most of the job, in particular it infers the correct types of iterator states. *} subsection {* Explicit Invariants vs Typedef *} text {* The interfaces of this framework use explicit invariants. This provides a more general specification which allows some operations to be sometimes implemented more efficiently, cf. @{term "lsi_ins_dj"} in @{theory ListSetImpl_Invar}. Most implementations, however, hide the invariant in a typedef and setup the code generator appropriately. In that case, the invariant is just @{term "\_. True"}, which still shows up in some premises and conclusions due to uniformity reasons. *} end