wneuper/isa: doc-src/TutorialI/Inductive/AB.thy@20e3616b2d9c

     1 (*<*)theory AB imports Main begin(*>*)

     3 section{*Case Study: A Context Free Grammar*}

     5 text{*\label{sec:CFG}

     6 \index{grammars!defining inductively|(}%

     7 Grammars are nothing but shorthands for inductive definitions of nonterminals

     8 which represent sets of strings. For example, the production

     9 $A \to B c$ is short for

    10 \[ w \in B \Longrightarrow wc \in A \]

    11 This section demonstrates this idea with an example

    12 due to Hopcroft and Ullman, a grammar for generating all words with an

    13 equal number of $a$'s and~$b$'s:

    14 \begin{eqnarray}

    15 S &\to& \epsilon \mid b A \mid a B \nonumber\\

    16 A &\to& a S \mid b A A \nonumber\\

    17 B &\to& b S \mid a B B \nonumber

    18 \end{eqnarray}

    19 At the end we say a few words about the relationship between

    20 the original proof \cite[p.\ts81]{HopcroftUllman} and our formal version.

    22 We start by fixing the alphabet, which consists only of @{term a}'s

    23 and~@{term b}'s:

    24 *}

    26 datatype alfa = a | b

    28 text{*\noindent

    29 For convenience we include the following easy lemmas as simplification rules:

    30 *}

    32 lemma [simp]: "(x \<noteq> a) = (x = b) \<and> (x \<noteq> b) = (x = a)"

    33 by (case_tac x, auto)

    35 text{*\noindent

    36 Words over this alphabet are of type @{typ"alfa list"}, and

    37 the three nonterminals are declared as sets of such words.

    38 The productions above are recast as a \emph{mutual} inductive

    39 definition\index{inductive definition!simultaneous}

    40 of @{term S}, @{term A} and~@{term B}:

    41 *}

    43 inductive_set

    44   S :: "alfa list set" and

    45   A :: "alfa list set" and

    46   B :: "alfa list set"

    47 where

    48   "[] \<in> S"

    49 | "w \<in> A \<Longrightarrow> b#w \<in> S"

    50 | "w \<in> B \<Longrightarrow> a#w \<in> S"

    52 | "w \<in> S        \<Longrightarrow> a#w   \<in> A"

    53 | "\<lbrakk> v\<in>A; w\<in>A \<rbrakk> \<Longrightarrow> b#v@w \<in> A"

    55 | "w \<in> S            \<Longrightarrow> b#w   \<in> B"

    56 | "\<lbrakk> v \<in> B; w \<in> B \<rbrakk> \<Longrightarrow> a#v@w \<in> B"

    58 text{*\noindent

    59 First we show that all words in @{term S} contain the same number of @{term

    60 a}'s and @{term b}'s. Since the definition of @{term S} is by mutual

    61 induction, so is the proof: we show at the same time that all words in

    62 @{term A} contain one more @{term a} than @{term b} and all words in @{term

    63 B} contain one more @{term b} than @{term a}.

    64 *}

    66 lemma correctness:

    67   "(w \<in> S \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b])     \<and>

    68    (w \<in> A \<longrightarrow> size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1) \<and>

    69    (w \<in> B \<longrightarrow> size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1)"

    71 txt{*\noindent

    72 These propositions are expressed with the help of the predefined @{term

    73 filter} function on lists, which has the convenient syntax @{text"[x\<leftarrow>xs. P

    74 x]"}, the list of all elements @{term x} in @{term xs} such that @{prop"P x"}

    75 holds. Remember that on lists @{text size} and @{text length} are synonymous.

    77 The proof itself is by rule induction and afterwards automatic:

    78 *}

    80 by (rule S_A_B.induct, auto)

    82 text{*\noindent

    83 This may seem surprising at first, and is indeed an indication of the power

    84 of inductive definitions. But it is also quite straightforward. For example,

    85 consider the production $A \to b A A$: if $v,w \in A$ and the elements of $A$

    86 contain one more $a$ than~$b$'s, then $bvw$ must again contain one more $a$

    87 than~$b$'s.

    89 As usual, the correctness of syntactic descriptions is easy, but completeness

    90 is hard: does @{term S} contain \emph{all} words with an equal number of

    91 @{term a}'s and @{term b}'s? It turns out that this proof requires the

    92 following lemma: every string with two more @{term a}'s than @{term

    93 b}'s can be cut somewhere such that each half has one more @{term a} than

    94 @{term b}. This is best seen by imagining counting the difference between the

    95 number of @{term a}'s and @{term b}'s starting at the left end of the

    96 word. We start with 0 and end (at the right end) with 2. Since each move to the

    97 right increases or decreases the difference by 1, we must have passed through

    98 1 on our way from 0 to 2. Formally, we appeal to the following discrete

    99 intermediate value theorem @{thm[source]nat0_intermed_int_val}

   100 @{thm[display,margin=60]nat0_intermed_int_val[no_vars]}

   101 where @{term f} is of type @{typ"nat \<Rightarrow> int"}, @{typ int} are the integers,

   102 @{text"\<bar>.\<bar>"} is the absolute value function\footnote{See

   103 Table~\ref{tab:ascii} in the Appendix for the correct \textsc{ascii}

   104 syntax.}, and @{term"1::int"} is the integer 1 (see \S\ref{sec:numbers}).

   106 First we show that our specific function, the difference between the

   107 numbers of @{term a}'s and @{term b}'s, does indeed only change by 1 in every

   108 move to the right. At this point we also start generalizing from @{term a}'s

   109 and @{term b}'s to an arbitrary property @{term P}. Otherwise we would have

   110 to prove the desired lemma twice, once as stated above and once with the

   111 roles of @{term a}'s and @{term b}'s interchanged.

   112 *}

   114 lemma step1: "\<forall>i < size w.

   115   \<bar>(int(size[x\<leftarrow>take (i+1) w. P x])-int(size[x\<leftarrow>take (i+1) w. \<not>P x]))

   116    - (int(size[x\<leftarrow>take i w. P x])-int(size[x\<leftarrow>take i w. \<not>P x]))\<bar> \<le> 1"

   118 txt{*\noindent

   119 The lemma is a bit hard to read because of the coercion function

   120 @{text"int :: nat \<Rightarrow> int"}. It is required because @{term size} returns

   121 a natural number, but subtraction on type~@{typ nat} will do the wrong thing.

   122 Function @{term take} is predefined and @{term"take i xs"} is the prefix of

   123 length @{term i} of @{term xs}; below we also need @{term"drop i xs"}, which

   124 is what remains after that prefix has been dropped from @{term xs}.

   126 The proof is by induction on @{term w}, with a trivial base case, and a not

   127 so trivial induction step. Since it is essentially just arithmetic, we do not

   128 discuss it.

   129 *}

   131 apply(induct_tac w)

   132 apply(auto simp add: abs_if take_Cons split: nat.split)

   133 done

   135 text{*

   136 Finally we come to the above-mentioned lemma about cutting in half a word with two more elements of one sort than of the other sort:

   137 *}

   139 lemma part1:

   140  "size[x\<leftarrow>w. P x] = size[x\<leftarrow>w. \<not>P x]+2 \<Longrightarrow>

   141   \<exists>i\<le>size w. size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1"

   143 txt{*\noindent

   144 This is proved by @{text force} with the help of the intermediate value theorem,

   145 instantiated appropriately and with its first premise disposed of by lemma

   146 @{thm[source]step1}:

   147 *}

   149 apply(insert nat0_intermed_int_val[OF step1, of "P" "w" "1"])

   150 by force

   152 text{*\noindent

   154 Lemma @{thm[source]part1} tells us only about the prefix @{term"take i w"}.

   155 An easy lemma deals with the suffix @{term"drop i w"}:

   156 *}

   159 lemma part2:

   160   "\<lbrakk>size[x\<leftarrow>take i w @ drop i w. P x] =

   161     size[x\<leftarrow>take i w @ drop i w. \<not>P x]+2;

   162     size[x\<leftarrow>take i w. P x] = size[x\<leftarrow>take i w. \<not>P x]+1\<rbrakk>

   163    \<Longrightarrow> size[x\<leftarrow>drop i w. P x] = size[x\<leftarrow>drop i w. \<not>P x]+1"

   164 by(simp del: append_take_drop_id)

   166 text{*\noindent

   167 In the proof we have disabled the normally useful lemma

   168 \begin{isabelle}

   169 @{thm append_take_drop_id[no_vars]}

   170 \rulename{append_take_drop_id}

   171 \end{isabelle}

   172 to allow the simplifier to apply the following lemma instead:

   173 @{text[display]"[x\<in>xs@ys. P x] = [x\<in>xs. P x] @ [x\<in>ys. P x]"}

   175 To dispose of trivial cases automatically, the rules of the inductive

   176 definition are declared simplification rules:

   177 *}

   179 declare S_A_B.intros[simp]

   181 text{*\noindent

   182 This could have been done earlier but was not necessary so far.

   184 The completeness theorem tells us that if a word has the same number of

   185 @{term a}'s and @{term b}'s, then it is in @{term S}, and similarly

   186 for @{term A} and @{term B}:

   187 *}

   189 theorem completeness:

   190   "(size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b]     \<longrightarrow> w \<in> S) \<and>

   191    (size[x\<leftarrow>w. x=a] = size[x\<leftarrow>w. x=b] + 1 \<longrightarrow> w \<in> A) \<and>

   192    (size[x\<leftarrow>w. x=b] = size[x\<leftarrow>w. x=a] + 1 \<longrightarrow> w \<in> B)"

   194 txt{*\noindent

   195 The proof is by induction on @{term w}. Structural induction would fail here

   196 because, as we can see from the grammar, we need to make bigger steps than

   197 merely appending a single letter at the front. Hence we induct on the length

   198 of @{term w}, using the induction rule @{thm[source]length_induct}:

   199 *}

   201 apply(induct_tac w rule: length_induct)

   202 apply(rename_tac w)

   204 txt{*\noindent

   205 The @{text rule} parameter tells @{text induct_tac} explicitly which induction

   206 rule to use. For details see \S\ref{sec:complete-ind} below.

   207 In this case the result is that we may assume the lemma already

   208 holds for all words shorter than @{term w}. Because the induction step renames

   209 the induction variable we rename it back to @{text w}.

   211 The proof continues with a case distinction on @{term w},

   212 on whether @{term w} is empty or not.

   213 *}

   215 apply(case_tac w)

   216  apply(simp_all)

   217 (*<*)apply(rename_tac x v)(*>*)

   219 txt{*\noindent

   220 Simplification disposes of the base case and leaves only a conjunction

   221 of two step cases to be proved:

   222 if @{prop"w = a#v"} and @{prop[display]"size[x\<in>v. x=a] = size[x\<in>v. x=b]+2"} then

   223 @{prop"b#v \<in> A"}, and similarly for @{prop"w = b#v"}.

   224 We only consider the first case in detail.

   226 After breaking the conjunction up into two cases, we can apply

   227 @{thm[source]part1} to the assumption that @{term w} contains two more @{term

   228 a}'s than @{term b}'s.

   229 *}

   231 apply(rule conjI)

   232  apply(clarify)

   233  apply(frule part1[of "\<lambda>x. x=a", simplified])

   234  apply(clarify)

   235 txt{*\noindent

   236 This yields an index @{prop"i \<le> length v"} such that

   237 @{prop[display]"length [x\<leftarrow>take i v . x = a] = length [x\<leftarrow>take i v . x = b] + 1"}

   238 With the help of @{thm[source]part2} it follows that

   239 @{prop[display]"length [x\<leftarrow>drop i v . x = a] = length [x\<leftarrow>drop i v . x = b] + 1"}

   240 *}

   242  apply(drule part2[of "\<lambda>x. x=a", simplified])

   243   apply(assumption)

   245 txt{*\noindent

   246 Now it is time to decompose @{term v} in the conclusion @{prop"b#v \<in> A"}

   247 into @{term"take i v @ drop i v"},

   248 *}

   250  apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])

   252 txt{*\noindent

   253 (the variables @{term n1} and @{term t} are the result of composing the

   254 theorems @{thm[source]subst} and @{thm[source]append_take_drop_id})

   255 after which the appropriate rule of the grammar reduces the goal

   256 to the two subgoals @{prop"take i v \<in> A"} and @{prop"drop i v \<in> A"}:

   257 *}

   259  apply(rule S_A_B.intros)

   261 txt{*

   262 Both subgoals follow from the induction hypothesis because both @{term"take i

   263 v"} and @{term"drop i v"} are shorter than @{term w}:

   264 *}

   266   apply(force simp add: min_less_iff_disj)

   267  apply(force split add: nat_diff_split)

   269 txt{*

   270 The case @{prop"w = b#v"} is proved analogously:

   271 *}

   273 apply(clarify)

   274 apply(frule part1[of "\<lambda>x. x=b", simplified])

   275 apply(clarify)

   276 apply(drule part2[of "\<lambda>x. x=b", simplified])

   277  apply(assumption)

   278 apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])

   279 apply(rule S_A_B.intros)

   280  apply(force simp add: min_less_iff_disj)

   281 by(force simp add: min_less_iff_disj split add: nat_diff_split)

   283 text{*

   284 We conclude this section with a comparison of our proof with

   285 Hopcroft\index{Hopcroft, J. E.} and Ullman's\index{Ullman, J. D.}

   286 \cite[p.\ts81]{HopcroftUllman}.

   287 For a start, the textbook

   288 grammar, for no good reason, excludes the empty word, thus complicating

   289 matters just a little bit: they have 8 instead of our 7 productions.

   291 More importantly, the proof itself is different: rather than

   292 separating the two directions, they perform one induction on the

   293 length of a word. This deprives them of the beauty of rule induction,

   294 and in the easy direction (correctness) their reasoning is more

   295 detailed than our @{text auto}. For the hard part (completeness), they

   296 consider just one of the cases that our @{text simp_all} disposes of

   297 automatically. Then they conclude the proof by saying about the

   298 remaining cases: ``We do this in a manner similar to our method of

   299 proof for part (1); this part is left to the reader''. But this is

   300 precisely the part that requires the intermediate value theorem and

   301 thus is not at all similar to the other cases (which are automatic in

   302 Isabelle). The authors are at least cavalier about this point and may

   303 even have overlooked the slight difficulty lurking in the omitted

   304 cases.  Such errors are found in many pen-and-paper proofs when they

   305 are scrutinized formally.%

   306 \index{grammars!defining inductively|)}

   307 *}

   309 (*<*)end(*>*)

author	Walther Neuper <neuper@ist.tugraz.at>
	Thu, 12 Aug 2010 15:03:34 +0200
branch	isac-from-Isabelle2009-2
changeset 37913	20e3616b2d9c
parent 27167	a99747ccba87
permissions	-rw-r--r--