2007/11/11

QSort in Agda, Round 2

上一回合證明 qsort 是 length-preserving operation,這一回合換個角度,證明 qsort 的確傳回 sorted list。證明規模比上一個大一些,不過環環相扣,寫起來挺有快感 XD。

首先要定義一些 predicates 並證明它們的性質。Library 裡面的 _≤_ 運算子不帶證明,不好用,所以自己定義一個 _≼_ relation (predicate)。這個 relation 具有 reflexivity 和 transitivity。

data _≼_ : Nat -> Nat -> Set where
  ≼-base : {n : Nat} -> 0 ≼ n
  ≼-step : {m n : Nat} -> m ≼ n -> (suc m) ≼ (suc n)

≼-refl : {x : Nat} -> x ≼ x
≼-refl {0} = ≼-base
≼-refl {suc n} = ≼-step (≼-refl {n})

≼-trans : {x y z : Nat} -> x ≼ y -> y ≼ z -> x ≼ z
≼-trans {0}{_}{_} _ _ = ≼-base
≼-trans {suc x}{suc y}{suc z}
        (≼-step x≼y) (≼-step y≼z) = ≼-step (≼-trans x≼y y≼z)

定義一個 compare function,可比較兩數間的大小關係,並附證明。

cmp : (m n : Nat) -> (m ≼ n) \/ (n ≼ m)
cmp 0 n = \/-IL ≼-base
cmp (suc m) 0 = \/-IR ≼-base
cmp (suc m) (suc n) with cmp m n
... | \/-IL subpf = \/-IL (≼-step subpf)
... | \/-IR subpf = \/-IR (≼-step subpf)

接著是兩個對偶的 predicates BoundedAboveBoundedBelow,分別宣稱一個 list 上方有界或下方有界,並須指明 bound 為何。

data BoundedAbove : List Nat -> Nat -> Set where
  bdd-above-base : {n : Nat} -> BoundedAbove [] n
  bdd-above-step : {n : Nat}{x : Nat}{xs : List Nat} ->
                     BoundedAbove xs n -> x ≼ n -> BoundedAbove (x :: xs) n

data BoundedBelow : List Nat -> Nat -> Set where
  bdd-below-base : {n : Nat} -> BoundedBelow [] n
  bdd-below-step : {n : Nat}{x : Nat}{xs : List Nat} ->
                     BoundedBelow xs n -> n ≼ x -> BoundedBelow (x :: xs) n

遞增排序和遞減排序的 predicates 可用以上的 predicates 組合出來。

data IncSorted : List Nat -> Set where
  inc-sorted-base : IncSorted []
  inc-sorted-step  : {b x : Nat} {xs : List Nat} -> IncSorted xs ->
                       BoundedBelow xs b -> x ≼ b -> IncSorted (x :: xs)

data DecSorted : List Nat -> Set where
  dec-sorted-base : DecSorted []
  dec-sorted-step  : {b x : Nat} {xs : List Nat} -> DecSorted xs ->
                       BoundedAbove xs b -> b ≼ x -> DecSorted (x :: xs)

給定一個 bound n,以下兩個 helper functions 可以傳回 "bounded above/below by n" 的 unary predicate。

bddAbove : Nat -> List Nat -> Set
bddAbove n xs = BoundedAbove xs n

bddBelow : Nat -> List Nat -> Set
bddBelow n xs = BoundedBelow xs n

上界可以提高,下界可以降低。

relaxUpperBound : {xs : List Nat}{m n : Nat} ->
                    BoundedAbove xs m -> m ≼ n -> BoundedAbove xs n
relaxUpperBound {[]}{_}{n} _ _ = bdd-above-base
relaxUpperBound {x :: xs} (bdd-above-step bdd-subpf x≼m) m≼n =
  bdd-above-step (relaxUpperBound {xs} bdd-subpf m≼n) (≼-trans x≼m m≼n)

relaxLowerBound : {xs : List Nat}{m n : Nat} ->
                    BoundedBelow xs m -> n ≼ m -> BoundedBelow xs n
relaxLowerBound {[]}{_}{n} _ _ = bdd-below-base
relaxLowerBound {x :: xs} (bdd-below-step bdd-subpf m≼x) n≼m =
  bdd-below-step (relaxLowerBound {xs} bdd-subpf n≼m) (≼-trans n≼m m≼x)

然後是三個主要的 predicates 和 ++ operator 的互動關係:兩個上界 / 下界相同的 lists 接起來,上界 / 下界不變;如果 xs 上界和 ys 下界為同一個值,而且分別都是 increasingly sorted,那麼 xs ++ ys 也是 increasingly sorted。

bdd-above-concat : {xs ys : List Nat}{n : Nat} -> BoundedAbove xs n ->
                     BoundedAbove ys n -> BoundedAbove (xs ++ ys) n
bdd-above-concat {[]}{ys} _ ys-bddabv = ys-bddabv
bdd-above-concat {x :: xs}{ys} (bdd-above-step xs-bddabv n≼x) ys-bddabv =
  bdd-above-step (bdd-above-concat {xs}{ys} xs-bddabv ys-bddabv) n≼x

bdd-below-concat : {xs ys : List Nat}{n : Nat} -> BoundedBelow xs n ->
                     BoundedBelow ys n -> BoundedBelow (xs ++ ys) n
bdd-below-concat {[]}{ys} _ ys-bddblw = ys-bddblw
bdd-below-concat {x :: xs}{ys} (bdd-below-step xs-bddblw x≼n) ys-bddblw =
  bdd-below-step (bdd-below-concat {xs}{ys} xs-bddblw ys-bddblw) x≼n

inc-sorted-concat : {xs ys : List Nat}{n : Nat} ->
  IncSorted xs -> IncSorted ys -> BoundedAbove xs n ->
  BoundedBelow ys n -> IncSorted (xs ++ ys)
inc-sorted-concat {[]}{ys} _ ys-sorted _ _ = ys-sorted
inc-sorted-concat {x :: xs}{ys} (inc-sorted-step xs-sorted xs-bddblw x≼b)
                  ys-sorted (bdd-above-step xs-bddabv x≼n) ys-bddblw =
  inc-sorted-step
    (inc-sorted-concat xs-sorted ys-sorted xs-bddabv ys-bddblw)
    (bdd-below-concat (relaxLowerBound xs-bddblw x≼b)
                      (relaxLowerBound ys-bddblw x≼n))
    ≼-refl

以上是我們需要的性質。這一版的 qsort 的 type 是:

qsort'' : List Nat -> List Nat ∣ IncSorted

我們心中打算的是讓 partition'' 把 input list 分成一個上方有界的 list 和一個下方有界的 list,接著遞迴呼叫 qsort'' 分別排序,最後把它們接起來。從這裡就可以看到我們的 induction hypothesis 不夠強:因為遞迴呼叫 qsort'' 所得的結果只保證遞增排序,關於上界下界的資訊都遺失了。所以很自然地,我們必須改證明一個較強的版本:

  strong-qsort'' : {m n : Nat} -> List Nat ∣ bddAbove m ⋯ bddBelow n ->
                     List Nat ∣ IncSorted ⋯ bddAbove m ⋯ bddBelow n

為了方便操作一連串的 predicates,我定義一些 operators,可以把 predicates 以及它們的 proofs 方便地串起來。這些 predicates 串起來的結構和 list 一樣,我們可以定義兩個 helper functions 取得一個 predicate list 的第一個 predicate 或其餘的 predicates。

data _∧_ (P Q : Set) : Set where
  _‖_ : P -> Q -> P ∧ Q

_⋯_ : {A : Set} -> (P Q : A -> Set) -> A -> Set
_⋯_ P Q a = P a ∧ Q a

takeTailPred : {A : Set}{P Q : A -> Set} -> A ∣ P ⋯ Q -> A ∣ Q
takeTailPred (_⋮_ a (p ‖ q)) = _⋮_ a q

takeHeadPred : {A : Set}{P Q : A -> Set} -> A ∣ P ⋯ Q -> A ∣ P
takeHeadPred (_⋮_ a (p ‖ q)) = _⋮_ a p

_∧_Logic.Base 裡面的 _/\_ 意義完全一樣,但這裡定義的 constructor 是 operator form,使用上比較方便。(只做 aliasing 的話沒辦法做 pattern matching。)另外我也把 dependent pair 的 constructor 改成 operator form。(到時候命題和證明就是用一堆直線橫線和「直的橫的點點點」組成 XD。)

data _∣_ (A : Set) (P : A -> Set) : Set where
  _⋮_ : (x : A) -> P x -> A ∣ P

strong-qsort'' 的前提也比 qsort'' 強(i.e., 前者的 type 並非後者的 subtype),沒辦法直接做 reduction,所以必須另外證明一個 lemma,說每個 list 都有上下界。

lemma3 : (xs : List Nat) ->
           ∃ (\n -> List Nat ∣ equalTo xs ⋯ bddAbove n ⋯ bddBelow 0)
lemma3 [] = ∃-I 0 (_⋮_ [] (refl ‖ bdd-above-base ‖ bdd-below-base))
lemma3 (x :: xs) with lemma3 xs
lemma3 (x :: xs) | ∃-I n (_⋮_ xs' (eq-pf ‖ bdd-above-pf ‖ bdd-below-pf))
                   with cmp x n
lemma3 (x :: xs) | ∃-I n (_⋮_ xs' (eq-pf ‖ bdd-above-pf ‖ bdd-below-pf))
                 | \/-IL x≼n =
  ∃-I n (x :: xs ⋮
           (refl {_}{x :: xs} ‖
            bdd-above-step
              (subst (\ys -> BoundedAbove ys n) (sym eq-pf) bdd-above-pf)
              x≼n ‖
            bdd-below-step
              (subst (\ys -> BoundedBelow ys 0) (sym eq-pf) bdd-below-pf)
              (≼-base {x})))
lemma3 (x :: xs) | ∃-I n (_⋮_ xs' (eq-pf ‖ bdd-above-pf ‖ bdd-below-pf))
                 | \/-IR n≼x =
  ∃-I x (x :: xs ⋮
           (refl {_}{x :: xs} ‖
            bdd-above-step
              (relaxUpperBound
                (subst (\ys -> BoundedAbove ys n)
                       (sym eq-pf) bdd-above-pf) n≼x)
              (≼-refl {x})  ‖
            bdd-below-step
              (subst (\ys -> BoundedBelow ys 0) (sym eq-pf) bdd-below-pf)
              (≼-base {x})))

qsort'' 的本體就可以寫出來了。

qsort'' : List Nat -> List Nat ∣ IncSorted
qsort'' xs =
  (takeHeadPred ○ strong-qsort'' ○ takeTailPred) (takeProof (lemma3 xs))

沒寫成

qsort'' = takeHeadPred ○ strong-qsort'' ○ takeTailPred ○ takeProof ○ lemma3

的原因是後面兩個 functions 有 dependent type。qsort'' 的定義裡用到一個 helper function 把 existential quantifier 的證明部份提取出來。

takeWitness : {A : Set}{P : A -> Set} -> ∃ P -> A
takeWitness (∃-I wit pf) = wit

takeProof : {A : Set}{P : A -> Set} -> (ex : ∃ P) -> P (takeWitness ex)
takeProof (∃-I wit pf) = pf

現在整個問題重整到 strong-qsort''。先從 partition'' 下手,這個版本的 input list 和 output lists 都附帶上下界的限制。

partition'' : {m n : Nat} -> (p : Nat) ->
                List Nat ∣ bddAbove m ⋯ bddBelow n ->
                (List Nat ∣ bddAbove p ⋯ bddBelow n) ×
                (List Nat ∣ bddAbove m ⋯ bddBelow p)
partition'' p ([] ⋮ upperBoundPf ‖ lowerBoundPf) =
  < [] ⋮ bdd-above-base {p} ‖ lowerBoundPf ,
    [] ⋮ upperBoundPf ‖ bdd-below-base {p} >
partition'' p (x :: xs ⋮ bdd-above-step xs-bddabv x≼m ‖
                         bdd-below-step xs-bddblw n≼x)
  with partition'' p (xs ⋮ xs-bddabv ‖ xs-bddblw) | cmp x p
... | < ys ⋮ ys-bddabv ‖ ys-bddblw , zs&pfs > | \/-IL x≼p =
  < x :: ys ⋮ bdd-above-step ys-bddabv x≼p ‖
              bdd-below-step ys-bddblw n≼x ,
    zs&pfs >
... | < ys&pfs , zs ⋮ zs-bddabv ‖ zs-bddblw > | \/-IR p≼x =
  < ys&pfs ,
    x :: zs ⋮ bdd-above-step zs-bddabv x≼m ‖
              bdd-below-step zs-bddblw p≼x >

最後 strong-qsort'' 只要很有耐心地把證明組合出來就可以了。

strong-qsort'' : {m n : Nat} -> List Nat ∣ bddAbove m ⋯ bddBelow n ->
                   List Nat ∣ IncSorted ⋯ bddAbove m ⋯ bddBelow n
strong-qsort'' ([] ⋮ bddabv ‖ bddblw) =
  [] ⋮ inc-sorted-base ‖ bddabv ‖ bddblw
strong-qsort'' (x :: xs ⋮ bdd-above-step xs-bddabv x≼m ‖
                          bdd-below-step xs-bddblw n≼x)
  with partition'' x (xs ⋮ xs-bddabv ‖ xs-bddblw)
... | < left , right > with strong-qsort'' left | strong-qsort'' right
strong-qsort'' (x :: xs ⋮ bdd-above-step xs-bddabv x≼m ‖
                          bdd-below-step xs-bddblw n≼x) | < left , right >
  | ys ⋮ ys-sorted ‖ ys-bddabv ‖ ys-bddblw
  | zs ⋮ zs-sorted ‖ zs-bddabv ‖ zs-bddblw =
  ys ++ x :: zs ⋮
    (inc-sorted-concat
      ys-sorted (inc-sorted-step zs-sorted zs-bddblw ≼-refl)
      ys-bddabv (bdd-below-step zs-bddblw ≼-refl)) ‖
    (bdd-above-concat
      (relaxUpperBound ys-bddabv x≼m) (bdd-above-step zs-bddabv x≼m)) ‖
    (bdd-below-concat
      ys-bddblw (bdd-below-step (relaxLowerBound zs-bddblw n≼x) n≼x))

以上就是紮紮實實的「qsort 會排序」的侵入式證明(即把欲證的命題寫在 function type 裡面),我不知道寫非侵入式的證明會比較麻煩還是比較簡單。用 formal method 去做這種證明,會發現很多平常很容易疏忽的關鍵。Agda programming 上比較有意思的大概是藉著組合 constraint predicates 證明關於 lists 的性質,而不需動到 list 的定義(notations 可能要再斟酌就是了 XD)。這種做法的缺點之一是,每一個 predicate 裡面都自己悄悄藏一個 list 的結構,相較於「直接把性質放進 list 定義」的做法會有比較複雜的 terms,不過倘若事後把所有證明抹掉就無妨了。最後當然如果自己把整個證明寫過一遍,會發現相當有美感 XD。

--
不太敢想像 round 3 會變成什麼樣子 XD。


再想想,似乎還是要用非侵入式的方式證明才方便組合。下一回合應該來試試看這個 XD。

Labels: ,