Doctoral Thesis

Measuring Defects in Finite Automata

Katja Meckel

1st Reviewer
(Supervisor): Prof. Dr. Martin Kutrib

2nd Reviewer: Prof. Dr. Markus Holzer

Fachbereich 07 - Mathematik und Informatik, Physik, Geographie

Institut für Informatik

Justus-Liebig-Universität Gießen

Disputation: 12.05.2016


Danksagung

An dieser Stelle möchte ich mich bei meinen Betreuern, den Herren Prof. Dr. Mar-
tin Kutrib und Prof. Dr. Markus Holzer, bedanken. Ihre Ratschläge und Anregun-
gen waren eine unschätzbare Hilfe bei der Erarbeitung der vorliegenden Ergebnisse.
Zusätzlich möchte ich auch noch einen Dank an Dr. Matthias Wendlandt und Dr. Se-
bastian Jakobi aussprechen. Die beiden haben mich in meinem Forschungsvorhaben
ebenfalls sehr unterstützt und bei Problemen, die während der Forschung auftraten,
mit ihren Ideen neue Perspektiven und Blickwinkel auf das Problem eröffnet.

Selbstverständlich möchte ich mich hiermit ebenfalls bei all jenen bedanken, die
mit kritischen Bemerkungen zu einer Verbesserung dieser Arbeit beigetragen haben.
Ich bedanke mich auch für die Geduld und das Verständnis, das meine Familie und
Freunde während der letzten fünf Jahre mir gegenüber aufgebracht und damit eben-
falls einen Beitrag zum Gelingen dieser Arbeit geleistet haben.


Contents

1 Introduction 1

2 Definitions 8
2.1 Words and Languages . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Regular Languages and Regular Expressions . . . . . . . . . . . . . . 9
2.3 Finite Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Defects in Finite Automata . . . . . . . . . . . . . . . . . . . . . . . 15

3 State Complexity of DFA for Defects 19
3.1 Defects Occurring on the Transitions . . . . . . . . . . . . . . . . . . 19
3.2 Defects Occurring on the States . . . . . . . . . . . . . . . . . . . . . 34
3.3 Defects in DFA Accepting Finite Languages . . . . . . . . . . . . . . 40

4 Recognition and Correction of Defects in DFA – By Example 57
4.1 Recognition of Defects in a Finite Automaton . . . . . . . . . . . . . 58
4.2 Correction of Defects in a Finite Automaton . . . . . . . . . . . . . . 60
4.3 Languages Related to Defective Automata . . . . . . . . . . . . . . . 69

4.3.1 Construction of DFA for L+, L−, and L? . . . . . . . . . . . 70
4.3.2 Analysis of the Accepting DFA for the Three Languages . . . 71

5 Distances 75
5.1 Introduction to Distances . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 The Paramerized Prefix Distance . . . . . . . . . . . . . . . . . . . . 80

5.2.1 Upper and Lower Bounds for the Prefix Distance . . . . . . . 80
5.2.2 Distances Below the Upper Bound . . . . . . . . . . . . . . . 84
5.2.3 Decidability of the Order of the Distances . . . . . . . . . . . 87
5.2.4 Computation of the Precise Distance . . . . . . . . . . . . . . 93
5.2.5 Distinguishing Different Computations of the Prefix Distance 95
5.2.6 Criteria for the Cases . . . . . . . . . . . . . . . . . . . . . . 99
5.2.7 Construction of an Automaton to Determine the Prefix Distance104
5.2.8 Detailed Differentiation of the Cases Based on Automaton H 114
5.2.9 Calculation of the Prefix Distance Based on Automaton H . 120
5.2.10 Analysis of the Sum for the Prefix Distance on Base of H . . 134

5.3 The Parameterized Suffix Distance . . . . . . . . . . . . . . . . . . . 143
5.3.1 Upper and Lower Bounds . . . . . . . . . . . . . . . . . . . . 144
5.3.2 Distances Below the Upper Bound . . . . . . . . . . . . . . . 144
5.3.3 Decidability of the Order of the Distances . . . . . . . . . . . 145

i


ii Contents

5.3.4 Calculation of the Suffix Distance . . . . . . . . . . . . . . . . 145

6 Concluding Remarks 146
6.1 Parameterized Distances and Defects . . . . . . . . . . . . . . . . . . 146
6.2 A Different Definition of L+, L− and L? . . . . . . . . . . . . . . . . 148
6.3 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

Measuring Defects in Finite Automata


List of Figures

2.1 NFA that is converted into a DFA in Example 2.3.1. . . . . . . . . . 13

2.2 DFA that is constructed by the power set construction from the
NFA depicted in Figure 2.1. . . . . . . . . . . . . . . . . . . . . . . . 13

2.3 Minimal DFA accepting the language {a, b}2{a, b}∗ ∪ {λ}. . . . . . . 16

2.4 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect exchange symbol affecting the transition on a
for state q0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.5 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect flip transition affecting the transition on b for
state q0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.6 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect delete transition affecting the transition on b for
state q0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.7 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect insert transition that inserts the transition
δ(q1, b) = q0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.8 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 by the defect delete accepting state affecting state q2. . . . . 17

2.9 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 by the defect delete non-accepting state affecting state q1. . . 18

2.10 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect remove acceptance affecting state q0. . . . . . . 18

2.11 Defective finite automaton derived from the DFA depicted in Fig-
ure 2.3 for the defect add acceptance affecting state q1. . . . . . . . . 18

3.1 DFA that proves the optimality of the upper bound for the defect
exchange symbol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3.2 NFA of Moore whose equivalent minimal DFA needs precisely 2n

states for n ≥ 2 [54]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3.3 DFA that proves the tightness of the upper bound for the defect flip
transition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3.4 NFA of Jirásková and Šebej for which the equivalent minimal DFA has
exactly 2n states for n ≥ 2. . . . . . . . . . . . . . . . . . . . . . . . 22

3.5 DFA A for the defect flip transition for unary alphabets. . . . . . . . 22

3.6 DFA for which the defect flip transition can result in any number of
states between 2 and n− 1 for the equivalent minimal DFA. . . . . . 24

iii


iv List of Figures

3.7 An incomplete DFA whose equivalent minimal DFA needs exactly
n+ 1 states. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.8 DFA to prove the existence of minimal DFA equivalent to defective
automata for the defect delete transition with a number of states
between 1 and n− 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.9 Incomplete DFA whose equivalent minimal DFA needs exactly i states. 27

3.10 DFA that proves the existence of a minimal DFA with n states, that
is equivalent to a defective automaton received for the defect delete
transition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.11 Incomplete DFA whose equivalent minimal DFA needs exactly n states. 27

3.12 NFA whose equivalent minimal DFA needs exactly 2n − 1 states for
n ≥ 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.13 DFA for which the equivalent minimal DFA of the defective automa-
ton with inserted transition from q1 to q2 on a needs 2n − 1 states. . 30

3.14 General structure of a DFA that accepts a unary regular language.
The acceptance property is omitted. . . . . . . . . . . . . . . . . . . 30

3.15 DFA for which the defect insert transition inserting a transition on a
from q2 to qn results in a minimal DFA with n2 − 2n+ 2 states. . . . 30

3.16 DFA for the proof of the optimality of the upper bound for the defect
delete accepting state. . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.17 DFA used in the proof for the upper bound for the defect of inserting
the accepting property to a non-accepting state. . . . . . . . . . . . 38

3.18 Minimal DFA with unary alphabet and i states. . . . . . . . . . . . . 39

3.19 DFA for which the defect exchange symbol affecting the transition
of q1 on b by exchanging it into an a results in a minimal DFA with
O(n2) states. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3.20 DFA for which the defect flip transition affecting the transition of q1

to q2 for a results in a minimal DFA with Ω(2n) states. . . . . . . . 46

3.21 DFA used in the proofs of several defects. . . . . . . . . . . . . . . . 48

3.22 DFA for which the defect insert transition inserts a transition from
q1 to q2 on a results in a minimal DFA with Ω(2n) states. . . . . . . 50

3.23 DFA for which the defect delete non-accepting state affecting state q2

results in a minimal DFA with n− 1 states. . . . . . . . . . . . . . . 52

3.24 DFA for which the defect of insertion of the accepting property for
state qn results in a minimal DFA with i states, where 1 ≤ i ≤ n. . . 55

4.1 Minimal DFA used in Example 4.2.1 for the defect exchange symbol. 61

4.2 Defective automaton used in Example 4.2.2 for the defect exchange
symbol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

4.3 Not minimal DFA used in Example 4.2.2 for the defect exchange symbol. 61

4.4 Minimal DFA used in Example 4.2.5 for the defect delete transition. 63

4.5 Minimal DFA used in Example 4.2.6 for the defect insert transition. 64

4.6 Minimal DFA used in Example 4.2.7 for the defect insert transition. 64

Measuring Defects in Finite Automata


List of Figures v

4.7 Minimal DFA used in Example 4.2.8 for the defect delete accepting
state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.8 Minimal DFA used in Example 4.2.8 for the defect delete accepting
state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.9 Minimal DFA used in Example 4.2.9 for the defect delete non-accepting
state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.10 Minimal DFA used in Example 4.2.9 for the defect delete non-accepting
state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.11 Minimal DFA used in Example 4.2.13 for the defect remove acceptance. 68

5.1 Structure of minimal DFA used in the proof of Theorem 5.2.2. . . . 91
5.2 Two minimal DFA that accept the languages L(A1) = {a, b}{a}∗

respectively L(A2) = {a}{a, b}∗. . . . . . . . . . . . . . . . . . . . . 108
5.3 An example for a history automatonH, based on the minimal DFA de-

picted in Figure 5.2. Here only some part of the automaton is depicted
for words starting with a. The missing transitions and states can be
added like described in the definition of H. . . . . . . . . . . . . . . 109

5.4 An example for a history automatonH, based on the minimal DFA de-
picted in Figure 5.2. Here only some part of the automaton is depicted
for words starting with b. The missing transitions and states can be
added like described in the definition of H. . . . . . . . . . . . . . . 110

5.5 One possible path leading into a state s = (s1, s2, reads, hists, accept2),
where s2 is productive and hists consists of k triples. Then, the in-
formation in position i of hists was inserted by state rk−i+1. . . . . 127

5.6 One possible path leading into a state s = (s1, s2, reads, hists, accept2),
where s2 is non-productive, and the only state on this path storing
a productive state of A2 is s′. Then, the information in position 1
of hists was inserted by s′, and the information on position i ≥ 2 is
already stored in position i− 1 in the history of s′. . . . . . . . . . 128

Measuring Defects in Finite Automata


List of Tables

3.1 Bounds for defects affecting the transitions of a minimal DFA. For
unary alphabets, these bounds are tight. For arbitrary regular lan-
guages, these bound are also tight already for binary alphabets, except
for the defect insert transition. For this defect, the alphabet is at least
ternary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

3.2 Tight bounds for defects affecting the states of a minimal DFA with
n states. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

3.3 Bounds for minimal DFA that are equivalent to a defective automa-
ton received from a minimal DFA that accepts a finite language. The
bounds for the defects flip or insert transition, and for the defect ex-
change symbol are tight in the order of magnitude for at least ternary
alphabets. All of the other bounds are tight even for unary alphabets. 56

4.1 Characteristic properties of a defective automaton resulting from a
minimal DFA for precisely one defect of one of the considered types. 59

4.2 Upper bounds for the numbers of states of minimal DFA accepting
the languages L+, L−, respectively L? for the different defects. Here,
n is the number of states of the original automaton. . . . . . . . . . 74

6.1 Upper bounds for the number of states of minimal DFA accepting
the languages L+, L−, and L? for the different defects, where n is the
number of states of the original DFA, Σ is its alphabet, and F is the
set of accepting states of the original DFA. . . . . . . . . . . . . . . 150

vi


1 Introduction

Back in the 1950s, Chomsky started to research for formalisms to decribe the English
language and its grammar. In the papers [14] and [15], he characterised special types
of grammars, where a grammar is given by rules that describe and build words and
sentences of the described language. Nowadays, the introduced hierarchy for these
grammars is well known as the Chomsky hierarchy. Each level of this hierachy has to
obey certain restrictions for the structure of the grammar rules. The grammars with
less and weaker restrictions have a greater power of description than the ones with
stronger restrictions. In the theory of formal languages, the grammars describing a
certain class of languages in this hierarchy are of special interest.

The grammars obeying the weakest restrictions are the type 0 grammars. They
describe the class of recursively enumerable languages. The next level is given by
the type 1 grammars. They describe the context-sensitive languages. This type
of grammar needs to fulfill stronger restrictions than the type 0 grammars. The
context-free languages are generated by the so called type 2 grammars. The lowest
level of this hierarchy describes the regular languages. The grammars of this level
are also called type 3 grammars. These grammars underly the strongest restrictions
in the whole hierarchy defined by Chomsky.

The language classes defined in terms of grammars in the Chomsky hierarchy can
also be described by different formal models. For example, there exist automata
models for each of the classes.

Already in the 1930s, Turing [66] introduced a model of an abstract machine,
long before there were computers. These machines are called Turing machines. The
aim of Turing was to find a model that describes all that is computable, which
comprises all computable numbers, functions, and so on. The Turing machines can
be considered to be an abstract model of todays computers, so the Turing machines
also describe the computational power of computers.

A Turing machine can also be regarded as a recognition model for formal lan-
guages. Chomsky proved that this automaton model describes precisely the class of
recursively enumerable languages [15]. This means, the class of languages that are
described by the type 0 grammars is the same one that is recognised by the Turing
machines.

In general, a Turing machine is a device having a finite central processing unit
that converts an input into an output using some internal memory. Its input is
given on a tape. The memory is also considered to be a tape that the machine
may use to store intermediate results needed for the computation of the output.
Usually, this tape is called working tape. Genarally, these two tapes are considered
to be unbounded by any size to the left and to the right. In a standard Turing

1


2

machine, the transformation of an input into an output is done deterministically.
This means each step of the conversion is uniquely determined by the currently read
input symbol, the content of the working tape, and the current state of the machine.
The output may be given on a separate tape, but, more conveniently, all tapes, the
input tape, the working tape, and the output tape are considered to be only one
tape. Due to the unbounded size of the tapes, this is possible.

Also modifications of Turing machines were considered over the past decades, like
Turing machines with more than only one working tape, or with a working tape
that is bounded on one side, or non-deterministic Turing machines. All of these
modified Turing machines can be simulated by a deterministic Turing machine with
only one unbounded tape. This means, the mentioned modified Turing machines
also recognise the class of recursively enumerable languages (see for example [29]).

Also for the other classes of languages described in the Chomsky hierarchy, there
exist recognising automata models. Just like for the grammars, the restrictions of the
automata models grow for lower levels in the hierarchy. All of these automata models
may be considered to be restricted Turing machines. The following three automata
models are restricted Turing machines describing the other language classes of the
Chomsky hierarchy.

For example, a linear bounded automaton (LBA) is a non-deterministic Turing
machine, that may only access a limited number of cells of the tape. This number
is a linear function in the size of the machines initial input. This automaton model
was shown to recognise precisely the class of context-sensitive languages [43]. An-
other type of linear bounded automata are the deterministic ones (DLBA). They
were introduced by Myhill [56], and were shown to recognise a proper subset of the
context-sensitive languages [46].

When starting from a non-deterministic Turing machine with a one-way and read-
only input tape, and one working tape that is bounded on one side, another possible
restriction concerns the accessability of the content on the working tape. If the last
inserted symbol needs to be processed first, that is the working tape is restricted to
a last-in-first-out (LIFO) mode, the class of languages recognised by this automaton
model is the class of context-free languages. This new type of automaton is called
pushdown automaton (PDA) [62].

Also the finite automata (FA) that recognise the regular languages (see for ex-
ample [29]) can be interpreted as restricted Turing machines. A finite automaton
only consists of the finite processing unit of the Turing machine, and a read-only
tape. The input of the automaton is contained on this tape in the beginning of a
computation. While processing this word, the finite automaton may read the input
only from left to right, and may not access any other cell. This especially means,
every symbol of the input is processed only once.

Like for the Turing machines, modifications of the other automata models were
introduced and investigated. Some of those new models recognise language classes
that are different from the ones defined in the Chomsky hierarchy. For example
the deterministic pushdown automata were shown to be strictly weaker than non-
deterministic PDA [22, 23]. Also the language class accepted by DLBA mentioned

Measuring Defects in Finite Automata


3

before differs from the ones described in the Chomsky hierarchy.
From all of the classes of languages yet known, this thesis concentrates on the

regular languages. This language class is perhaps the best researched of all yet
known classes of languages. This is no surprise, since this type of language and
its representations have several applications. The regular languages are used, for
example, in the lexical analysis in programming language compilation [1, 2], circuit
design [10], or text editing and pattern matching [42].

The class of regular languages is still one of the most intensely researched lan-
guage classes in the theory of formal languages. This is particularly interesting
since already in the late seventies, it was a wide spread belief that all important and
everything of interest has already been known about regular languages. The only
exceptance were some very hard problems, that were summarised by Brzozowski in
1979 at the International Symposium on Formal Language Theory [9].

Over the last few decades, this belief was proven to be wrong. First of all, at least
three of the six very hard open problems have been solved: the restricted star height
problem [24], the regularity of noncounting classes problem [18], and the problem
of the optimality of prefix codes [63]. Additionally, new problems have been stated
and solved for this language class. For example, the complexities for operations on
regular languages respectively finite automata were introduced [26, 39, 59, 60, 68].
Some recent surveys about descriptional complexities are given by Gao, Moreira,
Reis and Yu [19] and by Holzer and Kutrib [27, 28].

Since the beginnigs of the theory of formal languages and automata theory, a lot
of different representations for the regular languages have been defined. Besides
the type 3 grammars and finite automata, the regular languages were shown to be
representable also by so called regular expressions. This model was presented by
Kleene in 1956 [41]. A regular expression consists of symbols from a given alphabet,
that are combined by union, concatenation, and Kleene star. Additionally, also ∅
and λ are regular expressions.

Over the years, also different models of automata were introduced, like two-way
finite automata, alternating automata, and many more (see, for example, [29]). All
of these models were shown to be equivalent to finite automata, and, therefore, they
also describe the class of regular languages. Two somehow different types of finite
automata were introduced by Mealy [50] and Moore [53]. In their definitions, the
automata produce an output when processing some sequence of symbols. This is
why they can be interpreted to be translaters or transducers.

Most of the equivalence results are proven by simulating one model by another
one. This naturally leads to the question for a possibility to compare the different
models. One such possiblity for automata recognising the regular languages is to
compare the numbers of states. This number is called the size of such an automaton.
The size of finite automata is also called their state complexity. For example, the
size of a DFA that is equivalent to an NFA with n states can at most be 2n. This
was first shown by Myhill [55] and Nerode [57]. Their famous algorithm for this
conversion is called powerset construction. There exist several examples that prove
the tightness of this bound, which means there exist NFA with n states that cannot

Measuring Defects in Finite Automata


4

be transformed into a DFA having strictly less than 2n states [48, 51, 54].
Other costs for conversions are for example the one between alternating finite

automata and deterministic finite automata. For an AFA with n states there exist
DFA with 22n states. There exist examples such that any DFA needs not less than
this many states [12]. The costs for the conversion of a 2NFA with n states into a
DFA are shown in [40] to be n(nn − (n− 1)n).

The mentioned numbers for the costs of conversions are so called trade offs. In
general, a trade off is a number, that enables the comparison of the size of one
automaton model with another model simulating the first model. Trade offs between
models describing the same family of languages can normally be expressed by a
function depending on the size of the simulated model. Such a trade off is called
recursive. If there does not exist such a function, the trade off is non-recursive.
This type of trade off often occurs when simulating one model by another model,
where the models do not recognise the same class of languages. This phenomenon is
discussed in detail by Kutrib in [44]. One such non-recursive trade off occurs when
simulating a pushdown automaton by a deterministic finite automaton. This was
already shown by Meyer and Fischer [51].

The trade offs for the conversions of automata recognising the regular languages
often differ for unary and at least binary alphabets. Chrobak showed in [16, 17]
that the trade off between non-deterministic and deterministic finite automata for
unary alphabets is given by eΘ(

√
n·lnn). Also finite languages often differ in the state

complexity. For finite languages over arbitrary alphabets Salomaa and Yu showed
tight bounds in [61]. Due to this, also in this thesis finite and unary regular languages
are considered separately for some of the considered problems.

But not only for language classes belonging to the hierarchy defined by Chomsky
trade offs were considered. Also for several classes of subregular languages trade
offs for the conversion of non-deterministic to deterministic finite automata were
considered for example by Bordihn, Holzer, and Kutrib [3]. In addition to trade
offs for subregular languages, also the state complexity for operations on subregular
languages and the finite automata recognising these types of languages were subject
of research. Results on star-free languages were shown by Brzozowski and Liu in
2012 [8]. In this work, another definition for the size of deterministic finite automata
was used. The used quotient complexity is not defined on the states of the automata
but is related to the quotients of an automaton. Even though the approach differs,
the quotient and the state complexities are equivalent. For more details on the
definition of quotient complexity, the reader is referred to [5]. Other result on
quotient complexities are found for example in [6] and [7].

Besides the state complexity for regular languages respectively their recognising
finite automata, there exist several other measures. The state complexity often only
compares the size of different automata recognising the same fixed regular language.
A measure for the comparison of different regular languages is the distance. Intro-
duced by Choffrut and Pighizzini [13], the already existing measures between words
were extended to measures between regular languages. Examples for such word dis-
tances are the prefix, suffix, subword, Hamming, and the edit distance. All of these

Measuring Defects in Finite Automata


5

distances can be extended in the way Choffrut and Pighizzini did. They defined the
distance between two regular languages to be the maximum of the suprema of the
minimal distances of all words of one language to all words of the other language.
This may be difficult to use in pratice, since, in general, a regular language consists
of infinitely many words.

In practice, there exist a lot of applications using finite automata. Especially in
language processing, finite automata with millions of states are used [52]. Like for
circuits, also in finite automata defects may occur. In such big automata, it is more
likely that a defect occurs, especially when changing between different representation
models. Such defects may affect the states, or the transitions, or both. For a
defective automaton with that many states it may be of interest to know, if the defect
results in a large or small blow up in the size of an equivalent minimal deterministic
automaton. If may be uselful, if the size does not change too much. Another question
concerns the accepted languages. Do the languages of the defective respectivley the
original automaton differ much? A practical example for such a comparison is given
in the production of flat screens. A limited number of defective pixels is acceptable,
in contrast to many such defects.

Another example in practice is the comparison of reproduced parts with a sample.
In case that there exist too many differences, the reproduced parts will not be used
or selled. For this, there needs to exist a sample. In our case for defective automata,
this means the original automaton is reproducible in some way. One possibility is
by its accepted language.

Another question arises naturally for defective automata: is the defect recognis-
able? And if it can be recognised, is possible to repair the automaton? Then the
whole problem concerning blow ups for determinising and minimising such automata
does not bother at all. Unfortunately, only one of the considered defects in this the-
sis is possible to be repaired. But for all of the defects, we will consider languages
related to the languages to the defective and the original finite automaton. Depend-
ing on the sizes of the these languages, the language of the defective automaton may
not differ too much from the one of the original deterministic finite automaton. But
also the opposite may happen.

The second chapter of this thesis gives a short introduction to the fundamentals
of formal languages and automata theory used in the subsequent chapters. This
introduction focusses on the regular languages and their representation by finite au-
tomata. Also the defects investigated in most of the chapters are defined. Examples
are given for the possible defects and their consequences. In the end of this chapter
we explain by example the reason for the decision to only investigate such defective
automata that are received by only one defect of one of the defined types.

In Chapter 3, for all defined defects, bounds for the size of minimal deterministic
automata equivalent to the defective automata are given. For most of the defects,
the bounds for at least binary alphabets differ from the one for unary alphabets.
The results for the unary case are stated separately in this chapter.

For some defects, we also consider the possibility of the existence of defective finite
automata of size n such that the equivalent minimal deterministic finite automaton

Measuring Defects in Finite Automata


6

has a fixed size less than or equal to the upper size bound. This question is related
to the magic number problem. This problem asks for the existence of minimal
deterministic automata equivalent to non-deterministic automata of size n, having
a fixed size i between n and the upper bound of 2n. If there exists a number i
such that there exists no non-deterministic finite automaton having an equivalent
deterministic finite automaton of size i, then this number is called magic [32].

The last section of the third chapter covers the question for the state complexity of
deterministic finite automata equivalent to defective automata for finite languages.
It is shown that most of the bounds for arbitrary regular languages and the finite
languages differ massively.

Chapter 4 covers the questions for the recognisability of defects and the possibility
of correction. Assuming that only one occurrence of one type of defect results in
a given defective automaton, properties of these automata to recognise the type of
defect are summarised. It is proven by example that for most of the defects it is im-
possible to fix the defective automaton to receive the original minimal deterministic
automaton.

In this chapter, also three types of languages related to the defective and the orig-
inal automaton are defined and researched. For the definition of these languages,
the defective automaton, the type of defect occurring in this automaton, and the
affected state is assumed to be provided. One of these languages collects all words
that are accepted by the defective automaton without using the defect. Another
language collects all those words that are rejected by the automaton that do not use
the defect while being processed by the defective automaton. These two languages
are subsets of the languages accepted respectively rejected by the original automa-
ton. The last considered language collects all words that use the defect while being
processed by the defective automaton. This includes all such accepted or rejected
words. The union of all the three languages is Σ∗, where Σ denotes the underlying
alphabet. For these three defined languages, upper bounds for the state complexity
of minimal deterministic finite automata are considered.

In Chapter 5, a distance between two regular languages is introduced. In the
first part of this chapter, this distance is based on the prefix distance between two
words, and in the second part on the suffix distance. The introduced distances are
parameterised. Since they sum up the word distances for words in one language to
the other language and vice versa, only words up to a certain length are considered
in these sums. This length is the parameter in the language distance.

Properties of this distance are investigated. It is proven that the defined distances
can only be constant, polynomial, or exponential. A construction of an automaton
assisting to compute the precise distance between two languages is given. Different
cases are separated that may occur when calculating the distance between a word
and a regular language. Based on these criteria and the constructed automaton,
the precise distance for two languages can be computed. This computation is also
analysed in detail which gives criteria for the decision if the distance is constant,
polynomial, or exponential. All these results are proven for the parameterized prefix
distance and are inherited for the parameterized suffix distance in the last part of

Measuring Defects in Finite Automata


7

this chapter. For this, the relation between the two distances is also proven.
The last chapter concludes this thesis. In one section, the languages related to the

defective automata and the distances for defective automata are linked. It is shortly
summarised that the combination of two possibilities to measure the influence of
a defect may give better criteria for the decision if a defective automaton differs
much or not much from the original automaton. Another section covers another
definition for the languages related to the defective automaton. This definition is
based only on the given defective automaton. It does not presume the knowledge
of the affected state or transition. Upper bounds for minimal deterministic finite
automata recognising these languages are given. The last part of Chapter 6 gives
perspectives of possible future research.

Measuring Defects in Finite Automata


2 Definitions

In this chapter, nearly all of the formalisms used in the following chapters will be
defined. Most of them were already introduced by other researchers. One source for
these definitions is [29], others for example are [67] or [47].

2.1 Words and Languages

A non-empty and finite set of symbols is called an alphabet. This is the base for the
definition of languages. The symbols belonging to an alphabet are also called letters.
A word over an alphabet Σ is a finite sequence of symbols from Σ. For example,
the sequence ababba is a word over the alphabet Σ = {a, b}. The sequenc abc is
not a word over Σ, since c does not belong to Σ, but it is a word over the alphabet
{a, b, c, d}. This also shows, that it is not necessary for a word to contain all the
symbols of the alphabet.

The length of a word w is the number of its not necessarily different symbols and is
denoted by |w|. Let w = a1a2 . . . an be a word that consists of symbols a1, a2, . . . , an
from a fixed alphabet. Then the length of w is |w| = n. The word of length zero,
that does not contain any symbols, is called empty word and is denoted by λ.

The reversal of a word w is denoted by wR. If w = a1a2 · · · ak for some k ≥ 1, its
reversal is wR = ak · · · a2a1.

Let u = a1a2 · · · an and v = b1b2 · · · bm be two words, where all the symbols
a1, a2, . . . , an and b1, b2, . . . , bm belong to some alphabet. The concatenation of the
words u and v is defined to be

u · v = a1a2 · · · anb1b2 · · · bm.

This means, the concatenation of two words is received by stringing them together.
The concatenation symbol · between the two words is often omitted to simplify the
expression. For every word w, its concatenation with the empty string is given by
w · λ = λ · w = w.

For two words u, v over some alphabet Σ, v is called a prefix of u, if there exists
a word w over Σ such that u = vw. If v 6= u, and v 6= λ, v is said to be non-trivial,
otherwise it is a trivial prefix. In the same way, the suffix of a word is defined to
be a word v such that the word u can be split up into u = wv, where w is another
word over Σ. The suffix v is also called non-trivial, if v 6= u and v 6= λ.

The set of all words over a fixed alphabet Σ is denoted by Σ∗. This set also
includes the empty word λ. The set Σ∗ \ {λ} is denoted by Σ+. It consists of all
words of length at least one over Σ. The set containing all words over Σ up to a fixed

8


2.2. Regular Languages and Regular Expressions 9

length n ≥ 0 is denoted by Σ≤n, and the set of words of length precisely n is Σ=n.
The size or cardinality of a set A is denoted by |A|. For finite sets, this number is
the number of elements belonging to A. For infinite sets, this number is ∞. The
power set of a given finite set A is denoted by 2A, and consists of

∣∣2A∣∣ = 2|A| many
elements. Each of its elements is a subset of A. This also includes the empty set ∅,
since this always is a subset. The set A itself also belongs to its power set.

A subset L of Σ∗ is a collection of certain words. Such a set is called a language.
The trivial languages are the empty set ∅ and the whole set Σ∗. For languages
L,L1, L2 ⊆ Σ∗, the union

L1 ∪ L2 = {w ∈ Σ∗ | w ∈ L1 or w ∈ L2},

intersection
L1 ∩ L2 = {w ∈ Σ∗ | w ∈ L1 and w ∈ L2},

difference
L1 \ L2 = {w ∈ Σ∗ | w ∈ L1 and w 6∈ L2},

and symmetric difference

L14L2 = (L1 \ L2) ∪ (L2 \ L1)

are defined in the usual way. Also the Kleene star of a language

L∗ = {w ∈ Σ∗ | w = w1 · w2 · · · · · wk for some k ≥ 0 and some w1, w2, . . . , wk ∈ L}

is defined as usual, where k = 0 means w = λ, even though λ may not belong to L.
Notice that Kleene star applied to the empty language ∅ yields the language {λ}.

The Kleene Star for a language can also be expressed by

L∗ =
⋃
i≥0

Li,

where the languages Li are inductively defined by

L0 = {λ}, Li+1 = L · Li.

2.2 Regular Languages and Regular Expressions

A lot of different models are existent that represent the regular languages. This
class of languages, denoted by REG, is important and well studied. In the Chomsky
hierarchy, described by Chomsky [14, 15], the regular languages represent the lowest
level.

A common and quite natural way to describe this class of languages are the regular
expressions. This model was introduced by Kleene [41]. A regular expression is a
finite combination of concatenation, union, and Kleene star over the symbols of an
alphabet and the empty language ∅.

Formally, a regular expression over an alphabet Σ is defined as follows:

Measuring Defects in Finite Automata


10 2.3. Finite Automata

• For each a ∈ Σ, λ, ∅, and a are regular expressions.

• If r and s are regular expressions, so are (r · s), (r + s), and r∗.

The language described by regular expressions r and s is given by

• L(λ) = {λ}, L(∅) = ∅, and L(a) = {a} for all a ∈ Σ.

• L(r · s) = L(r) · L(s), L((r + s)) = L(r) ∪ L(s), and L(r∗) = (L(r))∗.

To simplify the notation, it is common to give highest priority to the star op-
erator ∗, followed by ·, and then +. Due to this priority and the associativity of
concatenation and union, superfluous parentheses can be omitted. Usually, also the
operator · is omitted. It is conventional to write ri instead of r · r · . . . · r (i times),
and r+ for r · r∗.
Example 2.2.1. Let

r =

((
(a+ b) + c

)∗ · ((((a+ c) · (a)
)
·
(
a+ (b+ c)

)∗) · (a · b))) · (c)
be a regular expression. This expression can be simplified by omitting unnecessary
braces, and the · for the concatenation. So, r can also be denoted by the expression
(a + b + c)∗(a + c)a(a + b + c)∗abc. If Σ = {a, b, c}, the expression can be further
simplified to r = Σ∗(a + c)aΣ∗abc. The language L(r) described by the regular
expression r consists of all words over {a, b, c} that end on abc and contain at least
one of the sequences aa, or ca before the last sequence of abc. This is precisely the
language L(r) = Σ∗{aa, ca}Σ∗{abc}. _ ^

^ _

Regular expressions can also be used to prove some closure properties for REG. A
language class L is said to be closed under some language operation, if all possible ap-
plications of this operation to a language belonging to L leads to a language that also
belongs to L. From the definition of regular expressions it is derived immediately,
that the class REG is closed under the operations concatenation, union, and Kleene
star. Also under the operation of reversal, REG is closed. The language L(r)R for
some regular expression r can be described by its reversal rR.

2.3 Finite Automata

Another possibility to describe the regular languages is given in form of finite au-
tomata. A finite automaton can be considered to be a device with a finite input tape,
that either accepts or rejects an input. The device changes its state while reading
symbols from left to right provided on the input tape, but it does not produce any
output. The only information obtained from finite automata after processing some
word, is the type of the state that is entered at the end of the computation. The
type of a state can be accepting or non-accepting. This means, finite automata can
be regarded to be some kind of recognition device for regular languages.

Measuring Defects in Finite Automata


2.3. Finite Automata 11

The following definition for non-deterministic finite automata was first introduced
by Rabin and Scott [58]. They were the first to introduce non-determinism for finite
automata.

A non-deterministic finite automaton (NFA) is a quintuple A = 〈Q,Σ, δ, q0, F 〉,
where

• Q is a finite set of states,

• Σ is an alphabet,

• δ : Q×Σ→ 2Q is the state transition function (or simply transition function),

• q0 ∈ Q is the initial state, and

• F ⊆ Q is the set of accepting states.

The rules defined by the state transition function determine the behaviour of the
automaton. If the NFA is in some state q ∈ Q, and reads a symbol a ∈ Σ on its input
tape, the NFA can change its state into some state p ∈ δ(q, a). Since there may exist
more than only one such state p, the automaton is called non-deterministic. It is
also possible, that such a state p does not exist at all. In this case, the transition
for q on a is said to be undefined.

In each step of the automaton, such an NFA consumes one symbol of the in-
put and changes its state according to its transition function. If more than one
state is existent, into which the automaton may change, one of them is chosen non-
deterministically.

If |δ(q, a)| ≤ 1 for all q ∈ Q and all a ∈ Σ, the automaton is called deterministic.
In such a deterministic finite automaton (DFA), there exists no possibility for the
choice of the next state for each combination of a state and a symbol in the input.
In this thesis, all DFA are considered to be complete, which means there exist no
undefined transitions. Every incomplete DFA can be converted into a complete
DFA. This is done by letting each undefined transition lead into a rejecting sink
state, which is a non-leaving state.

The state transition function δ of a finite automaton 〈Q,Σ, δ, q0, F 〉 is defined only
for single symbols. For all states q ∈ Q, it can be extended to a transition function
for words w ∈ Σ∗ by δ(q, w) =

⋃
p∈δ(q,a)

δ(p, w), if w = av for some symbol a ∈ Σ and

a word v ∈ Σ∗, respectively by δ(q, w) = {q}, if w = λ. To simplify the notation of
the transition function for DFA, the set braces are omitted for the successing state.

While processing a word, a finite automaton enters several states. Such a sequence
of states is called a path of the automaton. In this sequence, it is possible that one
state appears several times. The length of a path is defined to be the number of
states contained in the path. If a state occurs several times, each occurrence is
counted separately. Let for example s1s2s1s3 denote a path. Then its length is four,
even though s1 appears twice. A subsequence of the path that begins with a state s,
ends with the same state s, with s not appearing elsewhere in this subsequence, is

Measuring Defects in Finite Automata


12 2.3. Finite Automata

called a cycle or ring of the DFA. The length of a cycle is defined as the number of
states within the cycle minus one, since state s builds the beginning and the end of
the cycle. If a cycle is of length one, it is also called a loop. A loop is nothing else
but a single transition that does not change the state.

For every word over the alphabet of the automaton there exists a path that begins
with the initial state. Let s0s1 . . . sk be a path for some word, where s0 = q0 and
k ≥ 0, and let si be one of these states. All the states, that are visited before
entering si are called its predecessors, and all the states that are visited after this
state are the successors of si. In the path, this means thart all states sj , where j < i
are predecessors, and that all states sl with l > i are the successors.

Let A be a finite automaton, either deterministic or non-deterministic. The ac-
cepted language of A, in terms L(A), is the set of words that lead the automa-
ton into acceptance, when starting the processing in the initial state. Formally,
for a finite automaton A = 〈Q,Σ, δ, q0, F 〉, its accepted language is defined to be
L(A) = {w ∈ Σ∗ | δ(q0, w) ∩ F 6= ∅}.

Two finite automata A and B are said to be equivalent, in terms A ≡ B, if they
accept the same language, which means L(A) = L(B).

Kleene showed that the language class that is recognised by finite automata is
precisely the class REG [41].

Theorem 2.3.1 (Kleene [41]). A language L is regular if and only if there exists a
finite automaton A with L(A) = L.

Conversions of NFA into DFA Since NFA and DFA are both models that describe
the class REG, it should be possible to convert one model into the other. Indeed,
this is possible. It was shown by Rabin and Scott in 1959 [58]. The conversion is
done by the so-called power set construction.

Let A = 〈Q,Σ, δ, q0, F 〉 be an arbitrary NFA. An equivalent DFA is given by
B = 〈2Q,Σ, δ′, {q0}, F ′〉, where δ′(P, a) =

⋃
p∈P

δ(p, a) for all P ⊆ Q and a ∈ Σ, and

F ′ = {P ∈ 2Q | P ∩ F 6= ∅}.
The following example gives an idea of how the power set construction works. It

also shows how finite automata are represented by state transition diagrams.

Example 2.3.1. Let A = 〈{q0, q1, q2, q3}, {a, b}, δ, q0, {q3}〉 be the NFA with state
transition function

δ(q0, a) = {q1, q2}, δ(q0, b) = {q2},
δ(q1, a) = δ(q1, b) = {q2},
δ(q2, a) = {q3},
δ(q3, a) = δ(q3, b) = {q3}

that is depicted in Figure 2.1.
An equivalent DFA is given by B = 〈Q, {a, b}, δ′, {q0}, F 〉, where its state set

is Q = {∅, {q0}, {q2}, {q3}, {q1, q2}, {q2, q3}}, its set of accepting states is given by

Measuring Defects in Finite Automata


2.3. Finite Automata 13

q0 q1 q2 q3
a a, b a

a, b

a, b

Figure 2.1: NFA that is converted into a DFA in Example 2.3.1.

{q0} {q1, q2} {q2, q3} {q3}

{q2}∅

a a a, b

b
b

a

b

a, b

a, b

Figure 2.2: DFA that is constructed by the power set construction from the NFA de-
picted in Figure 2.1.

F = {{q3}, {q2, q3}}, and its transition function is defined by

δ′({q0}, a) = {q1, q2}, δ′({q0}, b) = {q2},
δ′({q2}, a) = {q3}, δ′({q2}, a) = ∅,
δ′({q3}, a) = δ′({q3}, b) = {q3},
δ′({q1, q2}, a) = {q2, q3}, δ′({q1, q2}, b) = {q2},
δ′({q2, q3}, a) = δ′({q2, q3}, b) = {q3},
δ′(∅, a) = δ′(∅, b) = ∅.

This automaton is depicted in Figure 2.2 _ ^

^ _

The formal description of the DFA constructed by the power set construction
contains all possible states. The states that are not depicted in Figure 2.2 cannot be
reached from the initial state {q0} by any word over {a, b}. It is a general convention
to omit such states for finite automata.

Minimal Deterministic Finite Automata Taking a closer look at the constructed
DFA from Example 2.3.1, one can recognise that by the states {q2, q3} and {q3} the
same words are accepted. This leads to the formal definition of equivalent states:

Measuring Defects in Finite Automata


14 2.3. Finite Automata

Given a DFA with state transition function δ. Two states p and q are said to be
equivalent, if for any word w, δ(p, w) is accepting if and only if δ(q, w) is accepting.
Otherwise, the states p and q can be distinguished and are inequivalent.

Equivalent states can be merged. This means, for a DFA A = 〈Q,Σ, δ, q0, F 〉,
and two equivalent states p, q ∈ Q, a new DFA A′ = 〈Q \ {p},Σ, δ′, q′0, F \ {p}〉 is
constructed, where

δ′(s, a) =

{
q if δ(p, a) = p,

δ(s, a) else
,

and

q′0 =

{
q if q0 = p,

q0 else
.

Merging equivalent states always leads to a new DFA. If a DFA only contains
states that are not equivalent, it is called minimal.

There exist several algorithms to minimise a DFA, for example one by Brzo-
zowski [4], one by Hopcroft and Ullman [29], one by Huffman [30], or one by
Moore [53].

One well-known algorithm to minimise a DFA is the so called table-filling algo-
rithm, see for example [29]. This algorithm determines recursively all inequivalent
states by marking pairs of states. For a given DFA A = 〈Q,Σ, δ, q0, F 〉, at first all
state pairs of states are marked, where only one of them is accepting. Then, for all
non-marked pairs of states p and q, and symbols a ∈ Σ, this pair is marked, if and
only if the state pair δ(p, a) and δ(q, a) is already marked. This algorithm stops,
if there either exist no more unmarked pairs of states, or if there exists no more
possibility to mark the remaining unmarked pairs. The unmarked pairs are exactly
the equivalent states.

Theorem 2.3.2. If two states are not distinguished by the table-filling algorithm,
then the states are equivalent.

The following important property of deterministic finite automata results from
this.

Theorem 2.3.3. If A is a DFA, and B the DFA constructed from A by the table-
filling algorithm, then B has as few states as any DFA equivalent to A.

From all these properties of deterministic finite automata, the following theorem
can be derived, that gives a characterisation of minimal DFA.

Theorem 2.3.4. A DFA is minimal if and only if all states are reachable from its
initial state, and if all of its states are pairwise inequivalent.

This especially means, that for any regular language, there exists a DFA that
accepts this language, and has a minimal number of states. This also means, all
minimal DFA are the same except for a renaming of the states. Therefore, this

Measuring Defects in Finite Automata


2.4. Defects in Finite Automata 15

minimal number of states can be used to measure the size of DFA and indirectly
the size of regular languages. This measure is called state complexity.

This measure can also be extended to non-deterministic finite automata. A min-
imal NFA accepting a certain language is defined to be an NFA, for which there
exists no equivalent NFA that has fewer states.

Now it is possible to use this measure to express the compactness of the repre-
sentation of regular languages by NFA instead of DFA. It was shown by Rabin and
Scott [58] that an upper bound for the conversion of NFA into DFA is exponen-
tial. The tightness of this bound was shown later independently by Lupanov [48],
Moore [54], and Meyer and Fischer [51].

Theorem 2.3.5. There exist regular languages, that are accepted by NFA with n
states, and by minimal DFA with 2n states.

2.4 Defects in Finite Automata

Finite automata can be manipulated. Such a manipulation can affect either the
transitions or the states. In the following, the manipulations are called defects. A
finite automaton that contains a defect will be called defective. The automaton,
from which the defective automaton is received by some defect, is called original
automaton. The investigations in this thesis concentrate on the following defects.

Definition 2.4.1. The following defects can affect a deterministic finite automaton.
Their definitions will be reduced to only one occurrence of the defect.

Exchange Symbol: The symbol of a transition is exchanged by another one.

Flip Transition: The source and target of a transition are interchanged.

Delete Transition: A transition is deleted.

Insert Transition: A new transition is inserted.

Delete Accepting State: An accepting state is deleted, including all transitions
having this state as source or target.

Delete Non-Accepting State: A non-accepting state is deleted, including all tran-
sitions having this state as source or target.

Remove Acceptance: The property of acceptance is removed for a state.

Add Acceptance: The property of acceptance is added for a state.

Example 2.4.1. Let A = 〈Q,Σ, δ, q0, F 〉 be the DFA depicted in Figure 2.3, where
Q = {q0, q1, q2}, Σ = {a, b}, F = {q0, q2}, and

δ(q0, a) = δ(q0, b) = q1,

δ(q1, a) = δ(q1, b) = q2,

δ(q2, a) = δ(q2, b) = q2.

Measuring Defects in Finite Automata


16 2.4. Defects in Finite Automata

q0 q1 q2
a, b a, b

a, b

Figure 2.3: Minimal DFA accepting the language {a, b}2{a, b}∗ ∪ {λ}.

q0 q1 q2
b, b a, b

a, b

Figure 2.4: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect exchange symbol affecting the transition on a for state q0.

This automaton accepts the language {a, b}2{a, b}∗ ∪ {λ}.
The defect exchange symbol affecting the transition from state q0 to state q1 by

replacing a by the symbol b in DFA A results in an incomplete non-deterministic
finite automaton. This defective automaton is depicted in Figure 2.4, and accepts
the language {b}{a, b}+ ∪ {λ}.

If the defect flip transition affects the transition from state q0 to q1 on symbol b
for DFA A, the resulting defective DFA is the one given in Figure 2.5 which accepts
the language {a}{ba}∗{a, b}+ ∪ {λ}.

The automaton depicted in Figure 2.6 accepts the language {a}{a, b}+ ∪{λ}, but
is the result of the defect delete the transition affecting the transition on b for state q0

in DFA A.

If the defect insert transition affects the DFA A by inserting a transition on b from
state q1 to state q0, the accepted language does not change. The resulting defective
automaton is depicted in Figure 2.7.

q0 q1 q2
a a, b

a, b

b

Figure 2.5: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect flip transition affecting the transition on b for state q0.

Measuring Defects in Finite Automata


2.4. Defects in Finite Automata 17

q0 q1 q2
a a, b

a, b

Figure 2.6: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect delete transition affecting the transition on b for state q0.

q0 q1 q2
a, b a, b

a, b

b

Figure 2.7: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect insert transition that inserts the transition δ(q1, b) = q0.

The deletion of the accepting state q2, or the non-accepting state q1 both result
in defective finite automata, that are both incomplete DFA. These two automata
are depicted in Figures 2.8 respectively 2.9. Both automata only accept the lan-
guage {λ}.

If the acceptance property of state q0 for DFA A is removed, the language accepted
by the defective automaton only differs by the empty word from language L(A). The
defective automaton is depicted in Figure 2.10.

The last remaining defect of adding the acceptance for some state of A results in
the automaton from Figure 2.11. This automaton accepts the language {a, b}∗. _ ^

^ _

In general, all defects change the accepted language of the affected automaton. If
the DFA is affected by more than only one defect, it is possible that the accepted
language does not change at all, which means in detail that the defective DFA is
equivalent to the original DFA. This is the case if, for example, the defects delete
transition and insert transition affect the same transitions.

If is also possible that one defect applied several times results in a massive change

q0 q1
a, b

Figure 2.8: Defective finite automaton derived from the DFA depicted in Figure 2.3
by the defect delete accepting state affecting state q2.

Measuring Defects in Finite Automata


18 2.4. Defects in Finite Automata

q0 q2

a, b

Figure 2.9: Defective finite automaton derived from the DFA depicted in Figure 2.3
by the defect delete non-accepting state affecting state q1.

q0 q1 q2
a, b a, b

a, b

Figure 2.10: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect remove acceptance affecting state q0.

q0 q1 q2
a, b a, b

a, b

Figure 2.11: Defective finite automaton derived from the DFA depicted in Figure 2.3
for the defect add acceptance affecting state q1.

of the accepted language. For example, if the defect of removing acceptance for
some states affects a minimal DFA several times, the accepted language may change
massively. This can be seen by removing the acceptance of the states q0 and q2 of
the minimal DFA depicted in Figure 2.3. The resulting defective automaton does
not accept any word, which means that infinitely many words are not accepted any-
more. Similar things may happen when combining different defects. Combinations
of defects may also lead to other types of defects. For example, the combination of
inserting and deleting one transition each can result in the defect flip transition. If
the transition of DFA A depicted in Figure 2.3 on b is deleted for state q0, but a
transition on b is added to q1 that leads into state q0, the resulting defective finite
automaton is the one we received in Example 2.4.1 for the defect flip transition.

This is the reason why, in this thesis, only one occurrence of a single defect will
be considered.

Measuring Defects in Finite Automata


3 State Complexity of DFA for Defects

This chapter deals with the state complexity of finite automata resulting from de-
fects. These automata are retrieved from minimal and complete DFA which are af-
fected by a defect. Those defects always lead to at least incomplete DFA or even to
non-deterministic finite automata. These automata will be converted into complete
and minimal DFA again. The size of these automata will be given in dependency of
the size of the original, non-defective DFA.

3.1 Defects Occurring on the Transitions

In this section, we concentrate on defects on the transitions of a DFA. The possible
defects on transitions are exchange symbol, flip transition, delete transition, and
insert transition.

The first defect to be examined is exchange symbol. The following theorem gives a
tight upper bound for the state complexity of a complete and minimal DFA accepting
the language of the defective automaton. Its size will be determined in terms of the
size of the original DFA.

Theorem 3.1.1. Let A be a minimal DFA with n ≥ 2 states and let the NFA, which
is received by the defect exchange symbol affecting DFA A be denoted by A′. Then
the equivalent minimal DFA of A′ needs at most 2n states. This bound is tight even
for binary alphabets.

Proof. In the worst case the considered defect implements ambiguity in the automa-
ton and leads to undefined transitions. An equivalent minimal DFA may need all
states generated by the power set construction. This means a minimal DFA has at
most 2n states.

We now show that this bound is reachable by a defect on a single transition in a
DFA with binary alphabet. Let A = 〈Q, {a, b}, δ, q1, {qn}〉 be the DFA depicted in
Figure 3.1 with Q = {q1, q2, . . . , qn} and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
q2 if j = n

,

δ(qj , b) =

{
q1 if j = 1 or j = n,

qj+1 if j ∈ {2, 3, . . . , n− 1}
.

This DFA is complete. It is also minimal since for x, y ∈ {1, 2, . . . , n − 1}, x < y,
we have that δ(qx, a

n−y) 6= qn but δ(qy, a
n−y) = qn.

19


20 3.1. Defects Occurring on the Transitions

q1 q2 q3 qn−1 qn
a a, b a, b a, b a, b

b

b

a

Figure 3.1: DFA that proves the optimality of the upper bound for the defect ex-
change symbol.

q1 q2 q3 qn−1 qn
a a, b a, b a, b a, b

b

a

a

Figure 3.2: NFA of Moore whose equivalent minimal DFA needs precisely 2n

states for n ≥ 2 [54].

Modifying the transition δ(qn, b) = q1 by exchanging the letter b by a results in
the NFA given in Figure 3.2. For this automaton it was already shown by Moore
in [54] that the equivalent minimal DFA needs precisely 2n states for n ≥ 2.

This defect has no effect for unary alphabets since in such an alphabet there do
not exist enough symbols to exchange one by another. This gives a tight bound of n
states for unary alphabets.

The theorem below provides the upper bound for the state complexity of a DFA af-
feted by the defect flip transition.

Theorem 3.1.2. Let A be a minimal DFA with n ≥ 2 states and let the NFA A′
be the automaton that is received from A by the defect flip transition. Then the
equivalent minimal and complete DFA of NFA A′ needs at most 2n states. This
bound is tight even for binary alphabets.

Proof. In the worst case, the modification implements ambiguity in the automaton
and leads to undefined transitions. An equivalent complete and minimal DFA may
need all states generated by the power set construction. This means such a DFA needs
at most 2n states.

In the following we show that this bound can be reached by manipulating a single
transition in a DFA with a binary alphabet. Let A = 〈Q, {a, b}, δ, q1, {qn}〉 be the

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 21

q1 q2 qn−5 qn−4 qn−3 qn−2 qn−1 qn
a a a a a a a, b

b b b b b

a

b

a

b

Figure 3.3: DFA that proves the tightness of the upper bound for the defect flip
transition.

DFA from Figure 3.3 with state set Q = {q1, q2, . . . , qn} and

δ(qj , a) =


qj+1 if j ∈ {1, 2, . . . , n− 4, n− 2, n− 1}
q1 if j = n− 3,

qn−2 if j = n

,

δ(qj , b) =


qj if j ∈ {1, 2, . . . , n− 4, n},
qj+1 if j ∈ {n− 3, n− 1},
qn−3 if j = n− 2

.

This DFA is complete. The following considerations prove the minimality of this
automaton.

In case state qi belongs to the set {q1, q2, . . . , qn−3} it is reachable from state q1

by the word ai−1. If qi is one of the states qn−2, qn−1, or qn it can be reached from
the initial state by the word an−4bai−(n−2).

For indices x, y ∈ {1, 2, . . . , n − 3}, x < y, we have that δ(qx, a
n−3−xba2) = qn

but δ(qy, a
n−3−xba2) 6= qn. It also holds that δ(qn−2, ab) = qn but δ(qx, ab) 6= qn,

δ(qn−1, a) = qn but δ(qx, a) 6= qn and δ(qn−2, a) 6= qn. This proves that all the states
of automaton A are reachable and inequivalent.

In case the considered defect flip transition affects the transition δ(qn−1, b) = qn
it is replaced by the transition δ(qn, b) = qn−1. This leads to the NFA given in
Figure 3.4. For this automaton Jirásková and Šebej have already shown in [38] that
the equivalent minimal DFA needs exactly 2n states for n ≥ 2.

Theorem 3.1.2 gives a tight bound for at least binary alphabets. For unary alpha-
bets, this bound differs. The following theorem covers the unary case.

Theorem 3.1.3. Let A be a minimal DFA with n ≥ 2 states and unary alphabet,
and let A′ be the NFA that is received by the defect flip transition affecting A. Then
the equivalent minimal DFA for A′ needs at most n+ 1 states. This bound is tight.

Proof. It is well known since Chrobak [16, 17] that a complete and minimal DFA over
a unary alphabet consists of a connected chain of states followed by a ring of states.

Measuring Defects in Finite Automata


22 3.1. Defects Occurring on the Transitions

q1 q2 qn−5 qn−4 qn−3 qn−2 qn−1 qn
a a a a a a

b b b b b

a

b b

a

b a

Figure 3.4: NFA of Jirásková and Šebej for which the equivalent minimal DFA has
exactly 2n states for n ≥ 2.

q1 q2 qn−1 qn
a a a a

a

Figure 3.5: DFA A for the defect flip transition for unary alphabets.

This ring may be a single (accepting or non-accepting) state having a transition to
itself on the only symbol of the alphabet. It can also contain more than one state.
It is also possible that the chain of states does not exist at all and the automaton
only consists of a ring of states. In a DFA, where the ring is a single accepting state,
or consists of more than only one state, a rejecting sink state does not exist. In the
following, only DFA of this structure will be considered.

The structure of such DFA especially means that there exists only one transition
leaving each state and for most states only one transition enters this state. The only
exception to this is the last state of the chain, which is the end of the chain and the
beginning and ending of the ring at the same time. This state q is the only state
that is the target of two transitions.

If the defect flip transition affects a unary minimal DFA, the resulting defective
automaton always accepts a finite language. The reason for this is, that the in-
terchanging of the source state s and target state t 6= q of a transition deletes the
connection between s and t. Due to the structure of unary DFA, this means at the
same time the loss of reachability of all states that are reached from s in the original
DFA. The equivalent minimal DFA to this defective automaton needs at most as
many states as the original one, since the insertion of a rejecting sink state suffices.

If the target state is q, all states may still be reachable, since there exist two
transitions leading into q. This happens, if the transition leading from the last state
of the ring r into the first state of the ring q is flipped. In this case, also a finite
language is accepted by the defective automaton. The longest word of this language
has at most n − 1 symbols, if the number of states of the original DFA is n. It is
well known, that a minimal DFA accepting a (unary) finite language has a number

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 23

of states, that is given by the length of the longest word plus two. In this case, this
gives the upper bound of n+1.

To show the optimality of this bound, let A = 〈Q, {a}, δ, q1, {qn}〉 be the minimal
DFA depicted in Figure 3.5, where Q = {q1, q2, . . . , qn} is the set of states and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
q1 if j = n

is its state transition function.
This DFA is complete. The following considerations provide the minimality of the

chosen automaton.
Every state qi ∈ {q1, q2, . . . , qn} is reachable from the initial state by the word ai−1.
For indices x, y ∈ {1, 2, . . . , n − 1}, x < y, we have that δ(qx, a

n−y) 6= qn but
δ(qy, a

n−y) = qn. Thus, all the states of the considered automaton are reachable
and inequivalent.

If the defect interchanges the transition from state qn to q1, we receive an unde-
fined transition for state qn and an ambiguity for state q1. The accepted language
is modified to the finite language {a, an−1}. It is well known that the minimal
DFA accepting this language needs exactly n+ 1 states, n states to count up to the
length n− 1 and one more state to reject all words longer than n− 1. This proves
the optimality of the upper bound.

For the defect flip transition in the unary case, the next theorem proves that it is
possible for the minimal DFA that is equivalent to the defective automaton to have
any number of states below the upper bound.

This question is closely related to the one asking for magic numbers. This question
was stated by Iwama, Kambayashi, and Takaki [31]. It asks whether there always
exists a nondeterministic finite automaton with n states whose equivalent minimal
DFA has α states for all integers n and α, with n ≤ α ≤ 2n.

Iwama, Matsuura, and Paterson [32] called a number α magic, if there exists no
NFA with n states having an equivalent DFA with α states. In [31] it was shown,
that α = 2n − 2k or α = 2n − 2k − 1, for 0 ≤ k ≤ n/2− 2 is non-magic, and in [32]
this was shown for α = 2n−k, with 5 ≤ k ≤ 2n−2, and some coprimality condition
for k. These results were shown for binary NFA.

The results for binary alphabets were improved in [34], [35], [20], and [49]. The
question asking for magic numbers turned out to become easier, if the size of the
alphabet is increased. In [34] it was even shown that for exponentially growing
alphabets, no magic numbers exist at all. In [20], this result was improved to smaller
growing alphabets. In [33] it was shown for four-letter alphabets that there exist no
magic numbers, and in [37], this was even proven for ternary alphabets. The magic
number problem for unary alphabets was studied in [21]. Further results on magic
numbers can be found, for example, in [35], [36], and [25].

Theorem 3.1.4. Let n ≥ 4 be a number. Then a minimal DFA A exists with n
states and unary alphabet, such that by the defect flip transition, the equivalent

Measuring Defects in Finite Automata


24 3.1. Defects Occurring on the Transitions

q1 q2 qi−2 qi−1 qi qi+1 qi+2 qn
a a a a a a a a a

a

Figure 3.6: DFA for which the defect flip transition can result in any number of
states between 2 and n− 1 for the equivalent minimal DFA.

minimal DFA for the defective automaton needs precisely a number of i states, for
all 1 ≤ i ≤ n+ 1.

Proof. The existence of a minimal DFA with n + 1 states was already shown in
Theorem 3.1.3.

To prove the existence of a minimal DFA with a number of states i between 1 and
n, that is equivalent to a defective automaton, we use the DFA A = 〈Q,Σ, δ, q1, F 〉
depicted in Figure 3.6, where Q = {q1, q2, . . . , qn} is the set of states, its set of
accepting states is given by F = {qi−1, qi, qi+1}, and the transition function δ is
defined by

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
q1 if j = n

.

This DFA is complete. It is also minimal, since each state qj , 1 ≤ j ≤ n, is
reachable from the initial state q1 by the word aj−1. The accepting states can
be differentiated by the words a and a2 since then we have δ(qi−1, a

2) = qi+1,
δ(qi, a

2), δ(qi+1, a
2) 6∈ {qi−1, qi, qi+1}, and δ(qi, a) = qi+1, δ(qi+1, a

2) 6∈ {qi−1, qi, qi+1}.
For the states with indices 1 ≤ j < l ≤ i − 2, we have δ(ql, a

i−l−1) = qi−1, but
δ(qj , a

i−l−1) = qk, where k < i − 1. Thus, the states from the set {q1, q2, . . . , qi−2}
are pairwise inequivalent. The same is true for the states with indices j and l where
i + 1 ≤ j < l ≤ n, since δ(ql, a

n+i−l−1) = qi−1 is accepting, but on the other hand
we have δ(qj , a

n+i−l−1) 6∈ {qi−1, qi, qi+1}, and, therefore, non-accepting.

Any state qj ∈ {q1, q2, . . . , qi−2} is inequivalent to any state ql ∈ {qi+2, qi+3, . . . , qn}
since δ(qj , a

i−j−1) = qi−1, but δ(ql, a
i−j−1) 6∈ {qi−1, qi, qi+1}.

In the following, we distinguish the two cases 2 ≤ i ≤ n− 1, and i = 1 or i = n.

To receive a number of states i between 2 and n−1 for the minimal DFA equivalent
to the defective automaton, we assume that the considered defect affects the transi-
tion from state qi−1 to qi of DFA A. Then all states of the set {qi, qi+1, . . . , qn} are
not reachable anymore. The remaining accepted language is the language {ai−2},
which is finite and can be accepted by a minimal DFA with i states.

For the case i = 1 or i = n, we modify the automaton from above. Let qn−2, qn−1,
and qn be the accepting states in this case. If the transition from q1 to q2 is defective,

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 25

q1 q2 qn−2 qn−1 qn
a a a a a

Figure 3.7: An incomplete DFA whose equivalent minimal DFA needs exactly n+ 1
states.

the resulting automaton accepts the empty language. Thus, one state is sufficient
for an equivalent minimal DFA. This proves i = 1. In case the defect affects the
transition from qn−1 to qn, the automaton accepts the finite language {an−3, an−2},
which can be accepted by a minimal DFA with n states. This concludes the proof.

The three theorems from above considering the defect flip transition show that
this defect can result in a very huge minimal DFA for arbitrary alphabets. For unary
alphabets, this number is much smaller. This, again, proves the difference between
unary alphabet sizes, and at least binary alphabets concerning the state complexity.

In the proof of Theorem 3.1.4, we used the DFA depicted in Figure 3.6. This
DFA accepts an infinite language. The defective automata constructed in the proof
accept finite languages. This means that the defect can result in a loss of a lot of
information about the accepted language. This is motivates the analysis of defects.

The next theorem proves an upper bound for the number of states of an equivalent
minimal DFA that is received by the defect delete transition affecting a given DFA.

Theorem 3.1.5. Let A be a complete and minimal DFA with n ≥ 4 states and let
the NFA A′ be received from A by the defect delete transition. Then the equivalent
minimal DFA of A′ needs at most n + 1 states. This bound is tight even for unary
alphabets.

Proof. Even in the worst case, the deletion of a transition in a complete and minimal
DFA cannot implement ambiguity in the automaton. It can only lead to undefined
transitions. Thus, an equivalent minimal DFA may need at most one more state
that represents a rejecting sink which will be the target of every undefined transition.
Therefore, a minimal DFA needs at most a number of n+ 1 states.

To prove the optimality of the bound for this defect affecting only a single tran-
sition in a minimal DFA with unary alphabet, we will use the DFA depicted in
Figure 3.5. We assume the set of accepting states to be {qn−2, qn−1, qn}. In the
proof of Theorem 3.1.3 it was already shown, that this DFA is complete and mini-
mal.

Let this DFA be defected by deleting the transition δ(qn, a) = q1. We receive the
incomplete DFA given in Figure 3.7. This automaton accepts the finite language
{an−3, an−2, an−1}, which is known to be accepted by a minimal DFA with n + 1
states.

Measuring Defects in Finite Automata


26 3.1. Defects Occurring on the Transitions

q1 q2 qi−2 qi−1 qi qn−1 qn
a a a a a a a a

a

Figure 3.8: DFA to prove the existence of minimal DFA equivalent to defective au-
tomata for the defect delete transition with a number of states between
1 and n− 1.

We now know that a minimal DFA equivalent to the defective automaton needs
at most n+ 1 states when deleting one transition of a given DFA with n ≥ 2 states.
In this context the question naturally arises if there exists an equivlant minimal
DFA for any number of states between 1 and n+ 1 for the defect delete transition.
The following theorem provides the answer.

Theorem 3.1.6. Let n ≥ 3 be a number. Then there exist minimal DFA A with n
states, such that by the defect delete transition, the equivalent minimal DFA for the
defective automaton needs precisely a number of i states, for all 1 ≤ i ≤ n+ 1. This
is true even for unary alphabets.

Proof. In the proof of Theorem 3.1.5 it has already been shown that there exists
a DFA with n ≥ 1 states that meets the upper bound of n + 1 states. For the
remaining numbers of states, we first consider i ≤ n − 1. Let A = 〈Q, {a}, δ, q1, F 〉
be the DFA from Figure 3.8 with Q = {q1, q2, . . . , qn}, F = {qi−1, qn} where i−1 = 0
means that qn is the only accepting state of the automaton, and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
qn if j = n

.

This DFA is complete. The following considerations provide the minimality of
automaton A.

Every state qj ∈ {q1, q2, . . . , qn} is reachable by aj−1.
For indices x, y ∈ {1, 2, . . . , n − 1} \ {i − 1}, x < y, either δ(qx, a

n−y) 6= qi−1 or
δ(qx, a

n−y) = qi−1 but δ(qy, a
n−y) = qn. In the first case, qx and qy are inequivalent.

In the other case, qx and qy can be distinguished by reading an−ya since then we
have δ(qx, a

n−ya) = qi 6∈ F and still δ(qy, a
n−ya) = qn. To show that states qi−1

and qn are inequivalent, we use the word a and obtain δ(qi−1, a) = qi, i < n and
δ(qn, a) = qn. Thus, all the states are reachable and inequivalent.

For 2 ≤ i ≤ n−1, we modify the DFA A by deleting the transition δ(qi−1, a) = qi.
We receive the incomplete DFA depicted in Figure 3.9. Since this automaton accepts
the finite language {ai−2}, an equivalent minimal DFA needs precisely i states.

For i = 1, the transition connecting states q1 and q2 is deleted. The result-
ing incomplete DFA accepts the empty language, which is accepted by a minimal
DFA with only one state.

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 27

q1 q2 qi−2 qi−1

a a a a

Figure 3.9: Incomplete DFA whose equivalent minimal DFA needs exactly i states.

q1 q2 qn−2 qn−1 qn
a a a a a

a

Figure 3.10: DFA that proves the existence of a minimal DFA with n states, that
is equivalent to a defective automaton received for the defect delete
transition.

For a state complexity of n let A = 〈Q, {a}, δ, q1, F 〉 be the DFA depicted in
Figure 3.10, where Q = {q1, q2, . . . , qn}, F = {qn−1} and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
qn if j = n

.

This DFA is complete. The following considerations provide the minimality of the
chosen automaton.

Every state qj ∈ {q1, q2, . . . , qn} is reachable by aj−1.
For indices x, y ∈ {1, 2, . . . , n} \ {n− 1}, x < y, we have δ(qx, a

n−1−y) 6= qn−1 but
δ(qy, a

n−1−y) = qn−1. Thus, all the states are reachable and inequivalent.
We now modify the DFA A by deleting the transition δ(qn−1, a) = qn. We receive

the incomplete DFA from Figure 3.11. Since this automaton accepts the finite
language {an−2}, the equivalent minimal DFA has n states.

Before we turn to the next defect, we prove that the minimal DFA for the NFA de-
picted in Figure 3.12 needs exactly 2n − 1 states. We use this automaton to prove
the optimality of Theorem 3.1.7.

Lemma 3.1.1. The minimal DFA equivalent to the NFA depicted in Figure 3.12
has precisely 2n − 1 states for n ≥ 2.

q1 q2 qn−2 qn−1

a a a a

Figure 3.11: Incomplete DFA whose equivalent minimal DFA needs exactly n states.

Measuring Defects in Finite Automata


28 3.1. Defects Occurring on the Transitions

q1 q2 q3 qn−1 qn
a, c a, b, c a, b, c a, b, c a, b, c

a, b

a, b, c

Figure 3.12: NFA whose equivalent minimal DFA needs exactly 2n − 1 states for
n ≥ 2.

Proof. Let A = 〈Q, {a, b, c}, δ, q1, {qn}〉 be the NFA with n ≥ 2 states depicted in
Figure 3.12 with state set Q = {q1, q2, . . . , qn}. The state transition function δ can
be determined from the picture.

At first we consider the reachability of the states of 2Q. The singletons {qi} for
all qi ∈ Q are reachable by reading the words ci−1 from the initial state {q1}. The
sets {q1, qx}, where x ∈ {2, 3, . . . , n} can be reached by the words abx−2. By induc-
tion, the sets {q1, qx1 , qx2 , . . . , qxk}, where k ≥ 2, x1, x2, . . . , xk ∈ {2, 3, . . . , n}, and
x1 < x2 < · · · < xk, are reachable from the set {q1, qx2−x1+1, qx3−x1+1, . . . , qxk−x1+1}
of size k − 2 by reading the word abx1−2. Therefore, all subsets of Q containing
state q1 are reachable.

The sets {qx1 , qx2 , . . . , qxk}, where k ≥ 2, and where x1, x2, . . . , xk ∈ {2, 3, . . . , n}
are indices with x1 < x2 < · · · < xk, can also be shown to be reachable. Since all
states containing q1 are reachable we can also reach the set {qx1 , qx2 , . . . , qxk} by
reading the word cx1−1 from the set {q1, qx2−(x1−1), qx3−(x1−1), . . . , qxk−(x1−1)}.

Thus, every non-empty set can be reached from the initial state {q1}. The only
non-reachable set from 2Q is the empty set. This cannot be reached by any word
read from the initial state {q1} since there exists no undefined transition in the given
NFA.

In the following we show that these 2n − 1 states are inequivalent. Let x and y
denote two arbitrary indices of states of A belonging to {1, 2, . . . , n − 1}, where
x < y. In the DFA built by the power set construction the sets {qx} and {qy} can be
distinguished by the word an−y. Therefore, also all the sets S, T ⊆ {q1, q2, . . . , qn−1},
where S 6= T , and S 6= ∅, T 6= ∅, can be distinguished.

This leaves the inequivalence of the accepting states containing qn. Let S and T
be the sets selected as above. The first case we consider is qn−1 6∈ S. Then we have
δ(S ∪ {qn}, a) = S′ ⊆ {q1, q2, . . . , qn−1} and δ(T ∪ {qn}, a) = T ′ where S′, T ′ are
not empty. If qn−1 ∈ T then T ′ contains qn and the sets S ∪ {qn} and T ∪ {qn} are
distinguishable.

In case that qn−1 6∈ T , T ′ is a non-empty subset of {q1, q2, . . . , qn−1} and we
already know that S′ and T ′ can be distinguished and, thus, also S ∪ {qn} and
T ∪ {qn}.

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 29

If qn−1 ∈ S we have that δ(S ∪{qn}, a) = S′ ∪{qn}, where S′ ⊂ {q1, q2, . . . , qn−1},
and δ(T ∪ {qn}, a) = T ′. If qn−1 6∈ T , the considered sets are inequivalent. In case
that qn−1 ∈ T , we choose j = max{i | qi ∈ S \T ∪T \S}. Without loss of generality,
let qj ∈ S. We then have δ(S ∪{qn}, an−j) = S′∪{qn} where S′ ⊆ {q1, q2, . . . , qn−1}
and δ(T ∪ {qn}, an−j) = T ′ ⊆ {q1, q2, . . . , qn−1}. This proves all sets S ∪ {qn} and
T ∪ {qn} to be inequivalent.

Summarising, we have 2n − 1 inequivalent and reachable states in the minimal
DFA that accepts the same language as the NFA given in Figure 3.12.

With the help of Lemma 3.1.1 we can show the following theorem.

Theorem 3.1.7. Let A be a minimal DFA with n ≥ 2 states and let the NFA A′
be received by the defect insert transition. Then the minimal DFA equivalent to A′
needs at most 2n − 1 states. This bound is tight for at least ternary alphabets.

Proof. In the worst case the defect insert transition implements ambiguity in the
automaton but does not lead to undefined transitions since the DFA affected by the
defect is complete. Thus, an equivalent minimal DFA may need all states generated
by the power set construction except the empty set which can only be reached in
the DFA if there exists an undefined transition in the NFA. This means a complete
and minimal DFA needs at most 2n − 1 states.

We now show that this bound can be reached by inserting a single transition
in a minimal DFA with ternary alphabet. Let A = 〈Q, {a, b, c}, δ, q1, {qn}〉 be the
DFA given in Figure 3.13, where Q = {q1, q2, . . . , qn} and

δ(qj , a) =

{
q1 if j = 1 or j = n,

qj+1 if j ∈ {2, 3, . . . , n− 1}
,

δ(qj , b) =

{
q1 if j = 1 or j = n,

qj+1 if j ∈ {2, 3, . . . , n− 1}
,

and

δ(qj , c) =

{
qj+1 if j ∈ {1, 2, . . . n− 1},
q1 if j = n

.

This DFA is complete. It is also minimal since every state qi ∈ {q2, q3, . . . , qn} can be
reached from the initial state q1 by reading can−i−1 and state q1 is reached by reading
symbol a. The states are also inequivalent, since for states qk, ql ∈ {q1, q2, . . . , qn−1}
with k < l, we have that δ(qk, a

n−l) 6= qn but δ(ql, a
n−l) = qn.

We modify this DFA by inserting the transition δ(q1, a) = q2. We receive the
NFA depicted in Figure 3.12. For this automaton we in Lemma 3.1.1 it was already
shown that the equivalent minimal DFA needs precisely 2n − 1 states.

Since the upper bound stated in the former theorem needs a ternary alphabet, the
following theorem gives the upper bound for unary alphabet. For binary alphabets,
we were not able to prove a tight upper bound, yet. In this thesis, this is left over
for future work.

Measuring Defects in Finite Automata


30 3.1. Defects Occurring on the Transitions

q1 q2 q3 qn−1 qn
c a, b, c a, b, c a, b, c a, b, c

a, b

a, b, c

Figure 3.13: DFA for which the equivalent minimal DFA of the defective automaton
with inserted transition from q1 to q2 on a needs 2n − 1 states.

q1 q2 qk−1 qk

qk+1 qk+2

qn−1qn

a a a a

a

a

a

a

a

a

Figure 3.14: General structure of a DFA that accepts a unary regular language. The
acceptance property is omitted.

Theorem 3.1.8. Let A be a complete and minimal DFA with n ≥ 2 states and
unary alphabet, and let A′ be the NFA that results from the defect insert transition.
Then the equivalent minimal DFA for A′ has at most n2−2n+ 2 states. This bound
is tight.

Proof. It is well known since Chrobak [16, 17], that minimal DFA for unary alphabets
consist of a chain of states followed by a ring of states. There exists only one state
belonging to both of these parts. We want to call this state the connecting state,
and denote it by qk, like in the DFA depicted in Figure 3.14.

q1 q2 qn−1 qn
a a a a

a

Figure 3.15: DFA for which the defect insert transition inserting a transition on a
from q2 to qn results in a minimal DFA with n2 − 2n+ 2 states.

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 31

The insertion of one additional transition to one of the states causes an ambiguity
for precisely this one state. Let qa denote this state. The defective automaton
consists of as many states as the original DFA since the defect does not lead to a
loss of any states. Therefore, the number of states for the DFA equivalent to the
defective automaton is denoted in terms of the number of states of the original DFA.

In case qa is not reachable from all of its successors, the ambiguity can be used
only once on every path leading through the automaton. For an equivalent minimal
DFA constructed by the power set construction this means that the predecessors
of qa are represented by singletons and the successors of this state are represented
by sets consisting of at most two states of the original DFA. The restriction to at
most two states within such a set is due to that the ambiguity leads into at most
two different states, from which state qa is not reachable anymore.

Let A denote a unary DFA like depicted in Figure 3.14. The insertion of a
transition for any state in {q1, q2, . . . , qk−1} to a state with a higher index, leads to
the restriction that the ambiguity can only be used once. For example, this is the
case when inserting a transition from q2 to q4 in A. In the power set automaton for
this defective automaton, only the singletons {q1} and {q2} are reachable, and the
pairs of states of the form {qi, qi+1}, where 3 ≤ i ≤ n−1, and {qn, qk}. Once entering
state {qn, qk} in the power set automaton, only the pairs {qi, qi+1} for k ≤ i ≤ n−1,
and {qn, qk} are reachable.

For such an inserted transition, an upper bound for the number of states for
the power set automaton is given by 2n. This includes the possibility to reach all
singletons and all pairs of two consecutive states of the DFA. For pairs of non-
consecutive states, the number may only decrease.

The inserted ambiguity may also be used more than only once on some paths in
the automaton. In this case, the state qa needs to be reachable from at least one of
its successors. This means, there exists at least one path that starts and ends with
state qa. Inserting another transition on a in the DFA depicted in Figure 3.14 for
state q4 leading into state q2, a path from q4 to itself is inserted. The shortes such
path is given by q4q2q3q4. In this example, only the singletons {q1}, {q2}, {q3}, and
{q4} are reachable in the power set automaton for k ≥ 5.

This is because all states in the power set automaton that are reachable from
{q4} are represented by sets of size at least two. These sets are {q2, q5}, {q3, q6},
and {q4, q7}. The reachable triples are {q2, q5, q8}, {q3, q6, q9}, and {q4, q7, q10}, if
n = 10. The indices differ for different numbers n, and also the number of reachable
sets of a fixed size.

From this, we can deduce that the more states, also the more sets of states are
existent and reachable of a fixed size between 2 and n − 1. But since n is fixed to
some number, the number of reachable sets is maximisedby minimising the length
of the path connecting qa to itself. Then the maximal number of sets of states of
a fixed size is reachable in the power set automaton. This means, the insertion of
δ(q4, a) = q3 would fulfill these requirements.

If the additional transition is inserted for a state within the cycle qkqk+1 . . . qnqk,
then there exist two different cycles that begin and end with state qa. If, for example,

Measuring Defects in Finite Automata


32 3.1. Defects Occurring on the Transitions

the additional transition is inserted for state qk+1 leading into state qk+4, there exist
the cycles qk+1qk+2 . . . qnqkqk+1, and qk+1qk+4qk+5 . . . qnqkqk+1 for state qk+1. For
the states within the power set automaton this means that all singletons up to {qk+1}
are reachable. Also all pairs of the form {qi, qi+2} for k + 2 ≤ i ≤ n − 2, and the
pairs {qn−1, qk}, {qn, qk+1} are reachable. These are n − k − 2 many pairs. The
reachable sets of three states are {qi, qi+2, qi+4} for k ≤ i ≤ n − 4, and the triples
{qn−3, qn−1, qk}, and {qn−2, qn, qk+1}. These are also n− k − 2 sets.

From this, we can deduce that there exist n − k − 2 numbers of reachable state
sets for each size between 2 and n − 1. To maximise this number, k needs to be
minimised. This is the case, if k is equal to one. Since most of the reachable sets
consist of states with indices that have the same fixed distance greater than or
equal to one, the number of such sets is maximised, if this distance is fixed to one.
Summarising all the maximising conditions, this means the original DFA consists
only of a ring of n states and the inserted transition inserts a second cycle from qa
to itself of length n − 1. All n singletons also need to be reachable, wherefore the
state qa needs to be the state qn.

In the power set automaton for such a defective automaton, the rejecting sink state
cannot be reached, since there do not exist any undefined transitions. Combining
the results from above, there exist precisely n − 1 reachable sets of states in the
power set automaton of the defective automaton for each size i between 2 and n−1.
All in all this gives the following sum for the number of reachable state sets:

n+
n−1∑
i=2

n− 1 = n2 − 2n+ 2.

Since it is possible that these state sets are all inequivalent, this gives an upper
bound for the number of states of a complete and minimal DFA accepting the
language of the defective automaton.

The upper bound is met when the defect insert transition affects the DFA depicted
in Figure 3.15. Let A = 〈Q, {a}, δ, q1, {qn}〉 denote this DFA, where its set of states
is Q = {q1, q2, . . . , qn}, and its state transition function is given by

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
q1 if j = n

.

This DFA is complete and each state qi ∈ Q is reachable from the initial state q1

by the word ai−1, where 1 ≤ i ≤ n. The states are all inequivalent since for two
states qi, qj ∈ Q, 1 ≤ i < j ≤ n − 1, the word an−j distinguishes the two states.
Thus, the depicted DFA is minimal.

If the defect adds the transition δ(qn, a) = q2 to this DFA the result is an NFA con-
sisting of two cycles, one of length n and one of length n− 1.

The states of the equivalent DFA resulting from the power set construction have a
special structure due to the structure of A. In this special case, for the sets consisting
of more than one state, the states are consecutive. That means if the state with the

Measuring Defects in Finite Automata


3.1. Defects Occurring on the Transitions 33

smallest index within the set is qi, where 1 ≤ i ≤ n − 1, then the rest of the states
of the set are qi+1, qi+2, . . . , qi+l. This also includes the accepting state consisting of
all the states of the NFA. The indices are to be considered modulo n plus one.

The only state containing state q1 and not q2 is the singleton {q1}. All of the
other state sets contain either both, only q2 or none of these two states.

The accepting states of the power set automaton are the ones containing state qn,
the other state sets are non-accepting.

For this DFA we have that all of these sets are reachable by a certain number
of a’s to be read. Not all of the reachable states are inequivalent. The two states
represented by the sets {q2, q3, . . . , qn} and {q1, q2, . . . , qn} are equivalent, since both
states are accepting and by any number of a’s read from each of the states the
resulting state is {q1, q2, . . . , qn}. Thus, they are equivalent.

To prove the inequality of the other states, let X and Y be two arbitrary subsets
of {q1, q2, . . . , qn} representing a state of the power set automaton, where X 6= Y , qn
belongs to both of the sets, and none of the sets X and Y is equal to {q2, q3, . . . , qn}.
Let qx denote the state of the symmetric difference X4Y , where we define the
number x = max{m | qm ∈ X4Y } to be the biggest index of all states belonging
to only one of the sets X and Y . This especially means 1 ≤ x < n. Without loss
of generality, we assume qx ∈ X. Then the word an−x shows the inequality of the
sets X and Y . If this word is processed from set X, the power set automaton is
driven into a state represented by a set containing state qn, and if it is processed from
state set Y into a state set without qn. This proves the inequality of all accepting
states of the power set DFA. The only exception are the two accepting state sets
{q2, q3, . . . , qn} and {q1, q2, . . . , qn}, which were already proven to be equivalent.

This leaves the proof of the inequality of the reachable non-accepting states of
the power set automaton. For this, let X,Y ⊆ {q1, q2, . . . , qn−1} now be two ar-
bitrary, non-accepting state sets, where X 6= Y . These sets consist of consecutive
states since they do not contain state qn. Let X = {qi, qi+1, . . . , qi+k} and let
Y = {qj , qj+1, . . . , qj+l}, where 1 ≤ i, j ≤ n − 2, k, l ≥ 0, and qi+k 6= qn, qj+l 6= qn.
Since these two sets are not equal and consist of consecutive states, their intersec-
tion X ∩ Y is either empty or contains states, that belong to both of the sets.
In the first case, there exists a state qx in one of the sets X or Y , for which
x = max{m | qm ∈ X ∪ Y } is the biggest index of all states belonging to either X
or Y . Without loss of generality, we can assume qx ∈ X. Then the word an−x proves
the inequality of the sets X and Y . If this word is read from set X the automaton
is driven into a state containing state qn, and read from state set Y into a state
without qn.

In case that X ∩ Y is not empty, there exists a state qx in this intersection,
where x = max{m | qm ∈ X ∩ Y } is the biggest index belonging to both sets.
Processing the word an−x from X and Y , the power set automaton is driven into
two different accepting states, since X 6= Y . These accepting states cannot be the
two equivalent accepting sets {q2, q3, . . . , qn} and {q1, q2, . . . , qn}, since the latter one
is only reachable from the former one. The inequivalence of the accepting states also
proves the inequivalence of X and Y .

Measuring Defects in Finite Automata


34 3.2. Defects Occurring on the States

The total number of states for a minimal DFA accepting the same language as
the defective automaton constructed above, is given by summing up the numbers
of all the reachable state sets of a fixed size between 1 and n that were proven to
be inequivalent. This minimal DFA has two state less than the DFA received from
the power set construction, since {q2, q3, . . . , qn} and {q1, q2, . . . , qn} are the only
equivalent states, and the empty set does not represent one of the reachable states.
The rest of the unreachable states were already omitted.

This DFA consists of precisely n different singletons. For each size 2 ≤ i ≤ n− 1,
the minimal DFA contains a state represented by a state set of exactly size i. For
each of these sizes there exist exactly n − 1 such sets. The set {q2, q3, . . . , qn} is
equivalent to the last state also belonging to the power set automaton, which is
represented by the state set Q. The following sum then gives the number of all
states of the minimal DFA for the considered defective automaton:

n+ 1 +

((
n−1∑
i=2

n− 1

)
− 1

)
= n+ (n− 2)(n− 1) = n2 − 2n+ 2.

This concludes the proof.

3.2 Defects Occurring on the States

This section deals with defects occurring on the states of deterministic finite au-
tomata. The following theorem covers the defect of deletions of accepting states of a
given DFA and of all transitions leading into these states and the ones leaving these
states.

Theorem 3.2.1. Let A be a minimal DFA with n ≥ 2 states and let the NFA A′ be
received by the defect delete accepting state affecting A. Then the equivalent minimal
DFA for A′ needs at most n states. This bound is tight even for unary alphabets.

Proof. For this defect the upper bound is n since the deletion of an accepting state
and all its incoming and outgoing transitions only leads to undefined transitions.
It does not insert any ambiguity to the automaton. Thus, to receive an equivalent
minimal DFA it is sufficient to add a rejecting sink state if it does not already exist.

We show that the automaton given in Figure 3.16 meets the upper bound when
deleting the accepting state n. Let A = 〈Q, {a}, δ, q1, {qn−1, qn}〉 denote the depicted
DFA, where Q = {q1, q2, . . . , qn} is the set of states, and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
q1 if j = n

defines its state transition function.
Deleting the accepting state qn and all transitions leading into or going out of this

state results in an NFA that accepts the finite language {an−2}. The complete and
minimal DFA accepting this language needs n states.

Measuring Defects in Finite Automata


3.2. Defects Occurring on the States 35

q1 q2 q3 qn−1 qn
a a a a a

a

Figure 3.16: DFA for the proof of the optimality of the upper bound for the defect
delete accepting state.

The theorem below deals with the question for magic numbers for the defect delet
accepting state.

Theorem 3.2.2. Let n ≥ 2 be a number. Then there exist minimal DFA A with
n states, such that by the defect delete accepting state, the equivalent DFA for the
defective automaton needs precisely a number of i states, for all 1 ≤ i ≤ n. This is
true even for unary alphabets.

Proof. For i = n we have already shown in the proof of Theorem 3.2.1 that there
exists a DFA that meets the upper bound of n states.

Let 1 ≤ i ≤ n − 1 and A = 〈Q, {a}, δ, q1, F 〉 be the DFA depicted in Figure 3.8
with state set Q = {q1, q2, . . . , qn}, set of accepting states F = {qi−1, qn} where
i− 1 = 0 means that qn is the only accepting state of this automaton, and

δ(qj , a) =

{
qj+1 if j ∈ {1, 2, . . . , n− 1},
qn if j = n

.

This DFA is complete and minimal, which was alredy used in the proof of Theo-
rem 3.1.6.

If the considered defect affecting DFA A deletes the accepting state qn and all
transitions having qn as source or target, we receive an incomplete DFA that accepts
the finite language {ai−2}. For this language it is well known that the accepting
minimal DFA consists of precisely i states.

The next possible defect on states we want to consider is the defect delete non-
accepting state. In the following theorem an upper bound for the number of states
of an equivalent minimal DFA for the defective automaton is presented.

Theorem 3.2.3. Let A be a minimal DFA with n ≥ 2 states and let A′ denote the
NFA that is received by the defect delete non-accepting state affecting A. Then the
equivalent minimal DFA of A′ has at most n states. This bound is tight even for
unary alphabet.

Proof. Deletion of one non-accepting state and all transitions leading into and leav-
ing this state in a minimal DFA with n ≥ 2 states does not lead to ambiguity, only
to undefined transitions. Thus, this defect results in an incomplete DFA. By adding

Measuring Defects in Finite Automata


36 3.2. Defects Occurring on the States

a rejecting sink state if it does not already exist, the upper bound for the number
of states of an equivalent minimal DFA is given by n.

For the proof of tightnes