Automata theory
From Wikipedia, the free encyclopedia
In theoretical computer science, automata theory is the study of abstract machines and problems they are able to solve. Automata theory is closely related to formal language theory as the automata are often classified by the class of formal languages they are able to recognize.
Contents |
[edit] Basic description
An automaton is a mathematical model for a finite state machine (FSM). An FSM is a machine that, given an input of symbols, 'jumps' through a series of states according to a transition function (which can be expressed as a table). In the common "Mealy" variety of FSMs, this transition function tells the automaton which state to go to next given a current state and a current symbol.
The input is read symbol by symbol, until it is consumed completely (think of it as a tape with a word written on it, that is read by a reading head of the automaton; the head moves forward over the tape, reading one symbol at a time). Once the input is depleted, the automaton is said to have stopped.
Depending on the state in which the automaton stops, it's said that the automaton either accepts or rejects the input. If it landed in an accept state, then the automaton accepts the word. If, on the other hand, it lands on a reject-accept state, the word is rejected. The set of all the words accepted by an automaton is called the language accepted by the automaton.
[edit] Formal description
[edit] Definitions
Let us first define a few concepts to make our lives easier afterwards:
- Symbol
- An arbitrary datum which has some meaning to or effect on the machine.
- Word
- A finite string formed by the concatenation of a number of symbols.
- Alphabet
- A finite set of symbols.
- Language
- A set of words, formed by symbols in a given alphabet. May or may not be infinite.
- Automaton
- Formally, an automaton is represented by the 5-tuple , where:
- Q is a finite set of states.
- ∑ is a finite set of symbols, that we will call the alphabet of the language the automaton accepts.
- δ is the transition function, that is
- This function can be extended so that instead of taking just one symbol of the alphabet, it receives a string of symbols, and returns the state in which the automaton will stay after processing the input. This can be rewritten as
- ...where ∑* is the Kleene Closure of ∑.
- The definition of δ is a little more complex for non deterministic and non finite states automata.
- S0 is the start state, that is, the state in which the automaton is when no input has been processed yet (Obviously, S0∈ Q).
- F is a set of states of Q (i.e. F⊆Q), called accept states.
With all this, we can now say that the language L accepted by a deterministic finite automaton is:
That is, the set of all words w, over the Σ alphabet, that when given as input to M will make it end up in some state from F.
[edit] Classes of Automata
The following are three kinds of finite automata
- Deterministic Finite Automata (DFA)
- Each state of an automaton of this kind has a transition for every symbol in the alphabet.
- Nondeterministic Finite Automata (NFA)
- States of an automaton of this kind may or may not have a transition for each symbol in the alphabet, or can even have multiple transitions for a symbol. The automaton accepts a word if there exists at least one path from S0 to a state in F labeled with the input word. If a transition is undefined, so that the automaton does not know how to keep on reading the input, the word is rejected.
- Nondeterministic Finite Automata, with ε transitions (FND-ε or ε-NFA)
- Besides of being able to jump to more (or none) states with any symbol, these can jump on no symbol at all. That is, if a state has transitions labeled with ε, then the NFA can be in any of the states reached by the ε-transitions, directly or through other states with ε-transitions. The set of states that can be reached by this method from a state q, is called the ε-closure of q.
It can be shown, though, that all these automata can accept the same languages. You can always construct some DFA M' that accepts the same language as a given NFA M.
[edit] Extensions of finite automata
The family of languages accepted by the above-described automata is called the family of regular languages. More powerful automata can accept more complicated languages. Such automata include:
- Pushdown automata (PDA)
- Such machines are identical to DFAs (or NFAs), except that they additionally carry memory in the form of a stack. The transition function δ will now also depend on the symbol(s) on top of the stack, and will specify how the stack is to be changed at each transition. Non-determinstic PDAs accept the context-free languages.
- Linear Bounded Automata (LBA)
- An LBA is a limited Turing machine; instead of an infinite tape, the tape has an amount of space proportional to the size of the input string. LBAs accept the context-sensitive languages.
- Turing machines
- These are the most powerful computational machines. They possess an infinite memory in the form of a tape, and a head which can read and change the tape, and move in either direction along the tape. Turing machines are equivalent to algorithms, and are the theoretical basis for modern computers. Turing machines decide recursive languages and recognize the recursively enumerable languages.
[edit] External links
[edit] References
- John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman - Introduction to Automata Theory, Languages, and Computation (2nd Edition)
- Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing. ISBN 0-534-94728-X. Part One: Automata and Languages, chapters 1–2, pp.29–122. Section 4.1: Decidable Languages, pp.152–159. Section 5.1: Undecidable Problems from Language Theory, pp.172–183.
Automata theory: formal languages and formal grammars | |||
---|---|---|---|
Chomsky hierarchy |
Grammars | Languages | Minimal automaton |
Type-0 | Unrestricted | Recursively enumerable | Turing machine |
n/a | (no common name) | Recursive | Decider |
Type-1 | Context-sensitive | Context-sensitive | Linear-bounded |
n/a | Indexed | Indexed | Nested stack |
Type-2 | Context-free | Context-free | Nondeterministic Pushdown |
n/a | Deterministic Context-free | Deterministic Context-free | Deterministic Pushdown |
Type-3 | Regular | Regular | Finite |
Each category of languages or grammars is a proper subset of the category directly above it. |