1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 1 Mälardalen University 2005
2 Content Mathematical Preliminaries Countable Sets (Uppräkneliga mängder) Uncountable sets (Överuppräkneliga mängder) Languages, Alphabets and Strings Strings & String Operations Languages & Language Operations Regular Expressions
3 Lecturer & Examiner Gordana Dodig-Crnkovic
4 Teaching Assistant Andreas Ermedahl
5 Course Home Page: kurser/cd5560/05_04 (visit the home page regularly!)
6 Why Theory of Computation? 1. A real computer can be modelled by a mathematical object: a theoretical computer. 2. A formal language is a set of strings, and can represent a computational problem. 3. A formal language can be described in many different ways that ultimately prove to be equivalent. 4. Simulation: the relative power of computing models can be assessed by the ease with which one model can simulate another.
7 5. Robustness of a computational model. 6. The Church-Turing thesis: anything that can be computed can be computed by a Turing machine. 7. Nondeterminism: languages can be described by the existence or nonexistence of computational paths. 8. Unsolvability: for some computational problems there is no corresponding algorithm that will unerringly solve them.
8 Practical Applications 1. Efficient compilation of computer languages 2. String searching 3. Identifying the limits; recognizing difficult problems 4. Applications to other areas: circuit verification; economics and game theory (finite automata as strategy models in decision-making); theoretical biology (L-systems as models of organism growth); computer graphics (L-systems); linguistics (modeling by grammars)
9 History Euclid's attempt to axiomatize geometry (Archimedes realized, during his own efforts to define the area of a planar figure, that Euclid's attempt had failed and that additional postulates were needed). Leibniz's dream of a symbolic logic. De Morgan, Boole, Frege, Russell, Whitehead: mathematics as a branch of symbolic logic!
10 Hilbert's program; first programming languages; 1931 Gödel's incompleteness theorem; 1936 Turing machine (shown to be equivalent to recursive functions), commonly accepted: the TM as the ultimate computer; 1950 automata; 1956 language/automata hierarchy
11 Hilbert's program: every mathematical truth should be expressible in a formal language consisting of a fixed alphabet of admissible symbols and explicit rules of syntax for combining those symbols into meaningful words and sentences.
12 Turing used a Universal Turing machine (UTM) to prove an even more powerful incompleteness theorem, more powerful because it destroyed not one but two of Hilbert's dreams: 1. Finding a finite list of axioms from which all mathematical truths can be deduced. 2. Solving the Entscheidungsproblem ("decision problem") by producing a "fully automatic procedure" for deciding whether a given proposition (sentence) is true or false.
13 Mathematical Preliminaries
14 Sets Functions Relations Graphs Proof Techniques
15 SETS A set is a collection of elements. We write $x \in A$ when $x$ is an element of the set $A$, and $x \notin A$ when it is not.
16 Set Representations Finite set: $C = \{a, b, c, d, e, f, g, h, i, j, k\}$, or $C = \{a, b, \ldots, k\}$. Infinite set: $S = \{2, 4, 6, \ldots\}$, or $S = \{ j : j > 0 \text{ and } j = 2k \text{ for some } k > 0 \}$, or $S = \{ j : j \text{ is positive and even} \}$.
17 Universal Set: all possible elements. $U = \{1, \ldots, 10\}$, $A = \{1, 2, 3, 4, 5\}$. (Venn diagram: $A$ drawn inside $U$.)
18 Set Operations $A = \{1, 2, 3\}$, $B = \{2, 3, 4, 5\}$. Union: $A \cup B = \{1, 2, 3, 4, 5\}$. Intersection: $A \cap B = \{2, 3\}$. Difference: $A - B = \{1\}$, $B - A = \{4, 5\}$. (Venn diagram: $A$ and $B$ inside $U$, with the region $A - B$ marked.)
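19 Complement Universal set $= \{1, \ldots, 7\}$, $A = \{1, 2, 3\}$, $\overline{A} = \{4, 5, 6, 7\}$. Double complement: $\overline{\overline{A}} = A$. (Venn diagram: $\overline{A}$ is the region outside $A$.)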
20 $\overline{\{\text{even integers}\}} = \{\text{odd integers}\}$ (Diagram: the integers split into even and odd.)
21 DeMorgan's Laws $\overline{A \cup B} = \overline{A} \cap \overline{B}$ and $\overline{A \cap B} = \overline{A} \cup \overline{B}$
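22 Empty, Null Set: $\emptyset = \{\,\}$. $S \cup \emptyset = S$, $S \cap \emptyset = \emptyset$, $S - \emptyset = S$, $\emptyset - S = \emptyset$, $\overline{\emptyset} =$ Universal Set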
23 Subset $A = \{1, 2, 3\}$, $B = \{1, 2, 3, 4, 5\}$: $A \subseteq B$. Proper subset: $A \subset B$. (Venn diagram: $A$ inside $B$ inside $U$.)
24 Disjoint Sets $A = \{1, 2, 3\}$, $B = \{5, 6\}$: $A \cap B = \emptyset$. (Venn diagram: $A$ and $B$ drawn apart inside $U$.)
25 Set Cardinality For finite sets A = { 2, 5, 7 } |A| = 3
26 Powersets A powerset is a set of sets. Powerset of $S$ = the set of all the subsets of $S$. $S = \{a, b, c\}$, $2^S = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$. Observation: $|2^S| = 2^{|S|}$ ($8 = 2^3$)
27 Cartesian Product $A = \{2, 4\}$, $B = \{2, 3, 5\}$. $A \times B = \{(2, 2), (2, 3), (2, 5), (4, 2), (4, 3), (4, 5)\}$. $|A \times B| = |A| \, |B|$. Generalizes to more than two sets: $A \times B \times \cdots \times Z$
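The set operations, the powerset and the Cartesian product from the preceding slides can be tried out directly. The following is only a minimal sketch using Python's built-in `set` type; the helper name `powerset` is an illustrative choice and does not come from the slides.

```python
from itertools import combinations, product

A = {1, 2, 3}
B = {2, 3, 4, 5}

union = A | B            # A ∪ B
intersection = A & B     # A ∩ B
difference = A - B       # A - B

def powerset(s):
    """Return all subsets of s as a list of frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = {"a", "b", "c"}
subsets = powerset(S)    # 2**3 == 8 subsets

# Cartesian product of the two sets used on the Cartesian Product slide
pairs = set(product({2, 4}, {2, 3, 5}))
```

The counts `len(subsets)` and `len(pairs)` correspond to the observations $|2^S| = 2^{|S|}$ and $|A \times B| = |A| \, |B|$.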
28 PROOF TECHNIQUES Proof by construction Proof by induction Proof by contradiction
29 Construction We define a graph to be k-regular if every node in the graph has degree k. Theorem. For each even number $n > 2$ there exists a 3-regular graph with $n$ nodes. (Figures: example graphs for $n = 4$ and $n = 6$.)
30 Proof by Construction Construct a graph $G = (V, E)$ with $n$ nodes, $n > 2$ even. $V = \{0, 1, \ldots, n-1\}$. $E = \{\{i, i+1\} : 0 \le i \le n-2\} \cup \{\{n-1, 0\}\}$ (*) $\cup\ \{\{i, i + n/2\} : 0 \le i \le n/2 - 1\}$ (**). The nodes of this graph can be written consecutively around a circle. (*) edges between adjacent pairs of nodes; (**) edges between nodes on opposite sides. Every node then has two neighbours on the circle and one opposite neighbour, so its degree is 3. END OF PROOF
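The construction can be checked mechanically for small n. The sketch below is one possible encoding, assuming edges are represented as two-element frozensets; the function names are illustrative and not part of the lecture.

```python
def three_regular_graph(n):
    """Build the edge set of the construction for even n > 2."""
    if n <= 2 or n % 2 != 0:
        raise ValueError("n must be an even number greater than 2")
    edges = set()
    for i in range(n):
        # (*) edges between adjacent nodes on the circle, including {n-1, 0}
        edges.add(frozenset({i, (i + 1) % n}))
    for i in range(n // 2):
        # (**) edges between nodes on opposite sides of the circle
        edges.add(frozenset({i, i + n // 2}))
    return edges

def degrees(n, edges):
    """Return the degree of every node 0..n-1."""
    deg = [0] * n
    for edge in edges:
        for node in edge:
            deg[node] += 1
    return deg

for n in (4, 6, 10):
    assert all(d == 3 for d in degrees(n, three_regular_graph(n)))
```

A check like this does not replace the proof; it only confirms the construction on the tested values of n.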
31 Induction We have statements $P_1, P_2, P_3, \ldots$ If we know, for some $k$, that $P_1, P_2, \ldots, P_k$ are true, and, for any $n \ge k$, that $P_1, P_2, \ldots, P_n$ imply $P_{n+1}$, then every $P_i$ is true.
32 Proof by Induction Inductive basis: find $P_1, P_2, \ldots, P_k$ which are true. Inductive hypothesis: assume $P_1, P_2, \ldots, P_n$ are true, for any $n \ge k$. Inductive step: show that $P_{n+1}$ is true.
33 Example Theorem: A binary tree of height $n$ has at most $2^n$ leaves. Proof: let $L(i)$ be the number of leaves at level $i$. (Figure: a full binary tree with $L(0) = 1$ at the root and $L(3) = 8$ at level 3.)
34 We want to show: $L(i) \le 2^i$. Inductive basis: $L(0) = 1$ (the root node). Inductive hypothesis: assume $L(i) \le 2^i$ for all $i = 0, 1, \ldots, n$. Induction step: we need to show that $L(n+1) \le 2^{n+1}$.
35 Induction Step Hypothesis: $L(n) \le 2^n$. (Figure: levels $n$ and $n+1$ of the tree.)
36 Induction Step Hypothesis: $L(n) \le 2^n$. Each node at level $n$ has at most two children at level $n+1$, so $L(n+1) \le 2 \cdot L(n) \le 2 \cdot 2^n = 2^{n+1}$. END OF PROOF
37 Proof by induction: the cardinality of the powerset. Claim: a set with $n$ elements has $2^n$ subsets. Check: the empty set {} (with zero elements) has only one subset: {}. The set {a} (with one element) has two subsets: {} and {a}.
38 Claim: a set with $n$ elements has $2^n$ subsets. Check (continued): the set {a, b} (with two elements) has four subsets: {}, {a}, {b} and {a, b}. The set {a, b, c} (with three elements) has eight subsets: {}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c} and {a, b, c}. The claim holds so far.
39 Base step The simplest case is a set with zero elements (there is only one such set), which has $2^0 = 1$ subsets.
40 Induction step Assume that the claim holds for all sets with $k$ elements, i.e. assume that every set with $k$ elements has $2^k$ subsets. Show that the claim then also holds for all sets with $k+1$ elements, i.e. show that every set with $k+1$ elements has $2^{k+1}$ subsets.
41 We consider an arbitrary set with $k+1$ elements. The subsets of this set can be divided into two kinds. Subsets that do not contain element number $k+1$: such a subset is a subset of the set of the first $k$ elements, and a set with $k$ elements has (by the assumption) $2^k$ subsets. Subsets that contain element number $k+1$: such a subset can be created by taking a subset that does not contain element number $k+1$ and adding this element. Since there are $2^k$ subsets without element number $k+1$, there are also $2^k$ subsets with this element. In total there are $2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$ subsets of the set considered. END OF PROOF (Example from the book: Diskret matematik och diskreta modeller, K. Eriksson, H. Gavel)
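The induction step translates almost word for word into a recursive procedure. The following sketch assumes the set is given as a list of distinct elements; the function name `subsets` is an illustrative choice.

```python
def subsets(elements):
    """All subsets of a list of distinct elements, built as in the proof."""
    if not elements:
        return [[]]                       # base step: only the empty set
    *first_k, last = elements
    without_last = subsets(first_k)       # subsets that do not contain `last`
    with_last = [s + [last] for s in without_last]  # add `last` to each one
    return without_last + with_last       # 2^k + 2^k = 2^(k+1) subsets

for n in range(6):
    assert len(subsets(list(range(n)))) == 2 ** n
```

The two halves `without_last` and `with_last` are exactly the two kinds of subsets distinguished in the proof.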
42 Proof by Contradiction We want to prove that a statement P is true: we assume that P is false; then we arrive at a conclusion that contradicts our assumptions; therefore, statement P must be true.
43 Example Theorem: $\sqrt{2}$ is not rational. Proof: assume, for contradiction, that it is rational: $\sqrt{2} = n/m$, where $n$ and $m$ have no common factors. We will show that this is impossible.
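44 $\sqrt{2} = n/m \Rightarrow 2m^2 = n^2$. Therefore $n^2$ is even $\Rightarrow$ $n$ is even $\Rightarrow$ $n = 2k$. Then $2m^2 = 4k^2 \Rightarrow m^2 = 2k^2 \Rightarrow$ $m$ is even $\Rightarrow$ $m = 2p$. Thus $m$ and $n$ have the common factor 2. Contradiction! END OF PROOF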
45 Countable Sets (Uppräkneliga mängder)
46 Infinite sets are either Countable or Uncountable
47 Countable set There is a one to one correspondence between elements of the set and natural numbers
48 We started with the natural numbers, then added infinitely many negative whole numbers to get the integers, then added infinitely many rational fractions to get the rationals, then added infinitely many irrational numbers to get the reals. Each infinite addition seems to increase cardinality: $|N| < |Z| < |Q| < |R|$. But is this true? NO!
49 Example The set of integers is countable. (Table: the integers listed in correspondence with the natural numbers.)
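One possible correspondence between the integers and the natural numbers can be written as an enumeration. The ordering 0, 1, -1, 2, -2, … used here is an assumption made for illustration; the correspondence shown on the original slide is not recoverable from the text.

```python
from itertools import islice

def integers():
    """Yield 0, 1, -1, 2, -2, ... so that every integer appears exactly once."""
    yield 0
    n = 1
    while True:
        yield n
        yield -n
        n += 1

first = list(islice(integers(), 7))
```

Every integer k is reached after finitely many steps (at position 2|k| or earlier), which is what countability requires.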
50 Example The set of rational numbers is countable Positive Rational numbers:
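51 Naive Idea Rational numbers: $\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots$ Correspondence with the natural numbers: $1, 2, 3, \ldots$ Doesn't work! We will never count numbers with numerator 2: $\frac{2}{1}, \frac{2}{2}, \frac{2}{3}, \ldots$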
52 Better Approach... Rows: constant numerator (täljare) Columns: constant denominator
54 We proved: the set of rational numbers is countable by describing an enumeration procedure
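An enumeration procedure for the positive rationals can be sketched as follows. It assumes the table with constant numerator per row and constant denominator per column is traversed diagonal by diagonal (numerator + denominator constant), and it skips values that were already produced; both choices are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Yield every positive rational once, walking the diagonals n + d = s."""
    seen = set()
    s = 2
    while True:
        for numerator in range(1, s):
            q = Fraction(numerator, s - numerator)
            if q not in seen:           # skip duplicates such as 2/2 == 1/1
                seen.add(q)
                yield q
        s += 1

first = list(islice(positive_rationals(), 6))
```

Unlike the naive row-by-row idea, each diagonal is finite, so every fraction n/d is produced after finitely many steps.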
55 Definition Let $S$ be a set of strings. An enumeration procedure for $S$ is an algorithm that generates all strings of $S$ one by one.
56 Observation A set is countable if there is an enumeration procedure for it.
57 Example The set of all finite strings over a given alphabet is countable. Proof: we will describe the enumeration procedure.
58 Naive procedure: produce the strings in lexicographic order (for an alphabet containing a and b: a, aa, aaa, aaaa, …). Doesn't work! Strings starting with b will never be produced.
59 Better procedure (Proper Order) 1. Produce all strings of length 1 2. Produce all strings of length 2 3. Produce all strings of length 3 4. Produce all strings of length 4 …
60 Produce strings in Proper Order: first the strings of length 1, then those of length 2, then those of length 3, …
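The proper order can be written as a generator. This is a sketch only; the two-letter alphabet in the usage line is an assumption, since the alphabet of the original slide was lost.

```python
from itertools import count, islice, product

def proper_order(alphabet):
    """Yield all nonempty finite strings: by length, then alphabetically."""
    letters = sorted(alphabet)
    for length in count(1):
        for combo in product(letters, repeat=length):
            yield "".join(combo)

first = list(islice(proper_order("ab"), 8))
```

Because there are only finitely many strings of each length, every string, including those starting with b, appears after finitely many steps.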
61 Theorem: The set of all finite strings is countable. Proof: find an enumeration procedure for the set of finite strings. Any finite string can be encoded with a binary string of 0's and 1's.
62 Produce strings in Proper Order (length 1, length 2, length 3, …) and number them 1, 2, 3, … as they are produced. String = program ↔ natural number.
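63 PROGRAM = STRING (syntactic way); PROGRAM = FUNCTION (semantic way). (Diagram: a PROGRAM seen as a string, which corresponds to a natural number n; and a PROGRAM seen as a box that takes a natural number n as input and produces a natural number as output.)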
64 Uncountable Sets (Överuppräkneliga mängder)
65 Definition A set is uncountable if it is not countable.
66 Theorem: The set of all infinite strings is uncountable. Proof (by contradiction): we assume we have an enumeration procedure for the set of infinite strings.
67 Cantor's diagonal argument (Table: the enumerated infinite strings, one per row, each with its encoding.)
68 Cantor's diagonal argument We can construct a new string that is missing from our enumeration! Conclusion: the set of all infinite strings is uncountable!
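The diagonal construction can be illustrated on a finite corner of such a table. The rows below are hypothetical sample data, and a finite table can of course only illustrate the idea, not prove the theorem.

```python
def diagonal_complement(rows):
    """Flip the i-th bit of the i-th row; rows are 0/1 strings of equal length."""
    return "".join("1" if row[i] == "0" else "0" for i, row in enumerate(rows))

# Hypothetical first four bits of the first four enumerated strings
rows = ["0101", "1110", "0011", "1000"]
new = diagonal_complement(rows)

# The new string differs from row i in position i, so it equals none of them
assert all(new[i] != rows[i][i] for i in range(len(rows)))
assert new not in rows
```

In the infinite case the same rule yields an infinite string that differs from the i-th enumerated string in its i-th bit, for every i.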
69 Conclusion There are some integer functions that cannot be described by finite strings (programs/algorithms). An infinite string can be seen as a FUNCTION (the n-th output is the n-th bit of the string).
70 Example of uncountable infinite sets Theorem: Let $S$ be an infinite countable set. The powerset $2^S$ of $S$ is uncountable.
71 Proof Since $S$ is countable, we can write $S = \{s_1, s_2, s_3, \ldots\}$.
72 Elements of the powerset are subsets of $S$, so they have the form $\{s_i, s_j, \ldots\}$.
73 We encode each element of the powerset with a binary string of 0's and 1's: bit $i$ is 1 if $s_i$ belongs to the element and 0 otherwise. (Table: powerset elements and their encodings.)
74 Let's assume (for contradiction) that the powerset is countable. Then we can enumerate the elements of the powerset: $t_1, t_2, t_3, \ldots$
75 (Table: the enumerated powerset elements, one per row, each with its binary encoding.)
76 Take the powerset element whose bits are the complements of the bits on the diagonal.
77 New element: the binary complement of the diagonal. (Table of encodings with the diagonal marked.)
78 The new element must be some element $t_i$ of the enumeration. However, that's impossible: by the definition of the new element, the $i$-th bit of $t_i$ would have to be the complement of itself. Contradiction!
79 Since we have a contradiction: the powerset $2^S$ of $S$ is uncountable. END OF PROOF
80 An Application: Languages Example: alphabet $\Sigma$. The set of all finite strings over $\Sigma$, $\Sigma^*$, is infinite and countable. The powerset of $\Sigma^*$ contains all languages over $\Sigma$ and is uncountably infinite.
81 Finite strings (algorithms): countable Languages (power set of strings): uncountable There are infinitely many more languages than finite strings.
82 Conclusion There are some languages that cannot be described by finite strings (algorithms).
83 Cardinal numbers Cardinal numbers measure the size of sets. The cardinal number of a finite set is simply the number of elements in the set. Two sets are equinumerous if the elements of one can be paired off exhaustively with the elements of the other, i.e. there is a bijection between them. This notion of size can be extended to infinite sets. For example, the set of positive integers and the set of integers are equinumerous.
84 Cardinal numbers In contrast, the real numbers cannot all be paired off with the integers in this way. The set of real numbers has greater cardinality than the set of integers. Cardinal numbers can be introduced in such a way that two sets have the same cardinal number if and only if they are equinumerous. For example, the cardinal number of the integers is called $\aleph_0$ (aleph 0; aleph is the first letter of the Hebrew alphabet). These infinite cardinal numbers are called transfinite cardinal numbers.
85 More about infinities… In the late 19th century Georg Cantor developed set theory, the logical foundation of mathematics. Cantor introduced the concept of transfinite cardinal numbers. He called the simplest, "smallest", infinity $\aleph_0$.
86 More about infinities… $\aleph_0$ is the cardinal number of a countably infinite set (for example the set of all integers). Cantor called the cardinal number of the set of points on a line, and also of the points in a plane and in a solid, $\aleph_1$. Were there larger infinities?
87 Yes! Cantor could show that the number of functions on a line is even more infinite than the number of points on the line, and he called that cardinal number $\aleph_2$. Cantor found that one can calculate with cardinal numbers just as with ordinary numbers, but the rules of arithmetic turn out to be rather monotonous: $\aleph_0 + 1 = \aleph_0$, $\aleph_0 + \aleph_0 = \aleph_0$, $\aleph_0 \cdot \aleph_0 = \aleph_0$.
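88 But with exponentiation something happens: $\aleph_0^{\aleph_0}$ ($\aleph_0$ to the power $\aleph_0$) $= \aleph_1$. More generally, $2^{\aleph_n}$ (2 to the power $\aleph_n$) $= \aleph_{n+1}$. (These equalities assume the generalized continuum hypothesis; what holds unconditionally is $2^{\aleph_n} > \aleph_n$.) This means that there are infinitely many infinities, each larger than the one before!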
89 But was it really certain that there is no infinity between the countable one and that of the points on a line? Cantor tried to prove the so-called continuum hypothesis. Cantor: two different infinities, $\aleph_0$ and $\aleph_1$. Continuum Hypothesis: $\aleph_0 < \aleph_1 = 2^{\aleph_0}$. See also:
90 Languages, Alphabets and Strings
91 Languages A language is a set of strings defined over an alphabet. A string is a sequence of letters. An alphabet is a set of symbols.
92 Alphabets and Strings We will use small alphabets, for example $\Sigma = \{a, b\}$. Strings over this alphabet: $a$, $ab$, $abba$, $bbbaaa$, …
93 Operations on Strings
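94 String Operations $w = a_1 a_2 \cdots a_n$, e.g. $x = abba$; $v = b_1 b_2 \cdots b_m$, e.g. $y = bbbaaa$. Concatenation (sammanfogning): $wv = a_1 a_2 \cdots a_n b_1 b_2 \cdots b_m$, e.g. $xy = abbabbbaaa$.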
95 Reverse (reversering) For $w = a_1 a_2 \cdots a_n$: $w^R = a_n \cdots a_2 a_1$. Example: longest odd-length palindrome in a natural language: saippuakauppias (Finnish: soap salesman)
96 String Length Length: $|w| = n$ for $w = a_1 a_2 \cdots a_n$. Examples: $|abba| = 4$, $|aa| = 2$, $|a| = 1$.
97 Recursive Definition of Length For any letter $a$: $|a| = 1$. For any string $wa$: $|wa| = |w| + 1$. Example: $|abba| = |abb| + 1 = |ab| + 1 + 1 = |a| + 1 + 1 + 1 = 4$.
98 Length of Concatenation $|uv| = |u| + |v|$. Example: $u = aab$, $|u| = 3$; $v = abaab$, $|v| = 5$; $|uv| = |aababaab| = 8 = |u| + |v|$.
99 Proof of Concatenation Length Claim: $|uv| = |u| + |v|$. Proof: by induction on the length $|v|$. Induction basis: $|v| = 1$. From the definition of length: $|uv| = |u| + 1 = |u| + |v|$.
100 Inductive hypothesis: $|uv| = |u| + |v|$ for $|v| = 1, 2, \ldots, n$. Inductive step: we will prove it for $|v| = n + 1$.
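101 Inductive Step Write $v = wa$, where $|w| = n$ and $|a| = 1$. From the definition of length: $|uv| = |uwa| = |uw| + 1$ and $|wa| = |w| + 1$. From the inductive hypothesis: $|uw| = |u| + |w|$. Thus: $|uv| = |u| + |w| + 1 = |u| + |wa| = |u| + |v|$. END OF PROOF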
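The recursive definition of length and the concatenation claim can be spot-checked on concrete strings. This sketch adds the empty string as a base case with length 0, which is an assumption beyond the letter-based definition; the function name `length` is illustrative.

```python
def length(w):
    """Recursive length: the empty string has length 0, |wa| = |w| + 1."""
    if w == "":
        return 0
    return length(w[:-1]) + 1    # strip the last letter a from wa

u, v = "aab", "abaab"
assert length(u) == 3
assert length(v) == 5
assert length(u + v) == length(u) + length(v)    # |uv| = |u| + |v|
```

Checking examples does not prove the claim for all strings; that is what the induction on $|v|$ does.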
102 Empty String A string with no letters: $\lambda$ (also denoted as $\varepsilon$). Observations: $|\lambda| = 0$, $\lambda w = w \lambda = w$.
103 Substring (delsträng) Substring of a string: a subsequence of consecutive characters. (Table: example strings with their substrings.)
104 Prefix and Suffix If $w = uv$, then $u$ is a prefix and $v$ is a suffix of $w$. (Table: the prefixes and suffixes of an example string.)
105 Repetition Definition: $w^n = \underbrace{w w \cdots w}_{n}$ (string $w$ repeated $n$ times). Example: $(abba)^2 = abbaabba$.
106 The * (Kleene star) Operation $\Sigma^*$: the set of all possible strings from alphabet $\Sigma$ [Kleene is pronounced "clay-knee"]
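107 The + Operation $\Sigma^+$: the set of all possible strings from alphabet $\Sigma$ except $\lambda$. Example: $\Sigma = \{a, b\}$, $\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$, $\Sigma^+ = \Sigma^* - \{\lambda\}$.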
108 Example $\Sigma = \{$oj, fy, usch$\}$. $\Sigma^* = \{\lambda,$ oj, fy, usch, ojoj, fyfy, uschusch, ojfy, ojusch, fyoj, … $\}$. $\Sigma^+ = \{$oj, fy, usch, ojoj, fyfy, uschusch, ojfy, ojusch, … $\}$.
109 Operations on Languages
110 Language A language is any subset of $\Sigma^*$. Example: $\Sigma = \{a, b\}$, $\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, \ldots\}$. Languages: $\{a, aa, aab\}$, $\{\lambda, abba, baba, aa, ab, aaaaaa\}$.
111 Example An infinite language
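112 Operations on Languages The usual set operations (union, intersection, difference) apply to languages. Complement: $\overline{L} = \Sigma^* - L$, where $\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$.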
113 Reverse Definition: $L^R = \{ w^R : w \in L \}$. Example: $\{ab, aab, baba\}^R = \{ba, baa, abab\}$.
114 Concatenation Definition: $L_1 L_2 = \{ xy : x \in L_1, y \in L_2 \}$. Example: $\{a, ab\}\{b\} = \{ab, abb\}$.
115 Repeat Definition: $L^n = \underbrace{L L \cdots L}_{n}$. Special case: $L^0 = \{\lambda\}$.
116 Example
117 Star-Closure (Kleene *) Definition: $L^* = L^0 \cup L^1 \cup L^2 \cup \cdots$ Example: $\{a\}^* = \{\lambda, a, aa, aaa, \ldots\}$.
118 Positive Closure Definition: $L^+ = L^1 \cup L^2 \cup \cdots = L L^*$
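For finite languages these operations can be computed directly. The sketch below represents a language as a Python set of strings, with the empty string `""` playing the role of the empty word; since the star-closure is infinite in general, only the finite part up to a chosen power is built. All function names and the sample language are illustrative assumptions.

```python
def concat(L1, L2):
    """Concatenation: every x from L1 followed by every y from L2."""
    return {x + y for x in L1 for y in L2}

def power(L, n):
    """L^n, with the special case L^0 = {empty string}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, n):
    """Finite part of the star-closure: L^0 ∪ L^1 ∪ ... ∪ L^n."""
    result = set()
    for k in range(n + 1):
        result |= power(L, k)
    return result

def reverse(L):
    """Reverse of a language: reverse every string in it."""
    return {w[::-1] for w in L}

L = {"a", "ab"}
squared = power(L, 2)
```

The positive closure restricted to the same bound is obtained by starting the union at k = 1 instead of k = 0.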
119 Regular Expressions
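120 Regular Expressions: Recursive Definition Primitive regular expressions: $\emptyset$, $\lambda$, and $a$ for each $a \in \Sigma$. Given regular expressions $r_1$ and $r_2$, the expressions $r_1 + r_2$, $r_1 \cdot r_2$, $r_1^*$ and $(r_1)$ are regular expressions.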
121 Examples A regular expression: Not a regular expression:
122 Building Regular Expressions Zero or more. a* means "zero or more a's." To say "zero or more ab's," that is, {λ, ab, abab, ababab, ...}, you need to say (ab)*. ab* denotes {a, ab, abb, abbb, abbbb, ...}.
123 Building Regular Expressions One or more. Since a* means "zero or more a's", you can use aa* (or equivalently, a*a) to mean "one or more a's." Similarly, to describe "one or more ab's," that is, {ab, abab, ababab, ...}, you can use ab(ab)*.
124 Building Regular Expressions Any string at all. To describe any string at all (with Σ = {a, b, c}), you can use (a+b+c)*. Any nonempty string. This can be written as any character from Σ followed by any string at all: (a+b+c)(a+b+c)*.
125 Building Regular Expressions Any string not containing.... To describe any string at all that doesn't contain an a (with Σ = {a, b, c}), you can use (b+c)*. Any string containing exactly one... To describe any string that contains exactly one a, put "any string not containing an a" on either side of the a, like this: (b+c)*a(b+c)*.
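These building blocks can be tried out with Python's `re` module. Note the change of notation, which is an adaptation and not part of the lecture: the union written + in the slides becomes `|` in Python, and `re.fullmatch` is used so that the whole string must be described by the expression. The helper name `matches` is illustrative.

```python
import re

def matches(pattern, s):
    """True if the whole string s is described by the pattern."""
    return re.fullmatch(pattern, s) is not None

ZERO_OR_MORE_AB = r"(ab)*"              # (ab)*
A_THEN_BS = r"ab*"                      # ab*
ONE_OR_MORE_AB = r"ab(ab)*"             # ab(ab)*
ANY_STRING = r"(a|b|c)*"                # (a+b+c)*
NO_A = r"(b|c)*"                        # (b+c)*
EXACTLY_ONE_A = r"(b|c)*a(b|c)*"        # (b+c)*a(b+c)*

assert matches(ZERO_OR_MORE_AB, "")          # zero repetitions are allowed
assert matches(A_THEN_BS, "abbb")
assert not matches(A_THEN_BS, "abab")        # ab* is not (ab)*
assert not matches(ONE_OR_MORE_AB, "")
assert matches(EXACTLY_ONE_A, "bcabc")
assert not matches(EXACTLY_ONE_A, "baab")
```

Python's regular expressions offer more than the formal regular expressions defined here; the sketch uses only concatenation, union and star.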
126 Languages of Regular Expressions $L(r)$: the language of regular expression $r$. Example: $L((ab)^*) = \{\lambda, ab, abab, ababab, \ldots\}$
127 Definition For primitive regular expressions: $L(\emptyset) = \emptyset$, $L(\lambda) = \{\lambda\}$, $L(a) = \{a\}$.
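128 Definition (continued) For regular expressions $r_1$ and $r_2$: $L(r_1 + r_2) = L(r_1) \cup L(r_2)$, $L(r_1 \cdot r_2) = L(r_1) L(r_2)$, $L(r_1^*) = (L(r_1))^*$, $L((r_1)) = L(r_1)$.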
129 Example Regular expression:
130 Example Regular expression
131 Example Regular expression
132 Example Regular expression $r = (0+1)^* 00 (0+1)^*$. $L(r) = \{$ all strings with at least two consecutive 0 $\}$
133 Example Regular expression $r = (1 + 01)^* (0 + \lambda)$ (consists of repeating 1's and 01's). $L(r) = \{$ all strings without two consecutive 0 $\}$
134 Example $L(r) = \{$ all strings without two consecutive 0 $\}$. Equivalent solution: $r = (1^* 0 1 1^*)^* (0 + \lambda) + 1^* (0 + \lambda)$. (In order not to get 00 in a string, after each 0 there must be a 1, which means that strings of the form $1 \cdots 1 0 1 \cdots 1$ are repeated. That is the first parenthesis. To take into account strings that end with 0, and those consisting solely of 1's, the rest of the expression is added.)
135 Equivalent Regular Expressions Definition: Regular expressions $r_1$ and $r_2$ are equivalent if $L(r_1) = L(r_2)$.
136 In order to see that both regular expressions describe the same language, you can even run the a5 program.
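137 Example $L(r_1) = L(r_2) = \{$ all strings without two consecutive 0 $\}$: $r_1 = (1 + 01)^* (0 + \lambda)$ and $r_2 = (1^* 0 1 1^*)^* (0 + \lambda) + 1^* (0 + \lambda)$ are equivalent regular expressions.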
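Equivalence of two regular expressions can be tested by brute force up to a bounded string length. The two patterns below are assumed reconstructions of the expressions for "no two consecutive 0", written in Python's `re` syntax (`|` for +, `0?` for the choice between 0 and the empty string); they are not taken verbatim from the slides, and this sketch is unrelated to the course's a5 program.

```python
import re
from itertools import product

R1 = r"(1|01)*0?"               # assumed: (1 + 01)*(0 + λ)
R2 = r"(1*011*)*0?|1*0?"        # assumed: (1*011*)*(0 + λ) + 1*(0 + λ)

def language_up_to(pattern, max_len):
    """All binary strings of length <= max_len that the pattern describes."""
    return {
        "".join(bits)
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
        if re.fullmatch(pattern, "".join(bits))
    }

# Reference set: binary strings of length <= 8 without the substring 00
no_double_zero = {
    "".join(bits)
    for n in range(9)
    for bits in product("01", repeat=n)
    if "00" not in "".join(bits)
}

assert language_up_to(R1, 8) == language_up_to(R2, 8) == no_double_zero
```

Agreement on all strings up to length 8 is evidence, not a proof, that the two expressions describe the same language.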