Non Regular Language

Bhavik Bondade
5 min readJun 1, 2022

--

What is Non-Regular Language?

A language that cannot be defined by a regular expression is a nonregular language or an irregular language.

The existence of non-regular languages is guaranteed by the fact that the regular languages of any alphabet are countable, and we know that the set of all subsets of strings is not countable. Nevertheless, the point of establishing non-regular languages is not so much one of existence, but of illustrating that certain languages which are “computable” in some sense are not regular.

How to identify if a language is regular or non regular?

Prerequisite — Regular Expressions, Regular Grammar and Regular Languages, Pumping Lemma.
There is a well established theorem to identify if a language is regular or not, based on Pigeon Hole Principle, called as Pumping Lemma. But pumping lemma is a negativity test, i.e. if a language doesn’t satisfy pumping lemma, then we can definitely say that it is not regular, but if it satisfies, then the language may or may not be regular. Pumping Lemma is more of a mathematical proof, takes more time and it may be confusing sometimes. In exams, we need to address this problem very quickly, so based on common observations and analysis, here are some quick rules that will help.

Given an expression of non-regular language, but the value of parameter is bounded by some constant, then the language is regular (means it has kind of finite comparison).
Example 1 — L = {a^n [Tex]b^n [/Tex]| n <= 101010} is regular, because it is upper bounded and thus a finite language.

NONREGULAR LANGUAGES

  • A language that cannot be defined by a regular expression is called a nonregular language.
  • By Kleene’s theorem, a nonregular language can also not be accepted by any FA or TG. All languages are either regular or nonregular, none are both.
  • Let us first consider a simple case. Let us define the language L. L = {Λ ab aabb aaabbb aaaabbbb aaaaabbbbb …}
  • We could also define this language by the formula L = {anbn for n = 0 1 2 3 4 5 … } or for short L = {anbn}
  • We shall now show that this language is nonregular. Let us note, though, that it is a subset of many regular languages, such as a*b*, which, however, also includes such strings as aab and bb that {anbn} does not.
  • Let us be very careful to note that {anbn} is not a regular expression. It involves the symbols { } and n that are not in the alphabet of regular expressions. This is a language-defining expression that is not regular.
  • Suppose on the contrary that this language were regular. Then there would have to exist some FA that accepts it.

Pumping Lemma

Let us consider the NFA given below.

This NFA accepts among others some strings of length greater than 5 such as abbabbb, abbabbabbb etc. Those strings which are accepted by this NFA and whose length is greater than 5 have a substring which can be repeated any number of times without being rejected by the NFA.

For example the string abbabbb is accepted by the NFA and if one of its substrings bba is repeated any number of times in abbabbb, the resultant strings such as abbb (bba repeated 0 times), abbabbabbb, abbabbabbabbb etc. are also accepted by the NFA.
In general if a string w (such as abbabbb in the example above) is accepted by an NFA with n states and if its length is longer than n, then there must be a cycle in the NFA along some path from the initial state to some accepting state (such as the cycle 2–3–4–2 in the above example). Then the substring representing that cycle (bba in the example) can be repeated any number of times within the string w without being rejected by the NFA.
The following theorem which is called Pumping Lemma is based on this observation. It states that if a language is regular, then any long enough string of the language has a substring which can be repeated any number of times with the resultant strings still in the language. It is stated without a proof here.

Pumping Lemma : Suppose that a language L is regular. Then there is an FA that accepts L. Let n be the number of states of that FA. Then for any string x in L with |x| n, there are strings u, v and w which satisfy the following relationships:
x = uvw
|uv| n
|v| > 0 and
for every integer m 0, uvmw L.

Note that Pumping Lemma gives a necessity for regular languages and that it is not a sufficiency, that is, even if there is an integer n that satisfies the conditions of Pumping Lemma, the language is not necessarily regular. Thus Pumping Lemma can not be used to prove the regularity of a language. It can only show that a language is nonregular.

Example : As an example to illustrate how Pumping Lemma might be used to prove that a language is nonregular, let us prove that the language L = akbk is nonregular, where k is a natural number.

Suppose that L is regular and let n be the number of states of an FA that accepts L. Consider a string x = anbn for that n. Then there must be strings u, v, and w such that

x = uvw,
|uv| n
|v| > 0 , and
for every m 0, uvmw L.

Since |v| > 0 , v has at least one symbol. Also since |uv| n, v = ap, for some p > 0 ,
Let us now consider the string uvmw for m = 2.
Then uv2w = an-pa2pbn = an+pbn . Since p > 0 , n + p n . Hence an+pbn can not be in the language L represented by akbk .
This violates the condition that for every m 0, uvmw L. Hence L is not a regular language.

--

--