Chomsky's hierarchy

format_list_bulleted Contenido keyboard_arrow_down
ImprimirCitar
Noam Chomsky.

In linguistics the Chomsky hierarchy (occasionally also called the Chomsky–Schützenberger hierarchy) is a hierarchical classification of different types of formal grammars that generate formal languages. This hierarchy was described by Noam Chomsky in 1956.

The hierarchy

Chomsky's Hierarchy consists of four levels:

  • Type 0 grammar (without restrictions), which includes all formal grammar. These grammars generate all languages capable of being recognized by a Turing machine. Languages are known as correctly enumerable languages. Note that this category is different from that of recursive languages, whose decision can be made by a Turing machine that stops.
  • Type 1 grammar (context-sensitive grammars) generate context-sensitive languages. These grammars have rules of the way α α Aβ β → → α α γ γ β β {displaystyle alpha Abeta rightarrow alpha gamma beta} with A{displaystyle A} a non-stop and α α {displaystyle alpha }, β β {displaystyle beta } and γ γ {displaystyle gamma } terminal chains and not terminals. The chains α α {displaystyle alpha } and β β {displaystyle beta } can be empty, but γ γ {displaystyle gamma } It can't be. The rule S→ → ε ε {displaystyle Srightarrow epsilon } is allowed if S{displaystyle S} does not appear on the right side of any rule. The languages described by these grammars are exactly all those languages recognized by a deterministic Turing machine whose memory tape is coupled with a certain integer of times on the length of entry, also known as linearly-set automatons.
  • Type 2 grammar (context free grammars) generate the languages independent of the context. The rules are the way A→ → γ γ {displaystyle Arightarrow gamma } with A{displaystyle A} a non-stop and γ γ {displaystyle gamma } a chain of terminals and not terminals. These languages are those that can be recognized by a battery automaton.
  • Type 3 grammar (regular grammars) generate regular languages. These grammars are restricted to those rules that have on the left a non-stop, and on the right side a single terminal, possibly followed by a non-stop. The rule S→ → ε ε {displaystyle Srightarrow epsilon } also allowed if S{displaystyle S} does not appear on the right side of any rule. These languages are those that can be accepted by a finite automaton. This language family can also be obtained through regular expressions.

Note that the set of grammars corresponding to recursive languages is not a member of the hierarchy.

Every regular language is in turn context free, likewise a context free language is also context dependent, it is recursive and in turn, recursively enumerable. The inclusions are, however, proper, that is, there are languages in each level that are not in previous levels.

Type Language Automata Grammar production standards Examples
0 recursively enumerable (LRE) Turing Machine α α Aβ β → → δ δ {displaystyle alpha Abeta rightarrow delta}L={w日本語w{displaystyle L={wint} describes a Turing machine!{displaystyle}
1 context dependent (LSC) Linearly boarded α α Aβ β → → α α γ γ β β {displaystyle alpha Abeta rightarrow alpha gamma beta}0}}" xmlns="http://www.w3.org/1998/Math/MathML">L={anbncn日本語n▪0!{displaystyle L={a^{n}b^{n}c^{n}{n}0}}" aria-hidden="true" class="mwe-math-fallback-image-inline" src="https://wikimedia.org/api/rest_v1/media/math/render/svg/2bb63b71ffcba840f36e802aafe4c9951cf9ec38" style="vertical-align: -0.838ex; width:20.198ex; height:2.843ex;"/>
2 context independent (LLC) Authomata with battery A→ → γ γ {displaystyle Arightarrow gamma }0}}" xmlns="http://www.w3.org/1998/Math/MathML">L={anbn日本語n▪0!{displaystyle L={a^{n}b^{n}0}}" aria-hidden="true" class="mwe-math-fallback-image-inline" src="https://wikimedia.org/api/rest_v1/media/math/render/svg/dfafe0fa14e5249f492f5cbde42062ba4904d56f" style="vertical-align: -0.838ex; width:17.973ex; height:2.843ex;"/>
3 regular (LR) Automata finito A→ → a{displaystyle Arightarrow {text{a}}}γ γ {displaystyle gamma }

A→ → aB{displaystyle Arightarrow {text{a}B}

L={an日本語n≥ ≥ 0!{displaystyle L={a^{n}{n}{ngeq 0}}}
Meaning of symbols:
  • a{displaystyle {text{a}}} = terminal
  • A{displaystyle A}, B{displaystyle B} = no terminal
  • α α {displaystyle alpha }, β β {displaystyle beta }, γ γ {displaystyle gamma } = terminal chain and/or not terminal
  • α α {displaystyle alpha }, β β {displaystyle beta }, δ δ {displaystyle delta } = possibly empty chain
  • γ γ {displaystyle gamma } = non-empty chain


Recursively Enumerable Languages (of type 0)

The grammars that generate these languages can have compression rules.

The production rules are as follows:


P={(u→ → v)日本語u=xAand;u한 한 ・ ・ +;v,x,and한 한 ・ ・ ↓ ↓ ;A한 한 N!{displaystyle mathrm {P} ={(urightarrow v)healthy=xAy;uin sigma ^{+};v,x,yin Sigma ^{*};Ain N}}}

Context Dependent Languages (context sensitive, type 1)

There are no compression rules in the whole theory, except, optionally, the one that derives the axiom from the empty word.

There are rules in which a non-terminal symbol can lead to different sentential forms, depending on the symbols that appear around it.

The production rules are as follows:


P={(S→ → λ λ )or(xAand→ → xvand)日本語v한 한 ・ ・ +;x,and한 한 ・ ・ ↓ ↓ ;A한 한 N!{displaystyle mathrm {P} ={(Srightarrow lambda)o(xAyrightarrow xvy)Δvin Sigma ^{+};x,yin Sigma ^{*};Ain N}}

Context-Independent Languages (Context Free, Type 2)

Most programming languages fall into this category in terms of their syntactic form, although in reality programming languages are context-dependent, they are recognized through type 2 languages because their recognition is O(n) while those of type 1 have a recognition order O(n^3) in the worst case. For this reason, a semantic analysis is executed to recognize if the program is correct.

The production rules are as follows:


P={(S→ → λ λ )or(A→ → v)日本語v한 한 ・ ・ +;A한 한 N!{displaystyle mathrm {P} ={(Srightarrow lambda)o(Arightarrow v)Șvin Sigma ^{+};Ain N}}}}

Regular Languages (type 3)

These are the simplest languages in the Chomsky Hierarchy. They are usually expressed using regular expressions.

There are 2 types: linear on the right and linear on the left. The production rules are as follows:

Linear to the right:

P={(S→ → λ λ )or(A→ → aB)or(A→ → a)日本語a한 한 T;A,B한 한 N!{displaystyle mathrm {P} ={(Srightarrow lambda)o(Arightarrow aB)o(Arightarrow a)cultain T;A,Bin N}}

Left linear:

P={(S→ → λ λ )or(A→ → Ba)or(A→ → a)日本語a한 한 T;A,B한 한 N!{displaystyle mathrm {P} ={(Srightarrow lambda)o(Arightarrow Ba)o(Arightarrow a)

Contenido relacionado

Languages of japan

The most widely spoken language in Japan is Japanese, which is divided into several dialects and the Tokyo dialect is considered standard...

Uto-Aztec languages

The Uto-Aztecan languages or Yutonahuas form a family of Amerindian languages widely spread throughout North America, with approximately two million speakers....

Speech synthesis

Speech synthesis is the artificial production of speech. The computerized system that is used for this purpose is called a speech computer or speech...
Más resultados...
Tamaño del texto:
undoredo
format_boldformat_italicformat_underlinedstrikethrough_ssuperscriptsubscriptlink
save