Forth


Forth or FORTH is a programming language and programming environment for computers devised by Charles H. Moore between 1965 and 1970 at the National Radio Astronomy Observatory at Kitt Peak, Arizona.

Its name is a contraction of the English word fourth, since it was conceived for the fourth generation of computers. The first edition of the language was prepared for an IBM 1130, which only allowed names of at most five letters, so the name stuck as FORTH. Forth is sometimes spelled in all capitals, following customary usage during the early years, although the name is not an acronym.

Initially designed for a very specific application, astronomy (calculation of trajectories of bodies in orbit, chromatography, analysis of emission spectra), it has evolved to be applicable to almost all other fields, whether or not they are related to that branch of science (probability calculations, databases, statistical and even financial analysis).

Subsequently, a program for automatic and continuous data acquisition written in this language discovered at least half of the interstellar clusters known today.

Forth is a procedural, structured, imperative, reflective, stack-based programming language without type checking. Forth offers both interactive execution of commands (making it convenient as a shell for systems lacking a more formal operating system) and the ability to compile sequences of commands for later execution. Some Forth implementations (usually early versions or those written to be extremely portable) compile threaded code, but many implementations today generate optimized machine code like other language compilers.

One of its important features is the use of a data stack to pass the arguments between the words, which are the constituents of a Forth program.

Although not as popular as other programming systems, Forth has enough support to keep various vendors and language contractors in business. Forth is currently used in boot loaders such as Open Firmware, space applications, and other embedded systems. An implementation of Forth by the GNU Project is actively maintained, with its last release being in November 2008. The 1994 standard is currently undergoing revision, tentatively named Forth 200x.

Overview

A Forth environment combines the compiler with an interactive shell. The user interactively defines and runs subroutines, or "words", in a virtual machine similar to the runtime environment. Words can be tested, redefined, and debugged as the source code is entered, without recompiling or restarting the entire program. All syntactic elements, including variables and basic operators, appear as such procedures (in the form of words). Even if a particular word is optimized so as not to require a subroutine call, it is still available as a subroutine as well. Conversely, the shell may compile interactively typed commands into machine code before running them. (This behavior is common, but not required.)

Forth environments vary in how the resulting program is stored, but ideally running the program has the same effect as manually re-entering the source code. This contrasts with the combination of C and Unix shells, where compiled functions are a special class of program objects and interactive commands are strictly interpreted [citation needed]. Most of Forth's unique features result from this principle. By including interaction, scripting, and compilation, Forth was popular on resource-constrained computers such as the BBC Micro and Apple II series, and remains so in applications such as firmware and small microcontrollers. Although C compilers can now generate more compact and better performing code, Forth retains the advantage of interactivity [citation needed].

Stacks

Every programming environment with subroutines implements a stack for control flow. This structure typically also stores local variables, including subroutine parameters (in a call-by-value system such as C). Forth, however, often has no local variables, nor is it call-by-value. Instead, intermediate values are kept on a second stack. Words operate directly on the topmost values of this stack. It may therefore be called the "parameter" or "data" stack, but most often it is simply called "the stack". The function-call stack is then called the "linkage" or "return" stack, abbreviated rstack. Special rstack manipulation functions provided by the kernel allow it to be used for temporary storage within a word, but otherwise it cannot be used to pass parameters or manipulate data.

Most words are specified in terms of their effect on the stack. Typically, the parameters are pushed to the top of the stack before the word is executed. After execution, the parameters are cleared and replaced by return values. For arithmetic operators, this follows the rule of reverse Polish notation. See below for examples illustrating the use of the stack.
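As a small illustration of the two stacks, the following sketch uses only standard words and annotates each step with the contents of the data stack. The word name SUM-OF-SQUARES is made up for this example; >R and R> are the standard words that move a value to and from the return stack.

```forth
\ Comments show the data stack after each step (top of stack on the right).
2 3          \ -- 2 3
SWAP         \ -- 3 2
OVER         \ -- 3 2 3
+            \ -- 3 5
*            \ -- 15
.            \ prints 15

\ Hypothetical word that parks an intermediate value on the return stack.
: SUM-OF-SQUARES ( a b -- a*a+b*b )
   DUP *        \ -- a b*b
   >R           \ -- a        return stack now holds b*b
   DUP *        \ -- a*a
   R> + ;       \ -- a*a+b*b
3 4 SUM-OF-SQUARES .    \ prints 25
```

Every >R inside a word must be matched by an R> before the word returns, because the return stack also holds the return address.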

Maintenance

Forth is a simple and extensible language; its modularity and extensibility allow the writing of high-level programs such as CAD systems. However, extensibility also helps poor programmers write incomprehensible code, which has given Forth a reputation as a "write-only language". Forth has been used successfully on large and complex projects, and applications developed by competent and disciplined professionals have proven easy to maintain on changing hardware platforms over decades of use. Forth has a niche in both astronomical and space applications. Forth is still used today in many embedded systems (small computerized devices) because of its portability, efficient use of memory, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as their machine language have been produced. Other uses of Forth include the Open Firmware boot ROMs used by Apple, IBM, Sun, and the OLPC XO-1, and the FICL-based first stage of the boot loader of the FreeBSD operating system.

History

Forth developed from Charles H. Moore's personal programming system, which had been in continuous development since 1958. Forth was first exposed to other programmers in the early 1970s, beginning with Elizabeth Rather at the National Radio Astronomy Observatory in the United States. After their work at NRAO, Charles Moore and Elizabeth Rather formed FORTH, Inc. in 1973, refining and porting Forth systems to dozens of other platforms over the next decade.

Forth was so named because in 1968 "the file containing the interpreter was labeled FOURTH, for the 4th (next) generation of software - but the IBM 1130 operating system restricted file names to 5 characters". Moore saw Forth as a successor to the compile-link-go third-generation languages, or as software for "fourth-generation" hardware, not as a fourth-generation programming language in the sense in which the term has come to be used.

Because Charles Moore had moved frequently from job to job in his career, an early pressure on the developing language was ease of porting to different computer architectures. A Forth system has often been used to bring up new hardware. For example, Forth was the first resident software on the new Intel 8086 chip in 1978, and MacFORTH was the first resident development system for the first Apple Macintosh in 1984.

Beginning in 1976, FORTH, Inc.'s microFORTH was developed for the Intel 8080, Motorola 6800, and Zilog Z80 microprocessors. MicroFORTH was later used by hobbyists to generate Forth systems for other architectures, such as the MOS 6502 in 1978. Wide dissemination eventually led to standardization of the language. Common practice was codified in the de facto standards FORTH-79 and FORTH-83, in 1979 and 1983 respectively. These standards were unified by ANSI in 1994, in what is commonly referred to as ANS Forth.

Forth became very popular in the 1980s because it was well suited to the small microcomputers of that time, being compact and portable. At least one personal computer, the British Jupiter Ace, had Forth in its ROM-resident operating system. The Canon Cat also used Forth for its system programming. Rockwell also produced single-chip microcomputers with resident Forth kernels, the R65F11 and R65F12.

The programmer's perspective

Forth relies heavily on the explicit use of a data stack and reverse Polish notation (RPN, or postfix notation), commonly used on Hewlett-Packard calculators. In RPN, the operator is placed after its operands, unlike the more common infix notation, where it is placed between its operands. Postfix notation makes the language easier to parse and extend. Forth does not use a BNF grammar and does not have a monolithic compiler. Extending the compiler only requires writing a new word, rather than modifying a grammar and changing the underlying implementation.

Using RPN, you can get the result of the mathematical expression (25 * 10 + 50) like this:

25 10 * 50 + .
300 ok

This command line first puts the numbers 25 and 10 on the implied stack.


The keyword * multiplies the two numbers at the top of the stack and replaces them with their product.


Then the number 50 is pushed onto the stack.


The keyword + adds it to the previous product. Finally, the . command prints the result to the user terminal.

Even Forth's structural features are stack-based. For example:

: FLOOR5 ( n -- n' ) DUP 6 < IF DROP 5 ELSE 1 - THEN ;

This code defines a new word called FLOOR5 using the following commands (again, "word" is the term used for a subroutine):

  • DUP duplicates the number on the stack;
  • 6 pushes the number 6 onto the stack;
  • < compares the top two numbers on the stack (the duplicated input and 6) and replaces them with a true-or-false value;
  • IF takes a true-or-false value and chooses either to execute the commands immediately after it or to skip to the ELSE;
  • DROP discards the value on the stack;
  • and THEN ends the conditional.

The text in parentheses is a comment, noting that this word expects a number on the stack and will return a possibly changed number. The word FLOOR5 is equivalent to this function written in the C programming language:

int floor5(int v) { return (v < 6) ? 5 : (v - 1); }

This function can be written more succinctly as:

: FLOOR5 ( n -- n' ) 1- 5 MAX ;

You can run this word as follows:

1 FLOOR5 .
5 ok
8 FLOOR5 .
7 ok

First the parser pushes a number (1 or 8) onto the stack, then calls FLOOR5, which pops this number again and pushes the result. Finally, a call to "." removes the result from the stack and prints it to the user terminal.

Facilities

The Forth parser is simple, since it has no explicit grammar. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as delimiters; some systems recognize additional whitespace characters. When the parser finds a word, it tries to look the word up in the dictionary. If the word is found, the interpreter executes the code associated with the word and then returns to parse the rest of the input stream. If the word is not found, it is assumed to be a number, and an attempt is made to convert it to a number and push it onto the stack; if successful, the interpreter continues parsing the input stream. Otherwise, if both the lookup and the number conversion fail, the interpreter prints the word followed by an error message indicating that the word is not recognized, flushes the input stream, and waits for new user input.
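The two halves of this loop, dictionary lookup and execution, can be observed from within Forth itself. The sketch below is illustrative only: KNOWN? is a hypothetical helper, and the interactive use of S" assumes a system (such as most hosted Forths) that provides interpretation semantics for it.

```forth
\ EVALUATE hands a string to the text interpreter, as if it had been typed.
S" 25 10 * 50 +" EVALUATE .      \ prints 300

\ Hypothetical helper: is the next space-delimited name in the dictionary?
: KNOWN? ( "name" -- flag )
   BL WORD FIND      \ -- c-addr 0 | xt 1 | xt -1
   NIP 0<> ;         \ keep only "found" or "not found"

KNOWN? DUP .             \ prints -1 (true): DUP is in the dictionary
KNOWN? NO-SUCH-WORD .    \ prints 0 (false), assuming no such word is defined
```

A name for which FIND fails is what the interpreter would next try to convert to a number.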

The definition of a new word begins with the word : (a colon) and ends with the word ; (a semicolon). For example:

: X DUP 1+ . . ;

This will compile the word X, and make the name findable in the dictionary. When executed by typing, for example, 10 X in the console this will print 11 10.

Most Forth systems include a specialized assembler that produces executable words. The assembler is a special dialect of the compiler. Forth assemblers often use a reverse Polish syntax in which the parameters of an instruction precede the instruction. The usual design of a Forth assembler is to build the instruction on the stack, then copy it into memory as the last step. Registers can be referred to by the name used by the manufacturer, by number (0..n, as used in the actual opcode), or by a name that reflects their purpose in the Forth system, e.g. "S" for the register used as the stack pointer.

Operating system, files and multitasking

Classic Forth systems traditionally use neither an operating system nor a file system. Instead of storing code in files, source code is stored in disk blocks written to physical disk addresses. The word BLOCK is used to translate the number of a 1 KB block of disk space into the address of a buffer containing the data, which is managed automatically by the Forth system. Some systems implement contiguous disk files on top of this block access, with the files located in fixed ranges of disk blocks. Usually these are implemented as fixed-length binary records, with an integer number of records per disk block. Fast searching is achieved by hashed access on the key data.
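The arithmetic behind fixed-length records in block storage can be sketched as follows. All names and sizes here (REC-SIZE, RECS/BLOCK, FIRST-BLOCK, LOCATE, RECORD) are assumptions made for the illustration, not part of any particular system; BLOCK belongs to the optional Block word set, so RECORD only runs on a system with block storage available.

```forth
64 CONSTANT REC-SIZE                     \ hypothetical record length in bytes
1024 REC-SIZE / CONSTANT RECS/BLOCK      \ 16 records fit in one 1 KB block
100 CONSTANT FIRST-BLOCK                 \ hypothetical first block of the "file"

\ Pure arithmetic: which block holds a record, and at what byte offset?
: LOCATE ( rec# -- offset blk# )
   RECS/BLOCK /MOD          \ -- index-in-block block-index
   FIRST-BLOCK +            \ -- index-in-block blk#
   SWAP REC-SIZE * SWAP ;   \ -- byte-offset blk#

\ Address of the record inside the buffer that BLOCK returns.
: RECORD ( rec# -- addr )
   LOCATE BLOCK + ;

35 LOCATE . .     \ prints 102 192
```

Record 35 is the fourth record (index 3) of the third block of the range, hence block 102 and byte offset 3 * 64 = 192.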

Multitasking, most commonly by cooperative round-robin scheduling, is normally available (although multitasking words and support are not covered by the ANSI Forth standard). The word PAUSE is used to save the execution context of the current task, to locate the next task, and to restore its execution context. Each task has its own stacks, private copies of some control variables, and a scratch area. As a result, task switching is simple and efficient. Forth multitasking is available even on very simple microcontrollers such as the Intel 8051, Atmel AVR, and TI MSP430.

By contrast, some Forth systems run under a host operating system such as Microsoft Windows, Linux, or a version of Unix and use the host operating system's file system for source and data files; the ANSI Forth standard describes the words used for input/output. Other non-standard facilities include a mechanism for making calls to the host OS or windowing system, and many provide extensions that use the scheduling provided by the operating system. These typically have a larger and different set of words than the stand-alone Forth's PAUSE word, covering task creation, suspension, destruction, and priority modification.

Self-compile and cross-compile

A full-fledged Forth system, with all its source code, will compile itself, using a technique that Forth programmers commonly call metacompilation (although the term does not exactly match metacompilation as it is normally defined). The usual method is to redefine the handful of words that place compiled bits into memory. The compiler's words use specially named versions of fetch and store that can be redirected to a buffer area in memory. The buffer area simulates or accesses a memory area beginning at a different address than the code buffer. Such compilers define words to access both the memory of the target computer and the memory of the host (compiling) computer.
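A minimal sketch of such redirected fetch and store words is shown below. The names TARGET-ORIGIN, IMAGE-SIZE, TARGET-IMAGE, T>HOST, TC@ and TC! and the chosen addresses are hypothetical; real metacompilers differ in naming and add range checks, which this sketch omits.

```forth
HEX 8000 CONSTANT TARGET-ORIGIN DECIMAL   \ hypothetical start address on the target
256 CONSTANT IMAGE-SIZE                   \ hypothetical size of the image buffer
CREATE TARGET-IMAGE  IMAGE-SIZE ALLOT     \ host-side buffer standing in for target memory

\ Translate a target address into an address inside the host buffer.
: T>HOST ( t-addr -- addr )  TARGET-ORIGIN -  TARGET-IMAGE + ;

: TC@ ( t-addr -- byte )  T>HOST C@ ;     \ "fetch" redirected to the buffer
: TC! ( byte t-addr -- )  T>HOST C! ;     \ "store" redirected to the buffer

65 TARGET-ORIGIN 2 + TC!                  \ write a byte at target address 8002 hex
TARGET-ORIGIN 2 + TC@ .                   \ prints 65
```

Compiler words written in terms of TC@ and TC! build code in the buffer while believing they are writing at target addresses.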

After the fetch and store operations are redefined for the code space, the compiler, assembler, and so on are recompiled using the new definitions of fetch and store. This effectively reuses all the code of the compiler and interpreter. Then the Forth system's code is compiled, but this version is stored in the buffer. The in-memory buffer is written to disk, and ways are provided to load it temporarily into memory for testing. When the new version appears to work, it is written over the previous version.

There are numerous variations of such compilers for different environments. For embedded systems, the code may instead be written to another computer, a technique known as cross-compilation, over a serial port or even a single TTL bit, while keeping the word names and other non-executing parts of the dictionary on the original compiling computer. The minimal definitions for such a Forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed. Often the most time-consuming part of writing a remote port is building the initial program to implement fetch, store, and execute, but many modern microprocessors have built-in debugging features (such as the Motorola CPU32) that eliminate this task.

Language structure

The basic data structure of Forth is the "dictionary", which maps "words" to executable code or named data structures. The dictionary is laid out in memory as a tree of linked lists, with the links proceeding from the most recently defined word to the oldest, until a sentinel, usually a NULL pointer, is found. A context switch causes a list search to start at a different leaf. The linked-list search then continues as the branch merges into the main trunk, leading eventually back to the sentinel, the root. There can be several dictionaries; in rare cases such as metacompilation, a dictionary may be isolated. The effect resembles that of nested namespaces and can overload keywords, so that their meaning depends on the context.

A defined word generally consists of a head and a body, with the head consisting of the name field (NF) and the link field (LF) and a body consisting of the code field (CF) and the parameter field (PF).

The head and body of a dictionary entry are treated separately because they may not be contiguous. For example, when a Forth program is recompiled for a new platform, the head may remain on the compiling computer, while the body goes to the new platform. In some environments (such as embedded systems) the heads occupy memory unnecessarily. However, some cross-compilers may put heads on the target computer if the target itself is expected to support an interactive Forth.

Dictionary entry

The exact format of a dictionary entry is not prescribed, and implementations vary. However, certain components are almost always present, although the exact size and order may vary. Described as a structure, a dictionary entry might look like this:

structure
  byte:       flag           \ 3bit flags + length of word's name
  char-array: name           \ name's runtime length isn't known at compile time
  address:    previous       \ link field, backward ptr to previous word
  address:    codeword       \ ptr to the code to execute this word
  any-array:  parameterfield \ unknown length of data, words, or opcodes
end-structure forthword

The name field starts with a prefix giving the length of the word's name (typically up to 32 bytes) and several bits for flags. The character representation of the word's name then follows the prefix. Depending on the particular Forth implementation, there may be one or more NUL ('\0') bytes for alignment.

The link field contains a pointer to the previously defined word. The pointer can be a relative offset or an absolute address pointing to the previous sibling.

The code field pointer will be either the address of the word that will execute the code or data in the parameter field, or the beginning of machine code that the processor will execute directly. For words defined by : (colon), the code field pointer points to the word that saves the current Forth instruction pointer (IP) on the return stack and loads the IP with the new address from which to continue executing words. This is the same as what a processor's call/return instructions do.
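The backward chain formed by the link fields can be imitated with a toy list built from standard words. This is only a sketch of the linking idea, not the layout of any real dictionary: LATEST-ENTRY, ENTRY and COUNT-ENTRIES are hypothetical names, and each toy entry holds just a link cell and a value cell.

```forth
VARIABLE LATEST-ENTRY   0 LATEST-ENTRY !    \ 0 plays the role of the NULL sentinel

\ Each toy entry holds two cells: a link to the previous entry, then a value.
: ENTRY ( value "name" -- )
   CREATE
   HERE                  \ -- value this-entry
   LATEST-ENTRY @ ,      \ link field: pointer to the previous entry
   LATEST-ENTRY !        \ this entry is now the newest
   , ;                   \ store the value after the link

\ Follow the links from newest to oldest until the sentinel is reached.
: COUNT-ENTRIES ( -- n )
   0 LATEST-ENTRY @      \ -- count entry
   BEGIN DUP WHILE
      SWAP 1+ SWAP       \ count this entry
      @                  \ step to the previous entry via its link field
   REPEAT DROP ;

10 ENTRY ALPHA   20 ENTRY BETA   30 ENTRY GAMMA
COUNT-ENTRIES .      \ prints 3
```

A dictionary search walks the same kind of chain, comparing the name field of each entry instead of counting.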

Structure of the compiler

The compiler itself is not a monolithic program. It consists of Forth words that are visible to the system and usable by the programmer. This allows a programmer to change the compiler's words for special purposes.

The "compile time" flag in the name field is set for words with "compile time" behavior. Most simple words execute the same code whether they are typed on a command line or embedded in other code. When compiling these, the compiler simply places the code or a threaded pointer to the word.

Classic examples of compile-time words are control structures, such as IF and WHILE. All of Forth's control structures, and almost all of its compiler, are implemented as compile-time words. All Forth control flow words are executed during compilation to compile various combinations of the primitive words BRANCH and ?BRANCH (branch, and branch if false). During compilation, the data stack is used to support balancing, nesting, and backpatching of the branch addresses of control structures. The snippet:

... DUP 6 < IF DROP 5 ELSE 1 - THEN ...

will be compiled as the following sequence within a definition:

... DUP LIT 6 < ?BRANCH 5 DROP LIT 5 BRANCH 3 LIT 1 - ...

The numbers after BRANCH represent relative jump addresses. LIT is the primitive word to push a number "literal" on the data stack.

Compilation state and interpretation state

The word : (colon) parses a name as a parameter, creates a dictionary entry (a "colon definition"), and enters compilation state. The interpreter continues to read space-delimited words from the user input device. If a word is found, the interpreter performs the compilation semantics associated with the word, rather than the interpretation semantics. By default, the compilation semantics of a word are to append its interpretation semantics to the current definition.

The word ; (semicolon) ends the current definition and returns to interpretation state. It is an example of a word whose compilation semantics differ from the default. The interpretation semantics of ; (semicolon), of most control flow words, and of some other words are undefined in ANS Forth, which means that they should only be used within definitions and not on the interactive command line.

The interpreter state can be changed manually with the words [ and ] (left bracket and right bracket), which enter interpretation state and compilation state, respectively. These words can be used with the word LITERAL to compute a value during compilation and insert it into the current colon definition. LITERAL has the compilation semantics to take an object from the data stack and append to the current colon definition the semantics that place that object on the data stack.
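A short sketch of this idiom follows; the word name MINUTES/DAY is made up for the example.

```forth
\ The product is computed once, while MINUTES/DAY is being compiled;
\ only the literal 1440 ends up in the definition.
: MINUTES/DAY ( -- n )  [ 60 24 * ] LITERAL ;

MINUTES/DAY .     \ prints 1440
```

Without the brackets, the definition would contain the two numbers and the multiplication, and would repeat the calculation on every call.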

In ANS Forth, the current state of the interpreter can be read from the flag STATE, which contains the value true in compilation state and false otherwise. This allows the implementation of so-called state-smart words, whose behavior changes according to the current state of the interpreter.

Immediate words

The word IMMEDIATE marks the most recent colon definition as an immediate word, effectively replacing its compilation semantics with its interpretation semantics. Immediate words are normally executed during compilation rather than compiled, but the programmer can override this in either state. ; is an example of an immediate word. In ANS Forth, the word POSTPONE takes a name as a parameter and appends the compilation semantics of the named word to the current definition, even if the word was marked immediate. Forth-83 defined the separate words COMPILE and [COMPILE] to force the compilation of non-immediate and immediate words, respectively.
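As an illustration, IMMEDIATE and POSTPONE together are enough to build a new control word out of existing ones. UNLESS and DESCRIBE below are hypothetical words written for this sketch; they are not part of ANS Forth.

```forth
\ Hypothetical control word. At compile time UNLESS lays down 0= followed
\ by whatever IF compiles, so at run time the branch is taken when the
\ flag is false.
: UNLESS  POSTPONE 0=  POSTPONE IF ; IMMEDIATE

: DESCRIBE ( n -- )
   UNLESS ." zero" ELSE ." non-zero" THEN ;

0 DESCRIBE     \ prints zero
7 DESCRIBE     \ prints non-zero
```

Because UNLESS is immediate, it runs while DESCRIBE is being compiled; the existing ELSE and THEN then resolve the branch that IF started.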

Unnamed Words and Execution Tokens

In ANS Forth, unnamed words can be defined with the word :NONAME, which compiles the following words up to the next ; (semicolon) and leaves an execution token on the data stack. The execution token provides an opaque handle for the compiled semantics, similar to function pointers in the C programming language.

Execution tokens can be stored in variables. The EXECUTE keyword takes an execution token from the data stack and performs the associated semantics. The COMPILE, (compile-comma) keyword takes an execution token from the data stack and adds the associated semantics to the current definition.

The word ' (single quote) takes the name of a word as a parameter, and returns on the data stack the execution token associated with that word. In the interpret state, ' RANDOM-WORD EXECUTE is equivalent to RANDOM-WORD.
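The following sketch shows execution tokens being created, stored, and executed. SQUARE-XT and ACTION are hypothetical names chosen for the example.

```forth
:NONAME ( n -- n*n )  DUP * ;      \ leaves an execution token on the stack
CONSTANT SQUARE-XT                 \ keep the token under a (hypothetical) name
7 SQUARE-XT EXECUTE .              \ prints 49

VARIABLE ACTION                    \ hypothetical variable holding a token
' NEGATE ACTION !                  \ ' parses the name NEGATE and returns its token
5 ACTION @ EXECUTE .               \ prints -5

' ABS ACTION !                     \ the behavior can be swapped at run time
-5 ACTION @ EXECUTE .              \ prints 5
```

Storing a token in a variable in this way gives the effect of a function pointer or a late-bound call.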

Parsing words and comments

The words : (colon) and ' (single quote) are examples of parsing words, which take their arguments from the user input device instead of the data stack. Another example is the word ( (open parenthesis), which reads and ignores the following text up to and including the next closing parenthesis and is used for comments. Similarly, the word \ (backslash) is used for comments that continue to the end of the current line. Both ( and \ are words like any other, so they must be separated from what follows them (in this case the comment text) by at least one space.

Code structure

In most Forth systems, the body of a code definition consists of either machine language or some form of threaded code. The original Forth, which follows the informal Forth Interest Group (FIG) standard, is a Threaded Interpretive Language (TIL). This is also called indirect-threaded code, but direct-threaded and subroutine-threaded Forths have also become popular in modern times. The fastest modern Forths use subroutine threading, insert simple words as macros, and perform peephole optimization or other optimization strategies to make the code smaller and faster.

Data Objects

When a word is a variable or other data object, the CF points to the runtime code associated with the defining word that created it. A defining word has a characteristic "defining behavior" (creating a dictionary entry and possibly allocating and initializing data space) and also specifies the behavior of an instance of the class of words constructed by this defining word. Examples include:

VARIABLE
Names an uninitialized memory location one cell in size. The behavior of an instance of VARIABLE returns its address on the stack.
CONSTANT
Names a value (specified as an argument to CONSTANT). The behavior of the instance returns that value.
CREATE
Names a location; space may be allocated at this location, or it can be set to contain a string or other initialized value. The behavior of the instance returns the address of the beginning of this space.
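The three defining words above can be tried directly at the console. The names COUNTER, DOZEN and PRIMES are made up for this sketch.

```forth
VARIABLE COUNTER            \ reserves one cell; its content is not initialized
0 COUNTER !                 \ COUNTER pushes its address; ! stores 0 there
1 COUNTER +!                \ add 1 to the stored value
COUNTER @ .                 \ prints 1

12 CONSTANT DOZEN           \ DOZEN pushes the value 12
DOZEN 2 * .                 \ prints 24

CREATE PRIMES  2 , 3 , 5 , 7 ,   \ a named location followed by four initialized cells
PRIMES 2 CELLS + @ .        \ prints 5 (the third cell)
```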

Forth also provides a facility by which a programmer can define new application-specific defining words, specifying both the defining behavior (at compile time) and the instance behavior (at run time). Some examples include ring buffers, named bits on I/O ports, and automatically indexed arrays.
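In ANS Forth this facility is usually expressed with CREATE and DOES>. The sketch below defines a hypothetical defining word ARRAY for automatically indexed arrays; it does no bounds checking, and the names ARRAY and SCORES are assumptions made for the example.

```forth
\ Defining behavior: CREATE a named entry and reserve n cells.
\ Instance behavior (after DOES>): turn an index into the cell's address.
: ARRAY ( n "name" -- )
   CREATE CELLS ALLOT
   DOES> ( i -- addr )  SWAP CELLS + ;

10 ARRAY SCORES       \ a ten-cell array named SCORES
42 3 SCORES !         \ store 42 in element 3
3 SCORES @ .          \ prints 42
```

DOES> supplies the address of the instance's data area on the stack, which is why the index has to be swapped above it before the offset is added.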

Data objects defined by these and similar words are global in scope. The function provided by local variables in other languages is provided in Forth by the data stack (although Forth also has real local variables). Compared to other languages, the Forth style of programming uses very few named data objects; such data objects are typically used to contain data that is used by a number of words or tasks (in a multitasking implementation).

Forth does not enforce consistency in the use of data types; it is the responsibility of the programmer to use the appropriate operators to read (fetch) and to store (store) the values or to perform other operations on the data.

Programming

Words written in Forth are compiled into an executable form. Classic indirect-threaded implementations compile lists of the addresses of the words to be executed; many modern systems generate actual machine code (including calls to some external words and code for others expanded in place). Some systems have optimizing compilers. Generally speaking, a Forth program is saved as the memory image of the compiled program, with a single command (e.g., RUN) that is executed when the compiled version is loaded.

During development, the programmer uses the interpreter to run and test each small piece as it is developed. Therefore, most Forth programmers advocate loose top-down design, and bottom-up development with continuous testing and integration.

The top-down design is generally a separation of the program into "vocabularies" which are then used as high-level toolkits to write the final program. A well-designed Forth program reads like natural language, and implements not just a single simple solution, but also a set of tools to attack related problems.

Implementations

Since the Forth virtual machine is simple to implement and has no standard reference implementation, there is a plethora of implementations of the language. In addition to supporting the standard varieties of desktop computer systems (POSIX, Microsoft Windows, Mac OS X), many of these Forth systems also target a variety of embedded systems. Listed here are some of the more prominent systems that conform to the 1994 ANS Forth standard.

  • GNU Forth - a portable ANS Forth implementation from the GNU Project
  • Forth Inc. - founded by the originators of Forth, sells desktop (SwiftForth) and embedded (SwiftX) ANS Forth solutions
  • MPE Ltd. - sells highly optimized ANS Forth compilers for desktop (VFX) and embedded systems
  • Open Firmware - a bootloader and firmware standard based on ANS Forth

Also worth noting is an implementation in the Minecraft mod RedPower 2, where Forth serves as a control system.
