Abstract data type

format_list_bulleted Contenido keyboard_arrow_down
ImprimirCitar

In computer science an abstract data type (ADD) or abstract data type (TAD) is a mathematical model composed of a collection of operations defined on a data set for the model.

Introduction

In the world of programming there are various programming languages that have been created over time and that have been perfected due to the needs of the programmers of the time to which they belong. The first programming languages were of the linear type, since a program was traversed from a point marked as Start until it reached an End point. Over time, new languages were created and today the most used are the so-called "target-oriented". objects".

Object-oriented programming languages (OOP) have the characteristic that they are not linear languages, but are made up of various functions, which are called in the order in which the program itself requests them or the user determines them. To better understand how object-oriented languages work, we are going to introduce a fundamental concept in Data Structures called Data Abstraction and that is an important part of these Languages and the way most commercial software works today.

History

The concept of abstract data type (ADT), was first proposed around 1974 by John Guttag and others, but it was not until 1975 that Barbara Liskov first proposed it for the CLU language.

The Turbo Pascal language was decisive for the common acceptance of the TDA with the introduction of the Units, although these do not comply with the basic characteristics of an abstract data type such as data encapsulation. The Ada programming language was able to successfully implement ADTs with its packages. It is worth remembering that these last two languages formally support Modular Programming.

Definition

Very often the terms ADT and Data Abstraction are used in an equivalent way, and this is due to the similarity and interdependence of both. However, it is important to define the two concepts separately.

As already mentioned, Object Oriented Programming Languages are languages made up of different methods or functions and that are called in the order in which the program requires it, or the user wants it. Data abstraction consists of hiding the characteristics of an object and ignoring them, so that we only use the name of the object in our program. This is similar to an everyday life situation. When I say the word "dog," you don't need me to tell you what the dog is doing. You already know the shape of a dog and you also know that dogs bark. So I abstracted all the characteristics of all the dogs into a single term, which I called "dog." This is called 'abstraction' and it is a very useful concept in programming, since a user does not need to mention all the features and functions of an object each time it is used, but they are declared separately in the program and the abstract term (“dog”) is simply used to mention it.

In the example above, “dog” is an Abstract Data Type and the whole process of defining, implementing and mentioning it is what we call Data Abstraction.

Let's put a real example of programming. Suppose that in some Object Oriented Programming Language a small program takes the area of a rectangle of the dimensions that a user decides. Let's also think that the user probably wants to know the area of several rectangles. It would be very tedious for the programmer to define the multiplication of 'base' by 'height' several times in the program, furthermore it would limit the user to output a certain number of areas. Therefore, the programmer can create a function called 'Area', which will be called, with the corresponding parameters, the number of times that are needed by the user and thus the programmer avoids a lot of work, the program is faster, more efficient and shorter. To achieve this, the Area method is created in a separate way from the graphical interface presented to the user and the operation to be performed is stipulated there, returning the value of the multiplication. In the main method, only the Area function is called and the program does the rest.

The fact of keeping all the characteristics and abilities of an object separately is called Encapsulation and it is also an important concept to understand the data structuring. It is common for Encapsulation to be used as a synonym for Information Hiding, although some believe that this is not the case.

Separation of interface and implementation

When used in a computer program, an ADT is represented by its interface, which serves as a cover for the corresponding implementation. The idea is that the users of an ADT have to worry only about the interface, but not about the implementation, since this can change over time and, if there were no encapsulation, it would affect the programs that use the data. This is based on the concept of Information Hiding, a protection for the program from design decisions that are subject to change.

The strength of an ADT rests on the idea that the implementation is hidden from the user. Only the interface is public. This means that ADT can be implemented in different ways, but as long as it remains consistent with the interface, programs that use it are not affected.

There is a difference, although sometimes subtle, between the Abstract Data Type and the Data Structure used in its implementation. For example, a TDA of a list can be implemented by an Array or a Linked List or even a Binary Search Tree. A list is an Abstract Data Type with well-defined operations (add element, add to end, add to beginning, retrieve, delete, etc) while a linked list is a data structure based on pointers or references (depending on the language) that can be used to create a representation of a List. The Linked List is commonly used to represent an ADT List, and sometimes even confused. An example is the Java Linked List class, which offers a large number of methods that do not correspond to a "pure" Linked List, but to a strong ADT.

Similarly, an ADT Binary Search Tree can be represented in many ways: Binary Tree, AVL Tree, Red-Black Tree, Array, etc. Despite the implementation, a Binary Tree always has the same operations (insert, delete, find, etc.)

Separating the interface from the implementation doesn't always mean that the user is completely unaware of the routine's implementation, but enough that they don't depend on any aspect of the implementation. For example, a TDA can be created using a script or any script that can be decompiled.

Many modern compilers allow you to decompile and display the assembler code from any source code and thus view internally any of its ADTs, as is often the case with C, C++, Fortran, and other languages.

Characterization

A TDA is characterized by a set of operations (functions) which is usually called public interface and represents the behavior of the TDA; whereas the implementation as the private part of the TDA is hidden from the client program that uses it. All high-level languages have TDA predefined; which are the so-called simple types and the predefined structures, and these have their public interfaces that include operations like +, -, *, etc.

You don't need to know how such operators act on the internal representation of the defined types, which also tends to be an implementation that is quite dependent on the machine on which the compiler works. The interesting thing is that the current languages will allow us to extend the predefined TDA with others that will be defined by the programmer himself in order to adapt the data types to the needs of the programs.

The TDAs that we are going to be interested in from now on are those that reflect a certain behavior by organizing a certain variety of data structuredly. We will refer to this structured way of storing the data to characterize each TDA.

The TDAs that have simple information but depend on a structural behavior will be called political and those simple TDAs, such as the predefined types where the information is not related by any structure and do not admit more than one value at any time, will be called TDAs monolithic.

Note that when we talk about a TDA we will not make any allusion to the type of the elements but only to the way in which these elements are arranged. We are only interested in the structure that supports the information and its operations. To determine the structural behavior, it is enough to observe the behavior that the data will follow.

Let us then characterize ADDs. A TDA will have a part that will be invisible to the user, which must be protected and that can be said to be irrelevant for the user's use and is constituted both by the algorithmic machinery that implements the semantics of the operations and by the data that serve as link between the elements of the TDA, that is, internal information necessary for the implementation that is being made for that behavior of the TDA. Summing up, we can say that both the implementation of the operations and the internal elements of the TDA will be private to external access and hidden from any other level.

An ADT represents an abstraction:

  • The details (normally few) of the specification (what).
  • The details (almost always numerous) of implementation (how).

Abstraction

Abstraction, one of the tools that helps us the most when solving a problem, is a fundamental mechanism for understanding problems and phenomena that have a large amount of detail, its main idea is to handle a problem, phenomenon, object, theme or idea as a general concept, without considering the great amount of detail that these may have.

The abstraction process presents two complementary aspects:

  1. Emphasize the relevant aspects of the object.
  2. Ignoring the irrelevant aspects of it (relevance depends on the level of abstraction, since if it is passed to more concrete levels, certain aspects may become relevant).

In general, we can say that abstraction allows establishing a hierarchical level in the study of phenomena, which is established by successive levels of detail. Generally, a descending sense of detail is followed, from the most general levels to the most concrete levels.

For example: high-level programming languages allow the programmer to abstract from the endless details of assembly languages. Another example, computer memory is a one-dimensional structure made up of cells and yet we work as if it were unique.

Abstraction gives us the possibility of defining a series of successive refinements to our TDA and it should be understood that when we say successive refinements we are referring to the strategy used to break down a problem into subproblems. As software design evolves, each level of modules represents a refinement at the abstraction level. That is, include details that were overlooked at a higher level, at a lower level of the hierarchy.

Let's see the different types of abstraction that we can find in a program:

1. Functional abstraction: creating procedures and functions and calling them using a name that highlights what the function does and ignores how it does it. The user only needs to know the specification of the abstraction (the what) and can ignore the rest of the details (the how).

2. Data abstraction:

  • Data type: provided by high-level languages. The representation used is invisible to the programmer, which is only allowed to see predefined operations for each type.
  • Types defined: by the programmer that enable the definition of data values closer to the problem to be solved.
  • TDA: for the definition and representation of data types (values + operations), along with their properties.
  • Objects: They are TDA to which reuse and code compartment properties are added.

If we go deeper into the world of programming and its concepts, there are two of these concepts that should not be confused, they are: data type and data structure.

A data type, in a programming language, defines a set of values that a particular variable can take, as well as the basic operations on that set. Now let's see how these concepts are related. Data types constitute a first level of abstraction, since it does not take into account how the information in the machine's memory is actually implemented or represented. To the user, the implementation or rendering process is invisible.

Let's see then what data structures are. Data structures are collections of variables, not necessarily of the same type, related to each other in some way. Data structures are characterized by the data type of the elements stored in the structure and by the relationship defined on these elements.

At the data structure level, the operations on a particular element are totally irrelevant, only the operations that wrap the structure globally are relevant.

Data abstraction is the characteristic of a database system that allows the user or programmer to operate with the data without the need to know details that are not “important” to him, thus offering an abstract view of these data.. To achieve this purpose, different levels of abstraction have been defined:

  • Physical level. Determine how data (tracks, sectors, cylinders) are physically stored, represents the lowest level.
  • Logical or Conceptual Level. Determine the organization of the files. Indices, keys, field order, relationships, data types.
  • Level of Views. It hides part of the information to users, that is to say it makes visible only a part of the database.

Examples of use of TDA

Some examples of the use of TDA in programming are:

  • Lists: a sequential collection of elements with operations as add, delete, tour.
  • Piles: a collection of elements following the LIFO property. Common operations include push, pop, top.
  • Colas: a collection of elements that follows the FIFO property. Common operations include in Queue or addof Queue or returnFirst.
  • Sets: implementation of assemblies with their basic operations (union, intersection and difference), insertion operations, erased, search...
  • Dictionaries: they are a set of key and value associations. Their common operations are Fuck! (laughs) get (obtain), travel, eliminate.
  • Graphs: Implementation of graphs; a series of vertices joined by a series of arches or edges.

Contenido relacionado

GM-NAA I/O

The GM-NAA I/O is the first operating system in the history of...

Kilobit per second

Equivalent to 1000 bits per second = 1000...

PHP nuke

PHP-Nuke is an automated web-based news and content management system based on PHP and MySQL technologies. Originally PHP-Nuke was a fork made by the...

DirectX

DirectX is a collection of APIs developed to facilitate complex tasks related to multimedia, especially video and game programming, on the Microsoft Windows...

Clipper (programming language)

Unlike other xBase languages, Clipper never had an interpreter mode, similar to dBase's. Its database management utilities, such as the creation of tables...
Más resultados...
Tamaño del texto:
undoredo
format_boldformat_italicformat_underlinedstrikethrough_ssuperscriptsubscriptlink
save