SQL
SQL (Structured Query Language) is a domain-specific language designed to manage and retrieve information from relational database management systems. One of its main characteristics is its use of relational algebra and relational calculus to express queries that retrieve information from databases and make changes to them.
Originally based on relational algebra and relational calculus, SQL consists of a data definition language, a data manipulation language, and a data control language. The scope of SQL includes data insertion, queries, updates and deletion, schema creation and modification, and data access control. Although SQL is often described as a declarative language, it also includes procedural elements.
SQL was one of the first commercial languages for Edgar Frank Codd's relational model, as described in his 1970 research paper A Relational Model of Data for Large Shared Data Banks. Despite not fully adhering to the relational model described by Codd, it became the most widely used database language.
SQL became the standard of the American National Standards Institute (ANSI) in 1986 and of the International Organization for Standardization (ISO) in 1987. Since then, the standard has been revised to include more features. Despite the existence of both standards, most SQL code is not fully portable between different database systems without further adjustment.
Origins and evolution
The origins of SQL are tied to relational databases, specifically those that resided on IBM machines under the System R management system, developed by a group at IBM in San Jose, California.
In the beginning there was IBM, and IBM created SQL. SQL, originally an acronym for "Structured Query Language", is a unified language for defining, querying, modifying, and controlling data in a relational database. Its name is officially pronounced "ess-cue-ell" (according to the American National Standards Institute).
The relational model of database management was proposed in 1970 by Dr. E. F. Codd at the IBM Research Laboratory in San Jose, California, and was developed over the next decade in universities and research laboratories. SQL, one of several languages that emerged from this early work, has now almost completely taken over the world of relational database languages. Relational database management system vendors who initially chose other languages have flocked to SQL, and national and international standards organizations have proposed a codified version of the language.
In 1970, E. F. Codd proposed the relational model and, associated with it, a data access sublanguage based on predicate calculus. Based on these ideas, IBM's laboratories defined the language SEQUEL (Structured English Query Language), which was later implemented in the experimental database management system (DBMS) System R, developed in 1977, also by IBM. However, it was Oracle that first introduced it in a commercial product, in 1979.
SEQUEL was the predecessor of SQL, which is an evolved version of it. SQL became the language par excellence of the various relational database management systems that emerged in the following years and was finally standardized in 1986 by ANSI, giving rise to the first standard version of the language, "SQL-86" or "SQL1". The following year this standard was also adopted by ISO.
However, this first standard did not cover all developer needs and included storage definition functionality that was later judged better removed. So, in 1992, a new extended and revised SQL standard was released, called "SQL-92" or "SQL2".
Today, SQL is the de facto standard for the vast majority of commercial DBMSs. And, although the diversity of particular additions that the different commercial implementations of the language include is wide, the support for the SQL-92 standard is general and very broad.
ANSI SQL underwent several revisions and additions over time:
Year | Name | Alias | Comments |
---|---|---|---|
1986 | SQL-86 | SQL-87 | First publication by ANSI. Ratified by the International Organization for Standardization in 1987. |
1989 | SQL-89 | | Minor revision. |
1992 | SQL-92 | SQL2 | Major revision. |
1999 | SQL:1999 | SQL3 | Regular expressions, recursive queries (for hierarchical relationships), triggers, and some object-oriented features were added. |
2003 | SQL:2003 | | Introduces some XML features, changes to functions, standardization of the sequence object, and auto-generated (identity) columns. |
2006 | SQL:2006 | | ISO/IEC 9075-14:2006 defines the ways in which SQL can be used in conjunction with XML. It defines ways to import and store XML data in an SQL database, to manipulate it within the database, and to publish both XML and conventional SQL data in XML form. In addition, it provides facilities that allow applications to integrate XQuery, the XML query language published by the W3C (World Wide Web Consortium), into their SQL code for concurrent access to regular SQL data and XML documents. |
2008 | SQL:2008 | | Allows the use of the ORDER BY clause outside cursor definitions. Includes INSTEAD OF triggers. Adds the TRUNCATE statement. |
2011 | SQL:2011 | | Temporal data (PERIOD FOR). Improvements to window functions and the FETCH clause. |
2016 | SQL:2016 | | Adds row pattern matching, polymorphic table functions, and JSON support. |
General features of SQL
SQL is a database access language that exploits the flexibility and power of relational systems and thus allows a wide variety of operations.
It is a "high-level" or "non-procedural" language which, thanks to its strong theoretical base and its orientation toward handling sets of records (rather than individual records), allows high productivity in coding. In this way, a single statement can be equivalent to one or more programs written in a low-level, record-oriented language. SQL also has the following features:
- Data definition language: The SQL DDL provides commands for defining relation schemas, deleting relations, and modifying relation schemas.
- Interactive data manipulation language: The SQL DML includes query languages based on both relational algebra and tuple relational calculus.
- Integrity: The SQL DDL includes commands to specify the integrity constraints that the data stored in the database must satisfy.
- View definition: The DDL includes commands to define views.
- Transaction control: SQL has commands to specify the beginning and end of a transaction.
- Embedded and dynamic SQL: SQL statements can be embedded in programming languages such as C++, C, Java, PHP, COBOL, Pascal, and Fortran.
- Authorization: The DDL includes commands to specify access rights to relations and views.
Data Types
Some of the basic data types are listed below. The names and ranges shown follow MySQL; other database systems differ in the details.
Integers:
- TINYINT(size): -128 to 127 signed; 0 to 255 UNSIGNED. The maximum number of digits can be specified in parentheses
- SMALLINT(size): -32768 to 32767 signed; 0 to 65535 UNSIGNED. The maximum number of digits can be specified in parentheses
- MEDIUMINT(size): -8388608 to 8388607 signed; 0 to 16777215 UNSIGNED. The maximum number of digits can be specified in parentheses
- INT(size): -2147483648 to 2147483647 signed; 0 to 4294967295 UNSIGNED. The maximum number of digits can be specified in parentheses
- BIGINT(size): -9223372036854775808 to 9223372036854775807 signed; 0 to 18446744073709551615 UNSIGNED. The maximum number of digits can be specified in parentheses
Floating point numbers:
- FLOAT(size, d): A small number with a floating decimal point. The maximum number of digits can be specified in the size parameter. The maximum number of digits to the right of the decimal point is specified in the d parameter
- DOUBLE(size, d): A large number with a floating decimal point. The maximum number of digits can be specified in the size parameter. The maximum number of digits to the right of the decimal point is specified in the d parameter
- DECIMAL(size, d): A DOUBLE stored as a string, allowing a fixed decimal point. The maximum number of digits can be specified in the size parameter. The maximum number of digits to the right of the decimal point is specified in the d parameter
Dates and times
- DATE(): A date. Format: YYYY-MM-DD. Note: the supported range is from '1000-01-01' to '9999-12-31'
- DATETIME(): A combination of date and time. Format: YYYY-MM-DD HH:MI:SS. Note: the supported range is from '1000-01-01 00:00:00' to '9999-12-31 23:59:59'
- TIMESTAMP(): A timestamp. TIMESTAMP values are stored as the number of seconds since the Unix epoch ('1970-01-01 00:00:00' UTC). Format: YYYY-MM-DD HH:MI:SS. Note: the supported range is from '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC
- TIME(): A time. Format: HH:MI:SS. Note: the supported range is from '-838:59:59' to '838:59:59'
- YEAR(): A year in two-digit or four-digit format. Note: values allowed in four-digit format: 1901 to 2155. Values allowed in two-digit format: 70 to 69, representing the years 1970 to 2069
Character string:
- CHAR(size): Holds a fixed-length string (it can contain letters, numbers, and special characters). The fixed size is specified in parentheses. Can store up to 255 characters
- VARCHAR(size): Holds a variable-length string (it can contain letters, numbers, and special characters). The maximum size is specified in parentheses. Can store up to 255 characters. Note: if you specify a size greater than 255, it will be converted to a TEXT type
- TINYTEXT: Holds a string with a maximum length of 255 characters
- TEXT: Holds a string with a maximum length of 65,535 characters
- BLOB: For BLOBs (binary large objects). Stores up to 65,535 bytes of data
- MEDIUMTEXT: Holds a string with a maximum length of 16,777,215 characters
- MEDIUMBLOB: For BLOBs (binary large objects). Stores up to 16,777,215 bytes of data
- LONGTEXT: Holds a string with a maximum length of 4,294,967,295 characters
- LONGBLOB: For BLOBs (binary large objects). Stores up to 4,294,967,295 bytes of data
Enum and set:
- ENUM(x, y, z, ...): Lets you define a list of possible values. You can list up to 65535 values in an ENUM list. If a value that is not in the list is inserted, a blank value is inserted instead. Note: values are sorted in the order in which you list them. Enter the possible values in this format: ENUM('X', 'Y', 'Z')
- SET: Similar to ENUM, except that SET can contain up to 64 list items and can store more than one choice
Binaries:
- BIT: An integer that can be 0, 1, or NULL
Optimization
As stated above, and as is common in high-level database access languages, SQL is a declarative language. In other words, it specifies what is wanted and not how to get it, so a statement does not explicitly establish an order of execution.
The internal execution order of a statement can seriously affect the efficiency of the DBMS, which is why it is necessary for it to carry out an optimization before its execution. Many times, the use of indexes speeds up a query statement, but slows down the update of the data. Depending on the use of the application, indexed access or a quick update of the information will be prioritized. Optimization differs significantly for each database engine and depends on many factors.
Modern database systems have a component called a query optimizer. This performs a detailed analysis of the possible execution plans of an SQL query and chooses the one that is most efficient to carry it out.
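As a minimal sketch of how an index changes the plan chosen by a query optimizer, the following Python script uses the standard sqlite3 module and SQLite's EXPLAIN QUERY PLAN command. The table contents and the names plate and idx_cars_brand are illustrative assumptions, and the exact wording of the plan text varies between SQLite versions and between database engines.

```python
import sqlite3

# In-memory SQLite database; table, column and index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (plate TEXT, brand TEXT, model TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [("MF-234-ZD", "Seat", "Ibiza"), ("FK-938-ZL", "Ford", "Focus")],
)

query = "SELECT plate FROM cars WHERE brand = 'Ford'"


def plan(sql):
    # Each EXPLAIN QUERY PLAN row ends with a readable description of one step.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan(query)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_cars_brand ON cars (brand)")
plan_with_index = plan(query)     # the optimizer can now search the index

print(plan_without_index)
print(plan_with_index)
```

The same statement is executed in both cases; only the plan selected by the optimizer changes.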
There is an extension of SQL known as FSQL (Fuzzy SQL) that allows access to fuzzy databases using fuzzy logic. This language has been implemented at an experimental level and is evolving rapidly.
Data Definition Language (DDL)
The data definition language (DDL) is the part of SQL responsible for modifying the structure of the objects in the database. It includes commands to define, modify, or delete the tables in which the database's data is stored. There are four basic operations: CREATE, ALTER, DROP, and TRUNCATE.
CREATE
This command allows you to create data objects, such as new databases, tables, views, and stored procedures.
- Example (create a table)
CREATE TABLE clients (id INT PRIMARY KEY, name VARCHAR(100)); -- column names are illustrative
ALTER
This command allows you to modify the structure of a table or object. You can add/remove fields to a table, modify the type of a field, add/remove indexes to a table, modify a trigger, etc.
- Example (Add column to a table)
ALTER TABLE pupils ADD Age INT UNSIGNED;
DROP
This command removes an object from the database. It can be a table, view, index, trigger, function, procedure, or any object that the database engine supports. It can be combined with the ALTER statement.
- Example
DROP TABLE pupils;
TRUNCATE
This command applies only to tables, and its function is to delete the entire content of the specified table. Its advantage over the DELETE command is that, when you want to delete all the contents of a table, it is much faster, especially if the table is very large. The downside is that TRUNCATE only works when you want to delete absolutely all records, since the WHERE clause is not allowed. Although at first sight this statement appears to be DML (Data Manipulation Language), it is actually DDL: internally, many implementations drop the table and recreate it instead of deleting the rows one by one.
- Example
TRUNCATE TABLE table_name;
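The DDL operations above can be tried end to end with a short sketch. It uses Python's standard sqlite3 module and an illustrative pupils table. Note that SQLite has no TRUNCATE statement, so an unqualified DELETE stands in for it here; other engines accept TRUNCATE TABLE directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def columns(table):
    # PRAGMA table_info returns one row per column; position 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE: define a new table (column names are illustrative).
conn.execute("CREATE TABLE pupils (id INTEGER PRIMARY KEY, name TEXT)")
cols_after_create = columns("pupils")

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE pupils ADD COLUMN age INTEGER")
cols_after_alter = columns("pupils")

# Empty the table. SQLite lacks TRUNCATE, so DELETE without WHERE is used.
conn.execute("INSERT INTO pupils (name, age) VALUES ('Ana', 12)")
conn.execute("DELETE FROM pupils")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM pupils").fetchone()[0]

# DROP: remove the table itself.
conn.execute("DROP TABLE pupils")
tables_after_drop = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(cols_after_create, cols_after_alter, rows_after_delete, tables_after_drop)
```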
Data Manipulation Language (DML)
Definition
A data manipulation language (DML) is a language provided by the database management system that allows users to carry out the tasks of querying or manipulating the data, organized according to the appropriate data model.
The most popular data manipulation language today is SQL, used to retrieve and manipulate data in a relational database.
SELECT
The SELECT statement allows us to query the data stored in a database table.
Basic form
SELECT [{ALL|DISTINCT}] <field_name>[, <field_name>...]
FROM {<table_name>|<view_name>}[, {<table_name>|<view_name>}...]
[WHERE <condition> [{AND|OR} <condition>...]]
[GROUP BY <field_name>[, <field_name>...]]
[HAVING <condition> [{AND|OR} <condition>...]]
[ORDER BY {<field_name>|<field_index>} [{ASC|DESC}][, {<field_name>|<field_index>} [{ASC|DESC}]]];
Clause | Meaning |
---|---|
SELECT | Keyword indicating that the SQL statement we want to execute is a query. It selects the listed fields from all the records that satisfy the condition in the WHERE part. When the attributes are taken from different tables in the FROM part, a join also takes place. That is why it is said to be an orthogonal language. The keyword ALL indicates that we want to select all values, so the result is a multiset (bag) rather than a set. It is the default and is almost never written explicitly. The keyword DISTINCT indicates that we want to select only the distinct values, so the result is a set rather than a multiset. |
FROM | Indicates the table (or tables) from which we want to retrieve the data. When more than one table is listed, the query is called a "combined query" or "join". In combined queries it is necessary to apply a join condition through a WHERE clause. |
WHERE | Specifies a condition that must be met for the data to be returned by the query. It supports the logical operators AND and OR, in addition to the relational operators and others. |
GROUP BY | Specifies how the data is grouped. It is typically used in combination with aggregate functions. |
HAVING | Specifies a condition that must be met for the data to be returned by the query. It works like WHERE but is applied to the set of results returned by the query. It should be used together with GROUP BY, and the condition must refer to the fields contained in it. |
ORDER BY | Presents the result ordered by the indicated columns. The order can be expressed with ASC (ascending order) or DESC (descending order). The default is ASC. |
Example:
To query the cars table and retrieve the Enrolment (license plate), brand, model, color, number_kilometers, and num_plazas (number of seats) fields, we execute the following query. The data is returned sorted by brand and model in ascending order, from smallest to largest. The FROM keyword indicates that the data is retrieved from the cars table.
SELECT Enrolment, brand, model, color, number_kilometers, num_plazas
FROM cars
ORDER BY brand, model;
Example of a simplified query through a field wildcard (*):
The use of the asterisk indicates that we want the query to return all the fields that exist in the table and the data will be returned ordered by make and model.
SELECT *
FROM cars
ORDER BY brand, model;
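The GROUP BY and HAVING clauses described in the table above are not exercised by these examples. As a minimal sketch, run through Python's standard sqlite3 module on illustrative sample rows (the plate column and the data are assumptions), the following query counts cars per brand and keeps only the brands with more than one car:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (plate TEXT, brand TEXT, model TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [
        ("AA-1", "Ford", "Focus"),
        ("AA-2", "Ford", "Fiesta"),
        ("AA-3", "Seat", "Ibiza"),
    ],
)

# GROUP BY forms one group per brand; COUNT(*) aggregates each group;
# HAVING filters whole groups, after grouping has taken place.
rows = conn.execute(
    """
    SELECT brand, COUNT(*) AS n
    FROM cars
    GROUP BY brand
    HAVING COUNT(*) > 1
    ORDER BY brand
    """
).fetchall()

print(rows)
```

Seat has a single car, so its group is removed by HAVING, while the individual rows were never filtered by a WHERE condition.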
WHERE clause
The WHERE clause is the statement that allows us to filter the result of a SELECT statement. Usually we don't want to get all the information that exists in the table, but we want to get only the information that is useful to us at the moment. The WHERE clause filters the data before it is returned by the query. When we want to include a text type in the WHERE Clause, we must include the value in single quotes.
Examples:
In our example, we want to query specific cars, so we add a WHERE clause. This clause specifies one or more conditions that must be true for the SELECT statement to return data. In this case, the query returns only the data of the cars with the license plate MF-234-ZD or the license plate FK-938-ZL. You can use the WHERE clause with a single condition or with as many conditions as you like.
SELECT Enrolment, brand, model, color, number_kilometers, num_plazas
FROM cars
WHERE Enrolment = 'MF-234-ZD' OR Enrolment = 'FK-938-ZL';
A WHERE condition can be negated with the NOT logical operator. The following query returns all the data from the cars table except the row with the license plate MF-234-ZD.
SELECT Enrolment, brand, model, color, number_kilometers, num_plazas
FROM cars
WHERE NOT Enrolment = 'MF-234-ZD';
The following query uses the DISTINCT keyword, which returns the table generated by selecting only the brand and model fields of each record, eliminating repeated rows. In other words, it returns the set of models of each brand found in cars.
SELECT DISTINCT brand, model FROM cars;
If DISTINCT were omitted or ALL was used, the generated table would have repeated rows because there may be several cars of the same make and model but with different license plates. In this case, the result would be a multiset where each row with the make and model fields in the response corresponds to some license plate from the cars table that is not shown in the response.
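The difference between the multiset and the set result can be seen in a small sketch using Python's standard sqlite3 module. The sample rows and the plate column are illustrative assumptions; two of the cars deliberately share the same brand and model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (plate TEXT, brand TEXT, model TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [
        ("AA-1", "Ford", "Focus"),
        ("AA-2", "Ford", "Focus"),  # same brand and model, different plate
        ("AA-3", "Seat", "Ibiza"),
    ],
)

# Default behaviour (ALL): one output row per car, duplicates included.
all_rows = conn.execute(
    "SELECT brand, model FROM cars ORDER BY brand, model"
).fetchall()

# DISTINCT: duplicate (brand, model) pairs collapse into a single row.
distinct_rows = conn.execute(
    "SELECT DISTINCT brand, model FROM cars ORDER BY brand, model"
).fetchall()

# NOT negates a WHERE condition: every plate except AA-1.
other_plates = conn.execute(
    "SELECT plate FROM cars WHERE NOT plate = 'AA-1' ORDER BY plate"
).fetchall()

print(all_rows, distinct_rows, other_plates)
```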
ORDER BY clause
The ORDER BY clause is the statement that allows us to specify the order in which the data will be returned. We can specify the order in ascending or descending order through the ASC and DESC keywords. The order depends on the type of data that is defined in the column, so that a numeric field will be ordered as such, and an alphanumeric will be ordered from A to Z, even if its content is numeric. Defaults to ASC if not specified when querying.
Examples:
SELECT Enrolment, brand, model, color, number_kilometers, num_plazas
FROM cars
ORDER BY brand ASC, model DESC;
This example selects the Enrolment, brand, model, color, number_kilometers, and num_plazas fields from the cars table, ordering the rows by brand in ascending order and then by model in descending order.
SELECT Enrolment, brand, model, color, number_kilometers, num_plazas
FROM cars
ORDER BY 2;
This example selects the Enrolment, brand, model, color, number_kilometers, and num_plazas fields from the cars table, ordering the rows by the brand field, since it appears second in the list of fields in the SELECT.
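The remark that an alphanumeric column is sorted as text even when its content is numeric can be checked with a short sketch. It uses Python's standard sqlite3 module; the readings table and its two columns are hypothetical and hold the same values once as text and once as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (as_text TEXT, as_number INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("9", 9), ("10", 10), ("2", 2)],
)

# Text columns are compared character by character, so '10' sorts before '2'.
text_order = [
    row[0] for row in conn.execute("SELECT as_text FROM readings ORDER BY as_text")
]

# Numeric columns are compared by value.
number_order = [
    row[0] for row in conn.execute("SELECT as_number FROM readings ORDER BY as_number")
]

print(text_order, number_order)
```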
Subqueries
A subquery is a SELECT statement that is embedded in a clause of another SQL statement. Subqueries can also be used in the INSERT, UPDATE, DELETE commands, and in the FROM clause.
Subqueries can be useful if you need to select rows from a table with a condition that depends on data in the table itself or in another table.
For an uncorrelated subquery such as the one below, the subquery (inner query) is conceptually executed before the main query; its result is then used by the main query (outer query).
SELECT c.Enrolment, c.model
FROM cars AS c
WHERE c.Enrolment IN (
    SELECT m.Enrolment
    FROM fines AS m
    WHERE m.import > 100
);
In this example, the license plates and models of the cars that have at least one fine exceeding $100 are selected.
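A runnable version of the same idea, as a sketch with Python's standard sqlite3 module. The cars and fines tables are reduced to the columns the query needs, and the column names plate and amount, like the sample rows, are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (plate TEXT, model TEXT)")
conn.execute("CREATE TABLE fines (plate TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("AA-1", "Focus"), ("AA-2", "Ibiza"), ("AA-3", "Corsa")],
)
conn.executemany(
    "INSERT INTO fines VALUES (?, ?)",
    [("AA-1", 150.0), ("AA-2", 50.0)],
)

# The inner SELECT yields the plates with a fine above 100;
# the outer SELECT keeps only the cars whose plate is in that result.
rows = conn.execute(
    """
    SELECT c.plate, c.model
    FROM cars AS c
    WHERE c.plate IN (
        SELECT f.plate FROM fines AS f WHERE f.amount > 100
    )
    """
).fetchall()

print(rows)
```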
INSERT
An SQL INSERT statement adds one or more records to one (and only one) table in a relational database.
Basic form
INSERT INTO table_name (columnA[, columnB, ...])
VALUES ('value1'[, 'value2', ...]);
-- Or, alternatively:
INSERT INTO table_name VALUES ('value1', 'value2');
The number of columns and values must be equal. If a column is not specified, it is assigned its default value. Values specified (or implied) by the INSERT statement must satisfy all applicable constraints. If a syntax error occurs or if any of the constraints is violated, the row is not added and an error is returned.
Example
INSERT INTO agenda_telefonica (Name, Number)
VALUES ('Roberto Jeldrez', 4886850);
When all the values of a table are specified, the shortened statement can be used:
INSERT INTO table_name VALUES ('value1'[, 'value2', ...]);
Example (assuming that Name and Number are the only columns in the agenda_telefonica table):
INSERT INTO agenda_telefonica
VALUES ('Johnny Aguilar', '080473968'); -- quoted so that the leading zero is preserved
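The rules about default values and constraints stated above can be observed in a small sketch using Python's standard sqlite3 module. The phone_book table, its NOT NULL constraint, and its default value are illustrative assumptions chosen to trigger both behaviours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_book ("
    "name TEXT NOT NULL, "
    "number TEXT DEFAULT 'unknown')"
)

# The number column is omitted, so it receives its default value.
conn.execute("INSERT INTO phone_book (name) VALUES ('John Doe')")

# A NULL name violates the NOT NULL constraint: the row is rejected
# and the driver raises an error instead of inserting it.
try:
    conn.execute("INSERT INTO phone_book (name, number) VALUES (NULL, '555-0000')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT name, number FROM phone_book").fetchall()
print(rejected, rows)
```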
Advanced forms
A feature of SQL (since SQL-92) is the use of row constructors to insert multiple rows at once, with a single SQL statement:
INSERT INTO table_name (column1[, column2, ...])
VALUES ('value1A'[, 'value1B', ...]),
       ('value2A'[, 'value2B', ...]), ...
This feature is supported by DB2, PostgreSQL (since version 8.2), MySQL, and H2.
Example (assuming Name and Number are the only columns in the agenda_telefonica table):
INSERT INTO agenda_telefonica
VALUES ('Roberto Fernández', '4886850'),
       ('Alejandro Sosa', '4556550');
The same result could be obtained with the statements
INSERT INTO agenda_telefonica VALUES ('Roberto Fernández', '4886850');
INSERT INTO agenda_telefonica VALUES ('Alejandro Sosa', '4556550');
Note that separate statements may have different semantics (especially with respect to triggers), and may have different performance than the multiple insert statement.
To insert multiple rows in MS SQL you can use this construct:
INSERT INTO phone_book
SELECT 'John Doe', '555-1212'
UNION ALL
SELECT 'Peter Doe', '555-2323';
Note that this is not a valid SQL statement according to the SQL standard (SQL: 2003), due to the incomplete subselect clause.
To do the same in Oracle, the DUAL table is used, since it contains exactly one row:
INSERT INTO phone_book
SELECT 'John Doe', '555-1212' FROM DUAL
UNION ALL
SELECT 'Peter Doe', '555-2323' FROM DUAL
A standard-compliant implementation of this logic is shown in the following example, or as shown above (not applicable in Oracle):
INSERT INTO phone_book
SELECT 'John Doe', '555-1212' FROM LATERAL ( VALUES (1) ) AS t(c)
UNION ALL
SELECT 'Peter Doe', '555-2323' FROM LATERAL ( VALUES (1) ) AS t(c)
Copy rows from other tables
An INSERT can also be used to retrieve data from other tables, modify it if necessary, and insert it directly into the table. All of this is done in a single SQL statement that does not involve any intermediate processing in the client application. A subselect is used in place of the VALUES clause. The subselect can contain joins and function calls, and can even query the same table into which the data is inserted. Logically, the SELECT is evaluated before the INSERT operation starts. An example is given below.
INSERT INTO phone_book2
SELECT *
FROM phone_book
WHERE name IN ('John Doe', 'Peter Doe');
A variation is needed when only some of the data from the source table is inserted into the new table, rather than the entire record (or when the schemas of the tables are not the same).
INSERT INTO phone_book2 ([name], [phoneNumber])
SELECT [name], [phoneNumber]
FROM phone_book
WHERE name IN ('John Doe', 'Peter Doe');
The SELECT produces a (temporary) table, and the schema of that temporary table must match the schema of the table into which the data is inserted.
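As a hedged sketch of copying rows with INSERT ... SELECT, the following script uses Python's standard sqlite3 module. The two tables share an illustrative schema (the phone_number column name and the third sample row are assumptions), and the source rows are loaded with a multi-row VALUES list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phone_book (name TEXT, phone_number TEXT)")
conn.execute("CREATE TABLE phone_book2 (name TEXT, phone_number TEXT)")

# Multi-row insert with a single statement (row constructors).
conn.execute(
    "INSERT INTO phone_book VALUES "
    "('John Doe', '555-1212'), ('Peter Doe', '555-2323'), ('Ann Roe', '555-3434')"
)

# Copy only the matching rows; the SELECT's columns must line up
# with the column list of the target table.
conn.execute(
    "INSERT INTO phone_book2 (name, phone_number) "
    "SELECT name, phone_number FROM phone_book "
    "WHERE name IN ('John Doe', 'Peter Doe')"
)

copied = conn.execute("SELECT name FROM phone_book2 ORDER BY name").fetchall()
print(copied)
```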
UPDATE
An SQL UPDATE statement is used to modify the values of a set of existing records in a table.
Example
UPDATE My_table SET field1 = 'updated value' WHERE field2 = 'N';
DELETE
An SQL DELETE statement deletes one or more existing records in a table.
Basic form
DELETE FROM Table WHERE column1 = 'value1';
Example
DELETE FROM mi_tabla WHERE column2 = 'N';
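Both UPDATE and DELETE act on every row that matches their WHERE condition. A minimal sketch with Python's standard sqlite3 module follows; the my_table contents are illustrative, and the rowcount attribute used to report the number of affected rows belongs to the Python driver rather than to SQL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("old", "N"), ("old", "Y"), ("old", "N")],
)

# UPDATE changes every row whose field2 is 'N'.
updated = conn.execute(
    "UPDATE my_table SET field1 = 'updated value' WHERE field2 = 'N'"
).rowcount

# DELETE removes every row whose field2 is 'Y'.
deleted = conn.execute("DELETE FROM my_table WHERE field2 = 'Y'").rowcount

remaining = conn.execute("SELECT field1, field2 FROM my_table").fetchall()
print(updated, deleted, remaining)
```

Omitting the WHERE clause in either statement would affect every row in the table.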
Retrieving the key
Database designers who use a surrogate key as the primary key for each table will occasionally run into a scenario where they need to retrieve the primary key that the database generated automatically for an SQL INSERT statement, in order to use it in other SQL statements. Most systems do not allow SQL INSERT statements to return rows of data. Therefore, a workaround becomes necessary in such scenarios.
Common implementations include:
- Using a database-specific stored procedure that generates the surrogate key, performs the INSERT operation, and finally returns the generated key.
- Using a database-specific SELECT statement on a temporary table containing the last row inserted. DB2 implements this feature as follows:
SELECT *
FROM NEW TABLE (INSERT INTO phone_book VALUES ('Cristobal Jeldrez', '0426.817.10.30')) AS t
- Using a SELECT statement after the INSERT statement with a database-specific function that returns the primary key generated for the most recently inserted row.
- Using a unique combination of elements from the original SQL INSERT in a subsequent SELECT statement.
- Using a GUID in the SQL INSERT statement and retrieving it in a SELECT statement.
- Using the PHP function mysql_insert_id() for MySQL after the INSERT statement.
- Using an INSERT with the RETURNING clause. In Oracle it can only be used within a PL/SQL block; in PostgreSQL it can be used both in plain SQL and in PL/pgSQL.
INSERT INTO phone_book VALUES ('Cristobal Jeldrez', '0426.817.10.30')
RETURNING phone_book_id INTO v_pb_id
- In MS SQL you can use the following statements:
SET NOCOUNT ON;
INSERT INTO phone_book VALUES ('Cristobal Jeldrez', '0426.817.10.30');
SELECT @@IDENTITY AS id
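Many client libraries expose the generated key directly, which is another variant of the driver-function approach listed above. As a sketch, Python's standard sqlite3 module provides it through the cursor's lastrowid attribute; the phone_book schema below is an illustrative assumption with an automatically assigned integer key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_book ("
    "phone_book_id INTEGER PRIMARY KEY, "  # assigned automatically by SQLite
    "name TEXT, "
    "number TEXT)"
)

cur = conn.execute(
    "INSERT INTO phone_book (name, number) VALUES (?, ?)",
    ("Cristobal Jeldrez", "0426.817.10.30"),
)

# The driver reports the key generated for the row just inserted.
new_id = cur.lastrowid

# The key can then be used in later statements.
row = conn.execute(
    "SELECT name FROM phone_book WHERE phone_book_id = ?", (new_id,)
).fetchone()

print(new_id, row)
```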
Triggers
Triggers are defined on the table on which the INSERT statement operates, and are evaluated in the context of the operation. BEFORE INSERT triggers allow modification of the values to be inserted into the table. AFTER INSERT triggers can no longer modify the inserted data, but can be used to initiate actions on other tables, for example to implement auditing mechanisms.
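A minimal sketch of an AFTER INSERT trigger used for auditing, run through Python's standard sqlite3 module. The audit_log table and the trigger name log_insert are hypothetical, and trigger syntax differs between database engines; the form shown is SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE phone_book (name TEXT, number TEXT);
    CREATE TABLE audit_log (action TEXT, name TEXT);

    -- Runs once for each row inserted into phone_book.
    CREATE TRIGGER log_insert AFTER INSERT ON phone_book
    BEGIN
        INSERT INTO audit_log VALUES ('insert', NEW.name);
    END;
    """
)

# The application only inserts into phone_book; the trigger fills audit_log.
conn.execute("INSERT INTO phone_book VALUES ('John Doe', '555-1212')")

audit = conn.execute("SELECT action, name FROM audit_log").fetchall()
print(audit)
```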
Database management systems
The most widely used database management systems with SQL support are, in alphabetical order:
- DB2
- Firebird
- HSQL
- Informix
- InterBase
- MariaDB
- Microsoft SQL Server
- MySQL
- Oracle
- PostgreSQL
- PervasiveSQL
- SQLite
- Sybase ASE
Interoperability
The query languages of the different database management systems are incompatible with each other and do not necessarily fully follow the standard. In particular, date and time syntax, string concatenation, nulls, and text comparison in terms of case sensitivity vary from vendor to vendor. One particular exception is PostgreSQL, which strives for compliance with the standard.
Popular SQL implementations commonly omit support for basic features of standard SQL, such as the DATE or TIME data types. This is the case with the Oracle database (whose DATE type behaves like DATETIME, and which lacks a TIME type) and MS SQL Server (before the 2008 version). As a result, SQL code can rarely be ported between database systems without modification.
There are several reasons for this lack of portability between database systems:
- The complexity and size of the SQL standard means that most SQL implementations are not compatible with the entire standard.
- The standard does not specify the behavior of the database in several important areas (e.g. indexes, file storage, etc.), leaving implementations to decide how to behave.
- The SQL standard precisely specifies the syntax that a conforming database system must implement. However, the standard's specification of the semantics of language constructs is less well defined, which leads to ambiguity.
- Many database providers have large existing customer bases, so changes made to conform to the standard could conflict with existing user installations, and the provider may not be willing to abandon compatibility with previous versions.
- There is little commercial incentive for a provider to make it easier for users to change database provider.
- Users who evaluate database software tend to rank other factors, such as performance, above standard compliance.
The ODBC (Open Database Connectivity) standard allows access to information from any application regardless of the database management system (DBMS) in which the information is stored, thus decoupling the application from the database.