IF you are new to mariadb or mysql, you might want to take a sneak peek at the Introduction To MariaDB, I covered a couple of interesting concept in that post.
In this guide, you'll learn how to create a mariadb database and table.
You can think of a database as a container that houses different tables, and in order for you to add and alter data, it is a must to create a database.
Factors To Consider When Creating a Table
When creating tables, you need to understand the structure of each table beforehand, the number of columns, the type of data each column would hold, how the table will be indexed, and more. The following is a list of what to do when creating a structure for your data:
- Choose a meaningful name, this would avoid confusion down the line, if you would be creating tons of tables, you don't want to use a name like abc1, abc2.
- The number of tables to include in your database.
- The number of columns each table should contain, as well as meaningful column names.
- For each column, what kind of data is to be stored, you don't want to use an Integer type when you would be storing a text.
There are different types of data you can use for your column, you can use a number type, alphanumeric (containing both alphabetical and numerical symbols), you can choose the maximum characters you want each column to contain, e.g if you would be storing a description of a certain feature, you would need a column that can store plenty of text and less to store maybe a date or author information.
It just makes perfect sense to get a good idea of how you want it to be structured, doing this alone can make your database super-efficient.
Creating a Database
What I most do when naming a database is to concatenate the user running the site with the site name, something like:
username_websitename
In this guide, I'll use a fictitious username "marn" and a website name "melodic", so, now, we would have:
marn_melodic
I love the name ;), the next thing is to create the database, this is super simple, and you can do that using:
CREATE DATABASE marn_melodic;
This is an empty database, and it is a place where we would add our tables, which will, in turn, hold different types of data.
Understanding Character set and Collation
In MariaDB, the default character set is latin1, and the default collation is latin1_swedish_ci, this may be different in some Linux distro. When changing a character set and not specifying a collation, the default collation for the new character set is always used.
If you are wondering the meaning of a character set, and collation, don't worry it is simple. A character set is a set of symbols and encodings, and a collation is a set of rules for comparing characters in a character set.
To put it simply, a character set defines how and which characters are stored to support a particular language or languages. A collation, on the other hand, defines the order used when comparing strings (i.e. the position of any given character within the alphabet of that language).
An explanation from the official doc of mysql:
Suppose that we have an alphabet with four letters: A, B, a, b. We give each letter a number: A = 0, B = 1, a = 2, b = 3. The letter A is a symbol, the number 0 is the encoding for A, and the combination of all four letters and their encodings is a character set.
Suppose that we want to compare two string values, A and B. The simplest way to do this is to look at the encodings: 0 for A and 1 for B. Because 0 is less than 1, we say A is less than B. What we've just done is applied a collation to our character set. The collation is a set of rules (only one rule in this case): “compare the encodings.” We call this simplest of all possible collations a binary collation.
But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters a and b as equivalent to A and B; (2) then compare the encodings. We call this a case-insensitive collation. It is a little more complex than a binary collation.
In real life, most character sets have many characters: not just A and B but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules, not just for whether to distinguish lettercase, but also for whether to distinguish accents (an “accent” is a mark attached to a character as in German Ö), and for multiple-character mappings (such as the rule that Ö = OE in one of the two German collations).
Source: MySQL Charset
MariaDB can do all the above for you, it can:
- Store strings using a variety of character sets.
- Compare strings using a variety of collations.
- Mix strings with different character sets or collations in the same server, the same database, or even the same table.
- Enable specification of character set and collation at any level.
If you need to use a specific character set, then you need to know what character sets and collations are available, how to change the defaults (we would do that in a moment), here is a list of available character sets and collation, you can cross-check the description to see what fits your language.
For our table, we would be using the common utf8mb4, and a collation utf8mb4_general_ci, this would cover almost all languages (even Chinese or Arabic), so, we are good to go, don't worry if you don't understand all this by now, it would become a part of you later down the line.
Let's drop the marn_melodic
database and create it again with a new character set and collation:
DROP DATABASE marn_melodic;
CREATE DATABASE marn_melodic CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
All of the above is one SQL statement spread over two lines, ending with the semicolon. The second line, which is new, tells mariadb that the default characters that will be used in tables in the database are UTF8 letters and other characters.
The third line tells mariadb that the default method of sorting data in tables is based on utf8mb4_general_ci, the ci means it is case insensitive, so, for example, when searching, "Jailer" is the same as "jailer"
Let's confirm the database we created, you use:
SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| marn_melodic |
| mysql |
| performance_schema |
+--------------------+
As you can see, the result returned marn_melodic, the other three databases were created by mariadb by default, so, leave it untouched.
Before beginning to add tables to the marn_melodic database, enter the following command into the mariadb client:
USE marn_melodic;
The USE command is self-explanatory, it allows us to select a database we want to work with, in my case, I want to work with the marn_melodic database.
Once you’ve executed the command, it will change MariaDB [(none)]>
to MariaDB [marn_melodic]>
This is a very useful feature as it indicates the database you are currently working with, so, anything you are doing would only affect this database.
Creating Tables
To structure our database, we need to create tables, and keeping structure in mind, you should make sure your table names reflect the goal of the project. So, here are the preferred table naming convention:
- Use a lower case table name, this would ensure uniformity on systems that are not case insensitive e.g Linux, so, again, make sure your table names are in lower cases
- Prefix your table names, e.g if we are working on a project called beatrecords, and I need to store user info, I would name the table beatrecords_user
- Avoid using spaces, numbers, and crazy characters.
For this guide, the name of our imaginary project is - classical music, and the table name would be cmusic for short, we would avoid prefixing for now.
Enter the following SQL statement to create the table:
CREATE TABLE cmusic (
music_id INT AUTO_INCREMENT PRIMARY KEY,
artiste VARCHAR(255),
title VARCHAR(255),
genre_id INT,
description TEXT);
We created the table cmusic with five columns or fields, with commas separating the information about each column.
Note that all the columns together are contained in a pair of parentheses. For each column, we specify the name, the type, and optional settings.
The first column has the following information:
- name - music_id
- Type INT (This means it has to contain integers; whole numbers)
- The settings, AUTO_INCREMENT and PRIMARY KEY
The names of the columns can be anything except reserved words for SQL statements, clauses, and functions. Although you can use them when quoted, it just makes sense to avoid them.
Here is a list of Reserved Words
Here is a list of available data types
As you've seen, we created the table with only five columns, but you can have a lot of them (up to 1000), it is a bad practice to have too many columns, If a table has too many columns, it can be cumbersome to use and the table will be sluggish when it’s accessed, so, it is better to break them down in silos (isolate them by their characteristics).
The first column in the cmusic table is a simple identification number, music_id. It will be the primary key column on which data will be indexed, which is why we have the keyword — PRIMARY KEY. PRIMARY KEY isn't compulsory, but not adding it is like not giving thousands of pictures distinct numbers, yes, you would still be able to find the pictures if they do not have a distinct number, but suppose you ever need to remove a duplicate image, how would you identify which one to remove and which to keep?
You get the idea now I guess, and I can read your mind, you can also use the music title as the primary key, but that doesn't make much sense since names can be very long, and a certain music title might not be unique to an artiste, and even if it is, there would always be a day when an artiste would have the same song title as the one you've stored, so, numbers are better and unique.
The AUTO_INCREMENT allows a unique number to be generated automatically when a new record is inserted into a table. It will start with the number 1 unless we specify a different number.
We've set the artiste column to have a variable-width character data type (VARCHAR). The 255 that we specify in the parentheses after it sets the maximum size, this should be sufficient for long artiste names we’ll need to accommodate.
If the name of an artiste is lesser than 255 characters, the storage engine will reduce the size of the column for the row, and that is because VARCHAR is a variable-length string data type, so it holds only the characters you assign to it.
There is also a data type called CHAR, and it works differently from VARCHAR, if the data type in a CHAR column is less than its maximum, space is still allocated for the full width that you set. If you know what to expect, USE CHAR (example of such usage is when storing a country code) tables run faster and can be indexed more easily with a CHAR column.
The good thing about using a VARCHAR data type is that the column can use less space on the server’s hard drive and is less prone to fragmentation, and thus improve performance. When you know for sure that a column will have a set number of characters, use CHAR. When the width may vary, use VARCHAR, this is just a heads up.
The third column (title) also uses a VARCHAR type with a maximum of 255 maximum column length in characters.
The fourth column uses an INT, that would hold the genre id of the actual genre, we would create a table about the genre later, and the last field (description) uses a TEXT data type, which means that it’s a variable-width column, and it can hold up 65,535 bytes of data for each row. This will allow us to enter plenty of text about each music.
Note that there are more data types you can use, but this depends on your goal, so, make sure to check the data type page I linked to above.
To see how the table looks, use the DESCRIBE statement. It displays information about the columns of a table or the table schema, enter the following to bring that up:
DESCRIBE cmusic;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| music_id | int(11) | NO | PRI | NULL | auto_increment |
| artiste | varchar(255) | YES | | NULL | |
| title | varchar(255) | YES | | NULL | |
| genre_id | int(11) | YES | | NULL | |
| description | text | YES | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
The first column is the heading, and that describes the rows of information that follows it, so, as you can see, the first column contains the field or columns of tables we just created.
The second column list the data type for each field, some are VARCHAR with different width, we have one has INT(11), the 11 doesn't mean the max number you can store, it just how you want your numbers to be padded.
The third column in the preceding results, Null, indicates whether each field may contain NULL values. In computer programming, a null is a character denoting nothing, this is different from empty content in a field, when a field is empty, it isn’t null in the sense that it still has a length, when a variable is empty it has a length of 0 stored in memory, when it is null, it has absolutely nothing, I hope you get the explanation, don't worry if you don't understand that, for now, just accept that null is nothing.
The fourth column, Key, indicates whether a field is a key field — an indexed column. In the example shown, the music_id column is a primary key, shortened to PRI in the above display.
The fifth column (Default) would contain any default value set for
each field. We didn’t set any when creating the cmusic table, though, we could have done so.
The last column (Extra) provides any extra information the table maintains on each column. In the example shown, we can see that the values for music_id will be incremented automatically.
If you want to change anything, feel free to DROP the table using DROP TABLE cmusic; and renter the data again.
Inserting Data
So far, we've only created a table, and haven't added any data whatsoever, to put data into the music table,we use the INSERT statement, enter data by following the above:
INSERT INTO cmusic (title, artiste)
VALUES
('Silhouette', 'Avicii'),
('Higher Steps', 'Beautiful Nubia'),
('Tables Turn', 'Beautiful Nubia'),
('Observation Is No Crime', 'Fela'),
('Levels', 'Avicii'),
('Colonial Mentality', 'Fela'),
('Jailer', 'Asa');
I got:
Query OK, 7 rows affected (0.01 sec)
Records: 7 Duplicates: 0 Warnings: 0
As you can see, 7 rows were added, with no duplicates nor error.
To display the data you have entered in the cmusic table, enter the following:
SELECT * FROM cmusic;
+----------+-----------------+-------------------------+----------+-------------+
| music_id | artiste | title | genre_id | description |
+----------+-----------------+-------------------------+----------+-------------+
| 1 | Avicii | Silhouette | NULL | NULL |
| 2 | Beautiful Nubia | Higher Steps | NULL | NULL |
| 3 | Beautiful Nubia | Tables Turn | NULL | NULL |
| 4 | Fela | Observation Is No Crime | NULL | NULL |
| 5 | Avicii | Levels | NULL | NULL |
| 6 | Fela | Colonial Mentality | NULL | NULL |
| 7 | Asa | Jailer | NULL | NULL |
+----------+-----------------+-------------------------+----------+-------------+
As you can see from the results, MySQL put values in the two columns we gave it, and set the other columns to their default values (i.e., NULL). We can change those values later.
Also, you see the music_id incremented the number for us, that is the function of the AUTO_INCREMENT we added when we created the table in the "Creating a Table Section"
To conclude this guide, let's create another table that would hold information on members that would listen to our classical music:
CREATE TABLE users
(human_id INT AUTO_INCREMENT PRIMARY KEY,
name_first VARCHAR(30),
name_last VARCHAR(30),
email_address VARCHAR(255));
The following adds 5 people to our table of member of the site:
INSERT INTO users
(name_first, name_last, email_address)
VALUES
('Blaise', 'Pascal', 'blaise@melodic.com'),
('Charles', 'Dawton', 'charles@melodic.com'),
('James', 'Peter', 'james@melodic.com'),
('Devsrealm', 'Guy', 'devsrealm@melodic.com'),
('Nubia', 'Putin', 'nubia@melodic.com');
To display the tables we have so far, you use:
SHOW TABLES IN marn_melodic;
+------------------------+
| Tables_in_marn_melodic |
+------------------------+
| cmusic |
| users |
+------------------------+
If you for example want to create a table that is similar to the one that you just created, you can use the SHOW CREATE TABLE statement, and that would show how you might enter the CREATE TABLE to create an existing table, here is an example:
SHOW CREATE TABLE cmusic \G
MariaDB [marn_melodic]> SHOW CREATE TABLE cmusic \G
*************************** 1. row ***************************
Table: cmusic
Create Table: CREATE TABLE `cmusic` (
`music_id` int(11) NOT NULL AUTO_INCREMENT,
`artiste` varchar(255) DEFAULT NULL,
`title` varchar(255) DEFAULT NULL,
`genre_id` int(11) DEFAULT NULL,
`description` text,
PRIMARY KEY (`music_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4
There are more options that you can set for each column; if you don’t specify them, the server will use the default choices. . Notice that we did not set a default value for any of the fields (except the first one that uses the AUTO_INCREMENT), so it set each column to a default of NULL.
If you notice that the last line says ENGINE=InnoDB, this is the type of storage engine used, and it is the default in mariadb. Data is stored and handled in different ways by different storage engines.
Let's create a table that would hold the actual genre, if we like we can create in such a way that it will hold more information about the genre of music, so, we would tie it into the genre_id column in the cmusic table, this would make things a bit more clearer.
CREATE TABLE genre (
genre_id INT AUTO_INCREMENT PRIMARY KEY,
genre_description VARCHAR(255) );
I am creating two columns, the first one would be indexed and will be referenced by the cmusic table. We are only doing this so whenever we issue an SQL statement, the query can reference both tables. The SQL statements would join the genre table to the cmusic table based on the genre_id columns in both tables.
The second column would hold the description of the genre.
This would conclude the guide on creating databases, and tables. In a future guide, we would look into manipulating or altering the data.