Exploring the Cosmos: 
Building and Querying a Planetary Database with MySQL

Exploring the Cosmos: Building and Querying a Planetary Database with MySQL

After installing MySQL, in this tutorial we will make a MySQL database and create a table in that database. Then we'll add data to the table and query it. Make sure the MySQL server is up and running. On OSX I will use
brew services start mysql

After starting the MySQL server, you can access the MySQL shell by typing:

mysql -u root -p

Before we create our own database, lets run the SHOW DATABASES; command.
When you run the command SHOW DATABASES; in MySQL, it retrieves a list of all the databases that exist within the MySQL server instance you are connected to.

The output below illustrates the result of this command:

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

The above system databases and their tables are integral parts of the MySQL server these databases are automatically created when you install MySQL and contain system-related information, they aren't traditional databases in the sense of holding user-generated data. Instead, they serve administrative, informational, and performance-monitoring purposes.

You can create a new database using the CREATE DATABASE command in MySQL. So lets create a new database called "Astronomy" in MySQL, you can use the CREATE DATABASE statement followed by the desired database name. Here's how you can do it:

CREATE DATABASE Astronomy;

After executing this SQL command, MySQL will create a new database named "Astronomy". You can then proceed to work with this database by switching to it or by creating tables within it to organize your data related to astronomy.

Lets use the SHOW DATABASES;command to see this new database:

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| Astronomy          |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.02 sec)

Now out Astronomy database has been created, we need to tell MySQL to switch to this database.

To switch to a different database in MySQL, we can use the USE statement followed by the name of the database we want to switch to.

Here's how we can switch to the "Astronomy" database:

USE Astronomy;

After executing this SQL command, you will switch your current MySQL session to the "Astronomy" database. Any subsequent SQL commands you execute will operate within this database context, meaning they will affect or retrieve data from tables within the "Astronomy" database until you switch to a different database or end the session. In MySQL, you can use the SELECT DATABASE() function to retrieve the name of the current database you are working with.

SELECT DATABASE();
mysql> SELECT DATABASE();
+------------+
| DATABASE() |
+------------+
| astronomy  |
+------------+
1 row in set (0.01 sec)

In MySQL, the database and table names are not case-sensitive by default on most operating systems. This means that "Astronomy", "astronomy", "ASTRONOMY", and any other variations of capitalization will refer to the same database or table.

Now let's create a new table called "Planets" with columns for "PlanetName", "DayLength", and "YearLength" in MySQL, you can use the CREATE TABLE statement. Here's how you can do it:

CREATE TABLE Planets (
    PlanetName VARCHAR(25) PRIMARY KEY,
    DayLength INT,
    YearLength INT
);
mysql> CREATE TABLE Planets (
    ->     PlanetName VARCHAR(25) PRIMARY KEY,
    ->     DayLength INT,
    ->     YearLength INT
    -> );
Query OK, 0 rows affected (0.06 sec)

To see the structure of the new table "Planets" that you created in MySQL, we can use the DESCRIBE statement followed by the table name. Here's how you can do it:

DESCRIBE Planets;

Executing this SQL command will provide you with information about the columns, data types, and any constraints applied to the "Planets" table. It will show you details such as the column names, data types, whether they allow NULL values, and any additional constraints like primary keys.

mysql> DESCRIBE Planets;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| PlanetName | varchar(25) | NO   | PRI | NULL    |       |
| DayLength  | int         | YES  |     | NULL    |       |
| YearLength | int         | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
3 rows in set (0.02 sec)

Alternatively, you can also use the SHOW COLUMNS statement to achieve the same result. Both of these commands will display the structure of the "Planets" table you created, allowing you to verify that it has been created correctly and contains the columns you specified.

mysql> SHOW COLUMNS FROM Planets;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| PlanetName | varchar(25) | NO   | PRI | NULL    |       |
| DayLength  | int         | YES  |     | NULL    |       |
| YearLength | int         | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Now we have created our Planets table inside our Astronomy database, we can seed our Planets table with some sample data. We can use the
INSERT INTO statement to add data to our table:

INSERT INTO Planets (PlanetName, DayLength, YearLength) 
VALUES 
    ('Mercury', 4223, 88),
    ('Venus', 2802, 225),
    ('Earth', 24, 365),
    ('Mars', 25, 687),
    ('Jupiter', 10, 4331),
    ('Saturn', 11, 10747),
    ('Uranus', 17, 30589),
    ('Neptune', 16, 59800);

This SQL command inserts data into the "Planets" table. Each INSERT INTO statement adds a row to the table with values for the columns PlanetName, DayLength, and YearLength corresponding to the provided data.

To see the content of our "Planets" table, we can use the SELECT statement. Here's how you can use it to view all the rows in the "Planets" table:

SELECT * FROM Planets;

Executing this SQL command will retrieve all rows from the "Planets" table and display them. It will show you the values of all columns for each row, including PlanetName, DayLength, and YearLength. This will allow you to verify that the seeding was done correctly and that the data is present in the table as expected.


mysql> SELECT * FROM Planets;
+------------+-----------+------------+
| PlanetName | DayLength | YearLength |
+------------+-----------+------------+
| Earth      |        24 |        365 |
| Jupiter    |        10 |       4331 |
| Mars       |        25 |        687 |
| Mercury    |      4223 |         88 |
| Neptune    |        16 |      59800 |
| Saturn     |        11 |      10747 |
| Uranus     |        17 |      30589 |
| Venus      |      2802 |        225 |
+------------+-----------+------------+
8 rows in set (0.00 sec)

In SQL, tables are inherently unordered sets of rows. When you insert data into a table, the database engine doesn't guarantee any specific order for the rows.

Let's say we want to find out how many planets in our Solar system have a day length longer than 24 hours. To find planets with a day length longer than 24 hours from the "Planets" table, we can use the SELECT statement with a WHERE clause to filter the results. Here's how you can do it:

SELECT * FROM Planets WHERE DayLength > 24;

This SQL command selects all rows from the "Planets" table where the value in the DayLength column is greater than 24 (hours). It will retrieve the planets with a day length longer than 24 hours.

mysql> SELECT * FROM Planets WHERE DayLength > 24;
+------------+-----------+------------+
| PlanetName | DayLength | YearLength |
+------------+-----------+------------+
| Mars       |        25 |        687 |
| Mercury    |      4223 |         88 |
| Venus      |      2802 |        225 |
+------------+-----------+------------+
3 rows in set (0.10 sec)

To find out how many results you get, you can simply count the number of rows returned by the query. If you're using a MySQL client, it will typically display the count along with the query results as shown above.

Otherwise, you can add the COUNT() function to the query to explicitly count the rows:

SELECT COUNT(*) FROM Planets WHERE DayLength > 24;

Executing this query will return the number of planets with a day length longer than 24 hours.

mysql> SELECT COUNT(*) FROM Planets WHERE DayLength > 24;
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

Again, let's query the table to find planets with a year length longer than 400 days. To find planets with a year length longer than 400 days from the "Planets" table, we can use the SELECT statement with a WHERE clause to filter the results based on the YearLength column. Here's how you can do it:

SELECT * FROM Planets WHERE YearLength > 400;

This SQL command selects all rows from the "Planets" table where the value in the YearLength column is greater than 400 (days). It will retrieve the planets with a year length longer than 400 days.

mysql> SELECT * FROM Planets WHERE YearLength > 400;
+------------+-----------+------------+
| PlanetName | DayLength | YearLength |
+------------+-----------+------------+
| Jupiter    |        10 |       4331 |
| Mars       |        25 |        687 |
| Neptune    |        16 |      59800 |
| Saturn     |        11 |      10747 |
| Uranus     |        17 |      30589 |
+------------+-----------+------------+
5 rows in set (0.00 sec)

As before we can use the COUNT(*) function to find out how many results you can get for planets where the year length is greater than 400 days:

SELECT COUNT(*) FROM Planets WHERE YearLength > 400;

Executing this SQL command will return the number of planets with a year length longer than 400 days:

mysql> SELECT COUNT(*) FROM Planets WHERE YearLength > 400;
+----------+
| COUNT(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

Now let's create a new table called Moons with columns for PlanetName, MoonName and HasLiquidWater. Again we will use the CREATE TABLE statement:

CREATE TABLE Moons (
    PlanetName VARCHAR(25),
    MoonName VARCHAR(100),
    HasLiquidWater VARCHAR(10)
    PRIMARY KEY (PlanetName, MoonName)
);

Notice, unlike our Planets table where we used PlanetName as our primary key, in our Moons table we need to use a composite key.

CREATE TABLE Planets (
    PlanetName VARCHAR(25) PRIMARY KEY,
    DayLength INT,
    YearLength INT
);

A composite key, also known as a composite primary key, is a key in a relational database table that consists of multiple columns. Unlike a single-column primary key, which uses only one column to uniquely identify each row in the table, a composite key uses a combination of two or more columns to achieve uniqueness.

Since each moon's name is unique within the context of its planet, combining these two columns "PlanetName" and "MoonName" would ensure uniqueness across the entire table.

In our Moons table using only the "PlanetName" column as the primary key would only work if each planet has only one moon, ensuring that the combination of "PlanetName" and "MoonName" is unique. However, it's not a recommended approach because it restricts your table to storing only one moon per planet!

In reality, many planets have multiple moons, and using just the "PlanetName" as the primary key wouldn't allow for this. Using a composite primary key with both "PlanetName" and "MoonName" allows you to accommodate multiple moons per planet while ensuring that each moon is uniquely identified within the context of its planet.

Additionally, using a composite primary key aligns better with the relational model's principles, as it reflects the natural uniqueness of moon names within their respective planets. It also helps maintain data integrity by preventing duplicate moon entries for the same planet.

Now that we have created our Moons table, to see all the tables in our "Astronomy" database, we can use the SHOW TABLES statement:

mysql> SHOW TABLES;
+---------------------+
| Tables_in_astronomy |
+---------------------+
| Moons               |
| Planets             |
+---------------------+
2 rows in set (0.01 sec)

let's look at our two tables again using the DESCRIBE statement:

mysql> DESCRIBE Moons;
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| PlanetName     | varchar(25)  | NO   | PRI | NULL    |       |
| MoonName       | varchar(100) | NO   | PRI | NULL    |       |
| HasLiquidWater | varchar(10)  | YES  |     | NULL    |       |
+----------------+--------------+------+-----+---------+-------+
3 rows in set (0.01 sec)

mysql> DESCRIBE Planets;;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| PlanetName | varchar(25) | NO   | PRI | NULL    |       |
| DayLength  | int         | YES  |     | NULL    |       |
| YearLength | int         | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
3 rows in set (0.01 sec)

Note: For simplicity I used the VARCHAR datatype for HasLiquidWater, however, in this specific example the ENUM datatype might be preferred.

An ENUM is a data type in SQL that represents a set of predefined values.
It allows us to specify a list of permissible values for a column, and each column can only contain one of the specified values or NULL.

In our example here, using ENUM has several advantages:

Data Integrity: ENUM ensures data integrity by restricting the values that can be stored in a column to a predefined set. This helps prevent invalid data from being inserted into the database.

Readability: ENUM can improve the readability of your data by providing meaningful labels for column values. This makes it easier for developers and users to understand the data without needing to refer to external documentation.

Performance: ENUM can offer better performance compared to VARCHAR for certain operations because it stores values internally as integers rather than strings. This can lead to faster data retrieval and comparisons.

So let's alters the definition of the HasLiquidWater column in the Moons table.

ALTER TABLE Moons 
MODIFY COLUMN HasLiquidWater 
ENUM('Yes', 'No', 'Unknown', 'Suspected');

The above SQL changes the data type of the HasLiquidWater column from its current VARCHAR data type to an ENUM data type with the specified list of permissible values.

#Before:
mysql> DESCRIBE Moons;
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| PlanetName     | varchar(25)  | NO   | PRI | NULL    |       |
| MoonName       | varchar(100) | NO   | PRI | NULL    |       |
| HasLiquidWater | varchar(10)  | YES  |     | NULL    |       |
+----------------+--------------+------+-----+---------+-------+

#After ALTER Table
mysql> DESCRIBE MOONS;
+----------------+----------------------------------------+------+-----+---------+-------+
| Field          | Type                                   | Null | Key | Default | Extra |
+----------------+----------------------------------------+------+-----+---------+-------+
| PlanetName     | varchar(25)                            | NO   | PRI | NULL    |       |
| MoonName       | varchar(100)                           | NO   | PRI | NULL    |       |
| HasLiquidWater | enum('Yes','No','Unknown','Suspected') | YES  |     | NULL    |       |
+----------------+----------------------------------------+------+-----+---------+-------+

Let's seed our Moons table with some data found on wikipedia.

INSERT INTO Moons (PlanetName, MoonName, HasLiquidWater) VALUES
('Earth', 'Moon', 'Yes'),
('Jupiter', 'Io', 'No'),
('Jupiter', 'Europa', 'Yes'),
('Jupiter', 'Ganymede', 'Yes'),
('Jupiter', 'Callisto', 'Yes'),
('Mars', 'Phobos', 'No'),
('Mars', 'Deimos', 'No'),
('Neptune', 'Triton', 'Yes'),
('Neptune', 'Nereid', 'Unknown'),
('Neptune', 'Proteus', 'Unknown'),
('Neptune', 'Larissa', 'Unknown'),
('Saturn', 'Titan', 'Yes'),
('Saturn', 'Rhea', 'Unknown'),
('Saturn', 'Iapetus', 'Suspected'),
('Saturn', 'Dione', 'Unknown'),
('Uranus', 'Titania', 'Unknown'),
('Uranus', 'Oberon', 'Unknown'),
('Uranus', 'Umbriel', 'Unknown'),
('Uranus', 'Ariel', 'Unknown');

Since PlanetName and MoonName together form the composite primary key for our MOONS table, neither PlanetName nor MoonName can be NULL, as NULL values are not allowed in primary key columns.

Given this constraint, we need to provide a non-NULL value for both PlanetName and MoonName in every row we insert into the MOONS table.

Since 'Venus' has no moons, we should omit the row for 'Venus' entirely from our INSERT INTO statement, as it cannot have a corresponding MoonName value due to the composite primary key constraint.

Let's retrieve all rows and columns from the MOONS table and display them as a result set.

SELECT * FROM MOONS;

Executing this SQL query will fetch all rows and columns from the MOONS table and display them as a result set. Each row will represent a moon associated with a planet, with columns showing the PlanetName, MoonName, and HasLiquidWater for each moon.

+------------+-----------+----------------+
| PlanetName | MoonName  | HasLiquidWater |
+------------+-----------+----------------+
| Earth      | Moon      | Yes            |
| Jupiter    | Io        | No             |
| Jupiter    | Europa    | Yes            |
| Jupiter    | Ganymede  | Yes            |
| Jupiter    | Callisto  | Yes            |
| Mars       | Phobos    | No             |
| Mars       | Deimos    | No             |
| Neptune    | Triton    | Yes            |
| Neptune    | Nereid    | Unknown        |
| Neptune    | Proteus   | Unknown        |
| Neptune    | Larissa   | Unknown        |
| Saturn     | Titan     | Yes            |
| Saturn     | Rhea      | Unknown        |
| Saturn     | Iapetus   | Suspected      |
| Saturn     | Dione     | Unknown        |
| Uranus     | Titania   | Unknown        |
| Uranus     | Oberon    | Unknown        |
| Uranus     | Umbriel   | Unknown        |
| Uranus     | Ariel     | Unknown        |
+------------+-----------+----------------+

Let's say we want to find moons with liquid water that orbit planets with a day length of less than 11 hours. To do this we can use a SQL query with a JOIN operation between the MOONS and PLANETS tables:

SELECT Moons.*
FROM Moons
JOIN Planets ON Moons.PlanetName = Planets.PlanetName
WHERE Moons.HasLiquidWater = 'Yes' AND Planets.DayLength < 11;

In most SQL database systems, when you use the keyword JOIN without specifying a specific type of join (such as INNER JOIN, LEFT JOIN, RIGHT JOIN, or FULL JOIN), it defaults to an INNER JOIN.

In SQL, an INNER JOIN returns only the rows where there is a match in both tables being joined based on the specified join condition. In this case, the join condition is Moons.PlanetName = Planets.PlanetName, meaning that only rows where the PlanetName column in the Moons table matches the PlanetName column in the Planets table will be included in the result set.

So, the query retrieves rows from the Moons table that have corresponding matching rows in the Planets table based on the common PlanetName column. This is characteristic of an INNER JOIN.

mysql> SELECT Moons.*
    -> FROM Moons
    -> JOIN Planets ON Moons.PlanetName = Planets.PlanetName
    -> WHERE Moons.HasLiquidWater = 'Yes' AND Planets.DayLength < 11;
+------------+----------+----------------+
| PlanetName | MoonName | HasLiquidWater |
+------------+----------+----------------+
| Jupiter    | Callisto | Yes            |
| Jupiter    | Europa   | Yes            |
| Jupiter    | Ganymede | Yes            |
+------------+----------+----------------+
3 rows in set (0.02 sec)

Just to confirm this, lets look at our Astronomy database again and select all the planets from the planets table with a day length less than 11 hours.

mysql> SELECT *
    -> FROM planets
    -> WHERE DayLength < 11;
+------------+-----------+------------+
| PlanetName | DayLength | YearLength |
+------------+-----------+------------+
| Jupiter    |        10 |       4331 |
+------------+-----------+------------+
1 row in set (0.00 sec)

And let's retrieve all of Jupiter's moons and see which ones have water:

mysql> SELECT Moons.MoonName, Moons.HasLiquidWater
    -> FROM Moons
    -> JOIN Planets ON Moons.PlanetName = Planets.PlanetName
    -> WHERE Moons.PlanetName = 'Jupiter';
+----------+----------------+
| MoonName | HasLiquidWater |
+----------+----------------+
| Callisto | Yes            |
| Europa   | Yes            |
| Ganymede | Yes            |
| Io       | No             |
+----------+----------------+
4 rows in set (0.00 sec)

Let's see Which planets have moons with suspected liquid water and a day length less than Earth's. In this scenario, we need data from both the PLANETS and MOONS tables. Specifically, we need the PlanetName, HasLiquidWater, and DayLength columns.

Since the information we need is spread across multiple tables (PLANETS and MOONS), we need to join these tables together based on their common column, which is PlanetName. This allows us to access data from both tables simultaneously.

We then need to apply filters to the joined tables to narrow down the results. We need to filter the data to include only planets with moons having suspected liquid water (HasLiquidWater = 'Suspected') and a day length less than Earth's.

mysql> SELECT DISTINCT p.PlanetName
    -> FROM PLANETS p
    -> JOIN MOONS m ON p.PlanetName = m.PlanetName
    -> WHERE m.HasLiquidWater = 'Suspected'
    -> AND p.DayLength < (SELECT DayLength FROM PLANETS WHERE PlanetName = 'Earth');
+------------+
| PlanetName |
+------------+
| Saturn     |
+------------+
1 row in set (0.01 sec)

Let's break down the SQL query step by step:

  1. SELECT DISTINCT p.PlanetName: This part of the query selects the distinct PlanetName column from the PLANETS table. It specifies the alias p for the PLANETS table.

  2. FROM PLANETS p: Here, we specify the table PLANETS with the alias p to indicate that we are selecting data from the PLANETS table and referring to it as p in the query.

  3. JOIN MOONS m ON p.PlanetName = m.PlanetName: This line of code joins the PLANETS table (p) with the MOONS table (m) based on the PlanetName column. It connects the moons to their respective planets based on the shared PlanetName column.

  4. WHERE m.HasLiquidWater = 'Suspected': This part of the query applies a filter to the result set. It restricts the results to only include rows where the HasLiquidWater column in the MOONS table is marked as 'Suspected', indicating that the moon is suspected to have liquid water.

  5. AND p.DayLength < (SELECT DayLength FROM PLANETS WHERE PlanetName = 'Earth'): This condition further filters the results. It ensures that only planets with a day length less than that of Earth are included. It compares the DayLength of each planet (p.DayLength) to the day length of Earth, which is obtained through a subquery (SELECT DayLength FROM PLANETS WHERE PlanetName = 'Earth').

  6. Result: After executing the query, the result is displayed in a tabular format. In this specific case, the result includes only the planet "Saturn" because it meets the specified conditions: its moons are suspected to have liquid water, and its day length is less than that of Earth.


Exporting the Database

To export our Astronomy database and import it into another MySQL database on a different server, we can use the mysqldump command.

The mysqldump command is a utility provided by MySQL for creating backups of MySQL databases. It allows you to dump the contents of a MySQL database into a SQL text file, which can then be used to recreate the database or restore its contents later.

  1. Open a terminal or command prompt on your local machine.

  2. Navigate to the directory where you want to store the backup SQL file.

  3. Use the mysqldump command to export the Astronomy database to a SQL file. Replace username with your MySQL username

  4. After running the command, you'll be prompted to enter your MySQL password. Once entered, the SQL dump will be created in the specified file.

mysqldump -u your_username -p astronomy > astronomy_backup.sql

the astronomy_backup.sql file will be created in the current directory, containing the SQL commands to recreate the Astronomy database schema and insert all its data.

Importing the Database

  1. Transfer the astronomy_backup.sql file to the server where you want to import the database.

  2. Log in to the MySQL server on the new server where you want to import the database.

  3. Log in to the MYSQL server and create a new database with the same name as the original Astronomy database:

CREATE DATABASE Astronomy;

Log out of the MySQL server and in the command prompt use the mysql command to import the SQL file into the newly created database. Replace username with your MySQL username, password with your MySQL password, and database_name with the name of your database:

mysql -u username -p password Astronomy < astronomy_backup.sql

After running this command, you'll be prompted to enter your MySQL password. Once entered, the SQL commands from the astronomy_backup.sql file will be executed, recreating the Astronomy database schema and inserting all its data into the new database.

If you are using the Coursera labs , there is no need to use the user name ans password, as this is automatically set for you as soon as you log in to the labs. You can simply use:

$ mysql Astronomy < /home/coder/project/astronomy_backup.sql

That's it! You've successfully exported your Astronomy database, transferred it to another server, and imported it into a new MySQL database.


Resources