For example, this table
|Coldplay||Chris Martin, Jonny Buckland, Guy Berryman|
|Chainsmokers||Andrew Taggart, Alex Pall|
|Fall Out Boy||Patrick Stump, Joe Trohman, Pete Wentz|
does not contain atomic values because the members column has multiple values of the same type (Chris Martin and Jonny Buckland are the same type of thing and there are multiple of those in a cell )
This next table is also not atomic because multiple columns represent the same sort of thing (member1, member2, member3).
|Coldplay||Chris Martin||Jonny Buckland||Guy Berryman|
|Chainsmokers||Andrew Taggart||Alex Pall|
|Fall Out Boy||Patrick Stump||Joe Trohman||Pete Wentz|
The following table meets both criteria of First Normal Form (1NF) because all the values are atomic and each row has a unique key. The username is unique and in the table we represent the primary key with PK:
Synthetic Primary Key is a key that is not part of the original Data. So the
username column in the table above was not a synthetic primary key but the
id column in the following table is:
|1||Santa Fe||New Mexico||NM||67947|
Alternatively we can make a two column primary key. So, for example, the city and state columns combined may be unique:
|city (PK)||state (PK)||ST||population|
|Santa Fe||New Mexico||NM||67947|
This is called a Composite key (a key made up of more than one column.)
The value of one column depends on another column.
Some non-key column is dependent is dependent on some but not all of the columns of the composite key.
A non-key column is related to another non-key column
So in the above table repeated here:
|city (PK)||state (PK)||ST||population|
|Santa Fe||New Mexico||NM||67947|
We have a functional dependency between the
ST column and the
state column. If we change the value in the
state column we would need to change the value in the
ST column. It is a partial functional dependency by definition (some non-key column is dependent is dependent on some but not all of the columns of the composite key).
Suppose we have a university class schedule:
|10405||cpsc110||Introduction to computer science||4||3||MWF|
|10406||cpsc110||Introduction to computer science||3||3||TTh|
If a table is in 1NF and has a one column primary key it is in 2NF. This table is in 2NF but not 3NF. The title column is dependent on the course column and this is a transitive functional dependency.
The solution is to divide the table into two tables:
|cpsc110||Introduction to computer science||3|
In the sections table, the course column is a foreign key.
CREATE TABLE courses (course TEXT PRIMARY KEY, title TEXT, credits, NUMERIC, si CHAR(1)); CREATE TABLE sections (crn NUMERIC PRIMARY KEY, course TEXT REFERENCES courses (course), section NUMERIC, days TEXT);
Suppose we want to find the course (cpsc110) and title (Introduction to Computer Science) that meet on TTh:
SELECT courses.course, courses.title FROM courses JOIN sections ON courses.course = sections.course WHERE sections.days = 'TTh';
This table is in 1NF but not 2NF.
|band||member||member real name||band formed||birthplace|
|Imagine Dragons||Dan Reynolds||Daniel Reynolds||2008||Las Vegas, NV|
|Imagine Dragons||Wing Sermon||Daniel Wayne Sermon||2008||American Fork, UT|
|Imagine Dragons||. Daniel Platzman||Daniel Platzman||2008||Atlanta, GA|
|Imagine Dragons||Ben McKee||Benjamin Arthur McKee||2008||Forestville, CA|
|Egyptian||Dan Reynolds||Daniel Reynolds||2010||Las Vegas, NV|
|Egyptian||Aja Volkman||Aja Volkman-Reynolds||2010||Eugene, OR|
A quick discussion of different types of relations
|1||Imagine Dragons||alternative rock||2008|
|3||Baby Metal||heavy metal||2010|
|5||Earth, Wind & Fire||funk||1969|
|1||Dan Reynolds||Daniel Reynolds||Las Vegas, NV|
|2||Wing Sermon||Daniel Wayne Sermon||American Fork, UT|
|3||Daniel Platzman||Daniel Platzman||Altlanta, FA|
|4||Ben McKee||Benjamin Arthur McKee||Forestville, CA|
|5||Aja Volkman||Aja Volkman-Reynolds||Eugene, OR|
We connect this table with a junction table:
|band_id (PK)||musician_id (PK)|
We can represent this as follows:
DROP DATABASE bands;
CREATE DATABASE bands; \c bands; CREATE TABLE band ( band_id integer PRIMARY KEY, name text, genre text, formed integer ); INSERT INTO band (band_id, name, genre, formed) VALUES (1, 'Baby Metal', 'heavy metal', 2010), (2, 'Imagine Dragons', 'alternative rock', 2008), (3, 'Sakura Gakuin', 'J-pop', 2010), (4, 'Egyptian', 'alternative rock', 2010), (5, 'Earth Wind & Fire', 'funk', 1969); CREATE TABLE musician ( musician_id integer PRIMARY KEY, stagename text, realname text, origin text, birthdate DATE ); INSERT INTO musician (musician_id, stagename, realname, origin, birthdate) VALUES (1, 'Su-Metal', 'Suzuka Nakamoto', 'Hiroshima Prefecture, Japan', '1997-12-20'), (2, 'MoaMetal', 'Moa Kikuchi', ' Kanagawa Prefecture, Japan', '1999-7-4'), (3, 'YuiMetal', 'Yui Mizuno', 'Aichi Prefecture, Japan', '1999-06-20'), (4, 'Aiko Yamaide', 'Aiko Yamaide', 'Kagoshima Prefecture, Japan', '2002-12-01'), (5, 'Mariri Sugimoto', 'Mariri Sugimoto', 'Hiroshima Prefecture, Japan', '2000-08-04'), (6, 'Hana Taguchi', 'Hana Taguchi', 'Nagano Prefecture, Japan', '1999-7-4'), (7, 'Dan Reynolds', 'Daniel Reynolds', 'Las Vegas, Nevada, United States', '1987-07-14'), (8, 'Wing', 'Daniel Wayne Sermon', ' American Fork, Utah, United States', NULL), (9, 'Aja Volkman-Reynolds', 'Aja Volkman-Reynolds', 'Eugene, OR', '1987-01-01'), (10, 'Daniel Platzman', 'Daniel Platzman', ' Atlanta, Georgia, United States', '1986-09-28'), (11, 'Ben McKee', 'Benjamin Arthur McKee', 'Forestville, California, United States', '1985-04-07'); create table bandmember ( band_id integer REFERENCES band(band_id), member_id integer REFERENCES musician(musician_id), PRIMARY KEY (band_id, member_id)); INSERT INTO bandmember VALUES (1, 1), (1, 2), (1, 3), (2, 7), (2, 8), (2, 10), (2, 11), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (4, 7), (4, 9); create table instrument ( id integer PRIMARY KEY, name text, family text ); insert into instrument (id, name, family) VALUES (1, 'vocals', 'voice'), (2, 'screams', 'voice'), (3, 'dance', 'dance'), (4, 'koto', 'string'), (5, 'electric guitar', 'string'), (6, 'drums', 'percussion'), (7, 'bass', 'string'), (8, 'keyboards', 'keyboards'); create table plays ( member_id integer REFERENCES musician(musician_id), instrument_id integer REFERENCES instrument(id), PRIMARY KEY (member_id, instrument_id)); insert into plays VALUES (1, 1), (1, 3), (2, 2), (2, 3), (3, 2), (3,3), (7, 1), (7, 8), (8, 5), (9, 1), (9,8), (10, 6), (10,1), (11, 7);
SELECT musician.stagename FROM musician JOIN bandmember ON musician.musician_id = bandmember.member_id JOIN band ON bandmember.band_id = band.band_id WHERE band.name = 'Imagine Dragons'; stagename ----------------- Dan Reynolds Wing Daniel Platzman Ben McKee (4 rows)
SELECT instrument.name FROM instrument JOIN plays ON instrument.id = plays.instrument_id JOIN musician ON musician.musician_id = plays.member_id WHERE musician.stagename = 'Ben McKee';
Suppose as part of the bands database we have a table specifying who is friends with whom.
CREATE TABLE friends ( person INTEGER REFERENCES musician(musician_id), friends_with INTEGER REFERENCES musician(musician_id)); INSERT INTO friends (person, friends_with) VALUES (7, 8), (7, 9), (9, 11);
For example, the first insert represents that Dan Reynolds is friends with Wing. Let’s consider how we would write a query to find out who is Dan Reynolds friends with. At first we might think to write:
SELECT musician.stagename FROM musician JOIN friends ON musician.musician_id = friends.person WHERE musician.stagename = 'Dan Reynolds';
You will see that this doesn’t work as all it returns is
For this query we actually need two copies of the musicians table– one to find the musician_id of Dan Reynolds. Once we find out his id is 7, look at the friends table and discover he is friends with id 8, we need another table to lookup the stagename for id 8. We implement it as follows:
SELECT m1.stagename FROM musician as m1 JOIN friends ON m1.musician_id = friends.friends_with JOIN musician as m2 ON friends.person = m2.musician_id WHERE m2.stagename = 'Dan Reynolds'; stagename ---------------------- Wing Aja Volkman-Reynolds (2 rows)
Suppose we want to know who is in bands with Dan Reynolds. For that the query is
SELECT m1.stagename FROM musician as m1 JOIN bandmember AS b1 ON m1.musician_id = b1.member_id JOIN bandmember as b2 ON b1.band_id = b2.band_id JOIN musician AS m2 ON m2.musician_id = b2.member_id WHERE m1.stagename <> 'Dan Reynolds' AND m2.stagename = 'Dan Reynolds';
Question: why do we have
m1.stagename <> 'Dan Reynolds'?