Ryan Fox
2002-08-29 16:21:10 UTC
I've created a schema for cddb data storage that I submit for
comments/feedback. It is below, followed by my comments. In case it
gets mangled, it's also available at
http://www.amerisuk.com/~rfox/cddb/schema.sql . If you like, you can
plop it into MySQL (version >= 3.23.23) using a command like 'mysql -u
root -p < ./schema.sql' after you save it to a file.
create database cddb;
use cddb;
create table disc (
id int unique auto_increment not null,
length int not null,
discid char(8) not null unique primary key,
revisionlead varchar(255),
revisionversion varchar(255) not null,
revisionrelease varchar(255),
revisionlevel varchar(255),
artist varchar(255) not null,
title varchar(255) not null,
releaseyear decimal(4),
maingenre char(16) not null,
genre varchar(255) not null,
numtitles int not null,
extendeddata text,
playorder varchar(255),
fulltext key(artist), fulltext key(title), fulltext (artist,title),
key(maingenre), key(genre), key(releaseyear), key(discid)
);
create table discoffsets (
disc_id int not null,
subid int not null,
offset int not null,
key(disc_id), key(subid)
);
create table disctracks (
disc_id int not null,
number int not null,
name varchar(255) not null,
extendeddata text,
key(disc_id), key(name)
);
There are 3 tables. Disc, discoffsets, and disctracks. Disc contains
info such as album title, genre, release year, etc., and there would be
1 record per album. In the interest of data normalization, I think
artist name and genre should have separate tables. This would make it
easier to search for albums by the same artist or genre, as well as use
less disk space. I did not implement the disc table in this fashion
because of the difficulty in converting from the current format to that
normalized format. It would probably be easier if we moved from the
current format to this intermediate sql format, and then moved towards
data normalization in a later version.
The discoffsets table references an id of an album, and a subid (more on
that in a minute), and an offset. There would be 1 record per album,
per unique variation, per offset. For example, if the cddb database
contained to different versions of album A, there would be 1 record in
the disc table (as all of that info is the same between the 2 albums),
and 1 set of records in the disc offset table (1 per offset) for each
version of the album.
The disctracks table contains 1 record per track per album. In the
example above, there will only be 1 set of records (1 per song) in this
table. Although the different versions of the above album have
different offsets, they have the same song list, and should share this
list of info.
Using this method, this should take care of the current linked album
discussion, as well aas move us down a path towards better data storage.
Hopefully I'm not too far off track with this. Please respond with
thoughts/questions/flames/whatever. :)
Thanks,
Ryan
comments/feedback. It is below, followed by my comments. In case it
gets mangled, it's also available at
http://www.amerisuk.com/~rfox/cddb/schema.sql . If you like, you can
plop it into MySQL (version >= 3.23.23) using a command like 'mysql -u
root -p < ./schema.sql' after you save it to a file.
create database cddb;
use cddb;
create table disc (
id int unique auto_increment not null,
length int not null,
discid char(8) not null unique primary key,
revisionlead varchar(255),
revisionversion varchar(255) not null,
revisionrelease varchar(255),
revisionlevel varchar(255),
artist varchar(255) not null,
title varchar(255) not null,
releaseyear decimal(4),
maingenre char(16) not null,
genre varchar(255) not null,
numtitles int not null,
extendeddata text,
playorder varchar(255),
fulltext key(artist), fulltext key(title), fulltext (artist,title),
key(maingenre), key(genre), key(releaseyear), key(discid)
);
create table discoffsets (
disc_id int not null,
subid int not null,
offset int not null,
key(disc_id), key(subid)
);
create table disctracks (
disc_id int not null,
number int not null,
name varchar(255) not null,
extendeddata text,
key(disc_id), key(name)
);
There are 3 tables. Disc, discoffsets, and disctracks. Disc contains
info such as album title, genre, release year, etc., and there would be
1 record per album. In the interest of data normalization, I think
artist name and genre should have separate tables. This would make it
easier to search for albums by the same artist or genre, as well as use
less disk space. I did not implement the disc table in this fashion
because of the difficulty in converting from the current format to that
normalized format. It would probably be easier if we moved from the
current format to this intermediate sql format, and then moved towards
data normalization in a later version.
The discoffsets table references an id of an album, and a subid (more on
that in a minute), and an offset. There would be 1 record per album,
per unique variation, per offset. For example, if the cddb database
contained to different versions of album A, there would be 1 record in
the disc table (as all of that info is the same between the 2 albums),
and 1 set of records in the disc offset table (1 per offset) for each
version of the album.
The disctracks table contains 1 record per track per album. In the
example above, there will only be 1 set of records (1 per song) in this
table. Although the different versions of the above album have
different offsets, they have the same song list, and should share this
list of info.
Using this method, this should take care of the current linked album
discussion, as well aas move us down a path towards better data storage.
Hopefully I'm not too far off track with this. Please respond with
thoughts/questions/flames/whatever. :)
Thanks,
Ryan