For this search application, we need two tables. One holds each URL that we have gathered keywords from. The other stores each keyword along with a reference to the URL it is associated with. Both tables are very simple, with only two columns apiece. For the purposes of this tutorial, we will assume you are storing these tables in a database named test.
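If the test database does not exist yet on your server, a minimal sketch for creating and selecting it might look like the following. It assumes your MySQL account is allowed to create databases; if test already exists, the first statement does nothing.

```sql
-- Create the database only if it is missing, then make it the default
-- for the CREATE TABLE statements that follow.
CREATE DATABASE IF NOT EXISTS test;
USE test;
```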
The urls table
The urls table has two columns. The first is a unique id that we will use in the keywords table to associate each keyword back to its URL. The second column is the URL itself. We have created a primary key on the id column, which ensures that the column never contains duplicate values. We have also added an index named url_idx on the url column. This will speed up lookups by URL in our search application.
CREATE TABLE urls (
  -- Unique id, referenced by keywords.url_id
  id int(10) unsigned NOT NULL auto_increment,
  -- The URL the keywords were gathered from
  url varchar(100) NOT NULL default '',
  PRIMARY KEY (id),
  KEY url_idx (url)
) ENGINE=MyISAM COMMENT='Table to hold URLs';
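To illustrate how the id column and the url_idx index come into play, here is a short sketch of adding a URL and looking it up again. The URL is a placeholder chosen for illustration, not one used by the tutorial.

```sql
-- The URL below is a placeholder, not one from the tutorial.
INSERT INTO urls (url) VALUES ('http://www.example.com/');

-- id was filled in by auto_increment; fetch the value just generated
-- so it can be stored as url_id in the keywords table.
SELECT LAST_INSERT_ID();

-- Check whether a URL is already stored; this lookup can use url_idx.
SELECT id FROM urls WHERE url = 'http://www.example.com/';
```

Because id is generated automatically, the INSERT only needs to supply the url value.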
The keywords table
The keywords table also has only two columns. One holds the keyword, and the other, url_id, holds the id of the URL the keyword is associated with. We have created two indexes on this table. The first, named keyword_idx, is an index on the keyword column. It will allow for faster searching on that column. The second index, called url_id_idx, is on the url_id column and will speed up our queries when we later join the two tables together.
CREATE TABLE keywords (
  -- id of the matching row in the urls table
  url_id int(11) NOT NULL default '0',
  -- A single keyword gathered from that URL
  keyword varchar(100) NOT NULL default '',
  KEY keyword_idx (keyword),
  KEY url_id_idx (url_id)
) ENGINE=MyISAM COMMENT='Table to store keywords';
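As a rough preview of how the two tables fit together, the kind of query a search might run could look like the sketch below. The search term is a placeholder, and the table aliases k and u are only there for readability; the actual queries used later in the application may differ.

```sql
-- 'mysql' is a placeholder search term.
-- Find every URL that has this keyword stored against it.
SELECT u.url
FROM keywords AS k
INNER JOIN urls AS u ON u.id = k.url_id
WHERE k.keyword = 'mysql';
```

Prefixing the statement with EXPLAIN shows which indexes MySQL actually picks for your data.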
Once you have those tables created in your database, you are ready to move on to the script that will gather the keywords.