Re-indexing huge database (the English Wikipedia) efficiently Re-indexing huge database (the English Wikipedia) efficiently mysql mysql

Re-indexing huge database (the English Wikipedia) efficiently


I really like Wikipedia so I'll try to help.

You need to use a lot of

ALTER TABLE

Add primary keys

ALTER TABLE page ADD PRIMARY KEY (page_id);ALTER TABLE revision ADD PRIMARY KEY (rev_id);ALTER TABLE text ADD PRIMARY KEY (old_id);

Add auto increments back

ALTER TABLE page MODIFY COLUMN page_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT;

I need the table descriptions for all tables before continuing. If rev_id and old_id are same definitions as page_id then:

ALTER TABLE revision MODIFY COLUMN rev_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT;ALTER TABLE text MODIFY COLUMN old_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT;

Add unique keys

ALTER TABLE page ADD UNIQUE name_title(page_namespace, page_title);ALTER TABLE revision ADD UNIQUE rev_page_id(rev_page, rev_id);

Other indexes

ALTER TABLE page ADD INDEX page_random(page_random);ALTER TABLE page ADD INDEX page_len(page_len);ALTER TABLE page ADD INDEX page_redirect_namespace(page_is_redirect, page_namespace, page_len);ALTER TABLE revision ADD INDEX rev_timestamp(rev_timestamp);ALTER TABLE revision ADD INDEX page_timestamp(rev_page, rev_timestamp);ALTER TABLE revision ADD INDEX user_timestamp(rev_user, rev_timestamp);ALTER TABLE revision ADD INDEX user_text_timestamp(rev_user_text, rev_timestamp);

Again, there may be column definitions that change this stuff. You need to provide the CREATE TABLE info.