MySQL 5.5 Table partition user and friends


First, I would recommend upgrading to MySQL 5.6.5 or later if possible, so that you get partitioning that works properly and performs at its best. This is not always possible because of GA concerns, but in my experience there was a difference in performance between 5.5 and 5.6, and 5.6 offers some other types of partitioning.

1) My experience is that inserts and updates ARE faster on partitioned sets, as are selects, AS LONG AS YOU INCLUDE THE COLUMN YOU ARE PARTITIONING ON IN THE QUERY. If I ask for a count of all records across all partitions, I see slower responses. That is to be expected, because the partitions function LIKE separate tables: with 30 partitions it is like reading 30 tables instead of one.

You must include the column you are partitioning on in the primary key, AND its value must remain stable for the life of the record.

2) I would include user_id and id in the primary key, assuming that your friends table's user_id and id do not change at all once the record is established (i.e. any change would be a delete/insert). In my case it was "redundant" but more than worth it for the access speed. Whether you choose user_id/id or id/user_id depends on your most frequent access pattern.
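As a minimal sketch of that key layout: the column names, the HASH partitioning type, and the partition count of 8 are my assumptions for illustration, not something taken from the question. The extra index on id is there because InnoDB wants an AUTO_INCREMENT column to be the first column of some index.

```sql
-- Hypothetical friends table: the partitioning column (user_id) is part of
-- the primary key, as MySQL requires for every unique key on the table.
CREATE TABLE friends (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id   INT UNSIGNED NOT NULL,
    friend_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, id),   -- user_id first: lookups are mostly by user
    KEY idx_id (id)              -- AUTO_INCREMENT column must lead an index
) ENGINE = InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 8;

-- The query names the partitioning column, so only one partition is read.
SELECT friend_id FROM friends WHERE user_id = 999999;
```

If most of your access is by id instead, swap the order to PRIMARY KEY (id, user_id).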

A final note. I tried to create LOTS of partitions when I first started breaking my data into partitions, and found that just a few seemed to hit the sweet spot - 6-12 partitions seemed to work best for me. YMMV.
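One way to judge whether a given partition count is spreading the data sensibly is to look at the per-partition row counts. This is only a sketch: the table name friends is an assumption, and TABLE_ROWS is an estimate for InnoDB, so treat the numbers as approximate.

```sql
-- Approximate row count per partition for a table in the current database.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'friends'      -- hypothetical table name
ORDER BY PARTITION_ORDINAL_POSITION;
```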


1. For the user table, use the SQL queries below, which handle every column except id as usual:

Here is the answer to what you need:

I suggest that you remove the FOREIGN KEY and the PRIMARY KEY.

I know this sounds crazy, but with those keys the server has to work out the current id, the last id, and the next id, and that takes longer than creating the id manually. Alternatively, you can generate the integer id yourself in Java.

Use this SQL query to insert quickly:

INSERT INTO user (id, NAME, email) VALUES ('CREATE ID WITH JAVA', 'NAME', 'EMAIL@YAHOO.COM');

I can't say for certain whether my query will be faster or not...

It all depends on your machine's performance. Make sure you run it on a server, because a server can finish all the tasks quickly.

As for select: on the page that shows the profile info, you need one row for the one user identified by the profile id.

Use MySQL's LIMIT if you only need one row; if you need more than one, just change the limit value. For one row:

select * from user where id = 999999 limit 1;

and for seven rows:

select * from user where id = 999999 limit 7;

I think this query will run faster than one without LIMIT. Remember that LIMIT can also be used with INSERT ... SELECT.

2. For the friends partition: the answer is to drop the primary key.

A table with no primary key is no problem.

Once again, create the id with Java. Java is designed to be fast, and your code includes a while loop, which Java can handle. For example, if you need to retrieve all of your friends' data, use this query for better performance:

select fr.friend_id, usr.* from friends as fr INNER JOIN user as usr ON fr.friend_id = usr.id where fr.user_id = 999999 LIMIT 10;

I think this is enough. Sorry, I can only explain the MySQL side and not the Java side, because I'm not an expert in Java, although I understand it.


1) If you always (or mostly) select data by id only, it is the obvious column to base the partitioning condition on. Since it is a number, there is no need for a hash function; simply use range partitioning. How many partitions to create (and which numbers to choose as boundaries) you need to work out yourself, but as @TJChambers mentioned, around 8-10 should be efficient enough.
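A rough sketch of what range partitioning on id could look like. The column list and the boundary values are assumptions on my part (eight partitions sized for roughly 1,000,000 ids); pick boundaries that match your own data.

```sql
-- Hypothetical user table, range partitioned on id.
CREATE TABLE user_partition (
    id    INT UNSIGNED NOT NULL,
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)              -- the partitioning column is in the key
) ENGINE = InnoDB
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (125000),
    PARTITION p1 VALUES LESS THAN (250000),
    PARTITION p2 VALUES LESS THAN (375000),
    PARTITION p3 VALUES LESS THAN (500000),
    PARTITION p4 VALUES LESS THAN (625000),
    PARTITION p5 VALUES LESS THAN (750000),
    PARTITION p6 VALUES LESS THAN (875000),
    PARTITION p7 VALUES LESS THAN MAXVALUE   -- catches all larger ids
);
```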

Your inserts are slower because you are testing it the wrong way. You simply insert 1000000 rows one after another without any randomness, and the only difference is that for the partitioned table MySQL needs to calculate the hash, which takes extra time. And since in your case id is the basis of the partitioning condition, you will never gain anything on inserts, as all new rows go at the end of the table.

If, for example, you had a table of GPS locations partitioned by lat and lon, you could see a difference on inserts if, say, each partition was a different continent. You would also see a difference if you had a table with some random (real) data and were inserting random rather than sequential values.

Your select on the partitioned table is slower because, again, you are testing it the wrong way.

@TJChambers wrote about this before me: your query needs to work on all partitions (it is like working with many tables), so it takes longer. Try using a where clause that touches data from just one partition to see a difference.

for example run:

select count(*) from user_partition where id<99999;

and

select count(*) from user where id<99999;

You will see a difference.
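To check that a query really is limited to a subset of partitions, you can ask the server which partitions it plans to read. This sketch assumes user_partition is range partitioned on id; the partitions column of the output lists the partitions that will be touched.

```sql
-- With a condition on the partitioning column, only matching partitions
-- should be listed in the "partitions" column.
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM user_partition WHERE id < 99999;

-- Without a condition, every partition is listed.
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM user_partition;
```

EXPLAIN PARTITIONS is the MySQL 5.5 syntax; on later versions plain EXPLAIN should show the same column.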

2) This one is hard. There is no way to partition it without redundant data (at least none that comes to my mind), but if access time (select speed) matters most, the best way may be to partition it the same way as the user table (range on one of the ids) and insert 2 rows for each relationship, that is (a,b) and (b,a). It will double the number of rows, but if you partition into more than 4 parts you will work on fewer records per query anyway, and you will have just one condition to check, with no need for an or.
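A small sketch of how the mirrored rows might be maintained, assuming a two-column table friends_part (a, b) like the one in my test schema; the ids 317 and 4887 are just sample values. Wrapping both rows in one transaction is my suggestion so that a relationship is never stored in only one direction.

```sql
-- Store the relationship in both directions, atomically.
START TRANSACTION;
INSERT INTO friends_part (a, b) VALUES (317, 4887), (4887, 317);
COMMIT;

-- All friends of user 317: one condition, one partition.
SELECT b AS friend_id FROM friends_part WHERE a = 317;

-- Removing the relationship has to remove both rows as well.
DELETE FROM friends_part
WHERE (a = 317 AND b = 4887)
   OR (a = 4887 AND b = 317);
```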

I tested it with this schema:

CREATE TABLE `test`.`friends` (
    `a` INT NOT NULL,
    `b` INT NOT NULL,
    INDEX (`a`),
    INDEX (`b`)
) ENGINE = InnoDB;

CREATE TABLE `test`.`friends_part` (
    `a` INT NOT NULL,
    `b` INT NOT NULL,
    INDEX (`a`, `b`)
) ENGINE = InnoDB
PARTITION BY RANGE (a) (
    PARTITION p0 VALUES LESS THAN (1000),
    PARTITION p1 VALUES LESS THAN (2000),
    PARTITION p2 VALUES LESS THAN (3000),
    PARTITION p3 VALUES LESS THAN (4000),
    PARTITION p4 VALUES LESS THAN (5000),
    PARTITION p5 VALUES LESS THAN (6000),
    PARTITION p6 VALUES LESS THAN (7000),
    PARTITION p7 VALUES LESS THAN (8000),
    PARTITION p8 VALUES LESS THAN (9000),
    PARTITION p9 VALUES LESS THAN MAXVALUE
);

delimiter //
DROP procedure IF EXISTS fill_friends//
create procedure fill_friends()
begin
    declare i int default 0;
    declare a int;
    declare b int;
    while i<2000000
    do
        set a = rand()*10000;
        set b = rand()*10000;
        insert into friends values(a,b);
        set i = i + 1;
    end while;
end//
delimiter ;

delimiter //
DROP procedure IF EXISTS fill_friends_part//
create procedure fill_friends_part()
begin
    insert into friends_part (select a,b from friends);
    insert into friends_part (select b as a, a as b from friends);
end//
delimiter ;

Queries I have run are:

select * from friends where a=317 or b=317;

result set: 475 rows, times: 1.43, 0.02, 0.01

select * from friends_part where a=317;

result set: 475 rows, times: 0.10, 0.00, 0.00

select * from friends where a=4887 or b=4887;

result set: 483 rows, times: 1.33, 0.01, 0.01

select * from friends_part where a=4887;

result set: 483 rows, times: 0.06, 0.01, 0.00

I didn't bother about uniqueness of the data, but in your case you may use a unique index. I also used the InnoDB engine, but MyISAM may be better if most of the queries are selects and you are not going to do many writes.

There is no big difference for the 2nd and 3rd runs, probably because of caching, but there is a visible difference for the 1st run. It is faster because we are breaking one of the prime rules of database design, but the end justifies the means, so it may be a good solution for really big tables. If you are going to have fewer than 1M records, I think you can survive without partitioning.
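If you do want uniqueness, here is a sketch of how it could look. The table name friends_part_unique and the four boundaries are made up for the example. MySQL requires every unique key on a partitioned table to include the partitioning column, which (a, b) does.

```sql
-- Hypothetical variant of friends_part that rejects duplicate pairs.
CREATE TABLE friends_part_unique (
    a INT NOT NULL,
    b INT NOT NULL,
    PRIMARY KEY (a, b)            -- includes the partitioning column a
) ENGINE = InnoDB
PARTITION BY RANGE (a) (
    PARTITION p0 VALUES LESS THAN (2500),
    PARTITION p1 VALUES LESS THAN (5000),
    PARTITION p2 VALUES LESS THAN (7500),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

-- INSERT IGNORE skips a pair that is already stored instead of failing.
INSERT IGNORE INTO friends_part_unique (a, b)
VALUES (317, 4887), (4887, 317);
```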