MySQL: Alternatives to ORDER BY RAND() MySQL: Alternatives to ORDER BY RAND() sql sql

MySQL: Alternatives to ORDER BY RAND()


UPDATE 2016

This solution works best using an indexed column.

Here is a simple example of and optimized query bench marked with 100,000 rows.

OPTIMIZED: 300ms

SELECT     g.*FROM    table g        JOIN    (SELECT         id    FROM        table    WHERE        RAND() < (SELECT                 ((4 / COUNT(*)) * 10)            FROM                table)    ORDER BY RAND()    LIMIT 4) AS z ON z.id= g.id

note about limit ammount: limit 4 and 4/count(*). The 4s need to be the same number. Changing how many you return doesn't effect the speed that much. Benchmark at limit 4 and limit 1000 are the same. Limit 10,000 took it up to 600ms

note about join: Randomizing just the id is faster than randomizing a whole row. Since it has to copy the entire row into memory then randomize it. The join can be any table that is linked to the subquery Its to prevent tablescans.

note where clause: The where count limits down the ammount of results that are being randomized. It takes a percentage of the results and sorts them rather than the whole table.

note sub query: The if doing joins and extra where clause conditions you need to put them both in the subquery and the subsubquery. To have an accurate count and pull back correct data.

UNOPTIMIZED: 1200ms

SELECT     g.*FROM    table gORDER BY RAND()LIMIT 4

PROS

4x faster than order by rand(). This solution can work with any table with a indexed column.

CONS

It is a bit complex with complex queries. Need to maintain 2 code bases in the subqueries


Here's an alternative, but it is still based on using RAND():

  SELECT u.id,          p.photo,         ROUND(RAND() * x.m_id) 'rand_ind'    FROM users u,          profiles p,         (SELECT MAX(t.id) 'm_id'            FROM USERS t) x   WHERE p.memberid = u.id      AND p.photo != ''      AND (u.ownership=1 OR u.stamp=1) ORDER BY rand_ind   LIMIT 18

This is slightly more complex, but gave a better distribution of random_ind values:

  SELECT u.id,          p.photo,         FLOOR(1 + RAND() * x.m_id) 'rand_ind'    FROM users u,          profiles p,         (SELECT MAX(t.id) - 1 'm_id'            FROM USERS t) x   WHERE p.memberid = u.id      AND p.photo != ''      AND (u.ownership=1 OR u.stamp=1) ORDER BY rand_ind   LIMIT 18


It is not the fastest, but faster then common ORDER BY RAND() way:

ORDER BY RAND() is not so slow, when you use it to find only indexed column. You can take all your ids in one query like this:

SELECT idFROM testTableORDER BY RAND();

to get a sequence of random ids, and JOIN the result to another query with other SELECT or WHERE parameters:

SELECT t.*FROM testTable tJOIN    (SELECT id    FROM `testTable`    ORDER BY RAND()) AS z ON z.id= t.id   WHERE t.isVisible = 1LIMIT 100; 

in your case it would be:

SELECT u.id, p.photo FROM users u, profiles p JOIN    (SELECT id    FROM users    ORDER BY RAND()) AS z ON z.id = u.id   WHERE p.memberid = u.id   AND p.photo != ''   AND (u.ownership=1 OR u.stamp=1) LIMIT 18 

It's very blunt method and it can be not proper with very big tables, but still it's faster than common RAND(). I got 20 times faster execution time searching 3000 random rows in almost 400000.