Arrange/distribute array items uniformly Arrange/distribute array items uniformly arrays arrays

Arrange/distribute array items uniformly


Here's a solution that avoids repeating patterns whenever possible.

For AAAAABBBCC it would generate ABABABACAC;

For AAAAABBBCCC it would generate ABCABABACAC;

Apart from sorting by type count, it runs in linear time (it accepts an unsorted data array). The result is in $distributed_data. For explanation see below.

Code

$data = array(  array( "name" => "SomeName", "type" => "A"),  array( "name" => "SomeName", "type" => "A"),  array( "name" => "SomeName", "type" => "A"),  array( "name" => "SomeName", "type" => "B"),  array( "name" => "SomeName", "type" => "B"),);$distributed_data = array();$counts = array();$size = sizeof($data);// Count valuesforeach ($data as $entry) {  $counts[$entry["type"]] = isset($counts[$entry["type"]]) ? $counts[$entry["type"]] + 1 : 1;}// Set counterfor ($i = 0; $i < $size; $i++) {  $data[$i]["count"] = $counts[$data[$i]["type"]];}// Sort by countusort($data, function($entry1, $entry2) {    return $entry2["count"] <=> $entry1["count"];});// Generate the distributed array$max_length = $data[0]["count"];$rows = ceil($size / $max_length);$last_row = ($size - 1) % $max_length + 1;$row_cycle = $rows;$row = 0;$col = 0;for ($i = 0; $i < $size; $i++) {  if ($i == $rows * $last_row) {    $row_cycle -= 1;  }  $distributed_data[$i] = $data[$row * $max_length + $col];  $row = ($row + 1) % $row_cycle;  if ($row == 0) {    $col++;  }}

Explanation

First, order the entries according to the number of repetitions each type has. E.g. CBBCAAB becomes BBBAACC.

Then imagine a table that has as many columns as the most frequent occurrence (e.g. if you have AAAABBCC, the most frequent occurrence would be 4, and the table would have 4 columns).

Then write all entries into the table, left to right, jumping to new row as necessary.

E.g. for AAAAABBBCCC you would get a table like this:

Table example

To generate the final distributed array just read the entries top-down, shifting to a new column as necessary.

In the above example, you would get ABCABABACAC.

The only way to get repeating entries is to either have two of the same characters in a column, or meet the same character when shifting to a column on the right.

The first scenario can't happen because a character group would need to wrap around and this can't happen, because there is no character group longer than the number of columns (that's how we defined the table).

The second scenario can only happen when the second row isn't full. E.g. AAAABB leaves the second row with two empty cells.


Algorithm:

function distribute($data) {    $groups = [];    foreach ($data as $row) {        $groups[$row['type']][] = $row;    }    $groupSizes = array_map('count', $groups);    asort($groupSizes);    $result = [];    foreach ($groupSizes as $type => $groupSize) {        if (count($result) == 0) {            $result = $groups[$type];        } elseif (count($result) >= count($groups[$type])) {            $result = merge($result, $groups[$type]);        } else {            $result = merge($groups[$type], $result);        }    }    return $result;}function merge($a, $b) {    $c1 = count($a);    $c2 = count($b);    $result = [];    $i1 = $i2 = 0;    while ($i1 < $c1) {        $result[] = $a[$i1++];        while ($i2 < $c2 && ($i2+1)/($c2+1) < ($i1+1)/($c1+1)) {            $result[] = $b[$i2++];        }    }    return $result;}

The main idea is to split the data into groups and merge the next smallest group into the result (starting with an empty result).

While merging two arrays the items are sorted by a float key, which is calculated (on the flow) in this line

while ($i2 < $c2 && ($i2+1)/($c2+1) < ($i1+1)/($c1+1))

as

floatKey = (index + 1) / (groupSize + 1)

(This part however can be improved, so the distance to the "corners" (0 and 1) would be half as big as the distance between two items).

On tie the item from the bigger group comes first.

Example: Merging AAAA and BB the keys for A would be 0.2, 0.4, 0.6, 0.8 anf for B - 0.33, 0.66. The result would be

A(0.2), B(0.33), A(0.4), A(0.6), B(0.66), A(0.8)

Tests:

$testData = [    'AAAAABBBCC',    'AAAAABBBCCC',    'ABBCCC',    'AAAAAABBC',    'AAAAAABBBBCCD',    'AAAAAAAAAABC',    'hpp',    'stackoverflow',    'ACCD', // :-)];$results = [];foreach ($testData as $dataStr) {    $a = str_split($dataStr);    $data = [];    foreach ($a as $type) {        $data[] = ['type' => $type];    }    $result = distribute($data);    $resultStr = implode(array_column($result, 'type'));    $results[$dataStr] = $resultStr;}var_export($results);

Test results:

'AAAAABBBCC' => 'BACABACABA','AAAAABBBCCC' => 'CABACABACAB','ABBCCC' => 'BCACBC','AAAAAABBC' => 'ABAACAABA','AAAAAABBBBCCD' => 'BACABADABACAB','AAAAAAAAAABC' => 'AAACAAAABAAA','hpp' => 'php','stackoverflow' => 'sakeofwlrovct','ACCD' => 'ACDC',

Test demo: http://rextester.com/BWBD90255

You can easily add more test cases to the demo.


You should take a sorted array of sorted types and do iterative walk through it step by step changing selected type by one.

$data = array(  array( "name" => "SomeName1", "type" => "A"),  array( "name" => "SomeName2", "type" => "A"),  array( "name" => "SomeName3", "type" => "A"),  array( "name" => "SomeName4", "type" => "A"),  array( "name" => "SomeName5", "type" => "A"),  array( "name" => "SomeName6", "type" => "B"),  array( "name" => "SomeName7", "type" => "B"),  array( "name" => "SomeName8", "type" => "B"),  array( "name" => "SomeName9", "type" => "C"),  array( "name" => "SomeName0", "type" => "C"));$dataSorted = array();$counts = array();foreach($data as $elem) {    // just init values for a new type    if(!isset($counts[$elem['type']])) {        $counts[$elem['type']] = 0;        $dataByType[$elem['type']] =  array();    }    // count types    $counts[$elem['type']]++;    // save it to grouped array    $dataByType[$elem['type']][] =  $elem;}// sort it to A=>5, B=>3 C=>2arsort($counts, SORT_NUMERIC);// get sorted types as an array$types = array_keys($counts);// index will be looped 0 -> count($types) - 1 and then down to 0 again$currentTypeIndex = 0;// make a walk on sorted array. First get the most popular, then less popular etc.// when all types are added, repeatwhile(count($dataSorted) < count($data)) {    $currentType = $types[$currentTypeIndex];    // skip adding if we ran out this type    if($counts[$currentType]) {        // pop an element of selected type        $dataSorted[] = array_pop($dataByType[$currentType]);        // decrease counter        $counts[$currentType]--;    }    // choose next type    $currentTypeIndex = (++$currentTypeIndex)%count($types);}print_r($dataSorted);

The code outputs elements in order of ABCABCABAA.

UPD. trailing doubling takes place in case count(maxtype) > count(nexttype) + 1