Techniques for sorting data to columns

There are many ways to display data in tables. I’m going to discuss a couple alternatives to the one record – one row method. Let me jump right in with some examples.

Given the following arrays

{1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}

typical way to loop through and display data is as follows. This first example is fairly trivial; by simply looping through the array you can display a table like the one below. The array is displayed in sorted order, left to right, then top to bottom. Much like reading a book.

123
456
789

This may not be how I want the data displayed. There are cases when I want to display my data sorted alphabetically in columns (top to bottom), then left to right. I also want the data to be distributed as equally as possible among columns.

147
258
369
158
269
3710
4
AppleLGPioneer
Hewlett PackardMotorolaSamsung
HitachiPanasonicSony
JVC

If I pull an array of data from a database, or any other source, I will pull it in alphabetical order.

    Apple
    Hewlett Packard
    Hitachi
    JVC
    LG
    Motorola
    Panasonic
    Pioneer
    Samsung
    Sony

In order to print the table, it would be much easier if the array was ordered in a way that I can print the array one row at a time. If looking at the table above, with 3 columns, this array is ordered so as I print each row, the results will be alphabetical by column, reading columns left to right.

    Apple
    LG
    Pioneer
    Hewlett Packard
    Motorola
    Samsung
    Hitachi
    Panasonic
    Sony
    JVC
    [blank]
    [blank]

If I know that I will be printing to 3 columns, I can sort the array ahead of time to make the display easier. Take a look at the following function which provides the solution I’m looking for. I’ll pass in a sorted array, and the number of columns, and the function will return an array sorted as per above, allowing me to display the table by rows.


Displaying Data In Groups

In another situation, I am selecting data that will be sorted in groups. In the data below, the data is sorted by make and model. I would like to display the data in columns, sorted top to bottom, and left to right, while displaying data by groups. Review the example below to see what I’m talking about.

{"make" => "Apple", "model" => "1000"}
{"make" => "Apple", "model" => "2000"}
{"make" => "Apple", "model" => "3000"}   
{"make" => "Apple", "model" => "4000"}
{"make" => "Apple", "model" => "5000"} 

{"make" => "HP", "model" => "1000"}
{"make" => "HP", "model" => "2000"}

...
    
{"make" => "Sony", "model" => "1000"}
{"make" => "Sony", "model" => "2000"}
{"make" => "Sony", "model" => "3000"}
{"make" => "Sony", "model" => "4000"}

This example shows how I can display the data by grouping, sorted in columns, from left to right. A key point to note here is that I do not want to split a group by starting it in the bottom of one column, and continuing in the next column. A group of data should never be split across columns. This will make column heights slightly different, but we can get them close in height under most circumstances.

AppleIBMMotorola
100010001000
200020002000
30003000
4000Panasonic
5000JVC1000
10002000
Hewlett Packard2000
10003000Pioneer
200040001000
50002000
Hitachi2000
1000LG3000
200010004000
30002000
3000Sony
40001000

The code provided below is a class object designed to provide all the functionality discussed on this page. Just copy the class and include it in your project.

Show me the code

I will admit I’m not entirely satisfied with my solution. I suspect there must be a better way to mathematically calculate the heights of columns, but this is my solution for now. For those that are interested, the concept behind the chunkColumns function is as follows. First I determine how many items are in each group. I then calculate an average column height (by cell count) based on the number of columns and total cell items. After I have that information, I examine each group, and using a variance algorithm, calculate the closest point that I can get to the average column height, while still breaking on group boundaries. I store the break point position, and build the output array based on these breakpoints.

The class provides two functions which you can call.

chunkColumns($data, $field_to_group_by, $number_of_columns);
arrayColumns($data, $number_of_columns, $str_for_blank_cell); 

Leave a Reply

Your email address will not be published.

You may use these <abbr title="HyperText Markup Language">HTML</abbr> tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

*