A UDF for sorting arrays of beans (instantiated CFC’s), or arrays of structures

27 09 2008

A while a ago, I posted a discussion about different ways to represent complex data, and a UDF to sort arrays of structures on multiple keys. After reading this post, Brian Fitzgerald asked me if I knew of any UDF that could provide the same kind of functionality for an array of beans (instantiated CFC’s). This made me realize that my blog post had omitted perhaps the most useful way of representing complex data in ColdFusion (version 6 and up): ColdFusion Components!

This is not the place for an in-depth discussion of CFC’s, but if you are interested I can recommend this introduction by Hal Helms. For me, the essential advantage of CFC’s is, that they provide the possibility to encapsulate both data and functionality inside one object(variable). This makes it possible for CFC’s to represent real-life objects more naturally than any other CF datatype. CFC’s also provide great possibilities for writing re-usable and portable code, because they can contain the logic that is necessary to handle the specific data they are holding.

As for Brian’s question, it inspired me to improve my sorting UDF, so that now it is able to sort arrays of objects, the object being either instantiated CFC’s (beans), or structures. The UDF, together with two usage examples, can be downloaded here. The code of the examples is shown below .

<!--- Example: sorting an array of instantiated CFCs --->
<!--- Note: this templates depends on person.cfc being in the same folder --->
<!--- Include UDF template --->
<cfinclude template="sortArrayOfObjects.cfm">
<cfscript>
//EXAMPLE USAGE:
//an array of 4 persons, with three properties...
variables.persons = arrayNew(1);
variables.person = createObject('component','person');
variables.person.setName('Martijn');
variables.person.setAge(29);
variables.person.setWeight(78);
arrayAppend(variables.persons, variables.person);
variables.person = createObject('component','person');
variables.person.setName('Jelle');
variables.person.setAge(29);
variables.person.setWeight(85);
arrayAppend(variables.persons, variables.person);
variables.person = createObject('component','person');
variables.person.setName('Lammert');
variables.person.setAge(51);
variables.person.setWeight(78);
arrayAppend(variables.persons, variables.person);
variables.person = createObject('component','person');
variables.person.setName('Bas');
variables.person.setAge(51);
variables.person.setWeight(78);
arrayAppend(variables.persons, variables.person);
// specify how to sort:
// first on age - descending
// second on weight - ascending
// third on name - ascending
variables.sortkeys = arrayNew(1);
variables.sortKey = structNew();
variables.sortKey.keyName = "age";
variables.sortKey.sortOrder = "descending";
arrayAppend(variables.sortKeys, variables.sortKey);
variables.sortKey = structNew();
variables.sortKey.keyName = "weight";
variables.sortKey.sortOrder = "ascending";
arrayAppend(variables.sortKeys, variables.sortKey);
variables.sortKey = structNew();
variables.sortKey.keyName = "name";
variables.sortKey.sortOrder = "ascending";
arrayAppend(variables.sortKeys, variables.sortKey);
// do the sorting
variables.persons_sorted = sortArrayOfObjects(
arrayToSort = variables.persons,
sortKeys = variables.sortKeys,
doDuplicate = false,
useGetterMethods = true);
</cfscript>
<h3>Unsorted:</h3>
<cfloop from="1" to="#arrayLen(variables.persons)#" index="variables.i">
<cfset variables.person = persons[variables.i]>
<cfoutput>
<p>
Position: #variables.i# <br />
Age: #variables.person.getAge()# <br />
Weight: #variables.person.getWeight()# <br />
Name: #variables.person.getName()# </br>
</p>
</cfoutput>
</cfloop>
<h3>Sorted:</h3>
<cfloop from="1" to="#arrayLen(variables.persons_sorted)#" index="variables.i">
<cfset variables.person = persons_sorted[variables.i]>
<cfoutput>
<p>
Position: #variables.i# <br />
Age: #variables.person.getAge()# <br />
Weight: #variables.person.getWeight()# <br />
Name: #variables.person.getName()# </br>
</p>
</cfoutput>
</cfloop>

<!--- Example: sorting an array of structures --->
<!--- Include UDF template --->
<cfinclude template="sortArrayOfObjects.cfm">
<cfscript>
//     EXAMPLE USAGE:
//an array of 4 persons, with three properties...
variables.persons = arrayNew(1);
variables.person = structNew();
variables.person.name = 'Martijn';
variables.person.age = 29;
variables.person.weight = 78;
arrayAppend(variables.persons, variables.person);
variables.person = structNew();
variables.person.name = 'Jelle';
variables.person.age = 29;
variables.person.weight = 85;
arrayAppend(variables.persons, variables.person);
variables.person = structNew();
variables.person.name = 'Lammert';
variables.person.age = 51;
variables.person.weight = 78;
arrayAppend(variables.persons, variables.person);
variables.person = structNew();
variables.person.name = 'Bas';
variables.person.age = 51;
variables.person.weight = 78;
arrayAppend(variables.persons, variables.person);
// specify how to sort:
// first on age - descending
// second on weight - ascending
// third on name - ascending
variables.sortkeys = arrayNew(1);
variables.sortKey = structNew();
variables.sortKey.keyName = "age";
variables.sortKey.sortOrder = "descending";
arrayAppend(variables.sortKeys, variables.sortKey);
variables.sortKey = structNew();
variables.sortKey.keyName = "weight";
variables.sortKey.sortOrder = "ascending";
arrayAppend(variables.sortKeys, variables.sortKey);
variables.sortKey = structNew();
variables.sortKey.keyName = "name";
variables.sortKey.sortOrder = "ascending";
arrayAppend(variables.sortKeys, variables.sortKey);
// do the sorting
variables.persons_sorted = sortArrayOfObjects(
arrayToSort = variables.persons,
sortKeys = variables.sortKeys,
doDuplicate = false,
useGetterMethods = false);
</cfscript>
<h3>Unsorted:</h3>
<cfdump var="#variables.persons#">
<h3>Sorted:</h3>
<cfdump var="#variables.persons_sorted#">





Representation and sorting of complex data

6 07 2008

Yesterday I saw a blog post by Ray Camden, showing an example of how to do a bubble sort on a two-dimensional array. I like the fact that this kind of sort is so easy to implement, with so few lines of code. As Ray pointed out, however, the bubble sort not suitable for sorting large sets of data. As an alternative, Ray suggests creating the array as a query, which of course can be sorted on multiple keys by using ColdFusion’s in-memory query capabilities.

Another alternative is to replace the two-dimensional array with an array of structures. Representing a set of objects as an array of structures is something I often choose to do, and in this blog post I will try to explain my thoughts on this matter.

In cases such as described by Ray, where the information is about objects which have one or more properties, arrays of structures result into more readable and maintable code than two-dimensional arrays. Compare, for instance, the following blocks of code:

<cfset persons = arrayNew(2)>
<cfset persons[1][1] = "Martijn">
<cfset persons[1][2] = 29>
<cfset persons[2][1] = "Franca">
<cfset persons[2][2] = 33>

<cfset persons = arrayNew(1)>
<cfset personData = structNew()>
<cfset personData.firstName = "Martijn">
<cfset personData.age = 29>
<cfset arrayAppend(persons, personData)>
<cfset personData = structNew()>
<cfset personData.firstName="Franca">
<cfset personData.age = 32>
<cfset arrayAppend(persons, personData)>

OK, so the second block of code is longer, but it also provides more information: from the first block of code we would have to guess that the number at position 2 of the second index is in fact the age of the person. In the second block of code, this is explicitly stated. Dumping the array would show all properties of the persons by name, so if we pass the data around in an application it will be instantly obvious what properties we are dealing with, even without looking at the code that created the array. This also applies to debugging your code: error messages like “Element firstName is undefined in a CFML structure referenced as part of an expression. ” are much easier to decode than “Element 1 is undefined in a Java object of type class coldfusion.runtime.Array. “

Creating the array like in the second block is also less error-prone: often we would write the code for creating one record, than copy-paste-edit to create the rest of the record. Using arrays of structures, we would only have to edit the property values. When using two-dimensional arrays we would also have to edit the record number (the first index) and if you are not far more accurate than I am you are going to end up skipping or duplicating a row number, or worse: editing the wrong index.

So what about presenting the data as a query object? You could do something like this:

<cfset rs = queryNew("firstName,age")>
<cfset queryAddRow(rs)>
<cfset querySetCell(rs,"firstName","Martijn")>
<cfset querySetCell(rs,"age",29)>
<cfset queryAddRow(rs)>
<cfset querySetCell(rs,"firstName","Franca")>
<cfset querySetCell(rs,"age",33)>

I am aware of two advantages of arrays of structures over queries. The first advantage is, that an array of structures can be represented in almost any programming language, whereas query object are very specific to ColdFusion. So if you are building a webservice that is to be consumed by many different types of client technology, you might want to pick structures over query objects.

The second advantage is, that – using an array – you can pass all data pertaining to one record in one reference, like this:

<cfset secondPerson = persons[2]>
… instead of
<cfset secondPerson.firstName = persons.firstName[2]>
<cfset secondPerson.age = persons.age[2]

It is easy to see that the code referring to the query representation would have to be updated if we were to add a third property to each person. The code referring to the array-of-structs representation would stay the same.

So what are the advantages of the query object representation? The most important one to me is that queries provide a very easy way to select records based on one or more properties: you can query them as if they are database tables, like this:

<cfquery dbtyp="query" name="selectedPerson">
SELECT * FROM persons WHERE age >=30
</cfquery>

Using the array-of-structs representation, you would have to loop through the entire set of records and determine for every record wether it should be selected. Obviously, this is more tedious to implement and will certainly perfom poorly when using large datsets.

The second advantage would be that data in queries can easily be sorted on multiple key by adding an ORDER BY clause the query-of-query SQL. To overcome this, I have written a UDF that will do an insertion sort on an array of structures on multiple keys. Sort order can be specified seperately per key. The code containing the UDF and a simple example can be downloaded here. Enjoy!