Home All Groups Group Topic Archive Search About

Sorting Arraylist Of Objects

Author
16 Sep 2006 4:29 PM
Kittyhawk
I would like to sort an Arraylist of objects on multiple properties. For
instance, I have a Sort Index property and an ID property (both integers).
So, the results of my sort would look like this:

sort index, id
1000,1
1000,2
1000,3
1001,1
1001,2
....

I know I can implement IComparable.CompareTo to sort on one property but I
am not completely clear on how to sort on two. Thanks for any help.

Author
16 Sep 2006 5:46 PM
Tom Dacon
A good way to do this is as follows:

In the Compare method, for each object create a string that has the
concatenated string representations of the properties you want to sort on,
and then compare the strings.

For instance, suppose the first property in your example has a maximum
length, as a string, of five characters, and the second one has a maximum
length, as a string, of four characters. So the first property which you
give in your example could have values from zero to 99999, and the second
one could have values from zero to 9999. Then you'd do something like this:

   dim str1 as string = firstObj.Property1.ToString().Padleft(5, "0"c) +
firstObj.Property2.ToString().PadLeft(4, "0"c)
   dim str2 as string = secondObj.Property1.ToString().Padleft(5, "0"c) +
secondObj.Property2.ToString().PadLeft(4, "0"c)

    return String.Compare(str1, str2)

You'd be comparing strings like "010000001" and "010010002".

To sort descending on both properties, you'd:

    return -(String.Compare(str1, str2)

And to sort descending on just one property, subtract that property value
from the highest value it can hold and do the ToString and Padleft on that
value. So your first example line below, to sort ascending on property one
and descending on property two, would look like: "010009998".

HTH,
Tom Dacon
Dacon Software Consulting


Show quoteHide quote
"Kittyhawk" <Kittyh***@discussions.microsoft.com> wrote in message
news:82A1C08E-4F56-4DA4-82DA-6C2640EB0BCE@microsoft.com...
>I would like to sort an Arraylist of objects on multiple properties. For
> instance, I have a Sort Index property and an ID property (both integers).
> So, the results of my sort would look like this:
>
> sort index, id
> 1000,1
> 1000,2
> 1000,3
> 1001,1
> 1001,2
> ...
>
> I know I can implement IComparable.CompareTo to sort on one property but I
> am not completely clear on how to sort on two. Thanks for any help.
>
Author
18 Sep 2006 1:23 PM
Chris Dunaway
Tom Dacon wrote:
Show quoteHide quote
> A good way to do this is as follows:
>
> In the Compare method, for each object create a string that has the
> concatenated string representations of the properties you want to sort on,
> and then compare the strings.
>
> For instance, suppose the first property in your example has a maximum
> length, as a string, of five characters, and the second one has a maximum
> length, as a string, of four characters. So the first property which you
> give in your example could have values from zero to 99999, and the second
> one could have values from zero to 9999. Then you'd do something like this:
>
>    dim str1 as string = firstObj.Property1.ToString().Padleft(5, "0"c) +
> firstObj.Property2.ToString().PadLeft(4, "0"c)
>    dim str2 as string = secondObj.Property1.ToString().Padleft(5, "0"c) +
> secondObj.Property2.ToString().PadLeft(4, "0"c)
>
>     return String.Compare(str1, str2)
>
> You'd be comparing strings like "010000001" and "010010002".
>
> To sort descending on both properties, you'd:
>
>     return -(String.Compare(str1, str2)
>
> And to sort descending on just one property, subtract that property value
> from the highest value it can hold and do the ToString and Padleft on that
> value. So your first example line below, to sort ascending on property one
> and descending on property two, would look like: "010009998".
>
> HTH,
> Tom Dacon
> Dacon Software Consulting
>
>
> "Kittyhawk" <Kittyh***@discussions.microsoft.com> wrote in message
> news:82A1C08E-4F56-4DA4-82DA-6C2640EB0BCE@microsoft.com...
> >I would like to sort an Arraylist of objects on multiple properties. For
> > instance, I have a Sort Index property and an ID property (both integers).
> > So, the results of my sort would look like this:
> >
> > sort index, id
> > 1000,1
> > 1000,2
> > 1000,3
> > 1001,1
> > 1001,2
> > ...
> >
> > I know I can implement IComparable.CompareTo to sort on one property but I
> > am not completely clear on how to sort on two. Thanks for any help.
> >

Why create strings when it is not necessary?  All he needs to do is in
the CompareTo method to first check the value of the index.  If they
are not equal then do the compare on the index.  If the indexes are
equal, then do the compare on the ID:

If the indexes are not equal, then the value of the Id won't affect the
sort anyway.

If obj1.Index <> obj2.Index Then
    return obj1.Index.CompareTo(Obj2.Index)
Else
    return obj1.Id.CompareTo(obj2.Id)
End If
Author
18 Sep 2006 4:01 PM
Tom Dacon
Show quote Hide quote
> Why create strings when it is not necessary?  All he needs to do is in
> the CompareTo method to first check the value of the index.  If they
> are not equal then do the compare on the index.  If the indexes are
> equal, then do the compare on the ID:
>
> If the indexes are not equal, then the value of the Id won't affect the
> sort anyway.
>
> If obj1.Index <> obj2.Index Then
>    return obj1.Index.CompareTo(Obj2.Index)
> Else
>    return obj1.Id.CompareTo(obj2.Id)
> End If
>

Just think for a moment what the code would look like if the requirement
were to sort on three, or four, or more property values. What I'm showing is
a generalized solution to the specific problem that the OP posed, that can
be used not only for this simple requirement but also for more complex
sorting requirements. Once you understand how to structure a sorting
solution this way, for the simple case given, you can easily apply it to
other and perhaps more complex problems. This is a good technique to add to
your toolkit - it's is simple and easy to remember, and scales easily as you
sort on more and more properties.

Tom Dacon
Dacon Software Consulting
Author
18 Sep 2006 4:51 PM
Chris Dunaway
Tom Dacon wrote:
Show quoteHide quote
> > Why create strings when it is not necessary?  All he needs to do is in
> > the CompareTo method to first check the value of the index.  If they
> > are not equal then do the compare on the index.  If the indexes are
> > equal, then do the compare on the ID:
> >
> > If the indexes are not equal, then the value of the Id won't affect the
> > sort anyway.
> >
> > If obj1.Index <> obj2.Index Then
> >    return obj1.Index.CompareTo(Obj2.Index)
> > Else
> >    return obj1.Id.CompareTo(obj2.Id)
> > End If
> >
>
> Just think for a moment what the code would look like if the requirement
> were to sort on three, or four, or more property values. What I'm showing is
> a generalized solution to the specific problem that the OP posed, that can
> be used not only for this simple requirement but also for more complex
> sorting requirements. Once you understand how to structure a sorting
> solution this way, for the simple case given, you can easily apply it to
> other and perhaps more complex problems. This is a good technique to add to
> your toolkit - it's is simple and easy to remember, and scales easily as you
> sort on more and more properties.

I agree that it can be a bit unwieldy for more than a few components,
but your method is creating at least 6 strings for each comparison:  4
PadLeft and 2 concatenations.  And for more components, that will
increase.  Sorting an arraylist with a large number of items may incur
a performace hit.

I'm just trying to say that I would think twice before creating a bunch
of strings that might incur a significant performance hit.

It would be interesting to see what the differences are and if they are
a cause for concern.
Author
18 Sep 2006 5:33 PM
Tom Dacon
> I agree that it can be a bit unwieldy for more than a few components,
> but your method is creating at least 6 strings for each comparison:  4
> PadLeft and 2 concatenations.  And for more components, that will
> increase.  Sorting an arraylist with a large number of items may incur
> a performace hit.
>
> I'm just trying to say that I would think twice before creating a bunch
> of strings that might incur a significant performance hit.
>
> It would be interesting to see what the differences are and if they are
> a cause for concern.
>

It'll depend more on the number of elements in the array than anything else.
For small lists, you probably wouldn't be able to even measure the impact.
For lists of reasonable size there's not likely to be a noticeable impact
(such as for sorting lists of a size that you would present in a user
interface). For extremely large lists, you'd be looking at using something
besides an arraylist and a comparer anyway.

Tom
Author
18 Sep 2006 5:01 PM
Chris Dunaway
Tom Dacon wrote:
> a generalized solution to the specific problem that the OP posed, that can
> be used not only for this simple requirement but also for more complex
> sorting requirements. Once you understand how to structure a sorting
> solution this way, for the simple case given, you can easily apply it to
> other and perhaps more complex problems. This is a good technique to add to
> your toolkit - it's is simple and easy to remember, and scales easily as you
> sort on more and more properties.

And I would also say that having a nested If is not that bad.  It
assumes that the properties themselves are comparable, however, so that
may be one instance where converting to a string would be applicable.
But then you only need do it when necessary.  Also in your method, if
the first properties differ, why convert other properties when their
values will not affect the outcome?

Dim iResult as Integer

If obj1.prop1 = obj2.prop1 Then
    If obj1.prop2 = obj2.prop2 Then
        If obj1.prop3 = obj2.prop3 Then
            If obj1.prop4 = obj2.prop4 Then
                If obj1.prop5 = obj2.prop5 Then
                    iResult = obj1.prop6.CompareTo(obj2.prop6)
                Else
                    iResult = obj1.prop5.CompareTo(obj2.prop5)
                End If
            Else
                iResult = obj1.prop4.CompareTo(obj2.prop4)
            End If
        Else
            iResult = obj1.prop3.CompareTo(obj2.prop3)
        End If
    Else
        iResult = obj1.prop2.CompareTo(obj2.prop2)
    End If
Else
    iResult = obj1.prop1.CompareTo(obj2.prop1)
End If

Return iResult
Author
18 Sep 2006 5:42 PM
Tom Dacon
Well, suit yourself. To me this code looks really ugly, difficult to
maintain or extend, and error-prone to code (and I mean that, of course, in
the nicest possible way). And I might point out the sheer number of
individual comparisions in this algorithm, compared to the single comparison
in the algorithm under discussion.

But everyone has his own way of doing things. My intention was to offer to
the community a useful solution to the problem of sorting on multiple
properties in a list comparer, that might not occur to a lot of people.
Whether or not one uses it depends on other aspects of the problem at hand,
as has already been noted.

That's all for me on this topic,

Tom

Show quoteHide quote
>
> Dim iResult as Integer
>
> If obj1.prop1 = obj2.prop1 Then
>    If obj1.prop2 = obj2.prop2 Then
>        If obj1.prop3 = obj2.prop3 Then
>            If obj1.prop4 = obj2.prop4 Then
>                If obj1.prop5 = obj2.prop5 Then
>                    iResult = obj1.prop6.CompareTo(obj2.prop6)
>                Else
>                    iResult = obj1.prop5.CompareTo(obj2.prop5)
>                End If
>            Else
>                iResult = obj1.prop4.CompareTo(obj2.prop4)
>            End If
>        Else
>            iResult = obj1.prop3.CompareTo(obj2.prop3)
>        End If
>    Else
>        iResult = obj1.prop2.CompareTo(obj2.prop2)
>    End If
> Else
>    iResult = obj1.prop1.CompareTo(obj2.prop1)
> End If
>
> Return iResult
>
Author
18 Sep 2006 6:06 PM
Chris Dunaway
Tom Dacon wrote:
> Well, suit yourself. To me this code looks really ugly, difficult to
> maintain or extend, and error-prone to code (and I mean that, of course, in
> the nicest possible way).

I don't see how it's any more difficult to maintain and extend than the
string version.  Both will require changing the code and a recompile.
As for ugliness, I agree it's not the prettiest code.

> And I might point out the sheer number of
> individual comparisions in this algorithm, compared to the single comparison
> in the algorithm under discussion.

But I believe the individual comparisons are less than the cost of
creating a new string.  And in the version I posted there are at most 6
comparisons per call to the CompareTo method and as few as 2 -- to sort
on 6 fields.  Whereas the string version will ALWAYS create 14 strings,
6 PadLeft and one concatenation for each object compared.

> But everyone has his own way of doing things. My intention was to offer to
> the community a useful solution to the problem of sorting on multiple
> properties in a list comparer, that might not occur to a lot of people.

And I agree that it is clever and useful, but anytime I see alot of
strings and concatenation, it raises a red flag in my mind.  Especially
since strings in .Net are immutable.

> Whether or not one uses it depends on other aspects of the problem at hand,
> as has already been noted.

I agree (as you pointed out elsewhere) that the number of individual
fields to be sorted on as well as the number of records to sort will
affect the relative time needed to perform the sort

Cheers!

Chris
Author
18 Sep 2006 5:50 PM
Chris Dunaway
Tom Dacon wrote:
> Just think for a moment what the code would look like if the requirement
> were to sort on three, or four, or more property values. What I'm showing is
> a generalized solution to the specific problem that the OP posed, that can
> be used not only for this simple requirement but also for more complex
> sorting requirements. Once you understand how to structure a sorting
> solution this way, for the simple case given, you can easily apply it to
> other and perhaps more complex problems. This is a good technique to add to
> your toolkit - it's is simple and easy to remember, and scales easily as you
> sort on more and more properties.
>

I created the following code to test this.  The simple class that is
being sorted consists of 3 integers and 3 strings, randomly generated.
I did not implement IComparable on the class itself, but rather created
2 IComparer<SimpleClass> classes so I could keep them separate.  I also
used a generic List since it was faster to create. For the comparer
that implements your method I padded all integers on the left with 0
for a total length of 5 and for the strings, I padded on the left with
the letter A to 9 characters.  The strings were randomly generated with
a length from 4 to 9 characters.

For 100000 object, my method took .428 seconds.  The method using
strings took 15.353 seconds.  I don't know if this is conclusive, there
may be a problem with my testing methodolgy, but I would appreciate
some input.  Here is the code:  (I also used C# since I am faster at
developing with it)

Here are the comparer classes:

class Comparer1 : IComparer<SimpleClass>
{
    public int Compare(SimpleClass x, SimpleClass y)
    {
        int result;

        if (x.Prop1 == y.Prop1)
            if (x.Prop2 == y.Prop2)
                if (x.Prop3 == y.Prop3)
                    if (x.Prop4 == y.Prop4)
                        if (x.Prop5 == y.Prop5)
                            result = x.Prop6.CompareTo(y.Prop6);
                        else
                            result = x.Prop5.CompareTo(y.Prop5);
                    else
                        result = x.Prop4.CompareTo(y.Prop4);
                else
                    result = x.Prop3.CompareTo(y.Prop3);
            else
                result = x.Prop2.CompareTo(y.Prop2);
        else
            result = x.Prop1.CompareTo(y.Prop1);

        return result;
    }
}

class Comparer2 : IComparer<SimpleClass>
{
    public int Compare(SimpleClass x, SimpleClass y)
    {
        string str1 = x.Prop1.ToString().PadLeft(5, '0') +
                      x.Prop2.PadLeft(9, 'A') +
                      x.Prop3.ToString().PadLeft(5, '0') +
                      x.Prop4.ToString().PadLeft(5, '0') +
                      x.Prop5.PadLeft(9, 'A') +
                      x.Prop6.PadLeft(9, 'A');

        string str2 = y.Prop1.ToString().PadLeft(5, '0') +
                      y.Prop2.PadLeft(9, 'A') +
                      y.Prop3.ToString().PadLeft(5, '0') +
                      y.Prop4.ToString().PadLeft(5, '0') +
                      y.Prop5.PadLeft(9, 'A') +
                      y.Prop6.PadLeft(9, 'A');

        return str1.CompareTo(str2);
    }
}


Here is the code to generate the objects, and sort them:

Random rnd = new Random(Environment.TickCount);
private const int instanceCount = 100000;

private void button3_Click(object sender, EventArgs e)
{
    List<SimpleClass> items1 = createListOfItems();
    List<SimpleClass> items2 = createListOfItems();

    Stopwatch watch = new Stopwatch();

    //Sort by Comparer1
    watch.Reset();
    watch.Start();
    items1.Sort(new Comparer1());
    watch.Stop();
    MessageBox.Show(watch.Elapsed.ToString());

    //Sort by Comparer2
    watch.Reset();
    watch.Start();
    items2.Sort(new Comparer2());
    watch.Stop();
    MessageBox.Show(watch.Elapsed.ToString());
}

private List<SimpleClass> createListOfItems()
{
    List<SimpleClass> list = new List<SimpleClass>();

    for (int i = 0; i < instanceCount; i++)
    {
        SimpleClass s = new SimpleClass();
        s.Prop1 = rnd.Next(1, 10001);
        s.Prop2 = createRandomString(rnd.Next(4, 9));
        s.Prop3 = rnd.Next(1, 10001);
        s.Prop4 = rnd.Next(1, 10001);
        s.Prop5 = createRandomString(rnd.Next(4, 9));
        s.Prop6 = createRandomString(rnd.Next(4, 9));

        list.Add(s);
    }

    return list;
}

private string createRandomString(int p)
{
    string letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    StringBuilder sb = new StringBuilder();

    for (int i = 0; i < p; i++)
    {
        sb.Append(letters[rnd.Next(0, letters.Length)]);
    }
    return sb.ToString();
}
Author
18 Sep 2006 5:56 PM
Chris Dunaway
Chris Dunaway wrote:

Oops, I forgot to post the SimpleClass:

class SimpleClass
{
    private int _prop1;
    public int Prop1
    {
        get { return _prop1; }
        set { _prop1 = value; }
    }

    private string _prop2;
    public string Prop2
    {
        get { return _prop2; }
        set { _prop2 = value; }
    }

    private int _prop3;
    public int Prop3
    {
        get { return _prop3; }
        set { _prop3 = value; }
    }

    private int _prop4;
    public int Prop4
    {
        get { return _prop4; }
        set { _prop4 = value; }
    }

    private string _prop5;
    public string Prop5
    {
        get { return _prop5; }
        set { _prop5 = value; }
    }

    private string _prop6;
    public string Prop6
    {
        get { return _prop6; }
        set { _prop6 = value; }
    }
}
Author
18 Sep 2006 6:21 PM
Tom Dacon
Interesting result. My use of this technique has been with much smaller
lists, on the order of a few hundred at most. Perhaps a StringBuilder or two
might be worth inserting into the algorithm, or perhaps the conclusion to be
reached is that the technique just isn't appropriate for lists on that
scale.

Thanks for taking the time to test it.

Tom

Show quoteHide quote
"Chris Dunaway" <dunaw***@gmail.com> wrote in message
news:1158602219.365012.143850@d34g2000cwd.googlegroups.com...
> Chris Dunaway wrote:
>
> Oops, I forgot to post the SimpleClass:
>
> class SimpleClass
> {
>    private int _prop1;
>    public int Prop1
>    {
>        get { return _prop1; }
>        set { _prop1 = value; }
>    }
>
>    private string _prop2;
>    public string Prop2
>    {
>        get { return _prop2; }
>        set { _prop2 = value; }
>    }
>
>    private int _prop3;
>    public int Prop3
>    {
>        get { return _prop3; }
>        set { _prop3 = value; }
>    }
>
>    private int _prop4;
>    public int Prop4
>    {
>        get { return _prop4; }
>        set { _prop4 = value; }
>    }
>
>    private string _prop5;
>    public string Prop5
>    {
>        get { return _prop5; }
>        set { _prop5 = value; }
>    }
>
>    private string _prop6;
>    public string Prop6
>    {
>        get { return _prop6; }
>        set { _prop6 = value; }
>    }
> }
>
Author
18 Sep 2006 6:33 PM
Chris Dunaway
Tom Dacon wrote:
> Interesting result. My use of this technique has been with much smaller
> lists, on the order of a few hundred at most. Perhaps a StringBuilder or two
> might be worth inserting into the algorithm, or perhaps the conclusion to be
> reached is that the technique just isn't appropriate for lists on that
> scale.
>
> Thanks for taking the time to test it.
>

I also realized that padding on the left with the letter A for the
string properties was wrong.  It think it changed the sort order so I
removed that and it took only 12 seconds.

I think I'll try the string builder and see what happens.
Author
18 Sep 2006 6:38 PM
Chris Dunaway
Chris Dunaway wrote:
Show quoteHide quote
> Tom Dacon wrote:
> > Interesting result. My use of this technique has been with much smaller
> > lists, on the order of a few hundred at most. Perhaps a StringBuilder or two
> > might be worth inserting into the algorithm, or perhaps the conclusion to be
> > reached is that the technique just isn't appropriate for lists on that
> > scale.
> >
> > Thanks for taking the time to test it.
> >
>
> I also realized that padding on the left with the letter A for the
> string properties was wrong.  It think it changed the sort order so I
> removed that and it took only 12 seconds.
>
> I think I'll try the string builder and see what happens.

Here is the string builder version but it didn't seem to save much
time:

    class Comparer2 : IComparer<SimpleClass>
    {
        public int Compare(SimpleClass x, SimpleClass y)
        {
            StringBuilder sb1 = new StringBuilder(50);
            StringBuilder sb2 = new StringBuilder(50);

            sb1.Append(x.Prop1.ToString().PadLeft(5, '0'));
            sb1.Append(x.Prop2.PadRight(9,' '));
            sb1.Append(x.Prop3.ToString().PadLeft(5, '0'));
            sb1.Append(x.Prop4.ToString().PadLeft(5, '0'));
            sb1.Append(x.Prop5.PadRight(9, ' '));
            sb1.Append(x.Prop6.PadRight(9, ' '));

            sb2.Append(y.Prop1.ToString().PadLeft(5, '0'));
            sb2.Append(y.Prop2.PadRight(9, ' '));
            sb2.Append(y.Prop3.ToString().PadLeft(5, '0'));
            sb2.Append(y.Prop4.ToString().PadLeft(5, '0'));
            sb2.Append(y.Prop5.PadRight(9, ' '));
            sb2.Append(y.Prop6.PadRight(9, ' '));

            return sb1.ToString().CompareTo(sb2.ToString());
        }
    }
Author
18 Sep 2006 6:41 PM
Chris Dunaway
Tom Dacon wrote:
> Interesting result. My use of this technique has been with much smaller
> lists, on the order of a few hundred at most. Perhaps a StringBuilder or two

You're right, for only a few hundred to a few thousand object, the
difference was only about 1 second.

While the method I proposed might be faster for very large lists, it
does sacrifice a little readablity.  For smaller lists, sacrificing an
almost negligible amount of time for readability may be the way to go.

Anyway, I think we have beaten this horse into the ground!

Chris
Author
18 Sep 2006 6:51 PM
Tom Dacon
indeed

Show quoteHide quote
"Chris Dunaway" <dunaw***@gmail.com> wrote in message
news:1158604891.046539.34210@i3g2000cwc.googlegroups.com...
> Tom Dacon wrote:
>> Interesting result. My use of this technique has been with much smaller
>> lists, on the order of a few hundred at most. Perhaps a StringBuilder or
>> two
>
> You're right, for only a few hundred to a few thousand object, the
> difference was only about 1 second.
>
> While the method I proposed might be faster for very large lists, it
> does sacrifice a little readablity.  For smaller lists, sacrificing an
> almost negligible amount of time for readability may be the way to go.
>
> Anyway, I think we have beaten this horse into the ground!
>
> Chris
>
Author
17 Sep 2006 1:52 PM
Dennis
In your compare routine, you should be able to compare both properties to see
which class instance should be first.
--
Dennis in Houston


Show quoteHide quote
"Kittyhawk" wrote:

> I would like to sort an Arraylist of objects on multiple properties. For
> instance, I have a Sort Index property and an ID property (both integers).
> So, the results of my sort would look like this:
>
> sort index, id
> 1000,1
> 1000,2
> 1000,3
> 1001,1
> 1001,2
> ...
>
> I know I can implement IComparable.CompareTo to sort on one property but I
> am not completely clear on how to sort on two. Thanks for any help.
>