#Google Analytic Tracker

Aug 4, 2009

Did you increase or decrease your application performance using IEnumerable in Linq (Part 2)?

Last time I talked about how you can improve your application performance by using IEnumerable so that no calculation is performed until you need it.

Bad Performance

Now, consider the following code:

private void GetStatistics(List<Vehicle> totalSaleList)
{
    IEnumerable<Vehicle> toyota
        = from v in totalSaleList
          where v.Make == Make.Toyota
          select v;

    int totalSales = toyota.Count();

    int thisYearSale = 
        (from v in toyota
         where v.Year == 2009
         select v).Count();

    double avgMileage = 
        (from v in toyota
         select v.Mileage).Average();

    Console.WriteLine("Total Sales: {0}. This year sale: {1}. Average mileage: {2}", totalSales, thisYearSale, avgMileage);
}

Notice that in the above example, I kept the variable "toyota" as an IEnumerable. Because the query is deferred, each of the three calculations (the total sales, this year's sales and the average mileage) has to re-evaluate the first LINQ statement against totalSaleList. This is obviously a waste of CPU power.
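To make the re-evaluation visible, here is a minimal sketch that counts how often the filter runs. The class and method names (DeferredDemo, CountPredicateCalls) are made up for illustration, and it filters plain integers instead of vehicles to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original example.
public static class DeferredDemo
{
    // Returns how many times the filter predicate runs when the same
    // deferred query is consumed `passes` times.
    public static int CountPredicateCalls(IReadOnlyList<int> source, int passes)
    {
        int calls = 0;

        // Nothing is evaluated here; the query is only described.
        IEnumerable<int> evens = source.Where(n =>
        {
            calls++;                // side effect used only to observe evaluation
            return n % 2 == 0;
        });

        for (int i = 0; i < passes; i++)
        {
            evens.Count();          // every consumer re-runs the Where filter
        }

        return calls;
    }
}
```

With five input items and three passes, the predicate runs 15 times: once per item for every consumer of the query.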

Better Performance

Ideally, you should do the following:

private void GetStatistics2(List<Vehicle> totalSaleList)
{
    Vehicle[] toyota
        = (from v in totalSaleList
          where v.Make == Make.Toyota
          select v).ToArray();

    int totalSales = toyota.Length;

    int thisYearSale =
        (from v in toyota
         where v.Year == 2009
         select v).Count();

    double avgMileage =
        (from v in toyota
         select v.Mileage).Average();

    Console.WriteLine("Total Sales: {0}. This year sale: {1}. Average mileage: {2}", totalSales, thisYearSale, avgMileage);
}

By converting the variable “toyota” into an array, the filter runs only once, when ToArray() is called. The later calculations iterate over the stored array instead of re-evaluating the original query.
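A small sketch of the same idea, again with hypothetical names (MaterializedDemo, CountPredicateCalls) and plain integers: once the result is stored in an array, the number of later passes no longer affects how often the filter runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original example.
public static class MaterializedDemo
{
    // Returns how many times the filter predicate runs when the query
    // result is stored in an array and then consumed `passes` times.
    public static int CountPredicateCalls(IReadOnlyList<int> source, int passes)
    {
        int calls = 0;

        // ToArray() forces the query to run immediately, exactly once.
        int[] evens = source.Where(n =>
        {
            calls++;                // side effect used only to observe evaluation
            return n % 2 == 0;
        }).ToArray();

        for (int i = 0; i < passes; i++)
        {
            evens.Count();          // reads the stored array, no filtering
        }

        return calls;
    }
}
```

With five input items the predicate runs five times, whether the array is read once or a hundred times. The trade-off is that the array is a snapshot: items added to the source list afterwards will not show up in it.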

Even Better Performance

If performance is a must, you should try to combine your calculations into as few loops as possible. Remember, each time you call a LINQ extension method (e.g. Max(), Min(), Average()), it has to loop through your enumerable object to calculate the result.

private void GetStatistics3(List<Vehicle> totalSaleList)
{
    Vehicle[] toyota
        = (from v in totalSaleList
           where v.Make == Make.Toyota
           select v).ToArray();

    int totalSales = toyota.Length;

    int thisYearSale = 0;
    int totalMileage = 0;
    foreach (var vehicle in toyota)
    {
        if (vehicle.Year == 2009)
            thisYearSale++;
        totalMileage += vehicle.Mileage;
    }

    double avgMileage = totalMileage * 1.0 / totalSales;

    Console.WriteLine("Total Sales: {0}. This year sale: {1}. Average mileage: {2}", totalSales, thisYearSale, avgMileage);
}
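As a variation on the loop above, the filter itself can be folded into the same pass, so no intermediate array is needed at all. This is only a sketch: the Vehicle class and Make enum below are stand-ins (the original types are not shown in this post), and SinglePassStats.Summarize is a made-up name. It also returns 0 for the average when nothing matches, which is a choice made here, not something the code above does.

```csharp
using System;
using System.Collections.Generic;

// Stand-in types; the real Vehicle and Make are assumed to look similar.
public enum Make { Toyota, Honda }

public sealed class Vehicle
{
    public Make Make { get; set; }
    public int Year { get; set; }
    public int Mileage { get; set; }
}

// Hypothetical helper, not part of the original example.
public static class SinglePassStats
{
    // Filters and aggregates in one loop over the input.
    public static (int Total, int ThisYear, double AvgMileage) Summarize(
        IEnumerable<Vehicle> vehicles, Make make, int year)
    {
        int total = 0;
        int thisYear = 0;
        long mileageSum = 0;        // long to lower the risk of overflow

        foreach (var v in vehicles)
        {
            if (v.Make != make)
                continue;           // same job as the "where" clause

            total++;
            if (v.Year == year)
                thisYear++;
            mileageSum += v.Mileage;
        }

        // Assumption: report 0 instead of dividing by zero when nothing matched.
        double avg = total == 0 ? 0.0 : (double)mileageSum / total;
        return (total, thisYear, avg);
    }
}
```

Calling SinglePassStats.Summarize(totalSaleList, Make.Toyota, 2009) would walk the list exactly once. Whether that is worth the extra code depends on how large the list is and how often the statistics are needed.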

Conclusion

When you need to perform more than one calculation on the same enumerable object, it is wise to convert it to a list or an array before further processing. This avoids re-evaluating the query for every calculation.

1 comment:

liv said...

Thanks for the examples.You have clear idea of IEnumerable concept,the performance examples are showing the true facts.