Dec 1, 2010

In LINQ, is Take(n) really the same as Select Top n?

Today I was doing some bug fixes and ended up wanting to do something like “Select top n From ..” in LINQ. I realized that there isn’t really a Top keyword. Ideally I would like something like this:

var result = from i in itemList
where SomeCheck(i)
select top 10 of i;

Of course, there is no TOP in LINQ. I asked my colleague what I should do, and she said to use Take(n). It’s true: Take(n) returns the first n items of an IEnumerable. For example, you can do something like this:

var result = (from i in itemList
where SomeCheck(i)
orderby i.Rank
select i).Take(10);

However, in my case it is a bit more complicated than that. The SomeCheck() method maintains state. Because the orderby has to see every matching item before Take(10) can pick the first ten, the query above calls SomeCheck() once for every element of itemList. What I really want is to stop calling SomeCheck() once I have 10 items.
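To make the call count concrete, here is a minimal sketch that counts how often the check runs. The integer list, the "accept even numbers" rule inside SomeCheck, and the sort by the value itself are assumptions made for illustration; they are not the real itemList, check, or Rank property.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins: itemList holds 1..100 and SomeCheck accepts even numbers.
var itemList = Enumerable.Range(1, 100).ToList();
var calls = 0;
bool SomeCheck(int i) { calls++; return i % 2 == 0; }

// Ordered query: orderby has to see every filtered item before Take(10) can run.
var ordered = (from i in itemList
               where SomeCheck(i)
               orderby i
               select i).Take(10).ToList();
var callsWithOrderBy = calls;

// Same filter without orderby: Where streams items one at a time,
// so Take(10) ends the enumeration right after the 10th match.
calls = 0;
var unordered = (from i in itemList
                 where SomeCheck(i)
                 select i).Take(10).ToList();
var callsWithoutOrderBy = calls;

Console.WriteLine($"with orderby: {callsWithOrderBy} calls, without: {callsWithoutOrderBy} calls");
```

On this made-up input the ordered query calls the check 100 times, while the unordered one calls it 20 times, because the tenth even number is 20. Whether dropping the orderby is acceptable depends on whether the result has to be ranked.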

Take() != Top()

Now, when you think about how TOP works in SQL Server, I would have guessed that a query such as “Select Top 10 * From [someTable] Where [SomeSlowEvaluation]” would only evaluate “SomeSlowEvaluation” up to the point where we have all 10 items. The ordered Take(n) query above cannot achieve this.

My Solution to My Own Problem

To achieve what I want, I ended up with something like this:

var count = 0;
var result = from i in itemList
where count < 10
where SomeCheck(i)
where count++ > -1
select i;

You may wonder what the heck I am doing. The code above stops evaluating SomeCheck(i) once there are 10 items in the result. The logic is straightforward: once count < 10 fails, LINQ skips the remaining where clauses for that item. If SomeCheck(i) passes, the third clause, count++ > -1, runs; it is always true and exists only to increment count. (The gate has to be count < 10 rather than count <= 10, otherwise an 11th item slips through.) You could join all these conditions into one where clause, but the code is easier to read in this format.
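The counter trick can be checked with a small sketch like the one below. The list of 1..100 and the even-number SomeCheck are hypothetical stand-ins, and the gate uses a strict count < 10 so that exactly 10 items pass.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins: itemList holds 1..100 and SomeCheck accepts even numbers.
var itemList = Enumerable.Range(1, 100).ToList();
var calls = 0;
bool SomeCheck(int i) { calls++; return i % 2 == 0; }

var count = 0;
var result = (from i in itemList
              where count < 10      // strict <, so the gate closes after the 10th match
              where SomeCheck(i)    // only reached while the gate is open
              where count++ > -1    // always true; increments count for each match
              select i).ToList();   // ToList() runs the deferred query exactly once

Console.WriteLine($"{result.Count} items, {calls} calls to SomeCheck");
```

One thing to watch: the query is deferred and captures count, so enumerating it a second time without resetting count to 0 would return nothing. Materializing it once with ToList() sidesteps that.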

So, why would I want to do that? Of course, there is a business reason why I can’t call SomeCheck(i) more often than necessary. The method does two things: it checks whether I want the result, and it caches the data at the same time. The flaw is that the method does two things at once. However, back then it appeared to be a good idea.

The one thing the above solution can’t mimic from TOP is that it doesn’t stop enumerating until it has finished the list; it only stops calling SomeCheck(i). In SQL, if you select the top 10 out of 10 million records, you would expect that SQL Server does not loop through the entire table. In my case, unfortunately, it still does.
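A rough way to see this limitation is to count how many elements the query pulls from its source. In the sketch below, Source() is a hypothetical iterator over 1..100 that records each element it hands out, and SomeCheck is again an assumed even-number test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var visited = 0;

// Hypothetical source: yields 1..100 and records how many elements were pulled.
IEnumerable<int> Source()
{
    for (var i = 1; i <= 100; i++)
    {
        visited++;
        yield return i;
    }
}

bool SomeCheck(int i) => i % 2 == 0; // hypothetical check: accept even numbers

var count = 0;
var result = (from i in Source()
              where count < 10
              where SomeCheck(i)
              where count++ > -1
              select i).ToList();

Console.WriteLine($"{result.Count} items kept, {visited} elements visited");
```

Here only 10 items are kept, but all 100 elements are still visited: once the gate closes, every remaining element is pulled and rejected by the first where clause.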

Conclusion

If you Google “Top for LINQ”, you will find that most of the answers use Take(). In most cases Take() will work. However, just because there is an answer to the problem you searched for, it doesn’t mean it is the right solution for you. You should always dig a bit deeper into the problem to ensure the answer applies to your situation.
