Does “select new” in LINQ trigger an evaluation / load?
03-07-2019
Question
I'm currently trying to create a class which implements IEnumerable<T> in order to construct a hierarchy from a flat list of objects that reference each other through a ParentId property. I'd like to write a fluent interface for this so I can do something like this:
IEnumerable<Tab> tabs = GetTabs();
IEnumerable<TabNode> tabNodes = tabs.AsHierarchy().WithStartLevel(2).WithMaxDepth(5);
So, about the yield statement, I wonder whether I could do something like this within my NodeHierarchy : IEnumerable<TabNode> class:
private IEnumerable<TabNode> _nodes;

public NodeHierarchy(IEnumerable<Tab> tabs)
{
    _nodes = CreateHierarchy(tabs);
}

public IEnumerable<TabNode> CreateHierarchy(IEnumerable<Tab> tabs)
{
    /* About this block: I'm trying to find the top level
       nodes of the first tab collection, maybe this is done poorly? */
    var tabIds = tabs.Select(t => t.TabID);
    IEnumerable<TabNode> nodes = from tab in tabs
                                 where !tabIds.Contains(tab.ParentId)
                                 select new TabNode {
                                     Tab = tab,
                                     ChildNodes = CreateHierarchy(tabs, tab.TabID, 1),
                                     Depth = 1 };
    return nodes;
}
or whether I would have to do something like this:
private IEnumerable<TabNode> _nodes;

public NodeHierarchy(IEnumerable<Tab> tabs)
{
    _nodes = CreateHierarchy(tabs);
}

public IEnumerable<TabNode> CreateHierarchy(IEnumerable<Tab> tabs)
{
    var tabIds = tabs.Select(t => t.TabID);
    IEnumerable<Tab> startingNodes = from tab in tabs
                                     where !tabIds.Contains(tab.ParentId)
                                     select tab;
    foreach (Tab node in startingNodes)
    {
        yield return
            new TabNode()
            {
                Tab = node,
                ChildNodes = CreateHierarchy(tabs, node.TabID, 1),
                Depth = 1
            };
    }
}
Solution
No, select new will not trigger evaluation. It maps to a call to:
.Select(tab => new TabNode {...})
And note that Select (for LINQ-to-Objects, at least) is essentially something like:
public static IEnumerable<TDest> Select<TSource, TDest>(
    this IEnumerable<TSource> source,
    Func<TSource, TDest> selector)
{
    foreach (TSource item in source)
    {
        // Project one item at a time, only when the caller asks for it.
        yield return selector(item);
    }
}
The key point here is that it evaluates lazily, one item at a time as the result is iterated - not all at once.
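To make the lazy behaviour visible, here is a minimal sketch that counts how often the selector runs. The class and method names (DeferredSelectDemo, CountSelectorCalls) are made up for illustration and are not part of the question's code; it assumes plain LINQ-to-Objects over an array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSelectDemo
{
    // Hypothetical helper: reports how many times the selector had run
    // before and after the query was enumerated.
    public static (int Before, int After) CountSelectorCalls()
    {
        int calls = 0;
        int[] source = { 1, 2, 3 };

        // Building the query only stores the lambda; it is not invoked here.
        IEnumerable<string> query = source.Select(n =>
        {
            calls++;
            return "item " + n;
        });
        int before = calls;

        // Iterating the query is what runs the selector, once per element.
        foreach (string item in query)
        {
            // Nothing to do with the item; we only care about the side effect.
        }
        int after = calls;

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = CountSelectorCalls();
        Console.WriteLine($"selector calls before enumeration: {before}, after: {after}");
    }
}
```

Note that iterating the same query a second time would run the selector again, so a select new query creates fresh objects on every enumeration unless the result is materialized (for example with ToList()).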
Either approach should be comparable - the only difference is that without yield return, some code will run immediately - but only the code to build the .Where(...).Select(...) chain - it won't actually process the rows until you start iterating the result.
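The difference in when the method body runs can be sketched as follows. This is an illustrative example only: IteratorVersusQueryDemo, WithQuery, WithYield and the Log list are hypothetical names, and integers stand in for the Tab objects from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IteratorVersusQueryDemo
{
    // Records when each method body actually executes.
    public static readonly List<string> Log = new List<string>();

    // No yield: the body runs at call time, but it only builds the
    // Where/Select chain. No element is processed yet.
    public static IEnumerable<int> WithQuery(IEnumerable<int> source)
    {
        Log.Add("WithQuery body ran");
        return source.Where(x => x % 2 == 0).Select(x => x * 10);
    }

    // Iterator block: nothing in the body runs until the caller
    // starts iterating the returned sequence.
    public static IEnumerable<int> WithYield(IEnumerable<int> source)
    {
        Log.Add("WithYield body ran");
        foreach (int n in source.Where(x => x % 2 == 0))
        {
            yield return n * 10;
        }
    }

    // Returns the log size right after the calls and after enumeration.
    public static (int AfterCalls, int AfterEnumeration) LogCounts()
    {
        Log.Clear();
        int[] data = { 1, 2, 3, 4 };

        IEnumerable<int> fromQuery = WithQuery(data);
        IEnumerable<int> fromYield = WithYield(data);
        int afterCalls = Log.Count; // only WithQuery has logged so far

        // Enumerating both sequences yields the same items: 20 and 40.
        bool same = fromQuery.SequenceEqual(fromYield);
        if (!same) throw new InvalidOperationException("sequences differ");

        return (afterCalls, Log.Count);
    }

    public static void Main()
    {
        var (afterCalls, afterEnumeration) = LogCounts();
        Console.WriteLine($"log entries after calls: {afterCalls}, after enumeration: {afterEnumeration}");
    }
}
```

Both methods produce the same items; they differ only in whether the method body runs at call time or on first iteration, which mostly matters for things like argument validation or logging at the top of the method.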
Furthermore, depending on the data source, the select new approach can actually be more efficient - for example, with a LINQ-to-SQL backend, the TSQL generator sees the projection and can leave the unnecessary columns out of the generated query.