|
Upvoted for responding with some substance to a lonely thread.
Thanks. Mixed in with your howling tautologies is the implication that you can trigger partial iteration to handle more data than fits in available memory ... if there's a cheap way to use virtual memory with IEnumerable, I am all ears.
fyi: detailed history of generics in .NET: [^].
«The mind is not a vessel to be filled but a fire to be kindled» Plutarch
|
|
|
|
|
BillWoodruff wrote: if there's a cheap way to use virtual memory with IEnumerable
Large data sets still end up going to disk since "virtual memory" in the traditional sense uses the disk to get that memory.
That is why, for example, even though a binary sort might "seem" fast, if you attempt it on a large data set that requires disk paging it is not going to work. See heap sort instead.
Actions on large data sets which involve disk access always require careful design. They can't be fixed by implementation (where implementation precludes design.)
So for example that is why one must carefully consider what indexes database tables should use.
And if one is going to return a sorted list to a UI which originates from a database then from the very first design the API should have paging in it. I have seen silly designs for decades (now and then) where the UI falls over because the developer designed a UI report and tested it for 20 entries and now the major customer using it has 20 million entries. And the UI was set up to sort it in memory.
As for IEnumerable, you can of course back it with anything. So for example the implementation can use a database paged query. Each call to the exposed methods tracks the point where the last chunk was grabbed from the database, fetches the next chunk when needed, and feeds from that chunk until it is empty. That doesn't fix the count problem, because IEnumerable has no way to expose the count in the first place. So you would need to use another interface. You can of course use a double interface, and then the calling code would need to cast to get the top-level one, which has a count. The backing implementation would then provide the count in a better way than iterating through the IEnumerable.
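A minimal sketch of the paged idea, assuming C# 9 or later top-level statements. FetchPage, ReadPaged and the in-memory table are hypothetical stand-ins for a real paged database query; nothing here comes from an actual data-access library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database table; a real version would run a paged query.
var table = Enumerable.Range(1, 10).ToList();
var pagesFetched = 0;

// Hypothetical page fetcher: returns up to `size` rows starting at `offset`.
List<int> FetchPage(int offset, int size)
{
    pagesFetched++;
    return table.Skip(offset).Take(size).ToList();
}

// Streams rows one page at a time; only the current page is held in memory.
IEnumerable<T> ReadPaged<T>(Func<int, int, List<T>> fetchPage, int pageSize)
{
    for (var offset = 0; ; offset += pageSize)
    {
        var page = fetchPage(offset, pageSize);
        foreach (var row in page) yield return row;
        if (page.Count < pageSize) yield break; // a short page means no more rows
    }
}

// The caller just sees an IEnumerable; pages are fetched only as iteration reaches them.
var firstFour = ReadPaged<int>(FetchPage, 3).Take(4).ToList();
Console.WriteLine(string.Join(",", firstFour)); // 1,2,3,4
Console.WriteLine(pagesFetched);                // fewer pages than a full read would need
```

Note the sketch has no count anywhere, which is the point made above: a count would have to come from a second interface backed by something like a separate count query.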
But of course there are often other design problems with solutions like the above.
1. They assume that the count is an absolute. With large datasets the data very likely will be in transition. By the time you get to the end there could be more or fewer entries than the original count.
2. They write UIs with the expectation that users are stupid and don't know how to do their job. So that example above with 20 entries and no other way to search becomes absolutely useless when there are 20 million entries, even if it is paged. It should always have search criteria.
|
|
|
|
|
Me: "You know how when you have two cans connected with a string and when you pull the string tight you can hear what someone says in the can at the other end of the string?"
Student: "Yeah"
Me: "Well it's not like that at all."
I think humor should be injected into every conversation...
".45 ACP - because shooting twice is just silly" - JSOP, 2010 ----- You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010 ----- When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013
|
|
|
|
|
Is there any real reason to use an IEnumerable ?
Yes; first off IEnumerable was Microsoft implementation of functional programming algebras for a List . It is essentially a monadic data type that encapsulates List ; monadic data types are typically designed to conform to a fairly standard set of functional algebras i.e. providing a common API for computations.
One major difference with Microsoft's implementation is that Microsoft decided to depart from the more familiar naming conventions used in functional programming circles, for example:
- Select is the mapping operation of a Functor; that method is more typically named Map .
- SelectMany is the binding operation of a Monad; that method is more typically named FlatMap .
Microsoft's choice was to align their monadic implementation with SQL for obvious reasons; they had an existing group of programmers who would be already quite accustomed to SQL; so adopting that style of API would simplify adoption.
Category Theory
Functional programming algebras are directly tied to Category Theory; hence there are specific tests (laws) that can be performed to verify that these common computations conform to the laws of a Functor ; for example, it must preserve:
- Identity morphisms
- Composition of morphisms
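As a rough illustration of the two functor laws using Select , here is a small sketch assuming C# 9 or later top-level statements. The functions f and g and the sample list are made up for the example, and checking one input is a spot check, not a proof.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var xs = new List<int> { 1, 2, 3 };
Func<int, int> f = x => x + 1;
Func<int, int> g = x => x * 2;

// Identity law: mapping the identity function changes nothing.
var identityHolds = xs.Select(x => x).SequenceEqual(xs);

// Composition law: mapping f and then g equals mapping the composition of f and g.
var compositionHolds = xs.Select(f).Select(g).SequenceEqual(xs.Select(x => g(f(x))));

Console.WriteLine($"{identityHolds} {compositionHolds}"); // True True
```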
Similarly, there are laws governing the other algebras like SelectMany ; which as I mentioned is a monad, etc…
The benefit of IEnumerable is the same as for any other functionally conforming type; it allows the use of a common set of algebras (common API) on the type that is encapsulated by IEnumerable . In Functional Programming, IEnumerable is not the only monadic data type that is used; there are also for example:
- Identity Monad; this is the most trivial of the monadic types; encapsulating a single value.
- Maybe Monad; a type that encapsulates an optional value; similar to the Option type in F# and in some C# functional libraries (C# itself has no built-in Option type)
- List Monad; a type that encapsulates a list of values; similar to IEnumerable type in C#
- Either Monad; a type that encapsulates a value with one of two possible outcomes.
- Validate Monad; a type like Either that is used to write data validators; because it has been designed to accumulate errors with e.g. an input form with multiple input validations that could be in error at the same time.
- etc.
This is by no means a comprehensive list of monadic data types, either for functional programming at large or in terms of the C# types that have some level of implementation of functional programming algebras.
What's so useful about deferred evaluation ?
Deferred evaluation is more commonly referred to as lazy evaluation. Why it is useful in Linq is simple: it allows a more expressive use of the API without incurring additional computation cost, for example:
- Multiple Where statements in a single Linq computation do not each cost a separate pass over the data; they are combined into a single computation.
- It allows you to incur a computational cost only when it's needed.
For example, some lazy evaluations may be built up during a computation block and then never executed because of e.g. a branched operation like a cancellation, where this lazy evaluation becomes unnecessary. Hence it is preferable not to incur any computational cost until it's actually needed, and most certainly not at each step (e.g. for each dot-chained method call).
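A small sketch of that behaviour, assuming C# 9 or later top-level statements. The counter calls is only there to make the work visible; the numbers are arbitrary.

```csharp
using System;
using System.Linq;

var calls = 0;
var numbers = new[] { 1, 2, 3, 4, 5, 6 };

// Building the query runs nothing: no predicate has been called yet.
var query = numbers
    .Where(n => { calls++; return n % 2 == 0; })
    .Where(n => n > 2);
var callsBefore = calls;

// Enumerating does the work in one pass: the first predicate runs once per element.
var result = query.ToList();

Console.WriteLine(callsBefore);               // 0
Console.WriteLine(calls);                     // 6
Console.WriteLine(string.Join(",", result));  // 4,6
```

If the query were abandoned before ToList() , for instance after a cancellation, calls would stay at 0.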
Infinite Lists
The other less obvious use of lazy evaluation is infinite lists: lists that are computationally far too expensive and/or impossible to compute as a whole. That is where method calls like Take and Skip come into play, for example:
- I can choose to Skip the first 3 entries in an infinite list of primes and Take the next 4, without incurring the computation cost of first trying to populate an infinite list of primes.
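The primes example can be sketched like this, assuming C# 9 or later top-level statements. Primes is a hypothetical helper using naive trial division, purely for illustration (it is not meant to run anywhere near the limits of int ).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// An "infinite" sequence: each prime is computed only when the caller asks for it.
IEnumerable<int> Primes()
{
    for (var n = 2; ; n++)
    {
        var isPrime = true;
        for (var d = 2; d * d <= n; d++)
        {
            if (n % d == 0) { isPrime = false; break; }
        }
        if (isPrime) yield return n;
    }
}

// Skip 2, 3, 5 and take the next four; iteration stops as soon as the fourth arrives.
var picked = Primes().Skip(3).Take(4).ToList();
Console.WriteLine(string.Join(", ", picked)); // 7, 11, 13, 17
```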
Isn't deferred evaluation dangerous because you have no idea of what might have changed in the time between assembling the IEnumerable and iterating/evaluating it ?
No, not at all, because the final computation still works on essentially a snapshot of the encapsulated value as it is at the time the lazy evaluation is actually computed (enumerated), not at the time it was assembled.
Plus it's considered bad design to build any system that incorporates data races. A data race occurs when two computations access the same value concurrently, and at least one of the accesses mutates the value. The problem is that it makes testing impossible, because you have no way of knowing the state of the encapsulated value; worse, the mutation step may be only partially computed, leaving the value in a compromised state. It is impossible to test code with objects that are in an indeterminate state.
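To make concrete what a deferred query observes, here is a small single-threaded sketch, assuming C# 9 or later top-level statements. The list and the filter are arbitrary; the point is only that the source is read when the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var source = new List<int> { 1, 2, 3 };

// Assembling the query reads nothing from the list yet.
var evens = source.Where(n => n % 2 == 0);

// A change made between assembling and enumerating...
source.Add(4);

// ...is visible, because the list is read only now.
var seen = evens.ToList();
Console.WriteLine(string.Join(",", seen)); // 2,4
```

Calling ToList() or ToArray() at the point of assembly is the usual way to pin down a snapshot if that is what you want.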
Why can't I test an IEnumerable to see if it's empty, or null; why do I have to trigger iteration with 'Count() to see how many items it has, and, why can I only use 'Any to see if there are no items in it without triggering iteration..
…because it's a lazy evaluation, whose size is typically difficult (if not impossible) to determine in advance of actually computing it. However, as of .Net 6 Microsoft has made available a new Linq method called TryGetNonEnumeratedCount , which attempts to determine the number of elements in a sequence without forcing an enumeration. If it cannot, it returns false and leaves the decision to the caller; it is Count() that falls back to forcing the compute.
Personally I would advise against this style of coding completely. In the functional programming context we specifically avoid these types of method calls, because they are indeterminate; worse, this method returns its result through an out parameter, which makes testing that much more difficult. In functional programming there is huge merit in trying to keep a majority of the codebase as pure as possible. In the C# context it is quite possible to have the core of the codebase pure, with a very small cut-out for mutation where it's unavoidable, for example the UI. That leaves a codebase that is for the greater part very easy to test, and hence less prone to errors.
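For reference, a short sketch of how these calls behave, assuming .Net 6 or later and C# 9 top-level statements. Lazy is a hypothetical iterator used only to have a sequence with no cheap count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 1, 2, 3 };

// A compiler-generated iterator: its length is unknown until it is enumerated.
IEnumerable<int> Lazy()
{
    yield return 1;
    yield return 2;
}

// A List knows its own size, so the count is available without enumerating.
var listKnown = list.TryGetNonEnumeratedCount(out var listCount);

// For the iterator the method gives up and reports false; it does not enumerate.
var lazyKnown = Lazy().TryGetNonEnumeratedCount(out var lazyCount);

var any = Lazy().Any();       // stops after the first element
var counted = Lazy().Count(); // has to walk the whole sequence

Console.WriteLine($"{listKnown} {listCount}"); // True 3
Console.WriteLine($"{lazyKnown} {lazyCount}"); // False 0
Console.WriteLine($"{any} {counted}");         // True 2
```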
Anyway, that's a window into the rabbit hole that is functional programming and its basis in category theory; hopefully it's helpful.
modified 26-Jan-22 12:01pm.
|
|
|
|
|
It's an interface to a forward reading state machine whose final implementation is left up to the author. The whole "internal list" concept is an illusion.
I use IEnumerable to create custom tokenizers for certain text files. It's when you need it, and realize it, that its usefulness becomes clear.
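A minimal sketch of what such a tokenizer might look like, assuming C# 9 or later top-level statements and plain whitespace-separated tokens. Tokenize and its splitting rule are illustrative assumptions, not a description of any particular file format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Reads forward one character at a time and hands back each token as it completes.
IEnumerable<string> Tokenize(TextReader reader)
{
    var token = new StringBuilder();
    int c;
    while ((c = reader.Read()) != -1)
    {
        if (char.IsWhiteSpace((char)c))
        {
            if (token.Length > 0)
            {
                yield return token.ToString();
                token.Clear();
            }
        }
        else
        {
            token.Append((char)c);
        }
    }
    if (token.Length > 0) yield return token.ToString(); // last token before end of input
}

var tokens = Tokenize(new StringReader("let x  = 42\n")).ToList();
Console.WriteLine(string.Join("|", tokens)); // let|x|=|42
```

There is no list inside Tokenize ; the only state is the reader position and the token being built.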
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
Most certainly Yes... in the context of considering IEnumerable as just an interface for iteration over a collection of sorts.
What however muddles the water is the questioning of its usefulness in relation to the Linq Extension Methods for IEnumerable , as mentioned by OP:
- Select
- Where
- Aggregate
Similarly, deferred evaluation (aka lazy evaluation) and Count() .
Whilst most any data type can conform to some degree with IEnumerable , and by virtue of that with its Linq extensions, its predominant practicality is with in-memory collection types like List , where a majority of the Linq algebras would equally be useful, performant, etc.
|
|
|
|
|
There is no muddling; until I yield, I can pretty well do anything I like, including chaining.
Count() is not a method of IEnumerable. A state machine can learn; you could probably get a good estimate the 2nd time around; all other things being equal.
If the IEnumerable returns an object, you could even have it return a status report every so often; you're not limited to returning a particular type.
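A rough sketch of the "status report" idea using the non-generic IEnumerable , assuming C# 9 or later top-level statements. ReadWithStatus and the report format are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Linq;

// Yields the items themselves, plus a status string after every `reportEvery` items.
IEnumerable ReadWithStatus(int[] items, int reportEvery)
{
    for (var i = 0; i < items.Length; i++)
    {
        yield return items[i];
        if ((i + 1) % reportEvery == 0)
        {
            yield return $"status: {i + 1} of {items.Length} read";
        }
    }
}

var everything = ReadWithStatus(new[] { 10, 20, 30, 40 }, 2).Cast<object>().ToList();
var reports = everything.OfType<string>().ToList();

Console.WriteLine(everything.Count); // 6
Console.WriteLine(reports[0]);       // status: 2 of 4 read
```

The consumer has to tell items from reports by type, which is the price of returning plain object .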
|
|
|
|
|
Count() is not a method of IEnumerable.
It is an extension method on IEnumerable<TSource> ; the state machine is sub typed to IEnumerable ; see reference runtime code below:
runtime/Count.cs at dotnet/runtime · GitHub
A state machine can learn; you could probably get a good estimate the 2nd time around; all other things being equal.
I see no example of this in the code for Linq . In .Net 6 Microsoft has added TryGetNonEnumeratedCount to try to obtain the count without enumeration; when that is not possible it returns false, and it is Count() that falls back on enumerating the collection. Reference runtime code below:
runtime/Count.cs at dotnet/runtime · GitHub
If the IEnumerable returns an object, you could even have it return a status report every so often; you're not limited to returning a particular type.
As mentioned before, it is a state machine that is sub typed to IEnumerable ; so barring guesstimate extension methods like TryGetNonEnumeratedCount , it requires enumeration to return a result. Select allows mapping from one encapsulated type to another; however the collection object (IEnumerable ) stays fixed for Linq.
Modifying the outer collection object usually involves a traverse map; Linq AFAIK doesn't have any traversal extension methods, but it's fairly easy to build your own, because the building blocks for it are already there... Aggregate , SelectMany , and Select .
Side note: Arguably ToList and ToArray are basic traversals.
modified 28-Jan-22 1:52am.
|
|
|
|
|
My references are to IEnumerable; you keep wanting to make it about LINQ, extensions, and generic types.
"I see no example". And that means what? You only believe in "prior art"?
You keep ignoring it's an "interface"; there is no "code" (LINQ or otherwise) that you can make claims about.
|
|
|
|
|
IEnumerable in the small is defined as just an interface; for example (.Net sourcecode):
public interface IEnumerable {
    System.Collections.IEnumerator GetEnumerator();
}
Which does nothing but define the need for a method called GetEnumerator that returns an IEnumerator .
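To show how little the interface itself asks for, here is roughly what a foreach loop does with it, written out by hand. This is a simplified sketch assuming C# 9 or later top-level statements; the array and the sum are arbitrary.

```csharp
using System;
using System.Collections;

IEnumerable source = new[] { 10, 20, 30 };
var sum = 0;

// Ask for an enumerator, then pull items forward one at a time.
IEnumerator e = source.GetEnumerator();
try
{
    while (e.MoveNext())
    {
        sum += (int)e.Current; // Current is typed as object on the non-generic interface
    }
}
finally
{
    (e as IDisposable)?.Dispose(); // dispose only if the enumerator supports it
}

Console.WriteLine(sum); // 60
```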
If that's what you mean by "My references are to IEnumerable"; then following "your rules"; you should avoid saying:
Count() is not a method of IEnumerable. A state machine can learn; you could probably get a good estimate the 2nd time around; all other things being equal.
If the IEnumerable returns an object, you could even have it return a status report every so often; you're not limited to returning a particular type.
..because "your rules" stipulate that we ignore all the extension methods that are linked to IEnumerable and, by extension, all the sub types of IEnumerable . Meaning you can't imply it's a "state machine", because that would require you to first acknowledge the many extension methods for IEnumerable ; because (if you missed it) the extension methods are the bridging code between IEnumerable as an interface and IEnumerable as a state machine, i.e. part of what's commonly known as Linq.
What value is there in ignoring reality? Linq (for IEnumerable) exists as a large set of extension methods and sub types, etc.
Whilst it's possible to create your own custom interfaces and types to tie in with Linq, that fact alone doesn't discount the large body of extension methods, ... that make up .Net System.Linq and which are directly typed to the IEnumerable interface.
modified 28-Jan-22 12:49pm.
|
|
|
|
|
You're fixated on one type of implementation; the implementation is up to the author. The author decides how, when, and what to "yield"; which is the whole point.
There are no "rules" .
|
|
|
|
|
Nope... but I'm sure we can agree it's become rather pedantic.
An interface on its own is not a state machine; whilst the outcome of this type of implementation:
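using System.Collections;

public class Id : IEnumerable {
    private readonly int _value;
    public Id(int value) => _value = value;
    // The compiler rewrites this iterator method into a generated state machine class.
    public IEnumerator GetEnumerator() {
        yield return _value;
    }
}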
..technically yields a state machine; it does not make either IEnumerable or IEnumerator themselves a state machine... they're an implementation requirement.. but then (pedantically) we're getting into the weeds.
Plus this line of discussion has no bearing on the scope of the OP's question with respect to any of:
...they've had no problems mastering 'Select and 'Where, 'Aggregate, etc.
What's so useful about deferred evaluation ?
why do I have to trigger iteration with 'Count() to see how many items it has...
In that context IEnumerable should not be considered without Linq, because 'Select, 'Where, 'Aggregate, etc. are references to Linq; and in that context it is the extension methods of Linq that bridge IEnumerable to Linq's implementation of a state machine, and similarly Count() .
I also said...
Whilst its possible to create your own custom interfaces and types to tie in with Linq
It is not a requirement for custom Linq types to implement IEnumerable to use both syntactic styles of Linq. Nor would it be possible for product and sum types to conform to IEnumerable , because it's an implementation that is tied to IEnumerator , an API designed for iterating over a collection.
|
|
|
|
|
Gerry Schmitz wrote: It's an interface to a forward reading state machine who's final implementation is left up to the author. The whole "internal list" concept is an illusion.
I like the analogy of a finite state machine, even if that ... for my students ... is an advanced concept that would probably distract them. i'd rephrase that as: "forward reading finite iteration machine."
"list ... illusion:" trying to prevent the student from assuming there is an internal list is a priority when i teach.
"custom tokenizers:" interesting ... i assume you mean 'Select based IEnumerables.
|
|
|
|
|
@endofunk I am stupefied by the idea anyone guiding a "bright young student" would smother them in such a pile of tangled, redundant, overblown, irrelevant, rhetoric.
|
|
|
|
|
Tsk tsk... that type of reaction is IMO simply a reflection on an inability to grok the underlying complexity; not on the information and/or its relevance and/or its accuracy. It was after all intended to provide an indication of the complexity underlying the architecting of Linq.
Ps. Plus my response was to you, not the student -- a good teacher would be able to gauge the level of their audience and adjust the presentation of a complex topic to match that.
|
|
|
|
|
the sweat trying to show off produces often tastes like non-sequiturs
|
|
|
|
|
An accusation of "showboating"; concludes with an incongruous use of a Latin phrase... Derp
No knowledge is ever to be wasted or ever to be despised
-- Joseph Needham
|
|
|
|
|
endo funk wrote: Tsk tsk... that type reaction is IMO simply a reflection on an inability to grok
Tsk tsk...That reply suggests one or more of the following
1. You did not read the original post
2. You didn't care what the original post asked.
3. You don't understand the context of what student might mean in the original post.
|
|
|
|
|
Nope. I quoted his words (question), and responded to that... a fact you apparently missed.
|
|
|
|
|
endo funk wrote: Nope. I quoted his words (question), and responded to that... a fact you apparently missed.
Your first response in this thread (not sub thread)...
Did not quote anything.
Apparently failed to understand what "bright young student" means.
|
|
|
|
|
Pssst.
...to avoid coming across as either an illiterate or a moron, or a bit of both.
I suggest you compare the Big Bolded Headings before each response paragraph; for example:
Is there any real reason to use an IEnumerable ?
What's so useful about deferred evaluation ?
etc...
...these are in fact direct quotes from the OP's post. Not sure how you missed that..
|
|
|
|
|
I agree. Nothing in that would have been something that one should present in the context of the original question.
|
|
|
|
|
endo funk wrote: Yes; first off IEnumerable was Microsoft implementation of functional programming algebras for a List. It is essentially a monadic data type that encapsulates List; monadic data types are typically designed to conform to a fairly standard set of functional algebras i.e. providing a common API for computations.
Err..no it wasn't.
IEnumerable existed long before linq. It wasn't added to the libraries in any way shape or form to support functional programming.
Now one could make an argument that they should have started from scratch when they did add linq but that is an argument and not a statement of fact.
One could also argue that they should have redone IEnumerable to support linq. However that argument would have been completely wrong, since C# was a practical language, not a university exercise. Redoing it would have broken the entire existing body of practical work that was using it. And it would have been exactly the wrong thing to do for any language when attempting to enhance and not replace that language.
|
|
|
|
|
Extension methods... apparently so obvious that you missed it.
|
|
|
|
|
endo funk wrote: Extension methods... apparently so obvious that you missed it.
Apparently you do not know what you said...
"first off IEnumerable was Microsoft implementation of functional programming algebras for a List. "
And again, no it was not. It existed long before anything associated with functional programming was in the language.
Refutation of that statement has nothing to do with extension methods which were added long after IEnumerable. I can only suppose that you are claiming that they could have used extension methods when implementing linq. Perhaps. But still that has nothing to do with your original incorrect statement.
|
|
|
|