|
Assuming PLINQ's implementation is not terrible, you're probably incurring locking overhead. It doesn't make sense to try to use any kind of parallelization in the following scenarios:
a) Your problem has interdependent components: you can't decouple the work done by B from the result of A, and C depends on the results of both, so you're elephanted.
b) It doesn't do you a heck of a lot of good to query the same source in parallel with itself. It's hard to give a good example in PLINQ, but you want parallel op A to use a different data source than B. In an RDBMS this principle is easier to understand: if I run a join across two tables, there isn't much I can do to make it parallel *unless* each table is on a separate drive ("spindle" in DB parlance), meaning the reads of table A aren't stuck waiting behind the reads of table B, since separate drive controllers can work in parallel. The same basic idea applies to PLINQ.
If (a) or (b) is an issue, you'll probably end up incurring more overhead than you gain in throughput.
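To make (b) concrete, here's a contrived little sketch (the names are made up): every worker funnels through one lock, so AsParallel() buys you contention, not throughput.

// Minimal, hypothetical sketch: parallel workers serialized by a shared lock.
using System;
using System.Collections.Generic;
using System.Linq;

class ContentionDemo
{
    static readonly object _gate = new object();
    static readonly List<int> _shared = new List<int>();

    static void Main()
    {
        // Every "parallel" worker ends up waiting on the same lock,
        // so the query is effectively sequential plus scheduling overhead.
        Enumerable.Range(0, 1_000_000)
            .AsParallel()
            .ForAll(i =>
            {
                lock (_gate)          // shared resource: scenario (b)
                {
                    _shared.Add(i * i);
                }
            });

        Console.WriteLine(_shared.Count);
    }
}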
Real programmers use butterflies
|
|
|
|
|
I tried with some code I have from my long-past Physics PhD that integrated some equations over time... I had a multidimensional field and each dimension was in its own thread...
Mmm... come to think of it now, there was coupling between some of the variables, I think. I wonder if that was the cause of the slowdown... no matter... not sure where this code even is now ^^
|
|
|
|
|
It's very likely. It can be really easy to miss interdependencies in formulas.
Real programmers use butterflies
|
|
|
|
|
Is LINQ bad-ish, or orders of magnitude slower than those hand-written operations? I ask because I wonder about the performance implications myself, while also weighing code read-/maintainability (after all, if performance were all I cared about, I'd hand-optimize everything in assembly; engineering is all about trade-offs).
|
|
|
|
|
Usually it's not terrible. Not orders of magnitude slower for what most people seem to use it for - queries in business software.
However, don't use it for what I'd call "real" functional programming.
If you're going to write a parser generator or scanner generator, for example, you don't want to compute your tables using LINQ. In that case it *will* be orders of magnitude slower than almost anything you could write by hand.
And I guess now you can tell what kind of software I write.
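If you want to see what I mean, here's a rough, hypothetical micro-benchmark sketch - not from my generator, just the shape of the comparison; exact numbers will vary by machine.

// Hypothetical micro-benchmark: the same filter/transform via LINQ and via an
// explicit loop. The loop avoids the per-operator iterator allocations, which
// is where the gap shows up in tight, heavy iteration.
using System;
using System.Diagnostics;
using System.Linq;

class LinqVsLoop
{
    static void Main()
    {
        var data = Enumerable.Range(0, 10_000_000).ToArray();

        var sw = Stopwatch.StartNew();
        long linqSum = data.Where(x => (x & 1) == 0).Select(x => (long)x * 3).Sum();
        sw.Stop();
        Console.WriteLine($"LINQ: {sw.ElapsedMilliseconds} ms, sum = {linqSum}");

        sw.Restart();
        long loopSum = 0;
        foreach (var x in data)
            if ((x & 1) == 0)
                loopSum += (long)x * 3;
        sw.Stop();
        Console.WriteLine($"Loop: {sw.ElapsedMilliseconds} ms, sum = {loopSum}");
    }
}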
Real programmers use butterflies
|
|
|
|
|
Ah, I understand. Thank you for the explanation! Full disclosure: I don't write generators. I prefer shouldering the burden of making something data-driven from the get-go rather than going through the intermediate step of writing a generator (that generates something that gets the actual job done). I find the one-step approach way easier to debug, to adapt to future requirement changes (which will of course come, because that's what requirements simply do), and to teach to someone freshly joining the team.
Should I ever be explicitly required to write a generator (instead of getting things done one way or another), I'll heed your words.
|
|
|
|
|
It's not so much about the code generation per se, but the kind and amount of iteration you'll be doing.
Consider the following source file that generates an LR table. The code is ugly because the algo is ugly. There's not much way around it. See the accompanying article for an explanation of the algo if you want:
Downloads: GLR Parsing in C#: How to Use The Most Powerful Parsing Algorithm Known[^]
The point in showing you this is the iteration code to generate things like the LRFA state graph.
When I say generate above, I'm not talking about code generation, but simply computation of tables.
Trying to do those things - massive recursive iteration - with LINQ is just a mug's game.
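To be clear, that's not the article's code - just a toy, made-up sketch of the kind of worklist/fixed-point iteration I mean, written long-hand:

// Toy illustration only (hypothetical transitions, not the real generator):
// the worklist / fixed-point style of iteration that table construction
// does over and over, written long-hand instead of as nested LINQ queries.
using System;
using System.Collections.Generic;

class ClosureDemo
{
    // Hypothetical transition map: state -> directly reachable states.
    static readonly Dictionary<int, int[]> Edges = new Dictionary<int, int[]>
    {
        [0] = new[] { 1, 2 },
        [1] = new[] { 2 },
        [2] = new[] { 0, 3 },
        [3] = Array.Empty<int>()
    };

    static HashSet<int> Closure(int start)
    {
        var seen = new HashSet<int> { start };
        var work = new Stack<int>();
        work.Push(start);

        while (work.Count > 0)              // classic worklist loop
        {
            var state = work.Pop();
            foreach (var next in Edges[state])
                if (seen.Add(next))         // only enqueue unseen states
                    work.Push(next);
        }
        return seen;
    }

    static void Main() => Console.WriteLine(string.Join(", ", Closure(0)));
}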
Real programmers use butterflies
modified 24-Feb-21 8:22am.
|
|
|
|
|
Speaking of ugly code due to an ugly algo, the VIF tables in the M-Bus standard are a bloody nightmare.
|
|
|
|
|
|
There is only one language integrated query, and it is scatter memvar / gather memvar.
|
|
|
|
|
I actually love Linq. I find just the opposite. After you learn it well, it is very powerful and very fast. I once wrote an in-house version of the FBI's CODIS search engine, which requires very complicated operations - millions of them - because you are searching huge genetic identity (DNA) databases. Linq handles them very well. Plus, because you can split operations between cores/processors, it is very fast. Linq is my go-to solution for lots of scientific software. I especially like its Join and GroupBy. I do a lot of Sql. The way I do it now is that I developed a very fast transfer from Sql to a list of classes and then do Linq operations on the list. Extremely fast and more powerful than using Sql operations. Certainly a lot easier than Sql. So, I have to respectfully disagree with the common bashing in these posts, recommend investigating how Linq works to its fullest, and suggest that you will change your minds.
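As a made-up sketch of that pattern (the classes here are hypothetical, not my actual CODIS code): pull rows into plain classes, then Join and GroupBy in memory.

// Hypothetical sketch: rows loaded into plain classes, then joined and
// grouped entirely in memory with LINQ.
using System;
using System.Collections.Generic;
using System.Linq;

class Sample { public int Id; public string Locus; }
class Result { public int SampleId; public int Allele; }

class JoinGroupDemo
{
    static void Main()
    {
        var samples = new List<Sample> { new Sample { Id = 1, Locus = "D8S1179" } };
        var results = new List<Result> { new Result { SampleId = 1, Allele = 13 },
                                         new Result { SampleId = 1, Allele = 14 } };

        // Join samples to results on Id, then group the pairs by locus.
        var byLocus = samples
            .Join(results, s => s.Id, r => r.SampleId, (s, r) => new { s.Locus, r.Allele })
            .GroupBy(x => x.Locus);

        foreach (var g in byLocus)
            Console.WriteLine($"{g.Key}: {string.Join("/", g.Select(x => x.Allele))}");
    }
}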
MeziLu
|
|
|
|
|
MeziLu wrote: Plus, because you can split operations between cores/processors, it is very fast.
That's PLINQ.
Read my comment again.
Real programmers use butterflies
|
|
|
|
|
Thank you for your comment. I meant it as part of the entire package, not as a singled-out feature; the point is that you need to consider Linq in its entirety, not this feature or that feature.
MeziLu
|
|
|
|
|
I feel like I did, which is why I made a whole exception for PLINQ. I even agree with you, if you'll read my comment, that PLINQ is where LINQ finally pays for itself.
Where we disagree is on basic LINQ. Not PLINQ.
I covered readability. I don't think LINQ helps there, because I operate under the idea that if you can't understand exactly what your code is doing, it doesn't matter that the code is concise. LINQ is concise, but not easy to read: it's harder to mentally translate a LINQ call of any real-world complexity into the series of iteration operations it's going to perform than it is to do the same with nested foreach and if statements. LINQ is shorter, but that doesn't make it readable; it just means the code is concise, which isn't the same thing, IMO.
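A contrived example of what I mean - the LINQ version is shorter, but to know what it costs you still end up expanding it into the loops below in your head:

// Contrived illustration: the same query in LINQ form and as the iteration
// it actually performs.
using System;
using System.Collections.Generic;
using System.Linq;

class ReadabilityDemo
{
    static void Main()
    {
        var orders = new[] { (Customer: "A", Total: 10m), (Customer: "B", Total: 99m) };

        // LINQ: concise.
        var bigSpenders = orders.Where(o => o.Total > 50m)
                                .Select(o => o.Customer)
                                .ToList();

        // Long-hand: the loops hiding behind the LINQ form.
        var bigSpenders2 = new List<string>();
        foreach (var o in orders)
            if (o.Total > 50m)
                bigSpenders2.Add(o.Customer);

        Console.WriteLine(string.Join(",", bigSpenders) + " | " + string.Join(",", bigSpenders2));
    }
}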
Real programmers use butterflies
|
|
|
|
|
Wow, I must be wired differently. Readability is one of the things I like most about Linq. I place each operation on its own line and it's like this "outline" or flow diagram that I and my colleagues can instantly understand. Anyway, good chatting with you, but I need to take care of some urgent things now.
MeziLu
|
|
|
|
|
One final quick comment: There are actually two different approaches to Linq: extension methods and lambda-like. I totally agree with you about the latter. I can't follow or understand that to save my life and never use that approach. I avoid the lambda-like approach like the plague. Extension methods are a dream come true, but that might be because I write a lot of extension methods myself.
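For reference, the same query in the two usual forms - the SQL-like query-expression syntax and chained extension methods with lambdas (which of these counts as "lambda-like" above is open to interpretation):

// The same query two ways; which form reads better is the taste question here.
using System;
using System.Linq;

class SyntaxDemo
{
    static void Main()
    {
        int[] nums = { 5, 3, 8, 1 };

        // Query-expression syntax.
        var smallQuery = from n in nums
                         where n < 5
                         orderby n
                         select n;

        // Extension-method (fluent) syntax with lambdas.
        var smallFluent = nums.Where(n => n < 5).OrderBy(n => n);

        Console.WriteLine(string.Join(",", smallQuery) + " | " + string.Join(",", smallFluent));
    }
}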
MeziLu
|
|
|
|
|
One other point about parallel processing: When I do that, I always actually check the speed with a stopwatch to make sure I am getting a faster speed. Sometimes it is actually slower, but for the kinds of things I do, it is most often a lot faster. Just a word of advice: don't assume speed boosts. Overall, even without parallel processing, I find that Linq is blisteringly fast.
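Something like this minimal sketch is all I mean by setting a stopwatch - time both versions of the same query before committing (made-up workload, the numbers will differ per machine):

// Minimal "measure, don't assume" sketch: sequential vs. AsParallel().
using System;
using System.Diagnostics;
using System.Linq;

class MeasureDemo
{
    static double Expensive(int x) => Math.Sqrt(x) * Math.Sin(x);

    static void Main()
    {
        var data = Enumerable.Range(1, 5_000_000).ToArray();

        var sw = Stopwatch.StartNew();
        double a = data.Select(Expensive).Sum();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms ({a:F0})");

        sw.Restart();
        double b = data.AsParallel().Select(Expensive).Sum();
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms ({b:F0})");
    }
}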
MeziLu
|
|
|
|
|
Performance should never be the only consideration, and in most cases not even the most important one. One big advantage of LINQ is that you have the same set of functions no matter what data source lies behind the data. It might be a local System.Collections.Generic.List generated on the fly, but it might just as well be an SQL or SOAP connection providing the data collection. Either way, you should always consider the cost when calling LINQ functions. So calling Enumerable.Count() without an important reason is usually not a good idea, as this can iterate over all items.
Another advantage is readability. Of course, only when you know how to read LINQ.
In performance-critical scenarios, though, the cost of allocating an enumerator just for the sake of being able to use LINQ (e.g. for an array, which foreach can otherwise iterate without one) might be relevant. But you should decide on a case-by-case basis instead of generalizing the decision for or against LINQ. "Only a Sith deals in absolutes."
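A small sketch of the Count() point, assuming a lazily produced sequence (the Rows() source is hypothetical): Count() has to walk everything, while Any() can stop at the first element.

// Sketch: Count() vs. Any() on a lazily generated sequence.
using System;
using System.Collections.Generic;
using System.Linq;

class CountDemo
{
    static IEnumerable<int> Rows()
    {
        for (int i = 0; i < 1_000_000; i++)
            yield return i;                  // pretend each row is expensive to produce
    }

    static void Main()
    {
        // Walks all 1,000,000 items just to answer "is it non-empty?".
        bool slow = Rows().Count() > 0;

        // Stops after the first item.
        bool fast = Rows().Any();

        Console.WriteLine($"{slow} {fast}");
    }
}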
|
|
|
|
|
I feel justified in making the general observations I made about LINQ.
And my comment wasn't just about performance. It was also about cognitive load in terms of understanding what your code is doing as well.
And performance considerations are indeed important *if* they influence architecture, at which point potential perf problems are best identified at design time rather than after you've already architected something that will not perform to requirements.
Using LINQ to implement all of your functional-programming-style operations is a "Bad Idea(TM)" when you're doing loads of heavy iteration, like building parser or scanner tables. Most pure functional languages like Haskell handle iteration a lot better than LINQ, if for nothing else than the fact that it's a first-class operation there. Enumerators in .NET were not designed as first-class operations but were built on top of the existing machinery in .NET, and practically, that comes with performance considerations, like all the object creation involved.
If you don't believe me, write a LALR(1) parser generator using LINQ and then one without. As long as you know what you're doing, the latter will be at least twice as fast.
Real programmers use butterflies
|
|
|
|
|
The other issue is that for those of us who write in VB.Net, the LINQ syntax is convoluted and ugly. At least in C# the LINQ syntax is concise.
As for clarity, simple LINQ statements can be clear, but a loop can always be clearer. The other issue with LINQ is trying to figure out what the actual T in the resulting IEnumerable<T> is. This is why C# added the var keyword: LINQ doesn't lend itself to clean typing.
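A small illustration of the typing issue (contrived example): a projection into an anonymous type has no name you could write on the left-hand side, so var is the only option.

// Contrived example of why var shows up around LINQ projections.
using System;
using System.Linq;

class VarDemo
{
    static void Main()
    {
        var people = new[] { new { Name = "Ada", Born = 1815 } };

        // The element type here is an anonymous type, so var is the only option.
        var adults = from p in people
                     where p.Born < 1900
                     select new { p.Name, Age = 2021 - p.Born };

        foreach (var a in adults)
            Console.WriteLine($"{a.Name}: {a.Age}");
    }
}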
|
|
|
|
|
Actually, I would discourage the use of var and encourage creating a class up front. This forces you to design what you want up front, which not only makes designing easier but, more importantly, means you can serialize the class (with its groups, lists, etc.) to an XML file and review it to make sure you are getting what you want. Var, IMHO, is a terrible way to go, especially for long-term maintenance should something unexpected pop up. Serializing a class is extremely useful, especially for groupings, joins, SelectMany, etc.
The tree-view-like structure in a serialized XML file is incredible to look at for complicated solutions.
Guess another thing I am saying is that using var promotes backwards design: create the Linq and hope for the best vs. know what you want and design the Linq to get you there, using serialization to test.
Further, having the class greatly improves readability, a concern some have expressed. It's a lot easier to know what Linq is doing if you can see what it produces.
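Roughly what I mean, as a made-up sketch (hypothetical types, not my real code): project into a named class instead of an anonymous type, then serialize the result to XML to eyeball the shape.

// Hypothetical sketch: project LINQ results into a named class and serialize
// them to XML for review.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

public class LocusGroup          // hypothetical result shape
{
    public string Locus { get; set; }
    public List<int> Alleles { get; set; }
}

class SerializeDemo
{
    static void Main()
    {
        var raw = new[] { (Locus: "D8S1179", Allele: 13), (Locus: "D8S1179", Allele: 14) };

        // Named class instead of an anonymous type, so it can be serialized.
        List<LocusGroup> groups = raw
            .GroupBy(r => r.Locus)
            .Select(g => new LocusGroup { Locus = g.Key, Alleles = g.Select(r => r.Allele).ToList() })
            .ToList();

        var serializer = new XmlSerializer(typeof(List<LocusGroup>));
        using var writer = new StringWriter();
        serializer.Serialize(writer, groups);
        Console.WriteLine(writer.ToString());   // review the tree-like structure
    }
}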
Regarding speed: I use PLinq for Next Generation Sequencing of the human genome. Six billion data points. It is plenty fast. Same goes for my CODIS search engine. CODIS is what you see on crime lab TV shows: here is the DNA profile from a crime scene - are there any matches? PLinq was a godsend for that complicated process and is literally a million-fold faster than Sql. Example: a 47-minute Sql search took 18 seconds with PLinq.
MeziLu
modified 11-Feb-21 10:07am.
|
|
|
|
|
|
I started reading it, and it seems we're on the same page as far as general approach toward optimization.
Optimization starts in the design phase during requirements gathering. Performance is either an explicit or unwritten requirement of any application. No application can take forever to perform. How long is acceptable is a question of design.
Optimization continues through project planning - choosing the platforms and tools, and even the right data structures and patterns to accomplish your tasks. You don't garbage-collect driver code. You don't use a Dictionary where a LinkedList would be more appropriate. These are design decisions, the first one high level, the second one more specific, but still design decisions.
Only after that does the phrase "optimization is evil" come into play. Because at this point, if you're optimizing, you're optimizing something you should have optimized during design, or you're trying to bit twiddle to work around something that again, should have been optimized during design.
It's way more efficient to optimize up front during the design and planning phase, rather than after the fact when you are locked in and your options for improving performance are limited to bit twiddling.
Unless you're doing embedded, though, counting cycles isn't important. Optimizations should be done at the algorithmic level - look at the Big O, not at how to shave a cycle here or there.
Real programmers use butterflies
|
|
|
|
|
"I tend to... make my iteration operations explicit and long hand."
Thanks! I thought I was the only one. So much more readable when I come back later, IMHO.
|
|
|
|
|
I think the premise of Linq was to separate database operations (T-SQL) from C# developers.
Personally I think they have replaced T-SQL with something that is not easy to learn and understand - exactly the opposite of what T-SQL is.
If one does not do Linq each and every day, but only once in a while, then it is difficult to quickly comprehend what a Linq query is doing when you're asked to maintain it.
Second, in Microsoft's universe there are ample resources (developers, fast computers, fast internet connections), and Azure is free. Most of their big clients are in a similar situation. And Microsoft can afford to hire the best of the best. That is not necessarily true for their big clients, and probably not true for the bottom of their customer pyramid.
Very few within Microsoft have to maintain complex codebases for long periods of time. And we can see what happens when they have to: how many repeated breaking bugs have you seen in VS? New stuff is the focus, not fixing what is broken.
Consider any product demo or new-feature demo. It is thrown together by the back-office developers to demonstrate the concepts of the new capabilities, carefully avoiding any and all complexities that would pop up in a production environment. Its lifetime is a few weeks, because by then Microsoft devs have moved on to the next thing, a new iteration, and the terminology has changed.
If performance is not an issue, and one would prefer to keep database operations separate from the dedicated C# team, then Linq is the way to go.
You just need to ask yourself: is this Microsoft flavor of the day the best solution for me?
And to the person who thought the orderby and groupby operations were easy to understand: you should make a YT video and spread your knowledge. One can look at 5 Linq videos on this subject and be none the wiser.
|
|
|
|
|