Nested IFs (early exits) vs. OR/AND?
I'm curious what others' thoughts are, or if there's been any testing done regarding the performance difference between nested IFs vs. OR/AND. I found a very old thread on this topic from 2014, but I expect the thinking & research has progressed a lot since then.
Let's say I have multiple criteria I need to check in order to produce a final determination. Using careful construction of nested IFs, I can allow early exits from the formula, which supports the P (Performance) rule of PLANS, however the resulting formula can be very long and convoluted. Using OR/AND results in a much simpler formula (L - Logical and A - Auditable in PLANS) but does not allow for early exits. In situations where one case is much more common than other(s), I imagine this could make a significant difference, especially over long timescales, many dimensions, etc.
What do you prefer to use?
Best Answer
-
The right hand side of an AND is only evaluated if the left hand side is true, and the right hand side of an OR is only evaluated if the left hand side is not true. Use IF when the value evaluated as the condition will not appear in the result. You can optimise AND/OR expressions if you know the ratio of true/false evaluations in your data set and keep more expensive Boolean subexpressions to the right.
If you find long (or even short if text) subexpressions repeated multiple times in a single formula it may be more performant refactored into multiple line items/properties, especially if it enables parallel evaluation and/or reduces the dimensionality. It will certainly be more manageable, although a trade-off is the extra memory required.7
Answers
-
I think it is a matter of judgement.
Is your model already big and performance at the limit ? favor performance.
Is this formula the center of the model and needs to be understood by all and often ? favor auditability
Both ? Then actually measure the performance gain. Do you gain half a second of calculation or 5 ?
7 -
First, great topic! An AND by nature performs faster because both have to be true whereas an OR is like a wildcard in SQL. But first, you have to look at the S for sustainability/scalability, you should be breaking the long nested IF THEN statements up. When you have these long formulas, you are making the formula sequential meaning it has to go through the long if statement until it hits an early exit (great point by you). But still, if the logic never hits the early exit and goes all the way through the end of the formula, that is bad on performance because it is doing the same thing for every cell that line items has. So, it is best to break it up. A great analogy that David spoke about was a highway. Would you rather have 144 lanes going the same direction or just one?
Rob
6 -
@rob_marshall To be clear, when you say break it up, do you mean into separate line items? I agree and definitely try to do that where space allows. However, I am more interested in the difference in constructing the final formula.
For instance, if I have conditions with the following results:
and I want a 1 where any of the 3 are true, I can do:
"if A OR B OR C then 1", or
"if A then 1 else if B then 1 else if C then 1"
Is there a performance difference between these? My gut reaction is that the first would be slower since it has to evaluate all 3 conditions every time, but maybe that's not the case.
@nathan_rudman Thanks for the response! I'm not sure what the real performance difference is (just my guess as above, which might be wrong) and find it very difficult to nail down, particularly in a general case like this where it's less about optimizing 1 terrible formula and more making sure I take the right approach because small differences add up. If you have any tips for measuring performance difference, I would love to hear them!
2 -
This is really useful to know!0
-
Thank you @ben_speight; this is super handy to know and exactly what I was looking for.
Thanks all for the thoughtful replies! I had posted a follow-up on Friday but it looks like it was deleted somehow; not sure why that happened.
0 -
in your example, I believe "if A OR B OR C then 1" will be faster than "if A then 1 else if B then 1 else if C then 1" because the 2nd one has to go through the three different IF statements whereas the first does not.
Rob
0