New Formula Idea : Convert boolean into a number and stop IF BOOLEAN THEN X ELSE XX
I have to do a lot of controls (boolean) and base and theses controls give me a value based on its true or not.
I do a lot of these technicals line items for :
I code it in 3 steps :
- 1) Business Element (Number) : Uselly a business value
- 2) Control (Boolean) : Management rules that I want to verify (ex : Business Element > 0)
- 3) Color Coding (Number) : Based on the control I will apply a value in order to build a color coding.
It is not very complicated to do. Nevertheless, this 3rd step forces me to write it this way :
Color Coding = IF Control THEN XX ElSE XX
I try to split my line items when I can in order to have the best lisibility but this 3rd step create complexity where there is not.
It would be great to have a direct formula to convert a boolean into a number. I image a formula that would be such as :
CONVERT(Condition, true value result, false value result)
Condition : Boolean LI
True value result : Number
False value result : Number
Source : Boolean
Output : Number
Would this 'Convert' work the same as IF in excel? It could also be generalised; as long as true value result and false value result where the same type it could be generalised to not just a number.
In your example though wouldn't you just do "IF BusinessElement > 0 THEN XX ELSE XX"? You say you want to break out the "Business Element > 0" into its own line item for visibility, but then wouldn't you want to do the same for CONVERT for the same reason?0
Thank you George for you answer. I will deep dive a little more.
Would it be the same as IF ?
Well, All the formulas I have seen in Anaplan have a designed format result. Not sure, Anaplan has the capability to have a same formula that would give different format result.
Auditability & Performance of the IF/THEN/ELSE :
The problem its not really breaking line items. Breaking line items will lead to improve performance in your model. According to best practices, if size isn't an issue you should break as much as possible your line items to improve performance. Breaking them should also give you more lisibility.
However, It is more the visibility of that 3rd LI that distrubed me. Indeed, you break your line items to improve performance and also lisibilty. Nevertheless, at the end you still have to do an IF/THEN/ELSE statement that is complex/un natural to read.
Moreover, IF/THEN/ELSE statement is a formula that is likely to underperform. In order to have the best performance in your model you should write your IF/THEN/ELSE statement in a way that the "THEN" condition would be the more likely to happened than the "ELSE" statement.
Therefore, for a simple request you would have to add a complex approach and sometimes you would have to write your line item such as IF NOT Condition THEN False Result ELSE True Result.
Saying so, in order to have a model as efficient as possible you loose a lot of lisibility treating with boolean. Thus, it would be great to have a fonction better designed function then IF/THEN/ELSE to improve both aspects :
- Lisibility in order to be understood in a second
- Performance by calculating at the same time the 2 possibles results.
Hi @Nico, Thanks for the very detailed response, it's great to have this kind of discussion.
There are formulas in Anaplan that vary their result format. Simple examples would be YEARVALUE() OR LAG(). Of course IF...THEN..ELSE varies it's result too, although the syntax is different to a typical 'Function'.
Personally I find IF..THEN..ELSE easier to read then CONVERT(,,) (particularly given the formula editor now formats this over multiple lines with proper indentation), but maybe that's just my background in programming more than a background in Excel showing.
I understand the principle of putting the common condition outcome in the THEN and the uncommon one in the ELSE. Rather than necessitating a NOT Condition in the IF wouldn't you just negate the condition (e.g. in your example "Business Element > 0" becomes "Business Element <= 0")? I understand that sometimes this may make the naming of the 'Condition' line item a bit odd however, therefore slightly increasing cognitive load in some cases. Most of the times I've had to do this I've found it fairly straight forward though as I've found the common case usually has a simpler concept than the uncommon one and in the cases where it's easier to use the original condition I'd just accept the "NOT Condition" in the IF.
A hypothetical CONVERT function that calculated both possible outcomes such that you got the same performance in either case would just be as bad as the worst case with an IF..THEN..ELSE. At the end of the day calculating values takes time. IF...THEN...ELSE has a performance improvement in one case exactly because it doesn't always have to calculate the ELSE part. If you're suggesting CONVERT always does, then performance would suffer.
Again, it's great to get a different viewpoint, and I really hope I'm not coming across as argumentative or anything like that.0
Great talk here,
I never used excel too much. However, in all my other programming languages, I was always advised against using "IF/THEN/ELSE" because there were always more advanced features.
We are aligned with the expression of the condition and its writing. I wrote the expression "NOT" to be as clear as possible.
What I understand about "IF THEN ELSE" is that the function calculates all the time the "THEN" and if the condition is not respected the "THEN" and the "ELSE".
That's why we are asked as developers to flip our condition sometimes.
The performance should be agnostic of the way the condition is written. I though calculating the two possibles outcomes and display only the value depending on the condition would be more efficient.
Maybe the solution is a more advanced While() loop ?0
Interesting that you've been advised to avoid IF/THEN/ELSE in other programming languages; I've never really come across that advice myself (I program in C# and Typescript mainly). I have found a few posts about avoiding IF/THEN/ELSE in the context of imperative programming languages where there is a control flow (a list of statements executed one after the other), rather than the language being an expression that defines a result (like Excel/Anaplan). In those languages you'd have things like object inheritance etc that could be used rather than IF, but that doesn't apply to Anaplan. To my mind any valid argument against using IF/THEN/ELSE would also be a valid argument against using a CONVERT(,,) method for the same reasons.
My understanding of why you'd put the common case in the THEN not the ELSE is because when a CPU executes instructions it performs look-ahead and in a sense starts doing the following instruction while it's doing the current one. If it gets the result of the condition and it's false it needs to 'undo' what it did for the THEN clause so it can then do the ELSE where as if the condition is true it can carry on as normal. I guess as Anaplan's advice is the same it works in a similar way. If that assumption is correct then the only way for performance to be agnostic is if it calculated the two outcomes regardless of which is subsequently to be used. That, though would be less performant in the case where you (the Anaplan model builder) correctly anticipated the common case and put that first since in that case Anaplan can avoid calculating the ELSE entirely. I'd imagine that cost saving (avoiding calculating the ELSE) vastly outweighs the small number of times where it has to pay the 'undo' cost of doing the THEN work that it started doing. Basically with a CONVERT(,,) it has to work out which one to display, rather than with IF/THEN/ELSE, which one to calculate. In either case it's still got to do that work, except with CONVERT it's also always calculating both the THEN and the ELSE, but with IF/THEN/ELSE it's only got to calculate the ELSE sometimes.
You said you want performance to ideally be agnostic to the way the formula is written (and I'd agree with that). What I'm saying though is I'm happy if I'm able to increase performance by reworking a formula, providing performance if I don't rework it is no worse than if rework it didn't make a difference.
Not sure what you mean re using a While() loop? I'm not really sure what that would look like within Anaplan.0
Hi @Nico & @GeorgeDuckett,
Really nice discussion, I hope Anaplan would have more like this.
I think such solution already exists (to some extent). You can type IFs just like in Excel, so using formula syntax IF(X,Y,Z). It's faster to type (and if you are used to Excel, you don't have to change your habits), but once you hit the Enter/Apply, Anaplan converts it to "normal", IF/THEN/ELSE syntax (just like with typing formulas in lowercase, they all get UPPERCASED). Personally, I think keeping IFs in one format is good for consistency and readability, so it's good that Anaplan allows for this alternative syntax but later uniforms it.
About the concept of putting more frequent outcome first, I think it's just erroneous interpretation of developer's explanation (but this is my hypothesis, so it'd be good if someone who really knows how Anaplan is coded would confirm or debunk). I think that some developer was describing the concept of "early exit" from IFs, so that if you have nested IFs, layers should be arranged from most to least probable. How statements are arranged in a layer shouldn't matter. And I think it got misinterpreted or simplified to "put most probable outcome first". And now people think they should even do IF NOT A THEN C ELSE B just because false=A is more probable than true=A.
Another thing is "branchless programming", which is a good concept about avoiding conditional statements in programming. As you described, it reduces the load on CPU as it doesn't need to guess what it should load next/doesn't need to load that much. Just structuring IF as a function probably wouldn't help with performance, as it would translate to the same instructions, but I think there are cases where branchless programming can be applied in Anaplan: SIGN() formula. I sometimes see formulas like "IF A>0 THEN 1 ELSE IF A<0 THEN -1 ELSE 0", which can be simply replaced by SIGN(A). Even if it's just "IF A>0 THEN 5 ELSE 0" (and you don't expect negative numbers), you can just do SIGN(A) * 5. Another example: "IF A<>0 THEN B ELSE 0" translates to ABS(SIGN(A)) * B. But again, I am just making an assumption that SIGN is faster than IF (it is for CPU, but who knows what's between Anaplan and bare metal). And about examples 2 and 3, maybe I am wrong because i.e. multiplication is quite expensive (usually) for CPUs, so it's actually faster to do IF than multiply two numbers?
PS This short notation of IF and other Boolean operators has been in Anaplan for years, and I wrote about it just two months ago after learning (and being surprised) that not many people know about it: https://community.anaplan.com/t5/Anaplan-Platform/Creating-a-Module-Report-but-need-to-bring-detail/m-p/133954#M347300
@M.Kierepka Yeah, great point about the IF(,,) syntax conversion; that's basically allowing @Nico to type in the format desired, even if it does end up converting it to IF...THEN...ELSE syntax.
The point about avoiding branching statements is a good one too. Like you say, in the general case CPUs don't like branches compared to most other operations. It may be the case that is even more relevant to Anaplan depending on what's going on under the hood too as branches limit the scope for some forms of parallelisation, which I'd hope Anaplan takes advantage of (using SIMD or maybe even GPU Compute).
What would be really nice (getting off topic now though I guess), would be able to get some kind of metrics for model calculation time broken down by the formulas used. I'd imagine that is probably not at all as simple as it might sound though.0
@GeorgeDuckett @M.Kierepka - definitely getting off topic, but like where we are going. Agree with the idea that the explanation for this in the planual could be a mistranslation / mis-simplification of 'early if exist' which should only apply to nested IFs.
I like your SIMD idea @GeorgeDuckett . A couple of guesses of my own to add (although you look to be more experienced than me in the area):
- The Branch Predictor: (a bit of circuitry in the CPU) is doing its job and guessing which way to go in IF statements based on the recent execution history --> This means we should see the order the (1 layer) IF condition is written as having no impact on the performance
- Optimising Compilers: When a simple IF statement compiles (e.g. "IF A? Then 1 ELSE 0"), the compiler spots the opportunity and optimises it to a one step operation "A? AND 1" when it compiles (which is branchless @M.Kierepka ) ---> This means we should expect the "B = IF A THEN 1 ELSE 0" statement to perform as fast as referencing ("B = A")
I'd be interested in hearing someone with the know-how at Anaplan giving us their thoughts / telling us how wrong we are 😂0
Get Started with Idea Exchange
See our Submission Guidelines and Idea Evaluation Criteria, then start posting your own ideas and showing support for others!