The Testing phase tests usability (including model performance) and manages bugs and change requests generated from testing. Focus testing efforts on ensuring the model does what the user expects it to do, as well as predicting how the model will perform in a robust environment. Considerable tools, time, and resources go into testing, so testing must be scoped into the project during the scoping phase. Generally speaking, there are two types of testing:

- Model functionality and usability testing
- Model performance testing (automated simulation of user activity at expected concurrency, with measurement of response times)

| | Tasks | Deliverable | Tools |
|---|---|---|---|
| Testing | Test execution (joint ownership with client); performance validation with concurrent user loads; triage and fix defects | Final performance testing report; updates to the model based on UAT feedback; defects and change requests tracking | J-Meter, Splunk, L3 Support, Agile Implementation – The Anaplan Way App |
| Data | Integration updates to imports and exports based on UAT feedback | | N/A |
| Process | Updates to dashboards and modules based on UAT feedback | | N/A |
| Deployment | Updates to user access rights based on UAT feedback | Weekly status report | Agile Implementation – The Anaplan Way App |

Determining Testing Types

To structure testing that ends with user acceptance, some tests rank higher in priority than others. Gather what you need to know about the model's production performance. Consider these three factors:

- Model size
- Model complexity
- Concurrency: the number of users accessing the platform

The table below serves as a guide for determining the focus areas that produce the most reliable feedback.

| | High | Medium | Low |
|---|---|---|---|
| Size | X | X | |
| Complexity | X | X | X |
| Concurrency | X | | X |
| Tests needed | Load, Calculation, Concurrency, Usability | Load, Calculation, Usability | Calculation, Concurrency, Usability |

Assess the state of a model with a variety of these tests. Prepare as if testing a large, complex, high-concurrency model.
Testing often fails due to poor advance planning. When a project starts, testing seems a distant milestone, and it is easy to avoid factoring it in until much later. However, a good testing process should be planned from the start of the project. This includes:

- Allowing enough time to run the testing and make tweaks
- Identifying individuals for live (UAT) testing
- Identifying the testable criteria upfront (as early as user story writing)
- Identifying potential hurdles to testing (technical limitations, global audience, and localization)
Because humans do unpredictable things, expect a labor-intensive UAT (humanoid) testing process. Human testers follow a prepared script and execute steps over a set period of time to test for errors and performance problems. In addition to testing concurrency, capture other critical information from human testers:

- Use of the model with different bandwidths
- Use of different operating systems
- Location differences
- Browser compatibility

This data can help identify and resolve usability issues. You cannot know whether a tester follows the script as you anticipate. If a tester becomes bored or distracted, steps can be missed and areas of the model with bugs can go undetected.
Humanoid testing assesses a set of criteria in the hope that human testers accept the model's functionality and usability. Key questions used to direct the testers include:

- Is the model fit for its purpose? The model must meet its intended goal and satisfy all stated objectives.
- Is it usable? The model must be intuitive and must not create confusion or frustration for users. It must perform as expected for single users and multiple users.
- Does it work as designed? Defined processes must function correctly and work in a sensible way.
- Does the data flow? Data must follow the model's logic to consistently produce the right output.
- Does it calculate correctly? Formulas and functions must be built for accurate calculations throughout the application.
- Will we get end-user acceptance? You will get acceptance if close attention is paid to the five questions above and each is answered affirmatively. When this is not the case, and you do not get end-user acceptance, the testing feedback must reflect the areas the UAT needs to focus on to win buy-in from end users.
Early preparation helps humanoid testing go smoothly. To prepare:

- The customer and business partner should agree on which actions are included in the testing script and the steps the testers will follow. Simplicity is important: include no more than 10 to 20 steps per script. Make the test scripts comprehensive, but not so exhaustive that the testers become bored with the process and lose interest.
- Write the scripts for the previous sprint as part of the current sprint process (for example, in sprint two write the scripts that apply to the user stories from sprint one).
- Have testable data loaded in advance.
- Determine the role and selective access levels needed for testing and assign appropriate testers to each role.
- Create a presentation to guide the users on the day of the test.
- Be sure all participants are online, including the testers, the project team, and the Anaplan consultants.
- Provide basic Anaplan end-user training prior to testing to reduce the number of "bugs" reported because testers do not understand how the system works.
- Make sure consultants review Splunk reports and evaluate the server log files throughout the testing process.
Anaplan conducts humanoid UAT because people act differently than computers and are thus unpredictable. Good test scripts should contain:

- The user story being tested.
- The success criteria: a broad description of what the test is supposed to achieve and how it fits into the grand scheme.
- Prerequisites: steps or procedures the user must have completed before executing the test (for example, standard login functions or anything needed to prepare the test environment).
- Any known behaviors that may affect the user's ability to complete the script (for example, intermittent bugs or undefined behavior).
- A step-by-step script, in tabulated form, with instructions on how to execute the test, containing the following columns:
  - Step number
  - Step description
  - Requirements mapping (if applicable, the actual requirement that maps to this step; not all steps will map to a requirement)
  - Comments: a place for the UAT tester to note anything pertinent (for example, "I could not find that option" or "I could not click that button")
  - Pass/Fail: the result the user got when trying to carry out that line of the script
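One line of such a tabulated test script could be captured in a simple structure like this (an illustrative sketch only; the field names follow the columns listed above, and the example step text is invented):

```python
# Illustrative structure for one line of a UAT test script,
# following the columns described above. The sample step is invented.
from dataclasses import dataclass


@dataclass
class ScriptStep:
    number: int
    description: str
    requirement: str = ""    # requirements mapping, if applicable
    comments: str = ""       # filled in by the UAT tester
    result: str = "Pending"  # "Pass" or "Fail" after execution


step = ScriptStep(1, "Open the regional sales dashboard")
step.result = "Pass"
print(step.number, step.result)  # 1 Pass
```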
Once you have completed testing, launch a user survey. Base survey questions on conditions related to performance throughout the testing: internet connectivity, variations in speed, and performance over the testing period. Do not send a survey if you already know that performance or the testing results were poor; there is no need to confirm something you already know. Only conduct humanoid testing when you know the results will be at least somewhat acceptable. Major system issues should be eliminated during the automated testing phase.
During testing, you collect rich information about the model, its performance, and its usability. During triage, determine what to do with that information. Form a triage committee that includes an Anaplan business partner, a customer subject matter expert, and the project sponsor, and make the decisions that define the next steps in the UAT. In general, testing feedback falls into one of two categories: it is either a bug (defect) or a change request. Categorize feedback accordingly, then assess the level of severity. Depending on the severity of the bug or change request, it is either included in the current release or assigned to the Backlog and included in the next release of the model. Review this guide for how bugs and change requests are handled during the UAT process.

| Severity | Bugs | Change Requests |
|---|---|---|
| L1 | Must fix and include in release | Show-stopper functionality; must have |
| L2 | Must have by end of UAT | Desirable to have |
| L3 | Desirable to have | Likely in future release |

If the team identifies a critical bug, it must be fixed and included in the current release: prioritize it as an L1 and factor it into the UAT exit criteria. If a change request is identified as a show-stopper (L1), it too is prioritized and included in the current release. Include other bugs and change requests in the current release if possible; if not, add them to the next release of the model.
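The triage rules above can be sketched as a small decision function (a minimal illustration; the severity labels and outcomes mirror the table, while the function and its data shapes are hypothetical):

```python
# Minimal sketch of the UAT triage rules described above.
# Severity levels L1-L3 mirror the table; the data shapes are hypothetical.

def triage(item_type: str, severity: str) -> str:
    """Decide where a bug or change request lands.

    item_type: "bug" or "change_request"
    severity:  "L1", "L2", or "L3"
    """
    if severity == "L1":
        # Critical bugs and show-stopper change requests go into
        # the current release and factor into the UAT exit criteria.
        return "current release"
    if item_type == "bug" and severity == "L2":
        # Must be fixed by the end of UAT.
        return "current release (by end of UAT)"
    # Everything else is desirable but deferrable: include it in the
    # current release if possible, otherwise assign it to the Backlog.
    return "backlog / next release"


print(triage("bug", "L1"))             # current release
print(triage("change_request", "L2"))  # backlog / next release
```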
Determine the time and resources needed to fix bugs so the customer can successfully complete UAT. If you have a number of L1 bugs to fix and the time and resources needed to fix them are extensive, assign all lower-level bugs to the next release. Reference the UAT exit criteria as the guide to follow for fixing bugs.
The Statement of Work (SOW) every customer receives as part of the implementation process contains the requirements for the model and the procedures to follow for incorporating changes to the model. When the testing results include feedback that Anaplan reasonably determines is out of the scope of the SOW, Anaplan notifies the customer with an impact analysis of the request, a quote for the additional work, and an action plan for handling the request. All change requests must be mutually approved in writing before the work involved in the scope change is performed. As with fixing bugs, prioritize change requests by level of severity. Any change request considered a "show-stopper" gets top priority; other requests with less severe impact may become part of the next release.
With the feedback from the testing prioritized, fine-tune or tweak the model. Tuning has three layers, with each layer going a little deeper to validate a model free of defects and optimized for performance:

- Layer 1 (Model design) involves taking a closer look at model-specific details such as:
  - Number of modules
  - General dimensionality
  - Numbered lists
  - Subsets and composite lists
  - Sparsity in modules
- Layer 2 (Calculations, formulas, Blueprints) focuses on calculation-related issues and items controlled in the Blueprint:
  - Use of functions versus long, complex formulas
  - Use of Booleans and other Blueprint settings
  - Use of summary methods
- Layer 3 (Core code, model behavior) optimizes the model's code and assesses functioning at the core level. At this deeper level, you look at configuration issues and behavior that impact the model's performance and overall functionality.
When it is time to finalize the UAT exit criteria, the team often sets a percentage of L1 bugs and a percentage of L1 change requests that must be completed. For example, the exit criteria might state that 80% of L1 bugs must be fixed and 20% of L1 change requests must be included before the UAT process ends.
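An exit-criteria check of this kind reduces to simple arithmetic. A minimal sketch (the 80%/20% thresholds come from the example above; the function name and counts are hypothetical):

```python
# Check example UAT exit criteria: 80% of L1 bugs fixed and 20% of
# L1 change requests included (thresholds from the example in the text).

def uat_exit_met(l1_bugs_fixed: int, l1_bugs_total: int,
                 l1_crs_included: int, l1_crs_total: int,
                 bug_threshold: float = 0.80,
                 cr_threshold: float = 0.20) -> bool:
    bugs_ok = l1_bugs_total == 0 or l1_bugs_fixed / l1_bugs_total >= bug_threshold
    crs_ok = l1_crs_total == 0 or l1_crs_included / l1_crs_total >= cr_threshold
    return bugs_ok and crs_ok


# 9 of 10 L1 bugs fixed (90%), 1 of 5 L1 change requests included (20%)
print(uat_exit_met(9, 10, 1, 5))  # True
print(uat_exit_met(7, 10, 1, 5))  # False (70% of bugs fixed is below 80%)
```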
Place the Go/No Go meeting on the calendar well in advance. This helps work around everyone's busy schedules as the go-live date approaches, and it gives the team a target date to drive toward completion.
Work with the Customer Performance Testing team to schedule automated testing of load, performance, and concurrency. Using special testing software with defined instructions that execute processes repeatedly, you can see results that simulate a scenario stretching the model's use beyond its normal activity. Check the charts in this section to determine whether performance testing is needed and, if it is, how long it may take.
For this process, load data into the model to simulate production model volumes. Basic functions are performed, such as data input, break-back (where appropriate), allocations, filtering, pivoting, sorting, list-formatted items, and drop-down manipulation. Imports and exports can also be included in automated testing. The Customer Performance Testing team can also simulate load on multiple models or multiple workspaces at the same time. This testing can discover defects or highlight areas of slow performance that would be undetectable without extensive activity; it also determines the model's stability during normal activity. The testing also provides feedback on the level of user experience to expect at a given level of concurrent activity. If you determine that some or many functions are slow and server memory and CPU are used to the maximum, you may have a case for model distribution. If, however, the model is slow but user concurrency is minimal, this could form a case for a single model instance, as the system is merely processing numbers and not being accessed by a user community. The end user's experience, including performance, must hit the mark when you deploy models (see also the section called Tweaking and Tuning). To optimize performance, system administrators need to take the following factors into account when deciding whether a single-instance or distributed-instance strategy is best:

- Data volume (memory usage)
- Model complexity (calculation logic and business rules)
- User concurrency
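The single-versus-distributed decision described above can be sketched as a toy heuristic. This is purely illustrative: the thresholds are invented for the sketch and are not Anaplan guidance; only the three decision factors come from the text.

```python
# Toy heuristic for the deployment decision described above.
# The numeric thresholds are invented for illustration; the actual
# decision rests on the three factors listed in the text.

def deployment_strategy(memory_gb: float, heavy_calculation: bool,
                        concurrent_users: int) -> str:
    # High concurrency combined with maxed-out memory or heavy
    # calculation suggests distributing the model; a slow model
    # with few concurrent users can stay as a single instance.
    if concurrent_users > 60 and (memory_gb > 8 or heavy_calculation):
        return "distributed instances"
    return "single instance"


print(deployment_strategy(memory_gb=10, heavy_calculation=True,
                          concurrent_users=150))  # distributed instances
print(deployment_strategy(memory_gb=10, heavy_calculation=True,
                          concurrent_users=5))    # single instance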
The Anaplan performance team cannot test during every project. Follow these guidelines on performance testing:

| ID | Type / Description | Range | Observations | Risk or Issues |
|---|---|---|---|---|
| 01 | Model size | > 8 GB | Models over 8 GB are much more likely to experience moderate to heavy performance issues. | Highly variable; dependent on the type of use cases (see 04). |
| 02 | Number of cells | > 1 billion cells | Models over 1 billion cells are much more likely to experience moderate to heavy performance issues. | Highly variable; dependent on the type of use cases (see 04). |
| 03 | User base | > 400 accounts with access | With an assumed 15% user concurrency, a user base of more than 400 is likely to experience 60 or more active concurrent users at peak. | Good indicator of whether the model(s) will require performance validation on a mixed read/write model type (see 04). |
| 04 | Type of model/scenario | > 60% of concurrent users actively writing to the model (as opposed to passively viewing it). Examples: all sales managers concurrently inputting daily sales in one geographical location in a predictable 1-hour time slot; users inputting their figures to meet a deadline (accounting/financial period). | A scenario requiring many cell changes in a peak business hour is a good case for performing load tests. | Good indicator of whether the model(s) will require performance validation. |
| 05 | Complexity of operations: imports and exports | ≥ 1 import and/or > 2 export operations frequently used during peak business hours (including any processes) | Each import/export is potentially a very large blocking transaction. More than 1 frequent import and/or more than 2 frequent exports (blocking transactions) in the midst of peak activity is likely to cause noticeable performance degradation for the majority of users. If a frequently used process contains more than 3 import actions in sequence, it is very likely that all users will be severely affected. | Very high risk of encountering slow response times even for simple actions (such as opening/refreshing a dashboard). |
| 06 | Complexity of operations: adding items | > 2 operations involving adding to a list frequently during peak business hours (including any processes) | Adding items has often been a very slow transaction on past projects. | After consideration of 01 and 02, there is a high risk of slow response times. |
| 07 | Complexity of operations: administrative | Major changes to the structure of the model, changes to user accounts/access, or model restores | This does not typically happen on any model during peak business activity. These actions usually lead to very large blocking transactions. | Very high. Reconsider use cases/activity flow to minimize the impact on end users. |

How long will performance testing take? There is no simple answer to this question; every project has many variables that impact performance testing duration. For better insight, here are some recent projects and the length of time testing took:

| No. of scripts | Phase performed | Duration* | Re-tests | Performance issues |
|---|---|---|---|---|
| 5 | During UAT | 3.5 weeks | Yes; re-adjusted expectations. | Little |
| 1 | Just before launch | 4 weeks | None. | Moderate in relation to expectations |
| 1 | Just before UAT | 2.5–3 weeks | Yes; model changes and re-adjusted expectations. | Moderate in relation to expectations |
| 3 | During UAT | 4 weeks | Yes; changes to line items. | Moderate in relation to expectations |
| 2 | During UAT | 2–2.5 weeks | Yes; reduced dimensions of 2 modules. | None |
| 10 | Just before launch | 2 weeks | Yes; core build changes (new release). | Moderate/heavy |
| 5 | Just before launch | 4 weeks | Yes; core build changes (new release). | Heavy |
| 2 | Just before launch | 2 weeks | None. | None |
| 4 | During UAT | 1.5 weeks | Yes; core build changes. | None/little |
| 9 | Development + UAT | 2.5 weeks | Yes; new model; 2 phases in testing. | Moderate in relation to expectations |

*Testing duration after receiving the model.
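The numeric thresholds in the guidelines table can be sketched as a simple risk checklist (a minimal illustration; the function and field names are hypothetical, while the threshold values come from rows 01 to 05 above):

```python
# Sketch of a performance-testing risk checklist using the threshold
# values from the guidelines table above. Names are hypothetical.

def performance_test_indicators(size_gb: float, cells: int, users: int,
                                pct_writers: float, peak_imports: int,
                                peak_exports: int) -> list:
    flags = []
    if size_gb > 8:                 # row 01
        flags.append("model size > 8 GB")
    if cells > 1_000_000_000:       # row 02
        flags.append("more than 1 billion cells")
    if users > 400:                 # row 03: ~15% concurrency => 60+ active users
        flags.append("user base > 400 accounts")
    if pct_writers > 0.60:          # row 04
        flags.append("over 60% of concurrent users writing")
    if peak_imports >= 1 or peak_exports > 2:  # row 05
        flags.append("frequent imports/exports in peak hours")
    return flags


# A model tripping every indicator suggests performance validation is needed.
print(performance_test_indicators(10, 1_200_000_000, 500, 0.7, 1, 0))
```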
The requirements for automated testing are included here. Note that getting ready for automated testing takes time and should be included in the project schedule.

Model Sanitization

According to Anaplan security policies, all models placed into the testing environment must be sanitized. Sanitization involves manipulating the data in a model into values that do not identify any company, persons, precise locations, company plans, or sensitive financial data. Make a copy of the model, sanitize it, and provide login access to the Customer Performance Testing team. If there is insufficient workspace to copy the model, L3 Support can assist by providing an isolated workspace in which to carry out the sanitization. While it is best to sanitize all data, there may be situations where that is not possible due to time and effort constraints. The chart below ranks the priority for sanitization.

| # | Data | Location | Examples | Sanitization Mandatory? | Responsible Team |
|---|---|---|---|---|---|
| 1 | Company name(s) | Model name / workspace | Anaplan | Yes | Performance Team |
| 2 | Other company name(s) | General lists | Accounts, suppliers, clients, distributors | Yes | Business Partner / Model Builder |
| 3 | Financial data | Data input modules | Salaries, revenue, expenses, sales tax % | Yes | Business Partner / Model Builder |
| 4 | Real person name(s) | General lists | Employees, partners | Yes | Business Partner / Model Builder |
| 5 | Locations | General lists | Sales offices, retailers | No | Business Partner / Model Builder |
| 6 | Products | General lists | Biscuit brands, drink brands | Yes for brand-specific names; no for generic | Business Partner / Model Builder |
| 7 | Services | General lists | Dental, advertising, housing | No | Business Partner / Model Builder |

Sanitization Techniques

The Customer Performance Testing team provides additional information and techniques for sanitization on their Confluence page.
Sanitization techniques include:

- Modify numbered lists
- Temporarily hardcode values
- Direct copy and paste
- Use import and export functionality

Test Scripts or Users

Test scripts can also be referred to as discrete users in performance testing. These are step-by-step instructions that can be followed by the simulated user. The Customer Performance Testing team requires the finer details on test scripts/users, roles, and selective access once the model's business processes become clear. A video demonstrating the user role and its steps gives the team material to start evaluating whether the scripts are fit for performance testing. If a video recording cannot be created, an arranged meeting (and screen share) is sufficient to talk through the steps. It is important that these details are captured accurately. The ideal number of scripts depends on each unique model, but the team typically expects multiple scripts, where each script/user has a specific set of tasks related to their role. Multiple scripts enable greater control over the distribution of the workload, reflecting load/usage patterns as though real users were using the system. Additionally, the Customer Performance Testing team should know where the user base will be geographically located.

Targets and Customer Requirements

The team needs the customer's requirements for model performance:

- 90th or 95th percentile target response times for each transaction
- Expected load volumes of the model by end users (pacing)
- Expected scenarios by end users
- Concurrency level of the user base (typically 15% to 20%)

These requirements are included in the questionnaire the team developed to capture the information needed to perform testing. It is available on the Customer Performance Testing Confluence page.

How long will performance testing take? There is no simple answer to this question; every project has many variables that impact performance testing duration.
The range is from a week and a half to four weeks.
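Percentile targets like those listed in the requirements can be checked against measured response times with a short script. A minimal sketch using the nearest-rank method (the sample timings and the two-second target are invented for illustration):

```python
# Sketch: check measured response times against a 95th-percentile
# target, as in the customer requirements above. Sample data is invented.

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value >= pct% of samples."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[int(rank) - 1]


# Hypothetical response times (seconds) for one transaction.
times = [0.8, 1.1, 0.9, 2.4, 1.0, 1.3, 0.7, 1.9, 1.2, 1.1]
p95 = percentile(times, 95)
print(p95, p95 <= 2.0)  # 2.4 False -- this sample misses a 2-second target
```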
For end users to enjoy the best possible experience, with an average response of less than two seconds for the most popular requests, model size and concurrency must be managed appropriately. In many cases a project produces a base model that contains all the dimensionality and calculation logic; that model is then subjected to a series of tests that determine end-user experience and model performance. If you have a model that requires interaction with a large user base, perform user concurrency tests. As a general rule, user concurrency comes in at approximately 15% of your total user community. Therefore, if you have a total user base of 1,000, around 150 people will be on the live system, performing tasks, at any given time. In some cases, though, models follow a high-concurrency pattern, and this needs to be taken into account. For example, a weekly sales forecast may have 1,000 users on the system, but each Sunday (if forecasts are due Monday), user concurrency will very likely be quite high, maybe as high as 60%. Account for these factors: customer processes and experience determine exact concurrency in high-traffic models or periods. Test concurrency in two pieces. First, schedule an automated test to simulate user actions across the system. Second, conduct a human intervention test that requires a group of people actually using the system at the same time, so you can record and react to actual system behavior. In some cases, automated testing does not reveal idiosyncrasies. Monitor the server while testing to track memory and CPU usage. After either test, tune the model to optimize for all conditions.
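The concurrency estimates above reduce to simple arithmetic. A minimal sketch (the 15% rule of thumb, the 60% peak, and the 1,000-user base all come from the text; the function name is hypothetical):

```python
# Sketch of the concurrency estimates described above: ~15% of the
# user base in normal periods, up to ~60% in a peak (for example, a
# Sunday forecast deadline). User counts come from the example in the text.

def expected_concurrency(total_users: int, rate: float) -> int:
    return round(total_users * rate)


total = 1000
print(expected_concurrency(total, 0.15))  # normal period: 150 concurrent users
print(expected_concurrency(total, 0.60))  # peak deadline pattern: 600
```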
For the UAT phase, your cornerstones span four areas: Model, Data, Process, and Deployment. This list is not intended to be all-inclusive; your project will probably require others:

- Use the Rapid Forensic Analysis Job Aid to help identify performance problems
- Ensure testing data is loaded
- Ensure that all types of users are included in the testing
- Thoughtfully select the users involved in UAT
- Make sure users are assigned to the correct roles
- Communicate the results of UAT
- Identify tweaks to the business process
- Create an end-user training plan