Integrating AI with Anaplan to summarize table formatted data

Lehtohen
edited August 2023 in Blog

Author: Henrikki Lehtola, Certified Solution Architect and Anaplan Consultant at twoday Finland.

It's undeniable that artificial intelligence has revolutionized the way businesses operate. An example of this revolution is ChatGPT, a cutting-edge language model developed by OpenAI, that has the ability to understand context, interpret data, and generate human-like text.
Among its various use cases, one application of ChatGPT that stands out is its ability to interpret table-formatted data and provide a succinct yet comprehensive summary. This is particularly beneficial in financial planning, where large volumes of data are involved. With the right input, ChatGPT can distill complex data into easily digestible summaries. This allows business leaders to focus on strategic decision-making rather than spending time going through the data.

In this article, we will explore how we can integrate Anaplan with ChatGPT via API. I will use a Capex Planning model, developed by a brilliant colleague of mine, as an example use case for the demo.

Architecture

The general architecture of the solution is described in the image below. I am using Python, and the code has been deployed as a Microsoft Azure Function so that it can be triggered via a URL, which has been published to an Anaplan dashboard. This means that any user can trigger the function, which then fetches the data, requests the summary, and writes the result back to Anaplan.

Python's "requests" library is a handy tool for making HTTP requests. The process has four steps: send a GET request to the Anaplan API to fetch the capex details, convert this data into a format that ChatGPT understands, send the data to ChatGPT for interpretation, and write the answer back to Anaplan.
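The four steps above can be sketched as follows. This is a minimal illustration, not the actual script: the function names (table_to_prompt_text, run_pipeline), the assumption that the Anaplan export arrives as CSV text, and the sample rows are all hypothetical. The network calls are passed in as plain callables so that the flow can be tried without credentials.

```python
import csv
import io
from typing import Callable


def table_to_prompt_text(csv_text: str) -> str:
    """Turn a CSV export into compact 'column: value' lines, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Skip empty cells so that they do not consume tokens.
        cells = [f"{name}: {value}" for name, value in row.items() if value]
        lines.append("; ".join(cells))
    return "\n".join(lines)


def run_pipeline(
    fetch_export: Callable[[], str],
    summarize: Callable[[str], str],
    write_back: Callable[[str], None],
) -> str:
    """Fetch, convert, summarize, and write back; each step is injected."""
    content = table_to_prompt_text(fetch_export())
    summary = summarize(content)
    write_back(summary)
    return summary


if __name__ == "__main__":
    # Made-up sample data standing in for the Anaplan export.
    sample = (
        "Project,Status,Amount,Advance payment\n"
        "Plant based project,Not started,1200000,100000\n"
        "Zero calories,Ready,500000,\n"
    )
    print(table_to_prompt_text(sample))
```

In a real deployment, fetch_export and write_back would wrap the Anaplan API calls and summarize would wrap the OpenAI call; keeping them separate makes each step easy to test on its own.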

ChatGPT API attributes

OpenAI offers extensive documentation of the API, which I suggest checking out. Below is a snippet of my script that shows what the API call to OpenAI looks like:

    import openai  # assumes the legacy (pre-1.0) openai Python package

    # Get Data from Anaplan
    …
    # Call OpenAI API
    chatTask = "Take on the role of a Financial Manager and provide a short summary of the investment details provided for you. Focus on the financial aspect and the preliminary investment amount. Keep answer below 200 words and don't use bullet points."
    chatContent = contentFromAnaplan

    # The system message carries the instructions; the user message carries the data.
    output = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": chatTask},
            {"role": "user", "content": chatContent}],
        temperature=0,    # least random output
        max_tokens=500,   # upper bound on the length of the reply
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0)

    # Extract the generated text from the first choice
    summary = output['choices'][0]['message']['content']

    # Write Data to Anaplan

Note the chatTask variable, which works as the prompt. The output is very sensitive to the prompt, which makes prompt engineering an important skill to develop. I have learned that it helps to include a comprehensive definition of the role in the prompt, to specify the desired word count for the summary, and to state explicitly that bullet points should not be used.
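One way to keep these prompt elements (role, focus, word limit, bullet points) easy to adjust is to assemble the prompt from parameters. The helper below is only a sketch; build_chat_task and its arguments are hypothetical names, and with the arguments shown it reproduces the prompt used in this demo.

```python
def build_chat_task(role: str, focus: str, max_words: int,
                    allow_bullets: bool = False) -> str:
    """Assemble the system prompt from a role, a focus, and output constraints."""
    prompt = (
        f"Take on the role of a {role} and provide a short summary of the "
        "investment details provided for you. "
        f"Focus on {focus}. "
        f"Keep answer below {max_words} words"
    )
    # The bullet point restriction is appended only when bullets are not wanted.
    prompt += "." if allow_bullets else " and don't use bullet points."
    return prompt


chatTask = build_chat_task(
    role="Financial Manager",
    focus="the financial aspect and the preliminary investment amount",
    max_words=200,
)
```

Changing a single argument, for example the role or the word limit, then makes it straightforward to compare how the summary reacts to each part of the prompt.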

User experience

Below is an example of the UX page used in this demo, which shows how to create a fairly seamless user experience when integrating Anaplan with other tools. The data for the AI to analyze is located in a simple module, and a link that calls the Azure function is published to the UX page. Clicking the button triggers the Azure function, which takes a few seconds to run in the background and returns the analyzed result by writing it into a specified line item.
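Because the summary is free text, it can contain commas, quotation marks, and line breaks, which need care if the result is written back as a CSV import file. The sketch below assumes such a file-based import; the column headers and the line item name "AI Summary" are hypothetical and would have to match the import definition in the model.

```python
import csv
import io


def build_import_csv(line_item: str, summary: str) -> str:
    """Build a two-column CSV payload holding the summary for one line item."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Line Item", "Value"])  # hypothetical column headers
    # Collapse line breaks and repeated spaces so the text stays in one cell;
    # the csv module takes care of quoting commas and quotation marks.
    writer.writerow([line_item, " ".join(summary.split())])
    return buffer.getvalue()


payload = build_import_csv(
    "AI Summary", 'Total is $1,200,000.\nStatus: "Ready".'
)
```

The resulting text would then be uploaded and imported with the Anaplan API in the write-back step.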

As mentioned, the quality of the prompt affects the output, but the choice of GPT model also has a great impact. At the time of writing, the latest models are "gpt-3.5-turbo" and "gpt-4". GPT-4 is said to understand context better and to distinguish nuances, which gives more accurate and coherent responses, but this model is not available to all users. Other factors that can affect the output are the temperature, frequency_penalty, and presence_penalty parameters. Modifying these parameters changes the creativity and diversity of the output.
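To compare models and parameter settings side by side, the request arguments can be generated from one shared baseline. This is a sketch under assumptions: build_request and BASE_PARAMS are hypothetical names, the baseline values are taken from the snippet earlier in this article, and the placeholder strings stand in for the real prompt and data.

```python
BASE_PARAMS = {
    "temperature": 0,
    "max_tokens": 500,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}


def build_request(model: str, chat_task: str, chat_content: str, **overrides) -> dict:
    """Return keyword arguments for one chat completion call."""
    unknown = set(overrides) - set(BASE_PARAMS)
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": chat_task},
            {"role": "user", "content": chat_content},
        ],
        **BASE_PARAMS,
        **overrides,  # e.g. temperature=0.5 for more varied output
    }


task, content = "<prompt>", "<table data>"  # placeholders
variants = [
    build_request("gpt-3.5-turbo", task, content),
    build_request("gpt-4", task, content),
    build_request("gpt-3.5-turbo", task, content, temperature=0.5),
]
```

Each dictionary can be passed to the API call with ** unpacking, so only the model name or a single parameter differs between runs.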

I tested the output using models "gpt-3.5-turbo" and "gpt-4" and the different outputs can be seen below.

GPT-3.5:
The investment details provided include various projects and their financial aspects. The preliminary investment amounts range from $50,000 to $1,500,000. The projects have different statuses, such as "Not started," "Ready," "Ongoing," and "Ongoing." The investment types include plant-based R&D, private label, strategy, and constructions. The depreciation starts and end months vary for each project, and the asset types include other intangible assets and constructions. Some projects require advance payments, while others do not. The strategic emphasis is mentioned for some projects, such as "Innovate" and "No product." The product dimension is specified for one project as "G Stracciatella." The investment accounts are categorized as "1114 Other intangible assets" and "1123 Property and plant." The number of years of cash flow generated by each investment ranges from 3 to 5 years. The long-term growth rate is not provided for any project.

GPT-4:
The investment portfolio contains a mix of initiatives in plant-based projects, seasonal category planning, real estate expansion, and production improvements, all with varying investment sizes and statuses. Major investments include the 'Plant Based Project' with a preliminary investment of $1.2M, 'Spring 2024 Category Planning' with $1.5M, and several 'New Retail Locations' each estimated at $100K-$200K. Real estate investments are ongoing or ready, focusing on constructions that depreciate over five years. The plant-based initiatives, including a 'Gelato Machine' and 'Freezer Improvements', are in the R&D cost center with an expected asset life of three years. The start and end months for these investments span throughout 2023, with the largest investment, 'Plant Based Project', commencing in May 2023. Notably, the 'Plant Based Project' includes a $100K advance payment, indicating early financial commitment. Across these investments, there's an evident strategic emphasis on plant-based and retail strategy, both innovating and expanding the company's footprint.

GPT-3.5, Temperature = 0.5 (more creative):
The investment details provided include various projects and their financial aspects.

  1. Plant based project: This project is related to plant-based research and development. It is not yet started and requires a preliminary investment amount of $1,200,000. The project is expected to start in May 2023 and end in July 2023. It falls under the cost center of "Plant based R&D" and is classified as other intangible assets with a depreciation period of 5 years. An advance payment of $100,000 has been made and activated.
  2. Zero calories: This project is ready and requires a preliminary investment amount of $500,000. It is also related to plant-based research and development. No further financial details are provided.
  3. Summer 2024 category planning: This ongoing project requires a preliminary investment amount of $1,000,000. It is associated with private label products and falls under the cost center of "Other intangible assets." The depreciation period is 3 years, and there is no specific product mentioned.
  4. Spring 2024 category planning: This project is not yet started and requires a preliminary investment amount of $1,500,000. It is also related to private label products and has similar financial details as the previous project.
  5. Christmas 2024 category planning: This project is not yet started and requires a preliminary investment amount of $800,000. It is associated with private label products and has similar financial details as the previous projects.
  6. New office space to Turku: This project is ready and requires a preliminary investment amount of $200,000. It is related to the construction of a production plant in Turku. The asset type is classified as constructions with a depreciation period of 5 years.
  7. New production site to Tallin: This project is ready and requires a preliminary investment amount of $200,000. It is associated with indirect sourcing and has similar financial details as the previous project.
  8. New retail locations: There are several ongoing and not yet started projects for new retail locations in Tampere, Stockholm, Vaasa, and central Helsinki. Each project requires a preliminary investment amount ranging from $100,000 to $200,000. They are associated with the strategy and fall under the cost center of "Constructions." The asset type is classified as constructions with a depreciation period of 5 years.

Comparing the outputs, we can see that GPT-3.5's analysis is more general and focuses on providing an overview of various parameters such as the range of preliminary investment amounts, status, investment types, depreciation period, and strategic emphasis. The information is more quantitative and broader, giving a snapshot of the overall projects and their attributes. This makes it useful for getting a quick and high-level understanding of the investment details.

On the other hand, GPT-4 provides a more detailed and qualitative analysis. It offers in-depth insights about specific projects, such as the 'Plant Based Project' and 'Spring 2024 Category Planning', providing information about their status, investment amount, start month, and strategic emphasis. The specific naming and details given about each project make the analysis more informative and could help a decision-maker understand the specific aspects of each project.

Raising the temperature parameter makes the output more varied and creative. The output created using GPT-3.5 with a temperature of 0.5 is quite different from the original one and, in my opinion, more informative. The instruction not to use bullet points was not really followed, but the numbered list makes the output quite easy to read. In my opinion, this type of output would be very suitable for users who prefer written text over table formats.

Conclusion

The aim of this article is to demonstrate how we can integrate and create value with AI-generated summaries in Anaplan. It's key to remember, especially when we use AI tools, that data privacy is incredibly important. In some cases, it may be enough to mask the data by using codes instead of names. OpenAI is developing a business subscription for ChatGPT, aimed at providing stronger data privacy. Also, Microsoft now has a product called Azure OpenAI. It shares the same GPT foundation as ChatGPT but adds in the strong security elements of Microsoft Azure.

If you find the topic interesting and have some experiences yourself, leave a comment!

Comments

  • Interesting idea! Also a good note about minding data privacy.

  • AnyaS

    neat! Well done, @Lehtohen!

  • Interesting topic!

  • Exceedingly interesting article especially as we contemplate the future of autonomous planning.

  • Thanks for your article. Question on implementation - Have you managed to send large data tables given the token count limitations on requests? We have implemented something similar but it couldn't handle receiving a 4 column by 150 row table.

  • Lehtohen
    edited August 2023

    Good question. The token limit currently sets some constraints. I have been testing Azure OpenAI and different GPT models, which support different token limits. I believe the model with the highest token limit currently available is gpt-4-32k, which supports up to 32k tokens instead of the 4k supported by gpt-3.5. I also believe that the number of supported tokens will keep increasing over time.

    On the Anaplan side, I have done everything I can to reduce the number of cells and the number of characters in the table. For example, I noticed that number-formatted line items often contain up to ten decimals, which unnecessarily consume tokens. I have tackled this by using the ROUND function.

    I would suggest checking out the Tokenizer by OpenAI, where you can see the number of tokens consumed by your input. The Tokenizer is available here: https://platform.openai.com/tokenizer
    Your example of 4 columns and 150 rows does not sound like that much data, and I have managed to use much larger datasets.

    @JoaquinEms

  • @Lehtohen This is awesome. We built something similar for dashboard summarization and Q&A, and it's working perfectly across multiple applications, including Anaplan. We encountered a user experience issue in Anaplan: when we click the LINK to summarize data on the Anaplan dashboard, a new browser tab opens. We then have to go back to the summary dashboard and click the REFRESH button to get the updated summary. We are able to skip these steps in other web applications, but not in Anaplan. Could you suggest how we can avoid these two unwanted steps in Anaplan? Thanks.