# How to Calculate Distance Between Pairs of Latitude-Longitude Locations

There are several business use cases that require the ability to compute distances between pairs of locations.

• Optimizing sales territory realignment
• Logistics cost optimization
• Transportation industry passenger revenue or cost per mile
• Franchise territory design
• Brick-and-mortar market area analysis (stores, hotels, bank branches, …)
• Optimizing inventory among geographic Distribution Centers

At their core, each of these requires knowing how far apart a pair of sites are positioned.

This article provides step-by-step instructions for creating a dashboard where users select a location and set a market area radius, then the dashboard shows all population centers in that vicinity with some demographic information.

# Doing the Math: Trig functions in Anaplan

Distance between two latitude-longitude points (lat1, lon1) and (lat2, lon2) requires solving this equation:

ACOS(  COS(90 - lat1) * COS(90 - lat2)

+ SIN(90 - lat1) * SIN(90 - lat2) * COS(lon1 - lon2)

)

This formula works quite well. We know the Earth isn’t flat, but it’s not a perfect sphere either. Our home world bulges a bit at the equator and is flattened a bit at the poles. But for most purposes other than true rocket science, this equation gives sufficiently accurate results.

Unfortunately, Anaplan doesn’t have the functions SIN, COS, or ACOS built in, and the usual workaround – lookup modules – simply won't do in this situation because we need much higher precision than lookups can practically handle. But don't despair, it is possible to calculate trig functions to 8 decimal point precision using nothing more sophisticated than Anaplan's POWER() function and some ingenuity. In the following demonstration model, the trig functions needed for distance calculation have been built for you using equations called Taylor Series expansions.

## Step-by-Step Construction

Here's a small educational project: In our example model, the user will select one post code, enter a market area radius value, and click a button. Changing the selected post code updates rows in a filtered module, so we need to refresh the dashboard to see the result. The dashboard will identify all post codes in the vicinity location and display their population, growth rate, median age, and distance.

## Step 1

Get U.S. postal code demographic and geolocation data.

Our model will use Census Zip Code tabulation areas. ZCTAs are essentially postal Zip Codes adjusted to remove unpopulated Zip Codes that are only for PO Boxes and combining some codes where that solves practical census tallying problems. There are about 32,000 ZCTAs and 43,000 Zip Codes in the U.S.

• Download the US.zip file from http://download.geonames.org/export/zip/ That file provides a full list of US Zip Codes and their county, state, latitude, and longitude. Other countries post codes are also listed in that folder.
• Download demographic data by post code from the US Census Bureau report DP05, choose the 5-digit ZCTA geographic option for the entire US. To calculate growth rate, you will need datasets for both the most recent year available and for the fifth year prior to that. (2017 and 2012 at the time this was written.)

Notes:

Import maps in the next two steps will need some manipulation by concatenating fields to get nice looking names (such as "Boston, MA 02134") and to get codes to match up among the lists. You'll need to either import to transaction modules or do this manipulation in Excel.

## Step 2

Create a list named "Loc 3 - Post Codes".

Set a top level member with a name like “Total Population Centers”.

It is generally a best practice to create a Clear action for any list to be run before future list reloads.

Notes:

For the purposes of this demonstration, a flat list of 5-digit codes is sufficient. I found it helpful to roll up ZCTAs by state (Loc 1) and county (Loc 2). This is optional. I will leave “give friendly names to your list members and assign them to parents” as an exercise for the advanced reader.

## Step 3

Create a module named "DATA: Loc 3 - Post Codes" dimensionalized by the list "Loc 3 - Post Codes" (no time, no version).

Notes:

1. There are a LOT of data fields in the tables you downloaded, and much more data is available in other Census Bureau products (gender, households, age details, income, …). Feel free to add line items for any census fields that you find useful. I found it helpful to pull the data into Excel and keep only the fields of interest to streamline the mapping process in Anaplan.
2. Expect a few rejects due to mismatches between Zip Code and ZCTA files. The geonames.org zip code list US.zip doesn't include Puerto Rico and other island territories. Census data does include them. As a result, Census ZCTAs that begin with 006## and 009## will report there is no matching list member. In a "real world" application, a significant effort goes into insuring that data "ties out" by addressing issues like this. You may either ignore the small percentage of rejects (my sincere apologies to the people of Puerto Rico) or you may find and add those missing zip codes to your list. Your choice.

For this exercise, the module must contain, at minimum, these line items:

 Formula Format Applies To DATA: Loc 3 - Post Codes Loc 3 - Post Codes Latitude Number - Longitude Number - Total Population Number - Total Population 5 yr prior Number - Growth Rate POWER(Total Population / 'Total Population 5 yr prior', 0.2) – 1 Number - Median Age Number - Median Age * Tot Pop Median Age * Total Population Number -

Set the Summary properties as follows:

• 'Total Population’, ‘Total Population 5 yr prior’, and ‘Median Age * Tot Pop’ aggregate by Sum.
• ‘Growth Rate’ aggregates by Formula.
• ‘Median Age’ aggregates by Ratio: ‘Median Age * Tot Pop’ / ‘Total Population’

## Step 4

Create a module named "INPUT: Globals". It holds four constants and two inputs as line items. There is no List, Time or Version dimension. I put those line items’ values into the Formula so users cannot change them.

Line Items are:

 Formula Format Applies To INPUT: Globals UI No Data Select a Location List, Loc 3 – Post Code - Market Area Radius (miles) Number - Constants No Data - Earth Radius (km) 6371 Number - Pi 3.141592654 Number - km / mi 1.609344 Number - ACOS(2/3) 0.588002604 Number -

Publish the “Select a Location” and “Market Area Radius (miles) line items to a new dashboard with the name “Distance Demo”.

Note:

Distance calculations in kilometers are provided below. Feel free to adjust your model’s inputs, outputs, and filters to the needs of your locale.

## Step 5

Create a module named "CALC: Post Code - Nearby Population Centers" dimensionalized by only the list “Loc 3 - Post Codes”. There are no Time or Versions dimensions.

Set summary settings for the user-facing population and age line items just as you did in step 2. The line items under Distance Calculations and Staging should not roll up, so use summary: None. (This is a best practice for conserving model size). The ‘In Market Area?’ Boolean should roll up using summary: Any.

Filter the list with ‘In Market Area?’ = TRUE and publish the 'CALC: Post Code - Nearby Population Centers' module to your dashboard.

In grid view, use pivot / filter / hide in the module:

‘Loc 3 – Post Codes’ is the row dimension,

Filter on ‘Is Market Area?’ = True,

Line items are in the columns and only the desired line items show,

Save the view and publish it to your dashboard.

## Step 6

Create a new Action that opens the dashboard, name it "Refresh Surrounding Locations". Publish it to your dashboard and position it between the two inputs and the output module. This action button is needed because the output module is filtered for "In Market Area?" = True but that filtering is only updated when the dashboard is refreshed.

This completes the build instructions, following are more insights on the calculations.

# The calculation logic

Take a look at the line item formulas under Staging. In those, we build the distance equation from its component parts. You might find it helpful to know that each trig operation, such as COS(90 - lat1), is a line item.

ACOS(  COS(90 - lat1) * COS(90 - lat2)

+ SIN(90 - lat1) * SIN(90 - lat2) * COS(lon1 - lon2)

)

In overview, the line items represent these steps:

• Get the constants Pi, Earth’s radius, etc.
• Convert latitude and longitude from degrees to radians
• Use Taylor Series formulas to calculate the variety of SIN and COS components
• Use another Taylor Series formula and a trig identity to calc ASIN, then convert ASIN to ACOS using another trig identity.
• Multiply the finished ACOS by Earth’s radius.

# Going Multidimensional

This example model is intentionally small; it uses a single list of locations and computes their distances from a selected location. In most "real world" applications, you need to know the distance between every pairing of two lists of locations, for example Stores and Towns, or DCs and Stores. Let’s call them origin and destination locations. To compute distance between every possible pairing, you would dimensionalize the CALC module above by those two lists and replace the user selection with ITEM(<origin location list>).

Good luck!!

The content in this article has not been evaluated for all Anaplan implementations and may not be recommended for your specific situation.