How to Calculate Distance Between Pairs of Latitude-Longitude Locations

There are several business use cases that require the ability to compute distances between pairs of locations.

  • Optimizing sales territory realignment
  • Logistics cost optimization
  • Transportation industry passenger revenue or cost per mile
  • Franchise territory design
  • Brick-and-mortar market area analysis (stores, hotels, bank branches, …)
  • Optimizing inventory among geographic Distribution Centers

At their core, each of these requires knowing how far apart a pair of sites are positioned.

This article provides step-by-step instructions for creating a dashboard where users select a location and set a market area radius, then the dashboard shows all population centers in that vicinity with some demographic information.

Distance Calc Tutorial Screenshot.jpg

Doing the Math: Trig functions in Anaplan 

Distance between two latitude-longitude points (lat1, lon1) and (lat2, lon2) requires solving this equation:

Radius of Earth * 
ACOS(  COS(90 - lat1) * COS(90 - lat2) 

     + SIN(90 - lat1) * SIN(90 - lat2) * COS(lon1 - lon2)

    ) 

This formula works quite well. We know the Earth isn’t flat, but it’s not a perfect sphere either. Our home world bulges a bit at the equator and is flattened a bit at the poles. But for most purposes other than true rocket science, this equation gives sufficiently accurate results. 

Unfortunately, Anaplan doesn’t have the functions SIN, COS, or ACOS built in, and the usual workaround – lookup modules – simply won't do in this situation because we need much higher precision than lookups can practically handle. But don't despair, it is possible to calculate trig functions to 8 decimal point precision using nothing more sophisticated than Anaplan's POWER() function and some ingenuity. In the following demonstration model, the trig functions needed for distance calculation have been built for you using equations called Taylor Series expansions.

Step-by-Step Construction 

Here's a small educational project: In our example model, the user will select one post code, enter a market area radius value, and click a button. Changing the selected post code updates rows in a filtered module, so we need to refresh the dashboard to see the result. The dashboard will identify all post codes in the vicinity location and display their population, growth rate, median age, and distance.

Step 1

Get U.S. postal code demographic and geolocation data.

Our model will use Census Zip Code tabulation areas. ZCTAs are essentially postal Zip Codes adjusted to remove unpopulated Zip Codes that are only for PO Boxes and combining some codes where that solves practical census tallying problems. There are about 32,000 ZCTAs and 43,000 Zip Codes in the U.S.

  • Download the US.zip file from http://download.geonames.org/export/zip/ That file provides a full list of US Zip Codes and their county, state, latitude, and longitude. Other countries post codes are also listed in that folder.
  • Download demographic data by post code from the US Census Bureau report DP05, choose the 5-digit ZCTA geographic option for the entire US. To calculate growth rate, you will need datasets for both the most recent year available and for the fifth year prior to that. (2017 and 2012 at the time this was written.) 

Notes:

Import maps in the next two steps will need some manipulation by concatenating fields to get nice looking names (such as "Boston, MA 02134") and to get codes to match up among the lists. You'll need to either import to transaction modules or do this manipulation in Excel.

Step 2

Create a list named "Loc 3 - Post Codes".

Set a top level member with a name like “Total Population Centers”. 

It is generally a best practice to create a Clear action for any list to be run before future list reloads.

Notes:

For the purposes of this demonstration, a flat list of 5-digit codes is sufficient. I found it helpful to roll up ZCTAs by state (Loc 1) and county (Loc 2). This is optional. I will leave “give friendly names to your list members and assign them to parents” as an exercise for the advanced reader.

Step 3

Create a module named "DATA: Loc 3 - Post Codes" dimensionalized by the list "Loc 3 - Post Codes" (no time, no version).

Notes:

  1. There are a LOT of data fields in the tables you downloaded, and much more data is available in other Census Bureau products (gender, households, age details, income, …). Feel free to add line items for any census fields that you find useful. I found it helpful to pull the data into Excel and keep only the fields of interest to streamline the mapping process in Anaplan.
  2. Expect a few rejects due to mismatches between Zip Code and ZCTA files. The geonames.org zip code list US.zip doesn't include Puerto Rico and other island territories. Census data does include them. As a result, Census ZCTAs that begin with 006## and 009## will report there is no matching list member. In a "real world" application, a significant effort goes into insuring that data "ties out" by addressing issues like this. You may either ignore the small percentage of rejects (my sincere apologies to the people of Puerto Rico) or you may find and add those missing zip codes to your list. Your choice. 

For this exercise, the module must contain, at minimum, these line items:

 

Formula

Format

Applies To

DATA: Loc 3 - Post Codes

 

 

Loc 3 - Post Codes

Latitude

 

Number

-

Longitude

 

Number

-

Total Population

 

Number

-

Total Population 5 yr prior

 

Number

-

Growth Rate

POWER(Total Population /
'Total Population 5 yr prior', 0.2) – 1

Number

-

Median Age

 

Number

-

Median Age * Tot Pop

Median Age * Total Population

Number

-

Set the Summary properties as follows:

  • 'Total Population’, ‘Total Population 5 yr prior’, and ‘Median Age * Tot Pop’ aggregate by Sum.
  • ‘Growth Rate’ aggregates by Formula.
  • ‘Median Age’ aggregates by Ratio: ‘Median Age * Tot Pop’ / ‘Total Population’

Create import actions to load your downloaded data into “DATA: Loc 3 – Post Codes”

Step 4

Create a module named "INPUT: Globals". It holds four constants and two inputs as line items. There is no List, Time or Version dimension. I put those line items’ values into the Formula so users cannot change them.

Line Items are:

 

Formula

Format

Applies To

INPUT: Globals

 

 

 <none>

UI

 

No Data

 

Select a Location

 

List, Loc 3 – Post Code

-

Market Area Radius (miles)

 

Number

-

Constants

 

No Data

-

Earth Radius (km)

6371

Number

-

Pi

3.141592654

Number

-

km / mi

1.609344

Number

-

ACOS(2/3)

0.588002604

Number

-

Publish the “Select a Location” and “Market Area Radius (miles) line items to a new dashboard with the name “Distance Demo”.

Note:

Distance calculations in kilometers are provided below. Feel free to adjust your model’s inputs, outputs, and filters to the needs of your locale.

Step 5

Create a module named "CALC: Post Code - Nearby Population Centers" dimensionalized by only the list “Loc 3 - Post Codes”. There are no Time or Versions dimensions.

 

Formula

Format

Applies To

CALC: Post Code - Nearby Population Centers

 

 

Loc 3 - Post Codes

Origination Location:

 

No Data

-

Selected Post Code

'INPUT: Globals'.'Select a Location'

List: Loc 3 - Post Codes

<none>

Selected Post Code Latitude

'DATA: Loc 3 - Post Codes'.Latitude[LOOKUP: Selected Post Code]

Number

<none> 

Selected Post Code Longitude

'DATA: Loc 3 - Post Codes'.Longitude[LOOKUP: Selected Post Code]

Number

<none> 

Destination Location:

 

No Data

 

Population Center

ITEM('Loc 3 - Post Codes')

List: Loc 3 - Post Codes

-

Population

If 'In Market Area?' Then 'DATA: Loc 3 - Post Codes'.Total Population Else 0

Number

-

Population 5 yr prior

IF In Market Area? THEN 'DATA: Loc 3 - Post Codes'.'Total Population 5 yr prior' ELSE 0

Number

Growth Rate

IF In Market Area? THEN POWER(Population / 'Population 5 yr prior', -2) - 1 ELSE 0

Number, Percent

-

Median Age

If 'In Market Area?' Then 'DATA: Loc 3 - Post Codes'.Median Age Else 0

Number

-

Median Age * Pop

If 'In Market Area?' Then Median Age * Population Else 0

Number

-

Pop Center Latitude

'DATA: Loc 3 - Post Codes'.Latitude

Number

-

Pop Center Longitude

'DATA: Loc 3 - Post Codes'.Longitude

Number

-

Calculated Distance:

 

No Data

 

Distance (miles)

'EarthRadius (miles)' * 'ACOS(x)'

Number

-

Distance (km)

'EarthRadius (km)' * 'ACOS(x)'

Number

-

Staging

 

No Data

 

EarthRadius (km)

'INPUT: Globals'.'Earth Radius (km)'

Number

 

EarthRadius (miles)

'EarthRadius (km)' / 'INPUT: Globals'.'km / mi'

Number

 

Pi

'INPUT: Globals'.Pi

Number

 

Radians(90 - Lat1)

2 * Pi * (90 - Selected Post Code Latitude) / 360

Number

-

COS(Radians(90 -  Lat1))

1 - POWER('Radians(90 - Lat1)', 2) / 2 + POWER('Radians(90 - Lat1)', 4) / 24 - POWER('Radians(90 - Lat1)', 6) / 720 + POWER('Radians(90 - Lat1)', 8) / 40320 - POWER('Radians(90 - Lat1)', 10) / 3628800 + POWER('Radians(90 - Lat1)', 12) / 479001600 - POWER('Radians(90 - Lat1)', 14) / 87178291200 + POWER('Radians(90 - Lat1)', 16) / 20922789888000 - POWER('Radians(90 - Lat1)', 18) / 6402373705728000 + POWER('Radians(90 - Lat1)', 20) / 2432902008176640000

Number

-

SIN(Radians(90 - Lat1))

'Radians(90 - Lat1)' - POWER('Radians(90 - Lat1)', 3) / 6 + POWER('Radians(90 - Lat1)', 5) / 120 - POWER('Radians(90 - Lat1)', 7) / 5040 + POWER('Radians(90 - Lat1)', 9) / 362880 - POWER('Radians(90 - Lat1)', 11) / 39916800 + POWER('Radians(90 - Lat1)', 13) / 6227020800 - POWER('Radians(90 - Lat1)', 15) / 1307674368000 + POWER('Radians(90 - Lat1)', 17) / 355687428096000 - POWER('Radians(90 - Lat1)', 19) / 121645100408832000 + POWER('Radians(90 - Lat1)', 21) / 51090942171709440000

Number

-

Radians(90 - Lat2)

2 * Pi * (90 - Pop Center Latitude) / 360

Number

-

COS(Radians(90 -  Lat2))

1 - POWER('Radians(90 - Lat2)', 2) / 2 + POWER('Radians(90 - Lat2)', 4) / 24 - POWER('Radians(90 - Lat2)', 6) / 720 + POWER('Radians(90 - Lat2)', 8) / 40320 - POWER('Radians(90 - Lat2)', 10) / 3628800 + POWER('Radians(90 - Lat2)', 12) / 479001600 - POWER('Radians(90 - Lat2)', 14) / 87178291200 + POWER('Radians(90 - Lat2)', 16) / 20922789888000 - POWER('Radians(90 - Lat2)', 18) / 6402373705728000 + POWER('Radians(90 - Lat2)', 20) / 2432902008176640000

Number

-

SIN(Radians(90 - Lat2))

'Radians(90 - Lat2)' - POWER('Radians(90 - Lat2)', 3) / 6 + POWER('Radians(90 - Lat2)', 5) / 120 - POWER('Radians(90 - Lat2)', 7) / 5040 + POWER('Radians(90 - Lat2)', 9) / 362880 - POWER('Radians(90 - Lat2)', 11) / 39916800 + POWER('Radians(90 - Lat2)', 13) / 6227020800 - POWER('Radians(90 - Lat2)', 15) / 1307674368000 + POWER('Radians(90 - Lat2)', 17) / 355687428096000 - POWER('Radians(90 - Lat2)', 19) / 121645100408832000 + POWER('Radians(90 - Lat2)', 21) / 51090942171709440000

Number

-

Radians(Long1-Long2)

2 * Pi * (Selected Post Code Longitude - Pop Center Longitude) / 360

Number

-

COS(RADIANS(Long1-Long2))

1 - POWER('Radians(Long1-Long2)', 2) / 2 + POWER('Radians(Long1-Long2)', 4) / 24 - POWER('Radians(Long1-Long2)', 6) / 720 + POWER('Radians(Long1-Long2)', 8) / 40320 - POWER('Radians(Long1-Long2)', 10) / 3628800 + POWER('Radians(Long1-Long2)', 12) / 479001600 - POWER('Radians(Long1-Long2)', 14) / 87178291200 + POWER('Radians(Long1-Long2)', 16) / 20922789888000 - POWER('Radians(Long1-Long2)', 18) / 6402373705728000 + POWER('Radians(Long1-Long2)', 20) / 2432902008176640000

Number

-

X - pre adj

'COS(Radians(90 -  Lat1))' * 'COS(Radians(90 -  Lat2))' + 'SIN(Radians(90 - Lat1))' * 'SIN(Radians(90 - Lat2))' * 'COS(RADIANS(Long1-Long2))'

Number

-

X

IF ABS('X - pre adj') <= 1 / POWER(2, 0.5) THEN 'X - pre adj' ELSE IF ABS('X - pre adj') > 1 THEN SQRT(-1) ELSE POWER(1 - POWER('X - pre adj', 2), 0.5)

Number

-

ASIN (Taylor Series)

X + 1 / 6 * POWER(X, 3) + 3 / 40 * POWER(X, 5) + 5 / 112 * POWER(X, 7) + 35 / 1152 * POWER(X, 9) + 63 / 2816 * POWER(X, 11) + 231 / 13312 * POWER(X, 13) + 143 / 10240 * POWER(X, 15) / 6435 / 557056 * POWER(X, 17) + 12155 / 1245184 * POWER(X, 19) + 46189 / 5505024 * POWER(X, 21) + 88179 / 12058624 * POWER(X, 23)

Number

-

ASIN(x)

IF ABS('X - pre adj') <= 1 / SQRT(2) THEN 'ASIN (Taylor Series)' ELSE IF 'X - pre adj' > 1 / SQRT(2) AND 'X - pre adj' <= 1 THEN Pi / 2 - 'ASIN (Taylor Series)' ELSE IF 'X - pre adj' < -1 / SQRT(2) AND 'X - pre adj' > -1 THEN -Pi / 2 + 'ASIN (Taylor Series)' ELSE SQRT(-1)

Number

-

ACOS(x)

Pi / 2 - 'ASIN(x)'

Number

-

Filters

 

No Data

 

In Market Area?

Distance (miles)' > 0 AND 'Distance (miles)' <= 'INPUT: Globals'.'Market Area Radius (miles)'

Boolean

-

Set summary settings for the user-facing population and age line items just as you did in step 2. The line items under Distance Calculations and Staging should not roll up, so use summary: None. (This is a best practice for conserving model size). The ‘In Market Area?’ Boolean should roll up using summary: Any.

Filter the list with ‘In Market Area?’ = TRUE and publish the 'CALC: Post Code - Nearby Population Centers' module to your dashboard.

In grid view, use pivot / filter / hide in the module: 

‘Loc 3 – Post Codes’ is the row dimension,

Filter on ‘Is Market Area?’ = True,

Line items are in the columns and only the desired line items show,

Adjust column settings for heading wrap and column widths.

Save the view and publish it to your dashboard.

Step 6

Create a new Action that opens the dashboard, name it "Refresh Surrounding Locations". Publish it to your dashboard and position it between the two inputs and the output module. This action button is needed because the output module is filtered for "In Market Area?" = True but that filtering is only updated when the dashboard is refreshed.

This completes the build instructions, following are more insights on the calculations.

The calculation logic

Take a look at the line item formulas under Staging. In those, we build the distance equation from its component parts. You might find it helpful to know that each trig operation, such as COS(90 - lat1), is a line item.

Radius of Earth * 
ACOS(  COS(90 - lat1) * COS(90 - lat2) 

     + SIN(90 - lat1) * SIN(90 - lat2) * COS(lon1 - lon2)

    )

In overview, the line items represent these steps:

  • Get the constants Pi, Earth’s radius, etc.
  • Convert latitude and longitude from degrees to radians
  • Use Taylor Series formulas to calculate the variety of SIN and COS components
  • Use another Taylor Series formula and a trig identity to calc ASIN, then convert ASIN to ACOS using another trig identity.
  • Multiply the finished ACOS by Earth’s radius.

Going Multidimensional

This example model is intentionally small; it uses a single list of locations and computes their distances from a selected location. In most "real world" applications, you need to know the distance between every pairing of two lists of locations, for example Stores and Towns, or DCs and Stores. Let’s call them origin and destination locations. To compute distance between every possible pairing, you would dimensionalize the CALC module above by those two lists and replace the user selection with ITEM(<origin location list>).

Good luck!!

The content in this article has not been evaluated for all Anaplan implementations and may not be recommended for your specific situation.
Please consult your internal administrators prior to applying any of the ideas or steps in this article.
Labels (2)
Contributors