How to query history/logs in a high scale environment

For those of you in high scale environments, that wants to query & audit logs, you may find it usefull to export logs on a daily or weekly basis and then query them like a database.

 

I do this with Apache Drill (https://drill.apache.org/) which allows me to download multiple history files to a directory, and then query accross all of the data files to find records or audit activities.  Here are the steps to replicate:

 

1. Download apache drill and extract it to a directory on your computer

2. Launch Apache Drill - I'm on Linux but you can also use Windows or Mac

 

juddm@gw:~/apache-drill-1.12.0$ bin/drill-embedded
Dec 18, 2017 11:08:29 AM org.glassfish.jersey.server.ApplicationHandler initialize
INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...
apache drill 1.12.0
"got drill?"
0: jdbc:drill:zk=local>

 

This is embedded mode Apache Drill - but you can also use distributed mode which allows you to have clusters of servers acting as a single database engine.

 

3. In Anaplan, show history using the "save option".  This will create an action you can use on a dashboard or call via Anaplan connect on a scheduled basis.

4. Run the Action to export the history.  The file will be named something like "Model History - Last 7 Days.txt"

5. Save the file to a data directory on the computer running Apache Drill.  I use /home/juddm/apache-drill-1.14.0/data/anaplan_logs When you save the file - rename it so to include the model name and the date of the download and make sure the extension is .tsv (tab delimited).  I called mine mikesmodel-201771218.tsv

6. Query all of the log files in the anaplan_logs directory in Apache Drill:

0: jdbc:drill:zk=local> select * from dfs.`/home/juddm/apache-drill-1.12.0/data/anaplan_logs` limit 5;
+----------+----------------------+-------------------------+-------------------+-----------------+------------+------------------------------+-----------------------------------------------+--------------+----------------------+----------------------+--------+-----------+------+-------------+---------------+---------------+----------+-----------------------------+------------+-----------------------------+------------------+----------------------+---------+---------+------------+--------------+---------+--------------+-------+----------+
| ID | Date_Time__UTC_ | User | Description | Previous_Value | New_Value | Module_List | Line_Item_Property | Rep_Profile | Sales_Capacity_Reps | Use_Case_Department | Plans | L0_Geo | LOB | Quota_Tier | Month_Number | Ramp_Profile | Quarter | Accounts__Salesforce__Flat | Deal_SFDC | Opportunities__Salesforce_ | L1_New_Industry | L2_New_Sub_Industry | Export | Object | Dashboard | Target_User | Import | Target_Role | Time | Version |
+----------+----------------------+-------------------------+-------------------+-----------------+------------+------------------------------+-----------------------------------------------+--------------+----------------------+----------------------+--------+-----------+------+-------------+---------------+---------------+----------+-----------------------------+------------+-----------------------------+------------------+----------------------+---------+---------+------------+--------------+---------+--------------+-------+----------+
| 2006844 | 2017-12-11 15:45:01 | jon.mavetz@anaplan.com | | 0.93 | Avera | Won Deals (now deleted)Won | Opportunity (now deleted)ACV (now deleted)Ty | | | | | Americas | | | | | | | | | | | | | | | | | |
| 2006845 | 2017-12-11 15:45:05 | jon.mavetz@anaplan.com | | 100.0 | ge N | Deals (now deleted)Won De | pe (now deleted)ACVCountLY CountCountLY Coun | | | | | Americas | | | | | | | | | | | | | | | | | |
| 2006846 | 2017-12-11 15:45:41 | jon.mavetz@anaplan.com | Change Line Item | | | als (now deleted)Won Deals | tLY CountCountLY CountAver | | | | | | | | | | | | | | | | | | | | | | |
| 2006847 | 2017-12-11 15:45:43 | jon.mavetz@anaplan.com | Change Line Item | | | (now deleted)Won Deals (now | age New ACV (now Averag | | | | | | | | | | | | | | | | | | | | | | |
| 2006847 | 2017-12-11 15:45:43 | jon.mavetz@anaplan.com | Change Line Item | | | deleted)Pipeline HealthCus | e New ACV ($000's))Total A | | | | | | | | | | | | | | | | | | | | | | |
+----------+----------------------+-------------------------+-------------------+-----------------+------------+------------------------------+-----------------------------------------------+--------------+----------------------+----------------------+--------+-----------+------+-------------+---------------+---------------+----------+-----------------------------+------------+-----------------------------+------------------+----------------------+---------+---------+------------+--------------+---------+--------------+-------+----------+
5 rows selected (0.768 seconds)

Screenshot from 2017-12-18 13_42_07.png

 

You can use the apache drill odbc/jdbc drivers to use your own SQL tool interface or ETL tools to automate auditing or build reports.  Apache Drill can also be used to query web based data sources which you can then intergrate as imports in to your Anaplan models for predictive forward indicators.