CatBoost inference

This notebook demonstrates generating inferences from a pre-trained CatBoost model using the deepfuel-ML/src/test.py script. The script handles everything from calculating error values to plotting the data for visual inspection.

import os
import pandas as pd
import numpy as np
from joblib import dump, load
from IPython.display import Image, display, HTML

Using test.py

Below is a description of its arguments:
- --model_name: Name of the model to run inference with (“CatBoost” or “LightGBM”).
- --model_path: Path to the pre-trained model.
- --data_path: Path to a valid data directory where all the test .csv files are stored.
- --results_path: Directory where the resulting inference .csv files and .html visualizations will be stored.
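As a rough sketch of how these four flags could be wired up, the parser below uses argparse. It is an illustration only, not the contents of test.py: the function name build_parser, the choices restriction, and marking every flag as required are all assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the four flags described above."""
    parser = argparse.ArgumentParser(description="Generate fuel load inferences")
    # Assumption: only the two model names mentioned in the text are accepted.
    parser.add_argument("--model_name", choices=["CatBoost", "LightGBM"], required=True)
    parser.add_argument("--model_path", required=True, help="Path to the pre-trained model")
    parser.add_argument("--data_path", required=True, help="Directory holding the test .csv files")
    parser.add_argument("--results_path", required=True, help="Directory for output .csv and plot files")
    return parser


# Parse the same argument values used in the command in this notebook.
args = build_parser().parse_args([
    "--model_name", "CatBoost",
    "--model_path", "../src/pre-trained_models/CatBoost.joblib",
    "--data_path", "../data/infer_midlats",
    "--results_path", "../data/midlats/results",
])
print(args.model_name, args.results_path)
```

Passing an explicit list to parse_args makes it possible to try the parser from inside a notebook cell without touching sys.argv.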

With Ground Truth (actual_load is present in the test csv)

!python '../src/test.py'  --model_name 'CatBoost' --model_path '../src/pre-trained_models/CatBoost.joblib' --data_path '../data/infer_midlats'  --results_path '../data/midlats/results'
MAPE July : 380.44795759521344
MAPE Aug : 283.7487728040964
MAPE Sept : 203.97476414457114
MAPE Oct : 117.19251658203949
MAPE Nov : 105.94428641567805
MAPE Dec : 99.29645055040669
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Nov_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Nov_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_July_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_July_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Dec_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Dec_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Aug_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Aug_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Oct_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Oct_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Sept_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Sept_predicted.html

Inference CSV

test.py generates a .csv file for each month with the following columns:
- lat: Latitude
- lon: Longitude
- actual_load: Actual fuel load value
- predicted_load: Predicted fuel load value
- APE: Absolute percentage error between the actual and predicted fuel load values
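The APE values in the July output are consistent with the absolute percentage error, |actual - predicted| / |actual| × 100, computed per row, with the reported MAPE being the mean of those values. The sketch below works under that assumption; the exact formula inside test.py is not shown in this notebook, and the function names ape and mape are hypothetical.

```python
def ape(actual, predicted):
    """Absolute percentage error for one grid point, in percent."""
    return abs(actual - predicted) / abs(actual) * 100.0


def mape(actuals, predictions):
    """Mean of the per-point APE values."""
    errors = [ape(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


# Values copied from the first two rows of the July output in this notebook.
actual = [9.188477e07, 7.486465e07]
predicted = [8.817028e07, 5.130763e08]

print([round(ape(a, p), 2) for a, p in zip(actual, predicted)])
print(round(mape(actual, predicted), 2))
```

Because the error is divided by the actual value, grid points with a small actual fuel load and a large prediction can produce APE values well above 100.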

df = pd.read_csv('../data/midlats/results/midlats_output_July.csv')
df.head()
       lat     lon   actual_load  predicted_load         APE
0  -35.125 -69.375  9.188477e+07    8.817028e+07    4.042547
1  -31.625  27.875  7.486465e+07    5.130763e+08  585.338529
2  -31.375  28.375  6.728101e+07    4.373534e+08  550.039875
3  -31.125  28.625  9.200570e+07    4.966761e+08  439.831873
4  -31.125  29.625  1.413486e+08    4.879350e+08  245.199817

Without Ground Truth (actual_load is not present in the test csv)

!python '../src/test.py'  --model_name 'CatBoost' --model_path '../src/pre-trained_models/CatBoost.joblib' --data_path '../data/infer_midlats'  --results_path '../data/midlats/results'
MAPE July : 380.44795759521344
MAPE Aug : 283.7487728040964
MAPE Sept : 203.97476414457114
MAPE Oct : 117.19251658203949
MAPE Nov : 105.94428641567805
MAPE Dec : 99.29645055040669
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Nov_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Nov_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_July_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_July_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Dec_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Dec_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Aug_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Aug_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Oct_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Oct_predicted.html
Actual FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Sept_actual.html
Predicted FL plot successfully generated! File saved to  ../data/midlats/results/midlats_Sept_predicted.html

Inference CSV

df = pd.read_csv('../data/midlats/results/midlats_output_July.csv')
df.head()
       lat     lon   actual_load  predicted_load         APE
0  -35.125 -69.375  9.188477e+07    8.817028e+07    4.042547
1  -31.625  27.875  7.486465e+07    5.130763e+08  585.338529
2  -31.375  28.375  6.728101e+07    4.373534e+08  550.039875
3  -31.125  28.625  9.200570e+07    4.966761e+08  439.831873
4  -31.125  29.625  1.413486e+08    4.879350e+08  245.199817
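To tell which of the two cases (with or without ground truth) applies to a given test file, one option is to check its header for the actual_load column. This is a minimal sketch using only the standard library; the helper name has_ground_truth and the sample column feature_1 are made up for illustration, and test.py may detect this differently.

```python
import csv
import io


def has_ground_truth(csv_file, column="actual_load"):
    """Return True if the CSV header row contains the ground-truth column."""
    header = next(csv.reader(csv_file), [])
    return column in header


# In-memory stand-ins for test files; the feature column name is hypothetical.
with_truth = io.StringIO("lat,lon,feature_1,actual_load\n-35.125,-69.375,0.5,9.188477e+07\n")
without_truth = io.StringIO("lat,lon,feature_1\n-35.125,-69.375,0.5\n")

print(has_ground_truth(with_truth))
print(has_ground_truth(without_truth))
```

For a file on disk, the same function can be called on an object returned by open(path, newline="").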

Visualizing the plots generated

The plots are stored as HTML files that can be zoomed in up to the resolution of the data to view the predicted and actual values.
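One way to view a saved plot inside the notebook is to embed it with IPython.display.IFrame. The sketch below is an assumption about how one might do this, not part of test.py: the helpers plot_path and show_plot are hypothetical, and the file name pattern is inferred from the log output above (for example midlats_Nov_actual.html).

```python
def plot_path(results_dir, region, month, kind):
    """Build a plot file path following the pattern seen in the log output,
    e.g. midlats_Nov_actual.html. `kind` is "actual" or "predicted"."""
    if kind not in ("actual", "predicted"):
        raise ValueError("kind must be 'actual' or 'predicted'")
    return f"{results_dir}/{region}_{month}_{kind}.html"


def show_plot(path, width=900, height=500):
    """Embed a saved HTML plot in the notebook (requires IPython)."""
    # Imported lazily so plot_path can be used without IPython installed.
    from IPython.display import IFrame, display
    display(IFrame(src=path, width=width, height=height))


path = plot_path("../data/midlats/results", "midlats", "Nov", "predicted")
print(path)
# In a notebook cell: show_plot(path)
```

Whether a relative path resolves inside the IFrame depends on how the notebook server serves files, so opening the .html file directly in a browser is a reasonable fallback.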