Pre-processing
Raw data should first be processed using the notebooks in notebooks/preprocess/*.
The entry point for the pre-processing step of the ML pipeline is src/pre-processing.py.
- Args description:
  * --data_path: Path to the data files
Input: Pass the root directory of the xarray data files as the script argument. All data files produced are stored in this directory.
src/utils/data_paths.py defines the file paths for the features used in training, as well as the path of the fuel_load.nc file that will be created.
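As a rough illustration of the argument described above, the following sketch shows how a --data_path flag could be parsed with argparse. Only the flag name and its help text come from this page; the parse_args helper and the example path are hypothetical, and the actual parsing code in src/pre-processing.py may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments (hypothetical helper, not the project's code)."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser for the pre-processing entry point"
    )
    # --data_path is the only argument documented for the script.
    parser.add_argument(
        "--data_path",
        type=str,
        required=True,
        help="Path to the data files",
    )
    return parser.parse_args(argv)


# Example: parse an explicit argument list instead of sys.argv.
# "/path/to/data" is a placeholder root directory.
args = parse_args(["--data_path", "/path/to/data"])
print(f"Root data directory: {args.data_path}")
```

Under these assumptions, the script would be invoked as python src/pre-processing.py --data_path /path/to/data, with the given directory serving as both the input root and the save location.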
Output:
Creates the fuel_load.nc file for Fuel Load Data (Burned Area * Above Ground Biomass).
Saves the following files for the Tropics and Mid-Latitudes regions respectively, where {type} is ‘tropics’ or ‘midlats’.
Save Directory: root_path/{type}
- {type}_train.csv
- {type}_val.csv
- {type}_test.csv
Save Directory: root_path/infer_{type}
- {type}_infers_July.csv
- {type}_infers_Aug.csv
- {type}_infers_Sept.csv
- {type}_infers_Oct.csv
- {type}_infers_Nov.csv
- {type}_infers_Dec.csv
Where root_path is the root save path provided to pre-processing.py.
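To make the output layout concrete, here is a minimal sketch that builds the list of CSV files expected for one region and reports which ones are missing. The helper names expected_outputs and missing_outputs are hypothetical, and the underscore-separated inference file names ({type}_infers_{month}.csv) are an assumption about the naming pattern; check them against the files the script actually writes.

```python
from pathlib import Path

# Region names and inference months as listed in the output description.
REGIONS = ("tropics", "midlats")
INFER_MONTHS = ("July", "Aug", "Sept", "Oct", "Nov", "Dec")


def expected_outputs(root_path, region):
    """Return the CSV paths expected for one region (hypothetical helper)."""
    root = Path(root_path)
    # Train/validation/test splits live under root_path/{type}.
    split_files = [
        root / region / f"{region}_{split}.csv"
        for split in ("train", "val", "test")
    ]
    # Monthly inference files live under root_path/infer_{type}.
    # The "_infers_" separator is an assumed naming pattern.
    infer_files = [
        root / f"infer_{region}" / f"{region}_infers_{month}.csv"
        for month in INFER_MONTHS
    ]
    return split_files + infer_files


def missing_outputs(root_path, region):
    """List the expected files that do not exist on disk yet."""
    return [p for p in expected_outputs(root_path, region) if not p.exists()]


# Example: print the expected layout for both regions under a placeholder root.
for region in REGIONS:
    for path in expected_outputs("/path/to/data", region):
        print(path)
```

Running missing_outputs after pre-processing gives a quick check that all nine files per region were written before moving on to training or inference.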