
Wildfire Prediction: Time Series Classification

Geospatial Analysis of Wildfires in California Area

Business Problem

Firefighting resources are crucial in containing wildfires and should be allocated correctly when needed. MODIS historical fire pixel data contains useful spatial information that can be used to identify patterns over time. Meteorological indicators provide information about the weather conditions that cause wildfires, and soil quality indicators give information about the likelihood of drought. Although MODIS identifies large fires with higher accuracy, it produces too many false alarms, and committing resources to false alarms is expensive. Accurate fire prediction is therefore crucial for minimizing cost and for early preparation.

For this project, I set out to answer the following questions: Do weather and soil conditions have a significant effect on wildfires? Can we identify true fires more accurately by combining weather and soil data with fire pixel data? Can we identify fire-prone areas before a fire happens, using historical fire pixel data combined with weather indicators, soil quality indicators, and location information?

Techniques
To answer the first question, I used data visualization with the Python libraries Seaborn and Matplotlib. The analysis showed that both weather and soil quality do affect wildfires, as the charts below indicate:
 
     

For the second question, I used two classification models, Random Forest and Support Vector Machines. I carried out model selection and evaluation by comparing the precision and accuracy of both models, then tuned hyperparameters to increase accuracy. Random Forest performed better, with above 98% accuracy in identifying true fire pixels.
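The comparison described above can be sketched roughly as follows. This is a minimal illustration with Scikit-learn, not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the combined fire pixel, weather, and soil table), and the hyperparameter grids, the 3-fold cross-validation, and the choice of precision as the tuning metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the fire pixel table (NOT the project's data):
# label 1 = true fire, label 0 = false alarm.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with small, hypothetical hyperparameter grids.
candidates = {
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [None, 10]},
    ),
    "svm": (
        # SVMs are sensitive to feature scale, so scale inside the pipeline.
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1, 10], "svc__kernel": ["rbf", "linear"]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Tune on the training split only, then score once on the held-out split.
    search = GridSearchCV(estimator, grid, scoring="precision", cv=3)
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    results[name] = {
        "best_params": search.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, scores)
```

Reporting precision alongside accuracy matters here because a false alarm (a predicted fire that is not real) is the costly error described in the business problem.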

For the last question, I attempted time series classification. I transformed the data into a 3-dimensional array and implemented an LSTM model for classification. Model accuracy was 75% and precision was 0.50.
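An LSTM layer typically expects input shaped as (samples, timesteps, features), which is why the tabular data has to become 3-dimensional. The sketch below shows one common way to do that with NumPy, using overlapping sliding windows. The helper `make_windows`, the window length of 3, and the choice to label each window with the observation that follows it are assumptions for illustration; the original project may have built its sequences differently.

```python
import numpy as np

def make_windows(features, labels, timesteps):
    """Stack consecutive rows into overlapping windows.

    features: array of shape (n_rows, n_features), ordered by time
              for a single location (hypothetical layout).
    labels:   array of shape (n_rows,), one fire / no-fire flag per row.

    Returns X with shape (n_windows, timesteps, n_features) and y with
    shape (n_windows,), where each window is labelled by the row that
    immediately follows it.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n_windows = len(features) - timesteps
    if n_windows <= 0:
        raise ValueError("need more rows than timesteps")
    X = np.stack([features[i:i + timesteps] for i in range(n_windows)])
    y = labels[timesteps:]
    return X, y

# Toy example: 10 daily observations with 2 features each (made-up values).
daily = np.arange(20, dtype=float).reshape(10, 2)
fire = np.array([0, 0, 0, 1, 0, 0, 1, 0, 0, 1])

X, y = make_windows(daily, fire, timesteps=3)
print(X.shape, y.shape)  # (7, 3, 2) (7,)
```

With real data, the windows would be built separately for each location so that a single window never mixes observations from different places.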

Future Applications
The model has the potential to perform better with the right kind of data. There are other methods that I would like to try, such as Random Forest for time series classification. Furthermore, the model can also be used to locate true fires accurately by identifying true fire pixels.
 
Project Duration
This project lasted approximately 2-3 months. Although modeling did not require much time, most of the time was spent on data collection and on preparing the geospatial data.
 
Key Skills
Data Transformation, Data Munging, Data Visualization, Spatial Analysis, Deep Neural Networks, Classification Models
 
Tools
Python, Geopandas, Pandas, NumPy, Scikit-learn, Seaborn and Matplotlib
 


Copyright © 2023 By Taniya D. Adhikari
All Rights Reserved