Machine Learning Uncovers the Environmental Toll of Stubble Burning in Northern India
A recent study by researchers from the School of Computer Science and Engineering at the University of Westminster and the School of Computer Science and Technology at the University of Bedfordshire examines air pollution in northern India, focusing on Delhi, Punjab, and Haryana. Air pollution is a severe public health threat worldwide, with an estimated seven million deaths each year attributed to exposure to fine particulate matter. The research investigated the factors behind hazardous air quality in cities like Delhi, where industrial emissions, vehicular pollution, and seasonal agricultural practices such as stubble burning have been identified as key drivers.
The study aimed to predict the Air Quality Index (AQI) with several machine learning models, both to understand how individual pollutants affect air quality and to assess the influence of stubble burning in Punjab on AQI levels across the region. The researchers used a dataset from the Central Pollution Control Board (CPCB) containing measurements of pollutants such as PM2.5, PM10, NO2, SO2, and CO, collected from monitoring stations in Delhi, Haryana, and Punjab. They applied Random Forest, CatBoost, XGBoost, Support Vector Regressor (SVR), and Long Short-Term Memory (LSTM) models to predict AQI levels. Random Forest emerged as the most accurate predictor, followed closely by CatBoost and XGBoost, which underscores how well tree-based ensemble models handle complex, high-dimensional air quality data.
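To make the modeling step concrete, here is a minimal sketch of fitting a Random Forest regressor to pollutant readings with scikit-learn. It is not the study's code: the data are synthetic, the value ranges are invented, and the target is a made-up weighted sum standing in for AQI rather than the official CPCB calculation. The hyperparameters are illustrative assumptions as well.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for station readings; this is NOT the CPCB dataset.
rng = np.random.default_rng(0)
n_samples = 500
pollutants = ["PM2.5", "PM10", "NO2", "SO2", "CO"]
X = rng.uniform(
    low=[5, 10, 5, 2, 0.1],
    high=[300, 450, 120, 60, 4.0],
    size=(n_samples, len(pollutants)),
)

# Toy target: an invented weighted sum plus noise, used only so the
# example has something to learn. It is not the real AQI formula.
y = 1.2 * X[:, 0] + 0.4 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 5, n_samples)

# Hold out 20% of the rows to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2 = r2_score(y_test, model.predict(X_test))
importances = dict(zip(pollutants, model.feature_importances_))

print(f"R^2 on held-out data: {r2:.3f}")
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The feature importances give a rough view of which pollutants the model leans on most, which is one way a tree ensemble can be used to study pollutant impact as well as to predict.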
One of the key focal points of the study was the role of stubble burning, a prevalent agricultural practice in North India, particularly in Punjab and Haryana. Farmers often resort to burning crop residues after harvest to prepare the fields for the next planting season, leading to a significant increase in air pollution levels during the post-harvest months from October to December. The research highlighted the detrimental impact of stubble burning on air quality in Delhi and surrounding areas, emphasizing the need for alternative and sustainable farming practices to mitigate this environmental challenge.
Beyond the environmental analysis, the researchers addressed data preprocessing challenges, including missing values and outliers. They used techniques such as mean imputation to fill gaps in the dataset so that it could be used reliably for machine learning analysis. They also examined outliers with visualizations such as box plots, distinguishing genuine extreme pollution events from data collection errors in order to preserve data integrity.
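The two preprocessing ideas mentioned here can be sketched in a few lines of standard-library Python. The readings below are invented. The function names are hypothetical, and the 1.5 x IQR whisker rule is the usual box-plot convention rather than a threshold confirmed by the study.

```python
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values outside the box-plot whiskers (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Invented hourly PM2.5 readings with two gaps and one extreme value.
pm25 = [42.0, 55.0, None, 61.0, 48.0, 950.0, 53.0, None, 58.0, 50.0]

filled = mean_impute(pm25)
flagged = iqr_outliers(filled)

print("Imputed series:", filled)
print("Flagged for review:", flagged)
```

A flagged value is only a candidate. Deciding whether it reflects a real pollution spike or a sensor fault still needs domain judgment. Note also that a single extreme reading pulls the mean upward, so the imputed values here are inflated, which is a known limitation of mean imputation.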
While models such as Random Forest proved effective at predicting AQI levels, the research underscored the need to integrate more robust meteorological data to improve predictive accuracy. Temperature, humidity, and wind direction significantly influence how pollutants disperse and concentrate, so they play a crucial role in air quality forecasts. The researchers recommended combining improved meteorological data with machine learning models to strengthen AQI predictions and support more effective pollution control measures.
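As a rough illustration of what such integration could look like at the feature level, the sketch below attaches weather observations to a pollutant reading. All field names and values are hypothetical. Encoding wind direction as a sine and cosine pair is a common way to handle circular quantities, used here as an assumption rather than something the study describes.

```python
import math


def encode_wind_direction(degrees):
    """Map a compass bearing to (sin, cos) so 359 and 1 degrees end up close together."""
    radians = math.radians(degrees)
    return math.sin(radians), math.cos(radians)


def build_features(pollutants, weather):
    """Combine one pollutant reading with the weather observed at the same time."""
    wind_sin, wind_cos = encode_wind_direction(weather["wind_dir_deg"])
    return {
        **pollutants,
        "temperature_c": weather["temperature_c"],
        "humidity_pct": weather["humidity_pct"],
        "wind_sin": wind_sin,
        "wind_cos": wind_cos,
    }


# Invented example: one reading with a north-westerly wind (315 degrees).
row = build_features(
    {"PM2.5": 180.0, "PM10": 260.0, "NO2": 45.0},
    {"temperature_c": 18.5, "humidity_pct": 72.0, "wind_dir_deg": 315.0},
)
print(row)
```

Rows built this way could be passed to any of the regressors mentioned earlier. The raw bearing in degrees would treat 359 and 1 as far apart, which is why the circular encoding is used.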
In conclusion, the study highlights the urgent need to address air pollution challenges in northern India, particularly concerning stubble burning and its impact on air quality. By leveraging advanced machine learning models and incorporating comprehensive environmental data, researchers aim to provide valuable insights for policymakers and stakeholders to develop targeted interventions and strategies for air quality improvement in the region.