Research on Prediction of Automobile Carbon Dioxide Emissions Based on Machine Learning
XUE Yunfei
(School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074, China)
Abstract: Equipment for measuring the carbon dioxide (CO2) content of automobile exhaust is expensive and has low measurement accuracy. To address this problem, this paper studies the prediction of automobile CO2 emissions based on machine learning. First, the Spearman rank correlation coefficient is used to analyze the correlations among vehicle features and to filter out redundant features. Next, the random forest algorithm is used to select the four core features that affect CO2 emissions. Finally, prediction models for CO2 emissions are built with four machine learning algorithms (linear regression, gradient boosting tree, XGBoost, and support vector machine); after comparing model performance and tuning hyperparameters by grid search, the model built on the gradient boosting tree algorithm is identified as the best prediction model. A comparison of predicted and measured values shows that this model achieves high prediction accuracy and can effectively predict the CO2 emissions per kilometer of different automobiles.
Keywords: machine learning; CO2 emissions; Spearman rank correlation coefficient; random forest algorithm; prediction model
CLC number: TP181  Document code: A  Article ID: 1674-2605(2023)01-0004-06
DOI: 10.3969/j.issn.1674-2605.2023.01.004
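
The first step named in the abstract, filtering redundant features by Spearman rank correlation, can be illustrated with a minimal sketch. It is not the author's implementation: the feature names, the sample values, the 0.9 threshold, and the rule of keeping the first feature of each highly correlated pair are all assumptions made for illustration, and the sketch assumes no feature is constant.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9):
    """Keep the first feature of every pair whose |rho| exceeds the threshold."""
    dropped = set()
    for a, b in combinations(features, 2):
        if a in dropped or b in dropped:
            continue
        if abs(spearman(features[a], features[b])) > threshold:
            dropped.add(b)
    return [name for name in features if name not in dropped]


# Made-up values for illustration only; not data from the paper.
features = {
    "engine_size": [1.2, 1.6, 2.0, 2.5, 3.0, 3.5],
    "cylinders": [3, 4, 4, 6, 6, 8],
    "fuel_city": [6.1, 7.4, 8.8, 10.9, 12.5, 14.2],
    "model_year": [2019, 2015, 2021, 2016, 2020, 2017],
}
kept = drop_redundant(features)
```

With these made-up values, `kept` is `['engine_size', 'model_year']`: the hypothetical `cylinders` and `fuel_city` columns rank almost identically to `engine_size` and are dropped. The surviving features would then go on to the random forest importance ranking described in the abstract.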