Estimation of Forest Reserves Based on Boruta and Extra-trees Methods

Forest Resources Management, 2020, Issue (4): 127-133. doi: 10.13466/j.cnki.lyzygl.2020.04.018

HAN Rui¹,²,³, WU Dasheng¹,²,³, FANG Luming¹,²,³, HUANG Yuling¹,²,³,⁴

1. School of Information Engineering, Zhejiang Agriculture and Forestry University, Hangzhou, Zhejiang 311300, China
2. Key Laboratory of State Forestry and Grassland Administration on Forestry Sensing Technology and Intelligent Equipment, Hangzhou, Zhejiang 311300, China
3. Key Laboratory of Forestry Intelligent Monitoring and Information Technology of Zhejiang Province, Hangzhou, Zhejiang 311300, China
4. Ceramics-Fireworks Vocational Technology School of Liling, Liling, Hunan 412200, China

Received: 2020-05-03  Revised: 2020-06-03  Online: 2020-08-28  Published: 2020-10-10
Corresponding author: WU Dasheng
About the author: HAN Rui (born 1995), male, from Zhoukou, Henan; master's student, mainly engaged in research on resource and environmental information systems. Email: 804874277@qq.com
Funding: Key Research and Development Program of Zhejiang Province (2018C02013)

Abstract

Forest stock volume is one of the key indicators of the quantity of forest resources. This study applies the Boruta feature selection method and the extremely randomized trees (Extra-trees) method, with the forest subcompartment as the unit of analysis, to estimate stock volume per hectare for part of Longquan City, offering a new approach to stock volume estimation at the county scale. Multiple features were extracted from the forest management inventory (Class II survey) data, Gaofen-2 (GF-2) remote sensing imagery, and digital elevation model data of the study area to form the original feature set. The original feature set was screened with Boruta, an estimation model was built with Extra-trees and checked by ten-fold cross-validation, and the model was compared with the random forest (RF) and gradient boosting methods. The results show that: (1) the features retained by Boruta are soil thickness, age, canopy density, elevation, slope, and aspect; (2) grid search gives an optimal Extra-trees parameter combination of 250 trees with a maximum tree depth of 14; (3) the model based on Boruta and Extra-trees reaches a test accuracy of 84.14%, with R² of 0.92, RMSE of 19.65 m³/hm², and MAE of 13.95 m³/hm², outperforming both the random forest and the gradient boosting models. Combining Boruta feature selection with Extra-trees therefore gave better stock volume estimates than either comparison method in this study.

Key words: Boruta feature selection, extremely randomized trees, random forest, forest stock volume, machine learning
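What sets Extra-trees apart from a random forest is the node split: each candidate cut-point is drawn at random instead of being optimized. The pure-Python sketch below illustrates that single step for regression. The function names, the toy data, and the use of variance reduction as the split score are illustrative assumptions, not the authors' implementation.

```python
import random

def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_reduction(y, left, right):
    """Drop in target variance obtained by splitting y into left and right."""
    n = len(y)
    return variance(y) - (len(left) / n) * variance(left) - (len(right) / n) * variance(right)

def pick_random_split(X, y, k, rng):
    """Extra-trees style split: k random features, one random cut-point each.

    Returns (feature_index, threshold, score) for the best of the k candidates,
    or None if no candidate produces two non-empty children.
    """
    n_features = len(X[0])
    best = None
    for j in rng.sample(range(n_features), min(k, n_features)):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue  # constant feature in this node, cannot split
        threshold = rng.uniform(lo, hi)  # random cut-point, not optimized
        left = [t for v, t in zip(column, y) if v < threshold]
        right = [t for v, t in zip(column, y) if v >= threshold]
        if not left or not right:
            continue
        score = variance_reduction(y, left, right)
        if best is None or score > best[2]:
            best = (j, threshold, score)
    return best

# Toy node with made-up numbers: feature 0 drives the target, feature 1 is noise.
rng = random.Random(0)
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 9.0], [8.0, 4.0], [9.0, 7.0], [10.0, 1.0]]
y = [10.0, 12.0, 11.0, 40.0, 42.0, 41.0]
split = pick_random_split(X, y, k=2, rng=rng)
print(split)
```

A full Extra-trees regressor would apply this step recursively until a depth or sample-size limit is reached, then average the predictions of many such trees.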

CLC number: TP757

HAN Rui, WU Dasheng, FANG Luming, HUANG Yuling. Estimation of Forest Reserves Based on Boruta and Extra-trees Methods[J]. Forest Resources Management, 2020, (4): 127-133.

Figures/Tables: 5 (Table 1, Table 2, Table 3, Figure 1, Figure 2)

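Table 1
Feature	Result	Feature	Result
Soil thickness	Passed	Humus thickness	Not passed
Age	Passed	Canopy density	Passed
Elevation	Passed	Slope	Passed
Aspect	Passed	Blue band	Not passed
Green band	Not passed	Red band	Not passed
Near-infrared band	Not passed	Difference vegetation index	Not passed
Ratio vegetation index	Not passed	Enhanced vegetation index	Not passed
Difference vegetation index	Not passed	Soil-adjusted vegetation index	Not passed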
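The pass/fail results above come from Boruta, which compares each real feature against shuffled "shadow" copies that cannot carry any signal. The sketch below illustrates only that comparison and is heavily simplified. It swaps Boruta's random-forest importance for absolute Pearson correlation and replaces the statistical test with a fixed hit-rate threshold; both substitutions, all names, and the synthetic data are assumptions made for illustration.

```python
import random

def abs_correlation(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5

def shadow_feature_screen(columns, target, n_rounds=50, hit_rate=0.9, seed=0):
    """Keep features that beat the best shadow copy in most rounds.

    columns maps feature name -> list of values. Returns name -> True/False.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadows: each real column shuffled, which breaks any link to the target.
        shadow_scores = []
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(abs_correlation(shadow, target))
        best_shadow = max(shadow_scores)
        for name, values in columns.items():
            if abs_correlation(values, target) > best_shadow:
                hits[name] += 1
    return {name: hits[name] / n_rounds >= hit_rate for name in columns}

# Synthetic stand-in data (not the paper's): "age" drives the target, "blue_band" is noise.
rng = random.Random(1)
age = [rng.uniform(5, 60) for _ in range(200)]
blue_band = [rng.uniform(0, 1) for _ in range(200)]
volume = [2.0 * a + rng.gauss(0, 5) for a in age]
result = shadow_feature_screen({"age": age, "blue_band": blue_band}, volume)
print(result)
```

The strongly predictive feature beats every shadow in every round, while a noise feature does so only by chance. That contrast is what the real method tests statistically.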

Table 2
Maximum tree depth	Number of trees	Random forest MSE	Gradient boosting MSE	Extra-trees MSE
2	50	-8.21	-6.49	-10.86
2	100	-8.23	-7.26	-10.73
2	150	-8.22	-6.22	-10.52
2	200	-8.12	-6.21	-10.52
2	250	-8.24	-6.23	-10.69
6	50	-6.23	-6.03	-6.86
6	100	-6.26	-6.14	-6.87
6	150	-6.25	-6.25	-6.87
6	200	-6.25	-6.29	-6.86
6	250	-6.27	-6.32	-6.88
10	50	-6.01	-6.48	-5.92
10	100	-5.95	-6.52	-5.93
10	150	-5.93	-6.54	-5.91
10	200	-5.92	-6.54	-5.92
10	250	-5.94	-6.51	-5.91
14	50	-5.94	-7.44	-5.73
14	100	-5.90	-7.42	-5.74
14	150	-5.82	-7.40	-5.73
14	200	-5.86	-7.42	-5.72
14	250	-5.89	-7.47	-5.71
18	50	-5.93	-8.63	-5.88
18	100	-5.88	-8.62	-5.83
18	150	-5.87	-8.59	-5.82
18	200	-5.86	-8.64	-5.81
18	250	-5.85	-8.69	-5.77
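Choosing the best cell of this grid is a simple arg-max, sketched below on the scores copied from the table. The negative signs are what a negated-MSE scoring convention (such as scikit-learn's `neg_mean_squared_error`) would produce, so the sketch treats higher values as better. The page does not say which tool was used, so that reading is an assumption, as are the variable and function names.

```python
# Scores copied from the grid-search table above: one list per maximum depth,
# ordered by number of trees (50, 100, 150, 200, 250).
N_TREES = [50, 100, 150, 200, 250]
SCORES = {
    "random_forest": {
        2: [-8.21, -8.23, -8.22, -8.12, -8.24],
        6: [-6.23, -6.26, -6.25, -6.25, -6.27],
        10: [-6.01, -5.95, -5.93, -5.92, -5.94],
        14: [-5.94, -5.90, -5.82, -5.86, -5.89],
        18: [-5.93, -5.88, -5.87, -5.86, -5.85],
    },
    "gradient_boosting": {
        2: [-6.49, -7.26, -6.22, -6.21, -6.23],
        6: [-6.03, -6.14, -6.25, -6.29, -6.32],
        10: [-6.48, -6.52, -6.54, -6.54, -6.51],
        14: [-7.44, -7.42, -7.40, -7.42, -7.47],
        18: [-8.63, -8.62, -8.59, -8.64, -8.69],
    },
    "extra_trees": {
        2: [-10.86, -10.73, -10.52, -10.52, -10.69],
        6: [-6.86, -6.87, -6.87, -6.86, -6.88],
        10: [-5.92, -5.93, -5.91, -5.92, -5.91],
        14: [-5.73, -5.74, -5.73, -5.72, -5.71],
        18: [-5.88, -5.83, -5.82, -5.81, -5.77],
    },
}

def best_setting(grid):
    """Return (max_depth, n_trees, score) with the highest score in the grid."""
    candidates = [
        (score, depth, n)
        for depth, row in grid.items()
        for n, score in zip(N_TREES, row)
    ]
    score, depth, n = max(candidates)
    return depth, n, score

for model, grid in SCORES.items():
    print(model, best_setting(grid))
```

For the Extra-trees column this selects a maximum depth of 14 with 250 trees, matching the optimum reported in the abstract.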

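Table 3
Model	R² (modeling)	R² (estimation)	RMSE, modeling (m³/hm²)	RMSE, estimation (m³/hm²)	MAE, modeling (m³/hm²)	MAE, estimation (m³/hm²)	Mean percentage error, modeling (%)	Mean percentage error, estimation (%)	Accuracy, modeling (%)	Accuracy, estimation (%)
Random forest	0.71	0.90	36.90	21.75	25.65	15.30	13.78	16.78	86.22	83.22
Gradient boosting	0.71	0.91	36.75	20.85	25.95	15.00	27.92	16.45	72.08	83.55
Extra-trees	0.73	0.92	35.85	19.65	25.05	13.95	14.05	15.86	85.95	84.14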
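In every row of this table the accuracy equals 100 minus the mean percentage error (for example, 100 − 15.86 = 84.14 for Extra-trees on the estimation set). The sketch below computes the five reported metrics in plain Python. The R², RMSE, and MAE formulas are the standard ones; the page does not give the exact definition of the percentage error, so taking it as the mean absolute relative error is an assumption. The sample values are made up.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mean_percentage_error(observed, predicted):
    """Mean absolute relative error in percent (assumed definition)."""
    return 100.0 * sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)

def estimation_accuracy(observed, predicted):
    """Accuracy as 100 minus the mean percentage error, as in the table."""
    return 100.0 - mean_percentage_error(observed, predicted)

# Made-up stock volumes in m³/hm², for illustration only.
observed = [100.0, 200.0, 50.0, 80.0]
predicted = [90.0, 210.0, 55.0, 80.0]

print("R2:", round(r_squared(observed, predicted), 4))
print("RMSE:", round(rmse(observed, predicted), 2))
print("MAE:", round(mae(observed, predicted), 2))
print("Accuracy (%):", round(estimation_accuracy(observed, predicted), 2))
```

Because the relative error divides by the observed value, this accuracy measure is only defined for subcompartments with a non-zero observed stock volume.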
