林草资源研究 (Forest and Grassland Resources Research), 2024, Issue (6): 45-53. doi: 10.13466/j.cnki.lczyyj.2024.06.006

• Scientific Research •

样本结构对林业数学模型拟合结果的影响分析

曾伟生

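  1. Academy of Inventory and Planning, National Forestry and Grassland Administration, Beijing 100714
  • Received: 2024-03-07 Revised: 2024-08-12 Online: 2024-12-28 Published: 2025-04-18
  • About the author: ZENG Weisheng, professor-level senior engineer, Ph.D., works mainly on forest resource inventory and monitoring and on forestry mathematical modeling. Email: zengweisheng0928@126.com
  • Funding:
    National Key Research and Development Program of China project "Site Quality Evaluation and Productivity Improvement Techniques for Typical Plantations" (2022YFD2200501)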

Analyzing the Impact of Sample Structure on Fitting Results of Forestry Mathematical Models

ZENG Weisheng

  1. Academy of Inventory and Planning, National Forestry and Grassland Administration, Beijing 100714, China
  • Received: 2024-03-07 Revised: 2024-08-12 Online: 2024-12-28 Published: 2025-04-18

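Abstract (Chinese version):

Both sample structure and estimation method have an important influence on the fitting results of mathematical models. The importance of the estimation method is widely recognized, whereas that of sample structure has not received enough attention. Taking into account model complexity, equality or inequality of variance, and sample homogeneity, eight sets of simulated data were designed. Using ordinary regression and weighted regression, biomass models and tree height growth models were fitted to these eight data sets and to their segmented samples, and the fits were evaluated with six basic indices: coefficient of determination (R²), standard error of estimate (SEE), total relative error (TRE), average systematic error (ASE), mean prediction error (MPE), and mean percent standard error (MPSE). The results show that: 1) With an ideal modeling sample, ordinary regression and weighted regression give the same results for both heteroscedastic and homoscedastic models, and the TRE and ASE of the model both approach 0. 2) Sample structure is the key factor affecting modeling results, and obtaining an ideal modeling sample matters more than choosing a parameter estimation method. 3) The quality of sample structure is determined neither by the number of diameter or age classes (classes of the independent variable) nor by whether the sample is spread evenly across those classes, but by whether the samples within each class are evenly distributed. Therefore, when collecting modeling samples, one should cover the ranges of the independent and dependent variables as fully as possible, divide the independent variable into classes reasonably, and allocate the sample size among classes scientifically; the quality of the sample data should then be improved further on the basis of a sound sample structure.

Keywords (Chinese version): sample structure, weighted regression, average systematic error, total relative error, heteroscedasticity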
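
The six indices named in the abstract are not defined there. The sketch below assumes the definitions commonly used in the forestry modeling literature: the residual is observed minus predicted, and TRE, ASE, MPE and MPSE are expressed in percent. The function name `fit_statistics`, its parameters, and the default `t_value=1.96` (a large-sample stand-in for the 95% t quantile) are illustrative assumptions and do not come from the paper; sign conventions in particular vary between authors.

```python
import math


def fit_statistics(observed, predicted, n_params, t_value=1.96):
    """Return R2, SEE, TRE, ASE, MPE and MPSE for one fitted model.

    Assumed definitions (not given in the abstract):
      R2   = 1 - sum(e^2) / sum((y - mean(y))^2)
      SEE  = sqrt(sum(e^2) / (n - p))
      TRE  = 100 * sum(e) / sum(yhat)
      ASE  = 100 * mean(e / yhat)
      MPE  = 100 * t * (SEE / mean(y)) / sqrt(n)
      MPSE = 100 * mean(|e / yhat|)
    with e = y - yhat. Predictions must be non-zero.
    """
    n = len(observed)
    if n != len(predicted):
        raise ValueError("observed and predicted differ in length")
    if n <= n_params:
        raise ValueError("need more observations than parameters")

    mean_obs = sum(observed) / n
    residuals = [y - yhat for y, yhat in zip(observed, predicted)]
    sse = sum(e * e for e in residuals)
    sst = sum((y - mean_obs) ** 2 for y in observed)
    see = math.sqrt(sse / (n - n_params))
    relative = [e / yhat for e, yhat in zip(residuals, predicted)]

    return {
        "R2": 1.0 - sse / sst,
        "SEE": see,
        "TRE": 100.0 * sum(residuals) / sum(predicted),
        "ASE": 100.0 * sum(relative) / n,
        "MPE": 100.0 * t_value * (see / mean_obs) / math.sqrt(n),
        "MPSE": 100.0 * sum(abs(r) for r in relative) / n,
    }


if __name__ == "__main__":
    # Three trees whose biomass is uniformly under-predicted by 10%.
    stats = fit_statistics([11.0, 22.0, 33.0], [10.0, 20.0, 30.0], n_params=1)
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

Under these definitions TRE weights each tree by its size while ASE weights every tree equally, which would explain why the two can diverge when the sample is unbalanced.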

Abstract:

Sample structure and estimation method both strongly influence the fitting results of mathematical models. While the importance of the estimation method is well recognized, the role of sample structure has received insufficient attention. This study designed eight simulated datasets that account for model complexity, heteroscedasticity or homoscedasticity, and sample homogeneity. Ordinary regression and weighted regression were applied to the eight datasets and their segmented samples to fit biomass models and tree height growth models. Six indices were used to evaluate the fits: coefficient of determination (R²), standard error of estimate (SEE), total relative error (TRE), average systematic error (ASE), mean prediction error (MPE), and mean percent standard error (MPSE). The results showed that: 1) Under ideal modeling sample conditions, ordinary and weighted regression produced identical results for both heteroscedastic and homoscedastic models, with TRE and ASE approaching zero. 2) Sample structure was the key determinant of modeling results, and obtaining an ideal modeling sample mattered more than the choice of parameter estimation method. 3) The quality of sample structure depends neither on the number of diameter or age classes (classes of the independent variable) nor on whether the sample is distributed evenly across those classes, but on whether the samples within each class are evenly distributed. When collecting modeling samples, it is therefore important to cover the ranges of the independent and dependent variables as fully as possible, divide the independent variable into classes rationally, and allocate sample sizes among the classes scientifically; once the sample structure is sound, the quality of the sample data should be improved further.

Key words: sample structure, weighted regression, average systematic error, total relative error, heteroscedasticity
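
Conclusion 1 of the abstract can be illustrated with a toy experiment. The sketch below is not the paper's simulation design: it assumes an allometric biomass form `a * D**b`, made-up generating parameters, a hand-written Gauss-Newton routine (`fit_power`), and weights of `1 / prediction**2` taken from the ordinary fit. All names and data are hypothetical. The "ideal" sample places one tree 20% above and one 20% below the curve at every diameter; the "skewed" sample drops the below-curve trees at diameters of 30 and above.

```python
import math


def power_model(d, a, b):
    """Allometric form assumed here: biomass = a * diameter**b."""
    return a * d ** b


def fit_power(diam, mass, weights, max_iter=200):
    """Gauss-Newton fit minimising sum(w * (mass - a*diam**b)**2)."""
    n = len(diam)
    # Starting values from an unweighted straight line in log-log space.
    lx = [math.log(d) for d in diam]
    ly = [math.log(m) for m in mass]
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - b * mx)

    def wsse(a_, b_):
        return sum(w * (m - power_model(d, a_, b_)) ** 2
                   for d, m, w in zip(diam, mass, weights))

    current = wsse(a, b)
    for _ in range(max_iter):
        s11 = s12 = s22 = g1 = g2 = 0.0
        for d, m, w in zip(diam, mass, weights):
            ja = d ** b                 # partial derivative w.r.t. a
            jb = a * ja * math.log(d)   # partial derivative w.r.t. b
            r = m - a * ja              # residual at current parameters
            s11 += w * ja * ja
            s12 += w * ja * jb
            s22 += w * jb * jb
            g1 += w * ja * r
            g2 += w * jb * r
        det = s11 * s22 - s12 * s12
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det
        # Halve the step until the weighted SSE no longer increases.
        step = 1.0
        while step > 1e-10 and wsse(a + step * da, b + step * db) > current:
            step /= 2
        if step <= 1e-10:
            break
        a, b = a + step * da, b + step * db
        new = wsse(a, b)
        converged = current - new <= 1e-15 * current
        current = new
        if converged:
            break
    return a, b


def tre_ase(diam, mass, a, b):
    """TRE and ASE in percent, residual = observed - predicted (assumed)."""
    pred = [power_model(d, a, b) for d in diam]
    tre = 100 * sum(m - p for m, p in zip(mass, pred)) / sum(pred)
    ase = 100 * sum((m - p) / p for m, p in zip(mass, pred)) / len(mass)
    return tre, ase


def compare(diam, mass):
    """Fit by ordinary and by weighted regression; return (a, b, TRE, ASE)."""
    a_o, b_o = fit_power(diam, mass, [1.0] * len(diam))
    w = [1.0 / power_model(d, a_o, b_o) ** 2 for d in diam]
    a_w, b_w = fit_power(diam, mass, w)
    return {"ordinary": (a_o, b_o) + tre_ase(diam, mass, a_o, b_o),
            "weighted": (a_w, b_w) + tre_ase(diam, mass, a_w, b_w)}


TRUE_A, TRUE_B = 0.1, 2.5  # made-up generating parameters
ideal = [(d, power_model(d, TRUE_A, TRUE_B) * (1 + rel))
         for d in (5.0 * k for k in range(1, 11)) for rel in (-0.2, 0.2)]
skewed = [(d, m) for d, m in ideal
          if d < 30 or m > power_model(d, TRUE_A, TRUE_B)]

if __name__ == "__main__":
    for label, sample in (("ideal", ideal), ("skewed", skewed)):
        diam, mass = zip(*sample)
        print(label, compare(diam, mass))
```

On the ideal sample the error variance grows with diameter, yet both fits should return the generating parameters with TRE and ASE near zero, because the curve passes through the mean at every diameter. On the skewed sample the two estimators are expected to disagree, which is the sense in which sample structure can matter more than the estimation method.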

CLC number: