Journal of University of Chinese Academy of Sciences

Reconstruction density estimation based on sequential algorithms*

HUANG Siyuan1, XIE Tianfa1, XIONG Shifeng2,3,†

  1 School of Mathematics, Statistics and Mechanics, Beijing University of Technology, Beijing 100124, China;
  2 School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 101408, China;
  3 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2023-10-20 Revised: 2024-04-03 Published: 2024-05-09
  • Corresponding author: † E-mail: xiong@amss.ac.cn
  • Funding:
    * Supported by the National Natural Science Foundation of China (12171462) and the National Key Research and Development Program of China (2022YFF0609903)

Abstract: For density estimators obtained by the reconstruction approach, this paper proposes a sequential algorithm for the node selection problem. Because density estimation can be viewed as an unsupervised learning problem, with no response variable y, sequential node selection methods designed for regression do not apply. We treat each node as a parameter, choose the next node by minimising a loss function, and build the entire node set with a greedy algorithm. The algorithm is simple to implement, further improves estimation performance, and reduces the sensitivity of the density estimate to the choice of nodes. In addition, priors are specified according to the practical meaning of the parameters in the reconstruction approach, samples from the posterior distribution are drawn with the Metropolis algorithm, and pointwise interval estimates of the density function are constructed by approximating population quantiles with sample quantiles. Finally, we validate the sequential reconstruction density estimator and its interval estimates on several datasets.
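The two generic ingredients named in the abstract, greedy one-at-a-time node selection and Metropolis sampling followed by pointwise sample quantiles, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The reconstruction estimator, its loss function and its priors are not given on this page, so the equal-weight Gaussian mixture loss, the toy normal-mean model, the N(0, 10^2) prior and all function names below are hypothetical stand-ins.

```python
import math
import random


def greedy_select(candidates, loss, k):
    """Build a node set one node at a time; each step adds the candidate
    that minimises loss(current_nodes + [candidate])."""
    nodes, remaining = [], list(candidates)
    for _ in range(min(k, len(remaining))):
        best = min(remaining, key=lambda c: loss(nodes + [c]))
        nodes.append(best)
        remaining.remove(best)
    return nodes


def normal_pdf(x, mu, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loss(data, bandwidth=1.0):
    """Hypothetical stand-in loss (NOT the paper's): negative mean
    log-likelihood of an equal-weight Gaussian mixture centred at the nodes."""
    def loss(nodes):
        total = 0.0
        for x in data:
            dens = sum(normal_pdf(x, c, bandwidth) for c in nodes) / len(nodes)
            total += math.log(max(dens, 1e-300))
        return -total / len(data)
    return loss


def metropolis(log_post, theta0, n_samples, step, rng):
    """Random-walk Metropolis for a scalar parameter with a symmetric
    Gaussian proposal."""
    theta, lp = theta0, log_post(theta0)
    samples = []
    for _ in range(n_samples):
        prop = theta + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        # Accept with probability min(1, post(prop) / post(theta)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples


def pointwise_band(density, samples, grid, level=0.95):
    """At each grid point, use sample quantiles of density(x, theta) over
    the posterior draws as the interval endpoints."""
    alpha = (1.0 - level) / 2.0
    band = []
    for x in grid:
        vals = sorted(density(x, t) for t in samples)
        lo = vals[int(alpha * (len(vals) - 1))]
        hi = vals[int((1.0 - alpha) * (len(vals) - 1))]
        band.append((lo, hi))
    return band


rng = random.Random(0)

# Greedy node selection on toy bimodal data (assumed candidate grid -5.0 .. 6.0).
data = [rng.gauss(-2.0, 1.0) for _ in range(60)] + [rng.gauss(3.0, 1.0) for _ in range(60)]
candidates = [c / 2.0 for c in range(-10, 13)]
nodes = greedy_select(candidates, mixture_loss(data), k=4)

# Toy posterior: observations ~ N(mu, 1), assumed prior mu ~ N(0, 10^2).
obs = [rng.gauss(1.0, 1.0) for _ in range(50)]
log_post = lambda mu: -0.5 * sum((x - mu) ** 2 for x in obs) - 0.5 * (mu / 10.0) ** 2
draws = metropolis(log_post, 0.0, 3000, 0.3, rng)[500:]  # drop burn-in
grid = [-1.0, 0.0, 1.0, 2.0, 3.0]
band = pointwise_band(normal_pdf, draws, grid)

print("selected nodes:", nodes)
for x, (lo, hi) in zip(grid, band):
    print(f"x={x:+.1f}  95% band for f(x): [{lo:.4f}, {hi:.4f}]")
```

Passing the loss as a callable keeps the greedy loop independent of the estimator, so a different loss can be substituted without touching the selection logic. The band is pointwise: each interval covers the density at a single x and makes no simultaneous coverage claim over the whole grid.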

Key words: Reconstruction approach, Density estimation, Sequential algorithm, Interval estimation
