A nearest neighbor estimate of the residual variance
| Authors: | , , , |
|---|---|
| Document type: | Article |
| Status: | Published version |
| Publication date: | 2018 |
| Country: | Spain |
| Resources: | Universitat Pompeu Fabra |
| Repository: | Repositorio Digital de la UPF |
| OAI Identifier: | oai:repositori.upf.edu:10230/44868 |
| Online access: | http://hdl.handle.net/10230/44868 http://dx.doi.org/10.1214/18-EJS1438 |
| Access level: | Open access |
| Keywords: | Regression functional; Nearest-neighbor-based estimate; Asymptotic normality; Concentration inequalities; Dimension reduction |
| Abstract: | We study the problem of estimating the smallest achievable mean-squared error in regression function estimation. The problem is equivalent to estimating the second moment of the regression function of Y on X ∈ R^d. We introduce a nearest-neighbor-based estimate and obtain a normal limit law for the estimate when X has an absolutely continuous distribution, without any condition on the density. We also compute the asymptotic variance explicitly and derive a non-asymptotic bound on the variance that does not depend on the dimension d. The asymptotic variance does not depend on the smoothness of the density of X or of the regression function. A non-asymptotic exponential concentration inequality is also proved. We illustrate the use of the new estimate through testing whether a component of the vector X carries information for predicting Y. |
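
The equivalence stated in the abstract can be made concrete. Writing m(x) = E[Y | X = x] for the regression function, the smallest achievable mean-squared error is E[(Y − m(X))^2] = E[Y^2] − E[m(X)^2], so it suffices to estimate the second moment E[m(X)^2]. The sketch below is a minimal illustration of one natural nearest-neighbor plug-in: it averages the products Y_i · Y_nn(i), where nn(i) is the index of the nearest neighbor of X_i among the other sample points. This form is an assumption made for illustration and may differ in detail from the estimate defined in the paper; the function name `nn_residual_variance` and the synthetic data are hypothetical.

```python
import numpy as np


def nn_residual_variance(x, y):
    """Illustrative sketch: mean(Y^2) - mean(Y_i * Y_nn(i)).

    Assumed form of a first-nearest-neighbor plug-in for
    E[(Y - m(X))^2] = E[Y^2] - E[m(X)^2]; not taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    # Pairwise squared Euclidean distances (O(n^2) memory, fine for a demo).
    diff = x[:, None, :] - x[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor
    nn = dist2.argmin(axis=1)  # index of each point's nearest neighbor
    second_moment_m = np.mean(y * y[nn])  # estimates E[m(X)^2]
    return np.mean(y ** 2) - second_moment_m


# Synthetic check (hypothetical data): m(x) = sin(2*pi*x_1), noise sd 0.5,
# so the residual variance being estimated is 0.5**2 = 0.25.
rng = np.random.default_rng(0)
n, d = 2000, 2
x = rng.uniform(size=(n, d))
noise_sd = 0.5
y = np.sin(2 * np.pi * x[:, 0]) + noise_sd * rng.normal(size=n)
estimate = nn_residual_variance(x, y)
print(estimate)  # expected to land near noise_sd**2 for this setup
```

The product Y_i · Y_nn(i) works because, given the X values, the two responses are independent with conditional means m(X_i) and m(X_nn(i)), and the latter approaches m(X_i) as the neighbor gets closer. The brute-force distance matrix is chosen only for brevity; a tree-based neighbor search would be the usual choice for larger samples.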