The Hannan-Quinn Proposition for Linear Regression

Joe Suzuki

Abstract


We consider the variable selection problem in linear regression. Suppose that we have a set of random variables  $X_1,\cdots,X_m,Y,\epsilon$ ($m\geq 1$) such that $Y=\sum_{k\in \pi}\alpha_kX_k+\epsilon$ with $\pi\subseteq \{1,\cdots,m\}$ and reals $\{\alpha_k\}_{k=1}^m$, assuming that $\epsilon$ is independent of any linear combination of $X_1,\cdots,X_m$. Given $n$ examples $\{(x_{i,1}\cdots,x_{i,m},y_i)\}_{i=1}^n$ actually independently emitted from $(X_1,\cdots,X_m, Y)$, we wish to estimate the true $\pi$ based on information criteria in the form of $H+(k/2)d_n$, where $H$ is the likelihood with respect to $\pi$ multiplied by $-1$, and $\{d_n\}$ is a positive real sequence. If $d_n$ is too small, we cannot obtain consistency because of overestimation. For  autoregression, Hannan-Quinn proved that the rate $d_n=2\log\log n$ is the minimum  satisfying strong consistency. This paper solves the statement affirmative for linear regression. Thus far, there was no proof for the proposition while $d_n=c\log\log n$ for some $c>0$ was shown to be sufficient.

Full Text: PDF DOI: 10.5539/ijsp.v1n2p179

Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

International Journal of Statistics and Probability   ISSN 1927-7032(Print)   ISSN 1927-7040(Online)

Copyright © Canadian Center of Science and Education

To make sure that you can receive messages from us, please add the 'ccsenet.org' domain to your e-mail 'safe list'. If you do not receive e-mail in your 'inbox', check your 'bulk mail' or 'junk mail' folders.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

doaj_logo_new_120 proquest_logo_120images_120.