A Robust Initial Basic Subset Selection Method for Outlier Detection Algorithms in Linear Regression


Satman M. H.

Journal of Statistics and Applied Sciences, vol. 10, pp. 76-85, 2024 (Peer-Reviewed Journal)

Abstract

The main motivation of this study is to develop an efficient algorithm for diagnosing and detecting outliers in linear regression up to a reasonable level of contamination. The algorithm first obtains a robust version of the hat matrix at the linear-algebra level. The basic subset obtained in this first stage is then improved through concentration steps as defined in the fast-LTS (Least Trimmed Squares) regression algorithm. The method can also be plugged into other algorithms as a basic subset selection stage. The algorithm is effective against outliers in both the X and Y directions up to a contamination rate of 25%. Its complexity increases linearly with the number of observations and parameters. The algorithm is fast because it does not require iterative calculations. The success of the algorithm against a specific contamination level is demonstrated through simulations.
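The abstract does not detail how the robust hat matrix or the initial basic subset is constructed, so those parts are not reproduced here. The sketch below only illustrates the concentration (C-) steps of the fast-LTS algorithm that the paper uses to refine a basic subset; the function name concentration_steps, the coverage parameter h, and the random starting subset in the usage lines are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def concentration_steps(X, y, initial_subset, h, max_iter=100):
    """Refine a basic subset with C-steps as in fast-LTS (hypothetical helper).

    X is assumed to already include an intercept column if one is needed.
    initial_subset is an index array giving the starting basic subset,
    h is the coverage (number of observations kept at each step).
    """
    subset = np.asarray(initial_subset)
    prev_obj = np.inf
    for _ in range(max_iter):
        # Ordinary least-squares fit on the current subset
        beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
        # Residuals of all observations with respect to that fit
        residuals = y - X @ beta
        # New subset: the h observations with the smallest squared residuals
        subset = np.argsort(residuals ** 2)[:h]
        obj = np.sum(np.sort(residuals ** 2)[:h])
        if obj >= prev_obj:  # LTS objective stopped decreasing -> converged
            break
        prev_obj = obj
    return subset, beta

# Illustrative usage with simulated data contaminated in the Y direction
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
y[:20] += 15.0                                     # 20% contamination
h = int(0.75 * n)
start = rng.choice(n, size=p + 1, replace=False)   # placeholder initial subset
clean_subset, beta = concentration_steps(X, y, start, h)
```

In the paper, the starting subset comes from the robust hat-matrix stage rather than a random draw; the C-step loop itself is standard and guarantees a non-increasing sum of the h smallest squared residuals.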