Category: Robust statistics

Robust principal component analysis

Robust Principal Component Analysis (RPCA) is a modification of the widely used statistical procedure of principal component analysis (PCA) which works well with respect to grossly corrupted observati

Two-step M-estimator

Two-step M-estimators deals with M-estimation problems that require preliminary estimation to obtain the parameter of interest. Two-step M-estimation is different from usual M-estimation problem becau

Bagplot

A bagplot, or starburst plot, is a method in robust statistics for visualizing two- or three-dimensional statistical data, analogous to the one-dimensional box plot. Introduced in 1999 by Rousseuw et

Interquartile mean

The interquartile mean (IQM) (or midmean) is a statistical measure of central tendency based on the truncated mean of the interquartile range. The IQM is very similar to the scoring method used in spo

Simplicial depth

In robust statistics and computational geometry, simplicial depth is a measure of central tendency determined by the simplices that contain a given point. For the Euclidean plane, it counts the number

CDF-based nonparametric confidence interval

In statistics, cumulative distribution function (CDF)-based nonparametric confidence intervals are a general class of confidence intervals around statistical functionals of a distribution. To calculat

Least absolute deviations

Least absolute deviations (LAD), also known as least absolute errors (LAE), least absolute residuals (LAR), or least absolute values (LAV), is a statistical optimality criterion and a statistical opti

Winsorized mean

A winsorized mean is a winsorized statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean. It involves the calculation of the mean after win

Median absolute deviation

In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated b

Outlier

In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter

Trimmed estimator

In statistics, a trimmed estimator is an estimator derived from another estimator by excluding some of the extreme values, a process called truncation. This is generally done to obtain a more robust s

Median

In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be th

Influential observation

In statistics, an influential observation is an observation for a statistical calculation whose deletion from the dataset would noticeably change the result of the calculation. In particular, in regre

Nonparametric statistics

Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). Nonparame

L-estimator

In statistics, an L-estimator is an estimator which is a linear combination of order statistics of the measurements (which is also called an L-statistic). This can be as little as a single point, as i

Least trimmed squares

Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers. It is o

Weighted median

In statistics, a weighted median of a sample is the 50% weighted percentile. It was first proposed by F. Y. Edgeworth in 1888. Like the median, it is useful as an estimator of central tendency, robust

Winsorizing

Winsorizing or winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-t

Truncated mean

A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. It involves the calculation of the mean after discarding given parts of a probability dist

Convex layers

In computational geometry, the convex layers of a set of points in the Euclidean plane are a sequence of nested convex polygons having the points as their vertices. The outermost one is the convex hul

Robust measures of scale

In statistics, robust measures of scale are methods that quantify the statistical dispersion in a sample of numerical data while resisting outliers. The most common such robust statistics are the inte

Robust Bayesian analysis

In statistics, robust Bayesian analysis, also called Bayesian sensitivity analysis, is a type of sensitivity analysis applied to the outcome from Bayesian inference or Bayesian optimal decisions.

Robust statistics

Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have

Trimean

In statistics the trimean (TM), or Tukey's trimean, is a measure of a probability distribution's location defined as a weighted average of the distribution's median and its two quartiles: This is equi

M-estimator

In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average. Both non-linear least squares and maximum likelihood estimation are special c

Hodges–Lehmann estimator

In statistics, the Hodges–Lehmann estimator is a robust and nonparametric estimator of a population's location parameter. For populations that are symmetric about one median, such as the (Gaussian) no

Dixon's Q test

In statistics, Dixon's Q test, or simply the Q test, is used for identification and rejection of outliers. This assumes normal distribution and per Robert Dean and Wilfrid Dixon, and others, this test

Redescending M-estimator

In statistics, redescending M-estimators are Ψ-type M-estimators which have ψ functions that are non-decreasing near the origin, but decreasing toward 0 far from the origin. Their ψ functions can be c

Medcouple

In statistics, the medcouple is a robust statistic that measures the skewness of a univariate distribution. It is defined as a scaled median difference of the left and right half of a distribution. It

Info-gap decision theory

Info-gap decision theory seeks to optimize robustness to failure under severe uncertainty, in particular applying sensitivity analysis of the stability radius type to perturbations in the value of a g

Random sample consensus

Random sample consensus (RANSAC) is an iterative method to estimate parameters of a mathematical model from a set of observed data that contains outliers, when outliers are to be accorded no influence

Huber loss

In statistics, the Huber loss is a loss function used in robust regression, that is less sensitive to outliers in data than the squared error loss. A variant for classification is also sometimes used.