Category: Dimension reduction

Multidimensional analysis
In statistics, econometrics and related fields, multidimensional analysis (MDA) is a data analysis process that groups data into two categories: data dimensions and measurements. For example, a data set of sales figures might be organized by the dimensions product, region and time, with the sales amount as the measurement.
Robust principal component analysis
Robust Principal Component Analysis (RPCA) is a modification of the widely used statistical procedure of principal component analysis (PCA) which works well with respect to grossly corrupted observations. A number of different approaches exist, many of which model the data matrix as the sum of a low-rank matrix and a sparse matrix.
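A common formulation is principal component pursuit: minimize ||L||_* + λ||S||_1 subject to L + S = M. The following is a minimal NumPy sketch of the standard ADMM iteration for this program; the parameter defaults follow common heuristics and are illustrative, not a reference implementation.

```python
import numpy as np

def shrink(X, tau):
    # Elementwise soft-thresholding (proximal operator of the l1 norm)
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    # Singular value thresholding (proximal operator of the nuclear norm)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # standard choice (Candes et al.)
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())  # common step-size heuristic
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)   # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)       # sparse update
        Y += mu * (M - L - S)                      # dual update
        if np.linalg.norm(M - L - S) <= tol * np.linalg.norm(M):
            break
    return L, S
```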
Canonical correspondence analysis
In multivariate analysis, canonical correspondence analysis (CCA) is an ordination technique that determines axes from the response data as a linear combination of measured predictors. CCA is commonly used in ecology to relate the abundances of species to environmental variables.
Dimensionality reduction
Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension.
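As a concrete illustration, and assuming scikit-learn is available, a linear reduction of the 64-dimensional digits data to two dimensions:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)         # 1797 samples, 64 dimensions
X2 = PCA(n_components=2).fit_transform(X)   # project onto 2 dimensions
print(X2.shape)                             # (1797, 2)
```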
Preference regression
Preference regression is a statistical technique used by marketers to determine consumers’ preferred core benefits. It usually supplements product positioning techniques like multidimensional scaling.
Low-rank approximation
In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank.
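By the Eckart–Young–Mirsky theorem, the best rank-k approximation in the Frobenius (and spectral) norm comes from a truncated singular value decomposition; a minimal NumPy sketch:

```python
import numpy as np

def best_rank_k(A, k):
    # Truncated SVD gives the closest rank-k matrix in Frobenius norm
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

A = np.random.rand(50, 30)
A5 = best_rank_k(A, 5)
print(np.linalg.matrix_rank(A5))  # 5
```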
Count sketch
Count sketch is a type of dimensionality reduction that is particularly efficient in statistics, machine learning and algorithms. It was invented by Moses Charikar, Kevin Chen and Martin Farach-Colton, who introduced it for finding frequent items in data streams.
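A minimal single-hash sketch in NumPy; a practical implementation would combine several independent repetitions with a median estimator. The function name and bucket count are illustrative.

```python
import numpy as np

def count_sketch(x, d, seed=0):
    # Hash each coordinate to one of d buckets with a random sign;
    # colliding coordinates are summed.
    g = np.random.default_rng(seed)
    n = len(x)
    h = g.integers(0, d, size=n)       # bucket hash
    s = g.choice([-1.0, 1.0], size=n)  # sign hash
    sketch = np.zeros(d)
    np.add.at(sketch, h, s * x)
    return sketch

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)
# With the same hash functions, inner products are preserved in expectation
print(x @ y, count_sketch(x, 256) @ count_sketch(y, 256))
```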
Feature selection
In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construction.
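Approaches are commonly grouped into filter, wrapper and embedded methods. An illustrative sketch of one filter-style method, assuming scikit-learn (univariate selection with an ANOVA F-score):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)   # 30 features
selector = SelectKBest(f_classif, k=5).fit(X, y)
X5 = selector.transform(X)
print(X5.shape)                              # (569, 5)
print(selector.get_support(indices=True))    # indices of the kept features
```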
Stress majorization
Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of n m-dimensional data items, a configuration X of n points in r-dimensional space (with r < m) is sought that minimizes the so-called stress function σ(X).
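scikit-learn's smacof function implements this majorization scheme (SMACOF: Scaling by MAjorizing a COmplicated Function); a small illustrative run on a random symmetric dissimilarity matrix:

```python
import numpy as np
from sklearn.manifold import smacof

# A symmetric dissimilarity matrix with zero diagonal
rng = np.random.default_rng(0)
D = rng.random((12, 12))
D = (D + D.T) / 2
np.fill_diagonal(D, 0.0)

# SMACOF iteratively majorizes and minimizes the stress function
X, stress = smacof(D, n_components=2, random_state=0)
print(X.shape, stress)
```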
Local tangent space alignment
Local tangent space alignment (LTSA) is a method for manifold learning, which can efficiently learn a nonlinear embedding into low-dimensional coordinates from high-dimensional data, and can also reconstruct high-dimensional coordinates from embedding coordinates.
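Assuming scikit-learn is available, LTSA is exposed through LocallyLinearEmbedding; an illustrative sketch on the swiss-roll benchmark:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, random_state=0)
ltsa = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method="ltsa")
Y = ltsa.fit_transform(X)   # (1000, 2) nonlinear embedding
```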
Elastic map
Elastic maps provide a tool for nonlinear dimensionality reduction. By their construction, they are a system of elastic springs embedded in the dataspace. This system approximates a low-dimensional manifold.
Principal component analysis
Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data.
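A minimal from-scratch sketch in NumPy, projecting the centered data onto the top-k eigenvectors of the sample covariance matrix (illustrative; for large data an SVD of the centered matrix is preferred numerically):

```python
import numpy as np

def pca(X, k):
    # Center the data, then take the top-k eigenvectors of the
    # sample covariance matrix as the principal directions.
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalue order
    W = eigvecs[:, ::-1][:, :k]              # top-k directions
    return Xc @ W, W

X = np.random.rand(200, 10)
scores, components = pca(X, 2)
print(scores.shape)   # (200, 2)
```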
Correspondence analysis
Correspondence analysis (CA) is a multivariate statistical technique proposed by Herman Otto Hartley (Hirschfeld) and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rather than continuous data.
Multilinear principal component analysis
Within statistics, Multilinear principal component analysis (MPCA) is a multilinear extension of principal component analysis (PCA). MPCA is employed in the analysis of n-way arrays, i.e. a cube or hyper-cube of numbers, also informally referred to as a "data tensor".
Feature extraction
In machine learning, pattern recognition, and image processing, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps.
Modes of variation
In statistics, modes of variation are a continuously indexed set of vectors or functions that are centered at a mean and are used to depict the variation in a population or sample. Typically, variation patterns in the data are decomposed in descending order of eigenvalues, with the directions represented by the corresponding eigenvectors or eigenfunctions.
Proper generalized decomposition
The proper generalized decomposition (PGD) is an iterative numerical method for solving boundary value problems (BVPs), that is, partial differential equations constrained by a set of boundary conditions.
Sammon mapping
Sammon mapping or Sammon projection is an algorithm that maps a high-dimensional space to a space of lower dimensionality (see multidimensional scaling) by trying to preserve the structure of inter-point distances of the high-dimensional space in the lower-dimensional projection.
Generalized multidimensional scaling
Generalized multidimensional scaling (GMDS) is an extension of metric multidimensional scaling, in which the target space is non-Euclidean. When the dissimilarities are distances on a surface and the target space is another surface, GMDS allows finding the minimum-distortion embedding of one surface into another.
T-distributed stochastic neighbor embedding
t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding, originally developed by Geoffrey Hinton and Sam Roweis; Laurens van der Maaten proposed the t-distributed variant.
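Assuming scikit-learn is available, a typical illustrative use on the digits data:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
# Embed the 64-dimensional digits into 2-D for visualization
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)   # (1797, 2) — coordinates for a scatter plot colored by y
```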
Kernel principal component analysis
In the field of multivariate statistics, kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are performed in a reproducing kernel Hilbert space.
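Assuming scikit-learn, an illustrative sketch where an RBF kernel separates concentric circles that linear PCA cannot:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Concentric circles are not linearly separable; an RBF kernel
# unfolds them in feature space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10)
X_kpca = kpca.fit_transform(X)
```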
Self-organizing map
A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher-dimensional data set while preserving the topological structure of the data.
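A minimal NumPy training loop for a SOM on a rectangular grid; the linear decay schedules and grid size are illustrative choices, not the only ones in use:

```python
import numpy as np

def train_som(X, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    W = rng.random((h * w, X.shape[1]))   # codebook (weight) vectors
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    n_steps = epochs * len(X)
    t = 0
    for _ in range(epochs):
        for x in rng.permutation(X):
            lr = lr0 * (1 - t / n_steps)              # decaying learning rate
            sigma = sigma0 * (1 - t / n_steps) + 1e-2 # shrinking neighborhood
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best matching unit
            # Gaussian neighborhood on the 2-D grid around the BMU
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            g = np.exp(-d2 / (2 * sigma ** 2))
            W += lr * g[:, None] * (x - W)            # pull units toward x
            t += 1
    return W.reshape(h, w, -1)

X = np.random.default_rng(1).random((500, 3))   # e.g. RGB colors
som = train_som(X)   # (10, 10, 3) topology-preserving map
```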
NOMINATE (scaling method)
NOMINATE (an acronym for Nominal Three-Step Estimation) is a multidimensional scaling application developed by US political scientists Keith T. Poole and Howard Rosenthal in the early 1980s to analyze preferential and choice data, such as legislative roll-call voting behavior.
Sliced inverse regression
Sliced inverse regression (or SIR) is a tool for dimensionality reduction in the field of multivariate statistics. In statistics, regression analysis is a method of studying the relationship between a response variable and a vector of explanatory variables.
Multifactor dimensionality reduction
Multifactor dimensionality reduction (MDR) is a statistical approach, also used in machine learning automatic approaches, for detecting and characterizing combinations of attributes or independent variables that interact to influence a dependent or class variable.
Tucker decomposition
In mathematics, Tucker decomposition decomposes a tensor into a set of matrices and one small core tensor. It is named after Ledyard R. Tucker, although it goes back to Hitchcock in 1927. Initially described as a three-mode extension of factor analysis and principal component analysis, it may be generalized to higher-mode analysis.
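One standard way to compute a Tucker decomposition is the higher-order SVD (HOSVD); a minimal NumPy sketch (function names are illustrative):

```python
import numpy as np

def unfold(T, mode):
    # Matricize the tensor along one mode
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    # Factor matrices: leading left singular vectors of each unfolding
    U = [np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :r]
         for k, r in enumerate(ranks)]
    # Core tensor: contract T with each factor matrix transposed
    G = T
    for k, Uk in enumerate(U):
        G = np.moveaxis(np.tensordot(Uk.T, np.moveaxis(G, k, 0), axes=1), 0, k)
    return G, U

T = np.random.rand(8, 9, 10)
G, U = hosvd(T, ranks=(3, 3, 3))
# Reconstruct by multiplying the core with each factor matrix
R = G
for k, Uk in enumerate(U):
    R = np.moveaxis(np.tensordot(Uk, np.moveaxis(R, k, 0), axes=1), 0, k)
print(R.shape, np.linalg.norm(T - R) / np.linalg.norm(T))
```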
Sufficient dimension reduction
In statistics, sufficient dimension reduction (SDR) is a paradigm for analyzing data that combines the ideas of dimension reduction with the concept of sufficiency. Dimension reduction has long been a primary goal of regression analysis.
Nonlinear dimensionality reduction
Nonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space or learning the mapping itself.
Random indexing
Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, and that similar results can be obtained in a much lower-dimensional space using random projections.
Generalized canonical correlation
In statistics, the generalized canonical correlation analysis (gCCA), is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. While conventional CCA applies to two sets of random variables, gCCA extends the analysis to an arbitrary number of sets.
Independent component analysis
In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that at most one subcomponent is Gaussian and that the subcomponents are statistically independent from each other.
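Assuming scikit-learn, an illustrative sketch that unmixes two synthetic sources:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two independent sources mixed by an unknown matrix
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(3 * t), np.sign(np.sin(5 * t))]   # sine + square wave
A = np.array([[1.0, 0.5], [0.4, 1.0]])             # mixing matrix
X = S @ A.T                                        # observed mixtures

S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)
# S_hat recovers the sources up to permutation and scaling
```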
Tensor sketch
In statistics, machine learning and algorithms, a tensor sketch is a type of dimensionality reduction that is particularly efficient when applied to vectors that have tensor structure. Such a sketch can be used to speed up explicit kernel methods and bilinear pooling in neural networks.
Detrended correspondence analysis
Detrended correspondence analysis (DCA) is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify ecological community data.
Relief (feature selection)
Relief is an algorithm developed by Kira and Rendell in 1992 that takes a filter-method approach to feature selection that is notably sensitive to feature interactions. It was originally designed for application to binary classification problems with discrete or numerical features.
Autoencoder
An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The encoding is validated and refined by attempting to regenerate the input from the encoding.
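A minimal sketch, assuming PyTorch is available; the layer sizes and training loop are arbitrary illustrative choices:

```python
import torch
from torch import nn

# A minimal fully-connected autoencoder: 64-D inputs compressed
# to an 8-D code, then reconstructed.
model = nn.Sequential(
    nn.Linear(64, 8), nn.ReLU(),   # encoder
    nn.Linear(8, 64),              # decoder
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.rand(256, 64)            # stand-in for real data
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), X)    # reconstruction error
    loss.backward()
    opt.step()

codes = model[:2](X)               # 8-D learned representation
```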
Locality-sensitive hashing
In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. (The number of buckets is much smaller than the universe of possible input items.)
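One classic LSH family for cosine similarity hashes vectors by the signs of random hyperplane projections; a minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def simhash_keys(X, n_bits=16, seed=0):
    # Each random hyperplane contributes one bit: the side of the
    # plane the vector falls on. Vectors at a small angle agree on
    # most bits, so they land in the same bucket with high probability.
    planes = np.random.default_rng(seed).normal(size=(X.shape[1], n_bits))
    bits = (X @ planes) > 0
    # Pack the bit pattern into an integer bucket key
    return bits.astype(np.int64) @ (1 << np.arange(n_bits))

X = np.random.default_rng(1).normal(size=(1000, 128))
keys = simhash_keys(X)
# Group points by key; near-duplicates tend to share a bucket
```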
Multiple correspondence analysis
In statistics, multiple correspondence analysis (MCA) is a data analysis technique for nominal categorical data, used to detect and represent underlying structures in a data set. It does this by representing data as points in a low-dimensional Euclidean space.
Semidefinite embedding
Maximum Variance Unfolding (MVU), also known as Semidefinite Embedding (SDE), is an algorithm in computer science that uses semidefinite programming to perform non-linear dimensionality reduction of high-dimensional vectorial input data.
Semantic mapping (statistics)
Semantic mapping (SM) in statistics is a method for dimensionality reduction (the transformation of data from a high-dimensional space into a low-dimensional space). SM can be used in a set of multidimensional data analysis techniques.
Ordination (statistics)
Ordination or gradient analysis, in multivariate analysis, is a method complementary to data clustering, and used mainly in exploratory data analysis (rather than in hypothesis testing). Ordination orders objects that are characterized by values on multiple variables so that similar objects are near each other and dissimilar objects are farther from each other.
Relationship square
In statistics, the relationship square is a graphical representation for use in the factorial analysis of a table of individuals × variables. This representation complements the classical representations provided by principal component analysis (PCA) or multiple correspondence analysis (MCA).
Sparse PCA
Sparse principal component analysis (sparse PCA) is a specialised technique used in statistical analysis and, in particular, in the analysis of multivariate data sets. It extends the classic method of principal component analysis (PCA) by introducing sparsity structures to the input variables.
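Assuming scikit-learn, an illustrative sketch on the digits data:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import SparsePCA

X, _ = load_digits(return_X_y=True)
spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
X_red = spca.fit_transform(X)
# Each component now loads on only a few of the 64 pixels
print((spca.components_ != 0).sum(axis=1))   # nonzero loadings per component
```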
Multidimensional scaling
Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset. MDS is used to translate information about the pairwise "distances" among a set of objects into a configuration of points mapped into an abstract Cartesian space.
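Classical (Torgerson) MDS recovers coordinates from a Euclidean distance matrix by double-centering and an eigendecomposition; a minimal NumPy sketch:

```python
import numpy as np

def classical_mds(D, k=2):
    # Double-center the squared distance matrix to recover a Gram
    # matrix, then embed via its top-k eigenpairs.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:k]
    L = np.sqrt(np.clip(eigvals[idx], 0, None))
    return eigvecs[:, idx] * L

# Distances computed from known points are recovered up to rotation
X = np.random.default_rng(0).random((20, 5))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y = classical_mds(D, k=2)
```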
Multilinear subspace learning
Multilinear subspace learning is an approach to dimensionality reduction. Dimensionality reduction can be performed on a data tensor whose observations have been vectorized, or whose observations are matrices concatenated into a data tensor.
Random projection
In mathematics and statistics, random projection is a technique used to reduce the dimensionality of a set of points which lie in Euclidean space. Random projection methods are known for their power, simplicity, and low error rates when compared to other methods.
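The Johnson–Lindenstrauss lemma underpins the method: a suitably scaled Gaussian random matrix approximately preserves pairwise distances. Assuming scikit-learn, an illustrative sketch:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

X = np.random.default_rng(0).normal(size=(100, 10_000))
proj = GaussianRandomProjection(n_components=1000, random_state=0)
X_low = proj.fit_transform(X)   # pairwise distances approximately preserved

# Check the distortion on one pair of points
i, j = 0, 1
d_hi = np.linalg.norm(X[i] - X[j])
d_lo = np.linalg.norm(X_low[i] - X_low[j])
print(d_lo / d_hi)   # close to 1
```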
Variational autoencoder
In machine learning, a variational autoencoder (VAE) is an artificial neural network architecture introduced by Diederik P. Kingma and Max Welling, belonging to the families of probabilistic graphical models and variational Bayesian methods.