K-medians clustering

In statistics, k-medians clustering is a cluster analysis algorithm. It is a variation of k-means clustering in which the center of each cluster is computed as the median of its members rather than their mean. This minimizes the error over all clusters with respect to the 1-norm distance metric, whereas k-means minimizes it with respect to the squared 2-norm distance metric.

This relates directly to the k-median problem with respect to the 1-norm, which is the problem of finding k centers such that the clusters formed by them are the most compact. Formally, given a set of data points x, the k centers c_i are to be chosen so as to minimize the sum of the distances from each x to its nearest c_i. This criterion is sometimes a better one than that used in the k-means clustering algorithm, which sums the squared distances. The sum of distances is widely used in applications such as the facility location problem.

The algorithm uses Lloyd-style iteration, which alternates between an expectation (E) step and a maximization (M) step, making it an expectation–maximization algorithm. In the E step, all objects are assigned to their nearest median. In the M step, the medians are recomputed by taking the median in each dimension separately. (Wikipedia).
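The two alternating steps can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names (`manhattan`, `k_medians`, `total_cost`), the random choice of initial centers, the stopping rule, and the toy data are all assumptions made for the example.

```python
import random
from statistics import median


def manhattan(a, b):
    """1-norm (Manhattan) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, k, max_iter=100, seed=0):
    """Lloyd-style k-medians sketch; returns (centers, labels)."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct input points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment ("E") step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: manhattan(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update ("M") step: per-dimension median of each cluster's members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [median(dim) for dim in zip(*members)]
    return centers, labels


def total_cost(points, centers, labels):
    """Sum of 1-norm distances from each point to its assigned center."""
    return sum(manhattan(p, centers[lab]) for p, lab in zip(points, labels))


# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = k_medians(points, k=2)
print(centers, labels, total_cost(points, centers, labels))
```

Taking the median independently in each dimension is what ties the update step to the 1-norm: the coordinate-wise median minimizes the sum of Manhattan distances to the members of a cluster. As with k-means, the result can depend on the initial centers, so in practice several restarts are often compared by their total cost.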

Clustering 1: monothetic vs. polythetic

Full lecture: http://bit.ly/K-means The aim of clustering is to partition a population into sub-groups (clusters). Clusters can be monothetic (where all cluster members share some common property) or polythetic (where all cluster members are similar to each other in some sense).

From playlist K-means Clustering

Clustering (3): K-Means Clustering

The K-Means clustering algorithm. Includes derivation as coordinate descent on a squared error cost function, some initialization techniques, and using a complexity penalty to determine the number of clusters.

From playlist cs273a

Clustering 3: overview of methods

Full lecture: http://bit.ly/K-means In this course we cover 4 different clustering algorithms: K-D trees (part of lecture 9), K-means (this lecture), Gaussian mixture models (lecture 17) and agglomerative clustering (lecture 20).

From playlist K-means Clustering

Subspace and Network Averaging for Computer Vision and Bioinformatics -- Math Major Seminar

โญSupport the channelโญ Patreon: https://www.patreon.com/michaelpennmath Merch: https://teespring.com/stores/michael-penn-math My amazon shop: https://www.amazon.com/shop/michaelpenn ๐ŸŸข Discord: https://discord.gg/Ta6PTGtKBm โญmy other channelsโญ Main Channel: https://www.youtube.

From playlist MathMajor Seminar

Nexus Trimester - Harry Lang (Johns Hopkins University)

Data Reduction for Clustering on Streams Harry Lang (Johns Hopkins University) March 08, 2016 Abstract: We explore clustering problems in the streaming sliding window model in both general metric spaces and Euclidean space. We present the first polylogarithmic space O(1)-approximation to

From playlist 2016-T1 - Nexus of Information and Computation Theory - CEB Trimester

Clustering 2: soft vs. hard clustering

Full lecture: http://bit.ly/K-means A hard clustering means we have non-overlapping clusters, where each instance belongs to one and only one cluster. In a soft clustering method, a single individual can belong to multiple clusters, often with a confidence (belief) associated with each cl

From playlist K-means Clustering

Hierarchical Clustering 5: summary

[http://bit.ly/s-link] Summary of the lecture.

From playlist Hierarchical Clustering

Adam Polak: Nearly-Tight and Oblivious Algorithms for Explainable Clustering

We study the problem of explainable clustering in the setting first formalized by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a sing

From playlist Workshop: Approximation and Relaxation

Introduction to Outlier Detection Methods - Wolfram Livecoding Session

Andreas Lauschke, a senior mathematical programmer, live-demos key Wolfram Language features useful in data science. In the sixth session, Andreas introduces some methods for outlier detection. This is part 1 of 2. A close look will be taken at box plots as well as caveats (i.e. when not t

From playlist Data Science with Andreas Lauschke

TabPy Tutorial For Beginners | TabPy Training | Tableau Training | Edureka | Tableau Rewind

Edureka Tableau Certification Training: https://www.edureka.co/tableau-certification-training (Use Code: YOUTUBE20) This Edureka tutorial on "TabPy Tutorial For Beginners" is to help you utilize donut charts as a tool, not only for engagement but also comprehension efficiency. Topic

From playlist Tableau Training Videos | Tableau Tutorial Videos | Data Visualisation using Tableau | Edureka

Practical, Fast, Beyond 2-pt Statistics for Large Scale Structure Clustering by Thomas Abel

PROGRAM LESS TRAVELLED PATH TO THE DARK UNIVERSE ORGANIZERS: Arka Banerjee (IISER Pune), Subinoy Das (IIA, Bangalore), Koushik Dutta (IISER, Kolkata), Raghavan Rangarajan (Ahmedabad University) and Vikram Rentala (IIT Bombay) DATE & TIME: 13 March 2023 to 24 March 2023 VENUE: Ramanujan

From playlist LESS TRAVELLED PATH TO THE DARK UNIVERSE

Stanford Seminar - Decision Making at Scale: Algorithms, Mechanisms, and Platforms

Ashish Goel Stanford University This seminar series features dynamic professionals sharing their industry experience and cutting edge research within the human-computer interaction (HCI) field. Each week, a unique collection of technologists, artists, designers, and activists will discuss

From playlist Stanford Seminars

(ML 16.1) K-means clustering (part 1)

Introduction to the K-means algorithm for clustering.

From playlist Machine Learning

Related pages

K-medoids | Binary data | Norm (mathematics) | Expectation–maximization algorithm | Medoid | Euclidean distance | Median | Stata | Mean | Silhouette (clustering) | Statistics | Facility location problem | Cluster analysis | ELKI | K-means clustering