Measure theory | Clustering criteria | String metrics | Similarity measures

Simple matching coefficient

The simple matching coefficient (SMC), or Rand similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets. Given two objects, A and B, each with n binary attributes, the SMC is defined as:

$$\mathrm{SMC} = \frac{\text{number of matching attributes}}{\text{total number of attributes}} = \frac{M_{00} + M_{11}}{M_{00} + M_{01} + M_{10} + M_{11}}$$

where:

$M_{00}$ is the total number of attributes where A and B both have a value of 0.
$M_{11}$ is the total number of attributes where A and B both have a value of 1.
$M_{01}$ is the total number of attributes where the attribute of A is 0 and the attribute of B is 1.
$M_{10}$ is the total number of attributes where the attribute of A is 1 and the attribute of B is 0.

The simple matching distance (SMD), which measures dissimilarity between sample sets, is given by $1 - \mathrm{SMC}$. The SMC is linearly related to Hamann similarity: $\mathrm{SMC} = (\mathrm{Hamann} + 1)/2$. Also, $\mathrm{SMC} = 1 - D^2/n$, where $D^2$ is the squared Euclidean distance between the two objects (binary vectors) and n is the number of attributes.

The SMC is very similar to the more popular Jaccard index. The main difference is that the SMC has the term $M_{00}$ in its numerator and denominator, whereas the Jaccard index does not. Thus, the SMC counts both mutual presences (an attribute is present in both sets) and mutual absences (an attribute is absent in both sets) as matches and compares them to the total number of attributes in the universe, whereas the Jaccard index counts only mutual presences as matches and compares them to the number of attributes that have been chosen by at least one of the two sets.

In market basket analysis, for example, the baskets of two consumers whom we wish to compare might contain only a small fraction of all the products available in the store, so the SMC will usually return very high similarity values even when the baskets bear very little resemblance. This makes the Jaccard index a more appropriate measure of similarity in that context. For example, consider a supermarket with 1000 products and two customers. The basket of the first customer contains salt and pepper, and the basket of the second contains salt and sugar. Here $M_{11} = 1$ (salt), $M_{10} = 1$ (pepper), $M_{01} = 1$ (sugar) and $M_{00} = 997$, so the similarity between the two baskets as measured by the Jaccard index is 1/3, but it becomes 998/1000 = 0.998 using the SMC.

In other contexts, where 0 and 1 carry equivalent information (symmetry), the SMC is a better measure of similarity. For example, vectors of demographic variables stored in dummy variables, such as binary gender, would be better compared with the SMC than with the Jaccard index, since the impact of gender on similarity should be the same regardless of whether male is coded as 0 and female as 1 or the other way around. However, when we have symmetric dummy variables, one could replicate the behaviour of the SMC by splitting each dummy into two binary attributes (in this case, male and female), thus transforming them into asymmetric attributes and allowing the use of the Jaccard index without introducing any bias. With this trick, the Jaccard index can be considered to make the SMC a fully redundant metric. The SMC remains, however, more computationally efficient in the case of symmetric dummy variables, since it does not require adding extra dimensions.

The Jaccard index is also more general than the SMC and can be used to compare data types other than vectors of binary attributes, such as probability measures. (Wikipedia).
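The supermarket example can be checked numerically. The following is a minimal Python sketch, not a reference implementation. The function names (`match_counts`, `smc`, `jaccard`, `hamann`) and the placement of salt, pepper and sugar at positions 0, 1 and 2 are illustrative assumptions. Here m00, m01, m10 and m11 count the attributes where (A, B) take the values (0, 0), (0, 1), (1, 0) and (1, 1). Hamann similarity is taken to be (matches minus mismatches) divided by n, and the Jaccard value for two all-zero vectors is set to 1 by assumption.

```python
import math
from typing import Sequence, Tuple


def match_counts(a: Sequence[int], b: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (m00, m01, m10, m11) for two equal-length, non-empty 0/1 vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for pair in zip(a, b):
        if pair not in counts:
            raise ValueError("attributes must be 0 or 1")
        counts[pair] += 1
    return counts[0, 0], counts[0, 1], counts[1, 0], counts[1, 1]


def smc(a: Sequence[int], b: Sequence[int]) -> float:
    """Simple matching coefficient: matches (0/0 and 1/1) over all attributes."""
    m00, m01, m10, m11 = match_counts(a, b)
    return (m00 + m11) / (m00 + m01 + m10 + m11)


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index: mutual presences over attributes present in either vector."""
    m00, m01, m10, m11 = match_counts(a, b)
    union = m01 + m10 + m11
    # Assumed convention: two all-zero vectors are treated as identical.
    return m11 / union if union else 1.0


def hamann(a: Sequence[int], b: Sequence[int]) -> float:
    """Hamann similarity, assumed here to be (matches - mismatches) / n."""
    m00, m01, m10, m11 = match_counts(a, b)
    return ((m00 + m11) - (m01 + m10)) / (m00 + m01 + m10 + m11)


# Store with 1000 products; positions 0, 1, 2 stand for salt, pepper, sugar.
n_products = 1000
basket_1 = [1, 1, 0] + [0] * (n_products - 3)  # salt and pepper
basket_2 = [1, 0, 1] + [0] * (n_products - 3)  # salt and sugar

s = smc(basket_1, basket_2)
j = jaccard(basket_1, basket_2)
d2 = sum((x - y) ** 2 for x, y in zip(basket_1, basket_2))  # squared Euclidean distance

print(f"SMC     = {s:.3f}")
print(f"Jaccard = {j:.3f}")
print(f"SMD     = {1 - s:.3f}")

# The two linear relations stated in the text hold for this pair of baskets.
assert math.isclose(s, (hamann(basket_1, basket_2) + 1) / 2)
assert math.isclose(s, 1 - d2 / n_products)
```

The 997 shared absences dominate the SMC, while the Jaccard index ignores them, which is why the two measures disagree so sharply on sparse data such as shopping baskets.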

Covariance (8 of 17) What is the Correlation Coefficient?

Visit http://ilectureonline.com for more math and science lectures! To donate: http://www.ilectureonline.com/donate https://www.patreon.com/user?u=3236071 We will learn what the correlation coefficient of 2 data sets is, how to find it, and how it corresponds to the graph of the data.

From playlist COVARIANCE AND VARIANCE

Estimate the Correlation Coefficient Given a Scatter Plot

This video explains how to estimate the correlation coefficient given a scatter plot.

From playlist Performing Linear Regression and Correlation

Covariance (12 of 17) Covariance Matrix with 3 Data Sets and Correlation Coefficients

Visit http://ilectureonline.com for more math and science lectures! To donate: http://www.ilectureonline.com/donate https://www.patreon.com/user?u=3236071 We will find the correlation coefficients of the 3 data sets from the previous 2 videos. Next video in this series can be seen at:

From playlist COVARIANCE AND VARIANCE

What is similarity

👉 Learn how to solve with similar triangles. Two triangles are said to be similar if the corresponding angles are congruent (equal). Note that similarity of two triangles does not imply that the lengths of the sides are equal, only that the sides are proportional. Knowledge of the length of the side

From playlist Similar Triangles

Similar Triangles Using Side-Side-Side and Side-Angle-Side

This video explains how to determine if two triangles are similar using SSS and SAS. Complete Video List: http://www.mathispower4u.yolasite.com

From playlist Similarity

Correlation Coefficient

This video explains how to find the correlation coefficient which describes the strength of the linear relationship between two variables x and y. My Website: https://www.video-tutor.net Patreon: https://www.patreon.com/MathScienceTutor Amazon Store: https://www.amazon.com/shop/theorga

From playlist Statistics

Using a set of points to determine if two triangles are similar to each other

👉 Learn how to determine whether two triangles are similar given the coordinate points of the vertices of the triangles. Two triangles are said to be similar when the corresponding angles of the triangles are congruent (equal) or when the corresponding side lengths are proportional. When give

From playlist Similar Triangles

Given two similar triangles determine the values of x and y for the angles

👉 Learn how to solve with similar triangles. Two triangles are said to be similar if the corresponding angles are congruent (equal). Note that similarity of two triangles does not imply that the lengths of the sides are equal, only that the sides are proportional. Knowledge of the length of the side

From playlist Similar Triangles

Session 2 - The On-Shell Analytic S-Matrix: Jacob Bourjaily

https://strings2015.icts.res.in/talkTitles.php

From playlist Strings 2015 conference

What are similar triangles?

You’ve heard about similar triangles, but do you know what technically makes two triangles similar? Informally, we can say that two triangles are similar if their associated angles are congruent. In other words, their angle measures have to be the same. However, the triangles don’t necess

From playlist Popular Questions

Monotone Arithmetic Circuit Lower Bounds Via Communication Complexity - Arkadev Chattopadhyay

Computer Science/Discrete Mathematics Seminar I Topic: Monotone Arithmetic Circuit Lower Bounds Via Communication Complexity Speaker: Arkadev Chattopadhyay Affiliation: Tata Institute of Fundamental Research Date: February 15, 2021 For more videos, please visit http://video.ias.edu

From playlist Mathematics

Linear Algebra 3c2: Decomposition with Polynomials 2

https://bit.ly/PavelPatreon https://lem.ma/LA - Linear Algebra on Lemma http://bit.ly/ITCYTNew - Dr. Grinfeld's Tensor Calculus textbook https://lem.ma/prep - Complete SAT Math Prep

From playlist Part 1 Linear Algebra: An In-Depth Introduction with a Focus on Applications

Arthur Szlam: "A Tutorial on Sparse Modeling"

Graduate Summer School 2012: Deep Learning, Feature Learning "A Tutorial on Sparse Modeling" Arthur Szlam, New York University Institute for Pure and Applied Mathematics, UCLA July 16, 2012 For more information: https://www.ipam.ucla.edu/programs/summer-schools/graduate-summer-school-deep

From playlist GSS2012: Deep Learning, Feature Learning

Combinatorial methods for PIT (and ranks of matrix spaces) - Roy Meshulam

Optimization, Complexity and Invariant Theory Topic: Combinatorial methods for PIT (and ranks of matrix spaces) Speaker: Roy Meshulam Affiliation: Technion Date: June 8, 2018 For more videos, please visit http://video.ias.edu

From playlist Mathematics

Cluster algebras from surfaces II: expansion formulas, good bases,... (Lecture 2) by Jon Wilson

PROGRAM :SCHOOL ON CLUSTER ALGEBRAS ORGANIZERS :Ashish Gupta and Ashish K Srivastava DATE :08 December 2018 to 22 December 2018 VENUE :Madhava Lecture Hall, ICTS Bangalore In 2000, S. Fomin and A. Zelevinsky introduced Cluster Algebras as abstractions of a combinatoro-algebra

From playlist School on Cluster Algebras 2018

9. HQET Matching & Power Corrections

MIT 8.851 Effective Field Theory, Spring 2013 View the complete course: http://ocw.mit.edu/8-851S13 Instructor: Iain Stewart In this lecture, the professor discussed detailed matching calculation, velocity dependent anomalous dimension and Wilson coefficients, power corrections and repar

From playlist MIT 8.851 Effective Field Theory, Spring 2013

Dimers, networks, and integrable systems - Anton Izosimov

Joint IAS/Princeton/Montreal/Paris/Tel-Aviv Symplectic Geometry Zoominar Topic: Dimers, networks, and integrable systems Speaker: Anton Izosimov Affiliation: The University of Arizona Date: March 18, 2022 I will review two combinatorial constructions of integrable systems: Goncharov-Keny

From playlist Mathematics

Linear Algebra 3c1: Decomposition with Polynomials 1

https://bit.ly/PavelPatreon https://lem.ma/LA - Linear Algebra on Lemma http://bit.ly/ITCYTNew - Dr. Grinfeld's Tensor Calculus textbook https://lem.ma/prep - Complete SAT Math Prep

From playlist Part 1 Linear Algebra: An In-Depth Introduction with a Focus on Applications

Using Similarity and proportions to find the missing values

👉 Learn how to solve with similar triangles. Two triangles are said to be similar if the corresponding angles are congruent (equal). Note that similarity of two triangles does not imply that the lengths of the sides are equal, only that the sides are proportional. Knowledge of the length of the side

From playlist Similar Triangles

Related pages

Rand index | Diversity index | Similarity measure | Probability measure | Statistic | Dummy variable (statistics) | Jaccard index