String matching algorithms

Two-way string-matching algorithm

In computer science, the two-way string-matching algorithm is an string-searching algorithm, discovered by Maxime Crochemore and Dominique Perrin in 1991. It takes a pattern of size m, called a “needle”, preprocesses it in linear time O(m), producing information that can then be used to search for the needle in any “haystack” string, taking only linear time O(n) with n the haystack's length. The two-way algorithm can be viewed as a combination of the forward-going Knuth–Morris–Pratt algorithm (KMP) and the backward-running Boyer–Moore string-search algorithm (BM).Like those two, the 2-way algorithm preprocesses the pattern to find partially repeating periods and computes “shifts” based on them, indicating what offset to “jump” to in the haystack when a given character is encountered. Unlike BM and KMP, it uses only O(log m) additional space to store information about those partial repeats: the search pattern is split into two halves (its critical factorization), represented only by the position of that split. Being a number lesser than m, it can be represented in ⌈log₂ m⌉ bits. In most practical settings, this can be taken to be O(1), as the needle's size is limited by the size of addressable memory.The actual matching operation performs at most 2n − m comparisons. Breslauer later published two improved variants performing fewer comparisons, at the cost of storing additional data about the preprocessed needle: * The first one performs at most n + ⌊(n − m)/2⌋ comparisons, ⌈(n − m)/2⌉ fewer than the original. It must however store ⌈log m⌉ additional offsets in the needle, using O(log2 m) space. * The second adapts it to only store a constant number of such offsets, denoted c, but must perform n + ⌊(1⁄2 + ε) * (n − m)⌋ comparisons, with ε = 1⁄2(Fc+2 − 1)−1 = O(−c) going to zero exponentially quickly as c increases. The algorithm is considered fairly efficient in practice, being cache-friendly and using several operations that can be implemented in well-optimized subroutines. It is used by the C standard libraries glibc, newlib, and musl, to implement the memmem and strstr family of substring functions. As with most advanced string-search algorithms, the naïve implementation may be more efficient on small-enough instances; this is especially so if the needle isn't searched in multiple haystacks, which would amortize the preprocessing cost. (Wikipedia).

Video thumbnail

Pattern Matching - Correctness

Learn how to use pattern matching to assist you in your determination of correctness. This video contains two examples, one with feedback and one without. https://teacher.desmos.com/activitybuilder/custom/6066725595e2513dc3958333

From playlist Pattern Matching with Computation Layer

Video thumbnail

Determine If an Ordered Pair is a Solution to a Linear Equation

This video explains how to determine if an ordered pair is a solution to a given linear equation. http://mathispower4u.com

From playlist Graphing Linear Equations Using a Table of Values

Video thumbnail

Determine the Number of Solutions to a System of Linear Equations From a Graph

This video explains how to determine the number of solutions from the graph of a system of linear equations. http://mathispower4u.com

From playlist Solving Systems of Equations by Graphing

Video thumbnail

Determining Signal Similarities

Get a Free Trial: https://goo.gl/C2Y9A5 Get Pricing Info: https://goo.gl/kDvGHt Ready to Buy: https://goo.gl/vsIeA5 Find a signal of interest within another signal, and align signals by determining the delay between them using Signal Processing Toolbox™. For more on Signal Processing To

From playlist Signal Processing and Communications

Video thumbnail

More Complex Patterns

Sometimes you need to nest a pattern in another pattern. Learn how to build these patterns and then extract information from them. https://teacher.desmos.com/activitybuilder/custom/605e21d90925ca0c93fabbbd

From playlist Pattern Matching with Computation Layer

Video thumbnail

Ex: Identify the Solution to a System of Equation Given a Graph, Then Verify

This video provides the graph of two linear equations from a system of equations. It explains how to identify the solution and then verify the solution. Complete Library: http://www.mathispower4u.com Search by Topic: http://www.mathispower4u.wordpress.com

From playlist Solving Systems of Equations by Graphing

Video thumbnail

Fun with Strings

Experimenting and seeing what we can do with strings

From playlist Computer Science

Video thumbnail

Lecture 8 - String Matching Algorithms

This is Lecture 8 of the CSE549 (Computational Biology) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Stony Brook University in 2010. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/computationalbiology/pdf/lecture8.pdf More infor

From playlist CSE549 - Computational Biology - 2010 SBU

Video thumbnail

Lecture 3 - Computer Science for Biologists

This is Lecture 3 of the CSE549 (Computational Biology) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Stony Brook University in 2010. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/computationalbiology/pdf/lecture3.pdf More infor

From playlist CSE549 - Computational Biology - 2010 SBU

Video thumbnail

Lecture 21 - Dynamic Programming

This is Lecture 21 of the COMP300E (Programming Challenges) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Hong Kong University of Science and Technology in 2009. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/programmingchallenges

From playlist COMP300E - Programming Challenges - 2009 HKUST

Video thumbnail

Dynamic Programming Crash Course | Advanced Data Structures And Algorithms Tutorial | Simplilearn

🔥Post Graduate Program In Full Stack Web Development: https://www.simplilearn.com/pgp-full-stack-web-development-certification-training-course?utm_campaign=DynamicProgrammingCrashCourse-xZKqH7ZcS_Y&utm_medium=DescriptionFF&utm_source=youtube 🔥Caltech Coding Bootcamp (US Only): https://www.

From playlist Data Structures & Algorithms [2022 Updated]

Video thumbnail

Lecture 9 - Projects & Approximate String Matching

This is Lecture 9 of the CSE549 (Computational Biology) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Stony Brook University in 2010. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/computationalbiology/pdf/lecture9.pdf More infor

From playlist CSE549 - Computational Biology - 2010 SBU

Video thumbnail

Lecture 7 - Suffix Arrays and Assembly

This is Lecture 7 of the CSE549 (Computational Biology) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Stony Brook University in 2010. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/computationalbiology/pdf/lecture7.pdf More infor

From playlist CSE549 - Computational Biology - 2010 SBU

Video thumbnail

Ex 2: Solve a System of Equations by Graphing

This video provides an example of how to solve of system of linear equations by graphing. Complete Library: http://www.mathispower4u.com Search by Topic: http://www.mathispower4u.wordpress.com

From playlist Solving Systems of Equations by Graphing

Video thumbnail

Lecture 11 - Smith-Waterman Algorithm

This is Lecture 11 of the CSE549 (Computational Biology) course taught by Professor Steven Skiena [http://www.cs.sunysb.edu/~skiena/] at Stony Brook University in 2010. The lecture slides are available at: http://www.algorithm.cs.sunysb.edu/computationalbiology/pdf/lecture11.pdf More inf

From playlist CSE549 - Computational Biology - 2010 SBU

Related pages

C string handling | Fibonacci number | Boyer–Moore string-search algorithm | String-searching algorithm | Golden ratio | Knuth–Morris–Pratt algorithm | Lyndon word | String (computer science)