
Symbolic regression

Symbolic regression (SR) is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity. No particular model is provided as a starting point. Instead, initial expressions are formed by randomly combining mathematical building blocks such as operators, analytic functions, constants, and state variables. Usually a subset of these primitives is specified by the user, though this is not a requirement of the technique.

The symbolic regression problem has been tackled with a variety of methods, including recombining candidate equations, most commonly using genetic programming, as well as more recent methods utilizing Bayesian methods and neural networks. Another non-classical alternative is the Universal Functions Originator (UFO), which has a different mechanism, search space, and building strategy. Further methods such as Exact Learning attempt to transform the fitting problem into a moments problem in a natural function space, usually built around generalizations of the Meijer G-function.

By not requiring a priori specification of a model, symbolic regression is not biased toward any particular model structure and is less affected by gaps in human domain knowledge. It attempts to uncover the intrinsic relationships of the dataset by letting the patterns in the data reveal the appropriate models, rather than imposing a model structure that is deemed mathematically tractable from a human perspective. The fitness function that drives the evolution of the models takes into account not only error metrics (to ensure the models accurately predict the data), but also complexity measures, so that the resulting models reveal the data's underlying structure in a way that is understandable from a human perspective. This facilitates reasoning and improves the odds of gaining insight into the data-generating system.

It has been proven that symbolic regression is an NP-hard problem, in the sense that one cannot always find the best possible mathematical expression to fit a given dataset in polynomial time. (Wikipedia).
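
The genetic-programming approach described above can be sketched in a few dozen lines: candidate expressions are trees built from operators, a variable, and random constants; fitness combines a mean-squared-error term with a size penalty; and the population evolves by truncation selection and subtree mutation. This is a minimal illustrative sketch, not any particular library's implementation — all function names, operator choices, and parameters are arbitrary.

```python
import math
import random

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def random_tree(depth=3):
    # A leaf is the variable 'x' or a random constant; otherwise a random operator node.
    if depth == 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.6 else random.uniform(-2.0, 2.0)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def size(tree):
    return 1 if not isinstance(tree, tuple) else 1 + size(tree[1]) + size(tree[2])

def fitness(tree, xs, ys, parsimony=0.01):
    # Error metric plus a complexity penalty, as described in the text above.
    errs = (evaluate(tree, x) - y for x, y in zip(xs, ys))
    mse = sum(e * e for e in errs) / len(xs)
    return mse + parsimony * size(tree) if math.isfinite(mse) else float('inf')

def mutate(tree):
    # Replace a randomly chosen subtree with a freshly generated one.
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(2)
    op, left, right = tree
    return (op, mutate(left), right) if random.random() < 0.5 else (op, left, mutate(right))

def symbolic_regression(xs, ys, pop_size=200, generations=40):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, xs, ys))
        survivors = pop[:pop_size // 4]            # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda t: fitness(t, xs, ys))

random.seed(0)
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + x for x in xs]                       # target function: x^2 + x
best = symbolic_regression(xs, ys)
best_err = sum((evaluate(best, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Real systems add crossover, richer primitive sets, and constant optimization, but the structure — random expression trees scored by error plus complexity — is the same.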

Linear regression

Linear regression is used to compare sets or pairs of numerical data points and to find the correlation between variables.

From playlist Learning medical statistics with python and Jupyter notebooks

An Introduction to Linear Regression Analysis

Tutorial introducing the idea of linear regression analysis and the least square method. Typically used in a statistics class. Playlist on Linear Regression http://www.youtube.com/course?list=ECF596A4043DBEAE9C Like us on: http://www.facebook.com/PartyMoreStudyLess Created by David Lon

From playlist Linear Regression.
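
The least-squares method mentioned in the tutorial has a closed form for a single predictor, derived from the normal equations. A minimal sketch (function and variable names are illustrative):

```python
def fit_line(xs, ys):
    # Ordinary least squares for y ≈ slope*x + intercept, via the normal equations.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])   # exact data from y = 2x + 1
```

Under i.i.d. Gaussian noise, minimizing squared error coincides with maximizing the likelihood, which is why least squares arises so naturally in the probabilistic treatments of linear regression.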

(ML 9.2) Linear regression - Definition & Motivation

Linear regression arises naturally from a sequence of simple choices: discriminative model, Gaussian distributions, and linear functions. A playlist of these Machine Learning videos is available here: http://www.youtube.com/view_play_list?p=D0F06AA0D2E8FFBA

From playlist Machine Learning

Logistic Regression

Overview of logistic regression, a statistical classification technique.

From playlist Machine Learning

Predicting the rules behind - Deep Symbolic Regression for Recurrent Sequences (w/ author interview)

#deeplearning #symbolic #research This video includes an interview with first author Stéphane d'Ascoli (https://sdascoli.github.io/). Deep neural networks are typically excellent at numeric regression, but using them for symbolic computation has largely been ignored so far. This paper use

From playlist Papers Explained

Logistic Regression

This is a single lecture from a course. If you like the material and want more context (e.g., the lectures that came before), check out the whole course: https://go.umd.edu/jbg-inst-808 (Including homeworks and reading.) Music: https://soundcloud.com/alvin-grissom-ii/review-and-rest

From playlist Deep Learning for Information Scientists

Linear Regression using Python

This seminar series looks at four important linear models (linear regression, analysis of variance, analysis of covariance, and logistic regression). A video that explains all four model types is at https://www.youtube.com/watch?v=SV9AxXFWZnM&t=12s This video is on linear regression usin

From playlist Statistics

Interpretable Deep Learning for New Physics Discovery

In this video, Miles Cranmer discusses a method for converting a neural network into an analytic equation using a particular set of inductive biases. The technique relies on a sparsification of latent spaces in a deep neural network, followed by symbolic regression. In their paper, they de

From playlist Data-Driven Dynamical Systems with Machine Learning

DDPS | The problem with deep learning for physics (and how to fix it) by Miles Cranmer

Description: I will present a review of how deep learning is used in physics, and how this use is often misguided. I will introduce the term “scientific debt,” and argue that, though deep learning can quickly solve a complex problem, its success does not come for free. Because most learnin

From playlist Data-driven Physical Simulations (DDPS) Seminar Series

Symbolic Regression and Program Induction: Lars Buesing

Machine Learning for the Working Mathematician: Week Fourteen 2 June 2022 Lars Buesing, Searching for Formulas and Algorithms: Symbolic Regression and Program Induction Abstract: In spite of their enormous success as black box function approximators in many fields such as computer vision

From playlist Machine Learning for the Working Mathematician

Deep Symbolic Regression: Recovering Math Expressions from Data via Risk-Seeking Policy Gradients

The Data Science Institute (DSI) hosted a virtual seminar by Brenden Petersen from Lawrence Livermore National Laboratory on April 22, 2021. Read more about the DSI seminar series at https://data-science.llnl.gov/latest/seminar-series. Discovering the underlying mathematical expressions d

From playlist DSI Virtual Seminar Series

PDE FIND

We propose a sparse regression method capable of discovering the governing partial differential equation(s) of a given system by time series measurements in the spatial domain. The regression framework relies on sparsity promoting techniques to select the nonlinear and partial derivative

From playlist Research Abstracts from Brunton Lab
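
The sparsity-promoting regression at the core of PDE-FIND (and the SINDy family of methods) can be illustrated with sequentially thresholded least squares: fit all candidate terms, zero out small coefficients, and refit on the surviving terms. The sketch below uses synthetic data and an arbitrary candidate library and threshold; it is a simplified illustration, not the authors' implementation.

```python
import numpy as np

def stlsq(theta, y, threshold=0.1, iterations=10):
    # Sequentially thresholded least squares: a simple sparsity-promoting solver.
    xi = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                            # prune negligible terms
        active = ~small
        if active.any():                           # refit on the remaining terms
            xi[active] = np.linalg.lstsq(theta[:, active], y, rcond=None)[0]
    return xi

# Toy recovery of u_t = 0.5*u - 2*u_x from a library of candidate terms.
rng = np.random.default_rng(0)
u, u_x, u_xx = rng.standard_normal((3, 200))
u_t = 0.5 * u - 2.0 * u_x
library = np.column_stack([u, u_x, u_xx, u * u_x])
xi = stlsq(library, u_t)   # nonzero entries identify the active terms
```

In the full method, the library columns are nonlinear and partial-derivative terms estimated from spatiotemporal measurements, and the surviving coefficients define the governing PDE.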

Discovering Symbolic Models from Deep Learning with Inductive Biases (Paper Explained)

Neural networks are very good at predicting systems' numerical outputs, but not very good at deriving the discrete symbolic equations that govern many physical systems. This paper combines Graph Networks with symbolic regression and shows that the strong inductive biases of these models ca

From playlist Papers Explained

Patrick Riley - Symbolic Regression for Discovery of a DFT Functional - IPAM at UCLA

Recorded 23 January 2023. Patrick Riley of Relay Therapeutics presents "Symbolic Regression for Discovery of a DFT Functional" at IPAM's Learning and Emergence in Molecular Systems Workshop. Abstract: Symbolic regression is a family of machine learning algorithms that aim to produce small

From playlist 2023 Learning and Emergence in Molecular Systems

Boris Beranger - Composite likelihood and logistic regression models for aggregated data

Dr Boris Beranger (UNSW Sydney) presents “Composite likelihood and logistic regression models for aggregated data”, 14 August 2020. This seminar was organised by the University of Technology Sydney.

From playlist Statistics Across Campuses

SINDy-PI: A robust algorithm for parallel implicit sparse identification of nonlinear dynamics

In this video, Kadierdan Kaheman describes SINDy-PI: a robust algorithm for parallel implicit sparse identification of nonlinear dynamics. SINDy-PI overcomes the difficulty of using SINDy to identify rational or implicit dynamics and makes it possible to directly extract th

From playlist Research Abstracts from Brunton Lab

Data Mining using R | Data Mining Tutorial for Beginners | R Tutorial for Beginners | Edureka

( R Training : https://www.edureka.co/data-analytics-with-r-certification-training ) This Edureka R tutorial on "Data Mining using R" will help you understand the core concepts of Data Mining comprehensively. This tutorial will also include a case study using R, where you'll apply dat

From playlist Machine Learning with R | Edureka

Linear Regression Using R

How to calculate Linear Regression using R. http://www.MyBookSucks.Com/R/Linear_Regression.R http://www.MyBookSucks.Com/R Playlist http://www.youtube.com/playlist?list=PLF596A4043DBEAE9C

From playlist Linear Regression.

Related pages

Meijer G-function | Residual (numerical analysis) | Multi expression programming | Fitness function | Operation (mathematics) | Regression analysis | Mathematical optimization | Gradient | NP-hardness | Reverse mathematics | Method of moments (statistics) | Bayesian statistics | Dimensional analysis | State variable | Linear genetic programming | Genetic and Evolutionary Computation Conference | Simulated annealing | Divide-and-conquer algorithm | QLattice | Julia (programming language) | Kolmogorov complexity | Closed-form expression | Artificial intelligence | HeuristicLab | Combinatorial explosion | Artificial neural network | Evolutionary algorithm | Analytic function | Scikit-learn | Discovery system (AI research) | Eureqa | Constant (mathematics) | Gene expression programming | Genetic programming