Category: Computer arithmetic

Permute instruction

Permute (and Shuffle) instructions, part of bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, where the indices are specified by a seco

Single-precision floating-point format

Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric value

Common operator notation

In programming languages, scientific calculators and similar common operator notation or operator grammar is a way to define and analyse mathematical and other formal expressions. In this model a line

Numerical error

In software engineering and mathematics, numerical error is the error in the numerical computations.

IEEE 754-2019

No description available.

Compare-and-swap

In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if th

Carry flag

In computer processors the carry flag (usually indicated as the C flag) is a single bit in a system status register/flag register used to indicate when an arithmetic carry or borrow has been generated

Long double

In C and related programming languages, long double refers to a floating-point data type that is often more precise than double precision though the language standard only requires it to be at

Q (number format)

The Q notation is a way to specify the parameters of a binary fixed point number format. For example, in Q notation, the number format denoted by Q8.8 means that the fixed point numbers in this format

Radix economy

The radix economy of a number in a particular base (or radix) is the number of digits needed to express it in that base, multiplied by the base (the number of possible values each digit could have). T

Saturation arithmetic

Saturation arithmetic is a version of arithmetic in which all operations, such as addition and multiplication, are limited to a fixed range between a minimum and maximum value. If the result of an ope

Barrel shifter

A barrel shifter is a digital circuit that can shift a data word by a specified number of bits without the use of any sequential logic, only pure combinational logic, i.e. it inherently provides a bin

Bit-length

Bit-length or bit width is the number of binary digits, called bits, necessary to represent an integer as a binary number. Formally, the bit-length of a natural number is a function, bitLength(n), of

Find first set

In computer software and hardware, find first set (ffs) or find first one is a bit operation that, given an unsigned machine word, designates the index or position of the least significant bit set to

Hamming code

In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection

IEEE 754-2008

No description available.

Normalized number

In applied mathematics, a number is normalized when it is written in scientific notation with one non-zero decimal digit before the decimal point. Thus, a real number, when written out in normalized s

Augmented assignment

Augmented assignment (or compound assignment) is the name given to certain assignment operators in certain programming languages (especially those derived from C). An augmented assignment is generally

Extended precision

Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended precision formats support a basic format by minimizing roundof

Carry-less product

The carry-less product of two binary numbersis the result of carry-less multiplication of these numbers.This operation conceptually works like long multiplicationexcept for the fact that the carryis d

Division algorithm

A division algorithm is an algorithm which, given two integers N and D, computes their quotient and/or remainder, the result of Euclidean division. Some are applied by hand, while others are employed

Decimal data type

Some programming languages (or compilers for them) provide a built-in (primitive) or library decimal data type to represent non-repeating decimal fractions like 0.3 and -1.17 without rounding, and to

Ternary numeral system

A ternary /ˈtɜːrnəri/ numeral system (also called base 3 or trinary) has three as its base. Analogous to a bit, a ternary digit is a trit (trinary digit). One trit is equivalent to log2 3 (about 1.584

Double-precision floating-point format

Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric

Reduction of summands

Reduction of summands is an algorithm for fast binary multiplication of non-signed binary integers. It is performed in three steps: production of summands, reduction of summands, and summation.

Negative flag

In a computer processor the negative flag or sign flag is a single bit in a system status (flag) register used to indicate whether the result of the last mathematical operation produced a value in whi

Carry operator

The carry operator, symbolized by the ¢ sign, is an abstraction of the operation of determining whether a portion of an adder network generates or propagates a carry. It is defined as follows: ¢

Modulo operation

In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another (called the modulus of the operation). Given two positive numbers a a

BKM algorithm

The BKM algorithm is a shift-and-add algorithm for computing elementary functions, first published in 1994 by Jean-Claude Bajard, Sylvanus Kla, and Jean-Michel Muller. BKM is based on computing comple

Barrett reduction

In modular arithmetic, Barrett reduction is a reduction algorithm introduced in 1986 by P.D. Barrett. A naive way of computing would be to use a fast division algorithm. Barrett reduction is an algori

Unit in the last place

In computer science and numerical analysis, unit in the last place or unit of least precision (ulp) is the spacing between two consecutive floating-point numbers, i.e., the value the least significant

INTLAB

INTLAB (INTerval LABoratory) is an interval arithmetic library using MATLAB and GNU Octave, available in Windows and Linux, macOS. It was developed by S.M. Rump from Hamburg University of Technology.

Bit manipulation

Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control,

Exponentiation by squaring

In mathematics and computer programming, exponentiating by squaring is a general method for fast computation of large positive integer powers of a number, or more generally of an element of a semigrou

GNU MPFR

The GNU Multiple Precision Floating-Point Reliable Library (GNU MPFR) is a GNU portable C library for arbitrary-precision binary floating-point computation with correct rounding, based on GNU Multi-Pr

Circular shift

In combinatorial mathematics, a circular shift is the operation of rearranging the entries in a tuple, either by moving the final entry to the first position, while shifting all other entries to the n

List of arbitrary-precision arithmetic software

This article lists libraries, applications, and other software which enable or support arbitrary-precision arithmetic.

5-3-1-1 code

No description available.

Adjust flag

The Adjust flag (AF) is a CPU flag in the FLAGS register of all x86-compatible CPUs, and the preceding 8080-family; it is also called the Auxiliary flag and the Auxiliary Carry flag (AC, though this m

Binary number

A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one). The ba

Significand

The significand (also mantissa or coefficient, sometimes also argument, or ambiguously fraction or characteristic) is part of a number in scientific notation or in floating-point representation, consi

Arithmetic overflow

No description available.

Signedness

In computing, signedness is a property of data types representing numbers in computer programs. A numeric variable is signed if it can represent both positive and negative numbers, and unsigned if it

IEEE 854-1987

The IEEE Standard for Radix-Independent Floating-Point Arithmetic (IEEE 854), was the first Institute of Electrical and Electronics Engineers (IEEE) international standard for floating-point arithmeti

Jump-at-2 code

No description available.

Radix point

No description available.

Fetch-and-add

In computer science, the fetch-and-add CPU instruction (FAA) atomically increments the contents of a memory location by a specified value. That is, fetch-and-add performs the operation increment the v

Integer overflow

In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits –

White code

No description available.

Balanced ternary

Balanced ternary is a ternary numeral system (i.e. base 3 with three digits) that uses a balanced signed-digit representation of the integers in which the digits have the values −1, 0, and 1. This sta

Hamming(7,4)

In coding theory, Hamming(7,4) is a linear error-correcting code that encodes four bits of data into seven bits by adding three parity bits. It is a member of a larger family of Hamming codes, but the

Dadda multiplier

The Dadda multiplier is a hardware binary multiplier design invented by computer scientist Luigi Dadda in 1965. It uses a selection of full and half adders to sum the partial products in stages (the D

Negative base

A negative base (or negative radix) may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the sys

CORDIC

CORDIC (for "coordinate rotation digital computer"), also known as Volder's algorithm, or: Digit-by-digit method Circular CORDIC (Jack E. Volder), Linear CORDIC, Hyperbolic CORDIC (John Stephen Walthe

Jump-at-8 code

No description available.

Floating-point arithmetic

In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of

Multiply–accumulate operation

In computing, especially digital signal processing, the multiply–accumulate (MAC) or multiply-add (MAD) operation is a common step that computes the product of two numbers and adds that product to an

GNU Multiple Precision Arithmetic Library

GNU Multiple Precision Arithmetic Library (GMP) is a free library for arbitrary-precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers. There are no practical

Division by zero

In mathematics, division by zero is division where the divisor (denominator) is zero. Such a division can be formally expressed as , where a is the dividend (numerator). In ordinary arithmetic, the ex

Floating-point unit

A floating-point unit (FPU, colloquially a math coprocessor) is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtr

Integer set library

isl (integer set library) is a portable C library for manipulating sets and relations of integer points bounded by linear constraints. The following operations are supported: * intersection, union, s

Skew binary number system

The skew binary number system is a non-standard positional numeral system in which the nth digit contributes a value of times the digit (digits are indexed from 0) instead of times as they do in binar

Sign bit

In computer science, the sign bit is a bit in a signed number representation that indicates the sign of a number. Although only signed numeric data types have a sign bit, it is invariably located in t

Mixed-precision arithmetic

Mixed-precision arithmetic is a form of floating-point arithmetic that uses numbers with varying widths in a single operation.

Machine epsilon

Machine epsilon or machine precision is an upper bound on the relative approximation error due to rounding in floating point arithmetic. This value characterizes computer arithmetic in the field of nu

Integer (computer science)

In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be a

Pairwise summation

In numerical analysis, pairwise summation, also called cascade summation, is a technique to sum a sequence of finite-precision floating-point numbers that substantially reduces the accumulated round-o

NaN

In computing, NaN (/næn/), standing for Not a Number, is a member of a numeric data type that can be interpreted as a value that is undefined or unrepresentable, especially in floating-point arithmeti

4-2-2-1 code

No description available.

Binade

In software engineering, a binade is a set of numbers in a binary IEEE 754 floating-point format that all have the same exponent. In other words, a binade is the interval [2n, 2n+1) for some integer v

Interval contractor

In mathematics, an interval contractor (or contractor for short) associated to a set X is an operator C which associates to a box [x] in Rn another box C([x]) of Rn such that the two following propert

IEEE 754

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). Th

Execution unit

In computer engineering, an execution unit (E-unit or EU) is a part of the central processing unit (CPU) that performs the operations and calculations as instructed by the computer program. It may hav

Test and test-and-set

In computer science, the test-and-set CPU instruction is used to implement mutual exclusion in multiprocessor environments. Although a correct lock can be implemented with test-and-set, it can lead to

2-4-2-1 code

No description available.

8-4-2-1 code

No description available.

Microsoft Binary Format

In computing, Microsoft Binary Format (MBF) is a format for floating-point numbers which was used in Microsoft's BASIC language products, including MBASIC, GW-BASIC and QuickBASIC prior to version 4.0

Zero flag

The zero flag is a single bit flag that is a central feature on most conventional CPU architectures (including x86, ARM, PDP-11, 68000, 6502, and numerous others). It is often stored in a dedicated re

Arithmetic logic unit

In computing, an arithmetic logic unit (ALU) is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit

Normal number (computing)

In computing, a normal number is a non-zero number in a floating-point representation which is within the balanced range supported by a given floating-point format: it is a floating point number that

Signed zero

Signed zero is zero with an associated sign. In ordinary arithmetic, the number 0 does not have a sign, so that −0, +0 and 0 are identical. However, in computing, some number representations allow for

Residue number system

A residue numeral system (RNS) is a numeral system representing integers by their values modulo several pairwise coprime integers called the moduli. This representation is allowed by the Chinese remai

Bi-quinary coded decimal

Bi-quinary coded decimal is a numeral encoding scheme used in many abacuses and in some early computers, including the Colossus. The term bi-quinary indicates that the code comprises both a two-state

Rounding

Rounding means replacing a number with an approximate value that has a shorter, simpler, or more explicit representation. For example, replacing $23.4476 with $23.45, the fraction 312/937 with 1/3, or

Half-carry flag

A half-carry flag (also known as an auxiliary flag or decimal adjust flag) is a condition flag bit in the status register of many CPU families, such as the Intel 8080, Zilog Z80, the x86, and the Atme

Method of complements

In mathematics and computing, the method of complements is a technique to encode a symmetric range of positive and negative integers in a way that they can use the same algorithm (hardware) for additi

Division by infinity

In mathematics, division by infinity is division where the divisor (denominator) is infinity. In ordinary arithmetic, this does not have a well-defined meaning, since infinity is a mathematical concep

Exponent bias

In IEEE 754 floating-point numbers, the exponent is biased in the engineering sense of the word – the value stored is offset from the actual value by the exponent bias, also called a biased exponent.B

Test-and-set

In computer science, the test-and-set instruction is an instruction used to write (set) 1 to a memory location and return its old value as a single atomic (i.e., non-interruptible) operation. The call

Binary integer decimal

The IEEE 754-2008 standard includes decimal floating-point number formats in which the significand and the exponent (and the payloads of NaNs) can be encoded in two ways, referred to as binary encodin

Decimal floating point

Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding error

Block floating point

Block floating point (BFP) is a method used to provide an arithmetic approaching floating point while using a fixed-point processor. BFP assigns a group of significands (the non-exponent part of the f

Karlsruhe Accurate Arithmetic

Karlsruhe Accurate Arithmetic (KAA) or Karlsruhe Accurate Arithmetic Approach (KAAA), augments conventional floating-point arithmetic with good error behaviour with new operations to calculate scalar

MPIR (mathematics software)

Multiple Precision Integers and Rationals (MPIR) is an open-source software multiprecision integer library forked from the GNU Multiple Precision Arithmetic Library (GMP) project. It consists of much

Base36

Base36 is a binary-to-text encoding scheme that represents binary data in an ASCII string format by translating it into a radix-36 representation. The choice of 36 is convenient in that the digits can

Computer number format

A computer number format is the internal representation of numeric values in digital device hardware and software, such as in programmable computers and calculators. Numerical values are stored as gro

Symmetric level-index arithmetic

The level-index (LI) representation of numbers, and its algorithms for arithmetic operations, were introduced by Charles Clenshaw and Frank Olver in 1984. The symmetric form of the LI system and its a

Gal's accurate tables

Gal's accurate tables is a method devised by Shmuel Gal to provide accurate values of special functions using a lookup table and interpolation. It is a fast and efficient method for generating values

Tapered floating point

In computing, tapered floating point (TFP) is a format similar to floating point, but with variable-sized entries for the significand and exponent instead of the fixed-length entries found in normal f

ISO/IEC 10967

ISO/IEC 10967, Language independent arithmetic (LIA), is a series ofstandards on computer arithmetic. It is compatible with ISO/IEC/IEEE 60559:2011,more known as IEEE 754-2008, and much of thespecific

Interval arithmetic

Interval arithmetic (also known as interval mathematics, interval analysis, or interval computation) is a mathematical technique used to put bounds on rounding errors and measurement errors in mathema

Binary-coded decimal

In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Some

Zero to the power of zero

Zero to the power of zero, denoted by 00, is a mathematical expression that is either defined as 1 or left undefined, depending on context. In algebra and combinatorics, one typically defines 00 = 1.

Floating-point error mitigation

Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error c

ARITH Symposium on Computer Arithmetic

The IEEE International Symposium on Computer Arithmetic (ARITH) is a conference in the area of computer arithmetic. The symposium was established in 1969, initially as three-year event, then as abienn

Sterbenz lemma

In floating-point arithmetic, the Sterbenz lemma or Sterbenz's lemma is a theorem giving conditions under which floating-point differences are computed exactly.It is named after Pat H. Sterbenz, who p

Carry (arithmetic)

In elementary arithmetic, a carry is a digit that is transferred from one column of digits to another column of more significant digits. It is part of the standard algorithm to add numbers together by

Computron tube

The Computron was an electron tube designed to perform the parallel addition and multiplication of digital numbers. It was conceived by Richard L. Snyder, Jr., Jan A. Rajchman, Paul Rudnick and the di

Kahan summation algorithm

In numerical analysis, the Kahan summation algorithm, also known as compensated summation, significantly reduces the numerical error in the total obtained by adding a sequence of finite-precision floa

Kulisch arithmetic

No description available.

IBM hexadecimal floating-point

Hexadecimal floating point (now called HFP by IBM) is a format for encoding floating-point numbers first introduced on the IBM System/360 computers, and supported on subsequent machines based on that

IEEE 1788-2015

No description available.

5-2-2-1 code

No description available.

Minifloat

In computing, minifloats are floating-point values represented with very few bits. Predictably, they are not well suited for general-purpose numerical calculations. They are used for special purposes,

5-4-2-1 code

No description available.

2Sum

2Sum is a floating-point algorithm for computing the exact round-off error in a floating-point addition operation. 2Sum and its variant Fast2Sum were first published by Møller in 1965.Fast2Sum is ofte

Two-out-of-five code

A two-out-of-five code is a constant-weight code that provides exactly ten possible combinations of two bits, and is thus used for representing the decimal digits using five bits. Each bit is assigned

Montgomery modular multiplication

In modular arithmetic computation, Montgomery modular multiplication, more commonly referred to as Montgomery multiplication, is a method for performing fast modular multiplication. It was introduced

Subnormal number

In computer science, subnormal numbers are the subset of denormalized numbers (sometimes called denormals) that fill the underflow gap around zero in floating-point arithmetic. Any non-zero number wit

Arithmetic underflow

The term arithmetic underflow (also floating point underflow, or just underflow) is a condition in a computer program where the result of a calculation is a number of more precise absolute value than

Logarithmic number system

A logarithmic number system (LNS) is an arithmetic system used for representing real numbers in computer and digital hardware, especially for digital signal processing.

Decimal32 floating-point format

In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes (32 bits) in computer memory.It is intended for applications where it is necessary to emulate decima

Arbitrary-precision arithmetic

In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed

Decimal64 floating-point format

In computing, decimal64 is a decimal floating-point computer numbering format that occupies 8 bytes (64 bits) in computer memory.It is intended for applications where it is necessary to emulate decima

Aiken code

The Aiken code (also known as 2421 code) is a complementary binary-coded decimal (BCD) code. A group of four bits is assigned to the decimal digits from 0 to 9 according to the following table. The co

Signed number representations

In computing, signed number representations are required to encode negative numbers in binary number systems. In mathematics, negative numbers in any base are represented by prefixing them with a minu

Wallace tree

A Wallace multiplier is a hardware implementation of a binary multiplier, a digital circuit that multiplies two integers. It uses a selection of full and half adders (the Wallace tree or Wallace reduc

Fixed-point arithmetic

In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with

IEEE P1788

No description available.

Overflow flag

In computer processors, the overflow flag (sometimes called the V flag) is usually a single bit in a system status register used to indicate when an arithmetic overflow has occurred in an operation, i

IEEE 754-2008 revision

IEEE 754-2008 (previously known as IEEE 754r) was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 floating-point standard, while in 2019 it was updated with

Serial decimal

In computers, a serial decimal numeric representation is one in which ten bits are reserved for each digit, with a different bit turned on depending on which of the ten possible digits is intended. EN

IEEE 754-1985

IEEE 754-1985 was an industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revisio