Category: Computer arithmetic

Permute instruction
Permute (and Shuffle) instructions, part of bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, where the indices are specified by a seco
Single-precision floating-point format
Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric value
Common operator notation
In programming languages, scientific calculators and similar common operator notation or operator grammar is a way to define and analyse mathematical and other formal expressions. In this model a line
Numerical error
In software engineering and mathematics, numerical error is the error in the numerical computations.
IEEE 754-2019
No description available.
Compare-and-swap
In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if th
Carry flag
In computer processors the carry flag (usually indicated as the C flag) is a single bit in a system status register/flag register used to indicate when an arithmetic carry or borrow has been generated
Long double
In C and related programming languages, long double refers to a floating-point data type that is often more precise than double precision though the language standard only requires it to be at
Q (number format)
The Q notation is a way to specify the parameters of a binary fixed point number format. For example, in Q notation, the number format denoted by Q8.8 means that the fixed point numbers in this format
Radix economy
The radix economy of a number in a particular base (or radix) is the number of digits needed to express it in that base, multiplied by the base (the number of possible values each digit could have). T
Saturation arithmetic
Saturation arithmetic is a version of arithmetic in which all operations, such as addition and multiplication, are limited to a fixed range between a minimum and maximum value. If the result of an ope
Barrel shifter
A barrel shifter is a digital circuit that can shift a data word by a specified number of bits without the use of any sequential logic, only pure combinational logic, i.e. it inherently provides a bin
Bit-length
Bit-length or bit width is the number of binary digits, called bits, necessary to represent an integer as a binary number. Formally, the bit-length of a natural number is a function, bitLength(n), of
Find first set
In computer software and hardware, find first set (ffs) or find first one is a bit operation that, given an unsigned machine word, designates the index or position of the least significant bit set to
Hamming code
In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection
IEEE 754-2008
No description available.
Normalized number
In applied mathematics, a number is normalized when it is written in scientific notation with one non-zero decimal digit before the decimal point. Thus, a real number, when written out in normalized s
Augmented assignment
Augmented assignment (or compound assignment) is the name given to certain assignment operators in certain programming languages (especially those derived from C). An augmented assignment is generally
Extended precision
Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended precision formats support a basic format by minimizing roundof
Carry-less product
The carry-less product of two binary numbersis the result of carry-less multiplication of these numbers.This operation conceptually works like long multiplicationexcept for the fact that the carryis d
Division algorithm
A division algorithm is an algorithm which, given two integers N and D, computes their quotient and/or remainder, the result of Euclidean division. Some are applied by hand, while others are employed
Decimal data type
Some programming languages (or compilers for them) provide a built-in (primitive) or library decimal data type to represent non-repeating decimal fractions like 0.3 and -1.17 without rounding, and to
Ternary numeral system
A ternary /ˈtɜːrnəri/ numeral system (also called base 3 or trinary) has three as its base. Analogous to a bit, a ternary digit is a trit (trinary digit). One trit is equivalent to log2 3 (about 1.584
Double-precision floating-point format
Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric
Reduction of summands
Reduction of summands is an algorithm for fast binary multiplication of non-signed binary integers. It is performed in three steps: production of summands, reduction of summands, and summation.
Negative flag
In a computer processor the negative flag or sign flag is a single bit in a system status (flag) register used to indicate whether the result of the last mathematical operation produced a value in whi
Carry operator
The carry operator, symbolized by the ¢ sign, is an abstraction of the operation of determining whether a portion of an adder network generates or propagates a carry. It is defined as follows: ¢
Modulo operation
In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another (called the modulus of the operation). Given two positive numbers a a
BKM algorithm
The BKM algorithm is a shift-and-add algorithm for computing elementary functions, first published in 1994 by Jean-Claude Bajard, Sylvanus Kla, and Jean-Michel Muller. BKM is based on computing comple
Barrett reduction
In modular arithmetic, Barrett reduction is a reduction algorithm introduced in 1986 by P.D. Barrett. A naive way of computing would be to use a fast division algorithm. Barrett reduction is an algori
Unit in the last place
In computer science and numerical analysis, unit in the last place or unit of least precision (ulp) is the spacing between two consecutive floating-point numbers, i.e., the value the least significant
INTLAB
INTLAB (INTerval LABoratory) is an interval arithmetic library using MATLAB and GNU Octave, available in Windows and Linux, macOS. It was developed by S.M. Rump from Hamburg University of Technology.
Bit manipulation
Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control,
Exponentiation by squaring
In mathematics and computer programming, exponentiating by squaring is a general method for fast computation of large positive integer powers of a number, or more generally of an element of a semigrou
GNU MPFR
The GNU Multiple Precision Floating-Point Reliable Library (GNU MPFR) is a GNU portable C library for arbitrary-precision binary floating-point computation with correct rounding, based on GNU Multi-Pr
Circular shift
In combinatorial mathematics, a circular shift is the operation of rearranging the entries in a tuple, either by moving the final entry to the first position, while shifting all other entries to the n
List of arbitrary-precision arithmetic software
This article lists libraries, applications, and other software which enable or support arbitrary-precision arithmetic.
5-3-1-1 code
No description available.
Adjust flag
The Adjust flag (AF) is a CPU flag in the FLAGS register of all x86-compatible CPUs, and the preceding 8080-family; it is also called the Auxiliary flag and the Auxiliary Carry flag (AC, though this m
Binary number
A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method of mathematical expression which uses only two symbols: typically "0" (zero) and "1" (one). The ba
Significand
The significand (also mantissa or coefficient, sometimes also argument, or ambiguously fraction or characteristic) is part of a number in scientific notation or in floating-point representation, consi
Arithmetic overflow
No description available.
Signedness
In computing, signedness is a property of data types representing numbers in computer programs. A numeric variable is signed if it can represent both positive and negative numbers, and unsigned if it
IEEE 854-1987
The IEEE Standard for Radix-Independent Floating-Point Arithmetic (IEEE 854), was the first Institute of Electrical and Electronics Engineers (IEEE) international standard for floating-point arithmeti
Jump-at-2 code
No description available.
Radix point
No description available.
Fetch-and-add
In computer science, the fetch-and-add CPU instruction (FAA) atomically increments the contents of a memory location by a specified value. That is, fetch-and-add performs the operation increment the v
Integer overflow
In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits –
White code
No description available.
Balanced ternary
Balanced ternary is a ternary numeral system (i.e. base 3 with three digits) that uses a balanced signed-digit representation of the integers in which the digits have the values −1, 0, and 1. This sta
Hamming(7,4)
In coding theory, Hamming(7,4) is a linear error-correcting code that encodes four bits of data into seven bits by adding three parity bits. It is a member of a larger family of Hamming codes, but the
Dadda multiplier
The Dadda multiplier is a hardware binary multiplier design invented by computer scientist Luigi Dadda in 1965. It uses a selection of full and half adders to sum the partial products in stages (the D
Negative base
A negative base (or negative radix) may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the sys
CORDIC
CORDIC (for "coordinate rotation digital computer"), also known as Volder's algorithm, or: Digit-by-digit method Circular CORDIC (Jack E. Volder), Linear CORDIC, Hyperbolic CORDIC (John Stephen Walthe
Jump-at-8 code
No description available.
Floating-point arithmetic
In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of
Multiply–accumulate operation
In computing, especially digital signal processing, the multiply–accumulate (MAC) or multiply-add (MAD) operation is a common step that computes the product of two numbers and adds that product to an
GNU Multiple Precision Arithmetic Library
GNU Multiple Precision Arithmetic Library (GMP) is a free library for arbitrary-precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers. There are no practical
Division by zero
In mathematics, division by zero is division where the divisor (denominator) is zero. Such a division can be formally expressed as , where a is the dividend (numerator). In ordinary arithmetic, the ex
Floating-point unit
A floating-point unit (FPU, colloquially a math coprocessor) is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtr
Integer set library
isl (integer set library) is a portable C library for manipulating sets and relations of integer points bounded by linear constraints. The following operations are supported: * intersection, union, s
Skew binary number system
The skew binary number system is a non-standard positional numeral system in which the nth digit contributes a value of times the digit (digits are indexed from 0) instead of times as they do in binar
Sign bit
In computer science, the sign bit is a bit in a signed number representation that indicates the sign of a number. Although only signed numeric data types have a sign bit, it is invariably located in t
Mixed-precision arithmetic
Mixed-precision arithmetic is a form of floating-point arithmetic that uses numbers with varying widths in a single operation.
Machine epsilon
Machine epsilon or machine precision is an upper bound on the relative approximation error due to rounding in floating point arithmetic. This value characterizes computer arithmetic in the field of nu
Integer (computer science)
In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be a
Pairwise summation
In numerical analysis, pairwise summation, also called cascade summation, is a technique to sum a sequence of finite-precision floating-point numbers that substantially reduces the accumulated round-o
NaN
In computing, NaN (/næn/), standing for Not a Number, is a member of a numeric data type that can be interpreted as a value that is undefined or unrepresentable, especially in floating-point arithmeti
4-2-2-1 code
No description available.
Binade
In software engineering, a binade is a set of numbers in a binary IEEE 754 floating-point format that all have the same exponent. In other words, a binade is the interval [2n, 2n+1) for some integer v
Interval contractor
In mathematics, an interval contractor (or contractor for short) associated to a set X is an operator C which associates to a box [x] in Rn another box C([x]) of Rn such that the two following propert
IEEE 754
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). Th
Execution unit
In computer engineering, an execution unit (E-unit or EU) is a part of the central processing unit (CPU) that performs the operations and calculations as instructed by the computer program. It may hav
Test and test-and-set
In computer science, the test-and-set CPU instruction is used to implement mutual exclusion in multiprocessor environments. Although a correct lock can be implemented with test-and-set, it can lead to
2-4-2-1 code
No description available.
8-4-2-1 code
No description available.
Microsoft Binary Format
In computing, Microsoft Binary Format (MBF) is a format for floating-point numbers which was used in Microsoft's BASIC language products, including MBASIC, GW-BASIC and QuickBASIC prior to version 4.0
Zero flag
The zero flag is a single bit flag that is a central feature on most conventional CPU architectures (including x86, ARM, PDP-11, 68000, 6502, and numerous others). It is often stored in a dedicated re
Arithmetic logic unit
In computing, an arithmetic logic unit (ALU) is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit
Normal number (computing)
In computing, a normal number is a non-zero number in a floating-point representation which is within the balanced range supported by a given floating-point format: it is a floating point number that
Signed zero
Signed zero is zero with an associated sign. In ordinary arithmetic, the number 0 does not have a sign, so that −0, +0 and 0 are identical. However, in computing, some number representations allow for
Residue number system
A residue numeral system (RNS) is a numeral system representing integers by their values modulo several pairwise coprime integers called the moduli. This representation is allowed by the Chinese remai
Bi-quinary coded decimal
Bi-quinary coded decimal is a numeral encoding scheme used in many abacuses and in some early computers, including the Colossus. The term bi-quinary indicates that the code comprises both a two-state
Rounding
Rounding means replacing a number with an approximate value that has a shorter, simpler, or more explicit representation. For example, replacing $23.4476 with $23.45, the fraction 312/937 with 1/3, or
Half-carry flag
A half-carry flag (also known as an auxiliary flag or decimal adjust flag) is a condition flag bit in the status register of many CPU families, such as the Intel 8080, Zilog Z80, the x86, and the Atme
Method of complements
In mathematics and computing, the method of complements is a technique to encode a symmetric range of positive and negative integers in a way that they can use the same algorithm (hardware) for additi
Division by infinity
In mathematics, division by infinity is division where the divisor (denominator) is infinity. In ordinary arithmetic, this does not have a well-defined meaning, since infinity is a mathematical concep
Exponent bias
In IEEE 754 floating-point numbers, the exponent is biased in the engineering sense of the word – the value stored is offset from the actual value by the exponent bias, also called a biased exponent.B
Test-and-set
In computer science, the test-and-set instruction is an instruction used to write (set) 1 to a memory location and return its old value as a single atomic (i.e., non-interruptible) operation. The call
Binary integer decimal
The IEEE 754-2008 standard includes decimal floating-point number formats in which the significand and the exponent (and the payloads of NaNs) can be encoded in two ways, referred to as binary encodin
Decimal floating point
Decimal floating-point (DFP) arithmetic refers to both a representation and operations on decimal floating-point numbers. Working directly with decimal (base-10) fractions can avoid the rounding error
Block floating point
Block floating point (BFP) is a method used to provide an arithmetic approaching floating point while using a fixed-point processor. BFP assigns a group of significands (the non-exponent part of the f
Karlsruhe Accurate Arithmetic
Karlsruhe Accurate Arithmetic (KAA) or Karlsruhe Accurate Arithmetic Approach (KAAA), augments conventional floating-point arithmetic with good error behaviour with new operations to calculate scalar
MPIR (mathematics software)
Multiple Precision Integers and Rationals (MPIR) is an open-source software multiprecision integer library forked from the GNU Multiple Precision Arithmetic Library (GMP) project. It consists of much
Base36
Base36 is a binary-to-text encoding scheme that represents binary data in an ASCII string format by translating it into a radix-36 representation. The choice of 36 is convenient in that the digits can
Computer number format
A computer number format is the internal representation of numeric values in digital device hardware and software, such as in programmable computers and calculators. Numerical values are stored as gro
Symmetric level-index arithmetic
The level-index (LI) representation of numbers, and its algorithms for arithmetic operations, were introduced by Charles Clenshaw and Frank Olver in 1984. The symmetric form of the LI system and its a
Gal's accurate tables
Gal's accurate tables is a method devised by Shmuel Gal to provide accurate values of special functions using a lookup table and interpolation. It is a fast and efficient method for generating values
Tapered floating point
In computing, tapered floating point (TFP) is a format similar to floating point, but with variable-sized entries for the significand and exponent instead of the fixed-length entries found in normal f
ISO/IEC 10967
ISO/IEC 10967, Language independent arithmetic (LIA), is a series ofstandards on computer arithmetic. It is compatible with ISO/IEC/IEEE 60559:2011,more known as IEEE 754-2008, and much of thespecific
Interval arithmetic
Interval arithmetic (also known as interval mathematics, interval analysis, or interval computation) is a mathematical technique used to put bounds on rounding errors and measurement errors in mathema
Binary-coded decimal
In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Some
Zero to the power of zero
Zero to the power of zero, denoted by 00, is a mathematical expression that is either defined as 1 or left undefined, depending on context. In algebra and combinatorics, one typically defines 00 = 1.
Floating-point error mitigation
Floating-point error mitigation is the minimization of errors caused by the fact that real numbers cannot, in general, be accurately represented in a fixed space. By definition, floating-point error c
ARITH Symposium on Computer Arithmetic
The IEEE International Symposium on Computer Arithmetic (ARITH) is a conference in the area of computer arithmetic. The symposium was established in 1969, initially as three-year event, then as abienn
Sterbenz lemma
In floating-point arithmetic, the Sterbenz lemma or Sterbenz's lemma is a theorem giving conditions under which floating-point differences are computed exactly.It is named after Pat H. Sterbenz, who p
Carry (arithmetic)
In elementary arithmetic, a carry is a digit that is transferred from one column of digits to another column of more significant digits. It is part of the standard algorithm to add numbers together by
Computron tube
The Computron was an electron tube designed to perform the parallel addition and multiplication of digital numbers. It was conceived by Richard L. Snyder, Jr., Jan A. Rajchman, Paul Rudnick and the di
Kahan summation algorithm
In numerical analysis, the Kahan summation algorithm, also known as compensated summation, significantly reduces the numerical error in the total obtained by adding a sequence of finite-precision floa
Kulisch arithmetic
No description available.
IBM hexadecimal floating-point
Hexadecimal floating point (now called HFP by IBM) is a format for encoding floating-point numbers first introduced on the IBM System/360 computers, and supported on subsequent machines based on that
IEEE 1788-2015
No description available.
5-2-2-1 code
No description available.
Minifloat
In computing, minifloats are floating-point values represented with very few bits. Predictably, they are not well suited for general-purpose numerical calculations. They are used for special purposes,
5-4-2-1 code
No description available.
2Sum
2Sum is a floating-point algorithm for computing the exact round-off error in a floating-point addition operation. 2Sum and its variant Fast2Sum were first published by Møller in 1965.Fast2Sum is ofte
Two-out-of-five code
A two-out-of-five code is a constant-weight code that provides exactly ten possible combinations of two bits, and is thus used for representing the decimal digits using five bits. Each bit is assigned
Montgomery modular multiplication
In modular arithmetic computation, Montgomery modular multiplication, more commonly referred to as Montgomery multiplication, is a method for performing fast modular multiplication. It was introduced
Subnormal number
In computer science, subnormal numbers are the subset of denormalized numbers (sometimes called denormals) that fill the underflow gap around zero in floating-point arithmetic. Any non-zero number wit
Arithmetic underflow
The term arithmetic underflow (also floating point underflow, or just underflow) is a condition in a computer program where the result of a calculation is a number of more precise absolute value than
Logarithmic number system
A logarithmic number system (LNS) is an arithmetic system used for representing real numbers in computer and digital hardware, especially for digital signal processing.
Decimal32 floating-point format
In computing, decimal32 is a decimal floating-point computer numbering format that occupies 4 bytes (32 bits) in computer memory.It is intended for applications where it is necessary to emulate decima
Arbitrary-precision arithmetic
In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed
Decimal64 floating-point format
In computing, decimal64 is a decimal floating-point computer numbering format that occupies 8 bytes (64 bits) in computer memory.It is intended for applications where it is necessary to emulate decima
Aiken code
The Aiken code (also known as 2421 code) is a complementary binary-coded decimal (BCD) code. A group of four bits is assigned to the decimal digits from 0 to 9 according to the following table. The co
Signed number representations
In computing, signed number representations are required to encode negative numbers in binary number systems. In mathematics, negative numbers in any base are represented by prefixing them with a minu
Wallace tree
A Wallace multiplier is a hardware implementation of a binary multiplier, a digital circuit that multiplies two integers. It uses a selection of full and half adders (the Wallace tree or Wallace reduc
Fixed-point arithmetic
In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with
IEEE P1788
No description available.
Overflow flag
In computer processors, the overflow flag (sometimes called the V flag) is usually a single bit in a system status register used to indicate when an arithmetic overflow has occurred in an operation, i
IEEE 754-2008 revision
IEEE 754-2008 (previously known as IEEE 754r) was published in August 2008 and is a significant revision to, and replaces, the IEEE 754-1985 floating-point standard, while in 2019 it was updated with
Serial decimal
In computers, a serial decimal numeric representation is one in which ten bits are reserved for each digit, with a different bit turned on depending on which of the ten possible digits is intended. EN
IEEE 754-1985
IEEE 754-1985 was an industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revisio