http://floating-point-gui.de aims to provide both short and simple
answers to the common recurring questions of novice programmers
about floating-point numbers not "adding up" correctly, and
more in-depth information about how IEEE 754 floats work,
when and how to use them correctly, and what to use instead
when they are not appropriate.

The site is built using the nanoc static site generator:
http://nanoc.stoneship.org/ (Requires the kramdown and adsf gems)

and published under the Creative Commons Attribution License (BY):
http://creativecommons.org/licenses/by/3.0/
#!/usr/bin/env ruby

# A few helpful tips about the Rules file:
#
# * The order of rules is important: for each item, only the first matching
#   rule is applied.
#
# * Item identifiers start and end with a slash (e.g. “/about/” for the file
#   “content/about.html”). To select all children, grandchildren, … of an
#   item, use the pattern “/about/*/”; “/about/*” will also select the parent,
#   because “*” matches zero or more characters.

compile '*' do
  case item[:extension]
  when 'html'
    filter :erb
    filter :kramdown
    layout 'default'
  end
end

route '*' do
  case item[:extension]
  when 'html'
    item.identifier + 'index.html'
  else
    item.identifier.chop + '.' + item[:extension]
  end
end

layout '*', :erb
# A list of file extensions that nanoc will consider to be textual rather than
# binary. If an item with an extension not in this list is found, the file
# will be considered as binary.
text_extensions: [ 'css', 'erb', 'haml', 'htm', 'html', 'js', 'less', 'markdown', 'md', 'php', 'rb', 'sass', 'txt' ]

# The path to the directory where all generated files will be written to. This
# can be an absolute path starting with a slash, but it can also be a path
# relative to the site directory.
output_dir: ../output

# A list of index filenames, i.e. names of files that will be served by a web
# server when a directory is requested. Usually, index files are named
# “index.html”, but depending on the web server, this may be something else,
# such as “default.htm”. This list is used by nanoc to generate pretty URLs.
index_filenames: [ 'index.html' ]

# Whether or not to generate a diff of the compiled content when compiling a
# site. The diff will contain the differences between the compiled content
# before and after the last site compilation.
enable_output_diff: false

# The data sources where nanoc loads its data from. This is an array of
# hashes; each array element represents a single data source. By default,
# there is only a single data source that reads data from the “content/” and
# “layouts/” directories in the site directory.
data_sources:
  -
    # The type is the identifier of the data source. By default, this will be
    # `filesystem_unified`.
    type: filesystem_unified

    # The path where items should be mounted (comparable to mount points in
    # Unix-like systems). This is “/” by default, meaning that items will have
    # “/” prefixed to their identifiers. If the items root were “/en/”
    # instead, an item at content/about.html would have an identifier of
    # “/en/about/” instead of just “/about/”.
    items_root: /

    # The path where layouts should be mounted. The layouts root behaves the
    # same as the items root, but applies to layouts rather than items.
    layouts_root: /
-44
content/basic.html
---
title: Basic Answers
description: Concise answers to common basic questions about floating-point math, like "Why don't my numbers add up?"
---

### Why don't my numbers, like 0.1 + 0.2, add up to a nice round 0.3, and instead I get a weird result like 0.30000000000000004?

Because internally, computers use a format ([binary](/formats/binary/) [floating-point](/formats/fp/)) that
cannot accurately represent a number like 0.1, 0.2 or 0.3 *at all*.

When the code is compiled or interpreted, your "0.1" is already
rounded to the nearest number in that format, which results
in a small [rounding error](/errors/rounding/) even before the calculation happens.
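
You can check this yourself by asking for the exact value that was stored. The following is a minimal Python sketch, assuming that Python's `float` is an IEEE 754 double (which is the case on common platforms); the variable names are made up for illustration:

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, without rounding.
stored_tenth = Decimal(0.1)
print(stored_tenth)        # a long number slightly larger than 0.1

total = 0.1 + 0.2
print(total == 0.3)        # False
print(total)               # 0.30000000000000004
```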

### Why do computers use such a stupid system?

It's not stupid, just different. Decimal numbers cannot accurately
represent a number like 1/3, so you have to round to something like
0.33 - and you don't expect 0.33 + 0.33 + 0.33 to add up to 1, either - do you?

Computers use [binary numbers](/formats/binary/) because they're faster at dealing with
those, and because for most calculations, a tiny error in the 17th
decimal place doesn't matter at all since the numbers you work with
aren't round (or that precise) anyway.

### What can I do to avoid this problem?

That depends on what kind of calculations you're doing.

* If you really need your results to add up exactly, especially when you work with money: use a special [decimal datatype](/formats/exact/).
* If you just don't want to see all those extra decimal places: simply format your result rounded to a fixed number of decimal places when displaying it.
* If you have no decimal datatype available, an alternative is to work with [integers](/formats/integer/), e.g. do money calculations entirely in cents. But this is more work and has some drawbacks.
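
The three options above could look roughly like this in Python. This is only a sketch: it uses the standard `decimal` module as the decimal datatype, and the variable names are invented for the example.

```python
from decimal import Decimal

# 1. A decimal type adds up exactly (build it from strings, not from floats).
exact_sum = Decimal("0.1") + Decimal("0.2")

# 2. Formatting only changes what is displayed; the stored float is unchanged.
shown = f"{0.1 + 0.2:.2f}"

# 3. Integer cents: 10 cents + 20 cents, converted only for display.
total_cents = 10 + 20
as_dollars = f"{total_cents // 100}.{total_cents % 100:02d}"

print(exact_sum, shown, as_dollars)
```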

### Why do other calculations like 0.1 + 0.4 work correctly?

In that case, the result (0.5) *can* be represented exactly as a floating-point number,
and it's possible for rounding errors in the input numbers to cancel each other out -
but that can't necessarily be relied upon (e.g. when those two numbers
were stored in differently sized floating-point representations first, the rounding
errors might not offset each other).

In other cases like 0.1 + 0.3, the result actually isn't *really* 0.4, but close enough that 0.4
is the shortest decimal number that is closer to the result than to any other floating-point number. Many languages then display that short number instead of converting the actual result back to the closest
decimal fraction.
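
Both cases can be observed in Python, again assuming its `float` is an IEEE 754 double. `repr` shows the short form that is normally displayed, while `Decimal` reveals the value that is really stored:

```python
from decimal import Decimal

half = 0.1 + 0.4          # the input errors cancel out here
four_tenths = 0.1 + 0.3   # displayed as 0.4, but stored slightly differently

print(Decimal(half))          # 0.5
print(repr(four_tenths))      # 0.4
print(Decimal(four_tenths))   # the stored value, not exactly 0.4
```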
---
title: Comparison
description: Explanation of the various pitfalls in comparing floating-point numbers.
---

Due to rounding errors, most [floating-point](/formats/fp/) numbers end up being slightly imprecise. As long as this
imprecision stays small, it can usually be ignored. However, it also means that numbers expected
to be equal (e.g. when calculating the same result through different correct methods) often differ
slightly, and a simple equality test fails. For example:

    float a = 0.15 + 0.15
    float b = 0.1 + 0.2
    if(a == b) // can be false!
    if(a >= b) // can also be false!
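
Here is the same experiment as a runnable Python sketch. Note the assumption: Python's `float` is a 64-bit double, so the outcome shown here applies to doubles and may differ for 32-bit floats.

```python
a = 0.15 + 0.15
b = 0.1 + 0.2

print(a, b)      # 0.3 0.30000000000000004
print(a == b)    # False
print(a >= b)    # False
```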

Don't use absolute error margins
--------------------------------
The solution is to check not whether the numbers are exactly the same, but whether their difference is
very small. The error margin that the difference is compared to is often called *epsilon*.
The simplest form:

    if( Math.abs(a-b) < 0.00001) // wrong - don't do this

This is a bad way to do it because a fixed epsilon chosen because it "looks small" could actually be way too
large when the numbers being compared are very small as well. The comparison would return "true" for numbers that are quite different. And when the numbers are very large, the epsilon
could end up being smaller than the smallest rounding error, so that the comparison always returns "false".
Therefore, it is necessary to see whether the *relative error* is smaller than epsilon:

    if( Math.abs((a-b)/b) < 0.00001 ) // still not right!
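
To see both failure modes of a fixed epsilon, consider this Python sketch. The helper `absolutely_close` is made up for the example, and the numbers assume 64-bit doubles, where neighbouring values near 10<sup>12</sup> are 2<sup>-13</sup> apart:

```python
EPSILON = 0.00001

def absolutely_close(a: float, b: float) -> bool:
    """The naive check from above, with a fixed absolute epsilon."""
    return abs(a - b) < EPSILON

# Tiny numbers: these differ by a factor of 1000, yet count as "equal".
print(absolutely_close(1e-9, 1e-6))      # True

# Huge numbers: two neighbouring doubles already differ by more than epsilon.
big = 1e12
next_big = big + 2**-13                  # the next representable double
print(absolutely_close(big, next_big))   # False
```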

Look out for edge cases
-----------------------
There are some important special cases where this will fail:

* When both `a` and `b` are zero. `0.0/0.0` is "not a number", which causes an exception on some platforms, or returns false for all comparisons.
* When only `b` is zero, the division yields "infinity", which may also cause an exception, or is greater than epsilon even when `a` is smaller.
* It returns `false` when both `a` and `b` are very small but on opposite sides of zero, even when they're the smallest possible non-zero numbers.

Also, the result is not commutative (`nearlyEqual(a,b)` is not always the same as `nearlyEqual(b,a)`). To fix these problems, the code has to get a lot more complex, so we really need to put it into a function of its own:

    public static boolean nearlyEqual(float a, float b, float epsilon)
    {
        final float absA = Math.abs(a);
        final float absB = Math.abs(b);
        final float diff = Math.abs(a - b);

        if (a == b) { // shortcut, handles infinities
            return true;
        } else if (a * b == 0) { // a or b or both are zero
            // relative error is not meaningful here
            return diff < (epsilon * epsilon);
        } else { // use relative error
            return diff / (absA + absB) < epsilon;
        }
    }

This method [passes tests](../NearlyEqualsTest.java) for many important special cases, but as you can see, it
uses some quite non-obvious logic. In particular, it has to use a completely different definition of error margin
when `a` or `b` is zero, because the classical definition of relative error becomes meaningless in those cases.

There are some cases where the method above still produces unexpected results (in particular, it's much stricter when one value is nearly zero than when it is exactly zero), and some of the tests
it was developed to pass probably specify behaviour that is not appropriate for some applications. Before using it, make sure it's appropriate for your application!
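
To experiment with that behaviour, here is a straight Python port of the Java method. It is a sketch for trying things out, not a replacement: Python's `float` is a double rather than a 32-bit float, and the default epsilon is just an assumption.

```python
def nearly_equal(a: float, b: float, epsilon: float = 1e-5) -> bool:
    """Python port of the Java nearlyEqual method shown above."""
    diff = abs(a - b)
    if a == b:                # shortcut, handles infinities
        return True
    if a * b == 0:            # a or b or both are zero
        return diff < epsilon * epsilon
    return diff / (abs(a) + abs(b)) < epsilon   # relative error

print(nearly_equal(0.1 + 0.2, 0.3))   # True
print(nearly_equal(0.0, 1e-11))       # True: compared against epsilon squared
print(nearly_equal(1e-12, 2e-12))     # False: the relative error is large
```

The last two calls show the caveat mentioned above: a value of exactly zero is treated much more leniently than one that is merely very close to zero.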

Comparing floating-point values as integers
-------------------------------------------
There is an alternative to heaping conceptual complexity onto such an apparently simple task: instead of comparing `a` and `b` as [real numbers](http://en.wikipedia.org/wiki/Real_numbers), we can think about them as discrete steps and define the error margin as the maximum number of possible floating-point values between the two values.

This is conceptually very clear and easy and has the advantage of implicitly scaling the relative error margin with the magnitude of the values. Technically, it's a bit more complex, but not as much as you might think, because IEEE 754 floats are designed to maintain their order when their bit patterns are interpreted as integers.

However, this method does require the programming language to support conversion between floating-point values and integer bit patterns. Read the [Comparing floating-point numbers](/references/) paper for more details.
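
As a rough illustration of the idea (not the code from the referenced paper), the following Python sketch uses the standard `struct` module to get at the bit pattern of a 64-bit double. The function names and the default of 4 steps are assumptions made for this example.

```python
import math
import struct

def ordered_bits(x: float) -> int:
    """Map a double to an integer whose ordering matches the float ordering."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are stored as sign + magnitude, so mirror them below zero.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def nearly_equal_steps(a: float, b: float, max_steps: int = 4) -> bool:
    """True if at most max_steps representable values separate a and b."""
    if math.isnan(a) or math.isnan(b):
        return False
    return abs(ordered_bits(a) - ordered_bits(b)) <= max_steps

print(ordered_bits(0.1 + 0.2) - ordered_bits(0.3))   # 1
print(nearly_equal_steps(0.1 + 0.2, 0.3))            # True
print(nearly_equal_steps(1.0, 1.0001))               # False
```

In this sketch, positive and negative zero map to the same integer, while the infinities sit just one step beyond the largest finite values, so you may want to handle them separately.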
-29
content/errors/propagation.html
---
title: Error Propagation
description: Explanations about propagation of errors in floating-point math.
---

While the errors in single [floating-point numbers](/formats/fp/) are very small, even simple calculations on them
can contain pitfalls that increase the error in the result way beyond just having the individual
errors "add up".

In general:

* Multiplication and division are "safe" operations
* Addition and subtraction are dangerous, because when numbers of different magnitudes are involved,
  digits of the smaller-magnitude number are lost.
* This loss of digits can be inevitable and benign (when the lost digits are also insignificant for
  the final result) or catastrophic (when the loss is magnified and distorts the result strongly).
* The more calculations are done (especially when they form an iterative algorithm) the more important
  it is to consider this kind of problem.
* A method of calculation can be *stable* (meaning that it tends to reduce rounding errors)
  or *unstable* (meaning that rounding errors are magnified). Very often, there are both stable
  and unstable solutions for a problem.
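
A few of these effects can be reproduced in a couple of lines. This Python sketch assumes 64-bit doubles; `math.fsum` is the standard library's error-compensating sum and stands in here for "a more stable method", the other names are made up.

```python
import math

# Lost digits: the 1.0 is too small to register next to 1e16.
print(1e16 + 1.0 == 1e16)     # True

# Magnified loss: subtracting nearly equal numbers exposes the earlier rounding.
recovered = (1.0 + 1e-15) - 1.0
relative_error = abs(recovered - 1e-15) / 1e-15
print(relative_error)         # more than 10%

# Unstable vs. stable: adding 0.1 ten times in a loop vs. using math.fsum.
naive = 0.0
for _ in range(10):
    naive += 0.1
stable = math.fsum([0.1] * 10)
print(naive == 1.0, stable == 1.0)   # False True
```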

There is an entire sub-field of mathematics (in [numerical analysis](http://en.wikipedia.org/wiki/Numerical_analysis)) devoted to studying the numerical stability
of algorithms. For doing complex calculations involving floating-point numbers, it is absolutely
necessary to have some understanding of this discipline.

The article [What Every Computer Scientist Should Know About Floating-Point Arithmetic](/references/) gives a detailed introduction,
and served as an inspiration for creating this website, mainly due to being a bit too detailed and
intimidating to programmers without a scientific background.
-58
content/errors/rounding.html
---
title: Rounding Errors
description: Explanation of the reasons for rounding errors in floating-point math, and of rounding modes.
---

Because [floating-point numbers](/formats/fp/) have a limited number of digits, they cannot represent all
[real numbers](http://en.wikipedia.org/wiki/Real_number) accurately: when there
are more digits than the format allows, the leftover ones are omitted - the number is
*rounded*. There are three reasons why this can be necessary:

* **Large Denominators**
  In any base, the larger the denominator of an (irreducible) fraction, the more digits it needs in
  positional notation. A sufficiently large denominator will require rounding, no
  matter what the base or number of available digits is. For example, 1/1000
  cannot be accurately represented in fewer than 3 decimal digits, nor can any
  multiple of it (that does not allow simplifying the fraction).
* **Periodic digits**
  Any (irreducible) fraction where the denominator has a prime factor that does not occur in the base
  requires an infinite number of digits that repeat periodically after a certain point.
  For example, in decimal 1/4, 3/5 and 8/20 are finite, because 2 and
  5 are the prime factors of 10. But 1/3 is not finite, nor is 2/3 or 1/7 or 5/6, because 3
  and 7 are not factors of 10. Fractions with a prime factor of 5 in the denominator
  can be finite in base 10, but [not in base 2](/formats/binary/) - the biggest source of confusion for most
  novice users of floating-point numbers.
* **Non-rational numbers**
  Non-rational numbers cannot be represented as a regular fraction at all, and in
  positional notation (no matter what base) they require an infinite number of non-recurring digits.
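
The rule about prime factors can be turned into a small test. The following Python sketch uses a hypothetical helper named `terminates`; it relies on `fractions.Fraction` to reduce the fraction first and then strips every factor shared with the base from the denominator.

```python
from fractions import Fraction
from math import gcd

def terminates(value: Fraction, base: int) -> bool:
    """True if value has a finite number of digits in the given base."""
    denominator = value.denominator      # Fraction is always in lowest terms
    shared = gcd(denominator, base)
    while shared > 1:
        while denominator % shared == 0:
            denominator //= shared       # remove factors that occur in the base
        shared = gcd(denominator, base)
    return denominator == 1              # anything left over never terminates

print(terminates(Fraction(8, 20), 10))   # True
print(terminates(Fraction(5, 6), 10))    # False
print(terminates(Fraction(1, 10), 2))    # False
```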

Rounding modes
--------------
There are different methods to do rounding, and this can be very important in programming,
because rounding can cause different problems in various contexts that can be addressed by
using a better rounding mode. The most common rounding modes are:

* **Rounding towards zero** - simply truncate the extra digits. The
  simplest method, but it introduces larger errors than necessary as well
  as a bias towards zero when dealing with mainly positive or mainly
  negative numbers.
* **Rounding half away from zero** - if the truncated fraction is greater than or equal to half the base,
  increase the last remaining digit. This is the method generally taught in school and used by most
  people. It minimizes errors, but also introduces a bias (away from zero).
* **Rounding half to even** also known as **banker's rounding** - if the truncated fraction is
  greater than half the base,
  increase the last remaining digit. If it is equal to half the base, increase the digit only
  if that produces an even result. This minimizes errors and bias, and is therefore preferred for bookkeeping.

Examples in base 10:

|      | Towards zero | Half away from zero | Half to even |
|------|--------------|---------------------|--------------|
| 1.4  | 1            | 1                   | 1            |
| 1.5  | 1            | 2                   | 2            |
| -1.6 | -1           | -2                  | -2           |
| 2.6  | 2            | 3                   | 3            |
| 2.5  | 2            | 3                   | 2            |
| -2.4 | -2           | -2                  | -2           |
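
The table can be reproduced with Python's standard `decimal` module, whose `ROUND_DOWN`, `ROUND_HALF_UP` and `ROUND_HALF_EVEN` constants correspond to the three modes. The `MODES` mapping and the helper name are just for this sketch.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, ROUND_HALF_EVEN

MODES = {
    "towards zero": ROUND_DOWN,
    "half away from zero": ROUND_HALF_UP,
    "half to even": ROUND_HALF_EVEN,
}

def round_to_integer(text: str, mode: str) -> int:
    """Round the decimal number given as text to an integer, using the named mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))

for value in ["1.4", "1.5", "-1.6", "2.6", "2.5", "-2.4"]:
    print(value, [round_to_integer(value, mode) for mode in MODES])
```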

More [rounding methods](http://en.wikipedia.org/wiki/Rounding) can be found on Wikipedia.
content/favicon.ico
favicon.ico
-86
content/formats/binary.html
---
title: Binary Fractions
description: In-depth explanation of how binary fractions work, what problems they cause and why they are used anyway
---

How they work
-------------
As a programmer, you should be familiar with the concept of binary integers, i.e.
the representation of integer numbers as a series of bits:

<table>
<tr><th colspan="9">Decimal (<span class="num_base">base 10</span>)</th><th> </th><th colspan="17">Binary (<span class="num_base">base 2</span>)</th></tr>
<tr class="base_example">
<td class="digit">1</td><td>⋅</td><td class="num_base">10<sup>1</sup></td><td>+</td>
<td class="digit">3</td><td>⋅</td><td class="num_base">10<sup>0</sup></td><td>=</td>
<td class="digit">13<sub class="num_base">10</sub></td><td class="separator">=</td>
<td class="digit">1101<sub class="num_base">2</sub></td><td>=</td>
<td class="digit">1</td><td>⋅</td><td class="num_base">2<sup>3</sup></td><td>+</td>
<td class="digit">1</td><td>⋅</td><td class="num_base">2<sup>2</sup></td><td>+</td>
<td class="digit">0</td><td>⋅</td><td class="num_base">2<sup>1</sup></td><td>+</td>
<td class="digit">1</td><td>⋅</td><td class="num_base">2<sup>0</sup></td>
</tr><tr class="base_example">
<td class="digit">1</td><td>⋅</td><td>10</td><td>+</td>
<td class="digit">3</td><td>⋅</td><td>1 </td><td>=</td>
<td class="digit">13<sub class="num_base">10</sub></td><td class="separator">=</td>
<td class="digit">1101<sub class="num_base">2</sub></td><td>=</td>
<td class="digit">1</td><td>⋅</td><td>8</td><td>+</td>
<td class="digit">1</td><td>⋅</td><td>4</td><td>+</td>
<td class="digit">0</td><td>⋅</td><td>2</td><td>+</td>
<td class="digit">1</td><td>⋅</td><td>1</td>
</tr></table>

This is how computers store integer numbers internally. And for fractional numbers in [positional notation](http://en.wikipedia.org/wiki/Positional_notation), they do the same thing:

<table>
<tr><th colspan="13">Decimal (<span class="num_base">base 10</span>)</th><th> </th><th colspan="13">Binary (<span class="num_base">base 2</span>)</th></tr>
<tr class="base_example">
<td class="digit">6</td><td>⋅</td><td class="num_base">10<sup>-1</sup></td><td>+</td>
<td class="digit">2</td><td>⋅</td><td class="num_base">10<sup>-2</sup></td><td>+</td>
<td class="digit">5</td><td>⋅</td><td class="num_base">10<sup>-3</sup></td><td>=</td>
<td class="digit">0.625<sub class="num_base">10</sub></td><td class="separator">=</td>
<td class="digit">0.101<sub class="num_base">2</sub></td><td>=</td>
<td class="digit">1</td><td>⋅</td><td class="num_base">2<sup>-1</sup></td><td>+</td>
<td class="digit">0</td><td>⋅</td><td class="num_base">2<sup>-2</sup></td><td>+</td>
<td class="digit">1</td><td>⋅</td><td class="num_base">2<sup>-3</sup></td>
</tr><tr class="base_example">
<td class="digit">6</td><td>⋅</td><td>1/10</td><td>+</td>
<td class="digit">2</td><td>⋅</td><td>1/100</td><td>+</td>
<td class="digit">5</td><td>⋅</td><td>1/1000</td><td>=</td>
<td class="digit">0.625<sub class="num_base">10</sub></td><td class="separator">=</td>
<td class="digit">0.101<sub class="num_base">2</sub></td><td>=</td>
<td class="digit">1</td><td>⋅</td><td>1/2</td><td>+</td>
<td class="digit">0</td><td>⋅</td><td>1/4</td><td>+</td>
<td class="digit">1</td><td>⋅</td><td>1/8</td>
</tr></table>

Problems
--------
While they work the same in principle, binary fractions are different from decimal fractions in what
numbers they can accurately represent with a given number of digits, and thus also in what numbers result in [rounding errors](/errors/rounding/):

Specifically, binary can only represent those numbers as a finite fraction where the denominator
is a power of 2. Unfortunately, this does not include most of the numbers that can be
represented as a finite fraction in base 10, like 0.1.

| Fraction | Base | Positional Notation | Rounded (towards zero) to 4 digits | Rounded value as fraction | Rounding error |
|----------|------|---------------------|------------------------------------|---------------------------|----------------|
| 1/10 | 10 | 0.1 | 0.1 | 1/10 | 0 |
| 1/3 | 10 | 0.<span class="over">3</span> | 0.3333 | 3333/10000 | 1/30000 |
| 1/2 | 2 | 0.1 | 0.1 | 1/2 | 0 |
| 1/10 | 2 | 0.0<span class="over">0011</span> | 0.0001 | 1/16 | 3/80 |

And this is how you already get a rounding error when you just *write down* a number like 0.1 and
run it through your interpreter or compiler. It's not as big as 3/80 and may be invisible because
computers cut off after 23 or 52 binary digits rather than 4. But the error is there and *will* cause
problems eventually if you just ignore it.
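
If you want to generate such binary expansions yourself, repeated doubling does the job: each doubling shifts the next binary digit in front of the point. The following Python sketch uses exact `fractions.Fraction` arithmetic; `binary_digits` is a made-up helper name, and the last lines assume that Python's `float` is a 64-bit double.

```python
from fractions import Fraction

def binary_digits(value: Fraction, count: int) -> str:
    """Return the first count binary digits after the point, for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2
        bit = int(value)          # the integer part is the next binary digit
        digits.append(str(bit))
        value -= bit
    return "".join(digits)

print("0." + binary_digits(Fraction(5, 8), 3))     # 0.101
print("0." + binary_digits(Fraction(1, 10), 12))   # 0.000110011001

# What a double really stores for 0.1: a fraction over a power of 2.
stored = Fraction(0.1)
print(stored.denominator == 2**55)                 # True
print(stored - Fraction(1, 10))                    # the tiny rounding error
```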


Why use Binary?
---------------
At the lowest level, computers are based on billions of electrical elements that have only two states (usually low and high voltage). By interpreting these as 0 and 1, it's very easy to build circuits for storing binary numbers and doing calculations with them.

While it's possible to simulate the behaviour of decimal numbers with binary circuits as well, it's less efficient. If computers used decimal numbers internally, they'd have less memory and be slower at the same level of technology.

Since the difference in behaviour between binary and decimal numbers is not important for most applications, the logical choice is to build computers based on binary numbers and live with the fact
that some extra care and effort are necessary for applications that require [decimal-like behaviour](/formats/exact/).
-36
content/formats/exact.html
---
title: Exact Types
description: Description of various datatypes that can be more exact than floating-point numbers
---

While [binary](/formats/binary/) [floating-point](/formats/fp/) numbers are better for computers to work with, and usually good enough for humans, sometimes they are just not appropriate. Sometimes, the numbers really must add up to the last bit, and no technical excuses are acceptable - usually when the calculations involve money.

Unfortunately, there is no dominating standard like IEEE 754 for this (the 2008 version of the standard added decimal types, which is too recent to have seen widespread adoption).
Each language or platform has its own solution, sometimes multiple different ones. For details, look at the language cheat sheets.

There are at least three fundamentally different kinds of such types:

Limited-Precision Decimal
-------------------------
Basically the same as an IEEE 754 binary floating-point, except that the exponent is interpreted as base 10. As a result, there are no unexpected [rounding errors](/errors/rounding/). Also, this kind of format is relatively compact and fast, but usually slower than binary formats.

Arbitrary-Precision Decimal
---------------------------
Sometimes called "bignum", this is similar to a limited-precision type, but has the ability to increase the length of the significand (possibly also the exponent) as required. The downside is that there is some basic overhead (memory and speed) to support this flexibility, and that the longer the significand gets, the more
memory is needed and the slower all calculations become.

It can be very tempting to say "My calculation is important, so I need as much precision as possible", but in practice the actual importance of precision at the 10,000th decimal digit quickly pales in comparison with the
performance penalty required to support it.
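
Python's standard `decimal` module is one example of such a decimal type with a configurable precision. The sketch below only illustrates the two ideas above; the precisions 7 and 50 are arbitrary choices for the example.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 7                        # limited precision: 7 significant digits
    limited = Decimal(1) / Decimal(3)

with localcontext() as ctx:
    ctx.prec = 50                       # more digits on request, at a cost
    extended = Decimal(1) / Decimal(3)

exact_sum = Decimal("0.1") + Decimal("0.2")   # decimal inputs add up exactly

print(limited)     # 0.3333333
print(extended)
print(exact_sum)   # 0.3
```

Note that 1/3 still has to be rounded in both cases: a decimal type avoids the *unexpected* errors for decimal inputs, not rounding in general.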

Symbolic calculations
---------------------
The "holy grail" of exact calculations. Achieved by writing a program that actually knows all the rules of math and represents data as *symbols* rather than imprecise, rounded numbers. For example:

* 1/3 is actually a fraction "one divided by three"
* The square root of 2 is really the number that, multiplied by itself, is *exactly* 2
* Even [transcendental numbers](http://en.wikipedia.org/wiki/Transcendental_numbers) like **e** and **π** are known, together with their properties, so that e<sup>iπ</sup> is *exactly* equal to -1.

However, these symbolic math systems are complex, slow and require significant mathematical knowledge to use.
They are invaluable tools for mathematicians, but not appropriate for most everyday programming tasks. And
even many mathematicians work on problems where imprecise, numerical solutions are better because no
symbolic solution is known.
-60
content/formats/fp.html
---
title: Floating Point Numbers
description: Explanation of how floating-point numbers work and what they are good for
---

Why floating-point numbers are needed
-------------------------------------

Since computer memory is limited, you cannot store numbers with infinite precision, no matter whether you use [binary fractions](/formats/binary/) or decimal ones: at some point you have to cut off. But how much accuracy is needed? And *where* is it needed? How many integer digits and how many fraction digits?

* To an engineer building a highway, it does not matter whether it's 10 meters or 10.0001 meters wide - his measurements are probably not that accurate in the first place.
* To someone designing a microchip, 0.0001 meters (a tenth of a millimeter) is a *huge* difference - but he'll never have to deal with a distance larger than 0.1 meters.
* A physicist needs to use the [speed of light](http://en.wikipedia.org/wiki/Speed_of_light) (about 300000000) and [Newton's gravitational constant](http://en.wikipedia.org/wiki/Gravitational_constant) (about 0.0000000000667) together in the same calculation.

To satisfy the engineer and the chip designer, a number format has to provide accuracy for numbers at very different magnitudes. However, only *relative* accuracy is needed. To satisfy the physicist, it must be possible to do calculations that involve numbers with different magnitudes.

Basically, having a fixed number of integer and fractional digits is not useful - and the solution is a format with a *floating point*.

How floating-point numbers work
-------------------------------
The idea is to compose a number of two main parts:

* A **significand** that contains the number's digits. Negative significands represent negative numbers.
* An **exponent** that says where the decimal (or binary) point is placed relative to the beginning of the significand. Negative exponents represent numbers that are very small (i.e. close to zero).

Such a format satisfies all the requirements:

* It can represent numbers at wildly different magnitudes (limited by the length of the exponent)
* It provides the same relative accuracy at all magnitudes (limited by the length of the significand)
* It allows calculations across magnitudes: multiplying a very large and a very small number preserves the accuracy of both in the result.

Decimal floating-point numbers usually take the form of [scientific notation](http://en.wikipedia.org/wiki/Scientific_notation) with an
explicit point always between the 1st and 2nd digits. The exponent is
either written explicitly including the base, or an **e** is used to
separate it from the significand.

| Significand | Exponent | Scientific notation | Fixed-point value |
|-------------|----------|---------------------|-------------------|
| 1.5 | 4 | 1.5 ⋅ 10<sup>4</sup> | 15000 |
| -2.001 | 2 | -2.001 ⋅ 10<sup>2</sup> | -200.1 |
| 5 | -3 | 5 ⋅ 10<sup>-3</sup> | 0.005 |
| 6.667 | -11 | 6.667e-11 | 0.00000000006667 |
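
Most programming languages accept the **e** notation directly in source code and can also produce it when formatting output. A short Python sketch using the values from the table:

```python
# Literals with an "e" are read as significand and base-10 exponent.
print(1.5e4 == 15000.0)      # True
print(float("-2.001e2"))     # -200.1

# The "e" format specifier converts back to scientific notation.
print(f"{0.005:.0e}")        # 5e-03
print(f"{15000.0:.1e}")      # 1.5e+04
```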

The standard
------------
Nearly all hardware and programming languages use floating-point numbers in the same binary formats, which are defined in the [IEEE 754](http://en.wikipedia.org/wiki/IEEE_754-2008) standard. The usual formats are 32 or 64 bits in total length:

| Format | Total bits | Significand bits | Exponent bits | Smallest number (normalized) | Largest number |
|--------|------------|------------------|---------------|------------------------------|----------------|
| Single precision | 32 | 23 + 1 sign | 8 | ca. 1.2 ⋅ 10<sup>-38</sup> | ca. 3.4 ⋅ 10<sup>38</sup> |
| Double precision | 64 | 52 + 1 sign | 11 | ca. 2.2 ⋅ 10<sup>-308</sup> | ca. 1.8 ⋅ 10<sup>308</sup> |

Note that there are some peculiarities:

* The **actual bit sequence** is the sign bit first, followed by the exponent and finally the significand bits.
* The exponent does not have a sign; instead an **exponent bias** is subtracted from it (127 for single and 1023 for double precision). This, and the bit sequence, allows floating-point numbers to be compared and sorted correctly even when interpreting them as integers.
* The significand's most significant bit is assumed to be 1 and omitted, except for special cases.
* There are separate **positive and negative zero** values, differing in the sign bit, where all other bits are 0. These must be considered equal even though their bit patterns are different.
* There are special **positive and negative infinity** values, where the exponent is all 1-bits and the significand is all 0-bits. These are the results of calculations where the positive range of the exponent is exceeded, or division of a regular number by zero.
* There are special **not a number** (or NaN) values where the exponent is all 1-bits and the significand is *not* all 0-bits. These represent the result of various undefined calculations (like multiplying 0 and infinity, any calculation involving a NaN value, or application-specific cases). Even bit-identical NaN values must *not* be considered equal.
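
The bit layout described above can be inspected with a few lines of Python. This sketch uses the standard `struct` module to reinterpret a double-precision value as a 64-bit integer; `decode_double` is a name invented for the example.

```python
import math
import struct

def decode_double(x: float):
    """Split a double into (sign bit, biased exponent, stored significand bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF         # 11 bits, bias 1023 not yet subtracted
    significand = bits & ((1 << 52) - 1)    # 52 bits, the leading 1 is implicit
    return sign, exponent, significand

print(decode_double(1.0))        # (0, 1023, 0)
print(decode_double(-2.0))       # (1, 1024, 0)
print(decode_double(math.inf))   # (0, 2047, 0)
print(0.0 == -0.0)               # True, although the sign bits differ
print(math.nan == math.nan)      # False
```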
-14
content/formats/integer.html
---
title: On Using Integers
description: Explanation why using integers to avoid floating-point problems by having them represent e.g. cents is not a good solution.
---

While integer types are usually binary and by definition do not support fractions, they are exact (no [rounding errors](/errors/rounding/) when converting from decimal integers) and can be used as a sort of "poor man's [decimal type](/formats/exact/)" by choosing an implicit fixed decimal point so that the smallest unit you work with can be represented as the integer 1. In practical terms, this is often put as: **"To handle money, store and calculate everything in cents and format only the output"**.

This works, but has a number of **severe drawbacks**:

* It's more work (and more opportunity for bugs) to get it right, especially in regard to [rounding modes](/errors/rounding/).
* Integers have complete precision, but very limited range, and when they overflow, they usually "wrap around" silently, i.e. the largest integer plus 1 becomes zero (for unsigned ints) or the negative value with the largest magnitude (for signed). This is just about the worst possible behaviour when dealing with money, for obvious reasons.
* The implicit decimal point is hard to change and extremely inflexible: if you store dollars as cents, it's simply impossible to support the [Bahraini dinar](http://en.wikipedia.org/wiki/Bahraini_dinar) (1 dinar = 1,000 fils) at the same time. You'd have to store the position of the decimal point with the data - the first step in implementing your own (buggy, non-standard) limited-precision decimal [floating-point](/formats/fp/) format.
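
Here is a small Python sketch of both the technique and the overflow problem. Python's own `int` never overflows, so the fixed-width types from the standard `ctypes` module stand in for the 32-bit integers of other languages; the prices are made-up example values.

```python
import ctypes

# Money as integer cents: exact, but every conversion has to be done by hand.
price_cents = 1999
total_cents = 3 * price_cents
display = f"{total_cents // 100}.{total_cents % 100:02d}"
print(display)            # 59.97

# Silent wrap-around of fixed-width integers (ctypes does no overflow checks).
wrapped_signed = ctypes.c_int32(2**31 - 1 + 1).value
wrapped_unsigned = ctypes.c_uint32(2**32 - 1 + 1).value
print(wrapped_signed)     # -2147483648
print(wrapped_unsigned)   # 0
```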
1313-1414-Summary: **using integers is not recommended.** Do this only if there really is no [better alternative](/formats/exact/) at all.
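To make the drawbacks concrete, here is a minimal Python sketch of the "store everything in cents" approach. The helper names (`format_cents`, `add_tax_cents`) and the 7.5% tax rate are made up for illustration. Python's own integers never overflow, so `ctypes.c_int32` is used only to imitate the fixed-width integers of other languages.

```python
import ctypes

def format_cents(cents):
    """Format a non-negative amount in cents as a string, e.g. 1999 -> '19.99'."""
    return "%d.%02d" % divmod(cents, 100)

def add_tax_cents(cents, rate_per_mille):
    """Add tax given in tenths of a percent, rounding half up by hand."""
    tax = (cents * rate_per_mille + 500) // 1000
    return cents + tax

price = 1999                          # 19.99 stored as cents
total = add_tax_cents(price, 75)      # hypothetical 7.5% tax
print(format_cents(total))

# A 32-bit signed integer silently wraps around at its limit.
largest = 2**31 - 1
wrapped = ctypes.c_int32(largest + 1).value
print(wrapped)
```

Even in this tiny example the rounding rule has to be written by hand, and the position of the decimal point is hard-coded into `format_cents`.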
-28
content/index.html
···11----
22-title: What Every Programmer Should Know About Floating-Point Arithmetic
33-description: Aims to provide short and simple answers to the recurring questions of novice programmers about floating-point numbers not "adding up" correctly, as well as more in-depth information about how IEEE 754 floating-point numbers work, when and how to use them correctly, and what to use instead when they are not appropriate.
44----
55-66-or
77--
88-99-Why don't my numbers add up?
1010-======================================
1111-1212-So you've written some absurdly simple code, for example:
1313-1414- 0.1 + 0.2
1515-1616-and got a totally unexpected result:
1717-1818- 0.30000000000000004
1919-2020-Maybe you asked for help on some forum and were pointed to a [long article with lots of formulas](http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html) that didn't seem to help.
2121-2222-Well, this site is here to:
2323-2424-* Explain concisely why you got that unexpected result
2525-* Tell you how to deal with this problem
2626-* If you're interested, provide in-depth explanations of why floating-point numbers have to work like that and what other problems can arise
2727-2828-You should go to the [Basic Answers](/basic/) section first - but don't stop there!
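As a quick illustration, the surprising sum and two common ways of dealing with it look like this in Python. Python is used here only as an example; most languages behave the same way because they share the IEEE 754 double format.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
print(repr(total))                    # shows the value that is actually stored
print(total == 0.3)                   # the exact comparison fails

# Option 1: compare with a tolerance instead of testing for equality.
print(math.isclose(total, 0.3))

# Option 2: use a decimal type when exact decimal fractions matter.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```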
-38
content/languages/csharp.html
···11----
22-title: Floating-point cheat sheet for C#
33-description: Tips for using floating-point and decimal numbers in C#
44----
55-66-Floating-Point Types
77---------
88-C# has [IEEE 754](/formats/fp/) single and double precision types supported by keywords:
99-1010- float f = 0.1f; // 32 bit float, note f suffix
1111- double d = 0.1d; // 64 bit float, suffix optional
1212-1313-1414-Decimal Types
1515--------------
1616-C# has a 128 bit [limited-precision](/formats/exact/) decimal type denoted by the keyword
1717-<code>decimal</code>:
1818-1919- decimal myMoney = 300.1m; // note m suffix on the literal
2020-2121-2222-How to Round
2323-------------
2424-The <code>Math.Round()</code> method works with the double and decimal types, and allows you to specify a [rounding mode](/errors/rounding/):
2525-2626- Math.Round(1.25m, 1, MidpointRounding.AwayFromZero); // returns 1.3
2727-2828-2929-3030-Resources
3131----------
3232-* [C# Reference](http://msdn.microsoft.com/en-us/library/618ayhy6%28v=VS.80%29.aspx)
3333- * [float type](http://msdn.microsoft.com/en-us/library/b1e65aza%28v=VS.80%29.aspx)
3434- * [double type](http://msdn.microsoft.com/en-us/library/678hzkk9%28v=VS.80%29.aspx)
3535- * [decimal type](http://msdn.microsoft.com/en-us/library/364x0z75%28v=VS.80%29.aspx)
3636- * [Math.Round()](http://msdn.microsoft.com/en-US/library/system.math.round%28v=VS.80%29.aspx)
3737-3838-
-54
content/languages/java.html
···11----
22-title: Floating-point cheat sheet for Java
33-description: Tips for using floating-point and decimal numbers in Java
44----
55-66-Floating-Point Types
77---------
88-Java has [IEEE 754](/formats/fp/) single and double precision types supported by keywords:
99-1010- float f = 0.1f; // 32 bit float, note f suffix
1111- double d = 0.1d; // 64 bit float, suffix optional
1212-1313-The `strictfp` keyword on classes, interfaces and methods forces all intermediate results of floating-point calculations to be IEEE 754 values as well, guaranteeing identical results on all platforms. Without that keyword, implementations can use an extended exponent range where available, resulting in more precise results and faster execution on many common CPUs.
1414-1515-Decimal Types
1616--------------
1717-Java has an [arbitrary-precision](/formats/exact/) decimal type named <code>java.math.BigDecimal</code>, which
1818-also lets you choose the [rounding mode](/errors/rounding/).
1919-2020- BigDecimal a = new BigDecimal("0.1");
2121- BigDecimal b = new BigDecimal("0.2");
2222- BigDecimal c = a.add(b); // returns a BigDecimal representing exactly 0.3
2323-2424-2525-How to Round
2626-------------
2727-To get a String:
2828-2929- String.format("%.2f", 1.2399) // returns "1.24"
3030- String.format("%.3f", 1.2399) // returns "1.240"
3131- String.format("%.2f", 1.2) // returns "1.20"
3232-3333-To print to standard output (or any <code>PrintStream</code>):
3434-3535- System.out.printf("%.2f", 1.2399) // same syntax as String.format()
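3636-3737-To control trailing zeroes, use <code>DecimalFormat</code> (a `0` in the pattern always prints a digit, a `#` drops trailing zeroes):
3838-3939- new DecimalFormat("0.00").format(1.2) // returns "1.20"
4040- new DecimalFormat("0.##").format(1.2) // returns "1.2"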
4141-4242-If you need a specific [rounding mode](/errors/rounding/):
4343-4444- new BigDecimal("1.25").setScale(1, RoundingMode.HALF_EVEN); // returns 1.2
4545-4646-4747-Resources
4848----------
4949-* [Java Language Specification](http://java.sun.com/docs/books/jls/third_edition/html/j3TOC.html)
5050- * [Floating-Point Types, Formats, and Values](http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.3)
5151-* [Java Standard API](http://java.sun.com/javase/6/docs/api/)
5252- * [BigDecimal](http://download.oracle.com/javase/6/docs/api/java/math/BigDecimal.html)
5353- * [DecimalFormat](http://download.oracle.com/javase/6/docs/api/java/text/DecimalFormat.html)
5454- * [String.format()](http://download.oracle.com/javase/6/docs/api/java/lang/String.html#format(java.lang.String,%20java.lang.Object...))
-42
content/languages/javascript.html
···11----
22-title: Floating-point cheat sheet for JavaScript
33-description: Tips for using floating-point and decimal numbers in JavaScript
44----
55-66-Floating-Point Types
77---------
88-JavaScript is dynamically typed and will often convert implicitly between strings and floating-point numbers (which are IEEE 754 64-bit values). To convert a string to a floating-point number explicitly, use the global <code>parseFloat()</code> function.
99-1010- var num = parseFloat("3.5");
1111-1212-Decimal Types
1313--------------
1414-1515-The best decimal type for JavaScript seems to be a port of [Java's](/languages/java/) <code>BigDecimal</code> class, which also supports [rounding modes](/errors/rounding/):
1616-1717- var a = new BigDecimal("0.01");
1818- var b = new BigDecimal("0.02");
1919- var c = a.add(b); // 0.03
2020- var d = c.setScale(1, BigDecimal.prototype.ROUND_HALF_UP);
2121-2222-2323-How to Round
2424-------------
2525-2626- var num = 5.123456;
2727- num.toPrecision(1) // returns "5"
2828- num.toPrecision(2) // returns "5.1"
2929- num.toPrecision(4) // returns "5.123"
3030-3131-Using a specific rounding mode:
3232-3333- new BigDecimal("1.25").setScale(1, BigDecimal.prototype.ROUND_HALF_UP);
3434-3535-3636-Resources
3737----------
3838-* [BigDecimal for JavaScript](https://github.com/dtrebbien/BigDecimal.js)
3939-* [Core JavaScript Reference](https://developer.mozilla.org/en/JavaScript/Reference)
4040- * [parseFloat()](https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/parseFloat)
4141- * [toPrecision()](https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Number/toPrecision)
4242-
-60
content/languages/perl.html
···11----
22-title: Floating-point cheat sheet for Perl
33-description: Tips for using floating-point and decimal numbers in Perl
44----
55-66-Floating-Point Types
77---------
88-Perl supports platform-native floating-point as scalar values; in practice this usually means [IEEE 754](/formats/fp/) double precision.
99-1010-Exact Types
1111--------------
1212-Perl can also store decimal numbers as strings, but the builtin arithmetic operators will convert them to integer or floating-point values to perform the operation.
1313-1414-The <code>Math::BigFloat</code> extension provides an arbitrary-precision [decimal type](/formats/exact/):
1515-1616- use Math::BigFloat ':constant';
1717- my $f = 0.1 + 0.2; # returns exactly 0.3
1818-1919-The <code>Number::Fraction</code> extension provides a fraction type that overloads the arithmetic operators with [symbolic](/formats/exact/) fraction arithmetic:
2020-2121- use Number::Fraction ':constants';
2222- my $f = '1/2' - '1/3'; # returns 1/6
2323-2424-The <code>Math::BigRat</code> extension provides similar functionality. Its advantage is compatibility with the
2525-<code>Math::BigInt</code> and <code>Math::BigFloat</code> extensions, but it does not seem to support fraction literals.
2626-2727-How to Round
2828-------------
2929-To get a string:
3030-3131- $result = sprintf("%.2f", 1.2345); # returns 1.23
3232-3333-To format output:
3434-3535- printf("%.2f", 1.2); # prints 1.20
3636-3737-Note that this implicitly uses [round-to-even](/errors/rounding/). The variable <code>$#</code> contains the default format for printing numbers, but its use is considered deprecated.
3838-3939-The <code>Math::Round</code> extension provides various functions for rounding floating-point values:
4040-4141- use Math::Round qw(:all);
4242- $result = nearest(.1, 4.567); # returns 4.6
4343- $result = nearest(.01, 4.567); # returns 4.57
4444-4545-The <code>Math::BigFloat</code> extension also supports various [rounding modes](/errors/rounding/):
4646-4747- use Math::BigFloat;
4848- my $n = Math::BigFloat->new('123.455');
4949- my $f1 = $n->copy->round('','-2','common'); # returns 123.46 (copy first: round() changes its object)
5050- my $f2 = $n->copy->round('','-2','zero'); # returns 123.45
5151-5252-Resources
5353----------
5454-* [Semantics of numbers and numeric operations in Perl](http://perldoc.perl.org/perlnumber.html)
5555-* [sprintf function](http://perldoc.perl.org/functions/sprintf.html)
5656-* [Math::Round extension](http://search.cpan.org/dist/Math-Round/Round.pm)
5757-* [Number::Fraction extension](http://search.cpan.org/~davecross/Number-Fraction-1.13/lib/Number/Fraction.pm)
5858-* [Math::BigRat extension](http://search.cpan.org/~flora/Math-BigRat-0.26/lib/Math/BigRat.pm)
5959-* [Math::BigFloat extension](http://search.cpan.org/~flora/Math-BigInt-1.95/lib/Math/BigFloat.pm)
6060-
-35
content/languages/php.html
···11----
22-title: Floating-point cheat sheet for PHP
33-description: Tips for using floating-point and decimal numbers in PHP
44----
55-66-Floating-Point Types
77---------
88-PHP is dynamically typed and will often convert implicitly between strings and floating-point numbers (which are platform-dependent, but typically IEEE 754 64-bit values). To force a value to floating-point, evaluate it in a numerical context:
99-1010- $foo = 0 + "10.5";
1111-1212-1313-Decimal Types
1414--------------
1515-The BC Math extension implements [arbitrary-precision](/formats/exact/) decimal math:
1616-1717- $a = '0.1';
1818- $b = '0.2';
1919- echo bcadd($a, $b, 1); // prints 0.3 (third argument: digits after the decimal point, 0 by default)
2020-2121-How to Round
2222-------------
2323-Rounding can be done with the `number_format()` function:
2424-2525- $number = 4.123;
2626- echo number_format($number, 2); // prints 4.12
2727-2828-2929-Resources
3030----------
3131-* [PHP manual](http://www.php.net/manual/en/index.php)
3232- * [Floating point types](http://www.php.net/manual/en/language.types.float.php)
3333- * [BC Math extension](http://de3.php.net/manual/en/ref.bc.php)
3434- * [number_format()](http://de3.php.net/manual/en/function.number-format.php)
3535-
-42
content/languages/python.html
···11----
22-title: Floating-point cheat sheet for Python
33-description: Tips for using floating-point and decimal numbers in Python
44----
55-66-Floating-Point Types
77---------
88-Almost all platforms map Python floats to [IEEE 754](/formats/fp/)
99-double precision.
1010-1111- f = 0.1
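Whether a given Python build really uses IEEE 754 double precision can be checked with `sys.float_info` from the standard library. A small sketch; the values in the comments are those of an IEEE 754 double, not a guarantee for every platform:

```python
import sys

info = sys.float_info
print(info.mant_dig)   # 53 significand bits on IEEE 754 doubles
print(info.max_exp)    # 1024 on IEEE 754 doubles
print(info.dig)        # 15 decimal digits are always preserved

is_ieee_double = (info.mant_dig == 53 and info.max_exp == 1024)
print(is_ieee_double)
```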
1212-1313-Decimal Types
1414--------------
1515-Python has an [arbitrary-precision](/formats/exact/) decimal type named <code>Decimal</code> in the <code>decimal</code> module (import it with <code>from decimal import Decimal</code>), which also lets you choose the [rounding mode](/errors/rounding/).
1616-1717- a = Decimal('0.1')
1818- b = Decimal('0.2')
1919- c = a + b # returns a Decimal representing exactly 0.3
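A slightly fuller sketch of the same idea, including the import. It also shows a common pitfall: constructing a `Decimal` from a float instead of a string carries the binary rounding error along.

```python
from decimal import Decimal

a = Decimal("0.1")
b = Decimal("0.2")
c = a + b
print(c)                                  # 0.3
print(c == Decimal("0.3"))                # True

# Pitfall: a float argument is converted exactly, rounding error included.
from_float = Decimal(0.1)
print(from_float == Decimal("0.1"))       # False
print(from_float)
```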
2020-2121-How to Round
2222-------------
2323-To get a string:
2424-2525- "%.2f" % 1.2399 # returns "1.24"
2626- "%.3f" % 1.2399 # returns "1.240"
2727- "%.2f" % 1.2 # returns "1.20"
2828-2929-To print to standard output:
3030-3131- print("%.2f" % 1.2399) # just use print and string formatting
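Note that formatting rounds the binary value that is actually stored, which can differ from the decimal literal in the source. The sketch below uses 2.675, the example from the Python tutorial; the built-in `format()` and `round()` behave the same way as the `%` operator here.

```python
value = 2.675                      # stored as slightly less than 2.675

as_percent = "%.2f" % value
as_format = format(value, ".2f")
as_round = round(value, 2)

print(as_percent, as_format, as_round)
```

All three results end in 7 rather than 8, because the stored value lies just below the midpoint.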
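3232-3333-Specific [rounding modes](/errors/rounding/) and other parameters can be defined in a Context object, which <code>getcontext()</code> from the <code>decimal</code> module returns for the current thread: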
3434-3535- getcontext().prec = 7
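For instance, the rounding mode can be passed directly to `quantize()` or set for a block of code with `localcontext()`. A minimal sketch using only names from the standard `decimal` module:

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP, localcontext

value = Decimal("1.25")
one_place = Decimal("0.1")          # quantize() rounds to this exponent

half_even = value.quantize(one_place, rounding=ROUND_HALF_EVEN)
half_up = value.quantize(one_place, rounding=ROUND_HALF_UP)
print(half_even, half_up)           # 1.2 1.3

# A context limits the precision (significant digits) of arithmetic results.
with localcontext() as ctx:
    ctx.prec = 7
    ctx.rounding = ROUND_HALF_UP
    third = Decimal(1) / Decimal(3)
print(third)                        # 0.3333333
```

Using `localcontext()` keeps the changed settings from leaking into the rest of the program.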
3636-3737-Resources
3838----------
3939-* [Floating Point Arithmetic: Issues and Limitations](http://docs.python.org/tutorial/floatingpoint.html)
4040-* [The decimal module](http://docs.python.org/library/decimal.html)
4141- * [Context objects](http://docs.python.org/library/decimal.html#context-objects)
4242-* [String formatting in Python](http://docs.python.org/library/stdtypes.html#string-formatting-operations)
-39
content/languages/sql.html
···11----
22-title: Floating-point cheat sheet for SQL
33-description: Tips for using floating-point and decimal numbers in SQL
44----
55-66-Floating-Point Types
77---------
88-The SQL standard defines three binary floating-point types:
99-1010-* `REAL` has implementation-dependent precision (usually maps to a hardware-supported type like IEEE 754 single or double precision)
1111-* `DOUBLE PRECISION` has implementation-dependent precision which is greater than `REAL` (usually maps to IEEE 754 double precision)
1212-* `FLOAT(N)` has at least `N` binary digits of precision, with an implementation-dependent maximum for `N`
1313-1414-The exponent range for all three types is implementation-dependent as well.
1515-1616-Decimal Types
1717--------------
1818-The standard defines two fixed-point decimal types:
1919-2020-* `NUMERIC(M,N)` has exactly `M` total digits, `N` of them after the decimal point
2121-* `DECIMAL(M,N)` is the same as `NUMERIC(M,N)`, except that it is allowed to have more than `M` total digits
2222-2323-The maximum values of `M` and `N` are implementation-dependent. Vendors often implement the two types identically.
2424-2525-How to Round
2626-------------
2727-2828-The SQL standard defines no explicit rounding, but most vendors provide a `ROUND()` or `TRUNC()` function.
2929-3030-However, it usually makes little sense to round within the database, since its job is *storing* data, while rounding is an aspect of *displaying* data, and should therefore be done by the code in the presentation layer.
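A minimal sketch of that division of labour, using Python's built-in `sqlite3` module as a stand-in for a real database. The `prices` table and its column are invented for the example, and SQLite's type handling differs from the standard types described above. The stored value is left untouched and rounding happens only when the value is formatted for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (amount REAL)")
conn.execute("INSERT INTO prices (amount) VALUES (?)", (4.123,))

# The database returns the value as stored ...
(stored,) = conn.execute("SELECT amount FROM prices").fetchone()

# ... and the presentation layer decides how many digits to show.
displayed = "%.2f" % stored
print(stored, displayed)
conn.close()
```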
3131-3232-3333-Resources
3434----------
3535-* [Official ISO SQL 2008 standard (non-free)](http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38640)
3636-* [SQL 92 draft (free)](http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt)
3737-* [MySQL numeric types](http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html)
3838-* [PostgreSQL data types](http://www.postgresql.org/docs/8.1/static/datatype.html)
3939-* [MS SQL Server data types](http://msdn.microsoft.com/en-US/library/ms187752%28v=SQL.90%29.aspx)
content/logo.png
logo.png
-14
content/references.html
···11----
22-title: References
33-description: Documents with more in-depth information about floating-point math
44----
55-66-Documents that contain more in-depth information about the topics
77-covered on this website:
88-99-* [Homepage of the IEEE 754 standard](http://grouper.ieee.org/groups/754/)
1010-* [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html)
1111-* [Homepage of William Kahan (architect of the IEEE 754 standard, lots of interesting links)](http://www.cs.berkeley.edu/~wkahan/)
1212-* [Decimal Arithmetic FAQ](http://speleotrove.com/decimal/decifaq.html)
1313-* [Comparing floating-point numbers](http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm)
1414-* [Tool to convert numbers between bases, including fractions](http://www.easysurf.cc/cnver17.htm)
content/robots.txt
robots.txt
content/style.css
style.css
-14
content/xkcd.html
···11----
22-title: xkcd
33-description: How to mess with people who've learned to *expect* rounding errors in floating-point math.
44----
55-66-or
77---
88-99-<%= @item[:description] %>
1010-====================
1111-1212-
1313-1414-From [xkcd](http://www.xkcd.com/217/)