Decimal strict mode should prevent Decimal("0.89") == 0.89 #125557

Open
timkay opened this issue Oct 15, 2024 · 25 comments
Labels
extension-modules C modules in the Modules dir pending The issue will be closed if no feedback is provided type-feature A feature request or enhancement

Comments

@timkay

timkay commented Oct 15, 2024

Bug report

Bug description:

Mixing Decimals and floats will often get you the wrong answer. By default, the Decimal library allows such behavior:

>>> import decimal
>>> decimal.Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> decimal.Decimal("0.1") < 0.1
True
>>> decimal.Decimal("0.1") == 0.1
False

Fortunately, you can turn on floating point strict mode, where mixing Decimals and floats is prohibited:

>>> decimal.getcontext().traps[decimal.FloatOperation] = True
>>> decimal.Decimal(0.1)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
decimal.FloatOperation: [<class 'decimal.FloatOperation'>]
>>> decimal.Decimal("0.1") < 0.1
Traceback (most recent call last):
  File "<console>", line 1, in <module>
decimal.FloatOperation: [<class 'decimal.FloatOperation'>]

HOWEVER, for some reason, == and != are allowed in floating point strict mode, and produce wrong answers:

>>> decimal.Decimal("0.1") == 0.1
False
>>>

When floating point strict mode is on, why are == and != still allowed?

    if isinstance(other, float):
        context = getcontext()
        if equality_op:
            context.flags[FloatOperation] = 1
        else:
            context._raise_error(FloatOperation,
                "strict semantics for mixing floats and Decimals are enabled")
        return self, Decimal.from_float(other)

This code would be better without the if equality_op: branch:

    if isinstance(other, float):
        context = getcontext()
        context._raise_error(FloatOperation,
            "strict semantics for mixing floats and Decimals are enabled")
        return self, Decimal.from_float(other)

Why are == and != allowed in floating point strict mode?
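The behavior described above can be reproduced in a small script. This is a minimal sketch: the helper name `attempt` is made up for illustration, and `localcontext()` is used only so the trap does not leak into the rest of the program.

```python
from decimal import Decimal, FloatOperation, localcontext


def attempt(operation):
    """Run a zero-argument callable; return its result, or a marker if it was trapped."""
    try:
        return operation()
    except FloatOperation:
        return "FloatOperation raised"


with localcontext() as ctx:
    ctx.traps[FloatOperation] = True  # "strict mode" for this block only
    ctx.clear_flags()

    ordering = attempt(lambda: Decimal("0.1") < 0.1)   # ordering comparison is trapped
    equality = attempt(lambda: Decimal("0.1") == 0.1)  # equality is exempt from the trap
    flag_recorded = bool(ctx.flags[FloatOperation])    # the mixed operation is still recorded

print(ordering)
print(equality)
print(flag_recorded)
```

The ordering comparison raises, while the equality comparison returns a value and only sets the flag, which matches the `equality_op` branch quoted above.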

CPython versions tested on:

3.12

Operating systems tested on:

Linux

@timkay timkay added the type-bug An unexpected behavior, bug, or error label Oct 15, 2024
@skirpichev
Member

you can turn on floating point strict mode, where mixing Decimals and floats is prohibited

Not at all:
"If the signal is not trapped (default), mixing floats and Decimals is permitted in the Decimal constructor, create_decimal() and all comparison operators. Both conversion and comparisons are exact. Any occurrence of a mixed operation is silently recorded by setting FloatOperation in the context flags. Explicit conversions with from_float() or create_decimal_from_float() do not set the flag.

Otherwise (the signal is trapped), only equality comparisons and explicit conversions are silent. All other mixed operations raise FloatOperation."
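The documented distinction between implicit and explicit conversions can be checked directly. A minimal sketch, assuming the signal is left untrapped (the default); the variable names are illustrative only.

```python
from decimal import Decimal, FloatOperation, localcontext

with localcontext() as ctx:
    ctx.traps[FloatOperation] = False  # default: the signal is not trapped
    ctx.clear_flags()

    # Explicit conversions are silent: they do not set the flag.
    Decimal.from_float(0.1)
    ctx.create_decimal_from_float(0.1)
    after_explicit = bool(ctx.flags[FloatOperation])

    # Passing a float to the constructor is an implicit mixed operation.
    Decimal(0.1)
    after_implicit = bool(ctx.flags[FloatOperation])

print(after_explicit, after_implicit)
```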

HOWEVER, for some reason, == and != are allowed in floating point strict mode, and produce wrong answers:

Why do you think it's a wrong answer?! 1/10 != (binary approximation of) 0.1; 0.1 can't be represented exactly as a binary floating-point number. See https://docs.python.org/3/tutorial/floatingpoint.html#floating-point-arithmetic-issues-and-limitations

So the current behaviour seems to be consistent and well documented. Your only argument against it is wrong, and the change would break backward compatibility.

To change things you need more arguments. Probably, this should be discussed first on https://discuss.python.org/

@mdickinson
Member

mdickinson commented Oct 16, 2024

for some reason, == and != are allowed in floating point strict mode

There's a more general design principle at work here (though it's not one I think I've seen articulated clearly in the docs), not related to float or Decimal specifically: as a rule, an == comparison between two hashable objects shouldn't raise an exception. The reason is that equality checks are performed implicitly and unpredictably in hash-based collections like dict and set.

If == raised as you suggest, then it would be possible to create the set {Decimal("0.47013"), 8946670875749133.0}, but creating a set {Decimal("0.47012"), 8946670875749133.0} would raise (because the hashes of the two elements in the set match, and so an equality test would be invoked to determine whether the elements are actually different, and that equality test would raise). I think that would be rather surprising.

>>> from decimal import Decimal
>>> s = {Decimal("0.47012"), 8946670875749133.0}
>>> list(map(hash, s))
[8946670875749133, 8946670875749133]
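To see the implicit equality check happen, one can count calls to `__eq__` while building such a set. This is only a sketch: `LoudDecimal` is a hypothetical subclass written for this illustration, and the hash collision is assumed to hold on a 64-bit CPython build, as in the output above.

```python
from decimal import Decimal


class LoudDecimal(Decimal):
    """Hypothetical Decimal subclass that counts equality comparisons."""

    calls = 0

    def __eq__(self, other):
        type(self).calls += 1
        return super().__eq__(other)

    # Defining __eq__ disables the inherited hash, so restore it explicitly.
    __hash__ = Decimal.__hash__


same_hash = hash(Decimal("0.47012")) == hash(8946670875749133.0)

# Equal hashes force the set to fall back on == to tell the two elements apart.
mixed = {LoudDecimal("0.47012"), 8946670875749133.0}

print(same_hash, LoudDecimal.calls, len(mixed))
```

If that `==` raised, building the set would raise too, even though no comparison appears anywhere in the user's code.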

@skirpichev
Member

There's a more general design principle at work here (though it's not one I think I've seen articulated clearly in the docs), not related to float or Decimal specifically: as a rule, an == comparison between two hashable objects shouldn't raise an exception.

Perhaps the closest thing to documenting it is a quote from here: "When no appropriate method returns any value other than NotImplemented, the == and != operators will fall back to is and is not, respectively."

It doesn't say why this default exists, however. But I'm not sure it's worth it.

I think this can be closed.

@skirpichev skirpichev added the pending The issue will be closed if no feedback is provided label Oct 16, 2024
@timkay
Author

timkay commented Oct 16, 2024

There's a more general design principle at work here

The general design principle is that floating point strict mode should prevent coders from making mistakes. Sure it's not always a mistake to say Decimal(0.89), if that's what you really intended, but that's almost never what people really want.

>>> Decimal(0.89)
Decimal('0.89000000000000001332267629550187848508358001708984375')

Decimal has lots of use cases, and a common one is to handle dollars-and-cents. If you say

account.balance < amount

You can run into trouble because

>>> Decimal("0.89") < 0.89
True

This sort of problem is flagged by turning on floating point strict mode, and the runtime will flag that statement as incorrect.

The exact same argument applies to equality. For many use cases, equality between Dollar and float is a coding error, and floating point strict mode should catch it.

The idea that things like sets use equality implicitly is interesting. Problem is, most people would be surprised by

>>> 0.89 in set([Decimal("0.89")])
False

What is the use case where equality makes sense between Decimal and float?

As for breaking changes, we could introduce a new flag, StrictMode, which would do what FloatOperation does, but includes equality.
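A rough idea of what such stricter semantics would feel like can be had today with a subclass. This is a hypothetical sketch, not an existing decimal feature: `StrictDecimal` is an invented name, and raising `FloatOperation` directly from `__eq__` is a choice made for this illustration.

```python
from decimal import Decimal, FloatOperation


class StrictDecimal(Decimal):
    """Hypothetical Decimal that refuses == and != against floats."""

    def __eq__(self, other):
        if isinstance(other, float):
            raise FloatOperation("mixed Decimal/float equality is disabled")
        return super().__eq__(other)

    def __ne__(self, other):
        if isinstance(other, float):
            raise FloatOperation("mixed Decimal/float equality is disabled")
        return super().__ne__(other)

    # Defining __eq__ disables the inherited hash, so restore it explicitly.
    __hash__ = Decimal.__hash__


price = StrictDecimal("0.89")
same_type_ok = price == Decimal("0.89")

try:
    price == 0.89
    mixed_equality_raised = False
except FloatOperation:
    mixed_equality_raised = True

print(same_type_ok, mixed_equality_raised)
```

Arithmetic on a Decimal subclass generally returns plain Decimal objects, so this check would not survive a computation. It is meant to show the semantics under discussion rather than to serve as a workaround.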

@skirpichev
Member

What is the use case where equality makes sense between Decimal and float?

When they are mathematically equal.

>>> import decimal
>>> decimal.Decimal(0.1) == 0.1
True
>>> decimal.Decimal(0.89) == 0.89
True

We don't lose anything, unless you apply context settings with insufficient precision:

>>> (+decimal.Decimal(0.1)) == 0.1
False
>>> decimal.Decimal(0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)
>>> (+decimal.Decimal(0.1)).as_integer_ratio()
(1000000000000000055511151231, 10000000000000000000000000000)
>>> decimal.getcontext().prec=100
>>> (+decimal.Decimal(0.1)).as_integer_ratio()
(3602879701896397, 36028797018963968)
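The same point can be made with exact rational arithmetic, independent of any decimal context. A small sketch using `fractions.Fraction`, which converts both floats and Decimals exactly; the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

exact_float = Fraction(0.1)                    # exact value of the binary float
exact_from_float = Fraction(Decimal(0.1))      # lossless float -> Decimal conversion
exact_from_string = Fraction(Decimal("0.1"))   # the decimal fraction 1/10

print(exact_float == exact_from_float)         # same rational number
print(exact_from_string == Fraction(1, 10))
print(exact_from_string < exact_float)         # the float lies slightly above 1/10
```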

we could introduce a new flag, StrictMode, which would do what FloatOperation does, but includes equality.

Mark reminded us above why we can't raise an exception here.

So, in this case you suggest falling back on is and setting 0.1 != decimal.Decimal(0.1). What do we gain here?

@timkay
Author

timkay commented Oct 16, 2024

I wasn't suggesting that we fall back on is. I was responding to a previous point that equality is used implicitly sometimes, and that's why equality is white listed even in floating point strict mode. But it doesn't make any sense, as that example shows.

To your point, I ask, what is the use case for saying Decimal(0.1)? The correct use is Decimal("0.1").

@skirpichev
Member

I wasn't suggesting that we fall back on is.

Then what? Currently Decimal's comparison method does an implicit conversion and then does the comparison. We have two options: 1) implement the comparison method in some other way (how?) or 2) just return NotImplemented, in which case Python will fall back to is.

If you rule out 2) - please explain how we should implement 1) instead.

what is the use case for saying Decimal(0.1)?

To losslessly convert a binary floating-point number to its Decimal equivalent. Or in other words: to construct Decimal instances from some binary fractions ($N/2^M$).

The correct use is Decimal("0.1").

It depends. To achieve the previous goal, e.g. for 0.1, the correct use would be something like:

>>> from decimal import Decimal, localcontext
>>> x = 0.1
>>> with localcontext() as ctx:
...     ctx.prec = 55  # assuming IEEE doubles
...     n, d = x.as_integer_ratio()
...     a = Decimal(n)/d
...     
>>> a  # == Decimal(x)
Decimal('0.1000000000000000055511151231257827021181583404541015625')

Using a string in the Decimal constructor is the correct way to enter a decimal fraction.

@tim-one
Member

tim-one commented Oct 17, 2024

Elaborating on what @mdickinson said, there's an even more general design principle at work: starting with the introduction of the classes in the datetime module, the intent has been that == and != never raise an exception when applied to objects of different types. That was late in Python 2's life, and was in a sense a prototype for a more sweeping rule in Python 3. The idea being that equality is a much simpler concept than ordering (although don't tell philosophers that 😉), and e.g. it's just plain dead obvious that datetime.datetime.now() != K for any integer K.

It was appreciated at the time that this would simplify reasoning about sets and dicts, but the latter isn't really what drove it. Instead the latter was taken as confirmation that it was a Good Idea in general.
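The datetime rule mentioned here is easy to observe. A minimal sketch; the integer 42 stands in for "any integer K".

```python
from datetime import datetime

now = datetime.now()

# Equality across unrelated types never raises; it just answers "not equal".
unequal = now != 42

# Ordering across unrelated types has no sensible answer, so it raises.
try:
    now < 42
    ordering_raised = False
except TypeError:
    ordering_raised = True

print(unequal, ordering_raised)
```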

@skirpichev
Member

Other than raising an exception (which, I think, is excluded by Mark's arguments), the only option is to make equality context-dependent.

I think this is silly: we might end up, e.g., with a set that has equal elements after the context settings are changed.

@tim-one
Member

tim-one commented Oct 17, 2024

@mdickinson wrote:

If == raised as you suggest, then it would be possible to create the set {Decimal("0.47013"), 8946670875749133.0}, but creating a set {Decimal("0.47012"), 8946670875749133.0} would raise

Perhaps when FloatOperation was introduced, mixed equality comparison could have raised too on "practicality beats purity" grounds, but now there's also backwards compatibility to preserve. As Mark's example shows, behavior in existing code could change if FloatOperation changed to include raising on mixed equality. A simpler example illustrates simpler potential breakage:

from decimal import Decimal

f = 0.0
d = Decimal(0)
if f == d:
    pass  # whatever

That works fine today regardless of FloatOperation state. Changing it to raise if FloatOperation is enabled would be a breaking change.

In any case, behavior that's working as designed and documented isn't properly called "a bug" in the issue tracker. I suggest changing it to a feature request, and instead asking for a new exception to be introduced, one that acts like FloatOperation but also raises on mixed (in)equality comparisons. I'd probably be +0 on that, but, as Mark's example shows, it would come with its own surprises.

@timkay
Author

timkay commented Oct 17, 2024

@skirpichev, I asked what is the use case for Decimal(0.1), and you replied, that it's to load a Decimal instance with the value 0.1000000000000000055511151231257827021181583404541015625. That's not a use case. What would be a situation where doing so would be useful?

@mdickinson points out that {Decimal("0.47012"), 8946670875749133.0} would raise an exception. I ask you for a use case, where mixing Decimal and float is useful.

There is a reason that Decimal exists, and it is to allow decimal computations without worrying about floating point rounding. If you care about that then you would use Decimal and avoid float. Mixing the two leads to incorrect answers. Raising on equality would prevent that.

Put a different way, IF you want to mix Decimal and float, then why would you want to prevent comparing Decimal < float ? What is the purpose of floating point strict mode? If you want to mix Decimal and float, don't turn on strict mode. The compromise that is strict mode now, it isn't useful.

@tim-one
Member

tim-one commented Oct 17, 2024

@timkay:

I asked what is the use case for Decimal(0.1), and you replied, that it's to load a Decimal instance with the value 0.1000000000000000055511151231257827021181583404541015625. That's not a use case.

It's not your use case, but I've frequently done such things, because the ability is quite useful for people analyzing floating-point error propagation. Decimal isn't their focus. Decimal is just a tool that makes it much easier for non-experts to "see" what's really happening in their float code. For example, I've frequently used this in public replies to Stackoverflow questions about float behavior.

Read the room here? For reasons of backward compatibility alone, I expect there's essentially no chance this will change. But I think you could make a decent case for adding a new exception (say, FloatOperationEx or FloatOperationStrict) that acts as you want.

The compromise that is strict mode now, it isn't useful.

It doesn't do everything you want, but does do a great deal of what you want. It does do everything I want of it (and I do use it when appropriate), but then I'm always acutely aware of how mixing numeric types works in Python. "Isn't useful" is too strong.

@skirpichev skirpichev added type-feature A feature request or enhancement extension-modules C modules in the Modules dir and removed type-bug An unexpected behavior, bug, or error pending The issue will be closed if no feedback is provided labels Oct 18, 2024
@skirpichev
Member

skirpichev commented Oct 18, 2024

What would be a situation where doing so would be useful?

Well, just showing the decimal floating-point number that is equal to the given binary floating-point number is a useful thing in itself. People too often misinterpret floating-point literals like 0.1 as decimal fractions.

(IMO, "short" float repr in 2.7+ and 3.1+ rather adds more confusion here.)

And too often, bug reports of the form "this Python answer in that floating-point computation is wrong/not accurate/imprecise" end with similar computations using Decimal equivalents. Just a couple of quick examples: #111933 (comment) ("python answer is wrong!") or #82884 (comment) (a subtype, "round()'s answer is wrong").

I ask you for a use case, where mixing Decimal and float is useful.

They don't really get mixed. Floats are converted to their decimal equivalents.

it is to allow decimal computations without worrying about floating point rounding.

I'm afraid you can't in general avoid floating-point rounding in computations with decimal numbers (try to trap the Inexact signal and see how long you can play with numbers).

IF you want to mix Decimal and float, then why would you want to prevent comparing Decimal < float ?

E.g. to prevent comparisons like x < 0.2:

>>> Decimal('0.200000000000000001') <= 0.2
True
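The surprising result comes from the exact value of the float literal, which lies slightly above the decimal fraction 0.2. A short sketch; the names `threshold` and `candidate` are illustrative.

```python
from decimal import Decimal

threshold = Decimal(0.2)                     # exact value stored for the float 0.2
candidate = Decimal("0.200000000000000001")

above_decimal_fraction = candidate > Decimal("0.2")  # larger than the intended 0.2
below_float_value = candidate < threshold            # yet smaller than the float's value

print(threshold)
print(above_decimal_fraction, below_float_value)
```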

On the other hand, if you are using ==/!= to test computed floating-point values, that usually means a bug in your program. There are a few exceptions, e.g. see the recipes for exp/sin/cos in the decimal module (i.e. some fixed-point iterations). But there, changing the test to lasts - s == 0.0 is harmless.

I've changed this to a feature request per Tim's suggestion. But I don't see much sense in this.

@tim-one
Member

tim-one commented Oct 18, 2024

@skirpichev, I'm sympathetic to the OP's complaint. I don't know the history of FloatOperation's introduction, but it seemed strangely inconsistent from the start that, by default e.g. decimal + float wasn't allowed but decimal < float was. Then FloatOperation was introduced to make the latter complain too - but then decimal == float still didn't complain.

To a non-expert, this all seems arbitrary, and inessential arbitrariness is "unPythonic". A new exception that says "no implicit mixing of decimals and floats, period" would at least be comprehensible to everyone. Although, as Mark said, it opens the door to new surprises (due to Python implicitly doing equality comparisons under the covers, for dicts and sets, and even for x in sequence).

@tim-one
Member

tim-one commented Oct 18, 2024

I'll add that one non-obvious advantage of the current default behavior is that it allows lists mixing floats and Decimals to be sorted without exception, and likewise to be used by bisect.* and heapq.* functions. The only comparison method they use is __lt__.
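A minimal sketch of that advantage, assuming the default (untrapped) FloatOperation setting; the sample values are arbitrary.

```python
import bisect
import heapq
from decimal import Decimal, FloatOperation, localcontext

values = [Decimal("0.2"), 0.1, Decimal("0.05"), 0.3]

with localcontext() as ctx:
    ctx.traps[FloatOperation] = False  # the default setting

    ordered = sorted(values)           # mixed list sorts without complaint

    heap = list(values)
    heapq.heapify(heap)
    smallest = heapq.heappop(heap)     # heap operations work on the mixed list too

    position = bisect.bisect_left(ordered, Decimal("0.15"))

print(ordered)
print(smallest, position)
```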

@timkay
Author

timkay commented Oct 18, 2024

If you want to mix Decimal and float, then you wouldn't turn on floating-point-strict-mode. We are back to the question, what is the point of floating-point-strict-mode?

As for using Decimal to display what's going on with floating point representation, again, don't turn on floating-point-strict-mode. Also there is a much more direct way to do that:

>>> f"{0.1:.60f}"
'0.100000000000000005551115123125782702118158340454101562500000'

Does anybody have an application that has floating-point-strict-mode turned on, where you want it to allow equality between Decimal and float? That's the use case I am looking for. If you can't come up with a use case, then you are saying, "It's the correct behavior because it's the actual behavior."

@tim-one
Member

tim-one commented Oct 18, 2024

there is a much more direct way to do that:
>>> f"{0.1:.60f}"
'0.100000000000000005551115123125782702118158340454101562500000'

Inadequate in general. Decimal(float) is lossless, creating as many decimal digits as needed to reproduce the input exactly, regardless of current context precision. A .60f format produces 60 fractional decimal digits regardless of whether only 1 is needed, or if 600 are actually needed. It's poke-&-hope guesswork, far less useful.
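The difference shows up as soon as the float needs more than 60 fractional digits, or far fewer. A small sketch using 2.0 ** -70, chosen here only because its exact decimal expansion needs 70 fractional digits.

```python
from decimal import Decimal
from fractions import Fraction

x = 2.0 ** -70                 # exactly representable as a binary float

fixed = f"{x:.60f}"            # always 60 fractional digits, so the tail is rounded off
exact = Decimal(x)             # as many digits as the value actually needs

half_fixed = f"{0.5:.60f}"     # 59 digits of padding for a one-digit value
half_exact = Decimal(0.5)

print(Decimal(fixed) == exact)                # the fixed-width string lost information
print(Fraction(exact) == Fraction(1, 2**70))  # the Decimal is exact
print(len(half_fixed), str(half_exact))
```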

"It's the correct behavior because it's the actual behavior."

Misses the real point. "It's the actual behavior" alone is entirely what "backward compatibility" is about. Python is exceedingly reluctant to introduce breaking changes, and especially so when "the actual behavior" is the documented behavior.

It remains possible that a new exception could be introduced that would do what you want.

@skirpichev
Member

it seemed strangely inconsistent from the start that, by default e.g. decimal + float wasn't allowed but decimal < float was.

@tim-one, honestly I didn't get Mark's point on why it shouldn't be extended to arithmetic. See #87768.

But this is another story.

Although, as Mark said, opens the door to new surprises

And it will be much less useful. IMO, using == to compare computed values is a bug. Can you suggest a sane use case for ==?

On the other hand, the == test is used implicitly, e.g. to construct sets. Why would we want to forbid {0.1, Decimal.from_float(0.1)}? Note, that we don't reject all floats or decimals here, but ones with specific values.

what is the point of floating-point-strict-mode?

@timkay, you are repeating a question that was answered above.

@tim-one
Member

tim-one commented Oct 20, 2024

I don't see value in revisiting old decisions. It's messy, and can't change what we're stuck with now 😉.

FloatOperation was a strange compromise, for precisely the reason @timkay pointed out at the start. There's a decent case, IMO, for adding a new exception that raises on (in)equality comparisons too. I expect that "almost everyone" who cares about this would in fact use that newer exception.

It's too late to change the current FloatOperation. Perhaps it could be deprecated.

Why would we want to forbid {0.1, Decimal.from_float(0.1)}?

We don't want to forbid it. But under the new exception, it would raise an exception, as an unavoidable consequence of making (say) NoFloat easy to explain and catch the cases the OP cares about.

Note, that we don't reject all floats or decimals here, but ones with specific values.

And it's worse than just that. In non-trivial examples, whether an exception is raised doesn't just depend on the values in the set (or dict), but also on the history of insertions and deletions. Nothing whatsoever is defined about the order in which hash chain collisions are traversed.

As Mark said, "equality checks are performed implicitly and unpredictably in hash-based collections like dict and set".

I've never mixed Decimals with floats in a set or dict to begin with, so would never notice. If I ever did, and got burned, I'd have to stop enabling the new exception. Fine by me.

@skirpichev
Member

I don't see value in revisiting old decisions.

Still, I tried to dig into the history, hoping to understand the motivation for the current design, but without much luck. It seems the FloatOperation signal in its present shape was introduced in #51901 without much discussion :(

whether an exception is raised doesn't just depend on the values in the set (or dict), but also on the history of insertions and deletions.

And even worse, on whether you have enabled FloatOperation before working with a container instance or not:

>>> import _pydecimal as decimal
>>> l2 = [decimal.Decimal("0.47012"), 8946670875749133.0]
>>> s2 = set(l2)
>>> l2.index(8946670875749133.0)
1
>>> decimal.getcontext().traps[decimal.FloatOperation] = True  # "forbidden" containers can't be created below
>>> l2.index(8946670875749133.0)
Traceback (most recent call last):
...
decimal.FloatOperation: strict semantics for mixing floats and Decimals are enabled
>>> l2.index(decimal.Decimal("0.47012"))
0
>>> 1.1 in s2
False
>>> 8946670875749133.0 in s2
Traceback (most recent call last):
...
decimal.FloatOperation: strict semantics for mixing floats and Decimals are enabled

A patch to play with:

diff --git a/Lib/_pydecimal.py b/Lib/_pydecimal.py
index ec03619933..a2dbd3059f 100644
--- a/Lib/_pydecimal.py
+++ b/Lib/_pydecimal.py
@@ -807,7 +807,7 @@ def _cmp(self, other):
     # that specified by IEEE 754.
 
     def __eq__(self, other, context=None):
-        self, other = _convert_for_comparison(self, other, equality_op=True)
+        self, other = _convert_for_comparison(self, other, equality_op=False)
         if other is NotImplemented:
             return other
         if self._check_nans(other, context):

On the other hand, with the FloatOperation signal enabled as it is now, you might already lose the ability to sort some containers.

In either case, I doubt that the new semantics of the FloatOperation signal would lead to hard-to-detect bugs.

CC @serhiy-storchaka

It's too late to change the current FloatOperation. Perhaps it could be deprecated.

Hmm, can't we treat the current behavior as a bug? Its trapping is turned off by default.

@tim-one
Member

tim-one commented Mar 23, 2025

I don't see anything new here. The current behavior is working as designed and documented, so can't be called "a bug". You could call it a "design error", but I wouldn't. At worst it's a design decision the OP doesn't like.

For more on that, you'd have to ask @skrah. The mountain of code is overwhelmingly his, and best I can tell FloatOperation was his design. Unfortunately, he left the community on bad terms so is unlikely to voluntarily chime in here.

Sorting is a red herring. Sorting doesn't use == or !=, only <. So sorting a mixed list already reliably blows up when FloatOperation is enabled.

    >>> import decimal as d
    >>> d.getcontext().traps[d.FloatOperation] = True
    >>> sorted([d.Decimal("1"), 1.0])
    Traceback (most recent call last):
      File "<python-input-4>", line 1, in <module>
        sorted([d.Decimal("1"), 1.0])
        ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
    decimal.FloatOperation: [<class 'decimal.FloatOperation'>]

Users don't appreciate breaking changes, so I oppose changing the currently well-defined semantics in any way. But I may weakly support adding a new, "even stricter" trap, which also complained about mixed (in)equality comparisons. That's only "may support", because, as mentioned before, a deeper design invariant is that == and != never raise exceptions, regardless of types.

In the case of mixing floats and decimals, == and != also deliver the mathematically correct results (they're exactly right, independent of context precision and rounding mode).

It is, after all, a plain fact that Decimal("0.89") != 0.89.

@tim-one
Member

tim-one commented Mar 23, 2025

@tim-one

Sorting is a red herring. Sorting doesn't use == or !=, only <. So sorting a mixed list already reliably blows up when FloatOperation is enabled.

Not quite. While sorting itself never does (in)equality comparison directly, applying __lt__ to elements might. In the example, wrap each list element in a 1-list or 1-tuple. List/tuple __lt__ does compare elements for equality. So then you get an example that sorts fine today even when FloatOperation is enabled. But would blow up if equality were trapped too.
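The wrapped variant can be sketched as follows, with the trap enabled only inside a local context. It relies on tuple comparison checking element equality first, which stays silent under the current FloatOperation semantics.

```python
from decimal import Decimal, FloatOperation, localcontext

with localcontext() as ctx:
    ctx.traps[FloatOperation] = True

    # Bare elements: sorting calls < on a Decimal and a float, which is trapped.
    try:
        sorted([Decimal("1"), 1.0])
        bare_sort_raised = False
    except FloatOperation:
        bare_sort_raised = True

    # Wrapped in 1-tuples: tuple comparison first checks the elements with ==.
    # They compare equal, so < is never applied to the elements themselves.
    wrapped = sorted([(Decimal("1"),), (1.0,)])

print(bare_sort_raised, wrapped)
```

If equality were trapped as well, the second sort would raise too.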

@tim-one
Member

tim-one commented Mar 24, 2025

What to do with mixed float/decimal comparisons has a long, contentious history, going back before Python 3, and including a span when it didn't complain but returned nonsense results:

#46783

But there's essentially no public discussion I found of FloatOperation. Before then, the main hang-up was that there was no practical way to convert a float to a decimal losslessly at all. Nobody much cared. But in that issue report, @mdickinson eventually solved the technical problems, and the rest followed.

Throughout he didn't care much about sorting, but about that elt in dict/set/sequence "just plain work" without worrying about mixing types.

And the design of FloatOperation exempted equality, so those uses continue to "just plain work".

So I'm satisficed it's working as intended, I'd leave it alone. -1 on changing its semantics in any way, and -1 on deprecating it. At best +0 on introducing a stricter trap to prevent mixed (in)equality comparison too.

And I'll leave it there. If someone else around at the time (@mdickinson , @rhettinger) doesn't chime in with a strong opposing preference, I expect to just eventually close this as "not planned".

@skirpichev skirpichev added the pending The issue will be closed if no feedback is provided label Mar 24, 2025
@skirpichev
Member

At best +0 on introducing a stricter trap to prevent mixed (in)equality comparison too.

OK, let's see if some core dev(s) will be in favor of this (i.e. +<nonzero>).

Let's not forget that in principle we have access to the context's flags even if a signal isn't trapped:

>>> import decimal
>>> ctx = decimal.getcontext()
>>> [k.__name__ for k, v in ctx.flags.items() if v]
[]
>>> d = decimal.Decimal(0.1)
>>> [k.__name__ for k, v in ctx.flags.items() if v]
['FloatOperation']
>>> d + 1
Decimal('1.100000000000000005551115123')
>>> [k.__name__ for k, v in ctx.flags.items() if v]
['FloatOperation', 'Inexact', 'Rounded']
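Because mixed equality comparisons also set the flag, the flag can serve as a non-raising audit. The helper below is a hypothetical sketch (the name `audit_float_mixing` and the `report` dictionary are invented for this illustration), not part of the decimal module.

```python
from contextlib import contextmanager
from decimal import Decimal, FloatOperation, localcontext


@contextmanager
def audit_float_mixing():
    """Hypothetical helper: report whether any Decimal/float mixing occurred."""
    with localcontext() as ctx:
        ctx.clear_flags()
        report = {"mixed": False}
        try:
            yield report
        finally:
            # The flag is set by implicit conversions and by mixed comparisons,
            # including == and !=, whether or not the signal is trapped.
            report["mixed"] = bool(ctx.flags[FloatOperation])


with audit_float_mixing() as report:
    Decimal("0.89") == 0.89

with audit_float_mixing() as clean:
    Decimal("0.89") == Decimal("0.89")

print(report["mixed"], clean["mixed"])
```

A test suite could assert that the flag stays unset around code that is supposed to be float-free, without changing what `==` returns.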

@serhiy-storchaka
Member

I cannot add anything new. @mdickinson's argument looks solid to me.


5 participants