[time] document that integer nanoseconds may be more precise than float #83665
On Windows, the timestamps produced by time.time() often end up being equal because of the 15 ms resolution:

```python
>>> time.time(), time.time()
(1580301469.6875124, 1580301469.6875124)
```

The problem I noticed is that a value produced by time_ns() might end up being higher than a value produced by time() even though time_ns() was called before:

```python
>>> a, b = time.time_ns(), time.time()
>>> a, b
(1580301619906185300, 1580301619.9061852)
>>> a / 10**9 <= b
False
```

This break in causality can lead to very obscure bugs, since timestamps are often compared to one another. Note that those timestamps can also come from non-Python sources, e.g. a C program. This problem seems to be related to the conversion performed in Python/pytime.c (lines 460 to 461 in f1c1903):
```python
# Float produced by `time.time()`
>>> b.hex()
'0x1.78c5f4cf9fef0p+30'

# Basically what `_PyTime_AsSecondsDouble` does:
>>> (float(a) / 10**9).hex()
'0x1.78c5f4cf9fef0p+30'

# What I would expect from `time.time()`
>>> (a / 10**9).hex()
'0x1.78c5f4cf9fef1p+30'
```

However, I don't know if this would be enough to fix all causality issues since, as Tim Peters noted in another thread: […]

I thought about it a bit more and I realized there is no way to recover the time in hundreds of nanoseconds from the float produced by time.time(). That means […]. If that makes sense, a possible roadmap to tackle this problem would be: […]
---

> >>> a / 10**9 <= b
> False

Try to use a/1e9 <= b.

--

The C code to get the system clock is the same for time.time() and time.time_ns(). It's only the conversion of the result which is different:

```c
static PyObject *
time_time(PyObject *self, PyObject *unused)
{
    _PyTime_t t = _PyTime_GetSystemClock();
    return _PyFloat_FromPyTime(t);
}

static PyObject *
time_time_ns(PyObject *self, PyObject *unused)
{
    _PyTime_t t = _PyTime_GetSystemClock();
    return _PyTime_AsNanosecondsObject(t);
}
```

where _PyTime_t is int64_t, a 64-bit signed integer. The conversions:

```c
static PyObject *
_PyFloat_FromPyTime(_PyTime_t t)
{
    double d = _PyTime_AsSecondsDouble(t);
    return PyFloat_FromDouble(d);
}

double
_PyTime_AsSecondsDouble(_PyTime_t t)
{
    /* volatile avoids optimization changing how numbers are rounded */
    volatile double d;

    if (t % SEC_TO_NS == 0) {
        _PyTime_t secs;
        /* Divide using integers to avoid rounding issues on the integer part.
           1e-9 cannot be stored exactly in IEEE 64-bit. */
        secs = t / SEC_TO_NS;
        d = (double)secs;
    }
    else {
        d = (double)t;
        d /= 1e9;
    }
    return d;
}

PyObject *
_PyTime_AsNanosecondsObject(_PyTime_t t)
{
    Py_BUILD_ASSERT(sizeof(long long) >= sizeof(_PyTime_t));
    return PyLong_FromLongLong((long long)t);
}
```

In short, time.time() = float(time.time_ns()) / 1e9.

--

The problem can be reproduced in Python:

```python
>>> a=1580301619906185300
>>> b=a/1e9
>>> a / 10**9 <= b
False
```

I added time.time_ns() because we lose precision if you care about nanosecond resolution with such "large" numbers. A float has a precision of around 238 nanoseconds at this magnitude:

```python
>>> import math; ulp=math.ulp(b)
>>> ulp
2.384185791015625e-07
>>> "%.0f +- %.0f" % (b*1e9, ulp*1e9)
'1580301619906185216 +- 238'
```

int/int and int/float don't give the same result:

```python
>>> a/10**9
1580301619.9061854
>>> a/1e9
1580301619.9061852
```

I'm not sure which one is "correct". To understand the issue, you can use the new math.nextafter() function to get the next floating-point value towards -inf:

```python
>>> a/10**9
1580301619.9061854
>>> a/1e9
1580301619.9061852
>>> math.nextafter(a/10**9, -math.inf)
1580301619.9061852
>>> math.nextafter(a/1e9, -math.inf)
1580301619.906185
```

Handling floating-point numbers is hard. Why not use only integers? :-)
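To make the float path concrete, here is a rough Python model of the _PyTime_AsSecondsDouble() conversion above (a sketch, assuming SEC_TO_NS = 10**9 as in Python/pytime.c):

```python
SEC_TO_NS = 10**9  # nanoseconds per second, as in Python/pytime.c

def as_seconds_double(t: int) -> float:
    """Model _PyTime_AsSecondsDouble(): nanoseconds (int) -> seconds (float)."""
    if t % SEC_TO_NS == 0:
        # Whole seconds: divide as integers so the integer part is exact.
        return float(t // SEC_TO_NS)
    # General case: round t to a double first, then divide by the double 1e9.
    return float(t) / 1e9

a = 1580301619906185300
print(as_seconds_double(a) == a / 1e9)    # True: same two-step rounding
print(as_seconds_double(a) == a / 10**9)  # False: int/int rounds only once
```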
---

Another way to understand the problem: the nanoseconds (int) => seconds (float) => nanoseconds (int) roundtrip loses precision.

```python
>>> a=1580301619906185300
>>> a/1e9*1e9
1.5803016199061852e+18
>>> b=int(a/1e9*1e9)
>>> b
1580301619906185216
>>> a - b
84
```

The best would be to add a round parameter to _PyTime_AsSecondsDouble(), but I'm not sure how to implement it. The following rounding modes are used:
_PyTime_ROUND_FLOOR is used in the time.clock_settime(), time.gmtime(), time.localtime() and time.ctime() functions, to round input arguments. time.time(), time.monotonic() and time.perf_counter() convert _PyTime_t to float using _PyTime_AsSecondsDouble() (which currently has no round parameter) for their output. See also my rejected PEP 410 ;-)

--

One way to solve this issue is to document how to compare time.time() and time.time_ns() timestamps in a reliable way.
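For example, the docs could suggest converting the integer timestamp through the same float path before comparing, so that both values suffer identical rounding (a sketch of one possible recipe using the values from this thread, not an official API):

```python
a = 1580301619906185300   # earlier time.time_ns() value
b = 1580301619.9061852    # later time.time() value

print(a / 10**9 <= b)       # False: correctly rounded division breaks causality
print(float(a) / 1e9 <= b)  # True: same rounding as time.time() restores the order
```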
---

By the way, I wrote an article about the history of how Python rounds time: https://vstinner.github.io/pytime.html I also wrote two articles about nanoseconds in Python: […]

---

Oh, and I forgot the main one: PEP 564 :-) "Example 2: compare times with different resolution" sounds like this issue: https://www.python.org/dev/peps/pep-0564/#example-2-compare-times-with-different-resolution
---

I don't think this is fixable, because it's not exactly a bug. The problem is we're running out of bits. In converting the time around, we've lost some precision. So the times that come out of time.time() and time.time_ns() should not be considered directly comparable.

Both functions, time.time() and time.time_ns(), call the same underlying function to get the current time. That function is _PyTime_GetSystemClock(); it returns nanoseconds since the 1970 epoch, stored in an int64. Each function then simply converts that time into its return format and returns that. In the case of time.time_ns(), it loses no precision whatsoever. In the case of time.time(), it (usually) converts to double and divides by 1e9, which implicitly rounds.

Back-of-the-envelope math here: an IEEE double has 53 bits of resolution for the mantissa, counting the implicit leading 1. The current time in seconds since the 1970 epoch uses about 31 of those 53 bits. That leaves 22 bits for the fractional second, but you'd need 30 bits to render all one billion fractional values. We're eight bits short.

Unless anybody has an amazing suggestion about how to ameliorate this situation, I think we should close this as wontfix.
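The bit budget is easy to check in the interpreter (a sketch using a timestamp from this thread; math.ulp() requires Python 3.9+):

```python
import math

secs = 1580301619                    # integer seconds since the epoch, early 2020
print(secs.bit_length())             # 31 -> bits used by the integer part
print(53 - secs.bit_length())        # 22 -> bits left for the fractional second
print(math.ceil(math.log2(10**9)))   # 30 -> bits needed for nanosecond resolution
print(math.ulp(float(secs)))         # 2.384185791015625e-07, i.e. ~238 ns steps
```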
---

(Oh, wow, Victor, you wrote all that while I was writing my reply. ;-)
---

Thanks for your answers, that was very informative!

Originally, I thought […]

I decided to plot the conversion errors for different division methods over a short period of time. It turns out that: […]

This result really surprised me; I have no idea what the reason behind it is. See the plots and code attached for more information. In any case, this means there is no reason to change the division in _PyTime_AsSecondsDouble().

--

As a side note, the only place I could find something similar mentioned in the docs is os.stat_result.st_ctime_ns: https://docs.python.org/3.8/library/os.html#os.stat_result.st_ctime_ns

Maybe this kind of limitation should also be mentioned in the documentation of the time functions.
---

Yeah, time.time(), time.monotonic() and time.perf_counter() could benefit from a note suggesting time.time_ns(), time.monotonic_ns() or time.perf_counter_ns() for better precision.
---

The problem is that there is a double rounding in time = float(time_ns) / 1e9. The formula time = time_ns / 10**9 may be more accurate.
---

Actually […]

Well, that seems to not be the case; see the plots and the corresponding code. I might have made a mistake though, so please let me know if I got something wrong :)
---

I'm pretty sure that in Python 3, if you say […]
---

But they are not single-digit integers. What's more, int(float(a)) != a.
---

I'm not sure which kind of problem you are trying to solve here. time.time() does lose precision because it uses the float type. Comparing time.time() and time.time_ns() is tricky because of that. If you care about nanosecond precision, avoid float whenever possible and only store time as an integer.

I'm not sure how to compare the time.time() float with time.time_ns(). Maybe math.isclose() can help.

I don't think that Python is wrong here: time.time() and time.time_ns() work as expected, and I don't think that the time.time() result can be magically more accurate: 1580301619906185300 nanoseconds (int) cannot be stored exactly as a floating-point number of seconds.

I suggest to only document that time.time() is less accurate than time.time_ns().
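A sketch of the math.isclose() idea, treating two timestamps as equal when they differ by at most one float step at this magnitude (the abs_tol choice here is an assumption, not an established recipe; math.ulp() requires Python 3.9+):

```python
import math

a = 1580301619906185300   # time.time_ns() value
b = 1580301619.9061852    # time.time() value

print(a / 10**9 == b)     # False: the two conversions round differently
print(math.isclose(a / 10**9, b, rel_tol=0.0, abs_tol=math.ulp(b)))  # True
```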
---

```python
>>> from fractions import Fraction as F
>>> 1580301619906185300/10**9
1580301619.9061854
>>> 1580301619906185300/1e9
1580301619.9061852
>>> float(F(1580301619906185300/10**9) * 10**9 - 1580301619906185300)
88.5650634765625
>>> float(F(1580301619906185300/1e9) * 10**9 - 1580301619906185300)
-149.853515625
```

1580301619906185300/10**9 is more accurate than 1580301619906185300/1e9.
---

I compare nanoseconds (int):

```python
>>> t = 1580301619906185300

# int/int: int.__truediv__(int)
>>> abs(t - int(t/10**9 * 1e9))
172
# int/float: float.__rtruediv__(int)
>>> abs(t - int(t/1e9 * 1e9))
84
# float/int: float.__truediv__(int)
>>> abs(t - int(float(t)/10**9 * 1e9))
84
# float/float: float.__truediv__(float)
>>> abs(t - int(float(t)/1e9 * 1e9))
84
```

=> int/int is less accurate than float/float for t=1580301619906185300.

You compare seconds (float/Fraction):

```python
# int / int
>>> float(F(t/10**9) * 10**9 - t)
88.5650634765625
# int / float
>>> float(F(t/1e9) * 10**9 - t)
-149.853515625
```

=> here int/int looks more accurate than int/float.

And we get different conclusions :-)
---

@serhiy.storchaka I don't know exactly what […]

```python
>>> r = 1580301619906185300
>>> int(r / 10**9 * 10**9) - r
172
>>> int(r / 1e9 * 10**9) - r
-84
```

Sounds good!
---

No, int/int is more accurate here. If a and b are ints, a / b is always correctly rounded on an IEEE 754 system, while float(a) / float(b) will not necessarily give a correctly rounded result. So for an integer a, […]
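This can be demonstrated with the fractions module, using the exact ratio as the reference (a sketch with the timestamp from this thread):

```python
from fractions import Fraction as F

a = 1580301619906185300
exact = F(a, 10**9)            # the exact ratio, as a rational number

int_int = a / 10**9            # a single, correctly rounded division
float_float = float(a) / 1e9   # two roundings: int -> float, then the division

# int/int lands at most 0.5 ULP from the exact value; float/float may not.
print(abs(F(int_int) - exact) <= abs(F(float_float) - exact))  # True
```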
---

To be clear: the following is flawed as an accuracy test, because the *multiplication* by 1e9 introduces additional error.

```python
# int/int: int.__truediv__(int)
>>> abs(t - int(t/10**9 * 1e9))
172
```

Try this instead, which uses the fractions module to get the exact error. (The error is converted to a float before printing, for convenience, to show the approximate size of the errors.)

```python
>>> from fractions import Fraction as F
>>> t = 1580301619906185300
>>> exact = F(t, 10**9)
>>> int_int = t / 10**9
>>> float_float = t / 1e9
>>> int_int_error = F(int_int) - exact
>>> float_float_error = F(float_float) - exact
>>> print(float(int_int_error))
8.85650634765625e-08
>>> print(float(float_float_error))
-1.49853515625e-07
```
---

@mark.dickinson Interesting, I completely missed that! But did you notice that the full conversion might still perform better when using only floats?

I wanted to figure out how often that happens, so I updated my plotting; you can find the code and plot attached. Notice how both methods seem to perform equally well (the difference of the absolute errors seems to average to zero). I have no idea why that happens though.
---

Should the _PyFloat_FromPyTime() implementation be modified to reuse long_true_divide()?
---

A binary float has the form (-1)**sign * (1 + frac) * 2**exp, where sign is 0 or 1, frac is a rational value in the range [0, 1), and exp is a signed integer (but stored in non-negative, biased form). The smallest nonzero value of frac is epsilon, and the smallest increment for a given power of two is thus epsilon * 2**exp.

To get exp for a given value, we have log2(abs(value)) == log2((1 + frac) * 2**exp) == log2(1 + frac) + exp. Thus exp == log2(abs(value)) - log2(1 + frac). We know log2(1 + frac) is in the range [0, 1), so exp is the floor of the log2 result.

For a binary64, epsilon is 2**-52, but we can leave it up to the floating-point implementation by using sys.float_info:

```python
>>> exp = math.floor(math.log2(time.time()))
>>> sys.float_info.epsilon * 2**exp
2.384185791015625e-07
```

Anyway, it's better to leave it to the experts:

```python
>>> t = time.time()
>>> math.nextafter(t, math.inf) - t
2.384185791015625e-07
```
---

I'm not sure what you're suggesting here. I shouldn't try to understand how floating-point numbers are stored?
---

No, that's the furthest thought from my mind. I meant only that I would not recommend using one's own understanding of floating-point numbers instead of something like math.nextafter. Even if I correctly understand the general case, there are probably corner cases that I'm not aware of.