BUG: DataFrame.min with skipna=True raises TypeError when column contains np.nan and datetime.date #61204

Open
3 tasks done
tanjt107 opened this issue Mar 31, 2025 · 2 comments
Labels: Bug; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); Reduction Operations (sum, mean, min, max, etc.)

Comments

@tanjt107

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np
import datetime

data = {
    "dates": [
        np.nan,
        np.nan,
        datetime.date(2025, 1, 3),
        datetime.date(2025, 1, 4),
    ],
}

df = pd.DataFrame(data)

df.min(axis=0)

Issue Description

Calling DataFrame.min(axis=0) with skipna=True (the default) on a column that mixes np.nan and datetime.date values raises a TypeError: the missing values are not dropped before the reduction, and a float cannot be compared with a datetime.date.


  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\test.py", line 29, in <module>
    df.min(axis=0)
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\frame.py", line 11643, in min
    result = super().min(axis, skipna, numeric_only, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\generic.py", line 12388, in min
    return self._stat_function(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\generic.py", line 12377, in _stat_function
    return self._reduce(
           ^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\frame.py", line 11562, in _reduce
    res = df._mgr.reduce(blk_func)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\internals\managers.py", line 1500, in reduce
    nbs = blk.reduce(func)
          ^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\internals\blocks.py", line 404, in reduce
    result = func(self.values)
             ^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\frame.py", line 11481, in blk_func
    return op(values, axis=axis, skipna=skipna, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\nanops.py", line 147, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\nanops.py", line 404, in new_func
    result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\pandas\core\nanops.py", line 1098, in reduction
    result = getattr(values, meth)(axis)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\45217950\Downloads\GitHub\irr-cloud\.venv\Lib\site-packages\numpy\_core\_methods.py", line 48, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial, where)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<=' not supported between instances of 'float' and 'datetime.date'

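This issue is related to issue #61187, but the specific case here involves datetime.date (not datetime.datetime). pandas keeps datetime.date values in an object-dtype column instead of converting them to datetime64, so the reduction goes through the object-dtype code path.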
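To make the difference between the two types concrete, here is a small sketch (not part of the original report) that compares the inferred dtypes and reproduces the bare comparison from the last traceback frame. The variable names are illustrative only.

```python
import datetime

import numpy as np
import pandas as pd

# datetime.date values are not converted, so the column stays object dtype.
date_col = pd.Series([np.nan, datetime.date(2025, 1, 3)])

# datetime.datetime values are converted to a datetime64 column,
# and the NaN is stored as NaT.
datetime_col = pd.Series([np.nan, datetime.datetime(2025, 1, 3)])

print(date_col.dtype)
print(datetime_col.dtype)

# The kind of comparison NumPy's object-dtype minimum ends up making.
try:
    np.nan <= datetime.date(2025, 1, 3)
    comparison_error = None
except TypeError as exc:
    comparison_error = str(exc)

print(comparison_error)
```

The datetime64 column reduces without error because NaT is handled natively, which is why only the datetime.date case reaches the failing Python-level comparison.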

Expected Behavior

dates    2025-01-03
dtype: object
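Until this is fixed, one possible workaround (a sketch, not something proposed in the report) is to drop the missing values explicitly before reducing, either on a single column or column by column through apply. The names `single` and `per_column` are illustrative.

```python
import datetime

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "dates": [
            np.nan,
            np.nan,
            datetime.date(2025, 1, 3),
            datetime.date(2025, 1, 4),
        ],
    }
)

# Drop missing values first, so only datetime.date objects are compared.
single = df["dates"].dropna().min()

# Same idea for every column of the frame.
per_column = df.apply(lambda col: col.dropna().min())

print(single)
print(per_column)
```

If the values can be represented as timestamps, converting the column with pd.to_datetime before reducing is another option, at the cost of getting Timestamp results instead of datetime.date objects.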

Installed Versions

INSTALLED VERSIONS
commit : 0691c5c
python : 3.12.7
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_United States.1252

pandas : 2.2.3
numpy : 2.2.3
pytz : 2025.1
dateutil : 2.9.0
pip : 24.2
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.5
pandas_gbq : 0.28.0
psycopg2 : None
pymysql : None
pyarrow : 19.0.1
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : 3.2.2
zstandard : None
tzdata : 2025.1
qtpy : None
pyqt5 : None


@rhshadrach
Member

Thanks for the report. This happens with any object-dtype data. I am wondering whether pandas should handle object blocks specially, filtering out the missing values instead of filling them. Further investigations and PRs to fix are welcome!
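As a rough illustration of the "filter instead of fill" idea, the sketch below masks out missing entries of an object array before taking the minimum. The helper `object_min_skipna` is hypothetical and is not pandas' internal code; returning NaN for an all-missing input is an assumption made for this example.

```python
import datetime

import numpy as np
import pandas as pd


def object_min_skipna(values: np.ndarray):
    """Hypothetical helper: minimum of an object array, ignoring missing values."""
    mask = pd.isna(values)   # True where the entry is NaN, None, NaT, or pd.NA
    kept = values[~mask]     # filter rather than fill
    if kept.size == 0:
        return np.nan        # assumption: all-missing input yields NaN
    return kept.min()


dates = np.array(
    [np.nan, np.nan, datetime.date(2025, 1, 3), datetime.date(2025, 1, 4)],
    dtype=object,
)
strings = np.array(["b", np.nan, "a"], dtype=object)

print(object_min_skipna(dates))
print(object_min_skipna(strings))
```

Because the missing entries never reach the comparison, the same approach works for any object values that are mutually comparable, not only datetime.date.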

@rhshadrach rhshadrach added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reduction Operations sum, mean, min, max, etc. and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 31, 2025
@MayurKishorKumar

take
