DOC: Specify what "non-null" means in DataFrame.info() #60802

Open
1 task done
jxu opened this issue Jan 27, 2025 · 4 comments
Labels: Docs, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)

Comments

@jxu

jxu commented Jan 27, 2025

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.info.html

Documentation problem

Non-null is not specific

Suggested fix for documentation

Link to the relevant documentation, or specify exactly what "non-null" means. In particular, for float64 columns NaN is considered "null". Does "null" also cover the missing values (pd.NA) in the nullable integer types? https://pandas.pydata.org/docs/user_guide/integer_na.html

Pandas is not consistent with its terminology of NA, NULL, and NaN.
NaN is a floating-point value, and the IEEE 754 standard does not define it as a missing-value marker.
R uses NA consistently, and SQL uses NULL consistently, with three-valued logic (3VL).
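
A minimal sketch of the behaviour in question, assuming the per-column counts printed by `DataFrame.info()` agree with `DataFrame.count()`. The column names and values are made up for illustration. Each column holds a different missing-value marker (NaN, pd.NA, None, NaT), and each is excluded from the count:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical frame: every column has exactly one missing entry,
# but each uses a different missing-value marker.
df = pd.DataFrame(
    {
        "f": pd.Series([1.0, np.nan, 3.0], dtype="float64"),  # NaN
        "i": pd.Series([1, pd.NA, 3], dtype="Int64"),         # pd.NA (nullable integer)
        "o": pd.Series(["a", None, "c"], dtype="object"),     # None
        "t": pd.Series(pd.to_datetime(["2025-01-27", None, "2025-01-29"])),  # NaT
    }
)

# count() returns the number of non-NA cells per column.
non_na_counts = df.count()

# Capture the info() report in a string instead of printing to stdout.
buf = io.StringIO()
df.info(buf=buf)
info_text = buf.getvalue()

print(info_text)
print(non_na_counts)
```

Comparing the two printouts shows whether the count reported by `info()` treats all four markers the same way `isna()` does.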

@jxu added the Docs and Needs Triage labels on Jan 27, 2025
@rhshadrach
Member

Thanks for the report. I think we should be using NA consistently throughout the docs and not null.

> Pandas is not consistent with its terminology of NA, NULL, and NaN.

This is an area being actively worked on. See #58988.

@rhshadrach added the Missing-data label and removed the Needs Triage label on Jan 29, 2025
@KevsterAmp
Contributor

> I think we should be using NA consistently throughout the docs and not null.

@rhshadrach would a PR be appropriate for this issue, i.e. replacing "non-null" in the docs above with "non-NA" or something similar?

@rhshadrach
Member

Yep, I think we should replace it with "non-NA".

@KevsterAmp
Contributor

Take
