Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: Update type hints for ExtensionArray and ExtensionDtype #39501

Closed
wants to merge 28 commits into from

Conversation

Dr-Irv
Copy link
Contributor

@Dr-Irv Dr-Irv commented Jan 31, 2021

Modifications to reduce error reports from pyright --verifytypes related to ExtensionArray and ExtensionDtype.

Some notes:

  1. Had to use Any in a number of places, because an ExtensionArray can contain any type
  2. Because numpy doesn't have a released version with types, pyright complains whenever we work with numpy types, so for type checking purposes, I redefine np.dtype and np.ndarray to a dummy type. When numpy 1.20 is released, we can most likely remove that. Alternatively, I can find out if there is a way to see if pyright has something like TYPE_CHECKING to bracket those changes.

I anticipate these changes might be controversial, so constructive criticism is welcome!

@MarcoGorelli
Copy link
Member

MarcoGorelli commented Jan 31, 2021

-        method: Optional[Literal["backfill", "bfill", "ffill", "pad"]] = None,
+        method: Optional[Literal[backfill, bfill, ffill, pad]] = None,

Looks like a bug in no-string-hints, sorry about that, will test + update!


Have updated and pushed a new tag, the version in pandas' pre-commit-config will get updated tomorrow morning as part of the scheduled weekly update

@Dr-Irv Dr-Irv changed the title Update type hints for ExtensionArray and ExtensionDtype TYP: Update type hints for ExtensionArray and ExtensionDtype Jan 31, 2021
@simonjayhawkins
Copy link
Member

2. When numpy 1.20 is released

@Dr-Irv can you update

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Feb 1, 2021

  1. When numpy 1.20 is released

@Dr-Irv can you update

Ha! It was just released in pypl on 1/30/21, which is when I pushed this! Waiting for it to be on conda and then will update.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Feb 1, 2021

  1. When numpy 1.20 is released

@Dr-Irv can you update

I did try this, and now mypy pandas reports more errors because numpy 1.20 has typing. #39513

I'm asking the pyright people to see if there is a different workaround. Ideally, I'd like to tell pyright "process pandas, but ignore numpy"

@jreback jreback added ExtensionArray Extending pandas with custom dtypes or arrays. Typing type annotations, mypy/pyright type checking labels Feb 2, 2021
@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Feb 2, 2021

I'm asking the pyright people to see if there is a different workaround. Ideally, I'd like to tell pyright "process pandas, but ignore numpy"

You can now use pyright to find pandas symbols that are missing types by using pyright --verifytypes pandas! (The ! means don't look at imported libraries). That means it won't look at the numpy imports. Based on that (and including this PR), we have 75.7% type completeness for pandas.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Feb 3, 2021

Looks like a bug in no-string-hints, sorry about that, will test + update!

Have updated and pushed a new tag, the version in pandas' pre-commit-config will get updated tomorrow morning as part of the scheduled weekly update

@MarcoGorelli Has this been done yet? I'm still getting the pre-commit failure

@MarcoGorelli
Copy link
Member

MarcoGorelli commented Feb 3, 2021

Looks like a bug in no-string-hints, sorry about that, will test + update!
Have updated and pushed a new tag, the version in pandas' pre-commit-config will get updated tomorrow morning as part of the scheduled weekly update

@MarcoGorelli Has this been done yet? I'm still getting the pre-commit failure

if you approve/merge #39524 and then rebase/merge, then it'll pass

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2021

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Mar 6, 2021
@Dr-Irv Dr-Irv removed the Stale label Mar 8, 2021
@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 12, 2021

@simonjayhawkins Updated the PR based on your latest stuff with typing, so now pandas/core/arrays/base.py is fully typed and that allowed me to remove a bunch of # type; ignore things. Main changes from last review is an overload for astype and fixing isin. Looking forward to your review.

@simonjayhawkins
Copy link
Member

Thanks @Dr-Irv for working on this.

I would imagine that this PR will be easier to check if we limit the changes to pandas/core/arrays/base.py and the other changes are done in a separate PR(s).

There is a conflict here and I think some other overlaps with other PRs removing the ignores.

some of the changes to remove ignores need more discussion than others. If ignores are being removed that's great, if ignores or changed or others added we need to review in more depth.

@simonjayhawkins
Copy link
Member

I'll push a commit merging master.

some new mypy errors and now also another conflict from a subsequent merge to master related to #39501 (comment)

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 13, 2021

I would imagine that this PR will be easier to check if we limit the changes to pandas/core/arrays/base.py and the other changes are done in a separate PR(s).

I'm not sure that it can be done alone, because pandas/core/arrays/base.py has the base ExtensionArray class, so you have to fix things upstream when fixing things there.

There is a conflict here and I think some other overlaps with other PRs removing the ignores.

some of the changes to remove ignores need more discussion than others. If ignores are being removed that's great, if ignores or changed or others added we need to review in more depth.

I removed ignores only based on when mypy said they were not needed any more. I don't think I added any ignores.

Let me get the current version in a state that goes green.

@simonjayhawkins
Copy link
Member

I'm not sure that it can be done alone, because pandas/core/arrays/base.py has the base ExtensionArray class, so you have to fix things upstream when fixing things there.

I think that while numpy types resolved to Any this PR adding types to pandas/core/arrays/base.py was moreless ready.

all the other changes are related to numpy types now having some meaning. I'm sure these could be done separately (and in smaller chunks)

changes to 36 files takes some review, especially as changes are made as other PRs are merged.

typing PRs need to be atomic to be easily reviewed.

Main changes from last review is an overload for astype and fixing isin. Looking forward to your review.

so maybe a PR for the astype fixes and a PR for the isin fixes?

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 13, 2021

all the other changes are related to numpy types now having some meaning. I'm sure these could be done separately (and in smaller chunks)

If I limit things to pandas/core/arrays/base.py, I'm pretty sure I can't get it to be green. Is that OK?

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 13, 2021

@simonjayhawkins Latest commit 5bb24d4 still has a variety of changes. It just takes what you merged in and should make it mypy happy.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 13, 2021

so maybe a PR for the astype fixes and a PR for the isin fixes?

I'm going to create a separate PR that is effectively this one, MINUS the astype and isin.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 14, 2021

Things that have been approved/merged

Awaiting approval/merges:

Things to do in pandas/core/arrays.base.py and waiting approval

  • arguments for isin
  • to_numpy

@jbrockmendel
Copy link
Member

Things that can be done independent of approval of #40421

Looks like you've made some progress on that checklist. Is this PR still active?

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Mar 28, 2021

Things that can be done independent of approval of #40421

Looks like you've made some progress on that checklist. Is this PR still active?

I will change it to draft status, as I'm using it to track the changes I would do after #40421 is approved. Then when I finish the checklist, I'll close this one.

@Dr-Irv Dr-Irv marked this pull request as draft March 28, 2021 16:19
@simonjayhawkins
Copy link
Member

I will change it to draft status, as I'm using it to track the changes I would do after #40421 is approved. Then when I finish the checklist, I'll close this one.

Is this still the master PR? if so could you merge master so we can monitor progress. and more easily see the outstanding review comments.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented May 23, 2021

I will change it to draft status, as I'm using it to track the changes I would do after #40421 is approved. Then when I finish the checklist, I'll close this one.

Is this still the master PR? if so could you merge master so we can monitor progress. and more easily see the outstanding review comments.

Yes. But I'd prefer to do the merge once #41203, #41251, #41258 and #41513 are merged in so we can see where we are with respect to the comments.

@simonjayhawkins
Copy link
Member

ok. will close for now. reopen when ready.

@Dr-Irv Dr-Irv deleted the extensiontyping branch February 13, 2023 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants