-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Contributing Guide for Type Hints #27050
Conversation
c: Union[int, float] | ||
) -> float: | ||
|
||
When dealing with parameters with a default argument of ``None``, you should not use ``Optional`` as this will be inferred by the static type checker. So instead of: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback drawing your attention here as you might disagree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback we are going with Optional
or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up.
|
||
.. code-block:: python | ||
|
||
def some_func(a: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should make style recommendations that aren't enforced by a linter,
Black formats this as
def some_func(a: str, b: float, c: Union[int, float]) -> float:
pass
And when a wrap is required
def some_func(
a_long_name_really_long_name: str,
b_long_name: float,
c_long_name: Union[int, float],
) -> float:
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know. Haven't used that yet (have seen other issue, will look into it) but what you are saying makes sense.
@topper-123 I think you suggested something like this as well so FYI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it fits within 79 chars, I don't the point to wrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +1 on requiring multiline defs to be wrapped in the style guide. It makes for much clearer code.
OTOH, if we go with Black, such choices will be made for us?
Pandas-specific Types | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Commonly used types specific to *pandas* will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small list of the types of things found here may be helpful (ArrayLike, Dtype, etc.)
I think inlining an example here would be very helpful; early on, contributors may be adding to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes adding stuff in _typing.py
sometimes confuses me. Some example will be nice.
|
||
from typing import Union | ||
|
||
maybe_primes = [] # type: Union[int, None] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it should be
maybe_primes = [] # type: Union[int, None] | |
maybe_primes = [] # type: List[Union[int, None]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea great catch!
Codecov Report
@@ Coverage Diff @@
## master #27050 +/- ##
==========================================
- Coverage 92.01% 92% -0.01%
==========================================
Files 180 180
Lines 50754 50754
==========================================
- Hits 46699 46695 -4
- Misses 4055 4059 +4
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #27050 +/- ##
==========================================
+ Coverage 91.86% 92.03% +0.17%
==========================================
Files 180 180
Lines 50792 50714 -78
==========================================
+ Hits 46658 46675 +17
+ Misses 4134 4039 -95
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, some comments
Syntax Requirements | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this comment lower, e.g. show the common / easy case first
|
||
maybe_primes = [] # type: List[Optional[int]] | ||
|
||
When dealing with parameters with a default argument of ``None``, you should not use ``Optional`` as this will be inferred by the static type checker. So instead of: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a code check for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to revert this section. Mypy supports this by default but it appears to be only for backwards compatibility and is considered a "mistake" in PEP 484
So separately can look into enabling --no-implicit-optional
to see what breaks
|
||
Commonly used types specific to *pandas* will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas. | ||
|
||
For example, quite a few functions in *pandas* accept a ``dtype`` argument. This can be expressed as a string like ``"object"``, a numpy.dtype like ``np.int64`` or even a pandas ``ExtensionDtype`` like ``pd.CategoricalDtype``. Rather than burden the user with having to constantly annotate all of those options, this can simply be imported and reused from the pandas._typing module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use a ref to numpy.dtype
Pandas-specific Types | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Commonly used types specific to *pandas* will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put a reference to pandas._typing : https://github.com/pandas-dev/pandas/blob/master/pandas/_typing.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. @jorisvandenbossche comments?
if you can add in a followup (or this one ok too), how to deal with circular imports; pretty much use the string form of the import; also indicate that we generally need to import fully qualified, e.g. from pandas.core.frame import DataFrame
Would prefer that piece as follow up. I have somewhat of a concern that that might not actually the be correct solution as it is not suggested by MyPy (refs like that only suggested within one module). It might just be replacing those with Any I could be wrong but would like more time to investigate |
lgtm. @jorisvandenbossche any comments? |
Yea that’s correct - something we don’t control going forward
…Sent from my iPhone
On Jun 30, 2019, at 1:11 AM, topper-123 ***@***.***> wrote:
@topper-123 commented on this pull request.
In doc/source/development/contributing.rst:
> +
+ from typing import Optional
+
+ maybe_primes = [] # type: Optional[int]
+
+If a function accepts multiple arguments, every parameter should appear on a separate line. So rather than
+
+.. code-block:: python
+
+ def some_func(a: str, b: float, c: Union[int, float]) -> float:
+
+The preferred style would be
+
+.. code-block:: python
+
+ def some_func(a: str,
I'm +1 on requiring multiline defs to be wrapped in the style guide. It makes for much clearer code.
OTOH, if we go with Black, such choices will be made for us?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
|
||
.. code-block:: shell | ||
|
||
mypy pandas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also type check single files or submodules? (for quicker development turnover, if you are trying out type checking whole pandas takes a while)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could but not generically useful as mypy doggedly follows all imports so wouldn't necessarily save much time:
https://mypy.readthedocs.io/en/latest/running_mypy.html#following-imports
If type checking speed is a concern the suggested approach is to use a daemon:
https://mypy.readthedocs.io/en/latest/mypy_daemon.html#mypy-daemon-mypy-server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche yes you can do something like this: mypy pandas/core/something.py
, or mypy pandas/core/generic
It saves a bit of time but not much, but adding how to do that in the contributing guide might not be a bad idea. Personally when I am working with typing I run mypy for just the single file I am working on. For whole project I run it only before committing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t plan on adding this - it’s not value added to do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that's true, someone can easily figure out the command for a single file/folder by making a wise guess or going to mypy docs.
Co-Authored-By: Joris Van den Bossche <jorisvandenbossche@gmail.com>
class SomeClass2: | ||
str = None # type: str_type | ||
|
||
In some cases you may be tempted to use ``cast`` from the typing module when you know better than the analyzer. This occurs particularly when using custom inference functions. For example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche let me know if this helps with understanding of cast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it also an option to just "leave it" for some cases? Or does mypy error if it cannot infer a type?
Because I have the feeling we have a lot of such cases, we use those is_..
idioms almost everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand what you mean by “leave it” but once the annotation is added in the signature here any op which assumes a string but which Mypy can’t narrow inference down to will raise (here it would say something like int/float has no attribute “upper”)
For sure though I think we will have a few places in the code base where cast would be required, at least unless the referenced Mypy enhancement is implemented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I basically meant "leave it untyped" (as you can also leave complete functions untyped). So but you answered: once you type the signature of a function, mypy needs to understand the full body of that function.
I think we will have a few places in the code base where cast would be required
I don't have the feeling it is "a few". We do such is_
calls much more than actual isinstance checks.
at least unless the referenced Mypy enhancement is implemented
There doesn't seem to be much movement in that issue at the moment?
@jorisvandenbossche do you have any other edits you want to make to this? Would like to get this in dev guide for users to see |
Ping @jorisvandenbossche |
Let's merge this today and do the rest as followups? I suspect this is an improvement from master. |
Style Guidelines | ||
~~~~~~~~~~~~~~~~ | ||
|
||
Types imports should follow the ``from typing import ...`` convention. So rather than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could add a codecheck for this?
Merging as is from last comment. Anything else here can be done as follow ups |
Thanks @WillAyd |
Want to get this documented as we've been picking up momentum with type hints
@toobaz @TomAugspurger @vaibhavhrt @gwrome @makbigc