Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STYLE "wrongplaced whitespace" rule doesn't error on its example of a bad string #40873

Closed
MarcoGorelli opened this issue Apr 11, 2021 · 5 comments · Fixed by #40906
Closed
Labels
Code Style Code style, linting, code_checks
Milestone

Comments

@MarcoGorelli
Copy link
Member

MarcoGorelli commented Apr 11, 2021

The docstring reads

For example, this is bad:
>>> rule = (
... "We want the space at the begging of "
... "the line if the previous line is ending with a \n "
... "not at the end, like always"
... )

yet this file doesn't error if the script is run on it:

$ cat t.py 
rule = (
   "We want the space at the begging of "
   "the line if the previous line is ending with a \n "
   "not at the end, like always"
)
$ python scripts/validate_unwanted_patterns.py -vt 'strings_with_wrong_placed_whitespace' t.py 

Some other files which don't error, though I can't understand why:

$ cat t.py 
rule = (
   "  We want the space at the begging of"
   " the line if the previous line is ending with a \n "
   "not at the end, like always"
)
$ python scripts/validate_unwanted_patterns.py -vt 'strings_with_wrong_placed_whitespace' t.py 
$ cat t.py 
rule = (
   "We want the space at the begging of"
   " the line if the previous line is ending with a  "
   "not at the end, like always"
)
$ python scripts/validate_unwanted_patterns.py -vt 'strings_with_wrong_placed_whitespace' t.py

Hi @ShaharNaveh @datapythonista - do you have any context you could provide on the trailing newline rule? I'm struggling to make sense of it - thanks :)

@MarcoGorelli MarcoGorelli added the Code Style Code style, linting, code_checks label Apr 11, 2021
@MarcoGorelli
Copy link
Member Author

What I don't understand is the logic behind the lines

elif first_line.startswith(" ") or second_line.startswith(" "):
return False
elif first_line.endswith(" ") or second_line.endswith(" "):
return False

Why should they return False? I can't find examples in the codebase of where they would be hit anyway:

$ git diff
diff --git a/scripts/validate_unwanted_patterns.py b/scripts/validate_unwanted_patterns.py
index b6b038ae9d..369039de2c 100755
--- a/scripts/validate_unwanted_patterns.py
+++ b/scripts/validate_unwanted_patterns.py
@@ -370,10 +370,6 @@ def strings_with_wrong_placed_whitespace(
         """
         if first_line.endswith(r"\n"):
             return False
-        elif first_line.startswith("  ") or second_line.startswith("  "):
-            return False
-        elif first_line.endswith("  ") or second_line.endswith("  "):
-            return False
         elif (not first_line.endswith(" ")) and second_line.startswith(" "):
             return True
         return False
$ pre-commit run unwanted-patterns-strings-with-wrong-placed-whitespace --all-files
Check for strings with wrong placed spaces...............................Passed

@datapythonista
Copy link
Member

The idea is that on multiline strings we always keep the spaces from a split in the same position (at the end).

foo = ('this is '
       'what we want')
bar = ('and this'
       ' is wrong')

Then, I think there were cases with multiple spaces, like for indenting, inside the string. So we were checking that a single space existed, otherwise we consider the case ok. I don't remember about the line breaks.

I thought these rules were well tested. I'm in my phone now, can't easily check, but aren't there tests that clarify what is doing with examples and make sure this works as expected?

@MarcoGorelli
Copy link
Member Author

Thanks @datapythonista !

The tests miss quite a few lines:

except ValueError:
return 0

if first_line.endswith(r"\n"):
return False

elif first_line.endswith(" ") or second_line.endswith(" "):
return False

Also, this test:

msg = (
"foo"
" bar"
"baz"
)

hits the lines

elif first_line.startswith(" ") or second_line.startswith(" "):
return False

though none of the pandas codebase does.

I get that if the first string ends with '\n', then we don't want an extra space after it. What's not clear to me is these lines:

elif first_line.startswith(" ") or second_line.startswith(" "):
return False
elif first_line.endswith(" ") or second_line.endswith(" "):
return False

Seeing as nothing in the pandas codebase hits them, I'd be inclined to remove them

@datapythonista
Copy link
Member

Seems to me that these two last elif should be just one checking the end of the first line, and the start of the second. Doesn't make much sense this way.

But in any case, that was added for some case that wasn't working with the simple validation of if the space is not at the end of the first line instead of at the beginning of the second, then fail. So, if there is nothing in our code now that fails with that simple condition, agree that better to remove.

@ShaharNaveh
Copy link
Member

Seems to me that these two last elif should be just one checking the end of the first line, and the start of the second. Doesn't make much sense this way.

IIRC the reason for those two elif was to be as explicit as possible, in order to cover as much cases as possible I guess it slipped from me that they are not really useful 😅

But in any case, that was added for some case that wasn't working with the simple validation of if the space is not at the end of the first line instead of at the beginning of the second, then fail. So, if there is nothing in our code now that fails with that simple condition, agree that better to remove.

IIRC they did catch something in the codebase back then, but that was a year ago, I'm not really sure.
Any way, I think it would be better to remove them, as long as they don't seems to be needed. You can always add them again whenever you want.

@jreback jreback added this to the 1.3 milestone Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants