ENH: Deprecate literal json string input to read_json
#52271
Comments
Agreed that being stricter here by only accepting a "file like object" is reasonable.
xref #5924 can be closed as "no" if/when we restrict this consistently.
Thanks for that cross-ref, to broaden the scope, here's an audit of the current state of play:
So some, but not all, of the textual formats support reading literally, some support reading bytes as well as strings, and some of the binary formats support reading literal bytes. If deprecating literal input to read_json I would suggest it makes sense to therefore also do so for:
I see in the past there's been some discussion about introducing utility
Yeah makes sense to have API consistency among the other read functions too
I would opt for not introducing those methods personally
take
@mroeschke Do you think it would make sense to change the name of the path_or_buf parameter for I feel like path_or_buf insinuates that you can pass a string-type argument that represents a file path.
I think
Thanks @rmhowe425! I opened #53767 to track deprecation of literal input for the remaining IO routines that support it.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
As seen in #29102 (the rejected #29104), and then #46718, determining user intent in user input from pd.read_json(some_string) is in general not possible. #46718 is a halfway house in that it explicitly marks some "file extensions" as "you probably wanted to read from a file", but is easily defeated by, for example, pd.read_json("missing.jsonl", lines=True) (jsonl being a common extension for "lines"-formatted json files).
AFAICT, read_json is the only read_XXX function that accepts a literal representation of the data in its path_or_buf argument, so there doesn't seem to be a great deal of precedent here.
Feature Description
Deprecate literal json input to pd.read_json; if one wants to read from a string, it should be wrapped in a StringIO.
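A minimal sketch of the proposed contract, using only the standard library rather than pandas itself. The helper name load_json is hypothetical and is not part of pandas. It assumes that any str is a file path and that data is only ever read from file-like objects, so literal JSON has to be wrapped in io.StringIO and a mistyped path fails loudly instead of being guessed at.

```python
import io
import json
import os


def load_json(path_or_buf):
    """Hypothetical strict reader: a str is always a path, never literal JSON."""
    if isinstance(path_or_buf, (str, os.PathLike)):
        # No guessing about intent: a path that does not exist raises
        # FileNotFoundError instead of being parsed as JSON text.
        with open(path_or_buf, encoding="utf-8") as handle:
            return json.load(handle)
    if hasattr(path_or_buf, "read"):
        # File-like objects (open files, io.StringIO, ...) are read directly.
        return json.load(path_or_buf)
    raise TypeError("expected a path or a file-like object")


some_string = '{"a": [1, 2], "b": [3, 4]}'

# Literal data is passed explicitly as a buffer.
data = load_json(io.StringIO(some_string))

# A bare string is unambiguously a path, so a missing file is an error.
try:
    load_json("missing.jsonl")
    outcome = "parsed"
except FileNotFoundError:
    outcome = "file not found"
```

With pandas, the corresponding call would be pd.read_json(StringIO(some_string)), with StringIO imported from io.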
Alternative Solutions
None
Additional Context
No response