ENH: Replacing behavior currently provided by pandas.to_numeric using errors="ignore" #59221
Comments
Upvote for this issue.
@mroeschke this looks like a legitimate use case of the |
Thanks @snitish.
Based on some initial tests with Thus I add a third suggestion: If an error occurs, it keeps the original string for that entry. |
This would work for us and is the behavior we rely on in our use case. For reference, the use case is a .csv file that serves as a controlled vocabulary for defined terms, mapping between fields, values, and UI display terms. It needs to be centralized in a single file because the actual values and UI meanings are reviewed and maintained by SMEs rather than developers. The code in question creates a dict that splits this apart, and could potentially be handled with an if/then, since within a given field things are either all numeric or all not. But "if this is numeric, make it numeric, and if not, leave it as a string" is the intended behavior as currently written. Current function in use:
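The commenter's function is not reproduced here. As a minimal sketch of the column-level behavior described above (a field is either all numeric or all text), one option is to catch the conversion error explicitly. The helper name `numeric_if_possible` and the sample values are hypothetical, not taken from the thread.

```python
import pandas as pd


def numeric_if_possible(column: pd.Series) -> pd.Series:
    """Return the column as numeric if every entry parses, else unchanged."""
    try:
        # The default errors="raise" fails on the first unparseable entry.
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        # At least one entry is not a number: keep the original strings.
        return column


# Hypothetical fields from a controlled-vocabulary file.
codes = numeric_if_possible(pd.Series(["1", "2", "3"], dtype=object))
labels = numeric_if_possible(pd.Series(["low", "medium", "high"], dtype=object))
```

This only fits data where each column is homogeneous; a column that mixes numbers and text is returned entirely as strings.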
Based on the comment by @mepearson, I add our example here.
To complicate things, the cells can also be empty.
It would be nice if these empty cells remain interpreted as |
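For the case with empty cells, a possible sketch is to coerce the column and then restore the original entry wherever coercion failed. The CSV content and the column names `name` and `value` are made up for illustration. It assumes the column is read as `object`, so that `read_csv` leaves empty cells as missing values.

```python
import io

import pandas as pd

# Hypothetical file: one number, one empty cell, one file path.
csv_text = "name,value\na,1.5\nb,\nc,lists/values.txt\n"
df = pd.read_csv(io.StringIO(csv_text), dtype={"value": object})

# Unparseable entries and empty cells both become NaN here.
numeric = pd.to_numeric(df["value"], errors="coerce")

# Keep the parsed number where there is one; otherwise fall back to the
# original entry (the path string, or the missing value for an empty cell).
df["value"] = numeric.astype(object).where(numeric.notna(), df["value"])
```

In this sketch the empty cell stays missing, the number becomes a float, and the path remains a string, so the column keeps the `object` dtype.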
Thanks for the suggestion, but I would be -1 on modifying Although it's not "clean" just to do
This behavior was intentionally deprecated in pandas 2.x to avoid potentially returning |
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
As a user, I am reading in CSV files with mixed types in certain columns.
Specifically, a column can hold either a value or a file path to a list of values.
With all current quoting flags, pd.read_csv turns this into an object column in which the floats and ints are stored as strings as well.
Until now, turning all numbers into numerics while keeping all file paths as strings was easily done as follows:
df[column_name] = df[column_name].apply(_pd.to_numeric, errors="ignore")
Since the solution above is being deprecated, I wish pandas provided a simple way to get object columns with mixed dtypes from CSV files, in which numbers are treated as numerics.
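Until something like this exists, a user-side replacement for the element-wise `errors="ignore"` call could look roughly like the sketch below. The helper `to_number_or_keep` and the column name `setting` are hypothetical; only `pd.to_numeric` with its default `errors="raise"` is used from pandas.

```python
import pandas as pd


def to_number_or_keep(value):
    """Return value as a number if pandas can parse it, otherwise unchanged."""
    try:
        return pd.to_numeric(value)
    except (ValueError, TypeError):
        # Not parseable as a number (e.g. a file path): keep it as-is.
        return value


# Hypothetical mixed column: numbers stored as strings, plus a file path.
df = pd.DataFrame({"setting": ["1", "2.5", "lists/values.txt"]}, dtype=object)
df["setting"] = df["setting"].apply(to_number_or_keep)
```

Calling the helper per element is slower than a vectorized conversion, but it mirrors the original one-liner and leaves each unparseable entry untouched.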
Feature Description
usage:
pd.to_numeric(preserve_text=True)
implementation
Alternative Solutions
usage:
csv_read(to_numeric_in_mixed_columns=True, preserve_text_in_mixed_columns=True)
Additional Context
#54467
#43280
pypest/pyemu#485
The proposed workaround in the pypest issue is not satisfactory as this should be a common occurrence with a common solution.