Reproducible Example
# Unfortunately, `urllib.request.urlretrieve(..)` is producing a 403 response
# when I attempt to automate the file download. So, you will need to manually
# download and unzip the data file from the following URL:
#
# https://www.federalreserve.gov/econres/files/scf2019x.zip
#
import pandas as pd

# "./scf2019x" is the file that was extracted from "scf2019x.zip"
everything = pd.read_sas("./scf2019x", format='xport')
df = everything[['X3509']]

# Confirm that `0.0` never occurs
print(sum(df['X3509'] == 0.0))

# Demonstrate that `5.397605e-79` has several occurrences
df[df['X3509'] < 0.0001]
After you unzip the file, you can recreate the issue as follows:
import pandas as pd

# "./scf2019x" is the file that was extracted from "scf2019x.zip"
everything = pd.read_sas("./scf2019x", format='xport')
df = everything[['X3509']]

# Confirm that `0.0` never occurs
print(sum(df['X3509'] == 0.0))

# Demonstrate that `5.397605e-79` has several occurrences
df[df['X3509'] < 0.0001]
Note that it is unlikely that the issue is with truncated values (values stored with fewer than 8 bytes in the XPORT file), since the behavior is also observed with this transport file from the CDC NHANES data for 2021, which only has full 8-byte values: https://wwwn.cdc.gov/Nchs/Data/Nhanes/Public/2021/DataFiles/DPQ_L.xpt
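One observation that may help narrow this down: `5.397605e-79` matches `16**-65`, which is the smallest positive normalized value in the IBM hexadecimal floating-point format that XPORT files use for numerics. The sketch below is a pure-Python reference decoder for a single 8-byte IBM double. It is not pandas' implementation, and the function name `ibm_double_to_float` is hypothetical. It only shows which bit pattern decodes to that value; whether the affected files actually contain that pattern, or the value arises during pandas' conversion, is not established here.

```python
def ibm_double_to_float(raw: bytes) -> float:
    """Decode one 8-byte big-endian IBM hexadecimal double (illustrative helper)."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    sign = -1.0 if raw[0] & 0x80 else 1.0
    # The low 7 bits of the first byte hold a base-16 exponent in excess-64 form.
    exponent = (raw[0] & 0x7F) - 64
    # The remaining 7 bytes are a fraction of 14 hex digits after the radix point.
    fraction = int.from_bytes(raw[1:], "big") / 16 ** 14
    return sign * fraction * 16.0 ** exponent


# All-zero bytes decode to a true zero.
true_zero = ibm_double_to_float(bytes(8))

# Exponent byte 0x00 with leading fraction digit 1: (1/16) * 16**-64 == 16**-65.
smallest_normalized = ibm_double_to_float(bytes.fromhex("0010000000000000"))

# Sanity check of the decoder: 0x41 0x10 ... is 1.0 in this format.
one = ibm_double_to_float(bytes.fromhex("4110000000000000"))

print(true_zero)
print(f"{smallest_normalized:.6e}")
print(one)
```

If the raw 8 bytes behind an affected `X3509` value can be pulled out of the file, passing them through a decoder like this would show whether the file stores a true zero or the `00 10 00 ...` pattern.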
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Issue Description
(Note: this issue was originally reported in #30051.)
Occasionally, `pandas.read_sas(..)` interprets some "zero values" as `5.397605e-79`. This issue can also be observed in the 2019 Survey of Consumer Finances when using `pandas==1.5.2`: https://www.federalreserve.gov/econres/files/scf2019x.zip
After you unzip the file, you can recreate the issue as follows:
This produces the following output:
According to the survey's documentation, these values were all intended to be equal to `0`.
Expected Behavior
These values should all be equal to `0.0`.
Installed Versions