Add origin parameter to Timestamp/to_datetime epoch support. #11745
Comments
closes pandas-dev#11276 closes pandas-dev#11745 closes pandas-dev#11470
@jtkiley Is |
It does not (pandas 0.19.1). Below, see one of my Stata datasets (originally written by R). In this example, I've formatted the date like my original post but without the epoch adjustment. I have some type stuff going on here that I'd fix in a real project, but you can see that the separate year variable is 10 years off from the date.
Here's the same data in Stata, also without adjustment, though I formatted the date using
|
@jtkiley Could you post a small DTA that demonstrates this issue? |
@jtkiley Any chance for sharing a DTA with this issue? |
Sorry for the delay. Here's one that I reduced down (columns and rows) to what you see above. It was originally written by R and then reduced and saved using Stata. It continues to exhibit this issue. It's also zipped to make GitHub happy. |
I can't reproduce it. When I use
which is identical to what Stata shows. When I convert date to a data column in Stata using
which seems to be correct. |
@bashtage Right. The problem is when you convert the epoch time using |
@jtkiley I see. I thought it was a bug in |
@bashtage That makes sense. I was thinking of a I often see it when moving data around or pulling it from sources that have a Stata export option, and those often don't come with the date formatting intact. I tend to use those export options (often with Stata for co-author accessibility), assemble data in pandas (R in the past), and then export it in Stata format for sharing and analysis. |
It looks like #11470 has the |
I don't see a strong reason to allow arbitrary offsets in the Stata interface code. The present version is very loyal to the Stata dta format spec and allowing a semi-random option to be internalized rather than chained seems like the wrong way to do things. I suppose without explicit support one would have to do something like
Maybe there would be an easier way to re-originate existing date-times. |
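One way to re-originate existing date-times, sketched here under stated assumptions rather than taken from the thread: if day counts stored relative to 1960-01-01 were decoded against the Unix epoch, subtracting the gap between the two epochs moves them back. The `days` values and the variable names below are illustrative only.

```python
import pandas as pd

# Hypothetical day counts as Stata/SAS store them: days since 1960-01-01.
days = pd.Series([0, 1, 20000])

# Decoding against the default Unix epoch (1970-01-01) lands about
# ten years too late.
wrong = pd.to_datetime(days, unit="D")

# Gap between the two epochs: ten years including three leap days.
epoch_shift = pd.Timestamp("1970-01-01") - pd.Timestamp("1960-01-01")

# Re-originate the already-decoded datetimes onto the 1960 epoch.
fixed = wrong - epoch_shift
```

Subtracting a single `Timedelta` keeps the operation vectorized and leaves the Stata reader itself untouched, which matches the preference for chaining over adding an option to the reader.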
@bashtage makes sense. |
When using SAS or Stata data, dates are represented as the number of days since 1/1/1960, and other statistical software uses different origin dates. With that in mind, it would be nice to have an origin date that can be specified. See also #3969.
It's a relatively simple thing, and not hard to work around, of course. However, I end up dealing with date formatting on just about every data set I import, and I imagine that lots of others do, too.
Currently, I do something like this:
In R, the as.Date() function takes an origin parameter for numeric types (see the manual). So, in R, the date part would simply be:
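A pandas counterpart might look like the following minimal sketch. It assumes a pandas version in which `to_datetime` accepts the `origin` parameter requested here (0.20 or later); the `days` values are made up for illustration.

```python
import pandas as pd

# Hypothetical day counts relative to the SAS/Stata epoch of 1960-01-01.
days = pd.Series([0, 366, 21915])

# unit="D" treats the numbers as days; origin sets the reference date
# they are counted from instead of the default Unix epoch.
dates = pd.to_datetime(days, unit="D", origin=pd.Timestamp("1960-01-01"))
```

With the origin stated explicitly, no manual offset arithmetic is needed after decoding.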