-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEP Remove pytorch from environment.yml #49798
Conversation
@mroeschke looks like you'd added Would you be OK with this change? |
eac9c92
to
2f68a4f
Compare
@@ -69,7 +69,6 @@ dependencies: | |||
- pandas-datareader | |||
- pyyaml | |||
- py | |||
- pytorch | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you comment it out with a note?
.github/workflows/code-checks.yml
Outdated
@@ -114,6 +114,9 @@ jobs: | |||
with: | |||
fetch-depth: 0 | |||
|
|||
- name: Uncomment pytorch dependency | |||
run: sed -i 's/# - pytorch/- pytorch/g' environment.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed. asv pulls its dependencies into a separate environment based off of asv.conf.json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't it does for this job because there's
pandas/.github/workflows/code-checks.yml
Line 131 in 0286130
asv run --quick --dry-run --strict --durations=30 --python=same |
In the CI run of the commit which doesn't have this, the torch ASV check is skipped: https://github.com/pandas-dev/pandas/actions/runs/3507106640/jobs/5874585410
2022-11-20T08:22:57.2883873Z [ 7.85%] ··· frame_ctor.From3rdParty.time_from_torch n/a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the concern of pytorch being a heavy dependency, but I am also not the biggest fan of this patchy workaround.
In essence the root motivation of this seems related to having more targeted dependency files instead of an environmental.yml
monolith. #48828
Unless there's an urgent need to remove this, I'd recommend working toward #48828 (e.g. a conda lock file for docs/required deps test/optional deps test/ etc) instead as it could also help out CI environments too
Thanks for taking a look As far as I understand, a lock file would speed up the solving of the environment, but wouldn't affect its size What I'm trying to tackle here is the size of the environment, which I think is bigger than that of practically any other open source project If you're -1 on commenting and uncomment the pytorch dependency, would you be OK with having a separate environment for for asv checks (which would contain pytorch) and removing pytorch from environment.yml? |
f77d4da
to
95740a2
Compare
I've tried this again, making a separate environment for the Like this:
Not urgent, but the PyDataGlobal sprints are coming soon (early December), and I'd be pretty keen to remove 2GB of dependencies which are irrelevant to 99.9% of contributions (probably not an exaggeration) by then |
If python=same is the problem, can we just remove that part of the asv command, so that asv can create its own environment? Otherwise, we wouldn't have coverage for that in the CI(I've definitely forgot to update Cython before in the asv.conf.json and CI didn't fail). |
Yeah that's a good point - I tried it https://github.com/pandas-dev/pandas/actions/runs/3512450800/jobs/5884163088 but got:
with no further info. Any ideas? Looking further into this, debugging here #49814 |
d6f36c9
to
65675ac
Compare
I would generally be OK with splitting up
then an environment can be assembled based on install 1 or more of the dependency files depending on what is worked on. (Right, conda-lock doesn't necessarily simplify the env size, but it greatly reduces the solve speed which can also be a limiting factor) |
🥳 🎉 finally got it, just needed to remove the So, now, we have:
@mroeschke would you be OK with this solution? (if not, no worries, hopefully there's something else we could work towards) |
I think this is OK. That's a pretty big dependency to have, and doesn't seem like it is even tested in our CI? |
I'd be okay removing pytorch (potentially for now) by:
My main (stickler of a) motivation is that when I did a lot of dependency cleanup a while back for the CI, local dependencies in |
Not sure I follow, wouldn't count as a more targeted environment file? Because then I've gone with your suggestion anyway |
Not sure if I completely follow now. That file only sets up the environment for ASV IIUC. If |
environment.yml
Outdated
@@ -69,7 +69,7 @@ dependencies: | |||
- pandas-datareader | |||
- pyyaml | |||
- py | |||
- pytorch | |||
# - pytorch https://github.com/pandas-dev/pandas/pull/49798 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the pytorch test and ASV are removed, I think it's okay to remove this too
my bad, I was only thinking about the ASV |
thanks for your review / approval! merging then |
* no torch * set matplotlib to minimum version * take mpl pin out again * use pip for matplotlib * remove default channel * no pin needed if no default channel? * turns out we gotta pin * 🔥 remove torch * undo asv.conf.json change * comment out pytorch * remove commented-out torch Co-authored-by: MarcoGorelli <>
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.