-
Notifications
You must be signed in to change notification settings - Fork 132
Update repository to be compatible with numpy 2.0, refactor non-core functionality into optional dependencies, update to PEP 621 and PEP 660 #477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ort-safe without GPy; replace deprecated numpy/scipy constants; add optional extras and markers; remove internal ticket comments
…oE deps to core - Add sklearn & notebooks pytest markers and annotate related tests - Complete GPy gating (all tests use importorskip + marker) - Simplify BenchmarkPlot (matplotlib now mandatory) - Move PyDOE & sobol_seq to core requirements; adjust extras - Add pybnn/sklearn/notebooks markers to setup.cfg - Fix NumPy 2.0 deprecations (np.int -> int) in examples - Adjust parameter_space test to avoid NumPy 2 recursion - Add timeline file documenting tickets 001-003,007,014,021-023
…o pip install with docs extra
…ional-gpy-gating-cleanup
…\nAdd new extras: bnn (pybnn+torch), sklearn, examples bundle, expand full meta extra. Document usage in README and installation.rst. Unify Bohamiann import guidance and improve torch ImportError wording. Add ipykernel gating for notebook integration tests and explicitly set kernel. Minor test adjustments (import GPy explicitly, relax gradient tolerance). Simplify numpy requirement to avoid premature 2.x pin conflicts.
…rsion and setup.py shim
|
On my goodness, @AlejandroGomezFrieiro this is an absolutely monumental PR right there! Alongside you also took care of so many things we never got around to, like isolating GPy. A million thanks for this work! How shall we proceed in reviewing? I am happy to review the entire PR in one big and long sitting, or can provide incremental feedback as I move along. To be honest, at the first glance this looks fantastic and probably won't require a ton of iterations. |
|
@apaleyes Whatever works best for you. Full disclosure: copilot was used for the refactor, but I own all the changes so let me know of there's any part that you'd like to improve upon. It seems that one notebook might not be passing tests, and not sure about the examples but I think some of them were already not working fully before this work. But at least all other unit and integration tests (for the core and full installations) are passing at this point. |
|
@apaleyes any progress on the reviewing? Is there anything I can do to help move things forward? |
|
apologies, i started but did not make the time yet to finish the review |
apaleyes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all fantastic work, LLM or not LLM, thanks a lot for this! Left many comments below, hopefully that's understandable given the size of the PR.
I was delighted to see how few changes to the actual library were, numpy did great job with API compatibility it seems
| @@ -1,50 +1,9 @@ | |||
| # Copyright 2020-2024 The Emukit Authors. All Rights Reserved. | |||
| # SPDX-License-Identifier: Apache-2.0 | |||
| # Legacy shim retained for backward compatibility. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling this will stay for a while. Let's keep this new version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So can I interpret from this that it's OK to remove the old setup.py?
| gpy = ["GPy>=1.13.0"] | ||
| bnn = ["pybnn>=0.0.5", "torch"] | ||
| sklearn = ["scikit-learn"] | ||
| examples = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pretty sure these extras can reference each other, e.g.
examples = [
emukit[gpy,bnn,sklearn],
...
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's true, I set it now to use the correct extras referencing each other
pyproject.toml
Outdated
| "pytest-cov>=2.5.1", | ||
| "mock>=2.0.0", | ||
| ] | ||
| # Convenience aggregate identical to previous 'full' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what this comment says
also of note that full contains stuff like jupyter, black, isort etc - something other extras don't need. I would suggest to call this one dev to signify this is something most likely only developers would need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, now it's dev. Still not 100% happy with the state of the optional dependencies, but necessary it seems...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this proves too clunky we can merge them in later versions
pyproject.toml
Outdated
| description = "Toolkit for decision making under uncertainty." | ||
| readme = "README.md" | ||
| license = { file = "LICENSE" } | ||
| requires-python = ">=3.9" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's update to 3.10, 3.9 reached end of life last month
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, good catch!
CONTRIBUTING.md
Outdated
| ``` | ||
| isort . | ||
| black . | ||
| # Or run only on changed files via pre-commit if configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this line, we don't have pre-commit configured
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| from typing import List, Tuple, Type | ||
|
|
||
| import GPy | ||
| if importlib.util.find_spec("GPy") is None: # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
| from .linear_model import GPyLinearMultiFidelityModel # noqa: F401 | ||
| from .non_linear_multi_fidelity_model import NonLinearMultiFidelityModel # noqa: F401 | ||
| else: | ||
| class _GPyMissingBase: # pragma: no cover - exercised in minimal installs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pure magic to me. What is going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, if GPy is installed we import as normal, and otherwise we import dummy classes with the same names that throw an import error when we try to initialize it. This way we can use the __all__ syntax, and be able to actually use the core API and import as usual, but also pointing users to check the emukit[gpy] install through the custom error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat! can you add all this as a comment to the code please, here and in other places with similar logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments to guarded imports in all relevant __init__.py
| self.x_axis = x_axis_metric_name | ||
|
|
||
| def make_plot(self, log_y: bool = False) -> None: | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the docstring gone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely accident, will restore this and the previous one as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now
| x_axis_metric_name: str = None, | ||
| metrics_to_plot: List[str] = None, | ||
| ): | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, why is this deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now
Yeah, of course! I now pushed changes and answered most if not all of the comments. Thanks for the extensive review, let's polish these last couple leftover things |
|
Thanks a lot @AlejandroGomezFrieiro ! This all looks good to me, don't think any more work is necessary. I will be unavailable for next 10 days or so, whenever I am back I will take another good long look at this PR, and hopefully merge this. It's a pretty massive change in the right direction, so I'd like to ensure we are not missing anything, perhaps give this branch a small local test etc. Thanks again for your incredible contribution! |
|
How's the testing going? It is now starting to become a bit pressing on our end to get the new release out just so we can update our code to numpy 2.0 Do you think it'll be possible to finish the review process and release the new version within the next couple weeks (the sooner the better on our end tbh) |
|
@AlejandroGomezFrieiro i am really sorry, it is incredibly hard to find time to give this proper attention. I do hope to be able to find a breather this or next week, but as all side projects go I cannot give any promises. Your frustration is totally understandable, and I hate the be the bottleneck, but cannot give a better answer atm. |
|
@apaleyes no need to apologise, we are the ones that are time constrained. We can temporarily use my local fork until then |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #477 +/- ##
===========================================
- Coverage 89.09% 55.64% -33.46%
===========================================
Files 137 137
Lines 4842 4845 +3
Branches 547 479 -68
===========================================
- Hits 4314 2696 -1618
- Misses 403 2084 +1681
+ Partials 125 65 -60 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
apaleyes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @AlejandroGomezFrieiro for this massive step forward for emukit! Would not have happened with you.
I've made several adjustments to the PR, hope this is fine. Watch out for 0.5 version, coming in your nearest PyPI mirror soon.
emukitis currently not compatible withnumpy 2.0, and withnumpy 1.26end-of-life last month it should be good to let users update tonumpy2.0 when using non-GPy models. For this purpose, I have attempted to refactor the repository. I like the philosophy of being model-agnostic, and so would like to see the package move in that direction. Also modernizes certain things such as using project metadata inpyproject.toml.More importantly, it separates specific examples from core functionality so that the repository can be more easily maintained and used without the issues that older and ill-maintained packages can bring (such as
pybnnwhich was last updated years ago...). Attempted to streamline it through optional dependencies, and if one installsfullit should be backwards compatible. However, core defaultemukitinstallation (throughpip install emukit) should be lean.Would like to understand what is the way forward and if such a change makes sense for the repository. Ping @apaleyes since afaik they are the maintainer. This alleviates the complications caused by SheffieldML/GPy#1112, since it seems both
paramzandGPymight take a while to upgrade, likely sometime next year.In any case, we might keep a fork to allow cleaner dependencies for our work with the package (which we really like, btw! Awesome job on the concept) if the PR does not go through.
Summary:
• Refactors GPy-dependent code to allow core Emukit usage without GPy installed; raises clear errors only when needed. This improves upon the philosophy of being model-agnostic and bring your own model.
• Introduces optional dependency groups (e.g., full, bnn, sklearn, examples) for more flexible installation and modern dependencies (for example, can bring numpy 2.0 without GPy)
• Adds and documents pytest markers for optional dependencies; splits CI to test core and GPy features separately.
• Updates documentation (README, installation, CHANGELOG) to clarify installation and usage.
• Minor test and code cleanups for compatibility and maintainability. Default installation works with numpy>=2.0, while full should be backwards compatible with numpy 1.26-ish.
• Packaging modernization: adopts PEP 621 project metadata in pyproject.toml (dynamic version from
emukit.__version__), enabling standard editable installs (pip install -e .[...]) via PEP 660.• CI updated to use
pip install -e .[tests](and [tests,gpy]) instead of layering multiple requirements files (Would like to make sure what is being executed on CI pipelines).• CHANGELOG, CONTRIBUTING, installation docs, and README extended to explain optional-based workflow and transition period for legacy requirements/ files (we can also work on removing the legacy stuff in the same PR if that is deemed necessary!)
Motivation:
• Simplifies installation for users not needing all features, keeps core features separated from model-specific wrappers and examples.
• Improves test reliability and developer experience.
• Makes dependency management and documentation clearer.
• Aligns project with modern Python packaging standards (PEP 621/660) for better tooling interoperability and clearer metadata.
Notes:
• Users needing GPy features and wrappers should install with
pip install emukit[gpy]oremukit[full].emukit[full]should be more or less the same as the current main, while coreemukitinstall is lean and compatible with current numpy and scipy.• Editable development now uses:
pip install -e .[tests](add other extras as needed).By submitting this pull request, I confirm that my contribution is made under Apache 2.0