Skip to content

Conversation

@AlejandroGomezFrieiro
Copy link
Contributor

@AlejandroGomezFrieiro AlejandroGomezFrieiro commented Oct 15, 2025

emukit is currently not compatible with numpy 2.0, and with numpy 1.26 end-of-life last month it should be good to let users update to numpy 2.0 when using non-GPy models. For this purpose, I have attempted to refactor the repository. I like the philosophy of being model-agnostic, and so would like to see the package move in that direction. Also modernizes certain things such as using project metadata in pyproject.toml.

More importantly, it separates specific examples from core functionality so that the repository can be more easily maintained and used without the issues that older and ill-maintained packages can bring (such as pybnn which was last updated years ago...). Attempted to streamline it through optional dependencies, and if one installs full it should be backwards compatible. However, core default emukit installation (through pip install emukit) should be lean.

Would like to understand what is the way forward and if such a change makes sense for the repository. Ping @apaleyes since afaik they are the maintainer. This alleviates the complications caused by SheffieldML/GPy#1112, since it seems both paramz and GPy might take a while to upgrade, likely sometime next year.

In any case, we might keep a fork to allow cleaner dependencies for our work with the package (which we really like, btw! Awesome job on the concept) if the PR does not go through.

Summary:

• Refactors GPy-dependent code to allow core Emukit usage without GPy installed; raises clear errors only when needed. This improves upon the philosophy of being model-agnostic and bring your own model.
• Introduces optional dependency groups (e.g., full, bnn, sklearn, examples) for more flexible installation and modern dependencies (for example, can bring numpy 2.0 without GPy)
• Adds and documents pytest markers for optional dependencies; splits CI to test core and GPy features separately.
• Updates documentation (README, installation, CHANGELOG) to clarify installation and usage.
• Minor test and code cleanups for compatibility and maintainability. Default installation works with numpy>=2.0, while full should be backwards compatible with numpy 1.26-ish.
• Packaging modernization: adopts PEP 621 project metadata in pyproject.toml (dynamic version from emukit.__version__), enabling standard editable installs (pip install -e .[...]) via PEP 660.
• CI updated to use pip install -e .[tests] (and [tests,gpy]) instead of layering multiple requirements files (Would like to make sure what is being executed on CI pipelines).
• CHANGELOG, CONTRIBUTING, installation docs, and README extended to explain optional-based workflow and transition period for legacy requirements/ files (we can also work on removing the legacy stuff in the same PR if that is deemed necessary!)

Motivation:

• Simplifies installation for users not needing all features, keeps core features separated from model-specific wrappers and examples.
• Improves test reliability and developer experience.
• Makes dependency management and documentation clearer.
• Aligns project with modern Python packaging standards (PEP 621/660) for better tooling interoperability and clearer metadata.

Notes:

• Users needing GPy features and wrappers should install with pip install emukit[gpy] or emukit[full]. emukit[full] should be more or less the same as the current main, while core emukit install is lean and compatible with current numpy and scipy.
• Editable development now uses: pip install -e .[tests] (add other extras as needed).

By submitting this pull request, I confirm that my contribution is made under Apache 2.0

…ort-safe without GPy; replace deprecated numpy/scipy constants; add optional extras and markers; remove internal ticket comments
…oE deps to core

- Add sklearn & notebooks pytest markers and annotate related tests
- Complete GPy gating (all tests use importorskip + marker)
- Simplify BenchmarkPlot (matplotlib now mandatory)
- Move PyDOE & sobol_seq to core requirements; adjust extras
- Add pybnn/sklearn/notebooks markers to setup.cfg
- Fix NumPy 2.0 deprecations (np.int -> int) in examples
- Adjust parameter_space test to avoid NumPy 2 recursion
- Add timeline file documenting tickets 001-003,007,014,021-023
…\nAdd new extras: bnn (pybnn+torch), sklearn, examples bundle, expand full meta extra. Document usage in README and installation.rst. Unify Bohamiann import guidance and improve torch ImportError wording. Add ipykernel gating for notebook integration tests and explicitly set kernel. Minor test adjustments (import GPy explicitly, relax gradient tolerance). Simplify numpy requirement to avoid premature 2.x pin conflicts.
@apaleyes
Copy link
Collaborator

On my goodness, @AlejandroGomezFrieiro this is an absolutely monumental PR right there! Alongside you also took care of so many things we never got around to, like isolating GPy. A million thanks for this work!

How shall we proceed in reviewing? I am happy to review the entire PR in one big and long sitting, or can provide incremental feedback as I move along. To be honest, at the first glance this looks fantastic and probably won't require a ton of iterations.

@apaleyes apaleyes self-requested a review October 17, 2025 16:12
@AlejandroGomezFrieiro
Copy link
Contributor Author

@apaleyes Whatever works best for you. Full disclosure: copilot was used for the refactor, but I own all the changes so let me know of there's any part that you'd like to improve upon.

It seems that one notebook might not be passing tests, and not sure about the examples but I think some of them were already not working fully before this work. But at least all other unit and integration tests (for the core and full installations) are passing at this point.

@AlejandroGomezFrieiro
Copy link
Contributor Author

@apaleyes any progress on the reviewing? Is there anything I can do to help move things forward?

@apaleyes
Copy link
Collaborator

apologies, i started but did not make the time yet to finish the review

Copy link
Collaborator

@apaleyes apaleyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all fantastic work, LLM or not LLM, thanks a lot for this! Left many comments below, hopefully that's understandable given the size of the PR.

I was delighted to see how few changes to the actual library were, numpy did great job with API compatibility it seems

@@ -1,50 +1,9 @@
# Copyright 2020-2024 The Emukit Authors. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
# Legacy shim retained for backward compatibility.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling this will stay for a while. Let's keep this new version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So can I interpret from this that it's OK to remove the old setup.py?

gpy = ["GPy>=1.13.0"]
bnn = ["pybnn>=0.0.5", "torch"]
sklearn = ["scikit-learn"]
examples = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty sure these extras can reference each other, e.g.

examples = [
    emukit[gpy,bnn,sklearn],
    ...
]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's true, I set it now to use the correct extras referencing each other

pyproject.toml Outdated
"pytest-cov>=2.5.1",
"mock>=2.0.0",
]
# Convenience aggregate identical to previous 'full'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what this comment says

also of note that full contains stuff like jupyter, black, isort etc - something other extras don't need. I would suggest to call this one dev to signify this is something most likely only developers would need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, now it's dev. Still not 100% happy with the state of the optional dependencies, but necessary it seems...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this proves too clunky we can merge them in later versions

pyproject.toml Outdated
description = "Toolkit for decision making under uncertainty."
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.9"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's update to 3.10, 3.9 reached end of life last month

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, good catch!

CONTRIBUTING.md Outdated
```
isort .
black .
# Or run only on changed files via pre-commit if configured.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove this line, we don't have pre-commit configured

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

from typing import List, Tuple, Type

import GPy
if importlib.util.find_spec("GPy") is None: # pragma: no cover
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

from .linear_model import GPyLinearMultiFidelityModel # noqa: F401
from .non_linear_multi_fidelity_model import NonLinearMultiFidelityModel # noqa: F401
else:
class _GPyMissingBase: # pragma: no cover - exercised in minimal installs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pure magic to me. What is going on here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, if GPy is installed we import as normal, and otherwise we import dummy classes with the same names that throw an import error when we try to initialize it. This way we can use the __all__ syntax, and be able to actually use the core API and import as usual, but also pointing users to check the emukit[gpy] install through the custom error message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat! can you add all this as a comment to the code please, here and in other places with similar logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments to guarded imports in all relevant __init__.py

self.x_axis = x_axis_metric_name

def make_plot(self, log_y: bool = False) -> None:
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the docstring gone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely accident, will restore this and the previous one as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now

x_axis_metric_name: str = None,
metrics_to_plot: List[str] = None,
):
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, why is this deleted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now

@AlejandroGomezFrieiro
Copy link
Contributor Author

This is all fantastic work, LLM or not LLM, thanks a lot for this! Left many comments below, hopefully that's understandable given the size of the PR.

I was delighted to see how few changes to the actual library were, numpy did great job with API compatibility it seems

Yeah, of course! I now pushed changes and answered most if not all of the comments. Thanks for the extensive review, let's polish these last couple leftover things

@apaleyes
Copy link
Collaborator

Thanks a lot @AlejandroGomezFrieiro ! This all looks good to me, don't think any more work is necessary.

I will be unavailable for next 10 days or so, whenever I am back I will take another good long look at this PR, and hopefully merge this. It's a pretty massive change in the right direction, so I'd like to ensure we are not missing anything, perhaps give this branch a small local test etc.

Thanks again for your incredible contribution!

@AlejandroGomezFrieiro
Copy link
Contributor Author

How's the testing going? It is now starting to become a bit pressing on our end to get the new release out just so we can update our code to numpy 2.0

Do you think it'll be possible to finish the review process and release the new version within the next couple weeks (the sooner the better on our end tbh)

@apaleyes
Copy link
Collaborator

apaleyes commented Dec 2, 2025

@AlejandroGomezFrieiro i am really sorry, it is incredibly hard to find time to give this proper attention. I do hope to be able to find a breather this or next week, but as all side projects go I cannot give any promises. Your frustration is totally understandable, and I hate the be the bottleneck, but cannot give a better answer atm.

@AlejandroGomezFrieiro
Copy link
Contributor Author

@apaleyes no need to apologise, we are the ones that are time constrained. We can temporarily use my local fork until then

@codecov
Copy link

codecov bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 51.16279% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.64%. Comparing base (97188a9) to head (6d8e7e0).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
...yesian_optimization/acquisitions/entropy_search.py 0.00% 5 Missing ⚠️
...t/benchmarking/loop_benchmarking/benchmark_plot.py 60.00% 2 Missing ⚠️
emukit/model_wrappers/__init__.py 66.66% 1 Missing and 1 partial ⚠️
emukit/model_wrappers/gpy_model_wrappers.py 0.00% 2 Missing ⚠️
emukit/model_wrappers/gpy_quadrature_wrappers.py 0.00% 2 Missing ⚠️
emukit/multi_fidelity/models/__init__.py 60.00% 2 Missing ⚠️
emukit/multi_fidelity/models/linear_model.py 0.00% 2 Missing ⚠️
...fidelity/models/non_linear_multi_fidelity_model.py 0.00% 2 Missing ⚠️
...imization/acquisitions/max_value_entropy_search.py 66.66% 1 Missing ⚠️
emukit/multi_fidelity/kernels/__init__.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #477       +/-   ##
===========================================
- Coverage   89.09%   55.64%   -33.46%     
===========================================
  Files         137      137               
  Lines        4842     4845        +3     
  Branches      547      479       -68     
===========================================
- Hits         4314     2696     -1618     
- Misses        403     2084     +1681     
+ Partials      125       65       -60     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@apaleyes apaleyes marked this pull request as ready for review December 19, 2025 12:24
Copy link
Collaborator

@apaleyes apaleyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @AlejandroGomezFrieiro for this massive step forward for emukit! Would not have happened with you.

I've made several adjustments to the PR, hope this is fine. Watch out for 0.5 version, coming in your nearest PyPI mirror soon.

@apaleyes apaleyes merged commit 374ebb5 into EmuKit:main Dec 19, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants