Inference_stories
US Model Inference Job: Define a job for generating batch predictions from a registered model.
classDiagram
class InferenceJob {
+KIND: T.Literal["InferenceJob"] = "InferenceJob"
+inputs: datasets.ReaderKind
+outputs: datasets.WriterKind
+alias_or_version: str | int = "Champion"
+loader: registries.LoaderKind = CustomLoader()
+run() base.Locals
}
class Job {
<<abstract>>
+run()* base.Locals
}
InferenceJob --|> Job : inherits
class ReaderKind {
<<type>>
Reader
}
class WriterKind {
<<type>>
Writer
}
class LoaderKind {
<<type>>
Loader
}
class CustomLoader {
+load(uri: str) Any
}
CustomLoader --|> LoaderKind : implements
InferenceJob --> ReaderKind : "uses"
InferenceJob --> WriterKind : "uses"
InferenceJob --> LoaderKind : "uses"
Title:
As a data scientist, I want to configure an inference job that specifies the necessary parameters for generating predictions, so that batch predictions can be effectively processed.
Description:
The InferenceJob class configures the job through parameters such as the input data reader, the output data writer, the model alias or version, and the loader used to access the model.
Acceptance Criteria:
- The job is initialized with the necessary parameters.
- Default values are properly handled for optional fields.
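The configuration described in this story can be sketched as follows. This is a minimal illustration using standard-library dataclasses, not the project's implementation: `Reader`, `Writer`, and the `path` fields are hypothetical stand-ins for `datasets.ReaderKind`, `datasets.WriterKind`, and whatever parameters the real kinds accept. Only the field names and defaults come from the class diagram.

```python
from dataclasses import dataclass, field
from typing import Literal, Union


@dataclass
class Reader:
    """Hypothetical stand-in for datasets.ReaderKind."""
    path: str


@dataclass
class Writer:
    """Hypothetical stand-in for datasets.WriterKind."""
    path: str


@dataclass
class CustomLoader:
    """Hypothetical stand-in for registries.CustomLoader."""
    KIND: Literal["CustomLoader"] = "CustomLoader"


@dataclass
class InferenceJob:
    # Required fields: they have no default and must be supplied.
    inputs: Reader
    outputs: Writer
    # Optional fields, with the defaults shown in the class diagram.
    alias_or_version: Union[str, int] = "Champion"
    loader: CustomLoader = field(default_factory=CustomLoader)
    KIND: Literal["InferenceJob"] = "InferenceJob"


# Defaults apply when the optional fields are omitted.
job = InferenceJob(inputs=Reader("inputs.csv"), outputs=Writer("outputs.csv"))

# An integer pins a specific model version instead of an alias.
pinned = InferenceJob(
    inputs=Reader("inputs.csv"), outputs=Writer("outputs.csv"), alias_or_version=3
)
```

Using `default_factory` for the loader gives each job its own loader instance rather than one shared default object.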
Title:
As a data engineer, I want to read input data from specified sources, so that the model can generate predictions based on these inputs.
Description:
In the run method, the input data is read with the configured reader and validated against the expected schema before it is passed to the model.
Acceptance Criteria:
- The job successfully reads input data using the configured reader.
- Input data is validated and conforms to the expected schema.
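The read-then-validate step could look roughly like the sketch below. It assumes CSV input and a two-column numeric schema; both are illustrative assumptions, as are the names `EXPECTED_COLUMNS`, `read_rows`, and `validate_rows`. The source does not specify the file format or the schema.

```python
import csv
import io
from typing import Dict, Iterable, List

# Hypothetical schema: column name -> type each value must convert to.
EXPECTED_COLUMNS = {"feature_a": float, "feature_b": float}


def read_rows(handle: Iterable[str]) -> List[Dict[str, str]]:
    """Read every CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(handle))


def validate_rows(rows: List[Dict[str, str]]) -> List[Dict[str, float]]:
    """Check each row against EXPECTED_COLUMNS and convert its values."""
    validated = []
    for index, row in enumerate(rows):
        missing = EXPECTED_COLUMNS.keys() - row.keys()
        if missing:
            raise ValueError(f"row {index}: missing columns {sorted(missing)}")
        try:
            validated.append(
                {name: cast(row[name]) for name, cast in EXPECTED_COLUMNS.items()}
            )
        except (TypeError, ValueError) as error:
            # TypeError covers short rows, where DictReader fills in None.
            raise ValueError(f"row {index}: {error}") from error
    return validated


source = io.StringIO("feature_a,feature_b\n1.5,2\n")
inputs = validate_rows(read_rows(source))
```

Failing fast on the first invalid row keeps malformed data from reaching the model.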
Title:
As a data scientist, I want to load the registered model from the model registry, so that I can use it to generate predictions on the input data.
Description:
The job uses the configured loader to access the specified version or alias of the model from the registry.
Acceptance Criteria:
- The model is loaded correctly from the registry using the provided loader.
- The model instance must be ready for making predictions.
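One way to turn `alias_or_version` into something a loader can resolve is sketched below. The source does not say which registry is used, so the URI layout here (in the style of MLflow's `models:/` scheme) is an assumption, and `uri_for_model`, `InMemoryLoader`, and the model name `demo` are hypothetical. Only the `load(uri: str)` signature comes from the class diagram.

```python
from typing import Any, Dict, Union


def uri_for_model(name: str, alias_or_version: Union[str, int]) -> str:
    """Build a registry URI; an int is a version, a str is an alias."""
    if isinstance(alias_or_version, int):
        return f"models:/{name}/{alias_or_version}"
    return f"models:/{name}@{alias_or_version}"


class InMemoryLoader:
    """Hypothetical loader backed by a dict instead of a real registry."""

    def __init__(self, registry: Dict[str, Any]) -> None:
        self.registry = registry

    def load(self, uri: str) -> Any:
        try:
            return self.registry[uri]
        except KeyError:
            raise LookupError(f"no model registered at {uri}") from None


champion_uri = uri_for_model("demo", "Champion")
loader = InMemoryLoader({champion_uri: "model-object"})
model = loader.load(uri=champion_uri)
```

Raising a descriptive `LookupError` for an unknown alias or version makes a misconfigured job fail before any prediction is attempted.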
Title:
As a data scientist, I want to generate predictions using the loaded model and the input data, so that I can evaluate the model's performance on new data points.
Description:
The job leverages the model's predict method to produce output predictions based on the input data.
Acceptance Criteria:
- Predictions are generated using the loaded model and validated input data.
- The output of the predictions is in a usable format for further processing.
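The prediction step can be illustrated with a toy model. `MeanModel` and `generate_predictions` are hypothetical names used only for this sketch; the single assumption taken from the story is that the loaded model exposes a `predict` method. The length check is one possible reading of "usable format", not a stated requirement.

```python
from typing import Dict, List, Protocol, Sequence


class Model(Protocol):
    """Anything with a predict method that returns one value per input row."""

    def predict(self, inputs: Sequence[Dict[str, float]]) -> List[float]: ...


class MeanModel:
    """Toy model: predicts the mean of each row's feature values."""

    def predict(self, inputs: Sequence[Dict[str, float]]) -> List[float]:
        return [sum(row.values()) / len(row) for row in inputs]


def generate_predictions(
    model: Model, inputs: Sequence[Dict[str, float]]
) -> List[Dict[str, float]]:
    """Run the model and return one labelled record per input row."""
    predictions = model.predict(inputs)
    if len(predictions) != len(inputs):
        raise ValueError("model returned a different number of rows than it was given")
    return [{"row": index, "prediction": value} for index, value in enumerate(predictions)]


outputs = generate_predictions(MeanModel(), [{"a": 1.0, "b": 3.0}, {"a": 2.0, "b": 2.0}])
```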
Title:
As a data engineer, I want to write the generated predictions to a specified data output, so that results can be stored and retrieved later.
Description:
The job takes the prediction outputs and writes them to the designated storage using the configured writer.
Acceptance Criteria:
- The predictions are successfully written to the specified output using the writer.
- The method of storage should ensure data integrity.
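The story does not say how data integrity is ensured. One common approach, sketched here as an assumption, is to write to a temporary file in the target directory and rename it into place with `os.replace`, so readers never see a half-written file. The function name and the CSV format are illustrative.

```python
import csv
import os
import tempfile
from typing import Dict, List, Sequence


def write_rows_atomically(
    rows: List[Dict[str, float]], fieldnames: Sequence[str], path: str
) -> None:
    """Write rows as CSV to a temp file, then rename it over the target path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one volume.
    descriptor, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(descriptor, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(fieldnames))
            writer.writeheader()
            writer.writerows(rows)
        os.replace(temp_path, path)
    except BaseException:
        # Remove the partial temp file and leave any existing target untouched.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

For example, `write_rows_atomically([{"prediction": 0.5}], ["prediction"], "outputs.csv")` either produces a complete `outputs.csv` or leaves the previous file as it was.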
Title:
As a user, I want to be notified when the inference job is finished, along with the shape of the output data, so that I can review the results promptly.
Description:
At the end of the job execution, notifications are sent to relevant stakeholders summarizing the outcome, including the shape of the predictions.
Acceptance Criteria:
- Notifications include job completion details, specifically the shape of the outputs.
- The alerts service successfully informs users about job completion status.
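A completion notification carrying the output shape might be built as in this sketch. `RecordingAlerts` is a hypothetical test double standing in for the alerts service, whose real interface the source does not show; the title, message wording, and `shape_of` helper are likewise assumptions.

```python
from typing import Dict, List, Tuple


def shape_of(rows: List[Dict[str, float]]) -> Tuple[int, int]:
    """Return (row count, column count) for a list of records."""
    return (len(rows), len(rows[0]) if rows else 0)


class RecordingAlerts:
    """Hypothetical alerts service that records notifications instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def notify(self, title: str, message: str) -> None:
        self.sent.append((title, message))


def notify_finished(alerts: RecordingAlerts, outputs: List[Dict[str, float]]) -> None:
    """Send one notification summarising the finished job."""
    alerts.notify(
        title="Inference Job Finished",
        message=f"Outputs Shape: {shape_of(outputs)}",
    )


alerts = RecordingAlerts()
notify_finished(alerts, [{"prediction": 0.1}, {"prediction": 0.9}])
```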
Implementation Requirements:
- The InferenceJob class correctly implements the abstract run method from the base Job class.
- All necessary services (logging, model registry, alerts) are initialized at the start of the inference job.
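The relationship between `Job` and `InferenceJob` can be sketched with `abc`. Starting services in `__enter__` is an assumption about how "initialized at the start" might be realised, and only logging is shown; the model registry and alerts services are omitted. `Locals` stands in for `base.Locals`, and the hard-coded outputs are a placeholder.

```python
import abc
import logging
from typing import Any, Dict

Locals = Dict[str, Any]  # stand-in for base.Locals


class Job(abc.ABC):
    """Abstract job: services start on entry, run() does the work."""

    def __enter__(self) -> "Job":
        self.logger = logging.getLogger(type(self).__name__)
        self.logger.info("starting services")
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.logger.info("stopping services")
        return False  # never swallow exceptions raised inside the with block

    @abc.abstractmethod
    def run(self) -> Locals:
        """Execute the job and return its local variables."""


class InferenceJob(Job):
    def run(self) -> Locals:
        outputs = [0.1, 0.9]  # placeholder for real predictions
        return locals()


with InferenceJob() as job:
    result = job.run()
```

Because `run` is abstract, instantiating `Job` directly raises `TypeError`, which catches a subclass that forgets to implement it.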
Error Handling:
- Clear error messages are logged for any issues encountered during the reading, loading, or writing processes.
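A small wrapper is one way to get a clear log entry for each failing stage. The sketch below uses the standard `logging` module; `run_step` and the logger name `inference` are hypothetical.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("inference")


def run_step(name: str, step: Callable[..., Any], *args: Any) -> Any:
    """Run one stage of the job, logging the stage name and traceback on failure."""
    try:
        return step(*args)
    except Exception:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("inference job failed during %s", name)
        raise
```

A call such as `run_step("reading", reader.read)` keeps the original exception intact for the caller while recording which stage failed.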
Testing:
- Unit tests validate job initialization, data reading, model loading, prediction generation, and output writing.
- Tests ensure that errors in processes trigger appropriate logging and notifications.
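Tests along these lines could use simple fakes in place of real readers, models, and writers. Everything in this sketch (`run_inference`, the fake classes, the test names) is hypothetical and only illustrates the shape such tests might take with `unittest`.

```python
import unittest
from typing import List


class FakeReader:
    def read(self) -> List[float]:
        return [1.0, 2.0]


class FailingReader:
    def read(self) -> List[float]:
        raise OSError("input not found")


class DoublingModel:
    def predict(self, inputs: List[float]) -> List[float]:
        return [2 * value for value in inputs]


class FakeWriter:
    def __init__(self) -> None:
        self.written: List[float] = []

    def write(self, data: List[float]) -> None:
        self.written = list(data)


def run_inference(reader, model, writer) -> List[float]:
    """Minimal read -> predict -> write pipeline under test."""
    outputs = model.predict(reader.read())
    writer.write(outputs)
    return outputs


class InferenceJobTests(unittest.TestCase):
    def test_predictions_are_written(self) -> None:
        writer = FakeWriter()
        outputs = run_inference(FakeReader(), DoublingModel(), writer)
        self.assertEqual(outputs, [2.0, 4.0])
        self.assertEqual(writer.written, [2.0, 4.0])

    def test_reader_failure_writes_nothing(self) -> None:
        writer = FakeWriter()
        with self.assertRaises(OSError):
            run_inference(FailingReader(), DoublingModel(), writer)
        self.assertEqual(writer.written, [])
```

Saved in a test module, these cases can be run with `python -m unittest`.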
Documentation:
- The InferenceJob class and each of its methods should have clear docstrings with examples.
- Users should be guided on how to configure and use the inference job.
- The InferenceJob class is fully implemented and tests pass all acceptance criteria.
- The functionality includes reading inputs, loading models, generating predictions, writing outputs, and notifying users.
- The documentation is complete and well-structured for ease of understanding.