Open
Description
Summary
Add an abstractive summarization step so each article gets a short “TL;DR.”
Motivation
- Helps end-users absorb long articles quickly.
- Demonstrates multi-document summarization (e.g. daily digest).
Scope
None
Acceptance Criteria
- summarize(text) produces a concise summary (<200 chars)
- summarize_task(article_id) saves "summary" to the article record
- CLI summarize command runs without errors and prints confirmation
- Tests pass in CI and README clearly describes both execution paths
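One way the first criterion could be met is sketched below. It is a minimal illustration, not the agreed implementation: `clip_to_limit`, `_get_pipeline`, and the `MAX_CHARS` constant are hypothetical names, the generation parameters are placeholders, and the sketch assumes an installed `transformers` version that still provides the `"summarization"` pipeline with its default model. The model is loaded lazily so that importing the module stays cheap.

```python
from functools import lru_cache

MAX_CHARS = 199  # "<200 chars" from the acceptance criteria


def clip_to_limit(summary: str, max_chars: int = MAX_CHARS) -> str:
    """Collapse whitespace and trim to at most max_chars, preferring a word boundary."""
    summary = " ".join(summary.split())
    if len(summary) <= max_chars:
        return summary
    # Reserve three characters for the trailing "...".
    clipped = summary[: max_chars - 3]
    head, sep, _ = clipped.rpartition(" ")
    if sep:  # cut back to the last complete word when there is one
        clipped = head
    return clipped.rstrip(" ,;:.") + "..."


@lru_cache(maxsize=1)
def _get_pipeline():
    # Imported lazily so that importing this module does not load torch.
    # Assumption: the installed transformers release ships this pipeline.
    from transformers import pipeline

    return pipeline("summarization")


def summarize(text: str) -> str:
    """Return a short abstractive summary of text (hypothetical sketch)."""
    if not text.strip():
        raise ValueError("text must be non-empty")
    # max_length/min_length are token counts, not characters, so the
    # character limit is still enforced explicitly afterwards.
    result = _get_pipeline()(text, max_length=60, min_length=10, truncation=True)
    return clip_to_limit(result[0]["summary_text"])
```

Keeping the length check in a separate pure function means the "<200 chars" rule can be unit-tested without downloading a model.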
Additional Context
Details
- Category: nlp
- Priority: P1
- Estimate: 2d
- Dependencies:
  - Database connection module (nlp/db.py) in place
  - Articles already normalized and persisted
Tasks
- Add dependencies
  - Add transformers and torch to /nlp/requirements.txt.
- Core function signature
  - Define in /nlp/core.py:
      def summarize(text: str) -> str
- Celery task hook
  - In /nlp/tasks.py, register:
      @app.task
      def summarize_task(article_id: str) -> str
- CLI entrypoint
  - In /nlp/cli.py, expose:
      python -m nlp.cli summarize --article-id=<id>
- Tests & documentation
  - Unit test that summarize() returns a non-empty string under 200 chars.
  - Test that summarize_task() updates the DB with a "summary" field.
  - Update /nlp/README.md with:
    - Installation steps
    - How to run the Celery task
    - How to invoke the CLI command
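The task hook and the CLI entrypoint listed above could share one function, so that both execution paths write the same "summary" field. The sketch below is only an illustration under stated assumptions: the API of nlp/db.py is not shown in this issue, so a plain dict stands in for it; the article field name `"text"`, the function `summarize_article`, and the injected `summarizer` argument are hypothetical. It uses only the standard library so it can run in tests without Celery or a model.

```python
import argparse
from typing import Callable, List, MutableMapping

# Stand-in for the real persistence layer (assumption: nlp/db.py differs).
Store = MutableMapping[str, dict]


def summarize_article(
    article_id: str, store: Store, summarizer: Callable[[str], str]
) -> str:
    """Summarize one article and save the result under the "summary" key."""
    article = store[article_id]  # raises KeyError for an unknown id
    summary = summarizer(article["text"])  # "text" is an assumed field name
    article["summary"] = summary
    return summary


def main(argv: List[str], store: Store, summarizer: Callable[[str], str]) -> int:
    """Parse `summarize --article-id=<id>` and print a confirmation."""
    parser = argparse.ArgumentParser(prog="python -m nlp.cli")
    commands = parser.add_subparsers(dest="command", required=True)
    summarize_cmd = commands.add_parser("summarize")
    summarize_cmd.add_argument("--article-id", required=True)
    args = parser.parse_args(argv)

    summarize_article(args.article_id, store, summarizer)
    print(f"Saved summary for article {args.article_id}")
    return 0
```

In /nlp/tasks.py, `summarize_task(article_id)` would then be a thin `@app.task` wrapper that calls `summarize_article` with the real store and `summarize`. Passing the store and summarizer as arguments also lets the DB-update test use an in-memory dict and a fake summarizer.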
Metadata
Labels: No labels
Status: Ready