[BOUNTY] Add deterministic seed support to data_generator.py (#4) by leo202000 · Pull Request #29 · thanhle74/kickama

leo202000 · 2026-06-22T13:26:48Z

Summary

Makes DataGenerator fully deterministic by routing all randomness through the seeded random.Random instance, addressing bounty #4. The existing seed parameter now guarantees identical output across runs.

Changes

tools/data_generator.py:
- Module-level helpers (random_phone, random_email, random_datetime, gaussian_random) now accept an optional rng argument so callers can inject a seeded generator.
- DataGenerator passes self.random to every helper call site in generate_users, generate_orders, and generate_trades, closing the gap where email/phone/datetime/timestamp used the unseeded global random.
- Added a DataGenerator class docstring documenting seed-based reproducibility.
tests/test_data_generator_seed.py: 11 unit tests verifying same-seed reproducibility (users/orders/trades), seed differentiation, cross-run stability, default seed, and helper-level rng injection.
diagnostic/build-23f043a7.logd + .json: required diagnostic bundle.
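
A minimal sketch of the rng-injection pattern described above. This is not the code in tools/data_generator.py: the helper bodies, the record fields, the `count` parameter, and the default seed value are all assumptions made for illustration.

```python
import random
from typing import Optional


def random_phone(rng: Optional[random.Random] = None) -> str:
    # Use the injected generator if given; otherwise fall back to the
    # module-level functions, which keeps the old unseeded behaviour.
    r = rng if rng is not None else random
    return "+1" + "".join(str(r.randint(0, 9)) for _ in range(10))


def random_email(rng: Optional[random.Random] = None) -> str:
    r = rng if rng is not None else random
    name = "".join(r.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
    return f"{name}@example.com"


class DataGenerator:
    """Hypothetical stand-in: same seed, same output."""

    def __init__(self, seed: int = 42):  # assumed default, not taken from the PR
        self.random = random.Random(seed)

    def generate_users(self, count: int) -> list:
        # Every helper call receives self.random, so nothing reads the
        # global random state.
        return [
            {
                "id": i,
                "email": random_email(self.random),
                "phone": random_phone(self.random),
            }
            for i in range(count)
        ]


first = DataGenerator(seed=42).generate_users(3)
second = DataGenerator(seed=42).generate_users(3)
assert first == second
```

Keeping `rng` optional means existing callers that pass no generator are unaffected, while seeded callers get reproducible output.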

Testing

python3 tests/test_data_generator_seed.py -v -> 11 tests pass.
python3 build.py -> diagnostic bundle generated and committed (diagnostic/build-23f043a7.logd, 15044 bytes, DIAG magic).
Smoke test: DataGenerator(seed=42) produces byte-identical users across two instances; different seeds produce different data.
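
One way a byte-level comparison like the smoke test could be made is to serialize the generated records canonically and compare digests. The sketch below is an assumption about the approach, not the actual smoke test; `make_users` is a hypothetical stand-in for `DataGenerator(seed=...).generate_users(...)`.

```python
import hashlib
import json
import random


def make_users(seed: int, count: int = 5) -> list:
    # Hypothetical stand-in for the real generator: all randomness comes
    # from one seeded random.Random instance.
    rng = random.Random(seed)
    return [{"id": i, "balance": rng.randint(0, 10_000)} for i in range(count)]


def digest(records: list) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal data always
    # serializes to the same bytes.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


same_a = digest(make_users(42))
same_b = digest(make_users(42))
other = digest(make_users(43))
assert same_a == same_b   # same seed, identical bytes
assert same_a != other    # different seed, different data
```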

Checklist

Relevant modules affected by these changes build locally
Tests pass locally
Diagnostic build log is committed in this PR
Documentation has been updated, if applicable
Configuration or schema changes are documented, if applicable
No generated build artifacts are committed, except the required diagnostic build log
Changes are scoped to the PR purpose and avoid unrelated cleanup
Security, privacy, and error-handling implications have been considered

I would like to request that my diagnostic build log is removed before merging

Addresses bounty issue #4. Please let me know the process for claiming the $25 bounty once merged.

Make the existing seed parameter fully deterministic by routing all randomness through the seeded random.Random instance. The module-level helpers (random_phone, random_email, random_datetime, gaussian_random) now accept an optional rng argument, and DataGenerator passes its self.random to every call site so the same seed reproduces identical users, orders, trades, ticks, and candles. Adds a DataGenerator docstring and unit tests verifying reproducibility across runs, seed differentiation, and helper-level rng injection. Addresses bounty mannowell#4.

coderabbitai · 2026-06-22T13:27:34Z

Warning

Review limit reached

@leo202000, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 13 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ae9cdc2-0d80-4ed7-a2ad-650a129334ca

📥 Commits

Reviewing files that changed from the base of the PR and between 94e0fb0 and 1016565.

📒 Files selected for processing (4)

diagnostic/build-23f043a7.json
diagnostic/build-23f043a7.logd
tests/test_data_generator_seed.py
tools/data_generator.py

leo202000 added 2 commits June 22, 2026 21:25

chore(diagnostic): add diagnostic bundle

1016565

leo202000 wants to merge 2 commits into thanhle74:main from leo202000:feat/data-generator-seed