[recipe] Provide Relax style recipe #93

Merged
0oshowero0 merged 22 commits into Ascend:main from Jixixi2020:streaming_dataloader_demo
May 15, 2026

@Jixixi2020
Contributor

Summary
Adds a new demo to illustrate how to use StreamingDataset and StreamingDataLoader in a simple data-centric, asynchronous RL-style workflow.

The demo shows a decentralized worker-per-stage pipeline where each stage independently consumes the fields it needs from the queue and writes its outputs back for downstream stages. It is designed as a readable example of how streaming data access can be used to connect multiple RL pipeline stages without tightly coupling execution to a centralized stage-by-stage scheduler.

Changes

  • Demonstrated StreamingDataset and StreamingDataLoader usage
  • Implemented a simple asynchronous RL-style pipeline with separate stages, keeping the workflow data-centric:
    • workers read only the fields required for their stage
    • workers write derived fields back to the same partition, reusing the batch metadata they received
    • downstream stages naturally continue from available data
  • Kept the demo intentionally lightweight and educational:
    • simple synthetic tensor generation
    • simplified RL-like field transformations
    • explicit progress logging for step-level visibility
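
The data-centric flow described in these bullets can be sketched without any queue backend. The following is a minimal, self-contained illustration, not the demo's code and not the TransferQueue API: `ToyPartition`, `STAGES`, and `run_until_idle` are hypothetical names, and the arithmetic in each stage is a placeholder. It only shows the idea that a stage becomes runnable for a sample once the fields it needs exist, and that it writes its derived fields back to the same row.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartition:
    """Hypothetical in-memory stand-in for one partition: sample_id -> {field: value}."""

    rows: dict[int, dict[str, float]] = field(default_factory=dict)
    consumed: dict[str, set[int]] = field(default_factory=dict)

    def put(self, sample_id: int, fields: dict[str, float]) -> None:
        # Merge new fields into the existing row instead of replacing it.
        self.rows.setdefault(sample_id, {}).update(fields)

    def ready(self, task: str, needed: list[str]) -> list[int]:
        """Sample ids that have every needed field and were not yet consumed by `task`."""
        seen = self.consumed.setdefault(task, set())
        return [
            sid
            for sid, row in sorted(self.rows.items())
            if sid not in seen and all(name in row for name in needed)
        ]

    def get(self, task: str, sample_id: int, needed: list[str]) -> dict[str, float]:
        # Mark the sample as consumed by this task and return only the requested fields.
        self.consumed.setdefault(task, set()).add(sample_id)
        return {name: self.rows[sample_id][name] for name in needed}


# Each stage declares the fields it reads and a function producing the fields it writes.
# Stage names and formulas are placeholders chosen for illustration.
STAGES = {
    "rollout": (["prompt"], lambda r: {"response": r["prompt"] * 2.0}),
    "reward": (["response"], lambda r: {"reward": r["response"] + 1.0}),
    "update": (["response", "reward"], lambda r: {"loss": r["reward"] - r["response"]}),
}


def run_until_idle(partition: ToyPartition) -> None:
    """Poll every stage until none of them has anything left to consume."""
    progressed = True
    while progressed:
        progressed = False
        for task, (needed, compute) in STAGES.items():
            for sid in partition.ready(task, needed):
                partition.put(sid, compute(partition.get(task, sid, needed)))
                progressed = True


partition = ToyPartition()
for sample_id in range(3):
    partition.put(sample_id, {"prompt": float(sample_id)})
run_until_idle(partition)
```

No stage calls another stage directly here; ordering emerges from field availability alone, which is the property the bullets describe as downstream stages continuing from available data.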

@ascend-robot

CLA Signature Guide

@Jixixi2020 , thanks for your pull request.

The following commit(s) are not associated with a signed Contributor License Agreement (CLA).

Commit Reason
[67153a1 [recipe] Add recipe demo to use...](67153a1) the email used in the commit is not linked to a signed CLA!
please verify that it matches the email you used when signing the CLA.

To sign CLA, click here.

To check if your email is configured correctly, refer to the FAQs.

Once you've signed the CLA or updated your email, please comment /check-cla to revalidate CLA status.

@0oshowero0
Collaborator

@NINGBENZHE Please help to review this recipe~

Contributor

Copilot AI left a comment

Pull request overview

Adds a new recipe-style demo (streaming_dataloader_demo.py) showing how to connect multiple asynchronous RL-like pipeline stages via StreamingDataset + StreamingDataLoader, where each stage reads only the fields it needs and writes derived fields back into the same partition.

Changes:

  • Introduces a Ray-based, decentralized “worker-per-stage” pipeline demo (rollout/ref/actor/reward/update).
  • Demonstrates field-level streaming reads with StreamingDataset(data_fields=...) and writes back via tq_client.put(..., metadata=batch_meta).
  • Adds a small driver loop that inserts prompts, waits for stage completion, simulates weight sync, and clears partitions.
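
The driver loop summarized in the last bullet (insert prompts, wait for the stages, simulate weight sync, clear the partition) can be sketched roughly as follows. This is an assumption-laden illustration rather than the PR's implementation: `run_stage`, `drive`, the `train_{step}` partition naming, and the use of a thread pool in place of Ray workers are all hypothetical stand-ins. Only the stage names and the `weight_sync_seconds` setting come from the text above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Stage names as listed in the overview.
STAGES = ("rollout", "ref", "actor", "reward", "update")


def run_stage(stage: str, step: int) -> str:
    """Hypothetical stage body: a real worker would stream batches for `step` here."""
    return stage


def drive(num_steps: int, weight_sync_seconds: float = 0.01) -> list[str]:
    """Run a synchronous per-step driver loop and return the ordered event log."""
    events: list[str] = []
    partitions: dict[str, list[int]] = {}
    with ThreadPoolExecutor(max_workers=len(STAGES)) as pool:
        for step in range(num_steps):
            partition_id = f"train_{step}"  # hypothetical naming scheme
            partitions[partition_id] = list(range(4))  # 1. insert prompts
            events.append(f"put {partition_id}")

            futures = [pool.submit(run_stage, stage, step) for stage in STAGES]
            done, _ = wait(futures)  # 2. block until every stage has finished
            finished = {future.result() for future in done}
            if finished != set(STAGES):
                raise RuntimeError(f"step {step} incomplete: {finished}")
            events.append(f"complete {step}")

            time.sleep(weight_sync_seconds)  # 3. simulated weight sync
            events.append(f"sync {step}")

            del partitions[partition_id]  # 4. clear the partition
            events.append(f"clear {partition_id}")
    return events


log = drive(num_steps=2, weight_sync_seconds=0.0)
```

Because the loop waits for a step to complete before inserting the next step's prompts, each rollout in this sketch sees the most recent weights.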

@@ -0,0 +1,428 @@
import argparse
Comment on lines +238 to +243
    num_steps: int
    pipeline_depth: int
    global_batch_size: int
    micro_batch_size: int
    prompt_length: int
    response_length: int
Comment on lines +252 to +254
    weight_sync_seconds: float
    empty_poll_log_interval: int
    num_data_storage_units: int
Comment on lines +110 to +142
    def _run_step(self, step: int) -> None:
        partition_id = f"{self.cfg_demo.partition_prefix}_{step}"
        dataloader = self._build_dataloader(partition_id)

        for batch, batch_meta in dataloader:
            sample_ids = batch["sample_id"].view(-1).tolist()
            logger.info(f"[{self.worker_name}] step={step} consumed sample_ids={sample_ids}")

            output, written_fields = self.compute(batch, batch_meta)
            self.tq_client.put(output, metadata=batch_meta)

            count = ray.get(self.tracker.record.remote(self.stage_name, step, len(sample_ids)))
            logger.info(
                f"[{self.worker_name}] step={step} done -> written_fields={written_fields}, "
                f"{self.stage_name}_count={count}/{self.cfg_demo.global_batch_size}"
            )

        ray.get(self.tracker.record_done.remote(self.stage_name, step))
        logger.info(f"[{self.worker_name}] step={step} worker_done recorded")

    def _build_dataloader(self, partition_id: str) -> StreamingDataLoader:
        dataset = StreamingDataset(
            config=self.cfg,
            batch_size=self.cfg_demo.micro_batch_size,
            micro_batch_size=self.cfg_demo.micro_batch_size,
            data_fields=self.input_fields(),
            partition_id=partition_id,
            task_name=f"{self.cfg_demo.task_name_prefix}_{self.stage_name}",
            dp_rank=self.worker_id,
            should_check_consumption_status=True,
        )
        return StreamingDataLoader(dataset=dataset, num_workers=0, prefetch_factor=None)
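
The worker above reports progress through `self.tracker.record(...)` and `self.tracker.record_done(...)`, but the tracker itself is not shown in this excerpt. The sketch below is one plausible shape for it, written as a plain Python class so it runs without Ray; in the demo it is presumably a Ray actor. `ToyProgressTracker`, `num_workers_per_stage`, and `stage_complete` are hypothetical names, and only the `record` and `record_done` call shapes are taken from the snippet.

```python
from collections import defaultdict


class ToyProgressTracker:
    """Hypothetical plain-Python stand-in for the tracker the worker calls."""

    def __init__(self, num_workers_per_stage: int) -> None:
        self.num_workers_per_stage = num_workers_per_stage
        self._samples: defaultdict[tuple[str, int], int] = defaultdict(int)
        self._workers_done: defaultdict[tuple[str, int], int] = defaultdict(int)

    def record(self, stage: str, step: int, num_samples: int) -> int:
        """Add a processed micro-batch and return the running total for (stage, step)."""
        self._samples[(stage, step)] += num_samples
        return self._samples[(stage, step)]

    def record_done(self, stage: str, step: int) -> None:
        """Mark one worker of `stage` as having drained its dataloader for `step`."""
        self._workers_done[(stage, step)] += 1

    def stage_complete(self, stage: str, step: int, global_batch_size: int) -> bool:
        """True once all samples are counted and every worker has reported done."""
        key = (stage, step)
        return (
            self._samples[key] >= global_batch_size
            and self._workers_done[key] >= self.num_workers_per_stage
        )


# Two workers, each handling one micro-batch of 2 samples out of a global batch of 4.
tracker = ToyProgressTracker(num_workers_per_stage=2)
counts = [tracker.record("rollout", 0, 2), tracker.record("rollout", 0, 2)]
tracker.record_done("rollout", 0)
half_done = tracker.stage_complete("rollout", 0, global_batch_size=4)
tracker.record_done("rollout", 0)
all_done = tracker.stage_complete("rollout", 0, global_batch_size=4)
```

The running total returned by `record` is what would feed a log line of the form `rollout_count=4/4`, matching the `count/global_batch_size` format in the snippet.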

Comment on lines +363 to +365
        ray.get(refs)
        logger.info("demo done!")
        return []
Comment on lines +16 to +18

logging.basicConfig(level=logging.INFO, format="%(asctime)s.%(msecs)03d - %(levelname)s - %(message)s", datefmt="%H:%M:%S")
logger = logging.getLogger(__name__)

        for step in range(self.config.num_steps):
            self._put_prompt(step)
            self._wait_complete(step)
Contributor

The data flow here is largely fine. However, the demo as written covers an on-policy scenario and does not show off-policy logic; consider extending it to do so.
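
One way to picture the on-policy versus off-policy distinction raised in this comment is to let the driver keep several steps in flight instead of waiting for each step before submitting the next. The sketch below is only an illustration under stated assumptions: `schedule` is a hypothetical helper, and although `pipeline_depth` appears among the config fields in this PR, how the demo actually uses it is not shown here. The function returns, for each step, the policy version its rollout would see.

```python
from collections import deque


def schedule(num_steps: int, pipeline_depth: int) -> list[tuple[int, int]]:
    """Return (step, policy_version_used_for_rollout) pairs.

    Prompts for a step are submitted while fewer than `pipeline_depth` steps are
    in flight. The policy version advances only when the oldest in-flight step
    finishes its update.
    """
    if pipeline_depth < 1:
        raise ValueError("pipeline_depth must be >= 1")
    in_flight: deque[int] = deque()
    policy_version = 0
    plan: list[tuple[int, int]] = []
    for step in range(num_steps):
        if len(in_flight) == pipeline_depth:
            in_flight.popleft()  # wait for the oldest step to finish
            policy_version += 1  # its update produces new weights
        in_flight.append(step)
        plan.append((step, policy_version))
    return plan


on_policy = schedule(num_steps=4, pipeline_depth=1)
off_policy = schedule(num_steps=4, pipeline_depth=2)
```

With a depth of 1 every step samples from the weights produced by the previous step, which corresponds to the put-then-wait loop quoted above. With a depth of 2, step 1 is rolled out with version 0 weights, so its data is one update stale by the time it is trained on.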


    ray.init()
    try:
        demo = DecentralizedInheritedWorkerPipelineDemo(cfg, build_tq_config(cfg))
Collaborator

It's a little too long

Comment thread recipe/simple_use_case/relax_demo.py

    def fit(self) -> list[dict]:
        logger.info("=" * 72)
        logger.info("TransferQueue StreamingDataLoader Decentralized Inherited Worker Pipeline Demo")
Collaborator

remember to modify this when changing the file name

Comment thread recipe/simple_use_case/relax_demo.py

@ascend-robot

CLA Signature Pass

Jixixi2020, thanks for your pull request. All authors of the commits have signed the CLA. 👍

0oshowero0 changed the title from "[recipe] Add recipe demo to use StreamingDataset & StreamingDataLoader" to "[recipe] Provide Relax style recipe" on May 14, 2026
Jixixi2020 force-pushed the streaming_dataloader_demo branch from 053f6ba to 3718ceb on May 14, 2026 13:09
Jixixi2020 added 10 commits May 15, 2026 11:06
…oader_demo.py
…t_logger
…prefix
…tricWorkerPipelineDemo
Signed-off-by: jxixi <916099156@qq.com>
Jixixi2020 added 12 commits May 15, 2026 11:06
…me-prefix
…solving conflict with PR Ascend#92, and add worker exception handling
Signed-off-by: jxixi <916099156@qq.com>
Jixixi2020 force-pushed the streaming_dataloader_demo branch from 3718ceb to 0ffd5e8 on May 15, 2026 03:14
0oshowero0 merged commit 8c9a067 into Ascend:main on May 15, 2026
8 checks passed
5 participants