@yuejiewang
Collaborator

PR Category

CRL

PR Types

New Features

PR Description

CLAassistant commented Dec 1, 2025
All committers have signed the CLA.

@yuejiewang force-pushed the dev branch 2 times, most recently from 9fdadc7 to 027e885 (December 3, 2025 06:36)

// (6) set completion flag
__syncthreads();
FLAGCX_DEVICE_THREAD_FENCE();
Collaborator

move FLAGCX_DEVICE_THREAD_FENCE() after t->setComplete()

@yuejiewang force-pushed the dev branch 2 times, most recently from 70003ed to 642accd (December 29, 2025 03:34)
@yuejiewang force-pushed the dev branch 2 times, most recently from b3d8a81 to 425bb29 (December 30, 2025 07:00)
@yuejiewang marked this pull request as ready for review (January 4, 2026 02:18)

#define ENABLE_TIMER 0
#include "timer.h"

FLAGCX_PARAM(RunUniRunnerAllReduce, "RUN_UNIRUNNER_ALLREDUCE", 0);
Collaborator

FLAGCX_P2P_DISABLE?

}

FLAGCX_HOST_DECORATOR uint64_t flagcxReduceTrigger::pollState() {
uint64_t curr_val = __atomic_load_n(&this->value[3], __ATOMIC_ACQUIRE);
Collaborator

curr_val -> currVal


FLAGCX_HOST_DECORATOR void flagcxReduceTrigger::setState(int state) {
uint64_t curr_val = __atomic_load_n(&this->value[3], __ATOMIC_ACQUIRE);
curr_val &= ~(flagcxTriggerMask(flagcxReduceTriggerBitsState)
Collaborator

Same as above: curr_val -> currVal
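
For reference, the load / clear / set pattern that pollState() and setState() follow in the hunks above could look like the host-side sketch below once the rename is applied. The field width, the bit offset, the triggerMask helper, and the struct name are hypothetical stand-ins, since the hunk is truncated and does not show the real flagcxTriggerMask definition or bit layout. The sketch also assumes a single writer: the load followed by a store is not an atomic read-modify-write.

```cpp
#include <cstdint>

// Hypothetical layout (not taken from the PR): the state field occupies
// kStateBits bits starting at bit kStateOffset of value[3].
constexpr int kStateBits = 4;
constexpr int kStateOffset = 60;

// Hypothetical helper: a mask with the low `bits` bits set.
constexpr uint64_t triggerMask(int bits) {
  return (uint64_t{1} << bits) - 1;
}

struct ReduceTriggerSketch {
  uint64_t value[4] = {0, 0, 0, 0};

  // Read the packed word with acquire ordering and extract the state field.
  uint64_t pollState() {
    uint64_t currVal = __atomic_load_n(&value[3], __ATOMIC_ACQUIRE);
    return (currVal >> kStateOffset) & triggerMask(kStateBits);
  }

  // Clear the old state bits, set the new ones, and publish with release
  // ordering. Assumes one writer; concurrent writers would need a CAS loop.
  void setState(int state) {
    uint64_t currVal = __atomic_load_n(&value[3], __ATOMIC_ACQUIRE);
    currVal &= ~(triggerMask(kStateBits) << kStateOffset);
    currVal |= (static_cast<uint64_t>(state) & triggerMask(kStateBits))
               << kStateOffset;
    __atomic_store_n(&value[3], currVal, __ATOMIC_RELEASE);
  }
};
```

The __atomic_load_n and __atomic_store_n builtins are the GCC/Clang ones already used in the hunk; bits outside the state field are left untouched by setState().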

FLAGCX_PARAM(P2pDisable, "P2P_DISABLE", 0);

static inline bool isSameNode(struct flagcxHeteroComm *comm, int peer) {
// force use network transport for unirunner allreduce
Collaborator

// force use net transport

FLAGCX_DEVICE_INLINE_DECORATOR flagcxResult_t dequeue(volatile uint64_t *buffer,
int *idx) {
while (true) {
unsigned long long int old_c = *(buffer + 1);
Collaborator

old_c -> oldConsumed
cur_p -> curProduced
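
To illustrate how the suggested names read in context, here is a minimal host-side sketch of a dequeue loop over a consumed counter and a produced counter. It is not the PR's implementation: the struct, the use of std::atomic, the bool return type (the real function returns flagcxResult_t), and the assumption that a second counter tracks produced items are all hypothetical, inferred only from the variable names in this thread.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counters; the real FIFO keeps its header words in a
// volatile uint64_t buffer whose full layout is not shown in the hunk.
struct FifoCountersSketch {
  std::atomic<uint64_t> consumed{0};
  std::atomic<uint64_t> produced{0};
};

// Tries to claim the next item. On success writes the claimed index to *idx
// and returns true; returns false when nothing is pending.
bool tryDequeue(FifoCountersSketch &fifo, int *idx) {
  while (true) {
    uint64_t oldConsumed = fifo.consumed.load(std::memory_order_acquire);
    uint64_t curProduced = fifo.produced.load(std::memory_order_acquire);
    if (oldConsumed >= curProduced) {
      return false; // queue is empty
    }
    // Claim index oldConsumed by advancing the consumed counter. If another
    // consumer won the race, loop and re-read both counters.
    if (fifo.consumed.compare_exchange_weak(oldConsumed, oldConsumed + 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
      *idx = static_cast<int>(oldConsumed);
      return true;
    }
  }
}
```

With the longer names, the emptiness check (oldConsumed >= curProduced) reads directly as "nothing left to consume".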

uint64_t nthreads, uint64_t datatype, uint64_t redOp) {
// to be implemented by vendors
int tid = threadIdx.x;
float *fst_ptr = (float *)fst;
Collaborator

fst_ptr -> fstPtr

float *snd_ptr = (float *)snd;
float *out_ptr = (float *)out;
for (int i = tid; i < count; i += nthreads) {
out_ptr[i] = fst_ptr[i] + snd_ptr[i];
Collaborator

Same as above: use camelCase here too (sndPtr, outPtr)
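
With the renames from this thread applied, the thread-strided sum in the hunk could be sketched on the host as below. The function names and the sequential loop over tid are hypothetical: on the device each thread would run only its own stride using threadIdx.x, and the real kernel is marked "to be implemented by vendors", so this only mirrors the float placeholder shown in the diff.

```cpp
#include <cstdint>

// Work done by one "thread": elements tid, tid + nthreads, tid + 2*nthreads...
void reduceSumStrided(const float *fstPtr, const float *sndPtr, float *outPtr,
                      uint64_t count, uint64_t nthreads, uint64_t tid) {
  for (uint64_t i = tid; i < count; i += nthreads) {
    outPtr[i] = fstPtr[i] + sndPtr[i];
  }
}

// Host-side stand-in for a thread block: run every tid in turn. Together the
// strides cover each index in [0, count) exactly once.
void reduceSumAllThreads(const float *fstPtr, const float *sndPtr,
                         float *outPtr, uint64_t count, uint64_t nthreads) {
  for (uint64_t tid = 0; tid < nthreads; ++tid) {
    reduceSumStrided(fstPtr, sndPtr, outPtr, count, nthreads, tid);
  }
}
```

Using an unsigned 64-bit loop index here also avoids the signed/unsigned comparison that `int i` against `uint64_t nthreads` would introduce.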


FLAGCX_GLOBAL_DECORATOR void flagcxCollectiveKernel(void *fifoBuffer) {
volatile uint64_t *vBuf = (volatile uint64_t *)fifoBuffer;
int empty_iter = 0; // backoff counter
Collaborator

empty_iter -> emptyIter

uint64_t redop;
int slot = myIdx & (*vBuf - 1);
if (tid == 0) {
// printf("block %d get work idx %d, slot %d\n", blockIdx.x, myIdx, slot);
Collaborator

remove this line
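
As a side note on the slot computation in the hunk above, `myIdx & (*vBuf - 1)` matches `myIdx % capacity` only if the value stored at `*vBuf` is a power of two. That this word holds the FIFO capacity is an assumption here, and the helper names below are hypothetical. A small sketch of the masking trick:

```cpp
#include <cstdint>

// True when n has exactly one bit set.
constexpr bool isPowerOfTwo(uint64_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Maps a monotonically increasing work index to a ring-buffer slot.
// Equivalent to idx % capacity only when capacity is a power of two.
constexpr uint64_t slotFor(uint64_t idx, uint64_t capacity) {
  return idx & (capacity - 1);
}
```

If the capacity is configurable, a check like isPowerOfTwo() at FIFO creation time would guard against silently wrong slots.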

t->setComplete();
}
}
// FLAGCX_DEVICE_THREAD_FENCE();
Collaborator

remove this line

Collaborator

@MC952-arch left a comment

LGTM

Collaborator

@mikethegoblin left a comment

LGTM

@MC952-arch merged commit 58bea1f into flagos-ai:main on Jan 4, 2026
4 checks passed

5 participants