Fix node names #14

Open
LanderOtto wants to merge 3 commits into master from fix-node-names

LanderOtto commented May 20, 2026

This commit fixes several issues:

  • Execution node numbering was wrong. Container numbering starts from 1.
  • Munge fails on erroneous file stats (owner and mode).
  • Add a Fully Qualified Domain Name (FQDN) length check on the server. Before this commit, long compose project names produced an FQDN that exceeded MAX_HOSTNAME_LEN, and pbs_mom failed with "failed to get fullhostname".
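
For illustration, the FQDN length check could look roughly like the sketch below. It is not the code from this PR: the function names `host_name_max` and `check_fqdn_length` are hypothetical, the fallback of 64 is an assumption for systems where `getconf HOST_NAME_MAX` returns nothing usable, and the limit that pbs_mom actually enforces (MAX_HOSTNAME_LEN) may differ from the system value.

```shell
#!/bin/sh
# Hypothetical helper: print the system hostname length limit.
# Assumption: fall back to 64 when getconf gives no numeric answer.
host_name_max() {
    max=$(getconf HOST_NAME_MAX 2>/dev/null)
    case "$max" in
        ''|*[!0-9]*) max=64 ;;
    esac
    echo "$max"
}

# Hypothetical helper: return 0 if the name fits within the limit,
# otherwise print a diagnostic on stderr and return 1.
check_fqdn_length() {
    name=$1
    limit=$2
    if [ "${#name}" -gt "$limit" ]; then
        echo "FQDN '$name' is ${#name} characters, limit is $limit" >&2
        return 1
    fi
    return 0
}
```

A startup script might call `check_fqdn_length "$(hostname -f)" "$(host_name_max)" || exit 1` before launching the server, so that an overlong name fails early with a readable message instead of surfacing later in pbs_mom.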

- Add FQDN length check on server startup using getconf HOST_NAME_MAX
- Add run-sched wrapper that waits for pbs.conf.ready before starting
- Set autorestart=false for pbs_server to prevent crash loop
- Fix munge.key ownership/permission wait conditions
- Add munge wait loop before pbs_mom starts
- Add supervisord priority to ensure munge starts first
- Update README with FQDN warning and fix typos
- run-sched file-based sync was overengineered; pbs_sched can run
  directly since /etc/pbs.conf is substituted before it launches
- /tmp/pbs.conf.ready was unused after removing the sync
- supervisord priority was redundant (alphabetical order + munge wait
  loop already handle ordering)
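
As a rough sketch of the wait conditions mentioned in the list above, a script can poll until the key file reports the expected owner and mode. Everything specific here is an assumption: the helper names `key_is_ready` and `wait_for_key` are hypothetical, the example relies on `stat -c` from GNU coreutils or BusyBox, and the actual loop in this PR may wait on something else (for example the munged socket).

```shell
#!/bin/sh
# Hypothetical helper: return 0 when the file exists and its owner name
# and octal mode match the expected values (uses GNU/BusyBox stat -c).
key_is_ready() {
    key=$1
    owner=$2
    mode=$3
    [ -f "$key" ] || return 1
    [ "$(stat -c '%U %a' "$key" 2>/dev/null)" = "$owner $mode" ]
}

# Hypothetical helper: poll once per second until the key is ready,
# giving up after the given number of attempts (default 30).
wait_for_key() {
    key=$1
    owner=$2
    mode=$3
    tries=${4:-30}
    while [ "$tries" -gt 0 ]; do
        key_is_ready "$key" "$owner" "$mode" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    echo "timed out waiting for $key (owner $owner, mode $mode)" >&2
    return 1
}
```

A caller would pass the values it expects, for instance `wait_for_key /etc/munge/munge.key munge 400`, where the path, owner, and mode are illustrative rather than taken from this PR. The attempt limit keeps a misconfigured container from waiting forever.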