Merged
157 changes: 157 additions & 0 deletions docs/source/changing-output.rst
@@ -0,0 +1,157 @@
.. Copyright 2023-2026 Tom Meltzer. See the top-level COPYRIGHT file for
details.

.. _changing_output:

Changing the Output Mode
========================

When debugging across multiple ranks, ``mdb`` can produce a large amount of output. By default,
each rank's output is displayed independently, with the blocks for different ranks separated by
divider lines. This is called *separate* mode.

When ranks produce identical or nearly identical output, separate mode is redundant. The
*combined* output mode reduces this verbosity by printing each line shared by several ranks only
once, prefixed with the range of ranks that produced it.

This tutorial demonstrates the difference between the two modes using the ``set output`` command.

The Example Program
-------------------

We will use the ``simple-mpi.exe`` binary from the ``examples/`` directory (see :ref:`quick_start`
for details on compiling it).

Launching the Debug Target
--------------------------

First, launch the program with two processes::

    $ mdb launch -n 2 -t ./simple-mpi.exe

This starts the debug server and waits for an ``mdb attach`` connection.

Separate Output Mode (Default)
------------------------------

In separate mode, each rank's output is printed independently. The output from each rank is
prefixed with its rank ID (e.g., ``0:`` or ``1:``) and blocks for different ranks are separated by
a line of asterisks (``****``).
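
The layout described above can be sketched in a few lines of Python. This is an illustration
only, not the ``mdb`` implementation: the function name ``format_separate`` and the divider width
of 72 are assumptions made for the example.

```python
def format_separate(response: dict[int, str], width: int = 72) -> str:
    """Prefix every line with its rank and join rank blocks with a divider.

    Hypothetical helper for illustration; `width` is an assumed divider length.
    """
    blocks = []
    for rank in sorted(response):
        lines = response[rank].splitlines()
        blocks.append("\n".join(f"{rank}: {line}" for line in lines))
    divider = "*" * width
    # The divider only appears between blocks, never before the first or after the last.
    return f"\n{divider}\n".join(blocks)


example = {
    1: "process 44903\ncwd = '/mdb'",
    0: "process 44900\ncwd = '/mdb'",
}
print(format_separate(example))
```

Because the divider is used as a join string, a single-rank session prints no divider at all.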

Create a script file ``script-separate.mdb``::

    set output separate
    broadcast start
    info proc
    break 15
    continue
    list
    continue
    quit
    quit

Run it with::

    $ mdb attach -x script-separate.mdb > separate.out

Here is a shortened excerpt of the output from the ``info proc`` and ``list`` commands:

``info proc`` in separate mode:

.. code-block:: console

    0: process 44900
    0: cmdline = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'
    0: cwd = '/home/melt/sync/cambridge/projects/side/mdb'
    0: exe = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'
    ************************************************************************
    1: process 44903
    1: cmdline = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'
    1: cwd = '/home/melt/sync/cambridge/projects/side/mdb'
    1: exe = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'

Each rank's output appears in its own block, prefixed with just the rank number, separated by
``****`` dividers.

``list`` in separate mode:

.. code-block:: console

    0: 10
    0: 11 call mpi_init(ierror)
    0: 12 call mpi_comm_size(mpi_comm_world, size_of_cluster, ierror)
    0: 13 call mpi_comm_rank(mpi_comm_world, process_rank, ierror)
    0: 14
    0: 15 var = 10.*process_rank
    0: 16
    0: 17 if (process_rank == 0) then
    0: 18 print *, 'process 0 sleeping for 3s...'
    0: 19 do i = 1, 3
    ************************************************************************
    1: 10
    1: 11 call mpi_init(ierror)
    1: 12 call mpi_comm_size(mpi_comm_world, size_of_cluster, ierror)
    1: 13 call mpi_comm_rank(mpi_comm_world, process_rank, ierror)
    1: 14
    1: 15 var = 10.*process_rank
    1: 16
    1: 17 if (process_rank == 0) then
    1: 18 print *, 'process 0 sleeping for 3s...'
    1: 19 do i = 1, 3

Since the source code is identical across ranks, the ``list`` output is duplicated for each rank.

Combined Output Mode
--------------------

In combined mode, lines that are identical across ranks are printed once and prefixed with the
collapsed range of ranks that produced them (e.g., ``0-1:``). Lines that are unique to one rank are
still shown with a single rank prefix (e.g., ``0:``), right-aligned so that all prefixes line up.
No ``****`` dividers are used.
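
As a rough illustration of how a set of ranks can be turned into such a prefix, the sketch below
collapses consecutive integers into ``first-last`` runs. The function name ``rank_prefix`` is
hypothetical, and the ``itertools.groupby`` approach is one possible technique rather than a
description of how ``mdb`` does it.

```python
from itertools import groupby


def rank_prefix(ranks: list[int]) -> str:
    """Collapse ranks into compact range notation, e.g. [0, 1, 3] -> '0-1,3'.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(set(ranks))
    parts = []
    # Consecutive integers share the same (value - position) difference,
    # so grouping on that difference yields one group per contiguous run.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        parts.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
    return ",".join(parts)


print(rank_prefix([0, 1]))
print(rank_prefix([0, 2, 3, 4, 7]))
```

With only two ranks the prefix is always ``0:``, ``1:`` or ``0-1:``; comma-separated runs only
appear with three or more ranks.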

Create a script file ``script-combined.mdb``::

    set output combined
    broadcast start
    info proc
    break 15
    continue
    list
    continue
    quit
    quit

Run it with::

    $ mdb attach -x script-combined.mdb > combined.out

``info proc`` in combined mode:

.. code-block:: console

      0: process 45004
    0-1: cmdline = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'
    0-1: cwd = '/home/melt/sync/cambridge/projects/side/mdb'
    0-1: exe = '/home/melt/sync/cambridge/projects/side/mdb/examples/simple-mpi.exe'
      1: process 45005

Notice how the three identical lines (``cmdline``, ``cwd``, ``exe``) are merged under a single
``0-1:`` prefix. The ``process`` line differs between ranks (different PIDs) so it is shown
individually for each rank.
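
The grouping behind this output can be sketched as follows. This is a simplified illustration
under stated assumptions: the function name ``combine`` is hypothetical, the input is already
split into lines per rank, and the prefix helper only collapses a fully contiguous set of ranks.

```python
def combine(response: dict[int, list[str]]) -> list[str]:
    """Group identical lines across ranks and prefix each with its ranks.

    Hypothetical helper for illustration only.
    """
    owners: dict[str, list[int]] = {}
    for rank in sorted(response):
        for line in response[rank]:
            ranks = owners.setdefault(line, [])
            if rank not in ranks:  # a line repeated within one rank counts once
                ranks.append(rank)

    def prefix(ranks: list[int]) -> str:
        # Simplification: only a fully contiguous set becomes "first-last".
        if len(ranks) > 1 and ranks[-1] - ranks[0] == len(ranks) - 1:
            return f"{ranks[0]}-{ranks[-1]}"
        return ",".join(str(r) for r in ranks)

    prefixes = {line: prefix(ranks) for line, ranks in owners.items()}
    width = max((len(p) for p in prefixes.values()), default=0)
    # Dicts preserve insertion order, so lines appear in first-seen order.
    return [f"{p.rjust(width)}: {line}" for line, p in prefixes.items()]


example = {
    0: ["process 45004", "cwd = '/mdb'"],
    1: ["process 45005", "cwd = '/mdb'"],
}
print("\n".join(combine(example)))
```

Lines are emitted in the order they are first seen, which is why the ``process`` line for rank 1
ends up after the shared lines rather than next to the ``process`` line for rank 0.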

``list`` in combined mode:

.. code-block:: console

    0-1: 10
    0-1: 11 call mpi_init(ierror)
    0-1: 12 call mpi_comm_size(mpi_comm_world, size_of_cluster, ierror)
    0-1: 13 call mpi_comm_rank(mpi_comm_world, process_rank, ierror)
    0-1: 14
    0-1: 15 var = 10.*process_rank
    0-1: 16
    0-1: 17 if (process_rank == 0) then
    0-1: 18 print *, 'process 0 sleeping for 3s...'
    0-1: 19 do i = 1, 3

Since all ranks share the same source code, every line is identical and merged into a single
compact block. This avoids the duplication seen in separate mode.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -19,6 +19,7 @@ serial debugger backends e.g., `gdb <https://sourceware.org/gdb>`_ and `lldb

Installation <installation>
Quick Start <quick-start>
Changing the Output Mode <changing-output>
Debugging AMD GPU Kernels <gpu-amd>

.. toctree::
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "mdb_debugger"
-version = "1.0.6"
+version = "1.0.7"
dependencies = [
"click==8.1.7",
"matplotlib==3.8.3",
45 changes: 44 additions & 1 deletion src/mdb/mdb_shell.py
@@ -24,6 +24,7 @@
    parse_ranks,
    pretty_print_response,
    sort_debug_response,
    reduce_response,
)

if TYPE_CHECKING:
@@ -55,6 +56,7 @@ def __init__(self, shell_opts: ShellOpts, client: Client) -> None:
        self.exchange_select = parse_ranks(self.exchange_select_str)
        self.select_str = self.exchange_select_str
        self.select = self.exchange_select
        self.output_mode = "separate"  # 'separate' or 'combined'
        backend_name = shell_opts["backend_name"].lower()
        if backend_name in backends:
            self.backend = backends[backend_name]()
@@ -214,7 +216,10 @@ def ask_remain_calm(signame: str) -> None:

         if command_response.msg_type == "exchange_command_response":
             response = sort_debug_response(command_response.data["results"])
-            pretty_print_response(response)
+            if self.output_mode == "combined":
+                reduce_response(response)
+            else:
+                pretty_print_response(response)
         else:
             print(f"Received unexpected message type: {command_response.msg_type}")
             return
@@ -250,6 +255,44 @@ def do_shell(self, line: str) -> None:
        run(split(line))
        return

    def do_set(self, line: str) -> None:
        """
        Description:
            Set mdb options.

        Example:
            Switch output format between separate and combined mode:

                (mdb) set output combined
                (mdb) set output separate

            - separate shows all output for each rank, separated by ***'s
            - combined reduces common output across the ranks

            Show current settings:

                (mdb) set
        """
        # With no arguments, report the current settings.
        if not line:
            print(f"output: {self.output_mode}")
            return

        # Every option takes the form "<option> <value>".
        parts = line.split()
        if len(parts) < 2:
            print("Usage: set output [separate|combined]")
            return

        if parts[0].lower() == "output":
            mode = parts[1].lower()
            if mode in ("separate", "combined"):
                self.output_mode = mode
            else:
                print(
                    f"Error: unknown output mode '{mode}'. Use 'separate' or 'combined'."
                )
        else:
            print(f"Error: unknown option '{parts[0]}'. Use 'output'.")

def do_select(self, line: str) -> None:
"""
Description:
75 changes: 72 additions & 3 deletions src/mdb/utils.py
@@ -4,6 +4,7 @@
import re
from os.path import expanduser
from typing import TYPE_CHECKING
from collections import defaultdict

if TYPE_CHECKING:
    from .backend import DebugBackend
@@ -21,6 +22,29 @@ def sort_debug_response(results: dict[int, str]) -> dict[int, str]:
    return dict(sorted(results.items()))


def collapse_ranges(values: list[int]) -> str:
    """Collapse a list of integers into minimal range notation.

    e.g. [1, 2, 3, 6, 7, 10, 11, 12] -> '1-3,6-7,10-12'
    """
    sorted_vals = sorted(set(values))
    if not sorted_vals:
        return ""

    ranges = []
    start = end = sorted_vals[0]

    for n in sorted_vals[1:]:
        if n == end + 1:
            # n extends the current run
            end = n
        else:
            # close the current run and start a new one at n
            ranges.append(f"{start}-{end}" if start != end else str(start))
            start = end = n

    ranges.append(f"{start}-{end}" if start != end else str(start))
    return ",".join(ranges)


def pretty_print_response(response: dict[int, str]) -> None:
    lines = []
    for rank, result in response.items():
@@ -30,6 +54,53 @@ def pretty_print_response(response: dict[int, str]) -> None:
    print(combined_output)


def reduce_response(response: dict[int, str]) -> None:
    """Reduce debug output by deduplicating common lines across ranks.

    Parses each rank's output, groups identical lines together, and prints
    each unique line prefixed by the collapsed set of ranks that produced it.
    Lines from only one rank appear with that rank's prefix; shared lines are
    printed once with all contributing ranks.

    Example output before and after:

    Before (raw output per rank):

        0: process 45402
        0: cmdline = '/mdb/examples/simple-mpi.exe'
        0: cwd = '/mdb'
        0: exe = '/mdb/examples/simple-mpi.exe'
        ************************************************************************
        1: process 45403
        1: cmdline = '/mdb/examples/simple-mpi.exe'
        1: cwd = '/mdb'
        1: exe = '/mdb/examples/simple-mpi.exe'

    After (deduplicated, ranks collapsed):

          0: process 45402
        0-1: cmdline = '/mdb/examples/simple-mpi.exe'
        0-1: cwd = '/mdb'
        0-1: exe = '/mdb/examples/simple-mpi.exe'
          1: process 45403

    Args:
        response: dict mapping process rank (int) to its output string.
    """
    # Map each distinct line to the ranks that produced it, in first-seen order.
    ranks_by_line: defaultdict[str, list[int]] = defaultdict(list)
    for rank, result in response.items():
        if result:
            # As in prepend_ranks, skip the first and last elements of the split.
            for line in result.split("\r\n")[1:-1]:
                ranks_by_line[line].append(rank)

    prefixes = {line: collapse_ranges(ranks) for line, ranks in ranks_by_line.items()}

    # Right-justify the prefixes so the output lines up in a column.
    max_len = max((len(p) for p in prefixes.values()), default=0)
    for line, ranks_str in prefixes.items():
        padded = ranks_str.rjust(max_len)
        print(f"{padded}: {line}")


 def extract_float(line: str, backend: "DebugBackend") -> float:
     float_regex = backend.float_regex
     line = strip_control_characters(line)
@@ -49,9 +120,7 @@ def extract_float(line: str, backend: "DebugBackend") -> float:


 def prepend_ranks(rank: int, result: str) -> str:
-    return "".join(
-        [f"{rank}:\t" + line + "\r\n" for line in result.split("\r\n")[1:-1]]
-    )
+    return "".join([f"{rank}: " + line + "\r\n" for line in result.split("\r\n")[1:-1]])


def strip_bracketted_paste(text: str) -> str: