Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ Agents provisioned before this release need `Agent365.Observability.OtelWrite` g
- Log separator written at the start of each CLI invocation now redacts values for secret-bearing options (e.g. `--idp-client-secret`) so they are not written to the log file in plain text.
- Authentication context (tenant and user) is now logged at the `Information` level whenever the resolved sign-in identity changes, giving operators a clear audit trail in the log file of who the CLI is acting as, without exposing credentials.
- `a365 develop-mcp evaluate` command for evaluating MCP server tool schema quality — runs deterministic and semantic checks (via GitHub Copilot or Claude Code CLIs), computes maturity scoring, and generates an interactive HTML report
- `--eval-engine azure-openai` on `develop-mcp evaluate` — scores semantic checks with your own Azure OpenAI deployment instead of a local coding agent, authenticating with Microsoft Entra ID only (`DefaultAzureCredential`; no API key). Configured via `A365_EVAL_AZURE_OPENAI_ENDPOINT` and `A365_EVAL_AZURE_OPENAI_DEPLOYMENT`.
- `--skip-sp-provisioning` option on `setup all` — skips the interactive in-line provisioning of missing resource service principals (issue #429). Default: setup prompts per-resource and runs `az ad sp create --id <appId>` using the operator's `az login`. With this flag, missing SPs are excluded from the consent URL and listed in the Action Required block with the `az` command and a per-SP consent URL. Implicitly enabled when stdin is redirected (CI / pipe scenarios).
- Action Required block in the `setup all` summary now lists missing service principals — each entry shows the resource, pending scopes, the `az ad sp create` command, and the per-SP consent URL needed to complete provisioning.
- `logs export [command] [--output <dir>]` — exports a redacted copy of a CLI diagnostic log safe to share with Microsoft support. Redacts JWT tokens, email addresses, OS-path usernames, and tenant-specific GUIDs; replaces identical values with consistent aliases so log correlation is preserved. Preserves diagnostic IDs that aren't sensitive but are useful for debugging — `TraceId`, `CorrelationId`, Microsoft Graph `request-id` and `client-request-id` values, and well-known public Microsoft / Agent 365 resource appIds (such as the Microsoft Graph appId `00000003-0000-0000-c000-000000000000`). Omit `[command]` to export all available logs at once.
Expand Down
4 changes: 3 additions & 1 deletion docs/agent365-guided-setup/a365-evaluate-faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,9 @@ Microsoft's role is supplying the `a365` CLI that orchestrates steps 1–2. Step

By default the evaluation uses **Claude Haiku 4.5**, invoked through whichever coding agent CLI you have installed (GitHub Copilot CLI or Claude Code CLI). You can override the engine with `--eval-engine github-copilot` or `--eval-engine claude-code`.

To run a **different model** without waiting for a CLI update, set an environment variable before the run: `A365_EVAL_COPILOT_MODEL` for GitHub Copilot (needs an exact model ID, e.g. `claude-haiku-4.5`) or `A365_EVAL_CLAUDE_MODEL` for Claude Code (accepts an alias, e.g. `haiku`).
You can also score with **your own Azure OpenAI deployment** via `--eval-engine azure-openai`. Set `A365_EVAL_AZURE_OPENAI_ENDPOINT` and `A365_EVAL_AZURE_OPENAI_DEPLOYMENT`, and sign in to Entra ID (e.g. `az login`). This engine authenticates with **Microsoft Entra ID only** (`DefaultAzureCredential`) — **API-key authentication is not supported**. Unlike the coding-agent engines it needs no local CLI: the `a365` CLI calls your Azure OpenAI deployment directly, under your own Azure subscription, billing, and access policies.

To run a **different model** without waiting for a CLI update, set an environment variable before the run: `A365_EVAL_COPILOT_MODEL` for GitHub Copilot (needs an exact model ID, e.g. `claude-haiku-4.5`) or `A365_EVAL_CLAUDE_MODEL` for Claude Code (accepts an alias, e.g. `haiku`). For Azure OpenAI, the model is whatever you name in `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` (`gpt-5.4` recommended — it is the model this engine was tested against).

The CLI that runs the model is **yours**, installed from npm by you, authenticated with your credentials. Microsoft does not host or resell the model API call — it is made directly by your CLI to the AI provider under your own terms of service and billing. The `a365` CLI specifies the model flag but does not mediate the API call.

Expand Down
36 changes: 32 additions & 4 deletions docs/agent365-guided-setup/a365-evaluate-instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,15 +32,17 @@ Wait for the answer. Store as `needsAuth`:
1. Auto — try GitHub Copilot first, fall back to Claude Code
2. GitHub Copilot only
3. Claude Code only
4. None — generate the checklist and let me (or my own LLM) score it manually (bring-your-own-LLM)
4. Azure OpenAI — score with your own Azure OpenAI deployment via Entra ID (no local coding agent; typically the fastest option)
5. None — generate the checklist and let me (or my own LLM) score it manually (bring-your-own-LLM)

Wait for the answer. Store as `evalEngine`:
- If **1 (Auto)**: `evalEngine = "auto"`
- If **2 (GitHub Copilot)**: `evalEngine = "github-copilot"`
- If **3 (Claude Code)**: `evalEngine = "claude-code"`
- If **4 (None)**: `evalEngine = "none"`
- If **4 (Azure OpenAI)**: `evalEngine = "azure-openai"` — also confirm the user has set `A365_EVAL_AZURE_OPENAI_ENDPOINT` and `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` and is signed in to Entra ID (see Step 2).
- If **5 (None)**: `evalEngine = "none"`

> **Note:** Auto and the named engines require the corresponding CLI to be installed on the workstation (`copilot` or `claude`). If neither is installed and `evalEngine` is `auto` or a named engine, the pipeline will stop after writing the checklist and print BYO-LLM instructions — that is expected, not an error.
> **Note:** The local engines (`auto`, `github-copilot`, `claude-code`) require the corresponding CLI on the workstation (`copilot` or `claude`); if none is installed the pipeline stops after writing the checklist and prints BYO-LLM instructions — that is expected, not an error. The `azure-openai` engine needs **no local CLI** — instead it requires the `A365_EVAL_AZURE_OPENAI_ENDPOINT` and `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` environment variables and an Entra ID sign-in (e.g. `az login`); see Step 2.

After all three questions are answered, create all todos for the path and mark Todo 1 in-progress:

Expand Down Expand Up @@ -133,6 +135,27 @@ Tell the user one of the following is required, and that without one of them the
- Install Claude Code (see above), or
- Switch to `evalEngine = "none"` and score the checklist with their own LLM after the run.

### If `evalEngine = "azure-openai"`

No local coding agent is used — the `a365` CLI calls your Azure OpenAI deployment directly. Verify two things:

1. **Endpoint and deployment are configured.** The CLI reads these from the environment (they are not flags):

```bash
echo "$A365_EVAL_AZURE_OPENAI_ENDPOINT" # e.g. https://<resource>.services.ai.azure.com/openai/v1
echo "$A365_EVAL_AZURE_OPENAI_DEPLOYMENT" # deployment name, e.g. gpt-5.4 (recommended)
```

If either is empty, ask the user for their Azure OpenAI endpoint (include the `/openai/v1` path) and deployment name, then set both variables.

2. **The workstation is signed in to Entra ID.** The engine authenticates with `DefaultAzureCredential` and requests a token for `https://ai.azure.com/.default`. The simplest sign-in is:

```bash
az login
```

> **Authentication — Microsoft Entra ID only.** The `azure-openai` engine supports **Entra ID authentication only**, via `DefaultAzureCredential` (Azure CLI sign-in, managed identity, a service principal supplied through environment variables, or Visual Studio / VS Code sign-in). **API-key authentication is NOT supported** — no key is read from the environment or any config file. The model call is made directly by the `a365` CLI to *your* Azure OpenAI deployment, under your Azure subscription, billing, and access policies.

### Step 2 completion

> Mark Todo 2 as **completed**. Mark Todo 3 as **in-progress**. Proceed to Step 3.
Expand Down Expand Up @@ -188,8 +211,11 @@ These settings can come from environment variables instead of (or alongside) the
| `A365_MCP_AUTH_TOKEN` | Bearer token for the MCP server, used when `--auth-token` is not passed. **Preferred over the flag** — it keeps the token out of process listings (`ps` / Task Manager) and shell history. If you pass `--auth-token`, the CLI prints a one-time warning recommending this variable. |
| `A365_EVAL_COPILOT_MODEL` | Override the GitHub Copilot model (exact model ID, e.g. `claude-haiku-4.5`). |
| `A365_EVAL_CLAUDE_MODEL` | Override the Claude Code model (alias, e.g. `haiku`). |
| `A365_EVAL_AZURE_OPENAI_ENDPOINT` | **Required for `--eval-engine azure-openai`.** Azure OpenAI endpoint, including the API path (e.g. `https://<resource>.services.ai.azure.com/openai/v1`). |
| `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` | **Required for `--eval-engine azure-openai`.** Deployment (model) name to score with. `gpt-5.4` is recommended — it is the model this engine was tested against. |
| `A365_EVAL_AZURE_OPENAI_MAX_CONCURRENCY` | Parallel check-scoring calls for the `azure-openai` engine. Default `100`. |

The model defaults to Claude Haiku 4.5; override only to move to a newer model without waiting for a CLI release.
For the coding-agent engines the model defaults to Claude Haiku 4.5; override only to move to a newer model without waiting for a CLI release. The `azure-openai` engine uses whatever deployment you name in `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` (`gpt-5.4` recommended) and authenticates with Entra ID only (no API key — see Step 2).

### What you will see

Expand Down Expand Up @@ -268,6 +294,8 @@ If any step results in an error, stop and analyze the error message carefully.
| `Unauthorized` from `tools/list` | Wrong or expired bearer token | Re-acquire the token (Question 2). |
| `Could not parse server URL` | URL doesn't include the protocol | Add `http://` or `https://` to the URL. |
| `No coding agent detected` after `[2/5]` | Neither `copilot` nor `claude` is on PATH | Install one (Step 2) or pass `--eval-engine none`. |
| `Azure OpenAI is not available` / engine not selected | `azure-openai` chosen but `A365_EVAL_AZURE_OPENAI_ENDPOINT` / `A365_EVAL_AZURE_OPENAI_DEPLOYMENT` are unset | Set both env vars (Step 2). |
| `Azure OpenAI authentication failed` | Not signed in to Entra ID (or no usable credential) for the `azure-openai` engine | Run `az login` (or configure a managed identity / service-principal credential). |
| `Failed to write report to <path>` | Output dir not writable | Choose a different `--output-dir` or fix permissions. |
| Telemetry warning at debug level | Non-blocking — the marker call failed | Ignore. The evaluation runs regardless. |

Expand Down
1 change: 1 addition & 0 deletions src/Directory.Packages.props
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
<PackageVersion Include="Azure.Identity" Version="1.13.2" />
<PackageVersion Include="Azure.ResourceManager" Version="1.13.2" />
<PackageVersion Include="Azure.ResourceManager.AppService" Version="1.2.0" />
<PackageVersion Include="OpenAI" Version="2.8.0" />
<!-- Microsoft Graph -->
<PackageVersion Include="Microsoft.Graph" Version="5.36.0" />
<!-- Transitive pin: resolves NU1107 conflict between Microsoft.Extensions.Http 9.x (needs >= 9.0.8)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,10 +56,11 @@ private static Command CreateEvaluateSubcommand(ILogger logger, IEvaluationPipel
var command = new Command(
"evaluate",
"Evaluate MCP server tool schema quality and generate an HTML report. " +
"Uses a locally installed coding agent (GitHub Copilot or Claude Code) to score semantic checks. " +
"Scores semantic checks with a locally installed coding agent (GitHub Copilot or Claude Code), " +
"or with your own Azure OpenAI deployment via --eval-engine azure-openai. " +
"If no agent is detected, the command stops after writing the checklist so you can score it manually with your own LLM, " +
"or pass --eval-engine none to skip agent probing entirely. " +
"Only run this against MCP servers you trust: the server's tool names and descriptions are sent to a locally running coding agent.");
"Only run this against MCP servers you trust: the server's tool names and descriptions are sent to the scoring engine.");

// Use a required option (not a positional argument) for consistency with other
// develop-mcp subcommands and Azure CLI conventions.
Expand All @@ -78,9 +79,10 @@ private static Command CreateEvaluateSubcommand(ILogger logger, IEvaluationPipel
var evalEngineOption = new Option<string>(
"--eval-engine",
getDefaultValue: () => "auto",
"Which local coding agent scores semantic checks. " +
"auto: try github-copilot then claude-code. " +
"github-copilot or claude-code: use only that engine. " +
"Which engine scores semantic checks. " +
"auto: try github-copilot then claude-code (local coding agents). " +
"github-copilot or claude-code: use only that local agent. " +
"azure-openai: use your own Azure OpenAI deployment as the judge via Entra ID (set A365_EVAL_AZURE_OPENAI_ENDPOINT and A365_EVAL_AZURE_OPENAI_DEPLOYMENT). " +
"none: skip automatic scoring and expect the checklist to be pre-scored (bring-your-own-LLM).");

var authTokenOption = new Option<string?>(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,9 @@
<PackageReference Include="Azure.ResourceManager" />
<PackageReference Include="Azure.ResourceManager.AppService" />

<!-- OpenAI SDK (Azure OpenAI BYO-LLM judge for evaluate) -->
<PackageReference Include="OpenAI" />

<!-- Microsoft Graph -->
<PackageReference Include="Microsoft.Graph" />
</ItemGroup>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,5 +56,6 @@ public enum EvalEngine
Auto,
GitHubCopilot,
ClaudeCode,
AzureOpenAI,
None
}
3 changes: 3 additions & 0 deletions src/Microsoft.Agents.A365.DevTools.Cli/Program.cs
Original file line number Diff line number Diff line change
Expand Up @@ -401,6 +401,9 @@ private static void ConfigureServices(IServiceCollection services, LogLevel mini
// adding its ICodingAgentLauncher implementation file plus one line here.
services.AddSingleton<ICodingAgentLauncher, GitHubCopilotLauncher>();
services.AddSingleton<ICodingAgentLauncher, ClaudeCodeLauncher>();
// Azure OpenAI judge (BYO model via Entra ID). Explicit-only (AutoDetectable=false),
// so it never participates in --eval-engine auto despite being registered here.
services.AddSingleton<ICodingAgentLauncher, AzureOpenAiLauncher>();
services.AddSingleton<IChecklistEvaluator, ChecklistEvaluator>();
services.AddSingleton<IEvaluationAnalyzer, EvaluationAnalyzer>();
services.AddSingleton<IReportGenerator, ReportGenerator>();
Expand Down
Loading
Loading