generated from amazon-archives/__template_Apache-2.0
feat(aws-observability): Add AWS Observability plugin #68
Open: theagenticguy wants to merge 9 commits into awslabs:main from theagenticguy:aws-observability
Commits:
e3cf4c6 feat(aws-observability): Add AWS Observability plugin (theagenticguy)
5e77452 Merge branch 'main' into aws-observability (krokoko)
a4cf8d0 Merge branch 'main' into aws-observability (krokoko)
0f37872 fix: address PR review feedback for aws-observability plugin (theagenticguy)
cee0518 fix: use awsknowledge HTTP MCP server instead of local docs server (theagenticguy)
4fdcef6 Merge branch 'main' into aws-observability (theagenticguy)
5d73b18 fix: address round 3 Copilot review feedback (theagenticguy)
00016e8 refactor: slim SKILL.md to be agent-focused, not README-like (theagenticguy)
4515dd5 Merge branch 'main' into aws-observability (krokoko)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| { | ||
| "author": { | ||
| "name": "Amazon Web Services" | ||
| }, | ||
| "description": "Comprehensive AWS observability platform combining CloudWatch Logs, Metrics, Alarms, Application Signals (APM), CloudTrail security auditing, and automated codebase observability gap analysis for complete monitoring, troubleshooting, and optimization.", | ||
| "homepage": "https://github.com/awslabs/agent-plugins", | ||
| "keywords": [ | ||
| "aws", | ||
| "observability", | ||
| "cloudwatch", | ||
| "monitoring", | ||
| "logs", | ||
| "metrics", | ||
| "alarms", | ||
| "application-signals", | ||
| "apm", | ||
| "cloudtrail", | ||
| "security", | ||
| "tracing" | ||
| ], | ||
| "license": "Apache-2.0", | ||
| "name": "aws-observability", | ||
| "repository": "https://github.com/awslabs/agent-plugins", | ||
| "version": "1.0.0" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| { | ||
| "mcpServers": { | ||
| "awslabs.cloudwatch-mcp-server": { | ||
| "command": "uvx", | ||
| "args": [ | ||
| "awslabs.cloudwatch-mcp-server@latest" | ||
| ], | ||
| "env": { | ||
| "AWS_PROFILE": "default", | ||
| "AWS_REGION": "us-east-1", | ||
| "FASTMCP_LOG_LEVEL": "ERROR" | ||
| } | ||
| }, | ||
| "awslabs.cloudwatch-applicationsignals-mcp-server": { | ||
| "command": "uvx", | ||
| "args": [ | ||
| "awslabs.cloudwatch-applicationsignals-mcp-server@latest" | ||
| ], | ||
| "env": { | ||
| "AWS_PROFILE": "default", | ||
| "AWS_REGION": "us-east-1", | ||
| "FASTMCP_LOG_LEVEL": "ERROR" | ||
| } | ||
| }, | ||
| "awslabs.cloudtrail-mcp-server": { | ||
| "command": "uvx", | ||
| "args": [ | ||
| "awslabs.cloudtrail-mcp-server@latest" | ||
| ], | ||
| "env": { | ||
| "AWS_PROFILE": "default", | ||
| "AWS_REGION": "us-east-1", | ||
| "FASTMCP_LOG_LEVEL": "ERROR" | ||
| } | ||
| }, | ||
| "awslabs.billing-cost-management-mcp-server": { | ||
| "command": "uvx", | ||
| "args": [ | ||
| "awslabs.billing-cost-management-mcp-server@latest" | ||
| ], | ||
| "env": { | ||
| "AWS_PROFILE": "default", | ||
| "AWS_REGION": "us-east-1", | ||
| "FASTMCP_LOG_LEVEL": "ERROR" | ||
| } | ||
| }, | ||
| "awsknowledge": { | ||
| "type": "http", | ||
| "url": "https://knowledge-mcp.global.api.aws" | ||
| } | ||
| } | ||
| } | ||
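The config above mixes two server shapes: stdio servers launched with `uvx` that read `AWS_PROFILE` and `AWS_REGION` from `env`, and one HTTP server (`awsknowledge`) that only carries a `url`. A minimal sketch of a consistency check a reviewer could run over such a file follows. The function name `check_servers`, the `REQUIRED_ENV` tuple, and the rule that any entry without `"type": "http"` counts as a stdio server are assumptions made for illustration, not part of the PR or of any MCP client.

```python
import json

# Trimmed copy of the config above (two of the five servers),
# inlined so the sketch is self-contained.
MCP_CONFIG = """
{
  "mcpServers": {
    "awslabs.cloudwatch-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.cloudwatch-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "default",
        "AWS_REGION": "us-east-1",
        "FASTMCP_LOG_LEVEL": "ERROR"
      }
    },
    "awsknowledge": {
      "type": "http",
      "url": "https://knowledge-mcp.global.api.aws"
    }
  }
}
"""

# Hypothetical rule: every stdio server should pin these two variables.
REQUIRED_ENV = ("AWS_PROFILE", "AWS_REGION")


def check_servers(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, server in config.get("mcpServers", {}).items():
        if server.get("type") == "http":
            if not server.get("url", "").startswith("https://"):
                problems.append(f"{name}: http server needs an https url")
            continue
        # Assumption: entries without type "http" are stdio servers.
        if not server.get("command"):
            problems.append(f"{name}: stdio server needs a command")
        env = server.get("env", {})
        for key in REQUIRED_ENV:
            if key not in env:
                problems.append(f"{name}: missing env {key}")
    return problems


if __name__ == "__main__":
    found = check_servers(json.loads(MCP_CONFIG))
    print(found or "no problems flagged")
```

A check like this would catch a server entry added later without a region, which matters here because SKILL.md tells the agent that every stdio server takes its profile and region from the env block.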
plugins/aws-observability/skills/aws-observability/SKILL.md (88 changes: 88 additions & 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| name: aws-observability | ||
| description: "Comprehensive AWS observability platform combining CloudWatch Logs, Metrics, Alarms, Application Signals (APM), CloudTrail security auditing, Billing & Cost Management, and automated codebase observability gap analysis. Triggers on phrases like: CloudWatch logs, metrics, alarms, monitoring, observability, application signals, APM, distributed tracing, performance, latency, errors, troubleshooting, root cause analysis, security audit, CloudTrail, log analysis, alerting, SLO, incident response, observability gaps, missing instrumentation, AWS costs, billing, cost anomaly." | ||
| --- | ||
| | ||
| # AWS Observability | ||
| | ||
| Requires AWS CLI credentials. All stdio MCP servers use `AWS_PROFILE` and `AWS_REGION` from their env config (defaults: `default` profile, `us-east-1`). | ||
| | ||
| ## Capabilities | ||
| | ||
| | Capability | MCP Server | Use When | | ||
| | --------------------------- | -------------------------------------------------- | -------------------------------------------------------- | | ||
| | CloudWatch Logs | `awslabs.cloudwatch-mcp-server` | Log queries, pattern detection, anomaly analysis | | ||
| | Metrics & Alarms | `awslabs.cloudwatch-mcp-server` | Metric data, alarm recommendations, trend analysis | | ||
| | Application Signals (APM) | `awslabs.cloudwatch-applicationsignals-mcp-server` | Service health, SLOs, distributed tracing, error budgets | | ||
| | CloudTrail Security | `awslabs.cloudtrail-mcp-server` | IAM changes, resource deletions, compliance audits | | ||
| | Billing & Cost Management | `awslabs.billing-cost-management-mcp-server` | Cost analysis, forecasting, Compute Optimizer, budgets | | ||
| | AWS Documentation | `awsknowledge` (HTTP) | Troubleshooting, best practices, API references | | ||
| | Codebase Observability Gaps | _(file analysis, no MCP)_ | Identify missing logging, metrics, tracing in code | | ||
| | ||
| ## Workflow Decision Tree | ||
| | ||
| **User reports an incident or error?** | ||
| -> Load [Incident Response](references/incident-response.md). Start with `audit_services` wildcard, then correlate alarms + logs + traces + CloudTrail changes. | ||
| | ||
| **User asks about logs or wants to query logs?** | ||
| -> Load [Log Analysis](references/log-analysis.md). Use `execute_log_insights_query`. Always include `| limit` in queries. | ||
| | ||
| **User wants to set up or tune alarms?** | ||
| -> Load [Alerting Setup](references/alerting-setup.md). Use `get_recommended_metric_alarms` for best-practice thresholds. | ||
| | ||
| **User asks about service performance, latency, or SLOs?** | ||
| -> Load [Performance Monitoring](references/performance-monitoring.md). Start with `audit_services`, then `search_transaction_spans` for 100% trace visibility. | ||
| | ||
| **User needs security audit or compliance review?** | ||
| -> Load [Security Auditing](references/security-auditing.md). Follow data source priority: CloudTrail Lake > CloudWatch Logs > Lookup Events API. | ||
| | ||
| **User wants to assess codebase observability?** | ||
| -> Load [Observability Gap Analysis](references/observability-gap-analysis.md). Analyze logging, metrics, tracing, error handling, health checks. | ||
| | ||
| **User setting up Application Signals for the first time?** | ||
| -> Load [Application Signals Setup](references/application-signals-setup.md). Start with `get_enablement_guide`. | ||
| | ||
| **CloudTrail data source priority reference** (loaded by security-auditing.md, not directly): | ||
| -> [CloudTrail Data Source Selection](references/cloudtrail-data-source-selection.md) | ||
| | ||
| ## Essential Log Query Patterns | ||
| | ||
| ### Error Search | ||
| | ||
| ``` | ||
| fields @timestamp, @message, @logStream, level | ||
| | filter level = "ERROR" | ||
| | sort @timestamp desc | ||
| | limit 100 | ||
| ``` | ||
| | ||
| ### Performance Analysis | ||
| | ||
| ``` | ||
| stats count() as requestCount, | ||
| avg(duration) as avgDuration, | ||
| pct(duration, 95) as p95Duration, | ||
| pct(duration, 99) as p99Duration | ||
| by endpoint | ||
| | filter requestCount > 10 | ||
| | sort p95Duration desc | ||
| | limit 100 | ||
| ``` | ||
| | ||
| ### Error Rate Over Time | ||
| | ||
| ``` | ||
| stats count() as total, | ||
| sum(statusCode >= 500) as errors, | ||
| (sum(statusCode >= 500) / count()) * 100 as errorRate | ||
| by bin(5m) as timeWindow | ||
| | sort timeWindow | ||
| ``` | ||
| | ||
| ## Key Tool Entry Points | ||
| | ||
| - **Application Signals**: Start with `audit_services` using `[{"Type":"service","Data":{"Service":{"Type":"Service","Name":"*"}}}]` for wildcard discovery | ||
| - **Logs**: Use `describe_log_groups` to discover groups, then `execute_log_insights_query` | ||
| - **Metrics**: Use Sum for count metrics, Average for utilization, percentiles for latency | ||
| - **CloudTrail**: Check Lake first (`list_event_data_stores`), fall back to CloudWatch Logs, then `lookup_events` | ||
| - **Costs**: Use `cost-explorer` tool for spend analysis, `compute-optimizer` for right-sizing | ||
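Two rules in SKILL.md are easy to get wrong by hand: "Always include `| limit` in queries" and the exact JSON shape of the `audit_services` wildcard target. The sketch below shows one way a caller might enforce both before invoking the tools. The helper names `ensure_limit` and `wildcard_service_targets`, the default of 100, and the regex-based detection are assumptions for illustration. Whether `audit_services` expects the target as a JSON string or as a list is not stated in the diff, so the string form here is also an assumption.

```python
import json
import re

# Matches a "| limit N" stage anywhere in the query text.
# A regex is a rough heuristic: it does not parse the query language.
_LIMIT_STAGE = re.compile(r"\|\s*limit\s+\d+", re.IGNORECASE)


def ensure_limit(query: str, default_limit: int = 100) -> str:
    """Append '| limit N' when the query has no limit stage; otherwise return it unchanged."""
    if _LIMIT_STAGE.search(query):
        return query
    return f"{query.rstrip()}\n| limit {default_limit}"


def wildcard_service_targets() -> str:
    """Serialize the wildcard target quoted under Key Tool Entry Points."""
    targets = [
        {"Type": "service", "Data": {"Service": {"Type": "Service", "Name": "*"}}}
    ]
    # Compact separators reproduce the string exactly as written in SKILL.md.
    return json.dumps(targets, separators=(",", ":"))


if __name__ == "__main__":
    error_rate = (
        "stats count() as total\n"
        "  by bin(5m) as timeWindow\n"
        "| sort timeWindow"
    )
    print(ensure_limit(error_rate))
    print(wildcard_service_targets())
```

Applied to the three query patterns above, `ensure_limit` would leave Error Search and Performance Analysis unchanged and would append a limit stage to Error Rate Over Time, the only one of the three without one.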