38 changes: 19 additions & 19 deletions SKILLS.md
@@ -17,17 +17,17 @@ Reach for Collibra tools when the user's question is about **understanding, disc

### Discovery & Search

-**`data_assets_discover`** — Natural language semantic search over data assets (tables, columns, datasets). Use when the user asks open-ended questions like "what data do we have about customers?". Requires `dgc.ai-copilot` permission.
+**`discover_data_assets`** — Natural language semantic search over data assets (tables, columns, datasets). Use when the user asks open-ended questions like "what data do we have about customers?". Requires `dgc.ai-copilot` permission.

-**`business_glossary_discover`** — Natural language semantic search over the business glossary (terms, acronyms, KPIs, definitions). Use when the user asks about the meaning of a business concept. Requires `dgc.ai-copilot` permission.
+**`discover_business_glossary`** — Natural language semantic search over the business glossary (terms, acronyms, KPIs, definitions). Use when the user asks about the meaning of a business concept. Requires `dgc.ai-copilot` permission.

**`search_asset_keyword`** — Wildcard keyword search. Returns names, IDs, and metadata but not full asset details. Use this to find an asset's UUID when you only know its name. Supports filtering by resource type, community, domain, asset type, status, and creator. Paginated via `limit`/`offset`.

-**`asset_types_list`** — List all asset type names and UUIDs. Use this when you need a type UUID to filter `search_asset_keyword` results.
+**`list_asset_types`** — List all asset type names and UUIDs. Use this when you need a type UUID to filter `search_asset_keyword` results.

### Asset Details

-**`asset_details_get`** — Retrieve full details for a single asset by UUID: attributes, relations, and metadata. Returns a direct link to the asset in the Collibra UI. Relations are paginated (50 per page); use `outgoingRelationsCursor` and `incomingRelationsCursor` from the previous response to page through them.
+**`get_asset_details`** — Retrieve full details for a single asset by UUID: attributes, relations, and metadata. Returns a direct link to the asset in the Collibra UI. Relations are paginated (50 per page); use `outgoingRelationsCursor` and `incomingRelationsCursor` from the previous response to page through them.

### Semantic Graph Traversal

@@ -43,13 +43,13 @@ These tools walk the Collibra asset relation graph to answer lineage and semanti

### Data Classification

-**`data_class_search`** — Search for data classes by name or description. Use this to find a classification UUID before applying it to an asset. Requires `dgc.data-classes-read` permission.
+**`search_data_class`** — Search for data classes by name or description. Use this to find a classification UUID before applying it to an asset. Requires `dgc.data-classes-read` permission.

-**`data_classification_match_search`** — Search existing classification matches (associations between data classes and assets). Filter by asset IDs, classification IDs, or status (`ACCEPTED`, `REJECTED`, `SUGGESTED`). Requires `dgc.classify` + `dgc.catalog`.
+**`search_data_classification_match`** — Search existing classification matches (associations between data classes and assets). Filter by asset IDs, classification IDs, or status (`ACCEPTED`, `REJECTED`, `SUGGESTED`). Requires `dgc.classify` + `dgc.catalog`.

-**`data_classification_match_add`** — Apply a data class to an asset. Requires both the asset UUID and classification UUID. Requires `dgc.classify` + `dgc.catalog`.
+**`add_data_classification_match`** — Apply a data class to an asset. Requires both the asset UUID and classification UUID. Requires `dgc.classify` + `dgc.catalog`.

-**`data_classification_match_remove`** — Remove a classification match. Requires `dgc.classify` + `dgc.catalog`.
+**`remove_data_classification_match`** — Remove a classification match. Requires `dgc.classify` + `dgc.catalog`.

### Technical Lineage

@@ -69,24 +69,24 @@ These tools query the technical lineage graph — a map of all data objects and

### Data Contracts

-**`data_contract_list`** — List data contracts with cursor-based pagination. Filter by `manifestId`. Use this to find a contract's UUID.
+**`list_data_contract`** — List data contracts with cursor-based pagination. Filter by `manifestId`. Use this to find a contract's UUID.

-**`data_contract_manifest_pull`** — Download the manifest for a data contract by UUID.
+**`pull_data_contract_manifest`** — Download the manifest for a data contract by UUID.

-**`data_contract_manifest_push`** — Upload/update a manifest for a data contract by UUID.
+**`push_data_contract_manifest`** — Upload/update a manifest for a data contract by UUID.

---

## Common Workflows

### Find an asset and get its details
1. `search_asset_keyword` with the asset name → get UUID from results
-2. `asset_details_get` with the UUID → get full attributes and relations
+2. `get_asset_details` with the UUID → get full attributes and relations

### Classify a column
1. `search_asset_keyword` to find the column UUID
-2. `data_class_search` to find the data class UUID
-3. `data_classification_match_add` with both UUIDs
+2. `search_data_class` to find the data class UUID
+3. `add_data_classification_match` with both UUIDs

### Understand what a table means
1. `search_asset_keyword` to find the table UUID
@@ -112,15 +112,15 @@ These tools query the technical lineage graph — a map of all data objects and
3. Follow up with `get_lineage_entity` for specific consumers as needed

### Manage a data contract
-1. `data_contract_list` to find the contract UUID
-2. `data_contract_manifest_pull` to download, edit, then `data_contract_manifest_push` to update
+1. `list_data_contract` to find the contract UUID
+2. `pull_data_contract_manifest` to download, edit, then `push_data_contract_manifest` to update

---

## Tips

- **UUIDs are required for most tools.** When you only have a name, start with `search_asset_keyword` or the natural language discovery tools to get the UUID first.
-- **`data_assets_discover` vs `search_asset_keyword`**: Prefer `data_assets_discover` for open-ended semantic questions; prefer `search_asset_keyword` when you know the exact name or need to filter by type/community/domain.
-- **Permissions**: `data_assets_discover` and `business_glossary_discover` require the `dgc.ai-copilot` permission. Classification tools require `dgc.classify` + `dgc.catalog`. If a tool fails with a permission error, let the user know which permission is needed.
-- **Pagination**: `search_asset_keyword`, `asset_types_list`, `data_class_search`, and `data_classification_match_search` use `limit`/`offset`. `data_contract_list` and `asset_details_get` (for relations) use cursor-based pagination — carry the cursor from the previous response. Lineage tools (`search_lineage_entities`, `get_lineage_upstream`, `get_lineage_downstream`, `search_lineage_transformations`) also use cursor-based pagination.
+- **`discover_data_assets` vs `search_asset_keyword`**: Prefer `discover_data_assets` for open-ended semantic questions; prefer `search_asset_keyword` when you know the exact name or need to filter by type/community/domain.
+- **Permissions**: `discover_data_assets` and `discover_business_glossary` require the `dgc.ai-copilot` permission. Classification tools require `dgc.classify` + `dgc.catalog`. If a tool fails with a permission error, let the user know which permission is needed.
+- **Pagination**: `search_asset_keyword`, `list_asset_types`, `search_data_class`, and `search_data_classification_match` use `limit`/`offset`. `list_data_contract` and `get_asset_details` (for relations) use cursor-based pagination — carry the cursor from the previous response. Lineage tools (`search_lineage_entities`, `get_lineage_upstream`, `get_lineage_downstream`, `search_lineage_transformations`) also use cursor-based pagination.
- **Error handling**: Validation errors are returned in the output `error` field (not as Go errors), so always check `error` and `success`/`found` fields in the response before using the data.
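
As a rough illustration of the error-handling tip, the sketch below decodes a tool response and checks `error` and `found` before touching the payload. The `toolResult` struct and the `checkResult` helper are hypothetical names invented for this example; only the `error` and `found` field names come from the text above, and real tool outputs carry more fields.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// toolResult is a hypothetical, trimmed-down view of a tool response.
type toolResult struct {
	Error string          `json:"error,omitempty"`
	Found bool            `json:"found"`
	Asset json.RawMessage `json:"asset,omitempty"`
}

// checkResult returns the asset payload only when the response carries no
// validation error and reports that the asset was found.
func checkResult(raw []byte) (json.RawMessage, error) {
	var res toolResult
	if err := json.Unmarshal(raw, &res); err != nil {
		return nil, fmt.Errorf("failed to parse tool response: %w", err)
	}
	if res.Error != "" {
		return nil, errors.New(res.Error)
	}
	if !res.Found {
		return nil, errors.New("asset not found")
	}
	return res.Asset, nil
}

func main() {
	_, err := checkResult([]byte(`{"found": false, "error": "invalid asset id"}`))
	fmt.Println(err)
}
```

Checking `error` first means a validation message is surfaced as-is instead of being masked by a generic "not found".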
143 changes: 143 additions & 0 deletions pkg/clients/dgc_responsibility_client.go
@@ -0,0 +1,143 @@
package clients

import (
"context"
"encoding/json"
"fmt"
"log/slog"
"net/http"
)

const errFailedToCreateRequest = "failed to create request: %w"

// Responsibility represents a single responsibility assignment for an asset.
type Responsibility struct {
ID string `json:"id"`
Role *ResourceRole `json:"role,omitempty"`
Owner *ResourceRef `json:"owner,omitempty"`
BaseResource *ResourceRef `json:"baseResource,omitempty"`
System bool `json:"system"`
}

// ResourceRole represents the role in a responsibility (e.g., Owner, Steward).
type ResourceRole struct {
ID string `json:"id"`
Name string `json:"name"`
}

// ResourceRef represents a reference to a resource (user, group, community, etc.) in the API.
type ResourceRef struct {
ID string `json:"id"`
ResourceDiscriminator string `json:"resourceDiscriminator"`
}

// ResponsibilityPagedResponse represents the paginated response from the responsibilities API.
type ResponsibilityPagedResponse struct {
Total int64 `json:"total"`
Offset int64 `json:"offset"`
Limit int64 `json:"limit"`
Results []Responsibility `json:"results"`
}

// ResponsibilityQueryParams defines the query parameters for the responsibilities API.
type ResponsibilityQueryParams struct {
ResourceIDs string `url:"resourceIds,omitempty"`
IncludeInherited bool `url:"includeInherited,omitempty"`
Limit int `url:"limit,omitempty"`
Offset int `url:"offset,omitempty"`
}

// UserResponse represents the response from the /rest/2.0/users/{userId} endpoint.
type UserResponse struct {
ID string `json:"id"`
UserName string `json:"userName"`
FirstName string `json:"firstName,omitempty"`
LastName string `json:"lastName,omitempty"`
}

// UserGroupResponse represents the response from the /rest/2.0/userGroups/{groupId} endpoint.
type UserGroupResponse struct {
ID string `json:"id"`
Name string `json:"name"`
}

// GetResponsibilities fetches the first page (up to 100 entries) of responsibilities for the given asset ID, including inherited ones.
func GetResponsibilities(ctx context.Context, collibraHttpClient *http.Client, assetID string) ([]Responsibility, error) {
slog.InfoContext(ctx, fmt.Sprintf("Fetching responsibilities for asset: %s", assetID))

params := ResponsibilityQueryParams{
ResourceIDs: assetID,
IncludeInherited: true,
Limit: 100,
Offset: 0,
}

endpoint, err := buildUrl("/rest/2.0/responsibilities", params)
if err != nil {
return nil, fmt.Errorf("failed to build endpoint: %w", err)
}

req, err := http.NewRequestWithContext(ctx, "GET", endpoint, nil)
if err != nil {
return nil, fmt.Errorf(errFailedToCreateRequest, err)
}

body, err := executeRequest(collibraHttpClient, req)
if err != nil {
return nil, err
}

var response ResponsibilityPagedResponse
if err := json.Unmarshal(body, &response); err != nil {
return nil, fmt.Errorf("failed to parse responsibilities response: %w", err)
}

return response.Results, nil
}
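
`GetResponsibilities` requests a single page (`Limit: 100`, `Offset: 0`), so an asset with more than 100 responsibilities would be returned only in part. One possible way to page through everything, using the `Total` field the response already exposes, is sketched below. `pagedResponse`, `fetchPage`, `fetchAll`, and `fakeFetch` are hypothetical names for this example and are not part of the PR; the in-memory fetcher stands in for the HTTP call.

```go
package main

import "fmt"

// pagedResponse mirrors the Total/Offset/Limit/Results shape of
// ResponsibilityPagedResponse, with string results for brevity.
type pagedResponse struct {
	Total   int64
	Offset  int64
	Limit   int64
	Results []string
}

// fetchPage is a hypothetical stand-in for one HTTP call with limit/offset.
type fetchPage func(offset, limit int) (pagedResponse, error)

// fetchAll keeps requesting pages until Total results have been collected
// or the server returns an empty page.
func fetchAll(fetch fetchPage, limit int) ([]string, error) {
	var all []string
	for offset := 0; ; offset += limit {
		page, err := fetch(offset, limit)
		if err != nil {
			return nil, fmt.Errorf("failed to fetch page at offset %d: %w", offset, err)
		}
		all = append(all, page.Results...)
		if len(page.Results) == 0 || int64(len(all)) >= page.Total {
			return all, nil
		}
	}
}

// fakeFetch serves pages from an in-memory slice instead of an API.
func fakeFetch(data []string) fetchPage {
	return func(offset, limit int) (pagedResponse, error) {
		if offset > len(data) {
			offset = len(data)
		}
		end := offset + limit
		if end > len(data) {
			end = len(data)
		}
		return pagedResponse{
			Total:   int64(len(data)),
			Offset:  int64(offset),
			Limit:   int64(limit),
			Results: data[offset:end],
		}, nil
	}
}

func main() {
	all, err := fetchAll(fakeFetch([]string{"a", "b", "c", "d", "e"}), 2)
	fmt.Println(len(all), err)
}
```

The empty-page check guards against looping forever if the reported total is larger than what the server actually returns.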

// GetUserName fetches the display name for a user by ID.
func GetUserName(ctx context.Context, collibraHttpClient *http.Client, userID string) (string, error) {
endpoint := fmt.Sprintf("/rest/2.0/users/%s", userID)

req, err := http.NewRequestWithContext(ctx, "GET", endpoint, nil)
if err != nil {
return "", fmt.Errorf(errFailedToCreateRequest, err)
}

body, err := executeRequest(collibraHttpClient, req)
if err != nil {
return "", err
}

var user UserResponse
if err := json.Unmarshal(body, &user); err != nil {
return "", fmt.Errorf("failed to parse user response: %w", err)
}

if user.FirstName != "" || user.LastName != "" {
return fmt.Sprintf("%s %s (%s)", user.FirstName, user.LastName, user.UserName), nil
}
return user.UserName, nil
}
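
`GetUserName` applies the `"%s %s (%s)"` format whenever either name part is set, so a user with only a last name would get a leading space (and only a first name, a double space before the parenthesis). A possible variant that trims the gap is sketched below; `displayName` is a hypothetical helper for illustration, not code from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// displayName formats "First Last (username)", trimming the gap left when
// only one of the two name parts is present. It falls back to the bare
// username when both parts are empty.
func displayName(firstName, lastName, userName string) string {
	fullName := strings.TrimSpace(firstName + " " + lastName)
	if fullName == "" {
		return userName
	}
	return fmt.Sprintf("%s (%s)", fullName, userName)
}

func main() {
	fmt.Println(displayName("Ada", "Lovelace", "alovelace")) // Ada Lovelace (alovelace)
	fmt.Println(displayName("", "Lovelace", "alovelace"))    // Lovelace (alovelace)
	fmt.Println(displayName("", "", "alovelace"))            // alovelace
}
```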

// GetUserGroupName fetches the name for a user group by ID.
func GetUserGroupName(ctx context.Context, collibraHttpClient *http.Client, groupID string) (string, error) {
endpoint := fmt.Sprintf("/rest/2.0/userGroups/%s", groupID)

req, err := http.NewRequestWithContext(ctx, "GET", endpoint, nil)
if err != nil {
return "", fmt.Errorf(errFailedToCreateRequest, err)
}

body, err := executeRequest(collibraHttpClient, req)
if err != nil {
return "", err
}

var group UserGroupResponse
if err := json.Unmarshal(body, &group); err != nil {
return "", fmt.Errorf("failed to parse user group response: %w", err)
}

return group.Name, nil
}
120 changes: 112 additions & 8 deletions pkg/tools/get_asset_details.go
@@ -6,6 +6,7 @@ import (
"log/slog"
"net/http"
"strings"
"sync"

"github.com/collibra/chip/pkg/chip"
"github.com/collibra/chip/pkg/clients"
@@ -19,16 +20,26 @@ type AssetDetailsInput struct {
}

type AssetDetailsOutput struct {
-Asset *clients.Asset `json:"asset,omitempty" jsonschema:"the detailed asset information if found"`
-Link string `json:"link,omitempty" jsonschema:"the link you can navigate to in Collibra to view the asset"`
-Error string `json:"error,omitempty" jsonschema:"error message if asset not found or other error occurred"`
-Found bool `json:"found" jsonschema:"whether the asset was found"`
+Asset *clients.Asset `json:"asset,omitempty" jsonschema:"the detailed asset information if found"`
+Responsibilities []AssetResponsibility `json:"responsibilities,omitempty" jsonschema:"the responsibilities assigned to this asset, including inherited ones"`
+ResponsibilitiesStatus string `json:"responsibilitiesStatus,omitempty" jsonschema:"status message for responsibilities, e.g. No responsibilities assigned"`
+Link string `json:"link,omitempty" jsonschema:"the link you can navigate to in Collibra to view the asset"`
+Error string `json:"error,omitempty" jsonschema:"error message if asset not found or other error occurred"`
+Found bool `json:"found" jsonschema:"whether the asset was found"`
}

// AssetResponsibility represents a role assignment (e.g., Owner, Steward) for an asset.
type AssetResponsibility struct {
RoleName string `json:"roleName" jsonschema:"the name of the resource role (e.g., Owner, Business Steward)"`
UserName string `json:"userName,omitempty" jsonschema:"the username of the assigned user, if the owner is a user"`
GroupName string `json:"groupName,omitempty" jsonschema:"the name of the assigned group, if the owner is a user group"`
Inherited bool `json:"inherited" jsonschema:"true if the responsibility is inherited from a parent resource (domain or community), false if directly assigned to this asset"`
}

func NewAssetDetailsTool(collibraClient *http.Client) *chip.Tool[AssetDetailsInput, AssetDetailsOutput] {
return &chip.Tool[AssetDetailsInput, AssetDetailsOutput]{
Name: "get_asset_details",
-Description: "Get detailed information about a specific asset by its UUID, including attributes, relations, and metadata. Returns up to 100 attributes per type and supports cursor-based pagination for relations (50 per page).",
+Description: "Get detailed information about a specific asset by its UUID, including attributes, relations, responsibilities (owners, stewards, and other role assignments), and metadata. Returns up to 100 attributes per type and supports cursor-based pagination for relations (50 per page).",
Handler: handleAssetDetails(collibraClient),
Permissions: []string{},
}
@@ -70,10 +81,103 @@ func handleAssetDetails(collibraClient *http.Client) chip.ToolHandlerFunc[AssetD
slog.WarnContext(ctx, "Collibra instance URL unknown, links will be rendered without host")
}

responsibilities, err := clients.GetResponsibilities(ctx, collibraClient, assetUUID.String())
if err != nil {
slog.WarnContext(ctx, fmt.Sprintf("Failed to retrieve responsibilities: %s", err.Error()))
}

mappedResponsibilities := resolveResponsibilities(ctx, collibraClient, responsibilities, assetUUID.String())
responsibilitiesStatus := ""
if len(mappedResponsibilities) == 0 {
responsibilitiesStatus = "No responsibilities assigned"
}

return AssetDetailsOutput{
-Asset: &assets[0],
-Found: true,
-Link: fmt.Sprintf("%s/asset/%s", strings.TrimSuffix(collibraHost, "/"), assetUUID),
+Asset: &assets[0],
+Responsibilities: mappedResponsibilities,
+ResponsibilitiesStatus: responsibilitiesStatus,
+Found: true,
+Link: fmt.Sprintf("%s/asset/%s", strings.TrimSuffix(collibraHost, "/"), assetUUID),
}, nil
}
}

func resolveResponsibilities(ctx context.Context, collibraClient *http.Client, responsibilities []clients.Responsibility, assetID string) []AssetResponsibility {
if len(responsibilities) == 0 {
return nil
}

// Resolve each unique owner's display name once to avoid duplicate lookups
ownerNames := resolveOwnerNames(ctx, collibraClient, responsibilities)

result := make([]AssetResponsibility, 0, len(responsibilities))
for _, r := range responsibilities {
entry := AssetResponsibility{}
if r.Role != nil {
entry.RoleName = r.Role.Name
}
if r.Owner != nil {
resolved := ownerNames[r.Owner.ID]
if r.Owner.ResourceDiscriminator == "UserGroup" {
entry.GroupName = resolved
} else {
entry.UserName = resolved
}
}
entry.Inherited = r.BaseResource != nil && r.BaseResource.ID != assetID
result = append(result, entry)
}
return result
}
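
To make the mapping in `resolveResponsibilities` concrete, the sketch below applies the same inherited rule (a base resource exists and its ID differs from the asset's) and prints the resulting JSON shape. The lowercase types and the sample IDs and names are stand-ins invented for this example; only the JSON tags and the rule itself are taken from the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceRef is a local stand-in for clients.ResourceRef.
type resourceRef struct {
	ID string
}

// assetResponsibility copies the JSON tags of AssetResponsibility.
type assetResponsibility struct {
	RoleName  string `json:"roleName"`
	UserName  string `json:"userName,omitempty"`
	GroupName string `json:"groupName,omitempty"`
	Inherited bool   `json:"inherited"`
}

// isInherited reports whether a responsibility comes from a parent
// resource: the base resource must be present and differ from the asset.
func isInherited(base *resourceRef, assetID string) bool {
	return base != nil && base.ID != assetID
}

func main() {
	// Sample data: one direct user assignment, one group assignment
	// inherited from a (made-up) domain.
	entries := []assetResponsibility{
		{
			RoleName:  "Owner",
			UserName:  "Ada Lovelace (alovelace)",
			Inherited: isInherited(&resourceRef{ID: "asset-1"}, "asset-1"),
		},
		{
			RoleName:  "Steward",
			GroupName: "Data Office",
			Inherited: isInherited(&resourceRef{ID: "domain-9"}, "asset-1"),
		},
	}
	out, err := json.Marshal(entries)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because `userName` and `groupName` use `omitempty`, each entry carries only the field that matches its owner type, while `inherited` is always present.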

// resolveOwnerNames fetches display names for all unique owners in parallel.
// Returns a map of owner ID to resolved display name.
func resolveOwnerNames(ctx context.Context, collibraClient *http.Client, responsibilities []clients.Responsibility) map[string]string {
// Deduplicate owners by ID
owners := make(map[string]*clients.ResourceRef)
for _, r := range responsibilities {
if r.Owner != nil {
owners[r.Owner.ID] = r.Owner
}
}

names := make(map[string]string, len(owners))
var mu sync.Mutex
var wg sync.WaitGroup

for _, owner := range owners {
wg.Add(1)
go func(o *clients.ResourceRef) {
defer wg.Done()
name := fetchOwnerName(ctx, collibraClient, o)
mu.Lock()
names[o.ID] = name
mu.Unlock()
}(owner)
}

wg.Wait()
return names
}
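
`resolveOwnerNames` starts one goroutine per unique owner, which is fine for a handful of owners. If the number of owners could grow large, one option is to cap the number of concurrent lookups with a buffered channel used as a semaphore. The sketch below shows that pattern under stated assumptions: `resolveAll` and its `lookup` callback are hypothetical, with `lookup` standing in for the HTTP-backed name resolution.

```go
package main

import (
	"fmt"
	"sync"
)

// resolveAll looks up each unique ID once, running at most maxParallel
// lookups at a time. lookup is a hypothetical stand-in for an HTTP call.
func resolveAll(ids []string, maxParallel int, lookup func(id string) string) map[string]string {
	if maxParallel < 1 {
		maxParallel = 1
	}

	// Deduplicate IDs so each one is resolved only once.
	unique := make(map[string]struct{}, len(ids))
	for _, id := range ids {
		unique[id] = struct{}{}
	}

	names := make(map[string]string, len(unique))
	sem := make(chan struct{}, maxParallel) // counting semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup

	for id := range unique {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			name := lookup(id)
			mu.Lock()
			names[id] = name
			mu.Unlock()
		}(id)
	}

	wg.Wait()
	return names
}

func main() {
	names := resolveAll([]string{"u1", "u2", "u1"}, 2, func(id string) string {
		return "name-" + id
	})
	fmt.Println(len(names), names["u1"], names["u2"])
}
```

The mutex still guards the shared map, exactly as in the original; the semaphore only limits how many lookups are in flight at once.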

func fetchOwnerName(ctx context.Context, collibraClient *http.Client, owner *clients.ResourceRef) string {
switch owner.ResourceDiscriminator {
case "User":
name, err := clients.GetUserName(ctx, collibraClient, owner.ID)
if err != nil {
slog.WarnContext(ctx, fmt.Sprintf("Failed to resolve user name for %s: %s", owner.ID, err.Error()))
return owner.ID
}
return name
case "UserGroup":
name, err := clients.GetUserGroupName(ctx, collibraClient, owner.ID)
if err != nil {
slog.WarnContext(ctx, fmt.Sprintf("Failed to resolve group name for %s: %s", owner.ID, err.Error()))
return owner.ID
}
return name
default:
slog.WarnContext(ctx, fmt.Sprintf("Unknown owner type '%s' for %s", owner.ResourceDiscriminator, owner.ID))
return owner.ID
}
}