new: domain-sharded tp management#4811
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
b062ed8 to
1b98c00
Compare
1b98c00 to
c4c82f1
Compare
|
Rebased on top of latest main after the landing of big #4265 :) |
c4c82f1 to
2b76963
Compare
2f42b2b to
24d9590
Compare
There was a problem hiding this comment.
Updated to declare tracingpolicies before using them, since we are going to need to call tp.TpDomain().
| message ListTracingPoliciesRequest {} | ||
| message ListTracingPoliciesRequest { | ||
| // domain to be listed; empty to list all domains | ||
| string domain = 1; |
There was a problem hiding this comment.
We have basically 3 domains for policies right now:
- "static" domain for tracing policies loaded by configured folder/file
- "grpc" for grpc-loaded policies (eg: tetra CLI)
- "k8s" for policies loaded by the crd watcher
In the tests this knowledge may be leaked, but normally this is all transparent in the code, ie:
- "k8s" is enforced by the k8s
TracingPolicy::TpDomainmethod on addition, and by the crd watcher on deletion - "static" is enforced by the normal
TracingPolicy::TpDomainmethod on addition (there is no deletion path for static policies) - "grpc" is enforced by grpc server for both additions and deletions
| // this enables policies with the same name for different namespaces | ||
| type collectionKey struct { | ||
| name, namespace string | ||
| name, namespace, domain string |
There was a problem hiding this comment.
Each tracing policy is now referenced by its domain too.
| var BaseSensorName = "__base__" | ||
| const ( | ||
| BaseSensorName = "__base__" | ||
| sensorsDomain = "sensors" |
There was a problem hiding this comment.
As explained above, the sensors domain is the one enforced by Sensors related APIs.
|
|
||
| if err := s.observer.AddTracingPolicy(ctx, tp); err != nil { | ||
| gtp := GRPCTracingPolicy{tp, grpcDomain} | ||
| if req.GetDomain() != "" { |
There was a problem hiding this comment.
Normally (and by default) it is just grpc, but for debug purposes eg: via tetra, users can customize the domain they want to act upon.
| return nil, errors.New("ListSensors is deprecated") | ||
| } | ||
|
|
||
| type GRPCTracingPolicy struct { |
There was a problem hiding this comment.
Small wrapper around TracingPolicy to enforce grpc specific domain.
045f89f to
06be127
Compare
|
Q: do we want to expose the domain in the |
kkourt
left a comment
There was a problem hiding this comment.
Thanks!
Can you please add a commit message in the first commit that introduces commits with the motivation for this change?
Things like 6a84712#r3052523498, are also well-suited for commit messages. Making the context part of the git history is very valuable.
Also, do you think we should enforce some constraints on what is allowed (e.g., characters) in a domain name? I think that's probably a good idea.
06be127 to
71bc066
Compare
| } | ||
|
|
||
| func (tp *TracingPolicy) TpDomain() string { | ||
| return k8sDomain |
There was a problem hiding this comment.
I find this confusing. Looking at the subsequent patches, it seems that what you want to do is have all other users of TracingPolicies use a different type that returns a different TpDomain().
Would it be a better choice to add a Domain field in the TracingPolicy? We don't have to expose it to users. It can be a private field (with getters and setters) or maybe marked with json:"-"?
There was a problem hiding this comment.
Why is it confusing? I mean, i don't find it confusing obviously :) Mind to share your concerns?
Indeed i find it rather simple and straightforward: each implementation of the interface will have its own specific domain.
There was a problem hiding this comment.
That's a good question.
One part is that historically, we used the same type for multiple "domains" (e.g., gRPC and static). I also feel that at least for the GenericTracingPolicy we can have the same type everywhere, and it can be used to make the domains more dynamic in the future.
There was a problem hiding this comment.
But I don't feel strongly about it. I do think, however, that we would need to document the new behavior in the commit message (and maybe the types) because this is a change from what we did before.
There was a problem hiding this comment.
also feel that at least for the GenericTracingPolicy we can have the same type everywhere, and it can be used to make the domains more dynamic in the future.
What i don't like is that if we allow people to customize the domain, sooner or later we might end up with mixed domains over and over (ie: "oh i added a GenericTracingPolicy { Domain: "k8s" } because x,y,z".
I prefer the "one type for each" in this case; but again, just like you, this is not a strong opinion.
Luckily, we can change in the future without any breaking change.
I do think, however, that we would need to document the new behavior in the commit message (and maybe the types) because this is a change from what we did before.
Fully agree! Let me improve the commit messages :)
Anyway, would love to hear more feedback on this topic; let me ping @mtardy @will-isovalent
Thanks!
I fully agree! EDIT: but also, since we are the ones setting them and they are just returned from interface method implementations, i am not sure whether we need it. After all, users cannot change it and they are all statically lived strings. |
ca0fa7a to
c133b46
Compare
117dd7b to
c906f31
Compare
kkourt
left a comment
There was a problem hiding this comment.
LGTM, thanks!
Left some nits, but it's up to you if you want to address them.
| // TPKindDefinition is the kind name of Cilium Tracing Policy | ||
| TPNamespacedKindDefinition = "TracingPolicyNamespaced" | ||
|
|
||
| k8sDomain = "k8s" |
There was a problem hiding this comment.
nit: How about we add a new package with all the domain constants? This means that we can use them in the CLI code as well.
There was a problem hiding this comment.
Indeed i tried hard not to expose them :) i don't think we should care about what the value is from the outside.
This means that we can use them in the CLI code as well.
Do we need to?
Also, ran `make protogen`. This commit introduces a `domain` for the tracing policies. We will later introduce a `TpDomain()` method on `TracingPolicy` interface. The idea is that each component (tetra through grpc, k8s and the static ones loaded by configured files) will only be able to act upon its own domain. For example, a crd tracing policy cannot be overridden/deleted by tetra, and the same goes the other way. If one pushes the same policy through different domains, it will end up with multiple policies loaded. We will have 3 domains for policies: * "static" domain for tracing policies loaded by configured folder/file * "grpc" for grpc-loaded policies (eg: tetra CLI) * "k8s" for policies loaded by the crd watcher Note that the grpc server will still allow to act on a specific domain, for debug purposes. That's why we added a `domain` field to the protobuf messages. Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
In the tests this knowledge may be leaked, but normally it is all transparent: * "k8s" is enforced by k8s `v1alpha1.TracingPolicy::TpDomain()` and by the crd watcher * "static" is enforced by normal `GenericTracingPolicy::TpDomain()` * "grpc" is enforced by grpc server through `GRPCTracingPolicy::TpDomain()` (introduced in next commit) Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
c906f31 to
028b167
Compare
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
It is needed to list domains. Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
It is reserved for internal use (to keep track of loaded sensors). Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
With the `domain` based approach, we can no longer act upon a sensor via
the tracingpolicy gRPC API and viceversa, ie: we can no longer
act upon a tracingpolicy by using the Sensors gRPC API.
This happens because sensors are all registered in the (private)
`sensors` domain, and tracing policies instead belong to their own
specified domain.
In this specific case, the sensor is created and attached during the
`h.addTracingPolicy()` call, and stored as part of the same collection
of the tracing policy:
```
sensors, err := sensorsFromPolicyHandlers(op.tp, filterID)
if err != nil {
col.err = err
col.state = LoadErrorState
return err
}
col.sensors = make([]SensorIface, 0, len(sensors))
col.sensors = append(col.sensors, sensors...)
```
Thus, trying to reach for the sensor via the Sensor API is not going to work,
because as aforementioned, that enforces the `sensors` domain.
Signed-off-by: Federico Di Pierro <nierro92@gmail.com>
028b167 to
b9c8b4d
Compare
|
Also, rebased on top of main. |
Fixes #4808
Description
Add
domainsharding for tracing policies in sensors manager.Each "domain" can only see and work on its own policies;
tetra tracingpolicysub commands gained a new--domainflag; by default it will only act upongrpcdomain (enforced by the grpc API), but it can be enforced to work eg: on k8s domain, for debug purposes.Also a new
tetra tracingpolicy domainscommand has been added to list all available domains (this is a dynamic list, ie: it depends upon loaded policies domains).Examples of current impl:
We will then have:
Then we load a new policy via tetra:
And we now have a new domain:
We then try to remove the tetra policy but looking for it in the
staticdomain:But in the correct domain (
grpcis the default for grpc connections)Changelog