Evaluate rules on our own#418
Conversation
TODO
|
3a06180 to
f2e3457
Compare
Web doesn't handle those as regex capable operators and what users are using as expressions are simple glob like wildcards only. So if this interprets this as regex it will certainly fail and we have to make sure Web also serializes this as regex. |
Notifications transforms it into a regexp, so you don't need to change anything regarding this on the web side. |
f2e3457 to
0bd1872
Compare
Ah, but don't you think it's better then to leave this up to how the serialization is actually performed? I don't quite like the idea to require the daemon to transform the value of a condition prior interpretation. We may even serialize like/unlike to equal/unequal with |
The transformation exists because of the previous filter string parsing implementation. If you think it's better to leave it up to the serialization, then it's fine for me too and I don't mind changing it.
How would merging the like/unlike to equal/unequal disambiguate something? In fact that would make it even more complex for us to distinguish when to and not to use regexp. Compiling regexps is expensive, so equal/unequal are used for simple and plain comparisons unconditionally, while like/unlike are used for regexps. If you merge them, then I would have to add some kind of hacky workaround to determine whether I should use regexp based on the value of the condition, which is not ideal. |
It's not a must. I just thought that if we allow regular expressions, why not make it explicit and remove the operator distinction. That distinction was only necessary for Web due the query-string serialization to get rid of the error-prone inspection of whether the value contained an asterisk or not. That's why I also mentioned a different key other than |
Ah, ok, that should work just fine then! |
6fd71d3 to
a78fe57
Compare
5b424bc to
c5038e0
Compare
oxzi
left a comment
There was a problem hiding this comment.
Thanks, LGTM.
I went over the diff multiple times and it seems to be fine. Of course, there might be some things I am missing due to the PR's size, but I would say let's merge it!
There will follow some more changes to the rules evaluation and parsing.
c5038e0 to
f892190
Compare
6d102ca to
4ce8519
Compare
This reverts commit aa1886d.
This reverts commit 33b2934.
This reverts commit f3870a3.
Since the event rules are dependent on their corresponding sources, we don't need to add another layer of indirection via an extra "source -> IDs" cache in the RuntimeConfig. Instead, we can directly store the rule IDs within the Source struct and thus bound to the source's lifecycle.
Since the filters are now constructed from a JSON string the value can represent any type and not just string, so changing it so we can perform fine-grained comparisons in the filterable implementations.
4ce8519 to
8aa3308
Compare
|
Due to #406 (comment), I had to refactor the That's not final though, since I'm going to drop the wildard -> regex transformation of filter columns for the |
8aa3308 to
4091fae
Compare
|
4091fae to
7beb210
Compare
|
I've transformed if/elseif statements to switch case to make the CI happy 😢! |
|
My last commit makes sure that we don't include the now huge {"qs":"service.usergroup.name=icingaadmins&service.name!=http&servicegroup.name~%2Ad%2A&host.name~%2Aicinga%2A&hostgroup.name=linux-servers&host.user.name!=icingaadmin&host.vars.http_vhosts.http.http_uri=%2F&service.vars.disk_partitions!=%2F&host.vars.disks.disk%20%2F.disk_partitions%3E=%2F&host.vars.disks.disk!~%2A&host.vars.notification.mail.groups%5B%2A%5D=icingaadmins","ast":{"op":"&","rules":[{"op":"=","attributes":["services[*].usergroup.name"],"value":"icingaadmins"},{"op":"!=","attributes":["services[*].name"],"value":"http"},{"op":"=","attributes":["servicegroups[*].name"],"regex":"^.*d.*$"},{"op":"=","attributes":["host.name"],"regex":"^.*icinga.*$"},{"op":"=","attributes":["hostgroups[*].name"],"value":"linux-servers"},{"op":"!=","attributes":["host.user.name"],"value":"icingaadmin"},{"op":"=","attributes":["host.vars['http_vhosts']['http']['http_uri']"],"value":"\/"},{"op":"!=","attributes":["services[*].vars.disk_partitions"],"value":"\/"},{"op":">=","attributes":["host.vars['disks']['disk \/']['disk_partitions']","host.vars['disks']['disk']"],"value":"\/"},{"op":"!=","attributes":["host.vars[\"disks\"]['disk \/']['disk_partitions']","host.vars['disks']['disk']"],"regex":"^.*$"},{"op":"=","attributes":["host.vars['notification']['mail']['groups'][*]"],"value":"icingaadmins"}]}} |
c76b9f4 to
254189e
Compare
|
Fixed the requested change and updated the go.mod file to now require the |
| These rules can be configured in Icinga Notifications Web and should be designed to match the `relations` of the | ||
| submitted events. When submitting an event without the expected relations to evaluate the rules, Icinga Notifications | ||
| will reject the request with a `422 Unprocessable Entity` status code and a message describing the missing relations | ||
| when the `XX-Icinga-Reject-If-Relations-Incomplete` header is set to `true`. Otherwise, the request will be accepted |
There was a problem hiding this comment.
| when the `XX-Icinga-Reject-If-Relations-Incomplete` header is set to `true`. Otherwise, the request will be accepted | |
| when the `X-Icinga-Reject-If-Relations-Incomplete` header is set to `true`. Otherwise, the request will be accepted |
| {Expr: makeJsonFilterExpr(t, []string{"host.vars.dict.key_int"}, ">=", 5, false), Expected: true}, | ||
| {Expr: makeJsonFilterExpr(t, []string{"host.vars.dict.key_int"}, "=", 42.0, false), Expected: true}, | ||
| {Expr: makeJsonFilterExpr(t, []string{"host.vars.dict.key_float"}, "<", 3.17, false), Expected: true}, | ||
| {Expr: makeJsonFilterExpr(t, []string{"host.vars.dict.domain"}, ">", "foo.com", false), Expected: true}, |
There was a problem hiding this comment.
This testcase seems to only works by accident. The current logic first compares the length, than the content. As example.com is longer than foo.com, this works.
There was a problem hiding this comment.
It's not by accident but wanted to test just that.
There was a problem hiding this comment.
Ok, I've dropped the length check.
| var missing []string | ||
| for _, filterColumn := range columns { | ||
| if _, cached := e.evaluatedRelations[filterColumn]; cached { | ||
| continue |
There was a problem hiding this comment.
I am a bit unsure about this one, sorry. Do you might want to continue filterColumnsLoop here for the same reasons as commented above the continue filterColumnsLoop below in line 172?
| // a list of filter columns that do not have any matching nodes in the Relations field and are not | ||
| // part of the CompleteRelations field. For filter columns that do have matching nodes, it caches | ||
| // the evaluated nodes for potential later use during rules evaluation. | ||
| func (e *Event) ExtractMissingRelations(filterColumns ...[]string) []string { |
There was a problem hiding this comment.
As we have just discussed offline, it might be easier for sources if the missing relations are absolute, starting with $..
| return cmp.Compare(reflect.ValueOf(a).Int(), reflect.ValueOf(b).Convert(atype).Int()), nil | ||
| } | ||
| case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64: | ||
| if atype.ConvertibleTo(btype) { |
There was a problem hiding this comment.
| if atype.ConvertibleTo(btype) { | |
| if btype.ConvertibleTo(atype) { |
Unsure how much the implication a is convertible to b says about b.Convert(atype).
Same below for the floating point case.
There was a problem hiding this comment.
Initially, the switch case was based on b.type but then forgot to adjust these two checks as well.
254189e to
a4949f0
Compare
This effectively replaces the previous (prior to v0.2.0) implementation on the `Object` type.
The `object_filter` can be a huge JSON string depending on the configured filters and Icinga Notifications Web also encodes other metadata needed to reverse the JSON encoded filter chain back to a simple filter string. So, do not include it in the logs!
a4949f0 to
26219df
Compare
This PR kind of reverts all the changes mad in v0.2.0 with a slight twist. We will evaluate the rules on our own again (just like pre v0.2.0) but the filter string is expected to have a different format. Icinga Notifications Web will now store the filter string as a JSON stringified object instead of a simple filter string format used before. This is done allegedly to make it easier to parse the filter string but the previous filter string parser is still there as it's used by the escalation filters. The new filter string format is as follows:
{ "op": "&/|/!", "rules": [ { "column": "service.state", "op": "==/!=/>/</~/!~", "value": "OK" }, { "op": "&/|/!", "rules": [ ... ] } ] }The JSON object for filter conditions may include other meta information used by Icinga Notifications Web/Sources but the only relevant information for us is the above-mentioned format. The
opfield is the operator used to combine the filter chain rules and can be either&(AND),|(OR) or!(NOT). Therulesfield is an array of rules which can be either a simple condition or a nested filter chain. A simple condition consists of acolumn, anopand avalue. Thecolumnis the column name used to extract values from requests as JSONPath expressions, theopis the operator used to compare the JSONPath expression result with the provided value and can be either==,!=,>,<,~(regex match) or!~(regex not match). If thecolumnis not a valid JSONPath expression, Icinga Notifications will just ignore that rule and won't be loading it from the database. This means that if you have an existing filter string in the old format, they won't be loaded, and you will have to ask the web colleagues if they provide a migration step for the existing filter strings.Furthermore, prior to v0.2.0, we were evaluating the rules against the
Objectrepresentation of the request, now they are evaluated against theEventtype itself. Specifically, the JSONPath expressions are evaluated against the newrelationsfield of theEventtype which is a map of generic key-value pairs extracted from the request. When a source doesn't include that field with all the necessary information in its requests, some or even all the configured rules might not match if they reference the missing information in theirobject_filter. If the source wishes to be notified about the missing information, it can include the newX-Icinga-Enable-Attributes-NegotiationHTTP header set totruein its requests and Icinga Notifications will respond with a422 Unprocessable Entitystatus code (not final, is still up to a discussion) and a JSON body containing the missing attributes. This is a new feature but similar to the previous behavior introduced in v0.2.0 where we were responding with a412 Precondition Failedstatus code when the used rules_version was outdated. The source can then extend its requests with the missing information and retry the request.In the referenced issue, it was mentioned that sources might only re-send the missing information as a follow-up request instead of retrying the entire request. However, it's not worth implementing this behavior just for the sake of this single use case as it would require a lot of additional work and complexity on both the source and Icinga Notifications sides.
Lastly, in a discussion with @nilmerg yesterday, we agreed that Icinga Notifications should abandon all the effects of the ongoing request if it can't successfully notify the source that the request was successfully processed. Currently, the incident starts a database transaction internally and commits it before trying to send the resulting notifications. However, the client will only receive
200 OKif the notifications were sent successfully, and that in turn might not reach the source since the connection might be gone in the meantime (errors are ignored blantantly).icinga-notifications/internal/listener/listener.go
Lines 181 to 185 in aad8cb2
icinga-notifications/internal/incident/incident.go
Lines 221 to 229 in aad8cb2
In that case the source will be left in a state where the request was processed but the source wasn't notified about it, which is not ideal and will definitely lead to some issues when we've added HA support. Icinga Notifications will treat every event as a unique one and won't have any mechanism to detect duplicate events, so if the source retries the request, it will be processed again and might lead to some unwanted side effects. To avoid that, we will need to roll back all the changes made by the request if we fail to notify the source about the successful processing of the request.
I didn't implement this behavior in this PR as I see it as a separate task that can be done in a follow-up PR, but if you think it should be included in this PR, please let me know and I will add it.Forget that, that's really something that should be done after or while implementing the HA support.resolves #406