It would be awesome if we could dispatch the output of all our plugins to other systems, in particular to statsd.
Currently we can't, because every plugin hardcodes its interaction with sqlalchemy to store its data (read: side-effects).
Instead, the plugins should return their data, and the calling code should dispatch that data to handlers/callbacks that store it in postgres, or send it to statsd, or memcached, or wherever.
The tricky part will be designing a common data object that can represent both the time-series information from the volume plugin and the singleton information from the releng plugins.
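To make the idea concrete, here is a rough sketch of what "plugins return data, calling code dispatches it" could look like. Everything in it is hypothetical and not existing statscache code: the names `TimeSeriesPoint`, `Snapshot`, `dispatch`, and the two example plugins are made up for illustration, and the handlers only record what they would have sent instead of talking to postgres or statsd. The timestamp and values are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Union


@dataclass(frozen=True)
class TimeSeriesPoint:
    """One sample in a time series, e.g. message volume in an interval."""
    name: str
    timestamp: datetime
    value: float


@dataclass(frozen=True)
class Snapshot:
    """Latest known state of something, e.g. the most recent compose."""
    name: str
    state: dict


# The "common data object" is modelled as a union of the two kinds of stat.
Stat = Union[TimeSeriesPoint, Snapshot]
Handler = Callable[[Stat], None]


def dispatch(stats: Iterable[Stat], handlers: Iterable[Handler]) -> None:
    """Pass every stat a plugin returned to every registered handler."""
    handlers = list(handlers)
    for stat in stats:
        for handler in handlers:
            handler(stat)


# Stand-ins for real backends; each just records what it would have sent.
stored: List[Stat] = []
sent_to_statsd: List[str] = []


def store_handler(stat: Stat) -> None:
    # Stand-in for the postgres store: accepts both kinds of stat.
    stored.append(stat)


def statsd_handler(stat: Stat) -> None:
    # Only time-series data is forwarded; snapshots are skipped.
    if isinstance(stat, TimeSeriesPoint):
        # Formatted like a statsd gauge line, but never sent anywhere.
        sent_to_statsd.append(f"{stat.name}:{stat.value}|g")


def volume_plugin() -> List[Stat]:
    # Hypothetical plugin: returns data instead of writing to the database.
    when = datetime(2015, 1, 1, tzinfo=timezone.utc)  # arbitrary timestamp
    return [TimeSeriesPoint("volume.all", when, 42.0)]


def releng_plugin() -> List[Stat]:
    # Hypothetical plugin returning singleton (latest-state) data.
    return [Snapshot("compose.rawhide", {"status": "finished"})]


dispatch(volume_plugin() + releng_plugin(), [store_handler, statsd_handler])
```

The `isinstance` check in `statsd_handler` is one possible way to keep singleton data out of statsd. This sketch does not address updating pre-existing records, which is the other edge case raised in the conversation below.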
Full IRC log of the conversation that brought this up:
relrod │ threebean: does statscache fit in anywhere with the graphite stuff I've been playing with?
relrod │ Could it use graphite as an output or something maybe?
relrod │ or statsd
relrod │ threebean: I guess another way of asking this is: Can plugins be outputs too? Can I write a plugin that says "any time some other plugin stores a stat, toss it to statsd too"?
threebean │ relrod: hm, good idea.
relrod │ threebean: Just don't want to duplicate effort. Especially for stuff like dashboards (I'm really liking grafana, it's pretty :P)
threebean │ the .. way it's written right now is that each plugin is responsible for shoving its data into postgres.
threebean │ but we could rework.
threebean │ but we could rework that.
threebean │ here's an example: https://github.com/fedora-infra/statscache_plugins/blob/develop/statscache_plugins/volume/simple.py
relrod │ threebean: I've been getting really heavily into doing (at least my Haskell projects) like this: Write core stuff as a library; then as part of that library include some common/default
│ handlers. Then as a "config file" type thing, import that library and write handlers (or use built in ones), then crete a mapping between some key and a list of callback handlers.
relrod │ The reason I mention that is...
relrod │ I'm wonder if plugins could return some common data type (er, object in this case :P). Then we can have lists of callbacks that each get passed that object, every time a stat is changed
relrod │ One of those callbacks could go to postgres, one to statsd, etc.
threebean │ yeah, that'd be the way to do it.
relrod │ (one to fedmsg - we could create an infinite loop :P)
threebean │ superb :p
threebean │ I like it. let's do it.
threebean │ there's some trickiness in there about an edge case we wanted statscache to handle.
threebean │ see how it can handle updating pre-existing records? that's one.
threebean │ another oddity is that not all the stats are time-series (i'm assuming that statsd is only for time-series data)
threebean │ the releng plugins in there store only information about the latest state for the composes and stuff.
threebean │ making a common object interface between those two kinds of data (time-series and otherwise) seems like it would be difficult.
threebean │ s/difficult/easy to get wrong/ :)
relrod │ hm, yeah fair
threebean │ otoh, having statscache stuff in statsd and then grafana would be amazing.
threebean │ we could build app-specific visualizations for custom use-cases like fedora-hubs, but then get generic yet beautiful graphs for ourselves.
relrod │ threebean: btw, here's an example of what I meant above: https://github.com/fedora-infra/pagure-hook-receiver/blob/master/examples/pagure-test.hs -- notice how projectMapping is keyed on the
│ pagure project name, then the callbacks can do whatever. I've only recently started doing that kind of development, but I like it a *lot*
relrod │ It's how the xmonad window manager works too ;)
relrod │ xmonad is a library, your config file is a Haskell program that makes use of it
threebean │ cool :)
relrod │ and compiles into a window manager
relrod │ threebean: but yeah. It sounds like a great usecase for graphite, I think (the time-series ones at least)
threebean │ maybe, hm..
threebean │ maybe we have all plugins return some object representation of their data.. and the calling code dispatches to the postgres store and statsd store and etc..
threebean │ but we only send time-series stuff to statsd
relrod │ yeah
threebean │ and we have some variety of objects with the know-how to store the different types in postgres.
threebean │ (like we do now -- except that logic is scattered throughout the plugins themselves)
relrod │ have a type with two constructors, one for time series and one for singleton, and pattern match on the constructor - oh wait, we're not in Haskell. ;)
* │ threebean tries to file an issue for this
relrod │ threebean: yeah I think having an object representation is probably better in the long run, because then we can write output plugins ("callbacks" in my terminology above) for things we
│ haven't even thought of yet.
threebean │ right. like memcached or websockets.
relrod │ yep :)