This repository was archived by the owner on Dec 4, 2018. It is now read-only.

some design problems #6

@ralphbean


I think we might want to reconsider more parts of the design.

To illustrate, consider trying to create the GitHub "activity calendar" graph from fedmsg data:

We would need to cache the count (or volume) of activity for each contributor for each day. We already have a statscache plugin that does this, however:

  • Think about how the PollingProducers get scheduled. If we have one with a frequency set to 1 day, it will run one day from when the fedmsg-hub daemon started up. If we start it at 09:32 UTC, then the stats will be put into the database at 09:32 UTC each day. If we restart it 2 weeks later at 14:58 UTC, then from then on the stats will be put into the database at 14:58 UTC each day. That's not consistent, and not what we want.
  • Think about how statscache builds up those buckets of messages for each frequency period. During the day, it would accumulate a bucket of all messages during that time period, and only hand it off to the stats plugins at the end of the period for analysis and storage. If we restart the daemon during that period, all of those messages get lost -- they are never analyzed and stats about them are never stored.
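To make the scheduling problem concrete, here is a minimal sketch in plain Python. It is not statscache or fedmsg code, and both function names are hypothetical. It contrasts a timer that fires one period after daemon startup with a window pinned to UTC calendar days:

```python
from datetime import datetime, timedelta, timezone


def next_run_from_startup(started_at, frequency):
    """When a timer that just waits `frequency` after startup first fires."""
    return started_at + frequency


def previous_full_day(now):
    """Return (start, end) of the last complete UTC calendar day before `now`."""
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return end - timedelta(days=1), end


one_day = timedelta(days=1)

# Daemon started at 09:32 UTC: the daily stats land at 09:32 UTC.
first_start = datetime(2015, 6, 1, 9, 32, tzinfo=timezone.utc)
print(next_run_from_startup(first_start, one_day))

# Restarted two weeks later at 14:58 UTC: the daily stats now land at 14:58 UTC.
restart = datetime(2015, 6, 15, 14, 58, tzinfo=timezone.utc)
print(next_run_from_startup(restart, one_day))

# A calendar-aligned window gives the same boundaries whenever it is computed.
print(previous_full_day(restart))
```

The first two results depend on when the daemon happened to start; the third depends only on the calendar, which is what a per-day graph needs.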

I'm open to any number of different ways to design this, but it might make sense to:

  • Run these things under cron. This is what cron was designed for. Our sysadmins know how to work with it. Why reinvent the wheel?
  • Use datagrepper for the "buckets". When the time comes for us to calculate stats for the day, instead of checking an in-memory bucket for the history, we could call out to datagrepper to get all the messages for the last day (or when the time comes to calculate message stats for the last 10 seconds, call out to datagrepper for that too, etc.).
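A rough sketch of what the two ideas could look like together, written as a standalone script rather than a statscache plugin. Everything here is an assumption to check against a real datagrepper instance: the endpoint URL, the `start`/`end`/`page`/`rows_per_page`/`meta` query arguments, and the `raw_messages`, `pages`, and `meta.usernames` fields in the response. The names `fetch_page` and `daily_activity` are hypothetical.

```python
import json
from collections import Counter
from datetime import timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the /raw endpoint of whichever datagrepper instance we query.
DATAGREPPER_URL = "https://apps.fedoraproject.org/datagrepper/raw"


def fetch_page(start, end, page):
    """Fetch one page of messages between two timezone-aware datetimes."""
    params = urlencode({
        "start": start.timestamp(),
        "end": end.timestamp(),
        "page": page,
        "rows_per_page": 100,
        "meta": "usernames",  # assumed to add a usernames list to each message
    })
    with urlopen(DATAGREPPER_URL + "?" + params) as resp:
        return json.load(resp)


def daily_activity(day_start, fetch=fetch_page):
    """Count messages per username for the 24 hours starting at `day_start`.

    Nothing is kept in memory between runs, so a restart loses no messages.
    """
    day_end = day_start + timedelta(days=1)
    counts = Counter()
    page = 1
    while True:
        data = fetch(day_start, day_end, page)
        for message in data.get("raw_messages", []):
            for user in message.get("meta", {}).get("usernames", []):
                counts[user] += 1
        if page >= data.get("pages", 1):
            break
        page += 1
    return counts
```

A crontab entry along the lines of `5 0 * * *` (shortly after midnight UTC) could then call `daily_activity` with the previous day's midnight and write the counts to the database. Since the history lives in datagrepper, a missed or failed run can be redone later for the same window.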

There are probably still more angles to consider here.
