I have a series of timestamps (second granularity). I need to derive a "frequency" from them and produce a string like "every X minutes/hours/days/weeks/etc.", using whichever unit fits closest.

Is there a well-known algorithm for this? I'm having trouble finding search terms.

  • 0
    Pretty sure this is going to have to boil down to a search that scores possible outcomes based on an error rate.
  • 2
    Isn't this just the average of the time between two consecutive timestamps? That's assuming they really do happen regularly.
  • 0
    @24th-Dragon The second part is the tricky part.
  • 0

    Say the series is t1...tn.
    Calculate the series of diffs: t2-t1, t3-t2, ..., tn-t(n-1).
    Then sum them and divide by n-1 to get the average.
    Convert the result to a human-readable time period.
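The steps above can be sketched roughly as follows. This is only a minimal illustration: the names `mean_interval`, `humanize` and `UNITS` are made up for the example, the timestamps are assumed to be Unix seconds, and the rounding to a whole number of units is a simplification. Note that the sum of the diffs telescopes, so the average is just (last - first) / (n - 1).

```python
# Hypothetical unit table, largest first (a week is taken as 7 * 86400 seconds).
UNITS = [("week", 604800), ("day", 86400), ("hour", 3600), ("minute", 60), ("second", 1)]

def mean_interval(timestamps):
    """Average gap in seconds between consecutive timestamps."""
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two timestamps")
    diffs = [b - a for a, b in zip(ts, ts[1:])]
    return sum(diffs) / len(diffs)  # same as (ts[-1] - ts[0]) / (len(ts) - 1)

def humanize(seconds):
    """Turn an interval in seconds into an 'every X units' string."""
    for name, size in UNITS:
        if seconds >= size:
            count = round(seconds / size)
            return f"every {name}" if count == 1 else f"every {count} {name}s"
    return "every second"  # anything shorter than one second

print(humanize(mean_interval([0, 3600, 7200])))
```

On its own this says nothing about whether the timestamps are actually regular, which is the harder part of the question.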
  • 3
    The regularity should be fairly easy to check if you take the root average of squares of the intervals and compare it to their average.
    The difference should give you the average distance from the average. Or something like that; there is a term that statisticians use for volatility in data.

    Also accept the fact that the task is unsolvable in some cases, and think about how you would report that to the user.
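One way to read the suggestion above: the root mean square (RMS) of the intervals and their mean are linked by RMS² = mean² + variance, so the gap between the two reflects the standard deviation, which is probably the "volatility" term being reached for. The sketch below is an interpretation, not a quote of any standard method; the function name `interval_stats` and the idea of using std/mean (the coefficient of variation) as a regularity score are assumptions for illustration.

```python
import math
import statistics

def interval_stats(timestamps):
    """Return (mean, rms, std, cv) of the gaps between consecutive timestamps."""
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two timestamps")
    diffs = [b - a for a, b in zip(ts, ts[1:])]
    mean = statistics.fmean(diffs)
    rms = math.sqrt(statistics.fmean([d * d for d in diffs]))
    std = statistics.pstdev(diffs)  # population standard deviation
    # Coefficient of variation: spread relative to the mean (0 = perfectly regular).
    cv = std / mean if mean else math.inf
    return mean, rms, std, cv

mean, rms, std, cv = interval_stats([0, 10, 40])
print(mean, rms, std, cv)
```

A small `cv` suggests the mean interval is worth reporting; a large one is a hint that this is one of the unsolvable cases and a vaguer message is more honest. Where to put the cutoff is a judgment call.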
  • 3
    I would think looking at the deltas' standard deviation and min/max would be a good first pass to see if they are regular "enough".
  • 1
    @HitWRight Root average of squares is ringing bells from my Big Tech Inc. days, I think this is what I was trying to remember.

    For context, this is more or less like a human-friendly "this user uploads content every couple of weeks" type of thing. I'd like statements like "every week", "every two weeks", "monthly", "bi-monthly", "once in a while" or "rarely" or whatever. That's more what I'm thinking of.

    I think just having a set of valid outputs and testing for best fit based on deviation is probably best.

    Thanks everyone :)
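A rough sketch of the "set of valid outputs, pick the best fit by deviation" idea. Everything specific here is an assumption for illustration: the `CANDIDATES` table, treating a month as 30 days, reading "bi-monthly" as every two months, the `max_rel_error` cutoff of 0.5, and returning `None` when there is too little data.

```python
import math

DAY = 86400

# Hypothetical candidate labels with their nominal period in seconds.
CANDIDATES = [
    ("every day", DAY),
    ("every week", 7 * DAY),
    ("every two weeks", 14 * DAY),
    ("monthly", 30 * DAY),
    ("bi-monthly", 60 * DAY),
]

def describe(timestamps, max_rel_error=0.5):
    """Pick the label whose period best matches the observed gaps."""
    ts = sorted(timestamps)
    if len(ts) < 2:
        return None  # not enough data to say anything
    diffs = [b - a for a, b in zip(ts, ts[1:])]

    def rel_error(period):
        # RMS deviation of the gaps from the candidate period, relative to it.
        rms = math.sqrt(sum((d - period) ** 2 for d in diffs) / len(diffs))
        return rms / period

    label, period = min(CANDIDATES, key=lambda c: rel_error(c[1]))
    return label if rel_error(period) <= max_rel_error else "once in a while"

print(describe([0, 13 * DAY, 28 * DAY, 41 * DAY]))
```

Dividing the deviation by the candidate period keeps long periods from being penalized just for being long; other scoring choices (for example comparing the median gap on a log scale) would be equally defensible.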
  • 0
    The number of uploads divided by the length (in seconds) of the period you look at is the upload frequency.
    You can then test whether that works out to more than once a second, minute, hour, day, week, two weeks, month, quarter, half-year, year, decade, century, eon...

    But you probably want to display nothing when the period doesn't contain any uploads or the user's first upload is too recent.
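A minimal sketch of this rate-based approach, including the two "display nothing" cases. The function name `upload_rate_label`, the 90-day window, the 14-day minimum history, the 30-day month and the "rarely" fallback are all assumptions picked for the example, not anything prescribed above.

```python
DAY = 86400

# Hypothetical unit table, smallest first (month = 30 days, year = 365 days).
RATE_UNITS = [
    ("second", 1), ("minute", 60), ("hour", 3600), ("day", DAY),
    ("week", 7 * DAY), ("month", 30 * DAY), ("year", 365 * DAY),
]

def upload_rate_label(timestamps, now, window=90 * DAY, min_history=14 * DAY):
    """Describe uploads per unit of time within the last `window` seconds."""
    recent = [t for t in timestamps if now - window <= t <= now]
    # Show nothing if the window is empty or the first upload is too recent.
    if not recent or now - min(timestamps) < min_history:
        return None
    rate = len(recent) / window  # uploads per second
    for name, size in RATE_UNITS:
        if rate * size >= 1:  # at least one upload per unit of this size
            return f"at least once a {name}"
    return "rarely"

uploads = [i * 3 * DAY for i in range(34)]  # one upload every 3 days
print(upload_rate_label(uploads, now=100 * DAY))
```

Unlike the interval-based approaches, this only measures volume: ten uploads in one afternoon and ten uploads spread evenly over the window produce the same label.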