Improve leaderboard calculation accuracy for shorter segments

The way Strava calculates time for leaderboards seems imprecise for shorter segments. This can lead to speeds displayed in Strava that are either significantly higher or significantly lower than the real ones. It can also cause different average speeds to be shown in the leaderboard and in the ride analysis - see here https://strava.zendesk.com/entries/40529030-Average-Speed-in-segmen.

This is a proposal to make a very simple correction to the leaderboard time calculation formula, which will lead to significant accuracy improvements.

This has been submitted as Request #433568 but Sean from Strava support suggested I put this here. This text is from the original request:

------------------------

Example (all numbers are faked just to demonstrate the idea): 
Let's say, there is a segment 100 meters long. 
The GPS track points selected by the segment matching algorithm happened to be within the segment, and the distance between the first and last GPS track points happened to be 80 meters. 
Let's say, it took me 8 seconds to move between these GPS track points (according to the GPS track data).

Currently Strava calculates AVG speed on the segment like this:

(this is based on the information from this thread https://strava.zendesk.com/entries/40529030-Average-Speed-in-segmen ) 

AVG speed on segment = 100m (full segment length) / 8s (time between track points) = 12.5 m/s 
Leaderboard time of the segment = 8s (the time between the track points is taken as is)

What I'm proposing is to tweak the formula very slightly: 
AVG speed on segment = 80m (actual distance between the GPS track points) / 8s (time between track points) = 10m/s 
Leaderboard time of the segment = 100m (full segment length) / 10m/s (AVG speed between track points) = 10s
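
The two calculations above can be sketched in a few lines of Python, using the same made-up numbers. The function names are illustrative only, and the "current" formula reflects what the linked thread describes, not a confirmed Strava implementation.

```python
def current_leaderboard(segment_length_m, elapsed_s):
    """Behaviour described in the linked thread: elapsed time is taken as is."""
    avg_speed = segment_length_m / elapsed_s
    return avg_speed, elapsed_s


def proposed_leaderboard(segment_length_m, matched_distance_m, elapsed_s):
    """Proposal: use the speed between the matched track points for the full segment."""
    avg_speed = matched_distance_m / elapsed_s
    return avg_speed, segment_length_m / avg_speed


print(current_leaderboard(100.0, 8.0))         # (12.5, 8.0)
print(proposed_leaderboard(100.0, 80.0, 8.0))  # (10.0, 10.0)
```

Each function returns the average speed in m/s and the leaderboard time in seconds.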

This approach also works fine in all other track point configurations, e.g. when the first GPS track point is before the segment start and/or the last GPS track point is after the segment end.

As you see, the change to the formula is really simple - no new data is needed, and the math is very basic. But this simple change makes great improvement to accuracy!

The accuracy improvement is obvious - the 20m difference between the segment and the actual recorded track section is assumed to have been run/ridden at the AVG speed of 10m/s, while the current Strava algorithm assumes it was run/ridden at the speed of light, i.e. "in no time" :)

I really hope you can propose this change to engineers - it would really make a huge difference for those of us users who enjoy 1 minute "all out" sprints :)


Comments

26 comments
  • Agree!! Strava please change this. It has been a nuisance for quite some time to be honest. It takes the fun out of short sprints and makes it sort of useless for that purpose. I once physically sprinted against a friend of mine and won the sprint, his results on Strava were better than mine though :s . This alone would make me not go for the paid version: it has to be as accurate as possible. Clearly the suggested method gives better results.
  • I love this proposed solution. Another angle would be to extrapolate the speed at the beginning or end of the segment to cover the missing GPS points at those ends. Or to get complicated with it, use the cumulative speed curves of all the riders to calculate the most likely speed shape, and apply that shape to the rider's average speed for the segment. But honestly the suggestion Mikalai has proposed would work just as well in pretty much all cases and should be simple to apply.

    The big problem I find is that many sprintable segment leaderboards are led by riders with inaccurate GPS data, but not to the point that anyone wants to flag them. This creates a lot of friction when hunting, and takes a lot of fun out of it. Strava should try a fix like this and try recalculating a sample set of segment leaderboards to see how it shakes out.

    I figure I'll lose some KOMs, but maybe get some others. Get the messaging right and users should be alright with it.

     

  • I agree the calculation should be improved for the same reasons as mentioned before. In my opinion the extrapolation approach suggested by Troy is more accurate than Mikalai's approach, as it takes into account the acceleration or braking phases at the beginning and the end of a segment.

  • I'm going to use this feature request to record the segments I lose due to shoddy GPS data:

    https://www.strava.com/segments/6430646?filter=overall

     

    I have a higher average speed on this segment than the new KOM leader, but he just got a nice 7 second lead on the segment from favorable GPS drift. His effort ends well before the end of the segment due to drift. But since he didn't do anything wrong, I don't want to flag his ride.

     

    Strava, please fix the way leaderboards are calculated.

  • On a recent ride (2015-08-12), over a short sprint segment, Strava reports that my average speed was 24.4 mph, despite my max speed being 21.0 mph. The analysis of this segment shows that I started the segment at 18.1 mph and ramped up to 21.0 mph by the end (with an average speed of 20.7 mph). Strava reported my time over the segment as 17 sec, and awarded me a KOM. I believe that this is an error due to Strava not accounting for the difference between the GPS fixes and the segment endpoints, as explained by Mikalai above. Please fix the algorithm.

  • While I wholeheartedly agree that Strava really, really needs to do something about segment time calculation, the method proposed here is fundamentally flawed. The problem is that the actual ridden track will almost always deviate a bit (sometimes quite much) from the segment track, causing you to compare apples and oranges in your calculation. 

    This problem is easy to see if we look at a modified version of your own example:

    Assume a straight segment of 100 meters, where the KOM is 10 seconds (and is also the creator of the segment). While Strava just picks the closest data points for their timing, assume for a moment that another person happens to get data points exactly at the segment start and end points (so his time using the Strava method will be spot on correct).

    Let's say his time was also 10 seconds (as recorded by Strava). Assume that this person likes to zigzag, so his recorded distance is 120 meters (and thus his average speed was higher). If we then apply your method to calculate his time, we get: time = 100 m / (120 m / 10 s) = 8.33 seconds. In other words, his time is now wrong, while it initially was correct.

    The ridden track can only be used to verify that a person actually rode a segment, for timing one has to consider only the end points, and with proper interpolation, accuracy can be improved a lot.

     

  • Øyvind, good thinking and interesting example, thanks!

    So, you're saying the actual traveled and recorded distance for the second person in your example is 120m because of zig-zagging, right? Although if he ran along the segment it'd be 80m, right?

    Now, I need to clarify something - the actual distance between the GPS track points in the proposed formula actually should be understood as "actual distance between the start and end GPS track points measured along the segment". I.e. this distance would be 80m no matter how the second person zig-zags.

    With this clarification the formula seems to work just fine in your example too! :) - both users would get the same segment time and the same avg segment speed, which is fair. Yes, the second person was technically moving faster, but it's his problem that he did unnecessary extra work. For example, if a 100m sprinter zig-zags, nobody will account for that, and his result will be calculated without taking the zig-zags into account ;)

    By the way, if the second person zig-zags too much - then Strava's matching algorithm won't match his activity and the segment at all (because they will be too different, and trying to compare them would really be "apples to oranges").
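
The exchange above can be checked numerically. This is only a sketch of the arithmetic in the two comments: `scaled_time` is a hypothetical helper, and the 100 m "along the segment" figure assumes, as in Øyvind's example, that the rider's first and last data points sit exactly on the segment start and end.

```python
def scaled_time(segment_length_m, distance_m, elapsed_s):
    """Segment time obtained by applying the effort's average speed to the segment length."""
    return segment_length_m / (distance_m / elapsed_s)


# Distance taken from the recorded (zig-zag) track: the time becomes too short.
from_track = scaled_time(100.0, 120.0, 10.0)

# Distance measured along the segment between the same two points: unchanged.
along_segment = scaled_time(100.0, 100.0, 10.0)

print(round(from_track, 2))  # 8.33
print(along_segment)         # 10.0
```

The result depends entirely on which distance is fed into the formula, which is the point of the clarification.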

  • Mikalai, I used the term zig-zagging just as a visualization, as that also happens naturally (to a certain degree) due to GPS jumping around a bit (or slow recording intervals). If someone actually zig-zags, causing them to do extra work, then that's all on them. Of course.

    However, I'm sorry to say, but your method is still flawed. What you are proposing is to take the actual average speed of an effort (using Strava's recorded time and distance), and then calculate a segment time by applying that average speed to the recorded segment length. If the speed was constant over the entire segment for every effort, this method would work, but that's of course not the case in real life. In real life, the speed will typically be different along the segment. Assume for instance that a segment starts uphill and ends downhill. The speed at the start of the segment will be a lot lower than the speed at the end of the segment, and the average speed will be somewhere in between. Using this average speed as a correction factor will of course be wrong, since the errors we want to correct for are at the beginning and the end, both of which have a very different speed from the average.

    I can of course give you a more elaborate example if you want, but that's probably not necessary. You get it?
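
A synthetic illustration of the objection in the comment above, with numbers invented purely for this sketch: a 100 m segment ridden at 5 m/s for the first 50 m and 10 m/s for the last 50 m (true time 15 s), where the first matched fix is 20 m into the segment and the last fix is exactly on the segment end. It shows one configuration only and does not settle which correction is better in general.

```python
SEGMENT_LENGTH_M = 100.0
TRUE_TIME_S = 15.0  # 50 m at 5 m/s plus 50 m at 10 m/s

# (time in s, distance along the segment in m); all values are made up.
first_fix = (4.0, 20.0)
second_fix = (5.0, 25.0)
last_fix = (15.0, 100.0)

elapsed = last_fix[0] - first_fix[0]   # 11.0 s between the matched fixes
covered = last_fix[1] - first_fix[1]   # 80.0 m between the matched fixes
missing = SEGMENT_LENGTH_M - covered   # 20.0 m, all before the first fix

# 1. Elapsed time used as is.
uncorrected = elapsed
# 2. Missing distance covered at the effort's average speed.
avg_scaled = SEGMENT_LENGTH_M / (covered / elapsed)
# 3. Missing distance covered at the speed measured next to the gap.
local_speed = (second_fix[1] - first_fix[1]) / (second_fix[0] - first_fix[0])
local_extrapolated = elapsed + missing / local_speed

print(uncorrected)               # 11.0
print(round(avg_scaled, 2))      # 13.75
print(local_extrapolated)        # 15.0
```

In this particular case the average-speed correction lands between the uncorrected time and the true time, while using the speed next to the gap recovers the true 15 s.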

     

  • Øyvind,

    Hmm, I'm not sure I explained the proposal well. Can you forget about the formulas for a minute, here is an extension of the 80m/100m example to describe the general idea:

    [segment start point (SS)] ... 10m distance ... [actual GPS point closest to the start (GS)] ... 80m and more GPS points go here ... [GPS point closest to the end of segment (GE)] ... 10m ... [segment end (SE)]

    *Problem:* Currently Strava takes the time it took to travel between points GS and GE (part of segment between the first and last recorded GPS points), and treats this time as if it was time from SS to SE (full segment). This effectively means that the time it took to travel between [SS to GS] and [GE to SE] is estimated as zero, i.e. the person was moving with the speed of light ;)

    *Proposed solution:* calculate average speed between [GS and GE] and use this speed *as a reasonable approximation* of the unknown speed for [SS to GS] and [GE to SE]. Of course, this approximation is not ideal, but I bet it is *much closer to reality* than the speed of light.

    So, I agree this approach is not ideal (flawed in some way), but the point is - current Strava approach is much worse. And it is really easy to fix. You can pick better approximations for speed in [SS to GS] and [GE to SE] (there were some proposals in the comments here), but they are still approximations (i.e. not perfect), may be harder to implement, and accuracy improvement won't be as huge as with the jump from speed of light to avg speed ;) 

    We can take it offline (email, skype, etc) if you want to discuss detailed examples etc. I think this discussion thread contains enough information (including my proposal and your concerns) if Strava developers decide to tackle this issue at some point.
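
The SS/GS/GE/SE description in the comment above can be written as a small function. This is a sketch under stated assumptions: the function name and the signed-gap convention (a negative gap when the GPS point lies outside the segment) are not from the original proposal, and all distances are taken along the segment.

```python
def estimate_segment_time(start_gap_m, matched_m, end_gap_m, elapsed_s):
    """Estimate the SS -> SE time from the GS -> GE measurement.

    start_gap_m: distance SS -> GS, negative if GS lies before SS.
    matched_m:   distance GS -> GE.
    end_gap_m:   distance GE -> SE, negative if GE lies after SE.
    elapsed_s:   time between GS and GE.
    """
    avg_speed = matched_m / elapsed_s
    # The gaps are assumed to be travelled at the average GS -> GE speed.
    return elapsed_s + (start_gap_m + end_gap_m) / avg_speed


# Both GPS points inside the segment (the 80 m / 100 m example).
print(estimate_segment_time(10.0, 80.0, 10.0, 8.0))     # 10.0
# Both GPS points outside the segment: the estimate shrinks instead.
print(estimate_segment_time(-10.0, 120.0, -10.0, 12.0))  # 10.0
```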

  • You explained it well enough, but the method will in a great deal of the cases introduce way more error than the method used by Strava. You need to keep in mind that the ultimate goal is to get the time between the actual segment start and end points as correct as possible, and using the average speed along the segment to correct the error is not a good approximation. Not even close.

    I'm pretty sure that Strava doesn't give a rat's ass about this discussion, as they seem to have become more or less totally deaf to what their users want. As an example, look at the default activity name change they introduced, going from something that was OK to something that is totally meaningless.

    I'd be happy to explain this privately should you want to, but as mentioned, there is no reason to be concerned that the info in this thread will overwhelm Strava...

    As it happens, I have written my own software that does proper interpolation/extrapolation on Strava segments, and just for fun I included your calculation method as a comparison, which shows just how bad of an approximation this method is.

    Here's an article I found that also details the problems with Strava's method: http://mroek.blogspot.no/2015/09/why-strava-segment-matching-and-timing.html

     

  • Just chiming in that I'd still love to see a fix, but I agree that it's not likely. If they ever take this seriously, there are several viable ideas in this topic.

  • If Strava even monitors these threads, here's even a rather detailed proposed solution:

    http://mroek.blogspot.no/2015/10/why-doesnt-strava-do-proper.html

     

  • Great blog posts, Øyvind!

    I trust your findings because you did the experiments to find out which extrapolation method works better.

    I hope the Strava people will find your blog posts useful if they decide to take a look, although I too greatly doubt it's going to happen in our lifetime :)

  • We feel very passionately about this topic as well! 

    As things currently stand, we're very much looking forward to solving this issue but it's not currently being worked on. We're still a very small team compared to some of the larger companies in our space. Segments are so integral to the Strava experience that we really look forward to this work on segment accuracy. Please keep your comments and ideas coming and we'll be ready to consider them when this project comes "on deck". 

  • Cool, I just lost another KOM due to the poor segment time calculations that are in place.

    https://www.strava.com/activities/552983004/segments/13364611247

    Seriously, please pass this on to the product manager. Keeping the current flawed implementation is inexcusable.

  • Yeah, I'm the one who was undeservedly given that KOM Troy is talking about.

    I was going about 43 km/h, while Strava's algorithm decided I was moving at 51.6km/h. The difference is huge.

    And of course the same symptoms of AVG speed being greater than MAX speed (AVG = 51.6km/h, MAX = 44.6km/h). Silly.

    It takes all the fun out of the short segments because the fair/healthy competition essentially becomes impossible.

    Please fix and re-calc the leaderboards.

  • Yesterday I also lost a KOM due to this. The guy who took it was proud that he had done it, but in reality his effort was not the KOM, it was a 3rd place. He was really lucky, and his time was started way into the segment, while his time was stopped before the end of the segment, giving him a time nearly 4 seconds better than it should have been.

    Strava isn't going to fix this, and even if they fixed it, I bet they wouldn't have the balls to recalculate all previous top segment efforts.

    Given that segments is probably the key feature that attracted people to Strava in the first place, I cannot fathom why they don't just do something about this. It really isn't that difficult, as I have clearly proven.

     

     

  • "...and even if they fixed it, I bet they wouldn't have the balls to recalculate all previous top segment efforts."

    That's exactly what needs to happen. Maybe not in one day, maybe with some apology emails to the former KOM owners (or just the normal "Uh, oh..." emails), but the leaderboard correction needs to happen.

    I agree that segments are the reason most people choose Strava over other activity tracking services. Bummer this is not prioritized properly.

  • I completely agree. They really need to fix their algorithms. Just to add a comment which may help raise the visibility of this, and thus help them change it: today and a few days ago, I had to mute/make private my rides on a sprint segment because of inflated average speeds. When I took some averages, it still puts me in the top 2 or 3, but there's no way to officially place, that is, without flawed data. So, how can I even place? haha. It's frustrating to try so hard and to have no actionable results. Yes, Strava, please fix this! 

  • Tried two sprints through a short section. Both at 30 mph. One result has me at just under 16 mph, the other under 13 mph. I know both times I was doing 29+ mph and accelerated into the segment, held it, and analysis backs that up. The sprint time shows 6 seconds the first time and 5 seconds the last (certainly didn't seem that long). My PB is 4 seconds at 19 mph (actual speed 26-29 mph) ????. So actually going slower I took less time and produced a faster average - that's bonkers. Tried Garmin set to 1 second recording and Smart - no difference. As far as I can tell the timings and averages on Strava for short sprints are complete garbage.

  • On some short but hard uphill segments, only 2-3 GPS data points are recorded (probably due to bad GPS signal or other factors). I don't think any formula can make such measured data accurate enough. In addition, any fix to the formula solves the problem for new data but not for data that has already been added.

    What I'm proposing is simply to limit the participation of tracks recorded using GPS data only in short-segment leaderboards. Let the limit be set to 500 meters or whatever.

    The algorithm is pretty simple:

    1. If the track was recorded using GPS data only:
    1.1. If the segment length is less than the limit => the effort is not recorded in the segment leaderboard and is visible to the track owner only.
    1.2. If the segment length exceeds the limit => the effort is added to the segment leaderboard.
    2. If the track was recorded using GPS + speed sensor, the effort is added to the segment leaderboard in any case.

    No formula can smooth out cases like:
    - Segment length 200m, gradient +10%
    - Tracked GPS data: just two points and a calculated speed of 138 km/h.

    But such a limitation can work well enough.
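
The rule in the comment above can be sketched as a single check. The function and parameter names are hypothetical, and the comment does not say what happens when the segment length equals the limit exactly; this sketch assumes such efforts are included.

```python
SHORT_SEGMENT_LIMIT_M = 500.0  # "500 meters or whatever" in the comment


def counts_for_leaderboard(segment_length_m, has_speed_sensor,
                           limit_m=SHORT_SEGMENT_LIMIT_M):
    """Return True if the effort should appear on the public leaderboard."""
    if has_speed_sensor:
        # Rule 2: GPS + speed sensor efforts are always added.
        return True
    # Rules 1.1 and 1.2: GPS-only efforts count only on longer segments.
    # Assumption: a segment exactly at the limit is treated as long enough.
    return segment_length_m >= limit_m


print(counts_for_leaderboard(200.0, has_speed_sensor=False))  # False
print(counts_for_leaderboard(200.0, has_speed_sensor=True))   # True
print(counts_for_leaderboard(800.0, has_speed_sensor=False))  # True
```

An effort that returns False here would still be shown to the track owner, as the comment suggests.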

  • The Track&Field segment leaderboard is rubbish due to the very large margin of error on short segments.

    Segments under 500 meters should count as completed only if both points (start point and end point) of the segment are reached (only 5 meters of error should be allowed). If this condition is not met, the segment should not count as completed.

    For example, we have 200/400 meter segments with recorded times under the world records of 19.19/43.03 seconds because the matched efforts begin or end well after/before the starting/ending point.

    With this change, the leaderboard table would be a very useful tool for track & field events.
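
A minimal sketch of the endpoint check proposed in the comment above, assuming fixes and segment endpoints are given as (latitude, longitude) pairs in degrees and that great-circle distance on a spherical Earth is accurate enough at this scale. All names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # mean Earth radius, spherical approximation
SHORT_SEGMENT_LIMIT_M = 500.0  # rule applies below this length
MAX_ENDPOINT_ERROR_M = 5.0     # allowed distance from each endpoint


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def effort_counts(segment_length_m, first_fix, last_fix, seg_start, seg_end):
    """Short segments count only if both matched fixes are near the endpoints."""
    if segment_length_m >= SHORT_SEGMENT_LIMIT_M:
        return True  # the stricter rule is only proposed for short segments
    return (haversine_m(*first_fix, *seg_start) <= MAX_ENDPOINT_ERROR_M
            and haversine_m(*last_fix, *seg_end) <= MAX_ENDPOINT_ERROR_M)


start, end = (0.0, 0.0), (0.0, 0.002)  # roughly 222 m apart on the equator
print(effort_counts(222.0, (0.0, 0.00002), (0.0, 0.00198), start, end))  # True
print(effort_counts(222.0, (0.0, 0.0001), (0.0, 0.00198), start, end))   # False
```

In the second call the first fix is about 11 m past the start, so the effort would be rejected.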

  • Totally agree, recently lost a KOM (which doesn’t bother me) but the top speed achieved was outrageous, I suspect Caleb Ewan would have struggled to match it 😂

  • +1 to bump visibility of this. Considering the prevalence of smart recording that doesn't update positions every second and low-resolution devices like Fitbits, users with those low-resolution devices often get an unfair advantage when they get lucky and cut a significant distance from both ends of a segment. I've seen many cases where it seems someone goes through a segment at a slower speed than the leader but they still end up in first place. Now that Segments is a premium feature, Strava must start paying more attention to getting this right!

  • With regard to interpolation, I'd like to suggest a different algorithm. The current algorithm is to find the GPS position closest to the start and the position closest to the finish, and as long as these two positions are within the radius from the start and the finish, they are used as the actual start and finish for the matched part of the activity. Needless to say, both of these positions may end up either inside or outside of the segment, which results in either a shorter or a longer distance than the actual one.

    The interpolation process should work as follows:
    - Let's say that S is the actual start point of the segment and B is the matched start point on the activity. Let's say that A is the point on the activity just before B and C is the point just after B. This assumes that those points are produced discretely with every GPS refresh. Let's say that Ta, Tb, and Tc are the timestamps corresponding to points A, B, and C. The current algorithm just uses Tb as the segment start timestamp. But we could draw lines A--B and B--C, and on those two intervals find the point that is closest to the actual segment start S. That is a simple geometrical problem. Let's say that point is on the interval A--B, and let's call it X. Just to clarify, X would be closer, potentially much closer, to S than B is. Now all we need to do is interpolate the time between Ta and Tb using the same ratio as the distances A--X and X--B. The resulting time Tx would be the segment start time used for the match. If we repeat the same for the segment end, we'd produce a timestamp Ty that represents the interpolated segment end time. The resulting segment time should be much more accurate.

    Using this approach Strava could also improve the accuracy of segment matching to avoid the case when someone intentionally cuts the segment short, for example by turning around before the finish. The radius for the interpolated start and end points X and Y could be tighter than the radius used to initially match the start and end points, which would avoid false segment matches when someone doesn't actually go through the segment start or finish.
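
The interpolation described in the comment above can be sketched as follows. This is an illustration, not Strava's or anyone's actual implementation: it assumes the points have already been projected to planar x/y coordinates in meters, and the function names are made up.

```python
import math


def closest_fraction(p, q, s):
    """Fraction along P -> Q (clamped to [0, 1]) of the point closest to S."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return 0.0  # P and Q coincide
    f = ((s[0] - p[0]) * dx + (s[1] - p[1]) * dy) / length_sq
    return max(0.0, min(1.0, f))


def interpolate_crossing(a, ta, b, tb, c, tc, s):
    """Return (Tx, distance from X to S) for the closest point X on A--B or B--C."""
    candidates = []
    for (p, tp), (q, tq) in (((a, ta), (b, tb)), ((b, tb), (c, tc))):
        f = closest_fraction(p, q, s)
        x = (p[0] + f * (q[0] - p[0]), p[1] + f * (q[1] - p[1]))
        dist = math.hypot(x[0] - s[0], x[1] - s[1])
        # Time is interpolated with the same ratio as the distances P--X and X--Q.
        candidates.append((dist, tp + f * (tq - tp)))
    dist, t = min(candidates)
    return t, dist


# Segment start S at the origin; the rider passes 1 m beside it between A and B.
tx, miss = interpolate_crossing((-6.0, 1.0), 0.0, (4.0, 1.0), 1.0, (14.0, 1.0), 2.0, (0.0, 0.0))
print(round(tx, 3), round(miss, 3))  # 0.6 1.0
```

Using B directly would give a start time of 1.0 s here, 0.4 s later than the interpolated 0.6 s. The same function applied at the segment end gives Ty, and the returned distance could serve as the tighter radius check mentioned in the comment.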

  • Stanislav C., did you read my two articles about this, which are linked above in this thread?

    http://mroek.blogspot.no/2015/09/why-strava-segment-matching-and-timing.html

    http://mroek.blogspot.no/2015/10/why-doesnt-strava-do-proper.html

    Your suggestion is basically the same as what I suggested (in great detail, I might add), but since Strava has also completely locked down their API, it is no longer possible for third parties to validate anything, so you can be sure they are not going to do anything about it now. They can even claim that they have improved their algorithms, but there's no way to really tell.

     

