Many a mountain biker, trail runner or other off-road enthusiast has grumbled about the accuracy of Strava segments. Strava too often will miss the start of a segment you "know" you KOM'd. Circuitous segments are prone to being truncated, where a complete chunk of distance (and time) gets conveniently omitted from oblivious KOM takers. And most offensive is Strava depriving almost everyone that ventures off road of 10% or more of their hard-earned distance. What's up with this?
I will attempt to explain what is going on. I will start out by stating much of it is not Strava's fault. There is a plethora of devices out there, and many have suspect accuracy at best. Strava attempts to keep the playing field as fair as possible, which invariably leads to at least some users being penalized while not rejecting other users outright.
Just in the last few years, there's been an explosion of devices the athletic community uses to log route, distance, energy expenditure and more. First there were GPS devices using the original US military constellation of satellites. Then there were smart phones, using hybrid positioning from dumbed-down GPS receivers and triangulation off base stations. Now we have GLONASS, Russia's satellite positioning system that most new smart phones and GPS devices are compatible with.
For sensors that work with GPS enabled devices, we have heart-rate monitors, altimeters, speed and cadence sensors of all sorts, many types of power meters for bicycles, foot pods for runners and much more. These sensors link up via a variety of wireless interfaces, but Ant+ Sport and Bluetooth are the dominant ones.
So now Strava is faced with a hugely diverse source of data streamed up to its servers. In a single workout, there can be several redundant, conflicting sources of data. Should Strava use the device's barometric altimeter for elevation data, satellite derived elevation, or calculate it from scratch from the GPS track? What about power? Power can be measured by a power meter or calculated on climbs. Then there is distance. A wheel sensor can count the exact number of wheel revolutions, while distance can also be calculated from GPS Lat/Lon data points.
The question here is, what is truth? Absolute truth is unknowable. There are only degrees of trust, or goodness in the collected data. I'm going to focus on distance here, as power and elevation are topics that each require separate deep dives.
When GPSs were first used as cycle computers, off-road riders were dismayed with the distance accuracy. You compared the distance from a wired cycle computer that you hadn't yet removed with that which the GPS measured, and the GPS was 10%, sometimes 20% or even more short. WTF?! Well, there's a lot going on when measuring your drunken-ant meandering with satellites whizzing by at 9000 mph, 12,000 miles up. There's noisiness in the data, and a lot of averaging is going on. Some of your little squiggles look like noise and get averaged out. That is where you get shortchanged. The system was never designed to produce the kind of accuracy needed to resolve tracks like these with $200 devices on your bike.
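To make the shrinkage concrete, here is a minimal Python sketch. It sums great-circle (haversine) distances along a made-up zigzag track, then does the same after a simple moving average. The moving average is only a stand-in for whatever filtering a real receiver applies, and the zigzag coordinates, window size and function names are all assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_length_m(points):
    """Sum of point-to-point distances along a list of (lat, lon) tuples."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


def moving_average(points, window=5):
    """Centered moving average of lat and lon (assumed filter, not a real device's)."""
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        chunk = points[max(0, i - half): i + half + 1]
        smoothed.append((sum(p[0] for p in chunk) / len(chunk),
                         sum(p[1] for p in chunk) / len(chunk)))
    return smoothed


# Hypothetical twisty trail: steady progress north with a side-to-side wiggle.
zigzag = [(42.6560 + 0.00005 * i, -70.9937 + (0.00008 if i % 2 else 0.0))
          for i in range(60)]

raw_m = track_length_m(zigzag)
smoothed_m = track_length_m(moving_average(zigzag))
print(f"raw: {raw_m:.0f} m, smoothed: {smoothed_m:.0f} m, "
      f"lost: {100 * (1 - smoothed_m / raw_m):.0f}%")
```

The wiggles are real distance, but once they are averaged away the summed length can only come up short.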
Now introduce wheel sensors. In an attempt to address these shortcomings, a wheel sensor can send revolution data to the GPS device. This gets you back into the realm of 1% accuracy, as the GPS now replaces its rounded off drunken ant distance with accumulated ticks of the wheel sensor. Here's what a track data point looks like in my Garmin 705 .TCX file. The XML tags have been removed because I didn't want to escape all of their characters:
Trackpoint
Time 2013-11-16T19:46:50Z
Position
LatitudeDegrees 42.6560440
LongitudeDegrees -70.9937590
AltitudeMeters 30.1330000
DistanceMeters 81905.5625000
SensorState Present
The DistanceMeters tag value normally gets calculated by the GPS receiver based on how much distance was traversed since the last track point. But when a wheel sensor is present, Garmin replaces the distance value with data derived from the sensor. My newer Garmin 500 produces a slightly different .TCX entry. It omits the SensorState tag and adds an extensions tag with other info.
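As a rough illustration of the two distance sources living side by side in one file, the Python sketch below reads DistanceMeters and the position from a trimmed TCX fragment and compares the recorded distance step with one computed from the coordinates. The first trackpoint uses the values shown above. The second trackpoint is invented, and the fragment leaves out elements a real Garmin file would contain.

```python
import math
import xml.etree.ElementTree as ET

TCX_NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}

# Trimmed, illustrative fragment. The second trackpoint is made up.
SAMPLE_TCX = """<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
<Activities><Activity Sport="Biking"><Lap><Track>
<Trackpoint>
<Position><LatitudeDegrees>42.6560440</LatitudeDegrees><LongitudeDegrees>-70.9937590</LongitudeDegrees></Position>
<AltitudeMeters>30.1330000</AltitudeMeters>
<DistanceMeters>81905.5625000</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Position><LatitudeDegrees>42.6561000</LatitudeDegrees><LongitudeDegrees>-70.9937000</LongitudeDegrees></Position>
<AltitudeMeters>30.2000000</AltitudeMeters>
<DistanceMeters>81914.0000000</DistanceMeters>
</Trackpoint>
</Track></Lap></Activity></Activities>
</TrainingCenterDatabase>"""


def read_trackpoints(xml_text):
    """Return (lat, lon, recorded_distance_m) for each trackpoint that has all three."""
    root = ET.fromstring(xml_text)
    points = []
    for tp in root.iter("{%s}Trackpoint" % TCX_NS["tcx"]):
        lat = tp.findtext("tcx:Position/tcx:LatitudeDegrees", namespaces=TCX_NS)
        lon = tp.findtext("tcx:Position/tcx:LongitudeDegrees", namespaces=TCX_NS)
        dist = tp.findtext("tcx:DistanceMeters", namespaces=TCX_NS)
        if lat is None or lon is None or dist is None:
            continue  # some trackpoints carry no position fix
        points.append((float(lat), float(lon), float(dist)))
    return points


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


pts = read_trackpoints(SAMPLE_TCX)
recorded_delta = pts[1][2] - pts[0][2]                      # what the file claims
gps_delta = haversine_m(pts[0][0], pts[0][1], pts[1][0], pts[1][1])  # what the coordinates imply
print(f"recorded step: {recorded_delta:.2f} m, GPS-derived step: {gps_delta:.2f} m")
```

Whoever consumes the file has to decide which of those two numbers to believe.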
Strava can choose to accept the distance data in the .TCX file or calculate its own from the position data. But what is truth here? I've seen some badly buggered up track files from smart phones and cheap GPS units. In those cases, distance data is almost certainly more accurate. That creates a data fusion problem, however. If the track is messed up, how do you assign good distance data to obviously flawed track data? This is where Strava makes a judgement call. From Strava's knowledge base, we have this revelation:
"Also during the upload process, the Strava uploader detects any outlier GPS data that may be present in your file - this includes inaccurate GPS points and data that is clearly inconsistent within the file. This bad data detection is an effort to improve the quality of uploaded data on Strava, and does solve many issues with GPS inconsistencies. If, and only if, outlier/bad GPS data is detected, the distance calculation will be reprocessed automatically based on your GPS coordinates (see "GPS-based, Strava post-upload approach" below). This reprocessed distance can differ from the distance data originally reported by the Garmin device, especially if a speed sensor is present (see "How to gather distance data" below). To request that your distance be reverted to what your Garmin device reports, please submit a new support ticket, titled "Revert Distance" and include the relevant activity URLs where you would like the distance to be reverted."
I highlighted in red the key statements here. In other words, Strava knows best. I find it odd that Strava always deems my Garmin 500 distance as bad, while my Garmin 705 distance is always accepted. I note that my 705 produces considerably more accurate tracks than my 500. This is especially noticeable when doing laps, or riding both sides of a road in a given ride. Lap points fall right on top of each other, while riding both sides of a road my points stay clearly segregated to each side of the road. Look at an iPhone track in these cases, and points will be meandering all over the place. I believe the 705 is still the best cycling GPS Garmin has produced to date. None of the newer offerings produce tracks as clean.
My 500, while producing cleaner tracks than an iPhone or Android Strava app, does frequently produce outlier points. For this reason, I believe Strava rejects its distance numbers. Strava seems to be agnostic to where distance came from, whether satellite based or wheel sensor based. If the GPS track is suspect, distance is gone, and Strava recomputes it from scratch using "cleaned up" GPS data. In this case, you totally lose on distance, even though your distance data was probably better than 1% accurate. I sometimes wonder if Strava just blanket rejects Garmin 500 distance data because the 500 produces questionable quality track data most of the time. I don't think Strava has ever accepted my 500 distance data, even when the track looked perfect. Then one time Strava accepted my 705 distance data, even though I had an obviously wrong wheel calibration setting. The track was clean though. Go figure.
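Strava does not say how it detects outliers, so the following Python sketch is only a guess at the general shape of the decision described in the knowledge base quote: keep the device distance unless the track looks bad, otherwise throw it out and recompute from cleaned-up GPS points. The speed threshold, the function names and the sample track are all assumptions, and a bad first point would fool this simple filter.

```python
import math


def approx_dist_m(p, q):
    """Equirectangular approximation in meters, fine for points close together."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return 6371000.0 * math.hypot(x, y)


def drop_outliers(points, max_speed_mps=25.0):
    """points: list of (lat, lon, t_seconds).

    Keep a point only if the speed implied from the last kept point is
    plausible. The 25 m/s threshold is a made-up value, not Strava's.
    """
    kept = [points[0]]
    for p in points[1:]:
        dt = p[2] - kept[-1][2]
        if dt > 0 and approx_dist_m(kept[-1], p) / dt <= max_speed_mps:
            kept.append(p)
    return kept


def pick_distance(device_distance_m, points):
    """Return (distance_m, source), mimicking the 'if, and only if' rule."""
    cleaned = drop_outliers(points)
    if len(cleaned) == len(points):
        return device_distance_m, "device"
    gps_m = sum(approx_dist_m(a, b) for a, b in zip(cleaned, cleaned[1:]))
    return gps_m, "gps-recomputed"


# Hypothetical one-second samples heading north, with one wild point at t=2.
clean_track = [(42.6560 + 0.00005 * i, -70.9937, i) for i in range(5)]
bad_track = list(clean_track)
bad_track[2] = (42.6561, -70.9837, 2)  # jumps roughly 800 m sideways

wheel_distance_m = 30.0  # pretend wheel-sensor total for this stretch
print(pick_distance(wheel_distance_m, clean_track))
print(pick_distance(wheel_distance_m, bad_track))
```

Note how a single stray point is enough to discard a wheel-sensor distance that had nothing wrong with it, which is exactly the behavior complained about above.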
Next I'll present a couple case studies. The first is from our Merrimack 50 ride a couple weeks ago. Ten Strava users rode up a trail called Pipeline and back down again. This is a perfect case study, as these 10 tracks were recorded under identical conditions on the same day. Things like tree and cloud cover can adversely affect GPS accuracy. Of the 10 rides, there was diversity in GPS units, from smart phones to 500's to my trusty 705.
The red track up and down Pipeline Trail is from my Garmin Edge 705. Luke recorded the yellow track with his iPhone Strava app. We both used wheel sensors with our respective devices. Note the red 705 trackpoints going out and back almost lie on top of each other. The repeatability of my 705 is excellent, something I notice lacking in Garmin's later offerings. The yellow iPhone track is of dubious quality upon inspection. Assuming Luke wasn't just free-ride meandering through the forest, there is considerable error between the two tracks here, more than 100ft difference in many places.
When we look at the distance Strava gives Luke and me on the descent, Luke gets 0.63 miles, I get 0.80 miles. Luke appears to be getting robbed of 21% just in this segment. But there's more. Strava lops off the first three turns at the top of the segment, so he loses some of that distance right there. Luke's iPhone position is off here, and because the trail loops back on itself, Strava doesn't have Luke starting the segment until he is clear of the uncertainty circle that Strava puts around each segment start and end point. I do not know how big these are, no doubt a closely kept secret. Strava doesn't need anybody looking behind the curtains to see how much dirty laundry there is back there.
Note that a distance error by itself does not impact leaderboard accuracy. Only the time spent traversing the segment matters. To get an accurate time, the GPS (and Strava) must accurately time-stamp the start and stop points. Distance in the middle can be rounded away; it won't affect the clock.
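Here is a small Python sketch of how circle-based segment matching could work. It is not Strava's algorithm, which is not public as far as I know: the 10 m radius, the function names and the sample coordinates are assumptions. The clock starts at the first track point inside the start circle and stops at the first later point inside the end circle.

```python
import math


def approx_dist_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation in meters, fine over short distances."""
    phi1, lam1, phi2, lam2 = map(math.radians, (lat1, lon1, lat2, lon2))
    x = (lam2 - lam1) * math.cos((phi1 + phi2) / 2)
    return 6371000.0 * math.hypot(x, phi2 - phi1)


def segment_time_s(points, start, end, radius_m=10.0):
    """points: list of (lat, lon, t_seconds); start/end: (lat, lon).

    Returns elapsed seconds, or None if the track never enters both circles
    in order. radius_m is a guess, not a published Strava value.
    """
    t_start = None
    for lat, lon, t in points:
        if t_start is None:
            if approx_dist_m(lat, lon, *start) <= radius_m:
                t_start = t  # first point inside the start circle
        elif approx_dist_m(lat, lon, *end) <= radius_m:
            return t - t_start  # first later point inside the end circle
    return None


# Hypothetical segment and two recordings of the same effort.
seg_start = (42.6560, -70.9937)
seg_end = (42.6570, -70.9937)
good_track = [(42.6560 + 0.0001 * i, -70.9937, i) for i in range(15)]
# Same ride, but the receiver is off by about 40 m to the east.
offset_track = [(lat, lon + 0.0005, t) for lat, lon, t in good_track]

print(segment_time_s(good_track, seg_start, seg_end))
print(segment_time_s(offset_track, seg_start, seg_end))
```

Nothing in the matching depends on the distance covered between the circles, only on when the track enters them, and a track that misses a circle scores nothing at all.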
As you can see, there are multiple sources of inaccuracy in Strava segments. Segments can be poorly defined, in that GPS devices do not have enough position fidelity to prevent short-circuiting portions of a segment. This is primarily a problem off-road, whether cycling, skiing, running, or especially a cyclocross course. But there are also problems on the road, where sometimes your GPS track never passes through that start or finish uncertainty circle and you end up killing yourself for nothing going for that KOM. And what if the segment creator used an iPhone that recorded a crappy track that day? Then many people with good tracks might never score in that segment. I believe Strava should restrict which devices can be used for creating new segments.
Strava's knowledge base added this excerpt on iPhones and Android Apps:
"When mobile data is synced with our servers from the iPhone or Android App, Strava runs a GPS-based distance calculation on the GPS coordinates, as we do not currently gather data from a speed/cadence sensor from the Strava App."
I wonder if that has anything to do with the track accuracy of these devices... It seems to me there would be a way to at least salvage the sensor data and give the rider credit for her distance. This can be remedied. If Strava can't figure it out, perhaps they should hire me to fix it.
Here's a second case study. Last weekend we did a 50 mile trail ride through Willowdale. I brought my Garmin 705 again, and to prove my suspicions, I also brought my Garmin 500. Both used a wheel calibration of 2278mm and were synced up to the wheel sensor. As you can see from the photo below, the two agree on distance and time to within a fraction of a percent of each other. The wheel distance data is placed in the .TCX file, which I uploaded from both units to Strava (actually, I used Garmin Training Center to convert the 500 .FIT file to a .TCX file).
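For reference, the wheel-sensor arithmetic is simple enough to sketch in a few lines of Python. The 2278 mm circumference is the calibration mentioned above. The revolution count and the "true" circumference in the second call are hypothetical numbers chosen only to show the calculation.

```python
def wheel_distance_m(revolutions, circumference_mm=2278):
    """Distance in meters from a revolution count and wheel circumference."""
    return revolutions * circumference_mm / 1000.0


def calibration_error_pct(set_mm, true_mm):
    """Percent distance error caused by a wrong circumference setting."""
    return (set_mm - true_mm) / true_mm * 100.0


revs = 35000  # hypothetical revolution count for a long trail ride
meters = wheel_distance_m(revs)
print(f"{revs} revolutions at 2278 mm = {meters / 1609.344:.1f} miles")

# If the unit were set to 2300 mm but the tire really rolled out at 2278 mm:
print(f"calibration error: {calibration_error_pct(2300, 2278):.2f}%")
```

Two head units fed by the same sensor with the same calibration should therefore report essentially the same distance, whatever their GPS tracks look like.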
What do you think was the result? Strava used my 705 distance and threw away my 500 distance! It was the same distance from the same ride from the same wheel sensor! Because Strava doesn't let you upload duplicate workouts anymore (I had to load one, delete it, then load the other), I put Strava screenshots below for comparison, the 705 first, 500 second.
Strava shorts me over 10% on distance even though I used a wheel sensor with the 500. I may send Strava a link to this post to see what gives, and whether this might be fixed anytime soon. I really can't see what in the 500 track would cause Strava to reject my device's recorded distance data.
On a side note, another thing I notice is Strava always uses my 500 barometric altimeter data, while Strava always recomputes my 705 elevation data, just the opposite of distance! This is maddening.
Hopefully your head didn't explode with too many details and you made it this far. I'd like to pick up a GLONASS enabled GPS in the near future to see if that can fix some of the accuracy problems. My 705 isn't going to last forever. Water readily finds its way into the unit when it gets wet now. More to follow...