GeoIP Dataset accuracy improved
GeoIP is one of the critical enrichments provided by our ingest layer to your Network or Cloud telemetry.
As our experience with GeoIP data has matured, we have realized that GeoIP datasets from commercial providers are usually highly accurate for broadband and mobile providers, only moderately accurate for content providers, and not accurate at all for backbone providers.
We have just rolled out an update to the Kentik platform that continuously augments our GeoIP records using an extensible selection of datasets.
State of the GeoIP union
Originally, GeoIP was used to map source_IP and destination_IP from NetFlow/sFlow data into our unified, enriched flow record data structure, making City, Region and Country available as Flow Dimensions for querying.
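To make the enrichment step concrete, here is a minimal sketch of a flow record being tagged with City, Region and Country. It assumes a longest-prefix-match lookup over a small in-memory table; the `GEOIP` table, the `lookup` and `enrich` functions, and the output field names are all hypothetical illustrations, not Kentik's actual schema or implementation.

```python
import ipaddress
from typing import NamedTuple, Optional


class Geo(NamedTuple):
    city: str
    region: str
    country: str


# Hypothetical prefix-to-location records (documentation prefixes only).
GEOIP = {
    ipaddress.ip_network("198.51.100.0/24"): Geo("Paris", "Ile-de-France", "FR"),
    ipaddress.ip_network("203.0.113.0/24"): Geo("San Francisco", "California", "US"),
    ipaddress.ip_network("203.0.113.128/25"): Geo("Seattle", "Washington", "US"),
}


def lookup(ip: str) -> Optional[Geo]:
    """Return the location of the most specific prefix containing ip, if any."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in GEOIP if addr in net]
    if not matches:
        return None
    return GEOIP[max(matches, key=lambda net: net.prefixlen)]


def enrich(flow: dict) -> dict:
    """Copy a flow record and add geo fields for both ends of the flow."""
    out = dict(flow)
    for side in ("src", "dst"):
        geo = lookup(flow[f"{side}_ip"])
        out[f"{side}_geo_city"] = geo.city if geo else None
        out[f"{side}_geo_region"] = geo.region if geo else None
        out[f"{side}_geo_country"] = geo.country if geo else None
    return out


flow = enrich({"src_ip": "198.51.100.7", "dst_ip": "203.0.113.200", "bytes": 1500})
```

A linear scan is fine for three prefixes; a real dataset would need a trie or a similar structure, which is left out here to keep the sketch short.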
Later in the life of the Kentik platform, these GeoIP mappings came to be used under the hood in many areas of the portal, such as:
- In Synthetic testing: we compute the distance of a test as the sum of the distances between all hops in the traceroute of a Network Test. If GeoIP is off for certain hops along the way, the distance will be incorrect, which skews our inference of the expected end-to-end latency when we compare it to the latency values obtained during the test.
- In Kentik Market Intelligence: we rank network providers in any market based on the amount of IP prefix space they get from their customer base (the more IPs, the higher the score). This means that the weight of a network can be overstated or understated when GeoIP data is incorrect.
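To illustrate the Synthetic testing case, the sketch below sums great-circle distances between consecutive traceroute hops and shows how a single mislocated hop inflates the total. The haversine formula and the spherical Earth radius are our assumptions for illustration (the exact distance computation used in the product is not described here), and the hop coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def path_distance_km(hops):
    """Sum the distances between each pair of consecutive hops."""
    return sum(haversine_km(p, q) for p, q in zip(hops, hops[1:]))


# Hypothetical hop locations for a Paris -> Amsterdam test.
paris = (48.8566, 2.3522)
brussels = (50.8503, 4.3517)
amsterdam = (52.3676, 4.9041)
new_york = (40.7128, -74.0060)  # where a bad GeoIP record places the middle hop

good = path_distance_km([paris, brussels, amsterdam])
bad = path_distance_km([paris, new_york, amsterdam])
```

With the middle hop wrongly placed in New York, the computed path crosses the Atlantic twice, so any latency expectation derived from `bad` would be far too generous.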
How did we fix it?
We initially relied on a simple system that fetched GeoIP data from our providers daily and reloaded it into our ingest layer, so that provider updates were applied as soon as they became available.
When you as a customer noticed inaccuracies, we would relay the evidence to our provider, who would publish the corrections in a future update once their experts had approved them. We were not satisfied with the end-to-end time it took to fulfill customer requests for relocation, so we built an override layer system that we can feed in two ways:
- Manually, by entering our own overrides
- Programmatically, by using additional external trusted datasets
We landed on the modular, layered architecture below, which does the following with each daily run:
- fetch the provider GeoIP Dataset
- overlay our own overrides based on their respective precedence/priority
- generate a resulting GeoIP custom Dataset
- swap the dataset on the fly in our ingest layer's in-memory datastore as part of the daily job
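The daily run described above can be sketched roughly as follows. This is a simplified illustration under our own assumptions: layers are plain prefix-to-location dictionaries, a higher `priority` number wins, and the "swap" is a single reference rebind. The names `Layer`, `build_dataset` and `GeoStore` are hypothetical and do not reflect the actual implementation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    priority: int  # assumption: higher number takes precedence
    records: dict  # prefix string -> location string


def normalize(prefix: str) -> str:
    """Canonical text form of a prefix, so equal prefixes share one key."""
    return str(ipaddress.ip_network(prefix))


def build_dataset(base: dict, layers: list) -> dict:
    """Overlay the layers on the provider dataset, lowest priority first."""
    result = {normalize(p): loc for p, loc in base.items()}
    for layer in sorted(layers, key=lambda layer: layer.priority):
        for prefix, location in layer.records.items():
            result[normalize(prefix)] = location  # later layers overwrite earlier
    return result


class GeoStore:
    """Holds the dataset currently served to the enrichment path."""

    def __init__(self, dataset: dict):
        self.current = dataset

    def swap(self, dataset: dict) -> None:
        # The new dataset is fully built before this single rebind,
        # so readers see either the old map or the new one, never a mix.
        self.current = dataset


provider = {"203.0.113.0/24": "Chicago, US", "192.0.2.0/24": "Berlin, DE"}
layers = [
    Layer("manual", 20, {"203.0.113.0/24": "Dallas, US"}),
    Layer("trusted-external", 10, {"203.0.113.0/24": "Denver, US",
                                   "198.51.100.0/24": "Lyon, FR"}),
]

store = GeoStore(provider)
store.swap(build_dataset(provider, layers))
```

Here the manual entry wins for 203.0.113.0/24 because it has the highest priority, the external dataset contributes a prefix the provider lacked, and untouched provider records pass through unchanged.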
Leveraging your SNMP data
Network devices exporting flow data to Kentik can also have SNMP enabled so that our ingest layer can poll them. One of the MIBs polled by our SNMP service is the interface MIB. With each poll, we get all interfaces for the network element and the IPs configured on these interfaces.
As part of our enrichments, our customers also declare their network elements in Sites, so they can later query telemetry data by site. The bonus is that users submit an address, which we translate into geocoded data so we can place sites on a map.
Every day, we scan the full set of IP addresses learned on each device via SNMP, and we associate each public IP address with the geocoded data of the site that contains the network element on which the address is configured.
This is a somewhat unique dataset that no GeoIP provider out there has at their disposal. We now leverage it daily as a "Layer" of overrides superseding the base GeoIP dataset from our provider.
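As a rough sketch of that daily scan, the snippet below walks hypothetical devices, keeps only globally routable interface addresses, and emits one host-level override per address using the coordinates of the device's site. The `SITES` and `DEVICES` structures are invented for illustration, and "public" is approximated with Python's `ipaddress` `is_global` property, which is our assumption rather than the platform's actual rule.

```python
import ipaddress

# Hypothetical geocoded sites: site name -> (latitude, longitude).
SITES = {
    "PAR1": (48.8566, 2.3522),
    "NYC2": (40.7128, -74.0060),
}

# Hypothetical SNMP poll results: device -> its site and interface addresses.
# Documentation ranges are not "global" in ipaddress, so two well-known
# public addresses are used purely as placeholders.
DEVICES = {
    "edge-par-01": {"site": "PAR1", "ips": ["8.8.8.8", "10.0.0.1"]},
    "edge-nyc-01": {"site": "NYC2", "ips": ["1.1.1.1", "100.64.0.1", "192.168.1.1"]},
}


def snmp_overrides(devices: dict, sites: dict) -> dict:
    """Map each public interface address to its device's site coordinates."""
    overrides = {}
    for device in devices.values():
        location = sites.get(device["site"])
        if location is None:
            continue  # device not attached to a geocoded site
        for ip in device["ips"]:
            if ipaddress.ip_address(ip).is_global:
                # ip_network() on a bare address yields a host prefix (/32 or /128).
                overrides[str(ipaddress.ip_network(ip))] = location
    return overrides


layer = snmp_overrides(DEVICES, SITES)
```

The resulting dictionary has the same prefix-to-location shape as any other override source, so it can be fed into the override system as one more layer.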
Benefits for our customers
If you notice an inaccurate GeoIP mapping when using Kentik, the time to correction is now as much as 15x shorter. Previously, the typical back-and-forth process with the GeoIP provider could take up to 15 days (we vet, then the GeoIP provider vets, then slots the fix into their next scheduled release).
Today, we are able to add an override to our GeoIP system immediately after we have vetted the evidence you submit, cutting this turnaround time to one day or less.
For critical updates, we can even have the engine recompute a complete map on demand.