Examples

The zipped sample file below lets you experiment with the format of the Typical speeds file. It uses OpenStreetMap node pairs to identify road segments within the 0320122 quadkey, which includes Wilmington, NC. The data is not up to date and is for testing purposes only. An example file with data referenced by OpenLR strings is at the bottom of this page.

Access to Mapbox Traffic Data is restricted to Mapbox customers who have purchased a Traffic Data license. This gives you the most up-to-date Typical files, Live files, and files that use OpenLR identifiers. Contact Mapbox sales for more information.

Download file

The sample code in these examples uses Python and the pandas library.

Calculate aggregated traffic metrics

You may be interested in calculating aggregated traffic metrics using the Typical speeds file. An example of this would be averaging speeds for each road across all five-minute intervals between 9:00 AM and 10:00 AM on Monday morning.

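The first two columns of each row identify the node pair that makes up a road segment. Each remaining column holds the speed for one five-minute interval of the week, starting on Sunday at 12:00 AM (midnight). There are 396 five-minute intervals between Sunday at 12:00 AM and Monday at 9:00 AM (288 for Sunday plus 108 for the first nine hours of Monday), so 398 columns come before the 9:00 AM interval. The 12 intervals to be aggregated are therefore in the 399th through 410th columns of the Typical speeds CSV, which are columns 398 through 409 when counting from zero as pandas does.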
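The column arithmetic above can be sketched as a small helper. This is a minimal illustration, not part of the Mapbox tooling: the function name `interval_column` and the constants are hypothetical, and the sketch assumes the week starts on Sunday at midnight as described above.

```python
INTERVALS_PER_HOUR = 12                       # 60 minutes / 5-minute intervals
INTERVALS_PER_DAY = 24 * INTERVALS_PER_HOUR   # 288 intervals per day
NODE_COLUMNS = 2                              # start node and end node come first


def interval_column(day, hour, minute=0):
    """Return the zero-based column index of a five-minute interval.

    day: 0 = Sunday ... 6 = Saturday (assumes the week starts Sunday at midnight).
    """
    if not (0 <= day <= 6 and 0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("day, hour, or minute out of range")
    intervals_before = (
        day * INTERVALS_PER_DAY + hour * INTERVALS_PER_HOUR + minute // 5
    )
    return NODE_COLUMNS + intervals_before


start = interval_column(1, 9)    # Monday 9:00 AM
stop = interval_column(1, 10)    # Monday 10:00 AM, exclusive end of the slice
print(start, stop)               # 398 410
```

The two values match the bounds of a zero-based, end-exclusive slice such as `iloc[:, 398:410]`, which covers the 399th through 410th columns.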

The following Python code consumes the sample Typical speeds file and generates the aggregated data described above as a pandas.DataFrame called monday_9_10_nodes_speeds:

import pandas as pd

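# The sample file has no header row, so columns are labeled 0, 1, 2, ...
all_speeds = pd.read_csv('0320122-America-New_York.csv.gz', header=None, compression='gzip')
# Zero-based columns 398 through 409 hold Monday 9:00 AM to 10:00 AM
monday_9_10_speeds = all_speeds.iloc[:, 398:410].mean(axis=1).round(2)
# Put the node pair next to its average speed
monday_9_10_nodes_speeds = pd.concat([all_speeds[[0, 1]], monday_9_10_speeds], axis=1)
monday_9_10_nodes_speeds.columns = ['start_node', 'end_node', 'average']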
print(monday_9_10_nodes_speeds.head())

# Expected output:
#
#     start_node  end_node   average
# 0   113054533   113096757  55.58
# 1   170190194   4525170049 93.92
# 2   4525170049  4525170048 93.92
# 3   4525170048  170198969  93.92
# 4   170198969   170181217  93.92

Represent one point in time

You may also be interested in parsing the Typical speeds file to represent data at one point in time. To do so, you can manipulate the data so that it matches the format of the Live speeds file (start node, end node, speed). This format is useful if you want to provide traffic updates based on typical traffic patterns to a routing engine like OSRM.

The first two columns hold the start and end nodes, and the next 396 columns hold the five-minute intervals between Sunday at 12:00 AM and Monday at 9:00 AM. The data for 9:00 AM on Monday is therefore in the 399th column of the Typical speeds CSV, which is column 398 when counting from zero.

The following Python code generates a simplified file that contains the start and end nodes and the speed data from 9:00-9:05 AM on Monday:

import pandas as pd

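# The sample file has no header row, so columns are labeled 0, 1, 2, ...
all_speeds = pd.read_csv('0320122-America-New_York.csv.gz', header=None, compression='gzip')
# Keep the node pair (columns 0 and 1) and the Monday 9:00 AM interval (column 398)
monday_9am = all_speeds[[0, 1, 398]]
# Write start node, end node, speed with no header or index, matching the Live speeds layout
monday_9am.to_csv('9am_speeds.csv', header=False, index=False)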

If you want to use the resulting file to supply traffic data to OSRM, the OSRM Wiki entry on traffic updates has instructions specific to this use case.

This workflow enables you to load a single snapshot of traffic data, rather than an array of speeds that maps to times of the day or week. It is up to you to arrange for continuous updating if that is your intention.
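If you do arrange for repeated updates, one piece you would need is a way to pick the column that matches a given moment. The following is a minimal sketch, not an official method: the function name `snapshot_column` is hypothetical, and it assumes the `datetime` you pass in is already expressed in the local time zone of the file (the sample file name suggests America/New_York) and that the week starts on Sunday at midnight.

```python
from datetime import datetime


def snapshot_column(moment):
    """Return the zero-based Typical speeds column for a local datetime."""
    # datetime.weekday() numbers Monday as 0 and Sunday as 6;
    # shift it so that Sunday is 0, matching the start of the file's week.
    day = (moment.weekday() + 1) % 7
    # 288 intervals per day, 12 per hour, one per five minutes
    intervals_before = day * 288 + moment.hour * 12 + moment.minute // 5
    # Skip the two node columns at the start of each row
    return 2 + intervals_before


# 2024-01-01 was a Monday
print(snapshot_column(datetime(2024, 1, 1, 9, 0)))   # 398
```

The returned index could then replace the hard-coded `398` when selecting columns, for example `all_speeds[[0, 1, snapshot_column(moment)]]`.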

If you are interested in applying frequent speed updates to OSRM (on the scale of minutes rather than hours), we recommend you use the Multi-level Dijkstra (MLD) preprocessing pipeline as described in the OSRM traffic updates wiki page.
