Data format
Mapbox Movement data are aggregations of movement activity for a given time span and geographic area. This guide describes the different geographic areas available, along with the schema for the various time span options.
To use Mapbox Movement, you must specify both the desired geographic aggregation and the time span.
Geographic Areas
Spatially, activity can be aggregated into tiles or into select boundary types from the Mapbox Boundaries dataset (counties, states, etc.).
Tiles
Mapbox Movement Data can be aggregated into geographic quadrangles known as tiles. Data are available at zoom level 18, where each tile is roughly the size of a city block.
In the dataset, the bounds field provides the outer corner coordinates of each tile, while the xlat and xlon fields show the centroid of each quadkey.
Tiles are referenced using a numeric quadkey which describes the tile's zoom level and location.
Understanding Quadkeys
For additional geospatial manipulation of quadkeys, you can use the mercantile library. This conversion is only necessary if you need to map CSV filenames to geographic regions, because each line item already includes longitude and latitude coordinates. You can also use the What the Tile interactive tool to visualize quadkey locations.
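As a rough illustration of how a quadkey encodes a tile's zoom level and location, the sketch below decodes a quadkey with the standard Web Mercator quadkey scheme, using only the Python standard library. The function names are hypothetical and are not part of any Mapbox API. The `normalize_quadkey` helper assumes that a quadkey may have been parsed as an integer somewhere in your pipeline, which would drop its leading zeros.

```python
import math


def quadkey_to_tile(quadkey: str) -> tuple[int, int, int]:
    """Decode a quadkey string into (x, y, zoom) tile indices."""
    x = y = 0
    for digit in quadkey:
        value = int(digit)
        if value > 3:
            raise ValueError(f"invalid quadkey digit: {digit!r}")
        # Each digit contributes one bit to x (low bit) and one to y (high bit).
        x = (x << 1) | (value & 1)
        y = (y << 1) | (value >> 1)
    # The zoom level equals the number of digits in the quadkey.
    return x, y, len(quadkey)


def tile_bounds(x: int, y: int, zoom: int) -> tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for a Web Mercator tile."""
    n = 2 ** zoom

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Row indices grow southward, so the tile's north edge is row y.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)


def normalize_quadkey(value, zoom: int = 18) -> str:
    """Restore leading zeros lost if a quadkey was read as an integer."""
    return str(value).zfill(zoom)
```

For example, `tile_bounds(*quadkey_to_tile(normalize_quadkey(value)))` returns the corner coordinates in the same lower-left, upper-right order as the bounds field.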
Mapbox Boundaries
For Movement data joined to Mapbox Boundaries, the dataset contains only the code of the corresponding area in the geography field. For counties, the last five characters of the ID are the region's FIPS code; for states, the last three characters are. You can also request additional Mapbox Boundaries files that map regional codes to the corresponding polygon shapes, coordinates, and surface areas.
Normalization
The activity index is not normalized by area or population density, so larger and denser counties will show higher activity levels. We recommend running additional normalization if the purpose of the analysis is to compare counties or states to each other.
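One possible way to run such a normalization is to divide each region's activity index by a per-region divisor such as surface area or population. This is only a sketch: the source does not prescribe a method, the `normalize_activity` function is hypothetical, and the county codes, activity values, and areas in the usage note are made up for illustration.

```python
def normalize_activity(
    activity_by_geography: dict[str, float],
    divisor_by_geography: dict[str, float],
) -> dict[str, float]:
    """Divide each activity index by a per-geography divisor.

    The divisor can be surface area, population, or another measure of
    region size. Regions with a missing or zero divisor are skipped.
    """
    normalized = {}
    for geography, activity in activity_by_geography.items():
        divisor = divisor_by_geography.get(geography)
        if not divisor:
            continue  # no usable divisor for this region
        normalized[geography] = activity / divisor
    return normalized
```

With illustrative inputs such as `normalize_activity({"A": 0.08, "B": 0.04}, {"A": 4000.0, "B": 1000.0})`, region B ends up with the higher activity per unit area even though its raw index is lower.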
Time Span Data Schema
Mapbox Movement datasets are available in two different time span types: Daily and Monthly. Multi-month aggregations are also available upon request.
The schema of these two time span types and example values for each field are defined below.
Daily Data
Device activity is aggregated every 24 hours and generated daily, which allows for frequent updates to reflect immediate changes in activity levels.
Field | Description | Example |
---|---|---|
geography | For tile aggregation: Z18 quadkey ID. For Mapbox Boundary polygons: county ID or state ID | 032001323000312110; 212097; 147 |
xlon | Longitude of the center of the bounded area (tile only) | -122.428207397461 |
xlat | Latitude of the center of the bounded area (tile only) | 47.6880413955171 |
bounds | The lower left and upper right corners of the bounded area (tile only) | -122.42889404296875, 47.68757916850813, -122.42752075195312, 47.68850362252605 |
activity_index_total | Normalized activity factor (for all activity types) | 0.069267 |
agg_day_period | The date, in local time | 2020-01-01 |
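The following sketch shows one way to turn a tile-level daily row into typed values. It assumes the row has already been read into a dictionary of strings keyed by field name (for example with `csv.DictReader`) and that the bounds field is a comma-separated string in the order shown in the table. The `DailyRecord` and `parse_daily_row` names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DailyRecord:
    geography: str  # kept as a string so leading zeros survive
    xlon: float
    xlat: float
    bounds: tuple[float, float, float, float]  # west, south, east, north
    activity_index_total: float
    agg_day_period: date


def parse_daily_row(row: dict[str, str]) -> DailyRecord:
    """Convert one tile-level daily row (all string values) to typed fields."""
    # Assumption: bounds is "west, south, east, north", comma-separated.
    west, south, east, north = (float(part) for part in row["bounds"].split(","))
    return DailyRecord(
        geography=row["geography"],
        xlon=float(row["xlon"]),
        xlat=float(row["xlat"]),
        bounds=(west, south, east, north),
        activity_index_total=float(row["activity_index_total"]),
        agg_day_period=date.fromisoformat(row["agg_day_period"]),
    )
```

In the example values from the table, xlon and xlat are the midpoints of the bounds, which can serve as a quick sanity check after parsing.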
Monthly Data
Device activity is aggregated over a one-month time window, which allows for periodic updates to reflect changes in typical hourly activity patterns. Multi-month aggregations may be provided for improved coverage and consistency (3-month, 6-month, and 12-month).
Field | Description | Example |
---|---|---|
geography | For tile aggregation: Z18 quadkey ID | 032001323000312110 |
xlon | Longitude of the center of the bounded area (tile only) | -122.428207397461 |
xlat | Latitude of the center of the bounded area (tile only) | 47.6880413955171 |
bounds | The lower left and upper right corners of the bounded area (tile only) | -122.42889404296875, 47.68757916850813, -122.42752075195312, 47.68850362252605 |
activity_index_total | Normalized activity factor (for all activity types) | 0.069267 |
agg_day_period | For seven-day: an integer that describes the day of the week, starting with 0 for Monday. For weekday/weekend: an integer that describes whether the day is a weekday or a weekend | Seven-day: 0 for Monday, 1 for Tuesday, 6 for Sunday. Weekday/weekend: 0 for a weekday, 1 for a weekend |
agg_time_period | For hourly: an integer between 0 and 23 that describes the hour of day. For 2-hour dayparts: an integer between 0 and 11 that describes the index of the 2-hour window. For 4-hour dayparts: an integer between 0 and 5 that describes the index of the 4-hour window | Hourly: 4 (4:00 AM), 18 (6:00 PM). 2-hour: 0 (the 00:00-01:59 time window), 11 (the 22:00-23:59 time window). 4-hour: 0 (the 00:00-03:59 time window), 5 (the 20:00-23:59 time window) |
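To make the integer encodings in the monthly table concrete, here is a small sketch that converts agg_day_period and agg_time_period values into readable labels. The helper names and the `hours_per_bucket` parameter are hypothetical. The caller is assumed to know which aggregation (seven-day or weekday/weekend; hourly, 2-hour, or 4-hour) the dataset was delivered with.

```python
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_label(agg_day_period: int, seven_day: bool = True) -> str:
    """Translate agg_day_period into a day name or weekday/weekend label."""
    labels = DAY_NAMES if seven_day else ["weekday", "weekend"]
    if not 0 <= agg_day_period < len(labels):
        raise ValueError(f"agg_day_period out of range: {agg_day_period}")
    return labels[agg_day_period]


def time_window(agg_time_period: int, hours_per_bucket: int = 1) -> str:
    """Translate agg_time_period into an HH:MM-HH:MM window."""
    if hours_per_bucket not in (1, 2, 4):
        raise ValueError("hours_per_bucket must be 1, 2, or 4")
    buckets = 24 // hours_per_bucket
    if not 0 <= agg_time_period < buckets:
        raise ValueError(f"agg_time_period out of range: {agg_time_period}")
    start = agg_time_period * hours_per_bucket
    end = start + hours_per_bucket - 1
    return f"{start:02d}:00-{end:02d}:59"
```

For instance, `time_window(11, hours_per_bucket=2)` corresponds to the 22:00-23:59 window listed in the table, and `day_label(1, seven_day=False)` corresponds to a weekend.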
Snowflake Data Share Schema
Daily or Monthly data delivered via Snowflake secure data share has a separate schema:
Field | Description | Example |
---|---|---|
country | 2 letter ISO 3166-1 country code | US |
geography | For tile aggregation: Z18 quadkey ID. For Mapbox Boundary polygons: county ID or state ID | 032001323000312110; 212097; 147 |
xlon | Longitude of the center of the bounded area (tile only) | -122.428207397461 |
xlat | Latitude of the center of the bounded area (tile only) | 47.6880413955171 |
bounds | The lower left and upper right corners of the bounded area (tile only) | -122.42889404296875, 47.68757916850813, -122.42752075195312, 47.68850362252605 |
activity_index_total | Normalized activity factor (for all activity types) | 0.069267 |
agg_day_period | For seven-day: an integer that describes the day of the week, starting with 0 for Monday. For weekday/weekend: an integer that describes whether the day is a weekday or a weekend | Seven-day: 0 for Monday, 1 for Tuesday, 6 for Sunday. Weekday/weekend: 0 for a weekday, 1 for a weekend |
agg_time_period | For Daily data: N/A. For Monthly data: an integer between 0 and 23 that describes the hour of day (based on hourly_aggregation) | N/A; 3 |
data_version | Version # for data production. | v2.0 |
geo_aggregation | For tile aggregation: quadkey. For Mapbox Boundary polygons: adm1, adm2, etc. | quadkey; adm2 |
daily_aggregation | For Daily data: daily. For Monthly data: weekday_weekend, day_of_week | daily; weekday_weekend |
hourly_aggregation | For Daily data: 24 (24-hour daily aggregation). For Monthly data: an integer in {1, 2, 4} that describes the number of hours aggregated in each time bucket (hourly, 2-hour, or 4-hour dayparts) | 24; 1 |