Draw points of interest using the Mapbox Vision SDK for iOS
Prerequisites: familiarity with Xcode and Swift, and completion of the Mapbox Vision SDK for iOS Install and configure guide.
The Mapbox Vision SDK for iOS is a library for interpreting road scenes in real time directly on iOS devices using the device’s built-in camera. The Vision SDK detects many types of objects including cars, people, road signs, and more.
In this tutorial, you'll learn how to use the geo-to-world and world-to-screen coordinate transformations available in the Mapbox Vision SDK and apply them to render points of interest (POIs) on the screen.
In the example used in this tutorial, all POIs have predefined images and hard-coded locations. In your own application, you can use your own POI data or data from other Mapbox products. See the Specify a list of POIs section for details.
Getting started
Here are the resources that you need before getting started:
- An application including the Mapbox Vision SDK for iOS. Before starting this tutorial, go through the Install and configure steps in the Vision SDK for iOS documentation. This will walk you through how to install the Vision SDK and configure your application.
- A recorded session. This tutorial is based on a recorded driving session through a city and the replay capabilities of `MapboxVision`. You may use `VisionManager` and `VisionReplayManager` interchangeably for live and recorded sessions, respectively.
  - Check our Testing and development guide to familiarize yourself with the record and replay functionality.
  - You can download the recorded session used in this tutorial below.
Coordinate systems
The Vision SDK uses three coordinate systems: frame, world, and geo. You can translate coordinates from one system to another with the `VisionManager` and `VisionReplayManager` methods `geoToWorld(geoCoordinate:)` and `worldToPixel(worldCoordinate:)`, and their inverses `worldToGeo(worldCoordinates:)` and `pixelToWorld(screenCoordinate:)`.

The accuracy of the transformation functions depends heavily on camera calibration, so it's recommended that you use them only after the `Camera.isCalibrated` value becomes `true`.
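As a sketch of how the direct transformations compose (assuming a running `visionManager`, the latest `Camera` state stored in a `camera` property, and a `geoCoordinate` of interest):

```swift
// check calibration first; transformation accuracy depends on it
guard let camera = camera, camera.isCalibrated else { return }

// geo -> world: the POI's position relative to the device camera, in meters
if let worldCoordinate = visionManager.geoToWorld(geoCoordinate: geoCoordinate),
   // world -> frame: the POI's pixel position on the camera frame
   let framePoint = visionManager.worldToPixel(worldCoordinate: worldCoordinate) {
    print("POI at pixel (\(framePoint.x), \(framePoint.y))")
}
```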
Frame coordinates
In the Mapbox Vision SDK for iOS, frame coordinates are represented by the `Point2D` object.

This coordinate system represents the position of an object relative to the frame received from the camera. The origin is the top left corner of the frame. The position of an object is an `x, y` pair, where `x` is the horizontal distance in pixels from the origin to the object and `y` is the vertical distance in pixels from the origin to the object.

This system can also be referred to as the screen coordinate system. The dimensions of the "screen" are defined by the dimensions of the frame received from the camera or other video source.
World coordinates
In the Mapbox Vision SDK for iOS, a world coordinate is represented by the `WorldCoordinate` object.

This coordinate system represents the position of an object relative to the device camera in the physical world. The origin of the system is a point projected from the camera onto the road plane. The coordinate of an object is a triplet `x, y, z`, where `x` is the distance from the origin to the object straight ahead, `y` is the distance to the left of the origin, and `z` is the distance above the origin. Distances in the world coordinate system are expressed in meters.
Geographic coordinates
In the Mapbox Vision SDK for iOS, a geographic coordinate is represented by the `GeoCoordinate` object.

This coordinate system is used to locate an object's geographic position as it would appear on a map. Each point is specified as a `longitude, latitude` pair.

Longitude ranges from `-180` to `180` degrees, where `0` is the Greenwich meridian; the positive direction (`+`) is to the East, and the negative direction (`-`) is to the West.

Latitude ranges from `-90` to `90` degrees, where `0` is the equator; the positive direction (`+`) is to the North, and the negative direction (`-`) is to the South.
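To make these types concrete, here is a minimal sketch constructing the two value types passed into the transformation methods (the numbers are illustrative only):

```swift
// A geographic coordinate: a longitude/latitude pair in degrees
let geo = GeoCoordinate(lon: 27.675944566726685, lat: 53.94105180084251)

// A world coordinate: meters relative to the camera,
// 20 m ahead (x), 2 m to the left (y), 1 m above the road plane (z)
let world = WorldCoordinate(x: 20, y: 2, z: 1)
```

Frame coordinates (`Point2D`) are typically received back from the SDK's transformation methods rather than constructed directly.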
Set up the recorded session
To configure your application with a prerecorded session (see Getting started to download the sample session used in this tutorial):

- Unzip the contents to a folder on your local machine.
- Go to your Xcode project's `Info.plist` and set `UIFileSharingEnabled` to `YES`, enabling file sharing through Finder.
- Install the app on the device (`⌘` + `R`).
- Connect your device, choose it in Finder under the `Locations` section, and select the `Files` tab.
- Drag and drop the folder with the recorded session onto your app. The session is now available in the `Documents` folder inside the app container.
- In code, use the `VisionReplayManager.create(recordPath:)` method to create an instance of `VisionReplayManager` by providing the path to the recorded session.
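The last step corresponds to this snippet from the final code, which builds the path from the app's `Documents` directory:

```swift
// Documents directory path with files uploaded via Finder
let documentsPath =
    NSSearchPathForDirectoriesInDomains(.documentDirectory,
                                        .userDomainMask,
                                        true).first!
let path = documentsPath.appending("/poi-drawing")

// create VisionReplayManager with a path to the recorded session
visionManager = try? VisionReplayManager.create(recordPath: path)
```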
Configure Vision SDK lifecycle
In this tutorial, you'll use `VisionReplayManager` to run a prerecorded session (which includes video and telemetry data) and find POIs in the video. The `VisionReplayManager` class is the main object for registering for events from the Vision SDK and controlling their delivery. For production applications or testing in a live environment, use `VisionManager` instead of `VisionReplayManager`. See the Next steps section for details.
To set up the Vision SDK:
- Create a `VisionReplayManager` instance with a recorded session path.
- Register its `delegate`.
- Create `VisionPresentationViewController` and configure it with `VisionReplayManager` to display camera frames.
- Start delivering events by calling `start` on `VisionReplayManager`.
- Stop delivering events by calling `stop` on `VisionReplayManager`.
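The start and stop steps map onto the view controller lifecycle callbacks in the final code:

```swift
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    // start delivering events
    visionManager.start()
}

override func viewDidDisappear(_ animated: Bool) {
    super.viewDidDisappear(animated)
    // stop delivering events
    visionManager.stop()
}
```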
Implement `VisionManagerDelegate`
By implementing the `VisionManagerDelegate` protocol and registering as a `delegate` with `VisionReplayManager`, your class becomes capable of reacting to events emitted by the Vision SDK. All methods in the protocol are optional, so you may implement only the ones you need.

In this tutorial, we're interested in obtaining the latest state of the `Camera`, which includes calibration progress and frame size, in the `visionManager(_:didUpdateCamera:)` method, as well as calculating the positions of POIs on the screen each time an update cycle of `VisionReplayManager` completes, in the `visionManagerDidCompleteUpdate(_:)` method.
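The two delegate methods used in this tutorial look like this (see the Final result section for full context):

```swift
extension POIDrawingViewController: VisionManagerDelegate {
    func visionManager(_: VisionManagerProtocol, didUpdateCamera camera: Camera) {
        // dispatch to the main queue in order to sync access to the `Camera` instance
        DispatchQueue.main.async {
            self.camera = camera
        }
    }

    func visionManagerDidCompleteUpdate(_: VisionManagerProtocol) {
        // dispatch to the main queue in order to work with UIKit elements
        DispatchQueue.main.async {
            self.updatePOI(geoCoordinate: gasStationCoordinate,
                           poiView: self.gasStationView)
            self.updatePOI(geoCoordinate: carWashCoordinate,
                           poiView: self.carWashView)
        }
    }
}
```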
Calculate POI positions
Next, you'll prepare POI data to be added to the screen. To draw a POI label on the screen, you need to take the geo coordinate (longitude, latitude) of the POI and make two coordinate transformations: first from geo coordinates to world coordinates, then from world coordinates to frame coordinates. Remember that screen coordinates represent a point on the camera frame, so you'll need an extra step to translate this coordinate into your view's coordinates.
Specify a list of POIs
This example uses two hard-coded POIs. In your application, you will need to generate your own list of POIs, each including a `longitude` and `latitude`. You'll also need an image to display for each POI.
Here are the sample POIs included in this tutorial:
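They are declared as constants in the final code, along with the thresholds used when drawing them:

```swift
// POI coordinates for the provided session. Use your own for real-time or other recorded sessions
private let carWashCoordinate = GeoCoordinate(lon: 27.675944566726685, lat: 53.94105180084251)
private let gasStationCoordinate = GeoCoordinate(lon: 27.674764394760132, lat: 53.9405971055192)

private let distanceVisibilityThreshold = 300.0
private let distanceAboveGround = 16.0
private let poiDimension = 16.0
```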
In your own application, you could use your own POI data, data from the Mapbox Geocoding API, or data from the Mapbox Streets tileset via the Mapbox Tilequery API.
Transform `GeoCoordinate` to `WorldCoordinate`
In this example, the `updatePOI` method takes the `GeoCoordinate` of the POI and a `UIView` instance that marks the POI on the screen.
Before drawing a POI, we need to check that:

- The `Camera` is calibrated, so that we receive precise values during coordinate transformations.
- The `geoToWorld(geoCoordinate:)` transformation is successful (a `WorldCoordinate` value is produced).
- The POI is in the visibility range.

Although a POI is usually described as a single point, our POI marker is a rectangular banner pointing to the geographical point from above. The rectangle is defined by its top left and bottom right vertices. Lift the POI marker above the ground by increasing the `z` component of the `WorldCoordinate`.
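In the final code, these checks and the marker rectangle construction look like this:

```swift
guard
    // make sure that `Camera` is calibrated for more precise transformations
    let camera = camera, camera.isCalibrated,
    // convert geo to world
    let poiWorldCoordinate = visionManager.geoToWorld(geoCoordinate: geoCoordinate),
    // make sure the POI is in front of the camera and not too far away
    poiWorldCoordinate.x > 0, poiWorldCoordinate.x < distanceVisibilityThreshold
else {
    hideView()
    return
}

// lift the marker above the ground by altering the `z` coordinate
let worldCoordinateLeftTop =
    WorldCoordinate(x: poiWorldCoordinate.x,
                    y: poiWorldCoordinate.y - poiDimension / 2,
                    z: distanceAboveGround + poiDimension / 2)
let worldCoordinateRightBottom =
    WorldCoordinate(x: poiWorldCoordinate.x,
                    y: poiWorldCoordinate.y + poiDimension / 2,
                    z: distanceAboveGround - poiDimension / 2)
```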
Transform `WorldCoordinate` to frame coordinates
Now that you have two `WorldCoordinate`s, you can translate them to frame coordinates. Put the transformations in a `guard` condition so that if either doesn't produce a result, further work is not executed.

The last transformation converts the marker rectangle's vertices from camera frame space to view space. To simplify this task, the Vision SDK provides a helper function, `CGPoint.convertForAspectRatioFill(from:to:)`, that converts a point from the original bounds to the destination bounds, respecting the aspect ratio.
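These two steps from the final code:

```swift
guard
    // convert the POI to screen (frame) coordinates
    let screenCoordinateLeftTop =
        visionManager.worldToPixel(worldCoordinate: worldCoordinateLeftTop),
    let screenCoordinateRightBottom =
        visionManager.worldToPixel(worldCoordinate: worldCoordinateRightBottom)
else {
    hideView()
    return
}

// translate points from the camera frame space to the view space
let frameSize = camera.frameSize.cgSize
let viewSize = view.bounds.size
let leftTop = screenCoordinateLeftTop.cgPoint
    .convertForAspectRatioFill(from: frameSize, to: viewSize)
let rightBottom = screenCoordinateRightBottom.cgPoint
    .convertForAspectRatioFill(from: frameSize, to: viewSize)
```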
Draw POIs
The only remaining step is to construct the marker view frame, set it, and display the view.
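From the final code:

```swift
// construct and apply the POI view's frame rectangle
let poiFrame = CGRect(x: leftTop.x,
                      y: leftTop.y,
                      width: rightBottom.x - leftTop.x,
                      height: rightBottom.y - leftTop.y)
poiView.frame = poiFrame
view.addSubview(poiView)
```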
Final result
```swift
import MapboxVision
import UIKit

/**
 * The "POI drawing" example demonstrates how to draw a point of interest on the screen,
 * knowing its geographical coordinates and using coordinate transformation functions.
 */

// POI coordinates for the provided session. Use your own for real-time or other recorded sessions
private let carWashCoordinate = GeoCoordinate(lon: 27.675944566726685, lat: 53.94105180084251)
private let gasStationCoordinate = GeoCoordinate(lon: 27.674764394760132, lat: 53.9405971055192)

private let distanceVisibilityThreshold = 300.0
private let distanceAboveGround = 16.0
private let poiDimension = 16.0

class POIDrawingViewController: UIViewController {
    private var visionManager: VisionReplayManager!
    private let visionViewController = VisionPresentationViewController()
    private var carWashView = UIImageView(image: UIImage(named: "car_wash"))
    private var gasStationView = UIImageView(image: UIImage(named: "gas_station"))
    // latest value of the camera state
    private var camera: Camera?

    override func viewDidLoad() {
        super.viewDidLoad()

        // Documents directory path with files uploaded via Finder
        let documentsPath =
            NSSearchPathForDirectoriesInDomains(.documentDirectory,
                                                .userDomainMask,
                                                true).first!
        let path = documentsPath.appending("/poi-drawing")

        // create VisionReplayManager with a path to the recorded session
        visionManager = try? VisionReplayManager.create(recordPath: path)
        // register its delegate
        visionManager.delegate = self

        // configure the Vision view to display sample buffers from the video source
        visionViewController.set(visionManager: visionManager)
        // add the Vision view as a child view
        addVisionView()
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        visionManager.start()
    }

    override func viewDidDisappear(_ animated: Bool) {
        super.viewDidDisappear(animated)
        visionManager.stop()
    }

    deinit {
        // free up VisionManager's resources
        visionManager.destroy()
    }

    private func addVisionView() {
        addChild(visionViewController)
        view.addSubview(visionViewController.view)
        visionViewController.didMove(toParent: self)
    }

    private func updatePOI(geoCoordinate: GeoCoordinate, poiView: UIView) {
        // closure that's used to hide the view if one of the conditions isn't met
        let hideView = {
            poiView.removeFromSuperview()
        }

        guard
            // make sure that `Camera` is calibrated for more precise transformations
            let camera = camera, camera.isCalibrated,
            // convert geo to world
            let poiWorldCoordinate = visionManager.geoToWorld(geoCoordinate: geoCoordinate),
            // make sure the POI is in front of the camera and not too far away
            poiWorldCoordinate.x > 0, poiWorldCoordinate.x < distanceVisibilityThreshold
        else {
            hideView()
            return
        }

        // by default the translated geo coordinate is placed at 0 height in the world space.
        // If you'd like to lift it above the ground, alter its `z` coordinate
        let worldCoordinateLeftTop =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y - poiDimension / 2,
                            z: distanceAboveGround + poiDimension / 2)
        let worldCoordinateRightBottom =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y + poiDimension / 2,
                            z: distanceAboveGround - poiDimension / 2)

        guard
            // convert the POI to screen coordinates
            let screenCoordinateLeftTop =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateLeftTop),
            let screenCoordinateRightBottom =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateRightBottom)
        else {
            hideView()
            return
        }

        // translate points from the camera frame space to the view space
        let frameSize = camera.frameSize.cgSize
        let viewSize = view.bounds.size
        let leftTop = screenCoordinateLeftTop.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)
        let rightBottom = screenCoordinateRightBottom.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)

        // construct and apply the POI view's frame rectangle
        let poiFrame = CGRect(x: leftTop.x,
                              y: leftTop.y,
                              width: rightBottom.x - leftTop.x,
                              height: rightBottom.y - leftTop.y)
        poiView.frame = poiFrame
        view.addSubview(poiView)
    }
}

extension POIDrawingViewController: VisionManagerDelegate {
    func visionManager(_: VisionManagerProtocol, didUpdateCamera camera: Camera) {
        // dispatch to the main queue in order to sync access to the `Camera` instance
        DispatchQueue.main.async {
            self.camera = camera
            // you may track the calibration progress
            print("Calibration: \(camera.calibrationProgress)")
        }
    }

    func visionManagerDidCompleteUpdate(_: VisionManagerProtocol) {
        // dispatch to the main queue in order to work with UIKit elements
        // and sync access to the `Camera` instance
        DispatchQueue.main.async {
            self.updatePOI(geoCoordinate: gasStationCoordinate,
                           poiView: self.gasStationView)
            self.updatePOI(geoCoordinate: carWashCoordinate,
                           poiView: self.carWashView)
        }
    }
}
```
Next steps
Use real-time data
When you're done testing, follow these steps to start working with real-time data.
- Change the type of the `visionManager` var to `VisionManager`.
- Create and save a `CameraVideoSource` instance:

```swift
// create a video source obtaining buffers from the camera module
cameraVideoSource = CameraVideoSource()
```

- Create `VisionManager` with the created video source:

```swift
// create VisionManager with the video source
visionManager = VisionManager.create(videoSource: cameraVideoSource!)
```

- Start `CameraVideoSource` along with `VisionManager` in `viewWillAppear(_:)`.
- Stop `CameraVideoSource` along with `VisionManager` in `viewDidDisappear(_:)`.
- Provide your own POI `GeoCoordinate`s.