Draw points of interest using the Mapbox Vision SDK for iOS

Prerequisite

Familiarity with Xcode and Swift, and completion of the Mapbox Vision SDK for iOS Install and configure guide.

The Mapbox Vision SDK for iOS is a library for interpreting road scenes in real time directly on iOS devices using the device’s built-in camera. The Vision SDK detects many types of objects including cars, people, road signs, and more.

In this tutorial, you'll learn how to use the geo-to-world and world-to-screen coordinate transformations available in the Mapbox Vision SDK and apply them to render points of interest (POIs) on the screen.

Note

In the example used in this tutorial, all POIs have predefined images and hard-coded locations. In your own application, you can use your own POI data or data from other Mapbox products. See the Specify a list of POIs section for details.

Getting started

Here are the resources that you need before getting started:

An application including Mapbox Vision SDK for iOS. Before starting this tutorial, go through the Install and configure steps in the Vision SDK for iOS documentation. This will walk you through how to install the Vision SDK and configure your application.
Recorded session. This tutorial is based on a recorded driving session through a city and the replay capabilities of MapboxVision. You may use VisionManager and VisionReplayManager interchangeably for live and recorded sessions respectively.
- Check our Testing and development guide to familiarize yourself with record and replay functionality.
- You can download the recorded session used in this tutorial below.

Download sample session

Coordinate systems

The Vision SDK uses three coordinate systems: frame, world, and geo. You can translate coordinates from one system to another with the VisionManager and VisionReplayManager methods geoToWorld(geoCoordinate:) and worldToPixel(worldCoordinate:), and with their inverses worldToGeo(worldCoordinates:) and pixelToWorld(screenCoordinate:).

Note

The accuracy of the transformation functions depends heavily on camera calibration, so it's recommended that you use them only after the Camera.isCalibrated value becomes true.

Frame coordinates

In the Mapbox Vision SDK for iOS, frame coordinates are represented by the Point2D object.

This coordinate system represents the position of an object relative to the frame received from the camera. The origin is the top-left corner of the frame. The position of an object is an x, y pair where x is the horizontal distance from the origin to the object in pixels, and y is the vertical distance from the origin to the object in pixels.

Note

This system can also be referred to as the screen coordinate system. Dimensions of the "screen" are defined by the dimensions of the frame received from the camera or other video source.

World coordinates

In the Mapbox Vision SDK for iOS, the world coordinate is represented by the WorldCoordinate object.

This coordinate system represents the position of an object relative to the device camera in the physical world. The origin of the system is a point projected from the camera to the road plane. The coordinate of the object is a triplet x, y, z where x is the distance to the object in front of the origin, y is the distance to the left of the origin, and z is the distance above the origin. Distances in the world coordinate system are expressed in meters.
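
To make the x, y, z triplet concrete, here is a minimal sketch that computes how far an object is from the origin along the road plane. `WorldPoint` and `groundDistance(to:)` are hypothetical stand-ins written for this illustration so that the snippet runs without the SDK; they are not part of MapboxVision.

```swift
import Foundation

// Hypothetical stand-in for the SDK's WorldCoordinate; all distances are in meters.
struct WorldPoint {
    let x: Double  // distance in front of the origin
    let y: Double  // distance to the left of the origin
    let z: Double  // distance above the origin
}

// Straight-line distance along the road plane, ignoring height (z).
func groundDistance(to point: WorldPoint) -> Double {
    return (point.x * point.x + point.y * point.y).squareRoot()
}

// An object 30 m ahead, 40 m to the side, and 2 m above the road plane.
let roadSign = WorldPoint(x: 30, y: 40, z: 2)
print(groundDistance(to: roadSign)) // 50.0
```

The tutorial itself only compares the x component against a threshold, which is cheaper and is enough to tell whether a POI is ahead of the camera.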

Geographic coordinate

In the Mapbox Vision SDK for iOS, the geographic coordinate is represented by the GeoCoordinate object.

This coordinate system is used to locate an object's geographic position as it would appear on a map. Each point is specified using a longitude, latitude pair.

Longitude ranges from -180 to 180 degrees, where 0 is the Greenwich meridian, the positive direction (+) is to the East, and the negative direction (-) is to the West.

Latitude ranges from -90 to +90 degrees, where 0 is the equator, the positive direction (+) is to the North, and the negative direction (-) is to the South.
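
The ranges above can be checked before a coordinate is handed to a transformation. The following is a minimal sketch; `LonLat` and its `isValid` property are hypothetical names introduced for this illustration and are not part of the SDK's GeoCoordinate.

```swift
// Hypothetical stand-in for a longitude, latitude pair, in degrees.
struct LonLat {
    let lon: Double
    let lat: Double

    // True when both values fall inside the documented ranges.
    var isValid: Bool {
        return (-180.0...180.0).contains(lon) && (-90.0...90.0).contains(lat)
    }
}

// One of the POIs used later in this tutorial.
let carWash = LonLat(lon: 27.675944566726685, lat: 53.94105180084251)
print(carWash.isValid) // true

// A latitude outside -90...90 is rejected.
let broken = LonLat(lon: 27.675944566726685, lat: 153.94105180084251)
print(broken.isValid) // false
```

A range check like this does not catch swapped longitude and latitude when both values happen to fall inside -90...90, so it is only a first line of defense.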

Set up the recorded session

To configure your application with a prerecorded session (see Getting started to download the sample session used in this tutorial):

Unzip the contents to a folder on your local machine.
Go to your Xcode project's Info.plist and set UIFileSharingEnabled to YES to enable file sharing through Finder.
Install the app on the device (⌘ + R).
Connect your device, choose it in Finder under the Locations section, and select the Files tab.
Drag and drop the folder with the recorded session onto your app. Now the session is available in the Documents folder inside the app container.
In the code, use the VisionReplayManager.create(recordPath:) method to create an instance of VisionReplayManager by providing a path to a recorded session.

Configure Vision SDK lifecycle

In this tutorial, you'll use VisionReplayManager to run a prerecorded session (which includes video and telemetry data) and find POIs in the video. The VisionReplayManager class is the main object for registering for events from the Vision SDK and controlling their delivery. For production applications or testing in a live environment, use VisionManager instead of VisionReplayManager. See the Next steps section for details.

To set up the Vision SDK:

Create a VisionReplayManager instance with a recorded session path.
Register its delegate.
Create a VisionPresentationViewController and configure it with VisionReplayManager to display camera frames.

Swift

    override func viewDidLoad() {
        super.viewDidLoad()

        // Documents directory path with files uploaded via Finder
        let documentsPath =
            NSSearchPathForDirectoriesInDomains(.documentDirectory,
                                                .userDomainMask,
                                                true).first!
        let path = documentsPath.appending("/poi-drawing")

        // create VisionReplayManager with a path to recorded session
        visionManager = try? VisionReplayManager.create(recordPath: path)
        // register its delegate
        visionManager.delegate = self

        // configure Vision view to display sample buffers from video source
        visionViewController.set(visionManager: visionManager)
        // add Vision view as a child view
        addVisionView()
    }

Start delivering events by calling start on VisionReplayManager.

Swift

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        visionManager.start()
    }

Stop delivering events by calling stop on VisionReplayManager.

Swift

    override func viewDidDisappear(_ animated: Bool) {
        super.viewDidDisappear(animated)
        visionManager.stop()
    }

Implement `VisionManagerDelegate`

By implementing the VisionManagerDelegate protocol and registering as a delegate with VisionReplayManager, your class becomes capable of reacting to events emitted by the Vision SDK. All the methods in the protocol are optional, so you may implement only the methods you need.

This tutorial uses two of them. The visionManager(_:didUpdateCamera:) method provides the latest state of the Camera, which includes the calibration progress and the frame size. The visionManagerDidCompleteUpdate(_:) method is called when an update cycle of VisionReplayManager completes, which is where the positions of the POIs on the screen are calculated.

Swift

extension POIDrawingViewController: VisionManagerDelegate {
    func visionManager(_: VisionManagerProtocol, didUpdateCamera camera: Camera) {
        // dispatch to the main queue in order to sync access to `Camera` instance
        DispatchQueue.main.async {
            self.camera = camera
            // you may track the calibration progress
            print("Calibration: \(camera.calibrationProgress)")
        }
    }

    func visionManagerDidCompleteUpdate(_: VisionManagerProtocol) {
        // dispatch to the main queue in order to work with UIKit elements
        // and sync access to `Camera` instance
        DispatchQueue.main.async {
            self.updatePOI(geoCoordinate: gasStationCoordinate,
                           poiView: self.gasStationView)

            self.updatePOI(geoCoordinate: carWashCoordinate,
                           poiView: self.carWashView)
        }
    }
}

Calculate POI positions

Next, you'll prepare POI data to be added to the screen. To draw a POI label on the screen, you need to take the geo coordinate (longitude, latitude) of the POI and make two coordinate transformations: first from geo coordinates to world coordinates, then from world coordinates to frame coordinates. Remember that screen coordinates represent the point on the camera frame and you'll need an extra step to translate this coordinate into your view coordinates.
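
The two-step chain described above can be sketched without the SDK. In this minimal illustration, `Geo`, `World`, `Pixel`, and `project(_:geoToWorld:worldToPixel:)` are hypothetical stand-ins, and the two closures only imitate transformations that may fail by returning nil. The real conversions are performed by the VisionManager methods shown later in this section.

```swift
// Hypothetical stand-ins for the SDK's coordinate types.
struct Geo { let lon: Double; let lat: Double }
struct World { let x: Double; let y: Double; let z: Double }
struct Pixel { let x: Double; let y: Double }

// Chains two conversions that can each fail; the result is nil if either step fails.
func project(_ geo: Geo,
             geoToWorld: (Geo) -> World?,
             worldToPixel: (World) -> Pixel?) -> Pixel? {
    return geoToWorld(geo).flatMap(worldToPixel)
}

// Fake transformations with made-up values, used for illustration only.
let fakeGeoToWorld: (Geo) -> World? = { _ in World(x: 25, y: 0, z: 0) }
let fakeWorldToPixel: (World) -> Pixel? = { world in
    world.x > 0 ? Pixel(x: 640, y: 360) : nil
}

let poi = Geo(lon: 27.675944566726685, lat: 53.94105180084251)
if let pixel = project(poi, geoToWorld: fakeGeoToWorld, worldToPixel: fakeWorldToPixel) {
    print(pixel.x, pixel.y) // 640.0 360.0
}
```

Treating each step as optional keeps the failure handling in one place, which is the same idea the guard statements in updatePOI rely on.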

Specify a list of POIs

This example uses two hard-coded POIs. In your application, you will need to generate your own list of POIs including longitude and latitude. You'll also need an image to display for each.

Here are the sample POIs included in this tutorial:

Swift

// POI coordinates for a provided session. Use your own for real-time or other recorded sessions
private let carWashCoordinate = GeoCoordinate(lon: 27.675944566726685, lat: 53.94105180084251)
private let gasStationCoordinate = GeoCoordinate(lon: 27.674764394760132, lat: 53.9405971055192)

Note

In your own application, you could use your own POI data, data from the Mapbox Geocoding API, or data from the Mapbox Streets tileset via the Mapbox Tilequery API.

Transform `GeoCoordinate` to `WorldCoordinate`

In this example, the updatePOI method takes the GeoCoordinate of the POI and the UIView instance that marks the POI on the screen.

Before drawing a POI we need to check that:

The Camera is calibrated so that we receive precise values during coordinate transformations.
The geoToWorld(geoCoordinate:) transformation is successful (a WorldCoordinate value is produced).
The POI is in the visibility range.

Swift

        guard
            // make sure that `Camera` is calibrated for more precise transformations
            let camera = camera, camera.isCalibrated,
            // convert geo to world
            let poiWorldCoordinate = visionManager.geoToWorld(geoCoordinate: geoCoordinate),
            // make sure POI is in front of the camera and not too far away
            poiWorldCoordinate.x > 0, poiWorldCoordinate.x < distanceVisibilityThreshold
        else {
            hideView()
            return
        }

Although a POI is usually described as a single point, our POI marker is a rectangular banner pointing to the geographical point from above. The rectangle is defined by its top-left and bottom-right vertices. Lift the POI marker above the ground by increasing the z component of the WorldCoordinate.

Swift

        // by default the translated geo coordinate is placed at 0 height in the world space.
        // If you'd like to lift it above the ground alter its `z` coordinate
        let worldCoordinateLeftTop =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y - poiDimension / 2,
                            z: distanceAboveGround + poiDimension / 2)

        let worldCoordinateRightBottom =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y + poiDimension / 2,
                            z: distanceAboveGround - poiDimension / 2)

Transform `WorldCoordinate` to frame coordinate

Now that you have two WorldCoordinates, you can translate them to frame coordinates. Perform both conversions in a guard statement so that no further work is executed if either transformation fails to produce a result.

Swift

        guard
            // convert the POI to the screen coordinates
            let screenCoordinateLeftTop =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateLeftTop),

            let screenCoordinateRightBottom =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateRightBottom)
        else {
            hideView()
            return
        }

The last transformation converts the marker's rectangle vertices from the camera frame space to the view space. To simplify this task, the Vision SDK provides a helper function CGPoint.convertForAspectRatioFill(from:to:) that converts a point from the original bounds to the destination bounds while respecting the aspect ratio.

Swift

        // translate points from the camera frame space to the view space
        let frameSize = camera.frameSize.cgSize
        let viewSize = view.bounds.size

        let leftTop = screenCoordinateLeftTop.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)

        let rightBottom = screenCoordinateRightBottom.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)
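
To show what an aspect-fill conversion involves, here is a minimal sketch of the usual math: scale the source so that it covers the destination, then center it. The `Size` and `Point` types and the `aspectFillConvert(_:from:to:)` function are hypothetical and written only for this illustration. This is not the SDK's implementation of convertForAspectRatioFill(from:to:), which may differ in detail.

```swift
// Hypothetical stand-ins for CGSize and CGPoint so the sketch runs anywhere.
struct Size { let width: Double; let height: Double }
struct Point { let x: Double; let y: Double }

// Maps a point from `from` bounds to `to` bounds the way aspect fill does:
// the larger of the two scale factors is used so that the source covers the
// destination, and the overflow is split evenly between opposite sides.
func aspectFillConvert(_ point: Point, from: Size, to: Size) -> Point {
    let scale = max(to.width / from.width, to.height / from.height)
    let offsetX = (to.width - from.width * scale) / 2
    let offsetY = (to.height - from.height * scale) / 2
    return Point(x: point.x * scale + offsetX, y: point.y * scale + offsetY)
}

// A 1000x500 frame shown in a 500x500 view: the frame is cropped on the left and right.
let frame = Size(width: 1000, height: 500)
let viewBounds = Size(width: 500, height: 500)

let center = aspectFillConvert(Point(x: 500, y: 250), from: frame, to: viewBounds)
print(center.x, center.y) // 250.0 250.0

let corner = aspectFillConvert(Point(x: 0, y: 0), from: frame, to: viewBounds)
print(corner.x, corner.y) // -250.0 0.0
```

The negative x value for the frame's corner shows that points near the edges of the camera frame can land outside the view, so a marker close to the frame border may be partly or fully off screen.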

Draw POIs

The only remaining step is to construct the marker view frame, set it, and display the view.

Swift

        // construct and apply POI view frame rectangle
        let poiFrame = CGRect(x: leftTop.x,
                              y: leftTop.y,
                              width: rightBottom.x - leftTop.x,
                              height: rightBottom.y - leftTop.y)

        poiView.frame = poiFrame
        view.addSubview(poiView)
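
Building a rectangle from two opposite corners can also be sketched on its own. The version below assumes the corners may arrive in either order and normalizes them so that the width and height are never negative. `Point`, `Rect`, and `rect(from:and:)` are hypothetical names used only for this illustration.

```swift
// Hypothetical stand-ins for CGPoint and CGRect.
struct Point { let x: Double; let y: Double }
struct Rect { let x: Double; let y: Double; let width: Double; let height: Double }

// Builds a rectangle from two opposite corners given in any order.
func rect(from a: Point, and b: Point) -> Rect {
    return Rect(x: min(a.x, b.x),
                y: min(a.y, b.y),
                width: abs(a.x - b.x),
                height: abs(a.y - b.y))
}

// The same rectangle results regardless of the order of the corners.
let first = rect(from: Point(x: 100, y: 50), and: Point(x: 180, y: 130))
let second = rect(from: Point(x: 180, y: 130), and: Point(x: 100, y: 50))
print(first.x, first.y, first.width, first.height)     // 100.0 50.0 80.0 80.0
print(second.x, second.y, second.width, second.height) // 100.0 50.0 80.0 80.0
```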

Final result

Swift

POIDrawingViewController


import MapboxVision
import UIKit

/**
 * "POI drawing" example demonstrates how to draw a point of interest on the screen knowing its geographical coordinates
 * and using coordinate transformation functions.
 */

// POI coordinates for a provided session. Use your own for real-time or other recorded sessions
private let carWashCoordinate = GeoCoordinate(lon: 27.675944566726685, lat: 53.94105180084251)
private let gasStationCoordinate = GeoCoordinate(lon: 27.674764394760132, lat: 53.9405971055192)

private let distanceVisibilityThreshold = 300.0
private let distanceAboveGround = 16.0
private let poiDimension = 16.0

class POIDrawingViewController: UIViewController {
    private var visionManager: VisionReplayManager!

    private let visionViewController = VisionPresentationViewController()
    private var carWashView = UIImageView(image: UIImage(named: "car_wash"))
    private var gasStationView = UIImageView(image: UIImage(named: "gas_station"))

    // latest value of a camera
    private var camera: Camera?

    override func viewDidLoad() {
        super.viewDidLoad()

        // Documents directory path with files uploaded via Finder
        let documentsPath =
            NSSearchPathForDirectoriesInDomains(.documentDirectory,
                                                .userDomainMask,
                                                true).first!
        let path = documentsPath.appending("/poi-drawing")

        // create VisionReplayManager with a path to recorded session
        visionManager = try? VisionReplayManager.create(recordPath: path)
        // register its delegate
        visionManager.delegate = self

        // configure Vision view to display sample buffers from video source
        visionViewController.set(visionManager: visionManager)
        // add Vision view as a child view
        addVisionView()
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        visionManager.start()
    }

    override func viewDidDisappear(_ animated: Bool) {
        super.viewDidDisappear(animated)
        visionManager.stop()
    }

    deinit {
        // free up VisionManager's resources
        visionManager.destroy()
    }

    private func addVisionView() {
        addChild(visionViewController)
        view.addSubview(visionViewController.view)
        visionViewController.didMove(toParent: self)
    }

    private func updatePOI(geoCoordinate: GeoCoordinate, poiView: UIView) {
        // closure that's used to hide the view if one of conditions isn't met
        let hideView = {
            poiView.removeFromSuperview()
        }

        guard
            // make sure that `Camera` is calibrated for more precise transformations
            let camera = camera, camera.isCalibrated,
            // convert geo to world
            let poiWorldCoordinate = visionManager.geoToWorld(geoCoordinate: geoCoordinate),
            // make sure POI is in front of the camera and not too far away
            poiWorldCoordinate.x > 0, poiWorldCoordinate.x < distanceVisibilityThreshold
        else {
            hideView()
            return
        }

        // by default the translated geo coordinate is placed at 0 height in the world space.
        // If you'd like to lift it above the ground alter its `z` coordinate
        let worldCoordinateLeftTop =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y - poiDimension / 2,
                            z: distanceAboveGround + poiDimension / 2)

        let worldCoordinateRightBottom =
            WorldCoordinate(x: poiWorldCoordinate.x,
                            y: poiWorldCoordinate.y + poiDimension / 2,
                            z: distanceAboveGround - poiDimension / 2)

        guard
            // convert the POI to the screen coordinates
            let screenCoordinateLeftTop =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateLeftTop),

            let screenCoordinateRightBottom =
                visionManager.worldToPixel(worldCoordinate: worldCoordinateRightBottom)
        else {
            hideView()
            return
        }

        // translate points from the camera frame space to the view space
        let frameSize = camera.frameSize.cgSize
        let viewSize = view.bounds.size

        let leftTop = screenCoordinateLeftTop.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)

        let rightBottom = screenCoordinateRightBottom.cgPoint
            .convertForAspectRatioFill(from: frameSize, to: viewSize)

        // construct and apply POI view frame rectangle
        let poiFrame = CGRect(x: leftTop.x,
                              y: leftTop.y,
                              width: rightBottom.x - leftTop.x,
                              height: rightBottom.y - leftTop.y)

        poiView.frame = poiFrame
        view.addSubview(poiView)
    }
}

extension POIDrawingViewController: VisionManagerDelegate {
    func visionManager(_: VisionManagerProtocol, didUpdateCamera camera: Camera) {
        // dispatch to the main queue in order to sync access to `Camera` instance
        DispatchQueue.main.async {
            self.camera = camera
            // you may track the calibration progress
            print("Calibration: \(camera.calibrationProgress)")
        }
    }

    func visionManagerDidCompleteUpdate(_: VisionManagerProtocol) {
        // dispatch to the main queue in order to work with UIKit elements
        // and sync access to `Camera` instance
        DispatchQueue.main.async {
            self.updatePOI(geoCoordinate: gasStationCoordinate,
                           poiView: self.gasStationView)

            self.updatePOI(geoCoordinate: carWashCoordinate,
                           poiView: self.carWashView)
        }
    }
}

Next steps

Use real-time data

When you're done testing, follow these steps to start working with real-time data.

Change the type of the visionManager variable to VisionManager.
Create and save a CameraVideoSource instance.

// create a video source obtaining buffers from camera module
cameraVideoSource = CameraVideoSource()

Create VisionManager with the created video source.

// create VisionManager with video source
visionManager = VisionManager.create(videoSource: cameraVideoSource!)

Start CameraVideoSource along with VisionManager in viewWillAppear(_:).
Stop CameraVideoSource along with VisionManager in viewDidDisappear(_:).
Provide your own POI GeoCoordinates.
