Object detection in Unity using the HoloLens

3 minute read

Published:

The YoloDetectionHoloLens sample enables object detection through the YOLO framework by streaming of sensor frame data from the HoloLens (client) to a companion PC (server) using Windows Sockets.

Desktop (server)HoloLens (client)

Specifically, a TCP (Transmission Control Protocol) socket is used to enable low-level network data transfers (in both directions) for this long-lived connection. To allow for a UWP device to connect to a TCP socket, we need to declare app capabilities for Private Networks (client & server) or Internet (client & server) in both the Windows Desktop application (server) as well as the UWP application (client).

In the Package.appxmanifest file of C++ Windows Desktop application and the C# Unity IL2CPP output build, add:

<Capability Name="internetClientServer" />
<Capability Name="privateNetworkClientServer" />

Simple client-server socket

More detail into implementation coming soon!

  • Desktop streamer header contains code which creates a header which indicates the size of the outgoing data.

  • Desktop streamer contains a sample to send the outgoing bounding box data as an interface vector of custom WinRT objects (bounding boxes).

  • Device receiver allows for the HoloLens to listen for and receive incoming WinRT objects (bounding boxes). These are then formatted and displayed to the user.

Running the sample

The following instructions provide the necessary steps to build the YoloDetectionHoloLens sample.

YoloDetectionHoloLensUnity project

This project houses the code to enable media frame source groups, send sensor frames, receive bounding boxes from the companion PC and display the returned bounding boxes.

  1. Open HoloLensForCV sample in VS2017 and install included OpenCV nuget package to HoloLensForCV project
  2. Build the HoloLensForCV project (x86, Debug or Release)
  3. Copy all output files from HoloLensForCV path (dlls and HoloLensForCV.winmd) to the Assets->Plugins->x86 folder of the YoloDetectionHoloLensUnity project
  4. Open YoloDetectionHoloLensUnity Unity project, enter the IP address of your desktop PC into the relevant field on the script holder game object

unity sample

  1. Under Unity build settings, switch the platform to Universal Windows Platform and adjust relevant settings. Build project using IL2CPP
  2. Open VS solution from Unity build, build then deploy to device

ComputeOnDesktopYolo project

This project receives sensor frames from the connected HoloLens client, processes them using the Tiny YOLOV2 ONNX model. ONNX (Open Neural Network Exchange) provides an open source format for AI models and allows easy interoperability between different frameworks.

  1. Open the HoloLensForCV sample and build the ComputeOnDesktopYolo project (x64, Debug or Release)
  2. Deploy project to Local Machine. The sample should open a window which looks similar to the below figure

yolo console app

Combining the samples to process HoloLens PV camera frames

  1. Ensure the HoloLens and PC are on the same network and the HoloLens is currently running the YoloDetectionHoloLensUnity sample
  2. When the deployed ComputeOnDesktopYolo desktop app opens, enter the IP of the HoloLens to connect and click connect
  3. Sensor frames from the HoloLens should begin to stream to the desktop and appear in the app window

yolo console app with connected client

  1. When prompted on the HoloLens, perform the double tap gesture to connect to the host socket and begin receiving information about detected objects in streamed frames. Bounding boxes and updated on screen text should appear within the field-of-view of the HoloLens as below.

Note that this frame was taken at a different instance than the above frame from the desktop streaming, leading to the discrepancy in bounding box sizes.

yolo hololens