Eye Tracking in VR
It's time for the last piece of the puzzle: using real gaze data provided by an eye tracker integrated into a VR headset.
Reuse or Rebuild
We can reuse our previous Unity VR projects to save time.
If they do not exist already, start by creating a Scripts folder in Assets, as well as a Resources folder containing the audio sample pop.mp3 that we used previously.
Then create a new scene for this module.
Create a Plane 3D object as the floor of the scene and name it Floor.
Gaze Tracking Setup
Go to the instructions for your headset:
Instructions for the Pico
To use eye tracking with a Pico 4 headset specifically, we need to enable the feature in the PXR_Manager component, which should be located on the XR Origin (XR Rig) object. If it does not exist, add it by clicking on “Add Component” at the bottom of the inspector. In the component, tick the option enabling eye tracking.
Create a new script named EyeTrackingManager that you will fill with the code below.
This script retrieves eye data (position and direction) and transforms it to be relative to the global world, rather than to the camera.
using UnityEngine;
using Unity.XR.PXR;

public class EyeTrackingManager : MonoBehaviour
{
    public Transform Origin;

    private Vector3 combineEyeGazeVector;
    private Vector3 combineEyeGazeOrigin;
    private Matrix4x4 originPoseMatrix;

    public Vector3 combineEyeGazeVectorInWorldSpace;
    public Vector3 combineEyeGazeOriginInWorldSpace;

    private Collider lastHit;

    void Start()
    {
        combineEyeGazeVector = Vector3.zero;
        combineEyeGazeOrigin = Vector3.zero;
    }

    void Update()
    {
        originPoseMatrix = Origin.localToWorldMatrix;

        // Query the combined (binocular) gaze direction and origin
        PXR_EyeTracking.GetCombineEyeGazeVector(out combineEyeGazeVector);
        PXR_EyeTracking.GetCombineEyeGazePoint(out combineEyeGazeOrigin);

        // Translate eye gaze point and vector to world space
        combineEyeGazeOriginInWorldSpace = originPoseMatrix.MultiplyPoint(combineEyeGazeOrigin);
        combineEyeGazeVectorInWorldSpace = originPoseMatrix.MultiplyVector(combineEyeGazeVector);
    }
}
Assign the "Main Camera" to the Origin
variable in the inspector.
These data (combineEyeGazeOriginInWorldSpace
and combineEyeGazeVectorInWorldSpace
) are updated every frame because the code is called in Update
. They are also available to the rest of your code for implementing interactions or saving to a file.
Attach the script to the Floor object so it runs in your scene.
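Since the two world-space fields are public, any other script can read them, for example to save the gaze to a file as mentioned above. Here is a minimal sketch of such a logger; the GazeLogger name, the CSV format, and the file location are our assumptions, not part of the tutorial. Assign the EyeTrackingManager component to its field in the inspector:
using System.IO;
using UnityEngine;

// Hypothetical example: logs the world-space gaze data exposed by
// EyeTrackingManager to a CSV file, one row per frame.
public class GazeLogger : MonoBehaviour
{
    public EyeTrackingManager eyeTrackingManager; // assign in the inspector
    private StreamWriter _writer;

    void Start()
    {
        // persistentDataPath is writable on standalone headsets
        string path = Path.Combine(Application.persistentDataPath, "gaze.csv");
        _writer = new StreamWriter(path);
        _writer.WriteLine("t;ox;oy;oz;dx;dy;dz");
    }

    void Update()
    {
        Vector3 o = eyeTrackingManager.combineEyeGazeOriginInWorldSpace;
        Vector3 d = eyeTrackingManager.combineEyeGazeVectorInWorldSpace;
        _writer.WriteLine($"{Time.time};{o.x};{o.y};{o.z};{d.x};{d.y};{d.z}");
    }

    void OnDestroy()
    {
        _writer?.Close();
    }
}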
Instructions for the Vive Focus
To use eye tracking with a Vive Focus Vision (specifically) you will first need to make sure that you followed the instructions on this page to install the Vive OpenXR package for Unity.
Now we need to enable the eye tracking feature for the project. Open "Project Settings" in the "Edit" menu, and navigate to "OpenXR" under "XR Plug-in Management" in the left tab panel. Make sure that you find the items listed in the image below in the "Enabled Interaction Profiles" box, and toggle "Vive XR Eye Tracker (Beta)" within the "OpenXR Feature Groups" box.
Warning
Be aware that for some reason "Vive XR Eye Tracker (Beta)" often untoggles itself when you quit and restart your Unity project. Remember to check it and toggle it back if you don't get any eye tracking data from the headset!
Create a new script named EyeTrackingManager that you will fill with the code below. This script retrieves eye data (position and direction) and transforms it to be relative to the global world, rather than to the camera.
Eye tracking data frame of reference (for the Vive Focus Vision)
For this particular device the gaze position and rotation are not relative to the head position and orientation, as is often the case (for example on the HTC Vive Pro Eye).
They are instead relative to the "XR Rig". The XR Rig is a prefab containing all of the items useful for a VR camera and controllers. It moves and rotates when you use the controllers to apply translations and rotations, or to teleport.
using System;
using System.Threading;
using UnityEngine;
using VIVE.OpenXR;
using VIVE.OpenXR.EyeTracker;
using VIVE.OpenXR.Raycast;

public class EyeTrackingManager : MonoBehaviour
{
    public XrSingleEyeGazeDataHTC leftGaze;
    public XrSingleEyeGazeDataHTC rightGaze;
    public bool isEmpty;

    private Collider lastHit;
    public Transform originXRRig;

    void Update()
    {
        // Query the latest gaze data for both eyes
        XR_HTC_eye_tracker.Interop.GetEyeGazeData(out XrSingleEyeGazeDataHTC[] gazes);
        leftGaze = gazes[(int)XrEyePositionHTC.XR_EYE_POSITION_LEFT_HTC];
        rightGaze = gazes[(int)XrEyePositionHTC.XR_EYE_POSITION_RIGHT_HTC];
    }
}
Let's visualise your gaze before going further. Add the following property (class variable) to the class:
public Transform gazeOriginTrans;
And the following at the end of Update():
gazeOriginTrans.position = originXRRig.position + leftGaze.gazePose.position.ToUnityVector();
gazeOriginTrans.rotation = originXRRig.rotation * leftGaze.gazePose.orientation.ToUnityQuaternion();
In Unity create an empty object (its transform information doesn't matter) and name it "GazeOrigin".
Add a sphere as a child of "GazeOrigin", give it position (0, 0, 2) and scale (0.1, 0.1, 0.1).
Add the "EyeTrackingManager" component to the "GazeOrigin" object. In the inspector, locate the "EyeTrackingManager" component, then drag-and-drop the "XR Origin (XR Rig)" object into the field named "originXRRig", and the "GazeOrigin" object onto "gazeOriginTrans".
The code above gets the latest eye tracking data and places it in the frame of reference of the XR Rig. Because the sphere is a child of "GazeOrigin" positioned 2 m ahead of it, while the gaze position and rotation are applied to its parent, the sphere ends up exactly 2 m along the line where the left gaze is predicted to fall.
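To make the geometry explicit, here is the same computation written out directly, without relying on the parent-child relationship (illustration only, you do not need to add it to the script):
// Equivalent direct computation of the sphere's world position:
// start from the gaze origin expressed in the XR Rig's frame of
// reference, then move 2 m along the rotated forward axis.
Vector3 gazeOrigin = originXRRig.position + leftGaze.gazePose.position.ToUnityVector();
Quaternion gazeRotation = originXRRig.rotation * leftGaze.gazePose.orientation.ToUnityQuaternion();
Vector3 spherePosition = gazeOrigin + gazeRotation * Vector3.forward * 2f;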
Simulating gaze
If you do not have a headset handy, or have one without eye tracking capability, you can model the gaze as a line extending from the forehead in the direction faced by the head.
In that scenario, follow the script from the subsection above, but assign the camera object to the field named "originXRRig" (the camera is a child of the Rig, easy to find). The content of Update() above then changes to:
gazeOriginTrans.position = originXRRig.position;
gazeOriginTrans.rotation = originXRRig.rotation;
The script now simply samples the camera position and rotation, which are updated in real time as you move with the headset, and simulates gaze as a line extending from the center of your head forward into the scene.
Gaze Visualization
We will now create cubes that change color and size when they are looked at.
This will be done similarly to our Cube Factory: new cubes will be created and equipped with the necessary functionality.
Cube Animation
We don't want the cubes to simply disappear for now as we did with the cube factory, but rather to change color (suddenly, then gradually).
Create a new script in the Scripts folder named AnimateOnGaze.cs (the file name must match the class name):
using System.Collections;
using UnityEngine;

public class AnimateOnGaze : MonoBehaviour
{
    public bool isColliding;
    public Color[] colors;

    private Material _material;
    private float _animationDuration = .75f;
    private Vector3 scaleStart = new Vector3(.15f, .15f, .15f);
    private Vector3 scaleEnd = new Vector3(.35f, .35f, .35f);

    private void Start()
    {
        _material = GetComponent<MeshRenderer>().material;
        _material.color = colors[0];
    }

    public void OnGazeEnter()
    {
        StartCoroutine(Animate());
    }

    public void OnGazeExit()
    {
        StopAllCoroutines();
        transform.localScale = scaleStart;
    }

    private IEnumerator Animate()
    {
        float animationTime = _animationDuration;
        // Interpolate between color one and two
        while (animationTime >= 0)
        {
            animationTime -= Time.deltaTime;
            _material.color = Color.Lerp(colors[1], colors[0], 1 - animationTime / _animationDuration);
            transform.localScale = Vector3.Lerp(scaleStart, scaleEnd, 1 - animationTime / _animationDuration);
            yield return null; // Wait for next frame
        }
    }
}
The IEnumerator Animate() will count down from the duration specified in _animationDuration while lerping (interpolating) between the first two colors in its colors array, and assign the result to the object's material.
Similarly, the cubes will grow progressively from a start size to a larger end size.
At this point the OnGazeEnter and OnGazeExit functions will never be called. We will trigger these events manually when the gaze falls on a cube.
Triggering events manually
The code below is an example of manually triggering methods in another MonoBehaviour script. It is not always possible to set up trigger callbacks in the editor, for example because the object in question does not exist yet.
In that case we take care of everything at runtime.
Add the following at the end of the Update function in the EyeTrackingManager script (if you followed the Pico instructions, use combineEyeGazeOriginInWorldSpace as the ray origin and combineEyeGazeVectorInWorldSpace as its direction instead of gazeOriginTrans):
RaycastHit hit;
// Cast a ray from the gaze origin along the gaze direction
if (Physics.Raycast(gazeOriginTrans.position, gazeOriginTrans.rotation * Vector3.forward, out hit, Mathf.Infinity))
{
    if (hit.collider.name == "CollidableCube")
    {
        print("Gaze hit");
        if (!lastHit)
        {
            // The gaze entered a cube while resting on nothing before
            hit.collider.gameObject.SendMessage("OnGazeEnter", SendMessageOptions.DontRequireReceiver);
        }
        else if (lastHit != hit.collider)
        {
            // The gaze jumped directly from one cube to another
            hit.collider.gameObject.SendMessage("OnGazeEnter", SendMessageOptions.DontRequireReceiver);
            lastHit.gameObject.SendMessage("OnGazeExit", SendMessageOptions.DontRequireReceiver);
        }
        lastHit = hit.collider;
        return;
    }
}
// The gaze no longer rests on a cube: notify the last cube hit
if (lastHit != null)
{
    lastHit.gameObject.SendMessage("OnGazeExit", SendMessageOptions.DontRequireReceiver);
    lastHit = null;
}
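As a side note, SendMessage looks the method up by name at runtime, so a typo is only caught when the game runs. A more type-safe alternative, sketched below under our own assumptions (the IGazeTarget interface is not part of the tutorial), is to have gaze-reactive scripts such as AnimateOnGaze implement a common interface and call it directly:
// Hypothetical alternative to SendMessage: an interface that every
// gaze-reactive script (e.g., AnimateOnGaze) could implement.
public interface IGazeTarget
{
    void OnGazeEnter();
    void OnGazeExit();
}

// The SendMessage calls would then become direct, compiler-checked calls:
if (hit.collider.TryGetComponent(out IGazeTarget target))
{
    target.OnGazeEnter();
}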
Hide the gaze sphere
At this point you can hide the sphere we used to show the gaze. You know how to calculate the necessary position and rotation of the gaze.
We use the Physics.Raycast method to determine whether an object intersects with the gaze. This method amounts to placing a laser at the position of the eye, pointing in the direction of the gaze; the ray of this laser extends infinitely, and the Physics.Raycast function returns the object (Collider) intersected by the gaze if there is an intersection.
Other ways to model gaze
In a more serious context we would model gaze differently, to account for binocular vergence for one (see this article), and for the fact that gaze is not a Dirac delta function (a straight line in space) but a cone representing foveal vision.
- In the first case, one would compute the point of vergence before placing a sphere at its location. The sphere has a predefined radius in the field of view (e.g., 2°); anything within that sphere is said to be "in view".
- In the second case, one can create a cone object whose aperture is constant in the field of view (e.g., 4°), with the help of a bit of trigonometry. Anything intersecting with that cone is said to be "in view" (a sketch follows this list).
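Here is a minimal sketch of the second approach. Instead of building an actual cone object, it expresses the same test as an angular comparison; the GazeCone name and the default 4° aperture are illustrative assumptions:
using UnityEngine;

// Hypothetical helper: treats gaze as a cone of constant aperture
// rather than an infinitely thin ray. An object's centre is "in view"
// when the angle between the gaze direction and the direction from
// the eye to the object is below half the cone's aperture.
public static class GazeCone
{
    public static bool IsInView(Vector3 gazeOrigin, Vector3 gazeDirection,
                                Vector3 objectPosition, float apertureDegrees = 4f)
    {
        Vector3 toObject = objectPosition - gazeOrigin;
        return Vector3.Angle(gazeDirection, toObject) <= apertureDegrees / 2f;
    }
}
Note that this only tests the object's centre, whereas a true cone object would catch any geometry intersecting it.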
Creating Cubes for Gaze Visualization
Our floor will now act as a sort of cube factory. Create a new script attached to the GameObject "Floor" called ProtocolVisualiseGaze:
using System.Collections;
using UnityEngine;
using Random = UnityEngine.Random;

public class ProtocolVisualiseGaze : MonoBehaviour
{
    private static void CreateInteractiveCube(Vector3 position, Quaternion rotation, Color col1)
    {
        GameObject cubeGo = GameObject.CreatePrimitive(PrimitiveType.Cube);
        cubeGo.name = "CollidableCube";
        Transform cubeTrans = cubeGo.transform;
        cubeTrans.position = position;
        cubeTrans.rotation = rotation;
        cubeTrans.localScale *= .15f;
        AnimateOnGaze cubeCollChk = cubeGo.AddComponent<AnimateOnGaze>();
        cubeCollChk.colors = new[]
        {
            col1,
            new Color(1f - col1.r, 1f - col1.g, 1f - col1.b)
        };
    }
}
CreateInteractiveCube() creates cubes with a given position, rotation, and color, and equips them with the AnimateOnGaze component. It names them "CollidableCube" (the name the raycast above checks for) and assigns two colors to their AnimateOnGaze component: one passed to the CreateInteractiveCube() function via the col1 parameter, and its opposite via the expression new Color(1f - col1.r, 1f - col1.g, 1f - col1.b).
Now add an IEnumerator Start() to this script to execute the cube generation with the code below:
IEnumerator Start()
{
    // Create floating cubes in a square formation around the room's origin
    Vector2[] moveVec = new[]
    {
        new Vector2(0, -1),
        new Vector2(1, 0),
        new Vector2(0, 1),
        new Vector2(-1, 0),
    };
    Vector3 startPos = new Vector3(1.8f, 1.6f, 1.8f);
    for (int iBorder = 0; iBorder < 4; iBorder++)
    {
        // Rotate the starting corner by 90 degrees for each border
        float tmpVal = startPos.x;
        startPos.x = -startPos.z;
        startPos.z = tmpVal;
        for (int iCube = 0; iCube < 4; iCube++)
        {
            Vector3 position = startPos;
            position.x += moveVec[iBorder].x * (3.6f / 4f * iCube);
            position.z += moveVec[iBorder].y * (3.6f / 4f * iCube);
            CreateInteractiveCube(position, Random.rotation, Random.ColorHSV());
            yield return new WaitForSeconds(.1f);
        }
    }
}
Although this function seems long, it is quite simple in what it does: it creates cubes around the centre of the room, arranged along a square, at a rate of one every 0.1 s.
Save the code and try it. See if you can activate the cubes just by looking at them.
If Looks Could Destroy
Instead of animating our cubes by looking at them, let's destroy them as we did with the excess output from our Cube Factory.
Create a new script called DestroyOnCollide in our Scripts folder:
using System.Collections;
using UnityEngine;

public class DestroyOnCollide : MonoBehaviour
{
    private AudioSource _audioSource;

    private void Start()
    {
        _audioSource = gameObject.AddComponent<AudioSource>();
        _audioSource.playOnAwake = false;
        _audioSource.clip = Resources.Load<AudioClip>("pop");
    }

    private void OnGazeEnter()
    {
        StartCoroutine(PlayAudioThenDestroy());
    }

    private IEnumerator PlayAudioThenDestroy()
    {
        print($"Destroyed {name}");
        // Hide object
        Destroy(GetComponent<MeshRenderer>());
        // Delete collider component to prevent calling this coroutine twice
        Destroy(GetComponent<Collider>());
        // Play pop sound
        _audioSource.Play();
        yield return new WaitUntil(() => !_audioSource.isPlaying);
        // Actually destroy the object now
        Destroy(gameObject);
    }
}
Its structure should no longer be new at this point.
Return to our ProtocolVisualiseGaze script and modify its CreateInteractiveCube() function to look like this:
private static void CreateInteractiveCube(Vector3 position, Quaternion rotation, Color col1)
{
    GameObject cubeGo = GameObject.CreatePrimitive(PrimitiveType.Cube);
    cubeGo.name = "CollidableCube";
    Transform cubeTrans = cubeGo.transform;
    cubeTrans.position = position;
    cubeTrans.rotation = rotation;
    cubeTrans.localScale *= .15f;
    cubeGo.AddComponent<DestroyOnCollide>();
    cubeGo.GetComponent<MeshRenderer>().material.color = col1;
}
Save the code, start the game, and make some pops!
Challenge
Create an event sequence based on eye tracking:
- The participant is in a scene containing only a small sphere floating at eye level.
- The user must look at this target (the small sphere) for 500 ms to trigger the appearance of a complex scene (the sphere disappears); a dwell-timer sketch is given after this list.
- The complex scene contains elements from the Furniture Kit used at the very beginning of our workshop.
- Give your participant 20 seconds to observe the scene, then hide it again.
- Display on a panel the three most looked-at objects along with their respective observation durations.
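As a starting point for the 500 ms condition, here is a minimal dwell-timer sketch. The DwellTrigger name, the UnityEvent field, and the assumption that your gaze raycast sends the OnGazeEnter/OnGazeExit messages to the sphere (adapt the name check accordingly) are all ours, not part of the tutorial:
using UnityEngine;
using UnityEngine.Events;

// Hypothetical sketch: attach to the small sphere. Fires once after
// the gaze has rested on it continuously for dwellSeconds.
public class DwellTrigger : MonoBehaviour
{
    public float dwellSeconds = 0.5f;
    public UnityEvent onDwellComplete; // e.g., reveal the complex scene

    private float _gazeTime;
    private bool _gazed;
    private bool _fired;

    private void OnGazeEnter() { _gazed = true; }

    private void OnGazeExit()
    {
        // Reset the timer whenever the gaze leaves the target
        _gazed = false;
        _gazeTime = 0f;
    }

    private void Update()
    {
        if (_fired || !_gazed) return;
        _gazeTime += Time.deltaTime;
        if (_gazeTime >= dwellSeconds)
        {
            _fired = true;
            onDwellComplete.Invoke();
        }
    }
}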
Good luck.