Spatial computing has been in the spotlight, with the Apple Vision Pro headset shipping earlier this year. The device extends reality based on its understanding of the user and their environment, enabling natural interaction through hand gestures, eye tracking, and speech.

Today, we share some thoughts about the intersection of spatial computing and affective computing: “computing that relates to, arises from, or deliberately influences emotions”. The goal of affective computing is to create computing systems capable of perceiving, recognizing, and understanding human emotions, and of responding intelligently, sensitively, and naturally.

Spatial computing and affective computing therefore share a broader aim: making human-computer interaction more natural, enabled by understanding user context. While both terms have been around for decades (no, really and really!), they’re becoming increasingly relevant given advances in extended reality (XR) and AI technologies. In this week’s newsletter, we’ll consider how affective computing fits into the growing overlap between XR and AI, and offer three takeaways on how to leverage affective computing to improve human-computer interaction.


Where does affective computing fit in the XR <> AI landscape?

To answer this question, we need to look across current affective computing use cases, and examine some themes. These use cases include:

Education

These use cases broadly fall into two categories – adaptive learning systems (e.g., Brainly) and emotionally-responsive educational robots (e.g., RoboKind). The former adjusts content delivery based on the learner’s emotional state, aiming to enhance engagement, motivation, and retention. The latter includes robots that interact with students and can adapt their teaching strategies to fit the emotional needs of each learner.

Healthcare and wellbeing

Emotion-aware health monitoring and therapeutic applications include wearables and mobile apps that monitor emotional states to help track mental health and stress levels (e.g., Affectiva), or provide support/coaching by adapting responses based on the user’s emotional state (e.g., Replika).

Customer service

Emotion detection has been employed in call centers to analyze vocal tones and speech patterns to adapt responses and improve customer satisfaction (e.g., Cogito). Chatbots and virtual assistants are also becoming more emotionally intelligent, offering responses that are both contextually appropriate and empathetically aligned with the user’s feelings (e.g., LivePerson).

Entertainment and media

We can imagine one step beyond current streaming-service algorithms: content recommended based on the emotional state inferred from the user’s interactions and choices. There’s also a rich literature on affective game computing, in which a game adapts to the player’s emotions (e.g., adjusting difficulty or narrative elements to provide a more immersive experience).

Security

Systems can use emotional recognition as an additional layer of security, identifying potential threats based on emotional cues (e.g., Nviso).


Intertwined with AI

A common theme across use cases is the concept of ambiently anticipating and adapting to user needs, using the emotional cues people already exhibit to tailor a user experience or initiate a response from a computer system. In an affective computing system, emotions become the user interface.

Compare this with a spatial computing system. Today, headsets like Apple Vision Pro allow us to interact with the device the same way we interact with non-digital objects. For instance, “gaze at something you are interested in,” something we’ve been doing since long before computers, becomes “gaze at the interface element you want to select” in the headset. The user experience is based on the system’s understanding of both the user and their environment. As XR becomes further intertwined with AI, this depth of understanding and intuitiveness is likely to increase.

Ambient, anticipatory, and adaptive

Let’s consider an example of what affective computing might look like in a future where XR and AI have more thoroughly overlapped:

Imagine you’re stressed, working on a big deliverable. Your spatial computing headset (or glasses) can detect that stress based on your speech, body movement, and physiological signals. The device offers to create a customized, relaxing digital cocoon, subtly surfacing options to instantly adapt lighting, interface layout, and audio to help you get your work done.

A defining aspect of this example is that the system ambiently senses the user’s emotional cues. Once “stress” is detected, the system anticipates the user’s need, offering to adapt the user interface (which in this case is highly immersive, leveraging audio and visuals). Note there is still an opportunity for the user to provide input. Affording user autonomy and the opportunity to give feedback to the system are key to avoiding false inputs and user frustration, especially until the system learns the user’s preferences (assuming the user gives it permission to do so).
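As a rough illustration, here is a minimal sketch in Python of how such an ambient, confirm-before-adapting loop might be structured. All names, thresholds, and the simple averaging step are purely hypothetical; this is not any vendor’s API, just one way the “suggest, then confirm” pattern could be wired together:

```python
from dataclasses import dataclass

@dataclass
class EmotionEstimate:
    label: str         # e.g., "stress" or "neutral"
    confidence: float   # 0.0 - 1.0

def estimate_stress(speech: float, movement: float, physiology: float) -> EmotionEstimate:
    # Placeholder fusion: a simple average of per-modality stress scores.
    score = (speech + movement + physiology) / 3
    return EmotionEstimate("stress" if score > 0.6 else "neutral", score)

def maybe_adapt(estimate: EmotionEstimate, confirm) -> str:
    # Anticipate the need, but keep the user in the loop before changing anything.
    if estimate.label == "stress" and estimate.confidence > 0.7:
        if confirm("You seem stressed. Dim lighting and simplify the layout?"):
            return "focus_cocoon"   # apply the relaxing preset
        return "declined"           # record the refusal to refine future suggestions
    return "no_action"

if __name__ == "__main__":
    est = estimate_stress(speech=0.8, movement=0.7, physiology=0.9)
    print(maybe_adapt(est, confirm=lambda prompt: True))  # prints "focus_cocoon"
```

The confirm callback is the key design choice here: the system suggests rather than imposes, which is exactly the autonomy point above.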

There’s also an inherent theme here of multimodality. Multiple modalities coexist in our social environment, and humans cross-reference between modalities in real time to understand others’ emotional states (e.g., whether speech matches body movements). This is similar to multisensory perception and cognition, in which we integrate stimuli from different senses (e.g., vision, hearing, touch) to make sense of the world. In the example described above, affective and spatial computing come together, powered by AI, to fuse these multimodal signals into an input to which the system responds.
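To make the fusion idea concrete, here is a small sketch of a late-fusion baseline in which each modality contributes a probability distribution over emotional states and the system combines them with per-channel weights. The emotion labels, readings, and weights are made up for illustration:

```python
import numpy as np

EMOTIONS = ["calm", "stress", "frustration"]

def fuse(per_modality: dict, weights: dict) -> dict:
    # Weighted average of per-modality probability distributions (late fusion).
    total = np.zeros(len(EMOTIONS))
    for name, probs in per_modality.items():
        total += weights.get(name, 1.0) * np.asarray(probs)
    total /= total.sum()
    return dict(zip(EMOTIONS, total.round(3)))

if __name__ == "__main__":
    readings = {
        "speech":     [0.2, 0.6, 0.2],  # vocal tone suggests stress
        "body":       [0.1, 0.7, 0.2],  # posture and movement agree
        "physiology": [0.3, 0.5, 0.2],  # heart rate is elevated
    }
    print(fuse(readings, weights={"speech": 1.0, "body": 1.0, "physiology": 0.5}))
```

A production system would likely learn these weights and cross-check modalities for disagreement rather than fixing them by hand, but the underlying idea, multiple channels resolved into one emotional input, is the same.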


Takeaways

If you’re building experiences in the realm of spatial and affective computing, here are three takeaways to consider:

  • Focus on accurately recognizing and fusing multimodal emotional cues to provide responses that are helpful, genuine, and unobtrusive: Aim for a user experience that is more intuitive, responsive, and personalized than the user’s current digital experiences, approximating non-digital interaction as much as possible. This involves iterative cycles of testing and learning from your customers, and benchmarking value against both existing technologies and non-digital experiences (e.g., if there’s a chatbot element to the experience, gather feedback on how users would compare it to a conversation with a person).
  • Respect user autonomy: While you want to build an unobtrusive solution, remember that you also need to give the user control and the opportunity to provide feedback before adapting the user experience. This may become less necessary over time, if or when the user chooses to share their preferences with the system. However, start by allowing the user to confirm intent and adaptations before they’re implemented.
  • Promote well-being, for both the user and society: Affective computing use cases, especially when combined with XR, can quickly veer into Black Mirror territory. Stay focused on what value you’re aiming to deliver to the user, and build for user privacy and trust. Leverage ethical design toolkits such as Tarot Cards of Tech or IDEO’s AI Ethics Cards to pressure test your ideas and technology during product development.

So there you have it. Stay tuned for more commentary and analysis at this intersection of spatial and affective computing. Meanwhile, additional reading can be found in Rony Abovitz’s recent article, The State of Play in Spatial Computing/XR in 2024. In it, the Magic Leap and Sun & Thunder founder describes our status on the path to a north-star XR experience called “XR Infinity,” where we develop XR tools that bend to meet people in context, rather than the other way around.

Stef Hutka, Ph.D. is head of design research at Sendful.

