The enterprise metaverse, cognitive science, and interpersonal communication

thinking
flowering

last tended

December 20, 2023

Over the past six months I’ve had the opportunity to support several different metaverse-centric events at Avanade. These internal events have ranged from conceptual presentations and workshops to networking events and pride celebrations, primarily taking place in virtual reality (VR). Much of this effort has been in the name of experimentation, to explore what this technology can mean for our business as well as our clients’. Because of my involvement in this work, I’ve also become a point of contact for questions from co-workers and friends alike about what this all means. What I’ve found is that we’re all asking similar questions, and for good reason. Why the metaverse? What’s the point? Is VR a gimmick? Is Marky Z just at it again? With that in mind, I wanted to share some of my learnings and the approach I’ve been taking to understand whether there is value behind the hype.

Before diving in, I do want to clarify how I have been framing the metaverse in my mind. The Metaverse as a singular, comprehensive form that is a persistent and interoperable connection between the physical, the virtual, and everything in between does not exist today. This definition is just an early example of how some are envisioning the future. How we define and understand the metaverse will evolve significantly, but that’s the point: it’s a far-off destination that we’re imagining as our potential future state. Today, when we see news of different companies “entering the Metaverse,” it’s not a reference to that envisioned future happening right now, but more of a signaling term. For the most part, when businesses and others discuss the Metaverse, or metaverses, it’s in reference to the many technologies that will be pivotal in making any future realization of the Metaverse possible. When hearing about metaverse strategies, I have been considering them to be “metaverse readiness,” as firms are experimenting with one or more of these technologies to see how they can best leverage them moving forward. The most popular technologies featured in recent media have been examples of how brands might use extended reality and NFTs for consumer experiences.

As a User Experience designer, I have undoubtedly been interested in this area as many of the technologies that are necessary to make the metaverse possible also mean an entirely new way for humans to interact with technology, and an opportunity for designers to reimagine what those experiences will be. Even with the possibilities, it is vital that when we work with emerging technologies it is with the greatest level of consideration. We must only implement with clear intentions and the objectives of creating value while prioritizing the needs and safety of the humans that will be impacted. This is also why I am proud to work for a company like Avanade, where we prioritize our mission to make a genuine human impact and focus on cutting through the hype.

When referencing metaverse technologies there is a wide array of what that could mean: web3, blockchain, crypto, NFTs, smart contracts, IoT, edge computing, digital twins, extended reality, haptics and much more. The focus of my recent learnings has been centered on Extended Reality (XR) and the prophesied future experience layer of technology. Knowing that all experiences are grounded in how we feel, think, and behave, the approach I’ve been taking begins with a review of what research has revealed about our minds, to better define how XR may relate.

I’ve been exploring a number of use cases for XR, looking at the “enterprise metaverse” specifically (meaning applications of these technologies in the workplace), and the one I’ve decided to center on first is workplace experience and the case for hybrid/remote work. It has been one of the most topical in recent history and is the most accessible as an introduction to metaverse technologies. Over the past two years we’ve seen a great deal of change in how we live and use technology, and with those changes a conversation has been sparked about whether remote or in-person work is more effective. Taking a step back from focusing on where the work gets done, it’s important to consider what makes an organization effective. Just by doing a quick reflection on your own experiences, as I did on mine, you may quickly come to the realization that communication and collaboration are at the top of the list when it comes to organizational success. Considering this, it’s worthwhile to reframe our question to understand why one is more effective than the other, acknowledging that many of the differences between in-person and remote work revolve around the channels we use to communicate[1]. This is where XR begins to fit into the conversation, as it’s proposed to be the next communication channel in the workplace. Naturally, many begin to ask why this should replace in-person communication, or why it is better than a Zoom call.

A review of research on communication reveals that in-person, face-to-face communication is the most efficient communication channel for reasons related to the number of non-verbal cues available and how we have evolved to communicate. However, digitally mediated communication has been found to be just as effective as in-person communication with the right preparation and inputs[2]. Beyond traditional communication channels, recent findings tell us that the affordances offered by XR closely align with many of the factors that make in-person communication effective, and even allow for the opportunity to improve upon some of the inefficiencies we see in communication today. This is not to say that XR tech as it exists today is a superior replacement for all other communication channels, but that it poses enough of an established opportunity to warrant investment and further exploration. Below we’ll break down why by looking at what researchers know about interpersonal communication and the communication channels we use today, as well as what is known about XR and how it relates to what we know about our cognitive processes.

Communication

Communication is defined as the transfer of meaning— it’s the transmitting of information and common understanding from one person to another, including anything from thoughts and ideas to emotions and understanding[3]. The methods and channels of communication are nearly endless, spanning both verbal and non-verbal behaviors. When it comes to verbal and non-verbal communication, the most important distinction to understand is that verbal communication is linguistic and transfers explicit content, whereas non-verbal communication encompasses all of the other perceived behaviors that are subject to audience interpretation[4]. 

In-person communication

Language is one of our most incredible strengths as a species, but it is not the only way we transfer meaning, and it wasn’t even the first form of communication we used. Humans have evolved to leverage multiple communication channels, collecting information from a variety of inputs, including non-verbal behaviors, to better understand meaning, intention, and emotion. This variety of input channels is also what makes in-person communication the most effective communication channel to this day. Beyond the linguistic transfer of explicit meaning, humans have relied on the ability to “mind read,” which is to interpret what others are thinking based on their actions and words, to more quickly understand intent and establish relationships and interpersonal trust[5]. When in-person communicators are able to observe behaviors such as eye contact, facial expression, body language, and tone, among other non-verbal cues, they are able to more quickly establish “a willingness to be vulnerable, based on positive expectations about the actions of others”[6]. In messages conveying feelings or attitude specifically, the successful transfer of meaning can become more than 90% reliant on non-verbal behaviors such as the way words are delivered, facial expressions, and other cues[7]. This goes to show that for communication needs that are particularly reliant on tacit knowledge, trust, or emotional communication (things we may see in the workplace as intuition, conflict resolution, creativity, and collaboration), in-person communication is the most effective, as it allows for the quickest collection of the greatest number of relevant inputs and context to enable quick judgments.

Computer mediated communication

Information and communication technologies have advanced significantly over the past several decades. While these digitally mediated communication channels did not replace as much business-related travel as many had predicted, they have become vital to the efficiency of communication in the workplace[8]. From email and instant messaging to audio and video calls, we can break down the strengths and weaknesses of each to understand why they may not have been as effective at replacing in-person communication as previously predicted.

Text-based computer mediated communication (CMC)

Text-based communication includes the all-too-familiar email, instant messaging, and online text tools. These messages primarily contain linguistic content and are a great tool for conveying explicit and clear task-like action items or information. Text-based communication is seen as the least effective communication channel when it comes to tacit knowledge or interpersonal communication needs, as it offers the least amount of additional context beyond the explicit linguistic information, making it a slower process to form judgments and establish trust. That’s not to say that text-based communication can never be as effective as face-to-face or even in-person communication. As Social Dynamic Media Theorists have noted, “barrier to efficient information and communication technology use may have to do with improper training or lack of clearly defined goals resulting from the culture of organizations and social habits”[9]. This does mean there is a lot more upfront work to define parameters for use and to onboard users properly so they adapt language, style, and other cues to facilitate more comprehensive meaning transfer (e.g., emoticon and GIF usage). Other realized benefits of CMC over in-person communication include reduced hierarchies, more equal participation, and the ability to contribute ideas simultaneously via chat in video calls[10].

“Face to face” computer mediated communication (CMC)

Video and audio digitally mediated communication have long been considered the next best synchronous CMC option when unable to meet in person. The benefits of these channels primarily stem from increased access to non-verbal behaviors such as facial expressions and vocal tone. Leveraging primarily audio and video communication during our forced transition to remote work, we even saw an increase in productivity early on. However, due to a variety of compounding factors, as noted by Microsoft in its annual Work Trend Index, employees were ultimately found to work longer and experience more burnout with the current tools available for digitally mediated communication[11].

A fair amount of this fatigue can be attributed to the inappropriate use of CMC channels for more nuanced and contextual communication needs like conflict resolution, creativity, and collaboration. Even though video and audio offer greater access to non-verbal behaviors, they still lack the full context we are used to, such as body language, eye contact, shared gaze, and spatial presence. Having access to less information increases the time it takes to establish trust and shared understanding[12]. Beyond the reduced amount of information, Riedl presents a comprehensive theory on the root causes of Zoom fatigue. Grounded in Media Naturalness Theory, it outlines six root causes of increased cognitive effort and stress, all arising because video conferencing tools create unnatural perceptions that do not align with the conditions we have been hardwired to rely on for effective communication[13]. One example is the unnatural interaction with multiple faces: video chat tools simulate artificial eye contact with multiple parties at close distance over an extended period, activating the sympathetic nervous system associated with the fight-or-flight response[14].

Extended Reality (XR) and being there

Much of what XR technology is intended to offer centers around perceptual and sensory immersion. Hardware technology available to us today spans headsets, behavior tracking, and haptics, all working towards optimization for the greatest amount of sensory input possible, with the least amount of discomfort and friction. These conditions also enable embodiment and spatial presence, which are the two primary affordances that make XR promising for many use cases in addition to workplace experience and interpersonal communication.

Embodied cognition is an approach to cognition that theorizes our cognitive processes are grounded in our sensorimotor capabilities[15]. This means that cognition emerges from the interplay between an organism and its environment, the organism’s autonomy to act on their environment, and the accuracy of the result of that action based on how the organism interacts with the environment[16]. This is significant because it means our mental processes benefit greatly not just from the greatest variety of perceived verbal and non-verbal inputs, but also spatial physicality. XR is able to simulate embodiment with perceptual and sensory immersion, allowing for an opportunity to deepen engagement and understanding.

Presence is often articulated as “a sense of being there”[17]. When we consider presence as it relates to XR, researchers primarily focus on “spatial presence”, or a technology-mediated experience, which is notably different from the natural presence that humans have traditionally known[18]. The most significant differentiator is the understanding that spatial presence is a perceptual illusion of non-mediation. This means that perceiving presence is not only a result of the most realistic sensory inputs but can also come from any conditions that result in the person failing to acknowledge the medium serving the experience (the extended reality headset or mobile device or book that the person becomes absorbed by) [19]. Without getting lost in the technicalities of presence, it’s most important to understand that presence begins and ends with attention, as attention drives what your mind selects to perceive, or not to perceive. Attention is often a result of either unexpected cues in the medium or the driving interest or need of an individual[20]. XR technologies are able to create a sense of presence because of the number of sensory stimuli they can provide, along with the ability to control them and center attention thoughtfully. The degree of control possible varies across the XR spectrum, with the highest degree of control being possible in fully immersive virtual reality environments. A sense of presence facilitates effective communication as attention is fully centered and a deeper level of processing is possible, increasing the chances of memory formation, shared understanding and collaboration[21].

XR as a tool for communication

When considering emerging technologies and new user experiences, it’s important to incorporate what we know about cognitive processes and human behavior to better understand what impact these new experiences may have on the people that will use them. By taking a deep dive into what is known about interpersonal communication and the impacts of XR experiences, we can determine that XR technology poses an opportunity for positive human impact when it comes to workplace experience and interpersonal communication.

Research tells us that the most effective form of communication is in-person because of how our minds have evolved to pull from the greatest variety of inputs, with non-verbal behaviors playing a significant role when it comes to establishing trust and shared understanding. While it is not impossible to do the same through CMC channels, sensory inputs are the most intuitive for our minds, can be processed quickly, and create the least amount of friction. Today, XR technology is able to reflect many of these standard sensory cues, including eye contact and body language, and more is in the works on facial tracking and haptics for facial expressions and touch. In its simplest form, when leveraged as a communication channel, XR can maintain some of the benefits of in-person communication while building on the benefits of CMC channels, opening the door to new opportunities by increasing access to effective hybrid work, international collaboration, skilled talent pools, sustainability, and so much more.

The XR affordances of embodiment and presence will also allow us to take things far beyond the familiar in-person forms of communication that we know today. Lombard tells us the experiences of presence and immersion are not entirely reliant on realism[22]. The true value of what is possible with the experience layer of the metaverse is our freedom from the constraints of reality. Even granting the superiority of in-person communication, it has limitations, including the difficulty of contributing simultaneously and the tendency for participants to unknowingly dominate the conversation[23]. Three decades ago, the case was made that communication methods are transformative not because they recreate face-to-face encounters, but rather because they offer new opportunities that go “beyond being there,” and the same is true today[24]. There is an opportunity to optimize communication by altering perceptual experiences and social physics to further improve communication effectiveness[25]. There is no limit to what we might imagine for environments, visualizations, and interactions that can facilitate deeper understanding, collaboration, and creativity.

The state of XR technology today

As noted, much of the technology already available today is working towards addressing the needs identified to make XR an effective tool for interpersonal communication. Even with how far we’ve come, I do want to emphasize that we still have a long way to go. Hardware access is limited, headsets are still bulky and uncomfortable, processing and battery power are limited, and behavior tracking and haptics are still in their infancy. Beyond hardware, there are also significant gaps in user experience when it comes to onboarding, user interfaces, the establishment of spatial interaction patterns, and device interoperability. Just this week, during an internal networking event with over 150 participants using a popular VR tool, many participants accessing it through the desktop app were unable to track who they were conversing with because of gaps in the user experience. Finally, we will have to be thoughtful about where this technology may have peripheral impacts. Even though it is sustainable from the perspective of reduced travel, power usage and processing are still significant concerns when rendering highly detailed content.

Where to go next

With our high-level understanding of the value of XR for workplace experience and communication, the next major steps are further exploration to establish the right interaction paradigms for spatial UX, and organizational experimentation to establish what it will take for successful implementation and adoption. As we know, familiarity and clarity in training are pivotal to adoption and effective use when easing people into new technologies[26].

On a grander scale, thorough research and collaboration with peripheral fields, including academia, will become increasingly vital to understanding the human impact of emerging technologies. It will also become non-negotiable when it comes to making calculated and thoughtful investments that will produce greater, more sustainable returns. We’re in a unique situation because no one truly has an advantage just yet. We all have access to the same tools and information, and it will come down to who invests the time to figure out what this can mean for their business.

[1] Amit Kumar Singh, “Role of Interpersonal Communication in Organizational Effectiveness,” International Journal of Research in Management and Business Studies, vol. 1 issue 4 (2014): 36-39.

[2] Mohja Rhoads, “Face-to-Face and Computer-Mediated Communication: What Does Theory Tell Us and What Have We Learned so Far?,” Journal of Planning Literature, 25(2): 111-122. https://doi.org/10.1177/0885412210382984.

[3] Joann Keyton, Communication & Organizational Culture: A Key to Understanding Work Experiences (Los Angeles: SAGE, 2011).

[4]Bonaccio, Silvia, Jane O’Reilly, Sharon L. O’Sullivan, and François Chiocchio. “Nonverbal Behavior and Communication in the Workplace: A Review and an Agenda for Research.” Journal of Management 42, no. 5 (July 2016): 1044–74. https://doi.org/10.1177/0149206315621146.

[5] Tania Singer, “The neuronal basis and ontogeny of empathy and mind reading: review of literature and implications for future research.” Neuroscience and biobehavioral reviews vol. 30,6 (2006): 855-63. doi:10.1016/j.neubiorev.2006.06.011

[6] Roger C. Mayer, James H. Davis, F. David Schoorman, “An Integrative Model of Organizational Trust,” Academy of Management Review, 20(3): 344-354. https://doi.org/10.5465/amr.1995.9508080335

[7] Rhoads, “Face-to-Face,” 111-122.

[8] Rhoads, “Face-to-Face,” 111-122.

[9] Rhoads, “Face-to-Face,” 111-122.

[10] Nathan Bos, Judy Olson, Darren Gergle, Gary Olson, Zach Wright, “Effects of Four Computer-Mediated Communications Channels on Trust Development,” Conference: Proceedings of the CHI 2002 Conference on Human Factors in Computing Systems, 4(1): 20-25.

[11] “2022 Work Trend Index: Annual Report,” Microsoft, 2022.

[12] Rene Riedl, “On the stress potential of videoconferencing: definition and root causes of Zoom fatigue,” Electron Markets 32, 153–177 (2022). https://doi.org/10.1007/s12525-021-00501-3

[13] Riedl, “On the stress potential,” 153-177.

[14] Riedl, “On the stress potential,” 153-177.

[15] "Embodiment and Embodied Cognition". In obo in Psychology, https://www.oxfordbibliographies.com/view/document/obo-9780199828340/obo-9780199828340-0023.xml (accessed 19 Jul. 2022).

[16] Mina C. Johnson-Glenberg, “Immersive VR and Education: Embodied Design Principles That Include Gesture and Hand Controls.” Frontiers in robotics and AI vol. 5 81. 24 Jul. 2018, doi:10.3389/frobt.2018.00081

[17] Marvin Minsky, (1980) Telepresence. OMNI Magazine, 44-52. https://philpapers.org/rec/MINT

[18] Mehmet Ilker Berkman, Ecehan Akan, “Presence and Immersion in Virtual Reality,” in Lee N. (eds) Encyclopedia of Computer Graphics and Games. Springer, Cham. doi:10.1007/978-3-319-08234-9_162-1

[19] Matthew Lombard, Theresa Ditton, “At the Heart of It All: The Concept of Presence,” Journal of Computer-Mediated Communication, 3(2). https://doi.org/10.1111/j.1083-6101.1997.tb00072.x

[20] Steven Hornik, “Attention, Spatial Presence, and Engagement: Implications for Virtual Environment Learning Platforms,” Semantic Scholar, 1-6 (2009).

[21] Stanislas Dehaene, How We Learn: The New Science of Education and the Brain (New York: Penguin Press, 2016).

[22] Lombard, “At the Heart of It All”

[23] Rhoads, “Face-to-Face,” 111-122.

[24] Joshua McVeigh-Schultz, Katherine Isbister, “The Case for “Weird Social” in VR/XR: A Vision of Social Superpowers Beyond Meatspace,” CHI EA '21: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, https://doi.org/10.1145/3411763.3450377. https://par.nsf.gov/biblio/10295212.

[25] McVeigh-Schultz, “The Case for “Weird Social”.”

[26] McVeigh-Schultz, “The Case for “Weird Social”.”