Incorporating 3D Artificial Intelligence with AR and VR Technology

Cloud-based environments enable successful integration of AI, AR, and VR.

Augmented reality displays digital content in the real world as well as virtually. (Credit: GridRaster)

The world’s largest technology companies are racing to power the next generation of tools, technologies, and resources for manufacturing, healthcare, construction, and many other vertical market applications. That race builds on recent advances in artificial intelligence (AI) and in immersive mixed reality technologies such as augmented reality (AR) and virtual reality (VR). These technologies differ in specific ways, but they now also work together in advanced three-dimensional (3D) applications and environments.

Immersive Mixed Reality Uses

With VR, a user wears a headset and is fully immersed in a new world or environment, sometimes one that mimics the real world. The user receives both a visual and an audio experience that is meant to take them away from known reality. In the health and pharmaceutical industries, think of a medical device manufacturer using a VR headset to virtually design new surgical equipment or a new pharmaceutical device. AR is similar in concept, but it displays digital content within the real world. Think Pokémon Go or IKEA’s Place app, both of which allow a user to interact with and experience digital objects.

Where Immersive Mixed Reality Falls Short for Enterprises

Today’s AR/VR platforms are powered by distributed cloud architecture and 3D vision-based AI. (Credit: GridRaster)

The challenge is that these technologies require large volumes of data, the ability to process that data at very high speed, and the ability to scale projects in a computing environment that usually isn’t available in traditional offices. Immersive mixed reality requires a precise and persistent fusion of the real and virtual worlds. This means rendering complex models and scenes in photorealistic detail at the correct physical location (with respect to both the real and virtual worlds), at the correct scale, and with an accurate pose. Think of the precision required when using AR/VR to design, build, or repair components of an aircraft engine or an advanced surgical device used in medical applications. Today this is achieved by rendering on discrete GPUs in one or more servers and delivering the rendered frames wirelessly or remotely to head-mounted displays (HMDs) such as the Microsoft HoloLens and the Oculus Quest.

The Need for 3D and AI in Immersive Mixed Reality

One of the key requirements for mixed reality applications is to precisely overlay an object’s model, or digital twin, on the physical object. This helps in providing work instructions for assembly and training, and in catching errors or defects in manufacturing. The user can also track the object and adjust the rendering as the work progresses.

Most on-device object tracking systems use 2D image and/or marker-based tracking. This severely limits overlay accuracy in 3D because 2D tracking cannot estimate depth with high accuracy, and consequently cannot estimate scale and pose accurately either. Users can get what looks like a good match from one angle or position, but the overlay loses alignment as they move around in six degrees of freedom (6DOF). In addition, object registration (the detection and identification of an object and the estimation of its scale and orientation) is achieved, in most cases, computationally or by using simple computer vision methods with standard training libraries (examples: Google MediaPipe, VisionLib). This works well for regular, smaller, and simpler objects such as hands, faces, cups, tables, chairs, wheels, and regular geometric structures. However, for large, complex objects in enterprise use cases, labeled training data (more so in 3D) is not readily available. This makes it difficult, if not impossible, to use 2D image-based tracking to align, overlay, and persistently track the object and fuse the rendered model with it in 3D.
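The depth ambiguity described above can be illustrated with a minimal pinhole-camera sketch. It is not taken from any tracking product: the focal length, the point coordinates, and the helper names (`project`, `scaled_along_ray`, `overlay_error_px`) are all hypothetical values chosen for illustration. The sketch places an overlay at the wrong depth along the same viewing ray, shows that it matches perfectly from the original viewpoint, and then shows the misalignment once the camera moves sideways.

```python
import math
from dataclasses import dataclass

FOCAL_PX = 1000.0  # hypothetical focal length, in pixels


@dataclass(frozen=True)
class Point3D:
    x: float  # meters, to the right of the camera
    y: float  # meters, vertical offset from the camera axis
    z: float  # meters, depth along the viewing axis


def project(p: Point3D, cam_x: float = 0.0) -> tuple:
    """Pinhole projection of p for a camera shifted cam_x meters along x."""
    return (FOCAL_PX * (p.x - cam_x) / p.z, FOCAL_PX * p.y / p.z)


def scaled_along_ray(p: Point3D, factor: float) -> Point3D:
    """Move p along its viewing ray from the original camera position.

    This models a 2D tracker that matches the pixel but guesses the depth.
    """
    return Point3D(p.x * factor, p.y * factor, p.z * factor)


def overlay_error_px(true_p: Point3D, est_p: Point3D, cam_x: float) -> float:
    """Pixel distance between the real point and the overlay in one view."""
    u_t, v_t = project(true_p, cam_x)
    u_e, v_e = project(est_p, cam_x)
    return math.sqrt((u_t - u_e) ** 2 + (v_t - v_e) ** 2)


# A real feature 4 m away, and an overlay placed at half that depth.
true_point = Point3D(0.5, 0.0, 4.0)
guess = scaled_along_ray(true_point, 0.5)

# From the original viewpoint the overlay looks perfectly aligned.
print(f"{overlay_error_px(true_point, guess, cam_x=0.0):.1f}")  # 0.0

# After the user steps 1 m to the side, the overlay drifts off the object.
print(f"{overlay_error_px(true_point, guess, cam_x=1.0):.1f}")  # 250.0
```

With these assumed numbers, an overlay that is pixel-perfect in the first view is 250 pixels off after a one-meter sidestep, which is the kind of drift a 3D-aware registration is meant to avoid.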

Enterprise-level users are overcoming these challenges by incorporating 3D environments and AI technology into their immersive mixed reality design/build projects. Deep learning-based 3D AI allows users to identify 3D objects of arbitrary shape and size, in various orientations, with high accuracy in 3D space. This approach scales to any arbitrary shape and suits enterprise use cases that require overlaying complex 3D models and digital twins on their real-world counterparts.

This approach can also be scaled to register partially completed structures against the complete 3D models, which supports ongoing construction and assembly. Users achieve an accuracy of 1–10 mm in object registration and rendering with this platform approach. The rendering accuracy is limited primarily by the device capability. This approach to 3D object tracking allows users to truly fuse the real and virtual worlds in enterprise applications, opening many uses including but not limited to training with work instructions, defect and error detection in construction and assembly, and 3D design and engineering with life-size 3D rendering and overlay.
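The article does not say how the 1–10 mm figure is measured. One plausible way to check a registration against such a budget is a root-mean-square (RMS) distance between points on the rendered overlay and the matching points measured on the physical object. The sketch below assumes that metric; the function names, the sample points, and the use of 10 mm as the default tolerance are illustrative choices, not the platform’s method.

```python
import math


def rms_error_mm(rendered, measured):
    """RMS distance in millimeters between corresponding 3D points.

    Both inputs are sequences of (x, y, z) tuples in meters, in the same order.
    """
    if not rendered or len(rendered) != len(measured):
        raise ValueError("need two non-empty point lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(rendered, measured))
    return 1000.0 * math.sqrt(squared / len(rendered))


def within_tolerance(rendered, measured, tolerance_mm=10.0):
    """True if the overlay stays inside the assumed registration budget."""
    return rms_error_mm(rendered, measured) <= tolerance_mm


# Hypothetical reference points measured on a physical part (meters).
measured_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# The same points as placed by the overlay, offset 3 mm in x and 4 mm in y.
rendered_points = [(x + 0.003, y + 0.004, z) for x, y, z in measured_points]

print(f"{rms_error_mm(rendered_points, measured_points):.1f} mm")  # 5.0 mm
print(within_tolerance(rendered_points, measured_points))          # True
```

A uniform 3 mm by 4 mm offset gives a 5 mm error at every point, so the overlay passes a 10 mm budget but would fail a tighter one, for example `tolerance_mm=1.0`.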

Working in Cloud Environments Is Critical

Manufacturers should be cautious in how they design and deploy these technologies, because there can be a great difference between the platform a solution is first built on and the platform it later needs in order to run at full scale. Even though technologies like AR and VR have been in use for several years, many manufacturers have deployed virtual solutions that are built upon an on-premises environment, where all of the technology data is stored locally.

On-premises AR/VR infrastructures limit the speed and scalability needed for today’s virtual designs. They also limit knowledge sharing between organizations, which can be critical when designing new products and determining the best approach to virtual buildouts.

Manufacturers today are overcoming these limitations by leveraging cloud-based (or remote server-based) AR/VR platforms powered by distributed cloud architecture and 3D vision-based AI. These cloud platforms provide the desired performance and scalability to drive innovation in the industry at speed and scale.

This article was written by Dijam Panigrahi, Co-founder and COO of GridRaster Inc., a provider of cloud-based AR/VR platforms that power AR/VR experiences on mobile devices for enterprises, based in Mountain View, CA.