
# A computer’s understanding of space for Augmented Reality

### Akshay Kore (@akshaykore)

The goal of Augmented Reality is to superimpose a computer’s perception of space onto a human’s understanding of it. In computer science, space is simply a metaphor for commonly agreed and scientifically validated concepts of space, time and matter. A computer’s understanding of space is nothing more than a mathematically defined 3D representation of objects, locations and matter. It can be understood by means of coordinate systems, without confusing jargon like hyper-realities or alternate universes, although these are definitely interesting thought experiments. A virtual space is nothing but a computer’s understanding of the real world as provided by humans.

Humans are spatial beings. We interact with and understand a large portion of our realities in three dimensions. As Augmented Reality tries to bring virtual worlds into human reality, it is important to understand the basic aspects of virtual 3D spaces.

### Visual space and object space

What we perceive as the location of objects in the environment is a reconstruction of light patterns on the retina. In computer graphics, a visual space can be defined as the perceived space, or visual scene, of a 3D virtual space as experienced by a participant.

The virtual space in which the object exists is called the object space. It is a direct counterpart of the visual space.

Each eye sees the visual space differently. This is a critical challenge in computer graphics for binocular virtual devices or smart glasses. In order to design for virtual worlds, it is important to have a common understanding of the position and orientation of virtual objects in the real world. Common coordinate and orientation systems greatly help here.

### Position and coordinates

Three types of coordinate systems are used for layout and programming of virtual and augmented reality applications:

#### Cartesian Coordinates

The Cartesian coordinate system is used mainly for its simplicity and familiarity, and most virtual spaces are defined by it. The x-y-z based coordinate system is precise for specifying the location of 3D objects in virtual space. The three coordinate planes are perpendicular to each other. Distances and locations are specified from the point of origin, which is the point where the three planes intersect. This system is mainly used for defining the visual coordinates of 3D objects.
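As a minimal sketch of the idea, a virtual object's location can be stored as an x-y-z triple and measured against the origin. The `Point3D` class and the example coordinates are hypothetical names and values chosen for illustration, not part of any particular AR toolkit.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point3D:
    """A location in a Cartesian virtual space (hypothetical helper)."""
    x: float
    y: float
    z: float

    def distance_to(self, other: "Point3D") -> float:
        # Straight-line (Euclidean) distance between two points.
        dx = self.x - other.x
        dy = self.y - other.y
        dz = self.z - other.z
        return math.sqrt(dx * dx + dy * dy + dz * dz)


# The origin is where the three coordinate planes intersect.
ORIGIN = Point3D(0.0, 0.0, 0.0)

# An illustrative virtual object placed in the space.
virtual_cube = Point3D(3.0, 4.0, 12.0)
print(virtual_cube.distance_to(ORIGIN))  # 13.0
```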

#### Spherical Polar Coordinates

The Cartesian system usually defines the positions of 3D objects with respect to an origin point. A system of spherical polar coordinates is used when locating objects and features with respect to the user’s position. This system is used mainly for mapping a virtual sound source, or for mapping spherical video in first-person immersive VR. The spherical coordinate system is based on perpendicular planes bisecting a sphere and consists of three elements: azimuth, elevation and distance. Azimuth is the angle around the origin in the horizontal (ground) plane, while elevation is the angle in the vertical plane. Distance is the magnitude or range from the origin.
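To place something like a virtual sound source in a Cartesian scene, its azimuth, elevation and distance have to be converted to x-y-z. The sketch below assumes a convention the article does not specify: z points up, azimuth is measured in the x-y ground plane starting from the x axis, and elevation is measured upward from that plane. The function names are illustrative.

```python
import math


def spherical_to_cartesian(azimuth_deg, elevation_deg, distance):
    """Convert (azimuth, elevation, distance) to (x, y, z).

    Assumed convention: z is up, azimuth starts at the x axis in the
    ground plane, elevation is the angle above the ground plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance * math.cos(el)  # length of the projection on the ground plane
    x = horizontal * math.cos(az)
    y = horizontal * math.sin(az)
    z = distance * math.sin(el)
    return (x, y, z)


def cartesian_to_spherical(x, y, z):
    """Inverse conversion, returning angles in degrees."""
    distance = math.sqrt(x * x + y * y + z * z)
    azimuth_deg = math.degrees(math.atan2(y, x))
    elevation_deg = math.degrees(math.atan2(z, math.hypot(x, y)))
    return (azimuth_deg, elevation_deg, distance)


# A sound source 2 units away, 90 degrees around from the x axis, at ear level.
source_xyz = spherical_to_cartesian(90.0, 0.0, 2.0)
```

With the user at the origin, `source_xyz` lands approximately on the y axis at a distance of 2, which is what a renderer would need to position the source in the Cartesian object space.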

#### Cylindrical Coordinates

This system is mainly used in VR applications for viewing 360 degree panoramas. The cylindrical system allows for precise mapping and alignment of still images so that they overlap for edge stitching in panoramas. The system consists of a central reference axis (L) with an origin point (O). A point is located by its radial distance (ρ) from the axis, its angular coordinate (φ) around the axis, and its height (z) along the axis. Although this system is good for scenarios that require rotational symmetry, it is limited in terms of its vertical view.
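The following sketch shows one way a panorama pixel could be mapped onto a cylinder and then into Cartesian space. The pixel mapping is an assumption made for illustration (image columns spread evenly over 360 degrees, rows spread linearly over the cylinder's height), and both function names are hypothetical.

```python
import math


def cylindrical_to_cartesian(rho, phi_deg, z):
    """Convert (rho, phi, z) to (x, y, z), with the reference axis L along z."""
    phi = math.radians(phi_deg)
    return (rho * math.cos(phi), rho * math.sin(phi), z)


def panorama_pixel_to_cylindrical(col, row, width, height,
                                  radius=1.0, cylinder_height=1.0):
    """Map a pixel of a width x height panorama onto a cylinder.

    Assumed mapping: column 0 sits at phi = 0, the full image width wraps
    once around the axis, and the middle row sits at z = 0.
    """
    phi_deg = 360.0 * col / width
    z = cylinder_height * (0.5 - row / height)
    return (radius, phi_deg, z)


# A pixel a quarter of the way across, on the middle row of a 400 x 100 image.
rho, phi_deg, z = panorama_pixel_to_cylindrical(100, 50, 400, 100)
pixel_xyz = cylindrical_to_cartesian(rho, phi_deg, z)
```

Because z is bounded by the cylinder's height, nothing directly above or below the viewer can be represented, which is the vertical limitation described above.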

### Defining orientation and rotation

It is necessary to define the orientation and rotation of user viewpoints and objects along with their position in the virtual space. This information is especially important when tracking where the user is looking, or when determining the orientation of virtual objects with respect to the visual space.

#### Six degrees of freedom (6 DOF)

In virtual and augmented reality, it is common to define orientation and rotation with three independent values. These are referred to as roll (x), pitch (y) and yaw (z), and are known as Tait-Bryan angles. A combination of position (x-y-z) and orientation (roll-pitch-yaw) is referred to as six degrees of freedom (6 DOF).
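A 6 DOF pose can be sketched as a small data structure holding three position values and three angles. The `Pose6DOF` class below is a hypothetical illustration. It assumes angles in radians and one common Tait-Bryan rotation order (roll about x applied first, then pitch about y, then yaw about z); real engines and headsets differ in axis and order conventions.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DOF:
    """Position (x, y, z) plus orientation (roll, pitch, yaw) in radians."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0   # rotation about the x axis
    pitch: float = 0.0  # rotation about the y axis
    yaw: float = 0.0    # rotation about the z axis

    def rotation_matrix(self):
        # Assumed order: R = Rz(yaw) * Ry(pitch) * Rx(roll).
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]

    def transform_point(self, point):
        """Rotate a local point by the orientation, then offset by the position."""
        r = self.rotation_matrix()
        px, py, pz = point
        return (
            r[0][0] * px + r[0][1] * py + r[0][2] * pz + self.x,
            r[1][0] * px + r[1][1] * py + r[1][2] * pz + self.y,
            r[2][0] * px + r[2][1] * py + r[2][2] * pz + self.z,
        )


# An object at (1, 2, 3) turned a quarter turn about the vertical (z) axis.
pose = Pose6DOF(x=1.0, y=2.0, z=3.0, yaw=math.pi / 2)
world_point = pose.transform_point((1.0, 0.0, 0.0))
```

Here a point one unit along the object's own x axis ends up at approximately (1, 3, 3) in the shared space: the yaw swings it onto the y direction, and the position offset is then added.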