What’s the point of VR?
I’m captivated by its potential, but someone whose opinion I value keeps asking me why. It’s a fair question, so here is my attempt at an answer.
It’s not VR movies, games, or other so-called “immersive experiences”. I don’t want to pretend to fly, or jump off a building, or relax in a garden.
I suppose it’s selfish, but the answer is that I want to write code in VR. I’m tired of looking at my work through a pair of smudged plastic rectangles; I’d like to be surrounded by it.
My prediction is that in 10 years, we’ll think about flat screens the way we now remember lugging around 19" ViewSonic CRT monitors. We’ll have these thoughts while strapping on headsets to begin the day’s work.
The world of computer graphics can’t compete with the real one. But for those who seek to tightly couple the human mind with the abstract world of a networked computer, VR is the future.
So, what’s going to make it happen? The hardware is nearly there…
…but the software remains limited to toys, games, movies, and experiences. These applications take control of the display pipeline and try to place the viewer in a foreign world. Some of them attempt a human-computer interface, but nothing that makes us want to leave 2D behind.
A VR desktop operating system
We need an operating system to pull it together. Not an operating system in the virtual memory and device drivers sense, but a desktop operating system with a new take on user interaction. It can be built on Windows, Mac OS X, Linux, Android, the Web, or all of the above.
Its purpose is to host both 2D and 3D applications in a spherical orbit around the user. I’m calling it Shellspace, and I’ve begun to build it at http://github.com/wadetb/shellspace.
In the 2D desktop, application windows are packed together and overlaid in a 2D grid of pixels.
In the Shellspace, application widgets and content orbit around the user’s viewpoint like satellites, project themselves across distant virtual walls, or float in space above a synchronized physical/virtual desk.
A desktop process
Shellspace itself is a desktop process, running on top of a typical operating system, connected to arbitrary VR hardware.
It replaces the traditional 2D desktop, though that desktop is left running in the background as part of the infrastructure for supporting legacy applications.
In a nutshell, the Shellspace process is responsible for:
- Merging visual representations of multiple 2D and 3D applications
- Managing application visibility and location in 3D space
- Tracking the user’s head movement
- Translating input from physical devices to 2D and 3D applications
- Maintaining an ideal VR framerate; throttling nonperformant applications
- Managing the transition from multi-application VR desktop to single-application VR immersion
- Rendering a customizable background environment
- Transmitting representations of the shellspace to other clients (at varying levels of detail)
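To make these responsibilities concrete, here is a minimal sketch of how they might compose into a single per-frame loop. Every type and function name below is hypothetical, invented for illustration; this shows the shape of the process, not the actual Shellspace code.

```cpp
// Hypothetical per-frame loop for the Shellspace desktop process.
// None of these names are real Shellspace APIs.
#include <vector>

struct HeadPose { float position[3]; float orientation[4]; };

struct Application {
    bool visible = true;
    bool overBudget = false;        // flagged by a frame-time watchdog
    void integrateContent() {}      // merge latest IPC-submitted pixels/geometry
    void throttle() {}              // request a cheaper LOD from the app
};

HeadPose readHeadTracker() { return {}; }     // track the user's head
void dispatchInput(const HeadPose&) {}        // route device input to apps
void renderEnvironment() {}                   // customizable background
void renderApplication(const Application&) {}
void presentToHmd() {}                        // must hit a steady 90fps

void frame(std::vector<Application>& apps) {
    HeadPose pose = readHeadTracker();
    dispatchInput(pose);
    renderEnvironment();
    for (Application& app : apps) {
        app.integrateContent();
        if (app.overBudget) app.throttle();    // protect the VR framerate
        if (app.visible) renderApplication(app);
    }
    presentToHmd();
}

int main() {
    std::vector<Application> apps(3);   // three orbiting widgets, say
    frame(apps);                        // in reality this loops forever
}
```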
User interface
My vision for the shellspace user interface is a stationary user surrounded by a sphere of floating applications, like a planet surrounded by satellites.
A mental exercise to help imagine the shellspace: picture giving each of your traditional desktop application windows (Mail, Editor, IM) its own wireless flatscreen monitor, suspending them all in a zero-gravity environment, and trying to get some work done.
This is in contrast to interfaces such as the HoloLens, where applications are integrated virtually into a physical office or living room environment, and to older examples like the infamous Jurassic Park VR Unix scene.
Input visualization
Two kinds of VR hardware have been announced, with different implications for user input: immersive and non-immersive.
Immersive includes technologies like the Oculus Rift and Sony’s Project Morpheus. Non-immersive includes technologies like Microsoft’s HoloLens and the Google-backed Magic Leap.
In non-immersive VR, standard input devices such as keyboards and mice remain usable just as in a desktop environment.
In immersive VR, a major issue with user input is the inability to see one’s hands or input devices.
Much work has been done towards representing a user’s arms, hands, and fingers in the virtual environment.
Shellspace proposes to meet these efforts in the middle by also representing the user’s input devices in the virtual environment.
When first setting up the shellspace, the user traces the plane and bounds of her desk with her hands, and those bounds are represented as a desk in the virtual environment.
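As a sketch of how that calibration step might work, suppose the user touches three corners of her physical desk with a tracked hand; the plane and bounds then fall out of basic vector math. The three-corner protocol and all the names here are my illustration, not a settled design.

```cpp
// Hypothetical desk calibration: the user touches three corners of her
// physical desk, and the plane and bounds of its virtual twin fall out
// of basic vector math.
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

struct DeskPlane {
    Vec3 origin;  // near-left corner of the desk
    Vec3 edgeU;   // vector along the near edge
    Vec3 edgeV;   // vector along the left edge
    Vec3 normal;  // unit normal pointing up out of the desk
};

// The three arguments are tracked hand positions at three desk corners.
DeskPlane calibrateDesk(Vec3 nearLeft, Vec3 nearRight, Vec3 farLeft) {
    DeskPlane desk;
    desk.origin = nearLeft;
    desk.edgeU = sub(nearRight, nearLeft);
    desk.edgeV = sub(farLeft, nearLeft);
    desk.normal = normalize(cross(desk.edgeU, desk.edgeV));
    return desk;
}

int main() {
    // A 1.2m x 0.6m desk whose surface sits 0.7m above the floor.
    DeskPlane desk = calibrateDesk({0.0f, 0.7f, 0.0f},
                                   {1.2f, 0.7f, 0.0f},
                                   {0.0f, 0.7f, -0.6f});
    std::printf("desk normal: %.1f %.1f %.1f\n",
                desk.normal.x, desk.normal.y, desk.normal.z);
}
```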
Interface widgets for specific input devices, such as keyboards and mice, can also be registered with Shellspace and have their locations tracked via IR markers; near-exact replicas are then rendered at the virtual locations which correspond to their actual physical locations.
Thus, the problem of connecting one’s hands, desk, mouse, and keyboard is eliminated by representing those physical objects in the virtual space.
Additionally, users may find that input devices which do not need to be moved, such as rigidly located keyboards and trackballs, are preferable to standard mice.
In addition to tracking their physical locations, input device replicas can be programmed to represent their input state visually in the VR environment: visually depressing keys as their real-world counterparts are pressed, highlighting mouse buttons when clicked, or (somewhat jokingly) displaying the temperature and liquid level of a VR-enabled coffee cup.
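A minimal sketch of such a replica, assuming a hypothetical keyboard whose pose comes from IR marker tracking and whose key state mirrors the OS input stream (all names illustrative):

```cpp
// Hypothetical replica of a tracked physical keyboard. Each frame it
// copies the marker-tracked pose and the live key state, so the renderer
// can draw key caps visually depressed as their physical counterparts
// are pressed.
#include <array>

struct Pose { float position[3]; float orientation[4]; };

struct KeyboardReplica {
    Pose pose{};                       // from IR marker tracking
    std::array<bool, 256> keysDown{};  // mirrored from the OS input stream

    void update(const Pose& trackedPose, const std::array<bool, 256>& osKeys) {
        pose = trackedPose;   // keep the replica co-located with the real keyboard
        keysDown = osKeys;
    }

    // The renderer offsets a pressed key's cap downward along its travel axis.
    float keyCapOffset(int scanCode) const {
        return keysDown[scanCode] ? -0.002f : 0.0f;  // ~2mm of key travel
    }
};

int main() {
    KeyboardReplica replica;
    std::array<bool, 256> keys{};
    keys[30] = true;                   // say the 'A' key is held down
    replica.update(Pose{}, keys);
    return replica.keyCapOffset(30) < 0.0f ? 0 : 1;
}
```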
API principles
Input
Shellspace manages the translation of input from legacy and VR-native input devices to applications.
The notion of application and widget focus is retained from the 2D desktop; input is sent first to the active widget and propagated upwards through the ancestry until it is handled.
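In code, the propagation rule might look like the following sketch, assuming a hypothetical widget tree where each widget knows its parent:

```cpp
// Sketch of focus-based dispatch with upward propagation: an event goes
// to the focused widget first and bubbles through its ancestors until
// one handles it. All names are illustrative.
struct InputEvent { int kind; };

struct Widget {
    Widget* parent = nullptr;
    virtual ~Widget() = default;
    virtual bool handle(const InputEvent&) { return false; }  // false = not handled
};

void dispatch(Widget* focused, const InputEvent& ev) {
    for (Widget* w = focused; w != nullptr; w = w->parent)
        if (w->handle(ev))
            return;  // consumed somewhere along the ancestry
    // Unhandled events fall through to Shellspace itself (see global hotkeys below).
}

int main() {
    Widget root, child;
    child.parent = &root;
    dispatch(&child, InputEvent{0});  // visits child, then root, then gives up
}
```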
For 2D applications with 2D input devices such as mice or trackballs, input events are sent directly to the focused window. 3D input events are also sent to 2D applications when they hold focus.
For 3D applications, input from 2D devices may be translated into 3D input by Shellspace in user-configurable ways. For example, moving a trackball might be set up to translate in the XY plane, unless the Shift key is held, in which case it translates in the XZ plane.
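That trackball example reduces to a tiny mapping function; something like this sketch, with names invented for illustration:

```cpp
// Sketch of the user-configurable 2D-to-3D mapping described above:
// trackball deltas translate in the XY plane, or in the XZ plane while
// Shift is held.
struct Vec3 { float x, y, z; };

Vec3 translateTrackball(float dx, float dy, bool shiftHeld) {
    if (shiftHeld)
        return {dx, 0.0f, dy};  // Shift held: move in the horizontal XZ plane
    return {dx, dy, 0.0f};      // default: move in the vertical XY plane
}

int main() {
    Vec3 move = translateTrackball(1.0f, 2.0f, /*shiftHeld=*/true);
    return move.z == 2.0f ? 0 : 1;
}
```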
3D input devices can report position, orientation, and digital and analog button presses. Shellspace will track hand movements and translate certain gestures into 3D inputs.
There is no limit on the number of input devices simultaneously transmitting.
When no window is focused, and when certain global hotkeys are pressed (e.g. the Windows key), input events may be captured by Shellspace to manage the environment. For example, holding the Windows key while rotating the trackball may rotate the shellspace environment around the user’s perspective like a globe.
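A sketch of that capture rule, with the sensitivity constant and all names invented for illustration:

```cpp
// Sketch of Shellspace capturing input while a global hotkey is held:
// Windows key + trackball spins the whole environment around the user's
// viewpoint like a globe.
struct Orientation { float yaw = 0.0f, pitch = 0.0f; };  // radians

bool shellspaceCapture(bool windowsKeyHeld, float dx, float dy,
                       Orientation& environment) {
    if (!windowsKeyHeld)
        return false;                  // not ours; pass to the focused widget
    const float sensitivity = 0.005f;  // radians per trackball count
    environment.yaw   += dx * sensitivity;
    environment.pitch += dy * sensitivity;
    return true;                       // consumed by Shellspace
}

int main() {
    Orientation env;
    shellspaceCapture(true, 10.0f, 0.0f, env);  // yaws the globe 0.05 radians
}
```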
Graphics
The principal constraint of a comfortable VR experience is a consistent refresh rate of 90fps. Current graphics APIs and hardware are designed to support the needs of a single application rendering a single immersive 3D scene.
To this end, the primary graphics card is dedicated to the Shellspace VR environment.
To represent themselves in the VR scene, individual VR applications inject content into the desktop process via interprocess communication channels, somewhat like the way 2D applications transfer pixel data through the Win32 GDI.
2D applications
Legacy 2D applications are translated into VR automatically. They are launched internally in an invisible virtual 2D desktop, and their pixels are captured by the Shellspace process and rendered into the VR scene.
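The capture path might look like the following sketch; the grab and upload calls are placeholders, since the real mechanism is platform-specific (a Win32 DC blit, an X11 composite redirect, and so on):

```cpp
// Sketch of the 2D capture path: a legacy window's pixels are grabbed
// from the invisible virtual desktop and uploaded as a texture on a quad
// in the VR scene.
#include <cstdint>
#include <vector>

struct WindowCapture {
    int width = 0, height = 0;
    std::vector<uint32_t> pixels;  // BGRA rows, top to bottom
};

WindowCapture captureWindow(/* platform window handle */) { return {}; }
void uploadTexture(const WindowCapture&) { /* e.g. glTexSubImage2D */ }

void present2dApplication() {
    WindowCapture frame = captureWindow();
    uploadTexture(frame);  // the orbiting quad now shows the live app
}

int main() { present2dApplication(); }
```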
3D applications
3D applications use Shellspace remote procedure calls to manage their representation in the scene. This content includes graphics primitives such as vertices, triangles, translations and rotations of root coordinate systems, and texture data.
The set of graphics primitives that may be submitted is intentionally kept small; this is important to maintaining 90fps. Specifically, 3D applications are not allowed to submit their own shaders. Where lighting effects beyond those provided by the Shellspace renderer are required, they must be rendered into texture space in software or on a secondary GPU.
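A sketch of what that deliberately small command vocabulary might look like on the wire; the command set and field layout are my invention for illustration:

```cpp
// Sketch of the intentionally small vocabulary a 3D application might
// submit over the IPC channel: transforms, triangles, and texture data,
// but never shaders.
#include <cstdint>
#include <vector>

struct Vertex { float position[3]; float uv[2]; };

enum class CommandKind { SetRootTransform, SubmitTriangles, UpdateTexture };

struct Command {
    CommandKind kind;
    float transform[16];            // SetRootTransform: a 4x4 matrix
    std::vector<Vertex> vertices;   // SubmitTriangles
    std::vector<uint16_t> indices;  // SubmitTriangles: three per triangle
    std::vector<uint8_t> texels;    // UpdateTexture: raw RGBA bytes
};

// The Shellspace process validates each command and renders it with its
// own shaders, so no application can blow the 90fps budget with
// expensive GPU state of its own.
void submit(const Command&) { /* enqueue on the async channel */ }

int main() {
    Command c{};
    c.kind = CommandKind::SubmitTriangles;
    c.vertices = {{{0, 0, 0}, {0, 0}}, {{1, 0, 0}, {1, 0}}, {{0, 1, 0}, {0, 1}}};
    c.indices = {0, 1, 2};
    submit(c);
}
```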
LOD and throttling
3D applications may register representations for different levels of detail, as requested by the Shellspace environment. Applications which inject too much content into the scene may be highlighted negatively to the user and/or automatically throttled. 2D applications may have their pixel data sub-sampled when their on-screen representation is small.
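One plausible selection policy, sketched with invented names: pick a detail level by the widget’s apparent angular size, and knock throttled applications down a level.

```cpp
// Hypothetical LOD selection: levels are ordered coarse to fine, and a
// level is chosen by apparent angular size; over-budget apps lose one.
#include <cstddef>
#include <vector>

struct Representation { int triangleBudget; };

struct LodSet {
    std::vector<Representation> levels;  // ordered coarse to fine

    const Representation& select(float angularSizeDeg, bool throttled) const {
        std::size_t level = 0;
        // Roughly one extra level of detail per doubling of apparent size.
        for (float size = 5.0f;
             size < angularSizeDeg && level + 1 < levels.size(); size *= 2.0f)
            ++level;
        if (throttled && level > 0)
            --level;  // over-budget applications lose a level
        return levels[level];
    }
};

int main() {
    LodSet lod{{{100}, {1000}, {10000}}};
    return lod.select(25.0f, false).triangleBudget == 10000 ? 0 : 1;
}
```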
All Shellspace inter-process APIs are asynchronous, to avoid interfering with the performance of the Shellspace environment.
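The asynchronous convention might be as simple as a message queue between the application and the Shellspace render thread; a sketch, with names invented for illustration:

```cpp
// Sketch of the asynchronous convention: application-side calls only
// enqueue and return immediately; the Shellspace render thread drains
// the queue between frames at its own pace, so a slow application can
// never stall the 90fps loop.
#include <mutex>
#include <queue>
#include <utility>

template <typename Message>
class AsyncChannel {
public:
    void post(Message m) {       // application side: returns without waiting
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(std::move(m));
    }
    bool poll(Message& out) {    // Shellspace side: called between frames
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<Message> pending_;
};

int main() {
    AsyncChannel<int> channel;
    channel.post(42);
    int message = 0;
    return channel.poll(message) && message == 42 ? 0 : 1;
}
```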
When an “immersive experience” application is launched, that process is given control of the graphics device, in a manner similar to a fullscreen mode transition in a 2D desktop. Otherwise, processes are free to use secondary graphics hardware which may be available for computation and/or rendering.