Neos, a vision for operating systems

Neos: the elegant operating system, is a collection of those of my ideas for computing which are implementable as an operating system (and a corresponding API environment).

Features

The key concepts are

smart lazy evaluation,
heavy compression, and
retention of semantic information.

Every user input and every program output is recorded by default. The user can travel backwards in time to view the history of this or of any other session, much like viewing a screen capture. However, the historical session data is semantic: the view can be filtered to particular application windows or user interface elements, the recording can jump forwards or backwards to user or program actions (“jump to the next button click,” “show me all error dialogues opened by this application in the past hour”), and, for supported programs, their execution can actually resume from a past state.

Since the recording differentiates between information shared with and presented by different applications, the user can naturally also rewind them individually.

The recording isn’t a video which would describe the exact pixels rendered to the screen. Instead, it consists of timestamped calls made to the system APIs to build and render user interfaces. These calls are then interpreted again when the user chooses to see a part of the recording. User settings and tweaks can be either taken from the recording itself, or applied from the current state of the system, giving the ability to see past actions with different UI themes.

When the user interacts with the system in this way (viewing past states or parts thereof), a timeline highlighting user interactions is shown. The timeline has controls for fast forwarding and rewinding, just like a video player. Additionally, the user can invoke the filtering facilities of the time machine via controls located near the timeline. Software with retroactive features can support user actions in the past which are efficiently carried over to the present. Any such actions introduce branches in the timeline which can be switched between. Applications supporting these interactions are hierarchically highlighted in the timeline view – moving the cursor from the timeline to the view of the system UI highlights first the windows of retroactive applications. Moving the cursor into a retroactive window then focuses the highlight on smaller retrointeractive UI elements, and so forth until the largest fully retroactive element of interest is focused.

Sections of the user interface which aren’t composed of the native widgets don’t support record, rewind, and replay out of the box. The user has the option to select which of these can be ignored (and will show a placeholder in any recordings) and which should be captured as regular video. The semantic, high-level information which enables the proposed features is unfortunately lost, although parts of the history management subsystem, such as retroactivity, can be supported independently of the UI framework.

The lossless nature of these API call recordings enables high fidelity streaming at any resolution. The recordings act as lossless, vector graphics video; they can be upscaled and downscaled at will, image quality is only ever lost in parts of the recording which display raster graphics. Additionally, any animations can be annotated with high-level information which specifies their duration, keyframes, etc. They can thus be transmitted over the network and rendered by connected systems without stuttering at the highest desired framerates. A system rendering the user interface to a 1080p screen at 30 FPS can stream to 4K @ 120 FPS systems with no computational or network overhead. The semantic interchange format is also far more memory efficient than a conventional video. Connected higher-resolution systems can interpolate between received keyframes to improve perceived fidelity and smooth out differences in resolution of input devices. Although latency and stuttering remain problematic, the high-level information helps reconstruct the source material more accurately. The animation metadata allows the clients to speed up the playback to smooth over network speed bumps.

Connected systems can also change the theme of the received widgets at will, and the tools available from the history view are also applicable when streaming. Both the streamer and the viewers can thus filter the user interface, the viewers can rewind, search, filter, or fast forward. Certain applications may support sharing state snapshots over such a stream; this would allow the streamer to “publish a window” which users could open on their own systems and further interact with separately.

Downscaling of the vector video format can utilise application-specific information to simplify the display of certain widgets, collections of widgets, or even entire windows. For example, circular user avatars can be replaced by dots of their accent colours when viewed at a smaller scale. Downscaling a stack of textual items can result in a stack of thick lines instead. Certain parts of the interface can be entirely simplified away.

Downscaling automates away some of the work usually done by designers, such as symbolic screenshots of user interfaces, and improves the fidelity of such simplifications by taking into account the ground truth, along with any user-selected themes and customisations, rather than an overly generic interpretation of the interface. The suggested downscaling rules can be augmented by the applications themselves in a context-sensitive manner, many downscaled options can be presented by the application to the system based on which widgets are considered to be most relevant to the user (these options can also be sorted according to the user’s preferences).

Downscaled versions of widgets and windows can be cached for future reference. These simplified views can be used in many cases to serve as previews or placeholders before more accurate information arrives. A typical use-case for downscaled interfaces is the popular idea of “ghost” placeholders which serve as filler content before an application loads actual user-generated content. These are commonplace in comment sections and instant messengers. Downscaling can improve these placeholders with relevant structure; rather than showing a generic ghost of a textual message with some embedded elements, the real downscaled version of the relevant message can be used instead. Downscaled views can also serve as video-player-like previews in the history subsystem or during streaming.

TODO:

simplifying cursor movements based on relevant actions