What Is AIV?

In its simplest terms, Apple Immersive Video (AIV) is a new format that produces true stereoscopic 3D immersive films to play back inside the Vision Pro. The URSA Cine Immersive is the first camera on the market to meet the specifications required for AIV and is currently the only way to capture images for AIV.
Apple released the Vision Pro in early 2024. While it looks like a VR headset from the outside, it’s an entirely new type of device. Apple refer to it as a “spatial computer”: a fully functioning computer in its own right, running a new operating system, visionOS, which can be controlled without a keyboard, trackpad, mouse or any other physical input device. It allows apps to run in three-dimensional space while you can still see the normal world around you. The quality of the displays, which closely correspond to human vision, also makes it possible to play back the new filmmaking format: AIV.
One of the biggest misconceptions about AIV is that it’s something we’ve seen before. Traditional filmmakers think it’s just VR; VR filmmakers think it’s just another iteration of what they’ve been doing for years. We’ve had some in the industry refuse to even put on a Vision Pro because they’re so convinced they know what they are going to see.

Inside the Vision Pro, AIV combines ultra-high resolution, an extreme FOV in which true 3D images and sound wrap around you through 180° both horizontally and vertically, real-world depth with zero distortion, and a high frame rate, all with no frame. For the first time, the viewer viscerally – in both the brain and the body – feels like they are inside the story.
How AIV Works

The technology of AIV relies on the end-to-end consistency of a precisely integrated system in which each individual component not only has a critical part to play, but also interacts with the effects of all the others.

To make its distinct visceral sensation of being inside the story work, AIV has two key secret sauces. The first is the ILPD, the Immersive Lens Projection Metadata. The Cine Immersive’s in-factory calibration connects each pixel to a point in the real world. This ILPD stays with the image all the way through to the Vision Pro, where all of the pixels are remapped out to their real-world position and adjusted for the individual user’s eyes in real time.
This is what’s referred to as “1:1 spatial world mapping”, where the image has no distortion or scale errors. When handled correctly throughout the workflow, it becomes seamless and invisible to the production team.
But when it’s missing or incorrectly applied, scale, depth and proportions will be wrong and there will be some degree of distortion, most noticeably in straight lines that will curve around the edges of an image. The ILPD is unique to each individual Cine Immersive camera, and is embedded in every file it records.
The second secret sauce is Static Foveation. Foveation is simply mimicking the way the human eye only sees full sharpness in the central 2-5 degrees of our vision. Our eyes actually jump rapidly to build the impression of a detailed whole, and AIV uses this to efficiently deliver sharp and detailed images.
The Vision Pro actually uses dynamic foveated rendering to deliver the highest resolution where the viewer is looking at any one moment. But AIV additionally uses static foveation to provide even higher resolution to the important areas of any individual shot – the areas of high perceptual salience, where viewers are most likely to look – and filmmakers can control these areas shot by shot. This all happens within a data rate that can still be streamed via current technology.
These two secret sauces, together with the technology of AIV, give it a clarity that goes far beyond traditional film or VR. AIV’s degree of accuracy differentiates it from any other existing format, and the biological realism for viewers, in their brains and bodies, creates a sense of presence and authenticity.
Apple Spatial Video is NOT AIV

AIV is not the same thing as Apple spatial video, which is more of a consumer-friendly 3D video format. While it is also designed for the Vision Pro, it is limited to HD 1080p at 30fps and sits within a traditional rectangular frame. It can be shot using either the Vision Pro’s built-in cameras or a recent model iPhone, which we’ve found fantastic for shooting behind-the-scenes material for our AIV films.
AIV is also NOT APMP
AIV is also distinct from APMP (Apple Projected Media Profile), which allows 180° and 360° videos from certain VR cameras, including Canon, GoPro and Insta360, to play in a wraparound mode within the Vision Pro by using the APMP metadata recorded by those cameras.
AIV is a Format Choice

There’s no suggestion that AIV will replace traditional formats – it’s just another choice for filmmakers, similar to the way we have always chosen between storytelling formats such as Academy or Scope. AIV will suit some stories and filmmaking styles more than others.
Why is AIV so Different?
The easiest way to visually comprehend how different AIV is, is to use Ben Allan ACS CSI’s graphs, which compare AIV to the progression of digital filmmaking formats over the last 30 years. We can use pixels per second to take in both spatial and temporal resolution.

We start with SDTV at 10 million pixels per second, the era of The West Wing TV series. HDTV, at 50 million pixels per second, is the era of Game Of Thrones. 2K film resolution, at 76 million, is the era of Lord Of The Rings and Skyfall. Now, Laser IMAX at 212 million pixels per second is how we saw Project Hail Mary.
But to go much further to include AIV, we need to recalibrate the scale.

On this new scale, we can see how vast the difference is: the URSA Cine Immersive captures a somewhat mind-boggling 10.5 billion pixels per second.
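
As a rough sanity check on these figures, pixels per second can be worked out as width × height × frames per second (doubled for a stereo pair). The sketch below is an illustration only. The frame sizes and frame rates are assumptions, chosen as typical values for each format rather than taken from the graphs, and the 8160 × 7200 per eye at 90fps used for the Cine Immersive is likewise assumed here.

```python
def pixels_per_second(width, height, fps, eyes=1):
    """Spatial resolution multiplied by frame rate (and by 2 for stereo)."""
    return width * height * fps * eyes


# Assumed (width, height, fps, eyes) for each format. These values are
# illustrative guesses, not figures stated in the article.
FORMATS = {
    "SDTV": (720, 576, 25, 1),
    "HDTV": (1920, 1080, 25, 1),
    "2K film": (2048, 1556, 24, 1),
    "Laser IMAX (4K)": (4096, 2160, 24, 1),
    "URSA Cine Immersive": (8160, 7200, 90, 2),
}

for name, (width, height, fps, eyes) in FORMATS.items():
    rate = pixels_per_second(width, height, fps, eyes)
    print(f"{name:20s} {rate / 1e6:>10,.1f} million pixels per second")
```

Under these assumptions the results land close to the figures quoted above: roughly 10, 52, 76 and 212 million for the framed formats, and a little over 10.5 billion for the Cine Immersive.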

It’s this mind-boggling amount of data that makes two really important things possible with AIV inside the Vision Pro. The first is the visceral degree of realism. The second – and even more important – is that it allows for comfortable extended viewing without the light-headedness, nausea and headaches which are common in live action VR systems. Even if you watch other content shot and converted to play inside the Vision Pro, you will still experience this discomfort. It’s only AIV that is precisely formatted for comfort.
VR FILMMAKING VS AIV FILMMAKING
AIV delivers 40 pixels per degree (PPD), which is the equivalent of resolution in traditional formats. PPD measures the pixel density in a single degree of arc of our field of view. Linear pixel measurements don’t carry the useful meaning they do for framed formats because the Immersive format is curved, rather than rectangular, and fades out at the edges, so the pixel count of the container format is of limited use in measuring what the audience actually sees. PPD more closely matches the functioning of the human eye, and can be measured using the same tools that are used to measure human eyesight – the LogMAR eye charts.
For example, you can’t measure the PPD of a 4K image in a cinema because the PPD changes depending on where you are sitting inside the cinema. The far left seat in the front row will have a different PPD to a middle seat on the right hand side, and a different PPD again if you’re seated in the middle of the back row.
But in AIV, the PPD for every viewer on every occasion is exactly the same and can be precisely calculated.

The Cine Immersive’s vertical resolution of 7200 pixels across a 180° FOV is 7200/180 = 40 PPD.
To put this extreme level of density and clarity in context, it is more than what is required to pass a driver’s license eye test. Someone with perfect 20/20 vision will see around 60 PPD in the central 2 degrees of their vision, and this drops off rapidly over around 20 degrees either side. The Vision Pro can achieve 60 PPD and this is what creates the visual perception of feeling like you are seeing an unfiltered view of the world.

In comparison, current consumer VR headsets peak around 25 PPD, but most VR systems are closer to 20 PPD or even less. For example, an 8K side by side 360° image doesn’t even reach 12 PPD.
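
The PPD arithmetic above is simple enough to sketch in a few lines. This is a minimal illustration, assuming that “8K side by side” means a 7680-pixel-wide container split into two 3840-pixel-wide eye images, each spanning the full 360°. The function name is ours, for illustration only.

```python
def pixels_per_degree(pixels_along_axis, fov_degrees):
    """Average pixel density across a field of view, in pixels per degree."""
    return pixels_along_axis / fov_degrees


# Cine Immersive: 7200 vertical pixels across a 180 degree field of view.
aiv_ppd = pixels_per_degree(7200, 180)

# Assumed 8K side-by-side 360 layout: 7680 pixels shared between two eyes,
# with each eye's 3840 pixels stretched around the full 360 degrees.
sbs_360_ppd = pixels_per_degree(7680 / 2, 360)

print(f"AIV: {aiv_ppd:.1f} PPD")
print(f"8K side-by-side 360: {sbs_360_ppd:.1f} PPD")
```

Under that assumption the side-by-side 360° image comes out at a little under 11 PPD, consistent with the “doesn’t even reach 12 PPD” comparison.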

It’s also important to note that there is no stitching in AIV and it’s the precision of a tightly integrated ecosystem that makes AIV so distinct from any other format.
TRADITIONAL FILMMAKING VS AIV FILMMAKING

Up until now, in traditional filmmaking, the viewer has watched the story play out through a frame, with the viewer outside the scene watching the story unfold. The size and shape of that frame has evolved over time, but the concept of telling a story within it has not.
By removing the frame, the AIV viewer experiences the story without this layer between them and the scene.

The challenges of AIV filmmaking lie in three dimensional composition, directing attention without a frame, understanding viewer physiology and comfort, editing with peripheral vision, and sound mixing where every sound is clearly identifiable.
There are many elements of AIV filmmaking which function the same way as they do in traditional formats, while many others are completely different. The trick is figuring out which familiar elements still work and where the new creative language begins.

Many of the constraints of AIV filmmaking are the result of the human body’s limitations. The boundaries are mostly not creative or technical problems; they’re about how much the brain and body can comfortably handle. In aviation, for example, airline pilots know their jets can descend much faster than passengers can tolerate.

Fortunately, these challenges can be understood through the science of perception and physiology. So just as in traditional filmmaking you learn how to master light, or choose a lens, or understand continuity, filmmaking in AIV also requires learning how to master perception and physiology.
Why Does AIV Matter?

The industry is going through one of the biggest structural shifts in its history. YouTube is now the world’s biggest streaming platform, influencers can sometimes pull larger audiences than the biggest movie stars, and you can now shoot and edit 4K on the cheapest iPads available. This democratisation of cameras and equipment has largely blurred the technological barrier between professionals and amateurs – what Michael Cioni referred to as “the moat” that protected the industry for decades, in his viral keynote “Is This The End of Hollywood” to the Hollywood Professional Association (HPA) – making an industry that has always been competitive even more so.
AIV requires professional skills and techniques. It is unlikely to become a consumer, or even a prosumer format in the foreseeable future, so skilled professionals will best be able to take advantage of the opportunities this new format is providing.

While traditional cinema has increasingly struggled to compete with the convenience and quality of home viewing and streaming, IMAX reigns as a premium, differentiated entertainment experience, with many blockbusters now getting upwards of 10-15% of their box office revenue from IMAX, despite it having only 1% of the screens.
The implications for AIV are very significant. Just as IMAX is the extreme version of cinema, AIV is the extreme version of streaming.
IMAX proved audiences will pay for large scale cinematic immersion in theatres, and we believe that AIV is to streaming what IMAX is to cinema.
AIV filmmaking is challenging, and does require many different ways of thinking, but these sorts of challenges aren’t new, and echo IMAX’s technological journey. IMAX film was phenomenally expensive, and until very recently the cameras were far too noisy to record dialogue on set. They were incredibly difficult to make films with, but this difficulty was also part of the format’s prestige. In the same way, the challenges of AIV deliver both the quality and the professional prestige that make AIV distinct from any other format.
What’s also significant for the immediate future is that AIV is relatively resistant to generative AI because of those 10.5 billion pixels per second and the format-specific requirements.
AIV FILMS
Apple have produced around 40 high-profile premium AIV films since the launch of the Vision Pro, all of which stream on Apple TV. These films favour short-form premium content, with most running between 6 and 14 minutes. These durations reflect the complexity of production – a short AIV film is as complex as a feature film in traditional filmmaking. They also reflect the fact that audiences will need to acclimatise to watching extended content in this format. This mirrors the experience with IMAX, which began with travelogues and documentary films but, over time, gradually moved into narrative drama, with Christopher Nolan’s feature film The Odyssey the first to be shot entirely in full IMAX.
AIV is also expanding. In February this year, Apple announced live Immersive sport, beginning with the NBA and live streaming of the Lakers. This adds to the collection of high profile sports documentaries already on the platform, and concerts like Metallica, Bono and BBC Proms. AIV is also being used in commercial or specialist applications including Cirrus aircraft, NASA’s Artemis 2 launch, and fashion houses such as Balenciaga. As recently as a couple of weeks ago, the world’s first cataract surgery using the Vision Pro was also successfully completed, so AIV films and Vision Pro experiences are showing the potential for enormous growth.
IT’S JUST THE BEGINNING
It’s important to remember it’s still very early in the technological journey of AIV. Formats often far outlive the hardware they were originally designed for. Take, for example, the now standard QuickTime format, which was introduced over 30 years ago, when it was installed from a floppy disk. While floppy disks are a piece of history now, the QuickTime format is still in widespread use today. AIV has potential for similar longevity beyond the current hardware. It is a future-proof format where the production and mastering all exceed any current display technology.

Where we are at with AIV has more in common with the Motorola Bag Phone in the developmental journey of the telephone. Someone looking at the Bag Phone back in the 1980s would have been hard pressed to have imagined that a phone could not only be a mobile but also be a computer, a music player, a Sat Nav, a camera and a video camera – and that all of this could fit in your pocket. In the 1980s it would have been equally difficult to believe that these devices would also be one day as ubiquitous as they are now.

The Vision Pro is vastly sleeker and more sophisticated than the Bag Phone. But just as the Bag Phone was an entirely new category of product, making the telephone portable for the first time, the Vision Pro has introduced a similarly new category of product with the concept of spatial computing. The Vision Pro is just the first device capable of playing AIV, so we can only dream of what technology might exist in the decades to come.
