Intel Researchers Give ‘GTA V’ Photorealistic Graphics, Similar Techniques Could Do the Same for VR


Researchers from Intel’s Intelligent Systems Lab have revealed a new method for enhancing computer-generated imagery with photorealistic graphics. Demonstrated with GTA V, the approach uses deep learning to analyze frames rendered by the game and then synthesize new, more realistic frames informed by a dataset of real images. While the technique in its current research state is too slow for real gameplay, it could represent a fundamentally new direction for real-time computer graphics.

Despite being released back in 2013, GTA V remains a pretty darn good-looking game. Even so, it’s far from what would truly fit the definition of “photorealistic.”

Although we’ve been able to create truly photorealistic pre-rendered imagery for quite some time now, doing so in real time is still a major challenge. While real-time ray tracing takes us another step toward realistic graphics, there’s still a gap between even the best-looking games today and true photorealism.

Researchers from Intel’s Intelligent Systems Lab have published research demonstrating a state-of-the-art approach to creating truly photorealistic real-time graphics by layering a deep-learning system on top of GTA V’s existing rendering engine. The results are quite impressive, showing stability that far exceeds that of similar methods.

In concept, the method is similar to NVIDIA’s Deep Learning Super Sampling (DLSS). But while DLSS is designed to ingest an image and then generate a sharper version of the same image, the method from the Intelligent Systems Lab ingests an image and then enhances its photorealism by drawing from a dataset of real-life imagery—specifically a dataset called Cityscapes, which features street-view imagery captured from the perspective of a car. The method creates an entirely new frame by drawing on the features of the dataset that best match what’s shown in the frame originally rendered by the GTA V game engine.
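To make that contrast concrete, here’s a minimal PyTorch sketch using hypothetical frame sizes; a plain bilinear resize and an identity pass stand in for the learned networks, so this is not code from either system:

```python
import torch
import torch.nn.functional as F

# DLSS-style super sampling: low-resolution frame in, higher-resolution
# frame out (bilinear resize is a crude stand-in for the learned upscaler).
low_res = torch.rand(1, 3, 540, 960)
upscaled = F.interpolate(low_res, scale_factor=2, mode="bilinear",
                         align_corners=False)
print(upscaled.shape)  # torch.Size([1, 3, 1080, 1920])

# The Intel ISL method: full-resolution frame in, full-resolution frame out.
# Only the appearance changes, steered toward the statistics of the real
# Cityscapes footage it was trained on (identity is a placeholder here).
frame = torch.rand(1, 3, 1080, 1920)
enhanced = frame.clone()
print(enhanced.shape)  # torch.Size([1, 3, 1080, 1920])
```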

An example of a frame from GTA V after being enhanced by the method | Image courtesy Intel ISL

This ‘style transfer’ approach isn’t entirely new, but what is new here is the integration of G-buffer data—created by the game engine—as part of the image synthesis process.

An example of G-buffer data | Image courtesy Intel ISL

A G-buffer is a representation of each game frame that includes information like depth, albedo, surface normals, and object segmentation, all of which is used in the game engine’s normal rendering process. Rather than looking only at the final frame rendered by the game engine, the method from the Intelligent Systems Lab looks at all of the extra data available in the G-buffer to make better guesses about which parts of its photorealistic dataset it should draw from in order to create an accurate representation of the scene.
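A rough sketch of what that conditioning might look like in practice, assuming a toy convolutional network and made-up channel counts (the researchers’ actual architecture is considerably more sophisticated):

```python
import torch
import torch.nn as nn

class GBufferEnhancer(nn.Module):
    """Toy image-to-image network conditioned on G-buffer channels."""

    def __init__(self, gbuffer_channels: int = 8):
        super().__init__()
        # 3 RGB channels from the rendered frame plus G-buffer channels,
        # e.g. 1 depth + 3 albedo + 3 normals + 1 segmentation label.
        in_channels = 3 + gbuffer_channels
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),  # enhanced RGB out
        )

    def forward(self, frame: torch.Tensor, gbuffer: torch.Tensor) -> torch.Tensor:
        # frame: (N, 3, H, W); gbuffer: (N, gbuffer_channels, H, W).
        # Stacking the two lets every convolution see geometry and material
        # cues alongside the shaded pixels.
        return self.net(torch.cat([frame, gbuffer], dim=1))

# Dummy tensors standing in for one 720p game frame and its G-buffer:
enhancer = GBufferEnhancer()
frame = torch.rand(1, 3, 720, 1280)
gbuffer = torch.rand(1, 8, 720, 1280)
enhanced = enhancer(frame, gbuffer)  # shape: (1, 3, 720, 1280)
```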

Image courtesy Intel ISL

This approach is what gives the method its great temporal stability (moving objects look geometrically consistent from one frame to the next) and semantic consistency (objects in the newly generated frame correctly represent what was in the original frame). The researchers compared their method to other approaches, many of which struggled with those two points in particular.


While the method currently runs at what the researchers—Stephan R. Richter, Hassan Abu AlHaija, and Vladlen Koltun—call “interactive rates,” it’s still too slow for practical use in a videogame today (hitting just 2 FPS on an Nvidia RTX 3090 GPU). In the future, however, the researchers believe the method could be optimized to work in tandem with a game engine (instead of on top of it), which could speed the process up to practically useful rates—perhaps one day bringing truly photorealistic graphics to VR.

“Our method integrates learning-based approaches with conventional real-time rendering pipelines. We expect our method to continue to benefit future graphics pipelines and to be compatible with real-time ray tracing,” the researchers conclude. […] “Since G-buffers that are used as input are produced natively on the GPU, our method could be integrated more deeply into game engines, increasing efficiency and possibly further advancing the level of realism.”



  • Years away from us (gamers) seeing it in use, but still fun to see the future.

  • Ad

    Is it just me or does this look real in the sense that it looks like a VHS tape?

We could be using photogrammetry in games more right now; it’s already possible. Most games have static environments, so just use photogrammetry like the Valve lobby in SteamVR Home. It looks real, actually like a photo.

    • kontis

No, it looks real because it generates materials that surpass offline-rendered, one-frame-per-hour path-traced pictures. Look at the specular highlights on glass and car paint. It’s identical to real life, better than the CG in $200M+ movies.

The tint comes from their dataset (automotive cameras from Germany). Change the cameras and captured videos and you’ll get results in a different style.

We could be using photogrammetry in games more right now

It’s been used heavily in games for a decade. Usually you don’t notice, because these assets are heavily modified to be usable in synthetic lighting and shading. The type of raw photogrammetry you like is not very usable in games because all the lighting is captured in it, so it’s more like being in a photoscan that cannot change in any way (the opposite of what video games are) than in an actual simulated virtual world. Games are dynamic, and artists also want their own lighting and a consistent style.

      • Ad

“so it’s more like being in a photoscan that cannot change in any way”

        Like the majority of VR games?

        “The tint comes from their data set”

        Interesting.

See my post, but I could see the process discussed in the paper being used to de-light, fill holes, and create the PBR texture layers that otherwise require a very expensive photogrammetry setup (or a ton of manpower) before photogrammetry assets can be used in a game engine. I am putting in a lot of hours in “Ingenuity in VR” to de-light, add PBR textures, and color-regrade my photogrammetry-derived landing site (and rocks/soil) to match other “Mars” textures and lighting within Unreal Engine. Boy, would a learning system that took in a ton of accurate images from Mars and created a kernel for this process be time-saving.

    • Cragheart

Photogrammetry is useful only for “realistic”-looking games. What about all the other games? I don’t want to see realism everywhere. Copying this world with photos to put inside games is not very creative.

      • Ad

Maybe, but there’s room for it, and a lot of games do try to be generally realistic.

      • Stephen Gower

        If a developer is trying to make a game and doesn’t want the realistic look that photogrammetry provides, one idea worth considering is to not use photogrammetry when they make the game. Just a thought :) :)

      • Timothy S (Sergentti)

        What if it is trained on a cartoony art style?

    • Charles

      There’s already a realtime GTA mod that makes it look at least as realistic as this – maybe even more realistic:
      https://youtu.be/ENb6oKvbW_E?t=198

      • david vincent

        Not bad but too colorful to be realistic to me.

        • HenrikSmedsrud

          I think the added color is actually better than real life.

    • Aaron

It does kind of look like a dashcam. Which got me thinking: could you imagine this in a found-footage type game where you look through a camera? Blair Witch, Outlast... man, this shit would be perfect.

    • Jonathan Winters III

      Yeah, because it was filmed with a dashcam, and with incorrect color temperature settings among other deficiencies.

  • GunnyNinja

I would love to see this in The Division 2, which uses a real 1:1 representation of DC. I’ve compared locations in the game, and they are nearly indistinguishable from Google Street View shots of the same location.

In reading the paper, I can definitely see its use in something others are commenting about, which is photogrammetry. Most photogrammetry is great at capturing the albedo layer, but unless you have a sophisticated lighting array at different wavelengths, many of the G-buffer layers described in the paper don’t exist. Here I think a convolutional learning system could work as well, including de-lighting, filling in holes, and creating the other layers so that you can use the result as a PBR-textured asset. Then I can see a convolutional kernel being used to manipulate the G-buffer on output to create different environment conditions. Think of it as a new form of GL that we now use HDRI spheres for.

  • Andrew Jakobs

I don’t think all games should be photorealistic, even games like GTA. Hell, especially games like GTA.

Yeah, games should still look like games; if I want realism, I’ll go outside.

  • Charles

    There’s already a realtime GTA mod that makes it look at least as realistic as this – maybe even more realistic:
    https://youtu.be/ENb6oKvbW_E?t=198

    • Sven Viking

      Each has somewhat different strengths, but that is a very nice mod.

  • Wild Dog

    That is pretty amazing.

  • Brettyboy01

    Battlefield 6 would be crazy.

  • jbob4mall

But utterly lifeless. Weather effects, animations, cloth movement, NPC interactions, physics, etc. It’s these little things that are important and make the game world feel alive. A cartoon-looking game that has them can feel more real than a realistic-looking game that doesn’t.

    • david vincent

Lifeless? Looks like you never played GTA V.
      If you want a lifeless world with almost no interaction, look at Cyberpunk 2077…

      • Stiff and static is what comes to mind when thinking of Cyberpunk 2077.

Impressive research :O

  • The CAT

Another pay-me-monthly mod…

  • brubble

    Impressive.