Facebook announced this week that it has open-sourced Detectron, the company’s platform for computer vision object detection, which is built on a deep learning framework. The company says that its motive for opening up the project is to accelerate computer vision research, and that teams within Facebook are using the platform for a variety of applications, including augmented reality.

In my recent article detailing the three biggest challenges facing augmented reality today, I noted that real-time object classification was one of the biggest hurdles:

…it’s a non-trivial problem to get computer-vision to understand ‘cup’ rather than just seeing a shape. This is why for years and years we’ve seen AR demos where people attach fiducial markers to objects in order to facilitate more nuanced tracking and interactions.

Why is it so hard? The first challenge here is classification. Cups come in thousands of shapes, sizes, colors, and textures. Some cups have special properties and are made for special purposes (like beakers), which means they are used for entirely different things in very different places and contexts.

Think about the challenge of writing an algorithm which could help a computer understand all of these concepts, just to be able to know a cup when it sees it. Think about the challenge of writing code to explain to the computer the difference between a cup and a bowl from sight alone.

I also talked about how ‘deep learning’ techniques—which involve ‘training’ a computer to interpret what it sees, rather than programming detection by hand—are one potential answer to the problem of real-time object classification. Facebook this week has open-sourced their own object detection algorithm in a move which could accelerate development of systems capable of the sort of real-time object classification that could make augmented reality truly useful.

Augmented reality that actually interacts with the world around us without being pre-programmed for specific environments needs at least a cursory understanding of what’s in our immediate vicinity. For example, if you’re wearing AR glasses and want them to project the oven temperature above the oven, along with an AR list floating on your refrigerator to show what food you’re almost out of, your glasses need to know what an oven and a refrigerator look like. That is a tremendously challenging task given the wide range of ovens and refrigerators, and the places in which they reside.

What object classification looks like through the lens of a deep learning algorithm | Image courtesy Hu et al
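
To make the idea concrete, here is a rough Python sketch of what a detector's guesses might look like as data, and how an AR app could use them to decide where to float a label. This is not Detectron's actual API or output format. The `Detection` class, the `confident` and `label_anchor` helpers, the labels, scores, boxes, and the 0.7 threshold are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One guess from a hypothetical detector (not Detectron's real output format)."""
    label: str
    score: float  # confidence between 0 and 1
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def confident(detections: List[Detection], threshold: float = 0.7) -> List[Detection]:
    """Keep only the detections whose confidence meets the threshold."""
    return [d for d in detections if d.score >= threshold]


def label_anchor(det: Detection, margin: float = 10.0) -> Tuple[float, float]:
    """Return a point centered horizontally and just above the box.

    Image coordinates grow downward, so "above" means a smaller y value.
    """
    x_min, y_min, x_max, _ = det.box
    return ((x_min + x_max) / 2.0, y_min - margin)


# Made-up detections for a single frame of a kitchen scene.
frame = [
    Detection("oven", 0.93, (100.0, 300.0, 400.0, 600.0)),
    Detection("refrigerator", 0.88, (500.0, 50.0, 800.0, 650.0)),
    Detection("bowl", 0.41, (420.0, 280.0, 470.0, 310.0)),
]

# Print each confident detection with the point where its AR label would sit.
for det in confident(frame):
    print(det.label, label_anchor(det))
```

In this toy frame the low-confidence "bowl" guess is dropped, while the oven and refrigerator each get an anchor point for an overlay. A real system would also have to track those boxes from frame to frame and map them into 3D space, which this sketch leaves out.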

Facebook’s AI research team, among others, has been working on this problem of object detection by using deep learning to give computers the ability to reach conclusions about what objects are present in a scene. The company’s object detection algorithm, based on the Caffe2 deep learning framework, is called Detectron, and it’s now available for anyone to experiment with, hosted here on GitHub. Facebook hopes that open-sourcing Detectron will enable computer vision researchers around the world to experiment with and continue to improve the state of the art.

“The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research,” the project’s GitHub page reads.

The algorithms examine video input and are able to make guesses about which discrete objects make up the scene. Research projects like Detecting and Recognizing Human-Object Interactions (Gkioxari et al) have used Detectron as a foundation for understanding human actions performed with objects in an environment, a step in the right direction toward helping computers understand enough about what we’re doing to be able to offer valuable information on the fly.

Image courtesy Gkioxari et al

Detectron is also used internally by Facebook outside of AI research; “teams use this platform to train custom models for a variety of applications including augmented reality and community integrity,” the company wrote in the announcement of Detectron’s open-sourcing.

Exactly which teams would be using Detectron for augmented reality isn’t clear, but one obvious guess is Oculus, whose chief scientist, Michael Abrash, recently spoke at length about how and when augmented reality will transform our lives.


Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • daveinpublic

    I wonder if the government is using this right now to train computers to detect undesirable activities and alert them upon detection? I guess you can’t have progress without risks.

    • Foreign Devil

      Most likely. I know they are in China.

  • kalqlate

    Great for AR but also great for giving autonomous robots object awareness.

  • Cool for lots of experimental applications

  • Cat of Many Faces

    Always nice to see these sorts of things go open source. It really helps accelerate early technology.

    See: 3D Printing

  • This will be worse than “Eagle Eye” and “Enemy of the State” movies together. And is not joke, any techie advancements will not be worth that situation…

  • Mateusz Pawluczuk

    woah, that donut is massive