More than half a billion Instagram users visit Instagram Explore every month to discover photos, videos, livestreams, and Stories. The AI-based recommendation engine, which sorts through and curates billions of pieces of content that appear on Instagram, faced huge technical challenges because it had to scale massively and operate in real time.
Facebook recently revealed the inner workings of Explore. It uses a three-part ranking funnel, which was designed with a custom query language and modeling techniques. The ranking engine extracts about 65 billion features and makes 90 million model predictions every second! Now that is performance.
Before it began building a content recommendation system, the Explore team built tools to conduct large-scale experiments and obtain strong signals on the breadth of users’ interests. The first of these tools was IGQL, a meta language designed to scale well.
IGQL is both statically validated and high-level, which allows engineers to write recommendation algorithms in a “Python-like” fashion. It also complements a component that helps identify topically similar profiles as part of a retrieval pipeline focused on account-level information.
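To give a feel for what a “Python-like” candidate query might look like, here is a minimal sketch in plain Python. It is not IGQL and does not reproduce its syntax: the `CandidateQuery` class, its `filter`, `rank`, and `limit` methods, and the sample `media` records are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CandidateQuery:
    """Hypothetical chainable query over candidate items (not real IGQL)."""
    items: List[dict]

    def filter(self, predicate: Callable[[dict], bool]) -> "CandidateQuery":
        # Keep only the items for which the predicate holds.
        return CandidateQuery([i for i in self.items if predicate(i)])

    def rank(self, score: Callable[[dict], float]) -> "CandidateQuery":
        # Order items from highest to lowest score.
        return CandidateQuery(sorted(self.items, key=score, reverse=True))

    def limit(self, n: int) -> "CandidateQuery":
        # Keep the first n items in the current order.
        return CandidateQuery(self.items[:n])


# Made-up sample records; "quality" stands in for any model score.
media = [
    {"id": 1, "quality": 0.9},
    {"id": 2, "quality": 0.3},
    {"id": 3, "quality": 0.7},
    {"id": 4, "quality": 0.8},
]

result = (
    CandidateQuery(media)
    .filter(lambda m: m["quality"] >= 0.5)
    .rank(lambda m: m["quality"])
    .limit(2)
)
ids = [m["id"] for m in result.items]
```

Each step returns a new query object, so the stages read top to bottom in the order they run, which is the kind of readability a high-level query language aims for.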
To predict the most relevant content for each person, a lightweight ranking distillation model preselects candidates before passing them to more complex ranking models. Drawing on knowledge from those more complicated models, the simpler model is trained to approximate the main ranking models as closely as possible via direct (and indirect) learning.
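The idea of a cheap model learning to imitate an expensive one can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not Instagram's implementation: the random `features`, the `teacher_score` function standing in for the main ranking models, and the linear least-squares student are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 candidate items with 8 features each.
features = rng.normal(size=(500, 8))


def teacher_score(x):
    """Stand-in for the expensive main ranking model (an assumption)."""
    w = np.linspace(1.0, 0.2, x.shape[1])
    return np.tanh(0.5 * (x @ w)) + 0.1 * x[:, 0] * x[:, 1]


def with_bias(x):
    # Append a constant column so the linear student can learn an offset.
    return np.hstack([x, np.ones((len(x), 1))])


# Distillation step: fit a cheap linear student to the teacher's scores.
teacher = teacher_score(features)
student_w, *_ = np.linalg.lstsq(with_bias(features), teacher, rcond=None)


def student_score(x):
    return with_bias(x) @ student_w


def preselect(x, k):
    """Indices of the top-k candidates according to the cheap student."""
    return np.argsort(-student_score(x))[:k]


# Funnel: the student narrows 500 candidates to 50, then the teacher
# re-ranks only that shortlist.
shortlist = preselect(features, k=50)
final_order = shortlist[np.argsort(-teacher_score(features[shortlist]))]
```

The expensive model scores only the 50 shortlisted items instead of all 500, which is the cost saving that motivates placing a distilled model at the front of the funnel.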
The team also had to account for age-appropriateness: signals are used to filter out anything that might not be eligible (safe and appropriate) to recommend. Algorithms detect and filter spam and other such content, typically before an inventory is built for each user. Facebook says that over 99% of child nudity and exploitation posts were deleted over the past year.