Overview
OpenSea is the World’s Leading Web3 Marketplace
The Problem
Detecting Digital Marketplace Deception
Trust and safety concerns are among the most significant barriers to welcoming new people into the Web3 ecosystem. While Web3 and NFT technology are starting to mature, users new to NFTs are still often deceived by fake content and need help distinguishing original NFTs from copymints (NFTs that duplicate or imitate popular NFTs). As a leader in the space, OpenSea was looking for a vendor to advance its detection and removal capabilities so that copymints and fraud could be identified and mitigated as early as possible.
The Solution
Real-Time Fraudulent NFT Detection on OpenSea’s Platform
OpenSea needed an industry-leading solution that could identify and handle a dynamic set of deceptive NFTs. Before working with Scale, the OpenSea team was early in its AI journey. Though the team already used rule-based systems to catch some forms of deception, attaining the speed, recall, and precision needed to address fraud effectively in the marketplace was a challenge. OpenSea approached Scale because of the team’s experience building customized models and its ability to ramp up quickly for customers.
Scale Content Understanding enhances data for better platform experiences by enriching, analyzing, and categorizing content. In testing, it proved robust against OpenSea’s high data volume of up to 50 million items a week, making Scale an effective partner for the problem.
With Scale Content Understanding, the team offered a fast-turnaround, models-as-a-service solution that determines whether a given NFT is a close match of another one. Scale’s ML models provided multiple layers of deduplication: real-time detection through API-based solutions, plus full catalog scans to remove historical scams.
Real-Time NFT Detection
When an NFT is minted, OpenSea needs to quickly detect whether it is a copymint and, if so, remove it from the site to reduce the chance of a user purchasing it. Scale Content Understanding facilitates this in two ways: 1) a real-time API and 2) a recurring batch job:
- The real-time API receives live traffic from items minted on OpenSea’s platform and items ingested via the blockchain, runs each item through Scale’s ML model, and returns the likelihood that it is a copymint. All of this occurs within a matter of seconds.
- The recurring batch job runs hourly to pick up any items that may have slipped past the real-time API.
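The interplay between the two paths can be sketched roughly as follows. This is a minimal illustration, not Scale’s or OpenSea’s actual implementation: `Item`, `CopymintPipeline`, `score_fn`, and the one-hour window parameter are all hypothetical names chosen for the example, and the model is replaced by a stand-in function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Item:
    item_id: str
    minted_at: float  # Unix timestamp in seconds (assumed representation)


@dataclass
class CopymintPipeline:
    # score_fn is a stand-in for the ML model: it returns a copymint likelihood.
    score_fn: Callable[[Item], float]
    scores: Dict[str, float] = field(default_factory=dict)

    def score(self, item: Item) -> float:
        """Real-time path: score one item and record the result."""
        likelihood = self.score_fn(item)
        self.scores[item.item_id] = likelihood
        return likelihood

    def run_batch(self, catalog: Iterable[Item], now: float,
                  window_s: float = 3600.0) -> List[str]:
        """Batch path: score recent items that the real-time path never saw."""
        caught = []
        for item in catalog:
            is_recent = now - item.minted_at <= window_s
            if is_recent and item.item_id not in self.scores:
                self.score(item)
                caught.append(item.item_id)
        return caught


pipeline = CopymintPipeline(score_fn=lambda item: 0.5)  # dummy model
first = Item("a", minted_at=1000.0)
second = Item("b", minted_at=1200.0)

pipeline.score(first)  # "a" arrives through the real-time path
missed = pipeline.run_batch([first, second], now=1500.0)  # "b" was missed
```

In this sketch the batch job is purely a safety net: it only scores items that have no recorded result, so items already handled in real time are not processed twice.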
To meet the scale and complexity of the problem, the Scale team trained custom deep learning image models that represent NFTs as embeddings on a manifold. In this embedding space, similar items lie close together, so copymints sit near each other, while items that are significantly different lie farther apart. Scale stored all of these NFT embeddings in a vector database and built systems for real-time querying and retrieval using k-nearest neighbors algorithms.
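The idea of nearest-neighbor lookup over embeddings can be illustrated with a small brute-force sketch. It makes several assumptions that the case study does not state: cosine distance as the similarity measure, a fixed distance threshold, and tiny hand-written three-dimensional vectors in place of real model embeddings. A production system would query a vector database, typically with an approximate index, rather than scanning every vector.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def k_nearest(query: Vector, index: Dict[str, Vector],
              k: int = 3) -> List[Tuple[str, float]]:
    """Brute-force k-nearest neighbors: the k closest items, nearest first."""
    scored = [(item_id, cosine_distance(query, vec))
              for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


def looks_like_copymint(query: Vector, index: Dict[str, Vector],
                        threshold: float = 0.05) -> bool:
    """Flag the query if its nearest neighbor is within the (assumed) threshold."""
    nearest = k_nearest(query, index, k=1)
    return bool(nearest) and nearest[0][1] <= threshold


# Toy index of "verified" items; the IDs and vectors are made up.
INDEX = {
    "original_ape": [0.9, 0.1, 0.0],
    "original_punk": [0.0, 1.0, 0.2],
    "original_cat": [0.1, 0.0, 1.0],
}

near_duplicate = [0.88, 0.12, 0.01]  # almost identical to original_ape
unrelated = [0.5, 0.5, 0.5]          # not close to any indexed item

closest_id = k_nearest(near_duplicate, INDEX, k=1)[0][0]
flag_duplicate = looks_like_copymint(near_duplicate, INDEX)
flag_unrelated = looks_like_copymint(unrelated, INDEX)
```

Here the near-duplicate vector resolves to `original_ape` and is flagged, while the unrelated vector is not, which mirrors the "similar items are nearby" property described above.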
“The real-time detection of fuzzy matches is really tricky for systems to get right, but I think Scale’s models have really nailed it.” - Charles Zaffaroni, Product Manager, OpenSea
Full Catalog Scans
Over time, OpenSea's operations team verifies more collections on their platform, and Scale's engineering team improves the model’s performance based on this feedback. These improvements are retroactively applied to the complete set of items to take down any previously missed fraudulent items. Scale Content Understanding provides the capability to run full catalog scans, scanning hundreds of millions of items with high precision over a short period of time.
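A retroactive scan of this kind can be pictured as re-scoring the whole catalog in chunks with the latest model. The sketch below is only illustrative: `full_catalog_scan`, `score_batch`, the chunk size, and the 0.9 threshold are hypothetical, and the scoring function is a toy stand-in for an updated model.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunks(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def full_catalog_scan(item_ids: Iterable[str],
                      score_batch: Callable[[List[str]], List[float]],
                      threshold: float = 0.9,
                      chunk_size: int = 1000) -> List[str]:
    """Re-score every item and return the IDs at or above the threshold."""
    flagged = []
    for chunk in chunks(item_ids, chunk_size):
        for item_id, score in zip(chunk, score_batch(chunk)):
            if score >= threshold:
                flagged.append(item_id)
    return flagged


def toy_score_batch(ids: List[str]) -> List[float]:
    # Stand-in for an updated model: IDs starting with "fake" score high.
    return [0.95 if item_id.startswith("fake") else 0.10 for item_id in ids]


catalog = ["a", "fake1", "b", "fake2", "c"]
flagged = full_catalog_scan(catalog, toy_score_batch, chunk_size=2)
```

Processing in chunks keeps memory use bounded regardless of catalog size, and each chunk could in principle be handed to a separate worker.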
The Result
Decreased Time to Deceptive NFT Takedown
After kicking off the project with Scale Content Understanding, OpenSea dramatically accelerated its copymint detection and today detects copyminted NFTs in real time. Scale provided model-based systems that process large volumes of data and surface signals to the OpenSea team. One critical measure of success for OpenSea was reducing the latency between an NFT being created on the platform and the bad content being identified and taken down.
The second measure of success was the ability to handle OpenSea’s large data volumes: Scale processes up to 50 million items a week with 95% average precision. By quickly detecting and removing inauthentic NFTs, OpenSea is able to improve user trust in its marketplace. Here’s to a safer OpenSea, and thus a safer Web3.