DesignRush
Search API Docs Leaked: Did Google Lie All These Years?

[Source: Unsplash / Alex Dudar]
Article by Katherine Maclang
Published May 30, 2024 | Updated May 1, 2025

Nearly 2,600 documents describing Google’s Search API have been leaked to Rand Fishkin, co-founder of market and audience research software firm SparkToro, offering a glimpse of what may go on behind the tech giant’s closely guarded search operations.

Fishkin recently joined us on the DesignRush Podcast to discuss this and similar marketing and SEO topics.

Fishkin said an anonymous source contacted him via email on May 5. After several email exchanges, he learned through a video call that the person was an industry insider with whom he shares mutual professional acquaintances.

Here's my post breaking down the leak's source, my efforts to authenticate it, and early findings from the document trove: https://t.co/nmOD0fd5mN pic.twitter.com/yMxMrSeeLa

— Rand Fishkin (follow @randderuiter on Threads) (@randfish) May 28, 2024

The source chose Fishkin, who also co-founded Moz, because of his expertise and because he has long been vocal in calling for Google to be transparent about its operations.

Fishkin found the source to be “credible, thoughtful, and deeply knowledgeable,” but he continued to consult experts to gauge the authenticity of the documents.

The SparkToro founder then consulted three former Google employees who all concluded that the documents “look legit” based on their firsthand knowledge of the notation style and format of the company’s internal documents.

Fishkin also went to iPullRank Founder Mike King for his technical SEO expertise, and although 100% authenticity can’t be claimed, he believes that the leaked files “appear to be a legitimate set of documents from inside Google’s Search division.”

After the leaked documents were published on Monday, the source revealed himself to be Erfan Azimi, founder and director of SEO at digital marketing agency EA Eagle Digital.

According to his statement, the main reason for all of this is that “the truth needs to come out.”

Azimi also shared that the documents were given to him by a former member of Google’s search team who “respectfully declined” to reveal his identity due to the “sensitivity of the situation.”


What the Leaked Search API Docs Contain

The leaked files appear to stem from an incident in which API documentation was inadvertently made public on code-hosting platform GitHub; the documentation links to private repositories and internal Google corporate sites.

According to the commit history of the documentation Fishkin received, the code was uploaded to GitHub on March 27 and was not removed until May 7.

The search API documents are full of lines of code that only experts in technical SEO can make sense of.

“Think of this as instructions for members of Google’s search engine team. It’s like an inventory of books in a library, a card catalog of sorts, telling those employees who need to know what’s available and how they can get it,” Fishkin explained.

[Image: One of the leaked Google API modules, about Navboost | Source: Rand Fishkin]

According to King’s initial analysis, the 2,596 leaked modules, which contain 14,014 API features or attributes, reveal information about Google Search’s core systems and functionality, such as the following:

  • Web crawling system Trawler shows Google’s crawl queue, how it maintains crawl rates, and how it analyzes how often pages change
  • Alexandria as the core indexing system, SegIndexer for tier indexing, and TeraGoogle for indexing documents that live on disk long term
  • HtmlrenderWebkitHeadless as the rendering system for JavaScript pages (Chromium was also mentioned in the docs, so it’s likely that Google originally used WebKit and made the switch once Headless Chrome arrived)
  • LinkExtractor to extract links from pages and WebMirror for managing canonicalization and duplication

Although there are no specific details as to how exactly Google scores its search results and how it decides which ones go on its first search engine result page (SERP), the leaked files give an idea as to its ranking system.

Google uses Mustang as its primary scoring, ranking, and serving system; Ascorer as its primary ranking algorithm, which ranks pages before any re-ranking adjustments; and WebChooserScorer to define feature names in snippet scoring.

It then uses Navboost to re-rank results based on click logs of user behavior and FreshnessTwiddler to re-rank documents based on freshness.

The Twiddler framework is the search engine’s overlay system that controls re-ranking after the core algorithm has run. It includes Navboost, QualityBoost, RealTimeBoost, and WebImageBoost.

Before SERPs are served to users on the front end, Google has the following systems in place.

  • Google Web Server – the server that Google’s front end interacts with, receiving the payloads of data to display to the user
  • SuperRoot – the brain of Google Search that sends messages to Google’s servers and manages the post-processing system for re-ranking and presentation of results
  • SnippetBrain – the system that generates snippets for results
  • Glue – the system for pulling together universal results using user behavior
  • Cookbook – the system for generating signals, with indications that values are created at runtime

While the documents don’t reveal exact ranking factors or their weights, they provide a glimpse into Google’s ranking system.

As Mark Williams-Cook, co-owner of search agency Candour, pointed out, “Just because something is referenced in the API leak doesn't mean it's a ranking factor.”

What the Leaked Files Imply

According to Fishkin and King, the documents make clear that Google has made false statements in past years, gaslighting the industry about how its search engine operates.

First off, Google has stated several times in the past that its search engine doesn’t use Domain Authority, a metric that estimates the likelihood of a website's domain ranking in SERPs compared to other similar domains.

However, upon closer inspection of the leaked documents, King found that “as part of the Compressed Quality Signals that are stored on a per document basis, Google has a feature they compute called siteAuthority.”

Commenting on these findings, Kosta Hristov, CEO of link-building agency QGP, told DesignRush that "for more seasoned SEOs, this leak didn’t reveal anything new... We already knew that there’s Site Authority on the domain level and that it propagates down the pages."

"At QGP's SEO Academy, we regularly speak about similar topics," Hristov concluded.

Second, Google engineer Paul Haahr spoke in detail about live experiments with clicks at SMX West 2016, saying that it’s a mistake to use clicks as a ranking signal because they are heavily affected by bias.

But, as seen in the contents of the leaked files, Navboost, an algorithm that optimizes search results by analyzing user click patterns, is used as one of the metrics in Google’s ranking system.

Critics conclude that the tech giant tried to hide its use of Click-Through Rate (CTR), the ratio of clicks on a link or ad to the number of times it is shown, as a ranking signal.
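
As a quick illustration of the metric itself, CTR is simply clicks divided by impressions. The function name and the sample numbers below are made up for the example.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction; 0.0 when there were no impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


# Hypothetical example: 50 clicks out of 1,000 impressions.
ctr = click_through_rate(50, 1000)
print(f"{ctr:.1%}")  # prints "5.0%"
```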

[Image: A screenshot of Paul Haahr's statement about Google not using CTR as a ranking metric | Source: Mike King]

Third, when Vijay Kumar asked on Twitter how long it takes for Google to take new websites out of the sandbox, Google Senior Search Analyst John Mueller replied in a since-deleted tweet that “there is no sandbox.”

It’s not a secret that it takes a while for new websites to rank and come out high on SERPs, and the main theory is that Google places them in a sandbox for an unspecified period.

Again, the leak appears to contradict Google’s statement: the leaked PerDocData module contains a hostAge attribute that is used “to sandbox fresh spam in serving time.”

[Image: John Mueller's deleted tweet about the Google sandbox, in which he says that "there is no sandbox" | Source: Mike King]

And last, former Google engineer Matt Cutts reportedly said that the #1 search engine doesn’t use Chrome data for its organic ranking algorithm.

Yet two of the leaked documents, one linked to page quality scores and another connected to sitelink generation, contain Chrome-related attributes.

[Image: Bill Hartzer's post, titled 'Matt Cutts: Organic Algo Does Not Use Any Chrome Data', saying Matt Cutts told him that Google Search doesn't use Chrome data in ranking | Source: Bill Hartzer's post on Webmaster World]

After reviewing the leaked modules, both Fishkin and King concluded that Google lied about its search operations at least four times.

As both SEO experts are quick to point out, analyzing all of the 2,600 API documents will take some time, and they will post their findings in the future as they uncover more insights.

How This Affects Marketers

Valued at $68.27 billion in 2022 and projected by Emergen Research to increase to $157.41 billion in 2032, the global search engine market, which Google dominates, is booming.
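
For context, the yearly growth rate implied by those two figures can be worked out with the standard compound annual growth rate formula. This is a back-of-the-envelope calculation from the numbers quoted above, not a figure taken from the Emergen Research report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of USD for 2022 and 2032, as quoted above.
rate = cagr(68.27, 157.41, years=2032 - 2022)
print(f"{rate:.1%}")  # prints "8.7%"
```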

This shows how marketers rely heavily on SEO and ranking metrics to increase website traffic and brand visibility, as well as measure the success of their campaigns.

Knowing exactly how Google Search works would greatly help marketers develop effective initiatives. But Google’s apparent reluctance to be transparent suggests to critics that it has something to hide.

I don't think years of personal experience with seeing Google's algorithm respond completely opposite to what all the talking heads were saying is preconceived bias. They have been lying through their teeth since day one, and anyone with even basic SEO experience who was around…

— Greg Boser (@GregBoser) May 28, 2024

Maybe it’s about Google boosting its image as a fair company, prioritizing the quality of its results over profit.

Or maybe it’s something more sinister like protecting its monopoly while appearing to promote competition.

Whatever the case may be, SEO practitioners are going crazy over these leaked documents, with some having an “I knew it” reaction and others preferring to be cautious.

Google Releases Statement on Leaked Modules

On Thursday, Google finally released a statement to The Verge via email, confirming the authenticity of the leaked documentation. 

Despite admitting that the search API modules are genuine, Google questions their accuracy and relevance, with spokesperson Davis Thompson saying that they contain "out-of-context, outdated, or incomplete information."

“We’ve shared extensive information about how Search works and the types of factors that our systems weigh, while also working to protect the integrity of our results from manipulation,” Thompson added.

I do not like this, but it does support what the leaked API docs suggest Google's doing with rankings, so I guess.. yay?

¯\_(ツ)_/¯ https://t.co/lzTbDmZjFm

— Rand Fishkin (follow @randderuiter on Threads) (@randfish) May 29, 2024

It seems that Google is maintaining its stance that it didn't lie to marketers in the past to protect its search operations, clinging to the fact (or excuse?) that its algorithms are constantly being updated and changed.

Google could also claim that the features in question were only live experiments and were never used as ranking signals.

But now that the authenticity of the leaked modules has been confirmed, marketers may finally be able to help newer companies stand a chance in the ranking battle against big brands.

Expect updates in the coming days as more experts dissect and interpret the modules to reveal more of what goes on behind Google Search.

Katherine Maclang
B2B Editor

Katherine Maclang is an accomplished professional in journalism and marketing communication, with extensive experience working at top Philippine media company GMA Network. She has been published on Yahoo Finance, The European Business Review, and Benzinga. In film and TV, she has actively participated in the production of movies and series that have earned notable nominations and awards from prestigious international film festivals. Currently serving as a B2B Editor at DesignRush, she continues to make significant contributions to the AdTech world.
