
From the Workshop · AI & Intelligent Systems · Data & Analytics

How We Found the Fake News King: An OSINT Investigation for NPR

Tracing a fake news story with 1.6 million views back to its source using OSINT, domain intelligence, and network analysis — for NPR Planet Money.

10 March 2026

In late 2016, a story appeared on a website called The Denver Guardian. The headline: "FBI Agent Suspected in Hillary Email Leaks Found Dead in Apparent Murder-Suicide." Within 10 days, the story had been viewed 1.6 million times. It was shared hundreds of thousands of times on Facebook. People cited it in arguments. It was presented as fact.

There was just one problem: none of it was real. There was no dead FBI agent. There was no murder-suicide. And there was no Denver Guardian — it wasn't a real newspaper. The domain had been registered weeks earlier. The "about" page contained placeholder text. The site had no editorial staff, no office, no history. It was a fabrication designed to look like a local news outlet, and it worked spectacularly well.

NPR wanted to find out who was behind it. That's where John Jansen came in.

The Starting Point

Open-source intelligence — OSINT — starts with what's publicly visible and works backwards. It doesn't require special access, classified databases, or law enforcement powers. It requires patience, methodical thinking, and an understanding of how the internet's infrastructure works.

The starting point was the domain itself: denverguardian.com. Every domain name has a registration record — WHOIS data — that includes information about when it was registered, who registered it, and sometimes their contact details. In 2016, WHOIS privacy services were common but not universal, and even privacy-protected registrations leak information if you know where to look.
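As a sketch of what this looks like in practice, a raw record can be pulled over the WHOIS protocol (a plain-text query over TCP port 43) with nothing but the Python standard library, then parsed for the fields a privacy service doesn't hide. The server below is the real registry WHOIS server for .com domains; the parser is a minimal sketch, not a full WHOIS client, and field names vary between registries.

```python
import socket


def whois_query(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Raw WHOIS lookup over TCP port 43 (Verisign serves .com/.net)."""
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall((domain + "\r\n").encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")


def parse_whois(raw: str) -> dict:
    """Pull out what privacy services don't hide: dates, registrar, nameservers."""
    fields = {"nameservers": []}
    for line in raw.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "creation date":
            fields["created"] = value
        elif key == "registrar":
            fields["registrar"] = value
        elif key == "name server":
            fields["nameservers"].append(value.lower())
    return fields
```

Even a privacy-protected record run through `parse_whois` yields three usable data points: when the domain appeared, which registrar was used, and where its DNS is hosted.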

The Denver Guardian's WHOIS record was privacy-protected, which was informative in itself: whoever set this up knew enough to use a privacy service, which meant they had done this before or at least understood the basics of operational security. But WHOIS privacy protects the registrant's name and contact details — it doesn't hide the registration date, the registrar used, or the nameservers the domain points to. Each of these is a data point.

Domain Intelligence

The next step was understanding the infrastructure. Every website lives on a server, and that server has an IP address. Reverse DNS lookups and reverse IP lookups reveal what other domains share the same server. This is where fake news networks start to reveal themselves.
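A full reverse IP lookup needs a passive DNS provider, but the core idea, grouping domains by the server they resolve to, can be sketched in a few lines. The resolver is injectable so the grouping logic can be tested without touching the network; the domain names in the usage example are hypothetical.

```python
import socket
from collections import defaultdict


def group_by_ip(domains, resolve=socket.gethostbyname):
    """Group domains by resolved IP; a shared IP is a (weak) co-hosting signal."""
    groups = defaultdict(list)
    for domain in domains:
        try:
            groups[resolve(domain)].append(domain)
        except OSError:
            groups["unresolved"].append(domain)
    # Only IPs hosting more than one of the candidate domains are interesting
    return {ip: names for ip, names in groups.items() if len(names) > 1}
```

With a stub resolver, `group_by_ip(["a.example", "b.example", "c.example"], resolve=lookup)` returns only the IPs shared by two or more candidates, which is exactly the shortlist worth investigating further.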

The Denver Guardian's server hosted other domains. Some of these were also fake news sites — sites with names designed to sound like local news outlets, publishing fabricated or heavily distorted stories optimised for social media sharing. This wasn't a lone wolf with one website. It was a network.

John mapped the network by following connections outward from The Denver Guardian. Shared hosting infrastructure was the first connection. If two sites share a server, they might be run by the same person — or they might just be on the same shared hosting provider, which millions of sites use. A single shared server proves nothing on its own. But when shared hosting is combined with other signals — similar site designs, shared advertising accounts, overlapping content, similar registration patterns — the picture gets clearer.

Advertising accounts were particularly useful. Fake news sites in this era were primarily monetised through programmatic advertising — Google AdSense and similar networks. Each advertising account has an identifier that's embedded in the site's HTML. If two sites share the same advertising account ID, they're almost certainly run by the same entity. This is a much stronger signal than shared hosting, because advertising accounts are tied to bank accounts and tax identities. Nobody shares an ad account with strangers.
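As an illustration, AdSense publisher IDs appear in a page's source as `ca-pub-` followed by a numeric account ID, so extracting and cross-referencing them is straightforward pattern matching. The domains and IDs below are made up for the example.

```python
import re
from collections import defaultdict

# AdSense publisher IDs as they appear in page HTML (e.g. data-ad-client
# attributes or google_ad_client assignments)
AD_ID = re.compile(r"ca-pub-\d+")


def shared_ad_accounts(pages: dict) -> dict:
    """Map each ad account ID to the sites embedding it.

    `pages` maps domain -> raw HTML. An ID appearing on two or more
    sites is a strong common-ownership signal.
    """
    owners = defaultdict(set)
    for domain, html in pages.items():
        for pub_id in set(AD_ID.findall(html)):
            owners[pub_id].add(domain)
    return {pid: sites for pid, sites in owners.items() if len(sites) > 1}
```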

Content Fingerprinting

Beyond infrastructure, John looked at content patterns. Fake news sites in a network often share content — either identical articles published on multiple sites, or articles with minor variations (different headlines, slightly reworded opening paragraphs, same core fabrication). Content fingerprinting — identifying articles that are substantially similar despite surface-level differences — can link sites that don't share any visible infrastructure.

The techniques here are similar to plagiarism detection: compute text similarity metrics across articles from different sites, identify clusters of near-duplicate content, and map the relationships. In this case, the content sharing patterns aligned with the infrastructure connections already identified, which strengthened the case that these sites were part of a coordinated network rather than independent operations that happened to share hosting.
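A minimal version of this — word-level shingling plus Jaccard similarity, one common approach to near-duplicate detection — looks like the sketch below. Where to set the threshold for calling two articles "the same story" is a judgment call and isn't shown.

```python
import re


def shingles(text: str, k: int = 5) -> set:
    """Sets of k-word shingles; word-level shingling survives light rewording."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of the two shingle sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Two articles that share a core fabrication but differ in their headlines and opening sentences still share most of their shingles, so they score high against each other and near zero against unrelated text.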

John also looked at publishing patterns. When were articles posted? How quickly did new sites appear? Was there a consistent editorial pattern that suggested a single operator or a small team? The timing data showed a pattern consistent with one person or a very small group managing multiple sites — not a large operation, but a disciplined one.
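A first pass at this kind of timing analysis can be as simple as a histogram of posting hours: a single operator tends to post within one band of working hours, while a distributed team smears across the clock. The sketch below assumes timestamps arrive as ISO 8601 strings.

```python
from collections import Counter
from datetime import datetime


def posting_hours(timestamps) -> Counter:
    """Count posts per hour of day across all sites in the network.

    A tight band of active hours is consistent with a single operator
    (or one timezone); a flat distribution suggests multiple operators.
    """
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
```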

Following the Trail

The network mapping revealed a cluster of interconnected sites that all pointed back to an entity called Disinfomedia. This wasn't a registered company in any traditional sense — it was a label, a name used across several domain registrations and advertising accounts. But it was consistent enough to serve as an identifier.

From Disinfomedia, the trail led to a series of additional data points: email addresses associated with domain registrations (even privacy-protected registrations sometimes leak historical data through cached WHOIS records or through data breaches that include registration emails), social media accounts linked to those email addresses, and business registrations in public records.

Each step in the chain involved the same basic process: take what is known, search for connections to things not yet known, verify each connection through multiple independent sources, and only move forward with confidence. False positives are easy to generate in OSINT work — the internet is full of coincidental connections, shared names, and reused infrastructure. The discipline is in verification, not discovery.

The Identification

The trail led to Jestin Coler, a man living in the suburbs of Los Angeles. Coler, it turned out, was running Disinfomedia as a one-person operation, creating and managing a network of fake news sites that collectively generated significant traffic and advertising revenue. He wasn't a foreign agent or a political operative — he was an entrepreneur who had found a profitable niche in manufacturing viral misinformation.

The identification was based on the convergence of multiple independent evidence streams: domain registration data, advertising account connections, business records, social media accounts, and content patterns. No single piece of evidence was conclusive on its own. The confidence came from the convergence — when five independent lines of investigation all point to the same person, you can be reasonably certain.

When NPR's Laura Sydell contacted Coler, he confirmed it. He was remarkably open about what he had been doing and why. That interview became the basis for NPR Planet Money Episode 739, "Finding the Fake News King", which explored the economics and mechanics of misinformation in the 2016 election cycle. NPR also published a companion article, "We Tracked Down A Fake-News Creator In The Suburbs. Here's What We Learned".

What Made This Technically Interesting

Strip away the specific context — fake news, elections, misinformation — and what's left is a network analysis problem. There is a set of nodes (websites) with observable properties (hosting infrastructure, advertising accounts, content, registration data). There are connections between nodes (shared infrastructure, shared accounts, content similarity). And the objective is to identify the structure of the network and trace it back to its operator.

This is the same analytical framework that applies to fraud detection (mapping networks of shell companies), cybersecurity (tracing attack infrastructure back to threat actors), competitive intelligence (understanding a competitor's digital footprint), and due diligence (verifying that a business partner is who they claim to be). The specific tools and data sources change, but the methodology is consistent: enumerate observable properties, map connections, identify patterns, verify through convergence.
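The "map connections, identify structure" step is, concretely, a connected-components computation: any pair of sites linked by a shared signal (server, ad account, near-duplicate content) gets an edge, and the clusters fall out. A minimal union-find sketch, with hypothetical edges:

```python
from collections import defaultdict


def components(edges):
    """Connected components via union-find: sites linked directly or
    transitively by any shared signal end up in the same cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)  # union the two components

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return sorted(clusters.values(), key=len, reverse=True)
```

Feeding in edges from each signal type separately also shows *why* a cluster holds together — a cluster linked only by shared hosting deserves far less confidence than one linked by hosting, ad accounts, and content all at once.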

The investigative process is also fundamentally about pattern recognition — the same cognitive skill that underpins machine learning, but applied manually with human judgment about which patterns are meaningful and which are coincidental. There's an irony in the fact that the same analytical thinking that powers AI systems was, in this case, applied by a human to investigate the misuse of digital platforms that AI would later be tasked with policing.

The Broader Impact

The NPR story was one of the earliest detailed investigations into the mechanics of fake news production. It landed at a moment when the world was just beginning to grapple with the scale of the misinformation problem and the role that platforms like Facebook played in amplifying fabricated content.

The conversation that followed — about platform responsibility, algorithmic amplification, media literacy, and the economics of misinformation — is still ongoing. The technical landscape has changed (AI-generated content has made the production side easier while making detection harder), but the fundamental dynamics haven't. Misinformation is profitable. Platforms optimise for engagement. And the infrastructure of the internet makes it cheap and easy to create convincing-looking sources from scratch.

Why These Skills Matter Today

OSINT, network analysis, and data investigation are more relevant now than they were in 2016. The applications have expanded well beyond journalism and law enforcement.

Due diligence is one of the most common use cases. Before entering a business relationship, acquiring a company, or making a significant investment, verifying that the other party is who they say they are, that their business is what it appears to be, and that there aren't hidden risks is essential. OSINT techniques — domain analysis, corporate registry searches, social media investigation, litigation record searches — can surface information that traditional due diligence processes miss.

Fraud detection uses many of the same network analysis techniques. Fraudulent operations, like fake news networks, tend to share infrastructure, reuse identities, and create patterns that are visible if you know where to look. The ability to map connections between entities — companies, domains, individuals, financial accounts — and identify suspicious patterns is valuable in insurance, banking, e-commerce, and government.

Competitive intelligence is another application. Understanding a competitor's digital infrastructure, their advertising strategy, their content patterns, and their market positioning is all achievable through open-source investigation. The line between competitive intelligence and corporate espionage is important — OSINT, by definition, uses only publicly available information — but within that boundary, there's a wealth of actionable insight available.

New Zealand Relevance

New Zealand businesses operate in a high-trust environment, which is generally a good thing but also creates vulnerability. The same trust that makes business relationships efficient also makes due diligence feel unnecessary — until something goes wrong. John has seen New Zealand companies enter partnerships, make investments, and sign contracts based on far less verification than the stakes warranted.

The techniques John used to find the Fake News King are directly applicable to New Zealand business contexts. Verifying a potential partner's claims about their track record. Investigating the provenance of a suspicious approach. Understanding the competitive landscape in a new market. Assessing the legitimacy of an online business before acquiring it.

These aren't exotic capabilities reserved for intelligence agencies. They're practical skills built on publicly available data and systematic analysis. The internet leaves traces, and those traces tell stories — if you know how to read them.

The question for New Zealand businesses isn't whether these techniques are relevant. It's whether you're using them, or whether you're relying on trust alone in an environment that increasingly rewards verification.

Want to discuss this?

We write about what we're actually working on. If this is relevant to something you're building, we'd love to hear about it.