Scout at the source. EDGAR Form D plus ATS feeds, no news scraping.

A two-signal scout stack for B2B SaaS outbound. Funding from EDGAR, scale from Greenhouse / Lever / Ashby. TechCrunch demoted to enrichment.

The cold email pipeline post covered the shape of the system. This post is about the scout layer specifically, and why the right move in 2026 is to stop scraping news sites and go to the source-of-truth endpoints that the news sites are themselves scraping.

We just stood up a poller against the EDGAR endpoint and confirmed it works. A query covering the last 4 days returned 544 Form D filings. That is the funding firehose at the source, no journalist in the middle.

The two signals worth scouting

For B2B SaaS outbound to RevOps buyers, only two signals actually move reply rates.

  1. Funding signal. The company raised recently and has fresh budget.
  2. Scale signal. The company is hiring revenue roles, which means the buyer (Head of RevOps, VP Sales, CRO) is feeling the pain you can fix.

Every other signal (press mentions, podcast appearances, LinkedIn posts) is a lagging proxy for one of those two.

So scout for those two directly. Skip the proxies.

Signal 1. EDGAR Form D

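A US private company that raises capital under a Regulation D exemption has to file a Form D with the SEC within 15 days of the first sale. Rounds raised under other exemptions do not produce a Form D, so the feed is broad but not complete. EDGAR full-text search exposes those filings through a public JSON endpoint that needs no API key, though the SEC asks automated clients to send a descriptive User-Agent and stay within its fair-access rate limits.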

GET https://efts.sec.gov/LATEST/search-index?forms=D
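
A minimal sketch of building that request in Python without sending it. The date-range parameter names (dateRange, startdt, enddt) are assumptions taken from the URLs the EDGAR full-text search page generates, and the Elasticsearch-style response shape read by filing_ids is likewise an assumption to check against a live response. The User-Agent string is a placeholder.

```python
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://efts.sec.gov/LATEST/search-index"


def build_form_d_request(today, lookback_days, user_agent):
    """Build (but do not send) a request for Form D filings in a date window."""
    params = {
        "forms": "D",
        # Assumed parameter names; verify against the search page's own URLs.
        "dateRange": "custom",
        "startdt": (today - timedelta(days=lookback_days)).isoformat(),
        "enddt": today.isoformat(),
    }
    return Request(
        SEARCH_URL + "?" + urlencode(params),
        headers={"User-Agent": user_agent},  # identify the client to the SEC
    )


def filing_ids(payload):
    """Pull hit ids from a response, assuming an Elasticsearch-style body."""
    return [hit["_id"] for hit in payload.get("hits", {}).get("hits", [])]


# Placeholder contact details; use a real name and email in practice.
request = build_form_d_request(date(2026, 1, 10), 4, "Example Scout admin@example.com")
print(request.full_url)
```

Passing the request to urllib.request.urlopen and feeding the decoded JSON to filing_ids would complete the poll; that step is left out so the sketch runs offline.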

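Poll every 15 minutes. Filter to software and adjacent SIC codes (7370, 7371, 7372, 7379, plus 7389, the catch-all for business services not elsewhere classified, which will also let some non-software issuers through). Fetch each filing’s primary_doc.xml and extract: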

  • Issuer name and address
  • Total offering amount
  • Amount sold to date
  • Related persons (executives and directors, full names)
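
The extraction can be sketched with the standard library XML parser. The element names below (primaryIssuer, issuerAddress, relatedPersonName, totalOfferingAmount, totalAmountSold) are assumptions based on the general layout of Form D XML, and the sample document is made up for illustration; check both against a real primary_doc.xml before relying on them.

```python
import xml.etree.ElementTree as ET

# Invented sample that mimics the assumed Form D layout.
SAMPLE = """<edgarSubmission>
  <primaryIssuer>
    <entityName>Example Software Inc</entityName>
    <issuerAddress>
      <street1>1 Market St</street1>
      <city>San Francisco</city>
      <stateOrCountry>CA</stateOrCountry>
      <zipCode>94105</zipCode>
    </issuerAddress>
  </primaryIssuer>
  <relatedPersonsList>
    <relatedPersonInfo>
      <relatedPersonName>
        <firstName>Ada</firstName>
        <lastName>Example</lastName>
      </relatedPersonName>
    </relatedPersonInfo>
  </relatedPersonsList>
  <offeringData>
    <offeringSalesAmounts>
      <totalOfferingAmount>12000000</totalOfferingAmount>
      <totalAmountSold>9500000</totalAmountSold>
    </offeringSalesAmounts>
  </offeringData>
</edgarSubmission>"""


def parse_amount(text):
    """Return whole dollars as int, or None for blanks and non-numeric values."""
    text = (text or "").strip()
    return int(text) if text.isdigit() else None


def parse_form_d(xml_text):
    """Extract the four fields the scout needs from one filing."""
    root = ET.fromstring(xml_text)
    address_parts = [
        (root.findtext("primaryIssuer/issuerAddress/" + tag) or "").strip()
        for tag in ("street1", "city", "stateOrCountry", "zipCode")
    ]
    people = []
    for person in root.iter("relatedPersonName"):
        first = (person.findtext("firstName") or "").strip()
        last = (person.findtext("lastName") or "").strip()
        people.append((first + " " + last).strip())
    return {
        "issuer": (root.findtext("primaryIssuer/entityName") or "").strip(),
        "address": ", ".join(part for part in address_parts if part),
        "total_offering": parse_amount(root.findtext(".//totalOfferingAmount")),
        "amount_sold": parse_amount(root.findtext(".//totalAmountSold")),
        "related_persons": people,
    }


row = parse_form_d(SAMPLE)
print(row["issuer"], row["total_offering"], row["related_persons"])
```

Returning None for a non-numeric offering amount keeps open-ended offerings from being mistaken for a dollar figure in later filters.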

Filter to offering size between $3M and $50M. That window is the Seed through Series B sweet spot where a company has budget but has not yet built the internal RevOps team that makes them a closed door.
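
Both filters fit in one predicate. This is a sketch that assumes each row is a dict with a "sics" list of code strings and a "total_offering" integer; those key names are hypothetical, not something EDGAR defines.

```python
SOFTWARE_SICS = {"7372", "7370", "7371", "7379", "7389"}  # codes listed in this post
MIN_OFFERING = 3_000_000
MAX_OFFERING = 50_000_000


def qualifies(row):
    """True when the issuer has a listed SIC code and a $3M to $50M offering."""
    sics = set(row.get("sics", []))
    amount = row.get("total_offering")
    if not sics & SOFTWARE_SICS:
        return False
    if amount is None:  # missing or non-numeric offering size
        return False
    return MIN_OFFERING <= amount <= MAX_OFFERING


candidates = [
    {"issuer": "Example Software Inc", "sics": ["7372"], "total_offering": 12_000_000},
    {"issuer": "Example Mining Co", "sics": ["1040"], "total_offering": 12_000_000},
    {"issuer": "Example Megaround Inc", "sics": ["7372"], "total_offering": 200_000_000},
]
print([row["issuer"] for row in candidates if qualifies(row)])
```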

Dedup against a scout_queue_seen table by issuer name and address so we never re-pitch.
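
One way to sketch the dedup step uses SQLite. The table name comes from this post; the issuer_key column, the normalization rules, and the helper names are assumptions for illustration.

```python
import re
import sqlite3


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def dedup_key(name, address):
    """Combine normalized issuer name and address into one lookup key."""
    return normalize(name) + "|" + normalize(address)


def open_seen_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scout_queue_seen ("
        "issuer_key TEXT PRIMARY KEY, "
        "first_seen TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def mark_if_new(conn, name, address):
    """Record the issuer; return True only the first time it is seen."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO scout_queue_seen (issuer_key) VALUES (?)",
        (dedup_key(name, address),),
    )
    conn.commit()
    return cursor.rowcount == 1  # 0 when the key already existed


conn = open_seen_db()
print(mark_if_new(conn, "Example Software, Inc.", "1 Market St."))
print(mark_if_new(conn, "EXAMPLE SOFTWARE INC", "1 Market St"))
```

Normalizing before the insert means punctuation and casing differences between an original filing and an amended one collapse to the same key.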

What you get out: a structured row per fresh raise, with the actual executive names from the filing, before TechCrunch writes about it. Often days before.

Signal 2. ATS public feeds

The instinct here is “scrape LinkedIn for open roles.” Resist it. LinkedIn offers no open API for job listings, gates nearly everything behind auth, and tends to ship a counter-measure soon after a scraper starts working.

The better source: ATS endpoints. Most B2B SaaS companies post jobs through Greenhouse, Lever, or Ashby. All three expose unauthenticated JSON feeds.

GET https://boards-api.greenhouse.io/v1/boards/<company>/jobs
GET https://api.lever.co/v0/postings/<company>
POST https://jobs.ashbyhq.com/api/non-user-graphql?op=ApiBoardJobPostings   (GraphQL, so the board name goes in the request body)

For each company that EDGAR surfaces, probe all three endpoints. One of them will usually hit. Count the open roles tagged Sales, RevOps, AE, or CS. More than five open revenue roles is a strong scale signal.
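
A sketch of the counting step, with stated assumptions. The slug guesses are a heuristic, since the board slug is not in the Form D. Matching is done on job titles by keyword rather than on department tags, which is a simplification of the tagging described above. The payload shapes (a "jobs" list with "title" for Greenhouse, a list of postings with "text" for Lever) should be checked against live responses.

```python
import re

LEGAL_SUFFIXES = {"inc", "llc", "corp", "corporation", "ltd", "co"}
REVENUE_PATTERN = re.compile(
    r"\b(sales|revops|revenue operations|account executive|ae|"
    r"customer success|cs|csm)\b",
    re.IGNORECASE,
)
ROLE_THRESHOLD = 5  # "more than five open revenue roles"


def slug_candidates(issuer_name):
    """Guess board slugs from a legal name; order is most to least specific."""
    words = [w for w in re.findall(r"[a-z0-9]+", issuer_name.lower())
             if w not in LEGAL_SUFFIXES]
    if not words:
        return []
    guesses = ["".join(words), "-".join(words), words[0]]
    return list(dict.fromkeys(guesses))  # dedupe, keep order


def titles_from_greenhouse(payload):
    return [job["title"] for job in payload.get("jobs", [])]


def titles_from_lever(payload):
    return [posting["text"] for posting in payload]


def count_revenue_roles(titles):
    """Count titles that mention a revenue-team keyword."""
    return sum(1 for title in titles if REVENUE_PATTERN.search(title))


sample = {"jobs": [
    {"title": "Account Executive, Mid-Market"},
    {"title": "Senior Backend Engineer"},
    {"title": "RevOps Analyst"},
    {"title": "Customer Success Manager"},
]}
count = count_revenue_roles(titles_from_greenhouse(sample))
print(slug_candidates("Example Software, Inc."), count, count > ROLE_THRESHOLD)
```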

Now cross-reference. A company that filed a Form D this week and has 8 open AE roles is exactly the buyer profile a RevOps consultancy wants. That intersection is small enough to hand-write a touch for, and large enough to fill a 50-per-day send cadence.
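
The cross-reference itself is a small join. In this sketch the row keys and the function name are hypothetical, and ranking by role count before applying the 50-per-day cap is an assumption about how to prioritize, not a rule from the pipeline.

```python
DAILY_SEND_CAP = 50
ROLE_THRESHOLD = 5  # keep issuers with more than five open revenue roles


def build_send_list(funded_rows, role_counts):
    """Keep funded issuers that also pass the hiring threshold."""
    qualified = [
        {**row, "revenue_roles": role_counts[row["issuer"]]}
        for row in funded_rows
        if role_counts.get(row["issuer"], 0) > ROLE_THRESHOLD
    ]
    # Strongest hiring signal first, then trim to the daily cadence.
    qualified.sort(key=lambda row: row["revenue_roles"], reverse=True)
    return qualified[:DAILY_SEND_CAP]


funded = [
    {"issuer": "Example Software Inc", "total_offering": 12_000_000},
    {"issuer": "Example Data Co", "total_offering": 8_000_000},
    {"issuer": "Example Quiet Inc", "total_offering": 20_000_000},
]
roles = {"Example Software Inc": 8, "Example Data Co": 5}
print(build_send_list(funded, roles))
```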

News demoted to enrichment

TechCrunch and Axios still have a role. It is not “find the company.” It is “find the quote or the detail you can drop into the personalized line.”

Once EDGAR has flagged the company and ATS has confirmed the hiring signal, then a focused news pull (one query, one URL fetch) grabs the lead investor name or a vertical-specific detail to merge into the static template. That is enrichment, not discovery.

This inversion matters. When news is the primary source, you compete with every other outbound shop reading the same TechCrunch RSS. When EDGAR is the primary source, you are reading the same filings the journalists read, days before they publish.

What this changes in the pipeline

In the architecture from the previous post, Scout was a Claude-driven research job hitting Google AI Answers. That works, but it surfaces companies after they have been written about.

The replacement Scout has three jobs that run concurrently, each feeding the next:

  1. EDGAR poller. 15 minute cron. Writes fresh issuer rows to scout_queue.jsonl.
  2. ATS prober. Consumes new issuer rows, probes Greenhouse / Lever / Ashby, attaches the role count.
  3. Enricher. For rows that pass both filters, one focused news pull to grab the merge-field details.
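
The hand-off between the poller and the prober can be sketched as an append-only JSONL file plus a byte offset that the consumer remembers. The file name comes from the list above; the helper names and the offset scheme are assumptions.

```python
import json
from pathlib import Path


def append_row(path, row):
    """Producer side: write one issuer row as a single JSON line."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(row, sort_keys=True) + "\n")


def read_new_rows(path, offset=0):
    """Consumer side: return (rows, new_offset) for lines added since offset."""
    file_path = Path(path)
    if not file_path.exists():
        return [], offset
    with file_path.open("rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Consume only complete lines so a half-written row waits for the next pass.
    end = data.rfind(b"\n") + 1
    rows = [json.loads(line) for line in data[:end].splitlines() if line.strip()]
    return rows, offset + end
```

A prober loop would call read_new_rows("scout_queue.jsonl", offset) on each pass, process the returned rows, and persist the new offset so a restart does not reprocess old issuers.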

Each is a Python script under 200 lines. Same shape as before, different inputs.

Why this is cheaper

Claude spend drops, because Claude is no longer the research engine. It is the enricher, called once per qualified row, not once per candidate. EDGAR and the ATS endpoints are free.

A 50-prospect-per-day operation that was costing about $4 per day in Claude compute drops to under $1. The discovery is rule-based; only the enrichment uses tokens.

What we are still figuring out

A few open questions on this stack.

  • Form D issuer addresses are often the registered address of the Delaware C-corp, not the operating address. We are joining against the related persons’ LinkedIn profiles to recover the real operating city.
  • ATS endpoints rate-limit by IP. The prober needs a polite backoff, not parallel hammering.
  • Some companies use Workable, SmartRecruiters, or roll their own careers page. Greenhouse / Lever / Ashby coverage is roughly 60 percent of the funded B2B SaaS universe. The remaining 40 percent currently falls through to a manual check.
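
For the rate-limit item, a polite backoff could look like the sketch below: exponential delay with jitter, one URL at a time. The RateLimited exception and the injected fetch callable are hypothetical stand-ins for whatever HTTP client the prober uses, and how each ATS actually signals throttling (for example HTTP 429 with a Retry-After header) is an assumption to confirm per vendor.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the fetch callable when the server asks the client to slow down."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def fetch_with_backoff(fetch, url, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call fetch(url), waiting longer after each rate-limit response."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so probes do not resynchronize.
            sleep(delay + random.uniform(0, delay / 2))
```

Injecting sleep as a parameter keeps the function testable without real waiting; in production the default time.sleep applies.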

None of those are blockers. They are tuning.

The next post in this series

We will publish the four static templates that merge the EDGAR + ATS fields into copy a CRO will actually reply to. The architecture is the easy part. The copy is the leverage.


Building scout infrastructure and stuck? Message me on LinkedIn.