Mining "already-validated demand" from hosting platforms' default subdomains
Someone used a paid Similarweb account to sweep the default subdomains of 23 site-building platforms (vercel.app, netlify.app, pages.dev, github.io…) all at once, and ended up with a "demand radar map" of roughly 77,000 landing pages, 47,000 sites, and 31.07 million visits. This deep dive takes that methodology apart and lays it out clearly: at its core it's an extremely concrete, engineered version of this handbook's "Niche discovery" and "Demand validation" chapters — you don't invent demand, you pick up demand that other people have already validated for you. At the end I add my own independent fact-check of the key conclusions, plus three traps you absolutely have to watch for.
<50 approximated as 50), and a domain's "unregistered" status only reflects the snapshot at the time (2026-05-11). Before you actually buy, you must re-check registration, trademarks, the SERP, and legal risk. The numbers are for judging whether demand exists and how strong it is, not for precise valuation.
1. The core insight: the smart move isn't "what do I want to build" but "what are users already looking for"
Most people are still picking projects on gut feel: AI tools are hot today, so they build an AI tool; directory sites are making money tomorrow, so they build a directory site — yanked around by the feed. This method flips the question around:
Don't ask "is there demand in this market" — go straight for the sites that already have real traffic but have terrible product and SEO.
A site whose owner didn't even bother to buy a domain, did no SEO, and shipped a crude page — yet still pulls a few thousand to a few hundred thousand visits — tells you the demand is rock-solid: users really are searching, really clicking, really using it.
Now when you redo it with a better domain, better pages, better SEO, and a better experience, you're not betting your life from zero — you're fighting an upgrade battle on a slot that's already been validated. This is exactly what this handbook keeps hammering: technology was never the bottleneck; distribution and "building the right thing" are.
2. Why focus on default subdomains like vercel.app / netlify.app / pages.dev / github.io
Mature sites bind their own brand domain. But huge numbers of indie developers, students, AI vibe coders, and one-off project authors deploy on a platform and then just take the default subdomain out into the world as-is.
These default subdomains have one very valuable property: they usually represent early, temporary, AI-rapidly-generated, un-branded products. In other words — a lot of these sites weren't built by pros, yet they already have traffic. That's a "demand mine" the professional SEO teams haven't picked clean yet.
3. The full picture: 23 platforms, 77,000 landing pages, 31.07 million visits
This sweep covered 23 platform domains and roughly 77,000 landing-page records, corresponding to about 47,590 platform sites, with total estimated landing-page traffic of about 31.07 million visits. Layered by traffic:
| Visit threshold | Sites | What it means |
|---|---|---|
| ≥ 500 | 5,928 | Real demand that's at least observable; the first-pass filter |
| ≥ 5,000 | 704 | Demand has taken shape; worth manual due diligence |
| ≥ 10,000 | 336 | Strong demand, but watch for gray-area / brand terms |
| ≥ 50,000 | 57 | High traffic, but the risk is most concentrated here too |
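As a rough illustration of how the tiers above are counted, the sketch below tallies sites against each threshold. The tiers are cumulative (a site with 12,000 visits counts toward ≥ 500, ≥ 5,000, and ≥ 10,000), which matches the shrinking counts in the table. The hostnames and visit numbers are made up for the example, and `tier_counts` is a hypothetical helper, not part of any tool mentioned here.

```python
THRESHOLDS = (500, 5_000, 10_000, 50_000)  # the four tiers from the table above


def tier_counts(visits_by_site, thresholds=THRESHOLDS):
    """Return {threshold: number of sites with visits >= threshold}."""
    return {
        t: sum(1 for v in visits_by_site.values() if v >= t)
        for t in thresholds
    }


# Hypothetical sample, NOT the real dataset.
sample = {
    "a.pages.dev": 420,
    "b.netlify.app": 5_800,
    "c.vercel.app": 12_000,
    "d.github.io": 61_000,
    "e.lovable.app": 900,
}

counts = tier_counts(sample)
print(counts)
```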
Looking at total traffic by platform, the differences are huge — and you can't just stare at the totals, because every platform has its own "personality":
| Platform | Landing-page traffic (est.) | Positioning & how to read it |
|---|---|---|
| github.io | ≈ 18.838M | Mature demand library: lots of old projects, open source, docs, game tools. Opportunity exists, but not necessarily new, and plenty of copyright-risky sites. |
| pages.dev | ≈ 2.844M | New-site radar: high share of new sites, lots of fresh traffic. |
| netlify.app | ≈ 1.816M | New-site radar: dense with small tools, movie/TV, education, games, calculators, and region-lookup sites. |
| herokuapp.com | ≈ 1.023M | Skews old, mostly legacy apps. |
| web.app | ≈ 1.013M | Firebase family, mixed bag. |
| vercel.app | ≈ 0.978M | Growth radar: this batch wasn't tagged "new," but the change field is almost entirely growth percentages, with many sites at 1,000%–5,000%+. Look at growth, not just "new." |
| lovable.app | ≈ 0.449M | AI early-stage incubator: the Top 10 only account for about 4.9%, sites are extremely scattered, each one is small, but the variety of demand is rich. |
| onrender.com | ≈ 0.341M | Skews backend / service-type. |
| replit.app | ≈ 0.247M | Extreme-event type: the top site alone is about 177.9K, and it is fresh traffic. It can spike but is short-lived. Good for event templates, bad as a long-term asset. |
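One way to put a number on each platform's "personality" is the share of traffic held by its largest sites. Using the table's own replit.app figures, 177.9K out of roughly 247K is about 72% in a single site, while lovable.app's Top 10 hold only about 4.9%. The sketch below computes that share. `top_n_share` is a hypothetical helper, and the per-site lists are invented to mimic the two shapes.

```python
def top_n_share(visits, n=10):
    """Fraction of total visits captured by the n largest sites."""
    total = sum(visits)
    if total == 0:
        return 0.0
    return sum(sorted(visits, reverse=True)[:n]) / total


# Invented distributions, only to show the two shapes.
scattered = [100] * 200             # many small sites, none dominant
spiky = [177_900] + [500] * 100     # one event-driven site dwarfs the rest

scattered_top10 = top_n_share(scattered, n=10)
spiky_top1 = top_n_share(spiky, n=1)
print(f"scattered top-10 share: {scattered_top10:.1%}")
print(f"spiky top-1 share: {spiky_top1:.1%}")
```

A low share suggests a wide spread of small demands worth browsing one by one. A high share suggests the platform total is mostly one site's story.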
4. What actually matters isn't traffic, it's "signal" — four must-ask questions
When you look at this kind of data, "which site has the most traffic" is the most useless question. The real questions are these four — they matter ten times more than total traffic:
| Must ask | What it tells you |
|---|---|
| Is this stable demand, or a one-off spike? | Oil field vs. fireworks. Event terms (a sports star's incident, an election result) flare up and die. |
| Is this a generic term, or a brand / piracy / borderline term? | Only generic tasks are your opportunity; for a term naming "a specific existing thing," grabbing the exact-match domain is usually just asking for a fight. |
| Is the page terrible, yet it ranks for a lot of keywords? | Keyword count = number of search entry points covered. 5,000 visits / 500 keywords is often more worth doing than 50,000 visits / 1 hot term. |
| Can you build a better version in 1–3 days with AI? | Feasibility decides whether it's actually "yours" to take. |
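The four questions can be kept as an explicit checklist per candidate so that raw traffic never decides on its own. This is only a sketch. Questions 1, 2, and 4 remain human judgments entered by hand, the `min_keywords=100` cutoff is an assumption of this example rather than a number from the source, and `Candidate` and `passes_signal_check` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    host: str
    visits: int
    keywords: int
    stable_demand: bool   # Q1: judged by hand, not a one-off spike
    generic_term: bool    # Q2: judged by hand, not brand / piracy / borderline
    build_days: int       # Q4: your own estimate for a better version


def passes_signal_check(c, min_keywords=100, max_build_days=3):
    """True only if all four questions come out favorably."""
    return (
        c.stable_demand
        and c.generic_term
        and c.keywords >= min_keywords      # Q3: many search entry points
        and c.build_days <= max_build_days  # Q4: buildable in 1-3 days
    )


# The two shapes contrasted in the table: broad long tail vs. one hot term.
broad = Candidate("tool.example", 5_000, 500, True, True, 2)
hot = Candidate("viral.example", 50_000, 1, False, True, 1)

print(passes_signal_check(broad), passes_signal_check(hot))
```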
5. The five categories best suited to indie developers
Once you strip out the gray-area and brand minefields, the things with real long-term value are almost all in the "small tool" family. The common thread: clear demand, a small functional boundary, the user arrives and solves one specific problem, and there's no market to educate.
1. Generic tools (the cleanest, best to start with)
AVIF→JPG (avif jpg 変換, about 22,900), SVG Path Editor (about 8,100 / 385 keywords), App Privacy Policy Generator (about 6,200 / 288), PDF Dark Mode Converter (about 5,800), Steganography Decoder (about 10,600), MD5 Checksum (about 7,400), Base32 Decoder, Image to Spectrogram, LaTeX Viewer (about 11,500).
2. Calculators and planners (naturally interactive, long dwell time)
Enchantment Calculator (about 45,600 / 3,750 keywords), PSA Calculator (about 28,200 / 1,084), Response Sheet Marks Calculator (about 9,500), Vernier Caliper Simulator (about 7,000), plus all kinds of game build planners / damage calculators.
3. Gamer tools (steady traffic, but IP risk)
Team Builder, Cheat Sheet, Fusion Calculator, Predictor, Pixel Art Generator, League Planner… many were thrown together by a programmer — good functionality, but bad SEO / UI / multilingual support / mobile. You don't need to be technically more complex; just be easier to use, easier to find, and cover more long tail, and you'll eat the traffic.
4. Exam / education / region-specific lookups (highly seasonal, but they come back every year)
The JEE series, CGPA/Grade Calculator, Vietnam college-entrance countdown, SAT question banks, etc. Too small for big companies to bother with, but a hard need for students. The right move is to build an exam-tool template and clone pages by exam × year × region, turning "seasonal" into "cyclical."
5. Developer micro-tools and documentation explainers (the highest-quality users)
Readme Generator, Transformer Explainer, API Explorer, Cheat Sheet, Markdown Viewer, and the like. High user quality, good monetization (ads, paid templates, API, sponsorship, email list). But brand-term risk is high, so build descriptive pages rather than brand-name domains.
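The "exam × year × region" cloning idea from category 4 is a plain Cartesian product. The sketch below generates one URL path per combination. The slugs, years, and region codes are placeholders invented for illustration, and `exam_page_slugs` is a hypothetical helper.

```python
from itertools import product


def exam_page_slugs(exams, years, regions):
    """One URL path per (exam, year, region) combination."""
    return [
        f"/{exam}/{year}/{region}/"
        for exam, year, region in product(exams, years, regions)
    ]


# Placeholder inputs, not a recommendation of specific pages.
slugs = exam_page_slugs(
    exams=["jee-main-marks-calculator", "cgpa-calculator"],
    years=[2025, 2026],
    regions=["in", "vn"],
)
print(len(slugs))
print(slugs[0])
```

In practice you would generate only the combinations that real search queries justify, rather than every cell of the product.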
6. Quantifying "is it worth doing": the opportunity-score formula
The biggest problem with the raw candidate list is: domain available ≠ opportunity doable. So you can't just buy from highest traffic down — you need a risk-adjusted score. This method collapses the judgment into one formula:
Final opportunity score = demand strength × scalability × feasibility × monetization potential − risk penalty
Demand strength looks at traffic / keyword count / growth rate / multiple sources; scalability looks at whether you can spin out 20+ long-tail pages; feasibility looks at whether you can ship an MVP in 48 hours; monetization potential looks at ads / affiliate / templates / API / paid; the risk penalty looks at brand, copyright, adult, piracy, login impersonation, short-lived events, and medical/financial/legal misinformation.
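A minimal sketch of the formula, assuming every input is normalized to a 0..1 scale. The source does not specify the scale or how each factor is measured, so the scale and the example values below are assumptions.

```python
def opportunity_score(demand, scalability, feasibility, monetization, risk_penalty):
    """demand x scalability x feasibility x monetization - risk penalty.

    All inputs are assumed to be on a 0..1 scale (an assumption of this sketch).
    """
    factors = (demand, scalability, feasibility, monetization, risk_penalty)
    if not all(0.0 <= f <= 1.0 for f in factors):
        raise ValueError("all inputs are expected on a 0..1 scale")
    return demand * scalability * feasibility * monetization - risk_penalty


# Illustrative values only.
clean_tool = opportunity_score(0.8, 0.7, 0.9, 0.6, risk_penalty=0.05)
gray_term = opportunity_score(1.0, 0.5, 0.8, 0.5, risk_penalty=0.6)
print(round(clean_tool, 4), round(gray_term, 4))
```

Because the first four factors multiply, a near-zero value in any one of them drags the whole score down. Because risk is subtracted, a high-traffic gray-area term can end up negative, which matches the point that a free domain is not the same as a doable opportunity.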
"Your current list only completed step one — finding demand. It hasn't done step two — judging whether it's worth doing. Mediocre people see traffic and charge in; smart people first ask whether that traffic bites."
7. Risk red lines: which to drop outright, and which to "steal the demand but not buy the exact match"
| Tier | Type | How to handle |
|---|---|---|
| Red / Tier C | Movies/TV, streaming, piracy, ROM, adult, borderline, login-impersonation portals, one-off event terms | Drop outright, no regrets. Reject anything with terms like login / movie / stream / torrent / rom / pirate / porn / iptv / youtube-to-mp3. "Your goal is $100K a day, not a cease-and-desist a day." |
| Yellow / Tier B | Brand/platform terms (github, netlify, openai, microsoft…), game IP (Minecraft, Pokémon, Genshin, Arknights…) | Real demand, but don't buy the exact-match domain. Build an "adjacent, generalized site" instead: repo downloader, agent sdk examples, rpg fusion calculator, pixel art generator. Stand on the user's task route, not under the brand's house number. |
| Green / Tier A | Generic tools, clean calculators, exam templates, developer tools | Diligence first; pick your first batch of experiments from here. |
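The term lists in the table translate naturally into a first-pass keyword screen. The sketch below matches whole tokens, so "rom" does not fire on "chromatic", and strips accents so "Pokémon" matches "pokemon". The lists contain only the terms named in the table; `risk_tier` is a hypothetical helper, and a Tier A result only means "no listed term matched", not that the term is cleared.

```python
import re
import unicodedata

# Terms taken from the table above.
REJECT_TERMS = ("login", "movie", "stream", "torrent", "rom", "pirate",
                "porn", "iptv", "youtube-to-mp3")
BRAND_TERMS = ("github", "netlify", "openai", "microsoft",
               "minecraft", "pokemon", "genshin", "arknights")


def _normalize(text):
    """Lowercase, strip accents, and join tokens as '-a-b-c-'."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return "-" + "-".join(re.findall(r"[a-z0-9]+", ascii_text.lower())) + "-"


def _has_term(normalized, terms):
    # Wrapping in dashes enforces whole-token (or whole-phrase) matches.
    return any(f"-{t}-" in normalized for t in terms)


def risk_tier(keyword):
    """Return 'C' (drop), 'B' (adjacent site only), or 'A' (needs diligence)."""
    n = _normalize(keyword)
    if _has_term(n, REJECT_TERMS):
        return "C"
    if _has_term(n, BRAND_TERMS):
        return "B"
    return "A"


for kw in ("free movie stream hd", "Pokémon fusion calculator", "svg path editor"):
    print(kw, "->", risk_tier(kw))
```

Whole-token matching misses variants such as "movies" or "streaming" and cannot split run-together domain names, so in practice the lists need extending and the results still need a manual look.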
8. The Tier-A candidate list (scored on risk / demand / MVP difficulty / SEO expansion combined)
These "clean terms with 3,000–8,000 traffic" are often worth more than an 80K-traffic gray-area term — because they're clean, stable, and compound over the long run. Before you buy, re-verify registration, trademark, SERP, and competitors.
| Candidate | Traffic (est.) | Keywords | Verdict |
|---|---|---|---|
| enchantmentcalculator.com | 45,600 | 3,750 | One of the prettiest signals on the whole list — big long tail, can be a standalone site; the domain doesn't spell out a game name, so relatively safe. |
| avifjpg.com | 22,900 | 140 | Simple to build, low risk, easy to localize; good as the basis for a whole image-format tool matrix, leading with browser-side (local processing, no upload). |
| warpgenerator.com | 56,500 | 502 | Generic name, strong tool quality; worth checking search intent before diligence. |
| xdeltapatcher.com | 19,500 | 219 | Clear demand, but be careful not to touch ROM downloads. |
| latexviewer.com | 11,500 | 452 | Academic/developer crossover, clean, can extend to Markdown/BibTeX/citation. |
| steganographydecoder.com | 10,600 | 293 | An entry point for a security/CTF tool matrix, pair with MD5/Base32/Exif/QR. |
| responsesheetmarkscalculator.com | 9,500 | — | Exam tool, 105% growth; build templated per-year pages. |
| svgpatheditor.com | 8,100 | 385 | Design/frontend users, a long-term asset. |
| md5checksum.com | 7,400 | 236 | An old but stable need; can fold into a decoder tool site. |
| appprivacypolicygenerator.com | 6,200 | 288 | High commercial value (developers/publishers pay), must add a "not legal advice" disclaimer. |
| pdfdarkmodeconverter.com | 5,800 | 192 | Concrete pain point (PDFs are harsh on the eyes at night); emphasize local processing as a privacy selling point. |
| imagetospectrogram.com | 5,300 | 113 | Niche but clear. |
9. The real play: copy the demand, not the site — then rebuild the structure
Copying a site is a low-level move. The right order is: see why it has traffic → take apart its keyword structure → figure out what task it solves → find where it's done badly → rebuild it with better SEO/experience/localization.
For example, when you spot a PDF Dark Mode Converter micro-tool with traffic, don't just build the same single button. Build out the entire "topic cluster": one tool page + 3 long-tail conversion pages + 3 tutorial pages + FAQ + privacy notice + related-tools internal links + localized versions.
This is putting programmatic SEO exactly where it counts. A single button is a demo; a tool site is an asset.
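To make the cluster concrete, here is a sketch that expands one tool into its page list: the tool page, long-tail pages, tutorial pages, FAQ, and privacy notice, repeated per locale. The URL layout, the slugs, and the `build_cluster` helper are all assumptions of this example. Related-tools internal links are not modeled.

```python
def build_cluster(tool, long_tail, tutorials, locales=("en",)):
    """Return URL paths for one topic cluster, repeated for each locale.

    The first locale is treated as the default and gets no path prefix.
    """
    pages = [f"/{tool}/"]
    pages += [f"/{tool}/{slug}/" for slug in long_tail]
    pages += [f"/{tool}/guides/{slug}/" for slug in tutorials]
    pages += [f"/{tool}/faq/", f"/{tool}/privacy/"]

    paths = []
    for locale in locales:
        prefix = "" if locale == locales[0] else f"/{locale}"
        paths += [prefix + page for page in pages]
    return paths


# Hypothetical slugs for illustration.
pages = build_cluster(
    tool="pdf-dark-mode-converter",
    long_tail=["invert-pdf-colors", "pdf-night-mode", "dark-background-pdf"],
    tutorials=["read-pdfs-at-night", "dark-mode-on-mobile", "batch-convert"],
    locales=("en", "ja"),
)
print(len(pages))
```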
10. An executable SOP (copy it straight)
- Build a platform-domain pool: vercel.app, netlify.app, pages.dev, github.io, web.app, firebaseapp.com, herokuapp.com, onrender.com, railway.app, replit.app, lovable.app, bolt.host, amplifyapp.com, azurestaticapps.net, deno.dev, fly.dev…
- Export + clean: use Similarweb's keyword / landing-page tools to export CSVs, keeping URL, visits, change, keyword count, and top keywords; parse K/M/<50; aggregate by host, and strip out the platform's own pages separately (e.g. netlify.com's login/docs/form pages will pollute candidates).
- First pass: keep ≥500 visits; loosen up for new / high-growth sites, prioritizing change >100% / >500% / >1000%.
- Risk filter: directly exclude adult / gambling / piracy / movies-TV / cracks / brand impersonation / login portals / financial phishing / medical misinformation / obvious infringement.
- Tag the demand: tool / calculator / generator / converter / viewer / game / education / exam / developer / AI / region lookup / event spike / gray-area.
- Look at keyword count: more keywords = more search entry points, which matters more than raw visits.
- Open the competitor pages manually: check Title/H1, whether the feature is complete, mobile, load speed, whether there's content/FAQ/internal links/localization, and whether it's just a throwaway demo.
- Generate candidate domains: don't mechanically slap on .com; avoid brand/trademark/project names, favor generic descriptive ones.
- Build the MVP: one need, one core action — the user finishes the task in 10 seconds; no login/admin/membership/complex systems. Do the first batch browser-side only.
- Add the SEO structure: Title/Description/H1/FAQ/How-to/Schema/Sitemap/robots/internal links/related tools/localization — all of it.
- Launch and wire up Search Console: weekly, look at impression terms, click terms, and high-impression/low-rank terms, and use real search queries to drive the next batch of pages — don't expand pages on a hunch.
- Re-sweep every month: the power of this method isn't a one-time dig, it's a continuous radar — rerun it every 30 days and keep the new and high-growth demand.
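The export-and-clean, first-pass, and monthly re-sweep steps above can be sketched with the standard library alone. The column names `url` and `visits` are assumptions; a real Similarweb export may label them differently. The sample rows are invented, and stripping the platform's own pages is not covered here.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlsplit


def parse_visits(raw):
    """Turn '4.2K', '1.3M', '<50', or '1,200' into an integer."""
    s = raw.strip().replace(",", "").upper()
    if s.startswith("<"):
        s = s[1:]  # '<50' is approximated as 50, as in the source data
    multiplier = 1
    if s.endswith("K"):
        multiplier, s = 1_000, s[:-1]
    elif s.endswith("M"):
        multiplier, s = 1_000_000, s[:-1]
    return round(float(s) * multiplier)


def aggregate_by_host(rows):
    """Sum landing-page visits per hostname."""
    totals = defaultdict(int)
    for row in rows:
        url = row["url"]
        host = urlsplit(url if "//" in url else "//" + url).hostname
        if host:
            totals[host] += parse_visits(row["visits"])
    return dict(totals)


def first_pass(totals, min_visits=500):
    """Keep hosts at or above the first-pass threshold."""
    return {h: v for h, v in totals.items() if v >= min_visits}


def new_or_growing(previous, current, min_growth=1.0):
    """Hosts that are new this sweep or grew by more than min_growth (1.0 = +100%)."""
    picked = []
    for host, visits in current.items():
        before = previous.get(host)
        if before is None or (before > 0 and (visits - before) / before > min_growth):
            picked.append(host)
    return sorted(picked)


# Invented sample export.
SAMPLE_CSV = """url,visits
https://tool.pages.dev/,4.2K
https://tool.pages.dev/about,<50
https://tiny.vercel.app/,120
https://calc.netlify.app/en,1.3K
"""

totals = aggregate_by_host(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = first_pass(totals)
last_month = {"tool.pages.dev": 2_000, "calc.netlify.app": 1_200}
rising = new_or_growing(last_month, kept)
print(kept, rising)
```

Loosening the threshold for new or high-growth sites, as the first-pass step suggests, would mean running `new_or_growing` on the unfiltered totals as well.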
11. My independent fact-check: what's credible, and which three traps to watch for
This method sounds pretty ruthless, but as a research report it has to be placed against verifiable facts and known limitations:
✅ The methodology itself is real, checkable, and corroborated
"Gefei" is a publicly active SEO / AdSense practitioner in the go-global site-building scene, and his community's methodology can be summed up as a 40% mining demand / 20% building the product / 20% (should be 40%) promotion loop; his public courses really do include things like "discovering new demand and new products from outbound domains" and "analyzing high-traffic pages to mine the demand others are already making money on" — and this deep dive's subdomain-scanning method is precisely the engineered upgrade of that idea. The indie hacker community also broadly corroborates the phenomenon that "a lot of top-ranking sites have bad technical SEO; junk sites eat traffic anyway" — i.e., "a weak site still has traffic = demand is validated + there's room to outdo it."
xxx.vercel.app often falls below its sample threshold, so long-tail estimates are noisy; add to that the raw basis of approximating <50 as 50, which systematically inflates the volume of "long-tail small sites." So these numbers are good for relative ranking and an "is there demand or not" judgment, not for precise traffic. Before you launch, always cross-check real search volume with Google Search Console / keyword tools.
12. How it connects to the rest of this handbook
If you're someone who "can write code but lacks marketing instinct," this subdomain-mining method fills in exactly that first stretch from idea to traffic:
| Handbook chapter | The problem this method solves for it |
|---|---|
| Niche discovery | Instead of gut feel, use "junk sites that already have traffic" as a demand signal source to find niche opportunities at scale. |
| Demand validation | Traffic is itself a layer of validation (note: attention-level validation), turning "prove someone wants it before you write code" into a data pipeline. |
| The seven acquisition channels | Leads with programmatic SEO: topic clusters + long-tail pages + internal links + localized subdirectories. |
| Growth · Pricing | Use a tool matrix for product compounding, with Search Console data driving iteration. |
Don't build what you think is cool — build what someone has already proven with a junk site. Demand isn't dreamed up; it's mined out of user behavior.
The earlier dataset tells you "where the ore is"; the candidate domain list tells you "what's mixed into the ore." Now the move isn't to keep digging, it's to start smelting — build 3 first, 48 hours each, and let Search Console tell you what the next page should be.