Directory Niche Analysis — Methodology
Date: 2026-04-26
Author: Cyrus
Purpose: A repeatable, evidence-based process for finding genuinely winnable directory niches. Built on what we've learned validating 15+ candidates against real Semrush data.
Why most niche analysis is wrong
The mistake almost everyone makes (including me, in earlier rounds): looking at search volume and competition score alone.
- "boba near me" has 450K volume and 0.11 competition — looks great. KD = 69. SERP is locked up by Yelp/DoorDash. Unwinnable.
- "u pick farms georgia" has 20 volume — looks dead. But it's part of a pattern that aggregates to millions of searches and KD 5-10 across all variants. Real winner.
The right question isn't "is this keyword good?" It's "can a programmatic-SEO directory built by one operator credibly rank in the top 5 for the long-tail of this niche within 6-18 months at a cost under $1,000?"
That's a 4-dimensional question. Volume × Competition × Data Availability × Defensibility.
The 7-step framework
Step 1 — Frame the search universe (NOT the keyword)
A niche is a search universe, not a single keyword. UPick Atlas isn't trying to rank for "u pick farms" — it's trying to rank for the family of queries:
[verb] + [crop] + [location modifier]
where verb ∈ {pick, picking, u-pick, you pick, pumpkin patch, apple orchard, ...}
where crop ∈ {pumpkins, apples, strawberries, ...}
where location ∈ {near me, in [state], in [city], on [route]}
That gives ~20 verbs × 15 crops × 200 locations = 60,000 distinct keyword variations. Almost all individually low-volume but collectively massive.
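The pattern above can be enumerated mechanically. This is a minimal sketch, assuming short illustrative lists in place of the full ~20 verbs, 15 crops, and 200 locations; the names `VERBS`, `CROPS`, `LOCATIONS`, and `expand_universe` are hypothetical.

```python
from itertools import product

# Illustrative samples only; the full lists described above are much longer.
VERBS = ["pick", "picking", "u-pick", "you pick"]
CROPS = ["pumpkins", "apples", "strawberries"]
LOCATIONS = ["near me", "in georgia", "in atlanta"]


def expand_universe(verbs, crops, locations):
    """Return every [verb] + [crop] + [location] combination as a query string."""
    return [" ".join(parts) for parts in product(verbs, crops, locations)]


queries = expand_universe(VERBS, CROPS, LOCATIONS)
print(len(queries))  # 4 verbs * 3 crops * 3 locations
print(queries[0])
```

The same function applied to the full lists yields the 20 × 15 × 200 = 60,000 variations, which can then be fed to a keyword tool in batches.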
A niche is winnable if:
- The variation pattern aggregates to >50K monthly searches in total
- Individual variations are mostly KD 0-25 even if the head term is KD 60+
- The pattern produces enough distinct pages to be programmatic (>500)
Test for the pattern:
- Sample 5 head-term keywords (high volume)
- Sample 10 mid-tail (moderate volume + 1 modifier)
- Sample 10 long-tail (specific city / state / variant)
- Sum the volumes; if total > 50K/mo, the universe is large enough to matter
- Check the KD distribution — if mid + long-tail KD < 25, programmatic SEO can win
This is what UPick passed with flying colors. EV charging passed on volume but failed on KD distribution.
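The sampling test can be sketched as a small function. This is an illustration under stated assumptions: the `KeywordSample` type and `universe_passes` name are hypothetical, and "mid + long-tail KD < 25" is interpreted here as the mean KD of the sampled mid- and long-tail keywords.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class KeywordSample:
    keyword: str
    tier: str      # "head", "mid", or "long"
    volume: int    # monthly searches
    kd: float      # keyword difficulty, 0-100


def universe_passes(samples, min_volume=50_000, max_tail_kd=25):
    """Check total sampled volume and the mean KD of mid/long-tail samples."""
    total_volume = sum(s.volume for s in samples)
    tail_kd = [s.kd for s in samples if s.tier in ("mid", "long")]
    if not tail_kd:
        return False  # no tail sampled, so the KD distribution is unknown
    return total_volume > min_volume and mean(tail_kd) < max_tail_kd
```

A stricter variant could require every tail keyword to sit under the KD ceiling rather than the mean; which reading is right depends on how skewed the sample is.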
Step 2 — Identify the SERP shape, not just the SERP
Open Google for each head-term keyword. Don't look at the rankings. Look at WHO ranks. Categorize the top 10 into:
| SERP archetype | Beatable? | Why |
|---|---|---|
| A) Government / .gov / .edu | ❌ No | Infinite domain authority, can't be outranked |
| B) Big-brand aggregator (Yelp, TripAdvisor, DoorDash, Amazon, eBay) | ⚠️ Hard | Massive backlinks, but their pages are often thin and outrankable on long-tail |
| C) Category leader (AllTrails, BringFido, Niche.com, Untappd) | ❌ Mostly no | Built moat, defends actively |
| D) Network operator (ChargePoint, Tesla, Marriott, etc.) | ❌ No | Brand intent — searchers want them specifically |
| E) Single-business sites (one farm, one shop, one venue) | ✅ Yes | No coordinated structure |
| F) Mom-blog listicles | ✅ Yes | Thin, dated, no schema |
| G) Reddit / forums | ✅ Yes | Unstructured opinion threads |
| H) Regional tourism / city government | ⚠️ Mixed | High authority but narrow per page |
Rules of thumb:
- Top 3 = A or C → abort, unwinnable
- Top 3 = D → abort if brand intent dominates (Tesla supercharger), proceed if generic intent (e.g. "ev charging stations near me" allows aggregator entry)
- Top 10 = mostly E + F + G + H → ⭐ this is your opening. Programmatic directory wins here.
- Top 10 = mix of A/C + E/F → check what % of traffic is captured by E/F. If > 30%, there's a wedge.
Real examples:
- UPick: "pumpkin patch near me" → top 10 = single farms + Yelp + mom blogs. Beatable. ✅
- EV charging: "ev charging stations near me" → top 10 = afdc.energy.gov + ChargePoint + ElectrifyAmerica. ❌
- Waterfalls: "waterfalls near me" → top 10 = mom blogs only. ⭐ Very beatable.
- Hiking: "hiking trails near me" → top 10 = AllTrails + AllTrails + AllTrails. ❌
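The rules of thumb can be written down as a rough classifier once each top-10 result has been hand-labeled with its archetype letter. This is a sketch, not a definitive rule set: `serp_verdict` is a hypothetical name, "Top 3 = A or C" is read as "any of the top 3 is A or C", and "mostly" is assumed to mean more than half of the top 10.

```python
BEATABLE = {"E", "F", "G", "H"}


def serp_verdict(top10, brand_intent=False):
    """top10: archetype letters (A-H) for the ranked results, best first."""
    if not top10:
        raise ValueError("need at least one labeled result")
    top3 = top10[:3]
    if any(a in ("A", "C") for a in top3):
        return "abort"
    if "D" in top3 and brand_intent:
        return "abort"
    # Assumption: "mostly E + F + G + H" means more than half of the results.
    beatable_share = sum(a in BEATABLE for a in top10) / len(top10)
    if beatable_share > 0.5:
        return "opening"
    return "check E/F traffic share"


# Hypothetical labelings; the orderings are invented for illustration.
print(serp_verdict(list("EEBFEFEFGE")))   # farms, Yelp, mom blogs
print(serp_verdict(list("CCCEEEEEEE")))   # category leader holds the top 3
```

The labeling itself stays manual; the function only makes the decision rule explicit and repeatable across candidates.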
Step 3 — Score competitors, not search results
For every candidate niche, identify the 3-5 most-likely defenders (not just whoever's #1 today). Pull from Semrush:
- Total organic keywords
- Total monthly organic traffic
- Estimated organic traffic value (Semrush "Organic Cost")
A defender is dangerous if:
- They have >100K organic keywords AND
- They have >100K monthly organic visits AND
- Their traffic value > $50K/mo
A defender is beatable if:
- They have <50K organic keywords AND/OR
- Their content quality is visibly weak (manual SERP inspection)
- Their structured data is missing or broken
Worked examples:
- PickYourOwn.org: 39,815 keywords, 7,958 visits, $678 value → beatable
- BringFido: 1,090,114 keywords, 602,453 visits → fortress
- AllTrails: 2,426,279 keywords, 5,402,762 visits → untouchable
- Niche.com: 3,332,472 keywords, 6,962,506 visits → untouchable
- afdc.energy.gov: 800K keywords, 2.6M visits, +.gov domain → untouchable
Heuristic: if any single defender has >5× the keyword footprint you could realistically build in 12 months, abort.
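The numeric parts of these thresholds can be checked in a few lines. This is a minimal sketch: `defender_status` and `should_abort` are hypothetical names, the content-quality and structured-data checks stay manual, and the 10,000-keyword build estimate is an assumed figure for illustration.

```python
def defender_status(keywords, monthly_visits, traffic_value_usd):
    """Classify a defender using only the numeric thresholds listed above."""
    if keywords > 100_000 and monthly_visits > 100_000 and traffic_value_usd > 50_000:
        return "dangerous"
    if keywords < 50_000:
        return "beatable"
    return "inspect manually"  # content quality and schema need a human look


def should_abort(defender_keywords, buildable_keywords_12mo):
    """Apply the 5x footprint heuristic to the largest defender."""
    return max(defender_keywords) > 5 * buildable_keywords_12mo


# PickYourOwn.org figures from the worked examples.
print(defender_status(39_815, 7_958, 678))
# BringFido's keyword count against an assumed 10,000-keyword build.
print(should_abort([1_090_114], 10_000))
```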
Step 4 — Verify the data exists and is acquirable
Programmatic SEO requires real, structured data at scale. Three quality tiers:
Tier 1 — Free public dataset. USGS waterfalls (17K), USDA farms (~18K), DoE EV chargers (70K), PDGA disc golf courses (14K). Best-case scenario.
Tier 2 — Scrape-able with effort. Yelp, individual operator websites, regional aggregators. Workable but expensive (Firecrawl + LLM enrichment costs add up at scale).
Tier 3 — Locked / pay-walled / API-restricted. Yelp paid API, Niche.com proprietary review data, Untappd's beer database. Don't bother — you're competing with the incumbent who already has the data.
Test: before scoring a niche, ask:
- Where would I get the seed list? (Federal database, scraping, hand-curation?)
- How long would acquisition take? (Hours, days, weeks?)
- What's the per-entity enrichment cost? (Use Firecrawl: ~$0.001/page, plus LLM ~$0.005/extraction)
- Can I refresh the data quarterly? (Vital — stale directories die fast)
Anything that requires more than ~$100 in data acquisition for the first 1,000 entities is suspect.
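The per-entity cost check is simple arithmetic. This sketch uses the two unit prices quoted above and assumes one scraped page and one LLM extraction per entity by default; `acquisition_cost` and its parameters are hypothetical names.

```python
FIRECRAWL_PER_PAGE = 0.001   # USD per page, figure quoted above
LLM_PER_EXTRACTION = 0.005   # USD per extraction, figure quoted above


def acquisition_cost(entities, pages_per_entity=1, extractions_per_entity=1):
    """Estimated enrichment cost in USD for a batch of entities."""
    per_entity = (pages_per_entity * FIRECRAWL_PER_PAGE
                  + extractions_per_entity * LLM_PER_EXTRACTION)
    return entities * per_entity


cost = acquisition_cost(1_000)
print(f"${cost:.2f} for the first 1,000 entities")
print("suspect" if cost > 100 else "within budget")
```

At these unit prices the $100 line is only crossed when an entity needs many pages or repeated extractions, so the more likely budget risks are paid data sources and hand-curation time.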
Step 5 — Score schema.org fit
Google rewards structured data heavily on directory queries. Niche must fit cleanly into one of these schema types:
| Schema | Best for |
|---|---|
| LocalBusiness (and subtypes) | Anything with a physical location |
| TouristAttraction | Outdoor, recreational, visit-worthy spots |
| Place | Generic geo-anchored entity |
| Product | Comparable products |
| Recipe | Food recipes |
| Event | Time-bound happenings |
| EVChargingStation | EV chargers (Google has dedicated schema) |
| SportsActivityLocation | Disc golf, pickleball, climbing |
| EducationalOrganization | Schools |
Red flag: if your niche doesn't fit any standard schema cleanly, search engines will struggle to understand it. Boba shops technically use LocalBusiness but Google doesn't reward boba-specific structure — there's no BobaTeaShop type.
Bonus: niches where Google already shows rich-result schema in SERP carousels are gold. Recipes, events, products — visible rich snippets directly correlate to clicks.
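For a sense of what "fits cleanly" looks like in practice, here is a sketch that builds JSON-LD for one listing using the `TouristAttraction` type with a `GeoCoordinates` value for `geo`. The function name, the listing name, coordinates, and URL are all placeholders, and a real page would carry more properties.

```python
import json


def tourist_attraction_jsonld(name, latitude, longitude, url):
    """Build a minimal schema.org TouristAttraction object for one listing."""
    return {
        "@context": "https://schema.org",
        "@type": "TouristAttraction",
        "name": name,
        "url": url,
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": latitude,
            "longitude": longitude,
        },
    }


# Placeholder values, not a real listing.
doc = tourist_attraction_jsonld(
    "Example Falls", 35.0, -83.0, "https://example.com/falls/example-falls"
)
print(json.dumps(doc, indent=2))
```

The output would be embedded in a `<script type="application/ld+json">` tag on the listing page. If every entity in the seed dataset can fill a template like this without awkward gaps, the niche has a direct schema fit.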
Step 6 — Calculate revenue ceiling
This is where most analysts hand-wave. Force yourself to do the math.
Formula:
Annual revenue = (monthly niche search volume × 12) × (your capturable share %) × (RPM ÷ 1,000)
The examples below quote search volume per year, so the × 12 is already folded in.
- Capturable share %: realistic 0.5-3% for new sites, 5-15% for sites that "win" the niche over 2 years
- RPM (revenue per 1,000 sessions):
  - Display ads (Mediavine eligibility at 50K sessions): $20-40 RPM
  - Affiliate (general): $5-15 RPM
  - Affiliate (high-value lead-gen: storage, finance, services): $50-200 RPM
  - Sponsored placements: $10-30 RPM at scale
Example: UPick Atlas
- Total niche search volume: ~2M/year (combined seasonal patterns)
- Capturable share at 12 months: 1.5% (optimistic for a new site) = 30K visits/year
- RPM (mostly seasonal, mid display + farm sponsorships): ~$15
- Annual revenue ceiling: 30K × $15 / 1000 = $450/year first year, $2-5K/year by year 2
Example: Waterfalls Atlas
- Total niche search volume: ~3M/year (110K head + long-tail)
- Capturable share at 12 months: 2% = 60K visits/year
- RPM (outdoor/travel = decent display, REI/Backcountry affiliate): ~$15
- Annual revenue ceiling: ~$900 first year, $4-8K/year by year 2
Example: Self-Storage Comparison
- Total niche search volume: ~10M/year
- Capturable share: 0.3% (heavily defended) = 30K visits/year
- RPM (storage lead-gen pays $20-40/lead, but conversion is low): ~$80 effective
- Annual revenue ceiling: $2,400 first year; about $8K/year at a 1% share and the same $80 RPM (reaching $30K+/year would take roughly a 3.75% share)
Notice: self-storage has the highest ceiling but lowest probability of capture. Compute expected value, not maximum.
Final formula:
Expected annual revenue = ceiling × probability of capture
Probability of capture estimate:
- 70%+ if all 6 framework gates pass (volume, SERP shape, defenders, data, schema, ceiling math)
- 30-50% if 4-5 pass
- <20% if 3 or fewer pass — abort
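The ceiling and expected-value math can be reproduced in a few lines. This is a sketch using the three worked examples' inputs; the function names are hypothetical, and the 0.7 probability applied at the end is only an illustration of the "all 6 gates pass" case, not a claim about any specific niche.

```python
def revenue_ceiling(annual_search_volume, capturable_share, rpm_usd):
    """Annual revenue in USD: visits per year times revenue per 1,000 visits."""
    visits = annual_search_volume * capturable_share
    return visits * rpm_usd / 1000


def expected_revenue(ceiling_usd, probability_of_capture):
    """Discount the ceiling by the chance of actually capturing it."""
    return ceiling_usd * probability_of_capture


upick = revenue_ceiling(2_000_000, 0.015, 15)
waterfalls = revenue_ceiling(3_000_000, 0.02, 15)
storage = revenue_ceiling(10_000_000, 0.003, 80)

for name, value in [("UPick", upick), ("Waterfalls", waterfalls), ("Storage", storage)]:
    print(f"{name}: ${value:,.0f}/year ceiling")

# Illustrative only: a niche that passes all 6 gates at 70% probability.
print(f"${expected_revenue(waterfalls, 0.7):,.0f}/year expected")
```

Running the same function with several share and RPM assumptions per niche makes it easier to compare candidates on expected value instead of on the single best case.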
Step 7 — Defensibility check
After 12-24 months, can you defend what you built? Three threats:
Threat A — Google algorithm. Programmatic AI-generated content is squarely targeted by Google's helpful-content updates. Hedge by:
- Adding genuine human-generated content per page (FAQ from real research, not LLM)
- Real photography (not stock)
- User-generated reviews if possible
- Update logs visible on each page (proves the data is fresh)
Threat B — Incumbent reaction. If you're capturing meaningful traffic from a defender (BringFido, AllTrails, etc.), they'll notice and fight. Hedge by:
- Picking niches where the incumbent is structurally bad (mom blogs, Reddit), not just lazy
- Building a moat the incumbent can't easily copy (real-time data, user contributions, depth no one else has)
Threat C — Clones. Once you prove a niche works, copycats follow within months. Hedge by:
- Getting domain authority before publishing your playbook
- Locking in user-generated value (reviews, submissions) early
- Operating 2-3 sites in adjacent niches to share authority across them
The combined scoring rubric
For every candidate niche, score 1-5 on each dimension:
| Dimension | Test | Score 5 | Score 1 |
|---|---|---|---|
| Search universe size | Sum head + mid + long-tail volume | >100K/mo aggregate | <20K/mo aggregate |
| Volume × low KD intersection | Avg KD on long-tail | <15 | >40 |
| SERP shape | Top 10 archetype mix | All E+F+G | A or C dominates |
| Strongest defender | Their organic traffic | <20K/mo | >500K/mo |
| Data availability | Tier of source | Tier 1 free | Tier 3 locked |
| Schema fit | Standard schema type | Direct fit | No good schema |
| Revenue ceiling | Math from formula | >$10K/yr year 2 | <$1K/yr year 2 |
| Defensibility | 3-threat resistance | Strong on all 3 | Weak on 2+ |
Total /40. Action thresholds:
- 32-40: BUILD IT
- 24-31: Build only if the alternative is doing nothing
- 16-23: Skip
- <16: Don't waste an hour on it
This is the rubric I'll use going forward. Every niche we've discussed, scored against it retroactively:
| Niche | Score | Decision |
|---|---|---|
| UPick Atlas | 36 | ✅ Built |
| Waterfalls Atlas | 35 | ⭐ Best new candidate |
| Drive-In Theaters | 31 | Quick-win option |
| Disc Golf | 28 | Skip — UDisc defends |
| Scenic Drives | 26 | Pair with Waterfalls |
| EV Charging | 26 | Skip — gov dominance |
| Pickleball | 26 | Skip — Pickleheads defends |
| Self-Storage | 24 | Skip — defended hard |
| Coworking | 22 | Skip |
| Wineries/Breweries | 20 | Skip |
| Boba | 16 | Skip |
| Dog Parks | 18 | Skip — BringFido fortress |
| Private Schools | 18 | Skip — Niche.com fortress |
| Hiking Trails | 8 | Don't even look |
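The rubric total and action thresholds reduce to a short function. This is a sketch: the dimension keys and the `rubric_decision` name are hypothetical labels for the eight rows of the rubric table.

```python
DIMENSIONS = (
    "search_universe", "kd_intersection", "serp_shape", "strongest_defender",
    "data_availability", "schema_fit", "revenue_ceiling", "defensibility",
)


def rubric_decision(scores):
    """scores: dict mapping each dimension to an integer 1-5. Returns (total, action)."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    if any(not 1 <= scores[d] <= 5 for d in DIMENSIONS):
        raise ValueError("each score must be between 1 and 5")
    total = sum(scores[d] for d in DIMENSIONS)
    if total >= 32:
        return total, "BUILD IT"
    if total >= 24:
        return total, "Build only if the alternative is doing nothing"
    if total >= 16:
        return total, "Skip"
    return total, "Don't waste an hour on it"


print(rubric_decision({d: 5 for d in DIMENSIONS}))
```

Because every dimension scores at least 1, the lowest possible total is 8, which is the floor Hiking Trails sits on.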
Methodology Rule #8 — The Rich People Filter
Added 2026-04-26 from Tim Stoddard interview synthesis. Source: research/stoddard-synthesis-2026-04-26.md.
Before scoring volume + KD + DR + SERP shape, ask: what is the average order value (AOV) of a transaction at the underlying businesses this directory sends leads to?
| AOV tier | Realistic ceiling | Monetization paths |
|---|---|---|
| <$50 (zero-ticket activities, free services) | Lifestyle scale only ($5-30K/yr) | Display ads, small affiliate, digital products |
| $50-500 | Mid (with perfect execution + heavy traffic) | Display + affiliate + sponsored placements |
| $500-5,000 | Sweet spot for solo operator ($100-500K/yr possible) | Lead gen, premium listings |
| $5,000+ | High-leverage but YMYL risk + harder lead verification | Lead gen at $50-500/lead |
Apply BEFORE volume research. Filters out 50% of candidates in 30 seconds.
Stoddard's $350K/yr came from rehab leads at ~$200-500/lead. AOV of underlying transaction: $30K. That's why the math works.
Applied to our existing portfolio:
- UPick Atlas (~$30 AOV per farm visit) → lifestyle ceiling
- Waterfall Atlas (~$300 effective if hotel-affiliate-attached) → mid-tier
- Plant Medicine Retreats ($5K-15K) → $5,000+ tier: high-leverage, with YMYL risk; Stoddard already builds here
- Stem Cell Clinics ($3K-50K) → straddles the sweet spot and the $5,000+ tier; YMYL
- Disc Golf, EV Charging, Drive-Ins, Alpaca Farms → all fail this gate
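The AOV gate is a plain threshold lookup. In this sketch the `aov_tier` name and the short tier labels are hypothetical, and values that land exactly on a boundary ($50, $500, $5,000) are assigned to the higher tier, which the table leaves unspecified.

```python
def aov_tier(aov_usd):
    """Map an average order value in USD to the tier rows of the AOV table."""
    if aov_usd < 50:
        return "lifestyle"
    if aov_usd < 500:
        return "mid"
    if aov_usd < 5_000:
        return "sweet spot"
    return "high-leverage (YMYL risk)"


# AOV figures quoted in this section.
for label, aov in [("farm visit", 30), ("hotel-attached trip", 300), ("rehab", 30_000)]:
    print(label, "->", aov_tier(aov))
```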
Methodology Rule #9 — Institutional Outreach Strategy
Added 2026-04-26 from Tim Stoddard interview synthesis.
The single most durable backlink play available to a solo operator in any directory niche: manual outreach to local government, education, and tourism institutions.
Stoddard's actual playbook from Sober Nation (which became DR 72):
- Local municipalities (city/county .gov sites) — "We built a resource your residents are searching for"
- Universities (.edu) — If your topic touches student wellbeing, dropouts, retention, or campus life
- State / regional tourism boards (.org or .gov) — If your topic is location-anchored
- State agencies — Departments of agriculture, parks, health, transportation depending on niche
Why this beats keyword-based content marketing for backlinks:
- .gov / .edu domains carry disproportionate ranking weight in Google's algorithm
- These institutions don't do reciprocal link asks (so the link is permanent and editorial)
- They're under-crawled by AI link-builders and SEO tool scrapers (no cold outreach saturation)
- They reply to genuine offers because most of their content is volunteer-maintained
Operational pattern:
- 1 hour/day, 60 days = 60 emails sent
- Realistic conversion: 10-20% of emails get a link = 6-12 quality backlinks
- Each .gov / .edu link is worth 5-10 random wordpress.com links
- Cumulative effect: structurally outranks competitors with "fake moats" (lots of low-quality links)
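The outreach arithmetic, written out so the assumptions are adjustable. This is a sketch: `expected_links` is a hypothetical name, one email per daily hour is implied by the figures above, and the 10-20% conversion range is the estimate quoted there.

```python
def expected_links(days, emails_per_day=1, conversion=(0.10, 0.20)):
    """Return (emails sent, low estimate of links, high estimate of links)."""
    emails = days * emails_per_day
    low, high = conversion
    return emails, round(emails * low), round(emails * high)


emails, low, high = expected_links(60)
print(f"{emails} emails -> {low}-{high} links")
```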
Stoddard's growth-hack variant: user-submitted stories. Sober Nation's "sober stories" feature. Each submission becomes:
- Indexable unique page (programmatic content)
- Self-distributing (author shares it)
- Engagement signal Google rewards
Application to our sites:
UPick Atlas targets:
- 50 state agriculture departments (.gov)
- USDA local food portal partnership
- ~3,000 county tourism boards (.org)
- ~1,500 agriculture extension offices (.edu)
- Local family-blog publishers (low-DR but contextually relevant)
Waterfall Atlas targets:
- US Forest Service regional offices
- USGS hydrology programs
- 50 state parks departments
- ~500 state and regional tourism boards
- National park visitor centers
- Regional outdoor recreation councils
- Photography clubs (.org)
Don't outsource this. Stoddard's literal answer to "what's your cold outreach hack" was: "I was just willing to do it." An hour a day, manually, for two months. That's the hack.
Where my framework can still fail
Honest list of things this methodology doesn't catch:
- Google's mood. A "helpful content" update can crush programmatic AI sites overnight. No analysis catches this until traffic dies.
- Trend reversal. Pickleball today, dead in 5 years? Disc golf trends actually decelerating? Trend data lags reality.
- CPM volatility. Display ad rates fluctuate 30-50%. Last year's $25 RPM might be $14 this year.
- Hidden compliance costs. Some niches (private schools, daycare) have GDPR/COPPA/data-privacy implications I haven't priced in.
- Local-pack monopoly. "near me" queries increasingly trigger Google Maps/local-pack results that bypass directories entirely. UPick is at risk here.
These are why probability of capture is never 100% even on a 40/40 niche. Plan for failure modes.
Process for next time
- Brainstorm 30 candidates (don't pre-filter, generate widely)
- Run Step 1 + Step 2 on all 30 (fast — checks SERP shape via 1 query each)
- Eliminate ~20 that fail SERP-shape test
- Run Step 3 + Step 4 on remaining ~10 (medium effort — pulls competitor data)
- Eliminate ~5 that fail competitor or data tests
- Run Step 5-7 in depth on remaining 3-5
- Pick #1, build it. Don't spread yourself across 3 — concentrate.
Estimated Semrush API budget per round: ~3,000-5,000 units (6-10% of monthly Guru allowance).
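The elimination funnel can be sketched as a staged filter that runs the cheapest checks first, so expensive competitor pulls only happen for survivors. The `run_funnel` name, the candidate dictionaries, and the stage predicates are all hypothetical.

```python
def run_funnel(candidates, stages):
    """stages: list of (name, predicate) pairs, ordered cheapest first."""
    survivors = list(candidates)
    log = []
    for name, keep in stages:
        survivors = [c for c in survivors if keep(c)]
        log.append((name, len(survivors)))  # how many remain after this stage
    return survivors, log


# Invented candidates with pre-computed pass/fail flags.
candidates = [
    {"name": "a", "serp_ok": True, "defender_ok": True},
    {"name": "b", "serp_ok": False, "defender_ok": True},
    {"name": "c", "serp_ok": True, "defender_ok": False},
]
stages = [
    ("serp shape", lambda c: c["serp_ok"]),
    ("defenders", lambda c: c["defender_ok"]),
]
survivors, log = run_funnel(candidates, stages)
print([c["name"] for c in survivors], log)
```

Keeping the per-stage counts makes it easy to see whether a round is eliminating roughly the expected ~20 and ~5 candidates or whether a gate is too loose.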
What we should do differently next time
Based on what I got wrong this cycle:
- Always pull KD scores, not just Competition scores. They diverge wildly (boba comp 0.11 / KD 69; drive-in comp 0.06 / KD 65).
- Always check the SERP shape before estimating capture probability. I overrated EV Charging because I didn't see afdc.energy.gov dominating until I pulled the live SERP.
- Always size up the strongest 2-3 defenders, not just the obvious one. PlugShare looked like the only EV directory; ChargePoint with 442K visits is the bigger threat.
- Force yourself to compute revenue ceiling. Most niches we considered have $1K-3K/year ceilings, not the $1K-3K/month we casually claimed.
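The first lesson can be automated as a sanity check on any keyword export. This is a sketch: `misleading_keywords` is a hypothetical name, and the cutoffs (competition at or below 0.2, KD at or above 50) are assumed values chosen to catch the two cases cited above.

```python
def misleading_keywords(rows, max_competition=0.2, min_kd=50):
    """rows: (label, competition 0-1, KD 0-100). Flag low-competition, high-KD rows."""
    return [
        label
        for label, competition, kd in rows
        if competition <= max_competition and kd >= min_kd
    ]


# The two divergent cases from this cycle.
rows = [("boba", 0.11, 69), ("drive-in", 0.06, 65)]
print(misleading_keywords(rows))
```

Anything this flags looks cheap on the Competition score but is expensive to rank for organically, so it should go through the SERP-shape check before any capture estimate.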
This methodology now lives in the workspace. Future niche research goes through these 7 steps, plus Rules #8 and #9.