A Few Things I Learned from Scraping My Way Through AI Search in 2026

Author: SEONIB Date: 2026-05-26 14:17:55

First, let me be honest: at the beginning of 2025 my attitude toward AI search was “let’s wait and see.” Back then Google was still the dominant player, and AI-search results didn’t seem very threatening: occasionally a summary would pop up, turn my original text into something unrecognizable, and even attach a wrong link. I thought this was just a flashy feature and that serious traffic still had to come from traditional keyword rankings.

Then 2026 arrived, and the data hit me fast and hard.

I ran a set of tests myself, using a few core keywords of our SaaS product to query ChatGPT, Perplexity, and Google AI Overviews. I found that 17% of B2B SaaS discoveries now come from AI-search channels; a year ago that number was only 4%. If you sell enterprise software, that share is too large to ignore. Even more striking, my own brand content appeared in AI results far less often than I expected. So I spent two weeks dissecting how to get discovered by AI, hit many snags, and overturned some long-held “SEO truths.” What I’m sharing today comes from hands-on experiments and still includes some unanswered questions.

Brand Visibility: Where Exactly Is Your Name Mentioned

At first I assumed AI search would cite webpages the same way traditional search does—whoever ranks higher gets cited. The reality is completely different.

I ran an experiment with the query “best CRM for small business.” On Google my article appeared on the third page, but Perplexity’s answer cited three completely different sources: a Reddit post, a Wikipedia software list, and a research report. My article didn’t appear at all. I looked at what those three sources had in common: each was mentioned across many independent sites. The Reddit post was cross-posted to multiple subreddits, the Wikipedia page had hundreds of external references, and the research report was quoted in headlines on many sites. AI-search citation isn’t based on a single ranking signal; it looks at how broadly a brand or piece of content is mentioned across the whole web.

The 2026 WinWithSEO report confirmed this: 64% of AI-search citations come from Wikipedia, Reddit, and original-research domains, while brands’ own blogs account for only 11%. In other words, shouting “I’m professional” on your own site isn’t enough; you need third-party sites to mention you of their own accord.

My approach was a bit clumsy: I first listed the authoritative third‑party sites, forums, and platforms that can publish original data in our industry. Then, instead of spamming links, I actually produced a small industry research report analyzing the pricing strategies of about 200 SaaS companies, published it on Medium and our own site, and reached out to a few industry blogs for citations. That report later got listed as an external link on a Wikipedia page, after which Perplexity began citing it in several related queries. The effect wasn’t immediate; it took roughly three months for AI results to start referencing it, but once it was in, it stayed stable.

I admit I wasted a lot of time in the process: my first attempts to post on Reddit were deleted three times before I realized Reddit is especially sensitive to self‑promotion. You need to genuinely participate in the community; just dropping links doesn’t work. I still haven’t fully cracked that.

Another often-overlooked issue is that AI search behaves very differently across engines. ChatGPT cites only three sources on average and leans toward authoritative brand content; Perplexity cites seven and loves Reddit. If you want to be cited on multiple platforms, you need a different content strategy for each. My current approach: for ChatGPT, focus on original research reports and data; for Perplexity, cultivate accounts in relevant subreddits and occasionally join discussions. Honestly, this is still an ad hoc approach, not a systematic solution. If you have a better method, feel free to let me know.

Intent and Query Fan‑Out – What Users Really Want to Ask

I admit that I used to write article titles like “Best CRM Software 2026,” thinking matching keywords was enough. But AI search in 2026 made that mindset obsolete. Users now ask full sentences, e.g., “Which CRM is best for a B2B small team with 20 salespeople and a $500‑per‑month budget?”—that intent contains several dimensions: team size, budget, user type, business model.

That’s not the hardest part. The bigger headache is “query fan‑out”: when you ask AI a question, it automatically searches for multiple sub‑topics related to that question. For example, asking “best running shoes” might also retrieve “foot type analysis,” “running surface type,” “weight recommendation,” etc., and then stitch those pieces together into an answer. This means if your content only covers the narrow keyword “best running shoes” but doesn’t address “flat‑foot running shoe recommendations” or “running shoes for cement surfaces,” your chance of being cited is very low.

I learned this the hard way: I wrote an in-depth article on “How to Choose an ERP System,” covering features, pricing, deployment, and implementation timeline, and felt confident about it. A quarter later, it almost never appeared in AI results. The reason turned out to be that users often add modifiers like “manufacturing ERP” or “SMB ERP” to their queries, while my article mentioned “manufacturing” only once and had no dedicated section. When the AI fanned out the query, it found more relevant specialized articles and skipped my generic piece.

So I adjusted: every time I write on a core topic, I first list 5–10 sub-topics that the query might fan out to, then write a dedicated 300–500-word section for each one, creating a topic cluster. The goal isn’t to turn the article into an encyclopedia, but to ensure that when AI fans out in a certain direction, your piece has relevant content waiting. I now use an AI tool (like SEONIB) to discover these sub-topics automatically before writing, which saves a lot of effort. Even so, I still check for gaps manually before publishing.
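
The manual gap check can be partly scripted. Below is a minimal Python sketch, not the method any particular tool uses. It assumes the draft is already split into headed sections, and that a sub-topic counts as "covered" when some heading contains one of its keywords and the section body meets a word minimum. The function name, the keyword lists, and the sample draft are all hypothetical.

```python
import re


def find_coverage_gaps(sections, subtopics, min_words=300):
    """Return the sub-topics that lack a dedicated section of at least min_words.

    sections:  dict of section heading -> section body text
    subtopics: dict of sub-topic name -> keywords expected in a matching heading
    """
    gaps = []
    for name, keywords in subtopics.items():
        covered = False
        for heading, body in sections.items():
            heading_matches = any(k.lower() in heading.lower() for k in keywords)
            word_count = len(re.findall(r"\w+", body))
            if heading_matches and word_count >= min_words:
                covered = True
                break
        if not covered:
            gaps.append(name)
    return gaps


# Hypothetical draft of an ERP article: one full section, one thin one.
draft_sections = {
    "Manufacturing ERP: what to look for": "word " * 320,
    "Pricing": "word " * 50,
}
# Hypothetical fan-out list; in practice this comes from your own research.
fan_out = {
    "manufacturing ERP": ["manufacturing"],
    "SMB ERP": ["smb", "small business"],
    "ERP pricing": ["pricing", "cost"],
}
print(find_coverage_gaps(draft_sections, fan_out))
```

With this sample draft the script flags "SMB ERP" (no section at all) and "ERP pricing" (a section exists but is too short). Heading keyword matching is crude, so treat the output as a prompt for a manual read, not a verdict.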

Also, when optimizing content, distinguish between “quantitative” queries and “comparative” queries. The data shows that quantitative queries (e.g., “What is the average price of CRM software?”) draw their citations mostly from data research and reports, while comparative queries (“A vs B”) rely more on Reddit and review sites. I previously didn’t differentiate these types; now I prepare a different content format for each. This doubles the amount of content needed. It’s labor-intensive, but the results are real.
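
If you keep a list of target queries, a rough first pass at sorting them into these two buckets can be automated. This is only a keyword heuristic sketched in Python; the patterns and the function name are my own assumptions, not a description of how any AI search engine classifies queries.

```python
import re

# Assumed trigger phrases; extend these for your own niche.
COMPARATIVE = re.compile(
    r"\b(?:vs|versus|compared (?:to|with)|alternatives? to|better than)\b",
    re.IGNORECASE,
)
QUANTITATIVE = re.compile(
    r"\b(?:how much|how many|average|median|price|pricing|cost|percent|percentage)\b",
    re.IGNORECASE,
)


def classify_query(query):
    """Label a query as 'comparative', 'quantitative', or 'other'."""
    # Comparative is checked first, so "A vs B pricing" counts as comparative.
    if COMPARATIVE.search(query):
        return "comparative"
    if QUANTITATIVE.search(query):
        return "quantitative"
    return "other"


for q in [
    "What is the average price of CRM software?",
    "Tool A vs Tool B for small teams",
    "How to choose an ERP system",
]:
    print(f"{classify_query(q):12} {q}")
```

Queries that land in "other" still need a human decision, and the comparative-first ordering is a design choice you may want to flip.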

E‑E‑A‑T and Naming Authors – Something You Can Fix Over a Weekend

On signal strength in AI-search citations, I read some data that seemed absurdly simple: pages with a bylined author and Person Schema are cited 2.4× more often than anonymous articles; if the author also has a Wikipedia page or a verified LinkedIn profile, the multiplier jumps to 4.1×. This is about the simplest, highest-ROI optimization there is: add an author bio at the bottom of the article, include a LinkedIn link, and embed Person Schema.

Our team ran into some funny resistance when we first tried this: most of our content is “compiled from multiple sources” and lacks a single clear author. The CTO didn’t want to sign because he didn’t consider those pieces his own; the CEO didn’t want to sign because she didn’t have time to review them. In the end I put my own name on it and spent an afternoon adding the schema. A month later, we were cited for the first time on a Perplexity query, and the article happened to be the one I signed. I can’t say it was entirely because of the byline, but it certainly didn’t hurt.

Of course, a byline alone isn’t enough. AI search’s notion of “expertise” heavily depends on verifiable credentials. If you write “How to Optimize Kubernetes Clusters” but your author bio only says “Content Manager,” that’s less credible than “Engineer with ten years of operations experience.” My current practice is to let people with real industry experience sign, even if they only review, and give them a “Technical Advisor” title. AI search seems to recognize this association—I don’t know exactly how, but the data shows it works.

Another surprising factor: the sameAs field in Schema. You need to add URLs pointing to your Twitter, LinkedIn, GitHub, Wikipedia, etc., in the Person Schema. I had ignored this step for a while; after adding it, citations increased a bit, though not dramatically—probably because my LinkedIn isn’t very influential. If you have a Wikipedia page, the effect is huge. Unfortunately I don’t have one, so I’m still building it.
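
For anyone who hasn't written this markup before, here is a minimal Python sketch that builds an Article JSON-LD object with a nested Person author and a sameAs list, then wraps it in a script tag. The property names (`author`, `jobTitle`, `sameAs`) are standard schema.org vocabulary; the helper names, the author "Jane Doe", and every URL are placeholders I made up for illustration.

```python
import json


def build_article_jsonld(headline, date_published, author_name, job_title, same_as):
    """Build an Article object whose author is a Person with sameAs links."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "author": {
            "@type": "Person",
            "name": author_name,
            "jobTitle": job_title,
            # Profile URLs that identify the same person elsewhere on the web.
            "sameAs": list(same_as),
        },
    }


def to_script_tag(data):
    """Serialize to a JSON-LD script tag for the page's HTML."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a stray "</script>" in a string cannot close the tag early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values only; swap in the real author and real profile URLs.
article = build_article_jsonld(
    headline="How to Optimize Kubernetes Clusters",
    date_published="2026-05-26",
    author_name="Jane Doe",
    job_title="Technical Advisor",
    same_as=[
        "https://www.linkedin.com/in/example",
        "https://github.com/example",
    ],
)
print(to_script_tag(article))
```

Paste the resulting tag into the page head or body, then run it through a structured-data validator before assuming it is being read correctly.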

Don’t Focus Only on Google; Perplexity and ChatGPT Are Not the Same

Earlier I mentioned the behavioral differences among AI search engines, and the point is worth expanding. In 2026 Google AI Overviews appear in about 38% of commercial queries, but if you think optimizing for Google alone is enough, you’ll likely miss growth on other channels.

I made that mistake myself: I spent a lot of effort optimizing content for Google AI Overviews, only to discover that ChatGPT never cited my content. Later testing showed that ChatGPT cares more about a brand’s overall web footprint than about the ranking of any single page. So when doing brand outreach, I also had to focus on third-party sources that boost the probability of a ChatGPT citation. Perplexity, on the other hand, especially loves Reddit and discussion posts; I tried posting several times, with mixed results, and I’m still figuring out the pattern.

For most SaaS teams today, the most sensible approach isn’t to try to optimize all three platforms simultaneously, but to pick the AI search engine your target users use most, go deep, then expand. Our users are mainly technical people who favor Perplexity, so I prioritized Perplexity. That meant spending time on Reddit and Stack Overflow—areas I’m not comfortable with. After half an hour a day of replying for three months, I finally saw some citations. The efficiency is low, but there seems to be no shortcut.

Also, don’t overlook the figure from earlier: 64% of AI-search citations come from Wikipedia, Reddit, and original-research domains. If you can establish a presence in any of those three, the effect may outweigh having 100 articles on your own site. But honestly, none of the three is easy to break into: Wikipedia edits are heavily vetted, Reddit tends to treat self-promotion as spam, and original research costs real money. My own research theme is “SaaS Pricing Trends”; I publish a data report each quarter and have been at it for a year, but Wikipedia has cited it only twice. Better than nothing, though.

FAQ

Will AI search replace traditional SEO?

Not yet. Our data still shows traditional Google search accounting for 71% of B2B SaaS discoveries. AI search is the fastest-growing channel, though, and Google’s own AI Overviews already appear in many queries. The right approach is to treat the two as parallel channels, not an either-or choice. You need to monitor both traditional rankings and AI citations.

Do I have to add author bylines and Schema to every page?

Not every page, but it’s advisable for core commercial pages and articles with ranking potential. The effort is minimal (a day or two) and the payoff can be high (citation rates up 2–4×). If you have many legacy articles, prioritize the ones that already get the most traffic.

Can automated content tools like SEONIB help?

Yes, but it depends on how you use them. I mainly use SEONIB to discover sub-topics and generate drafts, then add original research and external links manually. Purely AI-generated content without a bylined author and source data is unlikely to be cited by AI search. The tool speeds up production but can’t replace brand building.

How should I track AI‑search citations?

There’s no perfect monitoring tool yet. My method is to manually run a set of core queries on ChatGPT, Perplexity, and Google AI Overviews every two weeks and record the sources that cite us. Some paid tools are developing monitoring features, but I haven’t found a reliable one yet. If you discover a better way, feel free to share.
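
Even a manual check gets easier to compare over time if every run is logged in the same shape. A minimal Python sketch follows, assuming a CSV with columns `date`, `engine`, `query`, and `cited` (yes/no); the column names and the sample rows are invented for illustration and are not real results.

```python
import csv
import io
from collections import defaultdict


def citation_rates(rows):
    """Return {engine: share of logged queries where we were cited}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        engine = row["engine"]
        totals[engine] += 1
        if row["cited"].strip().lower() == "yes":
            hits[engine] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}


# Invented sample log; in practice, read this from a file you append to
# after each biweekly round of manual queries.
SAMPLE_LOG = """date,engine,query,cited
2026-05-01,Perplexity,best crm for small business,yes
2026-05-01,ChatGPT,best crm for small business,no
2026-05-01,Google AI Overviews,best crm for small business,no
2026-05-15,Perplexity,saas pricing trends,yes
2026-05-15,ChatGPT,saas pricing trends,yes
2026-05-15,Google AI Overviews,saas pricing trends,no
"""

rates = citation_rates(csv.DictReader(io.StringIO(SAMPLE_LOG)))
for engine, rate in sorted(rates.items()):
    print(f"{engine}: {rate:.0%}")
```

Keeping the query set fixed between runs matters more than the tooling: the rates are only comparable if the same questions are asked each time.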

Why aren’t my original research reports being cited?

Possible reasons: missing Dataset Schema, no dedicated methodology page, or too little promotion on social networks to generate third-party citations. AI search values “verifiability”: it needs to confirm that your data isn’t fabricated. We later added a “Methodology” page to each report, detailing data collection and sample size, which nudged citation rates up a bit, though not dramatically. I’m still exploring this area.
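
As a starting point for the Dataset Schema, here is a minimal Python sketch that emits a schema.org Dataset object pointing at a methodology page. `measurementTechnique` is a real schema.org property that accepts text or a URL; the function name, the example.com URLs, the organization name, and the description are placeholders, and whether any AI engine actually weighs these fields is an open question.

```python
import json


def build_dataset_jsonld(name, description, url, methodology_url,
                         date_published, creator_name):
    """Build a schema.org Dataset object that links to a methodology page."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        # State the sample size and collection period in plain words here.
        "description": description,
        "url": url,
        "datePublished": date_published,
        "creator": {"@type": "Organization", "name": creator_name},
        # Points readers and crawlers at how the data was collected.
        "measurementTechnique": methodology_url,
    }


# Placeholder values only.
dataset = build_dataset_jsonld(
    name="SaaS Pricing Trends",
    description="Quarterly pricing data for about 200 SaaS companies.",
    url="https://example.com/reports/saas-pricing-trends",
    methodology_url="https://example.com/reports/saas-pricing-trends/methodology",
    date_published="2026-04-01",
    creator_name="Example Co",
)
print(json.dumps(dataset, indent=2))
```

Embed the output in a script tag of type application/ld+json on the report page, alongside the visible link to the methodology page.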
