Be Seen.
Be Cited.
Be Chosen.
The search era asked, "Where do I rank?" The AI era asks, "How am I remembered?"
SeenGeo is the mirror of Chinese brands inside ChatGPT, DeepSeek, Doubao, Tongyi, and Kimi.
00 / Open Data
30 brands · ~310 scans · 160 scores · 5 dimensions · CC-BY 4.0
When a customer asks an LLM about a brand, the model answers from a memory we cannot see. SeenGeo is a public ledger of that memory — captured, dated, and scored, so anyone can read what the machines believe.
Every day, SeenGeo asks the same five questions about each of 30 Chinese consumer brands across roughly thirty large language models. We log the prompts, the responses, the token counts, and the latency. A separate judge model scores each response on five dimensions of brand recognition. The dataset is the raw record of that loop.
brands.csv holds the brand metadata. scans.csv holds every scan: the rendered prompt, the full response, the engine, the timestamp. scores.csv holds the per-dimension judgements with confidence. system_v1.txt is the system prompt every engine receives, unchanged.
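As a rough illustration of working with these files, the sketch below reads one of the CSVs and averages a numeric column per group. The column names are not documented on this page, so `dimension` and `score` in the usage note are assumptions; check the header row of scores.csv before relying on them. The helper names `read_rows` and `mean_score_by` are hypothetical and not part of the bundle.

```python
import csv
from collections import defaultdict
from statistics import mean


def read_rows(path):
    """Load a CSV file from the bundle as a list of dicts keyed by header name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def mean_score_by(rows, group_col, value_col):
    """Average value_col for each distinct value of group_col.

    Rows that lack either column, or whose value is not numeric, are skipped.
    Both column names are supplied by the caller because the real headers
    are an assumption here.
    """
    buckets = defaultdict(list)
    for row in rows:
        try:
            buckets[row[group_col]].append(float(row[value_col]))
        except (KeyError, ValueError, TypeError):
            continue
    return {group: mean(values) for group, values in buckets.items()}
```

If scores.csv does carry columns named `dimension` and `score`, a call such as `mean_score_by(read_rows("scores.csv"), "dimension", "score")` would return one average per dimension. Passing the column names in keeps the sketch usable if the actual headers differ.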
Search engine optimisation gave us decades of public corpora — crawl logs, link graphs, ranking studies — and an entire field of researchers who could read them. Generative engine optimisation does not yet have that. Most of what we know about how LLMs describe brands lives inside private dashboards.
We want a different default. The scans are public, the scores are public, the methodology is public. If you want to argue with our judgements, you can — and you should. Bring your own judge model; our prompts are in the bundle.
The dataset is released under the Creative Commons Attribution 4.0 International license. You may copy, redistribute, remix, and build upon the data — including commercially — provided you credit SeenGeo · seengeo.com and link back to this page.
The brand names, logos, and trademarks referenced inside scans belong to their owners. The dataset records third-party model output verbatim and does not endorse any claim it contains.
For academic work, please use the BibTeX below. For posts and articles, a link to seengeo.com/dataset is enough.
@dataset{seengeo2026,
title = {SeenGeo: An Open Dataset of LLM Brand Mirrors},
author = {{SeenGeo}},
year = {2026},
url = {https://seengeo.com/dataset},
note = {CC-BY 4.0}
}
Need a single brand? Each mirror page exposes its own export: see /r/[brand]/export?format=csv.
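A minimal sketch of building that export address for one brand follows. It assumes the path is served from https://seengeo.com (the host used in the citation URL) and that `[brand]` is a URL slug; the function name `export_url` and the slug `example-brand` are hypothetical placeholders, not real identifiers from the dataset.

```python
from urllib.parse import quote

# Assumption: the export path lives on the same host as the dataset page.
BASE_URL = "https://seengeo.com"


def export_url(brand_slug):
    """Return the per-brand CSV export URL following /r/[brand]/export?format=csv."""
    # Percent-encode the slug so unusual characters cannot break the path.
    return f"{BASE_URL}/r/{quote(brand_slug, safe='')}/export?format=csv"


# Hypothetical slug, used only to show the shape of the result.
print(export_url("example-brand"))
```

The resulting string can be passed to any HTTP client to download the file.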