People no longer search just to click - they ask to decide: “Where can I get cacio e pepe near the Colosseum right now? Do they have gluten-free?” Answer engines (ChatGPT, Gemini, Perplexity, Bing Copilot, Apple Intelligence) compile a single response from multiple sources using retrieval-augmented generation (RAG), vertical knowledge graphs, and structured snippets. If your site is easy to parse, you get included and cited. If not, you’re invisible.

This isn’t classical SEO theatre. It’s machine readability engineering.

What Actually Happens Under the Hood (High-Level Pipeline)

Real-world AI answer engines vary, but the architecture rhymes:

1. Crawling & Fetching
   - Crawlers honor robots.txt, sitemaps, and canonical URLs.
   - They prefer static HTML or server-side rendered (SSR/SSG) content. Apps that only render after client-side hydration get less love unless prerendered.
2. Content Extraction
   - Boilerplate removal (nav, footers, cookie banners) using DOM heuristics (Readability-like algorithms), visual density, and ARIA role hints.
   - Main content scoring via tag semantics (<article>, <main>, <h1> to <h6>, <section>), heading hierarchy, and text-to-link ratios.
3. Normalization
   - Language detection, de-duplication, canonicalization, currency/unit normalization, timezone resolution (critical for hours/menus/events).
4. Structuring
   - Parse schema.org JSON-LD/Microdata/RDFa.
   - Promote entity slots (Restaurant, Menu, MenuSection, MenuItem, Offer, price, OpeningHoursSpecification, GeoCoordinates).
   - Build internal knowledge graph edges: (Dish) -servedAt→ (Restaurant) -locatedIn→ (Rome).
5. Indexing
   - Build sparse and dense indexes per section/chunk (BM25 term statistics + vector embeddings).
   - Store table-like data separately (menu items, prices) to answer structured queries quickly.
6. Retrieval at Query Time
   - Query understanding → retrieve top-k chunks via hybrid search (BM25 + cosine similarity) + schema filters (e.g., @type=MenuItem).
   - Assemble context windows with citations and attribution-friendly snippets.
7. Generation
   - The LLM answers with citations; when structured slots exist, they’re privileged over plain text. If you have structured data, you win tie-breaks.

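To make the retrieval step less abstract, here is a minimal sketch of hybrid search with a schema filter. It is an illustration under stated assumptions, not how any particular answer engine works: the bag-of-words cosine is only a stand-in for a real dense embedding model, and `alpha`, `k1`, `b`, the chunk dictionaries, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every doc for the query (k1 and b are common defaults)."""
    tokenized = [tokenize(d) for d in docs]
    avgdl = (sum(len(t) for t in tokenized) / len(tokenized)) or 1.0
    df = Counter(term for t in tokenized for term in set(t))
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(tokens) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores

def cosine_scores(query, docs):
    """Cosine similarity over word counts: an assumed stand-in for dense embeddings."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(c * c for c in q.values()))
    out = []
    for d in docs:
        v = Counter(tokenize(d))
        dot = sum(q[t] * v[t] for t in q)
        norm = q_norm * math.sqrt(sum(c * c for c in v.values()))
        out.append(dot / norm if norm else 0.0)
    return out

def hybrid_top_k(query, chunks, k=2, alpha=0.5, type_filter=None):
    """chunks: dicts with 'text' and 'type'; type_filter mimics @type=MenuItem."""
    pool = [c for c in chunks if type_filter is None or c["type"] == type_filter]
    if not pool:
        return []
    texts = [c["text"] for c in pool]
    bm25 = bm25_scores(query, texts)
    top = max(bm25) or 1.0  # scale BM25 into 0..1 before mixing with cosine
    cos = cosine_scores(query, texts)
    fused = [alpha * (s / top) + (1 - alpha) * c for s, c in zip(bm25, cos)]
    ranked = sorted(zip(fused, pool), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:k]]

CHUNKS = [
    {"type": "MenuItem", "text": "Cacio e Pepe: tonnarelli, Pecorino Romano, black pepper. Price 12 EUR."},
    {"type": "MenuItem", "text": "Amatriciana: guanciale, tomato, Pecorino Romano. Price 13 EUR."},
    {"type": "Article", "text": "Our story: handmade pasta near the Colosseum since 1978."},
]
best = hybrid_top_k("price of cacio e pepe", CHUNKS, k=1, type_filter="MenuItem")
```

The point of the sketch is the shape of the computation: exact dish names feed the lexical score, surrounding context feeds the vector score, and the type filter narrows the pool before ranking.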

The theme: structure beats prose.

The Algorithms & Signals That Matter (and How to Feed Them)

1) DOM Semantics → Main Content Scoring

Many extractors score blocks by:

- Heading depth (<h1> near the top, reasonable <h2>/<h3> nesting).
- Semantic containers (<article>, <main>, <section role="region">).
- Density (characters per block, lower link density, fewer repeated patterns).
- ARIA roles (role="main", role="article").

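The scoring signals above can be sketched in a few lines. This is a simplified illustration, not the algorithm of Readability or of any specific engine: the `TAG_WEIGHT` values are invented for the example, text is attributed only to the innermost tracked container, and the input is assumed to be well-nested HTML.

```python
from html.parser import HTMLParser

# Hypothetical weights: semantic containers score up, boilerplate containers score down.
TAG_WEIGHT = {"main": 2.0, "article": 2.0, "section": 1.5, "div": 1.0,
              "nav": 0.2, "footer": 0.2, "aside": 0.2}

class BlockScorer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.blocks = []       # one dict per container: tag, text chars, link chars
        self.open_blocks = []  # indexes into self.blocks for containers still open
        self.link_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHT:
            self.blocks.append({"tag": tag, "text": 0, "link": 0})
            self.open_blocks.append(len(self.blocks) - 1)
        elif tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag in TAG_WEIGHT and self.open_blocks:
            self.open_blocks.pop()  # assumes well-nested markup
        elif tag == "a" and self.link_depth:
            self.link_depth -= 1

    def handle_data(self, data):
        size = len(data.strip())
        if size and self.open_blocks:
            block = self.blocks[self.open_blocks[-1]]
            block["text"] += size
            if self.link_depth:
                block["link"] += size

def score_blocks(html):
    """Return (score, tag) pairs, best first: text volume x (1 - link density) x tag weight."""
    parser = BlockScorer()
    parser.feed(html)
    scored = []
    for block in parser.blocks:
        if block["text"] == 0:
            continue
        link_density = block["link"] / block["text"]
        score = block["text"] * (1 - link_density) * TAG_WEIGHT[block["tag"]]
        scored.append((score, block["tag"]))
    return sorted(scored, reverse=True)

PAGE = (
    '<nav><a href="/">Home</a> <a href="/menu">Menu</a></nav>'
    "<main><article><h1>Trattoria Aurelia</h1>"
    "<p>Handmade pasta near the Colosseum.</p></article></main>"
)
ranking = score_blocks(PAGE)
```

On this sample the `<article>` block ranks first and the link-only `<nav>` scores zero, which is the behavior the list describes.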

Do:

    <main>
      <article>
        <header>
          <h1>Trattoria Aurelia - Roman Classics Since 1978</h1>
          <p class="lede">Handmade pasta, seasonal produce, and late-night hours near the Colosseum.</p>
        </header>
        <section>
          <h2>Menu</h2>
          <ul>
            <li>
              <h3>Cacio e Pepe</h3>
              <p>Fresh tonnarelli with Pecorino Romano DOP and Tellicherry pepper.</p>
              <p><strong>€12</strong></p>
            </li>
          </ul>
        </section>
      </article>
    </main>

Don’t: nest the entire page inside <div class="container"> with custom class names only. Models lose strong hints about what’s important.

2) Structured Data (schema.org) → Entity Extraction

LLM pipelines elevate JSON-LD above free text. Use Restaurant + Menu + MenuItem + Offer + OpeningHoursSpecification. Include sameAs, geo, priceRange, servesCuisine, acceptsReservations.

Full Menu Example (JSON-LD)

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Restaurant",
      "name": "Trattoria Aurelia",
      "servesCuisine": ["Italian", "Roman"],
      "priceRange": "€€",
      "telephone": "+39 06 5555 1234",
      "address": {
        "@type": "PostalAddress",
        "streetAddress": "Via dei Fori Imperiali 12",
        "addressLocality": "Rome",
        "addressRegion": "RM",
        "postalCode": "00184",
        "addressCountry": "IT"
      },
      "geo": {"@type": "GeoCoordinates", "latitude": 41.8902, "longitude": 12.4922},
      "acceptsReservations": "Yes",
      "openingHoursSpecification": [
        {"@type": "OpeningHoursSpecification", "dayOfWeek": ["Monday","Tuesday","Wednesday","Thursday","Friday"], "opens": "11:30", "closes": "23:00", "validFrom": "2025-01-01"},
        {"@type": "OpeningHoursSpecification", "dayOfWeek": ["Saturday","Sunday"], "opens": "11:30", "closes": "01:00"}
      ],
      "hasMenu": {
        "@type": "Menu",
        "name": "Dinner Menu - Summer",
        "hasMenuSection": [
          {
            "@type": "MenuSection",
            "name": "Pasta",
            "hasMenuItem": [
              {
                "@type": "MenuItem",
                "name": "Cacio e Pepe",
                "description": "Tonnarelli, Pecorino Romano DOP, black pepper",
                "offers": {"@type": "Offer", "priceCurrency": "EUR", "price": 12}
              },
              {
                "@type": "MenuItem",
                "name": "Amatriciana",
                "description": "Guanciale, tomato, Pecorino Romano",
                "offers": {"@type": "Offer", "priceCurrency": "EUR", "price": 13}
              }
            ]
          }
        ]
      },
      "sameAs": [
        "https://maps.google.com/?cid=...",
        "https://www.instagram.com/trattoriaaurelia",
        "https://www.facebook.com/trattoriaaurelia"
      ]
    }
    </script>

Why it works: Retrieval systems can slot-answer queries like “price of cacio e pepe at Trattoria Aurelia” without re-parsing prose.

3) Content Chunking → Embedding Recall

Retrievers index pages as chunked embeddings. Oversized chunks dilute the match and hurt recall; too many tiny chunks lose context.

Practice:

- Aim for 300–800 tokens per logical section (roughly 1–4 paragraphs + a table).
- Use headings and id anchors; keep each dish/FAQ in its own subsection.
- Provide permalink anchors (e.g., /menu#cacio-e-pepe) so retrievers can cite precise spans.

    <section id="cacio-e-pepe">
      <h3>Cacio e Pepe</h3>
      <p>...</p>
    </section>

4) Tabular Data → Parseable Tables, Not Pictures

If prices/hours appear in tables, keep them as semantic tables or lists. Avoid images/PDFs.

    <table>
      <caption>Dinner Prices</caption>
      <thead>
        <tr><th>Dish</th><th>Price (EUR)</th><th>Allergens</th></tr>
      </thead>
      <tbody>
        <tr><td>Cacio e Pepe</td><td>12</td><td>Milk, Gluten</td></tr>
        <tr><td>Amatriciana</td><td>13</td><td>Milk, Pork, Gluten</td></tr>
      </tbody>
    </table>

Bonus: mirror the data in JSON-LD so both text and structure exist.

5) Multilingual & Locale Signals → Correct Matching

Query: “Where to eat cacio e pepe near the Colosseum, up to 15 euros?” Make sure your page clarifies language, currency, and timezone.

- Use lang on <html> and hreflang for alternates.
- Normalize currency with ISO 4217 codes in JSON-LD (priceCurrency: "EUR").
- Provide local phone formats and openingHoursSpecification with validFrom/validThrough if seasonal.

6) Canonicalization, Sitemaps, and Change Hints

Retrievers prioritize freshness for operational facts (hours, prices, availability).

- Add lastmod to sitemaps and keep it honest.
- Use <link rel="canonical"> to avoid duplicate menus across pages that differ only by UTM parameters.
- Publish invalidation-friendly URLs: /menu/summer-2025 rather than /menu?date=1699.
- Avoid JS-only rendering for price text; SSR it.

Non-Trivial, Production-Level Patterns

A) Disambiguate Similar Entities with @id and Stable Anchors

If you have two venues (Trastevere vs. Monti), give each a stable @id and specific geo.

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@id": "https://trattoriaaurelia.it/locations/monti#restaurant",
      "@type": "Restaurant",
      "name": "Trattoria Aurelia - Monti",
      "geo": {"@type": "GeoCoordinates", "latitude": 41.894, "longitude": 12.494},
      "address": {"@type": "PostalAddress", "addressLocality": "Rome"}
    }
    </script>

Now an AI can answer: “Which branch serves cacio e pepe past midnight?” by joining MenuItem with the Monti branch’s hours.

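As a rough illustration of that join, the sketch below assumes the JSON-LD has already been extracted into plain dictionaries keyed by @id. The Trastevere @id, both branches' hours, and the dish sets are invented for the example; only the Monti @id comes from the markup above. A closing time earlier than the opening time is read as wrapping past midnight.

```python
from datetime import time

# Hypothetical extracted nodes, keyed by @id. All values except the Monti @id are assumptions.
BRANCHES = {
    "https://trattoriaaurelia.it/locations/monti#restaurant": {
        "name": "Trattoria Aurelia - Monti",
        "hours": {"Saturday": ("11:30", "01:00")},
        "dishes": {"Cacio e Pepe", "Amatriciana"},
    },
    "https://trattoriaaurelia.it/locations/trastevere#restaurant": {
        "name": "Trattoria Aurelia - Trastevere",
        "hours": {"Saturday": ("11:30", "23:00")},
        "dishes": {"Cacio e Pepe"},
    },
}

def parse_hhmm(value):
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))

def open_past_midnight(opens, closes):
    """True when the closing time falls after 00:00 on the following day."""
    closing = parse_hhmm(closes)
    return time(0, 0) < closing < parse_hhmm(opens)

def branches_serving_late(dish, day):
    """Join dish availability with opening hours for the given weekday name."""
    hits = []
    for node in BRANCHES.values():
        hours = node["hours"].get(day)
        if dish in node["dishes"] and hours and open_past_midnight(*hours):
            hits.append(node["name"])
    return sorted(hits)

late = branches_serving_late("Cacio e Pepe", "Saturday")
```

Without distinct @id values the two branches would collapse into one entity and the join could not tell them apart.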

B) Recipe/Allergen Knowledge → RestrictedDiet & suitableForDiet

    {
      "@context": "https://schema.org",
      "@type": "MenuItem",
      "name": "Cacio e Pepe",
      "suitableForDiet": ["https://schema.org/VegetarianDiet"],
      "menuAddOn": [
        {"@type": "MenuItem", "name": "Gluten-free pasta", "offers": {"@type": "Offer", "price": 2, "priceCurrency": "EUR"}}
      ]
    }

This lets assistants answer “Is their cacio e pepe vegetarian? Can I get gluten-free?” without having to guess.

C) Temporal Facts → Validity Windows

Menus change. Model pipelines weight recent structured facts.

    {
      "@type": "Offer",
      "price": 12,
      "priceCurrency": "EUR",
      "priceValidUntil": "2025-10-01"
    }

Pair with sitemap.xml updates so freshness signals align.

D) FAQ Blocks → Extractive Answer Boosters

LLMs love Q&A pairs. Provide FAQPage with concise answers and link to canonical detail.

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "Do you accept walk-ins after 11pm?",
          "acceptedAnswer": {"@type": "Answer", "text": "Yes, until 12:30am on weekends."}
        }
      ]
    }
    </script>

E) Robust Images → ALT, Figure Captions, and EXIF Hygiene

- Use <figure><img alt="Cacio e pepe with Pecorino Romano"><figcaption>…</figcaption></figure>.
- Keep filenames descriptive (cacio-e-pepe-pecorino.jpg).
- Don’t bake text into images; assistants can’t extract it reliably.

F) Events & Reservations → Event + Deep Links

    {
      "@context": "https://schema.org",
      "@type": "Event",
      "name": "Truffle Week",
      "startDate": "2025-11-02",
      "endDate": "2025-11-09",
      "location": {
        "@type": "Place",
        "name": "Trattoria Aurelia - Monti",
        "address": {"@type": "PostalAddress", "addressLocality": "Rome"}
      },
      "offers": {"@type": "Offer", "url": "https://trattoriaaurelia.it/reserve?event=truffle-week"}
    }

Assistants can now answer “Any truffle events in Rome next week?” and deep-link bookings.

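Here is a small sketch of how a "next week" query could be matched against the Event dates above. It assumes "next week" means the Monday-to-Sunday week after the current one, which is only one possible reading; the function names and the flattened `EVENT` dictionary are hypothetical.

```python
from datetime import date, timedelta

# Flattened copy of the fields from the Event markup above.
EVENT = {
    "name": "Truffle Week",
    "startDate": "2025-11-02",
    "endDate": "2025-11-09",
    "url": "https://trattoriaaurelia.it/reserve?event=truffle-week",
}

def next_week(today):
    """Return (monday, sunday) of the calendar week following `today`."""
    monday = today - timedelta(days=today.weekday()) + timedelta(days=7)
    return monday, monday + timedelta(days=6)

def overlaps(event, window_start, window_end):
    """True when the event's date range intersects the window (inclusive bounds)."""
    start = date.fromisoformat(event["startDate"])
    end = date.fromisoformat(event["endDate"])
    return start <= window_end and end >= window_start

def events_next_week(events, today):
    window_start, window_end = next_week(today)
    return [e["url"] for e in events if overlaps(e, window_start, window_end)]

links = events_next_week([EVENT], today=date(2025, 10, 29))
```

Because the dates are ISO 8601 strings, no locale guessing is needed before the comparison, which is the practical reason to publish them in that form.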

How This Maps to RAG & Embeddings

- Hybrid Retrieval: Systems combine classical IR (BM25) with dense vectors. Your job: make exact tokens discoverable (dish names, neighborhoods, hours) and provide rich context for vectors.
- Chunk Boundaries: Use headings and logical grouping; don’t interleave unrelated content in one paragraph (e.g., lunch and dinner prices together).
- Anchor-able Citations: Provide stable anchors so assistants can attribute precisely - this improves the chance you’re cited.

Internal Linking = Graph Strength

Think in graphs: link dish → origin story → ingredient sourcing → allergen policy. Each link with descriptive anchor text helps entity resolution and gives retrievers semantic hops.

Pitfalls That Break AI Parsing (Seen in Production)

- Menus as PDFs/JPEGs - invisible. Provide HTML + JSON-LD mirror.
- Prices rendered only client-side - many crawlers don’t execute JavaScript, or they time out before hydration finishes; SSR your critical content.
- Infinite scroll for core info - put the essentials above-the-fold in the initial HTML.
- Locale confusion - 12,00 vs 12.00, ambiguous timezones. Normalize in JSON-LD.
- Over-nested div soup - no semantic hints; extractors miss the main content.
- Duplicated pages without canonical - near-identical chunks compete with each other in retrieval; citations fragment.

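For the locale-confusion pitfall, the following sketch normalizes displayed prices such as 12,00 and 12.00 into one machine-readable value before writing them into an Offer. The heuristic is an assumption, not a standard: a final separator followed by one or two digits is treated as the decimal mark, and every other separator as grouping. Three-decimal currencies and inputs like "1.5k" are out of scope.

```python
import re
from decimal import Decimal

def parse_price(text):
    """Parse a displayed price ('12,00', '12.00', '€ 1.234,50') into a Decimal."""
    cleaned = re.sub(r"[^\d.,]", "", text).strip(".,")
    if not cleaned:
        raise ValueError(f"no digits found in {text!r}")
    last_sep = max(cleaned.rfind(","), cleaned.rfind("."))
    # Heuristic (assumption): 1-2 digits after the last separator means decimals.
    if last_sep != -1 and len(cleaned) - last_sep - 1 in (1, 2):
        whole, frac = cleaned[:last_sep], cleaned[last_sep + 1:]
    else:
        whole, frac = cleaned, ""
    whole = re.sub(r"[.,]", "", whole)  # drop grouping separators
    return Decimal(f"{whole}.{frac}" if frac else whole)

def to_offer(text, currency="EUR"):
    """Build a schema.org Offer fragment with an unambiguous price string."""
    return {"@type": "Offer", "price": str(parse_price(text)), "priceCurrency": currency}

offer = to_offer("12,00")
```

The visible page can keep the local format; only the JSON-LD needs the dot-decimal value plus an explicit currency code.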

A Concrete Checklist

- <html lang> set; localized alternates with hreflang.
- SSR/SSG for critical text (menu, hours, address, phone).
- <main>, <article>, correct H1/H2 hierarchy.
- schema.org JSON-LD for your vertical (Restaurant, Product, Event, FAQ, Review).
- Each item (dish, product) gets its own <section id="…"> and anchor link.
- Tables are <table>, not screenshots.
- Sitemap with lastmod; stable canonical URLs.
- Use @id, sameAs, geo, openingHoursSpecification.
- Provide allergens and dietary flags.
- Test with multiple parsers (Google Rich Results Test + a Readability clone + your own micro RAG script).

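A few of these checks are easy to automate against the server-rendered HTML. The sketch below covers only a subset (lang, <main>/<article>, a single <h1>, a canonical link, at least one JSON-LD block); the class name, the check names, and the example.com URL are hypothetical, and passing it says nothing about whether the markup is correct, only that it is present.

```python
from html.parser import HTMLParser

class PageAudit(HTMLParser):
    """Collect the handful of signals needed for a quick presence check."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen = set()
        self.h1_count = 0
        self.has_canonical = False
        self.jsonld_blocks = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.seen.add(tag)
        if tag == "html":
            self.lang = attrs.get("lang")
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "link" and attrs.get("rel") == "canonical":
            self.has_canonical = True
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self.jsonld_blocks += 1

def audit(html):
    """Return a dict of check name -> bool for the raw (non-JS) HTML."""
    page = PageAudit()
    page.feed(html)
    return {
        "html_lang": bool(page.lang),
        "main_and_article": {"main", "article"} <= page.seen,
        "single_h1": page.h1_count == 1,
        "canonical": page.has_canonical,
        "json_ld": page.jsonld_blocks > 0,
    }

GOOD = (
    '<html lang="it"><head><link rel="canonical" href="https://example.com/menu">'
    '<script type="application/ld+json">{"@type": "Restaurant"}</script></head>'
    "<body><main><article><h1>Menu</h1></article></main></body></html>"
)
BAD = '<html><body><div class="container"><h1>A</h1><h1>B</h1></div></body></html>'
report = audit(GOOD)
```

Running it on the HTML a crawler would actually receive (no JavaScript executed) is what makes the result meaningful.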

Bonus: Build a Tiny RAG Parser to “Think Like an Answer Engine”

You can locally simulate how assistants will see your page:

1. Fetch the server-rendered HTML (what a crawler gets without running JavaScript).
2. Run Readability (or a port of it) to extract main content.
3. Parse JSON-LD via a micro schema.org extractor.
4. Chunk by headings; embed each chunk (any open-source SentenceTransformer).
5. Ask queries; retrieve top-k; verify the text is sufficient without images/JS.

Once your page answers your own local RAG, you’ve cleared the biggest hurdle.

Other Vertical Playbooks

- E-commerce: Product, Offer, AggregateRating, Review, ItemAvailability, Brand, GTIN, color/size variants as separate @id nodes.
- Events: Event with startDate/endDate, eventAttendanceMode, location.geo, ticket Offer with currency and availabilityEnds.
- Local Services: LocalBusiness with Service items (areaServed, hasOfferCatalog).
- Docs & APIs: TechArticle, HowTo, stable permalinks per endpoint, code blocks labeled, and FAQPage for common errors.

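Whichever vertical applies, it helps to confirm which types a page really exposes. The sketch below is one way to write the micro schema.org extractor from step 3 of the tiny RAG parser. It is deliberately simplistic: a regular expression stands in for a real HTML parser, `@type` is assumed to be a single string, `offers` is assumed to be a single object, and every function name is made up for the example.

```python
import json
import re

# Shortcut (assumption): a regex is good enough for well-formed pages in a local test.
JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def iter_nodes(value):
    """Yield every dict carrying an @type, depth-first, including nested nodes."""
    if isinstance(value, dict):
        if "@type" in value:
            yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_nodes(item)

def extract_nodes(html):
    nodes = []
    for raw in JSONLD_RE.findall(html):
        try:
            nodes.extend(iter_nodes(json.loads(raw)))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
    return nodes

def price_of(nodes, dish):
    """Slot lookup: (price, currency) for a MenuItem by name, or None if absent."""
    for node in nodes:
        if node.get("@type") == "MenuItem" and str(node.get("name", "")).lower() == dish.lower():
            offer = node.get("offers") or {}
            return offer.get("price"), offer.get("priceCurrency")
    return None

DATA = {
    "@context": "https://schema.org", "@type": "Restaurant", "name": "Trattoria Aurelia",
    "hasMenu": {"@type": "Menu", "hasMenuSection": [{"@type": "MenuSection", "name": "Pasta",
        "hasMenuItem": [{"@type": "MenuItem", "name": "Cacio e Pepe",
            "offers": {"@type": "Offer", "price": 12, "priceCurrency": "EUR"}}]}]},
}
HTML = f'<script type="application/ld+json">{json.dumps(DATA)}</script>'
NODES = extract_nodes(HTML)
```

If `price_of` returns None for a dish that is visibly on the page, the structured mirror is missing or malformed, and an answer engine would have to fall back to prose.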

Metrics and Monitoring

- Track assistant referrals (per-assistant tracking parameters, e.g., ?ref=assistant-chatgpt).
- Measure crawl frequency via log analysis (user-agents like “GPTBot”, “CCBot”, “PerplexityBot”, etc.).
- Watch index freshness: time from deploy → assistant mentions updated price.
- A/B test schema richness on a subset of pages.

If you can’t measure whether assistants picked you as a source, you’re guessing.

Ethical & Practical Notes

- Respect accessibility: what helps screen readers helps machines.
- Avoid deceptive markup - penalties are coming.
- Publish privacy-respecting structured data (no dark patterns in JSON-LD).

Semantic Markup as an AI Contract

Semantic HTML and structured data are not cosmetics; they’re a contract with machine readers. When you honor it - clear semantics, honest structure, fresh metadata - you’re easy to retrieve, easy to cite, and hard to hallucinate.

Good markup today is discoverability tomorrow. Ship it.