Key Patterns Reveal The Future Of AI Discovery
The structured data landscape has undergone significant transformation in 2024, driven by the rise of AI-powered search, the growing importance of machine-readable content, and the need to ground large language models in factual data.
According to the latest edition of the HTTP Archive’s Web Almanac, an analysis of structured data across 16.9 million websites reveals a clear shift from traditional SEO implementation to more sophisticated knowledge graph development that powers AI discovery systems.
While Google deprecated certain rich results like FAQs and HowTos in 2023, it simultaneously introduced an unprecedented number of new structured data types, including vehicle listings, course info, vacation rentals, profile pages, and 3D product models.
In February 2024, it expanded support for product variants and GS1 Digital Link, followed by the beta launch of structured data carousels in March.
This rapid evolution signals a maturing ecosystem where structured data not only serves search visibility but also forms the foundation for factual AI responses, language model training, and enhanced digital product experiences.
Analysis and Methodology
The insights presented in this article are based on the 2024 edition of the Structured Data chapter of the HTTP Archive’s Web Almanac. The annual report analyzes the state of the web by evaluating structured data implementation across 16.9 million websites, and relies on tools like WebPageTest, Lighthouse, and Wappalyzer to capture metrics on structured data formats, adoption trends, and performance. The underlying datasets are publicly queryable on BigQuery in the `httparchive.all.*` tables for the date `2024-06-01`.
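As a rough starting point for exploring those tables, the sketch below only builds a SQL string and does not run anything against BigQuery. The table name `httparchive.all.pages` and the `date`, `client`, `page`, and `is_root_page` columns are assumptions about the dataset layout, not something this article confirms, so check them against the HTTP Archive documentation before use.

```python
def build_page_count_query(date: str = "2024-06-01", client: str = "mobile") -> str:
    """Return a SQL string that counts distinct root pages in one crawl.

    Assumptions (not confirmed by the article): the table
    `httparchive.all.pages` and its `date`, `client`, `page`,
    and `is_root_page` columns.
    """
    # Plain string formatting is fine for a fixed, trusted date and client;
    # use query parameters instead if the values come from user input.
    return (
        "SELECT COUNT(DISTINCT page) AS pages\n"
        "FROM `httparchive.all.pages`\n"
        f"WHERE date = '{date}'\n"
        f"  AND client = '{client}'\n"
        "  AND is_root_page"
    )


query = build_page_count_query()
print(query)
```

Filtering on the crawl date first is a deliberate choice here, since restricting a large public table to a single date is the simplest way to limit how much data a query scans.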
Structured Data Adoption Trends
The analysis reveals compelling growth across major structured data formats:
- JSON-LD reaches 41% adoption (+7% YoY).
- RDFa maintains leadership with 66% presence (+3% YoY).
- Open Graph implementation grows to 64% (+5% YoY).
- X (Twitter) meta tag usage increases to 45% (+8% YoY).
This widespread adoption indicates that organizations are investing in structured data not just for search visibility, but also to help AI systems and crawlers understand their content and to enhance their digital experiences.
AI Discovery And Knowledge Graphs
The relationship between structured data and AI systems is evolving in complex ways.
While many generative AI search engines are still developing their approach to leveraging structured data, established platforms like Bing Copilot and Google Gemini, along with specialized tools like SearchGPT, already seem to demonstrate the value of entity-based understanding, particularly for local queries and factual validation.
Training And Entity Understanding
Generative AI search engines are trained on vast datasets that include structured data markup, influencing how they:
- Recognize and categorize entities (products, locations, organizations).
- Ground responses. We see this in systems like DataGemma that use structured data to ground responses in verifiable facts.
- Understand relationships between different data points. This is particularly evident when schema.org is used for aggregating datasets from authoritative sources worldwide.
- Process specific query types like local business and product searches.
This training shapes how AI systems interpret and respond to queries, particularly visible in:
- Local business queries where entity attributes match structured data patterns.
- Product queries that reflect merchant-provided structured data.
- Knowledge panel information that aligns with entity definitions.
Search Engine Integration
Different platforms demonstrate structured data influence through:
- Traditional Search: Rich results and knowledge panels directly powered by structured data.
- AI Search Integration:
  - Bing Copilot showing enhanced results for structured entities.
  - Google Gemini reflecting knowledge graph information.
  - Specialized engines like Perplexity.ai demonstrating entity understanding in location queries.
  - Google’s latest experiment with an AI Sales Assistant integrated into the SERP for shopping queries (spotted by SERP Alert on X).
Here is an example of Gemini and Google Search sharing the same factoid.
Data Validation And Verification
Structured data provides verification mechanisms through:
- Knowledge Graphs: Systems like Google’s Data Commons use structured data for fact verification.
- Training Sets: Schema.org markup creates reliable training examples for entity recognition.
- Validation Pipelines: Content generation tools, like WordLift, use structured data to verify AI outputs.
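To make the validation idea in the last bullet more concrete, here is a minimal sketch of checking claimed facts against a JSON-LD entity. It is not how WordLift or any other named tool works. The function name `verify_claims`, the sample entity, and the assumption that claims have already been extracted from generated text into property/value pairs are all hypothetical.

```python
def verify_claims(claims: dict, entity: dict) -> dict:
    """Compare claimed property values with those in a JSON-LD entity.

    Returns a dict mapping each claimed property to "match",
    "mismatch", or "unknown" (property absent from the entity).
    """
    results = {}
    for prop, claimed in claims.items():
        if prop not in entity:
            results[prop] = "unknown"
        elif str(entity[prop]).strip().lower() == str(claimed).strip().lower():
            results[prop] = "match"
        else:
            results[prop] = "mismatch"
    return results


# Hypothetical entity, as it might appear in a page's structured data.
entity = {
    "@type": "LocalBusiness",
    "name": "Example Bakery",
    "telephone": "+1-555-0100",
}

# Hypothetical facts pulled out of an AI-generated answer.
claims = {
    "name": "example bakery",
    "telephone": "+1-555-0199",
    "priceRange": "$$",
}

report = verify_claims(claims, entity)
print(report)
```

The comparison is a deliberately naive case-insensitive string match. A real pipeline would need value normalization (phone formats, units, dates) and a policy for what to do with "unknown" results.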
The key distinction is that structured data doesn’t directly influence LLM responses, but rather shapes AI search engines through:
- Training data that includes structured markup.
- Entity class definitions that guide understanding.
- Integration with traditional search rich results.
This makes structured data implementation increasingly important for visibility across both traditional and AI-powered search platforms.
As we enter this new era of AI Discovery, investing in structured data isn’t just about SEO anymore – it’s about building the semantic layer that enables machines to truly understand and accurately represent who you are.
Semantic SEO Evolution: From Structured Data To Semantic Data
The practice of SEO has evolved into Semantic SEO, going beyond traditional keyword optimization to embrace semantic understanding:
Entity-Based Optimization
- Focus on clear entity definitions and relationships.
- Implementation of comprehensive entity attributes.
- Strategic use of sameAs properties for entity disambiguation.
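As an illustration of entity disambiguation with `sameAs`, the sketch below builds a small schema.org `Organization` node in Python and serializes it as JSON-LD. The organization name, the URLs, and the helper name `organization_entity` are placeholders invented for this example.

```python
import json


def organization_entity(name: str, url: str, same_as: list) -> dict:
    """Build a minimal schema.org Organization node with sameAs links."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": f"{url}#organization",  # stable identifier for this entity
        "name": name,
        "url": url,
        # Profiles on other sites that describe the same entity.
        "sameAs": same_as,
    }


org = organization_entity(
    "Example Co",  # hypothetical organization
    "https://www.example.com/",
    [
        "https://profiles.example.org/example-co",
        "https://directory.example.net/companies/example-co",
    ],
)
print(json.dumps(org, indent=2))
```

The `sameAs` values are meant to point at authoritative pages that describe the same entity, which helps a consumer decide which "Example Co" the markup is about.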
Content Networks
- Development of interconnected content clusters.
- Clear attribution and authorship markup.
- Rich media relationship definitions.
Key Implementation Patterns In JSON-LD
Content Publishing
Analysis of structured data patterns across millions of websites reveals three dominant implementation trends for content publishers.
Website Structure & Navigation (+6 Million Implementations)
The dominance of WebPage → isPartOf → WebSite (5.8 million) and WebPage → breadcrumb → BreadcrumbList (4.8 million) relationships demonstrates that major websites prioritize clear site architecture and navigation paths.
Site structure remains the foundation of structured data implementation, suggesting that search engines heavily rely on these signals for understanding content hierarchy.
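The two relationships above can be sketched as a small JSON-LD graph. This is one possible shape, assuming nodes are linked by `@id` references inside an `@graph` array. The URLs, page names, and the helper name `page_graph` are hypothetical.

```python
import json


def page_graph(site_url: str, page_url: str, page_name: str, trail: list) -> dict:
    """Build a WebSite / WebPage / BreadcrumbList graph.

    trail is a list of (name, url) tuples from the home page to this page.
    """
    website_id = f"{site_url}#website"
    breadcrumb_id = f"{page_url}#breadcrumb"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "WebSite", "@id": website_id, "url": site_url},
            {
                "@type": "WebPage",
                "@id": page_url,
                "url": page_url,
                "name": page_name,
                "isPartOf": {"@id": website_id},       # WebPage -> isPartOf -> WebSite
                "breadcrumb": {"@id": breadcrumb_id},  # WebPage -> breadcrumb -> BreadcrumbList
            },
            {
                "@type": "BreadcrumbList",
                "@id": breadcrumb_id,
                "itemListElement": [
                    {"@type": "ListItem", "position": i, "name": name, "item": url}
                    for i, (name, url) in enumerate(trail, start=1)
                ],
            },
        ],
    }


graph = page_graph(
    "https://www.example.com/",
    "https://www.example.com/guides/structured-data/",
    "Structured Data Guide",
    [
        ("Home", "https://www.example.com/"),
        ("Guides", "https://www.example.com/guides/"),
    ],
)
print(json.dumps(graph, indent=2))
```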
Content Attribution & Authority
Strong patterns emerge around content attribution:
- Article → author → Person (925,000).
- Article → publisher → Organization (597,000).
- BlogPosting → author → Person (217,000).
This focus on authorship and organizational attribution reflects the increasing importance of E-E-A-T signals and content authority in search algorithms.
Rich Media Integration
Consistent implementation of image markup across content types:
- WebPage → primaryImageOfPage → ImageObject (3 million)
- Article → image → ImageObject (806,000)
The high frequency of media relationships indicates that publishers recognize the value of structured visual content for both search visibility and user experience.
The data suggests publishers are moving beyond basic SEO markup to create comprehensive machine-readable content graphs that support both traditional search and emerging AI discovery systems.
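For readers curious how Type → property → Type patterns like those above could be tallied, here is a minimal sketch that walks a nested JSON-LD object and counts each relationship. It is not the Web Almanac’s actual methodology. It assumes nested nodes rather than `@id` references, and the sample article and helper names are hypothetical.

```python
from collections import Counter


def types_of(node: dict) -> list:
    """Return the node's @type values as a list (handles string or list)."""
    t = node.get("@type", [])
    return [t] if isinstance(t, str) else list(t)


def count_patterns(node: dict, counts: Counter = None) -> Counter:
    """Recursively count (parent type, property, child type) patterns."""
    if counts is None:
        counts = Counter()
    for prop, value in node.items():
        if prop.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @type
        values = value if isinstance(value, list) else [value]
        for child in values:
            if isinstance(child, dict):
                for parent_type in types_of(node):
                    for child_type in types_of(child):
                        counts[(parent_type, prop, child_type)] += 1
                count_patterns(child, counts)
    return counts


# Hypothetical article markup using the relationships discussed above.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "An Example Headline",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "publisher": {
        "@type": "Organization",
        "name": "Example Media",
        "logo": {"@type": "ImageObject", "url": "https://www.example.com/logo.png"},
    },
    "image": {"@type": "ImageObject", "url": "https://www.example.com/cover.jpg"},
}

patterns = count_patterns(article)
for (subject, prop, obj), n in sorted(patterns.items()):
    print(f"{subject} -> {prop} -> {obj}: {n}")
```

Run across many pages and summed, counters like this would yield the kind of frequency table quoted in this section. Child nodes without an `@type` are not counted.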
Local Business & Retail
Analysis of local business structured data implementation reveals three critical pattern groups that dominate location-based markup.
Location & Accessibility (+1.4 Million Implementations)
High adoption of physical location markup demonstrates its fundamental importance:
- LocalBusiness → address → PostalAddress (745,000).
- Place → address → PostalAddress (658,000).
- Organization → contactPoint → ContactPoint (334,000).
- LocalBusiness → openingHoursSpecification (519,000).
The strong presence of these basic operational details suggests they are core ranking factors for local search visibility.
Geographic Precision
Significant implementation of geo-coordinates shows focus on precise location:
- Place → geo → GeoCoordinates (231,000).
- LocalBusiness → geo → GeoCoordinates (205,000).
This dual approach to location (address + coordinates) indicates search engines value precise geographic positioning for local search accuracy.
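A local business entity that combines address, coordinates, and opening hours might look like the sketch below. The business details are invented. The short completeness check covers only the properties named in this section and is not an official requirement list from any search engine.

```python
import json

# Hypothetical business; every value here is a placeholder.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Bakery",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Springfield",
        "postalCode": "00000",
        "addressCountry": "US",
    },
    "geo": {"@type": "GeoCoordinates", "latitude": 40.0, "longitude": -75.0},
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "08:00",
            "closes": "17:00",
        }
    ],
}

# Properties highlighted in this section (an editorial choice, not a standard).
LOCATION_PROPERTIES = ("address", "geo", "openingHoursSpecification")

missing = [p for p in LOCATION_PROPERTIES if p not in local_business]
print("Missing location properties:", missing or "none")
print(json.dumps(local_business, indent=2))
```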
Trust Signals
A smaller but notable pattern group focuses on reputation:
- LocalBusiness → review → Review (94,000)
- LocalBusiness → aggregateRating → AggregateRating (70,000)
- LocalBusiness → photos → ImageObject (42,000)
- LocalBusiness → makesOffer → Offer (56,000)
While less frequently implemented, these trust-building elements create richer local business entities that support both search visibility and user decision-making.
Ecommerce (Expanded List)
Analysis of ecommerce structured data reveals sophisticated implementation patterns that focus on product discovery and conversion optimization.
Core Product Information (+4.7 Million Implementations)
The dominance of basic product markup shows its fundamental importance:
- Product → offers → Offer (3.1 million).
- Offer → seller → Organization (2.2 million).
- Product → mainEntityOfPage → WebPage (1.5 million).
This high adoption rate of core product relationships indicates their critical role in product discovery and merchant visibility.
Trust & Social Proof
Significant implementation of review-related markup:
- Product → review → Review (490,000).
- Product → aggregateRating → AggregateRating (201,000).
- Review → reviewRating → Rating (110,000).
The substantial presence of review markup suggests social proof remains crucial for ecommerce conversion.
Enhanced Product Context
Rich product attribute implementation shows a focus on detailed product information:
- Product → brand → Brand (315,000).
- Product → additionalProperty → PropertyValue (253,000).
- Product → image → ImageObject (182,000).
- Offer → shippingDetails → OfferShippingDetails (151,000).
- Offer → priceSpecification → PriceSpecification (42,000).
- AggregateOffer → offers → Offer (69,000).
This layered approach to product attributes creates comprehensive product entities that support both search visibility and user decision-making.
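The layered product entity described above can be sketched as follows, combining the core offer, the seller, the brand, and an aggregate rating. Product details, prices, and the helper name `product_entity` are hypothetical, and the set of properties is a minimal selection rather than a complete merchant feed.

```python
import json


def product_entity(name, brand, price, currency, seller, rating_value, review_count):
    """Build a minimal schema.org Product with offer, seller, brand, and rating."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",  # serialized with two decimals
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",
            "seller": {"@type": "Organization", "name": seller},
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }


product = product_entity(
    name="Example Water Bottle",
    brand="Example Brand",
    price=19.99,
    currency="USD",
    seller="Example Store",
    rating_value=4.6,
    review_count=128,
)
print(json.dumps(product, indent=2))
```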
Future Outlook
The role of structured data is expanding beyond its traditional function as an SEO tool for powering rich snippets and specific search features. In the age of AI discovery, structured data is becoming a critical enabler for machine understanding, transforming how content is interpreted and connected across the web. This shift is driving the industry to think beyond Google-centric optimization, embracing structured data as a core component of a semantic and AI-integrated web.
Structured data provides the scaffolding for creating interconnected, machine-readable frameworks, which are vital for emerging AI applications such as conversational search, knowledge graphs, and retrieval-augmented generation (RAG) systems, including graph-based variants (GraphRAG). This evolution calls for a dual approach: leveraging actionable schema types for immediate SEO benefits (rich results) while investing in comprehensive, descriptive schemas that build a broader data ecosystem.
The future lies in the intersection of structured data, semantic modeling, and AI-driven content discovery systems. By adopting a more holistic view, organizations can move from using structured data as a tactical SEO addition to positioning it as a strategic layer for powering AI interactions and ensuring findability across diverse platforms.
Credits And Acknowledgements
This analysis wouldn’t be possible without the dedicated work of the HTTP Archive team and Web Almanac contributors.
The complete Web Almanac Structured Data chapter offers even deeper insights into the evolving landscape of structured data implementation.
As we move toward an AI-powered future, the strategic importance of structured data will continue to grow.
Featured Image: Koto Amatsukami/Shutterstock