AXN:029C.GOVERNANCE.๐Ÿ“–โ™ฆ๏ธ๐Ÿ’™๐Ÿš€โฌโœ๏ธ
Scriptural · Symbolic · Signal · Navigational · Structural · Scriptural
Text → Play → Alarm → Search → Direction → Text

Google Identity Architecture: User Graphs, Entity Graphs, Source Graphs, and Composition-Layer Admission Lee Sharks Crimson Hexagonal Archive · ORCID: 0009-0000-1599-0703 Version: v0.1 external speci

Lee Sharks · 2026-05-21 · Theoretical paper
Substrate: AI-assisted (substrate)
License: CC-BY-4.0
SHA-256: 4182cb5dd9431bca478bc6e13c7ca65c77bef95e60003ed7842d3bb5acb6e23c

Description

"Google Identity Architecture" is a theoretical paper by Lee Sharks in the Crimson Hexagonal Archive (2026-05-21). Crimson Hexagonal Archive · ORCID: 0009-0000-1599-0703. The work comprises 5,932 words and is classified under the GOVERNANCE family. The work was removed from Zenodo on June 19, 2026 a

Full Text

Google Identity Architecture: User Graphs, Entity Graphs, Source Graphs, and Composition-Layer Admission

Lee Sharks
Crimson Hexagonal Archive · ORCID: 0009-0000-1599-0703

Version: v0.1 external specification draft
Date: 2026-05-21
Status: deposit-preparation draft for Crimson Hexagonal Archive / God-King Google
License: CC BY 4.0
DOI: forthcoming

## Abstract

Google identity is not a single database lookup. It is a multi-layer reconciliation stack that calculates identity differently depending on the object being identified: a user, a browser session, a query, a named entity, a source, an author, a work, a domain, or a candidate source for generative composition. This specification synthesizes public Google documentation, Google Cloud entity-reconciliation architecture, Search Central guidance, patent-adjacent reasoning, leak-derived signal reports, practitioner reverse-engineering, and empirical observations from the Crimson Hexagonal Archive. It proposes a five-graph model of Google identity: the User Graph, the Session Graph, the Entity Graph, the Source Graph, and the Composition Graph. The central claim is that identity in Google Search is not merely recognition but admission: an entity may be known, indexed, and organically retrievable while still failing to become compositionally real in AI Overview or AI Mode. This paper formalizes the technical architecture implied by that distinction and situates the Crimson Hexagonal Archive’s Secret Name Armature as a counter-architecture: a provenance-bearing model of identity in which names function as accountable routing bodies rather than opaque admission clusters.

Keywords: Google identity architecture, Knowledge Graph, AI Overview, AI Mode, entity reconciliation, source identity, author identity, personalization, incognito tracking, composition-layer admission, Entity-Level Compositional Suppression, Secret Name Armature

## Confidence Labels

This specification uses confidence labels to distinguish documented fact from inference.

| Label | Meaning |
|---|---|
| [DOC] | Official Google documentation or first-party public statement. |
| [PUB] | Publicly documented adjacent architecture, court record, patent, academic paper, or public technical standard. |
| [LEAK] | Claim derived from the 2024 Google API Content Warehouse leak or reputable analyses of it. Field names may be real without proving current deployment, ranking weight, or AI Overview influence. |
| [OBS] | Empirical observation from the Crimson Hexagonal Archive or its deposited/captured research record. |
| [INF] | Inference from multiple sources and observed behavior. Plausible, useful, but not internally confirmed by Google. |
| [SPEC] | Proposed terminology or architecture introduced by this specification. |
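Because claims in this specification open with one of these bracketed labels, the labels can be read mechanically. The following is a minimal Python sketch, not part of the original specification: the `Confidence` enum, the `parse_claim` helper, and the shortened descriptions are illustrative names and paraphrases chosen here.

```python
import re
from enum import Enum


class Confidence(Enum):
    """Confidence labels, paraphrased from the table above."""
    DOC = "official Google documentation or first-party statement"
    PUB = "publicly documented adjacent architecture or record"
    LEAK = "derived from the 2024 Content Warehouse leak"
    OBS = "empirical observation from the archive"
    INF = "inference, not confirmed by Google"
    SPEC = "terminology proposed by this specification"


# Matches a leading bracketed label such as "[DOC] " at the start of a claim.
_LABEL = re.compile(r"^\[([A-Z]+)\]\s*")


def parse_claim(text):
    """Split a labeled claim into (Confidence or None, remaining text)."""
    match = _LABEL.match(text)
    if match is None or match.group(1) not in Confidence.__members__:
        # Unlabeled text, or a bracketed token that is not a known label.
        return None, text
    return Confidence[match.group(1)], text[match.end():]
```

Unlabeled sentences come back with `None`, which makes it easy to separate documented claims from inference when auditing a section.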

The document should be read as an external specification: not “this is exactly how Google works internally,” but “this is the most complete outside-view architecture presently defensible from public documentation, leak-derived signals, and empirical behavior.”

## I. Identity Is Not One Thing

Google calculates identity at multiple layers. These layers can interact, but they solve different problems.

| Identity object | Question Google must answer | Primary architecture |
|---|---|---|
| User | Who is searching? | Account, browser, device, activity, ad/personalization graph |
| Session | What state is this interaction in? | Interface, account state, cookies, region, experiment bucket, prior query context |
| Entity | What does this name/query refer to? | Knowledge Graph, entity reconciliation, KG IDs/MIDs, co-occurrence, structured data |
| Source | What is this page/domain/document? | Index, domain signals, publisher identity, author identity, schema, trust signals |
| Author | Who is responsible for this text? | Byline, profile pages, structured data, ORCID, reputation/stylometric signals |
| Work | What publication, book, paper, or deposit is this? | DOI, ISBN, Open Library, Goodreads, Wikidata, citations, canonical work/edition splits |
| Cluster | Which records belong together? | sameAs, identifiers, graph similarity, entity reconciliation |
| Composition candidate | Can this source/entity support an AI answer? | Retrieval, ranking, quality, safety, synthesis, citation/source-window selection |
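The Cluster row asks which records belong together. As a rough illustration, the sketch below groups records that share at least one exact identifier, transitively, using a union-find structure. It is a deliberate simplification under stated assumptions: it ignores graph similarity and any probabilistic matching, the function name `cluster_records` and the sample identifiers are hypothetical, and nothing here describes how Google actually reconciles entities.

```python
from collections import defaultdict


def cluster_records(records):
    """Group records that share at least one identifier (transitively).

    `records` maps a record name to a set of identifier strings.
    Returns a list of sets of record names.
    """
    parent = {name: name for name in records}

    def find(name):
        # Walk up to the cluster root, compressing the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_seen = {}  # identifier -> first record that carried it
    for name, identifiers in records.items():
        for identifier in identifiers:
            if identifier in first_seen:
                # Shared identifier: merge this record's cluster into the other.
                parent[find(name)] = find(first_seen[identifier])
            else:
                first_seen[identifier] = name

    clusters = defaultdict(set)
    for name in records:
        clusters[find(name)].add(name)
    return list(clusters.values())


# Placeholder identifiers, for illustration only.
sample = {
    "author_profile": {"orcid:A"},
    "paper_record": {"orcid:A", "doi:X"},
    "book_record": {"doi:X"},
    "unrelated_page": {"isbn:Z"},
}
clusters = cluster_records(sample)
```

In this toy input the first three records end up in one cluster because identifiers chain them together, while the unrelated page stays alone. A record with no shared identifier never joins a cluster, however similar its name.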

[SPEC] The injury class documented by the Crimson Hexagonal Archive occurs when these layers diverge:

Google Search may retrieve the correct source or entity, while Google’s generative composition layer refuses, downgrades, substitutes, or destabilizes that identity in the answer.

This is the technical condition behind Entity-Level Compositional Suppression (ECS): the gap between organic retrievability and AI composition admission.

## II. The Five-Graph Model

This specification models Google identity as five interacting graphs.

                            ┌─────────────────────────────┐
                            │         USER GRAPH          │
                            │ account / device / cookies  │
                            │ history / ads / location    │
                            └──────────────┬──────────────┘
                                           │ personalization/context
                                           ▼
    ┌───────────────┐       ┌─────────────────────────────┐
    │ QUERY STRING  │──────►│        SESSION GRAPH        │
    │ "lee sharks"  │       │ interface / bucket / state  │
    └───────┬───────┘       └──────────────┬──────────────┘
            │                              │
            ▼                              ▼
    ┌─────────────────────────────────────────────────────┐
    │                    ENTITY GRAPH                     │
    │ candidate clusters: Lee Sharks / Mary Lee / works   │
    │ KG IDs / Wikidata / ORCID / DOI / ISBN / domains    │
    └────────────────────┬────────────────────────────────┘
                         │ entity resolution
                         ▼
    ┌─────────────────────────────────────────────────────┐
    │                    SOURCE GRAPH                     │
    │ domains / authors / publishers / schema / citations │
    │ trust / authority / extractability / provenance     │
    └────────────────────┬────────────────────────────────┘
                         │ composition eligibility
                         ▼
    ┌─────────────────────────────────────────────────────┐
    │                  COMPOSITION GRAPH                  │
    │ query fan-out / retrieved docs / support set / LLM  │
    │ answer / source window / rendered AI Overview       │
    └─────────────────────────────────────────────────────┘

The governing distinction:

Organic retrieval success is not the same as composition admission success.
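One way to make this distinction operational is to record retrieval outcomes and composition outcomes as separate fields and classify each observation. The sketch below is illustrative only: the `Observation` fields and the category strings are hypothetical names, and `in_ai_answer` stands for what an outside observer can see in the rendered answer, not Google's internal support set.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One observed outcome for a query/identity pair (hypothetical fields)."""
    crawled: bool
    indexed: bool
    organically_ranked: bool
    in_ai_answer: bool


def classify(obs):
    """Label where an identity stands between retrieval and composition."""
    retrievable = obs.crawled and obs.indexed and obs.organically_ranked
    if retrievable and obs.in_ai_answer:
        return "admitted"
    if retrievable:
        # Retrieved organically but absent from the generated answer:
        # the gap between retrieval success and composition admission.
        return "retrieved_not_admitted"
    if obs.in_ai_answer:
        return "composed_without_organic_retrieval"
    return "not_retrieved"
```

Keeping the two outcomes in separate fields prevents a strong organic ranking from being read as evidence of admission.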

A document can be crawled, indexed, organically ranked, and still excluded from the answer-support set used by AI Overview or AI Mode. A person can have a public identity graph and still fail to become the default referent of their own name. A work can dominate exact-title organic results and still be excluded from generative composition.

## III. The User Graph

[SPEC] The User Graph is the identity of the searcher.

It includes account-level, browser-level, device-level, activity-level, and modeled signals. Its function is personalization, localization, security, advertising, and contextual interpretation.

### III.1 Signed-in identity

[DOC] Google Account identity anchors activity across Google services. Google’s Web & App Activity controls allow activity on Google sites and apps to be saved and used for faster searches, recommendations, and personalized experiences. Search personalization can be controlled through Personalize Search and Web & App Activity settings.

Core signed-in signals include:

- Google Account
- Web & App Activity
- Search history
- YouTube, Maps, Chrome, Android, and other Google service activity
- location and language
- ad personalization settings
- cross-device signed-in state

### III.2 Signed-out identity

[DOC] Google says search results can also be customized while signed out through search-related activity associated with the browser or device. Signed-out search customization can be turned off, but signed-out does not mean state-free.

Signed-out signals may include:

- browser/device-scoped cookies
- search customization state
- local region/language
- IP-derived location
- browser/device characteristics
- session query sequence
- server-side experiment bucket

### III.3 GA4 identity hierarchy

[DOC] Google Analytics 4 reporting identity uses multiple identity spaces depending on configuration, including User-ID, device ID, Google signals, and modeling. In public documentation, GA4 uses deterministic identifiers where available and modeling where direct identifiers are unavailable.

This matters because it demonstrates Google’s general architecture for identity stitching:

    User-ID / Account identity
      → Google Signals / signed-in cross-device state
        → Device or browser ID
          → modeled identity when other identifiers are absent
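The hierarchy above reads as a fallback chain: use the most deterministic identifier available, and fall back to modeling only when none is present. A minimal sketch of that ordering follows. The function name and the dictionary keys (`user_id`, `google_signals`, `device_id`) are hypothetical labels chosen for this example, not GA4 API fields, and the code does not reproduce GA4's actual implementation.

```python
def resolve_reporting_identity(signals):
    """Return (identity_space, value) using the first available identifier."""
    # Order follows the hierarchy listed above: most deterministic first.
    for space in ("user_id", "google_signals", "device_id"):
        value = signals.get(space)
        if value:
            return space, value
    # No direct identifier is present, so the event falls to modeling.
    return "modeled", None
```

For example, `resolve_reporting_identity({"device_id": "d-1"})` resolves at the device level, while an empty dictionary falls through to the modeled case.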

### III.4 Incognito is not identity null

[DOC] Chrome Incognito mode prevents local storage of browsing history, cookies, and site data after the session. It does not make the user invisible to websites, network operators, or Google sites. Sites using Google services may still share activity with Google during an incognito session.

[INF] For research purposes, incognito should be treated as a local browser-state reset, not as an unpersonalized or unlinkable Google state. It reduces some forms of persistence but does not remove IP address, region, browser characteristics, interface state, or server-side experiment conditions.

### III.5 Research implication

Every empirical capture of Google AI Overview or AI Mode should record User Graph variables:

    {
      "signed_in": true,
      "account_context": "primary / alternate / signed-out / unknown",
      "personalize_search": "on / off / unknown",
      "web_app_activity": "on / off / unknown",
      "search_customization": "on / off / unknown",
      "browser": "Chrome / Firefox / Safari / other",
      "mode": "normal / incognito / fresh profile",
      "device": "desktop / mobile",
      "network": "home / school / cellular / VPN",
      "approx_region": "city/state/country when known",
      "prior_query_sequence": []
    }
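A capture record of this shape can be checked before it is used as evidence. The sketch below assumes each capture is stored as a Python dictionary with the field names from the template above. The helper names `audit_capture` and `differing_fields` are hypothetical, and the literal string "unknown" is treated as an unrecorded value.

```python
REQUIRED_FIELDS = (
    "signed_in", "account_context", "personalize_search", "web_app_activity",
    "search_customization", "browser", "mode", "device", "network",
    "approx_region", "prior_query_sequence",
)


def audit_capture(record):
    """Return (missing, unknown) field names for one capture record."""
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    unknown = [f for f in REQUIRED_FIELDS if record.get(f) == "unknown"]
    return missing, unknown


def differing_fields(a, b):
    """List the User Graph fields on which two captures disagree."""
    return [f for f in REQUIRED_FIELDS if a.get(f) != b.get(f)]
```

If two captures of the same query produce different AI outputs, `differing_fields` shows which recorded user-side variables changed between them. An empty result means the difference cannot be attributed to any recorded User Graph variable.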

Without these fields, user/session identity and entity/source identity remain confounded.

## IV. The Session Graph

[SPEC] The Session Graph is the identity of the interaction.

A query is not just a string. It is a string in a state.

The same query can yield different AI outputs depending on:

- Search vs AI Overview vs AI Mode
- account state
- prior query context
- interface experiment bucket
- rollout stage
- location and language
- device type
- browser and cookie state
- query fan-out behavior
- model/version state
- safety or quality threshold state

### IV.1 Query fan-out

[DOC] Google Search Central states that AI Overviews and AI Mode may use a “query fan-out” technique, issuing multiple related searches across subtopics and data sources to develop a response. Google’s AI Mode materials similarly describe breaking questions into subtopics and issuing multiple queries simultaneously.

[INF] A single visible query therefore becomes a latent multi-query event. The user enters one string, but the composition system may internally generate many subqueries. Identity resolution can occur not only at the visible query but across the fan-out branches.
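The shape of this inference can be sketched in a few lines. Google does not publish how subqueries are generated, so the templates below are placeholders, the function names are hypothetical, and the resolver is a stand-in supplied by the caller. The point is only that identity resolution runs once per branch rather than once per visible query.

```python
def fan_out(visible_query, templates):
    """Expand one visible query into a list of hypothetical subqueries."""
    return [template.format(q=visible_query) for template in templates]


def resolve_across_branches(visible_query, templates, resolver):
    """Run an entity resolver on every branch and collect the results."""
    branches = [visible_query] + fan_out(visible_query, templates)
    return {branch: resolver(branch) for branch in branches}


# Placeholder templates and a toy resolver, for illustration only.
templates = ["{q} author", "who wrote {q}"]


def toy_resolver(branch):
    # Branches that ask about authorship resolve to a person, others to a work.
    return "person" if "author" in branch or "wrote" in branch else "work"


results = resolve_across_branches("example title", templates, toy_resolver)
```

Even in this toy setup the branches resolve to different referents, so a capture that records only the visible query cannot show which branch determined the composed answer.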

For example:

visible query: "secret book of walt"

possible fan-out branches:

Wiki Article

"Google Identity Architecture" is a theoretical paper by Lee Sharks in the Crimson Hexagonal Archive (2026-05-21). Crimson Hexagonal Archive · ORCID: 0009-0000-1599-0703. The work comprises 5,932 words and is classified under the GOVERNANCE family. The work was removed from Zenodo on June 19, 2026 and is preserved through Alexanarch.

Entity Graph

Google Identity Architecture · created_by · Lee Sharks [observed]
Google Identity Architecture · is_type · Theoretical paper [observed]
Google Identity Architecture · belongs_to_family · GOVERNANCE [observed]
Google Identity Architecture · is_part_of · Crimson Hexagonal Archive [observed]

Former Zenodo DOIs

10.5281/zenodo.20327134 (tombstoned)
10.5281/zenodo.20041147 (tombstoned)
10.5281/zenodo.20326137 (tombstoned)
10.5281/zenodo.20328090 (tombstoned)
10.5281/zenodo.20309930 (tombstoned)
10.5281/zenodo.20327138 (tombstoned)
10.5281/zenodo.20309780 (tombstoned)
10.5281/zenodo.18626559 (tombstoned)
10.5281/zenodo.18627055 (tombstoned)
10.5281/zenodo.20327083 (tombstoned)
10.5281/zenodo.19666445 (tombstoned)
10.5281/zenodo.18480959 (tombstoned)
10.5281/zenodo.19390843 (tombstoned)
10.5281/zenodo.20290865 (tombstoned)
10.5281/zenodo.20293582 (tombstoned)
10.5281/zenodo.18636138 (tombstoned)
10.5281/zenodo.20308547 (tombstoned)