Abstract store¶
Overview¶
The abstract store is the interface that all database adapters share. It provides a unified API for persisting and retrieving Node objects across different backend implementations (SQL, NoSQL, graph databases, etc.).
Design Notes¶
Interaction Pattern¶
The AbstractStore follows these interaction patterns:
- Storage Pattern:
  - Check if node exists (by id)
  - Update if exists, insert if new
  - Maintain idempotency
- Retrieval Pattern:
  - String filter: Return direct matches preserving multiplicity
  - None filter: Return canonical, deduplicated node set
  - Stable ordering for identical queries
- Deletion Pattern:
  - Single node removal with safety checks
  - Must fail on ambiguous matches
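The three patterns above can be sketched with a dict-backed store. This is only an illustration: `InMemoryStore` and the reduced `Node` stand-in are hypothetical names, not part of the library, and matching a string filter against the node id is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Node:
    """Simplified stand-in for the documented Node (id and payload only)."""
    id: str
    payload_data: Mapping[str, Any] = field(default_factory=dict)


class InMemoryStore:
    """Hypothetical dict-backed store following the three patterns."""

    def __init__(self) -> None:
        self._nodes: dict[str, Node] = {}

    def store_node(self, node: Node) -> None:
        # Keyed by id: an existing entry is overwritten, a new one inserted,
        # so storing the same node twice leaves a single entry.
        self._nodes[node.id] = node

    def get_nodes(self, filter: Optional[str] = None) -> list[Node]:
        # Sorting by id gives a stable order for identical queries.
        nodes = sorted(self._nodes.values(), key=lambda n: n.id)
        if filter is None:
            return nodes
        # Assumption: a string filter is compared against the node id.
        return [n for n in nodes if n.id == filter]

    def remove_node(self, filter: str) -> Node:
        matches = self.get_nodes(filter)
        if not matches:
            raise KeyError(filter)
        if len(matches) > 1:
            raise ValueError(f"ambiguous filter: {filter!r}")
        return self._nodes.pop(matches[0].id)


store = InMemoryStore()
store.store_node(Node("user:42", {"name": "Ada"}))
store.store_node(Node("user:42", {"name": "Ada L."}))  # update, not a duplicate
assert len(store.get_nodes()) == 1
```

Keying the dictionary by id is what makes the store call idempotent here; a real adapter would need an equivalent upsert in its backend.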
Docstring abstract node¶
node
¶
Classes:
- Node: Canonical storage-agnostic representation of a knowledge entity.
Attributes:
- EntityType: Logical category of a Node.
- KeyAttribute: Name of the primary human-readable identifier inside a Node payload.
- NodeId: Globally stable identifier of a knowledge entity.
- Payload: Structured attributes describing a Node.
- Relation: Edge description between two Nodes.
EntityType
module-attribute
¶
KeyAttribute
module-attribute
¶
NodeId
module-attribute
¶
Globally stable identifier of a knowledge entity.
Properties¶
- Uniquely identifies the same real-world entity across all systems
- Must be deterministic across ingestion runs
- Must not encode storage-specific information (database IDs, row numbers)
- Safe for use as foreign key in relations
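One way to satisfy these properties is to derive the identifier purely from source-level values. The sketch below is an assumption-laden illustration: `make_node_id` is a hypothetical helper, and modelling `NodeId` as a `str`-based `NewType` is a guess, since the page only states that it is a module attribute.

```python
from typing import NewType

# Assumption: NodeId behaves like a string; the real definition is not shown.
NodeId = NewType("NodeId", str)


def make_node_id(namespace: str, local_id: str) -> NodeId:
    """Hypothetical helper: build an id from source-level identifiers only."""
    if not namespace or not local_id:
        raise ValueError("namespace and local_id must be non-empty")
    # Normalizing the namespace keeps the id identical across ingestion runs;
    # no database row number or backend key is involved.
    return NodeId(f"{namespace.strip().lower()}:{local_id.strip()}")


print(make_node_id("user", "42"))          # user:42
print(make_node_id("DOI", "10.1000/182"))  # doi:10.1000/182
```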
Examples¶
- "user:42"
- "doi:10.1000/182"
- "sharepoint:file:abc123"
Payload
module-attribute
¶
Relation
module-attribute
¶
Edge description between two Nodes.
Minimum required keys¶
"type" : str
    Relationship type
"target" : NodeId
    Identifier of the related node
Optional keys¶
Any additional metadata describing the relation.
Constraints¶
- Must not encode storage-specific fields
- Must be JSON-serializable
- Duplicate relations should be treated as identical
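A store or ingestion step could enforce these constraints roughly as follows. `dedupe_relations` is a hypothetical helper, and treating two relations as identical when their JSON form (with sorted keys) is equal is one possible reading of "duplicate".

```python
import json
from typing import Any, Mapping, Sequence


def dedupe_relations(relations: Sequence[Mapping[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: validate relations and drop exact duplicates."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for rel in relations:
        missing = {"type", "target"} - set(rel)
        if missing:
            raise ValueError(f"relation is missing keys: {sorted(missing)}")
        # json.dumps raises TypeError for values that are not JSON-serializable;
        # sort_keys makes the fingerprint independent of key order.
        fingerprint = json.dumps(dict(rel), sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(dict(rel))
    return unique


relations = [
    {"type": "authored", "target": "doc:1"},
    {"target": "doc:1", "type": "authored"},  # same relation, different key order
    {"type": "cites", "target": "doc:2", "weight": 0.5},
]
print(len(dedupe_relations(relations)))  # 2
```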
Node
dataclass
¶
Node(id: NodeId, payload_data: Payload = dict(), relations: Sequence[Relation] = tuple(), entity_type: EntityType = EntityType('node'), key_attribute: KeyAttribute = KeyAttribute('id'))
Canonical storage-agnostic representation of a knowledge entity.
A Node is the normalized form of structured information extracted from external sources. All database adapters MUST translate their internal records into this structure before persistence or retrieval.
Identity¶
The node identity is defined exclusively by id.
Nodes with identical id MUST represent the same real-world entity.
Stores must overwrite existing nodes instead of creating duplicates.
Fields¶
id : NodeId
    Globally stable identifier of the entity. Must remain constant across synchronization runs and storage backends.
payload_data : Mapping[str, object]
    Structured attributes describing the entity (properties).
    Requirements:
    - JSON-serializable
    - Deterministic for identical source state
    - Order-independent
    - Safe to merge across updates
relations : Sequence[Relation]
    Outgoing relationships from this node to other nodes.
    Each relation mapping should minimally contain the "type" and "target" keys described under Relation.
    Constraints:
    - Must not contain cyclic self-references unless meaningful
    - Order does not carry semantic meaning
    - Duplicate relations should be ignored by stores
entity_type : EntityType
    Logical category of the entity (e.g., "person", "document", "concept"). Used for indexing, filtering and schema interpretation. Must remain stable for a given node id.
key_attribute : KeyAttribute
    Name of the primary human-readable identifier inside payload_data (e.g., "email", "title", "name").
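To make the fields concrete, the sketch below builds a node with a local stand-in dataclass that mirrors the documented constructor signature and defaults. The stand-in uses plain `str` and `Mapping` types as an assumption; the real class comes from the library and may differ in detail, and the payload values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Sequence


@dataclass(frozen=True)
class Node:
    """Local stand-in mirroring the documented Node fields and defaults."""
    id: str
    payload_data: Mapping[str, Any] = field(default_factory=dict)
    relations: Sequence[Mapping[str, Any]] = ()
    entity_type: str = "node"
    key_attribute: str = "id"


author = Node(
    id="user:42",
    payload_data={"email": "ada@example.org", "name": "Ada"},
    relations=[{"type": "authored", "target": "doi:10.1000/182"}],
    entity_type="person",
    key_attribute="email",
)

# key_attribute names the payload entry that serves as the human-readable label.
label = author.payload_data[author.key_attribute]
print(label)  # ada@example.org
```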
Docstring abstract store¶
abstract_store
¶
Classes:
- AbstractStore: Abstract persistence layer for storing and retrieving Nodes.
AbstractStore
¶
AbstractStore()
Abstract persistence layer for storing and retrieving Nodes.
A Store represents a backend capable of persisting structured nodes and retrieving them via identifier lookup, textual filtering, or vector similarity.
Typical implementations:
- SQL/NoSQL database
- Vector database (e.g., embeddings search)
- Graph database
- In-memory index
Consistency requirements¶
Implementations must ensure:
- Stable node identity across reads
- Deterministic retrieval for identical queries
- Idempotent storage: storing the same Node twice must not create duplicates
Methods:
- connect: Establish connection to the backend.
- get_nodes: Retrieve nodes from the store.
- remove_node: Remove a single node identified by filter.
- store_node: Persist a Node into the store.
Source code in src/database_builder_libs/models/abstract_store.py
connect
¶
connect(config: dict | None = None) -> None
Establish connection to the backend.
This method is idempotent. Calling it multiple times must be safe.
Parameters¶
config : dict | None
    Backend-specific configuration object.
Raises¶
ConnectionError
    Backend unreachable.
RuntimeError
    Backend misconfigured.
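An idempotent connect() can be as simple as a guard flag. The class below is a hypothetical sketch, not the library's implementation: `ExampleStore` and `open_calls` are invented names, and the configuration check is only a placeholder for backend-specific validation.

```python
from typing import Optional


class ExampleStore:
    """Hypothetical store showing an idempotent connect()."""

    def __init__(self) -> None:
        self._connected = False
        self.open_calls = 0  # counts how often a connection was really opened

    def connect(self, config: Optional[dict] = None) -> None:
        if self._connected:
            return  # already connected: repeated calls are a no-op
        if config is not None and not isinstance(config, dict):
            raise RuntimeError("backend misconfigured: config must be a dict")
        # A real backend would open its client here and raise ConnectionError
        # if the server cannot be reached.
        self.open_calls += 1
        self._connected = True


example = ExampleStore()
example.connect()
example.connect({"host": "localhost"})  # safe: second call does nothing
print(example.open_calls)  # 1
```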
get_nodes
abstractmethod
¶
Retrieve nodes from the store.
Retrieval Modes¶
filter is interpreted as:
- str → Selection query
  Returns nodes that directly match stored records. Multiple results representing different stored entities MUST be preserved (no merging/deduplication).
- None → Reconstruction query
  Returns the canonical set of nodes represented by the backend. Implementations MUST merge overlapping representations and return a normalized, duplicate-free set of Nodes.
Returns¶
List[Node]
    Deterministically ordered list of nodes.
Guarantees¶
- Stable ordering for identical queries if backend unchanged
- filter=None returns a duplicate-free canonical node set
- filter=str preserves multiplicity of stored entities
Raises¶
RuntimeError
    If called before connect().
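The difference between the two retrieval modes shows up when a backend holds several records for the same node. The following sketch assumes a signature of `get_nodes(filter=None)`, which this page does not spell out, and assumes that a string filter is compared against the node id. `RECORDS` and the reduced `Node` stand-in are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Node:
    """Simplified stand-in for the documented Node (id and payload only)."""
    id: str
    payload_data: Mapping[str, Any] = field(default_factory=dict)


# Hypothetical backend rows: the same node id may be stored more than once,
# for example one record per text chunk.
RECORDS = [
    Node("doc:1", {"title": "Intro"}),
    Node("doc:2", {"title": "Methods"}),
    Node("doc:1", {"pages": 12}),
]


def get_nodes(filter: Optional[str] = None) -> list[Node]:
    if filter is not None:
        # Selection query: every matching stored record is returned as-is.
        hits = [n for n in RECORDS if n.id == filter]
        return sorted(hits, key=lambda n: (n.id, sorted(n.payload_data)))
    # Reconstruction query: merge overlapping records into one node per id.
    merged: dict[str, dict[str, Any]] = {}
    for n in RECORDS:
        merged.setdefault(n.id, {}).update(n.payload_data)
    return [Node(node_id, merged[node_id]) for node_id in sorted(merged)]


print(len(get_nodes("doc:1")))  # 2
print(len(get_nodes()))         # 2
```

The string query returns both stored records for `doc:1`, while the `None` query returns one merged node per id; sorting in both branches keeps the order stable for repeated calls.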
remove_node
abstractmethod
¶
Remove a single node identified by filter.
Parameters¶
filter : str
    Unique identifier of the node to remove.
Returns¶
Node
    The removed node.
Behaviour¶
- Must remove exactly one node
- Must fail if zero or multiple nodes match
Raises¶
KeyError
    If no node matches the filter.
ValueError
    If multiple nodes match the filter.
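The exactly-one rule can be enforced by counting matches before deleting anything. `remove_exactly_one` is a hypothetical helper operating on a plain list of record dicts, which stands in for whatever lookup a real backend performs.

```python
def remove_exactly_one(records: list[dict], filter: str) -> dict:
    """Hypothetical helper: remove the single record whose "id" equals filter."""
    matches = [i for i, rec in enumerate(records) if rec["id"] == filter]
    if not matches:
        raise KeyError(filter)
    if len(matches) > 1:
        # Refuse to guess which record was meant; nothing is deleted.
        raise ValueError(f"{len(matches)} records match {filter!r}")
    return records.pop(matches[0])


rows = [{"id": "user:1"}, {"id": "user:2"}, {"id": "user:2"}]
removed = remove_exactly_one(rows, "user:1")
print(removed)    # {'id': 'user:1'}
print(len(rows))  # 2
```

Checking the match count first means a failed call leaves the data untouched, which is the point of the safety check.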
store_node
abstractmethod
¶
store_node(node: Node) -> None
Persist a Node into the store.
Behaviour¶
- If the node already exists (same unique identifier), it must be updated.
- Operation must be idempotent.
Parameters¶
node : Node
    The node to persist.
Raises¶
RuntimeError
    If called before connect().
ValueError
    If the node is invalid for this backend.
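The guards described above might look like this in an adapter. Everything here is a sketch under assumptions: `ValidatingStore` and the reduced `Node` stand-in are hypothetical, and using JSON-serializability of the payload as the validity check is just one plausible criterion for "invalid for this backend".

```python
import json
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class Node:
    """Simplified stand-in for the documented Node (id and payload only)."""
    id: str
    payload_data: Mapping[str, Any] = field(default_factory=dict)


class ValidatingStore:
    """Hypothetical store sketching the guards around store_node()."""

    def __init__(self) -> None:
        self._connected = False
        self._rows: dict[str, str] = {}  # node id -> serialized payload

    def connect(self) -> None:
        self._connected = True

    def store_node(self, node: Node) -> None:
        if not self._connected:
            raise RuntimeError("store_node() called before connect()")
        try:
            serialized = json.dumps(dict(node.payload_data), sort_keys=True)
        except TypeError as exc:
            raise ValueError(f"payload of {node.id!r} is not JSON-serializable") from exc
        # Assigning by id updates an existing row instead of adding a second one.
        self._rows[node.id] = serialized


validating = ValidatingStore()
validating.connect()
validating.store_node(Node("doc:1", {"title": "Intro"}))
validating.store_node(Node("doc:1", {"title": "Intro"}))  # idempotent repeat
print(len(validating._rows))  # 1
```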