Skip to content

The Blueprint: Hybrid GraphRAG Ingestion and Retrieval

The Blueprint: Hybrid GraphRAG Ingestion and Retrieval

[INGESTION]
[Raw Data (S3)] ──> [Lambda Orchestrator] ──> [Amazon Bedrock (Entity Extraction)]
┌────────────────────────┴────────────────────────┐
▼ ▼
[Amazon OpenSearch (Vectors)] [Amazon Neptune (Graph DB)]
[RETRIEVAL]
[User Query] ──> [API Gateway] ──> [Lambda Resolver] ──> [Query OpenSearch + Neptune]
▼ (Context Compactor)
[Amazon Bedrock (LLM)] ──> [User Response]

Phase 1: The Ingestion Pipeline (Building the Graph)

  1. The Storage Layer (Amazon S3):
  • Unstructured enterprise data (PDFs, wiki pages, markdown files) is uploaded to an Amazon S3 bucket.
    • An S3 Event Notification triggers an asynchronous processing pipeline.
  1. The Processing Layer (AWS Lambda / AWS Glue):
  • An AWS Lambda function (or an AWS Glue job for massive data volumes) chunks the documents.
    • Instead of just saving the raw text chunks, Lambda calls Amazon Bedrock using a lightweight, fast model to perform Named Entity Recognition (NER) and relationship mapping.
  1. The Hybrid Storage Layer (OpenSearch + Neptune):
  • The Text Vector: The Lambda function generates text embeddings and stores the chunks in Amazon OpenSearch Service for semantic keyword similarity.
    • The Entity Graph: Simultaneously, the extracted entities (e.g., Product X) and relationships (e.g., DEPENDS_ON) are written as nodes and edges into Amazon Neptune (AWS’s managed graph database).

  1. The Ingress Layer (API Gateway):
  • The user asks a complex question (e.g., “If I upgrade System A, what downstream services are impacted?”) via Amazon API Gateway.
  1. The Hybrid Resolver (AWS Lambda):
  • API Gateway invokes a central Lambda Resolver function.
    • Instead of performing a massive keyword search that returns pages of text, the Lambda runs a hybrid query:
    • It hits Amazon OpenSearch for basic semantic context.
      • It queries Amazon Neptune using a graph query language (like openCypher or Gremlin) to pull the exact dependency path.
    1. The Context Compactor (The Token Saver):
  • The Lambda function combines the small, highly targeted vector text chunk with the precise structural relationships from Neptune.
    • It strips out all irrelevant conversational text, assembling a highly dense, ultra-compact context payload.
  1. The Generation Layer (Amazon Bedrock):
  • The compacted context is sent to a frontier model in Amazon Bedrock.
    • Because the context is highly refined, the model returns a perfectly accurate, hallucination-free answer while consuming up to 70% fewer tokens than a traditional vector-only setup.