
POMA PrimeCut is a document ingestion and chunking engine that preserves hierarchical document structure, eliminates context poisoning, and produces clean, semantically coherent chunks at unmatched accuracy per token.

Context Poisoning Starts at Ingestion
RAG Chunking Strategies Fail Without Hierarchical Ingestion

Engineers optimizing RAG systems spend most of their time tuning retrieval: adjusting similarity thresholds, re-ranking results, switching embedding models. These interventions treat symptoms. The underlying pathology is how documents are parsed and chunked upstream — a step most pipelines delegate to naive text splitters or PDF-to-text extractors that have no awareness of document semantics.

The result is a vector store filled with structurally corrupted, semantically diluted embeddings. No retrieval strategy fully recovers from that.

Failure Mode 01

Context Poisoning

Chunks containing content from multiple unrelated sections produce embeddings that represent neither topic accurately.

The consequence: The LLM receives contradictory context. Hallucination rates rise.
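A toy illustration of this effect, using bag-of-words vectors as a stand-in for a real embedding model (an assumption made only so the example runs without dependencies; the sample texts and the query are invented). A chunk that merges two unrelated sections scores lower against a query than a chunk holding only the relevant section.

```python
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    """Bag-of-words vector: a crude stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two unrelated sections (invented sample text).
security = "device security objectives include authenticity authorization availability confidentiality"
billing = "invoices are issued monthly and payment is due within thirty days of receipt"

clean_chunk = security                      # one section only
poisoned_chunk = security + " " + billing   # a splitter merged two sections

query = bow("which security objectives apply to the device")
clean_score = cosine(query, bow(clean_chunk))
poisoned_score = cosine(query, bow(poisoned_chunk))

print(f"clean chunk:    {clean_score:.3f}")
print(f"poisoned chunk: {poisoned_score:.3f}")
```

Real embedding models behave less mechanically than term counts, but the direction of the effect is the same: unrelated content in a chunk pulls its vector away from the topic a user is asking about.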

Failure Mode 02

Structural Signal Loss

Document hierarchy — headings, tables, section boundaries — is discarded before chunking begins.

The consequence: Hierarchical queries fail. The vector store cannot distinguish a section heading from a footnote.

Failure Mode 03

Boundary Blindness

Fixed-size chunking severs semantic units at arbitrary character or token limits.

The consequence: High-relevance content becomes unfindable even when it exists in the corpus.
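To make the boundary problem concrete, here is a minimal sketch of a naive fixed-size splitter (a generic illustration, not any particular library's implementation; the sample text and the 60-character limit are arbitrary choices). The phrase "QS regulation" exists in the document, yet no single chunk contains it, because the split falls in the middle of the phrase.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text every `size` characters, ignoring all structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = (
    "A. General Principles\n"
    "An SPDF is one approach to meet the QS regulation.\n"
    "B. Designing for Security\n"
    "FDA intends to assess device cybersecurity based on several factors.\n"
)

chunks = fixed_size_chunks(doc, 60)
for n, chunk in enumerate(chunks):
    print(f"--- chunk {n} ---")
    print(chunk)

# The phrase is in the corpus, but no chunk holds it intact.
phrase = "QS regulation"
found_in_doc = phrase in doc
found_in_chunks = any(phrase in c for c in chunks)
print(f"in document: {found_in_doc}, in any chunk: {found_in_chunks}")
```

The same run also shows the other two failure modes in miniature: the middle chunk mixes the tail of section A with the heading of section B, and that heading is cut off from most of the content it introduces.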

These are not edge cases. They are the default behavior of general-purpose text processing pipelines applied to structured documents. Fixing retrieval without fixing ingestion is rearranging deck chairs. PrimeCut addresses all three failure modes at the source.

How Hierarchical Ingestion and Chunking Works: Structure to RAG-Ready Chunks
How POMA PrimeCut Sees Your Document Hierarchy

Every document carries an internal logic: a hierarchy of headings, sub-sections, tables, lists, and supporting elements that define what content belongs together and why. That structure is not decoration — it is the semantic map of the document.

Standard ingestion pipelines discard this map. They extract raw text and hand it to a chunker that has no knowledge of where one idea ends and another begins.

PrimeCut understands your document’s content hierarchy before chunking — preserving structural relationships, eliminating context poisoning, and producing semantically coherent chunksets that make every downstream RAG component more accurate by default.
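The general idea of structure-aware chunking can be sketched in a few lines. This is a simplified illustration under stated assumptions, not PrimeCut's algorithm: it assumes the input already has markdown-style `#` headings, and the `Chunk` type and `hierarchical_chunks` function are hypothetical names. Each chunk ends at a section boundary and carries the headings above it.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    path: list[str]   # ancestor headings, outermost first
    text: str         # body text belonging to the innermost heading

def hierarchical_chunks(markdown: str) -> list[Chunk]:
    """Emit one chunk per section body, tagged with its heading path."""
    stack: list[tuple[int, str]] = []   # (level, title) of open headings
    chunks: list[Chunk] = []
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in stack], text))
        body.clear()

    for line in markdown.splitlines():
        if line.startswith("#"):
            flush()  # close the previous section before opening a new one
            level = len(line) - len(line.lstrip("#"))
            title = line.lstrip("#").strip()
            # Drop siblings and deeper headings; keep true ancestors only.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, title))
        else:
            body.append(line)
    flush()
    return chunks

doc = """# Cybersecurity Guidance
## B. Designing for Security
FDA assesses device cybersecurity based on several factors.
## C. Transparency
Users need access to information about cybersecurity controls.
"""

chunks = hierarchical_chunks(doc)
for chunk in chunks:
    print(" > ".join(chunk.path), "|", chunk.text)
```

Because a chunk never crosses a heading, no chunk mixes two sections, and the heading path travels with the text instead of being discarded.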

Text, Chart & Table — One Document, Fully Resolved

Figure: MSCI World Index (USD) Factsheet, Sep 2025, chunksets 0–5 of 43. Legend: shared root, shared hierarchy, leaf (unique to one chunkset). Content types: text, image description, table data.

What POMA PrimeCut Does Differently
POMA PrimeCut vs Unstructured.io vs Conventional Chunking: Hierarchical Chunking Compared

Conventional Chunk

an SPDF is one approach to help ensure that the QS regulation is met. Because of its benefits in helping comply with the QS regulation and
cybersecurity, FDA encourages manufacturers to use an SPDF, but other approaches might also satisfy the QS regulation.
### B. Designing for Security
When reviewing premarket submissions, FDA intends to assess device cybersecurity based on a number of factors, including, but not limited to, the
device's ability to provide and implement the security objectives below throughout the device architecture. The security objectives below generally
may apply broadly to devices within the scope of this guidance, including, but not limited to, devices containing artificial intelligence (AI) and
cloud-based services.
Security Objectives:
• Authenticity, which includes integrity;
• Authorization:
• Availability:
• Confidentiality; and
• Secure and timely updatability and patchability.
Premarket submissions should include information that describes how the above security objectives are addressed by and integrated into the device
design. The extent to which security requirements, architecture, supply chain, and implementation are needed to meet these objectives will depend on
but may not be limited to:
- The device’s intended use, indications for use, and reasonably foreseeable misuse;
- The presence and functionality of its electronic data interfaces;
• Its intended and actual environment of use:
- The risks presented by cybersecurity vulnerabilities;
- The exploitability of the vulnerabilities; and
- The risk of patient harm due to vulnerability exploitation.
SPDF processes aim to reduce the number and severity of vulnerabilities and thereby reduce the exploitability of a medical device system and the
associated risk of patient harm. Because exploitation of known vulnerabilities or weak cybersecurity controls should be considered reasonably
foreseeable failure modes for medical device systems, these factors should be addressed in the device design. $ ^{19} $ One of the key benefits of
using an SPDF is that a medical device system is more likely to be secure by design, such that the device is designed from the outset to be secure
within its system and/or network of use throughout the device lifecycle.
### C. Transparency
A lack of cybersecurity information, such as information necessary to integrate the device into the use environment, as well as information needed
by users to maintain the medical device system’s cybersecurity over the device lifecycle, has the potential to affect the safety and effectiveness
of a device. In order to address these concerns, it is important for device users to
## Contains Nonbinding Recommendations
have access to information pertaining to the device’s cybersecurity controls, potential risks to the medical device system, and other relevant
  • very long
  • spanning multiple sections
  • isolating heading from subsequent content

Unstructured.io Chunk

• The device’s intended use, indications for use, and reasonably foreseeable misuse;
• The presence and functionality of its electronic data interfaces;
• Its intended and actual environment of use; 18
• The risks presented by cybersecurity vulnerabilities;
• The exploitability of the vulnerabilities; and
• The risk of patient harm due to vulnerability exploitation.
  • No section indication
  • Random artifacts as part of apparent main text (“18”)
  • No context/positioning within the document

POMA Chunk(Set): Full Context Path

Cybersecurity Guidance for Medical Devices: Quality Systems and Premarket Submission
Requirements
	[…]
	Guidance for Industry and Food and Drug Administration Staff
		[…]
		Contains Nonbinding Recommendations outline
			[…]
			B. Designing for Security
			When reviewing premarket submissions, FDA intends to assess device cybersecurity
			based on a number of factors, including, but not limited to, the device's ability
			to provide and implement the security objectives below throughout the device
			architecture.
				[…]
				The extent to which security requirements, architecture, supply chain, and
				implementation are needed to meet these objectives will depend on but may not
				be limited to:
					[…]
					• Its intended and actual environment of use:
						[…]
						• The risk of patient harm due to vulnerability exploitation.
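A context path like the one above can be produced from any outline in which each line has a depth. The sketch below is one possible way to do it, not PrimeCut's implementation: `ancestors` and `render` are hypothetical helper names, the outline is an abbreviated version of the sample document, and "[…]" marks skipped content at the indentation of the next kept line.

```python
Outline = list[tuple[int, str]]   # (depth, text) in document order

def ancestors(lines: Outline, idx: int) -> list[int]:
    """Indices of the line at `idx` and all of its ancestors, top first."""
    chain = [idx]
    depth = lines[idx][0]
    for j in range(idx - 1, -1, -1):
        if lines[j][0] < depth:        # nearest shallower line is the parent
            chain.append(j)
            depth = lines[j][0]
    return chain[::-1]

def render(lines: Outline, leaf_indices: list[int]) -> str:
    """Render the selected leaves with their ancestor path, eliding the rest."""
    keep = sorted({j for i in leaf_indices for j in ancestors(lines, i)})
    out: list[str] = []
    prev = -1
    for j in keep:
        depth, text = lines[j]
        if j - prev > 1:               # something was skipped in between
            out.append("\t" * depth + "[…]")
        out.append("\t" * depth + text)
        prev = j
    return "\n".join(out)

outline: Outline = [
    (0, "Cybersecurity Guidance for Medical Devices"),
    (1, "A. General Principles"),
    (2, "An SPDF is one approach to meet the QS regulation."),
    (1, "B. Designing for Security"),
    (2, "Security Objectives:"),
    (3, "• Authenticity, which includes integrity;"),
    (2, "The extent to which requirements are needed will depend on:"),
    (3, "• The device's intended use;"),
    (3, "• Its intended and actual environment of use;"),
    (3, "• The risks presented by cybersecurity vulnerabilities;"),
]

rendered = render(outline, [8])
print(rendered)
```

The rendered text keeps every heading needed to locate the bullet in the document and nothing from unrelated sections.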
It just works

POMA-OfficeQA Benchmark: RAG Chunking Performance Results
PrimeCut Uses 77% Fewer Tokens than Conventional Chunking, Without Losing Context Accuracy

Most RAG benchmarks measure whether the right chunks were retrieved — not whether the resulting context is actually useful to the LLM. Token waste, attention gaps, and context poisoning go unmeasured.

POMA-OfficeQA asked a different question: how many tokens do I need to achieve 100% context recall?
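The benchmark's exact methodology is not described here, so the following is only one plausible reading of that question: walk the retrieved chunks in rank order and count tokens until every required piece of evidence is covered. The function name, the whitespace-based token count, and the sample data are all assumptions.

```python
from typing import Optional

def tokens_to_full_recall(ranked_chunks: list[str], evidence: list[str]) -> Optional[int]:
    """Tokens consumed, in rank order, until every evidence string is covered.

    Returns None if the ranked chunks never cover all evidence.
    Tokens are approximated by whitespace splitting.
    """
    missing = set(evidence)
    used = 0
    for chunk in ranked_chunks:
        if not missing:
            break
        used += len(chunk.split())
        missing = {e for e in missing if e not in chunk}
    return used if not missing else None

ranked = [
    "security objectives include authenticity and availability",  # relevant
    "invoices are issued monthly",                                # irrelevant, still paid for
    "updates must be secure and timely",                          # relevant
]

needed = tokens_to_full_recall(ranked, ["authenticity", "secure and timely"])
unreachable = tokens_to_full_recall(ranked, ["patchability"])
print(needed, unreachable)
```

Under this reading, every irrelevant or bloated chunk ranked above the last piece of evidence adds to the token bill, which is why cleaner chunks lower the number.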

Token comparison benchmark between PrimeCut and baselines

Two Ways to Use POMA PrimeCut Depending on Your Budget
PrimeCut Eco and PrimeCut Pro

PrimeCut ships in two tiers. Both preserve document hierarchy. Both eliminate context poisoning. The difference is in how they handle visual content and compute — matched to the complexity of your documents and your budget.

PrimeCut Eco

Simple hierarchical chunking for well-structured documents.

0.003 € / page
Features
  • Rapid document hierarchy detection
  • Semantically bounded chunks with ancestor context inheritance
  • Ready-to-embed chunksets
  • Images and visual elements extracted and replaced with placeholders
  • Optimized for low cost
  • Simple title generation
Best for
  • Large knowledge bases with limited budget
  • Simple and well-structured content

PrimeCut Pro

Full structural and visual intelligence for complex, mixed-content documents.

0.03 € / page
Features
  • Full document hierarchy parsing
  • Semantically bounded and neighbour-aware chunks with ancestor context inheritance
  • Context-aware ready-to-embed chunksets
  • Full AI processing — figures, tables, and images parsed as semantic content
  • Visual elements both extracted and converted to retrievable, context-aware textual chunks
  • Optimized for accurate, hierarchical textual representation of complex multimodal content
Best for
  • High-stakes domains with complex documents (legal & regulatory, financial & insurance, medical, engineering)
  • High need for search accuracy
  • Multimodal retrieval based on comparable semantic and hierarchical representations
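For budgeting, the per-page prices above translate directly into corpus cost. The sketch below uses only the two listed prices; it ignores volume discounts, minimums, and taxes, none of which are stated here, and the page counts are invented.

```python
from decimal import Decimal

# Per-page prices as listed for each tier.
PRICE_PER_PAGE_EUR = {"eco": Decimal("0.003"), "pro": Decimal("0.03")}

def ingestion_cost(pages: int, tier: str) -> Decimal:
    """Cost in EUR to ingest `pages` pages with the given tier."""
    return PRICE_PER_PAGE_EUR[tier] * pages

pages = 100_000
eco_cost = ingestion_cost(pages, "eco")
pro_cost = ingestion_cost(pages, "pro")

# Mixed corpus: simple documents go to Eco, complex ones to Pro.
mixed_cost = ingestion_cost(90_000, "eco") + ingestion_cost(10_000, "pro")

print(f"all Eco: {eco_cost} EUR")
print(f"all Pro: {pro_cost} EUR")
print(f"90/10 split: {mixed_cost} EUR")
```

Routing only the complex documents to Pro keeps most of the cost advantage of Eco while reserving full visual parsing for the pages that need it.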

Integration into Your RAG Pipeline
LangChain Document Chunking and RAG Pipeline Integration — No Architectural Overhaul

PrimeCut sits at the ingestion layer of your RAG pipeline, upstream of your vector database, your embedding model, and your retrieval logic. It receives documents and returns structured, hierarchically bounded chunksets.

Compatible with:
LLMs
  • OpenAI
  • Anthropic
  • Other leading LLMs
Vector Databases
  • Pinecone
  • Weaviate
  • Other vector databases
Frameworks
  • LangChain
  • LlamaIndex
  • Custom RAG implementations
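What the hand-off to the rest of a pipeline might look like is sketched below. The PrimeCut SDK calls and output schema are not documented on this page, so the chunkset shape used here (`id`, `path`, `text`) is purely hypothetical, as are `VectorRecord` and `to_record`. The point is only that a chunkset with a heading path maps onto the id, text, and metadata triple that most vector stores expect.

```python
from dataclasses import dataclass, field

@dataclass
class VectorRecord:
    id: str
    text: str                                  # string handed to the embedding model
    metadata: dict = field(default_factory=dict)

def to_record(chunkset: dict) -> VectorRecord:
    """Flatten a (hypothetical) chunkset into an embed-ready record."""
    path = chunkset["path"]
    # Prefix the heading path so the embedding sees the ancestor context.
    text = "\n".join(path + [chunkset["text"]])
    return VectorRecord(
        id=chunkset["id"],
        text=text,
        metadata={"section": path[-1] if path else "", "depth": len(path)},
    )

# Hypothetical chunkset; field names are assumptions, not the real schema.
chunkset = {
    "id": "doc1-cs0",
    "path": ["Cybersecurity Guidance", "B. Designing for Security"],
    "text": "FDA intends to assess device cybersecurity based on a number of factors.",
}

record = to_record(chunkset)
print(record.text)
print(record.metadata)
```

In a LangChain pipeline, a record like this could be wrapped as `Document(page_content=record.text, metadata=record.metadata)` before being passed to the vector store of your choice.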

The SDK is lightweight, the API is flexible, and PrimeCut's output schema is consistent across both tiers, Eco and Pro.