
Nonprofit Knowledge Graphs: A Technical Whitepaper on Entity Resolution for Donor Data

May 18, 2026 · 12 min read

Knowledge graphs are having a moment. Neo4j, TigerGraph, and Amazon Neptune have spent the last decade making the case that entity resolution - the problem of figuring out whether "Margaret Chen," "M. Chen," and "Maggie Chen (deceased husband: Robert)" are the same person - is best solved by representing the world as nodes and edges rather than rows and columns. They're right. The enterprise case studies are compelling, the academic literature is mature, and the tooling is production-grade.

This whitepaper is about what happens when you apply that architecture to a domain those vendors have largely ignored: nonprofit donor data. The technical problems are the same. The constraints are very different. And the difference between "a knowledge graph for donors" and "donor management software with a graph database underneath" is the difference between an intelligence layer and a slightly fancier CRM.

This is a long read. We've written it for development directors who want to evaluate the technology honestly, and for technical readers (CTOs, data leads, board members from tech backgrounds) who want to understand what's actually under the hood.

Why Entity Resolution Is the Core Problem

Every nonprofit dataset contains the same fundamental ambiguity. A donor record in your CRM, an email thread in your inbox, a board meeting minute in Google Docs, a thank-you note logged by a former staff member, and a wealth screening report from a third party all reference the same human being - but none of them agree on how to identify her.

Relational databases solve this by demanding a single canonical key (`donor_id = 4729`) and forcing every other system to conform. That works inside the CRM. It breaks the moment you try to reason across the CRM and the unstructured world around it.

Knowledge graphs invert the model. Instead of forcing every reference to conform to a primary key, you represent each reference as a node and let an entity resolution layer assert relationships between them: `(:Mention {text: "Maggie"}) -[:REFERS_TO {confidence: 0.91}]-> (:Person {id: "p_4729"})`. Mentions stay where they originated. The graph is the layer that says "these are the same person, and here's the evidence."
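As a minimal sketch of that pattern, the mention-to-person assertion can be modeled with three small types. The class names, the `source_id` field, the second mention, and the `mentions_of` helper are illustrative assumptions, not a schema taken from any particular graph database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    id: str
    name: str


@dataclass(frozen=True)
class Mention:
    text: str
    source_id: str  # hypothetical pointer to the record the mention came from


@dataclass(frozen=True)
class RefersTo:
    """An asserted link from a mention to a canonical person."""
    mention: Mention
    person: Person
    confidence: float


def mentions_of(person_id: str, edges: list[RefersTo],
                min_confidence: float = 0.0) -> list[Mention]:
    """Return mentions asserted to refer to a person, at or above a confidence floor."""
    return [e.mention for e in edges
            if e.person.id == person_id and e.confidence >= min_confidence]


margaret = Person(id="p_4729", name="Margaret Chen")
edges = [
    RefersTo(Mention("Maggie", "email_88421"), margaret, 0.91),
    # Second edge is invented for illustration only.
    RefersTo(Mention("M. Chen", "crm_note_17"), margaret, 0.62),
]

high_confidence = mentions_of("p_4729", edges, min_confidence=0.9)
```

The mentions are never rewritten or merged into the person record; only the edge changes if the resolution is later revised.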

This is the architectural pattern Neo4j, TigerGraph, and Neptune have spent a decade selling to banks, insurers, and intelligence agencies. It's also the right pattern for nonprofit donor data - for reasons those vendors haven't articulated, because they aren't in this space.

What's Different About Nonprofit Data

If the technical pattern is the same, why don't general-purpose graph databases solve the nonprofit case out of the box? Four reasons:

1. The signal-to-noise ratio is different. Enterprise entity resolution operates on millions of records with strong identifiers (SSNs, account numbers, transaction IDs). Nonprofit data is small (most orgs have under 50,000 records), weakly identified (donors share emails with spouses, change names through marriage, use nicknames in person and legal names on tax forms), and overwhelmingly unstructured. The notes field is where the real signal lives, not the contact card.

2. The cost of false positives is asymmetric. A bank that incorrectly merges two customer profiles loses time. A nonprofit that incorrectly merges Margaret Chen (deceased husband Robert) with Margaret Chen (whose husband Robert is very much alive and on the board) loses both donors and ends a career. Resolution confidence thresholds need to be much higher, and ambiguous matches need to surface to a human rather than auto-resolve.

3. Provenance is non-negotiable. When a development director asks "what do we know about Margaret?", every assertion in the answer needs to be traceable to a source record - the specific email, the specific note, the specific gift. Enterprise graph databases support this through edge properties, but it isn't the default; most analytics workloads don't need it. For donor work, an answer without a citation is worse than no answer at all.

4. The query interface needs to be natural language. Nobody on a development team is writing Cypher. The entity resolution graph is only useful if a gift officer can ask "who has Margaret talked to in the last year, and what was the outcome?" and get an answer cited to the underlying records. That requires a retrieval layer (vector embeddings + graph traversal + LLM synthesis) that the graph database itself doesn't provide.
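The asymmetry in point 2 can be turned into a number with a standard expected-cost argument. If p is the probability that two records are the same person, merging costs (1 - p) times the cost of a false merge in expectation, and keeping them apart costs p times the cost of a missed merge. Merging is the cheaper choice only when p exceeds cost_false_merge / (cost_false_merge + cost_missed_merge). The cost figures below are hypothetical and exist only to show how quickly the threshold climbs.

```python
def merge_threshold(cost_false_merge: float, cost_missed_merge: float) -> float:
    """Minimum match probability at which merging has lower expected cost.

    Derived from: (1 - p) * cost_false_merge < p * cost_missed_merge.
    """
    if cost_false_merge <= 0 or cost_missed_merge <= 0:
        raise ValueError("costs must be positive")
    return cost_false_merge / (cost_false_merge + cost_missed_merge)


# Hypothetical cost ratios, not measured values.
symmetric = merge_threshold(1, 1)      # both mistakes equally costly
asymmetric = merge_threshold(99, 1)    # a false merge is 99x worse than a missed one
```

With equal costs the break-even point is 0.5; when a false merge is assumed to be 99 times worse than a missed one, it rises to 0.99, which is why ambiguous donor matches belong in front of a human.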

The Three-Layer Architecture

A working nonprofit knowledge graph has three layers, and conflating them is the most common mistake we see in vendor pitches. From the bottom up:

Layer 1 - Ingestion & Normalization. Pulls structured records from your CRM and unstructured content from email, documents, notes, and third-party sources into a single pipeline.

Layer 2 - Entity Resolution Graph. The graph itself: nodes, edges, mentions, confidence scores, and provenance pointers back to source records.

Layer 3 - Retrieval & Synthesis. Natural-language queries in, cited answers out. This is the layer a human actually interacts with.

Layer 1: Ingestion. Connectors to the CRM (Bloomerang, Virtuous, Salesforce NPSP, Raiser's Edge), shared inboxes, document repositories, and third-party data sources (wealth screening, address verification). Output: a stream of normalized records with stable source identifiers.

Layer 2: Entity Resolution Graph. This is where the actual graph lives. Nodes for people, organizations, gifts, conversations, documents. Edges for relationships (`:GAVE`, `:WORKS_AT`, `:MENTIONED_IN`, `:REFERS_TO`). Mention nodes for every reference to an entity, with confidence scores and provenance pointers back to source records. This layer answers structural questions: who is connected to whom, with what confidence, and on what evidence.

Layer 3: Retrieval & Synthesis. Vector embeddings of unstructured content, semantic search across the graph, and LLM-mediated synthesis that produces natural-language answers grounded in the graph's structured evidence. Every answer it produces carries citations from Layer 2.

A CRM is, at most, a partial Layer 1 - it ingests structured records but rarely the unstructured context around them. A graph database is Layer 2 without Layers 1 or 3. A general-purpose RAG system (LangChain on top of OpenAI) is a generic Layer 3 with no donor-specific entity resolution underneath. None of these, on their own, gives you what the nonprofit case actually needs.
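To make the division of labor between Layers 2 and 3 concrete, here is a toy sketch of hybrid retrieval: a graph step narrows the candidates to documents linked to a person, then a vector step ranks them against the query. The three-dimensional vectors, the document ids other than `email_88421`, and the `retrieve` function are all made up for illustration; a real system would get embeddings from a sentence-embedding model and edges from the graph store.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hand-made toy embeddings standing in for Layer 3's vector index.
DOC_VECTORS = {
    "email_88421": [0.9, 0.1, 0.0],
    "note_12": [0.1, 0.9, 0.0],
    "gift_301": [0.0, 0.2, 0.9],
}

# Stand-in for Layer 2: person id -> documents that mention them.
MENTIONED_IN = {"p_4729": {"email_88421", "gift_301"}}


def retrieve(query_vec: list[float], person_id: str, top_k: int = 2) -> list[str]:
    """Graph step first (who is this about?), then vector ranking (what is relevant?)."""
    candidates = MENTIONED_IN.get(person_id, set())
    ranked = sorted(candidates,
                    key=lambda doc_id: cosine(query_vec, DOC_VECTORS[doc_id]),
                    reverse=True)
    return ranked[:top_k]


best = retrieve([1.0, 0.0, 0.0], "p_4729", top_k=1)
```

Note that `note_12` can never be returned for this person, however similar its text, because no edge connects it to her. That is the part a generic RAG stack without an entity resolution layer cannot guarantee.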

Entity Resolution: How It Actually Works

The core algorithm in Layer 2 is a probabilistic match between mention pairs. For any two mentions of a possible entity, the resolution layer computes a similarity score across multiple features:

Lexical: token overlap, edit distance, and phonetic matching (Soundex or Metaphone, for names that sound the same but are spelled differently)

Structural: shared email domains, overlapping addresses, common employer

Relational: appears in the same conversations, shares connections in the graph

Temporal: active in overlapping date ranges (helps disambiguate two donors with the same name across generations)

Contextual (embedding-based): semantic similarity of the surrounding text using sentence embeddings, so that "Maggie's late husband Robert" and "Margaret, widowed in 2019" score as describing the same family situation
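A few of the features above can be sketched with the Python standard library. Several things here are assumptions rather than a description of any production scorer: `difflib.SequenceMatcher` is a stand-in for true edit distance (it computes a matching-blocks ratio, not Levenshtein distance), the phonetic check compares only the last token of each name on the assumption that it is the surname, and the weighted sum with these particular weights is one simple way to combine features, chosen for illustration.

```python
from datetime import date
from difflib import SequenceMatcher

# American Soundex consonant codes.
_CODES = {ch: digit
          for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                                 "4": "l", "5": "mn", "6": "r"}.items()
          for ch in letters}


def soundex(name: str) -> str:
    """Four-character American Soundex code, or '' if the name has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (out + "000")[:4]


def lexical_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def phonetic_match(a: str, b: str) -> float:
    """1.0 when the last tokens of both names share a Soundex code, else 0.0."""
    tokens_a, tokens_b = a.split(), b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    code = soundex(tokens_a[-1])
    return 1.0 if code and code == soundex(tokens_b[-1]) else 0.0


def temporal_overlap(a_start: date, a_end: date, b_start: date, b_end: date) -> bool:
    """True when the two activity ranges share at least one day."""
    return a_start <= b_end and b_start <= a_end


# Illustrative weights only; a real system would fit these to reviewed matches.
WEIGHTS = {"lexical": 0.4, "phonetic": 0.2, "structural": 0.2, "temporal": 0.2}


def composite_score(features: dict[str, float]) -> float:
    """Weighted sum of feature scores; missing features count as 0."""
    return sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())


features = {
    "lexical": lexical_similarity("Margaret Chen", "Maggie Chen"),
    "phonetic": phonetic_match("Margaret Chen", "Maggie Chen"),
    "temporal": 1.0 if temporal_overlap(date(2015, 1, 1), date(2024, 12, 31),
                                        date(2019, 6, 1), date(2024, 11, 3)) else 0.0,
}
score = composite_score(features)
```

The temporal check is the piece that separates a mother and daughter with the same name: if one record's activity ends in 1990 and the other's begins in 2005, the feature contributes nothing to the score.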

The composite score is compared against three thresholds:

Above the auto-merge threshold: the mentions are linked to the same canonical entity automatically.

Below the auto-reject threshold: they are kept separate.

In the ambiguous middle band: the system surfaces the pair to a human for review, with the evidence presented inline.

This is the part vendors gloss over: the human-in-the-loop step is mandatory for nonprofit data. Auto-merging at high confidence is fine for transaction records. For donors, the cost of merging two distinct families is high enough that anything below near-certainty needs a human signoff.
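The three-way split can be sketched as a small triage function. The threshold values are placeholders chosen for illustration, not recommended settings, and the `Decision` enum and `triage` name are hypothetical. Scores that land exactly on a threshold are sent to review, which matches the "above" and "below" wording and errs on the side of a human look.

```python
from enum import Enum


class Decision(Enum):
    AUTO_MERGE = "auto_merge"
    HUMAN_REVIEW = "human_review"
    KEEP_SEPARATE = "keep_separate"


def triage(score: float, auto_merge: float = 0.98, auto_reject: float = 0.40) -> Decision:
    """Route a composite match score to merge, reject, or human review.

    The default thresholds are illustrative placeholders.
    """
    if not 0.0 <= auto_reject < auto_merge <= 1.0:
        raise ValueError("need 0 <= auto_reject < auto_merge <= 1")
    if score > auto_merge:
        return Decision.AUTO_MERGE
    if score < auto_reject:
        return Decision.KEEP_SEPARATE
    return Decision.HUMAN_REVIEW


decision = triage(0.91)  # falls in the middle band under these thresholds
```

Widening the middle band costs reviewer time; narrowing it raises the risk of a wrong merge. For donor data the argument above says to pay the reviewer time.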

Provenance: Every Edge Carries Its Source

Every edge in the graph carries metadata about where the assertion came from. A typical edge connecting Margaret Chen to a source email might look like this in graph form: a Person node (id p_4729, name "Margaret Chen") is linked by a MENTIONED_IN edge to a Document node (id email_88421), where the edge itself carries the source id, source type ("email"), source date ("2024-11-03"), the extraction model that produced the link ("ner_v3"), and a confidence score (0.94).
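Written out as a data structure, that edge might look like the sketch below. The field names and the `citation` helper are assumptions made for illustration, and the example assumes the edge's source id is the same identifier as the Document node it points to.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MentionedIn:
    """A Person -> Document edge that carries its own provenance."""
    person_id: str
    document_id: str
    source_id: str
    source_type: str
    source_date: date
    extraction_model: str
    confidence: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def citation(edge: MentionedIn) -> str:
    """Render the provenance fields as a short, human-readable citation."""
    return f"{edge.source_type} {edge.source_id}, {edge.source_date.isoformat()}"


edge = MentionedIn(
    person_id="p_4729",
    document_id="email_88421",
    source_id="email_88421",  # assumed to match the document id
    source_type="email",
    source_date=date(2024, 11, 3),
    extraction_model="ner_v3",
    confidence=0.94,
)

print(citation(edge))  # prints "email email_88421, 2024-11-03"
```

Recording the extraction model on the edge means that when a model is replaced, every assertion it produced can be found and re-checked.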

When the retrieval layer produces an answer, it walks these edges to assemble citations. The user sees: "Margaret mentioned her son's graduation in an email to Sarah on November 3, 2024 - [view source]." Clicking the citation surfaces the actual email.

This is the structural property that guards against hallucination at the architectural level. The LLM in Layer 3 is constrained to synthesize only over evidence the graph has surfaced; if there's no edge, there's no claim. The model can phrase the answer, but every fact it states has to trace back to an edge the graph already contains, which makes an unsupported claim detectable rather than invisible.
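One way to enforce "no edge, no claim" outside the model is a check that runs after generation. The sketch below assumes the LLM returns each claim alongside the ids of the edges it relied on; the `Claim` type, the edge ids, and the `grounded_claims` function are hypothetical. It verifies that the cited edges exist, not that the sentence paraphrases the source faithfully, so it narrows the problem rather than eliminating it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    edge_ids: tuple[str, ...]  # edges the model says this claim rests on


def grounded_claims(claims: list[Claim], known_edge_ids: set[str]) -> list[Claim]:
    """Keep only claims that cite at least one edge and cite no unknown edge."""
    return [
        claim for claim in claims
        if claim.edge_ids and all(e in known_edge_ids for e in claim.edge_ids)
    ]


# Hypothetical edge ids surfaced by the graph for this query.
surfaced = {"e_101", "e_102"}

draft = [
    Claim("Margaret mentioned her son's graduation in an email to Sarah.", ("e_101",)),
    Claim("Margaret plans to double her gift next year.", ()),         # no citation
    Claim("Margaret attended the spring gala.", ("e_999",)),           # unknown edge
]

answer = grounded_claims(draft, surfaced)
```

Only the first claim survives; the other two are dropped before the answer is shown, one for citing nothing and one for citing an edge the graph never surfaced.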

Why "Just Use Neo4j" Isn't the Answer

Neo4j is excellent software. So is TigerGraph, so is Neptune. If you're an enterprise data team with engineers who can model your domain, write Cypher, build ingestion pipelines, train NER models for your specific entity types, design retrieval-augmented generation on top, and maintain it all - you can build a nonprofit knowledge graph on any of them.

That's not a realistic path for a development team. A nonprofit needs:

Pre-built connectors to fundraising CRMs (not generic ETL frameworks)

NER and entity resolution tuned for donor-specific patterns (nicknames, family references, board affiliations, in-honor-of gifts)

A retrieval layer that already knows how to cite to fundraising-relevant evidence

A natural-language interface a gift officer can actually use

PII redaction and tenant isolation as default architecture, not optional features

A pricing model that doesn't assume you have a six-figure data infrastructure budget

This is the gap. The technology to do it well exists. The packaging for the nonprofit sector mostly does not.

How Gratefully Implements the Three Layers

We built Gratefully to be the nonprofit-specific version of this architecture. Briefly, in technical terms:

Layer 1: Native connectors to Bloomerang, Virtuous, Salesforce NPSP, and Raiser's Edge, plus generic ingestion for email, Google Drive, and Office 365. PII detection runs at ingestion and again at retrieval.

Layer 2: A property graph (under the hood) tuned for donor entity resolution. Composite scoring with mandatory human review in the ambiguous band. Every edge carries provenance metadata.

Layer 3: Hybrid retrieval (vector + graph traversal) feeding an LLM constrained to cite-only synthesis. Every answer carries a clickable source list. Donor data never trains shared models; tenants are isolated architecturally, not just contractually.

We've written separately about how this compares to general-purpose AI (Gratefully vs ChatGPT), to CRMs (vs Bloomerang, vs Virtuous, vs Salesforce Nonprofit), and to wealth screening tools (vs DonorSearch). The short version: a CRM, a graph database, a wealth screener, and a generic LLM are four different categories of software. A nonprofit knowledge graph is the layer that makes them work together coherently.

The Bottom Line

Entity resolution is largely a solved problem in the enterprise world. The architectural pattern - three layers of ingestion, resolved graph, and cited retrieval - is well-understood and battle-tested.

What hasn't been solved is the packaging of that pattern for nonprofits: connectors that fit fundraising tools, resolution thresholds tuned to the asymmetric cost of false positives in donor data, mandatory provenance on every assertion, and a query interface a development director can use without an engineer in the room.

That packaging is what the next generation of nonprofit intelligence software has to deliver. It's what we're building Gratefully to be.

For the conceptual version of this whitepaper aimed at non-technical readers, see The Institutional Memory Crisis. For a deeper look at the security model that makes this architecture safe for donor data, see Nonprofit AI Data Security: A Field Guide.

Want to see a nonprofit knowledge graph running on your own data?

We can walk you through it end-to-end - ingestion, resolution, retrieval, and citation - using a sample of your records, in under an hour.
