The first part of this series covered data sources and modelling. We discussed graphs in law enforcement investigations, their data sources, data provenance, INTs and how to model sources in graphs.
In part 2, Data quality and credibility, we covered source ratings (source reliability & information credibility) and their importance in investigation graphs for law enforcement.
This third blog post of the series will focus on corresponding entities, grouping and fusing entities.
Same entity, different sources
The sources are modelled, the information is in the graph and the nodes have a source rating. But what if different sources provide the same information and are represented by different nodes?

This graph contains information related to an armed robbery.
Paul, a witness, observed a vehicle — a Toyota — with John inside. He also recorded the licence plate. When the plate was checked, it was linked to a Toyota Camry registered to Richard S.
Because the licence plates match, it is highly likely these refer to the same vehicle. From this, an analyst might infer a connection between John and Richard. However, no such link currently exists in the graph, even though one probably should.
One option is to take no action and rely on the analyst to recognise that the two vehicles are the same. But this approach has drawbacks. Over time, the graph becomes cluttered with duplicate nodes and increasingly complex subgraphs. As the number of nodes grows into the hundreds or thousands, it becomes harder to identify duplicates.
This also increases cognitive load for analysts and undermines one of the key strengths of graphs: effective pathfinding.
Hypothesis links

When you are confident enough to formalise the hypothesis, you can create a link between the relevant entities in the graph. This establishes a path, making the connection visible when searching for the armed robbery, John, or related nodes.
However, this approach introduces some challenges.
First, the path becomes one hop longer, which can reduce clarity. More importantly, if additional vehicles share the same licence plate — for example, identified via speed cameras, other sightings, or even theft — this method does not scale well.
Since a relationship can only connect two nodes, multiple links are required to represent these scenarios. As the number of connections grows, the graph becomes increasingly noisy and harder to interpret.
This is further complicated if the links carry different confidence levels based on source reliability. Managing and interpreting these varying levels of certainty can quickly become unmanageable.
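To make the scaling concern concrete, here is a minimal sketch that counts the pairwise hypothesis links needed when several sightings share one plate. The node names, the `SAME_AS` link type and the `hypothesis_links` helper are assumptions made for illustration, not part of any particular graph database.

```python
from itertools import combinations


def hypothesis_links(node_ids):
    """Return one SAME_AS hypothesis link per pair of nodes.

    A relationship connects exactly two nodes, so fully linking n
    sightings of the same plate takes n * (n - 1) / 2 links.
    """
    return [(a, "SAME_AS", b) for a, b in combinations(node_ids, 2)]


# Hypothetical sightings that all carry the same licence plate.
sightings = ["witness_toyota", "registry_camry", "speed_camera_1", "speed_camera_2"]

links = hypothesis_links(sightings)
print(len(links))  # 6 links for 4 nodes
```

Ten sightings would already need 45 links, each potentially with its own confidence level.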
Using node representations
When a link between nodes is not the right solution, an alternative is to create a node representation: a new node that stands for the vehicle itself.

All activities around the Toyota Camry are linked to this new node, for example whenever the vehicle is observed again or caught by a speed camera.
In this structure, many activity nodes can be linked to the vehicle node, but it adds additional steps to each graph traversal.
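The effect on traversal length can be illustrated with a small breadth-first search. This is a simplified sketch: the edge lists are an assumed, reduced version of the armed robbery example, and the `hops` helper and the `Vehicle` representation node are hypothetical names.

```python
from collections import deque


def hops(edges, start, goal):
    """Breadth-first search over undirected edges; returns the hop count."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no path exists


# Assumed facts: John was seen in a Toyota; a Toyota Camry is registered to Richard S.
base = [("John", "Toyota"), ("Toyota Camry", "Richard S.")]

# Option 1: a hypothesis link between the two vehicle nodes.
with_link = base + [("Toyota", "Toyota Camry")]
# Option 2: both vehicle nodes point to a shared representation node.
with_representation = base + [("Toyota", "Vehicle"), ("Toyota Camry", "Vehicle")]

print(hops(base, "John", "Richard S."))                 # None
print(hops(with_link, "John", "Richard S."))            # 3
print(hops(with_representation, "John", "Richard S."))  # 4
```

The representation node needs only one link per sighting, at the cost of one more hop on the path between John and Richard S.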
Grouping nodes
A way to reduce these extra steps (and the resulting noise in the graph) is to group the nodes. You can group several nodes visually, based on a strong identifier.
As the graph is connected, you can efficiently drill up and down the layers to find the right conclusions.

Because this is only a visual layer (a virtual group), no real paths are created in the graph. However, you can still see the facts in the group as the underlying pieces of information.
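A virtual group can be thought of as a lookup computed at view time. The sketch below groups nodes by a strong identifier without adding nodes or relationships. The `virtual_groups` helper, the node dictionaries and the plate values are made up for illustration, and the normalisation rule (upper case, alphanumeric characters only) is an assumption.

```python
from collections import defaultdict


def virtual_groups(nodes, key):
    """Group node ids by a strong identifier without touching the graph.

    Only groups with more than one member are returned; nodes missing
    the identifier are left ungrouped.
    """
    groups = defaultdict(list)
    for node in nodes:
        value = node.get(key)
        if value:
            # Normalise so "AB-123 CD" and "AB123CD" land in the same group.
            normalised = "".join(ch for ch in value.upper() if ch.isalnum())
            groups[normalised].append(node["id"])
    return {k: ids for k, ids in groups.items() if len(ids) > 1}


# Hypothetical vehicle nodes; the plate values are invented.
vehicles = [
    {"id": "v1", "make": "Toyota", "plate": "AB-123 CD"},
    {"id": "v2", "make": "Toyota", "model": "Camry", "plate": "AB123CD"},
    {"id": "v3", "make": "Jeep", "plate": "ZZ999ZZ"},
    {"id": "v4", "make": "Toyota"},
]

print(virtual_groups(vehicles, "plate"))  # {'AB123CD': ['v1', 'v2']}
```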
The next step would be to materialise the groups and fuse entities, bringing together information from the entities considered to be the same.
Fusing entities
To fuse entities, create a new entity fusing information from facts determined to be the same. Then, establish the source and confidence in the new fused entity.

In the best case, all properties on all these entities are complementary and there is no difference or conflict between them, so we can create a new, unified node.
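Whether the properties are complementary can be checked before fusing. The following is a minimal sketch, assuming each fact is a plain dictionary of properties; `merge_properties` and the sample values are hypothetical.

```python
def merge_properties(facts):
    """Merge property dicts; return (merged, conflicts).

    Complementary properties are combined. A property with more than
    one distinct value is reported as a conflict instead of being
    silently overwritten.
    """
    seen = {}
    for fact in facts:
        for name, value in fact.items():
            values = seen.setdefault(name, [])
            if value not in values:
                values.append(value)
    merged = {n: v[0] for n, v in seen.items() if len(v) == 1}
    conflicts = {n: v for n, v in seen.items() if len(v) > 1}
    return merged, conflicts


# Assumed properties of the two vehicle facts; the plate is invented.
witness = {"make": "Toyota", "plate": "AB123CD"}
registry = {"make": "Toyota", "model": "Camry", "plate": "AB123CD"}

merged, conflicts = merge_properties([witness, registry])
print(merged)     # {'make': 'Toyota', 'plate': 'AB123CD', 'model': 'Camry'}
print(conflicts)  # {}
```

An empty `conflicts` result corresponds to the best case described above. Any entry in it would need an analyst's decision before a unified node is created.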
The source of the fused entity is human intelligence, because an analyst determined that the original entities are the same. Its source rating is A3, which indicates that the analyst is quite confident.
This node is a brand new entity in the graph with its own source and rating. The original sources are stored as properties on the fused node.

The relationships are transferred from the original nodes to the fused entity. When exploring the fused entity, you are able to expand and collapse to see the relationships to the sources.
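The transfer of relationships can be sketched on a plain in-memory graph. Everything here is an assumption made for illustration: the dictionary layout, the `fuse` helper, the relationship types (`SEEN_IN`, `REGISTERED_TO`, `FUSED_FROM`) and the source labels. A real graph database would do this with its own write operations.

```python
def fuse(graph, fact_ids, fused_id, source, rating):
    """Create a fused node and re-point relationships at it.

    `graph` is a plain dict: {"nodes": {id: props}, "edges": [(a, type, b)]}.
    The original fact nodes are kept; only their relationships move.
    """
    facts = set(fact_ids)
    graph["nodes"][fused_id] = {
        "source": source,
        "rating": rating,
        # Keep provenance: which sources the fused entity was built from.
        "original_sources": [graph["nodes"][f]["source"] for f in fact_ids],
    }
    rewired = []
    for a, rel, b in graph["edges"]:
        a2 = fused_id if a in facts else a
        b2 = fused_id if b in facts else b
        # Drop self-loops and duplicates created by the re-pointing.
        if a2 != b2 and (a2, rel, b2) not in rewired:
            rewired.append((a2, rel, b2))
    # Link the fused node back to its facts so they can still be expanded.
    rewired += [(fused_id, "FUSED_FROM", f) for f in fact_ids]
    graph["edges"] = rewired
    return graph


graph = {
    "nodes": {
        "john": {"source": "witness statement"},
        "richard": {"source": "vehicle registry"},
        "toyota": {"source": "witness statement"},
        "camry": {"source": "vehicle registry"},
    },
    "edges": [("john", "SEEN_IN", "toyota"), ("camry", "REGISTERED_TO", "richard")],
}

fuse(graph, ["toyota", "camry"], "vehicle_fused", source="human intelligence", rating="A3")
for edge in graph["edges"]:
    print(edge)
```

After the call, John and Richard S. are both directly connected to `vehicle_fused`, and the `FUSED_FROM` links keep the original facts reachable for drill-down.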
Challenges with fused entities
You are likely to face several challenges when fusing entities.
- Updates of facts: One of the biggest questions is, how do you deal with updates? Are the fused entities automatically updated when the facts are updated?
- Materialised vs shell-fused entities: There are two types of fused entities. Materialised fused entities have materialised properties, i.e. the properties are copied from the facts onto the fused entity. In this case, it is critical that an update to a fact also triggers an update of the fused entity. Shell-fused entities are more common: the facts own the properties and show them at run time, so the fused entity always reflects updated data.
- Information and source rating validation: The source rating of a fact (the original node) might change. How does that affect the fused entity? If the source rating shifts from A5 to D2, the source can no longer be considered trustworthy, and the analyst's assessments may have lost their validity.
- Conflicts: What happens if there is a conflict? For example, suppose a mistake was made and one of the vehicles in the fused entity turns out to be a Jeep rather than a Toyota. The vehicles are then no longer the same entity, which forces you to reconsider how you model the graph and its sources.
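The difference between materialised and shell-fused entities can be shown with two small classes. This is a sketch under stated assumptions: the class names, the `colour` property and the "last fact wins" merge rule are all hypothetical simplifications.

```python
class MaterialisedFused:
    """Copies fact properties once; must be refreshed when a fact changes."""

    def __init__(self, facts):
        self.facts = facts
        self.refresh()

    def refresh(self):
        # Re-copy every property from the facts (last fact wins).
        self.properties = {}
        for fact in self.facts:
            self.properties.update(fact)


class ShellFused:
    """Holds no properties itself; reads them from the facts on every access."""

    def __init__(self, facts):
        self.facts = facts

    @property
    def properties(self):
        merged = {}
        for fact in self.facts:
            merged.update(fact)
        return merged


registry = {"make": "Toyota", "model": "Camry", "colour": "blue"}
materialised = MaterialisedFused([registry])
shell = ShellFused([registry])

registry["colour"] = "red"  # the underlying fact is updated

print(materialised.properties["colour"])  # blue (stale until refresh() runs)
print(shell.properties["colour"])         # red
materialised.refresh()
print(materialised.properties["colour"])  # red
```

The materialised variant is cheaper to read but depends on something calling `refresh()` after every fact update. The shell variant pays the merge cost on each read and cannot go stale.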
The main takeaway is to consider your use cases before creating fused entities, and to create workflows that keep the facts and the fused entities in sync.
We will say more about facts and offloading them in the upcoming blog post, the fourth and final one in this Graphs in Law Enforcement series. If you would like to watch a video that covers all these topics, check out the talk Tracking Data Sources of Fused Entities in Law Enforcement Graphs by our VP of Engineering, Luanne Misquitta.

