Cypher MERGE Explained

July 31, 2014 · 5 min read

MERGE is set to replace CREATE UNIQUE at some point, but its behavior can be tricky to understand.

MERGE

Here’s a summary of what MERGE does:

  • It ensures that a pattern exists in the graph by creating it if it does not exist already
  • It will not use partially existing patterns: it attempts to match the entire pattern and creates the entire pattern if it is missing
  • When unique constraints are defined, MERGE expects to find at most one node that matches the pattern
  • It also allows you to define what should happen based on whether data was created or matched
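The last point refers to the ON CREATE and ON MATCH sub-clauses of MERGE. A minimal sketch, assuming hypothetical created and lastSeen properties that are not part of the examples in this post:

```cypher
// created and lastSeen are hypothetical property names used only for illustration
MERGE (u:User {name: "u1"})
ON CREATE SET u.created = timestamp()   // runs only when the node had to be created
ON MATCH SET u.lastSeen = timestamp()   // runs only when an existing node was matched
RETURN u
```

On a first run the node is created and only created is set; on later runs the node is matched and only lastSeen is updated.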

The key to understanding what part of the pattern is created if not matched is the concept of bound elements. So what is a bound element?

An element is bound if the identifier was used in an earlier clause of the cypher statement (thanks to Andrés and Anders for this definition).
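To make the definition concrete, here is a small sketch using the same User nodes as the examples in this post:

```cypher
// u1 is introduced by MATCH, so it is bound by the time MERGE runs
MATCH (u1:User {name: "u1"})
// u2 first appears inside the MERGE pattern, so it is unbound
MERGE (u1)-[:FRIEND]-(u2:User {name: "u2"})
```

If the pattern cannot be matched, MERGE creates only the unbound parts (the FRIEND relationship and a new u2 node) and attaches them to the bound u1.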

The Basics

MERGE acts as a combination of MATCH and CREATE. It tries to find the pattern in the graph, and if it does, nothing is created. Only if the pattern cannot be matched is it created.

MERGE (u1:User {name: "u1"}) will try to find a User node with name=u1. If such a node cannot be found, it is created. Once created, re-executing this MERGE statement has no effect on the graph.

If we want to make sure that a relationship with a given type is created once and only once between two nodes:

MATCH (u1:User {name: "u1"}), (u2:User {name: "u2"}) MERGE (u1)-[:FRIEND]-(u2)

Repeated execution of this statement is safe: it will not create more FRIEND relationships between u1 and u2, because once the relationship has been created, MERGE can match it in subsequent executions.

Side Note: We’ve left off the direction of the FRIEND relationship because in this example, the direction is irrelevant. Notice, however, that Neo4j chose a direction; this is because all relationships in Neo4j must have a direction. We can ignore it though when traversing with no performance implications at all. For more information on this topic, please look at our earlier blog post.
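To see which direction was actually stored, a query along these lines can be used. It is only a sketch; the aliases startName and endName are chosen here for illustration:

```cypher
// Match the relationship without specifying a direction,
// then ask the relationship itself for its start and end nodes
MATCH (a:User {name: "u1"})-[r:FRIEND]-(b:User {name: "u2"})
RETURN startNode(r).name AS startName, endNode(r).name AS endName
```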

Patterns with bound and unbound nodes warrant some examples.

Examples

Start off on an empty graph with the statement:

MERGE (u1:User {name: "u1"})-[:FRIEND]-(u2:User {name:"u2"})

Nothing in the pattern is bound, and moreover the unbound pattern cannot be matched in the existing empty graph, so it is created.

You should see

example 1

If you re-execute

MERGE (u1:User {name: "u1"})-[:FRIEND]-(u2:User {name:"u2"})

the graph remains unchanged. This is because the unbound pattern could be matched completely and so it does not create anything.

Clean out the graph and execute

MERGE (u1:User {name: "u1"})-[:FRIEND]-(u2:User {name:"u2"})

If you have created a unique constraint on the name property of User (such as the one shown in the section on properties and node matching), drop it with

DROP CONSTRAINT ON (user:User) ASSERT user.name IS UNIQUE

What happens in this case?

MERGE (u1:User {name:'u1'})-[:FRIEND]-(u2:User { name:'u2' })-[:LIVES_IN]->(c:Country { name:"India" })

Yes, it created the entire pattern.

example 3

Why is this? The entire pattern has no bound nodes. Since MERGE won’t consider a partial pattern, it attempted to match the entire unbound pattern, which does not exist, and so created all of it.

So how do we fix this to make sure that u1, u2, India and their relationships are not re-created? The answer is to bind the nodes of the pattern that you don’t want re-created.

MERGE (u1:User { name: "u1" }) MERGE (u2:User { name:"u2" }) MERGE (u1)-[:FRIEND]-(u2) MERGE (u2)-[:LIVES_IN]->(c:Country { name:"India" })

On an empty graph, this statement cannot match u1, so it creates it. Then it cannot match u2, so it creates it. u1 and u2 are now bound. In the final two parts, neither pattern exists, so both are created using the bound nodes u1 and u2.

What happens if you run this on a graph that contains

example 1

u1 can be matched, so it is not created. Same goes for u2. (u1)-[:FRIEND]->(u2) exists, so it is not created. Then the final part (u2)-[:LIVES_IN]->(c:Country { name:"India" }) does not exist and is created from the bound node u2.

example 1

If you re-execute this now, the graph remains unchanged because now the entire pattern in the final part of the query is matched and hence not created.

Contrast this with executing

MERGE (u1:User { name: "u1" }) MERGE (u1)-[:FRIEND]-(u2:User { name:"u2" })-[:LIVES_IN]->(c:Country { name:"India" })

on a graph that contains

example 1

In this case, the only node bound was u1. Since the unbound pattern could not be matched completely, it was created, which results in:

example 4
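One way to confirm the duplication is to count the User nodes named u2. A small check, assuming the graph started out as described above and the statement was executed once:

```cypher
// Count the User nodes named u2 after running the statement above
MATCH (u:User {name: "u2"})
RETURN count(u) AS u2Nodes
```

If MERGE behaved as described, the count is 2: the original u2 plus the one created as part of the unbound pattern.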

Properties and node matching

On a graph that contains

example 1

MERGE (u1:User {name: "u1",age:20})-[:FRIEND]-(u2:User {name:"u2"})

will create two more User nodes and a FRIEND relation between them.

example 2

This is because MERGE could not find any User node that has both a name and age property that match and so it went ahead and created the entire pattern.

Not every property needs to be specified to find a matching node: a subset will do. If we had a User node with properties name="u1" and age=20 and then attempted to MERGE (u:User {name:"u1"}), it would not create a new node because it could match a User node with name="u1".

Note that in case of unique constraints defined on User such as

CREATE CONSTRAINT ON (u:User) ASSERT u.name IS UNIQUE;

there can be at most one User node with a given name property.

So in the case where a User node exists with name="u1" (but no age),

MERGE (u1:User {name:"u1",age:20})

will produce an exception:

Node already exists with label User and property "name"=[u1]

This is expected because MERGE attempted to create a new node but the unique constraint was violated as a result of it.
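A common way to avoid this exception is to MERGE only on the constrained property and set the remaining properties afterwards. A minimal sketch, assuming age should be written whether the node was matched or created:

```cypher
// Match or create on the unique key only
MERGE (u1:User {name: "u1"})
// Then set the non-key property on whichever node was found or created
SET u1.age = 20
```

If age should only be written for newly created nodes, ON CREATE SET u1.age = 20 can be used in place of the plain SET.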

MERGE is really powerful. Just remember to bind elements so that you don’t create extra data where you don’t expect it!

(Updated on May 11 2015 for better clarity)

