In this blog post, we’ll demonstrate how to use variable length relationships (sometimes called “variable length paths”) in Cypher using examples. We will also see when zero length relationships can be useful.
Introduction
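Let’s start with the basics. For the sake of this blog post, our use case is users who know other users. Users write blog posts, and each user's posts are modeled as a linked list: a LAST_POST relationship points to the newest post, and PREVIOUS_POST relationships chain back to the older ones.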
You can generate an example graph with the following link to a predefined Graphgen graph, or use this Neo4j Console if you want to execute the queries whilst reading the blog post.
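If neither of those is at hand, a tiny graph with the same shape can be created by hand. This is only a sketch: the logins 'alice' and 'bob', the :BlogPost label and the title property are assumptions made for illustration, while the :User label, the login property and the relationship types come from the queries in this post.

```cypher
// Hypothetical sample data, not the Graphgen dataset used in this post.
CREATE (klind:User {login: 'klind'}),
       (alice:User {login: 'alice'}),
       (bob:User {login: 'bob'}),
       (klind)-[:KNOWS]->(alice),
       (klind)-[:KNOWS]->(bob),
       // alice has written two posts, chained from newest to oldest
       (a2:BlogPost {title: 'Alice, second post'}),
       (a1:BlogPost {title: 'Alice, first post'}),
       (alice)-[:LAST_POST]->(a2),
       (a2)-[:PREVIOUS_POST]->(a1),
       // bob has written a single post, so it has no PREVIOUS_POST
       (b1:BlogPost {title: 'Bob, only post'}),
       (bob)-[:LAST_POST]->(b1)
```

Having one friend with two posts and one friend with a single post is enough to reproduce the behaviour discussed in the zero length paths section.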
Basic Relationships Matching
Let’s start with a basic query that finds a user by login name and retrieves that user's friends:
MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend) RETURN user, friend
This was pretty easy.
Adding Relationship Length
Now, if we want to retrieve friends of friends that our user doesn’t know yet, we can simply expand the Cypher pattern:
MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend)-[:KNOWS]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf
This will work without any problem, but we can simplify the query by introducing a “relationship length” (or “path length”):
MATCH (user:User {login:'heller.perry'})-[:KNOWS*2]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf
By using the relationship length -[:KNOWS*2]->, we tell Cypher that there should be exactly 2 consecutive :KNOWS relationships on the path between our user and the friends of friends.
In fact, not specifying the relationship length is the same as writing -[:KNOWS*1]->.
Variable Relationship Length
We can also specify a variable length. For example, we may want to match possible friends of friends and friends of friends of friends and return them all in the same collection:
MATCH (user:User {login:'heller.perry'})-[:KNOWS*2..3]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf
This results in more friend suggestions than the previous query.
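With a variable length pattern it is often useful to know how far away each match was found. A minimal sketch, assuming the matched foaf nodes carry a login property like the starting user does; the aliases suggestion and distance are illustrative names:

```cypher
MATCH p = (user:User {login:'heller.perry'})-[:KNOWS*2..3]->(foaf)
WHERE NOT (user)-[:KNOWS]->(foaf)
// length(p) is the number of relationships in the matched path (2 or 3 here)
RETURN foaf.login AS suggestion, min(length(p)) AS distance
ORDER BY distance
```

Taking min(length(p)) keeps the shortest distance when the same person is reachable through several paths.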
Infinite Length and Length Limit
Sometimes you don’t know how “deep” in your graph the desired nodes can be, so you can use an infinite length by leaving out both bounds:
MATCH (user:User {login:'heller.perry'})-[:KNOWS*]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf
However, this query can have a huge impact on performance, depending on how large and how densely connected your data is. We recommend that you always specify a length limit in your queries:
MATCH (user:User {login:'heller.perry'})-[:KNOWS*..5]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf
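A variable length pattern produces one row per matching path, so the same person can show up several times, and a cycle of :KNOWS relationships can even lead back to the starting user. One way to tidy this up is sketched below, assuming a lower bound of 2 is acceptable since direct friends are filtered out anyway:

```cypher
MATCH (user:User {login:'heller.perry'})-[:KNOWS*2..5]->(foaf)
WHERE NOT (user)-[:KNOWS]->(foaf)
  AND foaf <> user          // a cycle can lead back to the user
RETURN DISTINCT user, foaf  // one row per suggested person, not per path
```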
Zero Length Paths
One of the lesser-known aspects of variable length paths is the zero length path. We will look at an example that highlights where zero length paths are useful.
Let’s say that we want to retrieve the blog posts written by people a user knows. By looking at our model, we will first get their last blog posts:
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(post) RETURN friend, post
This retrieves every friend who has written at least one blog post, together with that last post.
Let’s now say we would like to retrieve the 2 latest blog posts for each friend. We could simply expand the pattern:
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST]->(previousPosts) RETURN friend, lastPost, previousPosts
What is happening? Have we lost friends? Not at all!
You may have friends that have only written a single blog post. For those friends, the end of the pattern containing the -[:PREVIOUS_POST]-> relationship cannot be matched, so they are left out of the result.
Zero Length Paths to the Rescue!
Try to execute the following query and see that all our friends are back!
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST*0..1]->(post) RETURN friend, post
By introducing a zero length path into the pattern, we instruct Cypher to bind the lastPost and post variables to the very same node in the zero length path case. In other words, RETURN friend, post will now return every friend with at least one blog post, along with that friend's latest post and, when it exists, the one before it.
This trick has two main advantages:
- You do not have to write OPTIONAL MATCHes for finding if the lastPost has an outgoing PREVIOUS_POST relationship
- All the posts are in the same collection: post, which makes things easier to work with in your application
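For comparison, the OPTIONAL MATCH alternative mentioned in the first point might look like the sketch below; previousPost is an illustrative variable name.

```cypher
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->(lastPost)
// previousPost is null for friends who have written only one post
OPTIONAL MATCH (lastPost)-[:PREVIOUS_POST]->(previousPost)
RETURN friend, lastPost, previousPost
```

Here the posts are spread over two columns, one of which may be null, which is exactly what the zero length path version avoids.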
In fact, the lastPost node identifier can be omitted altogether, resulting in the following query, which is equivalent to the previous one:
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->()-[:PREVIOUS_POST*0..1]->(post) RETURN friend, post
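Because *0..1 stops after a single PREVIOUS_POST hop, this pattern yields at most two posts per friend. To reach further back in the linked list, the upper bound can be raised. The sketch below assumes each friend's posts form a single PREVIOUS_POST chain; the bound of 9 (up to ten posts per friend) is an arbitrary choice, and posts is an illustrative alias.

```cypher
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
// *0..9 walks from the last post back through at most 9 PREVIOUS_POST hops
MATCH (friend)-[:LAST_POST]->()-[:PREVIOUS_POST*0..9]->(post)
// collect() groups the matched posts into one list per friend
RETURN friend, collect(post) AS posts
```

Writing *0.. would drop the upper bound entirely, but the earlier advice about always specifying a length limit applies here as well.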
Conclusion
In this blog post, we have covered the basics of variable length relationships (paths) and learned that it is usually not a good idea to use infinite relationship lengths. We have also learned about zero length paths and demonstrated where they can be useful.