Cypher: Variable Length Relationships by Example

May 19, 2015

In this blog post, we’ll demonstrate how to use variable length relationships (sometimes called “variable length paths”) in Cypher using examples. We will also see when zero length relationships can be useful.

Introduction

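Let’s start with the basics. For the sake of this blog post, our use case is users that know other users. Users also write blog posts, and each user’s posts are modeled as a linked list: a LAST_POST relationship points to the newest post, and PREVIOUS_POST relationships chain back to older ones.

[Figure: Model overview]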

You can generate an example graph with the following link to a predefined Graphgen graph, or use this Neo4j Console if you want to execute the queries whilst reading the blog post.

Basic Relationship Matching

Let’s start with a basic query that finds a user by their login name and retrieves their friends:

MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend) RETURN user, friend

[Figure: Match user friends query result]

This was pretty easy.

Adding Relationship Length

Now, if we want to retrieve friends of friends that our user doesn’t know yet, we can simply expand the Cypher pattern:

MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend)-[:KNOWS]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf 

This will work without any problem, but we can simplify the query by introducing a “relationship length” (or “path length”):

MATCH (user:User {login:'heller.perry'})-[:KNOWS*2]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf

[Figure: Match user friends of friends query result]

By using the relationship length -[:KNOWS*2]->, we tell Cypher that there must be exactly 2 consecutive :KNOWS relationships on the path between our user and each friend of a friend.

In fact, not specifying the relationship length is the same as writing -[:KNOWS*1]->.
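
To make this equivalence concrete, here is a minimal Python sketch (not Cypher, and not how Neo4j is implemented) that models a fixed relationship length as repeated one-hop expansion. The user names, the KNOWS dictionary and the expand helper are assumptions made up for illustration; they are not part of the example graph used in this post.

```python
# Toy model of directed KNOWS relationships.
# Hypothetical users, not the dataset used in this post.
KNOWS = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan", "eve"],
    "dan": [],
    "eve": [],
}


def expand(start, hops):
    """Return the end node of every path made of exactly `hops` KNOWS relationships.

    The toy graph has no cycles, so this sketch does not need to track
    which relationships a path has already used.
    """
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for node in frontier for nxt in KNOWS[node]]
    return frontier


friends = expand("ann", 1)  # models -[:KNOWS]-> and -[:KNOWS*1]->
foafs = expand("ann", 2)    # models -[:KNOWS*2]->

# Models WHERE NOT((user)-[:KNOWS]->(foaf)).
suggestions = [f for f in foafs if f not in KNOWS["ann"]]

print(friends)      # ['bob', 'cat']
print(suggestions)  # ['dan', 'dan', 'eve']
```

Note that "dan" appears twice because there are two distinct paths leading to him. A pattern match likewise produces one result per matching path, so you may want RETURN DISTINCT when you only care about the nodes.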

Variable Relationship Length

We can also specify a variable length. For example, we may want to match both friends of friends and friends of friends of friends, and return them all under the same variable:

MATCH (user:User {login:'heller.perry'})-[:KNOWS*2..3]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf 

This returns more friend suggestions than the previous query:

[Figure: Variable length query result]
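
The following Python sketch models what a range such as *2..3 matches. It is a simplified illustration, assuming a made-up list of KNOWS pairs and a hypothetical var_length helper, and it mimics Cypher's rule that a relationship is used at most once within a single matched path.

```python
# Hypothetical directed KNOWS relationships as (source, target) pairs.
KNOWS = [
    ("ann", "bob"), ("ann", "cat"),
    ("bob", "dan"), ("cat", "dan"),
    ("dan", "eve"), ("eve", "ann"),
]


def var_length(start, min_hops, max_hops):
    """Yield (end_node, hops) for each path of min_hops..max_hops relationships.

    `used` holds the indexes of relationships already on the current path,
    so no relationship is traversed twice within one path.
    """
    def walk(node, used):
        if min_hops <= len(used) <= max_hops:
            yield node, len(used)
        if len(used) == max_hops:
            return
        for index, (src, dst) in enumerate(KNOWS):
            if src == node and index not in used:
                yield from walk(dst, used + (index,))

    yield from walk(start, ())


rows = list(var_length("ann", 2, 3))  # models -[:KNOWS*2..3]->
directly_known = {dst for src, dst in KNOWS if src == "ann"}
suggestions = sorted({node for node, _ in rows if node not in directly_known})

print(rows)         # [('dan', 2), ('eve', 3), ('dan', 2), ('eve', 3)]
print(suggestions)  # ['dan', 'eve']
```

With a fixed length of 2 only "dan" would be suggested; widening the range to 2..3 also reaches "eve".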

Infinite Length and Length Limit

Sometimes you don’t know how “deep” in your graph the desired nodes can be, so you can leave the length unbounded (an “infinite” length):

MATCH (user:User {login:'heller.perry'})-[:KNOWS*]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf 

However, an unbounded query can be very expensive, because the number of paths to explore grows quickly with the size and density of your graph. We recommend always specifying an upper bound in your queries. For example, *..5 matches paths of one to five relationships:

MATCH (user:User {login:'heller.perry'})-[:KNOWS*..5]->(foaf) WHERE NOT((user)-[:KNOWS]->(foaf)) RETURN user, foaf 
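
To get a feel for why an upper bound matters, the Python sketch below counts the paths that a pattern like *..n would have to consider in a small but densely connected graph. The graph (six hypothetical users who all know each other) and the count_paths helper are assumptions for illustration only; real query cost depends on your data and on the query planner.

```python
from itertools import permutations

# Hypothetical dense graph: each of 6 users KNOWS every other user.
NODES = range(6)
KNOWS = list(permutations(NODES, 2))  # 30 directed relationships


def count_paths(start, max_hops):
    """Count paths of 1..max_hops relationships starting at `start`.

    A relationship may appear at most once per path, as in a Cypher match.
    """
    def walk(node, used):
        total = 1 if used else 0  # count every non-empty path once
        if len(used) == max_hops:
            return total
        for edge in KNOWS:
            if edge[0] == node and edge not in used:
                total += walk(edge[1], used | {edge})
        return total

    return walk(start, frozenset())


for limit in range(1, 6):
    # The first three counts are 5, 30 and 150.
    print(limit, count_paths(0, limit))
```

Even with only six nodes, each extra hop multiplies the number of paths by roughly the number of outgoing relationships per node, which is why a limit such as *..5 is a sensible default.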

Zero Length Paths

One of the lesser known aspects of variable length paths is the zero length path. Let’s look at an example where it is useful.

Let’s say that we want to retrieve the blog posts written by people a user knows. Following our model, we first get their last blog posts:

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(post) RETURN friend, post 

This retrieves all friends that have written at least one blog post:

[Figure: Friends' latest blog posts]

Let’s now say we would like to retrieve the 2 latest blog posts for each friend. We could simply expand the pattern:

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST]->(previousPosts) RETURN friend, lastPost, previousPosts

[Figure: Blog posts retrieved using non zero length path]

What is happening? Have we lost friends? Not at all!

You may have friends who have written only a single blog post. For them, the end of the pattern containing the -[:PREVIOUS_POST]-> relationship cannot be matched, and because a MATCH must match the whole pattern, those friends drop out of the results.

Zero Length Paths to the Rescue!

Try to execute the following query and see that all our friends are back!

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST*0..1]->(post) RETURN friend, post

[Figure: Zero length]

By introducing a zero length path into the pattern, we instruct Cypher to bind the lastPost and post variables to the very same node in the zero length case. In other words, RETURN friend, post now returns every friend with at least one blog post, together with their latest post and, if there is one, the post before it. To walk the whole list of posts no matter how long it is, leave the upper bound open with *0.. instead.
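
The difference between the fixed pattern and the *0..1 pattern can be sketched in Python. This is only a model of the matching behaviour under assumed data: the friends, the post identifiers and both helper functions are hypothetical.

```python
# Hypothetical data: bob has three posts, cat has a single post.
LAST_POST = {"bob": "b3", "cat": "c1"}
PREVIOUS_POST = {"b3": "b2", "b2": "b1"}


def previous_chain(post, min_hops, max_hops):
    """Posts reachable from `post` over min_hops..max_hops PREVIOUS_POST hops."""
    found, hops = [], 0
    while post is not None and hops <= max_hops:
        if hops >= min_hops:
            found.append(post)  # at 0 hops this is the starting post itself
        post = PREVIOUS_POST.get(post)
        hops += 1
    return found


def posts_per_friend(min_hops, max_hops):
    """One (friend, post) row per match, like RETURN friend, post."""
    return [
        (friend, post)
        for friend, last in LAST_POST.items()
        for post in previous_chain(last, min_hops, max_hops)
    ]


fixed = posts_per_friend(1, 1)        # models -[:PREVIOUS_POST]->
zero_or_one = posts_per_friend(0, 1)  # models -[:PREVIOUS_POST*0..1]->

print(fixed)        # [('bob', 'b2')]
print(zero_or_one)  # [('bob', 'b3'), ('bob', 'b2'), ('cat', 'c1')]
```

With exactly one hop, "cat" disappears because her only post has no predecessor. Allowing zero hops keeps her in the results, with her last post bound to post.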

This trick has two main advantages:

  • You do not need an OPTIONAL MATCH to check whether lastPost has an outgoing PREVIOUS_POST relationship
  • All the posts are bound to the same variable, post, which makes them easier to work with in your application

In fact, the lastPost variable can be omitted altogether, resulting in the following query, which is equivalent to the previous one:

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend) MATCH (friend)-[:LAST_POST]->()-[:PREVIOUS_POST*0..1]->(post) RETURN friend, post 

Conclusion

In this blog post, we have covered the basics of variable length relationships (paths) and learned that it is usually not a good idea to leave the relationship length unbounded. We have also learned about zero length paths and demonstrated where they can be useful.

