GraphAware Blog

Cypher: Variable Length Relationships by Example

19 May 2015 by Christophe Willemsen & Michal Bachman

In this blog post, we’ll demonstrate how to use variable length relationships (sometimes called “variable length paths”) in Cypher using examples. We will also see when zero length relationships can be useful.

Introduction

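Let’s start with the basics. For the sake of the blog post, our use case will be users that know other users. Users also write blog posts, and each user's posts are modeled as a linked list: a LAST_POST relationship points from the user to their newest post, and PREVIOUS_POST relationships chain from each post back to the one before it: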

[Figure: Model Overview]

You can generate an example graph with the following link to a predefined Graphgen graph, or use this Neo4j Console if you want to execute the queries whilst reading the blog post.
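
If neither is at hand, a tiny hand-written dataset along the following lines should be enough to follow along. This is only a sketch: the logins other than heller.perry and klind, the BlogPost label and the title property are made up for illustration, while the User label and the relationship types follow the queries in this post.

```cypher
// Hypothetical sample data. Only the logins 'heller.perry' and 'klind',
// the User label and the relationship types come from this post.
CREATE (perry:User {login: 'heller.perry'}),
       (klind:User {login: 'klind'}),
       (alice:User {login: 'alice'}),
       (bob:User {login: 'bob'}),
       (carol:User {login: 'carol'}),
       (perry)-[:KNOWS]->(klind),
       (klind)-[:KNOWS]->(alice),
       (klind)-[:KNOWS]->(bob),
       (alice)-[:KNOWS]->(carol),
       // alice has written two posts, linked from newest to oldest
       (a2:BlogPost {title: 'Alice, second post'}),
       (a1:BlogPost {title: 'Alice, first post'}),
       (alice)-[:LAST_POST]->(a2),
       (a2)-[:PREVIOUS_POST]->(a1),
       // bob has written a single post
       (b1:BlogPost {title: 'Bob, only post'}),
       (bob)-[:LAST_POST]->(b1)
```

Having one user with two posts and another with a single post matters for the zero length path examples later on.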

Basic Relationships Matching

Let’s start with a basic query that finds a user by login name and retrieves the users they know:

MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend)
RETURN user, friend

[Figure: Match User Friends Query Result]

This was pretty easy.

Adding Relationship Length

Now, if we want to retrieve friends of friends that our user doesn’t know yet, we can simply extend the Cypher pattern by one more hop:

MATCH (user:User {login:'heller.perry'})-[:KNOWS]->(friend)-[:KNOWS]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN user, foaf

This will work without any problem, but we can simplify the query by introducing a “relationship length” (or “path length”):

MATCH (user:User {login:'heller.perry'})-[:KNOWS*2]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN user, foaf

[Figure: Match User Friends of Friends Query Result]

By using the relationship length -[:KNOWS*2]->, we tell Cypher that there should be exactly 2 consecutive :KNOWS relationships on the path between our user and the friends of friends.

In fact, not specifying the relationship length is the same as writing -[:KNOWS*1]->.
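
With a fixed length the intermediate friend no longer has a variable of its own. One way to get it back, sketched below, is to bind the whole path and read its second node. The names p and via are ours, and the extra foaf <> user check is our own addition: it keeps the starting user out of the suggestions when two users know each other.

```cypher
// Bind the 2-hop path so that the node in the middle can be inspected.
MATCH p = (user:User {login:'heller.perry'})-[:KNOWS*2]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
  AND foaf <> user            // our addition: do not suggest the user to themselves
// nodes(p) is [user, friend, foaf], so index 1 is the connecting friend
RETURN foaf, nodes(p)[1] AS via
```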

Variable Relationship Length

We can also specify a variable length as a range. For example, we may want to match both friends of friends and friends of friends of friends, and return them all in the same result:

MATCH (user:User {login:'heller.perry'})-[:KNOWS*2..3]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN user, foaf

This returns more friend suggestions than the previous query:

[Figure: Variable Length Query Result]
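
Once a range is used, the same person can be reached over several paths of different lengths and will then appear on several rows. A possible way to get one row per person along with how far away they are, assuming that is what you need, is to bind the path and aggregate on its length. The names p and distance are ours.

```cypher
MATCH p = (user:User {login:'heller.perry'})-[:KNOWS*2..3]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
// length(p) is the number of relationships in the path (2 or 3 here);
// min() keeps the shortest distance found for each foaf
RETURN foaf, min(length(p)) AS distance
ORDER BY distance
```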

Infinite Length and Length Limit

Sometimes you don’t know how “deep” in your graph the desired nodes can be. In that case you can leave the length unbounded by omitting the numbers altogether:

MATCH (user:User {login:'heller.perry'})-[:KNOWS*]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN user, foaf

However, an unbounded query can have a huge impact on performance, depending on how large and how densely connected your data is, because Cypher has to explore every path it can find. We recommend that you always specify an upper limit on the length in your queries:

MATCH (user:User {login:'heller.perry'})-[:KNOWS*..5]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN user, foaf
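
As a quick recap, the comments in the sketch below list the length forms used so far. The query itself adds two things that are our own suggestions rather than part of the original examples: DISTINCT, because one person can be reached over many paths, and an arbitrary LIMIT of 25 to keep the result small.

```cypher
// Length forms seen in this post:
//   -[:KNOWS*2]->      exactly 2 hops
//   -[:KNOWS*2..3]->   between 2 and 3 hops
//   -[:KNOWS*..5]->    between 1 and 5 hops (the lower bound defaults to 1)
//   -[:KNOWS*]->       1 or more hops, no upper bound
MATCH (user:User {login:'heller.perry'})-[:KNOWS*..5]->(foaf)
WHERE NOT((user)-[:KNOWS]->(foaf))
RETURN DISTINCT foaf
LIMIT 25
```

Note that LIMIT only caps the number of rows returned. It is the upper bound on the length that caps how deep the traversal can go.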

Zero Length Paths

One of the lesser known aspects of variable length paths is the zero length path. We will look at an example that shows where zero length paths are useful.

Let’s say that we want to retrieve the blog posts written by people a user knows. Following our model, we will first get each friend's last blog post:

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->(post)
RETURN friend, post

This retrieves every friend who has written at least one blog post, together with that latest post:

[Figure: Friends' Latest Blog Posts]

Let’s now say we would like to retrieve the 2 latest blog posts for each friend. We could simply extend the pattern:

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST]->(previousPosts)
RETURN friend, lastPost, previousPosts

[Figure: Blog Posts Retrieved Using Non Zero Length Path]

What is happening? Have we lost friends? Not at all!

You may have friends that have only written a single blog post. For them, the end of the pattern containing the -[:PREVIOUS_POST]-> relationship cannot be matched, so the whole pattern fails and the friend disappears from the result.
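
For comparison, a conventional way to keep those friends is to move the part of the pattern that may be missing into an OPTIONAL MATCH. The sketch below is our own variation on the query above, with previousPost as a made-up variable name. It works, but it spreads the posts over two columns, and previousPost is null for friends with a single post.

```cypher
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->(lastPost)
// The previous post may not exist; OPTIONAL MATCH keeps the row and
// leaves previousPost as null instead of dropping the friend.
OPTIONAL MATCH (lastPost)-[:PREVIOUS_POST]->(previousPost)
RETURN friend, lastPost, previousPost
```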

Zero Length Paths to the Rescue!

Try to execute the following query and see that all our friends are back!

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->(lastPost)-[:PREVIOUS_POST*0..1]->(post)
RETURN friend, post

[Figure: Zero Length Path Query Result]

By introducing a zero length path into the pattern, we instruct Cypher to bind the lastPost and post variables to the very same node in the zero length case. In other words, RETURN friend, post now returns every friend with at least one blog post, together with up to two of their latest posts: the last post itself (zero hops) and, if it exists, the one before it (one hop).
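
The same idea extends to more posts: an upper bound of N-1 gives up to N latest posts. As a sketch, assuming we want up to three posts per friend gathered into a single row, we could write the following. The name latestPosts is ours, and the order of the posts inside the collection should not be relied upon without an explicit ordering.

```cypher
MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
// *0..2 matches the last post (0 hops) and up to two posts before it
MATCH (friend)-[:LAST_POST]->()-[:PREVIOUS_POST*0..2]->(post)
RETURN friend, collect(post) AS latestPosts
```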

This trick has two main advantages:

In fact, the lastPost node identifier can be omitted altogether, resulting in the following query, which is equivalent to the previous one.

MATCH (user:User {login:'klind'})-[:KNOWS]->(friend)
MATCH (friend)-[:LAST_POST]->()-[:PREVIOUS_POST*0..1]->(post)
RETURN friend, post

Conclusion

In this blog post, we have covered the basics of variable length relationships (paths) and learned that it is usually not a good idea to leave the relationship length unbounded. We have also learned about zero length paths and demonstrated where they can be useful.
