GraphAware Blog

GraphAware Neo4j TimeTree

20 Aug 2014 by Luanne Misquitta & Michal Bachman

Modelling and querying time-based events in a graph is a fairly common discussion topic and a frequently asked question on Q/A sites. In this blog post, we evaluate some of the common approaches and introduce GraphAware TimeTree, a GraphAware Framework Module that simplifies modelling time and events in Neo4j.

Naive Approach

Neo4j has no native Date/Time data type, so you have to store a timestamp either as a long or as a human-readable String, for instance formatted as ‘YYYY-MM-DD HH:mm:ss’. Unless the time is only for human eyes, we recommend the machine-readable long: it is easier and less error-prone to convert a long to a String than the other way round.
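To illustrate why the long is the safer stored form, here is a minimal Python sketch of both conversions. The function names are hypothetical, and the sketch assumes every timestamp is interpreted in UTC and drops sub-second precision; neither assumption comes from the library.

```python
import calendar
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of 'YYYY-MM-DD HH:mm:ss'

def millis_to_string(millis):
    """long -> String: a single formatting step with no ambiguity."""
    moment = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return moment.strftime(FORMAT)

def string_to_millis(text):
    """String -> long: both the format and the time zone must be known (UTC assumed)."""
    parsed = datetime.strptime(text, FORMAT)  # raises ValueError on malformed input
    return calendar.timegm(parsed.timetuple()) * 1000
```

Going from String to long is the direction that can fail: the parser has to be told the exact format, and the String alone does not say which time zone it was written in.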

However, that is not the main point of this post. Assuming we have selected the long timestamp approach, one of the early thoughts that springs to mind is simply (and only) storing the timestamp of the event as a property on the event node. This approach isn’t recommended if any kind of timestamp-dependent querying is required.

First of all, finding events that occurred at a specific time or events that occurred within a time range requires the property to be examined on all candidate nodes. Secondly, ordering the events by time is not a cheap operation. Finally, queries such as finding dates which had no events or the most events would be very inefficient.
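The cost of the property-only approach can be sketched in a few lines of Python. The event data and the function name below are made up for illustration; the point is only that every event has to be inspected and the matches sorted afterwards.

```python
# Hypothetical events that carry nothing but a millisecond timestamp property.
events = [
    {"id": "email-1", "timestamp": 1398297600000},  # 2014-04-24 00:00:00 UTC
    {"id": "email-2", "timestamp": 1398384000000},  # 2014-04-25 00:00:00 UTC
    {"id": "email-3", "timestamp": 1396310400000},  # 2014-04-01 00:00:00 UTC
]

def events_in_range(all_events, start_millis, end_millis):
    """Examine the property on every candidate, then sort the matches by time."""
    hits = [e for e in all_events
            if start_millis <= e["timestamp"] <= end_millis]
    return sorted(hits, key=lambda e: e["timestamp"])
```

Questions such as "which days had no events?" are worse still, because the answer is not stored anywhere and has to be derived from a full scan.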

A multi-level index structure such as a tree is perfect for representing time; events are simply attached to the leaf nodes of the time tree as described, for example, in Peter’s blog post and the Neo4j manual.

The Time Tree Model

A time tree has a root node, followed by years on the first level, months on the second, days on the third, and so on. It can be as granular as you need it to be.

Time Tree

You attach an event node to the leaf node(s) of the time tree. In the figure above, the Email was sent on April 24, 2014. Instead of being dependent on properties, querying for events on a particular date only involves arriving at the day node and traversing to all events linked to it.
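As a rough in-memory analogy (nested dictionaries standing in for graph nodes, with hypothetical function names), attaching an event and looking it up by date could look like this:

```python
def attach_event(tree, year, month, day, event):
    """Create the year/month/day path if needed and attach the event to the day."""
    days = tree.setdefault(year, {}).setdefault(month, {})
    days.setdefault(day, []).append(event)

def events_on(tree, year, month, day):
    """Walk root -> year -> month -> day; no event property is inspected."""
    return list(tree.get(year, {}).get(month, {}).get(day, []))

tree = {}  # stands in for the root node of the time tree
attach_event(tree, 2014, 4, 24, "Email")
```

The lookup touches three levels regardless of how many events the graph holds, which is the property the time tree is designed to provide.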

Range queries are easy to handle too: from the start date to the end date, collect all events that fall within the range by following the NEXT relationship. Thanks to the levels modelled in the time tree, querying for all events at a coarser granularity, such as all events in April, is straightforward as well.
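The NEXT traversal can be sketched with a simple linked list of day nodes. The class and function names are hypothetical and the structure is an in-memory stand-in for the graph, not the library's implementation.

```python
from datetime import date

class DayNode:
    def __init__(self, day):
        self.day = day
        self.events = []
        self.next = None  # plays the role of the NEXT relationship

def link_days(day_nodes):
    """Chain the existing day nodes chronologically and return the earliest one."""
    ordered = sorted(day_nodes, key=lambda node: node.day)
    for earlier, later in zip(ordered, ordered[1:]):
        earlier.next = later
    return ordered[0] if ordered else None

def events_between(start_node, end_day):
    """Follow NEXT from start_node, collecting events up to and including end_day."""
    collected = []
    node = start_node
    while node is not None and node.day <= end_day:
        collected.extend(node.events)
        node = node.next
    return collected
```

Because the chain is already in chronological order, the events come back ordered by day without a separate sort, and days that have no node are never visited.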

GraphAware TimeTree Module

The GraphAware TimeTree is a library that builds the time tree on demand. It supports resolutions from one year down to one millisecond and has time zone support as well as full support for attaching event nodes to time instants created on demand.

If you ask for a node representing the 24th of April, 2014, the library will either find and return that node to you or else create it along with its parent hierarchy. In this case, assuming this is the first call to the library, a 2014 year node, an April month node, and a 24 day node will be created, and nothing else.
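The get-or-create behaviour described above can be modelled in a few lines of Python. This is a sketch of the idea only: the Node class and get_or_create_day function are hypothetical and are not the library's Java API.

```python
class Node:
    def __init__(self, label, value):
        self.label = label
        self.value = value
        self.children = {}  # child value -> Node

def get_or_create_day(root, year, month, day):
    """Return the day node plus the list of nodes that had to be created."""
    created = []
    node = root
    for label, value in (("Year", year), ("Month", month), ("Day", day)):
        child = node.children.get(value)
        if child is None:
            child = Node(label, value)
            node.children[value] = child
            created.append(child)
        node = child
    return node, created
```

On an empty tree, asking for 24 April 2014 creates exactly three nodes. Asking for 25 April 2014 afterwards creates only the new day node, because the year and month already exist.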

As more time instants are requested, other parts of the tree are constructed as necessary. Links between nodes on the same level as well as between levels are automatically maintained.

A SingleTimeTree maintains one single time tree in your graph, rooted at a node labelled TimeTreeRoot. You can select the resolution and timezone of the time instant you get from the time tree using both its Java and REST APIs. It also creates and maintains a few extra relationship types (FIRST, LAST) in order to simplify querying.
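To see why resolution and time zone matter, consider how one and the same millisecond timestamp maps to different tree paths. The sketch below is an assumption-laden illustration: tree_path is a hypothetical helper, time zones are simplified to fixed UTC offsets, and the millisecond level is left out for brevity.

```python
from datetime import datetime, timedelta, timezone

LEVELS = ["year", "month", "day", "hour", "minute", "second"]

def tree_path(millis, resolution="day", utc_offset_hours=0):
    """Return the values of the tree levels from year down to the given resolution."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(millis // 1000, tz=zone)
    depth = LEVELS.index(resolution) + 1
    return [getattr(moment, level) for level in LEVELS[:depth]]
```

For the instant 1398297600000 (midnight UTC on 24 April 2014), a day-resolution path in UTC ends at day 24, while the same instant viewed five hours behind UTC ends at day 23.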

Time Tree

The library also gives you an option to create multiple time trees in your graph, in which case you should use a CustomRootTimeTree and supply a node from your graph that will serve as the root of the tree. This is useful, for instance, when you’re modelling people’s professional profiles and want to capture the start and end dates of their employments in different companies. In such a case, each company would have its own time tree, effectively acting as an in-graph time index of the employments of all the different people in that company.
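The employment example can be sketched with one small tree per company. The company names, people, and event tuples are invented, and nested dictionaries stand in for the custom root nodes.

```python
def attach(root, year, month, day, event):
    """Attach an event under the given root, creating the path on demand."""
    root.setdefault(year, {}).setdefault(month, {}).setdefault(day, []).append(event)

def events_in_year(root, year):
    """Collect all events below one year node, in chronological order."""
    months = root.get(year, {})
    return [event
            for month in sorted(months)
            for day in sorted(months[month])
            for event in months[month][day]]

# One independent tree per company; each dict plays the role of a custom root.
roots = {"Acme": {}, "Globex": {}}
attach(roots["Acme"], 2012, 3, 1, ("Alice", "STARTED"))
attach(roots["Acme"], 2014, 1, 31, ("Alice", "ENDED"))
attach(roots["Globex"], 2014, 2, 3, ("Alice", "STARTED"))
attach(roots["Globex"], 2014, 1, 6, ("Bob", "STARTED"))
```

Querying one company's tree never touches the other company's events, which is what makes each tree act as a per-company time index.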

Apart from maintaining the tree, the GraphAware TimeTree library offers convenient methods for attaching your event nodes to time instants, creating the instant nodes if they do not already exist, and querying for events at a specific time instant or range thereof, including children of the time instant(s).

Other Approaches

Both Michael Hunger and Mark Needham have blogged about creating an entire time tree using Cypher:

The GraphAware TimeTree essentially creates the same structure including

Examples

Querying for events using the TimeTree is easy: GET http://server:7474/graphaware/timetree/single/{time}/events will give you all events attached to the time instant represented by {time}, which is a long (the number of milliseconds since 1 January 1970 UTC).

GET http://server:7474/graphaware/timetree/range/{startTime}/{endTime} similarly gives you all events that occurred between {startTime} and {endTime} ordered by time.

You can also specify the relationship type that relates the event to a time instant. For example, GET http://server:7474/graphaware/timetree/range/{startTime}/{endTime}?relationshipType=SENT_ON will only return events attached to time instants with a SENT_ON relationship.

Leaving out the relationship type gives you all events, useful if you have different kinds of events occurring at the same time instant.
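A small helper for composing these URLs might look as follows. The paths and the relationshipType parameter are taken from the examples above; the function names are hypothetical, and server:7474 is the placeholder host used in this post.

```python
from urllib.parse import urlencode

BASE = "http://server:7474/graphaware/timetree"  # placeholder host from this post

def single_events_url(time_millis):
    """URL for all events attached to a single time instant."""
    return f"{BASE}/single/{time_millis}/events"

def range_events_url(start_millis, end_millis, relationship_type=None):
    """URL for events in a range, optionally restricted to one relationship type."""
    url = f"{BASE}/range/{start_millis}/{end_millis}"
    if relationship_type is not None:
        url += "?" + urlencode({"relationshipType": relationship_type})
    return url
```

For example, range_events_url(1396310400000, 1398902400000, "SENT_ON") covers 1 April 2014 to 1 May 2014 (UTC); the resulting string can be passed to any HTTP client.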

The Java API is similar; please refer to the JavaDoc for details.

Of course, you are not limited to using the time tree APIs. You can also write your own Cypher queries with additional logic, for instance to find all emails sent in April 2014 that did not get a reply:

```cypher
MATCH (root:TimeTreeRoot)-[:CHILD]->(year:Year {value:2014})-[:CHILD]->(month:Month {value:4})
WITH month
MATCH (month)-[:FIRST]->(firstDay)
MATCH (month)-[:LAST]->(lastDay)
WITH firstDay, lastDay
MATCH (firstDay)-[:NEXT*0..31]->(day)-[:NEXT*0..31]->(lastDay)
WITH day
MATCH (email:Email)-[:SENT_ON]->(day)
WHERE NOT (email)<-[:REPLIED_TO]-()
RETURN email;
```

Conclusion

We hope you will find the GraphAware TimeTree useful. Check out more details on GitHub and do not hesitate to give us feedback or, even better, improve it by issuing a pull request. Thanks to those who have already done so!
