A book tells us a story, but for a computer it is a wall of text. How can we use graphs and NLP to help our machines make more sense of a story?
Find out what's new in the Neo4j world
“Relevance is the practice of improving search results for users by satisfying their information needs in the context of a particular user experience, while balancing how ranking impacts business’s needs.” 
Since our first post a few months back, Neo4j-Databridge has seen a number of improvements and enhancements. In this post, we’ll take a quick tour of the latest features.
Recommendation engines are a crucial element in the global trend towards a push-based web experience and away from a pull-based one. They personalize the content offered to each user by predicting how interested the user will be in the recommended items. This is not only a powerful business tool for content providers, but also a vital improvement to the user experience. In today’s world, where the volume, interdependence, variety and speed of information are overwhelming, recommendation engines can significantly reduce the gap between us and what we search for. Indeed, these engines are even used to enhance common text-based search (read more about graph-aided search in our blog).
Last month, the 5th edition of GraphConnect San Francisco took place at the Hyatt Regency SF. It was the biggest graph technology event ever and GraphAware proudly contributed as a sponsor, with one main talk, two lightning talks and our GraphHero stand ｡^‿^｡ This edition’s big announcement was the upcoming new landmark release of Neo4j 3.1, “The database for the connected enterprise”, which introduces a new state-of-the-art clustering architecture and new security architecture to meet enterprise requirements for scale and security. There will be a lot to say about this release, but you can already try the beta release as we have done!
In recent years, the rapid growth of social media communities has created a vast amount of digital documents on the web. Recommending relevant documents to users is a strategic goal for the effectiveness of customer engagement but at the same time is not a trivial problem.
Whether you realize it or not, the software you create has a global market. Perhaps more so than any other product in any other industry, code that may start as a small, individual effort has the potential to rapidly blossom into a product used around the world. While it is not always obvious that your application can or will have such wide usage, it is in your best interest to maximize the number of organizations and people you can reach. This means it is important to ensure your software is internationalized and localized.
Without question, GitHub is the biggest code sharing platform on the planet. With more than 14 million users and 35 million repositories, the insights you can discover by analyzing the data available through its API are surprising and revealing.
In the Bersin Predictions for 2016 report, Josh Bersin states that “it feels as though everything in the world of talent is changing – from the way we recruit and attract people, as well as how we reward them, to the way we learn, and how we curate and manage our entire work-life experience”.
A great part of the world’s knowledge is stored using text in natural language, but using it in an effective way is still a major challenge. Natural Language Processing (NLP) techniques provide the basis for harnessing this huge amount of data and converting it into a useful source of knowledge for further processing.
In our previous blog post we introduced the concept of Graph Aided Search. It refers to a personalised user experience during search where the results are customised for each user based on information gathered about them (likes, friends, clicks, buying history, etc.). This information is stored in a graph database and processed using machine learning and/or graph analysis algorithms.
At GraphAware, we live and breathe Neo4j. For three years, we have been helping customers around the world embrace this amazing technology as a solution to many interesting problems. Mainstream applications of graphs, such as real-time recommendations, fraud detection, impact analysis, and graph-aided search, have been getting a lot of media attention.
As of version 2.1, Neo4j OGM will support persistence events. Although a date for the release of 2.1 isn’t known at the time of writing, we think this is an important and exciting new feature and so we’ll be writing a series of posts about it over the next few weeks to whet your appetites. In this first post we’ll take a quick tour of the new Events mechanism in the OGM, and provide some examples of how we might use it in our own applications. But first, some background…
Spring Data Neo4j 4.1 introduces the ability to map nodes and relationships returned by custom Cypher queries to domain entities. This blog post will explain how different types of query results map to entities.
For most organisations, data security is extremely important. The topic comes up every single time we are training, consulting, or otherwise engaging in the world of graphs and Neo4j. At the same time, security is very difficult and time-consuming to get right and the implications of getting it wrong can be serious. In this blog post, we introduce the integration of Spring Security into Neo4j which provides important security controls and mechanisms for enterprises and governments that make use of the world’s most popular graph database.
At GraphAware, we help organisations in a wide range of verticals solve problems with graphs. Once we come across a requirement or use case two or three different times, we typically create an open-source Neo4j extension that addresses it. The latest addition to our product portfolio, introduced in this post, is a simple library that automatically expires data from the Neo4j graph database.
We are delighted to invite you to a Meetup on 4th February 2016 at 6:30 pm at the GraphAware London office, where Michal Bachman is going to present the European premiere of his talk entitled “Real-Time Recommendations and the Future of Search”, combined with a unique expert panel discussion and Q&A.
Iterating over large numbers of nodes using Cypher is quite a common use case in Neo4j. Typically, the reason for doing this
is that we want to perform some kind of operation for each one of these nodes. In this blog post, we will use one million
TestNodes and try to iterate over them in order to index their contents into a freshly created Elasticsearch index.
There are three approaches we can take, two of which are quite common, but the most performant technique is largely unknown.
Last month, I had the pleasure of speaking at GraphConnect in San Francisco, introducing Graph-Aided Search to a large audience of Neo4j users and graph enthusiasts. For those who missed the conference, the recording and slides have now been made available. Enjoy and get in touch with feedback / questions!
Recently, Neo Technology announced the 2.3.0-RC1 release of their Neo4j graph database. One of the key new features is Triadic Selection built into Cypher’s Cost Based Planner. In this blog post, we will explore the Triadic Selection in detail and demonstrate how significantly it can speed up recommendations computed in Neo4j.
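To make the idea concrete, here is a minimal Python sketch of the kind of triangular pattern that triadic selection targets: finding friends-of-friends who are not yet direct friends. The in-memory `friends` dictionary and the `recommend_friends` function are hypothetical illustrations of ours, not Neo4j or Cypher APIs, and the sketch says nothing about how the Cost Based Planner actually implements the operator.

```python
from collections import Counter


def recommend_friends(friends, person):
    """Return friends-of-friends of `person` who are not already direct friends.

    Results are (candidate, mutual_friend_count) pairs, most mutual friends first,
    with ties broken alphabetically.
    """
    direct = friends.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the person themselves and anyone who is already a direct friend.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data: an undirected friendship graph as adjacency sets.
friends = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave", "Eve"},
    "Carol": {"Alice", "Dave"},
    "Dave": {"Bob", "Carol"},
    "Eve": {"Bob"},
}

print(recommend_friends(friends, "Alice"))  # [('Dave', 2), ('Eve', 1)]
```

The nested loops are the naive way of closing the triangle; the point of a dedicated operator is to avoid repeating the "is already a friend" check for every candidate path.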
For the last couple of years, Neo4j has been increasingly popular as the technology of choice for people building real-time recommendation engines. Having been at the forefront of the graph movement through client engagements and open-source software development, we have identified the next step in the natural evolution of graph-based recommendation engines. We call it Graph-Aided Search.
Drawing a graph on a whiteboard is easy and fun! Translating that graph into an object model can sometimes result in questions such as “do I have to define relationships in both participating node entities?” or “which end of the relationship should I save?”.
Writing integration tests for your code that runs against Neo4j is simple enough when using the native API, but there’s not a great deal of help out there if you’re working in client-server mode. Making assertions about the shape of the graph can also be difficult, particularly if use cases involve more than a few nodes and relationships.
In this blog post, we’ll demonstrate how to use variable length relationships (sometimes called “variable length paths”) in Cypher using examples. We will also see when zero length relationships can be useful.
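As a rough illustration of what a variable length pattern such as `(a)-[*1..2]->(b)` expresses, the following Python sketch enumerates every path from a start node that follows between a minimum and a maximum number of directed relationships. The edge list and the `variable_length_paths` function are hypothetical and of our own making; this is a simplified model of the idea, not how Cypher evaluates such patterns.

```python
def variable_length_paths(edges, start, min_hops, max_hops):
    """Enumerate paths from `start` that follow between `min_hops` and `max_hops`
    directed edges, never using the same edge twice within one path."""
    results = []

    def walk(node, path, used):
        # A path qualifies once it has used at least `min_hops` edges.
        if len(used) >= min_hops:
            results.append(path)
        # Stop expanding when the upper bound is reached.
        if len(used) == max_hops:
            return
        for index, (source, target) in enumerate(edges):
            if source == node and index not in used:
                walk(target, path + [target], used | {index})

    walk(start, [start], frozenset())
    return results


# Hypothetical sample data: a simple chain A -> B -> C -> D.
edges = [("A", "B"), ("B", "C"), ("C", "D")]

print(variable_length_paths(edges, "A", 1, 2))  # [['A', 'B'], ['A', 'B', 'C']]
print(variable_length_paths(edges, "A", 0, 1))  # [['A'], ['A', 'B']]
```

With a lower bound of zero, the start node itself counts as a match, which is what makes zero length relationships handy when a query should include the starting node along with everything reachable from it.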
GraphAware is very proud to sponsor GraphConnect Europe 2015, the only conference that focuses on the rapidly growing world of graph databases and applications that make sense of connected data. The conference takes place in London on 7th May 2015.
At GraphAware, we are very excited about the recently released Neo4j 2.2 and would like to share some info about where you can meet us in the next few weeks and months. Come and see us for a chat and learn something new about Neo4j and Graph Databases!
Over the last few months, GraphAware, Neo4j, and Pivotal engineers have been working on a ground-up reimplementation of Spring Data Neo4j (SDN) that is server-first and Cypher-centric. Today we are very excited to announce the first milestone of the new Spring Data project for Neo4j.
Last weekend, I came across a tweet announcing that Wikimedia released the dataset of the page clickstreams for February 2015. I found it interesting to download this dataset and see how people arrive on Neo4j’s Wikipedia page.
Our earlier blog post
talked about using the Neo4j web browser along with embedded Neo4j.
WrappingNeoServerBootstrapper, which was employed to do this, has been deprecated for a while, and this raises the question
of what to use instead.
A common question when planning and designing your Neo4j Graph Database is how to handle “flagged” entities. This could include users that are active, blog posts that are published, news articles that have been read, etc.
There is no better way to start 2015 than to learn something new. In the wake of two recent major announcements (here and here), Neo4j is as hot as ever, so it might well be the next skill you pick up or improve. Here’s a list of Neo4j events organised by GraphAware around the world in the next few weeks. We’ll be delighted to see you there!
There are times when you have an application using Neo4j in embedded mode but also need to play around with the graph
using the Neo4j web browser. Since the database can be accessed from at most one process at a time, trying to start up
the Neo4j server when your embedded Neo4j application is running won’t work. The WrappingNeoServerBootstrapper,
although deprecated, comes to the rescue. Here’s how to set it up.
Last month, I had the pleasure of speaking at GraphConnect in San Francisco, introducing the GraphAware Framework to a large audience of Neo4j users and graph enthusiasts. For those who missed the conference, the recording and slides have now been made available. Enjoy and get in touch with feedback / questions!
Specialist in Neo4j consultancy, training, and software development, Graph Aware Ltd has been selected as one of Neo Technology’s first UK solution partners, under its newly launched partnership program.
In this post, we’d like to introduce the first version of the GraphAware Neo4j ChangeFeed - a GraphAware Runtime Module that keeps track of changes made to the graph.
Modelling and querying time-based events in a graph is a fairly common discussion topic and a frequently asked question on Q/A sites. In this blog post, we evaluate some of the common approaches and introduce GraphAware TimeTree, a GraphAware Framework Module that simplifies modelling time and events in Neo4j.
In the first part of this short series about random graph models, we talked about why they are useful and had a brief look at two of them: Erdos-Renyi graphs and the Barabasi-Albert model. In this post, we take a look at the “small world” phenomenon and another network model, namely the Watts-Strogatz model.
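For readers who have not met the Watts-Strogatz model before, here is a minimal Python sketch of its usual textbook construction: start from a ring lattice and rewire each edge with some probability. The function name `watts_strogatz` and the parameters `n`, `k`, `beta` and `seed` are our own choices for illustration, and the details may differ from the variant discussed in the post.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Build a Watts-Strogatz graph as a set of undirected edges (frozensets).

    n: number of nodes, k: even number of lattice neighbours per node,
    beta: probability of rewiring each lattice edge.
    """
    if k % 2 != 0 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)

    # Step 1: ring lattice, each node linked to its k/2 nearest neighbours on each side.
    edges = set()
    for node in range(n):
        for offset in range(1, k // 2 + 1):
            edges.add(frozenset((node, (node + offset) % n)))

    # Step 2: with probability beta, rewire the far end of each lattice edge
    # to a node chosen uniformly at random, avoiding self-loops and duplicates.
    for node in range(n):
        for offset in range(1, k // 2 + 1):
            if rng.random() >= beta:
                continue
            old = frozenset((node, (node + offset) % n))
            candidates = [
                target for target in range(n)
                if target != node and frozenset((node, target)) not in edges
            ]
            if candidates:
                edges.remove(old)
                edges.add(frozenset((node, rng.choice(candidates))))
    return edges


lattice = watts_strogatz(20, 4, 0.0, seed=1)
small_world = watts_strogatz(20, 4, 0.1, seed=1)
print(len(lattice), len(small_world))  # 40 40
```

Rewiring never changes the number of edges, so both graphs have n * k / 2 of them; only the long-range shortcuts that produce the "small world" effect differ.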
MERGE is set to replace CREATE UNIQUE at some point, but its behavior can sometimes be tricky to understand.
Efficient counting of relationships in Neo4j was the cornerstone of my Master Thesis and the reason the very first GraphAware Framework module called the Relationship Count Module was born. The improvements in Neo4j 2.1 around dense nodes and the addition of getDegree(…) methods on the Node interface made me eager to do some benchmarking around relationship counts again.
When one obtains graph data from a measurement on a real-world network, it is sometimes useful to compare it with a random graph. Such a graph is characterised by a certain degree distribution, which you can imagine as a list of the degrees of the nodes present in the network. The most interesting distributions have a certain functional dependence, which allows one to infer what processes are dominant in the formation of the network. These processes in turn characterise the relationships between the nodes.
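As a small illustration of what a degree distribution looks like in practice, the following Python sketch counts how many nodes have each degree in an undirected edge list. The `degree_distribution` function and the sample edges are hypothetical; note that nodes without any relationship do not appear in an edge list and are therefore not counted here.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of nodes that have it, for an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        # Each edge contributes one to the degree of both of its endpoints.
        degrees[a] += 1
        degrees[b] += 1
    return dict(sorted(Counter(degrees.values()).items()))


# Hypothetical sample data: A has degree 3, B and C have degree 2, D has degree 1.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

print(degree_distribution(edges))  # {1: 1, 2: 2, 3: 1}
```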
One of the main goals of the GraphAware Framework is to simplify and speed up development with Neo4j. Although it is called a “framework” for reasons explained elsewhere, today we will simply treat it as a library of useful, tested, and documented Java code. The feature we will introduce is called Improved Transaction Event API, which is exactly what it says on the tin.
A couple of days ago, I wrote about unit testing with GraphUnit. GraphUnit tested the state of an embedded Neo4j database. What if you run Neo4j in standalone server mode? Fortunately, you can still test it and match subgraphs using the GraphAware Neo4j RestTest library.
Today, it is exactly one year since Graph Aware Limited was incorporated. It started as a one-man show, whilst I was finishing my MSc thesis at Imperial College London. Since then, we’ve been growing slowly but steadily and will be moving to our new London office fairly soon (announcements to come). We have happy clients in London, New York, Copenhagen, Barcelona, Prague, and Accra.
Recently, we announced the GraphAware Framework. Today, I would like to introduce its first feature called GraphUnit. GraphUnit is a component that helps Java developers unit test their code that talks to Neo4j and mutates data.
In this short blog post, I would like to introduce the GraphAware Neo4j Framework. Its goal is very ambitious: we’d like to make it as useful for Neo4j developers, as the Spring Framework is for Java developers. The Framework aims at speeding up development with Neo4j by providing a platform for building useful generic as well as domain-specific functionality, analytical capabilities, graph algorithms, and more.
After a long wait, I finally got the opportunity to publish the recording of my graph/Neo4j talk at WebExpo Prague 2013, intentionally and somewhat misleadingly titled “(Big) Data Science”. Thanks to the organisers for making it available and see you soon at WebExpo 2014!
Those who missed the first official Czech Neo4j Meetup can view the recording of the event below (in Czech). Thanks again to all organisers, speakers, and participants.
We are pleased to announce the first official Czech Neo4j Meetup on 11th November 2013 at 6pm at the Czech Technical University in Prague. It is a free event: Anyone interested in learning about graph databases as well as those already using them are welcome to attend, listen to the talks, and join us for a beer afterwards. The talks will be in Czech.
We cordially invite everyone interested in NoSQL, graph databases, and Neo4j to the first official meetup in the Czech Republic, which takes place as part of the Informatics Evening at the Faculty of Information Technology of the Czech Technical University (ČVUT) on 11th November 2013 at 6pm. Admission is free.
In the last post of our “Neo4j Modelling for Beginners” series, we looked at bidirectional relationships. In this post, we compare the implications of qualifying relationships by using different relationship types versus using relationship properties.
Transitioning from the relational world to the beautiful world of graphs requires a shift in thinking about data. Although graphs are often much more intuitive than tables, there are certain mistakes people tend to make when modelling their data as a graph for the first time. In this article, we look at one common source of confusion: bidirectional relationships.
I have just finished a year-long MSc. program in Computing at Imperial College London. My thesis was called GraphAware: Towards Online Analytical Processing in Graph Databases, which you can freely download. It’s not an easy, cover-to-cover read, but there might be some interesting parts, even if you don’t go through all the (over 100) pages.
With the kind permission of the WebExpo conference organisers, I am making the recording of my talk about Neo4j publicly available. Enjoy!
This year, I attended the WebExpo conference for the first time and wrote down a few observations.