GraphAware Blog - Cypher

Find out what's new in the Neo4j world

Registering a custom analyzer for phonetic search in Neo4j 4

11 Mar 2021 by Luanne Misquitta Neo4j Cypher Search

Phonetic matching attempts to match words by pronunciation instead of spelling. Words are often misspelled, and exact matching then fails to find them. Algorithms such as Soundex and Metaphone were developed to address this problem, and they are used in voice assistants, search, record linkage, fraud detection, and matching misspelled names of things (for example, in medical records). Custom analyzers: In 2019, we blogged about creating a Czech analyzer to address accents in the language. With Neo4j 4, a few things have changed. This short blog post was inspired by a StackOverflow question on phonetic searches and resulted in me...
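To make the idea of phonetic matching concrete, here is a minimal Python sketch of the classic American Soundex encoding. It only illustrates the general technique and is not the analyzer discussed in the post; the function name `soundex` and the lookup table `CODES` are our own, chosen for illustration.

```python
# Illustrative sketch of American Soundex; names here are hypothetical,
# not part of Neo4j or Lucene.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of a word ("" if it has no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels (and anything unmapped) have no code
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept after it
    # Pad with zeros and truncate to one letter plus three digits.
    return (first + "".join(encoded) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Because "Robert" and "Rupert" share a code, a search indexed on the code rather than the spelling would treat them as a match.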

Neo4j 4: Multi tenancy

06 Feb 2020 by Luanne Misquitta Neo4j Cypher

Up until version 4.0, Neo4j supported only one active database per server instance. As such, achieving multi tenancy meant that either a Neo4j instance had to be deployed per tenant, or all tenant graphs co-existed in the same database. The first option meant a lot of extra infrastructure and maintenance, and the second implied some custom partitioning strategy, usually achieved by differentiating tenants by labels or properties, a mechanism fraught with risk and rarely preferred. Neo4j 4 allows you to use more than one active database at the same time, where each database defines a transaction domain and execution context,...

Neo4j 4: Post-Union Processing Explained

17 Jan 2020 by Luanne Misquitta Neo4j Cypher

Many, many years ago, I requested the UNION clause in Cypher, and Andres Taylor graciously added it. This was followed by the request for Post-Union Processing by Aseem Kishore, which went on to collect a whopping 99 comments over the course of time. It is exciting to see support for a subset of subqueries in openCypher, i.e. uncorrelated subqueries, in the soon-to-be-released Neo4j 4, finally bringing post-union processing to Cypher. Given its history, a short article is in order. Union in 3.x: In pre-4.x versions of Neo4j, UNION served to combine the results of two or more queries into one...

Handling synonyms in Neo4j's Full Text Search

20 Dec 2019 by Christophe Willemsen Neo4j Cypher Search

So you have followed the Deep Dive into Neo4j’s Full Text Search tutorial, even learned how to create custom analyzers, and finally watched the Full Text Search tips and tricks talk at the Nodes19 online conference? Still, searching for boat does not yield results containing yacht or ship, and you’re wondering how to make your search engine a bit more relevant for your users? Don’t go any further; you’ll learn how to do it now! Synonyms: A synonym is a word or phrase that means exactly or nearly the same as another word or phrase. Why synonyms? It’s all about recall! In other words, to...

Custom analyzer for fulltext search in Neo4j

06 Sep 2019 by František Hartman Neo4j Cypher Search

We have already blogged about the fulltext search available in Neo4j 3.5. The list of available analyzers covers many languages and fits various use cases. However, once you expose the search to real users, they will start pointing out edge cases and complaining about the search not being Google-like. Speakers of languages that use accents in their written form quite often leave the accents out. There are various reasons for this. The most common are historical, dating from when different character encodings caused problems, and users find it hard to change their habits. Another is using a different default keyboard layout (e.g. en_US); switching the layout just for...

Cypher: Using Index Hints

19 Aug 2019 by Luanne Misquitta Neo4j Cypher Intermediate

The Cypher query planner is quite advanced and mature, and you can mostly rely on it to pick the best plan for your query. However, there are rare cases, or bugs, that might leave you looking for ways to influence that plan. This article demonstrates practical usage of an index hint. Note that all queries were tested against Neo4j Enterprise 3.5.8. The graph model: This is the portion of the graph model that is sufficient to demonstrate the issue. Simple enough: we have many tweets, and tweets have keywords. Our graph has two indexes, one on the value of the Keyword, and the...

Avoid cycles in Cypher queries

26 Apr 2019 by Jan Zak Neo4j Beginner Cypher

There is one common performance issue our clients run into when trying their first Cypher queries on a dataset in Neo4j. When writing a query, be sure that it doesn’t match any cycles, or you can experience unpleasant surprises. Assume the following sample graph and simple query:
CREATE (a:Node {name: "A"}), (b:Node {name: "B"}), (c:Node {name: "C"}), (a)-[:TO {name: "1"}]->(b), (a)-[:TO {name: "2"}]->(b), (a)-[:TO {name: "3"}]->(b), (b)-[:TO {name: "4"}]->(c)
MATCH p=({name: "A"})-[*..10]-({name: "C"}) RETURN p
The query returns 9 paths, instead of 3 as you might have guessed! The additional 6 paths have length 4 with node pattern A-B-A-B-C; note the repeated nodes A...
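The count of 9 can be checked outside the database with a small Python sketch. It assumes the matching rule that a relationship may appear at most once in a path while nodes may repeat, and that the undirected pattern lets relationships be traversed in either direction. The names `Rel`, `RELS` and `find_paths` are hypothetical helpers for this illustration, not Neo4j APIs.

```python
from collections import namedtuple

# The sample graph from the excerpt: three parallel A->B relationships
# and one B->C relationship.
Rel = namedtuple("Rel", "name start end")
RELS = [
    Rel("1", "A", "B"),
    Rel("2", "A", "B"),
    Rel("3", "A", "B"),
    Rel("4", "B", "C"),
]


def find_paths(source, target, max_hops=10):
    """Enumerate paths from source to target, ignoring direction,
    never reusing a relationship but allowing nodes to repeat."""
    results = []

    def walk(node, used):
        if node == target and used:
            results.append([rel.name for rel in used])
        if len(used) == max_hops:
            return
        for rel in RELS:
            if rel in used or node not in (rel.start, rel.end):
                continue
            # Step to the other end of the relationship.
            next_node = rel.end if node == rel.start else rel.start
            walk(next_node, used + [rel])

    walk(source, [])
    return results


paths = find_paths("A", "C")
print(len(paths))                          # 9
print(sum(1 for p in paths if len(p) == 2))  # 3 direct A-B-C paths
print(sum(1 for p in paths if len(p) == 4))  # 6 paths that go A-B-A-B-C
```

The six longer paths come from choosing an ordering of the three parallel relationships (3 × 2 × 1) before taking the final hop to C.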

Deep Dive into Neo4j 3.5 Full Text Search

11 Jan 2019 by Christophe Willemsen Neo4j Cypher Search

In this blog post we will go over the Full Text Search capabilities available in the latest major release of Neo4j. Contrary to our usual posts, the content will focus on the underlying search engine used by Neo4j, that is, Apache Lucene in version 5.5.5. What exactly is Search? Search is an interaction between a user and a search engine. The user has an information need at hand and attempts to satisfy it by providing a search with adequate constraints. The search engine uses those constraints to collect matching results and return them to the user. What is a Search Engine? A search...

Reverse Engineering Book Stories with Neo4j and GraphAware NLP

24 Jul 2017 by Christophe Willemsen Neo4j NLP Cypher

A book tells us a story, but for a computer it is a wall of text. How can we use graphs and NLP to help our machines make more sense of a story? Our example comes from the A Song of Ice and Fire books, aka Game of Thrones. We converted the e-books (epub) to text files and used a small Python program to split them into chapters, paragraphs, and sentences. So a book turned into this model: GraphAware NLP: GraphAware NLP Framework is a project that integrates NLP processing capabilities available in several software packages like Stanford NLP and OpenNLP, existing data sources,...

Graph-Aided Search - The Rise of Personalised Content

20 Apr 2016 by Alessandro Negro, Christophe Willemsen Neo4j Cypher Recommendations Elasticsearch

In our previous blog post we introduced the concept of Graph Aided Search. It refers to a personalised user experience during search where the results are customised for each user based on information gathered about them (likes, friends, clicks, buying history, etc.). This information is stored in a graph database and processed using machine learning and/or graph analysis algorithms. A simple example is the LinkedIn search functionality. If we typed “Michal” in the text input, it would obviously return people whose name matches and order them by full text relevancy with some fuzziness: Lucene-based search engines such as Elasticsearch and Solr offer impressive performance...

Processing Large Sets of Nodes with Cypher

10 Dec 2015 by Christophe Willemsen Neo4j Cypher

Iterating over large numbers of nodes using Cypher is quite a common use case in Neo4j. Typically, the reason for doing this is that we want to perform some kind of operation for each one of these nodes. In this blog post, we will use one million TestNodes and try to iterate over them in order to index their contents into a freshly created Elasticsearch index. There are three approaches we can take, two of which are quite common, but the most performant technique is largely unknown. First Technique: SKIP and LIMIT. Using SKIP and LIMIT is the first approach that comes to mind,...

Faster Recommendations with Neo4j 2.3 Triadic Selection

20 Oct 2015 by Alessandro Negro, Christophe Willemsen Neo4j Cypher Recommendations

Recently, Neo Technology announced the 2.3.0-RC1 release of their Neo4j graph database. One of the key new features is Triadic Selection built into Cypher’s Cost Based Planner. In this blog post, we will explore Triadic Selection in detail and demonstrate how significantly it can speed up recommendations computed in Neo4j. What is Triadic Selection? A Bit of Theory: Triadic Closure. Networks or graphs can rarely be considered static structures. On the contrary, often they seem to be ever-evolving objects. Any social network, for example, is often the most dynamic of graphs: at any moment, new relationships are created between existing nodes, other relationships vanish, new nodes...

Cypher: Variable Length Relationships by Example

19 May 2015 by Christophe Willemsen, Michal Bachman Neo4j Cypher

In this blog post, we’ll demonstrate how to use variable length relationships (sometimes called “variable length paths”) in Cypher using examples. We will also see when zero length relationships can be useful. Introduction: Let’s start with the basics. For the sake of the blog post, our use case will be users that know other users. Users write blog posts modeled as linked lists: You can generate an example graph with the following link to a predefined Graphgen graph, or use this Neo4j Console if you want to execute the queries whilst reading the blog post. Basic Relationships Matching: Let’s start with a basic query that will find a...

MATCHing Paths with Very Dense Nodes in Neo4j 2.2

19 Mar 2015 by Christophe Willemsen Neo4j Cypher Intermediate

Last weekend, I came across a tweet announcing that Wikimedia had released the dataset of page clickstreams for February 2015. I found it interesting to download this dataset and see how people arrive at Neo4j’s Wikipedia page. The data is quite simple; we have page entities that relate to other pages. A page can either be a Wikipedia page, or a non-Wikipedia page such as Google. Relationships can represent a user click from a Wikipedia page to another page, or a user searching on Google or Wikipedia. The number of times an event occurs is also provided in the dataset. Importing the Dataset: You...

Modelling Data in Neo4j: Labels vs. Indexed Properties

16 Jan 2015 by Christophe Willemsen Neo4j Modelling Cypher Intermediate

A common question when planning and designing your Neo4j Graph Database is how to handle “flagged” entities. This could include users that are active, blog posts that are published, news articles that have been read, etc. Introduction: In the SQL world, you would typically create a boolean|tinyint column; in Neo4j, the same can be achieved in the following two ways: a flagged indexed property, or a dedicated label. Having faced this design dilemma a number of times, we would like to share our experience with the two presented possibilities and some Cypher query optimizations that will help you take full advantage of the graph...

Cypher MERGE Explained

31 Jul 2014 by Luanne Misquitta Neo4j Beginner Cypher

With MERGE set to replace CREATE UNIQUE at some point, the behavior of MERGE can sometimes be tricky to understand. MERGE: Here’s a summary of what MERGE does. It ensures that a pattern exists in the graph by creating it if it does not exist already. It will not use partially existing patterns; it will attempt to match the entire pattern and create the entire pattern if it is missing. When unique constraints are defined, MERGE expects to find at most one node that matches the pattern. It also allows you to define what should happen based on whether data was created or matched. The key...