Neo4j is a free and very powerful graph analytics tool. Let’s learn to work with data sets that can be loaded into Neo4j and then analyzed. Neo4j can process and analyze extremely complex graph networks consisting of millions of nodes and relationships. Cypher is the query language used in Neo4j.

## Downloading, Installing, and Running Neo4j

Download Neo4j at: http://www.neo4j.com/

### Useful Links

- Ask questions at Stack Overflow
- Discuss Neo4j on Google Groups
- Visit a local Meetup Group
- Contribute code on GitHub

## Getting Started With Neo4j

### Create a Toy Network with five nodes

- N1 = Tom
- N2 = Harry
- N3 = Julian
- N4 = Michele
- N5 = Josephine

and five edges:

- e1 = Harry ‘is known by’ Tom
- e2 = Julian ‘is co-worker of’ Harry
- e3 = Michele ‘is wife of’ Harry
- e4 = Josephine ‘is wife of’ Tom
- e5 = Josephine ‘is friend of’ Michele

    create (N1:ToyNode {name: 'Tom'})-[:ToyRelation {relationship: 'knows'}]->(N2:ToyNode {name: 'Harry'}),
      (N2)-[:ToyRelation {relationship: 'co-worker'}]->(N3:ToyNode {name: 'Julian', job: 'plumber'}),
      (N2)-[:ToyRelation {relationship: 'wife'}]->(N4:ToyNode {name: 'Michele', job: 'accountant'}),
      (N1)-[:ToyRelation {relationship: 'wife'}]->(N5:ToyNode {name: 'Josephine', job: 'manager'}),
      (N4)-[:ToyRelation {relationship: 'friend'}]->(N5);

### View the resulting graph

    match (n:ToyNode)-[r]-(m) return n, r, m

### Delete all nodes and edges

    match (n)-[r]-() delete n, r

### Delete all nodes which have no edges

    match (n) where not (n)--() delete n

### Delete only ToyNode nodes which have no edges

    match (n:ToyNode) where not (n)--() delete n

### Delete all edges

    match (n)-[r]-() delete r

### Delete only ToyRelation edges

    match (n)-[r:ToyRelation]-() delete r

### Selecting an existing single ToyNode node

    match (n:ToyNode {name: 'Julian'}) return n

### Counting the number of nodes

    match (n:MyNode)
    return count(n)

### Counting the number of edges

    match (n:MyNode)-[r]->()
    return count(r)

### Finding leaf nodes

    match (n:MyNode)-[r:TO]->(m)
    where not ((m)-->())
    return m

### Finding root nodes

    match (m)-[r:TO]->(n:MyNode)
    where not (()-->(m))
    return m

### Finding triangles:

    match (a)-[:TO]->(b)-[:TO]->(c)-[:TO]->(a)
    return distinct a, b, c

### Finding 2nd neighbors of D

    match (a)-[:TO*..2]-(b)
    where a.Name = 'D'
    return distinct a, b

### Finding the types of a node

    match (n)
    where n.Name = 'Afghanistan'
    return labels(n)

### Finding the label of an edge:

    match (n {Name: 'Afghanistan'})<-[r]-()
    return distinct type(r)

### Finding all properties of a node:

    match (n:Actor)
    return * limit 20

### Finding loops:

    match (n)-[r]->(n)
    return n, r limit 10

### Finding multigraphs:

    match (n)-[r1]->(m), (n)-[r2]-(m)
    where r1 <> r2
    return n, r1, r2, m limit 10

### Finding the induced subgraph given a set of nodes:

    match (n)-[r:TO]-(m)
    where n.Name in ['A', 'B', 'C', 'D', 'E'] and m.Name in ['A', 'B', 'C', 'D', 'E']
    return n, r, m

## Importing Data Into Neo4j

### One way to “clean the slate” in Neo4j before importing (run both lines):

    match (a)-[r]->() delete a, r;
    match (a) delete a

### Script to Import Data Set: test.csv (simple road network)

For Windows, use something like the following:

[NOTE: replace any spaces in your path with %20 (“percent twenty”).]

    LOAD CSV WITH HEADERS FROM "file:///C:/test.csv" AS line
    MERGE (n:MyNode {Name: line.Source})
    MERGE (m:MyNode {Name: line.Target})
    MERGE (n)-[:TO {dist: line.distance}]->(m)
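The %20 substitution mentioned in the note can be produced programmatically rather than by hand. The following Python sketch builds a `file:///` URL from a Windows-style path using only the standard library. The helper name `to_file_url` and the path `C:/My Data/test.csv` are made up for illustration and are not part of the original script.

```python
from urllib.parse import quote


def to_file_url(path: str) -> str:
    """Turn a path such as 'C:/My Data/test.csv' into a file:/// URL."""
    # Normalise Windows backslashes to forward slashes first.
    normalised = path.replace("\\", "/")
    # Percent-encode everything except '/' and the drive-letter ':'.
    return "file:///" + quote(normalised, safe="/:")


# Hypothetical path containing a space.
print(to_file_url("C:/My Data/test.csv"))  # file:///C:/My%20Data/test.csv
```

The resulting string can then be pasted into the `LOAD CSV ... FROM` clause.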

## Adding to and Modifying a Graph

### Adding a Node Correctly

    match (n:ToyNode {name: 'Julian'})
    merge (n)-[:ToyRelation {relationship: 'fiancee'}]->(m:ToyNode {name: 'Joyce', job: 'store clerk'})

### Adding a Node Incorrectly

    create (n:ToyNode {name: 'Julian'})-[:ToyRelation {relationship: 'fiancee'}]->(m:ToyNode {name: 'Joyce', job: 'store clerk'})

### Correct your mistake by deleting the bad nodes and edge

    match (n:ToyNode {name: 'Joyce'})-[r]-(m) delete n, r, m

### Modify a Node’s Information

    match (n:ToyNode) where n.name = 'Harry' set n.job = 'drummer';
    match (n:ToyNode) where n.name = 'Harry' set n.job = n.job + ['lead guitarist']

## Path Analytics with Neo4j

### Viewing the graph

    match (n:MyNode)-[r]->(m)
    return n, r, m

### Finding paths between specific nodes:

    match p = (a)-[:TO*]-(c)
    where a.Name = 'H' and c.Name = 'P'
    return p limit 1

### Shortest path between nodes H and P

    match p = (a)-[:TO*]-(c)
    where a.Name = 'H' and c.Name = 'P'
    return p order by length(p) asc limit 1

### Finding the length between specific nodes

    match p = (a)-[:TO*]-(c)
    where a.Name = 'H' and c.Name = 'P'
    return length(p) limit 1

### Finding a shortest path between specific nodes:

    match p = shortestPath((a)-[:TO*]-(c))
    where a.Name = 'A' and c.Name = 'P'
    return p, length(p) limit 1

### All Shortest Paths

    match p = allShortestPaths((source)-[r:TO*]-(destination))
    where source.Name = 'A' and destination.Name = 'P'
    return [n in nodes(p) | n.Name] as Paths

### All Shortest Paths with Path Conditions

    match p = allShortestPaths((source)-[r:TO*]->(destination))
    where source.Name = 'A' and destination.Name = 'P' and size(nodes(p)) > 5
    return [n in nodes(p) | n.Name] as Paths, length(p)

### Diameter of the graph

    match (n:MyNode), (m:MyNode)
    where n <> m
    with n, m
    match p = shortestPath((n)-[*]->(m))
    return n.Name, m.Name, length(p)
    order by length(p) desc limit 1
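The diameter query above takes the longest of the shortest directed paths between all pairs of nodes, measured in hops. As a cross-check outside Neo4j, the same idea can be sketched in Python with a breadth-first search from every node. The function names and the four-node chain below are hypothetical and chosen only for illustration.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: fewest hops from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(graph):
    """Longest shortest path (in hops) over all connected ordered pairs."""
    return max(d for n in graph for d in hop_distances(graph, n).values())


# Hypothetical directed chain A -> B -> C -> D.
g = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(diameter(g))  # 3
```

Like the Cypher query, this sketch ignores pairs with no connecting path.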

### Extracting and computing with node and properties

    match p = (a)-[:TO*]-(c)
    where a.Name = 'H' and c.Name = 'P'
    return [n in nodes(p) | n.Name] as Nodes, length(p) as pathLength,
      reduce(s = 0, e in relationships(p) | s + toInteger(e.dist)) as pathDist limit 1

### Dijkstra’s algorithm for a specific target node

    // shortestPath picks the path with the fewest hops; reduce then sums dist along it
    MATCH (from:MyNode {Name: 'A'}), (to:MyNode {Name: 'P'}),
      path = shortestPath((from)-[:TO*]->(to))
    WITH reduce(dist = 0, rel in relationships(path) | dist + toInteger(rel.dist)) AS distance, path
    RETURN path, distance

### Dijkstra’s algorithm SSSP

    // shortestPath picks the path with the fewest hops; reduce then sums dist along it
    MATCH (from:MyNode {Name: 'A'}), (to:MyNode),
      path = shortestPath((from)-[:TO*]->(to))
    WITH reduce(dist = 0, rel in relationships(path) | dist + toInteger(rel.dist)) AS distance, path, from, to
    RETURN from, to, path, distance order by distance desc
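The two queries above call `shortestPath`, which selects the path with the fewest relationships, and only afterwards sum `dist` along that path. When a route with more hops has a smaller total distance, the result differs from what Dijkstra's algorithm would return. For comparison, here is a minimal Python sketch of the textbook algorithm using a priority queue. The three-node road network is hypothetical and not taken from test.csv.

```python
import heapq


def dijkstra(graph, source):
    """Return the least total distance from source to every reachable node.

    graph maps each node to a list of (neighbour, distance) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical directed road network: A -> P directly costs 10,
# but the two-hop route A -> B -> P costs only 3 + 4 = 7.
roads = {
    "A": [("P", 10), ("B", 3)],
    "B": [("P", 4)],
    "P": [],
}
print(dijkstra(roads, "A")["P"])  # 7
```

On this network a fewest-hops search would pick the direct edge with distance 10, while the weighted search finds the cheaper two-hop route.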

### Graph not containing a selected node

    match (n)-[r:TO]->(m)
    where n.Name <> 'D' and m.Name <> 'D'
    return n, r, m

### Shortest path over a Graph not containing a selected node

    match p = shortestPath((a {Name: 'A'})-[:TO*]-(b {Name: 'P'}))
    where not ('D' in [n in nodes(p) | n.Name])
    return p, length(p)

### Graph not containing the immediate neighborhood of a specified node

    match (d {Name: 'D'})-[:TO]-(b)
    with collect(distinct b.Name) as neighbors
    match (n)-[r:TO]->(m)
    where not (n.Name in (neighbors + 'D'))
      and not (m.Name in (neighbors + 'D'))
    return n, r, m;

    match (d {Name: 'D'})-[:TO]-(b)-[:TO]->(leaf)
    where not ((leaf)-->())
    return leaf;

    match (d {Name: 'D'})-[:TO]-(b)<-[:TO]-(root)
    where not ((root)<--())
    return root

### Graph not containing a selected neighborhood

    match (a {Name: 'F'})-[:TO*..2]-(b)
    with collect(distinct b.Name) as MyList
    match (n)-[r:TO]->(m)
    where not (n.Name in MyList) and not (m.Name in MyList)
    return distinct n, r, m

## Connectivity Analytics

### Viewing the graph

    match (n:MyNode)-[r]->(m)
    return n, r, m

### Find the outdegree of all nodes

    match (n:MyNode)-[r]->()
    return n.Name as Node, count(r) as Outdegree
    order by Outdegree
    union
    match (a:MyNode)-[r]->(leaf)
    where not ((leaf)-->())
    return leaf.Name as Node, 0 as Outdegree

### Find the indegree of all nodes

    match (n:MyNode)<-[r]-()
    return n.Name as Node, count(r) as Indegree
    order by Indegree
    union
    match (a:MyNode)<-[r]-(root)
    where not ((root)<--())
    return root.Name as Node, 0 as Indegree

### Find the degree of all nodes

    match (n:MyNode)-[r]-()
    return n.Name, count(distinct r) as degree
    order by degree

### Find degree histogram of the graph

    match (n:MyNode)-[r]-()
    with n as nodes, count(distinct r) as degree
    return degree, count(nodes) order by degree asc
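The histogram query first computes each node's degree and then counts how many nodes share each degree value. The same two-step aggregation can be sketched in Python with `collections.Counter`. The edge list below is hypothetical, and the sketch assumes an undirected view of the graph with no self-loops.

```python
from collections import Counter


def degree_histogram(edges):
    """Return sorted (degree, number_of_nodes) pairs for an edge list."""
    degree = Counter()
    for a, b in edges:
        # Each edge contributes one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    # Second aggregation: how many nodes have each degree value.
    return sorted(Counter(degree.values()).items())


# Hypothetical edges: degrees are A=2, B=2, C=3, D=1.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
print(degree_histogram(edges))  # [(1, 1), (2, 2), (3, 1)]
```

As in the Cypher version, nodes without any edges never appear, so there is no row for degree 0.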

### Save the degree of the node as a new node property

    match (n:MyNode)-[r]-()
    with n, count(distinct r) as degree
    set n.deg = degree
    return n.Name, n.deg

### Construct the Adjacency Matrix of the graph

    match (n:MyNode), (m:MyNode)
    return n.Name, m.Name,
    case
      when (n)-->(m) then 1
      else 0
    end as value

### Construct the Normalized Laplacian Matrix of the graph

    // uses the deg property saved by the degree query earlier in this section
    match (n:MyNode), (m:MyNode)
    return n.Name, m.Name,
    case
      when n.Name = m.Name then 1
      when (n)-->(m) then -1/(sqrt(toInteger(n.deg))*sqrt(toInteger(m.deg)))
      else 0
    end as value
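The query above fills in the entries of the normalized Laplacian: 1 on the diagonal, -1/(sqrt(deg n) * sqrt(deg m)) for adjacent nodes, and 0 elsewhere. The Python sketch below computes the same entries for a small graph so the values can be checked by hand. It assumes an undirected graph with no isolated nodes, whereas the Cypher pattern `(n)-->(m)` is directed. The function name and the three-node path graph are hypothetical.

```python
from math import sqrt


def normalized_laplacian(nodes, edges):
    """Return {(n, m): value} for an undirected graph without isolated nodes."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    deg = {n: len(neighbours[n]) for n in nodes}
    matrix = {}
    for n in nodes:
        for m in nodes:
            if n == m:
                matrix[(n, m)] = 1.0
            elif m in neighbours[n]:
                matrix[(n, m)] = -1.0 / (sqrt(deg[n]) * sqrt(deg[m]))
            else:
                matrix[(n, m)] = 0.0
    return matrix


# Hypothetical path graph A - B - C, with degrees 1, 2, 1.
L = normalized_laplacian(["A", "B", "C"], [("A", "B"), ("B", "C")])
print(L[("A", "B")])  # equals -1/sqrt(2)
```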

Hi Ranjan,

I am looking for a possible query to carry out betweenness centrality on a selected subgraph with Cypher.

I found an example for PageRank, still looking for betweenness…

    CALL apoc.algo.pageRankWithCypher(
      {
        iterations: 100,
        write: true,
        property: 'pagerank_number_1',
        node_cypher: 'none',
        rel_cypher: 'MATCH (n)-[r]->(m) where //some criteria on the edge // return id(n) as source, id(m) as target, 1 as weight'
      }
    );

Hello Iyke

Following articles may help:

Thanks Ranjan :).