How to Install and Configure Neo4j Graph Database on Ubuntu 22.04

Neo4j is a graph database designed to store and query relationships between data. Traditional relational databases store data in tables, whereas a graph database records the relationships between data nodes directly: each node stores references to the other nodes it is connected to. A relational database does not store relationships directly; it has to derive connections at query time by joining tables with the help of indexes, which becomes expensive and time-consuming as the data grows. A graph database like Neo4j avoids this overhead and can encode and query complex relationships efficiently.

Figure: How data is stored in a graph database

Neo4j is developed by Neo4j, Inc. (formerly Neo Technology). It is written in Java and Scala and is available in both a free Community edition and an Enterprise edition. Neo4j uses its own query language called Cypher, but queries can be written in other styles.

This tutorial will teach you how to install and configure Neo4j on an Ubuntu 22.04 server.

Prerequisites

  • A server running Ubuntu 22.04 with a minimum of 1 CPU core and 2 GB of memory. Upgrade the server as your workload requires.

  • A non-root user with sudo privileges.

  • Make sure everything is updated.

    $ sudo apt update && sudo apt upgrade
    
  • Install basic utility packages. Some of them may already be installed.

    $ sudo apt install wget curl nano software-properties-common dirmngr apt-transport-https gnupg gnupg2 ca-certificates lsb-release ubuntu-keyring unzip -y
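
Before installing, it can help to confirm that the server meets the minimums listed above. The following is a minimal sketch, assuming a Linux system that provides nproc and /proc/meminfo. The function name meets_minimum and the 1800000 kB threshold (a little under 2 GB, because a "2 GB" server usually reports slightly less usable memory) are illustrative choices, not official Neo4j requirements.

```shell
# Succeeds when the given core count and memory (in kB) meet the tutorial's minimums.
# The 1800000 kB threshold is an assumption, not a value published by Neo4j.
meets_minimum() {
    [ "$1" -ge 1 ] && [ "$2" -ge 1800000 ]
}

# Read the actual values; fall back to 0 if a tool or file is unavailable.
cores="$(nproc 2>/dev/null || echo 0)"
mem_kb="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null)"
mem_kb="${mem_kb:-0}"

if meets_minimum "$cores" "$mem_kb"; then
    echo "OK: $cores core(s), $mem_kb kB of memory"
else
    echo "Below the suggested minimum: $cores core(s), $mem_kb kB of memory"
fi
```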
    

Step 1 - Install Neo4j

The first step to installing Neo4j is adding the GPG key.

$ curl -fsSL https://debian.neo4j.com/neotechnology.gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/neo4j.gpg

Add the Neo4j repository to your system's APT sources directory.

$ echo "deb [signed-by=/usr/share/keyrings/neo4j.gpg] https://debian.neo4j.com stable latest" | sudo tee -a /etc/apt/sources.list.d/neo4j.list

To avoid the risk of an unplanned upgrade to the next major version, you can specify a version series in place of latest in the above command.

For example, the following command adds the Neo4j 5.x repository, which means you won't be upgraded to version 6.x when it is released. Run it instead of the previous command, not in addition to it, because both commands append to the same file.

$ echo "deb [signed-by=/usr/share/keyrings/neo4j.gpg] https://debian.neo4j.com stable 5" | sudo tee -a /etc/apt/sources.list.d/neo4j.list
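
The two repository commands above differ only in their last field. As a small sketch, a helper (the name neo4j_repo_line is hypothetical) can generate the source line for whichever release series you choose, so the choice is made in one place.

```shell
# Print the APT source line for a Neo4j release series.
# "latest" follows every release; "5" stays on the 5.x line.
neo4j_repo_line() {
    series="${1:-latest}"
    printf 'deb [signed-by=/usr/share/keyrings/neo4j.gpg] https://debian.neo4j.com stable %s\n' "$series"
}

neo4j_repo_line 5
# To write the line to the sources file, replacing any earlier entry:
#   neo4j_repo_line 5 | sudo tee /etc/apt/sources.list.d/neo4j.list
```

Using tee without -a in the commented command overwrites the file, which avoids ending up with both a latest and a 5 entry.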

Update the system repositories list.

$ sudo apt update

List the Neo4j versions available for installation.

$ apt list -a neo4j
Listing... Done
neo4j/stable 1:5.3.0 all
neo4j/stable 1:5.2.0 all
neo4j/stable 1:5.1.0 all

Install Neo4j Community edition.

$ sudo apt install neo4j

You can install a specific version using the following command.

$ sudo apt install neo4j=1:5.3.0

Note that the version includes an epoch version component (1:), in accordance with the Debian policy on versioning.
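
To make the epoch component concrete, here is a short sketch that splits a Debian version string into its epoch and the remaining version. The function name split_debian_version is an assumption for illustration. Under Debian policy, a version without an epoch is treated as epoch 0.

```shell
# Split a Debian version string into epoch and the rest of the version.
split_debian_version() (
    version="$1"
    case "$version" in
        *:*) epoch="${version%%:*}"; upstream="${version#*:}" ;;
        *)   epoch=0; upstream="$version" ;;   # no epoch given means epoch 0
    esac
    printf 'epoch=%s upstream=%s\n' "$epoch" "$upstream"
)

split_debian_version "1:5.3.0"   # prints: epoch=1 upstream=5.3.0
split_debian_version "5.3.0"     # prints: epoch=0 upstream=5.3.0
```

Because the epoch is compared first, a package at 1:5.3.0 sorts as newer than any version without an epoch, which is why the 1: prefix has to be included when pinning a version with apt.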

APT automatically installs the required JDK version along with Neo4j.

Enable the Neo4j service.

$ sudo systemctl enable neo4j

Start the Neo4j service.

$ sudo systemctl start neo4j

Check the status of the Neo4j service.

$ sudo systemctl status neo4j
● neo4j.service - Neo4j Graph Database
     Loaded: loaded (/lib/systemd/system/neo4j.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-01-21 20:50:52 UTC; 33s ago
   Main PID: 5241 (java)
      Tasks: 72 (limit: 1030)
     Memory: 399.3M
        CPU: 20.350s
     CGroup: /system.slice/neo4j.service

Step 2 - Test Connection

Neo4j uses the Cypher Shell for working with data.

Connect to the Cypher Shell.

$ cypher-shell

You will be prompted for a username and a password. The default username and password are both neo4j. After logging in for the first time, you will be asked to choose a new password.

username: neo4j
password:
Password change required
new password:
confirm password:
Connected to Neo4j using Bolt protocol version 5.0 at neo4j://localhost:7687 as user neo4j.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j@neo4j>

This confirms the successful connection to Neo4j DBMS.

Type :exit to exit the shell.

neo4j@neo4j> :exit

Bye!

Step 3 - Configure Neo4j for Remote Access

For production environments, you may need to configure Neo4j to accept connections from remote hosts. By default, Neo4j accepts connections from localhost only.

We can configure Neo4j to accept connections from remote hosts by editing its configuration file. Neo4j stores its settings in the /etc/neo4j/neo4j.conf file. Open it for editing.

$ sudo nano /etc/neo4j/neo4j.conf

Find the commented-out line #server.default_listen_address=0.0.0.0 and uncomment it by removing the leading hash.

. . .
#*****************************************************************
# Network connector configuration
#*****************************************************************

# With default configuration Neo4j only accepts local connections.
# To accept non-local connections, uncomment this line:
server.default_listen_address=0.0.0.0
. . .

Save the file by pressing Ctrl + X and entering Y when prompted. Then restart the Neo4j service with sudo systemctl restart neo4j so that the change takes effect.
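
The manual edit above can also be scripted. The following is a minimal sketch, assuming the setting appears in the file as a commented-out server.default_listen_address line. The function name enable_remote_listen is hypothetical, and the demonstration runs on a throwaway sample file rather than the real configuration.

```shell
# Uncomment server.default_listen_address in the given config file.
# A .bak copy of the original is kept next to it.
enable_remote_listen() (
    conf="$1"
    cp "$conf" "$conf.bak" &&
    sed 's/^#[[:space:]]*\(server\.default_listen_address=\)/\1/' "$conf.bak" > "$conf"
)

# Demonstration on a temporary sample file.
sample="$(mktemp)"
printf '%s\n' \
    '# To accept non-local connections, uncomment this line:' \
    '#server.default_listen_address=0.0.0.0' > "$sample"
enable_remote_listen "$sample"
grep '^server\.default_listen_address' "$sample"
rm -f "$sample" "$sample.bak"
```

Running the function against /etc/neo4j/neo4j.conf requires root privileges, and Neo4j has to be restarted afterwards for the new value to be read.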

The value 0.0.0.0 binds Neo4j to all available IPv4 interfaces on your system, including localhost. If you would like to limit Neo4j to a particular IP address, for example a private network IP, specify the IP address assigned to your server's private network interface instead.

You can also configure Neo4j to use IPv6 interfaces. As with IPv4, you can set the server.default_listen_address value to a specific IPv6 address that you will use to communicate with Neo4j. If you want to limit Neo4j to only use the local IPv6 address for your server, specify ::1, which corresponds to localhost using IPv6 notation.

If you configure Neo4j with an IPv6 address, you will not be able to connect with cypher-shell using the IPv6 address directly. Instead, you need to either configure a DNS name that resolves to the IPv6 address, or add an entry in the remote system’s /etc/hosts file that maps the address to a name. Then you will be able to use the DNS or hosts file name to connect to Neo4j using IPv6 from your remote system.

For example, a Neo4j server with an IPv6 address like 2001:db8::1 would require the remote connecting system to have an /etc/hosts entry as shown below.

2001:db8::1 your_hostname
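
Adding such an entry by hand is easy to repeat by accident. As a hedged sketch, the helper below (add_hosts_entry is a hypothetical name) appends the mapping only when the name is not already present. It is demonstrated on a temporary file; the real target would be /etc/hosts on the remote system, which requires root privileges.

```shell
# Append "ip name" to a hosts file unless the name is already mapped.
add_hosts_entry() (
    hosts_file="$1"; ip="$2"; name="$3"
    if awk -v n="$name" '
        /^[ \t]*#/ { next }                                   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }  # names start at field 2
        END { exit !found }' "$hosts_file"
    then
        echo "$name is already present in $hosts_file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
)

# Demonstration on a temporary file.
demo_hosts="$(mktemp)"
add_hosts_entry "$demo_hosts" "2001:db8::1" "your_hostname"
add_hosts_entry "$demo_hosts" "2001:db8::1" "your_hostname"   # second call adds nothing
cat "$demo_hosts"
rm -f "$demo_hosts"
```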

You can then connect to the server from the remote system using the name that you specified as shown below.

$ cypher-shell -a 'neo4j://your_hostname:7687'

If you restrict Neo4j to use the IPv6 localhost address of ::1, then you can connect to it locally on the Neo4j server itself using the preconfigured ip6-localhost name from your /etc/hosts file, as shown below.

$ cypher-shell -a 'neo4j://ip6-localhost:7687'

Once you invoke cypher-shell with the connection URI, you will be prompted for your username and password as usual.

Step 4 - Configure Firewall Access (UFW)

Once you have enabled remote connections, you can use the firewall to restrict Neo4j so that it accepts connections only from trusted systems.

Neo4j creates two network sockets: one on port 7474 for the built-in HTTP interface, and one on port 7687 for the main Bolt protocol.

Ubuntu 22.04 ships with Uncomplicated Firewall (UFW) as its default firewall tool.

Use the following command to allow a trusted remote host to access the Bolt interface over IPv4.

$ sudo ufw allow from 203.0.113.1 to any port 7687 proto tcp

Substitute the IP address of the trusted remote system in place of the 203.0.113.1 value. Similarly, you can allow an entire network range using the following command.

$ sudo ufw allow from 192.0.2.0/24 to any port 7687 proto tcp

Substitute the actual network in place of the 192.0.2.0/24 value.

To allow access from a remote host using IPv6, you can use the following command.

$ sudo ufw allow from 2001:DB8::1/128 to any port 7687 proto tcp

Substitute your trusted system's IPv6 address in place of the 2001:DB8::1/128 value.

As with IPv4, you can allow a range of IPv6 addresses using the following command.

$ sudo ufw allow from 2001:DB8::/32 to any port 7687 proto tcp

Again, substitute your trusted network range in place of the 2001:DB8::/32 network range.
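
All of the rules above follow one pattern, with only the source changing. As a sketch, a small helper (print_bolt_rules is a hypothetical name) can print the commands for a list of trusted sources so that you can review them before running anything. The addresses used are the same documentation examples as above.

```shell
# Print one "ufw allow" command per trusted source for the given port.
# Nothing is executed; the output is meant to be reviewed first.
print_bolt_rules() (
    port="$1"; shift
    for src in "$@"; do
        printf 'sudo ufw allow from %s to any port %s proto tcp\n' "$src" "$port"
    done
)

print_bolt_rules 7687 203.0.113.1 192.0.2.0/24 2001:DB8::/32
```

Once the printed commands look right, they can be pasted into the terminal or piped to sh.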

Reload the firewall to apply the changes.

$ sudo ufw reload

Check the status of the firewall.

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)
7687/tcp                   ALLOW       203.0.113.1
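
It is worth confirming that the Bolt port is not accidentally open to every host. The sketch below reads ufw status output and reports whether port 7687 is allowed from Anywhere. The function name bolt_open_to_world is hypothetical, and the check assumes the column layout shown in the sample output above.

```shell
# Reads "ufw status" output on stdin.
# Exits 0 if port 7687 is allowed from Anywhere, 1 otherwise.
bolt_open_to_world() {
    awk '$1 ~ /^7687(\/tcp)?$/ && /ALLOW/ && /Anywhere/ { found = 1 }
         END { exit !found }'
}

# On the server you would run: sudo ufw status | bolt_open_to_world
# Here the sample rules from the output above are used instead.
if bolt_open_to_world <<'EOF'
22/tcp                     ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)
7687/tcp                   ALLOW       203.0.113.1
EOF
then
    echo "WARNING: port 7687 is open to any host"
else
    echo "Port 7687 is limited to specific sources"
fi
```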

Step 5 - Use Neo4j

Connect to Neo4j using the cypher-shell tool. You will be prompted for your username and password.

$ cypher-shell

If you configured Neo4j for remote access, then use the following command to connect to Neo4j from the remote system.

$ cypher-shell -a 'neo4j://203.0.113.1:7687'

Here 203.0.113.1 is the IP address of the Neo4j server.

If you are using IPv6, ensure that you have an /etc/hosts entry with a name as described in Step 3. Then connect to the Neo4j server as follows.

$ cypher-shell -a 'neo4j://your_hostname:7687'

Ensure that your_hostname maps to your Neo4j server's IPv6 address in the remote system's /etc/hosts file.

Let us add nodes with the label Slite that store the names of employees. The following command creates a node labeled Slite with a name property set to Navjot Singh.

neo4j@neo4j> CREATE (:Slite {name: 'Navjot Singh'});

You will get the following output.

0 rows
ready to start consuming query after 124 ms, results consumed after another 0 ms
Added 1 nodes, Set 1 properties, Added 1 labels
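
If you need to create many such nodes, the statement can be generated from the shell. This is only a sketch: the helper names cypher_quote and create_employee_stmt are hypothetical, and the name O'Neil is made-up input to show the escaping. Building queries from strings requires escaping quotes and backslashes, and query parameters are generally the safer choice for untrusted input.

```shell
# Escape backslashes first, then single quotes, for a single-quoted Cypher string.
cypher_quote() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e "s/'/\\\\'/g"
}

# Print a CREATE statement for one Slite node with the given name.
create_employee_stmt() {
    printf "CREATE (:Slite {name: '%s'});\n" "$(cypher_quote "$1")"
}

create_employee_stmt "Navjot Singh"   # prints: CREATE (:Slite {name: 'Navjot Singh'});
create_employee_stmt "O'Neil"         # prints: CREATE (:Slite {name: 'O\'Neil'});
```

The generated statement can then be pasted at the neo4j@neo4j> prompt.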

Next, we will add three more employees and link them using a relationship called COLLEAGUE. You can link nodes with arbitrarily named relationships.

neo4j@neo4j> CREATE
             (:Slite {name: 'Sammy'})-[:COLLEAGUE]->
             (:Slite {name: 'Peter Jack'})-[:COLLEAGUE]->
             (:Slite {name: 'Chris Rock'});

You will get a similar output.

0 rows
ready to start consuming query after 72 ms, results consumed after another 0 ms
Added 3 nodes, Created 2 relationships, Set 3 properties, Added 3 labels
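
The chained pattern in this statement generalizes to any number of nodes. As an illustrative sketch, the helper below (colleague_chain is a hypothetical name) builds the same kind of statement from a list of names. It assumes the names contain no quotes or backslashes.

```shell
# Build one CREATE statement that links the given names in order
# with COLLEAGUE relationships.
colleague_chain() (
    stmt="CREATE "
    sep=""
    for name in "$@"; do
        stmt="${stmt}${sep}(:Slite {name: '${name}'})"
        sep="-[:COLLEAGUE]->"
    done
    printf '%s;\n' "$stmt"
)

colleague_chain "Sammy" "Peter Jack" "Chris Rock"
```

For the three names above, the output is the same statement as the one entered at the prompt, written on a single line.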

Now, let us create relationships that carry their own properties.

Since Peter and Chris work in the same department, we will create a DEPARTMENT relationship between their nodes and store the department's name in a name property on the relationship.

neo4j@neo4j> MATCH (a:Slite),(b:Slite)
             WHERE a.name = 'Peter Jack' AND b.name = 'Chris Rock'
             CREATE (a)-[r:DEPARTMENT { name: 'Designers' }]->(b)
             RETURN type(r), r.name;
+----------------------------+
| type(r)      | r.name      |
+----------------------------+
| "DEPARTMENT" | "Designers" |
+----------------------------+

1 row
ready to start consuming query after 60 ms, results consumed after another 17 ms
Created 1 relationships, Set 1 properties

Now, let us create another connection between Sammy and Peter since they are working on the same project.

neo4j@neo4j> MATCH (a:Slite), (b:Slite)
             WHERE a.name = 'Peter Jack' AND b.name = 'Sammy'
             CREATE (a)-[r:PROJECT { name: 'Test Project 1' }]->(b)
             RETURN type(r), r.name;
+------------------------------+
| type(r)   | r.name           |
+------------------------------+
| "PROJECT" | "Test Project 1" |
+------------------------------+

1 row
ready to start consuming query after 132 ms, results consumed after another 12 ms
Created 1 relationships, Set 1 properties

Let us display all this data using the following query.

neo4j@neo4j> MATCH (a)-[r]->(b)
             RETURN a.name,r,b.name
             ORDER BY r;
+-------------------------------------------------------------------+
| a.name       | r                                   | b.name       |
+-------------------------------------------------------------------+
| "Sammy"      | [:COLLEAGUE]                        | "Peter Jack" |
| "Peter Jack" | [:COLLEAGUE]                        | "Chris Rock" |
| "Peter Jack" | [:DEPARTMENT {name: "Designers"}]   | "Chris Rock" |
| "Peter Jack" | [:PROJECT {name: "Test Project 1"}] | "Sammy"      |
+-------------------------------------------------------------------+

4 rows
ready to start consuming query after 99 ms, results consumed after another 5 ms

Conclusion

This concludes our tutorial on installing and configuring Neo4j on an Ubuntu 22.04 server. If you have any questions, post them in the comments below.
