Mark Needham

Thoughts on Software Development

Kubernetes: Spinning up a Neo4j 3.1 Causal Cluster

with one comment

A couple of weeks ago I wrote a blog post explaining how I’d created a Neo4j causal cluster using docker containers directly and for my next pet project I wanted to use Kubernetes as an orchestration layer so that I could declaratively change the number of servers in my cluster.

I’d never used Kubernetes before but I saw a presentation showing how to use it to create an Elastic cluster at the GDG Cloud meetup a couple of months ago.

In that presentation I was introduced to the idea of a PetSet which is an abstraction exposed by Kubernetes which allows us to manage a set of pods (containers) which have a fixed identity. The documentation explains it better:

A PetSet ensures that a specified number of “pets” with unique identities are running at any given time. The identity of a Pet is comprised of:

  • a stable hostname, available in DNS
  • an ordinal index
  • stable storage: linked to the ordinal & hostname

In my case I need to have a stable hostname because each member of a Neo4j cluster is given a list of other cluster members with which it can create a new cluster or join an already existing one. This is the first use case described in the documentation:

PetSet also helps with the 2 most common problems encountered managing such clustered applications:

  • discovery of peers for quorum
  • startup/teardown ordering

So the first thing we need to do is create some stable storage for our pods to use.

We’ll create a cluster of 3 members so we need to create one PersistentVolume for each of them. The following script does the job:

volumes.sh

for i in $(seq 0 2); do
  cat <<EOF | kubectl create -f -
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv${i}
  labels:
    type: local
    app: neo4j
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/${i}"
EOF
 
  cat <<EOF | kubectl create -f -
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datadir-neo4j-${i}
  labels:
    app: neo4j
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
done;

If we run this script it’ll create 3 volumes which we can see by running the following command:

$ kubectl get pv
NAME      CAPACITY   ACCESSMODES   STATUS    CLAIM                     REASON    AGE
pv0       1Gi        RWO           Bound     default/datadir-neo4j-0             7s
pv1       1Gi        RWO           Bound     default/datadir-neo4j-1             7s
pv2       1Gi        RWO           Bound     default/datadir-neo4j-2             7s
$ kubectl get pvc
NAME              STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
datadir-neo4j-0   Bound     pv0       1Gi        RWO           26s
datadir-neo4j-1   Bound     pv1       1Gi        RWO           26s
datadir-neo4j-2   Bound     pv2       1Gi        RWO           25s

Next we need to create a PetSet template. After a lot of iterations I ended up with the following:

# Headless service to provide DNS lookup
apiVersion: v1
kind: Service
metadata:
  labels:
    app: neo4j
  name: neo4j
spec:
  clusterIP: None
  ports:
    - port: 7474
  selector:
    app: neo4j
----
# new API name
apiVersion: "apps/v1alpha1"
kind: PetSet
metadata:
  name: neo4j
spec:
  serviceName: neo4j
  replicas: 3
  template:
    metadata:
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
        pod.beta.kubernetes.io/init-containers: '[
            {
                "name": "install",
                "image": "gcr.io/google_containers/busybox:1.24",
                "command": ["/bin/sh", "-c", "echo \"
                unsupported.dbms.edition=enterprise\n
                dbms.mode=CORE\n
                dbms.connectors.default_advertised_address=$HOSTNAME.neo4j.default.svc.cluster.local\n
                dbms.connectors.default_listen_address=0.0.0.0\n
                dbms.connector.bolt.type=BOLT\n
                dbms.connector.bolt.enabled=true\n
                dbms.connector.bolt.listen_address=0.0.0.0:7687\n
                dbms.connector.http.type=HTTP\n
                dbms.connector.http.enabled=true\n
                dbms.connector.http.listen_address=0.0.0.0:7474\n
                causal_clustering.raft_messages_log_enable=true\n
                causal_clustering.initial_discovery_members=neo4j-0.neo4j.default.svc.cluster.local:5000,neo4j-1.neo4j.default.svc.cluster.local:5000,neo4j-2.neo4j.default.svc.cluster.local:5000\n
                causal_clustering.leader_election_timeout=2s\n
                  \" > /work-dir/neo4j.conf" ],
                "volumeMounts": [
                    {
                        "name": "confdir",
                        "mountPath": "/work-dir"
                    }
                ]
            }
        ]'
      labels:
        app: neo4j
    spec:
      containers:
      - name: neo4j
        image: "neo4j/neo4j-experimental:3.1.0-M13-beta3-enterprise"
        imagePullPolicy: Always
        ports:
        - containerPort: 5000
          name: discovery
        - containerPort: 6000
          name: tx
        - containerPort: 7000
          name: raft
        - containerPort: 7474
          name: browser
        - containerPort: 7687
          name: bolt
        securityContext:
          privileged: true
        volumeMounts:
        - name: datadir
          mountPath: /data
        - name: confdir
          mountPath: /conf
      volumes:
      - name: confdir
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

The main thing I had trouble with was getting the members of the cluster to talk to each other. The default docker config uses hostnames but I found that pods were unable to contact each other unless I specified the FQDN in the config file. We can run the following command to create the PetSet:

$ kubectl create -f neo4j.yaml 
service "neo4j" created
petset "neo4j" created

We can check if the pods are up and running by executing the following command:

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
neo4j-0   1/1       Running   0          2m
neo4j-1   1/1       Running   0          14s
neo4j-2   1/1       Running   0          10s

And we can tail neo4j’s log files like this:

$ kubectl logs neo4j-0
Starting Neo4j.
2016-11-25 16:39:50.333+0000 INFO  Starting...
2016-11-25 16:39:51.723+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2016-11-25 16:39:51.733+0000 INFO  Initiating metrics...
2016-11-25 16:39:51.911+0000 INFO  Waiting for other members to join cluster before continuing...
2016-11-25 16:40:12.074+0000 INFO  Started.
2016-11-25 16:40:12.428+0000 INFO  Mounted REST API at: /db/manage
2016-11-25 16:40:13.350+0000 INFO  Remote interface available at http://neo4j-0.neo4j.default.svc.cluster.local:7474/
$ kubectl logs neo4j-1
Starting Neo4j.
2016-11-25 16:39:53.846+0000 INFO  Starting...
2016-11-25 16:39:56.212+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2016-11-25 16:39:56.225+0000 INFO  Initiating metrics...
2016-11-25 16:39:56.341+0000 INFO  Waiting for other members to join cluster before continuing...
2016-11-25 16:40:16.623+0000 INFO  Started.
2016-11-25 16:40:16.951+0000 INFO  Mounted REST API at: /db/manage
2016-11-25 16:40:17.607+0000 INFO  Remote interface available at http://neo4j-1.neo4j.default.svc.cluster.local:7474/
$ kubectl logs neo4j-2
Starting Neo4j.
2016-11-25 16:39:57.828+0000 INFO  Starting...
2016-11-25 16:39:59.166+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2016-11-25 16:39:59.176+0000 INFO  Initiating metrics...
2016-11-25 16:39:59.329+0000 INFO  Waiting for other members to join cluster before continuing...
2016-11-25 16:40:19.216+0000 INFO  Started.
2016-11-25 16:40:19.675+0000 INFO  Mounted REST API at: /db/manage
2016-11-25 16:40:21.029+0000 INFO  Remote interface available at http://neo4j-2.neo4j.default.svc.cluster.local:7474/

I wanted to log into the servers from my host machine’s browser so I setup port forwarding for each of the servers:

$ kubectl port-forward neo4j-0 7474:7474 7687:7687

We can then get an overview of the cluster by running the following procedure:

CALL dbms.cluster.overview()
 
╒════════════════════════════════════╤═════════════════════════════════════════════════════╤════════╕
│id                                  │addresses                                            │role    │
╞════════════════════════════════════╪═════════════════════════════════════════════════════╪════════╡
│81d8e5e2-02db-4414-85de-a7025b346e84│[bolt://neo4j-0.neo4j.default.svc.cluster.local:7687,│LEADER  │
│                                    │ http://neo4j-0.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│347b7517-7ca0-4b92-b9f0-9249d46b2ad3│[bolt://neo4j-1.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-1.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│a5ec1335-91ce-4358-910b-8af9086c2969│[bolt://neo4j-2.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-2.neo4j.default.svc.cluster.local:7474]│        │
└────────────────────────────────────┴─────────────────────────────────────────────────────┴────────┘

So far so good. What if we want to have 5 servers in the cluster instead of 3? We can run the following command to increase our replica size:

$ kubectl patch petset neo4j -p '{"spec":{"replicas":5}}'
"neo4j" patched

Let’s run that procedure again:

CALL dbms.cluster.overview()
 
╒════════════════════════════════════╤═════════════════════════════════════════════════════╤════════╕
│id                                  │addresses                                            │role    │
╞════════════════════════════════════╪═════════════════════════════════════════════════════╪════════╡
│81d8e5e2-02db-4414-85de-a7025b346e84│[bolt://neo4j-0.neo4j.default.svc.cluster.local:7687,│LEADER  │
│                                    │ http://neo4j-0.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│347b7517-7ca0-4b92-b9f0-9249d46b2ad3│[bolt://neo4j-1.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-1.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│a5ec1335-91ce-4358-910b-8af9086c2969│[bolt://neo4j-2.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-2.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│28613d06-d4c5-461c-b277-ddb3f05e5647│[bolt://neo4j-3.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-3.neo4j.default.svc.cluster.local:7474]│        │
├────────────────────────────────────┼─────────────────────────────────────────────────────┼────────┤
│2eaa0058-a4f3-4f07-9f22-d310562ad1ec│[bolt://neo4j-4.neo4j.default.svc.cluster.local:7687,│FOLLOWER│
│                                    │ http://neo4j-4.neo4j.default.svc.cluster.local:7474]│        │
└────────────────────────────────────┴─────────────────────────────────────────────────────┴────────┘

Neat! And it’s as easy to go back down to 3 again:

$ kubectl patch petset neo4j -p '{"spec":{"replicas":3}}'
"neo4j" patched
CALL dbms.cluster.overview()
 
╒════════════════════════════════════╤══════════════════════════════════════════════════════╤════════╕
│id                                  │addresses                                             │role    │
╞════════════════════════════════════╪══════════════════════════════════════════════════════╪════════╡
│81d8e5e2-02db-4414-85de-a7025b346e84│[bolt://neo4j-0.neo4j.default.svc.cluster.local:7687, │LEADER  │
│                                    │http://neo4j-0.neo4j.default.svc.cluster.local:7474]  │        │
├────────────────────────────────────┼──────────────────────────────────────────────────────┼────────┤
│347b7517-7ca0-4b92-b9f0-9249d46b2ad3│[bolt://neo4j-1.neo4j.default.svc.cluster.local:7687, │FOLLOWER│
│                                    │http://neo4j-1.neo4j.default.svc.cluster.local:7474]  │        │
├────────────────────────────────────┼──────────────────────────────────────────────────────┼────────┤
│a5ec1335-91ce-4358-910b-8af9086c2969│[bolt://neo4j-2.neo4j.default.svc.cluster.local:7687, │FOLLOWER│
│                                    │http://neo4j-2.neo4j.default.svc.cluster.local:7474]  │        │
└────────────────────────────────────┴──────────────────────────────────────────────────────┴────────┘

Next I need to look at how we can add read replicas into the cluster. These don’t take part in the membership/quorum algorithm so I think I’ll be able to use the more common ReplicationController/Pod architecture for those.

If you want to play around with this the code is available as a gist. I’m using the minikube library for all my experiments but I’ll hopefully get around to trying this on GCE or AWS soon.

Be Sociable, Share!

Written by Mark Needham

November 25th, 2016 at 4:55 pm

Posted in neo4j

Tagged with ,