Mark Needham

Thoughts on Software Development

Archive for the ‘Kubernetes’ Category

Kubernetes: Simple example of pod running


I recently needed to create a Kubernetes pod that would ‘just sit there’ while I used kubectl cp to copy some files to a persistent volume to which it was bound.

I started out with this naive pod spec:

pod_no_while.yaml

kind: Pod
apiVersion: v1
metadata:
  name: marks-dummy-pod
spec:
  containers:
    - name: marks-dummy-pod
      image: ubuntu
  restartPolicy: Never

Let’s apply that template:

$ kubectl apply -f pod_no_while.yaml 
pod "marks-dummy-pod" created

And let’s check if we have any running pods:

$ kubectl get pods
No resources found, use --show-all to see completed objects.

We won’t see anything here because unsurprisingly the pod has run to completion as there’s nothing to keep it running! We can confirm that by running this command:

$ kubectl get pods --show-all
NAME              READY     STATUS      RESTARTS   AGE
marks-dummy-pod   0/1       Completed   0          1m
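
As an aside: a shell loop isn’t the only way to keep a container idle. The ubuntu image ships GNU coreutils, whose sleep accepts infinity, so a spec along these lines should also stay up (a sketch, assuming the image’s sleep supports that argument):

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: marks-dummy-pod
spec:
  containers:
    - name: marks-dummy-pod
      image: ubuntu
      # GNU coreutils' sleep understands 'infinity', so no shell loop is needed
      command: ["sleep", "infinity"]
  restartPolicy: Never
```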

Now let’s create a pod that has an infinite bash while loop:

pod.yaml

kind: Pod
apiVersion: v1
metadata:
  name: marks-dummy-pod
spec:
  containers:
    - name: marks-dummy-pod
      image: ubuntu
      command: ["/bin/bash", "-ec", "while :; do echo '.'; sleep 5 ; done"]
  restartPolicy: Never

Let’s apply that one instead:

$ kubectl apply -f pod.yaml 
The Pod "marks-dummy-pod" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)

Oops, we need to delete it first so let’s do that:

$ kubectl delete pod marks-dummy-pod
pod "marks-dummy-pod" deleted

Attempt #2:

$ kubectl apply -f pod.yaml 
pod "marks-dummy-pod" created

And let’s check up on our pod:

$ kubectl get pods
NAME              READY     STATUS    RESTARTS   AGE
marks-dummy-pod   1/1       Running   0          14s

Looks better already. Let’s check the logs:

$ kubectl logs marks-dummy-pod 
.
.

Great! We can now kubectl cp to our heart’s content and then delete the pod afterwards.
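
For completeness, that copy-then-tidy-up workflow could be wrapped in a small helper. The local directory and mount path below are made up — substitute whatever your persistent volume is actually mounted at:

```shell
# Sketch: copy files into the dummy pod, check they arrived, then delete the pod.
# ./local-dir and /data are hypothetical paths - adjust to your volume mount.
copy_then_cleanup() {
  kubectl cp ./local-dir "$1":/data &&
    kubectl exec "$1" -- ls -alh /data &&
    kubectl delete pod "$1"
}

# copy_then_cleanup marks-dummy-pod
```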

Written by Mark Needham

October 21st, 2017 at 10:06 am

Posted in Kubernetes


Kubernetes: Which node is a pod on?


When running Kubernetes on a cloud provider, rather than locally using minikube, it’s useful to know which node a pod is running on.

The normal command to list pods doesn’t contain this information:

$ kubectl get pod
NAME           READY     STATUS    RESTARTS   AGE       
neo4j-core-0   1/1       Running   0          6m        
neo4j-core-1   1/1       Running   0          6m        
neo4j-core-2   1/1       Running   0          2m

I spent a while searching for a suitable command before coming across Ta-Ching Chen’s blog post, which I actually found while looking for something else.

Ta-Ching points out that we just need to add the flag -o wide to our original command to get the information we require:

$ kubectl get pod -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP           NODE
neo4j-core-0   1/1       Running   0          6m        10.32.3.6    gke-neo4j-cluster-default-pool-ded394fa-0kpw
neo4j-core-1   1/1       Running   0          6m        10.32.3.7    gke-neo4j-cluster-default-pool-ded394fa-0kpw
neo4j-core-2   1/1       Running   0          2m        10.32.0.10   gke-neo4j-cluster-default-pool-ded394fa-kp68

Easy!
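
As a related trick, if you only care about a single pod, a jsonpath query pulls out just the node name. The pod name below is taken from the listing above:

```shell
# jsonpath extracts one field from the API object - here, the node a pod
# was scheduled onto.
node_of() {
  kubectl get pod "$1" -o 'jsonpath={.spec.nodeName}'
}

# node_of neo4j-core-0
```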

Written by Mark Needham

June 14th, 2017 at 8:49 am

Posted in Kubernetes


Kubernetes: Simulating a network partition


A couple of weeks ago I wrote a post explaining how to create a Neo4j causal cluster using Kubernetes, and I then wanted to work out how to simulate a network partition which would put the leader on the minority side and force an election.

We’ve done this on our internal tooling on AWS using the iptables command but unfortunately that isn’t available in my container, which only has the utilities provided by BusyBox.

Luckily one of these is the route command, which will let us achieve the same thing.

To recap, I have 3 Neo4j pods up and running:

$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
neo4j-0   1/1       Running   0          6h
neo4j-1   1/1       Running   0          6h
neo4j-2   1/1       Running   0          6h

And we can check that the route command is available:

$ kubectl exec neo4j-0 -- ls -alh /sbin/route 
lrwxrwxrwx    1 root     root          12 Oct 18 18:58 /sbin/route -> /bin/busybox

Let’s have a look what role each server is currently playing:

$ kubectl exec neo4j-0 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"FOLLOWER"
 
Bye!
$ kubectl exec neo4j-1 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"FOLLOWER"
 
Bye!
$ kubectl exec neo4j-2 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"LEADER"
 
Bye!
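
Those three checks can be collapsed into a loop, something like:

```shell
# Query each cluster member for its current Raft role.
cluster_roles() {
  for pod in neo4j-0 neo4j-1 neo4j-2; do
    printf '%s: ' "$pod"
    kubectl exec "$pod" -- bin/cypher-shell "CALL dbms.cluster.role()"
  done
}

# cluster_roles
```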

Slight aside: I’m able to call cypher-shell without a user and password because I’ve disabled authorisation by putting the following in conf/neo4j.conf:

dbms.security.auth_enabled=false

Back to the network partitioning… We need to partition away neo4j-2 from the other two servers, which we can do by running the following commands:

$ kubectl exec neo4j-2 -- route add -host neo4j-0.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-2 -- route add -host neo4j-1.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-0 -- route add -host neo4j-2.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-1 -- route add -host neo4j-2.neo4j.default.svc.cluster.local reject
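
Since we’ll need to undo these routes later, the four commands could also be wrapped in a small helper parameterised on the route verb — a sketch, assuming BusyBox’s route accepts -host for delete as well as for add:

```shell
# Blackhole (or restore) the routes between neo4j-2 and its two peers,
# in both directions. Pass 'add' to partition, 'delete' to heal.
# Host names follow the <pod>.<service>.<namespace>.svc.cluster.local
# pattern used in this post.
toggle_partition() {
  for peer in neo4j-0 neo4j-1; do
    kubectl exec neo4j-2 -- route "$1" -host "$peer.neo4j.default.svc.cluster.local" reject
    kubectl exec "$peer" -- route "$1" -host "neo4j-2.neo4j.default.svc.cluster.local" reject
  done
}

# toggle_partition add     # create the partition
# toggle_partition delete  # heal it again
```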

If we look at the logs of neo4j-2 we can see that it’s stepped down after being disconnected from the other two servers:

$ kubectl exec neo4j-2 -- cat logs/debug.log
...
2016-12-04 11:30:10.186+0000 INFO  [o.n.c.c.c.RaftMachine] Moving to FOLLOWER state after not receiving heartbeat responses in this election timeout period. Heartbeats received: []
...

Who’s taken over as leader?

$ kubectl exec neo4j-0 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"LEADER"
 
Bye!
$ kubectl exec neo4j-1 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"FOLLOWER"
 
Bye!
$ kubectl exec neo4j-2 -- bin/cypher-shell "CALL dbms.cluster.role()"
role
"FOLLOWER"
 
Bye!

Looks like neo4j-0! Let’s put some data into the database:

$ kubectl exec neo4j-0 -- bin/cypher-shell "CREATE (:Person {name: 'Mark'})"
Added 1 nodes, Set 1 properties, Added 1 labels
 
Bye!

Let’s check if that node made it to the other two servers. We’d expect it to be on neo4j-1 but not on neo4j-2:

$ kubectl exec neo4j-1 -- bin/cypher-shell "MATCH (p:Person) RETURN p"
p
(:Person {name: "Mark"})
 
Bye!
$ kubectl exec neo4j-2 -- bin/cypher-shell "MATCH (p:Person) RETURN p"
 
 
Bye!

On neo4j-2 we’ll repeatedly see these types of entries in the log as its election timeout triggers but fails to get any responses to the vote requests it sends out:

$ kubectl exec neo4j-2 -- cat logs/debug.log
...
2016-12-04 11:32:56.735+0000 INFO  [o.n.c.c.c.RaftMachine] Election timeout triggered
2016-12-04 11:32:56.736+0000 INFO  [o.n.c.c.c.RaftMachine] Election started with vote request: Vote.Request from MemberId{ca9b954c} {term=11521, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467} and members: [MemberId{484178c4}, MemberId{0acdb8dd}, MemberId{ca9b954c}]
...

We can see those vote requests by looking at the raft-messages.log which can be enabled by setting the following property in conf/neo4j.conf:

causal_clustering.raft_messages_log_enable=true

$ kubectl exec neo4j-2 -- cat logs/raft-messages.log
...
11:33:42.101 -->MemberId{484178c4}: Request: Vote.Request from MemberId{ca9b954c} {term=11537, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
11:33:42.102 -->MemberId{0acdb8dd}: Request: Vote.Request from MemberId{ca9b954c} {term=11537, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
 
11:33:45.432 -->MemberId{484178c4}: Request: Vote.Request from MemberId{ca9b954c} {term=11538, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
11:33:45.433 -->MemberId{0acdb8dd}: Request: Vote.Request from MemberId{ca9b954c} {term=11538, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
 
11:33:48.362 -->MemberId{484178c4}: Request: Vote.Request from MemberId{ca9b954c} {term=11539, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
11:33:48.362 -->MemberId{0acdb8dd}: Request: Vote.Request from MemberId{ca9b954c} {term=11539, candidate=MemberId{ca9b954c}, lastAppended=68, lastLogTerm=11467}
...

To ‘heal’ the network partition we just need to delete the routes we added earlier:

$ kubectl exec neo4j-2 -- route delete neo4j-0.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-2 -- route delete neo4j-1.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-0 -- route delete neo4j-2.neo4j.default.svc.cluster.local reject && \
  kubectl exec neo4j-1 -- route delete neo4j-2.neo4j.default.svc.cluster.local reject

Now let’s check that neo4j-2 has the node that we created earlier:

$ kubectl exec neo4j-2 -- bin/cypher-shell "MATCH (p:Person) RETURN p"
p
(:Person {name: "Mark"})
 
Bye!

That’s all for now!

Written by Mark Needham

December 4th, 2016 at 12:37 pm

Posted in Kubernetes,neo4j
