Dataiku in HA on EKS
• Mark Eschbach
Next up is to verify that we can recover from the loss of a single process and that our application still works as expected. From what I can tell, the recovery story Dataiku uses is tarring up a directory. I am hoping the path forward is to use a persistent volume for that data directory.
A persistent volume is an administrator-configured storage device which lives outside the lifecycle of a pod. On some platforms, like GKE, these can be auto-provisioned. For the EKS cluster I am working against, the storage class was already deployed; I would imagine it looks something like this:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      annotations:
        storageclass.beta.kubernetes.io/is-default-class: "false"
      name: persistent
    parameters:
      encrypted: "true"
      type: gp2
      zones: us-east-1b,us-east-1c
    provisioner: kubernetes.io/aws-ebs
To use the storage, you will need to define a claim against the storage class:
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: dataiku-orechestration
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
      storageClassName: persistent
To mount the claim into a container, one would use:
    spec:
      containers:
        - name: private-reg-container
          volumeMounts:
            - name: data
              mountPath: "/home/dataiku/dss"
In theory this should work great. Unfortunately it does not, as the underlying container image does not declare any mount points. You can verify this yourself with the command docker inspect dataiku/dss:latest, once you have run dataiku/dss:latest, by looking at the path ..ContainerConfig.Volumes. At this point I think there are two options: the first is to ask Dataiku for guidance on how to run this; the second is a StatefulSet, though I think that would be an issue.
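For context on the StatefulSet option, storage is requested there through volumeClaimTemplates rather than a standalone claim. A minimal, untested sketch follows. The names dataiku, dataiku-container and data-store are placeholders, the headless service named in serviceName is assumed to exist, and the storage class is the persistent one imagined above.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dataiku                  # hypothetical name
spec:
  serviceName: dataiku           # assumes a headless Service with this name exists
  replicas: 1
  selector:
    matchLabels:
      app: dataiku
  template:
    metadata:
      labels:
        app: dataiku             # must match the selector above
    spec:
      containers:
        - name: dataiku-container
          image: dataiku/dss:latest
          volumeMounts:
            - name: data-store
              mountPath: /home/dataiku/dss
  volumeClaimTemplates:
    # One claim is created per replica and is kept when the pod is replaced.
    - metadata:
        name: data-store
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: persistent
        resources:
          requests:
            storage: 8Gi
```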
Speaking too soon
After poking around with docker run -it --rm dataiku/dss /bin/sh, I found there is a directory named /home/dataiku, which is where the data is being stored. Perhaps there is hope! Additionally, after talking with a coworker, I should be able to force a mount point using something like this:
    spec:
      containers:
        - name: dataiku-container
          volumeMounts:
            - mountPath: /home/dataiku/dss
              name: data-store
              readOnly: false
      volumes:
        - name: data-store
          persistentVolumeClaim:
            claimName: example-claim
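Put together as a complete manifest, the fragment above might look like the following sketch. The pod name dataiku-dss is hypothetical, and claimName has to match an existing PersistentVolumeClaim in the same namespace; I am assuming the dataiku-orechestration claim defined earlier rather than example-claim.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dataiku-dss              # hypothetical name
spec:
  containers:
    - name: dataiku-container
      image: dataiku/dss:latest
      volumeMounts:
        - name: data-store       # must match a volume name below
          mountPath: /home/dataiku/dss
          readOnly: false
  volumes:
    - name: data-store
      persistentVolumeClaim:
        claimName: dataiku-orechestration   # assumption: the claim defined earlier
```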
I have the claim mounted; however, Dataiku is producing the following error:
[-] Directory /home/dataiku/dss already exists, but is not empty. Aborting !
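One possible cause, which I have not verified: a freshly formatted ext4 volume contains a lost+found directory at its root, so the mount point would not be empty the first time DSS starts. If that is what is happening, mounting a subdirectory of the volume with subPath may sidestep it. The subdirectory name dss below is arbitrary.

```yaml
spec:
  containers:
    - name: dataiku-container
      volumeMounts:
        - name: data-store
          mountPath: /home/dataiku/dss
          subPath: dss           # mount the dss subdirectory instead of the volume root
  volumes:
    - name: data-store
      persistentVolumeClaim:
        claimName: example-claim
```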
I will need to investigate another time.