VizierDB

Deploy a Containerized Vizier Instance

Vizier is a cloud-enabled tool that makes it easy to explore, validate, transform and debug data.

Components

Vizier has a number of components that are not trivial to set up manually:

  • [Web UI] - React user interface
  • [API Server] - Python WSGI API server
  • [MimirDB] - virtual probabilistic database
  • [Proxy] - a reverse proxy that provides an endpoint for Vizier services
  • [Apache Spark] - distributed data processing
  • [Hadoop] - distributed data processing
  • [S3] - optional data staging endpoint
  • [Analytics] - optional Vizier UI access tracking

Installation instructions are available for each of these components, but installing them manually is time-consuming and difficult. A containerized deployment is easier; deployment to a Kubernetes cluster is explained below.

Deploy The Vizier Stack to Kubernetes

If you already have a Kubernetes cluster, make sure CoreDNS is enabled (we are using k8s v1.13.2). If not, you can set up a single-node cluster quickly with microk8s. See the microk8s docs for more details; the basic steps are:

sudo snap install microk8s --classic
microk8s.enable dns dashboard

Once your cluster is ready, get the YAML file for deploying Vizier (vizier-deployment.yaml) and make the following adjustments:

  • Update the host paths for the persistent volumes if you would like them somewhere other than /mnt/ (YAML lines 15, 28, 41, 310).
  • Update the s3-credentials secret with your S3 access key ID and secret (YAML lines 330, 331). Base64-encode them first, using echo -n so that a trailing newline is not encoded into the value:
    echo -n "YOUR-S3-ACCESS-KEY-ID" | base64
    echo -n "YOUR-S3-ACCESS-KEY-SECRET" | base64

  • Update the VIZIER_DOMAIN env variable for the vizier-proxy deployment to the domain you will use to access Vizier (YAML line 622). You can use a real domain with DNS entries, or the hosts file of a client.
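The base64 step above can be sketched as a small shell helper. This is only an illustration: the function names `b64_secret` and `b64_roundtrip_ok` are hypothetical and not part of Vizier. It uses `printf '%s'` so that no trailing newline ends up in the encoded value, and it decodes the result again as a sanity check.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vizier): base64-encode a value exactly as
# given, without the trailing newline that a plain `echo` would append.
# `tr -d '\n'` also removes the line wrapping base64 applies to long input.
b64_secret() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Round-trip check: decode the encoded value and compare it to the input.
b64_roundtrip_ok() {
    encoded=$(b64_secret "$1")
    decoded=$(printf '%s' "$encoded" | base64 --decode)
    [ "$decoded" = "$1" ]
}
```

For example, `b64_secret "YOUR-S3-ACCESS-KEY-ID"` prints the string to paste into the s3-credentials secret.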

Deploy Vizier:

kubectl create -f vizier-deployment.yaml
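After creating the resources, it can help to wait until the pods are ready before continuing. The following is a minimal sketch, assuming the Vizier pods run in the `default` namespace and that five minutes is a reasonable timeout; the function name `wait_for_vizier_pods` is hypothetical. If the namespace also contains pods that never become Ready (for example completed jobs), the wait will time out.

```shell
#!/bin/sh
# Hypothetical helper: block until every pod in a namespace reports Ready,
# or fail once the timeout expires.
# $1 = namespace (assumed default: "default"), $2 = timeout (assumed: 300s)
wait_for_vizier_pods() {
    namespace="${1:-default}"
    timeout="${2:-300s}"
    kubectl wait --namespace "$namespace" \
        --for=condition=Ready pod --all \
        --timeout="$timeout"
}
```

Call it as `wait_for_vizier_pods` or, with explicit values, `wait_for_vizier_pods default 600s`. If it times out, `kubectl get pods` shows which pods are still starting.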

If the containers cannot reach the internet, you may need to allow forwarded traffic on the host:

sudo iptables -P FORWARD ACCEPT

Find the ClusterIP or ExternalIP of the vizier-proxy service:

kubectl get service vizier-proxy
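If you want the IP in a script rather than reading it from the table output, a sketch along these lines may help. The function name `service_ip` is hypothetical. It assumes the load balancer, if any, reports an IP; on some cloud providers the service reports a hostname instead, in which case this falls back to the ClusterIP.

```shell
#!/bin/sh
# Hypothetical helper: print the external IP of a service if one is assigned,
# otherwise fall back to its ClusterIP.
# $1 = service name (default: vizier-proxy)
service_ip() {
    svc="${1:-vizier-proxy}"
    # Empty when no load balancer IP has been assigned; errors are ignored.
    ip=$(kubectl get service "$svc" \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -z "$ip" ]; then
        ip=$(kubectl get service "$svc" -o jsonpath='{.spec.clusterIP}')
    fi
    printf '%s\n' "$ip"
}
```

Usage: `PROXY_IP=$(service_ip vizier-proxy)`.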

Once you have the IP of the vizier-proxy service, add the following entries either to DNS (for a real domain) or to the hosts file of the client. Every host name points to the vizier-proxy IP. The example assumes VIZIER_DOMAIN=vizier.dev:

IP Address          Host Name              Purpose
<vizier-proxy IP>   demo.vizier.dev        web UI for Vizier
<vizier-proxy IP>   api.vizier.dev         web API for Vizier
<vizier-proxy IP>   vizier.vizier.dev      supervisor ctl for API
<vizier-proxy IP>   mimir.vizier.dev       supervisor ctl for Mimir
<vizier-proxy IP>   proxy.vizier.dev       supervisor ctl for proxy
<vizier-proxy IP>   analytics.vizier.dev   endpoint for access analytics and UI
<vizier-proxy IP>   spark.vizier.dev       web UI for Spark master
<vizier-proxy IP>   driver.vizier.dev      web UI for Spark driver
<vizier-proxy IP>   hdfs.vizier.dev        web UI for Hadoop
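For the hosts-file route, the entries can be generated rather than typed by hand. The sketch below is an illustration only; the function name `vizier_hosts_entries` is hypothetical, and the subdomain list is copied from the host names listed above.

```shell
#!/bin/sh
# Hypothetical helper: print one hosts-file line per Vizier subdomain.
# $1 = IP of the vizier-proxy service
# $2 = VIZIER_DOMAIN (default: vizier.dev)
vizier_hosts_entries() {
    ip="$1"
    domain="${2:-vizier.dev}"
    for sub in demo api vizier mimir proxy analytics spark driver hdfs; do
        printf '%s %s.%s\n' "$ip" "$sub" "$domain"
    done
}
```

On a Linux or macOS client, `vizier_hosts_entries 10.152.183.7 vizier.dev | sudo tee -a /etc/hosts` appends the entries; the IP shown here is a placeholder for your own vizier-proxy IP.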

Now you should be able to access the Vizier UI from a web browser.

https://demo.<VIZIER_DOMAIN>/vizier-db
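To check from the command line that the UI is reachable, something like the following sketch can be used. The function name `vizier_ui_status` is hypothetical, and `--insecure` is included on the assumption that the proxy may present a certificate the client does not trust; drop it if your certificate is valid.

```shell
#!/bin/sh
# Hypothetical helper: request the Vizier UI and print the HTTP status code.
# $1 = VIZIER_DOMAIN (default: vizier.dev)
# --insecure skips certificate verification (an assumption, see above).
vizier_ui_status() {
    domain="${1:-vizier.dev}"
    curl --insecure --silent --output /dev/null \
        --write-out '%{http_code}\n' \
        "https://demo.${domain}/vizier-db"
}
```

A status of 000 means the request did not complete at all, which usually points to a DNS or hosts-file problem rather than to Vizier itself.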

License

Apache License 2.0