Deploy a Containerized Vizier Instance
Vizier is a cloud-enabled tool that makes it easy to explore, validate, transform and debug data.
- The Vizier core platform is open source. See api, ui, and mimir on GitHub.
- For more details about Vizier, see https://vizierdb.info
Components
Vizier has a number of components that are not trivial to set up manually:
- [Web UI] - React user interface!
- [API Server] - Python WSGI API server
- [MimirDB] - virtual probabilistic database
- [Proxy] - a reverse proxy that provides an endpoint for vizier services
- [Apache Spark] - distributed data processing
- [Hadoop] - distributed data processing
- [S3] - optional data staging endpoint
- [Analytics] - optional vizier ui access tracking
Although installation instructions for each of these components are available, installing them manually is time-consuming and difficult. Is there an easier, containerized deployment? Yes! Deployments to a Kubernetes cluster and to Docker are explained below.
Deploy the Vizier Stack to Docker
If you already have Docker installed, good. If not, you can get it installed pretty quickly. See the Docker documentation for more details, but on Ubuntu you can basically do the following (this assumes Docker's GPG key has already been added to apt, as described in the Docker documentation):
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
Once your Docker installation is ready, get the bash script for deploying Vizier and make the following adjustments:
- update the VIZIER_DOMAIN variable for the vizier-proxy deployment to the domain you will use to access Vizier. You can use a real domain and DNS entries or the hosts file of a client. (run-vizier-containers.sh: Line 26)
- update the name or host paths for the volumes if you would like them somewhere other than the default (run-vizier-containers.sh: Line 31)
- update the s3-credentials and bucket name with your S3 access key id, secret, and bucket name: (run-vizier-containers.sh: Line 28, 29, 30)
Deploy Vizier
./run-vizier-containers.sh
The IP address of the vizier-proxy service for a local Docker deployment will likely be 127.0.0.1.
Deploy the Vizier Stack to Kubernetes
If you already have a Kubernetes cluster set up, good; just make sure CoreDNS is enabled (we are using k8s v1.13.2). If not, you can get a single-node cluster set up pretty quickly using microk8s. See the microk8s docs for more details, but basically you can do the following:
sudo snap install microk8s --classic
microk8s.enable dns dashboard
Once your cluster is ready, get the yaml file for deploying Vizier and make the following adjustments:
- update the host paths for the persistent volumes if you would like them somewhere other than /mnt/ (YAML Line 15, 28, 41, 310)
- update the s3-credentials secret with your S3 access key id and secret - base64 encode them first: (YAML Line 330, 331)
echo -n "YOUR-S3-ACCESS-KEY-ID" | base64
echo -n "YOUR-S3-ACCESS-KEY-SECRET" | base64
- update the VIZIER_DOMAIN env variable for the vizier-proxy deployment to the domain you will use to access Vizier. You can use a real domain and DNS entries or the hosts file of a client. (YAML Line 622)
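When base64-encoding the S3 credentials for the secret, note that a plain `echo` appends a newline, and that newline is encoded along with the value. The following is a minimal sketch of a helper that avoids this; the function name `encode_secret` is hypothetical and not part of the Vizier deployment files.

```shell
# Hypothetical helper (not part of the Vizier repo): base64-encode a value
# for a Kubernetes Secret without including a trailing newline.
encode_secret() {
  # printf '%s' writes the argument exactly as given, with no newline added.
  # tr -d '\n' strips the line wrapping that some base64 builds insert.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example usage (placeholders, as in the instructions above):
#   encode_secret "YOUR-S3-ACCESS-KEY-ID"
#   encode_secret "YOUR-S3-ACCESS-KEY-SECRET"
```

Paste the resulting strings into the s3-credentials secret in the YAML file.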
Deploy Vizier
kubectl create -f vizier-deployment.yaml
You may need to run the following to allow containers to access the internet:
sudo iptables -P FORWARD ACCEPT
Find the ClusterIP or ExternalIP of the vizier-proxy service:
kubectl get service vizier-proxy
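If you only need the address itself (for example, to use it in a script), kubectl's `-o jsonpath` output option can print a single field. The sketch below assumes the service exposes a ClusterIP and that `kubectl` is on your PATH; the function name `vizier_proxy_ip` is hypothetical.

```shell
# Hypothetical helper: print only the ClusterIP of the vizier-proxy service.
# Assumes kubectl is configured for the cluster where Vizier was deployed.
vizier_proxy_ip() {
  kubectl get service vizier-proxy -o jsonpath='{.spec.clusterIP}'
}

# Example usage:
#   PROXY_IP="$(vizier_proxy_ip)"
#   echo "vizier-proxy is at ${PROXY_IP}"
```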
After Deployment
After you have the IP of the vizier-proxy service, add the following entries to either DNS (for a real domain) or the hosts file of the client. For example, with VIZIER_DOMAIN=vizier.dev:
IP Address | Host Name | Purpose |
---|---|---|
IP of vizier-proxy service | demo.vizier.dev | web ui for vizier |
IP of vizier-proxy service | api.vizier.dev | web api for vizier |
IP of vizier-proxy service | vizier.vizier.dev | supervisor ctl for api |
IP of vizier-proxy service | mimir.vizier.dev | supervisor ctl for mimir |
IP of vizier-proxy service | proxy.vizier.dev | supervisor ctl for proxy |
IP of vizier-proxy service | analytics.vizier.dev | endpoint for access analytics and ui |
IP of vizier-proxy service | spark.vizier.dev | web ui for spark master |
IP of vizier-proxy service | driver.vizier.dev | web ui for spark driver |
IP of vizier-proxy service | hdfs.vizier.dev | web ui for hadoop |
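If you are using a hosts file rather than DNS, the entries in the table can be generated instead of typed by hand. This is a minimal sketch; the function name `vizier_hosts_entries` is hypothetical, and the subdomain list is copied from the table above.

```shell
# Hypothetical helper: print one hosts-file line per Vizier subdomain.
# Usage: vizier_hosts_entries <proxy-ip> <vizier-domain>
vizier_hosts_entries() {
  vh_ip="$1"
  vh_domain="$2"
  # Subdomains taken from the table in this README.
  for vh_sub in demo api vizier mimir proxy analytics spark driver hdfs; do
    printf '%s\t%s.%s\n' "$vh_ip" "$vh_sub" "$vh_domain"
  done
}

# Example usage (review the output before appending it to the hosts file):
#   vizier_hosts_entries 127.0.0.1 vizier.dev
#   vizier_hosts_entries 127.0.0.1 vizier.dev | sudo tee -a /etc/hosts
```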
Now you should be able to access the Vizier UI from a web browser at:
https://demo.<VIZIER_DOMAIN>/vizier-db
License
Apache License 2.0