[![VizierDB]()](https://vizierdb.info)

# Deploy a Containerized Vizier Instance

Vizier is a cloud-enabled tool that makes it easy to explore, validate, transform and debug data.

- Vizier core platform is open-source. See: [api](https://github.com/vizierdb/web-api), [ui](https://github.com/vizierdb/web-ui), [mimir](https://github.com/UBOdin/mimir) on GitHub
- for more details about Vizier see:

### Components

Vizier has a number of components that are not trivial to set up manually:

* [Web UI] - React user interface!
* [API Server] - Python WSGI API server
* [MimirDB] - virtual probabilistic database
* [Proxy] - a reverse proxy that provides an endpoint for Vizier services
* [Apache Spark] - distributed data processing
* [Hadoop] - distributed data processing
* [S3] - optional data staging endpoint
* [Analytics] - optional Vizier UI access tracking

Though installation instructions for each of these components are available, installing them manually is time-consuming and difficult. So is there an easier containerized deployment? Yes! Deployments to a Kubernetes cluster and to Docker are explained below.

### Deploy The Vizier Stack to Docker

If you already have Docker installed, good. If not, you can get it installed pretty fast. See [docker](https://docs.docker.com/install/) for more details, but on Ubuntu, basically you can just do the following:

```sh
# Register Docker's apt repository for this Ubuntu release.
# Note: apt must already trust the repository's GPG key; see the Docker docs linked above.
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
# Refresh the package index and install the Docker engine, CLI, and containerd.
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
```

Once your Docker installation is ready, get the [bash script](https://vizierdb.info/assets/run-vizier-containers.sh) for deploying Vizier and make the following adjustments:

- update the VIZIER_DOMAIN variable for the vizier-proxy deployment to the domain you will use to access Vizier. You can use a real domain and DNS entries or the hosts file of a client.
  ([run-vizier-containers.sh](https://vizierdb.info/assets/run-vizier-containers.sh): Line 26)
- update the name or host paths for the volumes if you would like them somewhere other than the default ([run-vizier-containers.sh](https://vizierdb.info/assets/run-vizier-containers.sh): Line 31)
- update the s3-credentials and bucket name with your S3 access key ID, secret, and bucket name ([run-vizier-containers.sh](https://vizierdb.info/assets/run-vizier-containers.sh): Lines 28, 29, 30)

Deploy Vizier:

```sh
./run-vizier-containers.sh
```

The IP address of the vizier-proxy service for a local docker deployment will likely be

### Deploy The Vizier Stack to Kubernetes

If you already have a [kubernetes](https://kubernetes.io/) cluster set up, good; just make sure CoreDNS is enabled (we are using k8s v1.13.2). If not, you can get a single-node cluster set up pretty fast using [microk8s](https://microk8s.io/). See [microk8s docs](https://microk8s.io/docs/) for more details, but basically you can just do the following:

```sh
# Install microk8s, then enable cluster DNS and the dashboard add-on.
sudo snap install microk8s --classic
microk8s.enable dns dashboard
```

Once your cluster is ready, get the [yaml file](https://vizierdb.info/assets/vizier-deployment.yaml) for deploying Vizier and make the following adjustments:

- update the host paths for the persistent volumes if you would like them somewhere other than /mnt/ (YAML Lines 15, 28, 41, 310)
- update the s3-credentials secret with your S3 access key ID and secret, base64 encoding them first (YAML Lines 330, 331):

  ```sh
  # printf avoids the trailing newline that a plain echo would add to the encoded value.
  printf '%s' "YOUR-S3-ACCESS-KEY-ID" | base64
  printf '%s' "YOUR-S3-ACCESS-KEY-SECRET" | base64
  ```

- update the VIZIER_DOMAIN env variable for the vizier-proxy deployment to the domain you will use to access Vizier. You can use a real domain and DNS entries or the hosts file of a client.
  (YAML Line 622)

Deploy Vizier:

```sh
kubectl create -f vizier-deployment.yaml
```

You may need to run the following to allow containers to access the internet:

```sh
sudo iptables -P FORWARD ACCEPT
```

Find the ClusterIP or ExternalIP of the vizier-proxy service:

```sh
kubectl get service vizier-proxy
```

### After Deployment

Once you have the IP of the vizier-proxy service, add the following entries to either DNS (for a real domain) or the hosts file of the client. For example, where VIZIER_DOMAIN=vizier.dev:

| IP Address | Host Name | Purpose |
| ------ | ------ | ------ |
| IP of vizier-proxy service | demo.vizier.dev | web ui for vizier |
| IP of vizier-proxy service | api.vizier.dev | web api for vizier |
| IP of vizier-proxy service | vizier.vizier.dev | supervisor ctl for api |
| IP of vizier-proxy service | mimir.vizier.dev | supervisor ctl for mimir |
| IP of vizier-proxy service | proxy.vizier.dev | supervisor ctl for proxy |
| IP of vizier-proxy service | analytics.vizier.dev | endpoint for access analytics and ui |
| IP of vizier-proxy service | spark.vizier.dev | web ui for spark master |
| IP of vizier-proxy service | driver.vizier.dev | web ui for spark driver |
| IP of vizier-proxy service | hdfs.vizier.dev | web ui for hadoop |

Now you should be able to access the Vizier UI from a web browser.

```
https://demo./vizier-db
```

License
----

Apache License 2.0
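
### Example: Hosts File Entries

As a rough sketch of the hosts-file option described under After Deployment, the snippet below prints one entry per subdomain listed in the table. The function name `vizier_hosts_entries`, the `PROXY_IP` variable, and the `192.0.2.10` placeholder address are illustrative assumptions and not part of the Vizier scripts; substitute the IP reported for the vizier-proxy service and your own VIZIER_DOMAIN.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with Vizier): print hosts-file entries
# mapping each Vizier subdomain to the vizier-proxy IP.
#   $1 = IP of the vizier-proxy service
#   $2 = value used for VIZIER_DOMAIN
vizier_hosts_entries() {
  proxy_ip="$1"
  domain="$2"
  # Subdomains taken from the table in "After Deployment".
  for sub in demo api vizier mimir proxy analytics spark driver hdfs; do
    printf '%s\t%s.%s\n' "$proxy_ip" "$sub" "$domain"
  done
}

# 192.0.2.10 is a documentation-only placeholder address; override it via PROXY_IP.
vizier_hosts_entries "${PROXY_IP:-192.0.2.10}" "${VIZIER_DOMAIN:-vizier.dev}"
```

The script only prints the entries. Review the output before appending it to the client's hosts file (for example `/etc/hosts` on Linux), since editing that file requires root privileges.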