Apache Kyuubi on CDH: Practice on the 竞技世界 (JJ World) Big Data Platform

To let the business big data architecture use multiple SQL engines (Spark, Flink, Trino, querying Hive, ClickHouse, and so on), we need to deploy a unified SQL entry point that works across multiple engines and platforms. This article records an initial step toward that goal (the remaining parts are still in progress). Since we could not find any existing write-up of Kyuubi on k8s, we are documenting ours here.

01
Base Environment

Component    Version
Kyuubi       v1.6.0
Spark        v3.3.0
CDH          v6.2.1

02

Building the Docker Images


Building the Spark 3.3.0 Image

1. Modify Spark's configuration files

Edit spark-env.sh and add the content below

(the paths are the ones that will exist inside the container):

export HADOOP_CONF_DIR=/opt/spark/conf:/opt/spark/conf
export YARN_CONF_DIR=/opt/spark/conf:/opt/spark/conf

Leave spark-defaults.conf unchanged.

2. Copy the Hadoop configuration from CDH into /data/spark-3.3.0/conf
(CDH keeps its Hadoop configuration under etc).
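For reference, a minimal sketch of this step, assuming the CDH gateway keeps its client configuration under /etc/hadoop/conf and /etc/hive/conf (actual paths vary by deployment):

# copy the CDH client configs into the Spark image build context
cp /etc/hadoop/conf/core-site.xml /data/spark-3.3.0/conf/
cp /etc/hadoop/conf/hdfs-site.xml /data/spark-3.3.0/conf/
cp /etc/hadoop/conf/yarn-site.xml /data/spark-3.3.0/conf/
cp /etc/hive/conf/hive-site.xml   /data/spark-3.3.0/conf/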


3. Write the initialization script

(The content below needs to be added; *** denotes site-specific values.)

# make the CDH cluster IPs resolvable later
echo " ***.***.***.***  " >> /etc/hosts
# configuration file required for Kerberos authentication
echo "***" > /etc/krb5.conf
# authenticate inside the image
kinit -kt /opt/spark/work-dir/hive.keytab hive/***@****.****.****

Save the edited run.sh under /data/spark-3.3.0/, and put the keytab file there as well (any location works, but this is the most convenient, since the Dockerfile below references it).
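Before building, the build context should look roughly like this (a sketch; only the files this guide touches are shown):

/data/spark-3.3.0/
  conf/                                        # CDH Hadoop/Hive client configs + spark-env.sh
  run.sh                                       # the initialization script above
  hive.keytab                                  # Kerberos keytab referenced by the Dockerfile
  kubernetes/dockerfiles/spark/Dockerfile
  kubernetes/dockerfiles/spark/entrypoint.sh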
4. Modify the script that runs when the Docker container starts:
/data/spark-3.3.0/kubernetes/dockerfiles/spark/entrypoint.sh

Key change: run the initialization script run.sh when the driver and executor start (we used 777 permissions for convenience).

#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# echo commands to the terminal output
set -ex

# Check whether there is a passwd entry for the container UID
#myuid=$(id -u)
myuid=0
mygid=$(id -g)
# turn off -e for getent because it will return error code in anonymous uid case
set +e
uidentry=$(getent passwd $myuid)
set -e

# If there is no passwd entry for the container UID, attempt to create one
if [ -z "$uidentry" ] ; then
    if [ -w /etc/passwd ] ; then
        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
    else
        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
    fi
fi

if [ -z "$JAVA_HOME" ]; then
  JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
fi

SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt

if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
  SPARK_CLASSPATH="$SPARK_CLASSPATH:$SPARK_EXTRA_CLASSPATH"
fi

if ! [ -z ${PYSPARK_PYTHON+x} ]; then
    export PYSPARK_PYTHON
fi
if ! [ -z ${PYSPARK_DRIVER_PYTHON+x} ]; then
    export PYSPARK_DRIVER_PYTHON
fi

# If HADOOP_HOME is set and SPARK_DIST_CLASSPATH is not set, set it here so Hadoop jars are available to the executor.
# It does not set SPARK_DIST_CLASSPATH if already set, to avoid overriding customizations of this value from elsewhere e.g. Docker/K8s.
if [ -n "${HADOOP_HOME}" ] && [ -z "${SPARK_DIST_CLASSPATH}" ]; then
  export SPARK_DIST_CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath)"
fi

if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi

if ! [ -z ${SPARK_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$SPARK_CONF_DIR:$SPARK_CLASSPATH";
elif ! [ -z ${SPARK_HOME+x} ]; then
  SPARK_CLASSPATH="$SPARK_HOME/conf:$SPARK_CLASSPATH";
fi

case "$1" in
  driver)
    shift 1
    # run the site-specific initialization (hosts entries, krb5.conf, kinit)
    chmod 777 /opt/spark/work-dir/run.sh
    /bin/bash /opt/spark/work-dir/run.sh
    cat /etc/hosts
    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      "$@"
    )
    ;;
  executor)
    shift 1
    # same initialization on every executor pod
    chmod 777 /opt/spark/work-dir/run.sh
    /bin/bash /opt/spark/work-dir/run.sh
    cat /etc/hosts
    CMD=(
      ${JAVA_HOME}/bin/java
      "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
      -Xms$SPARK_EXECUTOR_MEMORY
      -Xmx$SPARK_EXECUTOR_MEMORY
      -cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
      org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend
      --driver-url $SPARK_DRIVER_URL
      --executor-id $SPARK_EXECUTOR_ID
      --cores $SPARK_EXECUTOR_CORES
      --app-id $SPARK_APPLICATION_ID
      --hostname $SPARK_EXECUTOR_POD_IP
      --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
      --podName $SPARK_EXECUTOR_POD_NAME
    )
    ;;

  *)
    echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
    CMD=("$@")
    ;;
esac

# Execute the container CMD under tini for better hygiene
exec /usr/bin/tini -s -- "${CMD[@]}"

5. Edit /data/spark-3.3.0/kubernetes/dockerfiles/spark/Dockerfile
Key points:
  • Change the source of the openjdk base image (optional, but pulls may fail on a slow network)
  • Change the Debian package sources (same reason)
  • Install vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps and other packages (to make working inside the container easier)
  • Copy the files under conf to /opt/spark/conf
  • Copy the keytab file to /opt/spark/work-dir
  • Copy the initialization script run.sh, which modifies /etc/hosts after the container starts
  • Set spark_uid to 0 (root) (required to modify the hosts file)
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

ARG java_image_tag=8-jre-slim

FROM ***.***.***.***/bigdata/openjdk:${java_image_tag}

#ARG spark_uid=185
ARG spark_uid=0

# Before building the docker image, first build and make a Spark distribution following
# the instructions in https://spark.apache.org/docs/latest/building-spark.html.
# If this docker file is being used in the context of building your images from a Spark
# distribution, the docker build command should be invoked from the top level directory
# of the Spark distribution. E.g.:
# docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

RUN set -ex && \
    sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
    sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
    sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    apt-get update && \
    ln -s /lib /lib64 && \
    apt-get install -y vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
    mkdir -p /opt/spark && \
    mkdir -p /opt/spark/examples && \
    mkdir -p /opt/spark/work-dir && \
    mkdir -p /opt/hadoop && \
    touch /opt/spark/RELEASE && \
    rm /bin/sh && \
    ln -sv /bin/bash /bin/sh && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
    rm -rf /var/cache/apt/*

COPY jars /opt/spark/jars
COPY bin /opt/spark/bin
COPY sbin /opt/spark/sbin
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY kubernetes/dockerfiles/spark/decom.sh /opt/
COPY examples /opt/spark/examples
COPY kubernetes/tests /opt/spark/tests
#COPY hadoop/conf /opt/hadoop/conf
COPY conf /opt/spark/conf
COPY data /opt/spark/data
COPY hive.keytab /opt/spark/work-dir
COPY run.sh /opt/spark/work-dir

ENV SPARK_HOME /opt/spark

WORKDIR /opt/spark/work-dir
RUN chmod 777 /opt/spark/work-dir
RUN chmod a+x /opt/decom.sh
RUN chmod 777 /opt/spark/work-dir/run.sh

ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
USER ${spark_uid}

6. Go back to /data/spark-3.3.0 and run the following commands:

# build the image
./bin/docker-image-tool.sh -t v3.3.0 build
# re-tag the image
docker tag spark:v3.3.0 ***.***.***.***/bigdata/spark:v3.3.0
# push the image to the internal registry (company-hosted)
docker push ***.***.***.***/bigdata/spark:v3.3.0
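An optional sanity check that the build and push succeeded (registry and tag as above):

# the image should be listed locally and be pullable from any host with registry access
docker images | grep spark
docker pull ***.***.***.***/bigdata/spark:v3.3.0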

Building the Kyuubi 1.6.0 Image

1. Kyuubi's configuration files do not need to be changed; the project provides a more convenient mechanism (kyuubi-configmap.yaml).

2. Write the initialization script run.sh

(The script runs the commands below, but they may not all take effect; check in the running container that kubectl can create pods. "***" denotes site-specific values.)
Download kubectl yourself; instructions are easy to find online.

mkdir /etc/.kube
chmod 777 /root/.kube
cp /opt/kyuubi/config /root/.kube
# the key step that makes kubectl usable
echo "export KUBECONFIG=/etc/.kube/config" >> /etc/profile
export KUBECONFIG=/etc/.kube/config
source /etc/profile

# serve kubectl from the intranet for easy download
wget http://***.***.***.***/yum/k8s/kubectl
chmod +x ./kubectl
mv ./kubectl /usr/bin/
# check that kubectl installed successfully
kubectl version --client
echo "***" >> /etc/hosts
echo "***" > /etc/krb5.conf
kinit -kt /opt/kyuubi/hive.keytab hive/***@HADOOP.****.***
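To confirm that the kubectl setup in run.sh actually took effect, something like the following can be run inside the started container (the namespace is illustrative; substitute your own):

# verify kubectl can reach the API server and is allowed to create pods
kubectl version --client
kubectl auth can-i create pods -n ****-bd-k8s
kubectl get pods -n ****-bd-k8s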

3. Modify /data/kyuubi-1.6.0/bin/kyuubi

Where the script handles "kyuubi run", add:

chmod 777 /opt/kyuubi/run.sh
/bin/bash /opt/kyuubi/run.sh

#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

## Kyuubi Server Main Entrance
CLASS="org.apache.kyuubi.server.KyuubiServer"

function usage() {
  echo "Usage: bin/kyuubi command"
  echo "  commands:"
  echo "    start        - Run a Kyuubi server as a daemon"
  echo "    restart      - Restart Kyuubi server as a daemon"
  echo "    run          - Run a Kyuubi server in the foreground"
  echo "    stop         - Stop the Kyuubi daemon"
  echo "    status       - Show status of the Kyuubi daemon"
  echo "    -h | --help  - Show this help message"
}

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  usage
  exit 0
fi

function kyuubi_logo() {
  source ${KYUUBI_HOME}/bin/kyuubi-logo
}

function kyuubi_rotate_log() {
  log=$1

  if [[ -z ${KYUUBI_MAX_LOG_FILES} ]]; then
    num=5
  elif [[ ${KYUUBI_MAX_LOG_FILES} -gt 0 ]]; then
    num=${KYUUBI_MAX_LOG_FILES}
  else
    echo "Error: KYUUBI_MAX_LOG_FILES must be a positive number, but got ${KYUUBI_MAX_LOG_FILES}"
    exit -1
  fi

  if [ -f "$log" ]; then # rotate logs
    while [ ${num} -gt 1 ]; do
      prev=`expr ${num} - 1`
      [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
      num=${prev}
    done
    mv "$log" "$log.$num"
  fi
}

export KYUUBI_HOME="$(cd "$(dirname "$0")"/..; pwd)"

if [[ $1 == "start" ]] || [[ $1 == "run" ]]; then
  . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh"
else
  . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh" -s
fi

if [[ -z ${JAVA_HOME} ]]; then
  echo "Error: JAVA_HOME IS NOT SET! CANNOT PROCEED."
  exit 1
fi

RUNNER="${JAVA_HOME}/bin/java"

## Find the Kyuubi Jar
if [[ -z "$KYUUBI_JAR_DIR" ]]; then
  KYUUBI_JAR_DIR="$KYUUBI_HOME/jars"
  if [[ ! -d ${KYUUBI_JAR_DIR} ]]; then
    echo -e "\nCandidate Kyuubi lib $KYUUBI_JAR_DIR doesn't exist, searching development environment..."
    KYUUBI_JAR_DIR="$KYUUBI_HOME/kyuubi-assembly/target/scala-${KYUUBI_SCALA_VERSION}/jars"
  fi
fi

if [[ -z ${YARN_CONF_DIR} ]]; then
  KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}"
else
  KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
fi

cmd="${RUNNER} ${KYUUBI_JAVA_OPTS} -cp ${KYUUBI_CLASSPATH} $CLASS"

pid="${KYUUBI_PID_DIR}/kyuubi-$USER-$CLASS.pid"

function start_kyuubi() {
  if [[ ! -w ${KYUUBI_PID_DIR} ]]; then
    echo "${USER} does not have 'w' permission to ${KYUUBI_PID_DIR}"
    exit 1
  fi

  if [[ ! -w ${KYUUBI_LOG_DIR} ]]; then
    echo "${USER} does not have 'w' permission to ${KYUUBI_LOG_DIR}"
    exit 1
  fi

  if [ -f "$pid" ]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "$CLASS running as process $TARGET_ID. Stop it first."
      exit 1
    fi
  fi

  log="${KYUUBI_LOG_DIR}/kyuubi-$USER-$CLASS-$HOSTNAME.out"
  kyuubi_rotate_log ${log}

  echo "Starting $CLASS, logging to $log"
  nohup nice -n "${KYUUBI_NICENESS:-0}" ${cmd} >> ${log} 2>&1 < /dev/null &
  newpid="$!"

  echo "$newpid" > "$pid"

  # Poll for up to 5 seconds for the java process to start
  for i in {1..10}
  do
    if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
      break
    fi
    sleep 0.5
  done

  sleep 2
  # Check if the process has died; in that case we'll tail the log so the user can see
  if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
    echo "Failed to launch: ${cmd}"
    tail -2 "$log" | sed 's/^/  /'
    echo "Full log in $log"
  else
    echo "Welcome to"
    kyuubi_logo
  fi
}

function run_kyuubi() {
  echo "Starting $CLASS"
  nice -n "${KYUUBI_NICENESS:-0}" ${cmd}
}

function stop_kyuubi() {
  if [ -f ${pid} ]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "Stopping $CLASS"
      kill "$TARGET_ID" && rm -f "$pid"
      for i in {1..20}
      do
        sleep 0.5
        if [[ ! $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
          break
        fi
      done

      if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
        echo "Failed to stop kyuubi after 10 seconds, try 'kill -9 ${TARGET_ID}' forcefully"
      else
        kyuubi_logo
        echo "Bye!"
      fi
    else
      echo "no $CLASS to stop"
    fi
  else
    echo "no $CLASS to stop"
  fi
}

function check_kyuubi() {
  if [[ -f ${pid} ]]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "Kyuubi is running (pid: $TARGET_ID)"
    else
      echo "Kyuubi is not running"
    fi
  else
    echo "Kyuubi is not running"
  fi
}

case $1 in
  (start | "")
    start_kyuubi
    ;;

  (restart)
    echo "Restarting Kyuubi"
    stop_kyuubi
    start_kyuubi
    ;;

  (run)
    # run the initialization script (kubectl setup, hosts, kinit) before the server starts
    chmod 777 /opt/kyuubi/run.sh
    /bin/bash /opt/kyuubi/run.sh
    run_kyuubi
    ;;

  (stop)
    stop_kyuubi
    ;;

  (status)
    check_kyuubi
    ;;

  (*)
    usage
    ;;
esac

4. Edit /data/kyuubi-1.6.0/docker/Dockerfile
Key points:
  • Change the source of the openjdk base image
  • Change the Debian package sources
  • Install wget vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps and other packages
  • Copy the keytab file to /opt/kyuubi
  • Copy the initialization script run.sh, which modifies /etc/hosts after the container starts
  • Set the user to 0 (root) (either "root" or 0 works)
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Usage:
#   1. use ./build/dist to make binary distributions of Kyuubi or download a release
#   2. Untar it and run the docker command below
#      docker build -f docker/Dockerfile -t repository/kyuubi:tagname .
#   Options:
#     -f this docker file
#     -t the target repo and tag name
#     more options can be found with -h

ARG BASE_IMAGE=***.***.***.***/bigdata/openjdk:8-jre-slim
ARG spark_provided="spark_builtin"

FROM ${BASE_IMAGE} as builder_spark_provided
ONBUILD ARG spark_home_in_docker
ONBUILD ENV SPARK_HOME ${spark_home_in_docker}

FROM ${BASE_IMAGE} as builder_spark_builtin
ONBUILD ENV SPARK_HOME /opt/spark
ONBUILD RUN mkdir -p ${SPARK_HOME}
ONBUILD COPY spark-binary ${SPARK_HOME}

FROM builder_${spark_provided}

ARG kyuubi_uid=10009

USER root

ENV KYUUBI_HOME /opt/kyuubi
ENV KYUUBI_LOG_DIR ${KYUUBI_HOME}/logs
ENV KYUUBI_PID_DIR ${KYUUBI_HOME}/pid
ENV KYUUBI_WORK_DIR_ROOT ${KYUUBI_HOME}/work

RUN set -ex && \
    sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
    sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
    sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    apt-get update && \
    apt-get install -y wget vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
    useradd -u ${kyuubi_uid} -g root kyuubi && \
    mkdir -p ${KYUUBI_HOME} ${KYUUBI_LOG_DIR} ${KYUUBI_PID_DIR} ${KYUUBI_WORK_DIR_ROOT} && \
    chmod ug+rw -R ${KYUUBI_HOME} && \
    chmod a+rwx -R ${KYUUBI_WORK_DIR_ROOT} && \
    rm -rf /var/cache/apt/*

COPY bin ${KYUUBI_HOME}/bin
COPY jars ${KYUUBI_HOME}/jars
COPY beeline-jars ${KYUUBI_HOME}/beeline-jars
COPY externals/engines/spark ${KYUUBI_HOME}/externals/engines/spark
COPY hive.keytab /opt/kyuubi
COPY config /opt/kyuubi
COPY run.sh /opt/kyuubi

WORKDIR ${KYUUBI_HOME}

CMD [ "./bin/kyuubi", "run" ]

USER ${kyuubi_uid}

USER root

5. Go back to /data/kyuubi-1.6.0 and run the following commands:

# build the image
./bin/docker-image-tool.sh -S /opt/spark -b BASE_IMAGE=***.***.***.***/bigdata/spark:v3.3.0 -t v1.6.0 build
# re-tag the image
docker tag kyuubi:v1.6.0 ***.***.***.***/bigdata/kyuubi:v1.6.0
# push the image to the internal registry
docker push ***.***.***.***/bigdata/kyuubi:v1.6.0
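An optional, quick check that the Kyuubi image really embeds the Spark 3.3.0 distribution (image name as above):

# list the Spark home baked into the Kyuubi image
docker run --rm --entrypoint ls ***.***.***.***/bigdata/kyuubi:v1.6.0 /opt/spark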

03
Editing the Kyuubi Service YAML Files

Edit /kyuubi/docker/kyuubi-configmap.yaml
1. Add the namespace information: namespace
2. Add the kyuubi-env.sh and kyuubi-defaults.conf contents
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-defaults
data:
  kyuubi-env.sh: |
    export SPARK_HOME=/opt/spark
    export SPARK_CONF_DIR=${SPARK_HOME}/conf
    export HADOOP_CONF_DIR=${SPARK_HOME}/conf:${SPARK_HOME}/conf
    export KYUUBI_PID_DIR=/opt/kyuubi/pid
    export KYUUBI_LOG_DIR=/opt/kyuubi/logs
    export KYUUBI_WORK_DIR_ROOT=/opt/kyuubi/work
    export KYUUBI_MAX_LOG_FILES=10
  kyuubi-defaults.conf: |
    # ## Kyuubi Configurations
    #
    # kyuubi.authentication NONE
    # kyuubi.frontend.bind.host localhost
    # kyuubi.frontend.bind.port 10009
    #
    # Details in https://kyuubi.apache.org/docs/latest/deployment/settings.html
    kyuubi.authentication=KERBEROS
    kyuubi.kinit.principal=hive/****-****-****-****@****.****.****
    kyuubi.kinit.keytab=/opt/kyuubi/hive.keytab
    # Very important: avoids clients failing to connect via hostname once the
    # Kyuubi service is up; this setting makes the connection URL use the IP
    kyuubi.frontend.connection.url.use.hostname false
    kyuubi.engine.share.level=USER
    kyuubi.session.engine.idle.timeout=PT1H
    kyuubi.ha.enabled=true
    kyuubi.ha.zookeeper.quorum=***.***.***.***:2181,***.***.***.***:2181,***.***.***.***:2181
    kyuubi.ha.zookeeper.namespace=kyuubi_on_k8s
    spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf
    spark.kubernetes.trust.certificates=true
    spark.kubernetes.file.upload.path=hdfs:///user/spark/k8s_upload

Edit /kyuubi/docker/kyuubi-deployment.yaml
1. Update the metadata: namespace
2. Update the image: image
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-deployment-example
  labels:
    app: kyuubi-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kyuubi-server
  template:
    metadata:
      labels:
        app: kyuubi-server
    spec:
      imagePullSecrets:
        - name: harbor-pull
      containers:
        - name: kyuubi-server
          # TODO: replace this with the stable tag
          image: ***.***.***.***/bigdata/kyuubi:v1.6.0
          #image: apache/kyuubi:master-snapshot
          imagePullPolicy: Always
          env:
            - name: KYUUBI_JAVA_OPTS
              value: -Dkyuubi.frontend.bind.host=0.0.0.0
          ports:
            - name: frontend-port
              containerPort: 10009
              protocol: TCP
          volumeMounts:
            - name: kyuubi-defaults
              mountPath: /opt/kyuubi/conf
      volumes:
        - name: kyuubi-defaults
          configMap:
            name: kyuubi-defaults
          #secret:
          #  secretName: kyuubi-defaults

Edit /kyuubi/docker/kyuubi-service.yaml
1. Update the metadata: namespace
apiVersion: v1
kind: Service
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-example-service
spec:
  ports:
    # The default port limit is 30000-32767
    # to change:
    #   vim kube-apiserver.yaml (usually under path: /etc/kubernetes/manifests/)
    #   add or change line 'service-node-port-range=1-32767' under kube-apiserver
    - nodePort: 30009
      # same of containerPort in pod yaml
      port: 10009
      protocol: TCP
  type: NodePort
  selector:
    # same of pod label
    app: kyuubi-server
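The NodePort above exposes the Kyuubi frontend as port 30009 on every cluster node. If NodePort is not reachable from the client, a port-forward is an alternative sketch (namespace and service name as in the YAML above):

kubectl -n ****-bd-k8s port-forward svc/kyuubi-example-service 10009:10009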

04
Running the Kyuubi Service from a k8s Client Node

Apply the ConfigMap:

kubectl apply -f docker/kyuubi-configmap.yaml

Apply the Deployment:

kubectl apply -f docker/kyuubi-deployment.yaml

Apply the Service:

kubectl apply -f docker/kyuubi-service.yaml
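Before connecting, it is worth confirming that the server pod is healthy (names and namespace as defined in the YAML files above):

# pod should be Running, and the log should show the KyuubiServer starting
kubectl -n ****-bd-k8s get pods -l app=kyuubi-server
kubectl -n ****-bd-k8s logs deployment/kyuubi-deployment-example | tail -n 20
kubectl -n ****-bd-k8s get svc kyuubi-example-service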
05
Connecting Locally with beeline from the Kyuubi Client Node
./bin/beeline -u 'jdbc:hive2://***.***.***.***:30009/default;principal=hive/***.***.***.***@HADOOP.****.TECH?spark.master=k8s://https://****.****.****/****/****/****;spark.submit.deployMode=cluster;spark.kubernetes.namespace=****-bd-k8s;spark.kubernetes.container.image.pullSecrets=harbor-pull;spark.kubernetes.authenticate.driver.serviceAccountName=flink;spark.kubernetes.trust.certificates=true;spark.kubernetes.executor.podNamePrefix=kyuubi-on-k8s;spark.kubernetes.container.image=***.***.***.***/bigdata/spark:v3.3.0;spark.dynamicAllocation.shuffleTracking.enabled=true;spark.dynamicAllocation.enabled=true;spark.dynamicAllocation.maxExecutors=10;spark.dynamicAllocation.minExecutors=5;spark.executor.instances=5;spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf' "$@"
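Once connected, a trivial query is enough to confirm that Kyuubi launches a Spark engine on k8s; watching the namespace from another terminal shows the driver and executor pods appear. A sketch, reusing the JDBC URL from the command above:

# terminal 1: smoke test through the same JDBC URL as above
./bin/beeline -u '<JDBC URL as above>' -e 'SELECT 1'
# terminal 2: watch the engine pods get created
kubectl -n ****-bd-k8s get pods -w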
06
Results

[Screenshots: query results on the TPCDS dataset]
The Apache Kyuubi Twitter account is now live.

Search for "Apache Kyuubi" on Twitter, or open the link below in a browser to follow:

https://twitter.com/KyuubiApache

You can also join the Apache Kyuubi Slack to chat with overseas developers:

https://join.slack.com/t/apachekyuubi/shared_invite/zt-1e1qw68g4-yE5HJsVVDin~ABtZISyuxg

Finally, a reminder from Kyuubi: browse the web civilly and scientifically.


This article is reposted from 刘振业's post on the Apache Kyuubi WeChat account; original link: https://mp.weixin.qq.com/s/KK2I5pclU6QqgSw49FCKHg.
