openLooKeng / hetu-core

[1.4.0RC2][on-yarn] hetu-cli fails with untrusted certificate exception when connecting to a secure OLK instance.

Status: Todo
Type: Defect
Created: 2021-09-15 18:01

Software Environment:

  • openLooKeng version (source or binary):
    tar package of September 15
  • OS platform & distribution (e.g., Linux Ubuntu 16.04):
    CentOS 7.6
  • Java version:
    jdk1.8.0

Describe the current behavior

Deploying a secure (SSL-enabled) cluster fails.

Describe the expected behavior

Steps to reproduce the issue

1. Enable SSL communication by adding the SSL-related settings to config.properties.template.
   One question:
   [screenshot]
   This file is local. How do I upload it to Yarn?
2. I simply ran the command, and the error was reported.

[screenshot]

Related log/screenshots

Special notes for this issue

Comments (15)

@dengran You have not selected a milestone. Please select one. After setting the milestone, you can use the /check-milestone command to remove the needs-milestone label.

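dengran created the Defect
dengran set the associated repository to openLooKeng/hetu-core
i-robot added the needs-milestone label
dengran set the assignee to Faras Mohan Dewal
dengran removed the needs-milestone label
dengran added the bug label
dengran added the m/OnYarn label
dengran set the associated project to Proj-openLooKeng
dengran set the milestone to 20210930-v1.4.0
dengran set the planned due date to 2021-09-16
dengran set the planned start date to 2021-09-15
dengran changed the planned due date from 2021-09-16 to 2021-09-21
dengran changed the planned due date from 2021-09-21 to 2021-09-22
dengran set the priority to Major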

Issue reproduced.
The advanced configuration section of the documentation also needs to be updated.

Mike changed the task status from Todo to Fixing
Mike changed the task status from Fixing to ToTest

I didn't find any information about secure clusters on the wiki.

dengran changed the task status from ToTest to Todo

I investigated this and was able to start an SSL cluster in my development environment, but the steps were a bit awkward.

Here is how I set up the keystore.jks file:

  • I had to put keystore.jks on the machine manually; I couldn't get the script to upload it to or download it from HDFS automatically. It works to upload the file to HDFS by hand (hdfs dfs -put <the file> <the destination in HDFS>) and then download it on the destination machine (hdfs dfs -get <file location in HDFS> <destination on the machine>). In my example, the keystore file is at /tmp/keystore.jks on my machine.
  • The name used when creating the keystore file has to be the same as the machine's fully qualified domain name.
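
The second point can be sketched as a small Python helper that builds a keytool command whose CN is the node's FQDN. This is only an illustration: the alias `olk`, the validity period, and the password are placeholders that do not come from this thread, and `socket.getfqdn()` is assumed to return the name the other nodes use to reach this machine. The `keytool` flags are the standard JDK ones.

```python
import socket


def build_keystore_cmd(fqdn, keystore_path, password, alias="olk"):
    """Return a keytool argv that creates a key pair whose CN is the node's FQDN."""
    return [
        "keytool", "-genkeypair",
        "-alias", alias,            # placeholder alias, not from the thread
        "-keyalg", "RSA",
        "-keysize", "2048",
        "-validity", "365",         # placeholder validity in days
        "-dname", f"CN={fqdn}",     # same value as "What is your first and last name?"
        "-keystore", keystore_path,
        "-storepass", password,
        "-keypass", password,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    cmd = build_keystore_cmd(socket.getfqdn(), "/tmp/keystore.jks", "keystorepassword")
    print(" ".join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine that already has a keystore at that path.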

Here is the config.properties.template that I used on coordinator nodes:

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
query.max-memory=10GB
query.max-total-memory=10GB
query.max-memory-per-node=2GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=https://_OLK_PROPERTIES_TEMPLATE_CN_DISCOVERY_IP:_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
hetu.multiple-coordinator.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED
hetu.embedded-state-store.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED
hetu.queryeditor-ui.allow-insecure-over-http=true

http-server.http.enabled=false
node.internal-address-source=FQDN
http-server.https.enabled=true
http-server.https.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
http-server.https.keystore.path=/tmp/keystore.jks
http-server.https.keystore.key=keystorepassword
internal-communication.https.required=true
internal-communication.https.keystore.path=/tmp/keystore.jks
internal-communication.https.keystore.key=keystorepassword

Here is the config.properties.template that I used on worker nodes:

coordinator=false
node-scheduler.include-coordinator=false
http-server.http.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
query.max-memory=10GB
query.max-total-memory=10GB
query.max-memory-per-node=2GB
query.max-total-memory-per-node=2GB
discovery.uri=https://_OLK_PROPERTIES_TEMPLATE_CN_DISCOVERY_IP:_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
hetu.multiple-coordinator.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED

http-server.http.enabled=false
node.internal-address-source=FQDN
http-server.https.enabled=true
http-server.https.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
http-server.https.keystore.path=/tmp/keystore.jks
http-server.https.keystore.key=keystorepassword
internal-communication.https.required=true
internal-communication.https.keystore.path=/tmp/keystore.jks
internal-communication.https.keystore.key=keystorepassword
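
Templates like the two above can be sanity-checked before they are submitted to Yarn. The following is a minimal sketch, not part of olk-on-yarn: `parse_properties` and `check_ssl` are hypothetical helper names, the set of keys checked is simply the SSL-related keys that appear in these templates, and "last value wins" for a repeated key is an assumption about how the server reads the file.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blanks and '#' comments.

    Returns (props, duplicates). For a repeated key the last value is kept,
    which is an assumption about how the server treats repeats.
    """
    props, duplicates = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        if key in props:
            duplicates.append(key)
        props[key] = value
    return props, duplicates


def check_ssl(props):
    """Return a list of human-readable problems with the SSL-related keys."""
    problems = []
    if props.get("http-server.https.enabled") != "true":
        problems.append("http-server.https.enabled is not true")
    if not props.get("discovery.uri", "").startswith("https://"):
        problems.append("discovery.uri does not use https://")
    for prefix in ("http-server.https", "internal-communication.https"):
        for suffix in ("keystore.path", "keystore.key"):
            if not props.get(f"{prefix}.{suffix}"):
                problems.append(f"{prefix}.{suffix} is missing")
    if props.get("internal-communication.https.required") != "true":
        problems.append("internal-communication.https.required is not true")
    return problems


if __name__ == "__main__":
    sample = "http-server.https.enabled=true\ndiscovery.uri=http://host:8080\n"
    props, duplicates = parse_properties(sample)
    print(duplicates, check_ssl(props))
```

Reporting duplicated keys separately is deliberate: a template that defines the same key twice is easy to misread, even if the file still loads.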

Could you try the steps from the above comment and see if that successfully starts the SSL cluster in your environment?

I think there's something wrong with it.
The startup is successful.
[screenshot]
But:
[screenshot]
The command did not return for a long time, so I stopped it manually.
Then I also set http-server.http.enabled=true:
[screenshot]
[screenshot]
[screenshot]
[screenshot]
No workers show up.

yumei added the 1.4.0 label
yumei added the r/RC2 label

We think this is related to an untrusted SSL certificate. (Are there also errors when starting a cluster with SSL manually, without Yarn?)
Please ensure that the certificate used by OLK is trusted by Java.
The “Java Truststore File for TLS” section of this page might be relevant: https://openlookeng.io/docs/docs/security/tls.html

yumei changed the milestone from 20210930-v1.4.0 to 20211230-1.5.0
Mike changed the assignee from Faras Mohan Dewal to Mike
Mike added yumei as a collaborator
Mike added Faras Mohan Dewal as a collaborator

This issue is not specific to olk-on-yarn. We need to update our OLK docs to explain how the CLI can trust the server-side certificate. Lowering the priority and removing the on-yarn label.

Mike removed the m/OnYarn label
Mike changed the priority from Major to Minor
Mike edited the title
Mike changed the task status from Analysing to Todo

The -Djavax.net.ssl.trustStore=client_truststore.jks Java option needs to be passed when calling the hetu-cli:
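
One way such an invocation might be assembled, sketched in Python so that each piece is explicit. The jar file name and the server address are hypothetical; only the `-Djavax.net.ssl.trustStore` option comes from this comment. `javax.net.ssl.trustStorePassword` is the standard JSSE companion property, and `--server` is assumed to be the CLI option that takes the coordinator URL.

```python
def build_cli_cmd(cli_jar, server, truststore, truststore_password=None):
    """Return a java argv that runs the CLI jar with a custom truststore."""
    cmd = ["java", f"-Djavax.net.ssl.trustStore={truststore}"]
    if truststore_password is not None:
        cmd.append(f"-Djavax.net.ssl.trustStorePassword={truststore_password}")
    # JVM options must come before -jar; everything after the jar goes to the CLI.
    cmd += ["-jar", cli_jar, "--server", server]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_cli_cmd(
        "hetu-cli-executable.jar",               # hypothetical file name
        "https://coordinator.example.com:9090",  # hypothetical address
        "client_truststore.jks",
    )))
```

The host in the server URL should match the name in the server certificate; otherwise the connection can still be rejected even with the truststore in place.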

Here are the steps I used to generate the keystore, truststore, and certificate:
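
The exact commands are not included in this comment. As a rough guide only, a common `keytool` sequence is to export the server certificate from an existing keystore and import it into a client truststore. The sketch below builds those two commands; the alias `olk`, the file name `olk.cer`, and the passwords are placeholders, and the steps actually used here may have differed.

```python
def build_truststore_cmds(keystore, truststore, cert_file,
                          keystore_pw, truststore_pw, alias="olk"):
    """Return [export_cmd, import_cmd] as keytool argv lists."""
    export_cmd = [
        "keytool", "-exportcert",
        "-alias", alias,              # placeholder alias
        "-keystore", keystore,
        "-storepass", keystore_pw,
        "-rfc",                       # write the certificate in PEM form
        "-file", cert_file,
    ]
    import_cmd = [
        "keytool", "-importcert",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", truststore,      # created if it does not exist yet
        "-storepass", truststore_pw,
        "-noprompt",                  # do not ask "Trust this certificate?"
    ]
    return [export_cmd, import_cmd]


if __name__ == "__main__":
    for cmd in build_truststore_cmds("keystore.jks", "client_truststore.jks",
                                     "olk.cer", "keystorepassword", "trustpassword"):
        print(" ".join(cmd))
```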


Mike added Mike as a collaborator
Mike changed the assignee from Mike to lilianyuan_c78e
yumei added the m/OnYarn label

I generated the keystore, truststore, and certificate by following your steps.
[screenshot]
[screenshot]
[screenshot]

Here is the config.properties.template that I used on coordinator nodes:

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
query.max-memory=10GB
query.max-total-memory=10GB
query.max-memory-per-node=2GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=_OLK_PROPERTIES_TEMPLATE_DISCOVERY_URI
hetu.multiple-coordinator.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED
hetu.embedded-state-store.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED
hetu.queryeditor-ui.allow-insecure-over-http=true

node.internal-address=51-38-77-19.huawei.com
discovery.uri=https://51-38-77-19.huawei.com:9090
#node.internal-address-source=FQDN
http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.port=9090
#http-server.https.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
http-server.https.keystore.path=/opt/keystore.jks
http-server.https.keystore.key=Huawei@123
internal-communication.https.required=true
internal-communication.https.keystore.path=/opt/keystore.jks
internal-communication.https.keystore.key=Huawei@123

Here is the config.properties.template that I used on worker nodes:

coordinator=false
node-scheduler.include-coordinator=false
http-server.http.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
query.max-memory=10GB
query.max-total-memory=10GB
query.max-memory-per-node=2GB
query.max-total-memory-per-node=2GB
discovery.uri=_OLK_PROPERTIES_TEMPLATE_DISCOVERY_URI
hetu.multiple-coordinator.enabled=_OLK_PROPERTIES_TEMPLATE_HA_ENABLED

node.internal-address=51-38-77-19.huawei.com
discovery.uri=https://51-38-77-19.huawei.com:9090
#node.internal-address-source=FQDN
http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.port=9090
#http-server.https.port=_OLK_PROPERTIES_TEMPLATE_HTTP_SERVER_PORT
http-server.https.keystore.path=/opt/keystore.jks
http-server.https.keystore.key=Huawei@123
internal-communication.https.required=true
internal-communication.https.keystore.path=/opt/keystore.jks
internal-communication.https.keystore.key=Huawei@123

python3 olk_on_yarn.py start -ha true -cn 2 -w 2 --use-nginx true --nginx-port 8070
[screenshot]

[screenshot]

If I set 'http-server.http.enabled=true', I can access the WebUI, but there is no worker.

When creating the keystore file, the section for "What is your first and last name?" needs to be the same as the machine's fully qualified domain name or unqualified hostname. (I'm not entirely sure which one it's supposed to be, for my environment they're the same. I recommend trying both?)
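
To see which of the two names a keystore actually carries, the CN can be read from the `Owner:` line of `keytool -v -list` output and compared with the names the machine reports. The helper below is a hypothetical sketch: it assumes English keytool output, and `socket.gethostname()` / `socket.getfqdn()` are assumed to reflect the unqualified and fully qualified names in question.

```python
import re
import socket


def owner_cn(keytool_listing):
    """Extract the CN from the first 'Owner:' line of `keytool -v -list` output."""
    match = re.search(r"^Owner:.*?\bCN=([^,\n]+)", keytool_listing, re.MULTILINE)
    return match.group(1).strip() if match else None


def candidate_names():
    """Names this machine is known by: short hostname and FQDN (may be equal)."""
    return {socket.gethostname(), socket.getfqdn()}


def cn_matches_host(keytool_listing):
    """True if the keystore's CN equals the hostname or the FQDN of this machine."""
    return owner_cn(keytool_listing) in candidate_names()
```

The listing text can come from running keytool -v -list -keystore keystore.jks and passing its output to `cn_matches_host`.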

tushengxia changed the task status from Todo to ToTest

Here are the keystore, truststore, and certificate:
[screenshot]
keytool -v -list -keystore keystore.jks
[screenshot]
keytool -v -list -keystore olk_trust.jks
[screenshot]

[screenshot]
There is no http_uri like 'https://51-38-76-121.huawei.com:xxx' for either the coordinator or the workers.

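yumei edited the title
yumei changed the task status from ToTest to Todo
yumei changed the assignee from lilianyuan_c78e to Faras Mohan Dewal
yumei removed Faras Mohan Dewal as a collaborator
yumei added lilianyuan_c78e as a collaborator
Mike changed the milestone from 20220330-v1.6.0 to unset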
