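# elasticsearch-analysis-dynamic-synonym

**Repository Path**: luoxiang723/elasticsearch-analysis-dynamic-synonym

## Basic Information

- **Project Name**: elasticsearch-analysis-dynamic-synonym
- **Description**: The author's synonym repository
- **Primary Language**: Java
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: https://gitee.com/bubaiwantong/elasticsearch-analysis-dynamic-synonym
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 4
- **Created**: 2025-06-21
- **Last Updated**: 2025-06-21

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Installing the dynamic-synonym plugin for Elasticsearch

This post walks through installing the dynamic-synonym plugin in ES. First, download the plugin source from GitHub for the version that matches your ES version. Most versions on GitHub only support a local dictionary file or a remote text dictionary; source that uses a database as the dictionary can be found on Gitee. The general idea is to adjust a few configuration values, create a table to act as the synonym dictionary, drop the packaged plugin jar into the es-plugins directory, and restart ES.

In practice it did not work for me at first, and I ran into quite a few problems. I run ES in a Docker container, and the container kept reporting Java permission errors. I had to piece the fix together from many sources online before it finally worked.

The overall steps are:

1. Download the source and modify the dynamic-synonym configuration
2. Add the MySQL code
3. Create a dynamic-synonym table
4. Modify the java.policy file of the ES container in Docker **(very important)**
5. Put the packaged jar into the {es-root}/es-plugins directory
6. Restart the ES container with Docker
7. Create an ES index that uses dynamic-synonym and test it

**The plugin code I have already configured is linked at the end of this article.** Depending on your needs, you can jump straight to section 4 or 5.

## 1. Download the source and modify the configuration

There are a great many forks of this plugin on GitHub. I recommend downloading the original source directly: https://github.com/bells/elasticsearch-analysis-dynamic-synonym. After downloading, switch to the code branch that matches your ES version (mine is 7.12.1) and adjust the pom file.

![image-20230618095401413](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618095401413.png)

![image-20230618095436842](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618095436842.png)

### 1.1 Modify the pom.xml file

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bellszhu.elasticsearch</groupId>
    <artifactId>elasticsearch-analysis-dynamic-synonym</artifactId>
    <!-- Must match your ES version -->
    <version>7.12.1</version>
    <packaging>jar</packaging>
    <name>elasticsearch-dynamic-synonym</name>
    <description>Analysis-plugin for synonym</description>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <elasticsearch.version>${project.version}</elasticsearch.version>
        <maven.compiler.target>1.8</maven.compiler.target>
        <elasticsearch.plugin.name>analysis-dynamic-synonym</elasticsearch.plugin.name>
        <elasticsearch.assembly.descriptor>${project.basedir}/src/main/assemblies/plugin.xml</elasticsearch.assembly.descriptor>
        <elasticsearch.plugin.classname>com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin</elasticsearch.plugin.classname>
        <elasticsearch.plugin.jvm>true</elasticsearch.plugin.jvm>
    </properties>

    <licenses>
        <license>
            <name>The Apache Software License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
        </license>
    </licenses>

    <parent>
        <groupId>org.sonatype.oss</groupId>
        <artifactId>oss-parent</artifactId>
        <version>9</version>
    </parent>

    <scm>
        <connection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git</connection>
        <developerConnection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git</developerConnection>
        <url>https://github.com/bells/elasticsearch-analysis-dynamic-synonym</url>
    </scm>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
        </dependency>
        <dependency>
            <groupId>org.codelibs.elasticsearch.module</groupId>
            <artifactId>analysis-common</artifactId>
            <version>7.10.2</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.13.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.13</version>
        </dependency>
        <!-- MySQL driver used by the database-backed synonym source -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.22</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.13.2</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.11.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.codelibs</groupId>
            <artifactId>elasticsearch-cluster-runner</artifactId>
            <version>7.10.2.0</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>${maven.compiler.target}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.11</version>
                <configuration>
                    <includes>
                        <include>**/*Tests.java</include>
                    </includes>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-source-plugin</artifactId>
                <version>2.1.2</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
                        <goals>
                            <goal>jar</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                    <archive>
                        <manifest>
                            <mainClass>fully.qualified.MainClass</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
```

When connecting to MySQL, pay attention to the **MySQL driver jar**: the JDBC URL differs between driver versions.

## 2. Add the MySQL code

### 2.1 Add the MySqlRemoteSynonymFile class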
```java
import com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.elasticsearch.common.io.PathUtils;
import org.elasticsearch.env.Environment;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Properties;

public class MySqlRemoteSynonymFile implements SynonymFile {

    /** Name of the database configuration file */
    private final static String DB_PROPERTIES = "jdbc-reload.properties";

    private static Logger logger = LogManager.getLogger("dynamic-synonym");

    private String format;
    private boolean expand;
    private boolean lenient;
    private Analyzer analyzer;
    private Environment env;

    // Database configuration
    private String location;

    /** Database URL */
    private static final String JDBC_URL = "jdbc.url";
    /** Database driver */
    private static final String JDBC_DRIVER = "jdbc.driver";
    /** Database user name */
    private static final String JDBC_USER = "jdbc.user";
    /** Database password */
    private static final String JDBC_PASSWORD = "jdbc.password";

    /** Synonym version currently loaded on this node */
    private LocalDateTime thisSynonymVersion = LocalDateTime.now();

    private static Connection connection = null;
    private Statement statement = null;
    private Properties props;
    private Path conf_dir;

    MySqlRemoteSynonymFile(Environment env, Analyzer analyzer,
                           boolean expand, boolean lenient, String format, String location) {
        this.analyzer = analyzer;
        this.expand = expand;
        this.format = format;
        this.lenient = lenient;
        this.env = env;
        this.location = location;
        this.props = new Properties();

        // Resolve the "config" directory next to the plugin jar
        Path filePath = PathUtils.get(new File(DynamicSynonymPlugin.class.getProtectionDomain().getCodeSource()
                .getLocation().getPath())
                .getParent(), "config")
                .toAbsolutePath();
        this.conf_dir = filePath.resolve(DB_PROPERTIES);

        // Load the database settings; the stream is closed automatically
        File configFile = conf_dir.toFile();
        try (InputStream input = new FileInputStream(configFile)) {
            props.load(input);
        } catch (FileNotFoundException e) {
            logger.info("Database config file jdbc-reload.properties not found", e);
        } catch (IOException e) {
            logger.error("Failed to load database config file jdbc-reload.properties", e);
        }
        // Sync thisSynonymVersion with the database on startup
        isNeedReloadSynonymMap();
    }

    /**
     * Load the synonym dictionary into a SynonymMap.
     *
     * @return SynonymMap
     */
    @Override
    public SynonymMap reloadSynonymMap() {
        try {
            logger.info("start reload MySQL synonym from {}.", location);
            Reader rulesReader = getReader();
            SynonymMap.Builder parser = RemoteSynonymFile.getSynonymParser(
                    rulesReader, format, expand, lenient, analyzer);
            return parser.build();
        } catch (Exception e) {
            logger.error("reload MySQL synonym {} error! cause: {}", location, e.getMessage());
            throw new IllegalArgumentException(
                    "could not reload MySQL synonyms to build synonyms", e);
        }
    }

    /**
     * Decide whether the synonyms need to be reloaded.
     *
     * @return true or false
     */
    @Override
    public boolean isNeedReloadSynonymMap() {
        try {
            LocalDateTime mysqlLastModify = getMySqlSynonymLastModify();
            // null means the query failed or the table is empty
            if (mysqlLastModify != null && !thisSynonymVersion.isEqual(mysqlLastModify)) {
                thisSynonymVersion = mysqlLastModify;
                return true;
            }
        } catch (Exception e) {
            logger.error(e);
        }
        return false;
    }

    /**
     * Read the synonym version (latest modification time) from MySQL.
     * Used to decide whether the synonyms need to be reloaded.
     *
     * @return the latest last_modify value, or null if none is available
     */
    public LocalDateTime getMySqlSynonymLastModify() {
        LocalDateTime mysqlSynonymLastModify = null;
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            try (ResultSet resultSet = statement.executeQuery(
                    props.getProperty("jdbc.reload.swith.synonym.last_modify"))) {
                while (resultSet.next()) {
                    Timestamp lastModify = resultSet.getTimestamp("last_modify");
                    // MAX(last_modify) is NULL when the table is empty
                    if (lastModify != null) {
                        mysqlSynonymLastModify = lastModify.toLocalDateTime();
                    }
                }
            }
        } catch (SQLException e) {
            logger.error("failed to query synonym last_modify", e);
        }
        return mysqlSynonymLastModify;
    }

    /**
     * Query the synonyms stored in the database.
     *
     * @return one entry per row of the words column
     */
    public ArrayList<String> getDbData() {
        ArrayList<String> arrayList = new ArrayList<>();
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            logger.info("Running synonym query, SQL: {}", props.getProperty("jdbc.reload.synonym.sql"));
            try (ResultSet resultSet = statement.executeQuery(
                    props.getProperty("jdbc.reload.synonym.sql"))) {
                while (resultSet.next()) {
                    String theWord = resultSet.getString("words");
                    arrayList.add(theWord);
                }
            }
        } catch (SQLException e) {
            logger.error(e);
        }
        return arrayList;
    }

    /**
     * Build a Reader over the synonym rules.
     *
     * @return Reader
     */
    @Override
    public Reader getReader() {
        StringBuilder sb = new StringBuilder();
        try {
            ArrayList<String> dbData = getDbData();
            for (String dbDatum : dbData) {
                logger.info("Loading synonym: {}", dbDatum);
                // Each row holds one group of words separated by ASCII commas
                sb.append(dbDatum)
                        .append(System.getProperty("line.separator"));
            }
        } catch (Exception e) {
            logger.error("Failed to load synonyms", e);
        }
        return new StringReader(sb.toString());
    }

    /**
     * Open the shared connection if needed and create a statement on it.
     *
     * @param props configuration
     * @throws SQLException if the connection cannot be obtained
     */
    private static Statement getConnection(Properties props) throws SQLException {
        try {
            Class.forName(props.getProperty(JDBC_DRIVER));
        } catch (ClassNotFoundException e) {
            logger.error("Failed to load JDBC driver", e);
        }
        if (connection == null) {
            connection = DriverManager.getConnection(
                    props.getProperty(JDBC_URL),
                    props.getProperty(JDBC_USER),
                    props.getProperty(JDBC_PASSWORD));
        }
        return connection.createStatement();
    }
}
```

### 2.2 Add the MySQL source to getSynonymFile

Modify the resource-loading code in `DynamicSynonymTokenFilterFactory`:

```java
SynonymFile getSynonymFile(Analyzer analyzer) {
    try {
        SynonymFile synonymFile;
        if ("MySql".equals(location)) {
            // synonyms_path set to "MySql": read synonyms from the database
            synonymFile = new MySqlRemoteSynonymFile(environment, analyzer, expand, lenient, format, location);
        } else if (location.startsWith("http://") || location.startsWith("https://")) {
            synonymFile = new RemoteSynonymFile(
                    environment, analyzer, expand, lenient, format, location);
        } else {
            synonymFile = new LocalSynonymFile(
                    environment, analyzer, expand, lenient, format, location);
        }
        if (scheduledFuture == null) {
            // Poll for changes every `interval` seconds
            scheduledFuture = pool.scheduleAtFixedRate(new Monitor(synonymFile),
                    interval, interval, TimeUnit.SECONDS);
        }
        return synonymFile;
    } catch (Exception e) {
        logger.error("failed to get synonyms: " + location, e);
        throw new IllegalArgumentException("failed to get synonyms : " + location, e);
    }
}
```

## 3. Create a dynamic-synonym table

### 3.1 Create the database and table
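Before creating the table, it may help to see how the plugin uses its `last_modify` column: the scheduled monitor compares the newest `last_modify` value in the table with the version held in memory and reloads only when they differ. The following is a minimal sketch of that comparison in isolation. `SynonymVersionTracker` and `needsReload` are hypothetical names used for illustration only; they are not part of the plugin.

```java
import java.time.LocalDateTime;

// Hypothetical, simplified stand-in for the version check in MySqlRemoteSynonymFile.
class SynonymVersionTracker {

    // Version of the synonyms currently loaded on this node.
    private LocalDateTime loadedVersion;

    SynonymVersionTracker(LocalDateTime initialVersion) {
        this.loadedVersion = initialVersion;
    }

    /**
     * @param dbLastModify value of MAX(last_modify) read from the table,
     *                     or null when the table is empty or the query failed
     * @return true when the synonyms should be reloaded
     */
    boolean needsReload(LocalDateTime dbLastModify) {
        if (dbLastModify == null) {
            return false; // nothing to compare against
        }
        if (dbLastModify.isEqual(loadedVersion)) {
            return false; // same version, no reload
        }
        loadedVersion = dbLastModify; // remember the new version
        return true;
    }

    public static void main(String[] args) {
        LocalDateTime t0 = LocalDateTime.of(2022, 1, 5, 16, 48, 24);
        SynonymVersionTracker tracker = new SynonymVersionTracker(t0);
        System.out.println(tracker.needsReload(t0));                // false
        System.out.println(tracker.needsReload(t0.plusMinutes(5))); // true
        System.out.println(tracker.needsReload(t0.plusMinutes(5))); // false
        System.out.println(tracker.needsReload(null));              // false
    }
}
```

One consequence of keying the check on `MAX(last_modify)`: inserting or updating a row moves the maximum and triggers a reload, but deleting a row that is not the most recently modified one leaves the maximum unchanged, so that deletion is not picked up until some other row changes.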
My database is named `word` and the table is named `synonym`.

```mysql
/*
 Navicat Premium Data Transfer

 Source Server         : localhost
 Source Server Type    : MySQL
 Source Server Version : 50717
 Source Host           : localhost:3306
 Source Schema         : auth

 Target Server Type    : MySQL
 Target Server Version : 50717
 File Encoding         : 65001

 Date: 05/01/2022 17:01:31
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for synonym
-- ----------------------------
DROP TABLE IF EXISTS `synonym`;
CREATE TABLE `synonym` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `words` text CHARACTER SET utf8 COLLATE utf8_bin NULL COMMENT 'synonyms',
  `last_modify` timestamp(0) NULL DEFAULT CURRENT_TIMESTAMP(0) ON UPDATE CURRENT_TIMESTAMP(0) COMMENT 'last update time',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8 COLLATE = utf8_bin ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of synonym
-- ----------------------------
INSERT INTO `synonym` VALUES (1, '西红柿,番茄,洋柿子', '2022-01-05 16:48:24');

SET FOREIGN_KEY_CHECKS = 1;
```

### 3.2 Modify the database connection configuration file

In the project, add the file `config/jdbc-reload.properties` at the same level as the `src` directory.

```properties
# permission java.net.SocketPermission "*", "connect,resolve";
# CHCP 65001
jdbc.url=jdbc:mysql://192.168.255.132:3306/word?serverTimezone=GMT
jdbc.user=root
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.password=123456
# Query the synonym list
jdbc.reload.synonym.sql=select words from synonym
# Query the last update time
jdbc.reload.swith.synonym.last_modify=SELECT MAX(last_modify) last_modify FROM synonym
```

## 4. Modify the java.policy file of the ES container in Docker **(very important)**

I deploy ES as a Docker container. If ES is installed directly on Windows or CentOS, modify the `java.policy` file of the JDK that ES depends on, that is, the system JDK. With a Docker deployment, editing the system JDK's `java.policy` has no effect, because the containerized ES runs on its own JDK inside the container, independent of the host system's JDK.

### 4.1 Locate java.policy

First enter the container with `docker exec -it es /bin/bash`, then change to the `/usr/share/elasticsearch/jdk/conf/security/` directory and find the `java.policy` file.

![image-20230618101229748](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618101229748.png)

```shell
[root@localhost ~]# docker exec -it es /bin/bash
[root@ee5fd3f35131 elasticsearch]# cd /usr/share/elasticsearch/jdk/conf/security/
[root@ee5fd3f35131 security]# ls
java.policy  java.security  policy
[root@ee5fd3f35131 security]# vi java.policy
```

### 4.2 Modify the java.policy file

![image-20230618101541506](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618101541506.png)

The full content of the modified file is below. The five `permission` lines at the end of the default `grant` block, from `java.net.SocketPermission "*"` onward, are the additions; they are not part of the stock file.

```text
//
// This system policy file grants a set of default permissions to all domains
// and can be configured to grant additional permissions to modules and other
// code sources. The code source URL scheme for modules linked into a
// run-time image is "jrt".
//
// For example, to grant permission to read the "foo" property to the module
// "com.greetings", the grant entry is:
//
// grant codeBase "jrt:/com.greetings" {
//     permission java.util.PropertyPermission "foo", "read";
// };
//

grant codeBase "file:${{java.ext.dirs}}/*" {
    permission java.security.AllPermission;
};

// default permissions granted to all domains
grant {
    // allows anyone to listen on dynamic ports
    permission java.net.SocketPermission "localhost:0", "listen";

    // "standard" properies that can be read by anyone
    permission java.util.PropertyPermission "java.version", "read";
    permission java.util.PropertyPermission "java.vendor", "read";
    permission java.util.PropertyPermission "java.vendor.url", "read";
    permission java.util.PropertyPermission "java.class.version", "read";
    permission java.util.PropertyPermission "os.name", "read";
    permission java.util.PropertyPermission "os.version", "read";
    permission java.util.PropertyPermission "os.arch", "read";
    permission java.util.PropertyPermission "file.separator", "read";
    permission java.util.PropertyPermission "path.separator", "read";
    permission java.util.PropertyPermission "line.separator", "read";
    permission java.util.PropertyPermission "java.specification.version", "read";
    permission java.util.PropertyPermission "java.specification.vendor", "read";
    permission java.util.PropertyPermission "java.specification.name", "read";
    permission java.util.PropertyPermission "java.vm.specification.version", "read";
    permission java.util.PropertyPermission "java.vm.specification.vendor", "read";
    permission java.util.PropertyPermission "java.vm.specification.name", "read";
    permission java.util.PropertyPermission "java.vm.version", "read";
    permission java.util.PropertyPermission "java.vm.vendor", "read";
    permission java.util.PropertyPermission "java.vm.name", "read";

    permission java.net.SocketPermission "*", "connect,resolve";
    permission java.lang.RuntimePermission "setContextClassLoader";
    permission java.lang.RuntimePermission "accessDeclaredMembers";
    permission java.lang.RuntimePermission "createClassLoader";
    permission java.security.AllPermission;
};
```

## 5. Put the packaged jar into the {es-root}/es-plugins directory

### 5.1 Check your ES version number before packaging

![image-20230618102014020](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618102014020.png)

### 5.2 After packaging, unzip the archive and upload it to the ES plugins directory on the server

I deploy with a Docker container. On a local Windows install, just find the plugins directory and put the files there.

![image-20230618102152704](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618102152704.png)

## 6. Restart the ES container with Docker

If ES is installed directly on the system, restart it from the elasticsearch/bin directory. Mine is deployed as a container.

```shell
docker restart es
```

After the container restarts, check the Docker console output for problems. Permission errors almost always mean the `java.policy` file is not configured correctly. For database errors, create a small local Java project and try the same connection to see whether it works.

```shell
docker logs -f es
```

## 7. Create an ES index that uses dynamic-synonym and test it

```json
PUT synonyms_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "synonym": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["synonym_custom"]
        }
      },
      "filter": {
        "synonym_custom": {
          "type": "dynamic_synonym",
          "synonyms_path": "MySql"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "synonym"
      }
    }
  }
}
```

```json
GET /synonyms_index/_analyze
{
  "text": "西红柿",
  "analyzer": "synonym"
}
```
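To make the expected result concrete: each row of the `words` column is one comma-separated group, so with the sample row in the `synonym` table, analyzing any one of the three words should typically yield tokens for all three. The sketch below illustrates that expansion on plain strings, assuming the filter's `expand` option is enabled and the rules are in the comma-separated Solr style. `SynonymGroupSketch` is a hypothetical helper for illustration only. It is not Lucene's parser, which also handles `=>` mappings and runs every term through the analyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of how one comma-separated rule line expands.
class SynonymGroupSketch {

    /**
     * Split one rule line into terms and map every term to the whole group.
     * Blank entries are skipped; insertion order is preserved.
     */
    static Map<String, List<String>> expand(String ruleLine) {
        List<String> group = new ArrayList<>();
        for (String term : ruleLine.split(",")) {
            String trimmed = term.trim();
            if (!trimmed.isEmpty()) {
                group.add(trimmed);
            }
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String term : group) {
            result.put(term, new ArrayList<>(group));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same row as the sample record in the synonym table
        expand("西红柿,番茄,洋柿子")
                .forEach((term, synonyms) -> System.out.println(term + " -> " + synonyms));
    }
}
```

If the analyze response contains only the input word, the synonyms were most likely not loaded; the container log is the first place to look.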
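![image-20230618103510795](https://csdn-blog-picture.oss-cn-guangzhou.aliyuncs.com/img/image-20230618103510795.png)

If you see a result like this, the plugin is running successfully.

Delete the test index when you are done:

```json
DELETE synonyms_index
```

## 8. Summary

### 8.1 Source code

Getting this working took me about a full day. To save you the time, you can download my already-configured [source](https://gitee.com/bubaiwantong/elasticsearch-analysis-dynamic-synonym) directly.

### 8.2 Closing notes

After a day of research, I finally have a rough understanding of how ES plugins run. That lays the groundwork for later work on autocomplete, search optimization, ad recommendation, and aggregation queries. I will add blog posts on those features when I build them. Finally, thank you all for your support.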