# jstarcraft-rns
**Repository Path**: data_middle_platform_solution/jstarcraft-rns
## Basic Information
- **Project Name**: jstarcraft-rns
- **Description**: 专注于解决推荐领域与搜索领域的两个核心问题:排序预测(Ranking)和评分预测(Rating). 为相关领域的研发人员提供完整的通用设计与参考实现. 涵盖了70多种排序预测与评分预测算法,是最快最全的Java推荐与搜索引擎.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 2
- **Created**: 2020-05-07
- **Last Updated**: 2021-07-05
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# JStarCraft RNS
****
[](https://www.apache.org/licenses/LICENSE-2.0.html)
[](https://tokei.rs/b1/github/HongZhaoHua/jstarcraft-rns?category=lines)
希望路过的同学,顺手给JStarCraft框架点个Star,算是对作者的一种鼓励吧!
****
## 目录
* [介绍](#介绍)
* [特性](#特性)
* [安装](#安装)
* [安装JStarCraft Core框架](#安装JStarCraft-Core框架)
* [安装JStarCraft AI框架](#安装JStarCraft-AI框架)
* [安装JStarCraft RNS引擎](#安装JStarCraft-RNS引擎)
* [使用](#使用)
* [设置依赖](#设置依赖)
* [构建配置器](#构建配置器)
* [训练与评估模型](#训练与评估模型)
* [获取模型](#获取模型)
* [架构](#架构)
* [概念](#概念)
* [为什么需要信息检索](#为什么需要信息检索)
* [搜索与推荐的异同](#搜索与推荐的异同)
* [JStarCraft RNS引擎解决什么问题](#JStarCraft-RNS引擎解决什么问题)
* [Ranking任务与Rating任务之间的区别](#Ranking任务与Rating任务之间的区别)
* [Rating算法能不能用于Ranking问题](#Rating算法能不能用于Ranking问题)
* [示例](#示例)
* [JStarCraft RNS引擎与Groovy脚本交互](#JStarCraft-RNS引擎与Groovy脚本交互)
* [JStarCraft RNS引擎与JS脚本交互](#JStarCraft-RNS引擎与JS脚本交互)
* [JStarCraft RNS引擎与Lua脚本交互](#JStarCraft-RNS引擎与Lua脚本交互)
* [JStarCraft RNS引擎与Python脚本交互](#JStarCraft-RNS引擎与Python脚本交互)
* [对比](#对比)
* [版本](#版本)
* [参考](#参考)
* [个性化模型](#个性化模型)
* [数据集](#数据集)
* [协议](#协议)
* [作者](#作者)
* [致谢](#致谢)
****
## 介绍
**JStarCraft RNS是一个面向信息检索领域的轻量级引擎.遵循Apache 2.0协议.**
专注于解决信息检索领域的基本问题:推荐与搜索.
提供满足工业级别场景要求的推荐引擎设计与实现.
提供满足工业级别场景要求的搜索引擎设计与实现.
****
## 特性
* 1.跨平台
* [2.串行与并行计算](https://github.com/HongZhaoHua/jstarcraft-ai)
* [3.CPU与GPU硬件加速](https://github.com/HongZhaoHua/jstarcraft-ai)
* [4.模型保存与装载](https://github.com/HongZhaoHua/jstarcraft-ai)
* 5.丰富的推荐与搜索算法
* 6.丰富的脚本支持
* Groovy
* JS
* Lua
* MVEL
* Python
* [7.丰富的评估指标](#评估指标)
* [排序指标](#排序指标)
* [评分指标](#评分指标)
****
## 安装
JStarCraft RNS要求使用者具备以下环境:
* JDK 8或者以上
* Maven 3
#### 安装JStarCraft-Core框架
```shell
git clone https://github.com/HongZhaoHua/jstarcraft-core.git
mvn install -Dmaven.test.skip=true
```
#### 安装JStarCraft-AI框架
```shell
git clone https://github.com/HongZhaoHua/jstarcraft-ai.git
mvn install -Dmaven.test.skip=true
```
#### 安装JStarCraft-RNS引擎
```shell
git clone https://github.com/HongZhaoHua/jstarcraft-rns.git
mvn install -Dmaven.test.skip=true
```
****
## 使用
#### 设置依赖
* 设置Maven依赖
```maven
com.jstarcraft
rns
1.0
```
* 设置Gradle依赖
```gradle
compile group: 'com.jstarcraft', name: 'rns', version: '1.0'
```
#### 构建配置器
```java
Properties keyValues = new Properties();
keyValues.load(this.getClass().getResourceAsStream("/data.properties"));
keyValues.load(this.getClass().getResourceAsStream("/recommend/benchmark/randomguess-test.properties"));
Configurator configurator = new Configurator(keyValues);
```
#### 训练与评估模型
* 构建排序任务
```java
RankingTask task = new RankingTask(RandomGuessModel.class, configurator);
// 训练与评估排序模型
task.execute();
```
* 构建评分任务
```java
RatingTask task = new RatingTask(RandomGuessModel.class, configurator);
// 训练与评估评分模型
task.execute();
```
#### 获取模型
```java
// 获取模型
Model model = task.getModel();
```
****
## 架构
****
## 概念
#### 为什么需要信息检索
```
随着信息技术和互联网的发展,人们逐渐从信息匮乏(Information Underload)的时代走入了信息过载(Information Overload)的时代.
无论是信息消费者还是信息生产者都遇到了挑战:
* 对于信息消费者,从海量信息中寻找信息,是一件非常困难的事情;
* 对于信息生产者,从海量信息中暴露信息,也是一件非常困难的事情;
信息检索的任务就是联系用户和信息,一方面帮助用户寻找对自己有价值的信息,另一方面帮助信息暴露给对它感兴趣的用户,从而实现信息消费者和信息生产者的双赢.
```
#### 搜索与推荐的异同
```
从信息检索的角度:
* 搜索和推荐是获取信息的两种主要手段;
* 搜索和推荐是获取信息的两种不同方式;
* 搜索(Search)是主动明确的;
* 推荐(Recommend)是被动模糊的;
搜索和推荐是两个互补的工具.
```
#### JStarCraft-RNS引擎解决什么问题
```
JStarCraft-RNS引擎旨在解决推荐与搜索领域的两个核心任务:排序预测(Ranking)和评分预测(Rating).
```
#### Ranking任务与Rating任务之间的区别
```
根据解决基本问题的不同,将算法与评估指标划分为排序(Ranking)与评分(Rating).
两者之间的根本区别在于目标函数的不同.
通俗点的解释:
Ranking算法基于隐式反馈数据,趋向于拟合用户的排序.(关注度)
Rating算法基于显示反馈数据,趋向于拟合用户的评分.(满意度)
```
#### Rating算法能不能用于Ranking问题
```
关键在于具体场景中,关注度与满意度是否保持一致.
通俗点的解释:
人们关注的东西,并不一定是满意的东西.(例如:个人所得税)
```
****
## 示例
#### JStarCraft-RNS引擎与Groovy脚本交互
* [完整示例](https://github.com/HongZhaoHua/jstarcraft-rns/tree/master/src/test/java/com/jstarcraft/rns/script)
* 编写Groovy脚本训练与评估模型并保存到Model.groovy文件
```groovy
// 构建配置
def keyValues = new Properties();
keyValues.load(loader.getResourceAsStream("data.properties"));
keyValues.load(loader.getResourceAsStream("recommend/benchmark/randomguess-test.properties"));
def configurator = new Configurator(keyValues);
// 此对象会返回给Java程序
def _data = [:];
// 构建排序任务
task = new RankingTask(RandomGuessModel.class, configurator);
// 训练与评估模型并获取排序指标
measures = task.execute();
_data.precision = measures.get(PrecisionEvaluator.class);
_data.recall = measures.get(RecallEvaluator.class);
// 构建评分任务
task = new RatingTask(RandomGuessModel.class, configurator);
// 训练与评估模型并获取评分指标
measures = task.execute();
_data.mae = measures.get(MAEEvaluator.class);
_data.mse = measures.get(MSEEvaluator.class);
_data;
```
* 使用JStarCraft框架从Model.groovy文件加载并执行Groovy脚本
```java
// 获取Groovy脚本
File file = new File(ScriptTestCase.class.getResource("Model.groovy").toURI());
String script = FileUtils.readFileToString(file, StringUtility.CHARSET);
// 设置Groovy脚本使用到的Java类
ScriptContext context = new ScriptContext();
context.useClasses(Properties.class, Configurator.class);
context.useClasses(RankingTask.class, RatingTask.class, RandomGuessModel.class);
context.useClasses(Assert.class, PrecisionEvaluator.class, RecallEvaluator.class, MAEEvaluator.class, MSEEvaluator.class);
// 设置Groovy脚本使用到的Java变量
ScriptScope scope = new ScriptScope();
scope.createAttribute("loader", loader);
// 执行Groovy脚本
ScriptExpression expression = new GroovyExpression(context, scope, script);
Map data = expression.doWith(Map.class);
```
#### JStarCraft-RNS引擎与JS脚本交互
* [完整示例](https://github.com/HongZhaoHua/jstarcraft-rns/tree/master/src/test/java/com/jstarcraft/rns/script)
* 编写JS脚本训练与评估模型并保存到Model.js文件
```js
// 构建配置
var keyValues = new Properties();
keyValues.load(loader.getResourceAsStream("data.properties"));
keyValues.load(loader.getResourceAsStream("recommend/benchmark/randomguess-test.properties"));
var configurator = new Configurator([keyValues]);
// 此对象会返回给Java程序
var _data = {};
// 构建排序任务
task = new RankingTask(RandomGuessModel.class, configurator);
// 训练与评估模型并获取排序指标
measures = task.execute();
_data['precision'] = measures.get(PrecisionEvaluator.class);
_data['recall'] = measures.get(RecallEvaluator.class);
// 构建评分任务
task = new RatingTask(RandomGuessModel.class, configurator);
// 训练与评估模型并获取评分指标
measures = task.execute();
_data['mae'] = measures.get(MAEEvaluator.class);
_data['mse'] = measures.get(MSEEvaluator.class);
_data;
```
* 使用JStarCraft框架从Model.js文件加载并执行JS脚本
```java
// 获取JS脚本
File file = new File(ScriptTestCase.class.getResource("Model.js").toURI());
String script = FileUtils.readFileToString(file, StringUtility.CHARSET);
// 设置JS脚本使用到的Java类
ScriptContext context = new ScriptContext();
context.useClasses(Properties.class, Configurator.class);
context.useClasses(RankingTask.class, RatingTask.class, RandomGuessModel.class);
context.useClasses(Assert.class, PrecisionEvaluator.class, RecallEvaluator.class, MAEEvaluator.class, MSEEvaluator.class);
// 设置JS脚本使用到的Java变量
ScriptScope scope = new ScriptScope();
scope.createAttribute("loader", loader);
// 执行JS脚本
ScriptExpression expression = new JsExpression(context, scope, script);
Map data = expression.doWith(Map.class);
```
#### JStarCraft-RNS引擎与Lua脚本交互
* [完整示例](https://github.com/HongZhaoHua/jstarcraft-rns/tree/master/src/test/java/com/jstarcraft/rns/script)
* 编写Lua脚本训练与评估模型并保存到Model.lua文件
```lua
-- 构建配置
local keyValues = Properties.new();
keyValues:load(loader:getResourceAsStream("data.properties"));
keyValues:load(loader:getResourceAsStream("recommend/benchmark/randomguess-test.properties"));
local configurator = Configurator.new({ keyValues });
-- 此对象会返回给Java程序
local _data = {};
-- 构建排序任务
task = RankingTask.new(RandomGuessModel, configurator);
-- 训练与评估模型并获取排序指标
measures = task:execute();
_data["precision"] = measures:get(PrecisionEvaluator);
_data["recall"] = measures:get(RecallEvaluator);
-- 构建评分任务
task = RatingTask.new(RandomGuessModel, configurator);
-- 训练与评估模型并获取评分指标
measures = task:execute();
_data["mae"] = measures:get(MAEEvaluator);
_data["mse"] = measures:get(MSEEvaluator);
return _data;
```
* 使用JStarCraft框架从Model.lua文件加载并执行Lua脚本
```java
// 获取Lua脚本
File file = new File(ScriptTestCase.class.getResource("Model.lua").toURI());
String script = FileUtils.readFileToString(file, StringUtility.CHARSET);
// 设置Lua脚本使用到的Java类
ScriptContext context = new ScriptContext();
context.useClasses(Properties.class, Configurator.class);
context.useClasses(RankingTask.class, RatingTask.class, RandomGuessModel.class);
context.useClasses(Assert.class, PrecisionEvaluator.class, RecallEvaluator.class, MAEEvaluator.class, MSEEvaluator.class);
// 设置Lua脚本使用到的Java变量
ScriptScope scope = new ScriptScope();
scope.createAttribute("loader", loader);
// 执行Lua脚本
ScriptExpression expression = new LuaExpression(context, scope, script);
LuaTable data = expression.doWith(LuaTable.class);
```
#### JStarCraft-RNS引擎与Python脚本交互
* [完整示例](https://github.com/HongZhaoHua/jstarcraft-rns/tree/master/src/test/java/com/jstarcraft/rns/script)
* 编写Python脚本训练与评估模型并保存到Model.py文件
```python
# 构建配置
keyValues = Properties()
keyValues.load(loader.getResourceAsStream("data.properties"))
keyValues.load(loader.getResourceAsStream("recommend/benchmark/randomguess-test.properties"))
configurator = Configurator([keyValues])
# 此对象会返回给Java程序
_data = {}
# 构建排序任务
task = RankingTask(RandomGuessModel, configurator)
# 训练与评估模型并获取排序指标
measures = task.execute()
_data['precision'] = measures.get(PrecisionEvaluator)
_data['recall'] = measures.get(RecallEvaluator)
# 构建评分任务
task = RatingTask(RandomGuessModel, configurator)
# 训练与评估模型并获取评分指标
measures = task.execute()
_data['mae'] = measures.get(MAEEvaluator)
_data['mse'] = measures.get(MSEEvaluator)
```
* 使用JStarCraft框架从Model.py文件加载并执行Python脚本
```java
// 设置Python环境变量
System.setProperty("python.console.encoding", StringUtility.CHARSET.name());
// 获取Python脚本
File file = new File(PythonTestCase.class.getResource("Model.py").toURI());
String script = FileUtils.readFileToString(file, StringUtility.CHARSET);
// 设置Python脚本使用到的Java类
ScriptContext context = new ScriptContext();
context.useClasses(Properties.class, Configurator.class);
context.useClasses(RankingTask.class, RatingTask.class, RandomGuessModel.class);
context.useClasses(Assert.class, PrecisionEvaluator.class, RecallEvaluator.class, MAEEvaluator.class, MSEEvaluator.class);
// 设置Python脚本使用到的Java变量
ScriptScope scope = new ScriptScope();
scope.createAttribute("loader", loader);
// 执行Python脚本
ScriptExpression expression = new PythonExpression(context, scope, script);
Map data = expression.doWith(Map.class);
```
****
## 对比
****
## 版本
****
## 参考
#### 个性化模型
* 基准模型
| 名称 | 问题 | 说明/论文 |
| :----: | :----: | :----: |
| RandomGuess | Ranking Rating | 随机猜测 |
| MostPopular | Ranking| 最受欢迎 |
| ConstantGuess | Rating | 常量猜测 |
| GlobalAverage | Rating | 全局平均 |
| ItemAverage | Rating | 物品平均 |
| ItemCluster | Rating | 物品聚类 |
| UserAverage | Rating | 用户平均 |
| UserCluster | Rating | 用户聚类 |
* 协同模型
| 名称 | 问题 | 说明/论文 |
| :----: | :----: | :----: |
| AspectModel | Ranking Rating | Latent class models for collaborative filtering |
| BHFree | Ranking Rating | Balancing Prediction and Recommendation Accuracy: Hierarchical Latent Factors for Preference Data |
| BUCM | Ranking Rating | Modeling Item Selection and Relevance for Accurate Recommendations |
| ItemKNN | Ranking Rating | 基于物品的协同过滤 |
| UserKNN | Ranking Rating | 基于用户的协同过滤 |
| AoBPR | Ranking | Improving pairwise learning for item recommendation from implicit feedback |
| BPR | Ranking | BPR: Bayesian Personalized Ranking from Implicit Feedback |
| CLiMF | Ranking | CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering |
| EALS | Ranking | Collaborative filtering for implicit feedback dataset |
| FISM | Ranking | FISM: Factored Item Similarity Models for Top-N Recommender Systems |
| GBPR | Ranking | GBPR: Group Preference Based Bayesian Personalized Ranking for One-Class Collaborative Filtering |
| HMMForCF | Ranking | A Hidden Markov Model Purpose: A class for the model, including parameters |
| ItemBigram | Ranking | Topic Modeling: Beyond Bag-of-Words |
| LambdaFM | Ranking | LambdaFM: Learning Optimal Ranking with Factorization Machines Using Lambda Surrogates |
| LDA | Ranking | Latent Dirichlet Allocation for implicit feedback |
| ListwiseMF | Ranking | List-wise learning to rank with matrix factorization for collaborative filtering |
| PLSA | Ranking | Latent semantic models for collaborative filtering |
| RankALS | Ranking | Alternating Least Squares for Personalized Ranking |
| RankSGD | Ranking | Collaborative Filtering Ensemble for Ranking |
| SLIM | Ranking | SLIM: Sparse Linear Methods for Top-N Recommender Systems |
| WBPR | Ranking | Bayesian Personalized Ranking for Non-Uniformly Sampled Items |
| WRMF | Ranking | Collaborative filtering for implicit feedback datasets |
| Rank-GeoFM | Ranking | Rank-GeoFM: A ranking based geographical factorization method for point of interest recommendation |
| SBPR | Ranking | Leveraging Social Connections to Improve Personalized Ranking for Collaborative Filtering |
| AssociationRule | Ranking | A Recommendation Algorithm Using Multi-Level Association Rules |
| PRankD | Ranking | Personalised ranking with diversity |
| AsymmetricSVD++ | Rating | Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model |
| AutoRec | Rating | AutoRec: Autoencoders Meet Collaborative Filtering |
| BPMF | Rating | Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo |
| CCD | Rating | Large-Scale Parallel Collaborative Filtering for the Netflix Prize |
| FFM | Rating | Field Aware Factorization Machines for CTR Prediction |
| GPLSA | Rating | Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis |
| IRRG | Rating | Exploiting Implicit Item Relationships for Recommender Systems |
| MFALS | Rating | Large-Scale Parallel Collaborative Filtering for the Netflix Prize |
| NMF | Rating | Algorithms for Non-negative Matrix Factorization |
| PMF | Rating | PMF: Probabilistic Matrix Factorization |
| RBM | Rating | Restricted Boltzman Machines for Collaborative Filtering |
| RF-Rec | Rating | RF-Rec: Fast and Accurate Computation of Recommendations based on Rating Frequencies |
| SVD++ | Rating | Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model |
| URP | Rating | User Rating Profile: a LDA model for rating prediction |
| RSTE | Rating | Learning to Recommend with Social Trust Ensemble |
| SocialMF | Rating | A matrix factorization technique with trust propagation for recommendation in social networks |
| SoRec | Rating | SoRec: Social recommendation using probabilistic matrix factorization |
| SoReg | Rating | Recommender systems with social regularization |
| TimeSVD++ | Rating | Collaborative Filtering with Temporal Dynamics |
| TrustMF | Rating | Social Collaborative Filtering by Trust |
| TrustSVD | Rating | TrustSVD: Collaborative Filtering with Both the Explicit and Implicit Influence of User Trust and of Item Ratings |
| PersonalityDiagnosis | Rating | A brief introduction to Personality Diagnosis |
| SlopeOne | Rating | Slope One Predictors for Online Rating-Based Collaborative Filtering |
* 内容模型
| 名称 | 问题 | 说明/论文 |
| :----: | :----: | :----: |
| EFM | Ranking Rating | Explicit factor models for explainable recommendation based on phrase-level sentiment analysis |
| TF-IDF | Ranking | 词频-逆文档频率 |
| HFT | Rating | Hidden factors and hidden topics: understanding rating dimensions with review text |
| TopicMF | Rating | TopicMF: Simultaneously Exploiting Ratings and Reviews for Recommendation |
#### 数据集
* [Amazon Dataset](http://jmcauley.ucsd.edu/data/amazon/)
* [Bibsonomy Dataset](https://www.kde.cs.uni-kassel.de/wp-content/uploads/bibsonomy/)
* [BookCrossing Dataset](https://grouplens.org/datasets/book-crossing/)
* [Ciao Dataset](https://www.cse.msu.edu/~tangjili/datasetcode/truststudy.htm)
* [Douban Dataset](http://smiles.xjtu.edu.cn/Download/Download_Douban.html)
* [Eachmovie Dataset](https://grouplens.org/datasets/eachmovie/)
* [Epinions Dataset](http://www.trustlet.org/epinions.html)
* [Foursquare Dataset](https://sites.google.com/site/yangdingqi/home/foursquare-dataset)
* [Goodbooks Dataset](http://fastml.com/goodbooks-10k-a-new-dataset-for-book-recommendations/)
* [Gowalla Dataset](http://snap.stanford.edu/data/loc-gowalla.html)
* [HetRec2011 Dataset](https://grouplens.org/datasets/hetrec-2011/)
* [Jest Joker Dataset](https://grouplens.org/datasets/jester/)
* [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/)
* [MovieLens Dataset](https://grouplens.org/datasets/movielens/)
* [Newsgroups Dataset](http://qwone.com/~jason/20Newsgroups/)
* [Stanford Large Network Dataset](http://snap.stanford.edu/data/)
* [Serendipity 2018 Dataset](https://grouplens.org/datasets/serendipity-2018/)
* [Wikilens Dataset](https://grouplens.org/datasets/wikilens/)
* [Yelp Dataset](https://www.yelp.com/dataset)
* [Yongfeng Zhang Dataset](http://yongfeng.me/dataset/)
****
## 协议
JStarCraft RNS遵循[Apache 2.0协议](https://www.apache.org/licenses/LICENSE-2.0.html),一切以其为基础的衍生作品均属于衍生作品的作者.
****
## 作者
| 作者 | 洪钊桦 |
| :----: | :----: |
| E-mail | 110399057@qq.com, jstarcraft@gmail.com |
****
## 致谢
特别感谢[LibRec团队](https://github.com/guoguibing/librec)与**推荐系统QQ群**(274750470)在推荐方面提供的支持与帮助.
特别感谢[陆徐刚](https://github.com/luxugang/Lucene-7.5.0)在搜索方面提供的支持与帮助.
****