# recsys_faiss

**Repository Path**: greitzmann/recsys_faiss

## Basic Information

- **Project Name**: recsys_faiss
- **Description**: A content-based related-item recommendation implementation for products, built on fasttext + faiss. The query API is served by nginx+uwsgi+flask or gunicorn+uvicorn+fastapi. A Spark implementation (Ansj+Word2vec+LSH+Phoenix) has been added.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-02-23
- **Last Updated**: 2021-03-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# recsys_faiss

An implementation of a content-based related-item recommendation API for products, built on fasttext + faiss. The RESTful API is served by nginx+uwsgi+flask or by gunicorn+uvicorn+fastapi. A Spark implementation of content-based recommendation (Ansj+Word2vec+LSH+Phoenix) has been added.

#### Product detail page screenshot

The model deployed in an application:

![](shoppingweb.png)

#### Model API flow diagram

![](process.png)

#### Train feature vectors from product attributes and add the product vectors to faiss

```
python embedding_recsys.py
```

#### Flask wraps the faiss index: given a product ID, reconstruct its vector and run a cosine similarity search

#### Start uwsgi

```
uwsgi uwsgi.ini
```

#### nginx configuration

```
server {
    listen 8089;  # port to listen on
    charset utf-8;
    server_name localhost;  # IP address
    location / {
        include uwsgi_params;
        uwsgi_pass 127.0.0.1:8088;
        uwsgi_param UWSGI_CHDIR /Users/PycharmProjects/recsys_faiss;
        uwsgi_param UWSGI_SCRIPT recsys_faiss.faiss_api.py;
    }
}
```

#### API test

GET request. Parameters: `spu` is the product ID, and `n_items` is the number of similar products to recall.

```python
>>> import requests
>>> res = requests.get("http://127.0.0.1:8089/faiss/similar_items/?spu=3&n_items=10")
>>> res.json()
{'code': '200', 'msg': '处理成功', 'result': {'56482': 1.0, '92237': 1.0, '56483': 1.0, '56481': 1.0, '56484': 1.0, '56485': 1.0, '56486': 1.0, '4': 1.0, '18': 0.9981815814971924, '19': 0.9981815814971924}}
```

The `msg` value `处理成功` means "processed successfully". The keys of `result` are product IDs and the values are cosine similarity scores.

#### Verifying the recommendations

Query item (spu = 3), an apricot-flavored yogurt milk drink:

```
+-------------+--------------------------------------+
| ITEM_NUM_ID | ITEM_NAME                            |
+-------------+--------------------------------------+
|           3 | 卓德优格乳杏口味含乳饮品             |
+-------------+--------------------------------------+
```

#### Recommendation results

Names of the recommended items (mostly flavored yogurt and milk drinks, largely from the same brand as the query item):

```
+-------------+---------------------------------------------------------------------------+
| ITEM_NUM_ID | ITEM_NAME                                                                 |
+-------------+---------------------------------------------------------------------------+
|          19 | 卓德低脂热处理风味发酵乳(森林水果口味)120g                               |
|        8221 | 爱乐薇蓝莓味含乳饮品125克                                                 |
|       56481 | 卓德风味发酵乳(草莓鲜酪口味)120g                                         |
|           8 | 卓德脱脂含乳饮品(覆盆子口味)                                             |
|       56483 | 卓德风味发酵乳(香草口味)120g                                             |
|          20 | 卓德低脂热处理风味发酵乳(草莓口味)120g                                   |
|       56484 | 卓德脱脂含乳饮品水蜜桃口味+覆盆子口味4*115g                               |
|       56486 | 卓德热处理风味发酵乳(原味)4*115g                                         |
|       56482 | 卓德风味发酵乳(焗苹果口味)120g                                           |
|        8229 | 爱乐薇菠萝味含乳饮品125克                                                 |
|          18 | 卓德低脂热处理风味发酵乳(水蜜桃、西番莲口味)120g                         |
|           4 | 卓德优格乳草莓口味含乳饮品                                                |
|       92237 | 卓德含乳饮品(草莓口味)460克(4*115克)                                     |
|           6 | 卓德脱脂含乳饮品(水蜜桃口味)                                             |
+-------------+---------------------------------------------------------------------------+
```

#### API load test

```
siege -c 100 -t 10s -b "http://127.0.0.1:8089/faiss/similar_items/?spu=3&n_items=50"

Transactions:                  41011 hits
Availability:                 100.00 %
Elapsed time:                   9.17 secs
Data transferred:              12.24 MB
Response time:                  0.02 secs
Transaction rate:            4472.30 trans/sec
Throughput:                     1.33 MB/sec
Concurrency:                   99.57
Successful transactions:       41011
Failed transactions:               0
Longest transaction:            0.07
Shortest transaction:           0.00
```

#### fastapi

```
gunicorn faiss_fastapi:app -w 4 -k uvicorn.workers.UvicornWorker -D
```

```python
>>> import requests
>>> res = requests.get("http://127.0.0.1:8000/faiss/similar_items/?spu_id=3&n_items=10")
>>> res.json()
{'code': 200, 'msg': 'success', 'res': [4, 56486, 92237, 56484, 56485, 56481, 56482, 56483, 18, 20]}
```

#### Spark implementation

Create the table RECSYS_SIMILAR_LSH in Phoenix:

```
0: jdbc:phoenix:> create table RECSYS_SIMILAR_LSH (id varchar not null primary key, recommend varchar) salt_buckets=8;
```

Submit the Spark job:

```
bash submit.bash
```

Check the results:

```
0: jdbc:phoenix:> select * from RECSYS_SIMILAR_LSH limit 5;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| ID      | RECOMMEND                                                                                                                                     |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| 100880  | 4407,78608,753,88585,99289,17360,42159,43082,8636,43403,109828,2409,214619,202489,43125,14123,97192,9408,73847,48269,20587,209262,76913,78394 |
| 102431  | 100034,100280,98687,118912,114140,29619,106257,118940,100065,30217,49843,49891,41759,28874,109745,29915,20059,29191,238333,90415,51839,48266, |
| 104213  | 237497,21255,12543,98798,90771,117289,21262,20042,75753,212108,29915,50095,50537,39070,20059,101172,53475,18816,29859,109745,41840,29619,1886 |
| 105577  | 9681,91428,62392,41219,117776,13191,120160,97337,112055,78196,202915,202899,227439,39411,94532,102624,102618,235521,105425,120167,58650,85126 |
| 106655  | 233605,42025,120616,59829,203421,209948,99844,94505,752,39665,93387,80632,232698,57406,102814,43438,42975,8926,91368,73961,210979,92327,94477 |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
```
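
The Spark variant relies on LSH to find similar items without comparing every pair. Which LSH family the job uses is not stated here, so the following is only a minimal sketch that assumes random-projection bucketing, similar in spirit to Spark MLlib's `BucketedRandomProjectionLSH`. The function names, the toy vectors, and the `bucket_length` value are all hypothetical and chosen for illustration.

```python
import math
import random
from collections import defaultdict


def make_projections(dim, num_tables, seed=0):
    """Draw one random unit vector per hash table."""
    rng = random.Random(seed)
    projections = []
    for _ in range(num_tables):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        projections.append([x / norm for x in v])
    return projections


def bucket_hash(vec, projections, bucket_length):
    """Hash a vector to one integer per table: floor(vec . p / bucket_length)."""
    return tuple(
        math.floor(sum(a * b for a, b in zip(vec, p)) / bucket_length)
        for p in projections
    )


def candidate_pairs(vectors, projections, bucket_length):
    """Map each item ID to the other IDs that share at least one bucket with it."""
    tables = defaultdict(set)  # (table index, bucket) -> item IDs
    for item_id, vec in vectors.items():
        for t, b in enumerate(bucket_hash(vec, projections, bucket_length)):
            tables[(t, b)].add(item_id)
    candidates = defaultdict(set)
    for members in tables.values():
        for item_id in members:
            candidates[item_id] |= members - {item_id}
    return dict(candidates)


# Axis-aligned projections keep this toy example easy to follow by hand;
# a real job would use random ones, e.g. make_projections(dim, num_tables).
axes = [[1.0, 0.0], [0.0, 1.0]]
toy = {"a": [0.1, 0.1], "b": [0.2, 0.3], "c": [5.0, 5.0]}
print(candidate_pairs(toy, axes, bucket_length=1.0))
```

In this toy run, items `a` and `b` fall into the same buckets and become candidates for each other, while `c` lands far away and gets none. A full pipeline would then rank the candidates by exact distance and store the top IDs as a comma-separated string, which matches the shape of the `recommend` column shown above.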
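
The Flask endpoint described earlier takes a product ID, rebuilds its vector, and runs a cosine similarity search. As a rough illustration of that lookup, here is a brute-force sketch in plain Python. It is not the repository's code and does not use faiss: the `similar_items` helper and the toy embeddings are hypothetical. With faiss, a common way to get the same ranking is to L2-normalize the vectors and search an inner-product index; whether this project does exactly that is an assumption.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similar_items(vectors, spu, n_items=10):
    """Return up to n_items (item_id, cosine) pairs most similar to item `spu`.

    `vectors` maps item ID -> embedding (list of floats). The query item
    itself is excluded from the result.
    """
    query = normalize(vectors[spu])  # stands in for rebuilding the stored vector
    scored = (
        (item_id, sum(q * x for q, x in zip(query, normalize(vec))))
        for item_id, vec in vectors.items()
        if item_id != spu
    )
    return heapq.nlargest(n_items, scored, key=lambda pair: pair[1])


# Toy 2-d embeddings, invented purely for illustration.
toy_vectors = {
    3: [1.0, 0.0],
    4: [2.0, 0.0],   # same direction as item 3
    18: [1.0, 1.0],  # 45 degrees away from item 3
    99: [0.0, 1.0],  # orthogonal to item 3
}
top = similar_items(toy_vectors, spu=3, n_items=2)
print(top)
```

Item 4 points in the same direction as item 3 and scores 1.0 despite its different length, which is consistent with the API response above, where several distinct products share a similarity of 1.0. A brute-force scan like this is linear in the catalog size, which is the cost an index such as faiss is meant to reduce.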