# dami

**Repository Path**: lgnlgn/dami

## Basic Information

- **Project Name**: dami
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-04-17
- **Last Updated**: 2021-04-17

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

**dami**
=============

Scalable algorithms in **da**ta **mi**ning.

(***I am shifting this project to feluca and will refactor it there, so this project is being deprecated.***)

dami is written in Java. Our goal is to provide algorithms that can handle hundreds of millions of records on a PC with limited memory.

Currently we have:

- **utility**: buffered vector pool for dataset IO; a high-performance, simple text parser. (*More tests needed*)
- **classification**: SGD for logistic regression
- **recommendation**: SlopeOne, SVD, RSVD, item-neighborhood SVD (see `movielens_converter.py`)
- **significance testing**: swap randomization
- **graph**: PageRank

Future:

- **similarity**: simhash

---------

>*2012/10/22 Release Notes:*
> - L1- and L2-regularized logistic regression
> - memory cost estimation
> - simple command-line integration for LR

>*2012/7/22 Release Notes:*
> - asynchronous vector buffer for dataset IO
> - high-performance, simple text parser (digit-related characters only)
> - small refactorings

>*2012/7/12 Release Notes:*
> - code refactoring for recommendation and IO
> - to compute RMSE for recommendation, first see *`movielens_convert.py`* for converting and/or splitting the MovieLens data, then see *`CFDataConverter`* and *`TestSVD`*

----------

To achieve computational efficiency and good memory utilization, we adopt two approaches:
*1: Use the "id" as the array index when fetching data.*

*2: Keep only the model in memory; the data itself is converted to bytes and streamed from disk for IO.*

So it is highly recommended that you use contiguous ids with these algorithms :)

My Chinese blog: [http://blog.csdn.net/lgnlgn](http://blog.csdn.net/lgnlgn)

E-mail: gnliang10 [at] 126.com
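As an illustration of the id-as-array-index idea combined with the SGD logistic regression mentioned above, here is a minimal, hypothetical sketch (not dami's actual code; the class and method names are invented for this example). Feature ids index directly into a dense weight array, so a lookup is a single array access with no hashing, and only the model lives in memory.

```java
// Hypothetical sketch, not dami's actual implementation:
// one SGD step for L2-regularized logistic regression, with
// feature ids used directly as indices into a dense weight array.
public class SgdLrSketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One SGD update on a sparse example given as parallel arrays
    // (feature ids and their values), with label y in {0, 1}.
    static void update(double[] w, int[] ids, double[] values,
                       int y, double learningRate, double lambda) {
        double z = 0.0;
        for (int k = 0; k < ids.length; k++) {
            z += w[ids[k]] * values[k];        // id -> array index, no hashing
        }
        double gradScale = sigmoid(z) - y;     // d(logloss)/dz
        for (int k = 0; k < ids.length; k++) {
            int i = ids[k];
            // gradient step plus L2 penalty on the touched weights
            w[i] -= learningRate * (gradScale * values[k] + lambda * w[i]);
        }
    }

    public static void main(String[] args) {
        double[] w = new double[4];            // the whole model in memory
        int[] ids = {0, 2};                    // contiguous ids, as recommended
        double[] vals = {1.0, 1.0};
        for (int epoch = 0; epoch < 100; epoch++) {
            update(w, ids, vals, 1, 0.5, 0.001);   // repeat a positive example
        }
        System.out.println(sigmoid(w[0] + w[2]) > 0.9);  // prediction approaches the label
    }
}
```

This is also why contiguous ids matter here: the weight array must be sized to the largest id, so sparse, scattered ids would waste memory.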
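The "digit-related characters only" parser mentioned in the release notes is presumably in this spirit; the sketch below is an assumption for illustration (invented names, not dami's actual API). It reads non-negative integers straight from a byte buffer, which avoids allocating intermediate Strings and is a common trick in fast dataset loaders.

```java
// Hypothetical sketch, not dami's actual parser: extract a
// non-negative integer from a slice of a byte buffer without
// creating any String objects.
public class DigitParserSketch {

    // Parses the digits in line[from, to) as a base-10 integer.
    // Assumes the slice contains only '0'..'9'.
    static int parseInt(byte[] line, int from, int to) {
        int value = 0;
        for (int i = from; i < to; i++) {
            value = value * 10 + (line[i] - '0');  // '0'..'9' -> 0..9
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] line = "123 4567".getBytes();
        System.out.println(parseInt(line, 0, 3));  // 123
        System.out.println(parseInt(line, 4, 8));  // 4567
    }
}
```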