3 Star 0 Fork 0

Gitee 极速下载 / bayesian

Create your Gitee Account
Explore and code with more than 12 million developers,Free private repositories !:)
Sign up
此仓库是为了提升国内下载速度的镜像仓库,每日同步一次。 原始仓库: https://github.com/jbrukh/bayesian
Clone or Download
contribute
Sync branch
Cancel
Notice: Creating folder will generate an empty file .keep, because not support in Git
Loading...
README

Naive Bayesian Classification

Perform naive Bayesian classification into an arbitrary number of classes on sets of strings. bayesian also supports term frequency-inverse document frequency calculations (TF-IDF).

Copyright (c) 2011-2017. Jake Brukhman. (jbrukh@gmail.com). All rights reserved. See the LICENSE file for BSD-style license.


Background

This is meant to be an low-entry barrier Go library for basic Bayesian classification. See code comments for a refresher on naive Bayesian classifiers, and please take some time to understand underflow edge cases as this otherwise may result in innacurate classifications.


Installation

Using the go command:

go get github.com/jbrukh/bayesian
go install !$

Documentation

See the GoPkgDoc documentation here.


Features

  • Conditional probability and "log-likelihood"-like scoring.
  • Underflow detection.
  • Simple persistence of classifiers.
  • Statistics.
  • TF-IDF support.

Example 1 (Simple Classification)

To use the classifier, first you must create some classes and train it:

import "github.com/jbrukh/bayesian"

const (
    Good bayesian.Class = "Good"
    Bad  bayesian.Class = "Bad"
)

classifier := bayesian.NewClassifier(Good, Bad)
goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}
classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

Then you can ascertain the scores of each class and the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
                        []string{"tall", "girl"},
                     )

Magnitude of the score indicates likelihood. Alternatively (but with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
                        []string{"tall", "girl"},
                     )

Example 2 (TF-IDF Support)

To use the TF-IDF classifier, first you must create some classes and train it and you need to call ConvertTermsFreqToTfIdf() AFTER training and before calling classification methods such as LogScores, SafeProbScores, and ProbScores)

import "github.com/jbrukh/bayesian"

const (
    Good bayesian.Class = "Good"
    Bad bayesian.Class = "Bad"
)

// Create a classifier with TF-IDF support.
classifier := bayesian.NewClassifierTfIdf(Good, Bad)

goodStuff := []string{"tall", "rich", "handsome"}
badStuff  := []string{"poor", "smelly", "ugly"}

classifier.Learn(goodStuff, Good)
classifier.Learn(badStuff,  Bad)

// Required
classifier.ConvertTermsFreqToTfIdf()

Then you can ascertain the scores of each class and the most likely class your data belongs to:

scores, likely, _ := classifier.LogScores(
    []string{"tall", "girl"},
)

Magnitude of the score indicates likelihood. Alternatively (but with some risk of float underflow), you can obtain actual probabilities:

probs, likely, _ := classifier.ProbScores(
    []string{"tall", "girl"},
)

Use wisely.

Empty file

About

Cancel

Releases

No release

Contributors

All

Activities

Load More
can not load any more
1
https://gitee.com/mirrors/bayesian.git
git@gitee.com:mirrors/bayesian.git
mirrors
bayesian
bayesian
master

Search