
泰晓科技 / tinycorrect


Corrector: TinyCorrect v0.1-rc3 - [autocorrect]

TinyCorrect - correct everything in Markdown articles

It is in the early development stage, so don't trust it too much: recheck the results before you commit.

The functions are unstable, the interfaces may change frequently, and there is currently no guarantee of data safety.

It was originally written for the RISC-V Linux project.


Currently, it provides the following functions:

  • gitcheck

    • data safety is more important than data format
    • the target articles must be put in a git repository
    • old changes must be committed before starting a new correction
  • tounix

    • make sure the file is in Unix document format
    • call dos2unix to do the conversion
  • filename

    • file naming rules
    • names may include the date, modules and features
  • spaces

    • strip spaces
    • such as trailing whitespace, color info, bad control symbols
  • newline

    • collapse repeated blank lines so that no newline is duplicated
    • remove trailing newlines, but keep at least one '\n' character at the end
  • quotes

    • clean up the quoted blocks
  • comments

    • clean up the comments in '// ...' or '/* ... */' style
  • header

    • file header rules
    • like authors, revisors, licenses, sponsors
  • revisor

    • configure the revisor separately
    • the default list is added in configs/.authors/
  • toc

    • table of contents requirement
    • articles should be organized carefully with good titles and comfortable view
  • codeblock

    • about the code wrapped with the "```" string
    • for a whole block of code
  • codeinline

    • about the code wrapped with the "`" character
    • for functions, variables and so forth
  • tables

    • prettify the table boundaries, align the '|' characters
    • markdown-table-prettify is used
    • the lines are not aligned properly when Chinese characters are used in the table
    • to preserve the original indentation, add '|' at the beginning of the table lines
  • images

    • existence check
    • description detection
    • auto convert mermaid graphs to svg images
    • check remote images and auto download for persistent access
  • urls

    • connectivity check
    • urls should be moved into the alias map at the end of the article
  • refs

    • make sure references are added in list style
    • collect all suitable urls as references
  • typeset

  • words

    • misspelled or misused words should be detected and corrected if possible
    • pycorrector, integrated but missing good models for computer science
    • xmnlp, integrated but missing good models for computer science
    • epw, a new error-prone word detection and correction framework developed by ourselves
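
To make the `spaces` and `newline` items above more concrete, here is a minimal sketch of that kind of cleanup. It is not TinyCorrect's implementation: the function name `normalize_whitespace` is hypothetical, and it assumes that "keep every newline unique" means collapsing runs of blank lines into one.

```shell
# Hypothetical helper for illustration; not part of TinyCorrect.
normalize_whitespace() {
    # sed: strip trailing whitespace from every line.
    # awk: drop blank lines, then re-emit a single blank line between
    #      blocks of text; leading and trailing blank lines are discarded,
    #      so the output ends with exactly one '\n'.
    sed -e 's/[[:space:]]*$//' "$1" | awk '
        NF == 0 { blank = 1; next }
        {
            if (blank && seen) print ""
            blank = 0; seen = 1
            print
        }'
}
```

It could be used as `normalize_whitespace article.md > article.md.new`. Note that a filter this simple also touches fenced code blocks and Markdown hard line breaks (two trailing spaces), which a real corrector would likely need to treat more carefully.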

It is modularized, and more functions will be added based on users' requirements.


$ git clone https://gitee.com/tinylab/tinycorrect


// Fast configuration
$ cd tinycorrect && source tinycorrect.sh


We have introduced the background, design and usage of this project in this video, enjoy~


Basic usage:

$ tico file.md

Disable some modules:

$ gitcheck=0 toc=0 tico file.md

Skip multiple modules:

$ skip='gitcheck toc' tico file.md
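
The `name=value` prefixes used in these commands are ordinary shell syntax: the assignment applies only to the single command that follows it. The sketch below shows the difference between a one-off prefix and a persistent `export`; `sh -c` merely stands in for `tico` here, and it is an assumption that `tico` picks up exported settings in the same way as prefixed ones.

```shell
unset toc                                        # start from a clean state

# Prefix form: toc is set only for this one command.
toc=0 sh -c 'echo "child sees toc=${toc:-unset}"'
echo "shell has  toc=${toc:-unset}"              # the current shell is unchanged

# Exported form: every later command inherits the setting until it is unset.
export toc=0
sh -c 'echo "child sees toc=${toc:-unset}"'
unset toc
```

The prefix form keeps each run self-describing, which is probably why the examples here use it; exporting is only convenient when the same settings should apply to many runs in one session.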

Run specified modules:

$ modules="tounix toc" tico file.md

Skip the fixed titles check:

$ titles="" tico file.md

Specify your own default titles:

$ titles="Usage" tico file.md

For generic articles:

$ template=generic tico file.md

Configure typeset and words engine:

// default: typeset="pangu autocorrect" words=""
$ typeset="pangu autocorrect" words="pycorrector xmnlp" tico file.md

// If one of them does not work as expected, disable it
$ typeset="autocorrect" words="xmnlp" tico file.md

// lang for pycorrector: 0: Chinese, 1: English, 2: Both
$ words="pycorrector" lang=1 tico file.md

Set the tab width (requires cols=1; cols is disabled by default):

$ tabs=4 cols=1 tico file.md

Download the pictures from remote sites automatically:

$ download=1 tico file.md

Disable mermaid conversion if the graphs are already converted (it is enabled by default so that they are upgraded automatically):

$ mermaid=0 tico file.md

Disable the time-consuming URL connectivity detection:

$ url_timeout=0 tico file.md

Disable URL filters (by default, domains such as github.com are not checked):

$ url_filter=0 tico file.md
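
For readers who want to see what a URL connectivity check involves, the following is a rough sketch and not the code TinyCorrect runs. The helper names `list_urls` and `check_urls` and the 5-second default timeout are assumptions made for illustration; only standard `grep`, `sort` and `curl` options are used.

```shell
# Hypothetical helpers for illustration; not part of TinyCorrect.

# Print each distinct http(s) URL found in a Markdown file, one per line.
# The pattern is deliberately simple: punctuation directly after a bare
# URL (such as a trailing '.') is kept as part of the match.
list_urls() {
    grep -oE 'https?://[^][()<>" ]+' "$1" | sort -u
}

# Probe every URL with a HEAD request and report whether it answered.
# $2 is the per-URL timeout in seconds (assumed default: 5).
check_urls() {
    timeout=${2:-5}
    list_urls "$1" | while IFS= read -r url; do
        if curl --silent --head --fail --location \
                --max-time "$timeout" --output /dev/null "$url"; then
            echo "ok   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

Running `check_urls file.md 3` would probe each link for at most three seconds. Some servers reject HEAD requests, so a failure reported by this sketch does not always mean the link is dead.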

Continue running all of the modules even if one of them fails:

$ failcontinue=1 tico file.md
$ failstop=0 tico file.md

// gitcheck cannot be skipped unless 'skip=gitcheck' is given explicitly
$ skip=gitcheck failcontinue=1 tico file.md

Configure revisor directly:

// use value from 'argument' variable
$ module=revisor argument='Revisor Name <Email Address>' tico file.md

// choose one from the default list: configs/.authors/default.md
$ module=revisor tico file.md

// choose one from your own list; write each entry as a '-' list item
$ touch configs/.authors/myauthors.md
$ module=revisor authors=myauthors tico file.md

Configure action after correction:

// ask
$ correctaction=ask tico file.md
// stop or quit
$ correctaction=quit tico file.md
// commit
$ correctaction=commit tico file.md
// next or continue
$ correctaction=next tico file.md

Commit directly after correction; this overrides the correctaction setting:

$ commit=1 tico file.md

Generate references section automatically:

$ modules=refs tico file.md



Auto detect & correct typeset, inline code, toc, file name, header, images, urls and words of Markdown documents, originally written for the RISC-V Linux project: https://gitee.com/tinylab/riscv-linux

