# tokenizer
**Repository Path**: mirrors_lahmatiy/tokenizer
## Basic Information
- **Project Name**: tokenizer
- **License**: CC0-1.0
- **Default Branch**: main
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-12-11
- **Last Updated**: 2026-05-17
## README
# CSS Tokenizer
This tool lets you tokenize CSS according to the [CSS Syntax Specification](https://drafts.csswg.org/css-syntax/).
Tokenizing CSS means separating a string of CSS into its smallest semantic parts, otherwise known as tokens.
This tool is intended to be used in other tools on the front and back end. It seeks to maintain:
- 100% compliance with the CSS syntax specification. ✨
- 100% code coverage. 🦺
- 100% static typing. 💪
- 1kB maximum contribution size. 📦
- Superior quality over Shark P. 🦈
## Usage
Add the [CSS tokenizer](https://github.com/csstools/tokenizer) to your project:
```sh
npm install @csstools/tokenizer
```
Tokenize CSS in JavaScript:
```js
import { tokenize } from '@csstools/tokenizer'

// Any string of CSS works here.
const cssText = 'a { color: red }'

for (const token of tokenize(cssText)) {
  console.log(token) // logs an individual CSSToken
}
```
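Because the result of `tokenize` can be walked with `for...of`, it can be handed to any helper that accepts an iterable. The sketch below tallies tokens by their numeric `type`. `countByType` is a hypothetical helper, not part of the package, and the sample tokens are hand-written stand-ins with only the `type` field filled in, so the block runs without installing anything.

```js
// Hypothetical helper (not part of @csstools/tokenizer):
// tally tokens by their numeric `type` field.
// Accepts any iterable of tokens, such as the result of tokenize(cssText).
function countByType(tokens) {
  const counts = new Map()
  for (const token of tokens) {
    counts.set(token.type, (counts.get(token.type) || 0) + 1)
  }
  return counts
}

// Hand-written stand-ins for real tokens (assumption: only `type` matters here).
const sampleTokens = [{ type: 9 }, { type: 3 }, { type: 9 }]

const counts = countByType(sampleTokens)
console.log(counts.get(9), counts.get(3))
```

In real use, `countByType(tokenize(cssText))` would replace the hand-written array.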
Tokenize CSS in _classical_ Node.js:
```js
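const { tokenizer } = require('@csstools/tokenizer')

// Any string of CSS works here.
const cssText = 'a { color: red }'

// Each call to iterator() returns the next { done, value } result.
let iterator = tokenizer(cssText), iteration
while (!(iteration = iterator()).done) {
  console.log(iteration.value) // logs an individual CSSToken
}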
```
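The function returned by `tokenizer` is called repeatedly and reports `done` and `value`, which is close to the JavaScript iterator protocol. As a minimal sketch, the hypothetical `toIterable` adapter below wraps such a function so it can be used with `for...of` or spread syntax. `fakeTokenizer` is a stand-in written for this example so the block runs on its own; it only mimics the call shape shown above.

```js
// Hypothetical adapter (not part of @csstools/tokenizer): wrap a function
// that returns { done, value } on each call in an iterable object.
function toIterable(next) {
  return {
    [Symbol.iterator]() {
      // Call `next` without arguments, as the while loop above does.
      return { next: () => next() }
    },
  }
}

// Stand-in for tokenizer(cssText): hands out the items of `values` one by one.
function fakeTokenizer(values) {
  let index = 0
  return () =>
    index < values.length
      ? { done: false, value: values[index++] }
      : { done: true, value: undefined }
}

const collected = [...toIterable(fakeTokenizer(['a', 'b', 'c']))]
console.log(collected)
```

The wrapped function keeps its own position, so the resulting iterable can be consumed only once.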
## How it works
The CSS tokenizer separates a string of CSS into tokens.
```ts
interface CSSToken {
  /** Position in the string at which the token was retrieved. */
  tick: number

  /** Number identifying the kind of token. */
  type:
    | 1 // Symbol
    | 2 // Comment
    | 3 // Space
    | 4 // Word
    | 5 // Function
    | 6 // Atword
    | 7 // Hash
    | 8 // String
    | 9 // Number

  /** Code, like the character code of a symbol, or the character code of the opening parenthesis of a function. */
  code: number

  /** Lead, like the opening of a comment, the quotation mark of a string, or the name of a function. */
  lead: string

  /** Data, like the numbers before a unit, the word after an at-sign, or the opening parenthesis of a Function. */
  data: string

  /** Tail, like the unit after a number, or the closing of a comment. */
  tail: string
}
```
As an example, the CSS string `@media` becomes an **Atword** token in which `@` and `media` are recognized as distinct parts. As another example, the CSS string `5px` becomes a **Number** token in which `5` and `px` are recognized as distinct parts. As a final example, the string `5px 10px` becomes three tokens: the **Number** token described above (`5px`), a **Space** token that represents a single space (` `), and another **Number** token (`10px`).
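To make the `5px 10px` example concrete, here is a sketch that writes the three tokens by hand and labels each one. The type names are copied from the interface above. Everything else is an assumption for illustration: the tokens are not real tokenizer output, `tick` and `code` are left out, placing the space in `data` is a guess, and `tokenText` and `describe` are hypothetical helpers.

```js
// Names for the numeric `type` field, copied from the CSSToken interface.
const TOKEN_NAMES = {
  1: 'Symbol', 2: 'Comment', 3: 'Space', 4: 'Word', 5: 'Function',
  6: 'Atword', 7: 'Hash', 8: 'String', 9: 'Number',
}

// Hand-written tokens for `5px 10px` (assumed shapes, not library output).
const exampleTokens = [
  { type: 9, lead: '', data: '5', tail: 'px' },
  { type: 3, lead: '', data: ' ', tail: '' }, // assumption: space stored in `data`
  { type: 9, lead: '', data: '10', tail: 'px' },
]

// Hypothetical helper: join the three string parts of a token.
function tokenText(token) {
  return token.lead + token.data + token.tail
}

// Hypothetical helper: a short label such as `Number "5px"`.
function describe(token) {
  return `${TOKEN_NAMES[token.type]} ${JSON.stringify(tokenText(token))}`
}

const labels = exampleTokens.map(describe)
const rebuilt = exampleTokens.map(tokenText).join('')
console.log(labels)
console.log(rebuilt)
```

For these hand-written tokens, joining `lead`, `data`, and `tail` gives back the original string. Whether that holds for every kind of token the tokenizer produces is not stated here, so treat it as something to verify.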
## Benchmarks
As of August 23, 2021, these benchmarks were averaged from my local machine:
```
Benchmark: Tailwind CSS
┌────────────────────────────────────────────────────┬────────┬────────┬────────┐
│ (index) │ ms │ ms/50k │ tokens │
├────────────────────────────────────────────────────┼────────┼────────┼────────┤
│ CSSTree 1 x 7.55 ops/sec ±11.49% (24 runs sampled) │ 132.48 │ 13.87 │ 477434 │
│ PostCSS 8 x 13.78 ops/sec ±2.73% (39 runs sampled) │ 72.56 │ 3.88 │ 935267 │
│ Tokenizer x 17.09 ops/sec ±1.09% (47 runs sampled) │ 58.52 │ 3.09 │ 948045 │
└────────────────────────────────────────────────────┴────────┴────────┴────────┘
Benchmark: Bootstrap
┌──────────────────────────────────────────────────┬──────┬────────┬────────┐
│ (index) │ ms │ ms/50k │ tokens │
├──────────────────────────────────────────────────┼──────┼────────┼────────┤
│ CSSTree 1 x 118 ops/sec ±2.39% (77 runs sampled) │ 8.5 │ 13.1 │ 32425 │
│ PostCSS 8 x 408 ops/sec ±0.10% (96 runs sampled) │ 2.45 │ 2.4 │ 51170 │
│ Tokenizer x 288 ops/sec ±0.14% (93 runs sampled) │ 3.48 │ 2.92 │ 59566 │
└──────────────────────────────────────────────────┴──────┴────────┴────────┘
```
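The `ms/50k` column is not defined in the tables. The figures are consistent with the time per run scaled to 50,000 tokens, that is `ms / tokens * 50000`, so the sketch below uses that formula as an assumption. `msPer50k` is a hypothetical helper written for this check, and the inputs are copied from the Tailwind CSS table.

```js
// Assumed formula: milliseconds per run, scaled to a batch of 50,000 tokens.
function msPer50k(ms, tokens) {
  return (ms / tokens) * 50000
}

// Inputs copied from the Tailwind CSS benchmark table.
const cssTree = msPer50k(132.48, 477434)
const tokenizerResult = msPer50k(58.52, 948045)

console.log(cssTree.toFixed(2)) // prints "13.87", matching the CSSTree row
console.log(tokenizerResult.toFixed(2)) // prints "3.09", matching the Tokenizer row
```

Normalizing by token count matters because the tools report different token totals for the same stylesheet, so raw milliseconds alone are not directly comparable.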
## Development
You wanna take a deeper dive? Awesome! Here are a few useful development commands.
### npm run build
The **build** command creates all the files needed to run this tool in many different JavaScript environments.
```sh
npm run build
```
### npm run benchmark
The **benchmark** command builds the project and then tests its performance as compared to [PostCSS].
These benchmarks are run against [Bootstrap](https://getbootstrap.com) and [Tailwind CSS].
```sh
npm run benchmark
```
### npm run test
The **test** command tests the coverage and accuracy of the tokenizer.
As of September 26, 2020, this tokenizer has 100% test coverage:
```sh
npm run test
```
[Boostrap]: https://getbootstrap.com
[PostCSS]: https://postcss.org
[Tailwind CSS]: https://tailwindcss.com