# tabula-java
**Repository Path**: linrol/tabula-java
## Basic Information
- **Project Name**: tabula-java
- **Description**: Extract tables from PDF files
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 0
- **Created**: 2021-01-29
- **Last Updated**: 2021-06-03
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
tabula-java
===========
`tabula-java` is a library for extracting tables from PDF files — it is the table extraction engine that powers [Tabula](http://tabula.technology/) ([repo](http://github.com/tabulapdf/tabula)). You can use `tabula-java` as a command-line tool to programmatically extract tables from PDFs.
© 2014-2020 Manuel Aristarán. Available under MIT License. See [`LICENSE`](LICENSE).
## Download
Download a version of the tabula-java jar, with all dependencies included, that works on Mac, Windows and Linux from our [releases page](../../releases).
## Usage Examples
`tabula-java` provides a command line application:
```
$ java -jar target/tabula-1.0.2-jar-with-dependencies.jar --help
usage: tabula [-a <AREA>] [-b <DIRECTORY>] [-c <COLUMNS>] [-f <FORMAT>]
       [-g] [-h] [-i] [-l] [-n] [-o <OUTFILE>] [-p <PAGES>] [-r] [-s
       <PASSWORD>] [-t] [-u] [-v]

Tabula helps you extract tables from PDFs

 -a,--area <AREA>           -a/--area = Portion of the page to analyze.
                            Example: --area 269.875,12.75,790.5,561.
                            Accepts top,left,bottom,right i.e. y1,x1,y2,x2
                            where all values are in points relative to the
                            top left corner. If all values are between
                            0-100 (inclusive) and preceded by '%', input
                            will be taken as % of actual height or width
                            of the page. Example: --area %0,0,100,50. To
                            specify multiple areas, -a option should be
                            repeated. Default is entire page
 -b,--batch <DIRECTORY>     Convert all .pdfs in the provided directory.
 -c,--columns <COLUMNS>     X coordinates of column boundaries. Example
                            --columns 10.1,20.2,30.3. If all values are
                            between 0-100 (inclusive) and preceded by '%',
                            input will be taken as % of actual width of
                            the page. Example: --columns %25,50,80.6
 -f,--format <FORMAT>       Output format: (CSV,TSV,JSON). Default: CSV
 -g,--guess                 Guess the portion of the page to analyze per
                            page.
 -h,--help                  Print this help text.
 -i,--silent                Suppress all stderr output.
 -l,--lattice               Force PDF to be extracted using lattice-mode
                            extraction (if there are ruling lines
                            separating each cell, as in a PDF of an Excel
                            spreadsheet)
 -n,--no-spreadsheet        [Deprecated in favor of -t/--stream] Force PDF
                            not to be extracted using spreadsheet-style
                            extraction (if there are no ruling lines
                            separating each cell)
 -o,--outfile <OUTFILE>     Write output to <file> instead of STDOUT.
                            Default: -
 -p,--pages <PAGES>         Comma separated list of ranges, or all.
                            Examples: --pages 1-3,5-7, --pages 3 or
                            --pages all. Default is --pages 1
 -r,--spreadsheet           [Deprecated in favor of -l/--lattice] Force
                            PDF to be extracted using spreadsheet-style
                            extraction (if there are ruling lines
                            separating each cell, as in a PDF of an Excel
                            spreadsheet)
 -s,--password <PASSWORD>   Password to decrypt document. Default is empty
 -t,--stream                Force PDF to be extracted using stream-mode
                            extraction (if there are no ruling lines
                            separating each cell)
 -u,--use-line-returns      Use embedded line returns in cells. (Only in
                            spreadsheet mode.)
 -v,--version               Print version and exit.
```
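The `--area` format described in the help text can be easier to follow with a small worked example. The sketch below is not tabula-java's own parser. `AreaSpec`, `parse` and `toPoints` are hypothetical names introduced here for illustration, and rejecting out-of-range percent values with an exception is an assumption rather than documented behavior. It only shows how a `top,left,bottom,right` value, optionally prefixed with `%`, could be turned into page coordinates in points.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of tabula-java: models one --area value. */
public final class AreaSpec {
    public final boolean percent;
    public final double top, left, bottom, right;

    private AreaSpec(boolean percent, double[] v) {
        this.percent = percent;
        this.top = v[0];
        this.left = v[1];
        this.bottom = v[2];
        this.right = v[3];
    }

    /** Parses "top,left,bottom,right", optionally prefixed with '%'. */
    public static AreaSpec parse(String spec) {
        boolean percent = spec.startsWith("%");
        String body = percent ? spec.substring(1) : spec;
        String[] parts = body.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected top,left,bottom,right: " + spec);
        }
        double[] v = Arrays.stream(parts)
                .mapToDouble(s -> Double.parseDouble(s.trim()))
                .toArray();
        // Assumption: a '%' value outside 0-100 is treated as an error here.
        if (percent && !Arrays.stream(v).allMatch(x -> x >= 0 && x <= 100)) {
            throw new IllegalArgumentException("percent values must be within 0-100: " + spec);
        }
        return new AreaSpec(percent, v);
    }

    /** Returns {top, left, bottom, right} in points for a page of the given size. */
    public double[] toPoints(double pageWidth, double pageHeight) {
        if (!percent) {
            return new double[] {top, left, bottom, right};
        }
        // top/bottom scale with page height, left/right with page width.
        return new double[] {
            top / 100 * pageHeight, left / 100 * pageWidth,
            bottom / 100 * pageHeight, right / 100 * pageWidth
        };
    }

    public static void main(String[] args) {
        // The help text's percent example, applied to a 612 x 792 point page.
        AreaSpec area = AreaSpec.parse("%0,0,100,50");
        System.out.println(Arrays.toString(area.toPoints(612, 792)));
        // prints [0.0, 0.0, 792.0, 306.0]
    }
}
```

Under these assumptions, `--area %0,0,100,50` selects the left half of the page at full height, while `--area 269.875,12.75,790.5,561` is used as-is because its values are already in points.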
It also includes a debugging tool. Run `java -cp ./target/tabula-1.0.2-jar-with-dependencies.jar technology.tabula.debug.Debug -h` for the available options.
You can also integrate `tabula-java` with any JVM language. For Java examples, see the [`tests`](src/test/java/technology/tabula/) folder.
JVM start-up time accounts for much of the cost of the `tabula` command, so if you're extracting tables from many PDFs, you have a few options for speeding it up:
- the [drip](https://github.com/ninjudd/drip) utility
- the [Ruby](http://github.com/tabulapdf/tabula-extractor), [Python](https://github.com/chezou/tabula-py), [R](https://github.com/leeper/tabulizer), and [Node.js](https://github.com/ezodude/tabula-js) bindings
- writing your own program in any JVM language (Java, JRuby, Scala) that imports tabula-java
- waiting for us to implement an API/server-style system (it's on the [roadmap](https://github.com/tabulapdf/tabula-api))
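
One further way to pay the JVM start-up cost only once, using nothing beyond the flags listed in the help text, is to pass a whole directory via `-b/--batch` instead of launching `tabula` per file. The sketch below assembles such a command from Java. `TabulaBatch`, `batchCommand` and `run` are hypothetical names, and the jar path and `pdfs` directory are placeholders, so treat it as an illustration rather than a recommended launcher.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical launcher, not part of tabula-java. */
public final class TabulaBatch {

    /** Builds one command line that converts every PDF in pdfDir. */
    public static List<String> batchCommand(String jarPath, String pdfDir, String format) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("--batch");   // whole directory, so the JVM starts once
        cmd.add(pdfDir);
        cmd.add("--format");  // CSV, TSV or JSON per the help text
        cmd.add(format);
        cmd.add("--pages");   // default is page 1 only, so ask for all
        cmd.add("all");
        return cmd;
    }

    /** Starts a single process for the whole directory and waits for it. */
    public static int run(List<String> cmd) throws java.io.IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder paths: adjust to where the jar and the PDFs actually live.
        List<String> cmd = batchCommand(
                "target/tabula-1.0.2-jar-with-dependencies.jar", "pdfs", "CSV");
        System.out.println(String.join(" ", cmd));
        // Uncomment to launch; requires the jar and the directory to exist.
        // System.exit(run(cmd));
    }
}
```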
## Building from Source
Clone this repo and run:
```
mvn clean compile assembly:single
```
## Contributing
Interested in helping out? We'd love to have your help!
You can help by:
- [Reporting a bug](https://github.com/tabulapdf/tabula-java/issues).
- Adding or editing documentation.
- Contributing code via a Pull Request.
- Spreading the word about `tabula-java` to people who might be able to benefit from using it.
### Backers
You can also support our continued work on `tabula-java` with a one-time or monthly donation [on OpenCollective](https://opencollective.com/tabulapdf#support). Organizations who use `tabula-java` can also [sponsor the project](https://opencollective.com/tabulapdf#support) for acknowledgement on [our official site](http://tabula.technology/) and this README.
