Docling4j brings Docling's document understanding capabilities to Java® projects.
The current version of this library is 0.1.1.
To use it in your project, add a dependency with the library's artifact coordinates (group id, artifact id, and version), like this:
<dependency>
    <groupId>com.ibm.docling</groupId>
    <artifactId>docling4j</artifactId>
    <version>0.1.1</version>
</dependency>
docling4j uses GraalPy, a high-performance embeddable Python 3 runtime for Java. Oracle GraalVM JDK is not required, but it is recommended for running docling4j because it supports runtime compilation to native code and efficient execution of embedded applications. For details on the level of optimization available on different Java runtimes, see the GraalVM documentation on runtime optimization support.
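To see which runtime your application is actually running on, a small check like the following may help. It is only a sketch and not part of docling4j: the class name `RuntimeCheck` and the helper `looksLikeGraalVm` are hypothetical, and the check assumes that GraalVM distributions mention "GraalVM" in the `java.vendor.version` system property, which is a heuristic rather than a guarantee.

```java
class RuntimeCheck {

    // Heuristic (an assumption, not a docling4j API): GraalVM distributions
    // usually include "GraalVM" in the java.vendor.version system property.
    public static boolean looksLikeGraalVm(String vendorVersion) {
        return vendorVersion != null && vendorVersion.contains("GraalVM");
    }

    public static void main(String[] args) {
        // Standard JVM system properties; java.vendor.version may be null on some JDKs.
        String vendorVersion = System.getProperty("java.vendor.version");
        String vmName = System.getProperty("java.vm.name");

        System.out.println("java.vm.name        = " + vmName);
        System.out.println("java.vendor.version = " + vendorVersion);

        if (looksLikeGraalVm(vendorVersion)) {
            System.out.println("This looks like a GraalVM JDK.");
        } else {
            System.out.println("This does not look like a GraalVM JDK; "
                    + "embedded Python code may run without runtime compilation.");
        }
    }
}
```

Running the class with the same `java` binary that launches your application prints the detected runtime, which makes it easier to confirm that the recommended JDK is the one in use.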
Please feel free to connect with us using the discussion section.
For more details on Docling's inner workings, check out the Docling Technical Report.
See Code of Conduct for details.
If you use Docling in your projects, please consider citing the following:
@techreport{Docling,
author = {Deep Search Team},
month = {8},
title = {Docling Technical Report},
url = {https://arxiv.org/abs/2408.09869},
eprint = {2408.09869},
doi = {10.48550/arXiv.2408.09869},
version = {1.0.0},
year = {2024}
}
The Docling codebase is under the MIT license. For individual model usage, please refer to the model licenses found in the original packages.
Docling is hosted as a project in the LF AI & Data Foundation.
The project was started by the AI for knowledge team at IBM Research Zurich.