GitHub / aphp / edspdf
EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides the machinery to use rule-based or machine-learning-based approaches to classify text blocks as body or metadata.
JSON API: https://data.code.gouv.fr/api/v1/hosts/GitHub/repositories/aphp%2Fedspdf
Stars: 47
Forks: 7
Open issues: 0
License: bsd-3-clause
Language: Python
Size: 8.93 MB
Dependencies parsed at: Pending
Created at: almost 3 years ago
Updated at: 20 days ago
Pushed at: 3 months ago
Last synced at: 6 days ago
Commit Stats
Commits: 293
Authors: 10
Mean commits per author: 29.3
Development Distribution Score: 0.621
More commit stats: https://commits.ecosystem.code.gouv.fr/hosts/GitHub/repositories/aphp/edspdf
Topics: extraction, machine-learning, pdf