pdf-parse
Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run directly in your browser or in Node!
Versions: 50
License: Apache-2.0
Install Scripts: No
Provenance: Verified
Supply chain provenance
Status for the latest visible version.
SLSA provenance attestation
npm registry signatures
gitHead linked
Maintainers
mehmet.kozan
Keywords
pdf, pdf-parser, pdf-parse, pdf.js, pdfjs, pdfjs-dist, pdf2text, pdf2json, pdf2image, pdf2pic, pdf-to-text, pdf-to-image, pdf-viewer, pdf-table, pdf-tools, pdf-utils, pdf-screenshot, pdf-thumbnail
Accepted risks
Findings the reviewer chose to accept rather than block on.
| Source | Rule | Reason | Accepted by | When |
|---|---|---|---|---|
| semgrep | semgrep:dynamic-require | AI (semgrep): Dynamic require is constrained to the package's own bundled pdf.js version directories; not an arbitrary module load vulnerability. Stable pattern for this package. | ai | |
| phantom-deps | phantom-dep:debug | AI (phantom-deps): Minor packaging hygiene issue; debug is a declared dependency used transitively. Not a security concern for this package. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/web/pdf-parse.umd.js | AI (source-diff): UMD browser build of PDF library. fetch/Worker usage is legitimate PDF processing functionality. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/web/pdf-parse.es.js | AI (source-diff): PDF library legitimately uses fetch/Worker for PDF processing in browser context. | ai | |
| source-diff | obfuscated-file:dist/pdf-parse/web/pdf-parse.es.js | AI (source-diff): Standard Vite web build output. Minification expected for browser distribution. | ai | |
| source-diff | obfuscated-file:dist/worker/esm/index.js | AI (source-diff): ESM worker bundle embedding pdfjs worker as base64 data URL — documented pattern for browser bundling. | ai | |
| source-diff | obfuscated-file:dist/pdf-parse/cjs/pdf.worker.mjs | AI (source-diff): This is the bundled pdfjs-dist worker (Mozilla Foundation, Apache 2.0), minified as expected. Stable false positive for this PDF library. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/cjs/pdf.worker.mjs | AI (source-diff): pdfjs-dist worker legitimately uses Worker/fetch for PDF processing. No malicious network calls present. | ai | |
| source-diff | obfuscated-file:dist/pdf-parse/esm/pdf.worker.mjs | AI (source-diff): Bundled pdfjs-dist worker (ESM variant), minified as expected. Stable false positive. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/esm/pdf.worker.mjs | AI (source-diff): pdfjs-dist worker legitimately uses Worker/fetch. No malicious network calls. | ai | |
| source-diff | obfuscated-file:dist/pdf-parse/web/pdf.worker.mjs | AI (source-diff): Bundled pdfjs-dist worker (web variant), minified as expected. Stable false positive. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/web/pdf.worker.mjs | AI (source-diff): pdfjs-dist worker legitimately uses Worker/fetch. No malicious network calls. | ai | |
| source-diff | obfuscated-file:dist/worker/pdf.worker.mjs | AI (source-diff): Bundled pdfjs-dist worker, minified as expected. Stable false positive for this PDF library. | ai | |
| source-diff | obfuscated-file:dist/worker/cjs/index.cjs | AI (source-diff): esbuild-generated CJS worker bundle with readable boilerplate. Minification is expected. | ai | |
| source-diff | obfuscated-file:dist/pdf-parse/cjs/index.cjs | AI (source-diff): Standard Vite/Rollup CJS build output of the library. Minification is expected for dist artifacts. | ai | |
| source-diff | net-exec-file:dist/pdf-parse/cjs/index.cjs | AI (source-diff): PDF library legitimately uses fetch/XHR for PDF loading. No malicious network behavior. | ai | |
| source-diff | net-exec-file:dist/worker/pdf.worker.mjs | AI (source-diff): pdfjs-dist worker legitimately uses Worker/fetch. No malicious network calls. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker.min-BFyMWnxX.cjs | AI (source-diff): This is a minified pdfjs-dist worker bundle embedded as a base64 data URI. The decoded content is standard Mozilla PDF.js worker code (Apache 2.0). Expected build artifact for this PDF parsing library. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.cjs.min.js | AI (source-diff): The long base64 string is a test OTF font binary used by pdfjs-dist for font loading detection. This is a well-known pattern in PDF.js source code, not a malicious payload. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker.min-B8nzkz8p.js | AI (source-diff): Same as above — minified pdfjs-dist worker bundle as base64 data URI. Standard Mozilla PDF.js worker code. Expected build artifact. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker.min-q6UyegJs.cjs | AI (source-diff): Same as above — minified pdfjs-dist worker bundle as base64 data URI. Standard Mozilla PDF.js worker code. Expected build artifact. | ai | |
| source-diff | net-exec-file:dist/cjs/index.cjs | AI (source-diff): Bundle includes pdfjs-dist (Mozilla PDF.js) which naturally has network calls and eval-like patterns for PDF rendering. | ai | |
| source-diff | obfuscated-file:dist/cjs/index.cjs | AI (source-diff): CJS bundle produced by Vite/webpack; standard bundler output patterns (__webpack_modules__, __privateGet, etc.), not obfuscation. | ai | |
| source-diff | obfuscated-file:bin/worker/worker_source.js | AI (source-diff): File is a base64-encoded pdfjs-dist webpack worker bundle (Mozilla Foundation, Apache 2.0). Long lines are from minification, not obfuscation. Consistent with the package's build pipeline. | ai | |
| source-diff | obfuscated-file:bin/worker/worker_source.cjs | AI (source-diff): File is a base64-encoded pdfjs-dist webpack worker bundle (Mozilla Foundation, Apache 2.0). Long lines are from minification, not obfuscation. Consistent with the package's build pipeline. | ai | |
| source-diff | obfuscated-file:dist/worker/index.cjs | AI (source-diff): File is the pdfjs-dist v5+ worker bundle embedded as base64 data URI — standard PDF.js distribution pattern, not malicious obfuscation. | ai | |
| source-diff | obfuscated-file:dist/worker/index.js | AI (source-diff): ESM variant of the same pdfjs-dist worker bundle; same Apache 2.0 Mozilla Foundation content, not malicious. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker-Lwigm9C_.js | AI (source-diff): Minified webpack-bundled pdf.js worker for browser. Minification is expected for browser distribution bundles. SLSA provenance confirms CI build. | ai | |
| source-diff | net-exec-file:dist/browser/pdf.worker-De4zQndo.js | AI (source-diff): pdf.js browser worker combining network and execution is expected library behavior for PDF parsing. SLSA provenance confirmed. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker-De4zQndo.js | AI (source-diff): Webpack-bundled pdf.js worker for browser distribution. Standard webpack module format, not malicious obfuscation. SLSA provenance confirms CI build. | ai | |
| source-diff | net-exec-file:dist/cjs/pdf.worker-DJHiXjsz.cjs | AI (source-diff): pdf.js workers inherently combine network (PDF fetching) and code execution (font rendering). This is core library functionality, not dropper/loader behavior. SLSA provenance confirmed. | ai | |
| source-diff | net-exec-file:dist/browser/pdf.worker-Lwigm9C_.js | AI (source-diff): pdf.js browser worker combining network and execution is expected library behavior for PDF parsing. SLSA provenance confirmed. | ai | |
| source-diff | obfuscated-file:dist/cjs/pdf.worker-DJHiXjsz.cjs | AI (source-diff): Webpack-bundled pdf.js worker artifact from pdfjs-dist dependency. Long lines are standard webpack bundle format, not malicious obfuscation. SLSA provenance confirms CI build. | ai | |
| source-diff | large-new-source-files | AI (source-diff): 53 new files are pdfjs-dist worker build artifacts (ESM, CJS, browser, node variants). Consistent with the new build:worker script and ./worker_source export map added in this version. | ai | |
| source-diff | obfuscated-file:dist/worker/source.cjs | AI (source-diff): File embeds the pdfjs-dist worker as a base64 data URI for cross-platform Web Worker support. The decoded content is the standard Mozilla-licensed pdfjs worker bundle, not malicious obfuscation. | ai | |
| source-diff | obfuscated-file:dist/worker/source.js | AI (source-diff): Same as source.cjs — ESM variant of the pdfjs worker data URI embed. Legitimate build artifact from the new build:worker step. | ai | |
| source-diff | encoded-string-file:dist/cjs/index.cjs | AI (source-diff): Base64 string is a PDF.js test font embedded in _loadTestFont getter — standard upstream Mozilla PDF.js pattern, not malicious. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.es.min.js | AI (source-diff): Same PDF.js test font base64 string appearing in browser build — identical benign pattern across all build outputs. | ai | |
| phantom-deps | phantom-dep:vite-plugin-dts | AI (phantom-deps): vite-plugin-dts is a build tool accidentally listed in dependencies instead of devDependencies. Packaging mistake, not a security issue. | ai | |
| source-diff | net-exec-file:dist/node/index.cjs | AI (source-diff): Standard webpack-bundled CJS output of pdf-parse library. Sample shows only webpack boilerplate, no malicious network or exec patterns. | ai | |
| source-diff | obfuscated-file:dist/node/pdf.worker.mjs | AI (source-diff): Minified pdfjs-dist worker bundle with Mozilla copyright header. Long lines are expected in bundled PDF.js worker output. | ai | |
| source-diff | net-exec-file:dist/node/pdf.worker.mjs | AI (source-diff): Mozilla pdfjs-dist worker bundle (v5.4.296). Network calls and dynamic execution are inherent to PDF.js worker architecture. | ai | |
| source-diff | obfuscated-file:dist/esm/pdf.worker.mjs | AI (source-diff): Minified pdfjs-dist worker bundle with Mozilla copyright header. Long lines are expected in bundled PDF.js worker output. | ai | |
| source-diff | net-exec-file:dist/esm/pdf.worker.mjs | AI (source-diff): Mozilla pdfjs-dist worker bundle (v5.4.296). Network calls and dynamic execution are inherent to PDF.js worker architecture. | ai | |
| source-diff | obfuscated-file:dist/cjs/pdf.worker.mjs | AI (source-diff): Minified pdfjs-dist worker bundle with Mozilla copyright header. Long lines are expected in bundled PDF.js worker output. | ai | |
| source-diff | net-exec-file:dist/cjs/pdf.worker.mjs | AI (source-diff): Mozilla pdfjs-dist worker bundle (v5.4.296). Network calls and dynamic execution are inherent to PDF.js worker architecture. | ai | |
| source-diff | net-exec-file:dist/browser/pdf.worker.mjs | AI (source-diff): Mozilla pdfjs-dist worker bundle (v5.4.296) with Apache 2.0 license header. Network calls and dynamic execution are inherent to PDF.js worker architecture. | ai | |
| source-diff | obfuscated-file:dist/browser/pdf.worker.mjs | AI (source-diff): Minified pdfjs-dist worker bundle with Mozilla copyright header. Long lines are expected in bundled PDF.js worker output. | ai | |
| source-diff | net-exec-file:dist/browser/pdf.worker.min.mjs | AI (source-diff): Standard Mozilla PDF.js worker bundle (Apache 2.0, pdfjsVersion 5.4.296). Dynamic execution is inherent to PDF rendering; not malicious. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.umd.min.js | AI (source-diff): Same PDF.js test font base64 string in UMD build — benign, standard PDF.js pattern. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.umd.js | AI (source-diff): Same PDF.js test font base64 string in UMD build — benign, standard PDF.js pattern. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.cjs.js | AI (source-diff): Long base64 string is PDF.js _loadTestFont getter — a well-known pattern for embedding a test font binary. Not a hidden payload. | ai | |
| source-diff | encoded-string-file:dist/browser/pdf-parse.es.js | AI (source-diff): Same PDF.js test font base64 string in ES build — benign, standard PDF.js pattern. | ai | |
| source-diff | net-exec-file:dist/browser/pdf-parse.umd.min.js | AI (source-diff): Minified browser bundle produced by vite+terser build. Network/exec patterns are PDF.js worker infrastructure, not malicious. | ai | |
| source-diff | net-exec-file:dist/browser/pdf-parse.es.min.js | AI (source-diff): Minified browser bundle produced by vite+terser build. Base64 workerUrl is standard PDF.js inline worker pattern with Mozilla Foundation license header. | ai | |
| source-diff | net-exec-file:dist/browser/pdf-parse.cjs.min.js | AI (source-diff): Minified browser bundle produced by vite+terser build. Network/exec patterns are PDF.js worker infrastructure (data URI worker embedding), not malicious dropper behavior. | ai | |
| dependencies | unvetted-dep:pdfjs-dist | AI (dependencies): pdfjs-dist is Mozilla's official PDF.js distribution — an expected and appropriate dependency for a PDF parsing library. | ai | |
| dependencies | unvetted-dep:@napi-rs/canvas | AI (dependencies): @napi-rs/canvas is a well-known native canvas binding, appropriate for a package that renders/extracts images from PDFs. | ai |
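The Rule column above follows a `kind:target` pattern (for example `obfuscated-file:dist/worker/index.js`), with a few rules such as `large-new-source-files` carrying no target. A minimal sketch of how such rule strings could be split and tallied by kind is shown here; `parseRule` and `countByKind` are hypothetical helper names for illustration, not part of any scanner's API.

```typescript
// Hypothetical helpers: split an accepted-risk rule string at its first colon.
interface ParsedRule {
  kind: string;          // e.g. "obfuscated-file", "net-exec-file"
  target: string | null; // file path or dependency name; null when absent
}

function parseRule(rule: string): ParsedRule {
  const sep = rule.indexOf(":");
  if (sep === -1) {
    // Rules such as "large-new-source-files" have no target part.
    return { kind: rule, target: null };
  }
  return { kind: rule.slice(0, sep), target: rule.slice(sep + 1) };
}

// Count how many accepted risks fall under each rule kind.
function countByKind(rules: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const rule of rules) {
    const { kind } = parseRule(rule);
    counts[kind] = (counts[kind] ?? 0) + 1;
  }
  return counts;
}

// A few rule strings copied from the table above.
const sample = [
  "obfuscated-file:dist/worker/index.js",
  "net-exec-file:dist/node/index.cjs",
  "obfuscated-file:dist/worker/source.cjs",
  "large-new-source-files",
];
console.log(countByKind(sample));
```

Splitting at the first colon only keeps targets such as `@napi-rs/canvas` or nested paths intact.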
Versions (showing 50 of 50)
| Version | Deps | Published |
|---|---|---|
| 2.4.5 | 2 / 13 | |
| 2.4.4 | 2 / 11 | |
| 2.4.3 | 2 / 10 | |
| 2.4.0 | 2 / 10 | |
| 2.3.12 | 1 / 8 | |
| 2.3.11 | 1 / 9 | |
| 2.3.10 | 1 / 9 | |
| 2.3.9 | 1 / 10 | |
| 2.3.8 | 1 / 10 | |
| 2.3.7 | 1 / 9 | |
| 2.3.6 | 1 / 9 | |
| 2.3.5 | 1 / 9 | |
| 2.3.0 | 1 / 11 | |
| 2.2.16 | 1 / 11 | |
| 2.2.13 | 1 / 11 | |
| 2.2.12 | 1 / 11 | |
| 2.2.11 | 1 / 11 | |
| 2.2.10 | 2 / 11 | |
| 2.2.9 | 1 / 11 | |
| 2.2.8 | 1 / 12 | |
| 2.2.7 | 1 / 12 | |
| 2.2.6 | 1 / 12 | |
| 2.2.5 | 1 / 12 | |
| 2.2.4 | 1 / 12 | |
| 2.2.3 | 1 / 12 | |
| 2.2.2 | 1 / 12 | |
| 2.2.1 | 1 / 11 | |
| 2.2.0 | 1 / 10 | |
| 2.1.10 | 1 / 10 | |
| 2.1.9 | 1 / 10 | |
| 2.1.8 | 1 / 11 | |
| 2.1.7 | 1 / 9 | |
| 2.1.6 | 1 / 12 | |
| 2.1.5 | 1 / 12 | |
| 2.1.4 | 1 / 10 | |
| 2.1.3 | 1 / 8 | |
| 2.1.2 | 1 / 8 | |
| 2.1.1 | 1 / 8 | |
| 1.1.1 | 2 / 1 | |
| 1.1.0 | 2 / 1 | |
| 1.0.9 | 2 / 1 | |
| 1.0.8 | 2 / 1 | |
| 1.0.7 | 3 / 1 | |
| 1.0.6 | 3 / 1 | |
| 1.0.5 | 3 / 1 | |
| 1.0.3 | 3 / 1 | |
| 1.0.2 | 3 / 1 | |
| 1.0.1 | 3 / 1 | |
| 0.0.1 | 1 / 1 | |
| 2.4.4-beta.1 | 2 / 11 |
v0.0.1: 1 finding
LOW: No provenance attestation (provenance)
Package was published without Sigstore provenance. Only ~12% of npm packages have provenance, so this is common but not ideal.
v2.4.4-beta.1: 1 finding
INFO: Has SLSA provenance attestation (provenance)
Published via CI/CD with Sigstore attestation (predicate: https://slsa.dev/provenance/v1). This is the strongest supply chain integrity signal.
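The two findings above differ only in whether an attestation with the SLSA v1 predicate is attached to the version. A rough sketch of that classification follows, assuming each attestation is represented by an object with a `predicateType` field; the `Attestation` shape and the `classifyProvenance` name are assumptions made for illustration and do not describe how this page's scanner is implemented.

```typescript
// Assumed shape: one entry per attestation attached to a published version.
interface Attestation {
  predicateType: string;
}

// Predicate URI quoted in the v2.4.4-beta.1 finding above.
const SLSA_V1 = "https://slsa.dev/provenance/v1";

interface ProvenanceFinding {
  severity: "INFO" | "LOW";
  title: string;
}

// Hypothetical classifier: INFO when a SLSA v1 attestation exists, LOW otherwise.
function classifyProvenance(attestations: Attestation[]): ProvenanceFinding {
  const hasSlsa = attestations.some((a) => a.predicateType === SLSA_V1);
  return hasSlsa
    ? { severity: "INFO", title: "Has SLSA provenance attestation" }
    : { severity: "LOW", title: "No provenance attestation" };
}

// A version published with an attestation, and one published without any.
const withAttestation = classifyProvenance([{ predicateType: SLSA_V1 }]);
const withoutAttestation = classifyProvenance([]);
console.log(withAttestation.title);
console.log(withoutAttestation.title);
```

The sketch only checks that an attestation is present; it does not verify the Sigstore signature itself.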