Design Philosophy

PDF Reader MCP is built on these core principles:

1. Performance First

  • Concurrent Processing - Multiple PDF sources are processed in parallel
  • Efficient Parsing - Uses pdfjs-dist for reliable, fast PDF parsing
  • Minimal Overhead - Direct stdio communication with no HTTP overhead
  • Batch Operations - Process multiple files in a single request

2. Comprehensive Extraction

  • Text Extraction - Full document or specific pages
  • Page Ranges - Flexible page selection with ranges like "1-5, 10, 15-20"
  • Metadata Access - Document properties, author, title, dates
  • Image Extraction - Embedded images as base64-encoded PNG
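
As an illustration of the page-range syntax above, here is a minimal sketch of how a string such as "1-5, 10, 15-20" could be expanded into page numbers. The helper name `parsePageRanges`, its `pageCount` parameter, and the choice to clamp ranges to the document length are assumptions made for this example; they are not taken from the PDF Reader MCP source.

```typescript
// Hypothetical helper, not part of the documented API.
// Expands a spec like "1-5, 10, 15-20" into sorted, de-duplicated 1-based page numbers.
function parsePageRanges(spec: string, pageCount: number): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page range: "${token}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) throw new Error(`Invalid page range: "${token}"`);
    // Assumption: ranges are clamped to the document length rather than rejected.
    for (let page = start; page <= Math.min(end, pageCount); page++) {
      pages.add(page);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}

// For a 17-page document, "15-20" is clamped to 15..17.
const selectedPages = parsePageRanges("1-5, 10, 15-20", 17);
// selectedPages is [1, 2, 3, 4, 5, 10, 15, 16, 17]
```

Returning a sorted, de-duplicated list keeps overlapping ranges such as "1-5, 3-7" from extracting the same page twice.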

3. Simple Integration

  • Single Tool - One read_pdf tool handles all extraction needs
  • Standard MCP - Compatible with any MCP client
  • Easy Setup - One-command installation via npx
  • Multiple Clients - Works with Claude Desktop, Claude Code, Cursor, and more

4. Flexible Input

  • Local Files - Read PDFs from any path on the filesystem
  • Remote URLs - Download and process PDFs from URLs
  • Mixed Sources - Combine local and remote files in one request
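
To make the idea of mixed sources concrete, the sketch below models a source as either a local path or a remote URL and builds one list containing both kinds. The `PdfSource` type and the `toPdfSource` helper are hypothetical names for this example; the actual `read_pdf` input schema may be shaped differently.

```typescript
// Hypothetical source shape; the real read_pdf schema may differ.
type PdfSource =
  | { kind: "file"; path: string }
  | { kind: "url"; url: string };

// Treat anything starting with http:// or https:// as remote, everything else as a local path.
function toPdfSource(input: string): PdfSource {
  return /^https?:\/\//i.test(input)
    ? { kind: "url", url: input }
    : { kind: "file", path: input };
}

// One request can then carry both kinds side by side.
const mixedSources: PdfSource[] = [
  "./docs/report.pdf",
  "https://example.com/spec.pdf",
].map(toPdfSource);
```

A discriminated union like this lets the code that loads a source branch on `kind` once, so the extraction steps that follow do not need to know where the bytes came from.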

5. Robust Error Handling

  • Graceful Failures - A failed source doesn't stop the others from being processed
  • Clear Errors - Specific error codes and messages
  • Partial Results - Get results from successful sources even if some fail
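
One common way to get this behavior in Node.js is `Promise.allSettled`, which runs every read concurrently and reports each outcome separately. The sketch below assumes a hypothetical `readOne` callback and a hypothetical `SourceResult` shape; it shows the pattern only and is not the server's actual implementation.

```typescript
// Hypothetical per-source result: either extracted data or an error message.
type SourceResult<T> =
  | { source: string; success: true; data: T }
  | { source: string; success: false; error: string };

// Pair each settled outcome with its source so callers can see which ones failed.
function toSourceResults<T>(
  sources: string[],
  settled: PromiseSettledResult<T>[],
): SourceResult<T>[] {
  return settled.map((outcome, i): SourceResult<T> =>
    outcome.status === "fulfilled"
      ? { source: sources[i], success: true, data: outcome.value }
      : {
          source: sources[i],
          success: false,
          error: String(outcome.reason?.message ?? outcome.reason),
        },
  );
}

// readOne is a placeholder for whatever loads and parses a single PDF.
async function readAll<T>(
  sources: string[],
  readOne: (source: string) => Promise<T>,
): Promise<SourceResult<T>[]> {
  // allSettled starts every read at once and never rejects,
  // so one failing source cannot discard the results of the others.
  const settled = await Promise.allSettled(sources.map((s) => readOne(s)));
  return toSourceResults(sources, settled);
}
```

With `Promise.all`, by contrast, the first rejection would reject the whole batch and the successful results would be lost to the caller.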

Technical Stack

  • Runtime: Node.js 22+
  • PDF Parsing: pdfjs-dist
  • Image Encoding: pngjs
  • Schema Validation: Zod
  • MCP SDK: @sylphx/mcp-server-sdk
  • Build Tool: bunup

Released under the MIT License.