{ "version": "https://jsonfeed.org/version/1.1", "user_comment": "This feed allows you to read the posts from this site in any feed reader that supports the JSON Feed format. To add this feed to your reader, copy the following URL -- https://matoken.org/blog/tag/pdfimages/feed/json/ -- and add it to your reader.", "home_page_url": "https://matoken.org/blog/tag/pdfimages/", "feed_url": "https://matoken.org/blog/tag/pdfimages/feed/json/", "language": "ja", "title": "pdfimages – matoken's blog", "description": "Is there no plan B?", "icon": "https://matoken.org/blog/wp-content/uploads/2025/03/cropped-1865f695c4eecc844385acef2f078255036adccd42c254580ea3844543ab56d9.jpeg", "items": [ { "id": "https://matoken.org/blog/?p=4014", "url": "https://matoken.org/blog/2024/05/02/remove-image-margins/", "title": "Removing image margins from National Diet Library Digital Collections holdings to make them easier to read", "content_html": "
When I download holdings from the National Diet Library Digital Collections to read locally, the margins get in the way, so I removed them.
\n\n
First, to make full use of the National Diet Library Digital Collections, complete full registration as an individual National Diet Library user.
\nFull registration can now be done remotely without visiting the library in person, which is a welcome convenience.
\nSearch the National Diet Library Digital Collections for the book you want to read, then log in (registered user ID and password).
\nClick \u5370\u5237 (Print) at the bottom right, specify the print range, and use \u300c\u5370\u5237\u30d5\u30a1\u30a4\u30eb\u3092\u958b\u304f\u300d (Open print file) to generate a PDF of that range.
\nA \u300c\u30d5\u30a1\u30a4\u30eb\u3092\u4f5c\u6210\u3057\u3066\u3044\u307e\u3059\u2026\u300d (Creating the file...) message appears at the top. After a short wait it changes to \u300c\u5370\u5237\u7528\u30d5\u30a1\u30a4\u30eb\u3092\u4f5c\u6210\u3057\u307e\u3057\u305f\u3002\u53f3\u306e\u30ea\u30f3\u30af\u304b\u3089PDF\u30d5\u30a1\u30a4\u30eb\u3092\u8868\u793a\u3067\u304d\u307e\u3059\u3002\u300d (The print file has been created. You can view the PDF file from the link on the right.), at which point you can download the file via \u300cPDF\u30d5\u30a1\u30a4\u30eb\u3092\u958b\u304f\u300d (Open PDF file).
\nNote: if the item has more than 50 frames, use \u300c\u7bc4\u56f2\u3092\u6307\u5b9a\u300d (Specify range).\n
The downloaded PDF has the registered user ID, the user name, and the generation timestamp (date and time down to the second) embedded in the image at the bottom of each page.
\nThe PDF looked like this: 2481 x 1732.5 pts, 50 frames (100 pages' worth), about 37 MB.
\n$ pdfinfo digidepo_2530201_0001-001-050.pdf\nTitle: \u5b87\u5b99\u8239\u30d3\u30fc\u30b0\u30eb\u53f7\u306e\u5192\u967a\nKeywords: A.E.\u30f4\u30a1\u30f3\u30fb\u30f4\u30a9\u30fc\u30af\u30c8 \u8457 \u307b\u304b\u300e\u5b87\u5b99\u8239\u30d3\u30fc\u30b0\u30eb\u53f7\u306e\u5192\u967a\u300f,\u6771\u4eac\u5275\u5143\u793e,1964.2(43\u5237:1999.10). \u56fd\u7acb\u56fd\u4f1a\u56f3\u66f8\u9928\u30c7\u30b8\u30bf\u30eb\u30b3\u30ec\u30af\u30b7\u30e7\u30f3 https://dl.ndl.go.jp/pid/2530201 (\u53c2\u7167 2024-05-02)\nAuthor: A.E.\u30f4\u30a1\u30f3\u30fb\u30f4\u30a9\u30fc\u30af\u30c8 \u8457\nProducer: PyPDF2\nCreationDate: Thu May 2 00:30:21 2024 JST\nModDate: Thu May 2 00:30:21 2024 JST\nCustom Metadata: no\nMetadata Stream: no\nTagged: no\nUserProperties: no\nSuspects: no\nForm: none\nJavaScript: no\nPages: 50\nEncrypted: yes (print:yes copy:no change:no addNotes:no algorithm:AES-256)\nPage size: 2481 x 1732.5 pts\nPage rot: 0\nFile size: 36676525 bytes\nOptimized: no\nPDF version: 1.7\n
The file is Encrypted with print:yes copy:no change:no addNotes:no, and I do not know the password, so pdftk cannot be used.
\nError: Invalid PDF: unknown.encryption.type.r\nError: Failed to open input PDF file:\n
Printing is allowed, so MComix can still read the file. For the same reason, pdfimages was able to extract the images inside.
$ pdfimages -all digidepo_2530201_0001-001-050.pdf 2530201/2530201\n$ ls 2530201/\n2530201-000.jpg 2530201-004.jpg 2530201-008.jpg 2530201-012.jpg 2530201-016.jpg 2530201-020.jpg 2530201-024.jpg 2530201-028.jpg 2530201-032.jpg 2530201-036.jpg 2530201-040.jpg 2530201-044.jpg 2530201-048.jpg\n2530201-001.jpg 2530201-005.jpg 2530201-009.jpg 2530201-013.jpg 2530201-017.jpg 2530201-021.jpg 2530201-025.jpg 2530201-029.jpg 2530201-033.jpg 2530201-037.jpg 2530201-041.jpg 2530201-045.jpg 2530201-049.jpg\n2530201-002.jpg 2530201-006.jpg 2530201-010.jpg 2530201-014.jpg 2530201-018.jpg 2530201-022.jpg 2530201-026.jpg 2530201-030.jpg 2530201-034.jpg 2530201-038.jpg 2530201-042.jpg 2530201-046.jpg\n2530201-003.jpg 2530201-007.jpg 2530201-011.jpg 2530201-015.jpg 2530201-019.jpg 2530201-023.jpg 2530201-027.jpg 2530201-031.jpg 2530201-035.jpg 2530201-039.jpg 2530201-043.jpg 2530201-047.jpg\n$ file 2530201/2530201-001.jpg\n2530201/2530201-001.jpg: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 3308x2310, components 3\n
The margins bother me, so I cut them with ImageMagick's trim. The orthodox approach is to specify crop coordinates, but this time I let ImageMagick do the work with fuzz.
\nThe right fuzz percentage depends on the image and has to be found by trial. Here 60% looks good.
$ convert 2530201/2530201-001.jpg -fuzz 60% -trim 2530201-001-trim.jpg\n
The margins were cut, taking the image from 3308×2310 \u2192 2831×2099. That looks good, so I applied the same trim to every image with mogrify and bundled the results into a single archive.
\n$ mogrify -fuzz 60% -trim 2530201/*\n$ find 2530201 | sort -V | zip -@0 ./\u5b87\u5b99\u8239\u30d3\u30fc\u30b0\u30eb\u53f7\u306e\u5192\u967a.zip\n
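
Before running mogrify over every page, it can help to compare a few fuzz values on one sample image. The following is a minimal sketch, not part of the original workflow: fuzz_sweep is a hypothetical helper name, and it assumes the ImageMagick 6 convert command used above. The info: output target prints the -format string instead of writing an image, so nothing on disk is modified.

```shell
# Hypothetical helper: print the trimmed size for each candidate fuzz value.
# Usage: fuzz_sweep IMAGE PERCENT...
fuzz_sweep() {
    image=$1
    shift
    for pct in "$@"; do
        # -format '%wx%h' with the info: target reports the size after -trim
        # without saving a trimmed copy.
        size=$(convert "$image" -fuzz "${pct}%" -trim -format '%wx%h' info:)
        printf '%s%%\t%s\n' "$pct" "$size"
    done
}
```

For example, `fuzz_sweep 2530201/2530201-001.jpg 20 40 60 80` prints one line per value. A size that stops shrinking between two values suggests the margin has already been removed.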
For material longer than 50 frames that is split across two or more PDFs, repeat the same conversion for the second and later PDFs and add the results to the archive. However, the file names generated by pdfimages would collide, so change them with the rename command or a similar tool.
$ rm 2530201/*\n$ pdfimages -all ./digidepo_2530201_0001-051-100.pdf 2530201/2530201\n$ rename 's/(\\d+)-(\\d+)/sprintf \"$1-%03d\",$2 + 50/e' 2530201/*\n$ mogrify -fuzz 60% -trim 2530201/*\n$ find 2530201 -print | sort -V | zip -@0 ./\u5b87\u5b99\u8239\u30d3\u30fc\u30b0\u30eb\u53f7\u306e\u5192\u967a.zip\n
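
Where the Perl rename command is not installed, the same index shift can be sketched in plain POSIX shell. This is an illustration only: shift_index is a hypothetical helper, and it assumes names of the form PREFIX-NNN.EXT as produced by pdfimages above. It refuses to overwrite an existing file, so the shifted range must not overlap names already present.

```shell
# Hypothetical helper: shift the NNN index of files named PREFIX-NNN.EXT.
# Usage: shift_index OFFSET FILE...
shift_index() {
    offset=$1
    shift
    for file in "$@"; do
        stem=${file%.*}       # e.g. 2530201/2530201-000
        ext=${file##*.}       # e.g. jpg
        prefix=${stem%-*}     # e.g. 2530201/2530201
        index=${stem##*-}     # e.g. 000
        # Strip leading zeros so that 008 and 009 are not parsed as octal.
        while [ "${#index}" -gt 1 ] && [ "${index#0}" != "$index" ]; do
            index=${index#0}
        done
        new=$(printf '%s-%03d.%s' "$prefix" "$((index + offset))" "$ext")
        if [ -e "$new" ]; then
            echo "skip: $new already exists" >&2
            continue
        fi
        mv -- "$file" "$new"
    done
}
```

Running `shift_index 50 2530201/*` after extracting the second PDF would rename 2530201-000.jpg to 2530201-050.jpg, and so on.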
Viewed in MComix on a tablet, the result looks good. The files are a bit large, so shrinking the images, or converting the archive to PDF for compatibility, may also be worthwhile.
\nFor reading on a phone, splitting each image in two down the middle would help, but from a quick look the gutter is not consistently centered, so that seems like a bit of a hassle.
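
Because the gutter is not always at the exact center, one option is to compute the two crop rectangles from a gutter position chosen per image. The sketch below is an assumption-laden illustration rather than something from the original post: split_geometry is a hypothetical helper, and the gutter x coordinate must be supplied by hand. It only prints ImageMagick WIDTHxHEIGHT+X+Y geometry strings and does not touch any file.

```shell
# Hypothetical helper: print crop geometries for the left and right halves
# of a two-page spread, given its size and the gutter's x position.
# Usage: split_geometry WIDTH HEIGHT GUTTER_X
split_geometry() {
    width=$1 height=$2 gutter=$3
    # Left half: from x=0 up to the gutter.
    printf '%dx%d+0+0\n' "$gutter" "$height"
    # Right half: from the gutter to the right edge.
    printf '%dx%d+%d+0\n' "$((width - gutter))" "$height" "$gutter"
}
```

Each printed geometry can then be passed to ImageMagick, for example `convert page.jpg -crop 1431x2099+1400+0 +repage right.jpg`, where 1400 is a made-up gutter position for a 2831x2099 image.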
$ dpkg-query -W imagemagick zip poppler-utils rename\nimagemagick 8:6.9.12.98+dfsg1-5.2\npoppler-utils 22.12.0-2.2+b1\nrename 2.02-1\nzip 3.0-13\n$ lsb_release -a\nNo LSB modules are available.\nDistributor ID: Debian\nDescription: Debian GNU/Linux trixie/sid\nRelease: n/a\nCodename: trixie\n$ arch\nx86_64\n
$ apt show poppler-utils | grep Description: -A99\r\n\r\nWARNING: apt does not have a stable CLI interface. Use with caution in scripts.\r\n\r\nDescription: PDF \u5411\u3051\u30e6\u30fc\u30c6\u30a3\u30ea\u30c6\u30a3 (Poppler \u30d9\u30fc\u30b9)\r\n Poppler \u306f xpdf PDF \u30d3\u30e5\u30fc\u30a2\u3092\u57fa\u306b\u4f5c\u3089\u308c\u305f PDF \u63cf\u753b\u30e9\u30a4\u30d6\u30e9\u30ea\u3067\u3059\u3002\r\n .\r\n \u672c\u30d1\u30c3\u30b1\u30fc\u30b8\u306b\u306f (Poppler \u30d9\u30fc\u30b9\u306e) \u30b3\u30de\u30f3\u30c9\u30e9\u30a4\u30f3\u30e6\u30fc\u30c6\u30a3\u30ea\u30c6\u30a3\u304c\u542b\u307e\u308c\u3001\r\n PDF \u6587\u66f8\u306e\u60c5\u5831\u306e\u53d6\u5f97\u3001\u4ed6\u306e\u5f62\u5f0f\u3078\u306e\u5909\u63db\u3001\u7de8\u96c6\u304c\u3067\u304d\u307e\u3059\u3002\r\n * pdfdetach -- \u57cb\u3081\u8fbc\u307f\u30d5\u30a1\u30a4\u30eb (\u6dfb\u4ed8\u30d5\u30a1\u30a4\u30eb) \u306e\u4e00\u89a7\u51fa\u529b\u307e\u305f\u306f\u62bd\u51fa\r\n * pdffonts -- \u30d5\u30a9\u30f3\u30c8\u5206\u6790\u30c4\u30fc\u30eb\r\n * pdfimages -- \u753b\u50cf\u62bd\u51fa\u30c4\u30fc\u30eb\r\n * pdfinfo -- \u6587\u66f8\u60c5\u5831\r\n * pdfseparate -- \u30da\u30fc\u30b8\u62bd\u51fa\u30c4\u30fc\u30eb\r\n * pdfsig -- \u30c7\u30b8\u30bf\u30eb\u7f72\u540d\u306e\u691c\u8a3c\r\n * pdftocairo -- PDF \u304b\u3089 PNG/JPEG/PDF/PS/EPS/SVG \u3078\u306e Cairo \u3092\u4f7f\u3063\u305f\u5909\u63db\u30c4\u30fc\u30eb\r\n * pdftohtml -- PDF \u304b\u3089 HTML \u3078\u306e\u5909\u63db\u30c4\u30fc\u30eb\r\n * pdftoppm -- PDF \u304b\u3089 PPM/PNG/JPEG \u753b\u50cf\u3078\u306e\u5909\u63db\u30c4\u30fc\u30eb\r\n * pdftops -- PDF \u304b\u3089 PostScript (PS) \u3078\u306e\u5909\u63db\u30c4\u30fc\u30eb\r\n * pdftotext -- \u30c6\u30ad\u30b9\u30c8\u306e\u62bd\u51fa\r\n * pdfunite -- \u6587\u66f8\u306e\u4f75\u5408\u30c4\u30fc\u30eb\n
$ sudo apt install poppler-utils\n
$ pdfimages\r\npdfimages version 0.69.0\r\nCopyright 2005-2018 The Poppler Developers - http://poppler.freedesktop.org\r\nCopyright 1996-2011 Glyph & Cog, LLC\r\nUsage: pdfimages [options] <PDF-file> <image-root>\r\n -f <int> : first page to convert\r\n -l <int> : last page to convert\r\n -png : change the default output format to PNG\r\n -tiff : change the default output format to TIFF\r\n -j : write JPEG images as JPEG files\r\n -jp2 : write JPEG2000 images as JP2 files\r\n -jbig2 : write JBIG2 images as JBIG2 files\r\n -ccitt : write CCITT images as CCITT files\r\n -all : equivalent to -png -tiff -j -jp2 -jbig2 -ccitt\r\n -list : print list of images instead of saving\r\n -opw <string> : owner password (for encrypted files)\r\n -upw <string> : user password (for encrypted files)\r\n -p : include page numbers in output file names\r\n -q : don't print any messages or errors\r\n -v : print copyright and version info\r\n -h : print usage information\r\n -help : print usage information\r\n --help : print usage information\r\n -? : print usage information\n
$ pdfimages ./bicycle_parking.pdf -list\r\npage num type width height color comp bpc enc interp object ID x-ppi y-ppi size ratio\r\n--------------------------------------------------------------------------------------------\r\n 1 0 image 340 120 rgb 3 8 image no 5 0 221 221 9576B 7.8%\r\n 1 1 image 960 720 rgb 3 8 jpeg yes 14 0 170 170 64.2K 3.2%\n
$ pdfimages ./bicycle_parking.pdf ./bicycle_parking-images\r\n$ ls -l ./bicycle_parking-images*\r\n-rw-r--r-- 1 matoken matoken 122415 11\u6708 12 21:40 ./bicycle_parking-images-000.ppm\r\n-rw-r--r-- 1 matoken matoken 2073615 11\u6708 12 21:40 ./bicycle_parking-images-001.ppm\r\n$ identify ./bicycle_parking-images-000.ppm\r\n./bicycle_parking-images-000.ppm PPM 340x120 340x120+0+0 8-bit sRGB 122415B 0.000u 0:00.000\n
$ pdfimages ./bicycle_parking.pdf ./bicycle_parking-images -png\r\n$ ls -l ./bicycle_parking-images*\r\n-rw-r--r-- 1 matoken matoken 10274 11\u6708 12 21:46 ./bicycle_parking-images-000.png\r\n-rw-r--r-- 1 matoken matoken 321115 11\u6708 12 21:46 ./bicycle_parking-images-001.png\r\n$ identify ./bicycle_parking-images-000.png\r\n./bicycle_parking-images-000.png PNG 340x120 340x120+0+0 8-bit sRGB 10274B 0.000u 0:00.000\n
$ pdfimages ./bicycle_parking.pdf ./bicycle_parking-images -j\r\n$ ls -l ./bicycle_parking-images*\r\n-rw-r--r-- 1 matoken matoken 122415 11\u6708 12 21:48 ./bicycle_parking-images-000.ppm\r\n-rw-r--r-- 1 matoken matoken 65695 11\u6708 12 21:48 ./bicycle_parking-images-001.jpg\n
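
The three runs above differ because of how each image is stored in the PDF: in the -list output, the image with enc jpeg is written as a .jpg file by -j, while the one with enc image falls back to PPM unless -png is given. A rough way to see this mix before extracting is to tally the enc column. The sketch below assumes the -list column layout shown above (two header lines, enc in the ninth field); count_encodings is a hypothetical helper name.

```shell
# Hypothetical helper: count images per stored encoding.
# Reads pdfimages -list output on stdin; field 9 of each data row is enc.
count_encodings() {
    awk 'NR > 2 { n[$9]++ } END { for (e in n) print e, n[e] }' | sort
}
```

Usage: `pdfimages -list ./bicycle_parking.pdf | count_encodings`. If every image is reported as jpeg, -j or -all extracts the original JPEG data without re-encoding.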
OpenDocument Format files are zip-compressed containers that also hold the embedded image files, so converting the PDF to Draw format and extracting only the Pictures directory, where the images are stored, gets the images out.
For installation, besides distribution packages, Flatpak and snap builds are also officially distributed, so pick whichever you prefer. Here I used the one installed with apt install on Debian sid amd64.
\n$ libreoffice --headless --nologo --nofirststartwizard --convert-to odg ./bicycle_parking.pdf\r\nconvert /home/matoken/Downloads/bicycle_parking.pdf -> /home/matoken/Downloads/bicycle_parking.odg using filter : draw8\r\n$ unzip ./bicycle_parking.odg Pictures/*\r\nArchive: ./bicycle_parking.odg\r\n extracting: Pictures/10000000000003C0000002D0136E1A08DF8E2B28.jpg\r\n extracting: Pictures/100000000000015400000078BA7345C344D8D008.png\r\n$ ls -lA Pictures/\r\n\u5408\u8a08 80\r\n-rw-r--r-- 1 matoken matoken 10812 11\u6708 12 12:56 100000000000015400000078BA7345C344D8D008.png\r\n-rw-r--r-- 1 matoken matoken 65695 11\u6708 12 12:56 10000000000003C0000002D0136E1A08DF8E2B28.jpg\r\n$ identify Pictures/*\r\nPictures/100000000000015400000078BA7345C344D8D008.png PNG 340x120 340x120+0+0 8-bit sRGB 10812B 0.000u 0:00.000\r\nPictures/10000000000003C0000002D0136E1A08DF8E2B28.jpg JPEG 960x720 960x720+0+0 8-bit sRGB 65695B 0.000u 0:00.000\n
$ dpkg-query -W poppler-utils unzip libreoffice imagemagick\r\nimagemagick 8:6.9.10.14+dfsg-7\r\nlibreoffice 1:6.1.3-1\r\npoppler-utils 0.69.0-2\r\nunzip 6.0-21\r\n$ lsb_release -d\r\nDescription: Debian GNU/Linux unstable (sid)\r\n$ uname -m\r\nx86_64\n