{"id":3516,"date":"2022-03-07T00:51:33","date_gmt":"2022-03-06T15:51:33","guid":{"rendered":"http:\/\/matoken.org\/blog\/?p=3516"},"modified":"2022-03-07T01:54:11","modified_gmt":"2022-03-06T16:54:11","slug":"try-the-cli-version-of-singlefile","status":"publish","type":"post","link":"https:\/\/matoken.org\/blog\/2022\/03\/07\/try-the-cli-version-of-singlefile\/","title":{"rendered":"Trying the CLI version of SingleFile"},"content":{"rendered":"<div class=\"paragraph\">\n<p>I tried the browser extension version of SingleFile.<\/p>\n<\/div>\n<div class=\"ulist\">\n<ul>\n<li><a href=\"https:\/\/matoken.org\/blog\/2022\/03\/04\/save-your-website-as-a-single-file-using-a-singlefile\/\">Saving a website as a single file with SingleFile \u2013 matoken\u2019s meme<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/gildas-lormeau\/SingleFile\">gildas-lormeau\/SingleFile: Web Extension for Firefox\/Chrome\/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file<\/a><\/li>\n<\/ul>\n<\/div>\n<div class=\"paragraph\">\n<p>It is handy while browsing the web, but I also want to use it headless, so I tried the CLI version.<\/p>\n<\/div>\n<div class=\"ulist\">\n<ul>\n<li><a href=\"https:\/\/github.com\/gildas-lormeau\/SingleFile\/blob\/master\/cli\/README.MD\">SingleFile\/README.MD at master \u00b7 gildas-lormeau\/SingleFile<\/a><\/li>\n<\/ul>\n<\/div>\n<div class=\"paragraph\">\n<p><!--more--><\/p>\n<\/div>\n<div class=\"paragraph\">\n<p>There are several ways to run it; this time I went with the one that looked easiest, the Docker 
Hub image.<\/p>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"content\">\n<pre>$ docker pull capsulecode\/singlefile\n$ docker tag capsulecode\/singlefile singlefile\n$ docker image ls singlefile\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nsinglefile   latest    36fda8dcb810   4 months ago   755MB<\/pre>\n<\/div>\n<\/div>\n<div class=\"paragraph\">\n<p>The help output lists a lot of options.<\/p>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"content\">\n<pre>$ time docker run singlefile --help\nsingle-file [url] [output]\n\nSave a page into a single HTML file.\n\nPositionals:\n  url     URL or path on the filesystem of the page to save  [string]\n  output  Output filename  [string]\n\nOptions:\n  --help                                  Show help  [boolean]\n  --version                               Show version number  [boolean]\n  --back-end                              Back-end to use  [choices: \"jsdom\", \"puppeteer\", \"webdriver-chromium\", \"webdriver-gecko\", \"puppeteer-firefox\", \"playwright-firefox\", \"playwright-chromium\"] [default: \"puppeteer\"]\n  --browser-server                        Server to connect to (puppeteer only for now)  [string] [default: \"\"]\n  --browser-headless                      Run the browser in headless mode (puppeteer, webdriver-gecko, webdriver-chromium)  [boolean] [default: true]\n  --browser-executable-path               Path to chrome\/chromium executable (puppeteer, webdriver-gecko, webdriver-chromium)  [string] [default: \"\"]\n  --browser-width                         Width of the browser viewport in pixels  [number] [default: 1280]\n  --browser-height                        Height of the browser viewport in pixels  [number] [default: 720]\n  --browser-load-max-time                 Maximum delay of time to wait for page loading in ms (puppeteer, 
webdriver-gecko, webdriver-chromium)  [number] [default: 60000]\n  --browser-wait-delay                    Time to wait before capturing the page in ms  [number] [default: 0]\n  --browser-wait-until                    When to consider the page is loaded (puppeteer, webdriver-gecko, webdriver-chromium)  [choices: \"networkidle0\", \"networkidle2\", \"load\", \"domcontentloaded\"] [default: \"networkidle0\"]\n  --browser-wait-until-fallback           Retry with the next value of --browser-wait-until when a timeout error is thrown  [boolean] [default: true]\n  --browser-debug                         Enable debug mode (puppeteer, webdriver-gecko, webdriver-chromium)  [boolean] [default: false]\n  --browser-script                        Path of a script executed in the page (and all the frames) before it is loaded  [array] [default: []]\n  --browser-stylesheet                    Path of a stylesheet file inserted into the page (and all the frames) after it is loaded  [array] [default: []]\n  --browser-args                          Arguments provided as a JSON array and passed to the browser (puppeteer, webdriver-gecko, webdriver-chromium)  [string] [default: \"\"]\n  --browser-start-minimized               Minimize the browser (puppeteer)  [boolean] [default: false]\n  --browser-cookie                        Ordered list of cookie parameters separated by a comma: name,value,domain,path,expires,httpOnly,secure,sameSite,url (puppeteer, webdriver-gecko, webdriver-chromium, jsdom)  [array] [default: []]\n  --browser-cookies-file                  Path of the cookies file formatted as a JSON file or a Netscape text file (puppeteer, webdriver-gecko, webdriver-chromium, jsdom)  [string] [default: \"\"]\n  --compress-CSS                          Compress CSS stylesheets  [boolean] [default: false]\n  --compress-HTML                         Compress HTML content  [boolean] [default: true]\n  --crawl-links                           Crawl and save pages found via inner links  
[boolean] [default: false]\n  --crawl-inner-links-only                Crawl pages found via inner links only if they are hosted on the same domain  [boolean] [default: true]\n  --crawl-no-parent                       Crawl pages found via inner links only if their URLs are not parent of the URL to crawl  [boolean]\n  --crawl-load-session                    Name of the file of the session to load (previously saved with --crawl-save-session or --crawl-sync-session)  [string]\n  --crawl-remove-url-fragment             Remove URL fragments found in links  [boolean] [default: true]\n  --crawl-save-session                    Name of the file where to save the state of the session  [string]\n  --crawl-sync-session                    Name of the file where to load and save the state of the session  [string]\n  --crawl-max-depth                       Max depth when crawling pages found in internal and external links (0: infinite)  [number] [default: 1]\n  --crawl-external-links-max-depth        Max depth when crawling pages found in external links (0: infinite)  [number] [default: 1]\n  --crawl-replace-urls                    Replace URLs of saved pages with relative paths of saved pages on the filesystem  [boolean] [default: false]\n  --crawl-rewrite-rule                    Rewrite rule used to rewrite URLs of crawled pages  [array] [default: []]\n  --dump-content                          Dump the content of the processed page in the console ('true' when running in Docker)  [boolean] [default: false]\n  --emulate-media-feature                 Emulate a media feature. The syntax is &lt;name&gt;:&lt;value&gt;, e.g. 
\"prefers-color-scheme:dark\" (puppeteer)  [array]\n  --error-file  [string]\n  --filename-template                     Template used to generate the output filename (see help page of the extension for more info)  [string] [default: \"{page-title} ({date-iso} {time-locale}).html\"]\n  --filename-conflict-action              Action when the filename is conflicting with existing one on the filesystem. The possible values are \"uniquify\" (default), \"overwrite\" and \"skip\"  [string] [default: \"uniquify\"]\n  --filename-replacement-character        The character used for replacing invalid characters in filenames  [string] [default: \"_\"]\n  --group-duplicate-images                Group duplicate images into CSS custom properties  [boolean] [default: true]\n  --http-header                           Extra HTTP header (puppeteer, jsdom)  [array] [default: []]\n  --include-BOM                           Include the UTF-8 BOM into the HTML page  [boolean] [default: false]\n  --include-infobar                       Include the infobar  [boolean] [default: false]\n  --load-deferred-images                  Load deferred (a.k.a. lazy-loaded) images (puppeteer, webdriver-gecko, webdriver-chromium)  [boolean] [default: true]\n  --load-deferred-images-max-idle-time    Maximum delay of time to wait for deferred images in ms (puppeteer, webdriver-gecko, webdriver-chromium)  [number] [default: 1500]\n  --load-deferred-images-keep-zoom-level  Load defrrred images by keeping zoomed out the page  [boolean] [default: false]\n  --max-parallel-workers                  Maximum number of browsers launched in parallel when processing a list of URLs (cf --urls-file)  [number]\n  --max-resource-size-enabled             Enable removal of embedded resources exceeding a given size  [boolean] [default: false]\n  --max-resource-size                     Maximum size of embedded resources in MB (i.e. 
images, stylesheets, scripts and iframes)  [number] [default: 10]\n  --remove-frames                         Remove frames (puppeteer, webdriver-gecko, webdriver-chromium)  [boolean] [default: false]\n  --remove-hidden-elements                Remove HTML elements which are not displayed  [boolean] [default: true]\n  --remove-unused-styles                  Remove unused CSS rules and unneeded declarations  [boolean] [default: true]\n  --remove-unused-fonts                   Remove unused CSS font rules  [boolean] [default: true]\n  --remove-imports                        Remove HTML imports  [boolean] [default: true]\n  --remove-scripts                        Remove JavaScript scripts  [boolean] [default: true]\n  --remove-audio-src                      Remove source of audio elements  [boolean] [default: true]\n  --remove-video-src                      Remove source of video elements  [boolean] [default: true]\n  --remove-alternative-fonts              Remove alternative fonts to the ones displayed  [boolean] [default: true]\n  --remove-alternative-medias             Remove alternative CSS stylesheets  [boolean] [default: true]\n  --remove-alternative-images             Remove images for alternative sizes of screen  [boolean] [default: true]\n  --save-raw-page                         Save the original page without interpreting it into the browser (puppeteer, webdriver-gecko, webdriver-chromium)  [boolean] [default: false]\n  --urls-file                             Path to a text file containing a list of URLs (separated by a newline) to save  [string]\n  --user-agent                            User-agent of the browser (puppeteer, webdriver-gecko, webdriver-chromium)  [string]\n  --user-script-enabled                   Enable the event API allowing to execute scripts before the page is saved  [boolean] [default: true]\n  --web-driver-executable-path            Path to Selenium WebDriver executable (webdriver-gecko, webdriver-chromium)  [string] [default: \"\"]\n  
--output-directory                      Path to where to save files, this path must exist.  [string] [default: \"\"]\n\nreal    0m7.511s\nuser    0m0.036s\nsys     0m0.036s<\/pre>\n<\/div>\n<\/div>\n<div class=\"paragraph\">\n<p>To start with, a simple run.<\/p>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"content\">\n<pre>$ time docker run singlefile https:\/\/github.com\/gildas-lormeau\/SingleFile\/ &gt; \/tmp\/singlefile-test.html\n\nreal    0m58.279s\nuser    0m0.018s\nsys     0m0.049s\n$ w3m -dump_source https:\/\/github.com\/gildas-lormeau\/SingleFile\/ | zcat | grep \\&lt;img | dd bs=120 count=1 status=none;echo\n    &lt;img class=\"avatar mr-2 flex-shrink-0 js-jump-to-suggestion-avatar d-none\" alt=\"\" aria-label=\"Team\" src=\"\" width=\"28\n$ grep \\&lt;img \/tmp\/singlefile-test.html | dd bs=120 count=1 status=none;echo\n        &lt;img data-test-selector=\"commits-avatar-stack-avatar-image\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAA<\/pre>\n<\/div>\n<\/div>\n<div class=\"paragraph\">\n<p>It takes quite a while. Maybe my slow connection (ADSL) is to blame?<\/p>\n<\/div>\n<div class=\"paragraph\">\n<p>I saved the same page locally first and tried again.<\/p>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"title\">Start an HTTP server with the save directory as its root<\/div>\n<div class=\"content\">\n<pre>$ python3 -m http.server --bind 172.17.0.1 --directory ~\/Downloads\/<\/pre>\n<\/div>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"title\">Save with SingleFile<\/div>\n<div class=\"content\">\n<pre>$ time docker run singlefile 
http:\/\/172.17.0.1:8000\/gildas-lormeau_SingleFile_%20Web%20Extension%20for%20Firefox_Chrome_MS%20Edge%20and%20CLI%20tool%20to%20save%20a%20faithful%20copy%20of%20an%20entire%20web%20page%20in%20a%20single%20HTML%20file.html &gt; \/tmp\/sample.html\n\nreal    0m47.339s\nuser    0m0.059s\nsys     0m0.110s<\/pre>\n<\/div>\n<\/div>\n<div class=\"paragraph\">\n<p>That made less of a difference than I expected.<\/p>\n<\/div>\n<div class=\"listingblock\">\n<div class=\"title\">Environment<\/div>\n<div class=\"content\">\n<pre>$ docker image ls singlefile\nREPOSITORY   TAG       IMAGE ID       CREATED        SIZE\nsinglefile   latest    36fda8dcb810   4 months ago   755MB\n$ dpkg-query -W docker.io python3\ndocker.io       20.10.11+dfsg1-2+b1\npython3 3.9.8-1\n$ lsb_release -dr\nDescription:    Debian GNU\/Linux bookworm\/sid\nRelease:        unstable\n$ arch\nx86_64\n$ grep ^model\\ name\\.*: -m1 \/proc\/cpuinfo\nmodel name      : Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz<\/pre>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I tried the browser extension version of SingleFile. Saving a website as a single file with SingleFile \u2013 matoken\u2019s meme gildas-lormeau\/SingleFile: Web Ext 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"webmentions_disabled_pings":false,"webmentions_disabled":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":""},"categories":[7,6,199],"tags":[702,701,478],"class_list":["post-3516","post","type-post","status-publish","format-standard","hentry","category-debian-linux","category-linux","category-sid","tag-docker","tag-finglefile","tag-html"],"_links":{"self":[{"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/posts\/3516","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/comments?post=3516"}],"version-history":[{"count":0,"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/posts\/3516\/revisions"}],"wp:attachment":[{"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/media?parent=3516"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/categories?post=3516"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/matoken.org\/blog\/wp-json\/wp\/v2\/tags?post=3516"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}