Trying out pandoc-server in Pandoc 3.0

Pandoc, the tool that converts between all sorts of file formats, has had its first major version bump in a long while and is now at 3.0.

I installed it on my local machine and played with it a little.

Uninstall the distribution's Pandoc packages
$ sudo apt remove pandoc pandoc-data
Download the amd64 .deb for Pandoc 3.0 from the GitHub Releases page and install it
$ wget https://github.com/jgm/pandoc/releases/download/3.0/pandoc-3.0-1-amd64.deb
$ sudo apt install ./pandoc-3.0-1-amd64.deb
$ apt show pandoc -a
Package: pandoc
Version: 3.0-1
Status: install ok installed
Priority: optional
Section: text
Maintainer: John MacFarlane <jgm@berkeley.edu>
Installed-Size: 129 MB
Depends: libc6 (>= 2.13), libgmp10, zlib1g (>= 1:1.1.4)
Suggests: texlive-latex-recommended, texlive-xetex, texlive-fonts-recommended
Replaces: pandoc-data
Download-Size: unknown
APT-Manual-Installed: yes
APT-Sources: /var/lib/dpkg/status
Description: general markup converter
  Pandoc is a Haskell library for converting from one markup format
  to another, and a command-line tool that uses this library. The
  formats it can handle include light markup formats (many variants
  of Markdown, reStructuredText, AsciiDoc, Org-mode, Muse, Textile,
  txt2tags), HTML formats (HTML 4 and 5), ebook formats (EPUB v2
  and v3, FB2), Documentation formats (GNU TexInfo, Haddock),
  Roff formats (man, ms), TeX formats (LaTeX, ConTeXt), XML
  formats (DocBook 4 and 5, JATS, TEI Simple, OpenDocument),
  outline formats (OPML), bibliography formats (BibTeX, BibLaTeX,
  CSL JSON, CSL YAML, RIS), word processor formats (Docx, RTF,
  ODT), interactive notebook formats (Jupyter notebook
  ipynb), page layout formats (InDesign ICML), wiki markup
  formats (MediaWiki, DokuWiki, TikiWiki, TWiki, Vimwiki,
  XWiki, ZimWiki, Jira wiki, Creole), slide show formats
  (LaTeX Beamer, PowerPoint, Slidy, reveal.js, Slideous, S5,
  DZSlides), data formats (CSV and TSV tables), and PDF (via
  external programs such as pdflatex or wkhtmltopdf).

Package: pandoc
Version: 2.17.1.1-1.1
Priority: optional
Section: text
Maintainer: Debian Haskell Group <debian-haskell@lists.debian.org>
Installed-Size: 172 MB
Provides: pandoc-abi (= 1.22.2.1-1)
Depends: libc6 (>= 2.34), libffi8 (>= 3.4), libgmp10 (>= 2:6.2.1+dfsg1), liblua5.3-0, libyaml-0-2, zlib1g (>= 1:1.1.4), pandoc-data (>= 2.17.1.1-1.1), pandoc-data (<< 2.17.1.1-1.1.~)
Suggests: texlive-latex-recommended, texlive-xetex, texlive-luatex, pandoc-citeproc, texlive-latex-extra, context, wkhtmltopdf, librsvg2-bin, groff, ghc, nodejs, php, perl, python, ruby, r-base-core, libjs-mathjax, libjs-katex, citation-style-language-styles
Homepage: https://pandoc.org/
Tag: devel::doc, implemented-in::haskell, interface::commandline,
 role::documentation, role::program, use::converting,
 works-with-format::bib, works-with-format::docbook,
 works-with-format::html, works-with-format::json,
 works-with-format::man, works-with-format::odf,
 works-with-format::plaintext, works-with-format::tex,
 works-with-format::xml, works-with::text
Download-Size: 21.3 MB
APT-Sources: http://deb.debian.org/debian bookworm/main amd64 Packages
Description: general markup converter
 Pandoc is a Haskell library for converting
 from one markup format to another,
 and a command-line tool that uses this library.
 The formats it can handle include
  * light markup formats
    (many variants of Markdown, reStructuredText, AsciiDoc,
     Org-mode, Muse, Textile, txt2tags)
  * HTML formats (HTML 4 and 5)
  * Ebook formats (EPUB v2 and v3, FB2)
  * Documentation formats (GNU TexInfo, Haddock)
  * Roff formats (man, ms)
  * TeX formats (LaTeX, ConTeXt)
  * XML formats
    (DocBook 4 and 5, JATS, TEI Simple, OpenDocument)
  * Outline formats (OPML)
  * Bibliography formats (BibTeX, BibLaTeX, CSL JSON, CSL YAML)
  * Word processor formats (Docx, RTF, ODT)
  * Interactive notebook formats (Jupyter notebook ipynb)
  * Page layout formats (InDesign ICML)
  * Wiki markup formats
    (MediaWiki, DokuWiki, TikiWiki, TWiki,
     Vimwiki, XWiki, ZimWiki, Jira wiki, Creole)
  * Slide show formats
    (LaTeX Beamer, PowerPoint, Slidy,
     reveal.js, Slideous, S5, DZSlides)
  * Data formats (CSV tables)
  * PDF (via external programs such as pdflatex or wkhtmltopdf)
 .
 Pandoc can convert mathematical content in documents
 between TeX, MathML, Word equations, roff eqn, and plain text.
 It includes a powerful system
 for automatic citations and bibliographies,
 and it can be customized extensively using templates, filters,
 and custom readers and writers written in Lua.
 .
 This package contains the pandoc tool.
 .
 Some uses of Pandoc require additional packages:
  * SVG content in PDF output requires librsvg2-bin.
  * YAML metadata in TeX-related output requires texlive-latex-extra.
  * *.hs filters not set executable requires ghc.
  * *.js filters not set executable requires nodejs.
  * *.php filters not set executable requires php.
  * *.pl filters not set executable requires perl.
  * *.py filters not set executable requires python.
  * *.rb filters not set executable requires ruby.
  * *.r filters not set executable requires r-base-core.
  * LaTeX output, and PDF output via PDFLaTeX,
    require texlive-latex-recommended.
  * XeLaTeX output, and PDF output via XeLaTeX, require texlive-xetex.
  * LuaTeX output, and PDF output via LuaTeX, require texlive-luatex.
  * ConTeXt output, and PDF output via ConTeXt, require context.
  * PDF output via wkhtmltopdf requires wkhtmltopdf.
  * Roff man and roff ms output, and PDF output via roff ms,
    require groff.
  * MathJax-rendered equations require libjs-mathjax.
  * KaTeX-rendered equations require node-katex.
  * option --csl may use styles in citation-style-language-styles.
$ dpkg -L pandoc
/.
/usr
/usr/bin
/usr/bin/pandoc
/usr/share
/usr/share/doc
/usr/share/doc/pandoc
/usr/share/doc/pandoc/copyright
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/pandoc-lua.1.gz
/usr/share/man/man1/pandoc-server.1.gz
/usr/share/man/man1/pandoc.1.gz
/usr/bin/pandoc-lua
/usr/bin/pandoc-server
$ pandoc-server -v
pandoc-server 3.0
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /home/matoken/.local/share/pandoc
Copyright (C) 2006-2023 John MacFarlane. Web:  https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
$ ls -l /bin/pandoc*
-rwxr-xr-x 1 root root 128885952 Jan 19 06:00 /bin/pandoc
lrwxrwxrwx 1 root root         6 Jan 19 06:00 /bin/pandoc-lua -> pandoc
lrwxrwxrwx 1 root root         6 Jan 19 06:00 /bin/pandoc-server -> pandoc
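The listing shows that pandoc-lua and pandoc-server are just symlinks to the single pandoc binary, which suggests the program picks its behavior from the name it was invoked under. The sketch below illustrates that general "multi-call" pattern in shell. The function name `mode_for` and the mode labels are hypothetical, and this is not Pandoc's actual implementation.

```sh
#!/bin/sh
# Hypothetical illustration of a multi-call program: choose a mode from the
# name the program was invoked under. This is not Pandoc's real code.
mode_for() {
  case "$(basename "$1")" in
    pandoc-server) echo "server" ;;
    pandoc-lua)    echo "lua" ;;
    *)             echo "convert" ;;
  esac
}

# Show which mode each of the three installed names would select.
for name in /bin/pandoc /bin/pandoc-lua /bin/pandoc-server; do
  printf '%s -> %s\n' "$name" "$(mode_for "$name")"
done
```

A real program would inspect its own invocation name (`$0` in a shell script) rather than a path passed as an argument.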
Number of supported formats
$ pandoc --list-input-formats | wc -l
42
$ pandoc --list-output-formats | wc -l
63
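Since the two counts differ, it can be handy to see which names appear in one list but not the other. A rough sketch: save each list to a file and compare the sorted lists with `comm`. The helper name `only_in_second` is made up for illustration.

```sh
# Print the lines that appear in the second file but not in the first.
# Both arguments are files with one format name per line.
only_in_second() {
  sorted_a=$(mktemp) && sorted_b=$(mktemp) || return 1
  sort "$1" > "$sorted_a"
  sort "$2" > "$sorted_b"
  # comm -13 hides lines unique to the first file and lines common to both.
  comm -13 "$sorted_a" "$sorted_b"
  rm -f "$sorted_a" "$sorted_b"
}
```

With Pandoc installed, it could be used by running `pandoc --list-input-formats > in.txt` and `pandoc --list-output-formats > out.txt`, then `only_in_second in.txt out.txt` (the file names are arbitrary).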

Trying out Pandoc Server

Start the Pandoc server
$ pandoc-server
Starting server on port 3030...
Check the Pandoc server version
$ http http://localhost:3030/version
HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Date: Fri, 20 Jan 2023 13:24:51 GMT
Server: Warp/3.3.23
Transfer-Encoding: chunked

3.0
Convert an emoji to HTML with the Pandoc server
$ http POST http://localhost:3030/ text=:+1: from=markdown+emoji to=html4 Accept:text/plain
HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Date: Fri, 20 Jan 2023 20:40:17 GMT
Server: Warp/3.3.23
Transfer-Encoding: chunked

<p><span class="emoji" data-emoji="+1">👍</span></p>
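HTTPie turns each `key=value` argument into a string field of a JSON request body, so the request above amounts to POSTing a small JSON object with `text`, `from`, and `to` fields. A minimal sketch of the same request with curl follows, assuming curl is installed and the server is listening on localhost:3030 as above. The function names `build_payload` and `pandoc_convert` and the `PANDOC_URL` variable are hypothetical, and the hand-built JSON only works for values without double quotes, backslashes, or newlines.

```sh
# Build the JSON body by hand. Assumption: the three values contain no
# double quotes, backslashes, or newlines, so no JSON escaping is needed.
build_payload() {
  printf '{"text":"%s","from":"%s","to":"%s"}' "$1" "$2" "$3"
}

# POST the body to the server and print the converted text.
# PANDOC_URL is a made-up override; the default matches the server above.
pandoc_convert() {
  curl -s -X POST \
    -H 'Content-Type: application/json' \
    -H 'Accept: text/plain' \
    --data "$(build_payload "$1" "$2" "$3")" \
    "${PANDOC_URL:-http://localhost:3030/}"
}

# Print the body that would be sent for the emoji example.
build_payload ':+1:' 'markdown+emoji' 'html4'
echo
```

With the server running, `pandoc_convert ':+1:' 'markdown+emoji' 'html4'` should send the same request as the HTTPie command. For whole files, a proper JSON encoder is the safer choice, because file contents usually include newlines and quotes.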
Convert a Markdown file to HTML with the Pandoc server
$ head ~/src/tilck/README.md (1)
<p align="center">
   <img src="http://vvaltchev.github.io/tilck_imgs/v2/tilck-logo-v5.png" alt="Tilck - A Tiny Linux-Compatible Kernel">
</p>

[![Build Status](https://vkvaltchev.visualstudio.com/Tilck/_apis/build/status/Tilck?branchName=master)](https://vkvaltchev.visualstudio.com/Tilck/_build/latest?definitionId=1&branchName=master)
[![codecov](https://codecov.io/gh/vvaltchev/tilck/branch/master/graph/badge.svg)](https://codecov.io/gh/vvaltchev/tilck)
[![License](https://img.shields.io/badge/License-BSD%202--Clause-orange.svg)](https://opensource.org/licenses/BSD-2-Clause)

<a href="https://youtu.be/Ce1pMlZO_mI">
   <img
$ http POST http://localhost:3030/ text=@~/src/tilck/README.md from=markdown+emoji to=html4 Accept:text/plain | head (2)
<p align="center">
<img src="http://vvaltchev.github.io/tilck_imgs/v2/tilck-logo-v5.png" alt="Tilck - A Tiny Linux-Compatible Kernel">
</p>
<p><a
href="https://vkvaltchev.visualstudio.com/Tilck/_build/latest?definitionId=1&amp;branchName=master"><img
src="https://vkvaltchev.visualstudio.com/Tilck/_apis/build/status/Tilck?branchName=master"
alt="Build Status" /></a> <a
href="https://codecov.io/gh/vvaltchev/tilck"><img
src="https://codecov.io/gh/vvaltchev/tilck/branch/master/graph/badge.svg"
alt="codecov" /></a> <a
$ http POST http://localhost:3030/ text=@~/src/tilck/README.md from=markdown+emoji to=html4 Accept:text/plain | w3m -T text/html -dump | head (3)
                    Tilck - A Tiny Linux-Compatible Kernel

Build Status codecov License

Tilck

Contents

  • Overview
      □ What is Tilck?
  1. Check the Markdown file
  2. Convert it to HTML
  3. Convert the Markdown to HTML and render it with w3m
Environment
$ dpkg-query -W pandoc httpie w3m
httpie  3.2.1-1
pandoc  3.0-1
w3m     0.5.3+git20220429-1+b1
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux bookworm/sid
Release:        n/a
Codename:       bookworm
$ arch
x86_64
