===== What? =====
Notes on getting Japanese input and fonts working on Linux, and on learning Japanese. My presentation on the topic is here: https://fluxcoil.net/files/speeches/latex_japlinux/japlinux.pdf .
===== input hiragana/katakana/kanji =====
* https://iamacat.wordpress.com/2008/07/07/more-japanese-on-linux-anthy-uim-and-ubuntu-among-others/
===== tex =====
* cjk-latex, ptex (non-UTF-8), XeTeX (Japanese typesetting, also left-to-right writing)
* xelatex, xecjk package
* [[https://aminophen.github.io/slide/hytexconf18.pdf|日本語のLATEXで幸せになる...かもしれない方法]]
* options:
* pdflatex + CJK package
* xelatex + xeCJK package
* lualatex + luatex-ja package
* uplatex or platex: TeX implementations designed specifically for Japanese typography; they work differently in some aspects compared to the more general-purpose implementations above
* https://tex.stackexchange.com/questions/15516/how-to-write-japanese-with-latex
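As a rough sketch of the "lualatex + luatex-ja" option from the list above: the package is loaded as ''luatexja''. The file name ''luatexja_example.tex'' is just a placeholder, and the compile step assumes a TeX Live installation that includes LuaTeX-ja and its default Japanese fonts.

```shell
# Write a minimal LuaLaTeX + LuaTeX-ja document (file name is arbitrary).
cat > luatexja_example.tex <<'EOF'
\documentclass{article}
\usepackage{luatexja}  % Japanese line breaking and default Japanese fonts
\begin{document}
日本語の文章です。
\end{document}
EOF

# Compile only if lualatex is installed; the result would be luatexja_example.pdf.
if command -v lualatex >/dev/null 2>&1; then
    lualatex -interaction=nonstopmode luatexja_example.tex \
        || echo "compilation failed, is LuaTeX-ja installed?"
fi
```

Unlike the CJK package, this needs no wrapper command around the Japanese text; UTF-8 input is used directly.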
===== tex cjk installation on Fedora19 =====
* with this, kanji/hiragana/katakana from UTF-8 encoded tex-files are typeset properly
# cjk
$ yum -y install texlive-cjk.noarch texlive-collection-langcjk.noarch
$ cat minimal_japanese_example.tex
\documentclass{beamer}
\usepackage[encapsulated]{CJK}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\newcommand{\jptext}[1]{\begin{CJK}{UTF8}{min}#1\end{CJK}}
\begin{document}
\jptext{日本語}
\end{document}
$ pdflatex minimal_japanese_example.tex && evince minimal_japanese_example.pdf
===== terminals =====
I use this on Fedora:
xterm -en UTF-8 -fg white -bg black \
-fn -Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1 -e bash
I also tried different fonts with xterm; executing "xlsfonts" lists the available fonts. To pick out the ones containing 'ja' and try them one after another:
for i in $(xlsfonts|grep ja); do
echo "current font: $i"; xterm -fn $i -e 'echo ちち; sleep 10';
done
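Before blaming a font, it can help to check that the shell's locale is actually UTF-8. A small sketch using the standard ''locale charmap'' command; the function name ''is_utf8_locale'' is made up for this example.

```shell
# Succeeds if the current locale's character map is UTF-8.
is_utf8_locale() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

if is_utf8_locale; then
    # If the font has the glyphs, the two hiragana should be readable here.
    printf 'UTF-8 locale, test string: ちち\n'
else
    printf 'locale charmap is "%s", not UTF-8\n' "$(locale charmap 2>/dev/null)"
fi
```

If the second branch is taken, setting LANG to a UTF-8 locale (for example ja_JP.UTF-8 or en_US.UTF-8, if installed) before starting xterm is probably the first thing to try.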
===== japanese input on emacs =====
* make sure your terminal supports UTF-8, i.e. it can properly display UTF-8 files; xterm is used here
emacs ~/.emacs # and add this:
;;;;;;;;;;;;;;;;;;;;
;; unicode-setup
(prefer-coding-system 'utf-8)
(set-default-coding-systems 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(setq-default buffer-file-coding-system 'utf-8)
(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))
;; make japanese default choice for input system
(set-input-method 'japanese)
emacs -nw test.tex
# now you can use C-x C-m C-\ and are asked to enter an input method.
# completing on 'jap' shows a selection; here
# 'japanese' gives hiragana directly, katakana is available after the hiragana input
# 'japanese-katakana' gives katakana input
# then you can input Japanese as with uim.
# use C-\ to switch between English<->Japanese input
===== converting Kanji to Hiragana/Furigana =====
* tlug++ for so many hints on that
* https://www.edrdg.org/~jwb/mecabdemo.html has an online demo of conversion with MeCab/Unidic
* **kakasi:** https://kakasi.namazu.org/ . Easy to install and use, but has an old dictionary and handles complex sentences poorly. Simple example:
$ echo '私は馬鹿です'| kakasi -JK -i utf8 -o utf8
ワタシはバカです
* **MeCab**, e.g. with the mecab-ipadic-neologd dictionary ( https://github.com/neologd/mecab-ipadic-neologd ), is more modern.
$ echo 例文文章です。|mecab --node-format='%pS%m[%f[7]]' --eos-format='\n'
例文[レイブン]文章[ブンショウ]です[デス]。[。]
* Fedora: dnf -y install mecab mecab-ipadic
===== irc codepage =====
* iso-2022-jp
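If the rest of the system runs on UTF-8, text for an ISO-2022-JP channel has to be converted somewhere, usually by the IRC client. To see what the encoding looks like, here is a small sketch with iconv; the two function names are made up for this example.

```shell
# Encode UTF-8 text from stdin as ISO-2022-JP, and decode it again.
to_iso2022jp()   { iconv -f UTF-8 -t ISO-2022-JP; }
from_iso2022jp() { iconv -f ISO-2022-JP -t UTF-8; }

# Show the raw bytes: ISO-2022-JP is a 7-bit encoding that switches
# character sets with escape sequences (bytes starting with 1b).
printf '日本語\n' | to_iso2022jp | od -An -tx1

# Round trip back to UTF-8.
printf '日本語\n' | to_iso2022jp | from_iso2022jp
```

Text that shows up as ASCII garbage surrounded by stray "$B" and "(B" is typically ISO-2022-JP being displayed by a client set to another encoding.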
===== fonts =====
* Electroharmonix: a font for Latin (romaji) characters that makes them look like kanji/katakana: [[https://www.dafont.com/electroharmonix.font|dafont.com]]
===== Libreoffice =====
* By default, LibreOffice started to use Chinese variants of some kanji for me. Use Tools -> Language -> For All Text -> More, then "Default languages for documents" -> Asian: Japanese. For example, the second character (字) of 石炭 has differing Chinese and Japanese variants, as does 捨てる.
===== Translation =====
* cut'n'paste to Google Translate
* EBView can read EB dictionaries, dictionaries: 広辞苑, 大辞林, Readers+
* Kenkyusha "KOD" online dictionary (not for free)
* Eijiro dictionary
* translating single Japanese words on the commandline, offline:
* install jmdict https://jmdict.sourceforge.net/
* and the JMdict dictionary file: https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project
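This is not how the jmdict tool itself works. As a simpler stand-in for offline lookups, the sketch below greps a UTF-8 file in the line-based EDICT format, where each line looks like ''KANJI [KANA] /gloss/gloss/''. The file ''sample_edict.txt'' and its two entries are a hand-made sample for illustration, and ''edict_lookup'' is a made-up helper name; a real EDICT file from the project page above would take the sample's place.

```shell
# Hand-made two-line sample in EDICT format: KANJI [KANA] /gloss/.../
cat > sample_edict.txt <<'EOF'
馬鹿 [ばか] /fool/idiot/
石炭 [せきたん] /coal/
EOF

# Print all entries whose headword (first field) equals the given word.
# $1: word to look up, $2: dictionary file
edict_lookup() {
    awk -v w="$1" '$1 == w' "$2"
}

edict_lookup 石炭 sample_edict.txt
```

Matching on the first field only keeps the lookup exact; a plain ''grep 石炭'' would also return compounds and entries that merely mention the word in another field.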
===== Links =====
* [[https://github.com/PaddlePaddle/PaddleOCR#PP-OCRv2|PaddleOCR]] Japanese OCR, [[https://gigazine.net/news/20210919-paddleocr/|Gigazine article]]