languages:japanese:linux:0verview

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
languages:japanese:linux:0verview [2022/01/10 09:50] – [fonts] chrislanguages:japanese:linux:0verview [2024/03/04 23:42] (current) chris
 +===== What? =====
 +Notes on getting Japanese input and fonts working on Linux, and on learning Japanese. My presentation on the topic is here: https://fluxcoil.net/files/speeches/latex_japlinux/japlinux.pdf .
 +===== input hiragana/katakana/kanji =====
 +  * https://iamacat.wordpress.com/2008/07/07/more-japanese-on-linux-anthy-uim-and-ubuntu-among-others/
 +===== tex =====
 +  * cjk-latex, ptex (non-utf8), XeTeX (japanese typesetting, also left to right writing)
 +  * xelatex, xecjk package
 +  * [[https://aminophen.github.io/slide/hytexconf18.pdf|日本語のLATEXで幸せになる...かもしれない方法]]
 +  * options:
 +    * pdflatex + CJK package
 +    * xelatex + xeCJK package
 +    * lualatex + luatex-ja package
 +    * uplatex or platex: TeX implementations designed specifically for Japanese typography; they work differently in some aspects compared to the more general-purpose implementations above
 +    * https://tex.stackexchange.com/questions/15516/how-to-write-japanese-with-latex
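As a minimal sketch of the lualatex + luatex-ja option from the list above: the following writes a small UTF-8 document using the `ltjsarticle` class that ships with luatex-ja, and compiles it if `lualatex` is installed. The file name `luatexja_example.tex` is made up for illustration, and a TeX installation with luatex-ja and its default Japanese fonts is assumed.

```shell
# write a minimal luatex-ja document (the file name is only an example)
cat > luatexja_example.tex <<'EOF'
\documentclass{ltjsarticle}
\begin{document}
日本語の文章です。
\end{document}
EOF

# compile only if lualatex is available on this system
if command -v lualatex >/dev/null 2>&1; then
    lualatex -interaction=nonstopmode luatexja_example.tex \
        || echo "compilation failed; is luatex-ja installed?"
else
    echo "lualatex not found, skipping compilation"
fi
```

Unlike the pdflatex + CJK route, no wrapper environment around the Japanese text should be needed here, since the document class handles it.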
 +===== tex cjk installation on Fedora 19 =====
 +  * with this, kanji/hiragana/katakana in UTF-8 tex files are typeset properly
 +<code>
 +# cjk
 +$ yum -y install texlive-cjk.noarch texlive-collection-langcjk.noarch
  
 +$ cat minimal_japanese_example.tex
 +\documentclass{beamer}
 +\usepackage[encapsulated]{CJK}
 +\usepackage{ucs}
 +\usepackage[utf8x]{inputenc}
 +\newcommand{\jptext}[1]{\begin{CJK}{UTF8}{min}#1\end{CJK}}
 +\begin{document}
 +\jptext{日本語}
 +\end{document}
 +
 +$ pdflatex minimal_japanese_example.tex && evince minimal_japanese_example.pdf
 +</code>
 +
 +
 +===== terminals =====
 +I use this on Fedora:
 +<code>
 +xterm -en UTF-8 -fg white -bg black \
 +  -fn -Misc-Fixed-Medium-R-Normal--18-120-100-100-C-90-ISO10646-1 -e bash
 +</code>
 +I also tried different fonts with xterm. Running "xlsfonts" lists the available fonts. To pick out the ones containing 'ja' and try each of them:
 +<code>
 +for i in $(xlsfonts | grep ja); do
 +    # open an xterm with each font for 10 seconds, showing sample hiragana
 +    echo "current font: $i"; xterm -fn "$i" -e bash -c 'echo ちち; sleep 10'
 +done
 +</code>
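Before comparing fonts, it can help to confirm that the locale and the terminal handle UTF-8 at all. A minimal sketch, assuming a system that provides the `locale` command (glibc-based distributions such as Fedora do):

```shell
# character map of the current locale; should print UTF-8
locale charmap

# print ち (U+3061) from its raw UTF-8 bytes E3 81 A1, given in octal;
# a terminal that handles UTF-8 shows a single hiragana character
printf '\343\201\241\n'
```

If the second command shows several garbage characters instead of one hiragana character, the terminal is not decoding UTF-8, and no font choice will fix that.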
 +
 +===== japanese input on emacs =====
 +  * make sure your terminal supports UTF-8, i.e. it can properly display UTF-8 files (xterm is used here)
 +<code>
 +emacs ~/.emacs # and add this:
 +;;;;;;;;;;;;;;;;;;;;                                                                                                                        
 +;; unicode-setup                                                                                                                            
 +(prefer-coding-system       'utf-8)
 +(set-default-coding-systems 'utf-8)
 +(set-terminal-coding-system 'utf-8)
 +(set-keyboard-coding-system 'utf-8)
 +(setq default-buffer-file-coding-system 'utf-8)
 +(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT TEXT STRING))
 +;; make japanese default choice for input system
 +(set-input-method 'japanese)
 +
 +emacs -nw test.tex
 +# now you can use C-x C-m C-\ and are asked to enter an input method.
 +# jap<tab><tab> shows a selection, here 
 +#   'japanese' works for hiragana directly/katakana with <blank> after hiragana-inputs
 +#   'japanese-katakana' works for katakana-input
 +# then you can input japanese as with uim.
 +# use C-\ to switch between english<->japanese input
 +</code>
 +
 +===== converting Kanji to Hiragana/Furigana =====
 +  * tlug++ for so many hints on that
 +  * https://www.edrdg.org/~jwb/mecabdemo.html has an online demo of conversion with MeCab/Unidic
 +  * **kakasi:** https://kakasi.namazu.org/ . Easy to install and use, but has an old dictionary and is bad for complex sentences. Simple example:
 +<code>
 +$ echo '私は馬鹿です'| kakasi -JK -i utf8 -o utf8
 +ワタシはバカです
 +</code>
 +  * **MeCab**, e.g. with the mecab-ipadic-neologd dictionary ( https://github.com/neologd/mecab-ipadic-neologd ), is more modern.
 +<code>
 +$ echo 例文文章です。|mecab --node-format='%pS%m[%f[7]]' --eos-format='\n'
 +例文[レイブン]文章[ブンショウ]です[デス]。[。]
 +</code>
 +  * Fedora: dnf -y install mecab mecab-ipadic
 +
 +
 +===== irc codepage =====
 +  * iso-2022-jp
 +
 +===== fonts =====
 +  * Electroharmonix: a font for Romaji characters that makes them look like kanji/katakana: [[https://www.dafont.com/electroharmonix.font|dafont.com]]
 +
 +===== LibreOffice =====
 +  * By default, LibreOffice started to use Chinese variants of some kanji for me. To fix this, use tools -> language -> for all text -> more, then under "Default languages for documents" set Asian to Japanese. For example, the second character (字) of 石炭 has different Chinese and Japanese variants. The same goes for 捨てる.
 +
 +===== Translation =====
 +  * cut'n'paste to Google Translate
 +  * EBView can read EB dictionaries, dictionaries: 広辞苑, 大辞林, Readers+
 +  * Kenkyusha "KOD" online dictionary (not for free)
 +  * Eijiro dictionary
 +  * translating single Japanese words on the commandline, offline:
 +    * install jmdict https://jmdict.sourceforge.net/
 +    * and the JMdict dictionary file: https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project
 +
 +===== Links =====
 +  * [[https://github.com/PaddlePaddle/PaddleOCR#PP-OCRv2|PaddleOCR]] Japanese OCR, [[https://gigazine.net/news/20210919-paddleocr/|Gigazine article]]
languages/japanese/linux/0verview.txt · Last modified: 2024/03/04 23:42 by chris