
Ciarán's free software notes

Ciaran O'Riordan's irregularly kept software freedom journal


Using LaTeX to make PDF documents with Japanese characters

Even if you know nothing about LaTeX, you can make your first Japanese PDF document by taking a copy of this example file JIS.tex, going to a shell command line and typing "pdflatex JIS.tex". That should produce this output: JIS.pdf.

If that doesn't work for you, then you need to install some LaTeX software or Japanese fonts. On my Debian GNU/Linux system, I think I just installed texlive-latex-base and latex-cjk-japanese, and the package manager automatically installed the other packages needed by those two. I don't remember if I also had to install a fonts package.

Once you've got that working, you can start modifying and removing lines from that example file to see what you really need. I trimmed it down to eight lines:

\documentclass[12pt]{scrartcl}
\usepackage{CJK}

\begin{document}
\begin{CJK*}[dnp]{JIS}{min}

\section{What I learned today}
I can write this 私はキランです in Japanese.

\end{CJK*}
\end{document}

%%% Local Variables:
%%% coding: euc-japan

 

OK, OK, that's ten lines, since I included two comment lines at the end to tell Emacs which character encoding to use when saving the file. This seems to matter: when I saved the file as UTF-8, pdflatex failed, presumably because the JIS argument of the CJK* environment tells the CJK package to expect EUC-JP encoded text. Because these two lines start with percent signs, LaTeX processors such as pdflatex ignore them, so it's safe to leave them there even if you're not using Emacs.
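If you'd rather set the encoding by hand than rely on the comment lines, here is a minimal sketch of a helper for your .emacs. The function name is made up for this example; set-buffer-file-coding-system and save-buffer are standard Emacs functions, and euc-japan is the same coding system name used in the comment lines above.

```elisp
;; Hypothetical helper: the name is invented for this sketch.
(defun my-save-buffer-as-euc-japan ()
  "Save the current buffer using the euc-japan coding system."
  (interactive)
  ;; Choose the coding system Emacs will use when writing this buffer.
  (set-buffer-file-coding-system 'euc-japan)
  ;; Write the buffer to its file in that encoding.
  (save-buffer))
```

Interactively, the same thing should be possible with C-x RET f, typing euc-japan at the prompt, and then saving as usual.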

In the sixth line of my small example you should see seven mostly-simple Japanese characters. If that's not what you see, try setting your browser's character encoding to EUC-JP or maybe UTF-8. (This might be in [menu-bar]->View->Character Encoding->...)

Once you have this working, you should look at the other examples that came with the LaTeX CJK package. On my system, the examples are installed in the directory /usr/share/doc/latex-cjk-japanese/examples/ (thanks for the tip, LUK ShunTim). This is probably also the best way to get started with other complex scripts such as Chinese and Korean.

It took me four hours to figure out how to use LaTeX to make a PDF document with Japanese characters. At one point, I became so frustrated with the LaTeX documentation that I gave up and decided to use DocBook instead. Unfortunately, DocBook's documentation was just as bad.

I think I learned something from all this about what makes a good tutorial: get the user to a working example as quickly as possible. Once you have something working, then you can experiment and learning becomes fun.

For a start, I think I'll put the "ruby" commands from JIS.tex back in since they're a pretty useful reading aid for learners. "Ruby" here refers to the little superscript phonetic kana characters, usually called furigana. It has no relation to the Ruby programming language, which was developed by a Japanese guy.

To write Japanese hiragana, katakana, and kanji in Emacs, just run M-x set-input-method and type japanese at the prompt. The usual help command (C-h I) shows the documentation for how the input method works. While using the japanese input method, typing qq switches to the japanese-ascii input method, which you'll need for typing LaTeX commands and symbols such as "\{}". Typing qq again switches from japanese-ascii back to the normal japanese input method.
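If you switch in and out of Japanese often, a small toggle command saves some typing. This is only a sketch: the function name and the C-c j key binding are my own choices for illustration, not something Emacs provides.

```elisp
;; Hypothetical helper: the name and key binding are invented for this sketch.
(defun my-toggle-japanese-input-method ()
  "Toggle between the \"japanese\" input method and no input method."
  (interactive)
  ;; current-input-method is nil when no input method is active.
  (if (equal current-input-method "japanese")
      (set-input-method nil)            ; turn the input method off
    (set-input-method "japanese")))     ; turn Japanese input on

;; Bind the toggle to C-c j.
(global-set-key (kbd "C-c j") 'my-toggle-japanese-input-method)
```

Note that after typing qq the current input method is japanese-ascii, so pressing the key at that point switches back to japanese rather than turning input methods off.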

If you want to use other applications, then you'll need to install some separate input method software. I installed the packages "anthy", "scim", and "scim-canna" and was then able to write Japanese in GNOME applications by right-clicking in a text box and choosing "SCIM Input Method" from the "Input Methods" submenu. It's annoying that SCIM uses Ctrl+Space as its activation sequence. You can change this by going to "Show command menu->SCIM Setup->Global Setup". I wasn't able to get OpenOffice.org to work. From looking around, it seems OpenOffice only supports "IIIMP", but I can't see any package that provides IIIMP.

You might find useful info on these pages:

Hope that helps!

-- 
Ciarán O'Riordan,
Support free software: Join FSFE's Fellowship

Using and writing Emacs 22 input methods

Emacs 21 had a generic function called iso-accents-mode for writing âççéntèd çhàrâçtërs, but that was removed in Emacs 22. It took me a while, but I found that the replacement is set-input-method: run it, then select the input method for whichever language's accented characters you want to type.

The default keybinding for set-input-method is not very convenient (C-x RET C-\), and I almost always use the same input method, so I put this small helper function in my .emacs and bound it to an easy key sequence:

(defun ciaran-toggle-french-input-method ()
  "Toggle between the French input method and no input method."
  (interactive)
  ;; current-input-method is nil when no input method is active
  (if (equal current-input-method "french-alt-postfix")
      (set-input-method nil)
    (set-input-method "french-alt-postfix")))
;; Bind the toggle to C-c .
(global-set-key [?\C-c ?.] 'ciaran-toggle-french-input-method)

Sometimes I need Dutch characters, but the "dutch" input method contains some completely unnecessary conversion sequences which make it frustrating to use. And sometimes I want the "á" character so I can write my name properly. So what do I do if I want a personalised input method?

About modifying input methods, the Emacs Lisp Reference Manual just says "How to define input methods is not yet documented in this manual". So I went to the Emacs page on sv.gnu.org, checked out a CVS copy of the Emacs source, grepped around, and found that the Dutch input method is defined in the file /emacs/leim/quail/latin-alt.el. Looking inside, it's not so complicated.

Here's a minimalist example of what you could put in your .emacs to create your own very basic input method:

(quail-define-package
 "ciarans-chars" "MYlanguage" "MY" t
 "Ciaran's personal input method defining only the
 conversion sequence he wants
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("\"a" ?ä) ;; LATIN SMALL LETTER A WITH DIAERESIS
 ;; remember to comment your code, if you like :-)
 ("\"e" ?ë) ;; LATIN SMALL LETTER E WITH DIAERESIS
 ("a'" ?á) ;; LATIN SMALL LETTER A WITH ACUTE
 )

For more information on those two quail-* functions, you can get help in the usual way with C-h f and then type the name of the function at the prompt. If you want to test the above code, just paste those two code snippets into an Emacs buffer and run M-x eval-last-sexp after each. Then you can select the "ciarans-chars" input method, and you can read about the input method by pressing C-h I and typing "ciarans-chars" at the prompt.

You will also see that, as with the existing input methods, when you type the first character of what could be a conversion sequence (in the above example, that means either " or a), the minibuffer shows which characters could follow it to convert both characters into another character. So with ciarans-chars, when you type " the minibuffer will display: "[ae].

Looking at the source in /emacs/leim/quail/latin-alt.el should give you ideas for what other conversion sequences you'd use, and the other files in that directory contain the conversion code for more complex alphabets.

Me, I'll make a minimal input method for the characters I use from French, Dutch, plus the Irish a-fada "á". I filed a bug report about the current Dutch input method, but seeing how uncomplicated it is, I might be able to fix it and submit a patch now.
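As a rough sketch of what such a mixed input method could look like, here is one possible version. The package name "my-latin-mix", the title "MIX", and the particular choice of postfix sequences are all assumptions made for illustration; only quail-define-package and quail-define-rules come from Emacs itself, and the trailing arguments are copied from the example above.

```elisp
(require 'quail)

;; Hypothetical input method: the name, title and docstring are invented.
(quail-define-package
 "my-latin-mix" "Latin-1" "MIX" t
 "Sketch of a personal input method with a few French, Dutch
and Irish characters, typed as letter followed by accent."
 nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("a'" ?á)   ;; LATIN SMALL LETTER A WITH ACUTE (Irish a-fada)
 ("e'" ?é)   ;; LATIN SMALL LETTER E WITH ACUTE
 ("a`" ?à)   ;; LATIN SMALL LETTER A WITH GRAVE
 ("e`" ?è)   ;; LATIN SMALL LETTER E WITH GRAVE
 ("c," ?ç)   ;; LATIN SMALL LETTER C WITH CEDILLA
 ("e\"" ?ë)  ;; LATIN SMALL LETTER E WITH DIAERESIS
 ("i\"" ?ï)  ;; LATIN SMALL LETTER I WITH DIAERESIS
 )
```

The trade-off with postfix sequences like these is that ordinary text such as an a followed by an apostrophe also gets converted, so it pays to define only the sequences you really use.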



© FSFE