[DVIPDFMx] dvipdfm vs. x vs. xx

Heiko Oberdiek oberdiek at uni-freiburg.de
Tue Dec 15 19:49:05 KST 2009


On Tue, Dec 15, 2009 at 05:58:50PM +0900, Jin-Hwan Cho wrote:

> On Dec 15, 2009, at 8:44 AM, Karl Berry wrote:
> 
> >    Heiko>
> >    * UTF16-warning: AFAIK the hyperref code is correct in producing
> >      correct PDF strings in UTF16. I do not have information about
> >      xdvipdfmx, whether it uses different specials, expects the data
> >      in different form, ...
> 
> I already read the discussion in the [tex-live] mailing list. Let's recall Volovich's sample:
> 
> \documentclass{article}
> \usepackage[unicode,pdftitle={test}]{hyperref}
> \pdfpagewidth=300bp
> \pdfpageheight=300bp
> \begin{document}
> %\showthe\pdfpagewidth
> This is a test.
> \end{document}
> 
> 1. (xelatex) The "unicode" option triggers the "UTF16-warning". The reason is as follows:
> 
> 	First, the "hyperref" package encodes the pdftitle "test" in
> 	UCS-2 because the "unicode" option was specified explicitly.
> 	xdvipdfmx then tries to encode the already UCS-2 encoded
> 	pdftitle into UCS-2 again. This is why the "UTF16-warning"
> 	message appears when running xelatex.

Agreed. The question now is how this reencoding can be suppressed.
The BOM is given explicitly as "\376\377".
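
To look at the string hyperref actually produces, something like the
following should work. It is only a sketch: \TitleString is a name made
up for this example, and I assume that \pdfstringdef is called inside the
document body with the `unicode' option active. The log should then show
the octal-escaped PDF string, starting with the BOM "\376\377".

```latex
\documentclass{article}
\usepackage[unicode]{hyperref}
\begin{document}
% \pdfstringdef is hyperref's converter to PDF strings;
% \TitleString is a hypothetical macro name for this sketch.
\pdfstringdef\TitleString{test}
% Write the converted string to the terminal and the log file.
\typeout{PDF string: \meaning\TitleString}
This is a test.
\end{document}
```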

> 2. (latex+dvipdfmx) The 2nd line must include the driver option "dvipdfmx":
> 	\usepackage[dvipdfmx,unicode,pdftitle={test}]{hyperref}
> 
> 	There is no warning because dvipdfmx never tries to encode the pdftitle into UCS-2
> 	without the special "pdf: tounicode [cmap_file]".
> 
> Here is the difference between dvipdfmx and xdvipdfmx.
> 
> 	xdvipdfmx assumes that the pdftitle was given in the encoding "UTF-8",
> 	but dvipdfmx does not.
> 
> Here is my answer to Volovich's sample:
> 
> 	Never use the "unicode" option with xelatex. Even if the pdftitle
> 	contains CJK characters (encoded in UTF-8), xdvipdfmx translates it
> 	into UCS-2 perfectly.

No. This is definitely wrong. Hyperref would use PDFDocEncoding, but
xdvipdfmx assumes Unicode/UTF-8. However, PDFDocEncoding is not a
subset of Unicode: some slots are different! I have therefore disabled
this route in hyperref. Only `pdfencoding=unicode' (same as `unicode=true')
and `pdfencoding=auto' are possible.
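
For the sample, this means selecting one of the two remaining settings
explicitly. A minimal sketch with `pdfencoding=auto' (whether xdvipdfmx
then stays quiet is a separate question that this example does not
answer):

```latex
\documentclass{article}
% pdfencoding=auto lets hyperref choose the encoding per string;
% plain PDFDocEncoding alone cannot be selected with xelatex.
\usepackage[pdfencoding=auto,pdftitle={test}]{hyperref}
\begin{document}
This is a test.
\end{document}
```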

> For example, the following code will work well with xelatex:
> 
> \documentclass{article}
> \usepackage[pdftitle={??????test}]{hyperref}
> \begin{document}
> This is a test.
> \end{document}

Now (hyperref 6.79t) Unicode is used, and the failed-conversion warning
appears.

> Then how do you get the same result without xelatex?
> 
> In the case of "pdflatex", I do not know how to use a UTF-8 encoded pdftitle directly.
> But the combination "latex+dvipdfmx" can do that with the special "pdf:tounicode"
> as follows:
> 
> \documentclass{article}
> \usepackage{atbegshi}
> \AtBeginShipoutFirst{\special{pdf:tounicode UTF8-UCS2}}
> \usepackage[dvipdfmx,pdftitle={????????????test}]{hyperref}
> \begin{document}
> This is a test.
> \end{document}

A lot of stuff isn't encoded in UTF-8, so the result is an
encoding mess.

Yours sincerely
  Heiko <oberdiek at uni-freiburg.de>

