[DVIPDFMx] dvipdfm vs. x vs. xx
Jin-Hwan Cho
jinhwan.cho at gmail.com
Tue Dec 15 21:11:40 KST 2009
On Dec 15, 2009, at 7:49 PM, Heiko Oberdiek wrote:
> On Tue, Dec 15, 2009 at 05:58:50PM +0900, Jin-Hwan Cho wrote:
>
>> On Dec 15, 2009, at 8:44 AM, Karl Berry wrote:
>>
>>> Heiko>
>>> * UTF16-warning: AFAIK the hyperref code is correct in producing
>>> correct PDF strings in UTF16. I do not have information about
>>> xdvipdfmx, whether it uses different specials, expects the data
>>> in different form, ...
>>
>> I already read the discussion in the [tex-live] mailing list. Let's recall Volovich's sample:
>>
>> \documentclass{article}
>> \usepackage[unicode,pdftitle={test}]{hyperref}
>> \pdfpagewidth=300bp
>> \pdfpageheight=300bp
>> \begin{document}
>> %\showthe\pdfpagewidth
>> This is a test.
>> \end{document}
>>
>> 1. (xelatex) The "unicode" option causes the "UTF16-warning". The reason is as follows:
>>
>> First, the "hyperref" package encodes the pdftitle "test" in the
>> UCS-2 encoding because the "unicode" option was specified
>> explicitly. After that, xdvipdfmx tries to encode the already
>> UCS-2-encoded pdftitle into UCS-2 again. So you get the warning
>> message "UTF16-warning" when running xelatex.
>
> Agreed. The question is now, how this reencoding can be suppressed?
> The BOM marker is explicitly given "\376\377".
Without touching the source code of (x)dvipdfmx, there is no GOOD way to suppress
the reencoding (the function maybe_reencode_utf8() in xdvipdfmx does this job).
One possible way (not GOOD) is:
(1) Prepare an external CMap file Identity-Byte (attached to this mail).
(2) Before loading hyperref, add the following lines:
\usepackage{atbegshi}
\AtBeginShipoutFirst{\special{pdf:tounicode Identity-Byte}}
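Put together with Volovich's sample, the workaround might look like the sketch below. It is only an illustration of steps (1) and (2): it assumes the attached Identity-Byte CMap file is installed where xdvipdfmx can find it, and the page-size lines of the original sample are left out for brevity.

```latex
\documentclass{article}
% Step (2): these two lines come before hyperref is loaded.
\usepackage{atbegshi}
% Assumption: the CMap file Identity-Byte from step (1) can be found by xdvipdfmx.
\AtBeginShipoutFirst{\special{pdf:tounicode Identity-Byte}}
\usepackage[unicode,pdftitle={test}]{hyperref}
\begin{document}
This is a test.
\end{document}
```

The special is issued once, on the first shipped-out page, which is why it is wrapped in \AtBeginShipoutFirst.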
The future plan for (x)dvipdfmx is to provide a special "pdf: tounicode none" that
suppresses the working of maybe_reencode_utf8() in xdvipdfmx.
Do you have any better idea?
>> 2. (latex+dvipdfmx) The 2nd line must include the driver option "dvipdfmx", as in
>> \usepackage[dvipdfmx,unicode,pdftitle={test}]{hyperref}
>>
>> There is no warning because dvipdfmx never tries to encode the pdftitle into UCS-2
>> without the special "pdf: tounicode [cmap_file]".
>>
>> Here is the difference between dvipdfmx and xdvipdfmx.
>>
>> xdvipdfmx assumes that the pdftitle was given in the encoding "UTF-8",
>> but dvipdfmx does not.
>>
>> Here is my answer to Volovich's sample:
>>
>> Never use the "unicode" option with xelatex. Even if the pdftitle
>> contains CJK characters (encoded in UTF-8), xdvipdfmx translates it
>> into UCS-2 perfectly.
>
> No. This is definitely wrong. Hyperref would use PDFDocEncoding, but
> xdvipdfmx assumes Unicode/UTF-8. However PDFDocEncoding is not a
> subset of Unicode. Some slots are different! Therefore I have disabled
> this way in hyperref. Only `pdfencoding=unicode' (same as `unicode=true')
> and `pdfencoding=auto' are possible.
You are right. PDFDocEncoding is different from Unicode. But I do not understand
the exact meaning of "pdfencoding=auto" under XeTeX. This option does not
do any reencoding under XeTeX, right?
>> For example, the following code will work well with xelatex:
>>
>> \documentclass{article}
>> \usepackage[pdftitle={??????test}]{hyperref}
>> \begin{document}
>> This is a test.
>> \end{document}
>
> Now (6.79t) unicode is used and the failed conversion warning
> appears.
I don't understand why the conversion warning appears. I will try again after
updating to the new version 6.79t, and will then continue this discussion.
Best regards, ChoF.