ftblog

:: widerstand zwecklos ::

February 26, 2009

Crazy Unix Printing System #2

Filed under: rants -- 23:47

CUPS is back with a vengeance...

This time I wanted to print a PDF file whose name contained a German u-umlaut
character. Let's say 'ü.pdf'. It failed.

Note: I'm using '?.pdf' throughout this session. The shell expands the glob to
the correct file name. This works because 'ü.pdf' is the only PDF file with a
one-character name in my current directory.

Let's first see how the 'ü' is encoded, UTF-8 or ISO-8859-15...
% print ?.pdf | od -b
0000000 374 056 160 144 146 012

Oh, so it's a single byte, octal 0374 (0xFC). That happens to be the 'ü' in
ISO-8859-15.
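
If you don't feel like reading octal dumps, a rough cross-check is to ask iconv whether the bytes are valid UTF-8 at all. This is only a sketch: it assumes iconv is installed, and is_utf8 is a helper name made up for this example. It is also just a heuristic, since some 8-bit byte sequences happen to be valid UTF-8 as well.

```shell
# Sketch: succeed if the bytes of "$1" form valid UTF-8, fail otherwise.
# is_utf8 is a hypothetical helper, not part of any tool mentioned here.
is_utf8() {
    # Converting UTF-8 to UTF-8 is a no-op, but iconv rejects invalid input.
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

name=$(printf '\374.pdf')   # the single-byte name seen in the od output
if is_utf8 "$name"; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
```

A lone 0374 byte followed by '.' cannot be a UTF-8 sequence, so the name is rejected, which fits the ISO-8859-15 reading.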

Now let's see how a sane program handles that:
% strace ls -la ?.pdf
execve("/bin/ls", ["ls", "-la", "\374.pdf"], [/* 56 vars */]) = 0
    [...]
lstat64("\374.pdf", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0

\374.pdf goes in aaaand it comes back out.

Now let us take a look at what CUPS's lp does:
% strace lp ?.pdf
execve("/usr/bin/lp", ["lp", "\374.pdf"], [/* 56 vars */]) = 0
    [...]
access("\303\274.pdf", R_OK) = -1 ENOENT (No such file or directory)

What the hell!? One byte goes in, two come out!?

So, what is octal 0303-0274?
Well, you've probably guessed it, it's UTF-8's representation of u-umlaut.
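
The same two bytes fall out if you push the file name through an ISO-8859-15 to UTF-8 conversion by hand. This is a sketch that only reproduces the byte-level effect seen in the strace output; it says nothing about how lp implements it internally. It assumes iconv and od are available, and latin9_to_utf8_octal is a made-up helper name.

```shell
# Sketch: re-encode "$1" from ISO-8859-15 to UTF-8 and dump the result in octal.
# latin9_to_utf8_octal is a hypothetical helper name.
latin9_to_utf8_octal() {
    printf '%s' "$1" | iconv -f ISO-8859-15 -t UTF-8 | od -An -b
}

# The one-byte name from above: 0374 followed by ".pdf"
latin9_to_utf8_octal "$(printf '\374.pdf')"
# prints the bytes 303 274 056 160 144 146 (in od's column layout)
```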

Clearly a bug, you would say? Well, so do I.
Before reporting to Debian's BTS I wanted to check whether someone had already
reported it, and yes - someone did: #440685

Aaaand as you can see, it was forwarded to CUPS: #2812

The response?
Fix Version: Will Not Fix
mike: This is working exactly as designed/required. If you use filenames
that have a different encoding than the locale [...]

Let's see...
% locale
LANG=
LC_CTYPE=de_DE@euro
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
% grep de_DE@euro /etc/locale.gen
de_DE@euro ISO-8859-15

Well "mike", no.
The locale is ISO-8859-15, the filename is encoded in ISO-8859-15.
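
The grep above can be wrapped up so that it matches the locale name exactly and skips commented-out entries. A minimal sketch, assuming a Debian-style locale.gen with one "<locale> <charset>" pair per line; charset_of is a made-up helper name. Where it is supported, 'locale charmap' should report the codeset of the active locale more directly.

```shell
# Sketch: print the charset a locale.gen-style file assigns to a locale.
# charset_of is a hypothetical helper.
#   $1 = locale name, $2 = file to search (defaults to /etc/locale.gen)
charset_of() {
    # Exact match on the first field, so "# de_DE@euro ..." lines are ignored.
    awk -v loc="$1" '$1 == loc { print $2; exit }' "${2:-/etc/locale.gen}"
}

# LC_ALL overrides LC_CTYPE, which overrides LANG, for character encoding.
if [ -r /etc/locale.gen ]; then
    charset_of "${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
fi
```

With the settings shown above, the lookup key would be de_DE@euro.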

And even if they were different: the idea of converting the encoding of input
filenames is really one of the dumbest I've ever heard... But let's assume that
really is the design. Then the least you could do is document that behaviour.
But guess what, neither the lp manpage nor the 'Command-Line Printing and
Options' section of the online help in the CUPS server says anything about
encoding. They just say 'file(s)' or 'filename' - that's it.

If that's what the design/requirement is, then maybe CUPS really is the piece
of shit everyone else keeps telling me it is...
