Unicode and --with-colons

Robert J. Hansen-3
C:\Users\Robert J. Hansen\Desktop> gpg --fixed-list-mode --with-colons --list-key 0x3ADBFA6D00A1E6FE

=====
[... trimmed ...]
uid:-::::1436536488::100E4A12486A5261E374B3B0CA16CF0516F4367C::Ludwig Hügelschäfer <[hidden email]>:
=====

"That's an odd encoding," I said to myself.  "It must be UTF-8 presented
as ASCII or Windows-1252.  Let's look, shall we?"

=====

C:\Users\Robert J. Hansen\Desktop> gpg --fixed-list-mode --with-colons --list-key 0x3ADBFA6D00A1E6FE > ludwig.asc

C:\Users\Robert J. Hansen\Desktop> python
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> with open("ludwig.asc") as fh:
...     bytes = fh.read()
...
>>> bytes
'ÿþt\x00r\x00u\x00:\x00:\x001\x00:\x001\x004\x009\x001\x000\x003\x004\x004\x004\x009\x00:\x000\x00:\x003\x00:\x001\x00...'

=====

Weirder and weirder.  GnuPG is outputting data in UTF-16LE, complete
with a correct byte-order mark... but it is first taking what is
(apparently) the UTF-8 encoding of Ludwig's name, padding each byte with
a null byte to make a 16-bit code unit, and calling the result UTF-16.
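
To make that hypothesis concrete, here is a minimal sketch that reproduces the suspected mangling in Python. It only illustrates the guess described above and says nothing about what GnuPG actually does internally; the variable names are made up for the example.

```python
# Reproduce the suspected mangling: UTF-8 bytes, each widened to a
# 16-bit little-endian code unit, prefixed with a UTF-16LE BOM.
name = "H\u00fcgelsch\u00e4fer"          # "Hügelschäfer"

utf8_bytes = name.encode("utf-8")        # ü and ä become two bytes each

# Pair every UTF-8 byte with a null byte (low byte first = little endian).
widened = b"".join(bytes([b, 0]) for b in utf8_bytes)
mangled = b"\xff\xfe" + widened          # BOM + widened bytes

# A UTF-16 reader sees one character per original UTF-8 byte.
as_text = mangled.decode("utf-16")
print(ascii(as_text))
```

If the guess is right, a UTF-16-aware reader will show two characters where each of ü and ä should be, which is the classic "UTF-8 read as Latin-1" pattern.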

Looking at the output from just a plain --list-key, it appears correct:

=====

\x00H\x00ü\x00g\x00e\x00l\x00s\x00c\x00h\x00ä\x00f\x00e\x00r

=====

So -- what's the canonically approved way to convert this mangled form
back into Unicode?  Is this mangled form a deliberate design choice, or
is this a bug?
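
Not a canonical answer, but as a stopgap, assuming the mangling really is "each UTF-8 byte stored as its own UTF-16 code unit", the following sketch appears to undo it. The helper name unmangle is hypothetical, and the sample bytes are constructed by hand rather than taken from real GnuPG output.

```python
def unmangle(raw: bytes) -> str:
    """Hypothetical helper: recover text from UTF-8 bytes that were
    widened to UTF-16 code units (assumption, not confirmed behaviour)."""
    # Decode as UTF-16; the codec uses the BOM to pick the byte order
    # and strips it from the result.
    text = raw.decode("utf-16")
    # Every character should now be in the range U+0000..U+00FF, so
    # Latin-1 maps each one back to the single byte it came from.
    # This raises UnicodeEncodeError if the assumption does not hold.
    original_bytes = text.encode("latin-1")
    # Those bytes are the UTF-8 encoding of the real text.
    return original_bytes.decode("utf-8")


# Hand-built sample: BOM, then the UTF-8 bytes of "Hügel" widened.
sample = "\ufeffH\u00c3\u00bcgel".encode("utf-16-le")
recovered = unmangle(sample)
print(recovered)
```

Note the use of Latin-1 rather than Windows-1252 for the middle step: Latin-1 maps code points 0 to 255 directly onto byte values, whereas Windows-1252 remaps part of the 0x80 to 0x9F range and would corrupt some UTF-8 continuation bytes. The file would also need to be read in binary mode (open with "rb") for this to apply.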


_______________________________________________
Gnupg-users mailing list
[hidden email]
http://lists.gnupg.org/mailman/listinfo/gnupg-users