Unicode and --with-colons

Robert J. Hansen
C:\Users\Robert J. Hansen\Desktop> gpg --fixed-list-mode --with-colons --list-key 0x3ADBFA6D00A1E6FE

=====
[... trimmed ...]
uid:-::::1436536488::100E4A12486A5261E374B3B0CA16CF0516F4367C::Ludwig Hügelschäfer <[hidden email]>:
=====

"That's an odd encoding," I said to myself.  "It must be UTF-8 presented
as ASCII or Windows-1252.  Let's look, shall we?"

=====

C:\Users\Robert J. Hansen\Desktop> gpg --fixed-list-mode --with-colons --list-key 0x3ADBFA6D00A1E6FE > ludwig.asc

C:\Users\Robert J. Hansen\Desktop> python
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> with open("ludwig.asc") as fh:
...     bytes = fh.read()
...
>>> bytes
'ÿþt\x00r\x00u\x00:\x00:\x001\x00:\x001\x004\x009\x001\x000\x003\x004\x004\x004\x009\x00:\x000\x00:\x003\x00:\x001\x00...'

=====

Weirder and weirder.  GnuPG is outputting data in UTF-16LE, complete
with a correct byte-order mark... but it first takes what is
(apparently) the UTF-8 encoding of Ludwig's name, pads each byte with
a null byte, and calls the result UTF-16.
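
For what it's worth, the 'ÿþ' and '\x00' characters in the interpreter
session above are what a UTF-16LE file looks like when open() falls
back to a single-byte locale encoding such as Windows-1252.  A minimal
sketch of reading such a redirected file with the byte-order mark
honored follows.  The helper name read_listing is made up for
illustration, and the UTF-8 fallback for files without a BOM is an
assumption, not something GnuPG promises here:

```python
import codecs


def read_listing(path):
    """Read redirected gpg output, honoring a UTF-16 BOM if one is present.

    Hypothetical helper: falls back to UTF-8 when no BOM is found,
    which is an assumption about what the file would otherwise contain.
    """
    with open(path, "rb") as fh:  # binary mode: no locale decoding
        raw = fh.read()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and picks the byte order.
        return raw.decode("utf-16")
    return raw.decode("utf-8")


# Usage, assuming the file from the session above exists:
#     text = read_listing("ludwig.asc")
```

Reading in binary mode first keeps the decoding decision explicit
instead of leaving it to the console's code page.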

The output from just a plain --list-key, on the other hand, appears
correct:

=====

\x00H\x00ü\x00g\x00e\x00l\x00s\x00c\x00h\x00ä\x00f\x00e\x00r

=====

So -- what's the canonically approved way to convert this mangled form
back into Unicode?  Is this mangled form a deliberate design choice, or
is this a bug?
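
Not a canonical answer, but one possible workaround can be sketched in
Python.  It assumes the mangling is exactly what is described above:
each UTF-8 byte of the name was widened into its own UTF-16 code unit.
Under that assumption, mapping every character back to a single byte
with Latin-1 and decoding the result as UTF-8 recovers the name.  The
function name unmangle is invented for this example; if Windows-1252
rather than Latin-1 was involved, a different codec would be needed:

```python
def unmangle(text):
    """Undo UTF-8 bytes that were widened one-per-code-unit into UTF-16.

    Returns the input unchanged when it does not look mangled, i.e.
    when it cannot be narrowed to Latin-1 or is not valid UTF-8.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Reproduce the suspected mangling: BOM, then each UTF-8 byte + NUL.
name = "Hügelschäfer"
mangled_bytes = b"\xff\xfe" + b"".join(
    bytes([b, 0]) for b in name.encode("utf-8")
)

decoded = mangled_bytes.decode("utf-16")  # what a UTF-16 reader sees
repaired = unmangle(decoded)              # round-trips back to the name
```

Already-correct strings pass through untouched, because narrowing them
to Latin-1 and decoding as UTF-8 fails and the original is returned.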


_______________________________________________
Gnupg-users mailing list
[hidden email]
http://lists.gnupg.org/mailman/listinfo/gnupg-users
Re: Unicode and --with-colons

Ben McGinnes
On Sat, Apr 01, 2017 at 04:57:04AM -0400, Robert J. Hansen wrote:

> C:\Users\Robert J. Hansen\Desktop> gpg --fixed-list-mode --with-colons
> --list-key 0x3ADBFA6D00A1E6FE
>
> =====
> [... trimmed ...]
> uid:-::::1436536488::100E4A12486A5261E374B3B0CA16CF0516F4367C::Ludwig
> Hügelschäfer <[hidden email]>:
> =====
>
> "That's an odd encoding," I said to myself.  "It must be UTF-8 presented
> as ASCII or Windows-1252.  Let's look, shall we?"

I've never noticed anything like that with Ludwig's key, either in
regular output or with the flags you used here.  That said, if you
ever do need to be absolutely certain that the output is in UTF-8,
there is a way to guarantee it.  Just use GPGME's little-known XML
output with gpgme-tool:

echo "KEYLIST 0x3ADBFA6D00A1E6FE /bye" | gpgme-tool > 0x3ADBFA6D00A1E6FE.xml

The output will need to be trimmed of the GPGME header at the top, the
"OK" disconnection at the bottom, the "D " at the beginning of each
line and the "%0A" at the end of each line.  I'm sure you can script
it to trim all that for you before writing the file, although I've
attached that example here anyway.
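
The trimming described above could be scripted roughly as follows.
This is only a sketch: it assumes every payload line starts with "D "
and that "%0A" is one of the usual Assuan-style percent escapes, so
all %XX sequences are decoded rather than just that one.  The function
name extract_data and the sample input (including its greeting line
and XML content) are invented for illustration and are not real
gpgme-tool output:

```python
from urllib.parse import unquote_to_bytes


def extract_data(raw):
    """Keep only the "D " payload lines and undo their %XX escapes.

    Status lines (the header at the top, "OK" at the bottom) are
    dropped because they do not start with the "D " prefix.
    """
    chunks = []
    for line in raw.splitlines():
        if line.startswith(b"D "):
            chunks.append(unquote_to_bytes(line[2:]))
    return b"".join(chunks).decode("utf-8")


# Made-up sample shaped like the description above.
sample = (
    b"OK hypothetical greeting line\n"
    b"D <?xml version=\"1.0\"?>%0A\n"
    b"D <example>Ludwig H\xc3\xbcgelsch\xc3\xa4fer</example>%0A\n"
    b"OK\n"
)

xml_text = extract_data(sample)
```

Working on bytes until the very end means the UTF-8 decoding happens
once, on the reassembled payload, rather than line by line.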

On a semi-related note, a bit over a year ago I generated W3C XML
Schema (XSD) and Relax-NG Schema (RNG) files for the GPGME XML data.
From these I also generated Relax-NG Compact (RNC), DTDs, Docbook 5
documentation (from the XSD) and XHTML docs (from the Docbook), in
case anyone finds a need for validating the XML files.  There isn't
currently a formal XML namespace set up for the schemas, since it
doesn't appear that anyone's done anything with them (i.e. no one
noticed and thus no one asked for it).

Anyway, if they're of use, the schemas and docs are in one of my
branches on the git server, here:

https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gpgme.git;a=tree;f=lang/xml-schemas;h=06dfd4c925294cffba88f4f451bdef39dd2b4e4d;hb=6e9d5a5800fa8da96c706748bf60a8a074818af6

I'd recommend using either the XSD or the RNG rather than the others.
They're generally going to be better or more accurate.  The RNC and
DTD are there because I figured I may as well generate them at the
same time.


Regards,
Ben

0x3ADBFA6D00A1E6FE.xml (2K)
signature.asc (643 bytes)