Edgewall Software

Opened 10 years ago

Closed 6 years ago

Last modified 4 years ago

#54 closed enhancement (fixed)

msgctxt support

Reported by: cmlenz
Owned by: palgarvio
Priority: major
Milestone: 1.0
Component: PO and MO Files
Version: devel
Keywords: context pgettext msgctxt
Cc: chris@…

Description

As described in the GNU gettext manual.

If we really want to do this, the following is needed:

  • Support msgctxt fields in the reading and writing of PO files
  • Support msgctxt fields when compiling catalogs to MO files
  • Provide an extension of the gettext.GNUTranslations class that provides the pgettext family of functions, and knows how to parse msgctxt fields out of MO files.

Attachments (6)

po-file-save-msgctxt.patch (1.8 KB) - added by asheesh+msgctxt-2008-11-05@… 9 years ago.
Patch for having write_po write out the msgctxt
corrected-msgctxt-output.patch (2.2 KB) - added by asheesh+msgctxt-2008-11-05@… 9 years ago.
Patch for having write_po write out the msgctxt (FIXED to write msgctxt before msgid)
msg_context_support.patch (16.5 KB) - added by palgarvio 9 years ago.
full_ctxt_support.patch (52.1 KB) - added by palgarvio 9 years ago.
name_orthogonality.patch (1.5 KB) - added by Christopher A. Stelma <chris@…> 6 years ago.
name_orthogonality.2.patch (2.1 KB) - added by Christopher A. Stelma <chris@…> 6 years ago.


Change History (22)

comment:1 Changed 10 years ago by cmlenz

Plone has its own msgfmt.py module that can apparently deal with msgctxt fields:

https://svn.plone.org/svn/collective/python-gettext/trunk/pythongettext/msgfmt.py

In MO files, the msgctxt seems to be prepended to the msgid, separated by an EOT character (“\x04”).
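To illustrate that key layout, here is a minimal sketch that writes a tiny MO file in memory and reads it back with the standard library. It assumes the usual little-endian MO layout; `build_mo` and `ContextTranslations` are hypothetical names made up for this example, not Babel's implementation, and the sample translations are illustrative only. Python 3.8 and later ship their own `pgettext`, which the subclass simply overrides here.

```python
import gettext
import io
import struct

EOT = "\x04"  # separator between msgctxt and msgid inside an MO key


def build_mo(messages):
    """Serialize {(msgctxt or None, msgid): msgstr} into MO file bytes."""
    # The empty msgid carries the header; the charset lets gettext decode UTF-8.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    for (msgctxt, msgid), msgstr in messages.items():
        key = msgid if msgctxt is None else msgctxt + EOT + msgid
        entries[key] = msgstr
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())

    ids = strs = b""
    spans = []  # (id offset, id length, str offset, str length)
    for key, value in items:
        spans.append((len(ids), len(key), len(strs), len(value)))
        ids += key + b"\0"
        strs += value + b"\0"

    id_table = 7 * 4                        # the header is seven 32-bit fields
    str_table = id_table + 8 * len(items)   # each table row is (length, offset)
    ids_start = str_table + 8 * len(items)
    strs_start = ids_start + len(ids)
    out = struct.pack("<Iiiiiii", 0x950412DE, 0, len(items), id_table, str_table, 0, 0)
    for id_off, id_len, _, _ in spans:
        out += struct.pack("<ii", id_len, ids_start + id_off)
    for _, _, str_off, str_len in spans:
        out += struct.pack("<ii", str_len, strs_start + str_off)
    return out + ids + strs


class ContextTranslations(gettext.GNUTranslations):
    """Hypothetical subclass that looks up 'msgctxt EOT msgid' keys."""

    def pgettext(self, context, message):
        key = context + EOT + message
        translated = self.gettext(key)
        # gettext() returns its argument unchanged when nothing matches.
        return message if translated == key else translated


sample = {("ticket", "created"): "créé", ("wiki page", "created"): "créée"}
translations = ContextTranslations(io.BytesIO(build_mo(sample)))
```

With this catalog, `translations.pgettext("ticket", "created")` and `translations.pgettext("wiki page", "created")` resolve to different entries, while an unknown context falls back to the untranslated message.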

comment:2 Changed 9 years ago by palgarvio

Messages extraction also needs to handle msgctxt.

Perhaps, instead of adding another item to be returned, we could make the message extractor expect lineno, funcname, messages, extra instead of the lineno, funcname, messages, comments it currently expects per message. The last item would be a simple dict that can carry comments, context, or anything else we might want it to return in the future.
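A rough sketch of what such an extractor could yield, purely to make the proposed tuple shape concrete. Everything here is hypothetical: the function name `extract_sketch`, the regular expression, the `# NOTE:` comment tag, and the `"comments"` and `"context"` keys of the extra dict are assumptions for illustration, not Babel's extractor interface.

```python
import re

# Hypothetical call pattern: _("msg") or pgettext("context", "msg").
CALL_RE = re.compile(r'\b(_|pgettext)\(\s*"([^"]*)"(?:\s*,\s*"([^"]*)")?')
COMMENT_TAG = "# NOTE:"  # assumed translator-comment marker


def extract_sketch(lines):
    """Yield (lineno, funcname, messages, extra) for each call found."""
    pending_comments = []
    for lineno, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith(COMMENT_TAG):
            # Remember translator comments until the next source line.
            pending_comments.append(stripped[len(COMMENT_TAG):].strip())
            continue
        for match in CALL_RE.finditer(line):
            funcname, first, second = match.groups()
            extra = {"comments": list(pending_comments)}
            if funcname == "pgettext" and second is not None:
                # The context travels in the extra dict, not as a fifth item.
                extra["context"] = first
                messages = second
            else:
                messages = first
            yield lineno, funcname, messages, extra
        pending_comments = []
```

The point of the design is that callers unpack a fixed four-item tuple, and new kinds of metadata only add keys to the dict.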

Changed 9 years ago by asheesh+msgctxt-2008-11-05@…

Patch for having write_po write out the msgctxt

comment:3 Changed 9 years ago by asheesh+msgctxt-2008-11-05@…

This patch is (C) Creative Commons, though I (Asheesh Laroia) am the author. Permission is granted to distribute it under the same terms as the rest of Babel is currently distributed.

It was pretty simple, so I added support for writing out msgctxt in write_po. I also added a test that demonstrates msgctxt correctly round-trips when reading a PO file and then writing it out.

That means task #1 is complete:

  • Support msgctxt fields in the reading and writing of PO files

So #2 and #3 remain:

  • Support msgctxt fields when compiling catalogs to MO files
  • Provide an extension of the gettext.GNUTranslations class that provides the pgettext family of functions, and knows how to parse msgctxt fields out of MO files.

I'd appreciate feedback on my test (would you rather I move the test into the general write_po section of tests?) and code, and given appropriate encouragement could be convinced to try to handle part of #2 and/or #3.

comment:4 Changed 9 years ago by anonymous

When I write "#" above, it refers to items in the bulleted list I quoted, not to ticket numbers.

comment:5 Changed 9 years ago by asheesh+msgctxt-2008-11-05@…

I updated the patch: msgctxt should always be BEFORE msgid, apparently.

You can read the *corrected* patch prettified on the web at http://code.creativecommons.org/viewgit?p=babel.git;a=commitdiff;h=9a2b61300be048d585e28d3958bc251b81c5aee7;hp=67e783f5aab52488fc7eecad4319c859e88c7575 , and I'll attach the non-prettified version.
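For readers who want to see the ordering concretely, here is a minimal sketch of formatting a single PO entry with msgctxt placed before msgid. It is not the attached patch; `po_quote` and `format_po_entry` are hypothetical helpers, and the escaping covers only backslashes, double quotes, newlines, and tabs.

```python
def po_quote(text):
    """Escape a string for a PO file (backslash, quote, newline, tab only)."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\t", "\\t"))
    return '"%s"' % escaped


def format_po_entry(msgid, msgstr, msgctxt=None):
    """Return one PO entry as text, writing msgctxt first when it is given."""
    lines = []
    if msgctxt is not None:
        lines.append("msgctxt " + po_quote(msgctxt))
    lines.append("msgid " + po_quote(msgid))
    lines.append("msgstr " + po_quote(msgstr))
    return "\n".join(lines) + "\n"
```

Calling `format_po_entry("created", "créé", msgctxt="ticket")` produces three lines in the order msgctxt, msgid, msgstr; without a context, the msgctxt line is simply omitted.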

Changed 9 years ago by asheesh+msgctxt-2008-11-05@…

Patch for having write_po write out the msgctxt (FIXED to write msgctxt before msgid)

comment:6 follow-up: Changed 9 years ago by cboos

  • Keywords context pgettext msgctxt added

Any progress on this topic?

I think it would be handy to have this feature for being able to select the proper translation depending on the gender, e.g. "created" in French translates into "créée" or "créé", depending on whether we're talking about a Wiki page ("une page wiki") or a ticket ("un ticket").

As an alternative to the full-fledged pgettext family (which looks like a lot of work, as you have to decline it through all the variants: unicode/non-unicode, domain, etc.), I have in mind a much simpler approach: we could have embedded contexts, i.e. some strings that would simply be replaced by nothing.

To some extent, we can already achieve this by (ab)using keyword expansion:

_("created%(a_ticket)s", a_ticket='')

But it would be nicer if there was a shorter notation for it, e.g.

  • _("created%(a_ticket)c") (c for context, if it's not already taken)
  • _("created%{a_ticket}")
  • _("{a_ticket_was}created")

Or anything similar.

comment:7 in reply to: ↑ 6 Changed 9 years ago by palgarvio

Replying to cboos:

Any progress on this topic?

I think it would be handy to have this feature for being able to select the proper translation depending on the gender, e.g. "created" in French translates into "créée" or "créé", depending on whether we're talking about a Wiki page ("une page wiki") or a ticket ("un ticket").

Work on this has begun: with the patch on #124, Babel supports extractors passing additional info such as custom formats, message contexts, etc.

I'll see if I can add the gettext context family of functions to Babel's Translations class.

As an alternative to the full-fledged pgettext family (which looks like a lot of work, as you have to decline it through all the variants: unicode/non-unicode, domain, etc.), I have in mind a much simpler approach: we could have embedded contexts, i.e. some strings that would simply be replaced by nothing.

Nah, fully fledged! :)

comment:8 Changed 9 years ago by palgarvio

Context support has been added to the Translations class. Tests will follow.

Changed 9 years ago by palgarvio

comment:9 Changed 9 years ago by palgarvio

Tests are now also included in the patch.

Changed 9 years ago by palgarvio

comment:10 Changed 9 years ago by palgarvio

Sorry, please discard attachment:full_ctxt_support.patch; it contains changes for several tickets. It can be deleted.

comment:11 Changed 9 years ago by palgarvio

  • Owner changed from cmlenz to palgarvio

comment:12 Changed 9 years ago by palgarvio

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in [462] and [463]. Thank you, Asheesh!

Changed 6 years ago by Christopher A. Stelma <chris@…>

Changed 6 years ago by Christopher A. Stelma <chris@…>

comment:13 follow-up: Changed 6 years ago by Christopher A. Stelma <chris@…>

  • Cc chris@… added
  • Resolution fixed deleted
  • Status changed from closed to reopened

I was writing a script to generate these names and realized it wasn't as easy as it should be. The naming of these four methods is the culprit.

"u" and "l" are mutually exclusive, so I hope you'll agree they should occupy the same place in the method name.

# Enumerate every gettext variant name from its orthogonal prefix letters.
encodings = ['', 'l', 'u']
domains   = ['', 'd']
plurals   = ['', 'n']
contexts  = ['', 'p']
for context in contexts:
    for domain in domains:
        for plural in plurals:
            for encoding in encodings:
                print(domain + encoding + plural + context + 'gettext')

name_orthogonality.2.patch

comment:14 in reply to: ↑ 13 Changed 6 years ago by fschwarz

  • Resolution set to fixed
  • Status changed from reopened to closed

Replying to Christopher A. Stelma <chris@…>:

I was writing a script to generate these names and realized it wasn't as easy as it should be. The naming of these four methods is the culprit. (…)

Thank you very much for your comment/patches. However I think this is different enough to warrant a new ticket (#263). That way the discussion stays more focused.

comment:15 follow-up: Changed 4 years ago by nhooey

Note that the gettext module in Python 2.x and 3.x doesn't support the msgctxt string. So an .mo file compiled from a .po file with "msgctxt" labels won't work when running Babel's gettext(). Python's gettext doesn't even print an error; it just loads a corrupted catalog that translates nothing.

comment:16 in reply to: ↑ 15 Changed 4 years ago by nhooey

Never mind, you just have to use Babel's "pgettext()" function instead of gettext(), which accounts for msgctxt labels.

Replying to nhooey:

Note that the gettext module in Python 2.x and 3.x doesn't support the msgctxt string. So an .mo file compiled from a .po file with "msgctxt" labels won't work when running Babel's gettext(). Python's gettext doesn't even print an error; it just loads a corrupted catalog that translates nothing.
