Changes at "Use automatic language detection for the machine translation feature"

Virgile Deville

Title

  • +{"en"=>"Use automatic language detection for the machine translation feature"}

Body

  • +["

    Is your feature request related to a problem? A clear and concise description of what the problem is.

    The futureu.europa.eu platform is one of the first Decidim instances to use the machine translation feature.

    Based on the proposal “Machine translation enhancement for source language detection edge cases”, we’d like to offer a more complete solution to the source language problem.


    As described in the additional context, adding a dropdown for language selection on UGC (user-generated content) forms might not be enough, as some users won’t notice it or take the time to select the right language.


    Describe the solution you'd like

    We propose to leverage the automatic language detection feature that some machine translation services offer. This way we’d be able to select the source language for the user.

    The issue with automatic language detection is that it doesn’t work well on very short texts (“OK”, “Da”, etc.). Also, some languages are very close (e.g. Romanian and Moldavian), so errors can be made.

    Since the user experience needs to be as smooth as possible, and in order to avoid issues with the availability of the language detection software, the implementation we are considering is a JavaScript widget that queries the external service through AJAX. This will allow us to use external services like Google Translate, or to add our own implementation (for instance, a small Python script that performs language detection; please note that there are no sufficiently mature libraries for this in Ruby). This implementation would also allow the user to select the language from a dropdown in case the confidence score provided by the library in use is not high enough.
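    To illustrate the idea of interchangeable detection backends described above, here is a minimal Python sketch. All names (`Detection`, `LanguageDetector`, `StubDetector`, `UnavailableDetector`, `detect_or_none`) are hypothetical and not part of Decidim or of any library listed in this proposal. It assumes a backend returns a language code with a confidence between 0 and 1, and that a failing backend should simply leave the user with the manual dropdown.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Detection:
    language: str      # language code, e.g. "ro"
    confidence: float  # assumed to be in the range 0.0 to 1.0


class LanguageDetector(Protocol):
    """Anything with a detect() method: an external service or an in-house script."""

    def detect(self, text: str) -> Detection: ...


class StubDetector:
    """Hypothetical stand-in backend that always returns the same result."""

    def __init__(self, language: str, confidence: float) -> None:
        self._result = Detection(language, confidence)

    def detect(self, text: str) -> Detection:
        return self._result


class UnavailableDetector:
    """Hypothetical backend that simulates an unreachable service."""

    def detect(self, text: str) -> Detection:
        raise ConnectionError("language detection service unavailable")


def detect_or_none(detector: LanguageDetector, text: str) -> Optional[Detection]:
    # If the backend fails, return None so the form can fall back
    # to the manual language dropdown instead of blocking the user.
    try:
        return detector.detect(text)
    except Exception:
        return None
```

    Because the caller depends only on the `detect` method, swapping an external service for a self-hosted script would not change the form logic in this sketch.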

    Acceptance criteria

    Given that I’m logged in
    When I am writing a comment / proposal / meeting in the same language as the one I selected to browse the website
    Then the form displays the language selection dropdown with my preferred language.

    Given that I’m logged in
    When I am writing a comment / proposal / meeting in a language other than the one I selected to browse the website
    If the language detection is at a high confidence level
    Then the language of my contribution is automatically selected, and a message lets me click to display the language selection dropdown and correct any mistake.

    If the language detection is at a low confidence level
    Then a message appears explaining the situation and showing the detected language and its confidence level (%).
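    The scenarios above could be expressed as a single decision function. This is only a sketch under stated assumptions: `FormState`, `language_form_state` and the 0.8 threshold are hypothetical, the proposal does not define what counts as "high" confidence, and it does not say which language stays selected in the low-confidence case (here the preferred language is kept and the dropdown is shown).

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.8  # assumed value; the proposal does not fix one


@dataclass(frozen=True)
class FormState:
    selected_language: str
    show_dropdown: bool
    message: Optional[str]


def language_form_state(
    ui_locale: str,
    detected: str,
    confidence: float,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> FormState:
    # Scenario 1: the contribution is in the language used to browse the site.
    if detected == ui_locale:
        return FormState(ui_locale, True, None)
    # Scenario 2: another language, detected with high confidence.
    # Select it automatically and offer a way to correct it.
    if confidence >= threshold:
        return FormState(
            detected, False, f"Detected language: {detected}. Click to change it."
        )
    # Scenario 3: another language, detected with low confidence.
    # Assumption: keep the preferred language and let the user decide.
    return FormState(
        ui_locale,
        True,
        f"Detected language: {detected} ({confidence:.0%} confidence). "
        "Please check the selected language.",
    )
```

    For example, `language_form_state("en", "ro", 0.42)` keeps `en` selected, shows the dropdown, and reports the detected language with its confidence.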


    Given that I’m logged in
    When I create an event or proposal
    Then language detection should be based on only one field, the body of the added content.
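    A minimal sketch of this last criterion, assuming the form data arrives as a mapping of field names to text and that the body field is called `body` (both are assumptions, as is the helper name `text_for_detection`):

```python
from typing import Mapping


def text_for_detection(fields: Mapping[str, str]) -> str:
    # Only the body is sent to language detection;
    # the title and any other fields are ignored.
    return fields.get("body", "").strip()
```

    With this helper, a French body under an English title would still be detected from the body alone.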

    Describe alternatives you've considered

    None


    Additional context

    This builds upon the meta proposal “Machine translation enhancement for source language detection edge cases”


    Resources: Libraries for automatic language selection

    https://github.com/ankane/fastText < Ruby fastText implementation

    https://github.com/bung87/whatlangid < Python library

    https://github.com/tremend-cofe/language-detection < Python implementation of the language detection application (based on whatlangId)


    Could this issue impact users' private data?

    No

    Funded by

    EU Commission

    "]
