
Propose new functionalities for Decidim software

#DecidimRoadmap Designing Decidim together


Use automatic language detection for the machine translation feature

Virgile Deville
10/06/2021 17:14  

Is your feature request related to a problem? A clear and concise description of what the problem is.

The futureu.europa.eu platform is one of the first Decidim instances to be using the machine translation feature.

Based on the proposal “Machine translation enhancement for source language detection edge cases” we’d like to offer a more complete solution to the source language problem.


As described in the additional context, adding a language selection dropdown to user-generated content (UGC) forms might not be enough, as some users won’t notice it or won’t take the time to select the right language.


Describe the solution you'd like

We propose to leverage the automatic language detection feature that some machine translation services offer. This way we’d be able to select the source language on the user’s behalf.

The issue with automatic language detection is that it doesn’t work well on very short texts (such as “OK” or “Da”). Also, some languages are very close (e.g. Romanian and Moldovan), so errors can occur.

Because the user experience needs to be as smooth as possible, and to avoid depending on the availability of the language detection software, the implementation we are considering is a JavaScript widget that queries the external service through AJAX. This would let us use external services such as Google Translate or add our own implementation (for instance, a small Python script that performs language detection; please note that there are no sufficiently mature libraries for this in the Ruby ecosystem). This implementation would also let the user select the language from a dropdown when the confidence score returned by the library is not high enough.
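
The exchange described above could look roughly like the following. This is a minimal sketch in Python under stated assumptions, not the proposed implementation: toy_detect, handle_detect_request, the stopword lists and the 0.8 threshold are all hypothetical, and the stopword-overlap detector only stands in for whichever external service or library is eventually chosen. It illustrates the JSON shape a widget might consume, and why very short or ambiguous input yields a confidence too low to act on.

```python
import json
from typing import Callable, Tuple

# Hypothetical, tiny stopword lists. A real detector would be an external
# service or a trained model, not this stand-in.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "we"},
    "fr": {"le", "la", "et", "est", "de", "les", "que", "nous"},
    "ro": {"și", "este", "de", "la", "că", "un", "pe", "noi"},
}

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; the proposal does not fix one

Detector = Callable[[str], Tuple[str, float]]


def toy_detect(text: str) -> Tuple[str, float]:
    """Return (language, confidence) from stopword overlap. Illustrative only."""
    words = text.lower().split()
    hits = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        # "und" is the ISO 639 code for an undetermined language.
        return ("und", 0.0)
    best = max(hits, key=hits.get)
    return (best, hits[best] / total)


def handle_detect_request(body: str, detector: Detector = toy_detect) -> str:
    """Turn the widget's JSON request into a JSON response."""
    text = json.loads(body).get("text", "")
    language, confidence = detector(text)
    return json.dumps({
        "language": language,
        "confidence": round(confidence, 2),
        # The widget would show the dropdown when this flag is true.
        "needs_confirmation": confidence < CONFIDENCE_THRESHOLD,
    })


if __name__ == "__main__":
    for sample in ("we think that the proposal is good", "OK", "de la"):
        print(sample, "->", handle_detect_request(json.dumps({"text": sample})))
```

Because the detector is passed in as a plain callable, the same handler could wrap a remote service or a local script without changing what the widget receives.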

Acceptance criteria

Given that I’m logged in
When I am writing a comment / proposal / meeting in the same language as the one I selected to browse the website
Then the form displays the language selection dropdown with my preferred language preselected.

Given that I’m logged in
When I am writing a comment / proposal / meeting in another language than the one I selected to browse the website
If the language detection is at a high confidence level
Then the language of my contribution is automatically selected, and a message offers to display the language selection dropdown with one click so that I can correct any mistake.

If the language detection is at a low confidence level
Then a message appears explaining the situation and showing the detected language and its confidence level (%). 


Given that I’m logged in
When I create an event or a proposal
Then language detection is performed on one field only: the body of the content.
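
The acceptance criteria above can be read as a small decision function. The sketch below is written in Python purely for illustration; the names (LanguageFieldState, resolve_language, text_for_detection), the 0.8 threshold and the message wording are assumptions, as is the choice to fall back to the browsing language when confidence is low, which the criteria do not specify.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_CONFIDENCE = 0.8  # assumed threshold; the proposal does not fix a value


@dataclass
class LanguageFieldState:
    selected_language: str   # language stored with the contribution
    show_dropdown: bool      # whether the dropdown is visible straight away
    message: Optional[str]   # hint shown next to the form, if any


def text_for_detection(form_fields: dict) -> str:
    """Only the body is sent for detection, per the last criterion."""
    return form_fields.get("body", "")


def resolve_language(browsing_locale: str, detected: str,
                     confidence: float) -> LanguageFieldState:
    """Map a detection result for the body onto the scenarios above."""
    if detected == browsing_locale:
        # Same language as the interface: dropdown preset to that language.
        return LanguageFieldState(browsing_locale, True, None)
    if confidence >= HIGH_CONFIDENCE:
        # Different language, high confidence: select it automatically and
        # offer a one-click way to open the dropdown and correct it.
        return LanguageFieldState(
            detected, False,
            f"Detected language: {detected}. Click to change it.")
    # Different language, low confidence: explain, and show the guess with
    # its confidence as a percentage. Keeping the browsing locale selected
    # here is an assumption.
    return LanguageFieldState(
        browsing_locale, True,
        f"We could not reliably detect the language "
        f"(best guess: {detected}, {confidence:.0%}). Please select it.")
```

Keeping this logic in one pure function would make each scenario easy to cover with a unit test, whichever side (widget or server) ends up hosting it.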

Describe alternatives you've considered

None


Additional context

This builds upon the meta proposal “Machine translation enhancement for source language detection edge cases”


Resources: libraries for automatic language detection

https://github.com/ankane/fastText (Ruby fastText implementation)

https://github.com/bung87/whatlangid (Python library)

https://github.com/tremend-cofe/language-detection (Python implementation of the language detection application, based on whatlangid)


Could this issue impact users’ private data?

No

Funded by

EU Commission


Reference: MDC-PROP-2021-06-16402
Version number 1 (of 1)

1 comment

Andrés
13/07/2021 10:34

I agree with the general idea of using automatic language detection, but have some doubts regarding the implementation:

1) Don't add the "We could not automatically detect..." message. As far as I know, other platforms with automatic translation (Facebook and Twitter) don't have this feature. If a translation fails, it fails. The fallback is to give the option to show the original text.
2) Regarding AJAX/Python/Ruby: since this depends heavily on which service or library you plan to use in your app, it would need to extend the contract that we already have in MyTranslationService, for instance with MyTranslationService.detect(text) or something similar.
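
As a rough illustration of the two points above: the sketch below is in Python only for brevity, whereas the real contract lives in Decidim's Ruby code and is not reproduced here. TranslationService, DummyService and render are hypothetical names. It shows a service contract extended with a detect method, and a silent fallback to the original text when detection fails.

```python
from abc import ABC, abstractmethod
from typing import Optional


class TranslationService(ABC):
    """Hypothetical stand-in for the contract a translation service fulfils."""

    @abstractmethod
    def translate(self, text: str, source: str, target: str) -> str:
        """Return the text translated from source to target."""

    @abstractmethod
    def detect(self, text: str) -> Optional[str]:
        """Return a language code, or None when detection fails."""


class DummyService(TranslationService):
    """Fake backend used only to exercise the flow."""

    def translate(self, text: str, source: str, target: str) -> str:
        return f"[{source}->{target}] {text}"

    def detect(self, text: str) -> Optional[str]:
        return "fr" if "bonjour" in text.lower() else None


def render(text: str, target: str, service: TranslationService) -> str:
    """Translate when the source language is known, else show the original."""
    source = service.detect(text)
    if source is None or source == target:
        return text  # silent fallback: no warning, just the original text
    return service.translate(text, source, target)


if __name__ == "__main__":
    service = DummyService()
    print(render("Bonjour à tous", "en", service))
    print(render("OK", "en", service))
```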
