This proposal has been accepted for implementation in the main repository. Check the comments for updates.

Intelligent recommendations

Main repo (accepted)

DataForGoodBCN Official participant 30/06/2020 18:12

When someone publishes a new proposal, a list of similar entries is displayed to help avoid duplicates. The current recommendation algorithm computes the similarity of each pair of proposals by comparing their trigrams (sequences of three characters). This method, however, ignores the semantics of the text and could be improved with simple machine learning techniques.
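To make the limitation concrete, here is a minimal sketch of a trigram comparison. It is a simplified illustration, not the code Decidim or its database actually runs: the function names, the padding rule and the use of Jaccard similarity are assumptions made for this example.

```python
def trigrams(text: str) -> set[str]:
    """Return the set of 3-character sequences in a lowercased, padded text."""
    # Padding (an assumption of this sketch) lets the start and end of the
    # text contribute trigrams too.
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity: shared trigrams divided by all distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# Two spellings of the same title share most trigrams, while a paraphrase
# with different words shares very few, even though it means the same thing.
print(trigram_similarity("more bike lanes", "more bike lane"))
print(trigram_similarity("more bike lanes", "expand cycling paths"))
```

The second pair is the kind of duplicate that a purely character-based comparison is likely to miss.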

We suggest using a technique called word embeddings, which assigns each proposal a multi-dimensional vector in such a way that semantically similar proposals end up with vectors that are close together. The recommendations for a given proposal would then be the proposals whose vectors are at the smallest distance from its own.
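As a rough sketch of the ranking step, the snippet below orders proposals by their distance to a query vector. The proposal does not fix a distance measure; cosine distance is assumed here because it is a common choice for embeddings. The three-dimensional vectors, the proposal identifiers and the `recommend` function are hypothetical and only serve as an illustration.

```python
import math


def cosine_distance(u: list[float], v: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # A zero vector has no direction, so treat it as maximally unrelated.
    return 1.0 - dot / norm if norm else 1.0


def recommend(query_vec: list[float],
              proposal_vecs: dict[str, list[float]],
              k: int = 3) -> list[str]:
    """Return the ids of the k proposals closest to the query vector."""
    ranked = sorted(proposal_vecs,
                    key=lambda pid: cosine_distance(query_vec, proposal_vecs[pid]))
    return ranked[:k]


# Hypothetical toy vectors; real embeddings would have far more dimensions.
toy_vectors = {
    "bike-lanes": [0.9, 0.1, 0.0],
    "cycling-paths": [0.8, 0.2, 0.1],
    "school-meals": [0.0, 0.1, 0.9],
}

print(recommend([1.0, 0.0, 0.0], toy_vectors, k=2))
```

With these toy values the two cycling-related proposals rank ahead of the unrelated one.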

To calculate the vector for each proposal, we suggest using pre-calculated embeddings for each word (at least the most frequent ones in the Decidim vocabulary) and then averaging the vectors of all the words that appear in the proposal. The word vectors could be pre-calculated offline by anyone with intermediate knowledge of NLP (DataForGoodBCN, the community that created this proposal, could provide these calculations).
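The averaging step could look roughly like the sketch below. The two-dimensional word vectors are invented toy values, and the tokenisation rule and the choice to skip words that have no vector are assumptions of this example rather than part of the proposal.

```python
import re
from typing import Optional

# Hypothetical pre-calculated word vectors; real ones would be computed
# offline and have many more dimensions.
WORD_VECTORS = {
    "bike": [0.9, 0.1],
    "cycling": [0.8, 0.2],
    "lanes": [0.7, 0.0],
    "school": [0.0, 0.9],
}


def proposal_vector(text: str,
                    word_vectors: dict[str, list[float]]) -> Optional[list[float]]:
    """Average the vectors of the known words in a proposal text."""
    tokens = re.findall(r"\w+", text.lower())
    # Words outside the vocabulary are skipped (an assumption of this sketch).
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None  # no known words, so no vector can be computed
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


print(proposal_vector("More bike lanes!", WORD_VECTORS))
```

Here "more" has no vector and is ignored, so the result is the mean of the vectors for "bike" and "lanes". A proposal containing only unknown words yields `None`, and a caller would need to decide how to handle that case.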

Liked by Carol Romero and 16 more

Comment details

Felipe Álvarez

09/11/2020 14:56

Hey there! Loving it too!
I'm working on proposal recommendations in the AhoraNosTocaParticipar version of Decidim.
I was thinking that there are a couple of sets of word embeddings already computed for Spanish (I know of this one in Chile). As I understand it, there are a few techniques for finding phrase similarities (https://medium.com/@adriensieg/text-similarities-da019229c894).
What do you guys think of creating a separate engine for this? Currently Postgres does the text similarity work, so I think creating a separate engine and simply replacing the calls in Decidim could work.
