Propose new functionalities for Decidim software
#DecidimRoadmap Designing Decidim together
Intelligent recommendations
When someone publishes a new proposal, a list of similar entries is displayed to help avoid duplicates. The current recommendation algorithm calculates the similarity of each pair of proposals by comparing their trigrams (sequences of three characters). This method, however, does not take the semantics of the text into account, so two proposals that express the same idea in different words are not matched. It could be improved with simple Machine Learning techniques.
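To illustrate the limitation, here is a minimal sketch of trigram comparison. It is a simplified stand-in, not the code Decidim or Postgres actually runs: the function names are hypothetical, and the Jaccard ratio (shared trigrams divided by all distinct trigrams) is an assumption about how the overlap is scored.

```python
from __future__ import annotations


def trigrams(text: str) -> set[str]:
    """Return the set of character trigrams of a lowercased, whitespace-normalised text."""
    normalised = " ".join(text.lower().split())
    return {normalised[i:i + 3] for i in range(len(normalised) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the trigram sets of two texts, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A near-identical spelling scores high, while a paraphrase with no
# shared character sequences scores zero despite meaning much the same.
typo_score = trigram_similarity("more bike lanes", "more bike lines")
paraphrase_score = trigram_similarity("more bike lanes", "additional cycling paths")
```

In this toy case the paraphrase shares no trigram with the original text, which is the kind of duplicate the semantic approach is meant to catch.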
We suggest using a technique called word embeddings to assign each proposal a multi-dimensional vector, in such a way that semantically similar proposals end up with nearby vectors. The recommendations for a given proposal would then be the proposals whose vectors are at the smallest distance from its own.
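As a rough sketch of the lookup step, assuming each proposal already has a vector: the proposal does not name a distance measure, so cosine distance is used here as an assumption, and `recommend`, the proposal ids and the two-dimensional vectors are all made up for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero vector as unrelated to everything
    return 1.0 - dot / (norm_u * norm_v)


def recommend(target_id, vectors, k=3):
    """Return the ids of the k proposals whose vectors are closest to the target's."""
    target = vectors[target_id]
    ranked = sorted(
        (cosine_distance(target, vec), pid)
        for pid, vec in vectors.items()
        if pid != target_id
    )
    return [pid for _, pid in ranked[:k]]


# Toy data: real embeddings would have hundreds of dimensions.
toy_vectors = {
    "bike-lanes": [1.0, 0.0],
    "cycling-paths": [0.9, 0.1],
    "library-hours": [0.0, 1.0],
}
similar = recommend("bike-lanes", toy_vectors, k=2)
```

A linear scan like this is fine for a few thousand proposals; a larger installation would probably want an index for nearest-neighbour search.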
To calculate the vector associated with each proposal, we suggest using pre-calculated embeddings for each word (covering the most frequent words in the Decidim vocabulary) and then averaging the vectors of all the words that appear in the proposal. The word vectors could be pre-calculated offline by anyone with a working knowledge of NLP (DataForGoodBCN, the community that created this proposal, could provide these calculations).
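The averaging step could look roughly like this. It is only a sketch: `proposal_vector` is a hypothetical name, the tokenisation is a plain whitespace split, and skipping words that have no pre-calculated vector is an assumption the proposal does not spell out.

```python
def proposal_vector(text, word_vectors):
    """Average the vectors of the known words in a proposal.

    word_vectors maps a lowercase word to its pre-calculated embedding
    (a list of floats, all of the same length). Words without an
    embedding are ignored. Returns None if no word is known.
    """
    tokens = text.lower().split()
    known = [word_vectors[token] for token in tokens if token in word_vectors]
    if not known:
        return None
    dimensions = len(known[0])
    return [
        sum(vec[i] for vec in known) / len(known)
        for i in range(dimensions)
    ]


# Toy two-dimensional embeddings, made up for illustration.
toy_word_vectors = {"bike": [1.0, 0.0], "lane": [0.0, 1.0]}
vector = proposal_vector("Bike lane now", toy_word_vectors)
```

Because the word vectors are loaded from a pre-calculated table, computing a proposal vector at publication time needs only a dictionary lookup and an average, with no model training on the server.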
This proposal is being evaluated
Comment details
Hey there! Loving it too!
I'm working on proposal recommendations in the AhoraNosTocaParticipar version of Decidim.
I was thinking that there are a couple of versions of word embeddings already computed in Spanish (I know of this one in Chile). As I understand it, there are a few techniques for finding phrase similarities (https://medium.com/@adriensieg/text-similarities-da019229c894).
What do you guys think of creating a separate engine for this? Currently Postgres does the text similarity matching, so I think it could work if we create a separate engine and simply replace the calls in Decidim.