Simple Python implementation of collaborative topic modeling?

python machine-learning lda topic-modeling collaborative-filtering

This should get you started (although I'm not sure why it hasn't been posted yet): https://github.com/arongdari/python-topic-model

More specifically: https://github.com/arongdari/python-topic-model/blob/master/ptm/collabotm.py

class CollaborativeTopicModel:
    """
    Wang, Chong, and David M. Blei. "Collaborative topic modeling for recommending scientific articles."
    Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011.

    Attributes
    ----------
    n_item: int
        number of items
    n_user: int
        number of users
    R: ndarray, shape (n_user, n_item)
        user x item rating matrix
    """

Looks nice and straightforward. I'd still suggest at least looking at gensim; Radim has done a fantastic job of optimizing that software.
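To give a feel for what a model like this computes, here is a minimal sketch of the alternating least squares step described in the Wang and Blei paper. It is not the code from the linked repository. It assumes the per-item topic proportions theta are already available (for example from a separately trained LDA model) and holds them fixed, which is a simplification. The function name fit_ctr, the hyperparameter defaults, and the toy data are all made up for illustration.

```python
import numpy as np


def fit_ctr(R, theta, lambda_u=0.01, lambda_v=10.0, a=1.0, b=0.01, n_iter=20):
    """Alternating least squares for user vectors U and item vectors V.

    R     : (n_user, n_item) binary rating matrix
    theta : (n_item, n_topic) per-item topic proportions, held fixed here
    All default hyperparameters are illustrative assumptions.
    """
    n_user, n_item = R.shape
    n_topic = theta.shape[1]
    rng = np.random.default_rng(0)
    U = 0.01 * rng.standard_normal((n_user, n_topic))
    V = theta.copy()                    # items start at their topic proportions
    C = np.where(R > 0, a, b)           # confidence: a if rated, b otherwise
    eye = np.eye(n_topic)

    for _ in range(n_iter):
        # Update each user vector with the item vectors fixed.
        for i in range(n_user):
            weighted = V.T * C[i]       # (n_topic, n_item), columns scaled by confidence
            U[i] = np.linalg.solve(weighted @ V + lambda_u * eye, weighted @ R[i])
        # Update each item vector; the prior pulls v_j towards theta_j.
        for j in range(n_item):
            weighted = U.T * C[:, j]    # (n_topic, n_user)
            rhs = weighted @ R[:, j] + lambda_v * theta[j]
            V[j] = np.linalg.solve(weighted @ U + lambda_v * eye, rhs)
    return U, V


# Toy data: items 0-1 are mostly topic 0, items 2-3 mostly topic 1.
R = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)
theta = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.8]])

U, V = fit_ctr(R, theta)
scores = U @ V.T                        # predicted preference, (n_user, n_item)
print(np.round(scores, 2))
```

Because the item vectors are tied to theta, the second user (who rated only item 0) should also get a relatively high score for item 1, which shares its topic. That is the point of combining the topic model with the rating matrix.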


Here is a very simple LDA implementation using gensim. You can find more information here: https://radimrehurek.com/gensim/tutorial.html

I hope this helps.

from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import RSLPStemmer
from gensim import corpora
import gensim

# The NLTK tokenizer, stopword, and RSLP data must be downloaded first (nltk.download).
# Note: RSLPStemmer is designed for Portuguese; PorterStemmer is the usual choice for English.
st = RSLPStemmer()
stop_words = set(stopwords.words('english'))  # build the set once, not once per token
texts = []

doc1 = "Veganism is both the practice of abstaining from the use of animal products, particularly in diet, and an associated philosophy that rejects the commodity status of animals"
doc2 = "A follower of either the diet or the philosophy is known as a vegan."
doc3 = "Distinctions are sometimes made between several categories of veganism."
doc4 = "Dietary vegans refrain from ingesting animal products. This means avoiding not only meat but also egg and dairy products and other animal-derived foodstuffs."
doc5 = "Some dietary vegans choose to wear clothing that includes animal products (for example, leather or wool)."
docs = [doc1, doc2, doc3, doc4, doc5]

# tokenize, remove stopwords, and stem each document
for doc in docs:
    tokens = word_tokenize(doc.lower())
    stopped_tokens = [w for w in tokens if w not in stop_words]
    stemmed_tokens = [st.stem(w) for w in stopped_tokens]
    texts.append(stemmed_tokens)

# map each stem to an integer id and convert the documents to bag-of-words vectors
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# generate LDA model using gensim
ldamodel = gensim.models.ldamodel.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=20)
print(ldamodel.print_topics(num_topics=2, num_words=4))

[(0, u'0.066*animal + 0.065*, + 0.047*product + 0.028*philosophy'), (1, u'0.085*. + 0.047*product + 0.028*dietary + 0.028*veg')]
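Note that the printed topics include "," and "." as top words, because the tokenizer keeps punctuation as separate tokens. One way to avoid that is to tokenize on word characters only. The sketch below uses the standard library for this; the helper name tokenize_words is made up for illustration. NLTK's RegexpTokenizer with the pattern r'\w+' has the same effect, which appears to be what the RegexpTokenizer import in the answer was intended for.

```python
import re


def tokenize_words(text):
    """Lowercase the text and return runs of letters, digits, and underscores."""
    return re.findall(r"\w+", text.lower())


doc2 = "A follower of either the diet or the philosophy is known as a vegan."
tokens = tokenize_words(doc2)
print(tokens)
```

Swapping this in for the tokenizer call keeps punctuation out of the dictionary, so it can no longer show up in a topic. Be aware that hyphenated words such as "animal-derived" are split into two tokens with this pattern.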
