Clarification on find_topics() #1789

Closed Answered by MaartenGr
owwdesilva asked this question in Q&A

However, my requirement is not to generate topics for each document; I need to find the most suitable topics (top 5 or 10) for the entire corpus.

It depends on what you mean by suitable, but generally you could predict the topic distribution of each document using .transform and then aggregate these probabilities over the entire corpus. That way, you could find the top n most probable topics across the entire corpus.
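A minimal sketch of the aggregation step described above, in plain Python. It assumes you already have a documents-by-topics probability matrix; with BERTopic this would typically come from `.transform` on a model created with `calculate_probabilities=True`, but that part is only indicated in comments and is an assumption about your setup. The function name `top_topics_for_corpus` and the toy numbers are hypothetical, and averaging is just one possible way to aggregate.

```python
def top_topics_for_corpus(doc_topic_probs, n=5):
    """Rank topics by their mean probability across all documents.

    doc_topic_probs: one row per document, one column per topic.
    Returns a list of (topic_id, mean_probability), best first.
    """
    if not doc_topic_probs:
        return []
    n_docs = len(doc_topic_probs)
    n_topics = len(doc_topic_probs[0])

    # Sum each topic's probability over every document.
    totals = [0.0] * n_topics
    for row in doc_topic_probs:
        for topic_id, prob in enumerate(row):
            totals[topic_id] += prob

    # Average over the corpus, then sort topics from most to least probable.
    means = [total / n_docs for total in totals]
    ranked = sorted(range(n_topics), key=lambda t: means[t], reverse=True)
    return [(t, means[t]) for t in ranked[:n]]


# Assumed BERTopic usage (not executed here):
#   topics, probs = topic_model.transform(docs)
#   top = top_topics_for_corpus(probs.tolist(), n=5)

# Toy matrix standing in for real model output: 3 documents, 3 topics.
example_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.6, 0.3, 0.1],
]
top = top_topics_for_corpus(example_probs, n=2)
```

The column index is only a position in the matrix; how it maps to a topic ID (and whether the outlier topic is included) depends on the model, so check that mapping before looking up topic words.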

BTW,
I have observed that when I pass my corpus as a single document (concatenating transcript + labels into one document) to .find_topics(), I get a set of topics, and those topics look almost aligned with my expected output too. But when I change…

Replies: 1 comment 9 replies

Answer selected by owwdesilva