Topic modeling is an Information Retrieval (IR) technique that discovers representative topics from a collection of documents. Thus, we expect that logically related words will co-exist in the same document more frequently than words from different topics. For example, in a document about the space, it is more possibly to find words such as: planet, satellite, universe, galaxy, and asteroid. Whereas, in a document about the wildlife, it is more likely to find words such as: ecosystem, species, animal, and plant, landscape. But why text classification is so useful? In this blog post, we try to explain the importance of topic modeling and its use in software engineering.
What is topic modeling?
Reply