Stack Overflow Tag Predictor



Stack Overflow is the largest, most trusted online community for developers to learn, share their programming knowledge, and build their careers.

Nearly every programmer uses Stack Overflow in one way or another. Each month, over 50 million developers visit the site, which features questions and answers on a wide range of topics in computer programming.

The website serves as a platform for users to ask and answer questions and, through membership and active participation, to vote questions and answers up or down and to edit them in a fashion similar to a wiki or Digg. As of April 2014, Stack Overflow had over 4,000,000 registered users, and it exceeded 10,000,000 questions in late August 2015. Based on the tags assigned to questions, the eight most discussed topics on the site are Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML.

Statement: (Multilabel Classification) A tag is a word or phrase that describes the topic of a question. Every question should have at least one tag and can have up to five. Tags can be newly created by the user (if the user has a reputation above 1500) or chosen from the list of tags available on the site. Tags help experts find the questions they can answer, and they help users find questions that are relevant or interesting to them. Given the huge number of tags on the site, it can be difficult for users to search manually for appropriate tags while posting a question. Also, only users with a good reputation can add new tags, which limits other users from suggesting new ones.

Because there are so many tags, finding the correct ones is often cumbersome. An auto-tagging system that suggests tags based on the content of the question would help. Since a question can carry several tags at once, this is a multilabel classification problem rather than a single-label one.
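To make the multilabel framing concrete, here is a minimal sketch of how the tags of each question could be turned into a multi-hot target vector, with one column per tag. It assumes that tags are stored as a single space-separated string per question, which the data description does not confirm; the sample tags and the helper names build_tag_index and encode_tags are hypothetical and used only for illustration.

```python
def build_tag_index(tag_strings):
    """Map each distinct tag to a column index, in sorted order."""
    # Assumption: tags within one question are separated by whitespace.
    vocab = sorted({tag for s in tag_strings for tag in s.split()})
    return {tag: i for i, tag in enumerate(vocab)}


def encode_tags(tag_string, tag_index):
    """Return a multi-hot list: 1 where the question carries that tag."""
    row = [0] * len(tag_index)
    for tag in tag_string.split():
        if tag in tag_index:  # tags outside the known vocabulary are ignored
            row[tag_index[tag]] = 1
    return row


# Hypothetical tag strings for three questions.
sample_tags = ["python pandas", "java", "python html"]

tag_index = build_tag_index(sample_tags)
targets = [encode_tags(s, tag_index) for s in sample_tags]

print(tag_index)
for tags, row in zip(sample_tags, targets):
    print(tags, "->", row)
```

Each row can have more than one 1 in it, which is what distinguishes this target from the single class label of an ordinary classification problem. A tag predictor would then learn one yes/no decision per column.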


Data Type: CSV files

train.csv (Id, title, body, tags)

Test.csv (id, title, body)

Data Size: 10GB
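With about 10GB of data, the training file may not fit in memory on a typical machine, so it can help to read it one row at a time. The sketch below counts how often each tag occurs while streaming the rows. It uses the column names listed above and assumes, without confirmation from the data description, that the tags column holds space-separated tags. The in-memory sample rows and the function name count_tags are hypothetical.

```python
import csv
import io
from collections import Counter


def count_tags(csv_file):
    """Stream rows from an open CSV file and count tag occurrences."""
    reader = csv.DictReader(csv_file)  # reads one row at a time
    counts = Counter()
    for row in reader:
        # Assumption: tags are separated by whitespace within the column.
        counts.update(row["tags"].split())
    return counts


# Hypothetical three-row sample standing in for train.csv.
sample = io.StringIO(
    "Id,title,body,tags\n"
    "1,How to merge frames?,<p>...</p>,python pandas\n"
    "2,Null pointer,<p>...</p>,java\n"
    "3,Parse a page,<p>...</p>,python html\n"
)

tag_counts = count_tags(sample)
print(tag_counts.most_common(3))
```

For the real file, the same function could be called on a handle opened with open("train.csv", newline="", encoding="utf-8"), assuming that path and encoding. Only the running counts are kept in memory, not the rows themselves.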

Target Audience:

We are building our course content and teaching methodology to cater to the needs of students at various levels of expertise and with varying background skills. This course can be taken by anyone with a working knowledge of a modern programming language such as C, C++, Java, or Python. We expect the average student to spend at least 5 hours a week over a 6-month period, amounting to 145+ hours of effort. The more the effort, the better the results. Here is a list of customers who would benefit from our course:

    1. Undergrad (BS/BTech/BE) students in engineering and science.
    2. Grad (MS/MTech/ME/MCA) students in engineering and science.
    3. Working professionals: Software engineers, Business analysts, Product managers, Program managers, Managers, Startup teams building ML products/services.

Course Features

  • Lectures 270
  • Quizzes 0
  • Duration 70+ hours
  • Skill level All levels
  • Language English
  • Students 3
  • Assessments Yes
QUALIFICATION: Masters from IISc Bangalore
PROFESSIONAL EXPERIENCE: 9+ years of experience (Yahoo Labs, Matherix Labs Co-founder, and Amazon)
