Cancer Diagnosis using Medical Records



Much has been said in recent years about how precision medicine, and genetic testing in particular, is going to disrupt the way diseases like cancer are treated. So far this is only partially happening, because of the huge amount of manual work still required.
Once sequenced, a cancer tumor can have thousands of genetic mutations. But the challenge is distinguishing the mutations that contribute to tumor growth (drivers) from the neutral mutations (passengers).
Currently this interpretation of genetic mutations is done manually. It is a very time-consuming task in which a clinical pathologist has to review and classify every single genetic mutation based on evidence from text-based clinical literature.
Objective: To classify every genetic mutation based on evidence from text-based clinical literature.

  • Data Type:
    1. Pipe-delimited (||) text files and CSV files.
    2. Training_variants.csv (Id, Gene, Variations, Class)
    3. Training_text (ID, Text)
    4. Test_variants.csv (Id, Gene, Variations)
    5. Test_text (ID, Text)
  • Data Size: 159MB
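
As a rough illustration of how the two file types listed above might be read and joined, here is a minimal sketch using only the Python standard library. It assumes each file starts with a header line and that the text files hold one `ID||Text` record per line; the sample rows, gene names, and helper function names are hypothetical and are not taken from the actual dataset.

```python
import csv
import io


def read_variants(handle):
    """Parse the comma-separated variants file into a dict keyed by Id."""
    reader = csv.DictReader(handle)
    return {row["Id"]: row for row in reader}


def read_text(handle, sep="||"):
    """Parse the pipe-delimited text file into a dict of ID -> Text."""
    next(handle)  # skip the header line (assumed to be present)
    texts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split only on the first "||" so the text itself may contain commas.
        record_id, _, text = line.partition(sep)
        texts[record_id] = text
    return texts


def join_records(variants, texts):
    """Attach the evidence text to each variant row (empty string if no match)."""
    return [dict(row, Text=texts.get(rid, "")) for rid, row in variants.items()]


# Hypothetical sample contents standing in for the real files.
SAMPLE_VARIANTS = (
    "Id,Gene,Variations,Class\n"
    "0,GENE_A,Mutation_X,1\n"
    "1,GENE_B,Mutation_Y,4\n"
)
SAMPLE_TEXT = (
    "ID||Text\n"
    "0||Abstract text about GENE_A, with commas.\n"
    "1||Abstract text about GENE_B.\n"
)

records = join_records(
    read_variants(io.StringIO(SAMPLE_VARIANTS)),
    read_text(io.StringIO(SAMPLE_TEXT)),
)
for record in records:
    print(record["Id"], record["Gene"], record["Class"], record["Text"][:30])
```

With the real files, the `io.StringIO` objects would be replaced by file handles opened with `open(path, encoding="utf-8")`. The text is split by hand rather than with `csv.reader` because the standard `csv` module only accepts a single-character delimiter.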

Key Points:

  1. The course is valid for 240 days, starting from the date of your registration.
  2. Expert guidance: we will try to answer your queries within 24 hours.
  3. 10+ machine learning algorithms will be taught in this course.
  4. No prerequisites: we will teach everything from the basics (we just expect you to know basic programming).
  5. Python for Data Science is part of the course curriculum.


Target Audience:

We are building our course content and teaching methodology to cater to the needs of students at various levels of expertise and with varying background skills. This course can be taken by anyone with a working knowledge of a modern programming language like C/C++/Java/Python. We expect the average student to spend at least 5 hours a week over a 6-month period, amounting to 145+ hours of effort. The more the effort, the better the results. Here is a list of customers who would benefit from our course:

    1. Undergrad (BS/BTech/BE) students in engineering and science.
    2. Grad (MS/MTech/ME/MCA) students in engineering and science.
    3. Working professionals: Software engineers, Business analysts, Product managers, Program managers, Managers, Startup teams building ML products/services.

Course Features

  • Lectures 304
  • Quizzes 0
  • Duration 100+ hours
  • Skill level All levels
  • Language English
  • Students 9
  • Assessments Yes
QUALIFICATION: Masters from IISc Bangalore
PROFESSIONAL EXPERIENCE: 9+ years of experience (Yahoo Labs, Matherix Labs co-founder, and Amazon)


  1. February 10, 2018

    At the end of the last lecture in ENSEMBLE MODELS (24.12), will we be starting the project/case study for Cancer detection? Or when does this Cancer detection project start?
