An interview with Rajarshee Mitra, Data Scientist at Microsoft
This is a part of the interview series I have started on data science and machine learning. These interviews are with people who have inspired me and who have taught me (read: are teaching me) about these beautiful subjects. The main purpose is to gain insights into real-world project experiences, perspectives on learning new things, and some fun facts, thereby enriching the communities in the process.
This is where you can find all the interviews done so far.
Today, I have Rajarshee Mitra with me. Rajarshee is currently a Data Scientist at Microsoft. Previously, he worked as a Research Engineer at Artifacia. He has also worked as an AI Intern at Niki.ai. His primary area of interest is Natural Language Processing. He has been able to convert his research works into full-fledged papers and those can be checked out here. His article titled Dynamics of Neural Networks is one of my personal favorites. Be sure to check it out. You can learn more about Rajarshee from here.
Rajarshee is my university senior and one of my inspirations for what I am doing today. In fact, he was the one who was kind enough to introduce me to the beautiful world of machine learning on a summer afternoon in 2016. I still remember how he took the example of the Boston House Prices dataset and taught me the basics of regression. I am forever grateful to him for this. I would like to wholeheartedly thank Rajarshee for taking the time to do this interview. I hope this interview serves the betterment of the data science and machine learning communities in general :)
Sayak: Hi Rajarshee! Thank you for doing this interview. It’s a pleasure to have you here today.
Rajarshee: Thank you for having me, Sayak!
Sayak: Maybe you could start by introducing yourself — what is your current job and what are your responsibilities over there?
Rajarshee: I work as a Data Scientist at Microsoft, Hyderabad. My work primarily revolves around applying language understanding techniques in the domain of web and search. I apply existing state-of-the-art neural models, or sometimes ideas of my own, to millions of data points for millions of users!
Sayak: Awesome! I am curious to know how you became interested in data science and machine learning.
Rajarshee: I was drawn to machine learning during my sophomore year in college. I did some MOOCs and a few projects with a professor. As a result, I delved deep into the subject and soon started working on a few projects on my own. I learned more of the theory as I did more hands-on work.
I liked language. The idea of making a computer understand the same was exceptionally appealing to me (it is even today). It was then a matter of time before I applied what I learned in ML to NLP. And the journey began!
Sayak: I can kind of feel the appeal, and it is what draws me to the field too. When you were starting, what kind of challenges did you face? How did you overcome them?
Rajarshee: I started at a time when highly abstract libraries like PyTorch and compute resources like GPUs had not yet hit the market. Every project used to take a lot of time, and often we implemented things based on just a rough understanding. Moreover, keeping this up alongside the college curriculum was a challenge.
Sayak: This is so relatable! But your passion always kept you going :) What were some of the capstone projects you did during your formative years?
Rajarshee: I have worked on multiple small projects.
I did an internship in 2015 where I built a basic user query understanding NLP system as a part of a chatbot. The bot would help users order food, book flights or cabs, and handle other daily needs. It required me to predict the nature of the query, break it down into a graph structure, and process it to get entities, attributes, etc., finally leading to an action by the bot.
Sayak: Woah! That must have been super fun! Data science and machine learning are rapidly evolving fields. How do you manage to keep track of the latest relevant happenings?
Rajarshee: Twitter, mostly. I follow researchers who work in my field. Also, I follow most of the important conferences and their publications. It might be important to say that I don’t read everything but focus mostly on papers or articles that are related to my current work or that excite me a lot.
There are many websites that aggregate papers and code like this which can be very helpful.
Sayak: Thank you for passing that on, Rajarshee. Being a practitioner, one thing that I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?
Rajarshee: In this age of information, there are too many “new” things. Learning things that could be relevant and useful for you is much more important than learning every new thing.
I look at a lot of new stuff coming up every week, but I pick only a few things and really delve deep into them. I don’t let go of a concept until I have grasped it completely, taking the help of resources like code snippets, blogs, and forums, almost everywhere that concept is mentioned. Most of the time, I try to form my own ideas while learning or simply transfer the learned ideas to my current projects.
During my early years, I used to try out the things I learned as I read the papers. I would reproduce papers and validate them. This would give me a confidence that you won’t get by only reading them.
Sayak: That is so very true. There is also a different kind of joy when you reproduce a paper. You learn so many things along with that process, not only the concepts/novelty introduced in the paper. Any advice for the beginners?
Rajarshee: Giving advice to a beginner would be akin to giving the same to myself a few years ago.
Learn a few related things, but specialize in one. Be very specific about what you are doing or want to do. Associate with one thing for a long time and be really good at it. AI is a broad field.
Try to use other people’s work and build on it. When you are implementing papers, see if you can mix your own ideas into them and improve things.
Doing one thing very well is a million times better than doing many things at an average level. Take one problem that you are passionate about, learn about prior work and start working on it. Research it until you get very satisfactory results or interesting insights.
Sayak: Those suggestions are going with me to my grave, for sure. I have listened to you speaking about this so many times and I have got motivated each time. Thank you so much, Rajarshee, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.
Rajarshee: I really hope so. Sharing what I have learned gives me immense joy, and it is part of my own learning as well.
Summary
Rajarshee is very specific about his interest in machine learning and I think maintaining that specificity is very important. It gives more definition to the overall journey we pursue in our careers. But at the same time, keeping yourself updated with the latest happenings (at least at a high level) helps you to be more innovative, Rajarshee added. Taking a particular concept and not leaving it until you have grasped it completely is surely one of the key ingredients of becoming an effective practitioner.
I hope you enjoyed reading this interview. Watch this space for the next one and I hope to see you soon.
If you want to know more about me, check out my website.