An interview with Joel Grus, Research Engineer at Allen Institute for Artificial Intelligence

7 min read · Sep 25, 2019

This is part of an interview series I have started on data science and machine learning. The interviews are with people who have inspired me and who have taught me (read: are still teaching me) about these beautiful subjects. The main purpose is to gather insights about real-world project experiences, perspectives on learning new things, and some fun facts, and thereby to enrich the community in the process.

This is where you can find all the interviews done so far.

Today, I have Joel Grus with me. Joel is a Research Engineer at the Allen Institute for Artificial Intelligence, popularly known as AI2, where he tries to convince computers to correctly answer science questions. Before joining AI2, he was a software engineer at Google, and before that a data scientist at a variety of startups. He has a math degree from the University of Washington and an economics degree from Caltech.

Joel is also the author of the best-selling book Data Science from Scratch: First Principles with Python. He is often seen at prestigious conferences such as ICLR, AAAI, and EMNLP, talking and teaching on a wide range of topics. Some of my personal favorites include Reproducibility in ML, Writing Code for NLP Research (co-presented with Matt Gardner and Mark Neumann), and Fizz Buzz in Tensorflow. Joel likes to host live-coding sessions as well, and those sessions can be found here. You can learn more about Joel here.

I would like to wholeheartedly thank Joel for taking the time to do this interview. I hope this interview serves a purpose towards the betterment of data science and machine learning communities in general :)

Sayak: Hi Joel! Thank you for doing this interview. It’s a pleasure to have you here today.

Joel: Thanks for having me.

Sayak: Maybe you could start by introducing yourself — what is your current job and what are your responsibilities over there?

Joel: I’m a research engineer on a team called AllenNLP. I work on a library called (big surprise) AllenNLP, which is a deep learning library for NLP researchers. I do a little bit of everything: thinking about the long-term product roadmap, adding new features, refactoring code, implementing models, fixing bugs, partnering with researchers, giving talks and tutorials, and so on.

Sayak: That is a head full of responsibilities! But it must all be super fun. How did you become interested in data science and machine learning?

Joel: Back in the day I studied math and then spent a while doing quantitative finance (i.e. building spreadsheets). In 2006 I left finance and got a role at a startup doing “analysis”, which again was mostly Excel and SQL. A few years after that “data science” became a thing, but I was sort of already doing it! I took the original Andrew Ng ml-class in 2011, and that’s kind of what got me started doing machine learning.

Sayak: Nice to know that we share the same starting point, Andrew Ng’s ml-class. The only difference is that I started five years later, and on Coursera. When you were starting out, what kind of challenges did you face? How did you overcome them?

Joel: One of the biggest challenges I faced early on was that I was doing data science, but “data science” was not yet a thing. In 2008 the startup I was working at was acquired by Microsoft, and Microsoft told me that (in essence) I needed to become either a software engineer or a program manager so that they could stack rank me, because the proto-data-science that I was doing was valuable but did not fit into their career ladder. Obviously today it is more than possible to be a data scientist at Microsoft. I overcame that challenge by switching jobs a couple of times until I found somewhere that wanted me as a data scientist.

Sayak: That was dauntless and I really take that as an inspiration! What were some of the capstone projects you did during your formative years?

Joel: I used to do all sorts of stupid toy projects. I once scraped the “Seattle real-time 911” website and did a social network analysis of Seattle fire trucks. Very early on I built a terrible naive Bayes classifier to predict whether I would be interested in any given Hacker News story. (The people at Hacker News resented the idea that someone might not be interested in every article.) And another time I did a simple ML project to see if I could predict whether a t-shirt was a boy’s shirt or a girl’s shirt. (This would have been a good deep learning project, but I did it before deep learning was really popular or accessible.)

Sayak: I think those are not stupid at all! Coincidentally, I too happen to have worked on Hacker News stories. Coming to your best-selling book, Data Science from Scratch: First Principles with Python: I have the book and, personally, I have really liked how you put together so many important pieces, especially the order of the chapters. The chapter A Crash Course in Python is so neat! I am curious to know what motivated you to start writing the book. Would you like to share that?

Joel: There is a company I won’t name that indiscriminately asks people to write books. They asked me to write a book on a topic I wasn’t particularly interested in, but that got me thinking about what book I would like to write, and I came up with the idea for Data Science from Scratch. (The two biggest inspirations were probably my training in mathematics, where everything was done from first principles, and the Andrew Ng ml-class, from which I got the idea of using gradient descent as an organizing principle.) I pitched it to O’Reilly, we argued about it for many weeks, and then finally they agreed to publish it.

Sayak: Nice to know about those two major ingredients of the motivation behind the book! Data science and machine learning are rapidly evolving fields. How do you manage to keep track of the latest relevant happenings?

Joel: Mostly I hear about new stuff on Twitter. Also, I work on a library for NLP researchers, so whenever there are new relevant happenings in NLP (and to some degree in ML more broadly) people ask me when those things are going to make it into the library. So I also hear about them that way.

Sayak: Haha, that was very precise! As a practitioner, one thing I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?

Joel: The best way to learn a concept is to teach that concept. People sometimes ask me how I knew all the things that are in my book. The answer is that many of them I didn’t know that well when I started the book, but that having to explain them and cleanly implement them forced me to learn them really well. For many reasons I wouldn’t recommend that people just go out and write books, but it’s easy to write a blog post or give a talk at a meetup or at your work, and that can really force you to figure things out.

Sayak: I am in 100% agreement with this. Many of the articles I have written are a direct result of this philosophy. The way you present, the way your slides are prepared — everything is so super amazing. Would you like to share any tips and tricks to that end?

Joel: There is some alternative universe in which I went into stand-up comedy instead of data science, but in this one, I went into data science, so I use conference talks as an outlet for that side of my personality. That doesn’t mean that everyone should try to be a comedian, but I find talks more interesting when they have some personality to them. One other tip: if you put the rough text of your talk in the speaker notes, then your slides make a nice stand-alone representation of the talk and people who aren’t at the talk can get much more out of them.

Sayak: Yes! That can definitely be seen in all the slides you have made publicly available. Along with your work, you are engaged in a number of activities, public speaking being the most prevalent one. How do you manage your time so efficiently?

Joel: I don’t feel like I manage my time particularly efficiently. However, around 8 years ago I got rid of my TV, and that gave me an extra 2–3 hours a day to use for reading and learning and coding and Tweeting. (Eventually, I got another TV for my family to watch Netflix and so on, but I rarely watch it myself.) This made a huge difference for me, but I appreciate that not everyone would want to do this. So it’s less that I’m efficient and more that I have more time in which to be inefficient.

Sayak: I see! But if inefficiency looks like this, I think it’s acceptable, haha. Any advice for beginners?

Joel: Read other people’s code. Get other people to read your code. Learn (and adopt) software engineering best practices, like source control, unit testing, code review, and so on. Follow data science people on Twitter. Give talks. Keep learning.

Sayak: Thank you so much, Joel, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.

Joel: Thanks for having me.

Summary

I have been following Joel’s work for a while now, and it has been tremendously beneficial for me. I have probably watched all his publicly available videos on data science and machine learning. Joel brings an unmatched consistency of good software engineering and Pythonic practices to his codebases, and studying them has been a great learning experience for me. One thing I particularly like about Joel’s talks is how he picks under-covered topics (Reproducibility in ML, for example) and enlightens the world with them.

I hope you enjoyed reading this interview. Watch this space for the next one. I hope to see you soon.

If you want to know more about me, check out my website.

Written by Sayak Paul