An interview with Dat Tran, Head of AI at Axel Springer AI
Let’s welcome Dat Tran for today’s interview. Dat is currently with Axel Springer AI, where he heads all the AI initiatives. Axel Springer AI is the artificial intelligence unit of Axel Springer SE, the largest digital publishing house in Europe. Recently, Dat’s team released a post on Headliner, an NLP tool built on state-of-the-art (SoTA) techniques. Previously, Dat headed the Data Science team at idealo.de.
Dat is an active community contributor as well. He regularly blogs about his work on Medium and is an active public speaker, having delivered talks at PyData, ODSC, Data Hack Summit, and so on. His community presence and contributions have earned him recognition as a Google Developers Expert in Machine Learning. You can follow Dat on LinkedIn and Twitter for a balanced dose of news about the work he is doing, novel AI research, and more.
I would like to wholeheartedly thank Dat for taking the time to do this interview. I hope this interview serves a purpose towards the betterment of data science and machine learning communities in general :)
Sayak: Hi Dat! Thank you for doing this interview. It’s a pleasure to have you here today.
Dat: Hi Sayak, thanks for having me here.
Sayak: Maybe you could start by introducing yourself — what is your current job and what are your responsibilities over there?
Dat: I’m the Head of AI at Axel Springer AI which I started a few months ago. We’re the AI unit of Axel Springer SE, the largest digital publishing house in Europe. Our mission is to make AI accessible to everyone within the company and hence drive innovation. We mainly use deep learning where we not only advise our other units on what’s possible and what’s not but also help them to implement it. Other than that, we also do research but this is not our top priority at the moment. In the long run, though, we want to establish a research mindset at Axel Springer which is quite new here.
Concerning my responsibilities: since I started this unit by myself, my role was different at the beginning. I laid out the strategy, research direction, vision, and mission, and was also involved in hiring people. Now that the first hires are here, I have a lot of managerial tasks. My day-to-day job is to ensure that we have everything we need to get our work done. This ranges from securing internal and external funds to ordering hardware. I’m also doing a lot of evangelism work for my team, which is very important since we are a new unit.
Sayak: Many hats to wear, Dat. They are indeed challenging but at the same time must be fun too. I am interested to know how you became interested in data science and machine learning.
Dat: Long story, but originally I was in investment banking and got tired of it. Then I went to grad school and got interested in Operations Research (OR). OR is a subfield of applied mathematics and involves a lot of optimization. In fact, OR people used to work with data to make informed decisions before data science and machine learning even became super trendy. Nonetheless, I got hooked on machine learning thanks to a good friend of mine who studied Statistics at that time. He recommended that I take the Machine Learning course on Coursera by Andrew Ng. I immediately fell in love with this course because I had done similar things in OR already, but now I had a direction where I could work after graduating.
Sayak: Ah, this is nice! I first heard of Operations Research during my undergraduate days. I even studied a few algorithms from there, including the max-flow, min-cut network operations. I loved the subject, to be honest. When you were starting in the field, what kind of challenges did you face? How did you overcome them?
Dat: When I started my career, data science and machine learning had not quite arrived yet in Germany. At that time, everyone was into big data and how they could use Hadoop or Apache Spark to create some meaningful insights. In general, it was more data engineering work. Due to my academic background, though, I was more interested in data science and machine learning than in data engineering, so I had to convince my managers at that time to focus on this instead. Over time, data science and machine learning became more popular and my bet paid off.
Sayak: This indeed takes guts. Believing in your own instincts intelligently and abiding by them is a really rare quality. What were some of the projects you feel proud to have been associated with?
Dat: That’s a tough question. There are many projects that I feel proud of. But if I had to choose one, I would pick one of the projects at Idealo. Most notably, we used deep learning to automatically rank millions of hotel images according to their aesthetic and technical quality. The best thing about this project is that it’s also used in production. Usually, in many machine learning projects, not everything goes into production because the model might not perform well or there’s so much technical debt that it might take months or years until it is live. However, we managed to do it within months, which I’m really proud of. We also had the chance to write a blog article about this project on the NVIDIA Developer Blog, so if you’re interested you can check it out there. We also open-sourced the code on GitHub. Of course, we didn’t release our best model.
Sayak: Thank you for sharing that, Dat. I did check out this project when you shared it on LinkedIn, probably a year ago. Very interesting, indeed! You bring so much expertise not only from the data science and machine learning domains but also from different software engineering domains. How do you manage to do it so efficiently?
Dat: Well, I used to work at Pivotal and they are really famous for their software engineering practices. I learned most of my engineering skills there, like test-driven development, pair programming and many more. So it’s quite natural that I combined what I learned and applied it to other domains like data science and machine learning. As a matter of fact, you need good software engineering skills in machine learning these days as well. Usually, when publishing a paper, it’s common to release code too. And when you want to operationalize your ML model, you also need good software engineering skills because at the end of the day we’re writing code.
Sayak: I concur with you here. Nothing can beat a well-written piece of software. These fields like machine learning are rapidly evolving. How do you manage to keep track of the latest relevant happenings?
Dat: I do many things. First, I browse social media like Twitter, LinkedIn, etc. But then secondly, I also go to conferences and meetups. Other than that, my team is also a source of knowledge. All of us have different interests and we often post interesting papers, articles and so on in our Slack channel.
Sayak: Our methodologies match here. For me, Twitter is the biggest source of new ML discoveries, I would say. Being a practitioner, one thing that I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?
Dat: Well, I see it as a divide and conquer problem. First I divide the “bigger” problem into smaller problems and then from there I figure out what I need to learn in order to conquer the bigger problem. For example, if I’m a newbie in machine learning, I really need to make a proper plan to learn it. You cannot jump straight into machine learning; you need to master the basics first, like linear algebra, calculus, and probability theory. From there you can go on to more advanced topics like supervised learning, unsupervised learning and then also reinforcement learning.
Sayak: This sounds like a good plan. Divide and conquer any day here over backtracking :) Any plans for authoring a book in the coming future?
Dat: Maybe someday if I have more time.
Sayak: Any advice for the beginners?
Dat: Most beginners are quite impatient at the beginning of their journey. They want to know everything and sometimes get a job in this field immediately, but you have to know that Rome wasn’t built in a day. It takes a lot of time before you really get the gist of data science and machine learning, especially since our field is moving so fast. So be more patient. Another piece of advice I can give is to get your hands dirty. I really value people who have contributed to open-source projects or participated in Kaggle challenges. That’s not only a good way to practice your ML skills but can also be used to showcase your skills to hiring managers.
Sayak: Thank you so much, Dat, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.
Dat: Thanks, Sayak, for the interview. I really enjoyed it. I hope that my advice and insights can help a little bit, especially those who are at the beginning of their journey.
I hope you enjoyed reading this interview. Watch this space for the next one and I hope to see you soon. This is where you can find all the interviews done so far.
If you want to know more about me, check out my website.