From Scratch to Top
Let's go back in time to my first industry experience. It was the summer of 2019, and I was looking for an internship to engage myself in something constructive. I spoke to a lot of people and sent out my resume on the fly. One of my connections reached out to me about an internship opportunity, so I sent my resume and was called in for an interview.
The Interview
The interview process was rigorous: a telephone interview followed by a face-to-face interview. In the first interview itself, I learned that they were looking for someone well versed in Python and machine learning algorithms. I realized the position was up for grabs, so I brushed up on my concepts and nailed the interview. I then got a call from the HR department for the next round. All went well, and I got an internship at EY, one of the Big Four professional services firms.
The Overview, Ernst and Young
On 15th July, I stepped into my first workplace. It was an amazing experience, and the ambiance and environment were great. The best part about the place was the people, who were helpful and professional. In the first week, they tested my coding skills with some problems and basic website design using Flask. Later, I was asked to work on a two-month data science competition titled EY tax guide Recommender System. At that time, I had no idea about the concepts behind recommender systems and natural language processing. I had to start afresh, so I learned and gathered whatever I could about the prerequisites of the project. My senior told me, and I quote, "If you use a built-in or pre-trained model then you have achieved nothing". So it was clear that I had to build everything from scratch.
My Approach
The task was to develop a query response system for the EY tax guide. The tax guide was parsed manually and split into smaller sections, and we had to match each query to the section that answered it. It was clear to me that I would use the "bag of words" approach to represent my data and eventually generate responses. On top of the bag of words, I used the Term Frequency-Inverse Document Frequency (TF-IDF) technique from information retrieval to rank the sections. The accuracy was too low for my satisfaction, although I ranked 70th among 1000 entries. I then started fine-tuning and ruled out other methods such as the probabilistic approach and frequency-based classification techniques. I read more and more research papers and implemented them from scratch. Eventually, I found Okapi BM25+, a refinement of the TF-IDF approach I had been using that gave better accuracy, and with this algorithm I managed to get into the top 20.
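To make the two scoring schemes above concrete, here is a minimal from-scratch sketch of how TF-IDF and BM25+ could rank sections for a query. It is not the code I used in the competition. The function names, the whitespace tokenizer, the IDF form `log((N + 1) / df)`, and the parameter values `k1=1.2`, `b=0.75`, `delta=1.0` are all assumptions chosen for illustration, and the three sample sections are made up.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or stop words.
    return text.lower().split()

def build_index(sections):
    # Bag of words per section, section lengths, and document frequencies.
    docs = [Counter(tokenize(s)) for s in sections]
    lengths = [sum(d.values()) for d in docs]
    df = Counter()
    for d in docs:
        df.update(d.keys())
    return docs, lengths, df

def tfidf_score(query, doc, df, n_docs):
    # Sum of tf * idf over the query terms that occur in the section.
    score = 0.0
    for term in tokenize(query):
        if term in doc:
            score += doc[term] * math.log(n_docs / df[term])
    return score

def bm25_plus_score(query, doc, doc_len, avg_len, df, n_docs,
                    k1=1.2, b=0.75, delta=1.0):
    # BM25 term saturation and length normalization, plus the lower-bound
    # constant delta that BM25+ adds for every matching term.
    score = 0.0
    for term in tokenize(query):
        tf = doc[term]
        if tf == 0:
            continue
        idf = math.log((n_docs + 1) / df[term])  # assumed IDF variant
        saturated = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
        score += idf * (saturated + delta)
    return score

def rank_sections(query, sections, method="bm25+"):
    # Return section indices ordered from best to worst match.
    docs, lengths, df = build_index(sections)
    n_docs = len(sections)
    avg_len = sum(lengths) / n_docs
    scores = []
    for doc, doc_len in zip(docs, lengths):
        if method == "tfidf":
            scores.append(tfidf_score(query, doc, df, n_docs))
        else:
            scores.append(bm25_plus_score(query, doc, doc_len, avg_len, df, n_docs))
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

# Made-up sample sections, only to show the call.
sections = [
    "income tax rates for resident individuals",
    "corporate tax rates and filing deadlines",
    "value added tax registration rules",
]
print(rank_sections("corporate filing deadlines", sections))
```

The difference between the two shows up mostly on long sections: plain TF-IDF keeps rewarding repeated terms, while BM25+ saturates term frequency, normalizes by section length, and still guarantees each matching term a minimum contribution through `delta`.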
My Innovation
There were still 2 days to go, and I was thinking about what else I could do to improve my results. I decided to combine the bag of words and vector approaches using n-grams, which are runs of n consecutive tokens from a sentence treated as a single entity. My presumption was that a query would not contain any relevant pentagrams (5-grams). I added weights to the n-grams of the initial dataset and generated responses. This approach was really effective: it got me into the top 7 and made mine the top entry from the Asia-Pacific zone.
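The weighted n-gram idea can be sketched roughly as follows. This is an illustration under stated assumptions, not my competition code: the helper names are hypothetical, n is capped at 4 in line with the presumption about pentagrams, and the weights (longer n-grams count more) are invented for the example.

```python
from collections import Counter

# Hypothetical weights: n-gram length -> weight. Capped at 4-grams.
NGRAM_WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}

def ngrams(tokens, n):
    # Every run of n consecutive tokens, joined into a single entity.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def weighted_ngram_counts(text, weights=NGRAM_WEIGHTS):
    # Weighted bag of n-grams for one piece of text.
    tokens = text.lower().split()
    counts = Counter()
    for n, weight in weights.items():
        for gram in ngrams(tokens, n):
            counts[gram] += weight
    return counts

def overlap_score(query, section, weights=NGRAM_WEIGHTS):
    # Add up the section's weighted counts for every n-gram the query shares.
    query_grams = weighted_ngram_counts(query, weights)
    section_grams = weighted_ngram_counts(section, weights)
    return sum(section_grams[g] for g in query_grams if g in section_grams)

# Same words, different order: the section that preserves the phrase wins.
print(overlap_score("tax rates", "corporate tax rates"))
print(overlap_score("tax rates", "rates of corporate tax"))
```

Because a shared bigram or trigram carries more weight than the same words scattered across a section, sections that preserve the wording of the query rise in the ranking. In practice these weighted counts could replace the plain term counts fed into a TF-IDF or BM25+ style scorer.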
Final Thoughts
My performance in the competition gave my team the chance to earn the EY silver badge for Artificial Intelligence. Although, being an intern, I was not eligible for the badge myself, my colleagues succeeded in earning it. Overall, this internship and this competition were a great learning experience. I met some wonderful people whom I would like to mention:
Mr. Deepak Garg, Senior Manager
Mr. Saurabh Agarwal, Senior Consultant
Mr. Vinay Joshi
Mr. Ankit Agarwal
Mr. Somil Sharma
Mr. Karan Gambhir, Senior Manager
and thank them for their constant support during my time at EY. In the end, I would say it was a great place to start my long journey.