
ABOUT TANMAY DHYANI
Profile Summary
Data Scientist with over 5 years of experience in data science, data visualization, and consulting, with the ability to translate business and functional requirements into substantial deliverables. Strong knowledge of designing, developing, and deploying highly adaptive machine learning models.
TECHNICAL SKILLS

DATA ANALYSIS
Python (Jupyter, AWS SageMaker), SQL, R, Excel, Dataiku

VISUALIZATION
Tableau, Power BI, QuickSight

CLOUD/DATABASE MANAGEMENT/DISTRIBUTED FRAMEWORK
AWS, Oracle, MySQL, SQL Server, Spark, Hadoop

PROGRAMMING
Python, Java, C, JavaScript, PL/SQL

AI/ML
Classification, Regression, Clustering, Time Series, Natural Language Processing/Search, Hypothesis Testing, Statistical Analysis, Optimization

CERTIFICATIONS

PROFESSIONAL HISTORY
DATA SCIENTIST, SYSTECH SOLUTIONS
February 2019 – Present
In this dynamic position, I engage with customers/prospects to analyze, design, develop, and implement advanced analytics solutions.
DATA ANALYST, LEARNING TECHNOLOGY SOLUTIONS, UIC
January 2018 – December 2018
As a Survey Data Analyst for the University of Illinois at Chicago, I identified and analyzed variation in student satisfaction and proposed solutions to improve teaching methodology.
SYSTEMS ANALYST, TATA CONSULTANCY SERVICES, INDIA
August 2015 – June 2017
I worked as a Systems Engineer with Tata Consultancy Services, a leading technology consulting firm, where I was responsible for handling and analyzing data sets to find relevant patterns and key insights that helped our clients make effective decisions. I drove various firm initiatives and worked both collaboratively and autonomously on business problems with extremely tight timelines.
INTERN, ZENSAR TECHNOLOGIES, INDIA
June 2014 – August 2014
At the end of my junior year, I was selected as an intern for the coveted ‘Employability Skills Development’ (ESD) Program at Zensar Technologies (Pune), which provides skills development training to fresh graduates, with a focus on enhancing the technical, functional, and soft skills needed to work effectively in the IT industry.

PROJECTS

NATURAL LANGUAGE SEARCH BAR FOR AMERICAN FOOTBALL
• Designed and developed an optimized data model to achieve robust natural language search (NLS) capability for American football
• Implemented a solution integrated with a semantic layer that captures intent for a varied range of questions, providing accurate and apt responses
• Executed a scalable strategy that limits end-user involvement and the time needed to get a response

ETA PREDICTIVE MODELING FOR INTERMODAL TRANSPORT NETWORKS
• Predicted arrival time for vehicles being transported using multiple means of transportation, enabling effective operational decision-making
• Enhanced process functionality to anticipate potential delays between transportation legs, improving prediction accuracy
• Reduced processing time by 70% through parallel execution of logical and data pipelines
• Estimated that arrival time delay would be reduced to approximately 8 hours once implemented in production

UNDERWRITING CHAT TOPIC MODELING
• Optimized operational efficiency, reduced resolution time, and improved customer experience by proactively training agents on key subjects/issues
• Developed a custom Python wrapper to create tokens using dependency parsing based on insurance context before feeding them to an LDA model

PROPENSITY MODELING FOR COMMUNICATION CHANNEL
• Built enterprise solution to predict customers/prospects’ propensity to respond to different communication channels
• Optimized effort/cost invested in reaching out to customers, resulting in a 14% reduction in total cost on a $70 million budget
• Deployed a real-time prediction API integrated with adaptive learning capabilities that retrain the machine learning model
• Created, revised, and executed quality control checks in the form of SQL queries to load client data into production databases


PATIENT READMISSION PREDICTION
• Built cloud-based solution to identify patients at higher risk of unplanned readmission and the factors driving each prediction
• Reduced readmission percentage by 11% in Q2 2019, helping prevent HRRP penalties and improving operational efficiency
• Performed rigorous hyperparameter tuning to boost model efficiency and integrated SHAP to explain feature importance at a patient level

STATISTICAL MODELING AND ANALYSIS OF OBSERVED DEATHS PER CAPITA FOR CANCER IN 2015
Executive Summary
This report summarizes the statistical modeling and analysis of observed deaths per capita for cancer in 2015 and several factors that could potentially affect these deaths. We focused on five lifestyle and environmental factors: air pollution levels per state, hospitals per capita, percent of smokers per state, percent of physically inactive people per state, and percent of obese people per state. Using statistical analysis and regression models, we assessed the significance of each factor for observed cancer deaths per capita in the United States. Our analysis found that smoking habits had the most significant relationship with cancer-related deaths in the United States in 2015. We suggest that the government take measures to lower the percentage of smokers across the states in order to reduce observed deaths due to cancer. We estimate that reducing the percentage of smokers by 1 percent across the US would reduce the number of observed deaths due to cancer by about 18,908. This would result in savings in cancer treatment costs of about $112,636,847 annually.

WEB SCRAPING AND ANALYSIS OF ONLINE COURSE DATA
Executive Summary
Increasing traffic is very important for websites. In this study, we focused on Lynda.com, a website that offers a number of online courses, to investigate how to increase the number of subscriptions to Lynda.com. We scraped data from the pages of software development courses using the Selenium package. Software development courses included web development, mobile apps, and database courses. Our dataset contained 785 observations, including 2 missing values. Based on our findings, we proposed that, to increase the number of views, the managers of Lynda.com should increase the number of courses offered by each instructor, keep course titles concise, and make course descriptions detailed.

PREDICTING CUSTOMER CHURN AT QWE INC.
After analyzing several random forest, decision tree, and logistic regression models, we found model_rose_prune (a ROSE-sampled, pruned decision tree with the optimal cp value) to be the best model, with the best sensitivity (recall) of 82% on both training and test data. The training data was also resampled using different techniques to balance the classes, and each model was optimized to find the best one. The variables that affect the churn rate are customer age (category), the customer happiness index, and the difference in views since last login. For instance, if the customer age is between 6 and 14, the probability of the customer leaving is high.



