Praveen D Chougale | Data Science, Machine Learning, R Programming, Python

Currently pursuing a PhD at one of the premier institutes of India (Indian Institute of Technology, Bombay). I have worked in industry as a data scientist for 9 years across data science, machine learning, and deep learning. I teach students using modern teaching tools and help them understand concepts thoroughly. Coming from a multidisciplinary background with industry experience, I guide students through real-world problems. I am skilled in teaching statistics, mathematics, data science, deep learning, R programming, Python programming, operations research, and management.

Subjects

  • PhD guidance

  • Statistical Analysis Beginner-Expert

  • Marketing management Beginner-Expert

  • Report preparation Beginner-Expert

  • Time Series Analysis Beginner-Expert

  • R programming language Beginner-Expert

  • Operations Research, Decision Analysis and Decision Making Beginner-Expert

  • Mathematical Modeling Beginner-Expert

  • Data Science and Machine Learning Beginner-Expert

  • Microsoft Excel Dashboard Beginner-Expert

  • Statistical Data Analysis Beginner-Expert

  • Statistical Quality Control Beginner-Expert

  • Deep Learning Projects in Python Beginner-Expert

  • Python for Data Science Beginner-Expert

  • Management & Economics Beginner-Expert

  • Descriptive Data Analysis Beginner-Expert

  • Excel (Advanced) Beginner-Expert

  • Power BI Dashboards Beginner-Expert

  • Statistical Modelling Beginner-Expert

  • Study material on Data Science / Machine Learning / Deep Learning Beginner-Expert


Experience

  • Consultant - Data Science (May 2022 - May 2023) at Trinity Lifesciences
    ▪ Modelling:
    Apply linear/non-linear models to the data to obtain the contribution of each promotional activity. The data must be in the appropriate format at each stage, and all negative and missing values must be imputed. Double-check the values for average Adstock and ROI, and verify that all graphs are properly populated and aligned.
    ▪ Simulations:
    Iterations are performed until a valid contribution is obtained for each promotion. These contributions are compared and validated against the previous year's contributions and business judgement. The first iteration uses the previous year's model lag. Each iteration consists of building the model, scoring it, and generating the contribution of each promotion; the iterations keep running until valid contributions are obtained for every promotion.
    ▪ HCP Segmentation:
    Use K-means clustering to group HCPs and understand their characteristics, then target HCPs based on their demographics.
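An illustrative sketch of the K-means segmentation step described above, assuming scikit-learn is available; the feature names and the synthetic data are hypothetical, not the actual client data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical HCP-level features: calls received, samples given, TRx volume
rng = np.random.default_rng(0)
hcp_features = rng.normal(size=(300, 3))

# Standardise so no single feature dominates the distance metric
scaled = StandardScaler().fit_transform(hcp_features)

# Cluster HCPs into segments and inspect each segment's average profile
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaled)
for label in range(4):
    segment = hcp_features[kmeans.labels_ == label]
    print(label, segment.shape[0], segment.mean(axis=0).round(2))
```

The per-segment means are what make the clusters interpretable for targeting: a segment with high call volume but low TRx, for example, would get a different promotion mix than a high-TRx segment.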
  • Manager - Data Science (Oct 2021 - May 2022) at Novartis Pharmaceuticals Pvt Ltd, Hyderabad
    To calculate the percent contribution of promotions to the sales of a particular brand, the ROI of promotions, and the optimization of different promotions/budgets.
    ▪ Collect brand sales, competitor sales, and promotional and costing details.
    ▪ Check the time period of each promotion against the relevant fields. Check for duplicate NOVIDs and remove them, and update missing NOVIDs where required. Take the product codes for business requirements and apply the required filters for promotion sub-types.
    ▪ Panel data structures: After refining the data and transforming it to HCP level for all promotions, we have two datasets: one with all HCP and NPP promotions at HCP_ID level, and another at DMA level.
    ▪ Trends:
    Trend graphs reveal patterns for TRx, calls, samples, emails, website visits, etc.
    ▪ ROI evaluation:
    Simulators are used to generate the ROI of each promotion. Each simulator covers 3 years, structured so that the promotion runs for one year and no promotion is given for the remaining two years. The simulator covers only three years because the promotion effect does not last longer than that.
    ▪ Response curves:
    Response curves are generated to calculate AROI and MROI at tier level for different levels of calls, i.e. different average numbers of P1E (generally considered from 1-45 across tier levels). These AROI and MROI values are used to generate the response curves for the different tiers.
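A minimal sketch of the kind of promotion-response calculation these entries describe, assuming a geometric adstock carry-over; the decay rate, the spend series, and the response coefficient are all hypothetical:

```python
import numpy as np

def adstock(spend, decay=0.5):
    """Geometric adstock: each period carries over a fraction of the previous effect."""
    effect = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for t, s in enumerate(spend):
        carry = s + decay * carry
        effect[t] = carry
    return effect

# Hypothetical weekly promotion spend
spend = np.array([100.0, 0.0, 0.0, 50.0, 0.0])
effect = adstock(spend, decay=0.5)
print(effect.tolist())  # [100.0, 50.0, 25.0, 62.5, 31.25]

# ROI once the adstocked effect is mapped to incremental sales
# (0.8 is a hypothetical response coefficient)
incremental_sales = 0.8 * effect
roi = incremental_sales.sum() / spend.sum()
print(round(roi, 3))  # 1.433
```

The same transformed effect series is what a simulator would zero out in years two and three to measure how long the promotion's contribution persists.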
  • Lead Data Scientist (Jul 2019 - Oct 2021) at Synechron Technologies Pvt Ltd
    Responsible for constantly interacting with stakeholders to understand the business need; data cleansing and preparation; conducting statistical tests and exploratory analysis; building the best credit-scoring model to understand the credit risk of customers; forecasting the liquidity ratio; and building a recommendation system for corporate bonds.
    Modelled customer credit risk with various credit-scoring methods such as survival analysis, logistic regression, linear discriminant analysis, Naive Bayes, decision trees, random forests, SVM, PGM, and neural networks, choosing the best model by validating performance metrics.
    Tools used - Python
    • Data extraction from multiple sources; data cleansing to merge acquisition, performance, and macro-economic variables.
    • Data preprocessing including missing-value treatment, dummy encoding of categorical features, and generating new features.
    • Statistical tests - checked for multicollinearity; a Granger causality test was performed to find the impact of the macro-economic variables on one of the credit events (default).
    • Feature extraction using entropy as a measure and using graphical networks.
    • Modelling the data using logistic regression, survival analysis, Naive Bayes, decision trees, random forests, neural networks, LDA, and probabilistic graphical models.
    • Logistic regression - finding the appropriate classification threshold by plotting false-positive and true-positive rates.
    • Survival analysis - fitting a Cox proportional hazards model. Status 0 indicates that the hazard event has not occurred and status 1 indicates that it has (the borrower defaulted); time represents the duration of survival of the borrower. A multivariate Cox regression model was fitted with macro-economic variables, loan characteristics, and equity as covariates.
    • Probabilistic graphical model - implemented a PGM over all the variables and assigned conditional probabilities at each node. Using the Bayesian network, the user can query (reason about) any combination of levels of independent or dependent variables with respect to other variable levels.
    • Validating the models by K-fold cross-validation.
    • Evaluating the models by confusion matrix, ROC curve, and Gini index; data envelopment analysis was also implemented to choose the best-fit model.
    Built a product recommendation system using associative mining / collaborative filtering techniques to recommend corporate bonds.
    • Data - trade date, quantity, price, ticker, coupon, CUSIP, amount outstanding, maturity date, coupon structure, etc.
    • Data cleansing and exploration - missing-value treatment and imputation, conversion of continuous variables to discrete, cross tabulation, scatter plots.
    • Statistical tests to explore the data and find how the independent variables impact the dependent variable.
    • Recommendation system - an associative mining / collaborative filtering model built to recommend corporate bonds to customers/clients.
    • Data visualization - heat maps of the rules generated for different combinations of support and confidence.
    • Graphical user interface - a GUI designed in R Shiny, giving the user the flexibility to choose different values of support and confidence, limit the number of rules generated, and filter other variables.
    • Stock market analysis - data collection, visualization of the data to understand patterns, beta calculation of individual stocks with respect to the market, and portfolio optimization.
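A toy, pure-Python sketch of the support/confidence calculation at the heart of the associative-mining recommendation step above; the bond names, the transactions, and the 0.5 minimum-support threshold are hypothetical:

```python
from itertools import combinations

# Toy transactions: sets of bonds traded together by hypothetical clients
transactions = [
    {"bondA", "bondB"},
    {"bondA", "bondB", "bondC"},
    {"bondB", "bondC"},
    {"bondA", "bondC"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Rules X -> Y scored by confidence = support(X and Y) / support(X)
items = sorted(set().union(*transactions))
for x, y in combinations(items, 2):
    s = support({x, y})
    if s >= 0.5:  # minimum-support threshold
        conf = s / support({x})
        print(f"{x} -> {y}: support={s:.2f}, confidence={conf:.2f}")
```

Sweeping the support and confidence thresholds over a grid is what produces the rule heat maps mentioned above.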
  • Consultant - Data Science (Apr 2019 - Jul 2019) at Ernst & Young Pvt Ltd
    Responsible for constantly interacting with stakeholders to understand the business need; data cleansing and preparation; conducting statistical tests and exploratory analysis; and building the best forecasting model to predict demand for the different products at FU level.
    • Data cleansing and preparation to remove unwanted columns and map the data at the FU (forecasting unit), APG (account planning group), or category level.
    • Extracting the important features impacting sales volume using PCA and random forests.
    • Developing an algorithm to model with both univariate (ARIMA, decomposition model) and multivariate (neural network, random forest, linear model, XGBoost, recurrent neural network) forecasting models.
    • Model validation to fit the best model using AIC, BIC, MAPE, MSE, and RMSE.
    • Using K-means clustering to segment the products into different groups, which in turn helps with advertising and promotional budgeting.
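A hedged sketch of the model-validation step described above: two simple baseline forecasts scored on a hold-out window with MAPE and RMSE. The demand series is synthetic, and real work would compare the full model set rather than these two baselines:

```python
import numpy as np

# Synthetic monthly demand series with an upward trend
series = np.array([100, 105, 112, 118, 123, 131, 138, 142, 150, 156, 163, 170], dtype=float)
train, test = series[:9], series[9:]

def mape(actual, forecast):
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Baseline 1: naive forecast repeats the last training value
naive = np.repeat(train[-1], len(test))

# Baseline 2: linear trend fitted on the training window
t = np.arange(len(train))
slope, intercept = np.polyfit(t, train, 1)
trend = intercept + slope * np.arange(len(train), len(series))

for name, fc in [("naive", naive), ("trend", trend)]:
    print(name, round(mape(test, fc), 2), round(rmse(test, fc), 2))
```

On this trending series the linear-trend baseline beats the naive one on both metrics, which is exactly the kind of comparison that drives the "fit the best model" decision.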
  • Lead Data Analyst (Dec 2015 - Apr 2019) at Accenture Solutions Pvt Ltd
    Responsible for conducting statistical, mathematical, and data-modelling exercises to forecast call, email, financial trade, and sales volumes. The forecasts are used for business strategy, effective management, and efficient allocation of resources, supporting sustainable business development and maximizing revenue while minimizing costs.
    Responsibilities include:
    ▪ Identifying the most important variables impacting CSAT and analyzing their impact on the CSAT score using a CART model.
    ▪ Text analysis (NLP) to understand customers' problems (by observing the word cloud / frequency plot).
    ▪ Implementing probabilistic graphical models to analyze and study the interaction between the independent and dependent variables (CSAT).
    ▪ Segmenting/clustering agents as high/low/average performers using K-means clustering and understanding their characteristics.
    ▪ Speech-recognition algorithms (hidden Markov models) to convert the speech signal to text, preprocess the text, and analyze customer feedback (customer voice) using word clouds and sentiment frequency plots (using NRC emotions).
    ▪ Principal component analysis (PCA) to identify the important variables impacting the CSAT/DSAT score.
    ▪ Building and maintaining forecasting models (ARIMA, ARMA, neural networks, decomposition models, polynomial regression) and analyzing their performance (AIC, BIC, MSE, RMSE, MAPE) for various lines of business. The forecasting models are implemented to forecast at interval level (15 min, 30 min) and at weekly, monthly, or yearly level based on the business need.
  • Executive (Dec 2014 - Dec 2015) at Borderless Access
    Baselined forecasting and scheduling accuracy using group mean / control charts for all lines of business.
    • Fulfilled the data-analysis plan, including data tabulation and any required statistical analysis.
    • Took up data-mining operations to find answers to marketing issues.
    • Initiated marketing strategies and coordinated actions to influence market segmentation in vendor management.
    • Monitored competitors' offerings and prices and built a competitive strategy for building panels.
    • Questionnaire analysis of consumer feedback to understand needs/preferences (using PCA).
    • Predicted the life span and performed churn analysis of panelists (consumers who take surveys).
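A minimal sketch of baselining accuracy with a control chart, as in the first responsibility above: limits at the mean plus or minus three standard deviations are computed from a baseline window and new weeks are flagged against them. All the accuracy figures here are hypothetical:

```python
import numpy as np

# Hypothetical weekly forecast-accuracy percentages: a stable baseline period...
baseline = np.array([92.0, 94.5, 91.0, 93.2, 95.1, 90.8, 93.7, 94.2, 92.6, 93.0])
# ...and new weeks to monitor against the baselined limits
monitored = np.array([93.5, 79.0, 94.0])

centre = baseline.mean()
sigma = baseline.std(ddof=1)
ucl, lcl = centre + 3 * sigma, centre - 3 * sigma  # upper/lower control limits

flags = (monitored < lcl) | (monitored > ucl)
print(round(lcl, 2), round(ucl, 2))
print(flags.tolist())  # [False, True, False]
```

Computing the limits from a separate baseline window, rather than from the monitored points themselves, is what keeps a genuine outlier (the 79% week) outside the limits instead of inflating them.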

Education

  • Ph.D in Digital Health Tech (Dec 2022 - present) from IIT Bombay, Powai
  • M.Tech in CSE - Machine Learning & Intelligent Systems (Oct 2017 - Aug 2019); scored 8.44/10
  • Master of Business Administration (Oct 2012 - Jun 2014) from VTU, Belgaum, KA; scored 68.66
  • Bachelor of Engineering (Sep 2009 - Jun 2012) from BVB College of Engineering and Technology, Hubballi; scored 7.73/10
  • Diploma in Electronics & Communications (Jun 2006 - Jul 2009) from Motichand Lengade Bharatesh Polytechnic; scored 76.68

Fee details

    1,000 - 5,000/hour (US$11.89 - 59.46/hour)

    Fees may vary based on the kind of assignment I get


Reviews

No reviews yet. Be the first one to review this tutor.