Technical skills include Spark, HDFS, MapReduce, and R
Skills include descriptive and predictive statistical/ML techniques such as linear regression, logistic regression, k-nearest neighbor classification, linear discriminant analysis, support vector machines (MMC, SVC, SVM), k-means and hierarchical clustering, dimension reduction techniques (factor analysis and principal component analysis), and decision trees
Experienced in applying data preprocessing techniques such as exploratory data analysis (EDA), missing-value imputation, preparation of training and test data for model building, k-fold cross-validation, and predictive modeling
Strong understanding of inferential statistical techniques, including parametric and nonparametric tests (one-sample, two-sample, and paired-sample t-tests, ANOVA, HOV); skilled in data replication, ETL, databases, and reporting