Advanced Analytics
Advanced Analytics is a generic term for advanced statistical or mathematical approaches. "Advanced" has several facets:
- complex statistical background
- sophisticated modelling and interpretation of results
- very large amounts of data (so large that statistical significance tests can lose their informative value)
- complexity of ETL processes (possibly including the cleansing of dirty data)
We analyze data for questions from every discipline: at every scale level, with nearly every statistical-mathematical method, and with data tables of all sizes, from thousands of variables and billions of rows down to datasets with just a few dozen values (provided Advanced Analytics approaches still make sense there). It is important that you get an idea of the added value Advanced Analytics can provide. Let us convince you. That is why we call these first explorations "appetizer analyses". A selection of methods follows ...
Modelling of association and correlation
- interval: e.g. Pearson's r
- ordinal: e.g. Spearman's rho, Gamma, Kendall's tau-b, Somers' D, Stuart's tau-c.
- association: see the section Table Analysis below.
- consistency: Agreement, concordance, reliability (e.g. Kappa, Cronbach's Alpha)
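To give a flavour of the measures above, here is a minimal pure-Python sketch of Pearson's r and Spearman's rho (rank correlation). This is an illustration only; in practice such measures come from SAS PROC CORR or comparable software. The function names and example data are our own.

```python
# Illustrative only: Pearson's r and Spearman's rho from scratch.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average 1-based ranks, with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rho = Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Spearman's rho is simply Pearson's r computed on ranks, which is why it tolerates monotone but nonlinear relationships.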
Causal modelling: Analysing cause and effect
- Regression models: e.g. Linear, Multiple, Loglinear, Nonlinear, Nonparametric, Ridge, Robust, Binary Logistic Regression, Ordinal Regression, Multinomial Logistic Regression, Hedonic Regression, Quantile, Logit, Probit, Tobit, Categorical incl. Elastic Net, WLS, 2SLS, Partial Least Squares; Generalized Linear Models; Time-to-Event analysis / survival analysis (Kaplan-Meier, Proportional Hazards) etc.; Multilevel Analysis/Regression etc. Also Mixed Models and other special variants.
- Special regression phenomena: e.g. Over-/Underfitting, Regression to the Mean, Mediation Effects, Regression Trap, Outlier Analysis, etc.
- Testing of requirements: e.g. Multicollinearity, autocorrelation, homogeneity of variance etc.
- Time Series Models: e.g. Transfer-Function Models incl. Granger-Causality Tests for economic time series data etc.
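The common building block behind most of the regression models listed above is ordinary least squares. As a hedged sketch (our own example data and function name; real projects would use PROC REG / PROC GLM or similar), here is the single-predictor case from first principles:

```python
# Illustrative only: simple OLS regression, y = intercept + slope * x.
def ols_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx          # covariance over variance of x
    intercept = my - slope * mx  # line passes through the mean point
    return intercept, slope

x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 7.1, 8.9]
b0, b1 = ols_fit(x, y)
print(f"y = {b0:.2f} + {b1:.2f} * x")
```

The multiple, ridge, robust and generalized variants above all modify either the loss being minimized or the distributional assumptions of this basic setup.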
Considering the effect of time:
- Regression models (see above)
- Time-to-Event models: e.g. Kaplan-Meier, Proportional Hazards, Actuarial Method etc.
- Time Series Analysis (particularly econometric): e.g. stochastic time series models (ARIMA, Holt-Winters, exponential smoothing), also for highly volatile data etc.
- Modelling change over time: e.g. Repeated Measurements, Random Coefficients Approach, Associated t-tests, Cochran–Armitage Test for Trend etc.
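Simple exponential smoothing is the most basic member of the stochastic time-series family named above (ARIMA, Holt-Winters). A minimal sketch, with our own example series, assuming a fixed smoothing constant alpha:

```python
# Illustrative only: simple exponential smoothing of a series.
def ses(series, alpha):
    """Smoothed levels; alpha in (0, 1], higher = reacts faster to
    recent observations. The last level is the one-step-ahead forecast."""
    level = series[0]
    out = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        out.append(level)
    return out

data = [10, 12, 11, 15, 14, 16]
smoothed = ses(data, alpha=0.5)
forecast = smoothed[-1]  # one-step-ahead forecast = last smoothed level
```

Holt-Winters extends exactly this recursion with separate trend and seasonal components.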
Analysis of Surveys, Experiments and Designs: Focussing on differences, causes and effects:
- Parametric Approaches: e.g. t-tests, ANOVA for balanced designs; GLM for unbalanced and other designs, analysis of covariance, split-plot analysis; repeated measures analysis, multivariate analysis of variance; variance components modelling; modelling of covariance structure, random coefficient approaches (PROC MIXED). Analysis of variance for data from an experiment with lattice design (PROC LATTICE). Analysis of variance for nested random models (PROC NESTED). Special applications (PROC ORTHOREG, PROC TRANSREG). Multiple testing and comparisons. Modelling of Fixed and Random effects (PROC MIXED) e.g. for Multi-Level-Modelling. Tests of equivalence (TOST by Schuirmann) etc. Data Mining Approach: Automatic Linear Modelling.
- Nonparametric Approaches: e.g. Tests for location and scale differences: Wilcoxon-Mann-Whitney, Median, Van der Waerden (normal), Savage, Siegel-Tukey, Ansari-Bradley, Klotz, Mood, Conover.
- Design of Experiments (DoE): e.g. Optimal Designs (e.g. PROCs OPTEX, FACTEX), Statistical Power, Effect Sizes, Sample Sizes (e.g. PROC POWER) etc.
- Special statistical Techniques: e.g. Sampling, Matching, Simulation, Fitting, Jack-Knifing (Resampling) etc.
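To illustrate the jack-knife (leave-one-out resampling) mentioned above: the statistic is recomputed on each leave-one-out subsample, and the spread of those recomputations estimates the statistic's standard error. A hedged sketch with our own example data; for the mean this reproduces the familiar s/sqrt(n):

```python
# Illustrative only: jackknife standard error of a statistic.
from math import sqrt

def jackknife_se(data, stat):
    """Leave-one-out (jackknife) standard error of stat(data)."""
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]  # n re-estimates
    mean_loo = sum(loo) / n
    var = (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
    return sqrt(var)

data = [4, 8, 6, 5, 3, 7]
mean = lambda d: sum(d) / len(d)
se = jackknife_se(data, mean)
```

Unlike the bootstrap, the jackknife is deterministic: it always uses exactly the n leave-one-out subsamples.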
Predictive Analytics (Forecasting): Modelling future events
- Multivariate Approaches: e.g. Decision Trees (CHAID, CART, C4.5/C5 etc.), Neural Networks, Multilayer Perceptron, k-nearest neighbours (KNN), Discriminant Analysis, Binning etc.
- Mathematical Approaches: e.g. Modelling of Credit Risk: LGD, EAD and PD (component model) in the Basel II context. Modelling of Probability to Change (e.g. percentage approach in insurance). ROC/AUC Approach / Confusion Matrix (sensitivity, specificity, accuracy, depth, lift).
- Time Series Analysis: e.g. with/without trend, with/without seasonal effects, calendar effects etc.
- Regression models (see above).
- Time-to-Event models (see above). In applied statistics, a combination of approaches is usually used.
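The confusion-matrix measures named above (sensitivity, specificity, accuracy) all derive from four cell counts: true/false positives and true/false negatives. A short sketch with hypothetical counts of our own:

```python
# Illustrative only: classification metrics from a 2x2 confusion matrix.
def confusion_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and accuracy from cell counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

sens, spec, acc = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
```

An ROC curve is obtained by sweeping the classification cutoff and plotting sensitivity against 1 - specificity; the AUC summarizes that curve in a single number.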
Latent Modelling:
- Classes: e.g. Latent Class Analysis (PROC LCA, stand-alone SAS procedure).
- Factors: e.g. Factor Analysis (PFA, ML, Alpha, Image, ULS, GLS, Wong's etc.).
- Paths: e.g. Path Analysis, Structural Equation Modelling, LISREL.
Clustering and Segmentation:
- Basic: Conditional approaches, random-based approaches, RFM analysis.
- Mathematical: Cluster Analysis (Hierarchical, k-means, Two-Step), Conjoint Analysis, Correspondence Analysis, Multi-Dimensional Scaling (MDS/MDA).
- Data Mining: Neural Networks, Multilayer Perceptron, k-nearest neighbours (KNN), Discriminant Analysis, Kohonen, Binning etc.
- Text Mining: Text Mining by SPSS Modeler; Visual Analytics by IBM COGNOS (a/k/a 'Many Eyes'); Analysis of unstructured texts using Word Trees, Tag Clouds, Phrase Nets, and HISTORIO.
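As an appetizer for the cluster analysis mentioned above, here is a minimal sketch of Lloyd's algorithm for k-means on 2-D points. This is an illustration with invented data; production work would use SPSS TwoStep, SAS PROC FASTCLUS or comparable tools.

```python
# Illustrative only: k-means clustering via Lloyd's algorithm.
import random

def kmeans(points, k, iters=20, seed=0):
    """Return a cluster label for each 2-D point."""
    rng = random.Random(seed)
    centers = list(rng.sample(points, k))  # initial centres: k random points
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest centre
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2
                            + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # update step: each centre moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels

pts = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
labels = kmeans(pts, k=2)  # two well-separated groups are recovered
```

Hierarchical and TwoStep clustering differ mainly in how cluster membership is built up, but they optimize similar within-cluster homogeneity criteria.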
Special Topics, e.g. with SAS:
- Network Visualisation and Analysis.
- Statistical Matching (random-based matching incl. fuzzy factor). Also by criteria-based parallelization and propensity scores.
- Random Sampling: (e.g. Unrestricted / Simple, PROC SURVEYSELECT).
- GIS visualization with SAS: e.g. map visualization by GfK GeoMarketing Map Data Sets.
- Geo-Analytics: Computing of distances in 2d / 3d space.
- Iterative Proportional Fitting (Small Area Estimation): e.g. extrapolation for census data.
- Bootstrapping.
- Weighting and Weighted Analysis.
- Honest assessment of models and scoring of data sets.
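To illustrate the Bootstrapping entry above: the statistic is recomputed on many resamples drawn with replacement, and the percentiles of those recomputations give a confidence interval. A hedged sketch with our own example data and parameter choices:

```python
# Illustrative only: percentile bootstrap confidence interval for the mean.
import random

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap (1 - alpha) confidence interval for stat(data)."""
    rng = random.Random(seed)
    # resample with replacement, recompute the statistic each time
    reps = sorted(stat([rng.choice(data) for _ in data])
                  for _ in range(n_boot))
    lo = reps[int(n_boot * alpha / 2)]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

data = [4.1, 5.0, 6.2, 5.5, 4.8, 5.9, 5.1, 4.4]
lo, hi = bootstrap_ci(data, lambda d: sum(d) / len(d))
```

The same resampling loop works for statistics with no closed-form standard error, such as a median or a trimmed mean, by swapping in a different stat function.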
Table Analysis
- Measures: interval: Pearson's r; ordinal: Spearman's rho, Gamma, Kendall's tau-b, Somers' D, Stuart's tau-c; nominal: Cramer's V, Contingency Coefficient, Phi Coefficient, Lambda, Uncertainty Coefficient; Simple Kappa Coefficient, Overall Kappa Coefficient, Cochran's Q, Binomial Proportion, Odds Ratio, Polychoric / Tetrachoric Correlation.
- Tests: McNemar's Test, Test of Symmetry, Test for Equal Kappa Coefficients, Chi-Square, Likelihood Ratio Chi-Square, Mantel-Haenszel Chi-Square, Fisher's Exact Test, Jonckheere-Terpstra Test, Cochran-Armitage Trend Test.
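Two of the measures above, the Pearson chi-square statistic and the odds ratio, can be sketched for the 2x2 case in a few lines (our own cell counts; real table analysis would use PROC FREQ, which also supplies p-values and exact tests):

```python
# Illustrative only: chi-square statistic and odds ratio for a 2x2 table.
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # expected count in each cell = row total * column total / grand total
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d) / (b*c) for a 2x2 table."""
    return (a * d) / (b * c)

chi2 = chi_square_2x2(30, 10, 15, 25)
orr = odds_ratio(30, 10, 15, 25)
```

With small expected cell counts the chi-square approximation becomes unreliable, which is when Fisher's Exact Test from the list above is preferred.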
Other Techniques, Methodologies, and Terminology:
- Evaluation e.g. DeGEval / SEval.
- (Non)linear Modelling
- Probability Calculations and many more...
We also offer expertise in:
- Data Mining
- Six Sigma (DMAIC, FMEA, VOC/VOP)
- Research Methods
- Competitive Intelligence, as well as
- Business Intelligence.
- We also develop special applications, procedures and measures for your particular requirements. These project-specific procedures are tailored to your project (products, processes). Contact us.
Method Consult prefers to work with SAS 9.4, Enterprise Guide 6.1, Enterprise Miner 12.1, and SPSS 22 for statistical analysis, besides other applications. Analyses with MS Excel are not recommended (e.g., McCullough & Wilson, 2002, 1999): "McCullough and Wilson (1999) examined Microsoft Excel 97 and concluded 'Persons desiring to conduct statistical analyses of data are advised not to use Excel'. An examination of Excel 2000 and Excel XP provides no reason to revise that conclusion" (McCullough & Wilson, 2002, 717).