AN INTER-MODELS DISTANCE FOR CLUSTERING UTILITY FUNCTIONS 1

Statistica Applicata Vol. 18, n. 3, 2006, p. 521

Elvira Romano, Carlo Lauro
Dipartimento di Matematica e Statistica, Università di Napoli Federico II, Complesso Universitario di Monte S. Angelo, Via Cinthia, I - 80126 Napoli - Italy
elvroman@unina.it; clauro@unina.it

Giuseppe Giordano
Dipartimento di Scienze Economiche e Statistiche, Università di Salerno, Via Ponte Don Melillo, 84084 Fisciano (Sa), Italy
ggiordan@unisa.it

Abstract

Conjoint Analysis is one of the most widely used techniques for assessing consumer behaviour. The method estimates partial utility coefficients through a statistical model that links the overall preference rating to the attribute levels describing the stimuli. Conjoint Analysis results are useful in new-product positioning and market segmentation. This paper proposes a cluster-based segmentation strategy built on a new metric. The distance is a convex linear combination of two Euclidean distances, embedding information on both the estimated parameters and the model fit. Market segments can then be defined according to the proximity of the part-worth coefficients and the explanatory power of the estimated models.

Key words: Multiattribute Preference Data, Conjoint Analysis, Cluster Analysis, Market Segmentation.

1 This paper was financially supported by MIUR grants: "Multivariate Statistical and Visualization Methods to Analyze, to Summarize, to Evaluate Performance Indicators", coordinated by prof. M.R. D'Esposito, and "Models for Designing and Measuring Customer Satisfaction", coordinated by prof. C.N. Lauro.

1. INTRODUCTION

2. THE DATA STRUCTURE AND THE INTER-MODELS DISTANCE

Tab. 1: The data collection.

Utility models   Coefficients                          Model fitting
Model 1          w_1^1, ..., w_k^1, ..., w_{K-1}^1     w_K^1
Model j          w_1^j, ..., w_k^j, ..., w_{K-1}^j     w_K^j
Model J          w_1^J, ..., w_k^J, ..., w_{K-1}^J     w_K^J
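The abstract describes the distance as a convex linear combination of two Euclidean distances, one on the estimated coefficients and one on the model fit. A minimal sketch of that idea on the layout of Table 1 follows. It assumes that the weight λ (here `lam`) multiplies the coefficient part, which is consistent with the caption of Fig. 5, where λ = 1 hides the differences in fit. The function name, the absence of any rescaling of the two components, and the toy values are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def inter_model_distance(W, lam):
    """Pairwise distances between J utility models.

    W   : (J, K) array laid out as in Table 1; the first K-1 columns hold
          the estimated coefficients, the last column the fit index.
    lam : weight in [0, 1] placed on the coefficient part (assumed convention).
    """
    W = np.asarray(W, dtype=float)
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    coef, fit = W[:, :-1], W[:, -1]
    # Euclidean distance between coefficient vectors, for every pair of models.
    d_coef = np.linalg.norm(coef[:, None, :] - coef[None, :, :], axis=2)
    # Euclidean distance between the scalar fit indices (an absolute difference).
    d_fit = np.abs(fit[:, None] - fit[None, :])
    # Convex combination of the two distance matrices.
    return lam * d_coef + (1.0 - lam) * d_fit

# Hypothetical toy data: models 0 and 1 share coefficients but differ in fit;
# model 2 has different coefficients and the same fit as model 0.
W_toy = np.array([[1.0, 2.0, 0.9],
                  [1.0, 2.0, 0.5],
                  [4.0, 6.0, 0.9]])
for lam in (0.0, 0.5, 1.0):
    print(lam)
    print(inter_model_distance(W_toy, lam))
```

With `lam = 1` models 0 and 1 are indistinguishable, while with `lam = 0` only the fit separates them; intermediate values trade the two sources of information against each other.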


3. CLUSTERING UTILITY FUNCTIONS

4. SIMULATION STUDY

Tab. 2: Simulation plan for the three classes of models with different coefficients and similar fitting values.

To generate three sets of models with a reasonably good fit, we build the global preference ratings according to model (10), with the coefficients given in Table 2. The ratings are used as dependent variables in a multivariate multiple regression model whose dummy explanatory variables are defined by the orthogonal experimental design in Table 3.

Tab. 3: Experimental design.

     Intercept   x_1   x_2   x_3   x_4
 1       1        1    -1    -1     1
 2       1       -1    -1    -1     1
 3       1        1     1     0     1
 4       1       -1     1     0     1
 5       1        1     0     1     1
 6       1       -1     0     1     1
 7       1        1    -1    -1    -1
 8       1       -1    -1    -1    -1
 9       1        1     1     0    -1
10       1       -1     1     0    -1
11       1        1     0     1    -1
12       1       -1     0     1    -1
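The simulation step described above can be sketched as follows: ratings are generated from the design of Table 3, and all models are then estimated in a single least-squares call, which is what a multivariate multiple regression amounts to when every response shares the same design. Several choices here are assumptions made for illustration: the coefficient values of Table 2 are not available in this text, so the example borrows the Class A row of Table 4; the seed and the function name `fit_utility_models` are hypothetical; and the fit index is taken to be the adjusted R² named in the caption of Fig. 1.

```python
import numpy as np

# Design matrix transcribed from Table 3 (columns: intercept, x_1, x_2, x_3, x_4).
X = np.array([
    [1,  1, -1, -1,  1],
    [1, -1, -1, -1,  1],
    [1,  1,  1,  0,  1],
    [1, -1,  1,  0,  1],
    [1,  1,  0,  1,  1],
    [1, -1,  0,  1,  1],
    [1,  1, -1, -1, -1],
    [1, -1, -1, -1, -1],
    [1,  1,  1,  0, -1],
    [1, -1,  1,  0, -1],
    [1,  1,  0,  1, -1],
    [1, -1,  0,  1, -1],
], dtype=float)

def fit_utility_models(X, Y):
    """Fit one OLS utility model per column of Y.

    Returns a (J, p + 1) array: p estimated coefficients followed by the
    adjusted R^2, i.e. the row layout of Table 1.
    """
    n, p = X.shape
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)            # (p, J) coefficients
    resid = Y - X @ B
    ss_res = np.sum(resid ** 2, axis=0)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)
    adj_r2 = 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p)  # p counts the intercept
    return np.column_stack([B.T, adj_r2])

# Assumed illustration values: Class A coefficients and unit-variance noise (Table 4).
w_true = np.array([6.50, -1.33, 1.00, 1.25, -1.83])
rng = np.random.default_rng(0)                            # arbitrary seed
Y = X @ w_true[:, None] + rng.normal(0.0, 1.0, size=(X.shape[0], 20))

W = fit_utility_models(X, Y)
print(W.shape)            # one row per simulated model
print(X.T @ X)            # cross-products between different attributes vanish
```

Printing `X.T @ X` is a quick way to check the orthogonality claim: the only non-zero off-diagonal entry links x_2 and x_3, which jointly code the same three-level attribute.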

Fig. 1: Box-plot of the adjusted R^2, the intercept and the four estimated coefficients.

Fig. 2: The tree structure of the simulated models.

Fig. 3: Distribution of the optimum λ-values.
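The figure captions indicate that the optimum λ is the value for which the cophenetic coefficient of the resulting tree is highest. A hedged sketch of such a search is given below. The grid of λ values, the average-linkage criterion, the helper names and the synthetic input are all assumptions made for illustration; the sketch relies on SciPy's `linkage` and `cophenet` and is not the authors' procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform

def inter_model_distance(W, lam):
    """Convex combination of coefficient and fit distances (assumed form)."""
    coef, fit = W[:, :-1], W[:, -1]
    d_coef = np.linalg.norm(coef[:, None, :] - coef[None, :, :], axis=2)
    d_fit = np.abs(fit[:, None] - fit[None, :])
    return lam * d_coef + (1.0 - lam) * d_fit

def best_lambda(W, lambdas=None, method="average"):
    """Return the lambda maximising the cophenetic correlation, plus all scores."""
    if lambdas is None:
        lambdas = np.linspace(0.0, 1.0, 11)      # assumed grid: 0, 0.1, ..., 1
    scores = []
    for lam in lambdas:
        d = squareform(inter_model_distance(W, lam), checks=False)  # condensed form
        Z = linkage(d, method=method)            # hierarchical tree for this lambda
        c, _ = cophenet(Z, d)                    # agreement between tree and distances
        scores.append(c)
    scores = np.asarray(scores)
    return lambdas[np.nanargmax(scores)], scores

# Synthetic stand-in for a (J, K) matrix of models: two coefficients plus a fit index.
rng = np.random.default_rng(1)                   # arbitrary seed
group_a = np.column_stack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(0.9, 0.03, 20)])
group_b = np.column_stack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(0.4, 0.03, 20)])
group_c = np.column_stack([rng.normal(3.0, 0.3, (20, 2)), rng.normal(0.9, 0.03, 20)])
W_demo = np.vstack([group_a, group_b, group_c])

lam_star, scores = best_lambda(W_demo)
print(lam_star)
print(np.round(scores, 3))
```

Repeating the search over many simulated data sets and collecting `lam_star` each time would give a distribution of optimum values of the kind summarised in the figure.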

Tab. 4: Simulation plan involving a hidden tree structure.

          N.    w_0     w_1     w_2     w_3     w_4     e_i
Class A   20    6.50   -1.33    1.00    1.25   -1.83    N(0, 1)
Class B   20    6.50   -1.33    1.00    1.25   -1.83    N(0, 3)
Class C   20    6.50   -0.50   -0.25    3.00    0.00    N(0, 1)

5. CONCLUDING REMARKS

Fig. 4: Distribution of the λ-values corresponding to the maximum cophenetic coefficient in 100 replications of the clustering.

Fig. 5: Dendrogram of the models. λ = 1: the badly fitted models are hidden in the two-cluster structure.

Fig. 6: Dendrogram of the models. λ = 0.2: the cophenetic coefficient is at its maximum and a three-cluster structure is more evident.

REFERENCES

GREEN P.E., SRINIVASAN V. (1990), Conjoint Analysis in Marketing: New Developments with Implications for Research and Practice, Journal of Marketing, 25: 3-19.
GUSTAFSSON A., HERRMANN A., HUBER F. (2000), Conjoint Measurement. Methods and Applications. Berlin, Heidelberg: Springer Verlag.
HENNIG C. (2000), Identifiability of Models for Clusterwise Linear Regression, Journal of Classification, 17: 273-296.
LAURO C., SCEPI G., GIORDANO G. (2002), Cluster Based Conjoint Analysis, in Proceedings of Sixth International Conference on Social Science Methodology, RC33 Logic & Methodology, August 17-20, 2004, Amsterdam.
PLAIA A. (2003), Constrained Clusterwise Linear Regression, in New Developments in Classification and Data Analysis, M. Vichi, P. Monari, S. Mignani, S. Montanarini eds., Springer, Bologna, 79-86.
SPAETH H. (1979), Clusterwise Linear Regression, Computing, 22: 367-373.
TAKANE Y., DE LEEUW J., YOUNG F.W. (1990), Regression with Qualitative and Quantitative Variables: An Alternating Least Squares Method with Optimal Scaling Features, Psychometrika, 41(4): 505-529.
VRIENS M., WEDEL M., WILMS T. (1996), Metric Conjoint Segmentation Methods: A Monte Carlo Comparison, Journal of Marketing Research, 33: 73-85.

CLASSIFICATION OF UTILITY FUNCTIONS THROUGH A DISTANCE BETWEEN MODELS

Summary

Conjoint Analysis is one of the most widely used techniques for evaluating consumer behaviour. The method estimates partial utility coefficients through a statistical model that links the overall preference evaluation to the descriptive characteristics of the stimuli (products or services). Conjoint Analysis results are widely applied in market segmentation. This paper proposes a classification strategy based on a new metric. The distance introduced is defined as a convex combination of two distances. It takes into account two kinds of information about each model: the values of the estimated coefficients and the goodness of fit. Consequently, market segments are differentiated by considering both the proximity of the estimated individual utility models and their predictive capacity.