2002 Spring Research Conference on
Statistics in Industry and Technology
University of Michigan, Ann Arbor
May 20-22, 2002
ABSTRACTS
MONDAY, MAY 20 9:15 - 10:15
KEYNOTE ADDRESS ASSEMBLY HALL, HALE AUDITORIUM
Challenges in Industrial Statistics - Sir David Cox, Oxford University
Changes in emphasis in industrial statistics over the last 50 and more years will be reviewed and some current challenges outlined. Issues of the distinctiveness of industrial statistics and mainstream general statistics will be discussed.
MONDAY, MAY 20 10:45 - 12:25
INVITED SESSION 1A: DATA MINING AND PATTERN RECOGNITION ASSEMBLY HALL, HALE AUDITORIUM
In this talk we introduce a notion of actionability of data mining results. Typically, a data mining process is applied to a large, "unclean" data source with many attributes, potentially highly correlated, and one or more performance measurements (yield, transaction amount, lifetime value, etc.). One of the goals is to discover important relationships between attributes and the performance values, in a multivariate context, and to characterize the nature of these relationships. With many potential candidates, a further requirement is posed: that an important attribute be "actionable", i.e., that it clearly identify a way to improve the performance metric. We introduce a simple actionability score in the context of process optimization in silicon chip manufacturing, using Friedman's MART model as the underlying data mining tool. We will give an overview of the MART model, its variable importance ranking, and its partial dependence plots, stressing the underlying fast and memory-efficient algorithm. We will then define the actionability metric, which ranks the attributes by their realistic potential to improve the yield of the process. Finally, we will show some results from real, high-volume silicon chip manufacturing data.
Mining Functional Process Data - Hugh Chipman*, Jock MacKay, Stefan Steiner, Sofia Mosesova, University of Waterloo
Increasing automation and computerization in manufacturing industries has led to the collection of large volumes of data. Sometimes each item produced may have hundreds or even thousands of values measured on it. This talk considers a production process in which a sleeve is force-fitted by a ram into a cylindrical tube. The force exerted is automatically measured at about 1000 time points during each insertion. Distance traveled by the ram is similarly measured as a function of time. The resultant curves can be thought of as very high dimensional data, but with a special functional structure. The goal is to use these curves to detect anomalous insertions and also to understand how the process is varying over time. Methods for reducing the dimensionality of the data and extracting relevant information from these curves will be discussed, including smoothing, clustering and functional data analysis.
Face Detection and Synthesis Using Markov Random Field Models - Sarat Dass*, Michigan State University (with Anil K. Jain)
The spatial distribution of gray level intensities in an image can be naturally modeled using Markov Random Fields (MRFs). We develop and investigate the performance of face detection algorithms derived from MRF considerations. For enhanced detection, the MRF models are defined for every permutation of site indices (pixels) in the image. We find the optimal permutation that provides maximum discriminatory power to identify faces from nonfaces using certain types of metrics. These metrics avoid parameter estimation when finding the optimal permutation from the training database.
A maximum pseudolikelihood criterion is proposed for subsequent estimation of the MRF parameters. We investigate the performance of the estimated MRF models for face detection and synthesis. Some detection and synthesis results based on the first and second order neighborhood systems will be shown.
INVITED SESSION 1B: MINIMIZING THE MAGIC: TRANSFORMING KNOWLEDGE REPRESENTATIONS INTO STATISTICAL MODELS ASSEMBLY HALL, WOLVERINE ROOM
Although highly complex research and development projects provide statisticians with exciting analytical problems, the large amounts of data, information and knowledge involved in such projects can make model building quite challenging. To address complex problems, the Statistical Sciences group at the Los Alamos National Laboratory has developed a multidisciplinary approach known as Information Integration Technology, or IIT. This framework combines research methods from the social sciences with statistics to develop complex analytical models that draw together and analyze diverse data sources in a way that is comprehensible to clients. This panel uses a ballistic missile research and development reliability model as a case study to explain the IIT approach, focusing particularly on the processes for translating qualitative problem representations into quantitative models.
Knowledge Elicitation, Representation and Ontologies for Statistical Model Development: A Case Study in Ballistic Missile Defense - Deborah Leishman, Laura McNamara, Los Alamos National Laboratory
Eliciting and representing expert design knowledge can be a tricky business, particularly in the context of dynamic and evolving projects where knowledge is distributed among many individuals. Doing so for the purpose of building any kind of predictive model complicates the situation, as statisticians and their clients often approach problems from different perspectives. Using a case study from a ballistic missile defense project, we present a set of hybrid methods from cultural anthropology and knowledge representation for eliciting and representing expert knowledge, for the purpose of creating mathematically tractable ontologies that translate fluidly between engineers and statisticians.
Statistical Methods for Information Integration - Alyson Wilson, Los Alamos National Laboratory
Many of the statistical problems addressed at Los Alamos involve assessing large, complex systems. Data directly assessing the system is often sparse or non-existent, but many other data and information sources, such as subsystem tests, tests on similar systems, computer models, and expert opinion, are available. In the context of the ballistic missile research and development reliability case study, this talk will address the development of graphical models to represent the system and its metrics of interest, the population of these models with data, and the resulting analysis and computational issues.
Discussant: Shane Reese, Brigham Young University
CONTRIBUTED SESSION 1A: APPLICATIONS OF EXPERIMENTAL DESIGN ASSEMBLY HALL, MICHIGAN ROOM
Design and Analysis of Experiments with Continuous On-off Factors: An Application in Chine Placement on Nacelle - Julio Piexoto and Winson Taam*, The Boeing Company
Design of experiments is a systematic approach to developing a cost-effective test plan. Standard approaches are not appropriate when the experimental factors are continuous factors that can be absent or present. Many engineering systems possess this feature. This presentation describes the design, analysis and assessment of a wind tunnel experiment on placing chines (air deflectors) on aircraft nacelles. The issues of optimization with these factors and of model selection are discussed using this experiment. The emphasis is placed on the design.
DOE in Chemical Process Research: Searching for Firmer Ground - Gary Proehl, Eisai Research Institute
This presentation will describe experiments done to improve the robustness of a chemical process used to synthesize a pharmaceutical intermediate. Given a process that occasionally produced out-of-specification material, design of experiment work was used to find the critical process parameters for the addition of a tert-butyloxycarbonyl protecting group to an amine. Relying heavily on laboratory automation, a five-factor, fractional-factorial study screened factors for their effect on the isolated product. A response surface model was then generated from a three-factor central composite design. Verification runs confirmed the robust conditions found for this reaction. Details of this work will be presented with an emphasis on the statistical techniques used.
Achieving Process Improvement Through Strip Block Design: An Industrial Application to Reduce Defective Rate - Carla Vivacqua*, Andre de Pinho and Wenzhen Huang, University of Wisconsin
Although strip block designs have been known and applied in the agricultural field since the late thirties, their use in industrial settings has been very limited. Strip block designs provide not only an economical, but also an operational advantage in executing experiments. These designs are appropriate in three basic scenarios: robust product experimentation, presence of factors that are difficult to change, and multistage processes, that is, processes with a sequence of at least two steps. When there is a need to investigate the effects of factors associated with different stages, the use of a strip plot configuration may be an alternative that reduces the cost of experimentation by decreasing the number of runs required, and consequently the time and resources consumed. Its use is especially beneficial in situations where several production units can be processed simultaneously as a group. In this study, we explore the use of this configuration in industrial settings. We also present an example of a strip block experiment aimed at reducing the defective rate, which illustrates this design's appeal to engineers and industrial practitioners. Finally, we show how much can be gained by choosing a suitable design to accommodate all the environmental and process characteristics, as well as the limitations inherent in a workplace.
Design of a Multi-Level Materials Experiment - Joanne Wendelberger*, Leslie Moore, Los Alamos National Laboratory
Planning of scientific experiments involves a variety of challenges, both statistical and logistical in nature. Interesting questions arise in the planning of multi-level experiments that involve assessing the tradeoffs between run number, selection of levels, ability to estimate effects, and the degree to which orthogonality can be achieved. The relative merits of fractional factorials, orthogonal arrays, and near-orthogonal arrays for achieving desirable statistical properties are explored, while facing the realities of practical constraints. These issues are examined as they arise in the process of designing an experiment for a materials study.
On Information Matrices of Irregular Factorial Designs - Tena Ipsilantis Katsaounis, Ohio State University
In this paper, we present a computationally efficient method for calculating the elements of X′X for any array X in two symbols. The concept of the marginal index of an array in two symbols is introduced. A theorem for the case of partially balanced (PB) arrays and its generalization to PB1 and extended PB1 arrays is presented. Applications in calculating the information matrix, or parts of the information matrix, of irregular factorial designs with two-level factors are discussed.
CONTRIBUTED SESSION 1B: APPLICATIONS OF STATISTICS IN AUTOMOTIVE INDUSTRY: MAKING BUSINESS DECISION DAVIDSON HALL, ROOM D1210
Survival Analysis and Days-on-Lot - Lan Wang*, Gint Puskorious, Irv Salmeen, Ford Motor Company
In the automotive industry, an important marketing problem is the determination of which vehicle options consumers prefer. We apply methods of survival analysis to the number of days vehicles remain in inventory and determine how vehicle options, incentives, etc. affect that time.
Statistical Issues in Modeling User Accommodation - Carol Flannagan*, Matthew Reed, UMTRI
Automotive engineers use statistical accommodation models to predict the proportion of the user population that will be accommodated by a particular vehicle design. However, modeling accommodation can be challenging because of the difficulties inherent in modeling the tails of a distribution. In this paper, the authors present a set of statistical strategies for approaching the problem of modeling accommodation. This approach is illustrated throughout the paper using the example of passenger-vehicle design to accommodate driver's preferred seat position. The general approach begins with operationally defining accommodation and identifying key subject and design factors that affect accommodation. Next, a model of the relationship between design factors and accommodation of an individual is developed. Finally, the model of individual accommodation is combined with the population distribution of the key subject variables. Each step in this process raises different statistical issues. Future directions for refinement are discussed.
Comparison of Estimates of Preparatory and Syndicate Methods in Auto Industry Survey - Daniel Wang, Central Michigan University
Preparatory and syndicate surveys are often used by J.D. Power to assess the appeal and initial quality of new vehicles for many auto manufacturers. When the two types of survey are conducted using different or partially different new vehicle owner registrations, it is not clear whether the results are still comparable. This study discusses the difference between the two types of studies, and proposes a computer simulation based method for checking the appropriateness of using the syndicate study to predict future scores of the J.D. Power general study. Simulated examples for the Mercedes-Benz M-Class are also included.
Pricing Automobiles to Reflect Their Perceived Quality Differentials in the U.S. Automobile Market: An Empirical Study - Satoshi Myojo and Yuichiro Kanazawa, University of Tsukuba
Although U.S. consumer preferences for automobiles shifted from passenger cars to minivans and SUVs during the last two decades, Japanese automobile manufacturers maintained their share of passenger cars somewhere between 13.6% and 20.0%, while their cars have been priced higher than their domestic counterparts. Motivated by this finding, we investigate the automobile pricing policies of domestic versus Japanese manufacturers in terms of total cost of ownership (TCO), which includes the initial purchase price, the fuel cost, the maintenance and repair expenditure, and the trade-in value. The maintenance and repair expenditure is far more difficult to estimate than the other components, although automobile reliability scores based on the experiences of actual owners are available in Consumer Reports. In this paper we propose and illustrate a method to convert that knowledge of automobile reliability into monetary maintenance and repair expenditure. We find that the maintenance and repair expenditure differential plays an important part in the TCO calculation.
An Automotive Competitive Response Model - Yong Yang*, Gint Puskorious, Rose Peng, Ford Motor Company
We have developed an Automotive Competitive Response Model, which advances existing applied research in three respects. First, this competitive response model can measure consumer response not only to price changes but also to quality changes, while other similar models can measure consumer response only to price changes. Second, this competitive response model can measure manufacturers' responses to each other's price changes and quality changes, while other similar models largely focus on the consumer demand side and nearly ignore the interactions among manufacturers. Third, this competitive response model can be applied to predict market outcomes in terms of market share and pricing at market equilibrium, while other similar models completely lack this capability. This model can be used by the product development division to assess the costs and benefits of investment in vehicle freshening and quality improvement. It can also be used by the pricing and C&I divisions to predict consumer responses to price changes and C&I programs. Most importantly, our model adopts a competitive, dynamic and systematic view of the market and gives automotive business planners the capability to assess potential reactions from their major competitors to their action plans on product development, pricing and C&I. This model is therefore able to predict what would ultimately happen in the market after all the dust has settled.
MONDAY, MAY 20 2:00 - 3:40
INVITED SESSION 1C: NOVEL METHODOLOGIES AND APPLICATIONS IN INDUSTRIAL EXPERIMENTATION ASSEMBLY HALL, HALE AUDITORIUM
Quality Engineering Tools to Improve Life Cycle Design of a New Vehicle in Virtual Environment - Stefano Barone, Antonio Lanzotti*, Stanislao Patalano, Università degli Studi di Napoli Federico II
In the early phases of the life cycle design of a new vehicle, the designer needs tools to evaluate the goodness of his ideas and, consequently, whether his design choices could satisfy the end user. A current challenge is to carry out these evaluations on virtual prototypes instead of physical ones. To get the best results when using specific and expensive simulation software in a virtual environment, the designer needs to adopt well-tested methodologies.
In this paper, the authors present some results of research projects aimed at optimizing:
a. Comfort of a new vehicle in the concept design phase;
b. Tolerance allocation for complex mechanical systems in the tolerance design phase.
The first research project, since comfort evaluation depends primarily on man-machine interaction, deals with human model experimentation to simulate human behavior in a virtual environment. Integrating statistical and industrial design competencies, a classical parameter design approach has been adopted. Furthermore, a multiresponse loss function has been formulated in order to discriminate among various configuration settings and eventually find the optimal design in terms of driver comfort. The results obtained by virtual experimentation concern the improvement of the design in terms of comfort and the increase of the designer's knowledge about the effects of design parameters on the ergonomics of the vehicle. The fully developed case study shows that some design factors can be fixed at their optimum values while others need direct user control to attain optimal comfort requirements. These results improve the concept design of the vehicle, suggesting if and where, for example, to introduce adjustment devices.
The second research project deals with tolerance design of complex assemblies of parts, which occur frequently in the aeronautical and automotive fields. Using Computer Aided Tolerancing (CAT) systems, the designer can verify the effect on the response (or responses) of design factor tolerances by developing a feature based model in a virtual environment. Two main needs must be met when using CAT systems in tolerance analysis: the first is the capacity to orient the steps of the simulation process towards the optimal solution; the second is the possibility of reaching the optimal solution in a short time. In this paper, the authors show a new procedure using CAT systems to find the set of dimensional and geometrical tolerances that assures the fulfillment of product functional requirements at lower cost, using a limited number of simulation steps. This procedure involves the application of parameter design methodologies.
These case studies have been fully developed in a virtual lab available at PRODE (PROduct DEsign), a recently established consortium between University of Napoli Federico II and Elasis-Sistema Ricerche FIAT nel Mezzogiorno.
Large-Scale Reliability Based Design Optimization for Vehicle Crash Safety - Lei Gu*, Ren-Jye Yang, Ford Motor Company
The application of multidisciplinary design optimization (MDO) to automotive vehicle design for safety has been of significant interest over the last several years. Sobieski et al. and Kodiyalam et al. reported a very significant reduction in computing time for such large-scale MDO problems, from months to days, through the efficient use of shared memory multiprocessor systems. The present work extends the work reported in Sobieski et al. and Kodiyalam et al. with a substantial increase in computational complexity. The focus of this presentation is on a large-scale reliability based design optimization of a car body structure for crash safety. Finite element based full vehicle structural crash simulation is a design tool commonly used throughout the automotive industry to evaluate vehicle crash performance. This presentation describes how this simulation technique has been extended to perform reliability based design optimization for key response variables using nonlinear response surface methods. Monte Carlo and performance measure methods are used to perform reliability based design optimization. This technology uses computer based design of experiments and lends itself readily to parallel implementation.
Failure Amplification Method (FAM): An Experimental Approach to Categorical Response Optimization - Jeff Wu*, Roshan J. Vengazhiyil, University of Michigan
Categorical data arise quite often in industrial experiments because of an expensive or inadequate measurement system for obtaining continuous data. When the failure probability/defect rate is small, experiments with categorical data provide little information regarding the effects of the factors of interest and are generally not useful for product/process optimization. We propose an engineering-statistical framework for categorical response optimization that overcomes the inherent problems associated with categorical data. The basic idea is to select a factor that has a known effect on the response and use it to amplify the failure probability so as to maximize the information in the experiment. We develop new modeling and optimization methods that are appropriate for these experiments. The methodology is illustrated with two real experiments.
INVITED SESSION 1D: CLASSIFICATION, CLUSTERING AND DATA MINING ASSEMBLY HALL, MICHIGAN ROOM
Multivariate Analysis of Massive Distributed Data Sets - George Ostrouchov, Nagiza Samatova, Oak Ridge National Labs
Massive data sets distributed over a network are becoming more common as more widely distributed data collection devices are installed and because of increasing collaborations over the Internet. Even if it is possible to centralize a massive distributed data set on local storage devices, we have few means of analyzing such data. A recent Scientific American article reports that our data storage capacity doubles every 9 months, while processor speed and memory still follow Moore's law of doubling roughly every 18 months. Although the size of data sets we analyze with centralized methods is increasing with available memory, we are losing ground relative to the size of available data sets. The effect this has on the nature of massive data sets is that they are increasingly stored as massive collections of manageably small data sets. As a result, most massive data sets are distributed regardless of whether they reside on a single device or multiple devices. In this talk we describe some characteristics of the massive distributed data analysis problem, some components necessary to address the problem of using distributed data, and our approach to developing algorithms for distributed data sets. We conclude with our algorithm for principal components analysis of a set of distributed data sets.
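To make the idea of principal components analysis over distributed data concrete, here is a minimal sketch of one standard approach, which is not necessarily the authors' algorithm: each site reports only its count, column sums and cross-product matrix, and these summaries are added together to recover the pooled covariance matrix. The function names, the toy data and the use of power iteration for the leading component are all assumptions made for illustration.

```python
def chunk_stats(rows):
    """Summaries one site would report: count, column sums, cross products."""
    n, p = len(rows), len(rows[0])
    sums = [sum(r[j] for r in rows) for j in range(p)]
    cross = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    return n, sums, cross

def merge_stats(a, b):
    """Combine the summaries of two sites by simple addition."""
    n = a[0] + b[0]
    sums = [x + y for x, y in zip(a[1], b[1])]
    cross = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return n, sums, cross

def covariance(stats):
    """Sample covariance matrix recovered from merged summaries."""
    n, sums, cross = stats
    p = len(sums)
    return [[(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_component(cov, iters=200):
    """Leading eigenvector by power iteration (assumes the all-ones start
    vector is not orthogonal to it)."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical data held at two sites; the second column is roughly twice the first.
site_a = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9]]
site_b = [[4.0, 8.2], [5.0, 9.8]]

merged = merge_stats(chunk_stats(site_a), chunk_stats(site_b))
cov = covariance(merged)
pc1 = leading_component(cov)
```

Only the small summaries travel between sites, so the raw rows never need to be centralized. The cross-product formula can lose precision when the means are large relative to the spread, so a production version would likely merge centered statistics instead.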
Simultaneous Classification and Feature Clustering Using Discriminant Vector Quantization with Applications to Microarray Data Analysis - Jia Li*, H. Zha, Penn State University
In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high dimensional data sets, feature clustering has the potential to improve classification accuracy and reduce computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering, with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and that the clusters generated match well with biologically meaningful gene expression signature groups.
A Graph Based Tree Pruning Algorithm - Xiaoming Huo, Georgia Institute of Technology
A graph based tree pruning algorithm is proposed. The new method is computationally comparable with existing ones, but it provides a full spectrum of information regarding tree pruning:
(1) given the value of the penalization parameter, this method gives the minimum size of a decision tree;
(2) given the size of a decision tree, it provides the range of the penalization parameter in which the cost penalizing approach will render a tree of this size;
(3) it can identify the tree sizes that are definitely inadmissible: no matter what the value of the penalty parameter is, the complexity penalization approach will not render a tree of this size.
Simulations illustrate the usefulness of this approach. This method can be incorporated in many tree-building techniques, e.g. cross validation and boosting, to name a few.
CONTRIBUTED SESSION 1C: GENERAL METHODS I ASSEMBLY HALL, WOLVERINE ROOM
EWMA Tolerance Limits - Raid Amin*, Kuiyuan Li, University of West Florida and Sven Knoth, Viadrina University
Tolerance limits are limits that include a specified proportion of the population at a given confidence level. They are used to make sure that the production will not be outside specifications. Tolerance limits are either designed based on the normality assumption, or nonparametric tolerance limits are established. In either case, no provision for autocorrelated processes is made in the available design tables of tolerance limits. It is shown how to utilize EWMA tolerance limits to cover a specified proportion of the population when autocorrelation is present in the process.
On the Effectiveness of Ranked Set Sampling for Estimating a Population Proportion - Haiying Chen, Ohio State University
Ranked set sampling (RSS) is a sampling procedure that can be considerably more efficient than simple random sampling (SRS). It involves preliminary ranking of the variable of interest. Although ranking processes for continuous variables that are implemented through either subjective judgment or via the use of a concomitant variable have been studied extensively in the literature, the use of RSS in the case of a binary variable of interest has not been investigated thoroughly. The objective of this study is to use a National Health and Nutrition Examination Survey data set to investigate the application of RSS to estimate a population proportion. We use logistic regression to aid in the ranking of the variable of interest. Our results indicate that the use of logistic regression improves the accuracy of the preliminary ranking in RSS and this leads to substantial gains in estimation of population proportion.
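As an illustration of why ranking on a concomitant score can help when the variable of interest is binary, the following simulation sketch compares balanced RSS with SRS at the same number of measured units. It is a toy model and not the study's analysis: each unit's score is assumed to be its own success probability, drawn uniformly on (0, 1), standing in for a fitted logistic regression probability; the set size, number of cycles, seed and function names are likewise hypothetical.

```python
import random

def srs_estimate(rng, n):
    """Proportion estimate from a simple random sample of n units."""
    total = 0
    for _ in range(n):
        q = rng.random()                      # unit's success probability
        total += 1 if rng.random() < q else 0 # measure the binary response
    return total / n

def rss_estimate(rng, set_size, cycles):
    """Balanced ranked set sample: in each cycle, for every rank r, draw a
    fresh set, rank it by the concomitant score, and measure only the unit
    holding rank r."""
    total = 0
    for _ in range(cycles):
        for rank in range(set_size):
            scores = sorted(rng.random() for _ in range(set_size))
            q = scores[rank]                  # unit selected by its rank
            total += 1 if rng.random() < q else 0
    return total / (set_size * cycles)

def variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(2002)
reps = 5000
srs = [srs_estimate(rng, 30) for _ in range(reps)]    # 30 measured units
rss = [rss_estimate(rng, 3, 10) for _ in range(reps)] # 3 ranks x 10 cycles = 30

print("SRS variance:", variance(srs))
print("RSS variance:", variance(rss))
```

Both estimators target the same population proportion (0.5 in this toy model), and the RSS estimates should show the smaller variance. The size of the gain depends on how well the score orders the units, which is where a better ranking model would pay off.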
Estimation of a Finite Population Mean by Judgment Post-stratification - Liyan Hua, Ohio State University
Judgment post-stratification is a new estimation method that keeps the benefits of stratification while requiring no knowledge of the population structure. It is based on the collection of a simple random sample of units from the finite population. The units are grouped into sets, and either subjective judgment or objective concomitant measurement is used to rank the units in a set, thereby creating the post-strata. Analytic results and simulations demonstrate that judgment post-stratification yields an unbiased estimator of the population mean with smaller mean squared error than does simple random sampling. We also provide rules for collapsing the strata while retaining satisfactory results. Our study shows some of the many benefits that accrue from judgment post-stratification.
Increasing the Precision of Quadratic Metamodels - Wheyming Song, National Tsing Hua University
A simulation experiment is frequently performed to estimate a metamodel, which is a functional relationship between the mean response of the simulation model and a set of simulation inputs. Several strategies have been proposed to increase the accuracy of estimation by inducing a desired correlation structure among the responses. This talk proposes a new correlation inducing strategy for a quadratic metamodel. The proposed strategy is shown to be superior to a number of existing and competing strategies in terms of various variance measures.
Optimal Variable EWMA Controller - Arthur Yeh*, Bowling Green State University (with Sheng-Tsaing Tseng, Fugee Tsung, Yun-Yu Chan)
The exponentially weighted moving average (EWMA) feedback controller with a fixed discount factor is a popular run-by-run (RbR) control scheme, which primarily uses data from past process runs to adjust settings for the next run. Although the EWMA controller with a small discount factor can guarantee, under fairly regular conditions, long-term process stability, it usually requires a moderately large number of runs to bring the process output to its target. This can have severe consequences in short production runs, because the output deviations are usually very large in the first few production runs and, as a result, the process output may be out of specifications.
In order to reduce a possibly high rework rate, we propose a variable discount factor to tackle the problem. We derive the stability condition and the optimal variable discount factor for the proposed EWMA controller. Examples are given to demonstrate the strength of the EWMA controller with a variable discount factor. Moreover, a heuristic method is proposed to simplify the computation of the optimal variable discount factor. It is seen that the proposed heuristic method is easy to implement and provides a good approximation to the optimal variable discount factor.
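The slow start of a fixed, small discount factor can be seen in a short noise-free simulation. The sketch below assumes the commonly used linear process model y_t = alpha + beta * u_(t-1) with a controller gain estimate b, and it compares a fixed discount factor of 0.1 with an arbitrary decreasing sequence. That sequence, the parameter values and the function name are hypothetical choices for illustration, not the optimal variable discount factor derived in the talk.

```python
def simulate_ewma_controller(discounts, alpha=5.0, beta=1.0, b=1.0,
                             target=10.0, a0=0.0):
    """Noise-free run-by-run EWMA control of y_t = alpha + beta * u_(t-1).

    discounts[t] is the discount factor used in the update after run t.
    Returns the process output of every run.
    """
    a = a0                    # current estimate of the intercept alpha
    u = (target - a) / b      # recipe for the first run
    outputs = []
    for lam in discounts:
        y = alpha + beta * u                    # run the process
        outputs.append(y)
        a = lam * (y - b * u) + (1 - lam) * a   # EWMA update of the estimate
        u = (target - a) / b                    # recipe for the next run
    return outputs

# Fixed small discount factor versus an illustrative decreasing sequence.
fixed = simulate_ewma_controller([0.1] * 10)
variable = simulate_ewma_controller([0.9, 0.6, 0.4, 0.2] + [0.1] * 6)

for run, (yf, yv) in enumerate(zip(fixed, variable), start=1):
    print(f"run {run:2d}: fixed={yf:7.3f}  variable={yv:7.3f}")
```

With these settings the fixed controller is still well away from the target of 10 after ten runs, while the decreasing sequence removes most of the initial offset within two runs. In a noisy process a large discount factor also passes more noise into the recipe, which is why the sequence drops back to a small value.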
CONTRIBUTED SESSION 1D: PROCESS ADJUSTMENT, MONITORING, AND DIAGNOSIS DAVIDSON HALL, ROOM D1210
Identification and Fine Tuning of Closed-loop Processes Under Discrete EWMA and PI Adjustments - Rong Pan*, Enrique del Castillo, Pennsylvania State University
Conventional process identification techniques for an open-loop process use the cross-correlation function between historical values of the process input and of the process output. If the process is operated under a linear feedback controller, however, the cross-correlation function has no information on the process transfer function because of the linear dependency of the process input on the output. In this paper, several circumstances where a closed-loop system can be identified by the autocorrelation function of the output are discussed. It is assumed that a Proportional-Integral (PI) controller with known parameters was acting on the process while the output data were collected. The disturbance is assumed to be a member of a simple yet useful family of stochastic models, which is able to represent drift. It is shown that, with these general assumptions, it is possible to identify some dynamic process models commonly encountered in manufacturing. After identification, our approach suggests tuning the controller to a near-optimal setting according to a well-known performance criterion.
Robustness of Multivariate EWMA Control Charts when the Covariance Matrix is Estimated Using Small Samples - Alexandra Kapatou, University of Michigan
Multivariate control charts can be used to monitor the means of two or more correlated variables of a process. We consider processes where the target values of the means, variances, and covariances are estimated from the data. During monitoring, small random samples are collected at equal time intervals. Since estimates based on small samples are unreliable, the estimates of variances and covariances are updated with each new sample. In order to increase the sensitivity of the charts to small shifts from the target, past sample information is retained for each variable using an exponentially weighted moving average (EWMA) statistic. Three multivariate EWMA control charts are compared in this paper, based on the componentwise sample average, sign, and signed rank statistics, respectively. The properties of the charts are evaluated using simulation. We study the advantages and disadvantages of each chart for various cases. Finally, we give recommendations on how to apply these charts in practice.
Interpreting the T2 Control-Chart Signals: A Comparison of the Effectiveness of the MTY Decomposition versus a Neural Network Francisco Aparisi*, Jose Sanz, Gerardo Avendaño, Universidad Politecnica de Valencia
The use of multivariate SPC has several known advantages in
comparison with the use of multiple univariate charts. The T2 control chart has
become a popular option for implementing multivariate SPC due to its relative
simplicity and the fact that it is the equivalent multivariate scheme of the
Shewhart X-chart. However, the main disadvantage of these multivariate charts is
the interpretation of an out-of-control signal. The decomposition proposed by
Mason, Tracy and Young (1995, 1997) for the T2 chart (hereafter MTY) has the
advantage of being an analytic procedure. In this work we study the
effectiveness (percent of correct classifications) of MTY versus shift magnitude
and type, number of variables,
probability, etc. We then introduce a backpropagation neural network designed to
solve the same problem (interpreting control chart signals for the T2 chart). A
comparison between both methods is made.
Multivariate Outlier Detection Method in Process Monitoring Jiaqiong Xu*, Bovas Abraham, Stefan Steiner, University of Waterloo
A generalized Cook's statistic for detecting multiple outliers in multivariate linear regression models is proposed. An approximate distribution of the proposed statistic is also obtained to get a suitable cutoff point for a test of hypothesis of no outliers. A simulation study has been conducted to examine the performance of the approximate distribution. In addition, an application to process monitoring is also considered.
Variance Components Analysis Method for Diagnosability Study of Multistage Manufacturing Processes Shiyu Zhou*, Yu Ding, Yong Chen, Jianjun (Jan) Shi, University of Michigan
The automatic in-process sensing and data collection techniques have been widely used in complicated manufacturing processes in recent years. The huge amounts of product measurement data create a great opportunity for process monitoring and diagnosis. Given the product quality measurements, the diagnosability of the process faults in a multistage manufacturing process is studied in this paper under the framework of variance components analysis. Fault diagnosability is defined in such a general way that it does not depend on specific diagnosis algorithms. The concept of minimal diagnosable class is proposed to expose the "aliasing" structure among process faults in a partially diagnosable system. The algorithms and procedures to obtain the minimal diagnosable class and to evaluate the system-level diagnosability are presented. The methodology can be used for any general linear input-output system and it is illustrated by using the examples of a panel assembly process and an engine-head machining process.
MONDAY, MAY 20 4:10 - 5:10
PLENARY SESSION ASSEMBLY HALL, HALE AUDITORIUM
The Role of Statistical Science in Automotive Engineering Tim Davis, Ford Motor Company
Higher customer expectations and increasing competitiveness in the automotive industry dictate an ever-increasing complexity of automobile systems and components. The design and field performance of these systems become more difficult to understand purely in terms of deterministic engineering science. Opportunities for good statistical science abound, serving as an empirical adjunct to modern automotive engineering practice. Applications range from quantification of customer expectations and improvement of field reliability performance to design optimization and problem solving & prevention techniques. Useful statistical methods include traditional areas such as Survival Analysis, Experimental Design, and Statistical Process Control, as well as more recent developments such as Probabilistic Design, Two-Stage Regression and Generalized Point Process Modeling. The discussion is accompanied by numerous examples from current practice within the Ford Motor Company. The emphasis is not only on the technicalities of the statistics involved, but also on the role of the statistical scientist in getting these methods implemented.
TUESDAY, MAY 21 8:40 - 10:20
INVITED SESSION 2A: RELIABILITY AND RECURRENT EVENTS ASSEMBLY HALL, HALE AUDITORIUM
Some Prediction Problems Concerning Repeated Events Jerald F. Lawless, University of Waterloo
The prediction of events and related costs in a population of individuals or units at risk for recurrent events will be considered. Structured Poisson models and Bayesian methodology will be used to address problems involving the prediction of warranty claims and the discovery of faults in software.
A General Class of Models for Recurrent Events Arising in Reliability Edsel A. Pena, University of South Carolina
Recurrent events are prevalent in reliability and engineering studies. In this talk I will describe a general class of models for recurrent events which could simultaneously incorporate effects of intervention or repairs, effects of accumulating event occurrences, and effects of environmental factors or concomitant variables. This new class of models subsumes as special cases many existing models that have been proposed to model recurrent phenomena in reliability and engineering settings. Statistical methods for analyzing data arising from this class of models will be described and illustrated.
An Approach to Fuzzy Inference, with Applications to Reliability - Francisco J. Samaniego*, University of California-Davis, Nozer Singpurwalla of George Washington University
Lotfi Zadeh's seminal 1965 paper introduced the notion of a fuzzy set and established the foundations of a theory for combining and operating with such sets. If U is taken as the universal set of interest in a given problem, then the fuzzy set A ⊆ U is defined and determined by a "membership function" m: U → [0, 1] which tracks the "extent" to which each x ∈ U is a member of A. If, for example, A is the set of "rich" people, and x is a particular person's net worth, then m(x) represents the extent to which x belongs to A. In the arithmetic of fuzzy sets, A ∪ B is defined as the set with membership function max{m_A(x), m_B(x)} and A ∩ B as the set with membership function min{m_A(x), m_B(x)}. While fuzzy set theory and "fuzzy logic" have been applied in selected statistical contexts, notably in pattern recognition, classification problems and decision analysis, little progress has been made to date on the application of the fuzzy set framework to classical problems of statistical estimation.
This talk is centered around a stochastic model for "fuzzy data" based on a "super-population" viewpoint of a particular sampling problem of interest -- one among several plausible ways to interpret and apply Zadeh's fuzzy set theory in a statistical context. Specifically, we will consider a bivariate distribution for the random pair (X, Y), where X takes values in the universal set of choice and, given X = x, Y is a Bernoulli variable with probability p = m(x), interpreted as the conditional probability of membership in the set A of interest when X = x. We will explore the situation in which inference concerning the distribution F_X is of interest, but the random variable X cannot be observed directly. In such situations, one would seek to draw inferences concerning F_X from the observed values of Y. We treat both classical and Bayesian versions of the problem of estimating model parameters. Our overall purpose is to call attention to an interesting class of statistical problems, to formulate a way of treating them, to point out some related literature and, finally, to provide a detailed example in which the coherence and feasibility of the approach are demonstrated. Our example is drawn from the literature on stress-strength models in reliability. This work is joint with Nozer Singpurwalla of George Washington University.
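The fuzzy-set arithmetic and the (X, Y) model described in this abstract can be illustrated with a small simulation. The sketch below is illustrative only: the logistic membership function, the lognormal choice for F_X, and every numeric value are assumptions made for the example and are not taken from the talk.

```python
import math
import random


def membership_rich(net_worth, midpoint=1_000_000.0, scale=250_000.0):
    """Hypothetical membership function m(x) in [0, 1] for the fuzzy set "rich"."""
    return 1.0 / (1.0 + math.exp(-(net_worth - midpoint) / scale))


def fuzzy_union(m_a, m_b):
    """Membership function of A union B: pointwise maximum."""
    return lambda x: max(m_a(x), m_b(x))


def fuzzy_intersection(m_a, m_b):
    """Membership function of A intersect B: pointwise minimum."""
    return lambda x: min(m_a(x), m_b(x))


def simulate_pairs(n, seed=0):
    """Draw (X, Y) pairs: X from an assumed lognormal F_X, Y | X = x ~ Bernoulli(m(x))."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.lognormvariate(13.0, 1.0)  # assumed net-worth distribution
        y = 1 if rng.random() < membership_rich(x) else 0
        pairs.append((x, y))
    return pairs
```

In the setting of the abstract only the Y values would be available to the analyst; the X values are kept here solely so that the simulation can be inspected.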
INVITED SESSION 2B: STATISTICS IN DRUG DISCOVERY AND BIOINFORMATICS ASSEMBLY HALL, WOLVERINE ROOM
Flexible Modelling of High Throughput Screening Data - Marcia Wang, Hugh A. Chipman, and William J. Welch*, University of Waterloo
High Throughput Screening (HTS) is used in drug discovery to screen large numbers of compounds against a biological target. Data on activity against the target are collected for a representative sample (experimental design) of compounds selected from a collection. The explanatory variables are chemical descriptors of compound structure.
This talk will concentrate on the analysis of such data. Some previous work comparing statistical analysis methods on two HTS data sets shows that local methods perform well. These local methods, namely K-nearest neighbours (KNN) and classification and regression trees (CART), have good predictive performance in comparison with regression, neural network and MARS models, which assume more smoothness. After reviewing these comparisons, we will present some adaptations of KNN and CART, including averaging over subsets of explanatory variables, bagging and boosting. These further improve the models' performance.
Testing Non-Additivity of Biological Activity in Combinatorial Library - Nanxiang (Sean) Ge, Aventis Pharmaceuticals
Combinatorial chemistry offers new opportunities to generate and analyze QSAR data. Traditional QSAR attempts to correlate activity with structure. With combinatorial chemistry, it is possible to correlate activity directly with the reagents used in a combinatorial library. If one can determine which reagents lead to the compounds of highest activities, it is then possible to predict active compounds in virtual libraries of 10^6 to 10^10 compounds. This would greatly facilitate library design and provide confidence that the best compounds are being considered for synthesis. An important question is whether the activities of a product molecule can be considered as a sum of its components. This is referred to as additivity between reagents. If there is non-additivity, it is necessary to identify and include the non-additive terms in the model in order to improve QSAR models.
Presented here is a method for detecting the second and third order effects of side-chain non-additivity. If the reagents in a library are shown to be additive in their contribution to activity, simple QSAR based on additive models can be applied confidently to reagents. Furthermore, methods are presented for identifying non-additive terms in QSAR models. Testing non-additivity can also guide the synthesis of the library. If the contributions are shown to be additive then the strategy for library synthesis may be shifted to include many reagents of a given type but not to make all combinations. The result is a more efficient use of resources.
Validation in High Throughput Screening - David Stock, BMS Pharmaceuticals
The Pharmaceutical Industry uses High Throughput Screening (HTS) to rapidly test large numbers of compounds against biological targets. In a primary screen the activity of a compound is judged by its performance at a single concentration. A typical primary screen may be used to test 500,000 or more compounds in a few weeks. Compounds identified as active by the primary screen are typically subjected to concentration-response testing. A typical laboratory group may be responsible for generating thousands of concentration-response curves in a week. The increasing demand for HTS services, and the scarcity of resources, has made it critical that every screen be properly validated before being put into production. This talk outlines experimental procedures and statistical analyses for the validation of screens.
David Stock (1), Lynda Cook (2), Robert Stoffel (2), Ramesh Padmanabha (2), Xianggui Qu (3), C.F. Jeff Wu (3)
(1) Corresponding author: Dept 716 - Nonclinical Biostatistics, Bristol-Myers Squibb, 5 Research Parkway, Wallingford CT 06438, StockD@BMS.COM (2) Lead Discovery, Bristol-Myers Squibb (3) University of Michigan
CONTRIBUTED SESSION 2A: APPLICATION OF STATISTICS IN AUTOMOTIVE INDUSTRY: MODELING ENGINEERING DATA ASSEMBLY HALL, MICHIGAN ROOM
Statistical Techniques for Vehicle Crash Testing Ming-Wei Lu*, Richard Rudy, Daimler Chrysler Corporation
Vehicle crash testing is often conducted using a very small sample size. Results of two or three test samples may be lower than the compliance limit, but the question always arises, "How good is good enough?" For example, if the government compliance limit is 80 g and the tests result in an average of 60 g, is this value good enough to establish that the government requirement will be met in a subsequent test? Compliance prediction methods will be discussed to determine the risk of not meeting the government mandated requirement, or the company's performance objective requirement, based upon successful test results.
(1) Single test compliance (Normal distribution)
Given that X_1, X_2, ..., X_n are n independent variables from a normal distribution with mean μ and standard deviation σ, the confidence (C) that the next test X_{n+1} will be less than the compliance limit will be presented for two cases (σ unknown and σ known). If some historical test data on similar designs exist, then a variation estimate can be made (i.e., σ is known).
(2) Population compliance (Normal distribution)
Given that X_1, X_2, ..., X_n are n independent variables from a normal distribution with mean μ and standard deviation σ, the confidence (C) that at least a given specified proportion of the population complies with the government requirement will be discussed for two cases (σ unknown and σ known).
(3) Population compliance (Weibull distribution)
When the population is not normally distributed, the Weibull distribution is considered. The confidence (C) that at least a given specified proportion of the population complies with the government requirement will be discussed when the Weibull slope β (shape parameter) is known.
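As a rough illustration of case (1) with σ known, one common way to formalize the confidence uses the fact that X_{n+1} minus the sample mean is normal with mean zero and variance σ^2 (1 + 1/n). The sketch below applies that to the 80 g limit and 60 g average quoted in the abstract; the sample size of 3 and the value σ = 10 g are assumptions made only for this example, and the formula is not necessarily the one used by the authors.

```python
import math
from statistics import NormalDist


def single_test_confidence(sample_mean, n, sigma, limit):
    """Confidence that the next test falls below `limit`, with sigma assumed known.

    Uses X_{n+1} - mean(X) ~ Normal(0, sigma^2 * (1 + 1/n)).
    """
    predictive_sd = sigma * math.sqrt(1.0 + 1.0 / n)
    return NormalDist().cdf((limit - sample_mean) / predictive_sd)


# Limit and average from the abstract; n and sigma are assumed values.
confidence = single_test_confidence(sample_mean=60.0, n=3, sigma=10.0, limit=80.0)
print(f"confidence = {confidence:.3f}")
```

When σ is unknown, the same ratio computed with the sample standard deviation would be referred to a Student t distribution with n - 1 degrees of freedom instead of the standard normal.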
Various Statistical Methods for the Analysis of Chest Crash Data Lan Wang, Richard Banglmaier and Priya Prasad, Ford Motor Company
The conventional method to analyze crash data is the binary logistic model. This talk will discuss other statistical methods that can be used. One method is empirical: maximum likelihood estimates of the probability distribution. Since the data set is censored, parametric survival models can be used. Other methods we discuss include the confidence method and the median rank method. We will show the resulting distributions and consider the similarities and differences among the various distributions.
Statistical Evaluation of Staircase Fatigue Data-Arithmetic, Dixon and Mood & Probit Calculations Andy Shea*, Xuming Su, John Lasecki, John Allison, Ford Motor Company
The staircase testing method is a common testing procedure used for sensitivity testing. In particular, it is frequently used in defining the fatigue strength of materials used in engineering structures. Literature published in 1948 described the procedure for obtaining and analyzing the data from this test method. While a common test procedure has generally been followed for obtaining the data, a common analysis method has not been followed. This presentation will focus on analysis of staircase fatigue testing of a cast aluminum alloy. The differences in the mean and standard deviation calculated using three analysis methods will be discussed.
Welding Application of Response Surface Designs in Automotive Manufacturing Research Karry Roberts, Ford Motor Company
Advanced statistical experimental designs, such as the Box-Behnken design illustrated in this paper, provide an efficient and comprehensive evaluation of manufacturing technology research projects. In manufacturing, there are often situations where understanding of variables with 3 levels and their interactions is of primary interest. The data-driven results lead the engineers to the technical decisions needed in the development of new technologies. This article provides details on the optimization of process variables for a bracket welding application for automotive body structures. A 46-run Box-Behnken response surface design was utilized to study a proposed range of process settings for 5 key variables for a new manufacturing welding process, which attaches a bracket to sheet metal. The drawn-arc welding method butt welds the bracket directly to the base material, eliminating the flange and providing savings in piece price and weight. The method simplifies the tooling, makes installation a one-step process, and allows one-sided access, which are all critical to manufacturing operations.
The complete testing strategy included evaluation of 644 samples. This report includes the statistical analysis of the tensile strength and cross sectional area of the weld, with analysis of average effects for the response variables as well as assessments of the variability across sample groups. Results indicated a robust process, within the ranges of the factors studied, hence validating the required quality level for the process settings that were recommended.
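The run count quoted above is consistent with the usual construction of a five-factor Box-Behnken design: a 2^2 factorial in each pair of factors with the remaining factors at their center level, plus center runs. The sketch below builds such a design in coded units; the choice of six center runs is an assumption (a common default that yields 46 runs), since the abstract does not say how the 46 runs were split.

```python
from itertools import combinations, product


def box_behnken(num_factors, num_center=6):
    """Coded (-1, 0, +1) Box-Behnken runs built from all factor pairs.

    For each pair of factors, run a 2^2 factorial in that pair while holding
    every other factor at its center level, then append the center runs.
    """
    runs = []
    for i, j in combinations(range(num_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * num_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * num_factors] * num_center)
    return runs


design = box_behnken(5)
print(len(design))  # 10 pairs * 4 runs + 6 center runs = 46
```

Every factor appears at only three levels, which matches the interest in 3-level variables and their interactions mentioned in the abstract.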
Wavelet-Based Functional Data Analysis in Automotive Water Pump Dynamic Seal Test Development Baocheng Sun, Ford Motor Company
A better understanding of water pump dynamic seal noise factors is critical to developing a good validation test for new product development. In this paper, a DOE approach is used to study the impact of different noise factors on dynamic seal performance. The critical parameters for seal performance are the radial surface trace and circumferential surface trace, which can be quantified as different types of failure modes. Since both surface traces are functional curves, wavelet-based functional data analysis is applied in this paper for dimension reduction. The challenge is to extract key features from these seal surface traces and to combine them into one parameter for each kind of failure mode in our seal test development. A wavelet transform, the Haar transform, is proposed to extract key features from each seal surface trace because Haar coefficients have a specific interpretation. The physical interpretation of each Haar coefficient is discussed in terms of seal failure mode. A combined index based upon Haar coefficients is proposed for ease of data analysis. This index is highly indicative of seal failure mode, which greatly facilitates our data analysis. The result from this study is being used to develop a complete set of validation tests for seal supplier competition for a new engine program.
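To show why Haar coefficients are easy to interpret, the following minimal sketch computes a full orthonormal Haar decomposition of a trace sampled at 2^k points. Each detail coefficient is a scaled difference between adjacent local averages, so a large coefficient marks a step in the trace at a known location and scale. The function name and the example trace are hypothetical; the combined index proposed in the paper is not reproduced here.

```python
import math


def haar_transform(signal):
    """Full orthonormal Haar decomposition of a signal of length 2^k.

    Returns [overall approximation, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients: scaled differences between neighbours.
        level_details = [(a - b) / math.sqrt(2.0) for a, b in pairs]
        # Approximation coefficients: scaled local sums.
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
        # Coarser levels are placed in front of finer ones.
        details = level_details + details
    return approx + details


# Hypothetical trace with a single step halfway along its length.
coefficients = haar_transform([0.0, 0.0, 1.0, 1.0])
```

For this step-shaped trace only the overall approximation and the coarsest detail coefficient are non-zero, which is the kind of compact summary that makes the transform useful for dimension reduction.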
CONTRIBUTED SESSION 2B: DESIGN OF EXPERIMENTS I DAVIDSON HALL, ROOM D1210
Testing Multiple Dispersion Effects in Fractional Factorial Designs Richard McGrath*, Bowling Green State University, Dennis K.J. Lin, Pennsylvania State University
In two-level fractional factorial designs, the assumption of constant variance is commonly made. When the variance of the response differs between the two levels of a column in the effect matrix, that column produces a dispersion effect. In this paper we show that two active dispersion effects may create a spurious dispersion effect in their interaction column. Most existing methods for dispersion effect testing in unreplicated fractional factorial designs are subject to these spurious effects. We propose a method of dispersion effect testing based on geometric means of sample variances. This test may be applied to both replicated and unreplicated designs.
Experimental Design and Validation of Simplified Models for Prediction Grace Montepiedra*, Bowling Green State University, Francis Pascual, Washington State University
Suppose that interest is in the prediction of the response over a specified region of interest. We explore experimental design conditions and the nature of the "true" model such that a simpler model provides a better estimator in the sense of the mean squared error. Following the rationale provided by Toro-Vizcarrondo and Wallace [J. Amer. Statist. Assoc. (1968):558-572], as well as previous related work, we introduce new concepts which we call the model validity region and the optimal model validity region. Based on these concepts, we introduce a new optimal design criterion that measures the extent of robustness of a design to simplified models. Simple examples in polynomial regression and multifactor experiments are given to illustrate the ideas proposed. We also provide a strategy that will enable the researcher to apply the concepts discussed, by using the hypothesis test proposed by Toro-Vizcarrondo and Wallace.
Factor Assignment of L18 Array Kenny Ye, State University of New York-Stony Brook
The L18 array is one of the most popular designs among experimenters. The problem of factor assignment arises in two situations: 1. The number of factors in an experiment is less than the number of columns of the array; 2. Some factors in an experiment are quantitative and some are qualitative. Poor factor assignment plans can result in less efficient modeling. This paper presents the optimal factor assignment plans for L18 based on a generalized minimum aberration criterion.
Minimum G-aberration Design Construction and Design Tables for 24 Runs Debra Ingram*, Arkansas State University, Boxin Tang, University of Memphis
Criteria for identifying the best regular fractional factorial designs are well known and design tables containing the best designs are readily available in the literature. Non-regular orthogonal fractional factorial designs can be constructed for run sizes that are multiples of four, and design tables containing the best nonregular designs are an important addition, providing experimenters run size flexibility. The minimum G-aberration criterion and its relaxed version, minimum G2-aberration, provide systematic criteria for assessing the goodness of nonregular designs. Minimum G-aberration designs of 12, 16, and 20 runs were obtained by a complete search of Hadamard matrices in Deng, Li, and Tang (2000) and Deng and Tang (2001). Simple algorithms based on forward selection or backward elimination of columns of Hadamard matrices have been studied as solutions to the problem of how to construct larger designs that have minimum G-aberration. Applications to the small run sizes for which complete search results are available revealed that the potential for obtaining minimum G-aberration designs by such algorithms is excellent. This paper discusses improvements that have led to the more sophisticated algorithms that have now been implemented in the construction of minimum G-, G2- and G4-aberration designs of 24 runs and 28 runs. Design tables containing these new designs are presented.
Graphical Method to Identify the Significant Sum of Squares in Factorial Experiments Suha Sari, Western Michigan University
A graphical approach is proposed to identify significant sum of squares of factor effects obtained from mixed level factorial experiments. This method can be used to give preliminary results for modeling the response with respect to the factorial effects. Estimation of experimental error is also available from the graphical result.
TUESDAY, MAY 21 10:50 - 12:30
INVITED SESSION 2C: APPLICATIONS OF STATISTICAL COMPUTER EXPERIMENTS IN AUTOMOTIVE INDUSTRY ASSEMBLY HALL, HALE AUDITORIUM
Using Statistical Computer Experiments as Part of a Math Model Validation Process - John Cafeo*, Jian Tu, General Motors R&D Center
Improved computational models and simulations offer the potential to reduce design, development and manufacturing time and cost by providing better predictions, and a more complete exploration of system and process design space. However, these models and simulations will not achieve this potential unless credible uncertainty statements about their predictions accompany them. Key to attaining this goal is a model validation process that permits comparison of computational predictions to experimental outcomes over a meaningful set of validation experiments, a process that will provide confidence in the predictive capability of the computational model. The ultimate goal is to design, conduct and analyze validation experiments so as to be able to credibly make statements such as:
"Based on our analysis of the validation experiments, we are 80% confident that if we measured an actual system, it would fall within these bands about the output of the model we just ran".
In this talk, a general validation strategy will be proposed. Both field and computer experiments play a prominent role. An academic example problem will be presented to illustrate the procedure. The results from a spot-weld model validation procedure will also be presented.
Statistical Applications in Complex Engineering Simulation Models: Some Challenges - Agus Sudjianto, Ford Motor Company
During the early phase of product development, engineers today rely on computer simulation models to guide their decision making process. These computer simulation models represent complex engineering phenomena, and may be constructed using one-dimensional models from "First Principles" in physics or complex three-dimensional models such as Computational Fluid Dynamics (CFD) and Finite Element. In this talk, several examples of these simulation models (e.g., CFD, structural durability, multi-body dynamics, and Noise Vibration & Harshness) are presented to illustrate the challenges of applying statistical methods. These simulation models share common characteristics: they are highly non-linear, they involve large numbers of input and output variables, and they are computationally intensive. These characteristics pose a significant challenge in applying statistical methods to provide timely and accurate (i.e., practical) information to design engineers during the design process. Several statistical challenges in the following areas are discussed: model correlations, design of computer experiments, sensitivity analysis, analysis of functional response, multi-response optimization, and robust design optimization.
Changing the Quality Paradigm - Mary Fortier, General Motors Corporation
The pace, scale and scope of the world economy are staggering. Producers recognize the need to maintain and grow their share of the marketplace. Customers demand products that meet and exceed their expectations. Companies who wish to extend their business know their customers must insist on their brands. Global forces drive the need to offer quality products at competitive prices. Combined, these attributes constitute business strategies that align innovative product design within lean cost structures. Corporate leaders understand the factors that directly support customer satisfaction and loyalty. These include reliable products that meet their intended performance and conform to requirements. Visionary corporate leaders understand that quality is imperative for long-term growth. They know the benefits reach beyond satisfied customers by improved productivity, reduced cycle time, enthused employees and lower costs, true characteristics of a successful business. However, visionaries also recognize that variation can stand in their way. Applied systematically, Robust Design, the ability to desensitize a product or process to the effects of variability, will improve quality levels and provide a stimulus for growth and a backbone for success.
It is necessary to design quality into products early in the development cycle. This is a paradigm shift which leaders, engineers, and key support areas must realize and implement. Upstream Robust Design provides this actionable process. This presentation will discuss some mechanisms useful to drive change from deterministic thinking to systems and statistical thinking. The Computer Aided Engineering community will be emphasized with examples of math-based Robust Design.
INVITED SESSION 2D: PROCESS MONITORING ASSEMBLY HALL, MICHIGAN ROOM
The Robustness of the Multivariate EWMA Control Chart - Zachary G. Stoumbos*, Rutgers University, Joe H. Sullivan, Mississippi State University
We investigate the effects of non-normality on the statistical performance of the multivariate exponentially weighted moving average (MEWMA) control chart, and its special case, the Hotelling's chi-squared chart, when applied to individual observations to monitor the mean vector of a multivariate process variable. We show that the chi-squared chart is highly sensitive to non-normality. We argue that the performance is most sensitive to departures from multivariate normality with individual observations (subgroups of size one). We show that with individual observations, and therefore, by extension, with subgroups of any size, the MEWMA chart can be designed to be robust to non-normality and very effective at detecting process shifts of any size or direction, even for highly skewed and extremely heavy-tailed multivariate distributions.
A Unified Approach for Process Control and Monitoring - Kwok Tsui*, Georgia Institute of Technology, Alexopoulos, D. Goldsman and W. Jiang
According to W. Shewhart, process variation can be classified into assignable cause and common cause variations. The assignable cause variation can be eliminated by statistical process control (SPC) methods through identification and elimination of the root cause of the process shift. The common cause variation is inherent in the process and is generally difficult to reduce by SPC methods. However, if the common cause variation can be modeled by an autocorrelated process and physical variables exist to adjust the output, the common cause variation can be reduced by automatic process control (APC) methods through feedback/feedforward controllers. As suggested by Box and Kramer (1992), integration of SPC and APC methods can result in major improvement in industrial efficiency, which is described as algorithmic statistical process control (ASPC) by Vander Wiel et al. (1992).
Integration of SPC and APC involves the problem of detecting parameter changes from autocorrelated processes. Traditional SPC control charts have been shown to have poor performance in monitoring and controlling such processes (Harris and Ross 1991; Alwan 1992). Alwan and Roberts (1988) propose the special cause chart (SCC) to rectify this problem. Their idea is to "whiten" the autocorrelated process by subtracting the one-step-ahead minimum mean squared error (MMSE) prediction, and then monitor the residuals (or the prediction errors) with traditional control charts. Jiang et al. (2001) propose to replace the MMSE predictor by the commonly used proportional-integral-derivative (PID) controller with subsequent monitoring of the prediction errors; the resulting charts are denoted as PID charts. On the other hand, Zhang (1998) points out that it is quite efficient to apply the exponentially weighted moving average (EWMA) chart to monitor autocorrelated processes. Jiang et al. (2000) extend the EWMA chart to a general class of control charts based on the autoregressive moving average transformation, called ARMA charts.
There are close relationships among the modeling and monitoring methods of MMSE controllers, PID controllers, EWMA charts, SCC charts, PID charts, and ARMA charts. The objective of this paper is to develop a unified approach to model the relationships among these well-known methods in SPC, APC, and integrated SPC/APC.
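The "whitening" idea behind the special cause chart can be sketched for the simplest case of an AR(1) process with known parameters: subtract the one-step-ahead MMSE forecast from each observation and apply ordinary Shewhart limits to the residuals. The AR(1) model, the known coefficient, the 3-sigma limits, and all function names below are assumptions made for illustration; this is not the unified approach developed in the paper.

```python
import random


def scc_residuals(series, phi, mean=0.0):
    """One-step-ahead prediction errors for an AR(1) model with known phi."""
    residuals = []
    for prev, curr in zip(series, series[1:]):
        prediction = mean + phi * (prev - mean)  # MMSE forecast under AR(1)
        residuals.append(curr - prediction)
    return residuals


def shewhart_signals(residuals, sigma, k=3.0):
    """Indices of residuals falling outside the +/- k*sigma limits."""
    return [i for i, e in enumerate(residuals) if abs(e) > k * sigma]


def simulate_ar1(n, phi, sigma, shift_at=None, shift=0.0, seed=1):
    """AR(1) noise plus an optional step shift in the mean starting at `shift_at`."""
    rng = random.Random(seed)
    noise, out = 0.0, []
    for t in range(n):
        noise = phi * noise + rng.gauss(0.0, sigma)
        level = shift if shift_at is not None and t >= shift_at else 0.0
        out.append(noise + level)
    return out


# Simulated process with an assumed large step shift at time 50.
data = simulate_ar1(100, phi=0.7, sigma=1.0, shift_at=50, shift=10.0)
signals = shewhart_signals(scc_residuals(data, phi=0.7), sigma=1.0)
```

Note that residual i corresponds to observation i + 1. After the first post-shift observation the forecast absorbs a fraction phi of the shift, so later residuals carry only (1 - phi) times the shift, which is one reason residual-based charts can be slow to detect small shifts in positively autocorrelated processes.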
The Changepoint Model for Statistical Process Control - Douglas M. Hawkins*, Peihua Qiu, and Chang Wook Kang, University of Minnesota
Statistical process control requires statistical methodologies that detect changes in the pattern of data. One such method is the changepoint formulation. Although it has long been recognized as a Phase I analysis tool, we argue that it is also highly effective in allowing the user to progress smoothly from the start of Phase I data gathering right through Phase II SPC. While not quite as powerful as an optimally-tuned cusum chart, the changepoint formulation has a much greater robustness of performance than does the cusum and is close to optimal over the whole range of possible shifts.
CONTRIBUTED SESSION 2C: RELIABILITY ANALYSIS ASSEMBLY HALL, WOLVERINE ROOM
The Random Fatigue-Limit Model in Multi-Factor Experiments - Francis Pascual, Washington State University
The Random Fatigue-Limit (RFL) model discussed by Pascual and Meeker (1999) was motivated by the need to describe fatigue life of certain materials. The RFL model describes the relationship between lifetime (measured in cycles) and a single factor, often the applied stress on the material. It has been found to be particularly useful for modeling high-cycle fatigue (HCF), that is, lifetimes that exceed 10 million cycles. Outputs from this model can be used as input for system models for lifetimes of jet engines for design purposes. In such applications, cycles are defined in terms of engine revolutions. Understanding how well the parts hold up as a function of stress and other environmental variables would be valuable to engineers designing new and better engines. Thus, the RFL model needs to be adapted to situations when there are several factors that affect product life. In this article, we investigate how the RFL model can be extended from dependency on stress to dependency on stress and other explanatory variables. We fit the proposed model extensions to available data and describe methods to assess the adequacy of the fits.
Applying Statistical Models to Accelerated Life Testing - Song Miao*, Chun Lei, Raj Cornelius, Alex Klajic, Finisar Corporation
Traditional "Life Data Analysis" involves the analysis of times-to-failure data obtained under "normal" operating conditions in order to quantify the life characteristics of the product, system or component. In many situations, and for many reasons, such life data is very difficult to obtain. The reasons for this difficulty can include the long lifetimes of today's products, the short time period between design and release, and the challenge of testing products that are used continuously under normal conditions. Given this difficulty, and the need to observe failures of products to better understand their failure modes and their life characteristics, people have attempted to devise methods to force these products to fail more quickly than they would under normal use conditions. In other words, they have attempted to accelerate their failures. Many products can be life tested at high stress conditions to yield failures quickly. Analysis of data from such accelerated life testing yields valuable information on product life at design conditions (low stress).
In the fiber optics industry, products normally have long lives. In order to better understand product failure modes and life characteristics, accelerated life testing becomes very important.
This paper presents the development of the statistical and experimental basis of accelerated life testing. Weibull and lognormal distributions have been used to establish these statistical models. Product life prediction and qualification tests have also been established based on the accelerated life testing models.
Nonstandard Asymptotics in a Modulated Gamma Process - Nibedita Bandyopadhyay*, Ananda Sen, Oakland University
Failure data from a repairable system is commonly modeled by a nonhomogeneous Poisson process (NHPP). A modulated gamma process arises as a generalization of an NHPP, where the observed failure epochs correspond to every successive kth event of the underlying Poisson process. We focus on a special class of modulated gamma process, called a modulated power law process (MPLP), that assumes a Weibull form of intensity function. The traditional power law process is a stochastic formulation of certain empirical relationships, often observed in industrial experiments. MPLP retains this underlying physical basis and provides a more flexible modeling environment. Curiously, the MLEs of the MPLP are asymptotically normal with a singular variance-covariance matrix, making the derivation quite non-standard. A set of simple closed-form estimators is proposed that are asymptotically equivalent to the MLEs. Performances of the estimators in small samples are compared and contrasted by means of extensive numerical simulation.
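The construction described above, in which failures are every kth event of an underlying NHPP, can be simulated with a short Python sketch. It assumes the power-law cumulative intensity (t/theta)^beta and an integer thinning parameter kappa; these parameter names and the time-transformation approach are assumptions made for illustration and are not taken from the talk.

```python
import random


def simulate_mplp(n_failures, beta, theta, kappa, seed=0):
    """Simulate failure times of a modulated power law process.

    Events of a unit-rate Poisson process are mapped through the inverse of
    the cumulative intensity (t / theta) ** beta, and only every kappa-th
    event is recorded as an observed failure.
    """
    rng = random.Random(seed)
    s = 0.0  # arrival time on the unit-rate (operational) time scale
    failures = []
    for i in range(1, n_failures * kappa + 1):
        s += rng.expovariate(1.0)
        if i % kappa == 0:
            failures.append(theta * s ** (1.0 / beta))
    return failures


if __name__ == "__main__":
    times = simulate_mplp(n_failures=5, beta=1.5, theta=10.0, kappa=2, seed=3)
    print([round(t, 2) for t in times])
```

With kappa = 1 the sketch reduces to the ordinary power law process, which makes it a convenient check for small-sample simulation studies of the kind mentioned in the abstract.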
Statistical and Engineering Concerns in Estimating POD Functions Economically - Peter Hovey*, Alan Berens, University of Dayton
Nondestructive inspections (NDI) have become an important tool in maintaining the structural integrity of both military and civilian aircraft. A key measure of the effectiveness of NDI in the damage tolerance programs is the probability of detection (POD) as a function of crack length. This talk will briefly review the history of methods used to estimate the POD function, the current standard methods used by the US Air Force, and the new concepts under study that may allow more economical estimation of the POD function. A special emphasis will be placed on the sampling issues in trying to conduct more economical POD studies while maintaining accuracy.
Empirical Likelihood Methods for Comparison of Survival Functions - Yi-Chuan Zhao*, Ian McKeague, Florida State University
The use of empirical likelihood in survival analysis was initiated by Thomas and Grunkemeier (1975) who derived pointwise confidence intervals for the survival function. Since the breakthrough work of Owen (1988, 1990) the method has been applied to a variety of statistical problems. The goal of our research is to develop the approach for the comparison of survival functions for k-sample problems in survival analysis. We derive an empirical likelihood simultaneous confidence band for the ratio of two survival functions based on independent right-censored data. Earlier authors have studied such bands for the difference of two survival functions, but the ratio provides a more appropriate comparison in some applications, e.g., in comparing two treatments in biomedical settings. Our approach also works for the difference of two cumulative hazard functions. A test for equality of corresponding hazard functions is also constructed, and consistency against any fixed alternative is established. We develop a Monte Carlo simulation method to approximate the null distribution of the test statistic. Cumulative hazard ratios appear to be more tractable than ratios of survival functions or differences of cumulative hazard functions in the k-sample setting. However, the band for the ratio of survival functions is more stable and narrower than the band for the ratio of cumulative hazard functions. A goodness-of-fit test is developed for checking proportional hazards in k-sample problems. For the comparison of two distributions in the random censorship model (independent competing risks model without censoring), we construct empirical likelihood confidence bands for the ratio of the two cumulative hazards and the ratio of two survival functions. Goodness-of-fit tests for the Koziol-Green model and the equality of the corresponding hazard functions are also developed. We extend our approach to adjust for covariate effects.
All the corresponding results are established under quite general conditions. The proposed methods are illustrated with real data from a Mayo Clinic trial involving a treatment for primary biliary cirrhosis (PBC) of the liver.
CONTRIBUTED SESSION 2D: INDUSTRIAL APPLICATIONS I DAVIDSON HALL, ROOM D1210
An Application of Mixed-effect Models and Functional Data Analysis - Robert Kushler, Oakland University
The spray pattern produced by a paint gun is an important determinant of the quality of the resulting painted surface. Data from a test device which measures air flow patterns are used to illustrate and compare mixed-effect model and FDA approaches to the analysis of such data.
Chemical Identification using Bayesian Model Selection - Tom Burr*, Herb Fry, Brian McVey, Los Alamos National Laboratory, Eric Sander, National Nuclear Security Administration
We consider the problem of remote detection and identification of chemicals in a scene. We introduce a unique approach that uses some of an image's pixels to establish the background characteristics while other pixels represent the target for which we seek to identify all chemical species present. This leads to a generalized least squares problem in which we focus on subset selection to identify the chemicals thought to be present. Recent results in Bayesian model selection allow us to approximate the posterior probability that each chemical in the library is present by averaging the posterior probabilities of each possible subset. We present results using realistic simulated data for the case with 1 to 5 chemicals present in each target and compare performance to a hybrid forward and backward stepwise selection procedure using the F statistic.
Helping Sensory Analysts to Assess the Similarity of Food Products - Alexander MacRae, University of Birmingham
Sensory analysts are specialists in determining product attributes using panels of trained assessors. Though chemistry, physics or microbiology can tell us about some ways that food products differ, other important attributes such as pleasantness or the similarity of one odor to another can be measured only in this way. Probably no two batches of a product are ever completely identical, given sufficiently sensitive analytical methods, but even real differences may not be noticed by consumers. A company may know that raw materials recently delivered are in some respect different from usual---but is the resulting change in the finished product perceptible? The statistical training of sensory analysts is often limited. While significance tests are familiar, many practitioners have little awareness of other methods and when they want reassurance about the similarity of batches or products, they are often at a loss. A dedicated calculator and a supporting web site to facilitate the search for reassurance about similarity, using confidence intervals for the estimated probability of a difference being noticed, will be described.
SVD-based Structured Kernel Regression: A New Method for High-Dimensional Prediction - Z.Q. John Lu, National Institute of Standards and Technology
Recently singular value decomposition has been successfully applied in high-dimensional data analysis, such as information retrieval, feature extraction, and clustering/classification. In this paper, we propose a local version of the SVD technique, which may be more effective at finding nonlinear structures in high-dimensional data space. We have developed an SVD-based local polynomial prediction algorithm with the goal of applying it to multi-step prediction of noisy chaotic time series. The algorithm is shown to have the desirable denoising property as well as the ability to find the intrinsic nonlinear structure in arbitrary phase space reconstruction. Very encouraging results have been obtained on several well-known data sets and laboratory data.
A Case Study on Ordinal Responses - Rogelio Ramos*, Graciela Gonzalez, CIMAT
The goal of this study is to investigate models for ordinal experimental data with longitudinal structure and spatial dependencies. We present data from an agricultural experimental station on agave plants. The study consists of a large experiment on infection control of agave plants over time. It is of interest to compare different treatments taking into account the spatial dependencies due to the spreading infection.
TUESDAY, MAY 21 2:00 - 3:40
INVITED SESSION 2E: INFORMATION TECHNOLOGY ASSEMBLY HALL, HALE AUDITORIUM
Posterior Pareto Front Analysis: A New Method for Gene Filtering in Microarray Experiments - Al Hero, University of Michigan
Click Stream Analysis for Services Creation and Usability Assessment - Mark Hansen*, Bell Labs, Lucent Technologies, Elizabeth Shriver, Bell Labs, and Rituparna Sen, University of Chicago
The Web is a big place. The search engine google.com, for example, claims to have indexed over a billion separate Web pages. Given the large number of resources on the Web, sites are constantly competing to attract and retain new visitors. To improve a user's experience, it is now common for sites to customize content (generating pages based on a user's previous activities) and to consider creative caching and prefetching schemes to deliver content as quickly as possible. In both cases, a model for describing how visitors navigate a Web site can be invaluable.
Data to support this kind of modeling can come from either access logs for a particular Web site, or proxy access logs which record all the pages requested by users of a given gateway or ISP. We begin this talk by briefly describing a "data browser" that allows an informal assessment of Web site usability. We then develop a predictive model for the pages a user will visit, given some portion of their previous browsing history, and illustrate its application to a cache-based method for prefetching.
Next, given the scale of the Web, we routinely rely on search engines like google.com to find relevant resources. In the second half of this talk, we consider using proxy data to improve Web searching. We begin by describing a template for the searching task that allows us to extract "search sessions" in real time from a proxy. These sessions are then used to train a mixture model that organizes queries into clusters. An online version of the EM algorithm is developed that can be easily implemented in a proxy. The model is then applied both to impose a topic-like structure on search results and to improve the ranking of resources within topics.
On the Design of Efficient Simulations for Complex Queuing Networks - George Michailidis, Department of Statistics, University of Michigan
Using ideas from the field of experimental design, we develop a framework to efficiently simulate several performance measures of modern high-speed networks. The main objectives are to minimize the number of required simulation runs and to fit models to the derived simulated data that predict well the performance of the network in all regions of the input space.
INVITED SESSION 2F: RELIABILITY MODELING AND INFERENCE ASSEMBLY HALL, WOLVERINE ROOM
On Bayesian Inference for the Power Law Process for Repairable Systems Data - Ananda Sen, Oakland University
In analyzing failure data pertaining to a repairable system, perhaps the most widely used parametric model is a nonhomogeneous Poisson process with Weibull intensity, more commonly referred to as the Power Law Process (PLP) model. PLP is a stochastic formulation of certain empirical relationships between the time to failure and the cumulative number of failures, often observed in industrial experiments. Investigations of statistical inference for the PLP under a frequentist framework abound in the literature. The focus of this talk is to supplement those findings from a Bayesian perspective, which has thus far been explored to a limited extent in this context. Both estimation and future prediction are considered under traditional as well as more complex censoring schemes. Modern computational tools such as Markov Chain Monte Carlo are exploited efficiently to facilitate the numerical evaluation process. Results from the Bayesian inference are contrasted with the corresponding findings from a frequentist analysis, both from a qualitative and a quantitative viewpoint. The developed methodology is implemented in analyzing interval-censored failure data of equipment in a fleet of marine vessels.
Constrained Quadratic Spline as a Model of Cumulative Hazard Function - Vasiliy Krivtsov, Ford Motor Company
In parametric reliability analysis, it is customary to work with the integral of the hazard rate function, h(t), known as the cumulative hazard function, H(t). Conventional probability distributions (exponential, Weibull, Gumbel, lognormal, and normal) do not always provide enough flexibility to model H(t). Some do not fit the data well. Others extrapolate at physically unreasonable rates. This paper considers a quadratic spline with a single free knot as a model for the cumulative hazard function. The parameters of the spline and the location of the free knot are optimized so as to best fit the nonparametric estimate of the subject function.
Fusing Technological and Statistical Information Via Generalized Practical Bayes Estimators - Massimiliano Giorgio, Second University of Naples, Pasquale Erto, University of Naples Federico II
In this paper Generalized Practical Bayes Estimators
(GPBE) of the parameters of the Weibull reliability model are presented. The
main peculiarity of these estimators is that they make it possible to incorporate
directly into the estimation process the prior information that is truly available in the
most advanced technological fields. They are called Generalized since
their priors include all the priors used by the earlier Practical Bayes
Estimators (PBE) and the Modified Practical Bayes Estimators (MPBE). This new
feature makes it possible to better fit and control the prior uncertainty.
From a technical point of view, the approach adopted by GPBE can be considered a compromise between the pure Bayes and the Empirical Bayes approaches (the first uses completely specified prior distributions, the second uses prior distributions based on past experimental data). In order to use the GPBE it is necessary to anticipate the values of only those prior parameters into which the engineers' prior knowledge can really be converted, leaving the other parameters unspecified. In order to evaluate the performance of the GPBE in the case of small sample sizes, a large Monte Carlo study has been carried out (for both complete and censored samples). This study showed the good non-asymptotic properties of such estimators, which performed much better than the maximum likelihood ones.
INVITED SESSION 2G: STATISTICS IN ENGINEERING ASSEMBLY HALL, MICHIGAN ROOM
Some Statistical Problems in Optoelectronics - Kevin Coakley, National Institute of Standards and Technology
A NIST team of physicists, engineers and statisticians is developing methods and associated software for characterization of high-speed optoelectronic devices. A photodiode converts an optical signal into an electrical signal. This electrical signal is detected with a high-speed equivalent time sampling oscilloscope. Both the photodiode and oscilloscope have impulse response functions which distort the signal of interest. We wish to estimate the power and phase of the impulse response function of the photodiode up to 50 GHz. Due to time base distortion errors, the measured signal is not equally spaced in time. Moreover, repeat measurements of noisy signals drift in time and are jittered. I discuss methods for compensating for the effects of time base distortion, methods for aligning noisy signals and estimation of the variance of timing jitter noise.
Selective Assembly in Manufacturing: Statistical Issues and Optimal Strategies - David Mease*, Vijay Nair, University of Michigan, Agus Sudjianto, Ford Motor Company
Selective assembly is a cost-effective approach for high-precision assembly from low-precision components. The quality of an assembly in a manufactured product is often largely determined by the clearance between two mating components. Selective assembly can be used to reduce the deviation of this clearance from a target value without requiring reduction in the tolerances of individual components. In order to accomplish this, the individual components are binned into several classes prior to assembly, and the final assembly is manufactured by selecting components appropriately from the classes. The result is a high quality product manufactured from relatively inexpensive components.
In this talk, we study optimal strategies for selective assembly under various loss functions. Closed form expressions for optimal rules under L1 and L2 loss functions are obtained and conditions for existence and uniqueness of the rules are studied. A family of asymmetric loss functions is also examined. We present results for various component distributions and provide comparisons to traditional selective assembly procedures such as equal width and equal area partitioning. We also examine the benefits of allowing non-equal partition probabilities for the components and explore optimal selection for the distribution of one component given a distribution for the second component.
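The effect of binning on clearance variation can be illustrated with a small Python simulation. It is a sketch of the traditional equal-probability ("equal area") partitioning only, under the assumption that both component dimensions are independent standard normal deviations from nominal; the optimal partitioning rules from the talk are not reproduced, and the function name is hypothetical.

```python
import random
import statistics


def selective_assembly_clearances(n_parts, n_bins, seed=0):
    """Clearances (hole minus shaft) under equal-probability binning.

    Both components are sorted and cut into n_bins classes of equal size;
    parts are then mated at random within matching classes. With n_bins = 1
    this is ordinary random assembly.
    """
    rng = random.Random(seed)
    shafts = sorted(rng.gauss(0.0, 1.0) for _ in range(n_parts))
    holes = sorted(rng.gauss(0.0, 1.0) for _ in range(n_parts))
    size = n_parts // n_bins
    clearances = []
    for b in range(n_bins):
        lo = b * size
        hi = (b + 1) * size if b < n_bins - 1 else n_parts
        shaft_bin = shafts[lo:hi]
        hole_bin = holes[lo:hi]
        rng.shuffle(hole_bin)  # random mating inside the class
        clearances.extend(h - s for s, h in zip(shaft_bin, hole_bin))
    return clearances


if __name__ == "__main__":
    for bins in (1, 2, 4, 8):
        c = selective_assembly_clearances(4000, bins, seed=2)
        print(bins, "bins: clearance variance =", round(statistics.pvariance(c), 3))
```

The printed variances fall as the number of classes grows, which is the basic benefit described above; comparing other partitions (for example equal width) only requires changing how the class boundaries are chosen.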
Multistage Statistical and Engineering Process Control - Fugee Tsung, Hong Kong University of Science and Technology
As manufacturing quality has become a decisive factor in global market competition, quality control techniques such as Statistical Process Control (SPC) and Engineering Process Control (EPC) are becoming popular in industries. With advances in information, sensing, and data capture technology, large volumes of data are being routinely collected and shared over multiple-stage processes, which have growing impacts on the existing SPC and EPC methods. Thus, there is an urgent need for an effective quality control strategy for a multistage process combining SPC and EPC. However, some technical challenges, such as the eliminating of the "window of opportunity" for detection, the handling of the dynamics and autocorrelation structure, and the decomposing of the confounded incoming signals, need to be addressed to ensure high detectability and traceability. This research will tackle these unique issues due to SPC and EPC integration, and provide an effective approach to improve the detectability and traceability of monitoring and diagnosing a multistage process.
CONTRIBUTED SESSION 2E: APPLICATION OF STATISTICS IN AUTOMOTIVE INDUSTRY: COMPUTER EXPERIMENTATIONS AND OPTIMIZATIONS DAVIDSON HALL, ROOM D1210
Virtual Experimental Design for Front Wheel Alignment Variation Analysis - Shih-Chung Tsai*, Ronald Charleville, General Motors
High warranty cost for front suspension alignment is a big concern in current vehicle programs. The objective of this project is to apply the virtual experimental design (VED) process along with the computer simulation code ADAMS to study the root causes of front suspension alignment variation. More than 40 design parameters and associated tolerance specs were investigated in this study. In the VED, 6 alignment characteristics were treated as output responses. An integration software package, iSIGHT, was applied to design the DOE matrix and then automatically execute ADAMS simulations to generate output response data. All simulation data were stored in the iSIGHT database and then analyzed to identify critical design parameters for the front suspension alignment variation. The analyses show that a vital few factors contribute the majority of the total variation. Finally, the results and conclusions were presented to release and manufacturing engineers for quality improvement purposes.
A Method for the Determination of Statistical Strain-Life Curves - Christopher Williams*, Yung-Li Lee, John Rilly, DaimlerChrysler Stress Lab and Durability Development
In today's industry environment, the need to accelerate product development schedules requires that engineers actively engage simulation models to eliminate iterations from the design and test process. Much time is spent on the design and evaluation of component reliability and durability. If accurate descriptions of the fatigue behavior of materials are available, then simulation software such as nCode can be used to simulate component durability and thus provide insight into design flaws more quickly than can the design and test cycle.
A method is presented that facilitates the development of accurate statistical strain-life curves given experimental data from strain-controlled uni-axial fatigue tests. The method establishes a series of selection criteria that ensure that the data used in the statistical analysis are significant and representative of the material's true behavior. The method goes on to establish a procedure for the statistical analysis that ensures that each domain of the material's behavior is accurately represented. The result is an R50 strain-life curve that is then scaled to an R90C90 strain-life curve by incorporating a technique presented by Shen and Wirsching. This R90C90 curve represents a strain-life curve which encapsulates the behavior of 90 per cent of samples of a material and in which 90 per cent confidence exists. This R90C90 curve is then suitable for use in industrial simulation software.
Decomposition of Variable Importance in Probabilistic Design - Xiaoping Liu*, Liem Ferryanto, Ford Motor Company
The application of probabilistic design to practical engineering design is often hindered by a lack of precise statistical information about the design variables. Nevertheless, such an application is still considered advantageous for sensitivity analysis, which quantitatively identifies the order of importance of the stochastic variables. In this presentation, we explore and compare alternative ways of carrying out probabilistic sensitivity analysis by decomposing the effects of the probabilistic variables on the variability of the response variable. We suggest and evaluate the following three methods: Sobol Sensitivity Index (SSI), Extended Fourier Amplitude Sensitivity Test (EFAST), and Reliability Index Sensitivity (RIS). A toy example is used for benchmark comparison among the methods and a real automotive application demonstrates their usefulness in engineering design practice.
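As a rough illustration of the first of these methods, the Python sketch below estimates first-order Sobol indices by the pick-and-freeze Monte Carlo scheme. It assumes independent inputs uniform on [0, 1], and the linear test function is a toy chosen here for checking the estimator, not the benchmark from the talk. For y = x1 + 2*x2 the exact first-order indices are 0.2 and 0.8.

```python
import random
import statistics


def first_order_sobol(f, dim, n_samples, seed=0):
    """Pick-and-freeze estimate of first-order Sobol indices.

    Inputs are assumed independent and uniform on [0, 1]. Two sample
    matrices A and B are drawn; for input i, rows of A with column i taken
    from B share only x_i with the rows of B, so the covariance of the two
    outputs estimates the variance explained by x_i alone.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n_samples)]
    ya = [f(row) for row in a]
    yb = [f(row) for row in b]
    var_y = statistics.pvariance(ya + yb)
    indices = []
    for i in range(dim):
        acc = 0.0
        for row_a, row_b, y_a, y_b in zip(a, b, ya, yb):
            mixed = row_a[:i] + [row_b[i]] + row_a[i + 1:]
            acc += y_b * (f(mixed) - y_a)
        indices.append(acc / n_samples / var_y)
    return indices


if __name__ == "__main__":
    toy = lambda x: x[0] + 2.0 * x[1]
    print([round(s, 3) for s in first_order_sobol(toy, 2, 20000, seed=1)])
```

For an expensive engineering model, f would typically be a fitted response surface or another fast surrogate, since the scheme needs (dim + 2) * n_samples evaluations.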
Math-Based Statistical Quality Analysis on Cable-Drive Glass Guidance System - Chun-Liang Lin, General Motors
The cable-drive glass guidance system has been widely used as the glass lift mechanism of modern vehicles because of its low cost and light weight. The entire glass guidance system consists of four major subsystems: the metal panel (door in white), seal, glass, and regulator. Glass movement is driven by a steel cable powered by either manual effort or an electric motor, which provides the torque required to overcome friction within the regulator mechanism, friction between glass and seal, and the glass weight.
A sophisticated system-level Computer-Aided Engineering (CAE) model was established to analyze the glass guidance system. Through the virtual CAE model, analysts and designers can evaluate the design in an early stage of development process, which significantly reduces time and cost for late design changes and hardware tests. The established CAE model for glass guidance system has been shown to provide a high fidelity prediction for both system and subsystem performance.
The relationship between the system performance and the design parameters is established using Response Surface Methodology (RSM). The mean value of performance is approximated with a regression analysis using a quadratic model fitted to a central composite design. Variation of performance is then estimated using a Taylor expansion. Design optimization and robustness analysis are conducted afterward. In this paper, a brief introduction to the CAE model and details regarding the statistical quality analysis for the cable-drive glass guidance system are presented.
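The Taylor-expansion step can be sketched in Python as follows. The sketch assumes a first-order expansion with independent inputs, so that Var(y) is approximately the sum over i of (dy/dx_i)^2 * sigma_i^2 evaluated at the nominal design; the quadratic surface and its coefficients are invented for illustration and do not come from the glass guidance study.

```python
def numerical_gradient(f, x, h=1e-5):
    """Central-difference gradient of f at the point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def taylor_variance(f, nominal, sigmas):
    """First-order (delta-method) variance of f for independent inputs."""
    grad = numerical_gradient(f, nominal)
    return sum((g * s) ** 2 for g, s in zip(grad, sigmas))


def example_surface(x):
    """Hypothetical fitted quadratic response surface in two parameters."""
    x1, x2 = x
    return 2.0 + 3.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + 1.0 * x1 ** 2


if __name__ == "__main__":
    variance = taylor_variance(example_surface, [1.0, 2.0], [0.1, 0.2])
    print("approximate response variance:", round(variance, 4))
```

At the nominal point (1, 2) the gradient of the example surface is (6, -0.5), so the approximation gives 36 * 0.01 + 0.25 * 0.04 = 0.37. A robust design search would minimize this quantity subject to constraints on the predicted mean.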
Stochastic Optimization Application for Vehicle Structures - Yan Fu*, Steve Wang, Ford Motor Company, Urmila Diwekar, Kemal Sahin, Carnegie Mellon University
With the continuous improvement of powerful computers, vehicle structural designs have been addressed using computational methods, resulting in reductions in cost and time to develop new vehicles. Traditional simulation-based optimization generates deterministic optimal designs, which are frequently pushed to the limits of design constraint boundaries, leaving little or no room for uncertainties in modeling, simulation, and/or manufacturing imperfections. This paper presents an application of a new stochastic optimization method for vehicle side impact design. A nonlinear response surface model is employed to conduct this study. The main goal is to enhance the vehicle side impact performance while minimizing the vehicle weight under various uncertainties. The new algorithm alleviates the computational burden of excessive model evaluations by estimating the objective and constraint functions during the optimization process through a reweighting method. The efficiency and accuracy of this algorithm are demonstrated through a real-world vehicle safety design problem.
CONTRIBUTED SESSION 2F: DESIGN OF EXPERIMENT II DAVIDSON HALL, ROOM D1273
The Maximum Estimability Criterion for Fractional Factorial Designs - Xianggui Qu, University of Michigan
This talk introduces the maximum estimability (Maxest) criterion to address the problem of optimal factor assignment for any fractional factorial design. The Maxest criterion is a natural extension of the minimum aberration and the MaxC2 criteria for regular designs. It is a refinement of Webb's concept of resolution. The Maxest criterion is coding dependent, which distinguishes it from coding-independent criteria such as the generalized minimum aberration and the minimum moment aberration criteria. The Maxest criterion is used to classify the projections of some useful nonregular designs. Compared with other projective properties, such as the geometric and the hidden projection properties, the new classification is simpler and takes statistical modeling into consideration.
Fractional Factorial Designs that Maximize the Probability of Identifying Active Factors - Navara Chantarat*, Theodore Allen, Ohio State University
We use simulation to evaluate the abilities of fractional factorial designs to achieve model identification-related objectives. We compare various automatic approaches for model identification including alternatives to Daniel plots and Bayesian selection methods. The results motivate new classes of fractional factorials that directly maximize the power to identify effects under the assumption of the prior distribution on main effects and interactions from a standard textbook. Both new and established methods are illuminated by novel approaches to support experimental planning decision-making. The proposed methods are also illustrated by a real world case study.
Experimental Designs that Maximize the Expected Utility in the Robust Engineering Context - Waraphorn Ittiwattana*, Mikhail Bernshteyn, Theodore Allen, Anup Nair, Ohio State University
We describe our alternative approach to Taguchi's signal-to-noise ratios based on a direct application of utility theory. We also compare this approach to other alternatives in the literature. Then, we describe our alternative experimental plans to compound arrays derived from a so-called "recourse", utility-based formulation, i.e., a formulation that incorporates the two-stage decision process including choice of the experimental plan in the outside optimization and the choice of the best engineering system in the inside optimization. We compare the proposed method with summary measure-based alternative approaches from the literature and use an example to illustrate how each approach could be implemented.
Quality Loss Functions for Nonnegative Variables and Their Applications - Roshan J. Vengazhiyil, University of Michigan
Quality loss functions play a fundamental role in every quality engineering method. A new set of loss functions is proposed based on Taguchi's societal loss concept. Its applications to robust parameter design are discussed in detail. The loss functions are shown to possess some interesting properties and lead to theoretical results that cannot be handled with other loss functions.
k-Circulant Supersaturated Designs - Yufeng Liu, Angela Dean, Ohio State University
In the early stages of industrial experimentation on a product or process, a large number of factors is likely to have been identified as possibly having an influence on the response. It is quite common, however, that only a few of these actually have a substantial effect---a situation known as factor sparsity. The small number of active, or influential, factors can often be identified through a screening experiment. Various design strategies have been proposed for screening under factor sparsity, including the use of supersaturated designs. In this talk, a class of supersaturated designs called k-circulant designs for m=(2t-1)k factors and n=2t runs is explored. These designs are constructed from cyclic generators by cycling k elements at a time and adding a row of +1's to give n runs in total. The class of k-circulant designs includes some previously known E(s^2)-optimal designs as well as new designs which are more efficient in terms of model estimation under factor sparsity. We also examine and compare the projection D-efficiencies of k-circulant designs and make recommendations on the choice of designs.
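The construction described above can be sketched in Python as follows. The direction of cycling (a right shift) and the function names are assumptions made for illustration. The example generator is the familiar 12-run Plackett-Burman generator with signs reversed, used only as a k = 1 sanity check and not taken from the talk. The E(s^2) value is computed as the average squared inner product over all pairs of columns.

```python
from itertools import combinations


def k_circulant_design(generator, k):
    """Build an n x m two-level design from a generator row of length m.

    Row r is the generator cycled r*k places to the right, for r = 0, ...,
    n - 2 with n = m / k + 1; a final row of +1's completes the n runs.
    """
    g = list(generator)
    m = len(g)
    if m % k != 0:
        raise ValueError("generator length must be a multiple of k")
    n = m // k + 1
    rows = []
    for r in range(n - 1):
        shift = (r * k) % m
        rows.append(g[-shift:] + g[:-shift] if shift else list(g))
    rows.append([1] * m)
    return rows


def e_s2(design):
    """Average squared inner product over all pairs of design columns."""
    cols = list(zip(*design))
    m = len(cols)
    total = sum(
        sum(u * v for u, v in zip(cols[i], cols[j])) ** 2
        for i, j in combinations(range(m), 2)
    )
    return total / (m * (m - 1) / 2)


if __name__ == "__main__":
    pb_generator = [-1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1]
    design = k_circulant_design(pb_generator, k=1)
    print(len(design), "runs,", len(design[0]), "factors, E(s^2) =", e_s2(design))
```

With k = 1 and this generator the result is a saturated orthogonal design, so E(s^2) is zero; supersaturated cases arise for k of 2 or more, where a search over generators of length (2t-1)k would compare candidates by E(s^2) or by estimation efficiency.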
TUESDAY, MAY 21 4:10 - 5:10
PLENARY SESSION ASSEMBLY HALL, HALE AUDITORIUM
Support Vector Machines, Nonparametric Regression, and Boosting Trevor Hastie, Stanford University (with Ji Zhu)
The SVM techniques pioneered by Vladimir Vapnik have created a
growth industry in computer science and machine learning. Originally developed
as enhancements of the separating hyperplane for two-class classification, we
now have SVM versions of regression, principal components, time-series models,
..., and the crank is still turning.
In this talk we describe the SVM, and its use of "inner-product"
kernels to achieve flexible generalizations. We then view the SVM as the
minimization of a regularized error function in a reproducing kernel Hilbert
space of functions, with strong connections to the smoothing spline technology
of Grace Wahba. Our conclusion is that the SVM looks very much like
nonparametric logistic regression, without all the benefits such as a
multi-class generalization, and the interpretation of the fitted functions as
logistic transformations of class probabilities. We propose a modified version
of logistic regression, which we call the "Import Vector Machine",
which appears to inherit the good properties of both SVMs and logistic
regression. We illustrate these techniques on some examples, and make
connections with boosting, another popular machine-learning method for
classification.
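The resemblance between the SVM and logistic regression noted above can be seen by comparing the two loss functions on the margin y*f(x), with y coded as -1 or +1. The sketch below is illustrative only; the function names are chosen here, and the logistic loss is written as the binomial negative log-likelihood.

```python
import math


def hinge_loss(margin):
    """SVM hinge loss max(0, 1 - y*f(x)) as a function of the margin y*f(x)."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Binomial negative log-likelihood log(1 + exp(-y*f(x)))."""
    # Two branches keep the exponential from overflowing for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def class_probability(f):
    """Logistic transformation of a fitted function value f(x) into P(y = +1 | x)."""
    return 1.0 / (1.0 + math.exp(-f))


if __name__ == "__main__":
    for margin in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(margin, hinge_loss(margin), logistic_loss(margin))
```

Both losses decrease in the margin and grow roughly linearly for badly misclassified points. The hinge loss is exactly zero beyond a margin of 1, whereas the logistic loss never vanishes, and only the logistic fit carries the probability interpretation given by `class_probability`.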
WEDNESDAY, MAY 22 8:40 - 10:20
INVITED SESSION 3A: DESIGN OF EXPERIMENTS ASSEMBLY HALL, HALE AUDITORIUM
Projection Properties of Orthogonal Arrays - C.S. Cheng, University of California-Berkeley
In factor screening, often only a few factors among a large pool of potential factors are active. Under this assumption of effect sparsity, in choosing a design for factor screening, it is important to consider projections of the design onto small subsets of factors. I will present some recent results on hidden projection properties of certain orthogonal arrays. Applications of these results to the construction of supersaturated designs will also be discussed.
Fractional Split-Plot Experiments with Replicated Settings of Whole-Plot Factors - Eric Schoen*, TNO TPD, Delft, Netherlands, Derek Bingham, University of Michigan and Randy Sitter, Simon Fraser University
When it is impractical to perform the experimental runs of a fractional factorial design in a completely random order, restrictions on the randomization can be imposed. The resulting design is said to have a split-plot error structure. Similar to completely randomized fractional factorials, fractional factorial split-plot designs can be ranked using the aberration criterion. Techniques that generate the required designs systematically presuppose unreplicated settings of the whole-plot factors. In this talk, I use a cheese-making experiment to demonstrate the practical relevance of designs with replicated settings of the whole-plot factors. I present a systematic method to generate the required designs. A key element in the method is the splitting of whole-plots according to one or more sub-plot effects.
Selecting 2^(m-p) Designs Using A Minimum Aberration Criterion When Some Two-Factor Interactions Are Important - Boxin Tang, University of Memphis
We consider in this paper the problem of selecting appropriate 2^(m-p) designs when some two-factor interactions are important. Current methods in the literature select designs that permit estimation of the postulated model consisting of the main effects and important two-factor interactions, under the assumption that all the other effects are negligible. When the effects not in the postulated model are not negligible, they will bias the estimates of the effects in the model. To minimize the contamination of these nonnegligible effects on the model, this paper proposes and studies a minimum aberration criterion. We then discuss the application of this new aberration criterion to compromise plans. Finally, we examine how to search for the best designs according to the criterion and present some results for designs of 16 runs.
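For orientation, the standard minimum aberration criterion (not the modified criterion proposed in this paper) ranks 2^(m-p) designs by their word length pattern (A_1, ..., A_m), where A_j counts the words of length j in the defining contrast subgroup, and prefers designs that sequentially minimize A_3, A_4, and so on. A minimal sketch follows; the function name and the example generators are chosen here for illustration.

```python
from itertools import combinations


def word_length_pattern(generator_words, m):
    """Return [A_1, ..., A_m] for a regular 2^(m-p) design.

    Each generator word is a collection of factor indices in 0..m-1.
    Words are stored as bitmasks, so multiplying words is a bitwise XOR.
    """
    masks = []
    for word in generator_words:
        mask = 0
        for factor in word:
            mask ^= 1 << factor
        masks.append(mask)
    counts = [0] * (m + 1)
    # Every non-empty product of generators is a word in the defining relation.
    for r in range(1, len(masks) + 1):
        for subset in combinations(masks, r):
            product = 0
            for mask in subset:
                product ^= mask
            counts[bin(product).count("1")] += 1
    return counts[1:]


if __name__ == "__main__":
    # Hypothetical 2^(5-2) design with factors A..E indexed 0..4 and
    # generators D = AB and E = AC, i.e. defining words ABD and ACE.
    print(word_length_pattern([(0, 1, 3), (0, 2, 4)], m=5))
```

Two candidate designs can then be compared by comparing their patterns entry by entry, starting from the shortest word length.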
INVITED SESSION 3B: STATISTICAL METHODS FOR MANUFACTURING VARIATION REDUCTION ASSEMBLY HALL, WOLVERINE ROOM
Statistical Methods Driven by Engineering Models for Manufacturing Process Control and Variation Reduction - Jan Shi, University of Michigan
Economic globalization brings intense competition among manufacturing enterprises. The key to succeeding in this competitive climate is to respond rapidly to market changes and produce high-quality products at low cost. This demands new and efficient process control and management methodologies. On the other hand, with the development of computer and sensing technology, a massive amount of information about the manufacturing process (e.g. product design, process planning, in-process sensing information of process status, and product quality information) is now available. This presents a great opportunity to develop a new process management methodology that meets these requirements.
This presentation will introduce a model-based methodology for monitoring, diagnosis, and control of multistage manufacturing processes (MMP). A multistage manufacturing process involves more than one workstation or operation to produce products. The complexity of the process variations at each station and the propagation of variation along multiple stations make the fault diagnosis and product variation reduction of MMP a very challenging problem. In this research, a state space modeling approach is developed to describe the variation propagation and to link the product dimensional quality and the process faults. This model integrates the product design (CAD) information and the process planning (CAPP) information such as the fixture layout and workpiece setup at each stage. Based on this model, the diagnosability and root cause identification of the system are pursued by integrating advanced statistics with engineering knowledge. Besides the process-level modeling and diagnosis, station-level dynamic control strategies for vibration reduction will also be presented.
The presented math-based methodology not only provides the process monitoring capability as the traditional statistical process control does, but also goes beyond detection of process faults to fault isolation, root cause determination, and in-process compensation/action.
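A linear state space model of the kind described in this abstract is often written in the generic form below. This is a sketch under stated assumptions: the symbols x_k, u_k, y_k, A_{k-1}, B_k, C_k, w_k, and v_k are introduced here for illustration and do not appear in the abstract.

```latex
\begin{aligned}
x_k &= A_{k-1}\, x_{k-1} + B_k\, u_k + w_k, \\
y_k &= C_k\, x_k + v_k, \qquad k = 1, \dots, N .
\end{aligned}
```

In this reading, x_k is the accumulated dimensional deviation of the product after stage k, u_k collects the process faults at stage k (for example fixture errors), y_k is the measurement taken at stage k, and w_k and v_k are unmodeled disturbance and measurement noise. The matrices would be derived from the design (CAD) and process planning (CAPP) information, and root cause identification amounts to inferring which u_k are non-negligible from the observed y_k.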
Diagnosing Manufacturing Variation Via Blind Source Separation - Daniel Apley, Texas A&M University
In modern manufacturing processes, large quantities of multivariate measurement data are routinely available through automated in-process sensing. The data typically contain valuable information regarding the nature of each of the (possibly many) individual variation sources that contribute to the overall level of process variability. This work assumes that each source causes a distinct spatial variation pattern in the measurement data. It is argued that a suitable model for representing the variation patterns is of identical structure to one that is widely used in sensor array signal processing. Consequently, methods developed for these signal processing applications, termed blind source separation, can be used to identify spatial variation patterns in manufacturing data. Basic blind separation concepts and their applicability to diagnosing manufacturing variation are discussed.
Integrated Modeling and Analysis of Quality and Reliability for Multistage Manufacturing Processes - Jionghua (Judy) Jin, The University of Arizona
System reliability analysis of a manufacturing process should consider the effects of both manufacturing system component reliability and product quality. In general, there are significant and complex interactions between product quality and tooling reliability (QR-Interaction). Recent research efforts have been made to develop a new model, the QR-Co-Effect Model, to study this interaction and its impact on the system reliability of a manufacturing process. In multistage manufacturing processes (MMPs), the propagation of the QR-Co-Effect across all stages becomes even more complex: the degradation of manufacturing system components can cause the deterioration of the downstream product quality; at the same time, the system component reliability can be affected by the deterioration of the incoming product quality of upstream stations. This kind of quality and reliability interaction can be observed in many manufacturing processes such as machining processes, assembly processes, and stamping processes. However, there is no available model to describe this complex relationship between product quality and manufacturing system component reliability. In this research, considering the unique complex characteristics of MMPs, a new concept of quality and reliability chain (QR-Chain) effect is proposed to describe the complex propagation relationship of the interaction between manufacturing system component reliability and product quality across all stages. Based on this, a general QR-chain model of the system reliability in MMPs is proposed to integrate the product quality and manufacturing system component reliability information for system reliability analysis. This model can be further applied to system reliability evaluation and to the optimal integration of tool tolerance allocation and maintenance decisions for multistage manufacturing processes.
CONTRIBUTED SESSION 3A: APPLICATIONS IN INDUSTRY AND INFORMATION TECHNOLOGY ASSEMBLY HALL, MICHIGAN ROOM
Small and Large Time Scale Analysis of a Network Traffic Model Krishanu Maulik*, Sidney Resnick, Cornell University
Recent empirical studies of Internet and WAN traffic data have observed multifractal behavior at time scales below a few hundred milliseconds. There have been some attempts to model this phenomenon, but there is no model to connect the small time scale behavior with behavior observed at large time scales, above a few hundred milliseconds. There have been separate analyses of models for high speed data transmissions, which show that appropriate approximations to large time scale behavior of cumulative traffic are either fractional Brownian motion or stable Lévy motion, depending on the input rates assumed. This paper tries to bridge this gap and develops and analyzes a model offering an explanation of both the small and large time scale behavior of a network traffic model based on the infinite source Poisson model. Previous studies of this model have usually assumed that transmission rates are constant and deterministic. We consider a non-constant, multifractal, random transmission rate at the user level which results in cumulative traffic exhibiting multifractal behavior on small time scales and self-similar behavior on large time scales. Also, we model the file size and the transmission rate, which are more natural objects than the usually modeled transmission time and rate.
Estimating Network Internal Loss Using Bicast End-to-end Measurement Bowei Xi*, George Michailidis, Vijay Nair, University of Michigan
In this talk we examine the network tomography problem, whose objective is to recover the loss probabilities of all the internal links in a network by using measurements only on the edge nodes. This problem is of increasing importance because it is impossible to measure every single link in a large decentralized network. We introduce a new monitoring scheme, involving bicast measurements, and analyze its theoretical properties on networks with a tree topology. The link loss probabilities can be calculated from network traces using the EM algorithm. A Bayesian version of our approach will also be discussed.
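The idea of recovering internal link behavior from edge measurements can be illustrated on the smallest possible tree: one shared root link feeding two leaf links. The sketch below assumes that a bicast probe is a probe sent to a pair of receivers, that links drop probes independently, and it uses a closed-form moment estimator instead of the EM algorithm mentioned in the abstract. All names and the example pass rates are hypothetical.

```python
import random


def simulate_probes(n, pass_root, pass_left, pass_right, seed=0):
    """Simulate n probes on a two-leaf tree; return (left_seen, right_seen) pairs."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        root_ok = rng.random() < pass_root
        left = root_ok and rng.random() < pass_left
        right = root_ok and rng.random() < pass_right
        outcomes.append((left, right))
    return outcomes


def estimate_pass_rates(outcomes):
    """Estimate (root, left, right) link pass rates from leaf observations only.

    With independent links, P(left) = a0*a1, P(right) = a0*a2 and
    P(both) = a0*a1*a2, so a0 = P(left) * P(right) / P(both).
    """
    n = len(outcomes)
    p_left = sum(1 for left, _ in outcomes if left) / n
    p_right = sum(1 for _, right in outcomes if right) / n
    p_both = sum(1 for left, right in outcomes if left and right) / n
    if p_both == 0:
        raise ValueError("no probe reached both receivers; rates not identifiable")
    root = p_left * p_right / p_both
    return root, p_left / root, p_right / root


if __name__ == "__main__":
    data = simulate_probes(100000, pass_root=0.9, pass_left=0.8, pass_right=0.95)
    print(estimate_pass_rates(data))
```

The loss probability of each link is one minus its estimated pass rate. On larger trees the same kind of pairwise information is available for every pair of receivers, which is where a likelihood-based approach becomes useful.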
A Comparison of Dirichlet-Multinomial Model for Bayesian Information Retrieval to Probabilistic Retrieval Systems Based on Binary Independence Assumption Valeria Thompson*, I-Li Lu, University of Washington and the Boeing Company
In the probabilistic approach to information retrieval (IR) it is commonly assumed that query terms in a relevant document are stochastically independent. Together with the Probability Ranking Principle, these assumptions have led to the development of the binary independence model (BIM) for IR. Recently, due to its simplicity and self-updating property through a conjugate prior, Bayesian information retrieval (BIR) using the BIM has emerged and been demonstrated to be an empirically appealing process. In real world contexts, however, the assumption of conditional independence of query terms almost never holds. Many models and ideas have been proposed and attempted to circumvent this problem. Lu (2001, Boeing M&CT-TECH-01-005) developed a Bayesian retrieval model based on the Dirichlet-Multinomial (D-M) distribution without the notion of conditional independence. In this paper, we illustrate that the D-M model for BIR developed by Lu can easily implement the self-updating process of the BIM for BIR. We use simulation to compare the D-M model for BIR to binary independence models under different specifications for the initial prior parameters. Finally, we apply this model to actual data and compare the efficiency gained by adopting this model to existing BIMs.
Bayesian Inference for Multivariate Ordinal and Binary Data Earl Lawrence, University of Michigan
We consider situations where multiple sets of ordinal data are observed from the same experimental unit. We develop Bayesian inference using a multivariate probit regression model. An underlying latent variable framework is used to capture the dependence among the multivariate data. This model includes the special case in which one or more sets of the data are binary. By a judicious choice of priors, we are able to overcome some of the computational difficulties of other Bayesian approaches in the literature. An application to multivariate probe test data from integrated circuit fabrication will be described.
Developing a Model System for the Scrap Processing Problem under the Just-in-time Setting Tao Ding, University of Alberta
In this paper we study the scrap processing problem under the just-in-time setting. We establish a developing model system that integrates different statistical models for the scrap rate data under different situations through analysis of their stochastic properties for the MS/MP/DP case. We also derive a simple algorithm for the optimal ordering quantity and then discuss the related management efforts for the developing model system under Supply Chain Management environment. Further, we use a case study about a typical automobile company to illustrate the implementation of the system and design a DSS plug-in system, AOGS (Automatic Order Generating System) under the ERP environment. Finally, we point out some topics for future research in this area.
CONTRIBUTED SESSION 3B: GENERAL METHODS II DAVIDSON HALL, ROOM D1210
Conditional Independence and General Factorizations in Time Series Graphical Models - Rob Deardon*, Henry Wynn, University of Warwick, Peter Caines, McGill University
This work is a contribution to the recent research program fusing together Bayes graphical models and time series, that is, graphical models in which every node is a time series. The main task is to derive conditions for conditional independence. This paper concentrates on the stationary Gaussian case. Two cases are distinguished: global conditional independence, when two whole (past, present, future) time series X, Y are conditionally independent given a whole third series Z, and local conditional independence, in which the present of X and Y (at time t) are conditionally independent given the past of Z (time < t). A comparison is made between local and global conditions and computations are carried out for autoregressive processes. This work is then applied to data from complicated industrial processes (e.g. waste water treatment plants).
Robustifying the F Test and Bartlett's Test for Homogeneous Spreads James Fenwick, Millersville University
Two adjustments to the F test for variances are introduced which ameliorate the poor robustness properties of the test. A comparison of these adjustments to the classical F test and Levene/Brown-Forsythe's test shows a significant improvement for various sample sizes and distributions. One of the adjustments is extended to testing homogeneous variances in multiple samples and comparisons are made to Levene's and Bartlett's tests for equal variability.
A Test of Homogeneity for Two Multivariate Populations Maria Rizzo, Bowling Green State University
The classical tests of homogeneity, such as the two-sample Kolmogorov-Smirnov test, do not have a natural extension to comparing two multivariate populations. G. J. Szekely and N. K. Bakirov have proposed a new test based on Euclidean distance between sample elements. This test can be applied to testing homogeneity of any two d-dimensional multivariate populations with finite mean vectors, and the test is affine invariant and consistent. The talk will discuss the background of the new test, present the test implementation methods, and discuss applications. We show that a practical implementation of this test is possible via bootstrap for testing the composite hypothesis of equal distributions when both distributions are unspecified. Empirical results in the univariate case suggest that this test compares favorably with existing tests. The talk will also present empirical results for multivariate distributions.
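A distance-based two-sample statistic of the kind described above can be sketched as follows. The statistic is written in the usual energy-distance form built from average Euclidean distances between and within samples. The abstract refers to a bootstrap implementation; this sketch swaps in a permutation of the pooled sample as the resampling scheme, and the function names and defaults are chosen here for illustration.

```python
import math
import random


def _mean_distance(a, b):
    """Average Euclidean distance over all pairs (x in a, y in b)."""
    return sum(math.dist(x, y) for x in a for y in b) / (len(a) * len(b))


def energy_statistic(x, y):
    """(n1*n2/(n1+n2)) * (2*E|X-Y| - E|X-X'| - E|Y-Y'|) with sample averages."""
    n1, n2 = len(x), len(y)
    e = 2 * _mean_distance(x, y) - _mean_distance(x, x) - _mean_distance(y, y)
    return n1 * n2 / (n1 + n2) * e


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Approximate p-value for equal distributions by permuting the pooled sample."""
    rng = random.Random(seed)
    observed = energy_statistic(x, y)
    pooled = list(x) + list(y)
    n1 = len(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:n1], pooled[n1:]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    sample_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
    sample_b = [(rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(20)]
    print(permutation_pvalue(sample_a, sample_b))
```

Observations are tuples of equal length, so the same code applies in any dimension d. The cost of one statistic grows with the square of the pooled sample size, which matters when the number of resamples is large.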
The Center Similar Distribution and Its Applications in Generating Random Variables Zhenhai Yang*, Beijing Polytechnic University, W.K. Pang, S.H. Hou, Hong Kong Polytechnic University
In this paper, the center similar distribution (CSD) of a multivariate distribution is proposed based on the vertical density representation theory. We construct the family of CSDs related to the Gamma distribution and display its relation with the Dirichlet distribution. We can generate a random vector from the Dirichlet distribution by generating a random vector from the center similar distribution. We also give the algorithm for generating the center similar distribution. A random vector X_d has the center similar distribution if X_d = R U_d, where R and U_d are independent of each other, R is a random variable with non-negative values and U_d is a random vector with a uniform distribution on the set D_0 in R^d whose Lebesgue measure L(D_0) > 0. A CSD is a generalization of the spherical symmetric distribution. A random vector X_d has a spherical symmetric distribution if X_d = R V_d, where V_d is uniformly distributed on the surface of the unit sphere. For a CSD, when D_0 is the unit sphere, U_d has a uniform distribution on the body (not the surface) of the sphere. We also point out that they are equivalent.
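For the special case where D_0 is the unit sphere, the representation X_d = R U_d can be simulated directly. The sketch below is not the algorithm of the paper: it uses the standard devices of normalizing Gaussian vectors to obtain V_d on the surface and scaling by W^(1/d), with W uniform on (0, 1), to obtain U_d in the body. The function names and the `radial_sampler` argument are assumptions of this illustration.

```python
import math
import random


def uniform_on_sphere(d, rng):
    """Draw V_d uniformly on the surface of the unit sphere in R^d."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        if norm > 0.0:
            return [v / norm for v in z]


def uniform_in_ball(d, rng):
    """Draw U_d uniformly in the body of the unit sphere: U_d = W^(1/d) * V_d."""
    direction = uniform_on_sphere(d, rng)
    radius = rng.random() ** (1.0 / d)
    return [radius * c for c in direction]


def sample_csd(d, radial_sampler, rng):
    """Draw X_d = R * U_d, where radial_sampler(rng) returns a non-negative R."""
    r = radial_sampler(rng)
    return [r * c for c in uniform_in_ball(d, rng)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical radial law: R ~ Gamma(shape=2, scale=1).
    print(sample_csd(3, lambda g: g.gammavariate(2.0, 1.0), rng))
```

Because U_d = W^(1/d) V_d, the product R U_d equals (R W^(1/d)) V_d, which is one way to see the equivalence with the spherical symmetric representation noted at the end of the abstract.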
Canonical Correlation Analysis Based on Information Theory Xiangrong Yin, University of Georgia
Canonical correlation analysis (Hotelling, 1935, 1936) has long been used as a standard method in multivariate analysis, whose merit is simply to catch the linear relations between a p × 1 vector Y-set and a q × 1 vector X-set. However, it can fail when there is no linear trend between them and the relation is nonlinear. To overcome this drawback, here we develop a new canonical correlation method between the Y-set and the X-set, called informational canonical correlation, based on information theory. This method finds the most important relationships between the two sets, whether linear or nonlinear. It can recover nonlinear relations between the two sets if they exist. When the linear trend dominates, it should reduce to the equivalent of the usual canonical analysis. Thus it can capture more general information conveyed between X and Y. Another issue in canonical correlation analysis is determining the number of pairs of canonical variables, which typically relies on a normal distribution or a large sample. Here we simply suggest a permutation test for determining the pairs of informational canonical correlation variables, which requires no specific distributions for X and Y at all as long as one can estimate the densities. A comparison with the usual canonical correlation analysis is made and examples illustrating this new method are presented.
WEDNESDAY, MAY 22 10:40 - 11:50
INVITED SESSION 3C: TEST PLANNING AND MODELING FOR MILITARY SYSTEMS ASSEMBLY HALL, HALE AUDITORIUM
Shelf Life and the Monotone Change Problem - Michael Woodroofe, University of Michigan
Suppose that items are manufactured and placed in storage for future use (or possible use) and that the items may deteriorate over time. Thus, letting X denote the useful life of an item and F denote its distribution, the probability that an item fails to operate properly after t time units is F(t) = P[X < t]. Suppose next that periodic tests are performed in order to monitor the possible deterioration. In simple schemes m_i items are tested after t_i time units, where 0 = t_0 < t_1 < ···, and the proportion X_i of failures is recorded. Let p_i = F(t_i). Then 0 < p_0 < p_1 ··· and m_i X_i ~ Binomial(m_i, p_i), i = 0, 1, ···, are independent. In more complicated schemes m_i and even t_i can depend on earlier results, and the X_i are no longer independent. If the data are analyzed at time t_n, then the following statistical questions arise: Is p_n > p_0; that is, has deterioration occurred yet? Or, is p_n > p^0, where p^0 denotes an acceptable level (specified externally)? How should p_0, ···, p_n be estimated with special attention to p_n, the current value? If the p_i are modeled parametrically, say log(p_i) - log(1 - p_i)
=
+
Stochastic Models for System Design Fault Removal in Support of Operational Test Planning - Don Gaver*, Patricia Jacobs, Naval Postgraduate School, and Ernest Seglie, Department of Defense
Early-stage prototypes of single-shot multi-stage systems (missiles or rockets) and certain stage-wise repetitive-usage service (software, communications) systems are often found to be functionally deficient when subjected to acceptance test (Operational Test, or OT). Unanticipated design faults or failure modes are likely to appear. Re-design or "fault removal" is then conducted (a "fix") and re-test is conducted. Question: how to determine when to stop the Test-Fix-Test (TFT) process?
This talk addresses the above problem by probability modeling, and evaluates simple success-run stopping criteria: stop TFT after first achieving a run of r (e.g. 3, or 5) consecutive total system test successes (the SR criterion). The models evaluate the costs and benefits of the TFT system. The models represent more and different sources of inherent variability in the process than is usual: first, the effect of early-stage failure upon testing of later stages where failures are assumed to occur independently and identically for each system realization/copy; secondly, failure processes are individualized by copy, and also across presumed environmental variations. The results provide operational and budgetary information for system program executives, and OT planners.
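The success-run stopping rule can be explored with a much simpler model than the ones in the talk. The sketch below assumes every test succeeds independently with the same probability p, which ignores both fault removal between tests and stage-wise structure; under that assumption the expected number of tests until the first run of r consecutive successes is (1 - p^r) / ((1 - p) p^r). The function names are chosen here for illustration.

```python
import random


def expected_tests_until_run(p, r):
    """Expected number of i.i.d. tests until r consecutive successes (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return float(r)
    return (1.0 - p ** r) / ((1.0 - p) * p ** r)


def simulate_tests_until_run(p, r, rng):
    """Simulate one test sequence and return the number of tests performed."""
    run = 0
    tests = 0
    while run < r:
        tests += 1
        run = run + 1 if rng.random() < p else 0
    return tests


if __name__ == "__main__":
    rng = random.Random(0)
    for r in (3, 5):
        sims = [simulate_tests_until_run(0.9, r, rng) for _ in range(10000)]
        print(r, expected_tests_until_run(0.9, r), sum(sims) / len(sims))
```

Comparing the formula with the simulated average is a quick check before replacing the constant p with a success probability that improves after each fix.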
Discussant: Rob Easterling
INVITED SESSION 3D: WARRANTY ANALYSIS AND PREDICTION ASSEMBLY HALL, WOLVERINE ROOM
Text Analysis with Application to Warranty Claims - Mark Cavaretta, Ford Motor Company
It has been stated that 80% of corporate information is in the form of unstructured text. Traditional research in statistics and data mining has focused on structured data, typically stored in databases. With the dramatic increase in textual data, particularly from the web, there has been increased interest in text analysis. Text analysis is a broad field encompassing information extraction, information retrieval, and text data mining. Information extraction and retrieval are generally used to present information for human-centered pattern recognition. Conversely, text data mining uses machine learning algorithms to find patterns. This discussion will present examples of text analysis tools and technologies, including our research on the application of text data mining to warranty claims.
Early Detection of Reliability Problems Using Information From Warranty Databases - Huaiqing Wu*, William Meeker, Iowa State University
Most companies maintain warranty databases for purposes of financial reporting and warranty expense forecasting. In some cases, there are attempts to extract engineering information (e.g., on the reliability of components) from such databases. Another important application is to use warranty data to detect potentially serious field reliability problems as early as possible. When a serious problem arises, the existence of the problem will eventually be obvious. Early detection of serious problems through the use of sensitive statistical methods, allowing early action to mitigate potential reliability problems, could save large amounts of money and product goodwill. This paper describes a detection procedure that has been designed for this purpose. In addition to the statistical decision rules, we suggest graphical tools for illustrating and describing the particular information in the data that caused the potential problem to be flagged. The methods are illustrated using data from an automobile warranty database.
TECHNOMETRICS SESSION
ASSEMBLY HALL, MICHIGAN ROOM
A Filter Bank Approach for Modelling and Forecasting Seasonal Pattern Ta-Hsin Li*, Melvin Hinich, IBM T.J. Watson Research Center
A novel approach for modelling and forecasting seasonal time series is proposed. Unlike the traditional approach that depends solely on dynamic models, the proposed method combines stochastic dynamic modelling with an analysis filter bank designed to reduce dimensionality and to extract persistent components for reliable long-term forecasting. The filter bank decomposes the time series of interest into seasonal components, and only those components that are highly coherent across periods are selected for subsequent modelling and forecasting. Experiments show that under suitable conditions, the use of highly coherent components not only reduces the modelling complexity and the required amount of training data but also limits the impact of noise and occasional corruption in the training data and thus provides robust forecasts with reduced variability. Fourier and wavelet filter banks are discussed in detail. Simulated and real-data examples are used to illustrate the method.
CONTRIBUTED SESSION 3C: OPTIMIZATION AND RESPONSE SURFACE EXPLORATION DAVIDSON HALL, ROOM D1210
Possible Roles of Optimization in the Future of Experimental Planning - Theodore Allen, Ohio State University
A simple real world example is used to illustrate the current paradigm of experimental planning and the limitations of the decision support available to the growing number of novice users. Then, one vision is proposed of how simulation and optimization can be used to provide information and method options that experimental planners might desire. Trade-offs in conceptual clarity, method performance, and apparent complexity are discussed in addition to the implications for teaching experimental planning at various levels. While it is probably true that optimal plans will not make designs derived from combinatorial structures obsolete in the short run, it is likely that many approaches derived from realistic optimization formulations besides, e.g., Box-Behnken designs, will start to enjoy widespread usage.
Identifying Rising Ridge Behavior in Quadratic Response Surfaces Bruce Ankenman, Northwestern University
Canonical analysis is a common method for exploring and exploiting fitted quadratic response surfaces. Much attention in canonical analysis is given to identifying ridge behavior in these surfaces in order to achieve optimal response at minimum cost. However, little attention has been given to classifying the identified ridge as a stationary ridge or a rising ridge. Knowing whether a ridge is stationary or rising is critical for making decisions about how to continue the response surface exploration or for setting process parameters. This talk presents two methods that allow for identification, classification and confirmation of ridge behavior. The first method is based on linear regression and though easily implemented, can be imprecise. The second method is more precise and is based on a new parameterization of the canonical form.
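As background for the ridge discussion, the basic canonical analysis of a fitted two-factor quadratic surface y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2 can be sketched as follows. This is the textbook computation (stationary point and eigenvalues of the quadratic-coefficient matrix), not either of the two methods presented in the talk; the function name and the two-factor restriction are choices made here.

```python
import math


def canonical_analysis_2d(b1, b2, b11, b22, b12):
    """Return the stationary point and the eigenvalues of B for two factors.

    B = [[b11, b12/2], [b12/2, b22]] and the stationary point solves
    2*B*x = -b, i.e. x_s = -0.5 * inverse(B) * b.
    """
    off = b12 / 2.0
    det = b11 * b22 - off * off
    if det == 0.0:
        raise ValueError("B is singular: an exact ridge, no unique stationary point")
    xs1 = -(b22 * b1 - off * b2) / (2.0 * det)
    xs2 = -(b11 * b2 - off * b1) / (2.0 * det)
    centre = (b11 + b22) / 2.0
    half_gap = math.hypot((b11 - b22) / 2.0, off)
    return (xs1, xs2), (centre - half_gap, centre + half_gap)


if __name__ == "__main__":
    # Hypothetical fitted surface: y = 8.5 + 2*x1 - 2*x2 - x1^2 - 2*x2^2.
    print(canonical_analysis_2d(b1=2.0, b2=-2.0, b11=-1.0, b22=-2.0, b12=0.0))
```

An eigenvalue close to zero relative to the others indicates ridge behavior along the corresponding canonical axis. A common rule of thumb is that a stationary point inside the experimental region points to a stationary ridge, while one far outside it points to a rising ridge; formal classification is the subject of the talk.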
Fractional Polynomial Response Surfaces Steven Gilmour, University of London
Second order polynomial response surfaces have been widely used to model the relationship between a response variable and several quantitative factors. Sometimes the second order model does not fit well, while higher order polynomial models require many more runs and produce unexpected turning points on the fitted response surface. The Box-Tidwell method of transforming the factors provides a rich class of "fractional polynomial response surface models" for such cases, with just one additional parameter for each factor. Modern computational facilities mean that these models can now be used routinely. Examples will be given which show how the fractional polynomial response surface model can give a better fit than the second order model and a more parsimonious fit than higher order models. A routine method for fitting these models will be described and illustrated with application to the example. Finally, some ongoing work on designing experiments for fitting fractional polynomial response surfaces will be briefly described.
CONTRIBUTED SESSION 3D: INDUSTRIAL APPLICATIONS II DAVIDSON HALL, ROOM D1273
Random Discrete Latin Hypercube Sampling for Stochastic Simulations of Spot Welds - Xuru Ding*, Shrish Kale, General Motors Corporation
In performing random simulations using CEA models, the method of sampling is important. It affects the accuracy and the convergence speed of the mean and standard deviation estimates. This presentation discusses one special case: sampling of spot welding locations from potentially thousands of spot welds on a vehicle body. This study is prompted by the need to evaluate the effect of missing or low quality welds on the structural integrity of the body, identify critical welds, and optimize weld locations. A random sampling method based on the principle of Latin-Hypercube sampling is developed for this special application, which also accommodates GM's internal specifications on welding quality. We'll also present a case study in which the efficiency of three different sampling methods is compared. The new method, called Balanced Latin-Hypercube Sampling (BLHS), has shown significant improvements over the other two. Further applications of the method will also be discussed.
Statistical Modeling of Minimum Intergranular Corrosion Path Length in High Strength Aluminum Alloy Shiling Ruan, Ohio State University
In this article a brick wall model is developed to describe the relationship between the minimum intergranular corrosion (IGC) path length and the aspect ratio of grains of high strength wrought aluminum alloy. We study the distribution of the horizontal distance that a corrosion path will travel in the metal and fit the model to an actual corrosion data set using the method of moments. The horizontal distance of a corrosion path along a given grain is assumed to be uniformly distributed given the length of the grain, which is itself modeled by a gamma distribution. A modified brick wall model is proposed that imposes a distribution on the vertical distance traveled by the corrosion path as well. We use computer simulation to evaluate the fit of these models.
Supplier Selection Based on Bootstrap Confidence Regions of Process Capability Indices Alan Polansky, Northern Illinois University
The supplier selection problem consists of selecting the best of several possible suppliers for a product. The selection criteria, in the absence of considerations such as cost, are based on a quality metric such as the capability of the supplier's manufacturing process. Because quality metrics are estimated based on sample process data, the inherent variability in the estimates must be accounted for when selecting the best supplier. In this paper we consider a methodology based on the bootstrap that assigns confidence levels to each of the suppliers. These confidence levels are designed to reflect the amount of confidence we have that the supplier actually has the best quality metric, conditional on the observed samples. The methodology is demonstrated using two examples.
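One simple way to attach a confidence level to each supplier is to resample each supplier's data, recompute the quality metric, and record how often each supplier comes out best. The sketch below is a plain bootstrap-proportion illustration, not necessarily the procedure of the paper. It assumes the metric is the capability index Cpk with known specification limits, and all names and example values are hypothetical.

```python
import random
import statistics


def cpk(sample, lsl, usl):
    """Estimated Cpk = min(USL - mean, mean - LSL) / (3 * standard deviation)."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    if sd == 0.0:
        # A degenerate resample with no spread is treated as perfectly capable.
        return float("inf")
    return min(usl - mean, mean - lsl) / (3.0 * sd)


def bootstrap_best_supplier(samples, lsl, usl, n_boot=1000, seed=0):
    """Return, per supplier, the fraction of bootstrap rounds in which it has the largest Cpk."""
    rng = random.Random(seed)
    wins = {name: 0 for name in samples}
    for _ in range(n_boot):
        scores = {}
        for name, data in samples.items():
            resample = rng.choices(data, k=len(data))
            scores[name] = cpk(resample, lsl, usl)
        wins[max(scores, key=scores.get)] += 1
    return {name: count / n_boot for name, count in wins.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    data = {
        "supplier_1": [rng.gauss(10.0, 0.5) for _ in range(30)],
        "supplier_2": [rng.gauss(10.2, 0.6) for _ in range(30)],
        "supplier_3": [rng.gauss(9.9, 0.9) for _ in range(30)],
    }
    print(bootstrap_best_supplier(data, lsl=7.0, usl=13.0))
```

The returned fractions sum to one. Suppliers whose fractions are close to each other cannot be separated with the available data, which is the situation where collecting more samples is worth considering.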
WEDNESDAY, MAY 22 11:50 - 12:50
PLENARY SESSION ASSEMBLY HALL, HALE AUDITORIUM
Predicting Warranty Returns from Accelerated Testing William Q. Meeker, Iowa State University
Accelerated life tests are commonly used to assess the reliability of materials, components, and subsystems. A frequently asked question is "what do these test results say about performance in the field?" Laboratory tests are carefully controlled whereas the field environment is highly variable. Products in the field see, for example, different use rates. If detailed information on the distribution of use rates in the field is available, it is possible to use laboratory test results to predict the failure time distribution in the field. Often such information is not available. If both life test data and field data (e.g., from warranty returns) are available, focusing on one or more failure modes, it is possible to fit a physically-motivated transfer function to relate the two data sets. Under a reasonable set of practical assumptions, this transfer function can then be used to predict the failure time distribution for a future component or product operating in the same use environment. This talk will describe a model and methods for fitting such a transfer function model. The methods will be illustrated with an example in which the transfer function was used to predict the failure time distribution of a turbine device that had two different failure modes.