Whenever an EXAKT menu item is mentioned in bold, left-click it in the EXAKT program. When other items are mentioned, they should usually be double-clicked. X indicates that you should close the active sub-window.
When you expand a sub-window and then close it, you will lose the familiar arrangement of the underlying windows. When this happens, simply reduce the underlying window or hit the Mov Win Left icon. These general rules will not be repeated. You may either print out this tutorial or resize and position this document or mindmap elsewhere on your screen. Similarly, you may resize and position the EXAKT program. When necessary, you can temporarily expand the EXAKT window to full screen.
You will learn the basic functions of the EXAKT model building software and the EXAKT decision agent software. You will use a reduced set of oil analysis data from a fleet of haul truck transmissions to build a proportional hazards model. Then you will deploy this model as an "intelligent agent" that silently and automatically monitors future condition monitoring data, returning an optimized decision (whether or not to remove and repair the transmission) as each new set of condition monitoring readings is received.
A long-term policy of making optimized decisions will, on average, minimize some undesirable quantity, such as cost, or maximize some desired quantity, such as availability. The agent provides a remaining-useful-life estimate based on the current condition of the equipment, its age, and all relevant maintenance and operational events that have occurred.
Download the data file. Extract the database files (Transmission_MES.mdb and Transmission_DMDR.mdb) into a folder, say "\ExaktBasic", on your hard drive.
Launch “EXAKT for Modelling”. This is the program for validating and analyzing condition monitoring and event data and for building the optimized CBM (condition based maintenance) model.
Start, Exakt for Modelling
File, New, Navigate to the folder where you placed the Transmission_MES.mdb file, File name: Transmission_WMOD.mdb, Create, Modeling (on menu bar), Data Setup, Enter the following into the script editor:
Database="Transmission_MES.mdb"; Attach Inspections=OilAnalysisData, Events=TransLifetimes, EventsDescription, VarDescription, CovariatesOnEvent
Execute, Save
Examine the attachment script. You will note that it creates an ODBC (open database connectivity) link to an external database called "Transmission_MES.mdb" and then "attaches" a number of tables. It applies its own internal names to two of the tables using the A=B syntax; the other tables are attached directly, since their names are already consistent with EXAKT's internal names for those tables.
Notice that the attached tables have now become visible and accessible in the tree view in the left window pane. In the next steps you will examine each one of those tables to become familiar with their content and structure.
Inspections, X
Open the Inspections table. Note the column names and content. Ident, Date, and WorkingAge are key words used by EXAKT. “Ident” is the unique name of each unit of a specific type of Item to be analyzed.
An item is a significant system, subsystem, or component upon which it is convenient and desirable to conduct a reliability analysis. An item may consist of several components and may undergo several failure modes. But in this introductory section of the tutorial we will keep it simple and assume that the item is a simple item (as opposed to the complex items treated later). The "Date" may be in date or date/time format.
If condition monitoring inspections are more frequent than once every 24 hours, the date/time format must be used. The WorkingAge is a measure such as hours of operation, fuel consumed, thousands of feet of steel rolled, or any other measurement that reflects the accumulated usage or stress on the item. Calendar time can only be used if the units operate regularly in time – a rare situation. Databases of production records, hour meters, or counters must be used to acquire useful WorkingAge data. The remaining columns contain the condition monitoring data which we refer to as condition data.
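To make the structure concrete, here is a hypothetical fragment of an Inspections table (the first three column names are EXAKT's key words; the dates and readings are invented for illustration):

Ident   Date        WorkingAge  Iron  Lead  Calcium  Magnesium
17-66   03/05/1996  12450       14    3     1050     210
17-66   17/06/1996  13110       19    5     1020     195
17-67   03/05/1996  9870        6     1     1080     230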
Events, X
Now examine the Events data table. Contrasted with the Inspections table, its information represents the other side of the coin. Both Event and Inspection data are required for CBM optimization. The EXAKT modelling process is one of correlation of Events (of all kinds) and Inspections (that is, condition data). Condition data often comes from specialized databases provided by CBM product or service vendors. Common examples are oil analysis and vibration analysis. These databases are invariably well organized and consistently populated.
The Events data, on the other hand, often comes from the organization's CMMS (computerized maintenance management system) and from production databases. (The records in the CMMS have typically been less rigorously kept than the others. Hence EXAKT contains tools and techniques to validate the CMMS data and get it into shape.) The basic required Events are: 1) Beginning (an item has been placed into service), designated by B; 2) Ending by Failure (EF); and 3) Ending by Suspension (ES). By "suspension" we mean that the item has been taken out of service for any reason other than failure. For example, it may have been preventively replaced.
Once again, Ident, Date (of the event), and WorkingAge are required fields. The Event itself is recorded in the fourth column. "OC" in this example represents an "oil change" event. Any event which affects the condition data (in this case it would initialize the wear metals and contaminant elements to zero) must be included in the model.
CovariatesOnEvent, X
Examine the CovariatesOnEvent table. We must provide the "initialization values" for each event. Note that in this case we are initializing wear metals and contaminants to zero and additives to their new-condition levels. We may also establish calendar periods for which these initialized values are to be used. (For example, the brand or grade of lubricating oil may be changed periodically.)
EventsDescription, X
Examine the EventsDescription table. The column "Precedence" tells the EXAKT program in which order to consider separate events that occur at the same date/time. For example, if an oil sample is drawn at an oil drain, we would want the inspection to be sequenced before the oil change. The inspection event is implicitly given the precedence "0".
Models, X
Examine the Models table. It contains no records yet, because you have not yet begun building a model. This table is populated automatically by EXAKT as you proceed. The only time you might access this table manually would be to delete certain sub-model(s) that you do not wish to retain. A sub-model is one of any number of models that are tested in the modelling process. The sub-model considered best is then exported. An agent will use it to provide optimized decision support based on a particular item's current condition data.
Now that we have examined the internal and external database tables we are ready to proceed with the development of a rudimentary CBM optimization model. We turn our attention to the right hand window pane containing buttons arranged in a flow chart of activities. We enter the general project data using these steps:
Data Preparation,
Enter General Data,
Project Title: Haul Trucks,
CBM Model: Trans Oil Anal,
Description: 350 T Transmission Oil Analysis,
Time Unit: Hrs., OK.
Next we instruct EXAKT to assemble the Events and Inspections into a single table called C_Inspections. The algorithms will use this table for subsequent calculations. Depending on which version of EXAKT you are using, there are a number of alternative buttons we may hit at this point. For this exercise please hit
With Covariates (Complete).
After hitting this button two more tables will appear in the left pane, C_Events and C_Inspections.
C_Inspections, X
Examine the C_Inspections table. Note that the records of both tables (Events and Inspections) have been combined and arranged in chronological order in the single table C_Inspections. Inspection (condition monitoring) records are designated by an *. The other event records (B, EF, ES, OC) have monitored data (covariate) values set to their initialized levels according to the CovariatesOnEvent table discussed previously.
Now let’s begin the “modelling” phase of the analysis.
Modelling (button on flow diagram),
Weibull PHM,
Select Covariates,
Submodel Name: ilcm,
Covariates Unselected: Iron, -> -> -> ->
OK, OK (in warning message), X.
After executing the above steps, the Trans Oil Anal (ilcm) report window appears. Examine the report. The “Summary of Events and Censored Values” presents the overall summary of the data being analyzed. A “Sample Size” of 13 means that there are 13 histories or lifetimes having a beginning and some kind of ending event.
Of the 13 histories 6 ended in failure, 3 (Censored (Def)) ended prior to a failure, and 4 (Censored (Temp)) units are currently in operation at the time of building this model. They are referred to in EXAKT as “temporary suspensions” and are identified automatically by the software. The next tabulation “Summary of Estimated Parameters” provides the results of our first sub-model “ilcm”.
The column "Sign." indicates whether the "Parameter" is significant, that is, whether it has been found to be statistically related to failure. The Shape (i.e. WorkingAge), Iron, and Lead are designated as significant (at this point in the analysis) while Calcium and Magnesium are not. Note that Magnesium has the highest Wald Test p-Value. (The Wald Test is used to test whether an independent variable has a statistically significant relationship with a dependent variable. The p-value represents the relative probability that Magnesium has no significant impact on the risk of failure.)
The next step is to try a different model by eliminating the variable whose impact on the probability of failure is lowest: in this case, Magnesium.
Close the PHM Parameter Estimation report window and execute the following steps in order to create 2 more sub-models:
Select Covariates, Sub-model Name: ilc, Magnesium, ←, OK, X
Select Covariates, Sub-model Name: il, Calcium, ←, OK, OK (in warning message), X
Notice that we are successively removing the covariate with the highest reported p-Value. After hitting (the first) "OK" you will receive an alert message from EXAKTm telling you that the procedure is over. This is normal for samples of small size (a low number of histories ending in failure). You may safely ignore this message by hitting OK in the message box.
The columns in the reports provide the results of various statistical tests. They are well explained in the EXAKT manual, which is accessible from the Windows Start menu or from the Help menu.
At this point we have a sub-model with covariates and shape parameter that are all significant. We may conclude that this, therefore, is potentially an acceptable model for failure risk prediction. To be rigorous, we should test one last possible combination – a sub-model with iron alone. (We choose Iron as it is the variable with the lowest p-value and thus is likely to have the strongest relationship to failure.)
Select Covariates,
Sub-model Name: i,
Lead, ←,
OK, OK (in warning message), X
The report tells us that this is also a potentially good predictive model (i.e. iron alone is still significant). In the next step we decide which of the two sub-models (i or il) should be retained and ultimately deployed in the decision agent.
Execute these steps:
Comparative Report,
Compare: il, (Compare the more complex model with the simpler model)
i, →, OK, X
The "PHM Parameter Estimation - Comparison" report is displayed. The "N" in the second column is telling you that the sub-model "i" is not close to the base sub-model "il". This means that the simpler sub-model is not as good as il; that is, the il model contains significant information that is not present in the i model. We would be losing information by using it rather than the more complete model "il".
In this step we examine the results of statistical testing performed by EXAKT on the retained sub-model, il. Reactivate this model with the following steps:
Activate left (Transmission_WMOD.mdb) pane.
Modeling (menu bar),
Select Current Model,
Sub model: il,
OK.
Notice that the model Trans Oil Anal (il) appears in the title bar of the flow chart window.
Now hit the Modeling button (not the Modeling menu item).
Modeling, Summary Report
The third table of the “PHM Goodness of Fit Test” report tells us that the proportional hazards model we constructed for risk as a function of working age and the two significant covariates “fits” the data well enough for it to be used with a confidence of 95%.
The test used for this is known as the Kolmogorov-Smirnov test and is a well-accepted statistical tool. The test shows that the model is not rejected at the 5% significance level, i.e. it is accepted at a 95% confidence level.
Modeling,
Transition Probability Model,
Covariate Bands,
Covariate: Lead
Upon executing the above steps we see that EXAKT has created a set of bands (listed under Interval Start Points). These are "transition" states for Lead and Iron. These divisions or bands are used by the software to build a "transition probability model". The transition probability model calculates the probability of jumping to another state within the next inspection interval. (An example of what we mean by jumping to another state will be given below.)
select Covariate Iron
Notice that the transition bands provided for Iron are quite different from those of Lead. This is because historical iron measurements are scattered over an entirely different range of values. This can be ascertained using EXAKT's cross-graph function (to be described in another tutorial, and also discussed in the user guide and manual).
OK
Notice that the two buttons "Display Matrix" and "Display Survival" have become active. Let's examine the Display Survival function report. Set WorkingAge to, say, 8000 hours, and Observation Interval to, say, 200 hours (assuming, for example, that our asset is currently at age 8000 and we are interested in knowing its risk of failure in the next 200 hours). The "Markov Chain Model Survival Probability matrix" report is displayed. The probabilities of Iron values jumping to another state and the probability of failure in the upcoming interval are displayed in a tabular format.
(This table represents only a part of the entire set of transition probabilities taken into account by the model. We have chosen to ignore the other significant covariate, Lead, in this report. Including more than one covariate in the visual report would require the representation of multi-dimensional matrices. Instead this report allows us to see how a single variable changes irrespective of the others.) Looking at the table we see, for example, that the cell at row "0-4.004" and column "4.004-9.009" has the entry 0.301615. This means that there is a 30.1615% probability that iron, currently in the first band, will be in the second band at the next monitoring interval. Hence this report provides the probabilities of being in any state at some future time. (Of course, this report is provided for analysis purposes only while building the model. The transition probabilities are fully integrated into the final decision model that will be deployed in section 2.)
Now for the final step in developing a decision optimization model. We blend into the model the economics governing the failure and repair of this item.
Decision Model,
Decision Model Parameters,
Replacement (C): 1200,
Failure (C+K): 6000,
Cost Unit: $,
Inspection Interval: 250, OK,
Full Report Icon (to the left of the Print Icon), X
We apply the average cost of a preventive repair, C, and the average cost (including consequential costs) of a failure, C+K. (It is rarely necessary to have great precision in these relative costs. The cost sensitivity function of EXAKT, described in the manual, allows us to confirm the adequacy of these cost amounts for the decision model in question.)
After hitting the Full Report Icon (which you'll find to the left of the Print Icon on the Tool Bar), the "Condition Based Replacement Policy - Cost Analysis" report appears. Examine the "Summary of Cost Analysis" table below the Cost Function graph. It is telling you that by adhering to the interpretive decisions of the model, an optimal long-run ratio of preventive-to-failure replacements of 98.8:1.2 will be attained. This policy will result in a cost savings of 75.1% relative to a replacement-only-at-failure policy. (The cost comparison reporting function similarly compares the optimal EXAKT policy with existing practice. Its usage is described in the EXAKT user manual.)
We have been, up to now, building a model based on the historical data from the entire fleet. We may now test the model on any individual unit either for the current situation (i.e. the latest data available in the database, called "LH" for last history) or we may look at any other history retroactively.
Decisions, 17-66,
shift+17-79,
Report,
Full Report Icon ,
PgDn, PgDn, PgDn, X,
Last Histories dialog, X
The steps above display the reports of the latest monitored values of each unit. Four sets of graphs are shown - one for each of the four units 17-66, 17-67, 17-77 and 17-79. By examining the graphs we see that none are in alarm at the current moment when this snapshot of the data has been made.
If the weighted sum of the significant covariates (i.e. the y-axis plotted variable) falls in the green region, no action is necessary; in the yellow, the item should be renewed before the next monitoring interval; in the red, the item should be repaired or replaced immediately. It should be noted that these boundaries vary with working age. This reflects the analysis finding that working age, as well as Iron and Lead, is a significant failure risk factor. At some point in the past the values for 17-67 hit the red zone. This may indicate a spurious laboratory result that was corrected in a follow-up verification. (For modeling, known incorrect data should be removed from consideration.) Note that the x-axis scale differs from graph to graph depending on the current age of the unit.
The analysis and model building phase is complete. We are now ready to export the optimal decision model we created into our maintenance system environment (where it has access to continuously renewing data) and where it can do its job.
Activate the left pane, ModelDbase, Connect to Database Script, key in (or copy) the script for exporting the model
DATABASE="Transmission_DMDR.mdb"; ATTACH DecModels, UnitToModel, DecCovariatesOnEvent, DecEventsDescription, Decisions
Save
In the next slide we send the model to a database located on the network. You will notice that several new table links appear in the tree view. Now that the links to the Transmission_DMDR.mdb database have been set up, we proceed to the actual export.
ModelDbase,
Store the Decision model,
Close EXAKTm
You may examine the tables DecModels, UnitToModel, DecCovariatesOnEvent, DecEventsDescription (by double clicking on the file names in the tree view of the left pane) to see just what information has been exported to the external database.
In this section we run the “agent” manually. (It can also be set up to run automatically). After you execute the following steps the user interface of the EXAKTd decision agent appears.
Start, Programs, Exakt, Exakt for Decisions
File, New, Navigate to your working folder. File name: Transmission_WDEC.mdb, Create.
Now we will link to the database where we previously exported our model (Slide 22 of Section I.). After executing these steps you will see the name of the Model you created, “Trans Oil Anal” in the top left pane
Setup, Connect to model database script
copy and paste this script:
DATABASE="Transmission_DMDR.mdb"; ATTACH DecModels, UnitToModel, DecCovariatesOnEvent, DecEventsDescription, DecisionsSave
After executing the above step, you will see each of the units whose optimal decisions for oil analysis will be governed by this model. (New units may be added easily in the EXAKTd program.)
By selecting any unit in the top left pane, we see a list of properties but no values. Next, we will run the agent manually on the latest available set of condition monitoring oil analysis data.
Trans Oil Anal, Reports, Create reports, Calculate time to replace
Full Report icon, expand report window, scroll, X
Reports, Create new report list, New Report List Name: Indoor trucks, OK
Reports, Create new report list, New Report List Name: Outdoor trucks, OK
Trans Oil Anal, Select 17-66 + 17-67, ctrl-c, Indoor Trucks, ctrl-v
Trans Oil Anal, Select 17-77 + 17-79, ctrl-c, Outdoor Trucks, ctrl-v
With “Trans Oil Anal” selected you can conveniently examine the optimal decisions for the entire fleet on one list in the right window. You are actually examining the contents of the Decisions table of the Transmissions_DMDR.mdb database. This database can be accessed easily by any program, such as your CMMS. This implies that the decision model’s operation and its results may be integrated within existing maintenance system software. In other words, the EXAKTd program's user interface need not be used at all.
Select Indoor Trucks, Reports, Create Reports, Calculate time to replace, Select Outdoor Trucks, Reports, Create Reports, Calculate time to replace
This completes this section of the Tutorial. This has been a minimal exercise to demonstrate a small portion of the EXAKT functionality. Please refer to the On-line guide (available on your Start | Programs | EXAKT menu) for a much more detailed treatment of the subject of CBM optimization.
In 2003, the Condition Based Maintenance Laboratory at the University of Toronto developed a data structure and methodology for the predictive analysis of complex systems - items containing multiple components and subject to a variety of failure modes.
The example of this tutorial is a single-reduction gearbox that contains two gears (referred to as Gear1 and Gear2). We concern ourselves, in this example, with the failure mode "tooth fails due to root crack", which can occur on either gear. We treat this unit, therefore, as a complex item having two failure modes. A CBM policy must consider all significant (reasonably likely) failure modes whose potential failures are detectable in the condition monitoring data set. The policy must distinguish data patterns characterizing one failure mode from those characterizing another. The policy must advise on which potential failure mode is imminent, and it should also provide a residual life estimate.
The EXAKT software uses the term Marginal Analysis to indicate that a complex item is being analyzed. The database whose structure is shown in the slide contains the data for the equipment or fleet under analysis. It contains the original data or data that has been transferred or linked from the CMMS and one or more CBM databases.
Marginal analysis will allow us to build several CBM decision models, each corresponding to a specific component or to a specific failure mode. The slide illustrates the structure of a database containing Event and Inspection tables as well as six other supporting tables. This structure will enable us to conduct analyses of complex items. From those analyses we develop models for multiple failure modes occurring in a single equipment item.
Download and unzip Files_for_Exercise2.zip.
Open the EXAKT for modeling program (EXAKTm) in Windows in the normal manner.
Start, EXAKT Tools, EXAKT for Modeling
File, New
Navigate to the folder where you extracted Files_For_Exercise2.zip, File name: ComplexItemsDemo_WMOD.mdb, Create
Data Preparation, Enter General Data,
Project Title: “GearboxA 2 failure modes”,
CBM Model: “GBX”,
Description:
Time Unit: “h”, OK
Note: We place only the general or "base name" (e.g. GBX) of our model in the CBM Model text box. The CBM Model will be renamed automatically by EXAKT, (e.g. GBX_CMOD_1). An extension will be added for each model that we create. We may edit the extension later to make it more descriptive (e.g. GBX_CMOD_Gear1).
In the Tables tree view delete the three default tables:
(These are default table structures not applicable to marginal analysis. They will be recreated in step 5.)
Modeling, Data Setup
To attach the external database(s) to our analysis, enter (copy and paste from the download file) the following into the script editor:
DATABASE = "ComplexItemsDemo_MES.mdb"; ATTACH Events_MA,Inspections_MA, CovariatesOnEvent_MA, EventsDescription_MA, VarDescription_MA, IdentToModel,EventToModel, VarToModel,IntegratedModels
Execute, Save
Notice from the tree view in the left pane that all the tables (of slide 2) of the external database are visible and accessible.
Marginal Analysis
Idents Selection (tell EXAKT which idents, i.e. units, are to have their data included in the predictive model that we are currently building; in this case there is only one unit, Gearbox A).
Events Selection Set up the data mapping as shown in the slide.
For example, select "B" in the Input Events column.
Then select "B" in the Select Event drop down list.
And enter 10 in the Precedence box.
Hit Apply.
Repeat for each of the entries shown.
Set up the data mapping for Variables as shown in the slide.
You may edit the Model name, say by changing "GBX_CMOD_0" to "GBX_CMOD_Gear1".
Do not hit OK yet. Instead, hit More Models.
Events Selection (tell EXAKT which named events in the database the model should use internally as B, EF and ES respectively)
You may be wondering why we are mapping EF2 to ES. The reason is that EF2 is a failure mode of Gear2 (to be modeled next). The current policy is to replace Gear1 preventively when Gear2 fails. Hence the failure of Gear2 marks the suspension (ES) of Gear1. For the same reason "B" indicates the renewal of both gears.
Precedence (The precedence of B was chosen to be arbitrarily large compared to EF and ES. This tells EXAKT that, should an ending and a beginning occur on the same date/time, the beginning should be considered to have occurred after the ending of the previous life cycle.) EXAKT enforces the arbitrary rule that the precedence of ES must be greater than the precedence of EF.
Variable Selection (tell EXAKT which variables to use and how to rename them for the model we are building; this allows the decision agent to display short, meaningful names in the optimal decision and other graphs and reports).
The variable name Health_Indicator1 in the database is mapped to the variable name H1 used by the model. Shorter names are more convenient in building the model.
Base model.
Enter "GBX" as the base model.
OK
Data mapping for Gear 2 model
As before set up the data mapping for the second CBM model named:
"GBX_CMOD_1" (which you can change to "GBX_CMOD_Gear2"
Now hit OK.
You may check the Models table to ascertain that the marginal analysis models have been set up correctly.
In the Models table you may add individual descriptions that indicate the failure modes for each of the two models to be built.
(Move the cursor to another row after adding the text to make sure that the row has been updated.)
We will now proceed to build the model for Gear1, which should be the current model. That is "GBX_CMOD_0" (or GBX_CMOD_Gear1) should be indicated in the title block of the Procedures (right) window.
Modeling (on flow diagram),
Weibull PHM,
Select Covariates,
Sub-model Name: H1,
H1 →, OK, OK.
Examine the Summary of Estimated Parameters table.
By rejecting “Shape”, the software is telling us that age is not a significant risk factor for the fracture of a tooth on Gear1. Therefore we will remove age from the model by fixing the shape parameter to “1”.
X.
Select Covariates, Submodel Name: H1_B1, Fix shape parameter=1: check, Unselect H2, OK, OK (in warning message box), X
Build the decision model. The dialogs for the Transition Probability Model (“Covariates Bands and Groups”) and the Decision Model Parameters are shown in the slide. The procedure is:
Transition Probability Model,
Transition Rates, OK,
Decision Model, Decision Model Parameters,
Replacement (C): 1000,
Failure (C+K): 6000, Cost Unit: $,
Inspection Interval: 30, OK
Full Report Icon, Scroll or resize to view the “Summary of Cost Analysis” table, X
Now build the model for the second failure mode, Gear2, whose default model name is "GBX_CMOD_1".
Modeling (menu bar),
Select Current Model,
GBX_CMOD_1 (or GBX_CMOD_Gear2),
OK
Repeat steps 9 to 13 making the obvious changes (as shown) required for the second model.
Decisions, GearboxA, All Histories, Select “GearboxA1” to “GearboxA9”, Report, Full Report Icon , PgDn, PgDn, PgDn … , X
We have created and tested decision models for Gear1 and Gear2. We may now create the DMDR (decision) database and export both models to it.
The procedure is:
Start, EXAKT tools, DataPrepTool,
Resize window so that you can see these instructions,
File, New Corporate Database, From Default Template, DMDR for Decision Module, Navigate to the working folder where you had placed ComplexItemsDemo_MES.mdb, Filename: ComplexItemsDemo_DMDR.mdb, Save, Enter Covariate Name: H1, Enter, H2, Enter, Marginal Analysis Format: Check, OK, OK, File, Exit.
Back in EXAKTm, attach the 7 tables from ComplexItemsDemo_DMDR.mdb using the following procedure.
Activate the left pane Window by clicking on it, ModelDBase (on the Menu bar), Connect to Model Database Script, type or copy and paste the following script into the editing window that appears.
DATABASE = "ComplexItemsDemo_DMDR.mdb"; ATTACH DecCovariatesOnEvent, DecEventsDescription, UnitToModel, DecEventToModel, DecVarToModel, Decisions, DecModels, DecIntegratedModelsSave.
The attached tables will appear in the tree view in EXAKT's left pane. An eighth table, DecIntegratedModels, is also added; we will use it in Exercise 7.
Assuming you have previously completed building the model for Gear2 (step 13), execute the following instructions. This will save this model to the DMDR database.
Activate left pane, ModelDBase, Store Decision Model
Make the previous model for Gear1 the current model by following these steps:
Modeling (on the menu bar), Select current model, CBM Model: Gear1, Submodel:H1_B1, OK
Now Store the model for Gear1 to the DMDR database, by following these steps:
Activate left pane, ModelDBase, Store Decision Model.
You can check the DecModels table to assure yourself that both models have been exported to the DMDR database.
Congratulations. You have created and exported two decision models for a complex item.
Close the EXAKTm program.
In this section we will manually run the “agent” so that it applies the two models that we have created to the current data. (It can also be set up to run automatically). After you execute the following steps the user interface of the EXAKTd decision agent appears.
Start, Programs, “Exakt Tools”, “Exakt for Decisions”, File, New, navigate to your working folder, File name: ComplexItemsDemo_WDEC.mdb, Create
Attach the 7 tables from ComplexItemsDemo_DMDR.mdb. The left pane will show both models and the list of equipment to which these models will be applied. In this case there is only one piece of equipment, "GearboxA". Had there been a fleet of similar equipment, the units would have been listed below each model.
Setup, Connect to Model Database Script, type (or copy and paste) this script into the window:
DATABASE="ComplexItemsDemo_DMDR.mdb"; ATTACH DecModels, UnitToModel, DecEventsDescription, DecCovariatesOnEvent, Decisions, DecEventToModel, DecVarToModel, DecIntegratedModels
Save
Reports, Create Report List, New Report List Name: All Units, OK
Gear1_CMOD_0, ctrl-c, All Units, ctrl-v
Gear2_CMOD_0, ctrl-c, All Units, ctrl-v
All Units, Reports, Create Reports, Check all boxes, OK
Page down to view each of the 8 graphs. Then close the graphic window.
Page Dn, Page Dn, Page Dn, Page Dn, Page Dn, Page Dn, Page Dn, X
Note: If you would like more information on the CBM features extracted from the vibration signals, and used in this study, see Monitoring Gearboxes for Tooth Failure.
Observe the All Units report.
This report shows at a single glance the condition (remaining useful life and optimal recommendation) for every failure mode of every unit monitored by the CBM optimization agent. This decision database is available for access by the CMMS/EAM.
Data is the fuel of reliability improvement. This is especially true in the failure cause management tactic known as "condition based maintenance". This exercise will provide you with additional insight into the value of good data practices, particularly in regard to your organization's records of the as-found condition of physical assets at the time of their maintenance or repair.
In this exercise we will examine some of the data validation tools in EXAKT.
For more information and background on this CBM optimization project see this article.
Extract Files_For_Exercise3.zip into a working folder, say test.
Start the “EXAKT for Modeling” application.
File, New, Navigate to the working folder (test\Files_For_Exercise3) that contains Mar2004CRC_MES.mdb
File name: Mar2004CRC_WMOD.mdb, Create
Activate the left sub-window by clicking anywhere in it, Modeling (on menu bar), DataSetup, copy and paste the following script into the script editor:
Database="Mar2004CRC_MES.mdb"; Attach Inspections=OilInspections, Events, EventsDescription, CovariatesOnEvent, VarDescription
Execute, Save
Notice that the five tables in the script's "attach" statement are now visible in EXAKT's left window tree view. You may open and examine each of those tables (as you did in the Basic Tutorial).
Data Preparation,
Enter General Data,
Project Title: CRC Data Analysis,
CBM Model: PHM (no OC),
Description: Wheel Motors,
Time Unit: hr, OK
Activate left pane, Modeling (on menu bar), Extend Output Variables, Copy and paste the following into the script editor:
// OutputVarScript
PrevSed1=Sed-Diff(Sed);
CorrSed1=Sed*(Sed>0)+(Sed=0)*PrevSed1;
PrevSed2=CorrSed1-Diff(CorrSed1);
CorrSed=CorrSed1*(CorrSed1>0)+(CorrSed1=0)*PrevSed2;
LogSed=Log(1+CorrSed);
LogFe=Log(1+Fe);
CorrSi=Si*(Si<>900)+1.2*Fe*(Si=900);
SqrtFe=Sqrt(Fe)
With Covariates (Complete)
After a few moments notice the appearance of the C_Inspections table and the C_Events table.
C_Inspections table, scroll to the right
Notice the extended variable columns PrevSed1, CorrSed1, PrevSed2, CorrSed, LogSed, LogFe, CorrSi, SqrtFe.
Close the C_Inspections table
Activate left pane, Edit, Check Database, Data
Examine the "Summary of Data Check" report
Notice that the number of Beginnings, 177, balances the count of Ending events (89 + 41 + 46). This check, that every beginning has a corresponding ending, is the most basic validation of the "life data" and is made first.
Close the report.
Now we will make some deeper checks into the "life" data.
Open DataCheck table,
Click on “Description” column header
View (menu bar), Inspections, Include Events View, OK
Arrange the windows and panes so that the Inspections and Events windows cover the top two-thirds of the screen and the DataCheck window the bottom third. Drag the bottom boundary of the Inspections window upwards to expose the Events panes.
The Inspections window should now have four panes.
Widen the Date column of the top right pane so that the entire Date is visible
The five panes (tables and graphic views) are now in automatic synchronization. This makes it easy to find and correct errors, as we shall see in subsequent steps.
Look at the DataCheck window. You will notice that there are many requests in the table to "Check whether the history is temporarily suspended or 'EF/ES' is missing." The software has no way to distinguish between missing ending events and "temporary" suspensions (units currently in operation).
The user must ascertain that all such indicated records correspond to units that are operating currently. If the lifetime corresponding to the message is, in fact, ongoing at this moment, then the user should ignore the message; EXAKT will then consider it a temporary suspension. Otherwise the message means that you are missing an ending event, either an EF or an ES, and you must manually insert the missing record into the Events table.
DataCheck window, scroll to Record 47 and place cursor in Ident field of Record 47.
The 47th record of the DataCheck table has the description
"This record can't be properly identified. It has the same Ident, Date, WAge, and Event as the previous record: Id=5508L 2, Date=27/08/1997, WAge=68926, Event=IN in Inspections table"
Inspections window, widen the Date column so the full date is visible, scroll up 1 row on the scroll bar so that record 818 is visible
Note that the pointer is at record 2217 in the Inspections table and the Events table likewise has its pointer positioned at record 1036.
Note that record 2216 corresponds to an oil sample taken on the same equipment on the same day as record 2217. EXAKT is suspicious about this and is asking you to verify the dates and working ages for these two. Maintenance planning personnel tell us that record 2217 must be an error. Therefore we may delete it.
Delete record 2217 (by selecting it and hitting the Delete key).
DataCheck window, record 52,
Inspections window, scroll up to the record mentioned:
"Id=5503R, Date=31/07/94, WAge=65634, Event=IN in Inspections table"
Here is a similar type of problem. But in this case two samples have the same working age but different calendar dates. EXAKT is not pleased with this situation and is asking you to do something about it. You should check if the equipment was really idle for one month, or if this is a data entry error.
Thus one goes systematically through the database records, as indicated by the DataCheck table, correcting the anomalies that EXAKT points out.
View, Cross Graph, maximize window, Table: Inspections, Horizontal: WorkingAge, Vertical: Si, Condition: Si < 1000, Show
A pattern in the silicon readings strikes us as unusual. We question the unusual fact that multiple readings are exactly 900 ppm. Investigation found that the laboratory's spectrometer had an incorrect photomultiplier tube over a number of years and was saturating at 900 ppm. One should avoid using known incorrect data for model building. A correction needs to be applied if Si is to be analyzed in the model building process.
Horizontal: Fe, Vertical: Si, delete Condition: "Si < 1000", Show
We notice the correlation between iron and silicon. We can make use of this relationship.
Reduce and close the cross graph window
Modeling (on menu bar), Create Model Input Tables, Complete Data, View, Cross Graph, maximize, Table: C_Inspections, Horizontal: Fe, Vertical: CorrSi, Show, reduce, X
Recalling the extended variable script from Slide 3, the line:
CorrSi=Si*(Si<>900)+1.2*Fe*(Si=900);
can be read as: the corrected value of Si should be set to the original value of Si if the original Si is not 900 ppm. However, if the value of Si is (exactly) 900 ppm, then estimate it as Fe multiplied by 1.2, the slope of the Si vs. Fe regression line.
Reduce and close the Cross Graph window.
For more discussion of EXAKT's transformation scripting language, see the post by Martin Kay on the EXAKT forum.
EXAKT correctly handles events (such as oil changes, adjustments, alignments, calibrations, and other minor maintenance) that impact the condition data.
Display the C_Inspections table.
Scroll to record 356
Note that for a period of 5 months, from 7/6/94 to 11/21/94, no oil change (OC) events are recorded, although oil changes had previously been performed about every month. We suspect that the oil changes occurred but were not reported. This could affect the model.
Reduce and close the C_Inspections table
Add the following two lines to the OutputVarScript (adding a semicolon to the end of the current last line) so that the last three lines read:
SqrtFe=Sqrt(Fe);
HWAge=WorkingAge-First(WorkingAge);
OilAge=HWAge-NonDecr(HWAge*(Precedence=1))
We have applied a transformation to the data, which originally does not contain OilAge. Now we have a variable for the oil's age at any CBM inspection, opening the way for the further analysis below.
Click "With Covariates (Compete)"
Activate left pane, Modeling (menu bar), Create Model Input Tables, Complete Data, View, Cross Graph, Table: C_Inspections, Horizontal: OilAge, Vertical: Fe, Show
Note the samples whose oil age is as high as 15000 hours. Synthetic oils were introduced only towards the end of the period represented by this sample. It is impossible that the mineral oils would have remained in service unchanged for upwards of 2000 hours. Rather, we may conclude, oil changes often went unreported.
Modeling (on flow chart), Weibull PHM, Select Covariates, Fe ->, CorrSed ->, OK, OK, close report window, Residual Analysis, In Order of Appearance, Close graph window
Left window, "Residuals: PHM (no OC)(*) #1", click on the "Residual" column header to order the records by Residual, scroll down to the last row, note the History Number of 64, close the table
Transition Probability Model, Transition Rates, OK, Decision Model,
Decision Model Parameters, Replacement (C): 1200, Failure (C+K): 6000, Inspection Interval: 250, OK, Close Cost Function graph window
History numbers (such as 64) are applied by EXAKT to the life cycles in chronological order. We must identify which life cycle of which unit is the offending one. Following the instructions below, we find that the offending history (life cycle) is the 2nd history of unit 5509R.
Procedures panel, Decisions, All Histories, Select History 5501L1 (that is, the first lifetime of the left wheel motor of haul truck 5501), hit the DnArrow key 63 times, note that we are at 5509R[2], Close
We need to examine the cause of the offending history. The instructions below reproduce the table and graphs shown on the slide. From these, we observe that the cause of the offending history is the unusually high values of Fe and Si not explained by any failure event. A reasonable way to obtain a better-fitting model is to assume that a maintenance event was not properly recorded and to exclude this history from the model.
View, Inspections, Include Events View, View by history, Select All: Uncheck, move all variables to “Unselect” position, move Iron and Si to “Selected” position (as shown on Slide), OK
Select 5509R2
Click on a point on the large peak around 65000.
We note that there is an unexplained increase in Fe and Si. That is, there is no event to explain the strange increase and, 4 months later, the decrease in values. The residual analysis of the previous slide is telling us that this history violates the model that we are attempting to build. We remove this history from consideration in the analysis because, evidently, a failure event went unreported. See the EXAKT forum for a way to remove the offending history.
Random fluctuation of monitored condition data characterizes many otherwise straight-forward CBM applications. In this exercise we use data from pressure tests, which reflects the deterioration of a sealing system, in a nuclear fuel rod manipulating mechanism. For additional background and details on this application, you may refer to the document Fuel Handling System.
Start, EXAKT for Modeling, File, Open, Navigate to /test/Files_For_Exercise4, candu_WMOD database, OK
Activate left (tree view) window, View, Inspections, OK, Ident drop down list, hit various idents and observe their inspection data, reduce the inspections window, close (X) the inspections window.
Note the randomness, yet generally rising slope, of the data. Although it is obvious that the item's deterioration is reflected by the monitored data, how does one make a decision at any given inspection if the data is so erratic? How do we know if a high reading is due to noise or to a deteriorating failure mode? The following steps in EXAKT provide a solution to this problem.
EXAKT provides a way to perform "smoothing transformations" of the data. In the OutputVarScript window you will see a small program that transforms the original variable LeakRate into the transformed variables leakSmooth and leakSmoothAve. EXAKT's programming language provides several smoothing functions. Smooth() and SmoothAve() take parameters that adjust the way they transform the variables.
Database pane, OutputVarScript, X
Note that we have defined 4 new variables from the original LeakRate and WorkingAge variables:
leakSmooth0, leakSmooth, leakSmoothAve0, and leakSmoothAve.
Let's look at the first transformation:
leakSmooth0=Smooth(LeakRate,WorkingAge,3);
leakSmooth0 will be the smoothed transformation of the LeakRate using a smoothing window of three time units. That is, each value of LeakRate is transformed into what its value would be on a linear regression line fitted to all the points in the 3-time-unit window.
Let us generate the decision graphs of the model built directly on the original (untransformed) data.
A) Modeling (on menu bar), Select Current Model, CBM Model: Seals, Submodel: LR_b1, OK, Procedures panel, Decisions, Select Ident: 5EH1, scroll down to last row, shift+8WH4, Report, Close, PageDown or PageUp, X
B) Modeling (on Procedures panel), Weibull PHM, Select Covariates, (note the variable used for this model LR_b1 is LeakRate), Cancel
Observe how much randomness there is in the inspection data. Such randomness may bias the model and may make it difficult to clearly apply an optimal decision.
Repeat Step A (Slide 4) but select the submodel LR_Smooth0 instead of LR_b1
Repeat Step B (Slide 4) but note the variable used for this model LR_Smooth0 is leakSmooth0, Cancel
The model LR_Smooth0 uses a variable that has been smoothed by the Smooth() function in EXAKT. On the decision graphs, we observe that we have eliminated the randomness of the previous submodel. But we have another problem: we observe a drooping artifact at the end of every history. This causes a poor model and a poor decision recommendation, because the current value of the condition indicator leakSmooth0 is erroneously low. In step 7 we will correct this problem with a further transformation.
Repeat Step A (Slide 4) but this time use the submodel LR_Smooth
Repeat Step B (Slide 4) but this time note that the variable used in the submodel LR_Smooth is leakSmooth
The adjusted smoothed variable produces a better model and a better decision recommendation. Note that the randomness of the data is further reduced and the drooping artifact has been corrected.
Examine the OutputVarScript. The purpose of these transformations is to smooth the variable, with special care at the end of the history to avoid the droop caused by including the non-existent values (in the smoothing 'window') beyond the last data point.
leakSmooth0=Smooth(LeakRate,WorkingAge,3);
leakSmooth0: the smoothed transformation of the LeakRate using a smoothing window of three time units. That is, each value of LeakRate is transformed into what its value would be on a linear regression line fitted to all the points in the 3-time-unit window (i.e., all points between t-3 and t+3).
However, there is a problem. Near the end (i.e. as t approaches the current time) there will be no values between t and t+3 to use in the Smooth calculation. The regression line will be fitted to zeros inside the 3 time unit window but beyond the end of data. This causes the transformed values to decrease artificially. This is called an "artifact". We have to do something about this. The next transformation to the variable leakSmooth solves this problem.
leakSmooth=leakSmooth0*(Last(WorkingAge)-WorkingAge>=3)
 +NonDecr(leakSmooth0)
 *((3-(Last(WorkingAge)-WorkingAge))*.01+1)
 *(Last(WorkingAge)-WorkingAge<3);
There are two terms in the above expression for leakSmooth, separated by a "+". The first term is operative for all points that are located 3 time units or more before the last point in the history.
The first term
leakSmooth0*(Last(WorkingAge)-WorkingAge>=3)
is easy to understand. It says simply that the unadjusted smooth transformation applies to all points except those in the final window of 3 time units. Read the first term as follows: it is equal to leakSmooth0 wherever the working age of the last point minus the working age of the current point is at least 3 time units. Otherwise the term is zero.
The second term is operative for all points within 3 time units of the end of the history. It may be simplified as:
NonDecr(leakSmooth0)*Factor*(Last(WorkingAge)-WorkingAge < 3);
It reads: the largest value of leakSmooth0 so far, multiplied by some Factor, operative within three time units of the end.
Now let us determine the value of Factor for each point close to the end (i.e. the present). Its expression is ((3-(Last(WorkingAge)-WorkingAge))*.01+1). Read Last(WorkingAge) as "the working age of the last point".
We will calculate this Factor for the last four points: the end, one time unit before the end, two time units before the end, and three time units before the end. The calculations yield:
at the end, Factor is (3 - 0)*.01 + 1 = 1.03
at 1 back from the end, (3 - 1)*.01 + 1 = 1.02
at 2 back from the end, (3 - 2)*.01 + 1 = 1.01
at 3 back from the end, (3 - 3)*.01 + 1 = 1.00 (and here the second term switches off entirely, since Last(WorkingAge)-WorkingAge < 3 is no longer true)
As we approach the end, the Factor grows slightly above 1. Together with the running maximum NonDecr(leakSmooth0), this guarantees that leakSmooth will not droop.
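Putting the two terms together, a compact Python sketch of the end correction (with the same assumption that NonDecr is a running maximum; the 1% per-time-unit ramp is the tutorial's choice):

import numpy as np

def end_corrected(smooth0, t, w=3.0):
    # Within the last w time units, replace the smoothed value by its
    # running maximum, ramped up by 1% per time unit, to cancel the droop.
    smooth0, t = np.asarray(smooth0, float), np.asarray(t, float)
    dist = t[-1] - t                              # Last(WorkingAge) - WorkingAge
    running_max = np.maximum.accumulate(smooth0)  # NonDecr(leakSmooth0)
    factor = (w - dist) * 0.01 + 1.0
    return np.where(dist >= w, smooth0, running_max * factor)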
Now that we have seen some techniques for pre-processing data to eliminate confusing noise, we may look more closely at the model itself. You may be wondering about the naming convention we adopted for the model "LR_Smooth_b1". The "b1" part of the name indicates that we have fixed Beta, the shape factor, to 1. We will proceed to learn why we did this.
Modeling (on Procedures panel), Weibull PHM, Select Covariates, Cancel
We note, in carrying out the above steps, that this Submodel “LR_Smooth” uses the transformed variable leakSmooth and that the “Fix shape factor to 1” checkbox is unchecked.
Residual Analysis, Summary Report, scroll down. (note that the goodness of fit hypothesis is rejected), reduce window, X
Look at the modeling results in the orange framed "Parameters" window inside the Procedures window. Note the NS (not significant) indication after Shape = 1.35644.
Upon executing the above steps, we note that the model is rejected by the Kolmogorov-Smirnov test. The test is telling us that the hypothesis that the model is “good” (fits the data) must be rejected.
EXAKT has told us in the previous step that working age is not significant. In fact working age is highly significant, but it correlates so closely with LeakRate that LeakRate itself contains all the age information we need to establish a good predictive model. EXAKT is really telling us that we should remove WorkingAge as a separate factor from the model by setting Shape to 1.
Modeling (on menu bar), Select Current Model, LR_Smooth_b1, Modeling (on Procedures panel), Weibull PHM, (note that the shape parameter has been fixed to 1 for this submodel), Cancel
Residual Analysis, Summary Report, expand and scroll down. (note that the goodness of fit hypothesis is not rejected), X
Similar results can be found for the models LR_SmoothAve0_b1 and LR_SmoothAve_b1. You may go ahead and examine these models using the techniques you have learned in this exercise.
Once you have made smoothing and other adjustments to the model, you may apply cost data, as in the Basic tutorial, in order to develop the decision model. It is then ready to be deployed as an intelligent agent. You no longer have to worry about the erratic and noisy nature of the data: the compensating algorithm has been built into the model and will be applied automatically each time a new set of condition monitoring data is received.
These expert judgments may include average life, standard deviation, type of deterioration, percentage of survival beyond some time point, etc.
The program will then create a small number of lifetime histories using the information provided. This will make it possible for EXAKT to estimate the parameters of the Simple Weibull.
Condition monitoring warning limits applied by the user will supplement the initial decision model.
As new histories occur, ending in PF, FF, or S, the model will be quickly rebuilt to reflect experience. EXAKT will present statistics on goodness of fit as the model improves continuously.
In this tutorial, we will create two dummy lifetimes each ending by failure (EF). EXAKT creates these lifetimes such that they fulfill conditions derived from expert knowledge about the age based failure behavior of the item, component, or failure mode. Such knowledge must be available to us, either from the OEM or from the maintenance department's experience with similar assets, components, or failure modes.
EXAKT places these events into an Events table. EXAKT will then estimate the initial statistical model. This approach will be useful when we don't have any histories ending in failure (e.g., for new equipment, or old equipment but for which we have not recorded life cycle events). We can still build an "initial" - starting model, that we update as new data arrives.
1. We will begin by building a new database in which to hold the eventual data for analysis.
Start, Data Prep Tool,
File, New Corporate Database,
From Default Template, Standard MES.
2. Navigate to a folder (say test\Exercise5), name, and Save the database.
3. Begin adding the variable names for the condition data you expect to be analyzing and from which you eventually wish to build a CBM decision model. OK.
4. Close the “Open Corporate Database” dialog.
Close the Data Preparation Tool.
5. Start, EXAKT for Modeling,
File, New,
Navigate to the folder where you created the Exercise5_MES.mdb file, select it and change the name to "Exercise5_WMOD.mdb",
Create
6. Modeling, Data Setup, Copy and paste this script into the Script Editor, Execute, Save.
Database = "Exercise5_MES.mdb"; Attach CovariatesOnEvent, Events, EventsDescription, Inspections, VarDescription
The “Create Initial Data” option in EXAKT will be used to start building a PHM model where there are an insufficient number of histories.
7. Edit, Create Initial Data, Events.
8. Select one out of eight "dummy" history generating methods.
Specify two points on the "survival curve". For example, you specify that 20% of units will have failed by 2000 h and 85% by 6500 h
(in other words, it is assumed that of a large number of similar units, 20% will fail up to 2000 h, 65% will fail between 2000 h and 6500 h, and 15% will fail after 6500 h).
There are some default restrictions on the specification.
The program will create two histories for dummy idents 00001_D and 00002_D in the Events table, as shown.
These two histories were obtained by specifying two survival points using any of the eight event generation methods. They will usually appear at the top of the Events table, but this depends on the naming of any real idents whose history events are also in the table.
9. Hit Events. Observe the Events table.
Proceed and try some of the other seven generating methods. After using one of the specification methods, hit OK.
Mean & Risk Behavior:
This option will open the Specify Mean and Risk Behavior dialog box, as shown below.
You may specify
The risk behavior as presented in this dialog is characterized by the shape parameter of the Simple Weibull distribution that is given in brackets. For example, moderately increasing hazard is represented in this dialog by the shape parameter = 2.5. (If you want to select your own value of the shape parameter, use the option Mean & Shape.)
Specify the median of the distribution and the type of risk behavior (you assume that about 50% of lifetimes are shorter than the median).
Specify the mean and the shape parameter of the Simple Weibull distribution.
This is an extension of the option Specify Mean & Risk Behavior.
Specify the median and the shape parameter of the Simple Weibull distribution.
This is an extension of the option Specify Median & Risk Behavior. For some comments on the median, see above in Median & Risk Behavior.
Specify the mean and the standard deviation of the Simple Weibull distribution.
Specify the median and the standard deviation of the Simple Weibull distribution.
For some comments on the median, see above in Median & Risk Behavior.
Specify the scale and the shape parameter of the Simple Weibull distribution.
Assume that a second expert wishes to express failure behavior using another one of the methods. Create two additional histories using the other generating method (keeping the previously generated histories).
Set up the General Project Data.
Generate artificial events as described above.
We note from the Inspections table that there are three additional histories in the sample. These histories are currently in progress; that is, they will be "temporarily suspended" for this analysis. Their beginning events must be added to the Events table as shown. Make sure that the date of each of these beginning events precedes the date of that unit's first inspection.
View, Histories, Working Age by Unit.
Edit, Check Database, Data
Notice that there are 5 histories, 5 beginnings, and 5 endings. This equivalence of the number of beginnings and endings is necessary for reliability analysis, which requires well defined life cycles. The software therefore automatically adds 3 temporary suspensions so that it may analyze complete life cycles.
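To double-check the event counts outside EXAKT, a query along these lines can be run in Access (a sketch; it assumes the Events table carries EXAKT's usual event codes, such as B for a beginning and EF/ES for failure and suspension endings):

SELECT Events.Event, Count(*) AS N
FROM Events
GROUP BY Events.Event;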
Open and close any table, to activate the buttons on the flow chart.
The C_Inspections and C_Events tables appear in the tree view.
Modeling, WeibullPHM, Select Covariates, Submodel Name: NoVar, OK, OK. View the report. The calculation of Scale, Shape, Mean Life, Med. Life, Char. Life, and Std. Dev will completely reflect the information used to generate the two artificial histories.
Decision Model, Decision Model Parameters, Replacement (C): 1000, Failure (C+K): 5000, Cost Unit: $
OK
Hit Opt Replacement Policy.
Note that this graph is one-dimensional, as opposed to the two-dimensional graph of a model that includes condition monitoring variables. The optimal decision is based entirely on working age.
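For reference, the optimization behind this one-dimensional chart is the classical age-replacement trade-off. A sketch of the standard formula (EXAKT's internal computation may differ in detail): with preventive cost C, failure cost C+K, and survival function R(t), the long-run cost per unit of working age for a "replace at age t" policy is

Φ(t) = [C·R(t) + (C+K)·(1 − R(t))] / E[cycle length], where E[cycle length] = ∫ R(s)ds taken over 0 ≤ s ≤ t,

and the optimal policy is the working age t that minimizes Φ(t).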
Decisions, All Histories, Select a (real) Item, Report.
Data Preparation, With Covariates (Complete), Modeling, Weibull PHM, Select Covariates, Submodel Name: HIERRO, Select HIERRO, OK.
Transition Model, Transition Rates, OK, Decision Model, Decision Model Parameters, Replacement (C): 1000, Failure (C+K): 5000, Cost Unit: $, Inspection Interval: 30, OK, Close Cost Function Window.
Decisions, Select, Report
This example illustrates Combined Analysis and Marginal Analysis to be used together on the same data set.
The issue: while there is a button in EXAKT called "Marginal Analysis" and a button called "Combined Analysis", there is no special procedure in EXAKT called "Combined and Marginal Analysis".
Therefore, in this exercise we perform Combined Analysis first and then prepare its output for Marginal Analysis:
Marginal Analysis - analyzes and models the individual failure modes of an item. The advantage of this technique is that, in a single WMOD, one can build an entire analysis for a complex item that contains constituent models for each failure mode.
Combined Analysis - deals with the situation where several condition based maintenance (CBM) techniques are used to monitor an item's health, but these inspections are not carried out at the same time. Therefore a method is required to extrapolate data from one test time to the time when another test is carried out.
Download the zip file and extract into a folder, say "test".
(the test folder will now contain the folder Exercise6)
Start DataPrep tool
File, New Corporate Database, From Default Template, MES for Combined Analysis
Navigate to the test\Exercise6 folder
Save the file as combinado320_MES.mdb
Enter the variable names for each inspection source (in this exercise, the oil analysis variables and the performance monitoring variables listed below):
VariableName |
---|
Al |
B |
C |
Cal |
Cr |
Cu |
Di |
Fe |
H2 |
Mg |
Mo |
Na |
Nit |
Oxi |
Pb |
Si |
Sul |
Vi |
VariableName |
---|
APDER |
APIZQ |
ATDER |
ATIZQ |
BPDER |
BPIZQ |
COPRB |
COPRH |
CRCPR |
ENGCT |
ENGIT |
ENGOP |
ENGOT |
ENRPM |
FUELP |
FUELR |
FUELT |
INJPR |
INJPU |
OUTTQ |
PLOAD |
Close the DataPrep tool.
To save time, simply overwrite the table previously created in DataPrep.
ExaktM, File, New, Navigate to your test\Exercise6 folder, create combinado320_WMOD.mdb
Modeling, Data Setup, Copy and paste Attach Script
Confirm that all tables have been attached.
Data Preparation, Enter General Data
Combined Analysis
Inspections_A_1 ->
Inspections_A_2 ->
Inspections_A_3 ->
OK (keep your finger on Enter until the process completes; this is a bug).
Confirm that the tables CMI_Events and CMI_Inspections have been created.
Data Preparation Tool, File,
New Corporate Database,
From Default Template,
MES for Marginal Analysis,
Navigate to the test\Exercise6 folder,
Filename: combinado320MA_MES.mdb,
Create.
Hit OK without entering any Covariates. This will create a blank database with the three standard mapping tables for marginal analysis:
EventToModel
IdentToModel
VarToModel
Open the database with MS Access and delete all but these three tables.
We now have three of the eight tables required for marginal analysis.
In the next section we will use SQL queries to add the other five _MA tables required for Marginal Analysis.
(using queries in combinado320_WMOD.mdb)
Open combinado320_WMOD.mdb in Microsoft Access,
Queries, New, Design View, Close,
View, SQL View, Create the following queries:
(Copy and paste from scripts.txt after replacing "path" with the path to your working folder.)
1. Make CovariatesOnEvent_MA
SELECT CovariatesOnEvent.* INTO CovariatesOnEvent_MA IN 'path\combinado320MA_MES.mdb' FROM CovariatesOnEvent;
2. Make Events_MA
SELECT CMI_Events.* INTO Events_MA IN 'path\combinado320MA_MES.mdb' FROM CMI_Events;
3. Make EventsDescription_MA
SELECT EventsDescription.* INTO EventsDescription_MA IN 'path\combinado320MA_MES.mdb' FROM EventsDescription IN 'path\combinado320_MES.mdb';
4. Make Inspections_MA
SELECT CMI_Inspections.Ident, CMI_Inspections.Date, CMI_Inspections.WorkingAge,
CMI_Inspections.Al, CMI_Inspections.B, CMI_Inspections.C, CMI_Inspections.Cal, CMI_Inspections.Cr,
CMI_Inspections.Cu, CMI_Inspections.Di, CMI_Inspections.Fe, CMI_Inspections.H2, CMI_Inspections.Mg,
CMI_Inspections.Mo, CMI_Inspections.Na, CMI_Inspections.Nit, CMI_Inspections.Oxi, CMI_Inspections.Pb,
CMI_Inspections.Si, CMI_Inspections.Sul, CMI_Inspections.Vi, CMI_Inspections.TCCP, CMI_Inspections.TCOMB,
CMI_Inspections.APDER, CMI_Inspections.APIZQ, CMI_Inspections.BPDER, CMI_Inspections.BPIZQ,
CMI_Inspections.COPRB, CMI_Inspections.COPRH, CMI_Inspections.CRCPR, CMI_Inspections.ENGCT,
CMI_Inspections.ENGIT, CMI_Inspections.ENGOP, CMI_Inspections.ENGOT, CMI_Inspections.ENRPM,
CMI_Inspections.FUELP, CMI_Inspections.FUELT, CMI_Inspections.INJPR, CMI_Inspections.INJPU,
CMI_Inspections.OUTTQ, CMI_Inspections.PLOAD
INTO Inspections_MA IN 'path\combinado320MA_MES.mdb'
FROM CMI_Inspections
WHERE (((CMI_Inspections.Event)="*"))
ORDER BY CMI_Inspections.Ident, CMI_Inspections.Date, CMI_Inspections.WorkingAge;
5. Make VarDescription_MA
SELECT VarDescription.* INTO VarDescription_MA IN 'path\combinado320MA_MES.mdb' FROM VarDescription;
You will now have these five queries in the WMOD database, which will be used to create the five data tables required for marginal analysis.
Run the queries
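Optionally, verify the transfer with a quick count against the new MES database (a sketch; replace "path" as before):

SELECT Count(*) AS NumRows
FROM Inspections_MA IN 'path\combinado320MA_MES.mdb';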
ExaktM, File, New, combinado320_MA_WMOD.mdb
Delete the three default tables: CovariatesOnEvent, EventsDescription, VarDescription
Modeling, Data Setup, Enter the script:
DATABASE = "combinado320MA_MES.mdb"; ATTACH Events_MA,Inspections_MA, CovariatesOnEvent_MA, EventsDescription_MA, VarDescription_MA, IdentToModel, EventToModel, VarToModel, IntegratedModels
Execute
Save
You can now perform marginal analysis normally (as in Exercise 2).
These two tables (IntegratedModels and DecIntegratedModels) have an identical structure.
Their names should be added to the respective MES and DMDR attach scripts.
(The download files of the next paragraph contain the added tables.)
Reopen the WMOD from Exercise 2, or download Files_For_Exercise7.zip and place it in the folder test.
ExaktM, Navigate to test\Exercise7\Files_For_Exercise7,
ComplexItemsDemo_WMOD.mdb, Open,
Modeling, Integrate Models,
To Start Choose One Procedure: New.
Integrated model: GBX
Base Name: GBX_CMOD_
GBX_CMOD_Gear1
Submodel Name: Check h1_b1
Apply
To Start Choose One Procedure: Edit
GBX
GBX_CMOD_Gear2
Uncheck h1_b1
Check h1_b1
Ignore the fact that GBX_CMOD_Gear1 has h1_b1 as a Submodel Name in the Incorporated Model//Submodel listing
Apply
Procedure: Edit
GBX
GBX_CMOD_Gear2
Check h2_b1
Apply
OK
Check IntegratedModels table
Modeling
Integrate Models
Procedure: Report
Integrated Model: GBX
Apply
Procedure: Decision Reports
Integrated Model: GBX
Select Ident: Check
Apply
Confirm that the _DMDR database is attached.
If not:
ModelDBase
Connect to model database script
Insert and run the script (by hitting Save).
(Note: the script is already inserted for you.)
DATABASE = "ComplexItemsDemo_DMDR.mdb"; ATTACH DecCovariatesOnEvent, DecEventsDescription, UnitToModel, DecEventToModel, DecVarToModel, Decisions, DecModels, DecIntegratedModels
Modeling, Integrated Models
Procedure: Store Integrated Model
Start EXAKT for Decisions
File, New, Navigate to working folder
Create ComplexItemsDemo_WDEC.mdb
Setup
Connect to model database script
Insert same script as in step 12 above
Save
Notice that the Integrated model has been added to the individual models
Reports, Create Reports
Check all four reports
OK
Full Report Icon
PgDn, ...
X
Now the failure modes of the Ident are listed in a single report.
We only had one unit (Ident) in this exercise.
Other units would be listed on this same report, e.g., GearboxB, GearboxC, etc.
The derived covariates function provides a way to perform transformations that is easier than writing EXAKT transformation programs (i.e., general and history-specific transformations).
The option Add Derived Covariates is implemented to make creation of certain useful transformations (called "Derived Covariates") easy and straightforward. Examples of derived covariates are the "composite covariate" and "lags".
A "composite covariate" is a linear combination of covariates that is part of a PH model. This transformation can be extremely useful in building the decision model, because it reduces a multi-dimensional problem to one-dimensional problem, and thus significantly increases the speed of model calculation and decisions (but this may come with some reduction in accuracy). In applying this option, EXAKT will transform the composite covariate into a derived covariate to be used in another submodel. EXAKT will immediately start a new model and display the Modeling menu.
A "lag" applied to a selected variable creates a new variable equal to one of the previous values of that variable. In that way is possible to include in the model a variable and any number of its previous values ("lag_01", "lag_02", ...). This option can be very useful to incorporate the influence of trends in the model.
EXAKTm, File, Navigate to /test/Files_For_Exercise8/BC_WMOD2003.MDB, Modeling, Select Current Model
Steps:
There are two options under “Add Derived Covariates”:
The option Composite Covariate will be disabled until the current PH submodel has been built at least once and thus has a composite covariate. The option will be enabled when revising a previously built submodel. In this case it is enabled because the current submodel, SWP2T2, has been built previously.
The option Lags is always available because it does not depend on a composite covariate.
Selecting “Composite Covariate” causes the following dialog to appear.
At this point we have two possible courses of action.
For now, hit OK. (We'll come back to the alternative action "Specify Partition" later.)
Two things happen: EXAKT creates the transformed variable, and it starts a new submodel,
Stg2 Rings (Z__00).
Summary: EXAKT created the transformed variable from the composite covariate of the formerly current model, then initiated (and made current) the new submodel named "Z__00".
The variable Z = γ1·Z1 + γ2·Z2 + ... is called the (overall) "composite covariate" of that PHM.
The parameters γ1, γ2, ... are the PHM parameters related to the respective covariates Z1, Z2, ....
We may decide to use only the composite covariate in a model, and thus reduce the dimensionality of the covariate vector to one.
A "composite covariate" is a linear combination of covariates. It appears in the equation for the PH model. The transformation reduces a multi-dimensional problem to a smaller dimensional problem. It can significantly increase the speed of model and decision calculations. This may come with some (usually negligeable) reduction in accuracy.
When we selected "Composite Covariate" from the Add Derived Covariates drop-down list, recall the mention of the "Specify Partition" button.
Make the submodel, Stg 2 rings (SWP2T2T3), current.
Modeling, Weibull PHM,
Add Derived Covariates
Composite Covariate
The Composite Covariate dialog opens with a new Submodel Name and a new Composite Covariate based on the SWP2T2T3 model.
This time hit “Specify Partition” (instead of OK).
The Composite dialog expands to allow the grouping (or partitioning) of covariates.
Select a partition from the Partition Detail drop down list.
It may be desirable to "group" covariates. EXAKT refers to these groups as partitions. One composite covariate will then represent each group, and by this technique dimensionality is reduced to the number of partitions.
The sum of all "group" (partition) composite covariates is then equal to the composite covariate of the original model.
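A small illustration with hypothetical covariates Z1, Z2, Z3 grouped into two partitions {Z1, Z2} and {Z3}:

Z__00_00 = γ1·Z1 + γ2·Z2, Z__00_01 = γ3·Z3, and Z__00 = Z__00_00 + Z__00_01

which is the same overall composite covariate as before, just split into group sums.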
EXAKT gives the created composite covariate the same default name as that of the created model, which by default is:
Z__00 (using two underscores).
In a similar manner, the default names of the partition composite covariates will be:
Z__00_00, Z__00_01, ….
Partition variable names can be edited, as well as the composite covariate name.
A covariate should be included in one and only one group. The program will check for this. The hazard function of the original PHM therefore will not change its value, but only its form. However, the dimension of the transition probability matrix (which depends directly on the number of covariates) will be significantly reduced.
The dimension of the transition probability matrix is one of the main factors in all calculations regarding the decision model and the calculation of decisions.
Select one or more covariates to place in that partition. In this case select Press_Stg3 for the first partition.
And move it down.
Repeat the procedure for the required number of group names. You may edit the selection by moving variables between the Selected Covariates list and the Unselected Covariates list, and you may also edit all names.
Press OK.
The program creates the overall composite variable Z__00 and the group variables. In this case there were two groups.
Two group covariates and the overall covariate have been created for use in models.
Note that Z__00 = Z__00_00+Z__00_01.
And EXAKT creates and makes current the new model Z__01.
We may go ahead and build this model in the usual way by hitting Select Covariates. The range of covariates to select from has been expanded: it now includes the overall composite covariate and the two group covariates that we just created.
Cancel.
Assume that we wish to build a model, Lags_00, that contains prior value(s) of a covariate.
Add Derived Covariates, Lags
A "lag" applied to selected variable creates a new variable equal to the previous value of that variable. In that way is possible to include in the model a variable and any number of its previous values ("lag_01", "lag_02", ...). This option can be very useful to incorporate possible trend in the model.
Where failures and preventive replacements have been discriminated in the CMMS, the following analytic procedure in EXAKT provides the means with which to assess and compare CBM performance in various calendar periods. This can be used by maintenance managers to justify expanding or adjusting the CBM program.
ExaktM, File, Open, Navigate to folder test\Files_For_Exercise9, Transmission_WMOD.mdb, Open
(The above conditions are easily identified in View, Histories.)
Apply minimum age (for preventive maintenance) values.²
² This is an option in EXAKT which may be used when setting the decision parameters. Sometimes there is a short period of time at the beginning of a life cycle when the mechanical components are "bedding in" or "wearing in". During this period, monitored variables such as wear metals may be abnormally high. This would cause the hazard, as calculated by the PHM, to be high; yet the model should not return a potential-failure alarm during this transitory period. By setting this parameter, we avoid false alarms during the bedding-in time.
that were not excluded in the previous step.
Minimum preventive maintenance time
See explanation given in step 7.
Regular maintenance interval
This is an option in EXAKT that is used when setting the decision parameters. This optional parameter of the CBM model will, if applicable, improve the calculation of the optimal policy. The Regular Maintenance Interval refers to non-rejuvenating events that are performed regularly and are known to impact the covariate values. Such events may include minor adjustments, calibrations, or oil changes carried out at some interval of working age; for example, oil changes performed every 600 hours.
“Current”:
What actually occurred. Of the 13 actual histories in the sample, 6 failed, 3 were replaced, and 4 are “undecided” – that is, at this time we do not know whether they will eventually fail or be preventively replaced (at present they are still operating).
EXAKT applied:
When the EXAKT policy is applied retroactively to the data set, we obtain the decisions, and hence the cost, that would have resulted had the model been in use.
Fitted EXAKT applied:
The curve of the EXAKT decision chart is fitted to the actual data so as to minimize the “average” realized cost.
11. Examine Tables A and B
EXAKT applied: The cost of the policy obtained from applying the optimal model retroactively to the sample.
Fitted EXAKT applied: The curve of the EXAKT decision chart is fitted to the actual data so as to minimize the “average” realized cost.
EXAKT: The theoretical “expected” cost effectiveness of the EXAKT model.
Replace at failure: The policy of not using any proactive maintenance (neither scheduled nor on-condition).
Table B provides the other extreme assumption.
While Table A assumed that histories that are at present incomplete will have been (successfully) preventively replaced by the proposed decision model, Table B simply ignores the incomplete histories.
One may consider the assumptions of A and B as defining the envelope of possibilities of future performance of the model. If both provide satisfactory results, we may confidently apply the model going forward.
SPAR PHM provides failure prediction and maintenance optimization by modeling failure mode behavior. Specifically, SPAR PHM captures and analyzes the relationship between mission success and the probability of occurrence of each critical failure mode.
The software accounts for projected usage profiles, current sensor readings, and planned maintenance activities as they influence the probability that a failure mode will occur within the mission. SPAR PHM applies simulation to perform what-if analysis for alternative scenarios of usage and maintenance in order to develop the optimal maintenance plan at an acceptable cost.
We begin by walking through the main parts of SPAR PHM. First load the demo file 521363 by following the next instruction.
Clear any results left over from a previous user.
Each platform has its own breakdown structure (at left in the image below) of systems, subsystems, components, and failure modes (the failure modes being the leaves of the tree structure).
The platform has logged the indicated accumulated age variables: e.g., Hours in Service, Total Hours Operated, Hours on Land, Hours in Water, Miles Driven, Engine Starts, or whatever else has been recorded and is considered significant to the survival probability of the platform and its failure modes.
Note the Engine Starts accrued age of 7348.13.
Examine the Engine Starts indicator. The value is 840 for the engine whose serial number is SN-091, not the 7348.13 Engine Starts indicated at the Lav-25A1 platform level.
Notice also the columns differentiating Age since new and Age since last visit. Of course, these can have different influences on survival probability.
The proportional hazards models (PHM), which relate monitored variables to failure mode probabilities, are input into SPAR-PHM so that prediction and optimization will account for the influence of both operational (external) and sensor (internal) variables as well as age.
Failure modes are checked as "Critical" or not, depending on whether their occurrence will compromise the mission.
At the failure mode level, the accrued age is indicated, along with the baseline distribution (in this case Weibull) and the proportional hazard model coefficients. Alternatively, an "Aging Factor" can be used, where no PHM is available, to estimate the impact of past and future operating conditions on the accumulated hazard.
The radio subsystem, for example, has five failure modes in series.
We will be considering only serial systems here. The block diagrams tell us about the relation between a system's failure modes and its ability to perform the mission. A serial block diagram (blocks in series) indicates that all components are mission critical.
The dashboard displays the current and projected "% Cumulative Damage" (accumulated hazard) for the platform and its components. The % Cumulative Damage reflects all the significant factors (past and projected usage as well as latest sensor readings).
Because we selected Radio in the Data Explorer, it is in focus at the top of the Dashboard. Its accumulated hazard is currently 46.53. If we hit Sort, the radio's failure modes will be sorted with the worst offenders at the top.
Cumulative Damage: the cumulative failure probability, F(t), calculated at the time of interest, t, and taking into account relevant age and sensor values, represents the percentage of "life" that has been used as a result of operation prior to t.
F(t) depends on the factors that drive the failure mode's hazard. A failure mode's failure behavior may depend on one or more of: accumulated age (since new and since the last maintenance visit), operational (external) variables, and sensor (internal) variables.
All of these may impact the failure rate (hazard). Therefore the model must use them in order to make optimal maintenance and supportability decisions.
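A sketch of the standard relationship between these factors and Cumulative Damage (the software's internal computation may differ in detail): the hazard h is accumulated over the operating history, and

F(t) = 1 − exp( −∫ h(s, Z(s)) ds ), the integral taken from 0 to t,

where Z(s) collects the age, operational, and sensor variables at time s. Higher hazard over any interval directly raises the projected % Cumulative Damage.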
Hit the Description Tab.
Hit the Graph tab.
The vertical line marks the present time. The blue line is the projected operating profile for this variable, which is a user input. (We will see how to provide this information to the model.)
Similarly for Sensor variables.
The simulation will include the proportional hazard model and apply it to current sensor data.
With the cumulative distribution function F(t) available from the simulation, the user may have the software calculate survival probability at the system, sub-system, LRU, or failure mode level for any duration of the current or next mission.
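A sketch of the standard conditional-survival formula behind such a calculation (assuming the simulation supplies R(t) = 1 − F(t)): the probability of surviving a further mission of duration d, given survival to the current age t, is

R(d | t) = R(t + d) / R(t)

so the same F(t) curve serves any mission duration of interest.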
In the following sections we will run simulations on three scenarios in order to project survival in the current and subsequent mission. The scenarios are:
Then, in the section, "Setting up your own scenarios", we will see how to model these simulations.
Any number of scenarios that we have developed may be recorded and played back at any time. For example:
This is the cumulative distribution function, F(t), for the configured system scenario, in this case "No maintenance".
Simulating various alternative strategies will allow choosing an optimal one.
At the current time, then, the Radio is the line replaceable unit (LRU) that represents the greatest portion of accumulated damage (cumulative failure probability).
Assume that it is expedient to replace the radio now, and plan more extensive maintenance at the next scheduled opportunity.
We would like to investigate how replacing the radio now will impact the current mission's survival probability from now till the next maintenance opportunity.
In a "no maintenance" scenario, the Dashboard, once you hit Sort in the "Current" column, indicates that the most likely source of failure will take place within the Communication system.
Hence the radio is the main offender in the communication system.
Drilling further, it is the Transmitter.
We analyzed the system by drilling down to the failure modes that have the highest probability of interrupting a mission.
If we are able to maintain (decrease F(t)) the "bad actors" now or at some specified time in the future, SparPHM allows us to understand to what degree those maintenance actions will improve the system's ability to perform its mission.
SPAR-PHM also provides the cost to do so.
Several alternative sets of activities and costs may be examined (simulated) to determine the optimal maintenance plan and operational profile.
The health indicator, F(t), allows us to consider, statistically, the probabilistic damage level. If we find that level unacceptable, SPAR PHM will guide us to a plan with which to maintain system health at the level needed to accomplish the mission.
The objective, then, is to use the software tools to determine the best actions and their timings in order to ensure that the mission will be accomplished with acceptable confidence and cost.
Thus replacing the radio has a notable impact on survival.
The next most significant LRU is "Wheels".
With the radio replaced, the Drive System is the major source of unreliability. In the next steps we drill down to the most critical failure mode in the Drive System.
Data Explorer, Drive System
Data Explorer Land Drive Sub-System
Data Explorer, Steering
The above simulations were based on operational profile scenarios that had already been configured for us. Now we will learn how to set up our own scenarios of operational profiles and maintenance plans as input to simulation.
Each operational variable is planned in the following calendar. These may be numerical or qualitative variables; both are used in the simulation. Examine the default usage profile for each of the operations variables. You may use the defaults in this exercise. Note the planning horizon in month intervals from 01/04/2008 to 01/01/2010. The horizon and its intervals may be changed for each operations variable individually by hitting the calendar drop-down as shown.
This view shows the active maintenance schedule for the current platform. We have an opportunity to perform maintenance at the current time (25/03/2008) and in six months (26/09/2008), with a planned end of mission six months later (30/03/2009).
Task planned for current maintenance period (0).
Tasks planned for next maintenance period (1).
End of mission: no tasks are planned for maintenance period (2).
SparPHM allows us to understand how a given maintenance plan reduces "virtual age" so as to increase the platform's survival profile. The simulation applies the proposed maintenance plan. This illuminates, through what-if analysis, which individual tasks, applied at the scheduled maintenance periods, will best reduce virtual age to the degree required for mission success.
Effect Factor
Note that a task does not necessarily reduce age to zero (as-good-as-new). The Effect Factor for each task shown above indicates the amount of age conservation. In the above maintenance of the radio, all failure modes are considered to have been repaired to an as-good-as-new state; that is, no age has been conserved from the previous life cycle.
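One way to read the Effect Factor, as a sketch of a common virtual-age convention (SPAR PHM's exact parameterization may differ): if c is the fraction of age conserved by the task, then

virtual age after maintenance = c × virtual age before maintenance

so c = 0 corresponds to as-good-as-new (the radio example above) and c = 1 to as-bad-as-old (no restoration).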
MTTF Multiplier
The maintenance task may include an upgrade (or downgrade), which would affect the failure distribution. This effect would be input as an MTTF Multiplier, say 1.5 (which has the effect of adjusting the parameters of the failure mode's Weibull distribution).
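For reference, the standard Weibull identity connecting the mean to the parameters is

MTTF = η·Γ(1 + 1/β)

so a multiplier m on MTTF must be absorbed by the distribution's parameters; for example, holding the shape β fixed, the scale becomes η' = m·η. (How SPAR PHM distributes the multiplier over β and η is internal to the tool.)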
Cost
The requirement is to know which failure modes are being affected and in what way, since they affect future availability (the failure distribution). There is also a cost input: some maintenance tasks are more expensive than others. Optimization seeks the required level of survivability per dollar spent, so that the most cost-effective strategy can be selected through multiple simulations.
Set up the single task for the first maintenance period.
The graph indicates the "point of diminishing returns" where additional cost expended will not yield an increase in survival probability.
Define the maintenance scope for the second maintenance point. Since the Radio was replaced at the previous point notice that it is no longer listed near the top of the list when sorted by System Failure Probability.
The last step of the Maintenance Planner wizard asks if you want to update the plan. Keep the existing plan.
The important output is the failure probability of the system. We want to understand how all those factors, collectively and individually, will affect system reliability - the capability of the platform to perform its upcoming mission.
Historical usage is procured from operational databases.
The sensor readings and operational data are processed and formatted into the SparPHM input database. The PHM models are in place for calculating the system failure distribution. Future maintenance plans and usage profiles are updated, and a simulation is performed.
A living RCM process will be used to update the distributions based on past failures as recorded in the SAP, Maximo, or other maintenance information systems.
The following table lists typical sensor data used in the software.
It is not necessary to have all the covariates stamped with common time points. The software normalizes the values so that they can be processed by the PHM algorithm on a single timeline.
For example, certain sensor readings may be taken once every 24 hours, while others are acquired hourly, weekly, and so on. All readings will be synchronized (via extrapolation) behind the scenes (in memory, not in the database) by the software.
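A sketch of what such synchronization can look like (simple linear extrapolation from the two most recent readings; the software's actual scheme is not specified here): given readings z(t0) and z(t1) with t0 < t1, the value required at a later time t* is estimated as

z(t*) ≈ z(t1) + [(z(t1) − z(t0)) / (t1 − t0)] · (t* − t1)

which places every covariate on the common timeline used by the PHM algorithm.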
EXAKT, given condition monitoring (inspection) data and maintenance event data for an item,
provides the conditional probability distributions of individual failure modes. An optimum maintenance decision can be made for each failure mode individually.
SPAR PHM, given projected usage profiles, current sensor readings, and planned maintenance activities for a platform,
provides the conditional failure probability distribution of the entire system. Running the simulation for different maintenance and usage scenarios provides an optimum plan for the entire system.
Both methodologies, to be effective, require a living RCM process in the maintenance organization!