ANFIS: Establishing and Applying to Managing Online Damage

Fuzzy logic (FL) and artificial neural networks (ANNs) each have their own advantages and disadvantages. The adaptive neuro-fuzzy inference system (ANFIS), a fuzzy system deployed on the structure of an ANN, lets FL and ANN interact so that each overcomes its limitations while the strengths of both are reinforced, and it has therefore been considered a reasonable option in real applications. Owing to these strong points, ANFIS has been employed successfully in many technology applications related to noise filtering, identification, prediction, and control. This chapter, however, focuses mainly on building ANFIS and applying it to online bearing fault identification. First, a traditional structure of ANFIS as a data-driven model is shown. Then, a recurrent mechanism depicting the relation between the processes of filtering impulse noise (IN) and establishing ANFIS from a noisy measuring database is presented. Finally, one typical application of ANFIS, the online management of bearing faults, is shown.

To build an ANFIS from a given database, an initial data space (IDS) expressing the mapping $f: X \to Y$ must first be created. A cluster data space (CDS) is then built from the IDS to form the ANFIS via a training algorithm. Viewed as a popular technique for unsupervised pattern recognition, clustering is an effective tool for analyzing and exploring data structures to build CDSs [30][31][32][33][34][35][36][37]. Practice shows that the accuracy and training time of the ANFIS depend strongly on the features of both the IDS and the CDS [2][3][4][5][6]. In the process of building ANFIS, the following two issues should be considered: (1) What is the essence of the interactive relation between the ANFIS's convergence capability and the CDS's attributes? (2) How can this essence be exploited to increase the ANFIS's ability to converge to the desired accuracy at a lower calculating cost?
Many different clustering approaches have been proposed [2-4, 10, 30-34, 37]. Separating data in X and in Y distinctly, step by step, with mutual reference between the results was described in [10]. The method, however, could not appropriately solve the above issues. Besides, the difficulty in deploying fuzzy clustering strategies along with the high calculating cost was their disadvantage. Generally, a hard relation could not fully reflect the database attributes [31,34]. The well-known method of fuzzy C-means clustering was seen as a better option in this case. It, however, was not effective enough for general "non-spherical" datasets [30,37]. Therefore, the idea of fuzzy clustering in a kernel feature space was developed to deal with these cases [30][31][32][33][34][37]. A weighted kernel-clustering algorithm can be found in [30], and a method of weighted kernel fuzzy C-means clustering based on adaptive distances was detailed in [31]. In spite of their considerable advantages, the identification and prediction accuracy of the ANFIS based on the CDS coming from [30][31] is sensitive to the attributes of the CDS due to the negative influence of noise.
Practice has shown that noise, including IN, always exists in measured IDSs [2,4,9,16], which severely degrades the accuracy of the ANFISs derived from them. There are many reasons for this, such as the lack of precision of the measurement devices, tools, and measurement methods or the negative impact of the surrounding environment. In [7], an ANFIS took part in the system in the form of an inverse MR damper (MRD) model to specify the time-varying desired control current. To maintain the accuracy of the inverse MRD model, the ANFIS was retrained after each certain period because the dynamic response of the MRD depends quite strongly on temperature. Another, more active approach is filtering noise or preprocessing data [7,9,11,17,21][38][39][40]. In [11,17], where ANFISs were employed to predict the health of mechanical systems, the vibration signal was continually measured and filtered to update the ANFISs. Regarding data preprocessing for setting up ANFIS, it can be observed that reducing time delay is really meaningful for maintaining the stability of the above online ANFIS-based applications. One suitable solution can be found in [16], where filtering IN and building ANFIS were carried out synchronously via a recurrent mechanism. In this recurrent strategy for forming ANFIS, the capability of the ANFIS training process to converge to a desired accuracy could be estimated and directed online. As a solution, attention was paid to increasing the quality of both the IDS and the CDS. Building an ANFIS via a filtered database and exploiting the ANFIS as an updated filter to refilter the database were depicted via an online and recurrent mechanism. The process was maintained until the ANFIS-based database approximation converged to the desired accuracy or a stop condition appeared.
Inspired by the ANFIS's capability, and in order to provide readers with the theoretical basis and application direction of the model, this chapter presents the formulation of ANFIS and one of its typical applications. The rest of the chapter is organized as follows. Section 2 shows a structure of ANFIS as a data-driven model deriving from fuzzy logic and artificial neural networks. Setting up the CDS consisting of the input data clusters and output data clusters, as well as the CDS-based ANFIS as a joint structure, is detailed. Deriving from this relation, a theoretical basis for building ANFIS from noisy measuring datasets is presented in Section 3. An online and recurrent mechanism for filtering noise and building ANFIS synchronously is clarified via algorithms for filtering IN and establishing ANFIS. A typical application of ANFIS related to online management of bearing fault status is shown in Section 4. Finally, some general aspects are mentioned in the last section.

Structure of ANFIS
Let's consider a given IDS having P input-output data samples $(x_i, y_i)$, $x_i \in \Re^n$, $y_i \in \Re^1$, $i = 1 \ldots P$. With a data normalization solution and a certain clustering algorithm, a CDS is then created. The kth cluster, denoted $\Gamma_k$, $k = 1 \ldots C$, consists of one input cluster and one output cluster denoted $\Gamma_k^{(A)}$ and $\Gamma_k^{(B)}$, respectively. The CDS can be seen as a framework for establishing ANFIS. This section presents how to build the CDS as well as the CDS-based ANFIS structure.

Some related notions
Some notions shown in [16] are used in this chapter as follows.
Definition 1. Normalizing a given IDS to set up a normalized initial data space, denoted $\overline{\mathrm{IDS}}$, is performed via Eq. (1). In this way, the ith data sample (also denoted $(x_i, y_i)$) in the $\overline{\mathrm{IDS}}$ is constituted as in Eq. (2).
Definition 2. The root-mean-square error (RMSE) in Eq. (3) is used to evaluate the accuracy rate of ANFIS. The required RMSE value is denoted $[E]$. The absolute error, $\varepsilon_i$, $i = 1 \ldots P$, between the data output $y_i = f(x_i)$ and the corresponding ANFIS-based output $\hat{y}_i(x_i)$ is defined in Eq. (4). The desired value of $\varepsilon_i$ is denoted $[\varepsilon]$.
Definition 3. Let's consider $x_i \in \overline{\mathrm{IDS}}$, in which the $\overline{\mathrm{IDS}}$ depicts an unknown mapping $f: X \to Y$. The ANFIS-based approximation of $f: X \to Y$ is said to be continuous if the condition of Eq. (5) holds.
Definition 4. Let's consider an ANFIS-based approximation of a mapping expressed by an IDS. The ANFIS is said to be a uniform approximation with a required error $[\varepsilon]$ if, for every $x_i \in X$ and any small constant $\varepsilon \ge [\varepsilon]$, a corresponding data sample $x_j \in \overline{\mathrm{IDS}}$ always exists such that Eq. (6) holds.
Definition 5. Data cluster $\Gamma_k$ and data sample $(x_p, y_p) \in \Gamma_k$ in a CDS derived from an IDS are depicted in Figure 1. Let $\Gamma_k \backslash p$ denote the subset consisting of the data samples belonging to $\Gamma_k$ except $(x_p, y_p)$. The subset contains $Q_{kp}$ data samples. It is assumed that all of the data samples in $\Gamma_k^{(A)}$ are distributed closely, while in $\Gamma_k^{(B)}$ most of them are located closely, except $y_p$, which is far from the others and lies at one side of $\Gamma_k^{(B)}$. This status is described by Eq. (7) and satisfies Eqs. (8) and (9). In this case, $(x_p, y_p)$ is called a critical data sample in the CDS.
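The two error measures of Definition 2 are simple to compute. The following minimal sketch assumes their standard forms, namely the RMSE as the square root of the mean squared error over the P samples and the absolute error as $|y_i - \hat{y}_i|$; the function names and the toy data are illustrative only.

```python
import math

def absolute_errors(y, y_hat):
    """Absolute error of each sample, |y_i - y_hat_i| (cf. Eq. (4))."""
    return [abs(a - b) for a, b in zip(y, y_hat)]

def rmse(y, y_hat):
    """Root-mean-square error over the P samples (cf. Eq. (3))."""
    errors = absolute_errors(y, y_hat)
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Toy data (made up): only the last sample is badly approximated.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.0, 2.0, 3.0, 2.0]
eps = absolute_errors(y, y_hat)
E = rmse(y, y_hat)
```

Comparing `E` with the required value $[E]$ and each entry of `eps` with $[\varepsilon]$ gives the two accuracy checks used throughout the chapter.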

Setting up the input data clusters
Let's consider the normalized initial data space $\overline{\mathrm{IDS}}$ (see Def. 1). Many well-known clustering methods can be used to build a CDS from the $\overline{\mathrm{IDS}}$. Here, the CDS is built by using the clustering algorithm KFCM-K (kernel fuzzy C-means clustering with kernelization of the metric) presented in [31]. In this way, the distribution of data samples in the CDS is established. The membership degree of the jth data sample belonging to the ith cluster is denoted by $\mu_{ij} \in [0, 1]$ $\forall i, j$, and the cluster centroids $x_1^0, \ldots, x_C^0$ in the CDS are specified such that the objective function of Eq. (10) is minimized, subject to $\sum_{i=1}^{C} \mu_{ij} = 1$ $\forall j$ and $\mu_{ij} \in [0, 1]$ $\forall i, j$. In Eq. (10), $x_i^0 \in \Re^n$ is the ith cluster center; $\|\phi(x_j) - \phi(x_i^0)\|^2$ denotes the squared distance between $x_j$ and $x_i^0$ in the kernel space; $\phi(\cdot)$ is the kernel function; $U$ is the distribution matrix; and $m > 1$ is the fuzzy factor.

Figure 1. Two typical distribution types in data cluster $\Gamma_k$: the impulse noise point IN $(x_p, y_p) \in \Gamma_k$ causes the distribution at one side, the right side (a) or the left side (b).
The objective function can be rewritten via the Gaussian kernel function as in Eq. (11). Differentiating $J_{KFCM}(U, x^0)$ in Eq. (11) with respect to $x_i^0$, the condition of Eq. (12) must hold at the optimal centers. From Eqs. (11) and (12) and the use of Lagrange multipliers with $\mu_{ij} \in [0, 1]$ $\forall i, j$ and $\sum_{i=1}^{C} \mu_{ij} = 1$ $\forall j$, the update laws of Eqs. (13) and (14) are obtained. By using the index $ts$ as in Eq. (15), with $[ts]$ being the required value of $ts$ and $r$ denoting the rth loop, the clustering phase is repeated until $ts \le [ts]$. Specification of the optimal centers and their relationship values as mentioned above is detailed in Appendix A of [12].
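The alternating updates of memberships and centroids can be sketched as below. This is a minimal sketch that assumes the commonly used Gaussian-kernel form of kernel fuzzy C-means, with kernel $\exp(-\|x - v\|^2/\sigma^2)$; it is not claimed to reproduce Eqs. (13)-(15) exactly. The function names, the kernel width `sigma`, and the use of the largest centroid shift as a stand-in for the index $ts$ are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, V, sigma):
    """K[i, j] = exp(-||x_j - v_i||^2 / sigma^2), centroids V (C x n), samples X (P x n)."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma ** 2)

def memberships(K, m):
    """Membership degrees mu[i, j]; every column (one sample) sums to 1."""
    dist = np.maximum(1.0 - K, 1e-12)   # kernel-induced squared distance, up to a factor 2
    U = dist ** (-1.0 / (m - 1.0))
    return U / U.sum(axis=0, keepdims=True)

def kfcm(X, V0, m=2.0, sigma=1.0, tol=1e-8, max_iter=200):
    """Alternate membership and centroid updates until the centroids stop moving."""
    V = np.array(V0, dtype=float)
    for _ in range(max_iter):
        K = gaussian_kernel(X, V, sigma)
        W = memberships(K, m) ** m * K              # weights of the centroid update
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)
        shift = np.abs(V_new - V).max()             # stand-in for the stop index ts (assumption)
        V = V_new
        if shift <= tol:
            break
    return memberships(gaussian_kernel(X, V, sigma), m), V

# Toy data (made up): two compact groups of four points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
U, V = kfcm(X, V0=[[0.0, 0.0], [5.0, 5.0]])
```

Because the Gaussian kernel is local, the result depends on where the initial centroids are placed; the toy run therefore starts with one centroid in each group rather than with a random choice.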

Setting up the output data clusters
The result of the clustering process in the input data space is an input cluster centroid vector $[x_1^0, \ldots, x_C^0]$ of the corresponding data clusters, denoted $\Gamma_1, \ldots, \Gamma_C$, respectively. Let $A_1, \ldots, A_C$ be the input fuzzy sets established via $x_1^0, \ldots, x_C^0$, respectively [12,16]. The membership value of $\bar{x}_{il}$ belonging to $A_k$ is inferred from Eq. (14) as in Eq. (16). Following the product law, the membership value of $x_q$ belonging to $A_i$ is given by Eq. (17), and its normalized membership value by Eq. (18). The membership of a data sample in each cluster determined based on Eqs. (16)-(18) is then used to specify the hard distribution status of the data samples in each cluster. It is then used to specify the index vector $a$ of the hyperplanes (or the output data clusters) $w_k(\cdot)$, $k = 1 \ldots C$. The ith data sample is hard-distributed into the kth data cluster if the condition of Eq. (19) holds. From the $t_k$ data samples hard-distributed in the kth data cluster, the vector $a = [a_0, a_1, \ldots, a_n]$ is obtained by using the least mean squares method as in Eq. (20). Finally, the value of hyperplane $w_k$ corresponding to $x_i$ is calculated in Eq. (21).
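The chain from per-dimension memberships to the hyperplanes can be sketched as follows. The sketch takes the per-dimension membership values as given, since their exact form is not reproduced here, and it assumes that a sample is hard-distributed to the cluster in which its normalized membership is largest. All function names and the toy data are hypothetical.

```python
import numpy as np

def cluster_memberships(mu_dim):
    """mu_dim[q, i, l]: membership of the l-th coordinate of sample q in fuzzy set A_i.
    Returns the product-law memberships and their normalized values, both P x C."""
    mu = mu_dim.prod(axis=2)                        # product law over the n coordinates
    mu_norm = mu / mu.sum(axis=1, keepdims=True)    # normalization over the C fuzzy sets
    return mu, mu_norm

def fit_hyperplanes(X, y, mu_norm):
    """Hard-distribute the samples, then fit a = [a0, a1, ..., an] per cluster by least squares."""
    P, n = X.shape
    C = mu_norm.shape[1]
    labels = mu_norm.argmax(axis=1)                 # assumed rule: largest normalized membership
    X1 = np.hstack([np.ones((P, 1)), X])            # leading column of ones for the offset a0
    A = np.zeros((C, n + 1))
    for k in range(C):
        idx = labels == k
        if idx.any():
            A[k] = np.linalg.lstsq(X1[idx], y[idx], rcond=None)[0]
    return labels, A

def hyperplane_values(X, A):
    """w_k(x_i) = a0 + a1*x_i1 + ... + an*x_in for every sample and cluster (P x C)."""
    return np.hstack([np.ones((len(X), 1)), X]) @ A.T

# Toy data (made up): two groups, each lying on its own line.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([1.0, 3.0, 5.0, -7.0, -8.0, -9.0])
centers = np.array([[1.0], [11.0]])
mu_dim = np.exp(-(X[:, None, :] - centers[None, :, :]) ** 2)   # hypothetical membership form
mu, mu_norm = cluster_memberships(mu_dim)
labels, A = fit_hyperplanes(X, y, mu_norm)
Wvals = hyperplane_values(X, A)
```

Each row of `A` holds the coefficients of one output data cluster, and `Wvals[i, k]` is the value of the kth hyperplane at the ith sample.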

Structure of ANFIS
As mentioned in Eq. (1), the ANFIS for approximating the mapping $f: X \to Y$ is derived from M fuzzy laws in Eq. (22), where $A_i$ denotes the ith input fuzzy set. In the fuzzification phase, the membership value of $x_q$ belonging to input fuzzy set $A_i$, denoted $A_i(x_q)$ or $\mu_{iq}(x_q)$, is specified by Eq. (17). For the defuzzification, if the center-average method is used, the output of the qth data sample is expressed via the membership values in the input fuzzy space of $x_q$ as in Eq. (23), where $y_q^i = w_i(x_q)$ is the value of hyperplane $w_i$ corresponding to data sample $x_q$ calculated in Eq. (21). Finally, all the above-mentioned contents can be depicted via the ANFIS with five layers denoted D, CL, Π, N, and S in Figure 2.
Layer D (data) has n input nodes corresponding to the n elements of the input data vector, while its outputs are the corresponding normalized values using Eq. (1).
Layer CL (clustering) expresses the clustering process. The result of this process is C clusters with C corresponding cluster centroids $x_1^0, \ldots, x_C^0$, to which C fuzzy sets, $A_1, \ldots, A_C$, are given. The output of this layer is the membership value of $x_i$ calculated for each dimension $(\bar{x}_{i1}, \ldots, \bar{x}_{in})$ via Eq. (16).
Layer Π (product layer) specifies membership values based on Eq. (17).
Layer N (normalization) estimates the normalized membership value of a data sample belonging to each fuzzy set based on Eq. (18).
Layer S (specifying) is used to estimate the output of the ANFIS based on any well-known method. If the center-average defuzzification is used, the output is calculated by Eq. (23), while it is specified by Eq. (24) if the "winner takes all" law is employed, where $w_k(x_i)$ is the value of the kth hyperplane corresponding to input data sample $x_i$ (Eq. (21)) and k is the index of the data cluster where data sample $x_i$ gets the maximum membership specified via $N_{(\cdot)}(x_i)$ as in Eq. (25).
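The two options of layer S can be sketched as follows, assuming that the normalized memberships from layer N and the hyperplane values $w_i(x_q)$ are already available as arrays. The function name, the argument layout, and the numbers are hypothetical.

```python
import numpy as np

def anfis_output(mu_norm, W, method="center_average"):
    """mu_norm[q, i]: normalized membership of sample q in A_i; W[q, i] = w_i(x_q)."""
    if method == "center_average":
        # Membership-weighted average of the hyperplane values.
        return (mu_norm * W).sum(axis=1) / mu_norm.sum(axis=1)
    if method == "winner_takes_all":
        # Value of the hyperplane of the cluster with the largest membership.
        k = mu_norm.argmax(axis=1)
        return W[np.arange(len(W)), k]
    raise ValueError("unknown method: " + method)

mu_norm = np.array([[0.75, 0.25], [0.2, 0.8]])
W = np.array([[1.0, 3.0], [2.0, 4.0]])
y_ca = anfis_output(mu_norm, W, "center_average")
y_wta = anfis_output(mu_norm, W, "winner_takes_all")
```

The center-average output blends all hyperplanes, whereas the winner-takes-all output switches between them, so the former varies smoothly across cluster boundaries.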

Building ANFIS from a noisy measuring database
This section presents the recurrent mechanism together with the related algorithms, namely the one for ANFIS-based noise filtering and the one for building ANFIS, shown in [16].

Convergence condition of the ANFIS-based approximation
Deriving from a given IDS having P input-output data samples with a data normalization solution as in Def. 1, the $\overline{\mathrm{IDS}}$ is built, from which a CDS is created as depicted in Section 2. It should be noted that IN is often considered as disturbances distributed uniformly in a signal source, which impacts negatively on the created CDS. In general, IN raises the number of critical data samples in the CDS. The negative impact of IN on the convergence ability of ANFIS training is formulated via Theorem 1 as follows.
Theorem 1 [16]: Let's consider a given $\overline{\mathrm{IDS}}$ deriving from an IDS and an ANFIS uniformly approximating an unknown mapping $f: X \to Y$ expressed by the IDS. The ANFIS is built via a CDS built from the IDS. Assume that X is compact. The necessary condition for the approximation to converge to a desired error $[E]$ is that there is no critical data sample in the CDS.
Proof: Let's consider cluster $\Gamma_k$ belonging to the CDS. Assume that $(x_p, y_p) \in \Gamma_k$ is a critical data sample (see Def. 5); it has to be proven that the ANFIS will not converge to $[E]$. Eq. (26) can be inferred from Eq. (3). Because the ANFIS is a uniform approximation of $f: X \to Y$ and X is a compact set, the ANFIS is continuous in $\Gamma_k \backslash p$, so Eq. (27) can be inferred from Eq. (26). It should be noted that the ANFIS is a uniform approximation of $f: X \to Y$ in $\Gamma_k \backslash p$, $(x_p, y_p) \in \Gamma_k$ is a critical data sample, and the samples in $\Gamma_k^{(A)}$ are distributed closely. As a result, Eq. (28) can be inferred. Due to $y_p > \max_{y_i \in (\Gamma_k^{(B)} \backslash p)} (y_i)$ (see Def. 5), Eq. (29) can be obtained from Eqs. (27) and (28). From Eqs. (8) and (29), it can be concluded that $\mathrm{RMSE} > [E]$. Similarly, due to $y_p < \min_{y_i \in (\Gamma_k^{(B)} \backslash p)} (y_i)$ (see Def. 5), Eq. (30), a lower bound on the RMSE carrying the factor $P^{-0.5}$, can also be inferred from Eqs. (27) and (28). From Eqs. (9) and (30), $\mathrm{RMSE} > [E]$ follows. Finally, it can be concluded that if at least one critical data sample exists in the CDS, the ANFIS cannot converge to the required error $[E]$. □
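The factor $P^{-0.5}$ in the proof has an elementary origin: the mean of the squared errors is at least the squared error of any single sample divided by P, so $\mathrm{RMSE} \ge P^{-0.5} |y_p - \hat{y}_p|$ for every sample p. The following numeric check illustrates this with made-up residuals; the variable names are illustrative and the numbers are not taken from the chapter.

```python
import math

def rmse_from_errors(errors):
    """RMSE computed directly from the per-sample errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical residuals: 99 well-fitted samples and one critical sample.
P = 100
errors = [0.01] * (P - 1) + [5.0]
lower_bound = abs(errors[-1]) / math.sqrt(P)   # P**-0.5 * |e_p|
value = rmse_from_errors(errors)
```

However well the other 99 samples are fitted, `value` cannot drop below `lower_bound`, which is why a single critical sample can keep the RMSE above $[E]$.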

Algorithm for filtering IN
An essential advantage of the clustering algorithms presented in [30][31] is the convergence rate. However, the quality of the ANFIS based on the CDS deriving from them is sensitive to the IDS attributes. Theorem 1 indicates that the main reason for this is the appearance of critical data samples. Besides, regarding the preprocessing of the IDS shown in [9], in spite of the positive filtering effectiveness, the calculating cost of the method is quite high. A suitable solution for the above issues can be found in [16], where the recurrent mechanism illustrated in Figure 3 was employed. The recurrent mechanism has two phases performed synchronously: filtering IN in the database and building ANFIS based on the filtered database.
Firstly, an adaptive online impulse noise filter (AOINF) is proposed. The recurrent mechanism is then depicted via the algorithm named FIN-ANFIS consisting of three main phases: filtering IN, clustering data, and building ANFIS. In this way, the filtered IDS is used to build the ANFIS, then the created ANFIS is applied as an updated filter to refilter the IDS, and so on, until either the process converges or a stop condition is satisfied. To guarantee convergence and stability, an update law for the AOINF is derived via Lyapunov stability theory.
Remark 1. ANFIS cannot converge to the required error $[E]$ if there is at least one critical data sample in the CDS (see Theorem 1). The clustering strategy of the FIN-ANFIS therefore focuses on preventing critical data samples from appearing during the clustering process, along with seeking to eliminate the critical data samples in the CDS that has been taking form. As a result, in each loop of the ANFIS training process, the strategy directs the clustering process to a new CDS where either there is no critical data sample or there are fewer of them. Theorem 2 shows the convergence condition of the training process.
Theorem 2 [16]: Following the flowchart in Figure 3, the ANFIS-based approximation of an unknown mapping $f: X \to Y$ expressed by the given IDS is built via a CDS which derives from the $\overline{\mathrm{IDS}}$ (the normalized IDS). Let Q be the number of critical data points in the CDS at the rth loop. At these critical data samples, if the data output is filtered by law (31), then the RMSE (3) of the ANFIS will converge to $[E]$:
$y_i^{(r+1)} = y_i^{(r)} - \rho \, \mathrm{sgn}(y_i^{(r)} - \hat{y}_i)$, $i = 1 \ldots Q$. (31)
In the above, $\rho > 0$ is the update coefficient to be optimized by any well-known optimization method; $(y_i^{(r)} - \hat{y}_i)$ is the error between the ith data output and the corresponding ANFIS-based output; and the function $\mathrm{sgn}(\cdot)$ is defined as in Eq. (32).
Proof: A Lyapunov candidate function $\Xi$ is chosen as in Eq. (33), from which expression (34) can be inferred. There, $\dot{\Xi} = d\Xi/dt$ expresses the derivative of $\Xi$ with respect to time, and X is the vector of state variables deriving from the IDS as in Eq. (35). From update law (31), Eq. (34) can be rewritten as in Eq. (36). It should be noted that the update process is performed with respect to the critical data points; hence, Eq. (36) can be rewritten as Eq. (37). In addition, Eq. (38) can be implied from Eqs. (33) to (35). Finally, it can be inferred from Eqs. (37) and (38) that $e(X) \to 0$ is a stable Lyapunov process. Hence, from Eq. (3) one can infer the statement to be proven.
Remark 2. (1) To enhance the ability to adapt to the noise status of the IDS, $\rho$ in Eq. (31) is specified as in Eq. (40), where $\alpha \ge 0$ is an adaptive coefficient chosen by the designer. Thus, $\rho = \rho(X_i, t)$ takes part in adjusting the filtering level $\Delta_i = |y_i^{(r+1)} - y_i^{(r)}|$.
(2) It can be inferred from Theorem 1 that critical data samples in the CDS need to be disposed of. Therefore, the useful solution offered in Theorem 2 via update law (31) is employed to establish the filtering mechanism of the AOINF as shown below.
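One filtering pass with update law (31) can be sketched as follows. The sketch assumes the usual three-valued sign function and takes the indices of the critical samples as given, since the detection condition is not reproduced here. Passing a step that grows with the error, for example $\rho = \alpha |y_i - \hat{y}_i|$, is an assumption used only to illustrate the adaptive coefficient $\alpha$; all names are hypothetical.

```python
def sgn(z):
    """Usual sign function: 1 for z > 0, -1 for z < 0, and 0 for z == 0 (assumed form)."""
    return (z > 0) - (z < 0)

def filter_outputs(y, y_hat, critical, rho):
    """Apply y_i <- y_i - rho * sgn(y_i - y_hat_i) to the critical samples only.
    rho is either a positive constant or a function of the error (assumption)."""
    filtered = list(y)
    for i in critical:
        e = y[i] - y_hat[i]
        step = rho(e) if callable(rho) else rho
        filtered[i] = y[i] - step * sgn(e)
    return filtered

# Toy data (made up): sample 1 is the critical one.
alpha = 0.5
y = [1.0, 9.0, 3.0]
y_hat = [1.1, 2.0, 2.9]
y_next = filter_outputs(y, y_hat, critical=[1], rho=lambda e: alpha * abs(e))
```

With this choice of $\rho$ the filtered output moves a fraction $\alpha$ of the way toward the ANFIS output in each loop, while non-critical samples are left untouched.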
The algorithm AOINF for filtering IN:
1. Look for critical data samples in the CDS to specify the worst data point (WP), where the continuous status of the ANFIS is worst.
2. Specify the data samples satisfying condition (42). In this condition, $\hat{y}_i^{(WP)}$ is the ANFIS-based output, while $y_i^{(WP)}$ is the corresponding data output at the WP; $\sigma > 1$ is an adaptive coefficient (1.35 for the surveys shown in [16]).
3. Based on the updating law (43), filter the data samples satisfying condition (42):
$y_q^{(r+1)} = y_q^{(r)} + \alpha \, |y_q^{(r)} - \hat{y}_q| \, \mathrm{sgn}(y_q^{(r)} - \hat{y}_q)$, $q = 1 \ldots Q$. (43)

Figure 4 illustrates the establishment of the CDS from the IDS. It consists of (1) building fuzzy clusters with centroids $(x_1^0, \ldots, x_C^0)$, that is, the input data clusters (see Subsection 2.2), (2) estimating the hard distribution of samples in each input data cluster indicated by $(x_1^0, \ldots, x_C^0)$, and (3) building the hyperplanes, or the output data clusters (see Subsection 2.3), in the output data space using the specified hard distribution status. Based on the created CDS, Figure 3 shows the flowchart of the FIN-ANFIS consisting of three main phases: filtering IN, building the CDS deriving from the filtered IDS, and forming ANFIS.

Algorithm FIN-ANFIS
Initializing: The initial index of the loop process, r = 1; the number of clusters $C \ll P - 1$; $J_{KFCM}(r) = \Omega$, where $\Omega$ is a real number, $\Omega > [ts]$; and the initial cluster centroids corresponding to r = 1, chosen randomly.
Build the input data clusters:

Establish the input data clusters:
Based on the known $x_i^0(r)$, calculate $\mu_{ij}$ via Eq. (14) to update $x_i^0(r)$ via Eq. (13).

Specify the stop condition of the clustering phase via ts in Eq. (15):
If $ts \le [ts]$, go to Step 3; if $ts > [ts]$ and $r < [r]$, set $r := r + 1$ and return to Step 1; if $ts > [ts]$, $r = [r]$, and $C < P - 1$, set $C := C + 1$, $r := 1$, and return to Step 1; and if $ts > [ts]$, $r = [r]$, and $C = P - 1$, stop (not converged).
… set $C := C + 1$, $r := 1$, set up a new cluster centroid $x_C^0$ in the neighborhood of the WP, and go to Step 5.

Filter IN:
Call the algorithm AOINF and return to Step 1.
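The four branches of the stop condition of the clustering phase can be written as a small decision function. This is a sketch of the control flow only; the function name, the returned action labels, and the argument names (`ts_req` for $[ts]$, `r_max` for $[r]$) are hypothetical.

```python
def clustering_step(ts, ts_req, r, r_max, C, P):
    """Return (action, r, C) according to the stop condition of the clustering phase."""
    if ts <= ts_req:
        return "go_to_step_3", r, C            # clustering has converged
    if r < r_max:
        return "return_to_step_1", r + 1, C    # run another clustering loop
    if C < P - 1:
        return "return_to_step_1", 1, C + 1    # add a cluster and restart the loop counter
    return "stop_not_converged", r, C          # no more clusters can be added

action_a = clustering_step(ts=0.01, ts_req=0.05, r=3, r_max=10, C=4, P=100)
action_b = clustering_step(ts=0.20, ts_req=0.05, r=3, r_max=10, C=4, P=100)
action_c = clustering_step(ts=0.20, ts_req=0.05, r=10, r_max=10, C=4, P=100)
action_d = clustering_step(ts=0.20, ts_req=0.05, r=10, r_max=10, C=99, P=100)
```

Keeping the decision separate from the clustering and filtering routines makes it easy to see that the loop terminates: either $ts$ reaches $[ts]$ or the number of clusters reaches $P - 1$.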

ANFIS for managing online bearing fault
An application of ANFIS to online bearing fault estimation, based on the ability to extract meaningful information from the big data of intelligent structures, is shown in this section. Estimating the bearing status online to take the initiative in operating the systems is meaningful because the bearing is an important machine element in almost all mechanical structures.
In [17], an Online Bearing Damage Identifying Method (ASSBDIM) based on ANFIS, singular spectrum analysis (SSA), and sparse filtering (SF) was shown. The method consists of two phases: offline and online. In the offline phase, the ANFIS identifies the dynamic response of the mechanical system in the individual bearing statuses. The trained ANFIS is then used to estimate the real bearing status in the online phase. These aspects are detailed in the following paragraphs.

Singular spectrum analysis
By using SSA, from a given time series, a set of independent additive time series can be generated [41][42][43]. This work is clarified via the algorithm for SSA presented in [42] as follows.

1. Embedding:
Let's consider a given time series of $N_0$ data points $(z_0, z_1, \ldots, z_{N_0 - 1})$. From a selected window length $L_0$, $1 < L_0 < N_0$, the sliding vectors $X_j = (z_{j-1}, z_j, \ldots, z_{j + L_0 - 2})^T$, $j = 1, \ldots, K$, with $K = N_0 - L_0 + 1$, and the matrix X as in Eq. (45) are built.

2. Building the trajectory matrix:
From Eq. (45), one builds the matrix $S = XX^T \in \Re^{L_0 \times L_0}$. Vectors $V_i$, $i = 1 \ldots d$, are then constructed, in which $\lambda_1, \ldots, \lambda_d$ are the non-zero eigenvalues of S arranged in descending order and $U_1, \ldots, U_d$ are the corresponding eigenvectors. A decomposition of the trajectory matrix into a sum of elementary matrices is then obtained.

3. Reconstruction:
Each elementary matrix is transformed into a principal component of length N by applying a linear transformation known as diagonal averaging or Hankelization. Let $Z \in \Re^{L_0 \times K}$ be a matrix of elements $z_{i,j}$.
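The three SSA steps can be sketched as below. This is a minimal sketch of basic SSA rather than the exact implementation of [42]: each elementary matrix is computed as $U_i U_i^T X$, which equals $\sqrt{\lambda_i} U_i V_i^T$ when $V_i = X^T U_i / \sqrt{\lambda_i}$ (the standard choice, assumed here), and all $L_0$ eigenvectors are kept by default, so near-zero eigenvalues simply yield negligible components. The function and parameter names are hypothetical.

```python
import numpy as np

def ssa_components(z, L0, d=None):
    """Decompose the series z into d additive series (rows of the result)."""
    z = np.asarray(z, dtype=float)
    K = len(z) - L0 + 1
    # Step 1, embedding: column j holds the sliding vector (z_j, ..., z_{j+L0-1}).
    X = np.column_stack([z[j:j + L0] for j in range(K)])      # L0 x K
    # Step 2: eigenvectors of S = X X^T, ordered by descending eigenvalue.
    lam, U = np.linalg.eigh(X @ X.T)
    U = U[:, np.argsort(lam)[::-1]]
    d = L0 if d is None else d
    comps = []
    for i in range(d):
        Xi = np.outer(U[:, i], U[:, i] @ X)                   # elementary matrix
        # Step 3, diagonal averaging: average the entries whose row and column
        # indices sum to the same time index (anti-diagonals of Xi).
        flipped = Xi[::-1]
        comps.append([flipped.diagonal(k).mean() for k in range(-(L0 - 1), K)])
    return np.array(comps)

t = np.arange(60)
z = np.sin(0.3 * t) + 0.05 * t            # toy series (made up)
comps = ssa_components(z, L0=12)
```

When all components are kept, their sum reproduces the original series, which is a convenient sanity check before discarding any of them as noise.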

Sparse filtering
In this work SF is used to extract features from a given time series-typed measured database. Relying on an objective function defined via the features, the method tries to specify good features such that the objective function is minimized [11][44][45]. To deploy SF effectively, a process with the following two phases is operated. Preprocessing data based on the whitening method [46] is carried out in the first phase. An H-by-L matrix of real numbers denoted F, depicting the relation between each of the H training data samples and the L selected features, is established in the second phase. SF presented in [11,45] is detailed as follows.
In the first phase, a training set of H data samples $x_i \in \Re^{1 \times N}$, $i = 1 \ldots H$, in the form of a matrix denoted $S \in \Re^{H \times N}$ is established from the given time series-typed measuring dataset. By adopting the whitening method [46], which employs the eigenvalue decomposition of the covariance matrix $\mathrm{cov}(S) = Z D Z^T$, the data samples are made less correlated with each other and the convergence of the sparse filtering process is sped up. In the expression, D is the diagonal matrix of its eigenvalues, and Z is the orthogonal matrix of eigenvectors of $\mathrm{cov}(S)$. Finally, the whitened training set denoted $S_{white}$ is formed as in Eq. (47).
Subsequently, in the second phase, SF maps the data sample $x_i \in \Re^{1 \times N}$ of $S_{white}$ onto L features $f_i$, $i = 1 \ldots L$, relying on a weight matrix denoted $W \in \Re^{N \times L}$. A linear relation between the data samples in $S_{white}$ and the L features is expressed via W as in Eq. (48), in which $F \in \Re^{H \times L}$ is called the feature distribution matrix. Optimizing the feature distribution in F is then performed as detailed in [45]. The features in each column of F are normalized by dividing them by their $l_2$-norm. For each row of the obtained matrix, these features per example are normalized by computing $\hat{f}_i = \tilde{f}_i / \|\tilde{f}_i\|_2$, $i = 1 \ldots H$, by which they lie on the unit $l_2$-ball. The features normalized after the two above steps are optimized for sparseness using the $l_1$-penalty to get a matrix denoted $\hat{F} \in \Re^{H \times L}$. A loop process is then maintained via Eq. (48), in which $\hat{F}$ takes the role of F, until the optimal weights of W are established that minimize the objective function $J_{SF}(W)$ of Eq. (49), after which, finally, $\hat{F}$ is redesignated F.
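The two phases can be sketched as follows. The sketch rests on several assumptions: the whitening is taken in the common symmetric form $S_{white} = S_c Z D^{-1/2} Z^T$ with $S_c$ the column-centered data and the covariance computed over the N columns, the feature map is the plain linear relation $F = S_{white} W$, and the objective is the sum of the absolute values of the doubly normalized features. The minimization over W is left to any general-purpose optimizer and is not shown; all names are hypothetical.

```python
import numpy as np

def whiten(S, eps=1e-10):
    """Symmetric whitening of the rows of S (assumed form of the whitening step)."""
    Sc = S - S.mean(axis=0)
    cov = Sc.T @ Sc / (len(Sc) - 1)                   # N x N covariance of the columns
    D, Z = np.linalg.eigh(cov)
    D = np.maximum(D, 0.0)                            # guard against tiny negative eigenvalues
    return Sc @ Z @ np.diag(1.0 / np.sqrt(D + eps)) @ Z.T

def sparse_filtering_objective(W, S_white, eps=1e-12):
    """Return (J, F_hat): the l1 objective and the doubly normalized feature matrix."""
    F = S_white @ W                                              # H x L feature distribution matrix
    F = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)     # normalize each feature (column)
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + eps)     # normalize each sample (row)
    return np.abs(F).sum(), F

rng = np.random.default_rng(0)
S = rng.normal(size=(200, 5)) * np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy data (made up)
S_white = whiten(S)
W = rng.normal(size=(5, 3))
J, F_hat = sparse_filtering_objective(W, S_white)
```

Since every row of `F_hat` lies on the unit $l_2$-ball, its $l_1$-norm is smallest when only a few features are active, which is why minimizing `J` over `W` favors sparse features.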

The ASSBDIM
The ASSBDIM focuses on online bearing fault estimation. This subsection details how the databases are set up and presents the algorithm ASSBDIM for online bearing fault estimation based on the built databases.

Building the databases for the ASSBDIM
A measuring dataset deriving from the mechanical system vibration is established for each surveyed bearing fault type. For Q fault types, one obtains Q original datasets as in Eq. (50), where $D_i$ corresponds to the ith bearing fault type $(1 \le i \le Q)$. By using SSA for $D_i$, m time series as in Eq. (51) are set up, where m is a parameter selected by the designer. This work is carried out by the three steps presented in Subsection 4.1.1, in which $D_i$ is used in the first step as the given time series of $N_0$ data points $(z_0, z_1, \ldots, z_{N_0 - 1})$ for building the trajectory matrix X in Eq. (45). Because the mechanical vibration signal is prone to the low frequency range [42], among the m time series, the (m − k) ones with the highest frequencies are considered as noise. The k remaining time series as in Eq. (52) are hence kept to build the databases. Specifying the optimal values of both k and m will be mentioned in Subsection 4.2.2. For each time series in Eq. (52), for example, $D_{ij}$, $j = 1 \ldots k$, based on SF one obtains the feature distribution matrix as in Eq. (48), which is redesignated $F_{ij}(\omega) \in \Re^{H \times L}$. By using this result for all the time series in Eq. (52), a new data matrix $D_i$ as in Eq. (53) is formed, which is the input data space of the ith data subset corresponding to the ith bearing fault type. By proceeding in this way for the Q surveyed bearing fault types, an input data space in the form of matrix (54) is established, which relates to building two offline databases denoted Off_DaB and Off_testDaB as well as one online database denoted On_DaB used for the algorithm ASSBDIM. Namely, matrix D relates to the input data space (IDS), from which the databases for identifying the bearing status are built as follows. Firstly, by encoding the ith fault type by a real number $y_i$, the output data space (ODS) of the ith subset can be depicted by the vector $\mathbf{y}_i$ of H elements $y_i$ as in Eq. (55). Then, by combining Eq. (55) with Eq. (54), the input-output relation in the three datasets Off_DaB, Off_testDaB, and On_DaB can be described as in Eq. (56). In the above, the input space D comes from Eq. (54), while the output space $\mathbf{y}$ as in Eq. (57) is constituted of $\mathbf{y}_i \in \Re^{H \times 1}$ in Eq. (55).
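The assembly of the input and output spaces can be sketched as follows. The sketch assumes that the k feature matrices of one fault type are placed side by side, which gives the size $H \times (kL)$ mentioned for the input data space of one fault type, and that the Q resulting blocks and their encoded outputs are stacked on top of each other. The function name and the toy sizes are hypothetical.

```python
import numpy as np

def build_database(features, encoding_values):
    """features[i][j]: H x L feature matrix of fault type i and kept time series j.
    Returns the input space D and the output vector y of encoded fault types."""
    D_blocks, y_blocks = [], []
    for F_list, ev in zip(features, encoding_values):
        D_i = np.hstack(F_list)                              # H x (k*L) block of one fault type
        D_blocks.append(D_i)
        y_blocks.append(np.full(len(D_i), ev, dtype=float))  # H copies of the encoding value
    return np.vstack(D_blocks), np.concatenate(y_blocks)

# Toy sizes (made up): Q = 2 fault types, k = 3 series, H = 4 samples, L = 2 features.
rng = np.random.default_rng(1)
features = [[rng.normal(size=(4, 2)) for _ in range(3)] for _ in range(2)]
D, y = build_database(features, encoding_values=[1.0, 2.0])
```

The same routine can serve for the offline and online databases, since they differ only in the measured data that feed the SSA and SF stages.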

The algorithm ASSBDIM for estimating health of bearings
In the offline phase, by initializing the parameters in vector ps in Eq. (58), together with applying SSA and SF to the measuring data stream, the Off_DaB and Off_testDaB are built as in Eq. (56). In the accuracy measures Ac and MeA, corresponding to the nth damage type, $n = 1 \ldots Q$, $cr\_samples_n$ is the number of checking samples correctly expressing the real status of the bearing, while $to\_samples_n$ is the total number of checking samples used in the survey; Q is the number of surveyed bearing fault types as mentioned in Eq. (50).
Following the MeA, an objective function J is defined. The Off_testDaB, function J, and DE [47] are then employed to optimize the parameters in vector ps to get $[L_0, N_0, m, k, H, L]_{opt}$. Namely, by using the input of the Off_testDaB for the ANFIS which has been trained by the Off_DaB, one obtains the outputs $\hat{y}_i$, $i = 1 \ldots H$. These outputs are then compared with the corresponding encoded outputs to estimate the real bearing status, which is the one encoded by "q" satisfying Eq. (62). The completion of the offline phase as above can be seen as the beginning of the online phase. During the subsequent operating process, first, in a way similar to that for building the offline database Off_DaB, the online dataset On_DaB, denoted $D_{ON}$ and having the form of $D_i \in \Re^{H \times (kL)}$ as in Eq. (53), is built. By using the On_DaB for the ANFIS trained in the offline phase, the real bearing status at this time is then specified based on Eq. (62).
The ASSBDIM can hence be summarized as follows.
The offline process: Initialize vector ps in Eq. (58).
1. Build the Off_DaB and Off_testDaB in the form of Eq. (56).
2. Train an ANFIS to identify the Off_DaB using the algorithm FIN-ANFIS.
3. The Off_testDaB is used as the database of the trained ANFIS, using the condition (62)
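The comparison of the ANFIS outputs with the encoded fault types, and the resulting accuracy measures, can be sketched as below. The sketch assumes that the identified status is the fault type whose encoding value lies closest to the ANFIS output, that the accuracy of one fault type is the percentage of its checking samples identified correctly, and that MeA is the mean of these accuracies over the fault types. These readings are assumptions, and the function names are hypothetical.

```python
def nearest_code(y_hat, codes):
    """Encoding value closest to the ANFIS output (assumed decision rule)."""
    return min(codes, key=lambda q: abs(y_hat - q))

def accuracy_per_type(y_true, y_pred, codes):
    """Ac (%) of each fault type: correctly identified samples over all its checking samples."""
    ac = {}
    for q in codes:
        idx = [i for i, yt in enumerate(y_true) if yt == q]
        correct = sum(1 for i in idx if nearest_code(y_pred[i], codes) == q)
        ac[q] = 100.0 * correct / len(idx)
    return ac

def mean_accuracy(ac):
    """MeA (%): mean of the per-type accuracies (assumed definition)."""
    return sum(ac.values()) / len(ac)

# Toy data (made up): three fault types encoded by 1, 2, and 3.
codes = [1, 2, 3]
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1.1, 1.6, 2.2, 1.9, 2.9, 3.4]
ac = accuracy_per_type(y_true, y_pred, codes)
mea = mean_accuracy(ac)
```

In the toy run the second sample of the first fault type is closer to the code 2 than to its own code 1, so it counts as a misidentification.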

Experimental apparatus and estimating way
The experimental apparatus for measuring the vibration signal is shown in Figure 5. The apparatus consists of the motor (1), acceleration sensors (2) and (4), surveyed bearings (3) and (5), a module for processing and transforming the series vibration signal incorporating software-selectable AC/DC coupling (Model: NI-9234) (6), and a computer (7). In Table 1, "encoding value" is abbreviated to "EV." The three cases listed in Table 1, related to nine of the widespread single-bearing faults as in Table 2, are surveyed. In the above, Q = 7 (see Eq. (50)) for Cases 1-2, while Q = 10 for Case 3; the damaged location is the inner race, the outer race, or the balls (denoted In, Ou, or Ba, respectively); the damage degrees are from 1 to 3 (denoted D1, D2, or D3); and the load acting on the system at the survey time is Load 1, 2, or 3 (denoted L1, L2, or L3). For example, LmUnd indicates load degree m and an undamaged bearing, while LmDnBa expresses load degree m (1,…,3), damage level n (1,…,3), and the ball as the damage location.
The ASSBDIM with H = 303, m = 30, and k = 7, along with four other methods [48][49][50][51], is surveyed. The first one [48] ($N_{in} = N_{out} = 100$; number of segments $20 \times 10^3$; and $\lambda = 10^{-5}$) is the intelligent fault diagnosis method using unsupervised feature learning toward mechanical big data. The second one [49] employs the energy levels of the various frequency bands as features. In the third one [50], a bearing fault diagnosis based on permutation entropy, empirical mode decomposition, and support vector machines is shown. In the last one [51], a method of identifying bearing faults based on SSA is presented.
For the surveys, along with Ac and MeA, the root-mean-square error as in Eq. (63) is also employed, where $y_i$ and $\hat{y}_i$ are the encoding and predicting outputs, respectively.

Figure 7. The $\hat{y}_i$ and $y_i$ depicted by lines (6) in Figure 6, zoomed in.
Figure 8. The error reflecting the difference between $y_i$ and $\hat{y}_i$ in Figure 6.

Some survey results
The measured databases from Cases 1 to 3 with Q = 7 as in Table 1, together with the ASSBDIM [17] and the methods from [48][49][50][51], were adopted to identify the status of the bearing. The obtained results are shown in Figures 6-9 and Tables 3 and 4.

Discussion
Following the above results, it can be observed that among the surveyed methods, the ASSBDIM, which is based on ANFIS, gained the best accuracy. This aspect can be recognized via the nearly equivalent values between the encoding and predicting outputs of the tested data samples. The small difference depicted by the zoomed-in view in Figure 7 and the root-mean-square error in Figure 8, as well as the higher values of Ac and MeA deriving from the ASSBDIM in Tables 3 and 4 and Figure 9, clearly reflect the ANFIS's identification ability.
It should be noted that the methodology shown via the algorithm ASSBDIM can also be used to develop methods for managing damage of mechanical structures.

Conclusion
The hybrid structure ANFIS, where ANN and FL can interact to not only partly overcome the limitations of each model but also uphold their strong points, has been seen as a useful mathematical tool for many fields. Inspired by the ANFIS's capability, and in order to provide readers with the theoretical basis and application direction of the model, this chapter presents the formulation of ANFIS and one of its typical applications.
Firstly, the structure of ANFIS as a data-driven model deriving from fuzzy logic and artificial neural networks is depicted. The setting up of the input data clusters, the output data clusters, and ANFIS as a joint structure is detailed. Deriving from this relation, the method of building ANFIS from noisy measuring datasets is presented. The online and recurrent mechanism for filtering noise and building ANFIS synchronously is clarified via the algorithms for filtering noise and establishing ANFIS. Finally, the application of ANFIS to the online management of bearing faults is presented. The compared results reflect that among the surveyed methods, the ASSBDIM, which exploits the identification ability of ANFIS, gains the best accuracy. Besides, the methodology shown via this application can also be used as an appropriate solution for developing new methods of managing damage of mechanical structures.
In addition to the above identification field, it should be noted that (1) ANFIS has also attracted the attention of many researchers in other fields related to prediction, control, and so on, as mentioned in Section 1, and (2) ANFIS can collaborate effectively with some other mathematical tools to enhance the effectiveness of technology applications.

Author details
Sy Dzung Nguyen