The Feasibility of Human Identification from Multiple ECGs using Maximal Overlap Discrete Wavelet Transform (MODWT) and Weighted Majority Voting Method (WMVM)

Electrocardiography (ECG) has been a subject of research interest in human identification because it is a promising biometric trait that is believed to have discriminatory characteristics. However, features of ECGs that are recorded at different times often vary significantly. To address the variability of ECG features over multiple records, we propose a new methodology for human identification using ECGs recorded on different days. To demonstrate the applicability of our method, we use the publicly available ECG ID dataset. The main goal of this work is to extract the most significant and discriminative wavelet components of the ECG signal and then to utilize the ECG spectral change for human identification using a multi-level filtering technique. Our proposed multi-channel identification system uses the Maximal Overlap Discrete Wavelet Transform (MODWT) and its inverse (the IMODWT) to create multiple filtered ECG signals. The discriminative feature that we utilize for human identification is based on modeling the dynamic change of the frequency components in these multiple filtered signals. To reach the best possible identification performance, we use the Weighted Majority Voting Method (WMVM) for ECG classification. We evaluated the robustness of our proposed method over several random experiments and obtained 92.29% average identification accuracy, 0.9495 precision, 0.9229 recall, 0.0771 FRR and 0.0013 FAR. These results indicate that filtering some of the ECG wavelet components, combined with a data fusion technique, can be utilized for human identification.


Introduction
In recent years, physiological signals have shown great potential in human recognition [1][2][3]. In addition, it has been demonstrated that physiological signal-based identification systems are more robust against counterfeiting than conventional biometric systems [4,5]. Therefore, researchers have presented various methods to investigate the possibility of human recognition via biomedical signals [6][7][8]. Specifically, among different biomedical signals, the electrocardiogram (ECG) has been widely studied as a new approach in human identification. It has been shown that ECG based biometric systems achieve satisfactory identification accuracy in a wide range of applications [3]. In fact, the ECG has some key advantages, mainly its hidden nature and its liveness assurance, which make it preferable to other biometric modalities such as face, fingerprint and iris [3,5], since those can be damaged or stolen. Additionally, the ECG has several characteristics that are required for any biometric modality. In general, a biometric trait should satisfy the following requirements to be used for human recognition [9,10]:
(1) Universality: the trait should be present in the living population.
(2) Uniqueness: the trait characteristics should differ substantially among different people.
(3) Collectability: the trait should be quantitatively measurable and easily accessible.
(4) Acceptability: the trait should be user friendly and widely acceptable.
(5) Resistance to circumvention: the trait should be resistant to the various spoofing attacks.
(6) Permanence: the trait features should remain stable over time.
The ECG satisfies most of the abovementioned requirements because it is essentially a vital sign and is present in all living people [11][12][13]. In addition, it has been reported that the ECG has unique patterns among individuals [3] and that it can be easily recorded using a single lead [14,15]. Moreover, the ECG can hardly be forged due to its biological nature and its role as a liveness indicator [16]. However, the stability of ECG features is one of the most controversial characteristics because it has been demonstrated that cardiac signals are highly affected by many geometrical, individual, and technical factors [5,17].
Digital Medicine and Healthcare Technology
To illustrate, geometrical attributes such as heart size, cardiac muscle thickness, heart shape and the number of cardiac cells involved in the electrical activity directly dictate the routes of the electrical current inside the heart [5,18,19]. On the other hand, personal characteristics, mainly health status, age and weight, could cause changes in the heart position and orientation [20]. Hence, these factors shift the electrical current orientation and also change the conductivity of the heart [5,20]. Additionally, electrode features such as the type, quantity, degree of dryness and position may cause some changes in the electrical properties of the electrodes [5,18,20]. Consequently, all these factors create morphological variations in the ECG signals, which are most noticeable in ECGs that are recorded at different times [18]. This changeability of the ECG features is usually categorized as intra-subject variability and inter-subject variability [21]. The former refers to the variations within or between ECGs of a single subject, whereas the latter refers to the variations between ECGs of different subjects [3,5].
In fact, the inter-subject variability is highly desired for human identification because the uniqueness of the ECG signal can be explained through finding the core differences between cardiac signals of different individuals [5,20]. On the other hand, the intra-subject variability could be beneficial because the dynamic change between ECG features of a single subject can be modeled to create individual based biometric signatures [3,18]. Hence, a perfect biometric trait should have a very high inter-subject variability and a very low intra-subject variability [3,5,16]. However, the stability of these parameters over time, i.e. their permanence, remains the main challenge in using multiple ECGs for human identification.
The variability of ECG features can be clearly noticed when analyzing the fiducial and non-fiducial characteristics of such a biomedical signal. For example, in figure 1 we show the heartbeats of two subjects from the ECG-ID database. The inter-subject variability can be observed in the different morphologies of the heartbeats of these two individuals. On the other hand, the intra-subject variability can be noticed as the significant fluctuations in the amplitude of the QRS complexes. In addition, figure 2 shows the intra-subject variability, which can be seen as the rapid changes in the ECG frequency components when the data are recorded on different days. Moreover, figure 3 shows the various morphological bundles of the ECG heartbeat such as the right bundle branch block beat, the left bundle branch block beat and the normal beat [22].
All the previously presented factors show that utilizing the cardiac signal for human recognition depends not only on choosing the appropriate features, but also on categorizing the variability of ECG features over time [23]. In this work, we focus on investigating the feasibility of human identification using ECGs that are recorded on two different days. Specifically, we selected the ECG-ID data because it was originally recorded for biometric purposes. For each subject, two 20-second ECG recordings were chosen. These ECG signals were recorded over a six-month period [24,25]. Some subjects were excluded because their T wave is tall and higher than the QRS complex. In addition, we designed our method to decompose the ECG using sym4, which generally detects the QRS features. Therefore, including the whole database and treating it uniformly, without changing the mother wavelet or classifying the reference data according to the heartbeat morphology, would have affected the performance of our methodology because there is a significant difference between features of normal heartbeats and other types of heartbeats with larger T waves.

Literature review
The use of ECG in various clinical diagnostic applications has revealed different characteristics of this human cardiac signal. Therefore, the potential use of ECG for human identification was driven by utilization of these features to create a new biometric modality. To the best of our knowledge, the first attempt to utilize the ECG for human recognition was presented by Biel et al. [1].
The authors used 12 fiducial heartbeat features and reported 95% identification accuracy. However, most of the ECG based human identification systems have been presented in the last decade.
Dar et al. [26] presented a method based on the discrete wavelet transform (DWT) for human recognition. The preprocessing stage [26] involved removal of baseline wander and power-line interference followed by normalization of the signal with R peak detection. The DWT was applied using Haar wavelet coefficients at five levels of decomposition to extract the ECG features. Additionally, the Best First Search (BFS) method was performed for feature reduction and the k nearest neighbor method was utilized for feature classification. Dar et al. [26] reported 82.03% identification accuracy.
On the other hand, Morteza et al. [27]
Further, Kim et al. [33] proposed a method based on a generalized likelihood ratio test (GLRT) and composite hypothesis testing. Based on the results, Kim et al. [33] reported 93% detection probability for user authentication. Tan et al. [34] introduced a sparse representation learning framework that utilizes the time frequency distribution of the ECG signal for biometric purposes and reported 98% average identification accuracy. Their work was based on using the statistical n-best adaptive Fourier decomposition (SAFD) method for reducing the intra-subject variability and increasing the inter-subject variability of ECG features.
Furthermore, the research investigations in [35] reported that the QRS complex exhibits significant features among different individuals and that such features can be utilized for human authentication. The authors in [36] and [23] have shown that the ECG signal reveals varied and unique patterns. The research findings in [37] showed that ECG based biometric identification highly relies on the type of methods that are [2,3,23].
In this paper, we focus on identifying individuals using multiple ECGs. These multiple ECGs are recorded on different days, and they are more likely to have some variability in features. Based on our previous work in [38], modeling the dynamic change in ECG spectral features and using Fréchet based distance measurement for ECG classification have shown excellent results on individual recognition. However, when we select the ECG data from different days, the identification performance decreases significantly. This is expected because of the above-mentioned reasons for the variability of ECG features. To solve this problem, the main contributions of this paper include:
(1) Addressing the variability of ECG features at the preprocessing stage by decomposing the signal and performing data filtering methods. Unlike previous works where the processing stage involved noise removal and signal correction [26,30,31,32], in this paper we focus on partitioning the signal variability according to its wavelet components. The variability of the ECG features can be analyzed by decomposing the signal into its wavelet components [39]. To accomplish our objective, we show that filtering some of the high frequency wavelet components in a set of parallel processes can be modeled for human identification. The main advantage of this process is that it reduces the variability of the ECG features at the fundamental wavelet components while keeping the majority of the signal information [40].
(2) Proposing a new technique for data fusion at the classification stage to reach true identification. To achieve our goal, we show that utilizing the minimum Fréchet distances between filtered versions of multiple ECGs can be modeled to create a weighted scoring technique based on majority voting for reaching correct identification. The main advantage of our proposed weighted majority voting is that it uses the minimum distance between multiple ECGs, a unique feature that has an effective role in decision making [41].

Materials and methods
In this work, we propose a method for individual identification using multiple ECGs that are recorded on two different days. The general flowchart of our proposed methodology is shown in figure 4.

The ECG referencing and testing data
As mentioned above, we used the public ECG ID database of 62 subjects because it was originally recorded for biometric purposes [25]. For each subject, we selected two ECG recordings. The public ECG ID data do not have information about the exact time and date on which the ECGs were recorded. However, all the ECG recordings were taken over a six-month period. We grouped the ECG ID data into two categories, namely the referencing data and the testing data. To achieve our goal of identifying individuals using multiple ECGs, we use the ECGs from the former group for referencing purposes and we use the ECGs from the latter group for testing purposes.

Preprocessing using Maximal Overlap Discrete Wavelet Transform (MODWT)
The Maximal Overlap Discrete Wavelet Transform (MODWT), like the Discrete Wavelet Transform (DWT), is a linear filtering process which is used to decompose a signal into a set of time dependent wavelet and scaling coefficients [39]. However, unlike the DWT, the MODWT is a non-orthogonal transform [39]. The basic idea of the MODWT relies on using the values that are removed from the DWT by downsampling.
Therefore, the MODWT is a highly redundant transform compared to the DWT, and it is defined for all sample sizes. Like the DWT, the MODWT is utilized to perform multiresolution analyses (MRAs), and the redundancy of the MODWT enables instantaneous comparison between the original time series and its decomposition at each level. Most importantly, the MODWT coefficients of various scales are usually not correlated. Thus, it is a useful transform to partition the variability of the signal [39].
The ECG is a nonstationary signal, and its features are often localized in time and frequency [22]. Therefore, it is better to analyze such a signal using wavelets because they decompose the signal and provide a sparser representation [3].
However, choosing the most appropriate wavelet function depends on the ECG features of interest [2]. Specifically, the QRS complex is the prominent wave of the ECG; therefore, we selected sym4 as the analyzing wavelet to decompose the ECG into time-varying frequency (scale) components. In addition, the QRS complex can be easily segmented compared to the P and T waves, since they require expert labeling to achieve proper segmentation [42,43]. The sym4 wavelet resembles the QRS complex and is an appropriate choice to detect most of the ECG information [14].
In figure 5, we show a comparison between the sym4 wavelet and the QRS complex. The figure shows that sym4 resembles the QRS complex. Although sym4 is generally utilized to detect QRS features, it can also detect non QRS features by changing the scale and translation parameters [44]. In this paper, the wavelet coefficients are computed using different versions of the analyzing wavelet. The small scales, compressed versions of sym4, are utilized to detect the high frequency components of the signal. In contrast, the large scales, stretched versions of sym4, are utilized to detect the low frequency components of the signal [44,45].
In signal processing, real world biological signals such as the ECG are sampled over finite intervals of discrete times [46]. Therefore, the ECG data can be written as a discrete function $f(x)$ recorded at $n_l$ samples. The function $f(x)$ can be expressed as a linear combination of two main functions, i.e., a scaling function $\phi(x)$ and an analyzing wavelet $\psi(x)$ at varying scales and translations [39]. The linear representation of $f(x)$ can be written as:
$$f(x) = \sum_{k=0}^{n_l-1} c_k\, 2^{-J_0/2}\, \phi(2^{-J_0}x - k) + \sum_{j=1}^{J_0} f_j(x) \tag{1}$$
where
$$f_j(x) = \sum_{k=0}^{n_l-1} d_{j,k}\, 2^{-j/2}\, \psi(2^{-j}x - k) \tag{2}$$
and $J_0$ is the number of levels of wavelet decomposition.
According to equations (1) and (2), the MODWT returns $n_l$ scaling coefficients ($c_k$) and $J_0 \times n_l$ detail coefficients ($d_{j,k}$). The detail coefficients are generated at each level $j$, with $j = 1, 2, \dots, J_0$, whereas the scaling coefficients are generated only at the final decomposition level $J_0$. Therefore, $X$ can be written as in equation (3), where $X$ is the ECG data, $W_j$ consists of the detail coefficients at scale $j$ and $V_{J_0}$ are the final level scaling coefficients.
In this work, we set $J_0$ to 10 to provide redundant MRAs of the ECG signal. In figure 6, we show the 10 level wavelet coefficients of a random ECG signal. The figure shows the detail coefficients for scales $2^1$ to $2^{10}$. In addition, it shows the final level scaling coefficients. These coefficients permit an easier analysis of the ECG because they provide a sparser (reduced) representation of the signal. These wavelet components are likely to have some variations when they are extracted from multiple records. To address the variability of ECG features, we filter some of the wavelet components to obtain the most significant information.
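To make the link between decomposition level and frequency content concrete, the sketch below lists the approximate pass band of each detail level for a 10 level dyadic decomposition. The sampling rate `fs = 500.0` Hz is an assumption made for illustration (it is not stated in this section), and the band edges are the usual idealized dyadic approximation rather than the exact response of the sym4 filters.

```python
def detail_bands(fs, levels):
    """Approximate pass band (low_hz, high_hz) of detail level j = 1..levels.

    Uses the idealized dyadic rule: level j spans fs / 2**(j + 1) to fs / 2**j.
    """
    return [(fs / 2 ** (j + 1), fs / 2 ** j) for j in range(1, levels + 1)]


fs = 500.0   # assumed sampling rate in Hz (hypothetical value for illustration)
J0 = 10      # number of decomposition levels used in the text

bands = detail_bands(fs, J0)
for j, (low, high) in enumerate(bands, start=1):
    print(f"W_{j}: {low:8.3f} Hz to {high:8.3f} Hz")

# The final scaling coefficients V_J0 cover everything below the last band.
print(f"V_{J0}: below {bands[-1][0]:.3f} Hz")
```

Under this assumption, levels 1 to 4 cover roughly 15.6 Hz to 250 Hz, which is the region usually described as the high frequency part of the decomposition.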

Filtering and reconstruction using the Inverse Maximal Overlap Discrete Wavelet Transform
The ECG is an aperiodic random signal whose value at any instant is unknown and generally unpredictable [22]. In addition, the ECG features exhibit some changes over time; specifically, the mean and variance of the ECG are functions of time, and they can vary significantly from heartbeat to heartbeat [46]. The variability of ECG features is caused by physiological and non-physiological factors, which we have explained in the introduction [3,5]. Consequently, it has been demonstrated that utilizing the ECG for human recognition requires building identification systems that are adaptable to the variability of ECG features [23]. To achieve our objective, we developed a multi-channel wavelet-based filtering system because we expect that by filtering some of the wavelet components, the variability of ECG features will decrease and the performance of the identification process will increase.
In fact, all the ECG wavelet-based features including the high frequency and the low frequency components can be useful for the identification process [38,45].
Therefore, we designed our filtering system to remove different wavelet components at different levels of the filtering process [46]. The proposed system is designed to filter the high frequency components in a set of parallel processes. However, our filtering system is not designed to filter the low frequency components, which are calculated by utilizing larger scales of the analyzing wavelet, because we expect that such components carry most of the permanent information of the ECG (see figure 7). Because we eliminated some of the wavelet information, the reconstructed signals are referred to as filtered ECGs [39]. Since we apply different levels of filtration, we create different types of filtered ECGs [40].
The main goal here is to find the most significant wavelet components of the ECG signal which are utilized to create our multi-channel identification system.
In figure 8, we show our Parallel High Frequency Filtering System (PHFFS), which consists of five channels. In addition, table 1 shows the wavelet coefficients that are removed at each level of the filtering system. The PHFFS removes the detail coefficients from levels $W_1$ to $W_4$ in a parallel process. Moreover, the PHFFS consists of an additional channel which removes the detail coefficients of levels $W_1$ to $W_4$ (figure 8).
Then, the ECG is reconstructed using the Inverse Maximal Overlap Discrete Wavelet Transform (IMODWT) at each level of these filtering processes [39].
Therefore, the output of the PHFFS consists of several filtered ECGs. According to
Table 1. Reconstruction of the ECG signal using the multichannel wavelet-based filtering system. Columns: filtering channel p; removed coefficients using the PHFFS; filtered signal notation.
identification process and utilized as a unique personal identifier. In figure 9, we show an example of all the reconstructed signals. Generally, the high frequency components of the ECG have slightly larger statistical variation than the low frequency components [47]. This variation may influence the overall identification performance for some individuals. Therefore, we apply the PHFFS to remove multiple wavelet components and investigate the applicability of the identification system using a reduced amount of the signal information [40].
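The filter-then-reconstruct idea can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the two-tap Haar MODWT filters instead of sym4 to stay short, the `CHANNELS` layout (channels 1 to 4 each remove one detail level, channel 5 removes levels 1 to 4 together) is one hypothetical reading of the five-channel description, and the scaling coefficients are zeroed in every channel as described for the classification stage. All function and variable names are illustrative.

```python
import math

# Haar MODWT filters. Assumption: Haar stands in for sym4 to keep this short.
H = [0.5, -0.5]  # wavelet (detail) filter
G = [0.5, 0.5]   # scaling (approximation) filter


def modwt(x, levels):
    """Forward transform: returns ([W_1..W_J0], V_J0) using circular filtering."""
    n, v, details = len(x), list(x), []
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)  # taps are spread apart instead of downsampling
        w = [sum(H[l] * v[(t - step * l) % n] for l in range(2)) for t in range(n)]
        v = [sum(G[l] * v[(t - step * l) % n] for l in range(2)) for t in range(n)]
        details.append(w)
    return details, v


def imodwt(details, scaling):
    """Inverse transform: rebuilds a signal from (possibly zeroed) coefficients."""
    n, v = len(scaling), list(scaling)
    for j in range(len(details), 0, -1):
        step, w = 2 ** (j - 1), details[j - 1]
        v = [sum(H[l] * w[(t + step * l) % n] + G[l] * v[(t + step * l) % n]
                 for l in range(2))
             for t in range(n)]
    return v


def filtered_ecg(x, levels, removed_levels, drop_scaling=True):
    """Zero the chosen detail levels (and optionally V_J0), then reconstruct."""
    details, scaling = modwt(x, levels)
    for j in removed_levels:
        details[j - 1] = [0.0] * len(x)
    if drop_scaling:
        scaling = [0.0] * len(x)
    return imodwt(details, scaling)


# Hypothetical channel layout: channel index -> detail levels removed.
CHANNELS = {1: [1], 2: [2], 3: [3], 4: [4], 5: [1, 2, 3, 4]}

# Toy signal standing in for an ECG segment (not real data).
x = [math.sin(2 * math.pi * t / 32) + 0.3 * math.sin(2 * math.pi * t / 3)
     for t in range(256)]

filtered = {p: filtered_ecg(x, 10, removed) for p, removed in CHANNELS.items()}
exact = filtered_ecg(x, 10, [], drop_scaling=False)  # nothing removed
print(max(abs(a - b) for a, b in zip(exact, x)))     # reconstruction error
```

Keeping every coefficient reproduces the input up to floating-point error, so any difference between a filtered ECG and the original comes only from the coefficients that were zeroed.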

Spectral feature extraction (STFT)
The feature extraction stage involves utilizing significant characteristics of the ECG for human identification. Since the ECG is a random time varying signal, it generally has intra feature variability between multiple heartbeats [4,22]. Therefore, the most appropriate way to utilize the ECG for human identification depends not only on filtering some of the wavelet components but also on tracking the dynamic change of features among multiple heartbeats [2]. In our previous work [38], we introduced a new feature that is based on modeling the dynamic change of ECG spectral components. This feature, extracted from the main signal, has shown excellent results for individual recognition. In contrast, in this work we extract the dynamic change of the spectral components in each of the filtered ECG signals.
The complete process of extracting the dynamic change of the ECG frequency components can be found in [38]. In short, let $X^r$ be the referencing ECG from
Similarly, let $X^t$ be a testing ECG from day 2; we obtain the spectral feature matrix $F^t_p$ using the same procedure. For each subject, the process of randomly selecting testing data is repeated many times to evaluate the performance of our method.

Classification using Fréchet distance
ECG classification for human identification purposes is the process of correctly assigning a class to the transformed feature matrices [9,16,30,44,47]. The choice of classifier depends strongly on the geometrical characteristics of the feature matrix [38]. Here, we also refer to our previous findings on the robustness of utilizing the Fréchet distance for correctly classifying the covariance matrices of the ECG dynamic features [38]. In short, let $n_{sb}$ be the total number of subjects. We use equation (4) $n_{sb}$ times to compute the Fréchet distance (fd) between a single testing feature matrix of a random subject and the reference feature matrices of all subjects, where $1 \le m \le n_{sb}$ is the index of one subject, $n = 1, 2, 3, \dots, n_{sb}$ refers to the subject number with a total of $n_{sb}$ subjects, and $p$ is the index of the filtering channel (see table 1).
In this work, the scaling coefficients ($V_{J_0}$) are larger than the detail coefficients, which may make the feature matrices singular [46]. Consequently, the two square roots in equation (4) ($\sqrt{\hat{A}^t_{p,m}}$ and $\sqrt{\hat{A}^r_{p,n}}$) may not exist. To address this problem, we removed the scaling coefficients in all the filtered channels (see table 1).
However, the scaling coefficients may carry distinctive information that can be used in other biometric applications such as ECG data clustering. According to equation (4), we use the referencing data of all subjects to obtain the Fréchet distance, which is computationally costly; the scaling coefficients could be used to cluster the referencing data and thereby reduce the computational time of classification.
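For orientation, a widely used closed form of the Fréchet distance between two feature sets summarized by a mean vector and a covariance matrix is given below. Treating $\mu^t_{p,m}$, $\mu^r_{p,n}$ as sample means and $\hat{A}^t_{p,m}$, $\hat{A}^r_{p,n}$ as sample covariances is an assumption made here (the mean symbols in particular are introduced only for illustration); the exact form of equation (4) is given in [38] and may differ.

```latex
\[
fd^2 \;=\; \bigl\lVert \mu^t_{p,m} - \mu^r_{p,n} \bigr\rVert_2^2
\;+\; \operatorname{tr}\!\Bigl( \hat{A}^t_{p,m} + \hat{A}^r_{p,n}
 - 2 \bigl( \sqrt{\hat{A}^t_{p,m}}\; \hat{A}^r_{p,n}\; \sqrt{\hat{A}^t_{p,m}} \bigr)^{1/2} \Bigr)
\]
```

When the two covariance matrices commute, the trace term reduces to $\lVert \sqrt{\hat{A}^t_{p,m}} - \sqrt{\hat{A}^r_{p,n}} \rVert_F^2$, which shows why both matrix square roots need to be well defined for the distance to be computed reliably.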
After finding the Fréchet distances, we use equation (7) to assign the testing data to a specific class (person), where $FD_p$ is a vector that has $n_{sb}$ individual Fréchet distances.
In addition, equation (8) returns the minimum Fréchet distance $d_p$ that is obtained by using filter $p$.
Since our PHFFS consists of $n_p$ filtering channels, we obtain $C_p \in \mathbb{R}^{n_p}$, a vector of $n_p$ classes in which each filtering channel returns one class. In this work, the total number of classes is equal to the total number of subjects. We also obtain $D_p \in \mathbb{R}^{n_p}$, a vector of $n_p$ minimum Fréchet distances. For each subject, we repeated all the previously explained processes many times, changing the testing ECG data each time, to examine the stability of our method. Thereafter, the $C_p$ and $D_p$ vectors were passed to the data fusion algorithm.
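The per-channel decision described above amounts to a nearest-reference rule: each channel picks the subject with the smallest distance and reports that distance. A minimal sketch follows; the distance values are invented for illustration, the function name is hypothetical, and subjects are indexed from 0.

```python
def classify_channel(fd_p):
    """Return (class index, minimum distance) for one filtering channel.

    fd_p[n] is the distance between the test features and the reference
    features of subject n, both produced by the same filtering channel.
    """
    best = min(range(len(fd_p)), key=fd_p.__getitem__)
    return best, fd_p[best]


# Hypothetical distances: 5 filtering channels (rows) x 4 subjects (columns).
FD = [
    [0.9, 0.4, 0.7, 1.2],
    [0.8, 0.5, 0.3, 1.1],
    [1.0, 0.2, 0.6, 0.9],
    [0.7, 0.6, 0.8, 0.5],
    [0.9, 0.3, 0.7, 1.0],
]

# C collects one class per channel, D the matching minimum distances.
C, D = zip(*(classify_channel(fd_p) for fd_p in FD))
print(C)  # (1, 2, 1, 3, 1)
print(D)  # (0.4, 0.3, 0.2, 0.5, 0.3)
```

In this toy example three of the five channels agree on subject 1, while two channels pick other subjects, which is the situation the fusion stage has to resolve.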

Decision fusion using weighted majority voting method
Data fusion is a process of combining information collected from a multisensory system to form a final decision [41].The use of multisensory data has been widely applied in many fields including medical applications [48].In general, the data collected from a multisensory system are incomplete or overlapping which may cause improper decision making.Therefore, data fusion is an essential step that improves the overall performance of a multisensory system [49][50][51].
In this work, we utilize the reconstructed ECG signals as information collected from multiple detectors.To reach a common final decision, we aim to combine the multiple decisions that are obtained from the wavelet-based filtering channels [41].
As explained in the previous stage, $C_p$ is a vector that has a maximum of $n_p$ identities which are selected using $n_p$ minimum Fréchet distances (see figure 10).
Each of these filtering channels picks one identity and it also returns the corresponding minimum Fréchet distance.
Here we propose a scoring technique to reach a final decision using the weighted majority voting method (WMVM) [40,41]. Our technique is based on computing a weighted score for each person picked by the classification process. For each person in $C_p$ (e.g., person $n$), the scoring weight is calculated using equation (9), where $v_{p,n}$ is the vote given for person $n$ using filter $p$, as defined in equation (10). Therefore, the identity with the smallest distance will be picked as a final decision.
However, if more than one filtering channel picks the same identity, as seen in figure 11, a higher weighted score is calculated for that person according to equations (9) and (10). In data fusion, assigning a measurement-based higher score to decisions made by a multisensory system depends on the statistical parameters of the corresponding measurement [41]. To illustrate, figure 12 shows an example of the range of minimum Fréchet distances obtained after randomly choosing multiple testing data and applying equations (4)-(6) and (8). For any of the filtering channels, if the range of the minimum distance is very low, it will result in higher weighted scores, which makes the WMVM strongly biased toward one filter. In contrast, if the range of the minimum distance is very high, it will result in lower weighted scores, which makes the outcome of the corresponding filtering channel less effective in reaching a final decision. According to figure 12, the ranges of the minimum distances obtained from the individual filters are very close to each other, which makes our scoring method sufficient to reach a true final decision.
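A distance-weighted vote of this kind can be sketched as follows. The weight used here, the reciprocal of each channel's minimum distance, is an assumption chosen because it reproduces the two behaviors described in the text (a lone vote with the smallest distance wins, and agreement between channels raises the score); it is not necessarily the form of equations (9) and (10). The function name and the sample values are hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(classes, min_distances, eps=1e-12):
    """Fuse per-channel decisions into one identity.

    classes[p]        identity picked by filtering channel p
    min_distances[p]  minimum distance reported by channel p
    Each channel casts one vote for its identity, weighted by 1 / distance
    (assumed weighting; eps guards against division by zero).
    """
    scores = defaultdict(float)
    for identity, distance in zip(classes, min_distances):
        scores[identity] += 1.0 / (distance + eps)
    final = max(scores, key=scores.get)  # highest weighted score wins
    return final, dict(scores)


# Hypothetical outputs of five filtering channels.
C = (1, 2, 1, 3, 1)
D = (0.4, 0.3, 0.2, 0.5, 0.3)

final, scores = weighted_majority_vote(C, D)
print(final)   # 1
print(scores)
```

Because the weights depend directly on the distance scale of each channel, a channel whose distances are systematically much smaller than the others would dominate the vote, which matches the bias concern raised above.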

Identification
In this stage, the identification step is the process of picking one identity from the multiple identities obtained by the classification process. As explained above, the WMVM assigns a weighted score to each person selected by the classification process. To reach a final decision, the identity with the highest weight is chosen using equation (11), where $c_f$ is the final class (subject).
In short, features of multiple ECGs which belong to the same class might differ whenever they are recorded [19]. Utilizing the complete information of multiple ECGs (e.g., all the wavelet components) might not yield the minimum distance required to reach the right class. The variability in the ECG features might be due to the variability in the individual bases of the ECG wavelet components [18]. Therefore, we developed our multichannel Fréchet based scoring method to achieve the maximum possible similarity between multiple ECGs by filtering some of the wavelet components in parallel [39]. Finally, we combine the outcome of this multi-level filtering technique to reach a final class using equations (9), (10), and (11).

Experiment setups
We applied our proposed method using multiple ECGs of 62 subjects from the ECG ID database. The experimental setup is designed according to the following steps:
1. For each subject ($m$), starting from subject 1 to subject 62 ($n_{sb}$):
1.1 A random test signal $X^t_m$ is selected from the day 2 ECG.
1.2 Next, the five filtered test ECGs are created using equations (1)-(3) and they are labeled as
1.5 After that, the Fréchet distance between each test feature in step 1.3 and its corresponding reference features in step 1.4.3 is calculated using equation (4). In addition, the results are stored in the following distance vectors
1.6 Then, using equation (7), the classification process is performed based on the minimum Fréchet distance in each of the five distance vectors obtained from step 1.5, and the results are stored in the $C_p \in \mathbb{R}^{n_p}$ vector.
1.7 Also, the values of the corresponding five minimum distances are stored in the $D_p \in \mathbb{R}^{n_p}$ vector.
2. Next, equations (9) and (10) are used to find a weighted score for each identity that is picked in step 1.6.
3. After that, the final level classification is performed using equation (11).
5. Finally, we obtain the $ID \in \mathbb{R}^{n_{sb} \times n_{sb}}$ matrix, which contains the final classification results of all the experiments (figure 13). The rows of the ID matrix represent the actual identities (the true subjects), and the columns represent the rate of predicted identities to the total number of experiments per row (the identities that are obtained by the classification process).
Figure 13. The output of the classification process using WMVM.
In addition, for each subject (row), the elements of the ID matrix are defined as:
(i) $TP_m$, which is the diagonal element ($id_{m,m}$) representing the total number of times the $m$th subject is truly identified using WMVM (when the test data was selected from the $m$th subject).
(ii) $FN_m$, which represents the total number of times the $m$th subject is falsely not identified using WMVM when the test data was selected from the $m$th subject (the corresponding row elements except the diagonal element). $FN_m$ is calculated using equation (12):
$$FN_m = \sum_{n=1,\, n \neq m}^{n_{sb}} id_{m,n} \tag{12}$$
(iii) $FP_m$, which represents the total number of times the $m$th subject is falsely identified using WMVM when the test data was from different subjects (the corresponding column elements except the diagonal element). $FP_m$ is calculated using equation (13):
$$FP_m = \sum_{n=1,\, n \neq m}^{n_{sb}} id_{n,m} \tag{13}$$
(iv) $TN_m$, which represents the total number of times the $m$th subject is truly not identified using WMVM when the test data was selected from different subjects (the elements outside row $m$ and column $m$). $TN_m$ is calculated using equation (14):
$$TN_m = \sum_{i=1,\, i \neq m}^{n_{sb}} \; \sum_{j=1,\, j \neq m}^{n_{sb}} id_{i,j} \tag{14}$$
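The four per-subject quantities can be read directly off a confusion matrix, as the sketch below shows for a small invented example. It assumes the matrix holds raw counts (rows are true subjects, columns are predicted subjects) rather than rates, and the precision, recall, FRR and FAR formulas are the commonly used one-vs-rest definitions, stated here as an assumption about how the reported figures are derived. Subjects are indexed from 0 and all names are illustrative.

```python
def subject_counts(ID, m):
    """Return (TP, FN, FP, TN) for subject m from a square count matrix."""
    n = len(ID)
    tp = ID[m][m]
    fn = sum(ID[m][j] for j in range(n) if j != m)      # row m, off-diagonal
    fp = sum(ID[i][m] for i in range(n) if i != m)      # column m, off-diagonal
    tn = sum(ID[i][j] for i in range(n) for j in range(n)
             if i != m and j != m)                      # outside row/column m
    return tp, fn, fp, tn


def subject_rates(tp, fn, fp, tn):
    """Common one-vs-rest rates (assumed definitions)."""
    recall = tp / (tp + fn)
    return {
        "precision": tp / (tp + fp),
        "recall": recall,
        "FRR": fn / (tp + fn),   # false rejection rate = 1 - recall
        "FAR": fp / (fp + tn),   # false acceptance rate
    }


# Hypothetical results: 3 subjects, 10 random experiments per subject.
ID = [
    [9, 1, 0],
    [0, 10, 0],
    [2, 0, 8],
]

counts = subject_counts(ID, 0)
rates = subject_rates(*counts)
print(counts)  # (9, 1, 2, 18)
print(rates)
```

With these definitions FRR is always the complement of recall, which is consistent with the pair of values 0.9229 and 0.0771 reported in the abstract.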

Identification results based on using the wavelet filters individually
We examined the performance of our method in terms of the personal identification accuracy using equation (15), where $acc_{p,m}$ is the personal identification accuracy of the $m$th subject using the $p$th filter, $exp$ is the experiment number with a total of $n_{exp}$ random experiments, and $TP_{m,p}$ is the number of times that the $m$th subject is correctly identified using the $p$th filter.
In addition, we evaluated the average identification accuracy of each filter using equation ( 16): where acc p is the average identification accuracy of the pth filter.
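The typeset forms of equations (15) and (16) are not preserved in this text. One reading that is consistent with the verbal definitions is sketched below; it is an assumed reconstruction, not the authors' notation. The indicator $c_{m,p}^{(e)}$ and the index $e$ are introduced here for illustration only, and $n_{sb}$ is taken to be the number of subjects.

```latex
% Assumed reconstruction; c_{m,p}^{(e)} = 1 if subject m is correctly
% identified by filter p in experiment e, and 0 otherwise, so that
% TP_{m,p} = \sum_{e} c_{m,p}^{(e)}.
\mathrm{acc}_{p,m} = \frac{100}{n_{exp}} \sum_{e=1}^{n_{exp}} c_{m,p}^{(e)}
                   = \frac{100 \, TP_{m,p}}{n_{exp}},
\qquad
\mathrm{acc}_{p} = \frac{1}{n_{sb}} \sum_{m=1}^{n_{sb}} \mathrm{acc}_{p,m}
```

Under this reading, a subject who is correctly identified in 46 of 50 random experiments by a given filter would have a personal identification accuracy of 92% for that filter.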
The personal identification accuracy using our filtering system has shown good results. Figure 14 shows full details of the personal identification accuracy at each filter. In addition, table 2 shows that most of the subjects are identified with an identification accuracy ranging from 91% to 100%, with a best result of 42 subjects. Furthermore, figure 15 shows the average identification accuracy using each filter individually, with the best result of 85.55% obtained for filter $p_2$.
These results indicate that all the ECG wavelet components are informative; however, a single filtering channel on its own does not adequately identify the majority of the subjects. Therefore, fusing the information obtained from all of these filters is necessary to achieve the best possible performance.

Identification results based on data fusion using the WMVM
We evaluated the performance of our proposed method after we performed data fusion using the WMVM. The personal identification accuracy is calculated using equation (17), where $acc_{wm,m}$ is the personal identification accuracy of the mth subject.
In addition, we evaluated the average identification accuracy of our proposed method using equation (18). The personal identification accuracy increased significantly, as shown in table 3. After we combined the information using the WMVM, 53 subjects were identified with an identification accuracy ranging from 91% to 100%. In addition, figure 16 shows the full details of the personal identification accuracy using the WMVM.
According to table 4, we excluded six subjects who have less than 80% identification accuracy (subjects with red and yellow bars in figure 16). These subjects were excluded because none of the filtering channels were able to identify them, indicating that their ECG features from multiple days had significant variability. As a result, our proposed method for ECG based human identification, which is based on filtering some of the wavelet components and applying the WMVM for data fusion, achieved 98.07% identification accuracy. These findings indicate that applying the WMVM is useful, since it combines the information obtained from multiple filtering channels to reach the correct final class.
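The fusion step can be pictured as a generic weighted majority vote over the filtering channels. The sketch below is only an illustration of that general idea: the WMVM weighting scheme is defined in the methods part of the paper, which is not reproduced in this section, so the weights, the subject labels and the function name `weighted_majority_vote` are all hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(channel_votes):
    """Return the label with the largest total weight.

    channel_votes is a list of (predicted_label, weight) pairs, one per
    filtering channel. The weights are placeholders; the paper's own
    weighting is not reproduced here.
    """
    totals = defaultdict(float)
    for label, weight in channel_votes:
        totals[label] += weight
    # Pick the label whose accumulated weight is highest.
    return max(totals.items(), key=lambda kv: kv[1])[0]


# Hypothetical votes from five filtering channels.
votes = [("s07", 0.9), ("s12", 0.4), ("s07", 0.3), ("s12", 0.5), ("s31", 0.2)]
winner = weighted_majority_vote(votes)
```

In this toy example two channels agree on `s07` with a combined weight of 1.2, which outweighs the two channels voting for `s12`, so a subject can still be identified when some channels disagree.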

Performance evaluation of the proposed method
We evaluated the general performance of our proposed method using the following parameters:
Precision: the rate of truly identifying subjects to the total number of identifications.
Recall/True Positive Rate (TPR): the rate of truly identifying subjects to the total number of identification attempts.
False Rejection Rate (FRR): the rate of falsely not identifying subjects to the total number of identification attempts.
False Acceptance Rate (FAR)/False Positive Rate (FPR): the rate of falsely identifying subjects to the total number of rejection attempts.
Table 5 shows the precision, recall, FAR and FRR parameters computed by performing classification at each filter. The best single-filter results are achieved by the $p_2$ filter, with 0.8631 precision, 0.8555 recall, 0.1445 FRR, and 0.0024 FAR. After performing data fusion using the WMVM, precision and recall increased and FRR and FAR decreased: we achieved 0.9495 precision, 0.9229 recall, 0.0771 FRR and 0.0013 FAR. Additionally, figure 17 shows the performance comparison using each filter individually and after performing data fusion using WMVM.
Moreover, we evaluated the performance of the proposed method using the receiver operating characteristic (ROC) curve, which shows the tradeoff between the true positive rate and the false positive rate. Figure 18 shows the ROC curves of the proposed method; the curve closest to the top left corner is the one obtained after data fusion. In addition, the highest area under the ROC curve (ROC AUC), equal to 0.9608, was obtained by applying the WMVM.
Furthermore, we analyzed the performance using the precision recall (PR) curve, as shown in figure 19. The best results were again obtained after performing data fusion: according to figure 19, the curve closest to the top right corner, with a PR AUC equal to 0.9362, is achieved using the WMVM.
As previously mentioned, we repeated the process of randomly selecting the test data 50 times for each subject. Figure 20 shows the cross-validation results in terms of the average identification accuracy of each experiment. These results further show that filtering some of the wavelet components and performing data fusion using the proposed voting method can be utilized to identify subjects from multiple ECGs.
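The four parameters above follow the standard definitions in terms of per-subject TP, FN, FP and TN counts. The sketch below is a minimal illustration under that assumption; the function name `identification_rates` and the example counts are hypothetical, and the paper's own equations for these quantities are not reproduced in this text.

```python
def identification_rates(tp, fn, fp, tn):
    """Compute precision, recall, FRR and FAR from one subject's counts.

    Standard definitions are assumed:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      FRR       = FN / (TP + FN)   (equals 1 - recall)
      FAR       = FP / (FP + TN)
    A rate with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "frr": ratio(fn, tp + fn),
        "far": ratio(fp, fp + tn),
    }


# Hypothetical counts for one subject.
rates = identification_rates(tp=48, fn=2, fp=4, tn=96)
```

Because FRR and recall share the same denominator, they always sum to one, which matches the reported pairs (0.8555 and 0.1445 for filter p 2, 0.9229 and 0.0771 after fusion).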
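An area under a curve such as the ROC AUC is commonly estimated from a finite set of operating points with the trapezoidal rule. The sketch below shows that generic calculation only; how the operating points were obtained in the paper is not stated in this section, and the function name `trapezoid_auc` and the example points are hypothetical.

```python
def trapezoid_auc(x, y):
    """Area under a piecewise-linear curve through the points (x[i], y[i]).

    The points are sorted by x first, so they may be passed in any order.
    """
    points = sorted(zip(x, y))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Area of the trapezoid between two neighbouring points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC operating points: false positive rate vs. true positive rate.
fpr = [0.0, 0.1, 0.5, 1.0]
tpr = [0.0, 0.8, 0.9, 1.0]
auc = trapezoid_auc(fpr, tpr)
```

A classifier no better than chance follows the diagonal and gives an area of 0.5, so values close to 1 indicate a curve that stays near the top left corner.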

Discussion
In general, the use of ECG for human identification is a challenging task that depends mainly on choosing the appropriate features and classifiers [1,2,5,22].
Different studies have presented different ECG features which can be utilized for biometric purposes [3]. In our previous work, we presented a study on the most appropriate features and classifiers [38]. However, the variability of ECG features remains a major challenge for utilizing the cardiac signal as a biometric modality in real applications [23]. Previous studies have achieved excellent identification results; however, the variability of ECG features was not discussed [4,22,38,52]. Therefore, we proposed a methodology to investigate the feasibility of human identification using multiple ECGs that are recorded on different days.
Table 6 shows a performance comparison with state-of-the-art methods and summarizes the main algorithms that are used in these previous approaches. The performance of our method compares favorably with some of the approaches in the literature [26,29,32,33], as shown in table 6. In addition, our performance slightly exceeded some of the recent methods in ECG biometrics which are based on time frequency analysis of the cardiac signal [28,30,34]. Although method [27] reported 100% accuracy, the authors used ECG data of only 21 subjects, which limits the conclusions that can be drawn from that result. The method of Ciocoiu et al. [31], which is based on converting the ECG heartbeat segments into images and utilizing a CNN for classification, slightly exceeded our performance. However, the performance of method [31] under the variability of ECG features among multiple records was not reported. In comparison, this paper presents a contribution that is based on MODWT to address the variability of ECG features. In addition, we present our WMVM, which is used to combine the multiple decisions obtained from our multi-channel filtering system into a single common decision for identification purposes. Finally, our proposed method has shown experimental results of up to 98.07% identification accuracy, with 53 subjects having a personal identification accuracy ranging from 91% to 100%.
In addition, to make our proposed method clinically applicable, the ECG can be utilized in multibiometric identification systems. The combination of the intrinsic characteristic of the ECG with the extrinsic characteristics of some of the existing biometrics such as voice and iris recognition can increase patient security in clinics.
Furthermore, clinics can benefit from deploying the ECG biometrics in telemedicine to update the personal records periodically for identification purposes. Generally, ECG records should be updated according to the personal health status and age to be utilized for human identification [20].
However, there are some limitations to our methodology. Specifically, optimizing the number of filters, ECG data clustering and personal ECG selection for enrolment purposes need to be addressed in future work. According to figure 13, the number of MODWT filters that are required to correctly identify individuals is subject dependent, which may limit the applicability of our method due to the long computational time. However, optimization techniques might address this problem by reducing the number of required filters. Also, implementing our method requires clustering the ECG data into different groups to further reduce the screening time, since our method depends on finding the minimum Fréchet distance between one random testing data and all the referencing data. In our future work, we will address this problem by utilizing the MODWT scaling coefficients to cluster the ECG referencing data.
Moreover, our method was evaluated based on only two ECG records per subject. Therefore, the applicability of our method should be investigated on a larger number of ECG records. Accordingly, future work should focus on selecting the most appropriate personal ECG records, which may require applying a similarity measurement algorithm at the enrolment stage of the biometric system.
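The screening cost mentioned above comes from comparing one test curve against every reference curve. The sketch below illustrates such an exhaustive search using the discrete Fréchet distance computed by dynamic programming. It is a generic illustration under stated assumptions: the curves are taken to be lists of (x, y) points, and the function names `discrete_frechet` and `closest_reference`, the subject labels and the toy curves are hypothetical. The exact curves and Fréchet variant used in the paper are not specified in this section.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two curves given as lists of (x, y) points."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]  # ca[i][j]: coupling distance up to p[i], q[j]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def closest_reference(test_curve, references):
    """Return (label, distance) of the reference curve nearest to test_curve.

    references maps a subject label to a reference curve. Every reference
    is visited once, so the cost grows linearly with the number of
    enrolled references.
    """
    scored = ((label, discrete_frechet(test_curve, ref)) for label, ref in references.items())
    return min(scored, key=lambda item: item[1])


# Hypothetical toy curves.
test_curve = [(0, 0), (1, 0), (2, 0)]
references = {
    "subject_A": [(0, 1), (1, 1), (2, 1)],
    "subject_B": [(0, 3), (1, 3), (2, 3)],
}
label, distance = closest_reference(test_curve, references)
```

Each distance evaluation costs on the order of the product of the two curve lengths, which is why restricting the search to a cluster of candidate references would shorten the screening time.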

Conclusion
One of the main challenges in utilizing ECG for human identification is to address the variability of ECG features across multiple records [5,22]. To solve this problem, we proposed a methodology for human identification using multiple ECGs by applying data filtering and data fusion techniques. To model the changeability of ECG features over multiple records, we utilized the MODWT to create a multi-channel filtering system that partitions the variability of ECG features according to its wavelet components and then removes different wavelet components at different levels of the filtering system [39]. The proposed filtering system is utilized to identify subjects with reduced amounts of the signal information by filtering out the wavelet components that may change significantly across multiple ECG records [40]. In addition, we proposed the WMVM technique, which is utilized to combine information obtained from multiple filtering channels [41]. The WMVM is a scoring technique based on the minimum Fréchet distances and is utilized to obtain a common final decision for reaching correct identification. The experimental results have shown that our proposed method achieved an identification accuracy ranging from 92.29% to 98.07%. In addition, we achieved 0.9495 precision, 0.9229 recall, 0.0771 FRR and 0.0013 FAR. In conclusion, ECG based human identification using multiple ECGs is feasible. However, it requires methods that are adaptable to the variability of ECG features, because this variability may adversely influence the performance of biometric applications.

Figure 1. The variability of morphological features using ECGs recorded on different days of two subjects.

Figure 2. The variability of spectral features using ECGs recorded on two different days of one subject.

Figure 3. Normal heart beats of five different subjects from the ECG ID database.

Figure 4. The flowchart of the proposed methodology.

Figure 6. The ten level wavelet coefficients of the ECG using MODWT.
main goal here is to remove some of the components which may have high variability between ECGs that are recorded at different times. To illustrate, we applied a windowing technique based on the short time Fourier transform (STFT) to see how the variance of the detail coefficients at each scale changes over time. Each window contains an ECG time segment that holds approximately one full heartbeat. According to figure 7, the variance of the detail coefficients at scales 1-4 (the high frequency wavelet components of the ECG signal) shows significant change across multiple heartbeats compared to the variance of the detail coefficients at scales 5-10 (the low frequency wavelet components of the ECG signal). However, the change in the variance over time of the detail coefficients at scales 1-4 is subject dependent. Therefore, filtering different high frequency wavelet components of the ECG signal in a set of parallel processes helps to reduce the variability of ECG features.

Figure 7. The variance in the frequency components of the ECG wavelet components over multiple records from one subject.

Figure 8. The block diagram of the parallel high frequency filtering system (PHFFS).

Figure 9. Multiple filtered versions of the ECG using the PHFFS.

Figure 10. The outcome of the classification process.

Figure 11. Three examples of the identities picked at each filtering channel based on the minimum Fréchet distance.

Figure 12. The range of minimum Fréchet distance using multiple testing data of one subject.

Figure 14. The personal identification accuracy using each filter individually.

Figure 15. The average identification accuracy using each filter individually.

Figure 16. The personal identification accuracy using the WMVM.

Figure 17. Performance evaluation of the proposed method.

Figure 18. The ROCs of the proposed method.

Figure 19. The PR curves of the proposed method.

Figure 20. The cross validation results.
[28] Euclidean distance between the test data and the mean of 100 training data was determined for ECG classification. As a result, the authors in [27] reported 100% identification accuracy using ECG data of 21 subjects. Lee et al. [28] proposed an algorithm based on a time frequency representation of the ECG data. Both the robust principal components analysis network (RPCANet) and used the Daubechies wavelet (Db3) coefficients at five level decomposition for ECG feature extraction. In the classification stage,

table 1, the PHFFS constructs five types of filtered ECGs, which are defined as $X_1$, $X_2$, $X_3$, $X_4$ and $X_5$. Each of these signals is independently analyzed for the

Table 2. Total subjects identified per accuracy range using each filter.

Table 3. Total subjects identified per accuracy range using the WMVM.

Table 4. Average identification accuracy of the proposed method.

Table 5. Performance evaluation of the proposed method.

Table 6. Summary of the previous state-of-the-art and the proposed methodology on the ECG based human identification.