Machine Learning Clustering Techniques for Selective Mitigation of Critical Design Features
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Machine Learning Clustering Techniques for Selective Mitigation of Critical Design Features Thomas Lange∗† , Aneesh Balakrishnan∗‡ , Maximilien Glorieux∗ , Dan Alexandrescu∗ , Luca Sterpone† ∗ iRoC Technologies, Grenoble, France † Dipartimento di Informatica e Automatica, Politecnico di Torino, Torino, Italy ‡ Department of Computer Systems, Tallinn University of Technology, Tallinn, Estonia {thomas.lange, aneesh.balakrishnan, maximilien.glorieux, dan.alexandrescu}@iroctech.com luca.sterpone@polito.it arXiv:2008.13664v1 [cs.AR] 31 Aug 2020 Abstract—Selective mitigation or selective hardening is an ef- number of workloads to analyse and the duration in cycles of fective technique to obtain a good trade-off between the improve- each workload. A detailed functional failure analysis requires ments in the overall reliability of a circuit and the hardware over- a significant investment in terms of human efforts, processing head induced by the hardening techniques. Selective mitigation relies on preferentially protecting circuit instances according to resources and tool licenses. Studies have shown that exhaustive their susceptibility and criticality. However, ranking circuit parts fault simulation is not feasible for today’s complex circuits [4]. in terms of vulnerability usually requires computationally inten- sive fault-injection simulation campaigns. This paper presents A. Objective of This Work a new methodology which uses machine learning clustering Identifying and ranking the sequential elements which are techniques to group flip-flops with similar expected contributions to the overall functional failure rate, based on the analysis of a most vulnerable to transient faults, usually requires compu- compact set of features combining attributes from static elements tationally intensive fault-injection simulation campaigns. This and dynamic elements. Fault simulation campaigns can then be procedure can be optimized by grouping flip-flops together executed on a per-group basis, significantly reducing the time which are expected to have a similar sensitivity to faults. Fault and cost of the evaluation. The effectiveness of grouping similar injection campaigns can then be performed on a per-group sensitive flip-flops by machine learning clustering algorithms is evaluated on a practical example.Different clustering algorithms basis and thus, significantly reduce the time and cost of the are applied and the results are compared to an ideal selective evaluation [5]. However, this optimization heavily relies on mitigation obtained by exhaustive fault-injection simulation. the effectiveness of the grouping methodology. Therefore, we Index Terms—Transient Faults, Single-Event Upsets, Selective propose a new approach to effectively group flip-flops together Mitigation, Selective Hardening, Soft Error Protection which are expected to have a similar sensitivity to functional failures. The approach is based on machine learning clustering I. I NTRODUCTION techniques which uses a set of features to characterizes each The advancement of the process technologies in the last flip-flop in the circuit. The feature set combines attributes years made it possible to manufacture chips with tens of from static and dynamic elements. Machine learning clustering millions of flip-flops. At the same time due to the technology algorithms are evaluated and compared to an ideal selective scaling, lower supply voltages and higher operating frequen- mitigation obtained by exhaustive fault-injection simulation. cies, circuits became more vulnerable to reliability threats, such as transient faults. Especially, the Soft Errors in flip-flops B. Organisation of the Paper are a major concern and countermeasures have to be taken into In the following sections, first, the use of clustering tech- consideration by using hardening techniques, such as Triple niques for selective mitigation is summarized. Further, the Modular Redundancy (TMR). However, a fully protected chip general principle of machine learning clustering techniques are might not meet the system requirements in terms of area, explained. Section III presents the proposed methodology to power or target frequency. Since for many applications it group flip-flops together based on machine learning clustering is not necessary to decrease the vulnerability to a possible and the used feature set. The approach is evaluated on a minimum, Selective Mitigation can be used. Thereby, only the practical example in Section IV for different machine learning most critical elements of the circuit are protected against Soft algorithms. Section V concludes the paper and gives some Errors and thus, the failure rate of the system is decreased to prospects for future work. meet all requirements [1]–[3]. In order to perform Selective Mitigation an exhaustive II. C LUSTERING T ECHNIQUES FOR S ELECTIVE failure analysis is required to identify and rank the most M ITIGATION vulnerable elements of the circuit. Especially, the failure The approach of protecting only the smallest set of elements analysis on a functional level grows with the design size, the in a circuit to meet a specified reliability target is called selective mitigation. Therefore, the individual circuit elements This work was supported by the RESCUE project which has received funding from the European Union’s Horizon 2020 research and innovation of a circuit need to be ranked from the most to the least programme under the Marie Sklodowska-Curie grant agreement No. 722325. sensitive. In the case of transient faults in the sequential
logic which lead to a functional failure this usually requires III. P ROPOSED M ACHINE L EARNING C LUSTERING exhaustive fault injection simulation, which might not be A PPROACH feasible for large and complex circuits. The proposed methodology is based on clustering tech- In order to reduce the mentioned fault injection efforts, niques for selective mitigation. In contrary to previous work fault simulation campaigns can be performed on a group the clustering approach uses machine learning clustering al- basis. Therefore, prior any simulations, flip-flops are grouped gorithms and a feature set to characterize each flip-flop in the together and the statistical fault injection is performed on each circuit. In this way no assumptions are made about the design of these groups. This can significantly reduce the time and and a general methodology is provided. cost of the evaluation. However, the accuracy of this coarse- grained fault injection solely relies on an effective approach to group flip-flops together which have highly similar sensitivity Circuit Representation Testbench to faults. (RTL Design) The previous studied techniques for clustering are based on buses, design hierarchy, or a hybrid approach using buses, hi- erarchy and signal naming information [5]. These approaches have several drawbacks. The bus based and hierarchical based approach are only able to provide a fixed number of clusters. Thereby, the hierarchical based approach often provides a low Extract Features per Flip-Flop number of clusters which can be very heterogeneous. The bus based approach often results in a high number of clusters with small number of flip-flops per cluster and one larger cluster containing all the flip-flops which do not belong to any Flip-Flop bus. Thus, the reduction of the number of fault injections is Feature Set limited and the large cluster tends to be heterogeneous which negatively impacts the effectiveness of the clustering. The hybrid approach overcomes the problem of the fixed number of Apply clusters by combining the bus and hierarchy based approaches ML Clustering Nc and also taking the signal names into account. The approach Algorithm assumes that flip-flops with a similar naming also have a similar function and thus, have a similar sensitivity to faults. However, this relies on a consistent naming convention and Flip-Flop strong correlation between the naming and the function within Groups the circuit. A. Machine Learning Clustering Techniques Fault Injection Campaign on Group Basis In order to tackle the drawbacks of the previous stud- ied clustering approaches this paper investigates if machine learning clustering techniques are suitable to group the flip- flops. Clustering techniques in the machine learning domain Functional Failure Rate belong to the unsupervised learning category. In general these per Group algorithm try to group similar objects together based on a given set of features. The feature set characterizes the objects and the clustering algorithms group objects together which Rank and have similar feature values, while objects from a different Select Groups group should have highly dissimilar feature values [6]. to Mitigate Fig. 1. Selective Mitigation by Using Machine Learning Clustering In this paper, the flip-flops are characterized by a set of features which will be described in the next section. Machine The steps to perform a selective mitigation are illustrated in learning clustering algorithms use this feature set to group flip- Fig. 1. First, the features for each flip-flop in the design are flops together. Afterwards, the effectiveness of the grouping extracted by using the RTL description and a corresponding will be evaluated on a practical example and it will be deter- testbench. Second, the machine learning clustering algorithm mined if flip-flops with similar fault sensitivity are grouped is applied to the obtained feature set and the flip-flop groups together. are obtained. The resulting number of groups Nc can be
TABLE I F EATURE S ET TO C HARACTERISE A F LIP -F LOP I NSTANCE FFi Feature Name Description Structural Related Features # FF at Startpoint/Endpoint The number of flip-flops directly connected to/by FFi . # Connections from/to FF The number of flip-flops connected to/by FFi within the circuit. # Connections from/to Primary Input/Output The number of primary inputs/outputs connected to/by FFi . # FF Stages to/from Primary Input/Output (max/avg/min) Number of flip-flop stages from/to the primary input/output of the circuit. Feedback Depth The depth (in terms of flip-flops stages) of the shortest feedback loop. Bus Position Describes the position of FFi within the bus. Bus Length Describes the total length of the bus signal FFi is part of. Bus Label The number/label of the bus signal FFi is part of. Module Label The number/label of the hierarchical module FFi is part of. Signal Activity Related Features @0/@1 The relative time FFi output is at logical 0/1. State Changes The number of state changes. adjusted by the parameters of the machine learning clustering behaviour of the flip-flops. Therefore, the information related algorithm. The number of groups also dictates the effort to the signal activity is considered, such as the state distribu- needed for the next step: the statistical fault injection. The fault tion and transitions. injection campaign is performed on the computed flip-flop IV. E VALUATING M ACHINE L EARNING C LUSTERING FOR cluster and thus, needs more efforts (in terms of computing S ELECTIVE M ITIGATION resources, human efforts and tool licensees) with a higher number of cluster and vice versa. Eventually, the sensitivity In this section the machine learning clustering approach is to faults for each cluster is obtained and they can be ranked evaluated on a practical example. Therefore, different clus- from the most sensitive to the least sensitive. The selective tering algorithms are used to group the flip-flops based on mitigation will be applied starting from the most sensitive the presented feature set. The effectiveness of the clustering cluster until the reliability requirement is met. algorithms are measured by evaluating the created flip-flop cluster. The goal is to create flip-flop groups which have a A. The Feature Set similar vulnerability to critical failures. Therefore, first, an Previous works has shown that the masking effects and the exhaustive full flat statistical fault injection campaign was vulnerability of a flip-flop can be related to certain charac- performed to obtain the sensitivity to critical failures for teristics of the circuit, such as circuit structure and signal each flip-flop and to provide an independent measure of probability [7]–[9]. Motivated by this idea a feature set has the sensitivity. Afterwards, this data is used to evaluate the been developed in [10] which is adapted for the approach different clustering algorithms against an ideal and random presented in this paper. The feature set characterizes each flip- approach. flop instance in the circuit and combines attributes from static elements, such as the circuit structure, as well as dynamic A. Circuit Under Test elements, such as the signal activity. For the practical example, the Ethernet 10GE MAC Core The original approach was based on the gate-level netlist from OpenCores is used. This circuit implements the Me- of a design and contained features which corresponds to dia Access Control (MAC) functions as defined in the synthesis attributes. In order to obtain a more general approach IEEE 802.3ae standard. The 10GE MAC core has a 10 Gbps which can already be applied in an early design stage, the interface (XGMII TX/RX) to connect it to different types of methodology presented in this paper uses only features which Ethernet PHYs and one packet interface to transmit and receive can be derived from the RTL description of the design (e.g. packets to/from the user logic [11]. The circuit consists of by performing a fast logic elaboration). Further, two additional control logic, state machines, FIFOs and memory interfaces. features are extracted to reflect the bus and hierarchical based It is implemented at the Register-Transfer Level (RTL) and is clustering approach described in the previous section. In total publicly available on OpenCores. The RTL description of the the feature set consists of 20 features. design consists of 1054 flip-flops. The features extracted for each flip-flop F Fi are described The corresponding testbench writes several packets to the in detail in Tab. I. They are divided into two parts, the 10GE MAC transmit packet interface. As packet frames be- structural and the signal activity related features. The structural come available in the transmit FIFO, the MAC calculates a related features describe a flip-flop in relation with other flip- CRC and sends them out to the XGMII transmitter. The XG- flops in the circuit without taking the (technology dependent) MII TX interface is looped-back to the XGMII RX interface combinatorial logic into account. To consider the workload of in the testbench. The frames are thus processed by the MAC the circuit, features are extracted which describe the dynamic receive engine and stored in the receive FIFO. Eventually, the
testbench reads frames from the packet receive interface and It is assumed that after mitigation, the sensitivity to Single- prints out the results [11]. Event Upsets is zero1 . The flip-flops to protect were selected starting with the most sensitive clusters first. The results B. Full Flat Statistical Fault Injection Results are compared against an ideal and a random approach. In In order to evaluate the effectiveness of the clustering the ideal approach the most sensitive flip-flops are selected algorithms the sensitivity of the considered design was mea- based on the exhaustive fault injection campaign. The random sured by performing a flat statistical fault injection campaign. approach selects flip-flops to protect randomly (averaged over The simulations are performed at the RT-Level using the 100 independent runs). corresponding testbench which allows a functional verification. The clustering algorithms were implemented by using Thus, it is possible to evaluate the system-level impact of Python’s scikit-learn Machine Learning framework [12] and errors. The fault injection mechanism consisted of inverting applied to the extracted flip-flop feature set. This process the value stored in a flip-flop by using a simulator functions. took only several seconds and is negligible. Each considered In networking applications, such as the considered design, clustering algorithm has different parameters which can be important data is protected by checksums. This means that adjusted. They affect the performance of the clustering and a minor payload corruption can be handled by the error also the resulting number of clusters. For some of the al- correction algorithm. However, in case the fault causes the gorithms the number of clusters can be specified directly. circuit to stop working and interrupting the flow of sending Other algorithms try to find the optimal number clusters within packages or data is continuously corrupted, then the effect can the constrained parameters. In the following evaluation the be considered as critical. Especially, these failures are highly parameters were chosen in a way that the number of resulting problematic and should be mitigated by selective mitigation. clusters are about 20 %, 10 % and 5 % of the number of flip- In each flip-flop 200 faults are injected at a random time flops in the circuit. This would correspond to a reduction of during the active phase of the test-case. The functional failure the fault injection efforts by 5×, 10× and 20× respectively. rate of each flip-flop is calculated by dividing the number of 1) K-Means Clustering: K-Means clustering aims to par- simulation runs which lead to a functional failure with the tition the given data points into k clusters. The algorithm number of total simulation runs. The overall results of the flat computes the euclidean distance for each data point in the statistical fault-injection campaign are presented in Table II. feature space. The data samples are then separated in such a The average critical failure rate is 5.13 % and the most critical way the variance within each cluster is equal. The algorithm flip-flops were identified and ranked. requires the number of clusters Nc to be specified. Fig. 2 shows the overall critical sensitivity when the group- TABLE II ing is performed by the K-Means clustering with different SEU FAULT I NJECTION C AMPAIGN R ESULTS number of clusters Nc . The flip-flops to protect were selected Total Per Injection starting with the most sensitive clusters first until all flip-flops Injection Targets (FFs) 1054 - are mitigated. It can be noted that the grouping performs better Injected Faults (SEU) 210800 - when the number of clusters is higher. Functional Failure 10814 5.13 % 2) Agglomerative Clustering: Agglomerative Clustering performs a hierarchical clustering by creating nested clusters, which are represented as a tree. The advantage of hierarchical C. Machine Learning Clustering for Selective Mitigation clustering is that any valid measure of distance can be used, The machine learning clustering algorithms are used to in comparison to K-Means clustering which performs on an group flip-flops based on the feature-set described in sec- euclidean distance metric. Agglomerative Clustering uses a tion III. The features are extracted from the RTL design bottom-up approach where each data point starts as its own of the circuit. Therefore, a fast elaboration is performed cluster. Clusters are merged together by following the linkage and the design is converted into a graph representation. The criteria. This criteria defines the metric used for the merge structural features are extracted from the graph by using graph strategy. It was noted that the best results were obtained by algorithms and the features regarding the signal activity are using the maximum or complete linkage, which minimizes the extracted by tracing the simulation. The feature extraction is maximum distance between data points of pairs of clusters and automated and overall, takes less than 5 minutes. a manhatten distance metric (l1 norm). The effectiveness of the clustering is evaluated considering In Fig 3 the results of the selective mitigation are shown they would be used to selectively mitigate against the critical when Agglomerative Clustering is used for different number failures. Therefore, the data obtained from the exhaustive of clusters Nc . Similar to the K-Means clustering the effec- statistical fault injection is used to compute the sensitivity to critical faults for each cluster. Then, the reduction of the 1 The authors are aware that this is a simplification and important aspects overall sensitivity of the circuit was calculated by varying related to physical design are not considered. However, the here presented the number of groups being protected. For the protection it approach focuses on the evaluation on the functional level by using the RTL description of the circuit. To obtain a more complete analysis, the presented is assumed that the considered flip-flops within the group approach could be combined with e.g. the classical analysis for electrical and are substituted by hardened cells, TMR or other approaches. temporal masking by using post place and route gate-level netlist.
0.05 Ideal 0.05 Ideal Random Random K-Means Nc = 53 Agglomerative Clustering Nc = 53 K-Means Nc = 105 Agglomerative Clustering Nc = 105 0.04 K-Means Nc = 211 0.04 Agglomerative Clustering Nc = 211 Overall Critcal Sensitivity Overall Critcal Sensitivity 0.03 0.03 0.02 0.02 0.01 0.01 0.00 0.00 0% 20% 40% 60% 80% 100% 0% 20% 40% 60% 80% 100% % of Mitigated Flip-Flops % of Mitigated Flip-Flops (a) From 0 % to 100 % of mitigated flip-flops (a) From 0 % to 100 % of mitigated flip-flops 0.05 0.05 0.04 0.04 Overall Critcal Sensitivity Overall Critcal Sensitivity Ideal Ideal 0.03 Random 0.03 Random K-Means Nc = 53 Agglomerative Clustering Nc = 53 K-Means Nc = 105 Agglomerative Clustering Nc = 105 K-Means Nc = 211 Agglomerative Clustering Nc = 211 0.02 0.02 0.01 0.01 0% 5% 10% 15% 20% 25% 0% 5% 10% 15% 20% 25% % of Mitigated Flip-Flops % of Mitigated Flip-Flops (b) Section from 0 % to 25 % of mitigated flip-flops (b) Section from 0 % to 25 % of mitigated flip-flops Fig. 2. K-Means Clustering Fig. 3. Agglomerative Clustering tiveness of the selective mitigation is increasing with a higher 105 and 210, respectively. Fig 4 shows the effectiveness of number of clusters. However, the improvements from 53 to the algorithm when using the obtained clusters for selective 105 and 211 clusters are very minor. When comparing the mitigation. As for the previous algorithms the effectiveness is Agglomerative Clustering to K-Means Clustering it can be increasing with a higher number of clusters (smaller window noted that Agglomerative Clustering performs generally better. sizes). It can be seen, that the results with a window size of 3) Mean Shift Clustering: Mean Shift Clustering aims to w = 0.85, which results in 210 groups, is almost as good as find dense areas of data points. The algorithm is based on the ideal solution. a sliding-window which is shifted towards regions of higher density. The density of the sliding-window is proportional to D. Comparison and Discussion the number of data points within the window it. The goal is The results of the considered clustering algorithms with to locate center points for each group in the dataset. For the different resulting number of clusters are summarized in previous clustering algorithm the number of clusters had to Tab. III. Since the number of resulting clusters was chosen to be specified manually. In Mean Shift clustering the number be about the same, the average cluster size is identical or very of clusters is determined by the algorithm. Further, K-Means similar. The standard deviation of the cluster sizes however clustering assumes spherical distribution shape of the clusters shows that Agglomerative and Mean Shift clustering tend to in the feature space. For the algorithm the window size w create cluster with more variety in the cluster size. needs to be specified. In order to quantify the effectiveness of the algorithm to A general problem when using Mean Shift Clustering is to create flip-flop groups with similar functional failure rate, two chose the correct windows size. In this analysis the window metrics were derived from the results: the average variance sizes were chosen in a way the resulting number of clusters of the functional failure rate and the maximum difference of are close to the number of clusters used for the previous the functional failure rate of flip-flops within the same cluster. algorithms. By using window sizes of w = 2.8, w = 1.7 An additional metrics was created where the average variance and w = 0.85 the number of clusters were resulting in 52, of the functional failure rate within one cluster is weighted
TABLE III C OMPARISON OF THE D IFFERENT C LUSTERING A LGORITHM Mean Mean Weighted Max Clustering Mean Standard Deviation Functional Failure Functional Failure Functional Failure Davies-Bouldin Algorithm # Cluster Cluster Size Cluster Size Variance Variance Difference Index k-Means 53 19.89 13.01 0.009 0.095 0.92 0.73 105 10.04 7.45 0.010 0.041 0.92 0.68 211 4.99 3.80 0.003 0.008 0.64 0.51 Agglomerative 53 19.89 21.39 0.015 0.094 0.92 0.78 Clustering 105 10.04 11.76 0.011 0.040 0.92 0.57 211 4.99 5.32 0.003 0.007 0.64 0.54 Mean Shift 52 20.27 28.71 0.017 0.167 1 0.60 105 10.04 14.12 0.006 0.038 0.92 0.42 210 5.02 6.64 0.002 0.005 0.47 0.39 0.05 Ideal considered number of clusters. Random In order to evaluate the performance of the clustering Mean Shift w = 2.8 Mean Shift w = 1.7 algorithm without knowledge of the actual/reference values, 0.04 Mean Shift w = 0.85 several metrics exists which quantify certain characteristics Overall Critcal Sensitivity of the created clusters (such as the separation of the data). 0.03 These metrics can be used when the presented approach is applied to a new unknown circuit and different algorithms 0.02 are evaluated or the algorithm parameters are fine tuned. For this analysis the Davies-Bouldin index was used and results 0.01 are shown in Tab. III. This index can be used to evaluate the separation between the clusters. It signifies the average similarity between clusters by comparing the distance between 0.00 0% 20% 40% 60% 80% 100% clusters with the size of the clusters themselves. An index of 0 % of Mitigated Flip-Flops is the lowest possible score and values closer to zero indicate (a) From 0 % to 100 % of mitigated flip-flops a better partition. 0.05 The Davies-Bouldin index does not fully correlate with the functional failure variance or difference metrics. However, it follows the general direction and a lower index can be ob- 0.04 served with higher number of clusters. Further, similar scores Overall Critcal Sensitivity are obtained for the k-Means and Agglomerative Clustering as Ideal also seen when comparing the functional failure metrics. The 0.03 Random Mean Shift w = 2.8 best index, the closest to zero, is also obtained for the best Mean Shift w = 1.7 Mean Shift w = 0.85 result, Mean Shift clustering with 210 clusters. 0.02 V. C ONCLUSION AND F UTURE W ORK This paper proposes a new methodology to group flip- 0.01 flops together which are expected to have a similar contri- 0% 5% 10% 15% 20% 25% butions to the overall functional failure rate. The grouping % of Mitigated Flip-Flops is based on machine learning clustering and uses a compact (b) Section from 0 % to 25 % of mitigated flip-flops set of features combining attributes from static elements and Fig. 4. Mean Shift Clustering dynamic elements. The advantage in comparison to other existing approaches is that the approach is more flexible and no assumption of the circuit or its representation is made. with the cluster size. In this way a large cluster, which has the Further, the number of clusters can be chosen by the user, same variance within the cluster as a small cluster, is penalized which determines the needed efforts for the fault injection more. campaign. The results verify what was observed from the Fig. 2, 3 and The effectiveness of the grouping by different machine 4. In general the algorithms perform better with a higher num- learning clustering algorithms were evaluated on a practical ber of clusters. Agglomerative Clustering performs slightly example and compared to an ideal solution. Good results were better than k-Means and Mean Shift clustering performs worst obtained by choosing the number of clusters with 5 % of with low number of clusters and close to ideal with the highest the total number of flip-flops in the circuit and results close
to the ideal solution were obtained with number of clusters corresponding to 10 % to 20 % of the number of flip-flops. This would mean that the fault injection efforts could be reduced by a factor of 5×, 10× or 20× respectively. Future work will focus on identifying new features to improve the effectiveness of the clustering algorithms as well as applying the techniques to a broader range of circuits. Further, the approach could be extended by using features which take physical design aspects into account and thus, reducing the efforts needed to perform a complete failure analysis on several levels. R EFERENCES [1] I. Polian, S. M. Reddy, and B. Becker, “Scalable Calculation of Logical Masking Effects for Selective Hardening Against Soft Errors,” in 2008 IEEE Computer Society Annual Symposium on VLSI, Apr. 2008, pp. 257–262. [2] M. Maniatakos and Y. Makris, “Workload-driven selective hardening of control state elements in modern microprocessors,” in 2010 28th VLSI Test Symposium (VTS), Apr. 2010, pp. 159–164. [3] M. G. Valderas, M. P. Garcia, C. Lopez, and L. Entrena, “Extensive SEU Impact Analysis of a PIC Microprocessor for Selective Hardening,” IEEE Transactions on Nuclear Science, vol. 57, no. 4, pp. 1986–1991, Aug. 2010. [4] Y. Yu, B. Bastien, and B. W. Johnson, “A state of research review on fault injection techniques and a case study,” in Annual Reliability and Maintainability Symposium, 2005. Proceedings., Jan. 2005, pp. 386–392. [5] A. Evans, M. Nicolaidis, S. Wen, and T. Asis, “Clustering techniques and statistical fault injection for selective mitigation of SEUs in flip-flops,” in International Symposium on Quality Electronic Design (ISQED), Mar. 2013, pp. 727–732. [6] E. Alpaydin and F. Bach, Introduction to Machine Learning, ser. Adaptive Computation and Machine Learning Series. MIT Press, 2014. [7] I. Wali, B. Deveautour, A. Virazel, A. Bosio, P. Girard, and M. Sonza Re- orda, “A Low-Cost Reliability vs. Cost Trade-Off Methodology to Se- lectively Harden Logic Circuits,” Journal of Electronic Testing, vol. 33, no. 1, pp. 25–36, Feb. 2017. [8] O. Ruano, J. A. Maestro, and P. Reviriego, “A Methodology for Automatic Insertion of Selective TMR in Digital Circuits Affected by SEUs,” IEEE Transactions on Nuclear Science, vol. 56, no. 4, pp. 2091– 2102, Aug. 2009. [9] P. K. Samudrala, J. Ramos, and S. Katkoori, “Selective Triple Modular Redundancy (STMR) Based Single-Event Upset (SEU) Tolerant Synthe- sis for FPGAs,” IEEE Transactions on Nuclear Science, vol. 51, no. 5, pp. 2957–2969, Oct. 2004. [10] T. Lange, A. Balakrishnan, M. Glorieux, D. Alexandrescu, and L. Ster- pone, “Machine Learning to Tackle the Challenges of Transient and Soft Errors in Complex Circuits,” in 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS), Jul. 2019, pp. 7–14. [11] Andre Tanguay, “10GE MAC Core Specification,” Jan. 2013. [12] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander- plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch- esnay, “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
You can also read