# TCAD-Enabled Machine Learning Defect Prediction to Accelerate Advanced Semiconductor Device Failure Analysis

Chea-Wei Teo<sup>1, 2</sup>, Kain Lu Low<sup>1</sup>, Vinod Narang<sup>2</sup>, and Aaron Voon-Yew Thean<sup>1, \*</sup>

<sup>1</sup>Department of Electrical and Computer Engineering, National University of Singapore, Singapore

<sup>2</sup>Device Analysis Lab, Advanced Micro Devices Pte Ltd, Singapore

<sup>\*</sup>Tel: +65 6516 6471, <sup>\*</sup>email: aaron.thean@nus.edu.sg

*Abstract*- In this work, we present an approach that combines TCAD modelling and machine learning to predict the location of a bridging defect in a single-fin FinFET. The prediction is made by a Random Forest model trained on measurable electrical attributes extracted from the I-V characteristics. The proposed scheme achieves high accuracy in predicting the defect location, which can enhance the FA success rate and expedite the design-to-product cycle.

### Keywords– Defect Location Prediction, FinFET, Machine Learning, TCAD.

### I. Introduction

Failure Analysis (FA) is a critical process for improving semiconductor yield and reliability and for accelerating the product development cycle. In today's complementary metal-oxide-semiconductor (CMOS) technology, the number of transistors in an integrated circuit (IC) approximately doubles every two years, as predicted by Moore's law [1]. Continued scaling of CMOS devices thus results in exponential growth in the number of transistors on IC chips. Higher packing density allows more logic circuits to be fabricated on a given chip area, which in turn reduces the cost per function.

However, as transistor dimensions are aggressively scaled to the deep sub-micrometer regime and beyond, several serious challenges arise, including short-channel effects (SCEs) [2]. SCEs, such as drain-induced barrier lowering (DIBL), $V_{TH}$ roll-off, and punch-through, significantly increase the OFF-state current ($I_{OFF}$) of highly scaled MOSFETs. To overcome the issues associated with SCEs, FinFET devices [3] have been introduced in recent years to meet the high-performance and low-power requirements of state-of-the-art electronic products. However, the adoption of FinFETs has significantly increased the complexity of fault identification [4]. Firstly, nanoscale non-planar device structures have led to more occurrences of Non-Visual Defects (NVD), which lowers the chance of getting to the root cause of failure. Secondly, the complex multi-layer structure of FinFET devices leads to more complex Transmission Electron Microscopy (TEM) analysis, which is both time-consuming and difficult to prepare for. As such, defect identification workflows are becoming increasingly reliant on electrical nanoprobing. However, the electrical interactions between the defect, the transistors, and the complex interconnects can be difficult to partition. In light of the above, we propose a new defect prediction approach to improve the success rates of defect identification and localization.

| Parameter                                  | Value                             |
|--------------------------------------------|-----------------------------------|
| Fin height                                 | 50 nm                             |
| Fin width                                  | 10 nm                             |
| Gate length                                | 20 nm                             |
| Source / Drain doping                      | 1×10<sup>19</sup> cm<sup>-3</sup> |
| Body doping                                | 5×10<sup>17</sup> cm<sup>-3</sup> |
| Gate oxide (GOX) HfO<sub>2</sub> thickness | 2 nm                              |
| Gate oxide (GOX) SiO<sub>2</sub> thickness | 1 nm                              |

Figure 1: FinFET structure and parameters for TCAD.

The organization of this paper is as follows. Section II details the methodology, covering the setup of the TCAD model, dataset generation, and the predictive model for machine learning. Sections III-A and III-B present the analysis of the current-voltage (I-V) characteristics of a single FIN with different defect configurations. Section III-C discusses the performance of the Random Forest model in predicting the defect location. Finally, conclusions are drawn in Section IV.

### II. Approach

Because defects occur rarely in chips, it is extremely difficult to collect a statistically significant number of failing samples. In this context, defect modeling and simulation using calibrated TCAD models can serve as a forward prediction model to generate electrical responses for many defect-device configurations. In this work, we focus on bridging defects that lead to leakage and electrical shorts. Guided by actual defect and nanoprobing results, we built a database of bridging defects of different locations and sizes together with their electrical responses, and curated key electrical features that may identify the defects. A 3D TCAD model of the transistor and local interconnect is used to generate the labeled dataset for training the machine learning model. We tested the model against TCAD-generated defect-device configurations to evaluate its prediction accuracy.



Figure 2: Gate pattern defect in planar view (top) and cross-sectional view (bottom): (a) STEM image. (b) TCAD defect model.



Figure 3: Placement of the defect in a single-FIN FinFET: (a) Z-X view. (b) Z-Y view.



Figure 4: Classification of the regions serving as the classes for the dataset.

#### A. TCAD Model Setup

A single-fin FinFET structure is considered in this work. It is constructed using the process emulation method of the Synopsys Sentaurus Structure Editor [5], with the structure and parameters shown in Fig. 1. The correlation between simulated results and actual device failures was validated against existing FA cases involving a gate-pattern defect on a multi-fin FinFET, as described in [4]. The gate-pattern defect was introduced into the TCAD model as shown in Fig. 2. The simulated drain/source current versus gate voltage characteristics ($I_D/I_S-V_G$) qualitatively resemble the I-V behavior of the actual device.

#### B. Dataset Generation for Machine Learning

A single-fin FinFET with a bridging defect of fixed dimensions is considered in this work. The bridging defect is made of titanium nitride (TiN) and has fixed X, Y, and Z dimensions of 5 nm, 18 nm, and 3 nm, respectively.

As shown in Fig. 3, the defects are placed at various X and Z positions, with the Y-position fixed at 30 nm and 62 nm. The region is further divided into 10 sub-regions, which serve as the classes for the dataset, as shown in Fig. 4. Using the models employed in [4], the electrical characteristics of the FinFET with the introduced defects are simulated with the Synopsys Sentaurus device simulator.

Measurable electrical attributes are subsequently extracted from the I-V characteristics and serve as the feature set in the database for machine learning. Once the dataset is set up, a supervised learning algorithm based on Random Forest (RF) [6] is used for training and for predicting the defect location from the electrical attributes provided. A total of 273 samples were consolidated to form the dataset for training and validating the predictive model.
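To make the feature extraction step concrete, the sketch below shows one plausible way to derive subthreshold-swing and constant-current threshold-voltage features from an I-V sweep. It is not the authors' extraction script: the function names, the log-linear interpolation, and the synthetic sweep are assumptions for illustration, and the sweep is assumed to be restricted to the subthreshold region with non-zero currents. Only the $1 \times 10^{-8}$ A criterion and the feature names follow Table II.

```python
import math

def pointwise_swing(vg, current):
    """Subthreshold swing between neighbouring sweep points, in mV/decade."""
    swings = []
    for k in range(len(vg) - 1):
        decades = math.log10(abs(current[k + 1])) - math.log10(abs(current[k]))
        if decades > 0:  # skip flat or falling segments to avoid division by zero
            swings.append(1000.0 * (vg[k + 1] - vg[k]) / decades)
    return swings

def constant_current_vth(vg, current, i_crit=1e-8):
    """V_G at which |I| first reaches i_crit, interpolated on a log-current axis."""
    for k in range(len(vg) - 1):
        lo, hi = abs(current[k]), abs(current[k + 1])
        if lo < i_crit <= hi:
            frac = (math.log10(i_crit) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))
            return vg[k] + frac * (vg[k + 1] - vg[k])
    return None  # criterion not reached within the sweep

# Synthetic subthreshold sweep with an ideal 70 mV/decade slope (illustrative only).
vg = [0.05 * k for k in range(11)]            # 0 V to 0.5 V
i_d = [1e-12 * 10 ** (v / 0.07) for v in vg]  # hypothetical drain current in A

swings = pointwise_swing(vg, i_d)
features = {
    "S_min_D": min(swings),
    "S_avg_D": sum(swings) / len(swings),
    "S_max_D": max(swings),
    "V_TH_D": constant_current_vth(vg, i_d),
}
print(features)
```

For a defective device the minimum, average, and maximum swing would generally differ from one another, which is presumably why all three appear as separate features.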

#### C. Predictive Model Setup for Machine Learning

The machine learning component starts with a data preprocessing step to ensure the integrity of the inputs. Prior to training the predictive model, the dataset is randomly split into training and testing sets [7]. To reduce the risk of underfitting or overfitting, cross-validation is used to check that the model generalizes to an independent, unseen dataset. The optimal parameters for the Random Forest algorithm are obtained by grid search, which exhaustively evaluates the specified parameter ranges and selects the combination with the best cross-validation score.
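The workflow described above (random split, cross-validated grid search, Random Forest) can be sketched with scikit-learn [7] as follows. This is a minimal illustration under stated assumptions: the data are synthetic stand-ins rather than the TCAD dataset, and the parameter grid, fold count, and split ratio are hypothetical values, since the paper does not list them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the feature table: 3 well-separated classes, 4 features.
rng = np.random.default_rng(0)
n_per_class = 40
X = np.vstack([rng.normal(loc=3.0 * c, scale=0.5, size=(n_per_class, 4))
               for c in range(3)])
y = np.repeat(np.arange(3), n_per_class)

# Random split into training and testing sets (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Exhaustive search over a small, hypothetical parameter grid,
# scored by 5-fold cross-validation on the training set only.
param_grid = {"n_estimators": [25, 50], "max_depth": [None, 4]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# The held-out test set is used once, after the parameters are chosen.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the test set out of the grid search means the reported accuracy is not biased by parameter selection.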

### III. Results and Discussions

#### A. *I-V*: Defect within the FIN (S/D – Channel)

Firstly, the *I-V* characteristic of a defect located between the drain and the channel inside the FIN [Fig. 5(a)] is investigated. From the transfer characteristic shown in Fig. 5(b), the leakage current of the defective FIN is higher than that of the control device. To understand this trend, the band diagram at the OFF-state ($V_G = 0 \text{ V}$, $V_D = 1 \text{ V}$) is extracted in Fig. 6. The metallic defect (TiN) substantially alters the band diagram around the channel and drain. The 1-dimensional (1-D) band diagram along the defect reveals that the source barrier is significantly reduced by the defect compared with that of the control device, resulting in high leakage current.

On the contrary, the magnitude of the current level appears to be lower than that of the control device for the case where the defect is located in between the source and the channel [Fig. 7(a)], as demonstrated in Fig. 7(b).



Figure 5: (a) Defect in between the drain and the channel. (b) $I_D$-$V_G$ of the defective FIN and the control device in linear and logarithmic scale.



Figure 6: (a) 2-D conduction band ( $E_{C}$ ). (b) 1-D band diagram along the cut line where the defect is located for control and defective FIN.



Figure 7: (a) Defect in between the source and the channel. (b) $I_D$-$V_G$ of the defective FIN and the control device in linear and logarithmic scale.



Figure 8: (a) 2-D conduction band ( $E_C$ ). (b) 1-D band diagram along the cut line (A to A).

Similarly, the 2-D and 1-D band diagrams are examined to understand the physical mechanism behind the lower current level observed in the defective FIN. The 1-D band diagram along the defect (extracted at $V_G = 0.6$ V and $V_D = 1$ V) in Fig. 8(b) shows that the source barrier height is reduced less by $V_G$ in the defective FIN. The Schottky barrier formed by the defect between the source and the channel increases the source barrier. Consequently, fewer carriers can surmount the potential barrier, resulting in a lower current level in the defective FIN.
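As a rough guide to why a modest change in the source barrier has such a visible effect, a simple thermionic-emission-style estimate can be used. This is an assumption for illustration, not the transport model solved in the TCAD simulation; $\phi_B$ denotes the source barrier height and $\Delta\phi_B$ the barrier change caused by the defect, both symbols introduced here:

$$
I \propto \exp\!\left(-\frac{q\,\phi_B}{k_B T}\right), \qquad
\frac{I_{\text{defective}}}{I_{\text{control}}} \approx \exp\!\left(-\frac{q\,\Delta\phi_B}{k_B T}\right), \qquad
\Delta\phi_B = \phi_{B,\text{defective}} - \phi_{B,\text{control}}
$$

At $T = 300$ K, $k_B T / q \approx 25.9$ mV, so under this estimate each change of about 60 mV in the barrier shifts the current by roughly one decade. A lowered barrier ($\Delta\phi_B < 0$) is consistent with the higher leakage of the drain-side defect in Fig. 5(b), and a raised barrier ($\Delta\phi_B > 0$) with the lower current of the source-side defect in Fig. 7(b).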

#### B. *I-V*: Defect outside the FIN (S/D – Channel)

The schematic diagram in Fig. 9(a) shows a defect located outside of the FIN that electrically connects the drain and the gate electrode. The defect forms a resistive conducting path between the drain and the gate electrode. By Kirchhoff's current law (KCL), the drain current is the sum of the gate and source current contributions. From the characteristics of the drain current ($I_D$) and gate current ($I_G$) versus $V_G$ in Fig. 9(b), both $I_D$ and $I_G$ exhibit a linear dependence on the potential difference between the drain and gate electrodes due to the resistive path formed by the defect. It is also noted that the magnitudes of $I_D$ and $I_G$ are much larger than the source current ($I_S$), which follows the conventional current characteristic of a transistor [Fig. 9(c)].

Figure 9: (a) Transistor schematic showing that the gate and drain electrodes are electrically connected via the defect, as represented by a resistive path ($I_{D \rightarrow G}$). (b) $I_D / I_G$ versus $V_G$ of the FinFET, exhibiting a linear dependence on $V_G$. (c) $I_S$ versus $V_G$ follows the conventional current characteristic of a FinFET.

Figure 10: (a) Transistor schematic: the gate and source electrodes are shorted electrically by the defect, as represented by a resistive path ($I_{S \rightarrow G}$). (b) $I_D$ versus $V_G$ follows the conventional current characteristic of a FinFET. (c) $I_S / I_G$ versus $V_G$, showing that the magnitudes of $I_S$ and $I_G$ increase linearly with $V_G$.

A similar analysis applies to the scenario where the defect forms a resistive conducting path between the source and the gate electrode [Fig. 10(a)]. As depicted in Fig. 10(b), the characteristic of $I_D$ is similar to that of the control device. On the other hand, the magnitudes of both $I_S$ and $I_G$ increase linearly with $V_G$ due to the current flowing through the resistive conducting path between the source and the gate electrode, as shown in Fig. 10(c).
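The terminal-current behaviour in both cases can be summarised with a lumped picture. The relations below are a sketch under stated assumptions: the defect is treated as an ideal resistor $R_{\text{def}}$ (a symbol introduced here), gate-oxide leakage is neglected, the source is taken to be grounded, and currents are written as magnitudes to avoid committing to a sign convention.

For the drain-gate bridge of Fig. 9:

$$
|I_{D \rightarrow G}| = \frac{|V_D - V_G|}{R_{\text{def}}}, \qquad
|I_G| \approx |I_{D \rightarrow G}|, \qquad
|I_D| \approx |I_S| + |I_G|
$$

For the source-gate bridge of Fig. 10:

$$
|I_{S \rightarrow G}| = \frac{|V_G - V_S|}{R_{\text{def}}} = \frac{|V_G|}{R_{\text{def}}}, \qquad
|I_G| \approx |I_{S \rightarrow G}|, \qquad
|I_S| \approx |I_D| + |I_G|
$$

In each case the terminal that is not bridged carries only the channel current, so it retains the conventional transistor characteristic, while the two bridged terminals are dominated by the ohmic term and vary linearly with $V_G$.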

#### C. Performance of the Random Forest

Using the Random Forest model with the optimal parameters, an average accuracy score of 0.9612 is achieved over 1000 randomly split training and testing sets. The confusion matrix (Fig. 11) shows that all samples, except for those from region 5, are classified correctly. It also reveals that the dataset is imbalanced, with more samples from regions 4 and 9. To mitigate this issue, the "class\_weight" argument of the predictive model is set to "balanced", which weights each class inversely to its frequency in the training data.

Figure 11: Confusion matrix with all features considered. R stands for Region.

Figure 12: Distribution of the accuracies of the Random Forest model considering only the important features.

Table I. Average accuracy score

| Model                                   | Average Accuracy |
|-----------------------------------------|------------------|
| Random Forest (All features)            | 0.9612           |
| Random Forest (Important features only) | 0.9629           |

Table II. Feature importance of the Random Forest

| Feature                                                            | Importance |
|--------------------------------------------------------------------|------------|
| $S_{min\_S}$ (minimum subthreshold swing of $I_S$)                 | 0.122769   |
| $S_{min\_D}$ (minimum subthreshold swing of $I_D$)                 | 0.113200   |
| $V_{TH\_D}$ (threshold voltage at $I_D = 1 \times 10^{-8}$ A)      | 0.077937   |
| $S_{avg\_D}$ (average subthreshold swing of $I_D$)                 | 0.072414   |
| $S_{avg\_S}$ (average subthreshold swing of $I_S$)                 | 0.071190   |
| $S_{max\_D}$ (maximum subthreshold swing of $I_D$)                 | 0.071001   |
| $S_{max\_S}$ (maximum subthreshold swing of $I_S$)                 | 0.060934   |
| $V_{TH\_S}$ (threshold voltage at $I_S = 1 \times 10^{-8}$ A)      | 0.055311   |
| $I_S\_slope$ (average slope of $I_S$-$V_G$)                        | 0.050797   |
| $I_{Gsat}$ (gate current at $V_G = 1$ V, $V_D = 1$ V)              | 0.047725   |
| $I_{Dsat}/I_{Ssat}$ (ratio of $I_{Dsat}$ to $I_{Ssat}$)            | 0.044750   |
| $I_{Dsat}$ (drain current at $V_G = 1$ V, $V_D = 1$ V)             | 0.038916   |
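To illustrate what the "balanced" setting of "class\_weight" does, the sketch below reproduces the weighting rule documented for scikit-learn, $w_c = n_{\text{samples}} / (n_{\text{classes}} \cdot n_c)$, in plain Python. The label counts are hypothetical and chosen only to mimic over-represented regions; they are not the class counts of the paper's dataset.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Per-class weights w_c = n_samples / (n_classes * n_c)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

# Hypothetical label list: regions 4 and 9 have three times as many samples.
labels = ["R4"] * 6 + ["R9"] * 6 + ["R1"] * 2 + ["R5"] * 2
weights = balanced_class_weights(labels)

# After weighting, every class contributes the same total weight.
weighted_totals = {cls: weights[cls] * labels.count(cls) for cls in weights}
print(weights)
print(weighted_totals)
```

Note that this reweights the samples during training rather than adding or removing any, so the dataset itself stays unchanged.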

As listed in Table II, the important features are those related to the subthreshold swing ($S_{min}$, $S_{avg}$, and $S_{max}$), the threshold voltage ($V_{TH}$), $I_S\_slope$, $I_{Gsat}$, and $I_D\_slope$. From the perspective of device physics, the differentiation of defects in regions 1, 3, 4, 6, 8, and 9 is related to the subthreshold swing ($S_{min}$). Because defects in regions 1, 3, 6, and 8 are closer to the semiconductor area (regions 2 and 7), their influence on the electrostatics of the semiconductor is more pronounced than that of defects in regions 4 and 9. This leads to different subthreshold swing (SS) characteristics, which in turn affect the threshold voltage ($V_{TH}$) extracted using the constant-current method. This explains the importance of SS and $V_{TH}$ in classifying the defect regions.

As discussed in Sections III-A and III-B, a defect located outside of the FIN results in high gate leakage current, where $I_S$/$I_D$/$I_G$ show a linear dependence on $V_G$. This explains why $I_S\_slope$ and $I_{Gsat}$ are important features for distinguishing a defect located outside the FIN from one within the FIN.

Based on the feature importance scores, the Random Forest model is retrained using a reduced dataset that only considers features with an importance score higher than 0.05. The distribution of the model accuracy obtained from 1000 runs, presented in Fig. 12, shows that the accuracy is at least 0.86. The average accuracy improves slightly to 0.9629, compared with 0.9612 when all features are considered. This implies that the selected features are sufficient for accurate prediction, reducing the noise in the dataset and enabling the model to pick up the relevant features.
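The feature-selection rule can be checked directly against the scores in Table II. The sketch below applies the 0.05 threshold to those published values; the Python identifiers are shorthand names chosen here for the table entries, and retraining itself is not shown.

```python
# Importance scores copied from Table II (identifiers are shorthand labels).
importances = {
    "S_min_S": 0.122769,
    "S_min_D": 0.113200,
    "V_TH_D": 0.077937,
    "S_avg_D": 0.072414,
    "S_avg_S": 0.071190,
    "S_max_D": 0.071001,
    "S_max_S": 0.060934,
    "V_TH_S": 0.055311,
    "I_S_slope": 0.050797,
    "I_Gsat": 0.047725,
    "I_Dsat_over_I_Ssat": 0.044750,
    "I_Dsat": 0.038916,
}

THRESHOLD = 0.05  # cut-off stated in the text

# Keep features scoring strictly above the threshold, highest first.
selected = sorted(
    (name for name, score in importances.items() if score > THRESHOLD),
    key=lambda name: importances[name],
    reverse=True,
)
dropped = [name for name in importances if name not in selected]

print(len(selected), selected)
print(dropped)
```

Of the twelve features listed in Table II, nine pass the threshold and three fall below it.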

### IV. Conclusion

We demonstrated a systematic approach for predicting the location of bridging defects in a single-fin FinFET using a combination of a TCAD-generated defect database and machine learning. A Random Forest predictive model is trained on electrical attributes from the simulated I-V characteristics. The proposed scheme achieves high accuracy in predicting the defect location. It can be extended to other types of defects and to more complex circuits, such as multiple-fin FinFET transistors and SRAM bitcell structures. Once a properly calibrated TCAD transistor model is set up, it can be employed to predict failures in real failing devices. Finally, this machine-learning-aided defect detection system can further enhance the FA success rate for advanced nanoscale devices.

### Acknowledgements

This work is supported in part by A\*Star Accelerated Materials Development for Manufacturing Grant no: A1898b0043.

© 2019 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

### References

- [1] G. E. Moore, "Cramming more components onto integrated circuits," Proc. IEEE, vol. 86, no. 1, pp. 82-85, Jan. 1998.
- [2] D. J. Frank, R. H. Dennard, E. Nowak, P. M. Solomon, Y. Taur, and H. S. P. Wong, "Device scaling limits of Si MOSFETs and their application dependencies," Proc. IEEE, vol. 89, no. 3, pp. 259-288, 2001.
- [3] C. Auth et al., "A 22nm high performance and low-power CMOS technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors," 2012 Symposium on VLSI Technology, Honolulu, HI, 2012, pp. 131-132.
- [4] C. W. Teo, V. Narang, and A. Thean, "Electrical Characterization of FEOL Bridge Defects in Advanced Nanoscale Devices Using TCAD Simulations," 2018 IEEE International Symposium on the Physical and Failure Analysis of Integrated Circuits (IPFA), Singapore, 2018, pp. 1-4.
- [5] Synopsys Sentaurus Structure Editor user guide.
- [6] L. Breiman, "Random Forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
- [7] F. Pedregosa et al., "Scikit-learn: Machine Learning in Python," Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.