Journal of Economic Analysis

2811-0943

Anser Press

10.58567/jea03010007

JEA-50

Journal Article

A Comparative Machine Learning Survival Models Analysis for Predicting Time to Bank Failure in the US (2001-2023)

Diego Vallarino a,*

a Independent Researcher, Spain

* Correspondence: diego.vallarino@gmail.com

15 03 2024

3 1 129 144 11 06 2023 09 07 2023

2024

This is an open-access article distributed under a CC BY license (Creative Commons Attribution 4.0 International License https://creativecommons.org/licenses/by/4.0/)

This study investigates the time to bank failure in the US between 2001 and April 2023, based on data collected from the Federal Deposit Insurance Corporation's report "Bank Failures in Brief - Summary 2001 through 2023". The dataset includes 564 bank failures and several variables that may be related to the likelihood of such events: asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate. We explore the efficacy of machine learning survival models in predicting bank failures and compare the performance of different models. Our findings shed light on the factors that may influence the probability of bank failure over time and provide insights for improving risk management practices in the banking industry.

Keywords: Bank bankruptcy; survival analysis; stratified hazard model; survival machine learning models

1. Introduction

The forecast of company failure is important in both economics and society. Bankruptcies cause a breach in the business environment's stability, making estimating the sustainability of partners, clients, and financial institutions a particularly difficult and crucial problem for business players.

There is now a large number of bankruptcy prediction models (see M.A. Aziz et al., 2006; H.A. Alaka et al., 2017); however, virtually all of them are classification based, which means they estimate the posterior probability that a given business will fail based on its financial features. The time to failure is not explicitly considered. For example, if a classification model is based on data collected one year prior to failure, the model's output is the posterior probability that a given business will fail within one year. Decisions based on this probability may not be made in time to avert a failure that occurs in much less than a year.

Survival analysis, on the other hand, is concerned with the time of occurrence of the event of interest. Despite its prevalence in the medical and technological disciplines, survival analysis is seldom used to forecast financial failure. In their assessment of bankruptcy prediction models, Aziz and Dar (2006) included 12 kinds of classification models (ranging from discriminant analysis and logit to case-based reasoning, neural networks, and rough sets), but did not discuss survival analysis. According to this publication, the most often used approaches are multiple discriminant analysis and logistic regression; these two models account for more than half of the papers assessed. A 2018 study by H.A. Alaka et al. identified eight commonly used techniques, including two statistical approaches (multiple discriminant analysis and logistic regression) and six machine learning models.

As a result, we may infer that survival analysis is not a primary focus of financial failure prediction experts. Our research aims to assess the usefulness of survival analysis (SA) for bankruptcy prediction. SA models, like classification approaches, fall into two types: statistical and machine learning based. Statistical SA models debuted in the early 1970s, whereas machine learning SA models are the outcome of contemporary research. A large body of research confirms that machine learning models outperform statistical models in classification and regression tasks, particularly in classification-based bankruptcy prediction (see F. Barboza et al., 2017). Several articles offer similar findings on the superiority of machine learning techniques in different areas of survival analysis.

Despite these findings, most authors of bankruptcy prediction studies, especially those using SA, rely on the most basic statistical models (see A. Beretta et al., 2018; R.C. Cox et al., 2017).

In this paper, we innovate by comparing survival machine learning models, including ensemble models, and by interpreting the results of that comparison economically. Our analysis focuses on the performance of different models in predicting time to bank failure using a set of relevant variables. Specifically, we compare the predictive power of several machine learning survival models, including the Kernel SVM, DeepSurv, Survival Random Forest, and MTLR models. To compare the different machine learning algorithms, we use the concordance index (C-index).

Our goal is to identify which model provides the most accurate and informative predictions of time to bank failures and to interpret the economic significance of the model's results. To do so, we consider the significance and magnitude of the estimated coefficients for each variable in the model and compare these results to economic theory and intuition.

By analyzing the results of our model comparison and the economic interpretation of these results, we hope to provide insights into the factors that contribute to bank failures and to provide a better understanding of how different models can be used to predict these failures.

The remainder of the paper is structured as follows. Section 2 offers a theoretical perspective on the statistical models used to analyze the probability of bank failure and shows how these tools have advanced in the study of bank bankruptcy. One of the new questions that arises in bank failure analysis concerns the probability of survival: understanding the time until bankruptcy becomes critical. For that reason, the section closes with a short survey of papers that use survival analysis to address financial collapse. Section 3 describes the empirical analysis, including the models used, the data source, and the evaluation metrics.

We then present the results of the analysis, including a comparison of the different models, and we discuss the economic perspective of our findings. Finally, we conclude the paper by summarizing the key findings and their implications for future research and policymaking.

Overall, our study contributes to the literature on the use of survival analysis in finance and provides insights into the factors that drive financial collapses, which can help policymakers to design more effective regulations to prevent such events from occurring in the future.

2. Theoretical perspective

Signals reflecting a company's operational state may disclose symptoms of financial difficulty, which can subsequently be incorporated into prediction models. Beaver (1966) was the first to use financial ratios to forecast bankruptcy, and financial ratios have been the most important piece of information in financial distress prediction for decades (Altman, 1968; Ohlson, 1980).

Market-based knowledge may provide us with a timely forecast; that is, on the premise of efficient markets, the market price incorporates all future viewpoints (Bharath & Shumway, 2008; Merton, 1974). Corporate governance (Li, Crook, Andreeva, & Tang, 2021; Platt & Platt, 2002), corporate efficiency (Li, Crook, & Andreeva, 2017; Paradi, Asmild, & Simak, 2004), external resource considerations (Hu & Ansell, 2007), and macroeconomic issues are all elements to examine (Duffie, Saita, & Wang, 2007; Tinoco & Wilson, 2013).

Furthermore, in recent years, unstructured data has received a lot of attention in business research. Mai, Tian, Lee, and Ma (2019) utilized textual data, and Hosaka (2019) used picture data created from financial documents to forecast company bankruptcy using convolutional neural networks in information extraction. Bankruptcy and financial distress prediction research has used statistical analysis and data mining approaches to improve decision-making tools (Yang, You, & Ji, 2011). Altman (1968) was the first to apply multiple discriminant analysis (MDA), which was then expanded upon by Deakin (1972), Edmister (1972), and others.

Later, logistic regression (or Logit) replaced the Z-score because it provides probabilistic outputs (Martin, 1977; Ohlson, 1980), which became a Basel II criterion. Machine learning algorithms have appeared in the literature since the last decade of the twentieth century. Jabeur (2023), Tam and Kiang (1992), and Lacher, Coats, Sharma, and Fant (1995) employed neural networks to classify listed companies as bankrupt or non-bankrupt.

There are other innovative algorithms, including genetic algorithms (Back, Laitinen, & Sere, 1996), rough sets (Dimitras, Slowinski, Susmaga, & Zopounidis, 1999; Li, Wang, & Deng, 2008; Wang & Li, 2007), decision trees (Geng, Bose, & Chen, 2015), support vector machines (Hua, Wang, Xu, Zhang, & Liang, 2007; Min & Lee, 2005), and many hybrid and ensemble models such as Henriques et al. (2020), Choi, Son, & Kim (2018), du Jardin (2017), and Sun et al (2011).

Another kind of algorithm is mathematical programming. Data envelopment analysis (DEA) is a nonparametric approach for comparing companies and calculating relative efficiency based on the distance to the optimal frontier. Paradi et al. (2004), Cielen, Peeters, & Vanhoof (2004), Li, Crook, & Andreeva (2014), Li et al. (2017), and others have used DEA to forecast bankruptcy and financial difficulty. Altman, Marco, and Varetto (1994), Balcaen and Ooghe (2006), Kumar and Ravi (2007), and Verikas, Kalsyte, Bacauskiene, and Gelzinis (2007) evaluated the debates of several models (2010). While the preceding strategies represent financial hardship as a classification issue, a survival analysis methodology is concerned with the event's timing as well as its occurrence.

Survival analysis may also benefit from time-varying variables and censoring in modeling, making it preferable to static classification methods. Lane, Looney, and Wansley (1986) were the first to utilize the Cox proportional hazard model to forecast bank collapse. Cox proportional hazard models were employed by Luoma and Laitinen (1991) to forecast the failure of Finnish industrial and retail enterprises, although they were shown to be somewhat inferior to both discriminant and logit analysis.

Shumway (2001) creates a discrete-time bankruptcy hazard model that incorporates both accounting and market data. The discrete hazard model was used by Chava and Jarrow (2004), Carling, Pan, Ariyan, Narayan, and Truini (2007), Leong, Nguyen, Meredith, et al. (2008), and Leonardis & Rocci (2009) because of the benefits in parameter calculation and the type of variables reported regularly for companies (2008).

When compared to discriminant analysis and logistic regression in terms of prediction accuracy, Gepp and Kumar (2008) discovered that the Cox model was similar at equal misclassification costs but worse in adjusting to greater Type I error costs. Kristanti and Herwany (2017) discovered good findings using survival analysis on troubled enterprises in Indonesia. Recurrent event data are often used in medical research, most notably in the study of epilepsy, asthma, heart attacks, and hospital stays (Clayton, 1994). Within-subject correlation is a key feature of recurrent event data, in which one event raises or reduces the chance of following occurrences (Box-Steffensmeier & Boef, 2006).

Traditional statistical methods, such as logistic regression and Cox proportional hazards regression, either ignore the presence of recurring events or fail to account for within-subject correlation, resulting in an incorrect estimation of standard errors and a deviation from the original research question (Twisk, Smidt, & de Vente, 2005). Many approaches for analyzing recurring occurrences that incorporate all available information and within-subject correlations have been offered. Marginal intensity approaches, based on various definitions of risk sets, allow all cases to be at risk for each repeated event (Wei, Lin, & Weissfeld, 1989), whereas conditional intensity models are estimated in elapsed time or gap time, and cases are designated at risk for the kth repeated event only after experiencing the (k-1)th event (Andersen & Gill, 1982; Chang & Wang, 1999; Prentice, Williams, & Peterson, 1981).

The recurring occurrences in the Andersen-Gill (AG) model (Andersen & Gill, 1982) are considered to be ordered but have an equal chance of happening. The Prentice, Williams, and Peterson (PWP) model (Prentice et al., 1981) assumes that a person is not at risk for a future occurrence until the preceding event occurs. Although there is considerable literature on modeling recurrent events using the PWP model in the fields of medicine (e.g., Ejoku, Odhiambo, & Chaba, 2020; Moulton & Dibley, 1997; Peña, Slate, & González, 2007; Pfennig et al., 2010), consumer behavior (Bijwaard, Franses, & Paap, 2006), and product or equipment reliability (e.g., 1983; Jiang, Landers, & Reed Rhoads, 2006), there are just a few studies in corporate finance.

Parker, Peters, and Turetsky (2005), for example, employed the Cox and PWP models to examine the influence of corporate governance characteristics on the recurrent going-concern evaluations performed by auditors on failing enterprises. Wang and Carson (2010) used the PWP model to examine insurers' recurrent rating shifts. Godlewski (2015) used the PWP model to investigate the factors influencing renegotiations of corporate loan contracts between banks and European corporations.

Zhou et al. (2022) employed Cox analysis to investigate financial distress. They constructed three alternative models, each with its own set of variables, and sought to determine, via a single survival model, which elements best explain financial distress. These Cox models are not compared with other survival machine learning algorithms.

3. Empirical Analysis

In this section, we present the empirical analysis of the risk of bank failure in the United States. We analyze all 564 bank failures that occurred between 2001 and April 2023, as reported in the "Bank Failures in Brief - Summary 2001 through 2023" by the Federal Deposit Insurance Corporation (FDIC). The study of bank failures is of great importance in finance and economics, as it has significant implications for financial stability and the broader economy. In this section, we describe the models used in our analysis, the data source, and the evaluation metrics.

3.1. Models

3.1.1. Cox Proportional Hazards Model (coxph)

The Cox proportional hazards model is a widely used semi-parametric model in survival analysis. It assumes that the hazard function can be represented as the product of a baseline hazard function that depends only on time and a function of the covariates that does not depend on time. Mathematically, the model can be represented as:

$$h(t \mid x) = h_0(t)\,\exp\left(\beta^{T} x\right)$$

where $h(t \mid x)$ is the hazard function at time $t$ for covariate values $x$, $h_0(t)$ is the baseline hazard function, $\beta$ is a vector of regression coefficients, and $\exp(\beta^{T} x)$ is the relative hazard. For a single covariate $x_j$, $\exp(\beta_j)$ is the hazard ratio, which represents the change in hazard associated with a unit change in that covariate.
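The interpretation of the coefficients as hazard ratios can be checked numerically. The following is a minimal sketch, assuming hypothetical coefficient and covariate values chosen only for illustration (they are not estimates from this study); `relative_hazard` is a helper name introduced here.

```python
import math

def relative_hazard(beta, x):
    """Return exp(beta^T x), the factor that multiplies the baseline hazard."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

# Hypothetical coefficients for two covariates (illustrative values only).
beta = [0.5, -0.2]

# Two hypothetical banks that differ by one unit in the first covariate.
bank_a = [1.0, 3.0]
bank_b = [2.0, 3.0]

# The baseline hazard h0(t) cancels in the ratio, so the hazard ratio
# between the two banks is constant over time and equals exp(beta[0]).
ratio = relative_hazard(beta, bank_b) / relative_hazard(beta, bank_a)
print(ratio, math.exp(beta[0]))
```

Because the baseline hazard cancels, the ratio does not depend on time, which is the proportional hazards assumption in its simplest form.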

3.1.2. Multi-Task Logistic Regression (MTLR)

Multi-task logistic regression is a machine learning method that can be used for survival analysis. It is a multi-output learning algorithm that can predict the probability of an event occurring at different time points. Mathematically, the model can be represented as:

$$h(t \mid x) = \exp\left(\sum_{k=1}^{K} \sum_{j=1}^{p} \beta_{kj}\, x_{kj}\right)$$

where $h(t \mid x)$ is the hazard rate for an individual with covariates $x$, $\beta_{kj}$ are the regression coefficients for the $k$th characteristic of the $j$th group, and $x_{kj}$ is the $k$th feature of the $j$th group.

3.1.3. Kernel Support Vector Machine (Kernel SVM)

Kernel support vector machines are a popular machine learning method for survival analysis. They can handle non-linear relationships between covariates and outcomes by projecting the data into a higher-dimensional space using a kernel function. The model can be represented as:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} \alpha_i\, y_i\, K(x_i, x) + b\right)$$

where $K(x_i, x)$ is a kernel function that measures the similarity between the feature vectors $x_i$ and $x$, $y_i$ is the class label of the $i$th instance, $\alpha_i$ are the weights of the support vectors, and $b$ is the bias.
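The decision function above can be sketched in a few lines. This is an illustration only: the radial basis function kernel, the value of `gamma`, the support vectors, and the weights are all assumptions made for the example, not the configuration used in the study.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Radial basis function kernel: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate sign(sum_i alpha_i * y_i * K(x_i, x) + b).

    A score of exactly zero is mapped to +1 by convention here.
    """
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas))
    score += b
    return 1 if score >= 0 else -1

# Hypothetical support vectors, labels, and weights (illustrative only).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

near_positive = decision([1.9, 2.1], support_vectors, labels, alphas, bias)
near_negative = decision([0.1, 0.0], support_vectors, labels, alphas, bias)
print(near_positive, near_negative)
```

A point close to the positively labeled support vector receives a positive score because the kernel similarity to that vector dominates the sum.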

3.1.4. Random Survival Forest

Random survival forests are an extension of random forests for survival analysis. They use an ensemble of decision trees to predict the survival function. The model can be represented as:

$$h(t \mid x) = \frac{1}{B} \sum_{b=1}^{B} h_b(t \mid x)$$

where $h_b(t \mid x)$ is the hazard rate for an individual with covariates $x$ in the $b$th decision tree and $B$ is the number of trees in the random forest.
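The ensemble step in this formula is a pointwise average over trees. The sketch below assumes each tree has already produced a hazard estimate on a common time grid; the per-tree numbers are hypothetical and `ensemble_hazard` is a helper name introduced for illustration.

```python
def ensemble_hazard(tree_hazards):
    """Average per-tree hazard curves pointwise over a shared time grid.

    tree_hazards is a list of B lists, one per tree, each holding the
    hazard estimate of that tree at every point of the time grid.
    """
    n_trees = len(tree_hazards)
    n_times = len(tree_hazards[0])
    if any(len(curve) != n_times for curve in tree_hazards):
        raise ValueError("all trees must use the same time grid")
    return [sum(curve[k] for curve in tree_hazards) / n_trees
            for k in range(n_times)]

# Hypothetical hazard estimates from B = 3 trees at three time points.
per_tree = [
    [0.1, 0.2, 0.3],
    [0.2, 0.2, 0.2],
    [0.3, 0.2, 0.1],
]
averaged = ensemble_hazard(per_tree)
print(averaged)
```

Averaging smooths out the idiosyncrasies of individual trees, which is the usual motivation for forest-based estimators.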

3.1.5. DeepSurv

DeepSurv is a deep learning model for survival analysis. It uses a neural network with a flexible architecture to predict the survival function. The model can be represented as:

$$h(t \mid x) = \exp\left(\sum_{i=1}^{p} \beta_i\, f_i(x) + g\left(h_\theta(x)\right)\right)$$

where $h(t \mid x)$ is the hazard rate for an individual with covariates $x$, $\beta_i$ are the regression coefficients for the input features $f_i(x)$, $g(\cdot)$ is a non-linear function that transforms the output features, and $h_\theta(x)$ is a neural network with parameters $\theta$.

3.2. Data

In this analysis, we examine data on all 564 bank failures that occurred between 2001 and April 2023, as reported by the Federal Deposit Insurance Corporation (FDIC) in the "Bank Failures in Brief - Summary 2001 through 2023". The dataset contains information on several variables that may be related to the probability of bank failure. These variables include asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate. [1] The database does not contain censored data.

Asset (Millions): The amount of assets a bank owns can be a good indicator of its financial strength and ability to withstand a crisis. A bank with a large amount of assets is less likely to fail.

Deposit (Millions): The amount of deposits held by a bank is another indicator of its financial strength, since it represents the confidence of depositors in the bank. A bank with a large amount of deposits is less likely to fail.

ADR: The level of ADR (loan to deposit ratio) can be a good indicator of a bank's exposure to credit risk. A bank with a high ADR level could be at higher risk of bankruptcy in a recession or financial crisis.

Deposit Level: The level of deposits in relation to the size of the bank can be an indicator of the financial strength of the bank. A bank with a high deposit-to-size ratio is less likely to fail.

Asset Level: The level of assets in relation to the size of the bank can be an indicator of the financial strength of the bank. A bank with a high asset-to-size ratio is less likely to fail.

Inflation: The rate of inflation can affect the financial strength of a bank. High inflation can increase the risk of loan defaults, which would increase the probability of bank failure.

FFRate: The short-term interest rate set by the Federal Reserve can affect the financial strength of a bank. A high interest rate increases borrowing costs and reduces the number of loans that can be made, which can increase the likelihood of bank failure.

BanksRes: Bank reserves are funds that banks hold to cover potential losses on their loans and other assets. The higher a bank's reserves, the greater its ability to withstand a financial crisis and the lower its probability of failure.

GDP1pch: Represents the annual percentage change in the real Gross Domestic Product (GDP) of the United States relative to the previous year. The higher the GDP rate, the greater the bank's ability to survive based on the operation and health of its activity.

3.3. Metrics

3.3.1. C-Index

The C-index (also known as the concordance index, a generalization of the area under the receiver operating characteristic curve to time-to-event data) is a widely used metric in survival analysis and medical research to assess the performance of predictive models that estimate the likelihood of an event occurring over a given time period.

The C-index is computed from the ranking of predicted risk for each unit in a dataset. It calculates the proportion of comparable pairs in which the unit with the higher predicted risk experienced the event before the unit with the lower predicted risk. In other words, it assesses a predictive model's capacity to rank units in order of their likelihood of experiencing the event of interest.

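The C-index ranges from 0 to 1, with 0.5 representing random prediction and 1 indicating perfect prediction. In medical research, a C-index value of 0.7 or above is considered satisfactory performance for a prediction model. The formula for the C-index is:

$$C\text{-index} = \frac{\sum_{i,j} \mathbb{1}_{T_j < T_i} \cdot \mathbb{1}_{\eta_j > \eta_i} \cdot \delta_j}{\sum_{i,j} \mathbb{1}_{T_j < T_i} \cdot \delta_j}$$

where $\eta_i$ is the risk score of unit $i$; $\mathbb{1}_{T_j < T_i} = 1$ if $T_j < T_i$, else 0; $\mathbb{1}_{\eta_j > \eta_i} = 1$ if $\eta_j > \eta_i$, else 0; and $\delta_j$ indicates whether the event for unit $j$ was observed ($\delta_j = 1$) or censored ($\delta_j = 0$). For non-censored data, $\delta_j = 1$ for every unit.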
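The C-index formula translates directly into a double loop over pairs. The sketch below is a plain transcription of that formula, assuming hypothetical failure times and risk scores; like the formula, it gives no credit for tied risk scores, which some library implementations count as half-concordant.

```python
def c_index(times, risks, events=None):
    """Concordance index following the pairwise formula.

    times  : observed times T_i
    risks  : model risk scores eta_i (higher means earlier expected failure)
    events : delta_i, 1 if the event was observed, 0 if censored;
             defaults to all ones (no censoring)
    """
    n = len(times)
    if events is None:
        events = [1] * n
    concordant = 0
    comparable = 0
    for i in range(n):
        for j in range(n):
            # Pair is comparable when unit j failed (was observed) before unit i.
            if times[j] < times[i] and events[j] == 1:
                comparable += 1
                if risks[j] > risks[i]:
                    concordant += 1
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical failure times and risk scores for four banks.
times = [5, 10, 15, 20]
perfect = c_index(times, [0.9, 0.7, 0.4, 0.1])    # ranking fully agrees
reversed_ = c_index(times, [0.1, 0.4, 0.7, 0.9])  # ranking fully disagrees
one_swap = c_index(times, [0.9, 0.4, 0.7, 0.1])   # one discordant pair of six
print(perfect, reversed_, one_swap)
```

With four distinct times there are six comparable pairs, so a single misordered pair gives 5/6.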

4. Results

The Kaplan-Meier curves (Figure 1, Figure 2, Figure 3) display the survival probability over time for a group of banks. The x-axis shows the time, and the y-axis shows the survival probability. At the start of the observation period, all banks are assumed to be "alive," represented by the value of 1. Over time, some banks "die," meaning they fail, and the survival probability decreases.

Table 1 shows the evolution of the risk of bank failures over time. At time 1.07, there were 395 banks in the sample, and one bank failed. This translates to a survival probability of 0.99747 (i.e., (395 - 1)/395). The survival probability at time 4.07 is 0.99494 (i.e., 393/395), indicating that two banks in total had failed by then.
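The survival probabilities quoted from Table 1 follow from the Kaplan-Meier product formula, which for uncensored data reduces to the share of banks still alive. The sketch below assumes only the two event times quoted in the text (1.07 and 4.07) and the initial count of 395; `kaplan_meier` is a helper written for illustration, not the routine used in the study.

```python
def kaplan_meier(event_times, n_at_start):
    """Kaplan-Meier curve for uncensored data.

    At each distinct event time, the running survival probability is
    multiplied by (at_risk - failures) / at_risk.
    """
    survival = 1.0
    at_risk = n_at_start
    curve = []
    for t in sorted(set(event_times)):
        failures = event_times.count(t)
        survival *= (at_risk - failures) / at_risk
        at_risk -= failures
        curve.append((t, survival))
    return curve

# First two failure times quoted from Table 1, with 395 banks at the start.
curve = kaplan_meier([1.07, 4.07], n_at_start=395)
for t, s in curve:
    print(t, round(s, 5))
```

The second value equals (394/395) × (393/394) = 393/395, so each step removes one bank from the risk set.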

The survival probability decreases as time progresses. At time 42.37, 382 banks remained in the sample, with 13 banks having failed over the observation period. The survival probability at that time was 0.96456. This means that the cumulative probability of failure rose from about 0.3% at time 1.07 to about 3.5% at time 42.37. After that time, the survival probability decreased rapidly, suggesting an increase in the risk of bank failure during that period.

Between 120 and 150 months, there is a significant drop in the survival probability from 0.4000 to 0.1038. This indicates a much higher risk of failure during this period and could point to some event or factor that increased the risk of failure at that time.

It's important to note that the analysis does not provide any insight into the cause of the bank failures, and further investigation would be necessary to determine the reasons behind the increase in the risk of bank failures.

4.1. Model comparison

We analyzed the performance of different machine learning survival models in predicting bank failures using a set of relevant variables (Figure 4). The dataset was divided into a training set and a testing set. The code randomly selects 70% of the rows from the data frame df and assigns them to data.train; the train_index variable stores the numeric row indices of data.train. The remaining rows, which constitute 30% of the original data, are assigned to data.test. This separation allows a model to be trained on the training set and evaluated on the testing set to assess its effectiveness and generalization capabilities. The concordance index (C-index) was used to compare the predictive power of the different models.
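The split described above can be sketched as follows. This is an illustrative Python version, not the original code: the seed, the rounding rule, and the stand-in rows are assumptions, and the names `data_train` and `data_test` mirror `data.train` and `data.test` from the text.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Randomly assign a fraction of rows to training and the rest to testing.

    Returns the sorted training row indices, the training rows,
    and the testing rows.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    indices = list(range(len(rows)))
    n_train = round(train_fraction * len(rows))
    train_index = sorted(rng.sample(indices, n_train))
    chosen = set(train_index)
    data_train = [rows[i] for i in train_index]
    data_test = [rows[i] for i in indices if i not in chosen]
    return train_index, data_train, data_test

# Stand-in for the 564 bank records; each "row" is just its row number here.
df = list(range(564))
train_index, data_train, data_test = train_test_split(df)
print(len(data_train), len(data_test))
```

Sampling indices without replacement guarantees that no bank appears in both sets, so the test C-index is computed on banks the model has not seen.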

The model with the highest C-index, 0.985, was DeepSurv, indicating that it performed best in predicting bank failures using the selected variables. The Random Survival Forest had the second-highest C-index of 0.798, followed by MTLR with a C-index of 0.741. The Cox model had a C-index of 0.666, while the Kernel SVM had the lowest C-index of 0.571.

These results suggest that the DeepSurv model was the most effective in predicting bank failures, followed by the Random Survival Forest and MTLR models. This information is valuable for banks and regulatory agencies in predicting the likelihood of bank failures and taking the necessary actions to mitigate the risks.

The variables analyzed in the study, including asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate, can provide insights into the factors that contribute to bank failures. By understanding these variables, banks and regulatory agencies can take measures to reduce the likelihood of bank failures.

These results highlight the potential of ensemble machine learning survival models in predicting bank failures and provide insights into the factors that contribute to these failures. This information can be used to improve the stability of the banking system and reduce the risk of financial crises.

4.2. Economic perspective

4.2.1. Matrix Analysis

The relative weights matrix (Figure 5) can be useful in understanding how regulators and analysts assess a bank's risk of failure and which factors they consider most important at different times. However, these weights may change over time as markets and the economy evolve, and different regulators and analysts may take slightly different approaches to assessing bankruptcy risk.

In this analysis, it can be seen that at the beginning of the weighting matrix, a relatively high weight is given to the “FFRate” because interest rate fluctuations can have a large impact on a bank's income and expenses, especially in terms of loans and deposits. Also, interest rate changes can signal changes in the broader economy, which can affect the financial health of banks.

However, as time progresses in the weight matrix, it is observed that the "FFRate" loses weight compared to other variables, such as "Inflation" and "BanksRes". This may be due to a number of factors, including the increasing importance of other risk factors such as inflation and a bank's ability to maintain adequate reserves. It may also reflect a heightened awareness on the part of regulators and analysts that the interest rate alone is not enough to assess a bank's risk of failure and that multiple factors need to be considered.

At the end of the period, the weight matrix suggests that the size of a bank's assets and deposits is an important factor in reducing the probability of bankruptcy, while a high level of indebtedness and a low level of reserves increase the probability of bankruptcy. In addition, inflation appears to be a protective factor against bank failures. [2]

4.2.2. 2008 Financial Crisis

It is interesting to note that during the period from 100 to 150, which coincided with the 2008 financial crisis, the survival rate of banks fell from 0.88861 to 0.10380. This suggests that the financial crisis had a significant impact on the ability of banks to remain solvent.

Furthermore, it is important to note that the survival rate continued to decline after the crisis period, albeit at a slower rate. This could be indicative of the aftermath of the crisis, such as the economic downturn that followed and its lingering effects on the economy and the banking sector. Overall, these results underline the importance of considering the economic context and external events when analyzing the financial health of banks.

We can combine the information from the survival table and the weights matrix to better understand the variables associated with failure after period 150. From the survival table, we can see that the survival of banks decreases significantly after period 150. This may indicate that the variables with a greater weight in the weight matrix after period 150 have a greater impact on bank failures.

In the weights matrix, we can see that the variables "BanksRes" and "Inflation" have a relatively high weight after period 150. This suggests that these variables may be more important in predicting bank failures after the financial crisis of 2008. Post-crisis, regulators may have placed more emphasis on adequate capital buffers and the ability of banks to remain solvent in an environment of rising inflation. Therefore, these factors may have been more important in predicting bank failures in the post-crisis period.

4.2.3. If we include the data from GDP1pch

The variables with the greatest weight in predicting bank failure are Asset (Millions), ADR, Deposit Level and Deposit (Millions), in that order, all of them with negative weights, which means that as these variables increase, the probability of bank failure decreases.

The GDP1pch variable has the lowest weight of all, but it is also negative, which suggests that a decrease in economic growth increases the probability of bank failure. The other variables with negative weights are Inflation, FFRate, and BanksRes, indicating that high inflation, high interest rates, and low bank reserves also increase the probability of bank failure.

As for the Asset Level and Deposit Level variables, although they have positive weights, their weights are very low compared to other variables, so their effect in predicting bank failure is probably limited.

In summary, the results showed that banks with lower assets and lower deposit levels are more likely to fail, as are banks operating under high inflation, high interest rates, low bank reserves, and low economic growth.

5. Conclusion

The evolution of risk over time was used to analyze bank failures, and it showed a substantial decline in survival probability, particularly between 120 and 150 months, coinciding with the years 2009, 2010, and 2011, after the global financial crisis, which originated in the US.

According to the relative weights matrix study, interest rate changes, inflation, bank reserves, and the amount of assets and deposits were all critical variables in determining a bank's risk of failure. It is worth noting that during the 2008 financial crisis, the survival rate of banks declined dramatically, implying a severe effect on their capacity to stay viable. The data offered in this research may be utilized to enhance banking system stability and lessen the likelihood of financial crises.

When several machine learning survival models were compared in forecasting time to bank failure using a collection of relevant characteristics, the DeepSurv model was found to be the most successful, followed by the Random Survival Forest and MTLR models. The study's variables, which included asset amount, deposit amount, ADR, deposit level, asset level, inflation rate, short-term interest rates, bank reserves, and GDP growth rate, may give insight into the causes that lead to bank failures. Banks and regulatory bodies may lower the chance of bank failures by understanding these characteristics.

The initial high weight allocated to the "FFRate" variable in the weight matrix is one striking discovery. This highlights the substantial influence that interest rate swings have on a bank's revenue and costs, notably in terms of loans and deposits. Furthermore, fluctuations in interest rates may serve as indications of larger economic movements, which can have an impact on banks' financial stability.

However, as the study moves through the weight matrix, the weight allocated to "FFRate" decreases in comparison to other factors such as "Inflation" and "BanksRes." This trend might be linked to a number of causes, including the growing prominence of other risk variables such as inflation and a bank's capacity to keep enough reserves. It also implies that regulators and analysts are more aware that measuring a bank's risk of collapse requires taking many elements into account rather than relying on interest rates alone.

Toward the end of the weight matrix, the analysis suggests that the size of a bank's assets and deposits plays a critical role in reducing the likelihood of bankruptcy. A high degree of indebtedness and minimal reserves, on the other hand, increase the chance of failure. Furthermore, inflation appears as a protective factor against bank failures, emphasizing its significance for financial stability.

The research offers an important insight into the 2008 financial crisis. During this time, which spanned from period 100 to period 150, the survival rate of banks fell from 0.88861 to 0.10380. This demonstrates the significant effect of the crisis on banks' capacity to remain solvent. It is worth noting that the survival rate continued to fall following the crisis, but at a reduced pace. This decline may be symptomatic of the continuing impact of the economic slump that followed the crisis, indicating that external events continue to affect the banking industry.
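To make the notion of a survival rate concrete, the following minimal sketch computes a Kaplan-Meier estimate, in which the survival probability at time t is the product, over all failure times up to t, of one minus the ratio of failures to institutions still at risk. The function name and the toy data are hypothetical; the values are not the FDIC data used in the study and will not reproduce the figures quoted above.

```python
def kaplan_meier(times, events):
    """Return a list of (t, S(t)) pairs at each distinct observed failure time.

    S(t) = product over failure times t_i <= t of (1 - d_i / n_i), where
    d_i is the number of failures at t_i and n_i the number still at risk.
    """
    survival = 1.0
    curve = []
    failure_times = sorted({t for t, e in zip(times, events) if e})
    for t in failure_times:
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


# Toy data (hypothetical): observed periods and failure indicators (0 = censored).
toy_times = [2, 3, 3, 5, 8, 8, 9, 12]
toy_events = [1, 1, 0, 1, 1, 1, 0, 1]

curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"period {t:>2}: S(t) = {s:.3f}")
```

A steep drop between two periods, such as the one described for the crisis window, corresponds to a run of periods in which a large share of the institutions still at risk fail.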

When the insights from the survival table and the weight matrix are combined, a clearer picture emerges of the elements that contribute to failure beyond period 150. The survival table shows a considerable decline in bank survival throughout this time, indicating that factors with higher weights in the weight matrix had a stronger influence. The weight matrix, in particular, shows unusually large weights allocated to "BanksRes" and "Inflation" after the 150th period. In the environment that followed the 2008 financial crisis, these indicators may be increasingly crucial in forecasting bank failures. During this period, regulators may have put a greater focus on the need for appropriate capital buffers and on banks' capacity to remain solvent in the face of growing inflation.

One limitation of our research is the possibility of irregular activities in the investigated banks, as well as the lack of information about their management. Our research is based on official Federal Deposit Insurance Corporation (FDIC) figures, which may not give precise information on particular banks with distorted accounting indicators or irregular activities. As researchers, we do not have direct access to establish the degree of irregular activity inside the studied banks. This issue is inherent in working with publicly accessible data sources and might limit our investigation. It should be taken into account when interpreting our results. By acknowledging this limitation, we aim to remain transparent and give readers a thorough understanding of the constraints imposed by the data sources used in our study.

Based on these findings, we suggest the following next steps for additional research.

Extend the study to include a wider sample of banks or financial institutions, as well as a longer time span of observation. This might lead to a better understanding of the elements that contribute to bank failures and increase the prediction models' accuracy.

Consider using information on the banks' management practices, corporate governance, or social responsibility efforts as additional data sources or variables in the research. This might give a more comprehensive perspective of the variables influencing bank failures and a more sophisticated understanding of the link between these factors and bankruptcy risk.

Funding Statement

This research received no external funding.

Acknowledgment

The author thanks the anonymous referees for their comments and the editor for their effort.

Declaration of Competing Interest

The author claims that the manuscript is completely original. The author also declares no conflict of interest.

Notes

1. U.S. Inflation Rate 1960-2023 from World Bank; Federal Funds Effective Rate from FED; Liabilities and Capital: Other Factors Draining Reserve Balances: Reserve Balances with Federal Reserve Banks: Week Average from FED; Real Gross Domestic Product from U.S. Bureau of Economic Analysis.

2. Inflation could help banks generate more income on the same assets, increasing their ability to maintain adequate reserves and avoid bankruptcy. However, it is important to note that excessive inflation can also be detrimental to banks and the economy in general.

Figures and Tables

Figure 1. Kaplan-Meier survival curve.