The Relationship Between an Individual's Ability to Evaluate Probabilities and Stock Market Participation - Student Theses Faculty of Economics ...
The Relationship Between an Individual’s Ability to Evaluate Probabilities and Stock Market Participation

Master’s Thesis for MSc. Economics (EBM877A20) and MSc. Finance (EBM866B20)
Groningen, 02-06-2021
Wijmar B. Snijder, S2976188
University of Groningen, Faculty of Economics and Business
Supervisor: dr. R.D. Freriks

Abstract

Previous research has identified many determinants that explain why stock market participation (SMP) rates are lower in practice than financial theory predicts. The reasons why individuals stay out of stock markets are of interest because non-participation is often argued to be a driving force behind income inequality. This paper adds to the literature on the non-participation puzzle by focusing on a characteristic that has not been considered so far: the ability of individuals to assign probabilities to uncertain events. The focus is on SMP in the Netherlands, and the main sample of the study is made up of 1703 respondents of the DNB Household Survey (wave 2020). The relationship between the probability assigning ability (PAA) and SMP is analyzed using a Logit model. A PAA-index is constructed by means of a factor analysis on four questions of the survey that specifically measure this ability. The results are sizeable: a one standard deviation increase in the PAA-index raises SMP by about 4 percentage points. This is an interesting finding in light of welfare inequality, especially since the PAA-effect is found to be even larger for the poorest half of the sample.

Keywords: Financial Decision-Making, Household Finance, Limited Stock Market Participation, Probability Evaluation

JEL Classification: D91, G11, G59
Contents

1 Introduction
2 Literature
  2.1 Determinants of Stock Market Participation
  2.2 Non-participation Puzzle
  2.3 Probability Evaluating Theory
  2.4 Stock Market Participation in the Netherlands
3 Empirical Strategy
  3.1 Model Specifications
  3.2 Probability Assigning Ability Index
4 Data
  4.1 Overview of the Data
  4.2 Measurement of Probability Assigning Ability
5 Results
  5.1 Results of the Main Model
  5.2 Robustness Considerations
    5.2.1 Representativeness of the Sample
    5.2.2 Subgroup Analysis
6 Conclusion and Discussion
  6.1 Conclusion
  6.2 Discussion
References
A Appendix
1 | Introduction

When an individual is asked whether he or she prefers an expected return on savings of either 0% or 5%, it is no surprise what the answer would be. Although the SMP decision is obviously more complex than simply picking the highest expected return, it is striking that so many individuals do not participate in stock markets. According to theory, expected utility-maximizers should always invest at least an arbitrarily small amount in the risky asset, as long as they have positive financial wealth (Haliassos and Bertaut, 1995). Given that participation rates in the Netherlands have never exceeded 20%, it is evident that individual financial behavior is not aligned with what financial theory teaches us.

Individual financial behavior is central to the understanding of finance and how the economy operates. Over the last decades, a range of topics has been studied in the behavioral and household finance literature. The focus of these studies is on the differences between the way households are expected to make financial decisions based on economic welfare-maximizing theory and how these decisions are made in practice. One of these topics is the widespread absence from stock markets. Why is it that such a small part of the population holds stocks? It remains an open question why such a large part of the population does not participate. This phenomenon, referred to in the remainder of this paper as the non-participation puzzle, is the main focus of this study.

We currently live in an era of extremely low (even negative) interest rates, and therefore the non-participation puzzle is more interesting than ever. Stock market indices all around the world are near all-time highs. One explanation is that there is no alternative (TINA) in which households can invest their savings.
Furthermore, easier (online) access to brokers has lowered the barrier to participating in stock markets, which makes the puzzle even more interesting today. Liquidity constraints (Mankiw and Zeldes, 1991), participation costs (Haliassos and Michaelides, 2003; Vissing-Jørgensen, 2002) and unawareness of stock markets
(Guiso and Jappelli, 2005) are well-known contributions to solving the puzzle so far, but a great part is still unexplained. This paper adds to the understanding of the non-participation puzzle by focusing on one specific characteristic of potential entrants: their ability to assign the correct probabilities to uncertain events, referred to in the remainder of this paper as PAA.

The non-participation puzzle is of particular interest because nonparticipation in stock markets is often linked to income and welfare inequalities. Guvenen (2006) finds, in a model that compares otherwise identical stockholders and non-stockholders, that stockholders come to hold almost 80% of total wealth. For policymakers, it is highly relevant to understand why a certain part of the population invests whereas the majority does not. As more determinants become known, they might serve as a starting point for policymakers to eventually reduce welfare inequalities.

This paper is the first to link SMP to the ability to evaluate probabilities properly. The hypothesized mechanism is the following. A stock market is a place full of uncertain events, and therefore every participant should be able to evaluate the probability distribution of such events appropriately. For that reason, the prior expectation is that those who are better able to attach the right probabilities to uncertain events are more likely to participate in the stock market. When individuals cannot make a fair risk-reward trade-off before deciding to participate in financial markets, this could be a motivation to stay out. Note that this paper focuses on the ability to assign the right probabilities, given that the probability distribution is known beforehand. This should not be confused with the ability to forecast uncertain events.
In order to study whether the probability evaluating ability does indeed impact SMP in the Netherlands, this paper uses the most recent wave of the DNB (De Nederlandsche Bank) Household Survey (DHS). This survey contains a representative sample of the Dutch-speaking population. It provides, among other things, information regarding saving and investing behavior and presents the respondents with psychological concepts. The data of the 28th wave were collected between March 2020 and December 2020, so this is the period from which conclusions in this
paper are drawn.

The main sample for this study consists of 1703 respondents of the DHS, of whom 339 (20%) indicated that they invest in some way in the stock market. The ability of individuals to assign the right probabilities to uncertain events is measured using four questions from the economic and psychological concepts questionnaire of the DHS. In these four questions, participants are asked to assign the right probability to an uncertain event occurring or not, after being provided with the probability distribution of that event. In this way, the questions measure the ability of individuals to evaluate probabilities in a unique but effective way. A factor analysis is conducted on the four questions in order to construct a PAA-index. To test whether PAA indeed has an impact on SMP, a variable indicating whether individuals invest or not is regressed on the PAA-index together with a broad set of demographic control variables, using a Logit regression model.

The results confirm the expectations. Holding other factors constant, a one standard deviation increase in the probability evaluation index raises SMP by about 4 percentage points. This effect is quite large considering that only 20% of the individuals in the sample actually hold stocks. Interestingly, a subgroup analysis on financial wealth reveals that the PAA-effect is even stronger for the poorest half of the sample.

The remainder of this paper proceeds as follows. Section 2 presents the relevant literature regarding the non-participation puzzle and introduces probability evaluation theory. Section 3 presents the research method used for this paper, whereas Section 4 describes the data that is used. Section 5 presents the results of the main model and robustness checks before Section 6 provides a conclusion and discussion.
2 | Literature

The field of Behavioral Finance has, since its emergence in the 1980s, attempted to provide explanations for a phenomenon referred to in this paper as the non-participation puzzle. All contributions made so far provide insights from a different perspective, but all focus on the same question: why is it that so few individuals hold stocks? Haliassos and Bertaut (1995) show in their study that expected utility maximizers should always invest an arbitrarily small amount in the risky asset, as long as they have a positive real income. However, it is clear that this is not the case outside of textbook models. This holds in particular for the Netherlands, where individual participation rates on stock markets have never exceeded 20% (AFM, 2021). In the remainder of this section, determinants of stock market participation, the relevance of the non-participation puzzle, probability evaluation theory, and stock market participation in the Netherlands are discussed separately.

2.1 Determinants of Stock Market Participation

One of the first studies to address the differences between participants and nonparticipants of the stock market is the influential paper of Mankiw and Zeldes (1991), who discuss possible reasons for not holding stocks. They argue that a major part of the absence from the stock market is due to liquidity constraints, but offer no explanation for non-participants who do hold ample liquid assets.

Haliassos and Bertaut (1995) follow up by investigating several candidate explanations for why 75% of US households did not hold stocks at that time, despite the equity premium and the predictions of expected-utility models. They conclude that inertia and departures from expected utility maximization are a crucial part of the explanation for the puzzle, but encourage others to test for further explanations of the phenomenon.

The call for more explanations of limited participation in stock markets produced many insights. An important consideration is that participation costs, both monetary and informational, keep a significant part of households out of the stock market, which is the main argument of Vissing-Jørgensen (2002) and Haliassos and Michaelides (2003). However, both Mankiw and Zeldes (1991) and Heaton and Lucas (2000) argue that such fixed costs do not explain nonparticipation among the wealthy.

Other insights explaining the non-participation puzzle include that of Guiso and Jappelli (2005), who argue that a significant fraction of households is simply not aware of the existence of stocks and mutual funds. They argue that SMP could double if all consumers were aware of risky assets at that time. A related argument is given by Hong, Kubik, and Stein (2004), who state that social interaction impacts the participation decision. They argue that social households are more likely to invest in the stock market than non-social households. This is closely related to stock market awareness, since the authors explain their findings through a word-of-mouth information-sharing mechanism.

In relating SMP to the ability of individuals to assign proper probabilities to events, this paper stands on the shoulders of existing contributions that document the relationship between engagement in risky financial markets and households’ capabilities. The first to address this relationship was Campbell (2006), who showed that less educated households are more likely to make investment mistakes. Poorly educated households may be aware of their limited investment skill and may therefore withdraw from risky markets altogether. Moreover, the less educated are more likely to delegate financial decisions to professionals, which increases the costs of participation.
Looking beyond education alone, Grinblatt, Keloharju, and Linnainmaa (2011) find that SMP is monotonically related to IQ, a relationship that also holds among the affluent. Furthermore, Van Rooij, Lusardi, and Alessie (2011) show that SMP is independently impacted by financial literacy among households, and Christiansen, Joensen, and Rangvid (2008) find that having a background in economics increases the likelihood of holding stocks.

The final factor influencing the participation decision of households discussed
in this section is households’ risk appetite. However, there is no consensus in the existing literature about the way in which this variable impacts the participation decision. On the one hand, it could be reasonable to expect that nonparticipants are relatively risk-averse, in which case even small fixed entry costs are enough to deter them from participation (Campbell, 2006). Carroll (2002) argues along similar lines but holds that it is relative risk aversion that matters. The model he proposes, with declining relative risk aversion, explains the high participation rate of wealthy individuals. On the other hand, it could well be that risk-averse individuals have a strong precautionary saving motive and therefore attach more value to wealth accumulation. This line of thinking, introduced by Haliassos and Michaelides (2003) and Gomes and Michaelides (2005), predicts a higher rate of stock investment among risk-averse individuals.

2.2 Non-participation Puzzle

There are several reasons why it is of interest to solve the non-participation puzzle. First, and perhaps most important, is the total welfare effect that is implied by SMP. Guvenen (2006) finds that limited participation in stock markets is one of the driving forces behind income inequality. In a model where he compares otherwise identical stockholders and non-stockholders, he finds that stockholders come to hold almost 80% of total wealth, and are able to consume nearly 50 percentage points more per capita than non-stockholders. In this setting, he demonstrates the potential of limited participation for developing substantial cross-sectional heterogeneity. Dynan (2001) also acknowledges the effect of SMP on consumption. An increase in participation rates would allow a greater part of the population to enjoy consumption growth. Finally, Bovenberg, Koijen, Nijman, and Teulings (2007) are able to attach numbers to these welfare losses.
They find that not participating in the stock market for retirement and other savings yields a substantial welfare loss of 12 percentage points.

Although participating in the stock market is often associated with wealth accumulation, not every participant in the stock market has equal expected returns.
Calvet, Campbell, and Sodini (2007) argue that the less educated are more likely to invest inefficiently when they participate. They show that the welfare costs of not participating are lower by almost one-half for the poorly educated when under-diversification costs are taken into account. Grinblatt et al. (2011) make a similar argument: the risk-return trade-off is less favorable for low-IQ individuals, and therefore expected welfare costs are likely overestimated for them.

Other reasons why the non-participation puzzle is relevant include household choices regarding individual retirement accounts, which are extensively discussed by Bernheim and Garrett (2003). Furthermore, mutual funds and brokers are interested in broadening the stockholder base, and hence in understanding what triggers the decision to invest in the stock market (Guiso and Jappelli, 2005).

2.3 Probability Evaluating Theory

Coming up with the right probability of an event occurring or not is not as straightforward as it might sound, because probability evaluation goes hand in hand with certain behavioral biases. Kahneman and Tversky (1997) describe in their broadly acknowledged Prospect Theory how individuals’ perception of probabilities is structurally biased. In particular, they find that small probabilities are often overweighted and highly probable occurrences are in general underweighted. Dimmock and Kouwenberg (2010) added to the understanding of the non-participation puzzle by focusing on prospect theory. They found that loss aversion among households indeed reduces the probability of participation in stock markets.

Thaler and Sunstein (2009) discuss different ways in which individuals incorrectly evaluate the probabilities of events occurring or not. One of these mechanisms is the availability bias, in which individuals relate probabilities to the history of events when these past events are in reality independent.
A way to increase people’s fear of a bad outcome in this setting is to remind them of a related incident in which things went wrong. This availability bias is highly relevant for stock markets.
Goetzmann, Kim, and Shiller (2016) found that individuals tend to overestimate the probability of a stock market crash. In such a way, incorrectly interpreting probabilities might have an impact on the SMP decision of individuals.

According to Reyna and Brainerd (2008), the ability to come up with the right probability is closely related to the numerical skills of individuals. Studies such as Lusardi and Mitchell (2011) and Christelis, Jappelli, and Padula (2010) found that those with higher numeracy levels are much more likely to own stocks. Levels of numeracy among the population are found to be low in general, and particularly low for certain demographic groups such as women, the elderly, and those with low educational attainment (Lusardi, 2012).

2.4 Stock Market Participation in the Netherlands

Several studies highlight SMP in the Netherlands. Alessie, Hochguertel, and Van Soest (2002) devote an entire article to household portfolios in the Netherlands and explain why an analysis of the determinants of stock ownership is particularly interesting for this country. In the first place, they argue that the institutional setting makes an analysis interesting: financial markets are well developed compared to other countries in Europe, and the tax system implicitly and explicitly offers strong incentives for various types of wealth accumulation. The other interesting feature of the Netherlands is the availability of the DHS data set, which is unique in its form. It allows researchers to observe heterogeneity of SMP over households and over time.

Alessie et al. (2002) find that certain predictions of classical theory do not square with the empirical patterns in the Netherlands. Elderly households are found to participate more in risky asset markets, which is contrary to the hump-shaped age profiles predicted by theory and found in raw data for Italy and the US.
In the early twenty-first century, Guiso, Haliassos, and Jappelli (2003) found that there exist large contrasts in stock ownership levels among households in the Netherlands compared to the US. Stock ownership rates in the Netherlands were
about 21 percentage points lower than in the US at that time, which the authors related to higher participation costs, resulting from less competition, and to different levels of risk aversion.

The most recent overview of the Dutch Authority for the Financial Markets (AFM) on SMP shows an increasing trend in participation levels in the Netherlands. This trend of increasing participation and interest in stock markets is presented in Figure 1. The participation rate increased to 17% on an individual level in 2020, which is the highest rate since the AFM began measuring participation rates in their current form.

Figure 1: Stock market participation trend in the Netherlands
[Figure: share of respondents (%) by year, 2017 to 2020, for three series: investing in stock market, considering to start investing, household member is investing.]
Notes: The figure presents the distribution of answers to the question whether respondents have money invested (directly or indirectly) in the stock market. Only the last four years of data are presented since the AFM changed its questioning in 2017, which makes comparison with previous years impossible. Source: AFM (2021, p. 5).

The report presents some insightful information on the reasons for households to start investing in financial markets in recent years. Two-thirds of the households report that the demand for higher returns has been a reason to start participating, which is likely to be a result of the low interest rates in 2020. For about 53% of the new investors, the ease with which an account could be opened played a key role in the decision to start investing. The latter result is interesting since it implies that participation costs, as discussed in Section 2.1, might no longer have a large impact on potential investors in the Netherlands.
3 | Empirical Strategy

This section presents the research method that is pursued to measure whether individuals’ PAA is related to SMP. The first subsection introduces how this relationship is captured in a model. Since this paper is the first to relate SMP to probability evaluation, the method by which PAA is measured is described carefully. This is done in the second subsection, which concludes this section.

3.1 Model Specifications

Why is it that so few hold stocks? This question has been posed frequently in the existing literature and is the main focus of this paper as well. Different factors have been shown to have an impact on the decision to invest in stocks or not. One explanation for the limited SMP in the Netherlands that has not been explored yet is the ability of individuals to evaluate probabilities for uncertain events. A stock market is a place where uncertain events are daily routine, and therefore participants of the stock market are expected to be able to evaluate these events appropriately. The line of reasoning in this paper is the following: when individuals are poorly able to assign the right probabilities to events occurring or not, they are unable to make a proper risk-reward trade-off for participating in the stock market and might therefore decide to stay out. Moreover, when individuals are unable to evaluate probabilities correctly, they are more likely to exhibit behavioral biases in financial decision-making. Goetzmann, Kim, and Shiller (2016) already found that these types of individuals tend to overestimate the probability of a stock market crash, which is expected to impact the probability of entering the stock market.

A crucial note here is that this paper only regards the PAA of individuals given that the probability distribution of the uncertain event is known.
The way in which individuals are able to estimate probabilities based on their own expectations goes beyond the scope of this study.
In order to study the relationship between individuals’ PAA and the probability that they participate in stock markets, it is required to construct a variable that measures SMP. This variable is constructed as a dummy variable, since the information on the value of stock market wealth in the DHS data is rather noisy. For this reason, the questions where respondents reveal whether they invest in stocks or mutual funds are used to construct a dummy variable that takes a value of 1 for those who participate and 0 for those who do not. Note that no distinction is made between direct participation (through owning stocks) and indirect participation (by investing in mutual funds). The way in which the main explanatory variable, the PAA-index, is constructed is discussed extensively in the following subsection.

The empirical specification of this paper recognizes that there are many determinants of SMP. Therefore, a wide set of demographics available in the DHS is included. These variables are age, gender, education, marital status, number of children, wealth level, and financial literacy level. The variables are included in such a way that they are the easiest to interpret. Age is controlled for by setting up dummies for four different age categories. In order to control for gender, a dummy variable is created that takes a value of 1 for a male respondent and 0 otherwise. The end of this subsection discusses how the analysis deals with the fact that the final sample is highly skewed toward men (see footnote 1). The inclusion of ISCED levels in the sample allows controlling for the highest level of education completed by respondents. ISCED levels are constructed using World Bank standards. Note that no distinction can be made between a Bachelor’s, Master’s, or Doctoral degree, since the DHS data set only distinguishes university education as the highest educational level.
For this reason, the ISCED level can take a value between 0 and 6, rather than the 0-8 scale proposed by the World Bank. A dummy is created for marital status, which takes a value of 1 if the observed individual is married and 0 otherwise. Another dummy is created for having children, which takes a value of 1 if the respondent has at least one child and 0 otherwise.

Footnote 1: More about the origin of the unequal gender distribution can be found in Section 4.

In order to control
for financial wealth, dummies are included for each of the four wealth quartiles. Dummies are chosen for this variable because the measurement of wealth is rather noisy in the DHS. The use of dummies deals with this problem and makes it possible to observe how stock ownership increases over the wealth distribution. Since stock market wealth is obviously highly correlated with SMP, it is deducted from total financial wealth. The final control variable is financial literacy, which according to Van Rooij et al. (2011) is positively related to SMP. The self-reported financial literacy variable (see footnote 2) is constructed using the following question: "How knowledgeable do you consider yourself with respect to financial matters?". Dummies are created for each of the possible answers: 'not knowledgeable', 'more or less knowledgeable', 'knowledgeable', and 'very knowledgeable'.

The relationship between PAA and SMP is evaluated using the following main model:

$$SMP_i = \beta_0 + \beta_1 \cdot PAA_i + \beta_2 \cdot X_{1i} + \varepsilon_i \qquad (1)$$

In this model, $SMP$ is explained by the main explanatory variable $PAA$, whereas $X_{1i}$ contains the control variables introduced above. The main equation is evaluated using a Logit estimation model. This type of model is regarded as the most suitable, given the sample that is used for this study. The considerations behind this choice are the following. Firstly, the dependent variable of the model, SMP, is a binary variable. Secondly, the explanatory variables have been tested for multicollinearity, and this is not an issue. Finally, the sample size of 1703 respondents is argued to be sufficiently large for a logistic regression. A benefit of the Logit model is that it avoids the problem of the linear probability model that estimates are not bounded between 0 and 1. For these reasons, a Logit estimation model is deemed
For these reasons, a Logit estimation model is deemed 2 Note that the data wave consulted for this paper provides a proxy for self-reported financial literacy rather than the index that Van Rooij et al. (2011) use. However, in their paper, they argue that self-reported levels are highly correlated with actual financial literacy levels, which means that self-reported financial literacy serves as an adequate proxy. 14
to be the appropriate model to estimate Equation 1.

A Logit model transforms the linear probability model so that the fitted values are bounded between 0 and 1. The fitted relationship is no longer a straight line but rather an S-shaped curve. Mathematically the model looks as follows:

$$\ln\left(\frac{p(SMP_i)}{1 - p(SMP_i)}\right) = \beta_0 + \beta_1 \cdot PAA_i + \beta_2 \cdot X_{1i} + \varepsilon_i \qquad (2)$$

In this notation, $p(SMP_i)$ denotes the probability that an individual invests in the stock market, that is, $p(SMP_i = 1)$. In order to interpret the results in a straightforward setting, the estimates of the model are first transformed from log-odds into odds, and secondly into marginal effects. In this way the estimates show the marginal effect of a change in the main explanatory variable or any of the control variables on the probability of participating in the stock market.

Due to data attrition, the final sample used in this paper to test the main model is highly skewed toward men. The way in which this data attrition has affected the gender distribution is discussed more extensively in the next section. In order to deal with the unequal distribution of men and women in the sample, a second specification of the model is proposed. This second specification makes it possible to check whether the results of the main model also hold in a sample with an equal distribution of male and female respondents. The male-skewed nature of the sample is a result of the inclusion of indicators of financial wealth. In order to obtain a model that is as close as possible to the main model but has an equal distribution of men and women, a specification without financial wealth is considered:

$$SMP_i = \beta_0 + \beta_1 \cdot PAA_i + \beta_2 \cdot X_{2i} + \varepsilon_i \qquad (3)$$

where $X_{2i}$ contains the same control variables as in the main model but excludes financial wealth: age, gender, ISCED level, marital status, having children, and financial literacy. Besides the main model, Equation 3 is estimated as well to serve as a robustness check.
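The transformation from log-odds to odds and marginal effects described above can be illustrated with a minimal sketch. It is not the estimation code used for this paper: the data are simulated, the coefficient values are made up, and the function names `logistic` and `average_marginal_effect` are hypothetical. The sketch relies only on the standard result that, in a Logit model, the derivative of the probability with respect to a continuous regressor equals its coefficient times p(1 - p).

```python
import numpy as np


def logistic(z):
    """Map log-odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def average_marginal_effect(X, beta, k):
    """Average marginal effect of continuous regressor k in a Logit model.

    X    : (n, p) design matrix whose first column is a constant.
    beta : (p,) coefficients on the log-odds scale.
    For each observation, dp/dx_k = beta_k * p * (1 - p); the average
    marginal effect is the sample mean of these derivatives.
    """
    p = logistic(X @ beta)
    return float(np.mean(beta[k] * p * (1.0 - p)))


# Simulated data and made-up coefficients (assumptions for illustration only).
rng = np.random.default_rng(0)
n = 1000
paa = rng.standard_normal(n)                 # standardized index, like the PAA-index
male = rng.integers(0, 2, n).astype(float)   # one example control dummy
X = np.column_stack([np.ones(n), paa, male])
beta = np.array([-1.5, 0.3, 0.4])            # hypothetical log-odds coefficients

odds_ratio = np.exp(beta[1])                 # log-odds -> odds ratio
ame = average_marginal_effect(X, beta, k=1)  # odds -> change in probability

print(f"odds ratio for a one-unit increase: {odds_ratio:.3f}")
print(f"average marginal effect: {ame:.4f}")
```

Because the index in this sketch is standardized, the average marginal effect reads as the change in participation probability for a one standard deviation increase, which is the scale on which the results of this paper are reported.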
The main model with wealth dummies included is still preferred because wealth is considered to be one of the most important determinants of SMP. Moreover, the gender distribution in the treatment group, those who participate in the stock market, is similar to the distribution of Dutch investors in 2020 reported by the Dutch Authority for the Financial Markets (AFM). Its most recent report (AFM, 2021) states that 71% of investors are male and 29% female, whereas the main sample of this study has a distribution of 79% male investors and 21% female investors.

3.2 Probability Assigning Ability Index

In order to evaluate the ability of individuals to assign probabilities to uncertain events, the economic and psychological concepts module of the DHS wave 2020 is consulted. By answering four questions that specifically measure the PAA, respondents reveal their abilities. The exact wording of these four questions is presented in Appendix A. Since the four questions are open questions, the probability that answers are correct simply because of random guessing is minimized. For each of the four questions, a dummy variable is constructed that takes a value of one for those respondents who answered the question correctly and zero otherwise. The information available through these four questions is combined by performing a factor analysis. The objective of this factor analysis is to create an index based on different variables that all measure the same thing: the ability of individuals to evaluate probabilities. The remainder of this subsection presents the way in which this factor analysis is performed, and how an index for PAA has been constructed accordingly.

Before the factor analysis is actually carried out, the Kaiser-Meyer-Olkin measure is computed in order to observe whether the consulted data is suitable for factor analysis.
This test assesses whether the correlation matrix of the consulted data set is appropriate for factor analysis, according to Dziuban and Shirkey (1974). 16
The Kaiser-Meyer-Olkin index is defined as follows:

$$KMO = \frac{\sum_{i \neq j} r_{ij}^2}{\sum_{i \neq j} r_{ij}^2 + \sum_{i \neq j} u_{ij}^2} \tag{4}$$

where the $r_{ij}^2$ are the squares of the off-diagonal elements of the original correlation matrix and the $u_{ij}^2$ are the squares of the off-diagonal elements of the partial correlation matrix. The KMO measure that is obtained from the sample is about 0.68, which falls in the range that is conventionally labelled mediocre but still acceptable. From this measure it is concluded that factor analysis is an appropriate way to set up an index for the probability evaluation ability, given the nature of the sample.

The expectation is that there is one factor explaining the variance among the four questions, which all measure one and the same thing: the PAA. However, it is necessary to test whether there is indeed one main factor. In order to test this, the factor analysis is performed without specifying the number of factors. The variance of the first factor, the eigenvalues, and the scree plot that follow from this analysis all confirm this expectation; there is indeed one factor explaining the variance in all four questions.

An index is constructed by the use of a one-factor confirmatory factor analysis model. The remainder of this section describes how this factor analysis is carried out, making use of the most common notation. In factor analysis, the basic model is described as follows:

$$y = \Lambda x + z \tag{5}$$

where $y$ is a vector of observed scores on the four questions on probability estimation, $x$ is a vector of latent common factor scores, $z$ is a vector of unique scores, and $\Lambda = (\lambda_{im})$ is a matrix of factor loadings. The basic idea of the model is that the common factors $x$ account for all correlations between the $y$'s. In other words, the common factor (PAA) should account for all correlation between the four different questions from the DHS.

The model assumes that there exist common and unique factors in the data set that explain the variance of the indicators. The input variable that constructs the index, which in our case is just one factor, is written as a combination of the common factors and an error term:

$$x = \alpha_1 \cdot F_1 + \alpha_2 \cdot F_2 + \alpha_3 \cdot F_3 + \alpha_4 \cdot F_4 + \epsilon \tag{6}$$

In this notation, every $F_i$ represents a common factor and $\alpha_i$ is the factor loading of that factor in explaining the variation of $x$. In this paper's model, each $F_i$ represents one of the four questions from the DHS and $x$ represents the PAA, which is the only factor in constructing the index. For that reason the factor analysis index is constructed as follows:

$$I = x \tag{7}$$

After the factor analysis has been performed, the index is stored so that it can be added to the regression of Equation 1. The stored index assigns a score between -1.66 and 0.87 to every observation. In order to interpret this score more easily, the probability evaluating index is standardized so that it has a mean of 0 and a standard deviation of 1. This is done by subtracting the sample mean from every observation and dividing the result by the sample standard deviation.
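As a rough illustration of Equation 4 and of the final standardization step, the sketch below computes the overall KMO measure from a correlation matrix and rescales a vector of index scores. It is a minimal sketch under stated assumptions: the correlation matrix and the raw scores are hypothetical (they are not the DHS data), the helper names are invented for this example, and the partial correlations are obtained from the inverse of the correlation matrix. The factor analysis itself is not reproduced, and the use of the sample (n - 1) standard deviation is an assumption.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Overall KMO measure from a correlation matrix (Equation 4)."""
    inv = invert(corr)
    n = len(corr)
    sum_r2 = sum_u2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation of i and j given all other variables.
                u = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_u2 += u ** 2
    return sum_r2 / (sum_r2 + sum_u2)

def standardize(scores):
    """Rescale scores to mean 0 and sample standard deviation 1."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [(s - mean) / sd for s in scores]

# Hypothetical inputs for illustration only (not the DHS data).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
raw_index = [-1.66, -0.40, 0.10, 0.87]

print(round(kmo(corr), 4))
print([round(z, 3) for z in standardize(raw_index)])
```

In practice a statistical package would normally be used for both steps; the hand-written matrix inversion only keeps the sketch free of external dependencies.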
4 | Data

This section describes how the sample that is used to measure the relationship between PAA and SMP is created. Since this paper is the first to measure the impact of PAA, the data used for the PAA-index are carefully described in a separate subsection.

4.1 Overview of the Data

The data originate from the De Nederlandsche Bank (DNB) Household Survey (DHS). The DHS is an annual panel study of the Dutch central bank that is collected by CentERdata, a survey research institute at Tilburg University. The data set provides information regarding demographic and economic characteristics of a representative sample of the Dutch-speaking population in the Netherlands and contains over 2500 households. The DHS data are unique in the sense that they allow studies of both psychological and economic aspects of financial behavior.

For this study, the 28th wave is consulted, which was conducted over the period March 2020 - December 2020. The DHS consists of six different questionnaires, with topics ranging from general information regarding the household to economic and psychological concepts. A total of 2417 households participated in this year's data wave. Within each of these households, all persons aged over 16 were interviewed. The different questionnaires have different response rates, which implies that merging the different modules results in missing values for certain variables. Since the main dependent variable is SMP among individuals, either through direct stock ownership or through participation in a mutual fund, all observations that lack these data are dropped. This results in a starting sample of 2812 individuals. The main explanatory variable, the probability evaluating ability, is taken from a specific module of the DHS. The respondents who did not complete this module are dropped as well (111 respondents; 3.9% of the raw sample).

Since it is essential to control for financial wealth, total wealth data are included as well. Estimates of wealth levels in survey data are often reputed to be unreliable since there is no incentive to reveal the true value. However, the accuracy of the DHS data is argued to be no worse than that of other wealth surveys, and microdata sets have evident drawbacks as well (Alessie et al., 2002). A measure of financial wealth is constructed by the use of an aggregate wealth data set of the DHS, in which some missing observations are imputed³. Total wealth is defined as the sum of checking and saving accounts, employer-sponsored saving plans, home equity, cash value of life insurance, additional real estate, and other financial assets, minus total debt. A large part of the raw sample (860 respondents; 30.1% of the raw sample) did not provide complete information on their assets and debt and is therefore dropped. Furthermore, 15 observations are dropped because they evidently contained measurement errors in the estimation of total wealth, ending up with a highly negative value for total wealth⁴.

Merging the data sets of the other relevant variables loses another 123 observations. This final reduction is the result of incomplete responses to questions regarding educational level (29 observations) and marital status (94 observations). The final sample contains 1703 respondents, of whom 1102 (64.7%) are men and 601 (35.3%) are women. The fact that the final sample is heavily skewed toward men is the result of incomplete answers among women to questions regarding their financial wealth. The initial distribution of men and women in the raw data set is 50/50, but the 875 observations that have been dropped in the financial wealth step had a distribution of about 80/20, in favor of female respondents.
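The sample attrition described above can be traced step by step. The short Python sketch below only re-does the arithmetic with the counts reported in the text; the step labels are paraphrased for this example.

```python
# Sample attrition as reported in Section 4.1 (counts taken from the text).
start = 2812
drops = [
    ("did not complete the PAA module", 111),
    ("incomplete asset or debt information", 860),
    ("measurement error in total wealth", 15),
    ("missing educational level", 29),
    ("missing marital status", 94),
]

remaining = start
for reason, n_dropped in drops:
    remaining -= n_dropped
    print(f"{reason:<40} -{n_dropped:>4}  -> {remaining}")

# Gender split of the final sample, as reported in the text.
men, women = 1102, 601
share_men = 100 * men / remaining
print(f"final sample: {remaining}, men: {share_men:.1f}%")
```

The two wealth-related steps together account for the 875 dropped observations mentioned in the text (860 + 15).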
The consequences of dropping this selective group are closely analyzed and robustness checks are carried out to find out whether this heavily male-skewed sample impacts the results.

³ Please consult the DHS codebook wave 2020 for an overview of the imputation procedure.
⁴ These observations have been carefully analyzed before dropping them. Most of these observations had large negative wealth levels because the value of the house was much lower than the mortgage on the house. The reported value of the house, however, often did not coincide with the house's value as determined by municipalities (‘WOZ-waarde’). Moreover, it has been checked whether the respondents indicated that their house's worth had increased or decreased over the past years, in order to make a proper judgment on whether it is a measurement error or the house is over-indebted (‘onder water’).

A key variable in the data set is the proportion of the sample that invests in the stock market. 339 (19.9%) of the individuals in our sample state that they directly or indirectly (through mutual funds) invest in stocks. This is more or less in line with the findings of the Dutch Authority for the Financial Markets, which reports that 17% of Dutch individuals participated in the stock market in 2020 (AFM, 2021). It is insightful to compare the sociodemographic characteristics of the investing and non-investing parts of the sample. An overview of these characteristics is presented in Table 1. Although the data presented in the table are descriptive, the significant differences between investors and non-investors for certain variables hint at relationships that were introduced in section 2.1.

Table 1: Baseline characteristics of investors and non-investors

                                    Non-Investors   Investors        N
Gender
  Male                                      38.93    79.35***    1,102
  Female                                    61.07    20.65***      601
  Pearson χ2(1) = 93.11, p = 0.00
Age
  ≤ 35 years old                            10.85     9.21**       170
  36 ≤ age ≤ 50                             21.41    15.34**       344
  51 ≤ age ≤ 65                             29.03    35.99**       518
  Age > 65                                  38.71    42.18         671
  Pearson χ2(3) = 15.27, p = 0.00
Education
  Primary education                          2.42     0.29**        34
  Lower secondary education                 27.79    12.09***      420
  Upper secondary education                 10.04    10.03         171
  Senior vocational training                23.97    16.81***      384
  Short-cycle tertiary education            24.19    34.51***      447
  University                                11.58    26.25***      247
  Pearson χ2(5) = 90.73, p = 0.00
Marital status
  Married                                   52.20    48.08         888
  Unmarried                                 47.80    51.29         815
  Pearson χ2(1) = 0.01, p = 0.93
Having children
  Yes                                       23.02    18.88         378
  No                                        76.98    81.12       1,325
  Pearson χ2(1) = 2.70, p = 0.10
Wealth quartiles
  1 (low)                                   28.86     9.44***      426
  2                                         26.17    20.35**       426
  3                                         24.63    26.55         426
  4 (high)                                  20.31    43.66***      425
  Pearson χ2(3) = 104.44, p = 0.00
Financial knowledge
  Not knowledgeable                         14.96     4.72***      220
  More or less knowledgeable                54.62    41.89***      887
  Knowledgeable                             25.37    43.95***      495
  Very knowledgeable                         5.06     9.44***      101
  Pearson χ2(3) = 71.53, p = 0.00

Notes: The table reports the distribution of subgroups across investors and non-investors in percentages. Differences between investors and non-investors are measured per variable and per subgroup of each variable. *, **, and *** indicate significance at the 10%, 5%, and 1% level, based on mean differences, assessed with a t-test. The Pearson Chi-square statistic that tests whether the distribution of investors and non-investors is independent of the respective variable is reported with its p-value as well. Note that percentages might not add up to 100 due to rounding errors.

4.2 Measurement of Probability Assigning Ability

In order to measure individuals' PAA, the DHS questionnaire regarding economic and psychological concepts is used. The 28th wave is the first to include these types of questions. The response rate of this module was 78.6%, which is well in line with the response rate of the other questionnaires of the survey. In order to measure the capability of individuals to assign the right probabilities to events, four different questions of this module are consulted. The exact
wording of these four questions is presented in Appendix A. The four questions are open questions, in which respondents do not have the option to indicate that they ‘do not know’ the answer. They are provided with information on the probability distribution of an uncertain event occurring or not and are asked to state the right probability accordingly. A reader might think of these types of questions as being relatively straightforward, but there are individuals in the sample who do not grasp the notion of probabilities at all. About 2% of the respondents reported a probability of 50 for all four questions. They might interpret these types of questions as an event either occurring or not occurring; in other words, as if the probability distribution of every event were 50/50.

A total of 364 (21.4%) respondents answered all four questions correctly, whereas 103 (6.0%) participants answered all four questions incorrectly. The average number of correctly answered questions is 2.5. In Table 2, the number of correct answers to the four questions is presented across certain demographic variables.

Certain patterns can be observed from the table. As expected, the number of correctly answered questions increases strongly with the level of education. Those with the lowest number of correct answers are concentrated in the lowest educational categories: primary education and lower secondary education. Conversely, those with a university degree performed best. They answer on average 3.29 questions correctly out of a maximum score of 4. Similar to the findings of Lusardi (2012) regarding numerical skills, women and those in higher age categories underperform relative to the sample. For higher wealth levels, an increasing trend in the number of correct answers is observable as well.
Table 2: Number of correct answers across baseline characteristics

                                     Correct answers (%)
                                       0       1       2       3       4    Mean       N
Gender
  Male                              5.99   12.34   22.32   34.85   24.50    2.60   1,102
  Female                            6.16   16.31   28.29   33.61   15.64    2.36     601
Age
  ≤ 35 years old                    5.29    7.65   16.47   28.82   41.76    2.94     170
  36 ≤ age ≤ 50                     6.69    9.01   23.55   31.40   29.36    2.68     344
  51 ≤ age ≤ 65                     5.41   11.58   23.75   38.42   20.85    2.58     518
  Age > 65                          6.41   19.37   27.42   34.28   12.52    2.27     671
Education
  Primary education                 6.06   18.18   45.45   30.30    3.03    2.06      34
  Lower secondary education        10.48   22.38   29.52   32.14    5.48    2.00     420
  Upper secondary education         3.51   13.45   18.13   36.84   28.07    2.73     171
  Senior vocational training        6.25   14.58   29.95   37.76   11.46    2.34     384
  Short-cycle tertiary education    4.70    9.84   23.49   37.14   24.83    2.67     447
  University                        2.43    4.45   10.53   27.13   55.47    3.29     247
Marital status
  Married                           5.18   12.61   25.56   35.25   21.40    2.55     888
  Unmarried                         6.99   14.97   23.19   33.50   21.35    2.47     815
Having children
  Yes                               7.41    8.99   21.69   34.92   26.98    2.65     378
  No                                5.66   15.09   25.21   34.26   19.77    2.47   1,325
Wealth quartiles
  1 (low)                           8.22   16.43   24.65   33.10   17.61    2.35     426
  2                                 7.04   11.27   27.23   33.33   21.13    2.50     426
  3                                 4.46   16.20   23.71   35.45   20.19    2.51     426
  4 (high)                          4.47   11.06   22.12   35.76   26.59    2.69     425
Financial knowledge
  Not knowledgeable                 7.73   17.27   25.45   32.73   16.82    2.34     220
  More or less knowledgeable        5.86   14.99   26.38   34.61   18.15    2.44     887
  Knowledgeable                     6.26   11.31   21.41   35.35   25.66    2.63     495
  Very knowledgeable                2.97    6.93   19.80   31.68   38.61    2.96     101

Notes: The table reports the distribution of correctly answered questions that measure PAA across certain subgroups. It presents the distribution of correct answers per subgroup in percentages. The mean number of correctly answered questions per subgroup is presented as well. The questions are part of the economic and psychological concepts module of the 28th DNB Household Survey. The exact wording of these questions is presented in Figure 3. Note that percentages might not add up to 100 due to rounding errors.
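The "Mean" column of Table 2 follows directly from the percentage distribution in each row. As a small cross-check, the Python sketch below recomputes the mean for three rows copied from the table; the function name is introduced for this example only.

```python
def mean_correct(pct_by_score):
    """Mean number of correct answers implied by a percentage distribution.

    pct_by_score[k] is the percentage of the subgroup with k correct answers.
    Dividing by the sum of the percentages guards against rounding in the table.
    """
    total = sum(pct_by_score)
    return sum(k * pct for k, pct in enumerate(pct_by_score)) / total

# Rows copied from Table 2 (percentages with 0, 1, 2, 3, 4 correct answers).
table2_rows = {
    "Male": [5.99, 12.34, 22.32, 34.85, 24.50],
    "Female": [6.16, 16.31, 28.29, 33.61, 15.64],
    "University": [2.43, 4.45, 10.53, 27.13, 55.47],
}

for group, pcts in table2_rows.items():
    print(f"{group}: {mean_correct(pcts):.2f}")
```

For these three rows the recomputed means agree with the reported values of 2.60, 2.36, and 3.29.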
5 | Results

While this paper builds on earlier contributions regarding the non-participation puzzle, it is the first that links stock market participation to the ability of individuals to assign the right probabilities to uncertain events. The stock market is full of uncertain events, and therefore a crucial ability of any participant is to evaluate uncertain events in an appropriate way. This paper analyzes whether there exists a relationship between PAA and SMP in the Netherlands using DHS data. Figure 2 shows how SMP rates increase for higher levels of PAA, which is proxied by four questions that measure the ability. In order to test this relationship while controlling for a set of other variables, the results of the main estimation model are presented in the first section. Since the sample that is used to test the main model has certain challenging features, considerable weight is given to proper robustness checks, which are presented in section 5.2.

[Figure 2: Stock market participation and probability assigning ability. Vertical axis: stock market participation (%); horizontal axis: number of correct answers (0 to 4).]

Notes: The figure presents the distribution of correct answers on the four questions that are used to set up the PAA-index. The four questions originate from the economic & psychological concepts module of the 28th wave of the DHS.
5.1 Results of the Main Model

In order to formally test the relationship between the probability evaluation ability and stock market participation, a Logit regression as introduced in Equation 1 is estimated. The output of a Logit estimation is generally in log-odds. Since these log-odds are challenging to interpret, the estimates are transformed into odds and thereafter into marginal effects. In this way, the interpretation is similar to the interpretation of the estimates of a linear probability model. In Table 3, marginal effects are reported for three different specifications of the Logit model. For this subsection, only the Logit (1) column is relevant, since the Logit (2) and Logit (3) columns are reported for robustness purposes, which are extensively discussed in the second subsection.

The empirical approach in this paper recognizes that there are many different determinants of stock ownership. An extensive set of demographic variables that are available in the DHS data is included in the model. The signs of these control variables all coincide with the expectations based on previous evidence, which is introduced in section 2.1. Moreover, most of the included parameters show high statistical significance, which allows conclusions to be drawn on their relationship to SMP. Gender, wealth, educational level, and financial literacy all turn out to have a significant impact on SMP. This is in line with expectations, since their impact on stock ownership has already been documented in the existing literature.

Most interesting are the estimates for PAA, which is the key explanatory variable in this study. Even after controlling for a large set of demographic characteristics, the estimates reveal that individuals' PAA is related to SMP. Those who have a better grasp of assigning probabilities to uncertain events are more likely to participate in the stock market⁵. A one standard deviation increase in the PAA-index is associated with an increase in SMP of about 4 percentage points, ceteris paribus. The positive relationship is statistically significant at any reasonable significance level. Given that stock market participation rates in the Netherlands have never exceeded 20%, this is a considerable finding. The impact is highly similar to that of a one-level increase in ISCED, which measures the formal level of education.

⁵ Note that stock market participation in this paper is measured as a binary variable capturing both direct and indirect investing among respondents of the DHS.

Table 3: Logit regression estimates

VARIABLES                         Logit (1)    Logit (2)    Logit (3)
Probability assigning ability     0.039***     0.037***     0.038***
                                  (0.010)      (0.008)      (0.011)
Male                              0.092***     0.083***     0.095***
                                  (0.022)      (0.014)      (0.023)
Age dummies (base group: age ≤ 35)
  36 ≤ age ≤ 50                   0.026        0.061**      0.015
                                  (0.040)      (0.026)      (0.044)
  51 ≤ age ≤ 65                   0.099***     0.145***     0.098**
                                  (0.038)      (0.024)      (0.042)
  Age > 65                        0.088**      0.139***     0.082*
                                  (0.038)      (0.026)      (0.042)
ISCED                             0.043***     0.044***     0.043***
                                  (0.007)      (0.005)      (0.007)
Married                           -0.093***    -0.038**     -0.093***
                                  (0.020)      (0.015)      (0.021)
Having children                   -0.044*      -0.037**     -0.093**
                                  (0.025)      (0.017)      (0.025)
Total wealth dummies (base group: first wealth quartile)
  Second wealth quartile          0.092***
                                  (0.032)
  Third wealth quartile           0.143***
                                  (0.031)
  Fourth wealth quartile          0.199***
                                  (0.031)
Log (Financial wealth)                                      0.046***
                                                            (0.008)
Financial knowledge (base group: not knowledgeable)
  More or less knowledgeable      0.090**      0.083        0.081**
                                  (0.037)      (0.027)      (0.039)
  Knowledgeable                   0.171***     0.153***     0.171***
                                  (0.038)      (0.027)      (0.040)
  Highly knowledgeable            0.168***     0.170***     0.162***
                                  (0.047)      (0.035)      (0.050)
Cons.                             -0.006***    0.005***     0.000***
                                  (0.002)      (0.002)      (0.000)
Observations                      1,703        2,578        1,606
Pseudo R-squared                  0.16         0.15         0.16

Notes: The table presents results from a Logit regression in which the dependent variable is a dummy for stock market participation. Results of the Logit regression are transformed into odds and thereafter into marginal effects. Probability assigning ability is a standardized index that measures the particular ability. ISCED levels are on a scale from 0-6 and measure the highest level of education completed. Total wealth dummies are based on total financial wealth net of stock market wealth. Robust standard errors are presented in parentheses. *, ** and *** indicate significance at the 10, 5 and 1 percent level, respectively.

Although the impact of the PAA-index is relatively large, it does not match the impact of wealth levels and self-assessed literacy. Individuals who find themselves in the highest wealth quartile are about 20 percentage points more likely to invest in stocks compared to those in the lowest wealth quartile, ceteris paribus. Furthermore, those who consider themselves highly knowledgeable with respect to financial affairs have participation rates that are about 17 percentage points higher than those who consider themselves not knowledgeable with respect to financial matters, holding other factors constant.

5.2 Robustness Considerations

5.2.1 Representativeness of the Sample

Since there are some challenges that arise from the use of the DHS data set, it is crucial to test whether the findings of the main model also appear in slightly different specifications of the model. First of all, the male-skewed nature of the sample might potentially impact the estimates of the main model. This concern is extensively introduced in section 3.1. In order to observe the PAA-effect in a sample where the distribution between males and females is 50/50, a second specification of the model is proposed in Equation 3. The only variable that is excluded in this specification is the measure of financial wealth. Results of the estimation of the second model are presented in the Logit (2) column in Table 3. This additional specification is insightful because it makes it possible to compare the main results of the study with those from a sample that is roughly 50% larger (2,578 versus 1,703 observations).
A look at the estimates shows that hardly any coefficient changes much after dropping the wealth variable. Only the age dummies increase by a few percentage points, and the impact of marital status diminishes. More importantly, the estimate of the key explanatory variable, the probability estimating ability, is practically the same as in the main model. The latter finding indicates that the male-skewed nature of the sample does not affect the magnitude of the PAA-effect by much.

Besides the effect that the financial wealth variable has on the gender distribution, it is also a variable that is measured with considerable noise. For this reason, a second robustness check is performed. Rather than including dummy variables for wealth levels, a specification with the logarithm of total wealth (excluding stock market wealth) is considered. Results of this specification are presented in the Logit (3) column in Table 3. Since the data set contains 97 observations with a negative value of total wealth⁶, the sample size of this specification is lower than that of the main model. Results of the third specification hardly differ from the results of the main model. Only the estimate for having children differs by a few percentage points from the main model. This implies that the results of the main model are robust to different specifications of the wealth variable.

⁶ Negative wealth levels are a result of over-indebted houses or study loans.

5.2.2 Subgroup Analysis

The main results that are presented in Table 3 do not provide any information on the heterogeneity of PAA-effects among different subgroups. They instead reveal the mean PAA-effect for the entire sample. This heterogeneity of PAA-effects is tested for gender, wealth level, and financial literacy. These subgroups are chosen because the results in Table 3 revealed that these variables have a large impact on SMP, so it is interesting to observe whether the PAA-effect differs among these groups. The difference in PAA-effect per subgroup is tested by including an interaction term in the main regression. This interaction term is added for each of the variables mentioned above, and the results of these additional specifications are presented in Table 4. The estimates that are presented in the table follow from three different Logit estimations, and results are presented as marginal effects.

The estimates in Table 4 reveal that the PAA-effect is more or less equal for men and women. The interaction term for PAA and gender is close to zero, from which it is concluded that PAA-effects do not differ between males and females. This observation makes the unequal distribution of males and females in the main sample less of a concern.

The results for the PAA-effect differentiated by the financial wealth of the observed individuals provide interesting insights. The high- and low-wealth categories are here defined as the two highest wealth quartiles and the two lowest wealth quartiles. Being able to assign the right probabilities to uncertain events has a greater impact on SMP for those with lower financial wealth than for those with higher wealth levels. A one standard deviation increase in the PAA-index is associated with a 7.4 percentage point increase in SMP for individuals with relatively low wealth levels, whereas for individuals in the upper half of the wealth distribution this effect is about 5 percentage points lower. The policy implications of this finding are discussed in the final section.

Finally, the estimates that distinguish the PAA-effect for individuals with higher and lower levels of self-assessed financial literacy are presented in the final column of Table 4. Individuals with high financial literacy are defined as those who consider themselves ‘knowledgeable’ or ‘highly knowledgeable’, whereas individuals with low financial literacy consider themselves ‘not knowledgeable’ or ‘more or less knowledgeable’. The interaction term of the model is close to zero and highly insignificant. From this it is concluded that there is no evidence that the PAA-effect differs across levels of financial literacy.
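To make the role of the interaction term concrete, the following Python sketch shows how a PAA x group interaction in a Logit model yields a separate marginal effect of PAA for each group. It is a minimal sketch under stated assumptions: all coefficients are hypothetical and chosen only to illustrate the mechanics, they are not the estimates behind Table 4, and the coding of the group dummy (1 = upper half of the wealth distribution) is likewise assumed for this example.

```python
import math

def logistic(z):
    """Map log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def paa_marginal_effect(b0, b_paa, b_group, b_inter, paa, group):
    """dP/dPAA in a Logit with a PAA x group interaction, for group in {0, 1}."""
    z = b0 + b_paa * paa + b_group * group + b_inter * paa * group
    p = logistic(z)
    # The log-odds slope of PAA is b_paa for group 0 and b_paa + b_inter for group 1.
    return (b_paa + b_inter * group) * p * (1.0 - p)

# Hypothetical coefficients on the log-odds scale (not estimates from the thesis).
b0, b_paa, b_group, b_inter = -1.8, 0.45, 0.9, -0.30

# Marginal effect of PAA evaluated at the mean of the standardized index (PAA = 0).
low = paa_marginal_effect(b0, b_paa, b_group, b_inter, paa=0.0, group=0)
high = paa_marginal_effect(b0, b_paa, b_group, b_inter, paa=0.0, group=1)
print(f"low wealth: {low:.3f}, high wealth: {high:.3f}")
```

Because the Logit model is nonlinear, the difference between the two marginal effects depends on the point at which they are evaluated, not only on the interaction coefficient itself.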