Title Page  

Doctoral Thesis 

Cognitive Traits and Entrepreneurial Pursuits in the Digital Age:  

A Multi-Layered Perspective 

 
In partial fulfillment of the requirements for the degree of 

Doctor of Economics and Business Studies 

(Doctor rerum politicarum, Dr. rer. pol.) 

 
Author 

Philipp Schade 

 
Submitted to Justus Liebig University Giessen 

Department of Business Administration and Economics 

Research Network Digitalization  

Giessen, August 10th, 2023 

 
Supervisors 

Prof. Dr. Monika C. Schuhmacher 

Chair of the Department for Technology, Innovation, and Start-up Management 

 
Prof. Dr. Irene Bertschek 

Chair of the Department for Economics of Digitalisation


List of Contents 

Acknowledgments
List of Figures
List of Tables
List of Abbreviations
List of Appendices
1. General Introduction
2. Study 1 – Predicting Entrepreneurial Activity Using Machine Learning
2.1 Introduction
2.2 Data and methodology
2.2.1 Data, feature selection, and data pre-processing
2.2.2 Methodology
2.2.3 Classifier performance evaluation
2.3 Results
2.3.1 Comparison of classifier performance
2.3.2 Global feature importance
2.3.3 Additional analysis and robustness check
2.4 Discussion and conclusion
3. Study 2 – Digital Infrastructure and Entrepreneurial Action-Formation: A Multilevel Study
3.1 Introduction
3.2 Theoretical framework
3.2.1 Social cognitive theory
3.2.2 External enabler framework and venture creation
3.3 Mechanism-based theorizing: Hypotheses development
3.3.1 Action-formation mechanisms at the individual level and baseline hypotheses
3.3.2 EE mechanisms of digital infrastructure and moderation hypotheses
3.3.2.1 Moderation of the self-efficacy action-formation mechanism
3.3.2.2 Moderation of the fear of failure as an action-formation mechanism
3.3.2.3 Moderation of the opportunity recognition action-formation mechanism
3.4 Research methodology
3.4.1 Data
3.4.2 Variables and measures
3.4.2.1 Dependent variable
3.4.2.2 Key explanatory variables
3.4.2.3 Control variables
3.4.2.4 Empirical multilevel model
3.5 Results
3.5.1 Descriptive statistics
3.5.2 Multilevel logistic regression results
3.5.3 Additional analyses and robustness checks
3.6 Discussion
3.6.1 Contribution to the contextual entrepreneurship literature
3.6.2 Contribution to the external enabler framework
3.6.3 Policy implications
3.6.4 Limitations and directions for future research
4. Study 3 – Responding to the Situational Urgency of Digital Transformation: A Multilevel Analysis of CEO Humility and Corporate Venture Capital
4.1 Introduction
4.2 Theoretical background
4.2.1 The attention-based view from the CEO perspective
4.2.2 CEO humility and CVC investments as a CE action
4.2.3 Digital transformation as a context of situational urgency
4.3 Hypotheses development
4.3.1 CEO humility and CVC investments
4.3.2 The influence of external situational urgency for digital transformation
4.3.3 The influence of internal situational urgency for digital transformation
4.4 Method
4.4.1 Data and measures
4.4.2 Dependent variable
4.4.3 Independent variable
4.4.4 Moderator variables
4.4.5 Control variables
4.5 Analysis and results
4.5.1 Analysis
4.5.2 Results
4.5.3 Additional analyses and robustness checks
4.6 Discussion
4.6.1 Theoretical contributions
4.6.2 Practical implications
4.6.3 Limitations and future research
4.7 Conclusion
5. Concluding Remarks
References
Appendices
Affidavit


Acknowledgments 

In the following, I would like to take the opportunity to thank all the people who made this 

doctoral thesis possible. First, I would like to thank my supervisor Prof. Dr. Monika Schuhmacher. 

I am deeply grateful for the trust she has placed in me and my research at all times. Through her 

active integration of the Research Network Digitalization into the Chair of Technology, 

Innovation, and Start-up Management, she has provided a fruitful research environment. I am very 

grateful that her door was always open and that she brought her perspective, sense of structure, 

rigor, and questions to the discussions on each of the studies. Her constructive feedback had a 

tremendously positive impact on each of the three papers.  

I am also very grateful to Prof. Dr. Irene Bertschek for immediately agreeing to serve as the
second supervisor of my dissertation. Beyond that, and besides Prof. Dr. Monika Schuhmacher, I
would also like to thank all other members of the Research Network Digitalization, namely Prof.
Dr. Andreas Bausch, Prof. Dr. Christian Gissel, Prof. Dr. Georg Götz, Prof. Dr. Alexander Haas,
and Prof. Dr. Frank Walter. Without the Research Network Digitalization, this work would not
have been possible.

For various reasons, I am very fortunate to have, and deeply grateful to, my colleague, co-author,
and friend Dr. Petrit Ademi. I could not have imagined a person with whom I would have preferred
to share an office over the past years. Our conversations have greatly enriched my work. Among
many other things, I will always remember our intensive discussions about, and our subsequent
conciliatory attitude toward, the notion of affordances and external enabler mechanisms.

Great appreciation is also due to all my other companions and colleagues, specifically
Yannick Amend, Dr. Sadrac Cenophat, Dr. Anna-Lena Hanker, Björn Hofmann, Junior Prof. Dr.
Tobias Krämer, Victoria Kuharev, Julian Nickel, Dr. Stephan Philippi, Hieu Thieu, Alexandra
von Preuschen, Denis Weinecker, and Ferogh Schaich-Zaman, who have contributed to my
research through various colloquia, doctoral seminars, and brown-bag conversations. I also
acknowledge Carmen Wagner for taking care of countless administrative affairs, which resulted
in more space on my desk for research.

I would also like to address further words of thanks to the student assistants Christian Nicolay 

and Alicia Leona Schwalbach. Their support in teaching and research was a great help. 

Furthermore, I am grateful to Dr. Yannik Bofinger, Benjamin Fiorelli, Florian Gärtner, and 

Dr. Darwin Semmler from the Research Network Behavioral and Social Finance & Accounting. 

The dialogues with you on various topics related to research and teaching were always enriching. 

I would like to dedicate another acknowledgment to a person who may not be aware of his 

contribution to this doctoral thesis. This person is Prof. Dr. Henrik Egbert, for whom I had the 

opportunity to work as a tutor for microeconomics and economic policy during my bachelor 

studies and who had an important and lasting influence on my later academic pursuits. 

Finally, and most importantly, I would like to thank my entire beloved family for their patience, 

encouragement, and advice. I especially thank my mother Annette and my twin brother Mark. 

You have been the greatest of all supports in the most diverse areas throughout my entire 

educational path. Another person I would like to thank explicitly is Lara. Without you, this 

doctoral thesis and the journal publications would not have been possible. You have been a 

bulwark of support during my doctoral years and have kept me going in extremely tense times, 

even if it meant personal privations for you. This is anything but a matter of course. I owe you a 

great debt of gratitude. Thank you!


List of Figures 

Figure 1: Structure of the doctoral thesis
Figure 2: ROC curve comparison for OME
Figure 3: Feature importance rank for predicting OME
Figure 4: Theoretical model
Figure 5: Interaction plots – Entrepreneurial self-efficacy and digital infrastructure
Figure 6: Interaction plots – Fear of failure and digital infrastructure
Figure 7: Interaction plots – Opportunity recognition and digital infrastructure
Figure 8: Conceptual framework
Figure 9: Number of CVC investments as predicted by CEO humility and emerging digital competition – 3D surface plot

  
List of Tables 

Table 1: ML model performance on the OME “hold-out” dataset
Table 2: Correlation matrix for country and individual level variables
Table 3: Mixed-effects multilevel logistic regression model
Table 4: Descriptive statistics and correlation matrix for CEO-, firm- and industry-level variables
Table 5: Multilevel negative binomial regression
Table 6: Related vs. unrelated CVC investments – Multilevel analysis
Table 7: Firm value effect of CVC investments – Lagged fixed-effects panel analysis

 
List of Abbreviations 

AI  Artificial intelligence 

AIC  Akaike's information criterion 

ALE  Accumulated local effects 

ANN  Artificial neural network 

APS  Adult Population Survey 

AUC  Area under the receiver operating characteristic curve 

AutoML Automated machine learning  

BoW  Bag-of-words 

CATA  Computer-aided text analysis  

CE  Corporate entrepreneurship  

CEO  Chief executive officer  

CI  Confidence intervals  

CRE  Correlated random effects 

CTA  Company primary technology application  

CVC  Corporate venture capital  

DT  Decision tree 

EE  External enabler 

FE  Fixed effects 

FN  False negative 

FP  False positive 

GEM  Global Entrepreneurship Monitor 

GCV  Global Corporate Venturing  

ICC  Intra-class correlation coefficients  

ITU  International Telecommunication Union 

IML  Interpretable machine learning 

kNN  k-nearest neighbor  

LIME  Local interpretable model-agnostic explanation 

LIWC  Linguistic Inquiry and Word Count 

LPM  Linear probability model 

LR  Logistic regression  

LTS  Letter to shareholders  

MCC  Matthews correlation coefficient 

ML  Machine learning 

NB  Naïve Bayes 


NME  Necessity-motivated entrepreneurial activity 

OME   Opportunity-motivated entrepreneurial activity 

RE  Random effects 

RF  Random forest 

PE  Private equity  

R&D  Research and development  

ROC  Receiver operating characteristic curve 

SCT  Social cognitive theory 

SIC  Standard Industrial Classification  

SMOTE  Synthetic minority over-sampling technique 

TN  True negative 

TP  True positive 

VC  Venture capital  

VIF  Variance inflation factors  

VPC  Variance partitioning coefficients  

WVS  World Values Survey  

XAI  Explainable artificial intelligence 

XGBoost Extreme gradient boosting tree ensemble 

 
List of Appendices 

Appendix A: Feature definitions
Appendix B: Incidence of missing values
Appendix C: Patterns of missing values
Appendix D: Pearson cross-correlation heatmap
Appendix E: Confusion matrices
Appendix F: Error rates of the k-fold cross-validation
Appendix G: Default hyperparameter settings
Appendix H: ML model performance on the NME “hold-out” dataset
Appendix I: ROC curve comparison for NME
Appendix J: Feature importance rank for predicting NME
Appendix K: Data description and sources
Appendix L: Summary statistics
Appendix M: Observations and digital infrastructure per country
Appendix N: Additional analyses and robustness checks
Appendix O: Sensitivity test
Appendix P: CRE and LPM model for country-fixed effects
Appendix Q: Data description and source
Appendix R: Fixed-effects negative binomial panel regression
Appendix S: Random-effects negative binomial panel regression

 
Chapter 1  

1. General Introduction 

The overarching aim of this cumulative dissertation is to shed light on and advance our
understanding of the nexus between cognitive traits and entrepreneurial pursuits in the digital age.
To achieve this goal, we take a multi-layered perspective across three separate studies, considering
the individual, firm, industry, and country levels as units of analysis. A multi-layered perspective
(i.e., looking at the different hierarchical levels) is of particular importance because it enables the
elucidation of mechanisms that explain why effects or relationships within and between the levels
occur. Moreover, since the phenomenon of “digitalization” can be characterized by various aspects,
this doctoral thesis considers “digital” in different ways throughout the studies, viz. digital
technology, digital infrastructure, and digital transformation. Thus, this dissertation provides a
multi-layered perspective not only in terms of the hierarchical levels considered (i.e., depth) but
also regarding the breadth of domains that digitalization comprises. The structure of the
dissertation is presented in Figure 1.

In Study 1 (see Chapter 2), we use a digital technology, namely machine learning (ML), a subfield
of artificial intelligence (AI) that lies at the very center of the so-called digital era. Specifically,
we apply ML algorithms and pursue a data-driven approach to unravel important agent-centric
features for entrepreneurial pursuits. In this way, ML enables us to investigate to what extent
entrepreneurial activity is predictable and, more importantly, which features best explain the
prediction. For this purpose, we apply various supervised ML techniques (decision tree, random
forest, extreme gradient boosting tree ensemble, k-nearest neighbor, artificial neural network, and
naïve Bayes) and also run a classical multiple logistic regression on the most comprehensive
existing dataset in the field of entrepreneurship. This approach enables us to engage in abductive
reasoning and estimate the relative performance of each ML algorithm in predicting
entrepreneurial activity. The benchmarking of the ML techniques reveals that the extreme gradient
boosting tree ensemble is the best-performing ML technique in predicting both necessity- and
opportunity-motivated entrepreneurial activity, with the highest overall accuracy and area under
the receiver operating characteristic curve. The feature importance rankings suggest that, in
addition to psychological self-regulation mechanisms such as cognitions and personal traits (i.e.,
socio-cognitive traits), macro country-level factors external to the respective individual at the
micro level may also play a pivotal role in the engagement in entrepreneurial pursuits.
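The two benchmarking criteria named above, overall accuracy and the area under the ROC curve (AUC), together with the Matthews correlation coefficient (MCC) from the List of Abbreviations, can be illustrated with a minimal sketch. The labels and scores below are invented toy values, not GEM data, and the helper names (`confusion_counts`, `accuracy`, `mcc`, `auc`) are hypothetical; the sketch shows only how such metrics are commonly computed, not the evaluation code used in Study 1.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of correctly classified observations."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if a marginal total is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy hold-out sample (assumption): 1 = entrepreneurially active, 0 = not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.3, 0.55, 0.8, 0.1]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]       # 0.5 threshold (assumption)

counts = confusion_counts(y_true, y_pred)
print("TP, TN, FP, FN:", counts)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the chosen classification threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both when comparing classifiers.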

In Study 2 (see Chapter 3), we engage in a hypothetico-deductive reasoning approach.
Specifically, this study examines how the level of digital infrastructure of a country shapes the
relationships between socio-cognitive traits and entrepreneurial action, the so-called
action-formation mechanisms. For our hypothetico-deductive approach, we combine the agent-centric
social cognitive theory (SCT) (Bandura, 1986; Sherman et al., 2015; Wood & Bandura, 1989)
with the external enabler (EE) framework (Davidsson et al., 2020). Given that SCT is rather
coarse-grained and lacks a theoretical explanation of how contextual factors shape entrepreneurial
action-formation at the micro level, we augment SCT with the EE framework and engage in EE
mechanism-based theorizing, which allows us to reason about how a specific contextual factor,
understood as an EE, operates through specific (situational) mechanisms. To investigate the triadic
reciprocal relationship system between action-formation at the micro level and digital
infrastructure at the macro level, we apply multilevel modeling. In line with SCT, our analysis
shows that the socio-cognitive traits of entrepreneurial self-efficacy and opportunity recognition
increase entrepreneurial action, while fear of failure reduces it. The results further indicate that a
country’s level of digital infrastructure is an EE that shapes the relationships between
socio-cognitive traits and entrepreneurial action. Consistent with our theorizing derived from the EE
framework, the findings suggest that, in particular, the resource access and market access
mechanisms of digital infrastructure explain the moderating effects.
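As an illustration of the moderation logic described above, a generic two-level random-intercept logistic model with a cross-level interaction can be written as follows. This is a textbook form under stated assumptions, not the specification estimated in Study 2: the symbols are hypothetical, with $y_{ij}$ denoting entrepreneurial action of individual $i$ in country $j$, $x_{ij}$ a socio-cognitive trait, $z_j$ the country's level of digital infrastructure, and $u_j$ a country-level random intercept. The second line gives the variance partitioning coefficient under the common latent-variable formulation.

```latex
\begin{align}
\operatorname{logit}\,\Pr(y_{ij} = 1)
  &= \beta_0 + \beta_1 x_{ij} + \beta_2 z_j
     + \beta_3\,(x_{ij} \times z_j) + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \\
\mathrm{VPC} &= \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}.
\end{align}
```

In this sketch, $\beta_1$ captures the individual-level action-formation relationship, and $\beta_3$ captures how that relationship varies with digital infrastructure, i.e., the moderating effect.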


Figure 1: Structure of the doctoral thesis

Chapter 1: General Introduction

Chapter 2: Study 1
Title and co-authors: Predicting Entrepreneurial Activity Using Machine Learning (with Monika C. Schuhmacher)
Research question: To what extent is entrepreneurial activity predictable and what are the most important features?
Theory: Atheoretical
Method: Abductive analysis of various machine learning algorithms based on 1,192,818 observations from 99 countries
Status: Published in Journal of Business Venturing Insights (ABDC: A)
Level: Individual (micro)

Chapter 3: Study 2
Title and co-authors: Digital Infrastructure and Entrepreneurial Action-Formation: A Multilevel Study (with Monika C. Schuhmacher)
Research question: How does a country’s digital infrastructure shape the entrepreneurial action-formation of individuals?
Theory: Social cognitive theory (Wood & Bandura, 1989) and external enabler framework (Davidsson et al., 2020)
Method: Logistic multilevel modeling based on 344,265 individual-level observations from 46 countries
Status: Published in Journal of Business Venturing (FT50, ABS: 4, ABDC: A*, VHB-Jourqual: A)
Level: Individual (micro) and country (macro)

Chapter 4: Study 3
Title and co-authors: Responding to the Situational Urgency of Digital Transformation: A Multilevel Analysis of CEO Humility and Corporate Venture Capital (with Petrit Ademi and Monika C. Schuhmacher)
Research question: How do humble CEOs influence CVC investment activity in the context of urgency for digital transformation?
Theory: Attention-based view (Ocasio, 1997)
Method: Longitudinal study based on 373 CEOs from 198 firms and 35 industries between 2010 and 2019 (6,908 CVC investments over 1,597 firm-years)
Status: Revise & Resubmit in Journal of Management Studies (FT50, ABS: 4, ABDC: A*, VHB-Jourqual: A)
Level: Individual (micro), firm (meso), and industry (macro)

Chapter 5: Concluding Remarks

In Study 3 (see Chapter 4), we examine whether the findings from Studies 1 and 2 are
transferable to the corporate context. The underlying assumption is that incumbent firms are
increasingly fostering corporate entrepreneurial (CE) actions as a means of addressing challenges
arising from the ongoing digital transformation of society and business. Existing research in the
management and entrepreneurship literature has considerably advanced our knowledge of firm-
and industry-specific drivers of corporate venture capital (CVC) investment activity. However,
the influencing role of CEOs is largely unknown. In digital times, the purposeful instigation of
CE actions requires the CEO, as the top decision-maker and head of the organization, to not
only uphold the company’s existing strengths but also to identify weaknesses, obtain accurate
self-knowledge, and promote self-improvement through continual learning (Ou et al., 2018). An
auspicious basis for such contemporary executive leadership that has gained momentum in the
literature is a person-centered cognitive characteristic known as “humility”. In the study, we
distinguish between two forms of urgency for digital transformation: (i) emerging digital
competition at the external industry level, and (ii) business model dependence on information and
knowledge at the internal firm level. By reasoning upon the situational mechanism emanating from
the internal and external urgency for digital transformation, we aim to provide an understanding
of CVC investments as a CE action that humble CEOs at the individual level foster in the digital
era. Through the application of a bag-of-words (BoW) approach for text analysis and multilevel
modeling, we provide evidence for CEO humility as an important, yet overlooked,
action-formation mechanism for CVC investment activity. While the study finds support for the
moderating role of emerging digital competition (i.e., external urgency), the findings suggest that
internal urgency for digital transformation originating from the firm’s business model dependence
on information and knowledge positively moderates the action-formation mechanism of CEO
humility primarily for CVC investment activity in related ventures.

 
Chapter 2  

 
2. Study 1 – Predicting Entrepreneurial Activity Using Machine Learning 

 
Coauthors:  

Monika C. Schuhmacher 

 
Relative share:  

90% 

 
Status:  

Published in Journal of Business Venturing Insights (ABDC: A) 

 
This chapter is available under:    

Schade, P. & Schuhmacher, M.C. (2023). Predicting entrepreneurial activity using machine 

learning. Journal of Business Venturing Insights, 19, e00357, 

https://doi.org/10.1016/j.jbvi.2022.e00357  

 
A previous version of this chapter has been presented at:  

• Australian Center for Entrepreneurship Research Exchange (ACERE) Conference 2023, 

Brisbane, Australia 

• Doctoral Consortium of the Australian Center for Entrepreneurship Research Exchange 

(ACERE) Conference 2022, Melbourne/Virtual Edition, Australia 



Predicting Entrepreneurial Activity Using Machine Learning 

 
Abstract 

This study evaluates the predictability of entrepreneurial activity using machine learning. We
compare different supervised machine learning techniques (decision tree, random forest, artificial
neural network, k-nearest neighbor, extreme gradient boosting tree ensemble, and naïve Bayes)
and run a traditional multiple logistic regression as a baseline to estimate their relative
prediction performance on a Global Entrepreneurship Monitor dataset of 1,192,818 individuals
from 99 countries. By comparing different machine learning techniques, we predict out-of-sample
opportunity-motivated entrepreneurial activity with an overall accuracy ranging from 70.1% to
91.2%. The results demonstrate that the extreme gradient boosting tree ensemble is superior in
predicting opportunity-motivated entrepreneurial activity. Finally, a global surrogate model
reveals that knowing an entrepreneur, entrepreneurial self-efficacy, and opportunity recognition
are the three most important features for predicting opportunity-motivated entrepreneurial
activity. For comparison purposes, we perform the same analyses for necessity-motivated
entrepreneurial activity. The results reveal that the extreme gradient boosting tree ensemble is
also the best-performing technique in predicting this form of entrepreneurial activity, with an
accuracy of 96.5%.

JEL classification: C45, C53, C55, D91, L26 

Keywords: Supervised machine learning, classification, prediction, entrepreneurial activity 

  
2.1 Introduction  

Over the last few decades, scholars have attempted to unravel the focal phenomenon of
entrepreneurial activity from different perspectives and found that disentangling the
entrepreneurial event is extremely complex. Against this background, entrepreneurship scholars
have advanced, in leading journals, an ignoramus et ignorabimus-like thesis: that it is highly
doubtful whether research will ever be able to construct a mathematical model that can be used to
predict the occurrence of the entrepreneurial event (e.g., Bruyat & Julien, 2001; Churchill &
Bygrave, 1990). These scholars argue that if “we want to understand entrepreneurship, our
research methodology must be able to handle nonlinear, unstable discontinuities” (Churchill &
Bygrave, 1990, p. 28).

However, the advancements in the area of artificial intelligence (AI) and machine learning 

(ML) in recent years have provided researchers with new methodological potentials for 

constructing models for predicting various human behaviors and offering fine-grained insights 

into the actual predictability of entrepreneurial events. ML enjoys great popularity 

in the business world, where it is used for highly complex prediction tasks; supervised 

machine learning techniques in particular were designed for this purpose (Obschonka & 

Audretsch, 2020). The main benefit of ML is its high predictive accuracy, a property that is crucial 

in many areas of business (van Witteloostuijn & Kolkman, 2019). However, although the 

disruptive potentials of AI and ML in analyzing (big) data with a large number of observations or 

high dimensionality have received increasing attention in a variety of research and application 

fields, they have not undergone much scrutiny in contemporary entrepreneurship research yet 

(Hastie et al., 2009; Obschonka & Audretsch, 2020; Schwab & Zhang, 2019; Shepherd & 

Majchrzak, 2022; van Witteloostuijn & Kolkman, 2019).  

This fact is surprising because AI-based ML provides mathematical approaches that can be 

applied to mine and analyze the most comprehensive datasets such as the Global Entrepreneurship 

Monitor (GEM) (Gerasimovic & Bugaric, 2018; Lévesque et al., 2022) and, consequently, 



 
challenge long-held assumptions about the predictability of entrepreneurial activity. In particular, 

ML offers algorithmic approaches that “learn” incrementally from the inferred data to make 

predictions by accommodating complex and high-order interactions. This way, ML techniques 

can select the functional form that best predicts the target outcome, whereas in classic statistical 

methods the functional form must be specified a priori (Arin et al., 2022). Moreover, ML 

techniques allow for the investigation of the “nuts and bolts, cogs and wheels” (Elster, 1989, p. 3), 

i.e., mechanisms, that lead to entrepreneurial activity. Understanding these mechanisms resulting 

from entrepreneurial activity-related features is critical (Cowen et al., 2022; Hedström & 

Swedberg, 1998), as entrepreneurial activity plays an important role in the economic growth and 

prosperity of a nation (e.g., Schumpeter, 1934; Wennekers & Thurik, 1999). Therefore, the better 

scientists and policymakers understand the relevant features and underlying mechanisms of 

entrepreneurial activity, the better they can leverage these aspects through specific actions. 

However, this understanding calls for research into effectively predicting entrepreneurial activity. 

To answer the long-standing research question of whether and to what extent entrepreneurial 

activity can be predicted, we apply and compare multiple state-of-the-art supervised machine 

and deep learning1 techniques to a GEM dataset of 1,192,818 individuals from 99 countries. As 

no large-scale investigation using ML has been conducted to date, the aim of this research is 

primarily to evaluate whether ML improves the accuracy of entrepreneurial activity prediction. 

Since causal inferences are not possible due to the cross-sectional nature of the individual-level 

GEM data, we predominantly seek to determine which supervised ML technique is superior in 

terms of predictive accuracy and identify which features are the most relevant in predicting 

entrepreneurial activity. Typically, entrepreneurship research classifies entrepreneurial 

activity as either necessity-motivated entrepreneurial activity (NME) or opportunity-motivated 

entrepreneurial activity (OME) (Amorós et al., 2019). According to this push/pull 

 
1 Since deep learning techniques, such as artificial neural networks, are a special case of ML, we use the term 

“machine learning” throughout the paper for greater clarity.  



 
framework (Storey, 2016), NME is linked to unemployment and economic recession (e.g., 

Amorós et al., 2019; Shane, 2009). Thus, entrepreneurship research is mostly interested in OME, 

wherein individuals start a new business venture in pursuit of profit, innovation, and growth (see 

e.g., Reynolds et al., 2005; Stenholm et al., 2013). Hence, we apply different ML techniques 

to OME. 

2.2 Data and methodology 

2.2.1 Data, feature selection, and data pre-processing 

This study uses six years of data from the Adult Population Survey by the GEM initiative. 

We built a comprehensive cross-sectional sample of 1,192,818 individuals from 99 countries by 

pooling the individual-level GEM data from 2012 to 2017, with different individuals being 

observed in each year (Verbeek, 2008). As the GEM data provides labeled observations, the 

dataset is suitable for conducting supervised analyses. The GEM initiative estimates the 

prevalence rate of entrepreneurial activity across the participating countries (Reynolds et al., 

2005). The GEM data have been used in various previous studies examining entrepreneurial 

activity (e.g., Aidis et al., 2008; Fredström et al., 2020).  

As we aim to predict OME, we relied on the individuals from the GEM project who provided 

their assessment of whether they engage in entrepreneurial activity to take advantage of a business 

opportunity (TEAyyOPP). The GEM specifies OME as a binary feature (1 = yes, 0 = no). To 

ensure that OME is not simply predicted by the inherently non-orthogonal manifestations of this 

feature (e.g., total early-stage entrepreneurial activity, total early-stage entrepreneurial activity 

based on new technology, self-employment, etc.), we have removed these manifestations from 

the GEM datasets.2  

 
2 For example, if an individual engages in OME (TEAyyOPP), the same person also tends to pursue total early-stage 

entrepreneurial activity (TEAyy). The ML techniques would, therefore, use TEAyy as the most important feature in 

predicting TEAyyOPP. 



 
We selected and included all agent-centric features from the unaggregated Adult Population 

Survey that were reported by the GEM. Specifically, we included several human capital 

endowment features, such as educational and occupational attainments (see e.g., Davidsson & 

Honig, 2003); socio-demographic characteristics, such as gender, age, household size (hhsize); 

and self-regulatory mechanisms such as cognitions and personal traits—entrepreneurial self-

efficacy (suskill), fear of failure (fearfail), and opportunity recognition (opport) (e.g., Baron, 

2004; Mitchell et al., 2002; Shaver & Scott, 1992). Overall, we included a total of 21 explanatory 

features. These features are queried in the GEM datasets, as entrepreneurship research has shown 

that these features are significantly associated with entrepreneurial activity3. Feature definitions 

are presented in Appendix A. From the initial dataset, observations with missing values and 

string-valued responses (i.e., open-ended survey questions) were excluded from the analyses. 

Additional information on the incidence and patterns of missing data is provided in Appendices B 

and C. To account for potential survey effects and unobserved temporal heterogeneity, we also 

included respondent identifiers (setid), country identifiers (country), year of survey (yrsurv), and 

the developmental stage of countries (CAT_GCR1) in the analyses.  

2.2.2 Methodology  

To predict OME, we utilized various supervised ML techniques. Since the target class, OME, 

is binary (i.e., unordered discrete response), we applied the most commonly used ML techniques 

suitable for solving classification problems. Specifically, we applied the following ML 

techniques: decision tree (DT), random forest (RF), deep artificial neural network (ANN) in the 

form of a feedforward multilayer perceptron, k-nearest neighbor (kNN), extreme gradient 

boosting tree ensemble (XGBoost), and naïve Bayes (NB).4 For comparison, we also ran the 

traditional multiple logistic regression (LR) as a baseline-benchmark model. All ML techniques 

 
3 Features queried as special topics in individual GEM rounds are not taken into consideration. 
4 For the sake of brevity, we do not describe the ML methods used in greater detail. For a more comprehensive 

overview of the individual methods of statistical learning, the reader can refer to Hastie et al. (2009).  



 
were used with their default hyperparameter settings.5 The use of default hyperparameter settings 

ensures, ceteris paribus, direct comparability of different ML techniques without additional human 

intervention. Hyperparameter settings for the applied ML techniques are reported in Appendix G. 

To compare these ML techniques, we split the GEM dataset into a “training & validation” sample 

and an unseen “hold-out” test sample (Mullainathan & Spiess, 2017). For training and validating 

the ML models, we used 70% of the sample. For evaluating the final predictive out-of-sample 

performance of the ML techniques, we used the remaining 30% (see Choudhury et al., 2021).  
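The 70/30 partition described above can be illustrated with a minimal sketch. It assumes a plain random split; whether the original split was stratified by class is not stated in the text, and the function name, the seed, and the toy sample size are hypothetical.

```python
import random

def train_test_split_indices(n_rows, test_share=0.30, seed=42):
    """Shuffle row indices and split them into training/validation and hold-out parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_share)
    # The first n_test shuffled indices form the hold-out sample, the rest the training sample.
    return indices[n_test:], indices[:n_test]

# Toy sample of 1,000 rows (hypothetical size, not the GEM sample)
train_idx, holdout_idx = train_test_split_indices(1000)
print(len(train_idx), len(holdout_idx))
```

The hold-out indices are set aside before any resampling or scaling, so that later pre-processing steps can be fitted on the training part only.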

In a balanced dataset, the probabilities of engaging and not engaging in OME are equal. 

However, since entrepreneurial activity is a rather rare event, the input data is unbalanced. In other 

words, there are fewer observations for engaging in OME (i.e., 1 = yes) than for not engaging in 

OME (i.e., 0 = no). Such unbalanced data lead to the issue of ML techniques being dominated by 

the majority class. To address this issue, we resampled the dataset. Specifically, we performed the 

synthetic minority over-sampling technique (SMOTE) suggested by Chawla et al. (2002), where 

data in the minority class is generated through over-sampling. This minority over-sampling was 

achieved by creating synthetic rows in the dataset by interpolating between a real object of a given 

class and one of its nearest neighbors. Thus, the SMOTE increased the number of minority class 

observations, thereby improving the generalizability of the ML techniques (Chawla et al., 2002; 

Fernandez et al., 2018).  
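The core step of SMOTE can be sketched in a few lines. This is a simplified illustration of the idea in Chawla et al. (2002), not the implementation used in the study: each synthetic row is placed at a random position on the line segment between a minority-class row and one of its k nearest minority-class neighbors. The function name, the toy rows, and the parameter values are hypothetical.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority rows by interpolating towards a near minority neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base row (the row itself is excluded)
        neighbors = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position on the segment between base and neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

# Hypothetical minority-class rows with two numeric features
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote(minority_rows, n_synthetic=6, k=2)
```

Because every synthetic row lies between two real minority rows, it stays inside the range spanned by the observed minority class. In line with footnote 6, such resampling would be applied to the training sample only.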

Further, to ensure the reasonable predictive performance of different models in out-of-sample 

prediction, we employed the k-fold cross-validation technique (Geisser, 1975; Stone, 1974). 

Therewith, the training data were split randomly into k approximately equal-sized subsets of data. 

These k subsets were used separately as validation data, i.e., a pseudo-hold-out sample, for 

assessing the predictive ability of the ML model, whereas the other 𝑘 − 1 subsets were used to 

train the ML model. Moreover, k-fold cross-validation provides a means of addressing the (over)fitting 

 
5 We adjusted the default setting of parameters in case the default values prevented the ML technique from technical 

functioning (e.g., the DT with the minimum number of records per node and the number of threads).  



 
conundrum, as cross-validation makes model prediction performance less sensitive to 

idiosyncrasies in any of the k subsets (Choudhury et al., 2021; Shao, 1993). For our calculation, 

we used 10-fold cross-validation (k = 10), which is a common choice for k (Choudhury et al., 

2021; Kohavi, 1995). Besides cross-validation, we followed the recommendation of Choudhury 

et al. (2021) and normalized the scale of features to obtain a unit variance of each feature after 

splitting the data into the “training & validation” sample and “hold-out” partitions by building z-

scores.6 With such feature scaling, we ensure that features with greater magnitude do not outweigh 

features with smaller magnitudes when they are weighted by an ML technique. This can be 

especially crucial when ML techniques backpropagate information to update weights, such as 

in ANNs. 
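A minimal sketch of k-fold cross-validation combined with leak-free z-score scaling is given here for illustration. It assumes that the mean and standard deviation are estimated on the training folds and then applied to the validation fold; the text only states that normalization took place after the train/hold-out split, so scaling per fold is an assumption of this sketch. All names and the toy feature column are hypothetical.

```python
import random
import statistics

def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, validation) index lists for k roughly equal-sized folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

def fit_z_score(values):
    """Estimate mean and (population) standard deviation on the fitting sample only."""
    return statistics.fmean(values), statistics.pstdev(values)

def apply_z_score(values, mean, std):
    """Scale values with statistics that were estimated elsewhere."""
    return [(v - mean) / std for v in values]

feature = [float(v) for v in range(1, 101)]  # hypothetical feature column
for train, validation in k_fold_indices(len(feature), k=10):
    mean, std = fit_z_score([feature[i] for i in train])
    train_scaled = apply_z_score([feature[i] for i in train], mean, std)
    # The validation fold reuses the training statistics, so no information leaks.
    validation_scaled = apply_z_score([feature[i] for i in validation], mean, std)
```

Each observation serves exactly once as validation data and k − 1 times as training data, which is what makes the averaged validation score less sensitive to any single subset.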

2.2.3 Classifier performance evaluation  

To evaluate the performance of the ML techniques, i.e., classifiers, we rely on different 

prominent classification performance scores that are calculated based on the confusion matrix 

(a Pearson cross-correlation heatmap for all features used in the main analyses is provided in 

Appendix D). Specifically, we capture true positive (TP), false positive (FP), true negative (TN), and 

false negative (FN) counts. In this notation, TPs are positive instances that the classifier correctly 

predicted as positive; FPs are negative instances that the classifier incorrectly predicted as 

positive; TNs are negative instances that the classifier correctly predicted as 

negative; and FNs are positive instances that the classifier incorrectly predicted as negative. 

Therewith, this 2 × 2 confusion matrix is useful in understanding the balance between FNs and 

FPs predicted by a specific classifier (Choudhury et al., 2021).  
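For illustration, the four cells of the 2 × 2 confusion matrix can be counted as follows. The labels are toy values, not GEM data, and the function name is hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, TN, and FN for binary labels coded 1 (positive) and 0 (negative)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["TN" if actual == 0 else "FN"] += 1
    return counts

# Toy labels: 1 = engages in OME, 0 = does not
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
cm = confusion_matrix(y_true, y_pred)
print(cm)
```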

Based on these metrics, we rely on the most widely used classifier performance indices—

precision, recall/sensitivity, F1-score, and accuracy—to evaluate the out-of-sample performance 

and compare the overall model prediction performance (Bergstra & Bengio, 2012; Choudhury et 

 
6 To prevent information leakage, z-score normalization and SMOTE were performed after splitting the sample. 



 
al., 2021; Mullainathan & Spiess, 2017). Moreover, we use the receiver operating characteristic 

(ROC) and investigate the area under the curve (AUC) estimates in order to evaluate and compare 

the overall performance of the group of classifiers. The ROC plots the TP rate of the confusion 

matrix against the FP rate. The AUC metric reflects how well an ML technique discriminates between 

the two classes and can be used to compare models. In addition, based on recent publications, we report 

the Matthews correlation coefficient (MCC) as an additional informative and reliable statistical 

score for evaluating binary classification tasks (Boughorbel et al., 2017; Chicco & Jurman, 2020).  
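The performance indices named above follow directly from the confusion matrix counts. The sketch below uses the standard textbook definitions and toy inputs; it assumes non-zero denominators and computes the AUC through its rank interpretation (the share of positive/negative pairs in which the positive instance receives the higher score, ties counting one half). Function names and inputs are hypothetical.

```python
import math

def classification_scores(tp, fp, tn, fn):
    """Precision, recall, F1-score, accuracy, and MCC from confusion matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}

def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy inputs (hypothetical counts and predicted scores)
toy_scores = classification_scores(tp=2, fp=1, tn=4, fn=1)
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.2])
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix symmetrically, which is why it is often recommended for unbalanced classification tasks.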

2.3 Results  

2.3.1 Comparison of classifier performance  

Appendix E (Panel A) presents the 2 × 2 confusion matrices for all fitted ML models. Based 

on the confusion matrices, the overall classification performance of the respective supervised ML 

techniques in predicting OME is depicted in Table 1. Table 1 summarizes the out-of-sample 

performance results achieved by each ML technique with regard to the different performance 

scores described in section 2.2.3. The results in this table reveal that the RF model obtained a 

maximum overall accuracy of 91.2% in predicting OME, which is followed closely by the 

XGBoost model with an accuracy of 91.1%. In comparison, the kNN and the NB classifiers 

underperformed with a total accuracy of 80.0% and 70.1%, respectively. If we compare the 

different ML techniques with respect to the AUC, a marginally different picture emerges—the 

XGBoost model with an AUC value of 0.850 is superior to the other classifiers. A comparison of 

the ROC curves across all ML techniques with the corresponding AUCs is shown in Figure 2 

(AUC estimates are in parentheses). When considering the MCC score, the kNN model performs 

the best in predicting OME with a value of 0.289. 

A comparative look at the different performance scores reveals that employing ML techniques 

for predicting OME based on the comprehensive GEM data outperforms the baseline, multiple 

LR model, which shows the second lowest prediction accuracy (72.7%) and AUC (0.799) 



 
estimates. The NB technique performs the worst, with a prediction accuracy of 70.1% and an 

AUC estimate of 0.782.  

Furthermore, if we explicitly compare the difference between the best-performing ML model 

(i.e., XGBoost with an AUC of 0.850, accuracy = 0.911, and precision = 0.506) and the baseline 

LR (AUC = 0.799, accuracy = 0.727, and precision = 0.207), we can state that LR performs 18.4 

percentage points ((0.911 − 0.727) × 100 = 18.4) worse at accurately predicting OME. Looking at 

the precision score as an estimate of how reliably a model predicts TP instances (i.e., class = 1) 

of OME, that is, the proportion of predicted positive instances that actually belong to the positive 

class, we also see that only 20.7% of the individuals whom LR predicts to be opportunity-motivated 

entrepreneurs actually are. In comparison, the XGBoost model is 29.9 percentage points 

((0.506 − 0.207) × 100 = 29.9) better than LR in this respect; thus, roughly half of its positive OME 

predictions are correct. Given that LR primarily predicts non-OMEs precisely (class = 0) but fails in 

predicting positive data instances, the MCC for LR produces an overly optimistic, inflated score of 0.280. 
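The percentage-point comparisons in this paragraph can be reproduced with a short calculation. The accuracy and precision values are those reported in the text; the dictionary layout and the function name are hypothetical.

```python
# Accuracy and class-1 precision as reported in the text
reported = {
    "XGBoost": {"accuracy": 0.911, "precision": 0.506},
    "LR": {"accuracy": 0.727, "precision": 0.207},
}

def percentage_point_gap(metric, better="XGBoost", worse="LR"):
    """Difference between two proportions, expressed in percentage points."""
    return round((reported[better][metric] - reported[worse][metric]) * 100, 1)

accuracy_gap = percentage_point_gap("accuracy")
precision_gap = percentage_point_gap("precision")
print(accuracy_gap, precision_gap)
```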

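Table 1: ML model performance on the OME “hold-out” dataset

| ML technique | Class | Recall | Precision | F1-score | Accuracy | AUC | MCC |
|---|---|---|---|---|---|---|---|
| DT | 0 | 0.969 | 0.922 | 0.945 | - | - | - |
| DT | 1 | 0.164 | 0.343 | 0.222 | - | - | - |
| DT | Total | - | - | - | 0.898 | 0.814 | 0.188 |
| RF | 0 | 0.993 | 0.917 | 0.954 | - | - | - |
| RF | 1 | 0.083 | 0.530 | 0.144 | - | - | - |
| RF | Total | - | - | - | 0.912 | 0.826 | 0.184 |
| ANN | 0 | 0.920 | 0.939 | 0.930 | - | - | - |
| ANN | 1 | 0.388 | 0.322 | 0.352 | - | - | - |
| ANN | Total | - | - | - | 0.873 | 0.816 | 0.284 |
| kNN | 0 | 0.819 | 0.955 | 0.882 | - | - | - |
| kNN | 1 | 0.601 | 0.245 | 0.348 | - | - | - |
| kNN | Total | - | - | - | 0.800 | 0.803 | 0.289 |
| XGBoost | 0 | 0.991 | 0.918 | 0.953 | - | - | - |
| XGBoost | 1 | 0.095 | 0.506 | 0.160 | - | - | - |
| XGBoost | Total | - | - | - | 0.911 | 0.850 | 0.191 |
| NB | 0 | 0.698 | 0.964 | 0.810 | - | - | - |
| NB | 1 | 0.734 | 0.191 | 0.304 | - | - | - |
| NB | Total | - | - | - | 0.701 | 0.782 | 0.259 |
| LR - baseline | 0 | 0.727 | 0.965 | 0.829 | - | - | - |
| LR - baseline | 1 | 0.729 | 0.207 | 0.322 | - | - | - |
| LR - baseline | Total | - | - | - | 0.727 | 0.799 | 0.280 |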
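As a reading aid, two identities link the columns of Table 1. The F1-score is the harmonic mean of precision and recall, and overall accuracy is the class-share-weighted average of the two class-wise recalls, so that accuracy = recall_0 × (1 − p) + recall_1 × p, where p is the share of positive cases in the hold-out sample. The sketch below applies both identities to three rows of the table. Because the inputs are rounded to three decimals, the implied share p is only approximate; the helper names are hypothetical.

```python
# Selected values from Table 1 (class-wise recall, class-1 precision and F1, accuracy)
table_1 = {
    "DT": {"recall_0": 0.969, "recall_1": 0.164, "precision_1": 0.343,
           "f1_1": 0.222, "accuracy": 0.898},
    "RF": {"recall_0": 0.993, "recall_1": 0.083, "precision_1": 0.530,
           "f1_1": 0.144, "accuracy": 0.912},
    "XGBoost": {"recall_0": 0.991, "recall_1": 0.095, "precision_1": 0.506,
                "f1_1": 0.160, "accuracy": 0.911},
}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def implied_positive_share(recall_0, recall_1, accuracy):
    """Solve accuracy = recall_0 * (1 - p) + recall_1 * p for the positive share p."""
    return (recall_0 - accuracy) / (recall_0 - recall_1)

for name, row in table_1.items():
    f1 = f1_from(row["precision_1"], row["recall_1"])
    share = implied_positive_share(row["recall_0"], row["recall_1"], row["accuracy"])
    print(f"{name}: F1 = {f1:.3f}, implied positive share = {share:.3f}")
```

For these three rows, the recomputed F1-scores agree with the reported values to rounding precision, and the implied share of positive cases is roughly 9%.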

However, if we consider the most precise ML models (i.e., XGBoost and RF), it must be noted 

that even these ML techniques are only slightly better than a random draw at correctly predicting TPs 



 
of OME. The estimates of prediction error rates for each ML technique in the respective fold of 

the cross-validation can be inferred from Appendix F. 

Figure 2: ROC curve comparison for OME 

 
2.3.2 Global feature importance  

To obtain a more detailed understanding of individual features in predicting entrepreneurial 

activity and enhance the interpretability of the outputs, we report on the global model-agnostic 

feature importance (Molnar et al., 2020). For this, we rank the importance of explanatory input 

features of the best-performing supervised ML technique in order to show how important each 

feature is on average in predicting OME. To estimate feature importance, we employ a global 

surrogate model method. 

A surrogate model is a model that is trained to mimic the behavior of the original model 

by finding an approximation function (Crombecq et al., 2011). In other words, the surrogate model 

approximates the predictions of the original model and, thus, can be used to understand how the 

different input features are related to the final prediction (Crombecq et al., 2011; Gorissen et al., 



 
2009). Specifically, as the RF model performs best at correctly predicting positive OME instances (see 

precision score), we fit a surrogate RF model. Global feature importance is determined by 

counting how often a specific feature was selected for a split in the DTs and identifying the rank 

of a feature among all other available explanatory input features in the RF model trees. Figure 3 

reports the global feature importance rank for predicting OME.7  
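The split-counting part of this importance measure can be sketched as follows. The sketch assumes that each tree is available as a nested structure of split nodes and leaves; this representation, the function names, and the toy forest are hypothetical, and the sketch ignores the additional ranking of features among the split candidates that the text mentions. Only the feature names (knowent, suskill, opport) are taken from the GEM data.

```python
from collections import Counter

def count_splits(node, counts):
    """Recursively count how often each feature is used for a split in one tree."""
    if not isinstance(node, dict):  # leaf: a predicted class label
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)

def split_count_importance(forest):
    """Rank features by the number of splits they account for across all trees."""
    counts = Counter()
    for tree in forest:
        count_splits(tree, counts)
    return counts.most_common()

# Hand-written toy forest (not fitted to any data)
toy_forest = [
    {"feature": "knowent", "left": 0,
     "right": {"feature": "suskill", "left": 0, "right": 1}},
    {"feature": "suskill", "left": 0,
     "right": {"feature": "knowent", "left": 0, "right": 1}},
    {"feature": "knowent",
     "left": {"feature": "opport", "left": 0, "right": 1}, "right": 1},
]
ranking = split_count_importance(toy_forest)
print(ranking)
```

Split counts yield a relative ranking only; they do not indicate the direction or size of a feature's effect on the prediction.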

Figure 3: Feature importance rank for predicting OME 

 
This figure suggests that knowing an entrepreneur (knowent), entrepreneurial self-efficacy 

(suskill), and opportunity recognition (opport) are on average the three most important agent-

centric features for predicting entrepreneurial activity. Furthermore, the rank of a country’s 

developmental stage (CAT_GCR1) also indicates that a nation’s business environment in which 

an individual is located plays a pivotal role in entrepreneurial activity prediction. On the other end 

of the continuum, we see that features such as gender and public media coverage of successful 

entrepreneurs (nbmedia) both play only a minor role in OME prediction. 

 
7 Features with higher values are more important for predicting the target feature (i.e., they have a larger effect on the 

model). The feature importance scale is relative.  



 
2.3.3 Additional analysis and robustness check 

To validate the predictive performance of the applied supervised ML techniques in predicting 

OME, we rerun all of the ML models using an automated ML technique (AutoML). Although we 

previously used the default parameter settings to better compare the ML techniques, we now 

allow for hyperparameter optimization within the AutoML. Hyperparameters are a set of 

adjustable parameters that are unique to each ML technique. These parameters are usually assigned and 

tuned manually by the researcher to prevent overfitting or underfitting (e.g., stopping rules for 

rule-based DTs or the choice of a regularization term added to the loss function to penalize 

growing model complexity) and to further improve the out-of-sample predictive performance of 

the ML techniques (Choudhury et al., 2021; Mullainathan & Spiess, 2017). In the AutoML, the 

hyperparameters are automatically selected and tuned, which reduces the necessity for human 

interventions and, thus, increases overall comparability (Prüfer & Prüfer, 2020). The AutoML 

uses a Python-based library to find optimal values for regularization and other hyperparameters. 

As in the previous ML models, in the AutoML, we also implement 10-fold cross-validation. 

According to overall prediction accuracy and compared with the AUC values depicted in Figure 

2, the results of the AutoML technique are identical to our main findings that XGBoost 

(AUC = 0.949) and DT (AUC = 0.936) are the best-performing ML techniques, followed by RF 

(AUC = 0.911). Since we allow for optimal parameterization in the AutoML technique, the OME 

prediction performance of all the applied ML techniques is higher than in our main analysis.  
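The principle behind automated hyperparameter selection can be illustrated with a small grid search that scores each candidate value by 10-fold cross-validation. The AutoML library used in the study is not named in the text, so this sketch does not reproduce it. It tunes a single hyperparameter (the number of neighbors of a simple kNN classifier) on simulated one-dimensional data, and all names and values are hypothetical.

```python
import random

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x (one numeric feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def cv_accuracy(xs, ys, k_neighbors, n_folds=10, seed=0):
    """Share of correct validation predictions across all folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct += sum(
            knn_predict(train_x, train_y, xs[i], k_neighbors) == ys[i] for i in fold
        )
    return correct / len(xs)

def grid_search(xs, ys, candidates):
    """Return the candidate with the highest cross-validated accuracy and all scores."""
    scores = {k: cv_accuracy(xs, ys, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Simulated data: two overlapping classes on one feature
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(60)] + [rng.gauss(2, 1) for _ in range(60)]
ys = [0] * 60 + [1] * 60
best_k, cv_scores = grid_search(xs, ys, candidates=[1, 3, 5, 7, 9])
print(best_k, cv_scores)
```

Because the candidates are compared on validation folds rather than on the hold-out sample, the final out-of-sample evaluation remains untouched by the tuning step.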

Until now, we have used ML techniques to predict OME. However, according to the push/pull 

framework (Storey, 2016), entrepreneurial activity can also occur in the form of NME 

(TEAyyNEC). To account for this distinction and to compare OME vis-à-vis NME, we rerun our 

entire analyses for NME.8 Appendix E (Panel B) presents the confusion matrix for the fitted ML 

models on NME. In Appendix H, we list the detailed results of the ML model performance 

 
8 For the prediction of NME, the same default parameter settings were used as for the prediction of OME (see 

Appendix G). 



 
measures for predicting NME. Appendix I depicts the ROC curve comparison for NME. The 

results unveil that the XGBoost model again obtained a maximum overall accuracy of 96.5% in 

predicting NME, as well as the highest precision score (0.491) and AUC (0.824). The feature 

importance rank in Appendix J illustrates that the developmental stage of a country (CAT_GCR1), 

the country identifier (country), and household size (hhsize) are the most important features in 

predicting NME. 

2.4 Discussion and conclusion 

In this atheoretical study, we attempt to investigate the comparative performance of multiple 

supervised ML techniques in predicting OME (and NME). The findings of our utilized ML 

techniques reveal that entrepreneurial activity can be predicted with a maximum out-of-sample 

overall accuracy of 91.2% for OME (and 96.5% for NME), without hyperparameter optimization. 

Therewith, ML techniques outperform the traditional multiple LR. When we strive for optimal 

hyperparameter values, the predictive accuracy of entrepreneurial activity can be increased. 

However, concerning the precision, our results also provide suggestive evidence that even the 

best-performing ML techniques—XGBoost and RF—are still modest at correctly predicting 

entrepreneurship activity for the hold-out sample. One possible reason for this is that most of the 

individual-level GEM features are dichotomous. Nevertheless, these findings still provide clear 

evidence that, contrary to earlier assumptions of scholars, it is possible to construct mathematical 

models related to entrepreneurship that can correctly distinguish between entrepreneurs and non-

entrepreneurs, even with limited information. The surrogate model in this study suggests that 

knowing an entrepreneur is the most important feature in predicting the occurrence of OME, 

followed by different self-regulation mechanisms such as cognitions and personal traits. These 

findings are in line with existing literature, highlighting the paramount importance of social capital 

and psychological characteristics of individuals as antecedents of entrepreneurial pursuits (e.g., 

Chen et al., 1998; Liñán & Santos, 2007). 



 
The results of the ML techniques reveal that the XGBoost estimator is superior in predicting 

both OME and NME. Intuitively, this could be because ML techniques such as XGBoost and RF 

are ensemble methods that are developed primarily to perform two-class prediction tasks (i.e., 

coded 0 and 1) with structured data, such as the GEM. Since both techniques are tree-based, they 

could also benefit from the many binary features in the data, when splitting into branches. In 

addition, XGBoost uses a method known as “boosting” in which predictors are trained 

sequentially so that each model in the ensemble strives to minimize the errors of its predecessor. 

However, if we look at the precision estimates for NME prediction and compare them with the 

precision values for predicting OME, we can conclude that all ML techniques perform rather 

poorly at reliably predicting true positives for NME. These results provide suggestive evidence 

that the features queried in the GEM are better suited to understanding OME than NME, and that 

crucial concepts predicting NME are presumably missing in the GEM data. Moreover, while we 

see that agent-centric features are of importance in predicting OME, geospatial factors and 

national entrepreneurial ecosystems in which individuals are embedded play a superordinate role 

in predicting NME. 
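The sequential logic of boosting mentioned in this paragraph can be illustrated with a deliberately simple sketch: one-split trees ("stumps") are fitted one after another to the residuals of the current ensemble under squared-error loss. This is plain gradient boosting on toy data, not XGBoost itself, which additionally uses regularization, second-order gradient information, and a logistic loss for classification. All names and values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on one feature that minimizes the squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave no right side
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Sequentially add stumps, each fitted to the errors of the current ensemble."""
    predictions = [sum(ys) / len(ys)] * len(xs)  # start from the mean outcome
    errors = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return predictions, errors

# Toy data: one feature and a binary outcome
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
_, training_errors = boost(xs, ys)
print(training_errors[0], training_errors[-1])
```

The training error never increases from one round to the next, which reflects the idea that each new model corrects part of what its predecessors missed.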

Our study mainly contributes to the embryonic yet burgeoning body of ML-based 

entrepreneurship literature (e.g., Antretter et al., 2019; Prüfer & Prüfer, 2020; Tan & Koh, 1996) 

that is focused on analyzing and deciphering the complex phenomenon of entrepreneurial activity 

using ML methods that can effectively model highly non-linear processes. Due to the comparative 

nature of our study and by highlighting the predictive performance superiority of XGBoost, we 

provide an initial benchmark for future ML-based studies that seek to further improve the reliable 

predictability of entrepreneurial activity. As ML has only recently emerged as a tool for 

econometricians and entrepreneurship researchers, we consider this study as the first of a series 

of further studies exploring the predictability of different types of entrepreneurship activities, 

including digital entrepreneurship, female entrepreneurship, social entrepreneurship, corporate 

entrepreneurial activity, etc., to address research questions such as the following: Which ML 



 
technique is superior in predicting other forms of entrepreneurial activity? Which are the most 

important features in specific forms of entrepreneurial activity? However, research questions are 

not limited to positive outcomes. Of particular interest might also be investigations into the 

predictability of different kinds of business failures. 

Nonetheless, our study has several limitations. First, it is tempting to draw causal conclusions 

from the findings. However, with utmost clarity, we emphasize that due to the cross-sectional 

nature of the individual-level GEM data, direct causal inferences and conclusions require 

extremely careful scrutiny. Despite that, even correlations and non-linear associations can reveal 

useful underlying structures and mechanisms in the data. Second, although the GEM is currently 

one of the most comprehensive datasets on individual entrepreneurial activity, a substantial part 

of the features is binary, which reduces the possibility of performing more in-depth analyses. To 

draw even more detailed conclusions about the hidden patterns, most frequent interactions, and 

underlying relationships between the target and a feature, the use of input features at a higher scale 

level (i.e., continuous or Likert-based scale) would be beneficial. A higher data quality would also 

allow for providing various partial dependence plots (PDPs), disaggregated individual conditional 

expectation curves (ICEs), or Friedman’s H-statistic (Friedman & Popescu, 2008) on selected 

feature pairs to illustrate specific feature effects (Goldstein et al., 2015). These plots help to 

effectively visualize how a change in a single explanatory input feature changes the outcome 

prediction, i.e., (conditional) marginal feature effects and higher-order interactions (Friedman, 

2001; Zhao & Hastie, 2021). Third, since we use a complete-case analysis approach, the prediction 

performance indices may be biased toward either over- or underestimation. However, this is only 

the case if the missing values are not missing completely at random (MCAR). Lastly, since our 

analyses only use the GEM data, our analyses may suffer from an omitted variable bias. 
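To make the idea of a partial dependence plot concrete, the following sketch computes a partial dependence curve for one feature: the feature is set to each grid value for all rows in turn, and the model's predictions are averaged. The scoring function is a hypothetical stand-in for a fitted classifier's predicted probability, and the rows are toy data.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction when one feature is set to each grid value for all rows."""
    curve = []
    for value in grid:
        modified = [
            row[:feature_index] + [value] + row[feature_index + 1:] for row in rows
        ]
        curve.append(sum(model(row) for row in modified) / len(modified))
    return curve

def toy_model(row):
    # Hypothetical scoring rule with a main effect and an interaction term
    return 0.1 + 0.5 * row[0] + 0.2 * row[0] * row[1]

rows = [[0, 0], [0, 1], [1, 0], [1, 1]]
pdp_feature_0 = partial_dependence(toy_model, rows, feature_index=0, grid=[0, 1])
print(pdp_feature_0)
```

With a binary feature, the curve consists of only two points, which illustrates why input features measured on a finer scale would allow for more informative plots.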

Our findings and limitations also provide directions for future research. First, since we find 

that the developmental stage of a country (CAT_GCR1) is the 8th most important feature for 

predicting OME and the most important for NME prediction, country-specific factors seem to 



 
play a significant role in entrepreneurial activity. To this end, future research can draw on different 

theories that take into account country-level factors that were unraveled through a contextual view 

of entrepreneurship (Welter, 2011), the institutional theory (North, 1990; Williamson, 2000), or 

the external enabler framework (Davidsson et al., 2020). These contextual factors have been shown to 

be relevant contingencies, as they influence, for instance, human capital or cognitions and their 

effects on entrepreneurial pursuits (e.g., Autio & Acs, 2010; Boudreaux et al., 2019). These factors 

provide specific (situational) mechanisms that shape the entrepreneurial action-formation of 

individuals (Schade & Schuhmacher, 2022). In doing so, even high-dimensional non-linear 

relationships between contextual factors could be identified. The first vivid example of this fruitful 

path is provided by Jabeur et al. (2022), who forecasted and examined macro-level determinants 

of entrepreneurial opportunities. Second, while our study focuses on the comparison of an array 

of implemented ML techniques, we aim to inspire future research to dive deeper into and engage 

in the so-called “interpretable machine learning” (IML) or “explainable AI” (XAI) by using 

various model-agnostic explanation methods such as PDPs/ICEs, accumulated local effects (ALE) 

plots, local interpretable model-agnostic explanation (LIME) models, or Shapley values to 

unravel how models arrived at specific decisions and explain hidden, robust, and even anomalous 

patterns in the data (Choudhury et al., 2021; Molnar et al., 2020; Shepherd & Majchrzak, 2022). 

With these post hoc methods and higher data quality, future research can effectively infer and 

understand relationships between different features and their interactions on different target 

outcomes. This also allows future research to conduct meta-analytic reviews on independent ML 

studies that focused on similar prediction tasks with different data. Treating a validated ML 

prediction model as a stylized fact also opens avenues for theory development. Specifically, to 

explain and theoretically explicate patterns detected in the data, scholars can either use appropriate 

existing theories or engage in algorithm-supported induction for building entirely new theoretical 

approaches (Shrestha et al., 2021). Theoretical approaches that attempt to account for the 

uncovered patterns in the data can be used for classical hypothetico-deductive theory testing. 


Lastly, since entrepreneurship research is constantly in flux, bringing to light new insights into 

the entrepreneurial phenomenon, we encourage future scholars to replicate our analysis for even 

more fine-grained data using sophisticated AI techniques. 
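
To illustrate one of the model-agnostic explanation methods named above, a partial dependence curve can be sketched in a few lines. This is a minimal illustration and not the procedure used in this study: `toy_model` and its coefficients are hypothetical stand-ins for a fitted ML classifier, and the four observations are invented for demonstration only.

```python
import math
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = dict(row)       # copy, so the input data stay untouched
            modified[feature] = value  # overwrite only the feature of interest
            predictions.append(predict(modified))
        curve.append(mean(predictions))
    return curve


def toy_model(row):
    # Hypothetical stand-in for a fitted classifier (coefficients are assumptions).
    z = -2.0 + 1.5 * row["self_efficacy"] - 0.8 * row["fear_of_failure"]
    return 1.0 / (1.0 + math.exp(-z))


# Invented observations, for illustration only.
rows = [
    {"self_efficacy": 0, "fear_of_failure": 0},
    {"self_efficacy": 0, "fear_of_failure": 1},
    {"self_efficacy": 1, "fear_of_failure": 0},
    {"self_efficacy": 1, "fear_of_failure": 1},
]

pd_curve = partial_dependence(toy_model, rows, "self_efficacy", [0, 1])
print(pd_curve)
```

Because the function only needs a `predict` callable, the same sketch applies to any fitted model; individual conditional expectation (ICE) curves follow by keeping the per-row predictions instead of averaging them.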

Declaration of competing interests 

The authors declare no conflicts of interest with respect to research, authorship, and publication 

of this article.  

Funding 

This research did not receive a grant from any funding agency in the public, commercial, or 

not-for-profit sectors. 

Acknowledgments 

We thank associate editor Andreas Kuckertz and the anonymous reviewer for their helpful 

comments and suggestions. Moreover, we appreciate the contribution from Marilyn A. Uy during 

the ACERE22 DC, as well as the comments from Per Davidsson and Dean A. Shepherd on an 

earlier version and idea of this research project.


Chapter 3  

 
3. Study 2 – Digital Infrastructure and Entrepreneurial Action-Formation: A 

Multilevel Study 

 
Coauthors:  

Monika C. Schuhmacher 

 
Relative share:  

90% 

 
Status:  

Published in Journal of Business Venturing (FT50, ABS: 4, ABDC: A*, VHB-Jourqual: A) 

 
This chapter is available under:    

Schade, P. & Schuhmacher, M.C. (2022). Digital Infrastructure and Entrepreneurial Action-

Formation: A Multilevel Study. Journal of Business Venturing, 37(5), 106232, 

https://doi.org/10.1016/j.jbusvent.2022.106232  

 
A previous version of this chapter has been presented at:  

• 42nd Babson College Entrepreneurship Research Conference 2022 (BCERC), 

Waco/Texas, USA  

• Australian Center for Entrepreneurship Research Exchange (ACERE) Conference 2022, 

Melbourne/Virtual Edition, Australia 

• 24th G-Forum (2020) Interdisciplinary Conference on Entrepreneurship, Innovation and 

SMEs, Karlsruhe/Virtual Edition, Germany 



Digital Infrastructure and Entrepreneurial Action-Formation:  

A Multilevel Study 

 
Abstract 

This study investigates how country-level digital infrastructure shapes the relationships between 

the action-formation mechanisms of socio-cognitive traits, i.e., entrepreneurial self-efficacy, fear 

of failure, and opportunity recognition, and entrepreneurial action. We amalgamate the agent-

centric social cognitive theory with the external enabler framework and apply mechanism-based 

theorizing to explain how access-related mechanisms provided by digital infrastructure influence 

entrepreneurial action-formation. Based on a multilevel analysis of 344,265 individual-level 

observations from 46 countries and an additional robustness analysis of 391,119 individuals from 

53 countries, we find that an individual’s proclivity to start a new venture is contingent upon 

the level of the digital infrastructure of a country. The empirical results show that a country’s 

digital infrastructure is an external enabler that moderates the relationship between socio-

cognitive traits and entrepreneurial action. 

JEL classification: L26, D91 

Keywords: Entrepreneurial action, digital infrastructure, social cognitive theory, external enabler 

framework, mechanism-based theorizing, multilevel analysis 


3.1 Introduction  

The World Economic Forum (2014) considers digital infrastructure as the backbone of 

digitalization and a prerequisite for venture creation and economic growth. Digital infrastructure 

refers to an unbounded, open, and evolving socio-technical system that includes technological and 

human components, networks, and systems (Hanseth & Lyytinen, 2010; Tilson et al., 2010). 

Hence, digital infrastructure refers to both applications of information and communication 

technologies and the associated infrastructure (Autio et al., 2018). Therefore, digital infrastructure 

is not limited to a distinct set of specific functions or restricted by strictly defined boundaries; 

rather, it is relational in nature (Tilson et al., 2010). In view of this relational property of digital 

infrastructure, we argue that digital infrastructure functions as a macro-level external enabler (EE) 

that provides specific mechanisms that shape individuals’ entrepreneurial action-formation 

(Davidsson et al., 2020; von Briel, Davidsson, & Recker, 2018). 

However, although public opinion and policies exalt digital infrastructure as a panacea for 

paving the way toward a digitally transformed economy and scholars stress its overriding 

relevance, there is a dearth of empirical research on whether and how a country’s digital 

infrastructure shapes individuals’ entrepreneurial action-formation as an EE. To shed light on 

entrepreneurship in the digital age, scholars adopt potentially effective theoretical lenses, such as 

the notion of digital affordances (Autio et al., 2018; Nambisan, 2017), the EE framework 

(Davidsson et al., 2020; von Briel, Davidsson, & Recker, 2018), or the digital entrepreneurial 

ecosystem perspective (Sussan & Acs, 2017). Although some of the concepts are by design 

predominantly venture-level theoretical frameworks, they also allow for theorizing the enabling 

role of digital infrastructure in individual entrepreneurial action-formation.  

Extant research on individual entrepreneurship provides ample empirical evidence that socio-

cognitive traits, namely, entrepreneurial self-efficacy, fear of failure, and opportunity recognition, 

are essential mechanisms of entrepreneurial action (e.g., Busenitz & Barney, 1997; Shane & 

Venkataraman, 2000; Zhao et al., 2005). However, several researchers demonstrate that 


entrepreneurial action-formation also depends on the proximate and distal macro contexts in 

which individuals at the micro-level operate. These contexts are external to the respective focal 

phenomenon (i.e., entrepreneurial action) and enable or hinder entrepreneurship (Jack & 

Anderson, 2002; Spigel, 2017; Welter, 2011). Country-level contexts that have so far been 

investigated as moderators for the relationship between individual-level socio-cognitive traits and 

various entrepreneurial pursuits are social (Schmutzler et al., 2018), institutional, economic, and 

political (e.g., Autio & Acs, 2010; Boudreaux et al., 2019), or cultural (Stephan & Pathak, 2016; 

Wennberg et al., 2013) conditions, situations, circumstances, and environments. Although such 

research provides considerable information about traditional macro-level factors, limited effort 

has been made in theorizing the role of contemporary factors, such as digital technologies and 

infrastructure, in shaping entrepreneurial action (Nambisan, 2017).  

This prevalent line of contextual thinking is consistent with the EE framework proposed by 

Davidsson (2015) and refined by Davidsson et al. (2020). In this framework, an EE refers to the 

aggregate macro-level circumstance that can shape individual entrepreneurial action-formation 

and plays a significant role in eliciting or enabling entrepreneurial endeavors. Thus, the 

framework also considers the crucial role of the individual (Kimjeon & Davidsson, 2021). 

Furthermore, the EE framework provides mechanisms that specify the benefits derived from 

external contexts, which can be strategically used for personal purposes regarding 

entrepreneurship (Davidsson et al., 2020; Kimjeon & Davidsson, 2021). Thus far, the EE 

framework has primarily been applied to different factors outside the scope of action by 

individuals, for example, the high-speed railway expansion in China (Chen et al., 2020), 

investments in physical infrastructure in the United States (Bennett, 2019), blockchain in the 

global music industry (Chalmers et al., 2021), and digital technologies in the IT sector (von Briel, 

Davidsson, & Recker, 2018), among others (e.g., Browder et al., 2019; Davidsson et al., 2021; 

Frederiks et al., 2019). We propose that digital infrastructure serves as an EE to support 

entrepreneurial action-formation (Davidsson, 2015; Nambisan, 2017). Hence, we theorize that 


digital infrastructure provides mechanisms that shape the impact of individual socio-cognitive 

traits on entrepreneurial action. On this basis, the overarching research question is as follows: 

How does a country’s digital infrastructure shape the entrepreneurial action-formation of 

individuals? To answer this question, we used a large-scale, cross-sectional dataset comprising 

344,265 individual-level observations from 46 countries—and for robustness analysis, a dataset 

of 391,119 individuals from 53 countries—and employed logistic multilevel modeling.  

Overall, our study contributes to knowledge accumulation in the following ways. First, our 

study represents the first empirical investigation of whether and how a country’s digital 

infrastructure fosters individual entrepreneurial action-formation, thereby responding to 

numerous calls in the literature to examine entrepreneurship as a multilevel phenomenon (e.g., 

Busenitz et al., 2003; Terjesen et al., 2016).  

Second, our study contributes to the contextual entrepreneurship literature. Specifically, as we 

consider how digital infrastructure—a hitherto unconsidered technological contextual factor—

shapes entrepreneurial action-formation, we add a novel technological dimension to the existing 

classification of contexts proposed by Welter (2011). 

Third, we combine the agent-centric social cognitive theory (SCT) (Bandura, 1986; Sherman 

et al., 2015; Wood & Bandura, 1989) with the EE framework (Davidsson et al., 2020). Because 

SCT is rather unrefined and lacks a theoretical explanation of how contextual factors shape 

entrepreneurial action-formation at the micro-level, we augment SCT with the EE framework, 

which allows us to reason about how a specific contextual factor, in terms of an EE, develops through 

specific mechanisms. With this theoretical approach, which allows for mechanism-based 

theorizing, we demonstrate the theoretical usefulness of the EE framework in elaborating on an 

existing agent-centric theory, thereby responding to calls from entrepreneurship scholars to merge 

the EE framework with agent-based theories (Davidsson et al., 2020; Kimjeon & Davidsson, 

2021). Our results show that variations in country-level digital infrastructure play a significant 

role in the relationship between agential cognitions and entrepreneurial action at the individual 


level and, thus, provide an empirically substantiated theoretical elaboration of the emergent EE 

theory (Fisher & Aguinis, 2017). Specifically, we are the first to show that the EE framework is 

not only of paramount importance for theorizing at the venture level (Davidsson et al., 2020; von 

Briel, Davidsson, & Recker, 2018) but also an appropriate theoretical basis for explaining the role 

of EEs in the individual-level entrepreneurial phenomenon.  

3.2 Theoretical framework  

3.2.1 Social cognitive theory 

According to SCT, the effects of an individual’s predispositions, such as cognitions and emotions, 

are determined and shaped by both individual-level characteristics and the external 

environment in which individuals find themselves (Bandura, 2015; Mischel, 2004; Wood & 

Bandura, 1989). Grounded in an agentic perspective, SCT states that environmental and socio-

structural factors operate through psychological or self-regulatory mechanisms of the self-system 

to produce behavioral outcomes (Bandura, 2001). Moreover, according to Lent et al. (1994), 

external contexts exert either direct or indirect influence by moderating the individuals’ socio-

cognitive traits–behavior relationship. Thus, SCT clarifies the basic mechanisms of agency that 

govern individuals’ behavior and provides a useful framework to understand underlying 

mechanisms through which individual dispositions lead to behavioral activity (Hmieleski & 

Baron, 2009; Ng & Lucianetti, 2016).  

As SCT is centered on the agentic individual’s embeddedness in their environment, the theory 

also considers the multilevel perspective necessary to understand complex behavioral processes 

(Hitt et al., 2007). In fact, Jack and Anderson (2002) and Welter (2011) demand that research 

takes into consideration the individual’s external context at a higher macro-level to improve the 

understanding of the entrepreneurship phenomenon at a lower level of analysis. Therefore, we 

focus on how the external environment at the macro-level—digital infrastructure, in this case—

indirectly influences the socio-cognitive traits–entrepreneurial activity relationship at 

the micro-level.  


3.2.2 External enabler framework and venture creation 

According to the EE framework, an EE is an aggregate macro-level circumstance that can 

shape the venture creation process and play a significant role in enabling entrepreneurial 

endeavors. The EE framework classifies different types of EEs that are external to the focal 

entrepreneurial phenomenon, based on their origins, such as technological, socio-cultural, 

macroeconomic, regulatory, demographic, and political factors, and the natural environment 

(Davidsson, 2015; Davidsson et al., 2020; Mair & Marti, 2009). 

In the EE framework, a mechanism is understood as a relational construct that establishes a 

link between the external environment and the entrepreneurial agent (Davidsson et al., 2020). 

Hence, EE mechanisms provide a theoretical means of explaining in detail how external macro-

level circumstances and the micro-level entrepreneurial phenomenon are connected. However, 

whether an EE provides particular mechanisms depends on both the properties of the enabler and 

the entrepreneurial agent (Davidsson et al., 2020). As SCT lacks specific theoretical explanations 

on how external factors shape entrepreneurial action-formation, we combine SCT with the EE 

framework and reason about the underlying mechanisms provided by the EE—here, digital 

infrastructure—to deduce our hypotheses and further motivate our multilevel approach.  

3.3 Mechanism-based theorizing: Hypotheses development 

Based on the perspective of critical realism, mechanisms are causal structures that generate 

observable effects and events (Bhaskar, 1997; Henfridsson & Bygstad, 2013; Merton, 1968), by 

shaping relationships in a given set of elements (Tilly, 2001). Researchers categorize mechanisms 

as “situational, action-formation, or transformational mechanisms” depending on the level at 

which the mechanisms function (Hedström & Swedberg, 1998, pp. 22–23; Kim et al., 2016). First, 

action-formation mechanisms exclusively operate at the micro-to-micro-level (1-1). At this solely 

individual-centric micro-level, a plurality of psychological and socio-psychological mechanisms 

operate and provide a rationale for how agential cognitions, personal traits, beliefs, and 

motivations generate actions (Hedström & Swedberg, 1998). The action-formation mechanism 


perspective follows the psychological notion of human agency in SCT; it suggests that human 

action-formation arises from psychological or self-regulatory mechanisms. These self-regulatory 

mechanisms, in turn, emerge from lower-order cognitive mechanisms, specifically socio-

cognitive traits (Bandura, 1989).  

Second, transformational mechanisms influence or generate macro-level outcomes bottom-up 

from the micro-level to the macro-level (1-2) (Hedström & Swedberg, 1998; Kim et al., 2016). 

Third, situational mechanisms cover contextual factors such as technological, political, socio-

cultural, and environmental factors that occur top-down and rest at the interacting imbrication 

between the macro- and micro-levels (2-1). These situational mechanisms determine and shape 

existing socio-cognitive traits of individual actors at the micro-level (Coleman, 1986, 1994; 

Hedström & Ylikoski, 2010; Kim et al., 2016; Sarason et al., 2006). Hence, situational 

mechanisms are a set of plausible hypotheses that can be explanations of some phenomena, 

whereby the explanation takes the form of interactions between individuals at the micro-level and 

a circumstance at the macro-level (Coleman, 1994; Schelling, 1998).  

We assert that the impact of digital infrastructure on entrepreneurial action-formation refers to 

such situational mechanisms. The understanding of situational mechanisms fully aligns with the 

notion of the underlying mechanisms of the EE framework, describing “the higher-level 

relationship between the emergence of new digital technologies as external enablers (i.e. cause) 

and venture creation activity in a sector (i.e. the effect)” (von Briel, Davidsson, & Recker, 2018, 

p. 51). Furthermore, this view of situational mechanisms is particularly emphasized in innovation 

research (Hedström & Wennberg, 2017) and entrepreneurship research concerning 

entrepreneurial action-formation (Johnson & Schaltegger, 2020; Kim et al., 2016; von Briel, 

Davidsson, & Recker, 2018). Since the understandings of situational and EE mechanisms are 

identical, we will refer to both in our theorizing as EE mechanisms. Figure 4 summarizes the 

theoretical model that we will hypothesize in the following subsections. 


Figure 4: Theoretical model  

[Figure 4 comprises two levels. Macro-level 2 (country): the external enabler, digital infrastructure. 

Micro-level 1 (individual): socio-cognitive traits (entrepreneurial self-efficacy, fear of failure, 

opportunity recognition) and action (entrepreneurial activity). Action-formation mechanisms (1-1) 

are labeled H1a, H1b, and H1c; EE mechanisms (2-1) are labeled H2, H3, and H4.] 

3.3.1 Action-formation mechanisms at the individual level and baseline hypotheses 

According to SCT (e.g., Bandura, 1986; Wood & Bandura, 1989), self-regulation explains how 

humans feel, think, and behave. Since both entrepreneurial activity and individuals’ socio-

cognitive traits are complex phenomena, we adopt the “cognitive approach to entrepreneurship” 

(Baron, 2004; Mitchell et al., 2002; Shaver & Scott, 1992). Specifically, we focus on the most 

prevalent goal-directed entrepreneurial socio-cognitive traits at the individual micro-to-micro-

level (1-1) that are representations of the external environment captured through individuals’ 

mental processes (Krueger, 2003) and are generative mechanisms for entrepreneurial action-

formation (see, e.g., Baron, 1998), namely, entrepreneurial self-efficacy (e.g., Bandura, 1982; 

McGee et al., 2009), fear of failure (e.g., Caliendo et al., 2009; Langowitz & Minniti, 2007), and 

opportunity recognition (e.g., Ardichvili et al., 2003; Baron, 2006). 

An individual’s perceived self-efficacy refers to the self-assessment of one’s capabilities and 

skills (reflecting the innermost thoughts) to create and run a new business venture (McGee et al., 

2009; Zhao et al., 2005). Previous studies provide empirical evidence that (entrepreneurial) self-

efficacy, as developed by Bandura (1982), is a precursor of intentions (e.g., Chen et al., 1998; 

Zhao et al., 2005) and the best ex-ante predictor of behavior (Armitage & Conner, 2001; Bagozzi 

et al., 1989).  

Aside from favorable traits, there are also socio-cognitive traits that can stifle entrepreneurial 

action-formation. Under SCT, this is the case when individuals surmise that an intended goal is 

difficult to achieve and the likelihood of failure is omnipresent (Wood & Bandura, 1989). Venture 

creation epitomizes a perilous endeavor associated with high uncertainty and risk-taking; as such, 

fear of failure plays a pivotal role in entrepreneurial action-formation (Caliendo et al., 2009). An 

amplification of fear of failure negatively affects the probability of individual entrepreneurial 

action (Arenius & Minniti, 2005; Langowitz & Minniti, 2007; Wagner, 2007).  

SCT also highlights the human capacity for self-motivation and self-direction by creating goals 

that serve as motivators and guides for action (Bandura, 1988). In this regard, entrepreneurial 

scholars have identified opportunity recognition as one of the most fundamental and distinctive 

personal traits of entrepreneurial action-formation (Kirzner, 1979; Venkataraman, 1997). 

Individuals who possess traits of opportunity recognition can translate symbolic conceptions into 

an appropriate course of action (Wood & Bandura, 1989) and may be better at 

overcoming the inherent opacity of EE mechanisms and foreseeing their benefits (Davidsson et 

al., 2020; Grégoire & Shepherd, 2012). This idiosyncratic personal trait is described as an 

entrepreneur’s evolving vision that is malleable and becomes more concrete with the progress of 

entrepreneurial action-formation (Berglund et al., 2020; Davidsson, 2021). Therewith, 

entrepreneurial action originates from the subjective perception that the introduction of a new 

product or service is feasible and worth pursuing (Ardichvili et al., 2003; McMullen & Shepherd, 

2006). 

In summation, on an individual micro-to-micro-level (1-1), SCT suggests that the individual 

socio-cognitive traits, i.e., entrepreneurial self-efficacy, fear of failure, and opportunity 

recognition, are essential action-formation mechanisms. Hence, we propose the following 

baseline hypotheses:  


Hypothesis 1a (H1a): An individual’s perception of entrepreneurial self-efficacy is 

positively associated with entrepreneurial activity. 

Hypothesis 1b (H1b): An individual’s fear of failure is negatively associated with 

entrepreneurial activity. 

Hypothesis 1c (H1c): An individual’s opportunity recognition is positively associated with 

entrepreneurial activity.  

3.3.2 EE mechanisms of digital infrastructure and moderation hypotheses 

To understand the role of digital infrastructure in entrepreneurial action-formation, we identify 

specific EE mechanisms by drawing upon the ontological properties of digital technologies (von 

Briel, Davidsson, & Recker, 2018). According to von Briel, Davidsson, and Recker (2018), digital 

technologies can be characterized by two ambivalent properties: specificity and relationality. 

Specificity refers to the set of possible actions and interactions that can be performed with a 

technology (DeSanctis & Poole, 1994; von Briel, Davidsson, & Recker, 2018). In contrast, 

relationality refers to the set of relationships with other actors or end-users that have access to the 

same technology (Kallinikos et al., 2013; von Briel, Davidsson, & Recker, 2018).  

In light of this ontology, digital infrastructure is not characterized by specificity but is 

inherently relational and provides enhanced accessibility to different location-independent 

resources and markets through direct interactions with geographically dispersed audiences and 

end-users (Autio et al., 2018; Nambisan, 2017; Tilson et al., 2010; Wasko, 2005). Thus, we argue 

that digital infrastructure mainly facilitates access-related EE mechanisms (Bruton et al., 2015; 

Tilson et al., 2010). Drawing on the EE framework, we propose that digital infrastructure 

particularly provides resource access and market access mechanisms (Kimjeon & Davidsson, 

2021; Majchrzak et al., 2013; Podolny, 2001; von Briel, Davidsson, & Recker, 2018). While 

resource access mechanisms reflect “improved access for the [individual] to a previously existing 

type of resource,” market access mechanisms are defined as “improved access for the focal 

[individual] to a previously existing market” (Kimjeon & Davidsson, 2021, p. 4). These access-

related EE mechanisms are especially beneficial to those entities that pursue a specific goal 


(Davidsson et al., 2020). In other words, digital infrastructure enables access to pre-existing 

resources and markets, thereby facilitating entrepreneurial action-formation of individuals with 

similar socio-cognitive traits. 

3.3.2.1 Moderation of the self-efficacy action-formation mechanism 

SCT suggests that the extent to which entrepreneurial self-efficacy fosters entrepreneurial 

action-formation depends on external contexts (Bandura, 1986; Wood & Bandura, 1989). Based 

on SCT, we theorize that digital infrastructure, with the accompanying resource access and market 

access mechanisms, is an EE that moderates the effect of entrepreneurial self-efficacy on 

entrepreneurial action. Specifically, we argue that this positive relationship is stronger when 

countries have a higher level of digital infrastructure for several reasons.  

First, countries with a high level of digital infrastructure provide individuals with a market 

access mechanism, which refers to improved access to existing markets (e.g., capital, product, 

labor, credit, and customer markets) (von Briel, Davidsson, & Recker, 2018). Such market access enables 

individuals to exchange, buy, or sell services and goods with various customers and vendors 

(Shelton & Minniti, 2018). Therewith, a market access mechanism enables access to global 

markets and offers the potential to sell to lucrative international customers via easily accessible 

online marketplaces and platforms. Simultaneously, market access mechanisms reduce distance-

related issues, allow leveraging economies of scale, and lower transaction costs. These benefits 

enhance the probability that personal effort will lead to successful entrepreneurial performance 

(Chen et al., 1998). As digital infrastructure, and the associated market access mechanism, is 

considered an institutional arrangement (Leendertse et al., 2021), it also determines the relative 

rewards and expected returns from engaging in entrepreneurial activity (Baumol, 1996). 

Therefore, individuals having similar entrepreneurial self-efficacy beliefs but lacking market 

access have fewer options to offer products or services in the market, thereby lowering expected 

returns and enhancing incentives to pursue alternative career options, rather than entrepreneurship 

(Baumol, 1996). In contrast, in countries that provide market access, i.e., where digital 


infrastructure is high, existing markets are easily accessible and individuals with similar 

entrepreneurial self-efficacy are unlikely to discount the marginal value of future profits. This, 

in turn, enhances the impact of individuals prone to mobilizing motivation, abilities, and skills in 

their effort toward entrepreneurial action (Chen et al., 1998; Wood et al., 2016).  

Second, entrepreneurial action as an act of innovation describes an entrepreneur as an 

individual that utilizes existing resources necessary to produce and offer a product or service that 

fills a market gap (Drucker, 1985; Leibenstein, 1968). Here, resources refer to access to, the 

possession of, and usage of specific human, financial, and other resources and assets necessary 

for entrepreneurial action-formation (Bull & Willard, 1993; Mitchell et al., 2000). Thus, 

entrepreneurial action-formation is the result of self-assessment in which individuals evaluate the 

perceived availability of resources and the constraints to task performance. The consideration of 

perceived personal resources, such as capabilities and skills (Gist & Mitchell, 1992), the type and 

amount of specific resources (i.e., tangible and intangible resources), as well as perceptions of 

external resource availability required to complete different tasks has been shown to determine 

and shape entrepreneurial action-formation (Bandura, 1988; Gist & Mitchell, 1992; Krueger, 

1993). In countries with a low level of digital infrastructure, resources relevant to the 

entrepreneurial process are hardly accessible. This, in turn, reduces the effect of individuals 

mobilizing personal resources on entrepreneurial action, thereby lowering the likelihood of 

becoming entrepreneurially active. Thus, individuals with identical entrepreneurial self-efficacy 

beliefs will ultimately be less likely to engage in entrepreneurial action in countries without 

resource access mechanisms in place, i.e., where digital infrastructure is low. In contrast, when 

resource access is given, i.e., digital infrastructure is high, individuals’ self-efficacy allows them 

to virtually access more resources to address deficiencies that prevent them from starting a new 

business venture. 

In summation, we argue that a high (low) level of digital infrastructure, i.e., countries that (do 

not) provide resource access and market access mechanisms, reinforces (attenuates) the impact of 


entrepreneurial self-efficacy on entrepreneurial action. Thus, we propose the following 

hypothesis:  

Hypothesis 2 (H2): The positive relationship between entrepreneurial self-efficacy and 

entrepreneurial activity is moderated by digital infrastructure, such that this relationship will 

be stronger when digital infrastructure is high than when it is low. 
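
One way to make H1a and H2 concrete is a two-level random-intercept logit with a cross-level interaction. The notation below is an assumption introduced for illustration only: $y_{ij}$ denotes entrepreneurial activity of individual $i$ in country $j$, $\mathrm{ESE}_{ij}$ entrepreneurial self-efficacy, and $\mathrm{DI}_{j}$ country-level digital infrastructure. Control variables are omitted, and the specification actually estimated may differ (e.g., by including random slopes).

```latex
\begin{align}
\operatorname{logit}\Pr(y_{ij}=1) &= \beta_{0j} + \beta_{1j}\,\mathrm{ESE}_{ij}, \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{DI}_{j} + u_{0j},
  \qquad u_{0j} \sim \mathcal{N}(0, \sigma_{u}^{2}), \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,\mathrm{DI}_{j}.
\end{align}
```

Under this sketch, H1a corresponds to $\gamma_{10} > 0$ and H2 to $\gamma_{11} > 0$: the log-odds slope of self-efficacy, $\gamma_{10} + \gamma_{11}\,\mathrm{DI}_{j}$, becomes larger as digital infrastructure increases.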

3.3.2.2 Moderation of the fear of failure as an action-formation mechanism 

Within SCT, the fear of failure negatively affects the probability of individual entrepreneurial 

activity (Arenius & Minniti, 2005; Langowitz & Minniti, 2007; Wagner, 2007). We theorize that 

digital infrastructure weakens the negative effect of the fear of failure on entrepreneurial action 

for several reasons. First, different obstacles pose serious threats to individuals in the 

entrepreneurial action-formation process (Cacciotti et al., 2016). These obstacles amplify 

individuals’ fear of failure, for instance, the perception of tangible and intangible resource 

availability (e.g., Krueger, 2000). The resource access mechanism provided by digital 

infrastructure offers improved access to existing resources such as human and social capital, 

which are important conduits of information and resources for resolving uncertainties (Birley, 

1985; Engel et al., 2017). For example, individuals can engage in socially persuasive 

communication and draw on rich informational expertise from various actors and entrepreneurs 

that are not in close proximity (Kuhn & Galloway, 2015; Nambisan, 2017; Nambisan & Baron, 

2007). Hence, we expect individuals with a similar fear of failure, which dissuades them from 

becoming entrepreneurially active, to virtually receive assistance on a global scale when digital 

infrastructure is high, thereby weakening the relationship between fear of failure and 

entrepreneurial activity. 

Second, founders generally finance their initial entrepreneurial activities with their own money 

and “love money” from family and friends (e.g., Bygrave et al., 2003). People showing fear of 

failure will very likely turn away from making use of such financial resources, leading to fewer 

entrepreneurial actions. Market access allows for more external funding sources. The market 

access mechanism through digital infrastructure permits mobilizing external funding directly from 


demand-side backers through social online networks and crowd-funding platforms, without any 

distance-related friction (Bruton et al., 2015; Eiteneyer et al., 2019). Hence, the market access 

mechanism permits individuals with similar failure concerns to mitigate the influence of fears 

stemming from financial security or the lack of ability to finance an entrepreneurial venture 

(Cacciotti et al., 2016).  

Third, entrepreneurs typically start with a niche strategy. Competing at the fringe of the market 

always carries the risk, feared by founders, that the target market is too small for entrepreneurial 

survival and success (Cacciotti et al., 2016). Digital infrastructure provides a market access 

mechanism and, with it, direct access to existing (supra-)regional, independent markets 

and customers that go far beyond those facilitated by physical infrastructure (see, e.g., 

Chen et al., 2020). Thus, in countries that provide a market access mechanism through a high-

level digital infrastructure, the relation between fear of failure and entrepreneurial activity will be 

weaker compared to countries without such a market access mechanism (i.e., low-level digital 

infrastructure). 

In sum, in countries that provide both resource access and market access mechanisms (i.e., 

countries high in digital infrastructure), individuals with a similar fear of failure can 

benefit from these mechanisms, so that the negative effect of fear of failure, which prevents 

individuals from becoming entrepreneurially active, is weakened. Thus, we formulate the 

following hypothesis: 

Hypothesis 3 (H3): The negative relationship between fear of failure and entrepreneurial 

activity is moderated by digital infrastructure, such that this relationship will be weaker when 

digital infrastructure is high than when it is low. 
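
One way to read H3 formally is as an interaction term in a regression of entrepreneurial activity on fear of failure and digital infrastructure. The sketch below is illustrative only and is not the model specification estimated in this study: it assumes a binary outcome with a logit link and a country-level random intercept, and all symbols (EA, FoF, DI, u, and the beta coefficients) are notation introduced here for illustration.

```latex
% Illustrative notation only (assumed here, not taken from the study):
%   EA_{ij}  : 1 if individual i in country j is entrepreneurially active
%   FoF_{ij} : fear of failure of individual i in country j
%   DI_{j}   : level of digital infrastructure in country j
%   u_{j}    : country-level random intercept
\begin{align}
\operatorname{logit}\Pr(EA_{ij}=1)
  &= \beta_0 + \beta_1\,FoF_{ij} + \beta_2\,DI_{j}
     + \beta_3\,(FoF_{ij}\times DI_{j}) + u_{j}, \\
\frac{\partial \operatorname{logit}\Pr(EA_{ij}=1)}{\partial FoF_{ij}}
  &= \beta_1 + \beta_3\,DI_{j}.
\end{align}
```

Under this reading, H3 corresponds to beta_1 < 0 together with beta_3 > 0: the negative slope of fear of failure moves toward zero as DI_j increases.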

3.3.2.3 Moderation of the opportunity recognition action-formation mechanism 

Possessing the socio-cognitive trait of opportunity recognition is an important antecedent that 

increases the likelihood of entrepreneurial action (e.g., Kirzner, 1979; Wood & 

Bandura, 1989). We theorize that digital infrastructure strengthens the positive effect of 

opportunity recognition on entrepreneurial action for the following reasons. First, in countries 


with a high level of digital infrastructure and, thus, market access mechanisms, individuals who 

possess the socio-cognitive trait of opportunity recognition are able to evaluate new products, 

services, and/or business model ideas and meet market needs, thereby enhancing the impact of 

opportunity recognition on entrepreneurial activity. For instance, individuals can access already 

existing markets, customers, and relevant competitors through digital forums, discussion boards, 

and other platforms, or (social) networks. With market access, individuals can contact and 

converse with customers (i.e., demand-side narratives; Nambisan & Zahra, 2016) that serve 

as relevant sources in driving the process of evaluating and modifying potential manifestations of 

envisioned products and services, as well as assessing their demand and use (Davidsson, 2021). 

Thus, individuals who affirmatively indicate that they possess the socio-cognitive trait of 

opportunity recognition benefit from the market access mechanisms provided by digital 

infrastructure through the enhanced ability to determine the viability of a new venture and the 

value of entrepreneurial action, thereby increasing the likelihood of entrepreneurial activity.  

Second, in transitioning from the subjective perception of an idea to a viable, operational business, 

individuals who possess the socio-cognitive trait of opportunity recognition benefit from the 

resource access mechanism provided by digital infrastructure. In countries with a high level of 

digital infrastructure, the resource access mechanism enables individuals to identify and acquire 

relevant but missing tangible and intangible resources (Bhagavatula et al., 2010; Haynie et al., 

2009; Nambisan & Zahra, 2016), which condition the feasibility and viability of the 

entrepreneurial endeavor.