How Machine Learning Is Changing Startup Predictions

by ExitStrategy, August 7th, 2024

Too Long; Didn't Read

This review covers various studies on AI applications in predicting startup success and venture capital outcomes. Key insights include the use of Crunchbase data, machine learning models like CapitalVX and gradient boosting, and the impact of behavioral theories and web-based data on investment predictions. Notable findings emphasize the importance of diverse features and data sources in enhancing predictive accuracy.

Authors:

(1) Mark Potanin, Corresponding author ([email protected]);

(2) Andrey Chertok ([email protected]);

(3) Konstantin Zorin ([email protected]);

(4) Cyril Shtabtsovsky ([email protected]).

Abstract and 1. Introduction

2 Related works

3 Dataset Overview, Preprocessing, and Features

3.1 Successful Companies Dataset and 3.2 Unsuccessful Companies Dataset

3.3 Features

4 Model Training, Evaluation, and Portfolio Simulation and 4.1 Backtest

4.2 Backtest settings

4.3 Results

4.4 Capital Growth

5 Other approaches

5.1 Investors ranking model

5.2 Founders ranking model and 5.3 Unicorn recommendation model

6 Conclusion

7 Further Research, References and Appendix

The application of AI in fintech has substantially transformed the financial services industry over the past decades [1]. One of the best-known applications is credit risk assessment [2]; another challenging task is stock market prediction [3]. This paper focuses on startup success prediction and the VC market, where there is a growing body of literature on analyzing investments using machine learning.


In [4], the authors present a machine learning model, CapitalVX, trained on a large dataset obtained from Crunchbase, to predict outcomes for startups, i.e., whether they will exit successfully through an IPO or acquisition, fail, or remain private. They investigated MLP, Random Forest, and XGBoost models and used mostly numerical features from the dataset. In [5], the authors review existing machine learning techniques that have recently contributed to understanding the needs of start-ups and business trends, and that can provide recommendations for planning future strategies to deal with business problems. The study in [6] underscores the potential of machine learning applications in the venture capital industry, demonstrating their ability to predict various outcomes for early-stage companies, including subsequent funding rounds or closure.
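As a rough illustration of the three-way target described for [4] (successful exit through an IPO or acquisition, failure, or remaining private), the sketch below maps a company record to one of those classes. The `status` field, its values, and the `outcome_label` helper are assumptions made for this example; they are not the schema or labeling rules used by the cited authors.

```python
from collections import Counter

# Hypothetical status values; a real data source may use different names.
EXIT_STATUSES = {"ipo", "acquired"}
FAILED_STATUSES = {"closed"}


def outcome_label(company: dict) -> str:
    """Map a company record to 'exit', 'fail', or 'private'."""
    status = str(company.get("status", "")).strip().lower()
    if status in EXIT_STATUSES:
        return "exit"
    if status in FAILED_STATUSES:
        return "fail"
    # Anything else (for example, still operating) counts as remaining private.
    return "private"


# Toy records, invented purely for illustration.
companies = [
    {"name": "A", "status": "ipo"},
    {"name": "B", "status": "acquired"},
    {"name": "C", "status": "closed"},
    {"name": "D", "status": "operating"},
]

labels = [outcome_label(c) for c in companies]
print(Counter(labels))
```

A multi-class classifier such as a random forest or gradient boosting model could then be trained against labels of this kind.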


In another study [7], the authors use behavioral decision theory to compare the investment returns of an algorithm with those of 255 business angels (BAs) investing via an angel investment platform. The study found that, on average, the algorithm achieved higher investment performance than the BAs. However, experienced BAs who were able to suppress their cognitive biases could still achieve best-in-class investment returns. This research presents novel insights into the interplay of cognitive limitations, experience, and the use of algorithms in early-stage investing. A separate study [8] proposes a data-driven framework in which the model was trained on 600,000 companies spanning two decades, using 21 significant features.


The review in [9] provides a thorough analysis of AI applications in venture capital, grouping the factors that influence a company’s probability of success or fund-raising into three clusters: team/personal characteristics, financial considerations, and business features. In another study [10], the authors leveraged Crunchbase data on 213,171 companies to develop a machine learning model that predicts a company’s success. Despite a limited number of predictors, the model achieved promising precision, recall, and F1 scores, with the best results coming from the gradient boosting classifier.
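For readers less familiar with the metrics mentioned for [10], the following minimal sketch computes precision, recall, and F1 for a binary "successful / not successful" label. The labels and predictions are invented toy values, not results from any cited study, and `precision_recall_f1` is a hypothetical helper written for this example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Toy data: 1 = successful company, 0 = not successful.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision measures how many predicted successes were real, recall measures how many real successes were found, and F1 is their harmonic mean. This is why the three are usually reported together when successful startups are a small minority of the data.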


This study [11] explores the untapped potential of web-based open sources, as opposed to relying solely on structured data from the startup ecosystem. Incorporating web mentions of companies into a robust machine learning pipeline based on gradient boosting yields a significant performance improvement.


This study [12] aims to assist VC firms and angel investors in identifying promising startups through rigorous evaluations, emphasizing the impact of founder backgrounds and of capital raised in seed and series stages. A recent paper published in 2023 [13] introduces a novel model for predicting startup success that incorporates both internal conditions and industry characteristics, addressing a gap in previous research that focused primarily on internal factors. Using data on over 218,000 companies from Crunchbase and six machine learning models, the authors found media exposure, monetary funding, the level of industry convergence, and the level of industry association to be key determinants of startup success.


In [14], the authors analyze more than 187,000 tweets from the Twitter accounts of 253 new ventures, achieving up to 76% accuracy in discriminating between failed and successful businesses. The research outlined in [15] investigates the methodologies used by venture capitalists when evaluating technology-based startups, examining the influence of weak signals (Twitter sentiment) and strong signals (patents) on venture valuations. The findings reveal that while both signals are positively associated with venture valuations, Twitter sentiment, unlike patents, fails to correlate with long-term investment success. Furthermore, startup age and VC firm experience act as boundary conditions for these signal-valuation relationships.


This paper is available on arXiv under a CC 4.0 license.