Authors:
(1) Mark Potanin, Corresponding Author ([email protected]);
(2) Andrey Chertok ([email protected]);
(3) Konstantin Zorin ([email protected]);
(4) Cyril Shtabtsovsky ([email protected]).
3 Dataset Overview, Preprocessing, and Features
3.1 Successful Companies Dataset and 3.2 Unsuccessful Companies Dataset
4 Model Training, Evaluation, and Portfolio Simulation and 4.1 Backtest
5 Other approaches
5.2 Founders ranking model and 5.3 Unicorn recommendation model
7 Further Research, References and Appendix
Founders can also be scored on characteristics such as the number of previous startups (as founder or co-founder), the area those startups operated in, and their success. A higher score indicates greater credibility for the founder's company. The results of these models can be used both for preliminary scoring of companies and as independent features in other models. An example is presented in Table 5.
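As an illustration of how such a founder score could feed into company-level features, here is a minimal sketch. The `FounderHistory` fields, the weights, and the choice of the maximum as the aggregate are all assumptions made for this example; the paper does not specify the actual scoring formula.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FounderHistory:
    n_founded: int     # previous startups as founder
    n_cofounded: int   # previous startups as co-founder
    n_successful: int  # previous startups counted as successful
    same_area: bool    # prior experience in the same area as the company


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "n_founded": 1.0,
    "n_cofounded": 0.5,
    "n_successful": 2.0,
    "same_area": 1.0,
}


def founder_score(h: FounderHistory) -> float:
    """Weighted sum of the founder's track-record characteristics."""
    return (
        WEIGHTS["n_founded"] * h.n_founded
        + WEIGHTS["n_cofounded"] * h.n_cofounded
        + WEIGHTS["n_successful"] * h.n_successful
        + WEIGHTS["same_area"] * float(h.same_area)
    )


def company_founder_feature(founders: Sequence[FounderHistory]) -> float:
    """Aggregate founder scores to one company-level feature.

    Taking the maximum is an assumption; a mean or sum would also work.
    """
    return max((founder_score(f) for f in founders), default=0.0)
```

The company-level value can then be used on its own for a preliminary ranking or appended to the feature vector of another model.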
The median time for a company to achieve "unicorn" status was found to be 4-5 years. Thus, among companies founded within this period, about half of the eventual unicorns have already reached this status, while the other half are expected to reach it in the near future. The model takes young companies founded within this 4-5 year window, identifies the unicorns among them, scores the remaining companies by their resemblance to those unicorns, and generates a list of the top 30 recommendations.
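The selection logic described here can be sketched roughly as follows. The similarity measure (cosine similarity to the centroid of the unicorns' feature vectors), the dictionary keys, and the function names are assumptions for illustration; the paper does not state how resemblance is computed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(companies, current_year, window=5, top_k=30):
    """Return names of the top_k young non-unicorns most similar to young unicorns.

    Each company is a dict with the hypothetical keys
    'name', 'founded', 'is_unicorn', and 'features'.
    """
    # Keep only companies founded within the 4-5 year window.
    recent = [c for c in companies if current_year - c["founded"] <= window]
    unicorns = [c for c in recent if c["is_unicorn"]]
    candidates = [c for c in recent if not c["is_unicorn"]]
    if not unicorns:
        return []

    # Average the unicorns' feature vectors into a single reference profile.
    dim = len(unicorns[0]["features"])
    centroid = [
        sum(u["features"][i] for u in unicorns) / len(unicorns)
        for i in range(dim)
    ]

    # Rank candidates by similarity to the unicorn profile.
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c["features"], centroid),
        reverse=True,
    )
    return [c["name"] for c in ranked[:top_k]]
```

A nearest-neighbour search or a trained classifier could replace the centroid comparison; the centroid is used here only because it keeps the sketch short.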
The simulation was run for 2016-2021 as follows:
• On January 1 of each year, a list of recommended potential unicorns is formed.
• Every month, if a round (series_X) is announced, the company is added to the portfolio, provided its valuation is below 1 billion and the round is not too high.
• Companies that have reached a valuation of 2.5 billion or have had no rounds for 3 years are removed from the portfolio.
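The rules above can be sketched as a simple monthly loop. This is a sketch under stated assumptions, not the authors' implementation: the `Round` record, the `ALLOWED_ROUNDS` set (one possible reading of "not too high"), and the requirement that a company appear on the current year's recommendation list before it can be added are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

ENTRY_CAP = 1_000_000_000   # add only below this valuation
EXIT_CAP = 2_500_000_000    # remove once this valuation is reached
MAX_IDLE_MONTHS = 36        # remove after 3 years without a round
ALLOWED_ROUNDS = {"series_a", "series_b"}  # assumption: "not too high"


@dataclass
class Round:
    company: str
    day: date
    series: str
    post_money_valuation: float


def simulate(rounds, recommendations_by_year, start_year=2016, end_year=2021):
    """Return the set of companies held at the end of the period."""
    last_round = {}  # company -> date of its latest round while held
    for year in range(start_year, end_year + 1):
        # Recommendation list formed on January 1 of each year.
        recommended = recommendations_by_year.get(year, set())
        for month in range(1, 13):
            for r in rounds:
                if (r.day.year, r.day.month) != (year, month):
                    continue
                if r.company in last_round:
                    if r.post_money_valuation >= EXIT_CAP:
                        del last_round[r.company]      # exit: reached 2.5 billion
                    else:
                        last_round[r.company] = r.day  # still held, refresh date
                elif (
                    r.company in recommended           # assumption: must be recommended
                    and r.series in ALLOWED_ROUNDS
                    and r.post_money_valuation < ENTRY_CAP
                ):
                    last_round[r.company] = r.day      # entry
            # Drop companies with no rounds for 3 years.
            stale = [
                name for name, d in last_round.items()
                if (year - d.year) * 12 + (month - d.month) >= MAX_IDLE_MONTHS
            ]
            for name in stale:
                del last_round[name]
    return set(last_round)
```

Because both entry and exit depend on `post_money_valuation`, any round with a missing valuation would have to be skipped or imputed before running such a loop.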
As a result, a portfolio of companies was formed by the end of the period. The main limitation here is the scarcity of post_money_valuation data. As a further development, a more complex recommendation system can be built as new data become available. The results are presented in Table 6.
This paper is available on arXiv under the CC 4.0 license.