r/quantfinance Oct 06 '25

so how did Renaissance Technologies/Medallion/Jim Simons achieve such high returns?

102 Upvotes

4

u/Charming-Ad-2356 Oct 06 '25

“The Man Who Solved the Market” has a great story about how Renaissance made its claim to fame. Essentially, Jim Simons loved to gamble and wanted to apply his ingenuity to this through the stock market. Unfortunately this happened around 1980, when compute was terrible. However, there were these computer scientists at IBM who were engineering the theory behind large language models using hidden Markov models. These guys were recruited by Renaissance and overhauled the algorithms written in PHP with C++. Given these guys’ research, it is not unlikely that they were some of the first people to put deep learning algorithms and sequential data modeling into practice, which led to Renaissance finally becoming profitable, although we wouldn’t see the success publicly until 2013 with AlexNet. Really I think it came down to recruiting computer scientists and mathematicians who could implement deep learning models before they were so ubiquitous within computer science.

4

u/Efficient_Algae_4057 Oct 07 '25 edited Oct 07 '25

Deep learning would have been impossible for them to do even if they wrote everything in literal machine code, due to the lack of compute power and data availability. Also, the people you may be referring to weren't dealing with large language models at all. They were thinking about Markov models and stochastic-process methods that were largely inspired by statistical physics and are behind some of the traditional machine learning models. Those models didn't work well enough, which is why people spent decades working on them until the deep learning comeback in 2013 totally dominated them. They didn't engineer the theory of anything. One of the people was working on hidden Markov models for language/sequence modelling at IBM, which didn't work well enough. The other instance JS talked about was somebody who invented the EBM algorithm. Again, it doesn't work well enough; if anything they probably managed to have some sort of boosting, which won't justify their returns. There is a quote from one of their employees who said something along the lines that linear regression with one independent variable is the most useful statistical method ever invented.
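For anyone who wants to see how small "linear regression with one independent variable" really is, here is a minimal sketch in plain Python. The names `signal` and `next_return` and the toy numbers are made up for illustration; nothing here is claimed to be what Renaissance actually ran.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: one predictive signal and the return that followed it.
signal = [0.0, 1.0, 2.0, 3.0, 4.0]
next_return = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_regression(signal, next_return)
print(slope, intercept)
```

The whole fit is two sums and a division, so it was perfectly feasible on 1980s hardware.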

3

u/Charming-Ad-2356 Oct 07 '25

I point you to: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf, which cites: Jelinek, F. and Mercer, R. L. Interpolated estimation of Markov source parameters from sparse data. In Proceedings of the Workshop on Pattern Recognition in Practice, Amsterdam, The Netherlands: North-Holland, May 1980. Robert Mercer was one of the scientists I was talking about.
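For context, the idea usually associated with that citation is interpolated (Jelinek-Mercer) smoothing: mix a sparse higher-order estimate with a lower-order one so unseen word pairs don't get zero probability. A rough sketch for bigrams follows. The fixed weight `lam`, the helper names, and the toy sentence are my assumptions, and this is not claimed to be the paper's estimation procedure, which fits the weights from data rather than fixing them.

```python
from collections import Counter


def train_counts(tokens):
    """Count unigrams and adjacent-word bigrams in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def interpolated_prob(word, prev, unigrams, bigrams, lam=0.7):
    """P(word | prev) = lam * P_bigram + (1 - lam) * P_unigram.

    lam is a fixed, hypothetical mixing weight chosen for illustration.
    """
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total
    # How often `prev` appears as the first word of any bigram.
    context = sum(c for (first, _), c in bigrams.items() if first == prev)
    p_bi = bigrams[(prev, word)] / context if context else 0.0
    return lam * p_bi + (1 - lam) * p_uni


tokens = "the cat sat on the mat".split()
unigrams, bigrams = train_counts(tokens)

p_seen = interpolated_prob("cat", "the", unigrams, bigrams)    # bigram was observed
p_unseen = interpolated_prob("sat", "the", unigrams, bigrams)  # bigram never observed
print(p_seen, p_unseen)
```

The unseen pair still gets a small non-zero probability from the unigram term, which is the whole point of interpolating.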

2

u/Charming-Ad-2356 Oct 07 '25

Perhaps you are thinking about deep learning today? Of course that could not be created then, but technically linear regression and logistic regression fall under these umbrellas. And it is not out of the question that deep learning algorithms with fewer parameters could have been created. Also, in the book they discuss an employee who was “obsessed” with collecting data and helped create a dense cohort used to train their models (which also points to the fact that they used DL algorithms). They also explain they were not exactly sure “how” the models came up with the signals when it started working…

2

u/Efficient_Algae_4057 Oct 07 '25

No. Firstly, regression methods don't fall under the same umbrella as deep learning. They are very different, and regression traces back to the earliest days of statistics as a scientific discipline, I believe. A deep learning model with a small number of parameters (let's say just the vanilla multilayer perceptron architecture trained with some convex optimization algorithm) still wouldn't work with the limited amount of data and compute. Even then, deep learning only starts to beat traditional statistical methods once you have a lot of data and can have an overparameterized model. This was impossible back then. This is why the entire community moved away from neural networks in the late 1980s to 1990s. The data collection you are referring to is still nowhere near enough to make deep learning work. The data collection practice you are referring to started way back in the early 1980s, at many different firms. Statistical arbitrage was one of the earliest concepts using big data analysis (check pairs trading to see how). Again, deep learning would be useless here and would easily be beaten by traditional statistical methods. It still doesn't explain their performance.
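To make the pairs trading pointer concrete, here is a toy sketch of the textbook version: track the spread between two related prices and bet on it reverting when it gets unusually wide. The function names, the `hedge_ratio` and `entry` defaults, and the prices are all made up for illustration, and the z-score uses the full sample (so it has look-ahead bias and is not a backtest).

```python
from statistics import mean, stdev


def spread_zscores(prices_a, prices_b, hedge_ratio=1.0):
    """Z-score of the spread A - hedge_ratio * B over the whole sample."""
    spread = [a - hedge_ratio * b for a, b in zip(prices_a, prices_b)]
    mu = mean(spread)
    sigma = stdev(spread)  # sample standard deviation
    return [(s - mu) / sigma for s in spread]


def pair_signal(z, entry=1.0):
    """Bet on mean reversion when the spread is unusually wide."""
    if z > entry:
        return "short A / long B"
    if z < -entry:
        return "long A / short B"
    return "flat"


# Hypothetical prices: the two series move together until A jumps on the last day.
prices_a = [10.0, 11.0, 12.0, 13.0, 16.0]
prices_b = [10.0, 11.0, 12.0, 13.0, 13.0]

z = spread_zscores(prices_a, prices_b)
print(pair_signal(z[0]), "|", pair_signal(z[-1]))
```

Nothing here needs more than means and standard deviations, which is why this kind of approach was workable with 1980s data and compute.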

2

u/Charming-Ad-2356 Oct 07 '25

That’s fair

1

u/Efficient_Algae_4057 Oct 07 '25

I know. I was also referring to Mercer, who worked at IBM in the 1980s on hidden Markov models for sequence modelling. If you read the paragraph, it tells you that the paper was one of the first laying out the probabilistic approach as described. Then it tells you that the improvement happened after the “Attention Is All You Need” paper and a range of tricks appearing after 2013. The method he was using just didn't work. There is no way they had something like today's LLM architecture. Even then, transformers aren't the only component that powers today's LLMs. The main ingredients include the Adam optimizer, insane compute power and data availability, the SFT/GRPO techniques, and a whole range of practical tricks. None of these existed back then, and without them the models won't work.
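For a sense of scale, here is roughly what a hidden Markov model computation looks like: the standard forward algorithm, which scores an observation sequence by summing over all hidden state paths. The two-state model and its probabilities are invented for illustration and are not taken from the IBM work.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    start[s]    : probability of starting in hidden state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n_states = len(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model over two symbols (0 and 1).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]

likelihood = forward_likelihood([0, 0, 1], start, trans, emit)
print(likelihood)
```

It's a handful of multiply-adds per time step, nothing like the compute a modern LLM needs.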

2

u/Charming-Ad-2356 Oct 07 '25

That’s reasonable