With the sudden onslaught of new artificial intelligence-based applications and tools, many managers may now be marketing their strategies as using “artificial intelligence” (AI) or machine learning. The following questions could help you ensure that you’re speaking with a methodologically rigorous AI manager and not one simply trying to jump on a fad.
Here we will share what we consider to be some high-value due diligence questions you can ask an AI-based asset manager, what type of answer you could expect to receive, and any answers that would raise a red flag for me if I were sitting in the due diligence seat.
1. What kind of AI or machine learning model are you using?
What to look for in an answer
Common AI or machine learning approaches include neural nets or deep learning (terms often used interchangeably), genetic algorithms, decision trees or random forests, support vector machines, ensemble models, or Bayesian filtering (usually applied to word or sentiment-based analysis). These models vary in their methodological robustness. In our view, deep learning, genetic algorithms, and decision trees are at the high end of the rigor spectrum; support vector machines and Bayesian filters are a little lower. They also differ in their “black boxness,” for lack of a better word. Deep learning models are notoriously opaque, while genetic algorithms and decision trees are generally pretty clear about which data points, at which weights, are contributing to the output.
Red Flag answer
It’s actually just an ordinary least squares regression/logit model/list of statistically derived signals.
While these may properly be considered quantitative approaches to investing, marketing them as machine learning, and especially as artificial intelligence, stretches those definitions to a torturous degree.
Redder Flag answer
Oh, we are using ChatGPT to summarize research/generate ideas.
While ChatGPT is a great research tool, by no means should use of that application be marketed as an AI strategy. If I came across someone doing this, I might be inclined to report them to the SEC.
2. In which programming language is your model implemented?
What to look for in an answer
Python, C++, C#, R, and Java are common and well-regarded programming languages among data scientists. Of these, Python and R have the most “user-ready” libraries for implementing deep learning. If a manager is primarily using C++, C#, or Java, it is quite likely they are doing something very proprietary, which in my mind is more likely to generate alpha than an out-of-the-box solution that anyone with passing programming skills could implement. That said, someone using Python or R should be taken seriously until proven otherwise.
Red Flag answer
Programming language? What programming language? It’s all in an Excel workbook.
I love Excel as much as the next data scientist, but I would never dream of building an artificial intelligence-based model using it. I’d be even less likely to market a strategy that did so as artificial intelligence or machine learning. This might not elicit an immediate report to the SEC, but I’d certainly at least consider it.
3. Is there anything proprietary about your model construction and optimization process?
What to look for in an answer
We believe the more proprietary a model, the better. Ideally, a manager’s model will have proprietary data inputs, use some kind of novel statistical approach for training/optimizing the model, and in a best-case scenario will have a proprietary method for validating the model’s performance. In modeling/machine learning, these validation methods may be called “fitness functions” (a score the model tries to maximize) or “loss functions” (an error the model tries to minimize); either way, they are a method for evaluating whether the model is achieving its targeted goal.
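To make the idea of a loss function concrete, here is a minimal sketch in Python. It uses mean squared error, a common textbook loss, purely for illustration; it is not a description of any manager's proprietary method, and the return figures and model names are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """Average squared gap between predictions and realized values (lower is better)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical monthly returns (illustrative numbers, not real data)
actual_returns = [0.02, -0.01, 0.03, 0.00]
model_a = [0.01, -0.01, 0.02, 0.01]   # forecasts from hypothetical model A
model_b = [0.05, 0.03, -0.02, 0.04]   # forecasts from hypothetical model B

loss_a = mean_squared_error(model_a, actual_returns)
loss_b = mean_squared_error(model_b, actual_returns)

# The model with the lower loss tracked the realized returns more closely.
better = "A" if loss_a < loss_b else "B"
print(f"loss A = {loss_a:.6f}, loss B = {loss_b:.6f}, better model: {better}")
```

A manager with a proprietary validation method would swap in their own scoring function here. A useful follow-up question is what that function rewards and what it penalizes.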
Red Flag answer
We use [a program that anyone could license and implement].
A Google search will quickly reveal whether they are using a basic software package. Such packages may be a good quick-and-dirty way to implement an AI-based investment strategy. However, someone whose primary goal is quick and dirty is more likely to be fad-chasing than running a strategy they genuinely believe can produce alpha for their investors.
Redder Flag answer
I’m not sure. I don’t really know how it all actually works.
This is probably self-evident, but I would never invest my own money with a manager who cannot explain and does not understand their investment system. If you are speaking with someone from distribution, sales and marketing and you get this answer, ask to speak to one of their quantitative staff or a portfolio manager. If you are in fact speaking with one of their quantitative staff or portfolio managers, this would be a good time to hang up the phone and end the interview.
4. How regularly do your data scientists/programmers interact with your portfolio management team?
What to look for in an answer
I believe there should be a significant level of interaction between the quantitative team and the portfolio management team if they are not in fact one and the same. These two roles should be engaged in a constant feedback loop to understand how the model is working in the real market environment and what can be changed or added to produce better live results.
Red Flag answer
Not very much at all.
Unless their quantitative team is following the market and strategy performance as closely as the PM team, this could be a recipe for disaster, especially if something significant changes about the market environment that hasn’t been seen before in the backtest sample data.
Redder Flag answer
We do not employ any data scientists or programmers.
If you didn’t hang up the phone after question 3, I’d go ahead and end that call now. Even if they work with outside contractors with expertise, I personally would not invest my or my clients’ life savings in an “AI” strategy with a firm that does not have any internal expertise in artificial intelligence, machine learning, or data science.
5. How are human beings overseeing and verifying the AI output? What are their qualifications for overseeing the models?
What to look for in an answer
We believe human beings should be heavily involved in all facets of model design, model verification, and live implementation. The stock market is a particularly fraught environment in which to implement a quantitative model because there is only one data history. Other AI applications you likely use in your daily life, like facial recognition software, can continuously generate a fresh data history for training and testing. This enables them to avoid “overfitting” to one history of data and thus provides greater accuracy when they are used “out of sample”. There needs to be a high level of expert oversight of the model outputs before they are implemented as portfolio decisions to ensure the model’s recommendations make sense in the current environment.
Further, having someone with a statistical/quantitative background participating in that review will help to ensure not just market-based technical or fundamental factors are considered in an override decision, but also methodological and statistical factors.
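For readers who want to see overfitting in miniature, the following Python sketch compares two toy models on made-up data. The data, the "memorizer" model, and all function names are assumptions created for illustration; neither model is a real investment strategy.

```python
def mse(predictions, actuals):
    """Mean squared error: lower means predictions are closer to reality."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

def fit_line(xs, ys):
    """Least squares fit of a straight line; returns a prediction function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Overfit model: repeats the outcome of the closest point it saw in training."""
    history = list(zip(xs, ys))
    return lambda x: min(history, key=lambda point: abs(point[0] - x))[1]

# Toy data: the true relationship is y = 2x; the training history also contains noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]
test_x = [0.4, 1.6, 2.4, 3.6, 4.4]   # "out of sample": not seen in training
test_y = [2 * x for x in test_x]

results = {}
for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = {
        "in_sample": mse([model(x) for x in train_x], train_y),
        "out_of_sample": mse([model(x) for x in test_x], test_y),
    }
    print(name, results[name])
```

In this toy setup the memorizer scores a perfect zero error on its own history, yet it does worse than the simple line on the unseen points. A flawless backtest can hide exactly this pattern, which is why expert review of out-of-sample behavior matters.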
Red Flag answer
We just follow the model.
Would you get in a self-driving car if you had no way of overriding the autopilot? If not, I wouldn’t trust a portfolio manager who hasn’t considered a scenario where they’d need to override the AI model they are using. I would particularly beware if the model implements trades automatically.
6. Where are you sourcing your data from? Do you have a license in place that you pay for? Do you have any backup sources?
What to look for in an answer
It is much preferable if a modeler is relying on a data source they have purchased and licensed. The data provider is incentivized to clean the data, reduce latency (the amount of time between when market data prints and when you can access it through the service), provide warning/guidance if data structures are being changed, and minimize downtime. Free sources of data may simply change or disappear overnight with no warning.
However, even paid for and licensed data sources may experience temporary issues. In these cases, it’s prudent to have a backup data source to turn to for continuity of the models.
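As a rough illustration of what a backup arrangement can look like in practice, here is a minimal Python sketch of a failover between two price sources. The feed functions, the ticker "ABC", and the price are all hypothetical stand-ins; a real setup would call the licensed vendors' actual interfaces and apply far more thorough data checks.

```python
def fetch_price(ticker, sources):
    """Try each data source in order; return (price, source_name) from the first that works."""
    errors = []
    for name, fetch in sources:
        try:
            price = fetch(ticker)
            # Basic sanity check so gibberish is treated like an outage.
            if price is None or price <= 0:
                raise ValueError(f"implausible price {price!r}")
            return price, name
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all data sources failed: " + "; ".join(errors))

def primary_feed(ticker):
    """Hypothetical primary vendor, simulated here as being down."""
    raise ConnectionError("primary feed is down")

def backup_feed(ticker):
    """Hypothetical backup vendor returning a made-up price."""
    return {"ABC": 101.25}[ticker]

price, source = fetch_price("ABC", [("primary", primary_feed), ("backup", backup_feed)])
print(f"ABC priced at {price} from the {source} source")
```

When a manager describes their backup source, it is worth asking whether the switch happens automatically like this or depends on someone noticing the outage.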
Red Flag answer
We’ve just been scraping data from [finance website]. It’s working great so far. (By the way, have you seen how much Bloomberg charges?)
Our company used to rely on Yahoo Finance data as a backup security pricing source, which worked great until they decided to change the website and made it impossible to scrape. We quickly located another price data source we could license and have had no issues with the backup service since.
The point being that a free data source is great until the day it returns gibberish or stops working entirely.
7. What is your disaster recovery/contingency plan in the event you cannot access or generate your model results on a trading day?
What to look for in an answer
Every year our company holds a training session with our cybersecurity consultant, and it is reliably filled with brand new tales of peril. It seems hackers are always a step ahead of us and trying to figure out how they can make our lives difficult enough that we’ll send them some Bitcoin to leave us alone.
Any highly computational manager should have considered what they would do in the event that their mission critical machines were compromised. Having a cybersecurity consultant or internal cybersecurity expert should be part of that plan, and that’s true for any manager, but particularly for a manager that relies on computers for investing. They should also have at least one reliable source of backup in the event that their primary machines become inaccessible due to natural disaster or nefarious actors.
Red Flag answer
We’re not too worried about a disaster, we could regenerate our models pretty quickly if we lost access to our computers.
Not only would this viewpoint be out of step with the SEC, which we anticipate will come out with its new cybersecurity rule in the near future, but there are really two red flags embedded in that answer. First, if someone isn’t worried about needing to quickly rebuild their models, the models are probably not very complex or worth investing in in the first place. Our company tends to seek out “difficult to compute” input variables and fitness functions, as we see this as a moat around our alpha. If it’s hard for us to compute, it should be even harder for someone else. Second, someone providing that answer likely does not have a proper fear of the current cybersecurity environment.
Why We Care
I could share even more due diligence questions for interviewing AI-based managers, but these seven are a good start and should filter out a lot of the fluff and noise.
Artificial intelligence and machine learning are incredibly powerful quantitative tools for investing and solving other difficult problems. We believe more and more asset managers will begin incorporating these tools into their investment process in some fashion, and so we believe it is a good time to begin to understand these methods and tools so you can properly vet managers’ use of them.
While we at BCM would love to be your preferred and/or only AI-based investment solution, we want even more to help ensure that, if you do invest with another AI-based manager, it is a good experience that meets your expectations as an investment professional. We have a deep interest in the technology and are committed, as managers, to applying it to investment decision-making appropriately and responsibly. We encourage you to demand that same commitment from any AI or machine learning-oriented asset managers you use.