
In conversation with Artnome’s Jason Bailey

Turbare speaks to Jason Bailey, founder of Artnome, an analytical digital art database that’s helping improve opportunities for collectors and artists alike.

Author Will McBain Wed 20th Jan 2021

We’re very excited to bring you an interview with Jason Bailey, the man who created Artnome! Artnome uses artificial intelligence to improve historical records and value art, and it also plans to boost opportunities for artists from marginalised groups.

Turbare asked Jason Bailey a few questions about Artnome, AI and the art world.

Turbare: Hi Jason, to start off, please tell us a little bit about yourself. I’ve read that you consider yourself an art nerd, how did you become one?

Jason Bailey (JB): I think I was born an art nerd: an artist, but in a family of engineers. Neither of my parents grew up particularly wealthy, and for the social class they grew up in at the time, it would have been unusual to have travelled the world, seen all those museums and developed a passion for art as they did.

So even though I grew up without a lot of money early on, it was normal for me to go to museums and to look and talk about art, and I actually thought that art was a thing just for poor people. Because I didn’t understand that this is something that rich people collect and it’s also a status symbol. It was more like the museums are free and that’s what we can afford to get to go to, and do these things.

So early on I knew I wanted to be an artist. Many conversations at the dinner table focused on art and I was never much of a contributor so I became sort of a listener and soaked a lot of that stuff up.

Turbare: So when you were growing up you would go off to the museums at the weekends and you would look and seek out art?

JB: Yes, from a young age my Dad would take me at the weekends to the Museum of Fine Arts, and that’s how I learned history, I would say more than through school. He walked me through the different civilisations and the time periods and things like that, so not being able to place myself within the family as an engineer or a science person or a builder, I kind of hung my name on being the artist of the family and art has been core to my identity ever since.

Turbare: So at what point after your studies and the beginnings of your career did you decide that Artnome needed to be created?

JB: I went to school in the late 90s for painting, sculpture, printmaking and art history. And when I graduated, like most art students, I realised there was a very limited subset of professions or jobs. I had to pay for food and shelter and I had to find a way to make a living. So after trying to make it as an artist I realised that I needed to find a way to get a job so I went back into the family business and became a marketing slash design person, for various engineering companies.

And all the while I was missing being around art, and then I read this book called ‘Provenance: How a Con Man and a Forger Rewrote the History of Modern Art,’ which talked about how the assumed rate of forgeries and misattributed works in museums is somewhere between 15 and 20% of all works.

That’s a statistic that is debatable, and it’s the nature of forgery that no one really knows the number for sure, but I’ve done some math that kind of proves that that’s fairly likely. So I read that book and I thought, ‘Well, this is crazy.’ Not only are people forging paintings, but they’re forging documentation. For me it was almost like someone was forging my Bible, because art’s been my identity since I was a little kid, and to realise that it was so susceptible to being rewritten kind of blew my mind.

I had at the time spent 10 to 15 years working in big data, machine learning and analytics in my day job, for companies that specialise in those areas. So I thought, well, these giant companies, the GEs, Toyotas and Johnson & Johnsons of the world, have these massive customer databases, but where’s the massive database for art and artists? If you went on Google and tried to figure out how many paintings Rothko made, that information didn’t exist publicly online, which seemed crazy to me.

Turbare: Why are there so few publicly available databases for art?

JB: I think the tempting answer is to say the opacity is beneficial for some people on the market side. It’s argued it is in place by design, because it allows the galleries to pick and choose what they say about which works, or by who, and how many are available, so that’s one potential reason but I think that’s not fully fair.

I think when you actually think about the exercise of going in and figuring out every authentic work by a given artist, especially if they’ve been dead for quite some time, it’s a pretty difficult task. There are these books called catalogues raisonnés that do exist in print format, so a lot of the better-known artists, the Monets and people like that, are going to have catalogues raisonnés, but those printed books can be costly.

They’re very rare and the Picasso Catalogue Raisonné until recently was $200,000, right, so it’s not something that everyone has on their bookshelf and your library doesn’t have it. Their database is in print format and when something’s in print format you can’t run an analysis like you could on Excel, right. You can’t run an analysis across a single artist, you can’t compare multiple artists, so these basic questions like, what’s the average height of a Pollock or the average width of a Cézanne? No one can answer that, because it’s never been put into a format that allows you to do an analysis. Well now I can answer some of those questions.
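
To make the point about print versus data concrete, here is a minimal sketch of the kind of question that becomes trivial once a catalogue is in tabular form. It is not Artnome's code: the `CATALOGUE` rows, the artist names and the dimensions are hypothetical placeholders, and `average_dimensions` is a name invented for this illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical catalogue rows: (artist, height_cm, width_cm).
# The artists and dimensions are placeholders, not real catalogue data.
CATALOGUE = [
    ("Artist A", 100.0, 80.0),
    ("Artist A", 120.0, 90.0),
    ("Artist B", 50.0, 60.0),
    ("Artist B", 70.0, 100.0),
]


def average_dimensions(rows):
    """Return {artist: (mean height, mean width)} for the given rows."""
    by_artist = defaultdict(list)
    for artist, height, width in rows:
        by_artist[artist].append((height, width))
    return {
        artist: (mean(h for h, _ in dims), mean(w for _, w in dims))
        for artist, dims in by_artist.items()
    }


averages = average_dimensions(CATALOGUE)
for artist, (avg_h, avg_w) in sorted(averages.items()):
    print(f"{artist}: average height {avg_h:.1f} cm, average width {avg_w:.1f} cm")
```

The same grouping would also let you check a claim such as "the largest work from a given period" by taking a maximum instead of a mean.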

What’s a lot of paintings for someone to paint in their life? Well unless you know that number across a lot of painters, you don’t really know, so we have major auction houses and dealers who will say things like: “Oh, this is the largest Blackboard painting from the early 70s,” but if you actually look at the data that’s not true, but they’re under the gun to come up with different ways, adjectives and superlatives to describe the works. So sometimes they’ll just say this is the largest, this is the smallest, or this is the most rare, and until you have data, those are hard questions to answer.

Andy Warhol pop art exhibition Tate Gallery
Courtesy Tate


Turbare: In layman’s terms, how did you build your AI model and which artists can benefit from it?

JB: In layman’s terms we actually have two databases. We have the Catalogue Raisonné database, and then we have an auction database. So those are the two sources we use to build our model.

One of the big problems in data is that about 70% to 80% of data science is gathering and cleaning data, the sort of janitorial work that is not very sexy and takes a lot of time. That’s true across the board, so everyone doing interesting things in analytics is spending a lot of time, more than people realise, cleaning the data.

So the first step is prepping the data, and that step is pretty huge, getting all that data together, and once you have good clean data you can build these mathematical models, to try to get some signal around predictive pricing.

And then you do something called back testing, so you come up with models that you think will do a good job of predicting prices in the future, and you come up with a list of features – which is just a fancy name for data points – that are inputs that you think are going to impact the model, and you continue to play around with them until you back test and you start to see more success, and the predictive behaviour improves.
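
As a rough illustration of back testing, the sketch below fits a very simple price model on older sales and then scores it on later sales the model has not seen. Everything in it is an assumption made for the example: the `SALES` figures are placeholders, painting area is used as the only feature, and a straight-line fit stands in for whatever model Artnome actually uses.

```python
# Hypothetical past sales: (year, area_m2, price). Placeholder values only.
SALES = [
    (2010, 1.0, 100.0),
    (2011, 2.0, 210.0),
    (2012, 3.0, 290.0),
    (2013, 4.0, 400.0),
    (2014, 2.5, 260.0),
    (2015, 3.5, 340.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def back_test(sales, cutoff_year):
    """Fit on sales before cutoff_year, score on sales from cutoff_year on.

    Returns the mean absolute percentage error on the held-out sales.
    """
    train = [s for s in sales if s[0] < cutoff_year]
    test = [s for s in sales if s[0] >= cutoff_year]
    slope, intercept = fit_line([a for _, a, _ in train], [p for _, _, p in train])
    errors = [abs((slope * a + intercept) - p) / p for _, a, p in test]
    return sum(errors) / len(errors)


mape = back_test(SALES, cutoff_year=2014)
print(f"Held-out error: {mape:.1%}")
```

In practice you would repeat this with different feature lists and keep the ones that lower the held-out error, which is the "play around until the predictive behaviour improves" loop described above.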

All predictive models, or analytics, require lots of good clean data, and there’s not lots of good clean data in the art world right now. So a big part of our mission and what we’ve been trying to do is gather a bunch of data, clean it, and unify it across multiple sources, which is a really critical component to any of these things.

In addition to that, we’re now able to start building out a model that requires a fair amount of testing, that’d be the second point. But all these models are built on historical data, and not only is there not a lot of art data but there’s not especially a lot of data on women and artists of colour.

So given that the market right now is hot for those two groups that have been overlooked, I think this shift towards valuing underrepresented artists is a good thing. Data will really open up a new marketplace for younger folks who are interested in a more diverse group of artists, so that we can buy online and we’ll have more of a global culture around it.


Andy Warhol’s Turquoise Marilyn | Courtesy Sotheby’s

Turbare: What’s been the receptivity from the likes of Christie’s to what you’re trying to do? Are you welcomed and supported by them?

JB: Yes, I think it’s because our mission from the beginning was to try to make the data readily available. I’ve actually been invited to speak at both Christie’s and Sotheby’s multiple times. I think they realise that we’re moving into a time period where people are going to try to find this information on their own, or it’s inevitable that the information will become available.

We’ve got this massive forgery issue, money laundering and all kinds of obfuscation, but we’ve all agreed that these artists and these works are important and that’s why we preserve them and build these expensive museums and write about them. So, given all the external signals that this is really important to us, to think we don’t have a public inventory is kind of crazy.

I talked to the Yale library, the Harvard library, Columbia, the Getty Institute, all these places, because I wanted to confirm whether it is true that there is no single database that covers the complete works by all these artists. Everyone came back one by one saying no such thing exists, but ‘we wish it did’.

So, I did one artist, and it took a fair amount of time and money to get the catalogues, but it worked and I transformed that into data. I’m like, “Okay, if I could scale that, then I can get pretty quickly to a larger database.” I think I got to the largest database in the world on my own, as one person, and I thought, “Well, maybe if I can show people what one person working nights and weekends can do by just loosening the lid on the jar and speaking about it, then someone else will come along and do this and I won’t have to.” It’s just something that I think should exist, but as far as I know no one else has come along, so I continue to try to develop it.

Turbare: Does your team share this same sense of calling that you have, and what have you enjoyed working on together?

JB: The team was initially made up of a lot of volunteers, and it turned out that this idea we have resonates with a lot of people, who think we should take better care of our cultural treasures, and figure out where they are and what they are.

We have people with a core-level interest: there are a lot of really sharp data scientists in the world who are looking for interesting data sets that are relevant to their interests.

I’ve worked with some pretty bright folks, including a data scientist who worked with me to encode Van Gogh’s paintings, and figure out why they shifted towards the yellow colour scheme. There’s been these theories around for 100 years or more that he took a medication at the end of his career that turned his vision yellow, this foxglove plant that they were giving medicinally. Then the other theory was that when he went to Paris, he saw the Impressionists and that changed his palette.

Well, we looked at the weather patterns for everywhere that he ever lived, and averaged all the paintings for each location that he had lived in, and an algorithm looked at the average colour for each location, and the pattern showing how much sun or rain there is. We saw a direct correlation showing that when it gets sunny and yellow in his paintings, it’s because he’s outdoors for the majority of this time in sunny weather.
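
A minimal sketch of that kind of comparison, assuming the average colour per location has already been computed. It is not the algorithm the team used: the `LOCATIONS` values are hypothetical placeholders, `yellowness` is a deliberately crude score invented for this example, and a plain Pearson correlation stands in for the analysis described.

```python
from math import sqrt

# Hypothetical inputs, placeholders only: for each location, the mean RGB
# colour of the paintings made there and the average daily hours of sunshine.
LOCATIONS = {
    "Location A": {"mean_rgb": (120, 110, 100), "sun_hours": 3.0},
    "Location B": {"mean_rgb": (150, 140, 90), "sun_hours": 5.0},
    "Location C": {"mean_rgb": (190, 170, 70), "sun_hours": 8.0},
}


def yellowness(rgb):
    """Crude yellow score: yellow is strong red and green with weak blue."""
    r, g, b = rgb
    return (r + g) / 2 - b


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


sun = [loc["sun_hours"] for loc in LOCATIONS.values()]
yellow = [yellowness(loc["mean_rgb"]) for loc in LOCATIONS.values()]
r = pearson(sun, yellow)
print(f"Correlation between sunshine and yellowness: {r:.2f}")
```

A coefficient near 1 would mean that sunnier locations go with yellower paintings in the sample; on its own it says nothing about why, which is where the competing theories come in.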

Turbare: What type of things can affect your machine learning model predictions and valuations, and what data is missing?

JB: There’s a distinction that the art world makes between a valuation and a prediction. A valuation is how much this thing is worth today, based on what we know about comparable sales in the artist’s market. And then there’s prediction, which is how much something will be worth in the future. In the art world people tend to stay away from prediction, because there are so many different variables, so they prefer to talk about valuation.

As for what data we’re missing, we have the least amount of data around artists of colour and female artists, because the space has largely been driven to celebrate work by white men, so people that look like us. I guess as a result there’s a problem, in that all prediction engines and models tend to perpetuate bias.

So if we look backwards, our data would say that work by white men is worth about 40% more than art by women. Other variables matter too: for instance, Pollock’s paintings with more blue are going to be worth 10 times more than his others.

When you look at the biases against artists of colour and women, if you’re really looking to use analytics and data to invest in the art market to your advantage, it’s actually the absence of data and the absence of analytics, and the clarity of bias around those two groups that you would use to try to invest if that makes sense. So the more data you have, the greater transparency in the market there will be, which will help artists of colour and women.

Turbare: So the more data and more analytics that come into the market and into your models, then it’s predicted to help women and people of colour, including African artists?

JB: Yes, we certainly need more data around those groups of people that are underrepresented in order for the models to be more useful.

You know people told me, “The art market is conservative and rich and white and slow to move,” but I disagree with them. I basically said, “look, the younger generation is there and we’re witnessing a big shift generationally.” Tastes are moving more towards equality, and that’s going to change this market pretty dramatically, and will improve valuations and sales for historically undervalued artists.

Now all these models are based on history, so I think you can see the problem we have. My team and I have been developing a valuation system that can do a pretty darn good job of looking at someone like a Rothko and giving you a range of prices that their work might fall within. But a pretty big blind spot would be, as I said, artists of colour or female artists, and our models will only get better with more data about them and their work.
