CIAC data architect Jonathan Leek works to quantify the hidden cost of vacancy
It can be hard to miss the broken windows, boarded-up doors and collapsing roofs on the red brick buildings while driving through certain neighborhoods in the city of St. Louis.
Vacant properties are an eyesore for residents, and they’re often connected to other problems, such as increased crime and greater risks to physical health through injury or the buildup of trash that can attract pests and spread disease.
They make neighborhoods less desirable.
It’s a challenge to accurately quantify that cost.
Leek has been working to change that. Long before he joined CIAC in March, he was part of the team that constructed the vacancy portal prototype launched last year at stlvacancy.com. The group hopes to pull publicly available data from St. Louis’ Open Data site and gather it into one comprehensive set to be updated as information changes, permits are issued or canceled and parcel ownership or vacancy status changes.
It should provide a clearer picture of vacancy in the city.
This fall, Leek has been participating in an invitation-only accelerator program put on by global nonprofit DataKind in conjunction with Microsoft. DataKind works to bring data science and artificial intelligence to bear on societal problems, and the organization has been aiding Leek and his team in using AI to implement predictive models that show the potential increases in property values and the associated tax base that could result from a reduction in vacancy.
“If you’re trying to quantify how much all these blighted properties cost the city of St. Louis, it’s really easy to say, ‘Okay, there are this many vacant buildings that are collectively worth this much, and if they were populated, this is what someone would be paying in property tax,’” Leek said. “That number is super easy to get. The number that is basically impossible to get, and that no city has really been able to get at, is: If there are three vacant buildings in my neighborhood, how much more would I be paying in property tax if those were populated?”
Leek and his team are trying to train AI to assess the value of a building the same way the city assessor does by feeding it a variety of data points – number of bedrooms, number of bathrooms, square footage, number of garages and code violations. But they are adding to those the number of vacant buildings located within 100 meters, 200 meters, 300 meters – all the way out to 12 kilometers.
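As a rough illustration of the distance-based counts described above, the following Python sketch tallies how many vacant buildings fall within each radius of a single parcel. It is not the team's code: the function name, the column labels and the coordinates are hypothetical, and it assumes locations have already been projected to x/y positions measured in meters.

```python
import math

def vacancy_counts(parcel_xy, vacant_xy, radii_m=(100, 200, 300)):
    """Count vacant buildings within each radius of one parcel.

    Counts are cumulative: a building 50 meters away is counted
    in the 100 m, 200 m and 300 m totals alike.
    """
    px, py = parcel_xy
    # Straight-line distance from the parcel to every vacant building.
    distances = [math.hypot(vx - px, vy - py) for vx, vy in vacant_xy]
    return {
        f"vacant_within_{r}m": sum(d <= r for d in distances)
        for r in radii_m
    }

# Hypothetical coordinates, in meters, invented for illustration only.
vacant_buildings = [(50, 0), (0, 150), (250, 0), (400, 400)]
counts = vacancy_counts((0, 0), vacant_buildings)
print(counts)
```

Each count would become one more column in a property's row of data, alongside figures such as bedrooms and square footage.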
All the data is compiled in a large table, the last column of which lists the value at which the property is assessed. Trained on that table, an AI model can estimate the assessed value of a property from the other columns.
After resetting all those vacancy counts at the different distances to zero – in essence telling the machine that all the vacant properties are populated – it can arrive at a new assessed value, still following the approach learned from analyzing data from the city assessor. The difference between the two assessed values reveals how much of a property’s lower value can be attributed to nearby vacant buildings or land.
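The comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's implementation: `toy_model` is a stand-in for a trained model, its coefficients are invented, and the column names are hypothetical. Only the zeroing-and-comparing step reflects the approach described above.

```python
def vacancy_effect(predict, row, vacancy_columns):
    """Predict a value as-is, then again with vacancy counts set to zero.

    Returns (baseline, counterfactual, difference). The input row is
    left unchanged; the zeroed copy is built separately.
    """
    baseline = predict(row)
    zeroed_row = {**row, **{col: 0 for col in vacancy_columns}}
    counterfactual = predict(zeroed_row)
    return baseline, counterfactual, counterfactual - baseline

def toy_model(row):
    # Stand-in for a trained model; every coefficient here is made up.
    return (
        20_000
        + 15_000 * row["bedrooms"]
        + 50 * row["sqft"]
        - 2_000 * row["vacant_within_100m"]
        - 500 * row["vacant_within_200m"]
    )

# One hypothetical property.
row = {"bedrooms": 3, "sqft": 1200,
       "vacant_within_100m": 2, "vacant_within_200m": 5}
vacancy_cols = ["vacant_within_100m", "vacant_within_200m"]

baseline, counterfactual, difference = vacancy_effect(toy_model, row, vacancy_cols)
print(baseline, counterfactual, difference)
```

Because the function accepts any callable as `predict`, the same step would work with whatever model is actually trained on the assessor's data.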
“There’s 100 things that come from this – how banks will value property, how people can access loans, how we more precisely value properties near each other such that we can have mixed income housing without overtaxing one,” CIAC Director Paul Evensen said. “Surgery with a butter knife is messy. Surgery with the surgeon’s tool can be very neat and precise, and we’re trying to get that level of precision, so we don’t do the side harm that comes with a less precise use of data.”
Through the accelerator program, Leek and his team took part in a DataDive on Oct. 6. Leek provided an overview of the existing data available in St. Louis, and a group of 12 data science volunteers, spread out across the country and internationally, split up tasks to get it in better working order.
“We did not get done what we were hoping to get done, unfortunately, but we moved the ball pretty far forward,” he said.
He’s continued to work at refining the dataset over the past month.
There are several constituencies interested in the finished product. One is the Vacancy Collaborative because if its members are able to put a dollar value to vacancy’s cost, they are better positioned to argue for investment in solutions.
Leek said researchers are also eager to take advantage of the models, which he and his team have designed to work with more than just vacancy figures.
“We’re writing the code in such a way that you can change what variables you put into the system,” he said. “You can train your own models. Once we have a good example for how to do this type of work, then it becomes a research strategy where you can say, ‘OK, instead of putting in vacancy data, let’s put in, for example, racial demographic data and see if there’s a social justice issue with how assessments happen.’
“You can tie in whatever you want, but it provides a framework for how you can use AI to answer some of these questions that have been very tricky to answer before.”
Leek is self-taught in data science after earning a degree in psychology from Missouri University of Science and Technology. He previously worked as a consultant for Daugherty Business Solutions before joining CIAC, where he’s been involved with the St. Louis Regional Data Alliance and the development of the Regional Data Exchange.
“I’m very much in the camp that the city should be making more data-driven decisions,” Leek said. “But we need that lever. We need more and better ways to convince the budget makers to put money towards this because these tools are expensive, and the people with these skills are expensive.”