Investigations and analyses performed for businesses are usually confidential. That prevents analysts in the private sector from giving examples of their best work. That’s not the case for analyses done for news organizations.
During my days as an investigative reporter, I routinely obtained, maintained and analyzed databases. Often this involved compelling public officials to comply with public records laws. These specialties — data analysis and use of public records laws — gave me the ability to survey the big picture and got me picked as a collaborator on many of the newspaper’s best projects.
A criminal investigation was launched after fellow Plain Dealer reporter Mark Puente and I reported that a county board of revision member was working a second job during hours he was paid to work for the county. The exciting shoe leather reporting for this story started after I analyzed county parking-garage data and found that the board member averaged just three hours a day at the county building. The board member resigned after publication of the story. Puente and I then wrote a story showing the lax work habits of other board staffers. The board administrator was ousted from his position that week, and state investigators began pursuing criminal charges against board workers. (Aug. 1, 2010)
Sometimes the only way to analyze data is the old-fashioned way: examining and entering it by hand. Fellow reporter Mark Puente and I, with the help of two interns, examined more than 100,000 paper records. We found more than 2,200 of these tax-related documents that were improperly, perhaps illegally, altered. Together, the alterations slashed $145 million from the county’s tax rolls. After learning of the newspaper’s findings, state investigators seized many of these records for a criminal investigation. (Sept. 5, 2010)
Half of the 22 members of a little-known board lowered each other’s property values, saving one another a total of $25,000 on their taxes. (Oct. 10, 2010)
I needed to identify cronies of the county’s corrupt auditor who got tax breaks. Unfortunately, the names of these folks appeared in different formats in each dataset, and the tax breaks, in the form of value changes, could appear in any one of four other datasets. Cleaning data and combining tables took days. Then my collaborators and I painstakingly hand-checked each instance of donors getting breaks. (Nov. 28, 2010)
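For readers curious what matching names across inconsistently formatted datasets can look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not a record of the tools used for the story: the function names, the suffix list, and the sample names are all hypothetical. The key is deliberately crude, so its output should be read as a list of candidates for hand-checking, not as confirmed matches.

```python
import re

# Hypothetical list of suffixes to ignore when comparing names.
SUFFIXES = {"jr", "sr", "ii", "iii"}

def name_key(raw):
    """Reduce a name to a crude comparison key: lowercase, punctuation
    removed, single-letter initials and suffixes dropped, tokens sorted
    so that "Last, First" and "First Last" produce the same key."""
    lowered = raw.lower().replace("'", "").replace("’", "")
    cleaned = re.sub(r"[^a-z\s]", " ", lowered)
    tokens = [t for t in cleaned.split() if len(t) > 1 and t not in SUFFIXES]
    return " ".join(sorted(tokens))

def candidate_matches(donors, recipients):
    """Pair each tax-break recipient with every donor sharing its key."""
    by_key = {}
    for donor in donors:
        by_key.setdefault(name_key(donor), []).append(donor)
    pairs = []
    for recipient in recipients:
        for donor in by_key.get(name_key(recipient), []):
            pairs.append((donor, recipient))
    return pairs

# Made-up sample data, for illustration only.
donors = ["John A. Smith", "Mary O'Brien"]
recipients = ["SMITH, JOHN", "OBRIEN, MARY JR", "DOE, JANE"]
matches = candidate_matches(donors, recipients)
```

A key this loose will pair unrelated people who share a common name, which is one reason each candidate still needs to be checked by hand.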
I fought for months to make Cleveland officials comply with the state’s public records laws and release public records on officers’ uses of force. The department kept these records in different units and never analyzed the records — something done by more proactive departments. After gathering the more than 238,000 electronic records, my analysis uncovered, among other things, that the department rubber-stamped each use of force “justified.” I joined four databases to identify trends, including potentially dangerous officers. This prompted the department to change its policies and create a new database, which might have remedied the problem. Instead, Mayor Frank Jackson’s administration fought to keep from releasing the new data under the state’s public records laws; the same type of problems I had uncovered persisted, leading eight years later to the U.S. Department of Justice issuing the city a consent decree that is expected to cost taxpayers $13 million. (Jan. 14, 2007)
As part of a continuing investigation into the county’s boards of revision with fellow reporters Mark Puente and Henry J. Gomez, I obtained and analyzed a database of more than 78,000 complaints filed by property owners (later updated to include more than 95,000 cases). I found that the property values of nearly 26,000 properties were decided without giving property owners a hearing. In all, these decisions shaved more than $400 million from the county tax rolls. After publication of this story, attorneys for the county concluded that most of the cases violated state law. This opened the door for property owners to have their cases heard. (Aug. 22, 2010)
The county sheriff threatened to release inmates from jail if commissioners cut his budget. I obtained a database detailing why each inmate was in jail and used criteria gleaned from interviews to identify the sorts of suspects most likely to be released if the sheriff took this action. (Jan. 12, 2009)
Cleveland’s building and housing department was plagued with problems. I joined databases containing the city’s payroll, inspections done by the department and property purchases in the county to identify parcels bought in violation of anti-corruption policies. I then teamed up with fellow reporter Joe Wagner to report out the story. (June 15, 2009)
After the bodies of 11 women were found in and around the Imperial Avenue house of suspected serial killer Anthony Sowell, people wanted to know all they could about missing persons. I had a database of every police report filed in the city going back several years, a jewel in my library of databases that took a three-year public records battle to acquire. My analysis mapped the addresses given on missing person reports and broke down the demographics of those who disappeared in Cleveland. (Nov. 15, 2009)
Calculating the cost of corruption and mismanagement was not only a challenging exercise in computer-assisted reporting, it also required me to digest the complex workings of the state’s property tax laws. After hours of interviews with officials, I wrote a step-by-step summary of how I understood the system to work. This summary was way too detailed for the newspaper. But I had to know my basic understanding was sound. After a few revisions, the county treasurer said I’d gotten it. I then devised a method for tallying how much board members’ mismanagement and poor work habits cost the county, something no one had ever calculated. I reported and wrote the story with fellow reporter Henry J. Gomez. (Feb. 20, 2011)
One of the more challenging and ultimately rewarding investigations I performed as a reporter was figuring out how much property owners were ripped off on their taxes because of the mismanagement that two other reporters (Henry J. Gomez and Mark Puente) and I uncovered in a little-known but very powerful county agency, the Board of Revision. I started with a database of 95,844 complaints to the board and pulled from it all cases from 2008 where taxpayers succeeded in getting their property values reduced. I then joined these cases with a database of more than 1 million records detailing the appraised value of every property in the county between 2002 and 2009. Every case where a property’s 2009 value was higher than the reduced value in 2008 represented a likely case of a property owner being overtaxed. I found nearly 8,000 of them. Finding how much these property owners were overtaxed presented a challenge: I had no database that detailed each parcel’s taxing district, so I used ArcGIS to map the nearly 8,000 cases that I had identified and then did a spatial join to match up parcel numbers and tax districts. After exporting the results to a database, I updated the records with the year’s tax rates and arrived at how much property owners were possibly overcharged on their taxes: upwards of $7 million. My analysis, which was published on Sept. 23, 2010, also found that more than half of the overcharged taxpayers did not file new complaints, meaning they almost certainly were not aware the county owed them refunds. After my investigation was published, a group of these overcharged property owners filed a lawsuit. (Updated October 13, 2015: The county fought for more than five years before a judge ordered that the property owners be repaid.)
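The comparison at the heart of that analysis can be sketched in a few lines of Python. This is a simplified illustration, not the method as it was actually carried out: it assumes the parcel-to-district lookup (the product of the spatial join) already exists, and it treats each district's rate as a flat fraction of appraised value, which glosses over how property tax bills are really computed. The function name, field layout, parcel numbers, values and rates are all hypothetical.

```python
def estimate_overcharges(reductions_2008, values_2009, district_of, rate_of):
    """Return {parcel: estimated overcharge} for parcels whose 2009
    appraised value exceeds the reduced value the board set in 2008."""
    overcharges = {}
    for parcel, reduced_value in reductions_2008.items():
        value_2009 = values_2009.get(parcel)
        if value_2009 is None or value_2009 <= reduced_value:
            continue  # no 2009 appraisal on file, or the value did not go back up
        # Assumption: a flat rate per district applied to the value difference.
        rate = rate_of[district_of[parcel]]
        overcharges[parcel] = (value_2009 - reduced_value) * rate
    return overcharges

# Made-up sample data, for illustration only.
reductions_2008 = {"001-01-001": 80_000, "001-01-002": 120_000, "001-01-003": 95_000}
values_2009 = {"001-01-001": 100_000, "001-01-002": 120_000, "001-01-003": 90_000}
district_of = {"001-01-001": "A", "001-01-002": "A", "001-01-003": "B"}
rate_of = {"A": 0.02, "B": 0.025}

overcharges = estimate_overcharges(reductions_2008, values_2009, district_of, rate_of)
total = sum(overcharges.values())
```

In this sample only the first parcel is flagged, because it is the only one whose 2009 value rose above the 2008 reduction.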
I noticed many Cleveland City Council members skipped meetings. Come to find out, the council clerk conveniently doesn’t track members’ attendance. I went through 30 months of meeting minutes with fellow reporter Henry J. Gomez to create a database of attendance and analyzed it to produce a story showing which council members shirk their legislative duties. (June 11, 2008)
Calculating the cost of corruption and mismanagement was not only a challenging exercise in computer-assisted reporting, it also required me to digest the complex workings of tax laws. I spent hours talking to officials to understand them. I then wrote a several-page explanation with examples. After I got this blessed by the county treasurer, I knew I understood the system and devised a method for tallying the cost. No one in the county had ever calculated the losses that occurred. But after I published this story with fellow reporter Henry J. Gomez, a consultant hired by the county came to me and asked how to do it. (Feb. 20, 2011)
Data analysis can be a vehicle that takes you to stories that have nothing to do with data. In this case, I requested ArcGIS files showing snow plow routes the mayor claimed to have improved years ago. A months-long records battle ensued with Mayor Frank Jackson. When I got the files, they didn’t match the paper records. I looked up more than 30 workers from the streets department to find two who would talk. They told me Mayor Jackson never fulfilled his promise to the public. Only then did the mayor’s spokeswoman admit the routes still weren’t ready even after two years of “work.” (March 12, 2011)
When other reporters at The Plain Dealer got their hands on records, they often brought them to me for analysis. In this case, Mark Puente handed me a database of tickets issued by Ohio State Troopers. My analysis identified trends, including the location of speed traps, and produced evidence refuting the myth that troopers issue more tickets at the end of the month. (May 20, 2007)