Our work began in 2017 with a single police officer in central New Jersey.
(Editor’s Note: This how-to piece centers on New Jersey, an urban state of nine million that borders the cities of New York and Philadelphia on the US east coast. The project is by a team from NJ Advance Media, the state’s leading provider of local news, whose holdings include The Star-Ledger and other daily and weekly newspapers.)
The family of a 16-year-old claimed in a Facebook post that an officer in the borough of Carteret “savagely” beat their son after a brief car chase. They included photos of severe injuries to his face. Two days after reporter Craig McCarthy’s story was published, officer Joseph Reiman was charged with assault. He has pleaded not guilty.
Could the community have known Reiman might be at risk of violent behavior?
From the time Reiman was hired in 2015, the 50-person department logged 115 incidents in which an officer used force, records show. Reiman accounted for 24 of them, more than one-fifth of the entire department’s total and twice as many as the next officer.
Carteret’s civilian police commissioner said Reiman was simply a “proactive” cop. But why didn’t he raise red flags within the county prosecutor’s office or the state Attorney General’s Office, which are both responsible for overseeing local police?
In truth, it could have been other places like Carteret, Carlstadt or Carney’s Point. In the 17 years since the Attorney General’s Office first required police to report when they use force in the hopes of identifying problematic officers, departments and trends, the system has been virtually ignored.
It took a landmark New Jersey Supreme Court ruling in July to make these reports fully available to the public. To produce The Force Report, NJ Advance Media filed 506 public records requests and collected 72,677 use-of-force forms covering 2012 through 2016. They cover every municipal police department in New Jersey, as well as the State Police.
The results are now available at nj.com/force, the most comprehensive statewide database of police force in the United States.
If you have basic questions about the project or how to search the database but don’t want all the technical stuff, check out our friendly Frequently Asked Questions page.
For the rest of the nitty-gritty, keep reading this methodology.
Requesting the Records
Prior to the court ruling, numerous reporters at NJ Advance Media had attempted to build a statewide database of police force over the past decade, only to run into major roadblocks. Police departments refused to release use-of-force reports, or agreed to release them but would redact officer names, making it impossible to identify outliers or meaningful trends.
To build The Force Report database, reporters initially sought reports dating back 10 years, from 2007 through 2016, but were told by departments that many older forms were inaccessible in storage, had been lost or were otherwise not available. The requests were eventually narrowed to five years, from 2012 through 2016, the most recent full year available.
Records requests were first filed with all 21 county prosecutors. Under guidelines established by the state Attorney General’s Office, local police departments are to report all force incidents to the county prosecutor’s office annually in a manner set forth by the prosecutor. Ideally, these offices should have been a centralized source for all forms.
Offices in five counties — Atlantic, Camden, Cape May, Salem and Sussex — fulfilled the requests, but the completeness of their records quickly came into question. The Camden County Prosecutor’s Office, for example, handed over more than 5,000 forms, but reporters later found thousands more in the hands of local police departments.
The rest of the county offices denied the requests or quoted expensive service fees.
Monmouth County asked for $4,737 — 78.95 hours of work at $60 per hour — to have a records custodian read through the documents, which they said would take three minutes per form. The Ocean County Prosecutor’s Office denied the request, saying it was impossible to comb through every case file to pull a use-of-force form.
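The Monmouth County quote can be checked with simple arithmetic. The short Python sketch below reproduces the fee from the stated hours and hourly rate, then backs out the number of forms implied by three minutes per form. That form count is an inference from the quoted figures, not a number the county provided.

```python
HOURS_QUOTED = 78.95      # hours of custodian time in the quote
RATE_PER_HOUR = 60        # dollars per hour
MINUTES_PER_FORM = 3      # review time per form, according to the county

fee = HOURS_QUOTED * RATE_PER_HOUR
# Inferred, not stated in the quote: how many forms that much time would cover
implied_forms = HOURS_QUOTED * 60 / MINUTES_PER_FORM

print(f"Quoted fee: ${fee:,.0f}")
print(f"Implied number of forms: {implied_forms:,.0f}")
```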
This initial round of requests prompted a primary question of the investigation: Who, if anyone, was actually keeping track of these forms and analyzing the valuable data in them?
With little confidence in the county records, reporters turned to the police departments themselves. In total, over the course of about seven months, reporters filed 506 public records requests and received 72,677 paper records or digital scans of records.
Within days of filing the requests, the reporting effort received a boost when the New Jersey State Association of Chiefs of Police circulated a memo to every department in the state advising them of the requests and encouraging them to turn over records.
Still, countless hours were spent on the phone with local records custodians. Some quoted thousands of dollars in fees, while others denied requests outright, saying collecting so many records effectively would shut down their office for days.
The city of Trenton sought $1,500 to produce the records, a fee that was negotiated down to $550. Kearny asked for $1,369 for just 150 reports but later provided the records for free when the news organization objected. In South Brunswick, the records custodian said she planned to deny the request even though the police department was expecting it and had already compiled the records. The township provided them after a call to the department.
On several occasions, records were provided and fees were reduced after NJ Advance Media made clear the organization would file a lawsuit or make a complaint to the state government records council. Eventually, all but one department was forthcoming.
In Hillside, Chief Vincent Ricciardi refused to release the records over the advice of the township’s attorney. On Nov. 2, 2017, after more than three months of negotiations, the township agreed to provide one year of records. The news organization objected and, in a letter dated Nov. 7, gave the township one week to comply with the request or face a lawsuit. The records were provided in full and without cost on Nov. 17.
Building the Database
Many of the records were a mess.
Some were blackened with age or nearly illegible. Others lacked key details, weren’t fully completed or lacked the signature of a supervisor, which is required. In Phillipsburg, three years’ worth of forms can’t be accessed because they are under quarantine for mold contamination.
In some instances, records custodians provided incorrect years, or there were gaps in the five-year period. Two departments provided the incorrect forms for 2016. And, throughout the reporting process, the news organization continued to identify shortcomings.
Reports were missing for some incidents that had been reported in news coverage. In one case, reporters were told forms were never completed. In other cases, officials said they had no idea if the records weren’t filled out or if they were just missing.
Some reports for fatal force also were missing from the provided records. Those cases, however, are also required to be reported on a separate form, and NJ Advance Media filled the gaps with additional records requests to prosecutors’ offices and the Attorney General’s Office.
In total, the force records contained approximately three million individual data points for entry into a database. But none of the forms could be electronically read or converted into a spreadsheet, so NJ Advance Media vetted and hired a third-party company, Invensis, to complete the work.
In total, NJ Advance Media spent $30,058 for data entry, $3,745 for public records requests, $1,497 in scanning and organization costs, and $1,500 for a statistician’s review.
Because force incidents are self-reported by police officers, they inherently contain some human error. Reporters consulted with independent experts to devise an input and review system to minimize any additional errors during data entry.
The company input batches of approximately 750 forms per day, which were then audited daily by a rotating team of six reporters. The reporters randomly selected 15 forms, or 2 percent of the daily batch, and checked roughly 600 data points against original records.
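A daily spot check like the one described above could be drawn as follows. This is a minimal sketch, not the team's actual tooling: the function name, the form IDs and the figure of 40 data points per form (inferred from roughly 600 data points across 15 forms) are all assumptions made for illustration.

```python
import random

def select_audit_sample(form_ids, sample_rate=0.02, seed=None):
    """Draw a simple random sample of forms from one day's batch."""
    rng = random.Random(seed)
    sample_size = max(1, round(len(form_ids) * sample_rate))
    return rng.sample(list(form_ids), sample_size)

# Hypothetical batch of 750 forms entered in one day
daily_batch = [f"form-{i:04d}" for i in range(750)]
sample = select_audit_sample(daily_batch, seed=1)

FIELDS_PER_FORM = 40  # assumption: about 600 data points / 15 forms
data_points_to_check = len(sample) * FIELDS_PER_FORM

print(len(sample), "forms,", data_points_to_check, "data points")
```

Passing a seed makes a given day's sample reproducible, so a second reviewer can pull the same forms.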
Only a handful of times did a daily batch exceed the 2 percent error rate, often because of inconsistencies in forms used by different departments. In those instances, the news organization discussed the errors with Invensis and forms were re-entered correctly.
The team also combined daily audits into a monthly file and conducted additional audits to identify problems. In total, the audits revealed a 0.6 percent error rate, well within the 2 percent benchmark.
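The benchmark check can be sketched in a few lines of Python. The audit counts below are invented for illustration and do not come from the project. The class and function names are likewise hypothetical. The sketch flags any daily batch above the 2 percent benchmark and computes a pooled error rate across all audits.

```python
from dataclasses import dataclass

BENCHMARK = 0.02  # 2 percent error-rate benchmark

@dataclass
class DailyAudit:
    data_points_checked: int
    errors_found: int

    @property
    def error_rate(self) -> float:
        return self.errors_found / self.data_points_checked

def flag_batches(audits):
    """Return the positions of daily audits that exceeded the benchmark."""
    return [i for i, a in enumerate(audits) if a.error_rate > BENCHMARK]

def overall_error_rate(audits):
    """Pool every audited data point into a single error rate."""
    total_errors = sum(a.errors_found for a in audits)
    total_checked = sum(a.data_points_checked for a in audits)
    return total_errors / total_checked

# Made-up audit results, for illustration only
audits = [DailyAudit(600, 2), DailyAudit(600, 15), DailyAudit(600, 1)]
flagged = flag_batches(audits)
overall = overall_error_rate(audits)

print("Batches to re-enter:", flagged)
print(f"Overall error rate: {overall:.1%}")
```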
Cleaning the Database
Once data entry and audits were completed, NJ Advance Media created a single, master database. Next, the data had to be cleaned and standardized.
That meant reconciling numerous different formats for times, dates, town names, officer ranks, codes and designations for race, as well as a variety of entries for criminal charges. In addition, officers often included mini-narratives to describe the nature of the force used, requiring manual standardization of those sections to produce a meaningful analysis.
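As an illustration of this kind of standardization, the Python sketch below converts a few date formats to a single ISO format and maps assorted race designations to one label. The list of formats and the code table are assumptions for the example and do not reproduce the values found on the actual forms. Anything unrecognized is returned as None so it can be reviewed by hand.

```python
from datetime import datetime

# Assumed formats; real forms may use others
DATE_FORMATS = ["%m/%d/%Y", "%m/%d/%y", "%Y-%m-%d", "%B %d, %Y"]

# Hypothetical code table, for illustration only
RACE_CODES = {
    "W": "White", "WHITE": "White",
    "B": "Black", "BLACK": "Black",
    "H": "Hispanic", "HISPANIC": "Hispanic",
    "A": "Asian", "ASIAN": "Asian",
}

def parse_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def normalize_race(raw):
    """Map a raw designation to a standard label, or None if unrecognized."""
    return RACE_CODES.get(raw.strip().upper())

print(parse_date("3/7/2014"), parse_date("03/07/14"), parse_date("March 7, 2014"))
print(normalize_race(" w "), normalize_race("??"))
```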
The most difficult task was standardizing officer names. In some cases, one individual officer appeared in the database with five different name spellings. Reporters used identifiers such as badge numbers, state pension records and news archives to find errors.
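One way to surface likely spelling variants is to group entries that share a department and badge number, then compare the names. The sketch below uses Python's standard difflib module. The records, field names and the 0.85 similarity threshold are hypothetical, and any match it suggests would still need a reporter's confirmation against pension records or news archives.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def normalize(name):
    """Uppercase a name and strip commas, periods and extra spaces."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())

def similar(a, b, threshold=0.85):
    """True if two names are close enough to be possible variants."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

def candidate_variants(records):
    """Group distinct name spellings that share a department and badge number."""
    by_badge = defaultdict(set)
    for rec in records:
        by_badge[(rec["department"], rec["badge"])].add(normalize(rec["name"]))
    return {key: sorted(names) for key, names in by_badge.items() if len(names) > 1}

# Made-up records, for illustration only
records = [
    {"department": "Exampleville", "badge": "101", "name": "Doe, Jane"},
    {"department": "Exampleville", "badge": "101", "name": "DOE JANE"},
    {"department": "Exampleville", "badge": "101", "name": "Doe, Jayne"},
    {"department": "Exampleville", "badge": "202", "name": "Roe, Richard"},
]
variants = candidate_variants(records)
print(variants)
print(similar("Doe, Jane", "Doe, Jayne"), similar("Doe, Jane", "Roe, Richard"))
```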
The team assigned unique IDs to each officer and incident. This helped prevent errors if different officers in separate departments had the same name. It also ensured standard analysis in instances in which a single incident of force involved multiple officers.
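The idea can be sketched as follows, assuming IDs are derived from a hash of identifying fields. The article does not say how the team generated its IDs, so the scheme, field names and sample forms here are illustrative only. Two officers with the same name in different departments get different IDs, and two forms from the same incident share one incident ID.

```python
import hashlib

def make_id(prefix, *parts):
    """Build a stable ID from the identifying parts of a record."""
    key = "|".join(str(p).strip().upper() for p in parts)
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:10]
    return f"{prefix}-{digest}"

def officer_id(department, name, badge):
    return make_id("OFF", department, name, badge)

def incident_id(department, date_iso, case_number):
    return make_id("INC", department, date_iso, case_number)

# Made-up forms: the first two come from the same incident
forms = [
    {"department": "Town A", "name": "Jane Doe", "badge": "101",
     "date": "2014-03-07", "case": "14-0001"},
    {"department": "Town A", "name": "Richard Roe", "badge": "102",
     "date": "2014-03-07", "case": "14-0001"},
    {"department": "Town B", "name": "Jane Doe", "badge": "101",
     "date": "2014-05-01", "case": "14-0456"},
]
for form in forms:
    form["officer_id"] = officer_id(form["department"], form["name"], form["badge"])
    form["incident_id"] = incident_id(form["department"], form["date"], form["case"])

distinct_officers = {f["officer_id"] for f in forms}
distinct_incidents = {f["incident_id"] for f in forms}
print(len(forms), "forms,", len(distinct_officers), "officers,",
      len(distinct_incidents), "incidents")
```

Counting distinct incident IDs rather than forms keeps an incident with several officers from being counted several times.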
Duplicate forms also posed a challenge. Each member of the team employed a different method for searching the database to identify and remove duplicate entries. The cleaning also identified numerous cases in which force incidents involved animals — in particular, deer. Most animal entries were eliminated for the purpose of analysis.
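One possible method for catching duplicates, shown here only as an example of the approach, is to compare forms on a small set of key fields after trimming whitespace and ignoring case. The field names and sample rows are hypothetical.

```python
def dedupe(forms, key_fields=("officer_id", "incident_id", "subject")):
    """Keep the first form seen for each combination of key fields."""
    seen = set()
    unique = []
    for form in forms:
        key = tuple(str(form[f]).strip().upper() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(form)
    return unique

# Made-up rows: the second is a re-scan of the first
forms = [
    {"officer_id": "OFF-1", "incident_id": "INC-1", "subject": "Subject A"},
    {"officer_id": "off-1 ", "incident_id": "INC-1", "subject": "subject a"},
    {"officer_id": "OFF-1", "incident_id": "INC-1", "subject": "Subject B"},
]
unique_forms = dedupe(forms)
print(len(forms), "forms in,", len(unique_forms), "forms out")
```

Including the subject in the key avoids collapsing separate forms an officer filed for different people in the same incident.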
Analyzing the data by department required the creation of a data crosswalk showing which towns each department policed. The team also used FBI data to calculate detailed arrest rates by department, allowing for more meaningful analysis of categories such as race. Additional data from the US Census and other sources was used.
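A crosswalk of this kind can be as simple as a lookup table from town to department. In the sketch below, the towns, departments and counts are invented, and force incidents per 1,000 arrests is shown as one plausible way to normalize the data. The article does not spell out the exact formulas used.

```python
from collections import defaultdict

# Hypothetical crosswalk: one department may police several towns
TOWN_TO_DEPARTMENT = {
    "Town A": "Town A PD",
    "Town B": "Town A PD",
    "Town C": "Town C PD",
}

def total_by_department(town_values, crosswalk):
    """Roll town-level figures up to the department that polices each town."""
    totals = defaultdict(int)
    for town, value in town_values.items():
        totals[crosswalk[town]] += value
    return dict(totals)

def force_per_1000_arrests(force_incidents, arrests):
    """Return force incidents per 1,000 arrests, or None if no arrests were reported."""
    if arrests == 0:
        return None
    return 1000 * force_incidents / arrests

# Made-up figures, for illustration only
town_population = {"Town A": 12000, "Town B": 3000, "Town C": 8000}
dept_population = total_by_department(town_population, TOWN_TO_DEPARTMENT)
rate = force_per_1000_arrests(force_incidents=30, arrests=1500)

print(dept_population)
print(rate)
```

Returning None when no arrests are reported keeps a department with missing FBI data from showing up with a misleading rate.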
Even that process revealed significant shortcomings. For example, reporters discovered the Bridgeton police department had failed to report more than 4,000 arrests to the FBI over a four-year span, from 2012 through 2015.
Finally, NJ Advance Media hired an independent statistician — John Lamberth, an expert in police statistics — to check its equations, calculations and methodology and to comment on its overall presentation of the data.
Modifications were made as a result of Lamberth’s suggestions.
Knowing the Limitations
This is a living, breathing and imperfect database.
Using force is a normal and necessary part of policing. This is not a database of police misconduct. A high number of uses of force does not necessarily indicate wrongdoing.
Use-of-force incidents are self-reported, and thus, forms are dependent on the accuracy of the completing officer. And that’s if a form was completed at all, and if it was turned over in response to the public records requests. While every precaution has been taken to ensure the integrity of the available data, some small percentage of error is inevitable.
Not everyone interprets the statewide guidelines for reporting force the same way. Some departments or officers may be more rigorous about reporting, which can make their numbers appear higher. Statewide baselines for force help account for these inconsistencies.
Comparative data for the use of K-9s (police dogs) or Tasers was excluded from the database because few departments reported using those types of force. In addition, comparisons of how often subjects or officers were hospitalized were excluded because of inconsistent reporting.
Data on whether a subject was marked as under the influence was not captured because officers usually do not drug test or conduct a sobriety test unless someone is suspected of driving under the influence. As a result, the variable was unreliable.
NJ Advance Media is not providing case-by-case images of the 72,677 hard-copy forms to protect the privacy of subjects. Many forms contain personal information, including that of juveniles, victims of domestic violence or those who suffer from mental illness.
Still, this project represents the most comprehensive statewide database of police force in the United States and serves as an unparalleled resource for the residents of New Jersey. And we need your help to improve it. If you see errors in the database or know specific incidents that are missing, please contact firstname.lastname@example.org. You can view our log of changes here.
We are continuing to make this data set better. The numbers in this story were last updated Jan. 8, 2019. See the changes we’ve made here.
This post first appeared on NJ.com’s website and is cross-posted here with permission.
Craig McCarthy is a criminal justice reporter at NJ Advance Media. He led the data collection through hundreds of records requests and coordinated input while co-leading the reporting effort for NJ Advance Media’s 16-month project, The Force Report.
Stephen Stirling is a data reporter at NJ Advance Media. His stories have appeared in FiveThirtyEight.com, NJ.com and The Star-Ledger. You can also find his work on Twitter and Instagram @sstirling, and his coding on Github.