Day Seventeen: Institutional Incentives

SDS 237: Data Ethnography

Lindsay Poirier
Statistical & Data Sciences, Smith College

Fall 2023

Reminders

  • If you have not started MP2 yet, you are already behind.
  • A reading from Cooking Data is due this Thursday.

Consider a time when you were rewarded or penalized based on your performance against a numeric metric.

Turn to a neighbor and discuss:

  • What kinds of decisions did you have to make about how to behave in relation to this metric?
  • How might people “game” this metric?

NYC Stop, Question, and Frisk

  • Permits officers to stop individuals when they have “reasonable suspicion” that a crime has been committed

  • 2011 District Court case: Floyd and Ourlicht v. City of New York

    • Plaintiffs presented data to show the degree of racial profiling in practice

    • Data aggregated from the UF-250 forms officers fill out after each stop

j-No, Flickr

Hon. Scheindlin’s Ruling

Joel Spector ⓒ2013

“Because it is impossible to individually analyze each of those stops, plaintiffs’ case was based on the imperfect information contained in the NYPD’s database of forms (‘UF-250s’) that officers are required to prepare after each stop.”

Juking the Stats

  • CompStat: crime reduction strategy instituted in NYC in the 1990s

  • Used crime and deployment data as performance metrics

  • Created institutional incentives to manipulate data

pardonmeforasking, Flickr

Disclosure Datasets

Tabular datasets that aggregate information produced and reported by the same institutions they are meant to hold accountable.

  • Self-disclosure concerns:

    • “Juking the stats” (policing)

    • “Cooking the books” (campaign finance)

    • “Phantom reductions” (environmental monitoring)

Classes of Accountability Data

Disclosure Data

Dan Nguyen

Evaluative Data

Digits.co.uk Images

Monitoring Data

Ivan Radic, on Flickr

False Reporting

  • Lying or misreporting data
  • Auditing can be challenging

Sample HMDA Data Collection Form

Deceptive Accounting

  • Not technically false but deliberately misleading
  • Takes advantage of ambiguities in standards or laws
  • Often involves “creative” approaches to measurement or classification

Discursive Risk of Regulatory Burden

  • The scope of a dataset is determined by its reporting thresholds
  • Stakeholders have advocated for strengthening or loosening thresholds in line with their political commitments
  • “Regulatory burden” discourse has been a powerful tool for loosening reporting requirements
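
To make the first point concrete, here is a minimal Python sketch of how a reporting threshold sets the scope of a disclosure dataset. The facility names, emission values, and both thresholds are hypothetical, invented only for illustration; they are not drawn from any real reporting program.

```python
# Hypothetical facilities and annual emissions in tons.
# All names and numbers are made up for illustration only.
emissions_tons = {
    "Facility A": 120,
    "Facility B": 45,
    "Facility C": 9,
    "Facility D": 30,
}


def reporters(records, threshold):
    """Return the sorted names of entities at or above the reporting threshold."""
    return sorted(name for name, value in records.items() if value >= threshold)


# A stricter (lower) threshold versus a loosened (higher) one; both are assumed values.
strict = reporters(emissions_tons, threshold=10)
loose = reporters(emissions_tons, threshold=50)

print("Threshold 10:", strict)
print("Threshold 50:", loose)
print("Dropped from the dataset:", sorted(set(strict) - set(loose)))
```

Nothing about the facilities changes between the two runs. Only the rule for who must report changes, and with it who appears in the dataset at all.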

Who gets to self-disclose?