Historically, wage theft has been treated as a business and policy problem, not an engineering problem. That is no longer the case. A group of volunteers has accepted the challenge of building a predictive model of where wage theft is currently going undetected and then visualizing the results.

Our goal is to find gaps in the dataset where violators are flying under the radar.

The Stanford University Center for Integrated Facility Engineering hosts this project, which is appropriate since the construction industry is one of the leading sources of wage theft.

In the ongoing fight against wage theft, Stanford CIFE has joined with the Santa Clara County Wage Theft Coalition. The coalition is a collection of community organizations in Santa Clara County (aka Silicon Valley) with a mission to stop wage theft. Currently wage theft is slowed by a mix of government and non-profit organizations such as the Department of Labor Wage and Hour Division, which enforces the Fair Labor Standards Act (FLSA) for minimum wage and overtime pay standards.

Our goal is to stop the predatory exploitation of vulnerable populations. Wage theft takes many forms: paying less than minimum wage, failing to pay overtime, forcing work off the clock, issuing paychecks that bounce, stealing tips, denying required meal and rest breaks, misclassifying work (for example, as independent contract work), and not paying at all.

Workers who exercise their rights can face illegal retaliation. Even with a court award, the money is hard to collect; on average, workers collect 20 cents on the dollar.

The Wage Theft Coalition needs to show public policy makers that wage theft is a problem in their jurisdiction and that they must pass ordinances to end exploitation.

Wage theft is a classic missing species problem: models trained on existing data predict the violators that are already known. The known violators form one demographic, but there are other demographics of violators that have not yet been discovered. For example, which agricultural crops have the most complaints?
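To make the missing species framing concrete, here is a minimal sketch using the bias-corrected Chao1 estimator, a standard unseen-species estimator from ecology. It is not necessarily the method the project used. It assumes each employer is treated as a "species" and each recorded case as a "sighting"; the function name `chao1_unseen` and the employer labels are hypothetical, and the data is made up for illustration.

```python
from collections import Counter


def chao1_unseen(labels):
    """Estimate how many categories are missing from the sample.

    Uses the bias-corrected Chao1 form: f1 * (f1 - 1) / (2 * (f2 + 1)),
    where f1 is the number of categories seen exactly once and f2 is the
    number seen exactly twice.
    """
    counts = Counter(labels)          # sightings per category
    freq = Counter(counts.values())   # how many categories were seen k times
    f1 = freq.get(1, 0)
    f2 = freq.get(2, 0)
    return f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical employer IDs, one entry per recorded case (made-up data).
cases = ["A", "A", "A", "B", "B", "C", "D", "E"]

observed = len(set(cases))     # employers that appear in the data
unseen = chao1_unseen(cases)   # estimated employers not yet in the data

print(f"observed violators: {observed}")
print(f"estimated undetected violators: {unseen}")
```

The intuition is that many employers with only one or two recorded cases suggest that many more have none yet. The estimate is a lower bound under the estimator's sampling assumptions, which complaint-driven data may not satisfy.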


  • What features predict populations vulnerable to wage theft?
  • How well can these features estimate the degree to which wage theft impacts a population?
  • To what degree do visualizations of impact estimates allow advocates to A) draw conclusions and B) host discussions?


  • Investigation and complaint source cases; organized by NAICS and case unique ID. Each case is an event of a single company, multiple employees, and multiple violations.
  • Source of case (not available to public)
  • Industry demographics such as OSHA, MSHA, Census, Dept of Commerce, ACS
  • State, region, and city, such as the California Labor Commission, the County of Santa Clara, and the City of San Jose
  • Advocate groups such as the Building Trades Council (BTC) and the Foundation for Fair Contracting (FFC)
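As a rough illustration of the case structure described in the list above (one company, multiple employees, and multiple violations, keyed by NAICS code and a unique case ID), the following sketch shows one way such records might be represented and aggregated by industry. The `Case` class, its field names, the `violations_by_naics` helper, and all sample values are hypothetical and do not reflect the actual schema of the source data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Case:
    """One investigation or complaint case (hypothetical schema)."""
    case_id: str                  # unique case ID
    naics_code: str               # industry classification code
    company: str                  # the single company involved
    employees_affected: int       # number of employees in the case
    violations: list[str] = field(default_factory=list)


def violations_by_naics(cases):
    """Count recorded violations per NAICS code."""
    totals = defaultdict(int)
    for case in cases:
        totals[case.naics_code] += len(case.violations)
    return dict(totals)


# Made-up records; company names and codes are placeholders.
cases = [
    Case("c1", "238160", "Example Roofing Co", 4, ["overtime", "minimum wage"]),
    Case("c2", "238160", "Sample Roofers LLC", 2, ["overtime"]),
    Case("c3", "722511", "Placeholder Diner", 6, ["tips", "meal breaks", "overtime"]),
]

totals = violations_by_naics(cases)
print(totals)
```

Grouping by NAICS code is only one possible cut; the same records could be grouped by city or county once the regional datasets are joined in.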


  • Data visualizations
  • Integration of datasets from various sources (the next step is to automate this with an API)
  • Supported ordinance development
    • Santa Clara County
    • City of Milpitas
    • City of Cupertino
    • City of Sunnyvale
    • City of Mountain View
  • Developed apps
    • ‘Eat, Shop, Sleep’ of violators
    • Roofers wage theft app
    • WorkerReport for complaints
    • HourVoice to record work hours
  • A DataKind DataDive for social good. Volunteer data scientists from around the Bay Area came together for a weekend ‘hackathon.’ We developed predictive models and visualizations of wage theft, intended to provide content for our site and other sites.