The Electronic Privacy Information Center (“EPIC”), a civil liberties group based in Washington, D.C., filed an amicus brief in the United States vs. Wilson case, which concerns Google scanning billions of users’ files for unlawful content and then sending that information to law enforcement agencies.
Bypassing the Fourth Amendment
EPIC alleges that law enforcement is using Google, a private entity, to bypass the Fourth Amendment, which protects against unreasonable searches and seizures and requires warrants to be backed by probable cause.
As a private entity, Google doesn’t have to abide by the Fourth Amendment the way the government does, so it can perform these mass searches on the government’s behalf and then hand over the results. The U.S. government has been increasingly using this strategy to bypass the Fourth Amendment protections of U.S. citizens and to further expand its warrantless surveillance operations.
Image Hashes vs. Image Matches
Google and a few other companies have “voluntarily” agreed to use a database of image hashes from the National Center for Missing and Exploited Children (NCMEC) to help the organization find exploited children.
The companies also hand over any information they have on the people who possessed those images, provided those people are users of the companies’ services and shared the images through those services.
Image hash values are alphanumeric strings computed from a file’s contents, and they are effectively unique to that file. These values are then used to match one image to another and see if the files are 100% identical.
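To make the idea of hash matching concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash function and uses made-up byte strings in place of real image files; the article does not say which hash function the NCMEC database actually uses, and the names `file_hash`, `known_hashes`, and `is_known` are hypothetical.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Return the SHA-256 digest of a file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical database of hashes of known files (placeholder bytes, not real images).
known_hashes = {file_hash(b"example image bytes")}


def is_known(data: bytes) -> bool:
    """A file matches only if its hash equals a hash already in the database."""
    return file_hash(data) in known_hashes


original = b"example image bytes"
altered = b"example image bytes."  # a single extra byte

print(is_known(original))  # True
print(is_known(altered))   # False
```

Because any change to the file produces a different hash, this kind of matching only flags exact copies, which is why it carries little risk of flagging an unrelated file.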
EPIC alleges that Google has gone beyond its voluntary commitment to help NCMEC find criminals who exploit children through image hash matching. According to EPIC, the company is now also using image matching techniques that examine different files to determine whether they contain a certain image.
EPIC said this is very different from the first case of hash matching because image matching can result in many false positives (the algorithm can say that a certain file contains the original image, even though it doesn’t).
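The difference can be illustrated with a toy similarity matcher. The sketch below uses a simple "average hash" over tiny 2x2 grayscale grids. This is an assumption chosen purely for illustration: neither Google nor the government has disclosed how the actual image matching algorithm works, and the function names and pixel values here are hypothetical.

```python
def average_hash(pixels):
    """Mark each pixel 1 if it is brighter than the image's mean, else 0."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Count the positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def looks_similar(a, b, max_distance=1):
    """Treat two images as a match if their hashes differ in few positions."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


reference = [[10, 200], [220, 30]]    # the image being searched for
brightened = [[40, 230], [250, 60]]   # the same image, uniformly brightened
lookalike = [[90, 120], [130, 100]]   # a different, low-contrast image
inverted = [[200, 10], [30, 220]]     # a clearly different image

print(looks_similar(reference, brightened))  # True  (intended match)
print(looks_similar(reference, lookalike))   # True  (false positive)
print(looks_similar(reference, inverted))    # False
```

The matcher tolerates edits such as brightening, which exact hash matching cannot do, but the same tolerance lets an unrelated image with a similar light and dark layout slip through as a match.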
Referring Innocent People to Law Enforcement
EPIC noted that neither Google nor the government has revealed how the image matching algorithm works, nor have they disclosed the accuracy, reliability, or validity of the technique, all of which are required for scientific evidence in court.
EPIC argues that Google or other companies could use similar algorithms to scan not just for images of exploited children, but also for other purposes such as determining if files contain religious views, political opinions, or “banned books.”
Google was recently involved in a controversy about its development of a censored search engine for China, called “Project Dragonfly.” The search engine would enable the identification of material that the Chinese government considers “sensitive,” which likely goes much further than images of exploited children.
A Need for Algorithmic Transparency
In the Carpenter vs. United States case, the Supreme Court recognized that the existing Fourth Amendment standards need to be reexamined in the new digital age. The Court ruled that the government couldn’t automatically track individuals’ locations everywhere they go for long periods of time without a warrant.
If the physical-world equivalent of a digital surveillance method would require operations so costly that they would rarely happen, then the much cheaper automated digital version shouldn’t be permitted without a warrant, either.
EPIC argued in its new brief that automated scanning of files for various “crimes” falls into the same category. Even if the scanning of files can be classified as a “private search,” the government would need to have “virtual certainty” that the files it intends to open are the same ones that were scanned by the private company, and this may not be possible. The government can’t guarantee that the files identified by Google are the same ones that the user uploaded.
This is also why EPIC believes that algorithmic transparency is critical for software that interacts with the justice system and provides information that incriminates users of various services.