Facebook is pouring a lot of time and money into augmented reality, including building its own AR glasses with Ray-Ban.
Right now, these gadgets can only record and share imagery, but according to The Verge, the company wants its AI systems to be able to constantly analyze people’s lives using first-person video. This would give the systems access to what people see, do and hear in order to assist them with everyday tasks.
Facebook’s researchers have outlined a series of skills they want these systems to develop, such as “episodic memory,” which could answer questions like “where did I leave my keys?” Another skill they are interested in is “audio-visual diarization” (remembering who said what when).
Right now, the tasks outlined above can’t be achieved reliably by any AI system, and Facebook stresses that this is a research project rather than a commercial development. But the company apparently sees functionality like this as the future of AR.
“Definitely, thinking about augmented reality and what we’d like to be able to do with it, there’s possibilities down the road that we’d be leveraging this kind of research,” Facebook AI research scientist Kristen Grauman told the outlet.
Of course, this kind of functionality would raise major privacy concerns; privacy experts are already worried about how Facebook’s AR glasses allow wearers to covertly record members of the public. Such concerns will only worsen if future versions of the hardware not only record footage but also analyze and transcribe it, turning wearers into walking surveillance machines.
Facebook’s research project is called Ego4D, a name that refers to the analysis of first-person, or “egocentric,” video. It consists of two major components: an open dataset of egocentric video and a series of benchmarks that Facebook thinks AI systems should be able to tackle in the future.
The dataset is the biggest of its kind ever created, and Facebook partnered with 13 universities around the world to collect the data. In total, over 3,000 hours of footage were recorded by 855 participants living in nine different countries. The universities, rather than Facebook, were responsible for collecting the data.
Participants, some of whom were paid, wore GoPro cameras and AR glasses to record video of the natural activity around them. The universities de-identified all footage, a process that included blurring the faces of bystanders and removing any personally identifiable information.
Facebook’s record on privacy has been problematic, spanning data leaks and a $5 billion fine from the FTC, a record that repeatedly suggests the company values its growth and engagement numbers more than its users’ well-being. The description of the “audio-visual diarization” task (transcribing what different people say) makes no mention of removing data about people who don’t want to be recorded.
When asked about these issues, a spokesperson for Facebook said the company expects privacy safeguards to be introduced further down the line. “We expect that to the extent companies use this dataset and benchmark to develop commercial applications, they will develop safeguards for such applications,” said the spokesperson. “For example, before AR glasses can enhance someone’s voice, there could be a protocol in place that they follow to ask someone else’s glasses for permission, or they could limit the range of the device so it can only pick up sounds from the people with whom I am already having a conversation or who are in my immediate vicinity.”