Case Proposal: The service collects data from your voice when you communicate with a service verbally

I propose the following data to be a new case:

Fields Data
Name The service collects data from your voice when you communicate with a service verbally
Description The service collects data and recordings of your voice when you communicate with a service verbally and may use them to improve its services or for commercial purposes.
Classification blocker
Topic Topic Types of Information Collected (ToS;DR Phoenix)
Weight 50

Should it be separated from Case 397: Your biometric data is collected?

There aren’t many points linked to Case 397, so I think creating a whole new case for this kind of biometric collection will clutter the case list and make it harder for reviewers to find the right case to link a quote to.


For me, biometric data is more like fingerprints and facial recognition.

My argument is mainly that big companies make voice assistants available and collect voice data from their users.

This applies especially to Siri, Bixby, Alexa, Google Assistant, etc.

And I think it’s important to distinguish between biometric data and voice data, which can sometimes be very sensitive.

What do you think about it?

I agree that voice is not biometric data. Didn’t we also want to move away from the phrase “the service”?


For me, whether it is “this service” or “the service” doesn’t change much, as I try wherever possible to change the titles by putting the name of the service in instead.

The discussion was about moving away from that phrasing to make it cleaner and easier to read.

“Data about your voice is collected” - Not only is it shorter, it’s much easier to read.

Note: The case names appear in the extension so we don’t want to clutter it.



I agree that “Data about your voice is collected” is much easier to read.


I am raising the subject again with regard to this case.

Do we add it?


Maybe we should wait a few days for other opinions to be expressed.


Voice prints are totally biometric data. However, I get the sense the proposal isn’t asking about [re]identification, but rather about ‘ownership’ of the voice data? Some companies have been caught generating TTS systems using consumer data from collection occurrences they didn’t notify people of, for instance.




Yes, I think that in order to keep a basis for this proposal, we could change the title to
"Your voice data can be used for many purposes"


To reopen the subject, I found this in the Duolingo privacy policy:

“To recognize speech your audio may be sent to a third party provider such as Google, Apple, or Amazon Web Services.”


Isn’t the main issue that it’s the service’s business model? Assigning a blocker to a language learning service that collects voice data to function doesn’t seem right to me.


No, but for me, users must know that when they speak, their voice data is processed by Google, Amazon or Apple.


You forgot the important bit in the privacy policy though:

“We may ask you to allow Duolingo to collect and analyze your speech data to help us understand the effectiveness of our lessons, and to improve the product.”

This is what qualifies it as a blocker. It’s all about the context.


To add to this, we have case-394 which states “that it makes sense for the service” and is neutral.

We have to keep that in mind as well. That’s why I think case-400 is wrong too, as Google Maps or any geolocation provider would get a worse grade just because they exist.


Case 400 is supposed to be only assigned to services that don’t rely on geo-location, according to its description:

“Unless the service relies on Geo Location, this case is to be assigned to points that don’t need your GPS coordinates to function properly.”


Or we can change the name to “Voice data is processed by third parties” and classify it as bad rather than blocker.

PS: Duolingo is similar to Quizlet, and the speaking part is not at all necessary for learning. So in my opinion, it is important to warn users that their voice is being processed by web giants known for their unethical practices.