This is an old post that I queued up and never published. Have at it.
What's the worst thing about this pandemic? Other than hundreds of thousands of people dead. And pervasive economic hardship. And the utter, casual, banal evil demonstrated by about half the country? You know what, forget the question.
Conversing through a mask sucks. Your voice is muffled, and nobody can read lips anymore. To address that last issue, I've been working on a project I'm calling "Verbal Supplement".
Verbal Supplement is a device you wear over your face mask that displays a transcription of your speech. It's real-life subtitles. It comprises three subprojects: the hardware, the embedded software, and a Flutter phone app.
HARDWARE
The hardware is an M5StickC (paid link. As an Amazon Associate I earn from qualifying purchases.) in a 3D-printed case. The case provides attachment points for elastics that hold it on your face, positioned so it doesn't interfere with your speech. It sits between my upper lip and my nose. Unfortunately, the screen on the M5StickC is quite small; in practice, it can only display about 6 words at a time ¯\_(ツ)_/¯. I've got hardware with a larger screen coming, so I expect to address that in the near future.
EMBEDDED SOFTWARE
The Verbal Supplement embedded software does two main things: Bluetooth Low Energy (BLE) connectivity and text display. On startup it sets up a standard BLE serial service as well as a battery level service (currently unused), then begins advertising. Once the phone app connects to the serial service, any incoming text is piped straight to the screen. I used a wonderful scrolling text library for the text display. There's also a bit of power management for sleep and screen brightness. You can see the code here.
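The real code is in the repo linked above, but here's a minimal sketch of that flow. It's a reconstruction, not the actual source: I'm assuming the "standard BLE serial service" is the Nordic UART Service (the usual convention for BLE serial), substituting a plain print for the scrolling-text library, and using a placeholder device name.

```cpp
#include <M5StickC.h>
#include <BLEDevice.h>
#include <BLEServer.h>
#include <BLEUtils.h>

// Nordic UART Service UUIDs, the de facto "BLE serial" convention.
#define NUS_SERVICE_UUID "6E400001-B5A3-F393-E0A9-E50E24DCCA9E"
#define NUS_RX_CHAR_UUID "6E400002-B5A3-F393-E0A9-E50E24DCCA9E" // phone -> device

// Pipe any text the phone writes straight to the LCD.
class RxCallbacks : public BLECharacteristicCallbacks {
  void onWrite(BLECharacteristic* chr) override {
    std::string text = chr->getValue();
    M5.Lcd.fillScreen(BLACK);
    M5.Lcd.setCursor(0, 0);
    M5.Lcd.print(text.c_str()); // the real project scrolls instead
  }
};

void setup() {
  M5.begin();
  M5.Lcd.setRotation(3); // landscape; the screen is only 160x80
  M5.Lcd.setTextSize(2);

  BLEDevice::init("VerbalSupplement"); // placeholder advertised name
  BLEServer* server = BLEDevice::createServer();

  // "Serial" service: the phone writes recognized text to RX.
  BLEService* nus = server->createService(NUS_SERVICE_UUID);
  BLECharacteristic* rx = nus->createCharacteristic(
      NUS_RX_CHAR_UUID, BLECharacteristic::PROPERTY_WRITE);
  rx->setCallbacks(new RxCallbacks());
  nus->start();

  // Battery level service (0x180F), set up but currently unused.
  BLEService* battery = server->createService(BLEUUID((uint16_t)0x180F));
  battery->createCharacteristic(BLEUUID((uint16_t)0x2A19),
                                BLECharacteristic::PROPERTY_READ);
  battery->start();

  // Advertise the serial service so the phone app can scan for it.
  BLEAdvertising* adv = BLEDevice::getAdvertising();
  adv->addServiceUUID(NUS_SERVICE_UUID);
  adv->start();
}

void loop() {
  M5.update();
  delay(10);
}
```

Advertising the service UUID is what lets the phone app filter its scan down to just this device instead of everything broadcasting nearby.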
FLUTTER APP
For the phone side, I used Flutter, simply because I wanted to build something with it. I basically hacked together the demo apps from the flutter-reactive-ble and speech-to-text libraries, so I definitely need to revisit the app architecture in the future. On first run, the app scans for nearby BLE devices advertising the serial service and lets the user select one. Once a device is selected, it connects, starts a speech-to-text listening loop, and sends any recognized text over the serial channel. There is no attempt to encrypt or otherwise protect this data, which is probably fine, since it's the same data the user is speaking out loud, intending to be heard. Sending it to Google/Apple for recognition is less than ideal, though. I recently came across Vosk, which apparently supports both Android and iOS and offers local, continuous speech recognition, at least on Android.
TASKER
To make it easier to use in the field, I created a Tasker profile that launches the Flutter app when it detects the M5StickC nearby. That way, I don't have to find and open the app, or even take my phone out of my pocket. Just turn on the M5StickC, and in a second or two the phone will detect it and open the app, which automatically connects and starts listening.
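Roughly, the profile boils down to something like this (sketched in Tasker's description style; the names here are illustrative, and the exact "BT Near" options depend on your Tasker version):

```
Profile: Verbal Supplement Near
  State: BT Near [ Name: VerbalSupplement* ]

Enter Task: Start Verbal Supplement
  1. Launch App [ App: Verbal Supplement ]
```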
FUTURE
I'd like to replace the naive listening loop with a sound-threshold-triggered mechanism, perhaps using the M5StickC microphone for its proximity to the user's speech (sketched below). Of course, increasing the screen size will make it far more practical as well. And I'd like to add some tactile feedback, since the user usually won't be able to see the screens while actually using the app.
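Here's a rough sketch of that threshold idea on the device side: read the built-in PDM microphone over I2S (clock on GPIO 0, data on GPIO 34, per the M5StickC mic examples) and compare each window's RMS level to a tunable cutoff. The threshold value and the action taken on trigger are placeholders.

```cpp
#include <M5StickC.h>
#include <driver/i2s.h>

static const int kSampleRate = 16000;
static const float kRmsThreshold = 600.0f; // placeholder; tune empirically

void micSetup() {
  // SPM1423 PDM mic: clock on GPIO 0, data on GPIO 34.
  i2s_config_t cfg = {};
  cfg.mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX | I2S_MODE_PDM);
  cfg.sample_rate = kSampleRate;
  cfg.bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT;
  cfg.channel_format = I2S_CHANNEL_FMT_ALL_RIGHT;
  cfg.communication_format = I2S_COMM_FORMAT_I2S;
  cfg.intr_alloc_flags = ESP_INTR_FLAG_LEVEL1;
  cfg.dma_buf_count = 2;
  cfg.dma_buf_len = 128;

  i2s_pin_config_t pins = {};
  pins.bck_io_num = I2S_PIN_NO_CHANGE;
  pins.ws_io_num = 0;
  pins.data_out_num = I2S_PIN_NO_CHANGE;
  pins.data_in_num = 34;

  i2s_driver_install(I2S_NUM_0, &cfg, 0, NULL);
  i2s_set_pin(I2S_NUM_0, &pins);
}

void setup() {
  M5.begin();
  micSetup();
}

void loop() {
  int16_t samples[256];
  size_t bytesRead = 0;
  i2s_read(I2S_NUM_0, samples, sizeof(samples), &bytesRead, portMAX_DELAY);

  // RMS over the window as a crude loudness measure.
  float sumSq = 0;
  int n = bytesRead / sizeof(int16_t);
  for (int i = 0; i < n; i++) sumSq += (float)samples[i] * samples[i];
  float rms = sqrtf(sumSq / n);

  if (rms > kRmsThreshold) {
    // Probably speech: e.g., signal the phone to start/keep listening.
  }
}
```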
Making use of the battery service would also be a nice-to-have.
I'd love to test it with an iPhone, just to see whether the supposed platform-independence benefits of Flutter materialize, but I haven't yet looked into what that would require. I'm not going to pay Apple for the privilege of running my own software on a device I own.