Looking Forward Week 2

Joint reflection from Sean Zhu and Stuti Mohgaonkar:

In today's class, Gus shared many devices that showcase the extent of the capabilities of current assistive devices for low-vision users. Here are some of our thoughts.

There are many devices, and the devices are expensive. Each device costs about as much as a low-end smartphone or tablet, and Gus had so many of them that the total cost is far higher. During class, he demoed a device that senses whether there is light in the direction it is pointed, and a separate device that senses the color of the object it is pointed at. Couldn't these two functions be built into one device? Even better, couldn't they just be phone apps? Not only would that save money, but it would also save the hassle of fumbling around looking for the right device.
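
To make the phone-app idea concrete, here is a rough sketch (not a production design) of how a single web app could cover both functions, assuming a browser that grants camera access and supports speech output; the sampling patch, brightness threshold, and color names are our own illustrative choices:

```typescript
// A minimal sketch of a phone web app that folds the light detector and the
// color identifier into one tool. Assumes a browser with camera access
// (getUserMedia) and speech output (speechSynthesis); the 10x10 sample patch,
// brightness threshold, and color names are made up for illustration.

async function describeCenterOfView(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: "environment" },
  });
  const video = document.createElement("video");
  video.srcObject = stream;
  await video.play();

  // Grab one frame and sample a small patch at its center.
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d")!;
  ctx.drawImage(video, 0, 0);
  const patch = ctx.getImageData(canvas.width / 2 - 5, canvas.height / 2 - 5, 10, 10).data;

  // Average the patch to get a representative color and brightness.
  let r = 0, g = 0, b = 0;
  const pixels = patch.length / 4;
  for (let i = 0; i < patch.length; i += 4) {
    r += patch[i];
    g += patch[i + 1];
    b += patch[i + 2];
  }
  r /= pixels; g /= pixels; b /= pixels;
  const brightness = (r + g + b) / 3;

  // "Light detector" half: is the phone pointed at something bright?
  const lightReport = brightness > 128 ? "bright" : "dark";
  // "Color identifier" half: a crude nearest-primary guess.
  const colorReport = r > g && r > b ? "reddish" : g > b ? "greenish" : "bluish";

  speechSynthesis.speak(new SpeechSynthesisUtterance(`It looks ${lightReport} and ${colorReport}.`));
  stream.getTracks().forEach((track) => track.stop());
}
```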

Tactile/haptic feedback is important. Gus was able to use the devices without looking at them, and he did so with remarkable accuracy. At first glance, it appears that the devices give only audio feedback. But just as importantly, they give tactile feedback that helps the user hold the device properly and find the right buttons to push! In assistive software, on the other hand, tactile feedback is almost nonexistent, save for notification vibrations and long-press haptics.

We imagine that adding haptic feedback to assistive software could be very useful. For example, Gus showed a screen reader that, when reading a link in a web browser, said "google.com, visited". Both parts of the phrase are conveyed as words, but for sighted users the "visited" part is conveyed by coloring the link purple. Maybe for low-vision users, "visited" could likewise be conveyed without words, either through haptic feedback or a sound effect, so that the word "visited" doesn't drown out the other content.
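
As a rough sketch of the idea in a web setting: the content is spoken, while the visited state becomes a short vibration. The LinkInfo shape and announceLink function below are hypothetical, not any real screen reader's API; navigator.vibrate and speechSynthesis are real browser APIs, though vibration support is mostly limited to Android browsers.

```typescript
// Hypothetical announcer: speak the link text, but convey "visited" through a
// short vibration instead of a spoken word. A real screen reader already knows
// the visited state; here it is simply passed in.

interface LinkInfo {
  label: string;    // e.g. "google.com"
  visited: boolean; // whether the user has followed this link before
}

function announceLink(link: LinkInfo): void {
  // Speak only the content, so metadata doesn't compete with it verbally.
  speechSynthesis.speak(new SpeechSynthesisUtterance(link.label));

  // Convey "visited" non-verbally via the Web Vibration API (best-effort;
  // mainly supported on Android browsers).
  if (link.visited && "vibrate" in navigator) {
    navigator.vibrate(30); // one short pulse: "you've been here before"
  }
}

// The link Gus's screen reader read aloud as "google.com, visited":
announceLink({ label: "google.com", visited: true });
```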

We noticed that voice input is far from perfect. At this point, interacting with voice assistants and having them interpret us correctly doesn't come naturally. We have to speak in an unnatural, newscaster-like cadence to get them to understand us.

The voice output of devices is also hard to parse. It was overwhelming to hear all the robotic voices from different devices one after the other — they all sounded pretty much the same!

Location information matters. Sight is a directional sense: we can see which direction light is coming from, and we can process a huge amount of that information quickly. When Gus demoed the product scanner, he had to reorient the product many times, by trial and error, until the device was able to scan the barcode. There has to be a better way to do this, right?
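
One possibility, sketched below, is for the scanner to guide the user toward the barcode rather than staying silent until it happens to line up. This assumes the experimental BarcodeDetector interface from the Shape Detection API (currently limited to Chromium-based browsers); the format list and spoken hints are purely illustrative.

```typescript
// Sketch of "guided" barcode scanning: report where the barcode sits in the
// camera frame so the user can center it, instead of failing silently.

// Minimal declaration for the experimental BarcodeDetector (Shape Detection
// API); the real interface exposes more fields than we use here.
declare class BarcodeDetector {
  constructor(options?: { formats: string[] });
  detect(source: CanvasImageSource): Promise<Array<{ rawValue: string; boundingBox: DOMRectReadOnly }>>;
}

async function guideTowardBarcode(video: HTMLVideoElement): Promise<void> {
  const detector = new BarcodeDetector({ formats: ["ean_13", "upc_a"] });
  const barcodes = await detector.detect(video);

  if (barcodes.length === 0) {
    speechSynthesis.speak(new SpeechSynthesisUtterance("No barcode in view. Keep turning the package."));
    return;
  }

  // Describe where in the frame the barcode appears so the user can center it.
  const box = barcodes[0].boundingBox;
  const centerX = box.x + box.width / 2;
  const hint =
    centerX < video.videoWidth / 3 ? "Barcode near the left edge; shift it toward the center."
    : centerX > (2 * video.videoWidth) / 3 ? "Barcode near the right edge; shift it toward the center."
    : `Barcode centered: ${barcodes[0].rawValue}`;
  speechSynthesis.speak(new SpeechSynthesisUtterance(hint));
}
```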