Hacker News

Your profile states that you are blind.

I’m struggling to make sense of your story. Why would a blind user bother putting on a VR headset?



I took virtual reality in this case to mean coaxing the text model into pretending it's talking about drugs in the context of the game, not graphical VR.


I told the model that it is hooked into a virtual game, nothing more. It is text-only anyway, I think.


You do know that some people aren't totally blind, right?


Totally blind in my case, but the virtual game part was about the prompt. On the other hand, it would be interesting to see if the visual information in a virtual game could be communicated in alternative ways. If the computer had meta info about the 3D objects instead of just rendering info on how to show them, it might improve accessibility somewhat.


Also, with the rapid advances in vision language models, I would be surprised if we don't see an image-to-text-to-voice system that works with real-time video in the not-so-distant future! Like a reverse "Genie": instead of providing a prompt and having it generate a world, you provide a streaming video and it spouts relevant information when changes happen, or on demand, for instance...


It would be great to have it as a backup, but it will always be the heaviest solution in computation and the slowest to respond, so it should be the last one used.


Have you played around with the current vision features? I am pretty sure even gpt-4.1 can give you pretty good descriptions of e.g. screen captures, including being able to "read" and reproduce text.


Yes, there are multiple add-ons giving screen readers the ability to prompt AIs for image recognition. They work rather well, btw, though the value is often situational. Agentic behavior might help further, though it will need some polishing.



