
It seems to be 'fixed' now, but once I was trying to translate from Hindi to Nepali (which are actually closely related languages, think Italian and Spanish) with a simple sentence along the lines of 'Ram came', where 'Ram' is a common Hindu name (effectively: 'John'), written in Devanagari with a long vowel: राम (rām).

And I gave the Hindi input in Devanagari (राम आ गया), but still the Nepali translation ended up being the equivalent of 'the sheep came' (भेडा आयो), so somewhere along the line it seemed to be treating the name राम (rām) as equivalent to the English string 'ram' and translating accordingly.

So if the intermediate language isn't English, it certainly has some English-like properties...



This is interesting! So it's not just about grammatical patterns, but also other stuff that might be parsed as named entities.


But it's bizarre that, even if for some reason it doesn't recognise common Indian names, it just treats 'unknown strings' as English words.



