You can use ISO encoding: LaTeX goes back to Donald E. ISO cannot represent Arabic text. ThorstenStaerk: I don't think it does. Try Vim: if you have Vim, you can use this: On a Linux system, it should be the case as well. The man page says this:
LaTeX/Export To Other Formats
If you are a non-English speaker, LaTeX can be configured to typeset in your language. If you want the plain text to go to a file, use. Some introductions to pdf2htmlEX can be found on its own wiki page. Here you will find sections about different formats, with descriptions of how to produce each one. Try different resolutions to fit your needs, but dpi should be enough. You can convert a string to bytes and vice versa, given an encoding.
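As a minimal sketch of the string-to-bytes round trip mentioned above, here it is in Python. The sample text and the helper name `can_encode` are illustrative assumptions, and `latin-1` (ISO-8859-1) is assumed to be the ISO encoding in question; the source does not say which one is meant.

```python
# Encode a string to bytes and decode it back, naming the encoding explicitly.
text = "Grüße"  # sample text, chosen for illustration

utf8_bytes = text.encode("utf-8")      # ü and ß take two bytes each in UTF-8
latin1_bytes = text.encode("latin-1")  # one byte per character in ISO-8859-1

# Decoding with the same encoding that produced the bytes restores the text.
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("latin-1") == text


def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character of s is representable in encoding.

    Hypothetical helper, not part of any library.
    """
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False
```

Under that assumption, `can_encode("مرحبا", "latin-1")` is false while `can_encode("مرحبا", "utf-8")` is true, which would match the remark that the ISO encoding cannot represent Arabic text.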
How do I change the encoding of my files? - TeX - LaTeX Stack Exchange
This, however, will not generally have sufficient resolution for whole pages or large areas. I think that Joseph is right. A solution that worked for me, in Linux Ubuntu. Auto-detection is based purely on the file itself, with no checking for inputenc or similar, so Emacs may be a better choice in some cases. If you want to get from your mangled text back to the original, you simply have to reverse the incorrect encoding process.
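Reversing an incorrect encoding step could look like the following Python sketch. It assumes the common case where UTF-8 bytes were mistakenly decoded as Latin-1; the function name `unmangle` and its defaults are hypothetical, and the actual pair of encodings has to be found by trial for each file.

```python
def unmangle(mangled: str, wrong: str = "latin-1", right: str = "utf-8") -> str:
    """Undo text whose `right`-encoded bytes were mistakenly decoded as `wrong`.

    Re-encoding with the wrong codec recovers the original bytes, which are
    then decoded with the codec that should have been used in the first place.
    """
    return mangled.encode(wrong).decode(right)


original = "café"
# Simulate the mistake: UTF-8 bytes read as if they were Latin-1.
mangled = original.encode("utf-8").decode("latin-1")

# Reversing the mistake restores the original text.
assert unmangle(mangled) == original
```

This only works when the wrong decoding was lossless. If characters were already replaced by `?` or U+FFFD along the way, the original bytes cannot be recovered this way.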
Description: Good try, but you should actually test the answer. So I had to try out different encoding conversions. On the other hand, bulk changes sound like a question for SuperUser. The website doesn't convert anything; your string was fine to begin with, and only the HTML needed to be unescaped.
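For the last point, where the text is correctly encoded and only contains HTML character references, a sketch using the Python standard library's `html.unescape` might be enough. The sample string below is made up for illustration.

```python
import html

# The text itself is fine; it just contains HTML character references.
escaped = "caf&eacute; &amp; cr&egrave;me"  # illustrative input

# html.unescape replaces named and numeric references with the real characters.
plain = html.unescape(escaped)
```

No encoding conversion is involved here: the input and the output are both ordinary Unicode strings.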