Copy and paste of PDF text gives wrong text
I have a PDF with the following text: localização
When I copy the text and paste it, I get:
localizac¸ ˜ao
Any help is appreciated.
Thanks
For computer-generated documents (not OCR'd/scanned):
Some systems (LaTeX, for example)
generate composed characters because the system's font doesn't contain (or support) such a glyph in the current encoding. As a consequence, the character is generated on the fly using composed glyphs,
making 2 glyphs into one:
a + ´ -> á
Because of this 'trick', the selectable text information in the PDF contains 2 separate glyphs. Graphically, both are rendered at the same spot, so the page looks right but the copied text does not.
The quick solution:
Luckily, the generated character pairs do not occur naturally in a written paragraph (maybe they do in some language). So it is quite safe to search/replace them using a case-sensitive method. You can do it manually in your favorite text editor, or with a Python script, etc. Automated or not, the principle of the solution is the same.
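As a rough illustration of the search/replace idea, here is a minimal Python sketch. The replacement table, the function name fix_composed_glyphs, and the exact code points (U+00B8 for the cedilla, U+02DC for the tilde, U+00B4 for the acute accent) are assumptions based on the sample pasted in the question. Check what your own PDF produces (for example with ord()) and extend the table accordingly.

```python
import unicodedata

# Hypothetical table: broken sequence as pasted -> intended character.
# The code points are assumptions taken from the sample "localizac¸ ˜ao".
REPLACEMENTS = {
    "c\u00b8 \u02dca": "çã",  # "c¸ ˜a", including the stray space between the pairs
    "c\u00b8": "ç",
    "C\u00b8": "Ç",
    "\u02dca": "ã",
    "\u02dcA": "Ã",
    "\u02dco": "õ",
    "\u02dcO": "Õ",
    "\u00b4a": "á",
    "\u00b4A": "Á",
    "\u00b4e": "é",
    "\u00b4E": "É",
}


def fix_composed_glyphs(text, table=REPLACEMENTS):
    """Case-sensitive search/replace of broken accent pairs."""
    # Longest keys first, so a multi-pair sequence wins over its sub-pairs.
    for broken in sorted(table, key=len, reverse=True):
        text = text.replace(broken, table[broken])
    # Normalize so the result uses precomposed characters consistently.
    return unicodedata.normalize("NFC", text)


print(fix_composed_glyphs("localizac\u00b8 \u02dcao"))
```

The table is deliberately explicit rather than clever: each entry is a pair that should not appear in normal text, so replacing it is safe, and upper and lower case are handled by separate entries.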