Frequently asked questions

15 questions on Korean encoding mojibake and recovery.

Q1. What is the difference between EUC-KR and CP949?
EUC-KR is the 1987 Korean precomposed standard covering 2,350 Hangul syllables. CP949 is Microsoft’s 1996 extension that added 8,822 more Hangul syllables and is the default Windows Korean code page. Characters present in both encodings use the same bytes, but extended syllables such as 뷁 and 똠 exist only in CP949. Most files people call "EUC-KR" in practice are actually CP949.
Q2. Is my pasted text sent to a server?
No. All recovery runs locally via the browser’s TextEncoder / TextDecoder APIs. Once the page has loaded you can disconnect from the network and the tool still works. To verify, open the Network tab in DevTools and confirm there is zero upload traffic while you type.
Q3. A ZIP contains files with broken Korean names — can this tool fix all of them at once?
The current version handles one pasted string at a time. For bulk ZIP filename recovery use unar on macOS, 7z with the right charset flag, or BandiZip on Windows. A batch filename recovery tool is on the roadmap.
Q4. Why do you show 5 candidates instead of the single best guess?
For short or doubly-mangled input the top candidates often have near-identical scores. Pinning a single winner is often wrong in those cases, so we surface the top 5 and let you pick by context. The "score" combines the Hangul syllable ratio and ASCII health, and is penalised by U+FFFD replacement characters and Latin-extended noise.
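A minimal sketch of how a score of this shape could be computed. The function name scoreCandidate, the weights, and the code point range treated as "Latin-extended noise" are assumptions for illustration, not the tool's actual values.

```typescript
// Hypothetical scoring sketch: higher is better. Weights are assumed, not the tool's.
function scoreCandidate(text: string): number {
  if (text.length === 0) return 0;
  let hangul = 0;      // precomposed Hangul syllables, U+AC00..U+D7A3
  let ascii = 0;       // printable ASCII plus tab / LF / CR ("ASCII health")
  let replacement = 0; // U+FFFD replacement characters
  let latinNoise = 0;  // Latin-1 Supplement and Latin Extended-A/B, U+0080..U+024F
  for (let i = 0; i < text.length; i++) {
    const cp = text.charCodeAt(i);
    if (cp >= 0xac00 && cp <= 0xd7a3) hangul++;
    else if (cp === 0x09 || cp === 0x0a || cp === 0x0d || (cp >= 0x20 && cp <= 0x7e)) ascii++;
    else if (cp === 0xfffd) replacement++;
    else if (cp >= 0x80 && cp <= 0x24f) latinNoise++;
  }
  // Reward Hangul fully and ASCII partially; subtract the two penalties.
  return (hangul + 0.5 * ascii - 2 * replacement - latinNoise) / text.length;
}
```

With these assumed weights, clean Korean text scores 1, while a string of accented Latin letters or replacement characters scores below zero, so the candidates can be sorted in descending order.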
Q5. Can you recover text that already contains U+FFFD (�)?
No. U+FFFD is Unicode’s replacement character: the marker a decoder emits in place of bytes it could not decode. The original bytes no longer exist in the text, so no tool can recover them. Go back one step in the pipeline (original file, original URL, original DB row) and paste from there.
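The loss can be seen directly with the standard TextDecoder API. In this small sketch (the helper name decodeLossy is made up for the example), two different invalid bytes decode to the same character, so nothing in the output says which byte was there.

```typescript
// Decode bytes as UTF-8 in the default, non-fatal mode:
// every undecodable sequence is replaced by U+FFFD.
function decodeLossy(bytes: number[]): string {
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

const fromFF = decodeLossy([0xff]); // 0xFF can never appear in UTF-8
const fromFE = decodeLossy([0xfe]); // neither can 0xFE

// Both results are the single character U+FFFD, so the inputs are indistinguishable.
const sameOutput = fromFF === fromFE;

// Re-encoding yields the UTF-8 form of U+FFFD (EF BF BD), not the original byte.
const reencoded = new TextEncoder().encode(fromFF);
```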
Q6. A URL contains percent-encoded bytes like %EC%95%88%EB%85%95. How do I recover?
Paste it as-is. The tool detects %XX patterns, converts them to a byte array, and decodes the bytes as UTF-8 or EUC-KR. Double-encoded inputs like %25EC%2595... are detected and decoded twice automatically.
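The percent-decoding path described above can be sketched roughly as follows. The function names are hypothetical, and the order of attempts (strict UTF-8 first, EUC-KR as a fallback) and the double-encoding heuristic are assumptions rather than the tool's confirmed behaviour. Note that the "euc-kr" label of TextDecoder also accepts the CP949 extension.

```typescript
// Convert %XX escapes to bytes. Unescaped characters are assumed to be ASCII.
function percentToBytes(input: string): Uint8Array {
  const out: number[] = [];
  let i = 0;
  while (i < input.length) {
    const hex = input.slice(i + 1, i + 3);
    if (input[i] === "%" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      out.push(parseInt(hex, 16));
      i += 3;
    } else {
      out.push(input.charCodeAt(i) & 0xff);
      i += 1;
    }
  }
  return new Uint8Array(out);
}

function decodePercentEncoded(input: string): string {
  // Heuristic: "%25" followed by two hex digits looks like a doubly encoded "%XX".
  const unwrapped = input.replace(/%25([0-9A-Fa-f]{2})/g, "%$1");
  const bytes = percentToBytes(unwrapped);
  try {
    // fatal: true makes invalid UTF-8 throw instead of producing U+FFFD.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return new TextDecoder("euc-kr").decode(bytes);
  }
}
```

Under these assumptions, decodePercentEncoded("%EC%95%88%EB%85%95") and the double-encoded form %25EC%2595%2588%25EB%2585%2595 both give 안녕, and so does the EUC-KR form %BE%C8%B3%E7.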
Q7. Email subjects arrive as =?UTF-8?B?...?=. Does this tool decode them?
The current release does not auto-unwrap RFC 2047 encoded-words. Decode the Base64 portion after ?B? separately, then paste the resulting bytes or text here. Native encoded-word parsing is on the roadmap.
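Until then, the manual step can be done with a few lines in a browser console. This is a minimal sketch for a single B-encoded word only: the function name is hypothetical, and Q-encoding, multiple adjacent encoded-words, and language suffixes are not handled.

```typescript
// Decode one RFC 2047 "B" encoded-word of the form =?charset?B?base64?=
function decodeEncodedWordB(word: string): string {
  const match = /^=\?([^?]+)\?[Bb]\?([A-Za-z0-9+\/=]*)\?=$/.exec(word);
  if (match === null) throw new Error("not a B-encoded word: " + word);
  const charset = match[1];
  // atob returns a "binary string": one character per decoded byte.
  const binary = atob(match[2]);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  // Throws a RangeError if the charset label is unknown to TextDecoder.
  return new TextDecoder(charset).decode(bytes);
}
```

For example, decodeEncodedWordB("=?UTF-8?B?7JWI64WV?=") returns 안녕. If the declared charset is itself wrong, paste the decoded result into the tool as usual.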
Q8. The recovered text still looks broken — what went wrong?
Three likely causes: (1) the mojibake chain is three or more levels deep, so the two encoding-pair steps the tool tries are not enough, (2) U+FFFD replacement characters are in the mix and the original bytes are already lost, or (3) the text also contains Chinese or Japanese, which lowers the Hangul score. For case (1), paste the top candidate back into the input and run it through the tool once more.
Q9. Does it work when the text also contains Chinese or Japanese?
Yes. The scoring function weights Hangul syllables most heavily but also credits CJK Unified Ideographs. For pure Japanese mojibake, another tool (such as ftfy, or our planned Japanese mojibake fixer) will likely be more accurate.
Q10. When would I use "Copy share URL"?
It builds a URL with the broken text embedded as ?q=, so you can send a coworker the exact input and they see the same candidates. Note: the input is embedded in the URL, so do not share sensitive data this way.
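A sketch of how such a link could be built with the standard URL API. The function name and the example base URL are placeholders, not the tool's real address; only the q parameter comes from the description above.

```typescript
// Build a share link that carries the broken text in the ?q= parameter.
function buildShareUrl(baseUrl: string, brokenText: string): string {
  const url = new URL(baseUrl);
  // searchParams.set percent-encodes the value, so any text survives the round trip.
  url.searchParams.set("q", brokenText);
  return url.toString();
}

// The recipient's page can read the exact input back.
function readSharedText(shareUrl: string): string | null {
  return new URL(shareUrl).searchParams.get("q");
}
```

Because the full input is readable from the link itself, anyone who sees the URL (chat logs, browser history, proxies) sees the text.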
Q11. MySQL shows 안녕 as ì•ˆë…•. Which encoding chain is this?
Classic case: the bytes are UTF-8 (EC 95 88 EB 85 95) but the client decoded them as Latin-1. The tool’s first candidate ("UTF-8 → Latin-1") reverses exactly this step. The lasting fix is SET NAMES utf8mb4 or a correct connection charset, so the data is not mis-decoded again.
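The unwind for this case can be sketched as: map each character back to the single byte it came from, then decode those bytes as UTF-8. The function names are hypothetical. Because MySQL's latin1 is really Windows-1252, the sketch also maps the Windows-1252 characters for bytes 0x80 to 0x9F (such as • and ˆ) back to their byte values.

```typescript
// Windows-1252 characters for bytes 0x80..0x9F; "\uFFFF" marks the five unassigned bytes.
const CP1252_HIGH =
  "\u20AC\uFFFF\u201A\u0192\u201E\u2026\u2020\u2021\u02C6\u2030\u0160\u2039\u0152\uFFFF\u017D\uFFFF" +
  "\uFFFF\u2018\u2019\u201C\u201D\u2022\u2013\u2014\u02DC\u2122\u0161\u203A\u0153\uFFFF\u017E\u0178";

function charToByte(ch: string): number {
  const code = ch.charCodeAt(0);
  if (code <= 0xff) return code; // ASCII, Latin-1, and raw C1 control characters
  const index = CP1252_HIGH.indexOf(ch);
  if (index >= 0 && ch !== "\uFFFF") return 0x80 + index;
  throw new Error("not a Latin-1 / Windows-1252 character: U+" + code.toString(16));
}

// Undo "UTF-8 bytes decoded as Latin-1": rebuild the bytes, then decode them as UTF-8.
function unwindUtf8ViaLatin1(mojibake: string): string {
  const bytes = new Uint8Array(mojibake.length);
  for (let i = 0; i < mojibake.length; i++) {
    bytes[i] = charToByte(mojibake.charAt(i));
  }
  // fatal: true rejects input that was not UTF-8 underneath.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}
```

Applied to the six characters ì•ˆë…•, this rebuilds the bytes EC 95 88 EB 85 95 and returns 안녕. Input that contains characters outside Latin-1 / Windows-1252 throws, which signals that a different chain is involved.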
Q12. Does it fix decomposed Jamo like ㅇㅏㄴㄴㅕㅇ?
That is a Unicode normalisation issue (NFD), not byte-level mojibake, so a separate path is needed. A future version will add an NFC normalisation candidate when decomposed Jamo is detected. As a workaround run "…".normalize("NFC") in a JavaScript console.
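A short illustration of the workaround, using the standard String.prototype.normalize method (the wrapper name composeHangul is made up for the example). One caveat worth knowing: NFC recomposes conjoining Jamo (the U+1100 block, which is what NFD produces), but it leaves Hangul Compatibility Jamo (the U+3130 block, typically typed one key at a time) unchanged.

```typescript
// Recompose decomposed Hangul into precomposed syllables.
function composeHangul(text: string): string {
  return text.normalize("NFC");
}

// NFD form of 안녕: six conjoining Jamo code points.
const decomposed = "\u110B\u1161\u11AB\u1102\u1167\u11BC";
const composed = composeHangul(decomposed); // two syllables: U+C548 U+B155

// Compatibility Jamo ㅇㅏㄴ (U+3147 U+314F U+3134) are not affected by NFC.
const compatibility = "\u3147\u314F\u3134";
const stillSeparate = composeHangul(compatibility) === compatibility;
```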
Q13. Is it OK to use at work?
Safe for internal data because the text never leaves your browser. If you use the "Copy share URL" feature note that the raw input is embedded in the URL — do not share URLs that contain sensitive data.
Q14. Does it work on mobile?
Yes. iOS Safari, Chrome on Android, and Samsung Internet all support TextEncoder / TextDecoder. The copy buttons use navigator.clipboard, which works on iOS 16+.
Q15. Is it open source?
Not public yet, but the core logic lives in a single src/lib/encodingFixer.ts using standard browser APIs only. Open-sourcing is under consideration based on demand.