Question 1

What is the difference between EUC-KR and CP949?

Accepted Answer

EUC-KR encodes the 1987 Korean standard KS X 1001 (originally KS C 5601), which covers only 2,350 precomposed Hangul syllables. CP949 is Microsoft’s 1996 extension that adds the remaining 8,822 syllables, for 11,172 in total, and is the default Korean code page on Windows. Characters that exist in both encodings share the same bytes, but extended syllables such as 뷁 or the 똠 in 똠방 exist only in CP949. In practice, most files labelled "EUC-KR" are actually CP949.

Question 2

Is my pasted text sent to a server?

Accepted Answer

No. All recovery runs locally via the browser’s TextEncoder / TextDecoder APIs. Once the page has loaded you can disconnect from the network and the tool still works. To verify, open the DevTools Network tab and confirm there is no upload traffic while you type.

Question 3

A ZIP contains files with broken Korean names — can this tool fix all of them at once?

Accepted Answer

The current version handles one pasted string at a time. For bulk ZIP filename recovery use unar on macOS, 7z with the right charset flag, or BandiZip on Windows. A batch filename recovery tool is on the roadmap.

Question 4

Why do you show 5 candidates instead of the single best guess?

Accepted Answer

For short or doubly-mangled input the top candidates often have near-identical scores. Picking a single winner would often be wrong, so we show the top 5 and let you choose by context. The "score" combines the Hangul syllable ratio and ASCII health, and is penalised for U+FFFD replacement characters and Latin-extended noise.
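As an illustration only, a score of this shape could look like the sketch below. The weights, the character ranges, and the names scoreCandidate and topCandidates are assumptions made for the example; the FAQ names the ingredients of the score, not their values.

```javascript
// Hypothetical weights: chosen for illustration, not taken from the tool.
const WEIGHTS = { hangul: 1.0, ascii: 0.5, replacement: 2.0, latinNoise: 1.0 };

// Score one decoded candidate; higher means "looks more like real Korean text".
function scoreCandidate(text) {
  const chars = [...text]; // iterate by code point, not by UTF-16 unit
  if (chars.length === 0) return 0;
  let hangul = 0;
  let ascii = 0;
  let replacement = 0;
  let latinNoise = 0;
  for (const ch of chars) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xac00 && cp <= 0xd7a3) hangul++; // precomposed Hangul syllables
    else if (cp === 0x09 || cp === 0x0a || (cp >= 0x20 && cp <= 0x7e)) ascii++; // printable ASCII
    else if (cp === 0xfffd) replacement++; // bytes already lost
    else if (cp >= 0x80 && cp <= 0x24f) latinNoise++; // Latin-1 Supplement to Latin Extended-B
  }
  const raw =
    WEIGHTS.hangul * hangul +
    WEIGHTS.ascii * ascii -
    WEIGHTS.replacement * replacement -
    WEIGHTS.latinNoise * latinNoise;
  return raw / chars.length; // normalise so long and short inputs are comparable
}

// Rank candidates and keep the best few instead of a single winner.
function topCandidates(candidates, limit = 5) {
  return candidates
    .map((text) => ({ text, score: scoreCandidate(text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

With very short inputs several candidates can land on almost the same normalised score, which is why returning a ranked list is safer than returning only the first entry.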

Question 5

Can you recover text that already contains U+FFFD (�)?

Accepted Answer

No. U+FFFD is Unicode’s official "these bytes were dropped during decoding" marker. The original bytes do not exist anymore, so no tool can recover them. Go back one step in the pipeline (original file, original URL, original DB row) and paste from there.
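A small sketch of why this is irreversible, using only the standard TextDecoder API (the helper name decodeLossy is made up for the example): two different byte sequences collapse into the same string of replacement characters, so nothing in the output says which bytes were there.

```javascript
// Non-fatal UTF-8 decoding: every invalid sequence becomes U+FFFD.
function decodeLossy(bytes) {
  return new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
}

// EUC-KR bytes for 안 (BE C8) and 녕 (B3 E7), wrongly decoded as UTF-8.
const first = decodeLossy([0xbe, 0xc8]);
const second = decodeLossy([0xb3, 0xe7]);

// Both results are two U+FFFD characters; the original bytes are gone.
const indistinguishable = first === second;
```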

Question 6

A URL contains percent-encoded bytes like %EC%95%88%EB%85%95. How do I recover?

Accepted Answer

Paste it as-is. The tool detects %XX patterns, converts them to a byte array, and decodes as UTF-8 / EUC-KR. Double-encoded inputs like %25EC%2595... are detected and decoded twice automatically.
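The steps described above can be sketched roughly as follows. The function names and the two-pass limit are assumptions for illustration, not the tool's actual code; only TextEncoder and TextDecoder are real APIs here.

```javascript
// Turn %XX escapes into raw bytes; any other character is kept as its UTF-8 bytes.
function percentToBytes(input) {
  const utf8 = new TextEncoder();
  const bytes = [];
  for (let i = 0; i < input.length; ) {
    const hex = input.slice(i + 1, i + 3);
    if (input[i] === "%" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 3;
    } else {
      const ch = String.fromCodePoint(input.codePointAt(i));
      bytes.push(...utf8.encode(ch));
      i += ch.length;
    }
  }
  return new Uint8Array(bytes);
}

// Decode, and decode again if the result still contains %XX (double encoding).
function decodePercent(input, maxPasses = 2) {
  let text = input;
  for (let pass = 0; pass < maxPasses && /%[0-9A-Fa-f]{2}/.test(text); pass++) {
    const bytes = percentToBytes(text);
    try {
      // Strict UTF-8 first: fatal mode throws instead of emitting U+FFFD.
      text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    } catch {
      // Assumed fallback order: try EUC-KR when the bytes are not valid UTF-8.
      text = new TextDecoder("euc-kr").decode(bytes);
    }
  }
  return text;
}
```

Decoding strictly as UTF-8 before falling back matters because EUC-KR will accept many byte sequences that are really UTF-8 and produce plausible-looking but wrong Hangul.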

Question 7

Email subjects arrive as =?UTF-8?B?...?=. Does this tool decode them?

Accepted Answer

The current release does not auto-unwrap RFC 2047 encoded-word. Decode the Base64 portion after ?B? separately, then paste the resulting bytes or text here. Native encoded-word parsing is on the roadmap.
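Until then, the manual step can be done in a browser console. This is a minimal sketch that assumes a single Base64 ("B") encoded-word; the helper name decodeEncodedWordB is hypothetical, and "Q" encoding and multi-word subjects are not handled.

```javascript
// Decode one RFC 2047 encoded-word of the form =?charset?B?base64?=
function decodeEncodedWordB(word) {
  const match = /^=\?([^?]+)\?[Bb]\?([A-Za-z0-9+\/=]*)\?=$/.exec(word);
  if (!match) return null; // not a B-encoded word
  const [, charset, base64] = match;
  // atob returns a "binary string": one character per byte.
  const bytes = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
  // TextDecoder throws a RangeError if the charset label is not supported.
  return new TextDecoder(charset).decode(bytes);
}
```

For example, decodeEncodedWordB("=?UTF-8?B?7JWI64WV?=") yields 안녕, because 7JWI64WV is the Base64 form of the bytes EC 95 88 EB 85 95.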

Question 8

The recovered text still looks broken — what went wrong?

Accepted Answer

Three likely causes: (1) the mojibake chain is three or more levels deep, which is more than the two encoding-pair combinations the tool tries, (2) U+FFFD replacement characters are in the mix and bytes are already lost, or (3) the text also contains Chinese or Japanese, so the Hangul score is low. For case (1), paste the top candidate back into the input and run it through the tool once more.

Question 9

Does it work when the text also contains Chinese or Japanese?

Accepted Answer

Yes. The scoring function weights Hangul syllables the heaviest but also credits CJK unified characters. For pure Japanese mojibake a Japanese-focused tool (ftfy, or our planned Japanese mojibake fixer) will be more accurate.

Question 10

When would I use "Copy share URL"?

Accepted Answer

It builds a URL with the broken text embedded as ?q=, so you can send a coworker the exact input and they see the same candidates. Note: the input is embedded in the URL, so do not share sensitive data this way.
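A sketch of how such a link can be built and read back with the standard URL API. The function names are made up, and the base URL is whatever page hosts the tool; only the q parameter name comes from the answer above.

```javascript
// Embed the broken text in the q query parameter (percent-encoded automatically).
function buildShareUrl(baseUrl, brokenText) {
  const url = new URL(baseUrl);
  url.searchParams.set("q", brokenText);
  return url.toString();
}

// Read the shared input back; returns null when there is no q parameter.
function readSharedInput(shareUrl) {
  return new URL(shareUrl).searchParams.get("q");
}
```

Because the text travels in the query string, it can end up in browser history, server logs, and chat previews, which is the reason for the warning about sensitive data.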

Question 11

MySQL shows Korean as ì•ˆë…• — which encoding chain is this?

Accepted Answer

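Classic case: the bytes are UTF-8 (EC 95 88 EB 85 95) but the client decoded them as Latin-1. MySQL’s latin1 is really Windows-1252, which is why byte 0x95 shows up as • rather than an invisible control character. The tool’s first candidate ("UTF-8 → Latin-1") is exactly the unwind. The real fix is SET NAMES utf8mb4 or a correctly configured connection charset.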
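The unwind itself is small enough to sketch. This version assumes the misread was Windows-1252 (which is what produces • and ˆ here); the function name unwindUtf8AsCp1252 is hypothetical and this is not the tool's own implementation.

```javascript
// Windows-1252 differs from ISO-8859-1 only in 0x80-0x9F; map those characters back to bytes.
const CP1252_TO_BYTE = new Map([
  [0x20ac, 0x80], [0x201a, 0x82], [0x0192, 0x83], [0x201e, 0x84], [0x2026, 0x85],
  [0x2020, 0x86], [0x2021, 0x87], [0x02c6, 0x88], [0x2030, 0x89], [0x0160, 0x8a],
  [0x2039, 0x8b], [0x0152, 0x8c], [0x017d, 0x8e], [0x2018, 0x91], [0x2019, 0x92],
  [0x201c, 0x93], [0x201d, 0x94], [0x2022, 0x95], [0x2013, 0x96], [0x2014, 0x97],
  [0x02dc, 0x98], [0x2122, 0x99], [0x0161, 0x9a], [0x203a, 0x9b], [0x0153, 0x9c],
  [0x017e, 0x9e], [0x0178, 0x9f],
]);

// Re-encode the garbled text to its original bytes, then decode those bytes as UTF-8.
// Returns null when the input cannot be a Windows-1252 misreading of valid UTF-8.
function unwindUtf8AsCp1252(garbled) {
  const bytes = [];
  for (const ch of garbled) {
    const cp = ch.codePointAt(0);
    if (cp <= 0xff) bytes.push(cp); // same value as in ISO-8859-1
    else if (CP1252_TO_BYTE.has(cp)) bytes.push(CP1252_TO_BYTE.get(cp));
    else return null; // character does not exist in Windows-1252
  }
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(Uint8Array.from(bytes));
  } catch {
    return null; // bytes are not valid UTF-8
  }
}
```

Applied to ì•ˆë…•, the characters map back to EC 95 88 EB 85 95, which decode as 안녕.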

Question 12

Does it fix decomposed Jamo like ㅇㅏㄴㄴㅕㅇ?

Accepted Answer

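That is a Unicode normalisation issue (NFD), not byte-level mojibake, so a separate path is needed. A future version will add an NFC normalisation candidate when decomposed Jamo is detected. As a workaround, run "…".normalize("NFC") in a JavaScript console. This only recombines conjoining Jamo (U+1100 to U+11FF), which is what NFD text such as macOS filenames contains; compatibility Jamo typed one key at a time (U+3131 to U+318E) are left unchanged by NFC.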
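A short sketch of the workaround and of the detection step mentioned above, using only String.prototype.normalize. The helper names are made up for the example.

```javascript
// True when the text contains conjoining Jamo, i.e. it is probably NFD-decomposed Hangul.
function hasDecomposedJamo(text) {
  return /[\u1100-\u11FF]/.test(text);
}

// Recompose conjoining Jamo into precomposed syllables; other text is returned unchanged.
function recomposeHangul(text) {
  return hasDecomposedJamo(text) ? text.normalize("NFC") : text;
}

// Demonstration: decompose 안녕 into six Jamo, then put it back together.
const decomposed = "안녕".normalize("NFD"); // 6 code units: ᄋ ᅡ ᆫ ᄂ ᅧ ᆼ
const recomposed = recomposeHangul(decomposed); // back to 2 syllables
```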

Question 13

Is it OK to use at work?

Accepted Answer

Yes. It is safe for internal data because the text never leaves your browser. If you use the "Copy share URL" feature, note that the raw input is embedded in the URL, so do not share URLs that contain sensitive data.

Question 14

Does it work on mobile?

Accepted Answer

Yes. iOS Safari, Chrome on Android, and Samsung Internet all support TextEncoder / TextDecoder. The copy buttons use navigator.clipboard, which works on iOS 16+.

Question 15

Is it open source?

Accepted Answer

Not public yet, but the core logic lives in a single src/lib/encodingFixer.ts using standard browser APIs only. Open-sourcing is under consideration based on demand.

Frequently asked questions