Consider this file, testerrorxetex.tex:
\font\1="[latinmodern-math.otf]"
\1
\message{𝒞}
\undefined 1𝒞
\bye
Running xetex testerrorxetex produces the following console output:
$ xetex testerrorxetex
This is XeTeX, Version 3.14159265-2.6-0.99999 (TeX Live 2018) (preloaded format=xetex)
restricted \write18 enabled.
entering extended mode
(./testerrorxetex.tex 𝒞
! Undefined control sequence.
l.4 \undefined
1풞
? R
OK, entering \nonstopmode...
[1] )
(see the transcript file for additional information)
Output written on testerrorxetex.pdf (1 page).
Transcript written on testerrorxetex.log.
The log file also contains U+D49E in place of U+1D49E. The MWE uses 1𝒞 to make sure the issue is not limited to the case where the non-BMP character immediately follows the undefined control sequence. The \message{𝒞} is there to show that the problem appears to be specific to error messages generated by (Xe)TeX.
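To make the symptom concrete, here is a small Python sketch of the code point arithmetic. It only shows that the character seen in the error context (U+D49E) equals the expected character (U+1D49E) with everything above the low 16 bits dropped. The helper name low16 is hypothetical, and this is one way to read the observed output, not a diagnosis of what XeTeX does internally.

```python
# Illustration only: compares the expected and observed characters.
# It makes no claim about XeTeX's internal code paths.

expected = "\U0001D49E"  # 𝒞, as typed in the source file
observed = "\uD49E"      # 풞, as shown in the error context and the log


def low16(ch: str) -> str:
    """Return the character whose code point is the low 16 bits of ch (hypothetical helper)."""
    return chr(ord(ch) & 0xFFFF)


# The observed character matches the expected one truncated to 16 bits.
print(f"U+{ord(expected):04X} -> U+{ord(low16(expected)):04X}")
print("matches observed:", low16(expected) == observed)

# The UTF-8 encodings differ in length and in their leading bytes,
# so the change is not simply one byte missing from the sequence.
print(expected.encode("utf-8").hex(), "vs", observed.encode("utf-8").hex())
```

The UTF-8 forms are f0 9d 92 9e (four bytes) and ed 92 9e (three bytes): the trailing two bytes agree, but the leading bytes differ, which is consistent with the truncation happening on the code point rather than on the byte stream.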
This looks somewhat reminiscent of #146 and perhaps #80, but those tickets appear to have been resolved, so this seems to be something else (I searched the bug database and did not find a duplicate of my report).
This disrupts preview-latex (used in Emacs + AUCTeX), which creates artificial errors when generating previews (typically for math formulas); the problem first arose as an AUCTeX bug report.
Anonymous
Sorry about my not-so-good title; the exact situation is
so "losing a byte" is not a correct description.
This does look a lot like issue #146; perhaps the fix there was not correct/sufficient.
It does indeed; I was worried that my XeTeX from TL2018 might not include the fix for #146, but since I could not reproduce the #146 issue locally, I opened this ticket.