Ticket #2283 (closed defect: fixed)

Opened 6 years ago

Last modified 16 months ago

mcview scrolling issues with heavy utf-8 files

Reported by: egmont Owned by:
Priority: major Milestone: 4.8.14
Component: mcview Version: 4.7.3
Keywords: Cc:
Blocked By: #2132 Blocking:
Branch state: no branch Votes for changeset:

Description (last modified by andrew_b)

wget http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
mcview UTF-8-demo.txt
Scroll up and down with the arrow (or pgup/pgdn) keys. Notice that a partial line very often appears on the top row, and you have to press the arrow key two or more times to actually scroll by one line.
This happens when the topmost line that you're scrolling in or out contains lots of non-ascii characters. More precisely, I believe this occurs exactly when the number of bytes forming the topmost row is bigger than the terminal's width.
Reproducible in both 4.7.3 and 4.7.0.7, in a fully UTF-8 environment.
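
The hypothesis above (a row fits on screen, yet its byte count exceeds the terminal width) can be checked against any file with a short script. This is only a rough sketch, not mc code: `display_width` and `suspicious_lines` are hypothetical helper names, and the width estimate relies on Python's `unicodedata` tables, which may differ from what a given terminal actually renders.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns that text occupies."""
    width = 0
    for ch in text:
        # Combining marks and format characters take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian Wide/Fullwidth characters take two columns (assumption:
        # ambiguous-width characters are rendered as one column).
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def suspicious_lines(lines, terminal_width):
    """Yield (index, byte_length, columns) for lines that fit on one row
    but whose UTF-8 byte length is larger than the terminal width."""
    for index, line in enumerate(lines):
        byte_length = len(line.encode("utf-8"))
        columns = display_width(line)
        if columns <= terminal_width < byte_length:
            yield index, byte_length, columns
```

For example, `list(suspicious_lines(open("UTF-8-demo.txt", encoding="utf-8").read().splitlines(), 80))` would list the lines of the demo file that match the condition for an 80-column terminal.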

Attachments

mcview-utf8-scroll.png (40.9 KB) - added by egmont 6 years ago.
Screenshot - though experiencing the behavior is much more useful

Change History

Changed 6 years ago by egmont

Screenshot - though experiencing the behavior is much more useful

comment:1 Changed 6 years ago by egmont

Note: the bug only happens when word wrapping is enabled (that is, you see 2UnWrap in the button bar), and it happens even though the terminal is wider than the file.

comment:2 follow-up: ↓ 3 Changed 6 years ago by egmont

I'm looking at mc-4.7.0.7. Here the bug is in src/viewer/move.c, in the view->text_wrap_mode branches of the mcview_move_up() and mcview_move_down() functions. The logic that modifies col (e.g. col += width, col -= width, etc.) assumes that width and byte length are the same notion (because col actually means an offset in the file), and hence does not handle UTF-8 or CJK (double-width) characters correctly.

I don't see what the best solution would be; someone more familiar with the utf8/width functions of mc could probably fix it much faster than I could.
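
To illustrate the mismatch described in this comment, here is a minimal sketch of advancing a file offset by one visual row while counting display columns rather than bytes. It is not the fix that went into mc: `char_width` and `next_row_offset` are hypothetical names, the input is assumed to be valid UTF-8 with the offset on a character boundary, and the width rules are the approximate ones from Python's `unicodedata` module.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Approximate column width of one character (0, 1 or 2)."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def next_row_offset(data: bytes, offset: int, width: int) -> int:
    """Return the byte offset at which the next wrapped row starts.

    Assumes data is valid UTF-8, offset lies on a character boundary,
    and width is at least 2 so that any single character fits on a row.
    """
    # Decoding the whole tail keeps the sketch short; a real viewer
    # would decode incrementally.
    text = data[offset:].decode("utf-8")
    columns = 0
    for ch in text:
        if ch == "\n":
            return offset + 1  # the newline ends this row
        w = char_width(ch)
        if columns + w > width:
            break  # this character starts the next row
        columns += w
        offset += len(ch.encode("utf-8"))
    return offset
```

With a row of ten "é" characters (20 bytes) and a width of 4, the naive `offset + width` lands after two characters, while `next_row_offset` advances past four characters (8 bytes), which is where the next visual row really begins.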

comment:3 in reply to: ↑ 2 Changed 6 years ago by andrew_b

  • Blocked By 2132 added

Replying to egmont:

The logic that modifies col (e.g. col += width, col -= width, etc.) assumes that width and byte length are the same notion (because col actually means an offset in the file), and hence does not handle UTF-8 or CJK (double-width) characters correctly.

Yes, this is a known issue. At least ticket #2132 requires such a fix.

comment:4 Changed 6 years ago by andrew_b

  • Component changed from mc-core to mcview
  • Description modified (diff)

comment:5 Changed 5 years ago by andrew_b

  • Branch state set to no branch
  • Milestone changed from 4.7 to Future Releases

comment:6 Changed 22 months ago by egmont

Similar scrolling issues are also reproducible with:

  • no UTF-8 but nroff formatting (e.g. an English manual page) [same underlying cause]
  • no UTF-8 and no nroff either (just plain ASCII text) and a line that's exactly as wide as the window [the top line becomes empty but doesn't scroll out; probably an off-by-one somewhere]
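
Both cases in the list above can be illustrated with a small sketch. The first function shows why nroff formatting has the same underlying cause: overstrike sequences (a character, a backspace, then a character) add bytes without adding columns. The second shows one way an off-by-one can arise for a line exactly as wide as the window. This is a guess at the kind of mistake involved, not a diagnosis of mc's code, and both function names are hypothetical.

```python
def nroff_visible_width(line: str) -> int:
    """Columns occupied once backspace overstrikes are applied.

    nroff renders bold as char, backspace, char and underline as
    underscore, backspace, char; each such triple occupies one column.
    """
    position = 0
    widest = 0
    for ch in line:
        if ch == "\b":
            position = max(position - 1, 0)  # backspace moves the cursor left
        else:
            position += 1
            widest = max(widest, position)
    return widest


def visual_rows(columns: int, width: int) -> int:
    """Number of wrapped rows for a line; an empty line still takes one."""
    # Ceiling division: a line of exactly `width` columns needs one row.
    return max(1, -(-columns // width))
```

A bold "NAME" header is 12 characters long in the file but only 4 columns wide on screen. For an 80-column line in an 80-column window, `visual_rows(80, 80)` gives 1, whereas a naive `columns // width + 1` gives 2, which would leave a phantom empty row.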

comment:7 Changed 22 months ago by egmont

Note that the internal Help viewer also suffers from a similar (but different) bug: it always jumps by a whole paragraph rather than by one visual line. (The help viewer and mcview seem to be totally different pieces of code.)

comment:8 Changed 16 months ago by egmont

This was fixed in #3250, please close the bug.

comment:9 Changed 16 months ago by andrew_b

  • Status changed from new to closed
  • Resolution set to fixed
  • Milestone changed from Future Releases to 4.8.14