Ticket #3250: mc-3250-viewer-rewrite-v1a.patch

File mc-3250-viewer-rewrite-v1a.patch, 68.0 KB (added by egmont, 10 years ago)

Reimplementation, v1a (only minor stylistic updates)
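
The nroff handling described in the design comment of ascii.c below (print, backspace, print again, with the ambiguous _\b_ resolved by the style of the previous overstruck character) can be sketched in isolation. This is a minimal standalone illustration and not code from the patch: it assumes plain ASCII input held in memory, and the names style_t, nroff_state_t and next_nroff_char are hypothetical. The initial value of the flag (false, so a leading _\b_ comes out bold) is also an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { STYLE_NORMAL, STYLE_BOLD, STYLE_UNDERLINE } style_t;

/* Hypothetical, reduced parser state: only the flag that disambiguates "_\b_". */
typedef struct
{
    bool underscore_is_underlined;
} nroff_state_t;

/*
 * Decode one possibly overstruck character from s[*pos .. len).
 * Normally: stores the character and its style, advances *pos, returns true.
 * At end of input: leaves everything unchanged and returns false.
 */
static bool
next_nroff_char (const char *s, size_t len, size_t *pos, nroff_state_t *st,
                 char *c, style_t *style)
{
    assert (s != NULL && pos != NULL && st != NULL && c != NULL && style != NULL);

    if (*pos >= len)
        return false;

    *c = s[*pos];
    *style = STYLE_NORMAL;

    /* Only printable characters on both sides of the backspace take part in overstriking. */
    if (*pos + 2 < len && s[*pos + 1] == '\b'
        && isprint ((unsigned char) s[*pos]) && isprint ((unsigned char) s[*pos + 2]))
    {
        char c3 = s[*pos + 2];

        if (*c == '_' && c3 == '_')
        {
            /* Ambiguous: a bold underscore or an underlined underscore. Follow the neighbour. */
            *style = st->underscore_is_underlined ? STYLE_UNDERLINE : STYLE_BOLD;
            *pos += 3;
            return true;
        }
        if (*c == c3)
        {
            /* X\bX: bold. */
            st->underscore_is_underlined = false;
            *style = STYLE_BOLD;
            *pos += 3;
            return true;
        }
        if (*c == '_')
        {
            /* _\bX: underlined X. */
            *c = c3;
            st->underscore_is_underlined = true;
            *style = STYLE_UNDERLINE;
            *pos += 3;
            return true;
        }
    }

    /* No recognized overstrike: consume the single character unstyled. */
    (*pos)++;
    return true;
}
```

Fed the 13 bytes "_\bA_\b_B\bB_\b_x" with the flag initially false, this yields A underlined, _ underlined, B bold, _ bold, and x unstyled. The first underscore follows the underlined A and the second follows the bold B.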

1diff --git a/AUTHORS b/AUTHORS
2index bb85c83..60ef7f7 100644
3--- a/AUTHORS
4+++ b/AUTHORS
5@@ -64,6 +64,7 @@ Egmont Koblinger <egmont@gmail.com>
6         Support of extended mouse clicks beyond 223 column
7         Support of bracketed paste mode of xterm
8                 (http://invisible-island.net/xterm/ctlseqs/ctlseqs.html#Bracketed%20Paste%20Mode)
9+        Rewritten viewer
10 
11 Erwin van Eijk <wabbit@corner.iaf.nl>
12 
13diff --git a/src/viewer/Makefile.am b/src/viewer/Makefile.am
14index 53bc7a4..0602084 100644
15--- a/src/viewer/Makefile.am
16+++ b/src/viewer/Makefile.am
17@@ -3,6 +3,7 @@ noinst_LTLIBRARIES = libmcviewer.la
18 
19 libmcviewer_la_SOURCES = \
20        actions_cmd.c \
21+       ascii.c \
22        coord_cache.c \
23        datasource.c \
24        dialogs.c \
25@@ -16,7 +17,6 @@ libmcviewer_la_SOURCES = \
26        mcviewer.h \
27        move.c \
28        nroff.c \
29-       plain.c \
30        search.c
31 
32 AM_CPPFLAGS = -I$(top_srcdir) $(GLIB_CFLAGS) $(PCRE_CPPFLAGS)
33diff --git a/src/viewer/actions_cmd.c b/src/viewer/actions_cmd.c
34index 8df149e..ab46ab9 100644
35--- a/src/viewer/actions_cmd.c
36+++ b/src/viewer/actions_cmd.c
37@@ -510,6 +510,8 @@ mcview_execute_cmd (mcview_t * view, unsigned long command)
38         break;
39     case CK_Bookmark:
40         view->dpy_start = view->marks[view->marker];
41+        view->dpy_paragraph_skip_lines = 0;  // TODO: remember this value in the marker???
42+        view->dpy_wrap_dirty = TRUE;
43         view->dirty++;
44         break;
45 #ifdef HAVE_CHARSET
46@@ -592,6 +594,7 @@ mcview_adjust_size (WDialog * h)
47     widget_set_size (WIDGET (view), 0, 0, LINES - 1, COLS);
48     widget_set_size (WIDGET (b), LINES - 1, 0, 1, COLS);
49 
50+    view->dpy_wrap_dirty = TRUE;
51     mcview_compute_areas (view);
52     mcview_update_bytes_per_line (view);
53 }
54diff --git a/src/viewer/ascii.c b/src/viewer/ascii.c
55new file mode 100644
56index 0000000..bad535a
57--- /dev/null
58+++ b/src/viewer/ascii.c
59@@ -0,0 +1,889 @@
60+/*
61+   Internal file viewer for the Midnight Commander
62+   Function for plain view
63+
64+   Copyright (C) 1994-2014
65+   Free Software Foundation, Inc.
66+
67+   Written by:
68+   Miguel de Icaza, 1994, 1995, 1998
69+   Janne Kukonlehto, 1994, 1995
70+   Jakub Jelinek, 1995
71+   Joseph M. Hinkle, 1996
72+   Norbert Warmuth, 1997
73+   Pavel Machek, 1998
74+   Roland Illig <roland.illig@gmx.de>, 2004, 2005
75+   Slava Zanko <slavazanko@google.com>, 2009
76+   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
77+   Ilia Maslakov <il.smind@gmail.com>, 2009
78+   Rewritten almost from scratch by:
79+   Egmont Koblinger <egmont@gmail.com>, 2014
80+
81+   This file is part of the Midnight Commander.
82+
83+   The Midnight Commander is free software: you can redistribute it
84+   and/or modify it under the terms of the GNU General Public License as
85+   published by the Free Software Foundation, either version 3 of the License,
86+   or (at your option) any later version.
87+
88+   The Midnight Commander is distributed in the hope that it will be useful,
89+   but WITHOUT ANY WARRANTY; without even the implied warranty of
90+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
91+   GNU General Public License for more details.
92+
93+   You should have received a copy of the GNU General Public License
94+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
95+
96+   ------------------------------------------------------------------------------------------------
97+
98+   The viewer is implemented along the following design principles:
99+
100+   Goals: Always display simple scripts, double wide (CJK), combining accents and spacing marks
101+   (often used e.g. in Devanagari) perfectly. Make the arrow keys always work correctly.
102+
103+   Absolutely non-goal: RTL.
104+
105+   Terminology: a "paragraph" is the text between two adjacent newline characters. A "line" or
106+   "row" is a visual row on the screen. In wrap mode, the viewer formats a paragraph into one or
107+   more lines.
108+
109+   The parser-formatter is designed to be stateless across paragraphs. This is so that we can walk
110+   backwards without having to reparse the whole file (we still need to reparse and reformat
111+   the whole paragraph, but that's a lot better).
112+
113+   The parser-formatter, however, needs to carry a state across lines. Currently this state
114+   contains:
115+
116+    - The logical column (as if we didn't wrap). This is used for handling TAB characters after a
117+      wordwrap consistently with less.
118+
119+    - Whether the last nroff character was bold or underlined. This is used for displaying the
120+      ambiguous _\b_ sequence consistently with less.
121+
122+    - Whether the desired way of displaying a lonely combining accent or spacing mark is to place
123+      it over a dotted circle (we do this at the beginning of the paragraph or after a TAB), or to
124+      ignore the combining char and show replacement char for the spacing mark (we do this if e.g.
125+      too many of these were encountered and hence we don't glue them with their base character).
126+
127+    - (This state needs to be expanded if e.g. we decide to print verbose replacement characters
128+      (e.g. "<U+0080>") and allow these to wrap around lines.)
129+
130+   The state also contains the file offset, as it doesn't make sense to ever
131+   know the state without knowing the corresponding offset.
132+
133+   The state depends on various settings (viewer width, encoding, nroff mode, charwrap or wordwrap
134+   mode (if we'll have that one day), etc.) and needs to be recomputed if any of these change.
135+
136+   Walking forwards is usually relatively easy both in the file and on the screen. Walking
137+   backwards within a paragraph would only be possible in some special cases and even then it would
138+   be painful, so we always walk back to the beginning of the paragraph and reparse-reformat from
139+   there.
140+
141+   (Walking back within a line in the file would have at least the following difficulties: handling
142+   the parser state; processing invalid UTF-8; processing invalid nroff (e.g. what is "_\bA\bA"?).
143+   Walking back on the display: we wouldn't know where to display the last line of a paragraph, or
144+   where to display a line if its following line starts with a wide (CJK or Tab) character. Long
145+   story short: just forget this approach.)
146+
147+   Most important variables:
148+
149+    - dpy_start: Both in unwrap and wrap modes this points to the beginning of the topmost
150+      displayed paragraph.
151+
152+    - dpy_text_column: Only in unwrap mode, an additional horizontal scroll.
153+
154+    - dpy_paragraph_skip_lines: Only in wrap mode, an additional vertical scroll (the number of
155+      lines that are scrolled off at the top from the topmost paragraph).
156+
157+    - dpy_state_top: Only in wrap mode, the offset and parser-formatter state at the line where
158+      displaying the file begins is cached here.
159+
160+    - dpy_wrap_dirty: If some parameter has changed that makes it necessary to reparse-redisplay
161+      the topmost paragraph.
162+
163+   In wrap mode, the three variables "dpy_start", "dpy_paragraph_skip_lines" and "dpy_state_top"
164+   are kept consistent. Think of the first two as the ones describing the position, and the third
165+   as a cached value for better performance so that we don't need to wrap the invisible beginning
166+   of the topmost paragraph over and over again. The third value needs to be recomputed each time a
167+   parameter that influences parsing or displaying the file (e.g. width of screen, encoding, nroff
168+   mode) changes; this is signaled by "dpy_wrap_dirty" to force recomputing "dpy_state_top" (and
169+   clamp "dpy_paragraph_skip_lines" if necessary).
170+
171+   ------------------------------------------------------------------------------------------------
172+
173+   Help integration
174+
175+   I'm planning to port the help viewer to this codebase.
176+
177+   Splitting at sections would still happen in the help viewer. It would either copy a section, or
178+   set force_max and a similar force_min to limit displaying to one section only.
179+
180+   Parsing the help format would go next to the nroff parser. The colors, alternate character set,
181+   and emitting the version number would go to the "state". (The version number would be
182+   implemented by emitting remaining characters of a buffer in the "state" one by one, without
183+   advancing in the file position.)
184+
185+   The active link would be drawn similarly to the search highlight. Other than that, the viewer
186+   wouldn't care about links (except for their color). help.c would keep track of which one is
187+   highlighted, how to advance to the next/prev on an arrow, how the scroll offset needs to be
188+   adjusted when moving, etc.
189+
190+   Add wrapping at word boundaries to where wrapping at char boundaries happens now.
191+ */
192+
193+#include <config.h>
194+
195+#include "lib/global.h"
196+#include "lib/tty/tty.h"
197+#include "lib/skin.h"
198+#include "lib/util.h"           /* is_printable() */
199+#ifdef HAVE_CHARSET
200+#include "lib/charsets.h"
201+#endif
202+
203+#include "src/setup.h"          /* option_tab_spacing */
204+
205+#include "internal.h"
206+
207+/*** global variables ****************************************************************************/
208+
209+/*** file scope macro definitions ****************************************************************/
210+
211+#define BASE_CHARACTER_FOR_LONELY_COMBINING 0x25CC  /* dotted circle */
212+#define MAX_COMBINING_CHARS 4  /* both slang and ncurses support exactly 4 */
213+
214+// I think space looks better than arrows. Still, use arrows while developing for better debugging.
215+// Final version could take it from the skin.
216+#define PARTIAL_CJK_AT_LEFT_MARGIN  0x25C2  // ' '  /* '<' doesn't look that good */
217+#define PARTIAL_CJK_AT_RIGHT_MARGIN 0x25B8  // ' '  /* '>' doesn't look that good */
218+
219+/*
220+ * Wrap mode: This is for safety so that jumping to the end of file (which already includes
221+ * scrolling back by a page) and then walking backwards is reasonably fast, even if the file is
222+ * extremely large and consists of maybe full zeros or something like that. If there's no newline
223+ * found within this limit, just start displaying from there and see what happens. We might get
224+ * some display parameters (most importantly the columns) incorrect, but at least will show the
225+ * file without spinning the CPU for ages. When scrolling back to that point, the user might see a
226+ * garbled first line (even starting with an invalid partial UTF-8), but then walking back by yet
227+ * another line should fix it.
228+ *
229+ * Unwrap mode: This is not used, we wouldn't be able to do anything reasonable without walking
230+ * back a whole paragraph (well, view->data_area.height paragraphs actually).
231+ */
232+#define MAX_BACKWARDS_WALK_IN_PARAGRAPH (100 * 1000)
233+
234+/*** file scope type declarations ****************************************************************/
235+
236+/*** file scope variables ************************************************************************/
237+
238+/*** file scope functions ************************************************************************/
239+
240+// TODO: These methods shouldn't be necessary, see ticket 3257
241+
242+static int
243+mcview_wcwidth (const mcview_t * view, int c)
244+{
245+#ifdef HAVE_CHARSET
246+       if (view->utf8) {
247+               if (g_unichar_iswide(c))
248+                       return 2;
249+               if (g_unichar_iszerowidth(c))
250+                       return 0;
251+       }
252+#endif /* HAVE_CHARSET */
253+       return 1;
254+}
255+
256+static gboolean
257+mcview_ismark (const mcview_t * view, int c)
258+{
259+#ifdef HAVE_CHARSET
260+       if (view->utf8)
261+               return g_unichar_ismark(c);
262+#endif /* HAVE_CHARSET */
263+       return FALSE;
264+}
265+
266+/* actually is_non_spacing_mark_or_enclosing_mark */
267+static gboolean
268+mcview_is_non_spacing_mark (const mcview_t * view, int c)
269+{
270+#ifdef HAVE_CHARSET
271+       if (view->utf8) {
272+               GUnicodeType type = g_unichar_type(c);
273+               return type == G_UNICODE_NON_SPACING_MARK || type == G_UNICODE_ENCLOSING_MARK;
274+       }
275+#endif /* HAVE_CHARSET */
276+       return FALSE;
277+}
278+
279+#if 0
280+static gboolean
281+mcview_is_spacing_mark (const mcview_t * view, int c)
282+{
283+#ifdef HAVE_CHARSET
284+       if (view->utf8) {
285+               return g_unichar_type(c) == G_UNICODE_SPACING_MARK;
286+       }
287+#endif /* HAVE_CHARSET */
288+       return FALSE;
289+}
290+#endif /* 0 */
291+
292+static gboolean
293+mcview_isprint (const mcview_t * view, int c)
294+{
295+#ifdef HAVE_CHARSET
296+       if (!view->utf8)
297+               c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
298+       return g_unichar_isprint(c);
299+#endif /* HAVE_CHARSET */
300+       // TODO this is very-very buggy by design: ticket 3257 comments 0-1
301+       return is_printable (c);
302+}
303+
304+static int
305+mcview_char_display (const mcview_t * view, int c, char *s)
306+{
307+#ifdef HAVE_CHARSET
308+       if (mc_global.utf8_display) {
309+               if (!view->utf8)
310+                       c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
311+               if (!g_unichar_isprint (c))
312+                       c = '.';
313+               return g_unichar_to_utf8(c, s);
314+       } else if (view->utf8) {
315+               if (g_unichar_iswide(c)) {
316+                       s[0] = s[1] = '.';
317+                       return 2;
318+               }
319+               if (g_unichar_iszerowidth(c))
320+                       return 0;
321+               // TODO the is_printable check below will be broken for this
322+               c = convert_from_utf_to_current_c (c, view->converter);
323+       } else {
324+               // TODO the is_printable check below will be broken for this
325+               c = convert_to_display_c (c);
326+       }
327+#endif /* HAVE_CHARSET */
328+       // TODO this is very-very buggy by design: ticket 3257 comments 0-1
329+       if (!is_printable (c))
330+               c = '.';
331+       *s = c;
332+       return 1;
333+}
334+
335+/* --------------------------------------------------------------------------------------------- */
336+
337+/*
338+ * Just for convenience, a common interface in front of mcview_get_utf and mcview_get_byte, so that
339+ * the caller doesn't have to care about utf8 vs 8-bit modes.
340+ *
341+ * Normally: stores c, updates state, returns TRUE.
342+ * At EOF: state is unchanged, c is undefined, returns FALSE.
343+ *
344+ * Also, temporary hack: handle force_max here.
345+ * TODO: move it to lower layers (datasource.c)?
346+ */
347+static gboolean
348+mcview_get_next_char (mcview_t * view, mcview_state_machine_t * state, int *c)
349+{
350+       gboolean result;
351+       int bytes_consumed;
352+
353+       /* Pretend EOF if we reached force_max */
354+       if (view->force_max >= 0 && state->offset >= view->force_max) {
355+               return FALSE;
356+       }
357+#ifdef HAVE_CHARSET
358+        if (view->utf8)
359+        {
360+            *c = mcview_get_utf (view, state->offset, &bytes_consumed, &result);
361+            if (!result)
362+               return FALSE;
363+            /* Pretend EOF if we crossed force_max */
364+            if (view->force_max >= 0 && state->offset + bytes_consumed > view->force_max) {
365+                return FALSE;
366+            }
367+            state->offset += bytes_consumed;
368+            return TRUE;
369+        }
370+#endif /* HAVE_CHARSET */
371+        if (!mcview_get_byte (view, state->offset, c))
372+           return FALSE;
373+        state->offset++;
374+        return TRUE;
375+}
376+
377+/*
378+ * This function parses the next nroff character and gives it to you along with its desired color,
379+ * so you never have to care about nroff again.
380+ *
381+ * The nroff mode does the backspace trick for every single character (Unicode codepoint). At least
382+ * that's what the GNU groff 1.22 package produces, and that's what less 458 expects. For
383+ * double-wide characters (CJK), still only a single backspace is emitted. For combining accents
384+ * and such, the print-backspace-print step is repeated for the base character and then for each
385+ * accent separately.
386+ *
387+ * So, the right place for this layer is after the bytes are interpreted in UTF-8, but before
388+ * joining a base character with its combining accents.
389+ *
390+ * Normally: stores c and color, updates state, returns TRUE.
391+ * At EOF: state is unchanged, c and color are undefined, returns FALSE.
392+ *
393+ * color can be null if the caller doesn't care.
394+ */
395+static gboolean
396+mcview_get_next_maybe_nroff_char (mcview_t * view, mcview_state_machine_t * state, int *c, int *color)
397+{
398+       mcview_state_machine_t state_after_nroff;
399+       int c2, c3;
400+
401+       if (color) *color = VIEW_NORMAL_COLOR;
402+
403+       if (!view->text_nroff_mode)
404+               return mcview_get_next_char (view, state, c);
405+
406+       if (!mcview_get_next_char (view, state, c))
407+               return FALSE;
408+       /* Don't allow nroff formatting around CR, LF, TAB or other special chars */
409+       if (!mcview_isprint(view, *c))
410+               return TRUE;
411+
412+       state_after_nroff = *state;
413+
414+       if (!mcview_get_next_char (view, &state_after_nroff, &c2))
415+               return TRUE;
416+       if (c2 != '\b')
417+               return TRUE;
418+
419+       if (!mcview_get_next_char (view, &state_after_nroff, &c3))
420+               return TRUE;
421+       if (!mcview_isprint(view, c3))
422+               return TRUE;
423+
424+       if (*c == '_' && c3 == '_') {
425+               *state = state_after_nroff;
426+               if (color) *color = state->nroff_underscore_is_underlined ? VIEW_UNDERLINED_COLOR : VIEW_BOLD_COLOR;
427+               return TRUE;
428+       } else if (*c == c3) {
429+               *state = state_after_nroff;
430+               state->nroff_underscore_is_underlined = FALSE;
431+               if (color) *color = VIEW_BOLD_COLOR;
432+               return TRUE;
433+       } else if (*c == '_') {
434+               *c = c3;
435+               *state = state_after_nroff;
436+               state->nroff_underscore_is_underlined = TRUE;
437+               if (color) *color = VIEW_UNDERLINED_COLOR;
438+               return TRUE;
439+       } else {
440+               return TRUE;
441+       }
442+}
443+
444+/*
445+ * Get one base character, along with its combining or spacing mark characters. (Let's call this a
446+ * "full character".)
447+ *
448+ * (A spacing mark is a character that extends the base character's width 1 into a combined
449+ * character of width 2, yet these two character cells should not be separated. E.g. Devanagari
450+ * <U+0939><U+094B>.)
451+ *
452+ * This method exists mainly for two reasons. One is to be able to tell if we fit on the current
453+ * line or need to wrap to the next one. The other is that both slang and ncurses seem to require
454+ * that the character and its combining marks are printed in a single call (or is it just a
455+ * limitation of mc's wrapper to them?).
456+ *
457+ * For convenience, this method takes care of converting CR or CR+LF into LF.
458+ * TODO this should probably happen later, when displaying the file?
459+ *
460+ * Normally: stores cs and color, updates state, returns >= 1 (entries in cs).
461+ * At EOF: state is unchanged, cs and color are undefined, returns 0.
462+ *
463+ * @param view ...
464+ * @param state the parser-formatter state machine's state, updated
465+ * @param cs store the characters here
466+ * @param clen the room available in cs (that is, at most clen-1 combining marks are allowed), must
467+ *   be at least 2
468+ * @param color if non-NULL, store the color here, taken from the first codepoint's color
469+ * @return the number of entries placed in cs, or 0 on EOF
470+ */
471+static int
472+mcview_next_full_char (mcview_t * view, mcview_state_machine_t * state, int *cs, int clen, int *color)
473+{
474+       int i = 1;
475+       mcview_state_machine_t state_after_combining;
476+
477+       if (!mcview_get_next_maybe_nroff_char (view, state, cs, color))
478+               return 0;
479+
480+       /* Process \r and \r\n newlines. */
481+       if (cs[0] == '\r') {
482+               int cnext;
483+               mcview_state_machine_t state_after_crlf = *state;
484+               if (mcview_get_next_maybe_nroff_char (view, &state_after_crlf, &cnext, NULL) && cnext == '\n')
485+                       *state = state_after_crlf;
486+               cs[0] = '\n';
487+               return 1;
488+       }
489+
490+       /* We don't want combining over non-printable characters. This includes '\n' and '\t' too. */
491+       if (!mcview_isprint(view, cs[0]))
492+               return 1;
493+
494+       if (mcview_ismark(view, cs[0])) {
495+               if (!state->print_lonely_combining) {
496+                       /* First character is combining. Either just return it, ... */
497+                       return 1;
498+               } else {
499+                       /* or place this (and subsequent combining ones) over a dotted circle. */
500+                       cs[1] = cs[0];
501+                       cs[0] = BASE_CHARACTER_FOR_LONELY_COMBINING;
502+                       i = 2;
503+               }
504+       }
505+
506+       if (mcview_wcwidth(view, cs[0]) == 2) {
507+               /* Don't allow combining or spacing mark for wide characters, is this okay? */
508+               return 1;
509+       }
510+
511+       /* Look for more combining chars. Either at most clen-1 zero-width combining chars,
512+        * or at most 1 spacing mark. Is this logic correct? */
513+       for (; i < clen; i++) {
514+               state_after_combining = *state;
515+               if (!mcview_get_next_maybe_nroff_char (view, &state_after_combining, &cs[i], NULL))
516+                       return i;
517+               if (!mcview_ismark(view, cs[i]) || !mcview_isprint(view, cs[i]))
518+                       return i;
519+               if (g_unichar_type(cs[i]) == G_UNICODE_SPACING_MARK) {  // is this the right check?
520+                       /* Only allow as the first combining char. Stop processing in either case. */
521+                       if (i == 1) {
522+                               *state = state_after_combining;
523+                               i++;
524+                       }
525+                       return i;
526+               }
527+               *state = state_after_combining;
528+       }
529+       return i;
530+}
531+
532+/*
533+ * Parse, format and possibly display one visual line of text.
534+ *
535+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
536+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
537+ * default state, the additional horizontal scrolling is added here. In wrap mode, this should
538+ * point to the beginning of the line, with the proper state at that point.
539+ *
540+ * In wrap mode, if a line ends in a newline, it is consumed, even if it's exactly at the right
541+ * edge. In unwrap mode, the whole remaining line, including the newline is consumed. Displaying
542+ * the next line should start at "state"'s new value, or if we displayed the bottom line then
543+ * state->offset tells the file offset to be shown in the top bar.
544+ *
545+ * If "row" is offscreen, don't actually display the line but still update "state" and return the
546+ * proper value. This is used by mcview_wrap_move_down to advance in the file.
547+ *
548+ * @param view ...
549+ * @param state the parser-formatter state machine's state, updated
550+ * @param row print to this row
551+ * @param paragraph_ended store TRUE if paragraph ended by newline or EOF, FALSE if wraps to next
552+ *   line
553+ * @return the number of rows, that is, 0 if we were already at EOF, otherwise 1
554+ */
555+static int
556+mcview_display_line (mcview_t * view, mcview_state_machine_t * state, int row, gboolean *paragraph_ended)
557+{
558+       const screen_dimen left = view->data_area.left;
559+       const screen_dimen top = view->data_area.top;
560+       const screen_dimen width = view->data_area.width;
561+       const screen_dimen height = view->data_area.height;
562+       off_t dpy_text_column = view->text_wrap_mode ? 0 : view->dpy_text_column;  // actually maybe we shouldn't allow view->data_area.left to be any different
563+       screen_dimen col = 0;
564+       int color;
565+       int cs[1 + MAX_COMBINING_CHARS];
566+       int n;
567+       char str[(1 + MAX_COMBINING_CHARS) * UTF8_CHAR_LEN + 1];
568+       int charwidth;
569+       int i, j;
570+       mcview_state_machine_t state_saved;
571+
572+       if (paragraph_ended) *paragraph_ended = TRUE;
573+
574+       if (!view->text_wrap_mode && col >= dpy_text_column + width) {
575+               /* Optimization: Fast forward to the end of the line, rather than carefully
576+                * parsing and then not actually displaying it. */
577+               off_t eol = mcview_eol(view, state->offset, mcview_get_filesize (view));
578+               int retval = eol > state->offset ? 1 : 0;
579+               mcview_state_machine_init (state, eol);
580+               return retval;
581+       }
582+
583+       while (1) {
584+               state_saved = *state;
585+               n = mcview_next_full_char (view, state, cs, 1 + MAX_COMBINING_CHARS, &color);
586+               if (n == 0)
587+                       return col > 0 ? 1 : 0;
588+
589+               if (view->search_start <= state->offset && state->offset < view->search_end)
590+                       color = SELECTED_COLOR;
591+
592+               if (cs[0] == '\n') {
593+                       /* New line: reset all formatting state for the next paragraph. */
594+                       mcview_state_machine_init (state, state->offset);
595+                       return 1;
596+               }
597+
598+               if (mcview_is_non_spacing_mark(view, cs[0])) {
599+                       /* Lonely combining character. Probably leftover after too many combining chars. Just ignore. */
600+                       continue;
601+               }
602+
603+               /* Nonprintable, or lonely spacing mark */
604+               if ((!mcview_isprint(view, cs[0]) || mcview_ismark(view, cs[0])) && cs[0] != '\t')
605+                       cs[0] = '.';
606+
607+               charwidth = 0;
608+               for (i = 0; i < n; i++) {
609+                       charwidth += mcview_wcwidth(view, cs[i]);
610+               }
611+
612+               /* Adjust the width for TAB. It's handled below along with the normal characters,
613+                * so that it's wrapped consistently with them, and is painted with the proper
614+                * attributes (although currently it can't have a special color). */
615+               if (cs[0] == '\t') {
616+                       charwidth = option_tab_spacing - state->unwrapped_column % option_tab_spacing;
617+                       state->print_lonely_combining = TRUE;
618+               } else {
619+                       state->print_lonely_combining = FALSE;
620+               }
621+
622+               /* In wrap mode only: We're done with this row if the full character wouldn't fit.
623+                * Except if at the first column, because then it wouldn't fit in the next row either.
624+                * In this extreme case let the unwrapped code below do its best to display it. */
625+               if (view->text_wrap_mode && (off_t) col + charwidth > dpy_text_column + width && col > 0) {
626+                       *state = state_saved;
627+                       if (paragraph_ended) *paragraph_ended = FALSE;
628+                       return 1;
629+               }
630+
631+               /* Display, unless outside of the viewport. */
632+               if (row >= 0 && row < (int) height) {
633+                       if ((off_t) col >= dpy_text_column &&
634+                           (off_t) col + charwidth <= dpy_text_column + width) {
635+                               /* The full character fits entirely in the viewport. Print it. */
636+                               tty_setcolor(color);
637+                               widget_move (view, top + row, left + ((off_t) col - dpy_text_column));
638+                               if (cs[0] == '\t') {
639+                                       for (i = 0; i < charwidth; i++)
640+                                               tty_print_char(' ');
641+                               } else {
642+                                       j = 0;
643+                                       for (i = 0; i < n; i++) {
644+                                               j += mcview_char_display(view, cs[i], str + j);
645+                                       }
646+                                       str[j] = '\0';
647+                                       /* This is probably a bug in our tty layer, but tty_print_string
648+                                        * normalizes the string, whereas tty_printf doesn't. Don't normalize,
649+                                        * since we handle combining characters ourselves correctly, it's
650+                                        * better if they are copy-pasted correctly. Ticket 3255. */
651+                                       tty_printf ("%s", str);
652+                               }
653+                       } else if ((off_t) col < dpy_text_column &&
654+                                  (off_t) col + charwidth > dpy_text_column) {
655+                               /* The full character would cross the left edge of the viewport.
656+                                * This cannot happen with wrap mode. Print replacement character(s),
657+                                * or spaces with the correct attributes for partial Tabs. */
658+                               tty_setcolor(color);
659+                               for (i = dpy_text_column; i < (off_t) col + charwidth && i < dpy_text_column + width; i++) {
660+                                       widget_move (view, top + row, left + (i - dpy_text_column));
661+                                       tty_print_anychar(cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_LEFT_MARGIN);
662+                               }
663+                       } else if ((off_t) col < dpy_text_column + width &&
664+                                  (off_t) col + charwidth > dpy_text_column + width) {
665+                               /* The full character would cross the right edge of the viewport
666+                                * and we're not wrapping. Print replacement character(s),
667+                                * or spaces with the correct attributes for partial Tabs. */
668+                               tty_setcolor(color);
669+                               for (i = col; i < dpy_text_column + width; i++) {
670+                                       widget_move (view, top + row, left + (i - dpy_text_column));
671+                                       tty_print_anychar(cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_RIGHT_MARGIN);
672+                               }
673+                       }
674+               }
675+
676+               col += charwidth;
677+               state->unwrapped_column += charwidth;
678+
679+               if (!view->text_wrap_mode && col >= dpy_text_column + width) {
680+                       /* Optimization: Fast forward to the end of the line, rather than carefully
681+                        * parsing and then not actually displaying it. */
682+                       off_t eol = mcview_eol(view, state->offset, mcview_get_filesize (view));
683+                       mcview_state_machine_init (state, eol);
684+                       return 1;
685+               }
686+       }
687+}
688+
689+/*
690+ * Parse, format and possibly display one paragraph (perhaps not from the beginning).
691+ *
692+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
693+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
694+ * default state; the additional horizontal scrolling is added here. In wrap mode, this may point
695+ * to the beginning of the line within a paragraph (to display the partial paragraph at the top),
696+ * with the proper state at that point.
697+ *
698+ * Displaying the next paragraph should start at "state"'s new value, or if we displayed the bottom
699+ * line then state->offset tells the file offset to be shown in the top bar.
700+ *
701+ * If "row" is negative, don't display the first abs(row) lines and display the rest from the top.
702+ * This was a nice idea but it's now unused :)
703+ *
704+ * If "row" is too large, don't display the paragraph at all but still return the number of lines.
705+ * This is used when moving upwards.
706+ *
707+ * @param view ...
708+ * @param state the parser-formatter state machine's state, updated
709+ * @param row print starting at this row
710+ * @return the number of rows the paragraph is wrapped to, that is, 0 if we were already at EOF,
711+ *   otherwise 1 in unwrap mode, >= 1 in wrap mode. We stop when reaching the bottom of the
712+ *   viewport; the lines the paragraph would occupy beyond that are not counted.
713+ */
714+static int
715+mcview_display_paragraph (mcview_t * view, mcview_state_machine_t * state, int row)
716+{
717+       const screen_dimen height = view->data_area.height;
718+       int lines = 0;
719+       gboolean paragraph_ended;
720+
721+       while (1) {
722+               lines += mcview_display_line(view, state, row, &paragraph_ended);
723+               if (paragraph_ended)
724+                       return lines;
725+
726+               if (row < (int) height) {
727+                       row++;
728+                       /* stop if bottom of screen reached */
729+                       if (row >= (int) height)
730+                               return lines;
731+               }
732+       }
733+}
734+
735+/*
736+ * Recompute dpy_state_top from dpy_start and dpy_paragraph_skip_lines. Clamp
737+ * dpy_paragraph_skip_lines if necessary.
738+ *
739+ * This function should be called in wrap mode after changing one of the parsing or formatting
740+ * properties (e.g. window width, encoding, nroff), or when switching to wrap mode from unwrap or
741+ * hex.
742+ *
743+ * If we stayed within the same paragraph then try to keep the vertical offset within that
744+ * paragraph as well. It might happen though that the paragraph became shorter than our desired
745+ * vertical position; in that case move to its last row.
746+ */
747+static void
748+mcview_wrap_fixup (mcview_t * view)
749+{
750+       mcview_state_machine_t state_prev;
751+       gboolean paragraph_ended;
752+       int lines = view->dpy_paragraph_skip_lines;
753+
754+       if (!view->dpy_wrap_dirty)
755+               return;
756+       view->dpy_wrap_dirty = FALSE;
757+
758+       view->dpy_paragraph_skip_lines = 0;
759+       mcview_state_machine_init (&view->dpy_state_top, view->dpy_start);
760+
761+       while (lines--) {
762+               state_prev = view->dpy_state_top;
763+               if (!mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended))
764+                       break;
765+               if (paragraph_ended) {
766+                       view->dpy_state_top = state_prev;
767+                       break;
768+               }
769+               view->dpy_paragraph_skip_lines++;
770+       }
771+}
772+
773+/* --------------------------------------------------------------------------------------------- */
774+/*** public functions ****************************************************************************/
775+/* --------------------------------------------------------------------------------------------- */
776+
777+/*
778+ * In both wrap and unwrap modes, dpy_start points to the beginning of the paragraph.
779+ *
780+ * In unwrap mode, start displaying from this position, probably applying an additional horizontal
781+ * scroll.
782+ *
783+ * In wrap mode, an additional dpy_paragraph_skip_lines lines are skipped from the top of this
784+ * paragraph. dpy_state_top contains the position and parser-formatter state corresponding to the
785+ * top left corner so we can just start rendering from here, unless dpy_wrap_dirty is set, in which
786+ * case dpy_state_top is invalid and we need to recompute it first.
787+ */
788+void
789+mcview_display_text (mcview_t * view)
790+{
791+       const screen_dimen left = view->data_area.left;
792+       const screen_dimen top = view->data_area.top;
793+       const screen_dimen height = view->data_area.height;
794+       int row;
795+       int n;
796+       mcview_state_machine_t state;
797+       gboolean again;
798+
799+       do {
800+               again = FALSE;
801+
802+               mcview_display_clean (view);
803+               mcview_display_ruler (view);
804+
805+               if (view->text_wrap_mode) {
806+                       mcview_wrap_fixup (view);
807+                       state = view->dpy_state_top;
808+               } else {
809+                       mcview_state_machine_init(&state, view->dpy_start);
810+               }
811+               row = 0;
812+               while (row < (int) height) {
813+                       n = mcview_display_paragraph (view, &state, row);
814+                       if (n == 0) {
815+                               /* In the rare case that displaying didn't start at the beginning
816+                                * of the file, yet there are some empty lines at the bottom,
817+                                * scroll the file and display again. This happens when e.g. the
818+                                * window is made bigger, or the file becomes shorter due to
819+                                * charset change or enabling nroff. */
820+                               if ((view->text_wrap_mode ? view->dpy_state_top.offset : view->dpy_start) > 0) {
821+                                       mcview_ascii_move_up (view, height - row);
822+                                       again = TRUE;
823+                               }
824+                               break;
825+                       }
826+                       row += n;
827+               }
828+       } while (again);
829+
830+       view->dpy_end = state.offset;
831+       view->dpy_state_bottom = state;
832+
833+       if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0') {
834+               while (row < (int) height) {
835+                       widget_move (view, top + row, left);
836+                       // TODO: should make it no wider than the viewport
837+                       tty_print_string (mcview_show_eof);
838+                       row++;
839+               }
840+       }
841+}
842+
843+/*
844+ * Move down.
845+ *
846+ * It's very simple. Just invisibly format the next "lines" lines, carefully carrying the formatter
847+ * state in wrap mode. But before each step we need to check if we've already hit the end of the
848+ * file; in that case we can no longer move. This is done by walking from dpy_state_bottom.
849+ *
850+ * Note that this relies on mcview_display_text() setting dpy_state_bottom to its correct value
851+ * upon rendering the screen contents. So don't call this function from other functions (e.g. at
852+ * the bottom of mcview_ascii_move_up()) which invalidate this value.
853+ */
854+void
855+mcview_ascii_move_down (mcview_t * view, off_t lines)
856+{
857+       gboolean paragraph_ended;
858+
859+       while (lines--) {
860+               /* See if there's still data below the bottom line. If not, we can't scroll any
861+                * more. If there is, adjust dpy_state_bottom by imaginarily displaying one more
862+                * line there. */
863+               if (view->dpy_state_bottom.offset >= mcview_get_filesize (view))
864+                       break;
865+               mcview_display_line (view, &view->dpy_state_bottom, -1, &paragraph_ended);
866+
867+               /* Okay, there's enough data. Move by 1 row at the top, too. No need to check for
868+                * EOF, that can't happen. */
869+               if (!view->text_wrap_mode) {
870+                       view->dpy_start = mcview_eol(view, view->dpy_start, mcview_get_filesize (view));
871+                       view->dpy_paragraph_skip_lines = 0;
872+                       view->dpy_wrap_dirty = TRUE;
873+               } else {
874+                       mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended);
875+                       if (paragraph_ended) {
876+                               view->dpy_start = view->dpy_state_top.offset;
877+                               view->dpy_paragraph_skip_lines = 0;
878+                       } else {
879+                               view->dpy_paragraph_skip_lines++;
880+                       }
881+               }
882+       }
883+}
884+
885+/*
886+ * Move up.
887+ *
888+ * Unwrap mode: Piece of cake. Wrap mode: If we'd walk back more than the current line offset
889+ * within the paragraph, we need to jump back to the previous paragraph and compute its height to
890+ * see if we start from that paragraph, and repeat this if necessary. Once we're within the desired
891+ * paragraph, we still need to format it from its beginning to know the state.
892+ *
893+ * See the top of this file for comments about MAX_BACKWARDS_WALK_IN_PARAGRAPH.
894+ *
895+ * force_max is a nice protection against the rare extreme case that the file underneath us
896+ * changes; we don't want to endlessly consume a file that may be full of zeros when moving upwards.
897+ */
898+void
899+mcview_ascii_move_up (mcview_t * view, off_t lines)
900+{
901+       int i;
902+
903+       if (!view->text_wrap_mode) {
904+               while (lines--)
905+                       view->dpy_start = mcview_bol(view, view->dpy_start - 1, 0);
906+               view->dpy_paragraph_skip_lines = 0;
907+               view->dpy_wrap_dirty = TRUE;
908+       } else {
909+               while (lines > view->dpy_paragraph_skip_lines) {
910+                       /* We need to go back to the previous paragraph. */
911+                       if (view->dpy_start == 0) {
912+                               /* Oops, we're already in the first paragraph. */
913+                               view->dpy_paragraph_skip_lines = 0;
914+                               mcview_state_machine_init(&view->dpy_state_top, 0);
915+                               return;
916+                       }
917+                       lines -= view->dpy_paragraph_skip_lines;
918+                       view->force_max = view->dpy_start;
919+                       view->dpy_start = mcview_bol (view, view->dpy_start - 1, view->dpy_start - MAX_BACKWARDS_WALK_IN_PARAGRAPH);
920+                       mcview_state_machine_init(&view->dpy_state_top, view->dpy_start);
921+                       /* This is a tricky way of denoting that we're at the end of the paragraph.
922+                        * Normally we'd jump to the next paragraph and reset paragraph_skip_lines. But for
923+                        * walking backwards this is exactly what we need. */
924+                       view->dpy_paragraph_skip_lines = mcview_display_paragraph (view, &view->dpy_state_top, view->data_area.height);
925+                       view->force_max = -1;
926+               }
927+
928+               /* Okay, we have dpy_start pointing to the desired paragraph, and we still need to
929+                * walk back "lines" lines from the current "dpy_paragraph_skip_lines" offset. We can't do
930+                * that, so walk from the beginning of the paragraph. */
931+               mcview_state_machine_init(&view->dpy_state_top, view->dpy_start);
932+               view->dpy_paragraph_skip_lines -= lines;
933+               for (i = 0; i < view->dpy_paragraph_skip_lines; i++)
934+                       mcview_display_line (view, &view->dpy_state_top, -1, NULL);
935+       }
936+}
937+
938+/* --------------------------------------------------------------------------------------------- */
939+
940+void
941+mcview_state_machine_init (mcview_state_machine_t * state, off_t offset)
942+{
943+       memset(state, 0, sizeof (*state));
944+       state->offset = offset;
945+       state->print_lonely_combining = TRUE;
946+}
947+
948+/* --------------------------------------------------------------------------------------------- */
949diff --git a/src/viewer/datasource.c b/src/viewer/datasource.c
950index 3389ee4..d6da436 100644
951--- a/src/viewer/datasource.c
952+++ b/src/viewer/datasource.c
953@@ -164,7 +164,7 @@ mcview_get_ptr_string (mcview_t * view, off_t byte_index)
954 /* --------------------------------------------------------------------------------------------- */
955 
956 int
957-mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * result)
958+mcview_get_utf (mcview_t * view, off_t byte_index, int *bytes_consumed, gboolean * result)
959 {
960     gchar *str = NULL;
961     int res = -1;
962@@ -172,7 +172,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
963     gchar *next_ch = NULL;
964     gchar utf8buf[UTF8_CHAR_LEN + 1];
965 
966-    *char_width = 0;
967+    *bytes_consumed = 0;
968     *result = FALSE;
969 
970     switch (view->datasource)
971@@ -218,7 +218,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
972     if (res < 0)
973     {
974         ch = *str;
975-        *char_width = 1;
976+        *bytes_consumed = 1;
977     }
978     else
979     {
980@@ -226,7 +226,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
981         /* Calculate UTF-8 char width */
982         next_ch = g_utf8_next_char (str);
983         if (next_ch)
984-            *char_width = next_ch - str;
985+            *bytes_consumed = next_ch - str;
986         else
987             return 0;
988     }
989diff --git a/src/viewer/display.c b/src/viewer/display.c
990index 00c6ec0..b1bd390 100644
991--- a/src/viewer/display.c
992+++ b/src/viewer/display.c
993@@ -251,10 +251,6 @@ mcview_display (mcview_t * view)
994     {
995         mcview_display_hex (view);
996     }
997-    else if (view->text_nroff_mode)
998-    {
999-        mcview_display_nroff (view);
1000-    }
1001     else
1002     {
1003         mcview_display_text (view);
1004diff --git a/src/viewer/internal.h b/src/viewer/internal.h
1005index 9562c52..0754df7 100644
1006--- a/src/viewer/internal.h
1007+++ b/src/viewer/internal.h
1008@@ -87,6 +87,18 @@ typedef struct
1009     coord_cache_entry_t **cache;
1010 } coord_cache_t;
1011 
1012+// TODO: find a better name. This is not actually a "state machine",
1013+// but a "state machine's state", but that sounds silly.
1014+// Could be parser_state, formatter_state...
1015+typedef struct
1016+{
1017+    off_t offset;               /* The file offset at which this is the state. */
1018+    off_t unwrapped_column;     /* Columns if the paragraph wasn't wrapped, */
1019+                                /* used for positioning TABs in wrapped lines */
1020+    gboolean nroff_underscore_is_underlined;  /* whether _\b_ is underlined rather than bold */
1021+    gboolean print_lonely_combining;   /* whether lonely combining marks are printed on a dotted circle */
1022+} mcview_state_machine_t;
1023+
1024 struct mcview_nroff_struct;
1025 
1026 struct mcview_struct
1027@@ -141,8 +153,12 @@ struct mcview_struct
1028     /* Display information */
1029     gboolean active;            /* Active or not in QuickView mode */
1030     screen_dimen dpy_frame_size;        /* Size of the frame surrounding the real viewer */
1031-    off_t dpy_start;            /* Offset of the displayed data */
1032+    off_t dpy_start;            /* Offset of the displayed data (start of the paragraph in non-hex mode) */
1033     off_t dpy_end;              /* Offset after the displayed data */
1034+    off_t dpy_paragraph_skip_lines;  /* Extra lines to skip in wrap mode */
1035+    mcview_state_machine_t dpy_state_top;  /* Parser-formatter state at the topmost visible line in wrap mode */
1036+    mcview_state_machine_t dpy_state_bottom;  /* Parser-formatter state after the bottom visible line in wrap mode */
1037+    gboolean dpy_wrap_dirty;    /* dpy_state_top needs to be recomputed */
1038     off_t dpy_text_column;      /* Number of skipped columns in non-wrap
1039                                  * text mode */
1040     off_t hex_cursor;           /* Hexview cursor position in file */
1041@@ -153,6 +169,8 @@ struct mcview_struct
1042     struct area ruler_area;     /* Where the ruler is displayed */
1043     struct area data_area;      /* Where the data is displayed */
1044 
1045+    ssize_t force_max;          /* Force a max offset, or -1 */
1046+
1047     int dirty;                  /* Number of skipped updates */
1048     gboolean dpy_bbar_dirty;    /* Does the button bar need to be updated? */
1049 
1050@@ -220,6 +238,12 @@ cb_ret_t mcview_callback (Widget * w, Widget * sender, widget_msg_t msg, int par
1051 cb_ret_t mcview_dialog_callback (Widget * w, Widget * sender, widget_msg_t msg, int parm,
1052                                  void *data);
1053 
1054+/* ascii.c: */
1055+void mcview_display_text (mcview_t *);
1056+void mcview_state_machine_init (mcview_state_machine_t *, off_t);
1057+void mcview_ascii_move_down (mcview_t *, off_t);
1058+void mcview_ascii_move_up (mcview_t *, off_t);
1059+
1060 /* coord_cache.c: */
1061 coord_cache_t *coord_cache_new (void);
1062 void coord_cache_free (coord_cache_t * cache);
1063@@ -308,9 +332,7 @@ void mcview_place_cursor (mcview_t *);
1064 void mcview_moveto_match (mcview_t *);
1065 
1066 /* nroff.c: */
1067-void mcview_display_nroff (mcview_t * view);
1068 int mcview__get_nroff_real_len (mcview_t * view, off_t, off_t p);
1069-
1070 mcview_nroff_t *mcview_nroff_seq_new_num (mcview_t * view, off_t p);
1071 mcview_nroff_t *mcview_nroff_seq_new (mcview_t * view);
1072 void mcview_nroff_seq_free (mcview_nroff_t **);
1073@@ -318,10 +340,6 @@ nroff_type_t mcview_nroff_seq_info (mcview_nroff_t *);
1074 int mcview_nroff_seq_next (mcview_nroff_t *);
1075 int mcview_nroff_seq_prev (mcview_nroff_t *);
1076 
1077-
1078-/* plain.c: */
1079-void mcview_display_text (mcview_t *);
1080-
1081 /* search.c: */
1082 mc_search_cbret_t mcview_search_cmd_callback (const void *user_data, gsize char_offset,
1083                                               int *current_char);
1084diff --git a/src/viewer/lib.c b/src/viewer/lib.c
1085index 6d51206..a5ab76d 100644
1086--- a/src/viewer/lib.c
1087+++ b/src/viewer/lib.c
1088@@ -106,9 +106,8 @@ mcview_toggle_magic_mode (mcview_t * view)
1089 void
1090 mcview_toggle_wrap_mode (mcview_t * view)
1091 {
1092-    if (view->text_wrap_mode)
1093-        view->dpy_start = mcview_bol (view, view->dpy_start, 0);
1094     view->text_wrap_mode = !view->text_wrap_mode;
1095+    view->dpy_wrap_dirty = TRUE;
1096     view->dpy_bbar_dirty = TRUE;
1097     view->dirty++;
1098 }
1099@@ -120,6 +119,7 @@ mcview_toggle_nroff_mode (mcview_t * view)
1100 {
1101     view->text_nroff_mode = !view->text_nroff_mode;
1102     mcview_altered_nroff_flag = 1;
1103+    view->dpy_wrap_dirty = TRUE;
1104     view->dpy_bbar_dirty = TRUE;
1105     view->dirty++;
1106 }
1107@@ -144,6 +144,8 @@ mcview_toggle_hex_mode (mcview_t * view)
1108         widget_want_cursor (WIDGET (view), FALSE);
1109     }
1110     mcview_altered_hex_mode = 1;
1111+    view->dpy_paragraph_skip_lines = 0;
1112+    view->dpy_wrap_dirty = TRUE;
1113     view->dpy_bbar_dirty = TRUE;
1114     view->dirty++;
1115 }
1116@@ -170,6 +172,10 @@ mcview_init (mcview_t * view)
1117     view->coord_cache = NULL;
1118 
1119     view->dpy_start = 0;
1120+    view->dpy_paragraph_skip_lines = 0;
1121+    mcview_state_machine_init (&view->dpy_state_top, 0);
1122+    view->dpy_wrap_dirty = FALSE;
1123+    view->force_max = -1;
1124     view->dpy_text_column = 0;
1125     view->dpy_end = 0;
1126     view->hex_cursor = 0;
1127@@ -282,6 +288,7 @@ mcview_set_codeset (mcview_t * view)
1128             view->converter = conv;
1129         }
1130         view->utf8 = (gboolean) str_isutf8 (cp_id);
1131+        view->dpy_wrap_dirty = TRUE;
1132     }
1133 #else
1134     (void) view;
1135@@ -339,7 +346,7 @@ mcview_bol (mcview_t * view, off_t current, off_t limit)
1136         if (c == '\r')
1137             current--;
1138     }
1139-    while (current > 0 && current >= limit)
1140+    while (current > 0 && current > limit)
1141     {
1142         if (!mcview_get_byte (view, current - 1, &c))
1143             break;
1144diff --git a/src/viewer/mcviewer.c b/src/viewer/mcviewer.c
1145index eb8ec73..dafbc21 100644
1146--- a/src/viewer/mcviewer.c
1147+++ b/src/viewer/mcviewer.c
1148@@ -402,6 +402,10 @@ mcview_load (mcview_t * view, const char *command, const char *file, int start_l
1149   finish:
1150     view->command = g_strdup (command);
1151     view->dpy_start = 0;
1152+    view->dpy_paragraph_skip_lines = 0;
1153+    mcview_state_machine_init (&view->dpy_state_top, 0);
1154+    view->dpy_wrap_dirty = FALSE;
1155+    view->force_max = -1;
1156     view->search_start = 0;
1157     view->search_end = 0;
1158     view->dpy_text_column = 0;
1159@@ -421,7 +425,10 @@ mcview_load (mcview_t * view, const char *command, const char *file, int start_l
1160         else
1161             new_offset = min (new_offset, max_offset);
1162         if (!view->hex_mode)
1163+        {
1164             view->dpy_start = mcview_bol (view, new_offset, 0);
1165+            view->dpy_wrap_dirty = TRUE;
1166+        }
1167         else
1168         {
1169             view->dpy_start = new_offset - new_offset % view->bytes_per_line;
1170diff --git a/src/viewer/move.c b/src/viewer/move.c
1171index 7cd852b..c8facc5 100644
1172--- a/src/viewer/move.c
1173+++ b/src/viewer/move.c
1174@@ -83,6 +83,8 @@ mcview_scroll_to_cursor (mcview_t * view)
1175         if (cursor < topleft)
1176             topleft = mcview_offset_rounddown (cursor, bytes);
1177         view->dpy_start = topleft;
1178+        view->dpy_paragraph_skip_lines = 0;
1179+        view->dpy_wrap_dirty = TRUE;
1180     }
1181 }
1182 
1183@@ -107,64 +109,24 @@ mcview_movement_fixups (mcview_t * view, gboolean reset_search)
1184 void
1185 mcview_move_up (mcview_t * view, off_t lines)
1186 {
1187-    off_t new_offset;
1188-
1189     if (view->hex_mode)
1190     {
1191         off_t bytes = lines * view->bytes_per_line;
1192         if (view->hex_cursor >= bytes)
1193         {
1194             view->hex_cursor -= bytes;
1195-            if (view->hex_cursor < view->dpy_start)
1196+            if (view->hex_cursor < view->dpy_start) {
1197                 view->dpy_start = mcview_offset_doz (view->dpy_start, bytes);
1198+                view->dpy_paragraph_skip_lines = 0;
1199+                view->dpy_wrap_dirty = TRUE;
1200+            }
1201         }
1202         else
1203         {
1204             view->hex_cursor %= view->bytes_per_line;
1205         }
1206-    }
1207-    else
1208-    {
1209-        off_t i;
1210-
1211-        for (i = 0; i < lines; i++)
1212-        {
1213-            if (view->dpy_start == 0)
1214-                break;
1215-            if (view->text_wrap_mode)
1216-            {
1217-                new_offset = mcview_bol (view, view->dpy_start, view->dpy_start - (off_t) 1);
1218-                /* check if dpy_start == BOL or not (then new_offset = dpy_start - 1,
1219-                 * no need to check more) */
1220-                if (new_offset == view->dpy_start)
1221-                {
1222-                    size_t last_row_length;
1223-
1224-                    new_offset = mcview_bol (view, new_offset - 1, 0);
1225-                    last_row_length = (view->dpy_start - new_offset) % view->data_area.width;
1226-                    if (last_row_length != 0)
1227-                    {
1228-                        /* if dpy_start == BOL in wrapped mode, find BOL of previous line
1229-                         * and move down all but the last rows */
1230-                        new_offset = view->dpy_start - (off_t) last_row_length;
1231-                    }
1232-                }
1233-                else
1234-                {
1235-                    /* if dpy_start != BOL in wrapped mode, just move one row up;
1236-                     * no need to check if > 0 as there is at least exactly one wrap
1237-                     * between dpy_start and BOL */
1238-                    new_offset = view->dpy_start - (off_t) view->data_area.width;
1239-                }
1240-                view->dpy_start = new_offset;
1241-            }
1242-            else
1243-            {
1244-                /* if unwrapped -> current BOL equals dpy_start, just find BOL of previous line */
1245-                new_offset = view->dpy_start - 1;
1246-                view->dpy_start = mcview_bol (view, new_offset, 0);
1247-            }
1248-        }
1249+    } else {
1250+        mcview_ascii_move_up (view, lines);
1251     }
1252     mcview_movement_fixups (view, TRUE);
1253 }
1254@@ -187,52 +149,14 @@ mcview_move_down (mcview_t * view, off_t lines)
1255         for (i = 0; i < lines && view->hex_cursor < limit; i++)
1256         {
1257             view->hex_cursor += view->bytes_per_line;
1258-            if (lines != 1)
1259+            if (lines != 1) {
1260                 view->dpy_start += view->bytes_per_line;
1261+                view->dpy_paragraph_skip_lines = 0;
1262+                view->dpy_wrap_dirty = TRUE;
1263+            }
1264         }
1265-    }
1266-    else
1267-    {
1268-        off_t new_offset = 0;
1269-
1270-        if (view->dpy_end - view->dpy_start > last_byte - view->dpy_end)
1271-        {
1272-            while (lines-- > 0)
1273-            {
1274-                if (view->text_wrap_mode)
1275-                    view->dpy_end =
1276-                        mcview_eol (view, view->dpy_end,
1277-                                    view->dpy_end + (off_t) view->data_area.width);
1278-                else
1279-                    view->dpy_end = mcview_eol (view, view->dpy_end, last_byte);
1280-
1281-                if (view->text_wrap_mode)
1282-                    new_offset =
1283-                        mcview_eol (view, view->dpy_start,
1284-                                    view->dpy_start + (off_t) view->data_area.width);
1285-                else
1286-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
1287-                if (new_offset < last_byte)
1288-                    view->dpy_start = new_offset;
1289-                if (view->dpy_end >= last_byte)
1290-                    break;
1291-            }
1292-        }
1293-        else
1294-        {
1295-            off_t i;
1296-            for (i = 0; i < lines && new_offset < last_byte; i++)
1297-            {
1298-                if (view->text_wrap_mode)
1299-                    new_offset =
1300-                        mcview_eol (view, view->dpy_start,
1301-                                    view->dpy_start + (off_t) view->data_area.width);
1302-                else
1303-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
1304-                if (new_offset < last_byte)
1305-                    view->dpy_start = new_offset;
1306-            }
1307-        }
1308+    } else {
1309+        mcview_ascii_move_down (view, lines);
1310     }
1311     mcview_movement_fixups (view, TRUE);
1312 }
1313@@ -257,7 +181,7 @@ mcview_move_left (mcview_t * view, off_t columns)
1314             if (old_cursor > 0 || view->hexedit_lownibble)
1315                 view->hexedit_lownibble = !view->hexedit_lownibble;
1316     }
1317-    else
1318+    else if (!view->text_wrap_mode)
1319     {
1320         if (view->dpy_text_column >= columns)
1321             view->dpy_text_column -= columns;
1322@@ -289,7 +213,7 @@ mcview_move_right (mcview_t * view, off_t columns)
1323             if (old_cursor < last_byte || !view->hexedit_lownibble)
1324                 view->hexedit_lownibble = !view->hexedit_lownibble;
1325     }
1326-    else
1327+    else if (!view->text_wrap_mode)
1328     {
1329         view->dpy_text_column += columns;
1330     }
1331@@ -302,6 +226,8 @@ void
1332 mcview_moveto_top (mcview_t * view)
1333 {
1334     view->dpy_start = 0;
1335+    view->dpy_paragraph_skip_lines = 0;
1336+    mcview_state_machine_init(&view->dpy_state_top, 0);
1337     view->hex_cursor = 0;
1338     view->dpy_text_column = 0;
1339     mcview_movement_fixups (view, TRUE);
1340@@ -331,6 +257,8 @@ mcview_moveto_bottom (mcview_t * view)
1341         const off_t datalines = view->data_area.height;
1342 
1343         view->dpy_start = filesize;
1344+        view->dpy_paragraph_skip_lines = 0;
1345+        view->dpy_wrap_dirty = TRUE;
1346         mcview_move_up (view, datalines);
1347     }
1348 }
1349@@ -347,6 +275,8 @@ mcview_moveto_bol (mcview_t * view)
1350     else if (!view->text_wrap_mode)
1351     {
1352         view->dpy_start = mcview_bol (view, view->dpy_start, 0);
1353+        view->dpy_paragraph_skip_lines = 0;
1354+        view->dpy_wrap_dirty = TRUE;
1355     }
1356     view->dpy_text_column = 0;
1357     mcview_movement_fixups (view, TRUE);
1358@@ -424,10 +354,14 @@ mcview_moveto_offset (mcview_t * view, off_t offset)
1359     {
1360         view->hex_cursor = offset;
1361         view->dpy_start = offset - offset % view->bytes_per_line;
1362+        view->dpy_paragraph_skip_lines = 0;
1363+        view->dpy_wrap_dirty = TRUE;
1364     }
1365     else
1366     {
1367         view->dpy_start = offset;
1368+        view->dpy_paragraph_skip_lines = 0;
1369+        view->dpy_wrap_dirty = TRUE;
1370     }
1371     mcview_movement_fixups (view, TRUE);
1372 }
1373@@ -498,9 +432,14 @@ mcview_moveto_match (mcview_t * view)
1374         view->hexedit_lownibble = FALSE;
1375         view->dpy_start = view->search_start - view->search_start % view->bytes_per_line;
1376         view->dpy_end = view->search_end - view->search_end % view->bytes_per_line;
1377+        view->dpy_paragraph_skip_lines = 0;
1378+        view->dpy_wrap_dirty = TRUE;
1379     }
1380-    else
1381+    else {
1382         view->dpy_start = mcview_bol (view, view->search_start, 0);
1383+        view->dpy_paragraph_skip_lines = 0;
1384+        view->dpy_wrap_dirty = TRUE;
1385+    }
1386 
1387     mcview_scroll_to_cursor (view);
1388     view->dirty++;
1389diff --git a/src/viewer/nroff.c b/src/viewer/nroff.c
1390index 6d6c97b..e1f5010 100644
1391--- a/src/viewer/nroff.c
1392+++ b/src/viewer/nroff.c
1393@@ -1,6 +1,6 @@
1394 /*
1395    Internal file viewer for the Midnight Commander
1396-   Function for nroff-like view
1397+   Functions for searching in nroff-like view
1398 
1399    Copyright (C) 1994-2014
1400    Free Software Foundation, Inc.
1401@@ -91,162 +91,6 @@ mcview_nroff_get_char (mcview_nroff_t * nroff, int *ret_val, off_t nroff_index)
1402 /*** public functions ****************************************************************************/
1403 /* --------------------------------------------------------------------------------------------- */
1404 
1405-void
1406-mcview_display_nroff (mcview_t * view)
1407-{
1408-    const screen_dimen left = view->data_area.left;
1409-    const screen_dimen top = view->data_area.top;
1410-    const screen_dimen width = view->data_area.width;
1411-    const screen_dimen height = view->data_area.height;
1412-    screen_dimen row, col;
1413-    off_t from;
1414-    int cw = 1;
1415-    int c;
1416-    int c_prev = 0;
1417-    int c_next = 0;
1418-
1419-    mcview_display_clean (view);
1420-    mcview_display_ruler (view);
1421-
1422-    /* Find the first displayable changed byte */
1423-    from = view->dpy_start;
1424-
1425-    tty_setcolor (VIEW_NORMAL_COLOR);
1426-    for (row = 0, col = 0; row < height;)
1427-    {
1428-#ifdef HAVE_CHARSET
1429-        if (view->utf8)
1430-        {
1431-            gboolean read_res = TRUE;
1432-            c = mcview_get_utf (view, from, &cw, &read_res);
1433-            if (!read_res)
1434-                break;
1435-        }
1436-        else
1437-#endif
1438-        {
1439-            if (!mcview_get_byte (view, from, &c))
1440-                break;
1441-        }
1442-        from++;
1443-        if (cw > 1)
1444-            from += cw - 1;
1445-
1446-        if (c == '\b')
1447-        {
1448-            if (from > 1)
1449-            {
1450-#ifdef HAVE_CHARSET
1451-                if (view->utf8)
1452-                {
1453-                    gboolean read_res;
1454-                    c_next = mcview_get_utf (view, from, &cw, &read_res);
1455-                }
1456-                else
1457-#endif
1458-                    mcview_get_byte (view, from, &c_next);
1459-            }
1460-            if (g_unichar_isprint (c_prev) && g_unichar_isprint (c_next)
1461-                && (c_prev == c_next || c_prev == '_' || (c_prev == '+' && c_next == 'o')))
1462-            {
1463-                if (col == 0)
1464-                {
1465-                    if (row == 0)
1466-                    {
1467-                        /* We're inside an nroff character sequence at the
1468-                         * beginning of the screen -- just skip the
1469-                         * backspace and continue with the next character. */
1470-                        continue;
1471-                    }
1472-                    row--;
1473-                    col = width;
1474-                }
1475-                col--;
1476-                if (c_prev == '_'
1477-                    && (c_next != '_' || mcview_count_backspaces (view, from + 1) == 1))
1478-                    tty_setcolor (VIEW_UNDERLINED_COLOR);
1479-                else
1480-                    tty_setcolor (VIEW_BOLD_COLOR);
1481-                continue;
1482-            }
1483-        }
1484-
1485-        if ((c == '\n') || (col >= width && view->text_wrap_mode))
1486-        {
1487-            col = 0;
1488-            row++;
1489-            if (c == '\n' || row >= height)
1490-                continue;
1491-        }
1492-
1493-        if (c == '\r')
1494-        {
1495-            mcview_get_byte_indexed (view, from, 1, &c);
1496-            if (c == '\r' || c == '\n')
1497-                continue;
1498-            col = 0;
1499-            row++;
1500-            continue;
1501-        }
1502-
1503-        if (c == '\t')
1504-        {
1505-            off_t line, column;
1506-            mcview_offset_to_coord (view, &line, &column, from);
1507-            col += (option_tab_spacing - col % option_tab_spacing);
1508-            if (view->text_wrap_mode && col >= width && width != 0)
1509-            {
1510-                row += col / width;
1511-                col %= width;
1512-            }
1513-            continue;
1514-        }
1515-
1516-        if (view->search_start <= from && from < view->search_end)
1517-        {
1518-            tty_setcolor (SELECTED_COLOR);
1519-        }
1520-
1521-        c_prev = c;
1522-
1523-        if ((off_t) col >= view->dpy_text_column
1524-            && (off_t) col - view->dpy_text_column < (off_t) width)
1525-        {
1526-            widget_move (view, top + row, left + ((off_t) col - view->dpy_text_column));
1527-#ifdef HAVE_CHARSET
1528-            if (mc_global.utf8_display)
1529-            {
1530-                if (!view->utf8)
1531-                {
1532-                    c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
1533-                }
1534-                if (!g_unichar_isprint (c))
1535-                    c = '.';
1536-            }
1537-            else if (view->utf8)
1538-                c = convert_from_utf_to_current_c (c, view->converter);
1539-            else
1540-                c = convert_to_display_c (c);
1541-#endif
1542-            tty_print_anychar (c);
1543-        }
1544-        col++;
1545-#ifdef HAVE_CHARSET
1546-        if (view->utf8)
1547-        {
1548-            if (g_unichar_iswide (c))
1549-                col++;
1550-            else if (g_unichar_iszerowidth (c))
1551-                col--;
1552-        }
1553-#endif
1554-        tty_setcolor (VIEW_NORMAL_COLOR);
1555-    }
1556-    view->dpy_end = from;
1557-}
1558-
1559-/* --------------------------------------------------------------------------------------------- */
1560-
1561 int
1562 mcview__get_nroff_real_len (mcview_t * view, off_t start, off_t length)
1563 {
1564diff --git a/src/viewer/plain.c b/src/viewer/plain.c
1565deleted file mode 100644
1566index 11e65d4..0000000
1567--- a/src/viewer/plain.c
1568+++ /dev/null
1569@@ -1,204 +0,0 @@
1570-/*
1571-   Internal file viewer for the Midnight Commander
1572-   Function for plain view
1573-
1574-   Copyright (C) 1994-2014
1575-   Free Software Foundation, Inc.
1576-
1577-   Written by:
1578-   Miguel de Icaza, 1994, 1995, 1998
1579-   Janne Kukonlehto, 1994, 1995
1580-   Jakub Jelinek, 1995
1581-   Joseph M. Hinkle, 1996
1582-   Norbert Warmuth, 1997
1583-   Pavel Machek, 1998
1584-   Roland Illig <roland.illig@gmx.de>, 2004, 2005
1585-   Slava Zanko <slavazanko@google.com>, 2009
1586-   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
1587-   Ilia Maslakov <il.smind@gmail.com>, 2009
1588-
1589-   This file is part of the Midnight Commander.
1590-
1591-   The Midnight Commander is free software: you can redistribute it
1592-   and/or modify it under the terms of the GNU General Public License as
1593-   published by the Free Software Foundation, either version 3 of the License,
1594-   or (at your option) any later version.
1595-
1596-   The Midnight Commander is distributed in the hope that it will be useful,
1597-   but WITHOUT ANY WARRANTY; without even the implied warranty of
1598-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
1599-   GNU General Public License for more details.
1600-
1601-   You should have received a copy of the GNU General Public License
1602-   along with this program.  If not, see <http://www.gnu.org/licenses/>.
1603- */
1604-
1605-#include <config.h>
1606-
1607-#include "lib/global.h"
1608-#include "lib/tty/tty.h"
1609-#include "lib/skin.h"
1610-#include "lib/util.h"           /* is_printable() */
1611-#ifdef HAVE_CHARSET
1612-#include "lib/charsets.h"
1613-#endif
1614-
1615-#include "src/setup.h"          /* option_tab_spacing */
1616-
1617-#include "internal.h"
1618-
1619-/*** global variables ****************************************************************************/
1620-
1621-/*** file scope macro definitions ****************************************************************/
1622-
1623-/*** file scope type declarations ****************************************************************/
1624-
1625-/*** file scope variables ************************************************************************/
1626-
1627-/*** file scope functions ************************************************************************/
1628-/* --------------------------------------------------------------------------------------------- */
1629-
1630-/* --------------------------------------------------------------------------------------------- */
1631-/*** public functions ****************************************************************************/
1632-/* --------------------------------------------------------------------------------------------- */
1633-
1634-void
1635-mcview_display_text (mcview_t * view)
1636-{
1637-    const screen_dimen left = view->data_area.left;
1638-    const screen_dimen top = view->data_area.top;
1639-    const screen_dimen width = view->data_area.width;
1640-    const screen_dimen height = view->data_area.height;
1641-    screen_dimen row = 0, col = 0;
1642-    off_t from;
1643-    int cw = 1;
1644-    int c, prev_ch = 0;
1645-    gboolean last_row = TRUE;
1646-
1647-    mcview_display_clean (view);
1648-    mcview_display_ruler (view);
1649-
1650-    /* Find the first displayable changed byte */
1651-    from = view->dpy_start;
1652-
1653-    while (row < height)
1654-    {
1655-#ifdef HAVE_CHARSET
1656-        if (view->utf8)
1657-        {
1658-            gboolean read_res = TRUE;
1659-
1660-            c = mcview_get_utf (view, from, &cw, &read_res);
1661-            if (!read_res)
1662-                break;
1663-        }
1664-        else
1665-#endif
1666-        if (!mcview_get_byte (view, from, &c))
1667-            break;
1668-
1669-        last_row = FALSE;
1670-        from++;
1671-        if (cw > 1)
1672-            from += cw - 1;
1673-
1674-        if (c != '\n' && prev_ch == '\r')
1675-        {
1676-            if (++row >= height)
1677-                break;
1678-
1679-            col = 0;
1680-            /* tty_print_anychar ('\n'); */
1681-        }
1682-
1683-        prev_ch = c;
1684-        if (c == '\r')
1685-            continue;
1686-
1687-        if (c == '\n')
1688-        {
1689-            col = 0;
1690-            row++;
1691-            continue;
1692-        }
1693-
1694-        if (col >= width && view->text_wrap_mode)
1695-        {
1696-            col = 0;
1697-            if (++row >= height)
1698-                break;
1699-        }
1700-
1701-        if (c == '\t')
1702-        {
1703-            col += (option_tab_spacing - col % option_tab_spacing);
1704-            if (view->text_wrap_mode && col >= width && width != 0)
1705-            {
1706-                row += col / width;
1707-                col %= width;
1708-            }
1709-            continue;
1710-        }
1711-
1712-        if (view->search_start <= from && from < view->search_end)
1713-            tty_setcolor (SELECTED_COLOR);
1714-        else
1715-            tty_setcolor (VIEW_NORMAL_COLOR);
1716-
1717-        if (((off_t) col >= view->dpy_text_column)
1718-            && ((off_t) col - view->dpy_text_column < (off_t) width))
1719-        {
1720-            widget_move (view, top + row, left + ((off_t) col - view->dpy_text_column));
1721-
1722-#ifdef HAVE_CHARSET
1723-            if (mc_global.utf8_display)
1724-            {
1725-                if (!view->utf8)
1726-                    c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
1727-                if (!g_unichar_isprint (c))
1728-                    c = '.';
1729-            }
1730-            else if (view->utf8)
1731-                c = convert_from_utf_to_current_c (c, view->converter);
1732-            else
1733-            {
1734-                c = convert_to_display_c (c);
1735-                if (!is_printable (c))
1736-                    c = '.';
1737-            }
1738-#else /* HAVE_CHARSET */
1739-            if (!is_printable (c))
1740-                c = '.';
1741-#endif /* HAVE_CHARSET */
1742-
1743-            tty_print_anychar (c);
1744-        }
1745-
1746-        col++;
1747-
1748-#ifdef HAVE_CHARSET
1749-        if (view->utf8)
1750-        {
1751-            if (g_unichar_iswide (c))
1752-                col++;
1753-            else if (g_unichar_iszerowidth (c))
1754-                col--;
1755-        }
1756-#endif
1757-    }
1758-
1759-    view->dpy_end = from;
1760-    if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0')
1761-    {
1762-        if (last_row && mcview_get_byte (view, from - 1, &c) && c != '\n')
1763-            row--;
1764-
1765-        while (++row < height)
1766-        {
1767-            widget_move (view, top + row, left);
1768-            tty_print_string (mcview_show_eof);
1769-        }
1770-    }
1771-}
1772-
1773-/* --------------------------------------------------------------------------------------------- */
1774diff --git a/tests/src/viewer/viewertest.txt b/tests/src/viewer/viewertest.txt
1775new file mode 100644
1776index 0000000..add6284
1777--- /dev/null
1778+++ b/tests/src/viewer/viewertest.txt
1779@@ -0,0 +1,77 @@
1780+* LF as line terminator
1781+This row has 79 columns:   30|       40|       50|       60|       70|      79|
1782+This row has 80 columns:   30|       40|       50|       60|       70|       80|
1783+This row has 81 columns:   30|       40|       50|       60|       70|        81|
1784+
1785+* CR as line terminator
1786This row has 79 columns:   30|       40|       50|       60|       70|      79|
1787This row has 80 columns:   30|       40|       50|       60|       70|       80|
1788This row has 81 columns:   30|       40|       50|       60|       70|        81|
1789
1790* CR+LF as line terminator
1791+This row has 79 columns:   30|       40|       50|       60|       70|      79|
1792+This row has 80 columns:   30|       40|       50|       60|       70|       80|
1793+This row has 81 columns:   30|       40|       50|       60|       70|        81|
1794+
1795+* TAB characters of varying widths (with reference rendering above).
1796+  When wrapped, the trailing bars will not always align
1797+88888888········7·······66······555·····4444····33333···222222··1111111·|
1798+88888888       7       66      555     4444    33333   222222  1111111 |
1799+
1800+* Combining accents on top of every second letter (a, c, ...)
1801+---------------------------------------------------|
1802+ÁBC̀DÉFG̀HÍJK̀LḾNÒPQ́RS̀TÚVẀXÝzỳxẃvùtśrq̀pónm̀lḱjìhǵfèdćbà|
1803+
1804+* More and more combining accents on a single character
1805+---  ---  ---  ---  ---  ---|
1806+0:a  1:à  2:à́  3:à́̂  4:à́̂̃  5:à́̂̃̄|
1807+0:x  1:x̀  2:x̀́  3:x̀́̂  4:x̀́̂̃  5:x̀́̂̃̄|
1808+
1809+* Combining accents at beginning of line, and after tab, with
1810+  reference rendering (explicit dotted circles, and spaces) above
1811+-       -       -       -       -|
1812+◌̀́       ◌̀́       ◌̀́       ◌̀́       ◌̀́|
1813+̀́     ̀́      ̀́      ̀́      ̀́|
1814+
1815+* Same with spacing mark
1816+一      一      一      一      一|
1817+◌ो      ◌ो      ◌ो      ◌ो      ◌ो|
1818+ो      ो       ो       ो       ो|
1819+
1820+* CJK, Lorem ipsum by Google translate, second line shifted by a space.
1821+  When wrapped, the trailing bars will not align
1822+のイプサム嘆き、の痛みに座るが、時折状況が労苦と痛みが彼にいくつかの大きな喜びを調達することができる起こるので。 |
1823+ のイプサム嘆き、の痛みに座るが、時折状況が労苦と痛みが彼にいくつかの大きな喜びを調達することができる起こるので。|
1824+
1825+* Devanagari spacing marks, with reference positions. Just as with CJK,
1826+  the two cells should appear/disappear together
1827+一 一 一 一|一  一  一  一|一   一   一   一|一    一    一    一|
1828+हो हि हो हि|हो  हि  हो  हि|हो   हि   हो   हि|हो    हि    हो    हि|
1829+
1830+* Thai Sara Am
1831+-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --|
1832+aำ bำ cำ dำ eำ fำ gำ hำ iำ jำ kำ lำ mำ nำ oำ pำ qำ rำ sำ tำ uำ vำ wำ xำ yำ zำ|
1833+
1834+* TABs mixed with other weird characters
1835+-----   -       一      一一一一        --      -- --   |
1836+abcde          の      イプサム    à́b̀́  हो हि   |
1837+
1838+* Extreme stress test: base letter with multiple (c)ombining or (s)pacing marks
1839+---  -------  ----  ------------  -----------  -----------  -----------  -----------|
1840+c:x̀  ccccc:x̀́̂̃̄  s:xो  sssss:xोोोोो  sssccc:xोोो̀́̂  cccsss:x̀́̂ोोो  scscsc:xो̀ो́ो̂  cscscs:x̀ो́ो̂ो|
1841+
1842+* Same as above, but with CJK base char
1843+--一  ------一  --一-  ------一-----  -------一---  -------一---  -------一---  -------一---|
1844+c:の̀  ccccc:の̀́̂̃̄  s:のो  sssss:のोोोोो  sssccc:のोोो̀́̂  cccsss:の̀́̂ोोो  scscsc:のो̀ो́ो̂  cscscs:の̀ो́ो̂ो|
1845+
1846+* Nroff
1847+---------------  一一一一  - - - -  一 一 一 一  -----------------|
1848+_Hello,_World!_  のイプサ  à́ b̀́ c̀́ d̀́  हो हि हो हि  __b___u___b___u__|
1849+__HHeelllloo,,___W_o_r_l_d_!__  ののイイ_プ_サ  aà̀́́ bb̀̀́́ _c_̀_́ _d_̀_́  हहोो हहिि _ह_ो _ह_ि  ____bb_______u______bb_______u____|
1850+______ <- should be bold again
1851+
1852+* Invalid nroff (a backspace b tab backspace tab underscore backspace newline,
1853+  reference rendering in the first row)
1854+a.b     .       _.
1855+ab           _
1856+
1857+* Control characters (00-1F except tab/lf/cr, 7F, 80-9F), should all be replaced by dots
1858+@ABCDEFGH--KL-NOPQRSTUVWXYZ[\]^_|?|@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_|
1859+  ||€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ|
1860+
1861+* Invalid UTF-8 not tested here, use Markus Kuhn's stress test
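Review note on the first three sections of viewertest.txt: they exercise LF, lone CR and CR+LF as line terminators. The sketch below shows one way to count such terminators so that each of the three forms yields exactly one break. It is an illustration only, not the code from this patch, and `count_line_breaks` is a hypothetical name.

```c
#include <stddef.h>

/* Count line terminators in a buffer, treating LF, a lone CR and the
 * CR+LF pair each as a single break. Hypothetical helper, not from the patch. */
static size_t
count_line_breaks (const char *buf, size_t len)
{
    size_t i;
    size_t breaks = 0;

    for (i = 0; i < len; i++)
    {
        if (buf[i] == '\n')
            breaks++;
        else if (buf[i] == '\r')
        {
            /* CR followed by LF: skip the CR, the LF is counted next round */
            if (i + 1 < len && buf[i + 1] == '\n')
                continue;
            breaks++;
        }
    }

    return breaks;
}
```

For the 8-byte input `"a\nb\rc\r\nd"` this counts three breaks, and two consecutive CRs (as used for the empty row in the CR section) count as two.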