Ticket #2229 (assigned defect)

Opened 6 years ago

Last modified 5 years ago

MC viewer doesn't like [binary] files with no line endings [2 bugs here]

Reported by: birdie Owned by:
Priority: major Milestone: Future Releases
Component: mcview Version: master
Keywords: Cc: gotar@…
Blocked By: Blocking:
Branch state: no branch Votes for changeset:

Description (last modified by andrew_b) (diff)

Test case:

$ dd if=/dev/zero of=test bs=128M count=1

  1. Try to open this file when the viewer warp mode is set to "Wrap", it will take too much time (21 seconds here, which is unacceptable).
  1. Try to jump to the end of such a file when the viewer wrap mode is set to "No Wrap", it will take too much time (I have killed MC after 75 seconds).

So, actually we have too problems/bugs here.

Attachments

mc-null_viewer.patch (8.3 KB) - added by gotar 6 years ago.
mc-noEOLviewer.patch (4.6 KB) - added by gotar 6 years ago.
updated move.c part of the patch to master base
mc-null_viewer2.patch (9.4 KB) - added by andrew_b 6 years ago.
Summarized patch with code indentation
mc-no-mcview_update-on-quit.patch (1.9 KB) - added by gotar 6 years ago.
mc-skip_useless_display_updates.patch (12.0 KB) - added by gotar 6 years ago.

Change History

comment:1 follow-ups: ↓ 3 ↓ 4 Changed 6 years ago by angel_il

Test case:

$ dd if=/dev/zero of=test bs=128M count=1

  1. Try to open this file when the viewer warp mode is set to "Warp", it will take too >much time (21 seconds here, which is unacceptable).

it's no mc bug, update your "file" utility
see also
http://www.midnight-commander.org/ticket/1427
http://www.midnight-commander.org/ticket/2185

comment:2 Changed 6 years ago by angel_il

about
"2. Try to jump to the end of such a file when the viewer warp mode is set to "No Warp", it will take too much time (I have killed MC after 75 seconds)."

i dont know how to find '\n' without test each symbols...

comment:3 in reply to: ↑ 1 Changed 6 years ago by birdie

Replying to angel_il:

it's no mc bug, update your "file" utility

No, I'm on Fedora 13, 'file' is really new here:

$ time file test
test: data

real    0m0.063s
user    0m0.007s
sys     0m0.003s

So, it seems like the first part of this bug has nothing to do with the 'file' utility. Have you actually tried to reproduce it?

comment:4 in reply to: ↑ 1 ; follow-up: ↓ 5 Changed 6 years ago by andrew_b

  • Description modified (diff)

Replying to angel_il:

it's no mc bug, update your "file" utility

Unfortunately, it's a bug of MC not 'file'.

birdie: s/warp/wrap ? :)

comment:5 in reply to: ↑ 4 Changed 6 years ago by birdie

Replying to andrew_b:

Replying to angel_il:

it's no mc bug, update your "file" utility

Unfortunately, it's a bug of MC not 'file'.

birdie: s/warp/wrap ? :)

Yep, wrap mode :)

comment:6 Changed 6 years ago by birdie

Viewing such a file in "Wrap" mode is a PITA, it seems like MC absolutely hates it :)

comment:7 follow-up: ↓ 8 Changed 6 years ago by birdie

I meant scrolling up and down using PgUp? and PgDown?

comment:8 in reply to: ↑ 7 Changed 6 years ago by angel_il

Replying to birdie:

I meant scrolling up and down using PgUp? and PgDown?

i think - no way for make faster in this case...

comment:9 Changed 6 years ago by birdie

  • Milestone changed from 4.7 to 4.7.3

comment:10 Changed 6 years ago by angel_il

  • Milestone changed from 4.7.3 to 4.7

sorry but - NO WAY!

comment:11 Changed 6 years ago by birdie

I will be annoying.

I need to view great many binary files and right now MC dies too often to be even remotely usable.

comment:12 follow-up: ↓ 13 Changed 6 years ago by zap

I agree this is a very annoying um... not a bug, rather mis-feature.

One possible solution is to limit the length of a single line to some sane value, say, 4096 characters. I don't think there are documents with more characters per line, and for binary files it doesn't make a difference - you wouldn't expect to see something consistent when vieweing a binary file in text mode.

Unfortunately, to see a file in hex mode you always have to open them first in text mode. This makes it just impossible to view large binary files full of zeros (like disk images). Especially this is annoying on weak devices (e.g. I have mc running on my Nokia N810).

So, while scanning for '\n' the scanner needs just to stop when it counts more than 4096 characters, that's all.

comment:13 in reply to: ↑ 12 Changed 6 years ago by andrew_b

Replying to zap:

So, while scanning for '\n' the scanner needs just to stop when it counts more than 4096 characters, that's all.

And what? How are you going to find the start of the next line if '\n' is located after 4096 chars?

comment:14 Changed 6 years ago by angel_il

maybe need search \n or 4096-th byte?

comment:15 Changed 6 years ago by birdie

  • Summary changed from MC viewer doesn't like some files [2 bugs here] to MC viewer doesn't like [binary] files with no line endings [2 bugs here]

I have updated the summary so that other people could find it easier.

comment:16 follow-up: ↓ 17 Changed 6 years ago by angel_il

birdie: try open file from topic in the 'vi' editor... to feel the difference...

comment:17 in reply to: ↑ 16 ; follow-up: ↓ 18 Changed 6 years ago by birdie

Replying to angel_il:

birdie: try open file from topic in the 'vi' editor... to feel the difference...

Far file manager handles this file easily - it can open it in less than 0.5 seconds both in viewer (wrap and unwrap modes) and editor.

If you have to rewrite MC viewer to eliminate this bug, then it must be done.

comment:18 in reply to: ↑ 17 ; follow-up: ↓ 20 Changed 6 years ago by andrew_b

Replying to birdie:

If you have to rewrite MC viewer to eliminate this bug, then it must be done.

Patches are welcome!

comment:19 Changed 6 years ago by gotar

  • Cc gotar@… added

comment:20 in reply to: ↑ 18 Changed 6 years ago by gotar

  • Milestone changed from 4.7 to 4.7.5

Replying to andrew_b:

Patches are welcome!

Fortunately this is only weak, not commented code, which checks too much too many times (like twice run on entire file without newlines, while one run of screen width is enough). I attach POF patch for two functions which makes moving down/up in wrapped mode as fast as possible. There's still a place for improvements in this mode (like immediate jump to the end of such file without multiple scans) and general (creating some sort of newline cache), so after applying this just bump milestone and keep this ticket opened. Note, that this one is based on sources with my last #2135 patch applied, it has already proven me that doesn't break anything that ain't broken now (like #2410).

BTW checks like this:

            new_offset = mcview_bol (view, new_offset);
            if (new_offset < 0)
                new_offset = 0;

should be done in mcview_bol only.

Changed 6 years ago by gotar

Changed 6 years ago by gotar

updated move.c part of the patch to master base

comment:21 Changed 6 years ago by gotar

I've got some patches for unwrapped mode too, but they need much cleanup and solid base (why don't you use tab to indent code?), so I'll post them here after above are commited hopefully into 4.7.5 release.

Changed 6 years ago by andrew_b

Summarized patch with code indentation

comment:22 Changed 6 years ago by andrew_b

  • Status changed from new to accepted
  • Owner set to andrew_b
  • Votes for changeset set to andrew_b

Created 2229_viewer_fast_move_up_down_wrapped branch. Parent branch is master.
changeset:8eef3c7d7855ff6111e07a79135acf15cfdab5c0

Since author of patch is Gotar, my vote here.

comment:23 Changed 6 years ago by slavazanko

  • Votes for changeset changed from andrew_b to andrew_b slavazanko
  • severity changed from no branch to approved

comment:24 Changed 6 years ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from andrew_b slavazanko to committed-master
  • Resolution set to fixed
  • severity changed from approved to merged

comment:25 Changed 6 years ago by andrew_b

  • Status changed from testing to reopened
  • Votes for changeset committed-master deleted
  • Resolution fixed deleted
  • severity changed from merged to no branch

Reopened to continue development.

comment:26 Changed 6 years ago by gotar

Thanks.

Time to cope with unwrapped mode - I've got working code for many cases, but first I need some clarification, just let me explain: the weird thing I've noticed is that some operations invoke file rescan which seems (and is) pointless, e.g. ...CK_ViewQuit (it's the obvious case) or CK_Ignore_Key. I've traced the source in src/viewer/actions_cmd.c and here it goes:

mcview_callback->(mcview_handle_key->(mcview_execute_cmd))->mcview_update

So when we quit mcview_execute_cmd() returns MSG_HANDLED via mcview_handle_key() to mcview_callback(), which invokes mcview_display() and we're waiting until it finishes reading up to many MBs just before exit...

The problem can be easily solved - call mcview_display() only when needed, that is when both conditions are met: MSG_HANDLED and something actually happens. And here is my question - how to mark 'do something' situation? we could either:

  1. extend cb_ret_t enum type by MSG_HANDLED_NOOP for keys handled, that require no further actions, or
  2. return MSG_NOT_HANDLED when no further action needs to be done,
  3. use int type with some additional code meaning nothing to do.

As I didn't want to mess with cb_ret_t (it's widely used) and don't know if MSG_NOT_HANDLED involves executing some other code, I've decided to go 3rd way with -MSG_HANDLED (negative, i.e. -1) as internal marker.

Attached patch addresses only Quit operation for you to explain what happens. CK_ViewQuit returns -1 (instead of 1) from mcview_execute_cmd(), then mcview_handle_key() treats this the same way as MSG_HANDLED(1) and returns the value, but mcview_callback() executes mcview_update() only for real MSG_HANDLED(1) - if (-1) is get, it only translates it to positive (so this function still returns cb_ret_t), without refreshing the screen. Speaking shortly: we skip certain handled operations from reread. Way to test:

dd if=/dev/zero of=file bs=1M count=10
mcview file
F2
q

The code works, but I need to know if I can use this -1 value, or should I use MSG_NOT_HANDLED, or maybe you already know how to fix this yourself:)

Changed 6 years ago by gotar

comment:27 Changed 6 years ago by andrew_b

If you have a task to avoid the viewer update on quit, it can be made easier:

diff --git a/src/viewer/display.c b/src/viewer/display.c
index d431732..4ffbce4 100644
--- a/src/viewer/display.c
+++ b/src/viewer/display.c
@@ -183,6 +183,10 @@ mcview_update (mcview_t * view)
 {
     static int dirt_limit = 1;
 
+    /* don't update viewer in inactive dialog */
+    if (view->widget.owner->state != DLG_ACTIVE)
+        return;
+
     if (view->dpy_bbar_dirty)
     {
         view->dpy_bbar_dirty = FALSE;

In general, the callback return value defines the key handling in dlg_key_event(). You can trace youself what functions will be called if callback will return MSG_NOT_HANDLE instead of MSG_HANDLE.

comment:28 Changed 6 years ago by gotar

Thanks again. So we don't want MSG_NOT_HANDLE for handled and ignored operations, instead I'm using negative MSG_HANDLED and in general mcview_callback() handles every integer return code (for further purposes I'm returning e.g. count of lines or columns moved - this way we can reintroduce #2371 feature removed in #1585 for some cases). Keep in mind that mcview_update() is really expensive function which should be avoided whenever possible.

Attached patch addresses also mcview_file_load_data() removing needless now mcview_already_loaded() check.

In next step I'll try to create newline cache to deal with real core problem in unwrapped mode. Currently at least mcview_move_*() functions do the job repeated in mcview_display() later. I'm considering cacheing of current screen, entire file or long lines only, haven't decided yet.

Changed 6 years ago by gotar

comment:29 Changed 6 years ago by gotar

In case CK_ViewMoveToBol there's useless break statement I've just seen, feel free to remove it.

comment:30 follow-up: ↓ 31 Changed 6 years ago by angel_il

"-MSG_HANDLED"
mm... patch seems like a hack... :)

comment:31 in reply to: ↑ 30 Changed 6 years ago by andrew_b

Replying to angel_il:

"-MSG_HANDLED"
mm... patch seems like a hack... :)

Well, I think that enumerate type should not be mixed with generic int and new enum type should be created if required.

comment:32 Changed 6 years ago by gotar

Yes, it is a bit hackish (I've asked about it 2 weeks ago in comment:26) and maybe I would have created new enum type, if I hadn't returned actual counts in mcview_moveto_*().

And if you introduce new enum into mcview_execute_cmd() and mcview_handle_key() (where it can exist) then sth similar to my

if (i != MSG_NOT_HANDLED && i != -MSG_HANDLED)
    [...]
if (i != MSG_NOT_HANDLED)
    return MSG_HANDLED;
return MSG_NOT_HANDLED;

would be required in every case when we expect negative-handled. And I'm too lazy programmer for this;) (yes, it is a bit hackish, but still clearly readable).

comment:33 Changed 6 years ago by gotar

So if you want the code to be 'proper', mcview_moveto_*() should simply return counts (i.e. with 0 instead -MSG_HANDLED), which should later be translated:
MSG_HANDLED_NOOP if 0
MSG_HANDLED otherwise (meaning 1+)
everywhere they're called (as currently 0 means MSG_NOT_HANDLED).

comment:34 Changed 6 years ago by birdie

  • Milestone changed from 4.7.5 to 4.8

comment:35 Changed 5 years ago by andrew_b

  • Status changed from reopened to assigned
  • Owner andrew_b deleted
  • Branch state set to no branch
  • Milestone changed from 4.8 to Future Releases
Note: See TracTickets for help on using tickets.