Ticket #1860 (closed defect: fixed)

Opened 7 years ago

Last modified 7 years ago

Unwanted natural sorting of numbers in file panels

Reported by: Wiseman1024
Owned by: andrew_b
Priority: major
Milestone: 4.7
Component: mc-core
Version: 4.7.0-pre4
Keywords: sort order collation numeric numbers
Cc:
Blocked By:
Blocking:
Branch state:
Votes for changeset:

Description

I've noticed that when Midnight Commander sorts filenames alphabetically, it treats numbers specially, so that the filenames "a1", "a3" and "a20" are sorted in this order, contrary to the expected "a1", "a20", "a3" order produced by most if not all other tools (examples I had readily available: ls, sort, Konqueror file manager, KDE file chooser, Opera's file chooser, Python's sorted(os.listdir('.'))). For reference, FAR Manager sorts like everything else too. Midnight Commander is the only one that sorts this way.
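A minimal Python sketch of why the two orders differ, using the three names from the description. It only illustrates plain string comparison versus numeric comparison of the digit runs; it is not Midnight Commander's code.

```python
names = ["a3", "a20", "a1"]

# Plain string comparison goes character by character: "a20" and "a3"
# first differ at index 1, and "2" < "3", so "a20" sorts first.
lexicographic = sorted(names)
print(lexicographic)  # ['a1', 'a20', 'a3']

# Comparing the digit runs as integers reverses that decision,
# which is what produces the "a1", "a3", "a20" order.
string_says_20_first = "20" < "3"
number_says_20_first = int("20") < int("3")
print(string_says_20_first)   # True
print(number_says_20_first)   # False
```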

While this may look nice, since the number 3 comes before the number 20 (when treated as numbers!), it is extremely irritating because almost everything else sorts files the expected way and so contradicts Midnight Commander's file order. It's even dangerous, as it may lead to user confusion and mistakes that could result in data loss.

Allow me to explain my particular case as an example. I use Midnight Commander as my central file management tool. However, to view image files, I've associated my own image viewer with the F3 (view) action for image files. This image viewer lets me walk forwards and backwards through the directory, starting from the file I opened it with. Say I'm in a directory with files "a1.png", "a3.png" and "a20.png", listed in that order in Midnight Commander. I hit F3 on "a1.png" and then go forward to the next file, expecting to view what came next in Midnight Commander, "a3.png"; however, the image viewer (like any other application I have) jumps to "a20.png" because it sorts files alphabetically. I may then see something I don't like and decide to delete the file that follows the one I started browsing from, so when I'm back in Midnight Commander I go and delete the wrong file ("a3.png").

I'm experiencing this problem in both Debian sid on i686 and Ubuntu Hardy on AMD64, with both Midnight Commander 4.7.0-pre1-3 from the Debian sid repository and Midnight Commander 4.7.0-pre4 compiled from source. Both systems are similar in setup. Here's information for one of them:

# uname -a
Linux Zohar 2.6.31-1-686 #1 SMP Sun Nov 15 20:39:33 UTC 2009 i686 GNU/Linux

# cat /etc/issue
Debian GNU/Linux squeeze/sid \n \l

# dpkg -l mc libglib2.0-0 libslang2 | tail -3
ii libglib2.0-0 2.22.2-2 The GLib library of C routines
ii libslang2 2.2.1-1 The S-Lang programming library - runtime version
ii mc 2:4.7.0-pre1-3 midnight commander - a powerful file manager

# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I've verified my system's strcoll function orders strings as expected (i.e. strcoll("a10", "a3") returns -1).
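The same check can be sketched from Python, whose locale.strcoll wraps the C library function. This assumes LC_COLLATE=C as in the locale output above; only the sign of the result is meaningful, so the sketch does not rely on the exact value.

```python
import functools
import locale

# Mirror the reporter's LC_COLLATE=C setting; the "C" locale is always available.
locale.setlocale(locale.LC_COLLATE, "C")

# A negative result means "a10" collates before "a3".
result = locale.strcoll("a10", "a3")
print(result < 0)

# Sorting with strcoll as the comparison function gives the lexicographic order.
ordered = sorted(["a3", "a10", "a1"], key=functools.cmp_to_key(locale.strcoll))
print(ordered)
```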

Change History

comment:1 Changed 7 years ago by bilbo

Isn't it the same issue as #1536 ?

comment:2 Changed 7 years ago by Wiseman1024

I can confirm that it seems to be the same issue as #1536, because it seems to happen with case-insensitive sorting only. I've provided additional information (an example use case, mc and Linux versions, locale/collation settings, and the fact that strcoll seems to work properly); maybe the bugs could be merged?

comment:3 Changed 7 years ago by andrew_b

strutilutf8.c

    static char *
    str_utf8_create_key_for_filename (const char *text, int case_sen)
    {
        return str_utf8_create_key_gen (text, case_sen,
                                        g_utf8_collate_key_for_filename);
    }

In order to sort filenames correctly, the g_utf8_collate_key_for_filename() function treats the dot '.' as a special case. Most dictionary orderings seem to consider it insignificant, thus producing the ordering "event.c" "eventgenerator.c" "event.h" instead of "event.c" "event.h" "eventgenerator.c". Also, we would like to treat numbers intelligently so that "file1" "file10" "file5" is sorted as "file1" "file5" "file10".

This bug would be fixed by using the g_utf8_collate_key() function instead of g_utf8_collate_key_for_filename().
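The effect of swapping the key generator can be sketched in Python. Everything here is an assumption made for illustration: plain_key and filename_key are rough stand-ins for g_utf8_collate_key() and g_utf8_collate_key_for_filename() (they ignore locale rules and the special handling of '.'), and create_key is a hypothetical helper that only mirrors the idea of passing the key generator as a parameter, not the real str_utf8_create_key_gen().

```python
import re

def plain_key(text):
    """Stand-in for g_utf8_collate_key(): plain code-point order (assumption)."""
    return text

def filename_key(text):
    """Stand-in for g_utf8_collate_key_for_filename(): digit runs compare by value."""
    chunks = re.split(r"(\d+)", text)
    # Odd indices hold the captured digit runs, even indices the text between them.
    # The leading tag keeps integers and strings from being compared to each other.
    return [(0, int(c)) if i % 2 else (1, c) for i, c in enumerate(chunks) if c]

def create_key(text, case_sensitive, keygen):
    """Hypothetical helper: fold case if requested, then apply the key generator."""
    if not case_sensitive:
        text = text.casefold()
    return keygen(text)

names = ["file10", "File5", "file1"]

# Number-aware key: the order reported in this ticket.
with_filename_key = sorted(names, key=lambda n: create_key(n, False, filename_key))
# Plain key: the order the reporter expects.
with_plain_key = sorted(names, key=lambda n: create_key(n, False, plain_key))

print(with_filename_key)  # ['file1', 'File5', 'file10']
print(with_plain_key)     # ['file1', 'file10', 'File5']
```

Only the key generator argument changes between the two calls, which is the shape of the one-line change proposed above.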

comment:4 Changed 7 years ago by andrew_b

The "In order to sort ... file10" paragraph is a quotation from the original GLib documentation.

comment:5 Changed 7 years ago by andrew_b

  • Blocked By 1536 added

(In #1536) Created 1536_utf8_sort_files branch. Parent branch is master.
changeset:675de7f3a692b20566568904fc58974c39a36af1

comment:6 Changed 7 years ago by andrew_b

  • Status changed from new to accepted
  • Owner set to andrew_b

comment:7 Changed 7 years ago by angel_il

  • Votes for changeset set to angel_il
  • severity changed from no branch to on review

comment:8 Changed 7 years ago by angel_il

  • Votes for changeset angel_il deleted
  • severity changed from on review to no branch

sorry...

comment:9 Changed 7 years ago by andrew_b

  • Blocked By 1536 removed

comment:10 Changed 7 years ago by andrew_b

  • Status changed from accepted to testing
  • Resolution set to fixed

comment:11 Changed 7 years ago by andrew_b

  • Status changed from testing to closed