qsort switching to insertsort
Alfred Perlstein
alfred at freebsd.org
Sun Nov 27 06:40:22 UTC 2016
A couple notes:
1) It's interesting to me that it's based on a simple heuristic as
opposed to also checking the depth of the recursion within the qsort
itself.
2) Wondering if upon detection of this situation it might even be faster
to perform some randomization (shuffle) of the data and then retry.
3) Wondering if there could be a way to return an error and indicate the
data is unsorted if the corner case is hit allowing the caller to make a
choice at that point. Obviously the API would need to be changed.
I haven't thought too hard on this.
One thing to keep in mind: Using qsort in a codepath has "deadlines" or
other deterministic needs is not going to work out well. It's better to
use a sort with a known complexity with an upper bound than something
that can be defeated by a pathological input data set.
-Alfred
On 11/26/16 8:17 PM, Tristan Verniquet wrote:
> From: Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com>
> Sent: Sunday, 27 November 2016 5:51 AM
> To: Arne Dag Fidjestøl
> Cc: Tristan Verniquet; Hans Petter Selasky; freebsd hackers
> Subject: Re: qsort switching to insertsort
>
>
>
> On Sat, Nov 26, 2016 at 9:14 AM, Arne Dag Fidjestøl <adf at idi.ntnu.no> wrote:
>
>> On 26 Nov 2016, at 15:35, Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com> wrote:
>>
>> In quick sort , it is necessary to check worst case and switch to another
>> sort method if it is detected .
>> When this is not done , end result is stack overflow , etc . , but not
>> success .
>> Therefore , it is not possible to eliminate an alternative sort method .
> I you sort the smaller partition recursively first, and then sort the larger partition either by tail recursion or iteration, you will only consume O(log n) of stack, so stack overflow needn’t be an issue, regardless of the input.
>
> -adf
>
>
>
> Important problem is caused by almost sorted values . Myself , I am counting the sorted elements first , if they exceed a large percentage of number of all elements , then I am switching to an alternative sort , otherwise a quick sort is used . This is for a simple application .
>
> In an operating system , more complex algorithms may be more useful .
>
>
> Mehmet Erol Sanliturk
>
> ---
>
> It can still trigger with completely unsorted data in the top and bottom half, as long as it chooses the middle value for the pivot. The main reason nearly sorted data is vulnerable is that it is more likely to match these conditions, and more likely to happen in real life situations.
>
> But this is why i don't really consider it an "edge" case, there would be a whole class of situations (like the one we had) where it would be very likely to trigger with very bad side effects.
>
> Maybe, does anyone continue to use FreeBSD qsort while being aware of this implementation detail? Under what conditions/assurances?
>
> Does anyone use FreeBSDs qsort because of this feature?
>
> Tristan
> ________________________________
> From: Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com>
> Sent: Sunday, 27 November 2016 5:51:31 AM
> To: Arne Dag Fidjestøl
> Cc: Tristan Verniquet; Hans Petter Selasky; freebsd hackers
> Subject: Re: qsort switching to insertsort
>
>
>
> On Sat, Nov 26, 2016 at 9:14 AM, Arne Dag Fidjestøl <adf at idi.ntnu.no<mailto:adf at idi.ntnu.no>> wrote:
>
>> On 26 Nov 2016, at 15:35, Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com<mailto:m.e.sanliturk at gmail.com>> wrote:
>>
>> In quick sort , it is necessary to check worst case and switch to another
>> sort method if it is detected .
>> When this is not done , end result is stack overflow , etc . , but not
>> success .
>> Therefore , it is not possible to eliminate an alternative sort method .
> I you sort the smaller partition recursively first, and then sort the larger partition either by tail recursion or iteration, you will only consume O(log n) of stack, so stack overflow needn’t be an issue, regardless of the input.
>
> -adf
>
>
>
> Important problem is caused by almost sorted values . Myself , I am counting the sorted elements first , if they exceed a large percentage of number of all elements , then I am switching to an alternative sort , otherwise a quick sort is used . This is for a simple application .
>
> In an operating system , more complex algorithms may be more useful .
>
>
> Mehmet Erol Sanliturk
>
>
>
> _______________________________________________
> freebsd-hackers at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe at freebsd.org"
>
More information about the freebsd-hackers
mailing list