qsort switching to insertsort
Mehmet Erol Sanliturk
m.e.sanliturk at gmail.com
Sun Nov 27 21:33:43 UTC 2016
On Sat, Nov 26, 2016 at 10:40 PM, Alfred Perlstein <alfred at freebsd.org>
wrote:
> A couple notes:
>
> 1) It's interesting to me that it's based on a simple heuristic as opposed
> to also checking the depth of the recursion within the qsort itself.
>
>
Aim is not to enter into quick sort at the beginning . Number of sorted
pairs can be counted on ( input or generation ) of data to sort . Based on
this count , the sorting method may be selected as either quick or an
alternative sort which will be able to sort the data deterministically .
It is possible to implement a robust quick sort if ( time and/or resources
) are sufficient to use . When there is not sufficient resources , a simple
method is better than a more optimized ( which will not be attainable at
present state ) .
> 2) Wondering if upon detection of this situation it might even be faster
> to perform some randomization (shuffle) of the data and then retry.
>
>
Instead of ( entering into quick sort , fail , and , search a way to
recover ) , I would prefer to use another method will will produce a
desired result by using more time .
Sometimes , it is very difficult to generate a "universal" algorithm that
will be able to cure "all" difficulties .
> 3) Wondering if there could be a way to return an error and indicate the
> data is unsorted if the corner case is hit allowing the caller to make a
> choice at that point. Obviously the API would need to be changed.
>
>
Problem is lying in the complexity of writing a very robust quick sort
algorithm . There are many ( at least parts in ) books , papers on these
subject . There is also a cost of obtaining / reading / utilizing such
resources . When the available resource is only a "data structures" book ,
the cheapest solution may be the above described method to sort the data
without entering into error generation and recover from it .
> I haven't thought too hard on this.
>
> One thing to keep in mind: Using qsort in a codepath has "deadlines" or
> other deterministic needs is not going to work out well. It's better to
> use a sort with a known complexity with an upper bound than something that
> can be defeated by a pathological input data set.
>
> -Alfred
>
>
> On 11/26/16 8:17 PM, Tristan Verniquet wrote:
>
>> From: Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com>
>> Sent: Sunday, 27 November 2016 5:51 AM
>> To: Arne Dag Fidjestøl
>> Cc: Tristan Verniquet; Hans Petter Selasky; freebsd hackers
>> Subject: Re: qsort switching to insertsort
>>
>>
>>
>> On Sat, Nov 26, 2016 at 9:14 AM, Arne Dag Fidjestøl <adf at idi.ntnu.no>
>> wrote:
>>
>> On 26 Nov 2016, at 15:35, Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com>
>>> wrote:
>>>
>>> In quick sort , it is necessary to check worst case and switch to another
>>> sort method if it is detected .
>>> When this is not done , end result is stack overflow , etc . , but not
>>> success .
>>> Therefore , it is not possible to eliminate an alternative sort method .
>>>
>> I you sort the smaller partition recursively first, and then sort the
>> larger partition either by tail recursion or iteration, you will only
>> consume O(log n) of stack, so stack overflow needn’t be an issue,
>> regardless of the input.
>>
>> -adf
>>
>>
>>
>> Important problem is caused by almost sorted values . Myself , I am
>> counting the sorted elements first , if they exceed a large percentage of
>> number of all elements , then I am switching to an alternative sort ,
>> otherwise a quick sort is used . This is for a simple application .
>>
>> In an operating system , more complex algorithms may be more useful .
>>
>>
>> Mehmet Erol Sanliturk
>>
>> ---
>>
>> It can still trigger with completely unsorted data in the top and bottom
>> half, as long as it chooses the middle value for the pivot. The main reason
>> nearly sorted data is vulnerable is that it is more likely to match these
>> conditions, and more likely to happen in real life situations.
>>
>> But this is why i don't really consider it an "edge" case, there would be
>> a whole class of situations (like the one we had) where it would be very
>> likely to trigger with very bad side effects.
>>
>> Maybe, does anyone continue to use FreeBSD qsort while being aware of
>> this implementation detail? Under what conditions/assurances?
>>
>> Does anyone use FreeBSDs qsort because of this feature?
>>
>> Tristan
>> ________________________________
>> From: Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com>
>> Sent: Sunday, 27 November 2016 5:51:31 AM
>> To: Arne Dag Fidjestøl
>> Cc: Tristan Verniquet; Hans Petter Selasky; freebsd hackers
>> Subject: Re: qsort switching to insertsort
>>
>>
>>
>> On Sat, Nov 26, 2016 at 9:14 AM, Arne Dag Fidjestøl <adf at idi.ntnu.no
>> <mailto:adf at idi.ntnu.no>> wrote:
>>
>> On 26 Nov 2016, at 15:35, Mehmet Erol Sanliturk <m.e.sanliturk at gmail.com
>>> <mailto:m.e.sanliturk at gmail.com>> wrote:
>>>
>>> In quick sort , it is necessary to check worst case and switch to another
>>> sort method if it is detected .
>>> When this is not done , end result is stack overflow , etc . , but not
>>> success .
>>> Therefore , it is not possible to eliminate an alternative sort method .
>>>
>> I you sort the smaller partition recursively first, and then sort the
>> larger partition either by tail recursion or iteration, you will only
>> consume O(log n) of stack, so stack overflow needn’t be an issue,
>> regardless of the input.
>>
>> -adf
>>
>>
>>
>> Important problem is caused by almost sorted values . Myself , I am
>> counting the sorted elements first , if they exceed a large percentage of
>> number of all elements , then I am switching to an alternative sort ,
>> otherwise a quick sort is used . This is for a simple application .
>>
>> In an operating system , more complex algorithms may be more useful .
>>
>>
>> Mehmet Erol Sanliturk
>>
>>
>>
>> _______________________________________________
>>
>>
Mehmet Erol Sanliturk
More information about the freebsd-hackers
mailing list