How does a DBMS implement their own sorting algorithm? Or do they?


When a SQL is translated to C by parser such as YACC or BISON, does that piece of translated C code contains the the sorting algorithm mathematics? I do not understand how the sorting is implemented in a DBMS (such as MySQL or Microsoft SQL Server) - is the algorithm part of the grammar parser? Or, does the algorithm applies to a resulted group of data only after it get fetched from an SQL query, but not directly apply to the computer memory? Or is the sorting algorithm an ISO standard and that all DBMS are required to use the same algorithm?

I did my researched and googling but find no clear answer. Without unneedingly reading a book on database internal, could someone explain the concept clearly?

Answer

The sorting algorithm is most certainly not a part of the grammar parser, it's technically an 'implementation detail'. It's a rather important one though, as it can fundamentally impact performance of complex queries. The term 'implementation detail' however refers to that it's up to the DBMS vendor to decide what to do and how to do it.

It could even be partially delegated to the query optimizer, as the common sorting algorithms like heapsort, mergesort, quicksort et al all have different 'best case scenarios'. Some perform notably better on 'mostly sorted data', and others are extremely slow on 'extremely unsorted data'. As indexes could contain hints on that a very smart DBMS could even pick a different sorting algorithm based on the data at hand, see this Wikipedia writeup for a comparison. To my knowledge none of the current vendors do that though.

So in the end, what sorting algorithms are used when is just a black box from the programmer's perspective. All you (should) care about is that the output is sorted correctly.

source: stackoverflow.com
js interview questions