Description:
Solving order of UNDO log files
-------------------------------
Previously all UNDO log files were initialised with LSN = 0 in
page 1. This meant that the order of the UNDO log files was
essentially the order in which the file objects were created.
This order is based on the id of the UNDO log file, which is not
necessarily monotonically increasing. At creation of the UNDO log
the first UNDO log file is chosen as HEAD. However, at restart the
last UNDO log file is chosen as HEAD. Once the first UNDO log file
has been filled most of the problem disappears, since the restart
sorts the files and this can salvage the situation.
The remedy for this problem is to ensure that the very first UNDO
log file is initialised with LSN = 1. This means that the sorting
is sure to pick this UNDO log file again as the current UNDO log
file. If the first UNDO log file has been filled then any file can
be chosen as the new UNDO log file, but once it is chosen it will
write a higher LSN into page 1 and thus ensure that it is picked
as the current UNDO log file at restart.
A side effect of this is that the first LSN needs to be 1 and
not 0, so it affects the initial values of LSN.
To increase the probability of avoiding this problem in an upgrade
situation, if all files have LSN = 0 in page 1 the first UNDO log
file will be chosen as the current one.
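The selection rule described above can be sketched roughly as
follows. This is only an illustration and not the actual NDB code:
the UndoFile struct, its fields and the pick_head function are
hypothetical names introduced here. It assumes the files are held
in creation order and that the current file is the one with the
highest LSN in page 1, with ties (including the all-zero upgrade
case) going to the earliest file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an UNDO log file; not the real NDB structure.
struct UndoFile {
  std::uint32_t file_id;    // id assigned to the file; need not be increasing
  std::uint64_t page1_lsn;  // LSN stored in page 1 (0 in files from old code)
};

// Returns the index, in creation order, of the file to use as the
// current UNDO log file at restart.
std::size_t pick_head(const std::vector<UndoFile>& files) {
  assert(!files.empty());
  std::size_t head = 0;  // default to the first file (all-zero upgrade case)
  for (std::size_t i = 1; i < files.size(); ++i) {
    // Strictly greater, so ties keep the earliest file as HEAD.
    if (files[i].page1_lsn > files[head].page1_lsn) {
      head = i;
    }
  }
  return head;
}
```

With the first file initialised to LSN = 1 and the others to 0, the
sketch returns the first file regardless of the file ids, which is
the behaviour the fix is meant to guarantee.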
Fixing find of UNDO_END
-----------------------
Previously the last LSN was 0, so m_last_read_lsn == 1 meant that
we had reached the end of the UNDO log and we returned an UNDO_END
log record. Now we should check m_last_read_lsn == 2 instead.
On its own this makes it possible to miss the last UNDO log record
in upgrade cases. To avoid this issue we write m_last_lsn into the
page header of page 1. This page header field was always 0 in old
code; new code calls it m_last_lsn and sets it to 1. The value
found during the binary search is used to set m_last_lsn in the
log group record. The variable is initialised to 1 and will stay 1
unless an old UNDO log record is read, in which case it is changed
to 0.
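One way to read the check above is that the end of the UNDO log is
reached when the last read LSN is one higher than m_last_lsn, which
gives 2 for new files and 1 for files written by old code. The
sketch below illustrates that reading only. LogGroupState,
note_page_header and reached_undo_end are hypothetical names, and
the generalised comparison is an assumption rather than something
taken from the actual patch.

```cpp
#include <cstdint>

// Hypothetical model of the state kept in the log group record.
struct LogGroupState {
  std::uint64_t m_last_lsn = 1;       // 1 in new code, 0 once old data is seen
  std::uint64_t m_last_read_lsn = 0;  // LSN most recently read from the log
};

// Called with the value found in a page header during the binary search.
// A 0 means the page was written by old code.
void note_page_header(LogGroupState& s, std::uint64_t header_last_lsn) {
  if (header_last_lsn == 0) {
    s.m_last_lsn = 0;
  }
}

// Assumed end condition: m_last_read_lsn == 2 for new files,
// m_last_read_lsn == 1 for files written by old code.
bool reached_undo_end(const LogGroupState& s) {
  return s.m_last_read_lsn == s.m_last_lsn + 1;
}
```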
How to repeat:
daily-basic--07 testSystemRestart -n SR_UNDO T6
Create at least 2 UNDO log files.
Insert some data in disk columns.
Ensure that the first UNDO log file isn't filled.
Add a few more UNDO log files (optional).
Restart.
This can result in LCP stops, crashes and other problems.
As soon as the first UNDO log file is filled the problem is gone.
Suggested fix:
See description