Description:
The function Transporter::update_status_overloaded() is called in two places:
1. In TransporterRegistry::updateWritePtr(), it occurs after appending signals into thr_data::m_send_buffers[trp_id], and it checks the length of signals in that local buffer.
2. In Transporter::iovec_data_sent(), it occurs after sending signals, and it checks the remaining length of signals in g_thr_repository->m_send_buffers[trp_id]->m_sending_buffer, the global sending buffer.
I have two questions:
1. These two places check different buffer lengths—one is the local buffer, and the other is the global sending buffer. Both lengths are used to decide whether the transporter should be set to an overloaded status. Is it expected that both these buffers can determine the transporter's overloaded status? Given their distinct usages and lengths, using them together to decide the status could potentially influence each other.
2. If yes, then the first buffer appears to use the wrong length for checking. It utilizes only the used length of the last page of the local buffer, which is significantly smaller than the overloaded limitation. As a result, it can never reach the limitation.
Uint32
mt_send_handle::updateWritePtr(TrpId trp_id,
Uint32 lenBytes,
Uint32 prio)
{
struct thr_send_buffer * b = m_selfptr->m_send_buffers+trp_id;
thr_send_page * p = b->m_last_page;
p->m_bytes += lenBytes;
return p->m_bytes; //<-------------- return the used bytes of last page
}
void
TransporterRegistry::updateWritePtr(TransporterSendBufferHandle *handle,
Transporter* t,
TrpId trp_id,
Uint32 lenBytes,
Uint32 prio)
{
Uint32 used = handle->updateWritePtr(trp_id, lenBytes, prio);
t->update_status_overloaded(used); // <------------use it to decide the overloaded status
if (t->send_limit_reached(used))
{
//-------------------------------------------------
// Buffer is full and we are ready to send. We will
// not wait since the signal is already in the buffer.
// Force flag set has the same indication that we
// should always send. If it is not possible to send
// we will not worry since we will soon be back for
// a renewed trial.
//-------------------------------------------------
if (t->send_is_possible(0))
{
//-------------------------------------------------
// Send was possible, attempt at a send.
//-------------------------------------------------
handle->forceSend(trp_id);
}//if
}
}
How to repeat:
reviewing code at src/common/transporter/TransporterRegistry.cpp:L3895
Suggested fix:
Consider using a uniform checking place, such as the global send and sending buffer, instead of checking two different buffers.
This approach can provide consistency in evaluating the overloaded status without relying on different buffer lengths and usages.