Skip to content

Commit

Permalink
HPCC-32789 Fix potential race deadlock in Thor splitter
Browse files Browse the repository at this point in the history
If the splitter reading arms (COutput) were reading from the same
page (CRowSet chunk) as the write ahead was writing to, then the
write ahead could expand that row set and cause the reader to
read unexpected row data (e.g. null rows).
This caused the splitter arm to premuturely finish, leaving the
splitter unbalanced and stalling as the writeahead blocked soon
afterwards since the finished arm was too far behind.

The bug was that the row set should never expand. It is pre-sized
to avoid that. However, the condition of 'fullness' was incorrect,
relying only on a dynamic calculation of total row size. The fix
is to also check that the number of rows does not exceed the
capacity.

NB: The bug predates the new splitter code, however the new
splitter implementation also changed the way the splitter arms
interacted with writeahead. Previously the arm would call
writeahead once it hit max explicitly, rather than blocking in
the underlying ISharedSmartBuffer implementation.
But it would still be possible I think to hit this bug (albeit
less likely).

Signed-off-by: Jake Smith <jake.smith@lexisnexisrisk.com>
  • Loading branch information
jakesmith committed Oct 25, 2024
1 parent 7b3f72c commit 0b2b661
Show file tree
Hide file tree
Showing 2 changed files with 19 additions and 2 deletions.
17 changes: 16 additions & 1 deletion thorlcr/thorutil/thbuf.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1272,6 +1272,10 @@ class CRowSet : public CSimpleInterface, implements IInterface
{
return rows.get(r);
}
inline bool isFull() const
{
return rows.isFull();
}
};

class Chunk : public CInterface
Expand Down Expand Up @@ -1395,6 +1399,7 @@ class CSharedWriteAheadBase : public CSimpleInterface, implements ISharedSmartBu
Owned<CRowSet> outputOwnedRows;
CRowSet *rowSet;
unsigned row, rowsInRowSet;
bool lastWasNull=false; // for sanity check only (there should never be two consequetive nulls)

void init()
{
Expand Down Expand Up @@ -1464,6 +1469,15 @@ class CSharedWriteAheadBase : public CSimpleInterface, implements ISharedSmartBu
}
rowsRead++;
const void *retrow = rowSet->getRow(row++);
if (lastWasNull)
{
if (retrow)
lastWasNull = false;
else
throw makeStringExceptionV(0, "Double null detected - output=%u, rowsRead = %" RCPF "u, rowsWritten = %" RCPF "u, writeAtEof = %s", output, rowsRead, parent.rowsWritten, boolToStr(parent.writeAtEof));
}
else
lastWasNull = (retrow == nullptr);
return retrow;
}
virtual void stop()
Expand Down Expand Up @@ -1693,7 +1707,8 @@ class CSharedWriteAheadBase : public CSimpleInterface, implements ISharedSmartBu
unsigned len=rowSize(row);
CriticalBlock b(crit);
bool paged = false;
if (totalOutChunkSize >= minChunkSize) // chunks required to be at least minChunkSize
// NB: The isFull condition ensures that we never expand inMemRows, which would cause a race with readers reading same set
if (totalOutChunkSize >= minChunkSize || inMemRows->isFull()) // chunks required to be at least minChunkSize, or if hits max capacity
{
unsigned reader=anyReaderBehind();
if (NotFound != reader)
Expand Down
4 changes: 3 additions & 1 deletion thorlcr/thorutil/thmem.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -327,7 +327,8 @@ class graph_decl CThorExpandingRowArray : public CSimpleInterface
if (!resize(numRows+1))
return false;
}
rows[numRows++] = row;
rows[numRows] = row;
numRows++;
return true;
}
bool binaryInsert(const void *row, ICompare &compare, bool dropLast=false); // NB: takes ownership on success
Expand Down Expand Up @@ -356,6 +357,7 @@ class graph_decl CThorExpandingRowArray : public CSimpleInterface
}
inline rowidx_t ordinality() const { return numRows; }
inline rowidx_t queryMaxRows() const { return maxRows; }
inline bool isFull() const { return numRows >= maxRows; }

inline const void **getRowArray() { return rows; }
void swap(CThorExpandingRowArray &src);
Expand Down

0 comments on commit 0b2b661

Please sign in to comment.