PostgreSQL source code (5) buffer management
2022-07-18 17:22:00 【mingjie】
Learning notes: https://www.interdb.jp/pg/pgsql08.html
Preface
Imagine the functions a simple buffer manager should provide:
- Write
- Request a free PAGE to write into, mapped to a PAGE on disk
- When the pool is full, automatically flush and evict a PAGE
- After writing, choose whether to flush the page to disk immediately or lazily
- Read
- Read a PAGE that is already cached
- Read a PAGE that is on disk
- When the pool is full, automatically flush and evict a PAGE, then read the one needed
Below is a minimal sketch of what such an interface might look like; after that, let's see how PG actually implements it.
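As a rough mental model, such an interface might look like the following minimal sketch. The type and function names here are invented for illustration; they are not PostgreSQL's actual API, which the rest of this article walks through.
/* Hypothetical minimal buffer manager interface -- illustration only */
typedef struct PageTag { int rel; int blockNum; } PageTag;
typedef int BufId;

BufId buf_read(PageTag tag);      /* return the cached page, or load it from disk,
                                   * evicting a victim if the pool is full        */
BufId buf_alloc(PageTag tag);     /* grab a free page slot for writing            */
void  buf_mark_dirty(BufId id);   /* flush to disk immediately or lazily later    */
void  buf_release(BufId id);      /* drop the caller's reference                  */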
PG Implementation
1 TAG
{(16821, 16384, 37721), 1, 3}
means:
- tablespace = 16821
- db = 16384
- table = 37721
- fork 1, i.e. the free space map
- block 3 in that file
typedef struct buftag
{
RelFileNode rnode; /* physical relation identifier */
typedef struct RelFileNode
{
Oid spcNode; /* tablespace */
Oid dbNode; /* database */
Oid relNode; /* relation */
} RelFileNode;
ForkNumber forkNum; // the main fork, free space map and visibility map are fork numbers 0, 1 and 2
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
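As a small illustration (not from the original article), the example tag {(16821, 16384, 37721), 1, 3} above could be built with the INIT_BUFFERTAG macro that BufferAlloc uses later in section 5; the OIDs are just the example values:
/* sketch: building the example tag; fork 1 is the free space map */
RelFileNode rnode = { .spcNode = 16821, .dbNode = 16384, .relNode = 37721 };
BufferTag   tag;

INIT_BUFFERTAG(tag, rnode, FSM_FORKNUM, 3);   /* FSM_FORKNUM == 1, block 3 */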
2 Structure
The buffer manager is a three-tier structure (the second tier is logical):
- The buffer table maps a tag directly to a buffer id via hashing
- Before a buffer slot is used, the buffer id is used to find the desc and check the slot's meta information to decide whether the slot can be used
- So logically there is a desc array layer in the middle
2.1 Buffer Table
Note that this is the core entry data structure: it maps a tag to a buffer id.
- key
- BufferTag
- RelFileNode rnode
- ForkNumber forkNum
- BlockNumber blockNum
- value
- BufferLookupEnt
- BufferTag key
- int id
It is a partitioned hash table, so the locking granularity is finer.
void
InitBufTable(int size)
{
HASHCTL info;
/* assume no locking is needed yet */
/* BufferTag maps to Buffer */
info.keysize = sizeof(BufferTag);
info.entrysize = sizeof(BufferLookupEnt);
info.num_partitions = NUM_BUFFER_PARTITIONS;
SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
size, size,
&info,
HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
}
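As a sketch of how this table is queried, the pattern below is the same one BufferAlloc uses in section 5; rnode and blockNum are assumed to be given by the caller:
/* sketch: look up a buffer id by tag under the mapping partition lock */
BufferTag tag;
uint32    hashcode;
int       buf_id;

INIT_BUFFERTAG(tag, rnode, MAIN_FORKNUM, blockNum);
hashcode = BufTableHashCode(&tag);

LWLockAcquire(BufMappingPartitionLock(hashcode), LW_SHARED);
buf_id = BufTableLookup(&tag, hashcode);      /* >= 0 means the page is already cached */
LWLockRelease(BufMappingPartitionLock(hashcode));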
2.2 Buffer Descriptor
This is also an array, initialized here:
void
InitBufferPool(void)
{
bool foundBufs,
foundDescs,
foundIOLocks,
foundBufCkpt;
/* Align descriptors to a cacheline boundary. */
BufferDescriptors = (BufferDescPadded *)
ShmemInitStruct("Buffer Descriptors",
NBuffers * sizeof(BufferDescPadded),
&foundDescs);
...
Each array element is a BufferDescPadded; one buffer corresponds to one desc:
typedef union BufferDescPadded
{
BufferDesc bufferdesc;
char pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;
The contents in detail:
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
*
* Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
【The BM_LOCKED (buffer header lock) must be held to read or write the tag, state and wait_backend_pid fields】
* the tag, state or wait_backend_pid fields. In general, buffer header lock
* is a spinlock which is combined with flags, refcount and usagecount into
* single atomic variable. This layout allow us to do some operations in a
* single atomic operation, without actually acquiring and releasing spinlock;
* for instance, increase or decrease refcount. buf_id field never changes
* after initialization, so does not need locking. freeNext is protected by
* the buffer_strategy_lock not buffer header lock. The LWLock can take care
* of itself. The buffer header lock is *not* used to control access to the
* data in the buffer!
*
* It's assumed that nobody changes the state field while buffer header lock
* is held. Thus buffer header lock holder can do complex updates of the
* state variable in single write, simultaneously with lock release (cleaning
* BM_LOCKED flag). On the other hand, updating of state without holding
* buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
* is not set. Atomic increment/decrement, OR/AND etc. are not allowed.
*
【It is assumed that nobody updates state while BM_LOCKED (the buffer header lock) is held】
【So the lock holder can perform complex updates of state and release the lock in a single write】
* An exception is that if we have the buffer pinned, its tag can't change
* underneath us, so we can examine the tag without locking the buffer header.
* Also, in places we do one-time reads of the flags without bothering to
* lock the buffer header; this is generally for situations where we don't
* expect the flag bit being tested to be changing.
*
* We can't physically remove items from a disk page if another backend has
* the buffer pinned. Hence, a backend may need to wait for all other pins
* to go away. This is signaled by storing its own PID into
* wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER. At present,
* there can be only one such waiter per buffer.
*
* We use this same struct for local buffer headers, but the locks are not
* used and not all of the flag bits are useful either. To avoid unnecessary
* overhead, manipulations of the state field should be done without actual
* atomic operations (i.e. only pg_atomic_read_u32() and
* pg_atomic_unlocked_write_u32()).
*
* Be careful to avoid increasing the size of the struct when adding or
* reordering members. Keeping it below 64 bytes (the most common CPU
* cache line size) is fairly important for performance.
*/
typedef struct BufferDesc
{
BufferTag tag; /* ID of page contained in buffer */
int buf_id; /* buffer's index number (from 0) */
/* state of the tag, containing flags, refcount and usagecount */
pg_atomic_uint32 state;
int wait_backend_pid; /* backend PID of pin-count waiter */
int freeNext; /* link in freelist chain */
LWLock content_lock; /* to lock access to buffer contents */
} BufferDesc;
- tag/buf_id: described above
- state
- flags
- dirty bit: indicates whether the stored page is dirty.
- valid bit: whether the page can be read. (1) valid: the slot holds data and the corresponding desc holds data, so the page can be read. (2) invalid: the desc holds no data, or a page replacement is in progress.
- io_in_progress bit: whether the buffer manager is reading/writing the associated page from/to storage. In other words, this bit indicates whether a process holds the descriptor's io_in_progress_lock.
- refcount
- Records the number of processes accessing the current page, also called the pin count. A process must increment the pin count before accessing the page and decrement it after use. When the pin count is 0 the page is "unpinned"; when it is nonzero the page is "pinned".
- usagecount: the number of times the page has been accessed since it was loaded; used by the clock sweep algorithm.
- freeNext: the next free buffer; this chains a free list over the array
【A desc describes a page in one of three states】 (a small sketch of reading these states off the packed state word follows this list)
- Empty
- When the corresponding buffer pool slot does not store a page (i.e. refcount and usage_count are 0), the descriptor is empty.
- Pinned
- When the corresponding buffer pool slot stores a page and some PostgreSQL process is accessing it (i.e. refcount and usage_count are greater than or equal to 1), the descriptor is pinned.
- Unpinned
- When the corresponding buffer pool slot stores a page but no PostgreSQL process is accessing it (i.e. usage_count is greater than or equal to 1 but refcount is 0), the descriptor is unpinned.
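A minimal sketch (assuming the state-accessor macros shown later in this article) of how those three states can be read off a descriptor's packed state word:
/* sketch: classify a descriptor from its packed state word */
uint32 state = pg_atomic_read_u32(&buf->state);

if (BUF_STATE_GET_REFCOUNT(state) > 0)
    ;   /* pinned: at least one process has the page pinned */
else if (BUF_STATE_GET_USAGECOUNT(state) > 0)
    ;   /* unpinned: the page is present and was recently used, but nobody pins it */
else
    ;   /* empty / reclaimable: refcount and usage_count are both 0 */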
2.3 Buffer Descriptor logical layer
The BufferDescriptors array is initialized, and the free list is chained with buf->freeNext = i + 1:
...
BufferDescPadded *BufferDescriptors;
...
void
InitBufferPool(void)
{
bool foundBufs,
foundDescs,
foundIOLocks,
foundBufCkpt;
/* Align descriptors to a cacheline boundary. */
BufferDescriptors = (BufferDescPadded *)
ShmemInitStruct("Buffer Descriptors",
NBuffers * sizeof(BufferDescPadded),
&foundDescs);
...
...
/*
* Initialize all the buffer headers.
*/
for (i = 0; i < NBuffers; i++)
{
BufferDesc *buf = GetBufferDescriptor(i);
CLEAR_BUFFERTAG(buf->tag);
pg_atomic_init_u32(&buf->state, 0);
buf->wait_backend_pid = 0;
buf->buf_id = i;
/*
* Initially link all the buffers together as unused. Subsequent
* management of this list is done by freelist.c.
*/
buf->freeNext = i + 1;
LWLockInitialize(BufferDescriptorGetContentLock(buf),
LWTRANCHE_BUFFER_CONTENT);
LWLockInitialize(BufferDescriptorGetIOLock(buf),
LWTRANCHE_BUFFER_IO_IN_PROGRESS);
}
...
...
The process of loading the first page:
- Take a free desc from the freelist and pin it (refcount++, usage_count++)
- Add a new entry to the buffer table, recording tag : buffer_id
- Read the page contents from storage into memory
- Update the meta information in the desc
After use the desc is not returned to the freelist, unless:
- the table or index is deleted
- the database is deleted
- the table or index is emptied by VACUUM FULL
2.4 Buffer Pool
A contiguous block of shared memory whose size is 8K * NBuffers:
BufferBlocks = (char *)
ShmemInitStruct("Buffer Blocks",
NBuffers * (Size) BLCKSZ, &foundBufs);
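For orientation, the address of a buffer's 8 KB block inside this region is plain pointer arithmetic. The real BufferGetBlock macro in bufmgr.h looks roughly like the simplified sketch below; the actual macro additionally validates the buffer number and handles local buffers.
/* simplified sketch of the shared-buffer case; buffer numbers are 1-based */
#define BufferGetBlockSketch(buffer) \
    ((Block) (BufferBlocks + ((Size) ((buffer) - 1)) * BLCKSZ))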
3 Locks
These locks all live in shared memory.
3.1 Buffer Table Locks
BufMappingLock
The partition locks of the hash table, taken in shared (S) or exclusive (E) mode.
3.2 Desc lock
content_lock
A lightweight lock guarding reads and writes of the PAGE contents, taken in shared (S) or exclusive (E) mode.
- Exclusive (E) mode is used in the following cases:
- inserting rows into a page, or modifying the t_xmin/t_xmax fields of tuples
- physically deleting tuples or compacting the free space of a page (vacuum)
- freezing tuples in the page
io_in_progress_lock
Used to wait for a PAGE's IO to complete. While a process loads page data from storage or writes it out, it holds the corresponding descriptor's io_in_progress lock exclusively.
spinlock (now the BM_LOCKED flag bit)
The desc's flags and other fields are modified under this spinlock.
for example PIN:
LockBufHdr(bufferdesc); /* Acquire a spinlock */
bufferdesc->refcont++;
bufferdesc->usage_count++;
UnlockBufHdr(bufferdesc); /* Release the spinlock */
For example, setting the dirty bit to '1':
#define BM_DIRTY (1 << 0) /* data needs writing */
#define BM_VALID (1 << 1) /* data is valid */
#define BM_TAG_VALID (1 << 2) /* tag is assigned */
#define BM_IO_IN_PROGRESS (1 << 3) /* read or write in progress */
#define BM_JUST_DIRTIED (1 << 5) /* dirtied since write started */
LockBufHdr(bufferdesc);
bufferdesc->flags |= BM_DIRTY;
UnlockBufHdr(bufferdesc);
4 Eviction strategy
There are four strategies:
typedef enum BufferAccessStrategyType
{
BAS_NORMAL, /* Normal random access */
BAS_BULKREAD, /* Large read-only scan (hint bit updates are
* ok) */
BAS_BULKWRITE, /* Large multi-block write (e.g. COPY IN) */
BAS_VACUUM /* VACUUM */
} BufferAccessStrategyType;
| BufferAccessStrategyType | Usage scenario | Replacement algorithm |
|---|---|---|
| BAS_NORMAL | Normal random reads and writes | clock sweep algorithm |
| BAS_BULKREAD | Bulk reads | ring algorithm, ring size 256 * 1024 / BLCKSZ |
| BAS_BULKWRITE | Bulk writes | ring algorithm, ring size 16 * 1024 * 1024 / BLCKSZ |
| BAS_VACUUM | VACUUM process | ring algorithm, ring size 256 * 1024 / BLCKSZ |
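With the default BLCKSZ of 8192, these ring sizes work out to 32 buffers (256 KB) for BAS_BULKREAD and BAS_VACUUM, and 2048 buffers (16 MB) for BAS_BULKWRITE.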
clock sweep Algorithm
/*
* The shared freelist control information.
*/
typedef struct
{
/* Spinlock: protects the values below */
// spinlock protecting the members below
slock_t buffer_strategy_lock;
/*
* Clock sweep hand: index of next buffer to consider grabbing. Note that
* this isn't a concrete buffer - we only ever increase the value. So, to
* get an actual buffer, it needs to be used modulo NBuffers.
*/
// next position to sweep (the clock hand)
pg_atomic_uint32 nextVictimBuffer;
// head of the free buffer list
int firstFreeBuffer; /* Head of list of unused buffers */
// tail of the free buffer list
int lastFreeBuffer; /* Tail of list of unused buffers */
/*
* NOTE: lastFreeBuffer is undefined when firstFreeBuffer is -1 (that is,
* when the list is empty)
*/
/*
* Statistics. These counters should be wide enough that they can't
* overflow during a single bgwriter cycle.
*/
// number of complete passes over the array
uint32 completePasses; /* Complete cycles of the clock sweep */
pg_atomic_uint32 numBufferAllocs; /* Buffers allocated since last reset */
/*
* Bgworker process to be notified upon activity or -1 if none. See
* StrategyNotifyBgWriter.
*/
int bgwprocno;
} BufferStrategyControl;
Each sweep resumes from the last position and checks the buffer's reference count (refcount) and access count (usagecount):
- If refcount and usagecount are both zero, return the buffer directly.
- If refcount is zero but usagecount is not, decrement usagecount by 1 and move to the next buffer.
- If refcount is not zero, move to the next buffer.
The clock sweep algorithm keeps looping until it finds a buffer whose refcount and usagecount are both zero, as in the sketch below.
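A condensed sketch of that loop is shown below (the function name is made up; the real implementation, StrategyGetBuffer, also handles the freelist, the bgwriter latch and a try counter, and is walked through in section 5):
/* condensed clock-sweep sketch, using the descriptor helpers shown in this article */
static BufferDesc *
clock_sweep_sketch(void)
{
    for (;;)
    {
        BufferDesc *buf = GetBufferDescriptor(ClockSweepTick());  /* advance the hand */
        uint32      state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(state) == 0 &&
            BUF_STATE_GET_USAGECOUNT(state) == 0)
            return buf;                      /* victim found; header spinlock still held,
                                              * as in the real code */

        if (BUF_STATE_GET_REFCOUNT(state) == 0)
            state -= BUF_USAGECOUNT_ONE;     /* cool the page down */

        UnlockBufHdr(buf, state);            /* keep sweeping */
    }
}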
Free list
- To speed up the search for free buffers, postgresql keeps them on a linked list.
- The head and tail of the list are given by the firstFreeBuffer and lastFreeBuffer members of the BufferStrategyControl structure.
- Each list node is represented by a BufferDesc structure; its freeNext member points to the next node.
- When a buffer becomes free it is appended to the tail of the list; when free space is needed, the head of the list is returned directly.
Reference counter
The low 18 bits of state record the reference count kept in shared memory.
typedef struct BufferDesc
{
BufferTag tag; /* ID of page contained in buffer */
int buf_id; /* buffer's index number (from 0) */
/* state of the tag, containing flags, refcount and usagecount */
pg_atomic_uint32 state;
int wait_backend_pid; /* backend PID of pin-count waiter */
int freeNext; /* link in freelist chain */
LWLock content_lock; /* to lock access to buffer contents */
} BufferDesc;
/*
* Buffer state is a single 32-bit variable where following data is combined.
*
* - 18 bits refcount
【Each backend also tracks its own pins in a private array + hash table cache; the shared refcount here counts each pinning backend only once】
* - 4 bits usage count
* - 10 bits of flags
*
* Combining these values allows to perform some operations without locking
* the buffer header, by modifying them together with a CAS loop.
*
* The definition of buffer state components is below.
*/
#define BUF_REFCOUNT_ONE 1
#define BUF_REFCOUNT_MASK ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK 0x003C0000U
#define BUF_USAGECOUNT_ONE (1U << 18)
#define BUF_USAGECOUNT_SHIFT 18
#define BUF_FLAG_MASK 0xFFC00000U
/* Get refcount and usagecount from buffer state */
#define BUF_STATE_GET_REFCOUNT(state) ((state) & BUF_REFCOUNT_MASK)
#define BUF_STATE_GET_USAGECOUNT(state) (((state) & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT)
The actual per-backend reference counter lives here, one entry per pinned buffer:
typedef struct PrivateRefCountEntry
{
Buffer buffer;
int32 refcount;
} PrivateRefCountEntry;
To find a given buffer's reference count quickly, a PrivateRefCountEntry array serves as the first-level cache and a hash table as the second-level cache.
Note: this cache is private to each backend process!
/*
* Backend-Private refcount management:
*
* Each buffer also has a private refcount that keeps track of the number of
* times the buffer is pinned in the current process. This is so that the
* shared refcount needs to be modified only once if a buffer is pinned more
* than once by an individual backend. It's also used to check that no buffers
* are still pinned at the end of transactions and when exiting.
【Private memory records how many times the current process has pinned each buffer】
【Purpose 1: if the current process pins a buffer many times, the refcount in shared memory only needs to be bumped once】
【Purpose 2: also used to check, at transaction end and at process exit, that no buffers remain pinned】
*
* To avoid - as we used to - requiring an array with NBuffers entries to keep
* track of local buffers, we use a small sequentially searched array
* (PrivateRefCountArray) and an overflow hash table (PrivateRefCountHash) to
* keep track of backend local pins.
*
【To avoid tracking local pins with a large NBuffers-element array, an 8-element array plus an overflow hash table is used to record them】
* Until no more than REFCOUNT_ARRAY_ENTRIES buffers are pinned at once, all
* refcounts are kept track of in the array; after that, new array entries
* displace old ones into the hash table. That way a frequently used entry
* can't get "stuck" in the hashtable while infrequent ones clog the array.
*
* Note that in most scenarios the number of pinned buffers will not exceed
* REFCOUNT_ARRAY_ENTRIES.
【As long as no more than 8 buffers are pinned at once, all refcounts are tracked in the array; beyond that, new array entries displace old ones into the hash table】
【This way frequently used entries stay in the array, while the less frequently used ones end up in the hash table】
*
*
* To enter a buffer into the refcount tracking mechanism first reserve a free
* entry using ReservePrivateRefCountEntry() and then later, if necessary,
* fill it with NewPrivateRefCountEntry(). That split lets us avoid doing
* memory allocations in NewPrivateRefCountEntry() which can be important
* because in some scenarios it's called with a spinlock held...
【To use this tracking mechanism, first reserve a free array slot with ReservePrivateRefCountEntry】
【Then fill that slot with NewPrivateRefCountEntry when it is actually needed】
【Why the split? So that NewPrivateRefCountEntry does not have to allocate memory, because it is sometimes called while a spinlock is held】
*/
// 【First-level cache】: 8 elements
static struct PrivateRefCountEntry PrivateRefCountArray[REFCOUNT_ARRAY_ENTRIES];
// round-robin counter used to pick a victim array slot
static uint32 PrivateRefCountClock = 0;
// points to the reserved free slot in the array
static PrivateRefCountEntry *ReservedRefCountEntry = NULL;
// 【Second-level cache】: key = buffer_id, value = PrivateRefCountEntry
static HTAB *PrivateRefCountHash = NULL;
// number of entries that have overflowed into the hash table
static int32 PrivateRefCountOverflowed = 0;
...
void
InitBufferPoolAccess(void)
{
HASHCTL hash_ctl;
memset(&PrivateRefCountArray, 0, sizeof(PrivateRefCountArray));
MemSet(&hash_ctl, 0, sizeof(hash_ctl));
hash_ctl.keysize = sizeof(int32);
hash_ctl.entrysize = sizeof(PrivateRefCountEntry);
PrivateRefCountHash = hash_create("PrivateRefCount", 100, &hash_ctl,
HASH_ELEM | HASH_BLOBS);
}
ReservePrivateRefCountEntry
(1) Reserve one entry in the first-level cache array; the reserved ReservedRefCountEntry starts out empty (buffer_id = 0, ref_count = 0).
(2) If the array is full, the entry at position PrivateRefCountClock is moved into the hash table, then cleared and reused.
(3) Note that this function returns nothing; it only maintains ReservedRefCountEntry so that the pointer ends up pointing at a free entry that can hold a local ref_count.
static void
ReservePrivateRefCountEntry(void)
{
/* Already reserved (or freed), nothing to do */
if (ReservedRefCountEntry != NULL)
return;
/*
* First search for a free entry the array, that'll be sufficient in the
* majority of cases.
*/
{
int i;
for (i = 0; i < REFCOUNT_ARRAY_ENTRIES; i++)
{
PrivateRefCountEntry *res;
res = &PrivateRefCountArray[i];
if (res->buffer == InvalidBuffer)
{
ReservedRefCountEntry = res;
return;
}
}
}
/*
* No luck. All array entries are full. Move one array entry into the hash
* table.
*/
{
/*
* Move entry from the current clock position in the array into the
* hashtable. Use that slot.
*/
PrivateRefCountEntry *hashent;
bool found;
/* select victim slot */
ReservedRefCountEntry =
&PrivateRefCountArray[PrivateRefCountClock++ % REFCOUNT_ARRAY_ENTRIES];
/* Better be used, otherwise we shouldn't get here. */
Assert(ReservedRefCountEntry->buffer != InvalidBuffer);
/* enter victim array entry into hashtable */
hashent = hash_search(PrivateRefCountHash,
(void *) &(ReservedRefCountEntry->buffer),
HASH_ENTER,
&found);
Assert(!found);
hashent->refcount = ReservedRefCountEntry->refcount;
/* clear the now free array slot */
ReservedRefCountEntry->buffer = InvalidBuffer;
ReservedRefCountEntry->refcount = 0;
PrivateRefCountOverflowed++;
}
}
NewPrivateRefCountEntry
(1) Fills in the buffer_id
(2) Returns a PrivateRefCountEntry with a fresh ref_count = 0
static PrivateRefCountEntry *
NewPrivateRefCountEntry(Buffer buffer)
{
PrivateRefCountEntry *res;
/* only allowed to be called when a reservation has been made */
Assert(ReservedRefCountEntry != NULL);
/* use up the reserved entry */
res = ReservedRefCountEntry;
ReservedRefCountEntry = NULL;
/* and fill it */
res->buffer = buffer;
res->refcount = 0;
return res;
}
GetPrivateRefCountEntry
(1) Look up the PrivateRefCountEntry for the given buffer_id
(2) If the buffer_id is already in the array, return a pointer to that array element (buffer_id, ref_count) directly
(3) If it is not in the array, look it up in the hash table; if it is not there either, return NULL
(4) If it is found in the hash table: when do_move == true, free an array slot and move the hash-table record into the array; otherwise return the found (buffer_id, ref_count) as is
static PrivateRefCountEntry *
GetPrivateRefCountEntry(Buffer buffer, bool do_move)
{
PrivateRefCountEntry *res;
int i;
Assert(BufferIsValid(buffer));
Assert(!BufferIsLocal(buffer));
/*
* First search for references in the array, that'll be sufficient in the
* majority of cases.
*/
for (i = 0; i < REFCOUNT_ARRAY_ENTRIES; i++)
{
res = &PrivateRefCountArray[i];
if (res->buffer == buffer)
return res;
}
/*
* By here we know that the buffer, if already pinned, isn't residing in
* the array.
*
* Only look up the buffer in the hashtable if we've previously overflowed
* into it.
*/
if (PrivateRefCountOverflowed == 0)
return NULL;
res = hash_search(PrivateRefCountHash,
(void *) &buffer,
HASH_FIND,
NULL);
if (res == NULL)
return NULL;
else if (!do_move)
{
/* caller doesn't want us to move the hash entry into the array */
return res;
}
else
{
/* move buffer from hashtable into the free array slot */
bool found;
PrivateRefCountEntry *free;
/* Ensure there's a free array slot */
ReservePrivateRefCountEntry();
/* Use up the reserved slot */
Assert(ReservedRefCountEntry != NULL);
free = ReservedRefCountEntry;
ReservedRefCountEntry = NULL;
Assert(free->buffer == InvalidBuffer);
/* and fill it */
free->buffer = buffer;
free->refcount = res->refcount;
/* delete from hashtable */
hash_search(PrivateRefCountHash,
(void *) &buffer,
HASH_REMOVE,
&found);
Assert(found);
Assert(PrivateRefCountOverflowed > 0);
PrivateRefCountOverflowed--;
return free;
}
}
5 SRC
ReadBufferExtended
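Before walking through the source, here is a minimal caller sketch (not from the original article); rel is assumed to be an already opened Relation and block 3 is an arbitrary example:
/* hypothetical caller: read a block, examine it under a shared content lock, unpin it */
static void
read_block_sketch(Relation rel)
{
    Buffer buf;
    Page   page;

    buf = ReadBufferExtended(rel, MAIN_FORKNUM, 3, RBM_NORMAL, NULL);
    LockBuffer(buf, BUFFER_LOCK_SHARE);     /* takes the descriptor's content_lock */
    page = BufferGetPage(buf);
    /* ... examine tuples on "page" with the Page APIs ... */
    (void) page;                            /* silence unused-variable warning in this sketch */
    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buf);                     /* drops the pin taken by ReadBufferExtended */
}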
/*
* ReadBufferExtended -- returns a buffer containing the requested
* block of the requested relation. If the blknum
* requested is P_NEW, extend the relation file and
* allocate a new block. (Caller is responsible for
* ensuring that only one backend tries to extend a
* relation at the same time!)
【Returns the requested PAGE; if blknum == P_NEW, the relation file is extended and a new block is allocated and read into memory】
*
* Returns: the buffer number for the buffer containing
* the block read. The returned buffer has been pinned.
* Does not return on error --- elog's instead.
*
【Returns the requested, ready-to-use page; note that the page is already pinned】
* Assume when this function is called, that reln has been opened already.
*
* In RBM_NORMAL mode, the page is read from disk, and the page header is
* validated. An error is thrown if the page header is not valid. (But
* note that an all-zero page is considered "valid"; see PageIsVerified().)
【RBM_NORMAL mode: the page is read from disk and the page header is validated】
* RBM_ZERO_ON_ERROR is like the normal mode, but if the page header is not
* valid, the page is zeroed instead of throwing an error. This is intended
* for non-critical data, where the caller is prepared to repair errors.
*
【RBM_ZERO_ON_ERROR mode: if page header validation fails, the page is zeroed instead of raising an error; intended for non-critical data】
* In RBM_ZERO_AND_LOCK mode, if the page isn't in buffer cache already, it's
* filled with zeros instead of reading it from disk. Useful when the caller
* is going to fill the page from scratch, since this saves I/O and avoids
* unnecessary failure if the page-on-disk has corrupt page headers.
* The page is returned locked to ensure that the caller has a chance to
* initialize the page before it's made visible to others.
* Caution: do not use this mode to read a page that is beyond the relation's
* current physical EOF; that is likely to cause problems in md.c when
* the page is modified and written out. P_NEW is OK, though.
【RBM_ZERO_AND_LOCK, a faster mode: if the page is not already in the buffer, it is not read from disk but simply filled with zeros. The page is returned locked so that nobody can read it before the caller initializes it】
* RBM_ZERO_AND_CLEANUP_LOCK is the same as RBM_ZERO_AND_LOCK, but acquires
* a cleanup-strength lock on the page.
*
* RBM_NORMAL_NO_LOG mode is treated the same as RBM_NORMAL here.
*
* If strategy is not NULL, a nondefault buffer access strategy is used.
* See buffer/README for details.
*/
Buffer
ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
ReadBufferMode mode, BufferAccessStrategy strategy)
{
bool hit;
Buffer buf;
/* Open it at the smgr level if not already done */
RelationOpenSmgr(reln);
This eventually uses mdopen to open the physical file and records the opened vfd in reln's md_seg_fds:
1. Take the recorded file from reln and return &reln->md_seg_fds[forknum][0]; if it is not there yet, the file has to be opened through the VFD layer
2. Open the file with PathNameOpenFile and record the VFD into md_seg_fds
/*
* Reject attempts to read non-local temporary relations; we would be
* likely to get wrong data since we have no visibility into the owning
* session's local buffers.
*/
if (RELATION_IS_OTHER_TEMP(reln))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary tables of other sessions")));
/*
* Read the buffer, and update pgstat counters to reflect a cache hit or
* miss.
*/
pgstat_count_buffer_read(reln);
buf = ReadBuffer_common(reln->rd_smgr, reln->rd_rel->relpersistence,
forkNum, blockNum, mode, strategy, &hit);
if (hit)
pgstat_count_buffer_hit(reln);
return buf;
}
ReadBuffer_common
/*
* ReadBuffer_common -- common logic for all ReadBuffer variants
*
* *hit is set to true if the request was satisfied from shared buffer cache.
*/
static Buffer
ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
BlockNumber blockNum, ReadBufferMode mode,
BufferAccessStrategy strategy, bool *hit)
{
...
...
bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
strategy, &found);
...
...
}
BufferAlloc
/*
* BufferAlloc -- subroutine for ReadBuffer. Handles lookup of a shared
* buffer. If no buffer exists already, selects a replacement
* victim and evicts the old page, but does NOT read in new page.
*
【Find a buffer for the page; if it is not already cached, a victim buffer is evicted】
* "strategy" can be a buffer replacement strategy object, or NULL for
* the default strategy. The selected buffer's usage_count is advanced when
* using the default strategy, but otherwise possibly not (see PinBuffer).
*
* The returned buffer is pinned and is already marked as holding the
* desired page. If it already did have the desired page, *foundPtr is
* set TRUE. Otherwise, *foundPtr is set FALSE and the buffer is marked
* as IO_IN_PROGRESS; ReadBuffer will now need to do I/O to fill it.
*
【The returned buffer is pinned and already marked as holding the desired page】
【If the page is already in the buffer pool, it is returned directly with *foundPtr = TRUE】
【Otherwise *foundPtr is set to false and the buffer is marked IO_IN_PROGRESS; the caller now has to do the I/O to fill it】
* *foundPtr is actually redundant with the buffer's BM_VALID flag, but
* we keep it for simplicity in ReadBuffer.
*
* No locks are held either at entry or exit.
*/
static BufferDesc *
BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
BlockNumber blockNum,
BufferAccessStrategy strategy,
bool *foundPtr)
{
BufferTag newTag; /* identity of requested block */
uint32 newHash; /* hash value for newTag */
LWLock *newPartitionLock; /* buffer partition lock for it */
BufferTag oldTag; /* previous identity of selected buffer */
uint32 oldHash; /* hash value for oldTag */
LWLock *oldPartitionLock; /* buffer partition lock for it */
uint32 oldFlags;
int buf_id;
BufferDesc *buf;
bool valid;
uint32 buf_state;
/* create a tag so we can lookup the buffer */
INIT_BUFFERTAG(newTag, smgr->smgr_rnode.node, forkNum, blockNum);
/* determine its hash code and partition lock ID */
newHash = BufTableHashCode(&newTag);
The hash value % NUM_BUFFER_PARTITIONS (128) selects the partition, whose lock is then taken from the MainLWLockArray:
#define BufTableHashPartition(hashcode) ((hashcode) % NUM_BUFFER_PARTITIONS)
#define BufMappingPartitionLock(hashcode) (&MainLWLockArray[BUFFER_MAPPING_LWLOCK_OFFSET + BufTableHashPartition(hashcode)].lock)
newPartitionLock = BufMappingPartitionLock(newHash);
/* see if the block is in the buffer pool already */
LWLockAcquire(newPartitionLock, LW_SHARED);
buf_id = BufTableLookup(&newTag, newHash);
【The hash table maps tag --> buf_id; finding an entry means the page is already in a buffer】
if (buf_id >= 0)
{
/*
* Found it. Now, pin the buffer so no one can steal it from the
* buffer pool, and check to see if the correct data has been loaded
* into the buffer.
*/
【Found it! Pin the buffer directly】
buf = GetBufferDescriptor(buf_id);
【This function is expanded below; it increments ref_count both in the shared-memory desc and in the local cache】
valid = PinBuffer(buf, strategy);
/* Can release the mapping lock as soon as we've pinned it */
【Once the buffer is pinned, the mapping lock can be released】
LWLockRelease(newPartitionLock);
*foundPtr = TRUE;
【Found in the hash table, but after pinning the page turned out to be invalid; it has to be read again】
if (!valid)
{
/*
* We can only get here if (a) someone else is still reading in
* the page, or (b) a previous read attempt failed. We have to
* wait for any active read attempt to finish, and then set up our
* own read attempt if the page is still not BM_VALID.
* StartBufferIO does it all.
*/
if (StartBufferIO(buf, true))
{
/*
* If we get here, previous attempts to read the buffer must
* have failed ... but we shall bravely try again.
*/
*foundPtr = FALSE;
}
}
return buf;
}
【Reaching here means the tag was not found in the hash table, i.e. the page is not cached】
/*
* Didn't find it in the buffer pool. We'll have to initialize a new
* buffer. Remember to unlock the mapping lock while doing the work.
*/
【A new buffer has to be set up; the mapping lock is released first while doing that work】
LWLockRelease(newPartitionLock);
/* Loop here in case we have to try another victim buffer */
for (;;)
{
/*
* Ensure, while the spinlock's not yet held, that there's a free
* refcount entry.
*/
【Reserve a (buffer_id, ref_count) slot in the private first-level cache array】
【If there is no room, one entry is swapped out into the second-level hash table to free up a slot】
ReservePrivateRefCountEntry();
/*
* Select a victim buffer. The buffer is returned with its header
* spinlock still held!
*/
【Pick a victim buffer and return its descriptor and buf_state; the freelist is tried first, otherwise clock sweep evicts one】
【This function is expanded further below】
buf = StrategyGetBuffer(strategy, &buf_state);
Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);
/* Must copy buffer flags while we still hold the spinlock */
oldFlags = buf_state & BUF_FLAG_MASK;
/* Pin the buffer and then release the buffer spinlock */
【Both the shared and the local ref counts are incremented】
PinBuffer_Locked(buf);
/*
* If the buffer was dirty, try to write it out. There is a race
* condition here, in that someone might dirty it after we released it
* above, or even while we are writing it out (since our share-lock
* won't prevent hint-bit updates). We will recheck the dirty bit
* after re-locking the buffer header.
*/
【The victim is dirty and needs to be written out】
if (oldFlags & BM_DIRTY)
{
/*
* We need a share-lock on the buffer contents to write it out
* (else we might write invalid data, eg because someone else is
* compacting the page contents while we write). We must use a
* conditional lock acquisition here to avoid deadlock. Even
* though the buffer was not pinned (and therefore surely not
* locked) when StrategyGetBuffer returned it, someone else could
* have pinned and exclusive-locked it by the time we get here. If
* we try to get the lock unconditionally, we'd block waiting for
* them; if they later block waiting for us, deadlock ensues.
* (This has been observed to happen when two backends are both
* trying to split btree index pages, and the second one just
* happens to be trying to split the page the first one got from
* StrategyGetBuffer.)
*/
if (LWLockConditionalAcquire(BufferDescriptorGetContentLock(buf),
LW_SHARED))
{
/*
* If using a nondefault strategy, and writing the buffer
* would require a WAL flush, let the strategy decide whether
* to go ahead and write/reuse the buffer or to choose another
* victim. We need lock to inspect the page LSN, so this
* can't be done inside StrategyGetBuffer.
*/
if (strategy != NULL)
{
XLogRecPtr lsn;
/* Read the LSN while holding buffer header lock */
buf_state = LockBufHdr(buf);
lsn = BufferGetLSN(buf);
UnlockBufHdr(buf, buf_state);
if (XLogNeedsFlush(lsn) &&
StrategyRejectBuffer(strategy, buf))
{
/* Drop lock/pin and loop around for another buffer */
LWLockRelease(BufferDescriptorGetContentLock(buf));
UnpinBuffer(buf, true);
continue;
}
}
/* OK, do the I/O */
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
smgr->smgr_rnode.node.relNode);
FlushBuffer(buf, NULL);
LWLockRelease(BufferDescriptorGetContentLock(buf));
ScheduleBufferTagForWriteback(&BackendWritebackContext,
&buf->tag);
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(forkNum, blockNum,
smgr->smgr_rnode.node.spcNode,
smgr->smgr_rnode.node.dbNode,
smgr->smgr_rnode.node.relNode);
}
else
{
/*
* Someone else has locked the buffer, so give it up and loop
* back to get another one.
*/
UnpinBuffer(buf, true);
continue;
}
}
【At this point the victim buffer either was not dirty or has just been flushed above】
【If the victim's old TAG is still valid (BM_TAG_VALID), its hash table entry will need to be replaced, so both the old and the new partitions must be locked】
/*
* To change the association of a valid buffer, we'll need to have
* exclusive lock on both the old and new mapping partitions.
*/
if (oldFlags & BM_TAG_VALID)
{
/*
* Need to compute the old tag's hashcode and partition lock ID.
* XXX is it worth storing the hashcode in BufferDesc so we need
* not recompute it here? Probably not.
*/
oldTag = buf->tag;
oldHash = BufTableHashCode(&oldTag);
oldPartitionLock = BufMappingPartitionLock(oldHash);
/*
* Must lock the lower-numbered partition first to avoid
* deadlocks.
*/
if (oldPartitionLock < newPartitionLock)
{
LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
}
else if (oldPartitionLock > newPartitionLock)
{
LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
}
else
{
/* only one partition, only one lock */
LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
}
}
else
【Otherwise the old TAG is invalid; there is nothing to clean up, just lock the new partition】
{
/* if it wasn't valid, we need only the new partition */
LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
/* remember we have no old-partition lock or tag */
oldPartitionLock = NULL;
/* this just keeps the compiler quiet about uninit variables */
oldHash = 0;
}
/*
* Try to make a hashtable entry for the buffer under its new tag.
* This could fail because while we were writing someone else
* allocated another buffer for the same block we want to read in.
* Note that we have not yet removed the hashtable entry for the old
* tag.
*/
【Insert the new TAG into the hash table】
buf_id = BufTableInsert(&newTag, newHash, buf->buf_id);
if (buf_id >= 0)
{
/*
* Got a collision. Someone has already done what we were about to
* do. We'll just handle this as if it were found in the buffer
* pool in the first place. First, give up the buffer we were
* planning to use.
*/
UnpinBuffer(buf, true);
/* Can give up that buffer's mapping partition lock now */
if (oldPartitionLock != NULL &&
oldPartitionLock != newPartitionLock)
LWLockRelease(oldPartitionLock);
/* remaining code should match code at top of routine */
buf = GetBufferDescriptor(buf_id);
valid = PinBuffer(buf, strategy);
/* Can release the mapping lock as soon as we've pinned it */
LWLockRelease(newPartitionLock);
*foundPtr = TRUE;
if (!valid)
{
/*
* We can only get here if (a) someone else is still reading
* in the page, or (b) a previous read attempt failed. We
* have to wait for any active read attempt to finish, and
* then set up our own read attempt if the page is still not
* BM_VALID. StartBufferIO does it all.
*/
if (StartBufferIO(buf, true))
{
/*
* If we get here, previous attempts to read the buffer
* must have failed ... but we shall bravely try again.
*/
*foundPtr = FALSE;
}
}
return buf;
}
/*
* Need to lock the buffer header too in order to change its tag.
*/
buf_state = LockBufHdr(buf);
/*
* Somebody could have pinned or re-dirtied the buffer while we were
* doing the I/O and making the new hashtable entry. If so, we can't
* recycle this buffer; we must undo everything we've done and start
* over with a new victim buffer.
*/
oldFlags = buf_state & BUF_FLAG_MASK;
if (BUF_STATE_GET_REFCOUNT(buf_state) == 1 && !(oldFlags & BM_DIRTY))
break;
UnlockBufHdr(buf, buf_state);
BufTableDelete(&newTag, newHash);
if (oldPartitionLock != NULL &&
oldPartitionLock != newPartitionLock)
LWLockRelease(oldPartitionLock);
LWLockRelease(newPartitionLock);
UnpinBuffer(buf, true);
}
/*
* Okay, it's finally safe to rename the buffer.
*
* Clearing BM_VALID here is necessary, clearing the dirtybits is just
* paranoia. We also reset the usage_count since any recency of use of
* the old content is no longer relevant. (The usage_count starts out at
* 1 so that the buffer can survive one clock-sweep pass.)
*
* Make sure BM_PERMANENT is set for buffers that must be written at every
* checkpoint. Unlogged buffers only need to be written at shutdown
* checkpoints, except for their "init" forks, which need to be treated
* just like permanent relations.
*/
buf->tag = newTag;
buf_state &= ~(BM_VALID | BM_DIRTY | BM_JUST_DIRTIED |
BM_CHECKPOINT_NEEDED | BM_IO_ERROR | BM_PERMANENT |
BUF_USAGECOUNT_MASK);
if (relpersistence == RELPERSISTENCE_PERMANENT || forkNum == INIT_FORKNUM)
buf_state |= BM_TAG_VALID | BM_PERMANENT | BUF_USAGECOUNT_ONE;
else
buf_state |= BM_TAG_VALID | BUF_USAGECOUNT_ONE;
UnlockBufHdr(buf, buf_state);
【If the old TAG was valid, its entry must be deleted from the hash table】
if (oldPartitionLock != NULL)
{
BufTableDelete(&oldTag, oldHash);
if (oldPartitionLock != newPartitionLock)
LWLockRelease(oldPartitionLock);
}
LWLockRelease(newPartitionLock);
/*
* Buffer contents are currently invalid. Try to get the io_in_progress
* lock. If StartBufferIO returns false, then someone else managed to
* read it before we did, so there's nothing left for BufferAlloc() to do.
*/
if (StartBufferIO(buf, true))
*foundPtr = FALSE;
else
*foundPtr = TRUE;
return buf;
}
PinBuffer
Summary:
- Use buf_id to check whether the buffer is already pinned in the local cache
- If it is already pinned, just increment the local ref_count
- If it is not pinned locally, update state in the shared-memory desc (ref_count++, usage_count++ up to a maximum of 5), then increment the local ref_count
- Note: a pinned page's data is not necessarily valid; the return value is:
(buf_state & BM_VALID) != 0
static bool
PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
{
Buffer b = BufferDescriptorGetBuffer(buf);
Here b is the desc's buf_id plus 1, because buf_id is counted from 0:
#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
bool result;
PrivateRefCountEntry *ref;
【Look up (buffer_id, ref_count) in the two-level cache of array + hash table】
ref = GetPrivateRefCountEntry(b, true);
if (ref == NULL)
{
uint32 buf_state;
uint32 old_buf_state;
【Reserve a slot in the array; if all 8 array entries are in use, one is kicked out into the hash table】
ReservePrivateRefCountEntry();
【Fill b into the reserved array slot】
ref = NewPrivateRefCountEntry(b);
【Update buf->state】
old_buf_state = pg_atomic_read_u32(&buf->state);
for (;;)
{
【If the header is locked, spin until it is unlocked and fetch the fresh state】
if (old_buf_state & BM_LOCKED)
old_buf_state = WaitBufHdrUnlocked(buf);
buf_state = old_buf_state;
/* increase refcount */
buf_state += BUF_REFCOUNT_ONE;
【strategy == NULL means the normal clock sweep eviction path; otherwise a ring buffer is used for bulk reads/writes】
if (strategy == NULL)
{
/* Default case: increase usagecount unless already max. */
if (BUF_STATE_GET_USAGECOUNT(buf_state) < BM_MAX_USAGE_COUNT)
buf_state += BUF_USAGECOUNT_ONE;
}
else
{
/*
* Ring buffers shouldn't evict others from pool. Thus we
* don't make usagecount more than 1.
*/
if (BUF_STATE_GET_USAGECOUNT(buf_state) == 0)
buf_state += BUF_USAGECOUNT_ONE;
}
if (pg_atomic_compare_exchange_u32(&buf->state, &old_buf_state,
buf_state))
{
result = (buf_state & BM_VALID) != 0;
break;
}
}
}
else
【If the buffer was already pinned, only the local refcount needs to be incremented; as far as the shared-memory desc is concerned, a process counts as one pin no matter how many times it pins】
{
/* If we previously pinned the buffer, it must surely be valid */
result = true;
}
ref->refcount++;
Assert(ref->refcount > 0);
ResourceOwnerRememberBuffer(CurrentResourceOwner, b);
return result;
}
StrategyGetBuffer
/*
* StrategyGetBuffer
*
* Called by the bufmgr to get the next candidate buffer to use in
* BufferAlloc(). The only hard requirement BufferAlloc() has is that
* the selected buffer must not currently be pinned by anyone.
*
* strategy is a BufferAccessStrategy object, or NULL for default strategy.
*
* To ensure that no one else can pin the buffer before we do, we must
* return the buffer with the buffer header spinlock still held.
*/
BufferDesc *
StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
{
BufferDesc *buf;
int bgwprocno;
int trycounter;
uint32 local_buf_state; /* to avoid repeated (de-)referencing */
/*
* If given a strategy object, see whether it can select a buffer. We
* assume strategy objects don't need buffer_strategy_lock.
*/
【If strategy is not NULL, the ring-buffer path is taken; the strategy structure itself carries the ring buffer】
if (strategy != NULL)
{
buf = GetBufferFromRing(strategy, buf_state);
if (buf != NULL)
return buf;
}
/*
* If asked, we need to waken the bgwriter. Since we don't want to rely on
* a spinlock for this we force a read from shared memory once, and then
* set the latch based on that value. We need to go through that length
* because otherwise bgprocno might be reset while/after we check because
* the compiler might just reread from memory.
*
* This can possibly set the latch of the wrong process if the bgwriter
* dies in the wrong moment. But since PGPROC->procLatch is never
* deallocated the worst consequence of that is that we set the latch of
* some arbitrary process.
*/
【StrategyControl is the core data structure of the clock sweep algorithm, introduced above】
// (gdb) p *StrategyControl
// $6 = {buffer_strategy_lock = 0 '\000', nextVictimBuffer = {value = 0}, firstFreeBuffer = 324, lastFreeBuffer = 16383, completePasses = 0, numBufferAllocs = {value = 0}, bgwprocno = 113}
bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
if (bgwprocno != -1)
{
/* reset bgwprocno first, before setting the latch */
StrategyControl->bgwprocno = -1;
/*
* Not acquiring ProcArrayLock here which is slightly icky. It's
* actually fine because procLatch isn't ever freed, so we just can
* potentially set the wrong process' (or no process') latch.
*/
SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
}
/*
* We count buffer allocation requests so that the bgwriter can estimate
* the rate of buffer consumption. Note that buffers recycled by a
* strategy object are intentionally not counted here.
*/
pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
/*
* First check, without acquiring the lock, whether there's buffers in the
* freelist. Since we otherwise don't require the spinlock in every
* StrategyGetBuffer() invocation, it'd be sad to acquire it here -
* uselessly in most cases. That obviously leaves a race where a buffer is
* put on the freelist but we don't see the store yet - but that's pretty
* harmless, it'll just get used during the next buffer acquisition.
*
* If there's buffers on the freelist, acquire the spinlock to pop one
* buffer of the freelist. Then check whether that buffer is usable and
* repeat if not.
*
* Note that the freeNext fields are considered to be protected by the
* buffer_strategy_lock not the individual buffer spinlocks, so it's OK to
* manipulate them without holding the spinlock.
*/
if (StrategyControl->firstFreeBuffer >= 0)
{
while (true)
{
/* Acquire the spinlock to remove element from the freelist */
SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
【A race scenario: firstFreeBuffer was >= 0 above, but the freelist may have been emptied before we acquired the lock】
if (StrategyControl->firstFreeBuffer < 0)
{
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
break;
}
buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
【Taking the first desc off the freelist: 1. advance the head pointer; 2. invalidate the removed desc's freeNext】
/* Unconditionally remove buffer from freelist */
StrategyControl->firstFreeBuffer = buf->freeNext;
buf->freeNext = FREENEXT_NOT_IN_LIST;
/*
* Release the lock so someone else can access the freelist while
* we check out this buffer.
*/
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
/*
* If the buffer is pinned or has a nonzero usage_count, we cannot
* use it; discard it and retry. (This can only happen if VACUUM
* put a valid buffer in the freelist and then someone else used
* it before we got to it. It's probably impossible altogether as
* of 8.3, but we'd better check anyway.)
*/
【Fetch state and lock the buffer header (BM_LOCKED)】
local_buf_state = LockBufHdr(buf);
【This check is unlikely to fail; it is essentially a sanity check】
【It can only fail if VACUUM put a valid buffer on the freelist and someone else grabbed and used it before we got to it】
if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
&& BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
{
if (strategy != NULL)
AddBufferToRing(strategy, buf);
*buf_state = local_buf_state;
return buf;
}
UnlockBufHdr(buf, local_buf_state);
}
}
【Reaching here means the freelist has nothing usable, so a buffer must be evicted】
/* Nothing on the freelist, so run the "clock sweep" algorithm */
trycounter = NBuffers;
for (;;)
{
【ClockSweepTick advances and returns StrategyControl->nextVictimBuffer】
【It is a fairly long function because it has to guarantee atomicity】
【The loop logic below is simple: walk over the buffers, skipping pinned ones】
【Return a buffer whose usage_count == 0; every buffer passed over has its usage_count decremented, so even if the first pass finds nothing, within a few passes one will】
【usage_count is capped at 5, which keeps the sweep bounded】
buf = GetBufferDescriptor(ClockSweepTick());
/*
* If the buffer is pinned or has a nonzero usage_count, we cannot use
* it; decrement the usage_count (unless pinned) and keep scanning.
*/
local_buf_state = LockBufHdr(buf);
if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
{
if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
{
local_buf_state -= BUF_USAGECOUNT_ONE;
trycounter = NBuffers;
}
else
{
/* Found a usable buffer */
if (strategy != NULL)
AddBufferToRing(strategy, buf);
*buf_state = local_buf_state;
return buf;
}
}
else if (--trycounter == 0)
{
/*
* We've scanned all the buffers without making any state changes,
* so all the buffers are pinned (or were when we looked at them).
* We could hope that someone will free one eventually, but it's
* probably better to fail than to risk getting stuck in an
* infinite loop.
*/
UnlockBufHdr(buf, local_buf_state);
elog(ERROR, "no unpinned buffers available");
}
UnlockBufHdr(buf, local_buf_state);
}