Shared Persistent Heap Data Environment Manual
1.1.0
SPHDE is composed of two major software layers: The Shared Address Space (SAS) layer provides the basic services for a shared address space and transparent, persistent storage. The Shared Persistent Heap (SPH) layer organizes blocks of SAS storage into useful functions for storing and retrieving data.
The I/O subsystems, memory subsystems, and supporting APIs of current operating systems were designed for an environment of scarcity (limited real memory and virtual address spaces). Modern applications have to deal with increasingly complex data structures whose data needs to be shared among process instances and persist to storage while still being constrained by these APIs and subsystems.
Scarcity forces trade-offs between simplicity and efficiency and complicates programming tasks. This is especially true if these data structures contain internal references (pointers). Traditional file systems and relational databases simply don't handle internal reference persistence well. Data persistence to storage in these systems is not transparent and requires additional layers of software to capture the relationships represented by these references. This adds even more overhead and complexity to programming tasks.
The availability of 64-bit commodity processors, cheap high-density DRAM, and commodity operating systems effectively eliminates the original scarcity. The standard POSIX shared memory APIs (e.g., shmat, shmdt, shmctl, et al.) enable the exploitation of abundant memory, but are still not simple to use. A (relatively thin) API layer is needed to manage shared memory access across a large shared address space in a simple and coherent way.
Creating such an API is the goal of this project. The primary function is to manage backing files and memory map them into the application. For Linux, this allows data to be shared directly in the real pages of the kernel's file cache. Since the files are always mapped at the same virtual address, internal C pointers can be maintained for both inter-process sharing and transparent persistent storage. This easily supports zero-copy sharing and operate-in-place persistence.
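The mechanism described above can be sketched with plain POSIX calls. This is not the SPHDE implementation or its API; demo_node and demo_map_file are hypothetical names used only for illustration. The sketch maps a backing file with MAP_SHARED at a requested address and rejects the mapping if the kernel placed it anywhere else, since pointers stored inside the file are only valid at that one address.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record type: it holds an ordinary C pointer to another
 * record that lives in the same mapped file. */
typedef struct demo_node {
    int value;
    struct demo_node *next;
} demo_node;

/* Map `size` bytes of the file at `path` at the address `base`.
 * `base` is only a hint to mmap(), so the result is checked and the
 * call returns NULL if the kernel placed the mapping elsewhere.
 * With base == NULL the kernel picks the address. */
void *demo_map_file(const char *path, void *base, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(base, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    if (base != NULL && p != base) {
        /* Stored pointers would be wrong at this address. */
        munmap(p, size);
        return NULL;
    }
    return p;
}
```

A process that stores demo_node records in the mapping, unmaps it, and later maps the same file at the same address can follow the stored next pointers directly, with no serialization or pointer fix-up step.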
The SAS layer manages a region of process address space to provide:
Some additional SAS definitions:
For now, the size and virtual address range of the region is fixed for each platform (somewhere beyond TASK_UNMAPPED_BASE and below the main stack). Blocks are allocated in power-of-2 size and alignment from within the region.
Backing storage (files) is allocated in power-of-2 sizes called segments. Segments must be smaller in size than the region and usually larger than blocks. Segments don't necessarily have anything to do with any notion of hardware segmentation, but it may be useful if the size/alignment of SAS segments matches the underlying hardware.
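The power-of-2 arithmetic behind blocks and segments can be sketched as follows. The helper names are hypothetical and are not part of the SAS API, and the sketch assumes that a block or segment is aligned to its own size.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a request up to the next power of two.
 * Returns 0 for a zero-sized request or if the result would overflow. */
size_t demo_round_up_pow2(size_t n)
{
    if (n == 0)
        return 0;
    size_t p = 1;
    while (p < n) {
        if (p > SIZE_MAX / 2)
            return 0;
        p <<= 1;
    }
    return p;
}

/* Nonzero if addr is a multiple of pow2_size (which must be a power of two). */
int demo_is_aligned(uintptr_t addr, size_t pow2_size)
{
    return (addr & ((uintptr_t)pow2_size - 1)) == 0;
}

/* Start address of the size-aligned, power-of-two segment containing addr. */
uintptr_t demo_segment_base(uintptr_t addr, size_t segment_size)
{
    return addr & ~((uintptr_t)segment_size - 1);
}
```

With power-of-2 sizes and matching alignment, finding the segment that backs a given address is a single mask operation, so no lookup table is needed for that step.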
A simple example of using SPHDE:
By default the region's path name is ".", the current directory. The region name can be overridden by setting the "SASSTOREPATH" environment variable or by calling the SASJoinRegionByName() function. The region name is the path to a directory where the SAS/SPH backing files will be created.
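The default and the environment override can be pictured with a small helper. demo_region_path is a hypothetical function, not part of SPHDE, and treating an empty SASSTOREPATH the same as an unset one is an assumption of this sketch.

```c
#include <stdlib.h>

/* Return the directory that would hold the region's backing files:
 * the value of SASSTOREPATH if it is set and non-empty (the non-empty
 * check is an assumption of this sketch), otherwise ".". */
const char *demo_region_path(void)
{
    const char *path = getenv("SASSTOREPATH");
    if (path != NULL && path[0] != '\0')
        return path;
    return ".";
}
```

Two processes started with the same SASSTOREPATH value would resolve to the same directory and therefore the same region.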
Processes that join with the same region name will share the region and all of its (allocated) storage. All processes that share a region can allocate, deallocate, reference and update blocks (normal C pointer semantics) within the region.
Processes that use a different region name are independent from any other region and only see the region that they have joined. There is no limit to the number of independent regions per process other than those imposed by the file system. Normal file access rules apply. Any process that does not have read/write authority to the region's directory or files cannot join the region or access the data.
Once a block is allocated it is backed with file storage, and implicitly mmapped for transparent storage persistence and sharing between processes. The block will always be mapped at the same virtual address each time it is loaded and for each sharing process. This allows complex pointer-based data structures to be stored, persisted, and shared without additional effort (the pointers are context independent).
A lock manager is provided so utilities and cooperating processes can synchronize their activities. Locks are "keyed" by virtual addresses, which are normally the address of some interesting shared data structure or "utility" object.
Currently the lock manager supports (shared) SasUserLock__READ and (exclusive) SasUserLock__WRITE locks. The intent is to add other lock types as needed. For example, the index utility could use an "INTENT" lock (which allows multiple READ locks but is exclusive with other INTENT and WRITE locks) which is upgradeable to a WRITE lock.
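The compatibility rules described above, including the proposed INTENT mode, can be written down as a small predicate. The enum and function names are hypothetical and are not the lock manager's API; the INTENT row follows the description in the text, not an existing implementation.

```c
/* Hypothetical lock modes mirroring the text: READ is shared,
 * WRITE is exclusive, INTENT is the proposed upgradeable mode. */
typedef enum {
    DEMO_LOCK_READ,
    DEMO_LOCK_INTENT,
    DEMO_LOCK_WRITE
} demo_lock_mode;

/* Nonzero if a lock in mode `requested` can be granted on an address
 * while another holder already has it locked in mode `held`. */
int demo_lock_compatible(demo_lock_mode held, demo_lock_mode requested)
{
    /* WRITE is exclusive with everything. */
    if (held == DEMO_LOCK_WRITE || requested == DEMO_LOCK_WRITE)
        return 0;
    /* Only one INTENT holder at a time. */
    if (held == DEMO_LOCK_INTENT && requested == DEMO_LOCK_INTENT)
        return 0;
    /* READ with READ, and READ with INTENT, can coexist. */
    return 1;
}
```

Because at most one INTENT holder can exist at a time, an upgrade from INTENT to WRITE only has to wait for current readers to finish and never competes with another upgrader.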
A simple pointer is sufficient to anchor a linked list or quadtree etc., but is not very user friendly. SPHDE provides a number of utility objects. Utility objects use blocks of storage to provide higher level functions. An application could create a SASStringBTree_t or SPHContext_t and store its pointer in the finder as in the following example:
Utility objects like Context, Index, and StringBTree can be used to create directories or index large arrays of data structures for search. The Context is a combination of a StringBTree and an Index. It allows any object (address) to have a symbolic name (or names). It also provides the reverse mapping, from address to name(s).
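The two-way mapping that a Context provides can be illustrated with a deliberately simple stand-in. This sketch uses a small fixed-size table with linear search, not SPHDE's StringBTree and Index, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16
#define DEMO_MAX_NAME 32

typedef struct {
    char name[DEMO_MAX_NAME];
    void *addr;
} demo_entry;

typedef struct {
    demo_entry entries[DEMO_MAX_ENTRIES];
    size_t count;
} demo_context;

/* Bind a symbolic name to an address. One address may be bound to
 * several names. Returns 0 on success, -1 if the table is full or
 * the name is too long. */
int demo_context_add(demo_context *c, const char *name, void *addr)
{
    if (c->count == DEMO_MAX_ENTRIES || strlen(name) >= DEMO_MAX_NAME)
        return -1;
    strcpy(c->entries[c->count].name, name);
    c->entries[c->count].addr = addr;
    c->count++;
    return 0;
}

/* Forward mapping: name to address, or NULL if the name is unknown. */
void *demo_context_find_by_name(const demo_context *c, const char *name)
{
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->entries[i].name, name) == 0)
            return c->entries[i].addr;
    return NULL;
}

/* Reverse mapping: address to the first name bound to it, or NULL. */
const char *demo_context_find_by_addr(const demo_context *c, const void *addr)
{
    for (size_t i = 0; i < c->count; i++)
        if (c->entries[i].addr == addr)
            return c->entries[i].name;
    return NULL;
}
```

A tree-based index makes both lookups scale to large directories; the linear table here only shows the shape of the interface.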