C memory model
Memory models in the C programming language are a way to specify assumptions that the compiler should make when generating code for segmented memory or paged memory platforms.
For example, on the 16-bit x86 platform, six memory models exist. They control what assumptions are made regarding the segment registers, and the default size of pointers.
Memory segmentation
Four registers are used to refer to four segments on the 16-bit x86 segmented memory architecture: DS (data segment), CS (code segment), SS (stack segment), and ES (extra segment). A logical address on this platform is written segment:offset, in hexadecimal. In real mode, the physical address of a byte of memory is calculated by shifting the contents of the appropriate segment register left by 4 bits and then adding the offset.
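For example, the logical address 7522:F139 yields the 20-bit physical address 84359 (all values hexadecimal):
  75220   (segment 7522 shifted left by 4 bits)
+ 0F139   (offset)
= 84359   (physical address)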
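The shift-and-add rule can be sketched in portable C. The function name phys_addr is chosen here for illustration and is not part of any compiler's library; the sketch also does not model the wraparound at 1 megabyte that occurs on processors with a 20-bit address bus.

```c
#include <stdint.h>

/* Real-mode address translation: physical = (segment << 4) + offset.
   phys_addr is an illustrative name, not a library function.
   The result is not masked to 20 bits. */
uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Calling phys_addr(0x7522, 0xF139) reproduces the example address 0x84359.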
Note that this process leads to aliasing of memory, such that any given physical address may have multiple logical representations. This makes comparison of pointers difficult.
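As a minimal sketch of why aliasing complicates comparison, the following C code contrasts a field-by-field comparison of two logical addresses with a comparison of the physical addresses they resolve to. The type logical_addr and the function names are hypothetical and serve only this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of a real-mode segment:offset pair. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} logical_addr;

/* Physical address as computed in real mode. */
uint32_t to_physical(logical_addr a)
{
    return ((uint32_t)a.segment << 4) + a.offset;
}

/* Naive comparison: true only if both fields match exactly. */
bool same_representation(logical_addr a, logical_addr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* Comparison that accounts for aliasing: true if both refer to the same byte. */
bool same_byte(logical_addr a, logical_addr b)
{
    return to_physical(a) == to_physical(b);
}
```

For instance, 7522:F139 and 8435:0009 both resolve to physical address 84359, so same_byte reports them as equal while same_representation does not.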
In protected mode, the GDT and LDT are used for this purpose.
Pointer sizes
Pointers can be near, far, or huge. Near pointers refer to the current segment, so neither DS nor CS needs to be modified to dereference them. They are the fastest pointers, but they can address only 64 kilobytes of memory (the current segment).
Far pointers contain the new value of DS or CS within them. To use them, the register must be changed, the memory dereferenced, and then the register restored. They may reference up to 1 megabyte of memory. Note that pointer arithmetic (addition and subtraction), if done directly, never modifies the segment portion of the pointer, only its offset. If the new offset would exceed 0xFFFF or fall below 0, the segment portion still does not change; instead the offset wraps around modulo 64K (for example, it rolls over to zero after exceeding 0xFFFF). A far pointer is therefore still tied to a single 64K region; the only difference from a near pointer is that this region can lie outside the current CS or DS.
For example, given the declaration
char far *myfarptr = (char far *) 0x50000000L;
no pointer arithmetic on myfarptr can reach memory at or beyond 0x6000:0000. Consider a loop that tries to fill 128K of memory:
unsigned long counter;
for (counter = 0; counter < 131072UL; counter++)  /* intended: access 128K of memory */
    *(myfarptr + counter) = 7;                    /* write 7 into each byte */
When counter reaches 0x10000 (64K), the offset wraps around to zero because myfarptr is a far pointer, so the write lands at 0x5000:0000 again. The second half of the loop overwrites the same 64K region instead of advancing to 0x6000:0000.
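The wraparound can be simulated on a modern compiler, which has no far keyword, by modelling a far pointer as an explicit segment:offset pair. The far_ptr type and far_add function below are hypothetical stand-ins for what a 16-bit compiler does implicitly; this is a sketch of the behaviour described above, not the code any particular compiler generates.

```c
#include <stdint.h>

/* Hypothetical model of a far pointer: a segment plus a 16-bit offset. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr;

/* Far pointer addition: only the offset changes, and it wraps modulo 64K.
   The segment is never adjusted. */
far_ptr far_add(far_ptr p, uint32_t n)
{
    p.offset = (uint16_t)(p.offset + n);
    return p;
}
```

Starting from 5000:0000, adding 0xFFFF gives 5000:FFFF, while adding 0x10000 gives 5000:0000 again rather than 6000:0000.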
Huge pointers are essentially far pointers, but they are normalized every time they are modified so that they have the highest possible segment for that address (leaving an offset smaller than 16). This is slow in real mode, and both harder and slower in protected mode, but it allows the pointer to span multiple segments and allows accurate pointer comparisons, as if the platform had a flat memory model.
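Normalization can be sketched as follows, assuming the real-mode convention in which a normalized pointer keeps its offset in the range 0 to 15, so that every physical address has exactly one representation. The huge_ptr type and both function names are hypothetical, and the sketch assumes addresses below 1 megabyte.

```c
#include <stdint.h>

/* Hypothetical model of a huge pointer in real mode. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} huge_ptr;

/* Build the normalized form of a linear address below 1 MB:
   as much of the address as possible goes into the segment,
   leaving an offset of 0..15. */
huge_ptr huge_normalize(uint32_t linear)
{
    huge_ptr p;
    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xFu);
    return p;
}

/* Huge pointer addition: compute the linear address, then renormalize,
   so the segment advances when the offset would overflow. */
huge_ptr huge_add(huge_ptr p, uint32_t n)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset + n;
    return huge_normalize(linear);
}
```

Unlike a far pointer, adding 0x10000 to 5000:0000 here yields 6000:0000, and two normalized pointers to the same byte always compare equal field by field. The extra shift, add, and mask on every modification are the cost the text refers to.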
Memory models
The memory models are:
Model | Data pointers | Code pointers |
Small | near | near |
Medium | near | far |
Compact | far | near |
Large | far | far |
Huge | huge | huge |
Tiny* | near | near |
* In the Tiny model, all four segment registers point to the same segment. In all models with near data pointers, SS equals DS.
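The table can be read as a mapping from memory model to default pointer sizes. The sketch below encodes it in C under the assumption that a near pointer is a 2-byte offset and a far or huge pointer is a 4-byte segment:offset pair, as is usual on 16-bit x86. The enum, struct, and function names are hypothetical; real compilers typically select the model with a command-line option rather than at run time.

```c
#include <stddef.h>

/* Hypothetical identifiers for the six 16-bit x86 memory models. */
enum memory_model {
    MODEL_TINY, MODEL_SMALL, MODEL_MEDIUM,
    MODEL_COMPACT, MODEL_LARGE, MODEL_HUGE
};

/* Default pointer sizes in bytes for one model. */
struct pointer_sizes {
    size_t data_bytes;
    size_t code_bytes;
};

/* Assumption: near = 2 bytes (offset only),
   far and huge = 4 bytes (segment + offset). */
struct pointer_sizes default_pointer_sizes(enum memory_model m)
{
    static const struct pointer_sizes table[] = {
        [MODEL_TINY]    = {2, 2},  /* near data, near code */
        [MODEL_SMALL]   = {2, 2},  /* near data, near code */
        [MODEL_MEDIUM]  = {2, 4},  /* near data, far code  */
        [MODEL_COMPACT] = {4, 2},  /* far data,  near code */
        [MODEL_LARGE]   = {4, 4},  /* far data,  far code  */
        [MODEL_HUGE]    = {4, 4}   /* huge data, huge code */
    };
    return table[m];
}
```

Tiny and Small share the same pointer sizes; they differ only in how the segment registers are set up, which this table does not capture.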
Protected mode note: In protected mode a segment cannot be both writable and executable. Therefore, when implementing the Small and Tiny memory models, the code segment register must point to the same physical address and have the same limit as the data segment register. This defeats one of the features of the 80286, which ensures that data segments are never executable and code segments are never writable (meaning that self-modifying code is never allowed). On the 80386, however, with its flat memory model, it is possible to protect individual memory pages against writing.
Memory models are not limited to 16-bit programs. Segmentation can also be used in 32-bit protected mode (resulting in 48-bit pointers), and there are C compilers that support this. However, segmentation in 32-bit mode does not give access to a larger address space than a single segment covers, unless some segments are not always present in memory and the linear address space is used merely as a cache over a larger segmented virtual space. Its main benefit is finer-grained protection of individual objects (areas up to 1 megabyte long can have 1-byte protection granularity, compared with the coarse 4 KiB granularity offered by paging alone), so it is used only in specialized applications such as telecommunications software. Technically, the "flat" 32-bit address space is a "tiny" memory model for the segmented address space.
References
- Turbo C++ Version 3.0 User's Guide. Borland International, Copyright 1992.