How to rewrite "malloc" from scratch ? (with C)
Learn to recode memory allocation functions such as "malloc", "free", "realloc", "memcpy", ... without the standard library
The purpose of this post is to help you make these functions, but the work remains to you. However, I will give you pseudo-codes or code parts! Then, you will be able to make the functions yourself.
A first malloc function
This is, if I may say so, a "dumb" memory allocation function. We will not keep it, but it is a good start:
void* my_malloc(size_t size) {
void* allocated = sbrk(0);
void* request = sbrk(size);
if (request == (void*) -1) {
return nullptr;
}
return allocated;
}
Here is an explanation:
void*: It is for pointers with no assigned type, it means you can have a pointer pointing on any object, no matter its type.size_t: You have to create it since we are not using the standard library. It is equal to thelong unsigned inttype.nullptr: You also have to create it since we are not using the standard library. It is defined by:#define nullptr 0
Here is one more, the sbrk() function. According to its documentation :
brk()andsbrk()change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment). Increasing the program break has the effect of allocating memory to the process; decreasing the break deallocates memory. Source : manpages.ubuntu.com
We are also checking if the request was accepted and then we are returning a pointer on the allocated block.
Do you see the problem here? When we allocate an object, we want to free it after. But how can we know which memory block was allocated to free it?
Let's solve this problem.
Memory blocks
For each allocation, we will save the information of that block into a structure called BlockMeta : the meta-information of a memory block. Indeed, the allocated memory size for each allocation will be increased, but no worries.
Create your own from this scheme :
BlockMeta
- size // how much space you are using in memory for this block
- next // the following block
- is_free ? // maybe you already freed it
Help with type
I have chosen to create a type for this structure : typedef s_block_meta* BlockMeta where s_block_meta is the structure
Before updating the malloc function using the BlockMeta structure, we have to create two other functions to request memory space :
BlockMeta mem_find_free_block(BlockMeta* last, size_t size);
BlockMeta mem_request_space(BlockMeta last, size_t size);
We also need to create a global variable called BLOCK_BASE and set as nullptr (a pointer pointing on a NULL value) for now. Then, we will be able to know if it is the first time we allocate memory when we want to do it. We will see this further when writing the mem_alloc() function.
Find a free block in the memory
Here we will create this function:
BlockMeta mem_find_free_block(BlockMeta* last, size_t size);
Firstly, it creates a BlockMeta current variable and initialises it to BLOCK_BASE. In a loop, the value of last is updated to the current variable, and the current variable is updated to the current->next value.
That loop will stop when the next block is free and if the size is correct for the requested allocation. When it finds a free block of the same size or bigger than the requested size, the loop breaks and returns the current block, which is that free block.
Request space in memory
Here we will create this function:
BlockMeta mem_request_space(BlockMeta last, size_t size);
Thanks to the sbrk() function, the function requests a space of the requested size in the memory to the operating system.
Sometimes, sbrk() fails to allocate and returns a null pointer, so we have to check the returned value.
Don't forget we actually don't allocate the requested size, but the size of a
BlockMetaplus the requested size.Consequently, a call to
sbrk()does not look like this:void* request = sbrk(size);but this:
void* request = sbrk(size + sizeof(BlockMeta));
We will also update the last->next value to the value of the created block.
Here is how to initialise the created block:
BlockMeta created_block = sbrk(0);
created_block->size = size;
created_block->next = nullptr;
created_block->is_free = false;
The value of created_block is returned by the function.
Allocate!
The memory allocation function simply looks like this:
void* mem_alloc(size_t size);
At first, we have to check if the requested size is not null. Otherwise, a null pointer is returned.
You can define a macro for requesting space in memory. Then there will be no need to rewrite the same code each time we want to do it.
#define REQUEST_SPACE(last) \
created_block = mem_request_space(last, size); \
if (!created_block) { return nullptr; } // system failure
As you can see, it calls our previous function
mem_request_space()
On the first allocation (BLOCK_BASE is null), we don't have to find a free block in memory, we can directly request space:
BlockMeta created_block;
if (!BLOCK_BASE) {
REQUEST_SPACE(nullptr);
BLOCK_BASE = created_block;
return created_block + 1;
}
Here we update the BLOCK_BASE value for the next allocations to know they are not the first allocation. You obviously know it, but our code does not.
However, when it's not the first allocation, we have to create BlockMeta last (we often saw it in the previous function signatures). It is initialised as last = BLOC_BASE.
Now, we have to search for a free block in memory, using the mem_find_free_block() function. The returned block is the created_block.
Sometimes, no free block is found. We have to request space directly:
created_block = mem_find_free_block(&last, size)
if !created_block {
request space
return created_block + 1
}
Otherwise, we set the created_block as non-free and we return created_block + 1.
Free me that memory!
The memory-freeing function is simple:
void mem_free(void* pointer);
When the pointer is null, it returns directly. Otherwise, it has to retrieve to block corresponding to the pointer and set it as free.
But how do we retrieve the block corresponding to a pointer?
Retrieve blocks from pointers
Here is the code to retrieve the block corresponding to a pointer:
BlockMeta mem_block_from_pointer(void* pointer) {
return (BlockMeta) pointer - 1;
}
Allocate, again!
The reallocate function looks like a mix of the free function and the allocate function:
void* mem_realloc(void* pointer, size_t size);
Indeed, it retrieves the block corresponding to the pointer. If there is enough size in the retrieved block, the same pointer is returned, there is no need to reallocate. It is done as follows:
BlockMeta block = block_from_pointer(pointer)
// Enough space, it simply returns the same memory block
if block->size >= size {
return pointer
}
But when there is not enough space, we have to reallocate. We only have to call our mem_alloc() with the new size (don't forget to check if the returned pointer is null). Then, we copy the content of the current block to the new block before calling mem_free() on the pointer. The returned value is the obtained pointer from the previous call to mem_alloc().
Wait, we have no function to do a memory copy.
Utility functions on memory
Here are three useful functions on memory:
void mem_copy(void* dest, const void* source, size_t size);
bool mem_comp(const void* first, const void* second, size_t size);
void mem_set(void* dest, int fill_value, size_t size);
The first one copies the content of the
sourceblock to thedestination block.The second one compares the content of two blocks.
The third one fills a
destination block with an integer value.
Copy, copy, copy, copy...
Copying is easy, we browse the content of a block and set each value to the other.
for i = 0; i < size; i++ {
dest[i] = source[i]
}
But it is not easy to do with C because we cannot update the pointed value of a void pointer. To access the ith element of a void pointer, we have to convert it to an int pointer as follows:
void* x = ...
// This: `x[i] = ...` cannot be done!
((int*) x)[i] = ...
So, here is the assignment we do in the loop:
((int*) dest)[i] = ((int*) source)[i];
Compare two blocks
It is simple as copying memory but instead of assignment, we do a comparison. When the two values are not equal, the function returns false.
Fill a memory block with a value
It is simple as copying memory but instead of assigning the ith value of a source block, we assign each value to the given fill_value.
Conclusion
Congratulations, you have created a tiny library to allocate, reallocate and free blocks in memory but you also learnt to do some optimisation by reusing the old blocks instead of keeping them needlessly.
Here is the repository of my "memlib" project: https://github.com/antoninhrlt/memlib. Take a look if you have doubts or if you want to compare.
Do not hesitate to comment if you need more help.