Tag Archives: mmap

GSoC 2014 Update : Caching Plugin for Monkey server – Week 2

This is the update on the work I did during the second week of my GSoC internship.

What I did this week :

  1. I invested a lot of time in figuring out how the system call mmap () works. I tried out sample programs that map files into memory and read them back from memory.
  2. Once I was quite clear about how mmap works, I tried to use it in my hash table. To do that, I first decoupled the hash table and min heap implementations that I had written.
  3. I then added mapping of the files that get inserted into the hash table, storing the memory location of each mapping along with the name of the file in the hash table. I also implemented the lookup of a file : if the file is found in the hash table, I read it from the memory location stored beside its name (that is, the location to which it was mmapped when it was first inserted).
  4. After testing ‘mmap’ing with the hash table structure, I integrated the hash table and min heap again.
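
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the plugin's actual code: the names cache_entry, cache_insert, cache_lookup and NBUCKETS are hypothetical, and the chained table with a djb2-style hash is an assumption made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define NBUCKETS 64              /* hypothetical table size */

/* One cached file: its name plus where it was mapped (hypothetical layout). */
struct cache_entry {
    char name[256];
    void *addr;                  /* pointer returned by mmap */
    size_t len;                  /* length of the mapping in bytes */
    struct cache_entry *next;    /* chaining for collisions */
};

static struct cache_entry *buckets[NBUCKETS];

/* djb2-style string hash, reduced to a bucket index. */
static unsigned hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return (unsigned)(h % NBUCKETS);
}

/* Look a file up by name; returns NULL on a miss. */
struct cache_entry *cache_lookup(const char *name)
{
    assert(name != NULL);
    for (struct cache_entry *e = buckets[hash_name(name)]; e; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Map the file read-only and record the mapping; returns NULL on failure. */
struct cache_entry *cache_insert(const char *name)
{
    assert(name != NULL);
    struct stat st;
    int fd = open(name, O_RDONLY);
    if (fd == -1)
        return NULL;
    /* mmap rejects a zero length, so empty files are not cached here. */
    if (fstat(fd, &st) == -1 || st.st_size == 0 || strlen(name) >= 256) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;
    void *addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return NULL;
    struct cache_entry *e = malloc(sizeof *e);
    if (e == NULL) {
        munmap(addr, len);
        return NULL;
    }
    strcpy(e->name, name);
    e->addr = addr;
    e->len = len;
    unsigned b = hash_name(name);
    e->next = buckets[b];        /* push onto the front of the chain */
    buckets[b] = e;
    return e;
}
```

On a hit, the file contents can be read straight from e->addr for e->len bytes, for example with write (STDOUT_FILENO, e->addr, e->len).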

Problem faced :

Initially, I had trouble reading a file from a particular memory location using mmap, but I solved it.

Yet another issue that still exists :

How to persist the hash table for future use ?

I searched for ways to make the hash table persistent, so that the next time I run the program, the table still contains the previously inserted files and their memory locations. I came up with two rough ideas (I do not know if they’re right) :
  1. Write the hash table to a file so that it can be loaded for later use.
  2. Use persistent data structures : I read about them at [2] and [3]
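
A rough sketch of the first idea, under some assumptions : the helper names save_names and load_names and the one-name-per-line file format are hypothetical, and only the file names are saved. An address returned by mmap is only meaningful inside the process that created the mapping, so after loading, each file would have to be mapped again.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 256         /* hypothetical limit on a stored name */

/* Write one file name per line; returns 0 on success, -1 on error. */
int save_names(const char *path, const char **names, size_t n)
{
    assert(path != NULL);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (fprintf(f, "%s\n", names[i]) < 0) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the names back; returns how many were loaded, or -1 on error. */
long load_names(const char *path, char out[][NAME_MAX_LEN], size_t max)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t n = 0;
    while (n < max && fgets(out[n], NAME_MAX_LEN, f) != NULL) {
        out[n][strcspn(out[n], "\n")] = '\0';   /* strip the newline */
        n++;
    }
    fclose(f);
    return (long)n;
}
```

At start-up, the program could call load_names and insert each name into a fresh hash table, which maps the files again.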

Roadmap :

  1. The aim is to now complete hash table persistence.
  2. Write test cases for the hash table.

My learnings :

Some things that I learnt while working on this are roughly drafted below :
  • ‘mmap’ing a file maps it into the virtual address space of the calling process; the contents are not copied up front. The call reserves a range of addresses, and the file is accessed through the pointer that mmap returns.
  • Each process has its own view of the virtual memory. It is given the impression that it works with a huge range of contiguous memory locations, while in reality this is not true. Only those parts of the memory that the process actually touches are loaded into the RAM, on demand.
  • Advantage of mmap – It is extremely useful when multiple processes access the same file from memory. Changes made to the file by one process can be kept private to that process or shared with other processes, depending on the flag passed to mmap : MAP_PRIVATE or MAP_SHARED.
  • Usage of fprintf (FILE *stream, const char *format, …) and write (int fd, const void *buf, size_t count). The standard file pointers stdin, stdout and stderr, used with stream functions such as fprintf (), are declared in the standard library header stdio.h. The standard file descriptors STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO, used with write (), are defined in unistd.h.
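  • Read about and used fstat, a system call that gets the status of a file on Linux. Below is the basic usage of fstat to get the file size.
struct stat fs;              /* declared in <sys/stat.h> */
fstat(fd, &fs);              /* takes a pointer to the struct; returns -1 on error */
off_t length = fs.st_size;   /* file size in bytes */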
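
To make the shared versus private behaviour of mmap concrete, here is a small sketch. The helper names poke_first_byte and peek_first_byte are hypothetical. It uses fstat to get the file size, writes one byte through the mapping, and then reads the file back with a plain read ().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `path` with the given flag (MAP_SHARED or MAP_PRIVATE), overwrite
 * its first byte through the mapping, then unmap. Returns 0 on success. */
int poke_first_byte(const char *path, int flags, char c)
{
    assert(flags == MAP_SHARED || flags == MAP_PRIVATE);
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    struct stat fs;
    if (fstat(fd, &fs) == -1 || fs.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)fs.st_size;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;
    p[0] = c;                        /* write through the mapping */
    if (flags == MAP_SHARED)
        msync(p, len, MS_SYNC);      /* flush the change to the file */
    return munmap(p, len);
}

/* Read the first byte of the file with read(); returns -1 on error. */
int peek_first_byte(const char *path)
{
    char c;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t n = read(fd, &c, 1);
    close(fd);
    return n == 1 ? (unsigned char)c : -1;
}
```

With MAP_PRIVATE the write lands in a private copy of the page, so the file on disk keeps its old first byte. With MAP_SHARED the change is carried through to the file and becomes visible to other processes that map or read it.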

GSoC 2014 Update : Caching Plugin for Monkey server – Week 1

This is my first week’s update on my GSoC Project – Developing a caching plugin for Monkey server.

During the community bonding period, I had started coding some of the basic data structures that the caching plugin would require, such as a hash table for looking up a resource and a min heap for deleting the file that has been used the least number of times. Since these structures are independent of the rest of the code, I decided to get started with them. I had done a very simple implementation of each.
Work done this week :
These are the functions that I designed and coded :
  1. Hash table : 
    1. create_ht
    2. insert_ht
    3. lookup
    4. delete_ht
  2. Min heap : 
    1. insert
    2. pop
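
As an illustration of the min heap side, here is a minimal sketch of insert and pop keyed on a use count. It is not the code from the repository : the names heap_item, min_heap, heap_insert and heap_pop, the fixed capacity and the array layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CAP 128                 /* hypothetical fixed capacity */

struct heap_item {
    unsigned hits;                   /* how many times the file was used */
    const char *name;                /* file name (not owned by the heap) */
};

struct min_heap {
    struct heap_item items[HEAP_CAP];
    size_t size;
};

static void swap_items(struct heap_item *a, struct heap_item *b)
{
    struct heap_item t = *a;
    *a = *b;
    *b = t;
}

/* Append the item, then sift it up until its parent is not larger. */
void heap_insert(struct min_heap *h, unsigned hits, const char *name)
{
    assert(h->size < HEAP_CAP);
    size_t i = h->size++;
    h->items[i].hits = hits;
    h->items[i].name = name;
    while (i > 0 && h->items[(i - 1) / 2].hits > h->items[i].hits) {
        swap_items(&h->items[(i - 1) / 2], &h->items[i]);
        i = (i - 1) / 2;
    }
}

/* Remove and return the least-used item, then sift the new root down. */
struct heap_item heap_pop(struct min_heap *h)
{
    assert(h->size > 0);
    struct heap_item top = h->items[0];
    h->items[0] = h->items[--h->size];
    size_t i = 0;
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->size && h->items[l].hits < h->items[m].hits)
            m = l;
        if (r < h->size && h->items[r].hits < h->items[m].hits)
            m = r;
        if (m == i)
            break;
        swap_items(&h->items[i], &h->items[m]);
        i = m;
    }
    return top;
}
```

The item returned by heap_pop is the candidate for eviction, since it has the smallest use count in the heap.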
During the first week, I improved my code and also added a makefile for easy compilation by referring to [3].
I then concentrated on reading up more about Linux system calls that would enable copying files into memory for the cache. My mentor, Eduardo, suggested that I try using mmap () for this purpose. mmap is a Linux system call that maps files into the virtual memory of the process that calls it. You can read more about it at [1] and [2]. I tried out mapping files using the mmap function.
Problem faced :  
I had some problems with the makefile and also with the delete function of the hash table. However, I fixed both of these problems. You can view my code at [4].
Work for next week : 
I figured out that I would also need to copy the hash table onto the memory, so I am going to work on that next week. I will also improve my code further.
Links :