GSoC 2014 Update : Caching Plugin for Monkey server – Week 2

This is the update on the work I did during my second week of the GSoC internship.

What I did this week :

  1. Spent a lot of time figuring out how the mmap() system call works. I tried out sample programs that map files into memory and read them back from memory.
  2. Once I was reasonably clear on how mmap works, I tried to use it in my hash table. To do so, I first decoupled the hash table and min heap implementations that I had written.
  3. I then mmapped each file as it was inserted into the hash table, and stored the address of the mapping alongside the file name in the table. I also implemented file lookup: if the file is found in the hash table, I read it from the memory location stored beside its name (that is, the location it was mmapped to when it was first inserted).
  4. After testing the mmapping with the hash table structure, I integrated the hash table and min heap again.
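
The mapping step in points 1 to 3 can be sketched roughly as below. This is only an illustration and not the plugin's actual code: the struct cache_entry and the functions cache_entry_map and cache_entry_unmap are hypothetical names, and the sketch assumes each file is mapped read-only in one piece.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical cache entry: the value stored in the hash table per file. */
struct cache_entry {
    const char *name;   /* file name, used as the hash key */
    void       *addr;   /* start of the mapping returned by mmap() */
    size_t      length; /* mapped length in bytes */
};

/* Map `path` read-only and fill in `entry`. Returns 0 on success, -1 on error. */
int cache_entry_map(struct cache_entry *entry, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);          /* mmap() rejects a zero-length mapping */
        return -1;
    }

    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close() */
    if (addr == MAP_FAILED)
        return -1;

    entry->name = path;
    entry->addr = addr;
    entry->length = (size_t)st.st_size;
    return 0;
}

/* Release the mapping held by `entry`. */
void cache_entry_unmap(struct cache_entry *entry)
{
    munmap(entry->addr, entry->length);
    entry->addr = NULL;
    entry->length = 0;
}
```

On a lookup hit, the file contents can then be read directly from entry->addr for entry->length bytes, without opening the file again.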

Problem faced :

Initially, I faced a problem while reading a file from a particular memory location using mmap, but I solved it.

Yet another issue that still exists :

How to persist the hash table for future use ?

I looked into how I can make the hash table persistent, so that the next time I run the program, the table still contains the previously inserted files and their memory locations. I came up with two rough ideas (I do not know if they’re right) :
  1. Write the hash table to a file, so that it can be loaded later.
  2. Use persistent data structures : I read about them at [2] and [3]
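
A rough sketch of the first idea is given below. One caveat it works under: an address returned by mmap is only meaningful inside the process that created the mapping, so the sketch assumes that only the file names are saved and that each file is mmapped again after loading. The functions index_save and index_load, the constant MAX_NAME and the one-name-per-line format are all made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 256  /* assumed upper bound on a stored file name */

/* Write each cached file name on its own line. Returns 0 on success, -1 on error. */
int index_save(const char *index_path, const char *const names[], size_t count)
{
    FILE *out = fopen(index_path, "w");
    if (out == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        fprintf(out, "%s\n", names[i]);
    return fclose(out) == 0 ? 0 : -1;
}

/* Read the names back into `names`. Returns how many were read, or -1 on error. */
int index_load(const char *index_path, char names[][MAX_NAME], size_t max_count)
{
    FILE *in = fopen(index_path, "r");
    if (in == NULL)
        return -1;
    size_t n = 0;
    while (n < max_count && fgets(names[n], MAX_NAME, in) != NULL) {
        names[n][strcspn(names[n], "\n")] = '\0';  /* strip the trailing newline */
        n++;
    }
    fclose(in);
    return (int)n;
}
```

After index_load, the program would insert each name into a fresh hash table and map the file again, so the table ends up with the same files but with new memory locations.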

Roadmap :

  1. The aim now is to complete hash table persistence.
  2. Write test cases for the hash table.

My learnings :

Some things that I learnt while working on this are roughly drafted below :
  • ‘mmap’ing a file means mapping it into the address space of the process: a range of virtual addresses is reserved for the file, and its contents are read from disk only when those addresses are first accessed. The file is then accessed through the pointer that mmap returns.
  • Each process has its own view of virtual memory. It is given the impression that it works with a huge range of contiguous memory locations, while in reality this is not true. Only those parts of memory that the process actually touches are loaded into RAM, as and when they are needed.
  • Advantage of mmap – It is extremely useful when multiple processes access the same file from memory. Changes made to the mapping by one process can be kept private to that process or shared with other processes, depending on the flag passed to mmap: MAP_PRIVATE or MAP_SHARED.
  • Usage of fprintf (FILE *stream, const char *format, …) and write (int fd, const void *buf, size_t count). Unlike fprintf, write takes a raw buffer and a byte count rather than a format string. fprintf () works with the standard streams stdin, stdout and stderr, declared in the standard library header stdio.h, while write () works with the standard file descriptors STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO, defined in the header unistd.h.
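  • Read about and used fstat – a system call that gets the status of an open file on Linux, given its file descriptor. Below is the basic usage of fstat to get the file size.
#include <sys/stat.h>        /* struct stat, fstat() */

struct stat fs;
fstat(fd, &fs);              /* fstat takes a pointer to the struct; returns -1 on error */
off_t length = fs.st_size;   /* st_size is an off_t holding the size in bytes */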
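
To make the point about mmap flags above more concrete, here is a small sketch that writes one byte through a mapping and then checks what ended up in the file itself. The function name first_byte_after_write is made up for this example, and it assumes the file already exists, is writable and is at least one byte long.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Overwrite the first byte of `path` through a mapping created with `flags`
 * (MAP_SHARED or MAP_PRIVATE), then read that byte back from the file itself.
 * Returns the byte now stored in the file, or -1 on error.
 */
int first_byte_after_write(const char *path, int flags, char new_byte)
{
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    char *map = mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    map[0] = new_byte;       /* modify the mapped byte */
    msync(map, 1, MS_SYNC);  /* write shared changes back to the file;
                                a private mapping has nothing to write back */
    munmap(map, 1);

    char on_disk;
    ssize_t n = pread(fd, &on_disk, 1, 0);  /* read via the descriptor, not the mapping */
    close(fd);
    return n == 1 ? (unsigned char)on_disk : -1;
}
```

With MAP_PRIVATE the write lands in a private copy of the page, so the file keeps its old first byte. With MAP_SHARED the new byte reaches the file and is visible to any other process that maps or reads it.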