Hello Reader!
So here we are at part 2 of last week's post. On the GitHub site, the Week 5 code contains both parts. I figured it would be easier to have the entire piece of code than to try to separate out the second part.
This part of the code deals with the Cache Directory Table. The cache directories hold the temporary files for Internet Explorer, and come in groups of four. Normally there are only four of these directories, but there can be more. The names of these directories are randomly generated.
First, at offset 72 of the index.dat file, we have the number of cache directories. So the first thing my code does is pull that 4-byte value and use it as the count of directories I'll need to parse out.
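If you're following along without the Week 5 code, here is a minimal sketch of that read. It assumes the value is stored as a little-endian unsigned 32-bit integer and uses Python's struct module. The name read_cache_dir_count and the fake header are made up for illustration; this is not the ie_ind_four_byte helper from the repo.

```python
import struct

CACHE_DIR_COUNT_OFFSET = 72  # offset of the directory count, per the post


def read_cache_dir_count(header):
    """Return the 4-byte cache directory count stored at offset 72.

    Assumption: the value is a little-endian unsigned 32-bit integer ("<I").
    """
    (count,) = struct.unpack_from("<I", header, CACHE_DIR_COUNT_OFFSET)
    return count


# Fake header for demonstration only: 72 filler bytes, then a count of 4.
fake_header = b"\x00" * 72 + struct.pack("<I", 4)
print(read_cache_dir_count(fake_header))
```

With the fake header above, this prints 4.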
```python
num_cache_dir_entries = ie_ind_four_byte(ie_index_header[72:76])
num_cache_dir_parse = num_cache_dir_entries
start = 76
end = 88
```

Each cache directory entry is 12 bytes long, so I initialize a start point of 76 and an end point of 88. This handles the first directory entry. After that, all I need to do is copy the end value into the start value, and then create a new end value by adding 12 to the new start value. Then just repeat until the number of cache directories left to parse is zero.
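To make the offset arithmetic concrete, here is a small sketch that lists the slice boundaries for each entry. The function name cache_dir_slices is hypothetical; the 76-byte starting offset and 12-byte entry size come straight from the description above.

```python
CACHE_DIR_TABLE_OFFSET = 76  # first entry starts right after the 4-byte count
CACHE_DIR_ENTRY_SIZE = 12    # bytes per cache directory entry


def cache_dir_slices(num_entries):
    """Return a (start, end) pair for each cache directory entry."""
    slices = []
    for i in range(num_entries):
        start = CACHE_DIR_TABLE_OFFSET + i * CACHE_DIR_ENTRY_SIZE
        slices.append((start, start + CACHE_DIR_ENTRY_SIZE))
    return slices


print(cache_dir_slices(4))
```

For the usual four directories this gives (76, 88), (88, 100), (100, 112) and (112, 124), which is the same walk as moving end into start and adding 12 each time.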
```python
dict_cache_dir_entry = {}  # initialize a dictionary to put the cached directory name and no of cached files
while num_cache_dir_parse > 0:
    cache_dir_entry = ie_ind_cache_dir_entry(ie_index_header[start:end])
    dict_cache_dir_entry[cache_dir_entry[1]] = cache_dir_entry[0]
    # increment our variables
    start = end  # Pass the previous ending point to the start
    end = end + 12  # Now add 12 to the previous ending point to get the new one
    num_cache_dir_parse -= 1
```

I also create a dictionary called dict_cache_dir_entry that I use to store the cache directory name and the number of cached files in that directory. The directory name is the key, and the number of cached files is the value. The trick is that in the index.dat file the number of cached files comes first and the directory name second, so I have to swap the order when I fill in the dictionary.
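Since the ie_ind_cache_dir_entry helper isn't shown here, this is a rough sketch of what unpacking one entry and doing the swap might look like. It assumes each 12-byte entry is a 4-byte little-endian file count followed by an 8-byte ASCII directory name. The function names and the sample directory names are made up for illustration and are not taken from the Week 5 code.

```python
import struct


def parse_cache_dir_entry(entry):
    """Split one 12-byte entry into (file_count, directory_name).

    Assumption: 4-byte little-endian count, then an 8-byte ASCII name.
    """
    file_count, raw_name = struct.unpack("<I8s", entry)
    return file_count, raw_name.decode("ascii", errors="replace")


def build_cache_dir_dict(header, num_entries, table_offset=76, entry_size=12):
    """Map each directory name to its number of cached files."""
    dirs = {}
    for i in range(num_entries):
        start = table_offset + i * entry_size
        file_count, name = parse_cache_dir_entry(header[start:start + entry_size])
        # The file stores the count first and the name second,
        # so swap them here: name becomes the key, count the value.
        dirs[name] = file_count
    return dirs


# Made-up sample data: 76 filler bytes followed by two fake entries.
fake_table = struct.pack("<I8s", 12, b"ABCD1234") + struct.pack("<I8s", 7, b"WXYZ5678")
fake_header = b"\x00" * 76 + fake_table
print(build_cache_dir_dict(fake_header, 2))
```

With the sample data above, the result maps ABCD1234 to 12 and WXYZ5678 to 7.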
Finally I print out the data along with the other information.
That’s all for this week!
