I’ve been wanting to get back to reading papers and open source code for some time now. As part of a workshop conducted by a friend, I’ve set a goal of grokking 12 pieces over the next 12 months.
A piece meets the criteria if
- It excites me (or)
- It’s around a problem I’m interested in (or)
- It has some side benefit (career or otherwise)
My goal for this week is to finalise at least 4 pieces from my bookmarks and notes.
Vatsal has created a small accountability group to ratchet up the pressure :D
And quoting his presentation, “If nothing works, you can just blackmail people to make them accountable.” :P
For the first piece, I’ve decided to start with a Python library that I find myself using often: cachetools. I had a note that read “TTL Cache Python (timed dictionary)”, reminding me to check out the implementation.
It’s a nifty library that provides caching functionality out of the box. It’s similar to Caffeine or Guava in the Java world.
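To show what I mean by a “timed dictionary”, here’s a minimal sketch of how I typically use it, based on the documented TTLCache and cached APIs; the sizes, TTL values, and function are just placeholders of my own:

```python
from cachetools import TTLCache, cached

# A dict-like cache: entries are evicted once the cache exceeds
# maxsize items or once an entry is older than ttl seconds.
cache = TTLCache(maxsize=128, ttl=300)
cache["user:42"] = {"name": "Ada"}
print(cache.get("user:42"))  # present until the 300s TTL lapses

# The same idea as a decorator: results are cached per argument
# combination for up to ttl seconds.
@cached(cache=TTLCache(maxsize=128, ttl=300))
def fetch_profile(user_id):
    ...  # pretend this is an expensive lookup
```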
I generally start off by understanding the major data structures (thanks, Antirez) and going through the unit tests. However, since this codebase is really small, I’ve decided instead to find answers to the following questions that were on my mind:
- How does it maintain global state for every permutation/combination of configuration?
- How does it handle concurrency, or does it just rely on the GIL?
- TTL: does it run a janitor in the background, or is expiry handled on demand?
- What sort of optimisation does it use to quickly delete expired items?
- How are equivalent deep/pass-by-reference objects handled when creating the hashkey? (See the sketch after this list.)
- What reusability patterns does it use to handle different eviction policies?
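To make the hashkey question concrete: cachetools exposes a documented cachetools.keys.hashkey helper that packs the call arguments into a tuple-like key. The sketch below (example values are mine) shows how equal-by-value arguments end up with equal keys, which is the behaviour I want to trace through in the source:

```python
from cachetools.keys import hashkey

# hashkey packs the call arguments into a tuple-like key, so two
# distinct-but-equal objects produce equal keys (value equality,
# not object identity).
a = tuple(range(1000))
b = tuple(range(1000))
assert a is not b
assert hashkey(a, page=1) == hashkey(b, page=1)
assert hash(hashkey(a, page=1)) == hash(hashkey(b, page=1))
```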