Written 23rd June 2020.
I think there’s a knack to exploring source code, one that I’ve been trying to develop. Recently, I’ve been ramping up on a new project at work, and as an apprentice, I’ve only been through this process once before, when I first joined Google. Now, I’m armed with the benefit of experience.
One of the best skills I’ve learned this year is exploring source code. It’s a necessity, since software codebases are so huge, as outlined in this paper. Everything I learned about programming until recently was through personal projects, but I never needed to learn how to read code effectively because the design of my programs (high-level as well as all the details) fit snugly inside my head. On large projects, this is almost impossible, especially if you want to get any useful work done.
The first thing I did was create a blank document, which I planned to fill with everything I was about to learn: links to source code, design docs, and my own notes. It would serve as a brain cache to stop me from getting lost or overwhelmed.
Early on, I learned where the boundaries of the project lay, i.e. what functionality we were responsible for and what we weren’t. This helped because I took a top-down approach: I started with the documentation we provide to other teams, tunneled inwards to the design docs, and finished with the source code. At that level, it’s easy to get lost. Luckily, my document works as a map, anchoring me back to the high-level problem we’re trying to solve.
For me, learning and understanding something are highly personal activities, so having a “map” has been highly effective.
One of the first things I did was read the header files, which I see as the ideal place to tie the problem (what are we trying to solve) together with the solution (how did we do it). In some respects, a header file, especially with a public API, is the interface between these two, and when necessary, it’s easy to see the underlying implementation, read associated docs, and look at the history of the file.
The same goes for service definitions (such as Swagger or gRPC), method signatures, interfaces, etc.
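To make that concrete, here is a small made-up sketch (not from my project; `WordCounter` and the file names are hypothetical). The top half is what a public header might look like: the comments state the problem and the contract. The bottom half is the implementation you would only drop into when you need the details.

```cpp
// ---- word_counter.h (hypothetical public header) ----
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Tracks how often each word has been seen.
class WordCounter {
 public:
  // Records one occurrence of `word`. Precondition: `word` is non-empty.
  void Add(const std::string& word);

  // Returns how many times `word` has been added, or 0 if never seen.
  std::size_t Count(const std::string& word) const;

 private:
  std::map<std::string, std::size_t> counts_;
};

// ---- word_counter.cc (hypothetical implementation) ----
void WordCounter::Add(const std::string& word) {
  assert(!word.empty());  // enforce the contract documented in the header
  ++counts_[word];        // operator[] starts a missing entry at 0
}

std::size_t WordCounter::Count(const std::string& word) const {
  auto it = counts_.find(word);
  return it == counts_.end() ? 0 : it->second;
}
```

Reading only the class declaration already answers most of the "what" questions, and each comment hints at where to look in the implementation for the "how".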
One of my metrics for success through this process was that I should be able to ask useful questions, as it meant knowing where the gaps in my knowledge lay, and trying to fill them. I also have a rule: never ask the same question twice.
Whenever something cropped up, I made sure to document it so others could benefit.