Prototyping in C
September 4th, 2022
tech
There are a few things about this situation that are a pretty good fit for C:
I'm prototyping in a data-heavy situation, iterating on the right way to solve various problems. Normally I would use Python for prototyping, and I am using it a bunch here. In many cases, however, I need to run my code quickly over a large amount of data and see how it works in practice, and my Python implementations have generally been far too slow.
I'm implementing algorithms that are computationally straightforward: a trie to count unique fixed-length substrings (code), or an approximation of that algorithm in which multiple processes write to a big block of shared memory and accept collisions (code). If I needed library support or had a large amount of tricky logic I'd use a different tool.
I'm processing very simple formats: essentially just long strings of [ACGT]*. This keeps the code rule of two compliant. I wouldn't want to use C for any tricky parsing unless it was sandboxed and I was very confident in the sandboxing setup.
I'm working on this set of problems mostly by myself, so I should choose the tooling where I'll be able to make progress most quickly. It doesn't matter much right now if my code is a bit weird. Once I get a better handle on how we're going to approach this problem computationally it will likely make sense to rewrite in a modern language, which will be both safer and more readable.
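To give a rough sense of how little code the first of those algorithms needs, here is a minimal sketch of a trie that counts unique fixed-length substrings. This is not the linked code: the names TrieNode, base_index, and count_unique_kmers are made up for illustration, and the sketch assumes the whole sequence fits in memory as one string and that any window containing a character other than A, C, G, or T should simply be skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One trie node: a child pointer per base (A, C, G, T). */
typedef struct TrieNode {
    struct TrieNode *child[4];
} TrieNode;

/* Map a base to 0..3, or -1 for anything else (e.g. N or a newline). */
static int base_index(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

/* Recursively release a trie; recursion depth is at most k. */
static void trie_free(TrieNode *node) {
    if (node == NULL) return;
    for (int i = 0; i < 4; i++) trie_free(node->child[i]);
    free(node);
}

/*
 * Count distinct length-k substrings of seq made up only of A/C/G/T.
 * Windows containing any other character are skipped.
 * Returns -1 if memory allocation fails.
 */
long count_unique_kmers(const char *seq, size_t k) {
    assert(k > 0);
    size_t n = strlen(seq);
    if (n < k) return 0;

    TrieNode *root = calloc(1, sizeof *root);
    if (root == NULL) return -1;

    long unique = 0;
    for (size_t start = 0; start + k <= n; start++) {
        TrieNode *node = root;
        int created = 0;  /* set if this window added a new node */
        size_t i;
        for (i = 0; i < k; i++) {
            int b = base_index(seq[start + i]);
            if (b < 0) break;  /* invalid character: skip this window */
            if (node->child[b] == NULL) {
                node->child[b] = calloc(1, sizeof *node);
                if (node->child[b] == NULL) { trie_free(root); return -1; }
                created = 1;
            }
            node = node->child[b];
        }
        /* A full-length walk that created a node means a new substring. */
        if (i == k && created) unique++;
    }
    trie_free(root);
    return unique;
}
```

For example, count_unique_kmers("ACGTACGT", 4) evaluates to 4: the windows are ACGT, CGTA, GTAC, TACG, and a second ACGT that follows an existing path. The sketch is quadratic in k per position and allocates a node at a time, which is fine for illustration but would likely be the first thing to change for large inputs.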
The biggest risk here is that prototype code will become production code, there will always be something more urgent than a rewrite, and at the core of our system we have a chunk of code in an unsafe language that's poorly documented and confusing to read. This is probably the strongest argument against starting anything in C, even for prototyping. While I can't completely commit to ensuring this doesn't happen, I'm going into this with my eyes open. And I will at least commit to seriously documenting anything that's becoming production code.
This is a bit of an unusual confluence of factors, and if you'd asked me a few years ago if I was ever going to write C professionally again, let alone choose to start something in C, I would have said no. Yet, in this case, I think it's the right call.
(Large parts of my rhythm stage setup, including both the MIDI routing and the whistle-controlled synthesizer, are also in C, though there the goal is minimizing latency rather than maximizing throughput. Since that's something silly I'm doing for fun I feel less weird about it.)