Guidance for developing in software engineering
Status: Notes
Confidence: Likely
There are skills used throughout engineering (and if you’re in college right now they may not be the ones you expect). Of the curriculum structures I’m aware of, the one best grounded in evidence is CDIO (Conceive-Design-Implement-Operate). CDIO articulates the goal of engineering education as:
Graduating engineers should be able to conceive-design-implement-operate complex value-added engineering systems in a modern team-based environment.
It’s worth reading their materials, particularly on Personal and Professional Skills and Interpersonal Skills. These turn out to be the most heavily used and, in many cases, least taught skills for engineers.
Within software engineering, I suggest the following categories for reflection. I see many young programmers who focus almost entirely on the first (“practicing leetcode”), which is unfortunate.
Writing programs: Fluently producing code to solve specific, well specified problems.
- Can represent data in terms of sequences, maps, and sets.
- Can access data in sequences, maps, and sets.
- Can implement basic algorithms in terms of loops and conditionals over sequences, maps, and sets, such as folding, searching, iterating over distinct pairs, filtering, and mapping.
- Represent algorithms in functional and logic programming.
- More complicated data structures and algorithms.
- Effective use of generics, higher order functions, and duck typing.
- Metaprogramming, code generation, compilers
- …
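To make the early items in this list concrete, here is a minimal sketch of folding, searching, iterating over distinct pairs, filtering, and mapping, written with plain loops and conditionals. The function names and sample data are my own illustrations, not a canonical set of exercises.

```python
# Sample data in the three basic shapes: a sequence, a map, and a set.
scores = [72, 88, 95, 61]
ages = {"ada": 36, "linus": 28}
tags = {"python", "go"}


def fold_sum(values):
    """Fold: combine every element into a single accumulated value."""
    total = 0
    for v in values:
        total += v
    return total


def find_first(values, predicate):
    """Search: return the first element satisfying predicate, or None."""
    for v in values:
        if predicate(v):
            return v
    return None


def distinct_pairs(values):
    """Iterate over each unordered pair of positions exactly once."""
    pairs = []
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            pairs.append((values[i], values[j]))
    return pairs


def keep_if(values, predicate):
    """Filter: keep only the elements satisfying predicate."""
    kept = []
    for v in values:
        if predicate(v):
            kept.append(v)
    return kept


def apply_to_each(values, fn):
    """Map: build a new sequence by applying fn to each element."""
    return [fn(v) for v in values]


# Access works the same way across the three shapes.
first_score = scores[0]          # sequence: by position
ada_age = ages["ada"]            # map: by key
knows_go = "go" in tags          # set: by membership
```

Being able to write each of these without looking anything up is roughly what I mean by fluency at this level.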
Mechanical sympathy: Intuitive understanding of how programs map onto the hardware, and how to adjust programs to use the hardware well.
- Mental model of statement after statement mutating locations in memory, and of functions evaluating and returning values which are in turn used in further evaluation.
- Primitive types and that they map onto binary patterns.
- Disk, RAM, CPU with registers.
- Allocation and garbage collection.
- Memory layout of data structures.
- L1/L2 caches, cache lines, cache locality, page faults.
- …
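As a small illustration of primitive types mapping onto binary patterns, the sketch below uses Python's standard `struct` module to show the bytes behind an integer and a float. The helper names are hypothetical, and I picked little-endian byte order explicitly so the result does not depend on the machine.

```python
import struct


def int32_bytes(n):
    """Return the 4 bytes of a signed 32-bit integer, little-endian."""
    return struct.pack("<i", n)


def float32_bits(x):
    """Return the 32-bit IEEE 754 pattern of x as a string of 0s and 1s."""
    # Pack as a float, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return format(bits, "032b")


def packed_size(fmt):
    """Size in bytes of a record laid out according to a struct format."""
    return struct.calcsize(fmt)


one = int32_bytes(1)             # low byte first
minus_one = int32_bytes(-1)      # two's complement: all bits set
one_point_zero = float32_bits(1.0)

# A 1-byte field followed by a 4-byte field, with no padding ("<").
# Native layout ("@") may insert padding for alignment, which is one way
# memory layout ends up differing from the sum of the field sizes.
tight = packed_size("<bi")
```

The float pattern splits into 1 sign bit, 8 exponent bits, and 23 fraction bits, which is a useful thing to be able to read off by eye.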
Architecture: Structure of programs to combine well defined pieces to a larger end, avert issues, make future modifications straightforward, and simplify debugging.
- Organize into a set of functions that call one another.
- Organize functions so that data flows smoothly among them.
- Insert bottlenecks and constraints to control behavior.
- Unify partial models to produce an architecture.
Isolation, coordination, and communication: How to separate different parts of a computation into pieces to make each piece easier to reason about or to place that piece in a different computational environment, how to coordinate those pieces and communicate among them. Includes processes and threads, concurrency, networking, and distributed systems.
- Single threaded programs, running in isolation as processes on a machine.
- HTTP and basic network requests over sockets.
- Pipes and FIFOs.
- Forking processes and threads.
- Basic exclusion (non-atomic operations, locks, synchronized)
- Consensus and leader election
- …
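For the basic exclusion item, here is a minimal sketch of protecting a non-atomic read-modify-write with a lock, using Python's standard `threading` module. The `Counter` class and `run_workers` function are names I made up for the example.

```python
import threading


class Counter:
    """A counter whose increment is safe to call from several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "value += 1" is a read, an add, and a write. Holding the lock
        # keeps another thread from interleaving between those steps.
        with self._lock:
            self.value += 1


def run_workers(n_threads=4, n_increments=10_000):
    """Start n_threads workers that each increment a shared counter."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


final_count = run_workers()
```

With the lock in place the final count is always the number of threads times the number of increments. Removing the lock makes lost updates possible, although whether you actually observe them depends on the interpreter and timing.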
Data persistence: Maintaining data over time without corruption, accessing it effectively, complying with data controls, and preventing data loss.
- Read and write whole files.
- Basic SQL. Simple input parsing. Binary files with fixed offsets, serialization formats, encodings including Unicode.
- SQL, normalization, key-value stores, file locking, ordering, choosing appropriate storage engines and data models. Data migration.
- Backups and restoring backups.
- File semantics are hard. Tailing files.
- Anonymizing data, securing data, governing regulations like GDPR and HIPAA.
- …
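A minimal sketch of the basic SQL, normalization, and backup items, using the `sqlite3` module from Python's standard library. The schema, table names, and sample rows are invented for illustration, and an in-memory database stands in for a real file.

```python
import sqlite3


def build_db():
    """Create a small normalized schema: each book refers to one author."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE books ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " author_id INTEGER NOT NULL REFERENCES authors(id))"
    )
    # The connection as a context manager commits on success and rolls
    # back if an exception is raised inside the block.
    with conn:
        conn.execute("INSERT INTO authors (id, name) VALUES (?, ?)", (1, "Ada"))
        conn.executemany(
            "INSERT INTO books (title, author_id) VALUES (?, ?)",
            [("Notes", 1), ("Sketches", 1)],
        )
    return conn


def titles_by(conn, author_name):
    """Return titles by the named author, using a parameterized query."""
    rows = conn.execute(
        "SELECT books.title FROM books"
        " JOIN authors ON authors.id = books.author_id"
        " WHERE authors.name = ? ORDER BY books.title",
        (author_name,),
    )
    return [title for (title,) in rows]


def backup_copy(conn):
    """Copy the whole database into a second connection."""
    copy = sqlite3.connect(":memory:")
    conn.backup(copy)
    return copy


db = build_db()
restored = backup_copy(db)
```

A backup only counts once you have read the data back out of it, which is why the copy is queried the same way as the original.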
Quality assurance and operations: Demonstrating that programs work as intended and keeping them working under real-world conditions.
- Manual testing of simple, deterministic programs.
- Basic unit test suites for happy path and common errors.
- Logging output with print statements.
- Structured logging.
- Deriving test plans systematically.
- Property based testing and fuzz testing.
- Configuration management.
- Automatic deployment
- Monitoring, log aggregation.
- …
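To show the idea behind property based testing without pulling in a library, here is a hand-rolled sketch using only the standard `random` module. The run-length encoder is a toy subject I chose for the example. Real tools for this do considerably more, such as shrinking failing inputs to a minimal case.

```python
import random


def rle_encode(text):
    """Run-length encode a string into a list of (character, count) runs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Invert rle_encode."""
    return "".join(ch * count for ch, count in runs)


def check_roundtrip(trials=500, seed=0):
    """Property: decoding an encoding returns the original string.

    Instead of a few hand-picked cases, generate many random inputs
    and assert that the property holds for every one of them.
    """
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        assert rle_decode(rle_encode(text)) == text, repr(text)
    return trials


checked = check_roundtrip()
```

The unit test for the happy path would check one input, such as `"aaab"`. The property states what must hold for all inputs and lets the generator look for the ones you did not think of.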
Debugging: Effectively isolating, mitigating, and removing issues.
- …