Can We Rank Developers?
Based on The PrimeTime's video on YouTube. If you enjoy this content, support the original creators by watching, liking, and subscribing.
Briefing
Software-engineering ranks—whether framed as “belt colors” like martial arts or as numeric levels—can’t be made truly fair or universal because programming skill is too context-dependent and too multidimensional. The most concrete proposal discussed is a 0.0–3.0 scale that ties competence to the type of work (additive contributions, infrastructural multipliers, and global multipliers), but the discussion repeatedly returns to a hard constraint: the same person can look like a different “rank” depending on environment, domain, and what’s being measured.
The conversation starts by borrowing the logic of martial arts ranking: a black belt doesn’t mean mastery of everything; it signals readiness to teach and continued growth. That analogy becomes the jumping-off point for programming. A “universal programming measurement” is floated as a way to assess developers more systematically—e.g., a 1.0 threshold for typical corporate employability, 2.0 as a “10x” tier, and higher levels representing compounding impact. The scale is paired with a reliability curve idea: a 1.8 engineer might be roughly 85% competent at level-2 tasks, with reliability improving as the level rises.
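To make the reliability-curve idea concrete, here is a minimal sketch in TypeScript. It assumes a logistic curve; the `STEEPNESS` and `OFFSET` constants are my own calibration, fitted to the single figure given above (a 1.8 engineer at roughly 85% on level-2 tasks), since the discussion never specifies a formula.

```typescript
// Hypothetical model of the reliability curve on the 0.0–3.0 scale.
// The logistic shape and both constants are calibration assumptions,
// fitted to the one figure mentioned in the discussion
// (a 1.8 engineer is ~85% competent at level-2 tasks).

const STEEPNESS = 5.8; // how sharply reliability rises with rating
const OFFSET = 0.5;    // headroom: a rating equal to the level gives ~95%, not 50%

/**
 * Estimated probability that an engineer with the given rating
 * completes a task of the given level competently.
 */
function reliability(rating: number, level: number): number {
  const x = STEEPNESS * (rating - level + OFFSET);
  return 1 / (1 + Math.exp(-x));
}

// Check against the one data point from the discussion:
console.log(reliability(1.8, 2).toFixed(2)); // "0.85"
// Reliability keeps improving as the rating rises past the level:
console.log(reliability(2.0, 2).toFixed(2)); // "0.95"
console.log(reliability(2.5, 2).toFixed(2)); // "1.00"
```

Any monotone curve through that single point would serve equally well; the logistic form is chosen only because it captures "reliability improves as the rating rises" with diminishing returns at the top.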
But the pushback is immediate and persistent. First, ranks in software already exist in practice (junior/senior/staff/VP/CTO), and they're often miscalibrated—sometimes based on time served, politics, or how well someone sells their ideas. Second, teaching incentives distort behavior: promotion systems and compensation often reward instruction and "learning in public," which can produce a lot of low-value teaching, or premature teaching by people who aren't ready. Third, "black belt = teacher" doesn't map cleanly to programming, because many high performers are hired to earn a living and ship work, not to mentor.
The discussion also challenges the meaning of "10x." Even if a developer is exceptional, the value gap between levels can vary wildly by environment. Greenfield projects may amplify differences; corporate environments can compress them. Some participants argue that "10x" often means producing more work rather than solving problems 10x better, and that different specialties (UI vs. infrastructure, embedded C vs. SaaS, Lisp macro/DSL design vs. UI refactoring) don't fit a single ladder.
Ultimately, the most pragmatic conclusion is cynical but actionable: if engineers don’t build their own ranking or assessment systems, businesses will rank them anyway—often using criteria that don’t match technical merit. Yet the group remains skeptical that any single, organization-independent, universal belt system can work. Programming skill is shaped by domain, failure tolerance in the culture, and the specific tasks being evaluated. The best “rank” systems may therefore be internal, role-aware, and culture-aware rather than universal—and even then, they’ll produce anomalies. The core takeaway is that measuring developers is less about finding a perfect ladder and more about acknowledging what can’t be standardized.
Cornell Notes
The discussion tests whether developer “belt colors” or numeric levels can fairly rank programmers the way martial arts ranks do. A proposed 0.0–3.0 scale links competence to task type: additive business value (level 1), infrastructural multipliers (level 2), and global multipliers (level 3), with a reliability/competence curve (e.g., a 1.8 engineer being ~85% competent at level-2 work). Major objections stress that programming skill is context-dependent (environment, domain, and what tasks are being measured), and that “black belt = teacher” doesn’t translate well to software incentives. The result is skepticism toward any universal, organization-independent ranking system, even while acknowledging that ranking will happen regardless—so better internal assessment may be necessary.
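For the tier taxonomy itself, a hypothetical TypeScript encoding might look like the sketch below. The type names, the level mapping, and the "integer part = highest reliable tier" reading are illustrative assumptions layered on the prose description, not definitions from the discussion.

```typescript
// Hypothetical encoding of the scale's three task tiers.
// Names and the level mapping are illustrative assumptions;
// the discussion describes the tiers only in prose.

type ContributionKind =
  | "additive"         // level 1: direct business value (features, bug fixes)
  | "infrastructural"  // level 2: multiplies one team's output (tooling, platforms)
  | "global";          // level 3: multiplies output across teams or organizations

const LEVEL_OF: Record<ContributionKind, 1 | 2 | 3> = {
  additive: 1,
  infrastructural: 2,
  global: 3,
};

// One possible reading of a rating like 1.8: the integer part is the
// highest tier handled reliably, while the fraction suggests partial
// competence one tier up (~85% at infrastructural work, per the
// curve sketched earlier).
function tiersReliablyHandled(rating: number): ContributionKind[] {
  return (Object.keys(LEVEL_OF) as ContributionKind[])
    .filter((kind) => LEVEL_OF[kind] <= Math.floor(rating));
}

console.log(tiersReliablyHandled(1.8)); // ["additive"]
console.log(tiersReliablyHandled(2.3)); // ["additive", "infrastructural"]
```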
Why does the martial-arts analogy break when applied to software engineering ranks?
What is the proposed 0.0–3.0 programming scale, and how does it try to quantify competence?
Why is “10x” considered unreliable as a universal metric?
How do culture and environment affect developer "rank" outcomes?
What critique targets teaching and “learn in public” incentives?
If universal ranking seems impossible, what practical motivation remains?
Review Questions
- What would a “level 2” developer need to consistently produce under the proposed scale, and why might that not translate across domains?
- How do teaching incentives distort promotion or perceived rank in software compared with martial arts?
- Which factors in the discussion make “10x” a shaky universal benchmark?
Key Points
1. A martial-arts-style "black belt means teacher" analogy doesn't map cleanly to software because promotion incentives and job roles differ.
2. A proposed 0.0–3.0 scale ties competence to contribution type (additive value, infrastructural multipliers, global multipliers) and uses a competence/reliability curve (e.g., a 1.8 rating is ~85% competent at level-2 tasks).
3. "10x" performance varies by environment and project type, so a single universal metric can misclassify developers.
4. Specialization (UI vs. infrastructure, embedded C vs. SaaS, macro/DSL design vs. refactoring) makes one ladder inherently unfair.
5. Teaching can be over-incentivized in software, leading to premature or low-value instruction and skewed perceptions of competence.
6. Existing corporate titles already rank people imperfectly, often influenced by time, politics, and communication skill rather than pure technical ability.
7. Engineers may still need internal, role-aware assessment systems because businesses will rank developers regardless—often using criteria that don't match technical merit.