A few weeks ago, Martin Alderson published something that caught my attention: a systematic comparison of how token-efficient different programming languages are when fed to large language models.
The findings were fascinating. And if you’ve been following my writing on Ruby, you won’t be surprised to hear that Ruby came out looking very good indeed.
But this isn’t just about bragging rights for Ruby developers. It points to something bigger: a fundamental shift in how we should think about programming language design in an age where AI is increasingly writing and reading our code.
The Constraint Nobody Saw Coming
Here’s the thing about LLMs that breaks most of our mental models about computing: they don’t care about CPU cycles the way traditional programs do. What they care about, desperately, is context window.
The context window is the amount of information an LLM can hold in its working memory at once. Think of it as the size of the desk where you’re working. No matter how fast you can think, if you can only fit three pages on your desk at a time, that’s a hard limit on how much information you can work with simultaneously.
For software development agents, this matters enormously. A significant portion of your context window is going to be code: the files you’re working with, the changes you’re making, the diffs you’re reviewing. A more token-efficient language means you can fit more code in the window, which means longer productive sessions and better context for the AI to work with.
We’ve spent decades optimizing languages for human readability and machine performance. Now we need to think about a third dimension: AI comprehension efficiency.
The Data
Alderson’s analysis used the RosettaCode project, which contains implementations of over a thousand programming tasks across nearly a thousand languages. He filtered for tasks that had solutions in all 19 of the most popular mainstream languages, then ran them through the GPT-4 tokenizer.
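To make the methodology concrete, here’s a minimal sketch of that kind of measurement in Ruby. I’m assuming the tiktoken_ruby gem (a community binding to OpenAI’s tokenizer); the gem and this exact setup are my illustration, not part of Alderson’s pipeline:

```ruby
# Token-counting sketch, assuming the tiktoken_ruby gem is installed
# (gem install tiktoken_ruby).
require "tiktoken_ruby"

enc = Tiktoken.encoding_for_model("gpt-4")

ruby_code = "def add(a, b)\n  a + b\nend\n"
java_code = "public static int add(int a, int b) {\n  return a + b;\n}\n"

puts enc.encode(ruby_code).length # the Ruby version costs fewer tokens
puts enc.encode(java_code).length # the Java version costs more
```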
The results showed a 2.6x gap between the most efficient and least efficient languages. That’s not a marginal difference. That’s the difference between fitting one file in your context window and fitting nearly three.
The least token-efficient language in the comparison was C. The most efficient was Clojure. And Ruby? Ruby landed in a comfortable second place among mainstream languages, right behind Clojure.
This surprised me initially. Ruby isn’t usually celebrated for its brevity. It’s celebrated for its expressiveness, its elegance, its joy. But when you think about it more deeply, those qualities and token efficiency turn out to be closely related.
Why Dynamic Languages Win
One of the clearest patterns in the data: dynamic languages consistently outperformed static languages in token efficiency. This makes intuitive sense when you think about what tokenization actually measures.
Every type annotation is tokens. Every explicit declaration is tokens. Every semicolon is a token. Every curly brace is a token. Ruby doesn’t have type annotations. It doesn’t require semicolons. It uses end instead of braces, which might seem like a wash, but Ruby’s approach to blocks and implicit returns means you often write less code overall.
Compare equivalent code in Ruby and Java:
```ruby
# Ruby
def add(a, b)
  a + b
end
```
```java
// Java
public static int add(int a, int b) {
  return a + b;
}
```
The Java version has explicit return type, explicit parameter types, explicit return statement, access modifier, static keyword, and curly braces. Every one of those elements consumes tokens.
But here’s where it gets interesting: languages like Haskell and F# were barely less efficient than the top dynamic languages, despite being statically typed. Why? Type inference. When the compiler can figure out types without you writing them, you get the safety benefits of static typing without the token cost.
This suggests something important about language design for an AI-assisted future: type inference isn’t just a convenience feature. It’s a fundamental efficiency gain.
The Ruby Philosophy Pays Off
I’ve written before about what makes Ruby unique: its blocks, its metaprogramming, its eigenclasses, its modules as mixins. These aren’t just interesting language features. They’re compression mechanisms.
When you can write:
```ruby
users.select(&:active?).map(&:email)
```
Instead of:
```ruby
users.select { |user| user.active? }.map { |user| user.email }
```
Or worse, the equivalent in a more verbose language, you’re not just saving keystrokes. You’re saving tokens. And when an LLM is reading your code, those savings compound.
Ruby’s idioms like attr_accessor, delegate, and Rails’ conventions like has_many :posts pack enormous semantic density into minimal token counts. A single line of Ruby can express what might take five or ten lines in other languages.
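To see that density in action, compare a single attr_accessor call with a simplified, hand-written equivalent:

```ruby
# One line of Ruby...
class User
  attr_accessor :name, :email
end

# ...stands in for four hand-written methods:
class User
  def name
    @name
  end

  def name=(value)
    @name = value
  end

  def email
    @email
  end

  def email=(value)
    @email = value
  end
end
```

Same interface, a fraction of the tokens.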
This is why I argued in my Christmas post about Ruby’s future that the framework is uniquely positioned for AI-assisted development. LLMs work better with Ruby because Ruby gives them more signal per token.
The Surprise Contender: Array Languages
One of the more surprising findings in Alderson’s analysis came from a reader suggestion. J, an array language that uses ASCII instead of APL’s special symbols, dominated the field at just 70 tokens on average, roughly a third fewer than Clojure’s 109.
APL itself, despite its famous terseness, didn’t do nearly as well. The problem? Tokenizers aren’t optimized for APL’s unique glyphs. Each of those beautiful symbols (⍳, ⍴, ⌽) ends up as multiple tokens.
This is a crucial insight: token efficiency isn’t just about character count. It’s about how well your language’s patterns match the tokenizer’s vocabulary. Common English words and standard programming constructs tokenize efficiently. Exotic symbols often don’t.
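You can glimpse part of the problem from Ruby itself. Byte counts aren’t token counts, but a byte-level BPE tokenizer with no learned merges for these glyphs can spend up to one token per byte:

```ruby
# Each APL glyph takes three bytes in UTF-8, whereas common ASCII
# keywords like "def" or "end" often tokenize as a single token.
"⍳⍴⌽".each_char do |glyph|
  puts "#{glyph}: #{glyph.bytesize} bytes"
end
# ⍳: 3 bytes
# ⍴: 3 bytes
# ⌽: 3 bytes
```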
Ruby benefits from this. Its syntax uses common words: def, end, do, if, class. These are well-represented in any tokenizer trained on programming text. The language’s surface syntax is almost conversational, which turns out to be exactly what you want for LLM consumption.
What This Means for Working with AI Coding Assistants
If you’re using Claude Code, Cursor, GitHub Copilot, or any other AI coding tool, the token efficiency of your language choice has practical implications:
Longer productive sessions. When your code takes fewer tokens to represent, you can work on larger features before hitting context limits. This means fewer interruptions to summarize context or restart conversations.
Better AI comprehension. More code in context means the AI has more information to work with when making suggestions. It can see more of your codebase, understand more relationships, and make more informed recommendations.
Lower costs. If you’re paying per token (which you often are, even indirectly), more efficient code means lower bills. At scale, a 2x token efficiency difference is a 2x cost difference.
Faster iteration. Fewer tokens means faster round trips. When you’re in a tight feedback loop with an AI assistant, every millisecond matters.
I’ve noticed this effect in my own work. When I’m using Claude Code on a Rails project, I can hold more context in the conversation than when working with more verbose languages. The AI understands more about what I’m trying to do because it can see more of the relevant code.
The Convergence of Developer Experience and AI Experience
Here’s what I find most interesting about these findings: the languages that are most token-efficient tend also to be the languages developers find most pleasant to write.
Ruby, Python, Clojure, Haskell. These aren’t random languages. They’re languages that have consistently ranked highly in developer satisfaction surveys. They’re languages people choose when they have a choice.
This isn’t coincidence. The qualities that make a language pleasant to write (expressiveness, minimal boilerplate, clear intent) are the same qualities that make it token-efficient. When you remove unnecessary verbosity for human readers, you’re simultaneously removing unnecessary tokens for AI readers.
In my post about Ruby’s unique structures, I argued that Ruby’s metaprogramming capabilities let you express complex concepts concisely. The same features that make Ruby a joy to write make it efficient to tokenize.
The Counterargument: Compilation Feedback
There’s an important counterpoint here. Languages like Go, C#, and Java may be more verbose, but they offer something valuable in return: compilation feedback.
When an AI generates code in a statically typed language, the compiler can immediately tell it whether the code is valid. No waiting for runtime errors. No subtle bugs lurking in type mismatches. The feedback loop is tight and reliable.
This matters for AI-generated code because hallucinations are real. An LLM might confidently generate code that calls methods that don’t exist or passes the wrong types. In a static language with a good type system, these errors get caught immediately. In a dynamic language, they might not surface until runtime.
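A contrived Ruby example of that failure mode: the hallucinated method name below is perfectly valid syntax and fails only when the line executes:

```ruby
class User
  def active?
    true
  end
end

user = User.new
user.active?    # => true
user.activated? # hallucinated name: no parse or compile error,
                # just a NoMethodError at runtime
```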
The ideal, as Alderson notes, might be languages like Haskell and F# that combine excellent type inference (reducing token cost) with strong static typing (providing compilation feedback). You get the best of both worlds: concise code and reliable error checking.
Implications for Language Design
If context window really is the new constraint, what does this mean for programming language design going forward?
Type inference becomes essential, not optional. Languages that require explicit type annotations everywhere will be at a disadvantage. The future belongs to languages that can infer types without you writing them.
Syntactic minimalism matters more than ever. Every unnecessary character is a wasted token. Languages should provide concise ways to express common patterns.
Convention over configuration has new meaning. Rails’ approach, where sensible defaults eliminate boilerplate, is now an AI efficiency strategy. The less configuration you write, the more context you preserve for meaningful code.
Metaprogramming is a feature, not a bug. Ruby’s ability to generate code at runtime, which some see as dangerous magic, is actually a compression mechanism. It lets you express complex systems in minimal tokens.
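As a small illustration (my example, not from the original analysis), a few lines of define_method can replace a whole family of near-identical predicates:

```ruby
class Order
  STATES = %i[pending paid shipped].freeze

  # Defines pending?, paid?, and shipped? at class-definition time,
  # replacing three nearly identical hand-written methods.
  STATES.each do |state|
    define_method("#{state}?") { @state == state }
  end

  def initialize(state)
    @state = state
  end
end

Order.new(:paid).paid?    # => true
Order.new(:paid).shipped? # => false
```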
Standard library design affects AI productivity. Rich standard libraries with well-named methods tokenize efficiently. When users.select(&:active?) is part of the language’s vocabulary, it’s more efficiently understood than a custom implementation of the same logic.
The Broader Picture
This analysis is one data point in a larger trend. We’re moving from a world where programming languages were optimized purely for human-machine interaction to one where AI is a third participant in the conversation.
Code now needs to be readable by humans, executable by machines, and comprehensible to AI systems with finite context windows. The languages that thread this needle well will thrive. Those that don’t will face increasing friction in AI-assisted workflows.
Ruby, somewhat accidentally, appears to be well-positioned for this transition. Its design philosophy, which has always prioritized developer happiness and expressiveness, turns out to align nicely with the constraints of LLM-based development tools.
This doesn’t mean everyone should drop everything and switch to Ruby. Language choice involves many factors: ecosystem, team expertise, problem domain, existing codebase. But it does suggest that Ruby developers shouldn’t feel apologetic about their language choice in an AI-first future. If anything, they should feel vindicated.
Practical Takeaways
If you’re already using Ruby:
- Embrace idiomatic Ruby. Those compact patterns (&:method, symbol-to-proc, implicit returns) aren’t just elegant. They’re AI-efficient.
- Use Rails conventions. has_many, validates, and scope pack enormous semantic density into minimal tokens.
- Consider Slim over ERB. As I discussed in my templating comparison, Slim’s indentation-based syntax eliminates angle brackets and closing tags. That’s fewer tokens; see the sketch after this list.
- Trust metaprogramming where appropriate. Code that generates code can be token-efficient at the interface level, even if the generated code is verbose.
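On the Slim point, here’s a minimal before-and-after of the same list in both template languages (a sketch; actual token counts depend on the tokenizer, but the Slim version has visibly less syntax to tokenize):

```erb
<%# ERB: angle brackets, closing tags, explicit end %>
<ul>
  <% @users.each do |user| %>
    <li><%= user.email %></li>
  <% end %>
</ul>
```

```slim
/ Slim: indentation replaces brackets and closing tags
ul
  - @users.each do |user|
    li = user.email
```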
If you’re choosing a language for a new project:
- Consider token efficiency as a factor, alongside all the other factors you already consider.
- Dynamic languages with strong conventions (Ruby, Python) tend to be efficient.
- Statically typed languages with type inference (Haskell, F#, Rust) can be nearly as efficient while providing compile-time feedback.
- Verbose languages with explicit typing (Java, C#, Go) will consume more context window.
If you’re thinking about language evolution:
- Watch for languages that prioritize both AI comprehension and human readability.
- Type inference is increasingly table stakes.
- Convention-over-configuration patterns may become more prevalent across frameworks.
Conclusion
Token efficiency in programming languages is a constraint we’re just beginning to understand. As AI becomes more central to software development, this metric will matter more, not less.
Ruby’s position in this ranking isn’t surprising when you understand Ruby’s philosophy. A language designed to make programmers happy tends also to be a language that expresses ideas concisely. And concise expression is exactly what token-constrained AI systems need.
The irony is beautiful: a language often dismissed as “slow” or “unserious” by systems programmers turns out to be highly efficient in the dimension that increasingly matters. Not CPU cycles, but information density. Not execution speed, but comprehension bandwidth.
As long as context windows remain a bottleneck, and as AI assistants become ever more integral to development workflows, language choice will increasingly need to account for this new constraint. Ruby developers, as it happens, are already well-positioned for this future.
We optimized for joy, and we got efficiency as a bonus. That’s a pretty Ruby way for things to work out.
Interested in more on Ruby’s place in the AI-assisted development landscape? Check out my posts on Ruby’s unique structures, when to use which Ruby building blocks, and my 2026 Rails predictions.