Saving Money With Embeddings in AI Memory Systems: Why Ruby on Rails is Perfect for LangChain

In the exploration of AI memory systems and embeddings, the author highlights the hidden costs in AI development, emphasizing token management. Leveraging Ruby on Rails streamlines the integration of LangChain for efficient memory handling. Adopting strategies like summarization and selective retrieval significantly reduces expenses, while maintaining readability and scalability in system design.

Over the last few months of rebuilding my Rails muscle memory, I’ve been diving deep into AI memory systems and experimenting with embeddings. One of the biggest lessons I’ve learned is that the cost of building AI isn’t just in the model; it’s in how you use it. Tokens, storage, retrieval: these are the hidden levers that determine whether your AI stack remains elegant or becomes a runaway expense.

And here’s the good news: with Ruby on Rails, managing these complexities becomes remarkably simple. Rails has always been about turning complicated things into something intuitive and maintainable, and when you pair it with LangChain, it feels like magic.


Understanding the Cost of Embeddings

Most people think that running large language models is expensive because of the model itself. That’s only partially true. In practice, the real costs come from:

  • Storing too much raw content: Every extra paragraph you embed costs more in tokens, both for the embedding itself and for later retrieval.
  • Embedding long texts instead of summaries: LLMs don’t need the full novel; they often just need the distilled version. Summaries are shorter, cheaper, and surprisingly effective.
  • Retrieving too many memories: Pulling 50 memories for a simple question can cost more than the model call itself. Smart retrieval strategies can drastically cut costs.
  • Feeding oversized prompts into the model: Every extra token in your prompt adds up. Cleaner prompts = cheaper calls.

I’ve seen projects where embedding every word of a document seemed “safe,” only to realize months later that the token bills were astronomical. That’s when I started thinking in terms of summary-first embeddings.


How Ruby on Rails Makes It Easy

Rails is my natural playground for building systems that scale reliably without over-engineering. Why does Rails pair so well with AI memory systems and LangChain? Several reasons:

Migrations Are Elegant
With Rails, adding a vector column with PgVector feels like any other migration. You can define your tables, indexes, and limits in one concise block:

class AddMemoriesTable < ActiveRecord::Migration[7.1]
  def change
    # pgvector extension; a gem like neighbor adds the vector column type to Active Record
    enable_extension "vector"

    create_table :memories do |t|
      t.text :content, null: false
      t.vector :embedding, limit: 1536  # dimensions of your embedding model
      t.jsonb :metadata
      t.timestamps
    end
  end
end


There’s no need for complicated schema scripts. Rails handles the boring but essential details for you.
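
And if you want fast similarity lookups later, the index is just another line in the same migration. A minimal sketch, assuming pgvector’s HNSW index and cosine distance:

add_index :memories, :embedding, using: :hnsw, opclass: :vector_cosine_ops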

ActiveRecord Makes Embedding Storage a Breeze
Storing embeddings in Rails is almost poetic. With a simple model, you can create a memory with content, an embedding, and metadata in a single call:

Memory.create!(
  content: "User prefers Japanese and Mexican cuisine.",
  embedding: embedding_vector,
  metadata: { type: :preference, user_id: 42 }
)

And yes, you can query those memories by similarity in a single, readable line:

Memory.order(Arel.sql("embedding <=> '[#{query_embedding.join(',')}]'")).limit(5)

Rails keeps your code readable and maintainable while you handle sophisticated vector queries.

LangChain Integration is Natural
LangChain is all about chaining LLM calls, memory storage, and retrieval. In Rails, you already have everything you need: models, services, and job queues. You can plug LangChain into your Rails services to:

  • Summarize content before embedding
  • Retrieve only the most relevant memories
  • Cache embeddings efficiently for repeated use

Rails doesn’t get in the way. It gives you structure without slowing you down.
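
Here’s a minimal sketch of that first point: a plain Rails service that summarizes before embedding. The llm and embedder collaborators are hypothetical stand-ins for whatever client you wire in (langchainrb, ruby-openai, and so on):

class MemoryIngestor
  # llm: responds to #summarize(text) and returns a short String
  # embedder: responds to #embed(text) and returns an Array of Floats
  def initialize(llm:, embedder:)
    @llm = llm
    @embedder = embedder
  end

  def ingest(content, metadata: {})
    summary = @llm.summarize(content)  # embed the distilled version, not the novel
    Memory.create!(
      content: summary,
      embedding: @embedder.embed(summary),
      metadata: metadata
    )
  end
end

Because it’s a plain Ruby object, it drops into a controller, a background job, or a rake task without ceremony.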


Saving Money with Smart Embeddings

Here’s the approach I’ve refined over multiple projects:

  1. Summarize Before You Embed
    Instead of embedding full documents, feed the model a summary. A 50-word summary costs fewer tokens but preserves the semantic meaning needed for retrieval.
  2. Limit Memory Retrieval
    You rarely need more than 5–10 memories for a single model call. More often than not, extra memories just bloat your prompt and inflate costs.
  3. Use Metadata Wisely
    Store small, structured metadata alongside your embeddings to filter memories before similarity search. For example, filter by user_id or type instead of pulling all records into the model.
  4. Cache Strategically
    Don’t re-embed unchanged content. Use Rails validations, background jobs, and services to embed only when necessary.

When you combine these strategies, the savings are significant. In some projects, embedding costs dropped by over 70% without losing retrieval accuracy.
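
As a rough sketch of strategies 2–4 (again with a hypothetical embedder, and a content digest stored in metadata to detect unchanged text):

require "digest"

class MemoryRecall
  RETRIEVAL_LIMIT = 5  # strategy 2: a handful of memories is usually enough

  def self.relevant(query_embedding, user_id:)
    Memory
      .where("metadata->>'user_id' = ?", user_id.to_s)  # strategy 3: filter by metadata first
      .order(Arel.sql("embedding <=> '[#{query_embedding.join(',')}]'"))
      .limit(RETRIEVAL_LIMIT)
  end
end

class MemoryWriter
  # strategy 4: skip re-embedding when the content hasn't changed
  def self.store(content, embedder:, metadata: {})
    digest = Digest::SHA256.hexdigest(content)
    Memory.find_by("metadata->>'digest' = ?", digest) ||
      Memory.create!(
        content: content,
        embedding: embedder.embed(content),
        metadata: metadata.merge(digest: digest)
      )
  end
end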


Why I Stick With Rails and PostgreSQL

There are many ways to build AI memory systems. You could go with specialized databases, microservices, or cloud vector stores. But here’s what keeps me on Rails and Postgres:

  • Reliability: Postgres is mature, stable, and production-ready. PgVector adds vector search without changing the foundation.
  • Scalability: Rails scales surprisingly well when you keep queries efficient and leverage background jobs.
  • Developer Happiness: Rails lets me iterate quickly. I can prototype, test, and deploy AI memory features without feeling like I’m juggling ten different systems.
  • Future-Proofing: Rails projects can last years without a complete rewrite. AI infrastructure is still evolving; having a stable base matters.

Closing Thoughts

AI memory doesn’t have to be complicated or expensive. By thinking carefully about embeddings, summaries, retrieval, and token usage, and by leveraging Rails with LangChain, you can build memory systems that are elegant, fast, and cost-effective.

For me, Rails is more than a framework. It’s a philosophy: build systems that scale naturally, make code readable, and keep complexity under control. Add PgVector and LangChain to that mix, and suddenly AI memory feels like something you can build without compromise.

In the world of AI, where complexity grows faster than budgets, that kind of simplicity is priceless.

The Art of Reusability and Why AI Still Doesn’t Understand It

AI can generate code but lacks understanding of design intent, making it struggle with reusability. True reusability involves encoding shared ideas and understanding context, which AI cannot grasp. This leads to overgeneralized or underabstracted code. Effective engineering requires human judgment and foresight that AI is currently incapable of providing.

After writing about the team that deleted 200,000 lines of AI-generated code without breaking their app, a few people asked me:

“If AI is getting so good at writing code, why can’t it also reuse code properly?”

That’s the heart of the problem.

AI can produce code.
It can suggest patterns.
But it doesn’t understand why one abstraction should exist and why another should not.

It has no concept of design intent, evolution over time, or maintainability.
And that’s why AI-generated code often fails at the very thing great software engineering is built upon: reusability.


Reusability Isn’t About Copying Code

Let’s start with what reusability really means.

It’s not about reusing text.
It’s about reusing thought.

When you make code reusable, you’re encoding an idea (a shared rule or process) in one place, so it can serve multiple contexts.
That requires understanding how your domain behaves and where boundaries should exist.

Here’s a small example in Ruby 3.4:

# A naive AI-generated version
class InvoiceService
  def create_invoice(customer, items)
    total = items.sum { |i| i[:price] * i[:quantity] }
    tax = total * 0.22
    {
      customer: customer,
      total: total,
      tax: tax,
      grand_total: total + tax
    }
  end

  def preview_invoice(customer, items)
    total = items.sum { |i| i[:price] * i[:quantity] }
    tax = total * 0.22
    {
      preview: true,
      total: total,
      tax: tax,
      grand_total: total + tax
    }
  end
end

It works. It looks fine.
But the duplication here is silent debt.

A small tax change or business rule adjustment would require edits in multiple places, and the AI wouldn’t warn you about it.

Now, here’s how a thoughtful Rubyist might approach the same logic:

class InvoiceCalculator
  TAX_RATE = 0.22

  def initialize(items)
    @items = items
  end

  def subtotal = @items.sum { |i| i[:price] * i[:quantity] }
  def tax = subtotal * TAX_RATE
  def total = subtotal + tax
end

class InvoiceService
  def create_invoice(customer, items, preview: false)
    calc = InvoiceCalculator.new(items)

    {
      customer: customer,
      total: calc.subtotal,
      tax: calc.tax,
      grand_total: calc.total,
      preview: preview
    }
  end
end

Now the logic is reusable, testable, and flexible.
If tax logic changes, it’s centralized.
If preview behavior evolves, it stays isolated.

This is design thinking, not just text prediction.


Why AI Struggles with This

AI doesn’t understand context; it understands correlation.

When it generates code, it pulls from patterns it has seen before. It recognizes that “invoices” usually involve totals, taxes, and items.
But it doesn’t understand the relationship between those things in your specific system.

It doesn’t reason about cohesion (what belongs together) or coupling (what should stay apart).

That’s why AI-generated abstractions often look reusable but aren’t truly so.
They’re usually overgeneralized (“utility” modules that do too much) or underabstracted (duplicate logic with slightly different names).
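
A quick, made-up illustration of both failure modes:

# Overgeneralized: a grab-bag "utility" with no cohesive reason to exist
module BillingUtils
  def self.calculate(items, rate: 0.22, preview: false, round: true, currency: :eur)
    # ...one method trying to serve every caller
  end
end

# Underabstracted: the same rule duplicated under slightly different names
def order_tax(total)   = total * 0.22
def invoice_tax(total) = total * 0.22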

In other words:
AI doesn’t design for reuse; it duplicates for confidence.


A Real Example: Reusability in Rails

Let’s look at something familiar to Rubyists: ActiveRecord scopes.

An AI might generate this:

class Order < ApplicationRecord
  scope :completed, -> { where(status: 'completed') }
  scope :recent_completed, -> { where(status: 'completed').where('created_at > ?', 30.days.ago) }
end

Looks fine, right?
But you’ve just duplicated the status: 'completed' filter.

A thoughtful approach is:

class Order < ApplicationRecord
  scope :completed, -> { where(status: 'completed') }
  scope :recent, -> { where('created_at > ?', 30.days.ago) }
  scope :recent_completed, -> { completed.recent }
end

It’s subtle, but it’s how reusability works.
You extract intent into composable units.
You think about how the system wants to be extended later.

That level of foresight doesn’t exist in AI-generated code.


The Human Element: Judgment and Intent

Reusability isn’t just an engineering principle; it’s a leadership one.

Every reusable component is a promise to your future self and your team.
You’re saying: “This logic is safe to depend on.”

AI can’t make that promise.
It can’t evaluate trade-offs or organizational conventions.
It doesn’t know when reuse creates value and when it adds friction.

That’s why good engineers are editors, not just producers.
We don’t chase volume; we curate clarity.


My Takeaway

AI is incredible at generating examples.
But examples are not design.

Real, human-level reusability comes from understanding what stays constant when everything else changes.
And that’s something no model can infer without human intent behind it.

So yes, AI can write Ruby.
It can even generate elegant-looking methods.
But it still can’t think in Ruby.
It can’t feel the rhythm of the language, or the invisible architecture behind a clean abstraction.

That’s still our job.

And it’s the part that makes engineering worth doing.


Written by Ivan Turkovic, a technologist, Rubyist, and blockchain architect exploring how AI and human craftsmanship intersect in modern software engineering.

Returning to the Rails World: What’s New and Exciting in Rails 8 and Ruby 3.3+

It’s 2025, and coming back to Ruby on Rails feels like stepping into a familiar city only to find new skyscrapers, electric trams, and an upgraded skyline.
The framework that once defined web development simplicity has reinvented itself once again.

If you’ve been away for a couple of years, you might remember Rails 6 or early Rails 7 as elegant but slightly “classic.”
Fast-forward to today: Rails 8 and Ruby 3.4 together form one of the most modern, high-performance, and full-stack ecosystems in web development.

Let’s explore what changed, from Ruby’s evolution to Rails’ latest superpowers.


The Ruby Renaissance: From 3.2 to 3.4

Over the last two years, Ruby has evolved faster than ever.
Performance, concurrency, and developer tooling have all received major love while the language remains as expressive and joyful as ever.

Ruby 3.2 (2023): The Foundation of Modern Ruby

  • YJIT officially production-ready: The JIT compiler, rewritten in Rust, delivers 20–40% faster execution on real Rails apps.
  • Prism Parser (preview): The groundwork for a brand-new parser that improves IDEs, linters, and static analysis.
  • Regexp improvements: More efficient and less memory-hungry pattern matching.
  • Data class: A new core class, built with Data.define, for small, immutable value objects (see the example after this list).
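
For example, the Data class that landed in Ruby 3.2:

Point = Data.define(:x, :y)

origin = Point.new(x: 0, y: 0)
origin.x        # => 0
origin.frozen?  # => true, Data instances are immutable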

Ruby 3.3 (2024): Performance, Async IO, and Stability

  • YJIT 3.3 update: Added inlining and better method dispatch caching, big wins for hot code paths.
  • Fiber Scheduler 2.0: Improved async I/O, great for background processing and concurrent network calls.
  • Prism Parser shipped: Officially integrated, paving the way for better tooling and static analysis.
  • Better memory compaction: Long-running apps now leak less and GC pauses are shorter.

Ruby 3.4 (2025): The Next Leap

  • Prism as the default parser, making editors and LSPs much more accurate.
  • Official WebAssembly build: You can now compile and run Ruby in browsers or serverless environments.
  • Async and Fibers 3.0: Now tightly integrated into standard libraries like Net::HTTP and OpenURI.
  • YJIT 3.4: Huge startup time and memory improvements for large Rails codebases.
  • Smarter garbage collector: Dynamic tuning for better throughput under load.

Example: Native Async Fetching in Ruby 3.4

require "async"
require "net/http"

Async do
  ["https://rubyonrails.org", "https://ruby-lang.org"].each do |url|
    Async do
      res = Net::HTTP.get(URI(url))
      puts "#{url} → #{res.bytesize} bytes"
    end
  end
end

That’s fully concurrent, purely in Ruby: no threads to manage, just the async gem providing the Fiber scheduler.
Ruby has quietly become fast, efficient, and concurrent while keeping its famously clean syntax.


The Rails Revolution: From 7 to 8

While Ruby evolved under the hood, Rails reinvented the developer experience.
Rails 7 introduced the “no-JavaScript-framework” movement with Hotwire.
Rails 8 now expands that vision, making real-time, async, and scalable apps easier than ever.

Rails 7 (2022–2024): The Hotwire Era

Rails 7 changed the front-end game:

  • Hotwire (Turbo + Stimulus): Replaced complex SPAs with instant-loading server-rendered apps.
  • Import maps: Let you skip Webpack entirely.
  • Encrypted attributes: encrypts :email became a one-line reality.
  • ActionText and ActionMailbox: Brought full-stack communication features into Rails core.
  • Zeitwerk loader improvements: Faster boot and reloading in dev mode.

Example: Rails 7 Hotwire Simplicity

# app/controllers/messages_controller.rb
def create
  @message = Message.create!(message_params)
  render turbo_stream: turbo_stream.append(
    "messages", partial: "messages/message", locals: { message: @message }
  )
end

That’s a live-updating chat stream with no React, no WebSocket boilerplate.


Rails 8 (2025): Real-Time, Async, and Database-Native

Rails 8 takes everything Rails 7 started and levels it up for the next decade.

Turbo 8 and Turbo Streams 2.0

Hotwire gets more powerful:

  • Streaming updates from background jobs
  • Improved Turbo Frames for nested components
  • Async rendering for faster page loads

class CommentsController < ApplicationController
  def create
    @comment = Comment.create!(comment_params)
    render turbo_stream: turbo_stream.prepend(
      "comments", partial: "comments/comment", locals: { comment: @comment }
    )
  end
end

Now you can push that stream from Active Job or Solid Queue, enabling real-time updates across users.
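
A small sketch of what pushing from a job can look like with turbo-rails’ broadcast helpers. The "comments" stream name here is just an example; the page subscribes with a matching turbo_stream_from "comments" tag:

class CommentBroadcastJob < ApplicationJob
  queue_as :default

  def perform(comment_id)
    comment = Comment.find(comment_id)
    # Pushes the rendered partial over Action Cable to every subscribed browser
    Turbo::StreamsChannel.broadcast_prepend_to(
      "comments",
      target: "comments",
      partial: "comments/comment",
      locals: { comment: comment }
    )
  end
end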

Solid Queue and Solid Cache

Rails 8 introduces two built-in frameworks that change production infrastructure forever:

  • Solid Queue: Database-backed job queue; think Sidekiq-style performance without Redis.
  • Solid Cache: Native caching framework that integrates with Active Record and scales horizontally.

# Example: background email job using Solid Queue
class UserMailerJob < ApplicationJob
  queue_as :mailers

  def perform(user_id)
    UserMailer.welcome_email(User.find(user_id)).deliver_now
  end
end

No Redis, no extra service; everything just works out of the box.
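
Solid Cache plugs into the familiar Rails.cache API. A minimal sketch, with Plan standing in for whatever model you’d cache:

# config/environments/production.rb
config.cache_store = :solid_cache_store

# Anywhere in the app: the usual cache calls, now backed by the database
Rails.cache.fetch("pricing/plans", expires_in: 12.hours) do
  Plan.order(:position).to_a
end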

Async Queries and Connection Pooling

Rails 8 adds native async database queries and automatic connection throttling for multi-threaded environments.
This pairs perfectly with Ruby’s improved Fiber Scheduler.

users = User.where(active: true).load_async  # kicks the query off on a background thread
# ...do other work while the database is busy...
users.to_a  # blocks only if the result isn't ready yet

Smarter Defaults, Stronger Security

  • Active Record Encryption expanded with deterministic modes (see the example after this list)
  • Improved CSP and SameSite protections
  • Rails generators now use more secure defaults for APIs and credentials
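
Deterministic mode is what keeps an encrypted column queryable; the attribute API itself doesn’t change:

class User < ApplicationRecord
  encrypts :email, deterministic: true  # same plaintext, same ciphertext, so lookups still work
end

User.find_by(email: "ada@example.com")  # matches against the encrypted column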

Developer Experience: Rails Feels Modern Again

The latest versions of Rails and Ruby have also focused heavily on DX (developer experience).

  • bin/rails console --sandbox rolls back all changes automatically.
  • New error pages with interactive debugging.
  • esbuild & Bun support for lightning-fast JS builds.
  • Improved test parallelization with async jobs and Capybara integration.
  • ViewComponent and Hotwire integration right from generators.

Rails in 2025 feels sleek, intelligent, and incredibly cohesive.


The Future of Rails and Ruby Together

With Ruby 3.4’s concurrency and Rails 8’s async, streaming, and caching power, Rails has evolved into a true full-stack powerhouse again, capable of competing with modern Node, Elixir, or Go frameworks while staying true to its elegant roots.

It’s not nostalgia; it’s progress built on a foundation of simplicity.

If you left the Rails world thinking it was old-fashioned, this is your invitation back.
You’ll find your favorite framework faster, safer, and more capable than ever before.


Posted by Ivan Turkovic
Rubyist, software engineer, and believer in beautiful code.

AngularJS and Ruby on Rails work together

Finding the best integration of AngularJS and Ruby on Rails

Recently I got really excited about AngularJS, and making it work smoothly with Ruby on Rails takes some configuration. There are blog posts on how to integrate the two, but somehow I wasn’t happy with the available approaches. Some suggested adding the JavaScript files to the project by hand and organizing everything manually, others relied on a packaging gem or even an automated Rails app template.

My goal is to describe how to start a new Rails app from scratch, but the instructions should be succinct enough to reuse on an existing project (I actually extracted them from an existing application I am working on). For front-end development I recently discovered a great gem that brings the workflow much closer to pure full-stack JavaScript development.
Bower is a great JavaScript package manager from the people behind Twitter Bootstrap. It is like Bundler, but for JavaScript instead of Ruby. It turns out there is a Ruby gem that integrates it with rake tasks, so I can easily update all JavaScript libraries without adding a separate gem for each one. The gem is called bower-rails.
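
With bower-rails you declare front-end dependencies in a Bowerfile, much like a Gemfile, and manage them with rake tasks. A small sketch (the exact DSL can vary between gem versions):

# Bowerfile
asset "angular"
asset "angular-route"

# then, from the shell:
#   rake bower:install
#   rake bower:update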

My plan is to evolve this post into a series on building a fully functional demo application, so I’ve included some steps that might not be strictly necessary but are good to have. Don’t worry, I will explain why I am using each of them.
Here is what I will try to achieve with this series of posts:

  • creating a new demo project with AngularJS from scratch, showing all my changes along the way and trying to explain every step; this includes creating a basic Rails 4 app
  • adding basic gems
  • setup front end development with Bower
  • adding angularJS
  • implementing basic Rails and AngularJS controllers


Working OAuth2 with Foursquare on Sinatra

require 'rubygems'
require 'sinatra'
require 'oauth2'
require 'json'
require 'net/https'
require 'foursquare2'

set :port, 80

CLIENT_ID = '****************************************************'
CLIENT_SECRET = '****************************************************'
CALLBACK_PATH = '/callbacks/foursquare'

# OAuth2 client pointed at Foursquare's authorize and token endpoints
def client
  OAuth2::Client.new(CLIENT_ID, CLIENT_SECRET,
    :site          => 'https://foursquare.com/',
    :token_url     => '/oauth2/access_token',
    :authorize_url => '/oauth2/authenticate?response_type=code',
    :parse_json    => true,
    :ssl           => { :ca_path => '/etc/ssl/certs' })
end

# Rebuild the current request URL with the callback path, dropping any query string
def redirect_uri
  uri = URI.parse(request.url)
  uri.path = CALLBACK_PATH
  uri.query = nil
  uri.to_s
end

get CALLBACK_PATH do
  puts redirect_uri
  if params[:code] != nil
    # Exchange the authorization code for an access token, then call the API with it
    token = client.auth_code.get_token(params[:code], :redirect_uri => redirect_uri).token
    foursquare = Foursquare2::Client.new(:oauth_token => token)
    email = foursquare.user('self')['contact'].email.to_s
    "Authenticated user: #{email}"
  else
    'Missing response from foursquare'
  end
end

get '/' do
  redirect client.auth_code.authorize_url(:redirect_uri => redirect_uri)
end