<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://originalankur.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://originalankur.github.io/" rel="alternate" type="text/html" /><updated>2025-08-18T13:32:38+00:00</updated><id>https://originalankur.github.io/feed.xml</id><title type="html">Ankur.dev</title><subtitle>Ankur.dev - Writings of Ankur Gupta</subtitle><entry><title type="html">Software Development over the years and 2030 prediction</title><link href="https://originalankur.github.io/2025/07/27/programming-timeline.html" rel="alternate" type="text/html" title="Software Development over the years and 2030 prediction" /><published>2025-07-27T00:00:00+00:00</published><updated>2025-07-27T00:00:00+00:00</updated><id>https://originalankur.github.io/2025/07/27/programming-timeline</id><content type="html" xml:base="https://originalankur.github.io/2025/07/27/programming-timeline.html"><![CDATA[<h2 id="where-is-programming-headed">Where is Programming Headed?</h2>

<p>A personal timeline of how programming has evolved over the years. Plus a fun 2030 prediction.</p>

<hr />

<h3 id="2006-the-early-days">2006: The Early Days</h3>

<ul>
  <li>C++ and gcc/g++ toolchain</li>
  <li>UI with C++/Qt (commercial license required)</li>
  <li>SVN for version control</li>
  <li>Code reviews via change requests and SSH logins for end-to-end review</li>
  <li>Man pages as primary documentation</li>
  <li>POSIX queues, mutexes, semaphores, pthreads for multi-threaded systems</li>
  <li>Deployments done in-person at data centers or NOCs</li>
  <li>Coding in emacs, vim, NetBeans, or Eclipse</li>
</ul>

<hr />

<h3 id="2008-web-20-and-collaboration">2008: Web 2.0 and Collaboration</h3>

<ul>
  <li>Stack Overflow launches</li>
  <li>Web 2.0 explodes; web apps and blogs proliferate</li>
  <li>Internet becomes the go-to for questions and answers</li>
  <li>Git adoption begins</li>
  <li>IRC for real-time communication</li>
  <li>Specialized roles like SysAdmin and QA emerge</li>
  <li>Still coding in emacs, vim, NetBeans, or Eclipse</li>
  <li>Windows ecosystem dominates commercial software development</li>
</ul>

<hr />

<h3 id="2012-open-source-and-big-data">2012: Open Source and Big Data</h3>

<ul>
  <li>Open source adoption everywhere; used as a distribution strategy</li>
  <li>Linux becomes the default server OS</li>
  <li>Post-financial crash, web startups boom (especially in India)</li>
  <li>Product Managers run roadmaps, often dubbed the new CTOs</li>
  <li>Transition from desktop to web application development (still in C++)</li>
  <li>Data explodes within organizations; Hadoop gains traction</li>
  <li>JavaScript becomes the default for web UI; Java Swing fades away</li>
  <li>Man pages replaced by Google searches</li>
  <li>Open source matures, but Windows dev environments still strong</li>
</ul>

<hr />

<h3 id="2014-data-engineering-and-cloud">2014: Data Engineering and Cloud</h3>

<ul>
  <li>Data engineering emerges; real-time insights in demand</li>
  <li>Cloud computing becomes ubiquitous; DevOps specialization grows</li>
  <li>CI/CD adoption increases for seamless deployments</li>
  <li>Agile and Scrum managers become standard in engineering teams</li>
  <li>Most development done on personal Linux machines</li>
</ul>

<hr />

<h3 id="2018-containers-and-specialization">2018: Containers and Specialization</h3>

<ul>
  <li>Containers (Kubernetes, Docker) become standard for deployment</li>
  <li>Mobile apps and frontend frameworks drive specialization (FE, BE, Mobile App dev roles)</li>
  <li>Full-stack developers become rare; keeping up with changes is challenging</li>
  <li>Android and Apple stores are key distribution channels; SDK knowledge essential</li>
  <li>Git is the default version control</li>
  <li>VS Code, PyCharm, and visual IDEs see massive adoption</li>
  <li>Blogs and Stack Overflow remain prime resources</li>
  <li>Open source packages like Redis, MySQL, Postgres, Linux, Kubernetes, Docker, RabbitMQ dominate</li>
</ul>

<hr />

<h3 id="2022-llms-and-the-digital-boom">2022: LLMs and the Digital Boom</h3>

<ul>
  <li>LLMs and ChatGPT debut; early versions help with quick answers and code snippets</li>
  <li>Figma becomes the standard for UI mocks and design</li>
  <li>JS frameworks consolidate around React; React Native for mobile</li>
  <li>Firebase and marketing SDK integrations are common</li>
  <li>Social login is expected everywhere</li>
  <li>Massive tech hiring post-COVID; bootcamps proliferate as everyone wants to be a developer</li>
  <li>Developers in AI and ML gravitate toward LLMs and better prompting</li>
</ul>

<hr />

<h3 id="2025-the-ai-transformation">2025: The AI Transformation</h3>

<ul>
  <li>The AI bubble peaks; AI is everywhere and transformational</li>
  <li>LLMs generate most code; many learn directly from AI-generated code</li>
  <li>Layoffs are widespread as companies shed post-COVID hiring excess</li>
  <li>Small orgs adopt AI rapidly, building features that once took teams months</li>
  <li>Companies question developer headcount and shipping speed</li>
  <li>Full-stack developer role resurges as LLMs adhere to standards via prompts</li>
  <li>New grads and laid-off developers struggle to find jobs</li>
  <li>AI agents that write, test, and deploy code are being experimented with</li>
  <li>Rise of AI agents with natural language interfaces for specialized tasks</li>
  <li>Experienced developers with deep systems, architecture, and design knowledge are in demand and leverage LLMs.</li>
</ul>

<h3 id="2030-prediction">2030: Prediction</h3>

<ul>
  <li>Everyone can build custom software using natural language, and the vast majority of software is created and hosted in the same environment.</li>
  <li>Small language models running on local computers and phones burst the 2025 AI bubble: the data center capex never generated matching revenue, and Chinese and Singaporean AI labs played a pioneering role in the shift.</li>
  <li>Any and every piece of software lives in a Play Store / App Store-like marketplace; people pay a store subscription and use as much software as they need, with multiple providers of such stores.</li>
  <li>Software development is no longer the mass employer it once was; robotics is the new sunrise industry at its peak, but nowhere near the mass employer of white-collar jobs that software was.</li>
  <li>Every non-tech company still has a software development department whose developers run, host, and maintain software, but “build” has overtaken “buy” for non-specialized software.</li>
  <li>SaaS saw massive consolidation and shutdowns, leading to heavy unemployment among those who didn’t upskill or relied on soft skills alone.</li>
  <li>Most tech teams in non-tech companies are now single digit in size.</li>
  <li>The services industry is significantly smaller than before.</li>
  <li>Certain domains still employ and pay top dollar for developers: kernel development, software for chip design, fintech in HFT and quant, database engine development, and other sub-fields where knowledge is acquired primarily on the job and corporations hold the keys to the software’s main branch.</li>
  <li>The game development stack is completely different: AR is now everywhere, and LLMs and world engines have forced a reinvention of the stack. Every player gets a different gaming experience that adjusts as they play.</li>
</ul>]]></content><author><name></name></author><category term="programming" /><category term="history" /><category term="technology" /><category term="ai" /><category term="software" /><summary type="html"><![CDATA[A timeline reflection on the evolution of programming tools, roles, and culture from 2006 to 2025. Also a small prediction of what 2030 will look like.]]></summary></entry><entry><title type="html">Circuit Breakers and Quotas for LLM Workloads</title><link href="https://originalankur.github.io/2025/07/18/implementing-circuit-breakers-and-quotas-for-llm-workloads.html" rel="alternate" type="text/html" title="Circuit Breakers and Quotas for LLM Workloads" /><published>2025-07-18T00:00:00+00:00</published><updated>2025-07-18T00:00:00+00:00</updated><id>https://originalankur.github.io/2025/07/18/implementing-circuit-breakers-and-quotas-for-llm-workloads</id><content type="html" xml:base="https://originalankur.github.io/2025/07/18/implementing-circuit-breakers-and-quotas-for-llm-workloads.html"><![CDATA[<h2 id="circuit-breakers-and-quotas-for-llm-workloads">Circuit Breakers and Quotas for LLM Workloads</h2>

<p>LLM APIs can rack up costs fast, especially in background jobs, serverless functions, or cron-based pipelines running as part of CI/CD, or when programmatic errors trigger retries. Here’s how to keep usage in check without overengineering a solution.</p>

<h4 id="set-budget-alerts-and-quotas">Set Budget Alerts and Quotas</h4>

<p>Start with basic alert guardrails:</p>

<ul>
  <li><strong>AWS</strong>: <a href="https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-create.html">Budgets &amp; Alerts</a></li>
  <li><strong>Azure</strong>: Cost Management + action groups</li>
  <li><strong>GCP</strong>: Budgets + export to BigQuery for custom tracking</li>
</ul>

<p>These won’t stop usage, but they give early warning.</p>

<h4 id="use-separate-keys-for-staging-vs-production">Use Separate Keys for Staging vs Production</h4>

<p>Create different API keys with distinct limits:</p>

<ul>
  <li><strong>Staging</strong>: Throttled, low-cap keys</li>
  <li><strong>Production</strong>: Monitored with alerts</li>
</ul>

<p>This prevents accidental overuse during testing and cleanly separates staging spend from production spend.</p>

<h4 id="create-a-billing-apis-that-can-be-called-programmatically">Create a Billing API That Can Be Called Programmatically</h4>

<p>Use cloud billing APIs to track LLM spend:</p>

<ul>
  <li><strong>AWS</strong>: <a href="https://docs.aws.amazon.com/aws-cost-management/latest/APIReference/API_GetCostAndUsage.html">Cost Explorer</a></li>
  <li><strong>Azure</strong>: <a href="https://learn.microsoft.com/en-us/rest/api/consumption/">Usage Details</a></li>
  <li><strong>GCP</strong>: <a href="https://cloud.google.com/billing/docs/how-to/export-data-bigquery">BigQuery Billing Export</a></li>
</ul>

<p>Filter by service name to get granular cost usage, then expose an internal endpoint that returns a boolean indicating whether further LLM API calls are allowed.</p>
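<p>As a sketch, the AWS variant could build a <code class="language-plaintext highlighter-rouge">GetCostAndUsage</code> request filtered to one service. The service name below is an illustrative assumption (substitute whichever service your LLM calls are billed under), and the <code class="language-plaintext highlighter-rouge">boto3</code> call is shown commented out:</p>

```python
def cost_explorer_params(start_date, end_date, service="Amazon Bedrock"):
    """Parameters for AWS Cost Explorer GetCostAndUsage, filtered to one service.

    The service name is an assumption; substitute the service your
    LLM spend actually shows up under in Cost Explorer.
    """
    return {
        "TimePeriod": {"Start": start_date, "End": end_date},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": [service]}},
    }

# With boto3 available, this would be used roughly as:
#   ce = boto3.client("ce")
#   resp = ce.get_cost_and_usage(**cost_explorer_params("2025-07-01", "2025-07-18"))
```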

<h4 id="add-circuit-breakers-in-code-by-calling-above-api">Add Circuit Breakers in Code by Calling the Above API</h4>

<p>Check usage every N requests, or after every N minutes have elapsed, depending on whether your worker is stateless or stateful. Stop making LLM calls if the API returns false.</p>

<p>LLM calls often run outside of web requests. Add usage checks in:</p>

<ul>
  <li><strong>Cold start</strong>: Exit if the API returns false, i.e. usage exceeds the threshold</li>
  <li><strong>Every N calls</strong>: Check the API before triggering the LLM</li>
</ul>
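<p>A minimal sketch of both checks, assuming a hypothetical <code class="language-plaintext highlighter-rouge">fetch_spend</code> callable that wraps one of the billing APIs above and returns month-to-date spend in dollars:</p>

```python
class LlmBudgetBreaker:
    """Circuit breaker that re-checks spend on the first and every Nth call.

    `fetch_spend` is a hypothetical callable wrapping a cloud billing API
    (e.g. Cost Explorer filtered by service name). Once tripped, the
    breaker stays open until the process restarts.
    """

    def __init__(self, fetch_spend, budget_usd, check_every=50):
        self.fetch_spend = fetch_spend
        self.budget_usd = budget_usd
        self.check_every = check_every
        self.calls = 0
        self.tripped = False

    def allow_call(self):
        # Hit the billing API on the first call and every Nth call after.
        if not self.tripped and self.calls % self.check_every == 0:
            self.tripped = self.fetch_spend() >= self.budget_usd
        self.calls += 1
        return not self.tripped


# Usage: stop calling the LLM once month-to-date spend crosses $100.
breaker = LlmBudgetBreaker(fetch_spend=lambda: 42.0, budget_usd=100.0)
if breaker.allow_call():
    pass  # safe to make the LLM API call here
```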

<p>With this small amount of effort, you can avoid budget surprises while keeping your LLM-powered systems running smoothly.</p>]]></content><author><name></name></author><category term="llms" /><category term="technology" /><category term="finops" /><category term="cost" /><summary type="html"><![CDATA[How to keep LLM API usage in check without overengineering a solution.]]></summary></entry><entry><title type="html">Simple hacks for reducing audio transcription cost.</title><link href="https://originalankur.github.io/2025/06/27/simple-hacks-for-reducing-audio-transcription-cost.html" rel="alternate" type="text/html" title="Simple hacks for reducing audio transcription cost." /><published>2025-06-27T00:00:00+00:00</published><updated>2025-06-27T00:00:00+00:00</updated><id>https://originalankur.github.io/2025/06/27/simple-hacks-for-reducing-audio-transcription-cost</id><content type="html" xml:base="https://originalankur.github.io/2025/06/27/simple-hacks-for-reducing-audio-transcription-cost.html"><![CDATA[<p>I recently concluded a subcontracted project for one of the top real estate builders in the Middle East.</p>

<h2 id="requirement">Requirement</h2>
<p>With the advent of viral real estate listings on social media comes a barrage of call center inquiries about the projects and their details. The UAE, in particular, receives calls from across the globe.</p>

<p>The client was already using an AI SaaS product that would take call recordings and extract:</p>

<ul>
  <li>Entities:
    <ul>
      <li>Name</li>
      <li>Nationality</li>
      <li>Property, builder, or area of interest</li>
    </ul>
  </li>
  <li>Budget</li>
  <li>Purchase timeline</li>
</ul>

<p>The system also tagged phone numbers and other metadata, automatically raising CRM tickets. These tickets were routed to agents over WhatsApp within seconds, along with a dossier containing extracted details, a transcript, and a playable audio file link. This enabled agents to exercise judgment and prioritize callbacks effectively.</p>

<p><strong>The pricing with the SaaS provider was a fixed cost of 7 AED per call. My task was to replicate this functionality at a lower cost, while ensuring the client’s in-house tech team could take over and fully own the resulting IP.</strong></p>

<h2 id="audio-processing">Audio Processing</h2>

<p>Using <code class="language-plaintext highlighter-rouge">ffmpeg</code>, we processed each audio file to:</p>

<ul>
  <li>Standardize the format to FLAC</li>
  <li>Reduce the bitrate to 128 kbps</li>
  <li>Speed up the playback to 1.25x–1.5x</li>
</ul>
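<p>The steps above can be sketched as a small Python wrapper around <code class="language-plaintext highlighter-rouge">ffmpeg</code>; the filter values here are illustrative, not the exact command used in the project:</p>

```python
import subprocess


def build_ffmpeg_cmd(src, dst, speed=1.25):
    """Build an ffmpeg command that converts to FLAC and speeds up playback.

    The atempo filter handles speed factors between 0.5x and 2.0x in a
    single pass; the sample rate and speed here are illustrative.
    """
    return [
        "ffmpeg", "-y", "-i", src,
        "-filter:a", f"atempo={speed}",  # 1.25x-1.5x playback speed
        "-ar", "16000",                  # downsampling also shrinks files
        dst,                             # .flac extension selects the FLAC encoder
    ]


def optimize(src, dst):
    # Raises CalledProcessError if ffmpeg fails on a corrupt recording.
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
```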

<p>Running <code class="language-plaintext highlighter-rouge">ffprobe</code> on the optimized files showed that we reduced audio file sizes by approximately 60% on average. This resulted in downstream savings in both storage and processing costs.</p>

<h2 id="audio-to-text-conversion">Audio-to-Text Conversion</h2>

<p>After experimenting with optimized versions of OpenAI Whisper and various Hugging Face models, I ultimately chose the reliable Google Speech-to-Text API. One key reason was the need to handle multi-lingual conversations — many South Asian callers switched between their native language and English, e.g. Hindi/Urdu -&gt; English.</p>

<p>Google’s API allowed us to specify a primary language along with up to four alternative languages.</p>

<p>Another useful trick was hosting audio files in Google Cloud Storage. This appeared to speed up transcription time — my hunch is that Google avoids copying files already hosted within their cloud infrastructure (as opposed to fetching from S3 or a local server).</p>

<p>I handed over a Dockerized version of the pipeline, micro-batching audio files in 3-minute windows, and ran a worker pool of 20 concurrent workers. <code class="language-plaintext highlighter-rouge">ffmpeg</code> was the most CPU-intensive part of the process. The client now runs the entire pipeline within Kubernetes, where the number of messages in Redis streams dynamically determines the scaling of the worker pool.</p>
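<p>The 3-minute micro-batching can be sketched independently of Redis as a grouping of timestamped messages into fixed windows; in the real pipeline the payloads come from a Redis stream, but the grouping logic stands alone:</p>

```python
from collections import defaultdict


def window_batches(messages, window_s=180):
    """Group (epoch_seconds, payload) pairs into fixed 3-minute windows.

    Returns one list of payloads per non-empty window, in time order,
    ready to be handed to the worker pool.
    """
    batches = defaultdict(list)
    for ts, payload in messages:
        batches[int(ts) // window_s].append(payload)
    return [batches[k] for k in sorted(batches)]
```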

<p>Their cost is now the reserved instance plus the Google transcription cost plus Gemini 2.5 Pro usage. Spread across the number of calls, they saved approximately 92% in cost in June alone.</p>

<p>Tech stack:</p>
<ul>
  <li>Redis Streams for Pub/Sub</li>
  <li>Python
    <ul>
      <li>DSPy for interaction with LLMs</li>
    </ul>
  </li>
  <li>OpenRouter</li>
  <li>ffmpeg</li>
  <li>Google
    <ul>
      <li>Cloud Storage</li>
      <li>Gemini 2.5 Pro</li>
      <li>Speech-to-text</li>
    </ul>
  </li>
</ul>]]></content><author><name></name></author><category term="ffmpeg" /><category term="gen-ai" /><summary type="html"><![CDATA[Learning from processing real estate call center data]]></summary></entry><entry><title type="html">Reboot</title><link href="https://originalankur.github.io/2025/05/13/reboot.html" rel="alternate" type="text/html" title="Reboot" /><published>2025-05-13T00:00:00+00:00</published><updated>2025-05-13T00:00:00+00:00</updated><id>https://originalankur.github.io/2025/05/13/reboot</id><content type="html" xml:base="https://originalankur.github.io/2025/05/13/reboot.html"><![CDATA[<p>My first interaction with a computer was at Saint Paul’s School in Jodhpur, Rajasthan, on a 386 with a monochrome monitor. I was in 6th standard, learning MS-DOS, BASIC, and WordStar. The real fascination began during the holidays after 10th standard, diving into dBase III Plus, VBASIC, Turbo C, and using telnet to access the internet.</p>

<p>A few months ago, at our 25th school reunion, I was reminded how everyone remembered my early obsession with computers. Some even recalled the dBase III Plus program I built for a school project. That passion never really left me—and perhaps that’s why most of my career has never felt like “work.”</p>

<p>Even before formal education, I was curious about how computers worked. That curiosity finally found structure through courses in Computer Architecture and Operating Systems during my master’s.</p>

<p>It’s been 18 years since I started my journey in IT—from Software Engineer to Manager of Managers. I’ve been fortunate to work with some brilliant people and gain experience across both large enterprises and startups.</p>

<p>Life has kept me busy, but I’m currently on a career break to support my wife as she recovers from health issues. It feels like the right time to restart this blog—both to explore technical interests and reconnect with my roots in coding and writing.</p>]]></content><author><name></name></author><category term="technology" /><category term="personal" /><summary type="html"><![CDATA[A personal reflection on my journey in technology and the decision to restart my blog.]]></summary></entry></feed>