This interview is with Gourav Singla, Software Engineer.
For readers on Featured, could you briefly introduce yourself and share how API design, cloud solutions, microservices, data structures, and software testing show up in your work as a Software Engineer?
I’m Gourav Singla, a software engineer focused on building reliable, scalable software systems across enterprise applications and AI-powered tools. My work includes backend development, system design, cloud deployment, database optimization, and applied AI systems.
API design appears in my work whenever different parts of a system need to communicate clearly and reliably. I focus on building APIs that are simple, consistent, secure, and easy for other developers or frontend applications to use.
Cloud solutions are important because many modern applications need to scale, handle traffic reliably, and remain maintainable. I have worked with cloud platforms such as AWS and Azure for deployment, storage, CI/CD, and production operations.
Microservices and modular architecture matter when systems grow beyond a single large application. I use these approaches to separate responsibilities, improve maintainability, and make systems easier to test, deploy, and troubleshoot.
Data structures and database design are important in performance-sensitive work, especially when handling large datasets, user-generated content, or application workflows. Choosing the right structure or query pattern can directly affect speed and reliability.
Software testing ties everything together. Whether working on enterprise systems or AI-powered applications, I see testing as essential for reducing regressions, validating edge cases, and ensuring software behaves correctly in real-world conditions.
Outside of my professional work, I also created SlideMaker.app, a free AI-powered presentation generation tool built as a personal research and learning project. It has been used by over 100,000 people across 177+ countries, which has given me practical experience in building and improving AI systems used at real-world scale.
How did you arrive at your current focus—especially with your 0-to-100k-user AI side project—and what experiences most shaped your engineering approach?
My current focus came from combining two areas I kept returning to: building reliable software systems and making AI useful in practical workflows.
Professionally, I have worked on enterprise and public-sector software systems where reliability, maintainability, and clear architecture matter a lot. That shaped how I think as an engineer. I try to build systems that are not only functional but also stable, testable, and easy to improve over time.
SlideMaker.app started as a personal research and learning project to explore how large language models could help people turn rough ideas into structured presentation content. I did not expect it to reach large-scale usage. As more people started using it, the project became a real-world learning environment for AI system design, prompt quality, scaling, user experience, multilingual support, and failure handling.
Reaching 100,000+ users taught me that building AI products is not only about the model. Much of the engineering work happens around the model: validation, retries, data handling, interface design, export quality, performance, and edge cases. Small assumptions can create big issues at scale.
The biggest experience that shaped my approach was seeing how real users interact with AI differently than expected. People do not just want generated text. They want structure, clarity, speed, and something they can actually use. That pushed me to focus less on AI as a novelty and more on AI as part of a dependable software system.
Starting with APIs, what is one concrete practice you rely on to design interfaces that stay stable for clients even as your backends evolve?
One practice I rely on is designing APIs around stable contracts rather than internal database or service structures.
Before building an endpoint, I define what the client actually needs: the request shape, response fields, error format, and expected behavior. I keep that contract independent of how the backend is implemented, because backend logic, tables, and services will change over time.
For example, I avoid exposing internal model names or raw database structures directly through the API. Instead, I return clean response objects that match the user workflow. If the backend changes later, the client can keep using the same API without breaking.
I also prefer versioning when a breaking change is unavoidable. Small changes can be handled with optional fields, but if the meaning of a response changes, it should usually become a new version instead of silently changing behavior.
The goal is simple: clients should depend on the API contract, not the backend implementation.
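To make the contract idea concrete, here is a minimal sketch in Python. The names `DeckRow`, `DeckResponseV1`, and `to_response_v1`, and all field names, are hypothetical and chosen only for illustration; they are not taken from any real codebase. The point is that the client-facing shape is declared separately from the internal record and produced by one mapping function.

```python
from dataclasses import dataclass, asdict


# Hypothetical internal record. Field names mirror a storage layer
# and are expected to change over time.
@dataclass
class DeckRow:
    pk: int
    usr_fk: int
    title_txt: str
    slide_cnt: int
    internal_state: str  # e.g. "S3_UPLOADED"; not meaningful to clients


# Hypothetical public contract: only what the client workflow needs.
@dataclass(frozen=True)
class DeckResponseV1:
    id: str
    title: str
    slide_count: int
    status: str  # "ready" or "processing"


def to_response_v1(row: DeckRow) -> DeckResponseV1:
    """Single place where internal structure is translated to the contract."""
    return DeckResponseV1(
        id=str(row.pk),
        title=row.title_txt,
        slide_count=row.slide_cnt,
        status="ready" if row.internal_state == "S3_UPLOADED" else "processing",
    )


row = DeckRow(pk=42, usr_fk=7, title_txt="Q3 Review", slide_cnt=12,
              internal_state="S3_UPLOADED")
payload = asdict(to_response_v1(row))
print(payload)
```

If a column is renamed or a table is split later, only `to_response_v1` has to change. A response whose meaning changes would get a new type (for example a hypothetical `DeckResponseV2`) rather than a silent edit to V1.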
On data structures, what is one choice that measurably improved p95/p99 latency or cloud cost in your production pipeline?
One data structure choice that had a measurable impact was moving from repeated list scans to hash-based lookup maps in the slide generation pipeline.
In SlideMaker.app, each generated deck goes through multiple processing steps: content generation, slide type mapping, theme selection, image matching, export formatting, and validation. Early on, some of these steps repeatedly scanned arrays of slide objects, theme rules, and asset metadata to find matching elements. That was fine at small scale, but as decks grew larger and traffic increased, those repeated linear scans started adding avoidable latency.
I replaced several repeated lookups with dictionaries keyed by stable identifiers such as slide ID, layout type, theme ID, and asset ID. Each lookup then takes effectively constant time on average, instead of a linear walk through a list every time a match is needed.
The biggest improvement came during export preparation, where the system needs to match each slide with layout rules, content blocks, and asset metadata. After switching to lookup maps, export preparation became noticeably faster and reduced redundant processing.
The measurable result was a reduction in generation and export overhead, especially for longer decks. It also helped reduce unnecessary compute time, which matters because AI workflows already have expensive API and infrastructure components.
The lesson was simple: in AI pipelines, latency doesn’t only come from the model. A lot of p95 and p99 delay comes from small inefficiencies around the model, especially repeated parsing, filtering, and lookup logic. Keeping those paths efficient can improve both user experience and cloud cost.
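The difference between the two lookup styles can be sketched as follows. This is an illustrative Python example, not the actual pipeline code: `Slide`, `LayoutRule`, and both matching functions are hypothetical, and the sketch assumes each layout type has at most one rule.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    slide_id: str
    layout_type: str


@dataclass
class LayoutRule:
    layout_type: str
    max_bullets: int


def match_by_scan(slides, rules):
    """Before: scan the rule list once per slide, O(slides * rules)."""
    matched = []
    for slide in slides:
        rule = next((r for r in rules if r.layout_type == slide.layout_type), None)
        matched.append((slide.slide_id, rule))
    return matched


def match_by_map(slides, rules):
    """After: build the index once, then do one dict lookup per slide."""
    # Assumes layout_type is unique across rules.
    rules_by_layout = {r.layout_type: r for r in rules}
    return [(s.slide_id, rules_by_layout.get(s.layout_type)) for s in slides]


slides = [Slide("s1", "title"), Slide("s2", "bullets"), Slide("s3", "unknown")]
rules = [LayoutRule("title", 0), LayoutRule("bullets", 5)]

# Both approaches return the same matches; only the cost differs.
assert match_by_scan(slides, rules) == match_by_map(slides, rules)
```

The map version pays a one-time cost to build the index, so the benefit shows up when the same collection is queried many times, as in a multi-step pipeline over a long deck.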
Building from API stability to service design, what single heuristic do you rely on most when deciding microservice boundaries in a new system?
My main heuristic is: split a service only when the responsibility changes independently from the rest of the system.
I do not start with microservices just because the system may grow. Early over-splitting often creates more complexity than value. I first look at whether a part of the system has its own lifecycle, scaling pattern, failure mode, data ownership, or security boundary.
For example, in an AI content generation workflow, user management, generation jobs, file export, image retrieval, billing, and analytics can all look like separate services. But I would not split all of them immediately. I would separate the parts that need independent scaling or isolation first, such as long-running generation jobs or export processing, because those can fail or spike without bringing down the core user experience.
The question I keep asking is:
“If this component changes, scales, or fails differently, should it be isolated?”
If the answer is yes, it may deserve its own service. If not, it may be better as a module inside a simpler system. This keeps service boundaries tied to real operational needs instead of architecture style.
On the cloud side, based on your “infrastructure-as-code for everything” practice, what is the smallest habit you’d urge a solo engineer to adopt this week to reduce risk and speed experiments?
The smallest habit I would recommend is to define every new cloud resource in version-controlled infrastructure code before creating it manually.
Even for a solo engineer, this habit reduces risk quickly. It creates a clear record of what changed, when it changed, and why. It also makes experiments easier to repeat or roll back.
For example, if I need to test a new storage bucket, queue, environment variable, database setting, or background worker, I try to define it in code instead of clicking through the cloud console. The setup may take a little longer the first time, but it prevents hidden configuration drift later.
The real benefit is not just automation; it is confidence. When your infrastructure is defined in code, you can experiment faster because you know you can recreate the environment, compare changes, and undo mistakes without relying on memory.
For a solo engineer, the first step does not need to be complex. Start with one small rule:
- No important cloud resource should exist only in the console.
Shifting to testing, what is one testing or observability practice that repeatedly caught issues your aggregate dashboards missed in production?
One practice that repeatedly caught issues missed by aggregate dashboards was tracking failures by segment rather than relying only on overall success rates.
In AI workflows, aggregate metrics can look healthy while one user group is having a poor experience. I saw this clearly with SlideMaker.app, where overall generation metrics looked acceptable but certain language groups were failing more often because validation logic handled non-English content differently.
The fix was to break observability down by dimensions like language, export type, deck length, input type, and generation step. Instead of only asking, “What is the overall success rate?” I started asking, “Which specific group is failing more than expected?”
That helped surface issues hidden inside averages, especially multilingual edge cases, export failures, and long-deck processing problems.
My main rule now is simple:
- If users experience the product in different ways, the observability should reflect those differences.
Aggregate dashboards are useful for trends, but segmented monitoring is what catches silent production issues before they become obvious.
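A small Python sketch shows how an average can hide a failing segment. The event format, the `success_rate_by` helper, and the numbers are all invented for illustration; they are not real SlideMaker.app data.

```python
from collections import defaultdict


def success_rate_by(events, dimension):
    """Return the success rate for each value of one dimension."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for event in events:
        key = event[dimension]
        totals[key] += 1
        if event["ok"]:
            successes[key] += 1
    return {key: successes[key] / totals[key] for key in totals}


# Synthetic events: a large healthy segment and a small failing one.
events = (
    [{"language": "en", "ok": True}] * 95
    + [{"language": "en", "ok": False}] * 1
    + [{"language": "hi", "ok": True}] * 2
    + [{"language": "hi", "ok": False}] * 2
)

overall = sum(e["ok"] for e in events) / len(events)
by_language = success_rate_by(events, "language")

print(f"overall: {overall:.2f}")
for language, rate in sorted(by_language.items()):
    print(f"{language}: {rate:.2f}")
```

In this made-up data the overall rate is 97 out of 100, while the smaller segment succeeds only half the time. The same helper can be called with other dimensions, such as export type or deck length, as long as the events carry those fields.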
For resilience across vendors and regions, what early design decision has given you the most optionality without adding heavy complexity?
The early design decision that gave me the most optionality was keeping vendor-specific logic behind small internal adapter layers.
In AI systems, it is tempting to call a model provider, storage service, or image API directly from many parts of the application. That works at first, but it quickly creates lock-in. If pricing changes, rate limits are hit, a region has latency issues, or a provider has downtime, switching becomes painful.
For SlideMaker.app, I try to keep external services behind simple interfaces. For example, generation, image retrieval, storage, and export-related services should be called through internal functions or adapters instead of being hardcoded everywhere.
That gives flexibility without turning the system into an over-engineered abstraction. The goal is not to design for every possible vendor; the goal is to make sure one provider can be replaced or bypassed without rewriting the whole pipeline.
The practical rule I follow is:
- Keep external dependencies at the edge of the system, not scattered through the core logic.
That small habit makes it easier to test alternatives, control costs, handle failures, and move faster when infrastructure or vendor conditions change.
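As a rough sketch of the adapter idea in Python: the core logic depends on a small interface, and each vendor sits behind its own adapter. Everything here is hypothetical (`TextGenerator`, the two adapter classes, `generate_outline`), and the adapters return canned strings where a real one would call a vendor SDK.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """The only interface the core pipeline knows about."""

    def generate(self, prompt: str) -> str: ...


class PrimaryProviderAdapter:
    """Hypothetical adapter; a real one would wrap a vendor SDK call."""

    def __init__(self, available: bool = True):
        self.available = available

    def generate(self, prompt: str) -> str:
        if not self.available:
            raise ConnectionError("primary provider unavailable")
        return f"[primary] outline for: {prompt}"


class FallbackProviderAdapter:
    """Hypothetical second vendor behind the same interface."""

    def generate(self, prompt: str) -> str:
        return f"[fallback] outline for: {prompt}"


def generate_outline(prompt: str, providers: list[TextGenerator]) -> str:
    """Core logic: try providers in order, with no vendor-specific code."""
    last_error = None
    for provider in providers:
        try:
            return provider.generate(prompt)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


providers = [PrimaryProviderAdapter(available=False), FallbackProviderAdapter()]
print(generate_outline("quarterly review", providers))
```

Swapping or bypassing a vendor then means changing the `providers` list, not the pipeline. In tests, a fake adapter that returns fixed text can stand in for any provider.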
Finally, to keep a fast-growing microservice healthy day to day, what single metric would you put at the top of the team’s dashboard?
The single metric I would put at the top of the dashboard is end-to-end successful completion rate.
For a fast-growing microservice, isolated metrics like CPU, memory, queue depth, or API latency are useful, but they do not always show whether the user’s workflow actually succeeded. A service can look healthy internally while users are still getting failed exports, incomplete results, or retry loops.
For example, in an AI generation pipeline, I care most about whether a user request successfully moves from input to final usable output. That includes:
- generation
- validation
- storage
- export
- delivery
If any step fails, the user experience fails, even if each individual service appears mostly healthy.
I would track this metric overall, but also break it down by key segments such as:
- region
- language
- input type
- output format
- request size
My rule is:
- The top metric should reflect the user’s completed outcome, not just the service’s internal health.
Once that number drops, the team can drill into latency, errors, queues, logs, and infrastructure metrics to find the cause.
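One way to compute such a metric is sketched below in Python. The request format, the `STEPS` tuple, and the helper names are assumptions made for this example, and the sample data is invented. A request counts as complete only if every step succeeded.

```python
# Pipeline steps, in order, taken from the list in the answer above.
STEPS = ("generation", "validation", "storage", "export", "delivery")


def completed(request):
    """A request is complete only if every step is recorded as successful."""
    return all(request["steps"].get(step) is True for step in STEPS)


def completion_rate(requests):
    """End-to-end completion rate, or None when there is no traffic."""
    if not requests:
        return None
    return sum(completed(r) for r in requests) / len(requests)


def first_failed_step(request):
    """Drill-down helper: the first step that did not succeed, if any."""
    for step in STEPS:
        if request["steps"].get(step) is not True:
            return step
    return None


all_ok = {step: True for step in STEPS}
requests = [
    {"id": "r1", "steps": dict(all_ok)},
    {"id": "r2", "steps": dict(all_ok)},
    {"id": "r3", "steps": {**all_ok, "export": False}},
    # r4 never reached delivery, so the step is simply missing.
    {"id": "r4", "steps": {k: v for k, v in all_ok.items() if k != "delivery"}},
]

print(completion_rate(requests))
print([first_failed_step(r) for r in requests])
```

In this invented sample, each individual step succeeds for at least three of four requests, yet only half of the requests complete end to end. A missing step is treated as a failure, which keeps stalled requests from being counted as healthy.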
Thanks for sharing your knowledge and expertise. Is there anything else you'd like to add?
Thank you for the opportunity to share my perspective.
The biggest lesson I have learned is that good software engineering is not only about choosing the right architecture or tools. It is about building systems that remain understandable, observable, and adaptable as real users interact with them in unexpected ways.
This is especially true for AI systems. The model is only one part of the product. Reliability often depends on the surrounding engineering: validation, retries, monitoring, data handling, export quality, and user experience. Small assumptions in these layers can create large problems at scale.
For me, the goal is to build AI-powered systems that are not just impressive in demos, but useful, reliable, and maintainable in real-world workflows.