Test Your API's Performance by Simulating Real-World Traffic with API Stress Lab
Learn how to run performance tests against your APIs to observe response times, throughput, and error rates under load. Find breaking points before your users do.
When it comes to API testing, you may have asked yourself some or all of these questions:
How will my APIs perform in real-world situations?
How will the response times change when multiple users are sending requests at the same time?
Will my users see acceptable response times when my system is under load, or will they see errors?
How can I identify performance bottlenecks that may become major production issues?
If so, you've likely realized that as you scale, your API's performance impacts the success of your products and business.
To help teams understand how their APIs behave under load, API Stress Lab provides performance testing capabilities that enable developers to run tests against their APIs to observe response times, throughput, and error rates.
API Performance Testing: What and Why?
In today's fast-paced digital world, providing a great user experience is essential to building a sustainable business and staying ahead of competitors. APIs have increasingly become the backbone of modern businesses, and the quality and reliability of these APIs impact how customers experience a product. To ensure user success, you need to know that your APIs meet the expected functionality (through functional testing) and also that they can handle the expected traffic volume (through performance testing).
API performance testing involves simulating user traffic patterns and observing your API's behavior under load. It is conducted to evaluate how well an API meets performance expectations for response time, throughput, and availability under the simulated load.
API performance testing can help you:
- Ensure your API can handle the expected traffic patterns and understand how it responds as load increases (load being the number of parallel users hitting your APIs at the same time).
- Optimize and improve the API's performance to ensure a better user experience.
- Identify bottlenecks, latency, and failures, and determine the scalability of the system.
Introducing API Performance Testing with API Stress Lab
API Stress Lab now has built-in capabilities for testing your API's performance with your existing OpenAPI specifications. There are three core functionalities we will discuss in this post:
- Upload your OpenAPI specification to automatically generate realistic test scenarios for all your endpoints.
- Simulate load by configuring virtual users and load profiles to test different traffic patterns.
- Visualize key performance metrics in real time, including response time, throughput (requests per second), error rates, and resource utilization.
How to Use API Stress Lab for API Performance Testing
You can use API Stress Lab to set up a performance test by following these steps:
Step 1: Upload Your OpenAPI Specification
Navigate to API Stress Lab and upload your OpenAPI (Swagger) specification file:
If you don't have an OpenAPI spec, you can generate one from your existing API using tools specific to your framework:
- FastAPI (Python): Built-in - available at /openapi.json
- NestJS (Node.js): Built-in OpenAPI support
- Express (Node.js): Use swagger-jsdoc (see the sketch after this list)
- Spring Boot (Java): Use springdoc-openapi
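For example, if you're on Express, a minimal sketch of exposing a spec with swagger-jsdoc could look like the following. The route path, file glob, and API metadata are placeholders; swagger-jsdoc builds the spec from @openapi JSDoc comments in your route files.

```typescript
import express from "express";
import swaggerJsdoc from "swagger-jsdoc";

const app = express();

// Build an OpenAPI document from JSDoc annotations in your route files
const openapiSpec = swaggerJsdoc({
  definition: {
    openapi: "3.0.0",
    info: { title: "My API", version: "1.0.0" }, // placeholder metadata
  },
  apis: ["./src/routes/*.ts"], // adjust to wherever your annotated routes live
});

// Serve the spec so you can download it and upload it for testing
app.get("/openapi.json", (_req, res) => res.json(openapiSpec));

app.listen(3000);
```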
Once uploaded, API Stress Lab will automatically parse all your endpoints, request/response schemas, and data types to generate realistic test scenarios.
Step 2: Select Test Configuration
After uploading your specification, configure your performance test settings:
Choose Your Endpoints:
- Test all endpoints (recommended for comprehensive testing)
- Select specific endpoints (e.g., only critical paths like login, checkout)
- Test individual endpoints in isolation
Configure Load Settings:
- Virtual users (VUs): The maximum number of parallel users you want to simulate
- Test duration: The amount of time (in minutes) for which you want to run the test
- Load profile: Choose between Fixed, Ramp up, or Spike load patterns
Step 3: Configure the Load Profile
You can now simulate user traffic patterns by configuring the load conditions for your tests. You will be able to specify the following inputs:
Virtual Users (VUs)
The maximum number of parallel users you want to simulate. Each virtual user executes your API endpoints continuously, creating realistic concurrent load.
Example: If you expect 200 concurrent users during peak hours, set VUs to 300 (1.5x safety margin).
Test Duration
The amount of time (in minutes) for which you want to run the test.
Recommended durations:
- Smoke test: 1-2 minutes (quick health check)
- Load test: 15-30 minutes (sustained load)
- Stress test: 10-20 minutes (finding breaking point)
Load Profile
The shape and intensity of the load over the test's duration. We currently support three load profiles:
"Fixed" load profile: This will apply a fixed number of virtual users throughout the test duration.
Example:
Virtual Users: 100
Duration: 15 minutes
Pattern: Maintains 100 VUs for entire duration
"Ramp up" load profile: This will slowly increase the number of virtual users during the "ramp up duration" to reach the specified load. Once reached, this number of virtual users will be maintained for the remaining duration.
Example:
Start: 10 VUs
End: 500 VUs
Ramp Duration: 10 minutes
Pattern: Gradually increases from 10 → 500 over 10 minutes
"Spike" load profile: This creates a sudden traffic burst to test how your API handles unexpected surges.
Example:
Baseline: 10 VUs
Spike to: 500 VUs
Hold: 2 minutes
Pattern: 10 VUs → instant jump to 500 VUs → hold 2 min → back to 10 VUs
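To make the three profiles concrete, here is a small illustrative sketch (not API Stress Lab's internals) of how the number of active virtual users might be computed at any minute of the test for each profile. The spikeStartMinute field is an assumption added for the example.

```typescript
type Profile =
  | { kind: "fixed"; vus: number }
  | { kind: "ramp-up"; startVus: number; endVus: number; rampMinutes: number }
  | { kind: "spike"; baselineVus: number; spikeVus: number; spikeStartMinute: number; holdMinutes: number };

// Active virtual users at `minute` into the test, for a given profile
function activeVus(profile: Profile, minute: number): number {
  switch (profile.kind) {
    case "fixed":
      // Same number of VUs for the whole duration
      return profile.vus;
    case "ramp-up": {
      // Linear increase until the ramp duration is reached, then hold
      const progress = Math.min(minute / profile.rampMinutes, 1);
      return Math.round(profile.startVus + (profile.endVus - profile.startVus) * progress);
    }
    case "spike": {
      // Baseline load, with a sudden burst held for a fixed window
      const inSpike =
        minute >= profile.spikeStartMinute &&
        minute < profile.spikeStartMinute + profile.holdMinutes;
      return inSpike ? profile.spikeVus : profile.baselineVus;
    }
  }
}

// Ramp up example from above: 10 → 500 VUs over 10 minutes
activeVus({ kind: "ramp-up", startVus: 10, endVus: 500, rampMinutes: 10 }, 5); // ≈ 255
```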
Step 4: Run the Test and Observe Real-Time Metrics
Click "Run Test" to start your performance test. As soon as the test begins, you will be able to visualize and observe the performance of your APIs in real time.
API Stress Lab will show the following metrics:
Average Response Time: The average of the response times across all requests sent by the parallel virtual users.
Requests per Second: The throughput metric shows how many requests your API serves per second. Each virtual user continuously hits your endpoints, and depending on response times, a single virtual user can send several requests per second.
Error Rate: The fraction of requests that receive a non-2XX response or fail with a non-HTTP error (for example, a timeout or connection failure) while being sent.
P95/P99 Latency: The response time that 95% or 99% of requests are faster than. This is more meaningful than the average because it reveals the experience of your slowest users.
Note that all of the above metrics are cumulative across all your selected requests. API Stress Lab aggregates your metrics in short-term intervals, helping you visualize the changes to these metrics over time.
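If you want to reproduce these numbers from your own logs, here is a minimal sketch of how P95/P99 latency and error rate are commonly computed. The RequestResult shape and the nearest-rank percentile method are assumptions for illustration.

```typescript
interface RequestResult {
  durationMs: number;
  status: number; // HTTP status, or 0 for non-HTTP failures such as timeouts
}

// Response time that `p` percent of requests were faster than (nearest-rank method)
function percentile(results: RequestResult[], p: number): number {
  const sorted = results.map((r) => r.durationMs).sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Fraction of requests with a non-2XX response or a non-HTTP failure
function errorRate(results: RequestResult[]): number {
  const errors = results.filter((r) => r.status < 200 || r.status >= 300).length;
  return errors / results.length;
}

// Example: percentile(results, 95), percentile(results, 99), errorRate(results)
```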
How Virtual Users Help You Simulate Load on Your API Workflows
Virtual users are parallel users that will hit your APIs at the same time. Each virtual user executes the selected sequence of requests from your OpenAPI specification in order. Multiple virtual users will run these sequences in parallel, creating realistic load for your API workflows.
For example: A typical e-commerce workflow might be:
- Browse products (GET /products)
- View product details (GET /products/:id)
- Add to cart (POST /cart/items)
- Checkout (POST /orders)
Each virtual user will run through this workflow continuously. If you set 50 virtual users, you'll have 50 parallel users going through this flow simultaneously, just like real traffic.
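Conceptually, each virtual user behaves like a loop over that workflow. Here is a simplified sketch, not the tool's actual runner; the base URL and response shapes are placeholders.

```typescript
const BASE_URL = "https://api.example.com"; // placeholder base URL

// One virtual user: repeat the e-commerce workflow until the test duration elapses
async function virtualUser(endAt: number): Promise<void> {
  while (Date.now() < endAt) {
    const products = await fetch(`${BASE_URL}/products`).then((r) => r.json());
    await fetch(`${BASE_URL}/products/${products[0].id}`);
    await fetch(`${BASE_URL}/cart/items`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ productId: products[0].id }),
    });
    await fetch(`${BASE_URL}/orders`, { method: "POST" });
  }
}

// 50 virtual users running the same workflow in parallel for 15 minutes
const endAt = Date.now() + 15 * 60 * 1000;
await Promise.all(Array.from({ length: 50 }, () => virtualUser(endAt)));
```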
Important: Since virtual users simulate realistic concurrent load, the number of users you can simulate depends on your plan. Check your plan details and current usage in your dashboard.
The ultimate objective of API performance testing is to ensure that your end users get a good experience when consuming your shipped APIs. Therefore, the type of traffic you choose to simulate during testing will depend on the kind of situations you expect your APIs to handle in the production environment.
Visualizing the Metrics of a Performance Test
As soon as the performance test starts, you will be able to visualize and observe the performance of your APIs in real time through an intuitive dashboard.
Response Time Trend
Watch for these patterns:
- Flat line: Good - your API is handling load consistently
- Gradual increase: Warning - resource exhaustion approaching
- Sudden spike: Critical - you've hit a limit (connection pool, memory, etc.)
Throughput (Requests Per Second)
The throughput metric helps you understand your API's capacity:
- Stable throughput: Healthy - API serving requests consistently
- Decreasing throughput: Problem - performance degrading despite constant load
- Throughput matches the expected value: Good - verify it against a back-of-the-envelope calculation like the one below
Example calculation:
10 virtual users × 200ms average response time
= Each VU makes 5 requests/second
= Expected throughput: 50 requests/second
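The same arithmetic as a tiny helper, assuming each virtual user sends requests back-to-back with no think time:

```typescript
// Expected requests/second when each VU issues requests back-to-back
function expectedThroughput(virtualUsers: number, avgResponseMs: number): number {
  return virtualUsers * (1000 / avgResponseMs);
}

expectedThroughput(10, 200); // 50 requests/second
```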
Error Rate Over Time
Monitor when errors occur and what causes them:
- 0-0.1% errors: Excellent
- 0.1-1% errors: Acceptable for most use cases
- >1% errors: Critical - needs investigation
Common error patterns:
- Database connection pool exhausted: "No connections available"
- Timeout errors: API responding too slowly
- 500 errors: Application crashes under load
- 429 errors: Rate limiting engaged (good if intentional)
Resource Utilization
API Stress Lab also tracks your API's resource consumption:
- CPU usage: Should stay under 70% at peak load
- Memory usage: Should remain stable (not growing unbounded)
- Database connections: Should not max out your pool
- Network bandwidth: Check for saturation
Drill Down into Your Metrics by Request
Request drill-down lets you inspect every request executed by the virtual users. This helps you identify which request contributed to a spike in the cumulative average response time, so you can fix the problem.
You can visualize the performance metrics of an individual request by selecting it in the request filter.
Example use case: Your overall P95 latency spiked to 2 seconds. By drilling down, you discover that the GET /products endpoint is slow (3s average), while other endpoints are fast (200ms average). This pinpoints exactly where to optimize.
Troubleshooting Errors in Your Performance Test Runs
When your performance tests indicate elevated error rates and you would like to know more, you can simply hover over the point of interest and see what's causing the spike. This helps you identify the cause of the error and troubleshoot the problem further.
Once the run is complete, you can also click on the Errors tab to view the detailed error rate breakdown trend and see:
- Which endpoints are failing
- What error codes are being returned
- Error messages and stack traces
- At what load level errors started occurring
Common error patterns and fixes:
Pattern 1: Errors Spike at Specific User Count
Symptom: Everything works until 150 users, then errors spike to 15%
Likely cause: Database connection pool exhausted
Fix:
```typescript
// Increase the connection pool size (sketch using node-postgres; adapt to your DB client)
import { Pool } from "pg";

const pool = new Pool({ max: 50, min: 10 }); // was max: 10; min keeps warm connections
```
Pattern 2: Gradual Error Increase
Symptom: Error rate slowly climbs from 0% to 5% over 10 minutes
Likely cause: Memory leak or resource leak
Fix: Profile your application and look for:
- Event listeners not being removed
- Caching without size limits
- Database connections not released
Pattern 3: Immediate Errors on Spike
Symptom: When load jumps from 10 to 500 users, instant 50% error rate
Likely cause: No graceful handling of burst traffic
Fix:
- Implement rate limiting (see the sketch after this list)
- Add request queuing
- Configure autoscaling (if cloud-based)
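For the rate-limiting fix, a minimal sketch for an Express API using express-rate-limit might look like this. The limits are illustrative; tune them to your measured capacity.

```typescript
import express from "express";
import rateLimit from "express-rate-limit";

const app = express();

// Reject requests beyond 100 per minute per client instead of letting bursts crash the backend
app.use(
  rateLimit({
    windowMs: 60 * 1000,   // 1-minute window
    max: 100,              // requests allowed per window, per IP
    standardHeaders: true, // send RateLimit-* headers so clients can back off
  })
);

app.listen(3000);
```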
Viewing Past Performance Test Runs
You can view the list of past performance test runs for your API specifications, allowing you to:
- Compare performance across different versions
- Track improvements over time
- Identify performance regressions
- Share results with your team
Use cases:
- Before/after optimization: Did adding Redis caching improve response times?
- Capacity planning: How did performance change after upgrading infrastructure?
- Regression detection: Did the latest deploy slow down the API?
Understanding Your Test Results
After your test completes, API Stress Lab provides a comprehensive report showing:
What Success Looks Like
Excellent performance:
- ✅ P95 response time: under 200ms
- ✅ Error rate: under 0.1%
- ✅ Throughput: Matches expected calculations
- ✅ CPU: under 60%
- ✅ Memory: Stable throughout test
Good performance:
- ✅ P95 response time: 200-500ms
- ✅ Error rate: 0.1-0.5%
- ✅ Throughput: Close to expected
- ✅ CPU: 60-70%
- ✅ Memory: Slight growth but stabilizes
Needs optimization:
- ⚠️ P95 response time: 500ms-1s
- ⚠️ Error rate: 0.5-1%
- ⚠️ Throughput: Below expected
- ⚠️ CPU: 70-85%
- ⚠️ Memory: Growing but manageable
Critical issues:
- ❌ P95 response time: over 1s
- ❌ Error rate: over 1%
- ❌ Throughput: Significantly below expected
- ❌ CPU: over 85%
- ❌ Memory: Growing unbounded
Common Bottlenecks Identified
API Stress Lab helps you identify these common performance issues:
N+1 Query Problem:
- Symptom: Response time increases with data volume
- How detected: Database query count scales linearly with load
- Impact: 10x-100x slower than optimized queries
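For illustration, here is what the N+1 pattern typically looks like next to a batched alternative. The tables, columns, and node-postgres client are placeholders standing in for your own schema and database client.

```typescript
import { Pool } from "pg";

const pool = new Pool();

// N+1: one query for the orders, then one extra query per order
async function ordersWithItemsSlow(userId: string) {
  const { rows: orders } = await pool.query("SELECT id FROM orders WHERE user_id = $1", [userId]);
  return Promise.all(
    orders.map(async (order) => ({
      ...order,
      items: (await pool.query("SELECT * FROM order_items WHERE order_id = $1", [order.id])).rows,
    }))
  );
}

// Batched: two queries total, no matter how many orders there are
async function ordersWithItemsFast(userId: string) {
  const { rows: orders } = await pool.query("SELECT id FROM orders WHERE user_id = $1", [userId]);
  const { rows: items } = await pool.query(
    "SELECT * FROM order_items WHERE order_id = ANY($1)",
    [orders.map((o) => o.id)]
  );
  // Attach items to their orders in memory
  return orders.map((o) => ({ ...o, items: items.filter((i) => i.order_id === o.id) }));
}
```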
Missing Database Indexes:
- Symptom: Slow queries that get worse with more data
- How detected: Query execution time analysis
- Impact: Queries take 500ms instead of 5ms
Connection Pool Exhaustion:
- Symptom: Errors spike at specific user count
- How detected: "No connections available" errors
- Impact: API becomes completely unavailable
Memory Leaks:
- Symptom: Performance degrades over time
- How detected: Memory usage grows continuously
- Impact: Application crashes after hours of runtime
Synchronous External Calls:
- Symptom: Response time mirrors external API latency
- How detected: API response time = external API time
- Impact: Unnecessarily slow responses
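A common mitigation is to stop awaiting independent external calls one after another. A minimal sketch, with placeholder URLs:

```typescript
// Sequential: total latency ≈ the sum of both external calls
const payment = await fetch("https://payments.example.com/status").then((r) => r.json());
const shipping = await fetch("https://shipping.example.com/quote").then((r) => r.json());

// Parallel: total latency ≈ the slower of the two calls
const [paymentParallel, shippingParallel] = await Promise.all([
  fetch("https://payments.example.com/status").then((r) => r.json()),
  fetch("https://shipping.example.com/quote").then((r) => r.json()),
]);
```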
Real-World Testing Strategy
Scenario 1: Pre-Launch Startup
Expected traffic: 500 concurrent users at peak
Testing plan:
- Smoke test: 5 VUs for 2 minutes (health check) ✅
- Load test: 500 VUs for 20 minutes (sustained load) ✅
- Stress test: Ramp from 10 → 1000 VUs (find breaking point) ✅
Results:
- Smoke test: ✅ All endpoints responding under 100ms
- Load test: ⚠️ P95 latency 800ms (should be under 500ms)
- Stress test: ❌ Breaks at 300 users (database connections)
Fixes applied:
- Increased DB connection pool: 10 → 50
- Added Redis caching for user sessions
- Optimized slow query with index
Re-test results:
- Load test: ✅ P95 latency 250ms
- Stress test: ✅ Handles 800+ users
- Ready to launch ✅
Scenario 2: Production API Monitoring
Current traffic: 200 concurrent users at peak
Testing plan (monthly):
- Load test: 200 VUs (current peak) ✅
- Stress test: Find new breaking point ✅
- Regression check: Compare to last month ✅
Results:
- Load test: ✅ No issues
- Stress test: ⚠️ Breaking point now 400 (was 600 last month)
- Regression: ❌ Recent feature introduced memory leak
Action taken:
- Identified problematic feature
- Fixed memory leak
- Re-tested and verified fix
Action Plan: What to Do Right Now
If You Haven't Tested Yet:
- Generate or upload your OpenAPI specification (10 minutes)
- Run a load test with your expected peak traffic (20 minutes)
- Run a stress test to find your breaking point (20 minutes)
- Document your limits: "We can handle X concurrent users"
- Set up monitoring: Alert when approaching 70% of capacity
If You're Already Testing:
- Add spike testing to catch sudden burst issues
- Run tests regularly (weekly for production APIs)
- Automate tests in your CI/CD pipeline
- Track trends over time to catch regressions
- Re-test after major infrastructure or code changes
If You're Overwhelmed:
Start simple:
- Upload your OpenAPI spec
- Run a 15-minute load test with expected traffic
- If it passes, run a stress test to find limits
- That's it. You're 80% of the way there.
Best Practices for API Performance Testing
Test with Realistic Data
Don't: Test against an empty database
Do: Seed your test database with production-scale data (1M+ rows)
Why: Queries that are fast with 10 rows become slow with 1M rows
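A seeding sketch along these lines gets you to production-scale volume quickly. The table, columns, and node-postgres client are assumptions; adapt them to your schema.

```typescript
import { Pool } from "pg";

const pool = new Pool();

// Insert 1M synthetic products in batches so queries are exercised at realistic volume
async function seedProducts(total = 1_000_000, batchSize = 10_000): Promise<void> {
  for (let offset = 0; offset < total; offset += batchSize) {
    const values = Array.from(
      { length: batchSize },
      (_, i) => `('Product ${offset + i}', ${(offset + i) % 500})` // placeholder columns
    ).join(",");
    await pool.query(`INSERT INTO products (name, category_id) VALUES ${values}`);
  }
}

await seedProducts();
```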
Test Complete Workflows
Don't: Test endpoints in isolation only
Do: Test realistic user workflows (login → browse → checkout)
Why: Individual endpoints might be fast, but complete workflows might have issues
Test from Realistic Locations
Don't: Test from the same datacenter as your API
Do: Test from the regions where your users actually are
Why: Network latency matters - don't ignore it
Test Regularly
Don't: Test once before launch and never again
Do: Test weekly for production APIs, and before every major release
Why: Code changes affect performance - catch regressions early
Monitor While Testing
Don't: Just look at response times
Do: Monitor CPU, memory, DB connections, and errors
Why: Understand what fails first and why
Conclusion: Performance Testing Made Simple
We hope that with API Stress Lab's performance testing capabilities, you find it easier to test your API's performance and make it part of your development lifecycle. Our goal is to democratize API performance testing for all developers and make it as simple as uploading your OpenAPI specification.
Key takeaways:
- Upload your OpenAPI spec - Automatic test generation saves hours of scripting
- Configure load settings - Virtual users and load profiles simulate realistic traffic
- Observe real-time metrics - Response time, throughput, errors, and resources
- Drill down into issues - Identify exactly which endpoints need optimization
- Test regularly - Make performance testing part of your release process
Getting started is simple:
- Upload your OpenAPI specification
- Configure virtual users and duration
- Click "Run Test"
- Review results and fix bottlenecks
Your plan includes a limited number of performance test runs each month. You can track usage in your dashboard, and API Stress Lab will alert you as you approach your limit.
Ready to test your API? Find your breaking point before your users do.