Data Sources and Subreports: Keeping Performance in Check - CMO & CTO (An AI Generated Experiment to the past)

Subreports look nice on the canvas and they feel clean when you click around in iReport, but they can send your server for a walk if you are not careful. I have been helping teams wire up JasperReports for invoices, statements and those end of month monsters that marketing loves, and the same story keeps popping up. The report loads fast with sample data, then the real database walks in and everything crawls. If your stack is Tomcat with a shared JNDI connection and a couple of pooled threads, you do not have infinite room to play. Let’s talk about where the time goes and how to keep your reports snappy without giving up features.

How JasperReports really pulls your data

The main report has a data source. Each subreport has its own data source too. That is the bit many folks gloss over. A subreport inside a detail band is evaluated for every row of the parent dataset. So if your main query returns 500 customers and inside that detail you place two subreports for orders and payments, you are running the classic N plus 1 pattern twice over: one query for the customers, then two more for every customer, which is 1001 statements in all. The screen looked tidy. The database did not enjoy it.

With JDBC connections this means repeated trips across the wire. With a JavaBeans data source this means repeated scans over the same lists. With XML or CSV it means repeated parsing unless you stash the parsed document. The fill engine is stubborn in a good way. It will do exactly what the design says. If the subreport sits in a band that fires for each row, it runs each time. If the subreport sits in a group footer that fires once per group, it runs once per group. That is your budget.

There is a catch with pagination too. The engine fills records and pages while it goes, so when a subreport asks for its rows, the parent row is already live. That makes things feel natural in the designer, but it also means timing matters. If you depend on a heavy query inside a subreport, you are blocking the page until that query ends. Think of it like a nested loop in Java. It works. It multiplies cost.
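To make the nested loop comparison concrete, here is a minimal sketch that only counts statements. It does not touch JasperReports or a database; the class and method names are made up for illustration, and it assumes every subreport in the detail band runs exactly one query per parent row.

```java
// Hypothetical model: counts the statements a fill would issue, nothing more.
class QueryCountSketch {

    /** One parent query, plus one query per subreport for every parent row. */
    static long subreportsInDetail(int parentRows, int subreportsPerRow) {
        long queries = 1; // the main report query
        for (int row = 0; row < parentRows; row++) {
            for (int sub = 0; sub < subreportsPerRow; sub++) {
                queries++; // each subreport runs its own query for this row
            }
        }
        return queries;
    }

    /** A single flattened query presented with report groups. */
    static long flattened() {
        return 1;
    }
}
```

With 500 customers and two subreports, subreportsInDetail(500, 2) gives 1001, against a single statement for the flattened version.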

The easiest way to sidestep this is to let the database do the heavy lifting. Flatten the data in one query and use groups in the main report to present it. If you truly need separate datasets, consider a dataset run that reuses the same connection and passes only the parameters, rather than a brand new connection or reader each time.

Subreports without pain

Subreports are not the enemy. They are great for layout reuse, for a self contained block like a miniature table, and for pieces used across many reports. The trick is to control where they live and how often they fire.

Put repeating subreports inside a group header or group footer when the content belongs to that level. For example, a customer summary can live in the customer group header, not in the detail. That way it runs once per customer. A totals block can live in the group footer and it also runs once. If you must show a list inside the list, keep that inner dataset small by passing filtered parameters such as a customer id or a date window, and make sure the query uses those inputs with proper indexes on the database.

If your subreport is only there to repeat a few fields in a grid, drop it and use a frame with static text and text fields. Frames are cheap. Subreports have overhead. If you are using a JavaBeans data source and you glued in a subreport just to filter the children of a parent, you can pre-split the data in Java into a map keyed by the parent id, pass the map to the report, and let the subreport read its children directly from the map with a constant-time lookup.
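A minimal sketch of that pre-split, using only the Java standard library. The Order bean and the method names are hypothetical stand-ins for whatever child type your report prints.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ChildIndexSketch {

    // Hypothetical child bean; replace with your own type.
    static final class Order {
        final int customerId;
        final String description;

        Order(int customerId, String description) {
            this.customerId = customerId;
            this.description = description;
        }

        int getCustomerId() {
            return customerId;
        }
    }

    /** Builds the index once, before the fill, in a single pass over the list. */
    static Map<Integer, List<Order>> byCustomer(List<Order> orders) {
        return orders.stream().collect(Collectors.groupingBy(Order::getCustomerId));
    }

    /** The per-row lookup the subreport would do: no scan, just a map read. */
    static List<Order> childrenOf(Map<Integer, List<Order>> index, int customerId) {
        return index.getOrDefault(customerId, Collections.<Order>emptyList());
    }
}
```

The map would then travel as a report parameter, and the subreport's data source expression could wrap the looked-up list, for instance in a JRBeanCollectionDataSource. How you wire that up depends on your report, so treat this as a starting point.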

Watch expressions like Print When. A subreport whose Print When expression evaluates to false still pays the cost of evaluating it. If the expression is simple, that is fine. If you are calling methods that do I/O or heavy logic inside that expression, move the work to a report variable or to a parameter computed in Java before the fill.
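One way to move that work out of the expression is to run the expensive check once in Java and hand the report a plain Boolean. This is a sketch under assumptions: SHOW_PAYMENTS is a hypothetical parameter name, and the check is passed in as a supplier so the example stays independent of any real service.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

class PrintWhenSketch {

    /**
     * Runs the expensive check exactly once and stores the answer.
     * "SHOW_PAYMENTS" is a made-up parameter name for illustration.
     */
    static Map<String, Object> buildParameters(BooleanSupplier expensiveCheck) {
        Map<String, Object> params = new HashMap<>();
        params.put("SHOW_PAYMENTS", expensiveCheck.getAsBoolean());
        return params;
    }
}
```

The Print When expression then shrinks to $P{SHOW_PAYMENTS}, and the map is what you pass as the parameters of the fill call.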

Data source choices and what they mean for speed

JDBC is the most direct route. The engine can stream rows and you can lean on the database for sorting and grouping. Reuse the same connection through your report and all subreports by passing it down, typically with the built-in REPORT_CONNECTION parameter in the subreport's connection expression, instead of opening a new one each time. Keep queries simple, select only the columns you print, and let the database group and aggregate.

JavaBeans or custom JRDataSource is nice when the data already lives in memory. Build indexes once. Create a map of parent id to list of children before the fill and pass that down. Avoid creating new iterators on every row if they scan the whole list. Precompute derived fields so the report does not do math for every cell.

XML and CSV can work for small sets or static references. For larger sets, parse once and reuse. Do not reparse the same file inside each subreport. If you need XPath for many nodes, keep the compiled path objects in your code and pass the results, not the source document, to reduce repeated work.
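Here is a minimal sketch of the parse once idea with the standard javax.xml APIs. The XML shape (an orders root with order elements carrying id and customer attributes) and all names are assumptions made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlOnceSketch {

    /** Parses the document once and returns order ids grouped by customer id. */
    static Map<String, List<String>> orderIdsByCustomer(String xml) {
        try {
            // Single parse of the source document.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Single compiled XPath, evaluated once over the whole document.
            XPathExpression allOrders =
                    XPathFactory.newInstance().newXPath().compile("/orders/order");
            NodeList nodes = (NodeList) allOrders.evaluate(doc, XPathConstants.NODESET);

            Map<String, List<String>> result = new HashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element order = (Element) nodes.item(i);
                result.computeIfAbsent(order.getAttribute("customer"), k -> new ArrayList<>())
                      .add(order.getAttribute("id"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read orders XML", e);
        }
    }
}
```

The report and its subreports then receive the resulting map, not the file, so nothing is parsed again during the fill.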

Memory is part of the story. Big images, wide text fields with long paragraphs, and too many nested elements keep the fill engine busy. Use styles to keep formatting light. If you are stacking many subreports vertically, set them to stretch with overflow only when needed. Fewer moving parts means fewer layout passes.

The shiny design versus the fast design

The shiny design: Main report lists customers. Inside the detail you drop a subreport for orders and another for payments. Each subreport runs its own query filtered by the customer id. It looks clean. It is also two queries per customer, which becomes thousands of queries for a normal batch.

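The fast design: One query returns customers with their orders and payments, with proper filters. Be careful how you combine them: joining both child tables straight onto customers pairs every order with every payment, so a UNION ALL of the two child sets with a row type column is usually the safer shape. The main report groups by customer. Inside the customer group you show the customer block. Inside an orders group you print the rows. Inside a payments group you print the rows. You let the database sort and group. You let the report render. Same content. Fewer queries. Less time on the wire. Less garbage in the JVM.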
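What the report groups do with that flat result can be sketched in plain Java. This is only an imitation of group breaks, not the fill engine itself. It assumes rows arrive sorted by customer and then by a hypothetical row type column (kind), and every name here is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class GroupBreakSketch {

    // Hypothetical flat row: one line of the sorted result set.
    static final class Row {
        final int customerId;
        final String kind; // e.g. "ORDER" or "PAYMENT"
        final String text;

        Row(int customerId, String kind, String text) {
            this.customerId = customerId;
            this.kind = kind;
            this.text = text;
        }
    }

    /** Walks the sorted rows once and emits a header whenever a group changes. */
    static List<String> render(List<Row> sortedRows) {
        List<String> out = new ArrayList<>();
        Integer currentCustomer = null;
        String currentKind = null;
        for (Row r : sortedRows) {
            if (currentCustomer == null || currentCustomer.intValue() != r.customerId) {
                out.add("Customer " + r.customerId); // customer group header
                currentCustomer = r.customerId;
                currentKind = null; // inner group restarts with the outer one
            }
            if (!r.kind.equals(currentKind)) {
                out.add("  " + r.kind); // orders or payments group header
                currentKind = r.kind;
            }
            out.add("    " + r.text); // detail line
        }
        return out;
    }
}
```

One pass over one result set produces the same nested layout the two subreports gave you, which is the whole point of the fast design.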

Another contrast. Subreport for a tiny reusable header is fine and keeps your designs tidy across multiple reports. Subreport for a list that fires for every parent row is expensive. Save subreports for reuse and structure. Use groups and frames for repetition.

Practical checklist to keep JasperReports quick

Flatten where you can by letting the database join and group, then use report groups to present.
Place subreports at the right level so they run once per group, not once per row, unless that is really what you want.
Reuse the same connection by passing it to subreports. Avoid opening new connections inside expressions.
Limit selected columns to only what you print. Wide selects slow everything down.
Index the filters you pass to subreports and queries so lookups are quick.
Precompute in Java for JavaBeans data sources. Build maps and derived fields before the fill.
Cache parsed XML or CSV instead of reparsing inside each subreport.
Keep Print When expressions light. Move heavy logic to variables or parameters.
Use frames and groups for simple repetition. Reserve subreports for genuine reuse.
Measure with real data. Try a batch run on staging with production sizes before you ship.
Watch memory. Large images and deep nesting increase layout passes. Simplify styles.
Avoid duplicate queries. If two subreports ask the same thing, fetch once and split in the report.
Be careful with detail bands. Anything inside them runs for each row.
Pass parameters smartly. Avoid expensive method calls inside parameter expressions.
Profile the fill with logging at fill time to see where the clock is going.
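
For the last item, a small timing wrapper is often enough to see where the clock goes. This sketch assumes nothing about your fill code: the task is whatever you hand it, and the class and method names are made up.

```java
import java.util.function.Supplier;

class FillTimerSketch {

    /** Runs the task, prints the elapsed milliseconds with the label, returns the task's result. */
    static <T> T timed(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + ms + " ms");
        }
    }
}
```

Wrap the fill call and the export call separately, run both against production-sized data, and compare the two numbers before you start redesigning anything.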

Keep your reports honest. If the design says run a subreport three hundred times, the engine will do exactly that. Your job is to teach the design to ask only for what it needs.

Make it readable, then make it fast, and only then make it pretty. That order tends to stick.
