Improving Performance in ABAP: Optimizing Data Access and Processing
When writing programs in ABAP, performance can often be a critical factor. This is especially true in large SAP environments where reading and processing massive amounts of data can severely impact the overall system performance. In this post, I will explain some key strategies and techniques for improving performance in ABAP, particularly when dealing with database reads, internal table processing, and caching.
📊 Reading Data from the Database
The way you read data from the database can greatly impact performance. Typically, performance issues arise when you’re handling large volumes of data, especially when reading data from the database multiple times in a loop.
For the following examples, we always measure with the following number of records:
- MARA: 83813
- MARC: 190384
🔄 Sequential Reading of All Data
The traditional way of reading data sequentially, using a loop and repeated SELECT statements, is inefficient and can result in high runtimes. Here’s an example:
SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF @ls_mara
  WHERE matkl IN @s_matkl.
  " One additional database round trip per material
  SELECT *
    FROM marc
    INTO CORRESPONDING FIELDS OF @ls_marc
    WHERE matnr EQ @ls_mara-matnr.
  ENDSELECT.
ENDSELECT.
Runtime: 41 seconds
This is due to repeated database access for each record, which causes a performance bottleneck.
🔄 Sequential Reading of Missing Data
An improvement over the previous method involves reading the data into an internal table and processing it in memory. This method still has its drawbacks but can be more efficient.
SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT *
    FROM marc
    APPENDING CORRESPONDING FIELDS OF TABLE @lt_marc
    WHERE matnr EQ @<s_mara>-matnr.
ENDLOOP.
Runtime: 38 seconds
This approach reads MARA with a single array fetch instead of a SELECT loop, but it still sends a separate SELECT to the database for every material inside the loop.
🚀 Reading via “FOR ALL ENTRIES IN”
A far more efficient approach is to use the FOR ALL ENTRIES IN clause, which minimizes the number of SELECT statements sent to the database.
Make sure that the internal table used for the restriction (here lt_mara) contains at least one row. If it is empty, the selection is made without restriction!
SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.

SELECT *
  FROM marc
  INTO CORRESPONDING FIELDS OF TABLE @lt_marc
  FOR ALL ENTRIES IN @lt_mara
  WHERE matnr EQ @lt_mara-matnr.
Runtime: 0.7 seconds
This technique reads data in bulk, significantly improving performance.
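Because an empty internal table leads to an unrestricted selection, it is worth guarding the dependent SELECT. The following is a minimal sketch under a few assumptions: the structure t_mara and the range table s_matkl are declared here only to keep the snippet self-contained (in a report, s_matkl would typically come from SELECT-OPTIONS), and lt_marc is typed directly on MARC.

```abap
" Illustrative declarations (assumptions, not taken from the measured program)
TYPES: BEGIN OF t_mara,
         matnr TYPE mara-matnr,
         matkl TYPE mara-matkl,
       END OF t_mara.

DATA: s_matkl TYPE RANGE OF mara-matkl,
      lt_mara TYPE STANDARD TABLE OF t_mara WITH EMPTY KEY,
      lt_marc TYPE STANDARD TABLE OF marc WITH EMPTY KEY.

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.

" Only run the dependent SELECT if there is something to look up.
" With an empty lt_mara the restriction on matnr would not be applied.
IF lt_mara IS NOT INITIAL.
  SELECT *
    FROM marc
    INTO CORRESPONDING FIELDS OF TABLE @lt_marc
    FOR ALL ENTRIES IN @lt_mara
    WHERE matnr EQ @lt_mara-matnr.
ENDIF.
```

If lt_mara is empty, lt_marc simply stays empty, which is usually the intended result.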
🧮 Processing Internal Tables
Once the necessary data is in internal tables, processing the data efficiently becomes crucial, especially when the internal tables contain a large number of records.
For the following examples we always measure with the following number of records. The internal tables correspond to the examples from above.
- MARA: 83813
- MARC: 190384
🔄 Sequential Access
Accessing data sequentially in a standard table can lead to inefficiencies when working with large datasets. Here’s an example using standard table access:
DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " On a standard table, the WHERE condition is checked against every row
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.
  ENDLOOP.
ENDLOOP.
Runtime: 931 seconds
This is inefficient: for every row of lt_mara, the inner loop has to scan the entire lt_marc table to find the matching records.
🧩 Binary Search
A more efficient approach is to use binary search. The only requirement is that the table being read is sorted by the search key and that the key values are unique, because READ TABLE returns just one row per lookup. Here's an optimized version:
DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

" BINARY SEARCH only works on a table sorted by the search key
SORT lt_mara BY matnr ASCENDING.

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr BINARY SEARCH.
ENDLOOP.
Runtime: 0.6 seconds
Binary search reduces the cost of each lookup from O(n) to O(log n), which significantly improves performance.
🔑 Secondary Key
If you need to frequently access certain fields, secondary keys provide an efficient method. As of ABAP 7.0 EhP2, secondary keys can be defined on internal tables to speed up access:
DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc
              WITH NON-UNIQUE SORTED KEY k1 COMPONENTS matnr. " not unique

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " USING KEY makes the loop use the sorted secondary key k1
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) USING KEY k1 WHERE matnr EQ <s_mara>-matnr.
  ENDLOOP.
ENDLOOP.
Runtime: 0.6 seconds
DATA: lt_mara TYPE TABLE OF t_mara
              WITH UNIQUE HASHED KEY k1 COMPONENTS matnr, " unique
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY k1 COMPONENTS matnr = <s_marc>-matnr.
ENDLOOP.
Runtime: 0.5 seconds
Secondary keys give a standard table an additional sorted or hashed access path, so lookups no longer have to scan the whole table.
🔀 Table type SORTED/HASHED
For optimal performance, it’s crucial to select the right table type. SORTED and HASHED tables provide different performance benefits, especially when dealing with large datasets. As the SAP blog post shows, the access time for the two types of tables behaves as follows:
READ … WITH KEY | Standard | Sorted | Hashed |
---|---|---|---|
O(1) | Access via index | Access via index | KEY contains the whole table key |
O(log n) | Table is sorted by KEY and BINARY SEARCH has been added | KEY contains the whole table key or its leading part | - |
O(n) | Table is not sorted by KEY, or no BINARY SEARCH | KEY does not contain the first field of the table key | KEY does not contain the whole table key |

LOOP AT … WHERE | Standard | Sorted | Hashed |
---|---|---|---|
O(1) | - | - | WHERE contains the whole key |
O(log n) | The table needs to be sorted according to the WHERE clause, plus a workaround: 1. First find the starting index with READ … BINARY SEARCH 2. LOOP AT … WHERE … FROM index | WHERE contains the whole table key or its leading part | - |
O(n) | Every other LOOP AT … WHERE | WHERE does not contain the first field of the table key | WHERE does not contain the whole table key |
Note in particular that using the wrong table type can even lead to a worse result (see the O(n) rows for hashed tables).
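The workaround listed for standard tables in the LOOP AT … WHERE table (find the starting index with READ … BINARY SEARCH, then loop from that index) can be sketched roughly as follows. This variant was not part of the measurements in this post. The structures t_mara and t_marc and the variable lv_start are assumptions made for the example, and the tables are assumed to have been filled by the data selection.

```abap
" Illustrative declarations (assumptions for this sketch)
TYPES: BEGIN OF t_mara,
         matnr TYPE mara-matnr,
       END OF t_mara,
       BEGIN OF t_marc,
         matnr TYPE marc-matnr,
         werks TYPE marc-werks,
       END OF t_marc.

DATA: lt_mara  TYPE STANDARD TABLE OF t_mara WITH EMPTY KEY,
      lt_marc  TYPE STANDARD TABLE OF t_marc WITH EMPTY KEY,
      lv_start TYPE sy-tabix.

" ... (data selection)

" The inner table must be sorted by the field used for the lookup
SORT lt_marc BY matnr.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " Step 1: locate a starting row for this material via binary search
  READ TABLE lt_marc TRANSPORTING NO FIELDS
       WITH KEY matnr = <s_mara>-matnr BINARY SEARCH.
  IF sy-subrc <> 0.
    CONTINUE. " no plant data for this material
  ENDIF.
  lv_start = sy-tabix.

  " Step 2: read forward from that row and stop at the next material
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) FROM lv_start.
    IF <s_marc>-matnr <> <s_mara>-matnr.
      EXIT.
    ENDIF.
    " process <s_marc> here
  ENDLOOP.
ENDLOOP.
```

The sketch leaves the inner loop with EXIT instead of adding a WHERE condition, so that it stops as soon as the block of matching rows ends. A sorted table or a sorted secondary key, as shown in this post, gives the same effect with less code.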
DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE SORTED TABLE OF t_marc
              WITH UNIQUE KEY matnr werks.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " matnr is the leading part of the table key, so the WHERE is optimized
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.
  ENDLOOP.
ENDLOOP.
Runtime: 0.5 seconds
DATA: lt_mara TYPE HASHED TABLE OF t_mara
              WITH UNIQUE KEY matnr,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  " The whole table key (matnr) is specified, so the hash access is used
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr.
ENDLOOP.
Runtime: 0.5 seconds
This table type allows fast, direct access by the key, minimizing search times.
✏️ Changing data of an internal table
In general, there is little difference between using a target structure with LOOP AT INTO or a field symbol with LOOP AT ASSIGNING when iterating over a table. The LOOP AT INTO is so well-optimized in the ABAP kernel that the runtime difference compared to LOOP AT ASSIGNING is almost negligible.
However, when you need to modify data in an internal table, the choice between a structure and a field symbol becomes significant. To better illustrate this, the amount of data in the MARA table has been increased to 804,019 records to highlight the performance difference.
DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara INTO DATA(ls_mara).
  " The row is copied into ls_mara and has to be written back explicitly
  ls_mara-aenam = sy-uname.
  MODIFY lt_mara FROM ls_mara TRANSPORTING aenam.
ENDLOOP.
Runtime: 1.9 seconds
DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " The field symbol points at the table row, so the change is made in place
  <s_mara>-aenam = sy-uname.
ENDLOOP.
Runtime: 0.3 seconds
🧳 Caching of Frequently Read Data
Another method to improve performance, especially for data that is frequently accessed, is caching. By caching data locally, you avoid repeatedly querying the database for the same information.
For the following examples the data from the first example is used again:
- MARA: 83813
Note that all materials from the MARA table are in the same material group (MATKL).
🔄 Re-reading Without Cache
Without any form of caching, each time you access a piece of data, a database query is executed:
DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " One database access per row, even if the user name repeats
  SELECT SINGLE pernr
    FROM pa0105
    INTO @DATA(lv_pernr)
    WHERE usrid EQ @<s_mara>-aenam.
ENDLOOP.
Runtime: 34 seconds
🗂️ Reading via Local Cache
To improve performance, you can use a local cache to store frequently accessed data. If the data is already in the cache, it avoids the need to query the database again:
DATA: gt_pa0105 TYPE HASHED TABLE OF pa0105
                WITH UNIQUE KEY usrid.

METHOD get_pernr.
  " Look in the cache first
  READ TABLE gt_pa0105 ASSIGNING FIELD-SYMBOL(<s_pa0105>) WITH KEY usrid = iv_uname.
  IF sy-subrc NE 0.
    " Cache miss: read the entry from the database
    SELECT SINGLE *
      FROM pa0105
      INTO @DATA(ls_pa0105)
      WHERE usrid EQ @iv_uname.
    IF sy-subrc NE 0.
      " fake entry, so that an unknown user is not read again
      ls_pa0105-usrid = iv_uname.
    ENDIF.
    INSERT ls_pa0105
      INTO TABLE gt_pa0105 ASSIGNING <s_pa0105>.
  ENDIF.
  rv_pernr = <s_pa0105>-pernr.
ENDMETHOD.
DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  DATA(lv_pernr) = get_pernr( <s_mara>-aenam ).
ENDLOOP.
Runtime: 0.7 seconds
📂 Buffered table
Tables can be configured for active buffering in the table’s technical settings (SE11 or SE13). When a table (or record) is buffered, it is stored in the table buffer on the application server, eliminating the need to perform a database select if the data is already in the buffer.
Merely marking the table for buffering does not ensure that the application server will store the data in the buffer. Depending on the buffer size or the number of active records, buffered records may be removed from the buffer.
DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT SINGLE name1 " USR03 is buffered
    FROM usr03
    INTO @DATA(lv_name1)
    WHERE bname EQ @<s_mara>-aenam.
ENDLOOP.
Runtime: 0.8 seconds
🧠 Final Thoughts
Improving performance in ABAP is essential for large-scale SAP systems, especially when dealing with massive datasets. By utilizing efficient database reading techniques, optimizing internal table processing, and implementing caching mechanisms, you can significantly reduce runtime and enhance system performance. Always consider the size of your dataset and the specific requirements of your application when choosing the best technique for your scenario.
Happy coding, and may your ABAP programs run faster than ever!