Improving Performance in ABAP: Optimizing Data Access and Processing

When writing programs in ABAP, performance can often be a critical factor. This is especially true in large SAP environments where reading and processing massive amounts of data can severely impact the overall system performance. In this post, I will explain some key strategies and techniques for improving performance in ABAP, particularly when dealing with database reads, internal table processing, and caching.


📊 Reading Data from the Database

The way you read data from the database can greatly impact performance. Typically, performance issues arise when you’re handling large volumes of data, especially when reading data from the database multiple times in a loop.

For the following examples, we always measure with the following number of records:

  • MARA: 83813
  • MARC: 190384

🔄 Sequential Reading of All Data

The traditional way of reading data sequentially, using a loop and repeated SELECT statements, is inefficient and can result in high runtimes. Here’s an example:

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF @ls_mara
  WHERE matkl IN @s_matkl.

  SELECT *
    FROM marc
    INTO CORRESPONDING FIELDS OF @ls_marc
    WHERE matnr EQ @ls_mara-matnr.

  ENDSELECT.

ENDSELECT.

Runtime: 41 seconds

This is due to repeated database access for each record, which causes a performance bottleneck.


🔄 Sequential Reading of Missing Data

An improvement over the previous method involves reading the data into an internal table and processing it in memory. This method still has its drawbacks but can be more efficient.

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT *
    FROM marc
    APPENDING CORRESPONDING FIELDS OF TABLE @lt_marc
    WHERE matnr EQ @<s_mara>-matnr.

ENDLOOP.

Runtime: 38 seconds

This approach avoids the nested SELECT loop, but it still sends one SELECT to the database for every material inside the LOOP.


🚀 Reading via “FOR ALL ENTRIES IN”

A far more efficient approach is to use the FOR ALL ENTRIES IN clause, which minimizes the number of SELECT statements sent to the database.

Make sure that the internal table used in FOR ALL ENTRIES contains entries. If it is empty, the WHERE condition is ignored and the entire database table is read!

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.


SELECT *
  FROM marc
  INTO CORRESPONDING FIELDS OF TABLE @lt_marc
  FOR ALL ENTRIES IN @lt_mara
  WHERE matnr EQ @lt_mara-matnr.

Runtime: 0.7 seconds

This technique reads data in bulk, significantly improving performance.
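
The warning about an empty driver table can be handled with a simple guard. The following is a minimal sketch, not the measured program: the type ty_mara and the range table lt_matkl are hypothetical stand-ins for the article's t_mara and s_matkl, which are not shown.

```abap
" Hypothetical row type; the article's t_mara is not shown.
TYPES: BEGIN OF ty_mara,
         matnr TYPE mara-matnr,
         matkl TYPE mara-matkl,
       END OF ty_mara.

DATA: lt_matkl TYPE RANGE OF mara-matkl, " assumed stand-in for s_matkl
      lt_mara  TYPE STANDARD TABLE OF ty_mara WITH EMPTY KEY,
      lt_marc  TYPE STANDARD TABLE OF marc WITH EMPTY KEY.

SELECT matnr, matkl
  FROM mara
  INTO TABLE @lt_mara
  WHERE matkl IN @lt_matkl.

" With an empty driver table the WHERE condition would be ignored
" and every row of MARC would be read, so guard the second SELECT.
IF lt_mara IS NOT INITIAL.
  SELECT *
    FROM marc
    INTO TABLE @lt_marc
    FOR ALL ENTRIES IN @lt_mara
    WHERE matnr = @lt_mara-matnr.
ELSE.
  CLEAR lt_marc.
ENDIF.
```

Keep in mind that FOR ALL ENTRIES removes duplicate rows from the result set. With SELECT * this does not matter because the full key is selected, but with a narrower field list it can silently drop rows.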


🧮 Processing Internal Tables

Once the necessary data is in internal tables, processing the data efficiently becomes crucial, especially when the internal tables contain a large number of records.

For the following examples we always measure with the following number of records. The internal tables correspond to the examples from above.

  • MARA: 83813
  • MARC: 190384

🔄 Sequential Access

Accessing data sequentially in a standard table can lead to inefficiencies when working with large datasets. Here’s an example using standard table access:

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.
  ENDLOOP.

ENDLOOP.

Runtime: 931 seconds

This is inefficient because, for every row of lt_mara, the WHERE condition has to be checked against every row of lt_marc to find the matching records.


A more efficient approach is to use binary search. The only requirements are that the internal table is sorted by the search key and that the key is unique, because READ TABLE returns only a single row. Here’s an optimized version:

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

SORT lt_mara BY matnr ASCENDING.

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr BINARY SEARCH.

ENDLOOP.

Runtime: 0.6 seconds

Binary search reduces the cost of each lookup from O(n) to O(log n), significantly improving the performance.
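
One detail worth showing for READ TABLE is the check of sy-subrc. If no row is found, the field symbol is not reassigned and may still point to the row found in an earlier iteration. A minimal sketch, assuming hypothetical row types ty_mara and ty_marc in place of the article's t_mara and t_marc:

```abap
" Hypothetical row types; the article's t_mara and t_marc are not shown.
TYPES: BEGIN OF ty_mara,
         matnr TYPE mara-matnr,
         matkl TYPE mara-matkl,
       END OF ty_mara,
       BEGIN OF ty_marc,
         matnr TYPE marc-matnr,
         werks TYPE marc-werks,
       END OF ty_marc.

DATA: lt_mara TYPE STANDARD TABLE OF ty_mara WITH EMPTY KEY,
      lt_marc TYPE STANDARD TABLE OF ty_marc WITH EMPTY KEY.

" ... (data selection)

" BINARY SEARCH assumes ascending order by the search key.
SORT lt_mara BY matnr ASCENDING.

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>)
       WITH KEY matnr = <s_marc>-matnr BINARY SEARCH.
  IF sy-subrc <> 0.
    " No material found: <s_mara> was not reassigned, so skip this row.
    CONTINUE.
  ENDIF.

  " <s_mara> is safe to use from here on.
ENDLOOP.
```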


🔑 Secondary Key

If you frequently access a table by fields other than its primary key, secondary keys are an efficient option. Since ABAP 7.0 EhP2, secondary keys can be defined on internal tables to speed up access:

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc
              WITH NON-UNIQUE SORTED KEY k1 COMPONENTS matnr. " not unique

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) USING KEY k1 WHERE matnr EQ <s_mara>-matnr.

  ENDLOOP.

ENDLOOP.

Runtime: 0.6 seconds

DATA: lt_mara TYPE TABLE OF t_mara
              WITH UNIQUE HASHED KEY k1 COMPONENTS matnr, " unique
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY k1 COMPONENTS matnr = <s_marc>-matnr.

ENDLOOP.

Runtime: 0.5 seconds

Secondary keys allow fast keyed access to a table without changing its primary table type, so a standard table can stay a standard table.
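
A secondary key only helps when the statement names it. The following sketch contrasts the two loops on the same table; the row type ty_marc and the variable lv_matnr are hypothetical, and I have not measured this exact snippet.

```abap
" Hypothetical row type; the article's t_marc is not shown.
TYPES: BEGIN OF ty_marc,
         matnr TYPE marc-matnr,
         werks TYPE marc-werks,
       END OF ty_marc.

DATA: lt_marc  TYPE STANDARD TABLE OF ty_marc
               WITH NON-UNIQUE KEY matnr werks
               WITH NON-UNIQUE SORTED KEY k1 COMPONENTS matnr,
      lv_matnr TYPE marc-matnr,
      lv_hits  TYPE i.

" ... (data selection)

" Without USING KEY the loop runs over the primary index of the
" standard table and checks the condition row by row.
LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_linear>) WHERE matnr = lv_matnr.
  lv_hits = lv_hits + 1.
ENDLOOP.

" With USING KEY the sorted secondary key k1 is used to locate the rows.
CLEAR lv_hits.
LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_keyed>) USING KEY k1 WHERE matnr = lv_matnr.
  lv_hits = lv_hits + 1.
ENDLOOP.
```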


🔀 Table type SORTED/HASHED

For optimal performance, it’s crucial to select the right table type. SORTED and HASHED tables provide different performance benefits, especially when dealing with large datasets. As the SAP blog post shows, the access time for the two types of tables behaves as follows:

READ … WITH KEY

| Access time | Standard | Sorted | Hashed |
|---|---|---|---|
| O(1) | Access via index | Access via index | KEY contains the whole table key |
| O(log n) | Table is sorted by KEY and BINARY SEARCH has been added | KEY contains the whole or the beginning part of the table key | - |
| O(n) | Table is not sorted by KEY or no BINARY SEARCH | KEY does not contain the first field of the table key | KEY does not contain the whole table key |

LOOP AT … WHERE

| Access time | Standard | Sorted | Hashed |
|---|---|---|---|
| O(1) | - | - | WHERE contains the whole key |
| O(log n) | The table needs to be sorted according to the WHERE clause, plus a workaround: 1. First find the starting index with READ … BINARY SEARCH, 2. LOOP AT … WHERE … FROM INDEX | WHERE contains the whole or the beginning part of the table key | - |
| O(n) | Every other LOOP AT … WHERE | WHERE does not contain the first field of the table key | WHERE does not contain the whole table key |
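
The workaround listed for standard tables under LOOP AT … WHERE can be sketched as follows. This is my own variant under stated assumptions: the row types ty_mara and ty_marc are hypothetical, and instead of a WHERE condition the inner loop uses an explicit EXIT so that it stops at the first row of another material.

```abap
" Hypothetical row types; the article's t_mara and t_marc are not shown.
TYPES: BEGIN OF ty_mara,
         matnr TYPE mara-matnr,
         matkl TYPE mara-matkl,
       END OF ty_mara,
       BEGIN OF ty_marc,
         matnr TYPE marc-matnr,
         werks TYPE marc-werks,
       END OF ty_marc.

DATA: lt_mara  TYPE STANDARD TABLE OF ty_mara WITH EMPTY KEY,
      lt_marc  TYPE STANDARD TABLE OF ty_marc WITH EMPTY KEY,
      lv_start TYPE sy-tabix.

" ... (data selection)

SORT lt_marc BY matnr ASCENDING.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " Step 1: find the first MARC row of this material by binary search.
  READ TABLE lt_marc TRANSPORTING NO FIELDS
       WITH KEY matnr = <s_mara>-matnr BINARY SEARCH.
  IF sy-subrc <> 0.
    CONTINUE. " no plant data for this material
  ENDIF.
  lv_start = sy-tabix.

  " Step 2: walk forward from that index and stop at the next material.
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) FROM lv_start.
    IF <s_marc>-matnr <> <s_mara>-matnr.
      EXIT.
    ENDIF.
    " process <s_marc> here
  ENDLOOP.
ENDLOOP.
```

The sort order is the table's responsibility here: if rows are appended after the SORT, the binary search may miss them. A sorted table or a sorted secondary key avoids that risk.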

Note in particular that choosing the wrong table type can even make things worse: see the O(n) cases for hashed tables, where any access that does not use the whole table key falls back to a linear scan.

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE SORTED TABLE OF t_marc
              WITH UNIQUE KEY matnr werks.

" ... (data selection)


LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.

  ENDLOOP.

ENDLOOP.

Runtime: 0.5 seconds

DATA: lt_mara TYPE HASHED TABLE OF t_mara
              WITH UNIQUE KEY matnr,
      lt_marc TYPE TABLE OF t_marc.

" ... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr.

ENDLOOP.

Runtime: 0.5 seconds

Both table types give fast keyed access: the sorted table through a binary search on the leading key field, the hashed table through direct access by its full key.
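
The difference between a good and a bad access to a hashed table comes down to whether the whole table key is supplied. A small sketch, with a hypothetical row type ty_mara and hypothetical variables lv_matnr and lv_matkl:

```abap
" Hypothetical row type; the article's t_mara is not shown.
TYPES: BEGIN OF ty_mara,
         matnr TYPE mara-matnr,
         matkl TYPE mara-matkl,
       END OF ty_mara.

DATA: lt_mara  TYPE HASHED TABLE OF ty_mara WITH UNIQUE KEY matnr,
      lv_matnr TYPE mara-matnr,
      lv_matkl TYPE mara-matkl.

" ... (data selection)

" Whole table key supplied: the row is located through the hash.
READ TABLE lt_mara WITH TABLE KEY matnr = lv_matnr
     ASSIGNING FIELD-SYMBOL(<s_by_key>).
IF sy-subrc = 0.
  " <s_by_key> points to the material
ENDIF.

" Free key on a non-key field: the hash cannot be used,
" so the table is searched row by row.
READ TABLE lt_mara WITH KEY matkl = lv_matkl
     ASSIGNING FIELD-SYMBOL(<s_by_matkl>).
IF sy-subrc = 0.
  " <s_by_matkl> points to the first material found in that group
ENDIF.
```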


✏️ Changing data of an internal table

In general, there is little difference between using a target structure with LOOP AT INTO or a field symbol with LOOP AT ASSIGNING when iterating over a table. The LOOP AT INTO is so well-optimized in the ABAP kernel that the runtime difference compared to LOOP AT ASSIGNING is almost negligible.

However, when you need to modify data in an internal table, the choice between a structure and a field symbol becomes significant. To better illustrate this, the amount of data in the MARA table has been increased to 804,019 records to highlight the performance difference.

DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara INTO DATA(ls_mara).
  ls_mara-aenam = sy-uname.
  MODIFY lt_mara FROM ls_mara TRANSPORTING aenam.

ENDLOOP.

Runtime: 1.9 seconds

DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  <s_mara>-aenam = sy-uname.

ENDLOOP.

Runtime: 0.3 seconds
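
The field symbol is faster because it points directly at the table row: the assignment changes the row in place, so neither the copy into a work area nor the MODIFY that writes it back is needed. A data reference behaves similarly. The sketch below shows that variant with a hypothetical row type ty_mara; I have not measured it, so no runtime is given.

```abap
" Hypothetical row type; the article's t_mara is not shown.
TYPES: BEGIN OF ty_mara,
         matnr TYPE mara-matnr,
         aenam TYPE mara-aenam,
       END OF ty_mara.

DATA lt_mara TYPE STANDARD TABLE OF ty_mara WITH EMPTY KEY.

" ... (data selection)

" lr_mara references the table row itself, so the change is made
" in place and no MODIFY statement is required.
LOOP AT lt_mara REFERENCE INTO DATA(lr_mara).
  lr_mara->aenam = sy-uname.
ENDLOOP.
```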


🧳 Caching of Frequently Read Data

Another method to improve performance, especially for data that is frequently accessed, is caching. By caching data locally, you avoid repeatedly querying the database for the same information.

For the following examples the data from the first example is used again:

  • MARA: 83813

It should be noted here that all materials from the MARA table are in the same material group (MATKL).


🔄 Re-reading Without Cache

Without any form of caching, each time you access a piece of data, a database query is executed:

DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT SINGLE pernr
    FROM pa0105
    INTO @DATA(lv_pernr)
    WHERE usrid EQ @<s_mara>-aenam.

ENDLOOP.

Runtime: 34 seconds


🗂️ Reading via Local Cache

To improve performance, you can use a local cache to store frequently accessed data. If the data is already in the cache, it avoids the need to query the database again:

DATA: gt_pa0105 TYPE HASHED TABLE OF pa0105
                WITH UNIQUE KEY usrid.
                
METHOD get_pernr.

  READ TABLE gt_pa0105 ASSIGNING FIELD-SYMBOL(<s_pa0105>) WITH KEY usrid = iv_uname.
  IF sy-subrc NE 0.
    SELECT SINGLE *
      FROM pa0105
      INTO @DATA(ls_pa0105)
      WHERE usrid EQ @iv_uname.
    IF sy-subrc NE 0.
      " No record found: cache a placeholder entry so that the
      " database is not queried again for the same user.
      ls_pa0105-usrid = iv_uname.

    ENDIF.

    INSERT ls_pa0105
      INTO TABLE gt_pa0105 ASSIGNING <s_pa0105>.

  ENDIF.

  rv_pernr = <s_pa0105>-pernr.

ENDMETHOD.

The loop then calls the method instead of reading the database directly:

DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  DATA(lv_pernr) = get_pernr( <s_mara>-aenam ).

ENDLOOP.

Runtime: 0.7 seconds


📂 Buffered table

Tables can be configured for active buffering in the table’s technical settings (SE11 or SE13). When a table (or record) is buffered, it is stored in the table buffer on the application server, eliminating the need to perform a database select if the data is already in the buffer.

Merely marking the table for buffering does not ensure that the application server will store the data in the buffer. Depending on the buffer size or the number of active records, buffered records may be removed from the buffer.

DATA: lt_mara TYPE TABLE OF t_mara.

" ... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT SINGLE name1 " USR03 is buffered
    FROM usr03
    INTO @DATA(lv_name1)
    WHERE bname EQ @<s_mara>-aenam.

ENDLOOP.

Runtime: 0.8 seconds


🧠 Final Thoughts

Improving performance in ABAP is essential for large-scale SAP systems, especially when dealing with massive datasets. By utilizing efficient database reading techniques, optimizing internal table processing, and implementing caching mechanisms, you can significantly reduce runtime and enhance system performance. Always consider the size of your dataset and the specific requirements of your application when choosing the best technique for your scenario.

Happy coding, and may your ABAP programs run faster than ever!

This post is licensed under CC BY 4.0 by the author.