ABAP - Performance Optimized Data Processing

There are many ways to significantly improve the performance of ABAP programs. In an SAP environment, performance usually suffers when large volumes of data are read or processed, and these are exactly the points where the following approaches help.

Reading data from the database

When reading (or re-reading) data from the database, a distinction is usually made between single-record processing and mass (bulk) processing. With single-record processing, every record read costs a separate round trip from the application server to the database server. Even if each round trip takes only a few milliseconds, the time adds up when thousands of records are read.

For the following examples, we always measure with the following number of records:

  • MARA: 83813
  • MARC: 190384
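
The article does not say how the runtimes were taken. One common way, shown here as a minimal sketch and purely as an assumption, is to take the difference of two GET RUN TIME FIELD readings, which are reported in microseconds. The variable names are illustrative:

```abap
" Sketch only: the variable names are illustrative and not taken from the article.
DATA: lv_start   TYPE i,
      lv_end     TYPE i,
      lv_seconds TYPE decfloat34.

GET RUN TIME FIELD lv_start.   " runtime counter in microseconds

" ... code under test, e.g. one of the SELECT variants

GET RUN TIME FIELD lv_end.

" Convert the elapsed microseconds to seconds
lv_seconds = ( lv_end - lv_start ) / 1000000.
WRITE: / |Runtime: { lv_seconds } seconds|.
```

Single measurements can vary considerably between runs (database cache, system load), so repeating each measurement a few times gives a more reliable comparison.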

Sequential reading of all data

The classic way to read data from the database was record-by-record processing in a SELECT ... ENDSELECT loop. For the reasons given above, this performs very poorly:

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF @ls_mara
  WHERE matkl IN @s_matkl.

  SELECT *
    FROM marc
    INTO CORRESPONDING FIELDS OF @ls_marc
    WHERE matnr EQ @ls_mara-matnr.

  ENDSELECT.

ENDSELECT.

Runtime: 41 seconds

Sequential reading of missing data

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT *
    FROM marc
    APPENDING CORRESPONDING FIELDS OF TABLE @lt_marc
    WHERE matnr EQ @<s_mara>-matnr.

ENDLOOP.

Runtime: 38 seconds

Reading via “FOR ALL ENTRIES IN”

If you already have an internal table and need to read the matching records from another database table for each of its rows, you can use the addition FOR ALL ENTRIES IN.

Make sure that the existing internal table is not empty. If it is, the WHERE condition is ignored and the selection is made without any restriction!

SELECT matnr, matkl
  FROM mara
  INTO CORRESPONDING FIELDS OF TABLE @lt_mara
  WHERE matkl IN @s_matkl.


SELECT *
  FROM marc
  INTO CORRESPONDING FIELDS OF TABLE @lt_marc
  FOR ALL ENTRIES IN @lt_mara
  WHERE matnr EQ @lt_mara-matnr.

Runtime: 0.7 seconds
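
The warning about an empty driver table can be turned into a simple guard. The following is a minimal sketch, assuming lt_mara and lt_marc are declared as in the example above. Clearing lt_marc in the ELSE branch is an illustrative choice and was not part of the original measurement:

```abap
" Sketch only: lt_mara and lt_marc are assumed to be declared as in the example above.
IF lt_mara IS NOT INITIAL.
  " Driver table has rows, so the WHERE condition restricts the selection
  SELECT *
    FROM marc
    INTO CORRESPONDING FIELDS OF TABLE @lt_marc
    FOR ALL ENTRIES IN @lt_mara
    WHERE matnr EQ @lt_mara-matnr.
ELSE.
  " No materials selected: skip the database access instead of reading all of MARC
  CLEAR lt_marc.
ENDIF.
```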

Processing internal tables

Once all required data has been read from the database into internal tables, it has to be read from those internal tables during processing. If an internal table has many entries, inefficient access can still lead to long runtimes, even though everything is already in memory.

For the following examples we always measure with the following number of records. The internal tables correspond to the examples from above.

  • MARA: 83813
  • MARC: 190384

Sequential access

If you have a “standard table” (the default when no table category is specified in the declaration of the internal table), access by key is sequential: the rows are checked one after the other.

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.
  ENDLOOP.

ENDLOOP.

Runtime: 931 seconds

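A special case is the binary search, which can be requested with the addition BINARY SEARCH of READ TABLE.

The table must be sorted in ascending order by the search key. Because READ TABLE returns only a single row, the approach fits best when that key is unique.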

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc.

... (data selection)

SORT lt_mara BY matnr ASCENDING.

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr BINARY SEARCH.

ENDLOOP.

Runtime: 0.6 seconds

Secondary key

Since ABAP 7.0 EhP2, internal tables can have secondary keys. They work much like a database index on additional fields: you can define individual fields of the internal table for optimized access without having to change the type of the table itself:

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE TABLE OF t_marc
              WITH NON-UNIQUE SORTED KEY k1 COMPONENTS matnr. " non-unique

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) USING KEY k1 WHERE matnr EQ <s_mara>-matnr.

  ENDLOOP.

ENDLOOP.

Runtime: 0.6 seconds

DATA: lt_mara TYPE TABLE OF t_mara
              WITH UNIQUE HASHED KEY k1 COMPONENTS matnr, " unique
      lt_marc TYPE TABLE OF t_marc.

... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY k1 COMPONENTS matnr = <s_marc>-matnr.

ENDLOOP.

Runtime: 0.5 seconds

Table type SORTED/HASHED

If you have the option, you should specifically use the correct table type for processing a large number of data records (especially > 10 000). Here you can distinguish between a SORTED TABLE and a HASHED TABLE. As the SAP blog post shows, the access time for the two types of tables behaves as follows:

READ … WITH KEY

| Complexity | Standard | Sorted | Hashed |
| --- | --- | --- | --- |
| O(1) | Access via index | Access via index | KEY contains the whole table key |
| O(log n) | Table is sorted by KEY and BINARY SEARCH has been added | KEY contains the whole or the beginning part of the table key | - |
| O(n) | Table not sorted by KEY, or no BINARY SEARCH | KEY does not contain the first field of the table key | KEY does not contain the whole table key |

LOOP AT … WHERE

| Complexity | Standard | Sorted | Hashed |
| --- | --- | --- | --- |
| O(1) | - | - | WHERE contains the whole key |
| O(log n) | The table needs to be sorted according to the WHERE clause + workaround: 1. First find the starting index with READ … BINARY SEARCH, 2. LOOP AT … WHERE … FROM INDEX | WHERE contains the whole or the beginning part of the table key | - |
| O(n) | Every other LOOP AT … WHERE | WHERE does not contain the first field of the table key | WHERE does not contain the whole table key |

Note in particular that using the wrong table type can even make things worse (see the O(n) cases for hashed tables).
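
The O(log n) workaround for standard tables mentioned in the LOOP AT overview (find the starting index with a binary search, then loop from that index) could look roughly like the sketch below. It assumes the row types t_mara and t_marc from the other examples, each with a component matnr, and it relies on READ TABLE ... BINARY SEARCH positioning on the first of several rows with the same key. That behaviour is worth verifying on your own release:

```abap
" Sketch under assumptions: t_mara / t_marc are the row types used in the
" other examples and both contain a component matnr.
DATA: lt_mara TYPE STANDARD TABLE OF t_mara WITH DEFAULT KEY,
      lt_marc TYPE STANDARD TABLE OF t_marc WITH DEFAULT KEY.

" ... (data selection as in the examples above)

" The binary search only works on a table sorted by the search key
SORT lt_marc BY matnr ASCENDING.

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  " 1. Find the index of the first MARC row for this material
  READ TABLE lt_marc TRANSPORTING NO FIELDS
       WITH KEY matnr = <s_mara>-matnr BINARY SEARCH.
  IF sy-subrc <> 0.
    CONTINUE. " no plant data for this material
  ENDIF.

  " 2. Read on sequentially from that index and stop at the next material
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) FROM sy-tabix.
    IF <s_marc>-matnr <> <s_mara>-matnr.
      EXIT.
    ENDIF.
    " ... process <s_marc>
  ENDLOOP.
ENDLOOP.
```

The inner loop is left with EXIT as soon as the material number changes, so each material only touches its own block of rows instead of the whole table. A sorted table or a sorted secondary key gives the same effect with less code, which is why this pattern is mainly useful when the table category cannot be changed.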

DATA: lt_mara TYPE TABLE OF t_mara,
      lt_marc TYPE SORTED TABLE OF t_marc
              WITH UNIQUE KEY matnr werks.

... (data selection)


LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>) WHERE matnr EQ <s_mara>-matnr.

  ENDLOOP.

ENDLOOP.

Runtime: 0.5 seconds

DATA: lt_mara TYPE HASHED TABLE OF t_mara
              WITH UNIQUE KEY matnr,
      lt_marc TYPE TABLE OF t_marc.

... (data selection)

LOOP AT lt_marc ASSIGNING FIELD-SYMBOL(<s_marc>).
  READ TABLE lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>) WITH KEY matnr = <s_marc>-matnr.

ENDLOOP.

Runtime: 0.5 seconds

Changing data of an internal table

When you only iterate over a table, it makes little difference whether you use a target structure with LOOP AT ... INTO or a field symbol with LOOP AT ... ASSIGNING. LOOP AT ... INTO is by now so well optimized in the ABAP kernel that almost no runtime difference to LOOP AT ... ASSIGNING can be measured.

As soon as you want to change data in an internal table, however, it makes a big difference whether you work with a structure or a field symbol. With a structure, each row is copied out and has to be written back with MODIFY, whereas a field symbol changes the row in place. In the following examples the number of MARA records was increased to 804,019 so that the difference is easier to measure:

DATA: lt_mara TYPE TABLE OF t_mara.

... (data selection)

LOOP AT lt_mara INTO DATA(ls_mara).
  ls_mara-aenam = sy-uname.
  MODIFY lt_mara FROM ls_mara TRANSPORTING aenam.

ENDLOOP.

Runtime: 1.9 seconds

DATA: lt_mara TYPE TABLE OF t_mara.

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  <s_mara>-aenam = sy-uname.

ENDLOOP.

Runtime: 0.3 seconds

Caching of frequently read data

If there are recurring accesses to a table from different parts of the program, it is worth thinking about a suitable caching method.

For the following examples the data from the first example is used again:

  • MARA: 83813

Note that all materials in the MARA table belong to the same material group (MATKL).

Re-reading without cache

The classic approach is to simply select from the database inside the loop. A cache in the database may take effect here and speed things up a little, but reading the same data over and over again is still counterproductive.

DATA: lt_mara TYPE TABLE OF t_mara.

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT SINGLE pernr
    FROM pa0105
    INTO @DATA(lv_pernr)
    WHERE usrid EQ @<s_mara>-aenam.

ENDLOOP.

Runtime: 34 seconds

Reading via local cache

A local cache works on the principle of “I look to see if the record is already in the internal table - if not, I read it from the database”. For this purpose, a “proxy class” with optimized access to the database or internal cache can be used:

DATA: gt_pa0105 TYPE HASHED TABLE OF pa0105
                WITH UNIQUE KEY usrid.
                
METHOD get_pernr.

  READ TABLE gt_pa0105 ASSIGNING FIELD-SYMBOL(<s_pa0105>) WITH KEY usrid = iv_uname.
  IF sy-subrc NE 0.
    SELECT SINGLE *
      FROM pa0105
      INTO @DATA(ls_pa0105)
      WHERE usrid EQ @iv_uname.
    IF sy-subrc NE 0.
      " fake entry
      ls_pa0105-usrid = iv_uname.

    ENDIF.

    INSERT ls_pa0105
      INTO TABLE gt_pa0105 ASSIGNING <s_pa0105>.

  ENDIF.

  rv_pernr = <s_pa0105>-pernr.

ENDMETHOD.

The calling loop then only asks the cache:

DATA: lt_mara TYPE TABLE OF t_mara.

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  DATA(lv_pernr) = get_pernr( <s_mara>-aenam ).

ENDLOOP.

Runtime: 0.7 seconds

Buffered table

Tables can also be marked for buffering in their technical settings (SE11 or SE13). If a table (or record) is buffered, it is kept in the table buffer on the application server, so the select on the database is skipped whenever the result is already in the buffer.

Simply marking the table for buffering does not guarantee that the application server can also store this data in the buffer. Depending on the buffer size or the number of active records in the buffer, buffered records can also be displaced from the buffer again.

DATA: lt_mara TYPE TABLE OF t_mara.

... (data selection)

LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<s_mara>).
  SELECT SINGLE name1 " USR03 is buffered
    FROM usr03
    INTO @DATA(lv_name1)
    WHERE bname EQ @<s_mara>-aenam.

ENDLOOP.

Runtime: 0.8 seconds