inMemory on Oracle RAC

In my last blog I described my first experience with inMemory at a customer side. I was really impressed about the massive performance gain and the ease of use. Now it’s time for a second trial. As stated earlier the customer was using a single instance database as test environment but the production and preproduction were running on Oracle RAC. This again shows how important it is to have the same environment for preproduction and production – or better for test as well as that whould have shown the non comparable results early enough and not just before it went to production.

As with the first blob we had a couple of simple and complex queries which I’m unfortunately not allowed to use here. So I will use my own environment again.
The first query is a simple count on a simple table while the more complex is a join with 4 dimension tables and one fact table so a simple star scheme. At the customer we had some more complex snowflake schemes with up to 4 facts and couple of dimensions but the results where exactly the same as described here.
First we again load the tables with the fact table as the most critical and the dimensions with priority “high”.
As usual I was not reading the entire document first so to my suprise after loading the tables the first query (SELECT COUNT(*) FROM umsaetze” took more than 10 (!) minutes. Remember the query on a single instance without inMemory was using about 40 seconds and with inMemory around one! So what did we do wrong? Lets look for the data population:

SELECT inst_id, segment_name,  inmemory_size, bytes, bytes_not_populated, populate_status
  FROM v$im_user_segments

SEGMENT_NAME         INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED POPULATE_
-------------------- ------------- ---------- ------------------- ---------
REGION                     1179648     393216                   0 COMPLETED
KUNDEN                    11665408   83886080            40189952 COMPLETED
UMSAETZE                1452670976 6978273280          3163938816 COMPLETED
ZEITEN                     1179648     131072                   0 COMPLETED

Okay, no wonder: While REGION and ZEITEN (TIMES) have been entirely loaded KUNDEN (CUSTOMERS) and the fact table UMSAETZE (REVENUE) have only been loaded partly. After talking to the DBA (we had no direct access to the database so had to phone the DBA) we found that we really did not had sufficient memory. So he increased the parameter inmemory_size so it should be suitable for the all tables.

Second trial – no difference, again 400 seconds and the query on v$im_user_segments shows the same result. But we are on a RAC database. So v$ is probably not the right view but gv$ is:

SELECT inst_id, segment_name,  inmemory_size, bytes, bytes_not_populated, populate_status
  FROM gv$im_user_segments

   INST_ID SEGMENT_NAME         INMEMORY_SIZE      BYTES BYTES_NOT_POPULATED POPULATE_
---------- -------------------- ------------- ---------- ------------------- ---------
         2 REGION                     1179648     393216                   0 COMPLETED
         2 KUNDEN                    11665408   83886080            40189952 COMPLETED
         2 UMSAETZE                1452670976 6978273280          3163938816 COMPLETED
         2 ZEITEN                     1179648     131072                   0 COMPLETED
         1 KUNDEN                    12713984   83886080            39043072 COMPLETED
         1 UMSAETZE                1237254144 6978273280          3740106752 COMPLETED

Parallelism

As you can see, the tables KUNDEN and UMSATZE are completely populated but distributed between the two instances. Okay if I had read the documentation I would have realized that the tables will be distributed between the instances when using RAC. This implies that you have to use inter instance parallelism for your queries.

So next trial with the parameter “parallel_degree_policy=AUTO;”

So let’s execute the query again:

SELECT count(*)
FROM
 umsaetze

SQL>
  COUNT(*)
----------
 144500000

Elapsed: 00:00:00.93


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          2          0           0
Execute      1      0.01       0.27          0          5          0           0
Fetch        2      0.05       0.56          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.08       0.85          0          7          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=5 pr=0 pw=0 time=784786 us)
        14         14         14   PX COORDINATOR  (cr=5 pr=0 pw=0 time=479066 us)
         0          0          0    PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
         0          0          0     SORT AGGREGATE (cr=0 pr=0 pw=0 time=0 us)
         0          0          0      PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us cost=2183 size=0 card=144500000)
         0          0          0       TABLE ACCESS INMEMORY FULL UMSAETZE (cr=0 pr=0 pw=0 time=0 us cost=2183 size=0 card=144500000)
		 

Impressive: less than one second!

So the result is to use parallelism when using RAC (actually the documentation just says that).

Complex Query

But I didn’t had a good feeling. What about a more complex query that might not benefit from parallelism? Let me run the complex query (again with automatic parallelism):

SELECT sum(u.umsatz) as gesamtumsatz,
       a.artikelgruppe,
       z.monat_des_jahres,
       z.jahreszahl,
       r.region,
       k.plz,
       k.ort
  FROM region r
       INNER JOIN umsaetze u ON (r.region_id = u.region_id)
       INNER JOIN kunden k ON (k.kunden_id = u.kunden_id)
       INNER JOIN artikel a ON (a.artikel_id = u.artikel_id)
       INNER JOIN zeiten z ON (zeiten_id = zeit_id)
 WHERE r.region = 'Nord'
   AND z.jahreszahl in (2014,2015)
   AND z.monat_des_jahres = 12
 GROUP BY a.artikelgruppe, z.monat_des_jahres, z.jahreszahl, k.plz, k.ort, r.region
 FETCH FIRST 10 PERCENT ROWS ONLY

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.20       0.52          1          5          0           0
Execute      1      0.00       0.05          0          0          0           0
Fetch       88     78.74     309.07     602615     626637          0        1304
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       90     78.94     309.64     602616     626642          0        1304

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
      1304       1304       1304  VIEW  (cr=626637 pr=602615 pw=0 time=309120365 us cost=61732 size=188525563 card=1376099)
     13031      13031      13031   WINDOW BUFFER (cr=626637 pr=602615 pw=0 time=309125958 us cost=61732 size=112840118 card=1376099)
     13031      13031      13031    HASH GROUP BY (cr=626637 pr=602615 pw=0 time=309111308 us cost=61732 size=112840118 card=1376099)
    511356     511356     511356     HASH JOIN  (cr=626637 pr=602615 pw=0 time=302369604 us cost=35455 size=112840118 card=1376099)
       374        374        374      TABLE ACCESS INMEMORY FULL ARTIKEL (cr=5 pr=3 pw=0 time=18590 us cost=1 size=5984 card=374)
    511356     511356     511356      HASH JOIN  (cr=626632 pr=602612 pw=0 time=302126752 us cost=35451 size=90822534 card=1376099)
   1000000    1000000    1000000       TABLE ACCESS INMEMORY FULL KUNDEN (cr=4770 pr=4766 pw=0 time=75903 us cost=115 size=22000000 card=1000000)
    511356     511356     511356       HASH JOIN  (cr=621862 pr=597846 pw=0 time=300198624 us cost=30075 size=60548356 card=1376099)
      1351       1351       1351        TABLE ACCESS INMEMORY FULL REGION (cr=41 pr=37 pw=0 time=9573 us cost=1 size=12159 card=1351)
   3128202    3128202    3128202        HASH JOIN  (cr=621821 pr=597809 pw=0 time=302970633 us cost=30058 size=217609035 card=6217401)
        32         32         32         JOIN FILTER CREATE :BF0000 (cr=14 pr=0 pw=0 time=626 us cost=1 size=594 card=54)
        32         32         32          TABLE ACCESS INMEMORY FULL ZEITEN (cr=14 pr=0 pw=0 time=492 us cost=1 size=594 card=54)
   3128202    3128202    3128202         JOIN FILTER USE :BF0000 (cr=621807 pr=597809 pw=0 time=301926572 us cost=29693 size=3468000000 card=144500000)
   3128202    3128202    3128202          TABLE ACCESS INMEMORY FULL UMSAETZE (cr=621807 pr=597809 pw=0 time=301826778 us cost=29693 size=3468000000 card=144500000)

Frustrating! Even though both inMemory and BloomFilter (BF) have been used the query took about 300 seconds (instead of 70 without inMemory and 3 seconds with single instance and inMemory). Very disappointing!

So queries which cannot benefit from the parallelism will take much longer than on a single instance database. Looking in the documentation I found that there is an option “DUPLICATE” with the ALTER TABLE INMEMORY command (e.g. ALTER TABLE umsaetze INMEMORY DUPLICATE PRIORITY CRITICAL). Unforunately that parameter is for engineered systems only. So even though the parameter is correctly shown in the v$im_user_segments it does not work. The tables will always be distributed.

We validate that inMemory works on that platform in principial we asked the DBA to shut down one instance and as expected the query run in the expected time:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.25       0.27          0          5          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       88      1.50       1.61          0         17          0        1304
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       90      1.76       1.89          0         22          0        1304

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 103
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
      1304       1304       1304  VIEW  (cr=17 pr=0 pw=0 time=1606149 us cost=18577 size=1785247 card=13031)
     13031      13031      13031   WINDOW BUFFER (cr=17 pr=0 pw=0 time=1611588 us cost=18577 size=1068542 card=13031)
     13031      13031      13031    HASH GROUP BY (cr=17 pr=0 pw=0 time=1600326 us cost=18577 size=1068542 card=13031)
    511356     511356     511356     HASH JOIN  (cr=17 pr=0 pw=0 time=1432922 us cost=14973 size=41931192 card=511356)
       374        374        374      TABLE ACCESS INMEMORY FULL ARTIKEL (cr=5 pr=0 pw=0 time=157 us cost=1 size=5984 card=374)
    511356     511356     511356      HASH JOIN  (cr=12 pr=0 pw=0 time=1335242 us cost=14971 size=33749496 card=511356)
    511356     511356     511356       HASH JOIN  (cr=9 pr=0 pw=0 time=827741 us cost=11851 size=22499664 card=511356)
      1351       1351       1351        TABLE ACCESS INMEMORY FULL REGION (cr=3 pr=0 pw=0 time=21736 us cost=4 size=12159 card=1351)
   3128202    3128202    3128202        HASH JOIN  (cr=6 pr=0 pw=0 time=705636 us cost=11839 size=118395025 card=3382715)
        32         32         32         JOIN FILTER CREATE :BF0000 (cr=3 pr=0 pw=0 time=29789 us cost=4 size=594 card=54)
        32         32         32          TABLE ACCESS INMEMORY FULL ZEITEN (cr=3 pr=0 pw=0 time=29723 us cost=4 size=594 card=54)
   3128202    3128202    3128202         JOIN FILTER USE :BF0000 (cr=3 pr=0 pw=0 time=189813 us cost=11471 size=3468000000 card=144500000)
   3128202    3128202    3128202          TABLE ACCESS INMEMORY FULL UMSAETZE (cr=3 pr=0 pw=0 time=180968 us cost=11471 size=3468000000 card=144500000)
   1000000    1000000    1000000       TABLE ACCESS INMEMORY FULL KUNDEN (cr=3 pr=0 pw=0 time=30677 us cost=151 size=22000000 card=1000000)

Less than two seconds for the query.

Conclusion

But what is the outcome?
I don’t see any benefit of distributing tables across cluster instances. Yes a simple query might use parallelism but I cannot rely on it and on the opposite the query might run 10 times slower than without inMemory. Users with ngineered systems might want to use the option “DUPLICATE” to populate the entire tables on all RAC instances. That’s for sure the most effective way of using inMemory with RAC as you can distribute the connections. In the event of an instance failure the tables will be automatically redistributed so inMemory is fail safe. Customers with RAC on non engineered systems need a workaround!

Workaround

Actually there are two workarounds available. I’ve tested one hat it works okay: There is an undocumented parameter named “_inmemory_auto_distribute”. If you set this parameter on both instances you can reload the tables on both instances. So it looks like it is similar to the DUPLICATE parameter – no idea why Oracle doesn’t make that documented. Another solution will be use a parallel instance group service (paralle_instance_group) that is bound to one instance. I will test that and let you know if that’s preferrable or not. In the meantime I would appreciate your feedback and experience.

4 thoughts on “inMemory on Oracle RAC”

  1. Hi,

    In RAC the plan is serial, that is why you see lots of physical reads and performance is not good. A serial query cannot access the IMC store on another instance. You can try with a parallel hint and compare. Can you please send me the 10053 trace for the RAC case so that I can have a look why the plan is serial.

    1. Johannes Ahrends

      I’ve not yet had enough time to investigate on the query itself. But on the other hand: what if a query cannot be parallelized? And what if, like at the customer side only one instance is dedicated for the BI application?

  2. Hi Johannes,

    The “_inmemory_auto_distribute” parameter has been disabled in the first bundle patch for Database In-Memory. It also violates the license agreement since the DUPLICATE option requires an engineered system. As for your observation about RAC and distributed objects, that is why you have to set parallel_degree_policy to AUTO, to insure that objects will be accessed using parallel query on all of the RAC nodes. And lastly, an object must be at least 64KB in size to be eligible to be populated in the In-Memory column store.

    Regards,

    Andy

  3. Johannes Ahrends

    Thanks for the comment on 64k – that’s why one table was missing. I will check out how we can make use of IM at that particular customer. currently they decided to stop the investigation on IM.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top