Data loss with Database Services

In one of my projects we had a disaster most recently: An application was connected to two databases simultaneously and propagated changes to both of them.

But why was this possible? Why did we use two databases? And why is it possible for a service to connect to two databases – that shouldn’t work, should it?

IT WORKS!
Here are some details:
Because of a failure of the application an older version of a pluggable database had to be checked. This isn’t a problem as we are using NetApp as storage provider and SnapCenter for backup and recovery. This is a nice solution because you can clone databases of an arbitrary size within minutes.

In this specific situation a colleague created a point in time clone of the production database to Friday 5 pm. The clone had to reside on the same server as the production because it’s not allowed to use production database on any other stage (i.e. Test, Development, Preproduction).
The colleague opened the database for read and write access to check the data and export some tables. As usual the first export failed with ORA-01555 “Snapshot too old”. So he decided to continue next Monday.
Nobody realized that since Oracle version 12.2 all services (!) are automatically started while opening the database (no matter if it is a PDB or NON-CDB).

NOTE: To be more precise: Only those services will be started which have been created and started before the last “SAVE STATE” command issued for the PDB.

The services registered with the listener (works as designed). But so several services from the clone database registered with the listener with the same name as the production database. Any connect from those services was randomly connected to the production and the clone.

The Data Loss

In this case the application onl realized the problem by Tuesday. So 5 days of data loss because one can imagine that it was impossible to restore the missing transactions on the production database.
In the past we used this kind of cloning many times without any problems. The reason was that up to 12.1 services are not automatically started while opening the PDB.

Solution

Unfortunately there is no solution for this problem. I opened an SR with Oracle support but it was closed because it is worked as designed and the listener accepts multiple services from different instances.

Warning

Please be very careful with database cloning. The issue occurs regardless of whether you use Multitenant or not. The problem exists as soon as multiple databases connect to a single listener. But especially with Multitenant Database and many pluggable databases it could easily happen that a service is accidentally created twice. If both services run within the same databas, the second service will not start. But if they belong to different databases both services will register.
Even though this will often end with error messages as the same user or the same table might not exist in both PDBs it could happen, for example if you are using Enterprise User Security (global defined Users) and similar database layouts like with a tenant architecture. Or with clones of course.

Example:

oracle@server1[MX011]% srvctl add service -db MX01 -pdb jospdb -service jo_demo -preferred MX011 -available MX012
oracle@server1[MX011]% srvctl start service -db MX01 -service jo_demo
oracle@server1[MX011]% srvctl add service -db MX02 -pdb joszweitepdb -service jo_demo -preferred MX021 -available MX022
oracle@server1[MX011]% srvctl start service -db MX02 -pdb joszweitepdb
oracle@server1[MX011]% lsnrctl status |grep -A2 jo_demo
Service "jo_demo" has 2 instance(s).
   Instance "MX011", status READY, has 1 handler(s) for this service…
   Instance "MX021", status READY, has 1 handler(s) for this service…

I would like to know from Oracle why it is possible that services are able to register with the listener even though the databases are different. I don’t see the need for this “feature”.

Workaround for Cloning

  1. Stop the listener before opening the database.
  2. Open the (Pluggable) Database and stop the services.
  3. Start the listener again.

As you can see it´s mandatory that you don’t use the default service (name of the database) within your application as you are unable to stop those services. So on the positive side: This gives an example why you should always create application specific services.

A better method is of course to clone the database to a different server (probably not production). But as already said: That’s not always allowed.

Conclusion

Be careful while cloning a database. It could end up in a disaster!

Addendum

If you are using Multitenant Database, be aware that opening a pluggable database will automatically start the listener again!

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top