Wednesday, April 8, 2020

BigQuery Materialized Views and Oracle Materialized Views

One common one-to-many replication setup in Oracle databases works, at a high level, like this: a single master transactional database holds the transactions, and a materialized view log (mview log) is created on the master table.


All the other reporting databases then subscribe their respective materialized views (MViews) to this log table. The MViews stay in sync with the master table through either incremental (fast) refresh or complete refresh. While everything works, it is fine, but when things break it gets ugly, and I mean ugly. The MViews in the reporting databases can lag behind the master log because of a network issue or because the master database goes down. A complete refresh is also a nightmare that involves lots of purging and tinkering. The more subscribing MViews there are, the more hassle it is when things break.
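To make the pain point concrete, here is a toy Python model of the setup described above. It is only a sketch under my own assumptions, not how Oracle implements it: the class names (MasterTable, Subscriber) and the sequence-number bookkeeping are invented for illustration. It shows the one property that matters here: log entries can only be purged once every subscriber has applied them, so a single lagging subscriber keeps the log growing.

```python
from dataclasses import dataclass, field


@dataclass
class MasterTable:
    """Toy master table with a change log (stands in for an mview log)."""
    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # (seq, key, value) entries
    next_seq: int = 1

    def upsert(self, key, value):
        # Every change is applied to the table and recorded in the log.
        self.rows[key] = value
        self.log.append((self.next_seq, key, value))
        self.next_seq += 1

    def purge_log(self, subscribers):
        # An entry can be dropped only after ALL subscribers have applied it.
        low_water = min(s.applied_seq for s in subscribers)
        self.log = [entry for entry in self.log if entry[0] > low_water]


@dataclass
class Subscriber:
    """Toy reporting-side MView that tracks how far it has read the log."""
    rows: dict = field(default_factory=dict)
    applied_seq: int = 0

    def fast_refresh(self, master):
        # Incremental refresh: apply only log entries not seen yet.
        for seq, key, value in master.log:
            if seq > self.applied_seq:
                self.rows[key] = value
                self.applied_seq = seq

    def complete_refresh(self, master):
        # Complete refresh: throw away local state and copy everything.
        self.rows = dict(master.rows)
        self.applied_seq = master.next_seq - 1


master = MasterTable()
a, b = Subscriber(), Subscriber()
master.upsert("t1", 100)
master.upsert("t2", 250)

a.fast_refresh(master)
master.purge_log([a, b])   # b has not refreshed yet, so nothing is purged
print(len(master.log))     # prints 2

b.fast_refresh(master)
master.purge_log([a, b])   # both are caught up, so the log can be emptied
print(len(master.log))     # prints 0
```

With many subscribers, the low-water mark is set by the slowest one, which is why every extra subscribing MView adds to the cleanup work when something breaks.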

BigQuery is Google's managed data warehousing service, and it now offers materialized views. If you have ever managed Oracle MViews, it brings you to tears to learn that BigQuery MViews offer the following:

Zero maintenance: A materialized view is recomputed in the background once the base table has changed. All incremental data changes from the base tables are automatically added to the materialized views. No user input is required.

Always fresh: A materialized view is always consistent with the base table, including BigQuery streaming tables. If a base table is modified via update, merge, partition truncation, or partition expiration, BigQuery will invalidate the impacted portions of the materialized view and fully re-read the corresponding portion of the base table. For an unpartitioned materialized view, BigQuery will invalidate the entire materialized view and re-read the entire base table. For a partitioned materialized view, BigQuery will invalidate the affected partitions of the materialized view and re-read the entire corresponding partitions from the base table. Partitions that are append-only are not invalidated and are read in delta mode. In other words, there will never be a situation when querying a materialized view results in stale data.

Smart tuning: If a query or part of a query against the source table can instead be resolved by querying the materialized view, BigQuery will rewrite (reroute) the query to use the materialized view for better performance and/or efficiency.
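The invalidation rule described under "Always fresh" can be sketched with a small Python model. This is an illustration under my own assumptions, not BigQuery's implementation: the class PartitionedSumView, its fields, and the rows_read counter are all hypothetical. It only mirrors the stated behaviour, where append-only partitions are read in delta mode while an updated partition is invalidated and fully re-read.

```python
from collections import defaultdict


class PartitionedSumView:
    """Toy materialized view: SUM(amount) per partition of a base table."""

    def __init__(self):
        self.base = defaultdict(list)  # partition -> list of amounts
        self.sums = {}                 # partition -> materialized SUM
        self.seen = defaultdict(int)   # partition -> rows already folded in
        self.invalid = set()           # partitions that need a full re-read
        self.rows_read = 0             # base rows scanned by refreshes

    def append(self, partition, amount):
        # Appends do not invalidate anything.
        self.base[partition].append(amount)

    def update(self, partition, index, amount):
        # An update invalidates the whole affected partition.
        self.base[partition][index] = amount
        self.invalid.add(partition)

    def refresh(self):
        for partition, rows in self.base.items():
            if partition in self.invalid or partition not in self.sums:
                # Invalidated or new partition: re-read it entirely.
                self.sums[partition] = sum(rows)
                self.rows_read += len(rows)
            else:
                # Append-only partition: read only the new rows (delta mode).
                delta = rows[self.seen[partition]:]
                self.sums[partition] += sum(delta)
                self.rows_read += len(delta)
            self.seen[partition] = len(rows)
        self.invalid.clear()


view = PartitionedSumView()
view.append("2020-04-07", 10)
view.append("2020-04-07", 20)
view.append("2020-04-08", 5)
view.refresh()                    # first build reads all 3 rows

view.append("2020-04-08", 7)      # append-only change
view.update("2020-04-07", 0, 15)  # update invalidates that partition
view.refresh()                    # reads 1 delta row plus 2 re-read rows

print(view.sums)       # prints {'2020-04-07': 35, '2020-04-08': 12}
print(view.rows_read)  # prints 6
```

The second refresh scans three rows instead of all four, and the saving grows with the number of untouched or append-only partitions.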

In my initial testing, things work like a charm and a refresh takes a couple of minutes at most. I will be posting some tests here very soon. Suffice it to say that delegating the management of MView refreshes to Google is reason enough to move to BigQuery.

