Unlocking Proprietary Data with PostgreSQL Foreign Data Wrappers Pat Patterson Principal Developer Evangelist [email protected] @metadaddy Agenda Foreign Data Wrappers Writing FDW’s in C Multicorn Database.com FDW for PostgreSQL FDW in action Why Foreign Data Wrappers? External data sources look like local tables! – Other SQL database • MySQL, Oracle, SQL Server, etc – NoSQL database • CouchDB, Redis, etc – File – LDAP – Web services • Twitter! Why Foreign Data Wrappers? Make the database do the work – SELECT syntax • DISTINCT, ORDER BY etc – Functions • COUNT(), MIN(), MAX() etc – JOIN external data to internal tables – Use standard apps, libraries for data analysis, reporting Foreign Data Wrappers 2003 - SQL Management of External Data (SQL/MED) 2011 – PostgreSQL 9.1 implementation – Read-only – SELECT-clause optimization – WHERE-clause push-down • Minimize data requested from external source Future Improvements – JOIN push-down • Where two foreign tables are in the same server – Support cursors FDW’s in PostgreSQL ‘Compiled language’ (C) interface Implement a set of callbacks typedef struct FdwRoutine { NodeTag type; /* These functions are required. */ GetForeignRelSize_function GetForeignRelSize; GetForeignPaths_function GetForeignPaths; GetForeignPlan_function GetForeignPlan; ExplainForeignScan_function ExplainForeignScan; BeginForeignScan_function BeginForeignScan; IterateForeignScan_function IterateForeignScan; ReScanForeignScan_function ReScanForeignScan; EndForeignScan_function EndForeignScan; /* These functions are optional. */ AnalyzeForeignTable_function AnalyzeForeignTable; } FdwRoutine; FDW’s in PostgreSQL Much work! • CouchDB FDW • https://github.com/ZhengYang/couchdb_fdw/ • couchdb_fdw.c > 1700 LoC Multicorn http://multicorn.org/ PostgreSQL 9.1+ extension Python framework for FDW’s Implement two methods… Multicorn from multicorn import ForeignDataWrapper class ConstantForeignDataWrapper(ForeignDataWrapper): def __init__(self, options, columns): super(ConstantForeignDataWrapper, self).__init__(options, columns) self.columns = columns def execute(self, quals, columns): for index in range(20): line = {} for column_name in self.columns: line[column_name] = '%s %s' % (column_name, index) yield line Database.com FDW for PostgreSQL OAuth login to Database.com / Force.com – Refresh on token expiry Force.com REST API – SOQL query • SELECT firstname, lastname FROM Contact Request thread puts records in Queue, execute() method gets them from Queue JSON parsing – skip embedded metadat < 250 lines code Demo Conclusion Foreign Data Wrappers make the whole world look like tables! Writing FDW’s in C is hard! – Or, at least, time consuming! Writing FDW’s in Python via Multicorn is easy! – Or, at least, quick! Try it for yourself! Resources http://wiki.postgresql.org/wiki/SQL/MED http://wiki.postgresql.org/wiki/Foreign_data_wrappers http://multicorn.org/ https://github.com/metadaddy-sfdc/Database.comFDW-for-PostgreSQL
© Copyright 2024