Schedule - FOSDEM PGDay 2020

PGSpider - High-performance SQL cluster engine.

Date: 2020-01-31
Time: 15:20–16:10
Room: PGDay at Hilton

In recent years, the number of edge and sensor devices in IoT systems has grown rapidly into the thousands or tens of thousands. When users store data on high-performance edge and sensor devices, they prefer to search the data directly on those devices rather than collecting it in the cloud. Users also want to handle this distributed big data as a single virtual table that they can search at high speed. For that reason, we developed PGSpider, which accesses a variety of data sources at high speed through PostgreSQL Foreign Data Wrappers (FDWs). By making full use of FDW pushdown and thread parallelization, PGSpider can acquire and process data at high speed.
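As a rough illustration of the two ideas named above, pushdown and thread parallelization, here is a minimal Python sketch. It is not PGSpider code and does not use the FDW API: the NODES dictionary, scan_node, and query_all are hypothetical names invented for this example, and the in-memory lists merely stand in for remote data sources. The filter is applied at each node so that only matching rows are returned (the pushdown), and all nodes are scanned concurrently before the results are merged into one list (the single virtual table).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote data sources (assumption, not PGSpider's
# data model); each node holds rows of the same sensor table.
NODES = {
    "edge_a": [{"sensor": "t1", "temp": 21.5}, {"sensor": "t2", "temp": 30.1}],
    "edge_b": [{"sensor": "t3", "temp": 35.0}],
    "edge_c": [{"sensor": "t4", "temp": 18.2}, {"sensor": "t5", "temp": 31.7}],
}


def scan_node(name, predicate):
    """Scan one node, applying the filter at the source (the "pushdown").

    Only matching rows are returned to the caller, and each returned row is
    tagged with the node it came from.
    """
    return [dict(row, node=name) for row in NODES[name] if predicate(row)]


def query_all(predicate):
    """Run the same filtered scan on every node in parallel and merge."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        per_node = pool.map(lambda name: scan_node(name, predicate), NODES)
        # map() yields results in submission order, so the merged list is
        # deterministic even though the scans run concurrently.
        return [row for rows in per_node for row in rows]


hot = query_all(lambda row: row["temp"] > 30.0)
for row in hot:
    print(row["node"], row["sensor"], row["temp"])
```

Running the sketch prints one line per row with a temperature above 30.0, one from each node. In a real deployment the filter would be translated into the remote source's own query language rather than evaluated as a Python function.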

I will talk about the following points.

  • PGSpider fundamental mechanisms.
    • Internal structures and the mechanism for managing nodes arranged in a tree structure.
  • Multi-Tenant feature.
    • Access many tables located on multiple nodes as a single table.
  • Parallelization and multi-threading speed-ups.
    • Why multi-threading in PostgreSQL is hard.
  • Pushdown control when using multiple FDWs.
    • How to control FDWs that support pushdown and FDWs that don't.
  • Performance measurement.
    • Comparison with competing software.
  • Searching subtrees with a new "UNDER" clause.
  • Future development.

Slides

The following slides have been made available for this session:

Speaker

Taiga Katayama