MySQL Postgresql / PostGIS MySQL Postgresql / PostGIS postgresql postgresql

MySQL Postgresql / PostGIS


A few thoughts.

First PostgreSQL and MySQL are completely different beasts when it comes to performance tuning. So if you go the porting route be prepared to rethink your indexing strategies. Not only does PostgreSQL have a far more flexible indexing than MySQL, but the table approaches are very different also, meaning the appropriate indexing strategies are as different as the tactics are. Unfortunately this means you can expect to struggle a bit. If i could give advice I would suggest dropping all non-key indexes at first and then adding them back sparingly as needed.

The second point is that nobody here can likely give you a huge amount of practical advice at this point because we don't know the internals of your program. In PostgreSQL, you are best off indexing only what you need, but you can index functions' outputs (which is really helpful in cases like this) and you can index only part of a table.

I am more a PostgreSQL guy than a MySQL guy so of course I think you should go with PostgreSQL. However rather than tell you why etc. and have you struggle at this scale, I will tell you a few things that I would look at using if I were trying to do this.

  • Functional indexes
  • Write my own functions for indexes for related analysis
  • PostGIS is pretty amazing and very flexible

In the end, switching db's at this volume is going to be a learning curve, and you need to be prepared for that. However, PostgreSQL can handle the volume just fine.


The number of rows is quite irrelevant here. The question is how much of the point in polygon work that can be done by the index.

The answer to that depends on how big the polygons are.

PostGIS is very fast to find all points in the bounding box of a polygon. Then it takes more effort to find out if the point actually is inside the polygon.

If your polygons is small (small bounding boxes) the query will be efficient. If your polygons are big or have a shape that mekes the bounding box big then it will be less efficient.

If your polygons is more or less static there is work arounds. You can divide your polygons in smaller polygons and recreate the idnex. Then the index will be more efficient.

If your polygons is actually multipolygons the firs step is to split the multipolygons to polygons with ST_Dump and recreate and build an index on the result.

HTH

Nicklas