There's no practical limit AFAIK, but batching will let you recover from an error by marking the start of each batch in your input data.

Use synchronous_commit=off and a huge commit_delay to reduce fsync() costs (again, you seem to be doing this already). This won't help much if you've batched your work into big transactions, though.

INSERT or COPY in parallel from several connections. How many depends on your hardware's disk subsystem; as a rule of thumb, you want one connection per physical hard drive if using direct attached storage.

Set a high max_wal_size value (checkpoint_segments in older versions) and enable log_checkpoints. Look at the PostgreSQL logs and make sure it's not complaining about checkpoints occurring too frequently.

If and only if you don't mind losing your entire PostgreSQL cluster (your database and any others on the same cluster) to catastrophic corruption if the system crashes during the import, you can stop Pg, set fsync=off, start Pg, do your import, then (vitally) stop Pg and set fsync=on again. See non-durable settings in the Pg manual. Do not do this if there is already any data you care about in any database on your PostgreSQL install. If you set fsync=off you can also set full_page_writes=off; again, just remember to turn it back on after your import to prevent database corruption and data loss.

You should also look at tuning your system: use good quality SSDs for storage as much as possible. Good SSDs with reliable, power-protected write-back caches make commit rates incredibly faster. They're less beneficial when you follow the advice above (which reduces disk flushes / the number of fsync()s), but can still be a big help. Do not use cheap SSDs without proper power-failure protection unless you don't care about keeping your data.

If you're using RAID 5 or RAID 6 for direct attached storage, stop now. Back your data up, restructure your RAID array to RAID 10, and try again.
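As a minimal sketch of the settings discussed above (the values here are illustrative placeholders, not tuned recommendations for your hardware):

```sql
-- Per-session settings, applied on the import connection only:
SET synchronous_commit = off;
SET commit_delay = 100000;   -- microseconds; may require superuser privileges

-- postgresql.conf entries (cluster-wide; illustrative values):
--   max_wal_size = 20GB      -- checkpoint_segments on pre-9.5 versions
--   log_checkpoints = on
--   fsync = off              -- DANGER: import-only; revert immediately afterwards
--   full_page_writes = off   -- only ever together with fsync=off; revert as well
```

synchronous_commit=off is the safe one of these: it risks losing the most recent commits on a crash, but not corruption, unlike fsync=off.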
See populate a database in the PostgreSQL manual, depesz's excellent-as-usual article on the topic, and this SO question.

(Note that this answer is about bulk-loading data into an existing DB or creating a new one. If you're interested in DB restore performance with pg_restore or psql execution of pg_dump output, much of this doesn't apply, since pg_dump and pg_restore already do things like creating triggers and indexes after a schema+data restore finishes.)

If you can't use COPY, consider using multi-valued INSERTs if practical. Don't try to list too many values in a single VALUES clause, though; those values have to fit in memory a couple of times over, so keep it to a few hundred per statement.

Batch your inserts into explicit transactions, doing hundreds of thousands or millions of inserts per transaction.

Drop indexes before starting the import and re-create them afterwards. (It takes much less time to build an index in one pass than it does to add the same data to it progressively, and the resulting index is much more compact.)

If doing the import within a single transaction, it's safe to drop foreign key constraints, do the import, and re-create the constraints before committing. Do not do this if the import is split across multiple transactions, as you might introduce invalid data.

If you can take your database offline for the bulk import, use pg_bulkload.

The ideal solution would be to import into an UNLOGGED table without indexes, then change it to logged and add the indexes. Unfortunately, in PostgreSQL 9.4 there's no support for changing tables from UNLOGGED to logged.

To support the upsert feature, PostgreSQL added the ON CONFLICT target action clause to the INSERT command. The conflict target can be a column name in the particular table, or the name of a UNIQUE constraint. The conflict action can be DO NOTHING, which does nothing if the row is already present in the table, or DO UPDATE SET column_1 = value_1, ..., which updates some fields in the existing row. Note: if you are using an earlier version of PostgreSQL, you will need a workaround to get the upsert feature, as the ON CONFLICT clause is only available from PostgreSQL 9.5.
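A short illustration of the two ON CONFLICT actions described above, using a hypothetical counters table (the table and column names are made up for the example):

```sql
-- Hypothetical table for illustration.
CREATE TABLE counters (name text PRIMARY KEY, hits bigint NOT NULL DEFAULT 0);

-- DO NOTHING: silently skip the insert if the key already exists.
INSERT INTO counters (name) VALUES ('home')
ON CONFLICT (name) DO NOTHING;

-- DO UPDATE: update the existing row instead; EXCLUDED refers to the
-- row that was proposed for insertion.
INSERT INTO counters (name, hits) VALUES ('home', 1)
ON CONFLICT (name) DO UPDATE SET hits = counters.hits + EXCLUDED.hits;
```

The conflict target here is the column (name); naming the UNIQUE or primary key constraint via ON CONFLICT ON CONSTRAINT works as well.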
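The multi-valued INSERT and explicit-transaction batching described earlier can be sketched as follows; the table t and its columns are hypothetical, and the three-row VALUES list stands in for a batch of up to a few hundred rows:

```sql
BEGIN;

-- Keep each statement to a few hundred rows at most: the whole
-- VALUES list has to fit in memory a couple of times over.
INSERT INTO t (id, payload) VALUES
    (1, 'alpha'),
    (2, 'beta'),
    (3, 'gamma');

-- ...more multi-valued INSERTs for the rest of the batch...

-- One commit per large batch, not per row.
COMMIT;
```

Recording which batch you last committed in your input data lets you resume from that point after an error instead of restarting the whole load.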