pg_sql | Unsorted

Telegram-канал pg_sql - PostgreSQL

2806

English speaking PostgreSQL public chat. This group is for discussions on PostgreSQL-related topics and strives to provide best-effort support as well.

Subscribe to a channel

PostgreSQL

Maybe we should add a rule "if you're involved in crypto or blockchain stuff, dump your s**t somewhere else"? 🤔

Читать полностью…

PostgreSQL

We just set up a group dedicated to PG freelancers/small businesses in Europe. Topics will (probably) be jobs, rates, experiences w/ clients, improving visibility etc. pp.

Since I don't nearly know all of you out there, join us if you're interested:

/channel/+GUaaD3HwQ98zNGYy

Читать полностью…

PostgreSQL

Compared to cloud anything bare metal is already affordable. Check https://autobase.tech/

Читать полностью…

PostgreSQL

Good luck with that...

Читать полностью…

PostgreSQL

So it is timescale? My opinion is that storage is cheap and total size required depends on retention policy + Wal generated. So I don't get the problem statement here without numbers.

Читать полностью…

PostgreSQL

I guess, since I don't have like a lot of updates in the hypertable, and only it does append new data ingestion, so I can tell by logic that no need for autovacuuming, as there won't be like a lot of dead rows there in the hypertable.

and only depends on the policy set on both the hypertable and mvs refresh policy to handle this instead of autovacuum.... as I believe bt enabling autovacuum it could further complicate the problem as by default timescale has a lot of triggers like jobs, hypertable_chunks, etc... to be run when new data inserted, so I see no valid point here to enable it so far

Читать полностью…

PostgreSQL

Should it also be set for longer time to autovacuum longer than my refresh policy of my mvs?

Читать полностью…

PostgreSQL

So, we might have this situation here:
* you ingest data into your main table -> WAL gets written
* autovavuum doesn't bother looking at it yet, afterall only a tiny fraction of the table has been added
* as you left checkpoint_timeout at the default of 5 minutes, a CHECKPOINT is triggered
* you refresh your MV -> main table gets read
* main table gets visibility bits etc. refreshed, triggering a full page write, as there was a CHECKPOINT before

I can imagine that raising checkpoint_timeout to something significantly larger than your MV refresh rate will reduce your WAL count quite a bit.
That said, if your DB is in the TB range and you ingest a lot of new data, it's not a miracle that your WAL count is high too 🤷

Читать полностью…

PostgreSQL

Your MVs are aggregations? I.e., not _large_?

Читать полностью…

PostgreSQL

I'm also not sure if matvievs are WAL logged. My guess would be yes. Maybe a smart(er) approach to that part would help too 🤔

Читать полностью…

PostgreSQL

And what do you believe you'll achieve with it? A WAL segment is 16MB. It doesn't grow larger if you set archive_timeout, it will just force an archiving if you didn't fill the 16MB in 500 seconds.
Raise checkpoint_timeout. Activate wal_compression if you didn't do so yet. That will probably reduce your WAL throughput.

Читать полностью…

PostgreSQL

Oh, sorry I mean 500 it is by default in second, it is approximately 8 mins

Читать полностью…

PostgreSQL

yep, it is already 0, I didn't change it or configure it, so that's why I asked if I could increase it?

Maybe to 500ms?

Читать полностью…

PostgreSQL

But here archive_timeout if I increase it, it will take more time to switch to other wal segment, which will benefit in not have a frequent short time of writing or increasing of wals.

I mean I don't care about the total size of the repository, I just care about the rate or the frequency of wals archiving, so I guess by increaing the archive_timeout too, it will make a reasonable sense here? Is it right?

we are using GraphQL to fetch data from the IoT apps used to represent and store the data in timescaledb, and also we are using continuous materialized views to fetch new data from the app.

Not sure if there are more updating statements more than just inserting new data, will check that... But yeah, most probably we don't update any data, it is just new a lot of data being inserted

Читать полностью…

PostgreSQL

Our workload is heavy, especially because we use Timescaledb, which yes it has new data updates in a short time.

Note: the database and workload in terabytes, so we have a huge data there.

so, what do you think?

Should I increase the checkpoint_timeout and archive_timeout too?

Читать полностью…

PostgreSQL

🔗 We are looking for a developer (blockchain)
About the company:
Decentralized platform
Responsibilities:
Development and implementation of blockchain solutions for specific tasks.
Creation and testing of smart contracts.
Support and optimization of existing blockchain systems.
Ensuring the security of blockchain applications.
Requirements:
Experience developing on solidity rust (from 2 years):
Experience creating and testing smart contracts, including decentralized applications (dapps).
Knowledge of the defi, nft, and other blockchain-based applications protocols.
Ability to work with web3. js, ethers. js or other libraries for interaction with ethereum.
experience with truffle, hardhat or similar frameworks for developing, testing and deploying smart contracts.
experience with any of the frames (vue, react, svelte)
conditions:
remote work
work in sprints
performance bonuses
salary - $5000-25000 (payment in usdt) (depending on development experience, negotiated separately)

write me a DM, I may not respond right away

Читать полностью…

PostgreSQL

Purchase price (in $$$) is only a small part of the costs of a server...

So, "the best" are probably servers from the vendor that is already in your infrastructure, by means of "we have a pipeline for the firmware patches in place, we know the out-of-band-interfaces, we stock replacement disks on site" etc.

In general, aim for the second highest CPU in the range, and go for high single core speed. Your money is *usually* spent better for RAM and I/O.

But anyway, did you really think that 1000 connections and write-anywhere part through? Because otherwise you'll probably be disappointed badly that your expensive server are idling while waiting on the DLM most of the time...

Читать полностью…

PostgreSQL

Affordable but high quality 😏

Читать полностью…

PostgreSQL

Hey everyone,

Any recommendations on-premise bare-metal servers with SSD storage. Something affordable but high-quality. 😀

We’re setting up new clusters with a multi-master configuration, aiming to handle around 1K concurrent connections. data growth about <= 3 GB / days.

Any suggestions?

Читать полностью…

PostgreSQL

what do u think like ur opinion on that matter?

Читать полностью…

PostgreSQL

Hard to tell. I'm not really familiar with Timescale. One or the other will trigger a visibility bit update, *that* should happen before the checkpoint.

Читать полностью…

PostgreSQL

Should I also have autovacuum on that hypertable for sure?

Читать полностью…

PostgreSQL

They are not a big deal, they are just having 4 or 5 column selection, along with counting column

Читать полностью…

PostgreSQL

yep, it does logged, because it does affect the hypertable.

The core problem here, is that also my refresh policy of my continous aggregation materialized views when it runs, it also affecting the wal, and I used multiple mvs build on top of each other to further divide the data in hourly, weekly, monthly, yearly, etc...

so my refresh policy has to run in a very frequent short time in order to fetch new data from the hypertable.

Also, I cannot have like a short retention period on my hypertable, because I needed these amount of data to measure our metrics like I said in even years, and for long period of time...

Читать полностью…

PostgreSQL

You could also switch to larger WAL segment size if it's just the number that worries you 😏

Читать полностью…

PostgreSQL

Yeah, it is 0 already, and the wals keep on growings

Читать полностью…

PostgreSQL

0 means inactive/no timeout. Leave it as is.

Читать полностью…

PostgreSQL

The default of archive_timeout is 0. Maybe try that first?

Читать полностью…

PostgreSQL

Just checkpoint_timeout. The only thing that archive_timeout will give you is even more WAL volume.

But let me ask: you're UPDATEing your timeseries data? I'm not sure if TimescaleDB is optimized for that... or do you mean you INSERTing a lot of new data?

Читать полностью…

PostgreSQL

archive_timeout is only relevant if your system is really low traffic. A WAL segment gets archived when it's reached (usually) 16MB. If you set archive_timeout, it will be written with less data in it after the timeout.
You seem be be producing too much WAL. So archive_timeout will not help you at all, you'd only produce more files.
checkpoint_timeout (defaulting to 5 minutes) in combination with full_page_writes (defaulting to on) on the other hand can lead to 8kb being written to the WAL (& hence archived) every 5 minutes, even if you only change one byte on a page every 5 minutes.

Is your workload UPDATE-heavy? Maybe even something that UPDATEs rows that don't need updating (re-writing unchanged data)?

Читать полностью…
Subscribe to a channel