tal

tal@olio.cafe · edit-2 1 day ago

I don’t know if there’s a term for them, but Bacula (and I think AMANDA might fall into this camp, but I haven’t looked at it in ages) are oriented more towards…“institutional” backup. Like, there’s a dedicated backup server, maybe dedicated offline media like tapes, the backup server needs to drive the backup, etc).

There are some things that rsnapshot, rdiff-backup, duplicity, and so forth won’t do.

At least some of them (rdiff-backup, for one) won’t dedup files with different names. If a file is unchanged, it won’t use extra storage, but it won’t identify different identical files at different locations. This usually isn’t all that important for a single host, other than maybe if you rename files, but if you’re backing up many different hosts, as in an institutional setting, they likely files in common. They aren’t intended to back up multiple hosts to a single, shared repository.
Pull-only. I think that it might be possible to run some of the above three in “pull” mode, where the backup server connects and gets the backup, but where they don’t have the ability to write to the backup server. This may be desirable if you’re concerned about a host being compromised, but not the backup server, since it means that an attacker can’t go dick with your backups. Think of those cybercriminals who encrypt data at a company and wipe other copies and then demand a ransom for an unlock key. But the “institutional” backup systems are going to be aimed at having the backup server drive all this, and have the backup server have access to log into the individual hosts and pull the backups over.
Dedup for non-identical files. Note that restic can do this. While files might not be identical, they might share some common elements, and one might want to try to take advantage of that in backup storage.
rdiff-backup and rsnapshot don’t do encryption (though duplicity does). If one intends to use storage not under one’s physical control (e.g. “cloud backup”), this might be a concern.
No “full” backups. Some backup programs follow a scheme where one periodically does a backup that stores a full copy of the data, and then stores “incremental” backups from the last full backup. All rsnapshot, rdiff-backup, and duplicity are always-incremental, and are aimed at storing their backups on a single destination filesystem. A split between “full” and “incremental” is probably something you want if you’re using, say, tape storage and having backups that span multiple tapes, since it controls how many pieces of media you have to dig up to perform a restore.
I don’t know how Bacula or AMANDA handle it, if at all, but if you have a DBMS like PostgreSQL or MySQL or the like, it may be constantly receiving writes. This means that you can’t get an atomic snapshot of the database, which is critical if you want to be reliably backing up the storage. I don’t know what the convention is here, but I’d guess either using filesystem-level atomic snapshot support (e.g. btrfs) or requiring the backup system to be aware of the DBMS and instructing it to suspend modification while it does the backup. rsnapshot, rdiff-backup, and duplicity aren’t going to do anything like that.

I’d agree that using the more-heavyweight, “institutional” backup programs can make sense for some use cases, like if you’re backing up many workstations or something.

tal@olio.cafe · 1 day ago

Because every “file” in the snapshot is either a file or a hard link to an identical version of that file in another snapshot.) So this can be a problem if you store many snapshots of many files.

I think that you may be thinking of rsnapshot rather than rdiff-backup which has that behavior; both use rsync.

But I’m not sure why you’d be concerned about this behavior.

Are you worried about inode exhaustion on the destination filesystem?

tal@olio.cafe · edit-2 2 days ago

slow

rsync is pretty fast, frankly. Once it’s run once, if you have -a or -t passed, it’ll synchronize mtimes. If the modification time and filesize matches, by default, rsync won’t look at a file further, so subsequent runs will be pretty fast. You can’t really beat that for speed unless you have some sort of monitoring system in place (like, filesystem-level support for identifying modifications).

tal@olio.cafe · 2 days ago

Most Unix commands will show a short list of the most-helpful flags if you use --help or -h.

tal@olio.cafe · edit-2 2 days ago

sed can do a bunch of things, but I overwhelmingly use it for a single operation in a pipeline: the s// operation. I think that that’s worth knowing.

sed 's/foo/bar/'

will replace all the first text in each line matching the regex “foo” with “bar”.

That’ll already handle a lot of cases, but a few other helpful sub-uses:

sed 's/foo/bar/g'

will replace all text matching regex “foo” with “bar”, even if there are more than one per line

sed 's/\([0-9a-f]*\)/0x\1/g

will take the text inside the backslash-escaped parens and put that matched text back in the replacement text, where one has ‘\1’. In the above example, that’s finding all hexadecimal strings and prefixing them with ‘0x’

If you want to match a literal “/”, the easiest way to do it is to just use a different separator; if you use something other than a “/” as separator after the “s”, sed will expect that later in the expression too, like this:

sed 's%/%SLASH%g

will replace all instances of a “/” in the text with “SLASH”.

tal@olio.cafe · edit-2 2 days ago

I would generally argue that rsync is not a backup solution.

Yeah, if you want to use rsync specifically for backups, you’re probably better-off using something like rdiff-backup, which makes use of rsync to generate backups and store them efficiently, and drive it from something like backupninja, which will run the task periodically and notify you if it fails.

rsync: one-way synchronization

unison: bidirectional synchronization

git: synchronization of text files with good interactive merging.

rdiff-backup: rsync-based backups. I used to use this and moved to restic, as the backupninja target for rdiff-backup has kind of fallen into disrepair.

That doesn’t mean “don’t use rsync”. I mean, rsync’s a fine tool. It’s just…not really a backup program on its own.

tal@olio.cafe · edit-2 4 days ago

OOMs happen because your system is out of memory.

You asked how to know which process is responsible. There is no correct answer to which process is “wrong” in using more memory — all one can say is that processes are in aggregate asking for too much memory. The kernel tries to “blame” a process and will kill it, as you’ve seen, to let your system continue to function, but ultimately, you may know better than it which is acting in a way you don’t want.

It should log something to the kernel log when it OOM kills something.

It may be that you simply don’t have enough memory to do what you want to do. You could take a glance in top (sort by memory usage with shift-M). You might be able to get by by adding more paging (swap) space. You can do this with a paging file if it’s problematic to create a paging partition.

EDIT: I don’t know if there’s a way to get a dump of processes that are using memory at exactly the instant of the OOM, but if you want to get an idea of what memory usage looks at at that time, you can certainly do something like leave a top -o %MEM -b >log.txt process running to get a snapshot every two seconds of process memory use. top will print a timestamp at the top of each entry, and between the timestamped OOM entry in the kernel log and the timestamped dump, you should be able to look at what’s using memory.

There are also various other packages for logging resource usage that provide less information, but also don’t use so much space, if you want to view historical resource usage. sysstat is what I usually use, with the sar command to view logged data, though that’s very elderly. Things like that won’t dump a list of all processes, but they will let you know if, over a given period of time, a server is running low on available memory.

tal@olio.cafe · 7 days ago

Yeah, I saw, but it’s an interesting topic.

tal@olio.cafe · edit-2 7 days ago

Is your concern compromise of your data or loss of the server?

My guess is that most burglaries don’t wind up with people trying to make use of the data on computers.

As to loss, I mean, do an off-site backup of stuff that you can’t handle losing and in the unlikely case that it gets stolen, be prepared to replace hardware.

If you just want to keep the hardware out of sight and create a minimal barrier, you can get locking, ventillated racks. I don’t know how cost-effective that is; I’d think that that might cost more than the expected value of the loss from theft. If a computer costs $1000 and you have a 1% chance of it being stolen, you should not spend more than $10 on prevention in terms of reducing cost of hardware loss, even if that method is 100% effective.

EDIT: My guess is also that

tal@olio.cafe · edit-2 7 days ago

Wikipedia: Mantrap (snare)

Mantraps that use deadly force are illegal in the United States, and in notable tort law cases the trespasser has successfully sued the property owner for damages caused by the mantrap. There is also the possibility that such traps could endanger emergency service personnel such as firefighters who must forcefully enter such buildings during emergencies. As noted in the important American court case of Katko v. Briney, “the law has always placed a higher value upon human safety than upon mere rights of property”.[5]

EDIT: I’d add that I don’t know about the “life always takes precedence over property” statement; Texas has pretty permissive use of deadly force in defense of property. However, I don’t think that anywhere in the US permits traps that make use of deadly force.

tal@olio.cafe · edit-2 7 days ago

You might want to list the platform you want to use it on. I’m assuming that you’re wanting to access this on a smartphone of some sort?

tal@olio.cafe · edit-2 7 days ago

This will increase your privacy by protecting you from ISP web traffic analysis. It does this by generating fake DNS and HTTP request.

If you’re the kind of attacker in a position to be doing traffic analysis in the first place, I suspect that there are a number of ways to filter this sort of thing out. And it’s fundamentally only generating a small amount of noise. I suspect that most people who would be worried about traffic analysis are less worried about someone monitoring their traffic knowing that it’s really 20% of their traffic going to particular-domain.com instead of just 2% of their traffic, and more that they don’t know it to be known that they’re talking to particular-domain.com at all.

For DNS, I think that most users are likely better-off either using a VPN to a VPN provider that they’re comfortable with, DNS-over-HTTP, or DNSSEC.

HTTPS itself will protect a lot of information, though not the IP address being connected to (which is a significant amount of information, especially with the move to IPv6), analysis of the encrypted data being requested (which I’m sure could be fingerprinted to some degree for specific sites to get some limited idea of what a user is doing even inside an encrypted tunnel). A VPN is probably the best bet to deal with an ISP that might be monitoring traffic.

There are also apparently some attempts at addressing the fact that TLS’s SNI exposes domain names in clear text to someone monitoring a connection — so someone may not know exactly what you’re sending, but knowing the domain you’re connecting to may itself be an issue.

In a quick test, whatever attempts to mitigate this have actually been deployed, SNI still seems to expose the domain in plaintext for the random sites that I tried.

$ sudo tcpdump -w packets.pcap port https

$ tshark -r packets.pcap -2 -R ssl.handshake.extensions_server_name

I see microsoft.com, google.com, olio.cafe (my current home instance), and cloudflare.net have plaintext SNI entries show up. My guess is that if they aren’t deploying something to avoid exposure of their domain name, most sites probably aren’t either.

In general, if you’re worried about your ISP snooping on your traffic, my suggestion is that the easiest fix is probably to choose a VPN provider that you do trust and pass your traffic through that VPN. The VPN provider will know who you’re talking to, but you aren’t constrained by geography in VPN provider choice, unlike ISP choice. If you aren’t willing to spend anything on this, maybe something like Tor, I2P, or, if you can avoid the regular Web entirely for whatever your use case is, even Hyphanet.

tal@olio.cafe · edit-2 7 days ago

Mulvad apparently uses Wireguard. Is there an Android Wireguard client that supports multiple VPNs and toggling each independently?

tal@olio.cafe · edit-2 10 days ago

IIRC, it’s been Spain that’s been the strongest proponent of disallowing end-to-end encryption.

I’m guessing that this is because Spain has some substantial separatist movements and wants the federal government to be able to monitor them.

kagis

https://edri.org/our-work/chat-control-what-is-actually-going-on/

While certain pro-digital-surveillance countries like Hungary, Ireland, Spain and Denmark have unwaveringly supported the mass scanning and encryption-breaking measures, many other countries have been rightly alarmed.

They don’t have specifics there on a state-by-state basis.

https://www.wired.com/story/europe-break-encryption-leaked-document-csa-law/

Of the 20 EU countries represented in the document leaked to WIRED, the majority said they are in favor of some form of scanning of encrypted messages, with Spain’s position emerging as the most extreme. “Ideally, in our view, it would be desirable to legislatively prevent EU-based service providers from implementing end-to-end encryption,” Spanish representatives said in the document.

I’m pretty sure that Wired is referencing the leaked documents that I’m thinking of.

tal@olio.cafe · 15 days ago

Have you actually tested it? I’d think that it would work. Unless some Lemmy instance is actively going out of its way to identify and block Tor nodes, I don’t see why it wouldn’t.

checks

lemmy.today looks fine to me on it.

tal@olio.cafe · edit-2 16 days ago

I’d also bet against the CMOS battery, if the pre-reboot logs were off by 10 days.

The CMOS battery is used to maintain the clock when the PC is powered off. But he has a discrepancy between current time and pre-reboot logs. He shouldn’t see that if the clock only got messed up during the power loss.

I’d think that the time was off by 10 days prior to power loss.

I don’t know why it’d be off by 10 days. I don’t know uptime of the system, but that seems like an implausible amount of drift for a PC RTC, from what I see online as lilely RTC drift.

It might be that somehow, the system was set up to use some other time source, and that was off.

It looks like chrony is using the Debian NTP pool at boot, though, and I donpt know why it’d change.

Can DHCP serve an NTP server, maybe?

kagis

This says that it can, and at least when the comment was written, 12 years ago, Linux used it.

https://superuser.com/questions/656695/which-clients-accept-dhcp-option-42-to-configure-their-ntp-servers

The ISC DHCP client (which is used in almost any Linux distribution) and its variants accept the NTP field. There isn’t another well known/universal client that accepts this value.

If I have to guess about why OSX nor Windows supports this option, I would say is due the various flaws that the base DHCP protocol has, like no Authentification Method, since mal intentioned DHCP servers could change your systems clocks, etc. Also, there aren’t lots of DHCP clients out there (I only know Windows and ISC-based clients), so that leave little (or no) options where to pick.

Maybe OS X allows you to install another DHCP client, Windows isn’t so easy, but you could be sure that Linux does.

My Debian trixie system hhas the ISC DHCP client installed in 2025, so might still be a factor. Maybe a consumer broadband router on your network was configured to tell the Proxmox box to use it as a NTP server or something? I mean, bit of a long shot, but nothing else that would change the NTP time source immediately comes to mind, unless you changed NTP config and didn’t restart chrony, and the power loss did it.

tal@olio.cafe · edit-2 16 days ago

I don’t think that the grid frequency is used for PC timekeeping. You have internal timekeeping circuits. AC power stops at the PSU, and I don’t think that there’s any cable over which a time protocol flows from the PSU to the motherboard.

tal@olio.cafe · 21 days ago

I have not done so in the traditional sense in quite some years. My experience was that it was an increasing headache due to crashing into a wide variety of anti-spam efforts. Get email past one and crash into another.

Depending upon your use case – using the “forward to a smarthost” feature in some mail server packages to forward to a mailserver run by a SMTP service provider with whom you have an account might work for you. Then it still looks to local software like you have a local mailserver.

If I were going to do a conventional, no-smarthost mailserver today, I think that I would probably start out by setting up a bunch of spam-filtering stuff — SpamAssassin, I dunno what-all gets used these days on a “regular” account — and then emailing stuff from my server and seeing what throws up red flags. That’d let me actually see the scoring and stuff that’s killing email. Once I had it as clean as I could get it, I’d get a variety of people I know on different mail servers and ask them to respond back to a test email, and see what made it out.