I’m the administrator of kbin.life, a general purpose/tech orientated kbin instance.

  • 0 Posts
  • 55 Comments
Joined 2 years ago
Cake day: June 29th, 2023


  • Actually how is your ISP giving out IPs to you? Mine uses IPv6 PD to give me a /48. And I then use SLAAC locally on the first /64 prefix on my LAN. Plus another /64 for VPN connections.

    If you mean receiving RA/ND packets from your ISP (which are used to announce IPv6 prefixes), then you need to allow ICMPv6 packets. If you don't want to be pingable, just block echo requests; ICMP carries other important messages in both v4 and v6.

    If your ISP uses DHCPv6 Prefix delegation you will need to allow packets to UDP port 546 and run a DHCPv6 client capable of handling PD messages.

    If you have a fixed prefix, then you probably don't need to use your ISP's SLAAC at all. You could just put your router on a fixed IP such as <yourprefix>::1 and then have your router send the RA/ND packets itself (the radvd package on Linux; not sure what it would be on pfSense) and assign IPs within your network that way.

    If you have a dynamic prefix… It’s a problem I guess. But probably someone has done it and a google search will turn up how they handled it.

    EDIT: Just clarified that the RA/ND packets advertise prefixes, not assign addresses.
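As a rough illustration of the /48 split described above (first /64 for the LAN, another /64 for VPN clients, router on `<yourprefix>::1`), here is a minimal Python sketch using the standard `ipaddress` module. The prefix `2001:db8:1234::/48` is a documentation prefix standing in for a real delegation, and `lan_prefixes` is a hypothetical helper name, not something from the poster's setup.

```python
import ipaddress
from itertools import islice


def lan_prefixes(delegated: str, count: int = 2) -> list[ipaddress.IPv6Network]:
    """Return the first `count` /64 subnets of a delegated IPv6 prefix."""
    net = ipaddress.IPv6Network(delegated)
    if net.prefixlen > 64:
        raise ValueError("delegated prefix is smaller than a /64")
    # subnets() is a lazy generator, so only the requested /64s are built.
    return list(islice(net.subnets(new_prefix=64), count))


# Assumption: 2001:db8:1234::/48 is a placeholder for the prefix your ISP delegates.
lan, vpn = lan_prefixes("2001:db8:1234::/48")
router_address = lan.network_address + 1  # the "<yourprefix>::1" router address

print("LAN prefix:", lan)
print("VPN prefix:", vpn)
print("Router address:", router_address)
```

The LAN prefix is the one a router would announce via RA so that hosts can configure themselves with SLAAC.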


  • I believe the privacy concerns become moot if all consumer-level routers block incoming untracked connections by default, so that you need to poke holes in the firewall for the ports you need.

    Having said that, even knowing the prefix it’s a huge address space to port scan through. So it’s pretty secure too with privacy extensions enabled.

    But for sure the onus is on the router makers for now.
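To put a number on "huge address space", here is a back-of-the-envelope sketch in Python. The probe rate of one million addresses per second is purely an assumption for illustration, and `years_to_scan` is a hypothetical helper; the only fixed fact is that a /64 contains 2^64 addresses.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_scan(prefix_len: int = 64, probes_per_second: int = 1_000_000) -> float:
    """Years needed to probe every address in one IPv6 prefix exactly once."""
    addresses = 2 ** (128 - prefix_len)
    return addresses / probes_per_second / SECONDS_PER_YEAR


# Assumed scan rate: 1,000,000 probes per second against a single /64.
print(f"{years_to_scan():,.0f} years to sweep a single /64")
```

Even at that assumed rate, a blind sweep of one /64 takes hundreds of thousands of years, which is why knowing the prefix alone does not get a scanner very far.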





  • Yeah, I think allowing a write-in answer is too risky. You will end up with 12 unique text answers otherwise.

    I do like the idea of the equivalent of an open verdict, which is probably a mix of options 1 and 3 from your list. If you don't believe either of the provided options is suitable and you don't want to skip, then this option would be a nice thing to have.





  • So my mbin instance is behind Cloudflare, and I filter the AS numbers there. The requests don't even reach my server.

    On the sites that aren't behind Cloudflare, yep, it's done at the nginx level. I did consider doing it at the firewall level, maybe with a specific chain for it. But since I was already blocking at the nginx level, I just did it there for now. I mean, it keeps them off the content, but yes, it does tell them there's a website there to leech if they change their tactics, for example.

    You need to block the whole ASN too. Those that are using Chrome/Firefox UAs change IP every 5 minutes, picking another one at random from their huuuuuge pools.
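A small sketch of why blocking by range holds up against rotating addresses: every new IP the client picks still falls inside one of the network's announced prefixes. The prefixes below are documentation ranges used as placeholders, not real crawler ranges, and `is_blocked` is a hypothetical helper written in Python with the standard `ipaddress` module.

```python
import ipaddress

# Assumption: placeholder prefixes standing in for one network's announced ranges.
BLOCKED = [
    ipaddress.ip_network(prefix)
    for prefix in ("198.51.100.0/24", "2001:db8:beef::/48")
]


def is_blocked(client_ip: str) -> bool:
    """True if the client address falls inside any blocked prefix."""
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed-version comparisons simply evaluate to False.
    return any(addr in net for net in BLOCKED)


# Two different addresses from the same range are both caught.
print(is_blocked("198.51.100.7"), is_blocked("198.51.100.200"))
```

Blocking individual addresses would need a new entry every few minutes; a prefix match does not.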


  • Yeah, I probably should look to see if there are any good plugins that do this on some community-submission basis. Because yes, it's a pain to keep up with whatever trick they're doing next.

    And unlike web crawlers that generally check a url here and there, AI bots absolutely rip through your sites like something rabid.


  • If you're running nginx, this is what I'm using:

    if ($http_user_agent ~* "SemrushBot|Semrush|AhrefsBot|MJ12bot|YandexBot|YandexImages|MegaIndex.ru|BLEXbot|BLEXBot|ZoominfoBot|YaK|VelenPublicWebCrawler|SentiBot|Vagabondo|SEOkicks|SEOkicks-Robot|mtbot/1.1.0i|SeznamBot|DotBot|Cliqzbot|coccocbot|python|Scrap|SiteCheck-sitecrawl|MauiBot|Java|GumGum|Clickagy|AspiegelBot|Yandex|TkBot|CCBot|Qwantify|MBCrawler|serpstatbot|AwarioSmartBot|Semantici|ScholarBot|proximic|GrapeshotCrawler|IAScrawler|linkdexbot|contxbot|PlurkBot|PaperLiBot|BomboraBot|Leikibot|weborama-fetcher|NTENTbot|Screaming Frog SEO Spider|admantx-usaspb|Eyeotabot|VoluumDSP-content-bot|SirdataBot|adbeat_bot|TTD-Content|admantx|Nimbostratus-Bot|Mail.RU_Bot|Quantcastboti|Onespot-ScraperBot|Taboolabot|Baidu|Jobboerse|VoilaBot|Sogou|Jyxobot|Exabot|ZGrab|Proximi|Sosospider|Accoona|aiHitBot|Genieo|BecomeBot|ConveraCrawler|NerdyBot|OutclicksBot|findlinks|JikeSpider|Gigabot|CatchBot|Huaweisymantecspider|Offline Explorer|SiteSnagger|TeleportPro|WebCopier|WebReaper|WebStripper|WebZIP|Xaldon_WebSpider|BackDoorBot|AITCSRoboti|Arachnophilia|BackRub|BlowFishi|perl|CherryPicker|CyberSpyder|EmailCollector|Foobot|GetURL|httplib|HTTrack|LinkScan|Openbot|Snooper|SuperBot|URLSpiderPro|MAZBot|EchoboxBot|SerendeputyBot|LivelapBot|linkfluence.com|TweetmemeBot|LinkisBot|CrowdTanglebot|ClaudeBot|Bytespider|ImagesiftBot|Barkrowler|DataForSeoBo|Amazonbot|facebookexternalhit|meta-externalagent|FriendlyCrawler|GoogleOther|PetalBot|Applebot") { return 403; }

    That will block those that actually use recognisable user agents. I add any I find as I go on. It will catch a lot!

    I also have a huuuuuge IP based block list (generated by adding all ranges returned from looking up the following AS numbers):

    AS45102 (Alibaba Cloud), AS136907 (Huawei SG), AS132203 (Tencent), AS32934 (Facebook)

    These guys run, or have run, bots that impersonate real browser user agents.

    There are various tools online to return prefix/ip lists for an autonomous system number.

    I put both into a single file and include it in my website config files.

    EDIT: Just to add, keeping on top of this is a full time job!

    EDIT 2: Removed Mojeek bot as it seems to be a normal web crawler.
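One possible way to turn a prefix list for an AS into lines for an nginx include file is sketched below in Python. This is an assumption about how such a file could be generated, not the poster's actual tooling: `nginx_deny_lines` is a hypothetical helper, the input prefixes are documentation ranges, and the output relies on nginx's standard `deny` directive accepting CIDR ranges.

```python
import ipaddress


def nginx_deny_lines(prefixes: list[str]) -> list[str]:
    """Merge overlapping or adjacent prefixes and emit nginx `deny` lines."""
    v4, v6 = [], []
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix.strip(), strict=False)
        (v4 if net.version == 4 else v6).append(net)
    # collapse_addresses() rejects mixed IPv4/IPv6 input, so merge each family separately.
    merged = list(ipaddress.collapse_addresses(v4)) + list(ipaddress.collapse_addresses(v6))
    return [f"deny {net};" for net in merged]


# Assumption: placeholder prefixes, as if returned by an ASN-to-prefix lookup tool.
example = ["198.51.100.0/25", "198.51.100.128/25", "2001:db8::/32"]
print("\n".join(nginx_deny_lines(example)))
```

Collapsing first keeps the include file shorter when an AS announces many adjacent or overlapping ranges.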



  • I did a routine upgrade on my mbin server, where I had an old version with changes I made myself.

    Well, it turns out I upgraded something (probably Redis) that broke Symfony, which broke everything.

    So I had a fun afternoon upgrading to the latest mbin version. I mean I needed to anyway but my hand was forced.

    Yep, sometimes an innocent-looking update will change your weekend plans.

    Anyways, any reason not to use ssh?