Hi folx

Not much has changed since we last brought this up half a year ago. That's probably a mistake, as link trackers have become more ubiquitous and the corporations that know our names and addresses have built up shadow profiles on us. But better late than never.

Anyway, cutting to the chase. This bot will warn you in DMs when you share a tracking link. That's it. Post over.

Read on if you want to see my unhinged tracking link rants.

What are link trackers?

When you share a YouTube link you may notice a ?si=(random gibberish) at the end. You may notice the same with Instagram, except there it's ?igshid. On Twitter, it's ?t. On TikTok and Reddit you get URLs that end in gibberish, like vm.tiktok․com/blahblah or reddit․com/r/blahblah/s/blahblah.

These URLs are artisanal. They are made only for you.

Other sites' URLs can be "high-entropy" in other ways. In one case, for example, the URL contains the time down to the millisecond.
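Here's a rough Python sketch of what spotting the patterns above could look like. The host list (including youtu.be, twitter.com and x.com), the parameter table, and the function name looks_like_tracking_link are my own assumptions for illustration. This is not the bot's actual rule set.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed table: host suffix -> share parameters named in the post.
# Note that "t" is only treated as a tracker on Twitter/X hosts.
TRACKING_PARAMS = {
    "youtube.com": {"si"},
    "youtu.be": {"si"},
    "instagram.com": {"igshid"},
    "twitter.com": {"t"},
    "x.com": {"t"},
}


def _host_matches(host, suffix):
    """True if host is the suffix itself or one of its subdomains."""
    return host == suffix or host.endswith("." + suffix)


def looks_like_tracking_link(url):
    """Heuristic check; expects a full URL including the scheme."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Short TikTok share links are personalised as a whole.
    if host == "vm.tiktok.com":
        return True

    # Reddit share links have the shape /r/<community>/s/<token>.
    if _host_matches(host, "reddit.com"):
        segments = [s for s in parts.path.split("/") if s]
        if len(segments) == 4 and segments[0] == "r" and segments[2] == "s":
            return True

    # Otherwise look for a known share parameter on a known host.
    query_keys = set(parse_qs(parts.query, keep_blank_values=True))
    for suffix, params in TRACKING_PARAMS.items():
        if _host_matches(host, suffix) and query_keys & params:
            return True
    return False
```

A plain watch link such as https://www.youtube.com/watch?v=abc123 is not flagged, while the same link with ?si= tacked on is.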

When you share these URLs to the world wide web, you broadcast to this service (to YouTube, to Google, to TikTok, to Reddit etc.) that "Hey! This previously-anonymous account is actually me!". When you share this link to your friend halfway across the world who only talks to you on Discord, and they click it, you broadcast to this service that actually you two are buddies. Same here on Hexbear. This sharing helps these sites build a social graph on us.

The threat is two-fold. Google has a powerful search crawler, and also runs a massive ad network. They could sift through the pages they indexed on Hexbear and link the exact Hexbear account to your real name. Anyone who clicks your shared link is also exposed as having been on the exact page where you shared it. This kind of metadata leak can be dangerous, as law enforcement has previously asked Google to reveal people who watched so-and-so YouTube video at so-and-so time.

This bot also handles TikTok, Yandex, Snapchat, Meta/Facebook trackers that all have this same ad-related threat.

What can mods on Hexbear do?

If you're a mod and you think this is important, you can @ mention this bot on a community you moderate. The bot should reply to you with some cringe, and then you can appoint it as a mod. When given mod powers, it will remove any comment/post that contains tracking links if the user has not fixed it after a day.

I will probably add functionality to sift through old comments that have dangerous trackers (like TikTok, which exposes your name and picture to anyone who clicks it) and remove/report them soon.

How to protect yourself on other sites and on your phone

Install the ClearURLs extension on desktop (if you're on Chrome... please switch, that is another privacy issue entirely). ClearURLs will cut down on most of your worries.

Be on the lookout for high-entropy parameters when you share things from your phone as well. Parameters like ?si=blahblah and ?igshid (which look like they stand for "share ID" and "Instagram share ID"), as well as obfuscated TikTok links like vm.tiktok․com/blahblah, will all track you and your social circle.
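If you'd rather clean a link by script than by hand, a minimal Python sketch could look like this. The parameter set and the name strip_share_params are assumptions of mine. I left ?t out on purpose, because on YouTube it is a timestamp, not a tracker.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed set of share parameters named in the post; extend as needed.
SHARE_PARAMS = {"si", "igshid"}


def strip_share_params(url, params=SHARE_PARAMS):
    """Return the URL with the given query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in params
    ]
    # urlencode re-encodes the surviving values, so unusual escapes
    # in the original query may come out spelled differently.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, strip_share_params("https://youtu.be/abc123?si=XyZ&t=42") keeps the timestamp and drops the share ID. This does nothing for vm.tiktok links, since the tracking there is in the path itself.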

How to protect your identity from leakage if you accidentally click on a tracking URL

If you're browsing a sensitive website, like Hexbear, and you happen to click a tracking URL that goes to YouTube, Google/YouTube can correlate your click with the appearance of this URL on Hexbear, associating your identity with this site.

To avoid this, you may use Firefox Multi-Account Containers, and make Hexbear use its own container that you keep separate from everything else. Although this solution is not perfect, it will prevent one facet of your identity leaking and make it harder for other sites to correlate your digital footprint.

What other threats exist hidden in URLs

The biggest threat is TikTok, which basically doxxes you when you share a link with someone.

When someone clicks your TikTok link, a big banner on top of their screen shows your profile picture and your name. If you used your real name and picture... well. Uh-oh.

Other "light doxxing hazards" exist on other sites. After looking through Hexbear comments using the search function, you can find comments that link to *****, comments that link to ****, etc. that may include the user's general location down to the city, their preferred language, their screen width and height (in the URL!!! for some reason???), and some very high-entropy parameters that look like a long string of gibberish.

If I sat down today looking to dox someone by going through their profile, and they shared links willy-nilly, I'd have some pretty good leads.
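To get a feel for which parameters are the "long string of gibberish" kind, here is a crude Python heuristic that estimates how much information each query value carries. Character-level Shannon entropy is only a rough stand-in for "uniquely identifying", and the 40-bit threshold and the function names are arbitrary assumptions of mine.

```python
import math
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def shannon_bits(value):
    """Rough estimate: per-character Shannon entropy times length."""
    if not value:
        return 0.0
    counts = Counter(value)
    n = len(value)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n


def suspicious_params(url, min_bits=40.0):
    """Names of query parameters whose values look like random tokens."""
    query = urlsplit(url).query
    return [
        key
        for key, value in parse_qsl(query, keep_blank_values=True)
        if shannon_bits(value) >= min_bits
    ]
```

A short value like lang=en scores a couple of bits, while a 16-character random token scores far above the threshold. Low-scoring values like a city name or a screen size can still narrow you down, so treat this as a hint and not a verdict.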

What can the maintainer of HexReplyBot do?

HexReplyBot does not handle YouTube tracking parameters properly. The maintainer can check this RegExr post I made with the modified regex. I bodged it real quick, but it should remove the ?si at least. It will still keep the ?pp parameter, but I got lazy and it's not as common. Please consider changing the regex out, thank you.
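For anyone curious what a regex-based fix could look like, here is a minimal Python sketch. To be clear, this is not the regex from my RegExr post and not HexReplyBot's code. The patterns, the host list, and the name clean_youtube_links are made up for illustration. Like my bodge, it leaves ?pp alone.

```python
import re

# Assumed pattern for YouTube links inside a comment body.
YOUTUBE_URL = re.compile(r"https?://(?:www\.|m\.)?(?:youtube\.com|youtu\.be)/\S*")

# The si parameter, its leading separator, and an optional trailing "&".
SI_PARAM = re.compile(r"([?&])si=[\w-]*(&?)")


def _drop_si(url):
    # If another parameter follows, keep the leading separator for it;
    # otherwise remove the separator together with the parameter.
    return SI_PARAM.sub(lambda m: m.group(1) if m.group(2) else "", url)


def clean_youtube_links(text):
    """Remove the si parameter from every YouTube link in the text."""
    return YOUTUBE_URL.sub(lambda m: _drop_si(m.group(0)), text)
```

Other parameters such as v= and t= survive, and links to other sites are left untouched.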

Some links

https://archive.ph/8c80m - law enforcement using metadata provided by YouTube to find the real name of a suspect
https://hexbear.net/comment/4439859 - someone mentioning that they keep getting a Hexbear user recommended to them on TikTok because they clicked that user's TikTok link months ago
https://archive.is/WD7ke - "We kill people based on metadata". Can't be bothered to find the source, but Ross Ulbricht got busted on some metadata links between his email and Stack Overflow. Now imagine if they had had tracking links back then to triangulate his Stack Overflow identity (Stack Overflow now has tracking links) with some other offsite identity.

Share any feedback or thoughts, I'll take it into consideration.

    • WhatDoYouMeanPodcast [comrade/them]
      ·
      1 month ago

      Regex is scary. I view it like the dark side of the force. I did a bunch of work taking "the length of the phrase after the comma until the next ' ' in the string" instead of trying to decipher regex.

    • Gorb [they/them]
      ·
      1 month ago

      Still to this day I have not bothered to learn regex

  • BountifulEggnog [she/her]
    ·
    1 month ago

    Tracking TikTok links should be completely banned, huge security risk. Admin(s?) can you please consider it?

    • Chronicon [they/them]
      ·
      1 month ago

      if they can be detected using a simple regex (no lookaheads/behinds iirc) I think the slur filter can remove them.

  • CarbonScored [any]
    ·
    1 month ago

    Good bot, even if it's a little annoying. It'd be cooler if Lemmy had an integrated auto-replace tool for this.

    • RedWizard [he/him, comrade/them]
      ·
      1 month ago

      There is an issue open for this exact thing: https://github.com/LemmyNet/lemmy/issues/4905

      ClearURLs provides a repo of rules: https://github.com/ClearURLs/Rules that can be used for this task.

      I'm working my way through the Rust book to learn more about how Rust works so I can add this into the server backend of Lemmy once I feel confident.

  • RedWizard [he/him, comrade/them]
    ·
    edit-2
    1 month ago

    Damn, this was on my list of things to do, lol. I have to ask, is there a Hexbear Coding Collective, maybe a publicly hosted Gitea server or even just a Github Organization? I would love to contribute to these projects, and help give back to the Hexbear tech infrastructure in some practical way.

    Also, I just want to point out that there is a repo of tracking rules independent of ClearURLs that can be used for this task.

    https://github.com/ClearURLs/Rules

    That way, you do not have to come up with these regexes yourself.

  • What_Religion_R_They [none/use name]
    ·
    1 month ago

    Curious to hear input from people who have mod powers on this. What makes the most sense? To mod this bot, or to have the bot point out posts and comments, maybe have the bot reply to tracking links publicly... or something else entirely?

  • ashinadash [she/her]
    ·
    1 month ago

    Superb post, ty. Does InsidiousTrackers not provide a comment removal reason though? Just checked the modlog and I think it should show "reason: tracking" or smth.

  • WhatDoYouMeanPodcast [comrade/them]
    ·
    1 month ago

    So if I give someone a tiktok link even if I delete the ? and everything past it, do I still get doxxed? I just have this feeling like I do.

    • citrussy_capybara [ze/hir]
      ·
      1 month ago

      if the link is just www.tiktok.com/@username/video/numbers without a ? the url doesn’t have tracking. if it’s one of the vm.tiktok ones, you can either open it in a browser to get the www. version or use a site like urlex.org to get the www. link, then remove the ? and everything after. (also works on the tracking reddit links)

  • EllenKelly [comrade/them]
    ·
    1 month ago

    I encourage everyone to check out this Android app that helps clean links

    https://f-droid.org/packages/com.trianguloy.urlchecker/

    Some people here are terrible with link hygiene, but people irl are so much worse

  • footfaults [none/use name]
    ·
    edit-2
    1 month ago

    Please don't have a bot DM me. That's very annoying.

    I get that there's some technical challenges that would need to be solved for those tracking links to be stripped out by lemmy itself but I'm just annoyed by a nerd saying uhhhh excuse me you used a youtube link instead of some dodgy YouTube frontend that will disappear in a year, or how dare you use a twitter link instead of some twitter proxy that will shut down in 3 months.

    Like why not spend the coding time actually fixing the issue instead of just annoying people

    • What_Religion_R_They [none/use name]
      ·
      1 month ago

      It's not about a frontend, we have that already in HexReplyBot. It's about removing a parameter tacked on at the end in your YouTube/Instagram/TikTok links.

      • footfaults [none/use name]
        ·
        edit-2
        1 month ago

        It's the same technique. A bot that just replies with 'oh sweaty you should have done something different'

        One complains about YouTube or X links and the other complains about tracking info in URLs.

        It's annoying and instead there should be code in Lemmy that does the URL sanitizing

        • What_Religion_R_They [none/use name]
          ·
          edit-2
          1 month ago

          I agree with you. Seems like there's an active issue in Lemmy for this, so it'll get implemented eventually. Then it'll take a while for Hexbear to update.

    • BountifulEggnog [she/her]
      ·
      edit-2
      1 month ago

      Sharing the wrong TikTok link can and has doxed users before, and some users have unsafe living situations if they got doxed.

      "uhhhh excuse me you used a youtube link instead of some dodgy YouTube frontend that will disappear in a year"

      You fundamentally do not understand what this post is talking about.

    • RedWizard [he/him, comrade/them]
      ·
      1 month ago

      Comrade, operational security is what this post is about, not anticapitalist frontends for media services. TikTok will dox you if you are not careful.