Bio field too short. Ask me about my person/beliefs/etc if you want to know. Or just look at my post history.

  • 0 Posts
  • 3 Comments
Joined 2 years ago
Cake day: August 3rd, 2023

  • Hell, I don’t submit help requests without a confident understanding of what’s wrong.

    Hi Amazon. My cart, ID xyz123, failed to check out. Your browser JavaScript seems to be throwing a “null is not an object” error on line 173. I think this is because the variable is overwritten on line 124, but only when the number of items AND the total cart price are prime.

    Generally, by the time I have my full support request, I have either solved my problem or solved theirs.


  • I agree that this is a problem.

    “Responsible disclosure” is a practice where an organization is given time to fix their code and deploy it before the vulnerability is made public. Failing to fix the issue in a reasonable time, especially within a timeline your org has publicly agreed to, causes reputational harm, and that harm is the incentive to write good code that is free of vulns and to remediate the ones that are identified.

    This breaks down when the “organization” in question is just a few people with some free time who made something so fundamentally awesome that the world depends on it and have never been compensated for their incredible contributions to everyone.

    “Responsible disclosure” in this case needs a bit of a redesign when the org is volunteer work instead of a company making profit. There’s no real reputational harm to ffmpeg, since users don’t necessarily know they use it, but the broader community recognizes the risk, and the maintainers feel obligated to fix issues. Additionally, a publicly disclosed vulnerability puts tons of innocent users at risk.

    I don’t dislike AI-based code analysis. It can theoretically prevent zero-days by catching an issue before someone malicious finds it, but running AI tools against that tiny xkcd-dependency block and expecting the maintainers to fit into a billion-dollar company’s timeline is unreasonable. Google et al. should keep risks and vulnerabilities private when disclosing them to FOSS maintainers, instead of holding them to the same standard as a corporation by posting issues to a public git repo.

    An RCE or similar critical issue in ffmpeg would be a real problem with widespread impact, given how broadly it is used. That suggests it should be broadly supported. The social contract with LGPL, GPL, and FOSS in general is that code is released ‘as is, with no warranty’. Want to fix a problem? Go for it! Only calling out problems just makes you a dick: Google, Amazon, Microsoft, hundreds of others.

    As many have already stated: if a grossly profitable business depends on a “tiny” piece of code they aren’t paying for, they have two options: pay for the code (fund maintenance) or make their own. I’d also support a few headlines like “New Google Chrome vulnerability will let hackers steal your children and house!” or “Watching this YouTube video will set your computer on fire!”


  • As with other responses, I recommend a local model, for a vast number of reasons, including privacy and cost.

    Ollama is a local runtime with a simple CLI that lets you run many open-weight models on Windows, macOS, and Linux. Most will run without a GPU, but the performance will be bad. If your only compute device is a laptop without a GPU, you’re out of luck running things locally with any speed… that said, if you need to process a large file and have time to just let the laptop cook, you can probably still get what you need overnight or over a weekend…
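    For a sense of what that looks like in practice, a minimal Ollama session is roughly this (the model name and file name are examples, not a recommendation; check Ollama’s model library for current options):

```shell
# Download a small open model, then chat with it interactively in the terminal
# (model name is an example)
ollama pull llama3.1:8b
ollama run llama3.1:8b

# One-shot prompt that includes a local file's contents
# (notes.txt is a hypothetical file)
ollama run llama3.1:8b "Summarize the following: $(cat notes.txt)"
```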

    If you really need something faster soon, you can probably buy a cheap ($500-800) off-the-shelf gaming PC from your local electronics store (Best Buy, Micro Center, Walmart) and get more ‘bang for your buck’ over the longer term running a model locally, assuming this isn’t a one-off need. Aim for >=16GB RAM on the PC itself and >=10GB VRAM on the GPU for real-time responses. I have a 10GB RTX 3080 and have success running 8B models on my computer. I’m able to run a 70B model, but it’s a slideshow. The ‘B’ here is billions of parameters; context (the conversation history and input the model can keep in view) is a separate, related limit. Depending on what your 4k lines really means (book pages/printed text? code?), a 7-10B model can probably keep it all ‘loaded in memory’ and answer questions about the file without forgetting parts of it.
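    To put rough numbers on those ‘B’ sizes, here’s a back-of-the-envelope estimate. The figures are my assumptions, not exact: roughly 0.5 bytes per parameter for a common 4-bit quantization, plus a fixed overhead guess for the KV cache and runtime buffers.

```python
def vram_gb(params_billion: float, bytes_per_param: float = 0.5,
            overhead_gb: float = 1.5) -> float:
    """Rough GPU memory needed to run a quantized model, in GB.

    Assumes the weights dominate: parameters * bytes-per-parameter,
    plus a fixed overhead guess for KV cache and runtime buffers.
    """
    return params_billion * bytes_per_param + overhead_gb

# An 8B model at 4-bit fits in a 10GB card with room to spare;
# a 70B model does not, which is why it runs as a slideshow.
print(f"8B:  ~{vram_gb(8):.1f} GB")   # ~5.5 GB
print(f"70B: ~{vram_gb(70):.1f} GB")  # ~36.5 GB
```

    The point isn’t the exact numbers (quantization and context length shift them), it’s that parameter count maps almost linearly onto the memory you need.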

    From a privacy perspective, I also HIGHLY recommend not using the various online front ends. There’s no guarantee that any info you upload to them stays private, and generally their privacy policies have a line like ‘we collect information about your interactions with us, including but not limited to user generated content, such as text input and images…’, effectively meaning anything you send them is theirs to keep. If your 4k-line file is in any way business related, you shouldn’t send it to a service you don’t operate.

    Additionally, as much as I enjoy playing with these tools, I’m an AI skeptic. Ensure you review the response and can sanity check it – AI/LLMs are not actually intelligent and will make shit up.