@MagicShel

MagicShel@programming.dev · edit-2 12 days ago

You made a lot of points here. Many I agree with, some I don’t, but I specifically want to address this because it seems to be such a common misconception.

> It does and it doesn’t discard the original. It isn’t impossible to recreate the original (since all the data it gobbled up gets stored somewhere in some shape or form and can be truthfully recreated, at least judging by a few comments below and news reports). So AI can and does recreate (duplicate or distribute, perhaps) copyrighted works.

AI stores original works like a dictionary does. All the words are there, but the order and meaning are completely gone. It is possible to recreate an original work by randomly selecting words from the dictionary, but it’s unlikely.

The thing that makes AI useful is that it understands the patterns words are typically used in. It orders words in the right way far more often than random chance would. It knows “It was the best of” has a lot of likely options for the next word, but if it selects “times” as the next word, it’s far more likely to continue with “it was the worst of times,” because that sequence of words is so ubiquitous due to references to the classic story. But over the course of following these word patterns, it will quickly glom onto a different pattern and create a wholly new work from the original “prompt.”
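To make the word-pattern idea concrete, here is a minimal sketch of next-word prediction from counted word sequences. It is a toy n-gram counter, not how a real LLM is built (those use neural networks over tokens), and the corpus, the function names `train` and `next_word_probs`, and the two-word context size are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

def train(corpus, n=2):
    """Count which word follows each run of n consecutive words."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for i in range(len(words) - n):
            context = tuple(words[i:i + n])
            counts[context][words[i + n]] += 1
    return counts

def next_word_probs(counts, context):
    """Turn the raw counts for one context into probabilities."""
    options = counts.get(tuple(context), Counter())
    total = sum(options.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in options.items()}

# Hypothetical corpus: the famous line appears twice to mimic how often it is quoted.
corpus = [
    "it was the best of times it was the worst of times",
    "it was the best of both worlds",
    "it was the best of luck",
    "it was the best of times it was the worst of times",
]
counts = train(corpus)

# Several plausible continuations after "best of" ...
after_best_of = next_word_probs(counts, ["best", "of"])
# ... but once "times" is chosen, only one continuation has been seen.
after_of_times = next_word_probs(counts, ["of", "times"])
```

With this corpus, `after_best_of` comes out as `{'times': 0.5, 'both': 0.25, 'luck': 0.25}` and `after_of_times` as `{'it': 1.0}`: many options at first, then a strong pull toward the well-worn phrase.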

There are only two cases in which an original work is likely to be duplicated: either the training data is far too small and the model is overtrained on that particular work, or the work is the most derivative text imaginable, lacking any flair or originality.

Adding more training data makes it less likely to recreate any original works.
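As a rough illustration of that claim, the sketch below computes the chance that a toy word-pattern model replays one specific text word for word, first when trained on that text alone and then with a few other sentences mixed in. Everything here is an assumption for illustration (the tiny corpus, the names `train` and `reproduction_prob`, the two-word context); it shows the direction of the effect in a toy setting and is not evidence about any real LLM.

```python
from collections import Counter, defaultdict

def train(corpus, n=2):
    """Count which word follows each run of n consecutive words."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for i in range(len(words) - n):
            counts[tuple(words[i:i + n])][words[i + n]] += 1
    return counts

def reproduction_prob(counts, text, n=2):
    """Chance that word-by-word sampling replays `text` after its first n words."""
    words = text.lower().split()
    prob = 1.0
    for i in range(len(words) - n):
        options = counts.get(tuple(words[i:i + n]), Counter())
        total = sum(options.values())
        if total == 0:
            return 0.0  # this context was never seen during training
        prob *= options[words[i + n]] / total
    return prob

work = "it was the best of times it was the worst of times"

# Trained on the one work only, versus the same work plus other sentences.
small = train([work])
large = train([
    work,
    "it was the best of both worlds",
    "it was the best of luck",
    "it was the worst of ideas",
])

p_small = reproduction_prob(small, work)
p_large = reproduction_prob(large, work)
```

In this toy setup the chance of replaying the work drops from 0.25 to about 0.04 once the extra sentences compete for the same contexts.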

I am aware of examples where it was claimed an LLM reproduced entire code functions, including original comments. That is either a case of overtraining, or far too many people were already copying that code verbatim into their own, thus making that work very overrepresented in the training data (same thing, but it was infringing developers who poisoned the data, not researchers using bad training data).

Bottom line: when created with enough data, no original works are stored in any way that allows faithful reproduction other than by chance so random that it’s similar to rolling dice over a dictionary.

None of this means AI can do no wrong, I just don’t find the copyright claim compelling.

MagicShel@programming.dev · 17 days ago

I had an A500 and the 40MB drive was as expensive as the computer.

MagicShel@programming.dev · 19 days ago

I can definitely account for 1.

MagicShel@programming.dev · edit-2 25 days ago

Also, this copy reads like it was written by AI. If this is indicative of the stuff on the website, I very much would not like to read more. If it was written by a human, they should definitely lay off the LSD before writing.

> it is obvious that this powerful synergy will change our societal norms and potentialities.

Fuck right off.

MagicShel@programming.dev · 25 days ago

I mostly get what you’re saying, though I don’t have the requisite understanding to follow formal proofs, but if there is one thing I do know for certain, it’s that “understanding” is anthropomorphizing and shorthand for something that is very much not understanding in a human context at all.

I get that it can be hard to find the right words to explain some of these emergent phenomena, but I think it’s misleading to use words that make AI appear to have a thought process akin to anything we could understand as such, at least in settings where folks might not understand the shorthand as such.

And maybe everyone here is aware of that, but it makes me uneasy, hence this comment to hopefully make that point.

MagicShel@programming.dev · 27 days ago

Okay. Well I’m not that worried until I see where things are headed. I can see a lot of ways for things to go badly, but no point in borrowing trouble over it.

MagicShel@programming.dev · edit-2 27 days ago

Yes, and this was my reasoning for saying it would be fine to federate. But I’ll point out that federating ads would mean using my server’s infrastructure to serve ads on behalf of someone else. That would cost the admin more money and would require more user donations to keep it going. So just being able to block isn’t necessarily the solution. Not sure that was even your point, but I wanted to bring it up.

MagicShel@programming.dev · 27 days ago

My instance is defederated from threads. At the time I mildly disagreed with that decision. Federated ads would vindicate that decision. I don’t need threads content that badly.

MagicShel@programming.dev · 27 days ago

Well that sounds enshitty.

MagicShel@programming.dev · 27 days ago

We need to bring back people who can identify shops from some of the pixels and having seen quite a few shops in their time.

MagicShel@programming.dev · 27 days ago

You’re not going to out-compete the sociopaths if you’re a saint. That’s a reality.

MagicShel@programming.dev · 27 days ago

I’m not a fan of ~~Hitler~~ Steve Jobs, but I am a big fan of the guy who killed ~~Hitler~~ Steve Jobs.

MagicShel@programming.dev · 29 days ago

Thankfully there are other options because you just nailed the two places I refuse to ever get gas from when there is any other option. If there was a good third option I’d take it here, but while Google commands so much market share and a new competitor would probably siphon users from Bing (and it’s not enough users) I don’t think a real alternative will come. I’m intrigued by kagi, though.

MagicShel@programming.dev · 30 days ago

It’s on the falling edge of the hype curve. It’s quite expected, and you’re right about where it’s headed. It can’t do everything people want/expect but it can do some things really well. It’ll find its niche and people will continue to refine it and find new uses, but it’ll never be the threat/boon folks have been expecting.

People are using it for things it’s not good at, thinking it’ll get better. And it has, to an extent. It is technically very capable of writing prose or drawing pictures, but it lacks any semblance of artistry and it always will. I’ve seen trained elephants paint pictures, but they are interesting for the novelty, not for their expression. AI could be the impetus for more people to notice art and what makes good art special.

MagicShel@programming.dev · 30 days ago

If that was what you took from my post, give it another read. I’m not pro MS. I’m pro not feeding Google. And Bing is fine.

MagicShel@programming.dev · 1 month ago

Supposedly there’s a paid one that is good. I haven’t tried. The thing is Google is completely enshittified. They don’t have to care about you or the sites you search. So my theory is Bing is better because they are hungrier and anything that takes away market share from Google is good—but I’m fully aware that Microsoft was just as shitty as Google and will be again if they get back on top.

Everything else I know of is either just an alternate front end for one of them or an aggregator of both. So you’re right, there’s precious little alternative to Google. But it’s almost bad enough I’m ready for the return of web rings of good sites vouching for each other.

MagicShel@programming.dev · 1 month ago

Google results are actually already pretty terrible. They just have tremendous inertia.

MagicShel@programming.dev · 1 month ago

That makes perfect sense and explains why you can’t fix it just by bypassing blocking temporarily and reinstalling the app.

MagicShel@programming.dev · edit-2 1 month ago

This is the way to go. I tried pihole using Samsung smart features, but if you block the telemetry eventually your apps stop working and you can’t get them working again without doing a factory reset with blocking down. It’s prohibitively a pain in the ass, taking hours every time YouTube stops working.

Never had any issues with Roku on pihole.

MagicShel@programming.dev · 1 month ago

I appreciate the information and the links. I didn’t mean to imply this isn’t exciting or useful technology, just that when an article is pure hype I come away thinking someone is trying to sell me something, not give me actual information.