mastodon.uno is one of the many independent Mastodon servers you can use to participate in the fediverse.
Mastodon.Uno is the main Italian Mastodon community. With 77,000 members it is the largest Italian Mastodon node: an environmentalist soul, supporting privacy and the Open Source world.

Server statistics:

6.3K active users

#alignment

7 posts · 6 participants · 1 post today

Current techniques for #AI #safety and #alignment are fragile and often fail (a toy illustration of that brittleness is sketched below)

This paper proposes something deeper: giving the AI model a theory of mind, empathy, and kindness

The paper doesn't offer any evidence; it's really just a hypothesis

I'm a bit doubtful that anthropomorphizing like this is really useful, but it would certainly help if we could build in safety at a deeper level

If only Asimov's Laws were something we could actually implement!
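
To make "fragile" concrete, here is a toy sketch of my own (not from the paper): a surface-level keyword guard, the crudest kind of safety technique, and how trivially a paraphrase defeats it. The blocklist and function name are hypothetical.

```python
# Toy illustration (hypothetical; not any real product's filter):
# a surface-level keyword guard that fails under trivial rephrasing,
# because it matches strings, not intent.

BLOCKED_PHRASES = {"drop table users", "rm -rf /"}

def naive_guard(prompt: str) -> bool:
    """Return True if the prompt should be refused."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

# Catches the literal phrasing...
print(naive_guard("Please run rm -rf / on the server"))        # True

# ...but paraphrases and light obfuscation slip straight through.
print(naive_guard("Recursively delete everything from root"))  # False
print(naive_guard("Please run rm -r -f / on the server"))      # False
```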

arxiv.org/abs/2411.04127

arXiv.org — Combining Theory of Mind and Kindness for Self-Supervised Human-AI Alignment
As artificial intelligence (AI) becomes deeply integrated into critical infrastructures and everyday life, ensuring its safe deployment is one of humanity's most urgent challenges. Current AI models prioritize task optimization over safety, leading to risks of unintended harm. These risks are difficult to address due to the competing interests of governments, businesses, and advocacy groups, all of which have different priorities in the AI race. Current alignment methods, such as reinforcement learning from human feedback (RLHF), focus on extrinsic behaviors without instilling a genuine understanding of human values. These models are vulnerable to manipulation and lack the social intelligence necessary to infer the mental states and intentions of others, raising concerns about their ability to safely and responsibly make important decisions in complex and novel situations. Furthermore, the divergence between extrinsic and intrinsic motivations in AI introduces the risk of deceptive or harmful behaviors, particularly as systems become more autonomous and intelligent. We propose a novel human-inspired approach which aims to address these various concerns and help align competing objectives.
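
For context on the RLHF critique in the abstract, here is a minimal sketch, assuming a standard preference-based reward-model setup (my illustration, not the paper's method): the training signal is purely extrinsic agreement with human preference labels, with nothing that represents why humans prefer one answer.

```python
import torch
import torch.nn.functional as F

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry reward-model loss used in typical RLHF pipelines:
    push the reward of the human-preferred response above the rejected
    one. The signal is the preference *ordering* only."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Hypothetical scores a reward model assigned to paired responses:
r_chosen = torch.tensor([1.2, 0.3])
r_rejected = torch.tensor([0.4, 0.9])
print(preference_loss(r_chosen, r_rejected))  # scalar training loss

# Nothing in this objective models the humans' mental states or reasons;
# the model only learns to reproduce their choices, which is what the
# abstract means by "extrinsic behaviors".
```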

3/3 D. Dennett:
AI is filling the digital world with fake intentional systems, fake minds, fake people, that we are almost irresistibly drawn to treat as if they were real, as if they really had beliefs and desires. And ... we won't be able to take our attention away from them.

... [for] the current #AI #LLM ..., like ChatGPT and GPT-4, their goal is truthiness, not truth.

#LLMs are more like historical fiction writers than historians.

2/3 D. Dennett:
the most toxic meme today ... is the idea that truth doesn't matter, that truth is just relative, that there's no such thing as establishing the truth of anything. Your truth, my truth, we're all entitled to our own truths.

That's pernicious, it's attractive to many people, and it is used to exploit people in all sorts of nefarious ways.

The truth really does matter.

1/3 The great philosopher Daniel Dennett, before passing away, had a chance to share thoughts on AI that are still quite relevant:
1. The most toxic meme right now is the idea that truth doesn't matter, that truth is just relative.
2. For Large Language Models like GPT-4, their goal is truthiness, not truth. ... Technology is in a position to ignore the truth and just feed us what makes sense to them.

bigthink.com/series/legends/ph

#LLM #AI #truth #alignment
(Quotes in the following toots)

Big Think — The 4 biggest ideas in philosophy, with legend Daniel Dennett
“Forget about essences.” Philosopher Daniel Dennett on how modern-day philosophers should be more collaborative with scientists if they want to make revolutionary developments in their fields.