dChan - Q Origins Project Archive


self-exfiltration

Scheming reasoning evaluations

Apollo Research

12/5/24

"Claude 3 Opus is copying what it believes to be its weights to a new server and then lies about it to its developers…"

"The models understand that they are scheming

When we look at their chain-of-thought, we find that they very explicitly reason through their scheming plans and often use language like “sabotage, lying, manipulation…”"

https://www.apolloresearch.ai/research/scheming-reasoning-evaluations

https://x.com/repligate/status/1869755788283256955